SVT dataset integration #597
Codecov Report
@@ Coverage Diff @@
## main #597 +/- ##
==========================================
- Coverage 96.16% 96.04% -0.12%
==========================================
Files 111 111
Lines 4300 4299 -1
==========================================
- Hits 4135 4129 -6
- Misses 165 170 +5
Thanks! I added some optimization comments and one about XML security.
Still got many questions 🙃
doctr/datasets/svt.py
(int(rect_tag.attrib['x']) + int(rect_tag.attrib['width']) / 2,
int(rect_tag.attrib['y']) + int(rect_tag.attrib['height']) / 2,
int(rect_tag.attrib['width']), int(rect_tag.attrib['height']), 0)
for rect_tag in image_tag
According to what you said above, I don't see why we don't need to check image_tag.tag == 'taggedRectangles' then 🤔
image.tag : imageName
image.attrib : {}
image.text : img/04_16.jpg
image.tag : address
image.attrib : {}
image.text : 324 South Glenoaks Boulevard Burbank CA 91502-1318
image.tag : lex
image.attrib : {}
image.text : BLOCKBUSTER,VIDEO,LAW,OFFICES,RAY,FOUNTAIN,CRIMINAL,DEFENSE,ATTORNEY,LAWYER,PIZZA,MAN,CAFE,COLOMBIA,GERRO,MARSHA,RICO,GLENOAKS,PHARMACY,AND,MEDICAL,SUPPLY,RAMIREZ,LYNN,TER,POGHOSYAN,ZARINE,JAS,GIFTS,KNOX,TIMOTHY,DDS,LILLY,PARK,HOME,THEATER,HAMOUI,MOBIL,MART,CLEANERS,TOKYO,YAKIDORI,SHIRVAN,REALTY,GROUP,NAPA,AUTO,CARE,CENTER,TRAN,DEBBY
image.tag : Resolution
image.attrib : {'x': '1280', 'y': '1024'}
image.text : None
image.tag : taggedRectangles
image.attrib : {}
image.text :
This is the output for one example; hopefully it makes things a bit clearer.
Why don't we need a check?
Each taggedRectangle carries only attributes, and the tag holding the label is a child of it,
but image_tag contains multiple tags: imageName, address, lex, Resolution and taggedRectangles.
<taggedRectangle height="31" width="120" x="351" y="455">
<tag>MAGIC</tag>
</taggedRectangle>
With XPath we can search in the tree:
element = (x.findall('.//td[@class="teamid"]'))
But in this case I would prefer the if statement ^^
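To make the if-statement variant concrete, here is a minimal sketch using the standard library's xml.etree.ElementTree. The inline XML is a shortened sample assembled from the values printed above, so its exact layout is an assumption, and boxes_with_tag_check is a hypothetical helper name, not the function in doctr/datasets/svt.py.

```python
import xml.etree.ElementTree as ET

# Shortened sample following the structure shown above (assumed layout,
# not a verbatim excerpt of the SVT annotation file).
SAMPLE = """
<tagset>
  <image>
    <imageName>img/04_16.jpg</imageName>
    <address>324 South Glenoaks Boulevard Burbank CA 91502-1318</address>
    <lex>BLOCKBUSTER,VIDEO</lex>
    <Resolution x="1280" y="1024" />
    <taggedRectangles>
      <taggedRectangle height="31" width="120" x="351" y="455">
        <tag>MAGIC</tag>
      </taggedRectangle>
    </taggedRectangles>
  </image>
</tagset>
"""


def boxes_with_tag_check(image):
    """Hypothetical helper: return (x, y, width, height, label) per rectangle."""
    boxes = []
    for child in image:
        # imageName, address, lex and Resolution are siblings of
        # taggedRectangles, so they have to be skipped explicitly.
        if child.tag != "taggedRectangles":
            continue
        for rect in child:
            attrs = rect.attrib
            boxes.append((
                int(attrs["x"]),
                int(attrs["y"]),
                int(attrs["width"]),
                int(attrs["height"]),
                rect.findtext("tag"),
            ))
    return boxes


root = ET.fromstring(SAMPLE)
result = boxes_with_tag_check(root[0])  # root[0] is the first <image>
print(result)
```

Without the tag check, the inner loop would also run over the children of imageName, address, lex and Resolution. Those elements have no children in this sample, so the result would be the same, which is why the check can look optional.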
Sorry, but I'm not really familiar with XML, so I can't tell from your snippet whether there is some hierarchy, whether this is a flat data structure, or whether we can access attribute values by name 😅
Generally speaking, nested for loops should be avoided, hence my previous question ;)
So, in Python, when processing the XML, from what I understand:
- xml_root holds the list of pages/images
- in each page/image, there are several attributes, and we're interested in "imageName", "x", "y", "width", "height", and those are on the same level
Is this correct?
No, let me try to explain with this snippet:
tagset is the root
image is the first-depth tag
imageName, address, lex, Resolution and taggedRectangles are at the second depth (x and y of Resolution are attributes, the others hold text)
taggedRectangles has one more level, taggedRectangle, with the attributes height, width, x, y
and each taggedRectangle has one more level, tag, with a text value
@fg-mindee
I also tried the xmltodict package (third-party, not built into Python), but it does not work because the XML file is not well formatted :/
The other way (and I think the only other one) is XPath, with something like .//imageName, but for readability I would prefer the current implementation 😅
wdyt?
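For comparison, the XPath-style alternative mentioned above could look roughly like the sketch below. It relies on the limited XPath subset that xml.etree.ElementTree supports in find, findall and findtext. The inline XML is an assumed, shortened sample built from the values quoted in this thread, parse_image is a hypothetical helper name, and the box tuple mirrors the (x + width / 2, y + height / 2, width, height, 0) expression under review.

```python
import xml.etree.ElementTree as ET

# Assumed, shortened sample; not a verbatim excerpt of the SVT annotation file.
SAMPLE = """
<tagset>
  <image>
    <imageName>img/04_16.jpg</imageName>
    <Resolution x="1280" y="1024" />
    <taggedRectangles>
      <taggedRectangle height="31" width="120" x="351" y="455">
        <tag>MAGIC</tag>
      </taggedRectangle>
    </taggedRectangles>
  </image>
</tagset>
"""


def parse_image(image):
    """Hypothetical helper: return (image name, boxes, labels) for one <image>."""
    name = image.findtext("imageName")
    # The path selects only the rectangles, so no explicit tag check is needed.
    rects = image.findall("taggedRectangles/taggedRectangle")
    boxes = [
        (
            int(r.attrib["x"]) + int(r.attrib["width"]) / 2,   # box center, x
            int(r.attrib["y"]) + int(r.attrib["height"]) / 2,  # box center, y
            int(r.attrib["width"]),
            int(r.attrib["height"]),
            0,  # rotation angle, fixed to 0 as in the reviewed snippet
        )
        for r in rects
    ]
    labels = [r.findtext("tag") for r in rects]
    return name, boxes, labels


root = ET.fromstring(SAMPLE)
parsed = [parse_image(image) for image in root.iter("image")]
print(parsed)
```

The path expression replaces the inner loop and the tag comparison with a single call, at the cost of encoding the hierarchy in a string.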
* start synth
* cleanup
* reopening #597
* apply changes
This PR integrates the Street View Text dataset