Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[orientation] page orientation improvements #1553

Merged
merged 29 commits into from
Jun 12, 2024

Conversation

felixdittrich92
Copy link
Contributor

@felixdittrich92 felixdittrich92 commented Apr 15, 2024

This PR:

  • refactor OCRPredictor logic (limit duplication)
  • extend estimate angle function to handle also upside-down images
  • correct return value from estimate angle (negative - left side rot / positive - right side rot) -> update tests
  • integrate page_orientation_predictor

NOTE: Draft until TF orientation models are merged

Any feedback is welcome 🤗

Closes: #1460

Copy link

codecov bot commented May 3, 2024

Codecov Report

Attention: Patch coverage is 96.96970% with 2 lines in your changes missing coverage. Please review.

Project coverage is 96.41%. Comparing base (b9c4e6d) to head (e9946de).
Report is 3 commits behind head on main.

Files Patch % Lines
doctr/models/_utils.py 91.30% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1553      +/-   ##
==========================================
- Coverage   96.44%   96.41%   -0.04%     
==========================================
  Files         164      164              
  Lines        7743     7773      +30     
==========================================
+ Hits         7468     7494      +26     
- Misses        275      279       +4     
Flag Coverage Δ
unittests 96.41% <96.96%> (-0.04%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@felixdittrich92 felixdittrich92 added this to the 0.9.0 milestone May 15, 2024
@felixdittrich92 felixdittrich92 linked an issue May 15, 2024 that may be closed by this pull request
@felixdittrich92 felixdittrich92 self-assigned this May 15, 2024
@felixdittrich92 felixdittrich92 added type: enhancement Improvement module: models Related to doctr.models type: code quality Related to code quality labels May 15, 2024
@felixdittrich92 felixdittrich92 changed the title [Draft] page orientation improvements [orientation] page orientation improvements Jun 6, 2024
@felixdittrich92 felixdittrich92 marked this pull request as ready for review June 6, 2024 09:21
@felixdittrich92
Copy link
Contributor Author

Maybe for better understanding:

To test I took some different images and rotated randomly in range (-180 - 180)
Example: (The test data contains much more dirty samples as the attached one 😅)
Screenshot from 2024-06-06 12-48-09

some results:

Page 0
orientation: 10, rnd: 11
Page 1
orientation: -106, rnd: -106
Page 2
orientation: -95, rnd: -96
Page 3
orientation: -132, rnd: -133
Page 4
orientation: 148, rnd: 147
Page 5
orientation: -99, rnd: -100
Page 6
orientation: 138, rnd: -42
Page 7
orientation: -118, rnd: -117

doctr/models/_utils.py Show resolved Hide resolved
doctr/models/_utils.py Show resolved Hide resolved
doctr/models/predictor/base.py Show resolved Hide resolved
doctr/models/predictor/base.py Show resolved Hide resolved
Comment on lines 107 to 111
if page_orientation and orientation_confidence >= min_confidence:
# special case where the estimated angle is mostly wrong:
# when estimated angle is + or - 90 degree, we should consider the general orientation
if abs(estimated_angle) == 90 and abs(page_orientation) in [0, 90]:
return page_orientation
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand entirely this section. Here, estimated_angle == 90, so according to your comment, it's wrong, so page_orientation is returned but it's also 90. It means there is no evident fallback to fix this issue ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah the problem case is if estimated_angle is exact 90 or -90 it's mostly not correct (swapping - and + or complete wrong)

So we should check the page orientation also..

I think we can simplify this:

if abs(estimated_angle) == abs(page_orientation):
    return page_orientation

wdyt ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But what "mostly wrong" means here ? Based on your experimentations, if it's 90 or -90, it's completly wrong ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the most cases yes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comes from:

elif w / h < 1 / ratio_threshold_for_lines:  # if lines are vertical, substract 90 degree
            angles.append(angle - 90)

but i don't have a better solution for this yet :/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@odulcy-mindee

pushed a small update

tested again with a complex set of documents (mobile phone, receipts, etc) 30 samples

randomly rotated in range -180 - 180

it mostly varies between 22 - 25 correct from 30

before this integration / changes on the same set we have had 7 - 11 correct so as mentioned still some space for further improvements it's not perfect but already much better as before :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah.. if i lower the min_confidence we get ~27 - 29 correct xD trust the page orientation model xD

Copy link
Collaborator

@odulcy-mindee odulcy-mindee Jun 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, it sounds good for a "first" iteration. We can improve later in another PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep i think the same

@felixdittrich92 felixdittrich92 merged commit a26bea5 into mindee:main Jun 12, 2024
79 of 81 checks passed
@felixdittrich92 felixdittrich92 deleted the page-orient branch June 12, 2024 13:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module: models Related to doctr.models type: code quality Related to code quality type: enhancement Improvement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[orientation] Add page orientation interface & integration
3 participants