fix: Fixed SAR model for training and inference in PyTorch #831

fg-mindee · 2022-02-23T17:09:02Z

Following up on #802, this PR introduces the following changes:

fixes a major typo in the loss computation (the sequence masking was inverted)
cleans up the encoder and decoder to make them much easier to understand
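
To illustrate the first point, here is a minimal, framework-free sketch of sequence masking in a recognition loss. It is not the code from this PR: the function name `masked_sequence_loss` and the list-based inputs are hypothetical, chosen only to show the idea. Valid time steps should contribute to the loss and padding should be ignored; an inverted mask does the opposite and trains on padding only.

```python
from typing import List


def masked_sequence_loss(
    token_losses: List[List[float]],
    seq_lens: List[int],
) -> List[float]:
    """Average per-token losses over the valid (non-padding) positions only.

    token_losses: one row per sample, one loss value per time step (padded).
    seq_lens: number of valid time steps for each sample (assumed >= 1).
    """
    per_sample = []
    for losses, length in zip(token_losses, seq_lens):
        # mask[t] is True for real tokens and False for padding.
        # The inverted condition (t >= length) would keep only the padding.
        mask = [t < length for t in range(len(losses))]
        kept = [loss for loss, keep in zip(losses, mask) if keep]
        per_sample.append(sum(kept) / len(kept))
    return per_sample


# Two padded samples with 2 and 3 valid time steps respectively.
losses = [[1.0, 3.0, 9.0, 9.0], [2.0, 2.0, 5.0, 9.0]]
result = masked_sequence_loss(losses, [2, 3])
```

With the correct mask, the large values sitting in the padded positions never reach the averaged loss.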

I trained the model on the synthetic training set, with a real-world validation set, and it reached almost 100% exact match after a few epochs.
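
For reference, exact match is usually understood as the fraction of predicted words that equal their ground-truth label character for character. A minimal sketch under that assumption, with a hypothetical helper name `exact_match` (this is not docTR's metric implementation):

```python
from typing import Sequence


def exact_match(preds: Sequence[str], targets: Sequence[str]) -> float:
    """Fraction of predictions that are strictly equal to their target."""
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    if not targets:
        return 0.0
    # A prediction counts only if every character matches.
    hits = sum(pred == target for pred, target in zip(preds, targets))
    return hits / len(targets)


# One of the three words has a wrong character, so 2 out of 3 match.
score = exact_match(["hello", "w0rld", "doctr"], ["hello", "world", "doctr"])
```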

Any feedback is welcome!

codecov · 2022-02-23T17:15:24Z

Codecov Report

Merging #831 (58be438) into main (51dc49b) will increase coverage by 0.02%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #831      +/-   ##
==========================================
+ Coverage   95.97%   95.99%   +0.02%     
==========================================
  Files         131      131              
  Lines        4988     5042      +54     
==========================================
+ Hits         4787     4840      +53     
- Misses        201      202       +1

Flag	Coverage Δ
unittests	`95.99% <100.00%> (+0.02%)`	⬆️

Impacted Files	Coverage Δ
doctr/models/recognition/sar/pytorch.py	`99.20% <100.00%> (+0.06%)`	⬆️
doctr/models/detection/linknet/tensorflow.py	`97.67% <0.00%> (-1.00%)`	⬇️
doctr/models/detection/linknet/pytorch.py	`97.97% <0.00%> (-0.89%)`	⬇️
doctr/transforms/modules/base.py	`94.59% <0.00%> (ø)`
doctr/models/classification/resnet/pytorch.py	`100.00% <0.00%> (ø)`
doctr/models/classification/resnet/tensorflow.py	`100.00% <0.00%> (ø)`
doctr/models/classification/magc_resnet/pytorch.py	`100.00% <0.00%> (ø)`
doctr/models/utils/pytorch.py	`100.00% <0.00%> (+5.00%)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

felixdittrich92 · 2022-02-23T19:01:57Z

@fg-mindee
Off-topic:
I have faced the same in my tests (~99% exact match), but after loading the trained model into the full OCR pipeline, it produced only wrong results 😅 Did you test this as well? 😄 (In my test cases, the training set had 1M samples and the validation set ~200k.)

charlesmindee

Thanks!

fg-mindee · 2022-02-24T11:13:48Z

@felixdittrich92 I had the same issue until I changed the transformation pipeline used during training. On my end, I used a real-world validation dataset, and it worked! If it doesn't work in the full OCR pipeline, then the problem comes from something else :)

fg-mindee added 3 commits February 23, 2022 17:01

fix: Fixed SAR for PyTorch

ced7a5e

fix: Fixed wrong loss masking

d2c49e8

style: Fixed lint

58be438

fg-mindee added the labels type: bug, critical, module: models, framework: pytorch, topic: text recognition on Feb 23, 2022

fg-mindee added this to the 0.5.1 milestone Feb 23, 2022

fg-mindee requested a review from charlesmindee February 23, 2022 17:09

fg-mindee self-assigned this Feb 23, 2022

fg-mindee mentioned this pull request Feb 23, 2022

Cannot train pytorch sar_resnet31 and master recognition model #802

Closed

4 tasks

charlesmindee approved these changes Feb 24, 2022

fg-mindee merged commit 0310d6c into main Feb 24, 2022

fg-mindee deleted the sar-fix branch February 24, 2022 11:13
