feat: Optimized data loading for PyTorch #362
Conversation
Codecov Report
@@ Coverage Diff @@
##             main     #362      +/-   ##
==========================================
+ Coverage   96.11%   96.15%   +0.03%
==========================================
  Files          83       83
  Lines        3426     3460      +34
==========================================
+ Hits         3293     3327      +34
  Misses        133      133
torch.backends.cudnn.benchmark = True

st = time.time()
val_set = DetectionDataset(
    img_folder=os.path.join(args.data_path, 'val'),
    label_folder=os.path.join(args.data_path, 'val_labels'),
    sample_transforms=Compose([
        Lambda(lambda x: x / 255),
Why do you remove the normalization here?
The to_tensor in the dataset __getitem__ already does that :)
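To illustrate the point, here is a minimal sketch of the torchvision-style to_tensor convention the comment relies on: converting a uint8 HWC image already rescales values to [0, 1], so a second Lambda(lambda x: x / 255) in sample_transforms would divide twice. The function below is a hypothetical re-implementation for illustration only, not the library's actual code.

```python
import numpy as np

def to_tensor(img: np.ndarray) -> np.ndarray:
    """Sketch of to_tensor behavior: HWC uint8 -> CHW float32 in [0, 1]."""
    chw = np.transpose(img, (2, 0, 1)).astype(np.float32)
    return chw / 255.0  # normalization happens here, inside __getitem__

# A pure-white 4x4 RGB image comes out with max value 1.0,
# so applying x / 255 again would shrink it to ~0.0039.
img = np.full((4, 4, 3), 255, dtype=np.uint8)
out = to_tensor(img)
```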
@@ -137,7 +138,6 @@ def main(args):
    img_folder=os.path.join(args.data_path, 'train'),
    label_folder=os.path.join(args.data_path, 'train_labels'),
    sample_transforms=Compose([
        Lambda(lambda x: x / 255),
same here
cf. previous comment
Following up on #190, this PR introduces the following modifications:
Iterating over FUNSD with the dataloader gets a 25%+ speedup with the same number of workers (3x compared to the original default number of workers)
Closes #190
Any feedback is welcome!
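The worker-count speedup reported above comes from overlapping per-sample I/O across workers. The stdlib sketch below illustrates the mechanism with a simulated 5 ms per-sample load cost and a thread pool standing in for dataloader workers; the function name, timings, and worker count are hypothetical and not taken from this PR.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def load_sample(i: int) -> int:
    # Stand-in for per-sample disk read + decode (hypothetical 5 ms cost).
    time.sleep(0.005)
    return i

n = 40

# Single-worker baseline: samples are loaded one after another.
t0 = time.time()
serial = [load_sample(i) for i in range(n)]
t_serial = time.time() - t0

# Four "workers" loading concurrently, as a dataloader with
# num_workers > 0 would; order of results is preserved by map.
t0 = time.time()
with ThreadPoolExecutor(max_workers=4) as ex:
    parallel = list(ex.map(load_sample, range(n)))
t_parallel = time.time() - t0
```

The same batches come out in the same order; only wall-clock time per epoch drops, which is why raising the default worker count is a pure throughput win here.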