Change the flag filtering default to include PCR duplicates #13
Comments
Hi, thanks for your feedback. If the reads sharing the same UMI/CB pair are marked as PCR duplicates by CellRanger, cellSNP will filter them when the parameter maxFLAG is set to a small value. Since we found some test datasets on which CellRanger did not perform the marking, we will run CellRanger on a few more datasets, especially with its default parameters.
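To illustrate why a small maxFLAG drops duplicate-marked reads, here is a minimal sketch. The duplicate bit (0x400, i.e. 1024) comes from the SAM specification; the rule "keep a read if its FLAG is at most maxFLAG" and the two threshold values are assumptions chosen for illustration, not a statement of cellSNP's actual implementation or defaults.

```python
# SAM FLAG bit for "PCR or optical duplicate" (from the SAM specification).
DUPLICATE = 0x400  # 1024


def is_duplicate(flag: int) -> bool:
    """Return True if the duplicate bit is set in a SAM FLAG value."""
    return bool(flag & DUPLICATE)


def passes_max_flag(flag: int, max_flag: int) -> bool:
    """Assumed filtering rule: keep a read only if its FLAG does not exceed max_flag."""
    return flag <= max_flag


# Example FLAG values: forward read, reverse-strand read,
# duplicate forward read, duplicate reverse-strand read.
flags = [0, 16, 1024, 1040]

# Hypothetical thresholds, chosen only to show the effect.
kept_small = [f for f in flags if passes_max_flag(f, 255)]
kept_large = [f for f in flags if passes_max_flag(f, 4096)]

print(kept_small)  # duplicate-marked reads are gone
print(kept_large)  # duplicate-marked reads are retained
```

Because the duplicate bit alone already contributes 1024 to the FLAG value, any threshold below 1024 excludes every duplicate-marked read under this rule.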
Alright, if you think it's the more reasonable choice to filter them. Perhaps it may be worth explicitly stating in the manual that they are filtered by default and may lead to an increased SNV false negative rate, which has been the case in my experience.
We have changed the default value of maxFLAG to include PCR duplicates for scRNA-seq data when UMItag is turned on, and we now state this in the README file. Thanks for your advice!
No problem. I am glad you made the change. In our experience, Vireo has performed extremely well using all of the reads, including PCR duplicates.
From my cursory understanding of cellSNP, you filter all of the reads that are marked as PCR duplicates by Cell Ranger. However, this would remove a large number of UMI duplicates, as noted by the vartrix documentation:
ignore alignments marked as duplicates? Take care when turning this on with scRNA-seq data, as duplicates are marked in that pipeline for every extra read sharing the same UMI/CB pair, which will result in most variant data being lost.
I wouldn't be surprised if this negatively affects the performance of vireo on some datasets.
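As a rough illustration of the concern, the toy model below marks every extra read sharing a UMI/CB pair as a duplicate, following the behaviour described in the quoted vartrix text, and then counts how many reads survive a duplicate filter. The barcodes, UMIs, and the `mark_duplicates` helper are hypothetical; this is not Cell Ranger's actual marking algorithm.

```python
def mark_duplicates(reads):
    """Flag every read after the first one seen for each (CB, UMI) pair.

    `reads` is a list of (cell_barcode, umi) tuples; the result adds a
    boolean `is_dup` as a third field. This is a toy model only.
    """
    seen = set()
    marked = []
    for cb, umi in reads:
        key = (cb, umi)
        marked.append((cb, umi, key in seen))
        seen.add(key)
    return marked


# Hypothetical reads: two cells, three distinct UMI/CB pairs, six reads.
reads = [
    ("AAAC", "TTG"), ("AAAC", "TTG"), ("AAAC", "TTG"),
    ("AAAC", "CCA"),
    ("GGGT", "TTG"), ("GGGT", "TTG"),
]

marked = mark_duplicates(reads)
kept = [r for r in marked if not r[2]]

print(f"{len(kept)} of {len(marked)} reads remain after dropping duplicates")
```

In this toy input, only one read per UMI/CB pair remains, so half of the reads covering a potential variant site are discarded before genotyping.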