[Feature request] Custom cell and UMI barcode position #247

Closed
Hoohm opened this issue Jul 2, 2018 · 8 comments
Labels
alevin issue is primarily related to alevin feature request

Comments


Hoohm commented Jul 2, 2018

Hello,

I would like to use alevin, but I'm not sure I actually can. We use a custom R1 barcode layout: bases 1-7 are the cell barcode and bases 8-16 are the UMI.

Is it complicated to implement this?

k3yavi added the alevin label (issue is primarily related to alevin) on Jul 4, 2018
Member

k3yavi commented Jul 4, 2018

Hi @Hoohm ,
Thanks for the feature request.
Alevin currently has a hidden feature that lets you explicitly specify the CB and UMI lengths. We have not yet tested these options extensively, but for your setup you would specify the following command-line arguments:

--barcodeLength 7 --umiLength 9 --end 5

Please let us know how it works out for you with these settings; it will help validate these options for Alevin.
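
To make the intended split concrete, here is a minimal Python sketch of how a 16-base R1 read would be divided under these settings. It assumes that `--end 5` means the barcode sequence starts at the 5' end of the read, with the cell barcode first and the UMI immediately after it. The function name `split_barcode_read` and the example read are made up for illustration and are not part of Alevin.

```python
def split_barcode_read(read, barcode_length=7, umi_length=9):
    """Split a barcode read into (cell_barcode, umi).

    Assumes the cell barcode occupies the first `barcode_length` bases
    and the UMI the next `umi_length` bases (bases 1-7 and 8-16 here).
    """
    needed = barcode_length + umi_length
    if len(read) < needed:
        raise ValueError(f"read has {len(read)} bases, need at least {needed}")
    cell_barcode = read[:barcode_length]
    umi = read[barcode_length:needed]
    return cell_barcode, umi


# Hypothetical 16-base R1 read, for illustration only.
cb, umi = split_barcode_read("ACGTACGTTTGGCCAA")
print(cb, umi)
```

Checking a handful of real R1 reads this way before running the full pipeline is a cheap way to confirm that the lengths passed on the command line match the library design.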

PS: Just a quick question for my understanding: is there a specific reason you chose a UMI that is longer than the CB in your experiment?

Author

Hoohm commented Jul 4, 2018

Hi @k3yavi ,
Thanks for the info!

We are working on an optimized version of SCRB-seq. One of the problems we had with the original protocol was that the minimum distance between the cell barcodes was too low, so we increased the number of barcode bases. The original protocol used a 6-base cell barcode and a 10-base UMI; we simply reassigned position 7 from the UMI to the barcode.
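
For readers who want to check the "minimum distance" property on their own plate layout, a small Python sketch follows. It uses Hamming distance (substitutions only) as the distance measure, which is an assumption; the barcodes shown are invented and do not come from any real SCRB-seq plate.

```python
from itertools import combinations


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("barcodes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes):
    """Smallest Hamming distance between any two barcodes in the set."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


# Hypothetical 7-base barcodes, for illustration only.
whitelist = ["AAAAAAA", "AAACCCC", "GGGTTTT"]
print(min_pairwise_distance(whitelist))
```

If the minimum distance is at least 3, a read with a single substitution error is still closer to its true barcode than to any other whitelist entry, so a 1-mismatch correction cannot be ambiguous.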

We use a known whitelist of barcodes since it's a well-plate-based protocol, so we know that any other barcode is not a cell. Is there an option for the max distance allowed between BC or UMI?

Member

k3yavi commented Jul 4, 2018

Hi @Hoohm ,
Thanks for the quick reply and the explanation.

I am personally not very well versed in how SCRB-seq works. But, as you explained, knowing the set of whitelisted CBs beforehand is always a plus for the downstream steps of the pipeline. Currently, Alevin merges every observed CB that is within 1 edit distance of a known whitelist CB into that whitelist CB. The underlying assumption is that a sequencing error (although it has low probability) can change the CB sequence, and we can correct for that. I wonder, is this the right thing to do for your experiment?
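
The merging step described above can be sketched in Python as follows. This is an illustration under stated assumptions, not Alevin's implementation: it only considers substitutions (for two barcodes of the same length, a single edit is necessarily a substitution), and it returns `None` when zero or several whitelist barcodes are one mismatch away. How Alevin itself resolves ties or handles indels is not described in this thread. The name `correct_barcode` is hypothetical.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def correct_barcode(observed, whitelist):
    """Return the whitelist barcode that `observed` maps to, or None.

    Exact matches are kept. Otherwise the barcode is reassigned only if
    exactly one same-length whitelist entry differs at a single position.
    """
    if observed in whitelist:
        return observed
    candidates = [
        wl for wl in whitelist
        if len(wl) == len(observed) and hamming(wl, observed) == 1
    ]
    # Assumption: ambiguous or unmatched barcodes are discarded.
    return candidates[0] if len(candidates) == 1 else None


# Hypothetical whitelist, for illustration only.
whitelist = {"AAAAAAA", "AAACCCC", "GGGTTTT"}
print(correct_barcode("AAAAAAT", whitelist))  # one substitution away from AAAAAAA
print(correct_barcode("CCCCCCC", whitelist))  # nothing within one mismatch
```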

Re: > Is there an option for max distance allowed between BC or UMI?
Sorry, but I don't completely understand this question. When you say distance allowed between CB and UMI, do you mean there is a sequence between the CB and the UMI (as in inDrop)? If so, we might have to tweak the alevin command-line flags a bit again.

But I suspect what you meant is the max distance allowed among the CBs themselves. If so, then unfortunately we currently only correct within 1 edit distance for both CB and UMI. If you think your protocol needs more correction, we can put this on the feature request list and discuss working on it for the next release.

Author

Hoohm commented Jul 5, 2018

Hey @k3yavi , Thanks for the fast answer.

You got it right. I was asking about the max distance allowed for cell barcodes and UMIs.

I will try it out with the default options and come back to you for a potential future feature request.

Member

k3yavi commented Jul 27, 2018

Hey @Hoohm, we have released v0.11.1 with some fixes for the customized length mode.

Let us know how it works out for you.

Member

k3yavi commented Aug 12, 2018

@Hoohm, I believe the custom length options look good from our end. Feel free to reopen the issue if you face any problems while using alevin in this mode.

Re: more customizable options, such as a regex for extracting the CB and UMI: this is still in development and has been raised in issue #233. We will keep that issue open until we have more generic extraction.

Thanks again for the feedback and useful comments.


kh49 commented Sep 5, 2018

How do I apply the custom length settings if I am using the wrapper for v1 10x data? I have v1 data that has a 5 bp UMI instead of a 10 bp UMI, but everything else is the same. Can I just add --umiLength 5?
Data:
https://support.10xgenomics.com/single-cell-gene-expression/datasets/1.1.0/cd14_monocytes

Member

k3yavi commented Sep 5, 2018

Hi @pophipi ,
I think it should work if you specify --gemcode along with all three custom flags, i.e. --end 5 --umiLength 5 --barcodeLength 14, on the command line.
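
For anyone scripting these runs, the two flag combinations quoted in this thread can be assembled with a small Python helper. This is only a sketch: `custom_geometry_flags` is a hypothetical name, it emits nothing beyond the flags already mentioned above, and the rest of the alevin command line (index, input files, output directory and so on) is deliberately left out. Whether combining `--gemcode` with the custom lengths behaves as hoped is exactly what the reply above leaves open.

```python
def custom_geometry_flags(barcode_length, umi_length, end=5, gemcode=False):
    """Build the barcode-geometry flags quoted in this thread as an argv list."""
    flags = ["--gemcode"] if gemcode else []
    flags += [
        "--end", str(end),
        "--umiLength", str(umi_length),
        "--barcodeLength", str(barcode_length),
    ]
    return flags


# SCRB-seq variant from the original question: 7-base CB, 9-base UMI.
print(" ".join(custom_geometry_flags(7, 9)))
# GemCode data with a 5-base UMI, as suggested in the reply above.
print(" ".join(custom_geometry_flags(14, 5, gemcode=True)))
```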
