Review API query for Ensembl #696
Comments
@johnllopez616 will copy the API call out of the code and paste it as a comment on this issue for @kdahlquist to review.
`let getEnsemblInfo = function (geneSymbol) {
};`
So I see that this call looks up the gene ID in the "symbol" field, which is what we want. I also see that the species is hard-coded in the call, which is what we will work on with #698.
I would like to discuss this in detail at our next meeting, but first I want to note a few observations about the Ensembl API and a possible lack of the data we need for Saccharomyces cerevisiae.
According to the documentation (https://rest.ensembl.org/ and https://rest.ensembl.org/documentation/info/symbol_lookup), when a species name and a gene symbol are given to the first $.get call in the comment above (there isn't actually a need for the second one), the data we seek should be returned. This works for Homo sapiens and has worked for other species (I tested the capuchin monkey). However, the same lookup for a Saccharomyces cerevisiae gene does not return the data we need; this was tested for ACE2, YHP1, and ADH1. The Ensembl web interface substantiates this lack of data: in a search for a gene like ACE2, the human ACE2 entry contains an Ensembl ID and summary data.
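For reference, here is a minimal sketch of how the symbol lookup URL described in that documentation can be assembled from a species name and a gene symbol. The function name `buildSymbolLookupUrl` is hypothetical and is not taken from the GRNsight code; the `lookup/symbol/:species/:symbol` path is the one given on the symbol_lookup documentation page.

```javascript
// Base URL of the Ensembl REST service, as linked in the comment above.
const ENSEMBL_REST = "https://rest.ensembl.org";

// Hypothetical helper (not from the GRNsight code): builds the URL for
// the symbol lookup endpoint, GET lookup/symbol/:species/:symbol.
function buildSymbolLookupUrl(species, geneSymbol) {
  // Encode each path segment so unusual characters cannot break the URL.
  const path = [species, geneSymbol]
    .map((segment) => encodeURIComponent(segment))
    .join("/");
  // Ask for JSON explicitly through the content-type query parameter.
  return `${ENSEMBL_REST}/lookup/symbol/${path}?content-type=application/json`;
}
```

The resulting string could be passed to `$.get` in place of a hard-coded URL, for example `$.get(buildSymbolLookupUrl("homo_sapiens", geneSymbol))`, which would also make the species a parameter rather than a constant.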
At the last meeting, it was discussed that the most important thing to obtain from Ensembl is the link to the Ensembl page. To access the Ensembl page for a given gene, use https://uswest.ensembl.org/Saccharomyces_cerevisiae/Gene/Summary?db=core;g= followed by the locus tag, where the locus tag may be obtained from NCBI.
It looks like the https URL in the previous comment points at a specific mirror of the Ensembl database. Do we want to hard-code a specific mirror, or should we just hit "ensembl.org"?
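One way to keep the mirror question open is to make the host a parameter. The sketch below builds the gene page URL from the pattern quoted above; `buildEnsemblGeneUrl` is a hypothetical name, the default host is simply the mirror from the earlier comment, and "www.ensembl.org" in the usage note is an assumed alternative rather than a decision made in this thread.

```javascript
// Hypothetical helper (not from the GRNsight code): builds the Ensembl
// gene summary page URL for a Saccharomyces cerevisiae locus tag.
// The host defaults to the mirror quoted in the previous comment and can
// be overridden if a different host is preferred.
function buildEnsemblGeneUrl(locusTag, host = "uswest.ensembl.org") {
  const base = `https://${host}/Saccharomyces_cerevisiae/Gene/Summary`;
  // The query string follows the pattern from the comment: db=core;g=<locus tag>
  return `${base}?db=core;g=${encodeURIComponent(locusTag)}`;
}
```

For example, `buildEnsemblGeneUrl("YLR131C")` uses the mirror, while `buildEnsemblGeneUrl("YLR131C", "www.ensembl.org")` targets the assumed main host. YLR131C is used here only as an example locus tag.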
According to the documentation, looking up a gene requires either an Ensembl ID or the species together with the gene symbol. Our code presently uses the latter.
Interestingly enough, there is a workaround that would let us obtain the species name without hard-coding it. According to the documentation for the taxonomy database, we can get the species name from the NCBI taxon ID, which, as explained in #697, we parse anyway in order to make NCBI's API call work. We might be able to retrieve the species from there. This means, however, that if NCBI is not functioning on a given day, we won't be able to access Ensembl unless we retrieve the species data from another source. Is this something we want to discuss further, @kdahlquist?
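A rough sketch of that workaround, under some assumptions: the taxonomy endpoint is taken to be `taxonomy/id/:id` on rest.ensembl.org, and the species segment expected by the symbol lookup is taken to be the scientific name in lowercase with underscores. Both function names are hypothetical, and the name of the response field that holds the scientific name should be checked against the taxonomy documentation before relying on it.

```javascript
// Hypothetical helper: URL for looking up a taxon by its NCBI taxon ID.
// Assumes the Ensembl REST taxonomy endpoint GET taxonomy/id/:id.
function buildTaxonomyUrl(taxonId) {
  const id = encodeURIComponent(taxonId);
  return `https://rest.ensembl.org/taxonomy/id/${id}?content-type=application/json`;
}

// Hypothetical helper: converts a scientific name such as
// "Saccharomyces cerevisiae" into the lowercase, underscore-separated
// form assumed here for the species segment of the symbol lookup.
function toEnsemblSpecies(scientificName) {
  return scientificName.trim().toLowerCase().split(/\s+/).join("_");
}
```

Note that a strain-level name (for example one ending in a strain designation) would be converted word for word, so the result may not match a species name that Ensembl accepts. That case would need separate handling.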
We will not use the Ensembl API, but will just populate a URL on the gene page, for which another issue has been opened. Also see: https://github.com/dondi/GRNsight/wiki/Web-API-Guide.
TODO: Review API query for Ensembl