Document colour encoding (bases for red, green, yellow and blue) #22

Open
peterjc opened this issue Nov 23, 2018 · 11 comments

@peterjc

peterjc commented Nov 23, 2018

The rival project, the Blocksford Brickopore LEGO sequencer, uses red (“T”), green (“A”), yellow (“G”) and blue (“C”) LEGO bricks, as documented here:

http://www.earlham.ac.uk/articles/earlham-institute-lego-sequencer

My immediate concern was whether they had invented a new encoding (I'm having FASTQ flashbacks here).

Sadly it seems yes, since according to https://samnicholls.net/2017/03/15/lego-sequencer/ you used blue for "T".

Further digging, https://imgur.com/gallery/4O8r4 says "It would later turn out that we could distinguish yellow using the Clear channel, as it is much more reflective these values dwarf the rest of the values. Red, blue and green can easily be distinguished by checking the which of the RGB channels is currently giving the highest reading."

I could find the relevant part of your code here, mapping an RGB and Clear value to DNA base letters:

https://github.com/SAMTOMINDUSTRYS/monsterlab/blob/master/software/samseq.ino#L132

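  // Yellow bricks are far more reflective, so the Clear channel alone identifies them.
  if (clear > 10000) {
    Serial.print("C");  // yellow
  }
  else {
    // Otherwise call the base from whichever RGB channel reads highest.
    if (green > blue && green > red) {
      Serial.print("A");  // green
    }
    else if (red > green && red > blue) {
      Serial.print("G");  // red
    }
    else if (blue > green) {
      Serial.print("T");  // blue
    }
    else {
      Serial.print("N");  // ambiguous reading
    }
  }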

I think that means yellow "C", green "A", red "G", blue "T", other "N".
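
To make the mismatch between the two colour keys concrete, here is a minimal sketch that re-expresses a read called under the key inferred above (yellow "C", green "A", red "G", blue "T") using the Brickopore key (red "T", green "A", yellow "G", blue "C"). The function names `monsterToBrickopore` and `translateRead` are hypothetical and appear in neither project; this is plain C++ rather than Arduino code, and it assumes the colour keys quoted in this thread are correct.

```cpp
#include <string>

// Hypothetical helper: map one base called under the key inferred from
// samseq.ino (yellow=C, green=A, red=G, blue=T) to the base the same brick
// colour would represent under the Brickopore key (red=T, green=A,
// yellow=G, blue=C).
char monsterToBrickopore(char base) {
    switch (base) {
        case 'C': return 'G';  // yellow brick
        case 'A': return 'A';  // green brick (same in both keys)
        case 'G': return 'T';  // red brick
        case 'T': return 'C';  // blue brick
        default:  return 'N';  // anything else is treated as unknown
    }
}

// Hypothetical helper: translate a whole read, base by base.
std::string translateRead(const std::string& read) {
    std::string out;
    out.reserve(read.size());
    for (char b : read) {
        out.push_back(monsterToBrickopore(b));
    }
    return out;
}
```

Under these assumptions, a stack of yellow, green, red and blue bricks called as "CAGT" by one sequencer would be called as "GATC" by the other; only green "A" agrees between the two keys.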

(On a more serious note, I presume the printed sheets etc deliberately do not give away the colour key so as not to spoil the surprise)

@SamStudio8
Member

SamStudio8 commented Nov 23, 2018

I think we took our colour codings from the ones used by Tablet, but we should make a note of them in the README somewhere. Though you're right, the monster sheets are intended to not provide a key to prevent kids gaming the system! Although, at future workshops we'll probably get participants to look at the Monster Lab Zoo and see what bases encode what phenotype.

Of course, it's likely that Brickopore may have chosen an alternative encoding strategy to avoid infringing our intellectual property when building their own sequencer... ;)

@peterjc
Author

peterjc commented Nov 23, 2018

Yes, the colour scheme does seem to match Tablet having checked a couple of screenshots:
https://ics.hutton.ac.uk/tablet/tablet-screenshots/

I expect @imilne would be able to tell us where that scheme originally came from (as I assume it followed an even older convention).

@gringer

gringer commented Nov 23, 2018

The convention for electrophoresis colours, as used in trace plots, is red (“T”), green (“A”), yellow (“G”) and blue (“C”), e.g. see here. This is the closest to a DNA colour convention that I have found, and what I prefer to use in all my code.

@imilne

imilne commented Nov 23, 2018

Tablet would have taken its colours from Flapjack, which was a follow on from TOPALi (http://www.topali.org/topali-v1/), the first program I ever wrote doing that kind of visualization. But you're talking ~2002 - I've no recollection now if I copied those colours from something else around at the time (eg Bioedit or Jalview) or just picked ones I thought looked pretty :)

@peterjc
Author

peterjc commented Nov 23, 2018

Chromas (which is the oldest Sanger capillary sequencing tool I've used) does red "T", green "A", black "G" and blue "C" (http://technelysium.com.au/wp/chromas/), which, with the apparently common yellow/black substitution, matches the Blocksford Brickopore LEGO sequencer.

I wonder if we can find some early references for precedents establishing these kinds of convention? Might even make a nice short review paper somewhere, if no one has done that already?

@peterjc
Author

peterjc commented Nov 23, 2018

The original JalView documentation describes various protein colour schemes:
http://www.jalview.org/version118/documentation.html#colour

P.S. The citation for the Taylor protein colour scheme is indeed an entertaining read! It does not mention nucleic acids though.

W R Taylor. Residual colours: a proposal for aminochromography.
Protein Engineering, Vol 10, 743-746 (1997)
https://dx.doi.org/10.1093/protein/10.7.743

@peterjc
Author

peterjc commented Nov 23, 2018

BioEdit screenshots here:
http://www.mbio.ncsu.edu/BioEdit/screenshots.html

Again, red "T" or "U", green "A", black "G" and blue "C"

@peterjc
Author

peterjc commented Nov 24, 2018

Iain Macaulay on Twitter https://twitter.com/whatchamacaulay/status/1066265073142964224 said:

It's based on the emission spectra of the fluorescent dyes used in ABI gel sequencers - yellow was changed to black as it's hard to see a yellow trace on a white electropherogram (but not on a black gel background).
[attached image]

That is, the image shows the colours for raw data on the ABI Prism 310 electropherogram (black for G) and the ABI Prism 377 gel image (yellow for G), both using red for T, green for A and blue for C.

I wonder when that documentation was published? That might be one of the earliest citable sources for this convention (as very sensibly used by the Blocksford Brickopore LEGO sequencer).

@gringer

gringer commented Nov 26, 2018

After 1986, anyway. ABI used A:Fluorescein / FITC (520nm; green emission), T:NBD (550nm; green/yellow emission), G:Tetramethylrhodamine (580nm; yellow emission) and C:Texas Red (610nm; orange emission) in what looks like it could be their first fluorescence sequencing paper:

https://doi.org/10.1038/321674a0

Deep blue was avoided at the time because of the potential for scattering and overlap with fluorescence background (according to the paper). It seems like that unease has since been overcome.

@peterjc
Author

peterjc commented Nov 26, 2018

Using red, green, greeny-yellow, and yellow Lego blocks just wouldn't have the same visual impact (and would be more expensive to source too). Blue is better :)

@peterjc
Author

peterjc commented Nov 26, 2018

Update from Jim Procter (@foreveremain on GitHub) on Twitter https://twitter.com/foreveremain/status/1067076249745539072 saying:

AM Waterhouse created Jalview's nucleotide colours in Jan 2005, based on TOPALi's topali.org - so best ask Frank Wright and Iain Milne ;) FWIW see other schemes in Figure 2 of nature.com/articles/nmeth… (paywalled but figure is visible)

Here is Figure 2(c) from Procter et al. (2010), https://doi.org/10.1038/nmeth.1434:

[screenshot of Figure 2(c)]

Amusingly, none of the four schemes shown that use four colours for the four bases matches the ABI electrophoresis/gel colours.
