khughitt/eve

EVE: Ensemble Variant Detection

Andre Hennig, Keith Hughitt, Alexander Peltzer, Shrutii Sarda, Kay Nieselt

Overview

This project was started as part of a summer course in bioinformatics and computational biology hosted by the University of Tübingen in collaboration with the University of Maryland, College Park from August 4-9, 2014.

Requirements

EVE is written in Python. It requires a recent version of Python, several Python libraries, and a number of command-line bioinformatics tools.

Python

Bioinformatics tools

Installation

@TODO

Configuration

@TODO

Usage

FASTQ input example

python eve.py -f path/to/genome.fasta       \
              reads_1.fastq reads_2.fastq

BAM input example

python eve.py -f path/to/genome.fasta       \
              accepted_hits.bam
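The two examples above differ only in the kind of input file: a pair of FASTQ read files or a single BAM alignment file. A minimal sketch of how a script could tell the two cases apart from the file names is shown here. The helper name `classify_inputs` and the list of accepted suffixes are assumptions made for illustration and are not taken from EVE's source.

```python
from pathlib import Path

# Hypothetical list of FASTQ suffixes; the examples in this README use
# ".fastq" and ".fastq.gz", the ".fq" variants are an assumption.
FASTQ_SUFFIXES = (".fastq", ".fq", ".fastq.gz", ".fq.gz")


def classify_inputs(paths):
    """Return "bam" for one BAM file or "fastq" for a pair of FASTQ files.

    Hypothetical helper, not part of EVE. Single-end reads are rejected
    because support for them is still listed under TODO.
    """
    names = [Path(p).name.lower() for p in paths]
    if len(names) == 1 and names[0].endswith(".bam"):
        return "bam"
    if len(names) == 2 and all(n.endswith(FASTQ_SUFFIXES) for n in names):
        return "fastq"
    raise ValueError(f"unsupported input files: {list(paths)}")
```

With BAM input the reads are presumably already aligned, so a pipeline built this way could skip the mapping step and go straight to variant detection.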

Training example

python eve.py -f path/to/genome.fasta       \
              --train=actual_snps.vcf       \
              --num-threads=32              \
              reads_1.fastq reads_2.fastq

A more complex example:

python eve.py --fasta=path/to/genome.fasta             \
              --mapper=bowtie2                         \
              --variant-detectors=gatk,mpileup,varscan \
              --working-directory=/scratch/eve         \
              --output-dir=/scratch/eve-output         \
              --num-threads=32                         \
              reads_1.fastq.gz reads_2.fastq.gz
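For readers who want to see how the options used in these examples fit together, the following is a rough sketch of an `argparse` parser that accepts the same flags. It is not EVE's actual argument handling: the function name `build_parser`, the help texts, the default of one thread, and the assumption that `-f` is the short form of `--fasta` are all illustrative.

```python
import argparse


def build_parser():
    """Build a parser mirroring the flags shown in the usage examples.

    Hypothetical sketch; defaults and help texts are assumptions.
    """
    parser = argparse.ArgumentParser(prog="eve.py")
    parser.add_argument("-f", "--fasta", required=True,
                        help="reference genome in FASTA format")
    parser.add_argument("--mapper", default=None,
                        help="read mapper to use, e.g. bowtie2")
    parser.add_argument("--variant-detectors", default=None,
                        type=lambda s: s.split(","),
                        help="comma-separated list of variant callers")
    parser.add_argument("--working-directory", default=None)
    parser.add_argument("--output-dir", default=None)
    parser.add_argument("--num-threads", type=int, default=1)
    parser.add_argument("--train", default=None,
                        help="VCF file of known variants used for training")
    parser.add_argument("inputs", nargs="+",
                        help="two FASTQ files or one BAM file")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Splitting `--variant-detectors` on commas at parse time gives the rest of the program a plain list of caller names to iterate over.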

TODO

  • Add support for single-end reads
  • Enable setting of Picard location
  • Incorporate coverage, quality scores, sequence complexity, and GC richness into classification.
  • Include trimming/QA step before mapping?
  • Check for FASTA indices
  • Unit testing / CI
  • Sphinx documentation
  • setup.py
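As an illustration of the "Check for FASTA indices" item, one possible approach is sketched below. It relies on the convention that `samtools faidx` writes its index next to the reference as `<fasta>.fai`. The function name `missing_fasta_index` is hypothetical, and the indices required by other tools (for example the mapper) are not covered.

```python
from pathlib import Path


def missing_fasta_index(fasta_path):
    """Return the expected ".fai" path if it does not exist, else None.

    Hypothetical helper, not part of EVE. It only looks for the
    samtools-style index stored alongside the FASTA file.
    """
    fai = Path(str(fasta_path) + ".fai")
    return None if fai.is_file() else fai
```

A caller could use the returned path to print a helpful message, or to decide whether to build the index before starting the pipeline.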

References

  • Michael D Linderman, Tracy Brandt, Lisa Edelmann, Omar Jabado, Yumi Kasai, Ruth Kornreich, Milind Mahajan, Hardik Shah, Andrew Kasarskis, Eric E Schadt (2014) Analytical Validation of Whole Exome and Whole Genome Sequencing for Clinical Applications. BMC Medical Genomics 7:20. doi:10.1186/1755-8794-7-20
  • A. Talwalkar, J. Liptrap, J. Newcomb, C. Hartl, J. Terhorst, K. Curtis, M. Bresler, Y. S. Song, M. I. Jordan, D. Patterson (2014) SMaSH: A Benchmarking Toolkit for Human Genome Variant Calling. Bioinformatics. doi:10.1093/bioinformatics/btu345
