This repository contains code of ALGA - short read de novo genome assembler.

swacisko/ALGA

ALGA

This repository contains the code of ALGA, a de novo genome assembler.


Description:

ALGA (ALgorithm for Genome Assembly) is a genome-scale de novo sequence assembler based on the overlap graph approach. It accepts reads from next-generation DNA sequencing as input, paired or unpaired. No parameters need to be set by the user, because ALGA adjusts them internally on the basis of the input data. The only optional parameter is the maximum allowed error rate in read overlaps, whose default (and suggested) value is 0.

Please make sure that the input reads are of very high quality. It is strongly recommended to correct them with Musket, a read correction tool based on k-mer analysis, before running ALGA.


Requirements:

CMake version 2.8.7 or higher
A compiler supporting C++17 or higher
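
The CMake requirement can be checked before building. The snippet below is only a sketch: the helper name version_ge is made up for this illustration, and it assumes a `sort` that supports the `-V` (version sort) option, as GNU coreutils does.

```sh
#!/bin/sh
# version_ge A B: succeeds when dotted version A is greater than or equal to B.
# (Hypothetical helper, not part of ALGA. Assumes `sort -V` is available.)
version_ge() {
    # Version-sort the two strings; if B comes out first, then A >= B.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

required="2.8.7"   # minimum CMake version listed above

if command -v cmake >/dev/null 2>&1; then
    # The first line of `cmake --version` has the form "cmake version X.Y.Z".
    found=$(cmake --version | head -n 1 | awk '{print $3}')
    if version_ge "$found" "$required"; then
        echo "CMake $found is new enough (need $required or higher)"
    else
        echo "CMake $found is too old (need $required or higher)"
    fi
else
    echo "cmake not found on PATH"
fi
```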


Installation:

Download the archive with the ALGA code and unpack it, or clone the ALGA repository. Use CMake to build the binary. For example, on Linux, you can run the following commands in the main directory of ALGA:

mkdir build
cd build
cmake ..
make

After this, the executable file named "ALGA" should be in the "build" directory.


Usage tips:

Please use Musket to correct the reads before running ALGA.
A typical ALGA run specifies one or two input files (with the .fastq or .fasta extension), the number of threads, and the output file name for contigs:

./ALGA --file1=path1/reads_1.fastq --file2=path2/reads_2.fastq --threads=8 --output=contigs.fasta

ALGA can also be run with a single input file; in that case, just remove the argument --file2=path2/reads_2.fastq. The number of threads is optional and defaults to 6.
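
The optional arguments described above can be handled by a small wrapper. This is a minimal sketch, not part of ALGA: the function name build_alga_cmd and its positional arguments are made up for this illustration, and it only prints the command line (using the options shown above) instead of running it. Paths containing spaces are not handled.

```sh
#!/bin/sh
# build_alga_cmd FILE1 FILE2 THREADS OUTPUT
# Prints an ALGA command line. FILE2 and THREADS may be empty strings, in
# which case the corresponding option is left out and ALGA runs on a single
# input file and/or with its default number of threads.
# (Hypothetical helper for illustration only.)
build_alga_cmd() {
    file1=$1
    file2=$2
    threads=$3
    output=$4

    cmd="./ALGA --file1=$file1"
    if [ -n "$file2" ]; then
        cmd="$cmd --file2=$file2"
    fi
    if [ -n "$threads" ]; then
        cmd="$cmd --threads=$threads"
    fi
    cmd="$cmd --output=$output"

    printf '%s\n' "$cmd"
}

# Paired reads, 8 threads:
build_alga_cmd path1/reads_1.fastq path2/reads_2.fastq 8 contigs.fasta
# Single input file, default number of threads:
build_alga_cmd path1/reads_1.fastq "" "" contigs.fasta
```

The second call prints `./ALGA --file1=path1/reads_1.fastq --output=contigs.fasta`, which is the single-file form with the thread count left to its default.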


Additional parameters:

If you suspect that the input data are of very poor quality and may still contain a large number of errors even after read correction, you can additionally use the option --error-rate=0.02 (the value, here 0.02, denotes the average expected fraction of errors).
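
Since the error rate is a fraction, it may help to validate the value before passing it on. The sketch below is an illustration only: the function name alga_error_rate_flag is made up, and the accepted range of 0 up to (but not including) 1 is an assumption rather than something ALGA documents.

```sh
#!/bin/sh
# alga_error_rate_flag RATE
# Prints "--error-rate=RATE" if RATE is a plain decimal number in [0, 1);
# otherwise prints a message to stderr and returns 1.
# (Hypothetical helper; the [0, 1) range is an assumption.)
alga_error_rate_flag() {
    if awk -v r="$1" 'BEGIN { exit !(r ~ /^[0-9]*\.?[0-9]+$/ && r + 0 >= 0 && r + 0 < 1) }'; then
        printf '%s\n' "--error-rate=$1"
    else
        echo "error rate must be a fraction in [0, 1): $1" >&2
        return 1
    fi
}

# Print the full command line with the extra option instead of running it:
if flag=$(alga_error_rate_flag 0.02); then
    echo "./ALGA --file1=path1/reads_1.fastq --file2=path2/reads_2.fastq --threads=8 $flag --output=contigs.fasta"
fi
```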


Docker:

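Docker can also be used to run ALGA. Build the image from the directory containing the Dockerfile (Docker image names must be lowercase):

	docker build -t alga .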
