Docs 0.1.0dev (#75)
* update docs

* fix linting

* update description of pipeline
nvnieuwk authored Mar 25, 2024
1 parent 8d2e7dc commit a4d35e6
Showing 14 changed files with 253 additions and 175 deletions.
3 changes: 2 additions & 1 deletion .github/PULL_REQUEST_TEMPLATE.md
@@ -17,9 +17,10 @@ Learn more about contributing: [CONTRIBUTING.md](https://github.com/nf-cmgg/stru
- [ ] If you've fixed a bug or added code that should be tested, add tests!
- [ ] If you've added a new tool - have you followed the pipeline conventions in the [contribution docs](https://github.com/nf-cmgg/structural/tree/master/.github/CONTRIBUTING.md)
- [ ] Make sure your code lints (`nf-core lint`).
-- [ ] Ensure the test suite passes (`nf-test test main.nf.test -profile test,docker`).
+- [ ] Ensure the test suite passes (`nf-test test`).
- [ ] Check for unexpected warnings in debug mode (`nextflow run . -profile debug,test,docker --outdir <OUTDIR>`).
- [ ] Usage Documentation in `docs/usage.md` is updated.
- [ ] Output Documentation in `docs/output.md` is updated.
+- [ ] Parameters Documentation is updated with `nf-core schema docs --format markdown --output docs/parameters.md --force`
- [ ] `CHANGELOG.md` is updated.
- [ ] `README.md` is updated (including new tool citations and authors/contributors).
5 changes: 4 additions & 1 deletion .nf-core.yml
@@ -1,5 +1,7 @@
lint:
files_exist:
- CITATIONS.md
- docs/README.md
- CODE_OF_CONDUCT.md
- .github/ISSUE_TEMPLATE/config.yml
- .github/workflows/awstest.yml
@@ -12,6 +14,7 @@ lint:
- manifest.homePage
files_unchanged:
- LICENSE
- .github/PULL_REQUEST_TEMPLATE.md
- .github/CONTRIBUTING.md
- .github/ISSUE_TEMPLATE/bug_report.yml
- .github/workflows/linting.yml
@@ -22,6 +25,6 @@ lint:
repository_type: pipeline
template:
author: nvnieuwk
-description: A nextflow pipeline for calling structural variants
+description: A bioinformatics best-practice analysis pipeline for calling structural variants (SVs), copy number variants (CNVs) and repeat region expansions (RREs) from short DNA reads.
name: structural
prefix: nf-cmgg
41 changes: 0 additions & 41 deletions CITATIONS.md

This file was deleted.

62 changes: 3 additions & 59 deletions README.md
@@ -1,5 +1,5 @@
[![GitHub Actions CI Status](https://github.com/nf-cmgg/structural/actions/workflows/ci.yml/badge.svg)](https://github.com/nf-cmgg/structural/actions/workflows/ci.yml)
-[![GitHub Actions Linting Status](https://github.com/nf-cmgg/structural/actions/workflows/linting.yml/badge.svg)](https://github.com/nf-cmgg/structural/actions/workflows/linting.yml)[![Cite with Zenodo](http://img.shields.io/badge/DOI-10.5281/zenodo.XXXXXXX-1073c8?labelColor=000000)](https://doi.org/10.5281/zenodo.XXXXXXX)
+[![GitHub Actions Linting Status](https://github.com/nf-cmgg/structural/actions/workflows/linting.yml/badge.svg)](https://github.com/nf-cmgg/structural/actions/workflows/linting.yml)
[![nf-test](https://img.shields.io/badge/unit_tests-nf--test-337ab7.svg)](https://www.nf-test.com)

[![Nextflow](https://img.shields.io/badge/nextflow%20DSL2-%E2%89%A523.10.0-23aa62.svg)](https://www.nextflow.io/)
@@ -10,62 +10,6 @@

## Introduction

-**nf-cmgg/structural** is a bioinformatics best-practice analysis pipeline for calling structural variants from short reads.
+**nf-cmgg/structural** is a bioinformatics best-practice analysis pipeline for calling structural variants (SVs), copy number variants (CNVs) and repeat region expansions (RREs) from short DNA reads. The pipeline handles the calling of the variants and postprocessing (filtering, annotating...)

-The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It uses Docker/Singularity containers making installation trivial and results highly reproducible. The [Nextflow DSL2](https://www.nextflow.io/docs/latest/dsl2.html) implementation of this pipeline uses one container per process which makes it much easier to maintain and update software dependencies. Where possible, these processes have been submitted to and installed from [nf-core/modules](https://github.com/nf-core/modules) in order to make them available to all nf-core pipelines, and to everyone within the Nextflow community!
-
-![metro map](docs/images/metro_map.png)
-
-## Usage
-
-> **Note**
-> If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/usage/installation) on how
-> to set-up Nextflow. Make sure to [test your setup](https://nf-co.re/docs/usage/introduction#how-to-run-a-pipeline)
-> with `-profile test` before running the workflow on actual data.
-
-Now, you can run the pipeline using:
-
-```bash
-nextflow run nf-cmgg/structural \
--profile <docker/singularity/.../institute> \
---input samplesheet.csv \
---outdir <OUTDIR>
-```
-
-> **Warning:**
-> Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. Custom config files including those
-> provided by the `-c` Nextflow option can be used to provide any configuration _**except for parameters**_;
-> see [docs](https://nf-co.re/usage/configuration#custom-configuration-files).
-
-## Documentation
-
-The CenterForMedicalGenetics/structural pipeline comes with documentation about the pipeline [usage](https://github.com/nf-cmgg/structural/tree/master/docs/usage.md) and [output](https://github.com/nf-cmgg/structural/tree/master/docs/output.md).
-
-> [!WARNING]
-> Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. Custom config files including those provided by the `-c` Nextflow option can be used to provide any configuration _**except for parameters**_;
-> see [docs](https://nf-co.re/usage/configuration#custom-configuration-files).
-
-## Credits
-
-nf-cmgg/structural was originally written by Nicolas Vannieuwkerke and Mattias Van Heetvelde.
-
-## Contributions and Support
-
-If you would like to contribute to this pipeline, please see the [contributing guidelines](.github/CONTRIBUTING.md).
-
-## Citations
-
-<!-- TODO nf-core: Add citation for pipeline after first release. Uncomment lines below and update Zenodo doi and badge at the top of this file. -->
-<!-- If you use nf-cmgg/structural for your analysis, please cite it using the following doi: [10.5281/zenodo.XXXXXX](https://doi.org/10.5281/zenodo.XXXXXX) -->
-
-<!-- TODO nf-core: Add bibliography of tools and data used in your pipeline -->
-
-An extensive list of references for the tools used by the pipeline can be found in the [`CITATIONS.md`](CITATIONS.md) file.
-
-You can cite the `nf-core` publication as follows:
-
-> **The nf-core framework for community-curated bioinformatics pipelines.**
->
-> Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.
->
-> _Nat Biotechnol._ 2020 Feb 13. doi: [10.1038/s41587-020-0439-x](https://dx.doi.org/10.1038/s41587-020-0439-x).
+Please have a look at the [documentation](https://nf-cmgg.github.io/structural/latest/) on how to run the pipeline
8 changes: 6 additions & 2 deletions assets/schema_input.json
@@ -9,11 +9,15 @@
"properties": {
"sample": {
"type": "string",
-"meta": ["id", "sample"]
+"meta": ["id", "sample"],
+"pattern": "^\\S+$",
+"errorMessage": "The sample name must be a string and cannot contain spaces."
},
"family": {
"type": "string",
-"meta": ["family"]
+"meta": ["family"],
+"pattern": "^\\S+$",
+"errorMessage": "The family name must be a string and cannot contain spaces."
},
"cram": {
"type": "string",
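The new `pattern` entries for `sample` and `family` reject any value that is empty or contains whitespace. As a rough illustration of what `^\S+$` accepts, here is a minimal Python sketch. It is not the pipeline's own validation code, and the helper `is_valid_name` and the example values are hypothetical.

```python
import re

# Python counterpart of the schema pattern "^\\S+$" (the JSON-escaped form of ^\S+$).
# fullmatch is used instead of "^...$" because Python's `$` also matches just
# before a trailing newline, which would let "name\n" slip through.
NO_WHITESPACE = re.compile(r"\S+")


def is_valid_name(value):
    """Return True if value is a non-empty string without any whitespace."""
    return isinstance(value, str) and NO_WHITESPACE.fullmatch(value) is not None


# Hypothetical samplesheet values, for illustration only.
for name in ["PosCon1", "family_01", "sample 1", "", "trio\tA"]:
    status = "ok" if is_valid_name(name) else "rejected"
    print(f"{name!r}: {status}")
```

Note that `\S` excludes tabs and newlines as well as spaces, so the pattern is somewhat stricter than the "cannot contain spaces" wording of the error message suggests.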
4 changes: 1 addition & 3 deletions conf/test.config
@@ -32,13 +32,11 @@ params {
qdnaseq_male = params.test_data["homo_sapiens"]["genome"]["genome_qdnaseq"]
qdnaseq_female = params.test_data["homo_sapiens"]["genome"]["genome_qdnaseq"]
igenomes_ignore = true
-genomes_ignore = false
+genomes_ignore = true
genome = 'GRCh38'
genomes_base = "s3://reference-data/genomes"
vep_cache = null
annotsv_annotations = null

annotate = true
concat_output = true

// Pipeline parameters
99 changes: 99 additions & 0 deletions docs/CITATIONS.md
@@ -0,0 +1,99 @@
# nf-cmgg/structural: Citations

## [nf-core](https://pubmed.ncbi.nlm.nih.gov/32055031/)

> Ewels PA, Peltzer A, Fillinger S, Patel H, Alneberg J, Wilm A, Garcia MU, Di Tommaso P, Nahnsen S. The nf-core framework for community-curated bioinformatics pipelines. Nat Biotechnol. 2020 Mar;38(3):276-278. doi: 10.1038/s41587-020-0439-x. PubMed PMID: 32055031.
## [Nextflow](https://pubmed.ncbi.nlm.nih.gov/28398311/)

> Di Tommaso P, Chatzou M, Floden EW, Barja PP, Palumbo E, Notredame C. Nextflow enables reproducible computational workflows. Nat Biotechnol. 2017 Apr 11;35(4):316-319. doi: 10.1038/nbt.3820. PubMed PMID: 28398311.
## Pipeline tools

- [AnnotSV](https://pubmed.ncbi.nlm.nih.gov/29669011/)

> Geoffroy V, Herenger Y, Kress A, Stoetzel C, Piton A, Dollfus H, Muller J. AnnotSV: an integrated tool for structural variations annotation. Bioinformatics. 2018 Oct 15;34(20):3572-3574. doi: 10.1093/bioinformatics/bty304. PMID: 29669011.
- [BCFTools](https://pubmed.ncbi.nlm.nih.gov/21903627/)

> Li H: A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics. 2011 Nov 1;27(21):2987-93. doi: 10.1093/bioinformatics/btr509. PubMed PMID: 21903627; PubMed Central PMCID: PMC3198575.
- [bedgovcf](https://github.com/nvnieuwk/bedgovcf)

- [DELLY](https://academic.oup.com/bioinformatics/article/28/18/i333/245403)

> Tobias Rausch, Thomas Zichner, Andreas Schlattl, Adrian M. Stütz, Vladimir Benes, Jan O. Korbel, DELLY: structural variant discovery by integrated paired-end and split-read analysis, Bioinformatics, Volume 28, Issue 18, September 2012, Pages i333–i339, https://doi.org/10.1093/bioinformatics/bts378
- [EnsemblVEP](https://pubmed.ncbi.nlm.nih.gov/27268795/)

> McLaren W, Gil L, Hunt SE, et al.: The Ensembl Variant Effect Predictor. Genome Biol. 2016 Jun 6;17(1):122. doi: 10.1186/s13059-016-0974-4. PubMed PMID: 27268795; PubMed Central PMCID: PMC4893825.
- [ExpansionHunter](https://academic.oup.com/bioinformatics/article/35/22/4754/5499079)

> Egor Dolzhenko, Viraj Deshpande, Felix Schlesinger, Peter Krusche, Roman Petrovski, Sai Chen, Dorothea Emig-Agius, Andrew Gross, Giuseppe Narzisi, Brett Bowman, Konrad Scheffler, Joke J F A van Vugt, Courtney French, Alba Sanchis-Juan, Kristina Ibáñez, Arianna Tucci, Bryan R Lajoie, Jan H Veldink, F Lucy Raymond, Ryan J Taft, David R Bentley, Michael A Eberle, ExpansionHunter: a sequence-graph-based tool to analyze variation in short tandem repeat regions, Bioinformatics, Volume 35, Issue 22, November 2019, Pages 4754–4756, https://doi.org/10.1093/bioinformatics/btz431
- [Gawk](https://www.gnu.org/software/gawk/)

- [GNU sed](http://www.gnu.org/software/sed/)

- [GNU tar](https://www.gnu.org/software/tar/)

- [Jasmine](https://pubmed.ncbi.nlm.nih.gov/36658279/)

> Kirsche M, Prabhu G, Sherman R, Ni B, Battle A, Aganezov S, Schatz MC. Jasmine and Iris: population-scale structural variant comparison and analysis. Nat Methods. 2023 Mar;20(3):408-417. doi: 10.1038/s41592-022-01753-3. Epub 2023 Jan 19. PMID: 36658279; PMCID: PMC10006329.
- [Manta](https://pubmed.ncbi.nlm.nih.gov/26647377/)

> Chen X, Schulz-Trieglaff O, Shaw R, et al.: Manta: rapid detection of structural variants and indels for germline and cancer sequencing applications. Bioinformatics. 2016 Apr 15;32(8):1220-2. doi: 10.1093/bioinformatics/btv710. PubMed PMID: 26647377.
- [MultiQC](https://pubmed.ncbi.nlm.nih.gov/27312411/)

> Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016 Oct 1;32(19):3047-8. doi: 10.1093/bioinformatics/btw354. Epub 2016 Jun 16. PubMed PMID: 27312411; PubMed Central PMCID: PMC5039924.
- [ngs-bits](https://github.com/imgag/ngs-bits)

- [SAMtools](https://pubmed.ncbi.nlm.nih.gov/19505943/)

> Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R; 1000 Genome Project Data Processing Subgroup. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009 Aug 15;25(16):2078-9. doi: 10.1093/bioinformatics/btp352. Epub 2009 Jun 8. PubMed PMID: 19505943; PubMed Central PMCID: PMC2723002.
- [QDNAseq](https://pubmed.ncbi.nlm.nih.gov/25236618/)

> Scheinin I, Sie D, Bengtsson H, van de Wiel MA, Olshen AB, van Thuijl HF, van Essen HF, Eijk PP, Rustenburg F, Meijer GA, Reijneveld JC, Wesseling P, Pinkel D, Albertson DG, Ylstra B. DNA copy number analysis of fresh and formalin-fixed specimens by shallow whole-genome sequencing with identification and exclusion of problematic regions in the genome assembly. Genome Res. 2014 Dec;24(12):2022-32. doi: 10.1101/gr.175141.114. Epub 2014 Sep 18. PMID: 25236618; PMCID: PMC4248318.
- [smoove](https://github.com/brentp/smoove)

- [svync](https://github.com/nvnieuwk/svync)

- [Tabix](https://academic.oup.com/bioinformatics/article/27/5/718/262743)

> Li H, Tabix: fast retrieval of sequence features from generic TAB-delimited files, Bioinformatics, Volume 27, Issue 5, 1 March 2011, Pages 718–719, doi: 10.1093/bioinformatics/btq671. PubMed PMID: 21208982. PubMed Central PMCID: PMC3042176.
- [Vcfanno](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0973-5)

> Pedersen, B.S., Layer, R.M. & Quinlan, A.R. Vcfanno: fast, flexible annotation of genetic variants. Genome Biol 17, 118 (2016). https://doi.org/10.1186/s13059-016-0973-5
- [WisecondorX](https://academic.oup.com/nar/article/47/4/1605/5253050)

> Lennart Raman, Annelies Dheedene, Matthias De Smet, Jo Van Dorpe, Björn Menten, WisecondorX: improved copy number detection for routine shallow whole-genome sequencing, Nucleic Acids Research, Volume 47, Issue 4, 28 February 2019, Pages 1605–1614, https://doi.org/10.1093/nar/gky1263
## Software packaging/containerisation tools

- [Anaconda](https://anaconda.com)

> Anaconda Software Distribution. Computer software. Vers. 2-2.4.0. Anaconda, Nov. 2016. Web.
- [Bioconda](https://pubmed.ncbi.nlm.nih.gov/29967506/)

> Grüning B, Dale R, Sjödin A, Chapman BA, Rowe J, Tomkins-Tinch CH, Valieris R, Köster J; Bioconda Team. Bioconda: sustainable and comprehensive software distribution for the life sciences. Nat Methods. 2018 Jul;15(7):475-476. doi: 10.1038/s41592-018-0046-7. PubMed PMID: 29967506.
- [BioContainers](https://pubmed.ncbi.nlm.nih.gov/28379341/)

> da Veiga Leprevost F, Grüning B, Aflitos SA, Röst HL, Uszkoreit J, Barsnes H, Vaudel M, Moreno P, Gatto L, Weber J, Bai M, Jimenez RC, Sachsenberg T, Pfeuffer J, Alvarez RV, Griss J, Nesvizhskii AI, Perez-Riverol Y. BioContainers: an open-source and community-driven framework for software standardization. Bioinformatics. 2017 Aug 15;33(16):2580-2582. doi: 10.1093/bioinformatics/btx192. PubMed PMID: 28379341; PubMed Central PMCID: PMC5870671.
- [Docker](https://dl.acm.org/doi/10.5555/2600239.2600241)

> Merkel, D. (2014). Docker: lightweight linux containers for consistent development and deployment. Linux Journal, 2014(239), 2. doi: 10.5555/2600239.2600241.
- [Singularity](https://pubmed.ncbi.nlm.nih.gov/28494014/)

> Kurtzer GM, Sochat V, Bauer MW. Singularity: Scientific containers for mobility of compute. PLoS One. 2017 May 11;12(5):e0177459. doi: 10.1371/journal.pone.0177459. eCollection 2017. PubMed PMID: 28494014; PubMed Central PMCID: PMC5426675.
8 changes: 0 additions & 8 deletions docs/README.md

This file was deleted.
