ITH_TCGA

This repository contains the code necessary to run the analysis of ITH (intra-tumor heterogeneity) in 3 cancer types on TCGA data, published in "Assessing reliability of intra-tumor heterogeneity estimates from single sample whole exome sequencing data" (Abecassis et al., 2018). The master code is in the bash script run_all.sh. Executing it all at once is not advised, given the run time and the paths that must first be changed (Python virtual environments, paths to executables, etc.).

Installation and dependencies

This project uses many packages. Requirement files are provided for use with pip. The main dependencies are listed below, with a note for those that can be installed with pip. A Python or R version is specified where a package did not work with R-3.2.3 or Python 3.5.5. For Python, 3 virtual environments were used, one for each required version (2.7, 3.5, 3.6). All computation was performed on a Debian distribution, with Torque as the scheduler.

data download packages
data preprocessing
ITH methods
statistical packages

All scripts should be run from the root of the repository. The tmp folder contains the final file with all results from the ITH methods, which makes it possible to reproduce the survival analysis.

All SNV and CNA calls on the single-cell dataset are in the tmp_results_single_cell folder. All target BED files for exome capture are in the external_data folder.
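
Because the scripts expect to be launched from the root, a small guard can catch a wrong working directory before a long run starts. The following is a minimal sketch and not part of the repository: the function name missing_root_entries is hypothetical, and it assumes that run_all.sh and the three folders named above sit directly at the root.

```python
from pathlib import Path

# Entries mentioned in this README; assumed (not verified) to sit
# directly at the repository root.
EXPECTED_ENTRIES = ("run_all.sh", "tmp", "tmp_results_single_cell", "external_data")


def missing_root_entries(directory="."):
    """Return the expected entries that are absent from `directory`.

    An empty list suggests that `directory` is the repository root.
    """
    root = Path(directory)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]
```

Calling missing_root_entries() with no argument checks the current working directory. A non-empty result lists what could not be found, which usually means the script was started from a subfolder.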