FishComparativeAtlas is a snakemake pipeline to trace the evolution of sister duplicated chromosomes derived from whole genome duplication (WGD) in teleost genomes.
If you use FishComparativeAtlas, please cite:
Parey E, Louis A, Monfort J, Guiguen Y, Roest Crollius H, Berthelot C. 2022. An atlas of fish genome evolution reveals delayed rediploidization following the teleost whole-genome duplication. Genome Research. August 12, 2022.
FishComparativeAtlas takes as input:
- ancestral chromosomes (pre-TGD) mapped on a subset of 4 teleost genomes (see the examples, taken from Nakatani and McLysaght 2017),
- gene coordinate files for all studied teleosts (see the examples),
- gene trees with the genes of all studied teleosts and outgroups (see the example),
- the corresponding species tree (see the example).
The generated fish comparative atlas is provided in a tab-delimited file with 3 columns: the unique identifier of the post-duplication gene family, all extant teleost genes in the family and the predicted post-duplication ancestral chromosome (1a, 1b, 2a...).
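As a rough illustration of how the three-column output could be consumed downstream, the sketch below reads atlas rows and counts gene families per predicted post-duplication chromosome. It is not part of the pipeline. The function names `read_atlas` and `families_per_chromosome` are hypothetical, and the sketch assumes that the file has no header line and that the genes in the second column are separated by whitespace; adjust `gene_sep` if the actual file uses another separator.

```python
import csv
from collections import Counter


def read_atlas(lines, gene_sep=None):
    """Yield (family_id, genes, ancestral_chromosome) for each atlas row.

    `lines` is any iterable of text lines, such as an open file handle.
    ASSUMPTION: genes in column 2 are whitespace-separated (gene_sep=None);
    pass another separator if the real file differs.
    """
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        family_id, genes, chrom = row
        yield family_id, genes.split(gene_sep), chrom.strip()


def families_per_chromosome(records):
    """Count gene families assigned to each post-duplication chromosome."""
    return Counter(chrom for _family_id, _genes, chrom in records)
```

To apply this to the example run, open out_example/comparative_atlas.tsv with `newline=""` and pass the file handle to `read_atlas`, then pass the resulting records to `families_per_chromosome`. If the file turns out to start with a header line, skip it before parsing.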
All dependencies are listed in envs/fish_atlas.yaml and mainly include Python 3.6, snakemake, ete3, matplotlib and seaborn. You can install the dependencies directly with conda, as explained below, or manually install the packages listed in envs/fish_atlas.yaml before running FishComparativeAtlas.
- Create the conda environment:
conda install mamba
mamba env create -f envs/fish_atlas.yaml
- Activate the conda environment:
conda activate fish_atlas
- Run on toy example data (10 teleost genomes, ~ 3 minutes):
snakemake --configfile config_example.yaml --cores 4
The output file out_example/comparative_atlas.tsv will be generated, along with figures with genomic annotations and statistics in out_example/figures.
- Generate a snakemake report after a run:
snakemake --configfile config_example.yaml --report report_example.html
The snakemake report report_example.html will be generated.
To run on a user-defined dataset, create a new configuration file and format your input data following the provided example.
- Elise Parey
- Alexandra Louis
- Hugues Roest Crollius
- Camille Berthelot
This code may be freely distributed and modified under the terms of the GNU General Public License version 3 (GPL v3) and the CeCILL licence version 2 of the CNRS:
FishComparativeAtlas takes as input the pre-TGD ancestral chromosome predictions from:
- Nakatani and McLysaght 2017: Nakatani Y, McLysaght A. 2017. Genomes as documents of evolutionary history: a probabilistic macrosynteny model for the reconstruction of ancestral genomes. Bioinformatics 33:i369–i378.