Data are backed up in long-term storage on the RDSF in two folders:
/projects/Butterfly_genome_analysis
/projects/Adaptation_in_rainforest_flies
For the most part these files are read-only to prevent them from being deleted accidentally. This means the person who transferred the files to the RDSF has to modify the permissions if anything needs to be deleted. If this person has left, the RDSF storage team can change permissions as needed.
email: hpc-help@bristol.ac.uk
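As a rough illustration of the read-only setup described above, the sketch below strips write permission from a folder and everything inside it. It assumes plain POSIX permission bits; the RDSF may manage permissions differently, and the function name `make_read_only` is hypothetical, not an existing tool. Directories are included because deleting a file depends on write access to its parent directory.

```python
import os
import stat

# All three write bits (owner, group, other)
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def make_read_only(root):
    """Remove write permission from root and everything beneath it.

    Hypothetical helper: assumes ordinary POSIX mode bits are in use.
    Returns the list of paths whose mode was updated.
    """
    changed = []
    # Walk bottom-up so each directory is handled after its contents
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # leave symlinks alone
            mode = stat.S_IMODE(os.lstat(path).st_mode)
            os.chmod(path, mode & ~WRITE_BITS)
            changed.append(path)
    # Finally lock the top-level folder itself
    mode = stat.S_IMODE(os.stat(root).st_mode)
    os.chmod(root, mode & ~WRITE_BITS)
    changed.append(root)
    return changed
```

Only the owner of the files (or an administrator) can restore write access afterwards, which matches the note above about who is able to delete data.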
B1_Cupido_minimus
B2_Polyommatus_coridon
B3_Polyommatus_bellargus
C1_Aricia_artaxerxes
C3_Aricia_agestis (Velocity data distinct from JB & MdJ's Pop gen data - see below)
D3_Pararge_aegeria_data
AriciaAgestsis (population genomics and linkage map data - JB & MdJ)
rawseqMuseum1_Jan2019
TrimmedMuseumII = adapter trimmed data (by Sam)
rawseqModern2
rawseq_Pararge_aegeria
Hesperia_comma
Hipparchia_semele
Maniola_jurtina
Ochlodes_sylvanus
Plebejus_argus
Thymalicus_acteon
Erebia_epiphron
Erebia_aethiops
RNA_work_A.agestis.zip - from James Buckley's PhD work.
rawseqModern1_2018
rawseqMuseum2_2019
rawseqMuseum3_July2019
rawseqModern3.2_Aug2019
A1_Lymantria_monacha
G1_Thymelicus_acteon
G2_Ochlodes_sylvanus
E1_Erebia_epiphron
E2_Erebia_aethiops
E3_Aphantopus_hyperantus
Raw data for each species
Aphantopus_hyperantus
Eilema_griseola
Ochlodes_sylvanus
Xanthorhoe_fluctuata
Aricia_agestis
Hesperia_comma
Polyommatus_coridon
Aricia_artaxerxes
Maniola_jurtina
temp.sh
Cupido_minimus
Miltochrista_miniata
Thymelicus_acteon
See below for a description of folders.
All raw output from the sequencing facility is kept as received, in folders named according to the library batch, e.g. Modern1, Museum1.
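The batch naming can be read programmatically. The sketch below is a minimal example that assumes batch folders follow the pattern seen in the listing above (`rawseq`, then `Modern` or `Museum`, a batch number, and an optional date suffix). The pattern and the function name `parse_batch_folder` are assumptions for illustration; folders that do not fit return `None`.

```python
import re

# Assumed pattern, inferred from names such as rawseqMuseum1_Jan2019
BATCH_PATTERN = re.compile(
    r"^rawseq(?P<kind>Modern|Museum)(?P<batch>\d+(?:\.\d+)?)(?:_(?P<date>\w+))?$"
)


def parse_batch_folder(name):
    """Split a raw-data folder name into (kind, batch, date).

    Returns None when the name does not look like a library batch folder.
    The date part is None when the folder name has no suffix.
    """
    match = BATCH_PATTERN.match(name)
    if match is None:
        return None
    return match.group("kind"), match.group("batch"), match.group("date")


if __name__ == "__main__":
    for folder in ["rawseqMuseum1_Jan2019", "rawseqModern2", "rawseq_Pararge_aegeria"]:
        print(folder, parse_batch_folder(folder))
```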
Raw data for reference genomes have been kept as received and are organised by species.
The reference genome for each species is stored in its respective species folder.
Data for each species will be kept in a species folder ordered by triplet name (see shared Google Doc).
Data within each folder will be organised as follows:
Raw fastq files for modern and museum samples.
To save space, these folders will be emptied once the cutadapt-trimmed reads have been produced. Raw data can be recovered from the raw data folders. See the shared doc to identify the relevant libraries.
Adapter-trimmed fastq samples for modern and museum.
Mapped and indexed (.bam and .bai) files for modern and museum. Where samples have been further processed, all these data have been moved into a single folder called mapped, leaving the 02a.. folders empty.
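Since every mapped file should have an index next to it, a quick check can flag any .bam without a .bai. This is a sketch only: the function name `bams_missing_index` is hypothetical, and it accepts both common index names (`sample.bam.bai` and `sample.bai`) because the naming used here is not stated.

```python
from pathlib import Path


def bams_missing_index(folder):
    """Return the names of .bam files in folder that have no .bai index.

    Both sample.bam.bai and sample.bai are accepted as a valid index,
    since either naming convention may be in use (an assumption).
    """
    missing = []
    for bam in sorted(Path(folder).glob("*.bam")):
        candidates = [
            bam.with_name(bam.name + ".bai"),  # sample.bam.bai
            bam.with_suffix(".bai"),           # sample.bai
        ]
        if not any(candidate.exists() for candidate in candidates):
            missing.append(bam.name)
    return missing
```

Running this on the mapped folder before variant calling gives a list of samples that still need indexing; an empty list means every .bam has an index.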
Folder where variant calling with samtools mpileup and bcftools call takes place.
The /tmp folder contains the raw bcf files for each region.
The xaa.. or regionsaa files list the independently processed regions.
The intermediate and final raw combined bcf files are found directly in the 03_variants folder.
The filtered_variant_files_xxx folder contains the filtered museum and modern bcf files.
The /filtered_variant_files_xxx/dir/ folder contains the modern, museum, and combined bcf files that contain only the intersecting datasets.
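Because regions are processed independently, it can be useful to confirm that every listed region has a raw bcf file before combining them. The sketch below makes two loud assumptions that are not confirmed by this document: each region list file holds one region per line, and the per-region output is named `<region>.bcf` inside the tmp folder. Both function names are hypothetical.

```python
from pathlib import Path


def read_regions(list_files):
    """Collect region names from region list files (e.g. xaa, regionsaa).

    Assumes one region per line; blank lines are skipped.
    """
    regions = []
    for list_file in list_files:
        for line in Path(list_file).read_text().splitlines():
            line = line.strip()
            if line:
                regions.append(line)
    return regions


def regions_without_bcf(list_files, tmp_dir):
    """Return listed regions that have no <region>.bcf file in tmp_dir.

    The <region>.bcf naming is an assumption made for this example.
    """
    tmp_dir = Path(tmp_dir)
    return [
        region
        for region in read_regions(list_files)
        if not (tmp_dir / f"{region}.bcf").exists()
    ]
```

If the real file names differ, only the path built inside `regions_without_bcf` needs to change.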
Reference genome, index, and gff file for each species.
All ANGSD analyses.