This repository collects pipelines, codes, and some intermediate results for an adult brain somatic mosaicism study.

shishenyxx/Adult_brain_somatic_mosaicism

Adult_brain_somatic_mosaicism

This repository is a collection of codes and scripts used in the analysis of a somatic mosaicism study of the NIMH Brain Somatic Mosaicism Network. The study covers regions and subregions of the entire human cortex, sorted populations, and single nuclei from ID01 (raw data are available at the NDA website under study #919) and ID02-04 (raw data are available here). The 300x WGS panel-of-normal data is also available here.


1. Pipelines for processing whole-genome sequencing data

1.1 Pipelines for WGS data pre-processing, alignment, post-processing, and quality control

Pipelines for converting the Dragen BAM files into FASTQ files, re-mapping them to GRCh37 with new parameters, and performing indel realignment as well as base quality score recalibration.

Codes and scripts for WGS quality control.

1.2 Pipelines for mosaic SNV/indel calling and variant annotations

Pipelines for MuTect2 (paired mode) and Strelka2 (somatic mode) variant calling from WGS data

Pipelines for MuTect2 (single mode), followed by MosaicForecast, and the variant annotation pipeline.

PBS script for MosaicHunter (single mode), followed by the variant annotation pipeline.

After variant calling with the different strategies, variants were annotated and filtered by a Python script; positive mosaic variants were then annotated with the corresponding tissue and additional information.


2. Pipelines for processing Massive Parallel Amplicon Sequencing (MPAS) and single-nuclei MPAS (snMPAS) data

2.1 Pipelines for MPAS data alignment and processing

Pipelines for alignment, processing, and germline variant calling of MPAS and snMPAS reads.

2.2 Pipelines for AF quantification and variant annotations

Pipelines for allele fraction (AF) quantification and variant annotations.
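As a rough illustration of what AF quantification involves, the sketch below computes an allele fraction with a 95% confidence interval from reference and alternate read counts. The function name `allele_fraction` and the choice of a Wilson score interval are assumptions made for this example and are not taken from the repository's pipeline.

```python
import math


def allele_fraction(ref_count, alt_count, z=1.96):
    """Return (AF, lower, upper) for one site.

    Hypothetical helper: AF is alt / (ref + alt), and the bounds come from
    a Wilson score interval (z=1.96 corresponds to roughly 95% coverage).
    """
    depth = ref_count + alt_count
    if depth == 0:
        # No reads: the AF is undefined and the interval is uninformative.
        return (float("nan"), 0.0, 1.0)
    p = alt_count / depth
    denom = 1 + z * z / depth
    center = (p + z * z / (2 * depth)) / denom
    half = z * math.sqrt(p * (1 - p) / depth + z * z / (4 * depth * depth)) / denom
    return (p, max(0.0, center - half), min(1.0, center + half))


af, lower, upper = allele_fraction(ref_count=950, alt_count=50)
print(f"AF={af:.4f}, 95% CI=({lower:.4f}, {upper:.4f})")
```

An interval of this kind is useful at the low AFs typical of mosaic variants, where a point estimate alone says little about whether the signal exceeds sequencing noise.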


3. Pipelines for data analysis, variant filtering, comprehensive annotations, and statistical analysis

3.1 Pipelines for mosaic variant determination, annotations, and plotting for ID01

Codes to filter and annotate MPAS and snMPAS data.

Codes and config files for the Circos plot of square root-transformed AFs measured by MPAS.
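The square root transformation mentioned above spreads out low AF values so that they remain visible next to high-AF variants in a plot. A minimal sketch follows; the function name `sqrt_transform` is hypothetical and the repository's own implementation may differ.

```python
import math


def sqrt_transform(afs):
    """Square-root transform a list of allele fractions (each in [0, 1])."""
    transformed = []
    for af in afs:
        if not 0.0 <= af <= 1.0:
            raise ValueError(f"AF out of range: {af}")
        transformed.append(math.sqrt(af))
    return transformed


# Low AFs are stretched much more than high AFs.
print(sqrt_transform([0.0, 0.01, 0.25, 1.0]))
```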

Codes for permutation analysis from gnomAD and codes for plotting the permutation result.
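The general shape of a permutation analysis can be sketched as below. This example assumes the analysis asks how often a random draw of background variants (for example, from gnomAD) carries a given feature at least as often as the observed mosaic variants do. The function name, its parameters, and the test statistic are all assumptions for illustration, not the repository's actual procedure.

```python
import random


def permutation_p_value(observed, n_total, n_feature, n_draw,
                        n_perm=10000, seed=0):
    """Empirical one-sided p-value from random draws without replacement.

    observed  : feature hits seen among the real variants
    n_total   : size of the background variant set
    n_feature : background variants that carry the feature
    n_draw    : number of variants drawn per permutation
    """
    rng = random.Random(seed)
    background = [1] * n_feature + [0] * (n_total - n_feature)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        if sum(rng.sample(background, n_draw)) >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_perm + 1)


print(permutation_p_value(observed=3, n_total=1000, n_feature=50, n_draw=20))
```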

Codes for plotting AFs measured in snMPAS.

3.2 Pipelines for mosaic variant determination, annotations, and plotting for ID02, 03, and 04

Codes to filter, annotate, and plot variants for ID02, 03, and 04.

3.3 Pipelines for statistical analysis, QC, and the related plotting

Codes for the QC of MPAS and snMPAS based on the heterozygous and reference homozygous control variants in the panel.
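One way control variants can support QC is sketched below: reference homozygous controls (expected AF near 0) give an estimate of the background noise, and heterozygous controls (expected AF near 0.5) show whether amplification is balanced. The function name `control_qc`, the mean-plus-three-standard-deviations ceiling, and the tolerance of 0.1 are illustrative assumptions, not thresholds used in the study.

```python
from statistics import mean, stdev


def control_qc(hom_ref_afs, het_afs, n_sd=3.0, het_tolerance=0.1):
    """Summarize control variants for one sample (hypothetical helper).

    hom_ref_afs : AFs measured at reference homozygous control sites
    het_afs     : AFs measured at heterozygous control sites
    """
    # AFs above this ceiling are unlikely to be background noise alone.
    noise_ceiling = mean(hom_ref_afs) + n_sd * stdev(hom_ref_afs)
    het_mean = mean(het_afs)
    return {
        "noise_ceiling": noise_ceiling,
        "het_mean": het_mean,
        "het_ok": abs(het_mean - 0.5) <= het_tolerance,
    }


print(control_qc([0.0, 0.001, 0.002], [0.48, 0.50, 0.52]))
```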

Codes for the sorted populations and the QC for sorting were already described in the previous publication.

Codes for correlation analysis and cluster representation of the AFs measured by MPAS.
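A correlation analysis of this kind compares AF profiles between samples: two samples that share many variants at similar AFs will correlate strongly. The sketch below computes a Pearson correlation between two AF vectors measured at the same variants. Pearson is chosen only for illustration, and the function name `pearson` is hypothetical; the repository may use a different measure.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length AF vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


# AFs of the same four variants in two hypothetical samples.
sample_a = [0.02, 0.10, 0.00, 0.05]
sample_b = [0.03, 0.12, 0.00, 0.04]
print(round(pearson(sample_a, sample_b), 3))
```

A matrix of such pairwise values can then be passed to a clustering routine to group samples with similar AF profiles.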

Codes and permutations for the analysis of the distribution of sublobar areas.

Computational simulations and plotting for left-right starting populations.

4. Codes for the plotting of panels in the main figures and supplements for ID01

Codes for plotting supplement panels for ID02.

5. Jupyter Notebook for the UMAP

UMAP clustering from MPAS from bulk and sorted samples from ID01.

6. Codes for the lineage construction

Lineage reconstruction based on snMPAS and MPAS evaluated genotypes from sorted nuclei as well as bulk and sorted populations from ID01.

7. Contact information

📧 Xiaoxu Yang: u6055394@utah.edu, xiaoxuyanglab@gmail.com, xiy010@health.ucsd.edu

📧 Martin Breuss: martin.breuss@cuanschutz.edu

📧 Joseph Gleeson: jogleeson@health.ucsd.edu

8. Cite the codes

Breuss MW, Yang X, Antaki D, Schlachetzki JCM, et al., Gleeson JG. Somatic mosaicism reveals clonal distributions of neocortical development. 2022. (Nature, DOI:10.1038/s41586-022-04602-7, PMID:35444276)
