GitHub - kernlab-brandeis/PCS-CPMG: XPLOR-NIH code and test datasets for PCS-CPMG methodology

This repository includes the Python code and test dataset for the PCS-CPMG methodology as described in:

John B. Stiller, Renee Otten, Daniel Häussinger, Pascal S. Rieder, Douglas L. Theobald & Dorothee Kern. Structure determination of high-energy states in a dynamic protein ensemble (2022), Nature, doi: 10.1038/s41586-022-04468-9

The main file is refine_EM_PCS.py, which is a Python-based script written for the XPLOR-NIH software [reference]. It is a modified version of the refine.py script that is part of the XPLOR-NIH package.

All the work was done using XPLOR-NIH version 2.46 (i.e., Python 2.7), but it should work with all versions in the 2.x branch. Please note, that we are currently working on making sure that it runs with XPLOR-NIH v3.x and to make it more user-friendly.

The script takes degenerate pseudocontact shift (PCS) data extracted extracted from Carr-Purcell-Meiboom-Gill (CPMG) relaxation dispersion experiments and uses it to solve the high-energy structure. The methodology relies on an Expectation Maximization algorithm for handling of the degenerate data.

Test dataset and output of a run for adenylate kinase

The following files are used in the example run for adenylate kinase:

4AKE_Modeled.pdb: chain A of PDB entry 4AKE, modeled with the Geobacillus stearothermophilus sequence. Open Adk structure that is a simulated high-energy state in this case.
4AKE_Modeled_MinorPCS_True_Co.txt: simulated PCS generated for the 4AKE structure.
4AKE_Modeled_FourPCS_Co.txt: four-fold degenerate simulated PCSs generated for the 4AKE structure.

Flow-chart for a typical, iterative PCS-CPMG run

calculate 50-100 structures with XPLOR-NIH (Run1) using the protocol in refine_EM_PCS.py
extract information from each run: - the take each structure and consider its likelihood (using the script Make_MC_Files_Indiv.py) - calculate features for each Adk structure (using the script Extract_Information_Indiv.py)
start further iterations (Run2 and Run3), but use the output from the previous step as the starting point. The parameters are very similar, if not the same, but we automatically use the refined PDB structure and PCS file with optimized probabilities from the previous run.

Peaklists for lanthanide-tagged ubiquitin

Finally, peak assignments for the ubiquitin mutants with diamagnetic/paramagnetic lanthanide-tags are also provided and can be found in the ubiquitin directory:

gNhsqc_ubiquitin-K6C-Lu-M7-PyThz_25C.list
gNhsqc_ubiquitin-K6C-Tm-M7-PyThz_25C.list
gNhsqc_ubiquitin-K6C-Lu-M8-4R4S_25C.list
gNhsqc_ubiquitin-K6C-Tm-M8-4R4S_25C.list
gNhsqc_ubiquitin-S20C-Lu-M7-PyThz_25C.list
gNhsqc_ubiquitin-S20C-Tm-M7-PyThz_25C.list

where M7 and M8 refer to the DOTA-M7PyThiazole and DOTA-M8-(4R4S)-SSPy lanthanide-binding tags, respectively; Lu is the diamagnetic lutetium and Tm is the paramagnetic thulium. Side-chains NH2 groups are randomly assigned as SC#?-?, where # is a number and ? is a and b for the downfield and upfield cross peak, respectively. Cross-peaks originating from untagged protein are suffixed by a u (e.g., Q2Nu-Hu).

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
4AKE_example_run		4AKE_example_run
ubiquitin		ubiquitin
README.rst		README.rst

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Test dataset and output of a run for adenylate kinase

Flow-chart for a typical, iterative PCS-CPMG run

Peaklists for lanthanide-tagged ubiquitin

About

Releases

Packages

Contributors 2

Languages

kernlab-brandeis/PCS-CPMG

Folders and files

Latest commit

History

Repository files navigation

Test dataset and output of a run for adenylate kinase

Flow-chart for a typical, iterative PCS-CPMG run

Peaklists for lanthanide-tagged ubiquitin

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages