Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pseudocode API discussion #1

Open
sukritsingh opened this issue Oct 25, 2024 · 1 comment
Open

Pseudocode API discussion #1

sukritsingh opened this issue Oct 25, 2024 · 1 comment

Comments

@sukritsingh
Copy link
Contributor

sukritsingh commented Oct 25, 2024

Figured I should start a thread of what the pseudocode might look like for this repo, and what sort of user facing stuff might look like.

Here's a quick 5-minute attempt at what the pipeline would look like. I figure this thread would serve as a way to discuss and update this as we go:

# Import necessary modules from PALE
from pale import ProteinSystem, MutationEngine, HybridTopologyFactory, FreeEnergyCalculator
from pale import SimulationParameters, SamplingMethod

# Load the original protein structure
original_protein = ProteinSystem.load('path/to/protein.pdb')

# Maybe we do some sort of quality assessment here
# PDBFixer would probably be useful for something like this
list_of_missing_atoms = original_protein.check_for_missing_atoms()
original_protein.add_missing_atoms(list_of_missing_atoms)

# If we assume everything is protonated by the user prior to input, we should likely check to vierfy
original_protein.verify_hydrogens_present()

# Define the mutations to apply
# Each single point mutation is specified by chain ID, residue ID, and the new amino acid
list_of_mutations = [
    {'chain_id': 'A', 'residue_index': 100, 'new_residue': 'ALA'},
    {'chain_id': 'B', 'residue_index': 150, 'new_residue': 'GLY'},
    # Add more mutations as needed
]

# Maybe verify that mappings can be made for each mutation?
# Suggest alternatives/error out of they aren't?
mutation_mapper = MutationEngine(list_of_mutations, mapper='lomap')
mutation_mapper.validate() # this method might print out the issues?
mutation_mapper.create_alternatives()

# Initialize the mutation engine with the original protein for a single mutation
htf_mutation_single = HybridTopologyFactory(original_protein, list of mutations[0])

# if there are multiple mutations, then generate a bunch of HTFs
htf_mutations_list = HybridTopologyFactory.apply_mutations(mutation_mapper)

# Optionally save the hybrid topology factory to disk for later use
htf_mutation_single.save('path/to/hybrid_factory.xml.bz2')

# At this point, the user may choose to stop and use the saved hybrid factory later
# If the user wants to proceed to sampling, they can load the saved hybrid factory or use the one in memory

# Define simulation parameters
simulation_parameters = SimulationParameters(
    force_field='amber99sb-ildn',
    water_model='tip3p',
    temperature=300,     # in Kelvin
    pressure=1.0,        # in atm
    simulation_time=100, # in nanoseconds
    solvent_padding=10,  # in Angstroms
    # Additional parameters can be added here
)

# validate that an openmm system can be created with these simualtion parameters 
# probably easy to just build in vacuum
htf_test_system = HybridTopologyFactory.test_if_valid(simulation_parameters)

# Create a solvated system
htf_system = HybridTopologyFactory.create_solvated_htf(simulation_parameters)

# Choose the integrator method (NEQ or Replica Exchange)
# Option 1: Non-Equilibrium Cycling (NEQ)
neq_sampling = SamplingMethod.neq(
    neq_switches=1, # 1 forward and reverse transformation
    switching_time=1500, # in picoseconds
    time_step=0.002, # in picoseconds
    hydrogen_mass_amu = 3.5,
    
)

# Option 2: Replica Exchange (RE)
repex_sampling = SamplingMethod.repex(
    replicas=16, # this should probably be defined by # of lambdas
    exchange_attempts=1000,
    exchange_frequency=1000,
    time_step=0.002
)

# Select the desired sampling method
sampling_method = neq_sampling  # For NEQ sampling
# sampling_method = repex_sampling  # For Replica Exchange sampling

# Set up the free energy calculator with the sampling method and hybrid factory
fe_calculator = FreeEnergyCalculator(
    hybrid_factory=htf_system,
    sampling_method=sampling_method
)

# Run the free energy calculation to estimate ∆∆G
delta_delta_G = fe_calculator.compute_ddg()

# Output the results
print(f"Estimated ∆∆G for the specified mutations: {delta_delta_G:.2f} kcal/mol")

This is just attempt v0.01 for what this might look like and what assumptions may go into it! Absolutely should edit/clarify things as we go!

@ijpulidos
Copy link
Contributor

This is great, thanks! I'll have to sit and digest this and hopefully start creating code in this repo that would match it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants