Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add recombination rate to stdpopsim.DemographicModel #1225

Open
gregorgorjanc opened this issue Apr 22, 2022 · 3 comments
Open

Add recombination rate to stdpopsim.DemographicModel #1225

gregorgorjanc opened this issue Apr 22, 2022 · 3 comments
Labels
enhancement New feature or request

Comments

@gregorgorjanc
Copy link
Contributor

Short description of what you'd like added:

Add recombination rate specification in demographic models in addition to in species

Rationale and considerations:

We can specify recombination rate in the species file, but demographic models for that species could have assumed different recombination rate (and mutation rate) in estimation. We can already specify mutation rate separately for species and demographic models, so it would be good to offer the same for recombination rate.

This came about in QC of the BosTau #579, where the difference between the most recent recombination estimate used for species definition and a bit older recombination rate estimate for demographic model are not too different, so the issue is not huge there, but could be much bigger in other species.

How can I help?

@gregorgorjanc gregorgorjanc added the enhancement New feature or request label Apr 22, 2022
@gregorgorjanc
Copy link
Contributor Author

@gregorgorjanc look at how demography-specific mutation rate was handled in #839

@gregorgorjanc
Copy link
Contributor Author

gregorgorjanc commented Nov 9, 2024

I am looking into this since many demographic models are inferred/calibrated by assuming a specific recombination rate, maybe even a specific genetic map, which will invariably differ from the species recombination rate and genetic map we have implemented in stdpopsim.

I looked across the codebase, but found it tricky, hence writing this summary to see if I am following the structure correctly.

We have:

The docs and code interchange between "recombination map" and "genetic map", which I understand as they are synonyms, but Contig class distinguishes between them. Based on the above summary, this is because of the possibility of gene conversion, so "recombination map" in stdpopsim is "genetic map" combined with gene conversion, right? Should we then be a bit clearer in the docs to avoid this confusion.

Potential TODOs:

  • Document genetic_map attribute in Contig class
  • Clarify/consolidate recombination/genetic map language in the docs, possibly also code (use genetic map to refer to crossover rate map and recombination map to refer to crossover rate and gene conversion map combined?)
  • Expand the DemographicModel class with:
    • recombination_rate (to specify which recombination rate was used in constructing demographic model, if not provided, species default is used)
    • genetic_map (to specify which genetic map was used in constructing demographic model [assuming some did?], if provided, overrules recombination_rate)
  • Pass DemographicModel.recombination_rate or DemographicModel.genetic_map to engines (by feeding them into Contig)
  • Expand tutorial docs on how to pass
    • DemographicModel.recombination_rate to Contig for work with Python API
    • DemographicModel.genetic_map to Contig for work with Python API

Any suggestions or comments on this plan?

@gregorgorjanc
Copy link
Contributor Author

gregorgorjanc commented Nov 9, 2024

I have implemented the above (except genetic map in a demographic model - not sure how important that is - maybe for future).

Using that PR code I get this behaviour:

> stdpopsim BosTau \
  --chromosome 25 \
  --right 10000 \
  --output BosTau.trees \
  pop_0:1
...
Mean recombination rate: 9.26e-09
Mean mutation rate: 1.2e-08
...

> stdpopsim BosTau \
  --chromosome 25 \
  --right 10000 \
  --demographic-model HolsteinFriesian_1M13 \
  --output BosTau.trees \
  Holstein_Friesian:1
...
Mean recombination rate: 9.4e-09
Mean mutation rate: 1e-08
...

So, we now have a cleaner / more correct way to define demographic models as they are published.

Python API would work like this:

import stdpopsim
species = stdpopsim.get_species("BosTau")
contig = species.get_contig('25')
model = species.get_demographic_model("HolsteinFriesian_1M13")

print("contig mutation rate:", contig.mutation_rate)
# contig mutation rate: 1.2e-08
print("model mutation rate:", model.mutation_rate)
# model mutation rate: 9.4e-09
print("contig recombination rate:", f"{contig.recombination_map.mean_rate:.3}")
# contig recombination rate: 9.26e-09
print("model recombination rate:", model.recombination_rate)
# model recombination rate: 1e-08

contig = species.get_contig("chr25", mutation_rate=model.mutation_rate, recombination_rate=model.recombination_rate)
print(contig.mutation_rate == model.mutation_rate)
# True
print(contig.recombination_map.mean_rate == model.recombination_rate)
# True

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

1 participant