Merge 9fbc22d into 38ee1c5
ym2877 authored Jul 7, 2023
2 parents 38ee1c5 + 9fbc22d commit d697804
Showing 10 changed files with 1,976 additions and 3 deletions.
23 changes: 23 additions & 0 deletions .github/workflows/joss-pdf.yml
@@ -0,0 +1,23 @@
on: [push]

jobs:
  paper:
    runs-on: ubuntu-latest
    name: Paper Draft
    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: Build draft PDF
        uses: openjournals/openjournals-draft-action@master
        with:
          journal: joss
          # This should be the path to the paper within your repo.
          paper-path: paper.md
      - name: Upload
        uses: actions/upload-artifact@v1
        with:
          name: paper
          # This is the output path where Pandoc will write the compiled
          # PDF. Note, this should be the same directory as the input
          # paper.md
          path: paper.pdf
7 changes: 5 additions & 2 deletions .gitignore
@@ -8,8 +8,11 @@
!/pymgpipe/
!/examples/
!/docs/
!/.VERSION

!paper.md
!paper.bib
!/.github
!figure.png
!figure/

# Byte-compiled / optimized / DLL files
__pycache__/
25 changes: 25 additions & 0 deletions README.md
@@ -75,3 +75,28 @@ When using pymgpipe please cite-

Baldini, F., Heinken, A., Heirendt, L., Magnusdottir, S., Fleming, R. M. T., & Thiele, I. (2019). *The Microbiome Modeling Toolbox: from microbial interactions to personalized microbial communities.* Bioinformatics (Oxford, England), 35(13), 2332–2334. https://doi.org/10.1093/bioinformatics/bty941

## Contributing

We warmly welcome and appreciate contributions from everyone. There are several ways you can contribute:

- Reporting Bugs: If you find a bug, please create a new issue on our GitHub page. Be sure to include as much information as possible so we can reproduce and fix the bug. The more detail you provide, the better.
- Code Contributions: If you'd like to contribute code, great! Please fork this repository, make your changes in a separate branch, and then submit a pull request. We'll review your changes and discuss any necessary modifications or improvements before merging.

Here are some general guidelines for code contributions:

1. Fork the repo and create your branch from the master.
2. If you've added code, add tests.
3. Ensure the test suite passes.
4. Issue that pull request!

## Reporting Issues

Issues should be reported using the [GitHub issue tracker](https://github.com/korem-lab/pymgpipe/issues). Please check the existing issues to avoid duplicates. When reporting an issue, please provide as much detail as possible about how to reproduce the problem, including the following information:

- Operating system and version
- Python version and pymgpipe version
- Details of the problem, including any error messages and screenshots if possible

Thank you for your contributions!

Copyright 2023 The Trustees of Columbia University in the City of New York. See [LICENSE](https://github.com/korem-lab/pymgpipe/blob/main/LICENSE) for additional details.
Binary file added figure.png
144 changes: 144 additions & 0 deletions figure/figure.ipynb

Large diffs are not rendered by default.

733 changes: 733 additions & 0 deletions figure/mgpipe_nmpcs.csv

Large diffs are not rendered by default.

733 changes: 733 additions & 0 deletions figure/pymgpipe_nmpcs.csv

Large diffs are not rendered by default.

256 changes: 256 additions & 0 deletions paper.bib
@@ -0,0 +1,256 @@
@article{villanueva2015gut,
title={Gut microbiota: a key player in health and disease. A review focused on obesity},
author={Villanueva-Mill{\'a}n, MJ and P{\'e}rez-Matute, P and Oteo, JA},
journal={Journal of physiology and biochemistry},
volume={71},
pages={509--525},
year={2015},
publisher={Springer},
doi={10.1007/s13105-015-0390-3}
}
@article{bar2020reference,
title={A reference map of potential determinants for the human serum metabolome},
author={Bar, Noam and Korem, Tal and Weissbrod, Omer and Zeevi, David and Rothschild, Daphna and Leviatan, Sigal and Kosower, Noa and Lotan-Pompan, Maya and Weinberger, Adina and Le Roy, Caroline I and others},
journal={Nature},
volume={588},
number={7836},
pages={135--140},
year={2020},
publisher={Nature Publishing Group UK London},
doi={10.1038/s41586-020-2896-2}
}
@article{mallick2019predictive,
title={Predictive metabolomic profiling of microbial communities using amplicon or metagenomic sequences},
author={Mallick, Himel and Franzosa, Eric A and McIver, Lauren J and Banerjee, Soumya and Sirota-Madi, Alexandra and Kostic, Aleksandar D and Clish, Clary B and Vlamakis, Hera and Xavier, Ramnik J and Huttenhower, Curtis},
journal={Nature communications},
volume={10},
number={1},
pages={3136},
year={2019},
publisher={Nature Publishing Group UK London},
doi={10.1038/s41467-019-10927-1}
}
@article{baldini2019microbiome,
title={The Microbiome Modeling Toolbox: from microbial interactions to personalized microbial communities},
author={Baldini, Federico and Heinken, Almut and Heirendt, Laurent and Magnusdottir, Stefania and Fleming, Ronan MT and Thiele, Ines},
journal={Bioinformatics},
volume={35},
number={13},
pages={2332--2334},
year={2019},
publisher={Oxford University Press},
doi={10.1093/bioinformatics/bty941}
}
@article{diener2020micom,
title={MICOM: metagenome-scale modeling to infer metabolic interactions in the gut microbiota},
author={Diener, Christian and Gibbons, Sean M and Resendis-Antonio, Osbaldo},
journal={MSystems},
volume={5},
number={1},
pages={e00606--19},
year={2020},
publisher={Am Soc Microbiol},
doi={10.1101/361907}
}
@article{noecker2022mimosa2,
title={MIMOSA2: a metabolic network-based tool for inferring mechanism-supported relationships in microbiome-metabolome data},
author={Noecker, Cecilia and Eng, Alexander and Muller, Efrat and Borenstein, Elhanan},
journal={Bioinformatics},
volume={38},
number={6},
pages={1615--1623},
year={2022},
publisher={Oxford University Press},
doi={10.1093/bioinformatics/btac003}
}
@article{orth2010flux,
title={What is flux balance analysis?},
author={Orth, Jeffrey D and Thiele, Ines and Palsson, Bernhard {\O}},
journal={Nature biotechnology},
volume={28},
number={3},
pages={245--248},
year={2010},
publisher={Nature Publishing Group US New York},
doi={10.1038/nbt.1614}
}
@article{thiele2010protocol,
title={A protocol for generating a high-quality genome-scale metabolic reconstruction},
author={Thiele, Ines and Palsson, Bernhard {\O}},
journal={Nature protocols},
volume={5},
number={1},
pages={93--121},
year={2010},
publisher={Nature Publishing Group UK London},
doi={10.1038/nprot.2009.203}
}
@article{heinken2023genome,
title={Genome-scale metabolic reconstruction of 7,302 human microorganisms for personalized medicine},
author={Heinken, Almut and Hertel, Johannes and Acharya, Geeta and Ravcheev, Dmitry A and Nyga, Malgorzata and Okpala, Onyedika Emmanuel and Hogan, Marcus and Magn{\'u}sd{\'o}ttir, Stefan{\'\i}a and Martinelli, Filippo and Nap, Bram and others},
journal={Nature Biotechnology},
pages={1--12},
year={2023},
publisher={Nature Publishing Group US New York},
doi={10.1038/s41587-022-01628-0}
}
@article{norsigian2020bigg,
title={BiGG Models 2020: multi-strain genome-scale models and expansion across the phylogenetic tree},
author={Norsigian, Charles J and Pusarla, Neha and McConn, John Luke and Yurkovich, James T and Dr{\"a}ger, Andreas and Palsson, Bernhard O and King, Zachary},
journal={Nucleic acids research},
volume={48},
number={D1},
pages={D402--D406},
year={2020},
publisher={Oxford University Press},
doi={10.1093/nar/gkz1054}
}
@article{machado2018fast,
title={Fast automated reconstruction of genome-scale metabolic models for microbial species and communities},
author={Machado, Daniel and Andrejev, Sergej and Tramontano, Melanie and Patil, Kiran Raosaheb},
journal={Nucleic acids research},
volume={46},
number={15},
pages={7542--7553},
year={2018},
publisher={Oxford University Press},
doi={10.1093/nar/gky537}
}
@article{kindschuh2023preterm,
title={Preterm birth is associated with xenobiotics and predicted by the vaginal metabolome},
author={Kindschuh, William F and Baldini, Federico and Liu, Martin C and Liao, Jingqiu and Meydan, Yoli and Lee, Harry H and Heinken, Almut and Thiele, Ines and Thaiss, Christoph A and Levy, Maayan and others},
journal={Nature Microbiology},
pages={1--14},
year={2023},
publisher={Nature Publishing Group},
doi={10.1101/2021.06.14.448190}
}
@article{heinken2019systematic,
title={Systematic assessment of secondary bile acid metabolism in gut microbes reveals distinct metabolic capabilities in inflammatory bowel disease},
author={Heinken, Almut and Ravcheev, Dmitry A and Baldini, Federico and Heirendt, Laurent and Fleming, Ronan MT and Thiele, Ines},
journal={Microbiome},
volume={7},
pages={1--18},
year={2019},
publisher={Springer},
doi={10.1186/s40168-019-0689-3}
}
@article{hertel2021integration,
title={Integration of constraint-based modeling with fecal metabolomics reveals large deleterious effects of Fusobacterium spp. on community butyrate production},
author={Hertel, Johannes and Heinken, Almut and Martinelli, Filippo and Thiele, Ines},
journal={Gut Microbes},
volume={13},
number={1},
pages={1915673},
year={2021},
publisher={Taylor \& Francis},
doi={10.1080/19490976.2021.1915673}
}
@article{hertel2019integrated,
title={Integrated analyses of microbiome and longitudinal metabolome data reveal microbial-host interactions on sulfur metabolism in Parkinson’s disease},
author={Hertel, Johannes and Harms, Amy C and Heinken, Almut and Baldini, Federico and Thinnes, Cyrille C and Glaab, Enrico and Vasco, Daniel A and Pietzner, Maik and Stewart, Isobel D and Wareham, Nicholas J and others},
journal={Cell reports},
volume={29},
number={7},
pages={1767--1777},
year={2019},
publisher={Elsevier},
doi={10.2139/ssrn.3305554}
}
@article{baldini2020parkinson,
title={Parkinson’s disease-associated alterations of the gut microbiome predict disease-relevant changes in metabolic functions},
author={Baldini, Federico and Hertel, Johannes and Sandt, Estelle and Thinnes, Cyrille C and Neuberger-Castillo, Lorieza and Pavelka, Lukas and Betsou, Fay and Kr{\"u}ger, Rejko and Thiele, Ines},
journal={BMC biology},
volume={18},
pages={1--21},
year={2020},
publisher={Springer},
doi={10.1186/s12915-020-00775-7}
}
@article{ebrahim2013cobrapy,
title={COBRApy: constraints-based reconstruction and analysis for python},
author={Ebrahim, Ali and Lerman, Joshua A and Palsson, Bernhard O and Hyduke, Daniel R},
journal={BMC systems biology},
volume={7},
pages={1--6},
year={2013},
publisher={Springer},
doi={10.1186/1752-0509-7-74}
}
@article{jensen2017optlang,
title={Optlang: An algebraic modeling language for mathematical optimization.},
author={Jensen, Kristian and Cardoso, Joao GR and Sonnenschein, Nikolaus},
journal={J. Open Source Softw.},
volume={2},
number={9},
pages={139},
year={2017},
doi={10.21105/joss.00139}
}
@article{guebila2020vffva,
title={VFFVA: dynamic load balancing enables large-scale flux variability analysis},
author={Guebila, Marouen Ben},
journal={BMC bioinformatics},
volume={21},
pages={1--13},
year={2020},
publisher={Springer},
doi={10.1186/s12859-020-03711-2}
}
@article{mahadevan2003effects,
title={The effects of alternate optimal solutions in constraint-based genome-scale metabolic models},
author={Mahadevan, Radhakrishnan and Schilling, Christophe H},
journal={Metabolic engineering},
volume={5},
number={4},
pages={264--276},
year={2003},
publisher={Elsevier},
doi={10.1016/j.ymben.2003.09.002}
}
@article{heinken2013systems,
title={Systems-level characterization of a host-microbe metabolic symbiosis in the mammalian gut},
author={Heinken, Almut and Sahoo, Swagatika and Fleming, Ronan MT and Thiele, Ines},
journal={Gut microbes},
volume={4},
number={1},
pages={28--40},
year={2013},
publisher={Taylor \& Francis},
doi={10.4161/gmic.22370}
}
@misc{gurobi,
author = {{Gurobi Optimization, LLC}},
title = {{Gurobi Optimizer Reference Manual}},
year = 2023,
url = "https://www.gurobi.com"
}
@misc{cplex,
author = {{IBM, Inc.}},
title = {ILOG Cplex Optimization Studio},
year = 2023,
url = "https://www.ibm.com/products/ilog-cplex-optimization-studio/cplex-optimizer"
}
57 changes: 57 additions & 0 deletions paper.md
@@ -0,0 +1,57 @@
---
title: 'pymgpipe: microbiome metabolic modeling in Python'
tags:
  - metabolic modeling
  - flux balance analysis
  - microbial communities
  - microbiome
authors:
  - name: Yoli Meydan
    orcid: 0009-0003-4597-3340
    equal-contrib: true
    affiliation: 1
  - name: Federico Baldini
    orcid: 0000-0001-9079-8869
    equal-contrib: true
    affiliation: 1
  - name: Tal Korem
    orcid: 0000-0002-0609-0858
    corresponding: true
    affiliation: "1, 2"
affiliations:
  - name: Program for Mathematical Genomics, Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA.
    index: 1
  - name: Department of Obstetrics and Gynecology, Columbia University Irving Medical Center, New York, NY, USA.
    index: 2
date: 16 May 2023
bibliography: paper.bib
---

# Introduction

Microbially-produced metabolites and microbiome metabolism in general are strongly linked to ecosystem-level phenotypes, including the health of the human host [@villanueva2015gut; @bar2020reference]. To aid in the study of microbial metabolism from observational, human-derived data, a variety of computational methods that predict microbial community metabolic output from microbial abundances have been developed [@mallick2019predictive; @baldini2019microbiome; @diener2020micom; @noecker2022mimosa2]. Several of these methods rely on community-scale metabolic models, which are mechanistic, knowledge-based models that enable the formulation and *in silico* testing of biological hypotheses regarding the metabolism of microbial communities [@baldini2019microbiome; @diener2020micom]. Community-scale models primarily use Flux Balance Analysis, a technique that infers the metabolic fluxes in a system by optimizing an objective function, typically growth rate, subject to an assumption of a steady state and constraints imposed by the metabolic reactions present in the system [@orth2010flux]. These metabolic reactions are obtained from genome-scale metabolic networks (GEMs), knowledge-based computational models encompassing the known biochemical reactions present within an organism [@thiele2010protocol]. In recent years, curated GEMs for thousands of human-associated microbial organisms have become increasingly available, enabling a more in-depth exploration of the human microbiome [@heinken2023genome; @norsigian2020bigg; @machado2018fast]. In addition, several community-scale metabolic modeling methods specifically tailored to the human microbiome have emerged, such as MICOM and mgPipe [@baldini2019microbiome; @diener2020micom].

# Statement of need

mgPipe is a method that constructs a community-level metabolic model by combining individual GEMs into a shared compartment according to the microbial abundances observed in each sample. Input and output compartments are added to allow for a distinction between the uptake and secretion of metabolites by the community. After constructing a representative model for each sample, mgPipe computes the metabolic capacity for all metabolites present, in the form of Net Maximal Production Capacities (NMPCs). NMPCs are calculated as the absolute difference between the maximal secretion through the output compartment and the maximal uptake through the input compartment [@baldini2019microbiome]. To accomplish this, Flux Variability Analysis (FVA) [@mahadevan2003effects] is used to compute reaction bounds (minimum and maximum fluxes) through metabolite exchange reactions.
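
The NMPC definition above can be illustrated with a minimal sketch. The function name `compute_nmpcs`, the dictionary layout, and the metabolite IDs and flux values are assumptions made for this example. They are not part of the pymgpipe or mgPipe API, and the inputs are assumed to be non-negative flux magnitudes already extracted from FVA results.

```python
# Illustrative sketch only: the function name and the dictionary layout are
# assumptions made for this example, not part of the pymgpipe API.


def compute_nmpcs(max_secretion, max_uptake):
    """Return the NMPC of every metabolite as |max secretion - max uptake|.

    Both arguments map a metabolite ID to a non-negative flux magnitude,
    e.g. as obtained from FVA bounds on the community exchange reactions.
    A metabolite missing from one mapping is treated as having zero flux.
    """
    metabolites = sorted(set(max_secretion) | set(max_uptake))
    return {
        met: abs(max_secretion.get(met, 0.0) - max_uptake.get(met, 0.0))
        for met in metabolites
    }


# Hypothetical FVA-derived magnitudes for a single sample.
secretion = {"ac": 12.5, "but": 3.0, "glc_D": 0.0}
uptake = {"ac": 2.5, "glc_D": 8.0}

nmpcs = compute_nmpcs(secretion, uptake)
print(nmpcs)
```

Keeping secretion and uptake as separate mappings mirrors the distinct output and input compartments described above; a real workflow would fill them from the solver's FVA output rather than from literals.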

mgPipe models can further be used to explore metabolic interactions among individual taxa and the contribution of these taxa to the overall community metabolism, and to raise hypotheses regarding the biochemical machinery underlying an observed phenotype. This utility of mgPipe has been demonstrated in various studies of the role of the human microbiome in complex conditions such as preterm birth, inflammatory bowel disease, colorectal cancer, and Parkinson’s disease [@kindschuh2023preterm; @heinken2019systematic; @hertel2021integration; @hertel2019integrated; @baldini2020parkinson]. However, despite its wide use and utility, only a MATLAB implementation of mgPipe is currently available, limiting its accessibility for those who are not proficient in MATLAB or cannot afford its license. Here, we provide a reliable, tested, open-source, and efficient Python implementation of mgPipe.

# Implementation & Availability

pymgpipe is a Python implementation of mgPipe [@baldini2019microbiome]. It utilizes COBRApy [@ebrahim2013cobrapy] as its main constraint-based metabolic modeling interface, and optlang [@jensen2017optlang] to formulate and modify the underlying mathematical optimization problem. pymgpipe merges individual GEMs into a single model following mgPipe’s biologically informed metabolic assumptions, such as the use of preordained diets, a compartmentalized structure, abundance-scaled constraints on microbial flux contributions [@heinken2013systems], and a community biomass optimization objective [@baldini2019microbiome]. After building community-level models, metabolic profiles are computed in the form of NMPCs, as discussed above [@baldini2019microbiome]. As part of this step, pymgpipe uses the VFFVA C package for a fast and efficient FVA implementation [@guebila2020vffva]. pymgpipe is compatible with the Gurobi [@gurobi] and ILOG CPLEX [@cplex] solvers, both of which are commercially available and free for academic use.

pymgpipe models are backwards-compatible with the MATLAB mgPipe models to ensure cross-software compatibility. Additionally, pymgpipe offers multithreading capabilities for both model construction and simulation, making it scalable to studies with a large sample size. The pymgpipe Python package, as well as all associated documentation, tests, and example workflows, can be found at https://github.com/korem-lab/pymgpipe.

# Comparison to mgPipe

![Histogram of magnitude of differences in NMPCs between mgPipe and pymgpipe.\label{fig:histogram}](figure.png)

To assess the accuracy of pymgpipe, we compared its models and predictions with those of mgPipe, as implemented in the Microbiome Modeling Toolbox (COBRA Toolbox commit 71c117305231f77a0292856e292b95ab32040711) [@baldini2019microbiome]. We generated community-scale models for a vaginal microbiome dataset consisting of 232 samples, each composed of 2 to 50 taxa (94 unique taxa), as previously described [@kindschuh2023preterm]. The models exhibited identical metabolic networks and structure between the two implementations (not shown). Additionally, metabolic profiles (NMPCs) output by pymgpipe exhibited only minor differences (mean ± s.d. 5.37e-7 ± 1.23e-5; the difference is below 1e-5 for 99.4% of all data points; \autoref{fig:histogram}). These differences are negligible (within solver tolerance) and are most likely due to variations in FVA implementations [@guebila2020vffva], solver versions, and tolerances. Overall, pymgpipe is an accurate Python implementation of the mgPipe pipeline.
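
A comparison of this kind can be summarized along the following lines. This is a hedged sketch, not the analysis code behind the reported numbers: the function name `summarize_differences`, the use of absolute differences, the sample standard deviation, and the toy values are all assumptions made for illustration.

```python
import statistics


def summarize_differences(reference, candidate, threshold=1e-5):
    """Summarize element-wise absolute differences between two NMPC vectors.

    Returns the mean and sample standard deviation of the differences and
    the fraction of data points whose difference is below `threshold`.
    """
    if len(reference) != len(candidate):
        raise ValueError("inputs must have the same length")
    diffs = [abs(r - c) for r, c in zip(reference, candidate)]
    return {
        "mean": statistics.fmean(diffs),
        "sd": statistics.stdev(diffs),
        "fraction_below": sum(d < threshold for d in diffs) / len(diffs),
    }


# Toy values standing in for flattened NMPC tables from the two tools.
mgpipe_values = [10.0, 3.0, 8.0, 0.5]
pymgpipe_values = [10.0, 3.0000001, 8.0, 0.5001]

summary = summarize_differences(mgpipe_values, pymgpipe_values)
print(summary)
```

In practice the two input vectors would be read from the per-tool NMPC tables, matched by sample and metabolite before being flattened.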

# Acknowledgments

We thank members of the Korem lab and Dr. Marouen Ben Guebila for useful discussions. Y.M. and F.B. contributed equally to this work and are listed in random order. This work was supported by the Program for Mathematical Genomics at Columbia University (T.K.), R01HD106017 (T.K.) and R01CA255298 (Julian Abrams).

# References

1 change: 0 additions & 1 deletion setup.py
@@ -32,7 +32,6 @@
"resources/models/*",
"resources/problems/*",
"resources/miniTaxa/*",
"resources/.VERSION"
],
},
project_urls={