Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PySCeS ThinLayer: Fit multiple replicates, plot fit result #46

Merged
merged 7 commits into from
Oct 22, 2022

Conversation

jmrohwer
Copy link
Member

@jmrohwer jmrohwer commented Jun 14, 2022

This pull request implements 4 features/bug fixes:

  • The PySCeS TL can now fit datasets with multiple replicates
  • A function has been added to plot the experimental data together with the fitted model
  • Various cases where parameters with a zero value were not loaded/saved, have been fixed (if self.value: -> if self.value is not None).
  • The PySCeS loading of the model has been optimized, the conversion to PSC is only done if the EnzymeMLDoc is newer (this process takes time), and a bug with compartment volumes not equal to 1 has been fixed.

The new functionality is demonstrated with an example notebook that I am sharing with @JR-1991 via Google Colab. Any other reviewers, let me know and I'll be happy to share it. It takes the Model 3 from Scenario 4 from the Lauterbach 2022 paper and fits this with PySCeS. The fitted parameters are the same (within small errors) as obtained with direct fitting in the Jupyter notebook using lmfit and odeint.


This change is Reviewable

@jmrohwer jmrohwer marked this pull request as ready for review June 14, 2022 15:11
@jmrohwer jmrohwer requested a review from JR-1991 June 14, 2022 15:11
Copy link
Member Author

@jmrohwer jmrohwer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 3 files reviewed, 2 unresolved discussions (waiting on @JR-1991)


pyenzyme/thinlayers/TL_Pysces.py line 288 at r1 (raw file):

        return pd.DataFrame(np.hstack(output).T, columns=self.model.species)

    def _get_replicate_info(self, time):

@JR-1991 I'd be interested to know if there is a better way of doing this. The point is that in the experimental data all the replicates are vertically stacked above each other, yet each replicate requires only a single simulation (we don't have to repeat the simulation), and this is what I'm trying to achieve here.


pyenzyme/thinlayers/TL_Pysces.py line 301 at r1 (raw file):

        return (time[:i].values, num_reps)

    def plot_fit(self):

This is quite useful and if there is more than one reagent for which there is data, they are plotted with different colours. It is of course not as elegant as EnzymeMLDocument.visualize()but that one can use seaborn.FacetGrid because it is only a single dataframe. Here we are plotting data from two different dataframes, and moreover we want to plot the modelling data with more time-points than the experimental data to give smooth curves. So I think the only way to do this is directly with matplotlib, which gives less flexibility for interactive visualization. I have set this to plot 3 columns and the number of rows to be adjusted automatically depending number of measurements. However, I am open for suggestions for improving this. I think a method to quickly visualize a fit with data and model is very important in the library.

Copy link
Member

@JR-1991 JR-1991 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @jmrohwer)


pyenzyme/thinlayers/TL_Pysces.py line 288 at r1 (raw file):

Previously, jmrohwer (Johann Rohwer) wrote…

@JR-1991 I'd be interested to know if there is a better way of doing this. The point is that in the experimental data all the replicates are vertically stacked above each other, yet each replicate requires only a single simulation (we don't have to repeat the simulation), and this is what I'm trying to achieve here.

Thats a great solution! I was wondering if it might be an option to modify the exportData function to not stack replicates vertically, but to provide a Dataframe for each species. This way everything is stacked horizontally. I have a feeling that this might also be more handy for other use cases. This would then be another optional argument to the function to change behaviour. What do you think?

Copy link
Member

@JR-1991 JR-1991 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we previously discussed, the extension is great. Will merge the changes into main.

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @jmrohwer)

Copy link
Member

@JR-1991 JR-1991 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @jmrohwer)

@JR-1991 JR-1991 merged commit 2efeb21 into main Oct 22, 2022
@JR-1991 JR-1991 deleted the fit_replicates branch June 25, 2024 08:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants