Project: Qualitative Analysis #43
I agree that this is a direction in which I would like to move the package (not only being a platform for making simulations, but also providing some additional, related utility). It also contains two arrays with the reactions' substrates and products. So if we have this network, with rates

```julia
v*Z/(K+Z)*X*Y
2.0*Z^2
```

the first `reactions` entry would contain a substrate field with two elements. Would it be possible for you to use this array to extract the information you would need? Let's continue this discussion. |
A few questions:

- How would I write X --> Y catalyzed by Z? One could either write it as a bimolecular reaction or fold the catalyst into the rate. An obvious approach for purely qualitative analysis would be to have a dummy function that generates the necessary information.
- How do I write a continuous feed into a reactor (the flux would be a parameter)? Naively, I would be tempted to write something like a zero-order production reaction.
- Is it reasonably possible to optionally name reactions, with another symbol? That would be quite necessary for readable output that involves reaction rates and not just concentrations. |
> How would I write X --> Y catalyzed by Z?

```julia
k11, X + Z --> Y + Z
k11*Z, X --> Y
```

I would probably use the second to keep the regulatory term away from the equation. If you want some more complicated regulation you could also use

```julia
f(Z), X --> Y
```

You can extract the fact that the reaction depends on Z. I think we could use two main approaches for this.

1) If it is a function known to the package, e.g.

```julia
hill(Z,v,K,n), X --> Y
```

with parameters v, K, and n: we know how this function works and can have that information in the program. It would probably be both possible and desirable to have a majority of the functions that people want to use already implemented in the program (including information on their monotonicity).
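A registry of known rate laws with monotonicity information could look something like the following language-agnostic sketch (in Python rather than Julia, with hypothetical names such as `MONOTONICITY` and `dependence_sign` that are not part of DiffEqBiological.jl): the package ships the standard functions together with the sign of their dependence on the regulating species, and falls back to asking the user for custom functions.

```python
def hill(x, v, k, n):
    """Activating Hill function: monotonically increasing in x."""
    return v * x**n / (k**n + x**n)

def hill_repression(x, v, k, n):
    """Repressing Hill function: monotonically decreasing in x."""
    return v * k**n / (k**n + x**n)

# +1: rate increases with the regulating species, -1: rate decreases.
# A custom function is simply absent, so the user must supply the sign.
MONOTONICITY = {
    "hill": +1,
    "hill_repression": -1,
    "mm": +1,  # Michaelis-Menten, increasing in its substrate
}

def dependence_sign(rate_name):
    """Sign of d(rate)/d(species), or None for unknown custom functions."""
    return MONOTONICITY.get(rate_name)
```

Qualitative analyses such as concordance checks could then consume these signs without ever evaluating the rate laws numerically.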
2) If you want to use a custom function you already have to declare it using `@reaction_func`:

```julia
@reaction_func my_hill_repression(x, v, k, n) = v*k^n/(k^n+x^n)
r = @reaction_network MyReactionType begin
    my_hill_repression(x, v_x, k_x, n_x), 0 --> x
end v_x k_x n_x
```

It would be possible to add an optional argument to the `@reaction_func` declaration carrying monotonicity information.

These two options are correct notation:

```julia
k1, 0 --> X
k2, ∅ --> X
```

(just like you can write the reverse for degradation)

Right now reactions do not have a name, nor an option to add one. It would probably be possible to add at some point if it is needed. |
Thanks, so we can really get everything we need! Giving reactions optional names would be useful.

I see that the list of species is auto-generated from the list of reactions. That makes sense! Sometimes I'd like a fixed order, though. There are some normal forms that depend on the ordering of species and reactions, if we want the normal form to be unique up to identity and not up to some complicated notion of equivalence.

When I did qualitative analysis in python, I wrote a very simple line-based DSL for chemical reaction networks (with a line-based parser and generator, using mostly regexps). I then either typed in my models directly, or I used python-libSBML to read in models from databases, and then used the generator to translate these into my DSL (obviously only translating a tiny subset of SBML features). The ordering was done by having an optional "species metadata" section; when reading a file without this section, I auto-generated it from the reaction data; then I wrote back the file, in order to conveniently modify this section by hand (the code is a smoldering unmaintainable abandoned ruin). This way of "pretty printing / normalizing / making optional arguments explicit" and then modifying the normalized file by hand was a nice workflow, though.

Not sure whether this workflow can be replicated; I think the DSL can contain arbitrary julia expressions, and I'm not gonna write a parser+generator for julia. I think I understand the constraints for the DSL. Is this correct? |
It is true that the species are extracted directly from the reactions, with the result that the order of the species becomes the order in which they appear in the reactions. Right now this only matters if you want them to appear in a specific order for Latexify, but if more features are added there might be more instances where it matters. For the large class of networks that have e.g. degradation for all species, the order is easy to fix by beginning the network with the degradation terms. I am hesitant to e.g. require the user to input such an order by default, since this would require a lot of additional input which would not be used in 90% of all cases. The best solution would probably be to have a `reorder(model,[Z,X,Y])` which orders things as the user wants.

Next, concerning the names of reactions. For now it would probably be easiest to have a separate vector with reaction names, which would be inputted to an additional function like:

```julia
model = @reaction_network MyReactionType begin
    (p,d), X ↔ 0
end
names = [:prod,:degrad]
graph = makeGraph(model,names)
```

If we then get these methods in place, in the long run we might think of other ways of inserting names, either as a function or as an optional part of the initiation:

```julia
model1 = @reaction_network MyReactionType begin
    (p,d), X ↔ 0
    X --> Y
end
set_names(model1, [:prod,:degrad,:activation])

model2 = @reaction_network MyReactionType begin
    (p,d), X ↔ 0, (prod, degrad)
    X --> Y, activation
end
```
|
As a third (optional) parameter sounds nice. If we want to be future-safe, maybe it is time to go for keywords (who knows what kind of optional metadata we may want to support in the future). Brainstorming:
If we now read in this network, and afterwards write it back (from data-structure to text-file), we might get something like:
or (potentially also valid!):
Personally, I would prefer to spec lexicographic ordering in cases where there is no user-supplied ordering (or where the user-supplied ordering is incomplete, like in the above example), and would prefer to have a specced verbose output format that makes all defaults explicit. Do we have an ascii synonym for ↔? |
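The default-ordering rule proposed here (user-supplied order first, lexicographic for the rest) is easy to pin down precisely; a hypothetical Python sketch, with an invented `species_order` helper that is not part of any package discussed in this thread:

```python
def species_order(all_species, user_order=()):
    """Deterministic species ordering for serialization.

    Species listed in user_order come first, in that order; any remaining
    species follow lexicographically, so the output is deterministic even
    when the user-supplied ordering is incomplete or absent.
    """
    user = [s for s in user_order if s in all_species]
    rest = sorted(s for s in all_species if s not in user)
    return user + rest
```

With this rule, writing a network back to a text file always produces the same species section, which is what makes the normalize-and-edit-by-hand workflow reproducible.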
We might start thinking about keywords; as we are getting ideas for options like that, they are probably the way to go. If one wants to designate a whole lot of stuff one might get a very long line though, e.g.

```julia
model = @reaction_network type=rn_type_3 noise_scaling=Eta species_order=[Z,X,Y] some_other_option=whatever begin
    (p,d), X ↔ Y
    k11, X --> Z
end p d k11
```

in that case

```julia
model = @reaction_network type=rn_type_3 noise_scaling=Eta begin
    (p,d), X ↔ Y
    k11, X --> Z
end p d k11
set_species_order(model,[Z,X,Y])
some_other_option(model,whatever)
```

would be shorter. I am generally in favour of allowing for as concise a notation as possible.

@ChrisRackauckas What do you think of implementing keywords and having that as a future system for options? We could declare noise scaling and type that way as well (for now we could leave the old option there to avoid breaking people's stuff).

@chethega Note that parameters (and their order) are declared at the end. For now though we can state that adding things like that is doable, either through keyword options or additional functions/macros which can be used on the constructed network. With that cleared out, we can begin thinking about the implementation of additional functionality, and then whether we can accept some of the inputs required for those networks already in the first macro. (If we have a function `some_cool_feature(model,options)` we can then see if the options can be inputted already in the model declaration, and then make `some_cool_feature(model) = some_cool_feature(model,model.options)`.) |
Re debugging networks: I encountered several bug classes in practice. I once did a short informal test on the curated section of the biomodels database. I think that a majority of models, when imported from SBML, had bugs. Making debugging of models convenient is important (automated tools for this are out of scope, but preventing extreme pain should be in scope). So an optional verbose, normalized output format would help. |
I am not really sure I understand? You want an option to automatically read text files with potentially different notation and make networks out of them? |
IMO trying to stick everything into the macro is going to be difficult. I think we need to map a lot of the structure that we have to a ModelingToolkit.jl style. Using a structured IR format would be much easier to keep extending. I think @AleMorales has been working on this and it would be nice to check in with where that is. A macro form could be short hand for a much longer detailed form of course. |
@ChrisRackauckas's comment is exactly what I was going to discuss. It seems like what is really needed is a well-documented representation for storing and getting information about a reaction network (species, products, stoichiometry, rate law type, etc.). Then we all can think about building our needed functionality on top of it (I need stuff like knowing, for each species, which reactions depend on it, or knowing, for each reaction, which other reactions have rate laws that should be recalculated when the first reaction occurs...). @chethega probably needs even more info. Once we've got a nice way of actually storing and accessing basic network info, we can start building that added functionality. It would also provide a target for SBML importers and such to output to. EDIT: Of course, I also think a lot of the functionality we may need may be common. For example, reaction dependency graphs. It makes sense for that to be defined on top of the reaction network representation and made available here. It would be redundant for network analysis codes and SSA codes to each create their own routines for calculating dependencies. |
- The macro form is executable code and needs julia.
- The line-based form is also relatively easy to parse and generate in a decrepit python2.7 script.

So I'd guess we want both: functions to modify models in-memory, and DSL support to describe models in text-files.

Long term, sure: we can get a model (in IR/struct form) either from the DSL, by executing a macro / source file, or by programmatically constructing the model, e.g. by using libSBML to painstakingly extract info from databases. For bonus points, we should make a way to output a model (in IR/struct form) in the DSL, because no one wants to read and edit serialized object dumps (that's the point of having a DSL that is actually a file format with specification and not just a julia macro). Having a non-terrible way of outputting models for various other tools would also be valuable.

Ok, I guess this discussion is getting out of hand: we actually have two topics, (1) qualitative analysis and (2) where to go with in-memory models, IR and DSL. @isaacsas What kind of qualitative analyses would you plan to run? |
That's right. I think the issue is maybe we are now hitting a point with lots of different applications that are all interested in using reaction networks (in different ways). It seems like getting a standardized/documented/queryable representation for the reaction network could help us all in then building our functionality on top of it.
I'm primarily simulating biological network models as continuous-time jump processes and focusing on the SSAs in DiffEqJump, so more quantitative analysis. My actual research is focused on simulating them as spatial models with both particle diffusion and reaction, codes which I'm hoping to ultimately port over to Julia from C++. Optimized SSAs need just a bit of network info (essentially the dependency graphs I mentioned above). I haven't worked on qualitative analyses as you describe, but I think as a modeler I would be very interested in having tools that let me easily calculate from just the network structure what kinds of qualitative behavior is possible, especially in the stochastic model context. For example, I know there are results for systems with absolute concentration robustness about the occurrence of extinction events. Are there algorithms for easily determining if a network has absolute concentration robustness? Similarly there are results about the existence of product form stationary distributions in (mass action) networks with zero deficiency, so that is another property I would like to know. I admit though, I haven't looked at such analyses before, so I don't know how easy it is to make a code that can check for these properties. |
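For reference, the deficiency mentioned above is computable directly from the network structure: δ = n − ℓ − s, where n is the number of complexes, ℓ the number of linkage classes, and s the rank of the stoichiometric subspace. A language-agnostic sketch in Python (exact rank over the rationals; these function names are invented and are not part of any package discussed here):

```python
from fractions import Fraction

def rank(rows):
    """Exact matrix rank via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0]) if m else 0):
        piv = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                f = m[i][col] / m[r][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def deficiency(reactions, species):
    """Deficiency n - l - s of a reaction network.

    reactions: list of (substrate_complex, product_complex) pairs, each a
    dict mapping species symbol -> stoichiometric coefficient.
    """
    complexes = []
    for sub, prod in reactions:
        for c in (sub, prod):
            key = tuple(sorted(c.items()))
            if key not in complexes:
                complexes.append(key)
    n = len(complexes)
    # Linkage classes: connected components of the (undirected) complex graph,
    # found with a tiny union-find.
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for sub, prod in reactions:
        a = find(complexes.index(tuple(sorted(sub.items()))))
        b = find(complexes.index(tuple(sorted(prod.items()))))
        parent[a] = b
    l = len({find(i) for i in range(n)})
    # Stoichiometric subspace: spanned by the net-change vector of each reaction.
    net = [[prod.get(s, 0) - sub.get(s, 0) for s in species]
           for sub, prod in reactions]
    return n - l - rank(net)
```

For example, A ⇌ B has two complexes, one linkage class, and a rank-one stoichiometric subspace, so its deficiency is zero; the Lotka reactions A → 2A, A + B → 2B, B → ∅ give deficiency one.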
@chethega Maybe I understand: you suggest we should introduce a notation which is not Julia-dependent, or at least where all the functionality can be reached without Julia code? Now, I agree we maybe should structure up the various semi-parallel and still interdependent discussions/issues that are going on:
|
ModelingToolkit.jl is pretty much "done" in terms of standard features. We'll be adding things like event handling and components for a fully Modelica-compatible IR, but I don't think that we'll need to make use of those since that's more for large DAEs. The rest of what still needs to be worked on are automatic transformations, like DAE index reduction, automatic addition of sensitivity analysis ODEs, etc., but those are things that will just be free features in the future. Even "advanced necessary" features, like Jacobian and inverse Jacobian calculations, already exist. So yes, let's get feedback from @AleMorales on where he is with the reaction-based DSL as to how we should start designing it.

Just to get everyone up to speed here: what I've noticed with DSLs is that the macro parsing front end isn't the hard or interesting part. The hard part is the internal representation and the setup to analyze and compile equations. ModelingToolkit.jl is an intermediate representation (IR) to hold the equations in a structured form to perform analysis and compilation to different targets. Think of it like a "model compiler" instead of a DSL: it's made so you can stick other DSLs on there. In there you have the variables and operations; on systems, the interface is to define the equations.

In this format, the equations are stored as a computational graph, so it's all parsable and anything you want is available. If the SBML reader directly writes to this computational graph, then we'd have everything we ever wanted and then some. |
Good to hear about the progress. What you are saying is that the problem is to decide exactly what we want our IR to contain?

So if we move back to the topic, what would be good is if people said what kind of "stuff" they would like to implement for a reaction network. When we know what is good to have in the network we can decide how to represent that information in the IR (as well as how to produce it). We can also see if there are several very similar things which could simply be a single field in the IR. Finally, from there we can start discussing fancy ways of inputting this information through a macro? |
Yeah, if we focus on the intermediate representation then the input is then just a choice. SBML, a macro, or a lower-level form. Each can have their own strengths and weaknesses, but build the same objects for analysis.
I assume it would mostly be the DSL you've already made. That seems to cover everything I've ever wanted and more. If we just make the macro write a structured IR object, we should be set. |
For the network, I am 90% sure that we'll want to have something that interfaces with LightGraphs.jl. It allows you to design a graph type against its interface so that all of its algorithms apply. |
@Gaussia moving our discussion here as @ChrisRackauckas requested. For building a dependency graph, I think it would work best if there is a function that returns the dependency graph given a network. For a large network it might take some time to construct (I think it would take O(K) time, if there are K reactions). I can write this function once the network representation is settled. |
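The construction described above can be sketched as follows (a Python sketch with invented field names, not the eventual Julia API): first index which reactions read each species, then, for each reaction, take the union of the readers of every species whose count it changes. For sparse networks this is roughly linear in the number of reactions.

```python
from collections import defaultdict

def dependency_graph(reactions):
    """SSA-style reaction dependency graph.

    reactions: list of dicts with
      'net':        species -> net stoichiometric change when the reaction fires
      'dependents': species symbols the rate law reads
    Returns, for each reaction index, the sorted indices of reactions whose
    rates may change when it fires.
    """
    reads = defaultdict(set)  # species -> set of reactions reading it
    for j, rxn in enumerate(reactions):
        for sp in rxn["dependents"]:
            reads[sp].add(j)
    graph = []
    for rxn in reactions:
        affected = set()
        for sp, change in rxn["net"].items():
            if change != 0:
                affected |= reads[sp]
        graph.append(sorted(affected))
    return graph
```

For a birth-death process (∅ → X, X → ∅), both reactions change X and only the degradation reads X, so each reaction's dependency list is just the degradation reaction.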
@isaacsas Sounds good. I will start thinking of making some modifications to what info goes into the network structure. @ChrisRackauckas Exactly what is the ReactionDSL? I am very fond of the current DSL :) it is concise, looks good, and is easy to read (or at least I think so, but then I have also been using it for a while). |
At the moment ReactionDSL is in stand-by and has not been updated for a month, partly because ModelingToolkit.jl was such a fast-moving target, but also due to my own need to focus on my day job. However, I think the basic idea is already fully implemented in ReactionDSL; it just needs some updating and added features. I will try to do such an update asap and write down a description of how it all works, but probably you will need to wait until the weekend... Hopefully that would help change the bus factor of that project, as I don't have that much time to work on it at the moment. In the meantime I give some examples below (they probably do not work anymore with the latest version of ModelingToolkit.jl).

Species are just simple types with optional fields that could store default values, metadata, compartments, etc.:

```julia
# Basic elements
@Species s1 s2 s3 = 1
@test s1 == Species(:s1)
@test s3 == Species(:s3, 1)
```

A formula is just another data structure where species are stored as reactants (with info on stoichiometry):

```julia
@Param st
f1 = 2s1 + st*s2 ⟺ s3
f2 = Formula(:⟺, [Reactant(s1, Constant(2)), Reactant(s2, st)],
             [Reactant(s3, Constant(1))])
@test f1 == f2
```

The type of arrow determines directionality, reversibility, etc. This could map 100% to the arrows in DiffEqBiological.jl. When you want to assign the rate expression you just do the following, to create a Reaction:

```julia
# An actual reaction
@Param k Keq
r1 = 2s1 + st*s2 ⟺ s3 ~ k*(s1*s2 - s3/Keq)
op = k*(s1*s2 - s3/Keq)
r2 = Reaction(f1, op)
@test r1 == r2
```

The right hand side is 100% handled by ModelingToolkit.jl. A whole reaction system would be:

```julia
@Param gₓ xₒ kₓ Kₑ vₛ Kₘ
@IVar t
@Species x y
eq = [∅ ⟺ x ~ gₓ*(xₒ - x),
      x ⟺ y ~ kₓ*(x - y/Kₑ),
      y ⟺ ∅ ~ vₛ*y/(Kₘ + y)]
ode1 = ReactionODE(eq, t)
generate_ode_function(ode1)
```

Summary: ReactionDSL does not use macros at all; it all relies on defining a series of data types (Species, Reactant, Formula, Reaction, ...). The limitations I have encountered so far are that (i) operators need to be chosen carefully based on the precedence of parsing and sometimes parentheses cannot be avoided, and (ii) algebraic expressions on these types still have rough edges. Mapping these data structures to graphs should be easy. Indeed, these data structures are kind of similar to SBML, and I know with SBML it is possible to map the model description to a graph. |
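For readers unfamiliar with the pattern, the macro-free, data-type-based design above can be mirrored in a few lines in any language. Here is a hypothetical Python sketch using dataclasses; the names follow ReactionDSL's types, but this is not its actual code:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Species:
    name: str
    default: Optional[float] = None  # optional default value / metadata slot

@dataclass(frozen=True)
class Reactant:
    species: Species
    stoichiometry: int = 1

@dataclass
class Formula:
    arrow: str        # e.g. "⟺" encodes directionality/reversibility
    substrates: tuple
    products: tuple

@dataclass
class Reaction:
    formula: Formula
    rate: str         # the rate expression; in ReactionDSL this is a symbolic IR node
```

Because a model is then just a list of `Reaction` values, it can be built programmatically (e.g. in a loop), which is exactly what a macro-only DSL makes hard.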
@Gaussia Great, thanks! If you've got too much to do between all these discussions, I'd be happy to take charge of updating the ReactionDSL. |
Out of curiosity, why use ⟺? Also, am I reading this correctly that the ~ attaches the rate expression to the formula? But yeah, the point is that @Gaussia's macro could lower to the form that @AleMorales is writing about and spit that out. (The problem with macros is you have to have the full expression at compile time, so it's hard to computationally build large networks, which is where we want to go. For example, adding a bunch of reactions using a for loop.) |
Talking with @sbromberger
|
Similarly,
Yes. How to make use of the rate in the lowered system depends on the type of arrow (open/close, reversible, etc).
Exactly. A third option could be to have macros that facilitate the construction of the data structures. |
You can make https://github.com/AleMorales/ReactionDSL.jl/blob/master/src/Elements.jl#L44-L47
Yeah, this is a bit more than just an https://github.com/AleMorales/ReactionDSL.jl/blob/master/src/Elements.jl#L50-L67 That's a really interesting idea. Is this an |
Ah, ok, I now understand the DSL problem; I failed at terminology. Most of you meant by "DSL" a set of macros or definitions that allow one to write text files that (1) look like a chemical network (concise, human readable) and (2) are also executable julia files when included (in contexts where the necessary names are defined).

I also wanted it to (3) conform to a strict and simple file format, such that one could write a julia/python/perl/javascript parser in an afternoon, and (4) allow it to also serve as an output format for programs that generate or manipulate chemical networks. The combination of (4) and (2) automatically gives us a "prettify". Obviously each output file should start with its (optional) file format version.

Writing an ABNF spec is probably overkill (and if someone proposes an xml schema I will point at SBML and run away crying), and I would be happy with an informal spec that is readable for people who know nothing about julia. But having a format that is convenient for humans, simple enough that we could write a short ABNF spec, expressive enough for the types we have, and still a small subset of the julia language (as parsed by the femtolisp parts of julialang), would still be a plus. |
I think |
Yeah, let's be clear about the language. The DSL, or domain-specific language, is a language (in Julia, using the Julia parser) which writes to an intermediate representation (IR). That IR then has IR passes (transformations) and then compiles to byte code (here we use the Julia compiler and build Julia functions and types). Right now the DSL we have is the `@reaction_network` macro:

http://docs.juliadiffeq.org/latest/models/biological.html#The-Reaction-DSL-1

```julia
rs = @reaction_network begin
  c1, S + E --> SE
  c2, SE --> S + E
  c3, SE --> P + E
end c1 c2 c3
```

The pros of a macro-based DSL are that it's compact, concise, and easily human readable. The bad part is that, since macros act at compile time, they are hard to utilize programmatically. This DSL has its own internal IR which can spit out some problems to solve:

```julia
p = (0.00166,0.0001,0.1)
# S = 301, E = 100, SE = 0, P = 0
prob = DiscreteProblem([301, 100, 0, 0], (0.0, 100.0), p)
jump_prob = JumpProblem(prob, Direct(), rs)
sol = solve(jump_prob, Discrete())
```

The question is where to go next. ModelingToolkit.jl is an intermediate representation for Julia-based DSLs. It somewhat has a natural DSL on it through some basic syntactic sugar:

```julia
using ModelingToolkit

# Define some variables
@IVar t
@DVar x(t) y(t) z(t)
@Deriv D'~t
@Param σ ρ β
eqs = [D*x ~ σ*(y-x),
       D*y ~ x*(ρ-z)-y,
       D*z ~ x*y - β*z]
de = DiffEqSystem(eqs)
```

where instead of those macros we could have done:

```julia
t = IndependentVariable(:t)
x = DependentVariable(:x,dependents = [t])
y = DependentVariable(:y,dependents = [t])
z = DependentVariable(:z,dependents = [t])
D = Differential(t) # Default of first derivative, Derivative(t,1)
σ = Parameter(:σ)
ρ = Parameter(:ρ)
β = Parameter(:β)
```

and then, since it's a compilation IR, it has transformation functions that spit out outputs for compilation, for example:

```julia
ModelingToolkit.generate_ode_function(de)
## Which returns:
:((du, u, p, t)->begin
    x = u[1]
    y = u[2]
    z = u[3]
    σ = p[1]
    ρ = p[2]
    β = p[3]
    x_t = σ * (y - x)
    y_t = x * (ρ - z) - y
    z_t = x * y - β * z
    du[1] = x_t
    du[2] = y_t
    du[3] = z_t
end)
```

which gives a Julia function to compile. @AleMorales's ReactionDSL and the `@reaction_network` macro could both lower to this IR. Since that representation is programmable, SBML parsing is really just another DSL targeting the same IR.

Now let's dissect the flow of this a little bit. To build the IR, you first build your equations:

```julia
using ModelingToolkit

# Define some variables
@IVar t
@DVar x(t) y(t) z(t)
@Deriv D'~t
@Param σ ρ β
eqs = [D*x ~ σ*(y-x),
       D*y ~ x*(ρ-z)-y,
       D*z ~ x*y - β*z]
```

This is just a bunch of expressions. Then

```julia
de = DiffEqSystem(eqs)
```

puts it all into one object, the system.

So for reactions, IMO it looks like we want to extend the IR to have the reaction information as well. The interesting issue is to build out ModelingToolkit.jl to get feature parity: SDEs and jumps need to be added to the DiffEqSystem. But that's a separate issue from re-structuring the DSL: if you just build the reaction DSL assuming the rest of the diffeq systems will be handled, the features will come when I get to that (hopefully before the summer). |
Yeah it's probably too different. |
I'm not an expert on this subject, but I'm throwing this out there for visibility: https://arxiv.org/pdf/1412.5257.pdf The basic idea: It seems there are some nice results that can identify reaction networks that admit multiple steady states based on the structure of networks. For example, it is possible (for the "right types of networks") to prove that a given reaction network has multiple steady states while a smaller, embedded sub-network cannot produce the same steady states. The lead authors have additional papers and references on their personal websites regarding multi-stationarity. |
Yeah, that's a much better survey of what I called Feinberg-style methods; thanks for linking! Unfortunately, checking qualitative networks for multiple steady states is afaik not easy. That is, whether a subexponential algorithm exists is still an open question; and if such an algorithm exists then it also solves https://arxiv.org/pdf/math/9911268.pdf. Quite frankly, inventing such an algorithm is probably above my paygrade, if it even exists (just implementing the linked algorithm, which solves a much simpler problem, would probably be quite daunting for me).

Of course, there is a trivial exponential-time algorithm, of typical fashion for NP problems: there always is a witness that allows fast verification of multi-stationarity; it is unknown whether witnesses for non-multi-stationarity exist; there are "only" exponentially many potential witnesses, and you can proceed by trying all of them. Luckily, in practice, irreducible non-multi-stationary networks should tend not to be too small, and witnesses of multiple steady states should tend not to be too rare, which means that trying out all potential witnesses might not even be catastrophically bad. (Not sure whether the witnesses of multi-stationarity are published anywhere; I'll probably post an unpolished preprint on the arxiv along with the code here.) |
@chethega Is this still something you are interested in? I think we've reached a point where most of the information you want is available, so we could easily add a simple API for querying and getting it out. With the new network representation, given a network `rn` and reaction index `i`, we currently have:

```julia
# vector of ReactantStruct storing species symbol and stoichiometry coefficient
rn.reactions[i].substrates

# vector of ReactantStruct storing species symbol and stoichiometry coefficient
rn.reactions[i].products

# vector of species the reaction law depends on
rn.reactions[i].dependents # dependent species symbols

# ODE rate law expression
rn.reactions[i].rate_DE

# true if pure mass action reaction
rn.reactions[i].is_pure_mass_action
```

We also have functionality for making reaction-reaction dependency graphs, and substrate-reaction graphs.
For mass action reactions and the different Hill/MM functions we support this is easy to extract. For more general rates you'd probably want user input as you mention.
As I said above, this is there now.
This is something that would be great, but I don't understand what the standard approaches are. People have posted about "visualizing" reaction networks, but it seems like there are many different networks one could visualize. Is there a canonical reaction network graph that people mean? How is it, or the output of these types of analyses on it, commonly visualized? |
Qualitative analysis is progressing and we can continue discussion in more concrete issues for specific functionality. |
It would be nice to implement various algorithms for qualitative analysis of chemical reaction networks. So, this issue is for brainstorming:
I would plan to implement BF16 (other potential references: OM16, OTM17); in short, this takes a network with qualitative kinetics (we only know stoichiometry, and know on which concentrations each reaction rate depends), and outputs a graph that describes which steady state concentration and steady-state reaction rates depend on which parameters of the system, and also generates some bounds on the kind of possible bifurcations from the steady state. This graph is pretty structured: It has a transitivity property that allow nice visualizations.
I would further plan to implement some variant of "Feinberg Shinar concordance" analysis. I recommend using this as a google term; I know no readable paper on this topic that I can recommend (instead of subjecting any of you to the pain of trying to really understand more than the abstract of this, I'd rather recommend waiting until I post some code).
What kind of things do I need from the network:

- I would need to extract, for each reaction, stoichiometry and dependencies (on which concentrations does the reaction rate depend).
- Concordance analysis would also require monotonicity information (each reaction rate can depend on each concentration either positively, negatively, with unspecified sign, or not at all). Unless we do some extensive interval math, I'd guess that this information would need to be user supplied.
- For convenience, one would need a nice way to "clamp" concentrations (remove them from the state-space by considering them constant).
- If we want to do some more exotic Feinberg-style analyses which only apply to mass-action kinetics, we would also need to know which reactions are supposed to have strict mass-action kinetics. This would need to be a user-supplied parameter: most of the time, you consider mass-action as an approximation of some unfathomably complicated kinetics; sometimes (rarely) you want to consider it physical/mathematical truth.
If we want the comparison between different variants of networks from section 5 of https://arxiv.org/abs/1606.00279, then we would need some support from the DSL. Not sure what this should look like.
Afterwards, we would like to visualize the output of the analysis, and store it in some usable data structure. We would further like to use it for the steady state solve, and if parameter-tracking / bifurcation analysis ever becomes a thing in diffeq, then we would like to use it there as well.
For the steady state solver, the result looks like the following: We factorize our big, coupled nonlinear system into a block tridiagonal structure; that is, we have a list of lower-dimensional systems f_i(x_1,..., x_k)==0 (each f_i and x_j is multi-dimensional), and a transitive relation isless, such that f_i(x)=f_i({x_k: isless(k,i)}). Obviously this severely constrains the kind of bifurcations or multiple steady states we can have.
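The block-triangular steady-state solve described above amounts to solving the lower-dimensional systems in a topological order of the dependency relation; a hypothetical sketch in Python (all names invented, with `solve_block` standing in for whatever root-finder handles one multi-dimensional block):

```python
def solve_in_blocks(blocks, depends_on, solve_block):
    """Solve a block-triangular factorized system.

    blocks: list of block ids; depends_on[i]: ids whose solutions block i's
    equations f_i read; solve_block(i, solved): solves f_i(x) = 0 given the
    dict of already-solved blocks.  Blocks are processed in an order
    compatible with the dependency relation, so each lower-dimensional
    system is solved with its inputs already fixed.
    """
    solved = {}
    remaining = list(blocks)
    while remaining:
        progress = False
        for i in list(remaining):
            if all(d in solved for d in depends_on.get(i, ())):
                solved[i] = solve_block(i, solved)
                remaining.remove(i)
                progress = True
        if not progress:
            raise ValueError("cyclic block dependencies: no triangular order")
    return solved
```

If the dependency relation is not transitive/acyclic, no triangular order exists and the sketch raises, which mirrors the constraint on bifurcation structure mentioned above.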