-
-
Notifications
You must be signed in to change notification settings - Fork 348
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable serialization of Cantera objects #984
Conversation
Codecov Report
@@ Coverage Diff @@
## main #984 +/- ##
==========================================
+ Coverage 70.66% 72.39% +1.73%
==========================================
Files 361 364 +3
Lines 44522 46433 +1911
==========================================
+ Hits 31460 33616 +2156
+ Misses 13062 12817 -245
Continue to review full report at Codecov.
|
599e499
to
86435ab
Compare
004decb
to
ad1f1a2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@speth ... as indicated earlier, it is great that this is finally close to becoming reality (:tada:). I haven't looked at a very detailed level (I assume that the unit test suite is able to ensure that the emitted YAML code can be re-imported without issues), but do have some minor comments.
interfaces/cython/cantera/utils.pyx
Outdated
@@ -72,3 +72,78 @@ class CanteraError(RuntimeError): | |||
pass | |||
|
|||
cdef public PyObject* pyCanteraError = <PyObject*>CanteraError | |||
|
|||
cdef anyvalueToPython(string name, CxxAnyValue& v): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
names in utils.pyx
are not very 'pythonic' (although it's not exposed to Python) - anyvalue_to_python
would be (same for anymap_to_dict
below - would suggest dict
as this is what it returns)
interfaces/cython/cantera/utils.pyx
Outdated
return {key: value for (_, key, value) in py_items} | ||
|
||
|
||
cdef mergeAnyMap(CxxAnyMap& primary, CxxAnyMap& extra): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to handle that within the C++ layer? Also, as a comment, the anymapToPython
(or anymap_to_dict
) applies units whereas mergeAnyMap
does not.
Regarding my other comment about getParameters
, a C++ implementation of merge
would go in hand with having unique_ptr<AnyMap> parameters() const
returned throughout, which then would mean that you can AnyMap::merge(const AnyMap& other)
(or similar).
I assume that there may be a small performance penalty for smart pointers (and potentially a larger one if merging requires copying), but it would be easier to track where objects are initially created (while this may be subjective, it is often difficult to track how things are assembled when the receiving function creates the object which is subsequently added to as in your getParameters
approach).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while my initial comment is mostly moot, I believe this function is mainly used to merge output from getParameters
and input
? I believe that it would be nice to merge this by passing an optional flag to getParameters
where it applies (and handle the merge in C++). This may eliminate the need for this specific Cython function ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The initial reason for this came from the desire to control the order of the output keys. I want the user-defined keys in input
to generally come last, and the ordering mechanism mostly relies on preserving insertion order. If the base Reaction
class were responsible for adding in the user-defined keys, then they would come before any keys added by child classes.
I actually ended up finding a hybrid solution that both gets rid of the need to merge the fields defining object with the extra input data at the point of use, and avoids the requirement for the user-accessible function to take an AnyMap
as it's input, by making the virtual getParameters
method protected
, and introducing a non-virtual method with the signature AnyMap parameters(bool withInput=true)
on the base classes only. This way, the parameters
method can initialize the AnyMap
, use the virtual getParameters
method to populate all of the relevant fields, and then add the extra keys last before returning it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@speth ... I rescinded some of my initial comments as I believe your implementation is more efficient (despite being harder to track).
Beyond, I think adding an optional flag to getParameters
to add input
where it applies (and handling a merge within C++) may be more consistent and easier long-term. I'm also not sure about getSpeciesParameters
, as it seems somewhat superfluous. Spot-wise benchmark tests that illustrate that a re-imported solution preserves results may add another level of security.
PS: I believe it would make sense to finalize this before #995 (and rebase the latter, as this would allow me to port getParameters
immediately, rather than requiring another go-around).
PPS: Adding this capability to the extract_submechanism.py example would be neat.
self.assertEqual(len(generated['Pt_surf-reactions']), surf.n_reactions) | ||
self.assertEqual(len(generated['species']), surf.n_total_species) | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may make sense to expand tests some, e.g. generate yaml and benchmark test problems for both original and generated yaml input (along the lines of test_convert
for CTI/XML->YAML). Including larger mechanisms (Reitz?) may be interesting also.
Individual objects are well covered by C++ tests, so this is mainly testing the YamlWriter
's front end, correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, these tests were mainly just focused on the YamlWriter
Python interface, but I did add a couple of additional checks here to show that whole Solution
is being restored in a consistent state. I think the consistency of all the different phase and reaction types is reasonably-well covered by the ThermoYamlRoundTrip
test suite (in test/thermo/thermoToYaml.cpp
), and the checks done in the ReactionToYaml
test suite (in test/kinetics/kineticsFromYaml.cpp
) by the compareReactions
function.
test/general/test_serialization.cpp
Outdated
{"quantity", "mol"}, | ||
{"length", "cm"} | ||
}); | ||
// Should fail because pre-exponential factors from XML can't be converted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure that I am following based on what this comment states - while XML may have limitations for import, shouldn't this become moot once data are read and available to the C++ objects? (at which point everything should be standardized, i.e. everything is known implicitly)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is definitely a thorny issue. Because reactions are instantiated from XML without having a Kinetics
object available, there is no way to determine the dimensions of the rate constant. All you know is that the value itself has already been converted to Cantera's mks+kmol system. I guess you could set the value of Reaction.rate_units
when calling Kinetics.addReaction
, which would resolve this for what will presumably be the most common use case, serializing a complete phase definition. But a bare Reaction object created from XML does not contain the information needed to serialize it to a non-default unit system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I understand, the reaction order provides most of the missing information, so the main issues are non-standard units as well as volumetric vs surface geometry. Am I seeing this correctly? Regardless, I believe export should work if associated kinetics objects are defined (as you mention, this can be done implicitly and is likely the most common use case); this would leave an exception if the reaction isn’t fully set up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, the units are determined by the dimensionality of the various phases, plus the units of the standard concentration for each phase. And while kmol/m^3 or kmol/m^2 are probably the most common, there are several phases where the standard concentration is dimensionless (see the various implementations of standardConcentrationUnits
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the clarification; I believe I’m on the same page now - units do pop up in #995.
Regarding the XML conversion, I still believe that implicitly setting units would make sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realized while thinking about this that CTI/XML isn't the only case where you can manage to create a reaction without needing to set it's rate units -- this is also true if you just create a Reaction
object from scratch. Setting Reaction.rate_units
now occurs as part of Kinetics.addReaction
, if it doesn't get set before then. I also ended up refactoring up the rateCoeffUnits
function and making it a member function of the Reaction
class (as calculateRateCoeffUnits
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That’s what had been my suspicion all along. If you look at #995, I have new constructors for ReactionRate
from AnyMap
where this will become relevant. Obviously things need to be consolidated as I didn’t have rate_units
at my disposal before.
"Multiple species with different definitions are not " | ||
"supported:\n>>>>>>\n{}\n======\n{}\n<<<<<<\n", | ||
speciesDef.toYamlString(), | ||
speciesDefs[speciesDefIndex[name]].toYamlString()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As YamlWriter
is central to serialization, it may make sense to add a unit test for special cases (here and elsewhere as marked by CodeCov).
out << valueStr; | ||
width += name.size() + valueStr.size() + 4; | ||
} else { | ||
// Put items of an unknown (compound) type on a line alone |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As AnyMap
is central to serialization, it may make sense to add a unit test for special cases (here and elsewhere as marked by CodeCov).
interfaces/cython/cantera/utils.pyx
Outdated
return {key: value for (_, key, value) in py_items} | ||
|
||
|
||
cdef mergeAnyMap(CxxAnyMap& primary, CxxAnyMap& extra): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while my initial comment is mostly moot, I believe this function is mainly used to merge output from getParameters
and input
? I believe that it would be nice to merge this by passing an optional flag to getParameters
where it applies (and handle the merge in C++). This may eliminate the need for this specific Cython function ...
Currently only handles thermo, not kinetics or transport
The default order is the order in which items are added. Items originating from an input file come after programmatically added items, and retain their existing ordering.
Eliminates the 'mergeAnyMap' function, and introduces a 'parameters' method for classes to return an AnyMap which can optionally contain the user-provided input data, rather than needing to create the AnyMap in advance and add the user-created fields separately.
2f5ecea
to
c8aea00
Compare
I think I'm happy with the code coverage provided by the tests at this point. |
@speth ... thanks for going into more detail with test coverage. As mentioned above, this looks good to go. |
@ischoegl Feel free to hit the merge button 😄 |
🎉 |
Changes proposed in this pull request
getParameters(AnyMap&)
methods to ThermoPhase, Kinetics, Reaction, etc. classes to collect data needed for serializationinput_data
property to the corresponding Python objects, which returns this information converted to native Python types, e.g.dict
s andlist
sYamlWriter
class for creating input files defining phases, species, and reactions. Generated output files can contain multiple phase definitions and reaction sections, and have use user-specified unit systems.Examples
Python
input_data
property:YamlWriter
class in C++:From a Python
Solution
orInterface
:Remaining issues
_SolutionBase.input_data
property and_SolutionBase.write_yaml
method with theSolution
class. I tried adding a:members: input_data, write_yaml
directive inimporting.rst
, but it didn't seem to do anything.Potential future improvements
--extra
option ofck2yaml
. This probably requires adding a converter for Python data structures toAnyMap
.precision
option.Addresses Cantera/enhancements#11.
Checklist
scons build
&scons test
) and unit tests address code coverage