
proposal for serialisation of models #2787

Closed
martinjrobins opened this issue Mar 20, 2023 · 7 comments · Fixed by #3397
Assignees
Labels
difficulty: hard Will take several weeks feature in-progress Assigned in the core dev monthly meeting priority: medium To be resolved if time allows

Comments

@martinjrobins
Contributor

Description

  1. Develop a text serialisation format for pybamm models
  2. Write an export tool to export pybamm models into this format
  3. Write an import tool to read in models in this format

Motivation

One of the original goals of PyBaMM was to facilitate the sharing of physics-based battery models. PyBaMM has been very successful in this, allowing users to share pybamm models created using the Python language. However, to enable wider "shareability" of PyBaMM models, I would propose that we need a text serialisation format that pybamm models can be converted to and created from. This would enable easier interoperability of pybamm with other solvers or tools. For example, if someone was developing a battery model in Matlab they could write it out in this format for later import into pybamm. Or if someone was developing a battery parameterisation tool (in any language) they could allow the import of pybamm models by writing a reader for our serialisation format.

Possible Implementation

I would propose that we focus on serialisation of pybamm models that are already discretised and ready to be solved, as sharing a continuum model still leaves open many questions about how that model should be discretised.

My proposal for a serialisation format is a text-based, human-readable language based on tensors (inspired mainly by the TACO tensor algebra compiler), an example of which is below. This is based on another project I'm working on, and I'm happy to iterate on it; I just wanted to put something down to start the conversation!

in_i {                    //"input" tensor, describes input parameters to the model
    r -> [0, inf],
    k -> [0, inf],
}
sm_ij {                     // rank 2 tensor (indexed by i and j)
    (0..2, 0..2): 1,       // diagonal entries denoted by .. range format
}
I_ij {
    (0:2, 0:2): sm_ij,    // block entries denoted by : range format
    (2, 2): 1,                // sparse tensors, any indices not here are implicitly zero
    (3, 3): 1,
}
u_i {
    y -> R**2 = 1,      // "u" tensor is the state, here y is a vector of dimension 2, initialised to 1 at t=0
    z -> R**2,
}
rhs_i {
    (r * y_i) * (1 - (y_i / k)),    // expressions use tensor index notation
    (2 * y_i) - z_i,
}
F_i {                                  // model equations expressed by "F" and "G" tensors
    dot(y_i),                       // such that the equations are $ F(u, \dot{u}, t) = G(u, t) $
    0,
    0,
}
G_i {
    sum(j, I_ij * rhs_j),          // this is a matrix multiply using a sum over index "j"
}
out_i {                             // "out" tensor describes the output of the model
    y_i,
    t,
    z_i,
}
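To illustrate the $F(u, \dot{u}, t) = G(u, t)$ form above, here is a rough scalar Python sketch of the residual a DAE solver would drive to zero for the example model; the function name `residual` and the scalar simplification (one `y`, one `z`) are mine, not part of the proposal:

```python
# Hedged sketch (not part of the proposal itself): the example model above,
# reduced to scalars and written as a residual F(u, du/dt, t) - G(u, t) = 0.
def residual(t, u, u_dot, r=1.0, k=2.0):
    y, z = u
    y_dot, _ = u_dot
    F = [y_dot, 0.0]                      # F_i: [dot(y), 0]
    G = [r * y * (1 - y / k), 2 * y - z]  # G_i: logistic rhs, algebraic eq
    return [F[0] - G[0], F[1] - G[1]]
```

A DAE solver such as IDA would then integrate the system by finding `u`, `u_dot` that make this residual zero at each time step.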

Additional context

There have been a few proposals for serialisation formats for model parameters (e.g. BPX), but I would argue that the usefulness of these is very much hampered by the lack of a model serialisation format. Having a bunch of parameters means nothing unless you have a description of the model that uses those parameters. E.g. $y = \exp(-at)$ with $a=1$ is very different from $y = \exp(-at/10)$ with $a=1$, even though the two models are very similar (you would describe them both as "exponential decay"; just the details are different).
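To make the point concrete, a quick numerical check of the two "exponential decay" models with the same parameter value:

```python
import math

a, t = 1.0, 1.0
y1 = math.exp(-a * t)        # model 1: y = exp(-a t)
y2 = math.exp(-a * t / 10)   # model 2: y = exp(-a t / 10)
# Same parameter a = 1, but the two models disagree substantially
# at t = 1 (roughly 0.37 vs 0.90), so the parameter file alone
# is not enough to reproduce a result.
```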

@valentinsulzer
Member

Sounds good. Anytree also has a JSON exporter, which would be easier to implement but maybe not as generalizable. What about FMU?

@martinjrobins
Contributor Author

FMU is nice in that it's an existing standard, I like that. The bit I don't like is the XML description: it's machine readable but not human readable. There is a reason we program in languages and not in XML...

Mind you, the "existing standard" bit is very convincing, so I'm happy to be persuaded!

@martinjrobins
Contributor Author

Can FMU do sparse vectors or linear algebra? I can't find anything on this.

@martinjrobins
Contributor Author

I'm still playing around with a human-readable serialisation format similar to the above, but would suggest that for now we just go with @tinosulzer's suggestion of a json exporter, you basically write out every node in the expression tree in a large json-format tree. Not really readable but it will be much easier to write the exporter/importer.

I think we should put a version number in the output, and make sure that if the format ever needs to change (e.g. a node in the expression tree gets a new field) we increment the version number and support reading in all prior versions.
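A minimal sketch of what the node-per-dict JSON export with a format version might look like; the field names (`format_version`, `type`, `children`) and function names are illustrative, not a decided schema:

```python
import json

FORMAT_VERSION = 1  # bump whenever the schema changes

def node_to_dict(node):
    # Each expression-tree node becomes a dict: its class name plus
    # recursively serialised children (a hypothetical minimal schema;
    # real nodes would also carry fields like names and values).
    return {
        "type": type(node).__name__,
        "children": [node_to_dict(c) for c in getattr(node, "children", [])],
    }

def export_model(root):
    return json.dumps({"format_version": FORMAT_VERSION,
                       "tree": node_to_dict(root)})

def import_model(text):
    data = json.loads(text)
    if data["format_version"] > FORMAT_VERSION:
        raise ValueError(f"unsupported format version {data['format_version']}")
    return data["tree"]
```

A reader for an older version would dispatch on `format_version` to fill in any fields added later with defaults.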

@martinjrobins
Contributor Author

martinjrobins commented Apr 17, 2023

An alternative to JSON is FlatBuffers (https://flatbuffers.dev/). This could simplify transferring pybamm models to other languages (e.g. Julia). Saying that, there are lots of libraries for JSON as well, so perhaps it wouldn't simplify things, just make the actual data transfer a lot faster!

UPDATE: Supported languages are:

C
C++ - snapcraft.io
C# - nuget.org
Dart - pub.dev
Go - go.dev
Java - Maven
JavaScript - NPM
Kotlin
Lobster
Lua
PHP
Python - PyPI
Rust - crates.io
Swift - swiftpackageindex
TypeScript - NPM
Nim
Julia - https://docs.juliahub.com/FlatBuffers/rNtRK/0.6.1/
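For reference, a FlatBuffers schema for an expression-tree node might look something like the sketch below; the table and field names are my invention, just to show the shape of an `.fbs` file, not a proposed schema:

```
// expression_tree.fbs -- hypothetical sketch, names illustrative
table Node {
  op: string;          // operator or symbol name
  children: [Node];    // child expressions (tables may be recursive)
  value: double;       // payload for constants
}

table Model {
  format_version: uint;
  rhs: [Node];
}

root_type Model;
```

Running `flatc` on such a schema generates the reader/writer code for each target language in the list above.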

@martinjrobins
Contributor Author

There is also this: https://protobuf.dev/

@valentinsulzer valentinsulzer added priority: medium To be resolved if time allows difficulty: hard Will take several weeks labels May 15, 2023
@valentinsulzer valentinsulzer added the in-progress Assigned in the core dev monthly meeting label Jun 12, 2023
@pipliggins
Contributor

pipliggins commented Jul 7, 2023

Just wanted to post a quick progress update on this issue:

I first looked at Pydantic, a library which can use type annotations to generate a JSON schema for serialising Python objects. To integrate with Pydantic, PyBaMM would have to be type-hinted throughout and inherit from Pydantic's BaseModel class. We'd also have to make use of this patch, which fixes a Pydantic issue related to property getters/setters, a pattern used frequently in PyBaMM. However, Pydantic's serialisation support doesn't work out of the box for PyBaMM, since most PyBaMM objects are not natively JSON serialisable; Pydantic cannot infer how to serialise them from the base types alone, so we'd still have to manually extend the JSONEncoder for each PyBaMM object we wish to serialise.
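For context, "extending the JSONEncoder" means something like the following; the `Symbol` class here is a stand-in for a PyBaMM expression node, not the real class:

```python
import json

class Symbol:
    # Stand-in for a PyBaMM-style expression node (not the real class).
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

class ExpressionEncoder(json.JSONEncoder):
    def default(self, obj):
        # Called by json.dumps for any object it cannot serialise natively;
        # anything we don't handle falls through to the base class (and raises).
        if isinstance(obj, Symbol):
            return {"name": obj.name, "children": obj.children}
        return super().default(obj)

text = json.dumps(Symbol("plus", [Symbol("a"), Symbol("b")]),
                  cls=ExpressionEncoder)
```

Each PyBaMM class that isn't natively serialisable would need its own branch in `default` (or its own encoder), which is the manual work referred to above.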

While experimenting with Pydantic, I added type hints to most expression tree files. I’ll include these in a separate pull request even if they end up not being required for serialization.

Before continuing with Pydantic, I looked for more automated alternatives. JSONpickle is an obvious candidate: the library reads/writes JSON files for pickleable Python objects. The authors demonstrate cross-language support with a deserialization module that can reconstruct Python objects in JavaScript, but similar code could be written in any language. JSONpickle also supports complex Python objects: “py/id” tags are used to handle multiple references made to the same Python object.

I ran a few tests of JSONpickle. First, I serialized an expression tree (inspired by the PyBaMM expression tree example):

import pybamm
import jsonpickle

y = pybamm.StateVector(slice(0,1))
t = pybamm.t

equation = 2*y * (1 - y) + t

eq_json = jsonpickle.dumps(equation, keys=True)
eq_loaded = jsonpickle.loads(eq_json, keys=True)

This works great! Next, I tried to serialize a model object which is part of a PyBaMM simulation (inspired by this example):

import pybamm
import jsonpickle
import jsonpickle.ext.numpy as jsonpickle_numpy
jsonpickle_numpy.register_handlers()

model = pybamm.lithium_ion.DFN()
sim = pybamm.Simulation(model)
state = sim.__getstate__()

model_json = jsonpickle.dumps(state['model'], keys=True)
model_r = jsonpickle.loads(model_json, keys=True)

Unfortunately, this code produces errors. I started debugging by writing a script that recurses through the object and tries dumping & loading each property. This reveals errors with multiple properties in the object structure. I’ve attached a stack dump I generated: 2023-07-06_13-06-43.txt
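A generic version of that debugging script might look like this; it is parameterised over the `dumps`/`loads` pair so the same helper works with jsonpickle or plain `json`, and the function name is mine:

```python
def find_unserialisable(obj, dumps, loads, prefix=""):
    """Round-trip each attribute of obj through dumps/loads and collect
    the paths of attributes that fail: a sketch of a debugging aid, not
    the actual script used above."""
    failures = []
    for name, value in vars(obj).items():
        path = f"{prefix}.{name}" if prefix else name
        try:
            loads(dumps(value))
        except Exception as exc:
            failures.append((path, repr(exc)))
    return failures
```

With jsonpickle you would pass `lambda v: jsonpickle.dumps(v, keys=True)` and the matching `loads`, then recurse into the failing attributes to narrow down the offending objects.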

I’m going to continue debugging issues here while having a look at Google Protobuf as an alternative cross-platform serialisation method.

@pipliggins pipliggins linked a pull request Oct 2, 2023 that will close this issue