-
Notifications
You must be signed in to change notification settings - Fork 60
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add pythonizations for collection subscript #570
Merged
Merged
Changes from all commits
Commits
Show all changes
9 commits
Select commit
Hold shift + click to select a range
598e039
added pythonizer base class
m-fila 8172c85
added collection subscript pythonization
m-fila d972f41
Update python/podio/pythonizations/__init__.py
m-fila 734b786
collection `__getitem__` uses `at` wrapped to throw python exception
m-fila f724f4f
fix exception stacktrace readability
m-fila a489371
Update python/podio/pythonizations/collection_subscript.py
m-fila d438c5d
split pythonization callback to predicate and modifcation
m-fila 16f8a7d
added documentation
m-fila a7cf951
Applied suggestions in docs
m-fila File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,56 @@ | ||
# Python interface for data models | ||
|
||
Podio provides support for a Python interface for the generated data models. The [design choice](design.md) to create Python interface resembling the C++ interface is achieved by generating Python bindings from the C++ interface using | ||
[cppyy](https://cppyy.readthedocs.io/en/latest/index.html). To make pyROOT aware of the bindings, the cppyy functionality bundled with ROOT can be used. | ||
|
||
It's important to note that cppyy loads the bindings and presents them lazily at runtime to the Python interpreter, rather than writing Python interface files. Consequently, the Python bindings have a runtime dependencies on ROOT, cppyy and the data model's C++ interface. | ||
|
||
To load the Python bindings from a generated C++ model dictionary, first make sure the model's library and headers can be found in `LD_LIBRARY_PATH` and `ROOT_INCLUDE_HEADERS` respectively, then: | ||
|
||
```python | ||
import ROOT | ||
|
||
res = ROOT.gSystem.Load('libGeneratedModelDict.so') | ||
if res < 0: | ||
raise RuntimeError('Failed to load libGeneratedModelDict.so') | ||
``` | ||
|
||
For reference usage, see the [Python module of EDM4hep](https://github.com/key4hep/EDM4hep/blob/main/python/edm4hep/__init__.py). | ||
|
||
## Pythonizations | ||
|
||
Python as a language uses different constructions and conventions than C++, perfectly fine C++ code translated one to one to Python could be clunky by Python's standard. cppyy offers a mechanism called [pythonizations](https://cppyy.readthedocs.io/en/latest/pythonizations.html) to make the resulting bindings more pythonic. Some basic pythonizations are included automatically (for instance `operator[]` is translated to `__getitem__`) but others can be specified by a user. | ||
|
||
Podio comes with its own set of pythonizations useful for the data models generated with it. To apply all the provided pythonizations to a `model_namespace` namespace: | ||
|
||
```python | ||
from podio.pythonizations import load_pythonizations | ||
|
||
load_pythonizations("model_namespace") | ||
``` | ||
|
||
If only specific pythonizations should be applied: | ||
|
||
```python | ||
from podio.pythonizations import collection_subscript # specific pythonization | ||
|
||
collection_subscript.CollectionSubscriptPythonizer.register("model_namespace") | ||
``` | ||
|
||
### Developing new pythonizations | ||
|
||
To be discovered by `load_pythonizations`, any new pythonization should be placed in `podio.pythonizations` and be derived from the abstract class `podio.pythonizations.utils.pythonizer.Pythonizer`. | ||
|
||
A pythonization class should implement the following three class methods: | ||
|
||
- `priority`: The `load_pythonizations` function applies the pythonizations in increasing order of their `priority` | ||
- `filter`: A predicate to filter out classes to which given pythonization should be applied. See the [cppyy documentation](https://cppyy.readthedocs.io/en/latest/pythonizations.html#python-callbacks). | ||
- `modify`: Applying the modifications to the pythonized classes. | ||
|
||
### Considerations | ||
|
||
The cppyy pythonizations come with some considerations: | ||
|
||
- The general cppyy idea to lazily load only things that are needed applies only partially to the pythonizations. For instance, a pythonization modifying the `collection[]` will be applied the first time a class of `collection` is used, regardless if `collection[]` is actually used. | ||
- Each pythonization is applied to all the entities in a namespace and relies on a conditional mechanism (`filter` method) inside the pythonizations to select entities they modify. With a large number of pythonizations, the overheads will add up and slow down the usage of any class from a pythonized namespace. | ||
- The cppyy bindings hooking to the C++ routines are characterized by high performance compared to ordinary Python code. The pythonizations are written in Python and are executed at ordinary Python code speed. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
"""cppyy pythonizations for podio""" | ||
|
||
from importlib import import_module | ||
from pkgutil import walk_packages | ||
from .utils.pythonizer import Pythonizer | ||
|
||
|
||
def load_pythonizations(namespace): | ||
"""Register all available pythonizations for a given namespace""" | ||
module_names = [name for _, name, _ in walk_packages(__path__) if not name.startswith("test_")] | ||
for module_name in module_names: | ||
import_module(__name__ + "." + module_name) | ||
pythonizers = sorted(Pythonizer.__subclasses__(), key=lambda x: x.priority()) | ||
for i in pythonizers: | ||
i.register(namespace) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
"""Pythonize subscript operation for collections""" | ||
|
||
import cppyy | ||
from .utils.pythonizer import Pythonizer | ||
|
||
|
||
class CollectionSubscriptPythonizer(Pythonizer): | ||
"""Bound-check __getitem__ for classes derived from podio::CollectionBase""" | ||
|
||
@classmethod | ||
def priority(cls): | ||
return 50 | ||
|
||
@classmethod | ||
def filter(cls, class_, name): | ||
return issubclass(class_, cppyy.gbl.podio.CollectionBase) | ||
|
||
@classmethod | ||
def modify(cls, class_, name): | ||
def get_item(self, i): | ||
try: | ||
return self.at(i) | ||
except cppyy.gbl.std.out_of_range: | ||
raise IndexError("collection index out of range") from None | ||
|
||
class_.__getitem__ = get_item |
Empty file.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,57 @@ | ||
"""cppyy pythonizations for podio""" | ||
|
||
from abc import ABCMeta, abstractmethod | ||
import cppyy | ||
|
||
|
||
class Pythonizer(metaclass=ABCMeta): | ||
""" | ||
Base class to define cppyy pythonization for podio | ||
""" | ||
|
||
@classmethod | ||
@abstractmethod | ||
def priority(cls): | ||
"""Order in which the pythonizations are applied | ||
|
||
Returns: | ||
int: Priority | ||
""" | ||
|
||
@classmethod | ||
@abstractmethod | ||
def filter(cls, class_, name): | ||
""" | ||
Abstract classmethod to filter classes to which the pythonizations should be applied | ||
|
||
Args: | ||
class_ (type): Class object. | ||
name (str): Name of the class. | ||
|
||
Returns: | ||
bool: True if class should be pythonized. | ||
""" | ||
|
||
@classmethod | ||
@abstractmethod | ||
def modify(cls, class_, name): | ||
"""Abstract classmethod modifying classes to be pythonized | ||
|
||
Args: | ||
class_ (type): Class object. | ||
name (str): Name of the class. | ||
""" | ||
|
||
@classmethod | ||
def register(cls, namespace): | ||
"""Helper method to apply the pythonization to the given namespace | ||
|
||
Args: | ||
namespace (str): Namespace to by pythonized | ||
""" | ||
|
||
def pythonization_callback(class_, name): | ||
if cls.filter(class_, name): | ||
cls.modify(class_, name) | ||
|
||
cppyy.py.add_pythonization(pythonization_callback, namespace) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fails on
dev3
due to some overload resolution issue. I would assume it's a ROOT issue that we are hitting here, but I am not sure yet which one.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well an exception is thrown, but of a different type than expected:
According to the docs if there is an exception then cppyy checks the rest of overloads and if they all throw the same type then the exception is propagated, if they all throw different types then
TypeError
is risenHere both overloads throws
std::out_of_range
but somehow it isn't recognized as the same typeThe possible solutions I could think of now are:
cppyy.gbl.std.out_of_range
orTypeError
and raiseIndexError
- I don't like it becasue it's also used for other things, like actual type mismatch, eg. calling withstr
instead ofint
TypeError
notIndexError
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this can be boiled down to a small reproducer that we can submit to ROOT because they probably didn't want to introduce too many "breaking" changes in their update of the cppyy they bundle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened an issue wlav/cppyy#230 as the problems appears also outside of ROOT bundle
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also made the ROOT folks aware via root-project/root#15375