Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add python bindings for the datamodel classes #204

Merged
merged 13 commits into from
Jun 14, 2023
Merged

Conversation

jmcarcell
Copy link
Contributor

BEGINRELEASENOTES

  • Add python bindings for the datamodel classes and some documentation on how to use the bindings

ENDRELEASENOTES

See the README for examples on how to use them

Copy link
Contributor

@tmadlener tmadlener left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking care of this. This already looks pretty good, My main complaint is the factory being something that the users can see. I suppose we need the factory because of the dynamic nature of ROOTs python bindings, right? Can we still try to "hide" it in the __init__.py or in datamodel.py such that it is already instantiated and what the user sees would look something like:

from edm4hep import edm4hep
particle = edm4hep.MCParticle()
# ...

This should also work with from edm4hep import *, but I am not sure I like that better. Ideally it would just be import edm4hep and then we have the factory in hand, but I don't see a way how that could work.

README.md Show resolved Hide resolved
python/edm4hep/datamodel.py Outdated Show resolved Hide resolved
@jmcarcell
Copy link
Contributor Author

It has been changed now to edm4hep so both work:

>>> from edm4hep import *
>>> edm4hep
<edm4hep.datamodel.Datamodel object at 0x7f4bd08fa0b0>
>>> edm4hep.MCParticle()
<cppyy.gbl.edm4hep.MCParticle object at 0x55b3f52c4710>

or the one in the README:

>>> from edm4hep import edm4hep
>>> edm4hep.MCParticle()
<cppyy.gbl.edm4hep.MCParticle object at 0x55b4034ba030>

import edm4hep doesn't work since it's the top level module... And maybe it's not a good idea in case we have other python tools in the future for edm4hep

@tmadlener
Copy link
Contributor

tmadlener commented May 7, 2023

Thanks, this looks very good now. I just realized we do have some tests for utility functionality already, that redo some of the stuff that is introduced here:

if ROOT.gSystem.Load('libedm4hepDict.so') < 0:
print('Cannot load edm4hep dictionary for tests')
sys.exit(1)
ROOT.gInterpreter.LoadFile(os.path.dirname(__file__) + '/../../utils/include/edm4hep/utils/kinematics.h')
from ROOT import edm4hep

Now that we have a proper place for an __init__.py maybe we could even use that to also include the JIT loading of the utils/kinematics.h header properly instead of the hackish way we do it at the moment for the tests.

@tmadlener
Copy link
Contributor

Even more cases in the tests that could be cleaned up:

# Load the implementations of the utility functions
ROOT.gSystem.Load('libedm4hepRDF.so')
# Load the declarations of the utility functions to make them available for JIT
# compilation
ROOT.gInterpreter.LoadFile('edm4hep/utils/dataframe.h')

@jmcarcell
Copy link
Contributor Author

That is done now. I think it could be done on demand when using python but in the cases where something in edm4hep::utils is passed as a string python has no way to know so this will always load both headers. For running the tests manually it's a bit annoying that one has to set the paths but it's hard to make it possible to load the headers always. I tried and had a working version for tests and runtime without needing to set the paths but then someone could break it by installing instead of in install/ in the main tree in install1/install2 or making an in-source build

hegner
hegner previously requested changes May 10, 2023
README.md Outdated Show resolved Hide resolved
python/edm4hep/datamodel.py Outdated Show resolved Hide resolved
Comment on lines +10 to +12
res = ROOT.gInterpreter.LoadFile('edm4hep/utils/kinematics.h')
if res != 0:
raise RuntimeError('Failed to load kinematics.h')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this work like this?

from edm4hep import utils
p4 = utils.p4

or

import edm4hep
p4 = edm4hep.utils.p4

Or is this already possible?

Copy link
Contributor Author

@jmcarcell jmcarcell Jun 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made some changes in ad539e7 and now it's possible:

from edm4hep import edm4hep
particle = edm4hep.MCParticle() # default initialized particle
particle.getCharge() # 0.0

from edm4hep import utils
p  = utils.p4(particle) # edm4hep.utils.p4 or utils.p4 works
print(p) # (0,0,0,0)

There are two classes now, one for serving all classes from edm4hep and another one for the utils namespace. If there were more namespaces it's easy to change to a dictionary-based class that would store each of the edm4hep.namespace object and serve by doing a lookup on the dictionary

Copy link
Contributor

@tmadlener tmadlener left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice, thanks a lot for having another look at this. Since we are already at it, I think we could also add a __version__ to the __init__.py for an even more pythonic feel.

In podio we have a __init__.py.in that we configure during the cmake stage. This should be trivial to add here as well, I think.

import ROOT
res = ROOT.gSystem.Load('libedm4hepDict.so')
if res < 0:
raise RuntimeError('Failed to load libedm4hepDict.so')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering whether all these RuntimeErrors should be ImportErrors instead. In the end import edm4hep is the thing that will fail if one of these is triggered.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well typically for the libraries it will be because they are not in LD_LIBRARY_PATH which is a runtime search so one could say it's a RuntimeError ... both are quite generic, I don't think ImportError tells you much, that will also be quite obveious when checking the trace. So I don't have a strong opinion on this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, maybe a RuntimeError is better, because that doesn't point you towards a python problem in this case.

@jmcarcell
Copy link
Contributor Author

I added a version file. I preferred that over having an __init__.py.in so that it's possible to have the module loaded when developing and have suggestions, fixes, etc right away:
2023-06-14-152148_1254x156_scrot
and if the version is not there it will complain but keep loading everything else.

python/CMakeLists.txt Outdated Show resolved Hide resolved
Copy link
Contributor

@tmadlener tmadlener left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really final comment (:wink:): I think it would make sense to add python/edm4hep/__version__.py to .gitignore

@tmadlener tmadlener enabled auto-merge (squash) June 14, 2023 16:33
@tmadlener tmadlener dismissed hegner’s stale review June 14, 2023 17:10

request has been addressed

@tmadlener tmadlener merged commit 8f93d27 into master Jun 14, 2023
@tmadlener tmadlener deleted the python-bindings branch June 28, 2023 11:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants