-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add dask_awkward support to fastjet #189
Conversation
@chrispap95 OK - sorry for the noise here while I beat it into shape. This is all ready for completion now. Let me know if you have any questions! There are perhaps a few places where it's not a trivial map_partitions but I haven't thought about it too hard. |
We probably also need to turn on the thread safety options for fastjet now, since dask's default scheduler is threaded. |
@chrispap95 ok - finished this up. It would be super useful to try this out with some user analysis and derive some tests before we merge it. |
@andrzejnovak FYI |
@lgray I can run the SUEP workflow on it. It is using only |
@chrispap95 cool - dask-based NanoEvents in the awkward2_dev branch of coffea should work for PFNano so you may just be able to start from that. |
@jmduarte I know you guys are using some of the lund plane functions and stuff - if you have some time could you give this new software stack a try? |
There are some bugs in this that need fixing:
|
OK - this is going to need some work now that I've got some tests running on a file. Update: I have a janky fix that seems to work on files well. @chrispap95 can you make a 2 event sample of that PFNano file you pointed me to so that I can make proper tests? |
@jpivarski Do you think this is reasonably robust enough? I checked in the code that deeper within all that is ever asked for is E/px/py/pz, but it all looks sorta sketchy IMO. |
for more information, see https://pre-commit.ci
uuughhh |
Fixes #188
Bindings appear to be pretty straightforward at the end of the day.
The trick we use to make delayed operations work is to construct a new ClusterSequence for each concrete array being processed.
Within DaskAwkwardClusterSequence we dispatch the function to be called and the JetDefinition.
There are a large number of functions to convert, but the majority of them seem pretty straightforward.