Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for arbitrary mappings in a FunctionTask #565

Merged
merged 1 commit into from
Feb 27, 2023

Conversation

ghisvail
Copy link
Collaborator

Types of changes

  • New feature (non-breaking change which adds functionality)

Summary

This PR adds support for returning total or partial mappings of the output spec by the wrapped function, as an alternative to going through a conversion from the returned tuple to a dictionary via output_names. Unlike the named tuple case where the shape of the data is known a priori (I have got exactly 2 output variables which I want named to foo and bar), this PR adds support for functions which may return more, less or exactly these data as an arbitrary mapping.

This could be particularly useful for dynamic task construction use cases, like BIDS readers, parsers and writers.

Checklist

  • I have added tests to cover my changes (if necessary)
  • I have updated documentation (if necessary)

@ghisvail
Copy link
Collaborator Author

For a concrete example, it would simplify this piece of terrible code very much:

https://github.com/aramis-lab/pydra-bids/blob/main/pydra/tasks/bids/utils/read_bids.py

@ghisvail
Copy link
Collaborator Author

A safer alternative could also be to only return if the mapping returned by the inner function contains all keys declared in output_names. Right now, I went for the coalesce to None option, since this is what we are going for in case the function did not return anything. I am happy with either.

@satra
Copy link
Contributor

satra commented Aug 16, 2022

a side note: for dynamic tasks as being targeted here, the cache checksum/location will only depend on inputs (e.g. the query string/template), not the underlying database. hence either the source directory would need to be provided and cached as a separate input(a potentially expensive operation), or another default input should be associated that uses a random number generator as a default. the current forcing option of self.rerun_task may not suffice in all cases, especially if the same task gets called multiple times.

@ghisvail
Copy link
Collaborator Author

Good point. How does caching work today with nipype for things like BIDSDataGrabber and DataSink? Would it work differently with pydra fundamentally?

@satra
Copy link
Contributor

satra commented Aug 16, 2022

How does caching work today with nipype for things like BIDSDataGrabber and DataSink?

caching is local to a workflow and nipype1 does not have the concept of reusing the same task in multiple places or even across workflows. this is only available in pydra. in nipype1, simply setting the rerun flag would force that interface to rerun, but because there is no shared rerun capabilities, this particular issue does not come into play. so dynamic interfaces are simply set to rerun by default in nipype1.

@codecov
Copy link

codecov bot commented Aug 19, 2022

Codecov Report

Base: 81.23% // Head: 38.26% // Decreases project coverage by -42.96% ⚠️

Coverage data is based on head (0f8869b) compared to base (a9a116d).
Patch coverage: 0.00% of modified lines in pull request are covered.

❗ Current head 0f8869b differs from pull request most recent head b89f9a2. Consider uploading reports for the commit b89f9a2 to get more accurate results

Additional details and impacted files
@@             Coverage Diff             @@
##           master     #565       +/-   ##
===========================================
- Coverage   81.23%   38.26%   -42.97%     
===========================================
  Files          20       20               
  Lines        4391     4393        +2     
  Branches     1262     1202       -60     
===========================================
- Hits         3567     1681     -1886     
- Misses        820     2416     +1596     
- Partials        4      296      +292     
Flag Coverage Δ
unittests 38.24% <0.00%> (-43.00%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pydra/engine/task.py 47.09% <0.00%> (-46.45%) ⬇️
pydra/tasks/__init__.py 0.00% <0.00%> (-100.00%) ⬇️
pydra/mark/functions.py 17.64% <0.00%> (-82.36%) ⬇️
pydra/engine/audit.py 27.50% <0.00%> (-70.00%) ⬇️
pydra/engine/state.py 30.86% <0.00%> (-65.54%) ⬇️
pydra/engine/graph.py 31.89% <0.00%> (-62.80%) ⬇️
pydra/engine/helpers_state.py 35.08% <0.00%> (-60.53%) ⬇️
pydra/utils/messenger.py 47.94% <0.00%> (-47.95%) ⬇️
pydra/engine/helpers.py 39.00% <0.00%> (-47.76%) ⬇️
pydra/engine/specs.py 48.21% <0.00%> (-46.56%) ⬇️
... and 9 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@djarecka
Copy link
Collaborator

@ghisvail - can't access the link with the example anymore.

perhaps you can also add some test, e.g. pydra/mark/tests/test_functions.pyand/orpydra/engine/tests/test_task.py`

@ghisvail
Copy link
Collaborator Author

Yes, I am planning to add some unit tests covering this particular usage.

I'll ping you once done :+1

@ghisvail ghisvail force-pushed the feat/run-task-dict branch 2 times, most recently from 2b5d79a to 4d765c8 Compare December 31, 2022 10:53
@ghisvail
Copy link
Collaborator Author

ghisvail commented Dec 31, 2022

I added a test case which covers present, absent and undefined mappings from returned dict to annotation.

A natural extension to this PR would be to handle functions annotated with a TypedDict return value.

@ghisvail
Copy link
Collaborator Author

@ghisvail - can't access the link with the example anymore.

I did some refactoring of the code in the meantime. This is what it looks like today:

https://github.com/aramis-lab/pydra-bids/blob/main/pydra/tasks/bids/utils.py

Granted it's likely a terrible way of making dynamic tasks for reasons I have not thought of, but I sense there could be a nice pattern emerging from this.

@ghisvail
Copy link
Collaborator Author

@djarecka do you think we could get this one in for the release too?

@ghisvail
Copy link
Collaborator Author

I was also thinking that extending this feature to TypedDict may help reduce typing redundancy by doing:

class Point(TypedDict):
    x: int
    y: int

@task
def random_point() -> Point:
    return Point(x=random.randrange(16), y=random.randrange(16))

And have the resulting task automatically expose x and y as integer-typed output.

Maybe?

@djarecka
Copy link
Collaborator

if you think that you could use it, then sure

@effigies
Copy link
Contributor

Going forward, I think it would be good to deprecate {'name': type} for TypedDict, which didn't exist when we initially came up with the return annotations.

@ghisvail
Copy link
Collaborator Author

Going forward, I think it would be good to deprecate {'name': type} for TypedDict, which didn't exist when we initially came up with the return annotations.

We might not need to deprecate it, since TypedDict can be constructed dynamically. So we could normalize tasks internals to use TypedDict but keep annotate as a user-friendly wrapper to help generate the TypedDict definition from the mapping passed in annotate.

What do you think of this plan?

@effigies
Copy link
Contributor

Sounds good to me.

@ghisvail
Copy link
Collaborator Author

Ok, then I'll find some time to start hacking on it 👍

@djarecka
Copy link
Collaborator

@ghisvail - you want to still work in this PR, or you want me to merge this one and open a new one?

@ghisvail
Copy link
Collaborator Author

@djarecka please go ahead with merging. I'll open a follow-up PR for TypedDict.

@djarecka djarecka merged commit 306ac27 into nipype:master Feb 27, 2023
@ghisvail ghisvail deleted the feat/run-task-dict branch March 8, 2023 17:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants