Add support for arbitrary mappings in a FunctionTask #565

ghisvail · 2022-08-16T13:42:19Z

Types of changes

New feature (non-breaking change which adds functionality)

Summary

This PR adds support for returning total or partial mappings of the output spec by the wrapped function, as an alternative to going through a conversion from the returned tuple to a dictionary via output_names. Unlike the named tuple case where the shape of the data is known a priori (I have got exactly 2 output variables which I want named to foo and bar), this PR adds support for functions which may return more, less or exactly these data as an arbitrary mapping.

This could be particularly useful for dynamic task construction use cases, like BIDS readers, parsers and writers.

Checklist

I have added tests to cover my changes (if necessary)
I have updated documentation (if necessary)

ghisvail · 2022-08-16T13:44:42Z

For a concrete example, it would simplify this piece of terrible code very much:

https://github.com/aramis-lab/pydra-bids/blob/main/pydra/tasks/bids/utils/read_bids.py

ghisvail · 2022-08-16T13:51:33Z

A safer alternative could also be to only return if the mapping returned by the inner function contains all keys declared in output_names. Right now, I went for the coalesce to None option, since this is what we are going for in case the function did not return anything. I am happy with either.

satra · 2022-08-16T15:10:42Z

a side note: for dynamic tasks as being targeted here, the cache checksum/location will only depend on inputs (e.g. the query string/template), not the underlying database. hence either the source directory would need to be provided and cached as a separate input(a potentially expensive operation), or another default input should be associated that uses a random number generator as a default. the current forcing option of self.rerun_task may not suffice in all cases, especially if the same task gets called multiple times.

ghisvail · 2022-08-16T16:02:24Z

Good point. How does caching work today with nipype for things like BIDSDataGrabber and DataSink? Would it work differently with pydra fundamentally?

satra · 2022-08-16T18:51:23Z

How does caching work today with nipype for things like BIDSDataGrabber and DataSink?

caching is local to a workflow and nipype1 does not have the concept of reusing the same task in multiple places or even across workflows. this is only available in pydra. in nipype1, simply setting the rerun flag would force that interface to rerun, but because there is no shared rerun capabilities, this particular issue does not come into play. so dynamic interfaces are simply set to rerun by default in nipype1.

codecov · 2022-08-19T16:34:50Z

Codecov Report

Base: 81.23% // Head: 38.26% // Decreases project coverage by -42.96% ⚠️

Coverage data is based on head (0f8869b) compared to base (a9a116d).
Patch coverage: 0.00% of modified lines in pull request are covered.

❗ Current head 0f8869b differs from pull request most recent head b89f9a2. Consider uploading reports for the commit b89f9a2 to get more accurate results

Additional details and impacted files

@@             Coverage Diff             @@
##           master     #565       +/-   ##
===========================================
- Coverage   81.23%   38.26%   -42.97%     
===========================================
  Files          20       20               
  Lines        4391     4393        +2     
  Branches     1262     1202       -60     
===========================================
- Hits         3567     1681     -1886     
- Misses        820     2416     +1596     
- Partials        4      296      +292

Flag	Coverage Δ
unittests	`38.24% <0.00%> (-43.00%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
pydra/engine/task.py	`47.09% <0.00%> (-46.45%)`	⬇️
pydra/tasks/__init__.py	`0.00% <0.00%> (-100.00%)`	⬇️
pydra/mark/functions.py	`17.64% <0.00%> (-82.36%)`	⬇️
pydra/engine/audit.py	`27.50% <0.00%> (-70.00%)`	⬇️
pydra/engine/state.py	`30.86% <0.00%> (-65.54%)`	⬇️
pydra/engine/graph.py	`31.89% <0.00%> (-62.80%)`	⬇️
pydra/engine/helpers_state.py	`35.08% <0.00%> (-60.53%)`	⬇️
pydra/utils/messenger.py	`47.94% <0.00%> (-47.95%)`	⬇️
pydra/engine/helpers.py	`39.00% <0.00%> (-47.76%)`	⬇️
pydra/engine/specs.py	`48.21% <0.00%> (-46.56%)`	⬇️
... and 9 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

djarecka · 2022-08-19T17:12:09Z

@ghisvail - can't access the link with the example anymore.

perhaps you can also add some test, e.g. pydra/mark/tests/test_functions.pyand/orpydra/engine/tests/test_task.py`

ghisvail · 2022-08-20T15:58:23Z

Yes, I am planning to add some unit tests covering this particular usage.

I'll ping you once done :+1

ghisvail · 2022-12-31T11:08:15Z

I added a test case which covers present, absent and undefined mappings from returned dict to annotation.

A natural extension to this PR would be to handle functions annotated with a TypedDict return value.

ghisvail · 2022-12-31T11:35:39Z

@ghisvail - can't access the link with the example anymore.

I did some refactoring of the code in the meantime. This is what it looks like today:

https://github.com/aramis-lab/pydra-bids/blob/main/pydra/tasks/bids/utils.py

Granted it's likely a terrible way of making dynamic tasks for reasons I have not thought of, but I sense there could be a nice pattern emerging from this.

ghisvail · 2023-02-17T15:34:09Z

@djarecka do you think we could get this one in for the release too?

ghisvail · 2023-02-17T15:34:15Z

I was also thinking that extending this feature to TypedDict may help reduce typing redundancy by doing:

class Point(TypedDict):
    x: int
    y: int

@task
def random_point() -> Point:
    return Point(x=random.randrange(16), y=random.randrange(16))

And have the resulting task automatically expose x and y as integer-typed output.

Maybe?

djarecka · 2023-02-17T15:39:55Z

if you think that you could use it, then sure

effigies · 2023-02-17T15:41:20Z

Going forward, I think it would be good to deprecate {'name': type} for TypedDict, which didn't exist when we initially came up with the return annotations.

ghisvail · 2023-02-17T16:31:36Z

Going forward, I think it would be good to deprecate {'name': type} for TypedDict, which didn't exist when we initially came up with the return annotations.

We might not need to deprecate it, since TypedDict can be constructed dynamically. So we could normalize tasks internals to use TypedDict but keep annotate as a user-friendly wrapper to help generate the TypedDict definition from the mapping passed in annotate.

What do you think of this plan?

effigies · 2023-02-17T16:37:29Z

Sounds good to me.

ghisvail · 2023-02-17T16:50:37Z

Ok, then I'll find some time to start hacking on it 👍

djarecka · 2023-02-22T01:44:54Z

@ghisvail - you want to still work in this PR, or you want me to merge this one and open a new one?

ghisvail · 2023-02-25T08:24:40Z

@djarecka please go ahead with merging. I'll open a follow-up PR for TypedDict.

ghisvail force-pushed the feat/run-task-dict branch 2 times, most recently from 2b5d79a to 4d765c8 Compare December 31, 2022 10:53

ENH: Add support for mappings in a FunctionTask

b89f9a2

ghisvail force-pushed the feat/run-task-dict branch from 0f8869b to b89f9a2 Compare December 31, 2022 10:54

djarecka merged commit 306ac27 into nipype:master Feb 27, 2023

ghisvail deleted the feat/run-task-dict branch March 8, 2023 17:17

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add support for arbitrary mappings in a FunctionTask #565

Add support for arbitrary mappings in a FunctionTask #565

ghisvail commented Aug 16, 2022

ghisvail commented Aug 16, 2022

ghisvail commented Aug 16, 2022

satra commented Aug 16, 2022

ghisvail commented Aug 16, 2022

satra commented Aug 16, 2022

codecov bot commented Aug 19, 2022 •

edited

Loading

djarecka commented Aug 19, 2022

ghisvail commented Aug 20, 2022

ghisvail commented Dec 31, 2022 •

edited

Loading

ghisvail commented Dec 31, 2022

ghisvail commented Feb 17, 2023

ghisvail commented Feb 17, 2023

djarecka commented Feb 17, 2023

effigies commented Feb 17, 2023

ghisvail commented Feb 17, 2023

effigies commented Feb 17, 2023

ghisvail commented Feb 17, 2023

djarecka commented Feb 22, 2023

ghisvail commented Feb 25, 2023

Add support for arbitrary mappings in a FunctionTask #565

Add support for arbitrary mappings in a FunctionTask #565

Conversation

ghisvail commented Aug 16, 2022

Types of changes

Summary

Checklist

ghisvail commented Aug 16, 2022

ghisvail commented Aug 16, 2022

satra commented Aug 16, 2022

ghisvail commented Aug 16, 2022

satra commented Aug 16, 2022

codecov bot commented Aug 19, 2022 • edited Loading

Codecov Report

djarecka commented Aug 19, 2022

ghisvail commented Aug 20, 2022

ghisvail commented Dec 31, 2022 • edited Loading

ghisvail commented Dec 31, 2022

ghisvail commented Feb 17, 2023

ghisvail commented Feb 17, 2023

djarecka commented Feb 17, 2023

effigies commented Feb 17, 2023

ghisvail commented Feb 17, 2023

effigies commented Feb 17, 2023

ghisvail commented Feb 17, 2023

djarecka commented Feb 22, 2023

ghisvail commented Feb 25, 2023

codecov bot commented Aug 19, 2022 •

edited

Loading

ghisvail commented Dec 31, 2022 •

edited

Loading