
moving from deepcopy to pickle files in to_job method #246

Merged

19 commits merged from mnt/memory_tojob into nipype:master on Jun 2, 2020

Conversation

djarecka (Collaborator) commented May 8, 2020

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Summary

this is a part of an effort to decrease memory load (#226)

to_job creates pickle files for a task (without input) and for the input (right now the input is saved together with the original task) instead of creating deep copies. The files are loaded and the specific input is set before running the task/wf.
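
As an illustration of the idea, here is a minimal, self-contained sketch with toy classes and illustrative helper names (the real helpers added in this PR, e.g. load_task, differ in their details):

import pickle
import tempfile
from pathlib import Path

class ToyTask:
    """Stand-in for a pydra task: holds one set of inputs per state index."""
    def __init__(self, name, state_inputs):
        self.name = name
        self.state_inputs = state_inputs
        self.inputs = None

    def run(self):
        return f"{self.name} ran with inputs {self.inputs}"

def save_task(task, pkl_dir):
    # pickle the task once instead of deep-copying it for every state index
    task_pkl = Path(pkl_dir) / "_task.pklz"
    with open(task_pkl, "wb") as fp:
        pickle.dump(task, fp)
    return task_pkl

def load_and_run(task_pkl, ind=None):
    # load the pickled task, set the input for the requested state index, run it
    with open(task_pkl, "rb") as fp:
        task = pickle.load(fp)
    if ind is not None:
        task.inputs = task.state_inputs[ind]
    return task.run()

with tempfile.TemporaryDirectory() as tmpdir:
    pkl = save_task(ToyTask("average", state_inputs=[{"x": 1}, {"x": 2}]), tmpdir)
    print(load_and_run(pkl, ind=0))  # average ran with inputs {'x': 1}
    print(load_and_run(pkl, ind=1))  # average ran with inputs {'x': 2}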

Checklist

  • All tests passing
  • I have added tests to cover my changes
  • I have updated documentation (if necessary)
  • My code follows the code style of this project
    (we are using black: you can pip install pre-commit,
    run pre-commit install in the pydra directory
    and black will be run automatically with each commit)

Acknowledgment

  • I acknowledge that this contribution will be available under the Apache 2 license.

…sk has a state, but instead a soft copy of the task and the input is saved in pickle files, so it can be loaded later (in specific workers)
djarecka changed the title from "[wip] moving from deepcopy to pickle files in to_job method" to "moving from deepcopy to pickle files in to_job method" on May 8, 2020
codecov bot commented May 8, 2020

Codecov Report

Merging #246 into master will decrease coverage by 0.87%.
The diff coverage is 87.02%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #246      +/-   ##
==========================================
- Coverage   86.74%   85.86%   -0.88%     
==========================================
  Files          19       19              
  Lines        3077     3156      +79     
  Branches      824      837      +13     
==========================================
+ Hits         2669     2710      +41     
- Misses        256      285      +29     
- Partials      152      161       +9     
Flag         Coverage Δ
#unittests   85.86% <87.02%> (-0.88%) ⬇️

Impacted Files              Coverage Δ
pydra/engine/helpers.py     85.01% <83.09%> (-2.08%) ⬇️
pydra/engine/submitter.py   88.28% <84.61%> (-3.23%) ⬇️
pydra/engine/workers.py     78.60% <92.10%> (-6.14%) ⬇️
pydra/conftest.py           77.77% <100.00%> (-22.23%) ⬇️
pydra/engine/core.py        89.95% <100.00%> (ø)
pydra/utils/profiler.py     38.16% <0.00%> (-5.35%) ⬇️
pydra/__init__.py           93.33% <0.00%> (+33.33%) ⬆️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f042738...d17b593. Read the comment docs.

satra (Contributor) commented May 9, 2020

just some thoughts here:

deepcopy in to_job may be essential to ensure that tasks inside a Workflow are copied instead of referenced, since to_job applies to all Tasks. a shallow copy will not make copies of nested objects. depending on when a copy happens, this may require deepcopy, but i haven't thought through the entire workflow.
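
For illustration only (toy classes, not pydra's): a shallow copy creates a new outer object but keeps references to the nested objects, so mutating a nested node through the copy also changes the original, while deepcopy duplicates the whole object graph:

import copy

class Node:
    def __init__(self, inputs):
        self.inputs = inputs

class Graph:
    def __init__(self, nodes):
        self.nodes = nodes

g = Graph([Node({"x": 1})])
shallow = copy.copy(g)      # new Graph, but the same Node objects inside
deep = copy.deepcopy(g)     # fully independent copies of the nested Nodes

shallow.nodes[0].inputs["x"] = 99
print(g.nodes[0].inputs["x"])     # 99 -- the original changed through the shallow copy
print(deep.nodes[0].inputs["x"])  # 1  -- the deep copy is unaffected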

i don't think getting input ind should happen in the worker task load func. workers should just process a single task that they are given. so architecturally i would like that function not to evaluate inputs corresponding to indices, but just run a specific job.

finally, it will be important to understand how workflow splits are being handled, as all of that runs on the main thread and not in workers. so, given:

wf = Workflow("foo")
wf.add(FunctionTask(...))
wf.split(...)

then wf will make copies in the main thread, and any nested workflow with splits will also make copies in the main thread.

so this will require creating a concept diagram of how a workflow is executed and identifying ways to minimize memory usage on the main thread. perhaps understanding the async workflow execution is more important first than fixing the memory issue. it will then help to think about how best to handle the memory issue.

djarecka (Collaborator, Author) commented May 9, 2020

@satra - you are right that tasks within wf are referenced, and this was the reason why deepcopy had to be there, but now I'm saving the task in the pickle file, so the pickling library handles the copying (as we were discussing on Wednesday).
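
A small illustration of that point: a pickle round trip serializes nested objects by value, so the loaded object is fully independent of the original, much like a deepcopy:

import pickle

inner = {"inputs": [1, 2, 3]}
task = {"name": "n1", "nested": inner}

restored = pickle.loads(pickle.dumps(task))
restored["nested"]["inputs"].append(99)

print(inner["inputs"])               # [1, 2, 3] -- original untouched
print(restored["nested"]["inputs"])  # [1, 2, 3, 99]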

I'm not asking for the input in the worker part anymore, could you point me to the line?

I will think about your comments, but the first thing I have to fix is slurm; I made some changes (not committed), but it doesn't work...

Outdated review comments on pydra/engine/workers.py, pydra/engine/core.py, and pydra/engine/helpers.py were resolved.
djarecka added 3 commits May 31, 2020 11:39
…e task with full input; submitter passes the same pickle file, but different indices for tasks with states (so load_and_run would be able to set proper inputs)
Outdated review comments on pydra/engine/core.py and pydra/engine/helpers.py were resolved.
Comment on lines 518 to 519
task = load_task(task_pkl=task_pkl, ind=ind)
task._run(rerun=rerun, **kwargs)
satra (Contributor):

i would put this in a try/except clause, since things could fail for various reasons, and then update the result file appropriately whether things crashed or not.
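
A hedged sketch of that suggestion (assuming load_task can be imported from pydra.engine.helpers, where the resolved review threads place it; the crash-file name and error payload below are illustrative, not the code that was eventually committed):

import pickle
import traceback
from pathlib import Path

from pydra.engine.helpers import load_task  # helper used in the snippet above (assumed location)

def load_and_run(task_pkl, ind=None, rerun=False, **kwargs):
    task = load_task(task_pkl=task_pkl, ind=ind)
    try:
        task._run(rerun=rerun, **kwargs)
    except Exception:
        # record the failure next to the pickled task so the submitter can see
        # that the job crashed instead of just finding no result
        crash_file = Path(task_pkl).parent / "_error.pklz"  # illustrative name
        with open(crash_file, "wb") as fp:
            pickle.dump(traceback.format_exc(), fp)
        raise
    # per the later discussion, return a pointer to the pickled result
    # rather than the task object itself
    return task.output_dir / "_result.pklz"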

djarecka (Collaborator, Author):

ok

djarecka (Collaborator, Author):

which result file do you have in mind?

satra (Contributor):

the result file created by the task. in case the task crashes, a result file created by the code that traps that exception.

djarecka (Collaborator, Author):

so you want to write an error into _result.pklz? not sure if I follow

djarecka (Collaborator, Author):

i've added a simple crashfile


if not task.result():
    raise Exception("Something went wrong")
return task
satra (Contributor):

shouldn't this return a result pickle file, not task?

djarecka (Collaborator, Author):

it returns the task after running it. it's likely that it doesn't have to return anything

satra (Contributor):

still returning task though, not result. i still think this should return a pointer to a pickled result file. and the result file should have a result object that indicates that the task has crashed.

djarecka (Collaborator, Author):

i was just double-checking this; the returned object is not used anywhere, so I'm removing this for now. it really could be anything here...

djarecka (Collaborator, Author):

actually I could put the file name there, so it could be used when the function is called outside of a worker or the submitter

satra (Contributor):

i think it's ok to return the result file name. however, see the previous comment about improving what load_and_run traps.

An outdated review comment on pydra/engine/helpers.py was resolved.
djarecka (Collaborator, Author) commented Jun 2, 2020

@satra - let me know if I can merge this when tests pass

djarecka merged commit 63d2bd6 into nipype:master on Jun 2, 2020
djarecka deleted the mnt/memory_tojob branch on December 30, 2022