-
Notifications
You must be signed in to change notification settings - Fork 533
[ENH] Optimize workflow.run performance #3260
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ENH] Optimize workflow.run performance #3260
Conversation
- Traverse nested workflows in a loop - Avoid constructing the entire workflow.inputs or workflow.outputs data structure
Codecov Report
@@ Coverage Diff @@
## master #3260 +/- ##
==========================================
+ Coverage 64.23% 64.62% +0.39%
==========================================
Files 300 302 +2
Lines 39884 39824 -60
Branches 5276 5279 +3
==========================================
+ Hits 25618 25735 +117
+ Misses 13210 12995 -215
- Partials 1056 1094 +38
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
Thanks for this, that looks like a great improvement! I've read your solution and my only suggestions would be aesthetic ones, but I realized that we probably already have a function to fetch a particular node, and found nipype/nipype/pipeline/engine/workflows.py Lines 369 to 383 in e9217c2
I think we might be able to replace all calls to |
That's a really good idea :-) |
Test failures appear to be #3261. How does the profiling look? Do we need to clean up |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using fMRIPrep with the latest commit, this gets a ~100x speedup for _generate_flatgraph()
.
LGTM, though let me know if there's anything else you want to include before merge.
thank you @HippocampusGirl - much appreciated. |
Thank you for benchmarking that @effigies! No, I don't have anything to add |
Summary
A few months back, I submitted pull request #3184 to improve the performance of using
connect
when creating large workflows. Specifically, I had discovered that the use of theinputs
oroutputs
properties of workflows can create a performance bottleneck if there are many child nodes or nested workflows.Recently, I noticed that the same bottleneck can cause a delay between calling
workflow.run()
and the start of the actual execution, meaning when nodes and interfaces start to run.Running cProfile suggests that the delay occurs in

_create_flat_graph
. Note that the profile does not include the full workflow execution, but was cancelled immediately when the first node started to run.As far as I can tell, before execution starts, nested workflows are merged into one overall workflow using
_create_flat_graph
. To resolve the final connections between nodes in this merged workflow,_create_flat_graph
calls_get_parameter_node
for each input from or output to a nested workflow, and then modifies the connection information accordingly.nipype/nipype/pipeline/engine/workflows.py
Lines 975 to 979 in e9217c2
As a result, for each connection to/from a nested workflow,
_get_parameter_node
constructs the entireinputs
oroutputs
data structure of the nested workflow, and then uses it to resolve the correct connection information. Just as for #3184, constructing this entire data structure over and over again for each connection can reduce performance.List of changes proposed in this PR (pull-request)
Instead of generating the full
inputs
oroutputs
data structure, I propose that the_get_parameter_node
function should traverse the individual workflow graphs until it finds the target node (or not).I have created a quick implementation that leads to a significant speedup. This implementation is a slightly modified copy of the code from #3184.

I hope that this code will be useful for the nipype community.
Acknowledgment