Results file failure with 1.3.0rc1 (MapNode) #3088
Comments
I checked. It didn't make a difference. |
that's weird. a few questions/comments:
|
|
i would expect it to as well. that's why it's weird what other thing is causing it not to. it's almost as if something in the crashfile generation function is crashing.
if you can start from scratch it would help. also if i can replicate this, i can try to see what else could be triggering this. |
Sorry, I was hoping to give instructions for how to set up an environment, but didn't finish. Will do that shortly. Here's a Circle test, which should be pretty reproducible. |
Set up a fresh
Install poldracklab/fitlins@c95eea3:
I'll assume you have datalad and some place you like to keep BIDS datasets:
datalad install -r -s ///labs/poldrack/ds003_fmriprep $BIDS/ds003_fmriprep
datalad get $BIDS/ds003_fmriprep/sub-0{1,2,3}/func/*_space-MNI152NLin2009cAsym_desc-*.nii.gz \
    $BIDS/ds003_fmriprep/sub-0{1,2,3}/func/*_desc-confounds_*.tsv \
    $BIDS/ds003_fmriprep/dataset_description.json \
    $BIDS/ds003_fmriprep/sub-*/*/*.json
Grab a model:
Reproduce:
|
Hi, I've seen this issue as well, and it does cause the execution to hang under multiproc and legacy multiproc. It seems that all of the missing files are related to a mapnode's "parent" node's result file. For example:
Catching the eventual FileNotFoundError at line 70 below will prevent the execution from hanging, but I'm not sure why the result file is missing in the first place. Maybe it is looking for it too soon, before all of the mapflows are finished up? nipype/nipype/pipeline/plugins/multiproc.py Lines 65 to 70 in 2125c0b
Edit: Possibly postpone the raising of this exception here (nipype/nipype/pipeline/engine/nodes.py Lines 1288 to 1300 in 1a86999).
Allow that to return, then check the retvals after saving the result file: nipype/nipype/pipeline/engine/nodes.py Lines 1368 to 1379 in 1a86999
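To illustrate the idea of catching the missing-file error inside the worker so the parent never waits on a job that died, here is a minimal standalone sketch. It is not nipype's code: `load_result` and `run_job` are hypothetical names, and the only assumption carried over from the comment above is that a worker may try to read a result file that does not exist.

```python
import traceback
from concurrent.futures import ProcessPoolExecutor


def load_result(path):
    """Hypothetical stand-in for reading a node's result file."""
    with open(path, "rb") as fh:
        return fh.read()


def run_job(path):
    """Worker body that never raises; it always returns a status dict."""
    try:
        return {"path": path, "result": load_result(path), "traceback": None}
    except FileNotFoundError:
        # Hand the failure back to the parent process instead of letting the
        # worker die, so the scheduler can mark the job as crashed and move on.
        return {"path": path, "result": None, "traceback": traceback.format_exc()}


if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=2) as pool:
        for status in pool.map(run_job, ["/no/such/dir/result_missing.pklz"]):
            state = "failed" if status["traceback"] else "ok"
            print(status["path"], state)
```

The design choice is that every job yields a status the parent can inspect, so a missing file becomes a reported crash rather than a job that never completes.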
|
@effigies: got a chance to try this out. i cannot repeat your error with the instructions above. the process completes and the three nodes have crashed. End of the error log:
what's weird to me is that i cannot see an unfinished.json file in either of these directories
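For anyone wanting to check their own working directory the same way, here is a small sketch that lists which directories hold a result file and which hold only an unfinished marker. The file name patterns `result_*.pklz` and `*_unfinished.json` are assumptions about how the files are named on disk, and `scan_work_dir` and `stalled_dirs` are hypothetical helpers, not part of nipype.

```python
from collections import defaultdict
from pathlib import Path


def scan_work_dir(work_dir):
    """Group result files and unfinished markers by their parent directory."""
    root = Path(work_dir)
    summary = defaultdict(lambda: {"results": [], "unfinished": []})
    # Assumed naming pattern for saved node results.
    for path in root.rglob("result_*.pklz"):
        summary[str(path.parent.relative_to(root))]["results"].append(path.name)
    # Assumed naming pattern for "node started but did not finish" markers.
    for path in root.rglob("*_unfinished.json"):
        summary[str(path.parent.relative_to(root))]["unfinished"].append(path.name)
    return dict(summary)


def stalled_dirs(summary):
    """Directories that have an unfinished marker but no result file."""
    return sorted(
        name
        for name, found in summary.items()
        if found["unfinished"] and not found["results"]
    )


if __name__ == "__main__":
    import sys

    for name in stalled_dirs(scan_work_dir(sys.argv[1])):
        print(name)
```

Run it with the workflow's base directory as the only argument; an empty listing means no directory matched the "unfinished without result" condition under these naming assumptions.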
|
@wtriplett - the missing result file is something i thought we had taken care of, and from my run it looks like it. |
@satra : here is a portion of my log in case that is helpful: https://gist.github.com/wtriplett/75fac896d7ba2ca200675119f1f3c04d I see this indefinitely at the end as the workflow
as the watcher thread loops endlessly, waiting for the jobs to clean up/finish. |
thanks @wtriplett - just to confirm - does this hang happen with nipype 1.4.0 on a fresh working dir? and if so, is it possible to create an example like @effigies ? i cannot repeat the issue currently with @effigies code anymore. |
@satra: This seems to reproduce the issue with 1.4.0:
crash_example.py: https://gist.github.com/wtriplett/d216a4e9aad146bd0f3b272e22a128a3 I've attached the output of linear and multiproc, where multiproc was run under the |
@wtriplett - i think this PR #3143 should fix the problem. thank you very much for your example. i reduced it down to a single MapNode workflow with a Function interface. |
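For readers who have not seen that shape of workflow, a single MapNode wrapping a Function interface might look roughly like the sketch below. This is not the reduced example from the PR or the gist: `fail_on_two`, `build_workflow`, and the node and workflow names are made up for illustration, and one iteration is made to fail on purpose.

```python
def fail_on_two(x):
    """Toy function: raise for one input so a single MapNode iteration crashes."""
    if x == 2:
        raise ValueError("simulated failure for x == 2")
    return x * 10


def build_workflow(base_dir):
    """Build a one-node workflow (hypothetical names throughout)."""
    # Imported here so fail_on_two can be exercised without nipype installed.
    from nipype import Function, MapNode, Workflow

    node = MapNode(
        Function(input_names=["x"], output_names=["out"], function=fail_on_two),
        iterfield=["x"],
        name="mapper",
    )
    node.inputs.x = [1, 2, 3]  # one sub-node per element

    wf = Workflow(name="mapnode_crash", base_dir=base_dir)
    wf.add_nodes([node])
    return wf


if __name__ == "__main__":
    import tempfile

    wf = build_workflow(tempfile.mkdtemp())
    # The second iteration raises, so the run is expected to end with an error;
    # the question in this thread is whether it ends at all under MultiProc.
    wf.run(plugin="MultiProc", plugin_args={"n_procs": 2})
```

Swapping `plugin="MultiProc"` for `plugin="Linear"` gives the serial run to compare against.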
NP let me know if I can help further. -B |
@wtriplett - can you run your code using that branch? |
@satra : sure... I ran my original workflow with Then ran with Thanks! |
thanks @wtriplett - i'll let @effigies run his workflow, and if all goes well, we can merge it in. and then patch the 1.3.x series. |
Summary
An example of the results file issue, again involving MapNodes. Using FitLins installed in a conda environment with poldracklab/fitlins#195. It also causes the whole workflow to hang.