[FIX] immunize shutil.rmtree to node non-existence for remove_node_directories=True in the case that stop_on_first_crash=False #3148
Conversation
;-)
Thanks. Do you think you could put together a small regression test?
Codecov Report
@@ Coverage Diff @@
## maint/1.4.x #3148 +/- ##
==============================================
+ Coverage 67.59% 67.6% +<.01%
==============================================
Files 299 299
Lines 39499 39499
Branches 5220 5220
==============================================
+ Hits 26700 26703 +3
+ Misses 12086 12081 -5
- Partials 713 715 +2
Continue to review full report at Codecov.
@effigies, not really sure what you have in mind RE: a regression test, but let me know if this is approaching it:
Yup, something of the sort. You'll want to make sure that the test fails before your fix and passes after it. You can use the …
With the exception of the regression test, the update above should be a bit closer to what we want. Not sure how you want to approach the regression exactly, i.e. across previous nipype versions, by overriding the module with a parameterized fixture for rmtree with the additional flag added?
No need to parameterize. Just test on master and on this branch. It should fail and pass, respectively.
[FIX] immunize shutil.rmtree to node non-existence for remove_node_directories=True in the case that stop_on_first_crash=False * Add regression test
Removed the parameterization. Tested on maint/1.4.x and it passes. It does not yet fail on master as expected, so we may still need to tweak the test to add more nodes to the workflow (i.e. nodes after the first node that crashes, so that the workflow continues to run and attempts node-directory removal for directories that were never created)...
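For reference, a minimal sketch of the kind of regression test being discussed is below. The function, node, and test names are illustrative assumptions (not the PR's actual test code), and the exception type caught at the end assumes nipype's usual "Workflow did not execute cleanly" RuntimeError:

```python
import nipype.pipeline.engine as pe
import nipype.interfaces.utility as niu


def crasher(arg1):
    # One input deliberately fails, so the downstream node's working
    # directory is never created.
    if arg1 == 2:
        raise Exception('arg cannot be ' + str(arg1))
    return arg1


def test_rmtree_tolerates_missing_node_dirs(tmp_path):
    wf = pe.Workflow(name='testwf', base_dir=str(tmp_path))
    node1 = pe.Node(niu.Function(function=crasher), name='node1')
    node1.inputs.arg1 = 2
    node2 = pe.Node(niu.Function(function=crasher), name='node2')
    wf.connect(node1, 'out', node2, 'arg1')
    # Keep running past the crash, and ask nipype to clean node dirs.
    wf.config['execution'] = {
        'stop_on_first_crash': 'false',
        'remove_node_directories': 'true',
    }
    # The node crash itself is expected; before the fix, cleanup of a
    # never-created directory could raise on top of it.
    try:
        wf.run(plugin='MultiProc')
    except RuntimeError:
        pass
```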
Perhaps you could explain more how you were running into this issue? Were you running multiple workflows in parallel?
Hmm, it occurred repeatedly when running a single workflow (with nested workflows) using MultiProc, and adding ignore_errors=True completely eliminated the issue. It's hard to say precisely what the failure context is, but the fix was surefire.
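For context, the change being described is a one-argument tweak to the cleanup call. A sketch, with `node_outdir` as a hypothetical stand-in for the node working directory nipype computes:

```python
import shutil

# Hypothetical stand-in for the node's working directory.
node_outdir = '/tmp/workflow/node1'

# Default behavior: raises FileNotFoundError if the directory was never
# created (e.g., the node crashed before running, or another worker
# already removed it).
#     shutil.rmtree(node_outdir)

# The fix being described: tolerate a missing directory.
shutil.rmtree(node_outdir, ignore_errors=True)
```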
Could this have something to do with the MapNode fix from yesterday?
I tried this test on 1.4.0, and it still passes. So whatever it's testing, it's not the error that was being hit. Checking the coverage, however, it does at least hit the …
Some notes on the test. Also, just so you know, we now use the black styler. If you install pre-commit and run it, it will run black for you when you try to make a commit.
    try:
        if arg1 == 2:
            raise Exception('arg cannot be ' + str(arg1))
    except:
        pass
Is the goal to fail or not? If not, we can skip the entire try/except block. If so, we should not catch the exception.
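To spell out the reviewer's two options, a sketch (the function names here are illustrative):

```python
# Option 1: the helper is not meant to fail; drop the try/except.
def func_ok(arg1):
    return arg1


# Option 2: the helper is meant to crash its node; let the exception
# propagate so the workflow actually records a crash.
def func_crashes(arg1):
    if arg1 == 2:
        raise Exception('arg cannot be ' + str(arg1))
    return arg1
```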
    return arg1

funkynode = pe.MapNode(niu.Function(function=func, input_names=['arg1'],
                                    output_names=['out']),
This is redundant; the names can be inferred from the function:
funkynode = pe.MapNode(niu.Function(function=func),
Also, is there something intrinsic about the problem and MapNodes? If not, then perhaps just make it a Node, to keep the scope as small as possible.
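Taken together, the suggestion reduces to something like the sketch below; nipype's Function interface infers input names from the function's signature, and output_names defaults to ['out'], so neither needs to be spelled out:

```python
import nipype.pipeline.engine as pe
import nipype.interfaces.utility as niu


def func(arg1):
    return arg1


# input_names is inferred from func's signature and output_names
# defaults to ['out']; a plain Node (rather than a MapNode) keeps the
# test's scope as small as possible.
funkynode = pe.Node(niu.Function(function=func), name='funkynode')
```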
Co-Authored-By: Chris Markiewicz <effigies@gmail.com>
It does feel very much like a race condition.
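One way such a race can arise, sketched below: two workers (or a worker and a cleanup pass) target the same node directory, the loser finds it already gone, and a bare shutil.rmtree raises where ignore_errors=True would not. This only illustrates the failure mode; it is not the PR's confirmed diagnosis:

```python
import os
import shutil
import tempfile
import threading

workdir = tempfile.mkdtemp()


def cleanup(path):
    # With the default ignore_errors=False, whichever thread loses this
    # race raises FileNotFoundError; with True, removal is idempotent.
    shutil.rmtree(path, ignore_errors=True)


threads = [threading.Thread(target=cleanup, args=(workdir,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert not os.path.exists(workdir)
```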
@effigies, I'm going to take another look at this on Friday to see if I can't pinpoint exactly what's going on. Things go a little 'nuts' when the iterable expansion starts hitting tens of thousands of threads... Stay tuned.
Let's rebase and reopen if this becomes an issue again.
@effigies, since it seems you're still actively working on the 1.4.x merge, I went ahead and opened a fresh PR directly onto that branch, as rebasing was causing issues...