[MAINT] Cleaning / simplify Node #2325

Merged: 38 commits merged into nipy:master on Jan 8, 2018

Conversation

@oesteban (Contributor) commented Dec 1, 2017

This PR uses some code I had under the hood before 0.14.0 that simplifies Node. These are the changes:

  • Move nipype.pipeline.engine.utils.make_output_dir to nipype.utils.filemanip.makedirs and change the function signature to be consistent with os.makedirs from Python >= 3.3. The behavior is barely modified: with exist_ok=False it falls back to the standard os.makedirs, and otherwise it runs the former nipype code (see the sketch after this list).
  • Cache the output directory on the first call to output_dir().
  • Simplify hash_exists - it should implement the same behavior in a hopefully more reliable way (fewer branches).
  • Simplify Node.run - again, reducing code branches.
  • Simplify Node._run_interface, moving the chdirs to Node.run. This seems to fix [BUG] FileNotFoundError: [Errno 2] No such file or directory nipreps/fmriprep#868, although I haven't checked deeply enough.
  • PEP8 fixes, tidied-up imports, etc.

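As a reference for reviewers, here is a minimal sketch of what such a wrapper can look like (an illustration under the assumptions above, not the exact code in this PR):

import os
import os.path as op


def makedirs(path, exist_ok=False):
    """Create path; with exist_ok=True, tolerate a pre-existing directory."""
    if not exist_ok:
        # Behave exactly like the standard library call.
        os.makedirs(path)
        return path

    # exist_ok=True: quietly accept a directory that already exists,
    # e.g. one created meanwhile by a concurrent process.
    if not op.isdir(path):
        try:
            os.makedirs(path)
        except OSError:
            if not op.isdir(path):
                raise
    return path
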
@oesteban oesteban changed the title [MAINT] Cleaning / simplify Node [MAINT/WIP] Cleaning / simplify Node Dec 2, 2017
result, _, _ = self._load_resultfile(cwd)
return result
# Cache first
if not self._result:

Member:
We don't want to do this, I think; it will increase memory consumption. We want results to be loaded on the fly.

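For context, a minimal sketch of the load-on-the-fly behaviour being asked for (an illustration only; _load_resultfile and output_dir are assumed to behave as in the quoted diff):

class Node(object):
    # ...the rest of the Node machinery is elided...

    @property
    def result(self):
        # Re-read the result file on every access instead of keeping
        # a loaded copy in memory on the node.
        return self._load_resultfile(self.output_dir())[0]
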
Contributor Author:
Okay, I'll see how to improve the process

Contributor Author:
Done, no self._result anymore.

@@ -189,12 +188,11 @@ def interface(self):

@property
def result(self):
if self._result:
return self._result

Member:
I think this was legacy from before. We should check when self._result is actually not None.

for hf in hashfiles:
os.remove(hf)

if updatehash and len(hashfiles) == 1:

Member:
I don't think the second clause or the os.remove statement is necessary.

The intent of updatehash==True was to always update the hash and return True, not just when there was only one hashfile. It was intended to be used cautiously.

return path


def emptydirs(path):

Member:
docstring

@oesteban oesteban requested review from djarecka and mgxd December 5, 2017 07:42
@oesteban oesteban changed the title [MAINT/WIP] Cleaning / simplify Node [MAINT] Cleaning / simplify Node Dec 5, 2017

@oesteban (Contributor Author) commented Dec 5, 2017

@djarecka, @mgxd, another maintenance PR for you to review, sorry about that. I don't have more of these for now... if that is of any help. The idea of these two refactorings is to get things ready for the code rodeo and to scrub some rust off.

Here I'm changing some of the functionality of Node, so this PR requires deeper scrutiny from you.

@oesteban (Contributor Author) commented Dec 7, 2017

if op.exists(outdir):
# Find previous hashfiles
hashfiles = glob(op.join(outdir, '_0x*.json'))
if len(hashfiles) > 1: # Remove hashfiles if more than one found

Member:
or (hashfiles and hashfiles[0] != hashfile)

This would ensure that two hashfiles are never left in the directory.

Also, we should check for any unfinished hashfiles in the directory and raise an exception, i.e. tell the user that you can't update a hashfile in an unfinished-hashfile situation.
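
A hedged sketch of how that check could read (the function name is made up; outdir and hashfile are the variables from the quoted diff):

import os
import os.path as op
from glob import glob


def prune_stale_hashfiles(outdir, hashfile):
    # Remove hashfiles that cannot match the current inputs, so that at
    # most one hashfile (the right one) is ever left in the directory.
    hashfiles = glob(op.join(outdir, '_0x*.json'))
    if len(hashfiles) > 1 or (hashfiles and hashfiles[0] != hashfile):
        for hf in hashfiles:
            os.remove(hf)
        hashfiles = []
    return hashfiles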

Contributor Author:
Regarding the second idea: are you sure we want to error? Don't we want to just dismiss any viable hashfile and clear up the folder (if an unfinished hashfile was found)?

I'm pushing a commit for this, with the error. Let me know if we would rather just clear up the folder and return no hashfiles.

Contributor Author:
BTW @satra, what do you think about passing the responsibilities of hashing/caching on to the interface?

The run() of the interface would have a force_run argument (so it can be done dynamically) and an updatehash argument to also support that option.

It'd be great to have the interfaces cache themselves.

@satra (Member) commented Dec 7, 2017

@oesteban - passing hashing to interfaces is part of a bigger discussion about merging node and interface into a common base class. Let's leave that out of this PR and talk about it separately. @djarecka has finally gotten back to the engine revision; let's review the base-class question after that's completed.

The dichotomy for caching has often been about whether we should be doing the caching or users should. I think making memcache the default way interfaces run would be a good thing, but that's a drastic change, so let's leave that for 2.0.

@oesteban (Contributor Author) commented Dec 7, 2017

Oh, sure thing. I wasn't proposing that for this PR :)

Have you thought about whether to raise an error or just go on when an unfinished hashfile is found in the output directory (when it shouldn't be there)?

except OSError:
# Changing back to cwd is probably not necessary
# but this makes sure there's somewhere to change to.
cwd = os.path.split(outdir)[0]

Collaborator:
you can unify and use op everywhere

@@ -304,143 +354,109 @@ def run(self, updatehash=False):
updatehash: boolean
Update the hash stored in the output directory

Collaborator:
I know you didn't change this, but I'm really missing a longer explanation of updatehash, or suggestions on when users should use it.

Contributor Author:
If @satra agrees, we can remove it. I think it is an ancient feature.

The idea is that, when you run with updatehash=True, even if the hash of some input changed, the interface will not be run again and the old output will be reused.

For instance, say you forgot to mark some of your inputs with nohash=True (e.g., the number of threads), but you don't want to rerun interfaces that are already cached. Then you would change the number of threads and run with updatehash=True. When nipype finds interfaces with this problem already cached, they are not run again and the hash of this input gets updated.

I'm not sure the example is a real use case, but that's how I understand the feature.
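
For what it's worth, a toy model of that behaviour (hypothetical helper, not the actual Node.run code):

import json
import os


def maybe_run(outdir, current_hash, run_interface, updatehash=False):
    # Toy model: decide whether to re-run based on the cached hashfile.
    hashfile = os.path.join(outdir, '_0x%s.json' % current_hash)
    if os.path.exists(hashfile):
        return 'cached'  # up-to-date cache: nothing to do

    stale = [f for f in os.listdir(outdir) if f.startswith('_0x')]
    if stale and updatehash:
        # Inputs changed (e.g. a forgotten nohash=True field), but we trust
        # the old outputs: rewrite the hashfile instead of re-running.
        for f in stale:
            os.remove(os.path.join(outdir, f))
        with open(hashfile, 'w') as fp:
            json.dump({'hash': current_hash}, fp)
        return 'hash updated, outputs reused'

    run_interface()  # normal path: (re)run the interface
    with open(hashfile, 'w') as fp:
        json.dump({'hash': current_hash}, fp)
    return 'run'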

Member:
It's true that this is rarely used currently, but when people were using interactive workflows and moving the working directory around, this became a very useful thing. Examples are:

  1. You run a workflow and move the working directory from one filesystem to another; all the hashes will change.
  2. You go into the working directory to debug something and mess things up; you can quickly rerun with updatehash to update your cache.

Collaborator (@djarecka, Jan 4, 2018):
Thanks for the explanations! If we decide to leave it, can we extend the docstring?

if not force_run and str2bool(self.config['execution']['stop_on_first_rerun']):
raise Exception('Cannot rerun when "stop_on_first_rerun" is set to True')

# Hashfile while running, remove if exists already

Collaborator:
This is probably more about the comment placement: I understand that you don't want to remove the unfinished hashfile here?

Contributor Author:
I'll check.

@@ -1156,8 +1037,9 @@ def _make_nodes(self, cwd=None):
base_dir=op.join(cwd, 'mapflow'),
name=nodename)
node.plugin_args = self.plugin_args
node._interface.inputs.trait_set(
node.interface.inputs.trait_set(

Collaborator:
Is this change necessary? I just want to understand.

Contributor Author:
In general, it is preferable to access public members rather than private members (generally marked with a leading underscore) from outside the object.
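
To illustrate (a simplified sketch, not the actual Node class):

class Node(object):
    def __init__(self, interface):
        self._interface = interface  # private storage

    @property
    def interface(self):
        # Public, read-only access to the wrapped interface.
        return self._interface


# From outside the object (e.g. a MapNode building its sub-nodes),
# prefer the public name:
node = Node(interface=None)
assert node.interface is node._interface  # same object, but only the former is public API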

Collaborator:
ok, yes, it is for node, not for self...

# Values in common keys would differ quite often,
# so we need to join the messages together
for k in new_keys.intersection(old_keys):
same = False

Collaborator:
not needed

Contributor Author:
Are you sure? How would you modify this?

Collaborator:
It's too late in Europe to be sure about anything, but regardless of what happens in this loop, same is defined/changed either in the try part or in the except part, so it looked to me that it doesn't have to be defined before the try/except.
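
A minimal, hypothetical illustration of the point (made-up dictionaries, not the actual loop):

new_inputs = {'a': 1}
old_inputs = {'a': 2}

for k in set(new_inputs) & set(old_inputs):
    try:
        same = new_inputs[k] == old_inputs[k]
    except Exception:
        same = False
    # `same` is assigned on both paths above, so an extra `same = False`
    # before the try/except would be redundant.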

Contributor Author:
Added to #2320. Since this PR is changing a lot already, I'd leave this catch for a new one.

Member:
@djarecka I agree that you can remove this one line. same is defined before being accessed in all branches here.

(This comment in response to #2387.)

@djarecka (Collaborator) commented Jan 4, 2018

@oesteban - thanks for this PR! I've tried to review the changes again today and just had some small questions/comments.
After this is merged I'll check which points from #2320 are still relevant.

@@ -904,7 +1142,7 @@ def _standardize_iterables(node):
fields = set(node.inputs.copyable_trait_names())
# Flag indicating whether the iterables are in the alternate
# synchronize form and are not converted to a standard format.
synchronize = False
# synchronize = False # OE: commented out since it is not used

Collaborator:
Are these comments valid for node.synchronize instead of synchronize? It would be good to clean this up so it's clear.

Contributor Author:
I didn't want to remove it since the comment above seemed relevant. But I agree that once we clarify the intent of this synchronize, we should clean up these comments.

Collaborator:
That was my guess, that these comments are also relevant for node.synchronize, but I wasn't entirely sure.

Contributor Author:
Added to #2320

Member:
This line was added in 2a6bb8f, and was unused then. @djarecka I think it's safe to remove this line in #2387.

Collaborator:
@effigies my concerns were more about the comments explaining synchronize. Do they still apply to node.synchronize after all the changes? I don't fully understand the comments and have never used it, so if you can confirm (or suggest changes) that would be great.

Member:
Oh, sorry. It does read to me like there was an intent to switch the synchronize behavior with this flag, and that the two lines above only apply to that case. I think I would remove them, because the comment below seems to accurately describe the operation.

@mgxd (Member) left a comment:
Thanks for the cleanup - at first glance the functionality should remain the same - LGTM.

return op.abspath(op.join(outputdir, self.name))

self._output_dir = op.abspath(op.join(outputdir, self.name))
return self._output_dir

def set_input(self, parameter, val):
""" Set interface input value"""

Member:
excess space

if result and result.outputs:
val = getattr(result.outputs, parameter)
return val
return getattr(self.result.outputs, parameter, None)

def help(self):
""" Print interface help"""

Member:
Space


def _copyfiles_to_wd(self, outdir, execute, linksonly=False):
def _copyfiles_to_wd(self, execute=True, linksonly=False):
""" copy files over and change the inputs"""

Member:
space

**deepcopy(self._interface.inputs.get()))
node.interface.resource_monitor = self._interface.resource_monitor

Member:
self.interface.resource_monitor

Contributor Author:
I would opt for self._interface. I've made sure no uses of self.interface remain.

from copy import deepcopy
from glob import glob
from distutils.version import LooseVersion

Member:
from ... import LooseVersion



def write_report(node, report_type=None, is_mapnode=False):
"""

Member:
Can just make this one line if not expanding

if level > len(colorset) - 2:
level = 3 # Loop back to blue
level = 3 # Loop back to blue

Member:
extra space?

Contributor Author:
Nope, PEP 8 advises having at least two spaces before an inline comment.
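
For reference, the PEP 8 convention in question:

x = 1
x = x + 1  # inline comment, separated from the code by at least two spaces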

outdir = op.join(outdir, '_tempinput')
makedirs(outdir, exist_ok=True)

for info in self.interface._get_filecopy_info():

Member:
self._interface and self.interface should be equal - but as an added layer of security wouldn't it be better to keep the usage consistent?

@oesteban oesteban added this to the 0.14.1 milestone Jan 7, 2018

@oesteban (Contributor Author) commented Jan 7, 2018

I'm seeing this from time to time:

File "/usr/local/miniconda/lib/python3.6/multiprocessing/forkserver.py", line 178, in main
_serve_one(s, listener, alive_r, handler)
File "/usr/local/miniconda/lib/python3.6/multiprocessing/forkserver.py", line 212, in _serve_one
code = spawn._main(child_r)
File "/usr/local/miniconda/lib/python3.6/multiprocessing/spawn.py", line 115, in _main
self = reduction.pickle.load(from_parent)
File "/usr/local/miniconda/lib/python3.6/site-packages/niworkflows/nipype/__init__.py", line 13, in <module>
from .utils.config import NipypeConfig
File "/usr/local/miniconda/lib/python3.6/site-packages/niworkflows/nipype/utils/__init__.py", line 4, in <module>
from .config import NUMPY_MMAP
File "/usr/local/miniconda/lib/python3.6/site-packages/niworkflows/nipype/utils/config.py", line 82, in <module>
""" % (homedir, os.getcwd())
FileNotFoundError: [Errno 2] No such file or directory

I'll let you know when I find out why this is happening.

@djarecka mentioned this pull request Jan 8, 2018

@oesteban (Contributor Author) commented Jan 8, 2018

Alright, this is looking good. The FileNotFoundError seems to be under control with the new code; I'll keep an eye on these kinds of errors. Unless you want me to go for one extra round of checking, this is ready to merge.

@mgxd mgxd merged commit 9be816e into nipy:master Jan 8, 2018
@oesteban oesteban deleted the ref/Node-cleanup branch January 9, 2018 06:00
oesteban added a commit to nipreps/niworkflows that referenced this pull request Jan 9, 2018

@mgxd (Member) commented Jan 10, 2018

@oesteban - I ran into this while using this PR:

--- Logging error ---
Traceback (most recent call last):
  File "/opt/conda/envs/neuro/lib/python3.6/logging/__init__.py", line 992, in emit
    msg = self.format(record)
  File "/opt/conda/envs/neuro/lib/python3.6/logging/__init__.py", line 838, in format
    return fmt.format(record)
  File "/opt/conda/envs/neuro/lib/python3.6/logging/__init__.py", line 575, in format
    record.message = record.getMessage()
  File "/opt/conda/envs/neuro/lib/python3.6/logging/__init__.py", line 338, in getMessage
    msg = msg % self.args
TypeError: not enough arguments for format string
Call stack:
  File "/opt/conda/envs/neuro/bin/mindboggle123", line 315, in <module>
    mbFlow.run(plugin=args.plugin)
  File "/opt/conda/envs/neuro/lib/python3.6/site-packages/nipype/pipeline/engine/workflows.py", line 574, in run
    runner.run(execgraph, updatehash=updatehash, config=self.config)
  File "/opt/conda/envs/neuro/lib/python3.6/site-packages/nipype/pipeline/plugins/linear.py", line 43, in run
    node.run(updatehash=updatehash)
  File "/opt/conda/envs/neuro/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 443, in run
    result = self._run_interface(execute=True)
  File "/opt/conda/envs/neuro/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 520, in _run_interface
    return self._run_command(execute)
  File "/opt/conda/envs/neuro/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 594, in _run_command
    self._interface.__class__.__name__)
Message: '[Node] Running "%s" ("%s.%s"), a CommandLine Interface with command:\nantsCorticalThickness.sh -a /datastore/sub-BANDA001/anat/sub-BANDA001_T1w.nii.gz -m /output/work/Mindboggle123/antsCorticalThickness/T_template0_BrainCerebellumProbabilityMask.nii.gz -e /opt/data/OASIS-30_Atropos_template/T_template0.nii.gz -d 3 -f /opt/data/OASIS-30_Atropos_template/T_template0_BrainCerebellumExtractionMask.nii.gz -s nii.gz -o /output/ants_subjects/sub-BANDA001/ants -p nipype_priors/BrainSegmentationPrior%02d.nii.gz -t /opt/data/OASIS-30_Atropos_template/T_template0_BrainCerebellum.nii.gz -j 1'
Arguments: ('antsCorticalThickness', 'nipype.interfaces.ants.segmentation', 'antsCorticalThickness')

@mgxd (Member) commented Jan 10, 2018

Perhaps we should move towards .format, since any command containing % will raise this error.
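
A sketch of the failure mode and of the suggested direction (reconstructed from the traceback above; names and strings are illustrative):

import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger('nipype.workflow')

name = 'antsCorticalThickness'
cmdline = 'antsCorticalThickness.sh ... -p priors/BrainSegmentationPrior%02d.nii.gz ...'

# Failure mode: the command line is concatenated into the lazy %-style
# format string, so its literal %02d becomes an extra placeholder and
# logging prints a "--- Logging error ---" block instead of the message.
logger.info('[Node] Running "%s" ("%s.%s"), with command:\n' + cmdline,
            name, 'nipype.interfaces.ants.segmentation', name)

# Suggested direction: build the whole message up front (str.format), so
# stray % characters in the command are never treated as placeholders.
logger.info('[Node] Running "{}" with command:\n{}'.format(name, cmdline))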

Successfully merging this pull request may close these issues:

[BUG] FileNotFoundError: [Errno 2] No such file or directory