Skip to content

FIX: DataSink to S3 buckets #3130

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 18, 2020
Merged

FIX: DataSink to S3 buckets #3130

merged 2 commits into from
Jan 18, 2020

Conversation

kimsin98
Copy link
Contributor

@kimsin98 kimsin98 commented Jan 2, 2020

Due to how pathlib.Path handles path strings, S3 bucket paths are missing a /.

Summary

Fixes #3105.

List of changes proposed in this PR (pull-request)

  • Changes DataSink input specs for output paths to be Str rather than Directory.

Acknowledgment

  • (Mandatory) I acknowledge that this contribution will be available under the Apache 2 license.

@effigies
Copy link
Member

effigies commented Jan 6, 2020

I agree with @oesteban that I think converting base_directory to Str is probably more correct in terms of keeping the inputs consistent with what is passed by the user. On the other hand, this also doesn't seem egregious to me, as s3:/ is not really going to happen when s3:// is not intended. So I'm fine with either approach, myself.

In any event, could you rebase onto maint/1.4.x and update the base branch? As this is a bug fix, we can aim for the 1.4.1 release.

Screen Shot 2020-01-06 at 13 03 09

@effigies effigies added this to the 1.4.1 milestone Jan 6, 2020
@kimsin98 kimsin98 changed the base branch from master to maint/1.4.x January 7, 2020 02:35
@kimsin98 kimsin98 changed the base branch from maint/1.4.x to master January 7, 2020 02:35
Due to how pathlib.Path handles path strings, S3 bucket paths are missing a /.
@codecov
Copy link

codecov bot commented Jan 7, 2020

Codecov Report

Merging #3130 into maint/1.4.x will increase coverage by 0.12%.
The diff coverage is 57.14%.

Impacted file tree graph

@@               Coverage Diff               @@
##           maint/1.4.x    #3130      +/-   ##
===============================================
+ Coverage        67.59%   67.72%   +0.12%     
===============================================
  Files              299      299              
  Lines            39499    39834     +335     
  Branches          5220     5364     +144     
===============================================
+ Hits             26700    26976     +276     
- Misses           12086    12125      +39     
- Partials           713      733      +20
Flag Coverage Δ
#smoketests 53.04% <57.14%> (ø) ⬆️
#unittests 65.01% <42.85%> (+0.17%) ⬆️
Impacted Files Coverage Δ
nipype/interfaces/io.py 59.82% <57.14%> (+0.15%) ⬆️
nipype/interfaces/dcm2nii.py 49.76% <0%> (-8.93%) ⬇️
nipype/interfaces/base/core.py 87.73% <0%> (+0.76%) ⬆️
nipype/pipeline/plugins/tools.py 80.72% <0%> (+1.77%) ⬆️
nipype/pipeline/plugins/legacymultiproc.py 64.98% <0%> (+2.16%) ⬆️
nipype/pipeline/plugins/multiproc.py 85.96% <0%> (+4.71%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7dd3b37...847553e. Read the comment docs.

@kimsin98 kimsin98 changed the base branch from master to maint/1.4.x January 7, 2020 03:19
Copy link
Member

@effigies effigies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm good with this. @oesteban, @satra How do you feel about this change?

@kimsin98
Copy link
Contributor Author

As all the reviewers noted, there seems to be little to no advantage of keeping base_directory to be a Directory. If anything goes wrong, the problem is probably in the user's inputs, and keeping them as Str would make the problem more transparent.

Changed the DataSink input spec instead.

Copy link
Member

@effigies effigies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. A quick question and a suggestion.

container = Str(desc="Folder within base directory in which to store output")
parameterization = traits.Bool(
True, usedefault=True, desc="store output in parametrized structure"
)
strip_dir = Directory(desc="path to strip out of filename")
strip_dir = Str(desc="path to strip out of filename")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the effect of this change? Was something failing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I don't think this will make any difference in proper use cases.

But since we are setting base_directory trait to Str to be faithful to the user's inputs, it would not make sense to disable input processing on one path input but not the other.

@effigies effigies merged commit 3468d9f into nipy:maint/1.4.x Jan 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants