Use file instead of new File #245

Merged
apeltzer merged 2 commits into dev from olgabot/pipeline_report_file
Jun 29, 2019
Conversation

olgabot
Contributor

@olgabot olgabot commented Jun 27, 2019

PR checklist

  • This comment contains a description of changes (with reason)
  • If you've fixed a bug or added code that should be tested, add tests!
  • If necessary, also make a PR on the nf-core/rnaseq branch on the nf-core/test-datasets repo
  • Ensure the test suite passes (nextflow run . -profile test,docker).
  • Make sure your code lints (nf-core lint .).
  • Documentation in docs is updated
  • CHANGELOG.md is updated
  • README.md is updated

Learn more about contributing: https://github.com/nf-core/rnaseq/tree/master/.github/CONTRIBUTING.md

As I learned in nextflow-io/nextflow#1185, `file` != `new File`, and `new File` doesn't know how to handle S3 paths. This leads to weird behavior when a pipeline is run, like creating a local `s3:` folder containing all the bucket "subfolders":
```
 Thu 27 Jun - 09:03  ~/code/nf-core/rnaseq   origin ☊ olgabot/salmon-gencode ✔ 28☀ 
  ll --tree s3:
Permissions Size User    Date Modified Git Name
drwxr-xr-x     - olgabot 11 Jun 10:26   -- s3:
drwxr-xr-x     - olgabot 11 Jun 10:26   -- └── olgabot-maca
drwxr-xr-x     - olgabot 11 Jun 10:26   --    └── mini-maca
drwxr-xr-x     - olgabot 11 Jun 10:26   --       └── results
drwxr-xr-x     - olgabot 11 Jun 10:26   --          └── pipeline_info
.rw-r--r--   12k olgabot 11 Jun 16:40   --             ├── pipeline_report.html
.rw-r--r--  2.7k olgabot 11 Jun 16:40   --             └── pipeline_report.txt
```
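
The behavior above can be reproduced outside Nextflow, since `new File` is plain `java.io.File`. The following is a minimal sketch, assuming a Unix-like system; the class and method names (`S3PathAsLocalFile`, `localView`, `isAbsoluteLocally`) are made up for illustration and are not part of the pipeline or of Nextflow. It only shows how `java.io.File` interprets an S3 URI as a relative local pathname and collapses the double slash.

```java
import java.io.File;

final class S3PathAsLocalFile {
    // Returns the pathname string that java.io.File stores for the given input.
    // java.io.File has no notion of URI schemes, so "s3:" becomes a directory name.
    static String localView(String uri) {
        return new File(uri).getPath();
    }

    // True if java.io.File regards the input as an absolute local path.
    static boolean isAbsoluteLocally(String uri) {
        return new File(uri).isAbsolute();
    }

    public static void main(String[] args) {
        // Path taken from the directory listing above.
        String uri = "s3://olgabot-maca/mini-maca/results/pipeline_info/pipeline_report.html";
        System.out.println(localView(uri));          // repeated slashes are collapsed
        System.out.println(isAbsoluteLocally(uri));  // relative to the working directory
    }
}
```

On a Unix-like system the first line printed should start with `s3:/` rather than `s3://`, and the path is reported as relative, which is why writing to it creates an `s3:` folder under the launch directory. Nextflow's `file` helper, by contrast, is what the linked issue recommends for paths that may be remote.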
@olgabot
Contributor Author

olgabot commented Jun 27, 2019

(I'm testing out this edit here before editing nf-core/tools)

@apeltzer
Member

Oh, good spot! Please test and then ping someone for review! (And add it upstream in tools, yes!)

@olgabot olgabot changed the base branch from master to dev June 27, 2019 22:04
@olgabot olgabot requested a review from apeltzer June 28, 2019 22:53
@olgabot
Contributor Author

olgabot commented Jun 28, 2019

Okay (I think) this is ready for review!

Member

@apeltzer apeltzer left a comment

We should probably report this upstream in the nf-core/tools package too @olgabot !

It's the same way in the template and if that breaks AWS compatibility, that is something that concerns me!

@apeltzer apeltzer merged commit 7498697 into dev Jun 29, 2019
olgabot added a commit to nf-core/tools that referenced this pull request Jul 3, 2019
As mentioned in nf-core/rnaseq#245, the `pipeline_report.{html,txt}` files get written with `new File` instead of `file`, which leads to weird behavior: an `s3:/` folder is created locally if the output folder is on AWS S3:

```
 Thu 27 Jun - 09:03  ~/code/nf-core/rnaseq   origin ☊ olgabot/salmon-gencode ✔ 28☀ 
  ll --tree s3:
Permissions Size User    Date Modified Git Name
drwxr-xr-x     - olgabot 11 Jun 10:26   -- s3:
drwxr-xr-x     - olgabot 11 Jun 10:26   -- └── olgabot-maca
drwxr-xr-x     - olgabot 11 Jun 10:26   --    └── mini-maca
drwxr-xr-x     - olgabot 11 Jun 10:26   --       └── results
drwxr-xr-x     - olgabot 11 Jun 10:26   --          └── pipeline_info
.rw-r--r--   12k olgabot 11 Jun 16:40   --             ├── pipeline_report.html
.rw-r--r--  2.7k olgabot 11 Jun 16:40   --             └── pipeline_report.txt
```

This is especially problematic because once the pipeline has been run for the first time, the local `s3:/` folder exists. Any input files then get tested against that "folder" and suddenly they "don't exist", because they look like they're on the local filesystem (locally, `s3://` --> `s3:/`), and then pipelines break 😢
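
A rough sketch of that failure mode, using only `java.io.File` on a Unix-like system. It does not reproduce Nextflow's own input handling, which this thread does not describe. The class `LocalS3FolderDemo`, its methods, the temporary directory standing in for the launch directory, and the input name `reads.fastq.gz` are all hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

final class LocalS3FolderDemo {
    // Creates a scratch directory that stands in for the pipeline launch directory.
    static File newWorkDir() {
        try {
            return Files.createTempDirectory("launchdir").toFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Mimics writing the report directory through java.io.File:
    // the S3 URI is resolved as a relative path below workDir.
    static File writeReportDir(File workDir, String outdirUri) {
        File dir = new File(workDir, outdirUri + "/pipeline_info");
        dir.mkdirs();
        return dir;
    }

    // A purely local existence check; it never contacts S3.
    static boolean existsLocally(File workDir, String uri) {
        return new File(workDir, uri).exists();
    }

    public static void main(String[] args) {
        File workDir = newWorkDir();
        File reportDir = writeReportDir(workDir, "s3://olgabot-maca/mini-maca/results");
        System.out.println(reportDir.isDirectory());                // local tree was created
        System.out.println(new File(workDir, "s3:").isDirectory()); // a folder literally named "s3:"
        // Hypothetical input object: it may exist in the bucket, but not under the local "s3:" folder.
        System.out.println(existsLocally(workDir, "s3://olgabot-maca/mini-maca/reads.fastq.gz"));
    }
}
```

The first two checks should report that a local `s3:` tree now exists, while the last one reports the input as missing, because the lookup happens on the local filesystem instead of in the bucket.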
@apeltzer apeltzer deleted the olgabot/pipeline_report_file branch July 8, 2019 13:16