WIP: Discussion about input parameters #5

Merged
merged 12 commits into from
Mar 12, 2024

Conversation

ewels
Member

@ewels ewels commented Feb 7, 2024

Result of discussion at the nf-core core team retreat.

@edmundmiller edmundmiller marked this pull request as ready for review February 12, 2024 18:33
@edmundmiller edmundmiller marked this pull request as draft February 12, 2024 18:52
@edmundmiller edmundmiller marked this pull request as ready for review February 12, 2024 19:03
@edmundmiller
Contributor

Think using those as yamls is the move. Then we can do an "on-changes" CI to launch tower jobs.

Ooo, better yet, we can cat the yamls together if they've changed and launch one tower job per PR commit instead of multiple.

I was hating on the - genome: ... part, but that's going to come in really handy.
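
A minimal sketch of the "cat the yamls together" idea, assuming each reference file is a top-level YAML list (entries starting with `- genome: ...`) with no `---` document markers, so plain text concatenation still yields one valid list. The function name, the file names, and the `source` key are hypothetical and only serve the illustration; how the combined file is handed to a Tower job is left out.

```python
def concat_changed_yamls(yaml_texts, changed):
    """Join the text of the changed YAML list files into one YAML list.

    yaml_texts: dict mapping a file name to its YAML text (hypothetical layout).
    changed:    iterable of file names reported as changed by CI.
    """
    chunks = []
    for name in sorted(changed):  # sorted for a stable, reproducible order
        text = yaml_texts[name].strip("\n")
        if text.strip():  # skip empty files
            chunks.append(text)
    return "\n".join(chunks) + "\n" if chunks else ""


# Hypothetical file contents; only the "- genome:" shape comes from the thread.
files = {
    "yeast.yml": "- genome: R64-1-1\n  source: ensembl\n",
    "human.yml": "- genome: GRCh38\n  source: gencode\n",
}
combined = concat_changed_yamls(files, changed={"yeast.yml", "human.yml"})
print(combined)
```

Because every file contributes list items at the same indentation, the result can be passed on as a single params file, which is what makes one job per commit possible instead of one job per changed file.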

@maxulysse
Member

I like where this is going

Comment on lines 32 to 34
gtf:
  type: string
  default: s3://ngi-igenomes/igenomes/Saccharomyces_cerevisiae/Ensembl/R64-1-1/Annotation/Genes/genes.gtf

One problem I have with the current genomes is their static nature. There isn't one version of GRCh38; there is a 1-to-many relationship between the genome build and annotation data (here, a GTF file). Can we include that somehow? 🤔
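
One way to picture the 1-to-many relationship is a build record that owns a list of annotation records. This is only a sketch: the class names, the `release` field, the release labels, and the `s3://example-bucket/...` locations are all placeholders invented for illustration, not part of the proposed schema or of iGenomes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Annotation:
    source: str   # annotation provider, e.g. "ensembl"
    release: str  # hypothetical release label
    gtf: str      # location of the GTF file


@dataclass
class GenomeBuild:
    name: str
    annotations: List[Annotation] = field(default_factory=list)

    def find(self, source: str, release: str) -> Optional[Annotation]:
        """Return the matching annotation, or None if this build has none."""
        for annotation in self.annotations:
            if annotation.source == source and annotation.release == release:
                return annotation
        return None


# Placeholder locations and release labels, not real reference paths.
grch38 = GenomeBuild(
    name="GRCh38",
    annotations=[
        Annotation("ensembl", "release-A", "s3://example-bucket/GRCh38/ensembl/release-A/genes.gtf"),
        Annotation("ensembl", "release-B", "s3://example-bucket/GRCh38/ensembl/release-B/genes.gtf"),
        Annotation("gencode", "release-A", "s3://example-bucket/GRCh38/gencode/release-A/genes.gtf"),
    ],
)

match = grch38.find("ensembl", "release-B")
print(match.gtf if match else "no such annotation")
```

The point of the shape is that a GTF is selected by build plus source plus release, instead of being a single static default attached to the build.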

Contributor

That's fine, these are just some initial examples.

I don't think the bottleneck is going to be:

  1. Storage space
  2. Compute

It's going to be:

  1. Collaboration overhead
  2. Getting things right and keeping it simple (The Sarek example is perfect)
  3. Egress fees

Comment on lines 19 to 25
- ensembl
- ucsc
- ncbi
- gencode
- refseq
- encode
- custom

These need to at least be maps or something to capture versions, different builds etc. For example, if we add MANE (the most appropriate transcriptome build for most people):

Suggested change
-   - ensembl
-   - ucsc
-   - ncbi
-   - gencode
-   - refseq
-   - encode
-   - custom
+   - ensembl
+   - ucsc
+   - ncbi
+   - gencode
+   - refseq
+   - encode
+   - MANE
+     - refseq
+     - ensembl
+   - custom
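
To make the "maps or something" point concrete, here is a small sketch that treats the sources as a nested mapping, where MANE carries its own sub-options. The layout, the `/`-separated path convention, and the function name are assumptions for illustration; the thread does not settle on a structure.

```python
# Hypothetical nested layout: each source maps to the sub-options it offers.
# An empty dict means no further levels have been captured yet.
SOURCES = {
    "ensembl": {},
    "ucsc": {},
    "ncbi": {},
    "gencode": {},
    "refseq": {},
    "encode": {},
    "MANE": {"refseq": {}, "ensembl": {}},
    "custom": {},
}


def is_known_source(path, tree=SOURCES):
    """Return True if every step of a path like "MANE/refseq" exists in the tree."""
    node = tree
    for key in path.split("/"):
        if key not in node:
            return False
        node = node[key]
    return True


print(is_known_source("MANE/refseq"))  # a nested option under MANE
print(is_known_source("MANE/ucsc"))    # not listed under MANE
```

A flat list cannot express that `refseq` under MANE is a different choice from top-level `refseq`; a mapping can, and the same nesting could later hold versions or builds.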

Comment on lines 2 to 4
properties:
  # Required
  reference_id:

Suggested change
- properties:
-   # Required
-   reference_id:
+ properties:
+   version: 0 # version of this document!
+   # Required
+   reference_id:
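
A document-level version field lets a consumer refuse files it does not understand before reading anything else. The sketch below assumes the YAML has already been parsed into a Python dict with a top-level `version` key; the function name and the set of supported versions are hypothetical.

```python
SUPPORTED_VERSIONS = {0}  # assumption: only version 0 of the document exists so far


def check_document_version(document):
    """Return the document version, or raise ValueError if it is missing or unsupported."""
    version = document.get("version")
    # bool is a subclass of int in Python (False == 0), so reject it explicitly.
    if isinstance(version, bool) or version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported reference document version: {version!r}")
    return version


print(check_document_version({"version": 0, "reference_id": "R64-1-1"}))
```

Checking the version first means later changes to the layout can be introduced under a new number without silently misreading older files.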

# OR Manually submitted
# Each optional, build what we can based on what is provided
fasta:
  type: string

Surely you mean?

Suggested change
-   type: string
+   type: file

Contributor

@edmundmiller edmundmiller Mar 12, 2024

Don't look at me, @ewels 😆

Contributor

@edmundmiller edmundmiller left a comment

I'm gonna go ahead with this one and merge it, and we can follow up with those in another PR.

I feel like there's a better/prettier way to do the yaml, but I can't seem to get it to work with nf-validation.

@edmundmiller edmundmiller merged commit 3c523cd into main Mar 12, 2024
@edmundmiller edmundmiller deleted the retreat-brainstorming branch March 12, 2024 22:04
@edmundmiller edmundmiller added this to the v1.0.0 milestone Nov 20, 2024

4 participants