add `sample_name` as possible column in samplesheet #31
base: dev
Looks great Steven!
This looks great. I tested in IRIDA Next and works perfectly. Thanks so much Steven 😄
CHANGELOG.md (Outdated)
- Added the ability to include a `sample_name` column in the input samplesheet.csv. Allows for compatibility with IRIDA-Next input configuration.
- `sample_name` special characters will be replaced with `"_"`
Might be useful to include what characters are special (non-alphanumeric?).
README.md (Outdated)
`sample_name`: An **optional** column, that overrides `sample` for outputs (filenames and sample names) and reference assembly identification.

`sample_name`, allows more flexibility in naming output files or sample identification. Unlike `sample`, `sample_name` is not required to contain unique values. `Nextflow` requires unique sample names, and therefore in the instance of repeat `sample_names`, `sample` will be suffixed to any `sample_name`. Non-alphanumeric characters (excluding `_`,`-`,`.`) will be replaced with `"_"`.
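To make the replacement rule in the README text concrete, here is a minimal Python sketch of it. It is not the pipeline's Nextflow code: the function name `sanitize_sample_name` is hypothetical, and it assumes "alphanumeric" means ASCII letters and digits only.

```python
import re

# Assumption: "special" means anything other than ASCII letters, digits,
# "_", "-" and ".", as the README text above describes.
SPECIAL_CHARS = re.compile(r"[^A-Za-z0-9_.\-]")


def sanitize_sample_name(name: str) -> str:
    """Replace every special character in a sample_name with an underscore."""
    return SPECIAL_CHARS.sub("_", name)


print(sanitize_sample_name("sample #1 (rep/2).v1"))  # sample__1__rep_2_.v1
```

Each special character is replaced one for one, so runs of special characters become runs of underscores rather than collapsing to a single `_`.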
I think the comma after `sample_name` might be unneeded.
workflows/gasclustering.nf (Outdated)
    ID_COLUMN = "sample_name"
    ID_COLUMN2 = "sample"
I would recommend changing `ID_COLUMN` and `ID_COLUMN2` to something more like `SAMPLE_NAME_COLUMN` and `SAMPLE_COLUMN`, or `ID_SAMPLE_NAME` and `ID_SAMPLE`, so that the difference between the two is clearer when reading the code.
Also, is `ID_COLUMN` used within the code? If it's not, it should probably be removed, and maybe that changes the name to `ID_HEADER` or something?
Good catch, `ID_COLUMN` is a vestige from some testing. Well, really I should remove `ID_COLUMN2` and replace it with `ID_COLUMN = "sample"`.
It goes into the formatting of the `metadata_headers` channel:
    metadata_headers = Channel.of(
        tuple(
            ID_COLUMN2,
            params.metadata_1_header, params.metadata_2_header,
            params.metadata_3_header, params.metadata_4_header,
            params.metadata_5_header, params.metadata_6_header,
            params.metadata_7_header, params.metadata_8_header)
        )
I think I will change it to `SAMPLE_HEADER` so it fits nicely in the channel. Here: d877ee2
Modified the template for the input `samplesheet.csv` file to include the `sample_name` column in addition to `sample`, in line with changes to the IRIDA-Next update as seen with the speciesabundance pipeline and staramrnf, for example. What this means is that the output files and the `sample` name will be changed to `sample_name` if a `sample_name` is called. If `gasclustering` is being run locally then the `sample_name` can be left blank.

Made a few changes:
- `sample_name` special characters will be replaced with `"_"`
- If no `sample_name` is supplied in the column, `sample` will be used
- To avoid repeat values for `sample_name`, all `sample_name` values will be suffixed with `sample`
- Tests to check that the variety of different `sample_names` work with the

PR checklist
- `nf-core lint`
- `nextflow run . -profile test,docker --outdir <OUTDIR>`
- `nf-test` to test new feature
- `docs/usage.md` is updated.
- `CHANGELOG.md` is updated.
- `README.md` is updated (including new tool citations and authors/contributors).
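
The fallback and uniqueness rules listed under "Made a few changes" can be sketched in Python as follows. This is an illustration under stated assumptions, not the pipeline's Nextflow implementation: the function `resolve_names` and the `"_"` separator are hypothetical, and it suffixes only the repeated names (the README reading), whereas the PR description can also be read as suffixing every name.

```python
from collections import Counter


def resolve_names(rows, sep="_"):
    """Map each unique `sample` ID to the name used for its outputs.

    rows: list of (sample, sample_name) pairs; sample_name may be blank or None.
    sep:  hypothetical separator between sample_name and the sample suffix.
    """
    # Fall back to `sample` when no sample_name was supplied.
    chosen = [(sample, (name or "").strip() or sample) for sample, name in rows]
    # Count how often each chosen name occurs so repeats can be detected.
    counts = Counter(name for _, name in chosen)
    # Assumption: only repeated names are suffixed with their `sample` ID.
    return {
        sample: f"{name}{sep}{sample}" if counts[name] > 1 else name
        for sample, name in chosen
    }


rows = [("S1", "alpha"), ("S2", "alpha"), ("S3", ""), ("S4", "beta")]
print(resolve_names(rows))
# {'S1': 'alpha_S1', 'S2': 'alpha_S2', 'S3': 'S3', 'S4': 'beta'}
```

The sketch relies on `sample` being unique, which the README states is required. It does not guard against a suffixed name colliding with another row's name.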