added background subtraction #48

Open · wants to merge 6 commits into dev

Conversation

@RobJY (Contributor) commented Jul 25, 2024

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs?
  • If necessary, also make a PR on the nf-core/mcmicro branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

github-actions bot commented Jul 25, 2024

nf-core lint overall result: Passed ✅ ⚠️

Posted for pipeline commit 0ffa4d6

✅ 176 tests passed
❔   1 test was ignored
❗  25 tests had warnings

❗ Test warnings:

  • nextflow_config - Config manifest.version should end in dev: 2.0.0
  • readme - README contains the placeholder zenodo.XXXXXXX. This should be replaced with the zenodo doi (after the first release).
  • pipeline_todos - TODO string in README.md: TODO nf-core:
  • pipeline_todos - TODO string in README.md: Include a figure that guides the user through the major workflow steps. Many nf-core
  • pipeline_todos - TODO string in README.md: Fill in short bullet-pointed list of the default steps in the pipeline
  • pipeline_todos - TODO string in README.md: Describe the minimum required steps to execute the pipeline, e.g. how to prepare samplesheets.
  • pipeline_todos - TODO string in README.md: update the following command to include all required parameters for a minimal example
  • pipeline_todos - TODO string in README.md: If applicable, make list of people who have also contributed
  • pipeline_todos - TODO string in README.md: Add citation for pipeline after first release. Uncomment lines below and update Zenodo doi and badge at the top of this file.
  • pipeline_todos - TODO string in README.md: Add bibliography of tools and data used in your pipeline
  • pipeline_todos - TODO string in base.config: Check the defaults for all processes
  • pipeline_todos - TODO string in base.config: Customise requirements for specific processes.
  • pipeline_todos - TODO string in test.config: Specify the paths to your test data on nf-core/test-datasets
  • pipeline_todos - TODO string in test.config: Give any required params for the test so that command line flags are not needed
  • pipeline_todos - TODO string in test_full.config: Specify the paths to your full test data ( on nf-core/test-datasets or directly in repositories, e.g. SRA)
  • pipeline_todos - TODO string in test_full.config: Give any required params for the test so that command line flags are not needed
  • pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline
  • pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
  • pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
  • pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!
  • pipeline_todos - TODO string in output.md: Write this documentation describing your workflow's output
  • pipeline_todos - TODO string in usage.md: Add documentation about anything specific to running your pipeline. For general topics, please point to (and add to) the main nf-core website.
  • pipeline_todos - TODO string in awsfulltest.yml: You can customise AWS full pipeline tests as required
  • pipeline_todos - TODO string in ci.yml: You can customise CI pipeline run tests as required
  • schema_lint - Parameter input not found in schema

❔ Tests ignored:

✅ Tests passed:

Run details

  • nf-core/tools version 2.14.1
  • Run at 2024-07-25 19:24:06

@RobJY requested a review from jmuhlich July 25, 2024 19:40
@RobJY changed the title from "added bacground subtraction" to "added background subtraction" Jul 25, 2024
@kbestak left a comment

Looks good to me! I would only change the exposure time to be a number to allow for floats, as we have a dataset with such exposure times.

I'll hopefully be updating the nf-core module for Backsub soon so that it adheres more closely to nf-core standards, with conda support.

I'm leaning towards having a different step name than the tool name, so for the step name, I think it should be background/af_correction/channel_subtraction, with Backsub as the tool to do it. What are the preferences from your side?

@@ -34,6 +34,15 @@
    "emission_wavelength": {
        "type": "integer",
        "errorMessage": ""
    },
    "exposure": {
        "type": "integer"
@kbestak commented Aug 7, 2024

Suggested change:
- "type": "integer"
+ "type": "number"

@jmuhlich (Member) commented Sep 6, 2024

> I'm leaning towards having a different step name than the tool name, so for the step name, I think it should be background/af_correction/channel_subtraction, with Backsub as the tool to do it. What are the preferences from your side?

From what I see in other nf-core pipelines, modules are imported and used under their original names (unless a module is used multiple times and must be aliased for uniqueness), and publishDir folder names also generally follow the module name. Sometimes publishDir paths are grouped under a higher-level folder when there are many alternatives for the same step, as we have for segmentation. The mcmicro-legacy module names and publishDir layout did not use this style, instead naming everything by its purpose, e.g. "registration", "segmentation", "quantification", and we did start to use that same publishDir structure here in mcmicro 2.0. I think we should pivot to adopting the nf-core style for consistency, though.
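For illustration, a minimal sketch of what that nf-core-style layout could look like in conf/modules.config, assuming the module is imported as BACKSUB; the selector, output path, and options below are illustrative assumptions rather than the pipeline's current configuration:

    process {
        withName: 'BACKSUB' {
            publishDir = [
                path: { "${params.outdir}/backsub" },   // folder follows the module name
                mode: params.publish_dir_mode,
                saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
            ]
        }
    }

Here the output folder simply follows the module name, in contrast to purpose-based folders such as "registration" or "quantification" in the legacy layout.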

@@ -19,6 +19,9 @@ params {
    // Illumination correction
    illumination = null

    // Background subtraction
    backsub = false
Member

Do we need this, or can it be inferred from the presence of the relevant columns in the marker sheet?

Contributor Author (@RobJY)

Oh, yes, that's a good idea. We could trigger background subtraction on the presence of a remove column in the marker sheet.
Is there a better column to use?

The critical column for the module would be background, as using the tool only for channel removal seems like an unnecessary data-duplication step.

@RobJY (Contributor, Author) Sep 10, 2024

OK, sounds good, I'll use the background column. Thanks!
I thought the background column might be used for something else as well, whereas the remove column seemed to be used only for background subtraction.

Just to add here, it might be good to check for the presence of both the background and exposure columns, as both are required for channel subtraction. If you check only for background and no exposure column is given, the tool would give an error.

I think the best way to approach this would be: check for the presence of the background column; if it is given, check the values in the column, and if none are a string, skip Backsub. Otherwise, check for the presence of the exposure column (and its values); if it is not provided, throw an error, and otherwise run Backsub.
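A minimal Groovy sketch of that decision logic, assuming the marker sheet has already been parsed into a list of maps keyed by column name; the helper name and row representation are illustrative only, not the pipeline's existing code:

    def shouldRunBacksub(List<Map> rows) {
        // No background column at all -> skip Backsub
        if (rows.every { !it.containsKey('background') }) {
            return false
        }
        // Background column present but holding no non-empty string values -> skip Backsub
        if (!rows.any { it['background'] instanceof String && it['background'].trim() }) {
            return false
        }
        // Background values are given, so exposure values become mandatory
        if (rows.any { !it['exposure'] }) {
            throw new IllegalArgumentException(
                "Marker sheet: 'background' values are set but 'exposure' values are missing; both are required for background subtraction.")
        }
        return true
    }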

Member

Should validate that background column values are a subset of the marker_name values in validateInputMarkersheet.
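A hedged sketch of that check as it could be added to validateInputMarkersheet, again assuming rows parsed into maps keyed by column name (the function name and signature here are illustrative):

    def validateBackgroundReferences(List<Map> rows) {
        def markerNames    = rows.collect { it['marker_name'] } as Set
        def backgroundRefs = rows.collect { it['background'] }.findAll { it } as Set
        def unknown = backgroundRefs - markerNames
        if (unknown) {
            throw new IllegalArgumentException(
                "Marker sheet: background value(s) ${unknown} do not match any marker_name.")
        }
    }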

input[0] = Channel.of(
    [
        [id:"TEST1", cycle_number:1, channel_count:4],
        "https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/imaging/background_subtraction/cycif_tonsil_registered.ome.tif",
Member

I don't think this is an actual background-subtraction test case, but structurally I guess it's fine.

channel_number + "," + cycle_number + "," + marker_name + "," + exposure + "," + background + "," + remove}] }
.flatten()
.map { it.replace('[]', '') }
.collectFile(name: 'markers_backsub.csv', sort: false, newLine: true, storeDir: "${workDir}")
Member

Don't use storeDir here. collectFile returns a channel, and you should use that channel directly where needed downstream. Check out ch_mcquant_markers and its usage in calling MCQUANT above.
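A sketch of that suggestion, keeping the existing operators but dropping storeDir and capturing the result as a channel; the names ch_markers (whatever upstream channel emits the per-marker CSV lines) and ch_backsub_markers (mirroring ch_mcquant_markers) are assumptions for illustration:

    ch_backsub_markers = ch_markers
        .flatten()
        .map { it.replace('[]', '') }
        .collectFile(name: 'markers_backsub.csv', sort: false, newLine: true)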

.flatten()
.map { it.replace('[]', '') }
.collectFile(name: 'markers_backsub.csv', sort: false, newLine: true, storeDir: "${workDir}")
BACKSUB(ASHLAR.out.tif, [[id:"$ASHLAR.out.tif[0]['id']"], "${workDir}/markers_backsub.csv"])
Member

As mentioned above, instead of an explicit path for the csv, use ch_backsub_markers. You'll need to use combine to join the collectFile-returned single-value channel with each entry in the meta-map channel for the second arg to BACKSUB.
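A sketch of the wiring described here, assuming the ch_backsub_markers channel from the previous suggestion and keeping the two-argument BACKSUB call shape from the diff:

    BACKSUB(
        ASHLAR.out.tif,
        ASHLAR.out.tif
            .map { meta, tif -> meta }                // keep just the meta map
            .combine(ch_backsub_markers)              // pair it with the single markers CSV
    )

Since ch_backsub_markers is a single-value channel, combine simply appends the CSV path to every meta map, so each registered image gets the same markers_backsub.csv as its second input.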

@kbestak commented Sep 6, 2024

> Sometimes publishDir paths are grouped under a higher-level folder when there are many alternatives for the same step, as we have for segmentation. The mcmicro-legacy module names and publishDir layout did not use this style, instead naming everything by its purpose, e.g. "registration", "segmentation", "quantification", and we did start to use that same publishDir structure here in mcmicro 2.0. I think we should pivot to adopting the nf-core style for consistency, though.

I was under the impression that our goal is still to keep the multiple-choice aspect based on steps, especially when it comes to defining the "execution stream", e.g. staging, registration, AF_correction, segmentation and quantification. The broader steps are defined and used to determine the "stream", but for each step (or several steps) we would allow multiple options (e.g. staging, registration*, segmentation). I think it would be somewhat confusing to start mixing process names into the broader execution stream, which is why I made the point above.
