anota2seq fails with more than 2 levels in the samplesheet phenotype / contrast variable #89
Comments
I think the linked bug report has to do with replicate numbers, so may not be directly relevant. But the intended functionality is, I think, as you suggest- the process should run multiple times, once for each contrast. We need to understand why that isn't happening. Could you post the contrast file used here please, and the associated nextflow logs for an example run? I'd like to discount the possibility of the splitting logic failing due to e.g. bad line endings |
Here is the contrast file. To me it looks like ASCII text; the last line has no newline character. This is the nextflow.log |
OK, I suspect there's some confusion here from anota2seq's messaging. The process is definitely just receiving information for one contrast:
So that's not the issue. |
Issue #66 reports two problems.
The first issue relates to the pipeline's supposed limitation in handling
more than one comparison. As you mentioned, it successfully processes the
first contrast, as shown in the lines you've shared. That said, although I
haven’t been able to test whether it can handle two comparisons—since the
program crashes when attempting to run the first one—based on what I see in
the anota2seqrun.r script, it doesn’t seem to support multiple
comparisons. If that’s the case, I believe it would be useful to introduce
this functionality, allowing the pipeline to process as many comparisons as
specified in the contrast file.
The second issue in #66 pertains to the number of replicates. I suspect
this may be a limitation of Anota2seq, which is fine. However, if that’s the
case, it would be helpful to include this in the documentation.
Thank you!
|
Please do make PRs to documentation, happy to review. |
Thank you Jonathan,
I am sorry if I have misunderstood something, but I am not sure that
setting opt$subset_to_contrast_samples to FALSE will fix the multiple
comparison issue. Do you think that the pipeline will perform two
comparisons (i.e. run anota2seqrun.r twice) given the contrast matrix below?
id,variable,reference,target,batch,pair
KI_LIF_vs_WT,treatment,WT_LIF,KI_LIF,,pair
KI_LIF2i_vs_WT,treatment,WT_LIF2i,KI_LIF2i,,pair
On Thu, 30 Jan 2025 at 17:00, Jonathan Manning wrote:
… I *think* this is basically the same as the issue solved in #91 in response to #90.
--
Naiara Garcia Bediaga, PhD
|
@naiarabediaga the point is that I think you're misinterpreting the error- the messages from anota2seq are confusing and this has nothing to do with the contrasts as the workflow understands them. If you look at the code that produces this error it's this:
We're always specifying a single contrast- the module has no capacity to do anything else - so |
Indeed, the messages from anota2seq can be quite confusing. I agree, the
error message '*Too few custom contrasts supplied. Please check your
contrast matrix*' (see below) is unrelated to the issue of multiple
comparisons, and has to do with the "leveling" of the contrast matrix
mentioned in #68. I think this has already been addressed and it was
closed today.
[image: Screenshot 2025-01-30 at 17.30.55.png]
The opt$subset_to_contrast_samples setting was another important issue
that made anota2seq crash, but I think you have already solved it, right?
Both the table of counts and the sample sheet *must be subset to include only
the samples involved in the contrast.*
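The subsetting requirement described here can be sketched in a few lines. This is a hedged Python illustration only, not the pipeline's R code: the sample IDs, gene names, counts, and the `subset_to_contrast` helper are hypothetical, and only the `treatment` levels are taken from the contrast file in this thread.

```python
from typing import Dict, List, Tuple

SampleRow = Dict[str, str]
Counts = Dict[str, Dict[str, int]]


def subset_to_contrast(
    samples: List[SampleRow],
    counts: Counts,
    variable: str,
    reference: str,
    target: str,
) -> Tuple[List[SampleRow], Counts]:
    """Keep only the samples whose level is the reference or the target."""
    kept = [row for row in samples if row[variable] in (reference, target)]
    kept_ids = [row["sample"] for row in kept]
    # Drop the count columns of every sample that is not in the contrast.
    kept_counts = {
        gene: {sid: per_sample[sid] for sid in kept_ids}
        for gene, per_sample in counts.items()
    }
    return kept, kept_counts


# Hypothetical sample sheet and counts (assumptions for illustration).
samples = [
    {"sample": "s1", "treatment": "WT_LIF"},
    {"sample": "s2", "treatment": "KI_LIF"},
    {"sample": "s3", "treatment": "WT_LIF2i"},
    {"sample": "s4", "treatment": "KI_LIF2i"},
]
counts = {
    "geneA": {"s1": 10, "s2": 12, "s3": 7, "s4": 9},
    "geneB": {"s1": 3, "s2": 5, "s3": 8, "s4": 2},
}

kept_samples, kept_counts = subset_to_contrast(
    samples, counts, variable="treatment", reference="WT_LIF", target="KI_LIF"
)
# Only two phenotype levels remain after subsetting.
print(sorted({row["treatment"] for row in kept_samples}))
```

After subsetting, the phenotype variable has exactly two levels, which is the situation a single contrast needs.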
Regarding the multiple comparisons issue, it’s unrelated to the errors
we’ve been encountering. It’s more about how certain lines in the
documentation led me to believe the pipeline could somehow loop through
more than one comparison (see below):
*"To carry out this analysis, the pipeline must be supplied with one or
more ‘contrasts’ describing the comparison to be made."*
But I am happy with one comparison if this functionality (something like a
loop for more than one comparison 😬) cannot be introduced.
Thank you so much!
On Thu, 30 Jan 2025 at 17:24, Jonathan Manning wrote:
… @naiarabediaga the point is that I
think you're misinterpreting the error- the messages from anota2seq are
confusing and this has nothing to do with the contrasts as the workflow
understands them.
If you look at the code that produces this error
<https://rdrr.io/bioc/anota2seq/src/R/anota2seqInternalFunctions.R> it's
this:
    if (dim(contrasts)[2] != (nPheno - 1)) {
        if (dim(contrasts)[2] > (nPheno - 1)) {
            stop("Too many custom contrasts supplied.\nPlease check your contrast matrix.\n")
        }
        if (dim(contrasts)[2] < (nPheno - 1)) {
            stop("Too few custom contrasts supplied.\nPlease check your contrast matrix.\n")
        }
    }
We're always specifying a single contrast- the module has no capacity to
do anything else. So what this is saying is that there can *only* be 2
levels in the phenotype vector. That means we need to subset the sample
sheet to only the samples for the contrast at hand, which is what we force
in #91.
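To make the arithmetic behind that check concrete, here is a hedged Python rendering of the same condition. It assumes `nPheno` is the number of distinct levels in the phenotype vector, as the explanation above implies; the helper name `contrast_count_problem` is hypothetical, and the level names come from the contrast file in this thread.

```python
from typing import Optional, Sequence


def contrast_count_problem(
    n_contrast_columns: int, phenotypes: Sequence[str]
) -> Optional[str]:
    """Mirror the quoted check: contrast columns must equal (levels - 1)."""
    n_pheno = len(set(phenotypes))  # assumed meaning of nPheno
    if n_contrast_columns > n_pheno - 1:
        return "Too many custom contrasts supplied."
    if n_contrast_columns < n_pheno - 1:
        return "Too few custom contrasts supplied."
    return None


# Full sample sheet: four levels, but the module supplies one contrast column.
all_phenotypes = ["WT_LIF", "KI_LIF", "WT_LIF2i", "KI_LIF2i"]
full_sheet_result = contrast_count_problem(1, all_phenotypes)

# Sample sheet subset to the contrast at hand: two levels, one contrast column.
contrast_phenotypes = ["WT_LIF", "KI_LIF"]
subset_result = contrast_count_problem(1, contrast_phenotypes)

print(full_sheet_result)
print(subset_result)
```

With four levels and one contrast column, 1 < 4 - 1 and the "too few" branch fires; with two levels, 1 == 2 - 1 and the check passes.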
--
Naiara Garcia Bediaga, PhD
|
@naiarabediaga the documentation is correct: the pipeline will indeed loop over the multiple contrasts you provide, by running multiple iterations of the ANOTA2SEQ process. It does that by splitting the contrasts file you provide. The process itself only works with a single contrast (not a contrast file) at a time. It died here because of the problem with the first iteration. Once that issue is solved you should see those multiple contrasts being analysed, one by one. |
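The idea of splitting the contrasts file into one unit of work per row can be illustrated as follows. This is a Python sketch under stated assumptions, not the pipeline's Nextflow implementation: `split_contrasts` is a hypothetical helper, and the CSV content is the contrast matrix posted earlier in this thread.

```python
import csv
from typing import Dict, List

# Contrast file as posted in this thread.
CONTRASTS_CSV = (
    "id,variable,reference,target,batch,pair\n"
    "KI_LIF_vs_WT,treatment,WT_LIF,KI_LIF,,pair\n"
    "KI_LIF2i_vs_WT,treatment,WT_LIF2i,KI_LIF2i,,pair\n"
)


def split_contrasts(text: str) -> List[Dict[str, str]]:
    """Return one metadata dict per contrast row.

    splitlines() tolerates CRLF line endings and a missing final newline,
    the two file quirks raised earlier in this thread.
    """
    return [dict(row) for row in csv.DictReader(text.splitlines())]


contrasts = split_contrasts(CONTRASTS_CSV)
for meta in contrasts:
    # Each dict stands in for the meta map handed to one process invocation.
    print(f"one task for {meta['id']}: {meta['reference']} vs {meta['target']}")
```

Each row becomes its own metadata record, so two rows would mean two separate invocations, each seeing a single contrast.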
The multi-contrast issue has now been fixed (#94), and is making its way into the 1.1.0 release as we speak. I think we can close this issue now. |
Description of feature
Background: We used a public dataset with matched Ribo-seq and RNA-seq and tried to get the anota2seq step to work (by "we" I mostly mean my colleague @naiarabediaga).
As contrast file we used the contrast file:
This fails because anota2seq only accepts one contrast per file, which had already been pointed out in this bug report. Splitting the contrast file into two single-contrast files has the potential to work:
By looking at the anota2seq job that gets run, it looks like the Nextflow logic extracts the contrast information from the contrast file and submits it as a meta flag to the run ([id:KI_LIF_vs_WT, variable:treatment, reference:WT_LIF, target:KI_LIF, batch:, pair:pair]):
I wonder if the logic could be changed within ANOTA2SEQ_ANOTA2SEQRUN to extract this information for each line and submit each line as a separate job, similar to re-running the job multiple times with only 1 line of contrast information? As a test with the test data, couldn't we simply re-use the same line as a second contrast to see that it launches 2 ANOTA2SEQ jobs?
As a side note, this also failed because anota2seq requires the levels in the contrast file to be sorted, but this has already been addressed here: nf-core/modules#7395.
Many thanks!