Acceptable data to use as DNA concentrations and what to do with non-quantifiable samples #155

Open
mimouschka opened this issue Dec 6, 2024 · 8 comments

Comments

@mimouschka

Hi @benjjneb
Thanks a lot for your packages and the very detailed tutorials!
I am reaching out because I am unclear about what to do relating to the "conc" parameter in decontam.
It seems that you designed the package to use Qubit-like quantification values of the PCR products before pooling them equimolarly.
Now,

  1. What if I have QiaXcel readings of my PCR products, i.e. readings originating from automated capillary electrophoresis. Those readings represent DNA concentrations of the expected amplicon size (so in theory not including primers, primer dimers etc), but I have noticed that they can be quite dubious with low band intensity.
  2. Can one use DNA extract concentrations too (as I have those from Qubit)?
  3. In both cases, most of my controls (i.e. sampling, subsampling, extraction, PCR, sequencing) have "non quantifiable" readings. What to do in this case, as decontam does not accept zero-values? The limit of detection for qubit being 0.005 ng/ul, should I use this value for all non-quantifiable samples? Or should I just use the prevalence method in this case?

Thanks for your feedback!

@mimouschka mimouschka changed the title Acceptable data to use as DNA concentrations and what to do with non-quanitifiable samples Acceptable data to use as DNA concentrations and what to do with non-quantifiable samples Dec 6, 2024
@benjjneb
Owner

benjjneb commented Dec 6, 2024

What if I have QiaXcel readings of my PCR products, i.e. readings originating from automated capillary electrophoresis. Those readings represent DNA concentrations of the expected amplicon size (so in theory not including primers, primer dimers etc), but I have noticed that they can be quite dubious with low band intensity.

I have never tested with this type of DNA quantitation information. If the readings become "dubious" over a significant range of the DNA concentrations in your study, then I would be hesitant to use them.

Can one use DNA extract concentrations too (as I have those from Qubit)?

Yes. One idea that might be nice is to use both types of DNA concentration measurements and then compare results to see if they at least roughly line up.

In both cases, most of my controls (i.e. sampling, subsampling, extraction, PCR, sequencing) have "non quantifiable" readings. What to do in this case, as decontam does not accept zero-values? The limit of detection for qubit being 0.005 ng/ul, should I use this value for all non-quantifiable samples? Or should I just use the prevalence method in this case?

I would recommend excluding the controls when performing the frequency identification method. This actually happens automatically when you use the "combined" method, which does both frequency (without controls) and prevalence (using the controls) and then creates a combined decontam score from the two methods.
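To make the split described above concrete, here is a minimal sketch in plain Python (not decontam code) of deciding which samples could feed a frequency-style analysis: controls are set aside, and true samples without a usable positive concentration are reported rather than silently given a placeholder value. The sample-sheet layout and the keys `id`, `is_control` and `conc` are hypothetical, chosen only for illustration.

```python
import math

def frequency_inputs(samples):
    """Split a sample sheet into IDs usable for a frequency-style analysis.

    `samples` is a list of dicts with hypothetical keys:
    'id' (str), 'is_control' (bool) and 'conc' (float or None).
    Returns (usable_ids, problem_ids).
    """
    usable, problems = [], []
    for s in samples:
        if s["is_control"]:
            # Controls are left out here; they inform the prevalence side only.
            continue
        conc = s["conc"]
        if conc is None or math.isnan(conc) or conc <= 0:
            # A true sample with no usable reading needs attention.
            problems.append(s["id"])
        else:
            usable.append(s["id"])
    return usable, problems

# Made-up example values, for illustration only.
samples = [
    {"id": "S1", "is_control": False, "conc": 12.4},
    {"id": "S2", "is_control": False, "conc": 0.8},
    {"id": "S3", "is_control": False, "conc": None},   # non-quantifiable true sample
    {"id": "NC1", "is_control": True, "conc": None},   # non-quantifiable control
]
usable, problems = frequency_inputs(samples)
print("usable for frequency:", usable)
print("true samples lacking a reading:", problems)
```

In this sketch the non-quantifiable control never reaches the frequency step, so no substitute value (such as a limit of detection) has to be invented for it.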

@mimouschka
Author

mimouschka commented Dec 9, 2024

Hi @benjjneb
Thanks a lot for your feedback.
Regarding point 2. I will do this, thanks for the suggestion.
Just to clarify your last point: does this mean that if samples have "NA" in their concentration reading (i.e. no value specified), they are automatically not taken into account by decontam's frequency method, but that the method will still work?
So, that the frequency method uses all samples available to flag ASVs as contaminants, without a priori knowledge about samples being controls or not?

Thanks for your feedback!

@mimouschka
Author

My last question is answered in #113 and #38

Indeed the frequency method "is effective all on its own", and in the "combined" method it separates out the negative controls, to do the contaminant labeling only using true samples.
So I had misunderstood the restriction on zero values: the frequency method is designed to use true samples only, not controls, so it is only in the true samples that zero values are not accepted.

So then my question is: why ever use the "minimum", "either" or "both" methods, as it seems that the "combined" method is the most complete approach?

@benjjneb
Owner

benjjneb commented Dec 9, 2024

So then my question is: why ever use the "minimum", "either" or "both" methods, as it seems that the "combined" method is the most complete approach?

Those other variants are just different ways of using the scores from the prevalence + frequency methods to classify contaminants. One might use these variants depending on study goals (e.g. is it of utmost importance to rule out all possible contaminants? or instead to only rule out obvious contaminants?).
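As a rough illustration of how these variants can disagree, here is a minimal Python sketch (not decontam's own code). It assumes, following my reading of the package documentation, that "combined" merges the two scores with Fisher's method, "minimum" takes the smaller score, and "either"/"both" apply the threshold to each score separately; the function names and the example scores are hypothetical.

```python
import math

def fisher_combine(p_freq, p_prev):
    """Fisher's method for two scores.

    The statistic -2*ln(p1*p2) is compared to a chi-square distribution with
    4 degrees of freedom, whose upper tail has the closed form
    p1*p2*(1 - ln(p1*p2)).
    """
    prod = p_freq * p_prev
    if prod == 0.0:
        return 0.0
    return prod * (1.0 - math.log(prod))

def classify(p_freq, p_prev, threshold=0.1):
    """Return, per variant, whether a feature would be called a contaminant."""
    return {
        "combined": fisher_combine(p_freq, p_prev) < threshold,
        "minimum": min(p_freq, p_prev) < threshold,
        "either": p_freq < threshold or p_prev < threshold,
        "both": p_freq < threshold and p_prev < threshold,
    }

# Made-up scores: frequency evidence is fairly strong, prevalence evidence is not.
calls = classify(p_freq=0.05, p_prev=0.5, threshold=0.1)
for variant, is_contaminant in calls.items():
    print(variant, is_contaminant)
```

With these example scores, "minimum" and "either" flag the feature while "combined" and "both" do not, which is the kind of strict-versus-permissive trade-off described above.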

@mimouschka
Author

OK, here are some score (P) histograms from (1) the frequency method with two different DNA concentration estimates (Qubit of extracts and QiaXcel of PCR products), (2) the prevalence method, and (3) the combined method.

Frequency method: qubit of eDNA extracts

[Figure: decontam_frequency_extractionDNAconc-1-1]

The score histogram looks strange. Maybe the DNA concentrations do not appropriately describe the input DNA (see #125).

Frequency method: qiaxcel of PCR products

[Figure: decontam_frequency_extractionPCRconc-2]

which score to choose?

prevalence method

[Figure: decontam_prevalence-1]

combined method

[Figure: decontam_combined-1]

and the prevalence plot using the combined method and a threshold of 0.3:

[Figure: fig-contaminants-prevalence-4]

Based on your feedback and #125, I would not use the Qubit readings of the DNA extracts, but rather the QiaXcel readings of the amplicons. I would stick with the combined method, but then which threshold should I choose?

@benjjneb
Owner

Based on these diagnostic outputs, I would not use decontam to remove contaminants at all.

What kind of samples are you measuring? The output of "combined" could be consistent with there being little-to-no contamination in your samples.

@benjjneb
Owner

benjjneb commented Dec 16, 2024

And the "qubit of DNA extracts" measure is producing output that looks like one would get if a random number was assigned to the DNA concentration. I almost wonder if those measurements are being merged correctly with the taxonomic profiles.
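One way to rule out that kind of merge problem is to join the concentrations to the samples by ID rather than by row position, and to fail loudly on any mismatch. The following is a minimal Python sketch of such a check, not part of decontam; the function name `align_concentrations` and the example IDs and values are hypothetical.

```python
def align_concentrations(sample_ids, conc_by_id):
    """Return concentrations in the same order as `sample_ids`.

    `sample_ids` is the sample order used by the feature table;
    `conc_by_id` maps sample ID -> concentration reading.
    Raises ValueError if the two sets of IDs do not match exactly.
    """
    missing = [sid for sid in sample_ids if sid not in conc_by_id]
    unused = sorted(set(conc_by_id) - set(sample_ids))
    if missing or unused:
        raise ValueError(f"ID mismatch: missing={missing}, unused={unused}")
    return [conc_by_id[sid] for sid in sample_ids]

# Made-up example: the metadata sheet is sorted differently from the feature table.
table_order = ["S3", "S1", "S2"]
readings = {"S1": 12.4, "S2": 0.8, "S3": 5.1}
aligned = align_concentrations(table_order, readings)
print(list(zip(table_order, aligned)))
```

If the readings had instead been attached by position from a differently sorted sheet, each sample would receive another sample's concentration, which would look much like assigning random values.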

@mimouschka
Author

The samples are eDNA extracts of sediment samples, processed in an ancient DNA lab, so with expected very low contamination.

What you say about the "qubit of DNA extracts" plot makes sense: it seems that the Qubit measurement of the initial eDNA extracts does not appropriately describe the PCR input DNA concentration...

What do you mean by "I almost wonder if those measurements are being merged correctly with the taxonomic profiles"?
