Interpreting results with low sample size / Alternatives? #154

Open
Skirnir3141 opened this issue Nov 13, 2024 · 2 comments

Comments

@Skirnir3141

Great package! I'm trying to establish its applicability to my research. I'm doing DNA metabarcoding of eukaryotic community DNA and have processed my reads into OTUs. I'd like to use decontam (or a similar package) to remove any contaminants, but my sample size is very low: 1 negative control and 9 samples.

The prevalence method obviously won't work with so few samples, so I tried the frequency method with the default threshold of 0.1, and decontam flagged 6 OTUs as contaminants:
[Image: frequency method example]

Just eyeballing, 4 of these look like they may actually be contaminants, since their frequency is much higher in the negative control than in the samples. But for 2 of them (OTU_183 and OTU_332), it looks like decontam is incorrectly flagging them based on some minor linearity in the samples (their frequency is no higher in the negative control).
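The eyeball comparison described above could be written down roughly as follows. This is only a sketch of that informal check, not decontam's frequency method: the OTU names, read counts, library sizes, and the `control_to_sample_ratio` helper are all hypothetical and are not the data from this issue. With a single negative control, the ratio is purely descriptive and is not a statistical test.

```python
# Hypothetical read counts (made-up numbers, NOT the data from this issue).
# For each OTU, the first entry is the single negative control and the
# remaining entries are true samples.
counts = {
    "OTU_A": [120, 3, 0, 5, 2],    # much more abundant in the control
    "OTU_B": [1, 40, 55, 38, 61],  # no more abundant in the control
}
# Total reads per library, in the same order as above (also invented).
library_sizes = [1000, 20000, 25000, 18000, 30000]


def relative_abundance(reads, totals):
    """Convert raw read counts to per-library relative abundances."""
    return [r / t for r, t in zip(reads, totals)]


def control_to_sample_ratio(reads, totals, pseudo=1e-6):
    """Relative abundance in the control divided by the highest relative
    abundance in any sample. The pseudocount avoids division by zero."""
    rel = relative_abundance(reads, totals)
    control, samples = rel[0], rel[1:]
    return control / (max(samples) + pseudo)


ratios = {
    otu: control_to_sample_ratio(reads, library_sizes)
    for otu, reads in counts.items()
}

for otu, ratio in ratios.items():
    print(f"{otu}: control/sample ratio = {ratio:.2f}")
```

A ratio far above 1 matches the "much higher in the negative control" pattern, while a ratio at or below 1 matches the pattern described for OTU_183 and OTU_332. No cutoff is implied here.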

Two questions:

  1. Do you think decontam is valid for my use case? I see in https://doi.org/10.1101/221499 that a minimum of 5 negative controls is recommended, which of course I don't have.
  2. If it is, do you have any recommendations for differentiating between contamination and cross-contamination (I see in the same paper that decontam isn't designed for cross-contamination)? For example, some of the OTUs flagged as contaminants are assigned to taxa that should be present in my community DNA. Given this, I assume that these OTUs are cross-contamination rather than contamination.

Thanks so much!

- Mike
@benjjneb
Owner

> Do you think decontam is valid for my use case?

Unfortunately, no.

The prevalence method doesn't work with just 1 negative control, and the plots you are presenting suggest that the signal the frequency method is picking up comes from that one negative control (or one sample), and thus isn't reliable.

> If it is, do you have any recommendations for differentiating between contamination and cross-contamination (I see in the same paper that decontam isn't designed for cross-contamination)?

If you have 10 samples, 1 of which is a negative control? No, I don't have any recommendations. This isn't enough information.

@Skirnir3141
Author

As I thought, thanks for the feedback!
