Is it possible to get ASVs with zero counts in all samples after DADA2 run? #1958
I don't see how that would be possible if following the dada2 tutorial workflow. ASVs aren't added to the table unless they exist at >0 abundance, and steps like chimera removal remove the ASV column entirely; they don't just delete all the counts and leave a zero-study-wise-abundance ASV behind. Perhaps check for these zero-count ASVs at intermediate steps along the workflow, to isolate where they may be cropping up?
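That intermediate-step check is a one-liner on the sequence table. A minimal sketch, assuming `seqtab` is the standard samples-by-ASVs integer matrix (as produced by `makeSequenceTable` or `removeBimeraDenovo`) at whichever step you want to inspect:

```r
# Count ASVs whose total abundance across all samples is zero.
# Run this after each workflow step (denoising, merging, chimera
# removal, decontam, sample subsetting) to find where they appear.
zero_asvs <- colSums(seqtab) == 0
sum(zero_asvs)  # number of zero-count ASVs at this step
```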
Thank you very much @benjjneb, and sorry, yes, you're correct. Zero-count ASVs appear in the table only after I run decontam and then, finally, remove the blank samples from the table. One last related question: why would singletons appear in the counts table if, in the standard workflow, singleton detection is set to FALSE by default? Thank you again.
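For anyone hitting the same thing: once blank samples are dropped, columns whose counts lived only in the blanks sum to zero and can be removed directly. A sketch with hypothetical object names, assuming `seqtab.nochim` is the samples-by-ASVs matrix and `blanks` is a character vector of blank sample names:

```r
# Drop the blank samples, then remove ASVs whose total count
# became zero once the blanks were gone.
seqtab.final <- seqtab.nochim[!rownames(seqtab.nochim) %in% blanks, ]
seqtab.final <- seqtab.final[, colSums(seqtab.final) > 0]
```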
Singletons can appear due to merging. That is, a unique pair of forward and reverse ASVs, neither of which was a singleton itself, can produce a merged singleton if only one read pair had that combination.
Many thanks again for your kind clarification and your time, @benjjneb.
Hi
I'm a frequent user of your wonderful package. In the NovaSeq 6000 dataset (COI marker gene) that I'm currently analyzing, I noticed an odd thing in the ASV count output table for the first time: a couple hundred ASVs have zero read counts (out of a total of 59K+ ASVs) across all samples after completing the standard DADA2 workflow. Is this possible, or is it an artifact of using NovaSeq data? Would it be OK to proceed by simply removing these ASVs, or should I enforce monotonicity and rerun the pipeline, in the hope that such zero-count ASVs no longer appear?
I was initially hoping to just filter out the rare (and possibly erroneous) ASVs (those found in <2 sample libraries and represented by <10 reads), as was done in some published papers with such NovaSeq data, rather than spend a lot of time testing monotonicity-enforcing parameters, which could take a while given the size of my dataset. Rare ASVs aren't actually that important as far as my current project objectives go. However, I now realize that going down this rare-ASV elimination route would result in the loss of >94% of the reads across all of my samples.
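For concreteness, the prevalence/abundance filter described above (keep ASVs found in at least 2 samples and with at least 10 total reads) can be sketched in a few lines; `seqtab` and the thresholds here are illustrative, not from any published pipeline:

```r
# Prevalence = number of samples each ASV occurs in;
# abundance = total reads per ASV across all samples.
prevalence <- colSums(seqtab > 0)
abundance  <- colSums(seqtab)
keep <- prevalence >= 2 & abundance >= 10
seqtab.filt <- seqtab[, keep]

# Fraction of total reads the filter would discard.
1 - sum(seqtab.filt) / sum(seqtab)
```

Checking that last number before committing to the filter is what revealed the >94% read loss mentioned above.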
Thank you very much in advance for any valuable suggestions here.