Clarification on hierarchical read classification with multiple databases #161

gtonkinhill · 2024-01-10T23:40:24Z

Hi, thanks for creating such a useful tool!

Apologies if I've missed this in the documentation. I wanted to clarify how krakenuniq handles multiple databases when run as

krakenuniq --db HOST --db PROK --db EUK_DRAFT

Am I correct in assuming that only kmers that do not match the HOST DB will be subsequently searched in the PROK DB?
Would this generally be a more conservative way to remove host DNA than including the host genome in a single DB?
Given a single taxonomy, is it possible to have the same genome in multiple DBs, or does this cause problems? Is it important to ensure the DBs do not overlap?


salzberg · 2024-01-11T00:56:57Z

Hmm, maybe some of the others will answer, but all I can say is what I do: I never run KrakenUniq with multiple DBs. Instead, I run it with one DB and then use KrakenTools to extract all the unmapped reads. I then take those reads and (if I have a 2nd DB) run them against the 2nd DB.
And to remove host DNA, I usually run bowtie2 to align against human (that's the only host I've filtered out) and then take the unmapped reads from that, and run them through KrakenUniq.
It does not cause a problem to have the same genome in multiple DBs. However, some kmers might get assigned to different taxonomic IDs if you do that.
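
For readers who want to try the sequential workflow described above, here is a rough sketch of how it might be scripted. It is not taken from the thread: the file names, the `human_index` bowtie2 index prefix, and the choice of `PROK` then `EUK_DRAFT` as first and second DB are all placeholders, and it assumes unclassified reads carry taxid 0 in the KrakenUniq output so that KrakenTools' `extract_kraken_reads.py -t 0` picks them up. Flags should be checked against each tool's `--help` for the installed version. By default the script only prints the commands; set `DRY_RUN=0` to actually run them.

```shell
#!/usr/bin/env bash
# Sketch only: host filtering with bowtie2, then two sequential KrakenUniq runs.
# All paths, index names, and DB names below are hypothetical placeholders.
set -euo pipefail

READS="${READS:-reads.fastq}"            # single-end input reads (placeholder)
HOST_INDEX="${HOST_INDEX:-human_index}"  # bowtie2 index prefix for the host (placeholder)
DB1="${DB1:-PROK}"                       # first KrakenUniq database
DB2="${DB2:-EUK_DRAFT}"                  # second KrakenUniq database
DRY_RUN="${DRY_RUN:-1}"                  # 1 = print commands only, 0 = execute them

# Print a command, and execute it only when DRY_RUN=0.
run() {
  printf '+ %s\n' "$*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

pipeline() {
  # 1. Align against the host; reads that fail to align are written to nonhost.fastq.
  run bowtie2 -x "$HOST_INDEX" -U "$READS" --un nonhost.fastq -S /dev/null

  # 2. Classify the non-host reads against the first database.
  run krakenuniq --db "$DB1" --report-file db1.report.tsv \
      --output db1.kraken.tsv nonhost.fastq

  # 3. Pull out reads left unclassified (assumed to be taxid 0) with KrakenTools.
  run extract_kraken_reads.py -k db1.kraken.tsv -s nonhost.fastq \
      -t 0 --fastq-output -o db1.unclassified.fastq

  # 4. Classify only those leftover reads against the second database.
  run krakenuniq --db "$DB2" --report-file db2.report.tsv \
      --output db2.kraken.tsv db1.unclassified.fastq
}

pipeline
```

Running each stage separately like this keeps one report per database, which makes it easy to see which DB each classification came from.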

gtonkinhill · 2024-01-11T01:21:27Z

Thanks very much for the quick response!
