Fungal reads - Which is the best database? #278

Open
andressamv opened this issue Feb 5, 2024 · 1 comment

Comments

@andressamv

Hi! I have been using Kaiju for a while, and I am now interested in filtering fungal reads. For this, I used the Kaiju app in KBase and compared the results from two databases: NCBI BLAST nr+euk (protein sequences from nr: Bacteria, Archaea, Viruses, Fungi, and microbial eukaryotes) and fungi (protein sequences from a representative set of fungal genomes). For the same samples, I expected the more comprehensive database (nr, since I thought RefSeq would be included in nr) to yield more fungal reads, but the fungi database gives far more hits. What could explain this?

@pmenzel
Member

pmenzel commented Feb 14, 2024

Hi! Not all RefSeq genomes are necessarily contained in the BLAST nr database, so it may well be that more reads get classified with the RefSeq fungi database.

You can manually check some of the reads that are classified with the RefSeq database but not with the nr database, and use the NCBI BLAST website to see whether they have good matches in nr.
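To pick out reads for that manual check, something like the following sketch might help. It assumes you have the two Kaiju output files in the standard tab-separated format (first column `C` or `U`, second column the read name, third column the taxon ID) and that they are available outside the KBase app; the function names are made up for illustration and are not part of Kaiju.

```python
import sys


def classified_reads(lines):
    """Return {read_name: taxon_id} for reads marked 'C' in Kaiju output lines."""
    reads = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Expected columns: status (C/U), read name, taxon ID, optional extras.
        if len(fields) >= 3 and fields[0] == "C":
            reads[fields[1]] = fields[2]
    return reads


def classified_only_in_first(first_lines, second_lines):
    """Sorted names of reads classified in the first output but not the second."""
    first = classified_reads(first_lines)
    second = classified_reads(second_lines)
    return sorted(name for name in first if name not in second)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (file names are placeholders): python compare.py fungi.out nr_euk.out
    with open(sys.argv[1]) as fungi_out, open(sys.argv[2]) as nr_out:
        for name in classified_only_in_first(fungi_out, nr_out):
            print(name)
```

The printed read names can then be used to pull the corresponding sequences from the FASTQ file and paste a few of them into the NCBI BLAST website.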
