krakenuniq-build error #155

Open
Sheerlik opened this issue Dec 5, 2023 · 6 comments

@Sheerlik

Sheerlik commented Dec 5, 2023

Hi,

I received an error when attempting to perform krakenuniq-build on the refseq genomes.

This is the command I use:
./scripts/krakenuniq-build --db DBDIR-temp --kmer-len 31 --threads 10 --taxids-for-genomes --taxids-for-sequences --jellyfish-bin 1

This is the error message:
Kraken build set to minimize disk writes.
Finding all library files
Found 1 sequence files (*.{fna,fa,ffn,fasta,fsa}) in the library directory.
Creating k-mer set (step 1 of 6)...
Using 1
/home/ubuntu/krakenuniq-1.0.4/scripts/build_db.sh: line 127: count_unique: command not found
xargs: cat: terminated by signal 13

(KrakenUniq version 1.0.4)

Thank you!
Sheerli

@jvolkening

./scripts/krakenuniq-build --db DBDIR-temp --kmer-len 31 --threads 10 --taxids-for-genomes --taxids-for-sequences --jellyfish-bin 1

If you specify --jellyfish-bin, it should be the path to the jellyfish v1 executable, not "1". I'm not sure if this is related to your error, since count_unique should be part of the krakenuniq install and is run prior to running jellyfish.

@Sheerlik

Sheerlik commented Dec 6, 2023

Thank you Jeremy for your response!

We did that:

krakenuniq-1.0.4_2 ./krakenuniq-build --db ./DBDIRmicrobial-nt/ --kmer-len 31 --threads 50 --taxids-for-genomes --taxids-for-sequences --jellyfish-bin

and now received a different error:

Kraken build set to minimize disk writes.
Found 1 sequence files (*.{fna,fa,ffn,fasta,fsa}) in the library directory.
Creating k-mer set (step 1 of 6)...
Using .
Hash size not specified, using '53187024'
/home/ubuntu/krakenuniq-1.0.4_2/build_db.sh: line 46: count: No such file or directory
xargs: cat: terminated by signal 13

Please advise.
Thank you!
Sheerli

@jvolkening

Hello Sheerli,

This is still due to having an empty value for --jellyfish-bin in your command. You need to have Jellyfish version 1 installed to use the KrakenUniq build commands. If it is installed and in your PATH already, you don't need to specify --jellyfish-bin on the command line. For instance, if you run jellyfish --version in a shell, do you get a version number or a 'No such file or directory' error? If the latter, you need to install jellyfish v.1 and then specify the path, e.g. --jellyfish-bin /path/to/jellyfish, substituting for the second part the actual path to the binary file.
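The check described above can be sketched in shell. This is only an illustration: the helper names `is_jellyfish_v1` and `check_jellyfish_bin` are hypothetical and not part of KrakenUniq, and the version patterns assume that `jellyfish --version` prints either a bare version number or `jellyfish <version>`.

```shell
# Hypothetical helper: succeed only if a version string looks like Jellyfish 1.x.
# Assumes the string is either "1.1.12" or "jellyfish 1.1.12" style output.
is_jellyfish_v1() {
    case "$1" in
        *" 1."*|1.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: check that a candidate binary exists and is version 1.
# With no argument it falls back to whatever "jellyfish" resolves to in PATH.
check_jellyfish_bin() {
    bin="${1:-jellyfish}"
    if ! command -v "$bin" >/dev/null 2>&1; then
        echo "not found: $bin"
        return 1
    fi
    version="$("$bin" --version 2>&1 | head -n 1)"
    if is_jellyfish_v1 "$version"; then
        echo "ok: $version"
    else
        echo "wrong version: $version"
        return 1
    fi
}
```

Running `check_jellyfish_bin /path/to/jellyfish` before the build should show whether the value intended for --jellyfish-bin points at a usable v1 binary.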

The easiest thing to do, in my opinion, is to install KrakenUniq through a package manager like Conda -- that will handle installing all of the dependencies for you.

@Sheerlik

Sheerlik commented Dec 6, 2023

Thank you Jeremy for your prompt answer!

You are right, our jellyfish version was version 2 (although jellyfish was not installed separately from krakenuniq).
We installed jellyfish-1.1.12.

We ran the command:
./krakenuniq-build --db ./DBDIRmicrobial-nt/ --kmer-len 31 --threads 30 --jellyfish-bin /home/ubuntu/krakenuniq-1.0.4_2/jellyfish-1.1.12/bin/jellyfish

and got this error:
Kraken build set to minimize disk writes.
Found 1 sequence files (*.{fna,fa,ffn,fasta,fsa}) in the library directory.
Creating k-mer set (step 1 of 6)...
Using /home/ubuntu/krakenuniq-1.0.4_2/jellyfish-1.1.12/bin/jellyfish
Hash size not specified, using '53187024'
Can't merge hashes with different reprobing stratgies
K-mer set created. [14.016s]
Skipping step 2, no database reduction requested.
Sorting k-mer set (step 3 of 6)...
db_sort: Getting database into memory ...db_sort: unable to open database.jdb: No such file or directory

Perhaps you would know how to solve this problem?

Thank you!
Sheerli

@jvolkening

Can't merge hashes with different reprobing stratgies

This is a jellyfish error -- I've not encountered it before. Are you sure you're not running out of disk space for the temporary files?

Hash size not specified, using '53187024'

This seems quite small based on my experience; if your input database is large, I think this will result in a large number of temporary files. Our build process specifies the hash size explicitly, and you could try this to see if it makes any difference. For instance, on a machine with 128GB RAM we use --jellyfish-hash-size 15000000000, which seems to be about the max possible without running out of memory. For smaller or larger instances we adjust the value proportionally.
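As a rough sketch of the proportional adjustment, the function below scales the 128GB reference value linearly with RAM. The name `scaled_hash_size` is hypothetical, and strict linear scaling is an assumption; the right value for a given machine may need tuning.

```shell
# Hypothetical helper: scale the reference point quoted in this thread
# (128 GB RAM -> hash size 15000000000) linearly to another RAM size in GB.
# Linear scaling is an assumption, not a documented KrakenUniq rule.
scaled_hash_size() {
    ram_gb="$1"
    echo $(( 15000000000 * ram_gb / 128 ))
}

# Possible use with the build command from this thread (64 GB machine assumed):
#   ./krakenuniq-build --db ./DBDIRmicrobial-nt/ --kmer-len 31 --threads 30 \
#       --jellyfish-hash-size "$(scaled_hash_size 64)"
```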

db_sort: unable to open database.jdb: No such file or directory

Almost certainly due to the previous jellyfish error, so the merged database was not written.
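Both the disk-space question and the missing merged database can be checked from the shell. The sketch below is illustrative only: the helper names are hypothetical, and it assumes database.jdb is written directly inside the directory passed to --db.

```shell
# Hypothetical helper: print the free space, in KB, on the filesystem
# that holds the given directory (POSIX df output, "Available" column).
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Hypothetical helper: succeed only if a non-empty database.jdb exists
# in the given database directory (location is an assumption).
has_merged_db() {
    [ -s "$1/database.jdb" ]
}
```

For example, `free_kb ./DBDIRmicrobial-nt/` before the build and `has_merged_db ./DBDIRmicrobial-nt/` after step 1 would show whether space is tight and whether the merge produced a file.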

@Sheerlik

Thank you Jeremy!
We changed the hash size and it looks like it's working.
I will let you know if we encounter additional build problems.
