Problem running kaiju-makedb on a cluster #239

Open
LeonardosMageiros opened this issue Sep 20, 2022 · 7 comments

@LeonardosMageiros

Hi,

I have installed kaiju on a cluster and, after creating a kaijudb directory, I am trying to run the following:
qsub -V -cwd ../bin/kaiju-makedb -s nr_euk

The job ends after a few seconds with the following error:
Error: File kaiju-taxonlistEuk.tsv not found in /opt/site/sge/default/spool/node616/job_scripts

I can see the file kaiju-taxonlistEuk.tsv located under /net/scratch2/mdehsnpa/CURE/raw_reads/Tools/kaiju/kaiju-master/bin/

Any idea how to solve this issue?

Thank you in advance for your help.

Best
Leonardos

@pmenzel
Member

pmenzel commented Sep 20, 2022

Hi,

kaiju-makedb expects the file kaiju-taxonlistEuk.tsv to be in the same directory as itself. So maybe you can just call kaiju-makedb in your script with the full path /net/scratch2/mdehsnpa/CURE/raw_reads/Tools/kaiju/kaiju-master/bin/kaiju-makedb instead of ../bin/kaiju-makedb.

@LeonardosMageiros
Author

Hi,

I am afraid that changing the command to
qsub -V -cwd /net/scratch2/mdehsnpa/CURE/raw_reads/Tools/kaiju/kaiju-master/bin/kaiju-makedb -s nr_euk
gives me exactly the same error.

I guess that when a job is submitted with qsub, the script is copied to a node, so $SCRIPTDIR takes the value of the directory the script is executed from on that node.

I think I solved the problem by commenting out SCRIPTDIR=$(dirname $0) and replacing it with
SCRIPTDIR="/net/scratch2/mdehsnpa/CURE/raw_reads/Tools/kaiju/kaiju-master/bin/"

Not the most elegant solution but the database I want is downloading at the moment.

Best
Leo
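
The behaviour described above can be reproduced without a cluster. The sketch below is a minimal illustration, not kaiju's actual code: `tool.sh`, `data.tsv` and the `TOOL_HOME` override are hypothetical stand-ins. It copies a script into a fake spool directory, roughly as a scheduler might when handed a script, and shows that `dirname "$0"` then points at the spool directory instead of the install directory.

```shell
#!/bin/sh
# Sketch: why a script that locates its data via dirname "$0" breaks
# once a scheduler runs a *copy* of it from a spool directory.
set -eu

work=$(mktemp -d)
mkdir -p "$work/bin" "$work/spool"

# Hypothetical stand-in for a tool that looks for a data file next to itself.
# TOOL_HOME is an illustrative override, not something kaiju-makedb provides.
cat > "$work/bin/tool.sh" <<'EOF'
#!/bin/sh
SCRIPTDIR=${TOOL_HOME:-$(dirname "$0")}
if [ -e "$SCRIPTDIR/data.tsv" ]; then echo "found"; else echo "missing"; fi
EOF
touch "$work/bin/data.tsv"

# Simulate the scheduler copying the submitted script to its spool area.
cp "$work/bin/tool.sh" "$work/spool/job_script"

direct=$(sh "$work/bin/tool.sh")          # run from the install directory
spooled=$(sh "$work/spool/job_script")    # run from the spool copy
overridden=$(TOOL_HOME="$work/bin" sh "$work/spool/job_script")

echo "direct=$direct spooled=$spooled overridden=$overridden"
rm -rf "$work"
```

A less invasive alternative to editing kaiju-makedb may be to submit a small wrapper script that calls kaiju-makedb by its full path: only the wrapper is copied to the spool directory, so `$0` inside kaiju-makedb still resolves to the real bin directory. Grid Engine also documents `qsub -b y`, which treats the command as a binary instead of copying it as a script; whether that behaves as expected on a given cluster is worth checking in the local `qsub` man page.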

@pmenzel
Member

pmenzel commented Sep 22, 2022

Sounds like a good solution to me! :)

@LeonardosMageiros
Author

Hi,
I believe that the download has finished successfully.
However, your instructions say that the files needed are kaiju_db_*.fmi, nodes.dmp, and names.dmp.

I can see the last 2:
image

However, I don't see the .fmi file. Is there something wrong?
The folder nr_euk contains the following:

image

Please let me know what you think.
Best
Leo

@pmenzel
Member

pmenzel commented Sep 27, 2022

Welp, it seems like the index creation did not work out. There should be more files in the nr_euk folder, including kaiju_db_nr_euk.fmi. Check the stdout and stderr logs of your cluster job to see what happened. You could also download the ready-made fmi files from https://kaiju.binf.ku.dk/server (they are from March 2022).
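
A quick way to tell whether a build finished is to check for the three files named in this thread. The following is a rough sketch under assumptions: `check_kaiju_db` is a hypothetical helper, and the layout (nodes.dmp and names.dmp in the working directory, the .fmi file inside the database folder) is inferred from the paths discussed here, so adjust it if your setup differs.

```shell
#!/bin/sh
# Sketch: report which expected kaiju-makedb outputs are missing or empty.
# Usage: check_kaiju_db <workdir> <dbname>   (hypothetical helper)
check_kaiju_db() {
    workdir=$1
    db=$2
    status=0
    for f in "$workdir/nodes.dmp" "$workdir/names.dmp" "$workdir/$db/kaiju_db_$db.fmi"; do
        # -s is true only for files that exist and are non-empty
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f"
            status=1
        fi
    done
    return $status
}

# Demo on a throwaway directory that mimics an interrupted build:
# the two .dmp files are present, the .fmi index is not.
demo=$(mktemp -d)
mkdir -p "$demo/nr_euk"
echo x > "$demo/nodes.dmp"
echo x > "$demo/names.dmp"
report=$(check_kaiju_db "$demo" nr_euk) && rc=0 || rc=$?
echo "$report"
rm -rf "$demo"
```

On a real run, calling `check_kaiju_db . nr_euk` from the kaijudb directory would be the equivalent check.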

@LeonardosMageiros
Author

My error output is the following:

17:53:45 Reading taxa from file /net/scratch2/mdehsnpa/CURE/raw_reads/Tools/kaiju/kaiju-master/bin//kaiju-taxonlistEuk.tsv
17:53:47 Reading accession to taxon id map from file nr_euk/prot.accession2taxid.gz
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
/opt/site/sge/default/spool/node619/job_scripts/3935168: line 254: 30180 Broken pipe gunzip -c $DB/nr.gz
30181 Aborted | kaiju-convertNR -m merged.dmp -t nodes.dmp -g $DB/prot.accession2taxid.gz -e $SCRIPTDIR/kaiju-excluded-accessions.txt -a -o $DB/kaiju_db_$DB.faa -l $SCRIPTDIR/kaiju-taxonlistEuk.tsv

I did download the zipped file from the server using wget, though.
I guess I just have to decompress the files and run kaiju?

Thx a lot once again

@pmenzel
Member

pmenzel commented Sep 28, 2022

> My error output is the following:
>
> 17:53:45 Reading taxa from file /net/scratch2/mdehsnpa/CURE/raw_reads/Tools/kaiju/kaiju-master/bin//kaiju-taxonlistEuk.tsv
> 17:53:47 Reading accession to taxon id map from file nr_euk/prot.accession2taxid.gz
> terminate called after throwing an instance of 'std::bad_alloc'
> what(): std::bad_alloc
> /opt/site/sge/default/spool/node619/job_scripts/3935168: line 254: 30180 Broken pipe gunzip -c $DB/nr.gz
> 30181 Aborted | kaiju-convertNR -m merged.dmp -t nodes.dmp -g $DB/prot.accession2taxid.gz -e $SCRIPTDIR/kaiju-excluded-accessions.txt -a -o $DB/kaiju_db_$DB.faa -l $SCRIPTDIR/kaiju-taxonlistEuk.tsv

Looks like the job did not have enough memory available. See the README.md for an estimate (lower bound) of the memory requirements for each database.
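
A `std::bad_alloc` means a C++ program failed to allocate memory, and because the job itself can still end quietly, it is easy to miss. As a minimal sketch, the hypothetical helper `scan_job_log` below greps a job's stderr log for a few common failure signatures; the pattern list is an assumption and by no means exhaustive. The sample log reuses lines from this thread.

```shell
#!/bin/sh
# Sketch: flag lines in a job log that suggest a step crashed or was killed.
# scan_job_log is a hypothetical helper; extend the pattern list as needed.
scan_job_log() {
    grep -n -E 'std::bad_alloc|Aborted|Killed|Broken pipe|Cannot allocate memory' "$1"
}

# Sample stderr content taken from the output quoted in this thread.
log=$(mktemp)
cat > "$log" <<'EOF'
17:53:47 Reading accession to taxon id map from file nr_euk/prot.accession2taxid.gz
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
EOF

scan_job_log "$log"
hits=$(scan_job_log "$log" | grep -c .)
echo "suspicious lines: $hits"
rm -f "$log"
```

If the scan does point to memory, the usual remedy on Grid Engine is to request more through a `-l` resource at submission. The resource name (often something like `h_vmem` or `mem_free`) and its limits are site-specific, so the cluster's own documentation is the place to confirm them.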

> I downloaded the zipped file from the server though using wget. I guess I just have to decompress the files and execute kaiju?

yep
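
For completeness, here is a rough sketch of the decompress step. The real archive name and its exact contents are not given in this thread, so the script builds a tiny stand-in archive (`db.tgz`, an assumed name and format) with empty placeholder files, extracts it, and lists the result. The kaiju call at the end is left as a comment because it needs real index files and reads; `reads.fastq` and `kaiju.out` are placeholder names.

```shell
#!/bin/sh
# Sketch: unpack a downloaded database archive and confirm what it contains.
set -eu

tmp=$(mktemp -d)

# Stand-in for the downloaded archive (assumption: a gzipped tar with these files).
mkdir "$tmp/src"
: > "$tmp/src/kaiju_db_nr_euk.fmi"
: > "$tmp/src/nodes.dmp"
: > "$tmp/src/names.dmp"
tar -czf "$tmp/db.tgz" -C "$tmp/src" .

# The actual step: extract into the database directory and list the files.
mkdir "$tmp/kaijudb"
tar -xzf "$tmp/db.tgz" -C "$tmp/kaijudb"
extracted=$(ls "$tmp/kaijudb" | sort | tr '\n' ' ')
echo "extracted: $extracted"

# With the real files in place, classification would then look like:
#   kaiju -t nodes.dmp -f kaiju_db_nr_euk.fmi -i reads.fastq -o kaiju.out

rm -rf "$tmp"
```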
