makedb progenomes fails #273

Open
Mewgia opened this issue Nov 23, 2023 · 9 comments

@Mewgia

Mewgia commented Nov 23, 2023

Hello!
I'm trying to download the proGenomes database and get this output:
Downloading taxdump.tar.gz
.listing [ <=> ] 1.85K --.-KB/s in 0.01s
2023-11-23 13:11:56 URL: ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz [1890] -> ".listing" [1]
taxdump.tar.gz 100%[=========================================================================>] 60.88M 13.6MB/s in 5.5s
2023-11-23 13:12:03 URL: ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz [63835104] -> "taxdump.tar.gz" [1]
Extracting taxdump.tar.gz
Downloading proGenomes database
https://progenomes.embl.de/data/repGenomes/freeze12.proteins.representatives.fasta.gz:
2023-11-23 13:12:16 ERROR 404: Not Found.

What should I do?
Thanks.

@pmenzel
Member

pmenzel commented Nov 24, 2023

Update kaiju to the latest version from GitHub. The URL for the proGenomes download has changed since release v1.9.2.

I should make a new release that includes this fix.

@Mewgia
Author

Mewgia commented Nov 28, 2023

Thanks, done.
But now there's a new error:
Downloading proGenomes database
progenomes3.proteins.representatives.f 100%[=========================================================================>] 23.78G 21.1MB/s in 17m 45s
2023-11-27 10:53:36 URL:https://progenomes.embl.de/data/repGenomes/progenomes3.proteins.representatives.fasta.bz2 [25536098227/25536098227] -> "progenomes/source/progenomes3.proteins.representatives.fasta.bz2" [1]
Downloading virus genomes from RefSeq
Extracting protein sequences from downloaded files
xargs: warning: options --max-args and --replace/-I/-i are mutually exclusive, ignoring previous --max-args value
Creating Borrows-Wheeler transform
infilename= progenomes/kaiju_db_progenomes.faa
outfilename= progenomes/kaiju_db_progenomes
Alphabet= ACDEFGHIKLMNPQRSTVWY
nThreads= 5
length= 0.000000
checkpoint= 3
caseSens=OFF
revComp=OFF
term= *
revsort=OFF
help=OFF
Sequences read time = 2346.803643s
SLEN 46856907323
NSEQ 141847069
ALPH *ACDEFGHIKLMNPQRSTVWY
Killed

The machine has 96 GB of RAM and runs Debian GNU/Linux 11 (bullseye).
Previously, I used kaiju with proGenomes on Debian jessie without any problems.

@pmenzel
Member

pmenzel commented Nov 28, 2023

Have a look at the table in the README for the memory requirements for building the kaiju index for each available reference database. For proGenomes v3 it is 120GB of RAM.
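A quick way to compare a machine against such a requirement is to read `MemTotal` from `/proc/meminfo`. The sketch below assumes Linux and bash. The function names are made up for illustration, and the 120 GB threshold is simply the figure quoted in this thread for proGenomes v3; check the README table for the database you actually build.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of kaiju.

# Print total system memory in whole GB (MemTotal is reported in kB).
# The meminfo path is a parameter so the function can be tried on a sample file.
mem_total_gb() {
    local meminfo="${1:-/proc/meminfo}"
    awk '/^MemTotal:/ { print int($2 / 1024 / 1024) }' "$meminfo"
}

# Succeed if total memory is at least the required number of GB.
has_enough_ram() {
    local required_gb="$1"
    local meminfo="${2:-/proc/meminfo}"
    local total_gb
    total_gb="$(mem_total_gb "$meminfo")"
    [ "${total_gb:-0}" -ge "$required_gb" ]
}

# Example: 120 is the value mentioned above for proGenomes v3.
if has_enough_ram 120; then
    echo "Total RAM looks sufficient for the index build."
else
    echo "Total RAM is below 120 GB; the index build may be killed."
fi
```

Note that `MemTotal` is usually slightly less than the installed memory, and other processes on a shared server reduce what is actually available.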

@EorgeKit

EorgeKit commented Dec 3, 2023

> update kaiju to the latest version from GitHub. The URL for the progenomes download changed since release v1.9.2.
>
> I should make a new release including this fix..

I would like to install the latest version because I have the same issue. Unfortunately, the conda version I downloaded still reports v1.9.2, even though the bioconda page says it is 1.10.0, and `kaiju-makedb -s progenomes` still gives the same "Not Found" error. Can you please check? I can't install from GitHub because I am working on a cluster and don't have sudo permissions, so I would have to wait until they resolve my installation ticket, which sometimes takes days.

conda create -n kaiju2 -y  -c bioconda kaiju=1.10.0
Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/maloo/.conda/envs/kaiju2

  added / updated specs:
    - kaiju=1.10.0


The following NEW packages will be INSTALLED:

  _libgcc_mutex      conda-forge/linux-64::_libgcc_mutex-0.1-conda_forge 
  _openmp_mutex      conda-forge/linux-64::_openmp_mutex-4.5-2_gnu 
  bzip2              conda-forge/linux-64::bzip2-1.0.8-hd590300_5 
  c-ares             conda-forge/linux-64::c-ares-1.23.0-hd590300_0 
  ca-certificates    conda-forge/linux-64::ca-certificates-2023.11.17-hbcca054_0 
  curl               conda-forge/linux-64::curl-8.4.0-hca28451_0 
  gettext            conda-forge/linux-64::gettext-0.21.1-h27087fc_0 
  kaiju              bioconda/linux-64::kaiju-1.10.0-h43eeafb_0 
  keyutils           conda-forge/linux-64::keyutils-1.6.1-h166bdaf_0 
  krb5               conda-forge/linux-64::krb5-1.21.2-h659d440_0 
  ld_impl_linux-64   conda-forge/linux-64::ld_impl_linux-64-2.40-h41732ed_0 
  libcurl            conda-forge/linux-64::libcurl-8.4.0-hca28451_0 
  libedit            conda-forge/linux-64::libedit-3.1.20191231-he28a2e2_2 
  libev              conda-forge/linux-64::libev-4.33-h516909a_1 
  libexpat           conda-forge/linux-64::libexpat-2.5.0-hcb278e6_1 
  libffi             conda-forge/linux-64::libffi-3.4.2-h7f98852_5 
  libgcc-ng          conda-forge/linux-64::libgcc-ng-13.2.0-h807b86a_3 
  libgomp            conda-forge/linux-64::libgomp-13.2.0-h807b86a_3 
  libidn2            conda-forge/linux-64::libidn2-2.3.4-h166bdaf_0 
  libnghttp2         conda-forge/linux-64::libnghttp2-1.58.0-h47da74e_0 
  libnsl             conda-forge/linux-64::libnsl-2.0.1-hd590300_0 
  libsqlite          conda-forge/linux-64::libsqlite-3.44.2-h2797004_0 
  libssh2            conda-forge/linux-64::libssh2-1.11.0-h0841786_0 
  libstdcxx-ng       conda-forge/linux-64::libstdcxx-ng-13.2.0-h7e041cc_3 
  libunistring       conda-forge/linux-64::libunistring-0.9.10-h7f98852_0 
  libuuid            conda-forge/linux-64::libuuid-2.38.1-h0b41bf4_0 
  libzlib            conda-forge/linux-64::libzlib-1.2.13-hd590300_5 
  ncurses            conda-forge/linux-64::ncurses-6.4-h59595ed_2 
  openssl            conda-forge/linux-64::openssl-3.2.0-hd590300_1 
  perl               conda-forge/linux-64::perl-5.32.1-4_hd590300_perl5 
  pip                conda-forge/noarch::pip-23.3.1-pyhd8ed1ab_0 
  python             conda-forge/linux-64::python-3.12.0-hab00c5b_0_cpython 
  readline           conda-forge/linux-64::readline-8.2-h8228510_1 
  setuptools         conda-forge/noarch::setuptools-68.2.2-pyhd8ed1ab_0 
  tk                 conda-forge/linux-64::tk-8.6.13-noxft_h4845f30_101 
  tzdata             conda-forge/noarch::tzdata-2023c-h71feb2d_0 
  wget               conda-forge/linux-64::wget-1.20.3-ha35d2d1_1 
  wheel              conda-forge/noarch::wheel-0.42.0-pyhd8ed1ab_0 
  xz                 conda-forge/linux-64::xz-5.2.6-h166bdaf_0 
  zlib               conda-forge/linux-64::zlib-1.2.13-hd590300_5 
  zstd               conda-forge/linux-64::zstd-1.5.5-hfc55251_0 



Downloading and Extracting Packages

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate kaiju2
#
# To deactivate an active environment, use
#
#     $ conda deactivate
conda activate kaiju2
 kaiju
Error: Please specify the location of the nodes.dmp file, using the -t option.

Kaiju 1.9.2
Copyright 2015-2022 Peter Menzel, Anders Krogh
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
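One thing worth checking in a situation like this is which `kaiju` binary the shell actually resolves, since an older copy earlier on `PATH` would produce this mismatch. That cause is only an assumption here; the thread does not confirm it. The sketch below assumes bash. The helper name is hypothetical, and it relies on the `Kaiju <version>` line that `kaiju` prints in the output shown above.

```shell
#!/usr/bin/env bash
# Sketch only: kaiju_version_from_usage is a made-up helper name.

# Read kaiju's usage/error text on stdin and print the version number,
# based on the "Kaiju 1.9.2" style line shown in the output above.
kaiju_version_from_usage() {
    grep -m1 -oE 'Kaiju [0-9]+(\.[0-9]+)*' | awk '{ print $2 }'
}

if command -v kaiju >/dev/null 2>&1; then
    # Which file is run when typing "kaiju"?
    echo "kaiju resolves to: $(command -v kaiju)"
    # kaiju without arguments prints its usage text, including the version.
    echo "reported version:  $(kaiju 2>&1 | kaiju_version_from_usage)"
else
    echo "kaiju is not on PATH"
fi

# What conda believes is installed in the environment (kaiju2 as in this thread).
if command -v conda >/dev/null 2>&1; then
    conda list -n kaiju2 '^kaiju$' || true
fi
```

If the resolved path lies outside the `kaiju2` environment directory, the environment's binary is not the one being run.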

@pmenzel
Member

pmenzel commented Dec 3, 2023

I just tried to install v1.10.0 via conda and got the correct version. You can also just compile it from source and run it without needing root.
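Compiling in a home directory could look roughly like the sketch below. It assumes `git`, `make` and a C++ compiler are available on the cluster, and that the repository layout (sources in `src/`, binaries placed in `bin/`) matches the kaiju README. The function name and the `$HOME/tools` location are arbitrary choices for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: build_kaiju is a hypothetical wrapper, not a kaiju command.

# Clone kaiju below the given directory, compile it, and print the bin path.
# Nothing here needs root; everything stays inside the chosen directory.
build_kaiju() {
    local prefix="$1"
    mkdir -p "$prefix" &&
    git clone https://github.com/bioinformatics-centre/kaiju.git "$prefix/kaiju" &&
    make -C "$prefix/kaiju/src" &&
    echo "$prefix/kaiju/bin"
}

# Example usage (commented out because it needs network access):
#   bindir="$(build_kaiju "$HOME/tools" | tail -n 1)"
#   export PATH="$bindir:$PATH"
#   kaiju-makedb -s progenomes
```

Putting the new `bin` directory at the front of `PATH` makes the freshly built `kaiju-makedb` take precedence over any older installation.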

@Jeffery-Ni

I have encountered the same problem:

Extracting taxdump.tar.gz
Creating Borrows-Wheeler transform
infilename= refseq/kaiju_db_refseq.faa
outfilename= refseq/kaiju_db_refseq
Alphabet= ACDEFGHIKLMNPQRSTVWY
nThreads= 10
length= 0.000000
checkpoint= 3
caseSens=OFF
revComp=OFF
term= *
revsort=OFF
help=OFF
Sequences read time = 339.643674s
SLEN 50636639214
NSEQ 155772604
ALPH *ACDEFGHIKLMNPQRSTVWY
/home/Public/Anaconda3/ENTER/envs/kaiju/bin/kaiju-makedb: line 261: 1817838 Killed kaiju-mkbwt -n $threadsBWT -e $exponentSA -a ACDEFGHIKLMNPQRSTVWY -o $DB/kaiju_db_$DB $DB/kaiju_db_$DB.faa

Is this a problem with RAM?

@pmenzel
Member

pmenzel commented Apr 19, 2024

Probably. See the README for required RAM for each reference database. You can also download premade indexes.
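To confirm that a `Killed` message came from the kernel's out-of-memory killer, the kernel log can be searched for OOM entries. This is a generic Linux check, not something kaiju provides. The helper name is hypothetical, and reading the kernel log may require elevated privileges on some systems.

```shell
#!/usr/bin/env bash
# Sketch only: filter_oom_events is a made-up helper name.

# Keep only kernel log lines that look like OOM-killer activity.
filter_oom_events() {
    grep -Ei 'out of memory|oom-kill|killed process'
}

# Try the kernel ring buffer; fall back to a message if nothing matches
# or if dmesg is not readable for this user.
dmesg 2>/dev/null | filter_oom_events ||
    echo "No OOM entries found (or the kernel log is not readable)."
```

An entry naming `kaiju-mkbwt` as the killed process would point to insufficient memory for the index build.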

@Jeffery-Ni

But the indexes offered on the kaiju website are a little outdated. Is there a way to build the index locally without this much RAM? The server I work on has about 115 GB of free RAM, just short of what is needed to build the current refseq index.

@pmenzel
Member

pmenzel commented Apr 19, 2024

You won't lose much by using the index file from last year. The memory requirements cannot be reduced.

4 participants