Error setting up databases DRAM_data/tmp/10969822372036018711/createindex.sh: line 56: 1115772 Killed #317

Open
ileleiwi opened this issue Dec 6, 2023 · 0 comments

Hello,

I'm having difficulty setting up the databases for DRAM on the server I'm using. The command I'm running is:

DRAM-setup.py prepare_databases --output_dir DRAM_data --uniref_loc /global/cfs/cdirs/m3264/databases/dram_database_files/uniref90.fasta.gz --vogdb_loc /global/cfs/cdirs/m3264/databases/dram_database_files/vog.hmm.tar.gz --pfam_loc /global/cfs/cdirs/m3264/databases/dram_database_files/Pfam-A.full.gz --verbose

The output I receive is as follows:

/global/homes/l/leleiwi1/.conda/envs/DRAM/lib/python3.10/site-packages/mag_annotator/database_handler.py:123: UserWarning: Database does not exist at path None
  warnings.warn("Database does not exist at path %s" % description_loc)
2023-12-06 09:17:24,802 - Starting the process of downloading data
2023-12-06 09:17:24,829 - The kegg_loc argument was not used to specify a downloaded kegg file, and dram can not download it its self. So it is assumed that the user wants to set up DRAM without it
2023-12-06 09:17:24,829 - The gene_ko_link_loc argument was not used to specify a downloaded gene_ko_link file, and dram can not download it its self. So it is assumed that the user wants to set up DRAM without it
2023-12-06 09:17:24,829 - Database preparation started
2023-12-06 09:17:24,829 - Downloading kofam_hmm
downloading ftp://ftp.genome.jp/pub/db/kofam/profiles.tar.gz
2023-12-06 09:22:29,886 - Downloading kofam_ko_list
downloading ftp://ftp.genome.jp/pub/db/kofam/ko_list.gz
2023-12-06 09:22:35,761 - Downloading pfam_hmm
downloading ftp://ftp.ebi.ac.uk/pub/databases/Pfam/current_release/Pfam-A.hmm.dat.gz
2023-12-06 09:22:38,585 - Downloading dbcan
downloading http://bcb.unl.edu/dbCAN2/download/dbCAN-HMMdb-V11.txt
2023-12-06 09:22:40,827 - Downloading dbcan_fam_activities
2023-12-06 09:22:40,827 - Downloading dbCAN family activities from : https://bcb.unl.edu/dbCAN2/download/Databases/V11/CAZyDB.08062022.fam-activities.txt
downloading https://bcb.unl.edu/dbCAN2/download/Databases/V11/CAZyDB.08062022.fam-activities.txt
2023-12-06 09:22:41,084 - Downloading dbcan_subfam_ec
2023-12-06 09:22:41,084 - Downloading dbCAN sub-family encumber from : https://bcb.unl.edu/dbCAN2/download/Databases/V11/CAZyDB.08062022.fam.subfam.ec.txt
downloading https://bcb.unl.edu/dbCAN2/download/Databases/V11/CAZyDB.08062022.fam.subfam.ec.txt
2023-12-06 09:22:41,500 - Downloading vog_annotations
downloading http://fileshare.csb.univie.ac.at/vog/latest/vog.annotations.tsv.gz
2023-12-06 09:22:45,785 - Downloading viral
downloading ftp://ftp.ncbi.nlm.nih.gov/refseq/release/viral/viral.1.protein.faa.gz
2023-12-06 09:22:48,877 - Downloading peptidase
downloading ftp://ftp.ebi.ac.uk/pub/databases/merops/current_release/pepunit.lib
2023-12-06 09:22:55,874 - Downloading genome_summary_form
downloading https://raw.githubusercontent.com/WrightonLabCSU/DRAM/master/data/genome_summary_form.tsv
2023-12-06 09:22:56,179 - Downloading module_step_form
downloading https://raw.githubusercontent.com/WrightonLabCSU/DRAM/master/data/module_step_form.tsv
2023-12-06 09:22:56,580 - Downloading function_heatmap_form
downloading https://raw.githubusercontent.com/WrightonLabCSU/DRAM/master/data/function_heatmap_form.tsv
2023-12-06 09:22:56,769 - Downloading amg_database
downloading https://raw.githubusercontent.com/WrightonLabCSU/DRAM/master/data/amg_database.tsv
2023-12-06 09:22:56,938 - Downloading etc_module_database
downloading https://raw.githubusercontent.com/WrightonLabCSU/DRAM/master/data/etc_module_database.tsv
2023-12-06 09:22:57,132 - All raw data files were downloaded successfully
2023-12-06 09:22:57,132 - Processing uniref
2023-12-06 10:00:26,828 - The subcommand ['mmseqs', 'createindex', 'DRAM_data/uniref90.20231206.mmsdb', 'DRAM_data/tmp', '--threads', '10'] experienced an error: DRAM_data/tmp/10969822372036018711/createindex.sh: line 56: 1115772 Killed                  "$MMSEQS" $INDEXER "$INPUT" "$INPUT" ${INDEX_PAR}

Traceback (most recent call last):
  File "/global/homes/l/leleiwi1/.conda/envs/DRAM/bin/DRAM-setup.py", line 184, in <module>
    args.func(**args_dict)
  File "/global/homes/l/leleiwi1/.conda/envs/DRAM/lib/python3.10/site-packages/mag_annotator/database_processing.py", line 555, in prepare_databases
    processed_locs = process_functions[i](locs[i], output_dir, LOGGER,
  File "/global/homes/l/leleiwi1/.conda/envs/DRAM/lib/python3.10/site-packages/mag_annotator/database_processing.py", line 262, in process_uniref
    make_mmseqs_db(uniref_fasta_zipped, uniref_mmseqs_db, logger, create_index=True, threads=threads, verbose=verbose)
  File "/global/homes/l/leleiwi1/.conda/envs/DRAM/lib/python3.10/site-packages/mag_annotator/utils.py", line 98, in make_mmseqs_db
    run_process(['mmseqs', 'createindex', output_loc, tmp_dir, '--threads', str(threads)], logger, verbose=verbose)
  File "/global/homes/l/leleiwi1/.conda/envs/DRAM/lib/python3.10/site-packages/mag_annotator/utils.py", line 71, in run_process
    raise subprocess.SubprocessError(f"The subcommand {' '.join(command)} experienced an error, see the log for more info.")
subprocess.SubprocessError: The subcommand mmseqs createindex DRAM_data/uniref90.20231206.mmsdb DRAM_data/tmp --threads 10 experienced an error, see the log for more info.
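
The `Killed` in the log line above is what bash prints when a child process is terminated by SIGKILL. On Linux this is commonly the kernel's out-of-memory killer, although a job scheduler or an administrator can send the same signal; the log alone does not say which. A minimal Python sketch of how such a termination looks to a parent process (the helper name `describe_returncode` is hypothetical and not part of DRAM or MMseqs2):

```python
import signal
import subprocess
import sys


def describe_returncode(returncode: int) -> str:
    """Turn a subprocess return code into a readable description.

    On POSIX, subprocess reports a child killed by signal N as -N.
    """
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        name = signal.Signals(-returncode).name
        return f"terminated by {name}"
    return f"exited with status {returncode}"


if __name__ == "__main__":
    # Demo child that sends SIGKILL to itself, standing in for a process
    # killed from outside (for example by the out-of-memory killer).
    child = subprocess.run(
        [sys.executable, "-c",
         "import os, signal; os.kill(os.getpid(), signal.SIGKILL)"]
    )
    print(describe_returncode(child.returncode))
```

If the indexing step was killed for memory reasons, the kernel log on the node or the scheduler's job accounting would normally record it; whether that applies here is an assumption that would need checking on the server.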

Do you know why this is happening, and can you help me solve the problem? I'm running DRAM version 1.4.6 in a conda environment. The databases all seem to download fine; only this processing step with mmseqs2 fails.
I noticed that the DRAM_data/tmp/10969822372036018711/createindex.sh file referenced in the error is created with the file permissions -rwx------.
I thought it could be a file permission problem, since line 56 is the last line of that file. However, when I change the permissions of the file and run bash DRAM_data/tmp/10969822372036018711/createindex.sh, I get the output "Please provide <sequenceDB> <tmp>", and there seems to be no way to restart the original command when the DRAM_data folder already exists.
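
Because `createindex.sh` expects arguments, the failing step could in principle be re-run on its own by calling `mmseqs createindex` with the same arguments that appear in the log. The sketch below assumes the `DRAM_data/uniref90.20231206.mmsdb` database from the failed run is still intact. `--split-memory-limit` is an MMseqs2 option for capping memory use per split; the `64G` value is only a placeholder, and `build_createindex_cmd` is a hypothetical helper, not DRAM's API.

```python
import shutil
import subprocess
from typing import List, Optional


def build_createindex_cmd(db: str, tmp_dir: str, threads: int,
                          split_memory_limit: Optional[str] = None) -> List[str]:
    """Build the mmseqs createindex command shown in the DRAM log."""
    cmd = ["mmseqs", "createindex", db, tmp_dir, "--threads", str(threads)]
    if split_memory_limit is not None:
        # Assumption: cap memory per split so indexing fits in the node's RAM.
        cmd += ["--split-memory-limit", split_memory_limit]
    return cmd


if __name__ == "__main__":
    cmd = build_createindex_cmd(
        "DRAM_data/uniref90.20231206.mmsdb", "DRAM_data/tmp", 10,
        split_memory_limit="64G",  # placeholder value, adjust to the machine
    )
    print(" ".join(cmd))
    if shutil.which("mmseqs") is None:
        print("mmseqs not found on PATH; command not run")
    else:
        result = subprocess.run(cmd)
        print("return code:", result.returncode)
```

This only rebuilds the index for the UniRef database; it does not resume the remaining steps of `prepare_databases`.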

Thanks for your help,
Kai
