mmseq search: swapresults died #814

Open
tjmier opened this issue Feb 14, 2024 · 0 comments

Intention

I am trying to perform an exhaustive all-against-all search of a large database with sensitivity similar to BLAST. The database contains approximately 73,000 sequences with an average length of around 300 amino acids. The process is killed while swapresults is reading results, and the search exits with the error "swapresults died".

Running on Linux Mint with MMseqs2 release 15-6f452.

Output

cahoonlab@lagbb-bcecmint:~$ mmseqs search Documents/01_FAD_FAH_90_DB/FAD_FAH_90 Documents/01_FAD_FAH_90_DB/FAD_FAH_90 alnment/alignment tmp --exhaustive-search -s 8
search Documents/01_FAD_FAH_90_DB/FAD_FAH_90 Documents/01_FAD_FAH_90_DB/FAD_FAH_90 alnment/alignment tmp --exhaustive-search -s 8

MMseqs Version: 78ae2c5
Substitution matrix aa:blosum62.out,nucl:nucleotide.out
Add backtrace false
Alignment mode 2
Alignment mode 0
Allow wrapped scoring false
E-value threshold 0.001
Seq. id. threshold 0
Min alignment length 0
Seq. id. mode 0
Alternative alignments 0
Coverage threshold 0
Coverage mode 0
Max sequence length 65535
Compositional bias 1
Compositional bias 1
Max reject 2147483647
Max accept 2147483647
Include identical seq. id. false
Preload mode 0
Pseudo count a substitution:1.100,context:1.400
Pseudo count b substitution:4.100,context:5.800
Score bias 0
Realign hits false
Realign score bias -0.2
Realign max seqs 2147483647
Correlation score weight 0
Gap open cost aa:11,nucl:5
Gap extension cost aa:1,nucl:2
Zdrop 40
Threads 16
Compressed 0
Verbosity 3
Seed substitution matrix aa:VTML80.out,nucl:nucleotide.out
Sensitivity 8
k-mer length 0
Target search mode 0
k-score seq:2147483647,prof:2147483647
Alphabet size aa:21,nucl:5
Max results per query 300
Split database 0
Split mode 2
Split memory limit 0
Diagonal scoring true
Exact k-mer matching 0
Mask residues 1
Mask residues probability 0.9
Mask lower case residues 0
Minimum diagonal score 15
Selected taxa
Spaced k-mers 1
Spaced k-mer pattern
Local temporary path
Rescore mode 0
Remove hits by seq. id. and coverage false
Sort results 0
Mask profile 1
Profile E-value threshold 0.1
Global sequence weighting false
Allow deletions false
Filter MSA 1
Use filter only at N seqs 0
Maximum seq. id. threshold 0.9
Minimum seq. id. 0.0
Minimum score per column -20
Minimum coverage 0
Select N most diverse seqs 1000
Pseudo count mode 0
Min codons in orf 30
Max codons in length 32734
Max orf gaps 2147483647
Contig start mode 2
Contig end mode 2
Orf start mode 1
Forward frames 1,2,3
Reverse frames 1,2,3
Translation table 1
Translate orf 0
Use all table starts false
Offset of numeric ids 0
Create lookup 0
Add orf stop false
Overlap between sequences 0
Sequence split mode 1
Header split mode 0
Chain overlapping alignments 0
Merge query 1
Search type 0
Search iterations 1
Start sensitivity 4
Search steps 1
Prefilter mode 0
Exhaustive search mode true
Filter results during exhaustive search 0
Strand selection 1
LCA search mode false
Disk space limit 0
MPI runner
Force restart with latest tmp false
Remove temporary files false

prefilter tmp/5233442526903138997/profileDB Documents/01_FAD_FAH_90_DB/FAD_FAH_90 tmp/5233442526903138997/pref --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' --seed-sub-mat 'aa:VTML80.out,nucl:nucleotide.out' -s 8 -k 0 --target-search-mode 0 --k-score seq:2147483647,prof:2147483647 --alph-size aa:21,nucl:5 --max-seq-len 65535 --max-seqs 75233 --split 0 --split-mode 2 --split-memory-limit 0 -c 0 --cov-mode 0 --comp-bias-corr 1 --comp-bias-corr-scale 1 --diag-score 1 --exact-kmer-matching 0 --mask 1 --mask-prob 0.9 --mask-lower-case 0 --min-ungapped-score 15 --add-self-matches 0 --spaced-kmer-mode 1 --db-load-mode 0 --pca substitution:1.100,context:1.400 --pcb substitution:4.100,context:5.800 --threads 16 --compressed 0 -v 3

Query database size: 75233 type: Aminoacid
Estimated memory consumption: 1G
Target database size: 75233 type: Aminoacid
Index table k-mer threshold: 91 at k-mer size 6
Index table: counting k-mers
[=================================================================] 100.00% 75.23K 0s 271ms
Index table: Masked residues: 52417
Index table: fill
[=================================================================] 100.00% 75.23K 0s 415ms
Index statistics
Entries: 23045881
DB size: 620 MB
Avg k-mer size: 0.360092
Top 10 k-mers
GGNQHH 4218
NTSHHH 3502
NYHFDY 2183
LEVYHY 2100
VTDHHH 1805
TPMRHS 1770
GWNHFP 1732
LIWRGT 1732
GLYIHL 1684
WAHVSS 1682
Time for index table init: 0h 0m 1s 16ms
Process prefiltering step 1 of 1

k-mer similarity threshold: 91
Starting prefiltering scores calculation (step 1 of 1)
Query db start 1 to 75233
Target db start 1 to 75233
[=================================================================] 100.00% 75.23K 8m 43s 269ms

3286.821735 k-mers per position
713534 DB matches per sequence
1 overflows
26270 sequences passed prefiltering per query sequence
26950 median result list length
0 sequences with 0 size result lists
Time for merging to pref: 0h 0m 0s 15ms
Time for processing: 0h 8m 44s 825ms
result2stats tmp/5233442526903138997/profileDB Documents/01_FAD_FAH_90_DB/FAD_FAH_90 tmp/5233442526903138997/pref tmp/5233442526903138997/pref_count.tsv --stat linecount --tsv --threads 16 --compressed 0 -v 3

[=================================================================] 100.00% 75.23K 1s 543ms
Time for merging to pref_count.tsv: 0h 0m 0s 20ms
Time for processing: 0h 0m 1s 810ms
align tmp/5233442526903138997/profileDB Documents/01_FAD_FAH_90_DB/FAD_FAH_90 tmp/5233442526903138997/pref tmp/5233442526903138997/aln --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' -a 0 --alignment-mode 2 --alignment-output-mode 1 --wrapped-scoring 0 -e 0.001 --min-seq-id 0 --min-aln-len 0 --seq-id-mode 0 --alt-ali 0 -c 0 --cov-mode 0 --max-seq-len 65535 --comp-bias-corr 1 --comp-bias-corr-scale 1 --max-rejected 2147483647 --max-accept 2147483647 --add-self-matches 0 --db-load-mode 0 --pca substitution:1.100,context:1.400 --pcb substitution:4.100,context:5.800 --score-bias 0 --realign 0 --realign-score-bias -0.2 --realign-max-seqs 2147483647 --corr-score-weight 0 --gap-open aa:11,nucl:5 --gap-extend aa:1,nucl:2 --zdrop 40 --threads 16 --compressed 0 -v 3

Compute score and coverage
Query database size: 75233 type: Aminoacid
Target database size: 75233 type: Aminoacid
Calculation of alignments
[=================================================================] 100.00% 75.23K 1h 41m 40s 149ms
Time for merging to aln: 0h 0m 0s 11ms
1976417423 alignments calculated
672599536 sequence pairs passed the thresholds (0.340313 of overall calculated)
8940.219727 hits per query sequence
Time for processing: 1h 41m 41s 324ms
rmdb tmp/5233442526903138997/pref -v 3

Time for processing: 0h 0m 0s 616ms
mvdb tmp/5233442526903138997/aln tmp/5233442526903138997/aln_merged -v 3

Time for processing: 0h 0m 0s 0ms
align /home/cahoonlab/Documents/01_FAD_FAH_90_DB/FAD_FAH_90 Documents/01_FAD_FAH_90_DB/FAD_FAH_90 tmp/5233442526903138997/aln_merged tmp/5233442526903138997/aln --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' -a 0 --alignment-mode 2 --alignment-output-mode 0 --wrapped-scoring 0 -e 0.001 --min-seq-id 0 --min-aln-len 0 --seq-id-mode 0 --alt-ali 0 -c 0 --cov-mode 0 --max-seq-len 65535 --comp-bias-corr 1 --comp-bias-corr-scale 1 --max-rejected 2147483647 --max-accept 2147483647 --add-self-matches 0 --db-load-mode 0 --pca substitution:1.100,context:1.400 --pcb substitution:4.100,context:5.800 --score-bias 0 --realign 0 --realign-score-bias -0.2 --realign-max-seqs 2147483647 --corr-score-weight 0 --gap-open aa:11,nucl:5 --gap-extend aa:1,nucl:2 --zdrop 40 --threads 16 --compressed 0 -v 3

Compute score and coverage
Query database size: 75233 type: Aminoacid
Target database size: 75233 type: Aminoacid
Calculation of alignments
[=================================================================] 100.00% 75.23K 1h 9m 57s 768ms
Time for merging to aln: 0h 0m 0s 17ms
672599536 alignments calculated
672599536 sequence pairs passed the thresholds (1.000000 of overall calculated)
8940.219727 hits per query sequence
Time for processing: 1h 9m 58s 713ms
rmdb tmp/5233442526903138997/aln_merged -v 3

Time for processing: 0h 0m 0s 73ms
swapresults /home/cahoonlab/Documents/01_FAD_FAH_90_DB/FAD_FAH_90 Documents/01_FAD_FAH_90_DB/FAD_FAH_90 tmp/5233442526903138997/aln alnment/alignment --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' -e 1.79769e+308 --split-memory-limit 0 --gap-open aa:11,nucl:5 --gap-extend aa:1,nucl:2 --threads 16 --compressed 0 --db-load-mode 0 -v 3

Computing offsets.
[=================================================================] 100.00% 75.23K 8s 163ms

Reading results.
Killed============================> ] 51.07% 38.42K eta 6s
Error: swapresults died
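
For scale, here is a rough back-of-envelope estimate of the result volume that swapresults has to read, using the pair and query counts from the align log above. The 40 bytes per stored hit is an assumption for illustration only, not a measured value, and I have not confirmed that memory is what caused the process to be killed.

```shell
#!/usr/bin/env bash
# Counts copied from the align step in the log above.
pairs=672599536    # sequence pairs that passed the thresholds
queries=75233      # query database size

# ASSUMPTION: roughly 40 bytes per stored hit; the real record size was not measured.
bytes_per_hit=40

# Integer arithmetic only; the fractional parts are dropped.
hits_per_query=$(( pairs / queries ))
total_gib=$(( pairs * bytes_per_hit / 1024 / 1024 / 1024 ))

echo "hits per query (integer part): ${hits_per_query}"
echo "approximate result size: ${total_gib} GiB"
```

Under that assumption the alignment results come to roughly 25 GiB, which can be compared against the RAM available on the machine.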
