.dbtype already exists error when clustering using profiles #844

schmittel · 2024-05-06T18:21:48Z

Hi,

I'm having difficulty clustering using profiles when following the instructions in the wiki. Specifically, I'm referring to this section:

# extract consensus sequences from profiles
mmseqs profile2consensus profileDB1 profileDB1_consensus
# search with profiles against consensus sequences of seqDB1
mmseqs search profileDB1 profileDB1_consensus resultDB2 tmp --add-self-matches -a # Add your cluster criteria here
# cluster the results 
mmseqs clust profileDB1 resultDB2 profileDB1_clu

I can run mmseqs search without issue, but when I run mmseqs clust I get the following error:

Create directory /final/db_cluster/low_1/Genus02938/Genus02938_DB
cluster /final/db_profile/low_1/Genus02938/Genus02938_DB /final/db_profile_vs_consensus/low_1/Genus02938/Genus02938_DB /final/db_cluster/low_1/Genus02938/Genus02938_DB

MMseqs Version:                         15.6f452
Substitution matrix                     aa:blosum62.out,nucl:nucleotide.out
Seed substitution matrix                aa:VTML80.out,nucl:nucleotide.out
Sensitivity                             4
k-mer length                            0
Target search mode                      0
k-score                                 seq:2147483647,prof:2147483647
Alphabet size                           aa:21,nucl:5
Max sequence length                     65535
Max results per query                   20
Split database                          0
Split mode                              2
Split memory limit                      0
Coverage threshold                      0.8
Coverage mode                           0
Compositional bias                      1
Compositional bias                      1
Diagonal scoring                        true
Exact k-mer matching                    0
Mask residues                           1
Mask residues probability               0.9
Mask lower case residues                0
Minimum diagonal score                  15
Selected taxa
Include identical seq. id.              false
Spaced k-mers                           1
Preload mode                            0
Pseudo count a                          substitution:1.100,context:1.400
Pseudo count b                          substitution:4.100,context:5.800
Spaced k-mer pattern
Local temporary path
Threads                                 144
Compressed                              0
Verbosity                               3
Add backtrace                           false
Alignment mode                          3
Alignment mode                          0
Allow wrapped scoring                   false
E-value threshold                       0.001
Seq. id. threshold                      0
Min alignment length                    0
Seq. id. mode                           0
Alternative alignments                  0
Max reject                              2147483647
Max accept                              2147483647
Score bias                              0
Realign hits                            false
Realign score bias                      -0.2
Realign max seqs                        2147483647
Correlation score weight                0
Gap open cost                           aa:11,nucl:5
Gap extension cost                      aa:1,nucl:2
Zdrop                                   40
Rescore mode                            0
Remove hits by seq. id. and coverage    false
Sort results                            0
Cluster mode                            0
Max connected component depth           1000
Similarity type                         2
Weight file name
Cluster Weight threshold                0.9
Single step clustering                  false
Cascaded clustering steps               3
Cluster reassign                        false
Remove temporary files                  false
Force restart with latest tmp           false
MPI runner
k-mers per sequence                     21
Scale k-mers per sequence               aa:0.000,nucl:0.200
Adjust k-mer length                     false
Shift hash                              67
Include only extendable                 false
Skip repeating k-mers                   false

Set cluster sensitivity to -s 6.000000
Set cluster mode SET COVER
Set cluster iterations to 3
/final/db_profile_vs_consensus/low_1/Genus02938/Genus02938_DB.dbtype exists already!

Yes, /final/db_profile_vs_consensus/low_1/Genus02938/Genus02938_DB.dbtype already exists; it was created by mmseqs search. I'm not sure why mmseqs clust cares. Do you have any ideas? I can't figure this out. Many thanks!

schmittel · 2024-05-06T18:47:32Z

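I just learned that mmseqs cluster and mmseqs clust are different commands, which solved the issue: as the first line of the log above shows, I was actually calling mmseqs cluster, not mmseqs clust. Apologies for the confusion.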
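
For anyone hitting the same message: the log above starts with `cluster <profileDB> <resultDB> <newDir>`, so the `mmseqs cluster` workflow was invoked rather than the `mmseqs clust` module. Assuming `mmseqs cluster` reads its positional arguments as input DB, output cluster DB, and temporary directory (my understanding of its usage, not something stated in this thread), the search result DB ended up in the output slot, which would explain the complaint that its .dbtype file exists already. Below is a minimal shell sketch of a guard that makes this kind of mix-up fail early. The helper names `output_db_is_free` and `run_clust` are hypothetical and not part of MMseqs2; only the `mmseqs clust` call itself is taken from the wiki snippet quoted above.

```shell
#!/bin/sh
# Hypothetical helper (not part of MMseqs2): succeeds only if no database
# with this prefix exists yet, judged by the presence of its .dbtype file.
output_db_is_free() {
    [ ! -e "$1.dbtype" ]
}

# Hypothetical wrapper around the clustering module from the wiki snippet.
# Positional order: input DB, result DB written by the search, new output DB.
run_clust() {
    seq_db=$1
    result_db=$2
    out_db=$3
    if ! output_db_is_free "$out_db"; then
        echo "refusing to overwrite existing database: $out_db" >&2
        return 1
    fi
    mmseqs clust "$seq_db" "$result_db" "$out_db"
}
```

Usage would look like `run_clust profileDB1 resultDB2 profileDB1_clu`. If the result DB is accidentally passed as the third argument, the wrapper stops with its own message before MMseqs2 is started.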

schmittel changed the title ~~'could not copy file' error when clustering using profiles~~ .dbtype already exists error when clustering using profiles May 6, 2024
