Pipeline for getting taxonomy for clusters #815

alopgar · 2024-03-01T12:21:54Z

Hi, I have been using MMseqs2 to cluster the sequences from multiple FASTA files and then assign a taxonomy to each sequence. I followed this pipeline:

mmseqs easy-cluster "${rawfas[@]}" newcluster tmp --min-seq-id 0.3 -c 0.5 --cov-mode 1 --cluster-mode 2 -e 0.001 -s 6
mmseqs createdb "${rawfas[@]}" queryDB
mmseqs taxonomy queryDB "$TXDB" clusterTax tmp --lca-mode 4 --split-memory-limit 60G \
     --lca-ranks superkingdom,phylum,class,order,family,genus
mmseqs createtsv queryDB clusterTax ../clusterTax.tsv

These commands produce a newcluster_cluster.tsv file listing each representative sequence with its cluster members, and a clusterTax.tsv file with the taxonomy assigned to each sequence.

My question is: does MMseqs2 provide a way to obtain a common taxonomy for each cluster, for example by applying an LCA algorithm to all the sequences belonging to that cluster? If not, is there other software that can do this?
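To make the goal concrete, here is a minimal Python sketch of the per-cluster LCA I have in mind. It is not an MMseqs2 feature, just an illustration. It assumes the cluster TSV has two tab-separated columns (representative, member) and that the last column of the taxonomy TSV holds a semicolon-separated lineage at the requested ranks; that column position, the separator, and all function names are my own assumptions and would need checking against the real files.

```python
from collections import defaultdict


def read_clusters(lines):
    """Group members by representative from 'rep<TAB>member' lines."""
    clusters = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        clusters[fields[0]].append(fields[1])
    return clusters


def read_lineages(lines, lineage_col=-1, sep=";"):
    """Map sequence ID -> list of names from the highest rank downwards.

    lineage_col and sep are assumptions about the taxonomy TSV layout.
    """
    lineages = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        lineages[fields[0]] = fields[lineage_col].split(sep)
    return lineages


def common_prefix(lineages):
    """Return the longest prefix shared by all lineages (empty if none)."""
    lineages = list(lineages)
    if not lineages:
        return []
    prefix = lineages[0]
    for lineage in lineages[1:]:
        shared = 0
        for a, b in zip(prefix, lineage):
            if a != b:
                break
            shared += 1
        prefix = prefix[:shared]
    return prefix


def cluster_lca(clusters, lineages):
    """Compute the shared lineage prefix for every cluster.

    Members without an entry in `lineages` are ignored.
    """
    return {
        rep: common_prefix(lineages[m] for m in members if m in lineages)
        for rep, members in clusters.items()
    }
```

With real data, the two readers could be fed open file handles for the cluster TSV and the taxonomy TSV. Placeholder values for unclassified ranks are not handled here and would shorten the shared prefix unless filtered out first.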

Thanks in advance
