How to use data obtained from other software to start from the middle of the steps of SqueezeMeta and run the subsequent steps? #821

Open
mojiefei opened this issue Mar 29, 2024 · 4 comments

Comments

@mojiefei

I used funannotate software to predict the genes in the assembled metagenomic sequences and obtained the corresponding result files. I want to run steps 4 to 16 in SqueezeMeta, using these result files as input files directly.

However, I only saw in the software instructions that SqueezeMeta supports restarting from a certain step or re-running a certain script. These seem to require using the intermediate output files from the initial run of SqueezeMeta as input to the steps needing re-run. And the input of SqueezeMeta is only the paired-end sequence files (.fq.gz) obtained by sequencing.

So, how can I use the result files from funannotate as input for steps 4 to 16 in SqueezeMeta?

@fpusan
Collaborator

fpusan commented Apr 8, 2024

Hi!
I guess you could write a parser that converts the funannotate results into a format similar to the intermediate files used by SqueezeMeta (these are documented in the PDF manual).
You can also create an empty SqueezeMeta run by adding the flag --empty. This gives you the project's "skeleton", so to speak, so you can then restart it from whatever step you want (as long as the required intermediate files are in place).
Hope this is helpful. I cannot provide much more help, as I am not familiar with the funannotate output.

@jtamames
Owner

jtamames commented Apr 8, 2024

Hello
If the only information you want to use is a new gene prediction, you could replace the 03.faa (aa sequences), 03.fna (nucleotide sequences) and 03.gff (information for ORFs) files with the new ones you have from funannotate (keeping the ORF naming schema of SqueezeMeta, which is contigname_ORFinitpos_ORFendpos).
Then restart in step 4. You would also need to run the rest of the steps, not only 16, to get abundance measures for ORFs, new annotations, etc.
Hope it helps.
Best,
J
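
As an illustration of the naming scheme described in the reply above, here is a minimal Python sketch. The helper names `make_orf_id` and `split_orf_id` are hypothetical and not part of SqueezeMeta. The underscore between the start and end positions follows the description in this reply; that separator is an assumption worth comparing against an ORF name from a regular SqueezeMeta run, so it is left as a parameter.

```python
import re
from typing import Tuple


def make_orf_id(contig: str, start: int, end: int, pos_sep: str = "_") -> str:
    """Build an ORF name of the form contigname_start<pos_sep>end.

    pos_sep is an assumption (see the text above); adjust it if your
    own SqueezeMeta output uses a different separator.
    """
    if start < 1 or end < start:
        raise ValueError(f"invalid coordinates: start={start}, end={end}")
    return f"{contig}_{start}{pos_sep}{end}"


def split_orf_id(orf_id: str, pos_sep: str = "_") -> Tuple[str, int, int]:
    """Recover (contig, start, end) from an ORF name.

    The contig part is matched greedily, so contig names that
    themselves contain underscores are handled.
    """
    match = re.fullmatch(rf"(.+)_(\d+){re.escape(pos_sep)}(\d+)", orf_id)
    if match is None:
        raise ValueError(f"{orf_id!r} does not follow contig_start{pos_sep}end")
    return match.group(1), int(match.group(2)), int(match.group(3))
```

Calling `split_orf_id` on an unmodified funannotate-style identifier raises `ValueError`, which gives a quick way to spot sequences that still need renaming.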

@mojiefei
Author

mojiefei commented Apr 10, 2024

@jtamames

I did what you said, but SqueezeMeta reported an error in Step 9. The error displayed by syslog was "Stopping in STEP9 -> 09.summarycontigs3.pl. File /hdd/mojf/output/B11_Eukaryote_TEST/intermediate/09.B11_Eukaryote_TEST.contiglog is empty!"

01.B11_Eukaryote_TEST.fasta contains the eukaryotic contigs that I isolated with EukRep from the files produced by Step 1 of SqueezeMeta. Its first few lines were as follows.
[image: first lines of 01.B11_Eukaryote_TEST.fasta]

The first 10 lines of 03.B11_Eukaryote_TEST.faa from funannotate were as follows.
[image: first 10 lines of 03.B11_Eukaryote_TEST.faa]

The first 10 lines of 03.B11_Eukaryote_TEST.fna from funannotate were as follows.
[image: first 10 lines of 03.B11_Eukaryote_TEST.fna]

The first 10 lines of 03.B11_Eukaryote_TEST.gff from funannotate were as follows.
[image: first 10 lines of 03.B11_Eukaryote_TEST.gff]

I moved these four files into results and restarted SqueezeMeta, first running 02.rnas.pl B11_Eukaryote_TEST and then SqueezeMeta.pl --restart -p B11_Eukaryote_TEST -step 4 --force_overwrite. The complete syslog is as follows.

Running barrnap for Bacteria: /home/gfz3/miniconda3/envs/SqueezeMeta/SqueezeMeta/bin/barrnap --quiet --threads 32 --kingdom bac --reject 0.1 /hdd/mojf/output/B11_Eukaryote_TEST/intermediate/02.B11_Eukaryote_TEST.maskedrna.fasta --dbdir /database/SqueezeMetaDB/db > /hdd/mojf/output/B11_Eukaryote_TEST/temp/bac.gff
Running barrnap for Archaea: /home/gfz3/miniconda3/envs/SqueezeMeta/SqueezeMeta/bin/barrnap --quiet --threads 32 --kingdom arc --reject 0.1 /hdd/mojf/output/B11_Eukaryote_TEST/intermediate/02.B11_Eukaryote_TEST.maskedrna.fasta --dbdir /database/SqueezeMetaDB/db > /hdd/mojf/output/B11_Eukaryote_TEST/temp/arc.gff
Running barrnap for Eukaryote: /home/gfz3/miniconda3/envs/SqueezeMeta/SqueezeMeta/bin/barrnap --quiet --threads 32 --kingdom euk --reject 0.1 /hdd/mojf/output/B11_Eukaryote_TEST/intermediate/02.B11_Eukaryote_TEST.maskedrna.fasta --dbdir /database/SqueezeMetaDB/db > /hdd/mojf/output/B11_Eukaryote_TEST/temp/euk.gff
Running barrnap for Mitochondrial: /home/gfz3/miniconda3/envs/SqueezeMeta/SqueezeMeta/bin/barrnap --quiet --threads 32 --kingdom mito --reject 0.1 /hdd/mojf/output/B11_Eukaryote_TEST/intermediate/02.B11_Eukaryote_TEST.maskedrna.fasta --dbdir /database/SqueezeMetaDB/db > /hdd/mojf/output/B11_Eukaryote_TEST/temp/mito.gff
Running RDP classifier: java -jar /home/gfz3/miniconda3/envs/SqueezeMeta/SqueezeMeta/lib/classifier/classifier.jar classify /hdd/mojf/output/B11_Eukaryote_TEST/temp/16S.fasta -o /hdd/mojf/output/B11_Eukaryote_TEST/temp/16S.out -f filterbyconf
Running Aragorn: /home/gfz3/miniconda3/envs/SqueezeMeta/SqueezeMeta/bin/aragorn -w /hdd/mojf/output/B11_Eukaryote_TEST/intermediate/02.B11_Eukaryote_TEST.maskedrna.fasta -o /hdd/mojf/output/B11_Eukaryote_TEST/temp/trnas.aragorn
Creating new gff file: cat /hdd/mojf/output/B11_Eukaryote_TEST/temp/*gff.mod > /hdd/mojf/output/B11_Eukaryote_TEST/temp/02.B11_Eukaryote_TEST.rna.gff
Diamond block size set to 16 (Free Mem 705.85 Gb)
  Working with taxonomy database in /database/SqueezeMetaDB/db/nr.dmnd
Running Diamond for taxa: /home/gfz3/miniconda3/envs/SqueezeMeta/SqueezeMeta/bin/diamond blastp -q /hdd/mojf/output/B11_Eukaryote_TEST/results/03.B11_Eukaryote_TEST.faa -p 32 -d /database/SqueezeMetaDB/db/nr.dmnd -e 0.001 --id 40 -f tab -b 16 -o /hdd/mojf/output/B11_Eukaryote_TEST/intermediate/04.B11_Eukaryote_TEST.nr.diamond > /hdd/mojf/output/B11_Eukaryote_TEST/temp/diamond.nr.log 2>&1
Diamond block size set to 16 (Free Mem 671.82 Gb)
  Working with taxonomy database in /database/SqueezeMetaDB/db/nr.dmnd
Running Diamond for taxa: /home/gfz3/miniconda3/envs/SqueezeMeta/SqueezeMeta/bin/diamond blastp -q /hdd/mojf/output/B11_Eukaryote_TEST/results/03.B11_Eukaryote_TEST.faa -p 32 -d /database/SqueezeMetaDB/db/nr.dmnd -e 0.001 --id 40 -f tab -b 16 -o /hdd/mojf/output/B11_Eukaryote_TEST/intermediate/04.B11_Eukaryote_TEST.nr.diamond > /hdd/mojf/output/B11_Eukaryote_TEST/temp/diamond.nr.log 2>&1
Running Diamond for COGs: /home/gfz3/miniconda3/envs/SqueezeMeta/SqueezeMeta/bin/diamond blastp -q /hdd/mojf/output/B11_Eukaryote_TEST/results/03.B11_Eukaryote_TEST.faa -p 32 -d /database/SqueezeMetaDB/db/eggnog -e 0.001 --id 30 --quiet -b 16 -f 6 qseqid qlen sseqid slen pident length evalue bitscore qstart qend sstart send -o /hdd/mojf/output/B11_Eukaryote_TEST/intermediate/04.B11_Eukaryote_TEST.eggnog.diamond
Running Diamond for KEGG: /home/gfz3/miniconda3/envs/SqueezeMeta/SqueezeMeta/bin/diamond blastp -q /hdd/mojf/output/B11_Eukaryote_TEST/results/03.B11_Eukaryote_TEST.faa -p 32 -d /database/SqueezeMetaDB/db/keggdb -e 0.001 --id 30 --quiet -b 16 -f 6 qseqid qlen sseqid slen pident length evalue bitscore qstart qend sstart send -o /hdd/mojf/output/B11_Eukaryote_TEST/intermediate/04.B11_Eukaryote_TEST.kegg.diamond

[55 minutes, 9 seconds]: STEP5 -> 05.run_hmmer.pl
Running HMMER3 for Pfam: /home/gfz3/miniconda3/envs/SqueezeMeta/SqueezeMeta/bin/hmmer/hmmsearch --domtblout /hdd/mojf/output/B11_Eukaryote_TEST/intermediate/05.B11_Eukaryote_TEST.pfam.hmm -E 1e-10 --cpu 32 /database/SqueezeMetaDB/db/Pfam-A.hmm /hdd/mojf/output/B11_Eukaryote_TEST/results/03.B11_Eukaryote_TEST.faa > /dev/null 2>&1

[1 hours, 6 minutes, 43 seconds]: STEP6 -> 06.lca.pl
  Splitting Diamond file
  Total lines in Diamond: 141918; Allocating 4434 in 32 threads
  Opening file 1 in line  (estimated in 4434)
  Opening file 2 in line 4443 (estimated in 4434)
  Opening file 3 in line 8889 (estimated in 8868)
  Opening file 4 in line 13327 (estimated in 13302)
  Opening file 5 in line 17757 (estimated in 17736)
  Opening file 6 in line 22194 (estimated in 22170)
  Opening file 7 in line 26614 (estimated in 26604)
  Opening file 8 in line 31058 (estimated in 31038)
  Opening file 9 in line 35483 (estimated in 35472)
  Opening file 10 in line 39925 (estimated in 39906)
  Opening file 11 in line 44345 (estimated in 44340)
  Opening file 12 in line 48789 (estimated in 48774)
  Opening file 13 in line 53231 (estimated in 53208)
  Opening file 14 in line 57643 (estimated in 57642)
  Opening file 15 in line 62098 (estimated in 62076)
  Opening file 16 in line 66520 (estimated in 66510)
  Opening file 17 in line 70946 (estimated in 70944)
  Opening file 18 in line 75383 (estimated in 75378)
  Opening file 19 in line 79832 (estimated in 79812)
  Opening file 20 in line 84267 (estimated in 84246)
  Opening file 21 in line 88701 (estimated in 88680)
  Opening file 22 in line 93123 (estimated in 93114)
  Opening file 23 in line 97560 (estimated in 97548)
  Opening file 24 in line 102006 (estimated in 101982)
  Opening file 25 in line 106433 (estimated in 106416)
  Opening file 26 in line 110874 (estimated in 110850)
  Opening file 27 in line 115300 (estimated in 115284)
  Opening file 28 in line 119739 (estimated in 119718)
  Opening file 29 in line 124172 (estimated in 124152)
  Opening file 30 in line 128595 (estimated in 128586)
  Opening file 31 in line 133023 (estimated in 133020)
  Opening file 32 in line 137457 (estimated in 137454)
  Opening file 33 in line 141894 (estimated in 141888)
  Starting multithread LCA in 32 threads
  Starting thread 1
  Starting thread 2
  Starting thread 3
  Starting thread 4
  Starting thread 5
  Starting thread 6
  Starting thread 7
  Starting thread 8
  Starting thread 9
  Starting thread 10
  Starting thread 11
  Starting thread 12
  Starting thread 13
  Starting thread 14
  Starting thread 15
  Starting thread 16
  Starting thread 17
  Starting thread 18
  Starting thread 19
  Starting thread 20
  Starting thread 21
  Starting thread 22
  Starting thread 23
  Starting thread 24
  Starting thread 25
  Starting thread 26
  Starting thread 27
  Starting thread 28
  Starting thread 29
  Starting thread 30
  Starting thread 31
  Starting thread 32
  Creating /hdd/mojf/output/B11_Eukaryote_TEST/results/06.B11_Eukaryote_TEST.fun3.tax.wranks file: cat /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_1.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_2.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_3.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_4.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_5.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_6.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_7.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_8.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_9.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_10.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_11.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_12.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_13.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_14.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_15.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_16.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_17.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_18.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_19.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_20.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_21.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_22.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_23.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_24.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_25.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_26.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_27.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_28.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_29.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_30.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_31.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_32.wranks > 
/hdd/mojf/output/B11_Eukaryote_TEST/results/06.B11_Eukaryote_TEST.fun3.tax.wranks
  Creating /hdd/mojf/output/B11_Eukaryote_TEST/results/06.B11_Eukaryote_TEST.fun3.tax.noidfilter.wranks file: cat /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_1.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_2.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_3.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_4.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_5.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_6.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_7.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_8.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_9.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_10.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_11.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_12.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_13.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_14.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_15.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_16.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_17.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_18.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_19.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_20.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_21.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_22.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_23.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_24.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_25.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_26.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_27.noidfilter.wranks 
/hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_28.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_29.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_30.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_31.noidfilter.wranks /hdd/mojf/output/B11_Eukaryote_TEST/temp/fun3tax_32.noidfilter.wranks  > /hdd/mojf/output/B11_Eukaryote_TEST/results/06.B11_Eukaryote_TEST.fun3.tax.noidfilter.wranks
  Removing temporaty diamond files in /hdd/mojf/output/B11_Eukaryote_TEST/temp

[1 hours, 9 minutes, 2 seconds]: STEP7 -> 07.fun3assign.pl
  Reading COGs hits from /hdd/mojf/output/B11_Eukaryote_TEST/intermediate/04.B11_Eukaryote_TEST.eggnog.diamond
  Output in /hdd/mojf/output/B11_Eukaryote_TEST/results/07.B11_Eukaryote_TEST.fun3.cog
  Reading KEGG hits from /hdd/mojf/output/B11_Eukaryote_TEST/intermediate/04.B11_Eukaryote_TEST.kegg.diamond
  Output in /hdd/mojf/output/B11_Eukaryote_TEST/results/07.B11_Eukaryote_TEST.fun3.kegg
  Reading pfam hits from /hdd/mojf/output/B11_Eukaryote_TEST/intermediate/05.B11_Eukaryote_TEST.pfam.hmm
  Output in /hdd/mojf/output/B11_Eukaryote_TEST/results/07.B11_Eukaryote_TEST.fun3.pfam

[1 hours, 9 minutes, 2 seconds]: STEP9 -> 09.summarycontigs3.pl
  Reading taxa for genes from /hdd/mojf/output/B11_Eukaryote_TEST/results/06.B11_Eukaryote_TEST.fun3.tax.wranks
  Reading results without eukaryotic filter from /hdd/mojf/output/B11_Eukaryote_TEST/results/06.B11_Eukaryote_TEST.fun3.tax.noidfilter.wranks
  Writing output to /hdd/mojf/output/B11_Eukaryote_TEST/temp/09.B11_Eukaryote_TEST.allorfs and /hdd/mojf/output/B11_Eukaryote_TEST/intermediate/09.B11_Eukaryote_TEST.contiglog
  Reading taxa for genes from /hdd/mojf/output/B11_Eukaryote_TEST/results/06.B11_Eukaryote_TEST.fun3.tax.noidfilter.wranks
  Reading results without eukaryotic filter from /hdd/mojf/output/B11_Eukaryote_TEST/results/06.B11_Eukaryote_TEST.fun3.tax.noidfilter.wranks
  Writing output to /hdd/mojf/output/B11_Eukaryote_TEST/temp/09.B11_Eukaryote_TEST.allorfs.noidfilter and /hdd/mojf/output/B11_Eukaryote_TEST/intermediate/09.B11_Eukaryote_TEST.contiglog.noidfilter
Stopping in STEP9 -> 09.summarycontigs3.pl. File /hdd/mojf/output/B11_Eukaryote_TEST/intermediate/09.B11_Eukaryote_TEST.contiglog is empty!
_____________

System information:
Linux dell 6.5.0-26-generic #26~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue Mar 12 10:22:43 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
_____________

Tree for the project:
[4.0K Apr  9 19:48]  /hdd/mojf/output/B11_Eukaryote_TEST
├── [  35 Apr  9 19:48]  creator.txt
├── [4.0K Apr  9 19:48]  data
│   ├── [  56 Apr  9 19:48]  00.B11_Eukaryote_TEST.samples
│   └── [4.0K Apr  9 19:48]  raw_fastq
│       ├── [3.4G Apr  9 19:48]  par1.fastq.gz
│       └── [2.2G Apr  9 19:49]  par2.fastq.gz
├── [4.0K Apr  9 19:48]  ext_tables
├── [4.0K Apr  9 22:15]  intermediate
│   ├── [ 12M Apr  9 21:02]  02.B11_Eukaryote_TEST.maskedrna.fasta
│   ├── [9.7M Apr  9 22:00]  04.B11_Eukaryote_TEST.eggnog.diamond
│   ├── [7.2M Apr  9 22:00]  04.B11_Eukaryote_TEST.kegg.diamond
│   ├── [9.1M Apr  9 21:59]  04.B11_Eukaryote_TEST.nr.diamond
│   ├── [2.4M Apr  9 22:12]  05.B11_Eukaryote_TEST.pfam.hmm
│   ├── [ 289 Apr  9 22:15]  09.B11_Eukaryote_TEST.contiglog
│   ├── [ 297 Apr  9 22:15]  09.B11_Eukaryote_TEST.contiglog_allranks
│   ├── [ 308 Apr  9 22:15]  09.B11_Eukaryote_TEST.contiglog_allranks.noidfilter
│   ├── [ 300 Apr  9 22:15]  09.B11_Eukaryote_TEST.contiglog.noidfilter
│   ├── [4.0K Apr  9 19:48]  binners
│   └── [  56 Apr  9 21:05]  DB_BUILD_DATE
├── [ 904 Apr  9 22:12]  methods.txt
├── [3.1K Apr  9 19:48]  parameters.pl
├── [  36 Apr  9 19:48]  progress
├── [4.0K Apr  9 22:14]  results
│   ├── [ 12M Apr  9 20:59]  01.B11_Eukaryote_TEST.fasta
│   ├── [ 130 Apr  9 21:02]  02.B11_Eukaryote_TEST.16S.txt
│   ├── [ 171 Apr  9 21:01]  02.B11_Eukaryote_TEST.rnas
│   ├── [2.7K Apr  9 21:02]  02.B11_Eukaryote_TEST.trnas
│   ├── [ 28K Apr  9 21:02]  02.B11_Eukaryote_TEST.trnas.fasta
│   ├── [3.0M Apr  9 20:43]  03.B11_Eukaryote_TEST.faa
│   ├── [8.8M Apr  9 20:57]  03.B11_Eukaryote_TEST.fna
│   ├── [611K Apr  9 20:34]  03.B11_Eukaryote_TEST.gff
│   ├── [952K Apr  9 22:14]  06.B11_Eukaryote_TEST.fun3.tax.noidfilter.wranks
│   ├── [932K Apr  9 22:14]  06.B11_Eukaryote_TEST.fun3.tax.wranks
│   ├── [181K Apr  9 22:14]  07.B11_Eukaryote_TEST.fun3.cog
│   ├── [118K Apr  9 22:14]  07.B11_Eukaryote_TEST.fun3.kegg
│   └── [348K Apr  9 22:14]  07.B11_Eukaryote_TEST.fun3.pfam
├── [8.2K Apr  9 19:48]  SqueezeMeta_conf.pl
├── [ 14K Apr  9 22:15]  syslog
└── [4.0K Apr  9 22:15]  temp
    ├── [7.1K Apr  9 21:02]  02.B11_Eukaryote_TEST.rna.gff
    ├── [ 290 Apr  9 22:15]  09.B11_Eukaryote_TEST.allorfs
    ├── [ 301 Apr  9 22:15]  09.B11_Eukaryote_TEST.allorfs.noidfilter
    ├── [   0 Apr  9 21:01]  16S.fasta
    ├── [  40 Apr  9 21:02]  16S.out
    ├── [  16 Apr  9 21:01]  arc.gff
    ├── [  16 Apr  9 21:01]  arc.gff.mod
    ├── [ 109 Apr  9 21:01]  bac.gff
    ├── [ 134 Apr  9 21:01]  bac.gff.mod
    ├── [ 13K Apr  9 21:59]  diamond.nr.log
    ├── [  16 Apr  9 21:02]  euk.gff
    ├── [  16 Apr  9 21:02]  euk.gff.mod
    ├── [ 30K Apr  9 22:13]  fun3tax_10.noidfilter.wranks
    ├── [ 29K Apr  9 22:13]  fun3tax_10.wranks
    ├── [ 32K Apr  9 22:13]  fun3tax_11.noidfilter.wranks
    ├── [ 31K Apr  9 22:13]  fun3tax_11.wranks
    ├── [ 34K Apr  9 22:13]  fun3tax_12.noidfilter.wranks
    ├── [ 33K Apr  9 22:13]  fun3tax_12.wranks
    ├── [ 27K Apr  9 22:13]  fun3tax_13.noidfilter.wranks
    ├── [ 27K Apr  9 22:13]  fun3tax_13.wranks
    ├── [ 30K Apr  9 22:13]  fun3tax_14.noidfilter.wranks
    ├── [ 30K Apr  9 22:13]  fun3tax_14.wranks
    ├── [ 28K Apr  9 22:13]  fun3tax_15.noidfilter.wranks
    ├── [ 27K Apr  9 22:13]  fun3tax_15.wranks
    ├── [ 32K Apr  9 22:13]  fun3tax_16.noidfilter.wranks
    ├── [ 30K Apr  9 22:13]  fun3tax_16.wranks
    ├── [ 28K Apr  9 22:13]  fun3tax_17.noidfilter.wranks
    ├── [ 27K Apr  9 22:13]  fun3tax_17.wranks
    ├── [ 29K Apr  9 22:13]  fun3tax_18.noidfilter.wranks
    ├── [ 28K Apr  9 22:13]  fun3tax_18.wranks
    ├── [ 30K Apr  9 22:13]  fun3tax_19.noidfilter.wranks
    ├── [ 30K Apr  9 22:13]  fun3tax_19.wranks
    ├── [ 32K Apr  9 22:12]  fun3tax_1.noidfilter.wranks
    ├── [ 32K Apr  9 22:12]  fun3tax_1.wranks
    ├── [ 32K Apr  9 22:13]  fun3tax_20.noidfilter.wranks
    ├── [ 31K Apr  9 22:13]  fun3tax_20.wranks
    ├── [ 29K Apr  9 22:13]  fun3tax_21.noidfilter.wranks
    ├── [ 29K Apr  9 22:13]  fun3tax_21.wranks
    ├── [ 29K Apr  9 22:13]  fun3tax_22.noidfilter.wranks
    ├── [ 28K Apr  9 22:13]  fun3tax_22.wranks
    ├── [ 29K Apr  9 22:13]  fun3tax_23.noidfilter.wranks
    ├── [ 29K Apr  9 22:13]  fun3tax_23.wranks
    ├── [ 28K Apr  9 22:13]  fun3tax_24.noidfilter.wranks
    ├── [ 27K Apr  9 22:13]  fun3tax_24.wranks
    ├── [ 29K Apr  9 22:13]  fun3tax_25.noidfilter.wranks
    ├── [ 29K Apr  9 22:13]  fun3tax_25.wranks
    ├── [ 31K Apr  9 22:13]  fun3tax_26.noidfilter.wranks
    ├── [ 30K Apr  9 22:13]  fun3tax_26.wranks
    ├── [ 29K Apr  9 22:13]  fun3tax_27.noidfilter.wranks
    ├── [ 28K Apr  9 22:13]  fun3tax_27.wranks
    ├── [ 28K Apr  9 22:13]  fun3tax_28.noidfilter.wranks
    ├── [ 27K Apr  9 22:13]  fun3tax_28.wranks
    ├── [ 30K Apr  9 22:14]  fun3tax_29.noidfilter.wranks
    ├── [ 30K Apr  9 22:14]  fun3tax_29.wranks
    ├── [ 29K Apr  9 22:12]  fun3tax_2.noidfilter.wranks
    ├── [ 28K Apr  9 22:12]  fun3tax_2.wranks
    ├── [ 27K Apr  9 22:14]  fun3tax_30.noidfilter.wranks
    ├── [ 27K Apr  9 22:14]  fun3tax_30.wranks
    ├── [ 29K Apr  9 22:14]  fun3tax_31.noidfilter.wranks
    ├── [ 28K Apr  9 22:14]  fun3tax_31.wranks
    ├── [ 28K Apr  9 22:14]  fun3tax_32.noidfilter.wranks
    ├── [ 27K Apr  9 22:14]  fun3tax_32.wranks
    ├── [ 32K Apr  9 22:12]  fun3tax_3.noidfilter.wranks
    ├── [ 32K Apr  9 22:12]  fun3tax_3.wranks
    ├── [ 30K Apr  9 22:12]  fun3tax_4.noidfilter.wranks
    ├── [ 30K Apr  9 22:12]  fun3tax_4.wranks
    ├── [ 32K Apr  9 22:12]  fun3tax_5.noidfilter.wranks
    ├── [ 30K Apr  9 22:12]  fun3tax_5.wranks
    ├── [ 33K Apr  9 22:13]  fun3tax_6.noidfilter.wranks
    ├── [ 32K Apr  9 22:13]  fun3tax_6.wranks
    ├── [ 26K Apr  9 22:13]  fun3tax_7.noidfilter.wranks
    ├── [ 26K Apr  9 22:13]  fun3tax_7.wranks
    ├── [ 27K Apr  9 22:13]  fun3tax_8.noidfilter.wranks
    ├── [ 27K Apr  9 22:13]  fun3tax_8.wranks
    ├── [ 33K Apr  9 22:13]  fun3tax_9.noidfilter.wranks
    ├── [ 32K Apr  9 22:13]  fun3tax_9.wranks
    ├── [  16 Apr  9 21:02]  mito.gff
    ├── [  16 Apr  9 21:02]  mito.gff.mod
    ├── [6.9K Apr  9 21:02]  trna.gff.mod
    ├── [ 75K Apr  9 21:02]  trnas.aragorn
    └── [  89 Apr  9 22:12]  wc

8 directories, 113 files

Also, I have checked /hdd/mojf/output/B11_Eukaryote_TEST/results/06.B11_Eukaryote_TEST.fun3.tax.wranks and /hdd/mojf/output/B11_Eukaryote_TEST/results/06.B11_Eukaryote_TEST.fun3.tax.noidfilter.wranks.

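The first 10 lines in 06.B11_Eukaryote_TEST.fun3.tax.wranks were as follows.
[image: first 10 lines of 06.B11_Eukaryote_TEST.fun3.tax.wranks]

The first 10 lines in 06.B11_Eukaryote_TEST.fun3.tax.noidfilter.wranks were as follows.
[image: first 10 lines of 06.B11_Eukaryote_TEST.fun3.tax.noidfilter.wranks]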

I think this error may be caused by incorrect content in the first column of these two files, but I don't know what I should do to correct it. Or could there be some other error?

Looking forward to your solution. I will be very grateful for your help.
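
The suspicion about the first column could be checked with a short Python sketch like the one below. It assumes that the .wranks files are tab-separated with the ORF name in the first field, and that a valid name ends in two integers after the contig name; both points are assumptions rather than documented behaviour, and `nonconforming_ids` is a hypothetical helper name.

```python
import re
from typing import Iterable, List

# Assumed shape of a valid ORF name: contigname_start_end.
# A hyphen between start and end is also accepted, since the exact
# separator is an assumption here.
ORF_ID = re.compile(r".+_\d+[_-]\d+")


def nonconforming_ids(lines: Iterable[str], limit: int = 10) -> List[str]:
    """Return up to `limit` first-column values that do not look like ORF names."""
    bad: List[str] = []
    for line in lines:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        first_field = line.split("\t", 1)[0].strip()
        if ORF_ID.fullmatch(first_field) is None:
            bad.append(first_field)
            if len(bad) >= limit:
                break
    return bad
```

A possible use is to open one of the 06.*.wranks files and pass the file handle to `nonconforming_ids`; an empty list would suggest that the first column already follows the expected pattern.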

@jtamames
Owner

Hello
Sorry for my very late response to this one; somehow I overlooked this issue.
As I was saying above:

you could replace the 03.faa (aa sequences), 03.fna (nucleotide sequences) and 03.gff (information for ORFs) file for new ones you have from funannotate (keeping the ORF naming schema of SqueezeMeta, which is contigname_ORFinitpos_ORFendpos).

But your files still have the original funannotate format, something like FUN00001_1-T1. The right format would be something like FUN00001_1_128, where 1 and 128 are the start and end positions of the ORF in the contig. Notice that in this case the contig name will be FUN00001, so you would need to change the gff accordingly.

Best,
J
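
One way to approach the renaming described in this reply is sketched below in Python. It assumes the gene predictions come as a standard nine-column GFF3 file whose features carry an `ID=` attribute matching the FASTA headers, and that the `mRNA` feature coordinates are the ones to use; both are assumptions about the funannotate output, not confirmed in this thread. The function names and the `pos_sep` parameter are hypothetical.

```python
from typing import Dict, Iterable, Iterator


def build_id_map(
    gff_lines: Iterable[str], feature: str = "mRNA", pos_sep: str = "_"
) -> Dict[str, str]:
    """Map each feature's GFF3 ID to contigname_start<pos_sep>end."""
    id_map: Dict[str, str] = {}
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != feature:
            continue
        # Column 9 holds key=value pairs separated by semicolons.
        attrs = dict(f.split("=", 1) for f in cols[8].split(";") if "=" in f)
        old_id = attrs.get("ID")
        if old_id is not None:
            contig, start, end = cols[0], int(cols[3]), int(cols[4])
            id_map[old_id] = f"{contig}_{start}{pos_sep}{end}"
    return id_map


def rename_fasta(fasta_lines: Iterable[str], id_map: Dict[str, str]) -> Iterator[str]:
    """Yield FASTA lines with each header replaced by its mapped ORF name."""
    for line in fasta_lines:
        if line.startswith(">"):
            parts = line[1:].split()
            old_id = parts[0] if parts else ""
            if old_id not in id_map:
                raise KeyError(f"no GFF entry for FASTA header {old_id!r}")
            yield f">{id_map[old_id]}\n"
        else:
            yield line
```

The same mapping would also have to be applied to the identifiers inside the gff file itself, which this sketch does not do. Two features with identical contig and coordinates would receive the same new name, so it may be worth checking the map for duplicate values before writing any files.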
