error at assembly (flye step) #9

Open
paulgarias opened this issue Jul 26, 2022 · 7 comments

Comments

@paulgarias

I am working with a student who is hitting the following error when running the pipeline with Nextflow:


executor >  local (3)
[30/a60146] process > assembly:porechop (H37Rv.1) [100%] 1 of 1 ✔
[24/79a658] process > assembly:japsa (H37Rv.1)    [100%] 1 of 1 ✔
[dd/860979] process > assembly:flye (H37Rv.1)     [  0%] 0 of 1
[-        ] process > assembly:racon_cpu          -
[-        ] process > assembly:medaka_cpu         -
[-        ] process > assembly:nextpolish         -
[-        ] process > assembly:fixstart           -
[-        ] process > assembly:quast              -
Error executing process > 'assembly:flye (H37Rv.1)'

Caused by:
  Missing output file(s) `assembly.fasta` expected by process `assembly:flye (H37Rv.1)`

Command executed:

  set +eu
  flye --nano-raw filtered.fastq.gz --genome-size 5.0m --threads 4 --out-dir $PWD --plasmids
  flye -v 2> flye_version.txt

Command exit status:
  0

Command output:
  (empty)

Command error:
  WARNING: Skipping mount /var/singularity/mnt/session/etc/resolv.conf [files]: /etc/resolv.conf doesn't exist in container
  [2022-07-22 17:41:29] INFO: Starting Flye 2.5-release
  [2022-07-22 17:41:29] INFO: >>>STAGE: configure
  [2022-07-22 17:41:29] INFO: Configuring run
  [2022-07-22 17:43:47] INFO: Total read length: 5089510998
  [2022-07-22 17:43:47] INFO: Input genome size: 5000000
  [2022-07-22 17:43:47] INFO: Estimated coverage: 1017
  [2022-07-22 17:43:47] WARNING: Expected read coverage is 1017, the assembly is not guaranteed to be optimal in this setting. Are you sure that the genome size was entered correctly?
  [2022-07-22 17:43:47] INFO: Reads N50/N90: 9733 / 2679
  [2022-07-22 17:43:47] INFO: Minimum overlap set to 3000
  [2022-07-22 17:43:47] INFO: Selected k-mer size: 15
  [2022-07-22 17:43:47] INFO: >>>STAGE: assembly
  [2022-07-22 17:43:47] INFO: Assembling disjointigs
  [2022-07-22 17:43:47] INFO: Reading sequences
  [2022-07-22 17:45:15] INFO: Generating solid k-mer index
  [2022-07-22 17:45:32] INFO: Counting k-mers (1/2):
  0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
  [2022-07-22 17:48:26] INFO: Counting k-mers (2/2):
  0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
  [2022-07-22 17:54:34] INFO: Filling index table
  0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
  [2022-07-22 18:05:38] INFO: Extending reads
  [2022-07-22 18:24:23] INFO: Overlap-based coverage: 868
  [2022-07-22 18:24:23] INFO: Median overlap divergence: 0.0852075
  0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
  [2022-07-24 03:32:08] INFO: Assembled 0 disjointigs
  [2022-07-24 03:32:08] INFO: Generating sequence
  [2022-07-24 03:32:09] ERROR: No disjointigs were assembled - please check if the read type and genome size parameters are correct

Work dir:
  /projectsp/alland/PanGenome_Project/ReviewerResponses/testing_pipelines/work/dd/8609795cae4b8d69393b8e7daee1bf

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

We are looking for some guidance on how to proceed.

Best,
Paul
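A note on the log above: the task reports exit status 0 even though Flye logged an ERROR, and Nextflow only complains about the missing `assembly.fasta`. A likely explanation is that the script starts with `set +eu`, which turns off the shell's exit-on-error behaviour, so the task's exit status is that of the last command (`flye -v`), which succeeds. The sketch below reproduces that shell behaviour in isolation; `run_script` is a hypothetical helper, and `false` / `true` are stand-ins for a failing and a succeeding command, not the actual Flye calls.

```python
import subprocess


def run_script(script: str) -> int:
    """Run a small shell script with sh and return its exit status."""
    return subprocess.run(["sh", "-c", script]).returncode


# Stand-in for the task script: errexit is disabled, the first command
# fails, and the last command succeeds.
tolerant = "set +eu\nfalse\ntrue\n"

# Same commands with errexit enabled: the shell stops at the failure.
strict = "set -eu\nfalse\ntrue\n"

print("set +eu exit status:", run_script(tolerant))
print("set -eu exit status:", run_script(strict))
```

This is why the missing output file, not the exit status, is the first symptom reported; the real cause is in the `Command error` section of the log.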

@vmurigneu
Contributor

Hi Paul,

This is a Flye error that seems to be linked to the very high read coverage (1017), which could confuse Flye. It is similar to this issue:
fenderglass/Flye#128

Here are some suggestions from the author of Flye:
I suggest to try two more runs (i) metagenome mode (ii) normal mode with --asm-coverage 50 to use the longest 50x reads for disjointig assembly.

Can you try to rerun after modifying line 90 of the nextflow.config file to reduce the coverage used for the initial disjointig assembly:
flye_args = "--plasmids" => flye_args = "--plasmids --asm-coverage 50"
or to use the metagenome mode:
flye_args = "--plasmids" => flye_args = "--plasmids --meta"
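To make the numbers concrete, here is a small sketch of the coverage arithmetic using the values from the Flye log above (total read length and input genome size). The truncation to an integer is an assumption chosen to match the logged value of 1017; how Flye rounds internally is not confirmed here. The variable names are illustrative only.

```python
# Values copied from the Flye log in the original report.
total_read_length = 5_089_510_998  # "Total read length"
genome_size = 5_000_000            # "Input genome size" (--genome-size 5.0m)

# Estimated coverage = total bases / genome size.
# Integer division is an assumption that reproduces the logged 1017.
estimated_coverage = total_read_length // genome_size

# With --asm-coverage 50, only about 50x worth of the longest reads
# would be used for the initial disjointig assembly.
target_coverage = 50
bases_for_target = target_coverage * genome_size
fraction_of_bases = bases_for_target / total_read_length

print("estimated coverage:", estimated_coverage)
print("bases needed for 50x:", bases_for_target)
print(f"share of sequenced bases used: {fraction_of_bases:.1%}")
```

In other words, at roughly 1000x coverage the `--asm-coverage 50` setting would build disjointigs from about 5% of the sequenced bases.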

Hope this helps.
Valentine

@maddne

maddne commented Aug 3, 2022

Hi, I keep having a similar issue.
Unlike Paul, I don't get the warning "Expected read coverage is 1017".
It seems that the assembly:flye step is expecting assembly.fasta as an input file, but in the directory of this file, there is draft_assembly.fasta. Do you think this might cause the problem?
I am attaching the Flye log and the Nextflow log:
flye.log
.nextflow.log

@vmurigneu
Contributor

@maddne No, assembly.fasta is an output file of the Flye step, not an input.
The actual error in your log is here:
OSError: [Errno 30] Read-only file system

Work dir:
/home/bio/micropipe/micropipe/work/36/fa4934df1db68825ea7799ce4a2f88

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named .command.sh
Aug-03 16:58:48.309 [Task monitor] DEBUG nextflow.Session - Session aborted -- Cause: Missing output file(s) assembly.fasta expected by process assembly:flye (Eco1948)

Can you check the content of .command.sh and .command.run inside the work dir?
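Errno 30 (EROFS on Linux) means the write was attempted on a file system mounted read-only, which is different from a plain permission error. A minimal sketch of a write probe is shown below; `probe_write` is a hypothetical helper, not part of the pipeline. It only tells you about the environment it runs in, so to be meaningful it would need to run where the task runs (for example inside the same Singularity container, which is an assumption about this setup).

```python
import errno
import tempfile


def probe_write(directory):
    """Try to create a temporary file in `directory`.

    Returns None if the write succeeds, otherwise the symbolic errno
    name (e.g. 'EROFS', 'EACCES', 'ENOENT') of the failure.
    """
    try:
        with tempfile.TemporaryFile(dir=directory):
            return None
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))


# Example: probe the system temp dir and a path that does not exist.
print(probe_write(tempfile.gettempdir()))
print(probe_write("/this/path/should/not/exist"))
```

A result of `EROFS` would point to a read-only mount, while `EACCES` would point to directory permissions.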

@maddne

maddne commented Aug 5, 2022

@vmurigneu Yes, I can check their content. They are hidden files and I cannot upload them directly, so I copied their content into a txt file:
error.txt
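For anyone else who needs to share these files: Nextflow writes the task scripts as dot-files (`.command.sh`, `.command.run`, and so on), so they do not show up in a plain directory listing. The sketch below lists them and bundles selected ones into a single text blob that can be saved and attached; `list_command_files` and `dump_command_files` are hypothetical helpers, and the demo uses a throwaway directory rather than a real work dir.

```python
import tempfile
from pathlib import Path


def list_command_files(work_dir):
    """Return the sorted names of the hidden .command.* files in a task dir."""
    return sorted(
        p.name for p in Path(work_dir).glob(".command.*") if p.is_file()
    )


def dump_command_files(work_dir, names=(".command.sh", ".command.run")):
    """Concatenate the selected files into one string with a header per file."""
    parts = []
    for name in names:
        path = Path(work_dir) / name
        if path.is_file():
            text = path.read_text(errors="replace")
            parts.append(f"===== {name} =====\n{text}")
    return "\n".join(parts)


# Demo on a temporary directory standing in for a Nextflow work dir.
with tempfile.TemporaryDirectory() as demo_dir:
    (Path(demo_dir) / ".command.sh").write_text("echo hello\n")
    print(list_command_files(demo_dir))
    print(dump_command_files(demo_dir))
```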

@vmurigneu
Contributor

@maddne Have you checked that you have permission to write in the output folder? Did the previous steps of the pipeline (trimming, filtering) generate the expected output?

Can you send the command line used and the content of nextflow.config, please?

@maddne

maddne commented Aug 11, 2022

I assumed I had a permissions problem, so I changed the permissions on the output folder to drwxrwxrwx, but that wasn't the cause, because the pipeline was able to write files there. The previous steps (trimming and filtering) worked like a charm and produced output and an HTML report. Here are some reports:
Eco1948_porechop.log
trace.txt
nextflow_report.txt

Here is my command:
nextflow main.nf --samplesheet /home/bio/micropipe/micropipe/samples5.csv --fastq /home/bio/micropipe/micropipe/bact5/ --outdir /home/bio/micropipe/micropipe/results123/ --datadir /home/bio/micropipe/micropipe/bact5/

The nextflow.config file is the one downloaded from your repo; I only changed the cache folder for Singularity at line 3.

@vmurigneu
Contributor

vmurigneu commented Aug 23, 2022

@maddne
Can you post line 3 of your nextflow.config file, please?

Could you also post the .command.run file from inside the work dir:
/home/bio/micropipe/micropipe/work/36/fa4934df1db68825ea7799ce4a2f88
