Empty contig files despite stats looking OK #478
Comments
zsh:test:1: unknown condition: -le
Copyright 2012 Canada's Michael Smith Genome Science Centre
Description: Ubuntu 22.04.4 LTS
The minimum coverage of single-end contigs is 1.08333.
Hi @alexbacita, Are you seeing the files as 'empty' based on looking at the file size, or inspecting the contents? If you run Thank you for your interest in ABySS! |
Hi @lcoombe, Thanks for the reply! I was indeed just looking at the size. Ideally I would like to generate a consensus read to understand my sample sequence. What file should I use for this purpose? What is the distinction between the -3, -6 and -8 files? Many thanks,
Hi Alex, The names associated with each of those soft links refer to progressive stages of the ABySS assembly - unitigs, contigs and scaffolds. We generally consider the scaffolds ( You could think of it like this - the contigs are generated using the assembly graph plus your input data to make more joins between your unitigs, and the scaffolds are generated using the same information with slightly different algorithms to make more joins between your contigs. Hope that makes sense! |
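Since the named outputs are soft links, a directory listing can report the size of the link itself rather than of the file it points to. One way to check whether the assemblies are really empty is to follow the links and count FASTA records. The following is a minimal sketch, assuming the `name=test_pe` prefix from the original command; `count_records` is a hypothetical helper written for this illustration, not part of ABySS.

```shell
#!/bin/sh
# Hypothetical helper: count FASTA records (header lines starting with '>').
# grep reads through soft links, so the count reflects the link target.
count_records() {
    grep -c '^>' "$1"
}

for f in test_pe-unitigs.fa test_pe-contigs.fa test_pe-scaffolds.fa; do
    if [ -e "$f" ]; then
        # -L dereferences the link, so the size shown is the target's size.
        ls -lL "$f"
        echo "$f: $(count_records "$f") sequences"
    else
        echo "$f: not found (or a dangling link)"
    fi
done
```

If the record counts are non-zero, the files are not empty and the small size seen earlier was most likely that of the link.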
Thanks Lauren! That makes sense. Could you please advise on how to obtain one single, continuous consensus sequence? Can this be done in ABySS, or do I need to use different software? Apologies for my ignorance, I'm new to de novo assembly.
Hi Alex, I'm not really sure what you mean by 'one single/continuous consensus sequence'. It would help if I knew more about what you are trying to assemble and what your input data is. ABySS will assemble your input reads, but whether you should expect the full sequence in one piece or in multiple pieces really depends on what you are trying to assemble. Even if your target region is assembled in a single piece, there will likely be more than one sequence in the final assembly. Also, ABySS will generally not output multiple assemblies/copies of the same region, so there wouldn't be any sense in building a 'consensus' of multiple sequences after assembly.
Thanks Lauren, I have a synthetic oligo that has been amplified with phi29 RCA to understand the bias of the enzyme. When aligning against the reference sequence, 95% of reads map in the unamplified sample but only 2% of reads map in the RCA product. So I'm trying to understand the nature of this product by doing de novo assembly. Good to know that ABySS does not generate overlapping copies of the same region! If I understand correctly, I still need to join the scaffolds?
Hi Alex, So if I'm understanding correctly, you're just trying to get an idea of what the reads are? If you know what you are looking for in the target, you could do an alignment to the potential region of interest to see which contigs map there. If you don't really know what you're looking for at all, you could BLAST the assembled pieces. It is possible that the region you are talking about is in one piece, or in multiple pieces - you won't really know until you do the analysis. If your intention is to get a single piece for a particular region, it can be worth doing k/kc sweeps, which can impact the contiguity of your assembly.
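A k/kc sweep as mentioned above can be sketched as a pair of nested loops, one assembly per directory. The sketch below only prints the commands so they can be inspected before running anything. The k and kc values are illustrative assumptions, not recommendations, and `sweep_commands` is a hypothetical helper; the `name`, `B` and `in` values are taken from the original command, with `../` added because `-C` makes the run happen inside the per-assembly directory.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) one abyss-pe command per k/kc pair.
# $1 = space-separated k values, $2 = space-separated kc values.
sweep_commands() {
    for kc in $2; do
        for k in $1; do
            dir="k${k}-kc${kc}"
            echo "mkdir -p $dir && abyss-pe -C $dir name=test_pe k=$k kc=$kc B=2G in='../out_R1_001.fastq ../out_R2_001.fastq'"
        done
    done
}

# Illustrative ranges only; suitable values depend on read length and coverage.
sweep_commands "64 80 96" "2 3"
```

Once the assemblies have been run, something like `abyss-fac k*/test_pe-scaffolds.fa` should print one stats row per assembly, which makes it easier to compare contiguity across the sweep.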
Hi Lauren, The region of interest is unknown, and it's a synthetic product, so I don't think BLAST will work. I'm pretty sure there are multiple pieces - if I understand correctly, the scaffolds from the ABySS output represent independent fragments assembled to their original size from the 150PE reads? If I were doing whole-genome analysis and put the PE reads into ABySS, would the output be a continuous sequence of the assembled genome (the E. coli genome, for example), or is there a separate function to achieve that? Thank you so much for the discussion and support!
Hi Alex, If you have multiple independent fragments sequenced, then yes, even an optimal assembly would generate multiple different pieces.
Hi @lcoombe, Thank you very much for the clarifications and detailed replies, much appreciated! Alex
Hi,
I'm running the following line: abyss-pe name=test_pe k=96 B=2G in='out_R1_001.fastq out_R2_001.fastq'
Despite getting stats on contigs:
n       n:500  L50  min  N75  N50   N25   E-size  max    sum     name
381190  760    203  500  712  1090  1912  1837    9667   783583  test_pe-unitigs.fa
381122  714    165  500  729  1212  2365  2264    12653  790301  test_pe-contigs.fa
381107  705    156  500  729  1222  2468  2625    15574  790239  test_pe-scaffolds.fa
The following files are empty: 'test_pe-unitigs.fa', 'test_pe-contigs.fa' and 'test_pe-scaffolds.fa', although I do have some sequences in the intermediate files test-1.fa, test-2.fa and so on.
Please can you let me know how to interpret this and how to troubleshoot it?
Many thanks
Alex
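To help read the stats table above: N50 is commonly defined as the length L such that sequences of length at least L together make up at least half of the total assembled bases. The `n:500` and `min` columns suggest that only sequences of 500 bp or longer are counted, which would explain why `n` is so much larger than `n:500`. The sketch below follows that common definition; it is not taken from the ABySS source, and `fasta_lengths` and `n50` are hypothetical helpers, so small differences from the reported values are possible.

```shell
#!/bin/sh
# Hypothetical helper: print one sequence length per FASTA record,
# summing over wrapped (multi-line) sequences.
fasta_lengths() {
    awk '/^>/ { if (seen) print len; seen = 1; len = 0; next }
         { len += length($0) }
         END { if (seen) print len }' "$1"
}

# Hypothetical helper: read lengths on stdin, print N50 by the common definition.
n50() {
    sort -rn | awk '{ len[NR] = $1; total += $1 }
        END {
            half = total / 2
            for (i = 1; i <= NR; i++) {
                run += len[i]
                if (run >= half) { print len[i]; exit }
            }
        }'
}

# Assumed 500 bp cutoff, inferred from the n:500 and min columns.
f=test_pe-scaffolds.fa
if [ -e "$f" ]; then
    fasta_lengths "$f" | awk '$1 >= 500' | n50
fi
```

Running this against the scaffolds file is also a quick way to confirm that the file contains sequences.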