corrupted size vs. prev_size #57

Open
nick-youngblut opened this issue Apr 29, 2024 · 1 comment

@nick-youngblut

I'm using quay.io/biocontainers/falco:1.2.2--hdcf5f25_0. The run output:

[limits]	using file /usr/local/opt/falco/Configuration/limits.txt
[adapters]	using file /usr/local/opt/falco/Configuration/adapter_list.txt
[contaminants]	using file /usr/local/opt/falco/Configuration/contaminant_list.txt
[Mon Apr 29 18:30:02 2024] Started reading file 20241218_Parse_CRISPR_K562_cas12a_Sub1_R1_001.fastq.gz
[Mon Apr 29 18:30:02 2024] reading file as gzipped FASTQ format
[running falco|                                                   |  0%]corrupted size vs. prev_size
/home/nickyoungblut/tmp/auto-demux/work/20240426_SspArc0132/33/42662ba1c885c4ddfbc2724221e894/.command.sh: line 9:    36 Aborted                 (core dumped) falco 20241218_Parse_CRISPR_K562_cas12a_Sub1_R1_001.fastq.gz -D 20241218_Parse_CRISPR_K562_cas12a_Sub1_R1_001/fastqc_data.txt -R 20241218_Parse_CRISPR_K562_cas12a_Sub1_R1_001/fastqc_report.html -S 20241218_Parse_CRISPR_K562_cas12a_Sub1_R1_001/summary.txt
(nextflow)

seqkit stats -a -T 20241218_Parse_CRISPR_K562_cas12a_Sub1_R1_001.fastq.gz produces the following output:

file	format	type	num_seqs	sum_len	min_len	avg_len	max_len	Q1	Q2	Q3	sum_gap	N50	N50_num	Q20(%)	Q30(%)	AvgQual	GC(%)
20241218_Parse_CRISPR_K562_cas12a_Sub1_R1_001.fastq.gz	FASTQ	DNA	24322546	12501788644	514	514.0	514	514.0	514.0	514.0	0	514	1	44.39	29.94	11.99	42.50

...so the FASTQ file itself appears to be fine. Note the long read length: every read is 514 bp. The RunInfo.xml for this Illumina run was configured with an unusually long Read 1:

    <Reads>
      <Read NumCycles="514" Number="1" IsIndexedRead="N" />
      <Read NumCycles="86" Number="2" IsIndexedRead="N" />
    </Reads>
@andrewdavidsmith
Collaborator

@nick-youngblut any chance you can reproduce with a smaller file that can be linked? If not, can you try it with the file unzipped?
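
Both requests can be sketched with standard tools. This is only a sketch: the helper names `subset_fastq_gz` and `decompress_keep` and the output file names are hypothetical, and it assumes four lines per FASTQ record (no wrapped sequences). Taking the first reads may not reproduce the crash if it depends on data further into the file.

```shell
# Hypothetical helper: copy the first N reads of a gzipped FASTQ
# into a new gzipped FASTQ. Assumes exactly 4 lines per record.
# Note: under `set -o pipefail`, head exiting early can make the
# pipeline report failure (SIGPIPE) on large inputs.
subset_fastq_gz() {
  in_file="$1"
  n_reads="$2"
  out_file="$3"
  gzip -dc "$in_file" | head -n "$((n_reads * 4))" | gzip > "$out_file"
}

# Hypothetical helper: decompress to a plain FASTQ while keeping
# the original .gz file untouched.
decompress_keep() {
  gzip -dc "$1" > "$2"
}

# Example usage (file names are placeholders):
#   subset_fastq_gz 20241218_Parse_CRISPR_K562_cas12a_Sub1_R1_001.fastq.gz 100000 subset_R1.fastq.gz
#   decompress_keep subset_R1.fastq.gz subset_R1.fastq
#   falco subset_R1.fastq.gz
#   falco subset_R1.fastq
```

If the subset still aborts with `corrupted size vs. prev_size`, it is small enough to attach; if only the gzipped form fails, that points at the gzip reading path rather than the read length.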
