Solving the "Error: BiocParallel errors" (in almost every sample) #1946

Yoonseopark02 · 2024-05-03T05:50:09Z

Hi, I am an undergraduate studying bioinformatics in the microbiome analysis field, and I get a BiocParallel error every time I run my code on datasets from the ENA database.
The same code worked well for NCBI datasets. I have kept debugging but couldn't solve the problem.

The following is the code I used:

library(dada2)
library(ggplot2)
library(dplyr)

path <- "/Users/User/Documents/SeniorThesis/zebraENA/reads_fastq/PRJNA806371"
list.files(path)

# Assuming your forward and reverse fastq filenames have the format: SRRXXXXXXX_1.fastq.gz and SRRXXXXXXX_2.fastq.gz
fnFs <- sort(list.files(path, pattern="_1\\.fastq\\.gz$", full.names = TRUE))
fnRs <- sort(list.files(path, pattern="_2\\.fastq\\.gz$", full.names = TRUE))
# Extract sample names, assuming filenames have the format: SRRXXXXXXX_X.fastq.gz
sample.names <- sapply(strsplit(basename(fnFs), "_"), `[`, 1)

#INSPECT READ QUALITY PROFILES
QPF <- plotQualityProfile(fnFs[1:2])

Error: BiocParallel errors
1 remote errors, element index: 1
0 unevaluated and other errors
first remote error:
Error in data.frame(sequence = names(freqtbl$top), count = as.integer(freqtbl$top), : arguments imply differing number of rows: 0, 1

I am getting this same error with almost every public dataset I have used.
Please let me know how to solve this error.
Thanks so much in advance!

benjjneb · 2024-05-03T13:11:44Z

What is head(fnFs)?

What is the output of head(ShortRead::readFastq(fnFs[[1]]))?

Yoonseopark02 · 2024-05-04T05:53:39Z

Thanks so much for the reply @benjjneb !!
It appears like this:

head(fnFs)
[1] "/Users/User/Documents/SeniorThesis/zebraENA/reads_fastq/PRJNA806371/SRR18030333_1.fastq.gz"
[2] "/Users/User/Documents/SeniorThesis/zebraENA/reads_fastq/PRJNA806371/SRR18030334_1.fastq.gz"
[3] "/Users/User/Documents/SeniorThesis/zebraENA/reads_fastq/PRJNA806371/SRR18030335_1.fastq.gz"
[4] "/Users/User/Documents/SeniorThesis/zebraENA/reads_fastq/PRJNA806371/SRR18030336_1.fastq.gz"
[5] "/Users/User/Documents/SeniorThesis/zebraENA/reads_fastq/PRJNA806371/SRR18030337_1.fastq.gz"
head(ShortRead::readFastq(fnFs[[1]]))
class: ShortReadQ
length: 6 reads; width: 151 cycles

benjjneb · 2024-05-07T15:32:57Z

Could there be any input files in your data that are empty (i.e. contain no sequences)? See this comment thread: #1503 (comment)

If there are, removing those before running plotQualityProfile should solve the issue.
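
A minimal sketch of that check in base R, assuming the inputs are gzipped (or plain) FASTQ files with four lines per record. The helper `count_fastq_reads()` and the temporary demo files are hypothetical and not part of dada2; with real data the helper would be applied to `fnFs` and `fnRs` instead.

```r
# Hypothetical helper (not part of dada2): count the records in a FASTQ file,
# assuming the standard four lines per record. gzfile() reads both gzipped
# and uncompressed files.
count_fastq_reads <- function(fn, chunk_lines = 400000L) {
  stopifnot(file.exists(fn))
  if (file.size(fn) == 0) return(0)
  con <- gzfile(fn, open = "rt")
  on.exit(close(con))
  n_lines <- 0
  repeat {
    # Read in chunks so large files are never held in memory at once
    chunk <- readLines(con, n = chunk_lines, warn = FALSE)
    if (length(chunk) == 0L) break
    n_lines <- n_lines + length(chunk)
  }
  n_lines %/% 4
}

# Demo on two temporary files: one with two reads, one with none.
demo_dir <- tempfile("fastq_demo_")
dir.create(demo_dir)
good  <- file.path(demo_dir, "SAMPLEA_1.fastq.gz")
empty <- file.path(demo_dir, "SAMPLEB_1.fastq.gz")

con <- gzfile(good, open = "wt")
writeLines(c("@read1", "ACGT", "+", "IIII",
             "@read2", "TTGA", "+", "IIII"), con)
close(con)
con <- gzfile(empty, open = "wt")  # file holding zero reads
close(con)

demo_files <- sort(c(good, empty))
n_reads <- vapply(demo_files, count_fastq_reads, numeric(1))
empty_files    <- demo_files[n_reads == 0]
nonempty_files <- demo_files[n_reads > 0]

# Files to exclude before calling plotQualityProfile()
print(basename(empty_files))
```

With real data, something like `fnFs[vapply(fnFs, count_fastq_reads, numeric(1)) > 0]` would keep only the non-empty forward files; the matching reverse files would need to be dropped as well so the pairs stay aligned.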

Yoonseopark02 · 2024-05-14T02:07:22Z

@benjjneb Thanks! I found that the input files had the error and solved it.

Sorry but can I ask one more thing, please?

I am getting these errors with one large dataset when running the following code:
out <- filterAndTrim(fnFs, filtFs, fnRs, filtRs, truncLen=c(150,150), trimLeft = c(17, 21),
maxN=0, maxEE=c(2,2), truncQ=2, rm.phix=TRUE,
compress=TRUE, multithread=TRUE)

Error in filterAndTrim(fnFs, filtFs, fnRs, filtRs, truncLen = c(150, 150), :
These are the errors (up to 5) encountered in individual cores...
Error in (function (fn, fout, maxN = c(0, 0), truncQ = c(2, 2), truncLen = c(0, :
Mismatched forward and reverse sequence files: 0, 100000.
...
The error continues over multiple lines (please see the attached picture).
I often encounter this error too. I referred to #283 and ran this, but I couldn't find the problem.

Can you give me some advice on how to solve this?
Thank you so much!

benjjneb · 2024-05-14T15:10:30Z

The error messages indicate that for pairs of forward/reverse fastq files, one of the files has many reads (100k or 83.5k) while the other has zero reads. To troubleshoot, I would check that this error is caused by a single sample (i.e. fnFs[[1]] etc.), and then look at those individual files. Is one of them empty? Then it becomes a question of how these files were obtained. Did some pre-processing on your end lead to one of them being empty? Or did this discrepancy exist in the raw files you downloaded?
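
One way to narrow this down is to count the reads in each forward/reverse pair and list the pairs whose counts differ. The sketch below uses base R only and assumes four-line FASTQ records and the `_1.fastq.gz` / `_2.fastq.gz` naming used in this thread. `count_fastq_reads()`, `check_pairs()` and `write_demo_fastq()` are hypothetical helpers, not dada2 functions, and the demo files stand in for `fnFs` and `fnRs`.

```r
# Hypothetical helper: number of records in a (gzipped or plain) FASTQ file
count_fastq_reads <- function(fn) {
  stopifnot(file.exists(fn))
  if (file.size(fn) == 0) return(0)
  con <- gzfile(fn, open = "rt")
  on.exit(close(con))
  n_lines <- 0
  repeat {
    chunk <- readLines(con, n = 400000L, warn = FALSE)
    if (length(chunk) == 0L) break
    n_lines <- n_lines + length(chunk)
  }
  n_lines %/% 4
}

# Hypothetical helper: one row per pair, flagging unequal read counts
check_pairs <- function(fwd_files, rev_files) {
  stopifnot(length(fwd_files) == length(rev_files))
  counts <- data.frame(
    sample  = sub("_1\\.fastq\\.gz$", "", basename(fwd_files)),
    forward = vapply(fwd_files, count_fastq_reads, numeric(1), USE.NAMES = FALSE),
    reverse = vapply(rev_files, count_fastq_reads, numeric(1), USE.NAMES = FALSE),
    stringsAsFactors = FALSE
  )
  counts$mismatch <- counts$forward != counts$reverse
  counts
}

# Demo data only: write n identical four-line records to a gzipped file
write_demo_fastq <- function(fn, n) {
  con <- gzfile(fn, open = "wt")
  on.exit(close(con))
  for (i in seq_len(n)) {
    writeLines(c(paste0("@read", i), "ACGT", "+", "IIII"), con)
  }
}

demo_dir <- tempfile("pairs_demo_")
dir.create(demo_dir)
fwd_files <- file.path(demo_dir, c("S1_1.fastq.gz", "S2_1.fastq.gz"))
rev_files <- file.path(demo_dir, c("S1_2.fastq.gz", "S2_2.fastq.gz"))
write_demo_fastq(fwd_files[1], 3)
write_demo_fastq(rev_files[1], 3)
write_demo_fastq(fwd_files[2], 0)  # empty forward file, like the 0 vs 100000 case
write_demo_fastq(rev_files[2], 3)

pair_counts <- check_pairs(fwd_files, rev_files)
print(pair_counts[pair_counts$mismatch, ])
```

With real data, `check_pairs(fnFs, fnRs)` would give one row per sample, and the rows where `mismatch` is `TRUE` are the pairs worth inspecting by hand or re-downloading.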

benjjneb closed this as completed May 29, 2024
