Can't process gzipped fastq #35
Comments
Sorry for the loooong delay, I'm back now at tending to the issues. I believe this is an issue with the kmer kernel, which doesn't allow gzipped input because of the random access function it uses (@gunturus please comment if I'm wrong). Unfortunately, I don't think this can be easily resolved. I'll leave this issue open until I add a corresponding comment to the documentation, but you'll have to unzip the fastq file prior to using nonpareil.
I'm starting to investigate nonpareil, and also had the same issue. Gzipped input support would be very useful, because I have >100 sequencing files, all in the >1GB file-size range, so having to decompress each one would be a bit nasty when trying to parallelise processing of all the files at once. So I would like to give support to this, if a solution is feasible (even if there is an internal temporary decompression)!
@gunturus Do you have an update on this issue? I know you were looking into it. Thanks!
@gunturus do you have any more news? I'm interested in potentially adding nonpareil to the nf-core/eager pipeline, but the lack of
@jfy133 unfortunately gzip is not supported. @lmrodriguezr do you have any suggestions to provide gzip support? I have no idea.
Do you think this is in any way on a roadmap @lmrodriguezr? Just to know if I should look for different solutions instead.
I would also like to add that having support for compressed FASTQ files would be good.
Hello. We're finally back at this issue, and it's top of the roadmap. An initial not-so-clean solution would be to unzip the files into a temporary directory, launch nonpareil, and then remove the directory. Would this work as a temporary solution? If yes, I can implement it in a bash wrapper so you could use it out of the box. A more robust solution is to read directly from the zipped file, but this will take some heavy lifting because we will need to replace random file access with another method. It's also doable, but it'll take us a bit longer, so hopefully the first option works in the meantime?
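The temporary-directory workaround described above could be sketched as follows. This is only an illustration in Python, not the bash wrapper the maintainers mention: the helper names `decompressed` and `run_on_decompressed` are hypothetical, and the command to launch is supplied by the caller rather than hard-coded.

```python
import gzip
import shutil
import subprocess
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def decompressed(gz_path):
    """Yield the path of a temporary uncompressed copy of gz_path.

    Hypothetical helper: the temporary directory (and the copy in it)
    is deleted on exit, even if the body of the with-block raises.
    """
    gz_path = Path(gz_path)
    with tempfile.TemporaryDirectory(prefix="nonpareil_") as tmp_dir:
        # Assumes the input name ends in ".gz"; dropping that suffix
        # leaves e.g. "reads.fastq" for "reads.fastq.gz".
        plain_path = Path(tmp_dir) / gz_path.with_suffix("").name
        # Stream the data in chunks so large files never sit in memory.
        with gzip.open(gz_path, "rb") as src, open(plain_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
        yield plain_path


def run_on_decompressed(gz_path, build_command):
    """Run the argv list returned by build_command(plain_path).

    build_command is caller-supplied, so this sketch makes no claim
    about the exact nonpareil command line.
    """
    with decompressed(gz_path) as plain_path:
        return subprocess.run(build_command(plain_path), check=True)
```

A call might look like `run_on_decompressed("reads.fastq.gz", lambda p: ["nonpareil", "-s", str(p), "-T", "kmer", "-f", "fastq", "-b", "out"])`; the flags shown are an assumption to check against your installed version. Any output written inside the temporary directory is lost on cleanup, so the output prefix should point elsewhere. When many files are processed in parallel, each call needs enough scratch space for one uncompressed copy.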
Dear @lmrodriguezr, Thank you very much for looking into this! For our purpose, having the second option implemented would be better. We use
@lmrodriguezr we are in the same situation as @VGalata as we would like to add it to a nextflow pipeline ;). However, I think unzipping to a But depending on the time it takes for the more robust solution, I guess I would prefer to wait a bit longer so that the time investment goes into an 'inbuilt' solution.
@lmrodriguezr just another thought... would it be easier to refactor input to allow then could simply to Just sayin' as also would be fine with me in terms of accepting gzipped input in terms of usability.
Just wanted to chime in with more support for enabling compressed fastq files!
Hi, I'm just getting started with Nonpareil, thanks for your work.
I'm unable to process my gzipped fastq. If I first uncompress the file, it processes as expected. The error: