Add option to specify file output names #4

Open
moldach opened this issue Apr 23, 2020 · 5 comments

Comments

@moldach

moldach commented Apr 23, 2020

When you have paired-end reads (e.g. HG03583_S1_L001_R1.fastq.gz & HG03583_S1_L001_R2.fastq.gz) or several FASTQ files in one directory, falco overwrites fastqc_data.txt, fastqc_report.html and summary.txt on each run.

At the moment the only way around this that I can see is to put each FASTQ file in its own directory (not ideal IMO).

It would be nice to be able to specify the output filenames so you could, for example, use wildcard rules in a Snakemake workflow.

@guilhermesena1
Collaborator

guilhermesena1 commented Apr 24, 2020

Hi Matthew,

You can set a custom output directory for each FASTQ file using the -o argument. In your case, one possibility would be to run the following in the directory containing the two read files:

for i in R1 R2; do falco -o HG03583_S1_L001_${i} HG03583_S1_L001_${i}.fastq.gz; done

This would create two directories, HG03583_S1_L001_R1 and HG03583_S1_L001_R2, each holding the data, summary and report for the corresponding read end.

We chose to do it this way mostly because it's how FastQC does it, but we will add custom output filename options on the next release. I personally agree that users should have the freedom to choose the filename of every output.
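The one-directory-per-file workaround above generalizes to any number of FASTQ files. The following is only a sketch under stated assumptions: it relies solely on the -o argument mentioned in this thread, the helper names `outdir_for` and `run_falco_per_file` are made up for illustration, and the `FALCO` variable is a hypothetical override that exists only so the loop can be dry-run.

```shell
# Derive an output directory name from a FASTQ path by dropping the
# leading directories and the .gz / .fastq / .fq extensions.
# (Hypothetical helper, not part of falco.)
outdir_for() {
    local name
    name=$(basename "$1")
    name=${name%.gz}
    name=${name%.fastq}
    name=${name%.fq}
    printf '%s\n' "$name"
}

# Run falco once per input file, each into its own output directory.
# FALCO is an assumed override: set FALCO=echo to print the commands
# instead of running them.
run_falco_per_file() {
    local fq
    for fq in "$@"; do
        "${FALCO:-falco}" -o "$(outdir_for "$fq")" "$fq"
    done
}
```

Something like `run_falco_per_file *.fastq.gz` would then produce one directory per input file, and `FALCO=echo run_falco_per_file *.fastq.gz` would only show what would be run.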

@moldach
Author

moldach commented May 3, 2020

Thanks for the suggestion @guilhermesena1

Another idea:

MultiQC can collect reports from FastQC's .zip output, making it easier to compare results. Structuring the output of falco to emulate that of FastQC would allow this tool to work seamlessly with other tools that consume FastQC output.

@guilhermesena1
Collaborator

Hi Matthew,

Thank you for the suggestion. I believe that if you zip the output files from falco, you should be able to run the result through MultiQC by "pretending" it comes from FastQC, since the outputs should be identical. My memory is a bit fuzzy on this, but it might also be that you only need the "fastqc_data.txt" file to generate MultiQC reports, or that you can pass the directory generated by falco and MultiQC will look for the output files within it. If you try this and MultiQC fails to parse your output files, I'd be very interested to know what errors it reports.
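The directory-based idea above could be tried along these lines. This is an untested sketch, not a confirmed recipe: it assumes the per-file falco output directories sit under one parent directory and that running `multiqc` on that parent is enough for it to find the fastqc_data.txt files, which this thread leaves open. The helper names and the `MULTIQC` override variable are hypothetical.

```shell
# Print every subdirectory of $1 that does not contain fastqc_data.txt.
# (Hypothetical helper for checking falco output before collecting it.)
dirs_missing_data() {
    local d
    for d in "$1"/*/; do
        [ -d "$d" ] || continue
        [ -f "${d}fastqc_data.txt" ] || printf '%s\n' "${d%/}"
    done
}

# Run MultiQC over the parent directory, but only if every subdirectory
# has a data file. MULTIQC is an assumed override for dry runs
# (e.g. MULTIQC=echo).
collect_reports() {
    local missing
    missing=$(dirs_missing_data "$1")
    if [ -n "$missing" ]; then
        printf 'missing fastqc_data.txt in:\n%s\n' "$missing" >&2
        return 1
    fi
    "${MULTIQC:-multiqc}" "$1"
}
```

For example, `collect_reports falco_out` would stop early and list incomplete directories rather than hand MultiQC a partial set of results.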

@moldach
Author

moldach commented May 3, 2020 via email

@guilhermesena1
Collaborator

It only took me over 2 years, but I finally got around to implementing this: custom flags for the summary, report and data filenames (although only when a single input file is given). Done in commit 159e7f3.
