A flexible, scalable, and reproducible pipeline to automate variant calling from sequence reads.

moiexpositoalonsolab/grenepipe

Snakemake pipeline for variant calling from raw sample sequences, with lots of bells and whistles.

Advantages:

  • One command to run the whole pipeline!
  • Many tools to choose from for each step
  • Simple configuration via a single file
  • Automatic download of tool dependencies
  • Resuming from failed jobs

Getting Started

See --> the Wiki pages <-- for setup and documentation.

For questions, bug reports, and feature requests, please open an issue. Please do not send questions or requests by email: others might have the same ones, so it is better to discuss them where everyone can find them.

Pipeline Overview

Minimal input:

  • Reference genome fasta file
  • Per-sample fastq files
  • Optionally, a vcf file of known variants to restrict the variant calling process
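
As a rough illustration of the minimal input listed above, the following Python sketch groups fastq file names by sample and checks that a set of inputs looks complete before starting a run. It is not part of grenepipe, and it does not reproduce grenepipe's own input format (see the Wiki for that). The function names `collect_fastq_files` and `check_inputs`, the file naming convention (`<sample>_1` / `<sample>_2` or `<sample>_R1` / `<sample>_R2`), and the accepted file extensions are all assumptions made for this example.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed (hypothetical) naming convention: <sample>_1.fastq.gz, <sample>_R2.fq, ...
FASTQ_PATTERN = re.compile(r"^(?P<sample>.+?)_R?(?P<read>[12])\.(fastq|fq)(\.gz)?$")


def collect_fastq_files(filenames):
    """Group fastq file names by sample: {sample: {1: name, 2: name}}."""
    samples = defaultdict(dict)
    for name in filenames:
        match = FASTQ_PATTERN.match(Path(name).name)
        if match is None:
            continue  # not a fastq file under the assumed convention
        samples[match.group("sample")][int(match.group("read"))] = name
    return dict(samples)


def check_inputs(reference, samples, known_variants=None):
    """Return a list of problems; an empty list means the inputs look complete."""
    problems = []
    # Reference genome: only the file extension is checked here.
    if not reference.endswith((".fa", ".fasta", ".fa.gz", ".fasta.gz")):
        problems.append(f"reference does not look like a fasta file: {reference}")
    # Per-sample fastq files: single-end samples only need the first read.
    if not samples:
        problems.append("no fastq files found")
    for sample, reads in sorted(samples.items()):
        if 1 not in reads:
            problems.append(f"sample {sample} has no first-read fastq file")
    # Known variants are optional, so they are only checked when given.
    if known_variants is not None and not known_variants.endswith((".vcf", ".vcf.gz")):
        problems.append(f"known variants do not look like a vcf file: {known_variants}")
    return problems


if __name__ == "__main__":
    files = ["S1_R1.fastq.gz", "S1_R2.fastq.gz", "S2_1.fq", "notes.txt"]
    samples = collect_fastq_files(files)
    for problem in check_inputs("reference.fasta", samples):
        print(problem)
```

The checks work on file names only, so the sketch can be tried without any sequence data; a real pre-flight check would also verify that the files exist and are readable.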

Process and available tools: [figure: overview of the pipeline steps and tools]

Typical output:

  • Variant calls (vcf), raw and filtered, optionally with annotations
  • MultiQC report (includes summaries of most other tools, and of the final vcf)
  • Snakemake report (optional)

Intermediate output files such as bam files are also kept by default, and mpileup files can optionally be created. In addition to the tools above, some tools are used as glue between the steps. If you are interested in the details, have a look at the snakemake rules for each step.

Citation

When using grenepipe, please cite:

grenepipe: A flexible, scalable, and reproducible pipeline
to automate variant calling from sequence reads.

Lucas Czech and Moises Exposito-Alonso. Bioinformatics. 2022.
doi:10.1093/bioinformatics/btac600 [pdf]

Furthermore, please also cite all tools that you selected for your analysis. See our Wiki for their references.