Grandeur

Named after the beautiful Grandeur Peak

Location: 40.707, -111.76, 8,299 ft (2,530 m) summit

More information about the trail leading up to this landmark can be found at https://utah.com/hiking/grandeur-peak

Grandeur is a Nextflow workflow developed by @erinyoung at the Utah Public Health Laboratory. "Grandeur" is intended to be a species-agnostic sequencing analysis workflow for quality control and assurance (QC) and serotyping of paired-end Illumina sequencing data in a local public health laboratory.

"Grandeur" is meant to augment CDC's PHOENIX Nextflow workflow, and this is the officially recommended usage. In principle, the contigs generated by PHOENIX undergo additional quality metric and serotyping steps, with a heavy emphasis on fastANI and AMRFinderPlus.

"Grandeur" can also be a standalone workflow that takes paired-end Illumina reads, removes adaptors with fastp and PhiX with bbduk, and creates contigs through de novo assembly of the reads with spades.

"Grandeur" is also available as a workflow in the staphb-toolkit.

Dependencies

Nextflow
Singularity or Docker

Usage

The default workflow takes fastq files, runs them through QC, serotyping, and related steps, and creates contig files:

# using singularity
nextflow run UPHL-BioNGS/Grandeur -profile singularity --reads <path to reads>
# using docker, starting from fasta files instead of reads
nextflow run UPHL-BioNGS/Grandeur -profile docker --fastas <path to fastas>

Commonly adjusted parameters

params.sample_sheet / --sample_sheet : specify sample sheet with sample id, forward reads in fastq.gz format, and reverse reads in fastq.gz format
params.outdir / --outdir : specify directory where results are saved (basic result patterns are grandeur/analysis/sample*)
params.reads / --reads : specify directory with paired-end files
params.fastas / --fastas : specify directory with fasta files
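
As a rough illustration of the sample sheet option, the sketch below builds a three-column CSV (sample id, forward reads, reverse reads) from a directory of paired-end files. The header names `sample,fastq_1,fastq_2`, the `_R1`/`_R2` file naming, and the helper name `make_sample_sheet` are assumptions made for this example, not something stated here; check the wiki for the exact format Grandeur expects.

```shell
#!/bin/sh
# Sketch: build a sample sheet from a directory of paired-end reads.
# Assumptions (hypothetical, not taken from the Grandeur docs):
#   - reads are named <sample>_R1*.fastq.gz and <sample>_R2*.fastq.gz
#   - the header row is "sample,fastq_1,fastq_2"

# The function body runs in a subshell, so its variables do not leak.
make_sample_sheet() (
  reads_dir="$1"
  echo "sample,fastq_1,fastq_2"
  for r1 in "$reads_dir"/*_R1*.fastq.gz; do
    # If nothing matched, the glob stays literal; skip it.
    [ -e "$r1" ] || continue
    base="$(basename "$r1")"
    # Sample id is everything before the first "_R1".
    sample="${base%%_R1*}"
    # Derive the reverse-read file name by swapping the first _R1 for _R2.
    r2="$reads_dir/$(printf '%s' "$base" | sed 's/_R1/_R2/')"
    if [ -f "$r2" ]; then
      echo "$sample,$r1,$r2"
    else
      echo "warning: no reverse reads found for $sample, skipping" >&2
    fi
  done
)
```

A possible way to use it: run `make_sample_sheet /path/to/reads > sample_sheet.csv`, then pass the file with `--sample_sheet sample_sheet.csv`. Samples without a matching reverse-read file are skipped with a warning on stderr.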

Not-as-commonly adjusted parameters

params.kraken2_db / --kraken2_db : specify directory of kraken2 database
params.blast_db / --blast_db : specify directory of blast database (must accompany value for params.blast_db_type)
params.mash_db / --mash_db : specify reference file for mash
params.current_datasets / --current_datasets : set to false to avoid downloading genomes from NCBI
params.iqtree2_outgroup / --iqtree2_outgroup : set outgroup for iqtree2
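
Because these parameters point at local databases, it can help to confirm the paths exist before starting a long run. The sketch below is one way to do that; the helper name `require_path` and all paths are hypothetical placeholders, and only flags listed above appear in the commented launch command.

```shell
#!/bin/sh
# Sketch: pre-flight check for database paths before launching Grandeur.
# require_path is a hypothetical helper; it is not part of Grandeur.

require_path() {
  # $1 = label used in messages, $2 = path that must exist (file or directory)
  if [ ! -e "$2" ]; then
    echo "error: $1 not found: $2" >&2
    return 1
  fi
  echo "ok: $1 -> $2"
}

# Example launch, left commented so the snippet has no side effects.
# All paths are placeholders:
# require_path kraken2_db /path/to/kraken2_db &&
# require_path mash_db /path/to/mash_reference &&
# nextflow run UPHL-BioNGS/Grandeur -profile singularity \
#   --reads /path/to/reads \
#   --kraken2_db /path/to/kraken2_db \
#   --mash_db /path/to/mash_reference \
#   --current_datasets false
```

Chaining the checks with `&&` means the workflow only starts if every path is present.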

Wiki sections

The README got too long, so most of it has been moved to a wiki, which covers several topics.

Problems

Please submit any problems as issues on the repository (or find us on Slack).

Acknowledgements

Grandeur wouldn't be possible without the following tools:

amrfinderplus - identification of genes associated with antimicrobial resistance
bbduk - removal of PhiX
blastn - read identification with blobtools
blobtools - contamination
circulocov - coverage determination
datasets - downloads genomes from NCBI
drprg - TB AMR predictions
elgato - Legionella pneumophila Sequence Based Typing (SBT)
emmtyper - Group A Strep "emm" typing
fastani - species evaluator
fastp - cleaning reads
fastqc - fastq file QC
heatcluster - visualizes SNP matrix from SNP dists
iqtree2 - phylogenetic tree creation - used after core genome alignment
kleborate - Klebsiella serotyping
kraken2 - contamination
mash - species identifier
mashtree - tree based on mash distances (not impacted by size of core genome)
mlst - identification of MLST subtype
multiqc - summarizes QC efforts
mykrobe - Mycobacterium subtyping
panaroo - core genome alignment - optional (set with params.msa = true)
pbptyper - Penicillin Binding Protein (PBP) typer for Streptococcus pneumoniae assemblies
phytreeviz - basic tree visualization
plasmidfinder - MLST typing for plasmids
prokka - gene annotation - used for core genome alignment
- will be replaced with bakta in a future release
quast - contig QC
seqsero2 - Salmonella serotyping
serotypefinder - E. coli serotyping
shigatyper - Shigella serotyping
snp-dists - SNP matrix - used after core genome alignment
spades - de novo assembly

The expected tools are split into multiple processes. Each process has its own wiki page that we encourage users to view.
