Workflow and scripts for processing eDNA metabarcoding data from Marine Protected Areas in Nova Scotia, Canada.
#1. Study sites
As of 2022, data are included from the Eastern Shore Islands Area of Interest (AOI), St. Anns Bank MPA, the Fundian Channel-Browns Bank AOI, and the Gully MPA.
#2. Bioinformatics
We use a combination of the R package dada2 and the QIIME 2 pipeline to trim raw reads, denoise them, and assign taxonomy to the resulting sequences.
- Import the data and, if it is already de-multiplexed, summarize it. Check read quality with FastQC.
- Use cutadapt either on its own or in QIIME to remove primers and/or adapters.
- Then use dada2 to denoise the paired-end sequences, making sure to trim/truncate them to an appropriate length.
- Align the sequences with MAFFT and build an unrooted phylogenetic tree from the alignment.
- Conduct diversity analyses (alpha and beta diversity, PCoA, etc.).
- Assign taxonomy to our sequence features using a classifier, BLAST, or FuzzyID2.
- The RESCRIPt plugin for QIIME was used to create a reference database of 12S and 16S fish sequences. The downloaded sequences can be filtered and evaluated before they are used to assign taxonomic classifications to our metabarcodes.
- In QIIME, we run the feature-classifier plugin on our representative sequences with the reference classifier object, then generate a TSV table of classifications.
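
The QIIME 2 portion of the steps above could be laid out roughly as follows. This is a minimal sketch rather than the exact commands used in this repository: it only assembles the command lines and prints them (a dry run), so nothing is executed. All file names, the primer placeholders, the truncation lengths, and the sampling depth are hypothetical and need to be replaced with values suited to the marker and run. It also assumes the `align-to-tree-mafft-fasttree` pipeline for the tree and the `classify-sklearn` method for taxonomy; BLAST or FuzzyID2 would need different calls.

```python
import shlex


def qiime_commands(
    demux="demux.qza",            # hypothetical: imported, de-multiplexed reads
    primer_f="FORWARD_PRIMER",    # placeholder, not a real primer sequence
    primer_r="REVERSE_PRIMER",    # placeholder, not a real primer sequence
    trunc_len_f=0,                # 0 disables truncation in dada2
    trunc_len_r=0,
    sampling_depth=1000,          # hypothetical rarefaction depth
    metadata="metadata.tsv",      # hypothetical sample metadata file
    classifier="classifier.qza",  # hypothetical trained reference classifier
):
    """Return QIIME 2 CLI calls for the workflow as lists of arguments."""
    return [
        # Summarize the de-multiplexed reads (per-sample counts, quality plots).
        ["qiime", "demux", "summarize",
         "--i-data", demux,
         "--o-visualization", "demux.qzv"],
        # Remove primers from both reads with cutadapt.
        ["qiime", "cutadapt", "trim-paired",
         "--i-demultiplexed-sequences", demux,
         "--p-front-f", primer_f,
         "--p-front-r", primer_r,
         "--o-trimmed-sequences", "trimmed.qza"],
        # Denoise paired-end reads with dada2, truncating to the chosen lengths.
        ["qiime", "dada2", "denoise-paired",
         "--i-demultiplexed-seqs", "trimmed.qza",
         "--p-trunc-len-f", str(trunc_len_f),
         "--p-trunc-len-r", str(trunc_len_r),
         "--o-table", "table.qza",
         "--o-representative-sequences", "rep-seqs.qza",
         "--o-denoising-stats", "denoising-stats.qza"],
        # Align with MAFFT and build unrooted and rooted trees.
        ["qiime", "phylogeny", "align-to-tree-mafft-fasttree",
         "--i-sequences", "rep-seqs.qza",
         "--o-alignment", "aligned-rep-seqs.qza",
         "--o-masked-alignment", "masked-aligned-rep-seqs.qza",
         "--o-tree", "unrooted-tree.qza",
         "--o-rooted-tree", "rooted-tree.qza"],
        # Alpha and beta diversity metrics plus PCoA plots.
        ["qiime", "diversity", "core-metrics-phylogenetic",
         "--i-phylogeny", "rooted-tree.qza",
         "--i-table", "table.qza",
         "--p-sampling-depth", str(sampling_depth),
         "--m-metadata-file", metadata,
         "--output-dir", "core-metrics-results"],
        # Assign taxonomy to the representative sequences.
        ["qiime", "feature-classifier", "classify-sklearn",
         "--i-classifier", classifier,
         "--i-reads", "rep-seqs.qza",
         "--o-classification", "taxonomy.qza"],
        # Export the classifications; the output directory holds a TSV table.
        ["qiime", "tools", "export",
         "--input-path", "taxonomy.qza",
         "--output-path", "exported-taxonomy"],
    ]


# Dry run: print each command as a shell-safe line instead of executing it.
for cmd in qiime_commands(trunc_len_f=120, trunc_len_r=110):
    print(shlex.join(cmd))
```

To actually run a step, each argument list can be passed to `subprocess.run(cmd, check=True)` inside an activated QIIME 2 environment. Keeping the commands as data makes it easy to review the parameters before anything touches the reads.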