Skip to content

Third generation sequencing techniques rapidly evolved as a common practice in molecular biology. Great advances have been made in terms of feasibility, cost, throughput, and read-length. However, sample contamination still poses a big issue: it complicates correct, high-quality downstream analysis of sequencing data and usage in medical applica…

License

Notifications You must be signed in to change notification settings

mjoppich/sequ-into

 
 

Repository files navigation

sequ-into - A straightforward desktop app for third generation sequencing read contamination analysis

Build Status Documentation Status DOI GitHub Maintainability

Download

Releases are available in the Release section of this repository.

Installation

After downloading the release from above, follow the installation instruction for your operating system.

sequ-into

How-To

You can find step-by-step installation instructions in our documentation. There we also provide instructions on how to use sequ-into.

We provide a video explaining how to use sequ-into with different input, particularly multi-FAST5, single-FAST5 and basecalled tmp-files:

Youtube Link

This use-case demonstrates how to setup sequ-into for analysing basecalled reads during sequencing:

Youtube Link

And we also have a video demonstrating how to use sequ-into to detect ribosomal RNA from single-FAST5 files:

Youtube Link

We demonstrate that sequ-into can be run while sequencing in the following video:

Youtube Link

Description

Third generation sequencing techniques rapidly evolved as a common practice in molecular biology. Great advances have been made in terms of feasibility, cost, throughput, and read-length. However, sample contamination still poses a big issue: it complicates correct, high-quality downstream analysis of sequencing data and usage in medical applications. Furthermore, it might be unclear weather the sequenced reads represent the intended target. To address these issues we developed a cross-platform desktop application: Sequ-Into. Reads originating from unwanted sources are detected and summarized by a comprehensive statistical overview, but can also be filtered and exported in standardized FASTQ-format to facilitate custom evaluation of experimental findings. This holds also true for an evaluation weather the reads consist of the intended source, and allows for a positive selection of those reads who do. Sequ-Into creates a straightforward user experience by fusing an intuitive graphical-user-interface with state-of-the-art long-read alignment software.

The app was implemented in the context of our iGEM project, where several DNA purification protocols were evaluated with Sequ-Into and thus allowed iterative engineering cycles leading to a so far unreached purification of up to 96% (bases sequenced) in our probes. To read more about Phactory, please follow this link: http://2018.igem.org/Team:Munich

Features

  • investigate FASTQ or FAST5 format files
  • start calculations for several experiments in parallel
  • examine single files or many files at once
  • map your read files to default E.Coli K12 genome
  • map your read files to your own references and save them in the app for future use
  • map your read files against DNA as well as against RNA references
  • get an statistical overview on the results that is also visualized
  • save only those extracted reads you need for your further analysis (reads that either aligned, or didn't align to the chosen reference)
  • you may also use the app/data/ContamTool.py script from command-line

Employed Software and Modules

Acknowledgement

The app framework is based on: irath96: Electron Biolerplate https://github.com/irath96/electron-react-typescript-boilerplate

We would like to thank the iGEM Munich 2018 team and especially our supervisors for the hard work, support and the possibility to work with novel sequencing data.

Maintainers

License

MIT © Rita, Julia, Markus

About

Third generation sequencing techniques rapidly evolved as a common practice in molecular biology. Great advances have been made in terms of feasibility, cost, throughput, and read-length. However, sample contamination still poses a big issue: it complicates correct, high-quality downstream analysis of sequencing data and usage in medical applica…

Topics

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages

  • TypeScript 49.6%
  • Python 41.5%
  • JavaScript 7.5%
  • Dockerfile 0.4%
  • HTML 0.4%
  • CSS 0.3%
  • Other 0.3%