riboraptor : a pipeline for analysing ribosome profiling data

Python package to analyse ribosome profiling data. Most of the functionality has been ported to ribotricer.

Installation

Setting up conda

  1. Install conda. The easiest way is the Miniconda package; the Python 3 version is recommended.
  2. Set up channels. It is important to add them in this order:
conda config --add channels r
conda config --add channels defaults
conda config --add channels conda-forge
conda config --add channels bioconda
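
The order matters because `conda config --add` prepends each channel to the list, so the channel added last ends up with the highest priority. The sketch below only models that prepend behaviour in plain Python; the helpers `add_channel` and `resulting_priority` are hypothetical names for illustration and do not call conda. It also assumes an initially empty channel list.

```python
def add_channel(channels, name):
    """Model `conda config --add channels NAME`: put NAME first,
    dropping any earlier occurrence of the same channel."""
    return [name] + [c for c in channels if c != name]


def resulting_priority(added_in_order):
    """Apply the additions in order and return the final list,
    highest-priority channel first."""
    channels = []  # assumption: no channels configured beforehand
    for name in added_in_order:
        channels = add_channel(channels, name)
    return channels


# Same order as the four commands above.
priority = resulting_priority(["r", "defaults", "conda-forge", "bioconda"])
print(priority)  # ['bioconda', 'conda-forge', 'defaults', 'r']
```

To inspect the list conda actually ended up with, `conda config --show channels` prints it in priority order.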

Installing dependencies

We will create a separate conda environment for running riboraptor, also named riboraptor. If you already have a conda environment named riboraptor, you can delete it by running:

source deactivate riboraptor && conda env remove -n riboraptor

We will now install all the dependencies:

conda create --name riboraptor python=3 gcc matplotlib numpy pandas pybedtools \
pyBigWig pyfaidx pysam scipy seaborn statsmodels six click click-help-colors htseq biopython bx-python \
h5py joblib trackhub pytest snakemake sra-tools star fastqc trim-galore ucsc-bedgraphtobigwig ucsc-bedsort \
ucsc-bigwigmerge bamtools pysradb && source activate riboraptor

We also have the following two dependencies for processing and downloading SRA datasets:

  1. Aspera Connect: enables '.fasp' downloads from SRA

Linux download link: https://download.asperasoft.com/download/sw/connect/3.7.4/aspera-connect-3.7.4.147727-linux-64.tar.gz

  2. SRAdb: fetches all experiments of an SRA project with the associated metadata

Since there is currently a bug in bioconductor-sradb, we will install it from GitHub.

git clone https://github.com/seandavi/SRAdb
cd SRAdb

Start R and install SRAdb from within R using devtools. Make sure your riboraptor environment is already activated (source activate riboraptor):

library(devtools)
devtools::install(".")

And finally, we need two metadata files for processing SRA records:

mkdir riboraptor-data && cd riboraptor-data
wget -c http://starbuck1.s3.amazonaws.com/sradb/GEOmetadb.sqlite.gz && gunzip GEOmetadb.sqlite.gz
wget -c https://starbuck1.s3.amazonaws.com/sradb/SRAmetadb.sqlite.gz && gunzip SRAmetadb.sqlite.gz
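
Both files are large, so an interrupted download or a partial gunzip is worth ruling out before going further. One way to do that, sketched here with Python's standard `sqlite3` module, is to open each file read-only and list its tables. The helper name `list_tables` is hypothetical, and the sketch makes no assumption about which tables the files should contain.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return the table names in a SQLite file, sorted by name.

    Raises FileNotFoundError if the file is missing and
    sqlite3.DatabaseError if it is not a readable SQLite database.
    """
    db_path = Path(db_path)
    if not db_path.is_file():
        raise FileNotFoundError(db_path)
    # Open read-only so the check can never modify the metadata file.
    uri = db_path.resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


if __name__ == "__main__":
    # Run from inside riboraptor-data.
    for filename in ("GEOmetadb.sqlite", "SRAmetadb.sqlite"):
        try:
            print(filename, list_tables(filename))
        except (FileNotFoundError, sqlite3.DatabaseError) as exc:
            print(filename, "not readable:", exc)
```

An empty table list or a "not readable" message suggests the download should be repeated.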

Installing riboraptor

source activate riboraptor
git clone https://github.com/saketkc/riboraptor.git
cd riboraptor
python setup.py install --single-version-externally-managed --record=record.txt

We will assume you have the following directory structure for the rest of our analysis:

| some_root_directory
| ├── riboraptor
| │   ├── snakemake
| │   └── setup.py
| ├── riboraptor-data
| │   ├── GEOmetadb.sqlite
| │   └── SRAmetadb.sqlite
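
A quick way to confirm the layout above before running anything is to check that each expected path exists. This is a minimal sketch: the `EXPECTED` list simply mirrors the tree shown here, and the helper `missing_paths` is a hypothetical name, not part of riboraptor.

```python
from pathlib import Path

# Paths relative to some_root_directory, copied from the tree above.
EXPECTED = [
    "riboraptor/snakemake",
    "riboraptor/setup.py",
    "riboraptor-data/GEOmetadb.sqlite",
    "riboraptor-data/SRAmetadb.sqlite",
]


def missing_paths(root):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    # Assumes the script is run from some_root_directory.
    missing = missing_paths(".")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Directory structure looks complete.")
```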

Using riboraptor

Usage mode 1: use riboraptor as a Snakemake based workflow

See example workflow

Usage mode 2: use riboraptor as a standalone toolkit

See: https://riboraptor.readthedocs.io/en/latest/

Usage mode 3: ribopod - database

In progress: http://ribopod.usc.edu/

Features

See: https://riboraptor.readthedocs.io/en/latest/cmd-manual.html