To run CircWorkflow, the following programs are required:
- Snakemake >= 6.1.1
- FastQC 0.11.9
- MultiQC 1.10.1
- TrimGalore 0.6.4
- Burrows-Wheeler Aligner (BWA) 0.7
- CIRI2 2.0.6
- CIRCexplorer2 2.3.8
- CIRIquant 1.1.2
However, we do not recommend installing any of them manually. The best option is to let Conda environments handle them automatically.
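Snakemake is the workflow management system used to implement CircWorkflow. It uses a domain-specific language based on Python to create transparent workflows, ensuring the reproducibility and automation of data analysis. Because Snakemake runs on Python, install Python 3 and pip first (the commands below are for Debian/Ubuntu systems):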
sudo apt-get update
sudo apt-get install python3.8 python3-pip
To make the remaining programs easier to install and manage, we will use the Anaconda distribution and the conda package manager.
Anaconda is a Python and R distribution that ships a pre-built, pre-configured collection of packages ready to be installed and used on a system. Anaconda uses a package manager named `conda`, which can build and manage software written not only in Python but in any programming language. We will also use another package manager, `mamba`, to make the installation of the `snakemake` tool easier.
Package manager: a tool that automates the process of installing, updating, and removing packages.
For other operating systems, check the manual page. To install the Anaconda distribution on Linux, follow these steps:
- Download the Anaconda installer for Linux (here).
- Open a terminal window (Ctrl+Alt+T) and go to the directory containing the downloaded installer.
- Run:
bash Anaconda-latest-Linux-x86_64.sh
conda update conda
- Follow the prompts on the installer screens. If in doubt, accept the default settings. They can be changed later.
- To make the changes take effect, close and re-open your terminal window.
- Test your installation by running `conda list` in your terminal window. If Anaconda has been installed correctly, it prints the list of installed packages.
To make the programs bundled with the Anaconda distribution visible to your system and runnable from any directory, add the Anaconda path to your `$PATH` bash variable:
- Open the following file with root privileges:
sudo nano /etc/profile
- Add the following line at the top of the file:
export PATH=$PATH:/your/path/to/anaconda3/bin:/your/path/to/anaconda3/condabin
This modification is system-wide: /etc/profile is read at login for every current and future user. To modify the `$PATH` variable only for a specific user, edit that user's own profile file instead (no root privileges are needed):
nano ~/.profile
# Add at the top of the profile file:
export PATH=$PATH:/your/path/to/anaconda3/bin
- Once the profile file has been modified, reload it and check that `$PATH` now includes the path of the Anaconda distribution:
source /etc/profile
echo $PATH
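- Add the repositories (called channels) from which packages will be installed. The order matters: each `--add` places the channel at the top of the list, so the last channel added (conda-forge) gets the highest priority.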
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
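The two checks described in the steps above (that the Anaconda directories are on `$PATH`, and that the channels were registered) can be sketched as a small script. This is only an illustration: `ANACONDA_HOME` and `in_path` are hypothetical names introduced here, and the default location `$HOME/anaconda3` is an assumption that should be replaced with your actual installation path.

```shell
#!/usr/bin/env bash
# Sketch: confirm the Anaconda directories are on $PATH and list conda's channels.
# ANACONDA_HOME is a hypothetical variable; point it at your own installation.
ANACONDA_HOME="${ANACONDA_HOME:-$HOME/anaconda3}"

# Return 0 if directory $1 is one of the colon-separated entries of $PATH.
in_path() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

for dir in "$ANACONDA_HOME/bin" "$ANACONDA_HOME/condabin"; do
    if in_path "$dir"; then
        echo "on PATH:  $dir"
    else
        echo "missing:  $dir"
    fi
done

# List the configured channels, highest priority first (only if conda is available).
if command -v conda >/dev/null 2>&1; then
    conda config --show channels
fi
```

Wrapping `$PATH` in colons before matching avoids false positives from partial matches, such as a directory that merely starts with the same prefix.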
To simplify the installation of Snakemake, we will use the `mamba` package manager instead of the default `conda` manager.
conda update conda
conda install mamba
mamba install -c conda-forge -c bioconda snakemake
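After the installation, it may be worth confirming that the installed Snakemake meets the minimum version listed in the requirements (6.1.1). The following is a minimal sketch, assuming GNU `sort` (for its `-V` version ordering) is available; `version_ge` and `MIN_VERSION` are hypothetical names used only for this example.

```shell
#!/usr/bin/env bash
# Sketch: check that the installed Snakemake meets a minimum version.
# Relies on GNU `sort -V` (version sort).

# Return 0 if version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

MIN_VERSION="6.1.1"   # minimum Snakemake version from the requirements list

if command -v snakemake >/dev/null 2>&1; then
    installed="$(snakemake --version)"
    if version_ge "$installed" "$MIN_VERSION"; then
        echo "Snakemake $installed satisfies >= $MIN_VERSION"
    else
        echo "Snakemake $installed is older than $MIN_VERSION"
    fi
else
    echo "snakemake not found on PATH"
fi
```

The comparison sorts the two version strings and checks that the minimum comes first, which handles multi-digit components such as 6.10.0 correctly, unlike a plain string comparison.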
- Download and unzip this repository.
- Modify the config file located at `workflow/config/config.yaml`. Specify the workflow module that you want to execute and complete the general and specific configurations.
- Perform a dry run to ensure that there are no hidden problems:
snakemake -n
- If the dry run reports no problems, execute the workflow, replacing NUMBER with the number of CPU cores to use:
snakemake --cores NUMBER --use-conda
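The two steps above can be combined so that the real run starts only when the dry run succeeds. The following is a minimal sketch under that assumption; `run_workflow` is a hypothetical helper name and is not part of CircWorkflow.

```shell
#!/usr/bin/env bash
# Sketch: execute the workflow only if the dry run succeeds.
# run_workflow is a hypothetical helper, not part of CircWorkflow.

run_workflow() {
    local cores="${1:-1}"   # number of cores to use; defaults to 1

    # A dry run (-n) only lists the planned jobs; nothing is executed.
    if ! snakemake -n; then
        echo "Dry run failed; fix the reported problems first." >&2
        return 1
    fi

    # Real run, letting Snakemake create the Conda environments it needs.
    snakemake --cores "$cores" --use-conda
}

# Example call using every available core (nproc is part of GNU coreutils):
# run_workflow "$(nproc)"
```

The example call is left commented out so that sourcing the script does not start the workflow by accident.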