franciscozorrilla/SymbNET

🧬 SymbNET 🔬 From Metagenomics to Metabolic Interactions
💻 EMBL-EBI Virtual Course (Day 5)

💰 Learning Outcomes

  • Generate genome-scale metabolic models (GEMs) from metagenome assembled genomes (MAGs)
  • Predict metabolic interactions within bacterial communities
  • Characterize communities using a competition-cooperation plot
  • Explore uncertainty in GEM reconstruction and simulation
  • Weigh the pros and cons of using reference genomes vs metagenome-assembled or single-amplified genomes for metabolic modeling

🍬 Recommended Software

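| Tool      | Task                                       | GitHub | Reference |
|-----------|--------------------------------------------|--------|-----------|
| CarveMe   | Build GEMs from MAGs                       | Repo   | Paper     |
| SMETANA   | Predict metabolic interactions between GEMs | Repo   | Paper     |
| metaGEM   | Metagenomic metabolic modeling workflow    | Repo   | Paper     |
| Snakemake | Workflow management and reproducibility    | Repo   | Paper     |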

Pictured below is the metaGEM workflow for reconstructing and simulating metagenome-based metabolic models. This training module focuses on how to generate and simulate communities of metabolic models using CarveMe and SMETANA.

🗻 Contents

  • /genomes/ ORF annotated protein fasta files (.faa)
  • /models/ pre-generated CarveMe GEMs for simulation (.xml)
  • /ensembles/ pre-generated ensemble models for network uncertainty (.xml)
  • /simulations/ pre-computed SMETANA simulations (.tsv)
  • /data/ various metadata, taxonomic assignments, media files, etc.
  • /scripts/ markdown, r-markdown, and jupyter notebooks for each exercise
  • /plots/ visualization of results

☑️ Exercises

Part I

  1. Start by cloning this repo
  2. Use CarveMe to generate GEMs for a bacterial community
  3. Visualize model metrics across species
  4. Use the SMETANA detailed algorithm to predict metabolic interactions between species
  5. Visualize detailed interactions with alluvial diagrams
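
The command-line steps of Part I can be sketched as follows. This is a minimal sketch rather than the course's own script: the helper names `build_models` and `detailed_interactions`, the `DRY_RUN` switch, and the directory arguments are hypothetical, the clone URL is assumed from the repository name, and `carve` and `smetana` are assumed to be on your PATH. Only the basic `carve <genome.faa> -o <model.xml>` and `smetana <models> --detailed -o <prefix>` forms are used; consult `carve -h` and `smetana -h` for media and gap-filling options.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; set DRY_RUN=1 to print the commands instead of running them.

# Step 1 (clone), assuming the GitHub URL that matches the repository name:
#   git clone https://github.com/franciscozorrilla/SymbNET.git

# Step 2: run one CarveMe reconstruction per protein FASTA file in a folder.
build_models() {
  local in_dir="$1" out_dir="$2" faa name
  for faa in "$in_dir"/*.faa; do
    [ -e "$faa" ] || continue            # the glob matched nothing
    name="$(basename "$faa" .faa)"
    ${DRY_RUN:+echo} carve "$faa" -o "$out_dir/$name.xml"
  done
}

# Step 4: run the SMETANA detailed algorithm on all models of one community.
detailed_interactions() {
  local model_dir="$1" prefix="$2"
  ${DRY_RUN:+echo} smetana "$model_dir"/*.xml --detailed -o "$prefix"
}
```

For example, `DRY_RUN=1 build_models genomes models` prints the `carve` calls it would make without running them, which is a cheap way to check paths before starting reconstructions that may take a while.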

Part II

  1. Use the SMETANA global algorithm to generate MIP & MRO metrics for a community of GEMs
  2. Visualize communities on a cooperation-competition plot
  3. Generate ensemble models (optional)
  4. Quantify network uncertainty (optional)
  5. Discussion of methods, results, and interpretation
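
For Part II, the global algorithm can be run once per community and the resulting scores gathered into a single table for plotting. The sketch below uses hypothetical helper names and assumes that `smetana --global -o <prefix>` writes a tab-separated file named `<prefix>_global.tsv` whose header row contains `mip` and `mro` columns; verify this against your own output or the files under /simulations/ before relying on it.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; set DRY_RUN=1 to print the smetana command instead of running it.

# Run the SMETANA global algorithm on all models of one community.
global_scores() {
  local model_dir="$1" prefix="$2"
  ${DRY_RUN:+echo} smetana "$model_dir"/*.xml --global -o "$prefix"
}

# Print "label<TAB>mip<TAB>mro" for every data row of each given *_global.tsv file.
# Columns are located by header name (assumed: "mip" and "mro"), not by position.
collect_scores() {
  local f
  for f in "$@"; do
    awk -F'\t' -v label="$(basename "$f" _global.tsv)" '
      BEGIN { OFS = "\t" }
      NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
      ("mip" in col) && ("mro" in col) { print label, $(col["mip"]), $(col["mro"]) }
    ' "$f"
  done
}
```

Calling `collect_scores` on several `*_global.tsv` files yields one line per community, which can be loaded directly into R or Python to draw the cooperation-competition plot.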

⛏️ Datasets

The following table describes the 6 small bacterial communities, each of 5 species, that we will consider for metabolic modeling. These include MAG-, SAG-, and reference genome-based communities; the samples span the human gut, kefir, and soil habitats.

| Microbiome | Genome type       | Condition                                       | Species |
|------------|-------------------|-------------------------------------------------|---------|
| Human gut  | MAGs              | Normal Glucose Tolerance (NGT, ERR260255)       | B. uniformis, R. bromii, B. wexlerae, E. rectale, F. saccharivorans |
| Human gut  | MAGs              | Impaired Glucose Tolerance (IGT, ERR260172)     | B. uniformis, R. bromii, B. wexlerae, E. rectale, F. saccharivorans |
| Human gut  | MAGs              | Type II Diabetes (T2D, ERR260140)               | B. uniformis, R. bromii, B. wexlerae, E. rectale, F. saccharivorans |
| Human gut  | Reference genomes | Reference genomes taken from RefSeq             | B. uniformis, R. bromii, B. wexlerae, E. rectale, F. saccharivorans |
| Kefir      | SAGs              | Fermented with German grains (GER6)             | L. mesenteroides, L. lactis, A. fabarum, L. kefiranofaciens, L. kefiri |
| Soil       | MAGs              | Calcarosols from Uluru, Australia (ERR671933)   | f_Thermoleophilaceae, f_Herpetosiphonaceae, f_Phormidiaceae, f_Geodermatophilaceae, f_Rubrobacteraceae |

🏄 Metabolic Modeling Repos

Tools

  • metaGEM: Reconstruction and simulation of genome scale metabolic models directly from metagenomes
  • DesignMC: Design microbial communities for production of specific target compounds using GEMs
  • HiOrCo: Compute higher-order co-occurrence using abundance across samples
  • Reframed: Metabolic modeling package

Resources

Please cite literature if you make use of relevant tools and/or resources.

📚 Suggested Reading

  • Intro to FBA: What is flux balance analysis?
  • CarveMe: Fast automated reconstruction of genome-scale metabolic models for microbial species and communities
  • SMETANA: Metabolic dependencies drive species co-occurrence in diverse microbial communities
  • metaGEM: Reconstruction of genome scale metabolic models directly from metagenomes
  • Human gut study: Nutritional preferences of human gut bacteria reveal their metabolic idiosyncrasies
  • Kefir study: Metabolic cooperation and spatiotemporal niche partitioning in a kefir microbial community
  • Cooccurrence study: Polarization of microbial communities between competitive and cooperative metabolism

🚛 Software Requirements

The following software will be pre-installed on your virtual machines. In the future, you can set up these requirements in a conda environment on your cluster or local machine using the recipe files under the /conda/ subdirectory. The exact dependencies and versions may vary with your operating system. For example, to set up a conda environment on an M1 MacBook, assuming you are in the SymbNET repo root folder:

$ conda env create -f conda/osx-64_permissive.yml

Alternatively, you can manually create an environment using conda create:

$ conda create --yes -n symbnet

You can activate the environment using either the source or the conda command:

$ source activate symbnet

Then pip install the software listed below as required, e.g. to install CarveMe and SMETANA:

$ pip install --user carveme smetana

Then conda install the software listed below as required, e.g. to install prodigal and diamond:

$ conda install -c bioconda prodigal diamond

You will also need to obtain a free academic initiative license from IBM to use the academic version of CPLEX, and then follow the installation instructions to set up CPLEX on your local machine or cluster. Check your local cluster's wiki page to see whether it provides a loadable CPLEX module that you can use. CPLEX versions 12.7-12.9 are recommended. Note that the free version of CPLEX that can be obtained with pip install DOES NOT WORK FOR BIOLOGICAL NETWORKS OF OUR SIZE.
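
After installation, a quick sanity check is to confirm that the command-line tools are visible on your PATH. The helper below is a hypothetical sketch, not part of the course material; it only tests whether each named command can be found, and does not verify versions or the CPLEX setup.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which of the given commands are missing from PATH.
# The return status is the number of missing commands (0 means all were found).
check_tools() {
  local tool missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok       $tool"
    else
      echo "MISSING  $tool"
      missing=$((missing + 1))
    fi
  done
  return "$missing"
}
```

With the environment activated, `check_tools carve smetana prodigal diamond` lists each tool as found or missing.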

Core

Dependencies

Additional packages

🇪🇺 About SymbNET

SymbNET is a European network for host-microbe interactions research, funded by the European Union’s Horizon 2020 research and innovation programme. The project, coordinated by FCG-IGC (Instituto Gulbenkian de Ciência, Portugal), brings together the world-leading research institutions EMBL (European Molecular Biology Laboratory, Germany), CAU (Christian-Albrechts-Universität zu Kiel, Germany), and UNIL (Université de Lausanne, Switzerland), and a local widening partner, ITQB NOVA (Instituto de Tecnologia Química e Biológica, Portugal).

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement Nº 952537.

🥼 Contributors

  • Francisco Zorrilla, MRC Toxicology Unit - University of Cambridge
  • Eva-Maria Geissen, Center for Biological Modelling - EMBL Heidelberg
  • Maria Zimmermann-Kogadeeva, EMBL Heidelberg
  • Kiran R. Patil, MRC Toxicology Unit - University of Cambridge