GramNegAccum

Data analysis and prediction of small-molecule accumulation in Gram-negative bacteria, published as:

Predictive Compound Accumulation Rules Yield a Broad-Spectrum Antibiotic. Richter, M. F.; Drown, B. S.; Riley, A. P.; Garcia, A.; Shirai, T.; Svec, R. L.; Hergenrother, P. J. Nature 2017, published on web May 10, 2017.

Analysis of accumulation data

Description of assay

The accumulation of a diverse library of small molecules was measured in an LC-MS assay. E. coli cells were incubated with compounds for 10 min, then washed and lysed. Clarified lysates were analyzed by LC-MS/MS.

Datasets

Several collections of compounds are included in accum/data. These correspond to the published Supplementary Tables.

Name	Compounds	Description
table1	12	Controls for accumulation analysis
table2	100	Initial dataset for accumulation with diversity of functionality
table3	54	SAR analysis that examines specific descriptors
table4	68	Primary amines
table5	79	Common antibiotics excluding beta-lactams
table6	49	Common beta-lactams

Generation of physicochemical descriptors

Initial 3D coordinates and protonation states for molecules were determined using Schrödinger's LigPrep. For mixtures of epimers, the most stable diastereomer was used. Ensembles of conformers were generated using the MOE LowModeMD conformer search (see accum/scripts/conf_search.zsh). Molecular descriptors were then calculated for each conformer and averaged across the ensemble (see accum/scripts/ensemble_average.py). The output data are located at accum/data/table4.csv.
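
The averaging step can be illustrated with a minimal Python sketch. It is not the contents of accum/scripts/ensemble_average.py: the column names (`molecule`, `globularity`, `rotatable_bonds`), the example values, and the use of an unweighted mean are all assumptions made for illustration. The actual script may weight conformers differently.

```python
from collections import defaultdict
from statistics import fmean


def ensemble_average(rows, id_key="molecule"):
    """Average every descriptor over the conformers of each molecule.

    rows: iterable of dicts, one per conformer, each holding a molecule
    identifier under `id_key` plus descriptor name -> numeric value pairs.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for row in rows:
        mol = row[id_key]
        for name, value in row.items():
            if name != id_key:
                grouped[mol][name].append(float(value))
    # Unweighted mean per descriptor (an assumption, not the repo's method).
    return {
        mol: {name: fmean(values) for name, values in descs.items()}
        for mol, descs in grouped.items()
    }


# Hypothetical conformer table: two conformers of cmpd_A, one of cmpd_B.
conformers = [
    {"molecule": "cmpd_A", "globularity": 0.10, "rotatable_bonds": 2},
    {"molecule": "cmpd_A", "globularity": 0.20, "rotatable_bonds": 2},
    {"molecule": "cmpd_B", "globularity": 0.40, "rotatable_bonds": 5},
]
averaged = ensemble_average(conformers)
```

Here `averaged` holds one descriptor dictionary per molecule, so each compound contributes a single row to downstream analysis regardless of how many conformers were generated for it.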

Data preprocessing

All data analysis, model training, and figure generation were performed in R. The distributions and co-correlations of descriptors were examined in accum/analysis/feature_select.R. Descriptors with near-zero variance or high co-correlation were removed to improve model stability.
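
The idea behind this filtering can be sketched in Python as follows. This is a simplified illustration, not a port of accum/analysis/feature_select.R: the thresholds (`var_tol`, `corr_cutoff`), the greedy keep-first strategy, and the descriptor names are assumptions, and the R analysis may use different criteria.

```python
from math import sqrt
from statistics import fmean, pvariance


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def filter_descriptors(columns, var_tol=1e-8, corr_cutoff=0.9):
    """Return the descriptor names to keep, in input order.

    columns: dict mapping descriptor name -> list of values (one per compound).
    """
    # Step 1: drop descriptors whose variance is at or below the tolerance.
    varying = [n for n, v in columns.items() if pvariance(v) > var_tol]
    # Step 2: greedy pass that drops a descriptor when it is strongly
    # correlated with one that has already been retained.
    selected = []
    for name in varying:
        if all(abs(pearson(columns[name], columns[s])) < corr_cutoff
               for s in selected):
            selected.append(name)
    return selected


# Hypothetical descriptor table for four compounds.
columns = {
    "mw": [100, 200, 300, 400],
    "heavy_atoms": [7, 14, 21, 29],         # nearly collinear with mw
    "n_charge": [1, 1, 1, 1],               # constant, zero variance
    "globularity": [0.3, 0.1, 0.4, 0.2],
}
kept = filter_descriptors(columns)
```

In this toy table the constant column is removed in the first step and `heavy_atoms` is removed in the second because it is almost perfectly correlated with `mw`. Which member of a correlated pair is retained depends on column order in this sketch, which is one reason a production analysis would likely use a more principled rule.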

Random forest classification model

A random forest model was trained using the R package caret. Several cross-validation methods were compared (see accum/analysis/compareCV.R), and repeated 10-fold CV (n=20) was selected as the final method. Variable importance was measured to identify molecular features that may contribute to small-molecule accumulation (see accum/analysis/rand_forest.R).
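
To make the resampling scheme concrete, the following Python sketch generates the train/test index splits for repeated 10-fold cross-validation. It is an illustration only and does not reproduce the caret workflow in accum/analysis/compareCV.R: it reads "n=20" as 20 repeats (an assumption), uses a hypothetical sample count of 100, and does not stratify folds by class.

```python
import random


def repeated_kfold(n_samples, k=10, repeats=20, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for r in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # fresh random partition for every repeat
        # Striding over the shuffled list gives k folds of near-equal size.
        folds = [indices[i::k] for i in range(k)]
        for f, test_idx in enumerate(folds):
            train_idx = [i for j, fold in enumerate(folds) if j != f
                         for i in fold]
            yield r, f, train_idx, test_idx


# Hypothetical data set of 100 compounds: 20 repeats x 10 folds.
splits = list(repeated_kfold(n_samples=100))
```

Each compound appears in exactly one test fold per repeat, so every repeat yields a full set of held-out predictions. Averaging performance over the repeats reduces the dependence of the estimate on any single random partition.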
