Toolkit for predicting pKa values of small molecules via graph convolutional networks

mayrf/pkasolver

pkasolver

pkasolver is a package that enables prediction of pKa values of small molecules via graph convolutional networks. This repository and the associated data were used for the preprint "Improving Small Molecule pKa Prediction Using Transfer Learning with Graph Neural Networks".

Due to the restrictive licence of Schrödinger's Epik, we are unfortunately unable to distribute the models that were trained using the transfer learning approach (pkasolver-epic) described in the publication. Instead, the model provided with this repository, which is also used by the Google Colab Jupyter notebook described below, has been trained without the transfer learning step (pkasolver-lite). For details about the limitations of this model, see the publication.

Installation

We recommend using Anaconda to set up a new environment (https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html) and installing the dependencies listed in devtools/conda-envs/test_env.yaml. This can be done with conda env create -f devtools/conda-envs/test_env.yaml. The resulting conda environment is called test; if you want a different name, change the first line in test_env.yaml. After activating the conda environment, the package can be installed with python setup.py install.
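If you prefer not to edit test_env.yaml by hand, the environment name can also be changed with a few lines of Python. The following is a minimal sketch, not part of pkasolver: the helper rename_conda_env is a hypothetical name, and it assumes the file contains a top-level entry of the form name: test, as implied by the environment being called test.

```python
from pathlib import Path


def rename_conda_env(yaml_text: str, new_name: str) -> str:
    """Return yaml_text with its top-level `name:` entry set to new_name.

    Hypothetical helper (not part of pkasolver). Assumes the conda
    environment file has an unindented line starting with `name:`.
    """
    lines = yaml_text.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("name:"):
            lines[i] = f"name: {new_name}"
            break
    else:
        # No top-level name entry: refuse to guess where it should go.
        raise ValueError("no top-level 'name:' entry found")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Path taken from the installation instructions; run from the repository root.
    env_file = Path("devtools/conda-envs/test_env.yaml")
    if env_file.exists():
        # "pkasolver" is only an example name; choose any name you like.
        env_file.write_text(rename_conda_env(env_file.read_text(), "pkasolver"))
```

After renaming, create the environment with conda env create -f devtools/conda-envs/test_env.yaml as before and activate it under the new name.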

How to use pkasolver to predict microstate pKa values

Depending on your needs, you can either use the Jupyter notebook in the notebooks directory, which demonstrates the usage of the pkasolver package (this requires a local installation of pkasolver), or use the provided Google Colab notebook, which automatically installs everything in a hosted environment and lets you use the package for pKa prediction without a local installation: https://github.com/mayrf/pkasolver/blob/main/notebooks/pka_prediction.ipynb

To generate your own models, take a look at the scripts folder in the corresponding data repository: https://github.com/wiederm/pkasolver-data. It contains notebooks to reproduce the plots shown in [placeholder] and Python scripts to prepare the data, train models, and visualize their performance. Note that an active license for Schrödinger's Epik is required to reproduce the transfer learning dataset.

License

This project is licensed under the MIT License - see the LICENSE.md file for details.

Copyright

Copyright (c) 2022, Fritz Mayr, Oliver Wieder, Marcus Wieder, Thierry Langer

Acknowledgements

Project based on the Computational Molecular Science Python Cookiecutter version 1.4.