lucadellalib/actorch

Python version: 3.6 | 3.7 | 3.8 | 3.9 | 3.10

Welcome to actorch, a deep reinforcement learning framework for fast prototyping based on PyTorch. The following algorithms have been implemented so far:


💡 Key features

  • Support for Gymnasium (formerly OpenAI Gym) environments
  • Support for custom observation/action spaces
  • Support for custom multimodal-input, multimodal-output models
  • Support for recurrent models (e.g. RNNs, LSTMs and GRUs)
  • Support for custom policy/value distributions
  • Support for custom preprocessing/postprocessing pipelines
  • Support for custom exploration strategies
  • Support for normalizing flows
  • Batched environments (both for training and evaluation)
  • Batched trajectory replay
  • Batched and distributional value estimation (e.g. batched and distributional Retrace and V-trace)
  • Data parallel and distributed data parallel multi-GPU training and evaluation
  • Automatic mixed precision training
  • Integration with Ray Tune for experiment execution and hyperparameter tuning at any scale
  • Effortless experiment definition through Python-based configuration files
  • Built-in visualization tool to plot performance metrics
  • Modular object-oriented design
  • Detailed API documentation

🛠️️ Installation

For Windows, make sure the latest Visual C++ runtime is installed.

Using Pip

First of all, install Python 3.6 or later. Open a terminal and run:

pip install actorch
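
To confirm that the package is visible to the interpreter you intend to use, a check along the following lines can help. This is only a sketch: it relies on the standard-library importlib.metadata module, which requires Python 3.8 or later, and the helper name installed_version is made up for illustration (it is not part of actorch).

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("actorch")
    print(f"actorch {version}" if version else "actorch is not installed")
```

If the script reports that actorch is not installed, the pip command was most likely run with a different interpreter or virtual environment than the one executing the script.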

Using Conda virtual environment

Clone or download and extract the repository, navigate to <path-to-repository>/bin and run the installation script (install.sh for Linux/macOS, install.bat for Windows). actorch and its dependencies (pinned to a specific version) will be installed in a Conda virtual environment named actorch-env.

NOTE: you can directly use actorch-env and the actorch package in the local project directory for development (see For development).

Using Docker (Linux/macOS only)

First of all, install Docker and NVIDIA Container Runtime. Clone or download and extract the repository, navigate to <path-to-repository>, open a terminal and run:

docker build -t <desired-image-name> .                  # Build image
docker run -it --runtime=nvidia <desired-image-name>    # Run container from image

actorch and its dependencies (pinned to a specific version) will be installed in the specified Docker image.

NOTE: you can directly use the actorch package in the local project directory inside a Docker container run from the specified Docker image for development (see For development).

From source

First of all, install Python 3.6 or later. Clone or download and extract the repository, navigate to <path-to-repository>, open a terminal and run:

pip install .

For development

First of all, install Python 3.6 or later and Git. Clone or download and extract the repository, navigate to <path-to-repository>, open a terminal and run:

pip install -e ".[all]"
pre-commit install -f

This installs the package in editable mode, so any change to the package in the local project directory is automatically reflected in the environment-wide package installed in the site-packages directory of your environment. It also installs the development, test and optional dependencies, along with a Git commit hook: each time you commit, unit tests, static type checkers, code formatters and linters run automatically. Run pre-commit run --all-files to check that the hook was successfully installed. For more details, see pre-commit's documentation.


▶️ Quickstart

In this example we will solve the Gymnasium environment CartPole-v1 using REINFORCE. Copy the following configuration into a file named REINFORCE_CartPole-v1.py (preserving the indentation):

import gymnasium as gym
from torch.optim import Adam

from actorch import *


experiment_params = ExperimentParams(
    run_or_experiment=REINFORCE,
    stop={"training_iteration": 50},
    resources_per_trial={"cpu": 1, "gpu": 0},
    checkpoint_freq=10,
    checkpoint_at_end=True,
    log_to_file=True,
    export_formats=["checkpoint", "model"],
    config=REINFORCE.Config(
        train_env_builder=lambda **config: ParallelBatchedEnv(
            lambda **kwargs: gym.make("CartPole-v1", **kwargs),
            config,
            num_workers=2,
        ),
        train_num_episodes_per_iter=5,
        eval_freq=10,
        eval_env_config={"render_mode": None},
        eval_num_episodes_per_iter=10,
        policy_network_model_builder=FCNet,
        policy_network_model_config={
            "torso_fc_configs": [{"out_features": 64, "bias": True}],
        },
        policy_network_optimizer_builder=Adam,
        policy_network_optimizer_config={"lr": 1e-1},
        discount=0.99,
        entropy_coeff=0.001,
        max_grad_l2_norm=0.5,
        seed=0,
        enable_amp=False,
        enable_reproducibility=True,
        log_sys_usage=True,
        suppress_warnings=True,
    ),
)

Open a terminal in the directory where you saved the configuration file and run (if you installed actorch in a virtual environment, you first need to activate it, e.g. conda activate actorch-env if you installed actorch using Conda):

pip install "gymnasium[classic_control]"  # Install dependencies for CartPole-v1
actorch run REINFORCE_CartPole-v1.py      # Run experiment

NOTE: training artifacts (e.g. checkpoints and metrics) are saved in nested subdirectories. This might cause issues on Windows, where the maximum path length is 260 characters by default. In that case, move the configuration file (or set local_dir) to a directory closer to the drive root (e.g. Desktop), shorten the configuration file name, and/or shorten the algorithm name (e.g. DistributedDataParallelREINFORCE.rename("DDPR")).
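
To estimate whether a given location is likely to hit the Windows limit, the length of a candidate artifact path can be checked before launching a run. The sketch below is a minimal illustration: the helper exceeds_max_path and the directory layout are hypothetical, and the nesting that actorch actually produces may differ.

```python
import ntpath  # Windows path rules, importable on any OS

MAX_PATH = 260  # classic Windows limit, counting the terminating null character


def exceeds_max_path(path, limit=MAX_PATH):
    """Return True if `path` is too long for the classic Windows limit."""
    # One character of the limit is reserved for the terminating null.
    return len(path) >= limit


# Hypothetical layout: the nesting below is illustrative, not the exact
# directory structure that actorch produces.
artifact = ntpath.join(
    r"C:\Users\me\Documents\projects\rl",
    "experiments",
    "REINFORCE_CartPole-v1",
    "<auto-generated-experiment-name>",
    "checkpoint_000050",
    "model.pt",
)
print(len(artifact), exceeds_max_path(artifact))
```

Shortening any component of the path (base directory, configuration file name or algorithm name) reduces the total length by the same amount.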

Wait for a few minutes until the training ends. The mean cumulative reward over the last 100 episodes should exceed 475, which means that the environment was successfully solved. You can now plot the performance metrics saved in the auto-generated TensorBoard (or CSV) log files using Plotly (or Matplotlib):

pip install "actorch[vistool]"  # Install dependencies for VisTool
cd experiments/REINFORCE_CartPole-v1/<auto-generated-experiment-name>
actorch vistool plotly tensorboard

You can find the generated plots in the plots directory.
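
The success criterion mentioned above (mean cumulative reward over the last 100 episodes exceeding 475) can also be checked by hand from a list of per-episode returns. The following is a minimal sketch: is_solved is a hypothetical helper, not an actorch function, and how the returns are extracted from the log files is left out.

```python
from collections import deque


def is_solved(episode_returns, window=100, threshold=475.0):
    """Return True if the mean of the last `window` returns exceeds `threshold`."""
    recent = deque(episode_returns, maxlen=window)  # keeps only the last `window` items
    if len(recent) < window:
        return False  # not enough episodes to apply the criterion
    return sum(recent) / len(recent) > threshold


# Synthetic example: 100 poor episodes followed by 100 perfect ones.
returns = [20.0] * 100 + [500.0] * 100
print(is_solved(returns))
```

Only the most recent window matters, so early low-reward episodes do not prevent a run from counting as solved.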

Congratulations, you ran your first experiment!

See the examples directory for additional configuration files.

HINT: since a configuration file is a regular Python script, you can use all the features of the language (e.g. inheritance).
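
As an illustration of this hint, shared settings can be defined once and specialized with ordinary Python. The sketch below deliberately avoids the actorch API: plain dictionaries stand in for the keyword arguments of the quickstart configuration, and derive_config is a hypothetical helper. In a real configuration file, the resulting values would be passed as keyword arguments in the same way as in the quickstart.

```python
import copy

# Plain dictionaries stand in for the keyword arguments of the quickstart
# configuration; the keys are copied from that example.
base_config = {
    "train_num_episodes_per_iter": 5,
    "policy_network_optimizer_config": {"lr": 1e-1},
    "discount": 0.99,
    "seed": 0,
}


def derive_config(base, **overrides):
    """Return a deep copy of `base` with top-level keys replaced by `overrides`."""
    config = copy.deepcopy(base)  # avoid sharing nested dicts with the base
    config.update(overrides)
    return config


# Three variants that differ only in the seed.
seed_sweep = [derive_config(base_config, seed=seed) for seed in range(3)]

# A variant with a smaller learning rate; the base stays unchanged.
low_lr = derive_config(base_config, policy_network_optimizer_config={"lr": 1e-2})
```

Copying the base before applying overrides keeps each variant independent, which matters when nested dictionaries such as the optimizer settings are modified later.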


🔗 Useful links


@ Citation

@misc{DellaLibera2022ACTorch,
  author = {Luca Della Libera},
  title = {{ACTorch}: a Deep Reinforcement Learning Framework for Fast Prototyping},
  year = {2022},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/lucadellalib/actorch}},
}

📧 Contact

luca.dellalib@gmail.com