rlenvs_from_cpp

rlenvs_from_cpp is an effort to provide implementations and wrappers of reinforcement learning environments for use from C++ drivers. Currently, we provide a small number of wrappers for some common Gymnasium (formerly OpenAI Gym) environments, namely:

FrozenLake with 4x4 map
FrozenLake with 8x8 map
Blackjack
CliffWalking
CartPole
MountainCar
Taxi
StateAggregationCartPole (implements state aggregation for CartPole)
SerialVectorEnvWrapper, a vector wrapper for various environments

In addition there are wrappers for

GymWalk environment from gym_walk
gym-pybullet-drones from gym-pybullet-drones

In general, the environments exposed by the library should abide by the dm_env specification. The following snippet shows how to use the FrozenLake and Taxi environments from Gymnasium.

#include "rlenvs/rlenvs_types_v2.h"
#include "rlenvs/envs/gymnasium/toy_text/frozen_lake_env.h"
#include "rlenvs/envs/gymnasium/toy_text/taxi_env.h"
#include <iostream>
#include <string>
#include <unordered_map>
#include <any>

namespace example_1{

const std::string SERVER_URL = "http://0.0.0.0:8001/api";

void test_frozen_lake(){

    rlenvs_cpp::envs::gymnasium::FrozenLake<4> env(SERVER_URL);

    std::cout<<"Environment URL: "<<env.get_url()<<std::endl;

    // make the environment
    std::unordered_map<std::string, std::any> options;
    options.insert({"is_slippery", true});
    env.make("v1", options);

    std::cout<<"Is environment created? "<<env.is_created()<<std::endl;
    std::cout<<"Is environment alive? "<<env.is_alive()<<std::endl;
    std::cout<<"Number of valid actions? "<<env.n_actions()<<std::endl;
    std::cout<<"Number of states? "<<env.n_states()<<std::endl;

    // reset the environment
    auto time_step = env.reset(42);

    std::cout<<"Reward on reset: "<<time_step.reward()<<std::endl;
    std::cout<<"Observation on reset: "<<time_step.observation()<<std::endl;
    std::cout<<"Is terminal state: "<<time_step.done()<<std::endl;

    //...print the time_step
    std::cout<<time_step<<std::endl;

    // take an action in the environment
    auto new_time_step = env.step(rlenvs_cpp::envs::gymnasium::FrozenLakeActionsEnum::RIGHT);

    std::cout<<new_time_step<<std::endl;

    // get the dynamics of the environment for the given state and action
    auto state = 0;
    auto action = 1;
    auto dynamics = env.p(state, action);

    std::cout<<"Dynamics for state="<<state<<" and action="<<action<<std::endl;

    for(auto item:dynamics){

        std::cout<<std::get<0>(item)<<std::endl;
        std::cout<<std::get<1>(item)<<std::endl;
        std::cout<<std::get<2>(item)<<std::endl;
        std::cout<<std::get<3>(item)<<std::endl;
    }

    // synchronize the environment. environment knows how
    // to cast std::any
    env.sync(std::unordered_map<std::string, std::any>());

    // close the environment
    env.close();

}

void test_taxi(){

    rlenvs_cpp::envs::gymnasium::Taxi env(SERVER_URL);

    std::cout<<"Environment URL: "<<env.get_url()<<std::endl;

    // make the environment
    std::unordered_map<std::string, std::any> options;
    env.make("v3", options);

    std::cout<<"Is environment created? "<<env.is_created()<<std::endl;
    std::cout<<"Is environment alive? "<<env.is_alive()<<std::endl;
    std::cout<<"Number of valid actions? "<<env.n_actions()<<std::endl;
    std::cout<<"Number of states? "<<env.n_states()<<std::endl;

    // reset the environment
    auto time_step = env.reset(42);

    std::cout<<"Reward on reset: "<<time_step.reward()<<std::endl;
    std::cout<<"Observation on reset: "<<time_step.observation()<<std::endl;
    std::cout<<"Is terminal state: "<<time_step.done()<<std::endl;

    //...print the time_step
    std::cout<<time_step<<std::endl;

    // take an action in the environment
    auto new_time_step = env.step(rlenvs_cpp::envs::gymnasium::TaxiActionsEnum::RIGHT);

    std::cout<<new_time_step<<std::endl;

    // get the dynamics of the environment for the given state and action
    auto state = 0;
    auto action = 1;
    auto dynamics = env.p(state, action);

    std::cout<<"Dynamics for state="<<state<<" and action="<<action<<std::endl;

    for(auto item:dynamics){

        std::cout<<std::get<0>(item)<<std::endl;
        std::cout<<std::get<1>(item)<<std::endl;
        std::cout<<std::get<2>(item)<<std::endl;
        std::cout<<std::get<3>(item)<<std::endl;
    }

    // close the environment
    env.close();

}

}


int main(){


    std::cout<<"Testing FrozenLake..."<<std::endl;
    example_1::test_frozen_lake();
    std::cout<<"===================="<<std::endl;
    std::cout<<"Testing Taxi..."<<std::endl;
    example_1::test_taxi();
    std::cout<<"===================="<<std::endl;
    return 0;
}
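
The dynamics loop in the snippet prints four fields per entry. In Gymnasium's toy-text environments the transition table stores entries in the order (probability, next state, reward, done), and the sketch below assumes the wrapper keeps that order. The `Transition` alias and the `expected_reward` helper are hypothetical names introduced for illustration; the actual type returned by `env.p(state, action)` may differ.

```cpp
#include <tuple>
#include <vector>

// Hypothetical alias: one transition as (probability, next state, reward, done).
// The order mirrors Gymnasium's P[state][action] entries; the element types
// used by the library itself are an assumption here.
using Transition = std::tuple<double, int, double, bool>;

// Expected immediate reward for a (state, action) pair:
// the sum over all transitions of probability * reward.
double expected_reward(const std::vector<Transition>& dynamics){

    double total = 0.0;
    for(const auto& item : dynamics){
        total += std::get<0>(item) * std::get<2>(item);
    }
    return total;
}
```

With two equally likely transitions, one paying a reward of 1 and the other 0, `expected_reward` returns 0.5.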

Some algorithms, such as Monte Carlo methods, require generating a full trajectory; example 3 shows how to do this. Various RL algorithms that use these environments can be found at cuberl.
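
As a rough illustration of what generating a trajectory involves, the sketch below resets an environment and steps it until the episode ends or a step budget runs out. It is not the code from example 3. `ToyTimeStep`, `ToyCorridor` and `generate_trajectory` are hypothetical stand-ins that only mirror the `reset`, `step`, `observation`, `reward` and `done` calls used in the snippet above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a library time step. Only the accessors
// used in the README snippet are mirrored.
class ToyTimeStep{
public:
    ToyTimeStep(int obs, double reward, bool done)
    : obs_(obs), reward_(reward), done_(done){}
    int observation()const{return obs_;}
    double reward()const{return reward_;}
    bool done()const{return done_;}
private:
    int obs_;
    double reward_;
    bool done_;
};

// Hypothetical toy environment: a corridor of n_cells cells.
// Action 1 moves right, any other action moves left.
// Reaching the last cell ends the episode with reward 1.
class ToyCorridor{
public:
    explicit ToyCorridor(int n_cells): n_cells_(n_cells), pos_(0){}

    // The seed is ignored because the toy dynamics are deterministic.
    ToyTimeStep reset(unsigned int /*seed*/){
        pos_ = 0;
        return ToyTimeStep(pos_, 0.0, false);
    }

    ToyTimeStep step(int action){
        pos_ += (action == 1) ? 1 : -1;
        if(pos_ < 0){ pos_ = 0; }
        const bool done = (pos_ >= n_cells_ - 1);
        return ToyTimeStep(pos_, done ? 1.0 : 0.0, done);
    }
private:
    int n_cells_;
    int pos_;
};

// Collect time steps, starting with the one returned by reset, until the
// episode terminates or max_steps actions have been taken.
template<typename Env, typename Policy>
std::vector<ToyTimeStep> generate_trajectory(Env& env, Policy policy,
                                             unsigned int seed,
                                             std::size_t max_steps){
    std::vector<ToyTimeStep> trajectory;
    auto time_step = env.reset(seed);
    trajectory.push_back(time_step);

    for(std::size_t i = 0; i < max_steps && !time_step.done(); ++i){
        time_step = env.step(policy(time_step.observation()));
        trajectory.push_back(time_step);
    }
    return trajectory;
}
```

The policy is any callable that maps an observation to an action, for example a lambda that always returns 1. A Monte Carlo method would then walk the returned vector backwards to accumulate returns.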

How to use

The general use case is to build the library and link it with your driver code to access its functionality. The Gymnasium and gym_pybullet_drones environments are accessed via a client/server pattern: they are exposed through an API developed with FastAPI. You need to start the server (see Dependencies) before using these environments in your code. To do so:

cd rest_api && ./start_uvicorn.sh

By default, the uvicorn server listens on port 8001. Change this if needed. You can access the OpenAPI specification at

http://0.0.0.0:8001/docs

Note that the implementation is currently not thread or process safe: if multiple threads or processes access an environment, they all manipulate the same global instance of that environment. No session-based environments exist.
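
If several threads inside one driver process must share an environment, one option is to serialize their calls on the client side. The sketch below is a minimal illustration of that idea and is not part of the library; `LockedEnv` and `with_env` are hypothetical names. It does not help when separate processes talk to the same server, since the shared instance lives on the server side.

```cpp
#include <mutex>
#include <utility>

// Hypothetical helper: wraps a reference to an environment and runs every
// operation on it while holding a mutex, so calls from different threads
// of the same process cannot interleave.
template<typename Env>
class LockedEnv{
public:
    explicit LockedEnv(Env& env): env_(env){}

    // Call fn(env) under the lock and forward whatever it returns.
    template<typename Fn>
    decltype(auto) with_env(Fn&& fn){
        std::lock_guard<std::mutex> lock(mutex_);
        return std::forward<Fn>(fn)(env_);
    }
private:
    Env& env_;
    std::mutex mutex_;
};
```

A thread would then write something like `locked.with_env([](auto& env){ return env.step(action); })` with its own `action`, keeping a whole reset/step sequence inside one call when the sequence must not be interrupted.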

Dependencies

The library has the following general dependencies

A compiler that supports C++20 e.g. g++-11
Boost C++
CMake >= 3.6
Gtest (if configured with tests)

Using the Gymnasium environments requires Gymnasium installed on your machine. In addition, you need to install

In addition, the library also incorporates, see (src/extern), the following libraries

There are extra dependencies if you want to generate the documentation. Namely,

Doxygen
Sphinx
sphinx_rtd_theme
breathe
m2r2

Installation

The usual CMake based installation process is used. Namely

mkdir build && cd build && cmake ..
make install

Run the tests

You can execute all the tests by running the helper script execute_tests.sh.

Issues

Could not find `boost_system`

It is likely that the boost_system library is missing from your local Boost installation. This may be the case if you installed Boost via a package manager. On an Ubuntu machine, the following should resolve the issue:

sudo apt-get update -y
sudo apt-get install -y libboost-system-dev
