FlexRL

FlexRL is a deep online/offline reinforcement learning library inspired by and adapted from CleanRL and CORL. It provides single-file implementations of algorithms that are not necessarily covered by those libraries. FlexRL introduces the following features:

  • Consistent style across online and offline algorithms
  • Easy configuration with Pyrallis and a tqdm progress bar (see the sketch after this list)
  • A few custom environments under the gym API
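
To give a feel for how these pieces fit together, here is a minimal sketch of a single-file training script in the CORL-style Pyrallis pattern. The config fields and script structure are illustrative assumptions, not FlexRL's actual code:

from dataclasses import dataclass

import pyrallis
from tqdm import trange

@dataclass
class TrainConfig:
    # Hypothetical fields for illustration; each FlexRL script defines its own config
    env_id: str = "CartPole-v1"
    seed: int = 0
    total_timesteps: int = 100_000

@pyrallis.wrap()  # fills TrainConfig from the command line and/or --config_path
def train(config: TrainConfig):
    for step in trange(config.total_timesteps):  # tqdm progress bar over training steps
        pass  # one environment/gradient step would go here

if __name__ == "__main__":
    train()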

Quick Start

Installing FlexRL

git clone https://github.com/alexchen-buaa/flexrl.git
cd flexrl
pip install -e .

Usage

Each algorithm runs as an individual script. Like CORL, FlexRL uses Pyrallis for configuration management, so arguments can be specified on the command line, in a YAML file, or both:

python ppo.py --config_path=some_config.yaml
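
For example, given a config file like the following (the fields shown are illustrative; each script defines its own config dataclass), values loaded from the YAML can be overridden by command-line flags:

# some_config.yaml
seed: 0
total_timesteps: 1000000

python ppo.py --config_path=some_config.yaml --seed=1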

Algorithms Implemented

Type     Algorithm                            Variants Implemented
Online   Proximal Policy Optimization (PPO)   ppo.py, ppo_atari.py, ppo_multidiscrete.py
Online   Deep Q-Networks (DQN)                dqn.py, dqn_atari.py
Online   Quantile-Regression DQN (QR-DQN)     qr_dqn.py, qr_dqn_atari.py
Online   Soft Actor-Critic (SAC)              sac.py
Offline  Implicit Q-Learning (IQL)            iql.py, iql_jax.py
Offline  In-Sample Actor-Critic (InAC)        inac.py, inac_jax.py
Offline  Soft Actor-Critic Ensemble (SAC-N)   sac_n_jax.py

Extra Requirements

Atari/ALE

Following the Arcade Learning Environment (ALE) documentation, you can use its command-line tool to import your ROMs:

ale-import-roms roms/
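
After the ROMs are imported, a quick sanity check from Python (assuming gym and ale-py are installed; the environment ID is just an example) is:

import gym

env = gym.make("BreakoutNoFrameskip-v4")  # raises an error if the ROM was not imported
print(env.observation_space, env.action_space)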

MuJoCo

To use MuJoCo envs (for both online training and offline evaluation), you need to install MuJoCo first. See mujoco-py for instructions.
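
Once MuJoCo and mujoco-py are installed, a quick check (the environment ID is just an example) is:

import gym

env = gym.make("Hopper-v2")  # requires a working mujoco-py installation
print(env.reset().shape)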

JAX with CUDA Support

To use JAX with CUDA support, you need to install the NVIDIA driver first. See JAX Installation for instructions.
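
Once installed, you can verify that JAX sees the GPU:

python -c "import jax; print(jax.devices())"

If CUDA support is working, the output lists GPU devices rather than only the CPU.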

References

  • [1] S. Huang, R. F. J. Dossa, C. Ye, and J. Braga, “CleanRL: High-quality Single-file Implementations of Deep Reinforcement Learning Algorithms.” arXiv, Nov. 16, 2021. Accessed: Nov. 21, 2022. [Online]. Available: http://arxiv.org/abs/2111.08819
  • [2] A. Raffin, A. Hill, A. Gleave, A. Kanervisto, M. Ernestus, and N. Dormann, “Stable-Baselines3: Reliable Reinforcement Learning Implementations,” Journal of Machine Learning Research, vol. 22, no. 268, pp. 1–8, 2021.
  • [3] W. Dabney, M. Rowland, M. G. Bellemare, and R. Munos, “Distributional Reinforcement Learning with Quantile Regression,” arXiv:1710.10044 [cs, stat], Oct. 2017, Accessed: Apr. 15, 2022. [Online]. Available: http://arxiv.org/abs/1710.10044
  • [4] I. Kostrikov, A. Nair, and S. Levine, “Offline Reinforcement Learning with Implicit Q-Learning.” arXiv, Oct. 12, 2021. Accessed: Mar. 29, 2023. [Online]. Available: http://arxiv.org/abs/2110.06169
  • [5] C. Xiao, H. Wang, Y. Pan, A. White, and M. White, “The In-Sample Softmax for Offline Reinforcement Learning.” arXiv, Feb. 28, 2023. Accessed: Apr. 02, 2023. [Online]. Available: http://arxiv.org/abs/2302.14372