Deep Reinforcement Learning Examples

This repository contains tutorials and examples I implemented and worked through as part of Udacity's Deep Reinforcement Learning Nanodegree program.

The tutorials implement a range of reinforcement learning algorithms. All the code is written for PyTorch (version 1.9 at the time of writing) and Python 3 (version 3.9.6 at the time of writing).

Dynamic Programming: Implement Dynamic Programming algorithms such as Policy Evaluation, Policy Improvement, Policy Iteration, and Value Iteration.
Monte Carlo: Implement Monte Carlo methods for prediction and control.
Temporal-Difference: Implement Temporal-Difference methods such as Sarsa, Q-Learning, and Expected Sarsa.
Discretization: Learn how to discretize continuous state spaces, and solve the Mountain Car environment.
Tile Coding: Implement a method for discretizing continuous state spaces that enables better generalization.
Deep Q-Network: Explore how to use a Deep Q-Network (DQN) to navigate a space vehicle without crashing.
Hill Climbing: Use hill climbing with adaptive noise scaling to balance a pole on a moving cart.
Cross-Entropy Method: Use the cross-entropy method to train a car to navigate a steep hill.
REINFORCE: Learn how to use Monte Carlo Policy Gradients to solve a classic control task.
Proximal Policy Optimization: Explore how to use Proximal Policy Optimization (PPO) to solve a classic reinforcement learning task. (Coming soon!)
Deep Deterministic Policy Gradients: Explore how to use Deep Deterministic Policy Gradients (DDPG) with OpenAI Gym environments.
- Pendulum: Use OpenAI Gym's Pendulum environment.
- BipedalWalker: Use OpenAI Gym's BipedalWalker environment.
Finance: Train an agent to discover optimal trading strategies.
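
As a rough illustration of the Temporal-Difference item above: Sarsa, Q-Learning, and Expected Sarsa differ only in the target used to update an action-value estimate. The following is a minimal sketch, not the code from the notebooks. The function names, the list-based action values, and the epsilon-greedy policy assumed for Expected Sarsa are illustrative assumptions.

```python
# Illustrative sketch only: names and data layout are assumptions,
# not taken from the notebooks in this repository.

def sarsa_target(reward, gamma, q_next, next_action):
    """Sarsa: bootstrap from the action actually chosen in the next state."""
    return reward + gamma * q_next[next_action]


def q_learning_target(reward, gamma, q_next):
    """Q-Learning: bootstrap from the greedy action in the next state."""
    return reward + gamma * max(q_next)


def expected_sarsa_target(reward, gamma, q_next, epsilon):
    """Expected Sarsa: bootstrap from the expected action value under an
    epsilon-greedy policy (an assumption here) in the next state."""
    n_actions = len(q_next)
    greedy = max(range(n_actions), key=lambda a: q_next[a])
    # Every action gets epsilon / n_actions; the greedy action gets the rest.
    probs = [epsilon / n_actions] * n_actions
    probs[greedy] += 1.0 - epsilon
    return reward + gamma * sum(p * q for p, q in zip(probs, q_next))


def td_update(q_value, target, alpha):
    """Move the current estimate a fraction alpha toward the TD target."""
    return q_value + alpha * (target - q_value)


if __name__ == "__main__":
    q_next = [1.0, 3.0]  # hypothetical action values for the next state
    print(sarsa_target(1.0, 0.9, q_next, next_action=0))
    print(q_learning_target(1.0, 0.9, q_next))
    print(expected_sarsa_target(1.0, 0.9, q_next, epsilon=0.2))
```

The sketch leaves out terminal states, where the target would reduce to the reward alone.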

Resources

Cheatsheet: Udacity provides this PDF of formulae and algorithms as an aid to understanding reinforcement learning.

OpenAI Gym Benchmarks

Classic Control

Acrobot-v1 with Tile Coding and Q-Learning
CartPole-v0 with Hill Climbing | solved in 13 episodes
CartPole-v0 with REINFORCE | solved in 691 episodes
MountainCarContinuous-v0 with Cross-Entropy Method | solved in 47 iterations
MountainCar-v0 with Uniform-Grid Discretization and Q-Learning | solved in <50000 episodes
Pendulum-v0 with Deep Deterministic Policy Gradients (DDPG)

Box2d

BipedalWalker-v2 with Deep Deterministic Policy Gradients (DDPG)
CarRacing-v0 with Deep Q-Networks (DQN) | Coming soon!
LunarLander-v2 with Deep Q-Networks (DQN) | solved in 1504 episodes

Toy Text

FrozenLake-v0 with Dynamic Programming
Blackjack-v0 with Monte Carlo Methods
CliffWalking-v0 with Temporal-Difference Methods
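
The "solved in N episodes" figures above do not state the criterion behind them. A common convention, assumed here rather than confirmed by this README, is that an environment counts as solved once the mean score over 100 consecutive episodes reaches a threshold (195.0 for CartPole-v0 in Gym's documentation). The sketch below shows one way such a check might look. The function `episodes_to_solve`, its parameters, and the choice to report the last episode of the window are hypothetical.

```python
from collections import deque


def episodes_to_solve(scores, threshold, window=100):
    """Return the 1-based episode at which the mean of the last `window`
    scores first reaches `threshold`, or None if it never does."""
    recent = deque(maxlen=window)  # keeps only the most recent `window` scores
    for episode, score in enumerate(scores, start=1):
        recent.append(score)
        if len(recent) == window and sum(recent) / window >= threshold:
            return episode
    return None


if __name__ == "__main__":
    # Hypothetical score history: 50 poor episodes followed by perfect ones.
    history = [20.0] * 50 + [200.0] * 150
    print(episodes_to_solve(history, threshold=195.0))
```

Some training loops report the first episode of the qualifying window instead, which would give a figure 100 episodes lower; the numbers in the lists above may follow either convention.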

Dependencies

To set up your Python environment to run the code in this repository, follow the instructions below.

Create (and activate) a new environment with Python 3.6.

Linux or Mac:

conda create --name DRLND python=3.6
source activate DRLND

Windows:

conda create --name DRLND python=3.6
activate DRLND

Follow the instructions in this repository to perform a minimal install of OpenAI Gym.
- Next, install the classic control environment group by following the instructions here.
- Then, install the box2d environment group by following the instructions here.
Clone the repository (if you haven't already!), and navigate to the python/ folder. Then, install several dependencies.

git clone https://github.com/udacity/deep-reinforcement-learning.git
cd deep-reinforcement-learning/python
pip install .

Create an IPython kernel for the DRLND environment.

python -m ipykernel install --user --name DRLND --display-name "DRLND"

Before running code in a notebook, change the kernel to match the DRLND environment by using the drop-down Kernel menu.
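
Once the steps above are complete, a quick sanity check from inside the DRLND environment can confirm that the main packages are importable. This is a minimal sketch using only the standard library. The package list (`torch`, `gym`, `ipykernel`) is an assumption based on the tools this README mentions, not the contents of requirements.txt, and `missing_packages` is a hypothetical helper.

```python
import importlib.util
import sys

# Package names are assumptions based on the tools mentioned in this README.
REQUIRED = ("torch", "gym", "ipykernel")


def missing_packages(names=REQUIRED):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```

Running it in a notebook cell with the DRLND kernel selected would also show whether the kernel points at the intended environment.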


ken-power/DRLND_DeepReinforcementLearning_Examples
