Deterministic Policy Gradient

This is a C++ implementation of the Deterministic Policy Gradient (DPG) algorithm proposed by Silver et al. [1]. The critic is a linear function approximator over tile-coded features, using the tile coding scheme proposed by Richard Sutton. Note that this algorithm differs from Deep Deterministic Policy Gradient (DDPG): because the critic is linear rather than a deep network, convergence guarantees are available. We test the algorithm on the Continuous Action Mountain Car domain, implemented similarly to the OpenAI Gym environment.
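To make the idea of a tile-coded linear critic concrete, here is a minimal sketch. It is not the code in this repository: the names `TileCoder` and `LinearCritic`, the uniform-offset tiling layout, and the simple gradient step toward a target are all assumptions chosen for illustration, and it estimates a state value only, whereas the critic in [1] also depends on the action.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical 2-D state: (position, velocity).
using State = std::array<double, 2>;

// Illustrative tile coder: several grids over the state space, each shifted
// by a different fraction of one tile width.
struct TileCoder {
    int numTilings;   // number of overlapping grids
    int tilesPerDim;  // tiles per dimension before the offset is applied
    State low;        // lower bound of each state dimension
    State high;       // upper bound of each state dimension

    // Each tiling needs one extra tile per dimension to cover its offset.
    std::size_t numFeatures() const {
        const std::size_t side = static_cast<std::size_t>(tilesPerDim) + 1;
        return static_cast<std::size_t>(numTilings) * side * side;
    }

    // Returns one active feature index per tiling.
    std::vector<std::size_t> activeTiles(const State& s) const {
        const std::size_t side = static_cast<std::size_t>(tilesPerDim) + 1;
        std::vector<std::size_t> active;
        active.reserve(static_cast<std::size_t>(numTilings));
        for (int t = 0; t < numTilings; ++t) {
            std::size_t cell[2];
            for (int d = 0; d < 2; ++d) {
                // Clamp to the bounds, then rescale to [0, tilesPerDim].
                const double clamped = std::min(std::max(s[d], low[d]), high[d]);
                const double scaled =
                    (clamped - low[d]) / (high[d] - low[d]) * tilesPerDim;
                // Tiling t is shifted by t / numTilings of a tile width.
                const double offset = static_cast<double>(t) / numTilings;
                cell[d] = static_cast<std::size_t>(std::floor(scaled + offset));
            }
            active.push_back((static_cast<std::size_t>(t) * side + cell[0]) * side + cell[1]);
        }
        return active;
    }
};

// Linear value estimate: the sum of the weights of the active tiles.
struct LinearCritic {
    TileCoder coder;
    std::vector<double> weights;

    explicit LinearCritic(const TileCoder& c)
        : coder(c), weights(c.numFeatures(), 0.0) {}

    double value(const State& s) const {
        double v = 0.0;
        for (std::size_t i : coder.activeTiles(s)) v += weights[i];
        return v;
    }

    // Moves the estimate toward `target`; the step size is divided by the
    // number of tilings so that `alpha` is the fraction of the error removed.
    void update(const State& s, double target, double alpha) {
        const double delta = target - value(s);
        const double step = alpha / coder.numTilings;
        for (std::size_t i : coder.activeTiles(s)) weights[i] += step * delta;
    }
};
```

With the bounds of the Gym mountain car environment, a coder could be built as `TileCoder{8, 8, {-1.2, -0.07}, {0.6, 0.07}}`; the choice of 8 tilings of 8 tiles is arbitrary here. Because exactly one tile per tiling is active, evaluating and updating the critic touches only `numTilings` weights regardless of the total feature count.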

For a detailed discussion, please visit my blog post [2].

References

[1] David Silver, Guy Lever, Nicolas Heess, Thomas Degris, Daan Wierstra, and Martin Riedmiller. "Deterministic policy gradient algorithms." In ICML. 2014.
[2] https://sridhartee.blogspot.in/2017/02/deterministic-policy-gradient-methods.html
