Deterministic Policy Gradient

This is a C++ implementation of the Deterministic Policy Gradient algorithm proposed by Silver et al. [1]. The critic is a linear function approximator whose features come from tile coding, as proposed by Richard Sutton. Note that this algorithm differs from Deep Deterministic Policy Gradient: because we use linear function approximation rather than a neural network, there are convergence guarantees. We test the algorithm on the Continuous Action Mountain Car domain, implemented similarly to the OpenAI Gym environment.
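To illustrate how tile coding turns a continuous state into features for a linear critic, here is a minimal sketch. It is not the code from this repository: `TileCoder2D`, `activeTiles`, and `linearValue` are hypothetical names, the tilings are plain uniformly offset grids rather than Sutton's hashed tile coding software, and the state bounds in the usage note are those of the Gym `MountainCarContinuous` environment, which this repository is only assumed to share.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical 2-D tile coder over (position, velocity).
// Each tiling is a uniform grid shifted by a fraction of one tile width.
struct TileCoder2D {
    int numTilings;   // number of overlapping grids
    int tilesPerDim;  // tiles per dimension before the offset is applied
    double lo[2];     // lower bounds of (position, velocity)
    double hi[2];     // upper bounds of (position, velocity)

    // Each tiling needs one extra tile per dimension to cover its offset.
    std::size_t numFeatures() const {
        const std::size_t side = static_cast<std::size_t>(tilesPerDim) + 1;
        return static_cast<std::size_t>(numTilings) * side * side;
    }

    // Returns the index of the single active tile in every tiling.
    std::vector<std::size_t> activeTiles(double position, double velocity) const {
        assert(numTilings > 0 && tilesPerDim > 0);
        const double s[2] = {position, velocity};
        const std::size_t side = static_cast<std::size_t>(tilesPerDim) + 1;
        std::vector<std::size_t> active;
        active.reserve(static_cast<std::size_t>(numTilings));
        for (int t = 0; t < numTilings; ++t) {
            std::size_t idx[2];
            for (int d = 0; d < 2; ++d) {
                // Scale the state into [0, tilesPerDim], clamping out-of-range values.
                double scaled = (s[d] - lo[d]) / (hi[d] - lo[d]) * tilesPerDim;
                scaled = std::min(std::max(scaled, 0.0),
                                  static_cast<double>(tilesPerDim));
                // Shift tiling t by t/numTilings of a tile width (offset in [0, 1)).
                const double offset = static_cast<double>(t) / numTilings;
                idx[d] = static_cast<std::size_t>(std::floor(scaled + offset));
            }
            active.push_back((static_cast<std::size_t>(t) * side + idx[0]) * side + idx[1]);
        }
        return active;
    }
};

// Linear value estimate: with binary features, the dot product of weights
// and features reduces to a sum over the weights of the active tiles.
double linearValue(const std::vector<double>& weights,
                   const std::vector<std::size_t>& active) {
    double value = 0.0;
    for (std::size_t i : active) {
        value += weights[i];
    }
    return value;
}
```

With the Gym bounds, a coder could be built as `TileCoder2D tc{8, 8, {-1.2, -0.07}, {0.6, 0.07}};`, after which `tc.activeTiles(position, velocity)` yields exactly one active index per tiling. Because the features are binary and sparse, both the value estimate and a gradient step on the weights touch only `numTilings` entries.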

For a detailed discussion, please visit my blog post [2].

References