PG Agents: Policy Gradient Algorithms with TensorFlow

The idea behind pg_agents is to provide an easy-to-understand Python package containing state-of-the-art policy gradient algorithms.

Implemented algorithms

  • VPG: Vanilla Policy Gradient. Also known as REINFORCE.

  • TNPG: Truncated Natural Policy Gradient. Reformulates the batch RL problem as a constrained optimization problem.

  • TRPO: Trust Region Policy Optimization. Extends TNPG to ensure robustness.

  • GAE: Generalized Advantage Estimator. Estimates the advantage function from experience, which helps reduce the variance of the gradient estimator.

  • PPO: Proximal Policy Optimization. A simple but efficient extension of VPG.
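
To make two of the entries above more concrete, here is a minimal sketch of the GAE recursion and the PPO clipped surrogate. It is written in plain NumPy rather than TensorFlow to keep it short, and it is not taken from pg_agents: the function names `gae_advantages` and `ppo_clipped_objective`, their parameters, and the default values for `gamma`, `lam`, and `clip_eps` are all assumptions chosen for illustration.

```python
import numpy as np


def gae_advantages(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation for one trajectory (illustrative).

    rewards:    r_0 .. r_{T-1}
    values:     V(s_0) .. V(s_{T-1}) from some value-function baseline
    last_value: V(s_T); pass 0.0 if the episode terminated at step T
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    values = np.append(np.asarray(values, dtype=np.float64), last_value)
    # One-step TD residuals: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
    deltas = rewards + gamma * values[1:] - values[:-1]
    advantages = np.zeros_like(deltas)
    running = 0.0
    # Backward recursion: A_t = delta_t + gamma * lam * A_{t+1}
    for t in reversed(range(len(deltas))):
        running = deltas[t] + gamma * lam * running
        advantages[t] = running
    return advantages


def ppo_clipped_objective(ratio, advantages, clip_eps=0.2):
    """PPO clipped surrogate, averaged over samples (to be maximized).

    ratio: pi_new(a|s) / pi_old(a|s) for each sample
    """
    ratio = np.asarray(ratio, dtype=np.float64)
    advantages = np.asarray(advantages, dtype=np.float64)
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    # Taking the minimum removes any incentive to push the ratio
    # outside [1 - clip_eps, 1 + clip_eps].
    return np.mean(np.minimum(ratio * advantages, clipped * advantages))
```

With `lam=1.0` the estimator reduces to the discounted return minus the baseline, and with `lam=0.0` it reduces to the one-step TD residual. Values in between trade bias against variance.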

Examples

  • Roboschool Ant
