[Feature Request] Adding the ppo trainer #607

Open

Esmail-ibraheem opened this issue Apr 30, 2024 · 1 comment
Comments

@Esmail-ibraheem

Feature Request

Adding the proximal policy optimization (PPO) trainer

Motivation

Adding the PPO trainer, so we can compare the two trainers: PPO and DPO

Additional Context

No response
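For reference, a minimal sketch of what such a trainer would most likely wrap: TRL's PPOTrainer, presumably the same library that backs the existing DPO trainer. This follows TRL's PPO API as it existed around the time of this issue; the model choice, prompt, batch sizes, and constant reward are placeholder assumptions, and in a real run the reward would come from a reward model.

```python
import torch
from transformers import AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead, PPOConfig, PPOTrainer

# batch_size/mini_batch_size of 1 keep this toy step self-contained;
# real runs use larger batches and a dataset of prompts.
config = PPOConfig(model_name="gpt2", batch_size=1, mini_batch_size=1)
model = AutoModelForCausalLMWithValueHead.from_pretrained(config.model_name)
tokenizer = AutoTokenizer.from_pretrained(config.model_name)
tokenizer.pad_token = tokenizer.eos_token

# With ref_model omitted, TRL creates a frozen reference copy internally.
ppo_trainer = PPOTrainer(config=config, model=model, tokenizer=tokenizer)

# One PPO step: generate a response to a query, score it, then optimize.
query = tokenizer.encode("AutoTrain should support", return_tensors="pt").squeeze()
response = ppo_trainer.generate(query, max_new_tokens=16, return_prompt=False)
# Placeholder reward; a reward model would normally produce this score.
stats = ppo_trainer.step([query], [response.squeeze()], [torch.tensor(1.0)])
```

An AutoTrain PPO trainer would mainly need to wire this loop to a configurable reward model and dataset, mirroring how the DPO trainer is configured.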

@abhishekkrthakur abhishekkrthakur changed the title Adding the ppo trainer [Feature Request] Adding the ppo trainer Apr 30, 2024

This issue is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale label May 31, 2024