Second, the implementation of Montezuma's Revenge in the ALE library includes a bug that prevents the agent from progressing to the next level when the agent is on its last life, which is clearly unintended behaviour that does not occur in the original game. Because there are no penalties for losing a life, policy-based Go-Explore learns to sacrifice lives in order to bypass hazards or to return to the entrance of a room more quickly. As a result, policy-based Go-Explore frequently reaches the treasure room without any lives remaining, preventing further progress. As such, for policy-based Go-Explore only, we terminate the episode on first death, which avoids this bug without simplifying the game.
🐛 Bug
I'm not sure if it's due to a specific version of Atari, but I remember having to add
terminal_on_life_loss: False
for PPO LSTM to prevent those hangs. I also used the envpool version at that time, so I need to try to reproduce this with the normal Atari game.
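For anyone unsure what that flag changes, here is a rough sketch of the idea, not the actual wrapper code from any library. It assumes the env follows the Gymnasium-style 5-tuple `step` API and reports remaining lives in `info["lives"]`, as ALE environments do. `FakeLivesEnv`, `LifeLossTermination` and `episode_length` are hypothetical names made up for this illustration.

```python
class FakeLivesEnv:
    """Hypothetical stand-in for an Atari env: loses a life every `steps_per_life` steps."""

    def __init__(self, lives=3, steps_per_life=2):
        self.start_lives = lives
        self.steps_per_life = steps_per_life

    def reset(self):
        self.lives = self.start_lives
        self.t = 0
        return 0, {"lives": self.lives}

    def step(self, action):
        self.t += 1
        if self.t % self.steps_per_life == 0:
            self.lives -= 1
        terminated = self.lives == 0  # real game over
        return self.t, 0.0, terminated, False, {"lives": self.lives}


class LifeLossTermination:
    """Hypothetical wrapper: optionally report `terminated` whenever a life is lost."""

    def __init__(self, env, terminal_on_life_loss=True):
        self.env = env
        self.terminal_on_life_loss = terminal_on_life_loss
        self.lives = 0

    def reset(self):
        obs, info = self.env.reset()
        self.lives = info["lives"]
        return obs, info

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        lost_life = info["lives"] < self.lives
        self.lives = info["lives"]
        if self.terminal_on_life_loss and lost_life:
            terminated = True  # treat the death as the end of the episode
        return obs, reward, terminated, truncated, info


def episode_length(env):
    """Step with a dummy action until the env reports the end of the episode."""
    env.reset()
    steps = 0
    while True:
        _, _, terminated, truncated, _ = env.step(0)
        steps += 1
        if terminated or truncated:
            return steps
```

With the flag on, the episode ends at the first death (the same effect as the "terminate on first death" workaround in the quote above). With it off, the episode only ends on the real game over.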
To Reproduce
python train.py --algo a2c --env BreakoutNoFrameskip-v4 -envpool --eval-freq 50000 --n-eval-envs 6 --eval-episodes 10 -P -params policy_kwargs:"dict(share_features_extractor=False)" --log-interval 500 --seed 1296705713
Relevant log output / Error message
No response
System Info
No response
Checklist