[Potential Bug]: Atari breakout evaluation hangs #329

Open
5 tasks done
araffin opened this issue Dec 21, 2022 · 1 comment
Labels
bug Something isn't working

Comments

@araffin
Member

araffin commented Dec 21, 2022

🐛 Bug

I'm not sure whether this is due to a specific version of Atari, but I remember having to add terminal_on_life_loss: False for PPO LSTM to prevent those hangs.

I also used the envpool version at the time, so I still need to try to reproduce this with the normal Atari game.
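For readers unfamiliar with the setting mentioned above, here is a toy sketch of what a terminal_on_life_loss flag typically changes, assuming it behaves like an episodic-life wrapper (a lost life is reported as the end of an episode). `ToyLivesEnv`, `LifeLossTerminal`, and `count_done_signals` are hypothetical names for illustration only; this is not the Atari wrapper code and does not reproduce the hang.

```python
class ToyLivesEnv:
    """Hypothetical stand-in for a game with several lives."""

    def __init__(self, lives=3):
        self.initial_lives = lives
        self.lives = lives

    def reset(self):
        self.lives = self.initial_lives
        return 0

    def step(self, action):
        # In this toy game every step costs exactly one life.
        self.lives -= 1
        game_over = self.lives == 0
        return 0, 0.0, game_over, {"lives": self.lives}


class LifeLossTerminal:
    """Toy wrapper: optionally report `done` on every lost life."""

    def __init__(self, env, terminal_on_life_loss=True):
        self.env = env
        self.terminal_on_life_loss = terminal_on_life_loss
        self.lives = 0

    def reset(self):
        obs = self.env.reset()
        self.lives = self.env.lives
        return obs

    def step(self, action):
        obs, reward, game_over, info = self.env.step(action)
        lost_life = info["lives"] < self.lives
        self.lives = info["lives"]
        # With the flag off, only a real game over ends the episode.
        done = game_over or (self.terminal_on_life_loss and lost_life)
        return obs, reward, done, info


def count_done_signals(wrapper):
    """Play one full game and count how often `done` is reported."""
    wrapper.reset()
    n_done, game_over = 0, False
    while not game_over:
        _obs, _reward, done, info = wrapper.step(0)
        n_done += int(done)
        game_over = info["lives"] == 0
    return n_done
```

With the flag on, one game of three lives yields three `done` signals; with it off, only the final game over is reported, so anything that counts episodes sees different boundaries in the two cases.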

To Reproduce

python train.py --algo a2c --env BreakoutNoFrameskip-v4 -envpool --eval-freq 50000 --n-eval-envs 6 --eval-episodes 10 -P -params policy_kwargs:"dict(share_features_extractor=False)" --log-interval 500 --seed 1296705713
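While trying to reproduce, it may help to make the hang observable instead of waiting indefinitely. The following is a minimal sketch of an evaluation loop with a per-episode step budget; it is not the evaluation code used by train.py, and `StuckEnv` and `evaluate_with_budget` are hypothetical names. The toy environment only ends an episode when action 1 is taken, which is an assumption chosen to mimic an episode that never terminates.

```python
class StuckEnv:
    """Hypothetical toy env: the episode ends only if action 1 is taken."""

    def reset(self):
        return 0

    def step(self, action):
        done = action == 1
        return 0, 0.0, done


def evaluate_with_budget(env, policy, n_episodes, max_steps_per_episode):
    """Run `n_episodes`, cutting off any episode that exceeds the budget.

    Returns (episode_lengths, n_truncated).
    """
    if max_steps_per_episode < 1:
        raise ValueError("max_steps_per_episode must be at least 1")
    lengths, n_truncated = [], 0
    for _ in range(n_episodes):
        obs = env.reset()
        for t in range(1, max_steps_per_episode + 1):
            obs, _reward, done = env.step(policy(obs))
            if done:
                break
        else:
            # The loop ran out of budget without seeing `done`.
            n_truncated += 1
        lengths.append(t)
    return lengths, n_truncated
```

A non-zero truncation count for a fixed seed would point at episodes that never finish rather than at a slow evaluation.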

Relevant log output / Error message

No response

System Info

No response

Checklist

@araffin araffin added the bug Something isn't working label Dec 21, 2022
@araffin araffin changed the title [Potential Bug]: Atari breakout evaluation hang [Potential Bug]: Atari breakout evaluation hangs Dec 21, 2022
@qgallouedec
Collaborator

Somewhat related:

Second, the implementation of Montezuma’s Revenge in the ALE library includes a bug that prevents the agent from progressing to the next level when the agent is on its last life, which is clearly unintended behaviour that does not occur in the original game. Because there are no penalties for losing a life, policy-based Go-Explore learns to sacrifice lives in order to bypass hazards or to return to the entrance of a room more quickly. As a result, policy-based Go-Explore frequently reaches the treasure room without any lives remaining, preventing further progress. As such, for policy-based Go-Explore only, we terminate the episode on first death, which avoids this bug without simplifying the game.

Page 81 of Go-Explore paper
