Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: Missing default value for noise_type (for ddpg/td3) leads to unexpected behvaiours #348

Open
5 tasks done
qihuazhong opened this issue Jan 28, 2023 · 1 comment
Open
5 tasks done
Labels
bug Something isn't working

Comments

@qihuazhong
Copy link

qihuazhong commented Jan 28, 2023

馃悰 Bug

Using TD3 as an exmaple, if the the noise_type is not specified for a custom environment in td3.yml. The following weird behavior happens:

The logic of deciding n_actions would be skipped and n_actions would remain None (in exp_manager.py). The value None will be further passed down to the Noise constructor, e.g: NormalActionNoise(mean=np.zeros(trial.n_actions), sigma=noise_std * np.ones(trial.n_actions))

Depending on the value of n_envs, the program would raise an error, or produce unintended result silently.

  • When n_envs > 1, an error would be raised for matrix shape mismatch.
  • When n_envs = 1, n_actions=None. No runtime error raised. Instead, the action noise will be one dim and broadcasted to the actual environment dimension, which will likely degrade the model performance silently (one of the most frustrating issues in ML).

The unintended behvaiour also depends on the actual environment action space, but you get the idea..

==========================

I think people expect that when a default param is not specified in td3.yml but present in the params sampler (e.g. sample_td3_params() in hyperparams_opt.py), the program will just use a sampled value and work as intended.

To Reproduce

python train.py --algo td3 --env "CustomEnv-v0" -optimize --n-trials 100 --sampler tpe --pruner median

Relevant log output / Error message

No response

System Info

No response

Checklist

@qihuazhong qihuazhong added the bug Something isn't working label Jan 28, 2023
qihuazhong added a commit to qihuazhong/rl-baselines3-zoo that referenced this issue Jan 28, 2023
Moved the assignmeng of n_actions out of the if condition to make
sure it is not None when action_noise is not provided in algo.yml
qihuazhong added a commit to qihuazhong/rl-baselines3-zoo that referenced this issue Jan 28, 2023
Added space type check to avoid error (e.g. discrete space has no shape)
@araffin
Copy link
Member

araffin commented Feb 6, 2023

Hello,

thanks for reporting the issue, it is indeed a bug.
We should also use a VecActionNoise when using n_envs > 1.

I would be happy to receive a PR that fix this issue =)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants