[Bug]: Missing default value for noise_type (for ddpg/td3) leads to unexpected behaviours #348
Open
🐛 Bug
Using TD3 as an example, if `noise_type` is not specified for a custom environment in td3.yml, the following unexpected behaviour occurs:
- The logic that determines `n_actions` is skipped, so `n_actions` remains `None` (in exp_manager.py).
- The value `None` is then passed down to the noise constructor, e.g. `NormalActionNoise(mean=np.zeros(trial.n_actions), sigma=noise_std * np.ones(trial.n_actions))`.
- Depending on the value of `n_envs`, the program either raises an error or silently produces an unintended result:
  - `n_envs > 1`: an error is raised for a matrix shape mismatch.
  - `n_envs = 1` (with `n_actions=None`): no runtime error is raised. Instead, the action noise will be one-dimensional and broadcast to the actual environment dimension, which will likely degrade model performance silently (one of the most frustrating issues in ML).

The unintended behaviour also depends on the actual environment action space, but you get the idea.
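To illustrate the silent case, here is a minimal NumPy-only sketch. It does not call the zoo or `NormalActionNoise`; it assumes the degenerate noise ends up as a single value (an empty shape `()` stands in for whatever `n_actions=None` produces), and `action_dim`, `noise_std` and `sample_noise` are hypothetical names chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

action_dim = 3   # hypothetical action dimension of the custom env
noise_std = 0.1  # hypothetical noise scale
action = np.zeros((1, action_dim))  # one environment (n_envs = 1)


def sample_noise(shape):
    """Draw Gaussian noise whose mean/sigma arrays have the given shape."""
    return rng.normal(np.zeros(shape), noise_std * np.ones(shape))


# Intended: one independent noise value per action dimension.
intended_noise = sample_noise(action_dim)

# Degenerate stand-in: a single noise value (empty shape).
degenerate_noise = sample_noise(())

# Both additions succeed thanks to broadcasting, so no error is raised.
noisy_intended = action + intended_noise
noisy_degenerate = action + degenerate_noise

print("intended noise shape:  ", np.shape(intended_noise))
print("degenerate noise shape:", np.shape(degenerate_noise))
print("noisy action (degenerate):", noisy_degenerate)
```

In the degenerate case every action dimension receives the same perturbation, so exploration is correlated across dimensions even though the result has the expected shape.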
==========================
I think people expect that when a default param is not specified in td3.yml but present in the params sampler (e.g. sample_td3_params() in hyperparams_opt.py), the program will just use a sampled value and work as intended.
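One possible mitigation, sketched below, is to fail fast when `n_actions` is still `None` instead of letting it reach the noise constructor. This is not the zoo's actual code: `make_normal_noise_params` is a hypothetical helper that only builds the `mean` and `sigma` arrays that would be passed to a noise class.

```python
from typing import Optional, Tuple

import numpy as np


def make_normal_noise_params(
    n_actions: Optional[int], noise_std: float
) -> Tuple[np.ndarray, np.ndarray]:
    """Build (mean, sigma) for Gaussian action noise (hypothetical helper)."""
    if n_actions is None:
        # Surface the misconfiguration instead of silently broadcasting.
        raise ValueError(
            "n_actions is None: it must be derived from the action space "
            "before action noise is created."
        )
    mean = np.zeros(n_actions)
    sigma = noise_std * np.ones(n_actions)
    return mean, sigma


mean, sigma = make_normal_noise_params(n_actions=3, noise_std=0.1)
print(mean.shape, sigma.shape)
```

An alternative design would be to always derive `n_actions` from the environment's action space, regardless of whether `noise_type` appears in the YAML file, so that a sampled noise type works as intended.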
To Reproduce
python train.py --algo td3 --env "CustomEnv-v0" -optimize --n-trials 100 --sampler tpe --pruner median
Relevant log output / Error message
No response
System Info
No response