The reward function currently assigns a higher reward when the gripper is merely resting against the handle than when it is actually hooking the handle. In addition, the reward for opening the drawer is only barely higher than the reward for being against the handle. Without extra tricks applied to the RL algorithm, this can lead it to favor a policy where the gripper stays against the handle rather than correctly hooking and pulling it.
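To illustrate the failure mode, here is a minimal sketch with hypothetical stage values (not the actual numbers from the reward function), showing how a non-monotonic staged reward lets a policy that parks against the handle out-earn one that progresses to hooking and opening:

```python
# Hypothetical per-step rewards for each stage (illustrative values only):
# the "against handle" stage pays more than the "hooking" stage after it,
# and "opening" pays only slightly more than "against handle" -- the issue.
REWARD = {
    "reach": 0.2,
    "against_handle": 0.6,   # higher than the next stage: the bug
    "hooking_handle": 0.5,
    "opening_drawer": 0.65,  # barely above "against_handle"
}

def episode_return(stages, steps_per_stage=10):
    """Total reward for a policy that spends steps_per_stage steps in each stage."""
    return sum(REWARD[s] * steps_per_stage for s in stages)

# A degenerate policy that stays pressed against the handle all episode...
parked = episode_return(["against_handle"] * 3)
# ...collects more reward than one that hooks and opens the drawer.
progressing = episode_return(["against_handle", "hooking_handle", "opening_drawer"])
assert parked > progressing
```

A reward-maximizing agent will therefore converge on the parked behavior unless the stage rewards are made strictly increasing along the intended task progression.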
Sure, but it's very unlikely that every reward function will be 100% perfect such that the agent always solves the task. I'm not sure this is really an issue; I'm fairly confident many reward functions contain similar quirks.