
Why might my rewards be inversely proportional to the target height in the HoverAviary environment? #214

Closed
gulbinkirikoglu opened this issue May 13, 2024 · 2 comments

Comments

@gulbinkirikoglu
I'm attempting to run discrete-action RL training in the HoverAviary environment. My goal is to take the z position and z velocity from the observation space as input and control the drone's up and down movement (using just the arrays [-1, -1, -1, -1] and [1, 1, 1, 1]). I'm using a reward function I defined, but as the drone falls below a height of 1, the reward increases. What could be the reason for this?
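The two-action setup described above can be sketched as a simple lookup from a discrete index to the 4-motor array the environment consumes. The names below are illustrative only, not gym-pybullet-drones' actual API:

```python
import numpy as np

# Hypothetical sketch: map the two discrete actions described above
# (0 -> descend, 1 -> ascend) to the 4-motor action arrays that a
# HoverAviary-style environment expects at step time.
DISCRETE_TO_ARRAY = {
    0: np.array([-1.0, -1.0, -1.0, -1.0]),  # push all four motors down
    1: np.array([1.0, 1.0, 1.0, 1.0]),      # push all four motors up
}

def to_env_action(discrete_action: int) -> np.ndarray:
    """Convert a discrete action index into the array the env consumes."""
    return DISCRETE_TO_ARRAY[discrete_action]
```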

@piratax007
How have you defined the reward function? What does the code of your `_compute_reward` method look like?
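For context, a common shape for a hover reward is a negative squared distance to the target position; this is only an illustrative sketch under that assumption, and the actual `_compute_reward` in the repository may differ:

```python
import numpy as np

# Illustrative sketch only -- not the repository's actual reward code.
TARGET_POS = np.array([0.0, 0.0, 1.0])  # hover target at z = 1

def compute_reward(drone_xyz: np.ndarray) -> float:
    # Negative squared distance: maximal (0.0) exactly at the target,
    # strictly decreasing as the drone moves away in any direction,
    # including below z = 1.
    return -float(np.linalg.norm(TARGET_POS - drone_xyz) ** 2)
```

With a reward of this shape, moving below the target height should only decrease the reward, which is why inspecting the reward code is the natural first step here.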

@gulbinkirikoglu
Author

Thank you for your help, I found the error. The issue wasn't with the reward function. I was treating the actions as discrete values, converting them to arrays, and storing them in the experience as arrays. Instead, I stored the discrete actions directly, and the problem was resolved.
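The fix described above can be sketched as follows (names are hypothetical): keep the discrete index in the experience buffer, and convert it to the 4-motor array only at the moment the environment is stepped:

```python
import numpy as np
from collections import deque

# Illustrative sketch of the fix: store the *discrete index* in the
# replay buffer, not the converted array, so the learner's discrete
# action handling stays consistent.
ACTION_ARRAYS = {0: np.full(4, -1.0), 1: np.full(4, 1.0)}

buffer = deque(maxlen=10_000)

def store_transition(obs, discrete_action: int, reward, next_obs, done):
    # Store the scalar index, not ACTION_ARRAYS[discrete_action].
    buffer.append((obs, discrete_action, reward, next_obs, done))

def env_action(discrete_action: int) -> np.ndarray:
    # Convert only when stepping the environment.
    return ACTION_ARRAYS[discrete_action]
```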
