[Bug Report] FetchPickAndPlace-v2 does not yield reproducible results #207

amacati · 2024-02-10T08:56:46Z

Describe the bug
The gymnasium API allows users to seed the environment on each reset to yield reproducible results. Running the environment with the same seed should always give the exact same results. While the documentation recommends that users should seed reset only once, it does not forbid seeding multiple times.

FetchPickAndPlace-v2 does not yield reproducible results under these conditions. The reset observation is identical, but the observations start deviating at the first environment step using identical actions.

Code example

import gymnasium
import numpy as np


def test_reproducibility(env: gymnasium.Env, seed: int = 42):
    env.action_space.seed(seed)  # Reproducible actions
    action = env.action_space.sample()  # Same random action for both runs
    env.reset(seed=seed)
    obs_1, _, _, _, _ = env.step(action)
    env.reset(seed=seed)  # Same seed should produce the same observations
    obs_2, _, _, _, _ = env.step(action)  # Identical action
    if isinstance(obs_1, dict):
        for key in obs_1:
            assert np.all(obs_1[key] == obs_2[key])  # Assertion error: different observations
    else:
        assert np.all(obs_1 == obs_2)
    print(f"Reproducibility test passed for {env.unwrapped.spec.id}")


def main():
    test_reproducibility(gymnasium.make('CartPole-v1'))  # Works
    test_reproducibility(gymnasium.make("FetchPickAndPlace-v2"))  # Fails


if __name__ == '__main__':
    main()

Stack Trace:

Reproducibility test passed for CartPole-v1
Traceback (most recent call last):
  File "/home/amacati/repos/Gymnasium-Robotics/bug_report.py", line 26, in <module>
    main()
  File "/home/amacati/repos/Gymnasium-Robotics/bug_report.py", line 22, in main
    test_reproducibility(gymnasium.make("FetchPickAndPlace-v2"))  # Fails
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/amacati/repos/Gymnasium-Robotics/bug_report.py", line 14, in test_reproducibility
    assert np.all(obs_1[key] == obs_2[key])  # Assertion error: different observations
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError

System Info

Clone the latest Gymnasium-Robotics commit (commit 50c8019)
New mamba environment with Python 3.11
Install Gymnasium-Robotics with pip install -e .
Ubuntu 22.04 Jammy
Python 3.11.7

Additional context
The differences are small enough that they sometimes pass an np.allclose assert. In the example above, the object rotation in observation 1 is
[-5.18150577e-08 7.97154734e-08 -1.37921664e-16]
and
[-5.18150577e-08 7.97154734e-08 -1.37780312e-16]
in observation 2. Note the difference in the z rotation. In fact, none of the three rotation components are equal, but the differences in the other two are too small to show up without printing additional precision.
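To make the gap between exact and tolerance-based comparison concrete, here is a small sketch that compares the two rotation vectors quoted above. It uses only the printed (truncated) values, so the first two components are identical here even though they differ at higher precision in the actual run. The variable names are just for illustration.

```python
import numpy as np

# Object rotations as printed in the report (truncated output, not full precision).
rot_1 = np.array([-5.18150577e-08, 7.97154734e-08, -1.37921664e-16])
rot_2 = np.array([-5.18150577e-08, 7.97154734e-08, -1.37780312e-16])

# Exact element-wise equality, which is what the reproducibility test asserts.
exact_match = np.array_equal(rot_1, rot_2)

# Tolerance-based comparison with NumPy's defaults (rtol=1e-05, atol=1e-08).
close_default = np.allclose(rot_1, rot_2)

# Relative tolerance only: without an absolute floor the z component stands out.
close_relative = np.allclose(rot_1, rot_2, rtol=1e-5, atol=0.0)

print(exact_match, close_default, close_relative)
```

The exact comparison fails on the z component, while the default absolute tolerance is orders of magnitude larger than the values themselves and hides the difference.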

The inconsistencies arise from the FetchPickAndPlace environment's use of mocap bodies. The position and quaternions of the mocap bodies are currently not reset properly.

Furthermore, the MuJoCo solver uses warm starts, and mjData caches the last controls. The current implementation does not reset these either. env.reset(seed=seed) yields reproducible results only if all four of these mjData fields (mocap positions, mocap quaternions, warm-start values, and controls) are restored to their initial states.
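A minimal sketch of the save-and-restore idea described above. The field names mocap_pos, mocap_quat, qacc_warmstart and ctrl are my assumption about which four mjData fields are meant; the report does not name them. To stay runnable without MuJoCo, a SimpleNamespace stands in for mjData, and snapshot_fields and restore_fields are hypothetical helper names, not part of Gymnasium-Robotics.

```python
from types import SimpleNamespace

import numpy as np

# Assumed field names; the report only says "these four mjData fields".
RESET_FIELDS = ("mocap_pos", "mocap_quat", "qacc_warmstart", "ctrl")


def snapshot_fields(data, fields=RESET_FIELDS):
    """Copy the current values of the given array fields."""
    return {name: np.copy(getattr(data, name)) for name in fields}


def restore_fields(data, snapshot):
    """Write the saved values back in place, keeping the original buffers."""
    for name, value in snapshot.items():
        getattr(data, name)[:] = value


# Stand-in for mjData: one mocap body, 3 degrees of freedom, 2 actuators.
data = SimpleNamespace(
    mocap_pos=np.zeros((1, 3)),
    mocap_quat=np.array([[1.0, 0.0, 0.0, 0.0]]),
    qacc_warmstart=np.zeros(3),
    ctrl=np.zeros(2),
)
initial = snapshot_fields(data)  # taken once, right after construction

# Simulate an episode that leaves stale state behind.
data.mocap_pos += 0.1
data.qacc_warmstart[:] = [0.3, -0.2, 0.1]
data.ctrl[:] = 1.0

restore_fields(data, initial)  # what a reset would have to do
```

Writing into the existing arrays with `[:]` rather than rebinding the attributes matters for real bindings, where the arrays are views into simulator memory.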

I will open up a pull request that fixes this.

Checklist

I have checked that there is no similar issue in the repo (required)

amacati · 2024-02-10T10:15:06Z

I have opened PR #208, which fixes this behaviour.

Kallinteris-Andreas · 2024-02-10T10:40:23Z

I can replicate the results on my machine. Note that during testing we use atol=0.00001:
https://github.com/Farama-Foundation/Gymnasium/blob/72cfbc204beca309579681b1201990a3d706e070/gymnasium/utils/env_checker.py#LL57

I expanded the test to cover all robotics environments:

import gymnasium
import numpy as np
import pytest
from gymnasium.utils.env_checker import data_equivalence

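# Collect every registered environment whose entry point lives in gymnasium_robotics.
# entry_point can be a callable instead of a string, so check the type first.
robotics_full_env_list = []
for env_id, spec in gymnasium.envs.registration.registry.items():
    if isinstance(spec.entry_point, str) and spec.entry_point.startswith("gymnasium_robotics"):
        robotics_full_env_list.append(env_id)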

@pytest.mark.parametrize("env_id", robotics_full_env_list)
def test_reproducibility(env_id: str, seed: int = 42):
    env = gymnasium.make(env_id)
    env.action_space.seed(seed)  # Reproducible actions
    action = env.action_space.sample()  # Same random action for both runs
    env.reset(seed=seed)
    obs_1, _, _, _, _ = env.step(action)
    env.reset(seed=seed)  # Same seed should produce the same observations
    obs_2, _, _, _, _ = env.step(action)  # Identical action
    if isinstance(obs_1, dict):
        for key in obs_1:
            assert np.all(obs_1[key] == obs_2[key])  # Assertion error: different observations
    else:
        assert np.all(obs_1 == obs_2)
    assert data_equivalence(obs_1, obs_2)
    print(f"Reproducibility test passed for {env.unwrapped.spec.id}")

Other environments also seem to fail.

amacati · 2024-02-10T18:58:33Z

Yes, I already suspected that any environment that uses Mujoco and allows for solver warm starts might have this issue.

By the way, I am not sure about the performance impact of removing warm starts. If that is something you are worried about, it might be a solution to just reset the buffers if a seed is passed to env.reset.
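The compromise suggested here could look roughly like the sketch below: the cached solver state is cleared only when a seed is passed, so unseeded resets keep the warm start. ToyEnv and its attributes are hypothetical stand-ins for a MuJoCo-backed environment; this is not how RobotEnv is implemented.

```python
import numpy as np


class ToyEnv:
    """Hypothetical environment holding a warm-start buffer and cached controls."""

    def __init__(self, n_dof: int = 3, n_ctrl: int = 2):
        self.qacc_warmstart = np.zeros(n_dof)
        self.ctrl = np.zeros(n_ctrl)

    def step(self, action):
        # Pretend the solver leaves state behind after every step.
        self.ctrl[:] = action
        self.qacc_warmstart += 0.5

    def reset(self, seed=None):
        if seed is not None:
            # A seeded reset asks for reproducibility, so drop cached solver state.
            self.qacc_warmstart[:] = 0.0
            self.ctrl[:] = 0.0
        # Unseeded resets leave the buffers alone and keep the warm start.


env = ToyEnv()
env.step(np.ones(2))

env.reset()  # no seed: buffers untouched
kept = env.qacc_warmstart.copy()

env.reset(seed=42)  # seeded: buffers cleared
cleared = env.qacc_warmstart.copy()
```

The trade-off is that seeded and unseeded resets then behave differently, which may or may not be acceptable depending on how much the warm start actually helps.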

Kallinteris-Andreas · 2024-02-11T04:21:08Z

Seems to affect all RobotEnv environments:

FetchSlide-v1
FetchSlide-v2
FetchPickAndPlace-v1
FetchPickAndPlace-v2
FetchReach-v1
FetchReach-v2
FetchPush-v1
FetchPush-v2
HandReach-v0
HandReach-v1
FetchSlideDense-v1
FetchSlideDense-v2
FetchPickAndPlaceDense-v1
FetchPickAndPlaceDense-v2
FetchReachDense-v1
FetchReachDense-v2
FetchPushDense-v1
FetchPushDense-v2
HandReachDense-v0
HandReachDense-v1

amacati · 2024-02-11T08:04:34Z

Should I update my PR to fix all of them?

Kallinteris-Andreas · 2024-02-11T08:24:59Z

Updating RobotEnv should, in theory, fix all of them (all the MuJoCo-based ones at least).

amacati · 2024-02-11T09:56:38Z

I pushed a new fix (5e14ea1) that covers both the mujoco and mujoco_py envs. It passes all tests on my side, including the new reproducibility tests. Feel free to have a look.

amacati mentioned this issue Feb 11, 2024

Add new Fetch-v3 and HandReacher-v2 environments (Fix reproducibility issues) #208

Merged

7 tasks

This was referenced Feb 16, 2024

[Proposal] Improve check_env Farama-Foundation/Gymnasium#927

Closed

Add check_mujoco_reset_state Farama-Foundation/Gymnasium#928

Merged

Kallinteris-Andreas closed this as completed in #208 May 29, 2024