
[RLlib] KeyError: 'infos' in _process_observations when Using Custom Multi-Agent Environment #45291

Open
clemenjuan opened this issue May 13, 2024 · 2 comments
Labels
bug · P2 · rllib · rllib-newstack

Comments

@clemenjuan

What happened + What you expected to happen

I am experiencing a persistent issue with my custom multi-agent environment in RLlib: the infos dictionary cannot be found, leading to a KeyError: 'infos'. The error is raised while observations are processed in the _process_observations function in env_runner_v2.py.

The error occurs consistently across different configurations and even after ensuring the environment complies with the expected structures for observations, rewards, terminations, truncations, and infos.
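
For reference, this is the per-step return structure RLlib expects from a multi-agent step() under the Gymnasium-style API; the agent ids and observation values below are illustrative, not taken from the report:

# Expected per-step return structure for an RLlib multi-agent env
# (agent ids "sat_0"/"sat_1" and observation values are illustrative):
observations = {"sat_0": [0.0, 1.0], "sat_1": [0.5, 0.5]}
rewards = {"sat_0": 1.0, "sat_1": 0.0}
terminations = {"sat_0": False, "sat_1": False, "__all__": False}
truncations = {"sat_0": False, "sat_1": False, "__all__": False}
infos = {"sat_0": {}, "sat_1": {}}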

Versions / Dependencies

  • RLlib Version: 2.10.0
  • Python Version: 3.11.6
  • Operating System: macOS
  • Framework: PyTorch

Reproduction script

Steps to Reproduce

  1. Environment Setup: I have a custom multi-agent environment with satellites as agents. Observations, rewards, terminations, truncations, and infos are handled per agent.
  2. RLlib Setup: Using PPO with a multi-agent configuration. The environment is registered and used within a standard RLlib training loop (see the setup sketch under Code Snippets below).
  3. Error Encounter: Upon initiating the training, during the first iteration of sampling from the environment, the error KeyError: 'infos' occurs in _process_observations.

Code Snippets

# Sample environment (simplified; the class name is illustrative):
from ray.rllib.env.multi_agent_env import MultiAgentEnv

class SatelliteEnv(MultiAgentEnv):
    def step(self, actions):
        # Process actions...
        observations, rewards, terminations, truncations, infos = {}, {}, {}, {}, {}
        # Logic to fill the above dictionaries (keyed by agent id) based on the
        # environment's dynamics. RLlib also expects the special "__all__" key
        # in terminations and truncations.
        terminations["__all__"] = False
        truncations["__all__"] = False
        return observations, rewards, terminations, truncations, infos

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        observations, infos = {}, {}
        # Logic to fill the dictionaries (keyed by agent id)
        return observations, infos
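
For step 2, a minimal sketch of the registration and PPO setup; the env name "satellite_env", the single shared policy, and the one-off training call are illustrative assumptions, not details from the original report:

from ray.rllib.algorithms.ppo import PPOConfig
from ray.tune.registry import register_env

# Register the custom env under a name RLlib can resolve.
register_env("satellite_env", lambda env_config: SatelliteEnv())

config = (
    PPOConfig()
    .environment("satellite_env")
    .framework("torch")
    .multi_agent(
        # One shared policy for all satellite agents (assumption).
        policies={"shared_policy"},
        policy_mapping_fn=lambda agent_id, *args, **kwargs: "shared_policy",
    )
)

algo = config.build()
algo.train()  # the KeyError: 'infos' surfaces during the first sampling round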

Issue Severity

High: It blocks me from completing my task.

@clemenjuan added the bug and triage labels on May 13, 2024
@clemenjuan
Author

clemenjuan commented May 13, 2024

I found that line 573 in env_runner_v2.py was commented out, so I uncommented it and now it seems to work, but I think this should be checked.

values_dict = {
    SampleBatch.T: episode.length,  # Episodes start at -1 before we
    # add the initial obs. After that, we infer from initial obs at
    # t=0 since that will be our new episode.length.
    SampleBatch.ENV_ID: env_id,
    SampleBatch.AGENT_INDEX: episode.agent_index(agent_id),
    # Last action (SampleBatch.ACTIONS) column will be populated by
    # StateBufferConnector.
    # Reward received after taking action at timestep t.
    SampleBatch.REWARDS: rewards[env_id].get(agent_id, 0.0),
    # After taking action=a, did we reach terminal?
    SampleBatch.TERMINATEDS: agent_terminated,
    # Was the episode truncated artificially
    # (e.g. b/c of some time limit)?
    SampleBatch.TRUNCATEDS: agent_truncated,
    SampleBatch.INFOS: infos[env_id].get(agent_id, {}),  # this line was previously commented out
    SampleBatch.NEXT_OBS: obs,
}
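
With this line active, each agent's info dict is copied into the collected batch under SampleBatch.INFOS, falling back to {} for agents that reported no infos on that step. As a quick environment-side sanity check that this lookup can succeed, one can verify (using the hypothetical SatelliteEnv sketch above) that infos are keyed by the same agent ids as the observations:

# Minimal sketch: infos keyed consistently with obs, so that
# infos[env_id].get(agent_id, {}) finds the intended entries.
env = SatelliteEnv()
obs, infos = env.reset(seed=0)
assert set(infos).issubset(set(obs)), "infos keys should match agent ids in obs"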

@clemenjuan closed this as not planned on May 13, 2024
@clemenjuan reopened this on May 13, 2024
@anyscalesam added the rllib label on May 13, 2024
@simonsays1980
Collaborator

@clemenjuan Thanks for filing this issue. Between the release of ray-2.10.0 and the current release we have changed a lot in the code of the EnvRunner API. Could you give the current version a try and see if the error persists?

@simonsays1980 added the P2 and rllib-newstack labels and removed the triage label on May 15, 2024