
[Serve] Validate object_store_memory in ray_actor_options error #45321

Open
aRyBernAlTEglOTRO opened this issue May 14, 2024 · 3 comments
Labels: bug (Something that is supposed to be working; but isn't), P1 (Issue that should be fixed within a few weeks), serve (Ray Serve Related Issue)


@aRyBernAlTEglOTRO

What happened + What you expected to happen

  1. The bug
    After running serve run test:app in a terminal, Serve raises a validation error about object_store_memory in ray_actor_options.
  2. Expected behavior
    The deployment should start up successfully.
  3. Useful information
    The terminal logs are as follows:
    2024-05-14 12:29:58,511 INFO scripts.py:499 -- Running import path: 'test:app'.
    2024-05-14 12:29:58,543 INFO worker.py:1564 -- Connecting to existing Ray cluster at address: 0.0.0.0:6379...
    2024-05-14 12:29:58,546 INFO worker.py:1740 -- Connected to Ray cluster. View the dashboard at http://0.0.0.0:8265
    Traceback (most recent call last):
      File "/home/user/.pyenv/versions/3.11.9/lib/python3.11/site-packages/ray/serve/scripts.py", line 548, in run
        serve.run(app, blocking=should_block, name=name, route_prefix=route_prefix)
      File "/home/user/.pyenv/versions/3.11.9/lib/python3.11/site-packages/ray/serve/api.py", line 578, in run
        handle = _run(
                 ^^^^^
      File "/home/user/.pyenv/versions/3.11.9/lib/python3.11/site-packages/ray/serve/api.py", line 492, in _run
        deployments = pipeline_build(target._get_internal_dag_node(), name)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/user/.pyenv/versions/3.11.9/lib/python3.11/site-packages/ray/serve/_private/deployment_graph_build.py", line 81, in build
        serve_root_dag = ray_dag_root_node.apply_recursive(
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/user/.pyenv/versions/3.11.9/lib/python3.11/site-packages/ray/dag/dag_node.py", line 297, in apply_recursive
        return fn(
               ^^^
      File "/home/user/.pyenv/versions/3.11.9/lib/python3.11/site-packages/ray/dag/dag_node.py", line 282, in __call__
        self.cache[node._stable_uuid] = self.fn(node)
                                        ^^^^^^^^^^^^^
      File "/home/user/.pyenv/versions/3.11.9/lib/python3.11/site-packages/ray/serve/_private/deployment_graph_build.py", line 82, in <lambda>
        lambda node: transform_ray_dag_to_serve_dag(node, node_name_generator, name)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/user/.pyenv/versions/3.11.9/lib/python3.11/site-packages/ray/serve/_private/deployment_graph_build.py", line 162, in transform_ray_dag_to_serve_dag
        deployment_shell: Deployment = schema_to_deployment(deployment_schema)
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/user/.pyenv/versions/3.11.9/lib/python3.11/site-packages/ray/serve/deployment.py", line 651, in schema_to_deployment
        replica_config = ReplicaConfig.create(
                         ^^^^^^^^^^^^^^^^^^^^^
      File "/home/user/.pyenv/versions/3.11.9/lib/python3.11/site-packages/ray/serve/_private/config.py", line 517, in create
        config = cls(
                 ^^^^
      File "/home/user/.pyenv/versions/3.11.9/lib/python3.11/site-packages/ray/serve/_private/config.py", line 429, in __init__
        self._validate()
      File "/home/user/.pyenv/versions/3.11.9/lib/python3.11/site-packages/ray/serve/_private/config.py", line 438, in _validate
        self._validate_ray_actor_options()
      File "/home/user/.pyenv/versions/3.11.9/lib/python3.11/site-packages/ray/serve/_private/config.py", line 563, in _validate_ray_actor_options
        ray_option_utils.validate_actor_options(self.ray_actor_options, in_options=True)
      File "/home/user/.pyenv/versions/3.11.9/lib/python3.11/site-packages/ray/_private/ray_option_utils.py", line 340, in validate_actor_options
        actor_options[k].validate(k, v)
      File "/home/user/.pyenv/versions/3.11.9/lib/python3.11/site-packages/ray/_private/ray_option_utils.py", line 32, in validate
        raise TypeError(
    TypeError: The type of keyword 'object_store_memory' must be (<class 'int'>, <class 'NoneType'>), but received type <class 'float'>
    2024-05-14 12:30:00,038 ERR scripts.py:585 -- Received unexpected error, see console logs for more details. Shutting down...
    (ProxyActor pid=846212) INFO 2024-05-14 12:30:00,022 proxy 0.0.0.0 proxy.py:1161 - Proxy starting on node b2280301862051dba174650bd75aa47d6b9749aefb6fe43f795f2e53 (HTTP port: 8000).
    

Versions / Dependencies

Python Version: 3.11.9
Ray Version: 2.21.0
FastAPI Version: 0.111.0
OS: Ubuntu 22.04.4 LTS

Reproduction script

# test.py

from ray import serve
from fastapi import FastAPI

app = FastAPI()


@serve.deployment(ray_actor_options={"object_store_memory": 20 * 1024 * 1024})
@serve.ingress(app)
class FastAPIDeployment:
    @app.post("/")
    async def run(self) -> str:
        return ""


app = FastAPIDeployment.bind()
# bash script
serve run test:app

Issue Severity

Low: It annoys or frustrates me.

@aRyBernAlTEglOTRO aRyBernAlTEglOTRO added bug Something that is supposed to be working; but isn't triage Needs triage (eg: priority, bug/not-bug, and owning component) labels May 14, 2024
@aRyBernAlTEglOTRO
Author

I think this bug is caused by this line of code. After I changed the value from _counting_option("object_store_memory", False) to _resource_option("object_store_memory"), the error went away.
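To illustrate why swapping the option type makes the error disappear, here is a minimal, self-contained sketch. It does not use Ray's code: the Option class and the counting_style_option and resource_style_option helpers are hypothetical stand-ins, and the assumption is only that one variant accepts int or None while the other also accepts float. The error message format is copied from the traceback above.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass
class Option:
    """Hypothetical stand-in for an option validator; not Ray's actual class."""

    allowed_types: Tuple[type, ...]

    def validate(self, keyword: str, value: Any) -> None:
        # Reject any value whose type is not in the allowed tuple.
        if not isinstance(value, self.allowed_types):
            raise TypeError(
                f"The type of keyword '{keyword}' must be {self.allowed_types}, "
                f"but received type {type(value)}"
            )
        # Assumed extra rule: a supplied value must be non-negative.
        if value is not None and value < 0:
            raise ValueError(f"The keyword '{keyword}' must be >= 0, got {value}")


def counting_style_option() -> Option:
    # Assumed behavior: integers (or None) only.
    return Option((int, type(None)))


def resource_style_option() -> Option:
    # Assumed behavior: floats are accepted as well.
    return Option((float, int, type(None)))


def check(option: Option, value: Any) -> str:
    """Return "ok" or the name of the exception type that validation raised."""
    try:
        option.validate("object_store_memory", value)
    except (TypeError, ValueError) as exc:
        return type(exc).__name__
    return "ok"


requested = 20 * 1024 * 1024  # an int, as written in the reproduction script
as_float = float(requested)   # the type the validator reported receiving

print(check(counting_style_option(), requested))  # int passes the int-only check
print(check(counting_style_option(), as_float))   # float fails the int-only check
print(check(resource_style_option(), as_float))   # float passes the wider check
```

Widening the accepted types hides the symptom, but it does not explain where the int became a float in the first place.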

@anyscalesam anyscalesam added the serve Ray Serve Related Issue label May 20, 2024
@shrekris-anyscale shrekris-anyscale added the P1 Issue that should be fixed within a few weeks label May 28, 2024
@shrekris-anyscale shrekris-anyscale self-assigned this May 28, 2024
@shrekris-anyscale shrekris-anyscale removed the triage Needs triage (eg: priority, bug/not-bug, and owning component) label May 28, 2024
@shrekris-anyscale
Contributor

@aRyBernAlTEglOTRO did you set object_store_memory to a float? Or did you set it to an int and then see this error?

It looks like this option is expected to be an int instead of a float (code link). If Serve is converting it to a float, then that sounds like a bug.

@aRyBernAlTEglOTRO
Author

Q: Did you set object_store_memory to a float?
A: Yes, as you can see from the reproduction script:

 @serve.deployment(ray_actor_options={"object_store_memory": 20 * 1024 * 1024})

I think this line of code is what causes the bug: it converts the value of object_store_memory to a float. I think changing it from object_store_memory: float to object_store_memory: int will fix this bug.
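The suspected conversion can be sketched without Serve installed. The ActorOptionsSchema class below is a hypothetical stand-in that coerces its input to the annotated float type, which is an assumption about how the schema layer behaves. The restore_int helper is likewise hypothetical and shows one possible way to hand an int back to the validator.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ActorOptionsSchema:
    """Hypothetical schema with a float-typed field; not Serve's actual class."""

    object_store_memory: Optional[float] = None

    def __post_init__(self) -> None:
        # Mimic a schema layer that coerces numeric input to the annotated type.
        if self.object_store_memory is not None:
            self.object_store_memory = float(self.object_store_memory)


def restore_int(value: Optional[float]) -> Optional[int]:
    """Convert a whole-number float back to int; reject fractional byte counts."""
    if value is None:
        return None
    if not float(value).is_integer():
        raise ValueError(f"object_store_memory must be a whole number, got {value}")
    return int(value)


# The user passes an int, exactly as in the reproduction script.
schema = ActorOptionsSchema(object_store_memory=20 * 1024 * 1024)
coerced = schema.object_store_memory   # now a float after the round trip
restored = restore_int(coerced)        # back to an int before validation

print(type(coerced).__name__, type(restored).__name__)
```

Annotating the field as int, as suggested above, would avoid the round trip entirely. The conversion helper is only relevant if the float annotation has to stay.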
