how to take down ray and put up again in local mode #7249

Open
SiRumCz opened this issue May 9, 2024 · 5 comments · May be fixed by #7280
Labels
new feature/request 💬 Requests and pull requests for new features

Comments

@SiRumCz

SiRumCz commented May 9, 2024

My program is at risk of running out of memory, and part of the problem seems to be a memory leak (idle Ray workers holding a large chunk of memory). I have a for loop that independently runs a series of tasks on chunks of a CSV file. I want to kill Ray after each iteration to release the memory and let Modin bring it up again with fresh Ray workers. My code is the following:

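import pandas
import ray

# Read the CSV in chunks of 5000 rows (the pandas keyword is `chunksize`)
for df_ in pandas.read_csv('xxx.csv', chunksize=5000):
    df_.to_csv(xxx)
    run_my_tasks(xxx)  # Modin will initialize ray in first iteration
    ray.shutdown()  # tear Ray down to release the memory held by its workers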

However, I got the error below:

File "/home/.../lib/python3.9/site-packages/modin/core/execution/ray/common/deferred_execution.py", line 309, in _deconstruct_chain
    output[out_pos] = out_pos
IndexError: list assignment index out of range
@SiRumCz SiRumCz added question ❓ Questions about Modin Triage 🩹 Issues that need triage labels May 9, 2024
@YarShev
Collaborator

YarShev commented May 12, 2024

Hi @SiRumCz, thanks for posting this issue. I guess there might be an issue with initializing Ray multiple times in the Modin codebase. We will have to look into this more deeply. Meanwhile, can you explicitly call ray.init() before run_my_tasks(xxx) to see if it works?

@SiRumCz
Author

SiRumCz commented May 13, 2024

@YarShev Thanks for your response. Yes, I have tried that method, and unfortunately I got:
ValueError: An application is trying to access a Ray object whose owner is unknown(00ffffffffffffffffffffffffffffffffffffff0100000002e1f505). Please make sure that all Ray objects you are trying to access are part of the current Ray session. Note that object IDs generated randomly (ObjectID.from_random()) or out-of-band (ObjectID.from_binary(...)) cannot be passed as a task argument because Ray does not know which task created them. If this was not how your object ID was generated, please file an issue at https://github.com/ray-project/ray/issues/

@YarShev
Collaborator

YarShev commented May 17, 2024

@SiRumCz, could you try calling ray.init() and importlib.reload(pd) before run_my_tasks(xxx), where pd comes from import modin.pandas as pd?

@YarShev YarShev added new feature/request 💬 Requests and pull requests for new features and removed question ❓ Questions about Modin Triage 🩹 Issues that need triage labels May 17, 2024
YarShev added a commit to YarShev/modin that referenced this issue May 17, 2024
Signed-off-by: Igoshev, Iaroslav <iaroslav.igoshev@intel.com>
@YarShev YarShev linked a pull request May 17, 2024 that will close this issue
7 tasks
@YarShev
Collaborator

YarShev commented May 17, 2024

@SiRumCz, I opened #7280, which adds a reload_modin function. I tested it on the following example and it passed for me.

import modin.pandas as pd
from modin.utils import reload_modin
import ray

ray.init(num_cpus=16)  # optional: the example also works with this line commented out

df = pd.read_csv("example.csv")
df = df.abs()
print(df)

ray.shutdown()
reload_modin()
ray.init(num_cpus=16)  # optional: the example also works with this line commented out

df = pd.read_csv("example.csv")
df = df.abs()
print(df)

@SiRumCz
Author

SiRumCz commented May 21, 2024

Thanks, I ended up wrapping my task in a Process so that it runs in a new process; Ray is taken down when that process ends. But I am happy that there will be a feature for this, cheers :-)
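
For anyone landing here later, the process-per-iteration workaround described above might look roughly like the sketch below. This is not the author's actual code: `run_isolated`, the body of `run_my_tasks`, and the chunk file names are hypothetical placeholders, and it assumes the parent process never imports Modin or initializes Ray itself, so that each Ray session lives only inside a child process.

```python
import multiprocessing as mp


def run_isolated(target, *args, start_method=None):
    """Run target(*args) in a child process and return the child's exit code.

    start_method=None uses the platform default ("fork", "spawn", ...).
    """
    ctx = mp.get_context(start_method)
    proc = ctx.Process(target=target, args=args)
    proc.start()
    proc.join()  # wait until the child has exited before starting the next one
    return proc.exitcode


def run_my_tasks(chunk_path):
    # Hypothetical placeholder for the real work. In the scenario from this
    # issue, Modin would be imported here and would initialize Ray inside the
    # child, so the Ray session ends together with this process.
    print(f"processing {chunk_path}")


if __name__ == "__main__":
    for i in range(3):
        chunk_path = f"chunk_{i}.csv"  # hypothetical chunk file name
        exit_code = run_isolated(run_my_tasks, chunk_path)
        if exit_code != 0:
            raise RuntimeError(f"{chunk_path} failed with exit code {exit_code}")
```

Checking the exit code matters here because an exception raised in the child does not propagate to the parent; a non-zero code is the only signal that an iteration failed.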

2 participants