grid_search resolution code optimization #45267

Open
wants to merge 6 commits into base: master

@rdyro rdyro commented May 11, 2024

A small Python code optimization to significantly speed up grid_search resolution.

Instead of deep-copying the whole unresolved spec for every resolved spec, we can create a skeleton spec, filled with None in place of grid variables, and deep-copy that every time. The fix involves a handful of line changes in one location.
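The skeleton idea can be illustrated with a minimal, self-contained sketch. This is not the Ray Tune code: `assign_value` below is a simplified stand-in written for this example, `resolve_grid` is a hypothetical name, and `itertools.product` replaces the index-based `while` loop used in the actual change, so the order of the generated variants may differ from Ray's.

```python
import copy
import itertools


def assign_value(spec, path, value):
    """Stand-in helper: set the entry at a nested key path to value."""
    for key in path[:-1]:
        spec = spec[key]
    spec[path[-1]] = value


def resolve_grid(unresolved_spec, grid_vars):
    """Yield one resolved spec per combination of grid values.

    grid_vars is a list of (path, values) pairs, where path is a tuple
    of nested keys pointing at a grid variable in unresolved_spec.
    """
    # Build the skeleton once: the spec with every grid variable
    # replaced by None, so the candidate value lists are not copied again.
    skeleton = copy.deepcopy(unresolved_spec)
    for path, _ in grid_vars:
        assign_value(skeleton, path, None)

    for combo in itertools.product(*(values for _, values in grid_vars)):
        # Copy the small skeleton instead of the full unresolved spec.
        spec = copy.deepcopy(skeleton)
        # Populate the values for this single variant.
        for (path, _), value in zip(grid_vars, combo):
            assign_value(spec, path, value)
        yield spec


# Hypothetical spec with two grid variables and one fixed entry.
unresolved = {
    "lr": {"grid_search": [0.1, 0.01]},
    "model": {"layers": {"grid_search": [2, 4, 8]}},
    "seed": 0,
}
grid_vars = [(("lr",), [0.1, 0.01]), (("model", "layers"), [2, 4, 8])]
variants = list(resolve_grid(unresolved, grid_vars))
```

The unresolved spec is left untouched, and each of the six variants is an independent copy of the skeleton with its grid slots filled in.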

I checked that the slow implementation is still present in the latest release.

Why are these changes needed?

Resolving a grid_search with more than 10k elements currently takes upwards of 1 minute on a Ryzen 7 7600X. Ray does not start any trials until the entire grid has been generated, so the slow resolution makes it look as though Ray has hung, which is confusing.
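One plausible way to see where the time goes is to compare the cost of deep-copying a full spec with the cost of deep-copying its skeleton. The sketch below is an assumption-laden illustration, not a benchmark of Ray itself: the spec layout, the `time_copies` helper, and the single 10,000-value grid variable are all made up for this example, and absolute timings will vary by machine.

```python
import copy
import time

# Hypothetical spec: one grid variable with many candidate values.
values = list(range(10_000))
unresolved_spec = {"config": {"x": {"grid_search": values}, "fixed": "abc"}}

# Skeleton: the same spec with the grid variable replaced by None.
skeleton = copy.deepcopy(unresolved_spec)
skeleton["config"]["x"] = None


def time_copies(obj, n=50):
    """Return the seconds spent deep-copying obj n times."""
    start = time.perf_counter()
    for _ in range(n):
        copy.deepcopy(obj)
    return time.perf_counter() - start


full_s = time_copies(unresolved_spec)
skeleton_s = time_copies(skeleton)
print(f"full spec: {full_s:.4f}s, skeleton: {skeleton_s:.4f}s")
```

Under these assumptions, every copy of the full spec also copies the whole list of candidate values, so resolving one variant per value repeats that work for each variant, while copying the skeleton stays cheap regardless of grid size.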

Related issue number

N/A

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Robert Dyro <rdyro@stanford.edu>
Contributor

@matthewdeng matthewdeng left a comment

Thanks for the contribution! Code looks good, leaving one comment for understandability.

Comment on lines 428 to 433
+        # a skeleton is easier to copy for every iteration
+        unresolved_spec_skeleton = copy.deepcopy(unresolved_spec)
+        for path, _ in grid_vars:
+            assign_value(unresolved_spec_skeleton, path, None)
         while value_indices[-1] < len(grid_vars[-1][1]):
-            spec = copy.deepcopy(unresolved_spec)
+            spec = copy.deepcopy(unresolved_spec_skeleton)
Contributor

Could you clarify the comments here to describe what's happening in each step? Specifically, I think it would be good to explain what the "skeleton" is (i.e. the spec with the grid_vars set to None), and then add a comment inside the while loop noting that we are populating the values for a single variant.

Author

That's a great suggestion; I'll add the comments ASAP.

Author

Let me know if that's what you had in mind or if you have any feedback.

…ere values are going to be filled

Signed-off-by: Robert Dyro <rdyro@stanford.edu>