Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feat] Add Local Search for Solution Improvement #140

Open
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

hyeok9855
Copy link
Contributor

@hyeok9855 hyeok9855 commented Mar 15, 2024

Open for Review; Do Not Merge for Now!

Description

Add local search operators as a post-processing to improve a given solution.
Here, we implement 2-opt for TSP and LocalSearch operator provided by PyVRP for CVRP and CVRPTW.
For other problems (e.g., PDP or scheduling), we couldn't find such a plug-and-play local search operator, and we're looking for contributions to local search for other problems!

Note that I also made some refactorings regarding type hinting.

Motivation and Context

Local search is an essential component for an enhanced CO solver. Many research projects in the NCO field utilize local search operators, such as our recent work GFACS, which trains NN using the solution refined by local search in an off-policy manner.

  • I have raised an issue to propose this change (required for new features and bug fixes)

Types of changes

What types of changes does your code introduce? Remove all that do not apply:

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds core functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (update in the documentation)
  • Example (update in the folder of examples)

Checklist

Go over all the following points, and put an x in all the boxes that apply.
If you are unsure about any of these, don't hesitate to ask. We are here to help!

  • My change requires a change to the documentation.
  • I have updated the tests accordingly (required for a bug fix or a new feature).
  • I have updated the documentation accordingly.

rl4co/envs/common/base.py Outdated Show resolved Hide resolved
rl4co/envs/routing/tsp.py Outdated Show resolved Hide resolved
@@ -161,6 +174,50 @@ def check_solution_validity(td: TensorDict, actions: torch.Tensor):
== actions.data.sort(1)[0]
).all(), "Invalid tour"

@staticmethod
def improve_solution(td: TensorDict, actions: torch.Tensor, **kwargs) -> torch.Tensor:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(as per above, we can call this local_search)


CC: @leonlan this is what we mentioned during our meeting about having an optional local search operator API in RL4CO with pyvrp ; any thoughts about this?

Copy link

@leonlan leonlan Mar 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are unfortunately dropping support for the TSP-based TwoOpt in the next release (v0.8.0), because this operator is not very effective in VRPs. I spoke to @fedebotu in Slack and these are my suggestions:

  • If you want a dedicated TwoOpt operator for TSP, you can try to implement this yourself. This should not be too hard (GPT-4 can give the right solution) and you can make this very performant when implemented in say e.g. Cython. Happy to take a look when you have an implementation.
  • If you want to have local search support for VRPs, the current setup can serve as a good basis. However, some details (e.g., which penalties to set in the CostEvaluator?) need to be addressed. You can always ping me to have a look when you have an implementation.

For now, I assume that you continue to plan using PyVRP v0.7.0 which has support for TwoOpt.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your suggestion!
As you suggested, I will replace the pyvrp TwoOpt with another implementation, and accordingly, I'll change the PyVRP version to the latest one.

For other problems like CVRP(TW) or PDP, I think I should use PyVRP, so I will let you know if there's anything to discuss.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About PyVRP (CVRP, CVRPTW, new variants): we can help with that!
About PDP: from what I know, one-on-one pickup and delivery problems are not available right now, is that correct @leonlan ?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One-to-one PDP is not supported indeed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have any plan for supporting one-to-one pdp by any chance?
If so, when do you think will it be?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no plans for supporting PDP in the near future.

@@ -48,6 +48,10 @@ def test_routing(env_cls, batch_size=2, size=20):
env = env_cls(num_loc=size)
reward, td, actions = rollout(env, env.reset(batch_size=[batch_size]), random_policy)
env.render(td, actions)
try:
env.improve_solution(td, actions)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now this is fine, for the future I think we can move the local_search part in a separate test as it is optional

@fedebotu
Copy link
Member

Great job! 🚀


Another detail I cannot review above: tt seems that the installation fails for Python=3.8. This should be because pyvrp is only available for Python 3.9 onwards (reference), so maybe we can skip its for that case

@@ -161,6 +174,50 @@ def check_solution_validity(td: TensorDict, actions: torch.Tensor):
== actions.data.sort(1)[0]
).all(), "Invalid tour"

@staticmethod
def improve_solution(td: TensorDict, actions: torch.Tensor, **kwargs) -> torch.Tensor:
Copy link

@leonlan leonlan Mar 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are unfortunately dropping support for the TSP-based TwoOpt in the next release (v0.8.0), because this operator is not very effective in VRPs. I spoke to @fedebotu in Slack and these are my suggestions:

  • If you want a dedicated TwoOpt operator for TSP, you can try to implement this yourself. This should not be too hard (GPT-4 can give the right solution) and you can make this very performant when implemented in say e.g. Cython. Happy to take a look when you have an implementation.
  • If you want to have local search support for VRPs, the current setup can serve as a good basis. However, some details (e.g., which penalties to set in the CostEvaluator?) need to be addressed. You can always ping me to have a look when you have an implementation.

For now, I assume that you continue to plan using PyVRP v0.7.0 which has support for TwoOpt.

rl4co/envs/routing/tsp.py Outdated Show resolved Hide resolved
pyproject.toml Outdated Show resolved Hide resolved
rl4co/envs/routing/tsp.py Outdated Show resolved Hide resolved
@fedebotu fedebotu mentioned this pull request Mar 18, 2024
4 tasks
rl4co/envs/common/base.py Outdated Show resolved Hide resolved
@hyeok9855
Copy link
Contributor Author

Changes

  • Merged the updated main branch
  • Added the reward augmentation with local search, which is one of the key components of DeepACO.

@hyeok9855
Copy link
Contributor Author

Changes

@fedebotu @leonlan
If you have time, please review the renewed code!

improved_solution, is_feasible = perform_local_search(
ls_operator,
solution,
int(load_penalty * 10**4), # * 10**4 as we scale the data by 10**4 in `make_data`
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
int(load_penalty * 10**4), # * 10**4 as we scale the data by 10**4 in `make_data`
int(load_penalty * 10**4), # * 10**4 as we scale the data by 10**4 in `make_data`

I suggest making this value a CONSTANT variable and use it make_data to. Perhaps also add a comment to the CONSTANT explaining why it's needed to scale the data so that users can understand how to modify if needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, thanks!

@leonlan
Copy link

leonlan commented Jun 1, 2024

Hi @hyeok9855, I don't have time to go over the code in full detail, but I had a quick glance at the PyVRP part. It looks good to me with a small comment on the constant.

@@ -27,7 +29,10 @@ class AntSystem:
pheromone: Initial pheromone matrix. Defaults to `torch.ones_like(log_heuristic)`.
require_logprobs: Whether to require the log probability of actions. Defaults to False.
use_local_search: Whether to use local_search provided by the env. Default to False.
local_search_params: Arguments to be passed to the local_search function.
use_nls: Whether to use neural-guided local search provided by the env. Default to False.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the difference between use_nls and use_local_search?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nls indicates the local search with neural-guided perturbation, proposed by DeepACO. See here.

@@ -15,6 +15,7 @@
from rl4co.utils.pylogger import get_pylogger

from .generator import TSPGenerator
from .local_search import local_search
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

local_search has numba as a dependency (which is optional), so we could do something like:

try:
    from .local_search import local_search
except:
    local_search = None

Then in env:

def local_search(...):
    assert local_search is not None, "Cannot import local_search module. Make sure to have `numba` installed"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good, thanks!

@@ -166,6 +167,19 @@ def check_solution_validity(td: TensorDict, actions: torch.Tensor):
== actions.data.sort(1)[0]
).all(), "Invalid tour"

def generate_data(self, batch_size) -> TensorDict:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Important] We replaced the generate_data with @cbhua this function with the more modular generator function:(e.g. here).

To generate data, we can call: env.generator(...) instead of env.generate_data

return heuristic_dist

@staticmethod
def select_start_node_fn(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Minor, for now]
By default, now we look for this function inside of the environment as done here, which is a bit more modular. But since we have to transfer this functions yet and is not a hard task, no need to do it now

return actions, reward # type: ignore
td_cpu = td.detach().cpu() # Convert to CPU in advance to minimize the overhead from device transfer
td_cpu["distances"] = get_distance_matrix(td_cpu["locs"])
# TODO: avoid or generalize this, e.g., pre-compute for local search in each env
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we can keep this as todo, but it should be generalized. I think this could be a common classmethod for routing environments


return heatmaps_logits
heatmap += 1e-10 if heatmap.dtype != torch.float16 else 3e-8
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I see a huge trick here!
These values should ideally be constants at the top, e.g:

LOWEST_POSVAL_FP32 = 1e-10
LOWEST_POSVAL_FP16 = 3e-8

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lowest positive value for FP32 is not 1e-10 actually. It is much smaller than that, but 1e-10 is used in DeepACO.

@@ -79,6 +79,10 @@ def test_routing(env_cls, batch_size=2, size=20):
def test_mtvrp(variant, batch_size=2, size=20):
env = MTVRPEnv(generator_params=dict(num_loc=size, variant_preset=variant))
reward, td, actions = rollout(env, env.reset(batch_size=[batch_size]), random_policy)
try:
env.local_search(td, actions)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you add the local search to this environment?

We should have this available for all 16 variants with the code @leonlan made

@fedebotu
Copy link
Member

fedebotu commented Jun 1, 2024

I think we should skip the local search testing with pyvrp for Python < 3.9, I think that is why testing failed

Like:

@pytest.mark.skipif(sys.version_info < (3, 9))
[...]

Maybe in the near future we could just remove testing Python 3.8 since it's a really old version anyways

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants