[Feat] Add Local Search for Solution Improvement #140
base: main
Conversation
rl4co/envs/routing/tsp.py (Outdated)
@@ -161,6 +174,50 @@ def check_solution_validity(td: TensorDict, actions: torch.Tensor):
        == actions.data.sort(1)[0]
    ).all(), "Invalid tour"

    @staticmethod
    def improve_solution(td: TensorDict, actions: torch.Tensor, **kwargs) -> torch.Tensor:
(As per above, we can call this `local_search`.)
CC @leonlan: this is what we mentioned during our meeting about having an optional local search operator API in RL4CO with PyVRP; any thoughts about this?
We are unfortunately dropping support for the TSP-based TwoOpt in the next release (v0.8.0), because this operator is not very effective in VRPs. I spoke to @fedebotu on Slack, and these are my suggestions:
- If you want a dedicated TwoOpt operator for TSP, you can try to implement it yourself. This should not be too hard (GPT-4 can give the right solution), and you can make it very performant when implemented in, say, Cython. Happy to take a look when you have an implementation.
- If you want local search support for VRPs, the current setup can serve as a good basis. However, some details (e.g., which penalties to set in the CostEvaluator?) need to be addressed. You can always ping me to have a look when you have an implementation.
For now, I assume you plan to continue using PyVRP v0.7.0, which has support for TwoOpt.
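For reference, a dedicated 2-opt operator along the lines suggested above could start from a minimal pure-Python/NumPy sketch like the following (a hypothetical illustration, not code from this PR; the function name `two_opt` and the array-based interface are assumptions), which could later be ported to Cython or numba for performance:

```python
import numpy as np

def two_opt(tour: np.ndarray, dist: np.ndarray, max_iters: int = 1000) -> np.ndarray:
    """Best-improvement 2-opt: repeatedly reverse the segment that gives the
    largest reduction in tour length until no improving move remains."""
    tour = tour.copy()
    n = len(tour)
    for _ in range(max_iters):
        best_delta, best_move = 0.0, None
        for i in range(n - 1):
            for j in range(i + 2, n):
                a, b = tour[i], tour[i + 1]
                c, d = tour[j], tour[(j + 1) % n]
                if a == d:  # the two edges share a node; reversing is a no-op
                    continue
                # Cost change of replacing edges (a,b), (c,d) with (a,c), (b,d)
                delta = dist[a, c] + dist[b, d] - dist[a, b] - dist[c, d]
                if delta < best_delta - 1e-12:
                    best_delta, best_move = delta, (i, j)
        if best_move is None:
            return tour  # local optimum: no improving 2-opt move left
        i, j = best_move
        tour[i + 1 : j + 1] = tour[i + 1 : j + 1][::-1]
    return tour
```

For a batched tensor of tours this would be applied row by row (or compiled with a numba `@njit` decorator on essentially the same loop), which is the usual route to making it fast enough for training-time use.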
Thank you for your suggestion!
As you suggested, I will replace the PyVRP TwoOpt with another implementation, and accordingly I'll update the PyVRP version to the latest one.
For other problems like CVRP(TW) or PDP, I think I should use PyVRP, so I will let you know if there's anything to discuss.
About PyVRP (CVRP, CVRPTW, new variants): we can help with that!
About PDP: from what I know, one-to-one pickup and delivery problems are not available right now; is that correct, @leonlan?
One-to-one PDP is not supported indeed.
Do you have any plans to support one-to-one PDP, by any chance?
If so, when do you think that would be?
There are no plans for supporting PDP in the near future.
tests/test_envs.py
Outdated
@@ -48,6 +48,10 @@ def test_routing(env_cls, batch_size=2, size=20):
    env = env_cls(num_loc=size)
    reward, td, actions = rollout(env, env.reset(batch_size=[batch_size]), random_policy)
    env.render(td, actions)
    try:
        env.improve_solution(td, actions)
For now this is fine; in the future, I think we can move the local_search part into a separate test, as it is optional.
Great job! 🚀 Another detail I cannot review above: it seems that the installation fails for Python 3.8. This should be because
@fedebotu @leonlan
improved_solution, is_feasible = perform_local_search(
    ls_operator,
    solution,
    int(load_penalty * 10**4),  # * 10**4 as we scale the data by 10**4 in `make_data`
I suggest making this value a CONSTANT and using it in make_data too. Perhaps also add a comment to the constant explaining why the data needs to be scaled, so that users can understand how to modify it if needed.
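As a sketch of that suggestion (hypothetical names: `SCALING_FACTOR` and `scale` are not from this PR), the shared constant could look like:

```python
# PyVRP operates on integer data, so float coordinates, demands, and
# penalties must be scaled to integers before being passed to it.
# 10**4 preserves four decimal places of precision; change this constant
# if your data needs more or less resolution.
SCALING_FACTOR = 10**4

def scale(value: float) -> int:
    """Convert a float quantity to PyVRP's integer domain."""
    return int(round(value * SCALING_FACTOR))
```

Both `make_data` and the penalty computation would then reference `SCALING_FACTOR` instead of a bare `10**4` literal, so the two can never drift apart.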
Good point, thanks!
Hi @hyeok9855, I don't have time to go over the code in full detail, but I had a quick glance at the PyVRP part. It looks good to me, with a small comment on the constant.
@@ -27,7 +29,10 @@ class AntSystem:
    pheromone: Initial pheromone matrix. Defaults to `torch.ones_like(log_heuristic)`.
    require_logprobs: Whether to require the log probability of actions. Defaults to False.
    use_local_search: Whether to use local_search provided by the env. Defaults to False.
    local_search_params: Arguments to be passed to the local_search function.
    use_nls: Whether to use neural-guided local search provided by the env. Defaults to False.
What is the difference between `use_nls` and `use_local_search`?
`nls` indicates the local search with neural-guided perturbation, proposed by DeepACO. See here.
@@ -15,6 +15,7 @@
from rl4co.utils.pylogger import get_pylogger

from .generator import TSPGenerator
from .local_search import local_search
`local_search` has `numba` as a dependency (which is optional), so we could do something like:

try:
    from .local_search import local_search
except ImportError:
    local_search = None

Then in the env:

def local_search(...):
    assert local_search is not None, "Cannot import the local_search module. Make sure you have `numba` installed."
Good, thanks!
@@ -166,6 +167,19 @@ def check_solution_validity(td: TensorDict, actions: torch.Tensor):
        == actions.data.sort(1)[0]
    ).all(), "Invalid tour"

    def generate_data(self, batch_size) -> TensorDict:
    return heuristic_dist

@staticmethod
def select_start_node_fn(
[Minor, for now]
By default, we now look for this function inside the environment, as done here, which is a bit more modular. But since these functions have yet to be transferred and it is not a hard task, there is no need to do it now.
    return actions, reward  # type: ignore

td_cpu = td.detach().cpu()  # Convert to CPU in advance to minimize the overhead from device transfer
td_cpu["distances"] = get_distance_matrix(td_cpu["locs"])
# TODO: avoid or generalize this, e.g., pre-compute for local search in each env
Yes, we can keep this as a todo, but it should be generalized. I think this could be a common classmethod for routing environments.
    return heatmaps_logits

heatmap += 1e-10 if heatmap.dtype != torch.float16 else 3e-8
Nice, I see a huge trick here!
These values should ideally be constants at the top, e.g.:
LOWEST_POSVAL_FP32 = 1e-10
LOWEST_POSVAL_FP16 = 3e-8
Actually, the lowest positive value for FP32 is not 1e-10; it is much smaller than that, but 1e-10 is the value used in DeepACO.
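A quick way to see why the two epsilons differ is this standalone check (using NumPy's `float16`, which follows the same IEEE half-precision format as `torch.float16`):

```python
import numpy as np

# The smallest positive (subnormal) float16 is 2**-24, roughly 5.96e-8,
# so the FP32-style epsilon 1e-10 silently rounds to zero in half
# precision, while 3e-8 rounds up to a representable subnormal value.
eps_fp32_style = np.float16(1e-10)
eps_fp16_style = np.float16(3e-8)
print(eps_fp32_style == 0.0)  # True: 1e-10 underflows in fp16
print(eps_fp16_style > 0.0)   # True: 3e-8 survives in fp16
```

Adding 1e-10 to an fp16 heatmap would therefore be a no-op, defeating the purpose of the epsilon, which is why the dtype-dependent value is needed.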
@@ -79,6 +79,10 @@ def test_routing(env_cls, batch_size=2, size=20):
def test_mtvrp(variant, batch_size=2, size=20):
    env = MTVRPEnv(generator_params=dict(num_loc=size, variant_preset=variant))
    reward, td, actions = rollout(env, env.reset(batch_size=[batch_size]), random_policy)
    try:
        env.local_search(td, actions)
Did you add the local search to this environment?
We should have this available for all 16 variants with the code @leonlan made
I think we should skip the local search testing for Python 3.8, with something like: @pytest.mark.skipif(sys.version_info < (3, 9), reason="requires Python >= 3.9")
[...] Maybe in the near future we could just remove testing on Python 3.8, since it's a really old version anyway.
Open for Review; Do Not Merge for Now!
Description
Add local search operators as a post-processing to improve a given solution.
Here, we implement 2-opt for TSP and use the LocalSearch operator provided by PyVRP for CVRP and CVRPTW.
For other problems (e.g., PDP or scheduling), we couldn't find such a plug-and-play local search operator, and we're looking for contributions to local search for other problems!
Note that I also made some refactorings regarding type hinting.
Motivation and Context
Local search is an essential component of an enhanced CO solver. Many research projects in the NCO field utilize local search operators, such as our recent work GFACS, which trains the NN in an off-policy manner using solutions refined by local search.
Types of changes
What types of changes does your code introduce? Remove all that do not apply:
Checklist
Go over all the following points, and put an `x` in all the boxes that apply. If you are unsure about any of these, don't hesitate to ask. We are here to help!