Fixed some parameter discrepancy for MDAM #157
Conversation
fixed some parameter discrepancy
fixed some parameter discrepancy
fixed some parameter discrepancy, but still ValueError
This model was working, but we did not complete the beam search part, mostly because some newer methods are more efficient anyway.
self.logit_attention = [
    LogitAttention(
        embedding_dim,
        num_heads,
        mask_inner=mask_inner,
        force_flash_attn=force_flash_attn,
        # force_flash_attn=force_flash_attn,  # fix: not used
We can delete this too; it was left over from previous versions of FlashAttention, which is now implemented natively in PyTorch.
Deleted
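For context on the point above: since PyTorch 2.0, FlashAttention is selected automatically by torch.nn.functional.scaled_dot_product_attention, so a separate force_flash_attn flag is no longer needed. A minimal sketch (not rl4co code, just an illustration of the native dispatch, assuming PyTorch >= 2.0):

import torch
import torch.nn.functional as F

# (batch, heads, seq_len, head_dim); the flash kernel itself also needs CUDA and fp16/bf16 inputs
q = torch.randn(2, 8, 16, 64)
k, v = torch.randn_like(q), torch.randn_like(q)

# PyTorch dispatches to the best available backend (flash, memory-efficient, or math)
out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([2, 8, 16, 64])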
rl4co/models/zoo/mdam/encoder.py
Outdated
self.attention_layer = MultiHeadAttention(
    num_heads, input_dim=embed_dim, embed_dim=embed_dim, sdpa_fn=sdpa_fn
    # num_heads, input_dim=embed_dim, embed_dim=embed_dim, sdpa_fn=sdpa_fn  # fix: redundant parameter
We can delete the commented-out lines too
@@ -63,6 +65,7 @@ def forward(self, x, mask=None, return_transform_loss=False):
    h_embeded = x
    h_old = self.layers(h_embeded)
    # ValueError: too many values to unpack (expected 3); here self.attention_layer(h_old) is a single tensor
    h_new, attn, V = self.attention_layer(h_old)
Since it was working, was something changed here?
I guess the behavior of MultiHeadAttention was changed, wasn't it? @cbhua
Because in this model they reimplemented the MHA class, from what I remember.
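A minimal sketch of the mismatch discussed above, with a hypothetical MHA module standing in for the rl4co class: if the attention layer now returns a single tensor, the old 3-way unpack in the encoder fails, and the call site should be adjusted accordingly.

import torch
import torch.nn as nn

class SingleOutputMHA(nn.Module):
    # Hypothetical stand-in that returns only the output tensor, like the new behavior
    def __init__(self, embed_dim, num_heads):
        super().__init__()
        self.mha = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)

    def forward(self, x):
        out, _ = self.mha(x, x, x)
        return out  # a single tensor, not an (out, attn, V) tuple

layer = SingleOutputMHA(embed_dim=128, num_heads=8)
h_old = torch.randn(4, 20, 128)

# Old call site: raises "ValueError: too many values to unpack (expected 3)"
# h_new, attn, V = layer(h_old)

# Adjusted call site for the single-tensor return:
h_new = layer(h_old)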
rl4co/models/zoo/mdam/decoder.py
Outdated
env_name, {"embedding_dim": embedding_dim} | ||
) | ||
else: | ||
dynamic_embedding = dynamic_embedding |
[Minor] I think this could be self.dynamic_embedding = dynamic_embedding, and merged with L97.
Updated, thanks.
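For illustration of the suggestion above, a sketch of the merged assignment; env_dynamic_embedding and the surrounding constructor are assumptions, since the diff only shows a fragment:

import torch.nn as nn

def env_dynamic_embedding(env_name, config):
    # Hypothetical stand-in for the real embedding factory (not shown in the diff)
    return nn.Linear(config["embedding_dim"], config["embedding_dim"])

class Decoder(nn.Module):
    def __init__(self, env_name, embedding_dim, dynamic_embedding=None):
        super().__init__()
        # Single assignment instead of an if/else branch that ends in a no-op
        self.dynamic_embedding = (
            env_dynamic_embedding(env_name, {"embedding_dim": embedding_dim})
            if dynamic_embedding is None
            else dynamic_embedding
        )

decoder = Decoder("tsp", embedding_dim=128)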
rl4co/models/zoo/mdam/decoder.py
Outdated
self.context = [
    env_context_embedding(env_name, {"embedding_dim": embedding_dim})
    # env_context_embedding(env_name, {"embedding_dim": embedding_dim})
This kind of unused code can be removed directly; since we have Git, it is easy to track. Don't worry.
self.init_embed = nn.Linear(node_dim, embed_dim) if node_dim is not None else None


self.init_embed = nn.Linear(node_dim, embed_dim) if node_dim is not None else None
[Minor] For clarity, we could avoid these extra blank lines.
Note: an update was pushed but I forgot to discuss it here - now it seems to be working, although the beam search part is still missing. However, I would consider this lower priority at the moment.
Fixed some parameter discrepancy for MDAM, but the encoder still raises a ValueError. See GraphAttentionEncoder at models/zoo/mdam/encoder.py:

# ValueError: too many values to unpack (expected 3); here self.attention_layer(h_old) is a single tensor
h_new, attn, V = self.attention_layer(h_old)