Fixed some parameter discrepancy for MDAM #157

Closed
wants to merge 8 commits into from

Conversation

bokveizen (Member)

Fixed some parameter discrepancies for MDAM, but the encoder still raises a ValueError.
See GraphAttentionEncoder in models/zoo/mdam/encoder.py:

ValueError: too many values to unpack (expected 3)

Here self.attention_layer(h_old) returns a single tensor, yet the call site expects three values:

h_new, attn, V = self.attention_layer(h_old)
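For context, here is a minimal, self-contained sketch of the failure mode (the two module classes below are hypothetical stand-ins, not RL4CO's actual code): unpacking a single returned tensor iterates its first dimension, which is where "too many values to unpack (expected 3)" comes from.

import torch
import torch.nn as nn

class TupleMHA(nn.Module):
    # Hypothetical older-style layer: returns (output, attention weights, values).
    def __init__(self, embed_dim, num_heads):
        super().__init__()
        self.mha = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)

    def forward(self, h):
        out, attn = self.mha(h, h, h, need_weights=True)
        return out, attn, h

class TensorMHA(nn.Module):
    # Hypothetical newer-style layer: returns only the output tensor.
    def __init__(self, embed_dim, num_heads):
        super().__init__()
        self.mha = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)

    def forward(self, h):
        out, _ = self.mha(h, h, h, need_weights=False)
        return out

h_old = torch.randn(4, 10, 128)                # (batch, nodes, embed_dim)
h_new, attn, V = TupleMHA(128, 8)(h_old)       # fine: three values to unpack
h_new = TensorMHA(128, 8)(h_old)               # a single tensor must not be unpacked
# h_new, attn, V = TensorMHA(128, 8)(h_old)    # ValueError: too many values to unpack (expected 3)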

fixed some parameter discrepancy
fixed some parameter discrepancy
fixed some parameter discrepancy, but still ValueError
fedebotu requested review from cbhua and fedebotu and removed the request for cbhua on April 9, 2024 at 01:58
fedebotu (Member) commented Apr 9, 2024

This model was working, but we did not complete the beam search part, mostly because some newer methods are more efficient anyway.
If someone would like to work on it, we'd be glad to delegate ;)


self.logit_attention = [
    LogitAttention(
        embedding_dim,
        num_heads,
        mask_inner=mask_inner,
        force_flash_attn=force_flash_attn,
        # force_flash_attn=force_flash_attn,  # fix: not used
Member

We can delete this too; it was left over from previous versions of FlashAttention, which is now implemented natively in PyTorch.
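As a hedged aside (this shows the PyTorch API, not RL4CO code): since PyTorch 2.0, torch.nn.functional.scaled_dot_product_attention dispatches to a fused FlashAttention-style kernel automatically when one is available, which is why an explicit force_flash_attn flag is no longer needed. A minimal usage sketch:

import torch
import torch.nn.functional as F

# (batch, heads, seq_len, head_dim) layout
q = torch.randn(2, 8, 10, 64)
k = torch.randn(2, 8, 10, 64)
v = torch.randn(2, 8, 10, 64)

# The backend (flash / memory-efficient / math) is selected automatically,
# so there is nothing to "force" from the model's side.
out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([2, 8, 10, 64])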

Member Author

Deleted.

self.attention_layer = MultiHeadAttention(
    num_heads, input_dim=embed_dim, embed_dim=embed_dim, sdpa_fn=sdpa_fn
    # num_heads, input_dim=embed_dim, embed_dim=embed_dim, sdpa_fn=sdpa_fn  # fix: redundant parameter
Member

We can delete comments too

@@ -63,6 +65,7 @@ def forward(self, x, mask=None, return_transform_loss=False):

h_embeded = x
h_old = self.layers(h_embeded)
# ValueError: too many values to unpack (expected 3); here self.attention_layer(h_old) is a single tensor
h_new, attn, V = self.attention_layer(h_old)
Member Author

Since it was working, was something changed here?

Member

I guess the behavior of MultiHeadAttention was changed, wasn't it? @cbhua
Because in this model they remade the MHA class, from what I remember.

        env_name, {"embedding_dim": embedding_dim}
    )
else:
    dynamic_embedding = dynamic_embedding
Member

[Minor] I think this could be self.dynamic_embedding = dynamic_embedding, merged with L97.
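A quick sketch of the suggested merge, with the branch reconstructed from the diff fragment above (the helper name env_dynamic_embedding is assumed here and may not match the file exactly):

if dynamic_embedding is None:
    self.dynamic_embedding = env_dynamic_embedding(
        env_name, {"embedding_dim": embedding_dim}
    )
else:
    self.dynamic_embedding = dynamic_embedding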

Member Author

Updated, thanks.

self.context = [
    env_context_embedding(env_name, {"embedding_dim": embedding_dim})
    # env_context_embedding(env_name, {"embedding_dim": embedding_dim})
Member

This kind of unused code can be removed directly; since we have Git, it is easy to track. Don't worry.

self.init_embed = nn.Linear(node_dim, embed_dim) if node_dim is not None else None

self.init_embed = nn.Linear(node_dim, embed_dim) if node_dim is not None else None
Member

[Minor] For clarity, we could avoid these extra blank lines.

fedebotu (Member)

Note: an update was pushed but I forgot to discuss it here. It now seems to be working, although the beam search part is still missing. However, I would consider this lower priority at the moment.

bokveizen closed this on May 29, 2024