
Print model structure like with PyTorch #1357

Open
antimora opened this issue Feb 23, 2024 · 3 comments · May be fixed by #1763
Comments

@antimora
Collaborator

Feature description

Want to see a model's structure at a glance, like when you print a PyTorch model:

import whisper
model = whisper.load_model("tiny")
print(model)

Result:

Whisper(
  (encoder): AudioEncoder(
    (conv1): Conv1d(80, 384, kernel_size=(3,), stride=(1,), padding=(1,))
    (conv2): Conv1d(384, 384, kernel_size=(3,), stride=(2,), padding=(1,))
    (blocks): ModuleList(
      (0-3): 4 x ResidualAttentionBlock(
        (attn): MultiHeadAttention(
          (query): Linear(in_features=384, out_features=384, bias=True)
          (key): Linear(in_features=384, out_features=384, bias=False)
          (value): Linear(in_features=384, out_features=384, bias=True)
          (out): Linear(in_features=384, out_features=384, bias=True)
        )
        (attn_ln): LayerNorm((384,), eps=1e-05, elementwise_affine=True)
        (mlp): Sequential(
          (0): Linear(in_features=384, out_features=1536, bias=True)
          (1): GELU(approximate='none')
          (2): Linear(in_features=1536, out_features=384, bias=True)
        )
        (mlp_ln): LayerNorm((384,), eps=1e-05, elementwise_affine=True)
      )
    )
    (ln_post): LayerNorm((384,), eps=1e-05, elementwise_affine=True)
  )
  (decoder): TextDecoder(
    (token_embedding): Embedding(51865, 384)
    (blocks): ModuleList(
      (0-3): 4 x ResidualAttentionBlock(
        (attn): MultiHeadAttention(
          (query): Linear(in_features=384, out_features=384, bias=True)
          (key): Linear(in_features=384, out_features=384, bias=False)
          (value): Linear(in_features=384, out_features=384, bias=True)
          (out): Linear(in_features=384, out_features=384, bias=True)
        )
        (attn_ln): LayerNorm((384,), eps=1e-05, elementwise_affine=True)
        (cross_attn): MultiHeadAttention(
          (query): Linear(in_features=384, out_features=384, bias=True)
          (key): Linear(in_features=384, out_features=384, bias=False)
          (value): Linear(in_features=384, out_features=384, bias=True)
          (out): Linear(in_features=384, out_features=384, bias=True)
        )
        (cross_attn_ln): LayerNorm((384,), eps=1e-05, elementwise_affine=True)
        (mlp): Sequential(
          (0): Linear(in_features=384, out_features=1536, bias=True)
          (1): GELU(approximate='none')
          (2): Linear(in_features=1536, out_features=384, bias=True)
        )
        (mlp_ln): LayerNorm((384,), eps=1e-05, elementwise_affine=True)
      )
    )
    (ln): LayerNorm((384,), eps=1e-05, elementwise_affine=True)
  )
)

Feature motivation

Lets you see a model's structure at a glance instead of having to review the code.

@antimora antimora added the enhancement Enhance existing features label Feb 23, 2024
@nathanielsimard nathanielsimard added the good first issue Good for newcomers label Feb 24, 2024
@McArthur-Alford
Contributor

McArthur-Alford commented Mar 15, 2024

Did some digging around. This one looks like a pretty easy starting point, so I'll give it a shot.

Is this something that would be preferred as the implementation of display::fmt? Right now, display only writes the name and the number of parameters, which isn't hugely useful.
Alternatively, a new function on the module trait, e.g. tree(), would probably work just fine. A sketch of the display::fmt route follows.
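
A minimal sketch of the display::fmt route in plain Rust (std only). The Linear and Mlp types and their fields are made up for this example and are not Burn's actual API; the point is just that each module implements fmt::Display and a container prints each named child one level deeper:

use std::fmt;

// Illustrative leaf module; type and field names are hypothetical.
struct Linear {
    d_input: usize,
    d_output: usize,
    bias: bool,
}

impl fmt::Display for Linear {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "Linear(in_features={}, out_features={}, bias={})",
            self.d_input, self.d_output, self.bias
        )
    }
}

// Illustrative container module: prints its own name, then each named
// child indented, mirroring PyTorch's "(name): Child(...)" layout.
struct Mlp {
    fc1: Linear,
    fc2: Linear,
}

impl fmt::Display for Mlp {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        writeln!(f, "Mlp(")?;
        writeln!(f, "  (fc1): {}", self.fc1)?;
        writeln!(f, "  (fc2): {}", self.fc2)?;
        write!(f, ")")
    }
}

fn main() {
    let mlp = Mlp {
        fc1: Linear { d_input: 384, d_output: 1536, bias: true },
        fc2: Linear { d_input: 1536, d_output: 384, bias: true },
    };
    println!("{mlp}");
}

The fixed two-space indent only works one level deep; printing arbitrarily nested modules means threading an indentation depth through the calls, which is where the derive machinery would come in.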

@antimora
Collaborator Author

I'd prefer that we implement display::fmt. BTW, you may have to look into the burn-derive crate to get the attribute names and the tree structure.
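
On the burn-derive angle, a sketch of what generated code could look like. Everything here (ModuleDisplay, fmt_tree, Conv1d, Encoder) is hypothetical and hand-written to stand in for macro output; the idea is that the derive already sees the struct name and every field identifier, so it can emit one line per child and recurse with an increased indentation depth:

use std::fmt;

// Hypothetical trait a derive macro could target; not Burn's actual API.
trait ModuleDisplay {
    fn fmt_tree(&self, f: &mut fmt::Formatter<'_>, depth: usize) -> fmt::Result;
}

// Illustrative leaf module: prints on a single line, ignores depth.
struct Conv1d {
    channels_in: usize,
    channels_out: usize,
}

impl ModuleDisplay for Conv1d {
    fn fmt_tree(&self, f: &mut fmt::Formatter<'_>, _depth: usize) -> fmt::Result {
        write!(f, "Conv1d({}, {})", self.channels_in, self.channels_out)
    }
}

struct Encoder {
    conv1: Conv1d,
    conv2: Conv1d,
}

// Hand-written stand-in for what a derive expansion might emit: one line
// per field, each child formatted at depth + 1 so nesting indents correctly.
impl ModuleDisplay for Encoder {
    fn fmt_tree(&self, f: &mut fmt::Formatter<'_>, depth: usize) -> fmt::Result {
        let pad = "  ".repeat(depth);
        writeln!(f, "Encoder(")?;
        write!(f, "{pad}  (conv1): ")?;
        self.conv1.fmt_tree(f, depth + 1)?;
        writeln!(f)?;
        write!(f, "{pad}  (conv2): ")?;
        self.conv2.fmt_tree(f, depth + 1)?;
        writeln!(f)?;
        write!(f, "{pad})")
    }
}

// Display then becomes a thin wrapper that starts the recursion at depth 0.
impl fmt::Display for Encoder {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        self.fmt_tree(f, 0)
    }
}

fn main() {
    let enc = Encoder {
        conv1: Conv1d { channels_in: 80, channels_out: 384 },
        conv2: Conv1d { channels_in: 384, channels_out: 384 },
    };
    println!("{enc}");
    // Encoder(
    //   (conv1): Conv1d(80, 384)
    //   (conv2): Conv1d(384, 384)
    // )
}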

@antimora antimora self-assigned this May 13, 2024
@antimora antimora removed the good first issue Good for newcomers label May 13, 2024
@antimora
Collaborator Author

I have started working on this issue, and I have a design solution that's flexible and robust.

@antimora antimora linked a pull request May 13, 2024 that will close this issue