Add module __repr__ methods #2191

Closed · wants to merge 1 commit

Conversation

@1ytic (Contributor) commented Sep 4, 2024

Adds a __repr__ method to Module and ModuleList. It is a straightforward copy-paste from the PyTorch implementation, and it helps when debugging the mapping between the HF and TensorRT-LLM modules. For example:

[09/04/2024-16:23:28] [TRT-LLM] [I] HuggingFace model: Qwen2MoeForCausalLM(
  (model): Qwen2MoeModel(
    (embed_tokens): Embedding(151936, 3584)
    (layers): ModuleList(
      (0-27): 28 x Qwen2MoeDecoderLayer(
        (self_attn): Qwen2MoeSdpaAttention(
          (rotary_emb): Qwen2MoeRotaryEmbedding()
          (k_proj): QuantLinear()
          (o_proj): QuantLinear()
          (q_proj): QuantLinear()
          (v_proj): QuantLinear()
        )
        (mlp): Qwen2MoeSparseMoeBlock(
          (gate): Linear(in_features=3584, out_features=64, bias=False)
          (experts): ModuleList(
            (0-63): 64 x Qwen2MoeMLP(
              (act_fn): SiLU()
              (down_proj): QuantLinear()
              (gate_proj): QuantLinear()
              (up_proj): QuantLinear()
            )
          )
          (shared_expert): Qwen2MoeMLP(
            (act_fn): SiLU()
            (down_proj): QuantLinear()
            (gate_proj): QuantLinear()
            (up_proj): QuantLinear()
          )
          (shared_expert_gate): Linear(in_features=3584, out_features=1, bias=False)
        )
        (input_layernorm): Qwen2MoeRMSNorm()
        (post_attention_layernorm): Qwen2MoeRMSNorm()
      )
    )
    (norm): Qwen2MoeRMSNorm()
  )
  (lm_head): Linear(in_features=3584, out_features=151936, bias=False)
)
[09/04/2024-16:23:28] [TRT-LLM] [I] TensorRT-LLM model: QWenForCausalLM(
  (transformer): QWenModel(
    (vocab_embedding): Embedding()
    (layers): DecoderLayerList(
      (0-27): 28 x QWenDecoderLayer(
        (input_layernorm): RmsNorm()
        (attention): Attention(
          (qkv): WeightOnlyGroupwiseQuantLinear()
          (dense): WeightOnlyGroupwiseQuantRowLinear()
        )
        (shared_expert): MLP(
          (fc): WeightOnlyGroupwiseQuantLinear()
          (proj): WeightOnlyGroupwiseQuantRowLinear()
        )
        (shared_expert_gate): RowLinear()
        (mlp): MixtureOfExperts(
          (router): RowLinear()
          (fc): MOEWeightWrapper()
          (proj): MOEWeightWrapper()
        )
        (post_layernorm): RmsNorm()
      )
    )
    (ln_f): RmsNorm()
  )
  (lm_head): Linear()
)
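
To make the change concrete without opening the diff, here is a minimal, self-contained sketch of the PyTorch-style __repr__ that the commit ports. The _modules registry and the extra_repr hook follow PyTorch's nn.Module conventions; the actual attribute names in tensorrt_llm.Module may differ slightly, so treat this as illustrative rather than the literal patch.

def _addindent(s, num_spaces):
    # Indent every line after the first by num_spaces spaces, so a nested
    # child's body lines up under its "(name): " prefix.
    lines = s.split('\n')
    if len(lines) == 1:
        return s
    first = lines.pop(0)
    rest = '\n'.join((num_spaces * ' ') + line for line in lines)
    return first + '\n' + rest

class Module:
    def __init__(self):
        # name -> child Module, kept in registration order.
        object.__setattr__(self, '_modules', {})

    def __setattr__(self, name, value):
        # Register child modules so __repr__ can recurse over them.
        if isinstance(value, Module):
            self._modules[name] = value
        object.__setattr__(self, name, value)

    def extra_repr(self):
        # Subclasses override this to show their config,
        # e.g. "in_features=3584, out_features=64, bias=False".
        return ''

    def __repr__(self):
        extra = self.extra_repr()
        extra_lines = extra.split('\n') if extra else []
        child_lines = [
            '(%s): %s' % (name, _addindent(repr(child), 2))
            for name, child in self._modules.items()
        ]
        lines = extra_lines + child_lines
        main_str = self.__class__.__name__ + '('
        if lines:
            if len(extra_lines) == 1 and not child_lines:
                # Leaf module: single-line form, e.g. QuantLinear().
                main_str += extra_lines[0]
            else:
                main_str += '\n  ' + '\n  '.join(lines) + '\n'
        return main_str + ')'

With this in place, print(model) renders the nested tree shown in the logs above. The grouped lines, such as (0-27): 28 x QWenDecoderLayer(...), come from an additional __repr__ on the list containers (ModuleList and DecoderLayerList), which in PyTorch collapses consecutive identical children into a single start-end range.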

@lfr-0531 (Collaborator) commented Sep 8, 2024

@jershi425 Could you please take a look at this PR? Thanks.

@lfr-0531 added the triaged (Issue has been triaged by maintainers) label on Sep 8, 2024
@lfr-0531 requested a review from jershi425 on Sep 8, 2024 at 10:14
@lfr-0531 (Collaborator) commented Sep 9, 2024

@1ytic Thanks for your contribution. I checked with @jershi425 and it looks good to him. We'll merge your changes into the internal codebase.

@kaiyux mentioned this pull request on Sep 24, 2024
@hchings (Collaborator) commented Sep 27, 2024

Closing this out as it's been merged.

@hchings closed this on Sep 27, 2024