Count of Conformer parameters mismatch with that in the paper #35

Open
maxwellzh opened this issue Oct 18, 2021 · 4 comments

Comments

@maxwellzh

In the original Conformer paper, the reported parameter counts are:
[Image: Screenshot 2021-10-18 3:22:54 PM]

However, with the implementation in this repo, the parameter counts are slightly different:

Conformer  small: 10.16 M
Conformer medium: 31.86 M
Conformer  large: 120.11 M
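
To put a number on "slightly different", here is a minimal sketch that computes the relative gap for each size. The measured values are the ones listed above. The paper values (10.3 M, 30.7 M, 118.8 M) are an assumption: they are recalled from the paper's model-size table and are not readable from this thread, so please check them against the paper before relying on the output.

```python
# Parameter counts in millions, as measured with this repo (from the list above).
measured = {"small": 10.16, "medium": 31.86, "large": 120.11}

# ASSUMPTION: paper figures recalled from the Conformer paper's model table.
# They are not stated in this thread; verify them before use.
paper = {"small": 10.3, "medium": 30.7, "large": 118.8}


def relative_gap(measured_m: float, paper_m: float) -> float:
    """Signed gap of the measured count relative to the paper count, in percent."""
    return (measured_m - paper_m) / paper_m * 100.0


gaps = {size: relative_gap(measured[size], paper[size]) for size in measured}

for size, gap in gaps.items():
    print("Conformer {:>6}: {:+.2f} %".format(size, gap))
```

Under those assumed paper values, the small model comes out below the paper figure while the medium and large models come out above it, so the mismatch does not look like a single constant offset.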

I measured the sizes with this script:

from conformer import Conformer


def count_parameters(model) -> int:
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


models = {
    'small': Conformer(
        num_classes=1000,
        input_dim=80,
        encoder_dim=144,
        decoder_dim=320,
        num_encoder_layers=16,
        num_decoder_layers=1,
        num_attention_heads=4,
        conv_kernel_size=31
    ),
    'medium': Conformer(
        num_classes=1000,
        input_dim=80,
        encoder_dim=256,
        decoder_dim=640,
        num_encoder_layers=16,
        num_decoder_layers=1,
        num_attention_heads=4,
        conv_kernel_size=31
    ),
    'large': Conformer(
        num_classes=1000,
        input_dim=80,
        encoder_dim=512,
        decoder_dim=640,
        num_encoder_layers=17,
        num_decoder_layers=1,
        num_attention_heads=8,
        conv_kernel_size=31
    )
}

for size, m in models.items():
    print("Conformer {:>6}: {:.2f} M".format(size, count_parameters(m)/1e6))

Since the convolution kernel size can't be set to 32 in this implementation, I set it to 31. That change alone shouldn't make much difference in the number of parameters.
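
A rough back-of-the-envelope check of the kernel-size point, as a sketch rather than a statement about this repo's exact layers. It assumes the kernel size only affects one depthwise 1D convolution per encoder layer, that this convolution runs over `encoder_dim` channels, and that its weight has shape `(channels, 1, kernel_size)`. The helper name `depthwise_conv1d_params` is made up for illustration.

```python
def depthwise_conv1d_params(channels: int, kernel_size: int, bias: bool = False) -> int:
    """Parameter count of a depthwise 1D convolution (one filter per channel)."""
    weight = channels * kernel_size  # weight shape: (channels, 1, kernel_size)
    return weight + (channels if bias else 0)


# (encoder_dim, num_encoder_layers) taken from the script above.
configs = {"small": (144, 16), "medium": (256, 16), "large": (512, 17)}

extra = {}
for size, (dim, layers) in configs.items():
    # Extra parameters per layer when going from kernel size 31 to 32.
    per_layer = depthwise_conv1d_params(dim, 32) - depthwise_conv1d_params(dim, 31)
    extra[size] = per_layer * layers

for size, n in extra.items():
    print("Conformer {:>6}: +{} params ({:.4f} M)".format(size, n, n / 1e6))
```

Under these assumptions the difference is `encoder_dim * num_encoder_layers` parameters, i.e. a few thousand per model, which is far too small to show up in counts rounded to 0.01 M.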

@sooftware
Owner

This is not an official implementation, so there is a slight difference in the number of parameters.
Of course, I tried to implement it as closely as possible to the paper. :)

@sooftware
Owner

Also, num_classes affects the parameter count.
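
To get a feel for how much num_classes can matter, here is a sketch with a hypothetical layout: one embedding table of shape `(num_classes, embed_dim)` plus one output projection from `proj_in_dim` features to `num_classes`. This is an assumption for illustration only; the layers in this repo that actually depend on num_classes may be shaped differently, and `vocab_dependent_params` is not a function from the repo.

```python
def vocab_dependent_params(num_classes: int, embed_dim: int,
                           proj_in_dim: int, bias: bool = True) -> int:
    """Parameters that scale with vocabulary size in the assumed layout."""
    embedding = num_classes * embed_dim          # embedding table
    projection = proj_in_dim * num_classes       # output projection weight
    if bias:
        projection += num_classes                # output projection bias
    return embedding + projection


# Example: the "small" config from the script above (decoder_dim=320),
# assuming both the embedding and the projection input use decoder_dim.
n = vocab_dependent_params(num_classes=1000, embed_dim=320, proj_in_dim=320)
print("vocab-dependent params: {} ({:.2f} M)".format(n, n / 1e6))
```

In this hypothetical layout the vocabulary-dependent part is on the order of half a million parameters for the small model, so a different vocabulary size or a tied/untied embedding could plausibly shift the total by a visible amount.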

@maxwellzh
Author

This is kind of weird. I tested several open-source Conformer implementations (and also implemented it myself), but none of them exactly matches the reported number of parameters. Do you have any idea where the difference might come from?
Btw, num_classes is set to 1k according to the paper.
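
One way to narrow down where the difference comes from is to split the total by submodule and compare implementations block by block. The sketch below groups counts by the leading part of each parameter name. It is duck-typed so it runs without PyTorch: it only needs `(name, param)` pairs where `param` has a `numel()` method. `params_by_prefix` and `_FakeParam` are made-up names for illustration, and the demo parameter names are invented, not taken from this repo.

```python
from collections import defaultdict


def params_by_prefix(named_params, depth: int = 1) -> dict:
    """Sum parameter counts grouped by the first `depth` dotted name components."""
    totals = defaultdict(int)
    for name, param in named_params:
        # Mirror the script above: skip parameters that are frozen.
        if not getattr(param, "requires_grad", True):
            continue
        prefix = ".".join(name.split(".")[:depth])
        totals[prefix] += param.numel()
    return dict(totals)


class _FakeParam:
    """Stand-in for a tensor so the sketch runs without PyTorch."""

    def __init__(self, n: int, requires_grad: bool = True):
        self._n = n
        self.requires_grad = requires_grad

    def numel(self) -> int:
        return self._n


# Invented names, for demonstration only.
demo = [
    ("encoder.layers.0.weight", _FakeParam(10)),
    ("encoder.layers.1.weight", _FakeParam(5)),
    ("decoder.weight", _FakeParam(3)),
    ("decoder.frozen", _FakeParam(7, requires_grad=False)),
]
print(params_by_prefix(demo))
print(params_by_prefix(demo, depth=3))
```

With a real model, passing `model.named_parameters()` as `named_params` should work, since it yields `(name, parameter)` pairs. Increasing `depth` drills further into the module tree.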

@sooftware
Owner

I'm curious, too. I am only speculating that there may be details not mentioned in the paper.
