fix GemmaBackbone.get_layout_map + test #1669

Merged 5 commits on Jun 21, 2024

Commits on Jun 19, 2024

  1. 9aaa915 View commit details
  2. 2fe48e0 View commit details
  3. Also fixing forgotten ffw_gating_2 in GemmaBackbone.get_layout_map. The sharding spec ("batch", "model") is the one that provides the best training performance. ("batch", "model") and (None, None) are slower (the first one by 40%, the second by 2%).
    
    Fixing test too, including typo ffw_linearl => ffw_linear
    martin-gorner committed Jun 19, 2024
    12b222b View commit details
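
To illustrate how a layer such as ffw_gating_2 can be "forgotten" in a layout map, here is a minimal standalone sketch. It assumes that layout-map keys are regular expressions matched against variable paths with `re.search`, and that paths look like `decoder_block_0/ffw_gating_2/kernel`. The helpers `lookup_spec` and `unmatched_paths` are hypothetical names written for this example, not part of the library.

```python
import re


def lookup_spec(layout_map, variable_path):
    """Return the sharding spec of the first key matching the path, else None."""
    for pattern, spec in layout_map.items():
        if re.search(pattern, variable_path):
            return spec
    return None


def unmatched_paths(layout_map, variable_paths):
    """List the variable paths that no key in the layout map covers."""
    return [p for p in variable_paths if lookup_spec(layout_map, p) is None]


# Assumed variable paths, for illustration only.
paths = [
    "decoder_block_0/ffw_gating/kernel",
    "decoder_block_0/ffw_gating_2/kernel",
]

# Before the fix: only ffw_gating has an entry. The "." before "kernel"
# matches a single character, so this key does not cover ffw_gating_2.
before = {r"decoder_block.*ffw_gating.kernel": ("batch", "model")}

# After the fix: ffw_gating_2 gets its own entry with the same spec.
after = dict(before)
after[r"decoder_block.*ffw_gating_2.kernel"] = ("batch", "model")

print(unmatched_paths(before, paths))
print(unmatched_paths(after, paths))
```

A check like `unmatched_paths` would also flag a misspelled key such as ffw_linearl, since that key never matches a real variable path.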

Commits on Jun 20, 2024

  1. changed test_architecture_characteristics test to follow the 4->8 heads change necessary for the test to work on TPUs.
    
    Also fixed formatting.
    martin-gorner committed Jun 20, 2024
    12fc70c View commit details
  2. Update gemma_backbone_test.py

    Better test messages
    mattdangerw committed Jun 20, 2024
    2001a3d View commit details