set expand option as default for @layer #2532

Merged (6 commits) on Dec 3, 2024

CarloLucibello (Member) commented Nov 24, 2024

This introduces the new show option :noexpand for @layer and sets the default to :expand instead.

Fix #2531

@@ -27,7 +27,7 @@ export gradient
CUDADevice, AMDGPUDevice, MetalDevice, oneAPIDevice,
XLADevice,
# get_device, # we define get_device here for retrocompatibility
# gpu_backend!, # have to define here due to https://github.com/JuliaPackaging/Preferences.jl/issues/39
gpu_backend!,
Member Author

Unrelated change: with v0.15 we no longer need to define gpu_backend! ourselves; we can just re-export the one from MLDataDevices.

mcabbott (Member) left a comment

These two cases break, and probably others do too. Before @layer, there was a lot of special-case code to catch all of them.

julia> LayerNorm(10)
LayerNorm(10)       # 20 parameters

julia> MultiHeadAttention(64 => 1024 => 1024, nheads = 8)
MultiHeadAttention(64 => 1024 => 1024; nheads=8)  # 1_245_184 parameters

This PR:

julia> LayerNorm(10)
LayerNorm(
  identity,
  Scale(10),                            # 20 parameters
  1.0f-5,
  10,
  true,
) 

codecov bot commented Nov 24, 2024

Codecov Report

Attention: Patch coverage is 25.00000% with 3 lines in your changes missing coverage. Please review.

Project coverage is 33.56%. Comparing base (e2b3f06) to head (a0b083a).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/layers/macro.jl | 25.00% | 3 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2532      +/-   ##
==========================================
+ Coverage   33.54%   33.56%   +0.01%     
==========================================
  Files          31       31              
  Lines        1881     1871      -10     
==========================================
- Hits          631      628       -3     
+ Misses       1250     1243       -7     

CarloLucibello changed the title from "remove expand option (everything is expanded)" to "set expand option as default for @layer" on Nov 26, 2024
CarloLucibello (Member, Author) commented

OK, now we have a :noexpand option, and :expand becomes the default. I removed 21 uses of :expand from the codebase (src + tests) and added 2 uses of :noexpand (LayerNorm and MultiHeadAttention).

I'm not sure that LayerNorm and MultiHeadAttention are the only layers that need :noexpand; I will do more testing.
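
For readers following along, here is a minimal sketch of how the two options might be used on custom types, assuming a Flux release that includes this PR. The types `Block` and `Wrapper` and their field names are hypothetical and exist only for illustration; the exact printed output is not shown because it depends on the Flux version.

```julia
using Flux

# Hypothetical container type. With this PR, a plain `@layer` call is assumed
# to behave like `@layer :expand`, so child layers are printed one per line.
struct Block{A, B}
    encoder::A
    head::B
end
(b::Block)(x) = b.head(b.encoder(x))
Flux.@layer Block

# Hypothetical wrapper that contains a layer but should keep a compact,
# non-expanded representation, as is done for LayerNorm in this PR.
struct Wrapper{L}
    inner::L
end
(w::Wrapper)(x) = w.inner(x)
Flux.@layer :noexpand Wrapper

block = Block(Dense(2 => 3, relu), Dense(3 => 1))
wrapped = Wrapper(Dense(2 => 3))

display(block)    # children are expanded
display(wrapped)  # children are not expanded
```

Types that previously relied on the old default would need an explicit `:noexpand` to keep their compact display.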

CarloLucibello merged commit e2f58a8 into master on Dec 3, 2024
6 of 9 checks passed
Successfully merging this pull request may close these issues.

Consider making the :expand option the default in @layer