Add option to unfuse Wqkv #1367

Merged
merged 13 commits into from
Jul 17, 2024

@snarayan21 (Contributor) commented Jul 16, 2024

Adds an option to unfuse Wqkv. This gives us the flexibility to support other model architectures that, unlike ours, don't fuse the Q, K, and V projections into a single Wqkv.

Run with fused Wqkv (main): fused-wqkv-orig-n7eky6
Run with fused Wqkv (this branch): fused-wqkv-new-ggXoVL
Run without fused Wqkv: unfused-wqkv-new-a2Qo2r

Same loss curves:
Screenshot 2024-07-17 at 11 39 08 AM

Same throughput for fused Wqkv before and after, slightly lower throughput for unfused Wqkv (as expected):
Screenshot 2024-07-17 at 11 38 42 AM
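
For readers unfamiliar with the terminology, here is a minimal sketch of why the fused and unfused projections are expected to give the same results. It is not the code from this PR: it uses plain Python lists instead of the actual attention layers so that it runs without dependencies, and every name in it (`matmul`, `split_columns`, `qkv_fused`, `qkv_unfused`, the toy sizes) is hypothetical. It assumes that all three projections have width `d_model` and that the fused weight is the column-wise concatenation of the three separate weights.

```python
import random


def matmul(x, w):
    """Multiply x (n x d_in) by w (d_in x d_out); both are lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]


def split_columns(m, sizes):
    """Split every row of m into consecutive chunks with the given widths."""
    chunks, start = [], 0
    for size in sizes:
        chunks.append([row[start:start + size] for row in m])
        start += size
    return chunks


def qkv_fused(x, w_qkv, sizes):
    """One large projection, then a split into Q, K, and V."""
    return split_columns(matmul(x, w_qkv), sizes)


def qkv_unfused(x, w_q, w_k, w_v):
    """Three separate, smaller projections."""
    return [matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)]


def rand_matrix(rows, cols):
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
n_tokens, d_model = 3, 4  # toy sizes, chosen only for illustration

x = rand_matrix(n_tokens, d_model)
w_q, w_k, w_v = (rand_matrix(d_model, d_model) for _ in range(3))

# Assumption: the fused weight is [W_q | W_k | W_v] concatenated along columns.
w_qkv = [rq + rk + rv for rq, rk, rv in zip(w_q, w_k, w_v)]

fused = qkv_fused(x, w_qkv, [d_model] * 3)
unfused = qkv_unfused(x, w_q, w_k, w_v)

# Largest elementwise difference between the two code paths.
max_diff = max(
    abs(a - b)
    for f, u in zip(fused, unfused)
    for f_row, u_row in zip(f, u)
    for a, b in zip(f_row, u_row)
)
print(f"max |fused - unfused| = {max_diff}")
```

Under these assumptions the two paths compute the same Q, K, and V, which is consistent with the matching loss curves above. The unfused path issues three smaller projections instead of one larger one, which is one plausible reason for the slightly lower throughput.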

@snarayan21 snarayan21 requested a review from a team as a code owner July 16, 2024 20:44
@dakinggg (Collaborator) left a comment

please add some unit tests too

llmfoundry/models/layers/attention.py: 2 outdated review threads, resolved
@dakinggg (Collaborator) left a comment

lgtm, would like @vchiley or @ShashankMosaicML to take a look before approving

llmfoundry/models/layers/attention.py: 1 outdated review thread, resolved

@snarayan21 (Contributor, Author) commented

Added tests for the fuse splits to the unit test, @dakinggg.

@snarayan21 snarayan21 requested a review from dakinggg July 17, 2024 20:06
@ShashankMosaicML (Contributor) left a comment

lgtm.

@snarayan21 snarayan21 enabled auto-merge (squash) July 17, 2024 20:12
@snarayan21 snarayan21 disabled auto-merge July 17, 2024 20:12
@snarayan21 snarayan21 enabled auto-merge (squash) July 17, 2024 20:12
@snarayan21 snarayan21 merged commit 221d252 into main Jul 17, 2024
9 checks passed
@dakinggg dakinggg deleted the saaketh/unfuse_wqkv branch August 6, 2024 18:40
4 participants