fix for #114 -- incorrect shape used in preload #115

Merged on Aug 30, 2024 (1 commit)

Conversation

davidkoski
Collaborator

No description provided.

@davidkoski davidkoski requested a review from awni August 30, 2024 02:37
@davidkoski davidkoski self-assigned this Aug 30, 2024
@@ -179,7 +179,7 @@ public struct TokenIterator: Sequence, IteratorProtocol {
         // prepare the prompt in chunks if larger than the prefill size
         while y.size > parameters.prefillStepSize {
             _ = model(
-                y[..<parameters.prefillStepSize, .newAxis], cache: cache.isEmpty ? nil : cache)
+                y[.newAxis, ..<parameters.prefillStepSize], cache: cache.isEmpty ? nil : cache)
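
For readers unfamiliar with the loop in this hunk, here is a minimal sketch of the chunking it performs, written in plain Python with lists standing in for token arrays. The helper name `prefill_chunks` is hypothetical, and the step that drops the processed tokens after each chunk is an assumption, since that part of the loop body is not visible in the hunk.

```python
def prefill_chunks(prompt, prefill_step_size):
    """Peel off full-size chunks while the remainder exceeds the step size.

    Returns (chunks, remainder). Assumption: after each chunk is fed to the
    model, the loop discards the tokens it just processed.
    """
    chunks = []
    y = list(prompt)
    while len(y) > prefill_step_size:
        chunks.append(y[:prefill_step_size])  # this slice is what the model sees
        y = y[prefill_step_size:]             # assumed: drop processed tokens
    return chunks, y


# A prompt shorter than the step size never enters the loop.
short_chunks, short_rest = prefill_chunks(range(100), 512)
print(len(short_chunks), len(short_rest))  # 0 100

# A longer prompt is split into full chunks plus a remainder.
long_chunks, long_rest = prefill_chunks(range(1200), 512)
print([len(c) for c in long_chunks], len(long_rest))  # [512, 512] 176
```

Under these assumptions, the changed line only runs when the prompt is longer than the step size.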
Collaborator Author

This code from Python was translated incorrectly:

    model(y[:prefill_step_size][None], cache=cache)

The Swift subscript should be:

    y[.newAxis, ..<parameters.prefillStepSize]

This only hits the preload case (> 512 tokens). Curiously, it seems to work fine with smaller preloads: I was testing with a 32-token parameters.prefillStepSize and didn't encounter this.

Anyway, this was a bad problem because the input to attention had its first two axes swapped:

python: (1, 512, 3072)
swift (before this fix): [512, 1, 3072]
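
The axis swap can be reproduced outside the model. The sketch below uses NumPy as a stand-in for the array library, assuming NumPy-style indexing semantics, which the quoted Python line relies on. The vocabulary size and hidden size are made-up small values, and the zero-filled `embedding` table is a hypothetical placeholder for the model's embedding layer.

```python
import numpy as np

prefill_step_size = 512
y = np.arange(1000)  # stand-in for a 1-D array of prompt token ids

# The Python original: slice first, then add a leading batch axis.
correct = y[:prefill_step_size][None]
# Analogue of the mistranslated subscript: the new axis lands after the slice.
swapped = y[:prefill_step_size, None]

print(correct.shape)  # (1, 512)
print(swapped.shape)  # (512, 1)

# Hypothetical embedding lookup: (vocab, hidden) table indexed by token ids.
hidden = 8
embedding = np.zeros((1000, hidden))
print(embedding[correct].shape)  # (1, 512, 8)
print(embedding[swapped].shape)  # (512, 1, 8)
```

With the new axis in the wrong position, the model sees 512 sequences of length 1 rather than one sequence of length 512, which matches the shapes reported above.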

Member

@awni awni left a comment

Thanks for the quick fix!

@davidkoski davidkoski merged commit 9266a62 into main Aug 30, 2024
1 check passed
@davidkoski davidkoski deleted the kvcache2 branch August 30, 2024 04:24