[Model] Add gemma model support. #259

Merged
merged 3 commits into from
Apr 10, 2024
marvin-Yu
Contributor

@marvin-Yu marvin-Yu commented Mar 4, 2024

TODO list:

  • [xDNN] Add GELU activation (ACT) support.
  • [mlp] Add GELU activation (ACT) support.
  • [mlp] Fix the catup/gate GEMM bug.

@marvin-Yu marvin-Yu force-pushed the gemma_model branch 5 times, most recently from 130cfbe to 7fcea4b Compare March 6, 2024 00:38
@marvin-Yu marvin-Yu force-pushed the gemma_model branch 2 times, most recently from 4aca9cf to 8adcda6 Compare April 8, 2024 05:30
@marvin-Yu marvin-Yu changed the title [model] add gemma fp16 support. [model] add gemma model support. Apr 8, 2024
@marvin-Yu marvin-Yu force-pushed the gemma_model branch 2 times, most recently from b8ed43b to ad79da3 Compare April 9, 2024 01:06
@marvin-Yu marvin-Yu marked this pull request as ready for review April 9, 2024 01:41
@marvin-Yu marvin-Yu force-pushed the gemma_model branch 3 times, most recently from b175049 to 6ce1221 Compare April 10, 2024 01:00
@marvin-Yu marvin-Yu requested review from pujiang2018, changqi1 and abenmao and removed request for pujiang2018 April 10, 2024 01:00
@Duyi-Wang Duyi-Wang changed the title [model] add gemma model support. [Model] Add gemma model support. Apr 10, 2024
@marvin-Yu marvin-Yu force-pushed the gemma_model branch 2 times, most recently from 55cbf57 to c12408c Compare April 10, 2024 02:00
@abenmao
Contributor

abenmao commented Apr 10, 2024

Add support in example.cpp please.

@marvin-Yu
Contributor Author

Add support in example.cpp please.

Gemma support for example.cpp will be added in a separate PR rather than in this one.

@@ -218,6 +218,7 @@ class Attention {
bool useSelfAttn, bool doLnBefore, int *positionIds = nullptr) {

auto hiddenSize = ctx->hiddenSize;
auto attSize = ctx->attHeadNum * ctx->attHeadSize;
Contributor

If 'attSize' is not used, let's remove it.

Contributor Author

Done

ctx->mmHelper->compute_silu(false, M, N, K, 1.0f, A, lda, B, scaleB, zeroB, sumB, 0.0f, C, ldc);
if (ctx->actType == DecoderContext::SILU) {
ctx->mmHelper->compute_silu(false, M, N, K, 1.0f, A, lda, B, scaleB, zeroB, sumB, 0.0f, C, ldc);
} else if (ctx->actType == DecoderContext::SWIGLU) { // chatglm2/3
Contributor

Let's use the original path.

Contributor Author

Done

// compute gelu on the left half and then add it with the right half
template <typename T1, typename T2>
static void geluSum(hpj::Matrix<T1> &src, hpj::Matrix<T2> &dst) {
__m512 c1 = _mm512_set1_ps(0.044715f);
Contributor

Make this const.

Contributor Author

Done
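
For readers following the geluSum thread: the constant 0.044715 in the snippet is the cubic coefficient of the widely used tanh approximation of GELU. The scalar sketch below shows that approximation and the "left half / right half" row layout described in the code comment. It is only a reference sketch, not the AVX-512 code from this PR. The names geluTanh and geluCombineRow are hypothetical, and the combine step is left to the caller because the excerpt shown in the review does not include the line that merges the two halves.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Scalar tanh approximation of GELU:
//   gelu(x) ~= 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
inline float geluTanh(float x) {
    const float c1 = 0.044715f;      // cubic coefficient, same value as in the diff
    const float c2 = 0.7978845608f;  // sqrt(2 / pi)
    return 0.5f * x * (1.0f + std::tanh(c2 * (x + c1 * x * x * x)));
}

// Hypothetical helper (not from the PR): treats `row` as 2*n values, applies
// GELU to the left half and combines each result with the matching element of
// the right half using the caller-supplied binary operation.
template <typename Combine>
std::vector<float> geluCombineRow(const std::vector<float> &row, Combine combine) {
    const std::size_t n = row.size() / 2;
    std::vector<float> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = combine(geluTanh(row[i]), row[n + i]);
    }
    return out;
}
```

A scalar version like this can serve as a ground truth when unit-testing a vectorized kernel: run both on the same input and compare element-wise within a small tolerance.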

@changqi1
Contributor

Please wait until the SiLU API has been checked.

@marvin-Yu marvin-Yu merged commit cff1f79 into main Apr 10, 2024
1 check passed
@Duyi-Wang Duyi-Wang deleted the gemma_model branch April 23, 2024 06:59
4 participants