[Layer] Fine grained data type definition for Attention and MLP #194

pujiang2018 · 2024-01-23T06:40:40Z

Main modifications:

Make xFT support arbitrary data types for the input, output, and intermediate buffers by adding more template parameters to Attention and MLP.
Define the Llama model's data types through the type selector.
Support more data types in some utility functions.
Add BF16 communication.
Some minor refactoring (e.g., in CommonDecoder).

(Please note: this PR aims to make the pure BF16 flow work; performance is not optimized yet.)
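
To illustrate the idea of separate template parameters for the input, intermediate, and output buffers, here is a minimal, self-contained sketch. It is not the xFT code: `Bf16`, `DemoTypeSelector`, `DemoMlp`, and `DemoMlpFor` are hypothetical names made up for this example, and the real parameter names and order in Attention and MLP may differ. The toy layer only multiplies by a weight and adds 1, so the type plumbing is the only point of interest.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for a bfloat16 type (NOT xFT's own bfloat16 class).
struct Bf16 {
    std::uint16_t bits = 0;
    Bf16() = default;
    explicit Bf16(float f) {
        std::uint32_t u;
        std::memcpy(&u, &f, sizeof(u));
        // Round to nearest even; NaN handling is omitted for brevity.
        u += 0x7FFFu + ((u >> 16) & 1u);
        bits = static_cast<std::uint16_t>(u >> 16);
    }
    explicit operator float() const {
        std::uint32_t u = static_cast<std::uint32_t>(bits) << 16;
        float f;
        std::memcpy(&f, &u, sizeof(f));
        return f;
    }
};

// Hypothetical selector: maps a weight type to the buffer types of a layer.
// The default keeps every buffer in float.
template <typename WeiT>
struct DemoTypeSelector {
    using InType = float;
    using ImType = float;
    using OutType = float;
};

// Assumed "pure BF16 flow": every buffer is BF16 when the weights are BF16.
template <>
struct DemoTypeSelector<Bf16> {
    using InType = Bf16;
    using ImType = Bf16;
    using OutType = Bf16;
};

// Toy layer whose input, intermediate, and output types are independent.
template <typename WeiT, typename InT = float, typename ImT = float, typename OutT = float>
class DemoMlp {
public:
    explicit DemoMlp(std::vector<WeiT> scale) : scale_(std::move(scale)) {}

    std::vector<OutT> forward(const std::vector<InT> &x) const {
        assert(x.size() == scale_.size());
        // Intermediate buffer is stored in ImT; arithmetic is done in float.
        std::vector<ImT> im(x.size());
        for (std::size_t i = 0; i < x.size(); ++i) {
            im[i] = static_cast<ImT>(static_cast<float>(x[i]) * static_cast<float>(scale_[i]));
        }
        // Output buffer is stored in OutT.
        std::vector<OutT> out(x.size());
        for (std::size_t i = 0; i < x.size(); ++i) {
            out[i] = static_cast<OutT>(static_cast<float>(im[i]) + 1.0f);
        }
        return out;
    }

private:
    std::vector<WeiT> scale_;
};

// Pick the buffer types from the selector, given only the weight type.
template <typename WeiT>
using DemoMlpFor = DemoMlp<WeiT,
        typename DemoTypeSelector<WeiT>::InType,
        typename DemoTypeSelector<WeiT>::ImType,
        typename DemoTypeSelector<WeiT>::OutType>;
```

With this layout, `DemoMlpFor<Bf16>` keeps every buffer in the 2-byte type while `DemoMlpFor<float>` stays in float, and a mixed configuration such as `DemoMlp<Bf16, float, float, Bf16>` can still be spelled out by hand.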

changqi1 · 2024-01-27T03:38:13Z

src/kernels/attention_kernels.h

@@ -0,0 +1,162 @@
+#pragma once


Add License.

changqi1 · 2024-01-27T03:38:25Z

src/kernels/attention_kernels.cpp

-#include "amx_sgemm_bf16bf16bf16.h"
-#include "bfloat16.h"
-#include "copy_util.h"
+#include "attention_kernels.h"


Add License.

changqi1 · 2024-01-27T03:43:02Z

src/utils/intrinsics_util.h

@@ -0,0 +1,67 @@
+#pragma once


Add License.

changqi1 · 2024-01-27T03:43:51Z

src/utils/type_selector.h

@@ -7,13 +7,15 @@ struct TypeSelector {
    using InType = float;


Add License.

changqi1 · 2024-01-27T03:44:33Z

src/utils/copy_util.h

@@ -3,43 +3,29 @@
 #include <cstdio>


Add License.

changqi1 · 2024-01-27T03:46:04Z

@pujiang2018

pujiang2018 · 2024-01-30T05:36:13Z

Added the license header.

changqi1 · 2024-01-30T08:06:44Z

@pujiang2018 There is a merge conflict.

Fine grained data type definition for Attention and MLP

eb87363

pujiang2018 requested a review from changqi1 January 23, 2024 06:41

changqi1 approved these changes Jan 27, 2024

changqi1 changed the title ~~Fine grained data type definition for Attention and MLP~~ [Layer] Fine grained data type definition for Attention and MLP Jan 27, 2024

Add license header

138d449

pujiang2018 added 2 commits January 30, 2024 03:52

remove thread restriction; align KV cache to 1024

79b4ccf

Merge branch 'main' into pujiang/feature/fine_grained_dt

5a4ed08

changqi1 merged commit 08527aa into main Jan 30, 2024
1 check passed
