mergekit_with_sparsify #9561

Mangodadada · 2024-12-03T11:54:11Z

PR types

New features

PR changes

Others

Description

add merge kit

paddle-bot · 2024-12-03T11:54:17Z

Thanks for your contribution!

codecov · 2024-12-04T08:40:26Z

Codecov Report

Attention: Patch coverage is 81.93833% with 82 lines in your changes missing coverage. Please review.

Project coverage is 52.87%. Comparing base (d455181) to head (9555b1b).
Report is 65 commits behind head on develop.

Files with missing lines	Patch %	Lines
paddlenlp/mergekit/merge_model.py	71.68%	62 Missing ⚠️
paddlenlp/mergekit/merge_method.py	90.00%	7 Missing ⚠️
paddlenlp/mergekit/merge_config.py	94.25%	5 Missing ⚠️
paddlenlp/mergekit/sparsify_method.py	91.22%	5 Missing ⚠️
paddlenlp/mergekit/merge_utils.py	80.00%	3 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #9561      +/-   ##
===========================================
- Coverage    53.19%   52.87%   -0.32%     
===========================================
  Files          700      716      +16     
  Lines       110757   111685     +928     
===========================================
+ Hits         58921    59058     +137     
- Misses       51836    52627     +791

lugimzzz · 2024-12-04T08:58:27Z

llm/tools/run_model_weight_merge.py

+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+import argparse


The script documentation and API documentation need to be added in the next PR.

lugimzzz · 2024-12-04T09:28:40Z

paddlenlp/mergekit/merge_config.py

+    dot_threshold: float = field(
+        default=0.99, metadata={"help": "Threshold for considering the two vectors as colinear.(Used in slerp)"}
+    )
+    scaling: bool = field(default=False, metadata={"help": "Whether to scale the weights."})


It is not very intuitive what these parameters represent. Consider renaming them or explaining them more clearly in the help text.

lugimzzz · 2024-12-04T09:37:56Z

paddlenlp/mergekit/merge_sparsify.py

+import numpy as np
+
+
+class SparsificationMethod:


There is no need to return the mask; just return the sparsified tensor directly.
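A minimal sketch of the interface this comment asks for, where the method returns only the sparsified tensor and the mask stays internal. Magnitude-based pruning, the function name `magnitude_sparsify`, and the `density` parameter are assumptions chosen for illustration; the PR's actual sparsification methods are not shown in this thread.

```python
import numpy as np


def magnitude_sparsify(tensor, density):
    """Keep the largest-magnitude `density` fraction of entries, zero the rest.

    Hypothetical helper for illustration; not the PR's implementation.
    """
    if not 0.0 < density <= 1.0:
        raise ValueError("density must be in (0, 1]")
    if tensor.size == 0:
        return tensor.copy()
    k = int(np.ceil(density * tensor.size))
    flat = np.abs(tensor).ravel()
    # Indices of the k largest-magnitude entries.
    keep = np.argpartition(flat, flat.size - k)[flat.size - k:]
    mask = np.zeros(flat.size, dtype=bool)
    mask[keep] = True
    # The mask is an internal detail; callers only see the sparsified tensor.
    return np.where(mask.reshape(tensor.shape), tensor, 0).astype(tensor.dtype)
```

Callers then write `v = magnitude_sparsify(v, density=0.5)` and never handle a mask.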

lugimzzz · 2024-12-04T09:40:35Z

paddlenlp/mergekit/merge_linear.py

+        """
+        Linear interpolation between two values.
+        """
+        if sparsify_method is not None:


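Write it as:

def merge_op(self, v_0, v_1, sparsify_method=None):
    # Sparsify both inputs first when a sparsify method is supplied.
    if sparsify_method is not None:
        v_0 = sparsify_method.sparsify(v_0)
        v_1 = sparsify_method.sparsify(v_1)
    ratio = self.merge_config.linear_ratio
    # Linear interpolation: (1 - ratio) * v_0 + ratio * v_1.
    v_merge = (1 - ratio) * v_0 + ratio * v_1
    return v_merge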
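A quick numeric check of the interpolation, as a sketch. The weight on `v_0` is meant to be `(1 - linear_ratio)`; without parentheses Python evaluates `1 - linear_ratio * v_0`, which is a different expression. The standalone function `linear_merge` and the sample values are illustrative only.

```python
import numpy as np


def linear_merge(v_0, v_1, ratio):
    # Weighted average of the two tensors; ratio is the weight on v_1.
    return (1 - ratio) * v_0 + ratio * v_1


v_0 = np.array([2.0, 4.0])
v_1 = np.array([6.0, 8.0])
ratio = 0.25

merged = linear_merge(v_0, v_1, ratio)           # [3., 5.]
unparenthesized = 1 - ratio * v_0 + ratio * v_1  # [2., 2.], not an interpolation
```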

lugimzzz · 2024-12-04T09:42:54Z

paddlenlp/mergekit/merge_slerp.py

+    def __init__(self, merge_config):
+        self.merge_config = merge_config
+
+    def merge_op(self, v0, v1, eps=float(1e-8), dot_threshold=None, sparsify_method=None):


Can eps and dot_threshold just be taken directly from merge_config?
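A sketch of what reading both values from the config could look like, which also shows what `dot_threshold` controls: when the normalized tensors are nearly colinear, spherical interpolation is numerically unstable, so the code falls back to linear interpolation. `SlerpConfig` and its `ratio` and `eps` fields are hypothetical stand-ins for the PR's merge_config, and this is a generic slerp rather than the PR's exact code.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class SlerpConfig:
    # Hypothetical stand-in for merge_config; field names are assumptions.
    ratio: float = 0.5            # interpolation position between v0 and v1
    dot_threshold: float = 0.99   # above this |cosine|, treat vectors as colinear
    eps: float = 1e-8             # guards against division by zero when normalizing


def slerp(v0, v1, cfg):
    # Normalize copies to measure the angle between the two tensors.
    n0 = v0 / (np.linalg.norm(v0) + cfg.eps)
    n1 = v1 / (np.linalg.norm(v1) + cfg.eps)
    dot = float(np.sum(n0 * n1))
    t = cfg.ratio
    # Nearly colinear: sin(theta) is close to 0, so use linear interpolation.
    if abs(dot) > cfg.dot_threshold:
        return (1 - t) * v0 + t * v1
    theta = np.arccos(np.clip(dot, -1.0, 1.0))
    sin_theta = np.sin(theta)
    s0 = np.sin((1 - t) * theta) / sin_theta
    s1 = np.sin(t * theta) / sin_theta
    return s0 * v0 + s1 * v1
```

With this shape, `merge_op` needs no `eps` or `dot_threshold` arguments of its own.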

lugimzzz · 2024-12-05T03:11:07Z

paddlenlp/mergekit/merge_linear.py

+        Linear interpolation between two values.
+        """
+        if self.merge_config.sparsify_type is not None:
+            sparsify = SparsificationMethod(self.merge_config)


Do not initialize it here.
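One possible reading of this comment, sketched with hypothetical classes: build the sparsifier once and hand it to the merge method, so `merge_op` does not construct a new object for every tensor. `Sparsifier`, its `threshold` parameter, and `LinearMerge` are illustrative names, not the PR's classes.

```python
import numpy as np


class Sparsifier:
    """Hypothetical stand-in for the PR's sparsify class."""

    instances = 0  # counts constructions, only to make the point visible

    def __init__(self, threshold):
        Sparsifier.instances += 1
        self.threshold = threshold

    def sparsify(self, tensor):
        # Zero out entries whose magnitude is below the threshold.
        return np.where(np.abs(tensor) >= self.threshold, tensor, 0.0)


class LinearMerge:
    def __init__(self, ratio, sparsifier=None):
        self.ratio = ratio
        self.sparsifier = sparsifier  # created once by the caller

    def merge_op(self, v_0, v_1):
        if self.sparsifier is not None:
            v_0 = self.sparsifier.sparsify(v_0)
            v_1 = self.sparsifier.sparsify(v_1)
        return (1 - self.ratio) * v_0 + self.ratio * v_1
```

However many tensors pass through `merge_op`, the sparsifier is constructed a single time.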

lugimzzz · 2024-12-05T03:11:36Z

paddlenlp/mergekit/merge_slerp.py

+        """
+        if dot_threshold is None:
+            dot_threshold = self.merge_config.dot_threshold
+        if self.merge_config.sparsify_type is not None:


lugimzzz · 2024-12-05T03:16:32Z

paddlenlp/mergekit/merge_model.py

+
+class MergeModel:
+    def __init__(self, merge_config):
+        self.merge_config = merge_config


Why are there fewer methods here now?

lugimzzz · 2024-12-05T03:17:34Z

paddlenlp/mergekit/merge_model.py

+
+    def merge_model(self, model_path0, model_path1, output_path, base_path=None):
+        is_safetensor0 = self.check_model_path(model_path0)
+        is_safetensor1 = self.check_model_path(model_path1)


base_path also needs to be checked here.
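A sketch of validating every input path, including the optional `base_path`, before any weights are loaded. The weight file names and the helper `check_all_paths` are assumptions for illustration; only the name `check_model_path` and its safetensors/pdparams distinction come from the diff.

```python
import os

# Assumed weight file names; the PR's real constants may differ.
SAFE_WEIGHT = "model.safetensors"
PDPARAMS_WEIGHT = "model_state.pdparams"


def check_model_path(path):
    """Return True for safetensors weights, False for pdparams weights."""
    if os.path.isfile(os.path.join(path, SAFE_WEIGHT)):
        return True
    if os.path.isfile(os.path.join(path, PDPARAMS_WEIGHT)):
        return False
    # Anything else (missing or unsupported layout) is rejected up front.
    raise ValueError(f"No supported weight file found in {path!r}")


def check_all_paths(model_paths, base_path=None):
    """Check each model path, plus base_path when one is given."""
    paths = list(model_paths)
    if base_path is not None:
        paths.append(base_path)
    return {p: check_model_path(p) for p in paths}
```

Failing early here means an invalid `base_path` is reported before any merge work starts.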

lugimzzz · 2024-12-05T03:25:33Z

paddlenlp/mergekit/merge_model.py

+            raise ValueError("Weights total_size mismatch. " "Please make sure you load the correct weight file")
+        if index0["weight_map"].keys() != index1["weight_map"].keys():
+            raise ValueError("Weights weight_map mismatch. Please make sure you load the correct weight file")
+        if self.merge_config.merge_type in {"ties", "dare", "della", "dare_ties", "della_linear"}:


What is the purpose of adding the merge_type check here?

CLAassistant · 2024-12-05T10:38:19Z

All committers have signed the CLA.

wawltor · 2024-12-16T03:13:36Z

llm/tools/merge_weight.py

@@ -0,0 +1,35 @@
+# Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.
+#


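Could an example config file be provided here?

Since this is fairly generic, writing a dedicated JSON file for each config does not seem necessary. As with LoRA merge, giving a run command is enough:
python ./tools/merge_weight.py. The corresponding documentation will follow in the next PR.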

wawltor · 2024-12-16T03:21:53Z

paddlenlp/mergekit/sparsify_method.py

+import numpy as np
+
+
+class SparsifyMethod:


Do these sparsification methods only run on CPU?

For now this PR only supports NumPy; a version based on Paddle tensors will be developed later.

wawltor · 2024-12-16T03:25:43Z

paddlenlp/mergekit/merge_method.py

+        """
+        # init weight
+        weight_list = self.merge_config.weight_list
+        if self.merge_config.normalize:


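Would it be better to enable this normalize option by default?

In terms of results, is True or False the recommended setting?

Changed to True; True is the recommended setting.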
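A small illustration of why the default matters, assuming `normalize` means rescaling `weight_list` so that it sums to 1 (the thread does not spell this out). The helper names are hypothetical.

```python
import numpy as np


def normalize_weights(weight_list):
    # Rescale the merge weights so they sum to 1.
    weights = np.asarray(weight_list, dtype=np.float64)
    total = weights.sum()
    if total == 0:
        raise ValueError("weight_list must not sum to zero")
    return weights / total


def weighted_merge(tensors, weight_list, normalize=True):
    if normalize:
        weights = normalize_weights(weight_list)
    else:
        weights = np.asarray(weight_list, dtype=np.float64)
    # Weighted sum of the input tensors.
    return sum(w * t for w, t in zip(weights, tensors))
```

With weights `[1, 1]`, the normalized merge is an average of the inputs, while the unnormalized merge is their sum and so changes the overall scale of the merged weights.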

wawltor · 2024-12-16T07:01:22Z

paddlenlp/mergekit/merge_model.py

+            with fast_safe_open(os.path.join(model_path, self.safe_weight_name()), framework="numpy") as f:
+                for k in f.keys():
+                    state_dict[k] = f.get_tensor(k)
+        elif file_type == "pdparams":


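Does storage in TP (tensor parallel) format need to be considered here? If TP-format storage is not supported, an error should be raised.

Not supported for now; check_model_path checks the storage type of the model weights first.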

wawltor

LGTM

mergekit_with_sparsify

6f41a4b

mergekit_with_sparsify

7c34b54

Mangodadada added 2 commits December 4, 2024 16:49

Parameter upload changes

4e5688e

change sparsify style

a8cd5e0

lugimzzz reviewed Dec 5, 2024

test

b07b220

add merge

d534a72

Mangodadada force-pushed the mergekit12.3 branch from fce8d97 to d534a72 Compare December 5, 2024 12:24

lugimzzz added 12 commits December 5, 2024 20:40

add merge

e2d2667

add merge

ef21f08

add merge

70b1100

add test

cbe4cd2

add tensor

b01a91f

add merge

ef1b28e

add del

8c74bdb

add merge

61c870f

add merge

2aeafa2

del base

3bd89e1

del base

17e4169

alpha

b7acb2c

wawltor reviewed Dec 16, 2024

update follow comments

9555b1b

wawltor approved these changes Dec 18, 2024

wawltor merged commit fa0febc into PaddlePaddle:develop Dec 18, 2024
10 of 12 checks passed
