Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mergekit_with_sparsify #9561

Merged
merged 19 commits into from
Dec 18, 2024
Merged

mergekit_with_sparsify #9561

merged 19 commits into from
Dec 18, 2024

Conversation

Mangodadada
Copy link
Contributor

PR types

New features

PR changes

Others

Description

add merge kit

Copy link

paddle-bot bot commented Dec 3, 2024

Thanks for your contribution!

Copy link

codecov bot commented Dec 4, 2024

Codecov Report

Attention: Patch coverage is 81.93833% with 82 lines in your changes missing coverage. Please review.

Project coverage is 52.87%. Comparing base (d455181) to head (9555b1b).
Report is 65 commits behind head on develop.

Files with missing lines Patch % Lines
paddlenlp/mergekit/merge_model.py 71.68% 62 Missing ⚠️
paddlenlp/mergekit/merge_method.py 90.00% 7 Missing ⚠️
paddlenlp/mergekit/merge_config.py 94.25% 5 Missing ⚠️
paddlenlp/mergekit/sparsify_method.py 91.22% 5 Missing ⚠️
paddlenlp/mergekit/merge_utils.py 80.00% 3 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #9561      +/-   ##
===========================================
- Coverage    53.19%   52.87%   -0.32%     
===========================================
  Files          700      716      +16     
  Lines       110757   111685     +928     
===========================================
+ Hits         58921    59058     +137     
- Misses       51836    52627     +791     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import argparse
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

需要在下个PR新增脚本文档和API文档

dot_threshold: float = field(
default=0.99, metadata={"help": "Threshold for considering the two vectors as colinear.(Used in slerp)"}
)
scaling: bool = field(default=False, metadata={"help": "Whether to scale the weights."})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这些参数代表的什么不是很直观,考虑修改一下命名或者是在注释写的更清楚一些

import numpy as np


class SparsificationMethod:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mask没必要传出来,直接传一个稀疏化后的tensor即可

"""
Linear interpolation between two values.
"""
if sparsify_method is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

写成:

def merge_op(self, v_0, v_1, sparsify_method=None):
  v_0 = sparsify_method.sparsify(v_0)
  v_1 = sparsify_method.sparsify(v_1)
  v_merge = 1 - self.merge_config.linear_ratio * v_0 + self.merge_config.linear_ratio * v_1
  return v_merge

def __init__(self, merge_config):
self.merge_config = merge_config

def merge_op(self, v0, v1, eps=float(1e-8), dot_threshold=None, sparsify_method=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eps和dot_threshold是不是直接用merge_config的就行

Linear interpolation between two values.
"""
if self.merge_config.sparsify_type is not None:
sparsify = SparsificationMethod(self.merge_config)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

不要在这里初始化

"""
if dot_threshold is None:
dot_threshold = self.merge_config.dot_threshold
if self.merge_config.sparsify_type is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

同上


class MergeModel:
def __init__(self, merge_config):
self.merge_config = merge_config
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里的方法怎么变少了


def merge_model(self, model_path0, model_path1, output_path, base_path=None):
is_safetensor0 = self.check_model_path(model_path0)
is_safetensor1 = self.check_model_path(model_path1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里也要check base_path

raise ValueError("Weights total_size mismatch. " "Please make sure you load the correct weight file")
if index0["weight_map"].keys() != index1["weight_map"].keys():
raise ValueError("Weights weight_map mismatch. Please make sure you load the correct weight file")
if self.merge_config.merge_type in {"ties", "dare", "della", "dare_ties", "della_linear"}:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里加个merge_type判断有什么作用

@CLAassistant
Copy link

CLAassistant commented Dec 5, 2024

CLA assistant check
All committers have signed the CLA.

@@ -0,0 +1,35 @@
# Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.
#
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里提供一个示例的config文件?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

因为比较通用,感觉专门每个config写一个json必要性不大。类似lora merge直接给一个运行命令就行
python ./tools/merge_weight.py ,这块后续下一个pr会有相应的文档

import numpy as np


class SparsifyMethod:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这些稀疏化的方式只能在cpu上操作是吗?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

目前这个pr只支持numpy,后续会开发基于paddle tensor的版本

"""
# init weight
weight_list = self.merge_config.weight_list
if self.merge_config.normalize:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这种normalize是不是默认打开比较好?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

效果侧是建议设置True还是False?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

改为True,建议为True比较好

with fast_safe_open(os.path.join(model_path, self.safe_weight_name()), framework="numpy") as f:
for k in f.keys():
state_dict[k] = f.get_tensor(k)
elif file_type == "pdparams":
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里是否需要考虑tp格式的存储了?如果不支持tp格式的存储需要抛出报错或者报错?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

暂时不支持,在check_model_path会先检查模型权重存储类型。

Copy link
Collaborator

@wawltor wawltor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@wawltor wawltor merged commit fa0febc into PaddlePaddle:develop Dec 18, 2024
10 of 12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants