Add GLM-4 and Later GLM Model (Draft) #31977
Closed

Commits (86)
- `9cf74d7` add GLM-4 (zRzRzRzRzRzRzR)
- `bef7fd9` GLM-4 FastTokenizer (zRzRzRzRzRzRzR)
- `c986fac` tokenizer fix (zRzRzRzRzRzRzR)
- `2da5d32` rename (zRzRzRzRzRzRzR)
- `675e7a1` pad token (zRzRzRzRzRzRzR)
- `304e4ef` Merge branch 'huggingface:main' into glm-4 (zRzRzRzRzRzRzR)
- `0b241f2` Merge branch 'huggingface:main' into glm-4 (zRzRzRzRzRzRzR)
- `fa44041` Fix past_key_values (duzx16)
- `24dec6b` Merge branch 'huggingface:main' into glm-4 (zRzRzRzRzRzRzR)
- `5d2bf5e` Merge branch 'glm-4' of github.com:zRzRzRzRzRzRzR/transformers into g… (duzx16)
- `63d49c9` Fix flash attention (duzx16)
- `0a5adf3` Merge branch 'huggingface:main' into glm-4 (zRzRzRzRzRzRzR)
- `51cbf5d` add update (zRzRzRzRzRzRzR)
- `86b5004` Merge branch 'glm-4' of https://github.com/zRzRzRzRzRzRzR/transformer… (zRzRzRzRzRzRzR)
- `9a553e5` test with glm (zRzRzRzRzRzRzR)
- `4d45b21` fix test (zRzRzRzRzRzRzR)
- `85cfe41` add discription (zRzRzRzRzRzRzR)
- `860c7ee` Merge branch 'huggingface:main' into glm-4 (zRzRzRzRzRzRzR)
- `c83ec2d` update glm (zRzRzRzRzRzRzR)
- `2608010` Merge branch 'huggingface:main' into glm-4 (zRzRzRzRzRzRzR)
- `1719000` Merge branch 'huggingface:main' into glm-4 (zRzRzRzRzRzRzR)
- `3f0452e` rewrite tokenizer (zRzRzRzRzRzRzR)
- `33d2ca3` Merge branch 'huggingface:main' into glm-4 (zRzRzRzRzRzRzR)
- `084988e` fix some test (zRzRzRzRzRzRzR)
- `0cb1531` fix testing (zRzRzRzRzRzRzR)
- `e49718f` Fix RMSNorm initialization (duzx16)
- `a362206` Fix position ids when passing input_embeds (duzx16)
- `08b43d9` Merge branch 'huggingface:main' into glm-4 (zRzRzRzRzRzRzR)
- `3c5322d` Merge branch 'huggingface:main' into glm-4 (zRzRzRzRzRzRzR)
- `dd06993` Merge branch 'huggingface:main' into glm-4 (zRzRzRzRzRzRzR)
- `8cc0381` Fix dtype error (duzx16)
- `a35997e` Merge branch 'glm-4' of github.com:zRzRzRzRzRzRzR/transformers into g… (duzx16)
- `621d32f` Fix output_layer for classification models (duzx16)
- `48d1704` fix gradient (zRzRzRzRzRzRzR)
- `5881ed5` remove some skip test (zRzRzRzRzRzRzR)
- `c920ad9` fix small test (zRzRzRzRzRzRzR)
- `21781b3` Fix prepare_inputs_for_generation (duzx16)
- `9599200` Merge branch 'glm-4' of github.com:zRzRzRzRzRzRzR/transformers into g… (duzx16)
- `a9b1d0d` fix (zRzRzRzRzRzRzR)
- `0631615` Merge branch 'huggingface:main' into glm-4 (zRzRzRzRzRzRzR)
- `9f33751` add converter (zRzRzRzRzRzRzR)
- `2663a13` fix PEP 8 (zRzRzRzRzRzRzR)
- `aad19db` remove test (zRzRzRzRzRzRzR)
- `1e9183c` index (zRzRzRzRzRzRzR)
- `e8b90a1` fix doctested (zRzRzRzRzRzRzR)
- `65e1996` remove init (zRzRzRzRzRzRzR)
- `266ce77` fix copied error (zRzRzRzRzRzRzR)
- `cd9c304` fix mlp differ (zRzRzRzRzRzRzR)
- `ba30dad` fix copied eerror (zRzRzRzRzRzRzR)
- `afb1423` Merge branch 'huggingface:main' into glm-4 (zRzRzRzRzRzRzR)
- `48aaba1` test_hidden_states_output = False (zRzRzRzRzRzRzR)
- `33d976f` Merge branch 'glm-4' of https://github.com/zRzRzRzRzRzRzR/transformer… (zRzRzRzRzRzRzR)
- `0675202` fix (zRzRzRzRzRzRzR)
- `19b0939` Update modeling_glm.py (zRzRzRzRzRzRzR)
- `b2b6c0f` Update __init__.py (zRzRzRzRzRzRzR)
- `6760791` fix glm type error (zRzRzRzRzRzRzR)
- `515d9d9` fix (zRzRzRzRzRzRzR)
- `9951c92` ruff problem (zRzRzRzRzRzRzR)
- `547ac95` Update convert_slow_tokenizer.py (zRzRzRzRzRzRzR)
- `9ba6cf7` Add explanations in English (zRzRzRzRzRzRzR)
- `9fb6405` reformate (zRzRzRzRzRzRzR)
- `e37bb49` Merge branch 'huggingface:main' into glm-4 (zRzRzRzRzRzRzR)
- `25aec29` Update configuration_glm.py (zRzRzRzRzRzRzR)
- `58d344a` Merge branch 'glm-4' of https://github.com/zRzRzRzRzRzRzR/transformer… (zRzRzRzRzRzRzR)
- `073b811` fix (zRzRzRzRzRzRzR)
- `c0e6ae9` Merge branch 'huggingface:main' into glm-4 (zRzRzRzRzRzRzR)
- `6ac085f` fix glm dummy (zRzRzRzRzRzRzR)
- `f140603` Merge branch 'glm-4' of https://github.com/zRzRzRzRzRzRzR/transformer… (zRzRzRzRzRzRzR)
- `65f471d` add doc (zRzRzRzRzRzRzR)
- `7ad819f` fix init (zRzRzRzRzRzRzR)
- `f86af8e` Update __init__.py (zRzRzRzRzRzRzR)
- `c179377` Update dummy_vision_objects.py (zRzRzRzRzRzRzR)
- `41338d7` add_start_docstrings (zRzRzRzRzRzRzR)
- `dba6d1e` fix GLM_START_DOCSTRING (zRzRzRzRzRzRzR)
- `82b0c7f` 1 (zRzRzRzRzRzRzR)
- `a6b6f4e` Update perf_infer_gpu_one.md (zRzRzRzRzRzRzR)
- `d1a5ee1` Merge branch 'huggingface:main' into glm-4 (zRzRzRzRzRzRzR)
- `c99610e` Merge branch 'huggingface:main' into glm-4 (zRzRzRzRzRzRzR)
- `b283adc` flash attn (zRzRzRzRzRzRzR)
- `4cc618e` stiil need fix rotary_emb (zRzRzRzRzRzRzR)
- `b476dd0` fix GLMSelfAttension (zRzRzRzRzRzRzR)
- `aab2386` remove _get_unpad_data (zRzRzRzRzRzRzR)
- `550a692` fix GLMSelfAttention (zRzRzRzRzRzRzR)
- `6492ac3` Merge branch 'huggingface:main' into glm-4 (zRzRzRzRzRzRzR)
- `c3d4636` Merge branch 'huggingface:main' into glm-4 (zRzRzRzRzRzRzR)
- `70b7ff4` Merge branch 'huggingface:main' into glm-4 (zRzRzRzRzRzRzR)
<!--Copyright 2024 The GLM & ZhipuAI team and The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.

⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.

-->
# GLM

## Overview

The GLM model was proposed
in [ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools](https://arxiv.org/html/2406.12793v1)
by the GLM Team, THUDM & ZhipuAI.

The abstract from the paper is the following:

*We introduce ChatGLM, an evolving family of large language models that we have been developing over time. This report
primarily focuses on the GLM-4 language series, which includes GLM-4, GLM-4-Air, and GLM-4-9B. They represent our most
capable models that are trained with all the insights and lessons gained from the preceding three generations of
ChatGLM. To date, the GLM-4 models are pre-trained on ten trillions of tokens mostly in Chinese and English, along with
a small set of corpus from 24 languages, and aligned primarily for Chinese and English usage. The high-quality alignment
is achieved via a multi-stage post-training process, which involves supervised fine-tuning and learning from human
feedback. Evaluations show that GLM-4 1) closely rivals or outperforms GPT-4 in terms of general metrics such as MMLU,
GSM8K, MATH, BBH, GPQA, and HumanEval, 2) gets close to GPT-4-Turbo in instruction following as measured by IFEval, 3)
matches GPT-4 Turbo (128K) and Claude 3 for long context tasks, and 4) outperforms GPT-4 in Chinese alignments as
measured by AlignBench. The GLM-4 All Tools model is further aligned to understand user intent and autonomously decide
when and which tool(s) to use—including web browser, Python interpreter, text-to-image model, and user-defined
functions—to effectively complete complex tasks. In practical applications, it matches and even surpasses GPT-4 All
Tools in tasks like accessing online information via web browsing and solving math problems using Python interpreter.
Over the course, we have open-sourced a series of models, including ChatGLM-6B (three generations), GLM-4-9B (128K, 1M),
GLM-4V-9B, WebGLM, and CodeGeeX, attracting over 10 million downloads on Hugging face in the year 2023 alone.*
Tips:

- This model was contributed by [THUDM](https://huggingface.co/THUDM). The most recent code can be
  found [here](https://github.com/thudm/GLM-4).

## Usage tips

`GLM-4` can be found on the [Huggingface Hub](https://huggingface.co/collections/THUDM/glm-4-665fcf188c414b03c2f7e3b7).

In the following, we demonstrate how to use `glm-4-9b-chat` for inference. Note that we use the ChatML format for dialog; in this demo, we show how to leverage `apply_chat_template` for this purpose.
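The exact prompt string is produced by the chat template bundled with the tokenizer, so you should always go through `apply_chat_template` rather than build it by hand. As a rough illustration only, a generic ChatML-style rendering looks like the sketch below; `render_chatml` is a hypothetical helper, not part of `transformers`, and the actual GLM-4 template may use different special tokens.

```python
# Hypothetical helper sketching the generic ChatML layout; the real string
# comes from the tokenizer's own chat template via apply_chat_template.
def render_chatml(messages, add_generation_prompt=True):
    text = ""
    for message in messages:
        text += f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
    if add_generation_prompt:
        # Leave the assistant turn open so the model continues from here.
        text += "<|im_start|>assistant\n"
    return text

print(render_chatml([{"role": "user", "content": "Hello"}]))
```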

```python
>>> from transformers import AutoModelForCausalLM, AutoTokenizer
>>> device = "cuda"  # the device to load the model onto

>>> model = AutoModelForCausalLM.from_pretrained("THUDM/glm-4-9b-chat", device_map="auto")
>>> tokenizer = AutoTokenizer.from_pretrained("THUDM/glm-4-9b-chat")

>>> prompt = "Give me a short introduction to large language model."

>>> messages = [{"role": "user", "content": prompt}]

>>> text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

>>> model_inputs = tokenizer([text], return_tensors="pt").to(device)

>>> generated_ids = model.generate(model_inputs.input_ids, max_new_tokens=512, do_sample=True)

>>> generated_ids = [output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)]

>>> response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
```
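The slicing step above is needed because `generate` returns each sequence with the prompt tokens still prepended; the list comprehension keeps only the newly generated part. A minimal sketch of that logic with made-up token IDs:

```python
# Made-up token IDs: each generated sequence starts with a copy of its prompt.
input_ids = [[101, 7592, 102]]                    # prompt tokens only
generated = [[101, 7592, 102, 2023, 2003, 1037]]  # prompt + newly generated tokens

# Keep only the continuation, mirroring the list comprehension in the snippet above.
new_tokens = [out[len(inp):] for inp, out in zip(input_ids, generated)]
print(new_tokens)  # [[2023, 2003, 1037]]
```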
## GLMConfig

[[autodoc]] GLMConfig
## GLMTokenizer

[[autodoc]] GLMTokenizer
- save_vocabulary

## GLMTokenizerFast

[[autodoc]] GLMTokenizerFast

## GLMModel

[[autodoc]] GLMModel
- forward

## GLMForCausalLM

[[autodoc]] GLMForCausalLM
- forward

## GLMForSequenceClassification

[[autodoc]] GLMForSequenceClassification
- forward

## GLMForTokenClassification

[[autodoc]] GLMForTokenClassification
- forward