
Commit d8d5bf3: update readme

nlpzhezhao committed Feb 14, 2024
1 parent b6bca11
Showing 2 changed files with 6 additions and 8 deletions.
README.md (3 additions, 4 deletions)
@@ -53,7 +53,6 @@ UER-py has the following features:
* argparse
* packaging
* regex
- * For the mixed precision training you will need apex from NVIDIA
* For the pre-trained model conversion (related to TensorFlow) you will need TensorFlow
* For the tokenization with sentencepiece model you will need [SentencePiece](https://github.com/google/sentencepiece)
* For developing a stacking model you will need LightGBM and [BayesianOptimization](https://github.com/fmfn/BayesianOptimization)
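
A minimal install sketch for these optional dependencies, assuming the usual PyPI package names (an assumption here; check each project's own install instructions):
```
pip install tensorflow                      # pre-trained model conversion
pip install sentencepiece                   # sentencepiece tokenization
pip install lightgbm bayesian-optimization  # stacking with Bayesian optimization
```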
@@ -111,7 +110,7 @@ mv models/book_review_model.bin-5000 models/book_review_model.bin
```
Notice that the model trained by *pretrain.py* is saved with a suffix recording the training step (*--total_steps*). We can remove the suffix for ease of use.

- Then we fine-tune the pre-trained model on downstream classification dataset. We use embedding and encoder layers of book_review_model.bin, which is the output of *pretrain.py*:
+ Then we fine-tune the pre-trained model on the downstream classification dataset. We use the embedding and encoder layers of *book_review_model.bin*, which is the output of *pretrain.py*:
```
python3 finetune/run_classifier.py --pretrained_model_path models/book_review_model.bin \
--vocab_path models/google_zh_vocab.txt \
    ...
```

@@ -142,7 +141,7 @@ The above content provides basic ways of using UER-py to pre-process, pre-train,
<br/>

## Pre-training data
- This section provides links to a range of :arrow_right: [__pre-training data__](https://github.com/dbiir/UER-py/wiki/Pretraining-data) :arrow_left: .
+ This section provides links to a range of :arrow_right: [__pre-training data__](https://github.com/dbiir/UER-py/wiki/Pretraining-data) :arrow_left: . UER can load this pre-training data directly.
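
As a usage sketch, a downloaded corpus can be converted into UER's pre-training format with *preprocess.py* (the flags follow the quickstart above; the corpus path and target are placeholders):
```
python3 preprocess.py --corpus_path corpora/book_review.txt \
                      --vocab_path models/google_zh_vocab.txt \
                      --dataset_path dataset.pt \
                      --processes_num 8 --target bert
```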

<br/>

@@ -152,7 +151,7 @@ This section provides links to a range of :arrow_right: [__downstream datasets__
<br/>

## Modelzoo
- With the help of UER, we pre-trained models of different properties (e.g. models based on different corpora, encoders, and targets). Detailed introduction of pre-trained models and their download links can be found in :arrow_right: [__modelzoo__](https://github.com/dbiir/UER-py/wiki/Modelzoo) :arrow_left: . All pre-trained models can be loaded by UER directly. More pre-trained models will be released in the future.
+ With the help of UER, we pre-trained models with different properties (e.g. based on different corpora, encoders, and targets). A detailed introduction to the pre-trained models and their download links can be found in :arrow_right: [__modelzoo__](https://github.com/dbiir/UER-py/wiki/Modelzoo) :arrow_left: . All pre-trained models can be loaded by UER directly.
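
For example, a downloaded model can serve as the starting point for further pre-training via *--pretrained_model_path*. A minimal sketch, assuming the quickstart flags and using *google_zh_model.bin* as a stand-in for a modelzoo download:
```
python3 pretrain.py --dataset_path dataset.pt \
                    --vocab_path models/google_zh_vocab.txt \
                    --pretrained_model_path models/google_zh_model.bin \
                    --output_model_path models/book_review_model.bin \
                    --world_size 1 --gpu_ranks 0 \
                    --total_steps 5000 --save_checkpoint_steps 1000
```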

<br/>

README_ZH.md (3 additions, 4 deletions)
@@ -51,7 +51,6 @@ UER-py has the following advantages:
* argparse
* packaging
* regex
- * To use mixed precision training, you need to install NVIDIA's apex
* To convert TensorFlow models, you need to install TensorFlow
* To use a sentencepiece model in the tokenizer, you need to install [SentencePiece](https://github.com/google/sentencepiece)
* To use stacking model ensembles, you need to install LightGBM and [BayesianOptimization](https://github.com/fmfn/BayesianOptimization)
@@ -109,7 +108,7 @@ mv models/book_review_model.bin-5000 models/book_review_model.bin
```
Note that the model output by *pretrain.py* carries a suffix recording the number of training steps (*--total_steps*); here we can remove the suffix for ease of use.

- Then we fine-tune the pre-trained model on the downstream classification dataset, using the output of *pretrain.py*, book_review_model.bin (loading the embedding and encoder layer parameters):
+ Then we fine-tune the pre-trained model on the downstream classification dataset, using the output of *pretrain.py*, *book_review_model.bin* (loading the embedding and encoder layer parameters):
```
python3 finetune/run_classifier.py --pretrained_model_path models/book_review_model.bin \
--vocab_path models/google_zh_vocab.txt \
    ...
```

@@ -140,7 +139,7 @@ python3 inference/run_classifier_infer.py --load_model_path models/finetuned_mod
<br/>

## Pre-training data
- We provide links to a range of open-source :arrow_right: [__pre-training data__](https://github.com/dbiir/UER-py/wiki/预训练数据) :arrow_left: .
+ We provide links to a range of open-source :arrow_right: [__pre-training data__](https://github.com/dbiir/UER-py/wiki/预训练数据) :arrow_left: . UER can load this pre-training data directly.

<br/>

@@ -150,7 +149,7 @@ python3 inference/run_classifier_infer.py --load_model_path models/finetuned_mod
<br/>

## Modelzoo
- With the help of UER-py, we train pre-trained models with different properties (for example, based on different corpora, encoders, and target tasks). Users can find pre-trained models of various properties, together with their descriptions and download links, in the :arrow_right: [__modelzoo__](https://github.com/dbiir/UER-py/wiki/预训练模型仓库) :arrow_left: . All pre-trained models can be loaded directly by UER-py. More pre-trained models will be released in the future.
+ With the help of UER-py, we train pre-trained models with different properties (for example, based on different corpora, encoders, and target tasks). Users can find pre-trained models of various properties, together with their descriptions and download links, in the :arrow_right: [__modelzoo__](https://github.com/dbiir/UER-py/wiki/预训练模型仓库) :arrow_left: . All pre-trained models can be loaded directly by UER-py.

<br/>

