# bert-ja-ERNIE

This repository is a work in progress (WIP).

## Download

```sh
git clone https://github.com/yuki-mt/bert-ja-ERNIE.git
cd bert-ja-ERNIE
git submodule update --init --recursive
```
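The pretraining steps below rely on the vendored submodules (presumably including bert-japanese). A quick, optional way to confirm they were fetched:

```sh
# List registered submodules and their checked-out commits;
# a leading "-" means a submodule has not been initialized yet.
git submodule status
```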

## Pretraining

1. Follow the pretraining-from-scratch section of bert-japanese.
   - If your machine has enough CPU cores, try `./script/create_pretraining_data.sh` instead of the script used there.
2. Install the gcloud command-line tool and set it up by running `gcloud init`.
3. Run the following commands:

```sh
$ YOUR_GCP_PROJECT="..."
$ ./script/cp_pretraining_data.sh
$ gsutil -m cp -r /work/data/wiki_record/ gs://${YOUR_GCP_PROJECT}/data
```
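The copy assumes the destination bucket `gs://${YOUR_GCP_PROJECT}` already exists. If it does not, a minimal sketch for creating it and checking the upload (the region `us-central1` is an arbitrary example, not part of this repo's scripts):

```sh
# One-time bucket creation; pick whatever region your VM/TPU lives in.
$ gsutil mb -l us-central1 gs://${YOUR_GCP_PROJECT}

# Verify that the pretraining records landed under data/.
$ gsutil ls gs://${YOUR_GCP_PROJECT}/data/wiki_record/
```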