korpora lmdata \
--corpus all \
--output_dir ~/works/lmdata
Error log
Create train data from kowikitext: 0it [00:00, ?it/s]
| Done | Corpus name | Num sents | File name |
| ---- | ------------------------- | ---------- | --------- |
| x | kcbert | 86246284 | all.train |
| x | korean_chatbot_data | 23646 | all.train |
| x | korean_hate_speech | 2042260 | all.train |
| x | korean_parallel_koen_news | 97123 | all.train |
| x | korean_petitions | 867262 | all.train |
| x | kornli | 1900708 | all.train |
| x | korsts | 17256 | all.train |
| | kowikitext | - | |
| | namuwikitext | - | |
| | naver_changwon_ner | - | |
| | nsmc | - | |
| | question_pair | - | |
[Korpora] Corpus `kowikitext` is already installed at /home/beomi/Korpora/kowikitext/kowikitext_20200920.train.zip
[Korpora] Corpus `kowikitext` is already installed at /home/beomi/Korpora/kowikitext/kowikitext_20200920.train
[Korpora] Corpus `kowikitext` is already installed at /home/beomi/Korpora/kowikitext/kowikitext_20200920.test.zip
[Korpora] Corpus `kowikitext` is already installed at /home/beomi/Korpora/kowikitext/kowikitext_20200920.test
[Korpora] Corpus `kowikitext` is already installed at /home/beomi/Korpora/kowikitext/kowikitext_20200920.dev.zip
[Korpora] Corpus `kowikitext` is already installed at /home/beomi/Korpora/kowikitext/kowikitext_20200920.dev
Create train data from kowikitext: 0it [00:02, ?it/s]
Traceback (most recent call last):
  File "/home/beomi/anaconda3/envs/deepspeed/bin/korpora", line 8, in <module>
    sys.exit(main())
  File "/home/beomi/anaconda3/envs/deepspeed/lib/python3.8/site-packages/Korpora/cli.py", line 64, in main
    task_function(args)
  File "/home/beomi/anaconda3/envs/deepspeed/lib/python3.8/site-packages/Korpora/task_lmdata.py", line 47, in create_lmdata
    for i_sent, sent in enumerate(sent_iterator):
  File "/home/beomi/anaconda3/envs/deepspeed/lib/python3.8/site-packages/tqdm/std.py", line 1133, in __iter__
    for obj in iterable:
  File "/home/beomi/anaconda3/envs/deepspeed/lib/python3.8/site-packages/Korpora/task_lmdata.py", line 180, in iterate_kowikitext
    with open(path, encoding='utf-8') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/beomi/Korpora//kowiki/kowikitext_20200920.train'
For ko wiki, the path should be kowikitext/kowikitext_....., but the LM data code uses /kowiki/kowikitext_...., which looks like a typo.
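To make the mismatch easy to see, here is a small sketch that compares the installed path and the requested path, both copied from the log. `first_mismatch` is a hypothetical helper written only for this illustration; it is not part of Korpora, and it says nothing about how `task_lmdata.py` actually builds the path.

```python
import os

# Both paths are copied verbatim from the log in this issue.
installed = "/home/beomi/Korpora/kowikitext/kowikitext_20200920.train"
requested = "/home/beomi/Korpora//kowiki/kowikitext_20200920.train"


def first_mismatch(path_a, path_b):
    """Return the first pair of differing path components, or None.

    Hypothetical helper for illustration only (not a Korpora function).
    normpath collapses the doubled slash so it does not count as a difference.
    """
    parts_a = os.path.normpath(path_a).split(os.sep)
    parts_b = os.path.normpath(path_b).split(os.sep)
    for a, b in zip(parts_a, parts_b):
        if a != b:
            return a, b
    return None


print(first_mismatch(installed, requested))  # ('kowikitext', 'kowiki')
```

The only differing component is the directory name, which matches the typo described in this report: the corpus is installed under `kowikitext`, while the LM data step looks under `kowiki`.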
The error occurs when running the command above.