[Bug]: RAY error #129
Comments
# Process config example for dataset
# global parameters
project_name: 'demo-process'
dataset_path: '/home/wzp/code/LLMData/open_source/data-juicer/demos/process_on_ray/data/demo-dataset.json' # path to your dataset directory or file
np: 4 # number of subprocess to process your dataset
export_path: '/home/wzp/code/LLMData/open_source/data-juicer/outputs/demo-process/demo-processed.jsonl'
# use_cache: false
# save_stats_in_one_file: true
# process schedule
# a list of several process operators with their arguments
process:
  # - language_id_score_filter:
  #     lang: 'zh'
  # - alphanumeric_filter:
  - chinese_convert_mapper:
      mode: 's2t'
Third bug:
2023-12-12 19:40:46,412 INFO plan.py:757 -- Using autodetected parallelism=192 for stage ReadJSON to satisfy parallelism at least twice the available number of CPUs (96).
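The numbers in the log line above are consistent with each other: 192 is exactly twice the 96 CPUs reported. A minimal sketch of that single constraint follows. The function name `min_read_parallelism` and the `factor` parameter are hypothetical, introduced only for illustration; this is not Ray's actual implementation, which may weigh other factors when choosing parallelism.

```python
def min_read_parallelism(available_cpus: int, factor: int = 2) -> int:
    """Lower bound stated in the log: parallelism of at least `factor` x CPUs.

    Hypothetical helper for illustration only; it models just the one
    constraint quoted in the log message, not Ray's full heuristic.
    """
    if available_cpus < 1:
        raise ValueError("available_cpus must be a positive integer")
    return factor * available_cpus


# 96 CPUs, as reported in the log line
print(min_read_parallelism(96))  # 192
```

So the INFO message by itself describes expected behaviour on a 96-CPU machine, not an error.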
We will try to reproduce and investigate this issue.
This issue is marked as stale because there has been no activity for 21 days. Remove the stale label or add a new comment, or this issue will be closed in 3 days.
Close this stale issue.
Before Reporting
I have pulled the latest code from the main branch and run it again, and the bug still exists.
I have read the README carefully and no error occurred during the installation process. (Otherwise, we recommend asking a question using the Question template.)
Search before reporting
OS
ubuntu
Installation Method
from source
Data-Juicer Version
v0.1.2
Python Version
3.8
Describe the bug
The language_id_score_filter operator
To Reproduce
python tools/process_data.py --config configs/demo/process.yaml --executor_type ray
Configs
Logs
outputs.zip
Screenshots
Additional
No response