run harness on A770 error #12290

Closed
tao-ov opened this issue Oct 29, 2024 · 6 comments

@tao-ov

tao-ov commented Oct 29, 2024

When I run the harness script at the following link on an A770:

https://github.com/intel-analytics/ipex-llm/blob/main/python/llm/dev/benchmark/harness/run_llb.py

The command is:
python run_llb.py --model ipex-llm --pretrained /home/test/models/LLM/baichuan2-7b/pytorch/ --precision sym_int4 --device xpu --tasks hellaswag --batch 1 --no_cache

The following error occurs:
RuntimeError: Job config of task=hellaswag, precision=sym_int4 failed. Error Message: 'utf-8' codec can't decode byte 0xb5 in position 1: invalid start byte

glorysdj assigned glorysdj and lalalapotter and unassigned glorysdj Oct 30, 2024
@lalalapotter
Contributor

Could you please remove the try-except clause here and provide the full error log?
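
To illustrate why the request above helps: a wrapper that catches an exception and reports only its message hides the frame where the failure actually happened. The following is a minimal sketch, not the code from run_llb.py. `run_job` and `failing_task` are hypothetical names, and the message format is only modeled on the error quoted in the report.

```python
import traceback


def failing_task():
    # Stand-in for the real evaluation call; raises a UnicodeDecodeError.
    return b"(\xb5".decode("utf-8")


def run_job(task, precision, verbose=False):
    """Hypothetical wrapper, loosely modeled on the message in the report."""
    try:
        return failing_task()
    except Exception as e:
        if verbose:
            # Print the full traceback before wrapping, so the failing
            # frame is visible in the log.
            traceback.print_exc()
        # `from e` keeps the original exception chained as __cause__.
        raise RuntimeError(
            f"Job config of task={task}, precision={precision} failed. "
            f"Error Message: {e}"
        ) from e
```

Removing the try-except entirely, as requested, has the same effect as `verbose=True` here: the original traceback reaches the console unchanged.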

@tao-ov
Author

tao-ov commented Oct 30, 2024

(llm) test@test-Z590-VISION-D:~/ipexllm_whowhat/ipex-llm/python/llm/dev/benchmark/harness$ python run_llb.py --model ipex-llm --pretrained /home/test/models/LLM/baichuan2-7b/pytorch/ --precision sym_int4 --device xpu --tasks hellaswag --batch 1 --no_cache
/home/test/miniforge3/envs/llm/lib/python3.11/site-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
warnings.warn(
/home/test/miniforge3/envs/llm/lib/python3.11/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: ''If you don't plan on using image functionality from torchvision.io, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have libjpeg or libpng installed before building torchvision from source?
warn(
2024-10-30 11:06:38,081 - INFO - intel_extension_for_pytorch auto imported
Selected Tasks: ['hellaswag']
The repository for /home/test/models/LLM/baichuan2-7b/pytorch/ contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co//home/test/models/LLM/baichuan2-7b/pytorch/.
You can avoid this prompt in future by passing the argument trust_remote_code=True.

Do you wish to run the custom code? [y/N] y
2024-10-30 11:06:40,365 - WARNING - Xformers is not installed correctly. If you want to use memory_efficient_attention to accelerate training use the following command to install Xformers
pip install xformers.
/home/test/miniforge3/envs/llm/lib/python3.11/site-packages/torch/_utils.py:831: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
return self.fget.__get__(instance, owner)()
2024-10-30 11:06:55,197 - INFO - Converting the current model to sym_int4 format......
Traceback (most recent call last):
File "/home/test/ipexllm_whowhat/ipex-llm/python/llm/dev/benchmark/harness/run_llb.py", line 147, in <module>
main()
File "/home/test/ipexllm_whowhat/ipex-llm/python/llm/dev/benchmark/harness/run_llb.py", line 101, in main
results = evaluator.simple_evaluate(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/test/lm-evaluation-harness/lm_eval/utils.py", line 243, in _wrapper
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/home/test/lm-evaluation-harness/lm_eval/evaluator.py", line 89, in simple_evaluate
task_dict = lm_eval.tasks.get_task_dict(tasks)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/test/lm-evaluation-harness/lm_eval/tasks/__init__.py", line 390, in get_task_dict
task_name_dict = {
^
File "/home/test/lm-evaluation-harness/lm_eval/tasks/__init__.py", line 391, in <dictcomp>
task_name: get_task(task_name)()
^^^^^^^^^^^^^^^^^^^^^
File "/home/test/lm-evaluation-harness/lm_eval/base.py", line 481, in __init__
self.download(data_dir, cache_dir, download_mode)
File "/home/test/lm-evaluation-harness/lm_eval/base.py", line 510, in download
self.dataset = datasets.load_dataset(
^^^^^^^^^^^^^^^^^^^^^^
File "/home/test/miniforge3/envs/llm/lib/python3.11/site-packages/datasets/load.py", line 2606, in load_dataset
builder_instance = load_dataset_builder(
^^^^^^^^^^^^^^^^^^^^^
File "/home/test/miniforge3/envs/llm/lib/python3.11/site-packages/datasets/load.py", line 2277, in load_dataset_builder
dataset_module = dataset_module_factory(
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/test/miniforge3/envs/llm/lib/python3.11/site-packages/datasets/load.py", line 1923, in dataset_module_factory
raise e1 from None
File "/home/test/miniforge3/envs/llm/lib/python3.11/site-packages/datasets/load.py", line 1875, in dataset_module_factory
can_load_config_from_parquet_export = "DEFAULT_CONFIG_NAME" not in f.read()
^^^^^^^^
File "<frozen codecs>", line 322, in decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb5 in position 1: invalid start byte
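
The last frames of the traceback show that the failure happens when the datasets library reads a file as text (`f.read()`), not in the model or XPU code. The sketch below reproduces this class of error and shows one way to inspect the first bytes of a suspect file. `sniff_header` is a hypothetical helper, and the sample bytes are made up to trigger the same message; they are not taken from the reporter's machine.

```python
import os
import tempfile


def sniff_header(path, n=4):
    """Return the first n bytes of a file as a hex string (hypothetical helper)."""
    with open(path, "rb") as f:
        return f.read(n).hex(" ")


# Write a small file whose second byte is 0xb5, which cannot start a
# UTF-8 character.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"(\xb5binary payload")
    sample_path = tmp.name

try:
    with open(sample_path, encoding="utf-8") as f:
        f.read()  # same pattern as the f.read() call in the traceback
    error_text = None
except UnicodeDecodeError as e:
    error_text = str(e)

header = sniff_header(sample_path, 2)
os.remove(sample_path)

print(error_text)
print(header)
```

If the header of a real file looks like binary data rather than text, the file was probably fetched or cached in a form that the installed library version does not expect. That is an assumption to verify, not a diagnosis of this report.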

@lalalapotter
Contributor

This issue may be caused by the datasets library version. Could you please share the versions of your Python libraries so that we can reproduce the issue?

@tao-ov
Author

tao-ov commented Oct 31, 2024

(llm) test@test-Z590-VISION-D:~/ipexllm_whowhat/ipex-llm/python/llm/dev/benchmark/harness$ pip show datasets
DEPRECATION: Loading egg at /home/test/miniforge3/envs/llm/lib/python3.11/site-packages/whowhatbench-1.0.0-py3.11.egg is deprecated. pip 24.3 will enforce this behaviour change. A possible replacement is to use pip for package installation. Discussion can be found at pypa/pip#12330
Name: datasets
Version: 2.21.0
Summary: HuggingFace community-driven open-source library of datasets
Home-page: https://github.com/huggingface/datasets
Author: HuggingFace Inc.
Author-email: thomas@huggingface.co
License: Apache 2.0
Location: /home/test/miniforge3/envs/llm/lib/python3.11/site-packages
Requires: aiohttp, dill, filelock, fsspec, huggingface-hub, multiprocess, numpy, packaging, pandas, pyarrow, pyyaml, requests, tqdm, xxhash
Required-by: lm_eval, optimum, optimum-intel

@lalalapotter
Contributor

The datasets library version we have verified is 2.14.6. Could you please try it in your environment? In the meantime, we will try to reproduce the issue with the datasets version you provided.
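
A quick way to compare the installed version against the verified one from inside Python is sketched below. It uses the standard-library `importlib.metadata`; the helper names `installed_version` and `matches_verified` are hypothetical, and the only value taken from this thread is the version string 2.14.6.

```python
from importlib import metadata

# Version the maintainers report as verified in this thread.
VERIFIED_DATASETS_VERSION = "2.14.6"


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_verified(package="datasets", expected=VERIFIED_DATASETS_VERSION):
    """True only when the package is installed at exactly the expected version."""
    return installed_version(package) == expected


found = installed_version("datasets")
if found is None:
    print("datasets is not installed")
elif matches_verified():
    print(f"datasets {found} matches the verified version")
else:
    print(f"datasets {found} differs from verified {VERIFIED_DATASETS_VERSION}")
```

An exact string comparison is used on purpose, since the thread names a single verified version rather than a supported range.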

@tao-ov
Author

tao-ov commented Oct 31, 2024

pip install datasets==2.14.6
After changing the datasets library version to 2.14.6, the run succeeds.
Thanks for your support!
