
[Bug]: Dear teachers, when I try to use SPU on torch, I found a problem #771

Closed

zhangwaer opened this issue Jul 16, 2024 · 7 comments
zhangwaer commented Jul 16, 2024

Issue Type

Others

Modules Involved

Others

Have you reproduced the bug with SPU HEAD?

Yes

Have you searched existing issues?

Yes

SPU Version

spu 0.9.1b0

OS Platform and Distribution

Ubuntu 18.04

Python Version

3.10.14

Compiler Version

GCC 11.2.1

Current Behavior?

Dear teachers, when I try to use GPT-2 on SPU, I found that I cannot transfer the params successfully. Could you please give me some advice on how to transfer them? Thanks very much!

```
Traceback (most recent call last):
  File "/a-tmp/g/a.py", line 32, in
  File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/transformers/models/gpt2/modeling_gpt2.py", line 1302, in forward
    transformer_outputs = self.transformer(
                          ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/transformers/models/gpt2/modeling_gpt2.py", line 1002, in forward
    input_shape = input_ids.size()
                  ^^^^^^^^^^^^^^
AttributeError: 'collections.OrderedDict' object has no attribute 'size'
```

```python
import torch
from transformers import AutoTokenizer, GPT2LMHeadModel, GPT2Config
import json
import urllib
from collections import OrderedDict
from jax.tree_util import tree_map

configuration = GPT2Config()
model = GPT2LMHeadModel(configuration)
pre_model = GPT2LMHeadModel.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
inputs = tokenizer.encode("Hello, my dog is cute", return_tensors="pt")
params = pre_model.state_dict()
# Passing the state_dict as the first positional argument makes forward()
# treat it as input_ids, which raises the AttributeError above.
res = model(params, inputs)
print(res)
```

Standalone code to reproduce the issue

print("A bug")

Relevant log output

No response

@tpppppub (Member) commented:

Your code and error suggest that you are using the transformers GPT2 model incorrectly. I don't see you calling SPU anywhere.
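For reference, a minimal plaintext sketch (no SPU involved) of the standard transformers usage: weights are transferred with `load_state_dict`, and `forward()` receives the token tensor rather than the state dict. This assumes the default `GPT2Config()` matches the "gpt2" checkpoint, which holds for the small model:

```python
import torch
from transformers import AutoTokenizer, GPT2Config, GPT2LMHeadModel

configuration = GPT2Config()
model = GPT2LMHeadModel(configuration)
pre_model = GPT2LMHeadModel.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
inputs = tokenizer.encode("Hello, my dog is cute", return_tensors="pt")

# Transfer the pretrained weights into the locally built model ...
model.load_state_dict(pre_model.state_dict())
model.eval()

# ... and pass only the input ids to forward().
with torch.no_grad():
    res = model(inputs)
print(res.logits.shape)
```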

@zhangwaer (Author) commented Jul 16, 2024

> Your code and error suggest that you are using the transformers GPT2 model incorrectly. I don't see you calling SPU anywhere.

Sorry, I use the following code to run GPT-2 on SPU, but it outputs "I0000 00:00:1721093883.848833 2880 cpu_client.cc:405] TfrtCpuClient created" and then gets stuck for a long time, so I hope to get some advice on how to transfer the params. Thank you very much!

```python
import torch
from transformers import AutoTokenizer, GPT2LMHeadModel, GPT2Config
import spu.utils.distributed as ppd
import json
import urllib
from collections import OrderedDict
from jax.tree_util import tree_map

with open("3pc.json", "r") as file:
    conf = json.load(file)
ppd.init(conf["nodes"], conf["devices"], framework=ppd.Framework.EXP_TORCH)

configuration = GPT2Config()
model = GPT2LMHeadModel(configuration)
pre_model = GPT2LMHeadModel.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
inputs = tokenizer.encode("Hello, my dog is cute", return_tensors="pt")

params = pre_model.state_dict()
# Move the weights to party P1 and the inputs to party P2 as numpy arrays.
params = ppd.device("P1")(lambda input: tree_map(lambda x: x.detach().numpy(), input))(params)
inputs = ppd.device("P2")(lambda x: x.detach().numpy())(inputs)
# Trace and run the model on the SPU virtual device.
res = ppd.device("SPU")(model)(params, inputs)
```

@tpppppub (Member) commented:

Replace `res = ppd.device("SPU")(model)(params, inputs)` with `res = ppd.device("SPU")(pre_model)(params, inputs)`.
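In context, a sketch of the fixed call, assuming the same setup as the snippet above (`ppd.get` is the helper SPU's distributed examples use to fetch a result back to the driver):

```python
# Run the pretrained model, whose traced graph matches the checkpoint weights.
res = ppd.device("SPU")(pre_model)(params, inputs)
# Fetch the result back from the SPU device to the driver process.
res = ppd.get(res)
print(res)
```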

zhangwaer reopened this Jul 16, 2024
@zhangwaer (Author) commented Jul 16, 2024

> Replace `res = ppd.device("SPU")(model)(params, inputs)` with `res = ppd.device("SPU")(pre_model)(params, inputs)`.

Thanks very much! But I have a question: pre_model can load the params through `res = ppd.device("SPU")(pre_model)(params, inputs)`, but model cannot load them. Why is that? Thanks very much!

@tpppppub (Member) commented:

A possible reason is that model, which is initialized with an empty config, has a different DAG from pre_model.
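One way to rule out config differences entirely, sketched here under the assumption that the two models should share the same architecture: build the local model from the checkpoint's own config and load the pretrained weights before tracing it on SPU:

```python
from transformers import GPT2Config, GPT2LMHeadModel

# Use the checkpoint's config so the locally built model has the same
# architecture (and hence the same traced graph) as pre_model.
configuration = GPT2Config.from_pretrained("gpt2")
model = GPT2LMHeadModel(configuration)
model.load_state_dict(pre_model.state_dict())
model.eval()
```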

zhangwaer reopened this Jul 16, 2024
@tpppppub (Member) commented:

Currently, SPU for PyTorch is experimental and at an early stage, using some dirty hacks to support torch.nn.Module inference. As a result, for a PyTorch model (whether from Hugging Face or somewhere else), running it on SPU doesn't fully align with its plaintext version.

@zhangwaer (Author) commented Jul 16, 2024

> Currently, SPU for PyTorch is experimental and at an early stage, using some dirty hacks to support torch.nn.Module inference. As a result, for a PyTorch model (whether from Hugging Face or somewhere else), running it on SPU doesn't fully align with its plaintext version.

Thanks very much! Looking forward to the better version!
