buffer not aligned #231
Comments
Which version are you using? Alignment was added in 0.3.0.
Hey, I just took a look. For this file: https://huggingface.co/BlinkDL/rwkv-4-pile-430m/blob/main/RWKV-4-Pile-430M-20220808-8066.pth all tensor data are bfloat16, not f32, and the alignment for 16-bit floats is respected there, no?
The file I used is a .safetensors file, which contains only float32 data. Please use this script to convert the model first.
The problem: offsets are calculated from the end of the header. If the header size is not a multiple of 4, then even if the offsets are aligned, the actual memory address won't be aligned.
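(For illustration, a minimal sketch of that arithmetic, using the header size reported in this issue: tensor data lives at 8 + header_size + data_offset, so an aligned data_offset is not enough when the header size itself isn't a multiple of 4.)

```python
# Sketch: the absolute file position of a tensor is 8 (length field) + n (header) + data_offset.
n = 0xa766  # header size of the converted file, as reported in this issue
start = 0   # a data_offset that is itself a multiple of 4
print((8 + n + start) % 4)  # -> 2, so an f32 tensor at this position is misaligned
```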
I'm not sure I understand. The header gets padded with empty spaces until the memory addresses are aligned; the offsets' alignment doesn't matter. I could share a script to showcase address alignment if you want.
I know it's possible to add alignment on Linux/POSIX mmap. The problem is that a model created with safetensors' Python library doesn't load aligned when read with safetensors' Rust library.
You can check this:

```python
from huggingface_hub import hf_hub_download
import torch
from safetensors.torch import load_file, save_file

filename = hf_hub_download("BlinkDL/rwkv-4-pile-430m", filename="RWKV-4-Pile-430M-20220808-8066.pth")
weights = torch.load(filename, map_location="cpu")
save_file(weights, "out.safetensors")
```
```python
import mmap
import torch
import json
import os
from huggingface_hub import hf_hub_download


def load_file(filename, device):
    with open(filename, mode="r", encoding="utf8") as file_obj:
        with mmap.mmap(file_obj.fileno(), length=0, access=mmap.ACCESS_READ) as m:
            # The file starts with an 8-byte little-endian length of the JSON header.
            header = m.read(8)
            n = int.from_bytes(header, "little")
            metadata_bytes = m.read(n)
            metadata = json.loads(metadata_bytes)
    size = os.stat(filename).st_size
    storage = torch.ByteStorage.from_file(filename, shared=False, size=size).untyped()
    offset = n + 8
    return {name: create_tensor(storage, info, offset) for name, info in metadata.items() if name != "__metadata__"}


DTYPES = {"F32": torch.float32, "BF16": torch.bfloat16}
ALIGNMENT = {torch.float32: 4, torch.bfloat16: 2}
device = "cpu"


def create_tensor(storage, info, offset):
    dtype = DTYPES[info["dtype"]]
    shape = info["shape"]
    start, stop = info["data_offsets"]
    # Non-zero means this tensor's absolute file offset is not aligned for its dtype.
    print((start + offset) % ALIGNMENT[dtype])
    return torch.asarray(storage[start + offset : stop + offset], dtype=torch.uint8).view(dtype=dtype).reshape(shape)


weights = load_file("out.safetensors", device)
```

The loading is done in pure Python just so that you can mess with pointers easily. The mmap initial pointer is always page aligned, so what's important to check is the value printed in `create_tensor`, i.e. `(start + offset) % ALIGNMENT[dtype]`.
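(Side note, not from the comment above: a small sketch, assuming the same out.safetensors file, that checks the claim that the mmap base is page aligned, so only the in-file offset matters.)

```python
import mmap

import numpy as np

with open("out.safetensors", "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
    view = np.frombuffer(m, dtype=np.uint8)   # zero-copy view over the mapping
    print(view.ctypes.data % mmap.PAGESIZE)   # 0: the mapping itself starts on a page boundary
    del view                                  # drop the exported buffer so the mmap can close
```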
Your example works correctly by coincidence. Please try this; I only changed a few lines to convert the weights to f32.

```python
from huggingface_hub import hf_hub_download
import torch
from safetensors.torch import load_file, save_file

filename = hf_hub_download("BlinkDL/rwkv-4-pile-430m", filename="RWKV-4-Pile-430M-20220808-8066.pth")
weights = torch.load(filename, map_location="cpu")
for k in weights.keys():
    weights[k] = weights[k].float()  # convert to float32
save_file(weights, "out.safetensors")
```
```python
import mmap
import torch
import json
import os
from huggingface_hub import hf_hub_download


def load_file(filename, device):
    with open(filename, mode="r", encoding="utf8") as file_obj:
        with mmap.mmap(file_obj.fileno(), length=0, access=mmap.ACCESS_READ) as m:
            header = m.read(8)
            n = int.from_bytes(header, "little")
            metadata_bytes = m.read(n)
            metadata = json.loads(metadata_bytes)
    size = os.stat(filename).st_size
    storage = torch.ByteStorage.from_file(filename, shared=False, size=size).untyped()
    offset = n + 8
    # print(n)
    return {name: create_tensor(storage, info, offset) for name, info in metadata.items() if name != "__metadata__"}


DTYPES = {"F32": torch.float32, "BF16": torch.bfloat16}
ALIGNMENT = {torch.float32: 4, torch.bfloat16: 2}
device = "cpu"


def create_tensor(storage, info, offset):
    dtype = DTYPES[info["dtype"]]
    shape = info["shape"]
    start, stop = info["data_offsets"]
    print((start + offset) % ALIGNMENT[dtype])
    return torch.asarray(storage[start + offset : stop + offset], dtype=torch.uint8).view(dtype=dtype).reshape(shape)


weights = load_file("out.safetensors", device)
```
Indeed, that's pretty bad! I created #235 to fix that. I did some testing with various models on custom backends, and I guess I was pretty lucky.
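(For context, a sketch of the general save-side idea: pad the JSON header with spaces until the data section starts on an aligned boundary. This is only an illustration, not necessarily how #235 implements it.)

```python
import json

def build_header(metadata: dict, align: int = 8) -> bytes:
    """Serialize the metadata and pad it so tensor data starts on an `align`-byte boundary."""
    header = json.dumps(metadata).encode("utf-8")
    # Tensor data begins at 8 (length field) + len(header); trailing spaces are harmless JSON whitespace.
    header += b" " * (-(8 + len(header)) % align)
    return len(header).to_bytes(8, "little") + header
```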
I converted the PyTorch checkpoint to safetensors. The buffer is not aligned.
RWKV-4-Pile-430M-20220808-8066.pth is from https://huggingface.co/BlinkDL/rwkv-4-pile-430m
The convert script is here: https://github.com/iacore/rwkv-np/blob/main/convert.py
The tensor data are all f32.
0xa766 % 4 == 2
Why it's not aligned: the offsets count from the end of the metadata header (sized 0xa766), so even a 4-aligned offset lands at a file position that is 2 bytes off.
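(A minimal way to reproduce that arithmetic from the converted file; the name out.safetensors is only an assumption, use whatever convert.py produced.)

```python
# Read the 8-byte little-endian header length and check where tensor data starts.
with open("out.safetensors", "rb") as f:
    n = int.from_bytes(f.read(8), "little")
print(hex(n))       # 0xa766 for this file
print((8 + n) % 4)  # tensor data starts at byte 8 + n; 2 means the f32 data is misaligned
```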