Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable llama.cpp on s390x big endian platform #3552
Enable llama.cpp on s390x big endian platform #3552
Changes from 5 commits
0613562
fa62c8c
1ce890a
8640f3b
e4efbdb
51e9d39
7fc0250
e513abe
eb5b832
4389bda
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be handle in
GGUFWriter.write_tensor_data
just like you do inadd_tensor
. Conversion script should not have no responsibility for handling endianness other than setting it in the constructor.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@monatis updated as your comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The magic is meant to be exactly the ascii bytes G G U F in the file, regardless of the system endianness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to check magic code instead. If the endianess is not match, magic code is 0x47475546. Then we can warn user:
"Endianess of the GGUF file and platform do not match"
Result after apply fix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you do want to start loading and saving files that start with
F U G G
(look in a hex editor), you will have to request a spec change, because that's no longer a GGUF file by its current definition.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cebtenzzre
I added endianess check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cebtenzzre This depends on whether we think the magic number is a number or a string.
ggml.c read the magic number as uint_32. This is endianess sensitive. If we think magic is int type, I think my update is compatible to the spec. But if we think magic is string type, then we need to update both ggml.h, ggml.c and gguf.py.
@ggerganov what's your opinion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it better to fix
ggml.c
to read and write the magic byte-per-byte to match the spec?Currently, technically it does not match the spec
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ggerganov @cebtenzzre
Appreciate for your comments.
Yes. Let me clarify my update. I fixed the ggml.h to use the difference int magic value according to endianess which always represents "GGUF" characters. Now the file is always compatible to the spec. Now the GGUF file for big endian is started with "GGUF" as small endian GGUF file is.
See the hexdump of llama2 gguf file on s390x big endian:
And I rolled back the line to write GGUF_MAGIC in gguf.py. It always write the magic in byte order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this works, but I wish to avoid the
ifdef
in the header and the inclusion of extra headers (endian.h
).We should implement the multi-character constant alternative as proposed by @cebtenzzre and instead of read / write
uint32_t
at once, we should read / write byte-by-byte and compare the multi-byte constant.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ggerganov @monatis
I like this choice. Previously I thought maybe this change is too big.
I will also need to change the magic in struct gguf_header to char array.
If you agree, I will update according to your comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ggerganov updated according to your comments