ggml-alloc : use virtual memory for measurement #2973

slaren · 2023-09-02T20:38:58Z

Instead of using a fixed memory address, this allocates an uncommitted virtual memory region that is guaranteed not to overlap with any other allocation. It should solve the issues on most 32-bit platforms and OS X. Wasm may still be an issue, as I don't think it supports mmap.

slaren · 2023-09-02T22:03:15Z

I have tested this successfully under Linux and Windows, including 32-bit builds, but I would need some help testing with OS X.

CoruNethron · 2023-09-02T23:49:11Z

> I have tested this successfully under Linux and Windows, including 32-bit builds, but I would need some help testing with OS X.

I'll do it. What kind of test could I perform?

slaren · 2023-09-03T00:13:18Z

All you would need to do is make sure it builds without errors and generate a few tokens using main or any other example.

CoruNethron · 2023-09-03T00:45:43Z

@slaren, I've tested alloc-vmem (c031b6c): all binaries built without errors or warnings, and I generated a few tokens without any problem on a Mac M1.

slaren · 2023-09-03T01:14:06Z

Awesome, thanks!

ggerganov

Did some testing on M2 Ultra and it works.

I wonder, should we fall back to the old strategy if mmap is not available?
I'm planning to update whisper.cpp soon to use ggml-alloc and to update the WASM examples.

slaren · 2023-09-03T14:36:56Z

I have added a fallback for systems without virtual memory. However, I noticed that emscripten has a shim for mmap that just allocates memory, which is not great. It should still work, but memory usage will be high, so we may have to add a check to disable mmap when building with emscripten.
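The check mentioned above could look roughly like the following compile-time selection. This is only a sketch under assumptions: the macro `USE_VMEM`, the function `fixed_measure_base`, and the exact placement of the fixed address are illustrative and not taken from the PR.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical switch: only use virtual memory reservation where it is real.
#if defined(__EMSCRIPTEN__)
// emscripten's mmap shim allocates actual memory, so treat it as unsupported.
#define USE_VMEM 0
#elif defined(_WIN32) || defined(__unix__) || defined(__APPLE__)
#define USE_VMEM 1
#else
#define USE_VMEM 0
#endif

// Hypothetical fallback for systems without virtual memory: pick a fixed
// base near the top of the address space so that [base, base + size) does
// not wrap around. The pointer is meant only for offset arithmetic during
// measurement and must never be dereferenced.
static void * fixed_measure_base(size_t size) {
    uintptr_t base = (uintptr_t) 0 - (uintptr_t) size - 0x100;
    return (void *) base;
}
```

With this layout, a build for emscripten would take the fixed-address path and avoid the high memory usage of the mmap shim, at the cost of losing the guarantee that the range does not overlap real allocations.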

staviq · 2023-09-03T22:38:13Z

As a side note:

Regarding mmap, this seems related: WebAssembly/WASI#304

There's also this project, which I have used successfully before. It is simple enough that extending it is not very complicated, and it uses clang to produce wasm: https://github.com/schellingb/wajic

Edit: It should be possible to get rid of emscripten in favour of clang, if that helps in any way.

ggml-alloc : use virtual memory for measurement

96f3662

slaren marked this pull request as draft September 2, 2023 20:39

compatibility fixes for MAP_ANONYMOUS

c031b6c

slaren mentioned this pull request Sep 2, 2023

Finetune LORA #2632

Merged

13 tasks

slaren marked this pull request as ready for review September 2, 2023 22:03

jhen0409 approved these changes Sep 3, 2023

ggerganov approved these changes Sep 3, 2023

fallback to fixed address for systems without virtual memory

203afcf

slaren mentioned this pull request Sep 3, 2023

llama.cpp/ggml-alloc.c:230: alloc->n_free_blocks < MAX_FREE_BLOCKS && "out of free blocks" #2993

Closed

4 tasks

slaren merged commit cf9b084 into master Sep 3, 2023
26 checks passed

slaren deleted the alloc-vmem branch September 3, 2023 18:34

jhen0409 mentioned this pull request Sep 8, 2023

Excessively high memory consumption on iOS #3069

Closed

3 tasks
