ROCm Port #1087

SlyEcho · 2023-04-20T18:43:33Z

Currently I can say that for regular users the CLBlast version is much easier to run. If you want the most performance, though, HIP is for you.

Remember to tweak the new settings LLAMA_CUDA_DMMV_X and LLAMA_CUDA_MMV_Y, LLAMA_CUDA_KQUANTS_ITER

I get the best result with 128, 8 and 1, for example.

Note for unsupported GPU users:

You need to use an environment variable to force ROCm to run.
You can ckeck this resource: ROCm supported-gpu-list

export HSA_OVERRIDE_GFX_VERSION=10.3.0

This will make it work in the currently running shell, after that ./main and other llama.cpp commands will run.

rocBLAS is only released for a limited number of GPUs: gfx900 gfx906 gfx908 gfx90a gfx1030 (depends on ROCm version, etc).

If you look in /opt/rocm/lib/rocblas/library/ you should see a lot of files, but only for some GPUs, for others you need to find something that is close enough, like gfx1030 instead of gfx1033, and then that becomes 10.3.0 for the environment variable.

If you have multiple AMD devices:

If you have a GPU and APU then it may try to use wrong devices. There is an environment variable you can set to control the selected device:

export HIP_VISIBLE_DEVICES=0

ROCm port

I just define all the cudaXxx functions to hipXxx etc. This may seem stupidly simple but it's exactly the same kind of trick AMD uses to make HIP code compile with nvcc, you can see it in /opt/rocm/include/hip/nvidia_detail/nvidia_hip_runtime_api.h (for some reason I can't find the source for this anywhere online but it has a free license, so if you want, I can post it).

HIP can also compile the Cuda kernel programs without any major modifications, just some header stuff.

Compiling

To this, you need the ROCm developer kit and hipBLAS which may be a separate package.

With CMake I have to invoke:

CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++ cmake -DLLAMA_HIPBLAS=ON

It is probably unavoidable to use the LLVM Clang compiler You can use the ROCm included one or the system one, but mixing it with GCC objects is just asking for trouble.

Makefile should work, too, pass in LLAMA_HIPBLAS=1. You can use the env variable ROCM_PATH if ROCm is not installed at /opt/rocm:

make -j4 LLAMA_HIPBLAS=1

Makefile will override the compilers to ROCm LLVM, so it should be a simple command to compile. But you should be able to override the compilers on the make command line.

Docker

Probably the best option right now is using Docker with AMD's images:

FROM rocm/dev-ubuntu-22.04:5.5-complete AS build

WORKDIR /app

COPY . ./

RUN make LLAMA_HIPBLAS=1

ENV PATH="/app:$PATH"
CMD [ "main" ]

Save it somewhere as rocm.Dockerfile then in llama.cpp's source do:

docker build -f /path/to/rocm.Dockerfile . -t llama.cpp:rocm

Then run it like this:

docker run --rm -it --init \
    --device /dev/dri --device /dev/kfd \
    -v/my/models:/models llama.cpp:rocm \
    main -m /models/llama-7b-q4_2.bin -p "$(cat prompts/dan.txt)"

You can also add the override like this: -e HSA_OVERRIDE_GFX_VERSION=10.3.0 and -e HIP_VISIBLE_DEVICES=0 as needed. There may be also some other security flags needed on some distros, and whatever permissions your user needs to have for the devices (usually group video).

Using nerdctl, I had to add the DRI devices separately (--device /dev/dri/card0 --device /dev/dri/renderD128 rather than the /dev/dri directory like in Docker), it also works, but beware that on some buildkit setups it will load the whole image via tarballs and since it's several gigabytes it will take some time to build.

All the commands are there besides main, you can also run /bin/bash for a dev shell, mount the llama.cpp source somewhere and use it for development. It is a bit of a thick image, for end users, maybe too big, I want to trim it down but the AMD stuff is bloated.

What's up with the compilers?

Regarding hipcc, it is not really a compiler, I had a lot of problems with it, it couldn't compile and link .cpp and .o files together (like hipcc main.cpp llama.o ggml.o ...). If you open it in a text editor you see it's a Perl script and all it does is provide some default flags for the Clang compiler. It might work in CMake, since CMake always compiles to objects first.

It shouldn't be a requirement to use AMD's version of Clang, it is possible to use any normal Clang or LLVM (maybe even Zig?) to compile the device code. In the CMake build I added a warning if the compiler is not Clang but it won't stop you from experimenting (well, it will probably fail to compile the .cu file).

If you use VS Code then the C/C++ plugin doesn't support HIP correctly, it sees in compileCommands.json (part of CMake's output) that the .cu file is using a language argument -x hip and it doesn't know what that is, so the whole file is locked to the C language even if it's actually C++ and you'll see some red squiggles. This flag comes from the hip::device package in CMake.

In CMake it is harder to use different compilers in the same project (may need to use a subdirectory) than in Make, so currently the .cu file is handled as a C++ file and compiled with the rest of the C++ files, this is what AMD's vision is with HIP -- they should just be normal C++ files.

I also tried adding another language, HIP enable_language(HIP), to CMake but I had some trouble getting the CMake to configure in all environments consistently, maybe it it needs some package that was missing in the container. In this case, it would work more similar to Cuda: I can define the .cu file's language to be HIP, whatever compiler configured for HIP compiles it and a compiler that can link it correctly will link it to an executable. When it was working on Arch, it configured it automatically like: CMAKE_CXX_COMPILER=/usr/bin/g++ and CMAKE_HIP_COMPILER=/usr/bin/clang++ and it was working correctly, using the HIP compliler to link in the end. This would be the ideal solution, it would give the user the most control over the config -- if I got it to work, that is 😜. If someone more experienced with this knows how to do it, please go ahead.

For the Makefile I thought it would be easier to override the compilers, because it is supposed to be more beginner friendly and you can get a result in one command (that is if everything is installed properly). But it has some variables also.

ggerganov · 2023-04-20T18:46:54Z

What does hipBLAS do?

SlyEcho · 2023-04-20T18:49:36Z

hipBLAS is just basically a wrapper around rocBLAS or cuBLAS. Well, all of HIP is supposed to be.

Now HIP Clang is not required, the CMake scripts will configure the needed compiler, which can be system clang++. Also other code can still use GCC, but CMake will force the clang to link.

slaren · 2023-04-20T23:20:38Z

I have started moving all the cuda specific stuff to ggml-cuda.h/cu in ROCm/rocBLAS#1094, you could also move all the HIP stuff to ggml-cuda.h to keep ggml.c a bit more clean. If this works well, it could be a nice way to support AMD GPUs. Do you have any performance numbers?

SlyEcho · 2023-04-20T23:28:44Z

I'll try to rebase on your code. As for perf, it's about 38 ms for 7B, GPU is Vega64
What's the best way to do a measurement?

slaren · 2023-04-20T23:30:50Z

Either the perplexity time per pass or the prompt eval times with a big prompt seems good enough to measure performance, that's what I have been doing anyway. Use --no-mmap to make sure that there isn't any loading happening in the first eval.

SlyEcho · 2023-04-20T23:47:16Z

7b-q4_0:  15.21 seconds per pass - ETA 2.77 hours
7b-f16:   16.30 seconds per pass - ETA 2.97 hours
13b-q4_0: 19.60 seconds per pass - ETA 3.57 hours
30b-q4_0: 29.70 seconds per pass - ETA 5.40 hours

GPU is used at about 30%, VRAM 2G

SlyEcho · 2023-04-21T00:27:23Z

I'm now building it in AMD's official Docker image and it is giving me double the performance... 🤯

7b-q4_0: 5.84 seconds per pass - ETA 1.06 hours
7b-f16: 6.47 seconds per pass - ETA 1.18 hours
13b-q4_0: 9.89 seconds per pass - ETA 1.80 hours
30b-q4_0: 20.40 seconds per pass - ETA 3.71 hours

SlyEcho · 2023-04-21T01:25:33Z

This is the rocprof trace from the Docker image:

And this one from the Arch:

It just seems faster because it loads the BLAS libraries faster.

SlyEcho · 2023-04-21T07:23:43Z

@slaren can you check in Cuda, currently --memory_f32 is broken for me.

slaren · 2023-04-21T13:04:57Z

--memory_f32 seems to work fine for me with Cuda, I couldn't notice any issues.

FNsi · 2023-04-22T10:57:58Z

Thank you for the great work.

~~Currently Perplexity not working in that PR.~~
Running perplexity and it stuck after show the 655chunks, batch_size=512
GPU is still working. Let me try to wait more time for that...
@SlyEcho it's work, sorry I didn't make it because haven't deleted all the other flags.

Llama 30B Q4_2
F32:
ETA 9h28m
[1] 3.2521, [2] 3.6665, [3] 4.3870 [4] 4.3477, [5] 4.2213
Without F32 flag:
ETA 8h47m
[1] 3.2520, [2] 3.6665, [3] 4.3869, [4] 4.3476, [5] 4.2213, [6] 4.2205, [7] 4.4011, [8] 4.4856, [9] 4.7332, [10] 4.9523, [11] 5.1126, [12] 5.1601, [13] 5.1378 [14] 5.2206 [15] 5.3794 ......[100] 4.3098 and I decide to abort it 😅
Until now, compare with 30b Q4_1 result in the discussion post, it's keep accurate and perform better.
30b q4_1 result by Jason Titus

And my pc's running test with rocm suit 5.4.2 is below:
30b llama Q4_2 Running with DAN
Master 50cb666
OpenBlas:
real 1m29.206
User 14m47.035
Sys 6m13.047

Master 50cb666
with your ggml.c and ggml-cuda.cu

Hipblas:
Real 0m57.723
User 7m23.156
Sys 0m3.356

Meanwhile maybe it's better to mention CXX also need to be changed to hipcc

Peak vram usage about 1.4 G, while running perplexity is about 2 G.

FNsi · 2023-04-22T11:02:47Z

@slaren can you check in Cuda, currently --memory_f32 is broken for me.

This --memory_f32 is Working with gfx1035 (HSA gfx1030) indeed the vega integrated gpu 680M

More detail: I didn't set cxx=clang, but cxx=hipcc. Maybe that's the reason?

SlyEcho · 2023-04-23T19:00:13Z

I think the issue with --memory_f32 is resolved for me at least. I will try to to a perplexity run.

SlyEcho · 2023-04-24T11:57:20Z

Bonus picture, running on a Steam Desk with Steam OS. I have installed containerd so I don't have to install any ROCm stuff.

To achieve this, the env var HSA_OVERRIDE_GFX_VERSION=10.3.0 was used and llama.cpp built with GPU_TARGETS=gfx1030 because the native gfx1033 is not supported by rocBLAS yet. I'm sure a properly tuned rocBLAS build could be faster. Note that I have changed the GPU VRAM split in the BIOS to 4GB.

hipBLAS eval (plugged in 🔌): 49 ms per token.
CPU eval (🔋🔌): 118 ms per token.
OpenBLAS eval (🔋🔌): 84 ms per token.

DGdev91 · 2023-04-24T15:50:40Z

I was trying to make it work on HIP too (here is my fork https://github.com/DGdev91/llama.cpp) but i wasn't able to make it work, it was stuck after showing the "llama_model_load_internal" rows.
I have the same problem with this code, so i guess the issue wasn't in the code, but in my own setup.
Also, this solution is indeed much cleaner than mine, so let's just work on this PR.
My GPU is a RX 5700xt, and i use HSA_OVERRIDE_GFX_VERSION=10.3.0 too, it's a common workaround also for pytorch-related programs, like StableDiffusion.

Any idea on how can i try to figure out what is going on?

SlyEcho · 2023-04-24T17:20:18Z

@DGdev91 that means it is crashing when trying to initialize HIP or hipBLAS.

What compiler did you use? The hipcc perl script is probably legacy and the integrated LLVM is the way to go, also the program should be linked with it and not GCC.

What is the GPU target that you used? Should be --offload-arch=gfx1030 if you want to use that.

The CMake file seems to be just broken.

EDIT: I forgot to mention, but when I managed to compile your code, it was running fine on the GPU 😃

DGdev91 · 2023-04-24T17:50:30Z

@DGdev91 that means it is crashing when trying to initialize HIP or hipBLAS.

What compiler did you use? The hipcc perl script is probably legacy and the integrated LLVM is the way to go, also the program should be linked with it and not GCC.

What is the GPU target that you used? Should be --offload-arch=gfx1030 if you want to use that.

The CMake file seems to be just broken.

EDIT: I forgot to mention, but when I managed to compile your code, it was running fine on the GPU smiley

You are right, but forget my fork, it was just an experiment. i already said i prefer your solution, and i had the same exact issue even there.
If my code worked for you (after correcting the makefile) we have another confirm it's an issue on my end.
what is really wierd, it works just fine with StableDiffusion

SlyEcho · 2023-04-24T17:56:27Z

I suspect it has something to do with the GPU architecture that is being built. My Makefile changes will detect the GPU of your system but that may not work if you're overriding it on the command line. On the Steam Deck I had to build it for one specific one (gfx1030) because that's the one rocBLAS supports.

This is something that should happen automatically and not be on the user to fix. I need to figure it out.

DGdev91 · 2023-04-24T18:02:51Z

I suspect it has something to do with the GPU architecture that is being built. My Makefile changes will detect the GPU of your system but that may not work if you're overriding it on the command line. On the Steam Deck I had to build it for one specific one (gfx1030) because that's the one rocBLAS supports.

This is something that should happen automatically and not be on the user to fix. I need to figure it out.

I compiled it with make LLAMA_HIPBLAS=1 GPU_TARGETS=gfx1030 and launched export HSA_OVERRIDE_GFX_VERSION=10.3.0 before launching main. There must be something else.

SlyEcho · 2023-04-25T17:58:57Z

Perplexity Testing for hipBLAS version

Code

Commit: 3a004b2a0166e412d8d54052c50bfd093611ad95

Models

I should mention that the Q4_0 models were converted some time ago so I don't know if they are "fresh" with the latest quantization fixes.
The other ones I made recently from the F16 version.

find models -name 'llama-7b-*.bin' -exec sem -j4 shasum {} ';'
8c5fe788ceaf8077e505f8f43efaa8f8cfd6e3eb  models/llama-7b-q4_0.bin
80a9d0bdf85dcddc83533a3aecf70eb9c542fdfa  models/llama-7b-q4_2.bin
da9ebf470350d8912caa04bf54fc6aced8d9ef19  models/llama-7b-q4_1.bin
1cbe22cfd2600f4e3b2d247ed1b82504cde3be78  models/llama-7b-q4_3.bin
0512fdf961215612db5a47cb1f6539c55936523c  models/llama-7b-f16.bin

Hardware

CPU: Intel Core i7 7700K (4c/8t), 4.7 GHz (OC)
RAM: 32 GB DDR4, 2666 MT/s
GPU: AMD Radeon Vega64 (8GB)

Arch Linux testing with:

OS: Arch Linux 6.2.11-arch1-1
BLAS: OpenBLAS 0.3.23-1
ROCm: 5.4.3

AMD official Docker with this Dockerfile:

rocm.Dockerfile

FROM rocm/dev-ubuntu-22.04
ARG GPU_TARGETS="gfx900"
ARG MAKE_JOBS=4

RUN apt-get update && \
    apt-get --no-install-recommends install -y hipblas-dev

WORKDIR /app

COPY . ./

RUN make \
    LLAMA_HIPBLAS=1 \
    GPU_TARGETS="$GPU_TARGETS" \
    -j $MAKE_JOBS \
    main perplexity

STOPSIGNAL SIGKILL
ENV PATH="/app:$PATH"
CMD [ "main" ]

Compile with:

docker build -f ~/Desktop/rocm.Dockerfile . -t llama.cpp:rocm

Results

7B Q4_0, Arch: [655]6.2818

./build/bin/perplexity --no-mmap -m ./models/llama-7b-q4_0.bin -f ./models/wiki.test.raw

main: seed = 1682276609
llama.cpp: loading model from ./models/llama-7b-q4_0.bin
llama_model_load_internal: format     = ggjt v1 (latest)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 512
llama_model_load_internal: n_embd     = 4096
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 32
llama_model_load_internal: n_layer    = 32
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: ftype      = 2 (mostly Q4_0)
llama_model_load_internal: n_ff       = 11008
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 4113739.11 KB
llama_model_load_internal: mem required  = 5809.32 MB (+ 1026.00 MB per state)
....................................................................................................
llama_init_from_file: kv self size  =  256.00 MB

system_info: n_threads = 8 / 8 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
perplexity : calculating perplexity over 655 chunks, batch_size=512
15.40 seconds per pass - ETA 2 hours 48 minutes
[1]4.3749,[2]4.9540,[3]5.8254,[4]6.4669,[5]6.5409,[6]6.5395,[7]6.7155,[8]6.8046,[9]7.1737,[10]7.4103,[11]7.6549,[12]7.6926,[13]7.6022,[14]7.6783,[15]7.9331,[16]7.5386,[17]7.4157,[18]7.3768,[19]7.0052,[20]6.9921,[21]6.8947,[22]6.7102,[23]6.6723,[24]6.5850,[25]6.5848,[26]6.4125,[27]6.2326,[28]6.1317,[29]6.0477,[30]5.8916,[31]5.8634,[32]5.8812,[33]5.8164,[34]5.8511,[35]5.8769,[36]5.9208,[37]5.9247,[38]5.9419,[39]5.9800,[40]6.0387,[41]6.0458,[42]6.0802,[43]6.0373,[44]6.0921,[45]6.0965,[46]6.0707,[47]6.0944,[48]6.0652,[49]6.0722,[50]6.0328,[51]6.0287,[52]6.0177,[53]6.0619,[54]6.0454,[55]6.0230,[56]6.0572,[57]6.0803,[58]6.1021,[59]6.1159,[60]6.1624,[61]6.1512,[62]6.2143,[63]6.2479,[64]6.2630,[65]6.3095,[66]6.3197,[67]6.3378,[68]6.3518,[69]6.3767,[70]6.4090,[71]6.4305,[72]6.4602,[73]6.5254,[74]6.5308,[75]6.5453,[76]6.5616,[77]6.5749,[78]6.5597,[79]6.5892,[80]6.5817,[81]6.5943,[82]6.5980,[83]6.5443,[84]6.5297,[85]6.5182,[86]6.4971,[87]6.4318,[88]6.4033,[89]6.3827,[90]6.3661,[91]6.3922,[92]6.3884,[93]6.3909,[94]6.3884,[95]6.4171,[96]6.4150,[97]6.4078,[98]6.4007,[99]6.3867,[100]6.3867,[101]6.4126,[102]6.4062,[103]6.4280,[104]6.4347,[105]6.4333,[106]6.4510,[107]6.4497,[108]6.4621,[109]6.4568,[110]6.4523,[111]6.4751,[112]6.4941,[113]6.4955,[114]6.4921,[115]6.5003,[116]6.4930,[117]6.4985,[118]6.5270,[119]6.5479,[120]6.5844,[121]6.6007,[122]6.6254,[123]6.6644,[124]6.6822,[125]6.6735,[126]6.7126,[127]6.7497,[128]6.7772,[129]6.7603,[130]6.7699,[131]6.7646,[132]6.7558,[133]6.7430,[134]6.7542,[135]6.7507,[136]6.7376,[137]6.7296,[138]6.7125,[139]6.7009,[140]6.6979,[141]6.6681,[142]6.6633,[143]6.6354,[144]6.6153,[145]6.6066,[146]6.5931,[147]6.6006,[148]6.6029,[149]6.5969,[150]6.5928,[151]6.5940,[152]6.5845,[153]6.5678,[154]6.5587,[155]6.5655,[156]6.5605,[157]6.5789,[158]6.5824,[159]6.5866,[160]6.5892,[161]6.6017,[162]6.5716,[163]6.5595,[164]6.5334,[165]6.5015,[166]6.4728,[167]6.4354,[168]6.4028,[169]6.3893,[170]6.3768,[171]6.3479,[172]6.3298,[173]6.3113,[174]6.2805,[175]6.2584,[176]6.2482,[177]6.2271,[178]6.2036,[179]6.1864,[180]6.1775,[181]6.1551,[182]6.1359,[183]6.1217,[184]6.1215,[185]6.1142,[186]6.1160,[187]6.1214,[188]6.1178,[189]6.1362,[190]6.1371,[191]6.1575,[192]6.1738,[193]6.1916,[194]6.2032,[195]6.2242,[196]6.2412,[197]6.2633,[198]6.2788,[199]6.2818,[200]6.2863,[201]6.2822,[202]6.3027,[203]6.3093,[204]6.3092,[205]6.3201,[206]6.3279,[207]6.3239,[208]6.3323,[209]6.3375,[210]6.3426,[211]6.3524,[212]6.3598,[213]6.3704,[214]6.3739,[215]6.3780,[216]6.3927,[217]6.4106,[218]6.4241,[219]6.4244,[220]6.4208,[221]6.4145,[222]6.4110,[223]6.4002,[224]6.3935,[225]6.3888,[226]6.4103,[227]6.4191,[228]6.4249,[229]6.4317,[230]6.4273,[231]6.4441,[232]6.4311,[233]6.4140,[234]6.3983,[235]6.3825,[236]6.3747,[237]6.3644,[238]6.3678,[239]6.3516,[240]6.3413,[241]6.3446,[242]6.3483,[243]6.3468,[244]6.3348,[245]6.3322,[246]6.3201,[247]6.3077,[248]6.3010,[249]6.2989,[250]6.3037,[251]6.2960,[252]6.2927,[253]6.2824,[254]6.2784,[255]6.2668,[256]6.2477,[257]6.2366,[258]6.2279,[259]6.2259,[260]6.2178,[261]6.2135,[262]6.2076,[263]6.2030,[264]6.1838,[265]6.1829,[266]6.1814,[267]6.1745,[268]6.1842,[269]6.1822,[270]6.1828,[271]6.1906,[272]6.1952,[273]6.1948,[274]6.1962,[275]6.2052,[276]6.2107,[277]6.2267,[278]6.2375,[279]6.2461,[280]6.2497,[281]6.2596,[282]6.2656,[283]6.2803,[284]6.2881,[285]6.2975,[286]6.3122,[287]6.3116,[288]6.3176,[289]6.3085,[290]6.2934,[291]6.2780,[292]6.2622,[293]6.2484,[294]6.2509,[295]6.2503,[296]6.2547,[297]6.2533,[298]6.2559,[299]6.2531,[300]6.2418,[301]6.2419,[302]6.2339,[303]6.2262,[304]6.2184,[305]6.2159,[306]6.2027,[307]6.2051,[308]6.2084,[309]6.1921,[310]6.1860,[311]6.1796,[312]6.1818,[313]6.1762,[314]6.1749,[315]6.1584,[316]6.1541,[317]6.1375,[318]6.1159,[319]6.1278,[320]6.1408,[321]6.1446,[322]6.1401,[323]6.1335,[324]6.1310,[325]6.1410,[326]6.1410,[327]6.1431,[328]6.1473,[329]6.1533,[330]6.1559,[331]6.1682,[332]6.1651,[333]6.1720,[334]6.1662,[335]6.1597,[336]6.1635,[337]6.1605,[338]6.1592,[339]6.1534,[340]6.1491,[341]6.1568,[342]6.1593,[343]6.1648,[344]6.1648,[345]6.1647,[346]6.1619,[347]6.1666,[348]6.1708,[349]6.1726,[350]6.1692,[351]6.1698,[352]6.1698,[353]6.1646,[354]6.1644,[355]6.1699,[356]6.1729,[357]6.1693,[358]6.1783,[359]6.1814,[360]6.1777,[361]6.1772,[362]6.1839,[363]6.1951,[364]6.2016,[365]6.2074,[366]6.2081,[367]6.2169,[368]6.2147,[369]6.2156,[370]6.2166,[371]6.2106,[372]6.2159,[373]6.2215,[374]6.2202,[375]6.2198,[376]6.2282,[377]6.2233,[378]6.2259,[379]6.2319,[380]6.2235,[381]6.2192,[382]6.2135,[383]6.2125,[384]6.2118,[385]6.2105,[386]6.2100,[387]6.2092,[388]6.2047,[389]6.1993,[390]6.1924,[391]6.1843,[392]6.1803,[393]6.1784,[394]6.1810,[395]6.1793,[396]6.1720,[397]6.1795,[398]6.1833,[399]6.1916,[400]6.1912,[401]6.1926,[402]6.1932,[403]6.1950,[404]6.2014,[405]6.1918,[406]6.1884,[407]6.1877,[408]6.1887,[409]6.2011,[410]6.2121,[411]6.2246,[412]6.2408,[413]6.2524,[414]6.2599,[415]6.2652,[416]6.2732,[417]6.2863,[418]6.2897,[419]6.2971,[420]6.3058,[421]6.3179,[422]6.3236,[423]6.3308,[424]6.3428,[425]6.3519,[426]6.3583,[427]6.3628,[428]6.3711,[429]6.3756,[430]6.3846,[431]6.3992,[432]6.4035,[433]6.4022,[434]6.3976,[435]6.3983,[436]6.4008,[437]6.4102,[438]6.4181,[439]6.4145,[440]6.4140,[441]6.4089,[442]6.4080,[443]6.4093,[444]6.4096,[445]6.4076,[446]6.4100,[447]6.4129,[448]6.4172,[449]6.4145,[450]6.4148,[451]6.4106,[452]6.3987,[453]6.3904,[454]6.3843,[455]6.3851,[456]6.3898,[457]6.3915,[458]6.3894,[459]6.3903,[460]6.3990,[461]6.3962,[462]6.3946,[463]6.3997,[464]6.3988,[465]6.3957,[466]6.3877,[467]6.3879,[468]6.3878,[469]6.3900,[470]6.3905,[471]6.3858,[472]6.3904,[473]6.3848,[474]6.3861,[475]6.3802,[476]6.3826,[477]6.3754,[478]6.3745,[479]6.3808,[480]6.3860,[481]6.3880,[482]6.3834,[483]6.3793,[484]6.3815,[485]6.3798,[486]6.3743,[487]6.3743,[488]6.3724,[489]6.3674,[490]6.3647,[491]6.3616,[492]6.3558,[493]6.3528,[494]6.3510,[495]6.3507,[496]6.3473,[497]6.3419,[498]6.3402,[499]6.3351,[500]6.3255,[501]6.3185,[502]6.3184,[503]6.3181,[504]6.3088,[505]6.3113,[506]6.3122,[507]6.3060,[508]6.3018,[509]6.3007,[510]6.3046,[511]6.3092,[512]6.3127,[513]6.3145,[514]6.3212,[515]6.3156,[516]6.3149,[517]6.3159,[518]6.3160,[519]6.3190,[520]6.3218,[521]6.3234,[522]6.3263,[523]6.3273,[524]6.3336,[525]6.3373,[526]6.3385,[527]6.3405,[528]6.3351,[529]6.3355,[530]6.3308,[531]6.3298,[532]6.3347,[533]6.3370,[534]6.3351,[535]6.3374,[536]6.3320,[537]6.3297,[538]6.3345,[539]6.3357,[540]6.3397,[541]6.3405,[542]6.3412,[543]6.3426,[544]6.3438,[545]6.3417,[546]6.3423,[547]6.3378,[548]6.3323,[549]6.3325,[550]6.3298,[551]6.3260,[552]6.3239,[553]6.3197,[554]6.3175,[555]6.3146,[556]6.3143,[557]6.3166,[558]6.3126,[559]6.3122,[560]6.3117,[561]6.3118,[562]6.3100,[563]6.3100,[564]6.3143,[565]6.3160,[566]6.3157,[567]6.3135,[568]6.3140,[569]6.3124,[570]6.3150,[571]6.3156,[572]6.3166,[573]6.3168,[574]6.3132,[575]6.3127,[576]6.3126,[577]6.3115,[578]6.3095,[579]6.3103,[580]6.3037,[581]6.2999,[582]6.2989,[583]6.2997,[584]6.3001,[585]6.2924,[586]6.2856,[587]6.2858,[588]6.2908,[589]6.2966,[590]6.2996,[591]6.3018,[592]6.3003,[593]6.2966,[594]6.2977,[595]6.2954,[596]6.2991,[597]6.2967,[598]6.2930,[599]6.2952,[600]6.2949,[601]6.2935,[602]6.2952,[603]6.2982,[604]6.2992,[605]6.3025,[606]6.3045,[607]6.3029,[608]6.2993,[609]6.3000,[610]6.3036,[611]6.3018,[612]6.3043,[613]6.3007,[614]6.2956,[615]6.2879,[616]6.2909,[617]6.2846,[618]6.2794,[619]6.2738,[620]6.2595,[621]6.2523,[622]6.2506,[623]6.2521,[624]6.2525,[625]6.2525,[626]6.2510,[627]6.2531,[628]6.2536,[629]6.2533,[630]6.2567,[631]6.2631,[632]6.2685,[633]6.2668,[634]6.2701,[635]6.2707,[636]6.2674,[637]6.2640,[638]6.2666,[639]6.2637,[640]6.2647,[641]6.2650,[642]6.2718,[643]6.2740,[644]6.2752,[645]6.2731,[646]6.2774,[647]6.2735,[648]6.2742,[649]6.2743,[650]6.2782,[651]6.2839,[652]6.2846,[653]6.2889,[654]6.2825,[655]6.2818,

llama_print_timings:        load time = 16830.86 ms
llama_print_timings:      sample time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings: prompt eval time = 3533301.29 ms / 335360 tokens (   10.54 ms per token)
llama_print_timings:        eval time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings:       total time = 3572772.20 ms

7B Q4_0 --memory_f32, Arch: [655]6.2838,

./build/bin/perplexity --no-mmap -m ./models/llama-7b-q4_0.bin --memory_f32 -f ./models/wiki.test.raw

main: seed = 1682280920
llama.cpp: loading model from ./models/llama-7b-q4_0.bin
llama_model_load_internal: format     = ggjt v1 (latest)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 512
llama_model_load_internal: n_embd     = 4096
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 32
llama_model_load_internal: n_layer    = 32
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: ftype      = 2 (mostly Q4_0)
llama_model_load_internal: n_ff       = 11008
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 4113739.11 KB
llama_model_load_internal: mem required  = 5809.32 MB (+ 2052.00 MB per state)
....................................................................................................
llama_init_from_file: kv self size  =  512.00 MB

system_info: n_threads = 8 / 8 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
perplexity : calculating perplexity over 655 chunks, batch_size=512
15.63 seconds per pass - ETA 2 hours 50 minutes
[1]4.3801,[2]4.9556,[3]5.8270,[4]6.4693,[5]6.5437,[6]6.5414,[7]6.7176,[8]6.8070,[9]7.1757,[10]7.4122,[11]7.6567,[12]7.6957,[13]7.6057,[14]7.6821,[15]7.9367,[16]7.5419,[17]7.4189,[18]7.3798,[19]7.0077,[20]6.9948,[21]6.8969,[22]6.7124,[23]6.6743,[24]6.5868,[25]6.5871,[26]6.4149,[27]6.2349,[28]6.1341,[29]6.0498,[30]5.8938,[31]5.8659,[32]5.8839,[33]5.8189,[34]5.8538,[35]5.8795,[36]5.9233,[37]5.9272,[38]5.9444,[39]5.9825,[40]6.0413,[41]6.0483,[42]6.0827,[43]6.0398,[44]6.0944,[45]6.0989,[46]6.0730,[47]6.0968,[48]6.0675,[49]6.0746,[50]6.0352,[51]6.0310,[52]6.0201,[53]6.0642,[54]6.0477,[55]6.0251,[56]6.0595,[57]6.0825,[58]6.1044,[59]6.1183,[60]6.1648,[61]6.1536,[62]6.2166,[63]6.2503,[64]6.2654,[65]6.3119,[66]6.3221,[67]6.3402,[68]6.3541,[69]6.3791,[70]6.4114,[71]6.4328,[72]6.4626,[73]6.5277,[74]6.5331,[75]6.5475,[76]6.5638,[77]6.5771,[78]6.5619,[79]6.5915,[80]6.5839,[81]6.5968,[82]6.6005,[83]6.5468,[84]6.5323,[85]6.5209,[86]6.4997,[87]6.4344,[88]6.4059,[89]6.3854,[90]6.3688,[91]6.3949,[92]6.3910,[93]6.3936,[94]6.3911,[95]6.4198,[96]6.4178,[97]6.4106,[98]6.4036,[99]6.3896,[100]6.3896,[101]6.4155,[102]6.4091,[103]6.4309,[104]6.4376,[105]6.4362,[106]6.4539,[107]6.4526,[108]6.4649,[109]6.4596,[110]6.4551,[111]6.4779,[112]6.4970,[113]6.4984,[114]6.4950,[115]6.5033,[116]6.4959,[117]6.5014,[118]6.5299,[119]6.5508,[120]6.5872,[121]6.6035,[122]6.6283,[123]6.6673,[124]6.6850,[125]6.6763,[126]6.7154,[127]6.7524,[128]6.7799,[129]6.7630,[130]6.7725,[131]6.7673,[132]6.7584,[133]6.7457,[134]6.7568,[135]6.7534,[136]6.7402,[137]6.7322,[138]6.7151,[139]6.7035,[140]6.7005,[141]6.6707,[142]6.6659,[143]6.6379,[144]6.6178,[145]6.6092,[146]6.5957,[147]6.6031,[148]6.6054,[149]6.5994,[150]6.5953,[151]6.5965,[152]6.5870,[153]6.5703,[154]6.5613,[155]6.5680,[156]6.5630,[157]6.5813,[158]6.5849,[159]6.5890,[160]6.5916,[161]6.6041,[162]6.5739,[163]6.5619,[164]6.5357,[165]6.5039,[166]6.4751,[167]6.4377,[168]6.4051,[169]6.3916,[170]6.3791,[171]6.3502,[172]6.3322,[173]6.3136,[174]6.2829,[175]6.2607,[176]6.2505,[177]6.2295,[178]6.2059,[179]6.1887,[180]6.1798,[181]6.1574,[182]6.1382,[183]6.1240,[184]6.1238,[185]6.1165,[186]6.1182,[187]6.1237,[188]6.1200,[189]6.1384,[190]6.1393,[191]6.1597,[192]6.1761,[193]6.1938,[194]6.2054,[195]6.2264,[196]6.2434,[197]6.2655,[198]6.2811,[199]6.2840,[200]6.2886,[201]6.2844,[202]6.3049,[203]6.3115,[204]6.3114,[205]6.3224,[206]6.3302,[207]6.3262,[208]6.3347,[209]6.3398,[210]6.3449,[211]6.3547,[212]6.3621,[213]6.3727,[214]6.3763,[215]6.3803,[216]6.3951,[217]6.4129,[218]6.4264,[219]6.4267,[220]6.4231,[221]6.4168,[222]6.4133,[223]6.4024,[224]6.3958,[225]6.3910,[226]6.4126,[227]6.4212,[228]6.4271,[229]6.4338,[230]6.4294,[231]6.4463,[232]6.4332,[233]6.4160,[234]6.4004,[235]6.3846,[236]6.3768,[237]6.3664,[238]6.3698,[239]6.3536,[240]6.3433,[241]6.3466,[242]6.3504,[243]6.3488,[244]6.3368,[245]6.3342,[246]6.3221,[247]6.3098,[248]6.3030,[249]6.3010,[250]6.3057,[251]6.2981,[252]6.2947,[253]6.2844,[254]6.2804,[255]6.2688,[256]6.2497,[257]6.2386,[258]6.2299,[259]6.2279,[260]6.2197,[261]6.2154,[262]6.2095,[263]6.2050,[264]6.1858,[265]6.1850,[266]6.1835,[267]6.1766,[268]6.1863,[269]6.1843,[270]6.1850,[271]6.1928,[272]6.1974,[273]6.1969,[274]6.1983,[275]6.2073,[276]6.2128,[277]6.2288,[278]6.2397,[279]6.2483,[280]6.2518,[281]6.2617,[282]6.2678,[283]6.2825,[284]6.2902,[285]6.2997,[286]6.3144,[287]6.3138,[288]6.3198,[289]6.3107,[290]6.2956,[291]6.2802,[292]6.2644,[293]6.2505,[294]6.2530,[295]6.2524,[296]6.2567,[297]6.2553,[298]6.2579,[299]6.2551,[300]6.2439,[301]6.2440,[302]6.2359,[303]6.2282,[304]6.2204,[305]6.2180,[306]6.2047,[307]6.2072,[308]6.2104,[309]6.1941,[310]6.1880,[311]6.1816,[312]6.1838,[313]6.1782,[314]6.1769,[315]6.1604,[316]6.1562,[317]6.1395,[318]6.1179,[319]6.1298,[320]6.1428,[321]6.1466,[322]6.1422,[323]6.1355,[324]6.1331,[325]6.1431,[326]6.1430,[327]6.1451,[328]6.1494,[329]6.1554,[330]6.1579,[331]6.1703,[332]6.1671,[333]6.1741,[334]6.1682,[335]6.1618,[336]6.1655,[337]6.1625,[338]6.1612,[339]6.1555,[340]6.1511,[341]6.1589,[342]6.1614,[343]6.1669,[344]6.1668,[345]6.1667,[346]6.1638,[347]6.1686,[348]6.1727,[349]6.1746,[350]6.1712,[351]6.1717,[352]6.1717,[353]6.1665,[354]6.1664,[355]6.1718,[356]6.1749,[357]6.1712,[358]6.1802,[359]6.1833,[360]6.1795,[361]6.1791,[362]6.1858,[363]6.1970,[364]6.2035,[365]6.2093,[366]6.2100,[367]6.2188,[368]6.2166,[369]6.2175,[370]6.2185,[371]6.2125,[372]6.2178,[373]6.2234,[374]6.2221,[375]6.2217,[376]6.2301,[377]6.2252,[378]6.2278,[379]6.2338,[380]6.2254,[381]6.2211,[382]6.2154,[383]6.2144,[384]6.2137,[385]6.2124,[386]6.2119,[387]6.2111,[388]6.2066,[389]6.2012,[390]6.1943,[391]6.1862,[392]6.1822,[393]6.1803,[394]6.1828,[395]6.1812,[396]6.1738,[397]6.1814,[398]6.1852,[399]6.1935,[400]6.1931,[401]6.1945,[402]6.1950,[403]6.1969,[404]6.2032,[405]6.1937,[406]6.1903,[407]6.1895,[408]6.1905,[409]6.2029,[410]6.2139,[411]6.2264,[412]6.2427,[413]6.2542,[414]6.2618,[415]6.2670,[416]6.2750,[417]6.2881,[418]6.2916,[419]6.2990,[420]6.3077,[421]6.3197,[422]6.3255,[423]6.3326,[424]6.3446,[425]6.3537,[426]6.3602,[427]6.3647,[428]6.3730,[429]6.3775,[430]6.3865,[431]6.4011,[432]6.4054,[433]6.4041,[434]6.3995,[435]6.4002,[436]6.4027,[437]6.4121,[438]6.4200,[439]6.4164,[440]6.4158,[441]6.4108,[442]6.4099,[443]6.4112,[444]6.4115,[445]6.4095,[446]6.4118,[447]6.4147,[448]6.4191,[449]6.4164,[450]6.4167,[451]6.4124,[452]6.4006,[453]6.3922,[454]6.3862,[455]6.3869,[456]6.3917,[457]6.3934,[458]6.3912,[459]6.3922,[460]6.4009,[461]6.3981,[462]6.3965,[463]6.4016,[464]6.4007,[465]6.3976,[466]6.3895,[467]6.3898,[468]6.3897,[469]6.3919,[470]6.3924,[471]6.3876,[472]6.3923,[473]6.3866,[474]6.3880,[475]6.3821,[476]6.3844,[477]6.3773,[478]6.3764,[479]6.3827,[480]6.3879,[481]6.3899,[482]6.3854,[483]6.3813,[484]6.3835,[485]6.3818,[486]6.3763,[487]6.3763,[488]6.3744,[489]6.3694,[490]6.3668,[491]6.3637,[492]6.3579,[493]6.3549,[494]6.3531,[495]6.3528,[496]6.3493,[497]6.3440,[498]6.3422,[499]6.3372,[500]6.3275,[501]6.3206,[502]6.3204,[503]6.3202,[504]6.3109,[505]6.3134,[506]6.3143,[507]6.3081,[508]6.3038,[509]6.3027,[510]6.3067,[511]6.3113,[512]6.3148,[513]6.3166,[514]6.3233,[515]6.3177,[516]6.3169,[517]6.3180,[518]6.3181,[519]6.3211,[520]6.3238,[521]6.3255,[522]6.3284,[523]6.3294,[524]6.3357,[525]6.3394,[526]6.3406,[527]6.3426,[528]6.3372,[529]6.3376,[530]6.3329,[531]6.3319,[532]6.3368,[533]6.3391,[534]6.3372,[535]6.3395,[536]6.3341,[537]6.3318,[538]6.3366,[539]6.3378,[540]6.3417,[541]6.3426,[542]6.3433,[543]6.3447,[544]6.3459,[545]6.3437,[546]6.3444,[547]6.3398,[548]6.3343,[549]6.3345,[550]6.3318,[551]6.3280,[552]6.3260,[553]6.3217,[554]6.3195,[555]6.3166,[556]6.3163,[557]6.3186,[558]6.3146,[559]6.3142,[560]6.3137,[561]6.3139,[562]6.3120,[563]6.3120,[564]6.3163,[565]6.3180,[566]6.3177,[567]6.3155,[568]6.3160,[569]6.3144,[570]6.3170,[571]6.3176,[572]6.3186,[573]6.3188,[574]6.3151,[575]6.3147,[576]6.3145,[577]6.3135,[578]6.3114,[579]6.3122,[580]6.3056,[581]6.3018,[582]6.3008,[583]6.3016,[584]6.3020,[585]6.2943,[586]6.2875,[587]6.2878,[588]6.2927,[589]6.2985,[590]6.3015,[591]6.3037,[592]6.3022,[593]6.2985,[594]6.2996,[595]6.2973,[596]6.3010,[597]6.2987,[598]6.2949,[599]6.2971,[600]6.2969,[601]6.2954,[602]6.2971,[603]6.3001,[604]6.3012,[605]6.3044,[606]6.3065,[607]6.3048,[608]6.3013,[609]6.3019,[610]6.3056,[611]6.3037,[612]6.3062,[613]6.3026,[614]6.2975,[615]6.2898,[616]6.2928,[617]6.2865,[618]6.2814,[619]6.2757,[620]6.2615,[621]6.2542,[622]6.2525,[623]6.2540,[624]6.2545,[625]6.2544,[626]6.2529,[627]6.2550,[628]6.2555,[629]6.2552,[630]6.2586,[631]6.2650,[632]6.2704,[633]6.2687,[634]6.2720,[635]6.2726,[636]6.2694,[637]6.2659,[638]6.2686,[639]6.2657,[640]6.2666,[641]6.2669,[642]6.2738,[643]6.2759,[644]6.2772,[645]6.2750,[646]6.2793,[647]6.2755,[648]6.2761,[649]6.2762,[650]6.2801,[651]6.2858,[652]6.2865,[653]6.2908,[654]6.2844,[655]6.2838,

llama_print_timings:        load time = 17052.08 ms
llama_print_timings:      sample time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings: prompt eval time = 4082758.39 ms / 335360 tokens (   12.17 ms per token)
llama_print_timings:        eval time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings:       total time = 4118136.22 ms

7B Q4_0, Docker: [655]6.2819,

docker run -it --rm -v$PWD/models:/models --device /dev/dri --device /dev/kfd llama.cpp:rocm perplexity -m /models/llama-7b-q4_0.bin --no-mmap -f /models/wiki.test.raw

main: seed = 1682287852
llama.cpp: loading model from /models/llama-7b-q4_0.bin
llama_model_load_internal: format     = ggjt v1 (latest)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 512
llama_model_load_internal: n_embd     = 4096
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 32
llama_model_load_internal: n_layer    = 32
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: ftype      = 2 (mostly Q4_0)
llama_model_load_internal: n_ff       = 11008
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 4113739.11 KB
llama_model_load_internal: mem required  = 5809.32 MB (+ 1026.00 MB per state)
....................................................................................................
llama_init_from_file: kv self size  =  256.00 MB

system_info: n_threads = 8 / 8 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
perplexity : calculating perplexity over 655 chunks, batch_size=512
5.45 seconds per pass - ETA 59 minutes
[1]4.3749,[2]4.9542,[3]5.8256,[4]6.4671,[5]6.5411,[6]6.5396,[7]6.7157,[8]6.8048,[9]7.1739,[10]7.4104,[11]7.6551,[12]7.6928,[13]7.6023,[14]7.6784,[15]7.9332,[16]7.5387,[17]7.4157,[18]7.3769,[19]7.0053,[20]6.9922,[21]6.8948,[22]6.7103,[23]6.6723,[24]6.5850,[25]6.5849,[26]6.4126,[27]6.2327,[28]6.1319,[29]6.0478,[30]5.8917,[31]5.8635,[32]5.8813,[33]5.8165,[34]5.8513,[35]5.8770,[36]5.9209,[37]5.9248,[38]5.9420,[39]5.9801,[40]6.0388,[41]6.0459,[42]6.0803,[43]6.0375,[44]6.0922,[45]6.0966,[46]6.0707,[47]6.0945,[48]6.0653,[49]6.0723,[50]6.0329,[51]6.0287,[52]6.0178,[53]6.0619,[54]6.0455,[55]6.0231,[56]6.0573,[57]6.0803,[58]6.1021,[59]6.1160,[60]6.1625,[61]6.1512,[62]6.2143,[63]6.2479,[64]6.2631,[65]6.3095,[66]6.3198,[67]6.3379,[68]6.3518,[69]6.3767,[70]6.4090,[71]6.4305,[72]6.4603,[73]6.5255,[74]6.5309,[75]6.5453,[76]6.5617,[77]6.5750,[78]6.5597,[79]6.5892,[80]6.5817,[81]6.5943,[82]6.5980,[83]6.5443,[84]6.5297,[85]6.5182,[86]6.4971,[87]6.4318,[88]6.4033,[89]6.3828,[90]6.3662,[91]6.3922,[92]6.3884,[93]6.3909,[94]6.3885,[95]6.4172,[96]6.4151,[97]6.4078,[98]6.4007,[99]6.3868,[100]6.3867,[101]6.4126,[102]6.4063,[103]6.4280,[104]6.4347,[105]6.4333,[106]6.4510,[107]6.4497,[108]6.4621,[109]6.4568,[110]6.4523,[111]6.4752,[112]6.4942,[113]6.4955,[114]6.4921,[115]6.5004,[116]6.4931,[117]6.4986,[118]6.5271,[119]6.5480,[120]6.5844,[121]6.6008,[122]6.6255,[123]6.6645,[124]6.6823,[125]6.6736,[126]6.7127,[127]6.7497,[128]6.7772,[129]6.7604,[130]6.7699,[131]6.7647,[132]6.7558,[133]6.7431,[134]6.7543,[135]6.7508,[136]6.7377,[137]6.7297,[138]6.7126,[139]6.7009,[140]6.6979,[141]6.6682,[142]6.6633,[143]6.6354,[144]6.6154,[145]6.6067,[146]6.5931,[147]6.6007,[148]6.6030,[149]6.5969,[150]6.5928,[151]6.5940,[152]6.5846,[153]6.5678,[154]6.5587,[155]6.5655,[156]6.5605,[157]6.5789,[158]6.5824,[159]6.5866,[160]6.5893,[161]6.6017,[162]6.5716,[163]6.5596,[164]6.5334,[165]6.5016,[166]6.4728,[167]6.4355,[168]6.4028,[169]6.3893,[170]6.3768,[171]6.3479,[172]6.3299,[173]6.3113,[174]6.2806,[175]6.2585,[176]6.2483,[177]6.2272,[178]6.2037,[179]6.1865,[180]6.1776,[181]6.1552,[182]6.1360,[183]6.1217,[184]6.1216,[185]6.1143,[186]6.1160,[187]6.1215,[188]6.1178,[189]6.1363,[190]6.1371,[191]6.1575,[192]6.1739,[193]6.1916,[194]6.2033,[195]6.2243,[196]6.2413,[197]6.2633,[198]6.2789,[199]6.2818,[200]6.2864,[201]6.2822,[202]6.3027,[203]6.3093,[204]6.3092,[205]6.3201,[206]6.3279,[207]6.3239,[208]6.3324,[209]6.3376,[210]6.3426,[211]6.3524,[212]6.3598,[213]6.3704,[214]6.3740,[215]6.3780,[216]6.3928,[217]6.4107,[218]6.4241,[219]6.4244,[220]6.4209,[221]6.4146,[222]6.4110,[223]6.4002,[224]6.3936,[225]6.3888,[226]6.4104,[227]6.4191,[228]6.4250,[229]6.4317,[230]6.4273,[231]6.4442,[232]6.4311,[233]6.4140,[234]6.3984,[235]6.3825,[236]6.3748,[237]6.3644,[238]6.3678,[239]6.3516,[240]6.3413,[241]6.3446,[242]6.3484,[243]6.3468,[244]6.3349,[245]6.3323,[246]6.3201,[247]6.3078,[248]6.3010,[249]6.2990,[250]6.3037,[251]6.2961,[252]6.2927,[253]6.2825,[254]6.2785,[255]6.2669,[256]6.2477,[257]6.2367,[258]6.2280,[259]6.2260,[260]6.2178,[261]6.2135,[262]6.2076,[263]6.2031,[264]6.1838,[265]6.1830,[266]6.1815,[267]6.1745,[268]6.1842,[269]6.1822,[270]6.1829,[271]6.1907,[272]6.1953,[273]6.1948,[274]6.1963,[275]6.2052,[276]6.2107,[277]6.2268,[278]6.2376,[279]6.2462,[280]6.2497,[281]6.2596,[282]6.2657,[283]6.2804,[284]6.2882,[285]6.2976,[286]6.3123,[287]6.3117,[288]6.3177,[289]6.3086,[290]6.2935,[291]6.2781,[292]6.2623,[293]6.2485,[294]6.2509,[295]6.2504,[296]6.2547,[297]6.2533,[298]6.2559,[299]6.2531,[300]6.2419,[301]6.2420,[302]6.2339,[303]6.2262,[304]6.2184,[305]6.2160,[306]6.2028,[307]6.2052,[308]6.2084,[309]6.1921,[310]6.1860,[311]6.1796,[312]6.1819,[313]6.1763,[314]6.1750,[315]6.1585,[316]6.1542,[317]6.1375,[318]6.1159,[319]6.1278,[320]6.1409,[321]6.1447,[322]6.1402,[323]6.1335,[324]6.1311,[325]6.1411,[326]6.1410,[327]6.1432,[328]6.1474,[329]6.1534,[330]6.1559,[331]6.1683,[332]6.1651,[333]6.1721,[334]6.1662,[335]6.1598,[336]6.1635,[337]6.1605,[338]6.1592,[339]6.1535,[340]6.1492,[341]6.1569,[342]6.1594,[343]6.1649,[344]6.1648,[345]6.1648,[346]6.1619,[347]6.1666,[348]6.1708,[349]6.1727,[350]6.1693,[351]6.1698,[352]6.1698,[353]6.1646,[354]6.1644,[355]6.1699,[356]6.1730,[357]6.1693,[358]6.1784,[359]6.1815,[360]6.1777,[361]6.1773,[362]6.1839,[363]6.1951,[364]6.2016,[365]6.2075,[366]6.2082,[367]6.2169,[368]6.2147,[369]6.2156,[370]6.2167,[371]6.2107,[372]6.2159,[373]6.2216,[374]6.2202,[375]6.2199,[376]6.2283,[377]6.2234,[378]6.2259,[379]6.2320,[380]6.2235,[381]6.2193,[382]6.2135,[383]6.2125,[384]6.2118,[385]6.2106,[386]6.2100,[387]6.2092,[388]6.2047,[389]6.1993,[390]6.1924,[391]6.1844,[392]6.1803,[393]6.1784,[394]6.1810,[395]6.1794,[396]6.1720,[397]6.1795,[398]6.1834,[399]6.1917,[400]6.1913,[401]6.1927,[402]6.1932,[403]6.1951,[404]6.2014,[405]6.1919,[406]6.1885,[407]6.1877,[408]6.1887,[409]6.2011,[410]6.2121,[411]6.2246,[412]6.2409,[413]6.2524,[414]6.2600,[415]6.2652,[416]6.2732,[417]6.2863,[418]6.2898,[419]6.2972,[420]6.3058,[421]6.3179,[422]6.3237,[423]6.3308,[424]6.3428,[425]6.3519,[426]6.3583,[427]6.3628,[428]6.3711,[429]6.3756,[430]6.3846,[431]6.3992,[432]6.4035,[433]6.4022,[434]6.3976,[435]6.3983,[436]6.4008,[437]6.4102,[438]6.4181,[439]6.4146,[440]6.4140,[441]6.4089,[442]6.4080,[443]6.4094,[444]6.4097,[445]6.4076,[446]6.4100,[447]6.4129,[448]6.4172,[449]6.4145,[450]6.4149,[451]6.4106,[452]6.3987,[453]6.3904,[454]6.3843,[455]6.3851,[456]6.3898,[457]6.3915,[458]6.3894,[459]6.3904,[460]6.3991,[461]6.3962,[462]6.3946,[463]6.3997,[464]6.3988,[465]6.3958,[466]6.3877,[467]6.3880,[468]6.3879,[469]6.3901,[470]6.3906,[471]6.3858,[472]6.3904,[473]6.3848,[474]6.3862,[475]6.3803,[476]6.3826,[477]6.3754,[478]6.3745,[479]6.3808,[480]6.3860,[481]6.3880,[482]6.3835,[483]6.3793,[484]6.3816,[485]6.3798,[486]6.3743,[487]6.3743,[488]6.3724,[489]6.3674,[490]6.3647,[491]6.3617,[492]6.3559,[493]6.3528,[494]6.3510,[495]6.3508,[496]6.3473,[497]6.3419,[498]6.3402,[499]6.3352,[500]6.3255,[501]6.3185,[502]6.3184,[503]6.3182,[504]6.3088,[505]6.3113,[506]6.3122,[507]6.3061,[508]6.3018,[509]6.3007,[510]6.3046,[511]6.3092,[512]6.3127,[513]6.3146,[514]6.3212,[515]6.3157,[516]6.3149,[517]6.3159,[518]6.3160,[519]6.3190,[520]6.3218,[521]6.3234,[522]6.3263,[523]6.3274,[524]6.3336,[525]6.3373,[526]6.3385,[527]6.3405,[528]6.3351,[529]6.3356,[530]6.3308,[531]6.3298,[532]6.3347,[533]6.3370,[534]6.3351,[535]6.3375,[536]6.3321,[537]6.3297,[538]6.3345,[539]6.3357,[540]6.3397,[541]6.3406,[542]6.3413,[543]6.3426,[544]6.3438,[545]6.3417,[546]6.3423,[547]6.3378,[548]6.3323,[549]6.3325,[550]6.3298,[551]6.3260,[552]6.3240,[553]6.3197,[554]6.3175,[555]6.3146,[556]6.3143,[557]6.3166,[558]6.3126,[559]6.3122,[560]6.3117,[561]6.3119,[562]6.3100,[563]6.3100,[564]6.3144,[565]6.3161,[566]6.3158,[567]6.3135,[568]6.3141,[569]6.3124,[570]6.3150,[571]6.3157,[572]6.3167,[573]6.3168,[574]6.3132,[575]6.3127,[576]6.3126,[577]6.3116,[578]6.3095,[579]6.3103,[580]6.3037,[581]6.2999,[582]6.2990,[583]6.2997,[584]6.3001,[585]6.2924,[586]6.2856,[587]6.2858,[588]6.2908,[589]6.2966,[590]6.2996,[591]6.3018,[592]6.3003,[593]6.2966,[594]6.2977,[595]6.2954,[596]6.2991,[597]6.2968,[598]6.2930,[599]6.2952,[600]6.2950,[601]6.2935,[602]6.2953,[603]6.2982,[604]6.2992,[605]6.3025,[606]6.3046,[607]6.3029,[608]6.2994,[609]6.3000,[610]6.3037,[611]6.3018,[612]6.3043,[613]6.3007,[614]6.2956,[615]6.2879,[616]6.2909,[617]6.2846,[618]6.2795,[619]6.2738,[620]6.2596,[621]6.2524,[622]6.2506,[623]6.2521,[624]6.2526,[625]6.2525,[626]6.2510,[627]6.2531,[628]6.2536,[629]6.2534,[630]6.2568,[631]6.2631,[632]6.2685,[633]6.2668,[634]6.2701,[635]6.2707,[636]6.2675,[637]6.2640,[638]6.2667,[639]6.2638,[640]6.2647,[641]6.2650,[642]6.2719,[643]6.2740,[644]6.2753,[645]6.2732,[646]6.2774,[647]6.2736,[648]6.2743,[649]6.2744,[650]6.2782,[651]6.2839,[652]6.2846,[653]6.2889,[654]6.2825,[655]6.2819,

llama_print_timings:        load time =  6811.20 ms
llama_print_timings:      sample time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings: prompt eval time = 3334178.62 ms / 335360 tokens (    9.94 ms per token)
llama_print_timings:        eval time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings:       total time = 3367197.45 ms

7B Q4_0 --memory_f32, Docker: [655]6.2838,

docker run -it --rm -v$PWD/models:/models --device /dev/dri --device /dev/kfd llama.cpp:rocm perplexity -m /models/llama-7b-q4_0.bin --memory_f32 -f /models/wiki.test.raw

main: seed = 1682331507
llama.cpp: loading model from /models/llama-7b-q4_0.bin
llama_model_load_internal: format     = ggjt v1 (latest)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 512
llama_model_load_internal: n_embd     = 4096
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 32
llama_model_load_internal: n_layer    = 32
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: ftype      = 2 (mostly Q4_0)
llama_model_load_internal: n_ff       = 11008
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size =  59.11 KB
llama_model_load_internal: mem required  = 5809.32 MB (+ 2052.00 MB per state)
llama_init_from_file: kv self size  =  512.00 MB

system_info: n_threads = 8 / 8 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
perplexity : calculating perplexity over 655 chunks, batch_size=512
6.72 seconds per pass - ETA 1 hours 13 minutes
[1]4.3801,[2]4.9555,[3]5.8269,[4]6.4692,[5]6.5436,[6]6.5413,[7]6.7175,[8]6.8070,[9]7.1756,[10]7.4121,[11]7.6567,[12]7.6957,[13]7.6057,[14]7.6821,[15]7.9367,[16]7.5419,[17]7.4189,[18]7.3798,[19]7.0077,[20]6.9948,[21]6.8969,[22]6.7124,[23]6.6744,[24]6.5868,[25]6.5871,[26]6.4149,[27]6.2349,[28]6.1341,[29]6.0499,[30]5.8939,[31]5.8660,[32]5.8840,[33]5.8189,[34]5.8538,[35]5.8795,[36]5.9233,[37]5.9272,[38]5.9444,[39]5.9825,[40]6.0413,[41]6.0483,[42]6.0827,[43]6.0398,[44]6.0945,[45]6.0989,[46]6.0730,[47]6.0968,[48]6.0675,[49]6.0746,[50]6.0352,[51]6.0311,[52]6.0201,[53]6.0642,[54]6.0477,[55]6.0251,[56]6.0595,[57]6.0826,[58]6.1044,[59]6.1183,[60]6.1648,[61]6.1537,[62]6.2167,[63]6.2503,[64]6.2654,[65]6.3119,[66]6.3221,[67]6.3402,[68]6.3542,[69]6.3791,[70]6.4114,[71]6.4328,[72]6.4626,[73]6.5278,[74]6.5331,[75]6.5475,[76]6.5638,[77]6.5771,[78]6.5619,[79]6.5915,[80]6.5840,[81]6.5968,[82]6.6005,[83]6.5468,[84]6.5323,[85]6.5209,[86]6.4998,[87]6.4344,[88]6.4060,[89]6.3854,[90]6.3688,[91]6.3949,[92]6.3910,[93]6.3936,[94]6.3911,[95]6.4198,[96]6.4178,[97]6.4106,[98]6.4036,[99]6.3896,[100]6.3896,[101]6.4155,[102]6.4091,[103]6.4309,[104]6.4377,[105]6.4362,[106]6.4539,[107]6.4526,[108]6.4649,[109]6.4596,[110]6.4551,[111]6.4780,[112]6.4970,[113]6.4984,[114]6.4950,[115]6.5033,[116]6.4959,[117]6.5014,[118]6.5299,[119]6.5508,[120]6.5872,[121]6.6035,[122]6.6283,[123]6.6673,[124]6.6850,[125]6.6763,[126]6.7154,[127]6.7524,[128]6.7799,[129]6.7630,[130]6.7725,[131]6.7673,[132]6.7585,[133]6.7457,[134]6.7569,[135]6.7534,[136]6.7402,[137]6.7322,[138]6.7151,[139]6.7035,[140]6.7005,[141]6.6707,[142]6.6659,[143]6.6379,[144]6.6178,[145]6.6092,[146]6.5957,[147]6.6032,[148]6.6054,[149]6.5994,[150]6.5953,[151]6.5965,[152]6.5870,[153]6.5703,[154]6.5613,[155]6.5680,[156]6.5630,[157]6.5814,[158]6.5849,[159]6.5891,[160]6.5916,[161]6.6041,[162]6.5739,[163]6.5619,[164]6.5357,[165]6.5039,[166]6.4751,[167]6.4378,[168]6.4051,[169]6.3916,[170]6.3791,[171]6.3502,[172]6.3322,[173]6.3136,[174]6.2829,[175]6.2608,[176]6.2505,[177]6.2295,[178]6.2059,[179]6.1887,[180]6.1798,[181]6.1574,[182]6.1382,[183]6.1240,[184]6.1238,[185]6.1165,[186]6.1182,[187]6.1237,[188]6.1200,[189]6.1384,[190]6.1393,[191]6.1597,[192]6.1761,[193]6.1938,[194]6.2054,[195]6.2264,[196]6.2434,[197]6.2655,[198]6.2811,[199]6.2840,[200]6.2886,[201]6.2844,[202]6.3049,[203]6.3115,[204]6.3114,[205]6.3224,[206]6.3302,[207]6.3262,[208]6.3347,[209]6.3398,[210]6.3449,[211]6.3547,[212]6.3621,[213]6.3727,[214]6.3763,[215]6.3803,[216]6.3951,[217]6.4129,[218]6.4264,[219]6.4267,[220]6.4231,[221]6.4168,[222]6.4133,[223]6.4024,[224]6.3958,[225]6.3910,[226]6.4126,[227]6.4212,[228]6.4271,[229]6.4338,[230]6.4294,[231]6.4463,[232]6.4332,[233]6.4160,[234]6.4004,[235]6.3846,[236]6.3768,[237]6.3664,[238]6.3698,[239]6.3536,[240]6.3433,[241]6.3466,[242]6.3504,[243]6.3488,[244]6.3368,[245]6.3342,[246]6.3221,[247]6.3098,[248]6.3030,[249]6.3010,[250]6.3057,[251]6.2981,[252]6.2947,[253]6.2844,[254]6.2804,[255]6.2688,[256]6.2497,[257]6.2386,[258]6.2299,[259]6.2279,[260]6.2197,[261]6.2154,[262]6.2095,[263]6.2050,[264]6.1858,[265]6.1850,[266]6.1835,[267]6.1766,[268]6.1863,[269]6.1843,[270]6.1850,[271]6.1928,[272]6.1974,[273]6.1969,[274]6.1983,[275]6.2073,[276]6.2128,[277]6.2288,[278]6.2397,[279]6.2483,[280]6.2518,[281]6.2617,[282]6.2678,[283]6.2825,[284]6.2903,[285]6.2997,[286]6.3144,[287]6.3138,[288]6.3198,[289]6.3107,[290]6.2956,[291]6.2802,[292]6.2644,[293]6.2505,[294]6.2530,[295]6.2524,[296]6.2567,[297]6.2553,[298]6.2579,[299]6.2551,[300]6.2439,[301]6.2440,[302]6.2359,[303]6.2282,[304]6.2204,[305]6.2180,[306]6.2047,[307]6.2072,[308]6.2104,[309]6.1941,[310]6.1880,[311]6.1816,[312]6.1838,[313]6.1782,[314]6.1769,[315]6.1604,[316]6.1562,[317]6.1395,[318]6.1179,[319]6.1298,[320]6.1429,[321]6.1466,[322]6.1422,[323]6.1356,[324]6.1331,[325]6.1431,[326]6.1430,[327]6.1451,[328]6.1494,[329]6.1554,[330]6.1579,[331]6.1703,[332]6.1671,[333]6.1741,[334]6.1682,[335]6.1618,[336]6.1655,[337]6.1625,[338]6.1612,[339]6.1555,[340]6.1511,[341]6.1589,[342]6.1614,[343]6.1669,[344]6.1668,[345]6.1667,[346]6.1638,[347]6.1686,[348]6.1727,[349]6.1746,[350]6.1712,[351]6.1717,[352]6.1717,[353]6.1665,[354]6.1664,[355]6.1718,[356]6.1749,[357]6.1712,[358]6.1802,[359]6.1833,[360]6.1795,[361]6.1791,[362]6.1858,[363]6.1970,[364]6.2035,[365]6.2093,[366]6.2100,[367]6.2188,[368]6.2166,[369]6.2175,[370]6.2185,[371]6.2125,[372]6.2178,[373]6.2234,[374]6.2221,[375]6.2217,[376]6.2301,[377]6.2252,[378]6.2278,[379]6.2338,[380]6.2254,[381]6.2211,[382]6.2154,[383]6.2144,[384]6.2137,[385]6.2124,[386]6.2119,[387]6.2111,[388]6.2066,[389]6.2012,[390]6.1943,[391]6.1862,[392]6.1822,[393]6.1803,[394]6.1828,[395]6.1812,[396]6.1738,[397]6.1814,[398]6.1852,[399]6.1935,[400]6.1931,[401]6.1945,[402]6.1950,[403]6.1969,[404]6.2032,[405]6.1937,[406]6.1903,[407]6.1895,[408]6.1905,[409]6.2029,[410]6.2139,[411]6.2264,[412]6.2427,[413]6.2542,[414]6.2618,[415]6.2670,[416]6.2750,[417]6.2881,[418]6.2916,[419]6.2990,[420]6.3077,[421]6.3197,[422]6.3255,[423]6.3326,[424]6.3446,[425]6.3537,[426]6.3602,[427]6.3647,[428]6.3730,[429]6.3775,[430]6.3865,[431]6.4011,[432]6.4054,[433]6.4041,[434]6.3995,[435]6.4002,[436]6.4027,[437]6.4121,[438]6.4200,[439]6.4164,[440]6.4158,[441]6.4108,[442]6.4099,[443]6.4112,[444]6.4115,[445]6.4095,[446]6.4118,[447]6.4147,[448]6.4191,[449]6.4164,[450]6.4167,[451]6.4124,[452]6.4006,[453]6.3922,[454]6.3862,[455]6.3869,[456]6.3917,[457]6.3934,[458]6.3912,[459]6.3922,[460]6.4009,[461]6.3981,[462]6.3965,[463]6.4016,[464]6.4007,[465]6.3976,[466]6.3895,[467]6.3898,[468]6.3897,[469]6.3919,[470]6.3924,[471]6.3876,[472]6.3923,[473]6.3866,[474]6.3880,[475]6.3821,[476]6.3844,[477]6.3773,[478]6.3764,[479]6.3827,[480]6.3879,[481]6.3899,[482]6.3854,[483]6.3813,[484]6.3835,[485]6.3818,[486]6.3763,[487]6.3763,[488]6.3744,[489]6.3694,[490]6.3667,[491]6.3637,[492]6.3579,[493]6.3549,[494]6.3531,[495]6.3528,[496]6.3493,[497]6.3440,[498]6.3422,[499]6.3372,[500]6.3275,[501]6.3206,[502]6.3204,[503]6.3202,[504]6.3109,[505]6.3134,[506]6.3143,[507]6.3081,[508]6.3038,[509]6.3027,[510]6.3067,[511]6.3113,[512]6.3148,[513]6.3166,[514]6.3233,[515]6.3177,[516]6.3169,[517]6.3180,[518]6.3181,[519]6.3211,[520]6.3238,[521]6.3255,[522]6.3283,[523]6.3294,[524]6.3357,[525]6.3394,[526]6.3406,[527]6.3426,[528]6.3372,[529]6.3376,[530]6.3329,[531]6.3319,[532]6.3368,[533]6.3391,[534]6.3372,[535]6.3395,[536]6.3341,[537]6.3318,[538]6.3366,[539]6.3378,[540]6.3417,[541]6.3426,[542]6.3433,[543]6.3447,[544]6.3459,[545]6.3437,[546]6.3444,[547]6.3398,[548]6.3343,[549]6.3345,[550]6.3318,[551]6.3280,[552]6.3260,[553]6.3217,[554]6.3195,[555]6.3166,[556]6.3163,[557]6.3186,[558]6.3146,[559]6.3142,[560]6.3137,[561]6.3139,[562]6.3120,[563]6.3120,[564]6.3163,[565]6.3180,[566]6.3177,[567]6.3155,[568]6.3160,[569]6.3144,[570]6.3170,[571]6.3176,[572]6.3186,[573]6.3188,[574]6.3151,[575]6.3147,[576]6.3145,[577]6.3135,[578]6.3114,[579]6.3122,[580]6.3056,[581]6.3018,[582]6.3008,[583]6.3016,[584]6.3020,[585]6.2943,[586]6.2875,[587]6.2877,[588]6.2927,[589]6.2985,[590]6.3015,[591]6.3037,[592]6.3022,[593]6.2985,[594]6.2996,[595]6.2973,[596]6.3010,[597]6.2987,[598]6.2949,[599]6.2971,[600]6.2969,[601]6.2954,[602]6.2971,[603]6.3001,[604]6.3011,[605]6.3044,[606]6.3065,[607]6.3048,[608]6.3013,[609]6.3019,[610]6.3056,[611]6.3037,[612]6.3062,[613]6.3026,[614]6.2975,[615]6.2898,[616]6.2928,[617]6.2865,[618]6.2814,[619]6.2757,[620]6.2614,[621]6.2542,[622]6.2525,[623]6.2540,[624]6.2545,[625]6.2544,[626]6.2529,[627]6.2550,[628]6.2555,[629]6.2552,[630]6.2586,[631]6.2650,[632]6.2704,[633]6.2687,[634]6.2720,[635]6.2726,[636]6.2694,[637]6.2659,[638]6.2686,[639]6.2657,[640]6.2666,[641]6.2669,[642]6.2738,[643]6.2759,[644]6.2772,[645]6.2750,[646]6.2793,[647]6.2755,[648]6.2761,[649]6.2762,[650]6.2801,[651]6.2858,[652]6.2865,[653]6.2908,[654]6.2844,[655]6.2838,

llama_print_timings:        load time = 11650.39 ms
llama_print_timings:      sample time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings: prompt eval time = 4610430.68 ms / 335360 tokens (   13.75 ms per token)
llama_print_timings:        eval time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings:       total time = 4646862.42 ms

7B F16, Docker: [655]5.9564,

docker run -it --rm -v$PWD/models:/models --device /dev/dri --device /dev/kfd llama.cpp:rocm perplexity -m /models/llama-7b-f16.bin --no-mmap -f /models/wiki.test.raw

main: seed = 1682338603
llama.cpp: loading model from /models/llama-7b-f16.bin
llama_model_load_internal: format     = ggjt v1 (latest)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 512
llama_model_load_internal: n_embd     = 4096
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 32
llama_model_load_internal: n_layer    = 32
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: ftype      = 1 (mostly F16)
llama_model_load_internal: n_ff       = 11008
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 13161547.11 KB
llama_model_load_internal: mem required  = 14645.07 MB (+ 1026.00 MB per state)
....................................................................................................
llama_init_from_file: kv self size  =  256.00 MB

system_info: n_threads = 8 / 8 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
perplexity : calculating perplexity over 655 chunks, batch_size=512
5.79 seconds per pass - ETA 1 hours 3 minutes
[1]4.2324,[2]4.7328,[3]5.5850,[4]6.1712,[5]6.2985,[6]6.2643,[7]6.4560,[8]6.5508,[9]6.8799,[10]7.1226,[11]7.3342,[12]7.3542,[13]7.2699,[14]7.3195,[15]7.5602,[16]7.1907,[17]7.0812,[18]7.0277,[19]6.6794,[20]6.6695,[21]6.5787,[22]6.4057,[23]6.3766,[24]6.2860,[25]6.2829,[26]6.1235,[27]5.9531,[28]5.8556,[29]5.7688,[30]5.6166,[31]5.5870,[32]5.6072,[33]5.5522,[34]5.5821,[35]5.6047,[36]5.6413,[37]5.6418,[38]5.6522,[39]5.6845,[40]5.7347,[41]5.7433,[42]5.7807,[43]5.7433,[44]5.7999,[45]5.8026,[46]5.7772,[47]5.7975,[48]5.7731,[49]5.7745,[50]5.7360,[51]5.7323,[52]5.7230,[53]5.7679,[54]5.7525,[55]5.7312,[56]5.7595,[57]5.7791,[58]5.7980,[59]5.8151,[60]5.8558,[61]5.8487,[62]5.9057,[63]5.9365,[64]5.9499,[65]5.9914,[66]5.9998,[67]6.0170,[68]6.0314,[69]6.0549,[70]6.0847,[71]6.1056,[72]6.1367,[73]6.1949,[74]6.1989,[75]6.2124,[76]6.2242,[77]6.2354,[78]6.2210,[79]6.2483,[80]6.2418,[81]6.2527,[82]6.2567,[83]6.2069,[84]6.1891,[85]6.1766,[86]6.1557,[87]6.0916,[88]6.0670,[89]6.0476,[90]6.0336,[91]6.0562,[92]6.0504,[93]6.0511,[94]6.0487,[95]6.0758,[96]6.0754,[97]6.0698,[98]6.0639,[99]6.0511,[100]6.0501,[101]6.0737,[102]6.0689,[103]6.0889,[104]6.0961,[105]6.0961,[106]6.1125,[107]6.1118,[108]6.1251,[109]6.1202,[110]6.1167,[111]6.1388,[112]6.1588,[113]6.1608,[114]6.1570,[115]6.1628,[116]6.1539,[117]6.1588,[118]6.1868,[119]6.2082,[120]6.2423,[121]6.2567,[122]6.2808,[123]6.3170,[124]6.3342,[125]6.3251,[126]6.3631,[127]6.3985,[128]6.4280,[129]6.4134,[130]6.4216,[131]6.4180,[132]6.4108,[133]6.3979,[134]6.4077,[135]6.4038,[136]6.3934,[137]6.3862,[138]6.3688,[139]6.3586,[140]6.3551,[141]6.3264,[142]6.3230,[143]6.2934,[144]6.2733,[145]6.2644,[146]6.2529,[147]6.2563,[148]6.2567,[149]6.2515,[150]6.2474,[151]6.2494,[152]6.2398,[153]6.2242,[154]6.2158,[155]6.2226,[156]6.2179,[157]6.2344,[158]6.2386,[159]6.2430,[160]6.2457,[161]6.2574,[162]6.2300,[163]6.2188,[164]6.1960,[165]6.1661,[166]6.1397,[167]6.1037,[168]6.0740,[169]6.0606,[170]6.0500,[171]6.0241,[172]6.0076,[173]5.9916,[174]5.9623,[175]5.9412,[176]5.9301,[177]5.9106,[178]5.8884,[179]5.8719,[180]5.8626,[181]5.8416,[182]5.8242,[183]5.8109,[184]5.8101,[185]5.8029,[186]5.8040,[187]5.8101,[188]5.8063,[189]5.8232,[190]5.8240,[191]5.8445,[192]5.8602,[193]5.8764,[194]5.8872,[195]5.9080,[196]5.9233,[197]5.9438,[198]5.9585,[199]5.9614,[200]5.9663,[201]5.9611,[202]5.9793,[203]5.9863,[204]5.9848,[205]5.9948,[206]6.0016,[207]5.9979,[208]6.0061,[209]6.0101,[210]6.0151,[211]6.0257,[212]6.0326,[213]6.0428,[214]6.0451,[215]6.0475,[216]6.0614,[217]6.0792,[218]6.0920,[219]6.0918,[220]6.0883,[221]6.0832,[222]6.0811,[223]6.0719,[224]6.0648,[225]6.0611,[226]6.0812,[227]6.0890,[228]6.0942,[229]6.1002,[230]6.0970,[231]6.1133,[232]6.1020,[233]6.0861,[234]6.0718,[235]6.0519,[236]6.0454,[237]6.0361,[238]6.0388,[239]6.0245,[240]6.0147,[241]6.0165,[242]6.0202,[243]6.0185,[244]6.0076,[245]6.0047,[246]5.9939,[247]5.9826,[248]5.9756,[249]5.9732,[250]5.9778,[251]5.9710,[252]5.9679,[253]5.9586,[254]5.9534,[255]5.9426,[256]5.9254,[257]5.9136,[258]5.9058,[259]5.9036,[260]5.8957,[261]5.8916,[262]5.8862,[263]5.8811,[264]5.8589,[265]5.8584,[266]5.8566,[267]5.8502,[268]5.8588,[269]5.8570,[270]5.8580,[271]5.8655,[272]5.8688,[273]5.8691,[274]5.8716,[275]5.8797,[276]5.8855,[277]5.9009,[278]5.9107,[279]5.9200,[280]5.9227,[281]5.9323,[282]5.9380,[283]5.9524,[284]5.9602,[285]5.9686,[286]5.9819,[287]5.9814,[288]5.9871,[289]5.9791,[290]5.9639,[291]5.9494,[292]5.9350,[293]5.9221,[294]5.9243,[295]5.9235,[296]5.9281,[297]5.9269,[298]5.9297,[299]5.9273,[300]5.9169,[301]5.9169,[302]5.9093,[303]5.9010,[304]5.8929,[305]5.8895,[306]5.8773,[307]5.8795,[308]5.8825,[309]5.8673,[310]5.8620,[311]5.8558,[312]5.8579,[313]5.8525,[314]5.8509,[315]5.8356,[316]5.8304,[317]5.8147,[318]5.7950,[319]5.8065,[320]5.8185,[321]5.8229,[322]5.8190,[323]5.8124,[324]5.8097,[325]5.8197,[326]5.8199,[327]5.8220,[328]5.8258,[329]5.8316,[330]5.8342,[331]5.8463,[332]5.8435,[333]5.8502,[334]5.8449,[335]5.8390,[336]5.8428,[337]5.8406,[338]5.8399,[339]5.8350,[340]5.8308,[341]5.8387,[342]5.8415,[343]5.8462,[344]5.8463,[345]5.8468,[346]5.8444,[347]5.8484,[348]5.8518,[349]5.8541,[350]5.8509,[351]5.8517,[352]5.8517,[353]5.8461,[354]5.8462,[355]5.8512,[356]5.8542,[357]5.8508,[358]5.8597,[359]5.8622,[360]5.8590,[361]5.8586,[362]5.8654,[363]5.8764,[364]5.8823,[365]5.8874,[366]5.8887,[367]5.8971,[368]5.8948,[369]5.8957,[370]5.8971,[371]5.8919,[372]5.8966,[373]5.9012,[374]5.8997,[375]5.8998,[376]5.9063,[377]5.9020,[378]5.9047,[379]5.9104,[380]5.9027,[381]5.8994,[382]5.8945,[383]5.8938,[384]5.8934,[385]5.8924,[386]5.8919,[387]5.8917,[388]5.8882,[389]5.8832,[390]5.8765,[391]5.8691,[392]5.8652,[393]5.8636,[394]5.8661,[395]5.8649,[396]5.8579,[397]5.8648,[398]5.8685,[399]5.8760,[400]5.8762,[401]5.8776,[402]5.8786,[403]5.8805,[404]5.8869,[405]5.8775,[406]5.8743,[407]5.8739,[408]5.8755,[409]5.8868,[410]5.8975,[411]5.9086,[412]5.9240,[413]5.9348,[414]5.9422,[415]5.9476,[416]5.9552,[417]5.9669,[418]5.9704,[419]5.9770,[420]5.9856,[421]5.9969,[422]6.0009,[423]6.0078,[424]6.0182,[425]6.0267,[426]6.0329,[427]6.0372,[428]6.0453,[429]6.0503,[430]6.0583,[431]6.0720,[432]6.0758,[433]6.0751,[434]6.0711,[435]6.0720,[436]6.0745,[437]6.0839,[438]6.0912,[439]6.0882,[440]6.0873,[441]6.0824,[442]6.0810,[443]6.0823,[444]6.0828,[445]6.0810,[446]6.0833,[447]6.0862,[448]6.0903,[449]6.0879,[450]6.0888,[451]6.0850,[452]6.0715,[453]6.0631,[454]6.0575,[455]6.0585,[456]6.0631,[457]6.0651,[458]6.0629,[459]6.0635,[460]6.0719,[461]6.0692,[462]6.0679,[463]6.0717,[464]6.0706,[465]6.0679,[466]6.0604,[467]6.0605,[468]6.0603,[469]6.0623,[470]6.0627,[471]6.0581,[472]6.0623,[473]6.0572,[474]6.0584,[475]6.0523,[476]6.0539,[477]6.0469,[478]6.0458,[479]6.0513,[480]6.0557,[481]6.0574,[482]6.0531,[483]6.0491,[484]6.0510,[485]6.0489,[486]6.0432,[487]6.0429,[488]6.0407,[489]6.0360,[490]6.0337,[491]6.0308,[492]6.0253,[493]6.0226,[494]6.0209,[495]6.0204,[496]6.0167,[497]6.0112,[498]6.0095,[499]6.0053,[500]5.9962,[501]5.9897,[502]5.9899,[503]5.9894,[504]5.9808,[505]5.9830,[506]5.9837,[507]5.9780,[508]5.9741,[509]5.9735,[510]5.9769,[511]5.9814,[512]5.9849,[513]5.9869,[514]5.9930,[515]5.9877,[516]5.9867,[517]5.9878,[518]5.9874,[519]5.9904,[520]5.9928,[521]5.9940,[522]5.9967,[523]5.9974,[524]6.0030,[525]6.0061,[526]6.0070,[527]6.0087,[528]6.0038,[529]6.0043,[530]5.9994,[531]5.9984,[532]6.0029,[533]6.0052,[534]6.0035,[535]6.0056,[536]6.0004,[537]5.9984,[538]6.0032,[539]6.0043,[540]6.0080,[541]6.0083,[542]6.0094,[543]6.0109,[544]6.0120,[545]6.0102,[546]6.0110,[547]6.0069,[548]6.0023,[549]6.0025,[550]5.9996,[551]5.9963,[552]5.9941,[553]5.9906,[554]5.9886,[555]5.9857,[556]5.9852,[557]5.9875,[558]5.9838,[559]5.9834,[560]5.9833,[561]5.9835,[562]5.9814,[563]5.9810,[564]5.9853,[565]5.9873,[566]5.9872,[567]5.9850,[568]5.9856,[569]5.9844,[570]5.9871,[571]5.9876,[572]5.9886,[573]5.9887,[574]5.9852,[575]5.9846,[576]5.9845,[577]5.9831,[578]5.9812,[579]5.9818,[580]5.9755,[581]5.9719,[582]5.9708,[583]5.9717,[584]5.9719,[585]5.9646,[586]5.9579,[587]5.9585,[588]5.9633,[589]5.9684,[590]5.9714,[591]5.9735,[592]5.9724,[593]5.9692,[594]5.9702,[595]5.9679,[596]5.9711,[597]5.9691,[598]5.9663,[599]5.9684,[600]5.9679,[601]5.9664,[602]5.9672,[603]5.9700,[604]5.9708,[605]5.9742,[606]5.9761,[607]5.9745,[608]5.9713,[609]5.9721,[610]5.9755,[611]5.9738,[612]5.9764,[613]5.9729,[614]5.9680,[615]5.9610,[616]5.9637,[617]5.9578,[618]5.9532,[619]5.9479,[620]5.9347,[621]5.9282,[622]5.9266,[623]5.9281,[624]5.9286,[625]5.9288,[626]5.9278,[627]5.9300,[628]5.9301,[629]5.9297,[630]5.9328,[631]5.9384,[632]5.9439,[633]5.9425,[634]5.9459,[635]5.9466,[636]5.9432,[637]5.9398,[638]5.9422,[639]5.9392,[640]5.9401,[641]5.9403,[642]5.9468,[643]5.9489,[644]5.9501,[645]5.9483,[646]5.9522,[647]5.9482,[648]5.9491,[649]5.9493,[650]5.9531,[651]5.9583,[652]5.9594,[653]5.9632,[654]5.9571,[655]5.9564,

llama_print_timings:        load time = 11891.56 ms
llama_print_timings:      sample time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings: prompt eval time = 3755163.27 ms / 335360 tokens (   11.20 ms per token)
llama_print_timings:        eval time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings:       total time = 3794021.15 ms

7B Q4_1, Docker: [655]6.1290,

docker run -it --rm -v$PWD/models:/models --device /dev/dri --device /dev/kfd llama.cpp:rocm perplexity -m /models/llama-7b-q4_1.bin --no-mmap -f /models/wiki.test.raw

main: seed = 1682342791
llama.cpp: loading model from /models/llama-7b-q4_1.bin
llama_model_load_internal: format     = ggjt v1 (latest)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 512
llama_model_load_internal: n_embd     = 4096
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 32
llama_model_load_internal: n_layer    = 32
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: ftype      = 3 (mostly Q4_1)
llama_model_load_internal: n_ff       = 11008
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 4936267.11 KB
llama_model_load_internal: mem required  = 6612.57 MB (+ 1026.00 MB per state)
....................................................................................................
llama_init_from_file: kv self size  =  256.00 MB

system_info: n_threads = 8 / 8 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
perplexity : calculating perplexity over 655 chunks, batch_size=512
5.21 seconds per pass - ETA 56 minutes
[1]4.4323,[2]4.8863,[3]5.7761,[4]6.3814,[5]6.4911,[6]6.4638,[7]6.6548,[8]6.7572,[9]7.0838,[10]7.3394,[11]7.5618,[12]7.6045,[13]7.5323,[14]7.5955,[15]7.8405,[16]7.4403,[17]7.3181,[18]7.2617,[19]6.8886,[20]6.8673,[21]6.7698,[22]6.5975,[23]6.5679,[24]6.4790,[25]6.4809,[26]6.3162,[27]6.1360,[28]6.0296,[29]5.9400,[30]5.7779,[31]5.7483,[32]5.7658,[33]5.7091,[34]5.7423,[35]5.7643,[36]5.8049,[37]5.8079,[38]5.8115,[39]5.8458,[40]5.8938,[41]5.9067,[42]5.9474,[43]5.9071,[44]5.9658,[45]5.9726,[46]5.9454,[47]5.9647,[48]5.9383,[49]5.9370,[50]5.8961,[51]5.8903,[52]5.8785,[53]5.9269,[54]5.9100,[55]5.8886,[56]5.9167,[57]5.9356,[58]5.9544,[59]5.9729,[60]6.0160,[61]6.0048,[62]6.0620,[63]6.0919,[64]6.1024,[65]6.1472,[66]6.1572,[67]6.1761,[68]6.1893,[69]6.2130,[70]6.2412,[71]6.2616,[72]6.2930,[73]6.3496,[74]6.3531,[75]6.3683,[76]6.3813,[77]6.3933,[78]6.3796,[79]6.4081,[80]6.4020,[81]6.4178,[82]6.4235,[83]6.3713,[84]6.3554,[85]6.3430,[86]6.3217,[87]6.2604,[88]6.2376,[89]6.2164,[90]6.2009,[91]6.2245,[92]6.2182,[93]6.2174,[94]6.2142,[95]6.2430,[96]6.2421,[97]6.2365,[98]6.2300,[99]6.2155,[100]6.2138,[101]6.2387,[102]6.2327,[103]6.2523,[104]6.2604,[105]6.2595,[106]6.2763,[107]6.2764,[108]6.2882,[109]6.2820,[110]6.2782,[111]6.2997,[112]6.3200,[113]6.3235,[114]6.3198,[115]6.3254,[116]6.3161,[117]6.3214,[118]6.3491,[119]6.3717,[120]6.4076,[121]6.4225,[122]6.4466,[123]6.4839,[124]6.5025,[125]6.4928,[126]6.5324,[127]6.5693,[128]6.6014,[129]6.5853,[130]6.5951,[131]6.5913,[132]6.5826,[133]6.5700,[134]6.5797,[135]6.5762,[136]6.5649,[137]6.5579,[138]6.5414,[139]6.5304,[140]6.5264,[141]6.4978,[142]6.4955,[143]6.4675,[144]6.4464,[145]6.4385,[146]6.4268,[147]6.4309,[148]6.4313,[149]6.4269,[150]6.4230,[151]6.4258,[152]6.4149,[153]6.3993,[154]6.3906,[155]6.3969,[156]6.3919,[157]6.4084,[158]6.4117,[159]6.4171,[160]6.4203,[161]6.4325,[162]6.4049,[163]6.3934,[164]6.3699,[165]6.3385,[166]6.3114,[167]6.2734,[168]6.2433,[169]6.2304,[170]6.2190,[171]6.1931,[172]6.1760,[173]6.1603,[174]6.1305,[175]6.1096,[176]6.0981,[177]6.0784,[178]6.0553,[179]6.0387,[180]6.0287,[181]6.0072,[182]5.9899,[183]5.9766,[184]5.9759,[185]5.9687,[186]5.9694,[187]5.9750,[188]5.9709,[189]5.9890,[190]5.9906,[191]6.0118,[192]6.0274,[193]6.0442,[194]6.0558,[195]6.0776,[196]6.0935,[197]6.1144,[198]6.1302,[199]6.1334,[200]6.1386,[201]6.1341,[202]6.1530,[203]6.1607,[204]6.1598,[205]6.1706,[206]6.1776,[207]6.1744,[208]6.1829,[209]6.1874,[210]6.1918,[211]6.2028,[212]6.2108,[213]6.2210,[214]6.2242,[215]6.2267,[216]6.2408,[217]6.2595,[218]6.2735,[219]6.2740,[220]6.2702,[221]6.2643,[222]6.2621,[223]6.2519,[224]6.2450,[225]6.2412,[226]6.2615,[227]6.2703,[228]6.2761,[229]6.2819,[230]6.2788,[231]6.2952,[232]6.2836,[233]6.2669,[234]6.2516,[235]6.2328,[236]6.2266,[237]6.2167,[238]6.2190,[239]6.2039,[240]6.1932,[241]6.1956,[242]6.1985,[243]6.1965,[244]6.1854,[245]6.1824,[246]6.1716,[247]6.1596,[248]6.1522,[249]6.1487,[250]6.1530,[251]6.1460,[252]6.1421,[253]6.1328,[254]6.1282,[255]6.1171,[256]6.0992,[257]6.0866,[258]6.0783,[259]6.0760,[260]6.0677,[261]6.0633,[262]6.0578,[263]6.0518,[264]6.0313,[265]6.0306,[266]6.0293,[267]6.0225,[268]6.0305,[269]6.0293,[270]6.0292,[271]6.0371,[272]6.0405,[273]6.0408,[274]6.0431,[275]6.0518,[276]6.0575,[277]6.0727,[278]6.0826,[279]6.0913,[280]6.0939,[281]6.1042,[282]6.1099,[283]6.1250,[284]6.1326,[285]6.1406,[286]6.1534,[287]6.1526,[288]6.1586,[289]6.1498,[290]6.1340,[291]6.1184,[292]6.1034,[293]6.0905,[294]6.0925,[295]6.0918,[296]6.0968,[297]6.0962,[298]6.0997,[299]6.0974,[300]6.0864,[301]6.0859,[302]6.0783,[303]6.0693,[304]6.0607,[305]6.0572,[306]6.0448,[307]6.0469,[308]6.0498,[309]6.0338,[310]6.0280,[311]6.0217,[312]6.0240,[313]6.0182,[314]6.0166,[315]6.0009,[316]5.9962,[317]5.9799,[318]5.9594,[319]5.9713,[320]5.9835,[321]5.9877,[322]5.9835,[323]5.9766,[324]5.9733,[325]5.9843,[326]5.9843,[327]5.9863,[328]5.9897,[329]5.9953,[330]5.9982,[331]6.0104,[332]6.0076,[333]6.0148,[334]6.0091,[335]6.0027,[336]6.0059,[337]6.0036,[338]6.0026,[339]5.9973,[340]5.9932,[341]6.0011,[342]6.0040,[343]6.0086,[344]6.0088,[345]6.0089,[346]6.0060,[347]6.0100,[348]6.0137,[349]6.0159,[350]6.0131,[351]6.0139,[352]6.0140,[353]6.0077,[354]6.0081,[355]6.0134,[356]6.0164,[357]6.0133,[358]6.0226,[359]6.0250,[360]6.0220,[361]6.0216,[362]6.0284,[363]6.0395,[364]6.0459,[365]6.0509,[366]6.0528,[367]6.0615,[368]6.0588,[369]6.0599,[370]6.0617,[371]6.0565,[372]6.0615,[373]6.0661,[374]6.0647,[375]6.0648,[376]6.0715,[377]6.0670,[378]6.0694,[379]6.0753,[380]6.0675,[381]6.0642,[382]6.0597,[383]6.0588,[384]6.0583,[385]6.0573,[386]6.0570,[387]6.0571,[388]6.0535,[389]6.0483,[390]6.0418,[391]6.0341,[392]6.0298,[393]6.0284,[394]6.0312,[395]6.0298,[396]6.0224,[397]6.0291,[398]6.0330,[399]6.0406,[400]6.0403,[401]6.0417,[402]6.0429,[403]6.0448,[404]6.0512,[405]6.0421,[406]6.0390,[407]6.0387,[408]6.0405,[409]6.0521,[410]6.0632,[411]6.0746,[412]6.0906,[413]6.1016,[414]6.1093,[415]6.1144,[416]6.1222,[417]6.1344,[418]6.1378,[419]6.1451,[420]6.1543,[421]6.1657,[422]6.1697,[423]6.1766,[424]6.1871,[425]6.1958,[426]6.2025,[427]6.2071,[428]6.2153,[429]6.2208,[430]6.2288,[431]6.2426,[432]6.2466,[433]6.2459,[434]6.2413,[435]6.2424,[436]6.2449,[437]6.2548,[438]6.2623,[439]6.2590,[440]6.2580,[441]6.2531,[442]6.2512,[443]6.2522,[444]6.2528,[445]6.2507,[446]6.2529,[447]6.2559,[448]6.2602,[449]6.2578,[450]6.2586,[451]6.2546,[452]6.2425,[453]6.2342,[454]6.2284,[455]6.2291,[456]6.2343,[457]6.2365,[458]6.2345,[459]6.2351,[460]6.2436,[461]6.2409,[462]6.2395,[463]6.2438,[464]6.2425,[465]6.2398,[466]6.2324,[467]6.2331,[468]6.2329,[469]6.2352,[470]6.2358,[471]6.2311,[472]6.2362,[473]6.2308,[474]6.2321,[475]6.2264,[476]6.2283,[477]6.2213,[478]6.2203,[479]6.2260,[480]6.2304,[481]6.2322,[482]6.2276,[483]6.2235,[484]6.2252,[485]6.2232,[486]6.2172,[487]6.2169,[488]6.2149,[489]6.2100,[490]6.2079,[491]6.2052,[492]6.1996,[493]6.1968,[494]6.1950,[495]6.1947,[496]6.1910,[497]6.1854,[498]6.1839,[499]6.1794,[500]6.1700,[501]6.1636,[502]6.1636,[503]6.1631,[504]6.1542,[505]6.1565,[506]6.1573,[507]6.1519,[508]6.1481,[509]6.1475,[510]6.1511,[511]6.1558,[512]6.1596,[513]6.1615,[514]6.1679,[515]6.1625,[516]6.1617,[517]6.1627,[518]6.1623,[519]6.1655,[520]6.1676,[521]6.1690,[522]6.1718,[523]6.1726,[524]6.1784,[525]6.1817,[526]6.1826,[527]6.1841,[528]6.1791,[529]6.1797,[530]6.1745,[531]6.1728,[532]6.1777,[533]6.1800,[534]6.1785,[535]6.1807,[536]6.1755,[537]6.1733,[538]6.1784,[539]6.1792,[540]6.1829,[541]6.1831,[542]6.1838,[543]6.1854,[544]6.1864,[545]6.1844,[546]6.1852,[547]6.1813,[548]6.1765,[549]6.1762,[550]6.1735,[551]6.1698,[552]6.1675,[553]6.1638,[554]6.1616,[555]6.1585,[556]6.1580,[557]6.1602,[558]6.1564,[559]6.1562,[560]6.1561,[561]6.1566,[562]6.1542,[563]6.1539,[564]6.1585,[565]6.1607,[566]6.1607,[567]6.1588,[568]6.1592,[569]6.1577,[570]6.1605,[571]6.1609,[572]6.1614,[573]6.1611,[574]6.1576,[575]6.1571,[576]6.1570,[577]6.1551,[578]6.1529,[579]6.1531,[580]6.1468,[581]6.1431,[582]6.1423,[583]6.1431,[584]6.1434,[585]6.1360,[586]6.1291,[587]6.1297,[588]6.1344,[589]6.1400,[590]6.1429,[591]6.1451,[592]6.1438,[593]6.1405,[594]6.1415,[595]6.1392,[596]6.1426,[597]6.1404,[598]6.1379,[599]6.1401,[600]6.1401,[601]6.1388,[602]6.1407,[603]6.1432,[604]6.1441,[605]6.1479,[606]6.1499,[607]6.1483,[608]6.1447,[609]6.1452,[610]6.1488,[611]6.1473,[612]6.1499,[613]6.1463,[614]6.1415,[615]6.1340,[616]6.1366,[617]6.1305,[618]6.1256,[619]6.1201,[620]6.1063,[621]6.0995,[622]6.0979,[623]6.0996,[624]6.1001,[625]6.1002,[626]6.0993,[627]6.1019,[628]6.1021,[629]6.1016,[630]6.1047,[631]6.1103,[632]6.1160,[633]6.1145,[634]6.1179,[635]6.1184,[636]6.1149,[637]6.1115,[638]6.1141,[639]6.1109,[640]6.1119,[641]6.1120,[642]6.1185,[643]6.1204,[644]6.1215,[645]6.1198,[646]6.1240,[647]6.1202,[648]6.1213,[649]6.1215,[650]6.1256,[651]6.1310,[652]6.1322,[653]6.1361,[654]6.1297,[655]6.1290,

llama_print_timings:        load time =  8444.43 ms
llama_print_timings:      sample time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings: prompt eval time = 3298326.00 ms / 335360 tokens (    9.84 ms per token)
llama_print_timings:        eval time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings:       total time = 3338225.33 ms

7B Q4_2, Docker: [655]6.2002,

docker run -it --rm -v$PWD/models:/models --device /dev/dri --device /dev/kfd llama.cpp:rocm perplexity -m /models/llama-7b-q4_2.bin --no-mmap -f /models/wiki.test.raw

main: seed = 1682346906
llama.cpp: loading model from /models/llama-7b-q4_2.bin
llama_model_load_internal: format     = ggjt v1 (latest)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 512
llama_model_load_internal: n_embd     = 4096
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 32
llama_model_load_internal: n_layer    = 32
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: ftype      = 5 (mostly Q4_2)
llama_model_load_internal: n_ff       = 11008
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 4113739.11 KB
llama_model_load_internal: mem required  = 5809.32 MB (+ 1026.00 MB per state)
....................................................................................................
llama_init_from_file: kv self size  =  256.00 MB

system_info: n_threads = 8 / 8 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
perplexity : calculating perplexity over 655 chunks, batch_size=512
5.62 seconds per pass - ETA 1 hours 1 minutes
[1]4.4374,[2]4.8772,[3]5.7681,[4]6.3925,[5]6.5142,[6]6.4884,[7]6.6803,[8]6.7903,[9]7.1376,[10]7.3772,[11]7.5927,[12]7.6131,[13]7.5335,[14]7.6098,[15]7.8668,[16]7.4772,[17]7.3521,[18]7.3157,[19]6.9532,[20]6.9431,[21]6.8483,[22]6.6732,[23]6.6360,[24]6.5422,[25]6.5428,[26]6.3758,[27]6.1929,[28]6.0919,[29]6.0021,[30]5.8428,[31]5.8105,[32]5.8319,[33]5.7736,[34]5.8103,[35]5.8352,[36]5.8781,[37]5.8817,[38]5.8969,[39]5.9326,[40]5.9921,[41]6.0049,[42]6.0453,[43]6.0030,[44]6.0581,[45]6.0611,[46]6.0371,[47]6.0585,[48]6.0306,[49]6.0337,[50]5.9922,[51]5.9876,[52]5.9770,[53]6.0202,[54]6.0027,[55]5.9780,[56]6.0049,[57]6.0244,[58]6.0453,[59]6.0629,[60]6.1072,[61]6.0985,[62]6.1567,[63]6.1927,[64]6.2088,[65]6.2552,[66]6.2626,[67]6.2811,[68]6.2987,[69]6.3254,[70]6.3581,[71]6.3803,[72]6.4105,[73]6.4727,[74]6.4784,[75]6.4929,[76]6.5055,[77]6.5178,[78]6.5028,[79]6.5305,[80]6.5214,[81]6.5310,[82]6.5349,[83]6.4802,[84]6.4626,[85]6.4516,[86]6.4288,[87]6.3638,[88]6.3353,[89]6.3155,[90]6.3010,[91]6.3247,[92]6.3185,[93]6.3207,[94]6.3176,[95]6.3453,[96]6.3436,[97]6.3384,[98]6.3311,[99]6.3161,[100]6.3170,[101]6.3427,[102]6.3372,[103]6.3573,[104]6.3649,[105]6.3640,[106]6.3790,[107]6.3764,[108]6.3901,[109]6.3834,[110]6.3793,[111]6.4015,[112]6.4218,[113]6.4244,[114]6.4213,[115]6.4285,[116]6.4199,[117]6.4258,[118]6.4552,[119]6.4756,[120]6.5110,[121]6.5286,[122]6.5540,[123]6.5918,[124]6.6105,[125]6.6001,[126]6.6398,[127]6.6765,[128]6.7069,[129]6.6906,[130]6.7006,[131]6.6967,[132]6.6882,[133]6.6753,[134]6.6859,[135]6.6817,[136]6.6691,[137]6.6608,[138]6.6454,[139]6.6349,[140]6.6299,[141]6.5998,[142]6.5962,[143]6.5663,[144]6.5457,[145]6.5365,[146]6.5232,[147]6.5297,[148]6.5298,[149]6.5236,[150]6.5186,[151]6.5200,[152]6.5081,[153]6.4913,[154]6.4825,[155]6.4894,[156]6.4842,[157]6.5023,[158]6.5054,[159]6.5108,[160]6.5127,[161]6.5255,[162]6.4955,[163]6.4823,[164]6.4575,[165]6.4257,[166]6.3978,[167]6.3596,[168]6.3278,[169]6.3146,[170]6.3033,[171]6.2754,[172]6.2583,[173]6.2412,[174]6.2109,[175]6.1888,[176]6.1788,[177]6.1583,[178]6.1346,[179]6.1178,[180]6.1090,[181]6.0874,[182]6.0696,[183]6.0557,[184]6.0553,[185]6.0480,[186]6.0489,[187]6.0550,[188]6.0504,[189]6.0677,[190]6.0690,[191]6.0907,[192]6.1069,[193]6.1243,[194]6.1357,[195]6.1569,[196]6.1728,[197]6.1938,[198]6.2088,[199]6.2128,[200]6.2175,[201]6.2129,[202]6.2335,[203]6.2417,[204]6.2405,[205]6.2511,[206]6.2578,[207]6.2542,[208]6.2626,[209]6.2669,[210]6.2724,[211]6.2821,[212]6.2895,[213]6.3000,[214]6.3023,[215]6.3061,[216]6.3214,[217]6.3395,[218]6.3527,[219]6.3533,[220]6.3488,[221]6.3441,[222]6.3413,[223]6.3307,[224]6.3234,[225]6.3195,[226]6.3405,[227]6.3493,[228]6.3539,[229]6.3599,[230]6.3564,[231]6.3734,[232]6.3605,[233]6.3436,[234]6.3285,[235]6.3116,[236]6.3042,[237]6.2939,[238]6.2969,[239]6.2813,[240]6.2712,[241]6.2738,[242]6.2775,[243]6.2755,[244]6.2639,[245]6.2611,[246]6.2492,[247]6.2367,[248]6.2292,[249]6.2272,[250]6.2312,[251]6.2243,[252]6.2207,[253]6.2107,[254]6.2066,[255]6.1956,[256]6.1775,[257]6.1655,[258]6.1570,[259]6.1551,[260]6.1478,[261]6.1437,[262]6.1383,[263]6.1331,[264]6.1137,[265]6.1127,[266]6.1113,[267]6.1047,[268]6.1140,[269]6.1122,[270]6.1132,[271]6.1211,[272]6.1241,[273]6.1240,[274]6.1258,[275]6.1337,[276]6.1394,[277]6.1549,[278]6.1651,[279]6.1738,[280]6.1767,[281]6.1859,[282]6.1921,[283]6.2069,[284]6.2145,[285]6.2233,[286]6.2369,[287]6.2367,[288]6.2425,[289]6.2335,[290]6.2178,[291]6.2024,[292]6.1871,[293]6.1732,[294]6.1753,[295]6.1749,[296]6.1789,[297]6.1773,[298]6.1800,[299]6.1770,[300]6.1657,[301]6.1659,[302]6.1582,[303]6.1501,[304]6.1420,[305]6.1395,[306]6.1268,[307]6.1291,[308]6.1326,[309]6.1167,[310]6.1106,[311]6.1045,[312]6.1074,[313]6.1017,[314]6.1001,[315]6.0837,[316]6.0788,[317]6.0625,[318]6.0413,[319]6.0535,[320]6.0661,[321]6.0703,[322]6.0661,[323]6.0592,[324]6.0566,[325]6.0669,[326]6.0667,[327]6.0684,[328]6.0720,[329]6.0782,[330]6.0808,[331]6.0932,[332]6.0902,[333]6.0973,[334]6.0916,[335]6.0847,[336]6.0880,[337]6.0852,[338]6.0850,[339]6.0796,[340]6.0752,[341]6.0831,[342]6.0853,[343]6.0901,[344]6.0899,[345]6.0898,[346]6.0868,[347]6.0912,[348]6.0944,[349]6.0964,[350]6.0928,[351]6.0934,[352]6.0935,[353]6.0875,[354]6.0880,[355]6.0933,[356]6.0961,[357]6.0928,[358]6.1020,[359]6.1049,[360]6.1012,[361]6.1008,[362]6.1077,[363]6.1193,[364]6.1256,[365]6.1313,[366]6.1324,[367]6.1415,[368]6.1389,[369]6.1394,[370]6.1407,[371]6.1349,[372]6.1398,[373]6.1451,[374]6.1436,[375]6.1435,[376]6.1506,[377]6.1458,[378]6.1484,[379]6.1541,[380]6.1461,[381]6.1422,[382]6.1368,[383]6.1359,[384]6.1353,[385]6.1347,[386]6.1344,[387]6.1338,[388]6.1298,[389]6.1246,[390]6.1177,[391]6.1100,[392]6.1058,[393]6.1042,[394]6.1067,[395]6.1053,[396]6.0976,[397]6.1055,[398]6.1095,[399]6.1178,[400]6.1175,[401]6.1192,[402]6.1200,[403]6.1221,[404]6.1287,[405]6.1186,[406]6.1151,[407]6.1145,[408]6.1158,[409]6.1277,[410]6.1385,[411]6.1499,[412]6.1658,[413]6.1777,[414]6.1852,[415]6.1903,[416]6.1981,[417]6.2105,[418]6.2143,[419]6.2215,[420]6.2303,[421]6.2420,[422]6.2469,[423]6.2538,[424]6.2655,[425]6.2744,[426]6.2811,[427]6.2856,[428]6.2940,[429]6.2990,[430]6.3074,[431]6.3216,[432]6.3258,[433]6.3247,[434]6.3202,[435]6.3210,[436]6.3232,[437]6.3328,[438]6.3403,[439]6.3371,[440]6.3367,[441]6.3315,[442]6.3301,[443]6.3314,[444]6.3317,[445]6.3299,[446]6.3325,[447]6.3355,[448]6.3401,[449]6.3376,[450]6.3389,[451]6.3346,[452]6.3218,[453]6.3130,[454]6.3073,[455]6.3084,[456]6.3130,[457]6.3151,[458]6.3129,[459]6.3132,[460]6.3217,[461]6.3188,[462]6.3170,[463]6.3219,[464]6.3209,[465]6.3177,[466]6.3098,[467]6.3096,[468]6.3093,[469]6.3113,[470]6.3116,[471]6.3067,[472]6.3116,[473]6.3060,[474]6.3068,[475]6.3005,[476]6.3025,[477]6.2953,[478]6.2940,[479]6.3000,[480]6.3046,[481]6.3063,[482]6.3018,[483]6.2976,[484]6.2999,[485]6.2983,[486]6.2928,[487]6.2928,[488]6.2905,[489]6.2857,[490]6.2834,[491]6.2805,[492]6.2745,[493]6.2715,[494]6.2699,[495]6.2703,[496]6.2667,[497]6.2611,[498]6.2592,[499]6.2545,[500]6.2448,[501]6.2381,[502]6.2383,[503]6.2377,[504]6.2288,[505]6.2314,[506]6.2324,[507]6.2268,[508]6.2228,[509]6.2220,[510]6.2257,[511]6.2306,[512]6.2338,[513]6.2358,[514]6.2422,[515]6.2366,[516]6.2357,[517]6.2366,[518]6.2366,[519]6.2397,[520]6.2422,[521]6.2438,[522]6.2468,[523]6.2477,[524]6.2532,[525]6.2568,[526]6.2580,[527]6.2598,[528]6.2548,[529]6.2550,[530]6.2503,[531]6.2492,[532]6.2542,[533]6.2564,[534]6.2548,[535]6.2572,[536]6.2516,[537]6.2493,[538]6.2539,[539]6.2550,[540]6.2588,[541]6.2590,[542]6.2601,[543]6.2615,[544]6.2627,[545]6.2603,[546]6.2611,[547]6.2567,[548]6.2518,[549]6.2515,[550]6.2485,[551]6.2450,[552]6.2428,[553]6.2389,[554]6.2365,[555]6.2336,[556]6.2331,[557]6.2354,[558]6.2316,[559]6.2310,[560]6.2308,[561]6.2307,[562]6.2286,[563]6.2286,[564]6.2330,[565]6.2351,[566]6.2348,[567]6.2328,[568]6.2333,[569]6.2317,[570]6.2343,[571]6.2348,[572]6.2358,[573]6.2360,[574]6.2327,[575]6.2322,[576]6.2321,[577]6.2308,[578]6.2287,[579]6.2293,[580]6.2224,[581]6.2186,[582]6.2174,[583]6.2183,[584]6.2185,[585]6.2112,[586]6.2044,[587]6.2047,[588]6.2096,[589]6.2150,[590]6.2178,[591]6.2199,[592]6.2185,[593]6.2150,[594]6.2158,[595]6.2136,[596]6.2170,[597]6.2149,[598]6.2118,[599]6.2139,[600]6.2133,[601]6.2118,[602]6.2134,[603]6.2166,[604]6.2175,[605]6.2209,[606]6.2229,[607]6.2211,[608]6.2179,[609]6.2185,[610]6.2220,[611]6.2202,[612]6.2229,[613]6.2191,[614]6.2138,[615]6.2065,[616]6.2093,[617]6.2031,[618]6.1980,[619]6.1923,[620]6.1781,[621]6.1710,[622]6.1694,[623]6.1710,[624]6.1715,[625]6.1717,[626]6.1704,[627]6.1724,[628]6.1725,[629]6.1719,[630]6.1751,[631]6.1808,[632]6.1864,[633]6.1847,[634]6.1880,[635]6.1888,[636]6.1857,[637]6.1824,[638]6.1851,[639]6.1822,[640]6.1831,[641]6.1834,[642]6.1899,[643]6.1920,[644]6.1931,[645]6.1912,[646]6.1953,[647]6.1914,[648]6.1925,[649]6.1926,[650]6.1967,[651]6.2023,[652]6.2032,[653]6.2071,[654]6.2008,[655]6.2002,

llama_print_timings:        load time =  8085.81 ms
llama_print_timings:      sample time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings: prompt eval time = 3414572.54 ms / 335360 tokens (   10.18 ms per token)
llama_print_timings:        eval time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings:       total time = 3449029.12 ms

7B Q4_3, Docker: [655]6.0619,

docker run -it --rm -v$PWD/models:/models --device /dev/dri --device /dev/kfd llama.cpp:rocm perplexity -m /models/llama-7b-q4_3.bin --no-mmap -f /models/wiki.test.raw

main: seed = 1682356946
llama.cpp: loading model from /models/llama-7b-q4_3.bin
llama_model_load_internal: format     = ggjt v1 (latest)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 512
llama_model_load_internal: n_embd     = 4096
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 32
llama_model_load_internal: n_layer    = 32
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: ftype      = 6 (mostly Q4_3)
llama_model_load_internal: n_ff       = 11008
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 4936267.11 KB
llama_model_load_internal: mem required  = 6612.57 MB (+ 1026.00 MB per state)
....................................................................................................
llama_init_from_file: kv self size  =  256.00 MB

system_info: n_threads = 8 / 8 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
perplexity : calculating perplexity over 655 chunks, batch_size=512
5.95 seconds per pass - ETA 1 hours 4 minutes
[1]4.3494,[2]4.7745,[3]5.6675,[4]6.2874,[5]6.4224,[6]6.3707,[7]6.5479,[8]6.6453,[9]6.9852,[10]7.2514,[11]7.4535,[12]7.4779,[13]7.3964,[14]7.4636,[15]7.7126,[16]7.3279,[17]7.2090,[18]7.1599,[19]6.8048,[20]6.7914,[21]6.6994,[22]6.5295,[23]6.5008,[24]6.4058,[25]6.4156,[26]6.2545,[27]6.0814,[28]5.9817,[29]5.8923,[30]5.7342,[31]5.7033,[32]5.7206,[33]5.6627,[34]5.6956,[35]5.7173,[36]5.7587,[37]5.7623,[38]5.7694,[39]5.8020,[40]5.8530,[41]5.8653,[42]5.9052,[43]5.8678,[44]5.9249,[45]5.9253,[46]5.8998,[47]5.9191,[48]5.8935,[49]5.8922,[50]5.8505,[51]5.8451,[52]5.8341,[53]5.8787,[54]5.8591,[55]5.8357,[56]5.8629,[57]5.8831,[58]5.9029,[59]5.9213,[60]5.9619,[61]5.9534,[62]6.0125,[63]6.0403,[64]6.0536,[65]6.0964,[66]6.1041,[67]6.1232,[68]6.1375,[69]6.1625,[70]6.1932,[71]6.2158,[72]6.2466,[73]6.3047,[74]6.3090,[75]6.3245,[76]6.3370,[77]6.3496,[78]6.3362,[79]6.3624,[80]6.3552,[81]6.3677,[82]6.3722,[83]6.3214,[84]6.3041,[85]6.2914,[86]6.2693,[87]6.2097,[88]6.1854,[89]6.1655,[90]6.1488,[91]6.1725,[92]6.1670,[93]6.1680,[94]6.1655,[95]6.1930,[96]6.1916,[97]6.1878,[98]6.1812,[99]6.1680,[100]6.1683,[101]6.1917,[102]6.1871,[103]6.2064,[104]6.2133,[105]6.2121,[106]6.2295,[107]6.2300,[108]6.2432,[109]6.2365,[110]6.2303,[111]6.2516,[112]6.2716,[113]6.2740,[114]6.2700,[115]6.2758,[116]6.2669,[117]6.2726,[118]6.2999,[119]6.3217,[120]6.3564,[121]6.3711,[122]6.3946,[123]6.4319,[124]6.4494,[125]6.4394,[126]6.4789,[127]6.5147,[128]6.5449,[129]6.5290,[130]6.5368,[131]6.5312,[132]6.5237,[133]6.5113,[134]6.5209,[135]6.5165,[136]6.5048,[137]6.4964,[138]6.4786,[139]6.4684,[140]6.4647,[141]6.4374,[142]6.4326,[143]6.4037,[144]6.3828,[145]6.3746,[146]6.3634,[147]6.3664,[148]6.3664,[149]6.3613,[150]6.3575,[151]6.3596,[152]6.3501,[153]6.3342,[154]6.3260,[155]6.3326,[156]6.3281,[157]6.3447,[158]6.3487,[159]6.3539,[160]6.3562,[161]6.3678,[162]6.3399,[163]6.3276,[164]6.3041,[165]6.2736,[166]6.2468,[167]6.2097,[168]6.1796,[169]6.1652,[170]6.1539,[171]6.1269,[172]6.1090,[173]6.0932,[174]6.0639,[175]6.0424,[176]6.0305,[177]6.0112,[178]5.9887,[179]5.9718,[180]5.9625,[181]5.9414,[182]5.9235,[183]5.9097,[184]5.9083,[185]5.9006,[186]5.9010,[187]5.9075,[188]5.9036,[189]5.9212,[190]5.9223,[191]5.9434,[192]5.9589,[193]5.9754,[194]5.9867,[195]6.0079,[196]6.0237,[197]6.0440,[198]6.0586,[199]6.0620,[200]6.0665,[201]6.0616,[202]6.0801,[203]6.0874,[204]6.0859,[205]6.0967,[206]6.1035,[207]6.0994,[208]6.1080,[209]6.1120,[210]6.1170,[211]6.1269,[212]6.1338,[213]6.1440,[214]6.1465,[215]6.1486,[216]6.1633,[217]6.1805,[218]6.1938,[219]6.1935,[220]6.1896,[221]6.1843,[222]6.1824,[223]6.1736,[224]6.1670,[225]6.1634,[226]6.1834,[227]6.1920,[228]6.1971,[229]6.2033,[230]6.2003,[231]6.2162,[232]6.2048,[233]6.1881,[234]6.1737,[235]6.1548,[236]6.1483,[237]6.1384,[238]6.1405,[239]6.1260,[240]6.1157,[241]6.1175,[242]6.1210,[243]6.1193,[244]6.1085,[245]6.1052,[246]6.0943,[247]6.0828,[248]6.0761,[249]6.0736,[250]6.0782,[251]6.0713,[252]6.0678,[253]6.0581,[254]6.0525,[255]6.0405,[256]6.0226,[257]6.0107,[258]6.0026,[259]6.0003,[260]5.9921,[261]5.9881,[262]5.9824,[263]5.9771,[264]5.9585,[265]5.9581,[266]5.9564,[267]5.9498,[268]5.9584,[269]5.9570,[270]5.9575,[271]5.9655,[272]5.9693,[273]5.9691,[274]5.9714,[275]5.9798,[276]5.9857,[277]6.0012,[278]6.0114,[279]6.0208,[280]6.0234,[281]6.0337,[282]6.0395,[283]6.0545,[284]6.0628,[285]6.0712,[286]6.0841,[287]6.0842,[288]6.0898,[289]6.0816,[290]6.0666,[291]6.0517,[292]6.0368,[293]6.0239,[294]6.0261,[295]6.0251,[296]6.0296,[297]6.0280,[298]6.0312,[299]6.0286,[300]6.0177,[301]6.0176,[302]6.0096,[303]6.0007,[304]5.9919,[305]5.9884,[306]5.9764,[307]5.9785,[308]5.9813,[309]5.9657,[310]5.9603,[311]5.9538,[312]5.9560,[313]5.9502,[314]5.9487,[315]5.9330,[316]5.9279,[317]5.9122,[318]5.8926,[319]5.9047,[320]5.9170,[321]5.9211,[322]5.9171,[323]5.9105,[324]5.9077,[325]5.9179,[326]5.9179,[327]5.9202,[328]5.9239,[329]5.9299,[330]5.9332,[331]5.9456,[332]5.9430,[333]5.9500,[334]5.9448,[335]5.9389,[336]5.9427,[337]5.9405,[338]5.9398,[339]5.9350,[340]5.9309,[341]5.9389,[342]5.9418,[343]5.9461,[344]5.9466,[345]5.9471,[346]5.9449,[347]5.9488,[348]5.9522,[349]5.9546,[350]5.9512,[351]5.9519,[352]5.9524,[353]5.9464,[354]5.9477,[355]5.9528,[356]5.9562,[357]5.9527,[358]5.9621,[359]5.9646,[360]5.9614,[361]5.9612,[362]5.9680,[363]5.9789,[364]5.9852,[365]5.9901,[366]5.9914,[367]5.9997,[368]5.9971,[369]5.9980,[370]5.9997,[371]5.9944,[372]5.9992,[373]6.0039,[374]6.0024,[375]6.0024,[376]6.0089,[377]6.0041,[378]6.0069,[379]6.0130,[380]6.0056,[381]6.0024,[382]5.9974,[383]5.9965,[384]5.9962,[385]5.9950,[386]5.9946,[387]5.9944,[388]5.9910,[389]5.9861,[390]5.9792,[391]5.9717,[392]5.9678,[393]5.9661,[394]5.9690,[395]5.9677,[396]5.9601,[397]5.9669,[398]5.9713,[399]5.9791,[400]5.9791,[401]5.9804,[402]5.9813,[403]5.9832,[404]5.9894,[405]5.9803,[406]5.9772,[407]5.9766,[408]5.9783,[409]5.9897,[410]6.0010,[411]6.0123,[412]6.0280,[413]6.0389,[414]6.0467,[415]6.0521,[416]6.0601,[417]6.0720,[418]6.0755,[419]6.0825,[420]6.0912,[421]6.1028,[422]6.1064,[423]6.1134,[424]6.1238,[425]6.1330,[426]6.1393,[427]6.1438,[428]6.1518,[429]6.1570,[430]6.1652,[431]6.1790,[432]6.1826,[433]6.1817,[434]6.1775,[435]6.1784,[436]6.1808,[437]6.1905,[438]6.1980,[439]6.1948,[440]6.1937,[441]6.1888,[442]6.1876,[443]6.1888,[444]6.1896,[445]6.1875,[446]6.1900,[447]6.1929,[448]6.1967,[449]6.1942,[450]6.1950,[451]6.1909,[452]6.1783,[453]6.1702,[454]6.1647,[455]6.1653,[456]6.1701,[457]6.1718,[458]6.1699,[459]6.1706,[460]6.1790,[461]6.1765,[462]6.1752,[463]6.1787,[464]6.1776,[465]6.1750,[466]6.1674,[467]6.1680,[468]6.1677,[469]6.1699,[470]6.1703,[471]6.1656,[472]6.1701,[473]6.1647,[474]6.1659,[475]6.1598,[476]6.1614,[477]6.1545,[478]6.1536,[479]6.1597,[480]6.1641,[481]6.1658,[482]6.1614,[483]6.1573,[484]6.1591,[485]6.1573,[486]6.1517,[487]6.1515,[488]6.1493,[489]6.1445,[490]6.1422,[491]6.1395,[492]6.1340,[493]6.1311,[494]6.1292,[495]6.1289,[496]6.1252,[497]6.1198,[498]6.1182,[499]6.1138,[500]6.1045,[501]6.0981,[502]6.0982,[503]6.0975,[504]6.0887,[505]6.0905,[506]6.0915,[507]6.0862,[508]6.0823,[509]6.0817,[510]6.0850,[511]6.0897,[512]6.0931,[513]6.0953,[514]6.1016,[515]6.0961,[516]6.0952,[517]6.0962,[518]6.0956,[519]6.0986,[520]6.1009,[521]6.1022,[522]6.1050,[523]6.1057,[524]6.1114,[525]6.1145,[526]6.1156,[527]6.1172,[528]6.1122,[529]6.1131,[530]6.1078,[531]6.1064,[532]6.1112,[533]6.1135,[534]6.1118,[535]6.1138,[536]6.1085,[537]6.1063,[538]6.1114,[539]6.1124,[540]6.1160,[541]6.1162,[542]6.1174,[543]6.1189,[544]6.1198,[545]6.1179,[546]6.1189,[547]6.1149,[548]6.1098,[549]6.1100,[550]6.1070,[551]6.1036,[552]6.1014,[553]6.0976,[554]6.0953,[555]6.0922,[556]6.0915,[557]6.0940,[558]6.0903,[559]6.0901,[560]6.0899,[561]6.0902,[562]6.0881,[563]6.0879,[564]6.0922,[565]6.0943,[566]6.0943,[567]6.0921,[568]6.0930,[569]6.0915,[570]6.0942,[571]6.0944,[572]6.0951,[573]6.0949,[574]6.0913,[575]6.0909,[576]6.0908,[577]6.0892,[578]6.0872,[579]6.0877,[580]6.0813,[581]6.0774,[582]6.0766,[583]6.0774,[584]6.0776,[585]6.0700,[586]6.0632,[587]6.0638,[588]6.0684,[589]6.0739,[590]6.0767,[591]6.0790,[592]6.0777,[593]6.0747,[594]6.0756,[595]6.0732,[596]6.0766,[597]6.0745,[598]6.0715,[599]6.0736,[600]6.0729,[601]6.0715,[602]6.0729,[603]6.0756,[604]6.0764,[605]6.0799,[606]6.0823,[607]6.0807,[608]6.0775,[609]6.0783,[610]6.0818,[611]6.0802,[612]6.0826,[613]6.0789,[614]6.0741,[615]6.0668,[616]6.0694,[617]6.0634,[618]6.0587,[619]6.0531,[620]6.0395,[621]6.0328,[622]6.0311,[623]6.0325,[624]6.0329,[625]6.0328,[626]6.0317,[627]6.0341,[628]6.0343,[629]6.0341,[630]6.0374,[631]6.0430,[632]6.0488,[633]6.0473,[634]6.0507,[635]6.0514,[636]6.0479,[637]6.0444,[638]6.0470,[639]6.0439,[640]6.0448,[641]6.0450,[642]6.0516,[643]6.0538,[644]6.0549,[645]6.0530,[646]6.0572,[647]6.0530,[648]6.0541,[649]6.0544,[650]6.0582,[651]6.0636,[652]6.0648,[653]6.0686,[654]6.0624,[655]6.0619,

llama_print_timings:        load time =  9035.02 ms
llama_print_timings:      sample time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings: prompt eval time = 3789898.46 ms / 335360 tokens (   11.30 ms per token)
llama_print_timings:        eval time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings:       total time = 3830652.16 ms

JohannesGaessler · 2023-08-29T17:50:41Z

No. I don't have 70b q4 ready and there wouldn't be a point anyways since with 16 GB VRAM I would just be benchmarking the speed of the CPU.

* use hipblas based on cublas * Update Makefile for the Cuda kernels * Expand arch list and make it overrideable * Fix multi GPU on multiple amd architectures with rocblas_initialize() (ggerganov#5) * add hipBLAS to README * new build arg LLAMA_CUDA_MMQ_Y * fix half2 decomposition * Add intrinsics polyfills for AMD * AMD assembly optimized __dp4a * Allow overriding CC_TURING * use "ROCm" instead of "CUDA" * ignore all build dirs * Add Dockerfiles * fix llama-bench * fix -nommq help for non CUDA/HIP --------- Co-authored-by: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com> Co-authored-by: ardfork <134447697+ardfork@users.noreply.github.com> Co-authored-by: funnbot <22226942+funnbot@users.noreply.github.com> Co-authored-by: Engininja2 <139037756+Engininja2@users.noreply.github.com> Co-authored-by: Kerfuffle <44031344+KerfuffleV2@users.noreply.github.com> Co-authored-by: jammm <2500920+jammm@users.noreply.github.com> Co-authored-by: jdecourval <7315817+jdecourval@users.noreply.github.com>

JohannesGaessler · 2023-08-30T17:53:53Z

I implemented mul_mat_q tunings for RDNA 2 (using my RX 6800): #2910 . Please check whether they are better/worse on other AMD GPUs.

YellowRoseCx · 2023-08-30T17:55:40Z

I implemented mul_mat_q tunings for RDNA 2 (using my RX 6800): #2910 . Please check whether they are better/worse on other AMD GPUs.

What tuning value would you recommend? Do you want us to check via regular use or by a perplexity test?

JohannesGaessler · 2023-08-30T18:02:36Z

The RDNA 2 tunings are currently being applied to all AMD GPUs. Just checking whether the PR is slower or faster than master is enough.

ardfork · 2023-08-30T19:21:01Z

While testing #2910, I did some newer benchmark on q4_K_M (on a 6700 XT):

model	size	params	backend	ngl	mmq	test	t/s
LLaMA v2 13B mostly Q4_K - Medium	7.33 GiB	13.02 B	ROCm #2910	43	1	pp 512	433.72 ± 0.45
LLaMA v2 13B mostly Q4_K - Medium	7.33 GiB	13.02 B	ROCm #2910	43	1	tg 128	29.94 ± 0.11
LLaMA v2 13B mostly Q4_K - Medium	7.33 GiB	13.02 B	ROCm	43	1	pp 512	369.65 ± 0.76
LLaMA v2 13B mostly Q4_K - Medium	7.33 GiB	13.02 B	ROCm	43	1	tg 128	29.65 ± 0.03
LLaMA v2 13B mostly Q4_K - Medium	7.33 GiB	13.02 B	ROCm	43	0	pp 512	302.21 ± 0.97
LLaMA v2 13B mostly Q4_K - Medium	7.33 GiB	13.02 B	ROCm	43	0	tg 128	29.92 ± 0.11
LLaMA v2 13B mostly Q4_K - Medium	7.33 GiB	13.02 B	OpenCL	43	0	pp 512	100.31 ± 6.65
LLaMA v2 13B mostly Q4_K - Medium	7.33 GiB	13.02 B	OpenCL	43	0	tg 128	23.47 ± 0.02

Sadly, I forgot to measure VRAM usage.

harish0201 · 2023-08-31T01:32:54Z

Would something like https://github.com/microsoft/antares help in easier optimization of HIPBlas builds or working around HSA prefixes eventually for windows?

I'm sorry if this is the wrong fora!

jammm · 2023-08-31T07:22:18Z

@ggerganov @SlyEcho I was able to compile the ROCm version successfully on Windows using the HIP SDK.
It compiles within seconds via ninja and vs2019. Using the vs2019 x64 command prompt running as administrator:
cmake -G Ninja -DCMAKE_BUILD_TYPE=Release -DLLAMA_HIPBLAS=ON -Dhipblas_DIR=<hipBLAS cmake dir> -Drocblas_DIR=<rocBLAS cmake dir> ..
cmake --build .
And CC and CXX were set to clang.exe and clang++.exe respectively from the bin folder of the HIP SDK.

Ran it successfully on 7900XTX. Not sure of the speed though. How do I check that?

The command I used ./main -ngl 32 -m ../../models/vicuna-7b-1.1.ggmlv3.q2_K.gguf --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "### Instruction: Write a story about llamas\n### Response:"

jammm · 2023-08-31T08:16:53Z

@SlyEcho will there be hipBLAS builds for Windows uploaded in the packages now?

ghost · 2023-08-31T08:34:41Z

Unfortunately, I still can't get it to work on Windows. Compiling is not the problem, it worked. Unfortunately, it cannot be started or crashes after starting the server, as mentioned above. Seems like I must live with that as my 6650 XT has no official support for Windows yet. I just don't understand why it works under Linux with the 1030 overwrite, but not on Windows.

ghost · 2023-08-31T08:36:22Z

While testing #2910, I did some newer benchmark on q4_K_M (on a 6700 XT):

model size params backend ngl mmq test t/s
LLaMA v2 13B mostly Q4_K - Medium 7.33 GiB 13.02 B ROCm #2910 43 1 pp 512 433.72 ± 0.45
LLaMA v2 13B mostly Q4_K - Medium 7.33 GiB 13.02 B ROCm #2910 43 1 tg 128 29.94 ± 0.11
LLaMA v2 13B mostly Q4_K - Medium 7.33 GiB 13.02 B ROCm 43 1 pp 512 369.65 ± 0.76
LLaMA v2 13B mostly Q4_K - Medium 7.33 GiB 13.02 B ROCm 43 1 tg 128 29.65 ± 0.03
LLaMA v2 13B mostly Q4_K - Medium 7.33 GiB 13.02 B ROCm 43 0 pp 512 302.21 ± 0.97
LLaMA v2 13B mostly Q4_K - Medium 7.33 GiB 13.02 B ROCm 43 0 tg 128 29.92 ± 0.11
LLaMA v2 13B mostly Q4_K - Medium 7.33 GiB 13.02 B OpenCL 43 0 pp 512 100.31 ± 6.65
LLaMA v2 13B mostly Q4_K - Medium 7.33 GiB 13.02 B OpenCL 43 0 tg 128 23.47 ± 0.02
Sadly, I forgot to measure VRAM usage.

I would like to test it with my card on Linux. How can i measure it like this? I'm new in this topic.

JohannesGaessler · 2023-08-31T08:58:51Z

Use the llama-bench binary. By default it will output a table that will be correctly formatted on GitHub.

jllllll · 2023-08-31T19:22:43Z

Updated CI example building llama-cpp-python for both Windows and Linux:
https://github.com/jllllll/llama-cpp-python-cuBLAS-wheels/blob/main/old_workflows/build-wheel-rocm.yml

The code for building libs should still be relevant if only building llama.cpp.

It is curious that the 6650 XT doesn't work on Windows given that GPU is explicitly listed as supported in the runtime:
https://rocm.docs.amd.com/en/docs-5.5.1/release/windows_support.html
Even though the SDK can't directly build for gfx1032, the copy procedure described before should still work as the gfx1032 ISA is supposed to be identical to gfx1030: https://salsa.debian.org/rocm-team/community/team-project/-/wikis/supported-gpu-list#gfx1032

Engininja2 · 2023-08-31T20:11:53Z

The issue is that rocBLAS on Windows comes compiled & with tensile libs for gfx906, gfx1030, gfx1100, gfx1101, and gfx1102. There's no HSA_OVERRIDE_GFX_VERSION because it's running on top of PAL instead of HSA. PAL might have an equivalent, from reading its repo.
There's a list of settings that lists spoofNullGpuIfh for GPU ID Masquerade. There's also an OverrideGpuId function. On the other hand a CMake file for rocclr has set(PAL_BUILD_NULL_DEVICE OFF) which might be related.

So there might be a registry setting that could work but it may need recompiling part of the HIP SDK anyways.

SlyEcho · 2023-08-31T22:15:18Z

There may come a time when rocBLAS is not needed, then it would work.

JohannesGaessler · 2023-08-31T22:28:00Z

Speaking of which, one of my next goals is to try and quantize the KV cache to q8_1. It will probably take some time but if that is done (and works) you could compile completely without cuBLAS/rocBLAS.

jammm · 2023-09-01T10:15:12Z

Speaking of which, one of my next goals is to try and quantize the KV cache to q8_1. It will probably take some time but if that is done (and works) you could compile completely without cuBLAS/rocBLAS.

Is there a timeline for this? I'd like to how how many users here are using navi22 and navi23. If it's worthwhile to push for hipBLAS to support it in its precompiled form until the hipBLAS dependency is removed, I can at least request for it internally, no guarantees though.

So navi22 and navi23 users, feel free to use the rocket emoji. Also if you're an APU user using phoenix, use the hooray emoji.

JohannesGaessler · 2023-09-01T12:43:43Z

I can't give a serious ETA because there are too many uncertainties. It will be done when it's done.

KerfuffleV2 · 2023-09-01T14:28:58Z

I looked at doing this (for other reasons, like making the prompt caches smaller or reducing VRAM usage) in the past. Seems like it'll require making a number of operations that currently only work on 32bit tensors support quantized ones also. Another nice side benefit may be making it easier to support other models that could benefit from using those ops on quantized tensors.

jammm · 2023-09-01T16:14:06Z

The following was run on Windows using the HIP SDK:

Device 0: AMD Radeon RX 7900 XTX, compute capability 11.0

model	size	params	backend	ngl	test	t/s
llama-2-7b.ggmlv3.q4_0.bin 7B mostly Q4_0 (guessed)	3.53 GiB	6.74 B	ROCm	99	pp 512	1637.00 ± 28.37
llama-2-7b.ggmlv3.q4_0.bin 7B mostly Q4_0 (guessed)	3.53 GiB	6.74 B	ROCm	99	tg 128	100.04 ± 0.59

model	size	params	backend	ngl	test	t/s
llama-2-13b.ggmlv3.q4_0.bin 13B mostly Q4_0 (guessed)	6.82 GiB	13.02 B	ROCm	99	pp 512	896.31 ± 5.70
llama-2-13b.ggmlv3.q4_0.bin 13B mostly Q4_0 (guessed)	6.82 GiB	13.02 B	ROCm	99	tg 128	66.26 ± 0.03

YellowRoseCx · 2024-04-14T18:01:28Z

Is anyone having issues compiling the hipBLAS backend with the cmakelists.txt file on Windows after the ggml-cuda was broken up into different files in its own folder?

use hipblas based on cublas

0fd8363

SlyEcho force-pushed the hipblas branch from 3949441 to 0fd8363 Compare April 20, 2023 18:53

Update Makefile for the Cuda kernels

54a63c1

SlyEcho mentioned this pull request Apr 20, 2023

rocBLAS support #1060

Closed

Build file changes

0e005f7

Now HIP Clang is not required, the CMake scripts will configure the needed compiler, which can be system clang++. Also other code can still use GCC, but CMake will force the clang to link.

add rpath

d3e1984

slaren mentioned this pull request Apr 21, 2023

Improve cuBLAS performance by using a memory pool #1094

Merged

SlyEcho added 2 commits April 22, 2023 23:28

More build file changes

3677235

Merge 'origin/master' into hipblas

db7a012

add rpath

3a004b2

evshiron mentioned this pull request Sep 2, 2023

Error running ghcr.io/evshiron/rocm_lab:rocm5.5-text-gen-webui 7dea7110f293 evshiron/rocm_lab#13

Closed

harish0201 mentioned this pull request Sep 17, 2023

6700XT/6800M Gfx1031 libraries for compilation of Kobold.cpp LostRuins/koboldcpp#441

Open

Kazaflow mentioned this pull request Nov 2, 2023

[Bug] KoboldCpp - Version 1.47.1/1.47.2.yr0-ROCm programme crash and close LostRuins/koboldcpp#495

Open

xangelix mentioned this pull request Nov 19, 2023

Segmentation fault after model load on ROCm multi-gpu, multi-gfx #4030

Closed

4 tasks

jammm mentioned this pull request Nov 24, 2023

Update README.md for ROCm Windows support #4122

Merged

hiepxanh mentioned this pull request Jan 19, 2024

AMD gfx1103 laptop GPU returning HIPBLAS_STATUS_UNKNOWN Mozilla-Ocho/llamafile#188

Closed

jasyuiop mentioned this pull request Feb 1, 2024

6600/6600 XT/6650 XT gfx1032 libraries for compilation of Kobold.cpp LostRuins/koboldcpp#655

Open

yeahdongcn mentioned this pull request Jul 9, 2024

feat: Support Moore Threads GPU #8383

Merged

4 tasks

sARY77 mentioned this pull request Jan 5, 2025

Only call rocblas_initialize for versions < 4 to eliminate unncessary VRAM allocation on some AMD cards #11080

Merged

ROCm Port #1087

ROCm Port #1087

Conversation

SlyEcho commented Apr 20, 2023 • edited Loading

ROCm port

Compiling

Docker

ggerganov commented Apr 20, 2023

SlyEcho commented Apr 20, 2023

slaren commented Apr 20, 2023

SlyEcho commented Apr 20, 2023 • edited Loading

slaren commented Apr 20, 2023

SlyEcho commented Apr 20, 2023

SlyEcho commented Apr 21, 2023

SlyEcho commented Apr 21, 2023

SlyEcho commented Apr 21, 2023

slaren commented Apr 21, 2023

FNsi commented Apr 22, 2023 • edited Loading

FNsi commented Apr 22, 2023 • edited Loading

SlyEcho commented Apr 23, 2023

SlyEcho commented Apr 24, 2023 • edited Loading

DGdev91 commented Apr 24, 2023

SlyEcho commented Apr 24, 2023 • edited Loading

DGdev91 commented Apr 24, 2023 • edited Loading

SlyEcho commented Apr 24, 2023

DGdev91 commented Apr 24, 2023

SlyEcho commented Apr 25, 2023 • edited Loading

Perplexity Testing for hipBLAS version

Code

Models

Hardware

Arch Linux testing with:

AMD official Docker with this Dockerfile:

Results

JohannesGaessler commented Aug 29, 2023

JohannesGaessler commented Aug 30, 2023

YellowRoseCx commented Aug 30, 2023

JohannesGaessler commented Aug 30, 2023

ardfork commented Aug 30, 2023

harish0201 commented Aug 31, 2023

jammm commented Aug 31, 2023 • edited Loading

jammm commented Aug 31, 2023 • edited Loading

ghost commented Aug 31, 2023 • edited by ghost Loading

ghost commented Aug 31, 2023

JohannesGaessler commented Aug 31, 2023

jllllll commented Aug 31, 2023 • edited Loading

Engininja2 commented Aug 31, 2023

SlyEcho commented Aug 31, 2023

JohannesGaessler commented Aug 31, 2023

jammm commented Sep 1, 2023 • edited Loading

JohannesGaessler commented Sep 1, 2023

KerfuffleV2 commented Sep 1, 2023

jammm commented Sep 1, 2023 • edited Loading

YellowRoseCx commented Apr 14, 2024 • edited Loading

SlyEcho commented Apr 20, 2023 •

edited

Loading

SlyEcho commented Apr 20, 2023 •

edited

Loading

FNsi commented Apr 22, 2023 •

edited

Loading

FNsi commented Apr 22, 2023 •

edited

Loading

SlyEcho commented Apr 24, 2023 •

edited

Loading

SlyEcho commented Apr 24, 2023 •

edited

Loading

DGdev91 commented Apr 24, 2023 •

edited

Loading

SlyEcho commented Apr 25, 2023 •

edited

Loading

jammm commented Aug 31, 2023 •

edited

Loading

jammm commented Aug 31, 2023 •

edited

Loading

ghost commented Aug 31, 2023 •

edited by ghost

Loading

jllllll commented Aug 31, 2023 •

edited

Loading

jammm commented Sep 1, 2023 •

edited

Loading

jammm commented Sep 1, 2023 •

edited

Loading

YellowRoseCx commented Apr 14, 2024 •

edited

Loading