Variable bit rate quantization #1256
Comments
Great idea!
I wouldn't compare it to audio; variable bitrate there is something totally different. There would also be merit in testing whether to keep the encoder block at higher resolution and the decoder at lower resolution, or the early layers at high resolution and the late layers at lower resolution.
The easiest way I can think of would be to vary allocation within a row, since we always quantize/dequantize/dot vector by row. You would have to allocate a "worst case" length since the row size is fixed, but you could still save memory accesses on more compressible rows.
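The row-wise idea above can be sketched in a few lines. This is only an illustration under stated assumptions: the uniform min/max quantizer, the candidate bit widths, the `max_rmse` threshold, and all function names are hypothetical and are not part of `ggml`. Every row reserves the worst-case number of bytes (the fixed stride), but a more compressible row only needs its first few bytes read.

```python
import math

def quantize_rmse(row, bits):
    """RMSE of a uniform min/max quantization of `row` with `bits` bits per value."""
    lo, hi = min(row), max(row)
    if hi == lo:
        return 0.0
    levels = (1 << bits) - 1
    step = (hi - lo) / levels
    err2 = 0.0
    for x in row:
        q = round((x - lo) / step)          # nearest quantization level
        err2 += (x - (lo + q * step)) ** 2  # squared reconstruction error
    return math.sqrt(err2 / len(row))

def pick_bits(row, candidates=(2, 3, 4, 5), max_rmse=0.05):
    """Smallest candidate bit width whose RMSE is within max_rmse, else the largest."""
    for bits in candidates:
        if quantize_rmse(row, bits) <= max_rmse:
            return bits
    return candidates[-1]

def row_layout(rows, candidates=(2, 3, 4, 5), max_rmse=0.05):
    """Return (stride_bytes, [(bits, used_bytes), ...]) for equally sized rows.

    stride_bytes is the worst-case allocation per row (hypothetical layout);
    used_bytes is how much of that allocation actually has to be read.
    """
    n = len(rows[0])
    stride = math.ceil(n * candidates[-1] / 8)
    layout = []
    for row in rows:
        bits = pick_bits(row, candidates, max_rmse)
        layout.append((bits, math.ceil(n * bits / 8)))
    return stride, layout

# Usage: a two-valued row compresses to 2 bits, a ramp needs more.
rows = [[0.0, 1.0] * 16, [i / 31 for i in range(32)]]
stride, layout = row_layout(rows)
```

The storage cost is the same for every row in this sketch; the only saving is in bytes touched per row, which matches the trade-off described in the comment.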
Suppose we want to quantize only half of the
I have added a bunch of new quantizations to play with in a quest to find the smallest model that gives a "reasonable" performance:
Here are some model sizes and perplexities where
They all satisfy the requirement to be usable on a phone or within a web browser by a comfortable margin. Considering that the If I arbitrarily define a perplexity of
One thing that came to my mind during my latest tests: when thinking about mixing the precision of layers, we should keep in mind that calculations on the GPU are already 16-bit. Any lower precision does not gain performance in those cases; it loses performance from conversions. Update: that's not relevant anymore.
I came across some other interesting methods and benchmarks here, just as background: https://github.com/megvii-research/Sparsebit/blob/main/large_language_models/llama/quantization/README.md
Closed via #1684
How is it not relevant anymore? What does CLBlast do? Is it better than cuBLAS when using quantized models?
After quantizing the weights, are they dequantized to float during computation (e.g., while computing dot product)? Or is the computation also done in the integer scale? Thanks. |
Variable bit rate is commonly used in audio and video compression, so why not try it on LLMs?

My guess is that a locally adaptive variable bit rate would require a major change to `ggml`. So the least one can try is to see whether using a different number of bits in different network layers would be beneficial.

As a first step, I simply changed `llama.cpp` to not quantize one of the tensor types in addition to `output.weight` (which is already known to have a significant impact on generation quality) and calculated perplexity for `Q2_4` quantization (see issue #1240). I picked 2-bit quantization because there the difference between a quantized and a non-quantized tensor will be largest, so it would be easiest to see the effect. The following table summarizes the results (PPL improvement is perplexity with `fp16` `output.weight` minus perplexity with `fp16` `output.weight` + indicated tensor; the table is sorted in decreasing order of impact)

Interesting to note that the `tok_embeddings` tensor, which has been considered worthy of a dedicated quantization type `LLAMA_FTYPE_MOSTLY_Q4_1_SOME_F16` where it is kept as `fp16`, has basically no influence on generation quality even when quantized with 2 bits.

Based on these findings, I ran 2- and 3-bit perplexity calculations where the top 2 tensors `feed_forward.w2` and `attention.wv` are quantized using `Q5_1` instead of `Q2_4` or `Q3_4`. Here are the results:

Interesting to note that the mixed `Q3_4 + Q5_1` quantization has a lower perplexity than any 4-bit quantization listed on the main page for the 7B model despite the smaller quantized model size.

I have not explored using `Q5_1` for only a subset of the `feed_forward.w2` and `attention.wv` tensors. The quantization RMSE for these two tensor types increases with layer depth, so perhaps it would be sufficient to use a higher bit rate for only the last few layers, thus reducing quantized model size compared to what is given in the above table.

Here are the complete runs for the above table. There is no new quantization type, just a quick hack where I added

just after this line in `llama.cpp`.

7B, Q2_4 + Q5_1
main: seed = 1682863345
llama.cpp: loading model from ../build/junk.bin
llama_model_load_internal: format = ggjt v1 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 512
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 15 (mostly Q2_4)
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 59.11 KB
llama_model_load_internal: mem required = 5026.65 MB (+ 1026.00 MB per state)
llama_init_from_file: kv self size = 256.00 MB
system_info: n_threads = 16 / 32 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
perplexity : calculating perplexity over 655 chunks, batch_size=512
1.61 seconds per pass - ETA 17 minutes
[1]4.8689,[2]5.5380,[3]6.3512,[4]7.0241,[5]7.0748,[6]7.0472,[7]7.2764,[8]7.3672,[9]7.7450,[10]8.0356,[11]8.2671,[12]8.3551,[13]8.2873,[14]8.4518,[15]8.7368,[16]8.2773,[17]8.1054,[18]8.0547,[19]7.6308,[20]7.5978,[21]7.4992,[22]7.3066,[23]7.2867,[24]7.1873,[25]7.1983,[26]7.0220,[27]6.8122,[28]6.7195,[29]6.6220,[30]6.4602,[31]6.4277,[32]6.4450,[33]6.3845,[34]6.4217,[35]6.4431,[36]6.4962,[37]6.4978,[38]6.5093,[39]6.5542,[40]6.6065,[41]6.6159,[42]6.6562,[43]6.6069,[44]6.6684,[45]6.6746,[46]6.6483,[47]6.6718,[48]6.6386,[49]6.6381,[50]6.5934,[51]6.5892,[52]6.5701,[53]6.6233,[54]6.6083,[55]6.5750,[56]6.6126,[57]6.6321,[58]6.6554,[59]6.6693,[60]6.7166,[61]6.7055,[62]6.7665,[63]6.8012,[64]6.8124,[65]6.8597,[66]6.8646,[67]6.8819,[68]6.8975,[69]6.9258,[70]6.9585,[71]6.9829,[72]7.0148,[73]7.0743,[74]7.0740,[75]7.0859,[76]7.0966,[77]7.1070,[78]7.0913,[79]7.1232,[80]7.1149,[81]7.1345,[82]7.1442,[83]7.0841,[84]7.0665,[85]7.0551,[86]7.0270,[87]6.9683,[88]6.9379,[89]6.9188,[90]6.9039,[91]6.9317,[92]6.9235,[93]6.9269,[94]6.9251,[95]6.9596,[96]6.9609,[97]6.9612,[98]6.9529,[99]6.9338,[100]6.9302,[101]6.9593,[102]6.9523,[103]6.9737,[104]6.9851,[105]6.9858,[106]7.0021,[107]6.9985,[108]7.0140,[109]7.0090,[110]7.0046,[111]7.0250,[112]7.0456,[113]7.0524,[114]7.0490,[115]7.0560,[116]7.0459,[117]7.0517,[118]7.0811,[119]7.1028,[120]7.1413,[121]7.1622,[122]7.1874,[123]7.2290,[124]7.2475,[125]7.2360,[126]7.2746,[127]7.3097,[128]7.3423,[129]7.3207,[130]7.3302,[131]7.3224,[132]7.3129,[133]7.3003,[134]7.3145,[135]7.3089,[136]7.2968,[137]7.2890,[138]7.2756,[139]7.2651,[140]7.2619,[141]7.2379,[142]7.2336,[143]7.2057,[144]7.1847,[145]7.1773,[146]7.1654,[147]7.1718,[148]7.1704,[149]7.1673,[150]7.1636,[151]7.1664,[152]7.1540,[153]7.1358,[154]7.1244,[155]7.1304,[156]7.1259,[157]7.1421,[158]7.1439,[159]7.1515,[160]7.1542,[161]7.1698,[162]7.1374,[163]7.1246,[164]7.0971,[165]7.0609,[166]7.0288,[167]6.9864,[168]6.9518,[169]6.9381,[170]6.9242,[171]6.8931,[172]6.8715,[173]6.8532,[174]6.8210,[175]6.7957,[176]6.7
799,[177]6.7599,[178]6.7353,[179]6.7163,[180]6.7042,[181]6.6793,[182]6.6581,[183]6.6411,[184]6.6389,[185]6.6304,[186]6.6321,[187]6.6376,[188]6.6348,[189]6.6532,[190]6.6555,[191]6.6786,[192]6.6938,[193]6.7137,[194]6.7257,[195]6.7503,[196]6.7668,[197]6.7887,[198]6.8067,[199]6.8101,[200]6.8141,[201]6.8096,[202]6.8342,[203]6.8424,[204]6.8470,[205]6.8581,[206]6.8657,[207]6.8619,[208]6.8726,[209]6.8780,[210]6.8816,[211]6.8934,[212]6.9011,[213]6.9121,[214]6.9185,[215]6.9231,[216]6.9367,[217]6.9563,[218]6.9708,[219]6.9702,[220]6.9650,[221]6.9601,[222]6.9569,[223]6.9457,[224]6.9389,[225]6.9346,[226]6.9554,[227]6.9661,[228]6.9727,[229]6.9772,[230]6.9739,[231]6.9931,[232]6.9831,[233]6.9635,[234]6.9467,[235]6.9311,[236]6.9226,[237]6.9113,[238]6.9156,[239]6.8994,[240]6.8880,[241]6.8921,[242]6.8965,[243]6.8932,[244]6.8800,[245]6.8776,[246]6.8658,[247]6.8522,[248]6.8445,[249]6.8409,[250]6.8471,[251]6.8399,[252]6.8348,[253]6.8244,[254]6.8182,[255]6.8047,[256]6.7833,[257]6.7706,[258]6.7620,[259]6.7591,[260]6.7503,[261]6.7464,[262]6.7397,[263]6.7344,[264]6.7181,[265]6.7174,[266]6.7143,[267]6.7061,[268]6.7162,[269]6.7142,[270]6.7137,[271]6.7213,[272]6.7260,[273]6.7251,[274]6.7284,[275]6.7389,[276]6.7449,[277]6.7640,[278]6.7748,[279]6.7843,[280]6.7867,[281]6.7962,[282]6.8019,[283]6.8184,[284]6.8264,[285]6.8361,[286]6.8503,[287]6.8506,[288]6.8577,[289]6.8476,[290]6.8306,[291]6.8143,[292]6.7983,[293]6.7837,[294]6.7858,[295]6.7836,[296]6.7889,[297]6.7877,[298]6.7916,[299]6.7881,[300]6.7765,[301]6.7751,[302]6.7663,[303]6.7569,[304]6.7478,[305]6.7439,[306]6.7292,[307]6.7309,[308]6.7356,[309]6.7174,[310]6.7112,[311]6.7044,[312]6.7065,[313]6.6999,[314]6.6979,[315]6.6804,[316]6.6768,[317]6.6583,[318]6.6363,[319]6.6513,[320]6.6646,[321]6.6703,[322]6.6661,[323]6.6598,[324]6.6586,[325]6.6711,[326]6.6704,[327]6.6734,[328]6.6757,[329]6.6834,[330]6.6872,[331]6.7003,[332]6.6962,[333]6.7041,[334]6.6979,[335]6.6901,[336]6.6922,[337]6.6888,[338]6.6896,[339]6.6829,[340]6.6784,[341]6.6871,[342]6.6901,[343
]6.6962,[344]6.6964,[345]6.6953,[346]6.6908,[347]6.6946,[348]6.6986,[349]6.6998,[350]6.6957,[351]6.6962,[352]6.6968,[353]6.6902,[354]6.6924,[355]6.6989,[356]6.7021,[357]6.6984,[358]6.7090,[359]6.7119,[360]6.7066,[361]6.7052,[362]6.7144,[363]6.7251,[364]6.7318,[365]6.7372,[366]6.7380,[367]6.7469,[368]6.7429,[369]6.7430,[370]6.7444,[371]6.7381,[372]6.7424,[373]6.7475,[374]6.7446,[375]6.7432,[376]6.7509,[377]6.7460,[378]6.7483,[379]6.7546,[380]6.7455,[381]6.7411,[382]6.7370,[383]6.7360,[384]6.7360,[385]6.7342,[386]6.7340,[387]6.7332,[388]6.7288,[389]6.7223,[390]6.7152,[391]6.7065,[392]6.7026,[393]6.7020,[394]6.7049,[395]6.7036,[396]6.6953,[397]6.7020,[398]6.7058,[399]6.7139,[400]6.7138,[401]6.7148,[402]6.7157,[403]6.7171,[404]6.7237,[405]6.7152,[406]6.7121,[407]6.7122,[408]6.7134,[409]6.7250,[410]6.7371,[411]6.7497,[412]6.7668,[413]6.7790,[414]6.7873,[415]6.7931,[416]6.8026,[417]6.8166,[418]6.8216,[419]6.8294,[420]6.8394,[421]6.8512,[422]6.8556,[423]6.8647,[424]6.8765,[425]6.8857,[426]6.8922,[427]6.8970,[428]6.9061,[429]6.9121,[430]6.9217,[431]6.9381,[432]6.9418,[433]6.9406,[434]6.9354,[435]6.9354,[436]6.9371,[437]6.9470,[438]6.9553,[439]6.9516,[440]6.9503,[441]6.9448,[442]6.9418,[443]6.9432,[444]6.9441,[445]6.9416,[446]6.9440,[447]6.9471,[448]6.9513,[449]6.9487,[450]6.9482,[451]6.9434,[452]6.9351,[453]6.9262,[454]6.9219,[455]6.9226,[456]6.9276,[457]6.9302,[458]6.9276,[459]6.9285,[460]6.9373,[461]6.9347,[462]6.9338,[463]6.9396,[464]6.9390,[465]6.9360,[466]6.9289,[467]6.9300,[468]6.9310,[469]6.9335,[470]6.9342,[471]6.9290,[472]6.9336,[473]6.9272,[474]6.9292,[475]6.9246,[476]6.9274,[477]6.9200,[478]6.9199,[479]6.9293,[480]6.9351,[481]6.9373,[482]6.9325,[483]6.9273,[484]6.9295,[485]6.9287,[486]6.9227,[487]6.9233,[488]6.9219,[489]6.9161,[490]6.9132,[491]6.9109,[492]6.9048,[493]6.9019,[494]6.9008,[495]6.9013,[496]6.8970,[497]6.8923,[498]6.8908,[499]6.8846,[500]6.8749,[501]6.8686,[502]6.8689,[503]6.8673,[504]6.8577,[505]6.8599,[506]6.8605,[507]6.8559,[508]6.8514,[509]6.8506,
[510]6.8548,[511]6.8601,[512]6.8632,[513]6.8650,[514]6.8722,[515]6.8664,[516]6.8648,[517]6.8661,[518]6.8655,[519]6.8687,[520]6.8713,[521]6.8724,[522]6.8757,[523]6.8762,[524]6.8826,[525]6.8860,[526]6.8879,[527]6.8899,[528]6.8851,[529]6.8861,[530]6.8799,[531]6.8775,[532]6.8826,[533]6.8846,[534]6.8821,[535]6.8854,[536]6.8789,[537]6.8761,[538]6.8816,[539]6.8823,[540]6.8880,[541]6.8893,[542]6.8899,[543]6.8916,[544]6.8923,[545]6.8906,[546]6.8919,[547]6.8868,[548]6.8803,[549]6.8806,[550]6.8770,[551]6.8732,[552]6.8708,[553]6.8665,[554]6.8638,[555]6.8593,[556]6.8589,[557]6.8627,[558]6.8587,[559]6.8583,[560]6.8578,[561]6.8580,[562]6.8555,[563]6.8559,[564]6.8605,[565]6.8629,[566]6.8624,[567]6.8599,[568]6.8593,[569]6.8572,[570]6.8601,[571]6.8603,[572]6.8609,[573]6.8598,[574]6.8565,[575]6.8570,[576]6.8572,[577]6.8550,[578]6.8537,[579]6.8547,[580]6.8475,[581]6.8430,[582]6.8412,[583]6.8415,[584]6.8412,[585]6.8333,[586]6.8264,[587]6.8268,[588]6.8321,[589]6.8388,[590]6.8420,[591]6.8438,[592]6.8424,[593]6.8377,[594]6.8379,[595]6.8352,[596]6.8387,[597]6.8353,[598]6.8323,[599]6.8340,[600]6.8332,[601]6.8313,[602]6.8341,[603]6.8370,[604]6.8381,[605]6.8409,[606]6.8429,[607]6.8417,[608]6.8373,[609]6.8375,[610]6.8410,[611]6.8390,[612]6.8421,[613]6.8380,[614]6.8332,[615]6.8241,[616]6.8281,[617]6.8211,[618]6.8153,[619]6.8084,[620]6.7923,[621]6.7843,[622]6.7822,[623]6.7841,[624]6.7839,[625]6.7838,[626]6.7822,[627]6.7851,[628]6.7848,[629]6.7844,[630]6.7879,[631]6.7937,[632]6.7994,[633]6.7975,[634]6.8018,[635]6.8029,[636]6.8007,[637]6.7977,[638]6.8010,[639]6.7979,[640]6.7991,[641]6.7990,[642]6.8064,[643]6.8086,[644]6.8094,[645]6.8069,[646]6.8121,[647]6.8090,[648]6.8099,[649]6.8094,[650]6.8144,[651]6.8197,[652]6.8211,[653]6.8252,[654]6.8175,[655]6.8160,
llama_print_timings: load time = 2658.59 ms
llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per run)
llama_print_timings: prompt eval time = 976009.34 ms / 335360 tokens ( 2.91 ms per token)
llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per run)
llama_print_timings: total time = 1004906.69 ms
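For comparing runs like the one above, a small parser can pull the numbers out of the log. This is a minimal sketch, assuming each `[i]value` entry is the running perplexity after chunk `i`, so that the last entry (`[655]` here) is the figure to quote for the run. The function names are hypothetical. Whitespace is stripped first because pasted logs are sometimes wrapped in the middle of a number.

```python
import re

def running_perplexities(log_text):
    """Return [(chunk_index, running_ppl), ...] parsed from '[i]value' entries."""
    # Remove all whitespace so entries wrapped mid-number are rejoined.
    joined = re.sub(r"\s+", "", log_text)
    return [(int(i), float(v)) for i, v in re.findall(r"\[(\d+)\](\d+\.\d+)", joined)]

def final_perplexity(log_text):
    """Running perplexity at the highest chunk index found in the log."""
    entries = running_perplexities(log_text)
    if not entries:
        raise ValueError("no perplexity entries found")
    return max(entries)[1]  # tuples compare by chunk index first

# Usage with fragments copied from the 7B, Q2_4 + Q5_1 run.
wrapped = "[175]6.7957,[176]6.7\n799,[177]6.7599,"
tail = "[654]6.8175,[655]6.8160,\nllama_print_timings: load time = 2658.59 ms"
```

The token count in the timing lines is consistent with the chunking reported by the tool: 655 chunks of 512 tokens is 335360 tokens.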
7B, Q3_4 + Q5_1
main: seed = 1682864367
llama.cpp: loading model from ../build/junk1.bin
llama_model_load_internal: format = ggjt v1 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 512
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 10 (mostly Q3_4)
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 59.11 KB
llama_model_load_internal: mem required = 5578.28 MB (+ 1026.00 MB per state)
llama_init_from_file: kv self size = 256.00 MB
system_info: n_threads = 16 / 32 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
perplexity : calculating perplexity over 655 chunks, batch_size=512
1.69 seconds per pass - ETA 18 minutes
[1]4.3772,[2]4.7902,[3]5.6644,[4]6.2926,[5]6.4094,[6]6.3682,[7]6.5670,[8]6.6689,[9]7.0093,[10]7.2537,[11]7.4667,[12]7.5133,[13]7.4469,[14]7.5018,[15]7.7581,[16]7.3727,[17]7.2507,[18]7.1917,[19]6.8302,[20]6.8186,[21]6.7251,[22]6.5571,[23]6.5252,[24]6.4336,[25]6.4301,[26]6.2650,[27]6.0943,[28]5.9983,[29]5.9072,[30]5.7545,[31]5.7284,[32]5.7503,[33]5.6977,[34]5.7261,[35]5.7495,[36]5.7812,[37]5.7814,[38]5.7930,[39]5.8255,[40]5.8782,[41]5.8900,[42]5.9274,[43]5.8861,[44]5.9426,[45]5.9455,[46]5.9212,[47]5.9424,[48]5.9169,[49]5.9180,[50]5.8766,[51]5.8735,[52]5.8628,[53]5.9067,[54]5.8907,[55]5.8671,[56]5.8970,[57]5.9166,[58]5.9371,[59]5.9534,[60]5.9951,[61]5.9885,[62]6.0481,[63]6.0787,[64]6.0918,[65]6.1345,[66]6.1431,[67]6.1623,[68]6.1757,[69]6.1978,[70]6.2297,[71]6.2521,[72]6.2830,[73]6.3437,[74]6.3470,[75]6.3602,[76]6.3750,[77]6.3875,[78]6.3715,[79]6.4006,[80]6.3955,[81]6.4061,[82]6.4121,[83]6.3619,[84]6.3428,[85]6.3290,[86]6.3068,[87]6.2470,[88]6.2224,[89]6.2023,[90]6.1887,[91]6.2120,[92]6.2071,[93]6.2058,[94]6.2024,[95]6.2297,[96]6.2300,[97]6.2245,[98]6.2201,[99]6.2056,[100]6.2044,[101]6.2303,[102]6.2271,[103]6.2465,[104]6.2540,[105]6.2524,[106]6.2693,[107]6.2680,[108]6.2817,[109]6.2775,[110]6.2736,[111]6.2939,[112]6.3155,[113]6.3180,[114]6.3130,[115]6.3189,[116]6.3090,[117]6.3133,[118]6.3412,[119]6.3632,[120]6.3973,[121]6.4124,[122]6.4367,[123]6.4732,[124]6.4919,[125]6.4823,[126]6.5207,[127]6.5572,[128]6.5871,[129]6.5713,[130]6.5796,[131]6.5760,[132]6.5691,[133]6.5555,[134]6.5644,[135]6.5596,[136]6.5491,[137]6.5408,[138]6.5234,[139]6.5126,[140]6.5101,[141]6.4830,[142]6.4794,[143]6.4498,[144]6.4283,[145]6.4197,[146]6.4070,[147]6.4104,[148]6.4101,[149]6.4047,[150]6.4007,[151]6.4037,[152]6.3940,[153]6.3789,[154]6.3707,[155]6.3774,[156]6.3727,[157]6.3896,[158]6.3943,[159]6.3981,[160]6.4011,[161]6.4143,[162]6.3863,[163]6.3744,[164]6.3508,[165]6.3203,[166]6.2935,[167]6.2558,[168]6.2253,[169]6.2117,[170]6.2011,[171]6.1750,[172]6.1585,[173]6.1421,[174]6.1119,[175]6.0898,[176]6.0
779,[177]6.0581,[178]6.0349,[179]6.0179,[180]6.0087,[181]5.9869,[182]5.9695,[183]5.9546,[184]5.9524,[185]5.9448,[186]5.9454,[187]5.9516,[188]5.9485,[189]5.9654,[190]5.9665,[191]5.9878,[192]6.0035,[193]6.0195,[194]6.0306,[195]6.0520,[196]6.0674,[197]6.0883,[198]6.1032,[199]6.1060,[200]6.1102,[201]6.1052,[202]6.1251,[203]6.1329,[204]6.1330,[205]6.1432,[206]6.1498,[207]6.1461,[208]6.1546,[209]6.1593,[210]6.1638,[211]6.1749,[212]6.1826,[213]6.1933,[214]6.1958,[215]6.1980,[216]6.2120,[217]6.2295,[218]6.2431,[219]6.2435,[220]6.2400,[221]6.2343,[222]6.2324,[223]6.2227,[224]6.2156,[225]6.2119,[226]6.2319,[227]6.2393,[228]6.2448,[229]6.2501,[230]6.2464,[231]6.2637,[232]6.2520,[233]6.2350,[234]6.2204,[235]6.2017,[236]6.1944,[237]6.1850,[238]6.1878,[239]6.1730,[240]6.1623,[241]6.1645,[242]6.1681,[243]6.1662,[244]6.1551,[245]6.1519,[246]6.1415,[247]6.1301,[248]6.1230,[249]6.1208,[250]6.1253,[251]6.1191,[252]6.1150,[253]6.1047,[254]6.0999,[255]6.0884,[256]6.0706,[257]6.0587,[258]6.0507,[259]6.0489,[260]6.0406,[261]6.0367,[262]6.0311,[263]6.0259,[264]6.0068,[265]6.0064,[266]6.0045,[267]5.9985,[268]6.0072,[269]6.0053,[270]6.0054,[271]6.0129,[272]6.0167,[273]6.0167,[274]6.0187,[275]6.0269,[276]6.0326,[277]6.0485,[278]6.0587,[279]6.0675,[280]6.0697,[281]6.0794,[282]6.0850,[283]6.0996,[284]6.1077,[285]6.1164,[286]6.1293,[287]6.1286,[288]6.1348,[289]6.1263,[290]6.1109,[291]6.0960,[292]6.0814,[293]6.0681,[294]6.0702,[295]6.0689,[296]6.0739,[297]6.0731,[298]6.0757,[299]6.0734,[300]6.0626,[301]6.0623,[302]6.0550,[303]6.0461,[304]6.0376,[305]6.0340,[306]6.0213,[307]6.0235,[308]6.0259,[309]6.0104,[310]6.0047,[311]5.9983,[312]6.0006,[313]5.9954,[314]5.9937,[315]5.9781,[316]5.9735,[317]5.9570,[318]5.9369,[319]5.9493,[320]5.9613,[321]5.9657,[322]5.9617,[323]5.9553,[324]5.9524,[325]5.9627,[326]5.9630,[327]5.9652,[328]5.9687,[329]5.9742,[330]5.9772,[331]5.9890,[332]5.9866,[333]5.9933,[334]5.9878,[335]5.9821,[336]5.9857,[337]5.9832,[338]5.9830,[339]5.9774,[340]5.9734,[341]5.9818,[342]5.9841,[343
]5.9892,[344]5.9893,[345]5.9891,[346]5.9866,[347]5.9903,[348]5.9941,[349]5.9966,[350]5.9933,[351]5.9943,[352]5.9944,[353]5.9882,[354]5.9888,[355]5.9939,[356]5.9971,[357]5.9933,[358]6.0026,[359]6.0051,[360]6.0016,[361]6.0011,[362]6.0076,[363]6.0186,[364]6.0244,[365]6.0288,[366]6.0300,[367]6.0383,[368]6.0357,[369]6.0366,[370]6.0381,[371]6.0325,[372]6.0374,[373]6.0423,[374]6.0407,[375]6.0407,[376]6.0479,[377]6.0431,[378]6.0457,[379]6.0517,[380]6.0436,[381]6.0401,[382]6.0350,[383]6.0341,[384]6.0336,[385]6.0324,[386]6.0319,[387]6.0321,[388]6.0285,[389]6.0233,[390]6.0163,[391]6.0085,[392]6.0043,[393]6.0027,[394]6.0054,[395]6.0045,[396]5.9971,[397]6.0038,[398]6.0070,[399]6.0144,[400]6.0143,[401]6.0162,[402]6.0175,[403]6.0195,[404]6.0260,[405]6.0171,[406]6.0138,[407]6.0137,[408]6.0151,[409]6.0260,[410]6.0373,[411]6.0488,[412]6.0648,[413]6.0758,[414]6.0837,[415]6.0893,[416]6.0976,[417]6.1097,[418]6.1132,[419]6.1196,[420]6.1286,[421]6.1404,[422]6.1437,[423]6.1507,[424]6.1612,[425]6.1700,[426]6.1762,[427]6.1805,[428]6.1889,[429]6.1939,[430]6.2021,[431]6.2158,[432]6.2192,[433]6.2186,[434]6.2143,[435]6.2147,[436]6.2173,[437]6.2269,[438]6.2344,[439]6.2311,[440]6.2303,[441]6.2254,[442]6.2239,[443]6.2248,[444]6.2251,[445]6.2235,[446]6.2256,[447]6.2287,[448]6.2331,[449]6.2309,[450]6.2317,[451]6.2277,[452]6.2160,[453]6.2077,[454]6.2022,[455]6.2032,[456]6.2077,[457]6.2097,[458]6.2074,[459]6.2078,[460]6.2164,[461]6.2138,[462]6.2124,[463]6.2159,[464]6.2147,[465]6.2119,[466]6.2042,[467]6.2045,[468]6.2042,[469]6.2063,[470]6.2065,[471]6.2018,[472]6.2062,[473]6.2008,[474]6.2022,[475]6.1968,[476]6.1979,[477]6.1909,[478]6.1900,[479]6.1963,[480]6.2009,[481]6.2029,[482]6.1985,[483]6.1947,[484]6.1965,[485]6.1950,[486]6.1894,[487]6.1893,[488]6.1870,[489]6.1822,[490]6.1798,[491]6.1769,[492]6.1713,[493]6.1685,[494]6.1666,[495]6.1657,[496]6.1621,[497]6.1566,[498]6.1552,[499]6.1510,[500]6.1418,[501]6.1353,[502]6.1357,[503]6.1348,[504]6.1260,[505]6.1281,[506]6.1291,[507]6.1237,[508]6.1196,[509]6.1191,
[510]6.1228,[511]6.1273,[512]6.1307,[513]6.1326,[514]6.1387,[515]6.1333,[516]6.1324,[517]6.1335,[518]6.1329,[519]6.1359,[520]6.1383,[521]6.1395,[522]6.1422,[523]6.1429,[524]6.1484,[525]6.1514,[526]6.1522,[527]6.1539,[528]6.1487,[529]6.1497,[530]6.1448,[531]6.1435,[532]6.1480,[533]6.1501,[534]6.1479,[535]6.1502,[536]6.1447,[537]6.1427,[538]6.1480,[539]6.1493,[540]6.1532,[541]6.1536,[542]6.1547,[543]6.1562,[544]6.1572,[545]6.1552,[546]6.1558,[547]6.1516,[548]6.1467,[549]6.1467,[550]6.1435,[551]6.1400,[552]6.1379,[553]6.1341,[554]6.1320,[555]6.1288,[556]6.1284,[557]6.1311,[558]6.1276,[559]6.1275,[560]6.1274,[561]6.1276,[562]6.1254,[563]6.1250,[564]6.1294,[565]6.1314,[566]6.1311,[567]6.1289,[568]6.1295,[569]6.1282,[570]6.1314,[571]6.1318,[572]6.1329,[573]6.1328,[574]6.1292,[575]6.1285,[576]6.1285,[577]6.1270,[578]6.1253,[579]6.1258,[580]6.1193,[581]6.1157,[582]6.1148,[583]6.1157,[584]6.1159,[585]6.1085,[586]6.1019,[587]6.1027,[588]6.1075,[589]6.1127,[590]6.1157,[591]6.1177,[592]6.1163,[593]6.1129,[594]6.1139,[595]6.1116,[596]6.1149,[597]6.1126,[598]6.1096,[599]6.1119,[600]6.1112,[601]6.1097,[602]6.1113,[603]6.1138,[604]6.1148,[605]6.1181,[606]6.1204,[607]6.1190,[608]6.1156,[609]6.1163,[610]6.1198,[611]6.1182,[612]6.1210,[613]6.1173,[614]6.1121,[615]6.1047,[616]6.1072,[617]6.1011,[618]6.0962,[619]6.0906,[620]6.0769,[621]6.0702,[622]6.0685,[623]6.0700,[624]6.0706,[625]6.0708,[626]6.0697,[627]6.0722,[628]6.0721,[629]6.0717,[630]6.0749,[631]6.0804,[632]6.0860,[633]6.0846,[634]6.0879,[635]6.0883,[636]6.0857,[637]6.0821,[638]6.0845,[639]6.0814,[640]6.0823,[641]6.0823,[642]6.0888,[643]6.0912,[644]6.0926,[645]6.0911,[646]6.0953,[647]6.0917,[648]6.0926,[649]6.0928,[650]6.0968,[651]6.1019,[652]6.1030,[653]6.1068,[654]6.1003,[655]6.0996,
llama_print_timings: load time = 2722.23 ms
llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per run)
llama_print_timings: prompt eval time = 1024717.17 ms / 335360 tokens ( 3.06 ms per token)
llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per run)
llama_print_timings: total time = 1052144.49 ms
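Both 7B runs above use a low-bit base type for most tensors and `Q5_1` for `feed_forward.w2` and `attention.wv`. The selection rule can be sketched as follows. This is not the actual change made in `llama.cpp`, which is not shown in the thread; the function name, the string type labels, and the example tensor names (`layers.N.<tensor>.weight`) are assumptions for illustration. Whether `output.weight` stayed `fp16` in these runs is not stated, so it is exposed as a parameter.

```python
def pick_quant_type(tensor_name, base_type,
                    high_type="Q5_1",
                    high_precision=("feed_forward.w2", "attention.wv"),
                    keep_fp16=("output.weight",)):
    """Return a type label for one tensor (hypothetical helper, labels are plain strings)."""
    if tensor_name in keep_fp16:
        return "F16"                      # left unquantized
    if any(key in tensor_name for key in high_precision):
        return high_type                  # the two most sensitive tensor types
    return base_type                      # everything else gets the low-bit type

# Usage: assumed tensor names for layer 31 of a 7B model.
names = [
    "tok_embeddings.weight",
    "layers.31.attention.wq.weight",
    "layers.31.attention.wv.weight",
    "layers.31.feed_forward.w2.weight",
    "output.weight",
]
plan = {name: pick_quant_type(name, "Q3_4") for name in names}
```

Restricting `high_precision` to the last few layers, as suggested in the issue text, would only require a check on the layer index in the tensor name.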
13B, Q2_4 + Q5_1
main: seed = 1682869318
llama.cpp: loading model from ../build/junk3.bin
llama_model_load_internal: format = ggjt v1 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 512
llama_model_load_internal: n_embd = 5120
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 40
llama_model_load_internal: n_layer = 40
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 15 (mostly Q2_4)
llama_model_load_internal: n_ff = 13824
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 13B
llama_model_load_internal: ggml ctx size = 73.73 KB
llama_model_load_internal: mem required = 8284.13 MB (+ 1608.00 MB per state)
llama_init_from_file: kv self size = 400.00 MB
system_info: n_threads = 16 / 32 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
perplexity : calculating perplexity over 655 chunks, batch_size=512
2.66 seconds per pass - ETA 29 minutes
[1]3.9721,[2]4.4565,[3]5.2597,[4]5.9033,[5]6.0338,[6]5.9622,[7]6.1601,[8]6.2656,[9]6.5236,[10]6.7829,[11]7.0013,[12]7.0990,[13]7.0476,[14]7.1859,[15]7.4172,[16]7.0334,[17]6.9135,[18]6.8786,[19]6.5400,[20]6.5070,[21]6.4154,[22]6.2361,[23]6.1969,[24]6.0978,[25]6.1050,[26]5.9363,[27]5.7476,[28]5.6483,[29]5.5673,[30]5.4279,[31]5.3936,[32]5.4050,[33]5.3676,[34]5.4138,[35]5.4320,[36]5.4581,[37]5.4515,[38]5.4411,[39]5.4719,[40]5.5259,[41]5.5538,[42]5.5976,[43]5.5570,[44]5.6013,[45]5.6034,[46]5.5709,[47]5.5914,[48]5.5665,[49]5.5732,[50]5.5425,[51]5.5525,[52]5.5445,[53]5.5919,[54]5.5826,[55]5.5603,[56]5.5833,[57]5.6051,[58]5.6318,[59]5.6509,[60]5.6875,[61]5.6795,[62]5.7390,[63]5.7665,[64]5.7749,[65]5.8128,[66]5.8121,[67]5.8288,[68]5.8395,[69]5.8698,[70]5.9008,[71]5.9260,[72]5.9637,[73]6.0136,[74]6.0186,[75]6.0287,[76]6.0452,[77]6.0630,[78]6.0533,[79]6.0811,[80]6.0773,[81]6.0940,[82]6.0916,[83]6.0430,[84]6.0330,[85]6.0263,[86]6.0075,[87]5.9484,[88]5.9136,[89]5.8907,[90]5.8817,[91]5.9066,[92]5.9045,[93]5.9073,[94]5.9044,[95]5.9318,[96]5.9287,[97]5.9266,[98]5.9230,[99]5.9158,[100]5.9112,[101]5.9382,[102]5.9320,[103]5.9477,[104]5.9492,[105]5.9504,[106]5.9664,[107]5.9629,[108]5.9788,[109]5.9759,[110]5.9701,[111]5.9873,[112]6.0048,[113]6.0053,[114]6.0034,[115]6.0066,[116]5.9956,[117]5.9974,[118]6.0221,[119]6.0409,[120]6.0689,[121]6.0859,[122]6.1071,[123]6.1462,[124]6.1624,[125]6.1543,[126]6.1905,[127]6.2246,[128]6.2549,[129]6.2404,[130]6.2488,[131]6.2446,[132]6.2396,[133]6.2282,[134]6.2393,[135]6.2381,[136]6.2292,[137]6.2257,[138]6.2107,[139]6.2011,[140]6.2007,[141]6.1759,[142]6.1720,[143]6.1485,[144]6.1310,[145]6.1232,[146]6.1096,[147]6.1134,[148]6.1171,[149]6.1134,[150]6.1137,[151]6.1188,[152]6.1085,[153]6.0971,[154]6.0912,[155]6.0971,[156]6.0959,[157]6.1130,[158]6.1163,[159]6.1179,[160]6.1222,[161]6.1343,[162]6.1058,[163]6.0948,[164]6.0712,[165]6.0433,[166]6.0157,[167]5.9800,[168]5.9506,[169]5.9372,[170]5.9281,[171]5.9059,[172]5.8906,[173]5.8760,[174]5.8466,[175]5.8240,[176]5.8
097,[177]5.7904,[178]5.7676,[179]5.7527,[180]5.7448,[181]5.7262,[182]5.7087,[183]5.6951,[184]5.6934,[185]5.6880,[186]5.6896,[187]5.6955,[188]5.6950,[189]5.7125,[190]5.7137,[191]5.7318,[192]5.7454,[193]5.7611,[194]5.7725,[195]5.7933,[196]5.8061,[197]5.8252,[198]5.8379,[199]5.8412,[200]5.8429,[201]5.8373,[202]5.8553,[203]5.8624,[204]5.8614,[205]5.8712,[206]5.8764,[207]5.8730,[208]5.8796,[209]5.8831,[210]5.8888,[211]5.8998,[212]5.9059,[213]5.9151,[214]5.9187,[215]5.9211,[216]5.9327,[217]5.9480,[218]5.9628,[219]5.9616,[220]5.9575,[221]5.9532,[222]5.9522,[223]5.9451,[224]5.9380,[225]5.9342,[226]5.9546,[227]5.9632,[228]5.9704,[229]5.9768,[230]5.9741,[231]5.9897,[232]5.9793,[233]5.9619,[234]5.9460,[235]5.9269,[236]5.9199,[237]5.9101,[238]5.9136,[239]5.9005,[240]5.8905,[241]5.8933,[242]5.8951,[243]5.8940,[244]5.8841,[245]5.8807,[246]5.8702,[247]5.8600,[248]5.8524,[249]5.8490,[250]5.8530,[251]5.8444,[252]5.8398,[253]5.8301,[254]5.8258,[255]5.8145,[256]5.7966,[257]5.7849,[258]5.7773,[259]5.7751,[260]5.7663,[261]5.7617,[262]5.7569,[263]5.7506,[264]5.7336,[265]5.7324,[266]5.7288,[267]5.7218,[268]5.7296,[269]5.7292,[270]5.7288,[271]5.7352,[272]5.7392,[273]5.7404,[274]5.7412,[275]5.7480,[276]5.7552,[277]5.7693,[278]5.7775,[279]5.7851,[280]5.7886,[281]5.7987,[282]5.8037,[283]5.8164,[284]5.8253,[285]5.8337,[286]5.8472,[287]5.8436,[288]5.8492,[289]5.8424,[290]5.8276,[291]5.8144,[292]5.8000,[293]5.7865,[294]5.7876,[295]5.7878,[296]5.7926,[297]5.7915,[298]5.7945,[299]5.7927,[300]5.7833,[301]5.7830,[302]5.7765,[303]5.7683,[304]5.7593,[305]5.7559,[306]5.7445,[307]5.7467,[308]5.7477,[309]5.7330,[310]5.7289,[311]5.7237,[312]5.7254,[313]5.7193,[314]5.7175,[315]5.7028,[316]5.7002,[317]5.6864,[318]5.6684,[319]5.6795,[320]5.6913,[321]5.6957,[322]5.6918,[323]5.6866,[324]5.6844,[325]5.6958,[326]5.6973,[327]5.6980,[328]5.7006,[329]5.7052,[330]5.7076,[331]5.7186,[332]5.7145,[333]5.7218,[334]5.7162,[335]5.7112,[336]5.7145,[337]5.7124,[338]5.7122,[339]5.7075,[340]5.7049,[341]5.7114,[342]5.7142,[343
]5.7184,[344]5.7186,[345]5.7190,[346]5.7163,[347]5.7199,[348]5.7239,[349]5.7266,[350]5.7247,[351]5.7257,[352]5.7258,[353]5.7197,[354]5.7213,[355]5.7262,[356]5.7298,[357]5.7267,[358]5.7352,[359]5.7368,[360]5.7326,[361]5.7319,[362]5.7393,[363]5.7502,[364]5.7557,[365]5.7603,[366]5.7622,[367]5.7718,[368]5.7690,[369]5.7704,[370]5.7726,[371]5.7684,[372]5.7738,[373]5.7777,[374]5.7762,[375]5.7754,[376]5.7821,[377]5.7776,[378]5.7799,[379]5.7845,[380]5.7770,[381]5.7744,[382]5.7703,[383]5.7691,[384]5.7695,[385]5.7677,[386]5.7658,[387]5.7652,[388]5.7614,[389]5.7571,[390]5.7516,[391]5.7449,[392]5.7416,[393]5.7411,[394]5.7436,[395]5.7422,[396]5.7362,[397]5.7439,[398]5.7481,[399]5.7549,[400]5.7535,[401]5.7535,[402]5.7548,[403]5.7576,[404]5.7632,[405]5.7513,[406]5.7473,[407]5.7472,[408]5.7478,[409]5.7592,[410]5.7692,[411]5.7790,[412]5.7942,[413]5.8061,[414]5.8126,[415]5.8184,[416]5.8260,[417]5.8361,[418]5.8395,[419]5.8448,[420]5.8530,[421]5.8638,[422]5.8671,[423]5.8735,[424]5.8834,[425]5.8915,[426]5.8979,[427]5.9019,[428]5.9094,[429]5.9138,[430]5.9201,[431]5.9333,[432]5.9357,[433]5.9345,[434]5.9307,[435]5.9313,[436]5.9338,[437]5.9427,[438]5.9507,[439]5.9471,[440]5.9469,[441]5.9426,[442]5.9404,[443]5.9406,[444]5.9423,[445]5.9409,[446]5.9426,[447]5.9445,[448]5.9481,[449]5.9461,[450]5.9467,[451]5.9429,[452]5.9315,[453]5.9228,[454]5.9169,[455]5.9165,[456]5.9207,[457]5.9221,[458]5.9201,[459]5.9199,[460]5.9272,[461]5.9232,[462]5.9207,[463]5.9215,[464]5.9202,[465]5.9184,[466]5.9109,[467]5.9116,[468]5.9107,[469]5.9121,[470]5.9116,[471]5.9063,[472]5.9090,[473]5.9043,[474]5.9051,[475]5.8991,[476]5.8992,[477]5.8916,[478]5.8895,[479]5.8934,[480]5.8974,[481]5.8984,[482]5.8939,[483]5.8903,[484]5.8919,[485]5.8881,[486]5.8823,[487]5.8820,[488]5.8798,[489]5.8753,[490]5.8725,[491]5.8696,[492]5.8638,[493]5.8606,[494]5.8583,[495]5.8567,[496]5.8530,[497]5.8476,[498]5.8459,[499]5.8416,[500]5.8336,[501]5.8267,[502]5.8268,[503]5.8254,[504]5.8169,[505]5.8170,[506]5.8175,[507]5.8126,[508]5.8087,[509]5.8087,
[510]5.8112,[511]5.8161,[512]5.8199,[513]5.8225,[514]5.8285,[515]5.8238,[516]5.8229,[517]5.8235,[518]5.8235,[519]5.8256,[520]5.8277,[521]5.8293,[522]5.8309,[523]5.8315,[524]5.8376,[525]5.8404,[526]5.8412,[527]5.8427,[528]5.8372,[529]5.8381,[530]5.8332,[531]5.8322,[532]5.8378,[533]5.8404,[534]5.8386,[535]5.8415,[536]5.8367,[537]5.8347,[538]5.8397,[539]5.8405,[540]5.8437,[541]5.8446,[542]5.8459,[543]5.8479,[544]5.8487,[545]5.8477,[546]5.8480,[547]5.8440,[548]5.8391,[549]5.8387,[550]5.8373,[551]5.8341,[552]5.8323,[553]5.8286,[554]5.8264,[555]5.8236,[556]5.8227,[557]5.8249,[558]5.8214,[559]5.8214,[560]5.8201,[561]5.8207,[562]5.8182,[563]5.8181,[564]5.8226,[565]5.8238,[566]5.8239,[567]5.8217,[568]5.8222,[569]5.8203,[570]5.8232,[571]5.8238,[572]5.8245,[573]5.8244,[574]5.8218,[575]5.8204,[576]5.8199,[577]5.8177,[578]5.8157,[579]5.8151,[580]5.8094,[581]5.8061,[582]5.8056,[583]5.8061,[584]5.8063,[585]5.8000,[586]5.7940,[587]5.7944,[588]5.7986,[589]5.8039,[590]5.8066,[591]5.8080,[592]5.8069,[593]5.8037,[594]5.8046,[595]5.8026,[596]5.8064,[597]5.8039,[598]5.8005,[599]5.8030,[600]5.8020,[601]5.8005,[602]5.8020,[603]5.8043,[604]5.8051,[605]5.8076,[606]5.8089,[607]5.8077,[608]5.8045,[609]5.8050,[610]5.8095,[611]5.8080,[612]5.8100,[613]5.8066,[614]5.8025,[615]5.7953,[616]5.7980,[617]5.7922,[618]5.7870,[619]5.7819,[620]5.7695,[621]5.7639,[622]5.7617,[623]5.7633,[624]5.7634,[625]5.7641,[626]5.7634,[627]5.7661,[628]5.7669,[629]5.7676,[630]5.7704,[631]5.7748,[632]5.7803,[633]5.7787,[634]5.7821,[635]5.7818,[636]5.7784,[637]5.7751,[638]5.7773,[639]5.7740,[640]5.7749,[641]5.7755,[642]5.7812,[643]5.7830,[644]5.7850,[645]5.7839,[646]5.7875,[647]5.7830,[648]5.7841,[649]5.7844,[650]5.7874,[651]5.7912,[652]5.7914,[653]5.7952,[654]5.7891,[655]5.7880,
llama_print_timings: load time = 3757.20 ms
llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per run)
llama_print_timings: prompt eval time = 1660958.51 ms / 335360 tokens ( 4.95 ms per token)
llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per run)
llama_print_timings: total time = 1690237.58 ms
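As a rough check on how much of each layer moves to the higher bit rate in these mixed runs, the share of per-layer weights held by `attention.wv` and `feed_forward.w2` follows from `n_embd` and `n_ff` in the logs. The sketch below assumes the usual LLaMA layer layout (four `n_embd x n_embd` attention matrices and three `n_embd x n_ff` feed-forward matrices) and ignores `tok_embeddings`, `output.weight`, and the norm tensors. The bits-per-weight arguments are placeholders supplied by the caller, not the actual figures for `Q2_4`, `Q3_4`, or `Q5_1`.

```python
def high_precision_fraction(n_embd, n_ff):
    """Fraction of per-layer weights in attention.wv plus feed_forward.w2."""
    attention = 4 * n_embd * n_embd        # wq, wk, wv, wo
    feed_forward = 3 * n_embd * n_ff       # w1, w2, w3
    high = n_embd * n_embd + n_embd * n_ff # wv + w2
    return high / (attention + feed_forward)

def mixed_bits_per_weight(fraction_high, base_bpw, high_bpw):
    """Weighted average bits per weight for a two-type mix (inputs are caller-supplied)."""
    return (1.0 - fraction_high) * base_bpw + fraction_high * high_bpw

# Usage with n_embd / n_ff taken from the 7B and 13B logs.
frac_7b = high_precision_fraction(4096, 11008)
frac_13b = high_precision_fraction(5120, 13824)
```

For both model sizes the fraction comes out slightly above 30%, so under these assumptions the two tensor types account for a bit less than a third of the layer weights.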
13B, Q3_4 + Q5_1
main: seed = 1682871034
llama.cpp: loading model from ../build/junk2.bin
llama_model_load_internal: format = ggjt v1 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 512
llama_model_load_internal: n_embd = 5120
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 40
llama_model_load_internal: n_layer = 40
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 10 (mostly Q3_4)
llama_model_load_internal: n_ff = 13824
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 13B
llama_model_load_internal: ggml ctx size = 73.73 KB
llama_model_load_internal: mem required = 9353.66 MB (+ 1608.00 MB per state)
llama_init_from_file: kv self size = 400.00 MB
system_info: n_threads = 16 / 32 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
perplexity : calculating perplexity over 655 chunks, batch_size=512
2.78 seconds per pass - ETA 30 minutes
[1]3.8638,[2]4.2438,[3]5.0285,[4]5.4691,[5]5.6500,[6]5.5749,[7]5.7279,[8]5.8483,[9]6.0969,[10]6.3309,[11]6.5275,[12]6.5851,[13]6.5289,[14]6.6307,[15]6.8287,[16]6.5065,[17]6.4160,[18]6.3973,[19]6.1030,[20]6.0757,[21]6.0022,[22]5.8315,[23]5.8015,[24]5.7102,[25]5.7213,[26]5.5727,[27]5.3968,[28]5.2983,[29]5.2181,[30]5.0845,[31]5.0477,[32]5.0637,[33]5.0127,[34]5.0546,[35]5.0779,[36]5.0973,[37]5.0901,[38]5.0859,[39]5.1136,[40]5.1553,[41]5.1784,[42]5.2145,[43]5.1772,[44]5.2185,[45]5.2210,[46]5.1948,[47]5.2217,[48]5.2018,[49]5.2068,[50]5.1752,[51]5.1815,[52]5.1747,[53]5.2203,[54]5.2098,[55]5.1910,[56]5.2134,[57]5.2298,[58]5.2522,[59]5.2691,[60]5.3031,[61]5.2955,[62]5.3497,[63]5.3745,[64]5.3859,[65]5.4235,[66]5.4215,[67]5.4387,[68]5.4507,[69]5.4795,[70]5.5091,[71]5.5318,[72]5.5665,[73]5.6155,[74]5.6227,[75]5.6330,[76]5.6482,[77]5.6602,[78]5.6456,[79]5.6718,[80]5.6661,[81]5.6766,[82]5.6737,[83]5.6292,[84]5.6184,[85]5.6138,[86]5.5979,[87]5.5348,[88]5.4920,[89]5.4704,[90]5.4603,[91]5.4818,[92]5.4778,[93]5.4791,[94]5.4792,[95]5.5064,[96]5.5025,[97]5.4993,[98]5.4960,[99]5.4891,[100]5.4861,[101]5.5101,[102]5.5053,[103]5.5205,[104]5.5248,[105]5.5265,[106]5.5405,[107]5.5396,[108]5.5541,[109]5.5528,[110]5.5473,[111]5.5658,[112]5.5820,[113]5.5806,[114]5.5798,[115]5.5831,[116]5.5710,[117]5.5707,[118]5.5948,[119]5.6120,[120]5.6409,[121]5.6560,[122]5.6767,[123]5.7139,[124]5.7325,[125]5.7269,[126]5.7621,[127]5.7945,[128]5.8224,[129]5.8098,[130]5.8178,[131]5.8138,[132]5.8098,[133]5.7973,[134]5.8061,[135]5.8063,[136]5.7966,[137]5.7923,[138]5.7782,[139]5.7695,[140]5.7679,[141]5.7406,[142]5.7374,[143]5.7121,[144]5.6965,[145]5.6883,[146]5.6761,[147]5.6809,[148]5.6838,[149]5.6807,[150]5.6801,[151]5.6846,[152]5.6780,[153]5.6682,[154]5.6629,[155]5.6696,[156]5.6674,[157]5.6831,[158]5.6844,[159]5.6853,[160]5.6893,[161]5.7012,[162]5.6757,[163]5.6658,[164]5.6443,[165]5.6185,[166]5.5949,[167]5.5631,[168]5.5362,[169]5.5232,[170]5.5140,[171]5.4926,[172]5.4804,[173]5.4673,[174]5.4398,[175]5.4193,
[176]5.4058,[177]5.3890,[178]5.3683,[179]5.3554,[180]5.3479,[181]5.3314,[182]5.3149,[183]5.3022,[184]5.3012,[185]5.2933,[186]5.2949,[187]5.3005,[188]5.2977,[189]5.3145,[190]5.3146,[191]5.3317,[192]5.3451,[193]5.3599,[194]5.3711,[195]5.3902,[196]5.4024,[197]5.4218,[198]5.4356,[199]5.4370,[200]5.4385,[201]5.4321,[202]5.4455,[203]5.4518,[204]5.4465,[205]5.4555,[206]5.4606,[207]5.4563,[208]5.4615,[209]5.4655,[210]5.4714,[211]5.4821,[212]5.4887,[213]5.4977,[214]5.5005,[215]5.5040,[216]5.5156,[217]5.5323,[218]5.5463,[219]5.5465,[220]5.5432,[221]5.5386,[222]5.5383,[223]5.5318,[224]5.5251,[225]5.5218,[226]5.5412,[227]5.5471,[228]5.5546,[229]5.5616,[230]5.5581,[231]5.5732,[232]5.5628,[233]5.5476,[234]5.5337,[235]5.5123,[236]5.5070,[237]5.4979,[238]5.5009,[239]5.4895,[240]5.4804,[241]5.4834,[242]5.4855,[243]5.4848,[244]5.4748,[245]5.4715,[246]5.4611,[247]5.4513,[248]5.4448,[249]5.4415,[250]5.4449,[251]5.4366,[252]5.4321,[253]5.4228,[254]5.4190,[255]5.4096,[256]5.3932,[257]5.3831,[258]5.3762,[259]5.3759,[260]5.3675,[261]5.3624,[262]5.3579,[263]5.3528,[264]5.3296,[265]5.3294,[266]5.3264,[267]5.3203,[268]5.3272,[269]5.3268,[270]5.3278,[271]5.3341,[272]5.3373,[273]5.3387,[274]5.3400,[275]5.3461,[276]5.3526,[277]5.3652,[278]5.3739,[279]5.3822,[280]5.3861,[281]5.3953,[282]5.4006,[283]5.4136,[284]5.4226,[285]5.4297,[286]5.4423,[287]5.4388,[288]5.4440,[289]5.4380,[290]5.4237,[291]5.4105,[292]5.3974,[293]5.3855,[294]5.3862,[295]5.3865,[296]5.3913,[297]5.3903,[298]5.3920,[299]5.3896,[300]5.3805,[301]5.3810,[302]5.3746,[303]5.3659,[304]5.3587,[305]5.3564,[306]5.3456,[307]5.3488,[308]5.3494,[309]5.3358,[310]5.3325,[311]5.3280,[312]5.3295,[313]5.3237,[314]5.3223,[315]5.3092,[316]5.3057,[317]5.2930,[318]5.2766,[319]5.2871,[320]5.2986,[321]5.3035,[322]5.3005,[323]5.2960,[324]5.2941,[325]5.3036,[326]5.3052,[327]5.3061,[328]5.3092,[329]5.3144,[330]5.3167,[331]5.3270,[332]5.3231,[333]5.3307,[334]5.3259,[335]5.3205,[336]5.3228,[337]5.3216,[338]5.3212,[339]5.3170,[340]5.3142,[341]5.3207,[342]5.3238,
[343]5.3283,[344]5.3288,[345]5.3303,[346]5.3287,[347]5.3323,[348]5.3360,[349]5.3380,[350]5.3362,[351]5.3372,[352]5.3372,[353]5.3320,[354]5.3332,[355]5.3378,[356]5.3408,[357]5.3376,[358]5.3456,[359]5.3477,[360]5.3443,[361]5.3442,[362]5.3513,[363]5.3619,[364]5.3670,[365]5.3709,[366]5.3726,[367]5.3815,[368]5.3790,[369]5.3802,[370]5.3821,[371]5.3781,[372]5.3829,[373]5.3872,[374]5.3851,[375]5.3848,[376]5.3909,[377]5.3874,[378]5.3900,[379]5.3941,[380]5.3870,[381]5.3839,[382]5.3800,[383]5.3783,[384]5.3780,[385]5.3769,[386]5.3759,[387]5.3759,[388]5.3730,[389]5.3694,[390]5.3640,[391]5.3582,[392]5.3547,[393]5.3541,[394]5.3571,[395]5.3565,[396]5.3512,[397]5.3575,[398]5.3618,[399]5.3687,[400]5.3677,[401]5.3684,[402]5.3694,[403]5.3719,[404]5.3775,[405]5.3626,[406]5.3584,[407]5.3576,[408]5.3585,[409]5.3694,[410]5.3786,[411]5.3883,[412]5.4022,[413]5.4125,[414]5.4185,[415]5.4245,[416]5.4314,[417]5.4411,[418]5.4435,[419]5.4482,[420]5.4558,[421]5.4656,[422]5.4689,[423]5.4741,[424]5.4831,[425]5.4906,[426]5.4966,[427]5.5008,[428]5.5078,[429]5.5114,[430]5.5176,[431]5.5305,[432]5.5337,[433]5.5328,[434]5.5292,[435]5.5305,[436]5.5332,[437]5.5416,[438]5.5489,[439]5.5459,[440]5.5453,[441]5.5407,[442]5.5392,[443]5.5403,[444]5.5420,[445]5.5411,[446]5.5430,[447]5.5452,[448]5.5483,[449]5.5468,[450]5.5479,[451]5.5449,[452]5.5295,[453]5.5197,[454]5.5145,[455]5.5148,[456]5.5189,[457]5.5200,[458]5.5180,[459]5.5180,[460]5.5252,[461]5.5215,[462]5.5181,[463]5.5165,[464]5.5162,[465]5.5143,[466]5.5069,[467]5.5058,[468]5.5037,[469]5.5049,[470]5.5041,[471]5.4993,[472]5.5002,[473]5.4956,[474]5.4943,[475]5.4876,[476]5.4859,[477]5.4775,[478]5.4748,[479]5.4757,[480]5.4784,[481]5.4786,[482]5.4740,[483]5.4698,[484]5.4706,[485]5.4645,[486]5.4580,[487]5.4569,[488]5.4541,[489]5.4487,[490]5.4458,[491]5.4424,[492]5.4359,[493]5.4332,[494]5.4314,[495]5.4290,[496]5.4250,[497]5.4188,[498]5.4162,[499]5.4126,[500]5.4045,[501]5.3978,[502]5.3970,[503]5.3960,[504]5.3887,[505]5.3887,[506]5.3894,[507]5.3838,[508]5.3802,[509]5.3806,
[510]5.3827,[511]5.3868,[512]5.3907,[513]5.3932,[514]5.3987,[515]5.3948,[516]5.3937,[517]5.3936,[518]5.3936,[519]5.3960,[520]5.3972,[521]5.3983,[522]5.4000,[523]5.4006,[524]5.4058,[525]5.4084,[526]5.4089,[527]5.4106,[528]5.4052,[529]5.4058,[530]5.4021,[531]5.4018,[532]5.4066,[533]5.4091,[534]5.4073,[535]5.4099,[536]5.4055,[537]5.4036,[538]5.4085,[539]5.4093,[540]5.4112,[541]5.4111,[542]5.4125,[543]5.4145,[544]5.4157,[545]5.4145,[546]5.4147,[547]5.4114,[548]5.4073,[549]5.4074,[550]5.4054,[551]5.4028,[552]5.4010,[553]5.3980,[554]5.3957,[555]5.3938,[556]5.3928,[557]5.3943,[558]5.3909,[559]5.3915,[560]5.3903,[561]5.3906,[562]5.3881,[563]5.3879,[564]5.3921,[565]5.3932,[566]5.3936,[567]5.3919,[568]5.3927,[569]5.3912,[570]5.3937,[571]5.3951,[572]5.3961,[573]5.3968,[574]5.3937,[575]5.3920,[576]5.3915,[577]5.3898,[578]5.3881,[579]5.3883,[580]5.3830,[581]5.3801,[582]5.3801,[583]5.3810,[584]5.3814,[585]5.3756,[586]5.3698,[587]5.3703,[588]5.3746,[589]5.3796,[590]5.3827,[591]5.3843,[592]5.3831,[593]5.3792,[594]5.3805,[595]5.3790,[596]5.3829,[597]5.3809,[598]5.3778,[599]5.3806,[600]5.3795,[601]5.3784,[602]5.3789,[603]5.3818,[604]5.3826,[605]5.3855,[606]5.3871,[607]5.3854,[608]5.3826,[609]5.3834,[610]5.3875,[611]5.3863,[612]5.3887,[613]5.3859,[614]5.3819,[615]5.3760,[616]5.3784,[617]5.3734,[618]5.3688,[619]5.3644,[620]5.3533,[621]5.3482,[622]5.3463,[623]5.3477,[624]5.3481,[625]5.3489,[626]5.3484,[627]5.3511,[628]5.3519,[629]5.3525,[630]5.3553,[631]5.3598,[632]5.3644,[633]5.3633,[634]5.3663,[635]5.3660,[636]5.3623,[637]5.3585,[638]5.3606,[639]5.3574,[640]5.3579,[641]5.3584,[642]5.3637,[643]5.3654,[644]5.3676,[645]5.3662,[646]5.3699,[647]5.3651,[648]5.3664,[649]5.3667,[650]5.3695,[651]5.3737,[652]5.3742,[653]5.3779,[654]5.3724,[655]5.3715,
llama_print_timings: load time = 3902.02 ms
llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per run)
llama_print_timings: prompt eval time = 1737619.18 ms / 335360 tokens ( 5.18 ms per token)
llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per run)
llama_print_timings: total time = 1764928.09 ms
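For anyone comparing runs like the ones above, a small script can pull the final running perplexity and the per-token prompt-eval time out of a pasted log. This is only a minimal sketch: the function names `parse_running_perplexity` and `ms_per_token` are hypothetical helpers (not part of llama.cpp), and it assumes the log uses the `[chunk]value,` and `... ms / N tokens` shapes shown in this thread, with each value on a single line.

```python
import re

def parse_running_perplexity(text):
    """Return (chunk_index, running_perplexity) pairs found in a log.

    Assumes the "[n]value," shape seen in the pasted output; a value
    that is broken across two lines would be misread.
    """
    return [(int(i), float(v)) for i, v in re.findall(r"\[(\d+)\](\d+\.\d+)", text)]

def ms_per_token(timing_line):
    """Recompute ms/token from a 'prompt eval time = X ms / N tokens' line."""
    m = re.search(r"=\s*([\d.]+)\s*ms\s*/\s*(\d+)\s*tokens", timing_line)
    if m is None:
        raise ValueError("not a timing line with a token count")
    return float(m.group(1)) / int(m.group(2))

# Tail of the 13B Q3_4 + Q5_1 run, copied from the log above.
tail = "[653]5.3779,[654]5.3724,[655]5.3715,"
timing = ("llama_print_timings: prompt eval time = 1737619.18 ms "
          "/ 335360 tokens ( 5.18 ms per token)")

last_chunk, final_ppl = parse_running_perplexity(tail)[-1]
print(last_chunk, final_ppl)            # chunk count and final perplexity
print(round(ms_per_token(timing), 2))   # ms per token, recomputed
```

The token count in the timing line is consistent with the run header: 655 chunks at a context of 512 tokens gives 655 × 512 = 335360 tokens.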