llama.cpp usage #13
Hi, thanks for checking it out! Did you skip the llama.cpp build step? The starter assumes a working llama.cpp.
llama.cpp has built-in server support now (since about 2 months ago?). This is what I tried:

```lua
{
  "gsuuon/llm.nvim",
  config = function()
    local llm = require('llm')
    local curl = require('llm.curl')
    local util = require('llm.util')
    local provider_util = require('llm.providers.util')

    local M = {}

    ---@param handlers StreamHandlers
    ---@param params? any Additional params for request
    ---@param options? { model?: string }
    function M.request_completion(handlers, params, options)
      local model = (options or {}).model or 'bigscience/bloom'

      -- TODO handle non-streaming calls
      return curl.stream(
        {
          -- url = 'https://api-inference.huggingface.co/models/', --.. model,
          url = 'http://127.0.0.1:8080/completion',
          method = 'POST',
          body = vim.tbl_extend('force', { stream = true }, params),
          headers = {
            -- Authorization = 'Bearer ' .. util.env_memo('HUGGINGFACE_API_KEY'),
            ['Content-Type'] = 'application/json',
            -- ['data'] = '{"prompt": "Building a website can be done in 10 simple steps:","n_predict": 128}',
          }
        },
        function(raw)
          provider_util.iter_sse_items(raw, function(item)
            local data = util.json.decode(item)

            if data == nil then
              handlers.on_error(item, 'json parse error')
              return
            end

            if data.token == nil then
              if data[1] ~= nil and data[1].generated_text ~= nil then
                -- non-streaming
                handlers.on_finish(data[1].generated_text, 'stop')
                return
              end

              handlers.on_error(data, 'missing token')
              return
            end

            local partial = data.token.text

            handlers.on_partial(partial)

            -- We get the completed text including input unless parameters.return_full_text is set to false
            if data.generated_text ~= nil and #data.generated_text > 0 then
              handlers.on_finish(data.generated_text, 'stop')
            end
          end)
        end,
        function(error)
          handlers.on_error(error)
        end
      )
    end

    require('llm').setup({
      hl_group = 'Substitute',
      prompts = util.module.autoload('prompt_library'),
      default_prompt = {
        provider = M,
        options = {
          -- model = 'bigscience/bloom'
        },
        params = {
          return_full_text = false
        },
        builder = function(input)
          return { inputs = input }
        end
      },
    })
  end,
},
```
This is based on 'Adding your own provider' - https://github.com/gsuuon/llm.nvim/blob/main/lua/llm/providers/huggingface.lua. Configuring llm.nvim is a bit hard, not sure what I did wrong. Do I have to write my own prompts for it?
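For reference, since the config above posts to the llama.cpp server's /completion endpoint (which takes fields like prompt, n_predict, temperature and stream in its JSON body), a prompt-library entry would presumably need a builder that returns that shape rather than the HuggingFace-style inputs field. A minimal sketch, assuming the builder output is merged into the request body exactly as in the config above; the module name, entry name and parameter values are only illustrative:

```lua
-- lua/prompt_library.lua -- a sketch; 'llamacpp_provider' is a hypothetical
-- module exporting the request_completion provider table from the config above.
local provider = require('llamacpp_provider')

return {
  llamacpp = {  -- arbitrary prompt name
    provider = provider,
    params = {
      n_predict = 128,   -- llama.cpp /completion option; example value
      temperature = 0.7, -- example value
    },
    builder = function(input)
      -- llama.cpp's /completion expects the text under `prompt`
      return { prompt = input }
    end,
  },
}
```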
Hi Jose! You know, I think it'd probably be better to simply remove the llamacpp cli provider and switch it to targeting the llamacpp server directly. Outside of playing around with llamacpp flags, the cli provider won't be very useful. I assumed most people would just use an openai compat server, but that does add another dependency and setup step.
I meant that I should remove the current llamacpp provider which uses the CLI and just have it talk to the server instead. You're very close! Just call
Re: your edit --
@gsuuon I'm slowly getting there, but now, with or without mode, the response is automatically removed after the last server response: (video: print_clear.mp4) I mean, I can just undo to bring it back, but it is not optimal. Also, I noticed undo removes the response word by word rather than the whole accumulated server response. I guess it would be cool if you could make it so that undo removes the whole server reply rather than chunk by chunk.
That's the purpose of the … The … Optional selected lines would be a nice feature, but that should be a separate PR from updating the llamacpp provider if you do tackle it.
@JoseConseco Oh btw, you can apply this patch to prevent the completion from disappearing - it's because

```diff
diff --git a/lua/llm/provider.lua b/lua/llm/provider.lua
index 8e5af0f..a0ebc10 100644
--- a/lua/llm/provider.lua
+++ b/lua/llm/provider.lua
@@ -30,7 +30,7 @@ M.mode = {
 
 ---@class StreamHandlers
 ---@field on_partial (fun(partial_text: string): nil) Partial response of just the diff
----@field on_finish (fun(complete_text: string, finish_reason?: string): nil) Complete response with finish reason
+---@field on_finish (fun(complete_text?: string, finish_reason?: string): nil) Complete response with finish reason. Leave complete_text nil to just use concatenated partials.
 ---@field on_error (fun(data: any, label?: string): nil) Error data and optional label
 
 local function get_segment(input, segment_mode, hl_group)
@@ -232,12 +232,19 @@ end
 local function request_completion_input_segment(handle_params, prompt)
   local seg = handle_params.context.segment
 
+  local completion = ""
+
   local cancel = start_prompt(handle_params.input, prompt, {
     on_partial = function(partial)
+      completion = completion .. partial
       seg.add(partial)
     end,
 
     on_finish = function(complete_text, reason)
+      if complete_text == nil or string.len(complete_text) == 0 then
+        complete_text = completion
+      end
+
       if prompt.transform == nil then
         seg.set_text(complete_text)
       else
```

And then you would change the
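To make that concrete: with the patch applied, a llama.cpp server provider could simply stream partials and then finish with a nil complete_text, letting llm.nvim use the accumulated partials. A sketch, assuming llama.cpp's /completion streaming response (each SSE chunk is JSON with a content string and a stop boolean on the final chunk); this is not necessarily the exact change the maintainer had in mind:

```lua
-- Sketch: SSE callback for the llama.cpp server, relying on the patch above
-- (passing nil to on_finish falls back to the concatenated partials).
local util = require('llm.util')
local provider_util = require('llm.providers.util')

local function handle_llamacpp_stream(raw, handlers)
  provider_util.iter_sse_items(raw, function(item)
    local data = util.json.decode(item)

    if data == nil then
      handlers.on_error(item, 'json parse error')
      return
    end

    if data.content and #data.content > 0 then
      handlers.on_partial(data.content)
    end

    if data.stop then
      handlers.on_finish(nil, 'stop') -- nil: use the accumulated partials
    end
  end)
end
```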
@gsuuon on_finish() being called with an empty string arg - that was my guess, but I did not have time to make it work (by concatenating the strings like you showed above and feeding the result into on_finish).
That can be added in the prompt - there's a starter called 'instruct' that shows how to use vim.ui.input to get some input. EDIT: just noticed that example is out of date; fixed it to the current API (you can return a function from
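Presumably that means the builder can return a function that receives a resolve callback; a rough sketch of that pattern, with vim.ui.input collecting the instruction (the module name and the exact builder signature are assumptions here, not checked against the current llm.nvim API):

```lua
-- Sketch of an 'instruct'-style prompt: collect an instruction with
-- vim.ui.input, then resolve the request parameters asynchronously.
-- 'llamacpp_provider' is a hypothetical module; the resolve-callback builder
-- form is assumed from the comment above.
local provider = require('llamacpp_provider')

return {
  provider = provider,
  builder = function(input)
    return function(resolve)
      vim.ui.input({ prompt = 'Instruction: ' }, function(instruction)
        resolve({
          prompt = (instruction or '') .. '\n\n' .. input,
        })
      end)
    end
  end,
}
```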
I think the 'instruct' example above would have to be remade into the llama format:
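Presumably that refers to the Llama 2 instruct/chat template - roughly the shape below (a sketch; the system message is a placeholder and the exact spacing/BOS handling is whatever the model card specifies):

```lua
-- Sketch: wrap the user input in the Llama 2 instruct template before sending
-- it as the `prompt` field. The system message is just a placeholder.
local function to_llama_instruct(input)
  return table.concat({
    '[INST] <<SYS>>\n',
    'You are a helpful assistant.\n',
    '<</SYS>>\n\n',
    input,
    ' [/INST]',
  })
end
```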
In any case, I made a PR. Tomorrow I can make some fixes if needed.
@gsuuon is there a way to output the AI message to a popup window? (video: llm_replace.mp4) The issue I have now: llama will output code with comments, and I can't seem to force it to output only the pure code.
@JoseConseco moved to #15
Hey,
First of all, thanks for working on this!
I was trying to get the local llamacpp provider working, without much success. I tried to debug what was going wrong, but my lack of Lua knowledge makes this slow and difficult.
What I did to try and get it working was the following:
- Set LLAMACPP_DIR to the root llama.cpp folder (one way to set this from Neovim is sketched after this list)
- Tried the llamacpp starter prompt in llm.nvim
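For reference, a minimal way to set that variable from inside a Neovim config (the path is a placeholder):

```lua
-- Sketch: point the llamacpp provider at a local llama.cpp checkout.
-- Replace the path with wherever llama.cpp was cloned and built.
vim.env.LLAMACPP_DIR = vim.fn.expand('~/src/llama.cpp')
```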
Current environment is nvim v0.9.0 on macOS (Apple M1). Are there any steps I'm missing?