Support for connection to llamacpp server #14

Merged · 3 commits · Sep 26, 2023

Conversation

JoseConseco (Contributor)

This PR allows connecting to a llamacpp server and fixes #13. This way llama does not have to be started for each user prompt. It includes the completions patch from the issue mentioned above, provided by gsuuon.
I'm new to this, so feel free to fix any issues in this PR.
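For context, a rough sketch of the idea (not the PR code itself): the llamacpp server is started once on its own, and the plugin then sends each prompt as a JSON request to the server's /completion endpoint. The port, the stream flag, and the use of vim.json below are assumptions for illustration; the prompt/n_predict fields match the llamacpp server example request seen in the diff.

-- Hypothetical request to a locally running llamacpp server
-- (started separately, e.g. `./server -m model.gguf --port 8080`).
local body = {
  prompt = "Building a website can be done in 10 simple steps:",
  n_predict = 128,
  stream = true, -- assumed: stream tokens back instead of one final response
}
local json = vim.json.encode(body) -- Neovim's built-in JSON encoder
local url = "http://localhost:8080/completion"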

@gsuuon (Owner) left a comment

Almost there, just need to clean up some of the stuff pulled from the huggingface example. We should replace the llamacpp provider with this (so accept the changes and rename llamacpp_server.lua to llamacpp.lua).

Do you want to add some basic setup steps to the README for this? No worries if not!

Comment on lines 22 to 26
headers = {
-- Authorization = 'Bearer ' .. util.env_memo('HUGGINGFACE_API_KEY'),
["Content-Type"] = "application/json",
-- ['data'] = '{"prompt": "Building a website can be done in 10 simple steps:","n_predict": 128}',
},
gsuuon (Owner):

Headers aren't necessary here

Suggested change
headers = {
-- Authorization = 'Bearer ' .. util.env_memo('HUGGINGFACE_API_KEY'),
["Content-Type"] = "application/json",
-- ['data'] = '{"prompt": "Building a website can be done in 10 simple steps:","n_predict": 128}',
},

JoseConseco (Contributor, Author):

ok done

---@param params? any Additional params for request
---@param options? { model?: string }
function M.request_completion(handlers, params, options)
local model = (options or {}).model or "bigscience/bloom"
gsuuon (Owner):

We can get rid of all the huggingface stuff here

Suggested change
local model = (options or {}).model or "bigscience/bloom"

JoseConseco (Contributor, Author):

done


-- TODO handle non-streaming calls
return curl.stream({
-- url = 'https://api-inference.huggingface.co/models/', --.. model,
gsuuon (Owner):

Suggested change
-- url = 'https://api-inference.huggingface.co/models/', --.. model,

JoseConseco (Contributor, Author):

done
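For reference, a hedged sketch of what the call site might look like once the huggingface endpoint is dropped; the exact options curl.stream accepts in this repo are not shown in the diff, so the method/body fields and the port are assumptions.

-- Hypothetical cleaned-up request targeting the local llamacpp server
-- instead of the huggingface inference API (option names assumed).
return curl.stream({
  url = "http://localhost:8080/completion", -- assumed default port
  method = "POST", -- the /completion endpoint expects a POST with a JSON body
  body = vim.json.encode({ prompt = prompt, n_predict = 128 }),
})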

end

if data.generation_settings ~= nil then -- last message
handlers.on_finish('', "stop")
gsuuon (Owner):

Can just be a naked call now that it's handled in provider.lua

Suggested change
handlers.on_finish('', "stop")
handlers.on_finish()

JoseConseco (Contributor, Author):

done
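For readers following the thread, a rough sketch of the resulting chunk handler: the data.generation_settings check marking the last message comes from the diff above, while the content field and the on_partial handler name are assumptions about the llamacpp stream format and this plugin's handler API.

-- Hypothetical handler for one decoded chunk of the llamacpp server stream.
local function on_chunk(raw, handlers)
  local data = vim.json.decode(raw)
  if data.generation_settings ~= nil then -- last message
    handlers.on_finish() -- naked call; defaults are filled in by provider.lua
  elseif data.content then
    handlers.on_partial(data.content) -- append streamed text (name assumed)
  end
end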

Comment on lines 98 to 103
options = {
-- model = 'bigscience/bloom'
},
params = {
return_full_text = false,
},
gsuuon (Owner):

These are huggingface options / params, not needed here

Suggested change
options = {
-- model = 'bigscience/bloom'
},
params = {
return_full_text = false,
},

JoseConseco (Contributor, Author):

done

By the llama docs we should use:
<instr><sys>You are an assistant, answer all</sys> What is the age of the universe?</instr>

But
<sys>You are an assistant</sys><instr>What is the age of the universe?</instr> somehow gives better results... Leave it to the user.
@JoseConseco (Contributor, Author):

Ok, I also pushed some changes to the default query. By default the user can select text and ask a question about it. The best default query format is not clear. In the end, from my tests, I got the best results using:
<instr>
Fix my code? -- user question or command
'''
Code block
'''
</instr>
Adding a <sys> block usually just gave me worse results... It may depend on the selected model, not sure. I do not have time to test it more. IMO it is best to leave the prompt format up to the user. That is why I made a bunch of helper methods in the last commit, with presets of prompts.
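As an illustration of what one of those prompt preset helpers could look like (names and signature are hypothetical, not the committed code), wrapping the user's question and the selected code block in the <instr> format above:

-- Hypothetical preset: build an <instr> prompt from a question and a selection.
local function instr_with_code(question, code)
  return table.concat({
    "<instr>",
    question,  -- e.g. "Fix my code?"
    "'''",
    code,      -- the visually selected code block
    "'''",
    "</instr>",
  }, "\n")
end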


JoseConseco commented Sep 26, 2023

Demo of how this works currently (using my custom modify prompt):

assome.mp4

@gsuuon gsuuon merged commit da785e3 into gsuuon:main Sep 26, 2023

gsuuon commented Sep 26, 2023

Ok, let's get this merged. I'll replace the llamacpp provider and clean up. Thanks for contributing!
