diff --git a/docs/content/faq/_index.en.md b/docs/content/faq/_index.en.md
index 78e7b25e6958..1072b38bbccb 100644
--- a/docs/content/faq/_index.en.md
+++ b/docs/content/faq/_index.en.md
@@ -14,7 +14,7 @@ Here are answers to some of the most common questions.
-Most ggml-based models should work, but newer models may require additions to the API. If a model doesn't work, please feel free to open up issues. However, be cautious about downloading models from the internet and directly onto your machine, as there may be security vulnerabilities in lama.cpp or ggml that could be maliciously exploited. Some models can be found on Hugging Face: https://huggingface.co/models?search=ggml, or models from gpt4all are compatible too: https://github.com/nomic-ai/gpt4all.
+Most gguf-based models should work, but newer models may require additions to the API. If a model doesn't work, please feel free to open up issues. However, be cautious about downloading models from the internet directly onto your machine, as there may be security vulnerabilities in llama.cpp or ggml that could be maliciously exploited. Some models can be found on Hugging Face: https://huggingface.co/models?search=gguf, or models from gpt4all are compatible too: https://github.com/nomic-ai/gpt4all.
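Since the FAQ now points readers at gguf files on Hugging Face, a concrete fetch may help; this is a minimal sketch assuming the `TheBloke/Luna-AI-Llama2-Uncensored-GGUF` repository (the model used later in these how-tos) and Hugging Face's standard `resolve/main` download layout — substitute whichever model you trust:

```bash
# Sketch: download one GGUF quantization into the models directory.
# The repository name and filename are assumptions; pick a model you trust.
mkdir -p models
wget -P models/ \
  "https://huggingface.co/TheBloke/Luna-AI-Llama2-Uncensored-GGUF/resolve/main/luna-ai-llama2-uncensored.Q4_K_M.gguf"
```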
diff --git a/docs/content/getting_started/_index.en.md b/docs/content/getting_started/_index.en.md
index 4185bb290879..5baf86a753f5 100644
--- a/docs/content/getting_started/_index.en.md
+++ b/docs/content/getting_started/_index.en.md
@@ -26,7 +26,7 @@ To run with GPU Accelleration, see [GPU acceleration]({{%relref "features/gpu-ac
 mkdir models
 
 # copy your models to it
-cp your-model.bin models/
+cp your-model.gguf models/
 
 # run the LocalAI container
 docker run -p 8080:8080 -v $PWD/models:/models -ti --rm quay.io/go-skynet/local-ai:latest --models-path /models --context-size 700 --threads 4
@@ -43,7 +43,7 @@ docker run -p 8080:8080 -v $PWD/models:/models -ti --rm quay.io/go-skynet/local-
 
 # Try the endpoint with curl
 curl http://localhost:8080/v1/completions -H "Content-Type: application/json" -d '{
-     "model": "your-model.bin",
+     "model": "your-model.gguf",
      "prompt": "A long time ago in a galaxy far, far away",
      "temperature": 0.7
    }'
@@ -67,7 +67,7 @@ cd LocalAI
 # git checkout -b build
 
 # copy your models to models/
-cp your-model.bin models/
+cp your-model.gguf models/
 
 # (optional) Edit the .env file to set things like context size and threads
 # vim .env
@@ -79,10 +79,10 @@ docker compose up -d --pull always
 
 # Now API is accessible at localhost:8080
 curl http://localhost:8080/v1/models
-# {"object":"list","data":[{"id":"your-model.bin","object":"model"}]}
+# {"object":"list","data":[{"id":"your-model.gguf","object":"model"}]}
 
 curl http://localhost:8080/v1/completions -H "Content-Type: application/json" -d '{
-     "model": "your-model.bin",
+     "model": "your-model.gguf",
      "prompt": "A long time ago in a galaxy far, far away",
      "temperature": 0.7
    }'
diff --git a/docs/content/howtos/_index.md b/docs/content/howtos/_index.md
index ec28923cef16..238268c033b2 100644
--- a/docs/content/howtos/_index.md
+++ b/docs/content/howtos/_index.md
@@ -10,14 +10,10 @@ This section includes LocalAI end-to-end examples, tutorial and how-tos curated
 
 - [Setup LocalAI with Docker on CPU]({{%relref "howtos/easy-setup-docker-cpu" %}})
 - [Setup LocalAI with Docker With CUDA]({{%relref "howtos/easy-setup-docker-gpu" %}})
-- [Seting up a Model]({{%relref "howtos/easy-model-import-downloaded" %}})
-- [Making requests via Autogen]({{%relref "howtos/easy-request-autogen" %}})
-- [Making requests via OpenAi API V0]({{%relref "howtos/easy-request-openai-v0" %}})
-- [Making requests via OpenAi API V1]({{%relref "howtos/easy-request-openai-v1" %}})
-- [Making requests via Curl]({{%relref "howtos/easy-request-curl" %}})
+- [Setting up a Model]({{%relref "howtos/easy-model" %}})
+- [Making requests to LocalAI]({{%relref "howtos/easy-request" %}})
 
 ## Programs and Demos
 
 This section includes other programs and how to setup, install, and use of LocalAI.
 - [Python LocalAI Demo]({{%relref "howtos/easy-setup-full" %}}) - [lunamidori5](https://github.com/lunamidori5)
-- [Autogen]({{%relref "howtos/autogen-setup" %}}) - [lunamidori5](https://github.com/lunamidori5)
diff --git a/docs/content/howtos/autogen-setup.md b/docs/content/howtos/autogen-setup.md
deleted file mode 100644
index bd53b42b9b8f..000000000000
--- a/docs/content/howtos/autogen-setup.md
+++ /dev/null
@@ -1,91 +0,0 @@
-
-+++
-disableToc = false
-title = "Easy Demo - AutoGen"
-weight = 2
-+++
-
-This is just a short demo of setting up ``LocalAI`` with Autogen, this is based on you already having a model setup.
-
-```python
-import os
-import openai
-import autogen
-
-openai.api_key = "sx-xxx"
-OPENAI_API_KEY = "sx-xxx"
-os.environ['OPENAI_API_KEY'] = OPENAI_API_KEY
-
-config_list_json = [
-    {
-        "model": "gpt-3.5-turbo",
-        "api_base": "http://[YOURLOCALAIIPHERE]:8080/v1",
-        "api_type": "open_ai",
-        "api_key": "NULL",
-    }
-]
-
-print("models to use: ", [config_list_json[i]["model"] for i in range(len(config_list_json))])
-
-llm_config = {"config_list": config_list_json, "seed": 42}
-user_proxy = autogen.UserProxyAgent(
-    name="Admin",
-    system_message="A human admin. Interact with the planner to discuss the plan. Plan execution needs to be approved by this admin.",
-    code_execution_config={
-        "work_dir": "coding",
-        "last_n_messages": 8,
-        "use_docker": "python:3",
-    },
-    human_input_mode="ALWAYS",
-    is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"),
-)
-engineer = autogen.AssistantAgent(
-    name="Coder",
-    llm_config=llm_config,
-)
-scientist = autogen.AssistantAgent(
-    name="Scientist",
-    llm_config=llm_config,
-    system_message="""Scientist. You follow an approved plan. You are able to categorize papers after seeing their abstracts printed. You don't write code."""
-)
-planner = autogen.AssistantAgent(
-    name="Planner",
-    system_message='''Planner. Suggest a plan. Revise the plan based on feedback from admin and critic, until admin approval.
-The plan may involve an engineer who can write code and a scientist who doesn't write code.
-Explain the plan first. Be clear which step is performed by an engineer, and which step is performed by a scientist.
-''',
-    llm_config=llm_config,
-)
-executor = autogen.UserProxyAgent(
-    name="Executor",
-    system_message="Executor. Execute the code written by the engineer and report the result.",
-    human_input_mode="NEVER",
-    code_execution_config={
-        "work_dir": "coding",
-        "last_n_messages": 8,
-        "use_docker": "python:3",
-    }
-)
-critic = autogen.AssistantAgent(
-    name="Critic",
-    system_message="Critic. Double check plan, claims, code from other agents and provide feedback. Check whether the plan includes adding verifiable info such as source URL.",
-    llm_config=llm_config,
-)
-groupchat = autogen.GroupChat(agents=[user_proxy, engineer, scientist, planner, executor, critic], messages=[], max_round=999)
-manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)
-
-#autogen.ChatCompletion.start_logging()
-
-#text_input = input("Please enter request: ")
-text_input = ("Change this to a task you would like the group chat to do or comment this out and uncomment the other line!")
-
-#Uncomment one of these two chats based on what you would like to do
-
-#user_proxy.initiate_chat(engineer, message=str(text_input))
-#For a one on one chat use this one ^
-
-#user_proxy.initiate_chat(manager, message=str(text_input))
-#To setup a group chat use this one ^
-```
-
diff --git a/docs/content/howtos/easy-model-import-downloaded.md b/docs/content/howtos/easy-model.md
similarity index 93%
rename from docs/content/howtos/easy-model-import-downloaded.md
rename to docs/content/howtos/easy-model.md
index cbe17431c823..ffbbfb1cd923 100644
--- a/docs/content/howtos/easy-model-import-downloaded.md
+++ b/docs/content/howtos/easy-model.md
@@ -59,9 +59,6 @@ What this does is tell ``LocalAI`` how to load the model.
 Then we are going to 
 name: lunademo
 parameters:
     model: luna-ai-llama2-uncensored.Q4_K_M.gguf
-    temperature: 0.2
-    top_k: 40
-    top_p: 0.65
 ```
 
 Now that we have the model set up, there a few things we should add to the yaml file to make it run better, for this model it uses the following roles.
@@ -100,9 +97,6 @@ context_size: 2000
 name: lunademo
 parameters:
     model: luna-ai-llama2-uncensored.Q4_K_M.gguf
-    temperature: 0.2
-    top_k: 40
-    top_p: 0.65
 roles:
     assistant: 'ASSISTANT:'
     system: 'SYSTEM:'
@@ -112,7 +106,7 @@ template:
   completion: lunademo-completion
 ```
 
-Now that we got that setup, lets test it out but sending a request by using [Curl]({{%relref "easy-request-curl" %}}) Or use the [OpenAI Python API]({{%relref "easy-request-openai-v1" %}})!
+Now that we have that set up, let's test it out by sending a [request]({{%relref "easy-request" %}}) to LocalAI!
 
 ## Adv Stuff
 Alright now that we have learned how to set up our own models, here is how to use the gallery to do alot of this for us. This command will download and set up (mostly, we will **always** need to edit our yaml file to fit our computer / hardware)
diff --git a/docs/content/howtos/easy-request-autogen.md b/docs/content/howtos/easy-request-autogen.md
deleted file mode 100644
index 8b137891791f..000000000000
--- a/docs/content/howtos/easy-request-autogen.md
+++ /dev/null
@@ -1 +0,0 @@
-
diff --git a/docs/content/howtos/easy-request-curl.md b/docs/content/howtos/easy-request-curl.md
deleted file mode 100644
index b362504cc8b1..000000000000
--- a/docs/content/howtos/easy-request-curl.md
+++ /dev/null
@@ -1,35 +0,0 @@
-
-+++
-disableToc = false
-title = "Easy Request - Curl"
-weight = 2
-+++
-
-Now we can make a curl request!
-
-Curl Chat API - 
-
-```bash
-curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
-     "model": "lunademo",
-     "messages": [{"role": "user", "content": "How are you?"}],
-     "temperature": 0.9
-   }'
-```
-
-Curl Completion API - 
-
-```bash
-curl --request POST \
-    --url http://localhost:8080/v1/completions \
-    --header 'Content-Type: application/json' \
-    --data '{
-    "model": "lunademo",
-    "prompt": "function downloadFile(string url, string outputPath) {",
-    "max_tokens": 256,
-    "temperature": 0.5
-}'
-```
-
-See [OpenAI API](https://platform.openai.com/docs/api-reference) for more info!
-Have fun using LocalAI!
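Because the hunks above strip `temperature`, `top_k`, and `top_p` out of the model YAML, it is worth noting that the same sampling settings can still be supplied per request. A minimal sketch, assuming LocalAI's OpenAI-compatible endpoint honors `top_k`/`top_p` in the request body (`temperature` and `top_p` are standard OpenAI fields; `top_k` is assumed to be a LocalAI extension):

```bash
# Sketch: pass the sampling settings dropped from the YAML with each request instead.
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
   "model": "lunademo",
   "messages": [{"role": "user", "content": "How are you?"}],
   "temperature": 0.2,
   "top_k": 40,
   "top_p": 0.65
 }'
```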
diff --git a/docs/content/howtos/easy-request-openai-v0.md b/docs/content/howtos/easy-request-openai-v0.md
deleted file mode 100644
index 4c3b891a7276..000000000000
--- a/docs/content/howtos/easy-request-openai-v0.md
+++ /dev/null
@@ -1,50 +0,0 @@
-
-+++
-disableToc = false
-title = "Easy Request - Openai V0"
-weight = 2
-+++
-
-This is for Python, ``OpenAI``=``0.28.1``, if you are on ``OpenAI``=>``V1`` please use this [How to]({{%relref "howtos/easy-request-openai-v1" %}})
-
-OpenAI Chat API Python - 
-
-```python
-import os
-import openai
-openai.api_base = "http://localhost:8080/v1"
-openai.api_key = "sx-xxx"
-OPENAI_API_KEY = "sx-xxx"
-os.environ['OPENAI_API_KEY'] = OPENAI_API_KEY
-
-completion = openai.ChatCompletion.create(
-    model="lunademo",
-    messages=[
-        {"role": "system", "content": "You are a helpful assistant."},
-        {"role": "user", "content": "How are you?"}
-    ]
-)
-
-print(completion.choices[0].message.content)
-```
-
-OpenAI Completion API Python - 
-
-```python
-import os
-import openai
-openai.api_base = "http://localhost:8080/v1"
-openai.api_key = "sx-xxx"
-OPENAI_API_KEY = "sx-xxx"
-os.environ['OPENAI_API_KEY'] = OPENAI_API_KEY
-
-completion = openai.Completion.create(
-    model="lunademo",
-    prompt="function downloadFile(string url, string outputPath) ",
-    max_tokens=256,
-    temperature=0.5)
-
-print(completion.choices[0].text)
-```
-See [OpenAI API](https://platform.openai.com/docs/api-reference) for more info!
-Have fun using LocalAI!
diff --git a/docs/content/howtos/easy-request-openai-v1.md b/docs/content/howtos/easy-request-openai-v1.md
deleted file mode 100644
index dfa8e343dd66..000000000000
--- a/docs/content/howtos/easy-request-openai-v1.md
+++ /dev/null
@@ -1,28 +0,0 @@
-
-+++
-disableToc = false
-title = "Easy Request - Openai V1"
-weight = 2
-+++
-
-This is for Python, ``OpenAI``=>``V1``, if you are on ``OpenAI``<``V1`` please use this [How to]({{%relref "howtos/easy-request-openai-v0" %}})
-
-OpenAI Chat API Python - 
-
-```python
-from openai import OpenAI
-
-client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-xxx")
-
-messages = [
-{"role": "system", "content": "You are LocalAI, a helpful, but really confused ai, you will only reply with confused emotes"},
-{"role": "user", "content": "Hello How are you today LocalAI"}
-]
-completion = client.chat.completions.create(
-  model="lunademo",
-  messages=messages,
-)
-
-print(completion.choices[0].message)
-```
-See [OpenAI API](https://platform.openai.com/docs/api-reference) for more info!
-Have fun using LocalAI!
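The two deleted pages above are folded into the single `easy-request.md` that follows, and its sections target different major versions of the `openai` Python package. A quick sketch of pinning the client for each section — the exact pins are assumptions, adjust to taste:

```bash
# Sketch: install the client version matching the section you follow.
pip install "openai>=1.0.0"    # for the V1 examples (recommended)
pip install "openai==0.28.1"   # only if you must stay on the legacy V0 API
```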
diff --git a/docs/content/howtos/easy-request.md b/docs/content/howtos/easy-request.md
new file mode 100644
index 000000000000..755370164b00
--- /dev/null
+++ b/docs/content/howtos/easy-request.md
@@ -0,0 +1,85 @@
+
++++
+disableToc = false
+title = "Easy Request - All"
+weight = 2
++++
+
+## Curl Request
+
+Curl Chat API - 
+
+```bash
+curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
+     "model": "lunademo",
+     "messages": [{"role": "user", "content": "How are you?"}],
+     "temperature": 0.9
+   }'
+```
+
+## OpenAI V1 - Recommended
+
+This is for Python, ``OpenAI``>=``V1``
+
+OpenAI Chat API Python - 
+```python
+from openai import OpenAI
+
+client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-xxx")
+
+messages = [
+{"role": "system", "content": "You are LocalAI, a helpful, but really confused ai, you will only reply with confused emotes"},
+{"role": "user", "content": "Hello How are you today LocalAI"}
+]
+completion = client.chat.completions.create(
+  model="lunademo",
+  messages=messages,
+)
+
+print(completion.choices[0].message)
+```
+See [OpenAI API](https://platform.openai.com/docs/api-reference) for more info!
+
+## OpenAI V0 - Not Recommended
+
+This is for Python, ``OpenAI``==``0.28.1``
+
+OpenAI Chat API Python - 
+
+```python
+import os
+import openai
+openai.api_base = "http://localhost:8080/v1"
+openai.api_key = "sk-xxx"
+OPENAI_API_KEY = "sk-xxx"
+os.environ['OPENAI_API_KEY'] = OPENAI_API_KEY
+
+completion = openai.ChatCompletion.create(
+    model="lunademo",
+    messages=[
+        {"role": "system", "content": "You are LocalAI, a helpful, but really confused ai, you will only reply with confused emotes"},
+        {"role": "user", "content": "How are you?"}
+    ]
+)
+
+print(completion.choices[0].message.content)
+```
+
+OpenAI Completion API Python - 
+
+```python
+import os
+import openai
+openai.api_base = "http://localhost:8080/v1"
+openai.api_key = "sk-xxx"
+OPENAI_API_KEY = "sk-xxx"
+os.environ['OPENAI_API_KEY'] = OPENAI_API_KEY
+
+completion = openai.Completion.create(
+    model="lunademo",
+    prompt="function downloadFile(string url, string outputPath) ",
+    max_tokens=256,
+    temperature=0.5)
+
+print(completion.choices[0].text)
+```
diff --git a/docs/content/howtos/easy-setup-docker-cpu.md b/docs/content/howtos/easy-setup-docker-cpu.md
index 06a420b2f806..8022d79f603e 100644
--- a/docs/content/howtos/easy-setup-docker-cpu.md
+++ b/docs/content/howtos/easy-setup-docker-cpu.md
@@ -102,7 +102,8 @@ services:
 Make sure to save that in the root of the `LocalAI` folder.
 Then lets spin up the Docker run this in a `CMD` or `BASH`
 
 ```bash
-docker-compose up -d --pull always
+docker-compose up -d --pull always ## Compose V1 (legacy standalone binary)
+docker compose up -d --pull always ## Compose V2 (docker CLI plugin)
 ```
 
@@ -128,4 +129,4 @@ Output will look like this:
 
 ![](https://cdn.discordapp.com/attachments/1116933141895053322/1134037542845566976/image.png)
 
-Now that we got that setup, lets go setup a [model]({{%relref "easy-model-import-downloaded" %}})
+Now that we have that set up, let's go set up a [model]({{%relref "easy-model" %}})
diff --git a/docs/content/howtos/easy-setup-docker-gpu.md b/docs/content/howtos/easy-setup-docker-gpu.md
index ee386d4634bb..d9ae891dfee8 100644
--- a/docs/content/howtos/easy-setup-docker-gpu.md
+++ b/docs/content/howtos/easy-setup-docker-gpu.md
@@ -117,7 +117,8 @@ services:
 Make sure to save that in the root of the `LocalAI` folder.
 Then lets spin up the Docker run this in a `CMD` or `BASH`
 
 ```bash
-docker-compose up -d --pull always
+docker-compose up -d --pull always ## Compose V1 (legacy standalone binary)
+docker compose up -d --pull always ## Compose V2 (docker CLI plugin)
 ```
 
@@ -143,4 +144,4 @@ Output will look like this:
 
 ![](https://cdn.discordapp.com/attachments/1116933141895053322/1134037542845566976/image.png)
 
-Now that we got that setup, lets go setup a [model]({{%relref "easy-model-import-downloaded" %}})
+Now that we have that set up, let's go set up a [model]({{%relref "easy-model" %}})
diff --git a/docs/content/model-compatibility/_index.en.md b/docs/content/model-compatibility/_index.en.md
index 355b7740d90c..a45d36eeeed9 100644
--- a/docs/content/model-compatibility/_index.en.md
+++ b/docs/content/model-compatibility/_index.en.md
@@ -15,7 +15,7 @@ LocalAI will attempt to automatically load models which are not explicitly confi
 
 ### Hardware requirements
 
-Depending on the model you are attempting to run might need more RAM or CPU resources. Check out also [here](https://github.com/ggerganov/llama.cpp#memorydisk-requirements) for `ggml` based backends. `rwkv` is less expensive on resources.
+Depending on the model you are attempting to run, you might need more RAM or CPU resources. Also check [here](https://github.com/ggerganov/llama.cpp#memorydisk-requirements) for `gguf`-based backends. `rwkv` is less expensive on resources.
 
 ### Model compatibility table
diff --git a/docs/content/models/_index.en.md b/docs/content/models/_index.en.md
index da02b04d10f4..e745120c2306 100644
--- a/docs/content/models/_index.en.md
+++ b/docs/content/models/_index.en.md
@@ -25,7 +25,7 @@ GPT and text generation models might have a license which is not permissive for
 
 ## Useful Links and resources
 
-- [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) - here you can find a list of the most performing models on the Open LLM benchmark. Keep in mind models compatible with LocalAI must be quantized in the `ggml` format.
+- [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) - here you can find a list of the best-performing models on the Open LLM benchmark. Keep in mind models compatible with LocalAI must be quantized in the `gguf` format.
 
 ## Model repositories
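Finally, since the revised how-tos lean on the model gallery for setup, here is a minimal sketch of installing a gallery model over the API. The `/models/apply` route is LocalAI's gallery install endpoint, and the `model-gallery@lunademo` id is an assumption borrowed from the how-to — substitute a gallery entry you actually have configured:

```bash
# Sketch: ask a running LocalAI instance to download and configure a gallery model.
# The gallery id below is an assumption; check your configured galleries first.
curl --request POST \
  --url http://localhost:8080/models/apply \
  --header 'Content-Type: application/json' \
  --data '{"id": "model-gallery@lunademo"}'
```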