diff --git a/MODELS.md b/MODELS.md index 129c1c12df..8588da5e33 100644 --- a/MODELS.md +++ b/MODELS.md @@ -1,20 +1,16 @@ -## Models used in Dream - -Here you may find a list of models that currently available for use in Dream. - -| model name | container name | model link | open-source? | size (billion parameters) | GPU usage | max tokens (prompt + response) | description | -|-----------------------------|---------------------------------|-------------------------------------------------------------------------|--------------------------------------|---------------------------|---------------------------|--------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| BLOOMZ 7B | transformers-lm-bloomz7b | [link](https://huggingface.co/bigscience/bloomz-7b1) | yes | 7.1B | 33GB | 2,048 tokens | An open-source multilingual instruction-based large language model (46 languages). NB: free of charge. This model is up and running on our servers and can be used for free. | -| GPT-J 6B | transformers-lm-gptj | [link](https://huggingface.co/EleutherAI/gpt-j-6b) | yes | 6B | 25GB | 2,048 tokens | An open-source English-only large language model which is NOT fine-tuned for instruction following and NOT capable of code generation. NB: free of charge. This model is up and running on our servers and can be used for free. | -| GPT-3.5 | openai-api-davinci3 | [link](https://platform.openai.com/docs/models/gpt-3-5) | no (paid access via API) | supposedly, 175B | - (cannot be run locally) | 4,097 tokens | A multilingual instruction-based large language model which is capable of code generation. Unlike ChatGPT, not optimised for chat. NB: paid. You must provide your OpenAI API key to use the model. Your OpenAI account will be charged according to your usage. | -| ChatGPT | openai-api-chatgpt | [link](https://platform.openai.com/docs/models/gpt-3-5) | no (paid access via API) | supposedly, 175B | - (cannot be run locally) | 4,096 tokens | Based on gpt-3.5-turbo -- the most capable of the entire GPT-3/GPT-3.5 models family. Optimized for chat. Able to understand and generate code. NB: paid. You must provide your OpenAI API key to use the model. Your OpenAI account will be charged according to your usage. | -| Open-Assistant Pythia 12B | transformers-lm-oasst12b | [link](https://huggingface.co/OpenAssistant/pythia-12b-sft-v8-7k-steps) | yes | 12B | 29GB (half-precision) | 5,120 tokens | An open-source English-only instruction-based large language model which is NOT good at answering math and coding questions. NB: free of charge. This model is up and running on our servers and can be used for free. | -| Vicuna 13B | transformers-lm-vicuna13b | [link](https://huggingface.co/lmsys/vicuna-13b-v1.3) | yes, but only for non-commercial use | 13B | 29GB (half-precision) | 2,048 tokens | An instruction-based large language model fine-tuned on LLaMa that achieves [more than 90%* quality of OpenAI ChatGPT and Google Bard](https://lmsys.org/blog/2023-03-30-vicuna/). The model performs best in English and is NOT good at answering math, reasoning, and coding questions. NB-1: Free of charge. This model is up and running on our servers and can be used for free. NB-2: cannot be used for commercial purposes (license restriction). | -| GPT-4 | openai-api-gpt4 | [link](https://platform.openai.com/docs/models/gpt-4) | no (paid access via API) | supposedly, 175B | - (cannot be run locally) | 8,192 tokens | A multilingual instruction-based large language model which is capable of code generation and other complex tasks. More capable than any GPT-3.5 model, able to do more complex tasks, and optimized for chat. NB: paid. You must provide your OpenAI API key to use the model. Your OpenAI account will be charged according to your usage. | -| GPT-4 32K | openai-api-gpt4-32k | [link](https://platform.openai.com/docs/models/gpt-4) | no (paid access via API) | supposedly, 175B | - (cannot be run locally) | 32,768 tokens | A multilingual instruction-based large language model which is capable of code generation and other complex tasks. Same capabilities as the base gpt-4 mode but with 4x the context length. NB: paid. You must provide your OpenAI API key to use the model. Your OpenAI account will be charged according to your usage. | -| GPT-JT 6B | transformers-lm-gptjt | [link](https://huggingface.co/togethercomputer/GPT-JT-6B-v1) | yes | 6B | 14GB (half-precision) | 2,048 tokens | An open-source English-only large language model which was fine-tuned for instruction following but is NOT capable of code generation. NB: free of charge. This model is up and running on our servers and can be used for free. | -| ChatGPT 16k | openai-api-chatgpt-16k | [link](https://platform.openai.com/docs/models/gpt-3-5) | no (paid access via API) | supposedly, 175B | - (cannot be run locally) | 16,384 tokens | Same capabilities as the standard gpt-3.5-turbo model but with 4 times the context. NB: paid. You must provide your OpenAI API key to use the model. Your OpenAI account will be charged according to your usage. | -| Anthropic Claude-v1 | anthropic-api-claude-v1 | [link](https://docs.anthropic.com/claude/reference/complete_post) | no (paid access via API) | | - (cannot be run locally) | 9,000 tokens | The largest model, ideal for a wide range of more complex tasks. NB: paid. You must provide your Anthropic API key to use the model. Your Anthropic API account will be charged according to your usage. | -| Anthropic Claude Instant v1 | anthropic-api-claude-instant-v1 | [link](https://docs.anthropic.com/claude/reference/complete_post) | no (paid access via API) | | - (cannot be run locally) | 9,000 tokens | A smaller model with far lower latency, sampling at roughly 40 words/sec! Its output quality is somewhat lower than the latest claude-1 model, particularly for complex tasks. However, it is much less expensive and blazing fast. NB: paid. You must provide your Anthropic API key to use the model. Your Anthropic API account will be charged according to your usage. | -| Russian XGLM 4.5B | transformers-lm-ruxglm | unavailable (private weights) | no | 4.5B | 15GB | 2,048 tokens | A private large language model for the Russian language which was fine-tuned for instruction following by Dmitry Kosenko in Summer 2023. This model is up and running on our servers and can be used for free. | -| ruGPT-3.5-13B | transformers-lm-rugpt35 | [link](https://huggingface.co/ai-forever/ruGPT-3.5-13B) | yes | 13B | 35GB (half-precision) | 2,048 tokens | A large language model for the Russian language which was used for trainig GigaChat. This model is up and running on our servers and can be used for free. | +| model name and link | container name | open-source? | size | GPU usage | max tokens (prompt + response) | licence | description | +|----------------------------------------------------------------------------------------------|-----------------------------------|--------------------------------------|------------------|---------------------------|--------------------------------|------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| [BLOOMZ 7B](https://huggingface.co/bigscience/bloomz-7b1) | transformers-lm-bloomz7b | yes | 7.1B | 33GB | 2,048 tokens | bigscience-bloom-rail-1.0, commercial use allowed | An open-source multilingual instruction-based large language model (46 languages). NB: free of charge. This model is up and running on our servers and can be used for free. | +| [GPT-J 6B](https://huggingface.co/EleutherAI/gpt-j-6b) | transformers-lm-gptj | yes | 6B | 25GB | 2,048 tokens | Apache 2.0 , commercial use allowed | An open-source English-only large language model which is NOT fine-tuned for instruction following and NOT capable of code generation. NB: free of charge. This model is up and running on our servers and can be used for free. | +| [GPT-3.5](https://platform.openai.com/docs/models/gpt-3-5) | openai-api-davinci3 | no | supposedly, 175B | - (cannot be run locally) | 4,097 tokens | available under subscription plan, commercial use allowed | A multilingual instruction-based large language model which is capable of code generation. Unlike ChatGPT, not optimized for chat. NB: paid. You must provide your OpenAI API key to use the model. Your OpenAI account will be charged according to your usage. | +| [ChatGPT](https://platform.openai.com/docs/models/gpt-3-5) | openai-api-chatgpt | no | supposedly, 175B | - (cannot be run locally) | 4,096 tokens | available under subscription plan, commercial use allowed | Based on gpt-3.5-turbo -- the most capable of the entire GPT-3/GPT-3.5 models family. Optimized for chat. Able to understand and generate code. NB: paid. You must provide your OpenAI API key to use the model. Your OpenAI account will be charged according to your usage. | +| [Open-Assistant Pythia 12B](https://huggingface.co/OpenAssistant/pythia-12b-sft-v8-7k-steps) | transformers-lm-oasst12b | yes | 12B | 29GB (half-precision) | 5,120 tokens | Apache 2.0 , commercial use allowed | An open-source English-only instruction-based large language model which is NOT good at answering math and coding questions. NB: free of charge. This model is up and running on our servers and can be used for free. | +| [Vicuna 13B](https://huggingface.co/lmsys/vicuna-13b-v1.3) | transformers-lm-vicuna13b | yes, but only for non-commercial use | 13B | 29GB (half-precision) | 2,048 tokens | Non-commercial license | An instruction-based large language model fine-tuned on LLaMa that achieves [more than 90%* quality of OpenAI ChatGPT and Google Bard](https://lmsys.org/blog/2023-03-30-vicuna/). The model performs best in English and is NOT good at answering math, reasoning, and coding questions. NB-1: Free of charge. This model is up and running on our servers and can be used for free. NB-2: cannot be used for commercial purposes due to license restriction. | +| [GPT-4](https://platform.openai.com/docs/models/gpt-4) | openai-api-gpt4 | no | supposedly, 175B | - (cannot be run locally) | 8,192 tokens | available under subscription plan, commercial use allowed | A multilingual instruction-based large language model which is capable of code generation and other complex tasks. More capable than any GPT-3.5 model, able to do more complex tasks, and optimized for chat. NB: paid. You must provide your OpenAI API key to use the model. Your OpenAI account will be charged according to your usage. | +| [GPT-4 32K](https://platform.openai.com/docs/models/gpt-4) | openai-api-gpt4-32k | no | supposedly, 175B | - (cannot be run locally) | 32,768 tokens | available under subscription plan, commercial use allowed | A multilingual instruction-based large language model which is capable of code generation and other complex tasks. Same capabilities as the base gpt-4 mode but with 4x the context length. NB: paid. You must provide your OpenAI API key to use the model. Your OpenAI account will be charged according to your usage. | +| [GPT-JT 6B](https://huggingface.co/togethercomputer/GPT-JT-6B-v1) | transformers-lm-gptjt | yes | 6B | 14GB (half-precision) | 2,048 tokens | Apache 2.0 , commercial use is allowed | An open-source English-only large language model which was fine-tuned for instruction following but is NOT capable of code generation. NB: free of charge. This model is up and running on our servers and can be used for free. | +| [ChatGPT 16k](https://platform.openai.com/docs/models/gpt-3-5) | openai-api-chatgpt-16k | no | supposedly, 175B | - (cannot be run locally) | 16,384 tokens | available under subscription plan, commercial use allowed | Same capabilities as the standard gpt-3.5-turbo model but with 4 times the context. NB: paid. You must provide your OpenAI API key to use the model. Your OpenAI account will be charged according to your usage. | +| [Anthropic Claude-v1](https://docs.anthropic.com/claude/reference/complete_post) | anthropic-api-claude-v1 | no | supposedly, 52B | - (cannot be run locally) | 9,000 tokens | available under subscription plan, commercial use allowed | The largest model, ideal for a wide range of more complex tasks. NB: paid. You must provide your Anthropic API key to use the model. Your Anthropic API account will be charged according to your usage. | +| [Anthropic Claude Instant v1](https://docs.anthropic.com/claude/reference/complete_post) | anthropic-api-claude-instant-v1 | no (paid access via API) | supposedly, 52B | - (cannot be run locally) | 9,000 tokens | available under subscription plan, commercial use allowed | A smaller model with far lower latency, sampling at roughly 40 words/sec! Its output quality is somewhat lower than the latest claude-1 model, particularly for complex tasks. However, it is much less expensive and blazing fast. NB: paid. You must provide your Anthropic API key to use the model. Your Anthropic API account will be charged according to your usage. | +| Russian XGLM 4.5B (private weights) | transformers-lm-ruxglm | no | 4.5B | 15GB | 2,048 tokens | Not available yet | A private large language model for the Russian language which was fine-tuned for instruction following by Dmitry Kosenko in Summer 2023. This model is up and running on our servers and can be used for free. | +| [ruGPT-3.5-13B](https://huggingface.co/ai-forever/ruGPT-3.5-13B) | transformers-lm-rugpt35 | yes | 13B | 35GB (half-precision) | 2,048 tokens | MIT | A large language model for the Russian language which was used for trainig GigaChat. This model is up and running on our servers and can be used for free. |