From f5a726a5ffde278e167221c6811dfded48313247 Mon Sep 17 00:00:00 2001 From: Thibault Genaitay Date: Mon, 4 Nov 2024 09:59:14 +0100 Subject: [PATCH] feat(genapi): public beta launch (#3923) --- .../api-cli/understanding-errors.mdx | 4 +- .../api-cli/using-chat-api.mdx | 2 +- .../how-to/query-embedding-models.mdx | 15 +- ...t-models.mdx => query-language-models.mdx} | 36 ++- .../how-to/query-vision-models.mdx | 238 ++++++++++++++++++ .../how-to/use-function-calling.mdx | 5 +- .../how-to/use-structured-outputs.mdx | 14 +- ai-data/generative-apis/quickstart.mdx | 17 +- .../reference-content/rate-limits.mdx | 37 ++- .../reference-content/supported-models.mdx | 14 +- menu/navigation.json | 8 +- 11 files changed, 335 insertions(+), 55 deletions(-) rename ai-data/generative-apis/how-to/{query-text-models.mdx => query-language-models.mdx} (74%) create mode 100644 ai-data/generative-apis/how-to/query-vision-models.mdx diff --git a/ai-data/generative-apis/api-cli/understanding-errors.mdx b/ai-data/generative-apis/api-cli/understanding-errors.mdx index 1b134fb5ad..2f8d964613 100644 --- a/ai-data/generative-apis/api-cli/understanding-errors.mdx +++ b/ai-data/generative-apis/api-cli/understanding-errors.mdx @@ -7,7 +7,7 @@ content: paragraph: This page explains how to understand errors with Generative APIs tags: generative-apis ai-data understanding-data dates: - validation: 2024-09-02 + validation: 2024-10-31 posted: 2024-09-02 --- @@ -32,6 +32,8 @@ Below are usual HTTP error codes: - 404 - **Route Not Found**: The requested resource could not be found. Check your request is being made to the correct endpoint. - 422 - **Model Not Found**: The `model` key is present in the request payload, but the corresponding model is not found. - 422 - **Missing Model**: The `model` key is missing from the request payload. +- 429 - **Too Many Requests**: You are exceeding your current quota for the requested model, calculated in requests per minute. 
Find rate limits on [this page](/ai-data/generative-apis/reference-content/rate-limits/) +- 429 - **Too Many Tokens**: You are exceeding your current quota for the requested model, calculated in tokens per minute. Find rate limits on [this page](/ai-data/generative-apis/reference-content/rate-limits/) - 500 - **API error**: An unexpected internal error has occurred within Scaleway's systems. If the issue persists, please [open a support ticket](https://console.scaleway.com/support/tickets/create). For streaming responses via SSE, 5xx errors may occur after a 200 response has been returned. \ No newline at end of file diff --git a/ai-data/generative-apis/api-cli/using-chat-api.mdx b/ai-data/generative-apis/api-cli/using-chat-api.mdx index b1a1236731..c95706a383 100644 --- a/ai-data/generative-apis/api-cli/using-chat-api.mdx +++ b/ai-data/generative-apis/api-cli/using-chat-api.mdx @@ -87,6 +87,6 @@ If you have a use case requiring one of these unsupported parameters, please [co ## Going further -1. [Python code examples](/ai-data/generative-apis/how-to/query-text-models/#querying-text-models-via-api) to query text models using Scaleway's Chat API. +1. [Python code examples](/ai-data/generative-apis/how-to/query-language-models/#querying-language-models-via-api) to query text models using Scaleway's Chat API. 2. [How to use structured outputs](/ai-data/generative-apis/how-to/use-structured-outputs) with the `response_format` parameter 3. 
[How to use function calling](/ai-data/generative-apis/how-to/use-function-calling) with `tools` and `tool_choice` \ No newline at end of file diff --git a/ai-data/generative-apis/how-to/query-embedding-models.mdx b/ai-data/generative-apis/how-to/query-embedding-models.mdx index 11facb0dc7..7a5dd8e6f1 100644 --- a/ai-data/generative-apis/how-to/query-embedding-models.mdx +++ b/ai-data/generative-apis/how-to/query-embedding-models.mdx @@ -5,9 +5,9 @@ meta: content: h1: How to query embedding models paragraph: Learn how to interact with embedding models using Scaleway's Generative APIs service. -tags: generative-apis ai-data embedding-models +tags: generative-apis ai-data embedding-models embeddings-api dates: - validation: 2024-08-28 + validation: 2024-10-30 posted: 2024-08-28 --- @@ -18,7 +18,6 @@ The embedding service is OpenAI compatible. Refer to OpenAI's [embedding documen -- Access to this service is restricted while in beta. You can request access to the product by filling out a form on Scaleway's [betas page](https://www.scaleway.com/en/betas/#generative-apis). 
- A Scaleway account logged into the [console](https://console.scaleway.com) - [Owner](/identity-and-access-management/iam/concepts/#owner) status or [IAM permissions](/identity-and-access-management/iam/concepts/#permission) allowing you to perform actions in the intended Organization - A valid [API key](/identity-and-access-management/iam/how-to/create-api-keys/) for API authentication @@ -51,22 +50,22 @@ client = OpenAI( ) ``` -### Generating embeddings with sentence-t5-xxl +### Generating embeddings with bge-multilingual-gemma2 -You can now generate embeddings using the `sentence-t5-xxl` model, such as the following example: +You can now generate embeddings using the `bge-multilingual-gemma2` model, such as the following example: ```python -# Generate embeddings using the 'sentence-t5-xxl' model +# Generate embeddings using the 'bge-multilingual-gemma2' model embedding_response = client.embeddings.create( input= "Artificial Intelligence is transforming the world.", - model= "sentence-t5-xxl" + model= "bge-multilingual-gemma2" ) # Output the embedding vector print(embedding_response.data[0].embedding) ``` -This code sends input text to the `sentence-t5-xxl` embedding model and returns a vector representation of the text. The `sentence-t5-xxl` model is specifically designed for generating high-quality sentence embeddings. +This code sends input text to the `bge-multilingual-gemma2` embedding model and returns a vector representation of the text. The `bge-multilingual-gemma2` model is specifically designed for generating high-quality sentence embeddings. 
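A common next step is to compare two embedding vectors with cosine similarity, for example to rank documents against a query. The sketch below is illustrative only: it uses small toy vectors in place of real model output; in practice you would pass two `embedding_response.data[i].embedding` lists from the client above.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for two embedding vectors returned by the API
v1 = [0.1, 0.3, 0.5]
v2 = [0.2, 0.1, 0.4]
print(round(cosine_similarity(v1, v2), 4))
```

Values close to 1 indicate semantically similar inputs; values close to 0 indicate unrelated inputs.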
### Model parameters and their effects diff --git a/ai-data/generative-apis/how-to/query-text-models.mdx b/ai-data/generative-apis/how-to/query-language-models.mdx similarity index 74% rename from ai-data/generative-apis/how-to/query-text-models.mdx rename to ai-data/generative-apis/how-to/query-language-models.mdx index 7194ab4a7d..6791561cdd 100644 --- a/ai-data/generative-apis/how-to/query-text-models.mdx +++ b/ai-data/generative-apis/how-to/query-language-models.mdx @@ -1,25 +1,24 @@ --- meta: - title: How to query text models - description: Learn how to interact with powerful text models using Scaleway's Generative APIs service. + title: How to query language models + description: Learn how to interact with powerful language models using Scaleway's Generative APIs service. content: - h1: How to query text models - paragraph: Learn how to interact with powerful text models using Scaleway's Generative APIs service. -tags: generative-apis ai-data text-models + h1: How to query language models + paragraph: Learn how to interact with powerful language models using Scaleway's Generative APIs service. +tags: generative-apis ai-data language-models chat-completions-api dates: - validation: 2024-08-28 + validation: 2024-10-30 posted: 2024-08-28 --- -Scaleway's Generative APIs service allows users to interact with powerful text models hosted on the platform. +Scaleway's Generative APIs service allows users to interact with powerful language models hosted on the platform. -There are several ways to interact with text models: -- The Scaleway [console](https://console.scaleway.com) will soon provide a complete [playground](/ai-data/generative-apis/how-to/query-text-models/#accessing-the-playground), aiming to test models, adapt parameters, and observe how these changes affect the output in real-time. 
-- Via the [Chat API](/ai-data/generative-apis/how-to/query-text-models/#querying-text-models-via-api)
+There are several ways to interact with language models:
+- The Scaleway [console](https://console.scaleway.com) provides a complete [playground](/ai-data/generative-apis/how-to/query-language-models/#accessing-the-playground), where you can test models, adapt parameters, and observe how these changes affect the output in real time.
+- Via the [Chat API](/ai-data/generative-apis/how-to/query-language-models/#querying-language-models-via-api)

-- Access to this service is restricted while in beta. You can request access to the product by filling out a form on Scaleway's [betas page](https://www.scaleway.com/en/betas/#generative-apis).
- A Scaleway account logged into the [console](https://console.scaleway.com)
- [Owner](/identity-and-access-management/iam/concepts/#owner) status or [IAM permissions](/identity-and-access-management/iam/concepts/#permission) allowing you to perform actions in the intended Organization
- A valid [API key](/identity-and-access-management/iam/how-to/create-api-keys/) for API authentication
@@ -27,9 +26,20 @@ There are several ways to interact with text models:

## Accessing the Playground

-Scaleway's Playground is in development, stay tuned!
+Scaleway provides a web playground for instruct-based models hosted on Generative APIs.

-## Querying text models via API
+1. Navigate to **Generative APIs** under the **AI** section of the [Scaleway console](https://console.scaleway.com/) side menu. The list of models you can query displays.
+2. Click the name of the chat model you want to try. Alternatively, click next to the chat model, and click **Try model** in the menu.
+
+The web playground displays.
+
+## Using the playground
+1. Enter a prompt at the bottom of the page, or use one of the suggested prompts in the conversation area.
+2. 
Edit the hyperparameters listed in the right column, for example the default temperature, to get more or less randomness in the output.
+3. Switch models at the top of the page to observe the capabilities of the chat models offered via Generative APIs.
+4. Click **View code** to get code snippets configured according to your settings in the playground.
+
+## Querying language models via API

The [Chat API](/ai-data/generative-apis/api-cli/using-chat-api/) is an OpenAI-compatible REST API for generating and manipulating conversations.

diff --git a/ai-data/generative-apis/how-to/query-vision-models.mdx b/ai-data/generative-apis/how-to/query-vision-models.mdx
new file mode 100644
index 0000000000..93e7782e55
--- /dev/null
+++ b/ai-data/generative-apis/how-to/query-vision-models.mdx
@@ -0,0 +1,238 @@
+---
+meta:
+  title: How to query vision models
+  description: Learn how to interact with powerful vision models using Scaleway's Generative APIs service.
+content:
+  h1: How to query vision models
+  paragraph: Learn how to interact with powerful vision models using Scaleway's Generative APIs service.
+tags: generative-apis ai-data vision-models chat-completions-api
+dates:
+  validation: 2024-10-30
+  posted: 2024-10-30
+---
+
+Scaleway's Generative APIs service allows users to interact with powerful vision models hosted on the platform.
+
+
+  Vision models can understand and analyze images, not generate them.
+
+
+There are several ways to interact with vision models:
+- The Scaleway [console](https://console.scaleway.com) provides a complete [playground](/ai-data/generative-apis/how-to/query-vision-models/#accessing-the-playground), where you can test models, adapt parameters, and observe how these changes affect the output in real time.
+- Via the [Chat API](/ai-data/generative-apis/how-to/query-vision-models/#querying-vision-models-via-api)
+
+
+
+- A Scaleway account logged into the [console](https://console.scaleway.com)
+- [Owner](/identity-and-access-management/iam/concepts/#owner) status or [IAM permissions](/identity-and-access-management/iam/concepts/#permission) allowing you to perform actions in the intended Organization
+- A valid [API key](/identity-and-access-management/iam/how-to/create-api-keys/) for API authentication
+- Python 3.7+ installed on your system
+
+## Accessing the playground
+
+Scaleway provides a web playground for vision models hosted on Generative APIs.
+
+1. Navigate to **Generative APIs** under the **AI** section of the [Scaleway console](https://console.scaleway.com/) side menu. The list of models you can query displays.
+2. Click the name of the vision model you want to try. Alternatively, click next to the vision model, and click **Try model** in the menu.
+
+The web playground displays.
+
+## Using the playground
+1. Upload one or multiple images to the prompt area at the bottom of the page. Enter a prompt, for example, to describe the image(s) you attached.
+2. Edit the hyperparameters listed in the right column, for example the default temperature, to get more or less randomness in the output.
+3. Switch models at the top of the page to observe the capabilities of the chat and vision models offered via Generative APIs.
+4. Click **View code** to get code snippets configured according to your settings in the playground.
+
+## Querying vision models via the API
+
+The [Chat API](/ai-data/generative-apis/api-cli/using-chat-api/) is an OpenAI-compatible REST API for generating and manipulating conversations.
+
+You can query the vision models programmatically using your favorite tools or languages.
+Vision models take both text and images as inputs.
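Concretely, each user turn carries a `content` array mixing text and `image_url` parts. A minimal sketch of a helper that assembles one such message — `make_user_message` is a hypothetical name, not part of the OpenAI SDK:

```python
def make_user_message(text, image_urls):
    # Hypothetical helper: build one user message whose content mixes
    # a text part and one image_url part per image.
    content = [{"type": "text", "text": text}]
    content += [{"type": "image_url", "image_url": {"url": url}} for url in image_urls]
    return {"role": "user", "content": content}

message = make_user_message("What is this image?", ["https://picsum.photos/id/32/512/512"])
print(len(message["content"]))  # → 2: one text part, one image part
```

The resulting dictionary can be passed directly in the `messages` list of a chat completion request.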
+ + + Unlike traditional language models, vision models will take a content array for the user role, structuring text and images as inputs. + + +In the following example, we will use the OpenAI Python client. + +### Installing the OpenAI SDK + +Install the OpenAI SDK using pip: + +```bash +pip install openai +``` + +### Initializing the client + +Initialize the OpenAI client with your base URL and API key: + +```python +from openai import OpenAI + +# Initialize the client with your base URL and API key +client = OpenAI( + base_url="https://api.scaleway.ai/v1", # Scaleway's Generative APIs service URL + api_key="" # Your unique API secret key from Scaleway +) +``` + +### Generating a chat completion + +You can now create a chat completion, for example with the `pixtral-12b-2409` model: + +```python +# Create a chat completion using the 'pixtral-12b-2409' model +response = client.chat.completions.create( + model="pixtral-12b-2409", + messages=[ + { + "role": "user", + "content": [ + {"type": "text", "text": "What is this image?"}, + {"type": "image_url", "image_url": {"url": "https://picsum.photos/id/32/512/512"}}, + ] # Vision models will take a content array with text and image_url objects. + + } + ], + temperature=0.7, # Adjusts creativity + max_tokens=2048, # Limits the length of the output + top_p=0.9 # Controls diversity through nucleus sampling. You usually only need to use temperature. +) + +# Print the generated response +print(response.choices[0].message.content) +``` + +This code sends messages, prompts and images, to the vision model and returns an answer based on your input. The `temperature`, `max_tokens`, and `top_p` parameters control the response's creativity, length, and diversity, respectively. + +A conversation style may include a default system prompt. You may set this prompt by setting the first message with the role system. For example: + +```python +[ + { + "role": "system", + "content": "You are Xavier Niel." 
+ } +] +``` + +### Passing images to Pixtral + +1. **Image URLs**: If the image is available online, you can just include the image URL in your request as demonstrated above. This approach is simple and does not require any encoding. +2. **Base64 encoded**: image Base64 encoding is a standard way to transform binary data, like images, into a text format, making it easier to transmit over the internet. + +The following Python code sample shows you how to encode an image in base64 format and pass it to your request payload. + +```python +import base64 +from io import BytesIO +from PIL import Image + +def encode_image(img): + buffered = BytesIO() + img.save(buffered, format="JPEG") + encoded_string = base64.b64encode(buffered.getvalue()).decode("utf-8") + return encoded_string + +img = Image.open("path_to_your_image.jpg") +base64_img = encode_image(img) + +payload = { + "messages": [ + { + "role": "user", + "content": [ + { + "type": "text", + "text": "What is this image?" + }, + { + "type": "image_url", + "image_url": { + "url": f"data:image/jpeg;base64,{base64_img}" + } + } + ] + } + ], + ... # other parameters +} + +``` + +### Model parameters and their effects + +The following parameters will influence the output of the model: + +- **`messages`**: A list of message objects that represent the conversation history. Each message should have a `role` (e.g., "system", "user", "assistant") and `content`. The content is an array that can contain text and/or image objects. +- **`temperature`**: Controls the output's randomness. Lower values (e.g., 0.2) make the output more deterministic, while higher values (e.g., 0.8) make it more creative. +- **`max_tokens`**: The maximum number of tokens (words or parts of words) in the generated output. +- **`top_p`**: Recommended for advanced use cases only. You usually only need to use temperature. 
`top_p` controls the diversity of the output, using nucleus sampling, where the model considers the tokens with top probabilities until the cumulative probability reaches `top_p`. +- **`stop`**: A string or list of strings where the model will stop generating further tokens. This is useful for controlling the end of the output. + + + If you encounter an error such as "Forbidden 403", refer to the [API documentation](/ai-data/generative-apis/api-cli/understanding-errors) for troubleshooting tips. + + +## Streaming + +By default, the outputs are returned to the client only after the generation process is complete. However, a common alternative is to stream the results back to the client as they are generated. This is particularly useful in chat applications, where it allows the client to view the results incrementally as each token is produced. +The following example shows how to use the chat completion API: + +```python +from openai import OpenAI + +client = OpenAI( + base_url="https://api.scaleway.ai/v1", # Scaleway's Generative APIs service URL + api_key="" # Your unique API key from Scaleway +) +response = client.chat.completions.create( + model="pixtral-12b-2409", + messages=[{ + "role": "user", + "content": [ + {"type": "text", "text": "What is this image?"}, + {"type": "image_url", "image_url": {"url": "https://picsum.photos/id/32/512/512"}}, + ] + }], + stream=True, +) + +for chunk in response: + if chunk.choices[0].delta.content: + print(chunk.choices[0].delta.content, end="") +``` + +## Async + +The service also supports asynchronous mode for any chat completion. 
```python
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI(
    base_url="https://api.scaleway.ai/v1",  # Scaleway's Generative APIs service URL
    api_key=""  # Your unique API key from Scaleway
)

async def main():
    stream = await client.chat.completions.create(
        model="pixtral-12b-2409",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "What is this image?"},
                {"type": "image_url", "image_url": {"url": "https://picsum.photos/id/32/512/512"}},
            ]
        }],
        stream=True,
    )
    async for chunk in stream:
        if chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="")

asyncio.run(main())
```
diff --git a/ai-data/generative-apis/how-to/use-function-calling.mdx b/ai-data/generative-apis/how-to/use-function-calling.mdx
index 7c817d3126..eac3a5de15 100644
--- a/ai-data/generative-apis/how-to/use-function-calling.mdx
+++ b/ai-data/generative-apis/how-to/use-function-calling.mdx
@@ -4,10 +4,10 @@ meta:
   description: Learn how to implement function calling capabilities using Scaleway's Chat Completions API service.
 content:
   h1: How to use function calling
-  paragraph: Learn how to enhance AI interactions by integrating external tools and functions using Scaleway's Chat Completions API service.
+  paragraph: Learn how to enhance AI applications by integrating external tools using Scaleway's Chat Completions API service.
 tags: chat-completions-api
 dates:
-  validation: 2024-09-24
+  validation: 2024-10-30
   posted: 2024-09-24
---

@@ -19,7 +19,6 @@ Function calling allows a large language model (LLM) to interact with external t



-- Access to Generative APIs.
- [Owner](/identity-and-access-management/iam/concepts/#owner) status or [IAM permissions](/identity-and-access-management/iam/concepts/#permission) allowing you to perform actions in the intended Organization - A valid [API key](/identity-and-access-management/iam/how-to/create-api-keys/) for API authentication - Python 3.7+ installed on your system diff --git a/ai-data/generative-apis/how-to/use-structured-outputs.mdx b/ai-data/generative-apis/how-to/use-structured-outputs.mdx index 5d69bb81d3..143a47a225 100644 --- a/ai-data/generative-apis/how-to/use-structured-outputs.mdx +++ b/ai-data/generative-apis/how-to/use-structured-outputs.mdx @@ -1,13 +1,13 @@ --- meta: title: How to use structured outputs - description: Learn how to interact with structured outputs using Scaleway's Chat Completions API service. + description: Learn how to get consistent JSON format responses using Scaleway's Chat Completions API service. content: h1: How to use structured outputs - paragraph: Learn how to interact with powerful text models using Scaleway's Chat Completions API service. + paragraph: Learn how to get consistent JSON format responses using Scaleway's Chat Completions API service. tags: chat-completions-api dates: - validation: 2024-09-17 + validation: 2024-10-30 posted: 2024-09-17 --- @@ -18,14 +18,12 @@ JSON, as a widely-used format, enables seamless integration with a variety of pl By specifying a response format when using the [Chat Completions API](/ai-data/generative-apis/api-cli/using-chat-api/), you can ensure that responses are returned in a JSON structure. There are two main modes for generating JSON: **Object Mode** (schemaless) and **Schema Mode** (deterministic, structured output). 
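As an illustration of Schema Mode, the sketch below assembles a `response_format` payload following the OpenAI `json_schema` convention. The exact field names accepted by the service are an assumption to verify against the Chat Completions API reference; the schema itself (`book`) is a made-up example.

```python
import json

# JSON schema the model output must conform to (Schema Mode)
book_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "year": {"type": "integer"},
    },
    "required": ["title", "year"],
}

# Assumed OpenAI-style wrapper, passed as response_format=... to
# client.chat.completions.create(); field names are an assumption.
response_format = {
    "type": "json_schema",
    "json_schema": {"name": "book", "schema": book_schema},
}

print(json.dumps(response_format, indent=2))
```

With such a payload, the model is constrained to return a JSON object with exactly the declared keys, which downstream code can parse without defensive checks.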
-You can interact with text models in several ways:
-- Via the Scaleway [console](https://console.scaleway.com), which will soon provide a complete [playground](/ai-data/generative-apis/how-to/query-text-models/#accessing-the-playground), aiming to test models, adapt parameters, and observe how these changes affect the output in real-time.
-- Via the [Chat API](/ai-data/generative-apis/how-to/query-text-models/#querying-text-models-via-api)
+There are several ways to interact with language models:
+- The Scaleway [console](https://console.scaleway.com) provides a complete [playground](/ai-data/generative-apis/how-to/query-language-models/#accessing-the-playground), where you can test models, adapt parameters, and observe how these changes affect the output in real time.
+- Via the [Chat API](/ai-data/generative-apis/how-to/query-language-models/#querying-language-models-via-api)

-- Access to Generative APIs.
- While in beta, the service is restricted to invited users. You can request access by filling out a form on Scaleway's [Betas page](https://www.scaleway.com/en/betas/#generative-apis).
- A Scaleway account logged into the [console](https://console.scaleway.com)
- [Owner](/identity-and-access-management/iam/concepts/#owner) status or [IAM permissions](/identity-and-access-management/iam/concepts/#permission) allowing you to perform actions in the intended Organization
- A valid [API key](/identity-and-access-management/iam/how-to/create-api-keys/) for API authentication
diff --git a/ai-data/generative-apis/quickstart.mdx b/ai-data/generative-apis/quickstart.mdx
index 0eabad417b..83c320237e 100644
--- a/ai-data/generative-apis/quickstart.mdx
+++ b/ai-data/generative-apis/quickstart.mdx
@@ -7,7 +7,7 @@ content:
   paragraph: Get started with Scaleway Generative APIs for powerful AI-driven content generation. Follow this guide to set up, configure, and make your first API request.
tags: generative-apis ai-data quickstart
 dates:
-  validation: 2024-09-04
+  validation: 2024-10-30
   posted: 2024-09-04
 categories:
   - ai-data
@@ -24,7 +24,6 @@ Hosted in European data centers and priced competitively per million tokens used


- Access to this service is restricted while in beta. You can request access to the product by filling out a form on the Scaleway's [betas page](https://www.scaleway.com/en/betas/#generative-apis).
- A Scaleway account logged into the [console](https://console.scaleway.com)
- [Owner](/identity-and-access-management/iam/concepts/#owner) status or [IAM permissions](/identity-and-access-management/iam/concepts/#permission) allowing you to perform actions in the intended Organization
- A valid [API key](/identity-and-access-management/iam/how-to/create-api-keys/)
@@ -32,7 +31,19 @@ Hosted in European data centers and priced competitively per million tokens used

## Start with the Generative APIs Playground

-Scaleway's Playground is in development, stay tuned!
+Scaleway provides a web playground for instruct-based models hosted on Generative APIs.
+
+### Accessing the Playground
+1. Navigate to **Generative APIs** under the **AI** section of the [Scaleway console](https://console.scaleway.com/) side menu. The list of models you can query displays.
+2. Click the name of the chat model you want to try. Alternatively, click next to the chat model, and click **Try model** in the menu.
+
+The web playground displays.
+
+### Using the playground
+1. Enter a prompt at the bottom of the page, or use one of the suggested prompts in the conversation area.
+2. Edit the hyperparameters listed in the right column, for example the default temperature, to get more or less randomness in the output.
+3. Switch models at the top of the page to observe the capabilities of the chat models offered via Generative APIs.
+4. Click **View code** to get code snippets configured according to your settings in the playground.
## Install the OpenAI Python SDK diff --git a/ai-data/generative-apis/reference-content/rate-limits.mdx b/ai-data/generative-apis/reference-content/rate-limits.mdx index 78839ecbd8..8afbd81738 100644 --- a/ai-data/generative-apis/reference-content/rate-limits.mdx +++ b/ai-data/generative-apis/reference-content/rate-limits.mdx @@ -1,28 +1,43 @@ --- meta: title: What are Rate limits with Scaleway Generative APIs - description: Find our service limits in tokens per minute and queries per second + description: Find our service limits in tokens per minute and queries per minute content: h1: Rate limits - paragraph: This service will have limits sets at a later stage. + paragraph: Find our service limits in tokens per minute and queries per minute tags: generative-apis ai-data rate-limits dates: - validation: 2024-08-27 + validation: 2024-10-30 posted: 2024-08-27 --- ## What are the limits? - - This service has no rate limits while in closed beta. Limits will be set at a later stage. - - -Any given model served through Scaleway Generative APIs will ultimately get limited by: +Any model served through Scaleway Generative APIs gets limited by: - Tokens per minute -- Queries per second +- Queries per minute + +### Chat models + +| Model string | Requests per minute | Tokens per minute | +|-----------------|-----------------|-----------------| +| `llama-3.1-8b-instruct` | 300 | 100K | +| `llama-3.1-70b-instruct` | 300 | 100K | +| `mistral-nemo-instruct-2407`| 300 | 100K | +| `pixtral-12b-2409`| 300 | 100K | -We welcome feedback from early testers to set proper rates according to future use. +### Embedding models + +| Model string | Requests per minute | Tokens per minute | +|-----------------|-----------------|-----------------| +| `sentence-t5-xxl` | 600 | 1M | +| `bge-multilingual-gemma2` | 600 | 1M | ## Why do we set rate limits? 
-These limits will safeguard against abuse or misuse of Scaleway Generative APIs, helping to ensure fair access to the API with consistent performance. \ No newline at end of file
+These limits safeguard against abuse or misuse of Scaleway Generative APIs, helping to ensure fair access to the API with consistent performance.
+
+## How can I increase the rate limits?
+
+We actively monitor usage and will improve rates based on feedback.
+If you need to increase your rate limits, contact our support team, providing details on the model used and your specific use case. \ No newline at end of file
diff --git a/ai-data/generative-apis/reference-content/supported-models.mdx b/ai-data/generative-apis/reference-content/supported-models.mdx
index 9975b24969..476d2951af 100644
--- a/ai-data/generative-apis/reference-content/supported-models.mdx
+++ b/ai-data/generative-apis/reference-content/supported-models.mdx
@@ -7,7 +7,7 @@ content:
   paragraph: Generative APIs offer serverless AI models hosted at Scaleway - no need to configure hardware or deploy your own models
 tags: generative-apis ai-data supported-models
 dates:
-  validation: 2024-09-02
+  validation: 2024-10-30
   posted: 2024-09-02
---

@@ -15,14 +15,17 @@ dates:

This service is free while in beta. [Specific terms and conditions](https://www.scaleway.com/en/contracts/) apply.

-Our [Chat API](/ai-data/generative-apis/how-to/query-text-models) has built-in support for the most popular instruct models.
+Our [Chat API](/ai-data/generative-apis/how-to/query-language-models) has built-in support for the most popular instruct models.
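Since the API is OpenAI-compatible, the list of models available to your key can also be fetched programmatically. This is a sketch under the assumption that the service exposes the standard `GET /v1/models` route, which the OpenAI SDK wraps as `client.models.list()`; verify the route against the API reference.

```python
import os

# Assumption: an OpenAI-compatible GET /v1/models route is exposed.
base_url = "https://api.scaleway.ai/v1"
models_url = f"{base_url}/models"
headers = {"Authorization": f"Bearer {os.environ.get('SCW_SECRET_KEY', '<api secret key>')}"}

# With the OpenAI client initialized as in the how-to guides:
# for model in client.models.list().data:
#     print(model.id)
print(models_url)  # → https://api.scaleway.ai/v1/models
```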
## Chat models | Provider | Model string | Context window | License | Model card | |-----------------|-----------------|-----------------|-----------------|-----------------| -| Meta | `llama-3.1-8b-instruct` | 128k | [Llama 3.1 Community License Agreement](https://llama.meta.com/llama3_1/license/) | [HF](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct) | +| Meta | `llama-3.1-8b-instruct` | 128k | [Llama 3.1 Community License Agreement](https://llama.meta.com/llama3_1/license/) | [HF](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) | +| Meta | `llama-3.1-70b-instruct` | 128k | [Llama 3.1 Community License Agreement](https://llama.meta.com/llama3_1/license/) | [HF](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct) | | Mistral | `mistral-nemo-instruct-2407` | 128k | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) | +| Mistral | `pixtral-12b-2409` | 128k | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/mistralai/Pixtral-12B-2409) | + If you are unsure which chat model to use, we currently recommend Llama 3.1 8B Instruct (`llama-3.1-8b-instruct`) to get started. 
@@ -39,11 +42,12 @@ Our [Embeddings API](/ai-data/generative-apis/how-to/query-embedding-models) pro | Provider | Model string | Model size | Embedding dimension | Context window | License | Model card | |-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------| | SBERT | `sentence-t5-xxl` | 5B | 768 | 512 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/sentence-transformers/sentence-t5-xxl) | +| BAAI | `bge-multilingual-gemma2` | 9B | 3584 | 4096 | [Gemma](https://ai.google.dev/gemma/terms) | [HF](https://huggingface.co/BAAI/bge-multilingual-gemma2) | ## Request a model **Do not see a model you want to use?** [Tell us or vote for what you would like to add here.](https://feature-request.scaleway.com/?tags=ai-services) -## Deprecated models +## EOL models -This section will list models retired and no longer accessible for use. All models are currently in `Active` status. \ No newline at end of file +This section will list models retired and no longer accessible for use. All models are currently in `Active` status. diff --git a/menu/navigation.json b/menu/navigation.json index 5a2ebb7445..14de5aac44 100644 --- a/menu/navigation.json +++ b/menu/navigation.json @@ -668,8 +668,12 @@ { "items": [ { - "label": "Query text models", - "slug": "query-text-models" + "label": "Query language models", + "slug": "query-language-models" + }, + { + "label": "Query vision models", + "slug": "query-vision-models" }, { "label": "Query embedding models",