scaleway · tgenaitay · Oct 16, 2024 · Oct 16, 2024 · Oct 30, 2024 · Oct 16, 2024
@@ -32,6 +32,8 @@ Below are usual HTTP error codes:
 - 404 - **Route Not Found**: The requested resource could not be found. Check your request is being made to the correct endpoint.
 - 422 - **Model Not Found**: The `model` key is present in the request payload, but the corresponding model is not found.
 - 422 - **Missing Model**:  The `model` key is missing from the request payload.
+- 429 - **Too Many Requests**: You are exceeding your current quota for the requested model, calculated in requests per minute. Find rate limits on [this page](/ai-data/generative-apis/reference-content/rate-limits/).
+- 429 - **Too Many Tokens**: You are exceeding your current quota for the requested model, calculated in tokens per minute. Find rate limits on [this page](/ai-data/generative-apis/reference-content/rate-limits/).
 - 500 - **API error**: An unexpected internal error has occurred within Scaleway's systems. If the issue persists, please [open a support ticket](https://console.scaleway.com/support/tickets/create).
 
 For streaming responses via SSE, 5xx errors may occur after a 200 response has been returned.
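
A client could handle the two 429 responses listed in this hunk by retrying with exponential backoff. The sketch below is only an illustration, not a client recommended by Scaleway: `call_with_backoff` and its `send` callable (which returns an HTTP status code and a body) are hypothetical names, and the delay values are arbitrary assumptions.

```python
import random
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call `send` and retry with exponential backoff while it returns HTTP 429.

    `send` is a hypothetical callable that performs one request and returns
    (status_code, body). Any status other than 429 is returned immediately.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    status, body = send()
    for attempt in range(max_attempts - 1):
        if status != 429:
            break
        # Wait 1x, 2x, 4x... the base delay, plus jitter so that several
        # clients do not all retry at the same instant.
        delay = base_delay * (2 ** attempt)
        sleep(delay + random.uniform(0, delay / 2))
        status, body = send()
    return status, body
```

The `sleep` parameter is injectable so the retry logic can be exercised without real waiting. If every attempt returns 429, the last 429 response is handed back to the caller.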
@@ -27,7 +27,18 @@ There are several ways to interact with text models:
 
 ## Accessing the Playground
 
-Scaleway's Playground is in development, stay tuned!
+Scaleway provides a web playground for instruct-based models hosted on Generative APIs.
+
+1. Navigate to Generative APIs under the AI section of the [Scaleway console](https://console.scaleway.com/) side menu. The list of models you can query displays.
+2. Click the name of the chat model you want to try. Alternatively, click <Icon name="more" /> next to the chat model, and click **Try model** in the menu. 
+
+The web playground displays.
+
+## Using the Playground
+1. Enter a prompt at the bottom of the page, or use one of the suggested prompts in the conversation area.
+2. Edit the hyperparameters listed in the right column, for example the default temperature, to make outputs more or less random.
+3. Switch models at the top of the page to compare the capabilities of the chat models offered via Generative APIs.
+4. Click **View code** to get code snippets configured according to your settings in the playground.
 
 ## Querying text models via API
 

@@ -32,7 +32,19 @@ Hosted in European data centers and priced competitively per million tokens used
 
 ## Start with the Generative APIs Playground
 
-Scaleway's Playground is in development, stay tuned!
+Scaleway provides a web playground for instruct-based models hosted on Generative APIs.
+
+### Accessing the Playground
+1. Navigate to Generative APIs under the AI section of the [Scaleway console](https://console.scaleway.com/) side menu. The list of models you can query displays.
+2. Click the name of the chat model you want to try. Alternatively, click <Icon name="more" /> next to the chat model, and click **Try model** in the menu. 
+
+The web playground displays.
+
+### Using the Playground
+1. Enter a prompt at the bottom of the page, or use one of the suggested prompts in the conversation area.
+2. Edit the hyperparameters listed in the right column, for example the default temperature, to make outputs more or less random.
+3. Switch models at the top of the page to compare the capabilities of the chat models offered via Generative APIs.
+4. Click **View code** to get code snippets configured according to your settings in the playground.
 
 ## Install the OpenAI Python SDK
 

@@ -13,16 +13,31 @@ dates:
 
 ## What are the limits?
 
-<Message type="important">
-  This service has no rate limits while in closed beta. Limits will be set at a later stage.
-</Message>
-
-Any given model served through Scaleway Generative APIs will ultimately get limited by:
+Any model served through Scaleway Generative APIs is limited by:
 - Tokens per minute
-- Queries per second
+- Requests per minute
+
+### Chat models
+
+| Model string | Requests per minute | Tokens per minute |
+|-----------------|-----------------|-----------------|
+| `llama-3.1-8b-instruct` | 300 | 100K |
+| `llama-3.1-70b-instruct` | 300 | 100K |
+| `mistral-nemo-instruct-2407`| 300 | 100K |
+| `pixtral-12b-2409`| 300 | 100K |
 
-We welcome feedback from early testers to set proper rates according to future use.
+### Embedding models 
+
+| Model string | Requests per minute | Tokens per minute |
+|-----------------|-----------------|-----------------|
+| `sentence-t5-xxl` | 600 | 1M |
+| `bge-multilingual-gemma2` | 600 | 1M |
 
 ## Why do we set rate limits?
 
-These limits will safeguard against abuse or misuse of Scaleway Generative APIs, helping to ensure fair access to the API with consistent performance.
+These limits safeguard against abuse or misuse of Scaleway Generative APIs, helping to ensure fair access to the API with consistent performance.
+
+## How can I increase the rate limits?
+
+We actively monitor usage and will adjust rate limits based on feedback.
+If you need higher rate limits, please contact our support team, providing details on the model used and your specific use case.
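
To stay under the per-model limits in the tables above, a client can track its own usage before sending requests. The following is a minimal sketch, assuming a sliding 60-second window and reading "100K" as 100,000 tokens. How the service itself counts its window is not stated in this change, and `MinuteWindowLimiter` is a hypothetical helper rather than part of any SDK.

```python
import time
from collections import deque
from typing import Callable, Deque, Tuple


class MinuteWindowLimiter:
    """Client-side guard for requests-per-minute and tokens-per-minute quotas."""

    def __init__(
        self,
        max_requests: int,
        max_tokens: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.max_tokens = max_tokens
        self.clock = clock
        # Each entry is (timestamp, tokens) for one admitted request.
        self.events: Deque[Tuple[float, int]] = deque()

    def _prune(self, now: float) -> None:
        # Drop requests that are 60 seconds old or older.
        while self.events and now - self.events[0][0] >= 60.0:
            self.events.popleft()

    def try_acquire(self, tokens: int) -> bool:
        """Record the request and return True if it fits both quotas."""
        now = self.clock()
        self._prune(now)
        if len(self.events) + 1 > self.max_requests:
            return False
        if sum(t for _, t in self.events) + tokens > self.max_tokens:
            return False
        self.events.append((now, tokens))
        return True


# Values taken from the `llama-3.1-8b-instruct` row of the chat models table.
limiter = MinuteWindowLimiter(max_requests=300, max_tokens=100_000)
```

A caller would check `limiter.try_acquire(estimated_tokens)` before each request and wait or queue the request when it returns `False`. The token count has to be estimated by the caller, since the exact count is only known once the response arrives.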
@@ -21,8 +21,11 @@ Our [Chat API](/ai-data/generative-apis/how-to/query-text-models) has built-in s
 
 | Provider | Model string | Context window | License | Model card |
 |-----------------|-----------------|-----------------|-----------------|-----------------|
-| Meta        | `llama-3.1-8b-instruct`  | 128k  | [Llama 3.1 Community License Agreement](https://llama.meta.com/llama3_1/license/) | [HF](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct) |
+| Meta        | `llama-3.1-8b-instruct`  | 128k  | [Llama 3.1 Community License Agreement](https://llama.meta.com/llama3_1/license/) | [HF](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) |
+| Meta        | `llama-3.1-70b-instruct`  | 128k  | [Llama 3.1 Community License Agreement](https://llama.meta.com/llama3_1/license/) | [HF](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct) |
 | Mistral      | `mistral-nemo-instruct-2407`                 | 128k | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) |
+| Mistral      | `pixtral-12b-2409`                 | 128k | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/mistralai/Pixtral-12B-2409) |
+
 
 <Message type="tip">
   If you are unsure which chat model to use, we currently recommend Llama 3.1 8B Instruct (`llama-3.1-8b-instruct`) to get started.
@@ -39,6 +42,7 @@ Our [Embeddings API](/ai-data/generative-apis/how-to/query-embedding-models) pro
 | Provider | Model string | Model size | Embedding dimension | Context window |  License | Model card |
 |-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|
 | SBERT        | `sentence-t5-xxl`  | 5B  | 768 | 512 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/sentence-transformers/sentence-t5-xxl) |
+| BAAI        | `bge-multilingual-gemma2`  | 9B  | 3584 | 8192 | [Gemma](https://ai.google.dev/gemma/terms) | [HF](https://huggingface.co/BAAI/bge-multilingual-gemma2) |
 
 ## Request a model