Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(genapi): getting ready for public beta #3844

Closed
wants to merge 68 commits into from
Closed
Show file tree
Hide file tree
Changes from 2 commits
Commits
Show all changes
68 commits
Select commit Hold shift + click to select a range
fd154c8
feat(genapi): getting ready for public beta
tgenaitay Oct 16, 2024
c1263e8
Apply suggestions from code review
bene2k1 Oct 16, 2024
99a7f10
feat(ai): reduced due to sliding attention window
tgenaitay Oct 30, 2024
a701da2
fix(aps): update quotas (#3847)
bene2k1 Oct 16, 2024
2bf08d6
chore(dbaas): review docs 2024-10-16 (#3842)
bene2k1 Oct 17, 2024
48c7c98
doc(tem): indicates encryption is mandatory (#3840)
cbouvat Oct 17, 2024
b4285db
fix(ddx): fix migration macro (#3843)
bene2k1 Oct 17, 2024
1ee06d5
feat(tem): update quota information (#3848)
bene2k1 Oct 17, 2024
19c3a8b
chore(gen): review documentation MTA-5152 (#3837)
SamyOubouaziz Oct 17, 2024
e98228d
feat(k8s): add information about routed ip migration completed (#3849)
bene2k1 Oct 17, 2024
c031a64
feat(ins): update create instance (#3850)
bene2k1 Oct 17, 2024
f8af361
fix(tem): fix missing closing tag (#3851)
bene2k1 Oct 17, 2024
9189c88
chore(gen): review iam and labs 2024-10-16 (#3845)
bene2k1 Oct 17, 2024
f7ee3a3
feat(changelog): transactional-email-changed-increase-in-minimum-quot…
ofranc Oct 18, 2024
caa3b33
fix(tem): fix wrong validation date (#3852)
bene2k1 Oct 18, 2024
eaa21f7
fix: clarify partition labels (#3855)
pierreozoux Oct 18, 2024
b03b49c
fix(ins): fix create doc for windows and typos (#3854)
bene2k1 Oct 21, 2024
9ccb191
feat: add warning about limitations of kosmos upgrade (#3857)
pierreozoux Oct 21, 2024
8bc10ce
chore(tuto): review tutorials (#3860)
bene2k1 Oct 21, 2024
0f6f8eb
chore(ddx): review docs compute (#3859)
bene2k1 Oct 21, 2024
c0cb1ba
docs(serverless): add entries in serverless FAQ (#3858)
Bemilie Oct 21, 2024
fafe37f
fix(pgw): add more troubleshooting (#3862)
RoRoJ Oct 22, 2024
21a256b
fix(vpc): review doc (#3861)
RoRoJ Oct 22, 2024
00d36e7
fix(ddx): update ip addresses of ntp servers (#3863)
bene2k1 Oct 22, 2024
e43cbce
docs(SRV): add note on external container registries MTA-5165 (#3866)
SamyOubouaziz Oct 23, 2024
85220f1
feat(changelog): elastic-metal-added-disk-partitioning-configuration-…
ofranc Oct 23, 2024
d497214
docs(cpt): add gpu instanceXCockpit tuto (#3865)
nerda-codes Oct 23, 2024
9fbab81
fix(inference): changed to full names everywhere (#3871)
tgenaitay Oct 23, 2024
37a7390
fix(bil): added guideflow (#3864)
jsudraud Oct 23, 2024
df41c7b
fix(edge): fix duplicate anchors (#3869)
RoRoJ Oct 23, 2024
eb7f7ef
fix(pgw): note recommended solution (#3867)
RoRoJ Oct 23, 2024
fa7ed16
feat(wbh): add low level doc (#3783)
bene2k1 Oct 23, 2024
f1e7375
chore(GEN): documentation review MTA-5176 (#3870)
SamyOubouaziz Oct 23, 2024
4dae4ed
feat(wbh): add tip to howto pages (#3876)
bene2k1 Oct 24, 2024
75f3e60
docs(cpt): mention grafana agent deprecation (#3872)
nerda-codes Oct 24, 2024
39e9b7b
feat(IAM): Add undocumented permission sets (#3878)
crlptl Oct 24, 2024
60f5e99
fix(rdb): reviews 21/10 (#3879)
ldecarvalho-doc Oct 24, 2024
be0a7b6
feat(aps): add troubleshooting apple id (#3877)
bene2k1 Oct 24, 2024
a82ffe1
fix(wbh): fix meta information (#3881)
bene2k1 Oct 24, 2024
b1834d2
feat(wbh): add php version overview (#3883)
bene2k1 Oct 24, 2024
e7dcafa
feat(k8s): add modifying kernel documentation (#3882)
bene2k1 Oct 24, 2024
71c1796
docs(cpt): add tags and category (#3880)
nerda-codes Oct 24, 2024
1bcafaf
fix(edge): correct error troubleshooting (#3884)
RoRoJ Oct 24, 2024
4a6171e
Update index.mdx (#3885)
fpagny Oct 25, 2024
1a885bc
fix(wbh): fix date from future (#3887)
bene2k1 Oct 25, 2024
4cdfb31
Update index.mdx (#3888)
fpagny Oct 25, 2024
f6f3c66
feat(ai): bring support for function calling via tools and tools_choi…
tgenaitay Oct 25, 2024
ad89918
feat(changelog): add new entry (#3892)
ofranc Oct 25, 2024
2b808c2
fix(docs): update ipfs-pinning quickstart (#3894)
christian-vdz Oct 28, 2024
5407ee4
docs(srv): add notes and troubleshooting on encoding secrets and env …
SamyOubouaziz Oct 28, 2024
7f44991
fix(pgw): remove duplicate H2 (#3897)
RoRoJ Oct 28, 2024
cccdb1c
fix(ai): add missing link (#3899)
bene2k1 Oct 28, 2024
ad62179
fix(serverless): faq title level (#3901)
thomas-tacquet Oct 29, 2024
148b1bf
docs(srv): add SEM x Jobs documentation MTA-5199 (#3898)
SamyOubouaziz Oct 29, 2024
b9bd069
feat(serverless): fix somesyntax highlighting (#3895)
thomas-tacquet Oct 29, 2024
38468c2
feat(serverless): advanced cold starts doc (#3891)
thomas-tacquet Oct 29, 2024
1e81d59
docs(S3): update mentions of S3 in documentation MTA-5188 (#3874)
SamyOubouaziz Oct 29, 2024
8196ac6
feat(instances): understanding qga (#3896)
bene2k1 Oct 29, 2024
7c996be
chore(tutorials): review (#3902)
bene2k1 Oct 29, 2024
a59253a
chore(domains): weekly review (#3904)
nerda-codes Oct 29, 2024
be58816
fix(ai): fix double link in navigation (#3906)
bene2k1 Oct 29, 2024
8a150a5
fix(s3): fix rebase error (#3905)
SamyOubouaziz Oct 29, 2024
6ef6cff
Update use-structured-outputs.mdx (#3908)
fpagny Oct 29, 2024
a054c23
fix(ai): missing export (#3910)
tgenaitay Oct 29, 2024
18d1af1
fix(datalab): wording overview page (#3911)
jsudraud Oct 29, 2024
f8644a9
fix(k8s): remove doc temprary (#3912)
bene2k1 Oct 29, 2024
8b839ba
fix(ins): add posted date (#3909)
bene2k1 Oct 30, 2024
d7e9af3
feat(genapi): introducing vision models
tgenaitay Oct 30, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions ai-data/generative-apis/api-cli/understanding-errors.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,8 @@ Below are usual HTTP error codes:
- 404 - **Route Not Found**: The requested resource could not be found. Check your request is being made to the correct endpoint.
- 422 - **Model Not Found**: The `model` key is present in the request payload, but the corresponding model is not found.
- 422 - **Missing Model**: The `model` key is missing from the request payload.
- 429 - **Too Many Requests**: You are exceeding your current quota for the requested model, calculated in requests per minute. Find rate limits on [this page](/ai-data/generative-apis/reference-content/rate-limits/)
- 429 - **Too Many Tokens**: You are exceeding your current quota for the requested model, calculated in tokens per minute. Find rate limits on [this page](/ai-data/generative-apis/reference-content/rate-limits/)
- 500 - **API error**: An unexpected internal error has occurred within Scaleway's systems. If the issue persists, please [open a support ticket](https://console.scaleway.com/support/tickets/create).

For streaming responses via SSE, 5xx errors may occur after a 200 response has been returned.
13 changes: 12 additions & 1 deletion ai-data/generative-apis/how-to/query-text-models.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,18 @@ There are several ways to interact with text models:

## Accessing the Playground

Scaleway's Playground is in development, stay tuned!
Scaleway provides a web playground for instruct-based models hosted on Generative APIs.

1. Navigate to Generative APIs under the AI section of the [Scaleway console](https://console.scaleway.com/) side menu. The list of models you can query displays.
2. Click the name of the chat model you want to try. Alternatively, click <Icon name="more" /> next to the chat model, and click **Try model** in the menu.

The web playground displays.

bene2k1 marked this conversation as resolved.
Show resolved Hide resolved
## Using the Playground
1. Enter a prompt at the bottom of the page, or use one of the suggested prompts in the conversation area.
2. Edit the hyperparameters listed on the right column, for example the default temperature for more or less randomness on the outputs.
3. Switch model at the top of the page, to observe the capabilities of chat models offered via Generative APIs.
4. Click **View code** to get code snippets configured according to your settings in the playground.

## Querying text models via API

Expand Down
14 changes: 13 additions & 1 deletion ai-data/generative-apis/quickstart.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -32,7 +32,19 @@ Hosted in European data centers and priced competitively per million tokens used

## Start with the Generative APIs Playground

Scaleway's Playground is in development, stay tuned!
Scaleway provides a web playground for instruct-based models hosted on Generative APIs.

bene2k1 marked this conversation as resolved.
Show resolved Hide resolved
### Accessing the Playground
1. Navigate to Generative APIs under the AI section of the [Scaleway console](https://console.scaleway.com/) side menu. The list of models you can query displays.
2. Click the name of the chat model you want to try. Alternatively, click <Icon name="more" /> next to the chat model, and click **Try model** in the menu.

The web playground displays.

bene2k1 marked this conversation as resolved.
Show resolved Hide resolved
### Using the Playground
1. Enter a prompt at the bottom of the page, or use one of the suggested prompts in the conversation area.
2. Edit the hyperparameters listed on the right column, for example the default temperature for more or less randomness on the outputs.
3. Switch model at the top of the page, to observe the capabilities of chat models offered via Generative APIs.
4. Click **View code** to get code snippets configured according to your settings in the playground.

## Install the OpenAI Python SDK

Expand Down
31 changes: 23 additions & 8 deletions ai-data/generative-apis/reference-content/rate-limits.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -13,16 +13,31 @@ dates:

## What are the limits?

<Message type="important">
This service has no rate limits while in closed beta. Limits will be set at a later stage.
</Message>

Any given model served through Scaleway Generative APIs will ultimately get limited by:
Any model served through Scaleway Generative APIs gets limited by:
- Tokens per minute
- Queries per second
- Queries per minute

### Chat models

| Model string | Requests per minute | Tokens per minute |
|-----------------|-----------------|-----------------|
| `llama-3.1-8b-instruct` | 300 | 100K |
| `llama-3.1-70b-instruct` | 300 | 100K |
| `mistral-nemo-instruct-2407`| 300 | 100K |
| `pixtral-12b-2409`| 300 | 100K |

We welcome feedback from early testers to set proper rates according to future use.
### Embedding models

| Model string | Requests per minute | Tokens per minute |
|-----------------|-----------------|-----------------|
| `sentence-t5-xxl` | 600 | 1M |
| `bge-multilingual-gemma2` | 600 | 1M |

## Why do we set rate limits?

These limits will safeguard against abuse or misuse of Scaleway Generative APIs, helping to ensure fair access to the API with consistent performance.
These limits safeguard against abuse or misuse of Scaleway Generative APIs, helping to ensure fair access to the API with consistent performance.

## How can I increase the rate limits?

We actively monitor usage and will improve rates based on feedback.
If you need to increase your rate limits, please contact us via the support, providing details on the model used and specific use case.
Original file line number Diff line number Diff line change
Expand Up @@ -21,8 +21,11 @@ Our [Chat API](/ai-data/generative-apis/how-to/query-text-models) has built-in s

| Provider | Model string | Context window | License | Model card |
|-----------------|-----------------|-----------------|-----------------|-----------------|
| Meta | `llama-3.1-8b-instruct` | 128k | [Llama 3.1 Community License Agreement](https://llama.meta.com/llama3_1/license/) | [HF](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct) |
| Meta | `llama-3.1-8b-instruct` | 128k | [Llama 3.1 Community License Agreement](https://llama.meta.com/llama3_1/license/) | [HF](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) |
| Meta | `llama-3.1-70b-instruct` | 128k | [Llama 3.1 Community License Agreement](https://llama.meta.com/llama3_1/license/) | [HF](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct) |
| Mistral | `mistral-nemo-instruct-2407` | 128k | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) |
| Mistral | `pixtral-12b-2409` | 128k | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/mistralai/Pixtral-12B-2409) |


<Message type="tip">
If you are unsure which chat model to use, we currently recommend Llama 3.1 8B Instruct (`llama-3.1-8b-instruct`) to get started.
Expand All @@ -39,6 +42,7 @@ Our [Embeddings API](/ai-data/generative-apis/how-to/query-embedding-models) pro
| Provider | Model string | Model size | Embedding dimension | Context window | License | Model card |
|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|
| SBERT | `sentence-t5-xxl` | 5B | 768 | 512 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/sentence-transformers/sentence-t5-xxl) |
| BAAI | `bge-multilingual-gemma2` | 9B | 3584 | 8192 | [Gemma](https://ai.google.dev/gemma/terms) | [HF](https://huggingface.co/BAAI/bge-multilingual-gemma2) |

## Request a model

Expand Down
Loading