SageMaker endpoint magic command support #215

Merged
merged 10 commits into from
Jun 8, 2023
35 changes: 29 additions & 6 deletions docs/source/users/index.md
@@ -48,7 +48,7 @@ Jupyter AI supports the following model providers:
| HuggingFace Hub | `huggingface_hub` | `HUGGINGFACEHUB_API_TOKEN` | `huggingface_hub`, `ipywidgets`, `pillow` |
| OpenAI | `openai` | `OPENAI_API_KEY` | `openai` |
| OpenAI (chat) | `openai-chat` | `OPENAI_API_KEY` | `openai` |
| SageMaker Endpoints | `sagemaker-endpoint` | N/A | `boto3` |
| SageMaker | `sagemaker-endpoint` | N/A | `boto3` |

The environment variable names shown above are also the names of the settings keys used when setting up the chat interface.

@@ -177,20 +177,20 @@ To compose a message, type it in the text box at the bottom of the chat interfac
alt='Screen shot of an example "Hello world" message sent to Jupyternaut, who responds with "Hello world, how are you today?"'
class="screenshot" />

### Usage with SageMaker Endpoints
### Using the chat interface with SageMaker endpoints

Jupyter AI supports language models hosted on SageMaker Endpoints that use JSON
APIs. The first step is to authenticate with AWS via the `boto3` SDK and have
Jupyter AI supports language models hosted on SageMaker endpoints that use JSON
schemas. The first step is to authenticate with AWS via the `boto3` SDK and have
the credentials stored in the `default` profile. Guidance on how to do this can
be found in the
[`boto3` documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html).

When selecting the SageMaker Endpoints provider in the settings panel, you will
When selecting the SageMaker provider in the settings panel, you will
see the following interface:

<img src="../_static/chat-sagemaker-endpoints.png"
width="50%"
alt='Screenshot of the settings panel with the SageMaker Endpoints provider selected.'
alt='Screenshot of the settings panel with the SageMaker provider selected.'
class="screenshot" />

Each of the additional fields under "Language model" is required. These fields
@@ -601,3 +601,26 @@ You can see a list of all aliases by running the `%ai list` command.
Aliases' names can contain ASCII letters (uppercase and lowercase), numbers, hyphens, underscores, and periods. They may not contain colons. They may also not override built-in commands — run `%ai help` for a list of these commands.

Aliases must refer to models or `LLMChain` objects; they cannot refer to other aliases.

### Using magic commands with SageMaker endpoints

You can use magic commands with models hosted using Amazon SageMaker.

First, make sure that you've set your `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables either before starting JupyterLab or using the `%env` magic command within JupyterLab. For more information about environment variables, see [Environment variables to configure the AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html) in AWS's documentation.

Jupyter AI supports language models hosted on SageMaker endpoints that use JSON schemas. Authenticate with AWS via the `boto3` SDK and have the credentials stored in the `default` profile. Guidance on how to do this can be found in the [`boto3` documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html).

You will need to deploy a model in SageMaker, then provide it as the model name (as `sagemaker-endpoint:my-model-name`). See the [documentation on how to deploy a JumpStart model](https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-deploy.html).

All SageMaker endpoint requests require you to specify the `--region-name`, `--request-schema`, and `--response-path` options. The example below presumes that you have deployed a model called `jumpstart-dft-hf-text2text-flan-t5-xl`.

```
%%ai sagemaker-endpoint:jumpstart-dft-hf-text2text-flan-t5-xl --region-name=us-east-1 --request-schema={"text_inputs":"<prompt>"} --response-path=generated_texts.[0] -f code
Write Python code to print "Hello world"
```

The `--region-name` option takes the [AWS region code](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html) of the region where the model is deployed, which in this case is `us-east-1`.

The `--request-schema` option is the JSON object the endpoint expects as input. The prompt is substituted into any value that matches the string literal `"<prompt>"`. For example, the request schema `{"text_inputs":"<prompt>"}` will submit a JSON object with the prompt stored under the `text_inputs` key.

The `--response-path` option is a [JSONPath](https://goessner.net/articles/JsonPath/index.html) string that retrieves the language model's output from the endpoint's JSON response. For example, if your endpoint returns an object with the schema `{"generated_texts":["<output>"]}`, its response path is `generated_texts.[0]`.
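
As a rough illustration of what `--request-schema` and `--response-path` describe, here is a minimal sketch using only the Python standard library. It is not Jupyter AI's implementation: `fill_request_schema` and `read_response_path` are hypothetical helpers, the path reader handles only the simple key and `[index]` steps used in the example above rather than full JSONPath, and `sample_response` is a hand-written stand-in for an endpoint reply.

```python
import json
import re


def fill_request_schema(schema_json: str, prompt: str):
    """Hypothetical helper: replace every value equal to "<prompt>" with the prompt."""
    def substitute(node):
        if isinstance(node, dict):
            return {key: substitute(value) for key, value in node.items()}
        if isinstance(node, list):
            return [substitute(item) for item in node]
        return prompt if node == "<prompt>" else node

    return substitute(json.loads(schema_json))


def read_response_path(response, path: str):
    """Hypothetical helper: follow a simple path such as `generated_texts.[0]`.

    Supports only dictionary keys and `[index]` steps, not full JSONPath.
    """
    node = response
    for step in path.split("."):
        index = re.fullmatch(r"\[(\d+)\]", step)
        node = node[int(index.group(1))] if index else node[step]
    return node


# Build the request body the endpoint would receive.
request_body = fill_request_schema(
    '{"text_inputs":"<prompt>"}',
    'Write Python code to print "Hello world"',
)

# Hand-written sample response, assumed to follow the schema described above.
sample_response = {"generated_texts": ['print("Hello world")']}
output = read_response_path(sample_response, "generated_texts.[0]")
```

Here `request_body` carries the prompt under the `text_inputs` key, and `output` is the first element of `generated_texts` in the sample response.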
124 changes: 124 additions & 0 deletions examples/sagemaker.ipynb
@@ -0,0 +1,124 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "79ebfce1-f5ac-4e39-83db-11416e310e8e",
"metadata": {
"tags": []
},
"source": [
"# Jupyter AI with the SageMaker endpoint\n",
"\n",
"This demo showcases the IPython magics Jupyter AI provides out-of-the-box for Amazon SageMaker.\n",
"\n",
"First, make sure that you've set your `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables either before starting JupyterLab or using the `%env` magic command within JupyterLab.\n",
"\n",
"Then, load the IPython extension:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "24f3f446-2b1d-4802-a47c-d298c06fc86e",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"%load_ext jupyter_ai"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "9f2b0270-1c33-4918-b534-4ec104f90141",
"metadata": {},
"source": [
"Jupyter AI supports language models hosted on SageMaker endpoints that use JSON APIs. Authenticate with AWS via the `boto3` SDK and have the credentials stored in the `default` profile. Guidance on how to do this can be found in the [`boto3` documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html).\n",
"\n",
"You will need to deploy a model in SageMaker, then provide it as your model name (as `sagemaker-endpoint:my-model-name`). See the [documentation on how to deploy a JumpStart model](https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-deploy.html).\n",
"\n",
"All SageMaker endpoint requests require you to specify the `--region-name`, `--request-schema`, and `--response-path` options.\n",
"\n",
"The `--region-name` parameter is set to the [AWS region code](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html), such as `us-east-1` or `eu-west-1`.\n",
"\n",
"The `--request-schema` parameter is the JSON object the endpoint expects as input, with the prompt being substituted into any value that matches the string literal `\"<prompt>\"`. For example, the request schema `{\"text_inputs\":\"<prompt>\"}` will submit a JSON object with the prompt stored under the `text_inputs` key.\n",
"\n",
"The `--response-path` option is a [JSONPath](https://goessner.net/articles/JsonPath/index.html) string that retrieves the language model's output from the endpoint's JSON response. For example, if your endpoint returns an object with the schema `{\"generated_texts\":[\"<output>\"]}`, its response path is `generated_texts.[0]`."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "31f3e6e3-48cf-4e60-96d3-8b8e1dd34bec",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"AI generated code inserted below &#11015;&#65039;"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"execution_count": 4,
"metadata": {
"text/html": {
"jupyter_ai": {
"model_id": "jumpstart-dft-hf-text2text-flan-t5-xl",
"provider_id": "sagemaker-endpoint"
}
}
},
"output_type": "execute_result"
}
],
"source": [
"%%ai sagemaker-endpoint:jumpstart-dft-hf-text2text-flan-t5-xl --region-name=us-east-1 --request-schema={\"text_inputs\":\"<prompt>\"} --response-path=generated_texts.[0] -f code\n",
"Write some Python code"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "42408ea8-7264-44bc-ac0c-6b5dd03134d6",
"metadata": {},
"outputs": [],
"source": [
"a = [] b = [] c = [] d = ["
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4ad8c62e-b0a5-4091-94e3-4067ed8d6c4a",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.8"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
17 changes: 16 additions & 1 deletion packages/jupyter-ai-magics/jupyter_ai_magics/magics.py
@@ -478,11 +478,26 @@ def run_ai_cell(self, args: CellArgs, prompt: str):
f"An authentication token is required to use models from the {Provider.name} provider.\n"
f"Please specify it via `%env {auth_strategy.name}=token`. "
) from None

# configure and instantiate provider
provider_params = { "model_id": local_model_id }
if provider_id == "openai-chat":
    provider_params["prefix_messages"] = self.transcript_openai
# for SageMaker, validate that required params are specified
if provider_id == "sagemaker-endpoint":
    if (
        args.region_name is None or
        args.request_schema is None or
        args.response_path is None
    ):
        raise ValueError(
            "When using the sagemaker-endpoint provider, you must specify all of " +
            "the --region-name, --request-schema, and --response-path options."
        )
    provider_params["region_name"] = args.region_name
    provider_params["request_schema"] = args.request_schema
    provider_params["response_path"] = args.response_path

provider = Provider(**provider_params)

# generate output from model via provider
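
The all-or-nothing check in this hunk can be tried in isolation with a sketch like the one below. It is an illustration under stated assumptions, not the code from this PR: `build_provider_params` and `SAGEMAKER_OPTIONS` are hypothetical names, and unlike the change above, the error message here lists only the options that are missing.

```python
from typing import Optional

# Hypothetical constant: the three options the sagemaker-endpoint provider needs.
SAGEMAKER_OPTIONS = ("region_name", "request_schema", "response_path")


def build_provider_params(provider_id: str, model_id: str, **options: Optional[str]) -> dict:
    """Hypothetical helper: collect provider parameters, validating SageMaker options."""
    params = {"model_id": model_id}
    if provider_id == "sagemaker-endpoint":
        # Treat an option that is absent or None as missing.
        missing = [name for name in SAGEMAKER_OPTIONS if options.get(name) is None]
        if missing:
            flags = ", ".join("--" + name.replace("_", "-") for name in missing)
            raise ValueError(f"The sagemaker-endpoint provider also requires: {flags}")
        params.update({name: options[name] for name in SAGEMAKER_OPTIONS})
    return params


params = build_provider_params(
    "sagemaker-endpoint",
    "my-model-name",
    region_name="us-east-1",
    request_schema='{"text_inputs":"<prompt>"}',
    response_path="generated_texts.[0]",
)
```

For other providers the three options are ignored, which matches the "does nothing with other providers" wording in the option help text.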
31 changes: 31 additions & 0 deletions packages/jupyter-ai-magics/jupyter_ai_magics/parsers.py
@@ -6,17 +6,42 @@
FORMAT_CHOICES = list(get_args(FORMAT_CHOICES_TYPE))
FORMAT_HELP = """IPython display to use when rendering output. [default="markdown"]"""

REGION_NAME_SHORT_OPTION = '-n'
REGION_NAME_LONG_OPTION = '--region-name'
REGION_NAME_HELP = ("AWS region name, e.g. 'us-east-1'. Required for SageMaker provider; " +
"does nothing with other providers.")

REQUEST_SCHEMA_SHORT_OPTION = '-q'
REQUEST_SCHEMA_LONG_OPTION = '--request-schema'
REQUEST_SCHEMA_HELP = ("The JSON object the endpoint expects, with the prompt being " +
"substituted into any value that matches the string literal '<prompt>'. " +
"Required for SageMaker provider; does nothing with other providers.")

RESPONSE_PATH_SHORT_OPTION = '-p'
RESPONSE_PATH_LONG_OPTION = '--response-path'
RESPONSE_PATH_HELP = ("A JSONPath string that retrieves the language model's output " +
"from the endpoint's JSON response. Required for SageMaker provider; " +
"does nothing with other providers.")

class CellArgs(BaseModel):
    type: Literal["root"] = "root"
    model_id: str
    format: FORMAT_CHOICES_TYPE
    reset: bool
    # The following parameters are required only for SageMaker models
    region_name: Optional[str]
    request_schema: Optional[str]
    response_path: Optional[str]

# Should match CellArgs, but without "reset"
class ErrorArgs(BaseModel):
    type: Literal["error"] = "error"
    model_id: str
    format: FORMAT_CHOICES_TYPE
    # The following parameters are required only for SageMaker models
    region_name: Optional[str]
    request_schema: Optional[str]
    response_path: Optional[str]

class HelpArgs(BaseModel):
type: Literal["help"] = "help"
@@ -60,6 +85,9 @@ def get_help(self, ctx):
help="""Clears the conversation transcript used when interacting with an
OpenAI chat model provider. Does nothing with other providers."""
)
@click.option(REGION_NAME_SHORT_OPTION, REGION_NAME_LONG_OPTION, required=False, help=REGION_NAME_HELP)
@click.option(REQUEST_SCHEMA_SHORT_OPTION, REQUEST_SCHEMA_LONG_OPTION, required=False, help=REQUEST_SCHEMA_HELP)
@click.option(RESPONSE_PATH_SHORT_OPTION, RESPONSE_PATH_LONG_OPTION, required=False, help=RESPONSE_PATH_HELP)
def cell_magic_parser(**kwargs):
"""
Invokes a language model identified by MODEL_ID, with the prompt being
@@ -84,6 +112,9 @@ def line_magic_parser():
default="markdown",
help=FORMAT_HELP
)
@click.option(REGION_NAME_SHORT_OPTION, REGION_NAME_LONG_OPTION, required=False, help=REGION_NAME_HELP)
@click.option(REQUEST_SCHEMA_SHORT_OPTION, REQUEST_SCHEMA_LONG_OPTION, required=False, help=REQUEST_SCHEMA_HELP)
@click.option(RESPONSE_PATH_SHORT_OPTION, RESPONSE_PATH_LONG_OPTION, required=False, help=RESPONSE_PATH_HELP)
def error_subparser(**kwargs):
"""
Explains the most recent error. Takes the same options (except -r) as