Commit

SageMaker endpoint magic command support (jupyterlab#215)
* WIP: New parameters for SageMaker Endpoint

* Uses symbolic constants consistently

* Updates sample notebook

* Updates docs, sample notebook

* Retitles section to be about magic commands

* Removes AWS_SESSION_TOKEN

* Links for more info

* Update docs/source/users/index.md

Co-authored-by: Piyush Jain <piyushjain@duck.com>

* Additional copy edits per @3coins

* Update docs/source/users/index.md

---------

Co-authored-by: Piyush Jain <piyushjain@duck.com>
JasonWeill and 3coins committed Jun 15, 2023
1 parent 2cd2b2e commit e3e64e6
Showing 4 changed files with 200 additions and 7 deletions.
35 changes: 29 additions & 6 deletions docs/source/users/index.md
@@ -48,7 +48,7 @@ Jupyter AI supports the following model providers:
| HuggingFace Hub | `huggingface_hub` | `HUGGINGFACEHUB_API_TOKEN` | `huggingface_hub`, `ipywidgets`, `pillow` |
| OpenAI | `openai` | `OPENAI_API_KEY` | `openai` |
| OpenAI (chat) | `openai-chat` | `OPENAI_API_KEY` | `openai` |
| SageMaker Endpoints | `sagemaker-endpoint` | N/A | `boto3` |
| SageMaker | `sagemaker-endpoint` | N/A | `boto3` |

The environment variable names shown above are also the names of the settings keys used when setting up the chat interface.

@@ -177,20 +177,20 @@ To compose a message, type it in the text box at the bottom of the chat interfac
alt='Screen shot of an example "Hello world" message sent to Jupyternaut, who responds with "Hello world, how are you today?"'
class="screenshot" />

### Usage with SageMaker Endpoints
### Using the chat interface with SageMaker endpoints

Jupyter AI supports language models hosted on SageMaker Endpoints that use JSON
APIs. The first step is to authenticate with AWS via the `boto3` SDK and have
Jupyter AI supports language models hosted on SageMaker endpoints that use JSON
schemas. The first step is to authenticate with AWS via the `boto3` SDK and have
the credentials stored in the `default` profile. Guidance on how to do this can
be found in the
[`boto3` documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html).

When selecting the SageMaker Endpoints provider in the settings panel, you will
When selecting the SageMaker provider in the settings panel, you will
see the following interface:

<img src="../_static/chat-sagemaker-endpoints.png"
width="50%"
alt='Screenshot of the settings panel with the SageMaker Endpoints provider selected.'
alt='Screenshot of the settings panel with the SageMaker provider selected.'
class="screenshot" />

Each of the additional fields under "Language model" is required. These fields
@@ -601,3 +601,26 @@ You can see a list of all aliases by running the `%ai list` command.
Aliases' names can contain ASCII letters (uppercase and lowercase), numbers, hyphens, underscores, and periods. They may not contain colons. They may also not override built-in commands — run `%ai help` for a list of these commands.

Aliases must refer to models or `LLMChain` objects; they cannot refer to other aliases.

### Using magic commands with SageMaker endpoints

You can use magic commands with models hosted using Amazon SageMaker.

First, make sure that you've set your `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables either before starting JupyterLab or using the `%env` magic command within JupyterLab. For more information about environment variables, see [Environment variables to configure the AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html) in AWS's documentation.
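As a quick sanity check before running a SageMaker magic command, something like the following sketch can report which of the two variables are still unset. The helper name `missing_aws_vars` is hypothetical and not part of Jupyter AI. An empty result only means the variables are present, not that the credentials are valid, and `boto3` may also find credentials elsewhere (for example in the `default` profile).

```python
import os

# The two variable names mentioned in the paragraph above.
REQUIRED_AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_vars(env=None):
    """Return the required AWS variable names that are unset or empty.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_AWS_VARS if not env.get(name)]


missing = missing_aws_vars()
if missing:
    print("Not set in this environment:", ", ".join(missing))
else:
    print("Both AWS credential variables are set.")
```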

Jupyter AI supports language models hosted on SageMaker endpoints that use JSON schemas. Authenticate with AWS via the `boto3` SDK and have the credentials stored in the `default` profile. Guidance on how to do this can be found in the [`boto3` documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html).

You will need to deploy a model in SageMaker, then provide it as the model name (as `sagemaker-endpoint:my-model-name`). See the [documentation on how to deploy a JumpStart model](https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-deploy.html).

All SageMaker endpoint requests require you to specify the `--region-name`, `--request-schema`, and `--response-path` options. The example below presumes that you have deployed a model called `jumpstart-dft-hf-text2text-flan-t5-xl`.

```
%%ai sagemaker-endpoint:jumpstart-dft-hf-text2text-flan-t5-xl --region-name=us-east-1 --request-schema={"text_inputs":"<prompt>"} --response-path=generated_texts.[0] -f code
Write Python code to print "Hello world"
```

The `--region-name` parameter is set to the [AWS region code](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html) where the model is deployed, which in this case is `us-east-1`.

The `--request-schema` parameter is the JSON object the endpoint expects as input, with the prompt being substituted into any value that matches the string literal `"<prompt>"`. For example, the request schema `{"text_inputs":"<prompt>"}` will submit a JSON object with the prompt stored under the `text_inputs` key.
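To make the substitution concrete, here is a minimal sketch of how a request schema could be turned into a request body. It is not the code Jupyter AI uses; the function name `fill_request_schema` is hypothetical, and treating "any value that matches" as a recursive, whole-value match is an assumption.

```python
import json

PROMPT_PLACEHOLDER = "<prompt>"


def fill_request_schema(schema_text, prompt):
    """Parse the schema as JSON and replace every value equal to "<prompt>".

    Illustrative only: walks dicts and lists recursively and swaps in the
    prompt wherever a value is exactly the placeholder string.
    """
    def substitute(node):
        if isinstance(node, dict):
            return {key: substitute(value) for key, value in node.items()}
        if isinstance(node, list):
            return [substitute(item) for item in node]
        if node == PROMPT_PLACEHOLDER:
            return prompt
        return node

    return substitute(json.loads(schema_text))


body = fill_request_schema(
    '{"text_inputs":"<prompt>"}',
    'Write Python code to print "Hello world"',
)
print(json.dumps(body))
```

Because the schema is parsed as JSON first, quotes inside the prompt are escaped correctly when the body is serialized again.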

The `--response-path` option is a [JSONPath](https://goessner.net/articles/JsonPath/index.html) string that retrieves the language model's output from the endpoint's JSON response. For example, if your endpoint returns an object with the schema `{"generated_texts":["<output>"]}`, its response path is `generated_texts.[0]`.
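The sketch below illustrates what a response path such as `generated_texts.[0]` selects. It is a deliberately simplified resolver that handles only dotted keys and `[index]` segments, not full JSONPath, and it is not the implementation Jupyter AI uses. The function name `extract_by_path` is hypothetical.

```python
import json
import re


def extract_by_path(response_text, path):
    """Walk a dotted path such as "generated_texts.[0]" through a JSON document.

    Simplified stand-in for a JSONPath library: each segment is either a
    dictionary key or a list index written as "[n]".
    """
    node = json.loads(response_text)
    for segment in path.split("."):
        index = re.fullmatch(r"\[(\d+)\]", segment)
        if index:
            node = node[int(index.group(1))]
        else:
            node = node[segment]
    return node


# A made-up endpoint response with the schema described above.
raw = '{"generated_texts":["print(\\"Hello world\\")"]}'
print(extract_by_path(raw, "generated_texts.[0]"))
```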
124 changes: 124 additions & 0 deletions examples/sagemaker.ipynb
@@ -0,0 +1,124 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "79ebfce1-f5ac-4e39-83db-11416e310e8e",
"metadata": {
"tags": []
},
"source": [
"# Jupyter AI with the SageMaker endpoint\n",
"\n",
"This demo showcases the IPython magics Jupyter AI provides out-of-the-box for Amazon SageMaker.\n",
"\n",
"First, make sure that you've set your `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables either before starting JupyterLab or using the `%env` magic command within JupyterLab.\n",
"\n",
"Then, load the IPython extension:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "24f3f446-2b1d-4802-a47c-d298c06fc86e",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"%load_ext jupyter_ai"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "9f2b0270-1c33-4918-b534-4ec104f90141",
"metadata": {},
"source": [
"Jupyter AI supports language models hosted on SageMaker endpoints that use JSON APIs. Authenticate with AWS via the `boto3` SDK and have the credentials stored in the `default` profile. Guidance on how to do this can be found in the [`boto3` documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html).\n",
"\n",
"You will need to deploy a model in SageMaker, then provide it as your model name (as `sagemaker-endpoint:my-model-name`). See the [documentation on how to deploy a JumpStart model](https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-deploy.html).\n",
"\n",
"All SageMaker endpoint requests require you to specify the `--region-name`, `--request-schema`, and `--response-path` options.\n",
"\n",
"The `--region-name` parameter is set to the [AWS region code](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html), such as `us-east-1` or `eu-west-1`.\n",
"\n",
"The `--request-schema` parameter is the JSON object the endpoint expects as input, with the prompt being substituted into any value that matches the string literal `\"<prompt>\"`. For example, the request schema `{\"text_inputs\":\"<prompt>\"}` will submit a JSON object with the prompt stored under the `text_inputs` key.\n",
"\n",
"The `--response-path` option is a [JSONPath](https://goessner.net/articles/JsonPath/index.html) string that retrieves the language model's output from the endpoint's JSON response. For example, if your endpoint returns an object with the schema `{\"generated_texts\":[\"<output>\"]}`, its response path is `generated_texts.[0]`."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "31f3e6e3-48cf-4e60-96d3-8b8e1dd34bec",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"AI generated code inserted below &#11015;&#65039;"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"execution_count": 4,
"metadata": {
"text/html": {
"jupyter_ai": {
"model_id": "jumpstart-dft-hf-text2text-flan-t5-xl",
"provider_id": "sagemaker-endpoint"
}
}
},
"output_type": "execute_result"
}
],
"source": [
"%%ai sagemaker-endpoint:jumpstart-dft-hf-text2text-flan-t5-xl --region-name=us-east-1 --request-schema={\"text_inputs\":\"<prompt>\"} --response-path=generated_texts.[0] -f code\n",
"Write some Python code"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "42408ea8-7264-44bc-ac0c-6b5dd03134d6",
"metadata": {},
"outputs": [],
"source": [
"a = [] b = [] c = [] d = ["
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4ad8c62e-b0a5-4091-94e3-4067ed8d6c4a",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.8"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
17 changes: 16 additions & 1 deletion packages/jupyter-ai-magics/jupyter_ai_magics/magics.py
@@ -478,11 +478,26 @@ def run_ai_cell(self, args: CellArgs, prompt: str):
                    f"An authentication token is required to use models from the {Provider.name} provider.\n"
                    f"Please specify it via `%env {auth_strategy.name}=token`. "
                ) from None

        # configure and instantiate provider
        provider_params = { "model_id": local_model_id }
        if provider_id == "openai-chat":
            provider_params["prefix_messages"] = self.transcript_openai
        # for SageMaker, validate that required params are specified
        if provider_id == "sagemaker-endpoint":
            if (
                args.region_name is None or
                args.request_schema is None or
                args.response_path is None
            ):
                raise ValueError(
                    "When using the sagemaker-endpoint provider, you must specify all of " +
                    "the --region-name, --request-schema, and --response-path options."
                )
            provider_params["region_name"] = args.region_name
            provider_params["request_schema"] = args.request_schema
            provider_params["response_path"] = args.response_path

        provider = Provider(**provider_params)

        # generate output from model via provider
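The validation in this hunk can be exercised in isolation. The following is a sketch only: it pulls the same check into a standalone function, `build_provider_params`, which is a hypothetical name that does not exist in `magics.py`, and it omits the `openai-chat` branch.

```python
SAGEMAKER_PROVIDER_ID = "sagemaker-endpoint"


def build_provider_params(provider_id, model_id, region_name=None,
                          request_schema=None, response_path=None):
    """Mirror the hunk's logic: SageMaker needs all three extra options."""
    params = {"model_id": model_id}
    if provider_id == SAGEMAKER_PROVIDER_ID:
        if region_name is None or request_schema is None or response_path is None:
            raise ValueError(
                "When using the sagemaker-endpoint provider, you must specify all of "
                "the --region-name, --request-schema, and --response-path options."
            )
        params["region_name"] = region_name
        params["request_schema"] = request_schema
        params["response_path"] = response_path
    return params


# Other providers ignore the SageMaker-only options entirely.
print(build_provider_params("openai", "text-davinci-003"))

# Omitting any one of the three options for SageMaker raises ValueError.
try:
    build_provider_params("sagemaker-endpoint", "my-model-name", region_name="us-east-1")
except ValueError as error:
    print("Rejected:", error)
```

Checking for `None` rather than falsiness matches the hunk: the parser leaves an omitted option as `None`, so that is what distinguishes "not given" from an explicitly passed value.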
31 changes: 31 additions & 0 deletions packages/jupyter-ai-magics/jupyter_ai_magics/parsers.py
@@ -6,17 +6,42 @@
FORMAT_CHOICES = list(get_args(FORMAT_CHOICES_TYPE))
FORMAT_HELP = """IPython display to use when rendering output. [default="markdown"]"""

REGION_NAME_SHORT_OPTION = '-n'
REGION_NAME_LONG_OPTION = '--region-name'
REGION_NAME_HELP = ("AWS region name, e.g. 'us-east-1'. Required for SageMaker provider; " +
    "does nothing with other providers.")

REQUEST_SCHEMA_SHORT_OPTION = '-q'
REQUEST_SCHEMA_LONG_OPTION = '--request-schema'
REQUEST_SCHEMA_HELP = ("The JSON object the endpoint expects, with the prompt being " +
    "substituted into any value that matches the string literal '<prompt>'. " +
    "Required for SageMaker provider; does nothing with other providers.")

RESPONSE_PATH_SHORT_OPTION = '-p'
RESPONSE_PATH_LONG_OPTION = '--response-path'
RESPONSE_PATH_HELP = ("A JSONPath string that retrieves the language model's output " +
    "from the endpoint's JSON response. Required for SageMaker provider; " +
    "does nothing with other providers.")

class CellArgs(BaseModel):
    type: Literal["root"] = "root"
    model_id: str
    format: FORMAT_CHOICES_TYPE
    reset: bool
    # The following parameters are required only for SageMaker models
    region_name: Optional[str]
    request_schema: Optional[str]
    response_path: Optional[str]

# Should match CellArgs, but without "reset"
class ErrorArgs(BaseModel):
    type: Literal["error"] = "error"
    model_id: str
    format: FORMAT_CHOICES_TYPE
    # The following parameters are required only for SageMaker models
    region_name: Optional[str]
    request_schema: Optional[str]
    response_path: Optional[str]

class HelpArgs(BaseModel):
    type: Literal["help"] = "help"
@@ -60,6 +85,9 @@ def get_help(self, ctx):
    help="""Clears the conversation transcript used when interacting with an
    OpenAI chat model provider. Does nothing with other providers."""
)
@click.option(REGION_NAME_SHORT_OPTION, REGION_NAME_LONG_OPTION, required=False, help=REGION_NAME_HELP)
@click.option(REQUEST_SCHEMA_SHORT_OPTION, REQUEST_SCHEMA_LONG_OPTION, required=False, help=REQUEST_SCHEMA_HELP)
@click.option(RESPONSE_PATH_SHORT_OPTION, RESPONSE_PATH_LONG_OPTION, required=False, help=RESPONSE_PATH_HELP)
def cell_magic_parser(**kwargs):
    """
    Invokes a language model identified by MODEL_ID, with the prompt being
@@ -84,6 +112,9 @@ def line_magic_parser():
    default="markdown",
    help=FORMAT_HELP
)
@click.option(REGION_NAME_SHORT_OPTION, REGION_NAME_LONG_OPTION, required=False, help=REGION_NAME_HELP)
@click.option(REQUEST_SCHEMA_SHORT_OPTION, REQUEST_SCHEMA_LONG_OPTION, required=False, help=REQUEST_SCHEMA_HELP)
@click.option(RESPONSE_PATH_SHORT_OPTION, RESPONSE_PATH_LONG_OPTION, required=False, help=RESPONSE_PATH_HELP)
def error_subparser(**kwargs):
    """
    Explains the most recent error. Takes the same options (except -r) as
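To illustrate how the three optional flags behave, here is a small stand-in that uses the standard library's `argparse` instead of `click`, so that it runs without extra dependencies. It is a sketch, not the parser in `parsers.py`: the `make_parser` function and the `prog` name are hypothetical, and only the option names and their short forms come from the diff above.

```python
import argparse


def make_parser():
    """Build an argparse stand-in for the three SageMaker options."""
    parser = argparse.ArgumentParser(prog="ai")
    parser.add_argument("model_id")
    # Each option is optional at parse time and defaults to None when omitted.
    parser.add_argument("-n", "--region-name")
    parser.add_argument("-q", "--request-schema")
    parser.add_argument("-p", "--response-path")
    return parser


args = make_parser().parse_args([
    "sagemaker-endpoint:my-model-name",
    "-n", "us-east-1",
    "--request-schema", '{"text_inputs":"<prompt>"}',
    "-p", "generated_texts.[0]",
])
print(args.region_name, args.request_schema, args.response_path)

# Without the flags, all three values are None, which is what the
# provider-specific validation later checks for.
bare = make_parser().parse_args(["openai:text-davinci-003"])
print(bare.region_name, bare.request_schema, bare.response_path)
```

Keeping the options optional in the parser and enforcing them only for the `sagemaker-endpoint` provider lets other providers continue to work without passing them.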
