diff --git a/docs/source/users/index.md b/docs/source/users/index.md index 1de889461..40c8a4c62 100644 --- a/docs/source/users/index.md +++ b/docs/source/users/index.md @@ -48,7 +48,7 @@ Jupyter AI supports the following model providers: | HuggingFace Hub | `huggingface_hub` | `HUGGINGFACEHUB_API_TOKEN` | `huggingface_hub`, `ipywidgets`, `pillow` | | OpenAI | `openai` | `OPENAI_API_KEY` | `openai` | | OpenAI (chat) | `openai-chat` | `OPENAI_API_KEY` | `openai` | -| SageMaker Endpoints | `sagemaker-endpoint` | N/A | `boto3` | +| SageMaker | `sagemaker-endpoint` | N/A | `boto3` | The environment variable names shown above are also the names of the settings keys used when setting up the chat interface. @@ -177,20 +177,20 @@ To compose a message, type it in the text box at the bottom of the chat interfac alt='Screen shot of an example "Hello world" message sent to Jupyternaut, who responds with "Hello world, how are you today?"' class="screenshot" /> -### Usage with SageMaker Endpoints +### Using the chat interface with SageMaker endpoints -Jupyter AI supports language models hosted on SageMaker Endpoints that use JSON -APIs. The first step is to authenticate with AWS via the `boto3` SDK and have +Jupyter AI supports language models hosted on SageMaker endpoints that use JSON +schemas. The first step is to authenticate with AWS via the `boto3` SDK and have the credentials stored in the `default` profile. Guidance on how to do this can be found in the [`boto3` documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html). -When selecting the SageMaker Endpoints provider in the settings panel, you will +When selecting the SageMaker provider in the settings panel, you will see the following interface: Screenshot of the settings panel with the SageMaker Endpoints provider selected. Each of the additional fields under "Language model" is required. 
These fields @@ -601,3 +601,26 @@ You can see a list of all aliases by running the `%ai list` command. Aliases' names can contain ASCII letters (uppercase and lowercase), numbers, hyphens, underscores, and periods. They may not contain colons. They may also not override built-in commands — run `%ai help` for a list of these commands. Aliases must refer to models or `LLMChain` objects; they cannot refer to other aliases. + +### Using magic commands with SageMaker endpoints + +You can use magic commands with models hosted using Amazon SageMaker. + +First, make sure that you've set your `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables either before starting JupyterLab or using the `%env` magic command within JupyterLab. For more information about environment variables, see [Environment variables to configure the AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html) in AWS's documentation. + +Jupyter AI supports language models hosted on SageMaker endpoints that use JSON schemas. Authenticate with AWS via the `boto3` SDK and have the credentials stored in the `default` profile. Guidance on how to do this can be found in the [`boto3` documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html). + +You will need to deploy a model in SageMaker, then provide it as the model name (as `sagemaker-endpoint:my-model-name`). See the [documentation on how to deploy a JumpStart model](https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-deploy.html). + +All SageMaker endpoint requests require you to specify the `--region-name`, `--request-schema`, and `--response-path` options. The example below presumes that you have deployed a model called `jumpstart-dft-hf-text2text-flan-t5-xl`. 
+ +``` +%%ai sagemaker-endpoint:jumpstart-dft-hf-text2text-flan-t5-xl --region-name=us-east-1 --request-schema={"text_inputs":"<prompt>"} --response-path=generated_texts.[0] -f code +Write Python code to print "Hello world" +``` + +The `--region-name` parameter is set to the [AWS region code](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html) where the model is deployed, which in this case is `us-east-1`. + +The `--request-schema` parameter is the JSON object the endpoint expects as input, with the prompt being substituted into any value that matches the string literal `"<prompt>"`. For example, the request schema `{"text_inputs":"<prompt>"}` will submit a JSON object with the prompt stored under the `text_inputs` key. + +The `--response-path` option is a [JSONPath](https://goessner.net/articles/JsonPath/index.html) string that retrieves the language model's output from the endpoint's JSON response. For example, if your endpoint returns an object with the schema `{"generated_texts":["<output>"]}`, its response path is `generated_texts.[0]`. 
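The `--request-schema` and `--response-path` options described in the documentation hunk above can be illustrated with a small standalone sketch. This is not Jupyter AI's implementation: the function names `fill_request_schema` and `extract_response` are hypothetical, the placeholder literal `<prompt>` is passed in as an assumption, and the path walker handles only dotted keys and `[n]` indices rather than full JSONPath.

```python
import json
from typing import Any


def fill_request_schema(schema_json: str, prompt: str, placeholder: str = "<prompt>") -> Any:
    """Parse a request schema and substitute the prompt for every value
    that equals the placeholder literal (assumed here to be "<prompt>")."""

    def _fill(node: Any) -> Any:
        if isinstance(node, dict):
            return {key: _fill(value) for key, value in node.items()}
        if isinstance(node, list):
            return [_fill(value) for value in node]
        if isinstance(node, str) and node == placeholder:
            return prompt
        return node

    return _fill(json.loads(schema_json))


def extract_response(response: Any, path: str) -> Any:
    """Walk a simplified path such as 'generated_texts.[0]'.

    Only plain keys and bracketed integer indices are supported; a real
    JSONPath library accepts a much richer syntax.
    """
    current = response
    for segment in path.split("."):
        if segment.startswith("[") and segment.endswith("]"):
            current = current[int(segment[1:-1])]  # list index, e.g. "[0]"
        else:
            current = current[segment]  # object key, e.g. "generated_texts"
    return current


# Build the request body that would be sent to the endpoint.
payload = fill_request_schema('{"text_inputs":"<prompt>"}', 'Write Python code to print "Hello world"')
print(json.dumps(payload))

# Pull the model output out of a made-up endpoint response.
fake_response = {"generated_texts": ["print('Hello world')"]}
print(extract_response(fake_response, "generated_texts.[0]"))
```

Walking the schema recursively means the placeholder can sit at any depth of the request object, which matches the documented behavior of substituting into "any value" that equals the literal.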
diff --git a/examples/sagemaker.ipynb b/examples/sagemaker.ipynb new file mode 100644 index 000000000..14c41a728 --- /dev/null +++ b/examples/sagemaker.ipynb @@ -0,0 +1,124 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "79ebfce1-f5ac-4e39-83db-11416e310e8e", + "metadata": { + "tags": [] + }, + "source": [ + "# Jupyter AI with the SageMaker endpoint\n", + "\n", + "This demo showcases the IPython magics Jupyter AI provides out-of-the-box for Amazon SageMaker.\n", + "\n", + "First, make sure that you've set your `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables either before starting JupyterLab or using the `%env` magic command within JupyterLab.\n", + "\n", + "Then, load the IPython extension:" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "24f3f446-2b1d-4802-a47c-d298c06fc86e", + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "%load_ext jupyter_ai" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "9f2b0270-1c33-4918-b534-4ec104f90141", + "metadata": {}, + "source": [ + "Jupyter AI supports language models hosted on SageMaker endpoints that use JSON APIs. Authenticate with AWS via the `boto3` SDK and have the credentials stored in the `default` profile. Guidance on how to do this can be found in the [`boto3` documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html).\n", + "\n", + "You will need to deploy a model in SageMaker, then provide it as your model name (as `sagemaker-endpoint:my-model-name`). 
See the [documentation on how to deploy a JumpStart model](https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-deploy.html).\n", + "\n", + "All SageMaker endpoint requests require you to specify the `--region-name`, `--request-schema`, and `--response-path` options.\n", + "\n", + "The `--region-name` parameter is set to the [AWS region code](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html), such as `us-east-1` or `eu-west-1`.\n", + "\n", + "The `--request-schema` parameter is the JSON object the endpoint expects as input, with the prompt being substituted into any value that matches the string literal `\"<prompt>\"`. For example, the request schema `{\"text_inputs\":\"<prompt>\"}` will submit a JSON object with the prompt stored under the `text_inputs` key.\n", + "\n", + "The `--response-path` option is a [JSONPath](https://goessner.net/articles/JsonPath/index.html) string that retrieves the language model's output from the endpoint's JSON response. For example, if your endpoint returns an object with the schema `{\"generated_texts\":[\"<output>\"]}`, its response path is `generated_texts.[0]`." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "31f3e6e3-48cf-4e60-96d3-8b8e1dd34bec", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "data": { + "text/html": [ + "AI generated code inserted below ⬇️" + ], + "text/plain": [ + "" + ] + }, + "execution_count": 4, + "metadata": { + "text/html": { + "jupyter_ai": { + "model_id": "jumpstart-dft-hf-text2text-flan-t5-xl", + "provider_id": "sagemaker-endpoint" + } + } + }, + "output_type": "execute_result" + } + ], + "source": [ + "%%ai sagemaker-endpoint:jumpstart-dft-hf-text2text-flan-t5-xl --region-name=us-east-1 --request-schema={\"text_inputs\":\"<prompt>\"} --response-path=generated_texts.[0] -f code\n", + "Write some Python code" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "42408ea8-7264-44bc-ac0c-6b5dd03134d6", + "metadata": {}, + "outputs": [], + "source": [ + "a = [] b = [] c = [] d = [" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4ad8c62e-b0a5-4091-94e3-4067ed8d6c4a", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.8" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/packages/jupyter-ai-magics/jupyter_ai_magics/magics.py b/packages/jupyter-ai-magics/jupyter_ai_magics/magics.py index 70a16a07a..8f6118d15 100644 --- a/packages/jupyter-ai-magics/jupyter_ai_magics/magics.py +++ b/packages/jupyter-ai-magics/jupyter_ai_magics/magics.py @@ -478,11 +478,26 @@ def run_ai_cell(self, args: CellArgs, prompt: str): f"An authentication token is required to use models from the {Provider.name} provider.\n" f"Please specify it via `%env 
{auth_strategy.name}=token`. " ) from None - + # configure and instantiate provider provider_params = { "model_id": local_model_id } if provider_id == "openai-chat": provider_params["prefix_messages"] = self.transcript_openai + # for SageMaker, validate that required params are specified + if provider_id == "sagemaker-endpoint": + if ( + args.region_name is None or + args.request_schema is None or + args.response_path is None + ): + raise ValueError( + "When using the sagemaker-endpoint provider, you must specify all of " + + "the --region-name, --request-schema, and --response-path options." + ) + provider_params["region_name"] = args.region_name + provider_params["request_schema"] = args.request_schema + provider_params["response_path"] = args.response_path + provider = Provider(**provider_params) # generate output from model via provider diff --git a/packages/jupyter-ai-magics/jupyter_ai_magics/parsers.py b/packages/jupyter-ai-magics/jupyter_ai_magics/parsers.py index db63cd4da..d2b0476b5 100644 --- a/packages/jupyter-ai-magics/jupyter_ai_magics/parsers.py +++ b/packages/jupyter-ai-magics/jupyter_ai_magics/parsers.py @@ -6,17 +6,42 @@ FORMAT_CHOICES = list(get_args(FORMAT_CHOICES_TYPE)) FORMAT_HELP = """IPython display to use when rendering output. [default="markdown"]""" +REGION_NAME_SHORT_OPTION = '-n' +REGION_NAME_LONG_OPTION = '--region-name' +REGION_NAME_HELP = ("AWS region name, e.g. 'us-east-1'. Required for SageMaker provider; " + +"does nothing with other providers.") + +REQUEST_SCHEMA_SHORT_OPTION = '-q' +REQUEST_SCHEMA_LONG_OPTION = '--request-schema' +REQUEST_SCHEMA_HELP = ("The JSON object the endpoint expects, with the prompt being " + +"substituted into any value that matches the string literal '<prompt>'. 
" + + "Required for SageMaker provider; does nothing with other providers.") + +RESPONSE_PATH_SHORT_OPTION = '-p' +RESPONSE_PATH_LONG_OPTION = '--response-path' +RESPONSE_PATH_HELP = ("A JSONPath string that retrieves the language model's output " + + "from the endpoint's JSON response. Required for SageMaker provider; " + + "does nothing with other providers.") + class CellArgs(BaseModel): type: Literal["root"] = "root" model_id: str format: FORMAT_CHOICES_TYPE reset: bool + # The following parameters are required only for SageMaker models + region_name: Optional[str] + request_schema: Optional[str] + response_path: Optional[str] # Should match CellArgs, but without "reset" class ErrorArgs(BaseModel): type: Literal["error"] = "error" model_id: str format: FORMAT_CHOICES_TYPE + # The following parameters are required only for SageMaker models + region_name: Optional[str] + request_schema: Optional[str] + response_path: Optional[str] class HelpArgs(BaseModel): type: Literal["help"] = "help" @@ -60,6 +85,9 @@ def get_help(self, ctx): help="""Clears the conversation transcript used when interacting with an OpenAI chat model provider. 
Does nothing with other providers.""" ) +@click.option(REGION_NAME_SHORT_OPTION, REGION_NAME_LONG_OPTION, required=False, help=REGION_NAME_HELP) +@click.option(REQUEST_SCHEMA_SHORT_OPTION, REQUEST_SCHEMA_LONG_OPTION, required=False, help=REQUEST_SCHEMA_HELP) +@click.option(RESPONSE_PATH_SHORT_OPTION, RESPONSE_PATH_LONG_OPTION, required=False, help=RESPONSE_PATH_HELP) def cell_magic_parser(**kwargs): """ Invokes a language model identified by MODEL_ID, with the prompt being @@ -84,6 +112,9 @@ def line_magic_parser(): default="markdown", help=FORMAT_HELP ) +@click.option(REGION_NAME_SHORT_OPTION, REGION_NAME_LONG_OPTION, required=False, help=REGION_NAME_HELP) +@click.option(REQUEST_SCHEMA_SHORT_OPTION, REQUEST_SCHEMA_LONG_OPTION, required=False, help=REQUEST_SCHEMA_HELP) +@click.option(RESPONSE_PATH_SHORT_OPTION, RESPONSE_PATH_LONG_OPTION, required=False, help=RESPONSE_PATH_HELP) def error_subparser(**kwargs): """ Explains the most recent error. Takes the same options (except -r) as