Feat: IDEFICS models support + Mistral 7B Instruct (#135)
* add [IDEFICS models](https://huggingface.co/blog/idefics), IDEFICS model interface and file upload capability in User Interface
* update README with multimodal sample, instructions and warnings.
* add [Mistral 7B Instruct](https://huggingface.co/mistralai/Mistral-7B-v0.1) support
* remove LLama2 Base model
bigadsoleiman authored Oct 24, 2023
1 parent 8deaf4c commit 128f861
Showing 76 changed files with 19,186 additions and 1,796 deletions.
47 changes: 41 additions & 6 deletions README.md
@@ -1,4 +1,4 @@
# Deploying a Multi-LLM and Multi-RAG Powered Chatbot Using AWS CDK on AWS
# Deploying a Multi-Model and Multi-RAG Powered Chatbot Using AWS CDK on AWS
[![Release Notes](https://img.shields.io/github/v/release/aws-samples/aws-genai-llm-chatbot)](https://github.com/aws-samples/aws-genai-llm-chatbot/releases)
[![GitHub star chart](https://img.shields.io/github/stars/aws-samples/aws-genai-llm-chatbot?style=social)](https://star-history.com/#aws-samples/aws-genai-llm-chatbot)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
@@ -27,14 +27,23 @@

# Features
## Modular, comprehensive and ready to use
This solution provides ready-to-use code so you can start **experimenting with a variety of Large Language Models, settings and prompts** in your own AWS account.
This solution provides ready-to-use code so you can start **experimenting with a variety of Large Language Models and Multimodal Language Models, settings and prompts** in your own AWS account.

Supported model providers:
- [Amazon Bedrock](https://aws.amazon.com/bedrock/)
- [Amazon SageMaker](https://aws.amazon.com/sagemaker/) self-hosted models from Foundation, Jumpstart and HuggingFace.
- Third-party providers via API such as Anthropic, Cohere, AI21 Labs, OpenAI, etc. [See available langchain integrations](https://python.langchain.com/docs/integrations/llms/) for a comprehensive list.


## Experiment with multimodal models
Deploy [IDEFICS](https://huggingface.co/blog/idefics) models on [Amazon SageMaker](https://aws.amazon.com/sagemaker/) and see how the chatbot can answer questions about images, describe visual content, and generate text grounded in multiple images.


![sample](assets/multimodal-sample.gif "AWS GenAI Chatbot")

Read more about how to deploy multimodal IDEFICS models on Amazon SageMaker [here](#multimodal-models).


## Experiment with multiple RAG options with Workspaces
A workspace is a logical namespace where you can upload files for indexing and storage in one of the vector databases. You can select the embeddings model and text-splitting configuration of your choice.

@@ -52,10 +61,10 @@ The solution comes with several debugging tools to help you debug RAG scenarios:


## Full-fledged User Interface
The repository includes a CDK construct to deploy a **full-fledged UI** built with [React](https://react.dev/) to interact with the deployed LLMs as chatbots. Hosted on [Amazon S3](https://aws.amazon.com/s3/) and distributed with [Amazon CloudFront](https://aws.amazon.com/cloudfront/).
The repository includes a CDK construct to deploy a **full-fledged UI** built with [React](https://react.dev/) to interact with the deployed LLMs/MLMs as chatbots. Hosted on [Amazon S3](https://aws.amazon.com/s3/) and distributed with [Amazon CloudFront](https://aws.amazon.com/cloudfront/).


Protected with [Amazon Cognito Authentication](https://aws.amazon.com/cognito/) to help you interact and experiment with multiple LLMs, multiple RAG engines, conversational history support and document upload/progress.
Protected with [Amazon Cognito Authentication](https://aws.amazon.com/cognito/) to help you interact and experiment with multiple LLMs/MLMs, multiple RAG engines, conversational history support and document upload/progress.


The interface layer between the UI and backend is built with [API Gateway REST API](https://docs.aws.amazon.com/apigateway/latest/developerguide/apigateway-rest-api.html) for management requests and [Amazon API Gateway WebSocket APIs](https://docs.aws.amazon.com/apigateway/latest/developerguide/apigateway-websocket-api.html) for chatbot messages and responses.
@@ -68,7 +77,7 @@ Design system provided by [AWS Cloudscape Design System](https://cloudscape.desi

Before you begin using the solution, there are certain precautions you must take into account:

- **Cost Management with self-hosted models on SageMaker**: Be mindful of the costs associated with AWS resources, especially with SageMaker models billed by the hour. While the sample is designed to be cost-effective, leaving serverful resources running for extended periods or deploying numerous LLMs can quickly lead to increased costs.
- **Cost Management with self-hosted models on SageMaker**: Be mindful of the costs associated with AWS resources, especially with SageMaker models billed by the hour. While the sample is designed to be cost-effective, leaving serverful resources running for extended periods or deploying numerous LLMs/MLMs can quickly lead to increased costs.

- **Licensing obligations**: If you choose to use any datasets or models alongside the provided samples, ensure you check the LLM code and comply with all licensing obligations attached to them.

@@ -206,7 +215,7 @@ npm install && npm run build
npm run create
```
You'll be prompted to configure the different aspects of the solution, such as:
- The LLMs to enable (we support all models provided by Bedrock, FalconLite, LLama 2 and more to come)
- The LLMs or MLMs to enable (we support all models provided by Bedrock along with SageMaker-hosted IDEFICS, FalconLite, Mistral, and more to come)
- Setup of the RAG system: engine selection (e.g., Aurora with pgvector, OpenSearch, Kendra), embeddings selection, and more to come.

When done, answer `Y` to create a new configuration.
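For reference, the CLI writes a configuration object shaped roughly like the sketch below. This is a hypothetical illustration based on the `SystemConfig` usage visible in `bin/config.ts` in this commit; exact field names, enum members, and defaults may differ, so treat every value here as an assumption:

```typescript
// Hypothetical sketch of a generated configuration; shape inferred from
// bin/config.ts in this commit, values illustrative only.
import {
  SupportedRegion,
  SupportedSageMakerModels,
  SystemConfig,
} from "../lib/shared/types";

const config: SystemConfig = {
  bedrock: {
    region: SupportedRegion.US_EAST_1, // enum member name assumed
    endpointUrl: "https://bedrock-runtime.us-east-1.amazonaws.com",
  },
  llms: {
    // Models picked in the multiselect step, e.g.:
    sagemaker: [SupportedSageMakerModels.FalconLite],
  },
  rag: {
    enabled: false,
    engines: {}, // per-engine enabled flags (aurora, opensearch, kendra, ...)
  },
};
```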
@@ -251,16 +260,42 @@ GenAIChatBotStack.ApiKeysSecretNameXXXX = ApiKeysSecretName-xxxxxx

10. Log in with the user created in step 8; you will be asked to change the password.


# Multimodal models
Currently, the following multimodal models are supported:
- [IDEFICS 9b Instruct](https://huggingface.co/HuggingFaceM4/idefics-9b)
- Requires `ml.g5.12xlarge` instance.
- [IDEFICS 80b Instruct](https://huggingface.co/HuggingFaceM4/idefics-80b-instruct)
- Requires `ml.g5.48xlarge` instance.

To learn which instance types are required and how to request them, read [Amazon SageMaker requirements](#amazon-sagemaker-requirements-for-self-hosted-models-only).

> NOTE: Make sure to review [IDEFICS models license sections](https://huggingface.co/HuggingFaceM4/idefics-80b-instruct#license).
To deploy a multimodal model, simply follow the [deploy instructions](#deploy), select one of the supported models (press Space to select/deselect) in the magic-create CLI step, and deploy as [instructed in the above section](#deployment-dependencies-installation).

![sample](assets/select-multimodal.gif "AWS GenAI Chatbot")


> ⚠️ NOTE ⚠️ Amazon SageMaker endpoints are billed by the hour. Avoid leaving these models running unused to prevent unnecessary costs.
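If you prefer editing the configuration file directly rather than re-running the CLI, enabling a multimodal model is, hypothetically, a one-line change in the `llms` section. The IDEFICS enum member name below is an assumption, not confirmed by this commit — check `SupportedSageMakerModels` in `lib/shared/types` for the actual values:

```typescript
// Hypothetical: the IDEFICS enum member name is assumed, not confirmed.
import { SupportedSageMakerModels } from "../lib/shared/types";

const llms = {
  // IDEFICS 9b Instruct requires an ml.g5.12xlarge instance.
  sagemaker: [SupportedSageMakerModels.Idefics9B],
};
```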


# Run user interface locally

See instructions in the README file of the [`lib/user-interface/react-app`](./lib/user-interface/react-app) folder.



# Clean up
You can remove the stacks and all the associated resources created in your AWS account by running the following command:

```bash
npx cdk destroy
```
> **Note**: Depending on which resources have been deployed, destroying the stack might take up to 45 minutes. If deletion fails multiple times, manually delete the remaining stack's ENIs (you can filter ENIs by VPC/Subnet/etc. using the search bar [here](https://console.aws.amazon.com/ec2/home#NIC) in the AWS console) and re-attempt stack deletion.

# Architecture
This repository comes with several reusable CDK constructs, giving you the freedom to decide what to deploy and what not.
Binary file added assets/multimodal-sample.gif
Binary file added assets/select-multimodal.gif
2 changes: 1 addition & 1 deletion bin/aws-genai-llm-chatbot.ts
@@ -1,6 +1,6 @@
#!/usr/bin/env node
import "source-map-support/register";
import * as cdk from "aws-cdk-lib";
import "source-map-support/register";
import { AwsGenAILLMChatbotStack } from "../lib/aws-genai-llm-chatbot-stack";
import { getConfig } from "./config";

3 changes: 1 addition & 2 deletions bin/config.ts
@@ -1,6 +1,5 @@
import {
SupportedRegion,
SupportedSageMakerLLM,
SystemConfig,
} from "../lib/shared/types";
import { existsSync, readFileSync } from "fs";
@@ -22,7 +21,7 @@ export function getConfig(): SystemConfig {
endpointUrl: "https://bedrock-runtime.us-east-1.amazonaws.com",
},
llms: {
// sagemaker: [SupportedSageMakerLLM.FalconLite]
// sagemaker: [SupportedSageMakerModels.FalconLite]
sagemaker: [],
},
rag: {
22 changes: 12 additions & 10 deletions cli/magic-create.ts
@@ -7,7 +7,7 @@ import { Command } from "commander";
import * as enquirer from "enquirer";
import {
SupportedRegion,
SupportedSageMakerLLM,
SupportedSageMakerModels,
SystemConfig,
} from "../lib/shared/types";
import { LIB_VERSION } from "./version.js";
@@ -61,7 +61,7 @@ const embeddingModels = [
options.bedrockRegion = config.bedrock?.region;
options.bedrockEndpoint = config.bedrock?.endpointUrl;
options.bedrockRoleArn = config.bedrock?.roleArn;
options.sagemakerLLMs = config.llms.sagemaker;
options.sagemakerModels = config.llms?.sagemaker;
options.enableRag = config.rag.enabled;
options.ragsToEnable = Object.keys(config.rag.engines).filter(
(v: string) => (config.rag.engines as any)[v].enabled
@@ -125,6 +125,7 @@ async function processCreateOptions(options: any): Promise<void> {
SupportedRegion.US_WEST_2,
SupportedRegion.EU_CENTRAL_1,
SupportedRegion.AP_SOUTHEAST_1,
SupportedRegion.AP_NORTHEAST_1,
],
initial: options.bedrockRegion ?? "us-east-1",
skip() {
@@ -154,22 +155,23 @@
},
{
type: "multiselect",
name: "sagemakerLLMs",
name: "sagemakerModels",
message:
"Which Sagemaker LLMs do you want to enable (enter for None, space to select)",
choices: Object.values(SupportedSageMakerLLM),
initial: options.sagemakerLLMs || [],
"Which SageMaker Models do you want to enable (enter for None, space to select)",
choices: Object.values(SupportedSageMakerModels),
initial: options.sagemakerModels || [],
},
{
type: "confirm",
name: "enableRag",
message: "Do you want to enable RAG",
initial: options.enableRag || true,
initial: options.enableRag || false,
},
{
type: "multiselect",
name: "ragsToEnable",
message: "Which datastores do you want to enable for RAG",
message:
"Which datastores do you want to enable for RAG (enter for None, space to select)",
choices: [
{ message: "Aurora", name: "aurora" },
{ message: "OpenSearch", name: "opensearch" },
@@ -301,7 +303,7 @@ async function processCreateOptions(options: any): Promise<void> {
}
: undefined,
llms: {
sagemaker: answers.sagemakerLLMs,
sagemaker: answers.sagemakerModels,
},
rag: {
enabled: answers.enableRag,
@@ -354,7 +356,7 @@ async function processCreateOptions(options: any): Promise<void> {
type: "confirm",
name: "create",
message: "Do you want to create a new config based on the above",
initial: false,
initial: true,
},
])) as any
).create
26 changes: 13 additions & 13 deletions cli/magic.ts
@@ -1,17 +1,17 @@
#!/usr/bin/env node
#!/usr/bin/env node
// You might want to add this to the previous line --experimental-specifier-resolution=node

import { Command } from 'commander';
import { LIB_VERSION } from './version.js'
import { Command } from "commander";
import { LIB_VERSION } from "./version.js";

(async () =>{
let program = new Command();
program
.version(LIB_VERSION)
.command('create', '📦 creates a new configuration for the a Chatbot')
.command('show','🚚 display the current chatbot configuration')
.command('deploy', '🌟 deploys the chatbot to your account')
.description('🛠️ Easily create a chatbots');
(async () => {
let program = new Command();
program
.version(LIB_VERSION)
.command("create", "📦 creates a new configuration for the a Chatbot")
.command("show", "🚚 display the current chatbot configuration")
.command("deploy", "🌟 deploys the chatbot to your account")
.description("🛠️ Easily create a chatbots");

program.parse(process.argv);
} )();
program.parse(process.argv);
})();
29 changes: 25 additions & 4 deletions lib/authentication/index.ts
@@ -1,10 +1,12 @@
import * as cognitoIdentityPool from "@aws-cdk/aws-cognito-identitypool-alpha";
import * as cdk from "aws-cdk-lib";
import * as cognito from "aws-cdk-lib/aws-cognito";
import { Construct } from "constructs";

export class Authentication extends Construct {
public readonly userPool: cognito.UserPool;
public readonly userPoolClient: cognito.UserPoolClient;
public readonly identityPool: cognitoIdentityPool.IdentityPool;

constructor(scope: Construct, id: string) {
super(scope, id);
@@ -27,20 +29,39 @@ export class Authentication extends Construct {
},
});

const identityPool = new cognitoIdentityPool.IdentityPool(
this,
"IdentityPool",
{
authenticationProviders: {
userPools: [
new cognitoIdentityPool.UserPoolAuthenticationProvider({
userPool,
userPoolClient,
}),
],
},
}
);

this.userPool = userPool;
this.userPoolClient = userPoolClient;
this.identityPool = identityPool;

new cdk.CfnOutput(this, "UserPoolId", {
value: userPool.userPoolId,
});

new cdk.CfnOutput(this, "UserPoolWebClientId", {
value: userPoolClient.userPoolClientId,
})

new cdk.CfnOutput(this, 'UserPoolLink', {
value: `https://${cdk.Stack.of(this).region}.console.aws.amazon.com/cognito/v2/idp/user-pools/${userPool.userPoolId}/users?region=${cdk.Stack.of(this).region}`,
});

new cdk.CfnOutput(this, "UserPoolLink", {
value: `https://${
cdk.Stack.of(this).region
}.console.aws.amazon.com/cognito/v2/idp/user-pools/${
userPool.userPoolId
}/users?region=${cdk.Stack.of(this).region}`,
});
}
}
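The updated construct can be consumed from a parent stack roughly as follows. This is a minimal sketch, not code from the commit: the stack name and the import path are illustrative assumptions, and the comment reflects only the public fields visible in the diff above:

```typescript
// Minimal sketch; "ExampleStack" and the import path are hypothetical.
import * as cdk from "aws-cdk-lib";
import { Authentication } from "./lib/authentication";

const app = new cdk.App();
const stack = new cdk.Stack(app, "ExampleStack");

// The construct now exposes identityPool alongside userPool and
// userPoolClient, so downstream constructs can grant IAM permissions
// to authenticated users (e.g. for file uploads from the UI).
const auth = new Authentication(stack, "Authentication");
```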
