Feat: IDEFICS models support + Mistral 7B Instruct (#135)
* add [IDEFICS models](https://huggingface.co/blog/idefics), IDEFICS model interface and file upload capability in User Interface
* update README with multimodal sample, instructions and warnings.
* add [Mistral 7B Instruct](https://huggingface.co/mistralai/Mistral-7B-v0.1) support
* remove LLama2 Base model
bigadsoleiman authored Oct 24, 2023
1 parent 8deaf4c commit 128f861
Showing 76 changed files with 19,186 additions and 1,796 deletions.
47 changes: 41 additions & 6 deletions README.md
@@ -1,4 +1,4 @@
# Deploying a Multi-LLM and Multi-RAG Powered Chatbot Using AWS CDK on AWS
# Deploying a Multi-Model and Multi-RAG Powered Chatbot Using AWS CDK on AWS
[![Release Notes](https://img.shields.io/github/v/release/aws-samples/aws-genai-llm-chatbot)](https://github.com/aws-samples/aws-genai-llm-chatbot/releases)
[![GitHub star chart](https://img.shields.io/github/stars/aws-samples/aws-genai-llm-chatbot?style=social)](https://star-history.com/#aws-samples/aws-genai-llm-chatbot)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
@@ -27,14 +27,23 @@

# Features
## Modular, comprehensive and ready to use
This solution provides ready-to-use code so you can start **experimenting with a variety of Large Language Models, settings and prompts** in your own AWS account.
This solution provides ready-to-use code so you can start **experimenting with a variety of Large Language Models and Multimodal Language Models, settings and prompts** in your own AWS account.

Supported model providers:
- [Amazon Bedrock](https://aws.amazon.com/bedrock/)
- [Amazon SageMaker](https://aws.amazon.com/sagemaker/) self-hosted models from Foundation, Jumpstart and HuggingFace.
- Third-party providers via API such as Anthropic, Cohere, AI21 Labs, OpenAI, etc. [See available langchain integrations](https://python.langchain.com/docs/integrations/llms/) for a comprehensive list.


## Experiment with multimodal models
Deploy [IDEFICS](https://huggingface.co/blog/idefics) models on [Amazon SageMaker](https://aws.amazon.com/sagemaker/) and see how the chatbot can answer questions about images, describe visual content, and generate text grounded in multiple images.


![sample](assets/multimodal-sample.gif "AWS GenAI Chatbot")

Read more about how to deploy multimodal IDEFICS models on Amazon SageMaker [here](#multimodal-models).


## Experiment with multiple RAG options with Workspaces
A workspace is a logical namespace where you can upload files for indexing and storage in one of the vector databases. You can select the embeddings model and text-splitting configuration of your choice.

@@ -52,10 +61,10 @@ The solution comes with several debugging tools to help you debug RAG scenarios:


## Full-fledged User Interface
The repository includes a CDK construct to deploy a **full-fledged UI** built with [React](https://react.dev/) to interact with the deployed LLMs as chatbots. Hosted on [Amazon S3](https://aws.amazon.com/s3/) and distributed with [Amazon CloudFront](https://aws.amazon.com/cloudfront/).
The repository includes a CDK construct to deploy a **full-fledged UI** built with [React](https://react.dev/) to interact with the deployed LLMs/MLMs as chatbots. Hosted on [Amazon S3](https://aws.amazon.com/s3/) and distributed with [Amazon CloudFront](https://aws.amazon.com/cloudfront/).


Protected with [Amazon Cognito Authentication](https://aws.amazon.com/cognito/) to help you interact and experiment with multiple LLMs, multiple RAG engines, conversational history support and document upload/progress.
Protected with [Amazon Cognito Authentication](https://aws.amazon.com/cognito/) to help you interact and experiment with multiple LLMs/MLMs, multiple RAG engines, conversational history support and document upload/progress.


The interface layer between the UI and backend is built with [API Gateway REST API](https://docs.aws.amazon.com/apigateway/latest/developerguide/apigateway-rest-api.html) for management requests and [Amazon API Gateway WebSocket APIs](https://docs.aws.amazon.com/apigateway/latest/developerguide/apigateway-websocket-api.html) for chatbot messages and responses.
@@ -68,7 +77,7 @@ Design system provided by [AWS Cloudscape Design System](https://cloudscape.desi

Before you begin using the solution, there are certain precautions you must take into account:

- **Cost Management with self-hosted models on SageMaker**: Be mindful of the costs associated with AWS resources, especially with SageMaker models billed by the hour. While the sample is designed to be cost-effective, leaving serverful resources running for extended periods or deploying numerous LLMs can quickly lead to increased costs.
- **Cost Management with self-hosted models on SageMaker**: Be mindful of the costs associated with AWS resources, especially with SageMaker models billed by the hour. While the sample is designed to be cost-effective, leaving serverful resources running for extended periods or deploying numerous LLMs/MLMs can quickly lead to increased costs.

- **Licensing obligations**: If you choose to use any datasets or models alongside the provided samples, ensure you check the LLM code and comply with all licensing obligations attached to them.

@@ -206,7 +215,7 @@ npm install && npm run build
npm run create
```
You'll be prompted to configure the different aspects of the solution, such as:
- The LLMs to enable (we support all models provided by Bedrock, FalconLite, LLama 2 and more to come)
- The LLMs or MLMs to enable (we support all models provided by Bedrock along with SageMaker-hosted IDEFICS, FalconLite, Mistral, and more to come)
- Setup of the RAG system: engine selection (e.g., Aurora with pgvector, OpenSearch, Kendra), embeddings selection, and more to come.

When done, answer `Y` to create a new configuration.
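For reference, the CLI writes a configuration object shaped roughly like the sketch below. This is a hypothetical illustration based on the `SystemConfig` usage visible in `bin/config.ts` in this commit; exact field names, enum members, and defaults may differ, so treat every value here as an assumption:

```typescript
// Hypothetical sketch of a generated configuration; shape inferred from
// bin/config.ts in this commit, values illustrative only.
import {
  SupportedRegion,
  SupportedSageMakerModels,
  SystemConfig,
} from "../lib/shared/types";

const config: SystemConfig = {
  bedrock: {
    region: SupportedRegion.US_EAST_1, // enum member name assumed
    endpointUrl: "https://bedrock-runtime.us-east-1.amazonaws.com",
  },
  llms: {
    // Models picked in the multiselect step, e.g.:
    sagemaker: [SupportedSageMakerModels.FalconLite],
  },
  rag: {
    enabled: false,
    engines: {}, // per-engine enabled flags (aurora, opensearch, kendra, ...)
  },
};
```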
@@ -251,16 +260,42 @@ GenAIChatBotStack.ApiKeysSecretNameXXXX = ApiKeysSecretName-xxxxxx

10. Log in with the user created in step 8; you will be asked to change the password.


# Multimodal models
Currently, the following multimodal models are supported:
- [IDEFICS 9b Instruct](https://huggingface.co/HuggingFaceM4/idefics-9b)
- Requires `ml.g5.12xlarge` instance.
- [IDEFICS 80b Instruct](https://huggingface.co/HuggingFaceM4/idefics-80b-instruct)
- Requires `ml.g5.48xlarge` instance.

To learn which instance types are required and how to request them, read [Amazon SageMaker requirements](#amazon-sagemaker-requirements-for-self-hosted-models-only).

> NOTE: Make sure to review [IDEFICS models license sections](https://huggingface.co/HuggingFaceM4/idefics-80b-instruct#license).
To deploy a multimodal model, simply follow the [deploy instructions](#deploy), select one of the supported models (press Space to select/deselect) in the magic-create CLI step, and deploy as [instructed in the above section](#deployment-dependencies-installation).

![sample](assets/select-multimodal.gif "AWS GenAI Chatbot")


> ⚠️ NOTE ⚠️ Amazon SageMaker endpoints are billed by the hour. Avoid leaving these models running unused to prevent unnecessary costs.
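If you prefer editing the configuration file directly rather than re-running the CLI, enabling a multimodal model is, hypothetically, a one-line change in the `llms` section. The IDEFICS enum member name below is an assumption, not confirmed by this commit — check `SupportedSageMakerModels` in `lib/shared/types` for the actual values:

```typescript
// Hypothetical: the IDEFICS enum member name is assumed, not confirmed.
import { SupportedSageMakerModels } from "../lib/shared/types";

const llms = {
  // IDEFICS 9b Instruct requires an ml.g5.12xlarge instance.
  sagemaker: [SupportedSageMakerModels.Idefics9B],
};
```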


# Run user interface locally

See instructions in the README file of the [`lib/user-interface/react-app`](./lib/user-interface/react-app) folder.



# Clean up
You can remove the stacks and all the associated resources created in your AWS account by running the following command:

```bash
npx cdk destroy
```
> **Note**: Depending on which resources have been deployed, destroying the stack might take up to 45 minutes. If deletion fails multiple times, manually delete the remaining stack's ENIs (you can filter ENIs by VPC/Subnet/etc. using the search bar [here](https://console.aws.amazon.com/ec2/home#NIC) in the AWS console) and re-attempt stack deletion.

# Architecture
This repository comes with several reusable CDK constructs, giving you the freedom to decide what to deploy and what not.
Binary file added assets/multimodal-sample.gif
Binary file added assets/select-multimodal.gif
2 changes: 1 addition & 1 deletion bin/aws-genai-llm-chatbot.ts
@@ -1,6 +1,6 @@
#!/usr/bin/env node
import "source-map-support/register";
import * as cdk from "aws-cdk-lib";
import "source-map-support/register";
import { AwsGenAILLMChatbotStack } from "../lib/aws-genai-llm-chatbot-stack";
import { getConfig } from "./config";

3 changes: 1 addition & 2 deletions bin/config.ts
@@ -1,6 +1,5 @@
import {
SupportedRegion,
SupportedSageMakerLLM,
SystemConfig,
} from "../lib/shared/types";
import { existsSync, readFileSync } from "fs";
@@ -22,7 +21,7 @@ export function getConfig(): SystemConfig {
endpointUrl: "https://bedrock-runtime.us-east-1.amazonaws.com",
},
llms: {
// sagemaker: [SupportedSageMakerLLM.FalconLite]
// sagemaker: [SupportedSageMakerModels.FalconLite]
sagemaker: [],
},
rag: {
22 changes: 12 additions & 10 deletions cli/magic-create.ts
@@ -7,7 +7,7 @@ import { Command } from "commander";
import * as enquirer from "enquirer";
import {
SupportedRegion,
SupportedSageMakerLLM,
SupportedSageMakerModels,
SystemConfig,
} from "../lib/shared/types";
import { LIB_VERSION } from "./version.js";
@@ -61,7 +61,7 @@ const embeddingModels = [
options.bedrockRegion = config.bedrock?.region;
options.bedrockEndpoint = config.bedrock?.endpointUrl;
options.bedrockRoleArn = config.bedrock?.roleArn;
options.sagemakerLLMs = config.llms.sagemaker;
options.sagemakerModels = config.llms?.sagemaker;
options.enableRag = config.rag.enabled;
options.ragsToEnable = Object.keys(config.rag.engines).filter(
(v: string) => (config.rag.engines as any)[v].enabled
@@ -125,6 +125,7 @@ async function processCreateOptions(options: any): Promise<void> {
SupportedRegion.US_WEST_2,
SupportedRegion.EU_CENTRAL_1,
SupportedRegion.AP_SOUTHEAST_1,
SupportedRegion.AP_NORTHEAST_1,
],
initial: options.bedrockRegion ?? "us-east-1",
skip() {
@@ -154,22 +155,23 @@
},
{
type: "multiselect",
name: "sagemakerLLMs",
name: "sagemakerModels",
message:
"Which Sagemaker LLMs do you want to enable (enter for None, space to select)",
choices: Object.values(SupportedSageMakerLLM),
initial: options.sagemakerLLMs || [],
"Which SageMaker Models do you want to enable (enter for None, space to select)",
choices: Object.values(SupportedSageMakerModels),
initial: options.sagemakerModels || [],
},
{
type: "confirm",
name: "enableRag",
message: "Do you want to enable RAG",
initial: options.enableRag || true,
initial: options.enableRag || false,
},
{
type: "multiselect",
name: "ragsToEnable",
message: "Which datastores do you want to enable for RAG",
message:
"Which datastores do you want to enable for RAG (enter for None, space to select)",
choices: [
{ message: "Aurora", name: "aurora" },
{ message: "OpenSearch", name: "opensearch" },
@@ -301,7 +303,7 @@ async function processCreateOptions(options: any): Promise<void> {
}
: undefined,
llms: {
sagemaker: answers.sagemakerLLMs,
sagemaker: answers.sagemakerModels,
},
rag: {
enabled: answers.enableRag,
@@ -354,7 +356,7 @@ async function processCreateOptions(options: any): Promise<void> {
type: "confirm",
name: "create",
message: "Do you want to create a new config based on the above",
initial: false,
initial: true,
},
])) as any
).create
26 changes: 13 additions & 13 deletions cli/magic.ts
@@ -1,17 +1,17 @@
#!/usr/bin/env node
#!/usr/bin/env node
// You might want to add this to the previous line --experimental-specifier-resolution=node

import { Command } from 'commander';
import { LIB_VERSION } from './version.js'
import { Command } from "commander";
import { LIB_VERSION } from "./version.js";

(async () =>{
let program = new Command();
program
.version(LIB_VERSION)
.command('create', '📦 creates a new configuration for the a Chatbot')
.command('show','🚚 display the current chatbot configuration')
.command('deploy', '🌟 deploys the chatbot to your account')
.description('🛠️ Easily create a chatbots');
(async () => {
let program = new Command();
program
.version(LIB_VERSION)
.command("create", "📦 creates a new configuration for the a Chatbot")
.command("show", "🚚 display the current chatbot configuration")
.command("deploy", "🌟 deploys the chatbot to your account")
.description("🛠️ Easily create a chatbots");

program.parse(process.argv);
} )();
program.parse(process.argv);
})();
29 changes: 25 additions & 4 deletions lib/authentication/index.ts
@@ -1,10 +1,12 @@
import * as cognitoIdentityPool from "@aws-cdk/aws-cognito-identitypool-alpha";
import * as cdk from "aws-cdk-lib";
import * as cognito from "aws-cdk-lib/aws-cognito";
import { Construct } from "constructs";

export class Authentication extends Construct {
public readonly userPool: cognito.UserPool;
public readonly userPoolClient: cognito.UserPoolClient;
public readonly identityPool: cognitoIdentityPool.IdentityPool;

constructor(scope: Construct, id: string) {
super(scope, id);
@@ -27,20 +29,39 @@ export class Authentication extends Construct {
},
});

const identityPool = new cognitoIdentityPool.IdentityPool(
this,
"IdentityPool",
{
authenticationProviders: {
userPools: [
new cognitoIdentityPool.UserPoolAuthenticationProvider({
userPool,
userPoolClient,
}),
],
},
}
);

this.userPool = userPool;
this.userPoolClient = userPoolClient;
this.identityPool = identityPool;

new cdk.CfnOutput(this, "UserPoolId", {
value: userPool.userPoolId,
});

new cdk.CfnOutput(this, "UserPoolWebClientId", {
value: userPoolClient.userPoolClientId,
})

new cdk.CfnOutput(this, 'UserPoolLink', {
value: `https://${cdk.Stack.of(this).region}.console.aws.amazon.com/cognito/v2/idp/user-pools/${userPool.userPoolId}/users?region=${cdk.Stack.of(this).region}`,
});

new cdk.CfnOutput(this, "UserPoolLink", {
value: `https://${
cdk.Stack.of(this).region
}.console.aws.amazon.com/cognito/v2/idp/user-pools/${
userPool.userPoolId
}/users?region=${cdk.Stack.of(this).region}`,
});
}
}
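The updated construct can be consumed from a parent stack roughly as follows. This is a minimal sketch, not code from the commit: the stack name and the import path are illustrative assumptions, and the comment reflects only the public fields visible in the diff above:

```typescript
// Minimal sketch; "ExampleStack" and the import path are hypothetical.
import * as cdk from "aws-cdk-lib";
import { Authentication } from "./lib/authentication";

const app = new cdk.App();
const stack = new cdk.Stack(app, "ExampleStack");

// The construct now exposes identityPool alongside userPool and
// userPoolClient, so downstream constructs can grant IAM permissions
// to authenticated users (e.g. for file uploads from the UI).
const auth = new Authentication(stack, "Authentication");
```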
