Dining Concierge Chatbot

Teammates: Theodore Hadges and Ahmed Shoukr

This is a chatbot which uses AWS services to ask a user questions and use their preferences to suggest a list of restaurants they may enjoy.

The architecture is admittedly overkill for this application (it uses 11 different AWS services) but we wanted to use this project to practice and learn how to build a scalable cloud application.

Demo

Below, a demo of a user registering for a new account and talking to the chatbot.

At the end of the process, the user receives a text message which looks like this:

Architecture Overview

Auto-Deployment

We used CodePipeline to auto-deploy code from our master branch on GitHub to S3. We also used the CloudFront content delivery network to distribute our website and replicate the distribution configuration across multiple edge locations around the world for caching content and providing fast delivery of the website.

Here is the flowchart for this process:

Full Architecture Diagram

Here is the flowchart of our full architecture diagram.

Individual Components

S3

Created 2 main buckets: sortai.co and www.sortai.co and put all frontend files there
Set them up to host static website
Added automatic deployment with CodePipeline as discussed in the next section.

CodePipeline

The front-end is hosted in an S3 bucket. Since editing those files directly is impossible and downloading, editing, then reuploading is clumsy and inefficient, we decided to use CodePipeline. Whenever we push to our master branch in our GitHub repository, CodePipeline will pull those changes, then clone them to our S3 bucket. This makes for easy integration between GitHub and S3.

CloudFront

The website source files are hosted on S3, but we also use a CDN called CloudFront to replicate configurations of the distribution to edge locations around the world, for caching content and providing fast delivery of the website.

Yelp Scraping

We wrote a series of python scripts (and one run_pipeline.sh script, which executes all python scripts). The python scripts scrape restaurant data from Yelp using the Yelp Fusion API, then store a subset of this data in DynamoDB and our Elasticsearch index. More information about Yelp Scraping can be found here: https://github.com/ashoukr/chatbot/blob/main/yelp-scrape/README.md

Front-end

We used starter code from https://github.com/ndrppnc/cloud-hw1-starter and did not change it much, since front-end development was not an important part of this project for us. The main thing we did was replace the sdk (apigClient.js) with the one we generated in API Gateway. We also added userprofile.js and verifier.js, both for the purpose of authenticating the user via Cognito.

API Gateway

Name: /chatbot
We used the swagger API provided.
We used the Enable Cores in the UI to add proper headers for POST method to work properly.
However this did not add all the proper headers so we added them manually.
- In the Method Response under OPTIONS and POST, we added the following response headers for 200 response:
  - Access-Control-Allow-Headers, Access-Control-Allow-Origin, Access-Control-Allow-Credentials, Access-Control-Allow-Methods
- In the Integration Response under OPTIONS and POST, we added the following header mappings corresponding to the above mappings in order:
  - 'Content-Type,X-Amz-Date,Authorization,X-Api-Key,X-Amz-Security-Token', '*', 'true', 'POST'

General Lambda Setup

We used the Serverless Application Model (SAM) CLI provided by Amazon to write all the lambda functions and test them locally.
Additionally, this allowed us to manipulate all the infrastructure for the lambda functions.

# sample template for LF0
Resources:
  LF0Function:
    Type: AWS::Serverless::Function 
    Properties:
      CodeUri: ./LF0/
      Handler: app.lambda_handler
      Runtime: python3.7
      Timeout: 200
      Events:
        LF0:
          Type: Api 
          Properties:
            Path: /LF0
            Method: get

We used Docker to dockerize and send all 3 lambda functions to s3 before they were deployed using cloudformation to obtain full memory capacity for additional packages we may need.
Each Lambda function has a requirements.txt file with additional packages we used such as requests_aws4auth used by LF2
We also wrote a deployment script to automate this process

# deploy.sh
sam build --use-container
sam package --s3-bucket chatbot-cloudformation-template --output-template-file packaged.yaml
sam deploy --template-file ./packaged.yaml --stack-name chatbot-message-handler --capabilities CAPABILITY_IAM CAPABILITY_NAMED_IAM CAPABILITY_AUTO_EXPAND

Lambda LF0

Set up with API Gateway to receive user message and send it to Lex
Given API Gateway necessary additional role to pass messages to this lambda
Uses boto3 SDK to pass message to Lex

client = boto3.client('lex-runtime')
response = client.post_text(botName='ConceirgeBot', botAlias = 'ConceirgeBot', userId='string', inputText = message)

Returns body response to UI

Lex

Main Intents

DiningSuggestionsIntent
- You can ask questions like "Can you help me find a restaurant" or you can give a statement like "I want to go eat with friends" or "I am hungry" which will trigger this intent.
- Intent requires 8 different slots: cusine, name, location, zip_code, dining_time, date, number_of_people, and phone_number.
  - In addition to these questions, we trained a number of corresponding utternaces it can use to extract matching information
  - There are a maximum of 2 retries before the question is skipped.
GreetingIntent
- You can ask questions like "What's up" or add statements like "Hey" or "Hello" to trigger this intent
- Intent requires 1 slot: name
  - Multiple corresponding utterances given to extract name from user input
  - There are a miximum of 2 retries before the question is skipped.
ThankYouIntent
- Once a user says an utterance like "Thank you bot", "Thank you", etc. this intent is triggered
- No slots required here

Code Fullfillment

This is done by setting LF1 as an initialization and validation code hook in Lex, and using elicit_slot in LF0 whenever a slot input is invalid. See The next section for more info.

Lambda LF1

In a nutshell, LF1 takes events from Lex, validates them, then enqueues the message to SQS.

The event is first parsed to see which intent sent it. If it's the ThankYouIntent or the GreetingIntent, LF1 will simply build a response of the following form and send it back:

# sample response for ThankYouIntent
res = {
    "sessionAttributes": {},
    "dialogAction": {
        "type": "Close",
        "fulfillmentState": "Fulfilled",
        "message" : {
            "contentType" : "PlainText",
            "content" : "No problem!"
        }
    }
}    
return res

This is the end of LF1's involvement with the ThankYouIntent and GreetingIntent.

On the other hand, if the intent value is DiningSuggestionsIntent, we do some validation throughout the conversation. The validation is a little tricky in that we have designed it to validate after each user input (mid-conversation). We feel that this is the better than validating at the conversation so that the user does not have to retype their whole list of preferences. This is done by setting LF1 as an initialization and validation code hook in Lex, and using elicit_slot in LF0 whenever a slot input is invalid.

These are the intents LF1 validates:

Number of people : must be greater than 0 and less than 20
Phone number: must be 9 digits
Date: day should be either current day/any day after the current day
Time: greater than the current time if date is today
Cuisine: must be one of the 5 cuisine types (non case-sensitive): ['indian', 'italian', 'burgers','chinese', 'pizza'])

If any of these are invalid, the chatbot will call elicit_slot to tell the user that the input is invalid and ask for another input.

Once all slots are collected and validated, the message is enqueued to SQS, and a return message is sent to the user, saying that the bot will look in its directory and send them a text message in a moment with restaurant suggestions.

SQS

Created a chatbot-queue using the UI
Added necessary roles to LF1 to be able to push messages here.
Allowed LF2 to poll the queue every 1 minute for new messages.

Lambda LF2

CloudWatch Trigger is explained below
Once the data is obtained from SQS, we then parse it to grab all necessary fields.
We used the AWS4Auth to make a request to elastic search


region = 'us-east-1' # For example, us-west-1
service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)

Then we set up query using the cuisine provided by user

query = {
  "query": {
    "match": {
      "Cuisine": cuisine
    }
  }
} 
headers = { "Content-Type": "application/json" }
# make query and get 3 different restaurants
r = requests.get(url, auth=awsauth, headers=headers, data=json.dumps(query))

Used the RestaurantID to obtain restaurants from dynamoDB

res = client.get_item(TableName='yelp-restaurants', Key = {'BusinessId' : {'S': business_ids[i] }} )

Used SNS SDK client to to invoke an SMS message to send to the user

client = boto3.client('sns')
client.publish(PhoneNumber='+1' + phone_number, Message = notification)

CloudWatch

We created a CloudWatch rule which triggers LF2 once every minute. While we could have simply added a SQS trigger to LF2, we used a CloudWatch rule to have more control over polling intervals. Thus, LF2 attempts to dequeue the next item from SQS. If such an item exists (i.e. if the queue is not empty) then LF2 will get data from the ES index and the DynamoDB table, then make a recommendation to the user as a string. It will then send this string to SNS.

DynamoDB

We simply used the AWS web console to create a table. The upload_to_dynamodb.py script takes care of populating it with Yelp data.

We populated the table by simply iterating over the list of restaurants we scraped and doing the following

restaurant["insertedAtTimestamp"] = str(datetime.datetime.now())
table.put_item(Item=restaurant)

, where restaurant is the current restaurant in the iteration and table is the table we're inserting into.

See below for more info:
https://github.com/ashoukr/chatbot/blob/main/yelp-scrape/README.md

ElasticSearch

Ahmed: Setup

Used Development and testing domain type
We used t3.small.elasticsearch instance type with 1 node for cost purposes. Because of this, we had to specify a policy presented by AWS to be able to create the cluster, then went back later and changed it to allow any AWS service to access the resource.
There are also no dedicated master nodes
No VPC used

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "*"
      },
      "Action": "es:*",
      "Resource": "arn:aws:es:us-east-1:922059106485:domain/yelp-restaurants/*"
    }
  ]
}

# Sample PUT requrest used to insert data into elasticsearch for initial testing
# also creates index
$ http PUT https://search-yelp-restaurants-lg4z52z76iaf3lkg54o2jdcj2a.us-east-1.es.amazonaws.com/restaurants/Restaurant/6 Cuisine="chinese" RestaurantID="0CjK3esfpFcxIopebzjFxA"

{
    "_id": "4",
    "_index": "restaurants",
    "_primary_term": 1,
    "_seq_no": 0,
    "_shards": {
        "failed": 0,
        "successful": 1,
        "total": 2
    },
    "_type": "Restaurant",
    "_version": 1,
    "result": "created"
}

# Sample GET request used to ensure data inserted in elastic search
$ http GET https://search-yelp-restaurants-lg4z52z76iaf3lkg54o2jdcj2a.us-east-1.es.amazonaws.com/restaurants/Restaurant/1

# Delete restaurants table after testing
http DELETE https://search-yelp-restaurants-lg4z52z76iaf3lkg54o2jdcj2a.us-east-1.es.amazonaws.com/restaurants

In Elastic Kibana, we used SQL WorkBench to query the tables created and ensure data was present in the elastic search cluster

show tables like %;
describe tables like %;
SELECT * FROM restaurants

Finally, we gave LF2 Lambda a new role that allowed it to interact with the Elastic Search Cluster.

We populated the index by simply iterating over the list of restaurants we scraped and doing an es.index(index=path, doc_type="_doc", id=str(index), body=doc), where index is the ith iteration and doc is the key:value object taking the form:

doc = {
    "Cuisine": restaurant["Cuisine"],
    "RestaurantID": restaurant["BusinessId"]
}

More info can be seen here, or by viewing the upload_to_elastic_search.py file: https://github.com/ashoukr/chatbot/blob/main/yelp-scrape/README.md

SNS

Here we created a basic topic called chatbot-sms without any additional configuration
We added a new role to LF2 that allows it to communicate with this SNS topic.
We used boto3 SDK provided to push sns requests to this service. This instantly sends a message to the user with all the details.

client = boto3.client('sns')
client.publish(PhoneNumber = '+1' + phone_number, Message = notification)

Cognito

As a fun add-on, we decided to add authentication. We created a userpool, and a new client app within that pool. Since the callback URL (where to redirect to once the user is signed in) must use https (not http), we used our CloudFront URL. We used an authorization code grant flow whereby the user receives a code via email upon registration and must enter the code on the screen to activate their account.

Cognito has some nice built-in forms to handle registration and login. This is convenient in that we did not have to create a separate web page just for the form. We used one of these, as shown below:

Name		Name	Last commit message	Last commit date
Latest commit History 42 Commits
chatbot		chatbot
media		media
src		src
swagger-api		swagger-api
yelp-scrape		yelp-scrape
.DS_Store		.DS_Store
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Dining Concierge Chatbot

Table of Contents

Demo

Architecture Overview

Auto-Deployment

Full Architecture Diagram

Individual Components

S3

CodePipeline

CloudFront

Yelp Scraping

Front-end

API Gateway

General Lambda Setup

Lambda LF0

Lex

Main Intents

Code Fullfillment

Lambda LF1

SQS

Lambda LF2

CloudWatch

DynamoDB

ElasticSearch

SNS

Cognito

About

Releases 1

Packages

Contributors 2

Languages

ahmed1/chatbot

Folders and files

Latest commit

History

Repository files navigation

Dining Concierge Chatbot

Table of Contents

Demo

Architecture Overview

Auto-Deployment

Full Architecture Diagram

Individual Components

S3

CodePipeline

CloudFront

Yelp Scraping

Front-end

API Gateway

General Lambda Setup

Lambda LF0

Lex

Main Intents

Code Fullfillment

Lambda LF1

SQS

Lambda LF2

CloudWatch

DynamoDB

ElasticSearch

SNS

Cognito

About

Resources

Stars

Watchers

Forks

Releases 1

Packages 0

Contributors 2

Languages

Packages