Serverless Reference Architecture: Real-time Stream Processing

README Languages: DE | ES | FR | IT | JP | KR | PT | RU | CN | TW

You can use AWS Lambda and Amazon Kinesis to process real-time streaming data for application activity tracking, transaction order processing, click stream analysis, data cleansing, metrics generation, log filtering, indexing, social media analysis, and IoT device data telemetry and metering.

The template creates the following resources:

- An Amazon Kinesis Data Stream for ingesting Tweet records
- An Amazon Kinesis Data Firehose delivery stream that delivers transformed records to S3 to support Amazon Athena queries
- An Amazon DynamoDB table named <stack-name>-EventData for storage of parsed and transformed Tweet data
- An Amazon S3 bucket for storage of transformed records to support Amazon Athena queries
- Four AWS Lambda functions:
  1. <stack-name>-DataStreamConsumer, which receives records from the Kinesis Data Stream, parses and transforms the records, and writes them to the DynamoDB table
  2. <stack-name>-DataStreamProducerJava, which polls Twitter for trending topics and writes the tweets to the Kinesis Data Stream using the Kinesis Producer Library
  3. <stack-name>-DataStreamProducerPython, which polls Twitter for trending topics and writes the tweets to the Kinesis Data Stream using the kinesis.put_record() API
  4. <stack-name>-FirehoseTransformer, which transforms records for storage in Amazon S3 and analysis via Amazon Athena queries
- AWS Identity and Access Management (IAM) roles and policies that allow the Lambda functions to interact with the Kinesis Data Stream, the Kinesis Data Firehose delivery stream, and the DynamoDB table
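
To make the consumer and transformer roles more concrete, here is a minimal Python sketch of the two event shapes involved. It is not the code shipped in src/lambda: the handler names, the `to_item` transform, and the `id` and `text` fields are hypothetical, and the optional `table` argument is assumed to be a boto3 DynamoDB Table resource.

```python
import base64
import json


def decode_kinesis_records(event):
    """Return the JSON payloads carried by a Kinesis-triggered Lambda event."""
    payloads = []
    for record in event.get("Records", []):
        # Kinesis delivers each record's data base64-encoded.
        raw = base64.b64decode(record["kinesis"]["data"])
        payloads.append(json.loads(raw))
    return payloads


def to_item(tweet):
    """Hypothetical transform: keep only the fields we want to store."""
    return {"id": str(tweet["id"]), "text": tweet.get("text", "")}


def consumer_handler(event, context, table=None):
    """Parse and transform stream records, then write them to DynamoDB.

    `table` is assumed to be a boto3 DynamoDB Table resource; it is optional
    here so the parsing logic can be exercised without AWS access.
    """
    items = [to_item(tweet) for tweet in decode_kinesis_records(event)]
    if table is not None:
        with table.batch_writer() as batch:
            for item in items:
                batch.put_item(Item=item)
    return {"processed": len(items)}


def firehose_handler(event, context):
    """Transform Firehose records and return them in the expected response shape."""
    output = []
    for record in event["records"]:
        tweet = json.loads(base64.b64decode(record["data"]))
        # Newline-delimited JSON keeps one object per line in the S3 output.
        line = json.dumps(to_item(tweet)) + "\n"
        output.append(
            {
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(line.encode("utf-8")).decode("utf-8"),
            }
        )
    return {"records": output}
```

Both handlers reuse the same transform, so the records stored in DynamoDB and the records delivered to S3 stay consistent in this sketch.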

Instructions

To access the Twitter API, you need access tokens; make sure you have them available. As a best practice, these credentials are NOT hard-coded in the Lambda functions. Instead, they are stored in AWS Systems Manager (SSM) Parameter Store and retrieved from within the Lambda code. The Lambda code expects the following parameter names:
```
/twitter/consumer_key
/twitter/consumer_secret
/twitter/access_token_key
/twitter/access_token_secret
```
You can add these parameters manually under AWS Systems Manager > Parameter Store in the AWS Management Console, or by running the commands below with the AWS CLI. Note: make sure the credentials your CLI uses are allowed to perform the ssm put-parameter API call. For more information on setting up IAM user permissions for Systems Manager parameters, see Control Access to Systems Manager Parameters.
```
aws ssm put-parameter --name "/twitter/consumer_key" --value "xxx" --type "SecureString"
aws ssm put-parameter --name "/twitter/consumer_secret" --value "yyy" --type "SecureString"
aws ssm put-parameter --name "/twitter/access_token_key" --value "zzz" --type "SecureString"
aws ssm put-parameter --name "/twitter/access_token_secret" --value "jjj" --type "SecureString"
```
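
Once the parameters exist, a Lambda function could read them roughly as follows. This is a sketch rather than the repository's actual code: the function name `load_twitter_credentials` is hypothetical, and `ssm_client` is assumed to be a boto3 SSM client such as `boto3.client("ssm")`.

```python
# Parameter names taken from the list above.
PARAMETER_NAMES = [
    "/twitter/consumer_key",
    "/twitter/consumer_secret",
    "/twitter/access_token_key",
    "/twitter/access_token_secret",
]


def load_twitter_credentials(ssm_client):
    """Fetch and decrypt the four Twitter credentials in a single call.

    `ssm_client` is assumed to be a boto3 SSM client.
    """
    response = ssm_client.get_parameters(Names=PARAMETER_NAMES, WithDecryption=True)
    missing = response.get("InvalidParameters", [])
    if missing:
        raise KeyError(f"Missing SSM parameters: {missing}")
    # Key the result by the last path segment, e.g. "consumer_key".
    return {p["Name"].rsplit("/", 1)[-1]: p["Value"] for p in response["Parameters"]}
```

Because the parameters are stored as SecureString, `WithDecryption=True` is needed to get the plaintext values, and the function's execution role must be allowed to read and decrypt them.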
Build and Deploy the Application Stack

You may deploy the application stack using either method below:
1. Launch the AWS CloudFormation stack with the template.
  
  The AWS CloudFormation template completely automates the building, deployment, and configuration of all the components of the application.
2. Use the AWS SAM CLI.
  
  Install (or upgrade) the AWS SAM CLI
```
pip install --upgrade pip --user
hash -r
pip install --upgrade aws-sam-cli --user
```
  Deploy the stack:
```
sam build && sam package --s3-bucket <your-bucket-name> --output-template-file packaged.yaml --region us-east-1
sam deploy --template-file ./packaged.yaml --stack-name <stack-name> --capabilities CAPABILITY_IAM CAPABILITY_NAMED_IAM CAPABILITY_AUTO_EXPAND
```

Validation

TODO: instructions for Kinesis analytics
In the Amazon DynamoDB management console, select the table named <stack-name>-EventData and explore the records.
TODO: instructions for S3 / Athena
TODO: CloudWatch Logs?
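
The DynamoDB check can also be done from code. The sketch below reads a handful of items from the <stack-name>-EventData table; the function name `sample_event_data` is hypothetical, and `dynamodb_client` is assumed to be a boto3 DynamoDB client such as `boto3.client("dynamodb")`.

```python
def sample_event_data(dynamodb_client, stack_name, limit=5):
    """Return up to `limit` items from the stack's EventData table.

    `dynamodb_client` is assumed to be a boto3 DynamoDB client.
    """
    table_name = f"{stack_name}-EventData"
    response = dynamodb_client.scan(TableName=table_name, Limit=limit)
    return response.get("Items", [])
```

An empty result shortly after deployment may simply mean that the producers have not written any records yet.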

Cleanup

To remove all created resources, delete the AWS CloudFormation stack. If the S3 bucket still contains objects, the deletion fails with an error that the bucket is not empty. In that case, navigate to the S3 console, select the bucket, and choose the option to empty it. Then return to the CloudFormation console and retry deleting the stack.
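
The same cleanup can be scripted by emptying the bucket before deleting the stack. This is a minimal sketch under a few assumptions: the function name is hypothetical, `bucket` is a boto3 S3 Bucket resource (for example `boto3.resource("s3").Bucket(name)`, where the bucket name is whatever the stack created), `cfn_client` is a boto3 CloudFormation client, and the bucket is not versioned.

```python
def empty_bucket_and_delete_stack(bucket, cfn_client, stack_name):
    """Empty the S3 bucket, then request deletion of the CloudFormation stack.

    `bucket` is assumed to be a boto3 S3 Bucket resource and `cfn_client`
    a boto3 CloudFormation client.
    """
    # Delete every object so CloudFormation can remove the bucket itself.
    bucket.objects.all().delete()
    # Stack deletion is asynchronous; this call only starts it.
    cfn_client.delete_stack(StackName=stack_name)
```

Emptying the bucket first avoids the failed first deletion attempt described above.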


timothy-baker/lambda-refarch-streamprocessing
