
Automated Deployment with Cloud Development Kit (CDK)

Prerequisites

  • AWS global region account access (and IAM Access Key and Secret Key)
  • AWS China region account access (and IAM Access Key and Secret Key)
  • AWS CDK (see Getting Started with AWS CDK)
  • AWS CLI
  • SSH key pair in both accounts (used to access load testing machine)

Setup

Download the git repository

git clone https://github.com/aws-samples/aws-dynamodb-cross-region-replication.git

Put AKSK in Parameter Store

Since the AWS commercial regions and China regions use separate account systems, an Access Key/Secret Key pair (AKSK) is needed to access resources in the target region; it is stored as a secure string in SSM Parameter Store.

PROFILE_A=<profile for region A>
PROFILE_B=<profile for region B>
#AKSK for accessing region B, stored in SSM of region A
aws ssm put-parameter --name /DDBReplication/Table_B/AccessKey --value <access_key> --type String --profile $PROFILE_A 
aws ssm put-parameter --name /DDBReplication/Table_B/SecretKey --value <secret_key> --type SecureString --profile $PROFILE_A 
#AKSK for accessing region A, stored in SSM of region B
aws ssm put-parameter --name /DDBReplication/Table_A/AccessKey --value <access_key> --type String --profile $PROFILE_B 
aws ssm put-parameter --name /DDBReplication/Table_A/SecretKey --value <secret_key> --type SecureString --profile $PROFILE_B
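As a sanity check, the stored parameters can be read back with boto3. This is an illustrative sketch, not part of the repo: the parameter names simply mirror the commands above, and the SecureString value needs `WithDecryption=True`.

```python
# Illustrative sketch: read back the AKSK stored by the commands above.

def parameter_names(prefix: str) -> tuple:
    """Build the AccessKey/SecretKey parameter names from an SSM prefix."""
    return prefix + "AccessKey", prefix + "SecretKey"

def fetch_credentials(prefix: str, region: str) -> tuple:
    """Read the key pair from SSM; the SecureString needs WithDecryption=True."""
    import boto3  # imported lazily so parameter_names stays dependency-free
    ssm = boto3.client("ssm", region_name=region)
    access_name, secret_name = parameter_names(prefix)
    access = ssm.get_parameter(Name=access_name)["Parameter"]["Value"]
    secret = ssm.get_parameter(Name=secret_name,
                               WithDecryption=True)["Parameter"]["Value"]
    return access, secret

if __name__ == "__main__":
    # Hypothetical usage in region A, reading the region-B credentials
    print(fetch_credentials("/DDBReplication/Table_B/", "ap-southeast-1"))
```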

Update the parameters in cdk/app.py

Replace the following parameters in cdk/app.py:

  • REGION_A/REGION_B
  • PARAMETER_STORE_PREFIX
  • KEY_NAME
'''
0. Specify the regions in REGION_A and REGION_B for replication
'''
REGION_A = 'ap-southeast-1'
REGION_B = 'cn-north-1'

'''
1. Credentials in SSM Parameter Store for the target region.
CDK cannot deploy SecureString parameters, so the AKSK must be created in SSM
Parameter Store manually; provide the parameter prefix here, e.g.
for access_key, "/DDBReplication/TableCN/Access_Key" (StringType)
for secret_key, "/DDBReplication/TableCN/Secret_Key" (SecureStringType)
'''
PARAMETER_STORE_PREFIX = {
    REGION_A:'/DDBReplication/TableCN/',  # IMPORTANT! This is path to the AKSK to access REGION_B
    REGION_B:'/DDBReplication/TableSG/'  # IMPORTANT! This is path to the AKSK to access REGION_A
    }
'''
2. Specify the existing key name here for SSH. 
'''
KEY_NAME = {
    REGION_A:'<key_pair_name_A>',  # Key pair for loader EC2 in REGION_A
    REGION_B:'<key_pair_name_B>'  # Key pair for loader EC2 in REGION_B
    }

Deploy CDK stacks

There are two CDK stacks per region (four stacks in total):

  • Source-dynamo-region-name: Stack setting up DynamoDB table, lambda function, VPC and NAT gateway, loader instance and loader statistics table
  • Replicator-region-name: Stack setting up Kinesis stream, replicator lambda, replicator statistics table
cd cdk/
cdk list
# Deploy each of the four stacks, using the profile that matches the stack's region
cdk deploy <stack_name> --profile $PROFILE_A  # or $PROFILE_B
# In output of the stack, take note of the loader instance DNS name

Initialize the statistics table

Set the initial count in both the loader_stats and replicator_stats tables

aws dynamodb put-item --table-name loader_stats --item '{ "PK": {"S":"loaded_count"}, "cnt": {"N":"0"}}' --profile $PROFILE_A
aws dynamodb put-item --table-name loader_stats --item '{ "PK": {"S":"loaded_count"}, "cnt": {"N":"0"}}' --profile $PROFILE_B
aws dynamodb put-item --table-name replicator_stats --item '{ "PK": {"S":"replicated_count"}, "cnt": {"N":"0"}}' --profile $PROFILE_A
aws dynamodb put-item --table-name replicator_stats --item '{ "PK": {"S":"replicated_count"}, "cnt": {"N":"0"}}' --profile $PROFILE_B
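The same initialization can be done from Python. The sketch below is an illustration, not part of the repo: it builds the low-level DynamoDB JSON item used by the CLI commands above and shows how to read a counter back.

```python
def counter_item(pk: str) -> dict:
    """Low-level DynamoDB JSON for a zeroed counter, matching the CLI above."""
    return {"PK": {"S": pk}, "cnt": {"N": "0"}}

def counter_value(get_item_response: dict) -> int:
    """Extract the numeric counter from a GetItem response."""
    return int(get_item_response["Item"]["cnt"]["N"])

if __name__ == "__main__":
    import boto3  # hypothetical usage against the deployed tables
    ddb = boto3.client("dynamodb", region_name="ap-southeast-1")
    ddb.put_item(TableName="loader_stats", Item=counter_item("loaded_count"))
    resp = ddb.get_item(TableName="loader_stats",
                        Key={"PK": {"S": "loaded_count"}})
    print(counter_value(resp))
```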

Load test

Install dependencies on the loader instance

ssh -i <key_pair> ec2-user@<loader instance DNS name>
sudo yum install python3 -y
python3 -m venv my_app/env
source ~/my_app/env/bin/activate
pip install pip --upgrade
pip install boto3
echo "source ${HOME}/my_app/env/bin/activate" >> ${HOME}/.bashrc
source ~/.bashrc
pip install faker  # uuid ships with the Python standard library; no separate install needed
pip install --upgrade awscli

Run the load test with load_items.py. The tool generates fake user-profile items in the table and records the number of items it has loaded in loader_stats. To simulate data ingestion in both regions, run load_items.py simultaneously on the loader instance in each region.

git clone https://github.com/aws-samples/aws-dynamodb-cross-region-replication.git
cd aws-dynamodb-cross-region-replication
python3 load_items.py -t user_cdk-cn-north-1 -r cn-north-1 -n 10000
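For orientation, one loader iteration roughly amounts to the following. This is an illustrative sketch only; the actual attribute names and structure in load_items.py may differ.

```python
import uuid

def fake_profile_item(name: str, email: str) -> dict:
    """One synthetic user-profile item in low-level DynamoDB JSON."""
    return {
        "PK": {"S": str(uuid.uuid4())},  # random key spreads writes evenly
        "name": {"S": name},
        "email": {"S": email},
    }

if __name__ == "__main__":
    import boto3
    from faker import Faker  # installed on the loader instance above
    fake = Faker()
    ddb = boto3.client("dynamodb", region_name="cn-north-1")
    for _ in range(10):
        ddb.put_item(TableName="user_cdk-cn-north-1",
                     Item=fake_profile_item(fake.name(), fake.email()))
        # bump the loaded_count counter atomically after each write
        ddb.update_item(TableName="loader_stats",
                        Key={"PK": {"S": "loaded_count"}},
                        UpdateExpression="ADD cnt :one",
                        ExpressionAttributeValues={":one": {"N": "1"}})
```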

To see the help text for load_items.py:

python3 load_items.py -h
usage: load_items.py [-h] [-n N] -r R [-b] -t T

optional arguments:
  -h, --help  show this help message and exit
  -n N        Number of items to generate and write to DynamoDB table
  -r R        Region of source DynamoDB table
  -b          Write to DynamoDB table using batched write for higher load
  -t T        DynamoDB table to load on
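The -b option maps naturally onto DynamoDB's BatchWriteItem API, which accepts at most 25 items per call, so any batched loader has to chunk its items. A minimal sketch of that chunking (not the repo's actual code):

```python
def chunks(items, size=25):
    """Split a sequence into BatchWriteItem-sized chunks (25-item API limit)."""
    items = list(items)
    return [items[i:i + size] for i in range(0, len(items), size)]
```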

Monitoring

The CDK setup creates the following three metrics in both regions; the recommended statistics are listed alongside:

Metric Name        Where to find                                       Recommended statistics
Total_loaded       CloudWatch metrics -> DDB-Loader -> loader          Maximum over 10 seconds
Total_replicated   CloudWatch metrics -> DDB-Replicator -> replicator  Maximum over 10 seconds
Updated_count      CloudWatch metrics -> DDB-Replicator -> replicator  Sum over 10 seconds

To measure the replication lag of the test, compare Total_loaded in region A against Total_replicated in region B; the replication lag is the time offset between the two metric timelines.
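Given the two counter series pulled from CloudWatch (for example via boto3's get_metric_data), the lag can be read off as the time it takes Total_replicated to catch up to the final Total_loaded value. A hedged sketch of that comparison:

```python
def replication_lag(loaded, replicated):
    """
    loaded / replicated: lists of (timestamp_seconds, cumulative_count),
    sorted by time. Returns how many seconds after the loader reached its
    final count the replicator reached that same count.
    """
    target = loaded[-1][1]
    t_loaded = next(t for t, c in loaded if c >= target)
    t_replicated = next(t for t, c in replicated if c >= target)
    return t_replicated - t_loaded
```

For example, if the loader hits 200 items at t=10s and the replicator hits 200 at t=25s, the lag is 15 seconds.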