- AWS global region account access (and IAM Access Key and Secret Key)
- AWS China region account access (and IAM Access Key and Secret Key)
- AWS CDK (see Getting Started with AWS CDK)
- AWS CLI
- SSH key pair in both accounts (used to access the load testing machines)
```shell
git clone https://github.com/aws-samples/aws-dynamodb-cross-region-replication.git
```
Since AWS global regions and AWS China regions belong to separate account systems, an IAM access key and secret key (AKSK) is needed to access resources in the target region. Each region's AKSK for the *other* region is stored in SSM Parameter Store, with the secret key kept as a SecureString.
```shell
PROFILE_A=<profile for region A>
PROFILE_B=<profile for region B>

# AKSK for accessing region B, stored in SSM of region A
aws ssm put-parameter --name /DDBReplication/Table_B/AccessKey --value <access_key> --type String --profile $PROFILE_A
aws ssm put-parameter --name /DDBReplication/Table_B/SecretKey --value <secret_key> --type SecureString --profile $PROFILE_A

# AKSK for accessing region A, stored in SSM of region B
aws ssm put-parameter --name /DDBReplication/Table_A/AccessKey --value <access_key> --type String --profile $PROFILE_B
aws ssm put-parameter --name /DDBReplication/Table_A/SecretKey --value <secret_key> --type SecureString --profile $PROFILE_B
```
Replace the following parameters in `cdk/app.py`:
- REGION_A/REGION_B
- PARAMETER_STORE_PREFIX
- KEY_NAME
```python
'''
0. Specify the regions in REGION_A and REGION_B for replication
'''
REGION_A = 'ap-southeast-1'
REGION_B = 'cn-north-1'

'''
1. Credential settings in SSM Parameter Store for the target region.
CDK cannot deploy SecureString parameters for the AKSK, so you need to create
the SSM parameters manually and provide the parameter prefix here, e.g.
for the access key, "/DDBReplication/TableCN/AccessKey" (String)
for the secret key, "/DDBReplication/TableCN/SecretKey" (SecureString)
'''
PARAMETER_STORE_PREFIX = {
    REGION_A: '/DDBReplication/TableCN/',  # IMPORTANT! Path to the AKSK used to access REGION_B
    REGION_B: '/DDBReplication/TableSG/'   # IMPORTANT! Path to the AKSK used to access REGION_A
}

'''
2. Specify the existing key pair names here for SSH.
'''
KEY_NAME = {
    REGION_A: '<key_pair_name_A>',  # Key pair for loader EC2 in REGION_A
    REGION_B: '<key_pair_name_B>'   # Key pair for loader EC2 in REGION_B
}
```
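The naming convention ties the two pieces of configuration together: appending `AccessKey` or `SecretKey` to a `PARAMETER_STORE_PREFIX` entry yields the SSM parameter names created with `aws ssm put-parameter` above. A minimal sketch of that convention (a hypothetical helper, not code from the repo):

```python
def credential_parameter_names(prefix: str) -> dict:
    """Return the SSM parameter names for the AKSK stored under `prefix`.

    Mirrors the naming convention used by the `aws ssm put-parameter`
    commands in this README: <prefix>AccessKey and <prefix>SecretKey.
    """
    # Tolerate prefixes written with or without the trailing slash.
    if not prefix.endswith("/"):
        prefix += "/"
    return {
        "access_key": prefix + "AccessKey",   # stored as a String parameter
        "secret_key": prefix + "SecretKey",   # stored as a SecureString parameter
    }

print(credential_parameter_names("/DDBReplication/Table_B/"))
```

The replicator deployed in one region would resolve credentials for the opposite region by reading these two parameters (the secret key with decryption enabled).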
There are two CDK stacks in each region (four stacks in total):
- Source-dynamo-region-name: sets up the DynamoDB table, Lambda function, VPC with NAT gateway, loader instance, and loader statistics table
- Replicator-region-name: sets up the Kinesis stream, replicator Lambda, and replicator statistics table
```shell
cd cdk/
cdk list

# Deploy each of the four stacks, using the profile that matches the stack's region
cdk deploy <stack_name> --profile $PROFILE_A   # or --profile $PROFILE_B

# In the output of each stack, take note of the loader instance DNS name
```
Set the initial counts in the loader_stats and replicator_stats tables in both regions:
```shell
aws dynamodb put-item --table-name loader_stats --item '{ "PK": {"S":"loaded_count"}, "cnt": {"N":"0"}}' --profile $PROFILE_A
aws dynamodb put-item --table-name loader_stats --item '{ "PK": {"S":"loaded_count"}, "cnt": {"N":"0"}}' --profile $PROFILE_B
aws dynamodb put-item --table-name replicator_stats --item '{ "PK": {"S":"replicated_count"}, "cnt": {"N":"0"}}' --profile $PROFILE_A
aws dynamodb put-item --table-name replicator_stats --item '{ "PK": {"S":"replicated_count"}, "cnt": {"N":"0"}}' --profile $PROFILE_B
```
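The `--item` JSON above uses DynamoDB's low-level attribute-value format, where every value is wrapped in a type descriptor (`"S"` for string, `"N"` for number, with numbers passed as strings). A small sketch of how these counter items are shaped:

```python
import json

def counter_item(pk: str, count: int = 0) -> dict:
    """Build a stats counter item in DynamoDB's low-level attribute-value format.

    "S" marks a string attribute; "N" marks a number, which DynamoDB
    expects serialized as a string (hence str(count)).
    """
    return {"PK": {"S": pk}, "cnt": {"N": str(count)}}

# Matches the --item argument of the first put-item command above.
print(json.dumps(counter_item("loaded_count")))
```

The loader and replicator can then update `cnt` atomically (e.g. with an `ADD` update expression) as items are written and replicated.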
Install dependencies on the loader instance:
```shell
ssh -i <key_pair> ec2-user@<loader instance DNS name>
sudo yum install python3 -y
python3 -m venv my_app/env
source ~/my_app/env/bin/activate
pip install pip --upgrade
pip install boto3
echo "source ${HOME}/my_app/env/bin/activate" >> ${HOME}/.bashrc
source ~/.bashrc
pip install faker   # uuid is part of the Python standard library
pip install --upgrade awscli
```
Load test with load_items.py. The tool generates fake user profile items in the table and updates the number of items it has loaded in loader_stats. To simulate data ingestion in both regions, run load_items.py simultaneously on the loader instance in each region.
```shell
git clone https://github.com/aws-samples/aws-dynamodb-cross-region-replication.git
cd aws-dynamodb-cross-region-replication
python3 load_items.py -t user_cdk-cn-north-1 -r cn-north-1 -n 10000
```
To see the help for load_items.py:
```shell
python3 load_items.py -h

usage: load_items.py [-h] [-n N] -r R [-b] -t T

optional arguments:
  -h, --help  show this help message and exit
  -n N        Number of items to generate and write to DynamoDB table
  -r R        Region of source DynamoDB table
  -b          Write to DynamoDB table using batched write for higher load
  -t T        DynamoDB table to load on
```
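The items the loader writes are randomly generated user profiles. A stdlib-only sketch of the idea (the real load_items.py uses the Faker library, and its exact attribute set differs; the attributes below are illustrative):

```python
import random
import uuid

def fake_user_item() -> dict:
    """Generate one fake user-profile item in DynamoDB's low-level format.

    A random UUID as the partition key keeps writes spread evenly across
    partitions, which matters when loading at high rates.
    """
    return {
        "PK": {"S": str(uuid.uuid4())},                     # unique partition key
        "name": {"S": f"user-{random.randint(0, 99999)}"},  # placeholder name
        "age": {"N": str(random.randint(18, 90))},          # numbers serialized as strings
    }

item = fake_user_item()
```

Each generated item would then be written with the low-level `put_item` (or `batch_write_item` when `-b` is passed, for higher load).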
The CDK setup creates the following three metrics in both regions, with the recommended statistics:
| Metric Name | Where to find | Recommended statistics |
|---|---|---|
| Total_loaded | CloudWatch metrics -> DDB-Loader -> loader | Maximum over 10 seconds |
| Total_replicated | CloudWatch metrics -> DDB-Replicator -> replicator | Maximum over 10 seconds |
| Updated_count | CloudWatch metrics -> DDB-Replicator -> replicator | Sum over 10 seconds |
To find the replication lag of the test, compare the Total_loaded metric in Region A with the Total_replicated metric in Region B. The replication lag is the time offset between the two metric timelines.
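One way to make that comparison concrete: for a given cumulative count, the lag is the difference between the time Total_replicated in Region B reaches that count and the time Total_loaded in Region A reached it. A sketch under that assumption (not code from the repo), with each timeline as a time-sorted list of `(timestamp_seconds, count)` samples:

```python
import bisect

def replication_lag(loaded, replicated, count):
    """Seconds between the loaded and replicated timelines reaching `count`.

    Both inputs are lists of (timestamp_seconds, cumulative_count) tuples,
    sorted by time; cumulative counts are non-decreasing, so bisect works.
    Returns None if either timeline has not reached `count` yet.
    """
    def first_time_at(series, target):
        counts = [c for _, c in series]
        i = bisect.bisect_left(counts, target)
        if i == len(series):
            return None  # target count not reached yet
        return series[i][0]

    t_loaded = first_time_at(loaded, count)
    t_replicated = first_time_at(replicated, count)
    if t_loaded is None or t_replicated is None:
        return None
    return t_replicated - t_loaded

loaded = [(0, 0), (10, 5000), (20, 10000)]       # Total_loaded in Region A
replicated = [(0, 0), (18, 5000), (32, 10000)]   # Total_replicated in Region B
print(replication_lag(loaded, replicated, 10000))  # 12
```

Reading both curves at the 10-second granularity recommended in the table above keeps the lag estimate's resolution consistent with the published metrics.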