This repository contains Python code for an AWS Lambda function that connects to an Amazon RDS PostgreSQL database, executes a SQL query, and writes the query result to Amazon S3 in Parquet format. The function is designed to be triggered by an AWS Lambda event.
This function can also be configured to run locally and write the Parquet file to a local folder.
-
lambda_function.py
: Contains the Lambda function's code. -
query.sql
: Contains the SQL query to be executed. -
local_lambda_runner.py
: Script for running the Lambda function locally. -
test_event.json
: Sample AWS Lambda event for running the Lambda function locally. -
test_lambda_function.py
: Unit tests for the Lambda function. -
Pipfile
: Dependency list for Pipenv. -
Pipfile.lock
: Dependency lock file for Pipenv. -
.github/workflows/deploy-to-lambda.yml
: GitHub Actions workflow for automating Lambda function deployment. -
.github/workflows/run-tests.yml
: GitHub Actions workflow for automating unit tests execution in PRs.
In the GitHub repository Actions secrets, make sure to configure the following secret variables:
Parameter | Description |
---|---|
AWS_ACCESS_KEY_ID | AWS access key ID. |
AWS_SECRET_ACCESS_KEY | AWS secret access key. |
ENVIRONMENT | Name of the environment (production ). |
DB_NAME_PROD | Name of the production database. |
DB_USER | Database username. |
DB_PASSWORD_PROD | Database password for production. |
DB_HOST_PROD | Database host. |
DB_PORT_PROD | Database port. |
FILE_NAME | Name of the output file. |
S3_PATH | Storage path for Amazon S3 (production). |
These are used by the GitHub Actions deployment workflow to authenticate with AWS and deploy and publish the Lambda function.
To use this Lambda function, follow these steps:
-
AWS Lambda Configuration:
- Create an AWS Lambda function and configure it to trigger based on your use case.
- Attach an execution role granting database read access and S3 write permissions.
- Attach the pre-built AWS Lambda Layer with Pandas and AWSWrangler
-
AWS IAM User for Lambda Deployment: Create an AWS IAM user with 3rd Party Access with the following inline policy. Make sure to replace the placeholders with your region, account number, and Lambda function name. Also, make sure to create the user's access key ID and secret access key and add them to your GitHub repository Secrets.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "VisualEditor0",
"Effect": "Allow",
"Action": [
"lambda:UpdateFunctionCode",
"lambda:UpdateFunctionConfiguration",
"lambda:GetFunctionConfiguration",
"lambda:PublishVersion"
],
"Resource": "arn:aws:lambda:{REGION}:{ACCOUNT_NUMBER}:function:{LAMBDA_FUNCTION_NAME}"
},
{
"Sid": "VisualEditor1",
"Effect": "Allow",
"Action": "lambda:PublishLayerVersion",
"Resource": "arn:aws:lambda:*:{ACCOUNT_NUMBER}:layer:*"
},
{
"Sid": "VisualEditor2",
"Effect": "Allow",
"Action": [
"lambda:GetLayerVersion",
"lambda:GetLayerVersionPolicy"
],
"Resource": "arn:aws:lambda:*:{ACCOUNT_NUMBER}:layer:*:*"
},
{
"Sid": "VisualEditor3",
"Effect": "Allow",
"Action": [
"lambda:ListLayerVersions",
"lambda:ListLayers"
],
"Resource": "*"
}
]
}
-
GitHub Repo Secrets: Set required environment variables in your GitHub repository Actions secrets, including database connection details (e.g.,
DB_NAME_PROD
,DB_USER
,DB_PASSWORD_PROD
, etc.), output file name (FILE_NAME
), storage paths (S3_PATH
), and AWS Configuration (AWS_ACCESS_KEY_ID
andAWS_SECRET_ACCESS_KEY
). -
SQL Query: Create your SQL query to be executed by the Lambda function and save it in
query.sql
. -
GitHub Actions Deployment: Automate deployment using GitHub Actions. Push changes to the main branch or trigger events to execute the deployment workflow defined in
.github/workflows/deploy-to-lambda.yml
.
Once triggered, the Lambda function connects to the specified PostgreSQL database, executes the SQL query, and writes the result to the chosen storage location (S3 or local). Monitor execution and view errors in the AWS Lambda console.
The Lambda function includes error handling, logging exceptions, and returning a 500 status code with an error message in the response if an error occurs.
This project is licensed under the MIT License. See the LICENSE file for details.
You are welcome to contribute to this project by submitting a pull request. If you have any suggestions or problems, please open an issue. Thank you!
Your support keeps this project going!
- ⭐️ Star: Show your appreciation by giving this project a star.
- ☕️ Buy Me a Coffee: Contribute by buying a virtual coffee.
- 💼 Sponsor This Project: Consider sponsoring for ongoing support.
Making a difference, one line of code at a time...