Deployment

Jon Riecke edited this page Jan 31, 2023 · 4 revisions

Introduction

Planscape is a wildfire resilience planning application being developed by the California Natural Resources Agency (CNRA) in conjunction with the USDA Forest Service. It is structured as a web service with three main components:

  • A frontend web application, built in the Angular framework with Leaflet for working with maps;
  • A backend, built in Django REST with GeoDjango extensions; and
  • A database, specifically the PostGIS extensions of Postgres.

GitHub provides both the code repository and the continuous integration/continuous deployment (CI/CD) environment.

CNRA will deploy Planscape on Amazon Web Services (AWS). This document describes:

  • The basic architecture of Planscape,
  • The AWS components needed for Planscape,
  • How Planscape components map onto AWS, and
  • How Planscape is deployed from GitHub to AWS.

It also gives some tips for setting up new instances and debugging, and lists open problems.

Basic architecture

The diagram below shows the basic architecture of the deployed Planscape application, where arrows indicate data flows.

Architecture (3)
  1. The browser sends an HTTP request to the URL of the AWS Elastic Load Balancer.

  2. The load balancer selects an ECS task from a pool of running tasks and forwards the request. Tasks are the fundamental unit in ECS: they are abstract computational nodes that map onto concrete EC2 instances. Each EC2 instance runs two Docker containers.

  3. The request hits the frontend Docker container, which runs the Nginx proxy service. It checks the request path for the "/planscape-backend" prefix.

    1. If the path does not start with the prefix, it answers the request directly, e.g., by serving static files or parts of the Angular JavaScript application.
    2. If the path starts with the prefix, it forwards the request to the second Docker container, specifically to the "uwsgi" proxy. The red arrow indicates that the connection uses a socket file rather than a network port.
  4. The second Docker container runs several instances of the Django backend (for scaling purposes).

    1. If the Django backend needs data from the database, it makes requests to the PostGIS database listening on port 5432.
    2. Requests and responses are cached in ElastiCache, listening on port 11211.
  5. A separate flow to fetch tiles goes directly to AWS S3 storage.
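The prefix check in step 3 can be illustrated with a small shell function. This is only a sketch of the decision logic, not the Nginx configuration itself; the function name route_request is hypothetical.

```shell
#!/bin/sh
# Sketch of the frontend container's routing decision.
# The real rule lives in the Nginx configuration; this only mirrors its logic.
BACKEND_PREFIX="/planscape-backend"

# Print which container would answer a request for the given path.
route_request() {
  case "$1" in
    "$BACKEND_PREFIX"*) echo "backend" ;;   # forwarded to uwsgi over the socket
    *)                  echo "frontend" ;;  # answered directly by Nginx
  esac
}

route_request "/planscape-backend/plan/"   # prints "backend"
route_request "/index.html"                # prints "frontend"
```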

This architecture has several advantages from a security perspective:

  • The two Docker containers are isolated from each other except for the socket connection. The Django backend is not exposed to the outside world.
  • Only the necessary ports are exposed.
  • It is possible to control access to PostGIS and ElastiCache.

AWS component overview

Following this architecture, a development version of Planscape has been deployed on AWS in the "us-west-1" region.

The main Planscape application (see above diagram) uses the following components and configurations:

  • Identity and Access Management (IAM): Manages AWS Users, Groups, and Roles.
  • AWS Systems Manager: Stores parameters for the Planscape application which are referenced in the task. At the moment, only the following secret parameters are stored here in encrypted form:
    • PlanscapeDatabasePassword: password to the Postgres database
    • PlanscapeDjangoKey: Django key, used for cryptographic signing
  • Elastic Container Registry (ECR): AWS storage for Docker container images.
    • planscape_dev is the repository for Docker container images
    • TODO: Set up a Lifecycle Policy to cull old, unused images
  • Elastic Container Service (ECS): Infrastructure for launching Docker containers on EC2
    • PlanscapeDevCluster: AWS collection of services and tasks.
    • PlanscapeDevService: AWS mechanism for managing tasks. This can be configured to scale the number of tasks up or down automatically, with the tasks accessible through the Elastic Load Balancer (see below).
    • PlanscapeDevTask: Configuration settings for AWS tasks. Each time Planscape is built with the GitHub action, a new version of PlanscapeDevTask is created with references to the new container objects in ECR. The task definition also contains startup parameters for the containers, some of which are pulled from the AWS Systems Manager.
  • Elastic Load Balancer (ELB): AWS mechanism for mapping a stable URL to a set of tasks, balancing load among the tasks.
  • EC2: Actual compute instances for the tasks.
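One possible way to address the Lifecycle Policy TODO for the planscape_dev repository is sketched below. The function name and the retention count are assumptions. Note that ECR cannot tell whether an image is still in use, so this policy simply keeps the newest images and expires the rest.

```shell
#!/bin/sh
# Print an ECR lifecycle policy that expires all but the newest N images.
# N (first argument, default 10) is an illustrative assumption.
lifecycle_policy_json() {
  keep="${1:-10}"
  cat <<EOF
{
  "rules": [
    {
      "rulePriority": 1,
      "description": "Keep only the newest ${keep} images",
      "selection": {
        "tagStatus": "any",
        "countType": "imageCountMoreThan",
        "countNumber": ${keep}
      },
      "action": { "type": "expire" }
    }
  ]
}
EOF
}

# To apply it (requires AWS credentials; not run here):
#   lifecycle_policy_json 10 > policy.json
#   aws ecr put-lifecycle-policy --repository-name planscape_dev \
#       --lifecycle-policy-text file://policy.json
```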

The Planscape application also connects to

Deployment

Planscape is deployed to AWS via GitHub Actions. The file .github/workflows/deploy_aws.yaml specifies the following steps, some of which use the AWS CLI.

  1. Configure AWS credentials: uses the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, which are stored in GitHub secrets.

  2. Log in to AWS ECR.

  3. Create new tags for the Docker containers that will be built.

  4. Build the frontend Docker container, substituting the URL of the Elastic Load Balancer into the Angular environment, and push it to ECR.

  5. Build the backend Docker container and push it to ECR.

  6. Update the skeleton ECS task definition stored in

    src/deployment/task_definition.json

    with the image name of the backend Docker container.

  7. Update the task definition with the image name of the frontend Docker container.

  8. Deploy the new task definition to ECS via PlanscapeDevService and wait for a restart.
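For debugging outside of GitHub, the steps above correspond roughly to the AWS CLI and Docker commands sketched below. This is not the content of deploy_aws.yaml: the account ID, tag scheme, build directories, and rendered task definition file name are all hypothetical, and the script assumes the task definition file already contains the new image names (steps 6 and 7). By default it only prints the commands.

```shell
#!/bin/sh
# Dry-run sketch of the deploy steps. Set DRY_RUN=0 (with AWS credentials
# configured) to execute the commands instead of printing them.
REGION="us-west-1"
CLUSTER="PlanscapeDevCluster"
SERVICE="PlanscapeDevService"
REPO="planscape_dev"
# Hypothetical values: the account ID and tag scheme are assumptions.
REGISTRY="${REGISTRY:-123456789012.dkr.ecr.${REGION}.amazonaws.com}"
TAG="${TAG:-$(date +%Y%m%d%H%M%S)}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Step 2: log Docker in to ECR.
ecr_login() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ aws ecr get-login-password --region $REGION | docker login --username AWS --password-stdin $REGISTRY"
  else
    aws ecr get-login-password --region "$REGION" |
      docker login --username AWS --password-stdin "$REGISTRY"
  fi
}

# Steps 4 and 5: $1 = frontend or backend, $2 = Docker build directory.
build_and_push() {
  image="$REGISTRY/$REPO:$1-$TAG"
  run docker build -t "$image" "$2"
  run docker push "$image"
}

# Step 8: $1 = task definition file with image names already filled in.
deploy_task() {
  run aws ecs register-task-definition --region "$REGION" \
    --cli-input-json "file://$1"
  run aws ecs update-service --region "$REGION" --cluster "$CLUSTER" \
    --service "$SERVICE" --task-definition PlanscapeDevTask
  run aws ecs wait services-stable --region "$REGION" \
    --cluster "$CLUSTER" --services "$SERVICE"
}

deploy() {
  ecr_login
  build_and_push frontend path/to/frontend    # hypothetical directory
  build_and_push backend path/to/backend      # hypothetical directory
  deploy_task rendered_task_definition.json   # hypothetical file name
}

deploy
```

Passing only the family name PlanscapeDevTask to update-service selects the latest active revision, which is the one just registered.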

Howto

Create an instance of RDS Postgres

  1. Log in to the Amazon RDS console.

  2. Click "Create database".

  3. On the next page,

    1. In "Choose a database creation method", select "Standard create". In "Engine options", select "PostgreSQL", and set Engine Version to "PostgreSQL 14.5-R1".

    2. In "Templates", select "Dev/Test". In "Settings", fill in a "DB instance identifier" (e.g., planscape-prod) and select "Auto generate a password".

    3. In "Instance configuration", select "Burstable classes" and "db.t3.micro".

    4. In "Connectivity", select "Don't connect to an EC2 compute resource" and select "Yes" for public access; this allows users outside of AWS to connect to and update the database.

    5. Still in "Connectivity", select "Choose existing" for the VPC security group and pick an existing group under "Existing VPC security groups" (e.g., PlanscapeDevSecurity). In "Database authentication", select "Password and IAM database authentication".

    6. A banner will appear at the top. You must click "View credential details" when done to get the password assigned for the superuser "postgres" account on the database.

      Write this down, along with the location of the database, e.g., planscape-prod.c0aecsgfmxbc.us-west-1.rds.amazonaws.com
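The console steps above have a rough AWS CLI equivalent, sketched below under several assumptions: the storage size and security group ID are hypothetical, the engine version may need updating to one RDS currently offers, and the password is passed explicitly because the console's "Auto generate a password" option has no direct counterpart here. By default the script only prints the commands.

```shell
#!/bin/sh
# Dry-run sketch: CLI equivalent of creating the RDS Postgres instance.
# Set DRY_RUN=0 (with AWS credentials configured) to execute.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# $1 = DB instance identifier, $2 = security group ID, $3 = postgres password
create_planscape_db() {
  run aws rds create-db-instance \
    --region us-west-1 \
    --db-instance-identifier "$1" \
    --engine postgres \
    --engine-version 14.5 \
    --db-instance-class db.t3.micro \
    --allocated-storage 20 \
    --master-username postgres \
    --master-user-password "$3" \
    --vpc-security-group-ids "$2" \
    --publicly-accessible \
    --enable-iam-database-authentication
}

# Print the host name of the database once the instance is available.
db_endpoint() {
  run aws rds describe-db-instances --region us-west-1 \
    --db-instance-identifier "$1" \
    --query 'DBInstances[0].Endpoint.Address' --output text
}

# The security group ID and password below are placeholders.
create_planscape_db planscape-prod sg-0123456789abcdef0 'change-me'
db_endpoint planscape-prod
```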

Configure Postgres

  1. If you haven't already, install the Postgres tools (in particular, psql and createuser) on your local machine.

  2. Choose a good password for the "planscape" Postgres user.

  3. From the root directory containing the Planscape code, run the following commands to create the database:

    export DB_LOC=<URL of database from above>
    export POSTGRES_PWD=<password for postgres user from above>
    export PLANSCAPE_PWD=<password for planscape user>
    createuser -h $DB_LOC -d -l -P -U postgres -W planscape
    

    This will prompt for PLANSCAPE_PWD twice, then for POSTGRES_PWD.

  4. Then run

    psql "postgresql://$DB_LOC/postgres?user=planscape&password=$PLANSCAPE_PWD" -c "CREATE DATABASE planscape;"
    psql "postgresql://$DB_LOC/postgres?user=postgres&password=$POSTGRES_PWD" -f src/deployment/rds_postgis_setup.sql
    

Useful Postgres commands

Some useful commands, using the environment variables above:

  1. Drop the "planscape" database:

    # Connect to the "postgres" database: Postgres cannot drop the database you are connected to.
    psql "postgresql://$DB_LOC/postgres?user=planscape&password=$PLANSCAPE_PWD" -c "DROP DATABASE planscape;"
    
  2. Create the "planscape" database:

    psql "postgresql://$DB_LOC/postgres?user=planscape&password=$PLANSCAPE_PWD" -c "CREATE DATABASE planscape;"
    
  3. Dump the contents of the database to a file:

    pg_dump "postgresql://$DB_LOC/planscape?user=planscape&password=$PLANSCAPE_PWD" -f file.sql
    
  4. Restore the contents of the database from a file:

    psql "postgresql://$DB_LOC/planscape?user=planscape&password=$PLANSCAPE_PWD" -f file.sql
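The commands in this section put the password in the connection URL, where it can end up in shell history. A possible alternative, sketched here with a hypothetical helper named pg_uri, is to build the URL without the password and let psql and pg_dump read it from the PGPASSWORD environment variable.

```shell
#!/bin/sh
# Build a Postgres connection URI without a password.
# $1 = host, $2 = database, $3 = user
pg_uri() {
  printf 'postgresql://%s@%s/%s\n' "$3" "$1" "$2"
}

# Example (requires a reachable database; not run here):
#   PGPASSWORD="$PLANSCAPE_PWD" pg_dump "$(pg_uri "$DB_LOC" planscape planscape)" -f file.sql
#   PGPASSWORD="$PLANSCAPE_PWD" psql "$(pg_uri "$DB_LOC" planscape planscape)" -f file.sql
```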
    

Installing the AWS CLI

AWS provides a CLI for interacting with ECR, EC2, etc. The Getting Started guide is a good place to start. To install:

  1. Linux:

    curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
    unzip awscliv2.zip
    sudo ./aws/install
    
  2. macOS:

    curl "https://awscli.amazonaws.com/AWSCLIV2.pkg" -o "AWSCLIV2.pkg"
    sudo installer -pkg AWSCLIV2.pkg -target /
    

Debugging in EC2

You can ssh into an EC2 instance for debugging, examining logs from containers, etc.

To set this up, you must first create a key:

  1. Open the Key Pairs page in the EC2 Management Console and click "Create key pair".

  2. In the "Create key pair" screen, give the key a name and click "Create key pair".

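    This will download a .pem file onto your local machine. Restrict its permissions (e.g., chmod 400 MyKey.pem), since ssh refuses to use private key files that other users can read.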

  3. Log in via SSH to one of the EC2 instances, e.g.,

    ssh -i "MyKey.pem" ec2-user@ec2-13-57-224-68.us-west-1.compute.amazonaws.com
    

    To avoid passing the "-i" flag in subsequent SSH sessions, add the key to your SSH agent (e.g., ssh-add MyKey.pem).

Some useful commands on EC2 instances:

  4. List the Docker containers installed on the instance, including those not running:

    docker container ls -a
    
  5. Show the logs from a container:

    docker container logs <container ID>
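When it is not clear which container failed, the two Docker commands can be combined. The helper below is a sketch (the name recent_logs is hypothetical) that prints the last lines of the logs of every container on the instance, including stopped ones.

```shell
#!/bin/sh
# Print the last N log lines (default 50) of every container on this instance.
recent_logs() {
  lines="${1:-50}"
  # -a includes stopped containers; -q prints only the container IDs.
  for id in $(docker container ls -a -q); do
    echo "== $id =="
    docker container logs --tail "$lines" "$id" 2>&1
  done
}

# Usage on the EC2 instance (requires Docker; not run here):
#   recent_logs 100
```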
    

Enable a small number of calls from Stadia Maps

  1. Log in at https://client.stadiamaps.com/

  2. Specify the domain of the Elastic Load Balancer, e.g., planscapedevload-1541713932.us-west-1.elb.amazonaws.com. We have 2.5k free views per day.

Remaining problems

  • P1: Set up auto scaling for EC2 Instances.
  • P1: Set up load testing.
  • P1: Set up Google Analytics, or some other analytics system.
  • P2: Run the servers in the containers with a non-root user.
    • Problem: sharing the socket file descriptor seems to require root access