Add details about CloudFront
As part of putting this project to rest (edgi-govdata-archiving/web-monitoring#168), I put the production API behind CloudFront & WAF. This adds documentation for the current configuration.

Fixes #42.
Mr0grog authored Feb 16, 2023
1 parent 624716a commit c87dc97
Showing 1 changed file with 24 additions and 0 deletions.
24 changes: 24 additions & 0 deletions manually-managed/README.md
@@ -26,6 +26,30 @@ TBD
For details, see [`rds/README.md`](./rds/README.md).


## CloudFront & WAF

To protect the production instance of the API from abuse, we point the DNS records for the API to CloudFront (AWS’s CDN) instead of directly to the API service’s load balancer. We also add some WAF (firewall) rules to the CloudFront distribution.

- CloudFront…
    - Needs a separate SSL certificate in the `us-east-1` (N. Virginia) region. It’s set up the same way certificates are set up for the rest of the Kubernetes cluster in AWS Certificate Manager.
    - Is in the NA/Europe price class.
    - The origin:
        - Uses the domain of the API service in the Kubernetes cluster (not a direct reference to the load balancer).
        - Is HTTPS-only.
        - Uses origin shield.
    - Has a default behavior that:
        - Redirects HTTP to HTTPS
        - Allows all HTTP methods
        - Caches GET, HEAD, OPTIONS
        - Forwards all headers to the origin (the `AllViewer` origin request policy)
    - Has a cache policy that:
        - Includes `Authorization` and `Accept` headers and `_webpage-versions-db_session` cookies, and all query strings in the cache key.
        - Compression support is enabled.
- There is a WAF ACL attached to the CloudFront distribution.
    - It uses the `AWS-AWSManagedRulesKnownBadInputsRuleSet` built-in rule set.
    - It uses a `per-ip-rate-limit` rule to block IP addresses requesting over a certain rate.
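
For anyone recreating the WAF ACL by hand, the two rules above might look roughly like the following when written in the shape the WAFv2 API expects. This is only a sketch, not a dump of the real configuration: the rate limit value, the rule priorities, and the metric names are all assumptions (the actual limit is not recorded here), and `build_waf_rules` and `visibility` are hypothetical helper names.

```python
import json

# ASSUMPTION: the real limit is not documented; this value is a placeholder.
# WAF rate-based rules count requests per IP over a 5-minute window by default.
ASSUMED_RATE_LIMIT = 1000


def visibility(metric_name):
    """Build the VisibilityConfig block that every WAFv2 rule requires."""
    return {
        "SampledRequestsEnabled": True,
        "CloudWatchMetricsEnabled": True,
        "MetricName": metric_name,  # metric names here are assumptions
    }


def build_waf_rules(rate_limit=ASSUMED_RATE_LIMIT):
    """Return the two rules described above as WAFv2-style dicts."""
    managed = {
        "Name": "AWS-AWSManagedRulesKnownBadInputsRuleSet",
        "Priority": 0,  # assumed ordering
        "Statement": {
            "ManagedRuleGroupStatement": {
                "VendorName": "AWS",
                "Name": "AWSManagedRulesKnownBadInputsRuleSet",
            }
        },
        # Managed rule groups take OverrideAction rather than Action.
        "OverrideAction": {"None": {}},
        "VisibilityConfig": visibility("known-bad-inputs"),
    }
    rate = {
        "Name": "per-ip-rate-limit",
        "Priority": 1,  # assumed ordering
        "Statement": {
            "RateBasedStatement": {
                "Limit": rate_limit,
                "AggregateKeyType": "IP",
            }
        },
        "Action": {"Block": {}},
        "VisibilityConfig": visibility("per-ip-rate-limit"),
    }
    return [managed, rate]


if __name__ == "__main__":
    print(json.dumps(build_waf_rules(), indent=2))
```

A list like this could be passed as the `Rules` argument when creating a web ACL with `Scope="CLOUDFRONT"`, which, like the certificate, has to be done in `us-east-1`.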


## ETL

We currently run scheduled scripts for extracting data from external services (Versionista, the Wayback Machine) and sending it to [web-monitoring-db][-db] to be imported. These are managed via `cron` on a single EC2 VM.
