feat(examples): local zones examples #314
Merged
10 commits
4af1cd6
feat(examples): local zones examples
horsmand 8e031f8
Addressed feedback
horsmand b0c05b5
Addressed feedback round 2
horsmand a5fed82
Added removal policy for repo in python
horsmand 1c141c2
Updated RFDK version in example
horsmand 6d9453f
Addressed feedback round 3
horsmand 438736e
Updated RFDK version
horsmand fd37e2d
Updates based on PR comments
horsmand dec2bbd
Updated RFDK version
horsmand f10dd25
Fixing typos
horsmand
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
# RFDK Sample Application - Local Zones | ||
|
||
If your Worker instances need to access large asset files stored on your on-premises infrastructure, deploying the Workers to a geographically close AWS Local Zone can reduce latency and speed up your renders. This example walks you through setting up your Workers in a local zone while leaving the rest of the render farm in standard availability zones. At the time of writing, AWS has launched a local zone in Los Angeles that is part of the us-west-2 region, with more on the way. For more information on where local zones are available, how to get access, and what services they provide, refer to the [AWS Local Zones about page](https://aws.amazon.com/about-aws/global-infrastructure/localzones/). | ||
|
||
Before deploying your farm, you may want to read our [Connecting to the Render Farm](https://docs.aws.amazon.com/rfdk/latest/guide/connecting-to-render-farm.html#connecting-with-site-to-site-vpn) developer guide for guidance on how to create a connection from your local network to the farm using something like a VPN. All of the techniques listed in the guide require changes to the networking tier of your RFDK app to allow the connection. After your connection is set up, you will be able to configure your network file server to be available on your workers, so any local assets you have can be transferred as needed by the jobs they perform. | ||
|
||
--- | ||
|
||
_**Note:** This application is an illustrative example to showcase some of the capabilities of the RFDK. **It is not intended to be used for production render farms**, which should be built with more consideration of the security and operational needs of the system._ | ||
|
||
--- | ||
|
||
## Architecture | ||
|
||
This example app assumes you're familiar with the general architecture of an RFDK render farm. If not, please refer to the [All-In-AWS-Infrastructure-Basic](../All-In-AWS-Infrastructure-Basic/README.md) example for the basics. | ||
|
||
### Components | ||
|
||
#### Network Tier | ||
|
||
The network tier sets up a [VPC](https://aws.amazon.com/vpc/) that spans across all of the standard availability zones and local zones that are used, but the NAT Gateway for the VPC is only added to the standard zones, as it is not available in any local zones at this time. In this tier we override the Stack's `availabilityZones()` method, which returns the list of availability zones the Stack can use. It's by this mechanism that we control which zones the VPC will be deployed to. | ||
|
||
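The override mechanism described above can be illustrated with a small, self-contained sketch. It deliberately does not use the CDK: `StackStandIn`, `NetworkTierSketch`, `plan_zones`, and the zone lists are hypothetical stand-ins, and the real app would override the property on its CDK `Stack` subclass instead.

```python
from typing import Dict, List

# Hypothetical zone lists; in the sample app these would come from the config module.
STANDARD_ZONES = ["us-west-2a", "us-west-2b"]
LOCAL_ZONES = ["us-west-2-lax-1a"]


class StackStandIn:
    """Stand-in for a CDK Stack. This is NOT the real aws_cdk class."""

    @property
    def availability_zones(self) -> List[str]:
        # Default zone list that a framework would normally look up for the region.
        return ["us-west-2a", "us-west-2b", "us-west-2c"]


class NetworkTierSketch(StackStandIn):
    @property
    def availability_zones(self) -> List[str]:
        # Overriding the property limits the zones the VPC is allowed to span.
        return STANDARD_ZONES + LOCAL_ZONES


def plan_zones(stack: StackStandIn) -> Dict[str, Dict[str, bool]]:
    """Return one entry per zone; NAT gateways are only planned for standard zones."""
    return {
        zone: {"nat_gateway": zone in STANDARD_ZONES}
        for zone in stack.availability_zones
    }


if __name__ == "__main__":
    for zone, settings in plan_zones(NetworkTierSketch()).items():
        print(zone, settings)
```

Anything that asks the stack for its zones only sees the overridden list, which is why a single override is enough to control where the VPC's subnets are created.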
#### Security Tier | ||
|
||
This holds the root CA certificate used for signing any certificates required by the farm, such as the one used by the render queue. | ||
|
||
#### Service Tier | ||
|
||
The service tier contains the repository and render queue, both of which are given the standard availability zone subnets to deploy into. DocumentDB and EFS are not available in local zones at this time, so the repository, which is backed by them, cannot be moved there. Since the repository needs to be in a standard availability zone, there isn't any benefit to moving the render queue to a local zone. | ||
|
||
#### Compute Tier | ||
|
||
This tier holds the worker fleet and its health monitor. The health monitor contains a [Network Load Balancer](https://docs.aws.amazon.com/elasticloadbalancing/latest/network/introduction.html) used to perform application-level health checks and the worker fleet contains an [Auto Scaling Group](https://docs.aws.amazon.com/autoscaling/ec2/userguide/AutoScalingGroup.html). Currently, these services are available in all launched local zones, so the construct can be placed in those zones. | ||
|
||
## TypeScript | ||
|
||
[Continue to TypeScript specific documentation.](ts/README.md) | ||
|
||
## Python | ||
|
||
[Continue to Python specific documentation.](python/README.md) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
*.swp | ||
package-lock.json | ||
__pycache__ | ||
.pytest_cache | ||
.env | ||
*.egg-info | ||
venv | ||
build | ||
|
||
# CDK asset staging directory | ||
.cdk.staging | ||
cdk.out | ||
cdk.context.json | ||
stage |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,84 @@ | ||
# RFDK Sample Application - Local Zones - Python | ||
|
||
## Overview | ||
[Back to overview](../README.md) | ||
|
||
## Instructions | ||
|
||
--- | ||
**NOTE** | ||
|
||
These instructions assume that your working directory is `examples/deadline/Local-Zones/python/` relative to the root of the AWS-RFDK package. | ||
|
||
--- | ||
|
||
1. This sample app on the `mainline` branch may contain features that have not yet been officially released, and may not be available in the `aws-rfdk` package installed through pip from PyPI. To work from an example of the latest release, please switch to the `release` branch. If you would like to try out unreleased features, you can stay on `mainline` and follow the instructions for building, packing, and installing the `aws-rfdk` from your local repository. | ||
|
||
2. Install the dependencies of the sample app: | ||
|
||
```bash | ||
pip install -r requirements.txt | ||
``` | ||
|
||
3. If working on the `release` branch, this step can be skipped. If working on `mainline`, navigate to the base directory where the build and packaging scripts are, then run them and install the result over top of the `aws-rfdk` version that was installed in the previous step: | ||
```bash | ||
# Navigate to the root directory of the RFDK repository | ||
pushd ../../../.. | ||
# Enter the Docker container to run the build and pack scripts | ||
./scripts/rfdk_build_environment.sh | ||
./build.sh | ||
./pack.sh | ||
# Exit the Docker container | ||
exit | ||
# Navigate back to the example directory | ||
popd | ||
pip install ../../../../dist/python/aws-rfdk-<version>.tar.gz | ||
``` | ||
|
||
4. You must read and accept the [AWS Thinkbox End-User License Agreement (EULA)](https://www.awsthinkbox.com/end-user-license-agreement) to deploy and run Deadline. To do so, change the value of `accept_aws_thinkbox_eula` in `package/lib/config.py` like this: | ||
|
||
```py | ||
self.accept_aws_thinkbox_eula: AwsThinkboxEulaAcceptance = AwsThinkboxEulaAcceptance.USER_ACCEPTS_AWS_THINKBOX_EULA | ||
``` | ||
|
||
5. Change the value of the `deadline_version` variable in `package/lib/config.py` to specify the desired version of Deadline to be deployed to your render farm. RFDK is compatible with Deadline versions 10.1.9.x and later. To see the available versions of Deadline, consult the [Deadline release notes](https://docs.thinkboxsoftware.com/products/deadline/10.1/1_User%20Manual/manual/release-notes.html). It is recommended to use the latest version of Deadline available when building your farm, but to pin this version when the farm is ready for production use. For example, to pin to the latest `10.1.15.x` release of Deadline, use: | ||
|
||
```python | ||
self.deadline_version: str = '10.1.15' | ||
``` | ||
|
||
6. Change the value of the `deadline_client_linux_ami_map` variable in `package/lib/config.py` to include the region to AMI ID mapping of your EC2 AMI(s) with Deadline Worker. You can use the following AWS CLI command to look up AMIs, replacing the `<region>` and `<version>` to match the AWS region and Deadline version you're looking for: | ||
|
||
```bash | ||
aws --region <region> ec2 describe-images --owners 357466774442 --filters "Name=name,Values=*Worker*" "Name=name,Values=*<version>*" --query 'Images[*].[ImageId, Name]' --output text | ||
``` | ||
|
||
7. Also in `package/lib/config.py`, you can set the `availability_zones_standard` and `availability_zones_local` values to the availability zones you want to use. These values must all be from the same region. It's required that you use at least two standard zones, but you can use more if you'd like. For the local zones, you can use one or more. | ||
|
||
8. To gain the benefits of putting your workers in a local zone close to your asset server, you are going to want to set up a connection from your local network to the one you're creating in AWS. | ||
    1. You should start by reading through the [Connecting to the Render Farm](https://docs.aws.amazon.com/rfdk/latest/guide/connecting-to-render-farm.html) documentation and implementing one of the methods for connecting your network to your AWS VPC described there. | ||
    2. With whichever option you choose, you'll want to make sure you are propagating the worker subnets to your local network. All the options in the document show how to propagate all the private subnets, which will include the ones used by the workers. | ||
    3. Ensure your worker fleet's security group allows traffic from your network on the ports that your NFS requires to be open. The documentation shows how to [allow connections to the Render Queue](https://docs.aws.amazon.com/rfdk/latest/guide/connecting-to-render-farm.html#allowing-connection-to-the-render-queue), which you may also want to enable if you plan on connecting any of your local machines to your render farm. You will want to do something similar for the worker fleet. For example, if your file server needs ports `22` and `2049` open (`2049` is the standard NFS port), this code could be added to the `ComputeTier`: | ||
|
||
```python | ||
# Requires: from aws_cdk.aws_ec2 import Peer, Port | ||
# The customer-prefix-cidr-range needs to be replaced by the CIDR range for your local network that you used when configuring the VPC connection | ||
self.worker_fleet.connections.allow_from(Peer.ipv4('customer-prefix-cidr-range'), Port.tcp(22)) | ||
self.worker_fleet.connections.allow_from(Peer.ipv4('customer-prefix-cidr-range'), Port.udp(22)) | ||
self.worker_fleet.connections.allow_from(Peer.ipv4('customer-prefix-cidr-range'), Port.tcp(2049)) | ||
self.worker_fleet.connections.allow_from(Peer.ipv4('customer-prefix-cidr-range'), Port.udp(2049)) | ||
``` | ||
|
||
    4. Add user-data to mount the NFS on the compute tier. This can be provided in the `UserDataProvider` in the `ComputeTier`. | ||
    5. (optional) Set up [path mapping rules in Deadline](https://docs.thinkboxsoftware.com/products/deadline/10.1/1_User%20Manual/manual/cross-platform.html). | ||
|
||
9. Deploy all the stacks in the sample app: | ||
|
||
```bash | ||
cdk deploy "*" | ||
``` | ||
|
||
10. Once you are finished with the sample app, you can tear it down by running: | ||
|
||
```bash | ||
cdk destroy "*" | ||
``` |
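The zone and AMI constraints from steps 6 and 7 can be checked before running `cdk deploy`. The following is a minimal sketch, not part of the sample app: `region_of` and `check_zone_config` are hypothetical helpers, and the regular expression assumes zone names shaped like `us-west-2a` (standard) and `us-west-2-lax-1a` (local).

```python
import re
from typing import Dict, List

# Assumed naming pattern: group 1 captures the parent region, followed by either a
# single zone letter (standard zone) or a "-<city>-<number><letter>" suffix (local zone).
_ZONE_PATTERN = re.compile(r"([a-z]{2}(?:-[a-z]+)+-\d+)(?:[a-z]|-[a-z]+-\d+[a-z])")


def region_of(zone: str) -> str:
    """Return the region that a standard or local zone name belongs to."""
    match = _ZONE_PATTERN.fullmatch(zone)
    if match is None:
        raise ValueError(f"Unrecognized availability zone name: {zone}")
    return match.group(1)


def check_zone_config(standard: List[str], local: List[str], ami_map: Dict[str, str]) -> str:
    """Validate the zone lists and AMI map; return the single region they share."""
    if len(standard) < 2:
        raise ValueError("At least two standard availability zones are required.")
    if len(local) < 1:
        raise ValueError("At least one local zone is required.")
    regions = {region_of(zone) for zone in standard + local}
    if len(regions) != 1:
        raise ValueError(f"All zones must be in the same region, found: {sorted(regions)}")
    region = regions.pop()
    if region not in ami_map:
        raise ValueError(f"No Deadline Worker AMI configured for region {region}.")
    return region


if __name__ == "__main__":
    # The AMI ID below is a placeholder, not a real image.
    print(check_zone_config(
        ["us-west-2a", "us-west-2b"],
        ["us-west-2-lax-1a"],
        {"us-west-2": "ami-00000000000000000"},
    ))
```

Failing early with a `ValueError` keeps a misconfigured zone list from surfacing only partway through a CloudFormation deployment.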
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
{ | ||
"app": "python -m package.app" | ||
} |
Empty file.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,94 @@ | ||
#!/usr/bin/env python3 | ||
|
||
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. | ||
# SPDX-License-Identifier: Apache-2.0 | ||
|
||
import os | ||
|
||
from aws_cdk.core import ( | ||
    App, | ||
    Environment | ||
) | ||
from aws_cdk.aws_ec2 import ( | ||
    MachineImage | ||
) | ||
|
||
from .lib import ( | ||
    config, | ||
    network_tier, | ||
    security_tier, | ||
    service_tier, | ||
    compute_tier | ||
) | ||
|
||
|
||
def main(): | ||
    # ------------------------------ | ||
    # Validate Config Values | ||
    # ------------------------------ | ||
    if not config.config.key_pair_name: | ||
        print('EC2 key pair name not specified. You will not have SSH access to the render farm.') | ||
|
||
    # ------------------------------ | ||
    # Application | ||
    # ------------------------------ | ||
    app = App() | ||
|
||
    if 'CDK_DEPLOY_ACCOUNT' not in os.environ and 'CDK_DEFAULT_ACCOUNT' not in os.environ: | ||
        raise ValueError('You must define either CDK_DEPLOY_ACCOUNT or CDK_DEFAULT_ACCOUNT in the environment.') | ||
    if 'CDK_DEPLOY_REGION' not in os.environ and 'CDK_DEFAULT_REGION' not in os.environ: | ||
        raise ValueError('You must define either CDK_DEPLOY_REGION or CDK_DEFAULT_REGION in the environment.') | ||
    env = Environment( | ||
        account=os.environ.get('CDK_DEPLOY_ACCOUNT', os.environ.get('CDK_DEFAULT_ACCOUNT')), | ||
        region=os.environ.get('CDK_DEPLOY_REGION', os.environ.get('CDK_DEFAULT_REGION')) | ||
    ) | ||
|
||
    # ------------------------------ | ||
    # Network Tier | ||
    # ------------------------------ | ||
    network = network_tier.NetworkTier( | ||
        app, | ||
        'NetworkTier', | ||
        env=env | ||
    ) | ||
|
||
    # ------------------------------ | ||
    # Security Tier | ||
    # ------------------------------ | ||
    security = security_tier.SecurityTier( | ||
        app, | ||
        'SecurityTier', | ||
        env=env | ||
    ) | ||
|
||
    # ------------------------------ | ||
    # Service Tier | ||
    # ------------------------------ | ||
    service_props = service_tier.ServiceTierProps( | ||
        vpc=network.vpc, | ||
        availability_zones=config.config.availability_zones_standard, | ||
        root_ca=security.root_ca, | ||
        dns_zone=network.dns_zone, | ||
        deadline_version=config.config.deadline_version, | ||
        accept_aws_thinkbox_eula=config.config.accept_aws_thinkbox_eula | ||
    ) | ||
    service = service_tier.ServiceTier(app, 'ServiceTier', props=service_props, env=env) | ||
|
||
    # ------------------------------ | ||
    # Compute Tier | ||
    # ------------------------------ | ||
    deadline_client_image = MachineImage.generic_linux(config.config.deadline_client_linux_ami_map) | ||
    compute_props = compute_tier.ComputeTierProps( | ||
        vpc=network.vpc, | ||
        availability_zones=config.config.availability_zones_local, | ||
        render_queue=service.render_queue, | ||
        worker_machine_image=deadline_client_image, | ||
        key_pair_name=config.config.key_pair_name, | ||
    ) | ||
    _compute = compute_tier.ComputeTier(app, 'ComputeTier', props=compute_props, env=env) | ||
|
||
    app.synth() | ||
|
||
|
||
if __name__ == '__main__': | ||
    main() |
Empty file.
111 changes: 111 additions & 0 deletions
examples/deadline/Local-Zone/python/package/lib/compute_tier.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,111 @@ | ||
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. | ||
# SPDX-License-Identifier: Apache-2.0 | ||
|
||
from dataclasses import dataclass | ||
from typing import ( | ||
    List, | ||
    Optional | ||
) | ||
|
||
from aws_cdk.core import ( | ||
    Construct, | ||
    Stack, | ||
    StackProps | ||
) | ||
from aws_cdk.aws_ec2 import ( | ||
    IMachineImage, | ||
    InstanceClass, | ||
    InstanceSize, | ||
    InstanceType, | ||
    IVpc, | ||
    SubnetSelection, | ||
    SubnetType | ||
) | ||
|
||
from aws_rfdk import ( | ||
    HealthMonitor, | ||
    SessionManagerHelper | ||
) | ||
from aws_rfdk.deadline import ( | ||
    InstanceUserDataProvider, | ||
    IRenderQueue, | ||
    WorkerInstanceFleet | ||
) | ||
|
||
|
||
@dataclass | ||
class ComputeTierProps(StackProps): | ||
    """ | ||
    Properties for ComputeTier | ||
    """ | ||
    # The VPC to deploy resources into. | ||
    vpc: IVpc | ||
    # The availability zones the worker instances will be deployed to. This can include your local | ||
    # zones, but they must belong to the same region as the standard zones used in other stacks in | ||
    # this application. | ||
    availability_zones: List[str] | ||
    # The IRenderQueue that Deadline Workers connect to. | ||
    render_queue: IRenderQueue | ||
    # The IMachineImage to use for Workers (needs Deadline Client installed). | ||
    worker_machine_image: IMachineImage | ||
    # The name of the EC2 keypair to associate with Worker nodes. | ||
    key_pair_name: Optional[str] | ||
|
||
|
||
class UserDataProvider(InstanceUserDataProvider): | ||
    def __init__(self, scope: Construct, stack_id: str): | ||
        super().__init__(scope, stack_id) | ||
|
||
    def pre_worker_configuration(self, host) -> None: | ||
        # Add code here for mounting your NFS to the workers | ||
        host.user_data.add_commands("echo preWorkerConfiguration") | ||
|
||
|
||
class ComputeTier(Stack): | ||
    """ | ||
    The compute tier consists of the worker fleets. We'll be deploying the workers into the | ||
    local zone we're using. | ||
    """ | ||
    def __init__(self, scope: Construct, stack_id: str, *, props: ComputeTierProps, **kwargs): | ||
        """ | ||
        Initializes a new instance of ComputeTier | ||
        :param scope: The Scope of this construct. | ||
        :param stack_id: The ID of this construct. | ||
        :param props: The properties of this construct. | ||
        :param kwargs: Any kwargs that need to be passed on to the parent class. | ||
        """ | ||
        super().__init__(scope, stack_id, **kwargs) | ||
|
||
        # We can put the health monitor and worker fleet in all of the local zones we're using | ||
        subnets = SubnetSelection( | ||
            availability_zones=props.availability_zones, | ||
            subnet_type=SubnetType.PRIVATE, | ||
            one_per_az=True | ||
        ) | ||
|
||
        # We can put the health monitor in all of the local zones we're using for the worker fleet | ||
        self.health_monitor = HealthMonitor( | ||
            self, | ||
            'HealthMonitor', | ||
            vpc=props.vpc, | ||
            vpc_subnets=subnets, | ||
            deletion_protection=False | ||
        ) | ||
|
||
        self.worker_fleet = WorkerInstanceFleet( | ||
            self, | ||
            'WorkerFleet', | ||
            vpc=props.vpc, | ||
            vpc_subnets=subnets, | ||
            render_queue=props.render_queue, | ||
            # Not all instance types will be available in local zones. For a list of the instance types | ||
            # available in each local zone, you can refer to: | ||
            # https://aws.amazon.com/about-aws/global-infrastructure/localzones/features/#AWS_Services | ||
            # BURSTABLE3 is a T3; the third generation of burstable instances | ||
            instance_type=InstanceType.of(InstanceClass.BURSTABLE3, InstanceSize.LARGE), | ||
            worker_machine_image=props.worker_machine_image, | ||
            health_monitor=self.health_monitor, | ||
            key_name=props.key_pair_name, | ||
            user_data_provider=UserDataProvider(self, 'UserDataProvider') | ||
        ) | ||
        SessionManagerHelper.grant_permissions_to(self.worker_fleet) |
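The `pre_worker_configuration` placeholder above only echoes a message. As a rough sketch of what could replace it, the helper below builds the shell commands for mounting an NFS export. `nfs_mount_commands`, the server name, and the paths are hypothetical, and the sketch assumes the Worker AMI already has an NFS client installed and that user data runs as root.

```python
import shlex
from typing import List


def nfs_mount_commands(server: str, export_path: str, mount_point: str) -> List[str]:
    """Build the shell commands that create a mount point and mount an NFS export on it."""
    # Quote both arguments so paths containing spaces survive the shell.
    source = shlex.quote(f"{server}:{export_path}")
    target = shlex.quote(mount_point)
    return [
        f"mkdir -p {target}",
        f"mount -t nfs {source} {target}",
    ]


if __name__ == "__main__":
    # Placeholder server and paths; replace them with your own file server details.
    for command in nfs_mount_commands("fileserver.example.internal", "/exports/assets", "/mnt/assets"):
        print(command)
```

Each returned string could then be handed to `host.user_data.add_commands(...)` inside `pre_worker_configuration`, in the same way the placeholder passes its `echo` command.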
This example does a great job of showing how to deploy the Workers to a local zone - great work!
I think one thing that might be missing though is an explanation of why this is important/useful. RFDK users want this so they can have a low-latency connection between their AWS infrastructure and their infrastructure hosted outside AWS. It might help to spend some time at the beginning of this README setting the stage here. It may also help to add some guidance, links, and next steps to point users in the right direction for how they'd take this example and proceed to the next step of connecting their infrastructure outside of AWS.
One good resource we could link to is our developer guide documentation on connecting to an RFDK render farm.
Thanks for adding documentation for this in the top-level README.
As discussed offline, let's add a "next steps" section to the Python/TS READMEs that list the steps required to connect an on-premise networked file-system to the Workers. Even if we don't have all of the steps completely detailed, for now having a checklist would help guide readers to research and complete them. These are the steps that I foresee here:
NetworkTier
with the VPN connection.