data/aws/vpc: Add an S3 endpoint to new VPCs #745
Conversation
Force-pushed from af161ff to 1a1dc49.
Is there a precedent for this kind of setup in OpenShift, @cuppett?
Not in our hosted environments. Those and our reference architectures have historically taken a much more vanilla approach to each cloud. Using ALBs, NAT gateways, and VPC endpoints is new and applies optimizations and competencies specific to this cloud. Excited to bring more maturity to our offering. All of these are relatively new within AWS (the last couple of years), and adopting them helps catch us up.
@dgoodwin not sure who keeps track of all these unmanaged things. Heads up. /approve
Will this come out tagged with the expected tags? And I assume we need to add VPC endpoint cleanup to hiveutil's code. |
At least Terraform does not seem to allow tagging for this resource. It may be untaggable in AWS.
Maybe it doesn't block VPC deletion and is automatically removed when the VPC is deleted (in which case we would not need Hive changes). I'll see if I can tell from the CI logs.
The logs won't be particularly helpful until #730 lands. I'll circle back to this after that happens.
Deletion doesn't need that, right?
No, but we're not going to see anything about this endpoint in the deletion logs. What I want is to see creation (and ideally an ID) for the new resource, and then I'll check to make sure it's gone after CI wraps up.
/hold
I looked at this a bit, and adding the endpoint does cause panics in the uninstall due to some assumptions about the kinds of default fields found in a route table (easy fix). You can't leave the endpoint hanging around, as that will block the VPC from ever being deleted. Unfortunately, the endpoints don't appear to be taggable objects, so we'll need to add a dependency in the VPC deletion path to find any endpoints associated with the VPC being deleted.
The installer wants to start using VPC endpoints (openshift/installer#745), so the uninstaller should find and remove them (they are untaggable objects). Also, remove the code that would remove the individual routes from a route table (removing individual routes is not needed to be able to delete route tables). The following permissions needed to be added to get the installer/uninstaller working: ec2:DescribePrefixLists, ec2:DescribeVpcEndpoints, ec2:CreateVpcEndpoint, ec2:DeleteVpcEndpoints.
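As a rough illustration, granting those four permissions in Terraform could look like the sketch below. This is a minimal sketch, assuming a standalone policy; the resource name and the wildcard Resource scope are my own, not code from either repository. (ec2:DescribePrefixLists shows up because gateway endpoints install a prefix-list route into each associated route table.)

```hcl
# Hypothetical policy covering the four endpoint-related permissions
# listed above; the name and wildcard scope are illustrative only.
resource "aws_iam_policy" "vpc_endpoints" {
  name = "vpc-endpoint-management"

  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribePrefixLists",
        "ec2:DescribeVpcEndpoints",
        "ec2:CreateVpcEndpoint",
        "ec2:DeleteVpcEndpoints"
      ],
      "Resource": "*"
    }
  ]
}
EOF
}
```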
FWIW, we've got the endpoint deletion merged now, so we're ready for this on the uninstall side of things: openshift/hive#122
As suggested by Stephen Cuppett, this allows registry <-> S3 transfers to bypass the (NAT) gateways. Traffic over the NAT gateways costs money, so the new endpoint should make S3 access from the cluster cheaper (and possibly more reliable). This also allows for additional security policy flexibility, although I'm not taking advantage of that in this commit. Docs for VPC endpoints are in [1,2,3,4].

Endpoints do not currently support cross-region requests [1]. And based on discussion with Stephen, adding an endpoint may *break* access to S3 in other regions. But I can't find docs to back that up, and [3] has:

> We use the most specific route that matches the traffic to determine how to route the traffic (longest prefix match). If you have an existing route in your route table for all internet traffic (0.0.0.0/0) that points to an internet gateway, the endpoint route takes precedence for all traffic destined for the service, because the IP address range for the service is more specific than 0.0.0.0/0. All other internet traffic goes to your internet gateway, including traffic that's destined for the service in other regions.

which suggests that access to S3 in other regions may be unaffected. In any case, our registry buckets, and likely any other buckets associated with the cluster, will be living in the same region.

concat is documented in [5]. The wrapping brackets avoid [6]:

    level=error msg="Error: module.vpc.aws_vpc_endpoint.s3: route_table_ids: should be a list"

although I think that's a Terraform bug. See also 8a37f72 (modules/aws/bootstrap: Pull AWS bootstrap setup into a module, 2018-09-05, openshift#217), which talks about this same issue.

[1]: https://docs.aws.amazon.com/vpc/latest/userguide/vpc-endpoints-s3.html
[2]: https://docs.aws.amazon.com/vpc/latest/userguide/vpc-endpoints.html
[3]: https://docs.aws.amazon.com/vpc/latest/userguide/vpce-gateway.html
[4]: https://www.terraform.io/docs/providers/aws/r/vpc_endpoint.html
[5]: https://www.terraform.io/docs/configuration/interpolation.html#concat-list1-list2-
[6]: https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/745/pull-ci-openshift-installer-master-e2e-aws/1673/build-log.txt
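To make the workaround concrete, here is a minimal Terraform 0.11-era sketch of a gateway-type S3 endpoint; the VPC and route-table identifiers (new_vpc, private, default) are stand-ins, not the module's actual names:

```hcl
# Minimal sketch of an S3 gateway endpoint attached to a VPC's route
# tables. Identifiers are hypothetical; only the shape matters here.
data "aws_region" "current" {}

resource "aws_vpc_endpoint" "s3" {
  vpc_id       = "${aws_vpc.new_vpc.id}"
  service_name = "com.amazonaws.${data.aws_region.current.name}.s3"

  # The wrapping brackets around concat() avoid the
  # "route_table_ids: should be a list" error quoted above.
  route_table_ids = ["${concat(aws_route_table.private.*.id, list(aws_route_table.default.id))}"]
}
```

Because this is a gateway endpoint, each listed route table gets a route to S3's prefix list, which is what lets registry traffic skip the NAT gateways.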
Force-pushed from 1a1dc49 to e205898.
I've rebased onto master and pulled in openshift/hive#122 with 1a1dc49 -> e205898.
Pulling in openshift/hive@b7d71518 (remove VPC endpoints, 2018-12-03, openshift/hive#122). Generated with:

    $ sed -i s/2349f175d3e4fc6542dec79add881a59f2d7b1b8/802db5420da6a88f034fc2501081e2ab12e8463e/ Gopkg.toml
    $ dep ensure

using:

    $ dep version
    dep:
     version     : v0.5.0
     build date  :
     git hash    : 22125cf
     go version  : go1.10.3
     go compiler : gc
     platform    : linux/amd64
     features    : ImportDuringSolve=false
Force-pushed from e205898 to cdcaeb2.
@wking: The following test failed, say /retest to rerun all failed tests:
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
/lgtm
/retest
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: abhinavdahiya, wking
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment. Approvers can cancel approval by writing /approve cancel in a comment.
/hold cancel
Covering 08e4dbc (CHANGELOG: Document changes since v0.5.0, 2018-12-07, openshift#841) through 5813f61 (Merge pull request openshift#745 from wking/s3-vpc-endpoint, 2018-12-13).
Covering 5813f61 (Merge pull request openshift#745 from wking/s3-vpc-endpoint, 2018-12-13) through 14505f3 (Merge pull request openshift#911 from hardys/issues/891, 2018-12-14).
Covering 5813f61 (Merge pull request openshift#745 from wking/s3-vpc-endpoint, 2018-12-13) through 874876e (Merge pull request openshift#711 from staebler/improve_input_validation, 2018-12-14).
As suggested by @cuppett, this allows registry <-> S3 transfers to bypass the (NAT) gateways. Traffic over the NAT gateways costs money, so the new endpoint should make S3 access from the cluster cheaper (and possibly more reliable). This also allows for additional security policy flexibility, although I'm not taking advantage of that in this commit. Docs for VPC endpoints are here, here, here, and here.

Endpoints do not currently support cross-region requests. And based on discussion with @cuppett, adding an endpoint may break access to S3 in other regions. But I can't find docs to back that up, and the docs say:

> We use the most specific route that matches the traffic to determine how to route the traffic (longest prefix match). If you have an existing route in your route table for all internet traffic (0.0.0.0/0) that points to an internet gateway, the endpoint route takes precedence for all traffic destined for the service, because the IP address range for the service is more specific than 0.0.0.0/0. All other internet traffic goes to your internet gateway, including traffic that's destined for the service in other regions.

which suggests that access to S3 in other regions may be unaffected. In any case, our registry buckets, and likely any other buckets associated with the cluster, will be living in the same region.

concat is documented here.