data/aws/vpc: Add an S3 endpoint to new VPCs #745
Conversation
Force-pushed from af161ff to 1a1dc49.
Is there a precedent for this kind of setup in OpenShift, @cuppett?
Not in our hosted environments. Those and our reference architectures have historically taken a much more vanilla approach to each cloud. Using ALBs, NAT gateways, and VPC endpoints is new and applies optimizations and competencies specific to this cloud. Excited to bring more maturity to our offering. All of these are relatively new within AWS (the last couple of years), and adopting them helps catch us up.
@dgoodwin not sure who keeps track of all these unmanaged things. Heads up. /approve
Will this come out tagged with the expected tags? And I assume we need to add VPC endpoint cleanup to hiveutil's code. |
At least Terraform does not seem to allow tagging for this resource. It may be untaggable in AWS.
Maybe it doesn't block VPC deletion and is automatically removed when the VPC is deleted (in which case we would not need Hive changes). I'll see if I can tell from the CI logs.
The logs won't be particularly helpful until #730 lands. I'll circle back to this after that happens.
Deletion doesn't need that, right?
No, but we're not going to see anything about this endpoint in the deletion logs. What I want is to see creation (and ideally an ID) for the new resource, and then I'll check to make sure it's gone after CI wraps up.
/hold
I looked at this a bit, and adding the endpoint does cause panics in the uninstall due to some assumptions about the kinds of default fields found in a route table (easy fix). You can't leave the endpoint hanging around, as that will block the VPC from ever being deleted. Unfortunately, the endpoints don't appear to be taggable objects, so we'll need to add a dependency in the VPC deletion path to find any endpoints associated with the VPC being deleted.
The installer wants to start using VPC endpoints (openshift/installer#745), so the uninstaller should find and remove them (they are untaggable objects). Also, remove the code that would remove the individual routes from a route table (removing individual routes is not needed to be able to delete route tables). The following permissions needed to be added to get the installer/uninstaller working: ec2:DescribePrefixLists, ec2:DescribeVpcEndpoints, ec2:CreateVpcEndpoint, ec2:DeleteVpcEndpoints.
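As a rough illustration, granting those four permissions in Terraform could look like the sketch below. This is a minimal sketch, assuming a standalone policy; the resource name and the wildcard Resource scope are my own, not code from either repository. (ec2:DescribePrefixLists shows up because gateway endpoints install a prefix-list route into each associated route table.)

```hcl
# Hypothetical policy covering the four endpoint-related permissions
# listed above; the name and wildcard scope are illustrative only.
resource "aws_iam_policy" "vpc_endpoints" {
  name = "vpc-endpoint-management"

  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribePrefixLists",
        "ec2:DescribeVpcEndpoints",
        "ec2:CreateVpcEndpoint",
        "ec2:DeleteVpcEndpoints"
      ],
      "Resource": "*"
    }
  ]
}
EOF
}
```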
FWIW, we've got the endpoint deletion merged now, so we're ready for this on the uninstall side of things: openshift/hive#122
As suggested by Stephen Cuppett, this allows registry <-> S3 transfers to bypass the (NAT) gateways. Traffic over the NAT gateways costs money, so the new endpoint should make S3 access from the cluster cheaper (and possibly more reliable). This also allows for additional security policy flexibility, although I'm not taking advantage of that in this commit. Docs for VPC endpoints are in [1,2,3,4].

Endpoints do not currently support cross-region requests [1]. And based on discussion with Stephen, adding an endpoint may *break* access to S3 in other regions. But I can't find docs to back that up, and [3] has:

> We use the most specific route that matches the traffic to determine how to route the traffic (longest prefix match). If you have an existing route in your route table for all internet traffic (0.0.0.0/0) that points to an internet gateway, the endpoint route takes precedence for all traffic destined for the service, because the IP address range for the service is more specific than 0.0.0.0/0. All other internet traffic goes to your internet gateway, including traffic that's destined for the service in other regions.

which suggests that access to S3 in other regions may be unaffected. In any case, our registry buckets, and likely any other buckets associated with the cluster, will be living in the same region.

concat is documented in [5]. The wrapping brackets avoid [6]:

    level=error msg="Error: module.vpc.aws_vpc_endpoint.s3: route_table_ids: should be a list"

although I think that's a Terraform bug. See also 8a37f72 (modules/aws/bootstrap: Pull AWS bootstrap setup into a module, 2018-09-05, openshift#217), which talks about this same issue.

[1]: https://docs.aws.amazon.com/vpc/latest/userguide/vpc-endpoints-s3.html
[2]: https://docs.aws.amazon.com/vpc/latest/userguide/vpc-endpoints.html
[3]: https://docs.aws.amazon.com/vpc/latest/userguide/vpce-gateway.html
[4]: https://www.terraform.io/docs/providers/aws/r/vpc_endpoint.html
[5]: https://www.terraform.io/docs/configuration/interpolation.html#concat-list1-list2-
[6]: https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/745/pull-ci-openshift-installer-master-e2e-aws/1673/build-log.txt
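To make the workaround concrete, here is a minimal Terraform 0.11-era sketch of a gateway-type S3 endpoint; the VPC and route-table identifiers (new_vpc, private, default) are stand-ins, not the module's actual names:

```hcl
# Minimal sketch of an S3 gateway endpoint attached to a VPC's route
# tables. Identifiers are hypothetical; only the shape matters here.
data "aws_region" "current" {}

resource "aws_vpc_endpoint" "s3" {
  vpc_id       = "${aws_vpc.new_vpc.id}"
  service_name = "com.amazonaws.${data.aws_region.current.name}.s3"

  # The wrapping brackets around concat() avoid the
  # "route_table_ids: should be a list" error quoted above.
  route_table_ids = ["${concat(aws_route_table.private.*.id, list(aws_route_table.default.id))}"]
}
```

Because this is a gateway endpoint, each listed route table gets a route to S3's prefix list, which is what lets registry traffic skip the NAT gateways.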
Force-pushed from 1a1dc49 to e205898.
I've rebased onto master and pulled in openshift/hive#122 with 1a1dc49 -> e205898.
Pulling in openshift/hive@b7d71518 (remove VPC endpoints, 2018-12-03, openshift/hive#122). Generated with:

    $ sed -i s/2349f175d3e4fc6542dec79add881a59f2d7b1b8/802db5420da6a88f034fc2501081e2ab12e8463e/ Gopkg.toml
    $ dep ensure

using:

    $ dep version
    dep:
     version     : v0.5.0
     build date  :
     git hash    : 22125cf
     go version  : go1.10.3
     go compiler : gc
     platform    : linux/amd64
     features    : ImportDuringSolve=false
Force-pushed from e205898 to cdcaeb2.
@wking: The following test failed, say /retest to rerun all failed tests:
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
/lgtm
/retest
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: abhinavdahiya, wking
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment. Approvers can cancel approval by writing /approve cancel in a comment.
/hold cancel
Covering 08e4dbc (CHANGELOG: Document changes since v0.5.0, 2018-12-07, openshift#841) through 5813f61 (Merge pull request openshift#745 from wking/s3-vpc-endpoint, 2018-12-13).
Covering 5813f61 (Merge pull request openshift#745 from wking/s3-vpc-endpoint, 2018-12-13) through 14505f3 (Merge pull request openshift#911 from hardys/issues/891, 2018-12-14).
Covering 5813f61 (Merge pull request openshift#745 from wking/s3-vpc-endpoint, 2018-12-13) through 874876e (Merge pull request openshift#711 from staebler/improve_input_validation, 2018-12-14).
As suggested by @cuppett, this allows registry <-> S3 transfers to bypass the (NAT) gateways. Traffic over the NAT gateways costs money, so the new endpoint should make S3 access from the cluster cheaper (and possibly more reliable). This also allows for additional security policy flexibility, although I'm not taking advantage of that in this commit. Docs for VPC endpoints are here, here, here, and here.

Endpoints do not currently support cross-region requests. And based on discussion with @cuppett, adding an endpoint may break access to S3 in other regions. But I can't find docs to back that up, and the docs say:

> We use the most specific route that matches the traffic to determine how to route the traffic (longest prefix match). If you have an existing route in your route table for all internet traffic (0.0.0.0/0) that points to an internet gateway, the endpoint route takes precedence for all traffic destined for the service, because the IP address range for the service is more specific than 0.0.0.0/0. All other internet traffic goes to your internet gateway, including traffic that's destined for the service in other regions.

which suggests that access to S3 in other regions may be unaffected. In any case, our registry buckets, and likely any other buckets associated with the cluster, will be living in the same region.

concat is documented here.