Adding Rate limiting ec2:DescribeInstances API along with Batching for high TPS #292

Merged: 1 commit, Feb 21, 2020

Conversation

Contributor

@bhks bhks commented Feb 21, 2020

Prerequisite:

The AWS ec2:DescribeInstances API has a low TPS limit but supports batching (multiple instance IDs per call).

Summary:

As of today a K8s cluster can support up to 5000 nodes. Every node has to run the kubelet process to join the cluster. While joining, the kubelet makes multiple calls to the API server, and if a call fails it retries with backoff, which can result in 8-9 calls per object type per second.

The Authenticator is a webhook, and the API server calls it on every API call to verify the role of the user or node.

When verifying a node, the Authenticator needs to call ec2:DescribeInstances to get the PrivateDNSName for the node. When we verify the token and call get-caller-identity we learn the instanceId, PrivateDNSName and whether the session is valid, but we do not know whether the node belongs to the same AWS account.

Problem:

For example, assume we have 500 nodes and a cluster with 20 object types.

Consider two situations:

  • The master restarts, or in an HA master setup a new master comes up and the old master goes down
  • The cluster has a sudden spike in the number of nodes joining

On a master restart, every connection from every kubelet to the old master (watches and so on) is closed, so each kubelet tries to reconnect, fetch the updated list and re-establish its watches. Each of these calls passes through the Authenticator, which verifies that the instance and its PrivateDNSName belong to the same AWS account.

As of today there is no rate limiting for ec2:DescribeInstances, which makes the service unavailable: after 20-30 calls the AWS EC2 API starts throttling, because the calls from every node arrive in the same interval and are multiplied several times over.

The number of calls per second from requests that fail to authenticate would be

Number of nodes * 20 * 8 = 500 * 20 * 8 = 80,000 TPS

In a minute it can go up to ~200K - ~400K

The throttling does not clear immediately; it takes almost 2-5 minutes to resolve.

So the total number of throttled calls becomes very high, which impacts other services running in the AWS account, as well as the CNI and kube-controller-manager.
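The peak-rate arithmetic can be checked with a few lines of Go. This is only an illustration: the function name `peakCallsPerSecond` is hypothetical, and the inputs are the example figures from this description (500 nodes, 20 object types, roughly 8 retries per object type per second).

```go
package main

import "fmt"

// peakCallsPerSecond estimates the worst-case number of authentication calls
// per second when every node retries every object type in the same second.
// (Hypothetical helper, not part of the PR.)
func peakCallsPerSecond(nodes, objectTypes, retriesPerSecond int) int {
	return nodes * objectTypes * retriesPerSecond
}

func main() {
	// Example figures from the description: 500 nodes, 20 object types, 8 retries/s.
	fmt.Println(peakCallsPerSecond(500, 20, 8)) // prints 80000
}
```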

The Authenticator does cache PrivateDNSNames, but that does not help here: the kubelets call very fast, and because the master has just restarted, its cache does not yet contain every node.

Solution:

  1. Add rate limiting to make sure the Authenticator does not hit a high TPS.
  2. Rate limiting alone would make the system very slow in the situations above, because repeated requests for the same node would still each call ec2:DescribeInstances. So add a lookup map to check whether there is already an in-flight request for the EC2 instanceId; if so, wait up to 5 seconds, polling the cache to see whether the in-flight request has completed and added the PrivateDNSName.
  3. With a large number of nodes it can still take minutes to describe every node individually. So add a batching routine: once the number of in-flight requests exceeds 5, it batches the instances every 200 milliseconds and makes a single call to EC2 to get their details and put the PrivateDNSNames into the cache.
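A minimal sketch of the batching idea in step 3, using only the standard library. It is not the code in this PR: `dnsCache`, `describeFunc`, `drainBatch`, `processBatch` and `runBatcher` are hypothetical names, the describe call is a fake standing in for a batched ec2:DescribeInstances request, and the "more than 5 in-flight requests" threshold is left out for brevity.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// dnsCache maps instance ID -> private DNS name, guarded by a lock.
type dnsCache struct {
	lock  sync.RWMutex
	cache map[string]string
}

func (c *dnsCache) get(id string) (string, bool) {
	c.lock.RLock()
	defer c.lock.RUnlock()
	name, ok := c.cache[id]
	return name, ok
}

func (c *dnsCache) set(id, name string) {
	c.lock.Lock()
	defer c.lock.Unlock()
	c.cache[id] = name
}

// describeFunc stands in for one batched ec2:DescribeInstances call (assumption).
type describeFunc func(ids []string) map[string]string

// drainBatch collects the IDs currently queued on ch without blocking.
func drainBatch(ch <-chan string) []string {
	var ids []string
	for {
		select {
		case id := <-ch:
			ids = append(ids, id)
		default:
			return ids
		}
	}
}

// processBatch resolves all queued IDs with a single describe call and caches
// the results. It returns the number of IDs in the batch.
func processBatch(ch <-chan string, describe describeFunc, c *dnsCache) int {
	ids := drainBatch(ch)
	if len(ids) == 0 {
		return 0
	}
	for id, name := range describe(ids) {
		c.set(id, name)
	}
	return len(ids)
}

// runBatcher calls processBatch on every tick until stop is closed.
func runBatcher(stop <-chan struct{}, interval time.Duration, ch <-chan string, describe describeFunc, c *dnsCache) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			processBatch(ch, describe, c)
		case <-stop:
			return
		}
	}
}

func main() {
	ch := make(chan string, 16)
	c := &dnsCache{cache: map[string]string{}}
	fake := func(ids []string) map[string]string {
		out := make(map[string]string, len(ids))
		for _, id := range ids {
			out[id] = id + ".ec2.internal"
		}
		return out
	}
	for _, id := range []string{"i-aaa", "i-bbb", "i-ccc"} {
		ch <- id
	}
	stop, done := make(chan struct{}), make(chan struct{})
	go func() {
		runBatcher(stop, 200*time.Millisecond, ch, fake, c)
		close(done)
	}()
	time.Sleep(300 * time.Millisecond) // allow one 200 ms tick
	close(stop)
	<-done
	name, ok := c.get("i-bbb")
	fmt.Println(name, ok)
}
```

Keeping the drain step non-blocking means a tick with an empty queue costs nothing, and however many IDs arrive within one interval, they share a single describe call.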

Testing:

  • Added unit tests which emulate the behavior with 100 nodes.
  • Tested in a real EKS cluster with 500, 1000 and 3000 nodes to check the API call count as well as overall system performance when the masters are recycled.

Next:

  • Adding Backoff Retry for AWS API ec2:DescribeInstances

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 21, 2020
@k8s-ci-robot
Contributor

Welcome @bhagwat070919!

It looks like this is your first PR to kubernetes-sigs/aws-iam-authenticator 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/aws-iam-authenticator has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Feb 21, 2020
@k8s-ci-robot
Contributor

@bhagwat070919: GitHub didn't allow me to assign the following users: mhausler.

Note that only kubernetes-sigs members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

In response to this:

/assign mhausler

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@bhks bhks force-pushed the ratelimit branch 2 times, most recently from 66a59ef to 3b9735b Compare February 21, 2020 06:48
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Feb 21, 2020
@bhks bhks requested review from micahhausler and wongma7 February 21, 2020 06:57
Comment on lines +15 to +16
go test -v -coverprofile=coverage.out -race $(GITHUB_REPO)/...
go tool cover -html=coverage.out -o coverage.html

Nice, but this change should probably have been a separate PR.

DefaultPort = 21362
// Default Ec2 TPS Variables

Nit: Add a newline before the comment so that gofmt will indent this a bit nicer.


const (
// max limit of k8s nodes support
max_channel_size = 5000

Nit: This could be set higher, then it won't have to be changed later on.

Contributor Author

How about 10K ?

Comment on lines 28 to 29
// Making sure the single instance calls waits max till 5 seconds 100* (50 * time.Millisecond)
total_iteration_for_wait_interval = 100

Isn't 5 seconds quite a long time?

Member

This seems OK, given that it's just for calls from nodes not in the cache

Contributor Author

5 seconds is the timeout for any given request; it is the upper limit.
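For reference, the bounded wait discussed here (100 iterations of 50 ms, so 5 seconds at most) can be sketched as a polling loop over the cache. This is an illustration, not the PR's code: `dnsCache`, `waitForCache` and `errTimeout` are hypothetical names, and the constants simply restate the values from the comment under review.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

const (
	// 100 iterations * 50 ms = 5 s upper bound, as in the comment under review.
	totalIterationForWaitInterval = 100
	waitInterval                  = 50 * time.Millisecond
)

var errTimeout = errors.New("timed out waiting for in-flight lookup")

// dnsCache maps instance ID -> private DNS name (hypothetical stand-in).
type dnsCache struct {
	lock  sync.RWMutex
	cache map[string]string
}

func (c *dnsCache) get(id string) (string, bool) {
	c.lock.RLock()
	defer c.lock.RUnlock()
	name, ok := c.cache[id]
	return name, ok
}

// waitForCache polls the cache until the ID appears or the iteration budget
// is used up. A cache hit returns immediately, so only misses pay the wait.
func waitForCache(c *dnsCache, id string, iterations int, interval time.Duration) (string, error) {
	for i := 0; i < iterations; i++ {
		if name, ok := c.get(id); ok {
			return name, nil
		}
		time.Sleep(interval)
	}
	return "", errTimeout
}

func main() {
	c := &dnsCache{cache: map[string]string{"i-aaa": "ip-10-0-0-1.ec2.internal"}}
	name, err := waitForCache(c, "i-aaa", totalIterationForWaitInterval, waitInterval)
	fmt.Println(name, err)
}
```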

instanceIdsChannel chan string
}

func NewEC2Provider(roleARN string, qps int, burst int) EC2Provider {

In Go, this func would usually be named just New, since it will then be called as ec2provider.New()

Contributor Author

ack, yeah it does not hurt, given I have already extracted it out from server.go and there are no other New() functions in this package

Comment on lines 44 to 45
dnsCache map[string]string
dnsCacheLock sync.RWMutex

Rename these to cache and lock, since they will always be referenced using dnsCache.cache and dnsCache.lock

Contributor Author

ack

Comment on lines 48 to 52
type ec2RequestQueue struct {
requestQueueMap map[string]bool
requestQueueLock sync.RWMutex
}

Hmm, this is not even really a queue, is it? The pattern of a map to bool is commonly used when you want a Set. Also, maps in Go return the keys in random order, so you won't even get them in the order they were added when you iterate over it.

Can this just be

type ec2Requests struct {
	set  map[string]bool
	lock sync.RWMutex
}

Contributor Author

Good point.

  • My idea was:
    it's a logical concept that is easy to understand, as it's a request queue where we save in-flight requests for lookup.

I will make this change.
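A small sketch of how the suggested `ec2Requests` set could be used to track in-flight lookups. The struct follows the reviewer's suggestion above; the method names `markInFlight`, `isInFlight` and `done` are hypothetical and only illustrate the set semantics with the lock / defer unlock pattern.

```go
package main

import (
	"fmt"
	"sync"
)

// ec2Requests is a set of instance IDs with a lookup currently in flight.
type ec2Requests struct {
	set  map[string]bool
	lock sync.RWMutex
}

// markInFlight adds id to the set and reports whether it was newly added.
// A false result means another caller is already looking this ID up.
func (r *ec2Requests) markInFlight(id string) bool {
	r.lock.Lock()
	defer r.lock.Unlock()
	if r.set[id] {
		return false
	}
	r.set[id] = true
	return true
}

// isInFlight reports whether a lookup for id is currently pending.
func (r *ec2Requests) isInFlight(id string) bool {
	r.lock.RLock()
	defer r.lock.RUnlock()
	return r.set[id]
}

// done removes id from the set once its lookup has finished.
func (r *ec2Requests) done(id string) {
	r.lock.Lock()
	defer r.lock.Unlock()
	delete(r.set, id)
}

func main() {
	r := &ec2Requests{set: map[string]bool{}}
	fmt.Println(r.markInFlight("i-aaa")) // true: first caller performs the lookup
	fmt.Println(r.markInFlight("i-aaa")) // false: later callers wait on the cache
	r.done("i-aaa")
	fmt.Println(r.isInFlight("i-aaa")) // false
}
```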

Comment on lines 115 to 116
p.privateDNSCache.dnsCacheLock.Lock()
defer p.privateDNSCache.dnsCacheLock.Unlock()

This is good! Change to use this "lock / defer unlock"-pattern in all the functions attached to ec2ProviderImpl.

Contributor Author

Yeah, I like your idea. I was trying to optimize this code down to nanoseconds, at the cost of readability.
I will change all the unlocks to use defer.

Member

@micahhausler micahhausler left a comment

This is a great improvement! Mostly stylistic changes, and a reduction in the amount of logging

// DefaultPort is the default localhost port (chosen randomly).
DefaultPort = 21362
// Default Ec2 TPS Variables
Default_Ec2_DescribeInstances_Qps = 15
Member

Nit: Go generally doesn't use underscores, maybe DefaultEC2DescribeInstancesQPS. See Effective Go

Contributor Author

ack

serverCmd.Flags().Int(
"ec2-describeInstances-qps",
Default_Ec2_DescribeInstances_Qps,
"AWS Ec2 rate limiting with qps")
Member

Nit: Ec2 -> EC2

Contributor Author

ack
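To illustrate what a QPS limit with a burst allowance means for the flag above, here is a minimal token-bucket sketch using only the standard library. It is not the limiter used in this PR: `tokenBucket`, `newTokenBucket`, `allowAt` and `epoch` are hypothetical names, the QPS of 15 matches the default shown in this review, and the burst of 5 is an assumed value.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// epoch is a fixed reference time so the example is deterministic.
var epoch = time.Unix(0, 0)

// tokenBucket allows up to burst calls at once and refills at qps tokens per second.
type tokenBucket struct {
	mu     sync.Mutex
	qps    float64
	burst  float64
	tokens float64
	last   time.Time
}

// newTokenBucket returns a bucket that starts full at time now.
func newTokenBucket(qps, burst int, now time.Time) *tokenBucket {
	return &tokenBucket{qps: float64(qps), burst: float64(burst), tokens: float64(burst), last: now}
}

// allowAt reports whether a call may proceed at time now, consuming one token if so.
func (b *tokenBucket) allowAt(now time.Time) bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	if elapsed := now.Sub(b.last).Seconds(); elapsed > 0 {
		// Refill in proportion to elapsed time, capped at the burst size.
		b.tokens += elapsed * b.qps
		if b.tokens > b.burst {
			b.tokens = b.burst
		}
		b.last = now
	}
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

func main() {
	b := newTokenBucket(15, 5, epoch) // qps from the PR default; burst is assumed
	allowed := 0
	for i := 0; i < 20; i++ {
		if b.allowAt(epoch) {
			allowed++
		}
	}
	fmt.Println("allowed at t=0:", allowed)
	fmt.Println("allowed after 1s:", b.allowAt(epoch.Add(time.Second)))
}
```

Callers that are refused would wait and retry rather than call EC2, which is why the in-flight lookup and batching steps matter: without them the limiter alone only slows every node down.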

import (
"errors"
"fmt"
"github.com/aws/aws-sdk-go/aws"
Member

nit: could you separate stdlib imports from third party imports?

Contributor Author

I hope one day my GoLand will learn this and not make these mistakes.

// max limit of k8s nodes support
max_channel_size = 5000
// max number of in flight non batched ec2:DescribeInstances request to flow
max_allowed_single_request = 5
Member

nit: can you update these variable names to not include underscores?

Contributor Author

ack

pkg/ec2provider/ec2provider.go
}
var instances []*ec2.Instance
instances = append(instances, instance)
res1 := &ec2.Reservation{
Member

instead of all the variables and appends, what do you think about just returning a single variable with the nested fields set?

Contributor Author

good idea let me check.
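The suggestion to return one value with the nested fields set could look roughly like the sketch below. To keep it self-contained it uses simplified stand-in types (`instance`, `reservation`, `describeInstancesOutput`) that only mirror the reservation/instance nesting of the SDK's output; they are not the aws-sdk-go types, and `mockDescribeOutput` is a hypothetical helper name.

```go
package main

import "fmt"

// Simplified stand-ins for the nested SDK output types (assumption).
type instance struct {
	InstanceID     string
	PrivateDNSName string
}

type reservation struct {
	Instances []*instance
}

type describeInstancesOutput struct {
	Reservations []*reservation
}

// mockDescribeOutput builds the whole response as one nested composite
// literal instead of declaring intermediate slices and appending to them.
func mockDescribeOutput(id, dnsName string) *describeInstancesOutput {
	return &describeInstancesOutput{
		Reservations: []*reservation{{
			Instances: []*instance{{
				InstanceID:     id,
				PrivateDNSName: dnsName,
			}},
		}},
	}
}

func main() {
	out := mockDescribeOutput("i-aaa", "ip-10-0-0-1.ec2.internal")
	fmt.Println(out.Reservations[0].Instances[0].PrivateDNSName)
}
```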

}, nil
}

func setup() *ec2ProviderImpl {
Member

Can you name this something more specific like newMockedEC2ProviderImpl?

Contributor Author

Done, good name suggestion

"fmt"
"log"
"net/http"
"regexp"
"sigs.k8s.io/aws-iam-authenticator/pkg/ec2provider"
Member

Can you move this to third party imports?

Contributor Author

ack

Comment on lines 79 to 81
func (p *testEC2Provider) StartEc2DescribeBatchProcessing() {

}
Member

tiniest of nit: this could be onelined

func (p *testEC2Provider) StartEc2DescribeBatchProcessing() {}

Contributor Author

ack

var instanceId string
select {
case instanceId = <-p.instanceIdsChannel:
logrus.Infof("Received the Instance Id := %s from buffered Channel for batch processing ", instanceId)
Member

Do you need to log this since you have a log call in the batch process? Can this be debug?

Contributor Author

Added as Debug log

@bhks
Contributor Author

bhks commented Feb 21, 2020

This is a great improvement! Mostly stylistic changes, and a reduction in the amount of logging

Thanks !!!

I somehow knew there would be a suggestion to remove the logging, but for every instance the whole workflow prints each message only once, except for a few which just loop around the cache lookup.

@bhks
Contributor Author

bhks commented Feb 21, 2020

Thank you so much @mogren and @micahhausler for helping improve the Go code quality.
I have updated the code to incorporate the comments.

@bhks bhks requested a review from micahhausler February 21, 2020 21:02
Member

@micahhausler micahhausler left a comment

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 21, 2020
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bhagwat070919, micahhausler

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 21, 2020
@k8s-ci-robot k8s-ci-robot merged commit 037664c into kubernetes-sigs:master Feb 21, 2020
@bhks bhks deleted the ratelimit branch February 21, 2020 21:43
@mogren mogren left a comment

Nice work Bhagwat, thanks for fixing this!

/lgtm

joanayma pushed a commit to joanayma/aws-iam-authenticator that referenced this pull request Aug 11, 2021
* Adding workers_launch_template ebs encryption

* Update CHANGELOG.md
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

4 participants