Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Azure: reduce client polling interval from default 60s to 5s #359

Merged
merged 1 commit into from
Mar 14, 2018

Conversation

skriss
Copy link
Contributor

@skriss skriss commented Mar 6, 2018

Signed-off-by: Steve Kriss steve@heptio.com

The Azure go client defaults to a 60s polling interval to check for API call results. This is a significant amount of time to wait for calls that don't take very long to return. Lower this to five seconds for our snapshot/disk operations to improve performance.

@ncdc
Copy link
Contributor

ncdc commented Mar 7, 2018

@paulbouwer @jimzim @noelbundick I know we aren't on v14 of the Azure SDK yet, but does this change make sense for now?

@nrb
Copy link
Contributor

nrb commented Mar 7, 2018

If I've found the right place, this is reduced to 30s by default, and the field that controls it is renamed in the newer SDK. https://godoc.org/github.com/Azure/azure-sdk-for-go/services/classic/management#pkg-constants

@noelbundick
Copy link

@ncdc makes sense to me!

I'd make sure to test this with a "real world" disk size. Doing a snapshot & restore in the same region should be quick. I'm not sure exactly how long it will take (particularly the restore disk create operation) with a disk of, let's say 1TB with lots of data & not just zeroes.

If that takes a long time, let's say an hour - a 1 second interval will eat up 3,600 requests of the allowed 15,000 read calls allowed for the entire subscription per hour. If a user had multiple restores and/or other CI/CD or everyday activities going on in a busy subscription, that might be particularly impactful, when a 5s interval might be sufficient instead.

If that scenario is measured in seconds or minutes, no problem!

Just something to think about :)

Details on ARM limits:

@ncdc
Copy link
Contributor

ncdc commented Mar 8, 2018

@skriss it might be worth creating a 1TB disk and filling it with random data. Then take a snapshot & time it. The create a disk from the snapshot & time it.

@skriss
Copy link
Contributor Author

skriss commented Mar 8, 2018

Good idea, thanks @noelbundick @ncdc. Will do. Even backing off to 5s will be a significant improvement, and not much worse than 1.

@nrb
Copy link
Contributor

nrb commented Mar 12, 2018

Testing this out with the 1GB volume nginx example appears to have definitely made Ark more responsive. I have not tested this with anything larger, though.

@skriss
Copy link
Contributor Author

skriss commented Mar 13, 2018

@ncdc @noelbundick @nrb I tested a backup & restore with a disk with ~100GB data on it (on the way to generating a full TB) and the disk snapshot and restore operations are ~instantaneous - no more than a couple seconds.

@skriss
Copy link
Contributor Author

skriss commented Mar 13, 2018

Did the same test with a ~700GB and it's also ~instantaneous. Based on that, I'm comfortable leaving the timeout where it is. Does anybody think it would still be a good idea to back it off to something slightly longer?

@ncdc
Copy link
Contributor

ncdc commented Mar 14, 2018

You could do something like 1s for up to 10 or 15s, then up to 2s, or 5s, etc.

@skriss
Copy link
Contributor Author

skriss commented Mar 14, 2018

I don't think that's practically possible with the Azure SDK -- all of the polling is handled within the SDK.

@ncdc
Copy link
Contributor

ncdc commented Mar 14, 2018

Ok, so we should be good to move forward with this, since the snaps are fast.

@skriss
Copy link
Contributor Author

skriss commented Mar 14, 2018

Yeah, I feel comfortable as-is based on observed behavior.

@ncdc
Copy link
Contributor

ncdc commented Mar 14, 2018

Although honestly I think 5s would be perfectly reasonable too.

@skriss
Copy link
Contributor Author

skriss commented Mar 14, 2018

Let's go with 5s for now. Still a significant improvement and slightly more conservative from an API calls perspective.

Signed-off-by: Steve Kriss <steve@heptio.com>
@skriss skriss force-pushed the reduce-azure-polling-interval branch from c380776 to 45cee7d Compare March 14, 2018 16:07
@skriss skriss changed the title Azure: reduce client polling interval from default 60s to 1s Azure: reduce client polling interval from default 60s to 5s Mar 14, 2018
@ncdc ncdc merged commit 07fcc92 into vmware-tanzu:master Mar 14, 2018
@ncdc ncdc added this to the v0.8.0 milestone Mar 14, 2018
@ncdc ncdc self-assigned this Mar 14, 2018
@skriss
Copy link
Contributor Author

skriss commented Mar 14, 2018

woohoo!

@skriss skriss deleted the reduce-azure-polling-interval branch March 14, 2018 16:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants