Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

io1 volumes with <100 iops fail to provision #575

Closed
kcking opened this issue Oct 2, 2020 · 7 comments
Closed

io1 volumes with <100 iops fail to provision #575

kcking opened this issue Oct 2, 2020 · 7 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@kcking
Copy link
Contributor

kcking commented Oct 2, 2020

/kind bug

What happened?
provisioning io1 volume with < 100 IOPS fails

What you expected to happen?
provision the volume with a min of 100 IOPS instead of failing, as this was the behavior of the in-tree EBS plugin

How to reproduce it (as minimally and precisely as possible)?
provision an io1 volume with < 100 IOPS

Anything else we need to know?:
Maybe this is intended behavior, but was surprising to me as we were migrating our infra over from in-tree to this driver

Environment

  • Kubernetes version (use kubectl version): 1.17
  • Driver version: 0.6
@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Oct 2, 2020
@wongma7
Copy link
Contributor

wongma7 commented Oct 6, 2020

I think it is better to throw an error than silently accept and correct an invalid input.

Can you share your StorageClass? Do you trigger this situation by setting a low iopsPerGB value in it? Since even with CSI migration enabled there is no guarantee of compatibility between in-tree and CSI StorageClasses, I don't think we are violating compatibility by tweaking the validation around iopsPerGB.

Open to discussion.

@msau42
Copy link
Contributor

msau42 commented Oct 6, 2020

I agree that it's better to error than silently round up, so that users are not surprised when they get their bill.

For CSI migration, you could preserve in-tree behavior by rounding up when you translate the in-memory StorageClass parameters.

@ayberk
Copy link
Contributor

ayberk commented Oct 6, 2020

I, too, concur that we should keep this behavior, but at the same time we shouldn't surprise users migrating from the in-tree plugin. Maybe it's worth having a section in the README to note differences for migration.

@kcking
Copy link
Contributor Author

kcking commented Oct 7, 2020

Totally hear the point of a surprise bill. Yes the storage class in question had a very low IOPS/GB and was a 16GB volume because it was a testing environment.

I'm not using storage class migration because AFAICT it is not enabled in k8s 1.17 (the current latest for EKS). Instead I had my k8s database operator (foundationdb) migrate the cluster to the new CSI storage class, but ran into errors because the CSI driver could not bring up any io1 volumes with that configuration. I had to manually intervene and fix the storage class (I just changed it to gp2).

Perhaps this edge case is a rare one, but documentation regarding the io1 storage class difference between the in-tree plugin and the csi driver might have saved me from this suprise :)

@wongma7
Copy link
Contributor

wongma7 commented Oct 7, 2020

OK, I am fine with changing the behavior to be like in-tree to avoid situations like that. I do want to at least change the "silent" part of it...we definitely need to document what happens if iopsPerGB * requestedGB is less than minimum, and a log message won't hurt, something like "Setting iops to minimum 100 because x would be too low"

@msau42
Copy link
Contributor

msau42 commented Oct 7, 2020

I think folks in this thread were actually ok with the opposite, ie error, but add more documentation in the repo that this is different than intree behavior.

And in terms of csi migration, we can retain the old behavior in the translation layer.

@wongma7
Copy link
Contributor

wongma7 commented Oct 7, 2020

Got it, opened #582 to fix the doc, yes let's have translation layer retain old behavior and spit out some log message .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug.
Projects
None yet
Development

No branches or pull requests

5 participants