Proposal for growing FlexVolume size. #1700
/ok-to-test
/assign
/assign @chakri-nelluri @verult
@gnufied, @chakri-nelluri and @verult, any updates or comments on this proposal?
I have already reviewed this when it was in Google doc. Looks good from resizing API perspective. /lgtm
cc @chakri-nelluri @verult to review flex API.
Adding a separate The reason why we added On the other hand, I believe it's OK if we assume all drivers support resize. If the driver without resize support is absent in master, the Expand Controller's call to find expandable plugin fails, and the controller throws an error that it can't find a matching plugin, as it should. Kubelet will take no action on a resize call because PVC's capacity status hasn't changed. I'm not super familiar with the volume resize design so please verify. As long as the failure of Expand Controller doesn't block anything else, we are OK. We'd need something like attacher-defaults.go for resize, though. |
thanks for the review.
Please correct me if I'm wrong |
Yeah I'm with you for the workflow you described. For (1) I was originally thinking of having something like Having |
@xingzhou Are there specific storage systems (or anything else that uses a Flexvolume driver) that requires |
AFAICS, some scenarios, like local storage resizing, require installing the flex driver on the worker node and doing the resizing work there instead of on the master node. So by using In addition, in our (IBM) case, we also want to resize volumes on the worker node, because while resizing the volumes we have to read some configuration that resides on the worker node.
Thanks for clarifying. I have some more thoughts about ExpandFS, but I'll add them in the dedicated issue. |
Force-pushed from 52f74fd to e2648ff.
Have updated the doc to remove the |
Very minor comments, LGTM otherwise
`ExpandFS` will call underneath volume driver `expandfs` method to finish FS resize. The sample code looks like:
```
func (plugin *flexVolumeExpandablePlugin) ExpandFS(spec *volume.Spec, newSize resource.Quantity, oldSize resource.Quantity) error {
```
here too
A sample implementation of `ExpandVolumeDevice` method is like:
```
func (plugin *flexVolumeExpandablePlugin) ExpandVolumeDevice(spec *volume.Spec, newSize resource.Quantity, oldSize resource.Quantity) (resource.Quantity, error) {
```
*flexVolumePlugin
Thanks @xingzhou, I added a few comments. Please feel free to ping me on Slack, and I will do a quick review.
Proposal for growing FlexVolume size, the feature ticket is at: kubernetes/enhancements#304 The original google doc for this proposal is at: https://docs.google.com/document/d/1dwom9xQ3Fg5F_jJrybr0slp-QsO3_CKiisxzNRoMhec/edit?usp=sharing
Force-pushed from e2648ff to a0fafec.
Reviewers, please take a look at the latest version.
@verult and @chakri-nelluri, do you think we can get this proposal merged?
#### RequiresFSResize

`RequiresFSResize` is a method to implement `ExpandableVolumePlugin` interface. The return value of this method identifies whether or not a file system resize is required once physical volume get expanded. If the return value is `true`, PV resize controller will consider the volume resize operation is done and then update the PV object’s capacity in K8s directly; If the return value is `false`, PV resize controller will leave kubelet to do the file system resize, and kubelet on worker node will call `ExpandFS` method of FlexVolume to finish the file system resize step (at present, only offline FS resize is supported, online resize support is under community discussion [here](https://github.com/kubernetes/community/pull/1535)).
"If the return value is `false`, PV resize controller will consider the volume resize operation is done..."
That's my mistake, need to exchange the behavior of the two return values. I'll update this in the next patch update.
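For reference, the corrected behavior agreed on above could be sketched as follows. This is only an illustration of the decision, assuming the swap the reviewer pointed out; `resizeStep`, `nextStep`, and the two constants are hypothetical names, not identifiers from the proposal.

```go
package main

import "fmt"

// resizeStep names who performs the step that follows ExpandVolumeDevice.
// It is a hypothetical type used only in this sketch.
type resizeStep string

const (
	markDoneInController resizeStep = "controller updates PV capacity"
	deferToKubelet       resizeStep = "kubelet calls ExpandFS"
)

// nextStep applies the corrected meaning of RequiresFSResize:
// false means the resize is already complete after the device expansion,
// true means kubelet still has to resize the file system.
func nextStep(requiresFSResize bool) resizeStep {
	if requiresFSResize {
		return deferToKubelet
	}
	return markDoneInController
}

func main() {
	for _, v := range []bool{false, true} {
		fmt.Printf("RequiresFSResize=%v -> %s\n", v, nextStep(v))
	}
}
```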
/lgtm |
One more thing: is the default value of |
### Admission Control Changes

Whether or not a specific volume plugin supports volume expansion is validated and checked in PV resize admission plugin. In general, we can list FlexVolume as the ones that support volume expansion and leave the actual expansion capability check to the underneath volume driver when PV resize controller calls the `ExpandVolumeDevice` method of FlexVolume.
This means that we will end up accepting a volume resize request at the Kubernetes API layer, only to possibly later discover (when `ExpandVolumeDevice` is executed) that the volume is not re-sizable. Leaving it in a bad state.
It looks like currently the resize admission controller has a hard-coded list of plugins that support resize. Is it worth augmenting it to dynamically check resize support?
It is possible to do that, but there was a concern raised before that doing so will force us to load all volume plugins in admission controller. It may not be an issue for api-server since it probably already loads all volume plugins but this needs to be carefully considered. Last time I tried something like this - it ended up increasing size of kubectl and few other binaries quite a bit.
One other thing is, if we do not add a capability like `Resize` for flex volume driver, we can not tell whether or not a flex volume driver supports resize.
@saad-ali CSI will also likely run into this admission controller issue, right?
I am thinking of removing the static checking we are currently doing and not doing any checks at all. The reasoning is: if the admission controller is enabled, we only allow resizing of PVCs created from storage classes where the `allowVolumeExpansion` property is set to true. If a k8s admin somehow enabled that field for an SC that does not support resizing, we will log a PVC event, and that will enable the admin to fix the SC.
Or to rephrase: I think the fact that we only allow resizing of dynamically provisioned PVCs with `allowExpansion` set to `true` should be more than enough. If a k8s admin misconfigures the SC, it will be considered an admin error (and he should know better).
```
type DriverCapabilities struct {
	Attach           bool `json:"attach"`
	RequiresFSResize bool `json:"requiresFSResize"`
```
Should there be a capability for the `Resize` alone? Not just `FSResize`?
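To illustrate how such capability flags reach Kubernetes, here is a minimal sketch that decodes a driver's JSON response into the two fields shown in the diff. The `driverStatus` wrapper and `parseCapabilities` are simplified assumptions, and the default values in `defaultCapabilities` are placeholders chosen for this example, not values confirmed in this thread.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// DriverCapabilities mirrors the two fields shown in the diff under review.
type DriverCapabilities struct {
	Attach           bool `json:"attach"`
	RequiresFSResize bool `json:"requiresFSResize"`
}

// driverStatus is a simplified stand-in for a driver's init response.
type driverStatus struct {
	Status       string              `json:"status"`
	Capabilities *DriverCapabilities `json:"capabilities,omitempty"`
}

// defaultCapabilities returns placeholder defaults used only in this sketch.
func defaultCapabilities() *DriverCapabilities {
	return &DriverCapabilities{Attach: true, RequiresFSResize: true}
}

// parseCapabilities decodes the response on top of the defaults, so any
// field the driver omits keeps its default value.
func parseCapabilities(out []byte) (*DriverCapabilities, error) {
	st := driverStatus{Capabilities: defaultCapabilities()}
	if err := json.Unmarshal(out, &st); err != nil {
		return nil, err
	}
	return st.Capabilities, nil
}

func main() {
	out := []byte(`{"status":"Success","capabilities":{"attach":false}}`)
	caps, err := parseCapabilities(out)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("attach=%v requiresFSResize=%v\n", caps.Attach, caps.RequiresFSResize)
}
```

A separate `Resize` flag, as asked about above, would be one more boolean field decoded the same way.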
One difficulty is that in order for Kubernetes to distinguish between different Flex driver types, we end up with an exponential explosion of plugin structs in the Kubernetes Flexvolume code: `AttachablePlugin`, `ExpandablePlugin`, `AttachableExpandablePlugin`, and the regular plugin. I'm not sure if there's a better way to do this.
Yes, if we need to add more capabilities to Flex Volume in the future, that may produce even more types of plugin.
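The growth discussed here can be sketched as follows: with one wrapper struct per capability combination, two independent capabilities need four types and each new capability doubles the count. The `capabilities` struct, `pluginKind`, and `variantCount` are hypothetical and only illustrate the combinatorics; the returned names are the ones listed in the comment above.

```go
package main

import "fmt"

// capabilities is a hypothetical pair of flags a driver might report.
type capabilities struct {
	Attach bool
	Resize bool
}

// pluginKind returns the wrapper type needed for one capability combination.
func pluginKind(c capabilities) string {
	switch {
	case c.Attach && c.Resize:
		return "AttachableExpandablePlugin"
	case c.Attach:
		return "AttachablePlugin"
	case c.Resize:
		return "ExpandablePlugin"
	default:
		return "Plugin"
	}
}

// variantCount returns how many wrapper types n independent capabilities need.
func variantCount(n uint) int {
	return 1 << n
}

func main() {
	for _, c := range []capabilities{{false, false}, {true, false}, {false, true}, {true, true}} {
		fmt.Printf("%+v -> %s\n", c, pluginKind(c))
	}
	fmt.Println("types needed for 3 capabilities:", variantCount(3))
}
```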
For the default value of |
@chakri-nelluri: Can you please do the review and approve if it's fine?
@saad-ali We need someone to review and approve this |
/approve Let's make this the last feature added to Flex. All new features (e.g. snapshots, etc.) should be added to CSI, not Flex. I do not want the Flex API to continue to expand. Continuing to grow both the CSI and Flex interfaces will become a nightmare to manage.
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: saad-ali
Hey @xingzhou , I noticed there aren't any code changes inside the resize admission controller. Did we decide to remove the admission controller changes in the end? Thanks! |
@verult yeah we ended up removing static check from admission controller because it won't work for CSI too. |