This repository has been archived by the owner on Sep 7, 2022. It is now read-only.

SPBM policy integration for dynamic volume provisioning inside kubernetes #142

Closed

@BaluDontu BaluDontu commented May 11, 2017

Until now, the vSphere cloud provider has supported configuring persistent volumes with VSAN storage capabilities (kubernetes#42974). That support works only with VSAN.

There are other use cases as well:

  1. The user might need a way to configure a policy on other datastores such as VMFS or NFS.
  2. The user might want to use SIOC or VMCrypt policies for a persistent disk.

Both use cases can be covered by using existing storage policies that are already created on vCenter. This feature lets you specify an SPBM policy as part of dynamic volume provisioning (the example below refers to it by name through storagepolicyName). When you create a PVC, the provisioned persistent volume has the SPBM policy ID associated with it.

For example,

kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
  name: fast
provisioner: kubernetes.io/vsphere-volume
parameters:
    diskformat: zeroedthick
    storagepolicyName: policy1
    datastore: VMFSDatastore

When you deploy a pod with a PVC that refers to the storage class above, you can see the volume association in vCenter for the node to which the volume is attached.

@@ -101,39 +110,50 @@ func (util *VsphereDiskUtil) CreateVolume(v *vsphereVolumeProvisioner) (vmDiskPa
case datastore:
volumeOptions.Datastore = value
case Fstype:
-fstype = value
+fstype := value

Remove the :=. Change this to
fstype = value

Author

Good observation. I will fix it in the next commit.
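
For readers unfamiliar with the issue behind this comment: in Go, := inside a case clause declares a new variable scoped to that clause, so the outer variable is left unchanged. The sketch below contrasts the two forms. The helper names, the parameter map, and the ext4 default are hypothetical and are not taken from the PR.

```go
package main

import "fmt"

// parseFstypeShadowed mimics the bug: ":=" declares a new fstype that is
// scoped to the case clause, so the outer variable keeps its default.
func parseFstypeShadowed(params map[string]string) string {
	fstype := "ext4" // hypothetical default, for illustration only
	for key, value := range params {
		switch key {
		case "fstype":
			fstype := value // shadows the outer fstype
			_ = fstype      // avoids the "declared and not used" compile error
		}
	}
	return fstype
}

// parseFstypeFixed uses plain assignment, as the reviewer suggests.
func parseFstypeFixed(params map[string]string) string {
	fstype := "ext4" // same hypothetical default
	for key, value := range params {
		switch key {
		case "fstype":
			fstype = value // assigns to the outer fstype
		}
	}
	return fstype
}

func main() {
	params := map[string]string{"fstype": "xfs"}
	fmt.Println(parseFstypeShadowed(params)) // prints "ext4"
	fmt.Println(parseFstypeFixed(params))    // prints "xfs"
}
```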

deviceConfigSpec.Profile = append(deviceConfigSpec.Profile, profileSpec)
}
virtualMachineConfigSpec.DeviceChange = append(virtualMachineConfigSpec.DeviceChange, deviceConfigSpec)
task, err := vm.Reconfigure(ctx, virtualMachineConfigSpec)

Do we want to add a check for a nil task to avoid an NPE? We have observed a nil task when the user does not have read permission.

Author

@BaluDontu BaluDontu May 11, 2017

Not needed, I guess.
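
A nil check along the lines the reviewer suggests could look like the sketch below. Task, waitForTask, and errDenied are hypothetical stand-ins so the example runs on its own; they are not the govmomi types used in the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// errDenied is a made-up error used only to exercise the error path.
var errDenied = errors.New("permission denied")

// Task is a hypothetical stand-in for the task handle returned by a
// reconfigure call.
type Task struct{ name string }

// Wait pretends to block until the task completes.
func (t *Task) Wait() error { return nil }

// waitForTask checks the error first, then guards against a nil task
// before calling a method on it.
func waitForTask(task *Task, err error) error {
	if err != nil {
		return fmt.Errorf("reconfigure failed: %w", err)
	}
	if task == nil {
		// Scenario described in the review: no error, but no task either.
		return errors.New("reconfigure returned a nil task")
	}
	return task.Wait()
}

func main() {
	fmt.Println(waitForTask(nil, nil))
	fmt.Println(waitForTask(nil, errDenied))
	fmt.Println(waitForTask(&Task{name: "reconfigure"}, nil))
}
```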

@@ -1266,19 +1282,43 @@ func (vs *VSphere) CreateVolume(volumeOptions *VolumeOptions) (volumePath string
dc, err := f.Datacenter(ctx, vs.cfg.Global.Datacenter)
f.SetDatacenter(dc)

if volumeOptions.StoragePolicyName != "" {

Shouldn't you fail early if both storagePolicyName and VSANStorageProfileData are set? If we are already doing that, can you point to the code where we fail early?

Author

You can see that code in the vsphere_volume_util.go file: 6d41087#diff-482fa05b285b59ab0bc70641f984ce98R131
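
The kind of early failure being asked about can be sketched as follows. This is not the code at the linked diff; VolumeOptions here is a trimmed, hypothetical struct, and validatePolicyOptions and the placeholder capability string are made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// VolumeOptions is a trimmed, hypothetical version of the options struct;
// only the two fields discussed in the review are modeled.
type VolumeOptions struct {
	StoragePolicyName      string
	VSANStorageProfileData string
}

// validatePolicyOptions rejects a request that sets both an SPBM policy name
// and VSAN capabilities, before any vCenter call is made.
func validatePolicyOptions(opts VolumeOptions) error {
	if opts.StoragePolicyName != "" && opts.VSANStorageProfileData != "" {
		return errors.New("cannot specify both an SPBM storage policy name and VSAN capabilities; specify only one")
	}
	return nil
}

func main() {
	both := VolumeOptions{
		StoragePolicyName:      "policy1",
		VSANStorageProfileData: "placeholder-capabilities",
	}
	fmt.Println(validatePolicyOptions(both))
	fmt.Println(validatePolicyOptions(VolumeOptions{StoragePolicyName: "policy1"}))
}
```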


if volumeOptions.Datastore != "" {
if !IsUserSpecifiedDatastoreCompatible(dsRefs, volumeOptions.Datastore) {
return "", fmt.Errorf("User specified datastore: %q is not compatible with the StoragePolicy: %q requirements", volumeOptions.Datastore, volumeOptions.StoragePolicyName)

Don't we want to say why it is not compatible? SPBM gives reasons why a particular datastore is incompatible.

Author

Yes, I have made sure it shows messages explaining why a datastore is incompatible.

Can you paste an example of the message the k8s user would see? SPBM does return CompatibilityResult.error, which says what the expected capabilities are and what the storage can actually offer (the actual capability). Do we show this to the k8s user?
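
One way to surface the reasons in the error is sketched below. The CompatibilityResult struct, its fields, and the placeholder reason text are hypothetical simplifications; the real SPBM result type and the actual message shown to the k8s user are not reproduced here.

```go
package main

import (
	"fmt"
	"strings"
)

// CompatibilityResult is a hypothetical, simplified stand-in for an SPBM
// placement result: a datastore name plus the reasons it was rejected.
type CompatibilityResult struct {
	Datastore string
	Faults    []string // empty means the datastore is compatible
}

// incompatibilityError returns nil if the datastore is compatible, otherwise
// an error that carries every reported reason.
func incompatibilityError(policy string, res CompatibilityResult) error {
	if len(res.Faults) == 0 {
		return nil
	}
	return fmt.Errorf("datastore %q is not compatible with storage policy %q: %s",
		res.Datastore, policy, strings.Join(res.Faults, "; "))
}

func main() {
	res := CompatibilityResult{
		Datastore: "VMFSDatastore",
		Faults:    []string{"placeholder reason reported by SPBM"},
	}
	fmt.Println(incompatibilityError("policy1", res))
}
```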

compatibleHubs := res.CompatibleDatastores()
// Return an error if there are no compatible datastores.
if len(compatibleHubs) < 1 {
return nil, fmt.Errorf("There are no compatible datastores: %+v that satisfy the storage policy: %+q requirements", datastores, storagePolicyID)

Returning 0 compatible datastores with a nil error is perfectly fine behavior for this method. The caller can decide whether it was expecting 0, 1, 2, or more. Based on its requirements, the caller can treat the result as an error or as a perfectly fine condition.

Author

Makes sense. Moved it to the original caller function.
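
The split of responsibilities agreed on here can be sketched as below: the lookup returns an empty list with a nil error, and the caller that needs a datastore turns the empty result into an error. Both function names and the isCompatible callback are hypothetical; the PR's actual functions query SPBM.

```go
package main

import (
	"errors"
	"fmt"
)

// compatibleDatastores returns the names that pass the check. An empty
// result is not an error; the caller decides what it means.
func compatibleDatastores(candidates []string, isCompatible func(string) bool) ([]string, error) {
	var out []string
	for _, ds := range candidates {
		if isCompatible(ds) {
			out = append(out, ds)
		}
	}
	return out, nil
}

// pickForProvisioning is a caller that needs at least one datastore,
// so it treats the empty result as an error.
func pickForProvisioning(candidates []string, isCompatible func(string) bool) (string, error) {
	compatible, err := compatibleDatastores(candidates, isCompatible)
	if err != nil {
		return "", err
	}
	if len(compatible) == 0 {
		return "", errors.New("no compatible datastores satisfy the storage policy requirements")
	}
	return compatible[0], nil
}

func main() {
	none := func(string) bool { return false }
	list, err := compatibleDatastores([]string{"ds1", "ds2"}, none)
	fmt.Println(len(list), err)
	_, err = pickForProvisioning([]string{"ds1", "ds2"}, none)
	fmt.Println(err)
}
```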

}

// Get the datastore morefs.
func (vs *VSphere) getDatastoreMorefs(ctx context.Context, dsRefs []types.ManagedObjectReference) ([]mo.Datastore, error) {

Rename this method to getDatastoreMo, since you are returning Datastore managed objects here and not managed object references. You may want to consider renaming the local variables in this method for the same reason.

Author

Done.

}

// Verify if the user specified datastore is in the list of compatible datastores.
func IsUserSpecifiedDatastoreCompatible(dsRefs []mo.Datastore, dsName string) bool {

@SandeepPissay SandeepPissay May 13, 2017

Rename the function according to the implemented logic. This method seems to be just checking if the given dsName is in the given Datastore managed object array. The method has nothing to do with compatibility or incompatibility checks.

Author

Restructured the code slightly here to make sure IsUserSpecifiedDatastoreCompatible() does what it needs to do.

}

// Get the best fit compatible datastore by free space.
func GetBestFitCompatibleDatastore(dsRefs []mo.Datastore) string {

Rename to GetMostFreeDatastore and return the Datastore managed object. Also, I think we should not use Refs here, as these are managed objects and not managed object references. I see this misnaming used a lot in your change.

Author

I have renamed it in all places.
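
A selection by free space with the suggested name and return type could look like the sketch below. The Datastore struct is a hypothetical stand-in for the managed object, and the tie-breaking rule (keep the earliest entry) is an assumption of this example.

```go
package main

import "fmt"

// Datastore is a hypothetical stand-in for a datastore managed object;
// only the fields needed here are modeled.
type Datastore struct {
	Name      string
	FreeSpace int64 // bytes
}

// getMostFreeDatastore returns the datastore with the most free space, and
// false if the input is empty. Ties keep the earliest entry.
func getMostFreeDatastore(datastores []Datastore) (Datastore, bool) {
	if len(datastores) == 0 {
		return Datastore{}, false
	}
	best := datastores[0]
	for _, ds := range datastores[1:] {
		if ds.FreeSpace > best.FreeSpace {
			best = ds
		}
	}
	return best, true
}

func main() {
	candidates := []Datastore{
		{Name: "ds1", FreeSpace: 100},
		{Name: "ds2", FreeSpace: 500},
		{Name: "ds3", FreeSpace: 300},
	}
	best, ok := getMostFreeDatastore(candidates)
	fmt.Println(best.Name, ok) // prints "ds2 true"
}
```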

return nil, err
}

// The K8s cluster might be deployed inside a cluster or a host.

I guess we cannot assume this. The k8s cluster could be created in a child resource pool that is nested within one or more resource pools or child resource pools.

Author

Removed this logic altogether.

}

// Get all datastores accessible inside the current Kubernetes cluster.
func (vs *VSphere) getAllAccessibleDatastoresForK8sCluster(ctx context.Context, resourcePool *object.ResourcePool) ([]types.ManagedObjectReference, error) {

You are assuming that all the datastores within the cluster are accessible to all k8s nodes. This may not be true! You will also have to check if the ESX hosts running the k8s nodes have access to the datastores.

Author

Yes, you are correct! I now retrieve the datastores from the ESX host.

@@ -374,7 +376,47 @@
pvpod 1/1 Running 0 48m
```

-### Virtual SAN policy support inside Kubernetes
+### Storage Policy Management inside kubernetes
+#### Using existing vCenter SPBM policy

As we discussed, there are some known issues with the user specifying the policy name instead of the policy ID. I think we should document them so that the VI admin, k8s admin, and k8s user know about them and how to deal with them. I'm just listing our discussion points here.

*** An admin updating the policy name in vCenter could cause confusion or inconsistencies ***
This happens when the VI admin changes the policy name without informing the k8s users. The k8s users will not be able to create volumes with a storage class that uses the old policy name. The k8s user and VI admin have to resolve such issues by finding out the latest policy name and updating the existing storage classes with it.

*** Two or more PVs could show different policy names for the same policy ID ***
The k8s user can see the policy name and policy ID when describing PVs. If a storage policy name is updated in vCenter, the change is not reflected in the policy names shown for the existing PVs (the policy ID is still shown correctly). The k8s user has to update the storage class to use the latest policy name, and any PVs created after that point will show the correct policy name and policy ID. However, when old and new PVs are described, the old PVs continue to show the old policy name, which could cause confusion.
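
Both known issues follow from resolving a mutable name to a stable ID. The toy model below illustrates this under stated assumptions: policyStore, idForName, and the IDs and names are all made up, and the real lookup goes through the SPBM API.

```go
package main

import "fmt"

// policyStore is a hypothetical stand-in for vCenter's policy catalog,
// keyed by policy ID with the current display name as the value.
type policyStore map[string]string

// idForName resolves a policy name to its ID, as a provisioner has to do
// when the storage class carries only the name.
func (s policyStore) idForName(name string) (string, bool) {
	for id, n := range s {
		if n == name {
			return id, true
		}
	}
	return "", false
}

func main() {
	store := policyStore{"id-1": "gold"} // made-up ID and name
	id, _ := store.idForName("gold")
	pvPolicyName, pvPolicyID := "gold", id // what an existing PV recorded

	store["id-1"] = "platinum" // the VI admin renames the policy

	// Issue 1: a storage class that still says "gold" no longer resolves.
	_, ok := store.idForName("gold")
	fmt.Println("old name still resolves:", ok)

	// Issue 2: the existing PV keeps showing the old name with the same ID.
	fmt.Println("existing PV shows:", pvPolicyName, pvPolicyID)
}
```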

Author

True. This would not be the right place to describe these known issues, though. We would write this information up when we announce a release with this feature.

So, how are you tracking that these known issues are not lost?

Author

@BaluDontu BaluDontu left a comment

Addressed Sandeep's comments.

if err != nil {
return nil, err
}
dsMorefs := make(map[int][]types.ManagedObjectReference)

Comment on lines #210-219

This entire block can be written in a simpler and more efficient way. There is no need to create multiple maps to arrive at the shared datastore list, and I found the logic a bit complex to understand. Here's how we can write this block (pseudocode only):

Set sharedDs;
for (i = 0; i < vmList.size(); i++) {
    Set accessibleDs = getAccessibleDatastoresForNode(vmList.get(i))
    if (i == 0) {
        sharedDs.addAll(accessibleDs);
    } else {
        sharedDs = intersect(sharedDs, accessibleDs);
        if (sharedDs.size() == 0) {
            break;
        }
    }
}

Set intersect(sharedDs, accessibleDs) {
    for ds : sharedDs {
        if (accessibleDs.get(ds) == null) {
            sharedDs.remove(ds)
        }
    }
    return sharedDs
}
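
A runnable Go rendering of the pseudocode above might look like the following sketch. The accessible map stands in for the per-node datastore lookup and is hypothetical, as is the function name; the sorted output is a convenience of this example, not something the review asks for.

```go
package main

import (
	"fmt"
	"sort"
)

// sharedDatastores returns the datastores accessible from every node.
// accessible maps a node name to the datastores that node can reach; in real
// code this would come from the ESX host running each node VM.
func sharedDatastores(nodes []string, accessible map[string][]string) []string {
	var shared map[string]struct{}
	for i, node := range nodes {
		current := make(map[string]struct{})
		for _, ds := range accessible[node] {
			// Keep ds if this is the first node or ds has survived so far.
			if _, ok := shared[ds]; i == 0 || ok {
				current[ds] = struct{}{}
			}
		}
		shared = current
		if len(shared) == 0 {
			break // nothing is common to all nodes; stop early
		}
	}
	result := make([]string, 0, len(shared))
	for ds := range shared {
		result = append(result, ds)
	}
	sort.Strings(result) // deterministic order for callers
	return result
}

func main() {
	accessible := map[string][]string{
		"node1": {"ds1", "ds2", "ds3"},
		"node2": {"ds2", "ds3"},
	}
	fmt.Println(sharedDatastores([]string{"node1", "node2"}, accessible))
}
```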

@@ -1535,20 +1533,27 @@ func (vs *VSphere) cleanUpDummyVMs(dummyVMPrefix string) {
f.SetDatacenter(dc)

// Get the folder reference for global working directory where the dummy VM needs to be created.
-vmFolder, err := getFolder(ctx, vs.client, vs.cfg.Global.Datacenter, vs.cfg.Global.WorkingDir)
+vmFolder, err := f.Folder(ctx, strings.TrimSuffix(vs.cfg.Global.WorkingDir, "/"))

I did not quite get why you had to modify this method. Can you explain what this method does, when it is invoked, and why it needs to change?

Author

This just gets the folder. I don't need a separate function; I can get the folder reference from the working directory path.

var sharedList []string
for _, val1 := range list1 {
// Check if val1 is found in list2
for _, val2 := range list2 {

This search will be expensive if there are many entries, though it may be acceptable when there are few. I was expecting you to use a "map", since there is no "set" data structure in the Go standard library. The logic seems correct, but see if you can use a map instead.

Author

Since a VM will have very few datastores, this logic will not be expensive.
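
For comparison, the map-based variant the reviewer has in mind could be sketched as below. The function name is hypothetical; it replaces the nested loop with one pass to build a lookup map and one pass to filter, which is O(n+m) rather than O(n*m).

```go
package main

import "fmt"

// intersect returns the values of list1 that also appear in list2,
// preserving list1's order. A map with empty-struct values serves as a set.
func intersect(list1, list2 []string) []string {
	seen := make(map[string]struct{}, len(list2))
	for _, v := range list2 {
		seen[v] = struct{}{}
	}
	var shared []string
	for _, v := range list1 {
		if _, ok := seen[v]; ok {
			shared = append(shared, v)
		}
	}
	return shared
}

func main() {
	fmt.Println(intersect([]string{"ds1", "ds2", "ds3"}, []string{"ds3", "ds2"}))
}
```

As the author notes, with only a handful of datastores per VM the difference is unlikely to be measurable; the map version mainly helps if the lists grow.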

@SandeepPissay

The code changes look OK to me. I don't see any "testing done" mentioned in this review. Could you explain what tests you have run on the latest code to verify that it works fine?

@BaluDontu
Author

BaluDontu commented May 20, 2017

Here are a few test cases I have executed for this PR:

  • Specify only the SPBM storage policy name.
    • Verify that the disk is provisioned on a compatible datastore with the most free space.
  • Specify the SPBM storage policy name with a user-specified datastore that is compatible.
    • Verify that the disk is provisioned on the user-specified datastore even though it has less free space.
  • Specify a storage policy name that is defined on VC but has no compatible datastores.
    • Verify that PVC creation errors out, reporting that no compatible datastores match the storage policy requirements.
  • Specify a storage policy name that is not defined on VC.
    • Verify that PVC creation errors out, reporting that no pbm profile with this policy is found.
  • Specify both an SPBM storage policy name and VSAN capabilities together.
    • Verify that PVC creation errors out, reporting that you can't use an SPBM policy name together with VSAN capabilities; only one can be specified.
  • Specify the SPBM storage policy name with a user-specified datastore that is not compatible.
    • Verify that PVC creation errors out, reporting that it can't provision a disk on a non-compatible datastore, with the reason why it is not compatible.
  • Switch off one of the hosts where the VM using the volume resides. If the pod starts on a new host, make sure that the volume is still accessible and is attached successfully.
  • Enable DRS and make sure that the movement of VMs between hosts doesn't affect volume availability for the VM on the new host.
  • Specify VSAN policy params in the storage class and make sure they still work alongside the newly implemented functionality.
  • The user should specify "storagepolicyname" in the storage class. If a wrong param is given, error out and tell the user to specify a correct parameter.
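
The last test case, rejecting a wrong parameter, could be implemented along the lines of this sketch. The parameter names are the ones that appear in this PR; treating keys as case-insensitive, the function name, and the error wording are assumptions of the example.

```go
package main

import (
	"fmt"
	"strings"
)

// knownParams lists the storage class parameters that appear in this PR.
// Treating keys as case-insensitive is an assumption of this sketch.
var knownParams = map[string]bool{
	"diskformat":        true,
	"datastore":         true,
	"fstype":            true,
	"storagepolicyname": true,
}

// validateParams returns an error naming an unknown parameter, if any.
func validateParams(params map[string]string) error {
	for key := range params {
		if !knownParams[strings.ToLower(key)] {
			return fmt.Errorf("invalid storage class parameter %q", key)
		}
	}
	return nil
}

func main() {
	good := map[string]string{"diskformat": "zeroedthick", "storagepolicyName": "policy1"}
	bad := map[string]string{"storagepolicy": "policy1"}
	fmt.Println(validateParams(good))
	fmt.Println(validateParams(bad))
}
```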

@SandeepPissay

LGTM

@BaluDontu BaluDontu closed this Jun 8, 2017
@BaluDontu
Author

The PR is already merged in Kubernetes: kubernetes#46176

@BaluDontu BaluDontu deleted the SPBMPolicySupportVsphere branch July 13, 2017 23:22