Add ability to assign tags to virtual machines #1014

Merged
merged 4 commits into from
Oct 18, 2021

Conversation

alexander-demicev
Contributor

What this PR does / why we need it:
Add the ability to assign tags to virtual machines, similar to what is done in AWS. To do so, we need a Tags field in VSphereMachineTemplate and a REST client. This change assumes that the tags have already been created by the user.

Release note:

Add the ability to assign tags to virtual machines. 

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Aug 24, 2020
@k8s-ci-robot
Contributor

Hi @alexander-demichev. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot requested review from ncdc and yastij August 24, 2020 10:50
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Aug 24, 2020
@@ -38,6 +40,7 @@ func TestCreate(t *testing.T) {
t.Fatal(err)
}
model.Service.TLS = new(tls.Config)
model.Service.RegisterEndpoints = true
Contributor Author

This line and the blank import _ "github.com/vmware/govmomi/vapi/simulator" are needed so that the simulator registers the REST session endpoint. Without them, creating the REST client fails with the error: unable to create rest client: POST https://127.0.0.1:63223/rest/com/vmware/cis/session: 404 Not Found
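
To illustrate why the request ends in a 404, here is a minimal sketch using only the Go standard library. It does not use govmomi: newFakeServer and its registerEndpoints flag are hypothetical stand-ins for the simulator, and only the session path is taken from the error message above. The point is simply that a server which never mounts the session handler answers the login POST with 404 Not Found.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// sessionPath is the REST session endpoint quoted in the error message.
const sessionPath = "/rest/com/vmware/cis/session"

// newFakeServer is a hypothetical stand-in for the simulator: the REST
// session handler is only mounted when registerEndpoints is true.
func newFakeServer(registerEndpoints bool) *httptest.Server {
	mux := http.NewServeMux()
	if registerEndpoints {
		mux.HandleFunc(sessionPath, func(w http.ResponseWriter, r *http.Request) {
			if r.Method != http.MethodPost {
				w.WriteHeader(http.StatusMethodNotAllowed)
				return
			}
			w.WriteHeader(http.StatusOK)
		})
	}
	return httptest.NewServer(mux)
}

// loginStatus posts to the session endpoint and returns the HTTP status code.
func loginStatus(srv *httptest.Server) (int, error) {
	resp, err := http.Post(srv.URL+sessionPath, "application/json", nil)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

func main() {
	for _, register := range []bool{false, true} {
		srv := newFakeServer(register)
		code, err := loginStatus(srv)
		srv.Close()
		if err != nil {
			fmt.Println("request failed:", err)
			continue
		}
		fmt.Printf("registerEndpoints=%v -> HTTP %d\n", register, code)
	}
}
```

In the real test the same effect comes from the two pieces mentioned above, the RegisterEndpoints field and the blank import of the vapi simulator package.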

@yastij
Member

yastij commented Aug 24, 2020

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Aug 24, 2020
@yastij
Member

yastij commented Aug 24, 2020

@alexander-demichev - we're planning to cut v0.7.1 today, let's review this, merge it and cut a release once it's ready

@yastij yastij self-assigned this Aug 24, 2020
@randomvariable
Member

Need to add the conversion webhooks for the tests to pass.

@alexander-demicev alexander-demicev force-pushed the tags branch 2 times, most recently from 29638ff to 48bfb68 Compare August 24, 2020 12:27
@EleanorRigby
Contributor

@alexander-demichev : I think these tags have to pre-exist in vCenter. Is there a way to dynamically create/delete them with VMs?

@ncdc
Contributor

ncdc commented Aug 27, 2020

@alexander-demichev would you be able to elaborate on your use case here? I recognize that other infra providers such as CAPA allow this, but you may want to read https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/techpaper/performance/tagging-vsphere67-perf.pdf about tag performance in larger configurations.

Member

@yastij yastij left a comment

I didn't review the tests yet, I'll do another pass monday

controllers/vspheremachine_controller.go Outdated Show resolved Hide resolved
controllers/vspherevm_controller.go Outdated Show resolved Hide resolved
@@ -41,7 +42,9 @@ import (
)

// VMService provides an API to interact with the VMs using govmomi
type VMService struct{}
type VMService struct {
Tags []string
Member

same

pkg/services/govmomi/service.go Show resolved Hide resolved
@@ -273,6 +281,18 @@ func (vms *VMService) reconcilePowerState(ctx *virtualMachineContext) (bool, err
}
}

func (vms *VMService) reconcileTags(ctx *virtualMachineContext) error {
tagManager := tags.NewManager(ctx.Session.RestClient)

Member

I'd remove this blank line

pkg/session/session.go Outdated Show resolved Hide resolved
Member

@yastij yastij left a comment

Another pass for the API


// Tags is an optional set of tags to add to an instance.
// +optional
Tags []string `json:"tags,omitempty"`
Member

I'd put these in the VirtualMachineCloneSpec

Member

can we make the tag category explicit also ? (this would enable key:value use cases)

Contributor Author

@yastij Do you have an example of how to attach a tag from a specific category?

api/v1alpha3/vspherevm_types.go Outdated Show resolved Hide resolved
@yastij
Member

yastij commented Aug 31, 2020

@alexander-demichev - the scalability concerns pointed out by @ncdc still stand. What is the use case we want to cover, and at what scale?

@alexander-demicev
Contributor Author

@yastij @ncdc I didn't know about performance issues with tags. My use case is similar to what tags do for other providers.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 30, 2020
@vrabbi

vrabbi commented Dec 2, 2020

i also have a few use cases for this. having the tags would greatly enhance the ability for extensibility using the VMware Ecosystem tools. the use cases i have seen for this so far are:

  1. on-boarding clusters to vRealize Automation using the tag based on-boarding mechanism
  2. using VMware Event Broker Appliance to run custom tasks when a specific event occurs. this use case is very helpful as there is no specific event in vSphere for a CAPV cluster being deployed, as it is simply provisioning VMs from the vSphere perspective. by adding the tags, a custom action could be built in VEBA which could run any custom logic when a node is created in a CAPV cluster. this could include DRS rules (anti-affinity or affinity), creating DNS records for the nodes, adding nodes to a CMDB, creating custom groups in vROps for monitoring a cluster, etc.

regarding the performance issues, while this is something to take into consideration, 25k tag assignments with good performance, as mentioned in the article, seems quite reasonable. many systems add tags in vSphere, and i don't think the performance issue should be a blocker here, just as a user possibly not having enough resources isn't a blocker for automated deployments.
having CAPV add tags on its own would be an issue, but if it's opt-in and user-defined, i think it makes a lot of sense to add this functionality

@embano1

embano1 commented Dec 7, 2020

Not sure about the initial use case here from reading the PR, but I'd suggest to use an async/throttled out-of-band (ie decoupled) mechanism based on vSphere events for this if possible, eg using the event broker or similar. It helps with separation of concerns (access permissions to VC), scalability concerns, network issues and making the controller logic less vulnerable to reconciliation/blocking issues due to heavy VC RPCs. We've seen users successfully adopting this model for tag management (beyond K8s controllers) and it works nicely (orthogonal) with the async reconciliation pattern in K8s controllers.
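
As a rough illustration of the decoupled, throttled model described above, here is a minimal sketch in plain Go. Everything in it is an assumption made for illustration: tagRequest, attachFunc and runThrottled are hypothetical names, and the attach callback stands in for whatever call actually talks to vCenter. The producer (for example a controller or an event handler) only enqueues work, and a single worker performs at most one tagging call per interval.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// tagRequest is a hypothetical unit of work: attach TagIDs to one VM.
type tagRequest struct {
	VM     string
	TagIDs []string
}

// attachFunc stands in for the real vCenter call; it is an assumption.
type attachFunc func(req tagRequest) error

// runThrottled drains requests and calls attach at most once per interval,
// so the producer never blocks on a slow tagging call. It returns the number
// of successful calls once the requests channel has been closed and drained.
func runThrottled(requests <-chan tagRequest, interval time.Duration, attach attachFunc) int {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	succeeded := 0
	for req := range requests {
		<-ticker.C // wait for the next free slot
		if err := attach(req); err != nil {
			fmt.Printf("tagging %s failed: %v\n", req.VM, err)
			continue
		}
		succeeded++
	}
	return succeeded
}

func main() {
	requests := make(chan tagRequest, 16)
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		n := runThrottled(requests, 10*time.Millisecond, func(req tagRequest) error {
			fmt.Printf("attached %d tag(s) to %s\n", len(req.TagIDs), req.VM)
			return nil
		})
		fmt.Println("successful calls:", n)
	}()
	// The producer only enqueues; it does not wait for vCenter.
	for _, vm := range []string{"vm-1", "vm-2", "vm-3"} {
		requests <- tagRequest{VM: vm, TagIDs: []string{"urn:vmomi:InventoryServiceTag:example:GLOBAL"}}
	}
	close(requests)
	wg.Wait()
}
```

A real implementation would also need retries and a way to report failures back to the user, which this sketch leaves out.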

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 7, 2020
@yastij
Member

yastij commented Dec 8, 2020

@vrabbi - I agree that we should have this. @alexander-demichev can you address the comments and rebase ?

@yastij yastij removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 8, 2020
@alexander-demicev
Contributor Author

@yastij yes, I'll try to find time this week

@gab-satchi
Member

Summarizing what's left to be done for the PR:

  • rebase now that main branch has switched over to v1alpha4. The API changes will need to be modified to drop v1alpha2
  • switch to the batch tagging API. also implies we use tagIDs in the spec and not the tag name.
  • rate limiting for the tag attach call. This may not be strictly necessary if we switch to the batch tagging

@alexander-demichev apologies for the long delay with this PR. Is this still something you're interested in contributing?

@vrabbi vrabbi left a comment

it should be explicit we are dealing with Tag IDs and not Tag Names

@@ -514,6 +514,11 @@ spec:
a linked clone. This field is ignored if LinkedClone is not enabled.
Defaults to the source's current snapshot.
type: string
tags:
description: Tags is an optional set of tags to add to an instance.

should be "Tags is an optional set of tag IDs to add to an instance."

Member

Suggested change
description: Tags is an optional set of tags to add to an instance.
description: Tags is an optional set of tag IDs to add to an instance.

@@ -574,6 +574,12 @@ spec:
to create a linked clone. This field is ignored if LinkedClone
is not enabled. Defaults to the source's current snapshot.
type: string
tags:
description: Tags is an optional set of tags to add to an

Should be "Tags is an optional set of tag IDs to add to an instance"

@@ -272,6 +272,11 @@ spec:
a linked clone. This field is ignored if LinkedClone is not enabled.
Defaults to the source's current snapshot.
type: string
tags:
description: Tags is an optional set of tags to add to an instance.

This should be "Tags is an optional set of tag IDs to add to an instance."

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 18, 2021
@alexander-demicev
Contributor Author

Hi all, I updated the PR:

  1. bumped govmomi dependency so we can use batch tagging
  2. changed API field name to TagIDs

I'm having an issue with generating conversions, it's failing with the following error:
E0818 13:26:54.674110 59681 conversion.go:755] Warning: could not find nor generate a final Conversion function for sigs.k8s.io/cluster-api-provider-vsphere/api/v1alpha4.VirtualMachineCloneSpec -> sigs.k8s.io/cluster-api-provider-vsphere/api/v1alpha3.VirtualMachineCloneSpec
E0818 13:26:54.674407 59681 conversion.go:756] the following fields need manual conversion:
E0818 13:26:54.674417 59681 conversion.go:758] - TagIDs
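
For context, the generator reports this because the older type has no TagIDs field to copy into, so a hand-written conversion has to decide what happens to it. The sketch below shows one possible approach with simplified, hypothetical types (cloneSpecV1alpha4, cloneSpecV1alpha3) and an invented annotation key. It is not the project's generated code and not necessarily what this PR ends up doing: it copies the shared fields and stashes TagIDs in an annotation so a round trip does not lose them.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Simplified, hypothetical stand-ins for the two API versions; the real
// VirtualMachineCloneSpec types carry many more fields.
type cloneSpecV1alpha4 struct {
	Template string
	TagIDs   []string
}

type cloneSpecV1alpha3 struct {
	Template string
}

// tagIDsAnnotation is an assumed annotation key used only in this sketch.
const tagIDsAnnotation = "example.invalid/tag-ids"

// convertDown copies the shared fields and stashes TagIDs, which the older
// version cannot represent, in the (non-nil) annotations map.
func convertDown(in cloneSpecV1alpha4, annotations map[string]string) (cloneSpecV1alpha3, error) {
	out := cloneSpecV1alpha3{Template: in.Template}
	if len(in.TagIDs) > 0 {
		raw, err := json.Marshal(in.TagIDs)
		if err != nil {
			return out, err
		}
		annotations[tagIDsAnnotation] = string(raw)
	}
	return out, nil
}

// convertUp copies the shared fields and restores TagIDs from the
// annotations map when a stashed value is present.
func convertUp(in cloneSpecV1alpha3, annotations map[string]string) (cloneSpecV1alpha4, error) {
	out := cloneSpecV1alpha4{Template: in.Template}
	if raw, ok := annotations[tagIDsAnnotation]; ok {
		if err := json.Unmarshal([]byte(raw), &out.TagIDs); err != nil {
			return out, err
		}
	}
	return out, nil
}

func main() {
	annotations := map[string]string{}
	newer := cloneSpecV1alpha4{
		Template: "ubuntu-2004",
		TagIDs:   []string{"urn:vmomi:InventoryServiceTag:example:GLOBAL"},
	}
	older, err := convertDown(newer, annotations)
	if err != nil {
		panic(err)
	}
	restored, err := convertUp(older, annotations)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n%+v\n", older, restored)
}
```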

@@ -140,6 +140,10 @@ func (vms *VMService) ReconcileVM(ctx *context.VMContext) (vm infrav1.VirtualMac
return vm, err
}

if err := vms.reconcileTags(vmCtx); err != nil {

I think error handling could be slightly improved here instead of reconciling (potentially forever) on any error, including unrecoverable errors.

To quote from the docs:

AttachMultipleTagsToObject attaches multiple tag IDs to a managed object. This operation is idempotent. If a tag is already attached to the object, then the individual operation is a no-op and no error will be thrown. This operation is not atomic. If the underlying call fails with one or more tags not successfully attached to the managed object reference it might leave the managed object reference in a partially tagged state and needs to be resolved by the caller. In this case BatchErrors is returned and can be used to analyse failure reasons on each failed tag. Specified tagIDs must use URN-notation instead of display names or a generic error will be returned and no tagging operation will be performed. If the managed object reference does not exist a generic 403 Forbidden error will be returned. This operation was added in vSphere API 6.5.

We are working on improved error type handling (for assertions) but the above should give you an impression of different error scenarios.
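
To make the scenarios in the quoted docs a bit more concrete, here is a minimal sketch of how a caller might decide what to retry after a non-atomic batch call. The types are hypothetical and are not govmomi's: tagFailure and batchError only loosely model the per-tag failures the docs describe, and errGeneric stands in for the generic error case.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// tagFailure is a hypothetical per-tag failure, loosely modeled on the
// per-tag errors described in the quoted docs. It is not govmomi's type.
type tagFailure struct {
	TagID  string
	Reason string
}

// batchError aggregates per-tag failures from one non-atomic attach call.
type batchError struct {
	Failures []tagFailure
}

func (e *batchError) Error() string {
	ids := make([]string, 0, len(e.Failures))
	for _, f := range e.Failures {
		ids = append(ids, f.TagID)
	}
	return "failed to attach tags: " + strings.Join(ids, ", ")
}

// errGeneric stands in for a generic, non-batch failure in this sketch.
var errGeneric = errors.New("generic tagging error")

// remainingTags returns the tag IDs that still need attaching after err.
// A batch error names the failed tags, so only those are returned. Any other
// error gives no per-tag information, so all IDs are returned; retrying them
// is safe because the operation is described as idempotent.
func remainingTags(all []string, err error) []string {
	if err == nil {
		return nil
	}
	var be *batchError
	if !errors.As(err, &be) {
		return all
	}
	out := make([]string, 0, len(be.Failures))
	for _, f := range be.Failures {
		out = append(out, f.TagID)
	}
	return out
}

func main() {
	all := []string{"urn:a", "urn:b", "urn:c"}
	partial := &batchError{Failures: []tagFailure{{TagID: "urn:b", Reason: "not found"}}}
	fmt.Println(remainingTags(all, partial))
	fmt.Println(remainingTags(all, errGeneric))
	fmt.Println(remainingTags(all, nil))
}
```

Whether a failed tag is worth retrying at all (for example a tag that does not exist) is a separate question that this sketch does not try to answer.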

Contributor Author

Does it mean that there are no typed errors at the moment?

@embano1 embano1 Aug 18, 2021

Only partially ([]BatchError) and the situation is improving as we export more errors. Please see the tests for the method which shows common error scenarios and the errors.

Member

For now, we don't distinguish between terminal and intermittent errors and we don't need to tackle that in this PR. What could be useful here is setting a condition on the VSphereMachine for reconciling tags. Right now if there is an error, nothing will show on the VSphereMachine to help the user debug further.
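
A minimal sketch of what surfacing the tagging result as a condition could look like. The condition and machineStatus types, the "TagsAttached" type name and the "TagsAttachmentFailed" reason are all hypothetical and chosen for illustration; they are not the project's actual condition types or helpers.

```go
package main

import (
	"errors"
	"fmt"
)

// condition and machineStatus are simplified, hypothetical stand-ins for
// the status conditions carried by a machine object.
type condition struct {
	Type    string
	Status  bool
	Reason  string
	Message string
}

type machineStatus struct {
	Conditions []condition
}

// errExample is a sample failure used only for demonstration.
var errExample = errors.New("tag not found")

// setCondition replaces an existing condition of the same type or appends it.
func (s *machineStatus) setCondition(c condition) {
	for i := range s.Conditions {
		if s.Conditions[i].Type == c.Type {
			s.Conditions[i] = c
			return
		}
	}
	s.Conditions = append(s.Conditions, c)
}

// recordTagResult turns the outcome of tag reconciliation into a condition,
// so a failure is visible on the object instead of only in controller logs.
func recordTagResult(s *machineStatus, err error) {
	c := condition{Type: "TagsAttached", Status: err == nil}
	if err != nil {
		c.Reason = "TagsAttachmentFailed"
		c.Message = err.Error()
	}
	s.setCondition(c)
}

func main() {
	status := &machineStatus{}
	recordTagResult(status, errExample)
	fmt.Printf("%+v\n", status.Conditions)
	recordTagResult(status, nil)
	fmt.Printf("%+v\n", status.Conditions)
}
```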

@alexander-demicev alexander-demicev force-pushed the tags branch 2 times, most recently from a8ec17d to 7a2d727 Compare August 19, 2021 12:30
Member

@gab-satchi gab-satchi left a comment

thanks for the changes. tested it out locally and it works. It's a slight nuisance to have to use the tag URNs instead of just the name but that's an API limitation.
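
Since a display name passed in place of a URN only produces a generic error, a cheap client-side check could flag the mistake earlier. The sketch below is an assumption-laden illustration: it presumes tag IDs have the shape urn:vmomi:InventoryServiceTag:<id>:GLOBAL, which is how they commonly appear, and looksLikeTagURN and invalidTagIDs are hypothetical helpers that are not part of this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// Assumed shape of a tag ID: urn:vmomi:InventoryServiceTag:<id>:GLOBAL.
const (
	tagURNPrefix = "urn:vmomi:InventoryServiceTag:"
	tagURNSuffix = ":GLOBAL"
)

// looksLikeTagURN reports whether id has the assumed URN shape with a
// non-empty identifier between the prefix and the suffix.
func looksLikeTagURN(id string) bool {
	if !strings.HasPrefix(id, tagURNPrefix) || !strings.HasSuffix(id, tagURNSuffix) {
		return false
	}
	return len(id) > len(tagURNPrefix)+len(tagURNSuffix)
}

// invalidTagIDs returns the entries that do not look like tag URNs,
// for example display names entered by mistake.
func invalidTagIDs(ids []string) []string {
	var bad []string
	for _, id := range ids {
		if !looksLikeTagURN(id) {
			bad = append(bad, id)
		}
	}
	return bad
}

func main() {
	ids := []string{
		"urn:vmomi:InventoryServiceTag:example-id:GLOBAL",
		"production", // a display name, not a URN
	}
	fmt.Println(invalidTagIDs(ids))
}
```

This only checks the shape of the string; it cannot tell whether the tag actually exists in vCenter.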

@@ -0,0 +1,24 @@
/*
Member

when I ran a make generate there is a new function in the generated conversion file. Should add that in. This file/function is still needed as it requires us to do the manual conversion.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 26, 2021
@yastij
Member

yastij commented Oct 6, 2021

@alexander-demichev - pending @gab-satchi's comment on adding a condition, this looks good. Once we merge the v1beta1 types could you rebase your PR ? this way it can make it into the new release

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 15, 2021
@alexander-demicev
Contributor Author

@yastij Hi, I rebased the PR and added a condition

Member

@yastij yastij left a comment

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: yastij

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 18, 2021
@gab-satchi
Member

thanks for the changes

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 18, 2021
@k8s-ci-robot k8s-ci-robot merged commit 423b6df into kubernetes-sigs:master Oct 18, 2021
@alexander-demicev alexander-demicev deleted the tags branch March 29, 2022 10:00
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.