Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨ support server group and scheduler hint additional properties #2163

Conversation

okozachenko1203
Copy link
Contributor


What this PR does / why we need it:
This PR adds a new field SchedulerHintAdditionalProperties in both OpenstackServerSpec and OpenstackMachineSpec to specify nova scheduler hint properties during server creation.
In addition, it passes openStackMachineSpec.ServerGroup to OpenStackServerSpec.ServerGroup which seems missing during the spec conversion.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

  1. Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

TODOs:

  • squashed commits
  • if necessary:
    • includes documentation
    • adds unit tests

/hold

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Aug 20, 2024
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Aug 20, 2024
@k8s-ci-robot
Copy link
Contributor

Hi @okozachenko1203. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Aug 20, 2024
Copy link

netlify bot commented Aug 20, 2024

Deploy Preview for kubernetes-sigs-cluster-api-openstack ready!

Name Link
🔨 Latest commit bb6cf4d
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-sigs-cluster-api-openstack/deploys/66eadc87d2a3560008de74d8
😎 Deploy Preview https://deploy-preview-2163--kubernetes-sigs-cluster-api-openstack.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@jichenjc
Copy link
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Aug 20, 2024
@okozachenko1203 okozachenko1203 force-pushed the feat/support-server-group-scheduler-hints branch from 54b04e8 to 5eb1da6 Compare August 21, 2024 05:21
@okozachenko1203
Copy link
Contributor Author

/retest

@okozachenko1203
Copy link
Contributor Author

/cherry-pick release-0.10
/cherry-pick release-0.9

@k8s-infra-cherrypick-robot

@okozachenko1203: only kubernetes-sigs org members may request cherry picks. If you are already part of the org, make sure to change your membership to public. Otherwise you can still do the cherry-pick manually.

In response to this:

/cherry-pick release-0.10
/cherry-pick release-0.9

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@jichenjc
Copy link
Contributor

This PR adds a new field SchedulerHintAdditionalProperties in both OpenstackServerSpec and OpenstackMachineSpec to specify nova scheduler hint properties during server creation.
In addition, it passes openStackMachineSpec.ServerGroup to OpenStackServerSpec.ServerGroup which seems missing during the spec conversion.

I think it make sense to split those 2 as 2nd seems to be a bug and #1 to be a feature


// SchedulerHintAdditionalProperties are arbitrary scheduler hint key/values.
// +optional
SchedulerHintAdditionalProperties map[string]string `json:"schedulerHintAdditionalProperties,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems this is not true, we can't expect it is always a string from the API spec?

https://docs.openstack.org/api-ref/compute/



    The request validation schema is per hint. For example, some require a single string value, and some accept a list of values.

    Hints are only used based on the cloud scheduler configuration, which varies per deployment.

    Hints are pluggable per deployment, meaning that a cloud can have custom hints which may not be available in another cloud.

For these reasons, it is important to consult each cloud’s user documentation to know what is available for scheduler hints.

Copy link
Contributor Author

@okozachenko1203 okozachenko1203 Aug 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jichenjc yes, it is true. We have our own custom filter and need to integrate it with capi.
strictly speaking, SchedulerHintAdditionalProperties presents os:scheduler_hints["additionalProperties"], But it is also any type. (correct me if i'm wrong) I think we cannot map[string]interface{} because kubebuilder doesn't support. Also it is not a good idea to define our own hint with strict type, and we cannot know what potential properties other filters define and use.
I couldn't find any other reference or type struct def which handles any type, and i used a string here because our custom filter accepts boolean and string type values.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jichenjc do you have thought on this? :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no solution but just questioning the API and the way of implementation
if we are sure the string (maybe define our own format) , or we claim we only support string type filter
it might be ok...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i am not sure if any changes required for this convo

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please could you convert this map to a list of named objects. This is a recommendation of the k8s API guidelines which (I think!) we now follow everywhere else: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#lists-of-named-subobjects-preferred-over-maps

@okozachenko1203 okozachenko1203 changed the title feat: support server group and scheduler hint additional properties ✨ support server group and scheduler hint additional properties Aug 22, 2024
server, err := s.getComputeClient().CreateServer(
keypairs.CreateOptsExt{
CreateOptsBuilder: serverCreateOpts,
KeyName: instanceSpec.SSHKeyName,
},
servers.SchedulerHintOpts{
Group: instanceSpec.ServerGroupID,
Group: instanceSpec.ServerGroupID,
AdditionalProperties: schedulerAdditionalProperties,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and as group is part of the scheduler properties ,should we consider consolidate them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jichenjc well, it is true that all those fields in servers.schedulerHintOpts are merged and passed as os:scheduler_hints request body parameter, so we can set group under additionalProperties. But I am not sure if that is necessary because gophercloud already defined Group as an explicit parameter (i think it is just for convenient) under servers.SchedulerHintOpts.
https://github.com/gophercloud/gophercloud/blob/5cb81d730a1e027aa5981edd23c7b1403411db40/openstack/compute/v2/servers/requests.go#L141

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's correct to leave Group separate here. I considered asking that we validate that AdditionalProperties does not contain group, but the problem with that is that we can't easily extend that in the future as we add new 'supported' hints.

So I think additionalProperties is a footgun, but there's no good way round that. We should just document that adding 'group' to it is a bad idea and has undefined behaviour.

@mdbooth
Copy link
Contributor

mdbooth commented Sep 4, 2024

/cherry-pick release-0.10 /cherry-pick release-0.9

We would not typically backport this kind of change. Backporting to 0.9 in particular would likely be very difficult.

Copy link
Contributor

@mdbooth mdbooth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As noted below we should convert this map into a list in the API, and marshal it into an appropriate map before send it to gophercloud.

I'm also pontificating on @jichenjc's observation that the value type should be any rather than string. I also note that your own custom filter accepts boolean or string. Another might accept an arbitrary struct?

How about using a discriminated union here:

additionalProperties:
- name: my_custom_bool_hint
  value:
    type: bool
    bool: true
- name: my_custom_string_hint
  value:
    type: string
    string: "foo"
- name: my_custom_int_hint
  value:
    type: int
    int: 7
- name: my_custom_object_hint
  value:
    type: object
    object:
      foo: bar
      baz:
      - baz1
      - baz2

I'd recommend against implementing any types that we don't currently need, but object would probably involve RawExtension in some capacity. This would be unmarshalled into map[string]any before passing to gophercloud.

@JoelSpeed got a better idea?


// SchedulerHintAdditionalProperties are arbitrary scheduler hint key/values.
// +optional
SchedulerHintAdditionalProperties map[string]string `json:"schedulerHintAdditionalProperties,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please could you convert this map to a list of named objects. This is a recommendation of the k8s API guidelines which (I think!) we now follow everywhere else: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#lists-of-named-subobjects-preferred-over-maps

Copy link
Contributor

@mdbooth mdbooth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, thanks for coming back on this so quickly.

The change from map to array is definitely good. Although it was my suggestion, I'm still hoping we can come up with something better than the union discriminated by type for the value. If I can find a minute I'm going to protype replacing the whole value with a RawExtension and write some tests to see what that looks like in practise. If it's ok that might be a better solution.

If we do go with the discriminated union, I'd say:

  • lets omit object for now because I'm going to ask for validation and tests, and you don't want to write those for something nobody's using yet
  • lets rename int to number so our janky JSON marshaller at least uses the correct JSON jargon

Whatever we come up with, we will need to write CEL to validate it, and test those validations in the apivalidations suite. They're pretty easy to write and there are a ton of examples to follow already.

I'll try to prototype the RawExtension thing today, but if I'm taking too long feel free to try it yourself. I want to know:

  • What does it look like to populate in Go
  • What does it look like to populate in a yaml template
  • Are there any quirks in how it marshals to json which would make it unusable for this?

Ideally it just acts like any, but I don't know.

controllers/openstackmachine_controller.go Show resolved Hide resolved
Copy link
Contributor

@mdbooth mdbooth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, thanks. Incidentally, I tested out RawExtension and it's not what we want: it requires the data to have an APIVersion and Kind, which isn't relevant to this use case. Based on my quick tinkering I think it will be possible to create an Object type, but until we have somebody who needs it I don't think it's worth the effort to write. It will certainly be more complex than the simple types implemented here.

Please can you add API validations for the new API. I've linked examples below and it should be fairly straightforward. Please can you also add tests to test/e2e/suites/apivalidations/openstackmachine_test.go. There are lots of examples in there and these should be very straightforward. I think we want:

  • empty list is ok
  • item with empty name is not ok
  • item with empty value is not ok
  • value.type is bool and value.bool is populated is ok
  • value.type is bool and value.string is populated is not ok
  • ... and repeat

Please can we also add something to the unit tests, probably

func TestService_ReconcileInstance(t *testing.T) {
getDefaultServerCreateOpts := func() servers.CreateOpts {
to validate that we're marshalling the struct correctly when sending it to gophercloud? We're already passing expectedSchedulerHintOpts into expectCreateServer, so this should be fairly easy to add:

  • Modify expectCreateServer to take servers.SchedulerHintOpts instead of SchedulerHintOptsBuilder, as this will be easier to validate. This change shouldn't break anything.
  • Add a new test: "With custom scheduler hint number" (or whatever)
  • In getInstanceSpec, fetch the default, set the required scheduler hints
  • In expect, call expectCreateServer with the expected schedulerHintOpts.

api_violations.report Outdated Show resolved Hide resolved
api/v1beta1/openstackmachine_types.go Show resolved Hide resolved
api/v1beta1/openstackmachine_types.go Outdated Show resolved Hide resolved
api/v1beta1/openstackmachine_types.go Outdated Show resolved Hide resolved
api/v1beta1/openstackmachine_types.go Outdated Show resolved Hide resolved
api/v1beta1/openstackmachine_types.go Outdated Show resolved Hide resolved
@k8s-ci-robot k8s-ci-robot removed the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Sep 11, 2024
@EmilienM
Copy link
Contributor

/test pull-cluster-api-provider-openstack-test

@EmilienM
Copy link
Contributor

Job is running. Since it's required we can hold cancel and it'll land once the CI job has finished.
Thanks for all the work here and your patience.

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 13, 2024
@EmilienM
Copy link
Contributor

ugh it went sleep again
/test pull-cluster-api-provider-openstack-test

@EmilienM
Copy link
Contributor

@okozachenko1203 I don't know what's going on in CI but could you please try a rebase on main, this will trigger jobs. I'll add the LGTM again.

@EmilienM
Copy link
Contributor

/test pull-cluster-api-provider-openstack-test

@okozachenko1203 okozachenko1203 force-pushed the feat/support-server-group-scheduler-hints branch from 45b6ccb to 4e0fad5 Compare September 14, 2024 01:43
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 14, 2024
@EmilienM
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 16, 2024
// +kubebuilder:validation:XValidation:rule="has(self.type) && self.type == 'Bool' ? self.bool != null : self.bool == null",message="bool is required when type is Bool, and forbidden otherwise"
// +kubebuilder:validation:XValidation:rule="has(self.type) && self.type == 'Number' ? self.number != null : self.number == null",message="number is required when type is Number, and forbidden otherwise"
// +kubebuilder:validation:XValidation:rule="has(self.type) && self.type == 'String' ? self.string != null : self.string == null",message="string is required when type is String, and forbidden otherwise"
// +union.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The union tags come from an unimplemented KEP from upstream Kube, eventually they should be implemented

Comment on lines +65 to +69
// Number is the integer value of the scheduler hint, used when Type is "Number".
// This field is required if type is 'Number', and must not be set otherwise.
// +unionMember,optional
Number *int `json:"number,omitempty"`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minimum or maximum values that are applicable here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there can be any minima or maxima in this case, as the nature of the interface is we don't know what it will be used for.

I wonder if we should make the size explicit anyway by making this an int64.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can set the maximum to the maximum of an int64 so that it is at least in the schema, but, ideally it would be lower.

All integer types in Kube APIs should be int32 or int64, other int primitives are not allowed by convention

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see that openapi codegen set int32 as default format for int typed values in zz_generated.openapi.go schema.
As long as the schema set it as int32, do you still need to set max or min value using the max and min value of int32 explicitly?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should always use the typed int, so, int32 or int64 in this case. And its best practice to have maximums, but in this case, I don't know that it's possible without excluding actually valid OSP API values

api/v1beta1/openstackmachine_types.go Outdated Show resolved Hide resolved
api/v1beta1/openstackmachine_types.go Show resolved Hide resolved
api/v1alpha1/openstackserver_types.go Outdated Show resolved Hide resolved
api/v1beta1/openstackmachine_types.go Outdated Show resolved Hide resolved
@EmilienM
Copy link
Contributor

/lgtm cancel
Valid comments from Joel (thanks for taking the time).

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 16, 2024
Copy link
Contributor

@mdbooth mdbooth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please can you rebase on top of #2173 when it lands to fix the bizarre error executing the -test job.

When you do that it's going to highlight a couple of genuine issues. There are some lint errors and some failures in apivalidations.

Comment on lines +65 to +69
// Number is the integer value of the scheduler hint, used when Type is "Number".
// This field is required if type is 'Number', and must not be set otherwise.
// +unionMember,optional
Number *int `json:"number,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there can be any minima or maxima in this case, as the nature of the interface is we don't know what it will be used for.

I wonder if we should make the size explicit anyway by making this an int64.

@okozachenko1203 okozachenko1203 force-pushed the feat/support-server-group-scheduler-hints branch from 44d2963 to a021b6d Compare September 18, 2024 12:47
@okozachenko1203 okozachenko1203 marked this pull request as draft September 18, 2024 13:14
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 18, 2024
@okozachenko1203 okozachenko1203 force-pushed the feat/support-server-group-scheduler-hints branch from a021b6d to 650e18c Compare September 18, 2024 13:18
@okozachenko1203 okozachenko1203 marked this pull request as ready for review September 18, 2024 13:29
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 18, 2024
manual conversion for schedulerhintproperties

use named object instead of map

update manifests

fix lint error

remove object from SchedulerHintAdditionalValue struct

add tests and validations

fix cel validation

add more unit tests

add additional annotations

update generated manifests

fix one apivalidation testcase
Copy link
Contributor

@mdbooth mdbooth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work, thanks!

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 23, 2024
@k8s-ci-robot k8s-ci-robot merged commit 6560f88 into kubernetes-sigs:main Sep 23, 2024
9 checks passed
@okozachenko1203 okozachenko1203 deleted the feat/support-server-group-scheduler-hints branch September 24, 2024 15:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

8 participants