Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support optional config #530

Merged
merged 9 commits into from
Jul 4, 2023
Merged

Conversation

adrianchiris
Copy link
Collaborator

No description provided.

@adrianchiris adrianchiris added the on hold This enhancement is currently on hold pending additional clarification and evaluation label May 22, 2023
@adrianchiris adrianchiris force-pushed the support-optional-config branch 2 times, most recently from dd763e1 to 871d33f Compare May 23, 2023 11:09
@adrianchiris adrianchiris removed the on hold This enhancement is currently on hold pending additional clarification and evaluation label Jun 28, 2023
@adrianchiris
Copy link
Collaborator Author

@ykulazhenkov @rollandf PTAL

example-cr.yaml Outdated Show resolved Hide resolved
kind-cluster.yaml Outdated Show resolved Hide resolved
nv-ipam-cm.yaml Outdated Show resolved Hide resolved
rdma-dp-config Outdated Show resolved Hide resolved
sriov-dp-config Outdated Show resolved Hide resolved
value-overwrite.yaml Outdated Show resolved Hide resolved
@adrianchiris
Copy link
Collaborator Author

@ykulazhenkov addressed your comments, PTAL

@ykulazhenkov ykulazhenkov self-requested a review June 29, 2023 09:28
Copy link
Collaborator

@ykulazhenkov ykulazhenkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thx

@@ -208,11 +212,11 @@ type IBKubernetesSpec struct {
UfmSecret string `json:"ufmSecret,omitempty"`
}

// NVIPAMSpec describes configuration options for nv-ipam
// 1. Image information for nv-ipam
// 2. Config for nv-ipam in JSON format
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this comment if a bit inaccurate because we specify a string representation of JSON here

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what exactly would you like to appear here ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe just 'Config for nv-ipam' is better option with a good examples and defaults. Anyway, I won't block PR for this or other nit

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack will update

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

@@ -561,7 +561,7 @@ optionally deployed components:
| `nvIpam.repository` | string | `ghcr.io/mellanox` | NVIDIA IPAM Plugin image repository |
| `nvIpam.version` | string | `v0.0.3` | NVIDIA IPAM Plugin image version |
| `nvIpam.imagePullSecrets` | list | `[]` | An optional list of references to secrets to use for pulling any of the Plugin image |
| `nvIpam.config` | string | `''` | Network pool configuration as described in https://github.com/Mellanox/nvidia-k8s-ipam |
| `nvIpam.config` | string | `{"pools": {"rdma-pool": {"subnet": "192.168.0.0/16", "perNodeBlockSize": 100, "gateway": "192.168.0.1"}}}` | Network pool configuration as described in [nvidia-k8s-ipam](https://github.com/Mellanox/nvidia-k8s-ipam), the default defines a single IP Pool named `"rdma-pool"`|
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: JSON shouble be coverd in double quotes

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you mean ?

something like:
"{\"foo\": \"bar\"}"

dont u think it will hinder readability ?

also the value above is what we put in values.yaml as default there are no double quotes there.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean:
"{"pools": {"rdma-pool": {"subnet": "192.168.0.0/16", "perNodeBlockSize": 100, "gateway": "192.168.0.1"}}}"

also the value above is what we put in values.yaml as default there are no double quotes there.

that's because we use pipe symbol | thare which means multi-line string

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack, ill update

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated.

@@ -11,6 +11,7 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
{{ if .CrSpec.Config -}}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please, add deletion of this comfigmap logic to NicClusterPolicy controller for a consistancy: if CRD is changed and config is deleted we have to delete config map

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a general mechanims at state level where resources are being deleted if needed.
do you really think we need a finer grained mechanism ?

if user sets config to nil, its expected that the next steps for him is to re-create the config map with the given name.
so it will be deleted by the user.

thinking about it, NOT deleting it may be actually better, because then the user may know the name of the config map it needs to re-create :).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a general mechanims at state level where resources are being deleted if needed. do you really think we need a finer grained mechanism ?
This

if cr.Spec.SecondaryNetwork == nil || cr.Spec.SecondaryNetwork.Multus == nil {
won't delete ConfigMap if Multus should be deployed

if user sets config to nil, its expected that the next steps for him is to re-create the config map with the given name. so it will be deleted by the user.

thinking about it, NOT deleting it may be actually better, because then the user may know the name of the config map it needs to re-create :).

I don't understans this use-case when user deletes operator-generated ConfigMap and creates it's own. My assumption is that all configuration should be done via NicClusterPolicy

Copy link
Collaborator Author

@adrianchiris adrianchiris Jun 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understans this use-case when user deletes operator-generated ConfigMap and creates it's own. My assumption is that all configuration should be done via NicClusterPolicy

the use-case is having configuration provided out of band. this was an internal request.
where:

  1. user deploys network operator and provides nicclusterpolicy with nil config for e.g nv-ipam
  2. at a later step when network provisioning occurs the user defines the needed config map for nv-ipam

this was the ask, in contrast to changing nicclusterpolicy at a later step which will probably mess up argo CD or smthn,

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the use-case is having configuration provided out of band. this was an internal request.

in this case, please add a note about this feature in a README and this will be good to merge

Copy link
Collaborator Author

@adrianchiris adrianchiris Jun 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

won't delete ConfigMap if Multus should be deployed

Yes, and i think we dont need a finer grained mechansim to handle it (for now...)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a section in README.

Check [Upgrade section in Helm Chart documentation](deployment/network-operator/README.md#upgrade) for details.

## Externally Provided Configurations For Network Operator Sub-Components
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@e0ne PTAL

we can either keep this section here or move it to its own document under /docs/
as its an advanced use case.

@adrianchiris adrianchiris force-pushed the support-optional-config branch 2 times, most recently from 4496082 to 2cf48d5 Compare July 2, 2023 12:49
Copy link
Collaborator

@e0ne e0ne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. thanks for addressing my comments. It could be merged after rebase and CI passed

- Consolidate image spec with config to a
  common struct and use where applicable
  (device plugins, multus, nv-ipam)
- set Config field as optional. use nil value
  to determine if unset.

Signed-off-by: adrianc <adrianc@nvidia.com>
Run: `make generate && make manifests && make release-build`

Signed-off-by: adrianc <adrianc@nvidia.com>
This is required to allow providing configurations
for device plugins (RDMA, SR-IOV), nv-ipam out of band.

That is, the user creates it prior to deploying
these components with network operator. or alternatively
restarting the relevant daemonset after config map
creation.

note that for multus, not providing configuration
for configMap, will lead to multus using "auto"
configuration same as it does today.

Signed-off-by: adrianc <adrianc@nvidia.com>
Signed-off-by: adrianc <adrianc@nvidia.com>
Signed-off-by: adrianc <adrianc@nvidia.com>
Signed-off-by: adrianc <adrianc@nvidia.com>
define a single pool to be used for rdma networks
to serve as default configuration.

Signed-off-by: adrianc <adrianc@nvidia.com>
update README to reflect the new default for nvIpam config

Signed-off-by: adrianc <adrianc@nvidia.com>
- Update project README
- Update Helm README, adding quotes to default configuration
  of nv-ipam
- Update comment in CRD

Signed-off-by: adrianc <adrianc@nvidia.com>
@adrianchiris
Copy link
Collaborator Author

@e0ne PTAL

Copy link
Collaborator

@e0ne e0ne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for addressing my comments

@adrianchiris
Copy link
Collaborator Author

adrianchiris commented Jul 4, 2023

@e0ne Feel free to merge this one when ready.

@e0ne e0ne merged commit 6c21718 into Mellanox:master Jul 4, 2023
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants