Exposing Hardware Topology through CRDs in Node feature discovery #333

Closed
swatisehgal opened this issue Aug 10, 2020 · 31 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@swatisehgal
Contributor

swatisehgal commented Aug 10, 2020

Currently, Node Feature Discovery consists of nfd-master and nfd-worker. The former is responsible for labeling Kubernetes node objects, and the latter is responsible for detecting features and communicating them to nfd-master. nfd-worker runs as a DaemonSet on all nodes in the cluster.

Resource Topology Exporter (KEP, Code) can fit nicely into the NFD architecture by:

  • enabling nfd-worker to gather node resource topology information
  • passing this information to nfd-master over gRPC
  • having nfd-master create CRD instances for all the worker nodes in the cluster

A design document explaining the approach and the proposed changes in detail is provided here.

Possible implementation approaches for introducing Resource Topology Exporter into the NFD operand are summarized below:

  1. Option 1: Introduce Resource Topology Exporter as a new source or a new command-line option in nfd-worker. The new source (or command-line option) is where nfd-worker would gather hardware topology information; nfd-master would then update the NumaTopology CRD.

  2. Option 2 (preferred): Introduce a new helper daemon, e.g. nfd-node-topology (similar to nfd-worker).

  • The new helper gathers hardware topology information and sends it to nfd-master over gRPC through another endpoint.

  • The communication between nfd-node-topology and nfd-master goes over a separate endpoint that carries hardware topology information but can easily be extended to other CRDs.

  • nfd-master would update the NumaTopology CRD using the NodeTopologyRequest it receives from nfd-node-topology.

  • NOTE: the NFD operator would have to be enhanced to manage nfd-node-topology (as it manages nfd-master and nfd-worker).

  • The advantage of this approach is that nfd-worker and nfd-master continue to work the same way as they currently do.

Changes to the gRPC interface: introducing another gRPC endpoint for communication between nfd-worker and nfd-master (a completely separate proto, as shown below).

syntax = "proto3";

option go_package = "sigs.k8s.io/node-feature-discovery/pkg/topologyupdater";
import "github.com/k8stopologyawareschedwg/noderesourcetopology-api/pkg/apis/topology/v1alpha1/generated.proto";

package topologyupdater;

// NodeTopology is the endpoint the topology updater uses to report
// per-node hardware topology to nfd-master.
service NodeTopology {
    rpc UpdateNodeTopology(NodeTopologyRequest) returns (NodeTopologyResponse);
}

message NodeTopologyRequest {
    string nfd_version = 1;                // version of the reporting NFD instance
    string node_name = 2;                  // node the topology was gathered on
    repeated string topology_policies = 3; // kubelet topology manager policy in effect
    repeated v1alpha1.Zone zones = 4;      // per-zone (e.g. per-NUMA-node) resources
}

message NodeTopologyResponse {
}
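
As an illustration (not part of the proposal), below is a minimal sketch of how a topology updater daemon could call this endpoint through the Go code that protoc would generate from the proto above; the nfd-master address, the pb import alias, and the example field values are assumptions.

package main

import (
	"context"
	"log"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"

	// Import path taken from the go_package option above; the generated
	// names (NewNodeTopologyClient, NodeTopologyRequest, ...) follow
	// standard protoc-gen-go conventions.
	pb "sigs.k8s.io/node-feature-discovery/pkg/topologyupdater"
)

func main() {
	// The nfd-master address is an assumption; in practice it would come from flags/config.
	conn, err := grpc.Dial("nfd-master:8080", grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		log.Fatalf("failed to connect to nfd-master: %v", err)
	}
	defer conn.Close()

	client := pb.NewNodeTopologyClient(conn)

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	// Zones would be populated from the v1alpha1 NodeResourceTopology API types;
	// they are left empty here to keep the sketch self-contained.
	req := &pb.NodeTopologyRequest{
		NfdVersion:       "v0.9.0-devel",
		NodeName:         "worker-0",
		TopologyPolicies: []string{"single-numa-node"},
	}

	if _, err := client.UpdateNodeTopology(ctx, req); err != nil {
		log.Fatalf("UpdateNodeTopology failed: %v", err)
	}
	log.Println("node topology reported to nfd-master")
}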

Additionally, we propose that NFD become the home of the NodeResourceTopology CRD API definition (and the associated informer, clientset, handlers, etc.).
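
For reference, here is a rough sketch of the Go types such a CRD API definition would contain, loosely modeled on the v1alpha1 noderesourcetopology-api referenced above; field names and types are approximate and may differ from the canonical definition.

package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// NodeResourceTopology describes the topology zones of a node and the
// resources available in each zone. One instance is created per worker node.
// NOTE: approximate sketch; see the canonical noderesourcetopology-api for
// the authoritative definition.
type NodeResourceTopology struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	// TopologyPolicies reports the kubelet topology manager policy in effect.
	TopologyPolicies []string `json:"topologyPolicies"`
	// Zones lists the topology zones (e.g. NUMA nodes) and their resources.
	Zones []Zone `json:"zones"`
}

// Zone is a single topology zone, typically one NUMA node.
type Zone struct {
	Name      string         `json:"name"`
	Type      string         `json:"type"`
	Costs     []CostInfo     `json:"costs,omitempty"`     // e.g. NUMA distances to other zones
	Resources []ResourceInfo `json:"resources,omitempty"` // per-zone resource accounting
}

// CostInfo expresses the relative cost (distance) to reach another zone.
type CostInfo struct {
	Name  string `json:"name"`
	Value int64  `json:"value"`
}

// ResourceInfo reports capacity and allocatable amounts of one resource in a zone.
type ResourceInfo struct {
	Name        string `json:"name"`
	Capacity    string `json:"capacity"`
	Allocatable string `json:"allocatable"`
}

// NodeResourceTopologyList is the list type for NodeResourceTopology.
type NodeResourceTopologyList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []NodeResourceTopology `json:"items"`
}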

@swatisehgal swatisehgal changed the title Exposing Numa node Topology through CRDs in Node feature discovery Exposing Hardware Topology through CRDs in Node feature discovery Aug 10, 2020
@marquiz
Contributor

marquiz commented Aug 12, 2020

I'd suggest using a more generic name for the package, "cr-updater" or something.

Regarding the data/CRD, I was pondering whether we should prepare for more detailed information, e.g. bake in, or make it possible to present, NUMA distances(?)

@swatisehgal
Contributor Author

swatisehgal commented Aug 13, 2020

@marquiz Sure, we can make the package name more generic.
Can you elaborate on how you see information related to NUMA distances being exposed here?
In order to gather resource information out of tree, we are currently relying on the podResources API. As our only source of truth for obtaining device information is the (device manager in the) kubelet, we have device information (health, device IDs and NUMA node as part of TopologyInfo) but need to extend the podResources API (as proposed here: kubernetes/enhancements#1884). NUMA distances cannot be obtained from the kubelet, as they are currently not supported there.

As per the design of the topology manager, the CPU manager and device manager implement the hint provider interface and provide NUMA node IDs as hints to the topology manager.
We are open to suggestions here, but as a starting point we could expose the constructs in terms of NUMA node IDs, as that aligns with the current design of the topology manager and, more importantly, is the only information at our disposal right now. We would be more than happy to update this later, when topology information in the topology manager becomes more descriptive: the API itself would evolve and allow us to expose that information. Let me know your thoughts.
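
(For illustration, below is a minimal sketch of how an out-of-tree agent can read the device-to-NUMA assignment from the kubelet podResources API. It assumes the v1 PodResourcesLister API, which carries TopologyInfo on allocated devices, and the default kubelet socket path; these names and paths are assumptions, not part of the proposal.)

package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"

	// v1 podresources API; the TopologyInfo field on devices is what makes
	// per-NUMA accounting possible (assumption: the v1 API is available).
	podresourcesapi "k8s.io/kubelet/pkg/apis/podresources/v1"
)

const kubeletSocket = "unix:///var/lib/kubelet/pod-resources/kubelet.sock"

func main() {
	conn, err := grpc.Dial(kubeletSocket, grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		log.Fatalf("cannot connect to kubelet podresources socket: %v", err)
	}
	defer conn.Close()

	client := podresourcesapi.NewPodResourcesListerClient(conn)

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	resp, err := client.List(ctx, &podresourcesapi.ListPodResourcesRequest{})
	if err != nil {
		log.Fatalf("List() failed: %v", err)
	}

	// Walk the response and print which NUMA node each allocated device sits on.
	for _, pod := range resp.GetPodResources() {
		for _, cnt := range pod.GetContainers() {
			for _, dev := range cnt.GetDevices() {
				for _, numaNode := range dev.GetTopology().GetNodes() {
					fmt.Printf("%s/%s: %s %v -> NUMA node %d\n",
						pod.GetNamespace(), pod.GetName(),
						dev.GetResourceName(), dev.GetDeviceIds(), numaNode.GetID())
				}
			}
		}
	}
}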

@ffromani
Contributor

A way we could expose the NUMA distances would probably be something like:

message NUMANodeResource {
    int32 numa_id = 1;                 // NUMA cell ID
    map<string, string> resources = 2; // resource name -> quantity
    map<int32, int32> distances = 3;   // peer NUMA node ID -> distance
}

That said, I second what @swatisehgal wrote: we were using the podresources API (and the extensions we were proposing here) as the source of truth. However, if there is specific interest in the NUMA distances, we can probably learn them reliably from sysfs, much like numactl does. That should be pretty safe to reconcile, though it could require access to more system resources.
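
As a sketch of the sysfs approach mentioned above (the same interface numactl reads): each NUMA node exposes its distance vector in /sys/devices/system/node/node<N>/distance, and the helper below parses it. The function name is just for illustration.

package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// readNUMADistances returns, for each NUMA node, the distance vector to all
// nodes as reported by the kernel (the same data numactl --hardware prints).
func readNUMADistances() (map[int][]int, error) {
	nodes, err := filepath.Glob("/sys/devices/system/node/node[0-9]*")
	if err != nil {
		return nil, err
	}

	distances := make(map[int][]int)
	for _, nodeDir := range nodes {
		id, err := strconv.Atoi(strings.TrimPrefix(filepath.Base(nodeDir), "node"))
		if err != nil {
			return nil, err
		}

		data, err := os.ReadFile(filepath.Join(nodeDir, "distance"))
		if err != nil {
			return nil, err
		}

		// The file contains one space-separated row of the distance matrix,
		// e.g. "10 21" on a two-node system.
		var row []int
		for _, field := range strings.Fields(string(data)) {
			d, err := strconv.Atoi(field)
			if err != nil {
				return nil, err
			}
			row = append(row, d)
		}
		distances[id] = row
	}
	return distances, nil
}

func main() {
	dist, err := readNUMADistances()
	if err != nil {
		fmt.Fprintln(os.Stderr, "reading NUMA distances:", err)
		os.Exit(1)
	}
	for id, row := range dist {
		fmt.Printf("node%d: %v\n", id, row)
	}
}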

@swatisehgal
Contributor Author

swatisehgal commented Aug 13, 2020

Thanks @fromanirh for the code snippet. Getting the NUMA node distances from sysfs is totally doable, and populating them as part of NUMANodeResource makes sense! In my previous comment I was considering NUMA node distances from the point of view of resources and their distance from each other.

@marquiz
Contributor

marquiz commented Aug 13, 2020

I don't have the answers myself either, just throwing out some ideas 😊 I was trying to think a bit ahead and more broadly, and not "cement" this too tightly around this one use case and scheduler extension. When we add something, we should at least try to think about other possible users and usages too, e.g. other scheduler extensions and possibly alternative RTE daemons digging up more detailed data. I'm just trying to avoid the situation where we quickly patch together the API for one narrow use case and, after a while, we're stuck with it and have problems serving new users. This case is simple, but still. Maybe we have something to learn from how the Linux kernel handles APIs 😄

@swatisehgal
Contributor Author

@marquiz Thanks for the input. I see your point and have updated the design document to include NUMA node distances as part of the NumaNodeResource structure.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 16, 2020
@swatisehgal
Contributor Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 18, 2020
@swatisehgal
Contributor Author

Development work is in progress to support this feature. PR will be linked here soon!

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 16, 2021
@zvonkok
Contributor

zvonkok commented Feb 16, 2021

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 16, 2021
@swatisehgal
Contributor Author

swatisehgal commented May 13, 2021

No. | Task Break Down | PR | Owner | Status
1 | NFD Topology Updater Implementation | #525 | @swatisehgal | Merged
2 | NFD Topology Updater Documentation | #526 | @swatisehgal | Merged
3 | NFD Topology Updater E2E tests | #528 | @fromanirh |
4 | Enable Configz Endpoint for obtaining Kubelet config | #530 | @Tal-or |
5 | Allow excluding of specific resource accounting from NRT's objects | #545 | @Tal-or |
6 | NFD Topology Updater: Add memory accounting in NRT | #593 | @cynepco3hahue | Merged
7 | Simplify accounting logic in NFD-topology Updater | TBD | @fromanirh |
8 | NFD-topology Updater Helm deployment if topologyupdater explicitly enabled | #623 | @zwpaper | Merged
9 | Event based NFD-topology Updater: Ensure that CRs are updated on every deviceplugin and/or pod creation/deletion event as opposed to the current timer based approach | TBD | TBD |

@ffromani
Contributor

Expect the two missing PRs by Monday.

@ffromani
Contributor

ffromani commented May 18, 2021

No. | Task Break Down | PR | Owner
1 | NFD Topology Updater Implementation | #525 | @swatisehgal
2 | NFD Topology Updater Documentation | #526 | @swatisehgal
3 | NFD Topology Updater E2E tests | #528 | @fromanirh
4 | Enable Configz Endpoint for obtaining Kubelet config | #530 | @Tal-or

@ArangoGutierrez
Contributor

/label kind/feature

@k8s-ci-robot
Contributor

@ArangoGutierrez: The label(s) /label kind/feature cannot be applied. These labels are supported: api-review, tide/merge-method-merge, tide/merge-method-rebase, tide/merge-method-squash, team/katacoda

In response to this:

/label kind/feature

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 7, 2022
@ffromani
Contributor

ffromani commented Feb 7, 2022

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 7, 2022
@ArangoGutierrez
Contributor

Hey all, any updates on this feature set?

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 21, 2022
@marquiz
Contributor

marquiz commented Jul 8, 2022

There has been progress lately on the e2e tests and the /configz endpoint.

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 8, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 6, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Nov 5, 2022
@marquiz
Contributor

marquiz commented Nov 8, 2022

This is still alive and being tracked.
/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Nov 8, 2022
@marquiz
Contributor

marquiz commented Jan 20, 2023

@ffromani @Tal-or @swatisehgal are we expecting new PRs here or should we close this issue? We could also track the missing pieces as separate issues

@marquiz
Contributor

marquiz commented Mar 3, 2023

@ffromani @PiotrProkop @Tal-or @swatisehgal should we close this issue as implemented and track any future enhancements in separate issues?

@ffromani
Contributor

ffromani commented Mar 3, 2023

@ffromani @PiotrProkop @Tal-or @swatisehgal should we close this issue as implemented and track any future enhancements in separate issues?

From my perspective, yes: we're now in pretty good shape and we can track future enhancements with separate issues.

@swatisehgal
Contributor Author

@ffromani @PiotrProkop @Tal-or @swatisehgal should we close this issue as implemented and track any future enhancements in separate issues?

+1. Nice to see that we have come a long way :) Let's close this issue. Thanks!

@marquiz
Contributor

marquiz commented Mar 3, 2023

Closing as suggested. Thanks to everybody involved for working on this 🎉

/close

@k8s-ci-robot
Contributor

@marquiz: Closing this issue.

In response to this:

Closing as suggested. Thanks to everybody involved for working on this 🎉

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
