Exposing Hardware Topology through CRDs in Node feature discovery #333

Closed
swatisehgal opened this issue Aug 10, 2020 · 31 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@swatisehgal
Contributor

swatisehgal commented Aug 10, 2020

Currently, Node Feature Discovery consists of nfd-master and nfd-worker. The former is responsible for labeling Kubernetes node objects, and the latter is responsible for detecting features and communicating them to nfd-master. nfd-worker runs as a DaemonSet on all nodes in the cluster.

Resource Topology Exporter (KEP, Code) can fit nicely into the NFD architecture by:

  • enabling nfd-worker to gather node resource topology information
  • passing this information to nfd-master over gRPC
  • having nfd-master create CRD instances for all the worker nodes in the cluster

A design document explaining the approach and the proposed changes in detail is provided here.

Possible implementation approaches for introducing Resource Topology Exporter into the NFD operand are summarized below:

  1. Option 1: Introduce Resource Topology Exporter as a new source or a new command-line option in nfd-worker. The new source (or command-line option) is where nfd-worker would gather hardware topology information; nfd-master would then update the NumaTopology CRD.

  2. Option 2 (preferred): Introduce a new helper daemon, e.g. nfd-node-topology (similar to nfd-worker).

  • The new helper gathers hardware topology information and sends it to nfd-master over gRPC through another endpoint.

  • The communication between nfd-node-topology and nfd-master goes over a separate endpoint that carries hardware topology information but can easily be extended to other CRDs.

  • nfd-master would update the NumaTopology CRD using the NodeTopologyRequest it receives from nfd-node-topology.

  • NOTE: the NFD operator would have to be enhanced to manage nfd-node-topology (as it manages nfd-master and nfd-worker).

  • The advantage of this approach is that nfd-worker and nfd-master continue to work the same way as they currently do.

Changes to the gRPC interface: introducing another gRPC endpoint for communication between nfd-worker and nfd-master (a completely separate proto, as shown below).

syntax = "proto3";

option go_package = "sigs.k8s.io/node-feature-discovery/pkg/topologyupdater";
import "github.com/k8stopologyawareschedwg/noderesourcetopology-api/pkg/apis/topology/v1alpha1/generated.proto";

package topologyupdater;

// NodeTopology is the endpoint the topology updater uses to report
// per-node hardware topology to nfd-master.
service NodeTopology {
    rpc UpdateNodeTopology(NodeTopologyRequest) returns (NodeTopologyResponse);
}

message NodeTopologyRequest {
    string nfd_version = 1;                // version of the reporting NFD instance
    string node_name = 2;                  // node the topology was gathered on
    repeated string topology_policies = 3; // kubelet topology manager policy in effect
    repeated v1alpha1.Zone zones = 4;      // per-zone (e.g. per-NUMA-node) resources
}

message NodeTopologyResponse {
}
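
As an illustration (not part of the proposal), below is a minimal sketch of how a topology updater daemon could call this endpoint through the Go code that protoc would generate from the proto above; the nfd-master address, the pb import alias, and the example field values are assumptions.

package main

import (
	"context"
	"log"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"

	// Import path taken from the go_package option above; the generated
	// names (NewNodeTopologyClient, NodeTopologyRequest, ...) follow
	// standard protoc-gen-go conventions.
	pb "sigs.k8s.io/node-feature-discovery/pkg/topologyupdater"
)

func main() {
	// The nfd-master address is an assumption; in practice it would come from flags/config.
	conn, err := grpc.Dial("nfd-master:8080", grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		log.Fatalf("failed to connect to nfd-master: %v", err)
	}
	defer conn.Close()

	client := pb.NewNodeTopologyClient(conn)

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	// Zones would be populated from the v1alpha1 NodeResourceTopology API types;
	// they are left empty here to keep the sketch self-contained.
	req := &pb.NodeTopologyRequest{
		NfdVersion:       "v0.9.0-devel",
		NodeName:         "worker-0",
		TopologyPolicies: []string{"single-numa-node"},
	}

	if _, err := client.UpdateNodeTopology(ctx, req); err != nil {
		log.Fatalf("UpdateNodeTopology failed: %v", err)
	}
	log.Println("node topology reported to nfd-master")
}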

Additionally, we propose that NFD become the home of the NodeResourceTopology CRD API definition (and the associated informer, clientset, handlers, etc.).
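
For reference, here is a rough sketch of the Go types such a CRD API definition would contain, loosely modeled on the v1alpha1 noderesourcetopology-api referenced above; field names and types are approximate and may differ from the canonical definition.

package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// NodeResourceTopology describes the topology zones of a node and the
// resources available in each zone. One instance is created per worker node.
// NOTE: approximate sketch; see the canonical noderesourcetopology-api for
// the authoritative definition.
type NodeResourceTopology struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	// TopologyPolicies reports the kubelet topology manager policy in effect.
	TopologyPolicies []string `json:"topologyPolicies"`
	// Zones lists the topology zones (e.g. NUMA nodes) and their resources.
	Zones []Zone `json:"zones"`
}

// Zone is a single topology zone, typically one NUMA node.
type Zone struct {
	Name      string         `json:"name"`
	Type      string         `json:"type"`
	Costs     []CostInfo     `json:"costs,omitempty"`     // e.g. NUMA distances to other zones
	Resources []ResourceInfo `json:"resources,omitempty"` // per-zone resource accounting
}

// CostInfo expresses the relative cost (distance) to reach another zone.
type CostInfo struct {
	Name  string `json:"name"`
	Value int64  `json:"value"`
}

// ResourceInfo reports capacity and allocatable amounts of one resource in a zone.
type ResourceInfo struct {
	Name        string `json:"name"`
	Capacity    string `json:"capacity"`
	Allocatable string `json:"allocatable"`
}

// NodeResourceTopologyList is the list type for NodeResourceTopology.
type NodeResourceTopologyList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []NodeResourceTopology `json:"items"`
}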

@swatisehgal swatisehgal changed the title Exposing Numa node Topology through CRDs in Node feature discovery Exposing Hardware Topology through CRDs in Node feature discovery Aug 10, 2020
@marquiz
Contributor

marquiz commented Aug 12, 2020

I'd suggest using a more generic name for the package, "cr-updater" or something.

Regarding the data/CRD, I was pondering whether we should prepare for more detailed information, e.g. bake in, or make it possible to present, NUMA distances(?)

@swatisehgal
Contributor Author

swatisehgal commented Aug 13, 2020

@marquiz Sure, we can make the package name more generic.
Can you elaborate on how you see information related to NUMA distances being exposed here?
In order to gather resource information out of tree, we are currently relying on the podResources API. As our only source of truth for obtaining device information is the (device manager in the) kubelet, we have device information (health, device IDs and NUMA node as part of TopologyInfo) but need to extend the podResources API (as proposed here: kubernetes/enhancements#1884). NUMA distances cannot be obtained from the kubelet, as they are currently not supported there.

As per the design of the topology manager, the CPU manager and device manager implement the hint provider interface and provide NUMA node IDs as hints to the topology manager.
We are open to suggestions here, but as a starting point we could expose the constructs in terms of NUMA node IDs, as that aligns with the current design of the topology manager and, more importantly, is the only information at our disposal right now. We would be more than happy to update this later, when topology information in the topology manager becomes more descriptive: the API itself would evolve and allow us to expose that information. Let me know your thoughts.
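
(For illustration, below is a minimal sketch of how an out-of-tree agent can read the device-to-NUMA assignment from the kubelet podResources API. It assumes the v1 PodResourcesLister API, which carries TopologyInfo on allocated devices, and the default kubelet socket path; these names and paths are assumptions, not part of the proposal.)

package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"

	// v1 podresources API; the TopologyInfo field on devices is what makes
	// per-NUMA accounting possible (assumption: the v1 API is available).
	podresourcesapi "k8s.io/kubelet/pkg/apis/podresources/v1"
)

const kubeletSocket = "unix:///var/lib/kubelet/pod-resources/kubelet.sock"

func main() {
	conn, err := grpc.Dial(kubeletSocket, grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		log.Fatalf("cannot connect to kubelet podresources socket: %v", err)
	}
	defer conn.Close()

	client := podresourcesapi.NewPodResourcesListerClient(conn)

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	resp, err := client.List(ctx, &podresourcesapi.ListPodResourcesRequest{})
	if err != nil {
		log.Fatalf("List() failed: %v", err)
	}

	// Walk the response and print which NUMA node each allocated device sits on.
	for _, pod := range resp.GetPodResources() {
		for _, cnt := range pod.GetContainers() {
			for _, dev := range cnt.GetDevices() {
				for _, numaNode := range dev.GetTopology().GetNodes() {
					fmt.Printf("%s/%s: %s %v -> NUMA node %d\n",
						pod.GetNamespace(), pod.GetName(),
						dev.GetResourceName(), dev.GetDeviceIds(), numaNode.GetID())
				}
			}
		}
	}
}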

@ffromani
Contributor

A way we could expose the NUMA distances would probably be something like:

message NUMANodeResource {
    int32 numa_id = 1;                 // NUMA cell ID
    map<string, string> resources = 2; // resource name -> quantity
    map<int32, int32> distances = 3;   // peer NUMA node ID -> distance
}

That said, I second what @swatisehgal wrote: we were using the podresources API (and the extensions we were proposing here) as the source of truth. However, if there is specific interest in the NUMA distances, we can probably learn them reliably from sysfs, much like numactl does. That should be pretty safe to reconcile, though it could require access to more system resources.
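
As a sketch of the sysfs approach mentioned above (the same interface numactl reads): each NUMA node exposes its distance vector in /sys/devices/system/node/node<N>/distance, and the helper below parses it. The function name is just for illustration.

package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// readNUMADistances returns, for each NUMA node, the distance vector to all
// nodes as reported by the kernel (the same data numactl --hardware prints).
func readNUMADistances() (map[int][]int, error) {
	nodes, err := filepath.Glob("/sys/devices/system/node/node[0-9]*")
	if err != nil {
		return nil, err
	}

	distances := make(map[int][]int)
	for _, nodeDir := range nodes {
		id, err := strconv.Atoi(strings.TrimPrefix(filepath.Base(nodeDir), "node"))
		if err != nil {
			return nil, err
		}

		data, err := os.ReadFile(filepath.Join(nodeDir, "distance"))
		if err != nil {
			return nil, err
		}

		// The file contains one space-separated row of the distance matrix,
		// e.g. "10 21" on a two-node system.
		var row []int
		for _, field := range strings.Fields(string(data)) {
			d, err := strconv.Atoi(field)
			if err != nil {
				return nil, err
			}
			row = append(row, d)
		}
		distances[id] = row
	}
	return distances, nil
}

func main() {
	dist, err := readNUMADistances()
	if err != nil {
		fmt.Fprintln(os.Stderr, "reading NUMA distances:", err)
		os.Exit(1)
	}
	for id, row := range dist {
		fmt.Printf("node%d: %v\n", id, row)
	}
}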

@swatisehgal
Contributor Author

swatisehgal commented Aug 13, 2020

Thanks @fromanirh for the code snippet. Getting the NUMA node distances from sysfs is totally doable, and populating them as part of NUMANodeResource makes sense! In my previous comment I was considering NUMA node distances from the point of view of resources and their distance from each other.

@marquiz
Contributor

marquiz commented Aug 13, 2020

I don't have the answers myself either, just throwing out some ideas 😊 I was trying to think a bit ahead and more broadly, and not "cement" this too tightly around this one use case and scheduler extension. When we add something, we should at least try to think about other possible users and usages too, e.g. other scheduler extensions and possibly alternative RTE daemons digging up more detailed data. I'm just trying to avoid the situation where we quickly patch together the API for one narrow use case and, after a while, we're stuck with it and have problems serving new users. This case is simple, but still. Maybe we have something to learn from how the Linux kernel handles APIs 😄

@swatisehgal
Contributor Author

@marquiz Thanks for the input. I see your point and have updated the design document to include NUMA node distances as part of the NumaNodeResource structure.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 16, 2020
@swatisehgal
Contributor Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 18, 2020
@swatisehgal
Contributor Author

Development work is in progress to support this feature. PR will be linked here soon!

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 16, 2021
@zvonkok
Contributor

zvonkok commented Feb 16, 2021

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 16, 2021
@swatisehgal
Contributor Author

swatisehgal commented May 13, 2021

No. | Task Break Down | PR | Owner | Status
1 | NFD Topology Updater Implementation | #525 | @swatisehgal | Merged
2 | NFD Topology Updater Documentation | #526 | @swatisehgal | Merged
3 | NFD Topology Updater E2E tests | #528 | @fromanirh |
4 | Enable Configz Endpoint for obtaining Kubelet config | #530 | @Tal-or |
5 | Allow excluding of specific resource accounting from NRT's objects | #545 | @Tal-or |
6 | NFD Topology Updater: Add memory accounting in NRT | #593 | @cynepco3hahue | Merged
7 | Simplify accounting logic in NFD-topology Updater | TBD | @fromanirh |
8 | NFD-topology Updater Helm deployment if topologyupdater explicitly enabled | #623 | @zwpaper | Merged
9 | Event based NFD-topology Updater: Ensure that CRs are updated on every deviceplugin and/or pod creation/deletion event as opposed to the current timer based approach | TBD | TBD |

@ffromani
Contributor

Expect the two missing PRs by Monday.

@ffromani
Contributor

ffromani commented May 18, 2021

No. | Task Break Down | PR | Owner
1 | NFD Topology Updater Implementation | #525 | @swatisehgal
2 | NFD Topology Updater Documentation | #526 | @swatisehgal
3 | NFD Topology Updater E2E tests | #528 | @fromanirh
4 | Enable Configz Endpoint for obtaining Kubelet config | #530 | @Tal-or

@ArangoGutierrez
Contributor

/label kind/feature

@k8s-ci-robot
Contributor

@ArangoGutierrez: The label(s) /label kind/feature cannot be applied. These labels are supported: api-review, tide/merge-method-merge, tide/merge-method-rebase, tide/merge-method-squash, team/katacoda

In response to this:

/label kind/feature

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 7, 2022
@ffromani
Contributor

ffromani commented Feb 7, 2022

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 7, 2022
@ArangoGutierrez
Contributor

Hey all, any updates on this feature set?

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 21, 2022
@marquiz
Contributor

marquiz commented Jul 8, 2022

There has been progress lately on the e2e tests and the /configz endpoint.

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 8, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 6, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Nov 5, 2022
@marquiz
Contributor

marquiz commented Nov 8, 2022

This is still alive and being tracked.
/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Nov 8, 2022
@marquiz
Contributor

marquiz commented Jan 20, 2023

@ffromani @Tal-or @swatisehgal are we expecting new PRs here or should we close this issue? We could also track the missing pieces as separate issues

@marquiz
Contributor

marquiz commented Mar 3, 2023

@ffromani @PiotrProkop @Tal-or @swatisehgal should we close this issue as implemented and track any future enhancements in separate issues?

@ffromani
Contributor

ffromani commented Mar 3, 2023

@ffromani @PiotrProkop @Tal-or @swatisehgal should we close this issue as implemented and track any future enhancements in separate issues?

From my perspective, yes: we're now in pretty good shape and we can track future enhancements with separate issues.

@swatisehgal
Contributor Author

@ffromani @PiotrProkop @Tal-or @swatisehgal should we close this issue as implemented and track any future enhancements in separate issues?

+1. Nice to see that we have come a long way :) Let's close this issue. Thanks!

@marquiz
Contributor

marquiz commented Mar 3, 2023

Closing as suggested. Thanks to everybody involved for working on this 🎉

/close

@k8s-ci-robot
Contributor

@marquiz: Closing this issue.

In response to this:

Closing as suggested. Thanks to everybody involved for working on this 🎉

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
