[lfx-mentorship-2022-summer]Cluster Resource modeling #772

Closed
RainbowMango opened this issue Sep 28, 2021 · 15 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@RainbowMango
Member

RainbowMango commented Sep 28, 2021

What would you like to be added:
We don't want to collect and store each node's resources in detail (that would be a burden for Karmada to maintain), but we want to build a resource model for each cluster, something like:

resourceModel:
  - grade:
      cpu: "1"
      memory: 2Gi
    count: 10
  - grade:
      cpu: "2"
      memory: 4Gi
    count: 6
  - grade:
      cpu: "4"
      memory: 8Gi
    count: 2
  - grade:
      cpu: "8"
      memory: 16Gi
    count: 1
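
One possible reading of the model above, sketched in Go for illustration only. It assumes each grade is a minimum (cpu, memory) threshold and that a node is counted in the highest grade it satisfies; the issue does not define these semantics. `Resources`, `Grade`, and `buildModel` are hypothetical names, not Karmada APIs, and plain integers stand in for Kubernetes quantities.

```go
package main

import "fmt"

// Resources is a hypothetical, simplified stand-in for a resource list.
type Resources struct {
	MilliCPU  int64 // free CPU in millicores
	MemoryMiB int64 // free memory in MiB
}

// Grade pairs a resource threshold with the number of nodes counted in it.
type Grade struct {
	Min   Resources
	Count int
}

// buildModel counts each node in the highest grade whose thresholds its free
// resources satisfy. thresholds must be sorted in ascending order. Nodes that
// satisfy no grade are not counted at all.
func buildModel(thresholds []Resources, nodes []Resources) []Grade {
	model := make([]Grade, len(thresholds))
	for i, t := range thresholds {
		model[i].Min = t
	}
	for _, n := range nodes {
		best := -1
		for i, t := range thresholds {
			if n.MilliCPU >= t.MilliCPU && n.MemoryMiB >= t.MemoryMiB {
				best = i
			}
		}
		if best >= 0 {
			model[best].Count++
		}
	}
	return model
}

func main() {
	// Thresholds mirror the grades in the example: 1/2/4/8 cores, 2/4/8/16 Gi.
	thresholds := []Resources{{1000, 2048}, {2000, 4096}, {4000, 8192}, {8000, 16384}}
	// Made-up free resources of five nodes, for illustration only.
	nodes := []Resources{{1500, 3000}, {1000, 2048}, {2500, 4096}, {9000, 32768}, {500, 1024}}
	for _, g := range buildModel(thresholds, nodes) {
		fmt.Printf("cpu=%dm memory=%dMi count=%d\n", g.Min.MilliCPU, g.Min.MemoryMiB, g.Count)
	}
}
```

Under this reading, the model stays small (one entry per grade) no matter how many nodes the cluster has, which is the point of not storing per-node details.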

Why is this needed:
During scheduling, the karmada-scheduler makes decisions based on a number of factors, one of which is the resource details of the cluster.

We introduced ResourceSummary to the Cluster API. For example:

  resourceSummary:
    allocatable:
      cpu: "4"
      ephemeral-storage: 206291924Ki
      hugepages-1Gi: "0"
      hugepages-2Mi: "0"
      memory: 16265856Ki
      pods: "110"
    allocated:
      cpu: 950m
      memory: 290Mi
      pods: "11"

But the ResourceSummary is not precise enough: it mechanically sums the resources on all nodes and ignores resource fragmentation. (For example, in a cluster with 2000 nodes and 1 CPU core left on each node, the ResourceSummary reports 2000 CPU cores left for the cluster. That is misleading, because a pod requesting 2 cores cannot fit on any single node.)
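
The fragmentation problem can be illustrated with a small Go sketch. It is not Karmada code: `estimateBySummary` and `estimateByNode` are hypothetical helpers, only CPU is considered, and the 2-core request is an assumed workload.

```go
package main

import "fmt"

// estimateBySummary divides the cluster-wide free CPU by the per-replica
// request, which is what a summed-up resource summary allows.
func estimateBySummary(freeMilliCPU []int64, requestMilliCPU int64) int64 {
	var total int64
	for _, f := range freeMilliCPU {
		total += f
	}
	return total / requestMilliCPU
}

// estimateByNode sums how many replicas fit on each node individually,
// because a single pod cannot span nodes.
func estimateByNode(freeMilliCPU []int64, requestMilliCPU int64) int64 {
	var replicas int64
	for _, f := range freeMilliCPU {
		replicas += f / requestMilliCPU
	}
	return replicas
}

func main() {
	// 2000 nodes with 1 core (1000m) free on each, as in the example above.
	nodes := make([]int64, 2000)
	for i := range nodes {
		nodes[i] = 1000
	}
	const request = 2000 // assumed request: 2 cores per replica

	fmt.Println("by summary:", estimateBySummary(nodes, request)) // 1000
	fmt.Println("by node:   ", estimateByNode(nodes, request))    // 0
}
```

The summary-based estimate says 1000 replicas fit, while no node can actually host even one.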

References

RainbowMango added the kind/feature label Sep 28, 2021
RainbowMango changed the title from "[PlaceHolder]Cluster Resource modeling" to "[lfx-mentorship-2022-summer]Cluster Resource modeling" May 7, 2022
@anutosh491

anutosh491 commented May 13, 2022

Hello @RainbowMango, hope you're doing well. This is Anutosh from India. I'm an open source enthusiast, and I'm currently involved with communities focused on numerical and symbolic computation and algorithms in math and physics, such as numpy, sympy, and networkx.

I am keen to take part in the LFX Mentorship program for the summer term, and this project interests me. Being new to the project, I would be glad if you could suggest any relevant resources or links I should go through as a beginner to get to know the project and the library better. Thank you!

@AALEKH

AALEKH commented May 15, 2022

Hey @RainbowMango, any update on the comment above? I would also like to volunteer to work on this issue under the LFX Mentorship Program. Happy to send you a proposal for it. Thanks!

@RainbowMango
Member Author

@AALEKH @anutosh491 Thanks for reaching out. This task requires some basic knowledge of Kubernetes; after that, you can get started with the Karmada quick start.
Here are some documents too that might help you understand the project.

@anutosh491

anutosh491 commented May 19, 2022

Thank you @RainbowMango. I have been learning more about Karmada, the problem it solves, and the functionality behind it. I will be going through the docs soon and will then start working on my application!

EDIT1: I went through most of the resources shared above. I am now much more comfortable with Karmada and have a better understanding of how it operates. Thanks for the resources :)

@anutosh491

anutosh491 commented May 27, 2022

resourceModel:
  - grade:
      cpu: "1"
      memory: 2Gi
    count: 10
  - grade:
      cpu: "2"
      memory: 4Gi
    count: 6

Hello @RainbowMango, I've been drafting my cover letter for the LFX mentorship program and had a couple of doubts regarding this proposed model.

  1. What information do grade and count convey? I realize that grade would be of type corev1.ResourceList and would carry pairs of resources and quantities.
  2. resourceModel would also be introduced only in the Cluster API, right?
  3. Also, could you elaborate a bit more on which fragmented resources you are referring to in this line? "But the ResourceSummary is not precise enough, it mechanically counts the resources on all nodes, but ignores the fragment resources"
  4. Also, is there any other file or code chunk you would like me to go through? I've gone through some files completely, such as pkg/apis/cluster.types.go and pkg/scheduler/core/generic_scheduler.go, which helped me get a better general idea of the project. (I plan to go through the failover/rescheduling algorithm code, i.e. division.go and the other one, sometime soon.)

@anutosh491

anutosh491 commented May 27, 2022

Also, does this call for removing, or rather deprecating, the ResourceSummary type and the objects that use it throughout the codebase?

@anutosh491

Hello @RainbowMango sir, could you please help me with the doubts I asked above, as today is the last day to apply? My application material is ready, but I want to confirm these basic points before submitting it. Thanks in advance!

@RainbowMango
Member Author

What information does grade and count convey ? I realize that grade would be type corev1.ResourceList and would be carrying pairs of resources and quantity !

The grade and count on the issue are examples of what kind of things we are trying to build.

resourceModel would also be introduced in the Cluster API only right?

Probably. Given that the Cluster object is already large, another option may be to build a separate API to store the model.
Where and how to store the model is not the key concern here; what matters more is how to describe a cluster's resource situation with the model.
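
As a rough illustration of how such a model might describe a cluster's resource situation to a scheduler, here is a minimal Go sketch. It assumes, hypothetically, that every node counted in a grade has at least that grade's resources free, so the result is only a lower bound. `Resources`, `Grade`, and `minReplicas` are made-up names, not Karmada APIs, and the issue does not prescribe this calculation.

```go
package main

import "fmt"

// Resources is a hypothetical, simplified stand-in for a resource list.
type Resources struct {
	MilliCPU  int64 // CPU in millicores
	MemoryMiB int64 // memory in MiB
}

// Grade assumes Count nodes each have at least Min resources free.
type Grade struct {
	Min   Resources
	Count int64
}

// minReplicas returns a lower bound on how many replicas with the given
// request fit in the cluster, using only the model (no per-node data).
func minReplicas(model []Grade, req Resources) int64 {
	if req.MilliCPU <= 0 || req.MemoryMiB <= 0 {
		return 0 // this sketch only handles strictly positive requests
	}
	var total int64
	for _, g := range model {
		perNode := g.Min.MilliCPU / req.MilliCPU
		if byMem := g.Min.MemoryMiB / req.MemoryMiB; byMem < perNode {
			perNode = byMem
		}
		total += perNode * g.Count
	}
	return total
}

func main() {
	// The example model from the issue description.
	model := []Grade{
		{Resources{1000, 2048}, 10},
		{Resources{2000, 4096}, 6},
		{Resources{4000, 8192}, 2},
		{Resources{8000, 16384}, 1},
	}
	req := Resources{MilliCPU: 2000, MemoryMiB: 4096} // assumed pod request
	fmt.Println(minReplicas(model, req))              // 14
}
```

With the example model and a 2-core, 4Gi request, the ten 1-core nodes contribute nothing, and the remaining grades contribute 6 + 4 + 4 = 14 replicas.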

Also could you elaborate a bit more on what all fragmented resources you're talking about in this line ?

Please see the example:

For example, a cluster with 2000 node, 1 core cpu left on each node, from the ResourceSummary, we get there are 2000 core CPU left for the cluster, that's not correct.

Also does this call for removing or rather deprecating the Resource Summary class and used objects throughout the codebase ?

Probably.

@RainbowMango
Member Author

@halfrost, please assign this issue to yourself with the command:
/assign @halfrost
to show that you are working on it.

@karmada-bot
Collaborator

@RainbowMango: GitHub didn't allow me to assign the following users: halfrost.

Note that only karmada-io members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@halfrost
Contributor

/assign @halfrost

@RainbowMango
Member Author

@halfrost Could you please post the API design here? I know some guys who are interested in it.

@halfrost
Contributor

OK, I will write a document detailing the API design.

@RainbowMango
Member Author

/close
in favor of #2379

@karmada-bot
Collaborator

@RainbowMango: Closing this issue.

