[BUG] Koordinator doesn't support multiple card sharing #2097
Comments
/assign
Welcome! You can refer to this proposal.
I have started working on it, but I need some time to understand your design principles and code. I will try my best to complete it as soon as possible.
OK. If you need help, questions and discussions are welcome either in this GitHub issue or on DingDing (DingTalk).
Is this implemented in the mutating and validating webhook under pkg/webhook/pod/mutating/extended_resource_spec.go? I didn't see any handling of the GPU extended resource there. I am working on this task now, @ZiMengSheng.
What is your DingDing account? Can I add you as a contact?
王建宇
The scheduler needs to calculate requestsPerCard and numGPUs according to the gpu.shared protocol.
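To make the calculation mentioned above concrete, here is a minimal sketch in Go. It is not Koordinator's actual scheduler code: the function names splitSharedRequest and fitsNode are hypothetical, and it assumes that the pod's GPU memory request is a total that gets split evenly across the number of cards given by koordinator.sh/gpu.shared. The real protocol may define this differently, so treat it only as an illustration of the arithmetic.

```go
package main

import (
	"errors"
	"fmt"
)

// splitSharedRequest is a hypothetical helper (not part of Koordinator).
// shared is the value of koordinator.sh/gpu.shared, i.e. how many cards
// the pod wants. totalMemGiB is the total GPU memory requested, in GiB.
// ASSUMPTION: the total request is divided evenly across the cards.
func splitSharedRequest(shared, totalMemGiB int64) (numGPUs, requestsPerCard int64, err error) {
	if shared <= 0 {
		return 0, 0, errors.New("gpu.shared must be positive")
	}
	if totalMemGiB <= 0 {
		return 0, 0, errors.New("GPU memory request must be positive")
	}
	if totalMemGiB%shared != 0 {
		return 0, 0, fmt.Errorf("%d GiB cannot be split evenly across %d cards", totalMemGiB, shared)
	}
	return shared, totalMemGiB / shared, nil
}

// fitsNode reports whether at least numGPUs cards on the node each have
// requestsPerCard GiB of free memory. freePerCardGiB holds the free
// memory of every card on the node.
func fitsNode(numGPUs, requestsPerCard int64, freePerCardGiB []int64) bool {
	var candidates int64
	for _, free := range freePerCardGiB {
		if free >= requestsPerCard {
			candidates++
		}
	}
	return candidates >= numGPUs
}

func main() {
	// The scenario from this issue: 8 cards with 80 GiB each,
	// and a pod asking for 4 cards with 40 GiB on each (160 GiB total).
	node := []int64{80, 80, 80, 80, 80, 80, 80, 80}

	numGPUs, perCard, err := splitSharedRequest(4, 160)
	if err != nil {
		fmt.Println("invalid request:", err)
		return
	}
	fmt.Printf("numGPUs=%d requestsPerCard=%dGiB fits=%v\n",
		numGPUs, perCard, fitsNode(numGPUs, perCard, node))
}
```

Under these assumptions the request in this issue resolves to 4 cards with 40 GiB each, which fits on the described node, so the pod would be expected to schedule.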
What happened:
A node has 8 GPU cards, each with 80 Gi of GPU memory. I want to use four cards, with 40 Gi of GPU memory on each, via koordinator.sh/gpu.shared. But the pod gets stuck in the Pending phase.
What you expected to happen:
Pod should be scheduled.
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
Environment:
Kubernetes version (use kubectl version):