Limit maximum number of RDMA resources for Shared Device Plugin #580

Merged: 1 commit, Aug 2, 2023

@e0ne e0ne commented Jul 27, 2023

There is a hardware limitation of 128 GIDs per port in RoCE mode. Since RoCE v1 and v2 are both enabled by default, we have to limit the number of devices per port that can be allocated to workloads to 63.

There is no such limitation for IPoIB mode, so users should update the default value if needed.

This commit also makes rdmaHcaMax configurable via Helm values.
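
As a rough check on the figure of 63, the arithmetic can be sketched as below. The PR only states the 128-GID limit and that both RoCE versions are enabled. The sketch additionally assumes that each allocated device consumes one GID entry per enabled RoCE version and that one such slot stays with the host interface. These assumptions are made here only to reproduce the number, and `max_shared_devices` is a hypothetical helper, not part of the plugin.

```python
# Assumptions (not stated in the PR): each device allocated to a workload
# consumes one GID table entry per enabled RoCE version, and one such slot
# is reserved for the host's own interface.
GIDS_PER_PORT = 128          # hardware limit quoted in the PR description
ROCE_VERSIONS_ENABLED = 2    # RoCE v1 and v2 are both enabled by default


def max_shared_devices(gids_per_port: int = GIDS_PER_PORT,
                       roce_versions: int = ROCE_VERSIONS_ENABLED,
                       reserved_for_host: int = 1) -> int:
    """Return how many devices could be handed to workloads on one port."""
    if roce_versions < 1:
        raise ValueError("at least one RoCE version must be enabled")
    # Integer division: slots available when every device needs one GID
    # entry per enabled RoCE version, minus the slot assumed for the host.
    return gids_per_port // roce_versions - reserved_for_host


print(max_shared_devices())  # 63
```

Under these assumptions the result matches the 63-device limit described above.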

@adrianchiris adrianchiris left a comment

Overall LGTM. Can you update the examples folder as well?

e0ne commented Jul 31, 2023

> Overall LGTM. Can you update the examples folder as well?

Examples updated.

There is a hardware limitation of 128 GIDs per port in RoCE mode.
Since RoCE v1 and v2 are both enabled by default, we have to limit
the number of devices per port allocated to workloads to 63.

There is no such limitation for IPoIB mode, so users should update
the default value if needed.

This commit also makes rdmaHcaMax configurable via Helm values.

Signed-off-by: Ivan Kolodiazhny <ikolodiazhny@nvidia.com>
@e0ne e0ne merged commit 9a4a8d1 into Mellanox:master Aug 2, 2023
14 checks passed
@e0ne e0ne mentioned this pull request Aug 2, 2023
24 tasks