GPU-VPC module #3391

Merged: 4 commits, Dec 12, 2024

@cdunbar13 (Contributor) commented Dec 11, 2024

This PR adds the modules/network/gpu-vpc module, a modified version of the rdma-vpc module from the experimental branch.

Several features have been stripped out (SSH and IAP firewall rules, secondary ranges, etc.). What remains is a module that creates a VPC and several subnets, intended primarily for inter-GPU communication (i.e. RDMA).
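To illustrate the "one VPC, several subnets" layout, here is a minimal Python sketch that carves a base CIDR into equally sized subnet ranges. It is not the module's implementation: the function `plan_subnets`, the `gpu-subnet` name prefix, the dictionary keys, and the example address range are all hypothetical, and the real module's inputs may differ.

```python
import ipaddress


def plan_subnets(base_cidr, count, new_prefix, name_prefix="gpu-subnet"):
    """Return `count` non-overlapping subnet ranges carved from `base_cidr`.

    All names and the output shape are illustrative assumptions, not the
    gpu-vpc module's actual variables.
    """
    base = ipaddress.ip_network(base_cidr)
    # subnets() lazily yields every /new_prefix block inside the base range.
    candidates = base.subnets(new_prefix=new_prefix)

    plan = []
    for index, network in zip(range(count), candidates):
        plan.append({"name": f"{name_prefix}-{index}", "ip_range": str(network)})

    if len(plan) < count:
        raise ValueError(
            f"{base_cidr} only fits {len(plan)} /{new_prefix} subnets, need {count}"
        )
    return plan


if __name__ == "__main__":
    # Example: eight /24 subnets, e.g. one per GPU NIC (an assumed layout).
    for subnet in plan_subnets("192.168.0.0/16", count=8, new_prefix=24):
        print(subnet["name"], subnet["ip_range"])
```

Keeping each subnet on its own non-overlapping range makes it easy to attach one network interface per subnet; whether the module sizes its ranges this way is an assumption.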

Tested in a modified A3U blueprint. NCCL tests produced the expected results at an 8 GB message size.

@cdunbar13 added the release-key-new-features label (Added to release notes under the "Key New Features" heading.) Dec 11, 2024
@cdunbar13 requested a review from tpdownes December 11, 2024 19:40
@tpdownes assigned cdunbar13 and unassigned tpdownes Dec 12, 2024
cdunbar13 and others added 4 commits December 12, 2024 16:51
@cdunbar13 merged commit 76d2e38 into GoogleCloudPlatform:develop Dec 12, 2024
8 of 61 checks passed
@cdunbar13 deleted the gpu-vpc-promo branch December 12, 2024 17:46
@nick-stroud mentioned this pull request Dec 19, 2024
2 participants