Add workload services #732

Merged
merged 21 commits into from
Nov 29, 2022

@AmartC AmartC commented Oct 27, 2022

This PR adds the specs for the workload and pretrained DRAIN services as well as the specs for the CPU and GPU inferencing services and the training controller.

@AmartC AmartC force-pushed the add-workload-services branch 2 times, most recently from a7636ef to d6320bf Compare November 1, 2022 23:10

@dbason dbason left a comment

In general this looks fine.

To fix the compilation errors you will need to update the ConvertSpec function in /pkg/resources/opnicluster/util.go and replace

services.Drain = v1beta2.DrainServiceSpec(input.Services.Drain)

with

services.Drain = v1beta2.DrainServiceSpec{
    ImageSpec:    input.Services.Drain.ImageSpec,
    Enabled:      input.Services.Drain.Enabled,
    NodeSelector: input.Services.Drain.NodeSelector,
    Tolerations:  input.Services.Drain.Tolerations,
    Replicas:     input.Services.Drain.Replicas,
}

You will also need to remove the commented-out lines from /controllers/ai_opnicluster_controller_test.go. Please also look into adding tests for the new services.
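For readers following the ConvertSpec suggestion above, here is a minimal sketch of why the direct conversion stops compiling and why the field-by-field copy fixes it. It assumes the cause is that the two API versions' struct types no longer have identical fields, which the thread does not state outright. The types oldDrainSpec, newDrainSpec and workloadSpec, and the function convertDrain, are hypothetical stand-ins and not the actual opni definitions.

```go
package main

import "fmt"

// oldDrainSpec is a hypothetical stand-in for the older API version's spec.
type oldDrainSpec struct {
	Image    string
	Enabled  *bool
	Replicas *int32
}

// workloadSpec is a hypothetical field type that exists only in the newer version.
type workloadSpec struct {
	Enabled *bool
}

// newDrainSpec is a hypothetical stand-in for the newer version. The extra
// Workload field means its underlying type differs from oldDrainSpec.
type newDrainSpec struct {
	Image    string
	Enabled  *bool
	Replicas *int32
	Workload workloadSpec
}

// convertDrain copies the shared fields one at a time.
// Writing newDrainSpec(in) here would be a compile error: Go only allows a
// direct conversion between struct types whose underlying types are
// identical (ignoring struct tags).
func convertDrain(in oldDrainSpec) newDrainSpec {
	return newDrainSpec{
		Image:    in.Image,
		Enabled:  in.Enabled,
		Replicas: in.Replicas,
		// Workload is left at its zero value; the old spec has no equivalent.
	}
}

func main() {
	enabled := true
	replicas := int32(2)
	out := convertDrain(oldDrainSpec{
		Image:    "example/drain:latest",
		Enabled:  &enabled,
		Replicas: &replicas,
	})
	fmt.Println(out.Image, *out.Enabled, *out.Replicas, out.Workload.Enabled == nil)
}
```

The explicit copy is more verbose than a conversion, but it keeps compiling as the newer type gains fields, and any field that exists only in the newer version simply keeps its zero value.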

@dbason dbason left a comment

You also need to remove the section in /pkg/resources/opnicluster/opnicluster.go that logs that GPU learning is not supported. It starts on line 166.

@AmartC AmartC requested a review from dbason November 3, 2022 20:40
@AmartC AmartC marked this pull request as ready for review November 3, 2022 21:00
kralicky previously approved these changes Nov 4, 2022
kralicky previously approved these changes Nov 18, 2022
dbason previously approved these changes Nov 22, 2022

@kralicky kralicky left a comment

LGTM

@AmartC AmartC merged commit 6cf1d1f into main Nov 29, 2022
@AmartC AmartC deleted the add-workload-services branch November 30, 2022 02:33
3 participants