Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for specifying custom container run arguments #3160

Merged
merged 9 commits into from
Apr 9, 2024

Conversation

danegsta
Copy link
Member

@danegsta danegsta commented Mar 26, 2024

PR adds a new annotation for specifying custom arguments to the container run command issued by the orchestrator. This is intended to support advanced scenarios like enabling GPU support in Docker containers, where adding annotations for individual flags may not make sense.

Microsoft Reviewers: Open in CodeFlow

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-app-model Issues pertaining to the APIs in Aspire.Hosting, e.g. DistributedApplication label Mar 26, 2024
@danegsta danegsta linked an issue Mar 26, 2024 that may be closed by this pull request
@davidfowl
Copy link
Member

What goes into the manifest?

@danegsta
Copy link
Member Author

I was just writing up a message to ask that very question. I'm assuming we will want manifest entries for this annotation along with the rest of the container specific data?

@danegsta
Copy link
Member Author

Only concern I have with adding the run arguments to the manifest is that they can potentially be runtime specific (the --gpus flag, for example, isn't supported in podman).

@DamianEdwards
Copy link
Member

Yeah I think these can't be in the manifest as they're opaque and very much tied to the container host. We can build higher-level container runtime feature constructs (e.g. GPU support) and those could go in the manifest and could leverage this PR's changes when running locally, if we want to.

@DamianEdwards
Copy link
Member

I'd really like to get this in as it enables GPU support for AI scenarios with Aspire which we have some folks experimenting with right now. @davidfowl @danegsta how do you feel about my comment above RE not putting this in the manifest?

@danegsta
Copy link
Member Author

I'm already a bit nervous about adding them to the manifest without any additional runtime context, so I'd be fine with backing out my changes to write to the manifest until we've had a chance to figure out exactly how we want to add this data.

@davidfowl
Copy link
Member

If they don't go into the manifest does that mean this is a runtime only scenario?

@DamianEdwards
Copy link
Member

I think so yes. We could choose to add first-class modeling for GPU support later as a portable concept, but this is low level for when you want to pass args directly to the container runtime.

@davidfowl
Copy link
Member

When you run on kuberentes aren't those args potentially applicable? How is this different from docker build args?

@DamianEdwards
Copy link
Member

Honestly I don't know. Are they? But even just enabling GPU uses different runtime args between docker and podman.

@danegsta
Copy link
Member Author

It looks like Podman 5.0 added support for the same --gpu argument as Docker. Doesn't really change the calculous on adding generic container run arguments to the manifest, but maybe this should become a more targeted feature for enabling GPU support rather than exposing the low level ability to add extra run arguments?

@DamianEdwards
Copy link
Member

@danegsta I think both should exist, with GPU support being something that would go to the manifest, but container runtime args being something that only apply during run. Doing the container runtime args feature first allows us to unblock any scenario where args have to be passed to the container runtime locally (including enabling GPUs, setting resource constraints, configuring networks, changing privileges, and setting the user). I think that's worth doing as an escape hatch to allow folks to get the full capabilities of their local container runtime during dev until we decide which higher level concepts to model.

@danegsta
Copy link
Member Author

@DamianEdwards I backed out the logic to write the manifest.

@danegsta danegsta merged commit 4946a74 into dotnet:main Apr 9, 2024
8 checks passed
@davidfowl
Copy link
Member

@danegsta backport?

@davidfowl
Copy link
Member

/backport to release/8.0

Copy link
Contributor

Started backporting to release/8.0: https://github.com/dotnet/aspire/actions/runs/8626993889

Copy link
Contributor

@davidfowl backporting to release/8.0 failed, the patch most likely resulted in conflicts:

$ git am --3way --ignore-whitespace --keep-non-patch changes.patch

Applying: Add support for new run arguments option
Using index info to reconstruct a base tree...
M	src/Aspire.Hosting/Dcp/ApplicationExecutor.cs
M	src/Aspire.Hosting/Dcp/Model/Container.cs
M	src/Aspire.Hosting/ResourceBuilderExtensions.cs
M	tests/Aspire.Hosting.Tests/DistributedApplicationTests.cs
Falling back to patching base and 3-way merge...
Auto-merging tests/Aspire.Hosting.Tests/DistributedApplicationTests.cs
Auto-merging src/Aspire.Hosting/ResourceBuilderExtensions.cs
Auto-merging src/Aspire.Hosting/Dcp/Model/Container.cs
Auto-merging src/Aspire.Hosting/Dcp/ApplicationExecutor.cs
CONFLICT (content): Merge conflict in src/Aspire.Hosting/Dcp/ApplicationExecutor.cs
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 Add support for new run arguments option
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
Error: The process '/usr/bin/git' failed with exit code 128

Please backport manually!

Copy link
Contributor

@davidfowl an error occurred while backporting to release/8.0, please check the run log for details!

Error: git am failed, most likely due to a merge conflict.

danegsta added a commit to danegsta/aspire that referenced this pull request Apr 10, 2024
* Add support for new run arguments option

* Add manifest entry for runArgs

* Container, not resource

* Don't write manifest entry

* Move extension methods to ContainerResourceBuilderExtensions and restrict to ContainerResource
danmoseley pushed a commit that referenced this pull request Apr 10, 2024
)

* Add support for new run arguments option

* Add manifest entry for runArgs

* Container, not resource

* Don't write manifest entry

* Move extension methods to ContainerResourceBuilderExtensions and restrict to ContainerResource
@danmoseley danmoseley mentioned this pull request Apr 12, 2024
@prom3theu5
Copy link

This is cool
Would it be possible to emit them to the manifest please as currently they aren't so downstream tools have no way to know that run args are expected to be passed.

@DamianEdwards
Copy link
Member

We decided not to emit them as they're container-host specific and thus don't translate well, e.g. args for docker run are different than those for podman run. But we can always change that decision. Please log an issue and we can discuss there.

@github-actions github-actions bot locked and limited conversation to collaborators May 25, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-app-model Issues pertaining to the APIs in Aspire.Hosting, e.g. DistributedApplication
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Expose ability to run containers with access to GPUs
4 participants