Override container runtime when inferentia support is enabled #2458

Merged 1 commit on Jun 14, 2020

@fenxiong (Contributor) commented May 22, 2020

Summary

Adds an agent config option, InferentiaSupportEnabled, populated from the ECS_ENABLE_INF_SUPPORT environment variable. When InferentiaSupportEnabled is on, the agent overrides the runtime of any container that specifies AWS_NEURON_VISIBLE_DEVICES to the neuron Docker runtime, which is required to use Inferentia devices.

With this change, the neuron runtime is used only for containers that need an Inferentia device, and only when that runtime is installed on the AMI. The ECS_ENABLE_INF_SUPPORT config indicates that it is installed; we will add that config to the AMI together with the neuron runtime.
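
The two conditions above can be sketched as a small decision function. This is a minimal illustration and not the agent's actual code: the `container` type, the `runtimeOverride` helper, and the runtime name `"neuron"` are assumptions made for the example; only AWS_NEURON_VISIBLE_DEVICES and the on/off flag come from this PR.

```go
package main

import "fmt"

// neuronRuntime is the Docker runtime name assumed for this sketch.
const neuronRuntime = "neuron"

// neuronVisibleDevicesEnv is the container environment variable named in the PR.
const neuronVisibleDevicesEnv = "AWS_NEURON_VISIBLE_DEVICES"

// container is a hypothetical, simplified stand-in for the agent's container type.
type container struct {
	Name        string
	Environment map[string]string
}

// runtimeOverride returns the runtime to force on the container and true,
// or "" and false when the container's runtime should be left alone.
// Both conditions must hold: support is enabled on the instance, and the
// container itself asks for Inferentia devices.
func runtimeOverride(infSupportEnabled bool, c container) (string, bool) {
	if !infSupportEnabled {
		return "", false
	}
	if _, ok := c.Environment[neuronVisibleDevicesEnv]; !ok {
		return "", false
	}
	return neuronRuntime, true
}

func main() {
	containers := []container{
		{Name: "inference", Environment: map[string]string{neuronVisibleDevicesEnv: "0"}},
		{Name: "sidecar", Environment: map[string]string{}},
	}
	for _, c := range containers {
		if rt, ok := runtimeOverride(true, c); ok {
			fmt.Printf("%s: runtime overridden to %q\n", c.Name, rt)
		} else {
			fmt.Printf("%s: runtime unchanged\n", c.Name)
		}
	}
}
```

Gating on both conditions keeps containers that do not request a device on the default runtime, and avoids asking Docker for a runtime that is not installed on the instance.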

Implementation details

  • api/task: added logic to override the container runtime to neuron when needed. This required refactoring dockerHostConfig to satisfy the gocyclo complexity check.
  • api/container: added the RequireNeuronRuntime method, which checks whether the container specifies use of an Inferentia device.
  • config: added the InferentiaSupportEnabled config, populated from the ECS_ENABLE_INF_SUPPORT environment variable.
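
As a rough sketch of the config piece, the flag could be derived from the environment as shown here. The `config` struct, `parseBoolSetting`, and `configFromEnv` are hypothetical names for illustration, and the parsing rules (Go's `strconv.ParseBool`, defaulting to false) are an assumption; the agent's real parsing may differ.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// config is a simplified stand-in for the agent config; only the field
// named in this PR is shown.
type config struct {
	InferentiaSupportEnabled bool
}

// parseBoolSetting interprets a raw environment value as a boolean and
// falls back to def when the value is empty or cannot be parsed.
func parseBoolSetting(raw string, def bool) bool {
	v, err := strconv.ParseBool(strings.TrimSpace(raw))
	if err != nil {
		return def
	}
	return v
}

// configFromEnv builds the config using the supplied lookup function, so the
// environment can be faked in tests. Support is assumed to be off by default.
func configFromEnv(getenv func(string) string) config {
	return config{
		InferentiaSupportEnabled: parseBoolSetting(getenv("ECS_ENABLE_INF_SUPPORT"), false),
	}
}

func main() {
	cfg := configFromEnv(os.Getenv)
	fmt.Printf("InferentiaSupportEnabled=%t\n", cfg.InferentiaSupportEnabled)
}
```

Passing the lookup function in, rather than calling `os.Getenv` directly, is a design choice for the sketch only: it lets unit tests exercise each value without mutating the process environment.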

Testing

Unit tests added. Built the agent, successfully ran an Inferentia task, and verified that the runtime is set to neuron only for the container that specifies AWS_NEURON_VISIBLE_DEVICES.

New tests cover the changes: yes

Description for the changelog

TBD

Licensing

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@fenxiong fenxiong added this to the 1.41.0 milestone Jun 1, 2020
@fenxiong fenxiong merged commit ec4b332 into aws:dev Jun 14, 2020