-
Notifications
You must be signed in to change notification settings - Fork 604
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use resolved digest for image pulls #4157
Use resolved digest for image pulls #4157
Conversation
122f9e0
to
78a1bfa
Compare
9accad7
to
9c66114
Compare
78a1bfa
to
a2dd4af
Compare
a2dd4af
to
9f9aa2a
Compare
// Given an image reference and a manifest digest string, returns a canonical reference | ||
// for the image. | ||
// If the image reference has a digest then the canonical reference will still use the provided | ||
// manifest digest overwriting the exiting digest in the image reference. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit (non-blocking): typo in exiting
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will fix!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will fix this in a follow-up PR right after.
// Tag the pulled image so that it can be found using the image reference in the task. | ||
ctx, cancel := context.WithTimeout(engine.ctx, tagImageTimeout) | ||
defer cancel() | ||
err := engine.client.TagImage(ctx, imageRef, container.Image) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are retagging the image with mutable tags (such as nginx:latest
), and we have a subsequent task whose container uses the same image but using digest (such as nginx:shaXYZ
)
- will the new task be able to discover that the local
nginx:latest
is identical tonginx:shaXYZ
and reuse it? - will Agent be able to keep the correct image reference counter on
nginx:latest
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @yinyic .
- will the new task be able to discover that the local
nginx:latest
is identical tonginx:shaXYZ
and reuse it?
If nginx:latest
still resolves to nginx:shaXYZ
then docker image pull nginx:shaXYZ
will behave just like docker image pull nginx:latest
and pull will end after docker determines that all image layers are already present on the host. If nginx:shaXYZ
is a different image than locally available nginx:latest
then Docker will pull the image from the repository and two images will be available on the host after that.
In case of prefer-cached
image pull behavior, Agent will do a docker image inspect nginx:shaXYZ
. If previously pulled nginx:latest
image had shaXYZ
digest then inspect will succeed and Agent will use the local image again otherwise inspect will fail and Agent will pull nginx:shaXYZ
image.
In case of once
image pull behavior, Agent will not find nginx:shaXYZ
image in its internal cache and pull the image. After the pull Agent will add nginx:shaXYZ
to its internal cache.
The behavior explained above is the same as it exists currently and this PR does not intend to change the "net" image pull behavior of Agent.
- will Agent be able to keep the correct image reference counter on
nginx:latest
?
Agent's image reference counting will continue to work as before. If nginx:shaXYZ
and nginx:latest
image on the host have the same image ID then Agent will have two references for the same underlying image on the host. If nginx:shaXYZ
has a different image ID than local nginx:latest
then Agent will have one reference each for the two images. This is the current Agent behavior and this PR does not intend to make any changes to it.
Summary
This PR updates image pull logic to use a resolved image manifest digest if one is available. Image manifest digests are resolved during container transition to
MANIFEST_PULLED
state. The change ensures that the pulled image is the same as pointed by the resolved digest.Implementation details
TagImage
toDockerClient
interface and its implementation. The method tags an image on the host. The implementation performs retries using a newConstantBackoff
strategy that backs off the same duration every time. A constant backoff retry strategy is fine in this case as there is no external service involved.ConstantBackoff
backoff strategy underecs-agent
module. The strategy returns the same amount of backoff duration regardless of how many times itsDuration
method is called.*DockerTaskEngine.pullAndUpdateContainerReference
method that is used for pulling container images so that it uses the container'sImageDigest
field to prepare a canonical reference to the image to be pulled. The method tags the pulled image withContainer.Image
if a different image reference was used to pull the image so that image caching and image cleanup continue to work as before.GetCanonicalRef
toagent/utils/reference
package that returns a canonical image reference given an image reference and a manifest digest.Testing
TestPullContainerWithAndWithoutDigestInteg
to check that*DockerTaskEngine.pullContainer
can pull images for containers with and without anImageDigest
set.TestPullContainerWithAndWithoutDigestConsistency
to check that*DockerTaskEngine.pullContainer
pulls the same image with or without a digest set and the image can be inspected withcontainer.Image
field in both cases.In addition to the integration tests above, performed the following manual tests.
always
image pull behavior. Checked that all tasks ran as expected. Images were pulled using digests and tagged with the image reference in the task definition. Images were cleaned up without any issues.once
and thenprefer-cached
image pull behavior. Checked that all tasks ran as expected. Cached images were used in both cases when found. Image cleanup worked as expected withonce
image pull behavior. Image pull is disabled whenprefer-cached
image pull behavior is used.always
image pull behavior to force image pulls. Measured the task average start times (startedAt
-createdAt
) and task pull times (pullStoppedAt
-pullStartedAt
) for both cases. Changes to resolve image manifest digest in Implement transition function for MANIFEST_PULLED state #4152 caused an additional delay in task start times that ranged from 300ms (ECR) to 900ms (Dockerhub), however, with this PR the image pulls are now slightly faster. Pull time for an image that's already available on the host is reduced from ~700ms (Dockerhub), ~250ms (public ECR), and ~130ms (private ECR) to ~260ms (Dockerhub), ~100ms (public ECR), and ~50ms (private ECR). Combined with changes to resolve image manifest digests (Implement transition function for MANIFEST_PULLED state #4152) the overall average increase in task start time in my test environment is ~500ms (Dockerhub) and ~150ms (ECR).New tests cover the changes: yes
Description for the changelog
Does this PR include breaking model changes? If so, Have you added transformation functions?
Licensing
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.