"no supported platform found in manifest list" / "no matching manifest for XXX in the manifest list entries" #3835

Closed
tianon opened this issue Dec 21, 2017 · 18 comments

tianon (Member) commented Dec 21, 2017

TLDR: Not all architectures are created equal, but perhaps even more importantly, not all build servers we have access to are equal in performance, power, or ability to process builds reliably.

Important: Please do not post here with reports of individual image issues -- we're aware of the overall problem, and this issue is a discussion of solving it generally. Off-topic comments will be deleted.


When we merge an update PR to https://github.com/docker-library/official-images, it triggers Jenkins build jobs over in https://doi-janky.infosiftr.net/job/multiarch/ (see #2289 for more details on our multiarch approach).

Sometimes, we'll have non-amd64 image build jobs finish before their amd64 counterparts, and due to the way we push the manifest list objects to the library namespace on the Docker Hub, that results in amd64-using folks (our primary target users) getting errors of the form "no supported platform found in manifest list" or "no matching manifest for XXX in the manifest list entries" (see linked issues below for several reports from users of this variety).

Thus, manifest lists under the library are "eventually consistent" -- once all arches complete successfully, the manifest lists get updated to include all the relevant sub-architectures.
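
To see at a glance which platforms a given tag's manifest list currently contains, something like this works (a rough sketch, assuming a Docker CLI with the experimental "docker manifest" command enabled, plus jq for readability -- it prints one "os/architecture[/variant]" line per entry):

$ docker manifest inspect alpine:3.9 \
    | jq -r '.manifests[].platform | .os + "/" + .architecture + (if .variant then "/" + .variant else "" end)'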

Our current method for combating the main facet of this problem (missing amd64 images while other arches are successfully built and available) is to trigger amd64 build jobs within an hour after the update PR is merged, and all other arches only within 24 hours. This helps to some degree in ensuring that amd64 builds first, but not always. For example, our arm32vN servers are significantly faster than our AWS-based amd64 server, so if those jobs happen to get queued at the same time as existing amd64 jobs are, they'll usually finish a lot more quickly. Additionally, given the slow IO speed of our AWS-based amd64 build server, the queue for amd64 build jobs piles up really quickly (which also doesn't help with keeping our build window low).

As for triggering jobs more directly: the GitHub webhooks support in Jenkins makes certain assumptions about how jobs and pipelines are structured/triggered, so we can't use GitHub's webhooks to trigger these jobs effectively (without doing additional custom development to sit between the two systems), and instead rely on the built-in Jenkins polling mechanism. This has been fine (we haven't noticed any scalability issues with how often we're polling), and even if we were triggering builds more aggressively, that's only half the problem (our build queues would just pile up faster).

One solution that has been proposed is to wait until all architectures successfully build before publishing the relevant manifest list. If a naïve version of this suggestion were implemented right now, we would have no image updates published at all, because our s390x worker is currently down (as an example -- we do frequently lose builder nodes, given that all non-amd64 arches are using donated resources). Additionally, as noted above, some architectures build significantly slower than others (before we got our hyper-fast ARM hardware, arm32vN used to take days to build images like python), so it isn't exactly fair to force all architectures to wait for the one slowpoke before providing updated images to our userbase. As a final thought on this solution, some architectures outright fail, and the maintainers don't necessarily notice or even care (for example, mongo:3.6 on windows-amd64 has been failing consistently with a mysterious Windows+Docker graph driver error that we haven't had a chance to look into or escalate, and it wouldn't be fair to block updated image availability on that).

One compromise would be to use the Jenkins Node API (https://doi-janky.infosiftr.net/computer/multiarch-s390x/api/json) to determine whether a particular builder is down, and use that to decide whether to block on builds for that architecture. Additionally, we could try to get creative with checking pending builds / queue length for a particular architecture to determine whether it is significantly backlogged and thus a good candidate for not waiting on.
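
As a minimal sketch of that first idea (assuming the "offline" field in the Jenkins computer API is a good-enough signal for "this builder is down, don't block on it"):

$ # prints "true" if the multiarch-s390x builder is currently down
$ curl -fsSL 'https://doi-janky.infosiftr.net/computer/multiarch-s390x/api/json' | jq '.offline'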

We could also attempt to determine when a particular tag was added/merged, and set a time limit of some number of hours before we assume it must be backlogged, failing, or down and move along without that tag. This is slightly more complicated, since we don't have a modification time for a particular tag directly, and can really only determine that information at the image level without complex Git walking / image manifest file parsing. Perhaps even just a time limit at the image level would be enough, but in the case of our mongo:3.6 example, that would mean all tag updates to mongo (whether they're related to the 3.6 series or not) would wait the maximum amount of time before being updated, due to one version+architecture combination failing.
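
For the image-level version of that timestamp, a crude approximation could be as simple as the following (assuming a local checkout of docker-library/official-images; the per-tag equivalent is exactly the more complex Git walking mentioned above):

$ git -C official-images log -1 --format='%cI' -- library/mongo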


Related issues: (non-comprehensive)

@m5p3nc3r commented:

Hey @tianon Thanks for introducing me to this issue yesterday. Initial thoughts:

The core of the problem seems to be the availability of amd64 images (it breaks my heart as an Arm guy, but it's a fair statement of the situation today!). We therefore need to make sure that the fat manifest is only published once the amd64 build has completed successfully, right?

How about implementing a system whereby the manifest is only published once a list of 'gold' architectures have built successfully? That way, you could ensure the amd64 issue never rears its ugly head again. It would also mean that for specific images -- say, popular base images like Alpine -- the manifest is only published when, for example, amd64, arm64v8, ppc64le, and s390x are all successful.

It's not an ideal solution, but it could act as a way of stabilising things whilst a better solution is implemented.

One last thought -- if this were to be implemented, it would be useful to have a global 'gold architecture' list and a per-project delta from that list.
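
Purely as a hypothetical sketch of what such a gate might look like -- assuming the per-architecture staging namespaces (amd64/, arm64v8/, ppc64le/, s390x/, ...) are pushed before the library manifest list is assembled, and that the "gold" list would come from per-image configuration rather than being hard-coded:

$ for arch in amd64 arm64v8 ppc64le s390x; do
    docker manifest inspect "$arch/alpine:3.9" > /dev/null 2>&1 || echo "$arch is not ready yet"
  done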

MattF-NSIDC commented Jan 25, 2018

What advice do you have for developers affected by this bug? I'm using an image which depends on docker-library/tomcat and I've been unable to build for about a half hour. I read your post pretty carefully, I think, but didn't see any mention of a workaround. Based on what I've read, this is not a problem that can be solved on my side, I would just have to wait.

If that is the case, is there any way for me to do maybe an API query to estimate a wait time until dockerhub reaches a consistent state?

tianon (Member, Author) commented Jan 25, 2018

@MattF-NSIDC fair point -- this was intended as a tracking issue for the problem and discussion around how to solve the crux of it properly; I think a short blurb here about how to work around it in the meantime is definitely appropriate. Here's my current recommendation:

If you rely on a specific image, use https://github.com/docker-library/repo-info (linked from every image description) to find the exact sha256 digest (also available from the docker pull output, but the repo-info repository has retroactive digests in the Git history), or even simply use a more specific tag.
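
For example, a digest-pinned pull looks like the following (the digest is a placeholder -- substitute the real value from repo-info or from your own docker pull output):

$ docker pull tomcat@sha256:<digest-from-repo-info>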

If you're looking for a specific architecture, use the architecture-specific namespace to find it (as linked from both https://github.com/docker-library/official-images#architectures-other-than-amd64 and every image description under "Supported architectures").
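
For example, pulling the amd64-specific build of tomcat directly (the tag here is purely illustrative; the architecture-specific repositories generally carry the same tags as the library image):

$ docker pull amd64/tomcat:9.0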

As for ETA: even if we find a reasonable solution to "wait for things to be available", we'll still have a limit on how long we wait before pushing whatever we've got, which will likely be on the order of hours but still less than 24 (IMO, 24h ought to be the absolute maximum we wait before we storm ahead, and that roughly matches our current build-scheduling timing).

tianon (Member, Author) commented Jan 25, 2018

@m5p3nc3r yeah, the "gold architectures" solution is basically exactly the solution I wrote on a quick note on my desk the first time we had this problem 😄

I don't love it, but it does seem like the closest we can get, and definitely is going to be better than what we're doing now. It would also allow us to simply trigger all builds as soon as possible after merge and let everything simply trickle through. The main challenge I see is how to implement the "timeout" functionality, but perhaps we simply punt and put the timeout on the full image instead of individual tags and call it a day.

tianon pinned this issue Dec 24, 2018
tianon (Member, Author) commented May 3, 2019

I built https://github.com/tianon/dockerhub-public-proxy as a bit of a trial balloon to see if I could run it behind something that caches aggressively (like Cloudflare), to help with some of the speed issues that have forced us to optimize in the ways that made this issue so glaring. My initial testing is very promising -- it should let us no-op things like blob mounts more quickly than the general solution manifest-tool has to use, since we know that for our subset of the problem all our layers are public data and thus highly cacheable (especially anything fetched by content-addressable digest, which literally can't change by design and is thus technically infinitely cacheable 😄).
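
To illustrate the caching angle (this isn't the proxy itself, just a sketch of the plain Docker Hub registry API it sits in front of, using an alpine amd64 manifest digest as the example): a manifest fetched by digest can never legitimately change, so a cache like Cloudflare sitting in front of these requests never has to revalidate them.

$ token="$(curl -fsSL 'https://auth.docker.io/token?service=registry.docker.io&scope=repository:library/alpine:pull' | jq -r '.token')"
$ curl -fsSL \
    -H "Authorization: Bearer $token" \
    -H 'Accept: application/vnd.docker.distribution.manifest.v2+json' \
    'https://registry-1.docker.io/v2/library/alpine/manifests/sha256:bf1684a6e3676389ec861c602e97f27b03f14178e5bc3f70dce198f9f160cce9'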

Hoping to hack more on that train of thought next week so that we can go back to a more naive approach to pushing tags that simply pushes everything (i.e., gather all the arch-specific :latest bits and push them to :latest, etc.), which would make this problem a non-issue again. 👍

tianon (Member, Author) commented May 10, 2019

Got a solution built and tested over in #5897 that makes the coordination problem moot by focusing instead on optimizing the pushing process as much as possible, with pretty decent benchmarks. 👍 🤘

tianon (Member, Author) commented May 11, 2019

As noted in the commit message on docker-library/oi-janky-groovy@51e9901, this can finally be closed thanks to #5897!! 🎉 🤘 ❤️

That's fully implemented and working on our infrastructure now. You can see it right this second with the recent alpine:3.9 update that added alpine:3.9.4 (#5898): alpine:3.9's amd64 entry has been updated to point to the new 3.9.4 image, but all the other architecture entries still point to 3.9.3 (until they finally build and catch up, which should be triggering within the next hour 💪). Meanwhile, alpine:3.9.4 is available, and alpine:3.9.3 is now "archived" and will remain untouched. 👍

Alpine Digest Comparisons:
$ manifest-tool inspect alpine:3.9.4
Name:   alpine:3.9.4 (Type: application/vnd.docker.distribution.manifest.list.v2+json)
Digest: sha256:182aba30aabc7dc99ccbafbd8f4d0e1141f6f2763c38f4dedacb33a45a29f2c2
 * Contains 1 manifest references:
1    Mfst Type: application/vnd.docker.distribution.manifest.v2+json
1       Digest: sha256:bf1684a6e3676389ec861c602e97f27b03f14178e5bc3f70dce198f9f160cce9
1  Mfst Length: 528
1     Platform:
1           -      OS: linux
1           - OS Vers: 
1           - OS Feat: []
1           -    Arch: amd64
1           - Variant: 
1           - Feature: 
1     # Layers: 1
         layer 1: digest = sha256:e7c96db7181be991f19a9fb6975cdbbd73c65f4a2681348e63a141a2192a5f10

$ manifest-tool inspect alpine:3.9
Name:   alpine:3.9 (Type: application/vnd.docker.distribution.manifest.list.v2+json)
Digest: sha256:ecb3fea3e2ea5b6ecf4266e7861a21d3d1462f022a6521cb3053d26c7a0b5f14
 * Contains 7 manifest references:
1    Mfst Type: application/vnd.docker.distribution.manifest.v2+json
1       Digest: sha256:bf1684a6e3676389ec861c602e97f27b03f14178e5bc3f70dce198f9f160cce9
1  Mfst Length: 528
1     Platform:
1           -      OS: linux
1           - OS Vers: 
1           - OS Feat: []
1           -    Arch: amd64
1           - Variant: 
1           - Feature: 
1     # Layers: 1
         layer 1: digest = sha256:e7c96db7181be991f19a9fb6975cdbbd73c65f4a2681348e63a141a2192a5f10

2    Mfst Type: application/vnd.docker.distribution.manifest.v2+json
2       Digest: sha256:c4ba6347b0e4258ce6a6de2401619316f982b7bcc529f73d2a410d0097730204
2  Mfst Length: 528
2     Platform:
2           -      OS: linux
2           - OS Vers: 
2           - OS Feat: []
2           -    Arch: arm
2           - Variant: v6
2           - Feature: 
2     # Layers: 1
         layer 1: digest = sha256:9d34ec1d9f3e63864b68d564a237efd2e3778f39a85961f7bdcb3937084070e1

3    Mfst Type: application/vnd.docker.distribution.manifest.v2+json
3       Digest: sha256:7b7521cf1e23b0e1756c68a946b255d0619266767b7d62bf7fe7c8618e0a9a17
3  Mfst Length: 528
3     Platform:
3           -      OS: linux
3           - OS Vers: 
3           - OS Feat: []
3           -    Arch: arm
3           - Variant: v7
3           - Feature: 
3     # Layers: 1
         layer 1: digest = sha256:c2a5cdd4aa08146b4516cc95f6b461f2994250a819b3e6f75f23fa2a8c1b1744

4    Mfst Type: application/vnd.docker.distribution.manifest.v2+json
4       Digest: sha256:bc6e6ad08312deb806ff4bf805c2e24f422859ff3f2082b68336e9b983fbc2f7
4  Mfst Length: 528
4     Platform:
4           -      OS: linux
4           - OS Vers: 
4           - OS Feat: []
4           -    Arch: arm64
4           - Variant: v8
4           - Feature: 
4     # Layers: 1
         layer 1: digest = sha256:6f37394be673296a0fdc21b819c5df40431baf7d3af121bee451726dd1457493

5    Mfst Type: application/vnd.docker.distribution.manifest.v2+json
5       Digest: sha256:ffb8eeffb932b5f92601b9952d8881cfeccc81e328b16e3dbf41ec78b0fc0e7d
5  Mfst Length: 528
5     Platform:
5           -      OS: linux
5           - OS Vers: 
5           - OS Feat: []
5           -    Arch: 386
5           - Variant: 
5           - Feature: 
5     # Layers: 1
         layer 1: digest = sha256:9a81e6a1a3b4f174d22173a96692c9aeffaefcd00f40607d508951a2b14d6f1f

6    Mfst Type: application/vnd.docker.distribution.manifest.v2+json
6       Digest: sha256:ca8b1210e89642b693c17c123bd2bd2c3bcac3a2fb8e92d5f0490f7bf54fbc10
6  Mfst Length: 528
6     Platform:
6           -      OS: linux
6           - OS Vers: 
6           - OS Feat: []
6           -    Arch: ppc64le
6           - Variant: 
6           - Feature: 
6     # Layers: 1
         layer 1: digest = sha256:fe0f92a92ee06f38abf50fefd22331ac42262e3872ecd2d7ddfa7c24ab71a53a

7    Mfst Type: application/vnd.docker.distribution.manifest.v2+json
7       Digest: sha256:888079d28c835cd15087b9d8ba745ac0b60aa0a2601f9e2a4d790b443f8316c1
7  Mfst Length: 528
7     Platform:
7           -      OS: linux
7           - OS Vers: 
7           - OS Feat: []
7           -    Arch: s390x
7           - Variant: 
7           - Feature: 
7     # Layers: 1
         layer 1: digest = sha256:5b51e37a522c2e7cd3c67e8a3e5500b45189ea6698e9fdaed7f5d48282326633

$ manifest-tool inspect alpine:3.9.3
Name:   alpine:3.9.3 (Type: application/vnd.docker.distribution.manifest.list.v2+json)
Digest: sha256:28ef97b8686a0b5399129e9b763d5b7e5ff03576aa5580d6f4182a49c5fe1913
 * Contains 7 manifest references:
1    Mfst Type: application/vnd.docker.distribution.manifest.v2+json
1       Digest: sha256:5c40b3c27b9f13c873fefb2139765c56ce97fd50230f1f2d5c91e55dec171907
1  Mfst Length: 528
1     Platform:
1           -      OS: linux
1           - OS Vers: 
1           - OS Feat: []
1           -    Arch: amd64
1           - Variant: 
1           - Feature: 
1     # Layers: 1
         layer 1: digest = sha256:bdf0201b3a056acc4d6062cc88cd8a4ad5979983bfb640f15a145e09ed985f92

2    Mfst Type: application/vnd.docker.distribution.manifest.v2+json
2       Digest: sha256:c4ba6347b0e4258ce6a6de2401619316f982b7bcc529f73d2a410d0097730204
2  Mfst Length: 528
2     Platform:
2           -      OS: linux
2           - OS Vers: 
2           - OS Feat: []
2           -    Arch: arm
2           - Variant: v6
2           - Feature: 
2     # Layers: 1
         layer 1: digest = sha256:9d34ec1d9f3e63864b68d564a237efd2e3778f39a85961f7bdcb3937084070e1

3    Mfst Type: application/vnd.docker.distribution.manifest.v2+json
3       Digest: sha256:7b7521cf1e23b0e1756c68a946b255d0619266767b7d62bf7fe7c8618e0a9a17
3  Mfst Length: 528
3     Platform:
3           -      OS: linux
3           - OS Vers: 
3           - OS Feat: []
3           -    Arch: arm
3           - Variant: v7
3           - Feature: 
3     # Layers: 1
         layer 1: digest = sha256:c2a5cdd4aa08146b4516cc95f6b461f2994250a819b3e6f75f23fa2a8c1b1744

4    Mfst Type: application/vnd.docker.distribution.manifest.v2+json
4       Digest: sha256:bc6e6ad08312deb806ff4bf805c2e24f422859ff3f2082b68336e9b983fbc2f7
4  Mfst Length: 528
4     Platform:
4           -      OS: linux
4           - OS Vers: 
4           - OS Feat: []
4           -    Arch: arm64
4           - Variant: v8
4           - Feature: 
4     # Layers: 1
         layer 1: digest = sha256:6f37394be673296a0fdc21b819c5df40431baf7d3af121bee451726dd1457493

5    Mfst Type: application/vnd.docker.distribution.manifest.v2+json
5       Digest: sha256:ffb8eeffb932b5f92601b9952d8881cfeccc81e328b16e3dbf41ec78b0fc0e7d
5  Mfst Length: 528
5     Platform:
5           -      OS: linux
5           - OS Vers: 
5           - OS Feat: []
5           -    Arch: 386
5           - Variant: 
5           - Feature: 
5     # Layers: 1
         layer 1: digest = sha256:9a81e6a1a3b4f174d22173a96692c9aeffaefcd00f40607d508951a2b14d6f1f

6    Mfst Type: application/vnd.docker.distribution.manifest.v2+json
6       Digest: sha256:ca8b1210e89642b693c17c123bd2bd2c3bcac3a2fb8e92d5f0490f7bf54fbc10
6  Mfst Length: 528
6     Platform:
6           -      OS: linux
6           - OS Vers: 
6           - OS Feat: []
6           -    Arch: ppc64le
6           - Variant: 
6           - Feature: 
6     # Layers: 1
         layer 1: digest = sha256:fe0f92a92ee06f38abf50fefd22331ac42262e3872ecd2d7ddfa7c24ab71a53a

7    Mfst Type: application/vnd.docker.distribution.manifest.v2+json
7       Digest: sha256:888079d28c835cd15087b9d8ba745ac0b60aa0a2601f9e2a4d790b443f8316c1
7  Mfst Length: 528
7     Platform:
7           -      OS: linux
7           - OS Vers: 
7           - OS Feat: []
7           -    Arch: s390x
7           - Variant: 
7           - Feature: 
7     # Layers: 1
         layer 1: digest = sha256:5b51e37a522c2e7cd3c67e8a3e5500b45189ea6698e9fdaed7f5d48282326633

Or, more clearly: 🎉

$ diff -u <(manifest-tool inspect alpine:3.9) <(manifest-tool inspect alpine:3.9.3)
--- /dev/fd/63	2019-05-10 17:35:43.032489978 -0700
+++ /dev/fd/62	2019-05-10 17:35:43.032489978 -0700
@@ -1,8 +1,8 @@
-Name:   alpine:3.9 (Type: application/vnd.docker.distribution.manifest.list.v2+json)
-Digest: sha256:ecb3fea3e2ea5b6ecf4266e7861a21d3d1462f022a6521cb3053d26c7a0b5f14
+Name:   alpine:3.9.3 (Type: application/vnd.docker.distribution.manifest.list.v2+json)
+Digest: sha256:28ef97b8686a0b5399129e9b763d5b7e5ff03576aa5580d6f4182a49c5fe1913
  * Contains 7 manifest references:
 1    Mfst Type: application/vnd.docker.distribution.manifest.v2+json
-1       Digest: sha256:bf1684a6e3676389ec861c602e97f27b03f14178e5bc3f70dce198f9f160cce9
+1       Digest: sha256:5c40b3c27b9f13c873fefb2139765c56ce97fd50230f1f2d5c91e55dec171907
 1  Mfst Length: 528
 1     Platform:
 1           -      OS: linux
@@ -12,7 +12,7 @@
 1           - Variant: 
 1           - Feature: 
 1     # Layers: 1
-         layer 1: digest = sha256:e7c96db7181be991f19a9fb6975cdbbd73c65f4a2681348e63a141a2192a5f10
+         layer 1: digest = sha256:bdf0201b3a056acc4d6062cc88cd8a4ad5979983bfb640f15a145e09ed985f92
 
 2    Mfst Type: application/vnd.docker.distribution.manifest.v2+json
 2       Digest: sha256:c4ba6347b0e4258ce6a6de2401619316f982b7bcc529f73d2a410d0097730204
