Registry failed on storage health check - s3aws.Stat("/") #4361

Open
maka86 opened this issue May 21, 2024 · 4 comments

Comments

@maka86

maka86 commented May 21, 2024

Description

We had an outage on our existing Docker registry server (2.8.3), so we built a new registry:2.8.3 instance on EC2 with the same settings. However, we hit the following issues.

When the container starts, it passes the ALB health check, then fails on s3aws.Stat("/").

Log from container startup, where the ELB health check passes (check path registry:5000/v2/):

10.21.11.226 - - [21/May/2024:06:11:23 +0000] "GET /v2/ HTTP/1.1" 200 2 "" "ELB-HealthChecker/2.0"
time="2024-05-21T06:11:33.612124577Z" level=info msg="response completed" go.version=go1.20.8 http.request.host="localhost:5000" http.request.id=32086d61-5859-4c89-bdfb-436722cffc39 http.request.method=GET http.request.remoteaddr="127.0.0.1:47768" http.request.uri="/v2/" http.request.useragent=Wget http.response.contenttype="application/json; charset=utf-8" http.response.duration=2.09839ms http.response.status=200 http.response.written=2
time="2024-05-21T06:11:33.612046526Z" level=debug msg="authorizing request" go.version=go1.20.8 http.request.host="localhost:5000" http.request.id=32086d61-5859-4c89-bdfb-436722cffc39 http.request.method=GET http.request.remoteaddr="127.0.0.1:47768" http.request.uri="/v2/" http.request.useragent=Wget

The registry then starts failing, and the ELB health check returns 503:

time="2024-05-21T06:11:33.831893432Z" level=debug msg="s3aws.Stat("/")" go.version=go1.20.8 instance.id=e8c2d6e3-d6bf-4e88-a9cc-c1ec8d88c53a service=registry trace.duration=350.466167ms trace.file="github.com/docker/distribution/registry/storage/driver/base/base.go" trace.func="github.com/docker/distribution/registry/storage/driver/base.(*Base).Stat" trace.id=e10513a0-4a84-4296-b7c5-9144c28cf351 trace.line=155 version=2.8.3
10.21.11.226 - - [21/May/2024:06:11:38 +0000] "GET /v2/ HTTP/1.1" 503 125 "" "ELB-HealthChecker/2.0"
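
To make the 200-to-503 transition easier to spot in a longer log, the access-log lines can be filtered for the load balancer's probes. The following is a minimal sketch, assuming the access-log line layout shown above; the function name `health_check_statuses` is hypothetical and not part of the registry.

```python
import re

# Matches the access-log lines shown above, e.g.
# 10.21.11.226 - - [21/May/2024:06:11:23 +0000] "GET /v2/ HTTP/1.1" 200 2 "" "ELB-HealthChecker/2.0"
ACCESS_LINE = re.compile(
    r'^(?P<client>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+)'
)


def health_check_statuses(lines):
    """Return (timestamp, status) pairs for ELB health-check requests.

    Lines that are not in the access-log layout (for example the
    registry's own key=value log lines) are skipped.
    """
    results = []
    for line in lines:
        match = ACCESS_LINE.match(line)
        if match is None or "ELB-HealthChecker" not in line:
            continue
        results.append((match.group("time"), int(match.group("status"))))
    return results
```

Feeding it the container log (for example the lines of `docker logs docker-registry`) gives the sequence of probe results in order, which shows when the registry flipped from healthy to unhealthy.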

If we set the environment variable
REGISTRY_HEALTH_STORAGEDRIVER_ENABLED: "false"

the ELB health check passes, but requests return this error:
$ curl -X GET https://registry.sanbox.internal.com:5000/v2/images/node/tags/list
{"errors":[{"code":"UNKNOWN","message":"unknown error","detail":{"DriverName":"s3aws","Enclosed":{}}}]}

Reproduce

1: Start the container without REGISTRY_HEALTH_STORAGEDRIVER_ENABLED: "false". The registry service fails on s3aws.Stat("/").
2: Start the container with REGISTRY_HEALTH_STORAGEDRIVER_ENABLED: "false". The registry service returns:
{"errors":[{"code":"UNKNOWN","message":"unknown error","detail":{"DriverName":"s3aws","Enclosed":{}}}]}

Expected behavior

registry:2.8.3 should run normally.

registry version

registry:2.8.3

Additional Info

The docker compose file:

registry-service:
    image: registry:2
    container_name: docker-registry
    restart: always
    ports:
      - '5000:5000'
    environment:
      REGISTRY_STORAGE: s3
      REGISTRY_STORAGE_S3_BUCKET: docker-image-repo
      REGISTRY_STORAGE_S3_REGION: ap-southeast-2
      REGISTRY_STORAGE_S3_ENCRYPT: 'true'
      SEARCH_BACKEND: sqlalchemy
      REGISTRY_HTTP_SECRET: <secret>
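
The registry documentation describes overriding configuration options with `REGISTRY_`-prefixed environment variables, where each underscore stands for one level of nesting. The sketch below shows a naive version of that mapping so the variables above can be read as config paths. The helper name `env_to_config_path` is hypothetical, and the real parser resolves names against the configuration structure, so an option whose own name contains an underscore would be split incorrectly here.

```python
def env_to_config_path(name, prefix="REGISTRY_"):
    """Return the dotted config path for a REGISTRY_* variable.

    Returns None when the variable does not carry the prefix, since the
    documented override convention only covers prefixed names.
    """
    if not name.startswith(prefix):
        return None
    parts = name[len(prefix):].split("_")
    return ".".join(part.lower() for part in parts)


# Variable names taken from the compose file and the issue text.
for var in (
    "REGISTRY_STORAGE_S3_BUCKET",
    "REGISTRY_HEALTH_STORAGEDRIVER_ENABLED",
    "SEARCH_BACKEND",
):
    print(var, "->", env_to_config_path(var))
```

Under this reading, `SEARCH_BACKEND` has no `REGISTRY_` prefix and maps to no option, so it probably has no effect on this registry image.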

After some searching, this might be caused by S3 permissions, but the execution role has full S3 access.

We have to use 2.8.3 for now to support many old images.

maka86 changed the title from Registry failed on storage health check "s3aws","Enclosed" to Registry failed on storage health check - s3aws.Stat("/") on May 21, 2024
@milosgajdos
Member

I would really appreciate it if you formatted your logs using markdown code formatting. It's really hard to parse the context from your message.

@maka86
Author

maka86 commented May 21, 2024

Hi @milosgajdos, sorry about that. I have updated the log block. Thanks.

@maka86
Author

maka86 commented May 21, 2024

@milosgajdos, I just ran a small test to prove the container can access the S3 bucket.

I ran the following SDK code inside the container to access S3:

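import boto3

# Use the container's default credential chain to open the bucket.
s3 = boto3.resource('s3')
bucket = s3.Bucket('docker-image-repo')

# Print every object key to confirm list access works.
for obj in bucket.objects.all():
    print(obj.key)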

result:

docker/registry/v2/blobs/sha256/08/084c1da10d39c6b7dc2bb41ba84771e6c0a60611ca17e493b5ad258cae7b7eb5/data
docker/registry/v2/blobs/sha256/08/084c81f08dd1abd2e5530390610303de2165ee5e4b49c4718d5168921bca76b9/data
docker/registry/v2/blobs/sha256/08/084cae691009e79856b8948aa70d437d649a39ecd1376b3ffa521b123f168aca/data
docker/registry/v2/blobs/sha256/08/084cb812b096e26793403a078ca871e6f3eec2f6f6bfe8e62d54b3f35fc84e4e/data
docker/registry/v2/blobs/sha256/08/084d0db3995e0a48bff875f01d4322a1d215bd31c84102faaec6f85bf20a9311/data
docker/registry/v2/blobs/sha256/08/084d379991bd2eaa55e2dd1143dcc9e5a1e8b5ca2a7837e5b906b402f04bc8d5/data
......
......

The container can access the S3 bucket. Does this mean the issue is in the s3aws storage driver?
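
Because the registry response only says "unknown error" with an empty "Enclosed" detail, it may help to surface the underlying S3 error code directly. The sketch below issues a single one-key list request and reports either the key count or the error code and message. It assumes, without confirmation from this thread, that the driver's Stat("/") amounts to a similar list request; the helper name `probe_bucket` is hypothetical. The client is passed in so the function works with any object exposing `list_objects_v2`, such as a boto3 S3 client.

```python
def probe_bucket(client, bucket, prefix=""):
    """Issue a one-key list request and summarize the outcome.

    `client` is expected to behave like a boto3 S3 client. On failure,
    the S3 error code and message are read from the exception's
    `response` attribute (as set by botocore's ClientError); other
    exceptions fall back to their class name and string form.
    """
    try:
        resp = client.list_objects_v2(Bucket=bucket, Prefix=prefix, MaxKeys=1)
    except Exception as exc:
        error = getattr(exc, "response", {}).get("Error", {})
        return {
            "ok": False,
            "code": error.get("Code", type(exc).__name__),
            "message": error.get("Message", str(exc)),
        }
    return {"ok": True, "key_count": resp.get("KeyCount", 0)}
```

Inside the container this could be called as `probe_bucket(boto3.client("s3", region_name="ap-southeast-2"), "docker-image-repo")`, using the bucket and region from the compose file. A result such as `AccessDenied` or a region-related code would point at configuration rather than at the driver itself.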

@milosgajdos
Member

I have a feeling this is related to #3275.

Note that v2.8.3 is essentially in maintenance mode and won't be receiving any updates besides high-severity security patches. When the stable v3 release is out, v2.x will be marked as deprecated completely.
