Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

goroutine fatal when pushing big amount of images #154

Closed
glensc opened this issue May 4, 2018 · 17 comments
Closed

goroutine fatal when pushing big amount of images #154

glensc opened this issue May 4, 2018 · 17 comments

Comments

@glensc
Copy link
Collaborator

glensc commented May 4, 2018

in similar context as #153, but different scenario, got goroutine fatal errors when pushing ~400 images to registry.

the registry:5000 is insecure registry confured via env: INSECURE_REGISTRY_EX=registry:5000

worked fine with 1 image with same configuration.

$ lstags '--docker-json=/root/.docker/config.json' '--yaml-config=lstags.yml' --push '--push-prefix=/' '--push-registry=registry:5000'
INFO[0000] BATCH 1 of 13
fatal error: concurrent map writes

goroutine 1346 [running]:
runtime.throw(0x7f16c1, 0x15)
    /home/travis/.gimme/versions/go1.10.linux.amd64/src/runtime/panic.go:619 +0x81 fp=0xc4203d5d60 sp=0xc4203d5d40 pc=0x429951
runtime.mapassign_faststr(0x77a200, 0xc42008b650, 0xc42030a1b0, 0x2f, 0xc420f91d40)
    /home/travis/.gimme/versions/go1.10.linux.amd64/src/runtime/hashmap_fast.go:703 +0x3e9 fp=0xc4203d5dd0 sp=0xc4203d5d60 pc=0x40bb39
github.com/ivanilves/lstags/api/v1.(*API).CollectTags.func1(0xc4202c6320, 0xc42008b650, 0xc420d9a9c0, 0xc420e339e0)
    /home/travis/gopath/src/github.com/ivanilves/lstags/api/v1/v1.go:143 +0x609 fp=0xc4203d5fc0 sp=0xc4203d5dd0 pc=0x6f0709
runtime.goexit()
    /home/travis/.gimme/versions/go1.10.linux.amd64/src/runtime/asm_amd64.s:2361 +0x1 fp=0xc4203d5fc8 sp=0xc4203d5fc0 pc=0x455791
created by github.com/ivanilves/lstags/api/v1.(*API).CollectTags
    /home/travis/gopath/src/github.com/ivanilves/lstags/api/v1/v1.go:120 +0x24f

More complete output:
push_b_1a.log

@glensc
Copy link
Collaborator Author

glensc commented May 4, 2018

workarounded with: --concurrent-requests=1

@ivanilves
Copy link
Owner

OK, I got it, this looks to be an issue with improper goroutine isolation.

... or maybe ... default concurrency limits being too high (still we should not have such bogus error messages)

@ivanilves
Copy link
Owner

Hey @glensc could you please share this lstags.yml with big number of images with me?

One thing also I want to investigate - we have no deduplication of passed Docker repos,
so if your file contained some duplicates, this would probably also contribute to this case.

@glensc
Copy link
Collaborator Author

glensc commented May 5, 2018

no, i can't share it for obvious reasons :) it has unique 415 images, all specified with tag (no regexps) from same gitlab docker registry using https. what other characteristics you need to know?

the config was generated with gitlab-registry-usage

@ivanilves
Copy link
Owner

no, i can't share it for obvious reasons :) it has unique 415 images, all specified with tag (no regexps) from same gitlab docker registry using https. what other characteristics you need to know?

OK! Well, the interesting part is: Are there any duplicates among image specifications?

@glensc
Copy link
Collaborator Author

glensc commented May 5, 2018

no duplicates. btw, shared my scripts: sciapp/gitlab-registry-usage#3

@ivanilves
Copy link
Owner

Great. Thank you!

Meanwhile I will continue to develop "improper goroutine isolation" fix.

@glensc
Copy link
Collaborator Author

glensc commented May 5, 2018

here's one more, even with concurrency 1!

3,6M push_b_7.log

@ivanilves
Copy link
Owner

this is supposed to fix it #158 (but I need a review from someone before merge)

@vonrabbe
Copy link
Collaborator

vonrabbe commented May 6, 2018

merged!

https://github.com/ivanilves/lstags/releases/tag/v1.0.66 - let's promote it to stable if @glensc confirms concurrency issue is solved

@glensc
Copy link
Collaborator Author

glensc commented May 10, 2018

not solved. will post complete log in few hours

$ lstags '--docker-json=/root/.docker/config.json' '--yaml-config=lstags.yml' --push --push-update '--push-prefix=/' '--push-registry=registry:5000'

fatal error: concurrent map writes

goroutine 13713 [running]:
runtime.throw(0x7f17e1, 0x15)
        /home/travis/.gimme/versions/go1.10.linux.amd64/src/runtime/panic.go:619 +0x81 fp=0xc427f39b78 sp=0xc427f39b58 pc=0x429951
runtime.mapassign_faststr(0x77a280, 0xc425e9e4e0, 0xc4202e4660, 0x25, 0x2)
        /home/travis/.gimme/versions/go1.10.linux.amd64/src/runtime/hashmap_fast.go:703 +0x3e9 fp=0xc427f39be8 sp=0xc427f39b78 pc=0x40bb39
github.com/ivanilves/lstags/api/v1.(*API).CollectPushTags.func1(0xc420305a80, 0x19f, 0x19f, 0x7fffde19cea0, 0x1, 0x7fffde19ceb2, 0xd, 0x1, 0xc4202ae2d0, 0xc425e9e480, ...)
        /home/travis/gopath/src/github.com/ivanilves/lstags/api/v1/v1.go:277 +0xc88 fp=0xc427f39f70 sp=0xc427f39be8 pc=0x6f1848
runtime.goexit()
        /home/travis/.gimme/versions/go1.10.linux.amd64/src/runtime/asm_amd64.s:2361 +0x1 fp=0xc427f39f78 sp=0xc427f39f70 pc=0x455791
created by github.com/ivanilves/lstags/api/v1.(*API).CollectPushTags
        /home/travis/gopath/src/github.com/ivanilves/lstags/api/v1/v1.go:226 +0x439

btw, does that commandline also pull images from source repo?
i've been running this command before above "push command":

$ lstags '--docker-json=/root/.docker/config.json' '--yaml-config=lstags.yml' --pull

@ivanilves
Copy link
Owner

ivanilves commented May 10, 2018

Yes! But it went from v1.go:143 to v1.go:277, which means the remedy applied is probably working and we need re-apply it to other part of code. 💪

Please see the proposed fix in #160

btw, does that commandline also pull images from source repo? i've been running this command before above "push command":

Yes it does! 😉 Maybe we need to highlight this in README? 😇

P.S.:
Also updated #143 to hunt down all other map writes in goroutines (if any?) ASAP.

@vonrabbe
Copy link
Collaborator

vonrabbe commented May 11, 2018

@glensc we released https://github.com/ivanilves/lstags/releases/tag/v1.0.67 could you please try it? 🙏

@ivanilves our current CI build runs low intensity tests, which is not optimal for detection of concurrency-related errors. We need to add a stress test to our CI build, so we will detect these errors before users do. WDYT?

@vonrabbe
Copy link
Collaborator

Anyway: #162

@ivanilves ivanilves self-assigned this May 12, 2018
@glensc
Copy link
Collaborator Author

glensc commented May 13, 2018

downloaded 1.0.67, either it works or still goroutine errors. can't say :)

@ivanilves
Copy link
Owner

downloaded 1.0.67, either it works or still goroutine errors. can't say :)

OK, so it probably works, but you need to run it more times to get representative statistics? 🙂

@glensc
Copy link
Collaborator Author

glensc commented May 14, 2018

seems working now. i'll create new issue if problem occurs.

@glensc glensc closed this as completed May 14, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants