Attempt to prevent the deadlock in the QueueDiskChannel Test again #18415

Merged: 9 commits, Jan 29, 2022

zeripath
Contributor

This time we're going to adjust the pause tests to only test the right
flag.

Signed-off-by: Andrew Thornton <art27@cantab.net>

@zeripath zeripath added type/testing skip-changelog This PR is irrelevant for the (next) changelog, for example bug fixes for unreleased features. labels Jan 26, 2022
@zeripath zeripath added this to the 1.17.0 milestone Jan 26, 2022
@codecov-commenter

codecov-commenter commented Jan 26, 2022

Codecov Report

Merging #18415 (d41608c) into main (3349fd8) will decrease coverage by 0.02%.
The diff coverage is 40.22%.

❗ Current head d41608c differs from pull request most recent head 7b45f74. Consider uploading reports for the commit 7b45f74 to get more accurate results

@@            Coverage Diff             @@
##             main   #18415      +/-   ##
==========================================
- Coverage   46.03%   46.01%   -0.03%     
==========================================
  Files         840      842       +2     
  Lines       92856    93192     +336     
==========================================
+ Hits        42746    42881     +135     
- Misses      43323    43509     +186     
- Partials     6787     6802      +15     
| Impacted Files | Coverage Δ |
|---|---|
| cmd/restore_repo.go | `0.00% <0.00%> (ø)` |
| models/auth/twofactor.go | `20.89% <0.00%> (-0.65%)` ⬇️ |
| models/db/engine.go | `36.13% <ø> (ø)` |
| models/issue_milestone.go | `72.11% <ø> (+1.76%)` ⬆️ |
| modules/generate/generate.go | `0.00% <0.00%> (ø)` |
| modules/indexer/code/elastic_search.go | `1.36% <0.00%> (-0.25%)` ⬇️ |
| modules/migration/issue.go | `100.00% <ø> (ø)` |
| modules/private/restore_repo.go | `0.00% <0.00%> (ø)` |
| modules/queue/queue.go | `37.09% <0.00%> (ø)` |
| modules/queue/setting.go | `24.32% <0.00%> (ø)` |
| ... and 61 more | |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 726715f...7b45f74. Read the comment docs.

@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Jan 26, 2022
@GiteaBot GiteaBot added lgtm/need 1 This PR needs approval from one additional maintainer to be merged. and removed lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. labels Jan 26, 2022
@GiteaBot GiteaBot added lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. and removed lgtm/need 1 This PR needs approval from one additional maintainer to be merged. labels Jan 27, 2022
@wxiaoguang
Contributor

A new failure:

2022/01/27 03:12:30 .../queue/workerpool.go:121:zeroBoost() [W] WorkerPool: 1 (for TestChannelQueue) has zero workers - adding 5 temporary workers for 5m0s
2022/01/27 03:12:30 ...eue/queue_channel.go:150:func1() [W] ChannelQueue: first-channel Terminated before completed flushing
2022/01/27 03:12:30 .../queue/workerpool.go:121:zeroBoost() [W] WorkerPool: 8 (for second-level) has zero workers - adding 1 temporary workers for 5m0s
2022/01/27 03:12:30 ...ueue_disk_channel.go:183:func3() [W] LevelQueue: second-level shut down before completely flushed
2022/01/27 03:12:30 ...disk_channel_test.go:207:func1() [I] pausing
2022/01/27 03:12:31 ...eue/queue_channel.go:150:func1() [W] ChannelQueue: first-channel Terminated before completed flushing
--- FAIL: TestPersistableChannelQueue_Pause (0.75s)
    queue_disk_channel_test.go:449: 
        	Error Trace:	queue_disk_channel_test.go:449
        	Error:      	Handler processing should have stopped
        	Test:       	TestPersistableChannelQueue_Pause
2022/01/27 03:12:31 .../queue/workerpool.go:121:zeroBoost() [W] WorkerPool: 14 (for second-level) has zero workers - adding 1 temporary workers for 5m0s
2022/01/27 03:12:31 ...disk_channel_test.go:207:func1() [I] pausing
2022/01/27 03:12:31 .../queue/workerpool.go:121:zeroBoost() [W] WorkerPool: 18 (for TestChannelQueue) has zero workers - adding 5 temporary workers for 5m0s
FAIL

…g else

Signed-off-by: Andrew Thornton <art27@cantab.net>
Signed-off-by: Andrew Thornton <art27@cantab.net>
Signed-off-by: Andrew Thornton <art27@cantab.net>
@zeripath
Contributor Author

I think this might be sorted now.

@zeripath
Contributor Author

Argh! I've still not fixed this!

@zeripath
Contributor Author

zeripath commented Jan 29, 2022

Ah, this is another test problem in qct_pause. I've copied over the same ideas I put into queue_disk_channel_test.go.

Signed-off-by: Andrew Thornton <art27@cantab.net>
@zeripath zeripath merged commit 92b715e into go-gitea:main Jan 29, 2022
@zeripath zeripath deleted the prevent-race-in-qdc-test branch January 29, 2022 11:37
zeripath added a commit to zeripath/gitea that referenced this pull request Feb 5, 2022
…shed (go-gitea#18593)

Backport go-gitea#18593

There is a possible race whereby a worker pool could be cancelled while the
underlying queue is still not empty. This leads to flush-all cycling because it
cannot empty the pool.

* On shutdown of Persistent Channel Queues close datachan and empty

Partial Backport go-gitea#18415

Although we attempt to empty the datachan in queues - due to
races we are better off just closing the channel and forcibly emptying
it in shutdown.

Fix go-gitea#18618

Signed-off-by: Andrew Thornton <art27@cantab.net>
lunny pushed a commit that referenced this pull request Feb 6, 2022
…shed (#18593) (#18620)

* Only attempt to flush queue if the underlying worker pool is not finished (#18593)

Backport #18593

There is a possible race whereby a worker pool could be cancelled while the
underlying queue is still not empty. This leads to flush-all cycling because it
cannot empty the pool.

* On shutdown of Persistent Channel Queues close datachan and empty

Partial Backport #18415

Although we attempt to empty the datachan in queues - due to
races we are better off just closing the channel and forcibly emptying
it in shutdown.

Fix #18618

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Move zero workers warning to debug

Fix #18617

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Update modules/queue/manager.go

Co-authored-by: Gusted <williamzijl7@hotmail.com>

* Update modules/queue/manager.go

Co-authored-by: Gusted <williamzijl7@hotmail.com>

Co-authored-by: Gusted <williamzijl7@hotmail.com>
zeripath added a commit to zeripath/gitea that referenced this pull request Feb 16, 2022
…ea#18415)

Partial Backport of go-gitea#18415

Instead of using an asynchronous goroutine to push to disk on shutdown
just close the datachan and immediately push to the disk.

Prevents messages of incompletely flushed queues.

Signed-off-by: Andrew Thornton <art27@cantab.net>
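
As a rough illustration of the shutdown approach in this commit message, the sketch below closes the data channel and pushes whatever is still buffered straight to a store, in the calling goroutine. It is not the Gitea implementation: `persister`, `memStore`, and `shutdownDrain` are hypothetical names, and the in-memory store merely stands in for the on-disk level queue.

```go
package main

// persister is a hypothetical stand-in for the on-disk (level) queue.
type persister interface {
	Push(item string) error
}

// memStore is a toy persister that keeps items in memory, for illustration only.
type memStore struct{ items []string }

func (m *memStore) Push(item string) error {
	m.items = append(m.items, item)
	return nil
}

// shutdownDrain closes dataChan and synchronously moves every item that is
// still buffered into store, returning how many items were moved.
// It must only be called after all producers have stopped, because sending
// on a closed channel panics.
func shutdownDrain(dataChan chan string, store persister) (int, error) {
	close(dataChan)

	moved := 0
	// Ranging over a closed channel yields the buffered items, then ends.
	for item := range dataChan {
		if err := store.Push(item); err != nil {
			return moved, err
		}
		moved++
	}
	return moved, nil
}
```

Because the drain runs in the caller rather than in a separate goroutine, it has finished by the time shutdown returns, so there is no window in which the process can exit with items still in the channel.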
lunny added a commit that referenced this pull request Feb 22, 2022
#18788)

Partial Backport of #18415

Instead of using an asynchronous goroutine to push to disk on shutdown
just close the datachan and immediately push to the disk.

Prevents messages of incompletely flushed queues.

Signed-off-by: Andrew Thornton <art27@cantab.net>

Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Chianina pushed a commit to Chianina/gitea that referenced this pull request Mar 28, 2022
…o-gitea#18415)

* Attempt to prevent the deadlock in the QueueDiskChannel Test again

This time we're going to adjust the pause tests to only test the right
flag.

* Only switch off pushback once we know that we are not pushing anything else
* Ensure full redirection occurs
* More nicely handle a closed datachan
* And handle similar problems in queue_channel_test

Signed-off-by: Andrew Thornton <art27@cantab.net>
@go-gitea go-gitea locked and limited conversation to collaborators Apr 28, 2022