Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent deferred queue from growing #4017

Merged
merged 3 commits into from
Jul 1, 2024

Conversation

rouming
Copy link
Contributor

@rouming rouming commented Jun 28, 2024

This PR addresses an issue related to deferred queue grow.

  1. The first patch reverts a commit (62b00e2) that attempted to use the URL argument as a key for deferred requests. The revert is necessary because the original commit introduced more problems than it solved, particularly with the RemoveDeferred() function, which was not covered at all. A better solution for handling deferred requests will follow.

  2. The second patch introduces separate deferred contexts for LOC and controller requests. This prevents conflicts where deferred items for different URLs could override each other, ensuring that requests to both destinations are reached.

  3. The third patch fixes the issue of a growing deferred queue by ensuring proper merging of deferred requests. Previously, new requests added during queue processing were not properly merged, leading to queue growth (reported by @milan-zededa). The patch ensures that new requests are merged by key, maintaining the queue size.

Ps. PRs to stable branches will follow in a couple of days.

cc: @milan-zededa

@rouming rouming requested a review from milan-zededa as a code owner June 28, 2024 17:29
@rouming rouming added bug Something isn't working stable Should be backported to stable release(s) labels Jun 28, 2024
Copy link
Contributor

@eriknordmark eriknordmark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@eriknordmark
Copy link
Contributor

Why doesn't the build workflow run for this PR?

rouming added 3 commits July 1, 2024 10:00
…new deferred"

This reverts commit 62b00e2.

Actually original commit introduces more potential problems, for example
`RemoveDeferred()` call was not covered at all. In order to fix
deferred removal as well more ugly tweaks have to be made.

But there is a better way! A proper fix will follow. Stay tuned.

Signed-off-by: Roman Penyaev <r.peniaev@gmail.com>
…C reqs

LOC has different URL, so controller and LOC deferred requests can't be
placed into the same queue due to the same requests keys. Having two
destinations (controller + LOC), the second sender overrides a deferred
item, so first one destination will be never reached.

First attempt to fix this was made here:

   62b00e2 ("zedcloud/deferred: respect URL argument as a key on starting new deferred")

But this introduces more problems then fixes them (for example
`RemoveDeferred()` was not covered at all).

This is the proper way to deal with different destinations.

Signed-off-by: Roman Penyaev <r.peniaev@gmail.com>
This fixes the growing deferred queue issue caused by missing
request merges. Commit 9f2c9ad ("zedcloud/deferred: don't
take lock while processing the deferred queue") relaxes locks
and makes it possible to add new requests while queue processing
waits in send. New requests should land in the same `deferredItems`
list, which should be processed later at the end of the
`handleDeferred()` operation. The bug lies in the actual merging
of two queues: the one that has not been completed due to a possible
error and the other queue, which was populated with new requests
during the send.

Proper queue merging is not just concatenation of queues (which
would cause the resulting queue to always grow), but involves item
replacement by key, so the queue size stays the same, with the item
being replaced.

Signed-off-by: Roman Penyaev <r.peniaev@gmail.com>
Fixes: 9f2c9ad ("zedcloud/deferred: don't take lock while processing the deferred queue")
Reported-by: Milan Lenco <milan@zededa.com>
@uncleDecart
Copy link
Member

LGTM

Copy link
Contributor

@milan-zededa milan-zededa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-run tests (commenting apparently cancels ongoing runs)

@milan-zededa
Copy link
Contributor

milan-zededa commented Jul 1, 2024

Unfortunately, we run out of docker pulls for the day so tests are mostly failing now.
However, we would like to make a new EVE release and have this patch included, so I propose to merge this.

@milan-zededa milan-zededa merged commit 4582efb into lf-edge:master Jul 1, 2024
47 of 59 checks passed
@uncleDecart
Copy link
Member

@milan-zededa do you want to release Eden with docker login? It should reduce this timeouts occurrence ?

@milan-zededa
Copy link
Contributor

@milan-zededa do you want to release Eden with docker login? It should reduce this timeouts occurrence ?

Yes, let's do it. Are you able to set docker login/password in the eve repository?

@uncleDecart
Copy link
Member

It's already there, we have it for publishing. The only problem would be that most likely testing time will increase because we will be running on specified Pinned runners

@uncleDecart
Copy link
Member

@milan-zededa #4041

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working stable Should be backported to stable release(s)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants