Event-Publisher Flush Queue on Shutdown #767

Merged 4 commits on Mar 16, 2023


@atai92 atai92 commented Jan 26, 2023

Closes #

💸 TL;DR

The Event-Publisher sidecar does not flush the entire POSIX queue when it receives a SIGINT or SIGTERM. This PR adds a signal handler that flushes the queue whenever either signal is received. Without this fix, applications that use the Event Publisher sidecar may drop messages if the sidecar is terminated while the queue still holds messages (e.g., during a deployment).
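The flush-on-signal idea described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: it uses the standard library's `queue.Queue` as a stand-in for the POSIX message queue, and `drain_queue`, `make_flush_handler`, `install_flush_handlers`, and the `publish` callback are hypothetical names chosen for the example.

```python
import queue
import signal
import sys
from typing import Callable, List


def drain_queue(
    q: "queue.Queue[bytes]",
    publish: Callable[[List[bytes]], None],
    batch_size: int = 5,
) -> int:
    """Publish every message left in q, in batches. Returns the number flushed."""
    flushed = 0
    batch: List[bytes] = []
    while True:
        try:
            # Non-blocking get: stop as soon as the queue is empty.
            batch.append(q.get_nowait())
        except queue.Empty:
            break
        if len(batch) >= batch_size:
            publish(batch)
            flushed += len(batch)
            batch = []
    if batch:
        # Publish the final, partially filled batch.
        publish(batch)
        flushed += len(batch)
    return flushed


def make_flush_handler(q, publish, exit_fn=sys.exit):
    """Build a signal handler that drains q and then exits (hypothetical helper)."""

    def handler(signum, frame):
        drain_queue(q, publish)
        exit_fn(0)

    return handler


def install_flush_handlers(q, publish):
    """Register the handler for SIGINT and SIGTERM. Must run in the main thread."""
    handler = make_flush_handler(q, publish)
    for sig in (signal.SIGINT, signal.SIGTERM):
        signal.signal(sig, handler)
```

Separating `drain_queue` from the handler keeps the flushing logic testable without sending real signals to the process.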

📜 Details

Design Doc

Jira

🧪 Testing Steps / Validation

Reproduced the issue by running the event-publisher with the following modification, which fills the event queue and then adds an event back to the queue after each polling cycle:

    # Pre-fill the queue; stop quietly if a put times out because it is already full.
    try:
        for _ in range(10):
            event_queue.put("{}", timeout=QUEUE_TIMEOUT)
    except TimedOutError:
        pass

    while True:
        message: Optional[bytes]

        try:
            message = event_queue.get(timeout=QUEUE_TIMEOUT)
            # Put a message back after every get so the queue never drains.
            event_queue.put("{}", timeout=QUEUE_TIMEOUT)
        except TimedOutError:
            message = None

        if batcher.is_ready:
            serialize_and_publish_batch(publisher, batcher)
        batcher.add(message)

This keeps the queue saturated with placeholder messages.
Then, force-closed the event-publisher using Ctrl-C (which sends SIGINT).

Afterwards, check the POSIX queue to see if there are messages left over:

>>> from baseplate.lib.message_queue import MessageQueue
>>> QUEUE_TIMEOUT = 0.2
>>> event_queue = MessageQueue(
...         "/events-" + "test",
...         max_messages=10,
...         max_message_size=8192,
...     )
>>> message = event_queue.get(timeout=QUEUE_TIMEOUT)

If the `get` call returns without timing out, we know a message was still in the queue.
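To count how many messages were left behind rather than checking for just one, the same check can be looped until the queue reports that it is empty. The sketch below is an assumption-laden illustration: `count_leftover_messages` is a hypothetical helper, and it is demonstrated with the standard library's `queue.Queue` so that it runs without a POSIX queue.

```python
import queue
from typing import Callable, Type


def count_leftover_messages(
    get: Callable[[], object],
    empty_error: Type[BaseException],
    limit: int = 1000,
) -> int:
    """Call get() until it raises empty_error; return how many messages came back.

    limit guards against looping forever if something keeps refilling the queue.
    """
    count = 0
    while count < limit:
        try:
            get()
        except empty_error:
            break
        count += 1
    return count


# Stand-in demonstration with a stdlib queue holding three leftover messages.
demo_queue: "queue.Queue[bytes]" = queue.Queue()
for _ in range(3):
    demo_queue.put(b"{}")
leftover = count_leftover_messages(demo_queue.get_nowait, queue.Empty)
```

Against the real queue, the `get` argument would presumably wrap `event_queue.get(timeout=QUEUE_TIMEOUT)` and `empty_error` would be the `TimedOutError` used in the snippet above. A result of 0 corresponds to the "flushed correctly" case.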

Screenshot 2023-01-25 at 4 14 38 PM
Note that the bad request above should not affect the test: the client error only causes the messages to be discarded due to a network error, which is not what this test is checking.

To test the solution, the same test was run with flush_queue_signal_handler() registered for SIGINT and SIGTERM. As can be seen below, trying to get a message from the queue raised a TimedOutError, indicating that the queue was empty and had been flushed correctly.
Screenshot 2023-01-25 at 4 21 13 PM

These tests were both repeated a few times.

✅ Checks

  • CI tests (if present) are passing
  • Adheres to code style for repo
  • Contributor License Agreement (CLA) completed if not a Reddit employee

@atai92 atai92 requested a review from sydjryan January 26, 2023 00:25
@atai92 atai92 requested a review from a team as a code owner January 26, 2023 00:25

@sydjryan sydjryan left a comment


this looks good, approving to quiet harold - I know we're talking in slack about when to release it

@KTAtkinson KTAtkinson merged commit 62da77f into reddit:develop Mar 16, 2023
KTAtkinson pushed a commit that referenced this pull request Mar 16, 2023
KTAtkinson added 5 commits that referenced this pull request May 15, 2023
seanrees pushed a commit to seanrees/baseplate.py that referenced this pull request May 17, 2023