fixes EventDebouncer not producing events #998

Open
wants to merge 7 commits into
base: master
from

Conversation

ivg
Contributor

@ivg ivg commented Aug 7, 2023

Resolves #997

The problem is that we can only exit the loop on a timeout, and the timeout never fires if the condition variable is triggered every debounce_interval_seconds or more often. The solution is simple: add a monotonic timer.
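To illustrate the idea, here is a minimal sketch of a debouncer whose collection window is bounded by a monotonic deadline. It is not the code from this PR and not watchdog's `EventDebouncer`: the class `DeadlineDebouncer` and all of its members are hypothetical names chosen for the example, and flushing a fixed interval after the first event of a batch is an assumption about the desired behaviour.

```python
import threading
import time


class DeadlineDebouncer:
    """Hypothetical stand-in for a debouncer; not watchdog's EventDebouncer."""

    def __init__(self, interval, callback):
        self.interval = interval
        self.callback = callback
        self._events = []
        self._cond = threading.Condition()
        self._stopped = False
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def handle_event(self, event):
        with self._cond:
            self._events.append(event)
            self._cond.notify()

    def stop(self):
        with self._cond:
            self._stopped = True
            self._cond.notify()
        self._thread.join()

    def _run(self):
        with self._cond:
            while True:
                # Block until the first event of a batch (or a stop request).
                while not self._events and not self._stopped:
                    self._cond.wait()
                # Keep collecting, but never past the monotonic deadline, so
                # a steady stream of notifications cannot postpone the flush.
                deadline = time.monotonic() + self.interval
                while not self._stopped:
                    remaining = deadline - time.monotonic()
                    if remaining <= 0:
                        break
                    self._cond.wait(timeout=remaining)
                batch, self._events = self._events, []
                if batch:
                    self.callback(batch)
                if self._stopped:
                    return
```

With a producer that calls `handle_event` more often than `interval`, a loop that only exits when `wait()` times out would never flush; here the deadline forces a flush roughly `interval` seconds after the first event of each batch. Note that the callback runs while the lock is held, which keeps the sketch short but blocks producers for the duration of the callback.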

Update:
Resolves: #999
Resolves: #1000

@ivg
Contributor Author

ivg commented Aug 8, 2023

Note that the test failure is unrelated to this change: the failing test doesn't involve the debouncer, and the next test, which does use the debouncer, passes.

Makes it more readable and fixes a few issues, see gorakhargosh#999 and gorakhargosh#1000
@ivg
Contributor Author

ivg commented Aug 10, 2023

I've looked at the failing test, and what looks suspicious is a race between the process watcher, which is enabled when restart_on_command_exit is set (it defaults to True), and the restart on an event, which is what this test actually checks. That is, once an event restarts the process, we kill it, and the process watcher immediately kicks in and tries to restart it as well. This is just a hypothesis; I can create a separate PR with a fix to the test or to the trick.

Edit: see #1002, let's give it a shot!

ivg added a commit to ivg/watchdog that referenced this pull request Aug 10, 2023
Just a long shot for a failure observed on gorakhargosh#998. My hypothesis is that
when we stop ProcessWatcher before we restart the process manually, we
don't yield to it and immediately kill the process. Next, when the
ProcessWatcher thread is woken up, we have two conditions ready - the
popen_obj and stopped_event, see the corresponding code,

```
        while True:
            if self.popen_obj.poll() is not None:
                break
            if self.stopped_event.wait(timeout=0.1):
                return
```

And despite `stopped_event` being set, we first check
`popen_obj` and trigger the process restart.

We can also make the ProcessWatcher logic more robust, by checking if
we are stopped before calling the termination callback, e.g.,

```
        try:
            if not self.stopped_event.is_set():
                self.process_termination_callback()
        except Exception:
            logger.exception("Error calling process termination callback")
```

I am not 100% sure about that, as I don't really know what semantics
are expected from ProcessWatcher by other users. But at least the
AutoRestarter expects these semantics - i.e., a watcher shall not fire
any events after it was stopped.
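The ordering problem described in this commit message can be reproduced with a small self-contained sketch. The class `WatcherSketch` below is a hypothetical, simplified stand-in and not watchdog's `ProcessWatcher`; only the attribute names `popen_obj`, `stopped_event` and `process_termination_callback` are borrowed from the snippets above, and the rest is assumed for illustration.

```python
import logging
import subprocess
import sys
import threading

logger = logging.getLogger(__name__)


class WatcherSketch(threading.Thread):
    """Hypothetical, simplified watcher; not watchdog's ProcessWatcher."""

    def __init__(self, popen_obj, process_termination_callback):
        super().__init__(daemon=True)
        self.popen_obj = popen_obj
        self.process_termination_callback = process_termination_callback
        self.stopped_event = threading.Event()

    def stop(self):
        self.stopped_event.set()

    def run(self):
        while True:
            # The exit check comes first, so it wins even if stop() was
            # already called by the time this thread gets to run.
            if self.popen_obj.poll() is not None:
                break
            if self.stopped_event.wait(timeout=0.1):
                return
        try:
            # Guard: a stopped watcher must not report the process exit.
            if not self.stopped_event.is_set():
                self.process_termination_callback()
        except Exception:
            logger.exception("Error calling process termination callback")


if __name__ == "__main__":
    calls = []
    proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    watcher = WatcherSketch(proc, lambda: calls.append("exited"))
    # Stop the watcher and kill the process before the watcher ever runs,
    # so both conditions are ready when its loop starts.
    watcher.stop()
    proc.kill()
    proc.wait()
    watcher.start()
    watcher.join()
    print(calls)
```

Without the `is_set()` guard, the loop would break on the `poll()` check and the callback would still fire after `stop()`, which is the double restart suspected in the failing test.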
BoboTiG pushed a commit that referenced this pull request Jul 28, 2024
* fixes a possible race condition in AutoRestartTrick

* tries an alternative solution

i.e., don't send events if stopped
@BoboTiG

This comment was marked as resolved.

@BoboTiG
Collaborator

BoboTiG commented Jul 28, 2024

@ivg can you take a look at the Linux failures? They are related to your changes.
