Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: stopped watching of resources - avoid overflow exceptions in resource watcher OnException handle… #505

Merged
merged 2 commits into from
Jan 16, 2023

Conversation

tomasfabian
Copy link
Contributor

@tomasfabian tomasfabian commented Jan 16, 2023

This PR provides a fix for the resource watcher's exp. backoff reconnection logic. After reaching the bellow explained retry attempts threshold the reconciliation of the watched resources halts.

…r #482

Description:
The _reconnectHandler publisher (IObserver<TimeSpan> part of the ISubject<TimeSpan>) stops emitting new values into the pipeline after 40 consecutively failed reconnection attempts due to an unhandled overflow exception and the reconnection subscription represented by the IObservable<Timespan> would remain idle, but the background thread is unnoticedly aborted.

The PR provides the following improvements and fixes:

  • checks the _reconnectAttempts variable against a constant MaxRetriesAttempts with value set to 39 attempts in order to avoid System.OverflowException thrown from the ErrorBackoffStrategy delegate.
  • the logic in the OnException event handler is wrapped with a try-catch-finally block. The finally block ensures that new values - interpreted as Timer ticks ensuring latency between inversion of control invocations - will be produced in case of arbitrary exceptions in the future.
  • adds Retry operator into the reconnection handler's subscription pipeline and logs errors on OnError messages.

Let me know if I should move some or all of the following MaxRetriesAttempts logic, constant and the DefaultBackoff property directly into the BackoffStrategies static class. This would lead to a more implicit (hidden) error guard. DefaultBackoff is necessary in case of introducing a finally block in the OnException method.

I could also introduce some unit tests for the ResourceWatcher<TEntity> class by injecting System.Reactive.Concurrency.IScheduler into its constructor in order to mock time and threads.

Operators without proper unobserved thread error logging are unaware about the issue. I would recommend to log such errors in the following way:

    var app = builder.Build();

    void TaskSchedulerUnobservedTaskException(object? sender, UnobservedTaskExceptionEventArgs e)
    {
        app.Logger.LogError(e.Exception, "Caught unhandled TaskScheduler exception");
    }

    app.Lifetime.ApplicationStarted.Register(() =>
    {
        TaskScheduler.UnobservedTaskException += TaskSchedulerUnobservedTaskException;
    });
    app.Lifetime.ApplicationStopped.Register(() =>
    {
        TaskScheduler.UnobservedTaskException -= TaskSchedulerUnobservedTaskException;
    });

Copy link
Owner

@buehler buehler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about the number?

The rest seems perfectly fine to me. I'm happy to merge it if you gave me your opinion on my comment.

@buehler
Copy link
Owner

buehler commented Jan 16, 2023

And of course: Thank you for contributing to this project! <3

buehler
buehler previously approved these changes Jan 16, 2023
@buehler buehler enabled auto-merge (squash) January 16, 2023 13:11
@buehler
Copy link
Owner

buehler commented Jan 16, 2023

Linting and testing must succeed for the PR to be merged.

@tomasfabian
Copy link
Contributor Author

Linting and testing must succeed for the PR to be merged.

I will check it, but the master branch from which I forked is in erroneous state, too,

auto-merge was automatically disabled January 16, 2023 13:37

Head branch was pushed to by a user without write access

@tomasfabian
Copy link
Contributor Author

Linting and testing must succeed for the PR to be merged.

You were right. I tried to fix the linting issues in the second commit. Can I run the analyzer somehow from here, please?

@buehler
Copy link
Owner

buehler commented Jan 16, 2023

Yea, if you push to this PR, I can approve and run the analyzer. Also, if you see any warnings during "build whole solution", this is a strong indicator that the build is going to fail. All warnings produce errors in production mode.

@buehler buehler merged commit 6fdd512 into buehler:master Jan 16, 2023
@tomasfabian
Copy link
Contributor Author

Thank you for the review and approval @buehler!
The release is probably halted by this semantic release error. Do I understand it correctly, please?

@tomasfabian tomasfabian changed the title fix: avoid overflow exceptions in resource watcher OnException handle… fix: stopped watching of resources - avoid overflow exceptions in resource watcher OnException handle… Jan 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants