-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve Graph updaters shutdown #5092
Improve Graph updaters shutdown #5092
Conversation
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## dev-2.x #5092 +/- ##
=============================================
+ Coverage 64.51% 64.57% +0.06%
- Complexity 14031 14064 +33
=============================================
Files 1725 1729 +4
Lines 67222 67374 +152
Branches 7200 7208 +8
=============================================
+ Hits 43365 43507 +142
- Misses 21434 21447 +13
+ Partials 2423 2420 -3
☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only minor readability issues.
We should however, refactor this code a bit. Thread handling is difficult, and almost impossible to unit-test (often just creating a sense of false-security). So, my suggestion is to refactor the Updater interfaces, and break up the different phases in methods and move all iteration and thread handling logic to the "updater framework".
src/ext/java/org/opentripplanner/ext/siri/updater/SiriETGooglePubsubUpdater.java
Outdated
Show resolved
Hide resolved
src/ext/java/org/opentripplanner/ext/siri/updater/SiriETGooglePubsubUpdater.java
Show resolved
Hide resolved
src/ext/java/org/opentripplanner/ext/siri/updater/azure/AbstractAzureSiriUpdater.java
Outdated
Show resolved
Hide resolved
src/ext/java/org/opentripplanner/ext/siri/updater/azure/AbstractAzureSiriUpdater.java
Outdated
Show resolved
Hide resolved
src/ext/java/org/opentripplanner/ext/siri/updater/azure/AbstractAzureSiriUpdater.java
Outdated
Show resolved
Hide resolved
src/main/java/org/opentripplanner/updater/GraphUpdaterManager.java
Outdated
Show resolved
Hide resolved
Co-authored-by: Thomas Gran <t2gran@gmail.com>
@@ -221,12 +221,12 @@ private Result<UpdateSuccess, UpdateError> apply( | |||
|
|||
/* commit */ | |||
return addTripToGraphAndBuffer(result.successValue(), journey, entityResolver); | |||
} catch (Throwable t) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ouch!
try { | ||
if (updaterList.stream().allMatch(GraphUpdater::isPrimed)) { | ||
LOG.info("OTP UPDATERS INITIALIZED - OTP is ready for routing!"); | ||
return; | ||
} | ||
//noinspection BusyWait | ||
Thread.sleep(1000); | ||
} catch (Exception e) { | ||
} catch (RuntimeException e) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really want to catch RuntimeException
? Aren't those meant to stop the JVM?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's fine. RuntimeException
is the base-class for all unchecked exceptions, for which lots of instances can occur during quite normal operations (like NoSuchElementException
for example).
On this subject, while I agree that there are restrictions to its application, I don't think catching Throwable
is automatically bad form. I tend to catch all errors on the top-level of the module or thread and log with as much pertinent information as possible from the state. If the alternative is to let the error propagate upwards to either the JVM or the execution framework (like Spring or an executor service) I rather log it myself to make sure the information is sent through the logging system in use. (Propagation of stdout/stderr is not always setup correctly.)
Catching Throwable
and then continue execution is a no-no though, we'll agree on that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realised that we caught Exception
before which is a superclass of RuntimeException
so this is catching fewer exceptions, not more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed it is. Unless there is a specific reason Exception
would actually be better.
(That RuntimeException
inherits from Exception
always trips me up.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is always safe to catch RuntimeException instead of Exception since compilation would fail if the block threw any checked exception. In this part of the code catching Exception is problematic since it could swallow InterruptedException and prevent graceful shutdown. And this is exactly what happened in this method.
But I can also replace :
catch (RuntimeException e) {
LOG.error(e.getMessage(), e);
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
otpIsShuttingDown = true;
LOG.info("OTP is shutting down, cancelling wait for updaters readiness.");
}
by:
catch (InterruptedException e) {
Thread.currentThread().interrupt();
otpIsShuttingDown = true;
LOG.info("OTP is shutting down, cancelling wait for updaters readiness.");
} catch (Exception e) {
LOG.error(e.getMessage(), e);
}
which is equivalent and maybe less confusing.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vpaturet I think the suggested second form is indeed clearer.
Summary
This PR ensures that graph updaters stop execution gracefully when the OTP server shuts down.
This is particularly important in a cloud environment where OTP instances are routinely started and stopped.
More specifically:
Additionally:
Throwables
were caught. NowExceptions
are caught instead, to prevent hiding memory issues.Issue
None
Unit tests
✅
Documentation
No