-
Notifications
You must be signed in to change notification settings - Fork 188
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Counters for last persister successes #2167
Conversation
Small comment. Can we move these so that they mark the end of a full successful run of the persister? Might have to do something like keep a boolean running through the for-loop, since we try/catch within the loop to cover as much as we can. We would want to know this since, even if one small piece succeeds and increments the timestamp, we could still be missing/erroring on a much larger and more problematic part and not know it |
@@ -184,8 +196,10 @@ protected boolean moveToHistory(SingularityTaskId object) { | |||
LOG.debug("Moving {} to history", object); | |||
try { | |||
historyManager.saveTaskHistory(taskHistory.get()); | |||
persisterSuccess = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This ends up making things the same as before. Due to the fact that we catch and continue the loop, we can end up with a situation like the following:
item 1 succeeds -> persisterSuccess = true
item 2 fails -> persisterSuccess = false
item 3 succeeds -> persisterSuccess = true
end -> mark as success
So in that case we are losing the fact that item 2 failed. Might be able to do it easier with a variable that starts as true and is just turned to false if we ever enter one of the catch blocks.
Side question, why make this a class level variable vs just initializing a boolean on each run?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yes, that makes much more sense; I'll update.
I can update to initialize on each run, thinking about it now it's probably better for gc, I was trying to reduce some code copying.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚢
if (persisterSuccess.get()) { | ||
LOG.info( | ||
"Request Persister: successful move to history ({} so far)", | ||
lastPersisterSuccess.incrementAndGet() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably most helpful to set this to something like System.currentTimeMillis() that way we know the last timestamp it ran and can generate an effective 'lag' type of metric off of that. incrementAndGet will just add 1 to the existing value. Logging message would change then too, something like 'finished run on persister at {}' or similar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok sounds good, I'll make that update.
Should the lastPeristerSuccess be a timestamp rather than an AtomicLong for consumption from InternalSingularityService/sfx?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AtomicLong is nice in that we are never competing between threads to read/update it (or if it's inside a lambda and has to be effectively final. It's also hard to update a raw long
but also bind it in guice as a singleton without something like AtomicLong to wrap it
🚢 |
Counter to ensure that zk -> mysql is happening. Every time one of the persisters move their data to history, the counter is logged and increased.