-
Notifications
You must be signed in to change notification settings - Fork 199
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implementing Retry logic [Reliable Channel] [STABLE Branch] #561
Implementing Retry logic [Reliable Channel] [STABLE Branch] #561
Conversation
…ity. Added extension to return the timeout values as expected before.
Moved the Handlers out of the concrete package to the common package to keep the same consistency. Removed a couple of unessecary methods. Added docs.
Added MaxInstantRetry configuration to allow for instantaneous retry on a failed transmission.
Javadoc and formatting updates
throttling = Boolean.valueOf(namesAndValues.get("Throttling")); | ||
developerMode = Boolean.valueOf(namesAndValues.get(DEVELOPER_MODE_NAME)); | ||
try { | ||
maxInstantRetries = Integer.parseInt(INSTANT_RETRY_NAME); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this will throw. Shouldn't it be Integer.parseInt(namesAndValues.get(INSTANT_RETRY_NAME)); ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes good catch. This will blow. But I am not quite sure, it seems that this value comes as string from XML. In case of user specifies wrong value which cannot be converted to string then this would throw. Let's have try catch around this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes very good catch. Fixed and added null check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Integer.parseInt does throw NumberFormatException which I already had a catch for. It should only log if we have ERROR tracing on.
@debugthings just wanted to understand did you changed / added anything into this PR which is different from the previous one which we reviewed extensively? Do I need to go through this PR again? |
|
||
private final static String INSTANT_RETRY_NAME = "MaxInstantRetry"; | ||
private final static int DEFAULT_MAX_INSTANT_RETRY = 3; | ||
private final static int DEFAULT_MAX_TELEMETRY_BUFFER_CAPACITY = 500; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This number also seems low to me. I think default max capacity can be double
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's the same as before, so we should not change it in this PR.
private final static int DEFAULT_MAX_INSTANT_RETRY = 3; | ||
private final static int DEFAULT_MAX_TELEMETRY_BUFFER_CAPACITY = 500; | ||
private final static int MIN_MAX_TELEMETRY_BUFFER_CAPACITY = 1; | ||
private final static int MAX_MAX_TELEMETRY_BUFFER_CAPACITY = 1000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is also definitely less. I think this should be definitely increased.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's the same as before, so we should not change it in this PR.
telemetry.getContext().getProperties().put("DeveloperMode", "true"); | ||
} | ||
|
||
if (telemetrySampler != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this just kept for maintaining backward compatibility with Sampling V1? Nonetheless, with this major version we should definitely scrap off the old sampling algorithm as it uses user id based sampling which is not desirable for sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dhaval24 - Again, nothing new here. The same code was there before. We can address unrelated changes as separate PR's.
* - The sampler | ||
*/ | ||
@Override | ||
public void setSampler(TelemetrySampler telemetrySampler) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be deprecated. It is the old sampling strategy which we do not encourage any more. If we are not scrapping it off entirely then atleast please use @depracted tag here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dhaval24 - same here. Unrelated to this PR.
} | ||
} | ||
|
||
public void setMaxInstantRetries(int maxInstantRetries) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Java docs are still missing here!
} | ||
} | ||
|
||
public int getMaxInstantRetries() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here please update java docs here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall looks good to me. Just some minor nit-pickings
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One change needed when reading instant retry configuration. Need to get value from "namesAndValues" map. See comment. Thanks!
|
||
public final class TransmissionPolicyManager implements Stoppable, TransmissionHandlerObserver { | ||
|
||
private int INSTANT_RETRY_AMOUNT = 3; // Should always be set by the creator of this class |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is using the java naming convention for constants. For member variables, camelCase
is preferred, but I'm not going to block the PR for this :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough.
Added null check
@dhaval24 I did add the ability to set the max retry amount from the XML so it will be available if that number is too high or low. Outside of that, it was adding javadocs or simple refactor |
@debugthings for me it's approved :-) I do not have any additional concerns |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still reviewing
if (backOffMillis > 0) | ||
{ | ||
long backOffSeconds = backOffMillis / 1000; | ||
InternalLogger.INSTANCE.logAlways(InternalLogger.LoggingLevel.TRACE, "App is throttled, telemetry will be blocked for %s seconds.", backOffSeconds); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this call become very noisy with no option to turn off on big load?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly, I can remove the log always, The backoff is called in large increments so it would become chatty if there were multiple threads and if we were unable to send data for a large amount of time.
|
||
// List of transmission policies implemented as handlers | ||
private List<TransmissionHandler> transmissionHandlers; | ||
|
||
// The future date the the transmission is blocked |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cleaning spaces irregularities is always good idea. Code should be clean logically and stylistically =)
*/ | ||
public void clearBackoff() { | ||
policyState.setCurrentState(TransmissionPolicy.UNBLOCKED); | ||
backoffManager.onDoneSending(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spaces
} | ||
if (bufferedReader != null) { | ||
bufferedReader.close(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be closed in finally statement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this if should definitely go in finally!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lines 125--130 should be wrapped in the finally block below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, moved to finally and added try/catch around those close() statements as well.
* Fix null ref check in telemetry correlation Utils (#541) * Fix null ref check in TelemetryCorrelationUtils * Modifying log level to warning * Updating Changelog * Fix handling of NaN and +/-Infinity in JSON serializer (#499) * Handle NaN and +/-Infinity in metrics * Default NaN/Infinity serialization to 0 to be consistent with other AI SDKs and make the code compatible with Java 6 * fixed javadoc errors and added section to generate pom.xml with all builds (#551) * Updating version number to 2.0.0 * Implementing Retry logic [Reliable Channel] [STABLE Branch] (#561) * Initial commit of retry and backoff logic fixes * Fixing warnings on files I touched this round * Fix the eclipse UI from screaming about the docker Contstants * Fixed backoff logic to use existing method. Added more logging to the sender channel. * Added the partial response handler, more logging * Added gson to core. Fixed backoff manager to keep original functionality. Added extension to return the timeout values as expected before. * Added unit tests. * Fixing string typed ArrayList<> to List<> per Dhaval * Missed one * Making tests consistent. * Added javadoc comments, simplified logic for a few methods * Added exception logging per @dhaval24. Fixed formatting on touched files * Updates per last round of commits Moved the Handlers out of the concrete package to the common package to keep the same consistency. Removed a couple of unessecary methods. Added docs. * Latest fixes * Add MaxInstantRetry Added MaxInstantRetry configuration to allow for instantaneous retry on a failed transmission. * Javadoc Updates Javadoc and formatting updates * NumberFormatException fix Added null check * JavaDocs for TPM * Fixing FixedRateSampling to work in old and new version of sampling (#540) Overriding default sampling percentage when programatically specified sampling percentage by user. * upgrade to logback v1.2.3 (#565) * Reliable channel: replacing logAlways "TRACE" statements with "info" (#571) * Reliable channel: close resources in finally block. (#572) * Reliable channel: close resources in finally block. * change logging to warning when closing resources * Bugfix against retry logic (#576) * Refactor * BUGFIX Logic would never backoff After adding the instant retry amount logic to the code this line of code could cause the transmissions to not back off. * Changes requested * Fixed javadocs tags, that caused build errors while executing `javadoc` gradle task (#578) * Update Changelog * Fix link in changelog * Fix another link in changelog * Update gradle.properties * Fix customizing pom.xml in Gradle build (#582) * Fix customizing pom.xml in Gradle build * Insert license after 1. row in pom.xml * Filter artifacts relocated by shadow task from pom dependencies - match artifacts by groupId - fixes #583 * Generate a pom file "beside" the artifact jar file
Completion of PR #556 into Stable Branch
Added retry logic that is similar to the .NET framework.
https://github.com/Microsoft/ApplicationInsights-Java/blob/7a5e572b2ffd3256898c6bd05dc3bec6754e3629/core/src/main/java/com/microsoft/applicationinsights/internal/channel/common/TransmissionNetworkOutput.java#L172
All exceptions are now handled by
ErrorHandler
in this line.https://github.com/Microsoft/ApplicationInsights-Java/blob/094383a25b0f98fe6aa2dd6dd58bae10259492e6/core/src/main/java/com/microsoft/applicationinsights/channel/concrete/inprocess/ErrorHandler.java#L39-L42
For significant contributions please make sure you have completed the following items: