Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Retry Logic For Elasticsearch Requests #4408

Closed
criminosis opened this issue Apr 22, 2024 · 0 comments · Fixed by #4409
Closed

Implement Retry Logic For Elasticsearch Requests #4408

criminosis opened this issue Apr 22, 2024 · 0 comments · Fixed by #4409

Comments

@criminosis
Copy link
Contributor

Describe the feature:
JG views writes to mixed indices as a "best effort" and documents that failures in transactions with regards to mixed indices are left to a separate periodic clean up process after enabling JG's WAL.

However this results in clients not getting feedback for more mundane write failures to mixed indices like Elasticsearch like a 429 Too Many Requests that just need to respect the backpressure being conveyed by Elasticsearch and retry again.

Elasticsearch's guidance states that 429 errors in particular are the client's responsibility.

To implement this feature would require implementing retry logic around this callsite. Additionally supporting some new config settings for Elasticsearch:

  • retry-limit: How many times to attempt a retry. Default to 0 to maintain current behavior.
  • retry-initial-wait: How long (in milliseconds) to wait before applying the initial retry. Defaults to 1ms.
  • retry-max-wait: How long (in milliseconds) to wait at most for each retry attempt, operates as a ceiling if a high number of retries are configured since exponential backoff will grow exceptionally quickly. Defaults to 1000ms.
  • retry-error-codes: Which error codes contained in a thrown Elasticsearch ResponseException to apply retries upon.

Describe a specific use case for the feature:
Once this feature is implemented it'll make writes to Elasticsearch better handle write errors like 429 during high write periods instead of silently failing the write and users of JanusGraph needing to be aware of the WAL & separate periodic repair application.

criminosis added a commit to criminosis/janusgraph that referenced this issue Apr 22, 2024
…figured for retrying

Closes JanusGraph#4408

Signed-off-by: Allan Clements <criminosis@gmail.com>
criminosis added a commit to criminosis/janusgraph that referenced this issue Apr 24, 2024
…figured for retrying

Closes JanusGraph#4408

Signed-off-by: Allan Clements <criminosis@gmail.com>
criminosis added a commit to criminosis/janusgraph that referenced this issue Apr 24, 2024
…figured for retrying

Closes JanusGraph#4408

Added additional tests for coverage

Included example of comma separated values for retry-error-codes config parameter

Signed-off-by: Allan Clements <criminosis@gmail.com>
criminosis added a commit to criminosis/janusgraph that referenced this issue Apr 26, 2024
…figured for retrying

Closes JanusGraph#4408

Added additional tests for coverage

Included example of comma separated values for retry-error-codes config parameter

Simplified retry loop logic

Moved retry configurations into constructor and removed setters

Signed-off-by: Allan Clements <criminosis@gmail.com>
porunov pushed a commit that referenced this issue Apr 27, 2024
…figured for retrying

Closes #4408

Added additional tests for coverage

Included example of comma separated values for retry-error-codes config parameter

Simplified retry loop logic

Moved retry configurations into constructor and removed setters

Signed-off-by: Allan Clements <criminosis@gmail.com>
@porunov porunov added this to the Release v1.1.0 milestone Apr 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants