GH-149: Improve error messages on write thread failure #150

Merged on Nov 17, 2021 (2 commits)


C0urante

Addresses #149.

The KCBQThreadPoolExecutor is modified to only track one write thread exception at a time, which allows us to include a complete stack trace of that exception when failing the task.
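
As a rough illustration of tracking only one write thread exception, the sketch below records the first failure in an `AtomicReference` and later rethrows it as the cause of a new exception, so the original stack trace is kept. The class name `FirstErrorTrackingExecutor`, the methods `maybeThrowEncounteredError` and `shutdownAndWait`, and the error message are assumptions made for this example; this is not the connector's actual code.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for KCBQThreadPoolExecutor; names and messages are illustrative.
class FirstErrorTrackingExecutor extends ThreadPoolExecutor {
    // Holds the first failure seen by any write thread; later failures are ignored.
    private final AtomicReference<Throwable> encounteredError = new AtomicReference<>();

    FirstErrorTrackingExecutor(int threads) {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        if (t != null) {
            // compareAndSet succeeds only while no error has been recorded yet.
            encounteredError.compareAndSet(null, t);
        }
    }

    // Rethrows the recorded failure as the cause, preserving its full stack trace.
    // The reference is deliberately not cleared, so repeated calls keep failing.
    void maybeThrowEncounteredError() {
        Throwable t = encounteredError.get();
        if (t != null) {
            throw new RuntimeException("A write thread has failed with an unrecoverable error", t);
        }
    }

    // Convenience helper for the example: stop accepting work and wait for queued tasks.
    boolean shutdownAndWait() {
        shutdown();
        try {
            return awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Note that `afterExecute` only receives a non-null `Throwable` for tasks submitted with `execute`; tasks submitted with `submit` capture their exception in the returned `Future` instead.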

The "Attempted to reduce batch size below 1." error is rewritten to include more information on a potential root cause and make the attached cause of the exception clearer.

This PR also cleans up unnecessary exception wrapping when instantiating various AbstractConfig subclasses.

@ddasarathan (Member) left a comment:

From what I understand, all threads can throw exceptions but only the first gets logged and that bubbles up and kills the task. Since the exception will be thrown, we do not want to reset the encounteredError atomic ref?

@ddasarathan (Member) left a comment:

LGTM

@C0urante (Author) commented:

> From what I understand, all threads can throw exceptions but only the first gets logged and that bubbles up and kills the task. Since the exception will be thrown, we do not want to reset the encounteredError atomic ref?

We do not reset the reference because the exception may be used multiple times. We throw exceptions encountered from write threads in both BigQuerySinkTask::flush (as part of the call to KCBQThreadPoolExecutor::awaitCurrentTasks) and at the very beginning of BigQuerySinkTask::put; the former prevents us from committing offsets for data that we weren't able to write to BigQuery (see #68), and the latter causes the task to fail (since throwing an error from SinkTask::flush doesn't actually cause a task to fail).
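
A minimal sketch of why the reference is left in place, assuming a simplified model that does not use the Kafka Connect API: the same stored error has to surface from both the flush path and the next put call. The class `SketchSinkTask` and the methods `onWriteFailure` and `maybeThrowEncounteredError` are hypothetical names for illustration only.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical, simplified model of the task; it does not extend SinkTask.
class SketchSinkTask {
    // Stands in for KCBQThreadPoolExecutor.encounteredError; never reset once set.
    private final AtomicReference<Throwable> encounteredError = new AtomicReference<>();

    // Called by a (simulated) write thread when a write fails; only the first failure is kept.
    void onWriteFailure(Throwable t) {
        encounteredError.compareAndSet(null, t);
    }

    private void maybeThrowEncounteredError() {
        Throwable t = encounteredError.get();
        if (t != null) {
            throw new IllegalStateException("A write thread has failed", t);
        }
    }

    // Checking at the very beginning of put is what causes the task to fail.
    void put(List<String> records) {
        maybeThrowEncounteredError();
        // ... hand records to write threads ...
    }

    // Throwing from flush prevents offsets from being committed for unwritten data.
    void flush() {
        // ... wait for in-flight writes to finish ...
        maybeThrowEncounteredError();
    }
}
```

If the error were cleared after being thrown from `flush`, the following `put` call would see no error and the task would keep running despite the failed write.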

@C0urante (Author) commented:

@ddasarathan thanks for taking a look. Given the LGTM, I'll merge, but if you are unsatisfied with my explanation of the way we use KCBQThreadPoolExecutor.encounteredError, please let me know; happy to revert if something isn't right here or file a follow-up if there's room for improvement.

@C0urante C0urante merged commit d8fc535 into 1.6.x Nov 17, 2021
@C0urante C0urante deleted the gh-149 branch November 17, 2021 15:15