-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Retry on IOException #122
base: 2.1.x
Are you sure you want to change the base?
Retry on IOException #122
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @rikusg, this is a valuable improvement and I'm glad to hear it's already yielded benefits for you in production. I've left some stylistic comments but the core logic seems sound and I'd love to get it into the connector.
I've noticed that you've filed this against the 2.1.x branch; do you think you could retarget against the 2.0.x branch (which is the earliest branch where the improvements from #145 are available)?
@@ -97,7 +97,7 @@ public void run() { | |||
currentIndex += currentBatchSize; | |||
successCount++; | |||
} catch (BigQueryException err) { | |||
logger.warn("Could not write batch of size {} to BigQuery.", currentBatchList.size(), err); | |||
logger.warn("Could not write batch of size {} to BigQuery. (Error code {})", currentBatchList.size(), err.getCode(), err); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it might also be beneficial to include err.getError()
in the output here, since (like error.getCode()
) that information is left out from just logging err
.
What do you think about something like this?
logger.warn("Could not write batch of size {} to BigQuery. (Error code {})", currentBatchList.size(), err.getCode(), err); | |
logger.warn("Could not write batch of size {} to BigQuery. Error code: {}, underlying error (if present): {}", currentBatchList.size(), err.getCode(), err.getError(), err); |
} else if (err.getCode() == UNKNOWN_CODE && err.getCause() != null && err.getCause() instanceof IOException){ | ||
//Retry when IO exceptions occur | ||
logger.warn("IO Exception: {}, attempting retry", err.getCause().getMessage()); | ||
retryCount++; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic here seems mostly fine, but I'm wondering if we can move it into the new BigQueryErrorResponses
class introduced in #145? I think a new isUnspecifiedIOError
method could work?
It'd also be nice to see a basic unit test for this logic if it's not too much trouble.
@@ -154,6 +156,10 @@ public void writeRows(PartitionedTableId table, | |||
// rate limit exceeded error | |||
logger.warn("Rate limit exceeded for table {}, attempting retry", table); | |||
retryCount++; | |||
} else if (err.getCode() == UNKNOWN_CODE && err.getCause() != null && err.getCause() instanceof IOException){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: no need for the null guard if all we're doing is an instanceof
check (it'll handle null
fine):
} else if (err.getCode() == UNKNOWN_CODE && err.getCause() != null && err.getCause() instanceof IOException){ | |
} else if (err.getCode() == UNKNOWN_CODE && err.getCause() instanceof IOException){ |
@@ -154,6 +156,10 @@ public void writeRows(PartitionedTableId table, | |||
// rate limit exceeded error | |||
logger.warn("Rate limit exceeded for table {}, attempting retry", table); | |||
retryCount++; | |||
} else if (err.getCode() == UNKNOWN_CODE && err.getCause() != null && err.getCause() instanceof IOException){ | |||
//Retry when IO exceptions occur |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: it'd be nice to keep the style for these comments consistent
//Retry when IO exceptions occur | |
// IO error |
From time to time we would have tasks fail due to a com.google.cloud.bigquery.BigQueryException: Connection reset
Is seems that the connection reset comes from the BigQuery end.
This PR adds IOException handling to the existing retry logic.
We have been running this stable for the past 2 months.
Addresses #142