Allow continue on errors #102

Merged: 11 commits merged on Jul 4, 2023
Conversation

shkediy commented Jun 30, 2023

No description provided.

shkediy (Contributor Author) commented Jun 30, 2023

This PR adds a retry mechanism for handling failures.
RGHibernate tries to write a whole batch as a single transaction.
When a failure occurs, the whole batch is written to a retry stream for reprocessing.
When a batch is read back from the retry stream, its updates are executed one by one. If an update fails, it is moved to a dead-letter queue (DLQ) stream together with details about the error, where it can be reviewed manually.
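
The flow described above can be sketched with a small in-memory model. This is only an illustration under stated assumptions, not the RGHibernate code: `RetryFlowSketch`, `DeadLetter`, and the `writer` callback are hypothetical names, and plain lists stand in for the retry stream, the DLQ stream, and the target database.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical in-memory model of the batch -> retry stream -> DLQ flow.
// Lists stand in for the streams and the database; no Hibernate or Redis is involved.
final class RetryFlowSketch {

    /** A failed update together with the error details, as it would be stored in the DLQ. */
    static final class DeadLetter {
        final String update;
        final String error;

        DeadLetter(String update, String error) {
            this.update = update;
            this.error = error;
        }
    }

    final List<List<String>> retryStream = new ArrayList<>(); // stands in for the retry stream
    final List<DeadLetter> dlq = new ArrayList<>();           // stands in for the DLQ stream
    final List<String> committed = new ArrayList<>();         // stands in for the database
    private final Consumer<String> writer;                    // throws to simulate a failed update

    RetryFlowSketch(Consumer<String> writer) {
        this.writer = writer;
    }

    /** First attempt: the whole batch is treated as one all-or-nothing transaction. */
    void writeBatch(List<String> batch) {
        try {
            batch.forEach(writer);    // any failure aborts the whole batch
            committed.addAll(batch);  // "commit" only when every update succeeded
        } catch (RuntimeException e) {
            retryStream.add(new ArrayList<>(batch)); // the whole batch goes to the retry stream
        }
    }

    /** Reprocessing: every update taken from the retry stream is applied on its own. */
    void drainRetryStream() {
        List<List<String>> pending = new ArrayList<>(retryStream);
        retryStream.clear();
        for (List<String> batch : pending) {
            for (String update : batch) {
                try {
                    writer.accept(update);
                    committed.add(update);
                } catch (RuntimeException e) {
                    // Keep the failing update and the error text for manual review.
                    dlq.add(new DeadLetter(update, e.toString()));
                }
            }
        }
    }
}
```

In this model one bad update first sends its whole batch to the retry stream. On reprocessing, only that update ends up in the DLQ and the rest of the batch is committed.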

MeirShpilraien changed the title from "mastercard changes" to "Allow continue on errors" Jul 2, 2023
MeirShpilraien self-requested a review July 2, 2023 07:53
@@ -137,7 +127,7 @@ public Connector(String name, String xmlDef, int batchSize, int duration, int re
 .setPattern(streamName)
 .setBatchSize(batchSize)
 .setDuration(duration)
-.setFailurePolicy(FailurePolicy.RETRY)
+.setFailurePolicy(FailurePolicy.CONTINUE)
Contributor

This would be considered a breaking change. We should allow setting it when registering the connector, and the default should be RETRY.

Contributor

Thinking about it now, you should probably change it to ABORT if the retry mechanism is enabled, no? If the retry mechanism is enabled, you will not raise an exception and will only write to the error stream, so any exception that is raised must be a bug in the code, no?

Contributor Author

Basically, you're right. Those kinds of errors are not supposed to happen in prod. However, we probably want to skip over them just to be safe.

Contributor Author

I can set it to CONTINUE if retryInterval < 0, and to RETRY otherwise.

Contributor Author

Added an option to use the DLQ for non-connection-related errors. Changed the mode back to RETRY.

Contributor

I just missed where we use it

Contributor Author

Right, I moved it to separate exceptions and forgot to put it back in. Just did
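
The routing discussed in this thread (keep retrying on connection problems, send other failures to the DLQ only when that option is on) might look roughly like the sketch below. Only the `errorsToDLQ` name comes from the review. `ErrorRoutingSketch`, `Action`, and the choice of which exception types count as connection-related are assumptions made for illustration, so the actual PR may classify errors differently.

```java
import java.net.ConnectException;
import java.sql.SQLNonTransientConnectionException;
import java.sql.SQLTransientConnectionException;

// Hypothetical routing helper; not taken from the PR.
final class ErrorRoutingSketch {

    enum Action { RETRY, SEND_TO_DLQ }

    /**
     * Walks the cause chain looking for a connection-level failure.
     * The set of exception types checked here is an assumption.
     */
    static boolean isConnectionError(Throwable error) {
        int depth = 0;
        // The depth bound guards against a cyclic cause chain.
        for (Throwable t = error; t != null && depth < 32; t = t.getCause(), depth++) {
            if (t instanceof SQLTransientConnectionException
                    || t instanceof SQLNonTransientConnectionException
                    || t instanceof ConnectException) {
                return true;
            }
        }
        return false;
    }

    /**
     * Connection errors are always retried, since the data itself may be fine.
     * Other errors go to the DLQ only when the option was requested.
     */
    static Action route(Throwable error, boolean errorsToDLQ) {
        if (errorsToDLQ && !isConnectionError(error)) {
            return Action.SEND_TO_DLQ;
        }
        return Action.RETRY;
    }
}
```

With `errorsToDLQ` set to `false`, every failure maps to `RETRY`, which matches the requirement that the default behavior stays unchanged.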

lastStreamId = null;
cause = e;

retry(record.subList(lastCommittedIdx + 1, record.size()));
Contributor

We should do the retry only if it was requested when registering the connector.

Contributor Author

Added an option to use DLQ

Contributor

I do not see where you use this new value?

Contributor Author

Right, I moved it to separate exceptions and forgot to put it back in. Just did

MeirShpilraien (Contributor) left a comment

Looks good.
Some small technical comments.
One design comment: I believe it should be possible to enable or disable this retry mechanism when we register the connector. By default it should be disabled to avoid breaking changes. I would also like to see some tests that cover this new functionality. I also see that our CI is not running; maybe @chayim can help with that.
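
One way to keep the new behavior opt-in, as requested in this review, is to carry the flag in the connector's registration options and default it to off. The sketch below is only an illustration: `ConnectorOptionsSketch` and its methods are hypothetical, and the real connector takes its parameters through its own constructor and registration command.

```java
// Hypothetical options holder; the real Connector constructor has a different signature.
final class ConnectorOptionsSketch {
    final int batchSize;
    final int duration;
    final boolean errorsToDLQ;

    private ConnectorOptionsSketch(int batchSize, int duration, boolean errorsToDLQ) {
        this.batchSize = batchSize;
        this.duration = duration;
        this.errorsToDLQ = errorsToDLQ;
    }

    /** Registration without the new flag: the DLQ path stays disabled, so nothing changes. */
    static ConnectorOptionsSketch of(int batchSize, int duration) {
        return new ConnectorOptionsSketch(batchSize, duration, false);
    }

    /** Explicit opt-in (or opt-out); returns a new immutable instance. */
    ConnectorOptionsSketch withErrorsToDLQ(boolean enabled) {
        return new ConnectorOptionsSketch(batchSize, duration, enabled);
    }
}
```

A test for the new functionality could then assert that a connector registered without the flag never writes to the DLQ stream, and that one registered with it does.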


}
catch (Exception e) {
msg = String.format("Failed committing transaction error='%s'", e);
Contributor

So there is really no need to set the message here, right? And then no need to set it to null afterwards? Also, I do not see where you check the errorsToDLQ value.

Contributor Author

Right, I moved it to separate exceptions and forgot to put it back in. Just did

transaction.commit();
session.clear();
} catch (Exception e) {
String msg = String.format("Failed retrying transaction error='%s'", e);
Contributor

What if you fail here because of a connection error?

MeirShpilraien (Contributor) left a comment

A few last comments.

MeirShpilraien merged commit 51aaf1b into RedisGears:master Jul 4, 2023