Add support for processing tasks in order within groups #476
@@ -94,7 +94,21 @@ class DefaultMigrationManager {
             "Update length of invocation column on outbox for MySQL dialects only.",
             "ALTER TABLE TXNO_OUTBOX MODIFY COLUMN invocation MEDIUMTEXT",
             Map.of(
-                Dialect.POSTGRESQL_9, "", Dialect.H2, "", Dialect.ORACLE, "SELECT * FROM dual")));
+                Dialect.POSTGRESQL_9, "", Dialect.H2, "", Dialect.ORACLE, "SELECT * FROM dual")),
+        new Migration(
+            9,
+            "Add createTime column to outbox",
+            "ALTER TABLE TXNO_OUTBOX ADD COLUMN createTime TIMESTAMP(6) NULL AFTER invocation",
+            Map.of(
+                Dialect.POSTGRESQL_9,
+                "ALTER TABLE TXNO_OUTBOX ADD COLUMN createTime TIMESTAMP(6)",
+                Dialect.ORACLE,
+                "ALTER TABLE TXNO_OUTBOX ADD createTime TIMESTAMP(6)")),
+        new Migration(
+            10,
+            "Add groupId column to outbox",
+            "ALTER TABLE TXNO_OUTBOX ADD COLUMN groupId VARCHAR(250)",
+            Map.of(Dialect.ORACLE, "ALTER TABLE TXNO_OUTBOX ADD groupId VARCHAR2(250)")));

   static void migrate(TransactionManager transactionManager, Dialect dialect) {
     transactionManager.inTransaction(

Review comment (on the new migrations): You can do this with a single migration: …
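The two migrations above each pair a default DDL statement with per-dialect overrides (Oracle's `VARCHAR2`, Postgres lacking `AFTER`). A minimal sketch of that resolution pattern, using a hypothetical `MigrationSketch` record rather than the library's actual `Migration` class:

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the library's Migration class: a default SQL
// statement plus optional per-dialect overrides, as in the diff above.
record MigrationSketch(int version, String name, String sql, Map<String, String> dialectSpecific) {

  // Resolve the statement to run for a given dialect, falling back to the default.
  String sqlFor(String dialect) {
    return Optional.ofNullable(dialectSpecific.get(dialect)).orElse(sql);
  }
}

public class MigrationResolution {
  public static void main(String[] args) {
    var m =
        new MigrationSketch(
            10,
            "Add groupId column to outbox",
            "ALTER TABLE TXNO_OUTBOX ADD COLUMN groupId VARCHAR(250)",
            Map.of("ORACLE", "ALTER TABLE TXNO_OUTBOX ADD groupId VARCHAR2(250)"));
    System.out.println(m.sqlFor("ORACLE")); // Oracle gets its VARCHAR2 override
    System.out.println(m.sqlFor("MY_SQL_8")); // other dialects fall back to the default DDL
  }
}
```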
@@ -36,7 +36,8 @@
 public class DefaultPersistor implements Persistor, Validatable {

   private static final String ALL_FIELDS =
-      "id, uniqueRequestId, invocation, lastAttemptTime, nextAttemptTime, attempts, blocked, processed, version";
+      "id, uniqueRequestId, invocation, lastAttemptTime, nextAttemptTime, attempts, blocked, processed, version, "
+          + "createTime, groupId";

   /**
    * @param writeLockTimeoutSeconds How many seconds to wait before timing out on obtaining a write

Review comment (on ALL_FIELDS): For clarity elsewhere, can we use the order …
@@ -98,7 +99,11 @@ public void migrate(TransactionManager transactionManager) {
   public void save(Transaction tx, TransactionOutboxEntry entry)
       throws SQLException, AlreadyScheduledException {
     var insertSql =
-        "INSERT INTO " + tableName + " (" + ALL_FIELDS + ") VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)";
+        "INSERT INTO "
+            + tableName
+            + " ("
+            + ALL_FIELDS
+            + ") VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)";
     var writer = new StringWriter();
     serializer.serializeInvocation(entry.getInvocation(), writer);
     if (entry.getUniqueRequestId() == null) {
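The placeholder list in the `VALUES` clause has to track the column list: nine `?` before this change, eleven after `createTime` and `groupId` are appended. A small sketch of that invariant, deriving the placeholders from the field list instead of hand-counting (hypothetical helper, not the library's code):

```java
import java.util.Collections;

// Sketch: build the insert statement with exactly one "?" per column in
// ALL_FIELDS, so the placeholder count can never drift from the field list.
public class InsertSqlSketch {
  static final String ALL_FIELDS =
      "id, uniqueRequestId, invocation, lastAttemptTime, nextAttemptTime, attempts, blocked, "
          + "processed, version, createTime, groupId";

  static String insertSql(String tableName) {
    int columns = ALL_FIELDS.split(",").length; // 11 columns after this PR
    String placeholders = String.join(", ", Collections.nCopies(columns, "?"));
    return "INSERT INTO " + tableName + " (" + ALL_FIELDS + ") VALUES (" + placeholders + ")";
  }

  public static void main(String[] args) {
    System.out.println(insertSql("TXNO_OUTBOX"));
  }
}
```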
@@ -139,6 +144,9 @@ private void setupInsert(
     stmt.setBoolean(7, entry.isBlocked());
     stmt.setBoolean(8, entry.isProcessed());
     stmt.setInt(9, entry.getVersion());
+    stmt.setTimestamp(
+        10, entry.getCreateTime() == null ? null : Timestamp.from(entry.getCreateTime()));
+    stmt.setString(11, entry.getGroupId());
   }

   @Override
@@ -248,6 +256,13 @@ public boolean unblock(Transaction tx, String entryId) throws Exception {
   @Override
   public List<TransactionOutboxEntry> selectBatch(Transaction tx, int batchSize, Instant now)
       throws Exception {
+    List<TransactionOutboxEntry> results = selectBatchUnordered(tx, batchSize, now);
+    results.addAll(selectBatchOrdered(tx, batchSize - results.size(), now));
+    return results;
+  }
+
+  public List<TransactionOutboxEntry> selectBatchUnordered(
+      Transaction tx, int batchSize, Instant now) throws Exception {
     String forUpdate = dialect.isSupportsSkipLock() ? " FOR UPDATE SKIP LOCKED" : "";
     //noinspection resource
     try (PreparedStatement stmt =

Review comment: We have a problem here I think. When … We need to run one entry at a time, synchronously, and if that fails, keep retrying it forever. That means: … As for the most efficient way to fetch the data given that we need to only process one-at-a-time for each …

Review comment: Thanks for the comments. So separating flush for ordered processing and invoking …

Review comment: It would be nice to merge #484 (maybe not necessarily as-is currently, but the idea seems solid) as I believe that would make the code easier to work with, especially when introducing additional SQL as in this PR. It might also be nice to merge #492 in case sequences are needed directly, as the "create sequence" functionality is improved in later versions.

Review comment: Updated the branch to process ordered tasks synchronously, while unordered tasks are still submitted asynchronously as before (see …).
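`selectBatch` now fills the batch from unordered work (`groupId IS NULL`) first, then hands only the remaining capacity to the ordered query. The capacity arithmetic, sketched in isolation with plain lists standing in for the two query results:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchMergeSketch {
  // Stand-ins for the two queries: the unordered batch is fetched first, then
  // the ordered query is limited to whatever capacity remains in the batch.
  static List<String> merge(List<String> unordered, List<String> ordered, int batchSize) {
    List<String> results =
        new ArrayList<>(unordered.subList(0, Math.min(unordered.size(), batchSize)));
    int remaining = batchSize - results.size();
    results.addAll(ordered.subList(0, Math.min(ordered.size(), remaining)));
    return results;
  }

  public static void main(String[] args) {
    // Batch of 5: 3 unordered entries leave room for 2 ordered ones.
    System.out.println(merge(List.of("u1", "u2", "u3"), List.of("g1", "g2", "g3"), 5));
  }
}
```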
@@ -262,6 +277,7 @@ public List<TransactionOutboxEntry> selectBatch(Transaction tx, int batchSize, I
                     + dialect.booleanValue(false)
                     + " AND processed = "
                     + dialect.booleanValue(false)
+                    + " AND groupid IS NULL "
                     + dialect.getLimitCriteria()
                     + forUpdate)) {
       stmt.setTimestamp(1, Timestamp.from(now));
@@ -270,6 +286,56 @@ public List<TransactionOutboxEntry> selectBatch(Transaction tx, int batchSize, I
     }
   }

+  public List<TransactionOutboxEntry> selectBatchOrdered(
+      final Transaction tx, final int batchSize, final Instant now) throws Exception {
+    String forUpdate = dialect.isSupportsSkipLock() ? " FOR UPDATE SKIP LOCKED" : "";
+    try (PreparedStatement stmt =
+        tx.connection()
+            .prepareStatement(
+                dialect.isSupportsWindowFunctions()
+                    ? "WITH t AS"
+                        + " ("
+                        + " SELECT RANK() OVER (PARTITION BY groupid ORDER BY createtime) AS rn, "
+                        + ALL_FIELDS
+                        + " FROM "
+                        + tableName
+                        + " WHERE processed = "
+                        + dialect.booleanValue(false)
+                        + " )"
+                        + " SELECT "
+                        + ALL_FIELDS
+                        + " FROM t "
+                        + " WHERE rn = 1 AND nextAttemptTime < ? AND blocked = "
+                        + dialect.booleanValue(false)
+                        + " AND groupid IS NOT NULL "
+                        + dialect.getLimitCriteria()
+                        + (dialect.isSupportsWindowFunctionsForUpdate() ? forUpdate : "")
+                    : "SELECT "
+                        + ALL_FIELDS
+                        + " FROM"
+                        + " ("
+                        + " SELECT groupid AS group_id, MIN(createtime) AS min_create_time"
+                        + " FROM "
+                        + tableName
+                        + " WHERE processed = "
+                        + dialect.booleanValue(false)
+                        + " GROUP BY group_id"
+                        + " ) AS t"
+                        + " JOIN "
+                        + tableName
+                        + " t1"
+                        + " ON t1.groupid = t.group_id AND t1.createtime = t.min_create_time"
+                        + " WHERE nextAttemptTime < ? AND blocked = "
+                        + dialect.booleanValue(false)
+                        + " AND groupid IS NOT NULL "
+                        + dialect.getLimitCriteria()
+                        + forUpdate)) {
+      stmt.setTimestamp(1, Timestamp.from(now));
+      stmt.setInt(2, batchSize);
+      return gatherResults(batchSize, stmt);
+    }
+  }
+
   @Override
   public int deleteProcessedAndExpired(Transaction tx, int batchSize, Instant now)
       throws Exception {
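Both branches of `selectBatchOrdered` (window function where supported, `GROUP BY` + self-join otherwise) select the same thing: per `groupId`, the single unprocessed entry with the earliest `createTime`. The same "rank = 1" selection expressed in memory, using a hypothetical simplified `Entry` record rather than the library's entry class:

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

public class OrderedHeadSketch {
  // Hypothetical, simplified entry: just the columns the ordered query cares about.
  record Entry(String id, String groupId, Instant createTime) {}

  // Equivalent of "WHERE rn = 1": the earliest unprocessed entry per group.
  static List<String> headsPerGroup(List<Entry> entries) {
    return entries.stream()
        .collect(
            Collectors.groupingBy(
                Entry::groupId, Collectors.minBy(Comparator.comparing(Entry::createTime))))
        .values()
        .stream()
        .flatMap(Optional::stream)
        .map(Entry::id)
        .sorted()
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    var t0 = Instant.parse("2024-01-01T00:00:00Z");
    var entries =
        List.of(
            new Entry("a1", "A", t0),
            new Entry("a2", "A", t0.plusSeconds(1)),
            new Entry("b1", "B", t0.plusSeconds(2)));
    // a2 is held back until a1 completes; only the head of each group is eligible.
    System.out.println(headsPerGroup(entries));
  }
}
```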
@@ -311,6 +377,11 @@ private TransactionOutboxEntry map(ResultSet rs) throws SQLException, IOExceptio
             .blocked(rs.getBoolean("blocked"))
             .processed(rs.getBoolean("processed"))
             .version(rs.getInt("version"))
+            .createTime(
+                rs.getTimestamp("createTime") == null
+                    ? null
+                    : rs.getTimestamp("createTime").toInstant())
+            .groupId(rs.getString("groupId"))
             .build();
     log.debug("Found {}", entry);
     return entry;
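`createTime` is nullable (rows written before migration 9 carry no value), so both directions of the mapping must null-check before converting between `java.time.Instant` and `java.sql.Timestamp`. The conversion pair in isolation (hypothetical helper names):

```java
import java.sql.Timestamp;
import java.time.Instant;

public class TimestampMapping {
  // Writing: Instant -> java.sql.Timestamp, preserving null for legacy rows
  // (mirrors the setupInsert change above).
  static Timestamp toSql(Instant createTime) {
    return createTime == null ? null : Timestamp.from(createTime);
  }

  // Reading: java.sql.Timestamp -> Instant (mirrors the map(ResultSet) change).
  static Instant fromSql(Timestamp ts) {
    return ts == null ? null : ts.toInstant();
  }

  public static void main(String[] args) {
    Instant now = Instant.parse("2024-01-01T00:00:00Z");
    System.out.println(fromSql(toSql(now))); // round-trips to the same instant
    System.out.println(fromSql(toSql(null))); // null stays null
  }
}
```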
Review comment: TIMESTAMP(6) is millisecond accuracy based on the system clock of the system writing the record. If you have two boxes writing records at roughly the same time, the ordering will be easily screwed up by clock skew. To my mind, the only way to guarantee order is to have a group-specific incrementing counter with write-locking semantics, e.g. … and … then write your record, then … This all has a big negative: there is big contention on write. But I don't think it's avoidable.
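The reviewer's sketch was elided from this capture, but the shape of a per-group counter with write-locking semantics can be simulated in memory: lock the group's counter, read, increment, write. Here `synchronized` plays the role of the row-level write lock (e.g. `SELECT ... FOR UPDATE` on a hypothetical counter table); this is an illustrative analogue, not the reviewer's actual proposal:

```java
import java.util.HashMap;
import java.util.Map;

public class GroupCounterSketch {
  // Stand-in for a per-group counter table. The synchronized block models the
  // write lock that serializes writers within a group (the source of the
  // contention the reviewer calls out).
  private final Map<String, Long> counters = new HashMap<>();

  synchronized long nextForGroup(String groupId) {
    long next = counters.getOrDefault(groupId, 0L) + 1;
    counters.put(groupId, next);
    return next;
  }

  public static void main(String[] args) {
    var seq = new GroupCounterSketch();
    // Each group gets its own gap-free, monotonically increasing sequence,
    // independent of any wall clock.
    System.out.println(seq.nextForGroup("A")); // 1
    System.out.println(seq.nextForGroup("A")); // 2
    System.out.println(seq.nextForGroup("B")); // 1
  }
}
```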
Review comment (markushc): Wouldn't a single global sequence (CREATE SEQUENCE ordered_sequence, with INSERT INTO ... nextval('ordered_sequence')) to replace the createTime column do the same job with minimum complexity? If so, we could try that instead. I'm actually not sure why using a sequence wasn't the approach taken in the original pull request (#252). With an 8-byte BIGINT, wraparound seems unlikely to become an issue.
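The single-global-sequence alternative, sketched as an in-memory analogue of a database sequence: one shared monotonic counter assigns each entry its ordering key at insert time. Entries from different groups interleave freely, but within any one group the sequence is strictly increasing regardless of clock skew between writers (illustrative names only):

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class GlobalSequenceSketch {
  // In-memory analogue of CREATE SEQUENCE ordered_sequence: one shared counter.
  static final AtomicLong SEQUENCE = new AtomicLong();

  record Entry(String groupId, long seq) {}

  // nextval() is taken at insert time, so within-group order is the order of
  // insertion, immune to wall-clock skew between nodes.
  static Entry schedule(String groupId) {
    return new Entry(groupId, SEQUENCE.incrementAndGet());
  }

  public static void main(String[] args) {
    List<Entry> entries = List.of(schedule("A"), schedule("B"), schedule("A"));
    System.out.println(entries);
  }
}
```

One counter shared by every group is simpler than a counter per group, at the cost of funnelling all ordered writes through a single point of contention.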
Review comment: @badgerwithagun I guess having a sequence per group ID could result in a lot of sequences if the following use case is supported: suppose you have on the order of a million entities for which you want to publish CRUD events. The events per entity need to be ordered, but the order of events for different entities does not matter. You would then use the entity ID as group ID, resulting in on the order of a million sequences?

Having just one sequence as @markushc suggests would be simpler, but I don't know if it would be a critical performance limitation. What if an outbox entry is assigned a sequence number if and only if its group ID is non-null? On the other hand, I don't know if there is a use case for combining schedule calls with and without group ID, even though the current API would allow for that.

Review comment: Updated the branch to use the database timestamp, with microsecond precision where available, for initializing the task creation time to eliminate the possibility of clock skew between nodes (see dialect.getCurrentTimestamp() in DefaultPersistor.java). Staying with the approach of ordering based on creation time avoids the complexity related to group-specific locks. Thoughts?