-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rewrite JS/GWT to use row counts (vs. row batches) #2406
Conversation
…re into lab-grow-viewport
@@ -171,7 +171,8 @@ public static class DeltaUpdatesBuilder { | |||
private final DeltaUpdates deltaUpdates = new DeltaUpdates(); | |||
private final BarrageUpdateMetadata barrageUpdate; | |||
private final String[] columnTypes; | |||
private int recordBatchesSeen = 0; | |||
private long numAddRowsRemaining = 0; | |||
private long numModRowsRemaining = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might be missing something - but it looks like this value is set once in the constructor; but never decremented? But we have a condition later on it == 0
? Seems like that will never get hit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're correct. When porting, missed a critical line decrementing numModRowsRemaining
. Please see the change below (line 264)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sad we don't have more thorough automated tests for producer -> barrage -> consumer; assuming it would be better at catching something like this. I'll let @nbauernfeind do deeper look into logic.
@@ -229,6 +237,7 @@ private void handleAddBatch(RecordBatch recordBatch, ByteBuffer body) { | |||
addedColumnData[columnIndex] = new DeltaUpdates.ColumnAdditions(columnIndex, columnData); | |||
} | |||
deltaUpdates.setSerializedAdditions(addedColumnData); | |||
numAddRowsRemaining -= (long) recordBatch.length().toFloat64(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm assuming (long) ... toFloat64()
is "valid", maybe something to do w/ the JS?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is "the valid" thing to do, here.
recordBatchesSeen++; | ||
return recordBatchesSeen == barrageUpdate.numAddBatches() + barrageUpdate.numModBatches(); | ||
// return true when complete | ||
return numAddRowsRemaining == 0 && numModRowsRemaining == 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it even possible to have < 0? I wonder if we should prefer to throw IllegalStateException if we see that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is fine. It's worth noting that the implementation does not technically support multiple record batches for deltas or snapshots. Deltas overwrite the earlier dataset. Snapshots aren't waiting for future fragments.
So, the web client will still break with any amount of fragmentation.
Optimize snapshots for large tables (#2365) was merged and it removed the guaranteed accuracy of the
num_add_batches
/num_mod_batches
provided by theBarrageUpdateMetadata
message. These became an estimate (as a temporary measure since these metadata tags have been removed in barrage repo release 0.5.0). This update rewrites GWTBarrageUtils.java
to use row counts fromrowsIncluded
androwsModified
metadata instead of number of batches.Expected to fix #2405