Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite JS/GWT to use row counts (vs. row batches) #2406

Merged
merged 65 commits into from
May 23, 2022

Conversation

lbooker42
Copy link
Contributor

@lbooker42 lbooker42 commented May 19, 2022

Optimize snapshots for large tables (#2365) was merged and it removed the guaranteed accuracy of the num_add_batches / num_mod_batches provided by the BarrageUpdateMetadata message. These became an estimate (as a temporary measure since these metadata tags have been removed in barrage repo release 0.5.0). This update rewrites GWT BarrageUtils.java to use row counts from rowsIncluded and rowsModified metadata instead of number of batches.

Expected to fix #2405

@lbooker42 lbooker42 self-assigned this May 19, 2022
@lbooker42 lbooker42 added the bug Something isn't working label May 19, 2022
@lbooker42 lbooker42 added this to the May 2022 milestone May 19, 2022
@@ -171,7 +171,8 @@ public static class DeltaUpdatesBuilder {
private final DeltaUpdates deltaUpdates = new DeltaUpdates();
private final BarrageUpdateMetadata barrageUpdate;
private final String[] columnTypes;
private int recordBatchesSeen = 0;
private long numAddRowsRemaining = 0;
private long numModRowsRemaining = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be missing something - but it looks like this value is set once in the constructor; but never decremented? But we have a condition later on it == 0? Seems like that will never get hit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're correct. When porting, missed a critical line decrementing numModRowsRemaining. Please see the change below (line 264)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sad we don't have more thorough automated tests for producer -> barrage -> consumer; assuming it would be better at catching something like this. I'll let @nbauernfeind do deeper look into logic.

@@ -229,6 +237,7 @@ private void handleAddBatch(RecordBatch recordBatch, ByteBuffer body) {
addedColumnData[columnIndex] = new DeltaUpdates.ColumnAdditions(columnIndex, columnData);
}
deltaUpdates.setSerializedAdditions(addedColumnData);
numAddRowsRemaining -= (long) recordBatch.length().toFloat64();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming (long) ... toFloat64() is "valid", maybe something to do w/ the JS?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is "the valid" thing to do, here.

recordBatchesSeen++;
return recordBatchesSeen == barrageUpdate.numAddBatches() + barrageUpdate.numModBatches();
// return true when complete
return numAddRowsRemaining == 0 && numModRowsRemaining == 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it even possible to have < 0? I wonder if we should prefer to throw IllegalStateException if we see that.

Copy link
Member

@nbauernfeind nbauernfeind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine. It's worth noting that the implementation does not technically support multiple record batches for deltas or snapshots. Deltas overwrite the earlier dataset. Snapshots aren't waiting for future fragments.

So, the web client will still break with any amount of fragmentation.

@lbooker42 lbooker42 merged commit 5b02f1f into deephaven:main May 23, 2022
@github-actions github-actions bot locked and limited conversation to collaborators May 23, 2022
@lbooker42 lbooker42 deleted the lab-large-tables branch June 26, 2024 19:58
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working NoDocumentationNeeded
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Phantom barrage rows
3 participants