Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: consume data by batch #57

Merged
merged 2 commits into from
Nov 16, 2023
Merged

feat: consume data by batch #57

merged 2 commits into from
Nov 16, 2023

Conversation

Sotatek-HuyLe3a
Copy link
Collaborator

@Sotatek-HuyLe3a Sotatek-HuyLe3a commented Nov 9, 2023

#55

@Sotatek-HuyLe3a Sotatek-HuyLe3a marked this pull request as draft November 9, 2023 09:48
@Sotatek-HuyLe3a Sotatek-HuyLe3a marked this pull request as ready for review November 9, 2023 10:38
Copy link
Member

@satran004 satran004 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Sotatek-HuyLe3a . Few comments otherwise looks good to me.

Cursor currentCursor = cursorService.getCursor().orElse(null);
if (currentCursor != null && currentCursor.getSlot() > slotHeight) {
log.info("Delete cursors with slot greater than {}", slotHeight);
cursorStorage.deleteBySlotGreaterThan(eventPublisherId, slotHeight);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Sotatek-HuyLe3a I think we can use CursorService.rollbackTo(slot) method here instead of directly accessing CursorStorage.

long slotHeight = blockRepository.getSlotHeight().orElse(0L);

if (slotHeight == 0) {
if (!syncAutoStart) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Sotatek-HuyLe3a Yaci Store should automatically start this if autoSync is true. Do we need to handle it separately here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @satran004 , My intention is to start the Yaci Store service if syncStart is false, however, my current approach will make the autoSync property meaningless. I will fix it.

@satran004
Copy link
Member

@Sotatek-HuyLe3a Few more questions.

  1. For regular synchronization (after a full sync), is there any impact during a rollback? Or does it not matter because the batch size is changed to 1 during regular sync? Should we explicitly set the batch size to 1 when eventMetadata.isSync() becomes true?

  2. What about rollback during a restart? Will it be handled properly during batch processing? This is because, during a restart, yaci-store sends a rollback event for the last block.

@Sotatek-HuyLe3a
Copy link
Collaborator Author

Sotatek-HuyLe3a commented Nov 10, 2023

@Sotatek-HuyLe3a Few more questions.

  1. For regular synchronization (after a full sync), is there any impact during a rollback? Or does it not matter because the batch size is changed to 1 during regular sync? Should we explicitly set the batch size to 1 when eventMetadata.isSync() becomes true?
  2. What about rollback during a restart? Will it be handled properly during batch processing? This is because, during a restart, yaci-store sends a rollback event for the last block.
  1. With the current logic, we do not always consume based on a fixed batch size. When crawling to the tip, we will consume one by one because we have condition (BlockEventListener.java): "lastReceivedTimeElapsed >= commitThreshold."
    if (currentBlockCount % batchSize == 0 || lastReceivedTimeElapsed >= commitThreshold) { blockSyncService.startBlockSyncing(); blockCount.set(0); }

However, since we can start BlockSync when we are close to the tip, and I think there's no need to set the batch size value to 1, we can add one more condition:
if (currentBlockCount % batchSize == 0 || lastReceivedTimeElapsed >= commitThreshold || eventMetadata.isSyncMode()) { blockSyncService.startBlockSyncing(); blockCount.set(0); }

2.If we process a batch and encounter a failure, the data will be rolled back, except for the data in the cursor table, and we handled this issue when ledger-sync restart. Each time there is a restart, Yaci Store publishes a rollback event to the last block as a way to clean the data and I think this does not have any impact on batch processing.

@satran004
Copy link
Member

Thanks @Sotatek-HuyLe3a

if (currentBlockCount % batchSize == 0 || lastReceivedTimeElapsed >= commitThreshold || eventMetadata.isSyncMode()) { blockSyncService.startBlockSyncing(); blockCount.set(0); }

Makes sense. Can you please change the code accordingly? Then we can merge this PR. Thanks

@satran004 satran004 changed the base branch from main to develop November 16, 2023 07:48
@satran004 satran004 self-requested a review November 16, 2023 07:48
@satran004 satran004 merged commit a5daaa7 into develop Nov 16, 2023
1 check passed
@satran004 satran004 deleted the batch branch November 16, 2023 07:50
@satran004 satran004 mentioned this pull request Nov 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants