Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Segment Replication] Update getSegmentInfosSnapshot() logic to return SegmentInfos with highest generation number #4288

Closed
wants to merge 3 commits into from

Conversation

dreamer-89
Copy link
Member

@dreamer-89 dreamer-89 commented Aug 23, 2022

Signed-off-by: Suraj Singh surajrider@gmail.com

Description

The PR updates the logic getSegmentInfosSnapshot() to return SegmentInfos copy (In memory Vs On-disk) with highest generation number. The logic is copied from #4178 (comment) with minor tweaks and unit tests.

Issues Resolved

#4178

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@dreamer-89 dreamer-89 requested review from a team and reta as code owners August 23, 2022 22:17
@dreamer-89 dreamer-89 requested a review from mch2 August 23, 2022 22:17
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

/**
* Returns generation number (_N in segment_N) from last commit on disk.
*/
public long getLastCommitGeneration() throws IOException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this method on the store? can we call

SegmentInfos.getLastCommitGeneration(store.directory());

directly from InternalEngine?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created new method for consistency purpose as to keep disk(store) related calls on Store

SegmentInfos sisDisk = new SegmentInfos(org.apache.lucene.util.Version.LATEST.major);
sisDisk.setNextWriteGeneration(memoryGen + 1);

when(store.getLastCommitGeneration()).thenReturn(sisDisk.getGeneration());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than mocking here, you should be able to invoke SegmentInfos.commit and create a new _N file with a higher generation.

Copy link
Member Author

@dreamer-89 dreamer-89 Aug 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for sharing this, I wasn't aware of this method.

I tried SegmentInfos.commit but it also updated the in-memory copy of SegmentInfos i.e. both sisDisk & sisInMemory have same generation number. So, for tests, need to use mocks to make on-disk copy to have higher segment generation.

@github-actions

This comment was marked as outdated.

@github-actions

This comment was marked as outdated.

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@codecov-commenter

This comment was marked as outdated.


// This method simulates behaviour of on-disk SegmentInfos having lower segment generation number compared to
// actual in-memory SegmentInfos and verified that engine.getSegmentInfosSnapshot returns in-memory SegmentInfos
public void testGetSegmentInfosWithHighestGenInMemory() throws IOException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-> This test case will have both in-memory and on-disk segmentInfos on same segment generation number but not lower on on-disk as it is mentioned (or purpose of this test case). In this test case we are not incrementing or decrementing segment generation of either on-disk on in-memory, so both will be same. So we are not testing the case here that we actually want to.

-> Is this test really necessary? Will there be a scenario ever with on-disk segmentInfos having lower segment generation number than in-memory ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Looks like comment is outdated. Updated.
  2. Yes, this is useful to ensure in memory copy gets picked up (when segment generation is higher/equal compared to on-disk copy's segment generation). Yes, this scenario is happening during replica promotion.

…n SegmentInfos having highest generation number

Signed-off-by: Suraj Singh <surajrider@gmail.com>
Signed-off-by: Suraj Singh <surajrider@gmail.com>
… number as on-disk copy

Signed-off-by: Suraj Singh <surajrider@gmail.com>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@dreamer-89
Copy link
Member Author

dreamer-89 commented Aug 29, 2022

Inconsistent on-disk & in-memory generation number in SegmentInfos is not expected state. Need to think through the problem again and identify cause of mis-match in on-disk & in-memory copies of SegmentInfos. If there is a way to identify mis-match (and possibly fix it) would be the ideal solution here. Converting it to draft for now.

CC @mch2

@dreamer-89 dreamer-89 closed this Sep 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants