Integrate Engine with decoupled Translog interfaces #3671

Merged
merged 7 commits into from
Jun 29, 2022

Conversation

Bukhtawar
Collaborator

@Bukhtawar Bukhtawar commented Jun 23, 2022

Signed-off-by: Bukhtawar Khan bukhtawa@amazon.com

Description

This PR aims to integrate the Engine with the decoupled Translog interfaces introduced in #3638. It is the second part of introducing new interfaces and implementations as part of the decoupling. The overall proposal can be reviewed at #3471.
It breaks the main decoupling PR into smaller pieces as part of #3471 (pluggable translog work).

Issues Resolved

#3241

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>
@Bukhtawar Bukhtawar requested review from a team and reta as code owners June 23, 2022 19:37
@Bukhtawar Bukhtawar marked this pull request as draft June 23, 2022 19:37
@Bukhtawar Bukhtawar changed the title Integrate Engine with decouple Translog interfaces WIP: Integrate Engine with decoupled Translog interfaces Jun 23, 2022
Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>
@Bukhtawar Bukhtawar changed the title WIP: Integrate Engine with decoupled Translog interfaces Integrate Engine with decoupled Translog interfaces Jun 24, 2022
@Bukhtawar Bukhtawar marked this pull request as ready for review June 24, 2022 12:10
@opensearch-ci-bot
Collaborator

✅   Gradle Check success 20772d8
Log 6295

Reports 6295

@Bukhtawar Bukhtawar requested a review from nknize June 24, 2022 12:43
@Bukhtawar Bukhtawar added the v3.0.0 Issues and PRs related to version 3.0.0 label Jun 24, 2022
Comment on lines 286 to 297
translogManagerRef = new InternalTranslogManager(
    engineConfig.getTranslogConfig(),
    engineConfig.getPrimaryTermSupplier(),
    engineConfig.getGlobalCheckpointSupplier(),
    translogDeletionPolicy,
    shardId,
    readLock,
    () -> getLocalCheckpointTracker(),
    translogUUID,
    new CompositeTranslogEventListener(Arrays.asList(internalTranslogEventListener, translogEventListener)),
    this::ensureOpen
);
Collaborator Author

Contemplating introducing a TranslogFactory
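
For illustration only, here is a minimal sketch of what such a factory could look like. Every type below (SketchTranslogManager, SketchInternalTranslogManager, SketchTranslogFactory, SketchEngine) is a hypothetical stand-in, not an OpenSearch class, and the two constructor arguments are a reduced subset picked for brevity. The point is only that the engine would ask a factory for its manager instead of calling a concrete constructor.

```java
import java.util.function.LongSupplier;

// Simplified stand-ins for illustration; these are NOT the real OpenSearch types.
interface SketchTranslogManager {
    String describe();
}

final class SketchInternalTranslogManager implements SketchTranslogManager {
    private final String translogUUID;
    private final LongSupplier primaryTermSupplier;

    SketchInternalTranslogManager(String translogUUID, LongSupplier primaryTermSupplier) {
        this.translogUUID = translogUUID;
        this.primaryTermSupplier = primaryTermSupplier;
    }

    @Override
    public String describe() {
        return "internal[" + translogUUID + ", term=" + primaryTermSupplier.getAsLong() + "]";
    }
}

// Hypothetical factory: hides which concrete manager gets built.
@FunctionalInterface
interface SketchTranslogFactory {
    SketchTranslogManager create(String translogUUID, LongSupplier primaryTermSupplier);
}

// The engine depends only on the factory and the manager interface.
final class SketchEngine {
    private final SketchTranslogManager translogManager;

    SketchEngine(SketchTranslogFactory factory, String translogUUID, LongSupplier primaryTermSupplier) {
        this.translogManager = factory.create(translogUUID, primaryTermSupplier);
    }

    SketchTranslogManager translogManager() {
        return translogManager;
    }
}

final class TranslogFactoryDemo {
    public static void main(String[] args) {
        // A constructor reference satisfies the factory interface.
        SketchEngine engine = new SketchEngine(SketchInternalTranslogManager::new, "uuid-1", () -> 3L);
        System.out.println(engine.translogManager().describe());
    }
}
```

Swapping in a different manager would then only mean passing a different factory to the engine constructor.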

Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>
@opensearch-ci-bot
Collaborator

❌   Gradle Check failure a3d45e8aa43e96c8fba5329be196d2562d4dbda5
Log 6301

Reports 6301

@opensearch-ci-bot
Collaborator

✅   Gradle Check success dc1d0c7
Log 6302

Reports 6302

} catch (IOException e) {
    IOUtils.closeWhileHandlingException(store::decRef, readerManager);
    Translog translog = null;
    if (translogManagerRef != null) {
Collaborator

I am wondering if TranslogManager should implement Closeable as well, since it manages closeable resources (like Translog), wdyt?
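
A rough sketch of the suggestion, under the assumption that the manager is the sole owner of its translog. ClosableSketchTranslog and ClosableSketchTranslogManager are hypothetical stand-ins, not the real classes. Closing the manager closes the translog it owns, and a second close is a no-op.

```java
import java.io.Closeable;
import java.util.concurrent.atomic.AtomicBoolean;

// Stand-in for a closeable resource such as a translog; NOT the real class.
final class ClosableSketchTranslog implements Closeable {
    private final AtomicBoolean closed = new AtomicBoolean();

    @Override
    public void close() {
        closed.set(true);
    }

    boolean isClosed() {
        return closed.get();
    }
}

// Hypothetical manager that owns the translog and closes it together with itself.
final class ClosableSketchTranslogManager implements Closeable {
    private final ClosableSketchTranslog translog;
    private final AtomicBoolean closed = new AtomicBoolean();

    ClosableSketchTranslogManager(ClosableSketchTranslog translog) {
        this.translog = translog;
    }

    @Override
    public void close() {
        // compareAndSet makes repeated close() calls harmless.
        if (closed.compareAndSet(false, true)) {
            translog.close();
        }
    }
}

final class CloseableManagerDemo {
    public static void main(String[] args) {
        ClosableSketchTranslog translog = new ClosableSketchTranslog();
        try (ClosableSketchTranslogManager manager = new ClosableSketchTranslogManager(translog)) {
            System.out.println("closed inside try: " + translog.isClosed());
        }
        System.out.println("closed after try: " + translog.isClosed());
    }
}
```

With this shape the engine's cleanup path could hand the manager to a close helper rather than reaching through it for the translog.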

Collaborator Author

That's a good point; however, there are multiple places where the Engine consumes the underlying Translog directly, which I plan to get rid of in subsequent PRs as part of moving Translog to a module (this PR would become too big to review otherwise).

Once that is complete, I will make the suggested change.

Collaborator Author

Opened #3709

@Bukhtawar Bukhtawar requested review from reta, mch2 and kartg June 25, 2022 14:30
 @Override
 public IndexResult index(Index index) throws IOException {
     ensureOpen();
     IndexResult indexResult = new IndexResult(index.version(), index.primaryTerm(), index.seqNo(), false);
-    final Translog.Location location = translog.add(new Translog.Index(index, indexResult));
+    final Translog.Location location = translogManager.getTranslog(false).add(new Translog.Index(index, indexResult));
Collaborator

It was not clear to me in the first place why getTranslog would take a boolean argument to check whether the engine is open: in all flows this value is set to false (please correct me if I am wrong). It seems that only tests pass true (and they could use a different assertion for that). I would suggest cleaning up the interface by removing this confusing parameter.

Collaborator Author

Thanks @reta, I will make the change. Let me know if you have more comments and I can try to address them in the next revision.

Collaborator

Thanks @Bukhtawar, I went through it a few times and have nothing else to comment on.

Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>
@Bukhtawar
Collaborator Author

start gradle check

@Bukhtawar Bukhtawar requested a review from reta June 28, 2022 13:04
@@ -93,7 +93,7 @@ public boolean shouldRollTranslogGeneration() {
     public void trimOperationsFromTranslog(long belowTerm, long aboveSeqNo) throws TranslogException {}

     @Override
-    public Translog getTranslog(boolean ensureOpen) {
+    public Translog getTranslog() {
Collaborator

FYI (new change set): the NoOpTranslogManager is effectively unusable, since every single caller assumes that getTranslog() will never return null. Hopefully NoOpTranslogManager will never be used, but you may want to think about introducing a NoOpTranslog.

Collaborator Author

@Bukhtawar Bukhtawar Jun 28, 2022

That's correct; the NoOpTranslog is already being worked on as part of #3600.
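
To make the NoOpTranslog idea concrete, a minimal null-object sketch follows. It is not the design from #3600. The interface NullObjectSketchTranslog and its two methods are hypothetical simplifications, and the convention that -1 means "not stored" is an assumption of this sketch. What it shows is that callers can keep chaining on getTranslog() without null checks.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the translog contract; NOT the real OpenSearch Translog API.
interface NullObjectSketchTranslog {
    long add(String operation); // returns a location; -1 means "not stored" (assumed convention)

    int totalOperations();
}

// A trivial real implementation, for comparison.
final class InMemorySketchTranslog implements NullObjectSketchTranslog {
    private final List<String> operations = new ArrayList<>();

    @Override
    public long add(String operation) {
        operations.add(operation);
        return operations.size() - 1;
    }

    @Override
    public int totalOperations() {
        return operations.size();
    }
}

// Hypothetical null object: accepts every call and stores nothing.
final class NoOpSketchTranslog implements NullObjectSketchTranslog {
    @Override
    public long add(String operation) {
        return -1L;
    }

    @Override
    public int totalOperations() {
        return 0;
    }
}

final class NoOpSketchTranslogManager {
    private final NullObjectSketchTranslog translog = new NoOpSketchTranslog();

    // Never returns null, so existing call chains keep working.
    NullObjectSketchTranslog getTranslog() {
        return translog;
    }
}

final class NoOpTranslogDemo {
    public static void main(String[] args) {
        NoOpSketchTranslogManager manager = new NoOpSketchTranslogManager();
        long location = manager.getTranslog().add("index doc-1");
        System.out.println(location + " " + manager.getTranslog().totalOperations());
    }
}
```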

@Bukhtawar
Collaborator Author

start gradle check

Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>
Collaborator

@nknize nknize left a comment

Thanks for doing this! It's great to see the decoupling work begin. I gave it a quick review. I noticed a lot of this is cosmetic so I just have some validating questions.

One general question. Is the point of having TranslogManager as an interface because we want to be able to create other translog managers (e.g., kafka manager)?

@@ -167,6 +168,8 @@ public final EngineConfig config() {
         return engineConfig;
     }

+    public abstract TranslogManager translogManager();
Collaborator

🎉 Great to see this decoupling

@@ -308,7 +334,9 @@ public InternalEngine(EngineConfig engineConfig) {
     }
     this.lastRefreshedCheckpointListener = new LastRefreshedCheckpointListener(localCheckpointTracker.getProcessedCheckpoint());
     this.internalReaderManager.addListener(lastRefreshedCheckpointListener);
-    maxSeqNoOfUpdatesOrDeletes = new AtomicLong(SequenceNumbers.max(localCheckpointTracker.getMaxSeqNo(), translog.getMaxSeqNo()));
+    maxSeqNoOfUpdatesOrDeletes = new AtomicLong(
+        SequenceNumbers.max(localCheckpointTracker.getMaxSeqNo(), translogManager.getTranslog().getMaxSeqNo())
Collaborator

Not a big fan of this indirection. I think moving utility methods like getMaxSeqNo directly onto the TranslogManager, instead of invoking getTranslog everywhere, would be cleaner?
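
As a sketch of that suggestion (with hypothetical stand-in classes, not the real Translog or TranslogManager), the manager could expose getMaxSeqNo itself and delegate internally, so that engine code never needs the getTranslog() hop:

```java
// Stand-in translog that only tracks the highest sequence number it has seen.
final class DelegatingSketchTranslog {
    private long maxSeqNo = -1L;

    void add(long seqNo) {
        maxSeqNo = Math.max(maxSeqNo, seqNo);
    }

    long getMaxSeqNo() {
        return maxSeqNo;
    }
}

// Hypothetical manager that exposes the utility method directly.
final class DelegatingSketchTranslogManager {
    private final DelegatingSketchTranslog translog = new DelegatingSketchTranslog();

    void add(long seqNo) {
        translog.add(seqNo);
    }

    // Delegation keeps the translog instance private to the manager.
    long getMaxSeqNo() {
        return translog.getMaxSeqNo();
    }
}

final class DelegationDemo {
    // Mirrors the shape of the engine call site, using plain Math.max for the sketch.
    static long initialMaxSeqNo(long localCheckpointMaxSeqNo, DelegatingSketchTranslogManager manager) {
        return Math.max(localCheckpointMaxSeqNo, manager.getMaxSeqNo());
    }

    public static void main(String[] args) {
        DelegatingSketchTranslogManager manager = new DelegatingSketchTranslogManager();
        manager.add(5L);
        manager.add(2L);
        System.out.println(initialMaxSeqNo(3L, manager));
    }
}
```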

Collaborator

Update: I just noticed the previous review comments. Looks like you're planning to decouple this further in follow-up PRs. So disregard.

@@ -759,20 +761,20 @@ public long getProcessedCheckpoint() {
     }

     public void testFlushIsDisabledDuringTranslogRecovery() throws IOException {
-        engine.ensureCanFlush(); // recovered already
+        engine.translogManager().ensureCanFlush(); // recovered already
Collaborator

Is this indirection to translogManager() temporary and planned to be cleaned up in the follow-on PRs?

@Bukhtawar
Collaborator Author

Bukhtawar commented Jun 28, 2022

One general question. Is the point of having TranslogManager as an interface because we want to be able to create other translog managers (e.g., kafka manager)?

Thanks @nknize
The idea was to free Engine of the translog responsibilities and create various flavours, such as NoOpTranslogManager, to be able to create an Engine which doesn't support translog operations, and WriteOnlyTranslogManager for NRTReplicaEngine.
The first part of the PR was #3638, where we introduced newer interfaces. For remote extensions(kafka) we have #3242 which is being worked upon. The idea there is to further decouple TranslogWriter and TranslogReader from their underlying store. I will be raising that PR soon.

@peterzhuamazon
Member

Hi @Bukhtawar
The Gradle check is fixed. Sorry for the confusion.
I rebased your branch and pushed, so the PR is re-running checks now. Thanks.

@Bukhtawar Bukhtawar requested a review from nknize June 28, 2022 18:35
@nknize
Collaborator

nknize commented Jun 28, 2022

Is the point of having TranslogManager as an interface because we want to be able to create other translog managers (e.g., kafka manager)?

The idea was to free Engine of the translog responsibilities and create various flavours

so, yes? 😄

For remote extensions(kafka) we have #3242 which is being worked upon.

I presume this will create a new KafkaTranslogManager?

I'm trying to glean from this incremental PR if the implementation design is to have 1:1 concrete TranslogManager implementations per translog mechanism. So if a user wants to use Kafka as the Translog we, OpenSearch, would provide a KafkaTranslogManager for handling the kafka specific logic?

Member

@mch2 mch2 left a comment

Have only a nit, lgtm.


@Override
public void onTragicFailure(AlreadyClosedException ex) {
    failOnTragicEvent(ex);
Member

This is a nit: I'm wondering whether we should leave it up to the engine to determine if the error is tragic, instead of explicitly catching AlreadyClosedException in InternalTranslogManager. Then we wouldn't need this separation in TranslogEventListener.
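
One way to read this nit, sketched with hypothetical stand-ins: the manager reports every failure through a single callback and the engine classifies it. SketchAlreadyClosedException stands in for Lucene's AlreadyClosedException so the block stays self-contained; the listener, engine and manager types are illustrative assumptions, not the PR's classes.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for Lucene's AlreadyClosedException, which also extends IllegalStateException.
final class SketchAlreadyClosedException extends IllegalStateException {
    SketchAlreadyClosedException(String message) {
        super(message);
    }
}

// Hypothetical listener with one failure callback instead of two.
interface SingleCallbackSketchListener {
    void onFailure(String reason, Exception ex);
}

// The engine, not the manager, decides whether a failure is tragic.
final class ClassifyingSketchEngine implements SingleCallbackSketchListener {
    final List<String> events = new ArrayList<>();

    @Override
    public void onFailure(String reason, Exception ex) {
        if (ex instanceof SketchAlreadyClosedException) {
            events.add("tragic: " + reason);
        } else {
            events.add("failed: " + reason);
        }
    }
}

// The manager forwards whatever went wrong without inspecting its type.
final class ReportingSketchTranslogManager {
    private final SingleCallbackSketchListener listener;

    ReportingSketchTranslogManager(SingleCallbackSketchListener listener) {
        this.listener = listener;
    }

    void run(Runnable translogOperation, String reason) {
        try {
            translogOperation.run();
        } catch (RuntimeException ex) {
            listener.onFailure(reason, ex);
        }
    }
}

final class TragicFailureDemo {
    public static void main(String[] args) {
        ClassifyingSketchEngine engine = new ClassifyingSketchEngine();
        ReportingSketchTranslogManager manager = new ReportingSketchTranslogManager(engine);
        manager.run(() -> { throw new SketchAlreadyClosedException("translog is closed"); }, "sync");
        manager.run(() -> { throw new IllegalArgumentException("bad range"); }, "trim");
        System.out.println(engine.events);
    }
}
```

The trade-off is that the listener interface gets smaller while the engine takes on the knowledge of which exception types count as tragic.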

@Bukhtawar
Collaborator Author

Bukhtawar commented Jun 28, 2022

I'm trying to glean from this incremental PR if the implementation design is to have 1:1 concrete TranslogManager implementations per translog mechanism. So if a user wants to use Kafka as the Translog we, OpenSearch, would provide a KafkaTranslogManager for handling the kafka specific logic?

The idea is definitely to support streaming stores like Kafka and other blob stores to start with. At this point, honestly, I am still contemplating whether we should introduce a new low-level abstraction or continue with TranslogManager as the abstraction itself.

I'll shortly be putting forth a proposal and opening up a discussion.

/cc @nknize

Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>
@Bukhtawar
Collaborator Author

Will continue the discussion over on #3242. Merging this, as all feedback has been incorporated.

@Bukhtawar Bukhtawar merged commit 088e019 into opensearch-project:main Jun 29, 2022
imRishN pushed a commit to imRishN/OpenSearch that referenced this pull request Jul 3, 2022
…ct#3671)

* Integrate Engine with decoupled translog interface

Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>