Do not create engine under IndexShard#mutex #45263

dnhatn · 2019-08-06T22:30:06Z

Today we create a new engine under IndexShard#mutex. This is not ideal because it can block cluster state updates, which require the same mutex. The constructor of InternalEngine can take some time because it implicitly builds the global ordinals if the eager_global_ordinals setting is true. To solve this problem, I explored two options:

1. Move the expensive stuff from the Engine's constructor to the start method, and call start outside IndexShard#mutex. See https://github.com/elastic/elasticsearch/compare/master...dnhatn:start-engine?expand=1
2. Create engines under a new mutex in IndexShard

I prefer the second approach because it is more contained.

Closes #43699

elasticmachine · 2019-08-06T22:30:09Z

Pinging @elastic/es-distributed

henningandersen · 2019-08-08T06:38:36Z

I also prefer the second approach, since snapshotting metadata will soon need to ask for things like the max seqno and global checkpoint, which with the first approach might not be initialized yet even though we already have an engine. I am thinking about this work (that is still to be done): https://github.com/elastic/elasticsearch/pull/42518/files#diff-49e1a1b834b522f4ae6997c5defe9eb0R1243.

s1monw · 2019-08-08T12:01:39Z

@dnhatn why can't we create an engine under a different lock just before we acquire the mutex and do some state checks? I don't think we should make such a drastic change to the concurrency model just because a ctor can be heavy. You can protect against creating two engines with a separate lock, and then, once the engine is created, do the state check and change the reference. Something like this:

diff --git a/server/src/main/java/org/elasticsearch/index/shard/IndexShard.java b/server/src/main/java/org/elasticsearch/index/shard/IndexShard.java
index a98f501946b..4897792223b 100644
--- a/server/src/main/java/org/elasticsearch/index/shard/IndexShard.java
+++ b/server/src/main/java/org/elasticsearch/index/shard/IndexShard.java
@@ -1593,16 +1593,26 @@ public class IndexShard extends AbstractIndexShardComponent implements IndicesCl
         assert recoveryState.getRecoverySource().expectEmptyRetentionLeases() == false || getRetentionLeases().leases().isEmpty()
             : "expected empty set of retention leases with recovery source [" + recoveryState.getRecoverySource()
             + "] but got " + getRetentionLeases();
-        synchronized (mutex) {
-            verifyNotClosed();
-            assert currentEngineReference.get() == null : "engine is running";
-            // we must create a new engine under mutex (see IndexShard#snapshotStoreMetadata).
+        synchronized (this) {
             final Engine newEngine = engineFactory.newReadWriteEngine(config);
-            onNewEngine(newEngine);
-            currentEngineReference.set(newEngine);
-            // We set active because we are now writing operations to the engine; this way,
-            // if we go idle after some time and become inactive, we still give sync'd flush a chance to run.
-            active.set(true);
+            boolean success = false;
+            try {
+                synchronized (mutex) {
+                    verifyNotClosed();
+                    assert currentEngineReference.get() == null : "engine is running";
+                    // we must create a new engine under mutex (see IndexShard#snapshotStoreMetadata).
+                    onNewEngine(newEngine);
+                    currentEngineReference.set(newEngine);
+                    // We set active because we are now writing operations to the engine; this way,
+                    // if we go idle after some time and become inactive, we still give sync'd flush a chance to run.
+                    active.set(true);
+                    success = true;
+                }
+            } finally {
+                if (success == false) {
+                    newEngine.close();
+                }
+            }
         }
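
Stripped of the IndexShard specifics, the pattern in the diff above can be sketched as a small standalone class. This is only an illustration under stated assumptions: `ShardSketch`, `FakeEngine`, and `engineMutex` are hypothetical names, not Elasticsearch classes, and the sketch uses a dedicated lock object where the diff synchronizes on `this`.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for IndexShard; all names here are illustrative only.
class ShardSketch {
    // Stand-in for an engine whose constructor may be expensive.
    static final class FakeEngine implements AutoCloseable {
        volatile boolean closed = false;

        @Override
        public void close() {
            closed = true;
        }
    }

    private final Object mutex = new Object();       // guards shard state; also needed by state updates
    private final Object engineMutex = new Object(); // only serializes engine creation
    private final AtomicReference<FakeEngine> currentEngine = new AtomicReference<>();
    private boolean closed = false;                  // guarded by mutex

    FakeEngine openEngine() {
        synchronized (engineMutex) {
            // The potentially slow constructor runs without holding mutex.
            final FakeEngine newEngine = new FakeEngine();
            boolean success = false;
            try {
                synchronized (mutex) {
                    // State checks and publication are short, so mutex is held only briefly.
                    if (closed) {
                        throw new IllegalStateException("shard is closed");
                    }
                    if (currentEngine.get() != null) {
                        throw new IllegalStateException("engine is running");
                    }
                    currentEngine.set(newEngine);
                    success = true;
                }
                return newEngine;
            } finally {
                // If a state check failed, release the engine we created.
                if (success == false) {
                    newEngine.close();
                }
            }
        }
    }

    void close() {
        synchronized (mutex) {
            closed = true;
        }
    }

    FakeEngine current() {
        return currentEngine.get();
    }
}
```

The outer lock prevents two engines from being created concurrently, while the inner block keeps the state check and the reference swap atomic with respect to anything else that takes `mutex`.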

dnhatn · 2019-08-08T22:32:05Z

I started with your suggestion, but then I dropped it. I was worried about deadlock because we need to ensure a consistent lock ordering. I've applied your suggestion in bd56da8. Can you please take another look? Thank you.
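
To make the lock-ordering concern concrete, here is a minimal sketch, assuming two hypothetical locks named `engineMutex` and `mutex` (the class and method names are illustrative and not taken from the commit). Deadlock is avoided as long as every code path takes the outer lock before the inner one; `Thread.holdsLock` can be used to fail fast when a caller violates that order.

```java
// Hypothetical illustration of a fixed lock order: engineMutex first, then mutex.
class LockOrderSketch {
    private final Object mutex = new Object();
    private final Object engineMutex = new Object();

    // Runs the action while holding both locks, acquired in the agreed order.
    void withEngineThenMutex(Runnable action) {
        // Acquiring engineMutex while already holding mutex would invert the order
        // and could deadlock against a thread that follows the agreed order.
        if (Thread.holdsLock(mutex)) {
            throw new IllegalStateException("engineMutex must be acquired before mutex");
        }
        synchronized (engineMutex) {
            synchronized (mutex) {
                action.run();
            }
        }
    }

    // Runs the action while holding only mutex, as a state update would.
    void withMutex(Runnable action) {
        synchronized (mutex) {
            action.run();
        }
    }
}
```

A check like this only catches violations on the calling thread at runtime; it does not prove the absence of deadlocks, so the ordering still has to be respected by convention at every call site.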

henningandersen

I added a couple of comments to consider. I think this can work out, but I need to ponder it a little.

henningandersen · 2019-08-09T13:03:49Z