Hit java.lang.IndexOutOfBoundsException in es log after recent Lucene9 snapshot upgrade #78785

Closed
wwang500 opened this issue Oct 6, 2021 · 7 comments · Fixed by #79461
Labels
>bug :Search/Search Search-related issues that do not fall into other categories Team:Search Meta label for search team

Comments

@wwang500

wwang500 commented Oct 6, 2021

Recently (after #78286 and #73324 were merged), one of the ML nightly tests failed with a java.lang.IndexOutOfBoundsException in the ES log.

Part of es.log:

[2021-10-05T19:17:57,811][INFO ][o.e.x.m.j.p.DataCountsReporter] [node-0] [nginx_lat_long_partition_1633461374_000_0] 4000000 records written to autodetect; missingFieldCount=0, invalidDateCount=0, outOfOrderCount=0
[2021-10-05T19:18:22,075][INFO ][o.e.x.m.j.p.DataCountsReporter] [node-0] [nginx_lat_long_partition_1633461374_000_0] 5000000 records written to autodetect; missingFieldCount=0, invalidDateCount=0, outOfOrderCount=0
[2021-10-05T19:18:48,627][INFO ][o.e.x.m.j.p.DataCountsReporter] [node-0] [nginx_lat_long_partition_1633461374_000_0] 6000000 records written to autodetect; missingFieldCount=0, invalidDateCount=0, outOfOrderCount=0
[2021-10-05T19:19:15,997][INFO ][o.e.x.m.j.p.DataCountsReporter] [node-0] [nginx_lat_long_partition_1633461374_000_0] 7000000 records written to autodetect; missingFieldCount=0, invalidDateCount=0, outOfOrderCount=0
[2021-10-05T19:19:40,138][INFO ][o.e.x.m.j.p.DataCountsReporter] [node-0] [nginx_lat_long_partition_1633461374_000_0] 8000000 records written to autodetect; missingFieldCount=0, invalidDateCount=0, outOfOrderCount=0
[2021-10-05T19:20:16,349][INFO ][o.e.x.m.j.p.DataCountsReporter] [node-0] [nginx_lat_long_partition_1633461374_000_0] 9000000 records written to autodetect; missingFieldCount=0, invalidDateCount=0, outOfOrderCount=0
[2021-10-05T19:20:33,866][WARN ][o.e.i.e.Engine           ] [node-0] [.ml-anomalies-custom-nginx][0] failed engine [merge failed]
org.apache.lucene.index.MergePolicy$MergeException: java.lang.IndexOutOfBoundsException
	at org.elasticsearch.index.engine.InternalEngine$EngineMergeScheduler$2.doRun(InternalEngine.java:2340) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:737) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.lang.Thread.run(Thread.java:833) [?:?]
Caused by: java.lang.IndexOutOfBoundsException
	at java.nio.Buffer.checkIndex(Buffer.java:749) ~[?:?]
	at java.nio.DirectByteBuffer.getInt(DirectByteBuffer.java:692) ~[?:?]
	at org.apache.lucene.store.ByteBufferGuard.getInt(ByteBufferGuard.java:128) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl.readInt(ByteBufferIndexInput.java:591) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.util.packed.DirectReader$DirectPackedReader20.get(DirectReader.java:222) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.util.packed.DirectMonotonicReader.get(DirectMonotonicReader.java:149) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$24.set(Lucene90DocValuesProducer.java:1356) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$24.docValueCount(Lucene90DocValuesProducer.java:1348) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$25.nextDoc(Lucene90DocValuesProducer.java:1405) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.codecs.DocValuesConsumer.mergeSortedSetField(DocValuesConsumer.java:837) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.codecs.DocValuesConsumer.merge(DocValuesConsumer.java:148) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsWriter.merge(PerFieldDocValuesFormat.java:154) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.index.SegmentMerger.mergeDocValues(SegmentMerger.java:168) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.index.SegmentMerger.lambda$merge$2(SegmentMerger.java:139) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.index.SegmentMerger.mergeWithLogging(SegmentMerger.java:273) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:139) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4964) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4500) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:6252) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:636) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.elasticsearch.index.engine.ElasticsearchConcurrentMergeScheduler.doMerge(ElasticsearchConcurrentMergeScheduler.java:113) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:697) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
[2021-10-05T19:20:33,870][WARN ][o.e.i.c.IndicesClusterStateService] [node-0] [.ml-anomalies-custom-nginx][0] marking and sending shard failed due to [shard failure, reason [merge failed]]
org.apache.lucene.index.MergePolicy$MergeException: java.lang.IndexOutOfBoundsException
	at org.elasticsearch.index.engine.InternalEngine$EngineMergeScheduler$2.doRun(InternalEngine.java:2340) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:737) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.lang.Thread.run(Thread.java:833) [?:?]
Caused by: java.lang.IndexOutOfBoundsException
	at java.nio.Buffer.checkIndex(Buffer.java:749) ~[?:?]
	at java.nio.DirectByteBuffer.getInt(DirectByteBuffer.java:692) ~[?:?]
	at org.apache.lucene.store.ByteBufferGuard.getInt(ByteBufferGuard.java:128) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl.readInt(ByteBufferIndexInput.java:591) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.util.packed.DirectReader$DirectPackedReader20.get(DirectReader.java:222) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.util.packed.DirectMonotonicReader.get(DirectMonotonicReader.java:149) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$24.set(Lucene90DocValuesProducer.java:1356) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$24.docValueCount(Lucene90DocValuesProducer.java:1348) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$25.nextDoc(Lucene90DocValuesProducer.java:1405) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.codecs.DocValuesConsumer.mergeSortedSetField(DocValuesConsumer.java:837) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.codecs.DocValuesConsumer.merge(DocValuesConsumer.java:148) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsWriter.merge(PerFieldDocValuesFormat.java:154) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.index.SegmentMerger.mergeDocValues(SegmentMerger.java:168) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.index.SegmentMerger.lambda$merge$2(SegmentMerger.java:139) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.index.SegmentMerger.mergeWithLogging(SegmentMerger.java:273) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:139) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4964) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4500) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:6252) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:636) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.elasticsearch.index.engine.ElasticsearchConcurrentMergeScheduler.doMerge(ElasticsearchConcurrentMergeScheduler.java:113) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:697) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
[2021-10-05T19:20:33,873][WARN ][o.e.c.r.a.AllocationService] [node-0] failing shard [failed shard, shard [.ml-anomalies-custom-nginx][0], node[Uh4S5-63QhCcM5kOXQXplw], [P], s[STARTED], a[id=7jkjMTUpSXKM9Pz1zQjBpQ], message [shard failure, reason [merge failed]], markAsStale [true], failure [org.apache.lucene.index.MergePolicy$MergeException: java.lang.IndexOutOfBoundsException
	at org.elasticsearch.index.engine.InternalEngine$EngineMergeScheduler$2.doRun(InternalEngine.java:2340)
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:737)
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.lang.IndexOutOfBoundsException
	at java.base/java.nio.Buffer.checkIndex(Buffer.java:749)
	at java.base/java.nio.DirectByteBuffer.getInt(DirectByteBuffer.java:692)
	at org.apache.lucene.store.ByteBufferGuard.getInt(ByteBufferGuard.java:128)
	at org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl.readInt(ByteBufferIndexInput.java:591)
	at org.apache.lucene.util.packed.DirectReader$DirectPackedReader20.get(DirectReader.java:222)
	at org.apache.lucene.util.packed.DirectMonotonicReader.get(DirectMonotonicReader.java:149)
	at org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$24.set(Lucene90DocValuesProducer.java:1356)
	at org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$24.docValueCount(Lucene90DocValuesProducer.java:1348)
	at org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$25.nextDoc(Lucene90DocValuesProducer.java:1405)
	at org.apache.lucene.codecs.DocValuesConsumer.mergeSortedSetField(DocValuesConsumer.java:837)
	at org.apache.lucene.codecs.DocValuesConsumer.merge(DocValuesConsumer.java:148)
	at org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsWriter.merge(PerFieldDocValuesFormat.java:154)
	at org.apache.lucene.index.SegmentMerger.mergeDocValues(SegmentMerger.java:168)
	at org.apache.lucene.index.SegmentMerger.lambda$merge$2(SegmentMerger.java:139)
	at org.apache.lucene.index.SegmentMerger.mergeWithLogging(SegmentMerger.java:273)
	at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:139)
	at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4964)
	at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4500)
	at org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:6252)
	at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:636)
	at org.elasticsearch.index.engine.ElasticsearchConcurrentMergeScheduler.doMerge(ElasticsearchConcurrentMergeScheduler.java:113)
	at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:697)
]]
org.apache.lucene.index.MergePolicy$MergeException: java.lang.IndexOutOfBoundsException
	at org.elasticsearch.index.engine.InternalEngine$EngineMergeScheduler$2.doRun(InternalEngine.java:2340) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:737) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.lang.Thread.run(Thread.java:833) [?:?]
Caused by: java.lang.IndexOutOfBoundsException
	at java.nio.Buffer.checkIndex(Buffer.java:749) ~[?:?]
	at java.nio.DirectByteBuffer.getInt(DirectByteBuffer.java:692) ~[?:?]
	at org.apache.lucene.store.ByteBufferGuard.getInt(ByteBufferGuard.java:128) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl.readInt(ByteBufferIndexInput.java:591) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.util.packed.DirectReader$DirectPackedReader20.get(DirectReader.java:222) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.util.packed.DirectMonotonicReader.get(DirectMonotonicReader.java:149) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$24.set(Lucene90DocValuesProducer.java:1356) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$24.docValueCount(Lucene90DocValuesProducer.java:1348) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$25.nextDoc(Lucene90DocValuesProducer.java:1405) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.codecs.DocValuesConsumer.mergeSortedSetField(DocValuesConsumer.java:837) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.codecs.DocValuesConsumer.merge(DocValuesConsumer.java:148) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsWriter.merge(PerFieldDocValuesFormat.java:154) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.index.SegmentMerger.mergeDocValues(SegmentMerger.java:168) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.index.SegmentMerger.lambda$merge$2(SegmentMerger.java:139) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.index.SegmentMerger.mergeWithLogging(SegmentMerger.java:273) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:139) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4964) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4500) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:6252) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:636) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
	at org.elasticsearch.index.engine.ElasticsearchConcurrentMergeScheduler.doMerge(ElasticsearchConcurrentMergeScheduler.java:113) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:697) ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
[2021-10-05T19:20:33,876][INFO ][o.e.c.r.a.AllocationService] [node-0] current.health="RED" message="Cluster health status changed from [GREEN] to [RED] (reason: [shards failed [[.ml-anomalies-custom-nginx][0]]])." previous.health="GREEN" reason="shards failed [[.ml-anomalies-custom-nginx][0]]"
[2021-10-05T19:21:08,216][INFO ][o.e.c.r.a.AllocationService] [node-0] current.health="GREEN" message="Cluster health status changed from [RED] to [GREEN] (reason: [shards started [[.ml-anomalies-custom-nginx][0]]])." previous.health="RED" reason="shards started [[.ml-anomalies-custom-nginx][0]]"
[2021-10-05T19:21:09,595][ERROR][o.e.x.m.j.p.n.ShortCircuitingRenormalizer] [node-0] [nginx_lat_long_partition_1633461374_000_0] Normalization failed

It happened only on the 8.0 branch.
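
For readers following the trace: the innermost frames (`ByteBufferIndexInput$SingleBufferImpl.readInt` → `DirectByteBuffer.getInt` → `Buffer.checkIndex`) show an absolute 4-byte read landing outside the mapped buffer. The sketch below only illustrates that failure mode with plain `java.nio`. It is not Lucene code and does not reproduce the bug; the class name, method names, and the 16-byte buffer size are assumptions made for illustration.

```java
import java.nio.ByteBuffer;

public class BufferBoundsSketch {
    /** Reads a 4-byte int at an absolute offset, like the getInt frame in the trace. */
    static int readIntAt(ByteBuffer buffer, int offset) {
        return buffer.getInt(offset);
    }

    /** Returns true if an absolute 4-byte read at the offset falls outside the buffer. */
    static boolean readFails(ByteBuffer buffer, int offset) {
        try {
            readIntAt(buffer, offset);
            return false;
        } catch (IndexOutOfBoundsException e) {
            // Same exception type as the root cause in the log above.
            return true;
        }
    }

    public static void main(String[] args) {
        // Hypothetical 16-byte buffer standing in for a memory-mapped index file.
        ByteBuffer buffer = ByteBuffer.allocateDirect(16);
        System.out.println(readFails(buffer, 12)); // bytes 12..15 are inside the buffer
        System.out.println(readFails(buffer, 13)); // bytes 13..16 run past the limit
    }
}
```

In the trace the offset comes from `DirectMonotonicReader.get`, so an offset past the end of the file suggests that the stored data and the reader disagree about the layout, which is consistent with the corrupt-segment hypothesis discussed below.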

@wwang500 wwang500 added >bug :Search/Search Search-related issues that do not fall into other categories labels Oct 6, 2021
@elasticmachine elasticmachine added the Team:Search Meta label for search team label Oct 6, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

@jpountz
Contributor

jpountz commented Oct 7, 2021

@wwang500 Thanks for reporting this. It's bad that Lucene can corrupt indices like that. Do you know when you first saw this bug? Was it only after the latest Lucene snapshot upgrade?

@droberts195
Contributor

Wei told me that this was first seen in ML QA build 9772 (that's on an internal Jenkins server), which was built from Elasticsearch commit 5964ffe7df6841.

So the problem pre-dates #78286, and was most likely introduced by #73324.

@jpountz
Contributor

jpountz commented Oct 9, 2021

I opened https://issues.apache.org/jira/browse/LUCENE-10159 since this is most likely a Lucene issue.

@jpountz
Contributor

jpountz commented Oct 11, 2021

Would it be possible to get a zip file containing the index that triggered this exception? This would allow us to introspect the content of the index files to better understand what went wrong with this corrupt segment.

@wwang500
Author

> Would it be possible to get a zip file containing the index that triggered this exception? This would allow us to introspect the content of the index files to better understand what went wrong with this corrupt segment.

Zip files have been sent to @dnhatn.
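
For anyone who needs to package an index directory in a similar way, here is a minimal sketch using only the JDK's `java.util.zip`. It is not the procedure used in this thread, which is not described. The class name, method name, and command-line arguments are hypothetical, and the sketch assumes nothing is writing to the directory while it is copied.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class IndexZipSketch {
    /** Zips every regular file under indexDir into zipFile; returns the number of entries written. */
    static int zipDirectory(Path indexDir, Path zipFile) throws IOException {
        // Collect the file list first so the archive itself is never picked up by the walk.
        List<Path> files;
        try (Stream<Path> walk = Files.walk(indexDir)) {
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        try (OutputStream out = Files.newOutputStream(zipFile);
             ZipOutputStream zip = new ZipOutputStream(out)) {
            for (Path file : files) {
                // Entry names are relative to the index directory and use '/' separators.
                String name = indexDir.relativize(file).toString().replace('\\', '/');
                zip.putNextEntry(new ZipEntry(name));
                Files.copy(file, zip);
                zip.closeEntry();
            }
        }
        return files.size();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical usage: java IndexZipSketch <index directory> <output zip>
        int count = zipDirectory(Path.of(args[0]), Path.of(args[1]));
        System.out.println("zipped " + count + " files");
    }
}
```

Copying the files byte for byte matters here, since the goal is to inspect the segment exactly as it exists on disk.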

@dnhatn
Member

dnhatn commented Oct 17, 2021

I've opened apache/lucene#389.


5 participants