-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Address mapping and compute engine runtime field issues #117792
Conversation
This change attempts to solve the following issue: ValueSourceOperator can read values in row striding or columnar fashion. When values are read in columnar fashion and multiple runtime fields synthetize source then this can cause the same `SourceProvider` evaluation the same range of docs ids multiple times. This can then result in unexpected io errors at the codec level. This is because the same doc value instances are used by `SourceProvider`. Re-evaluating the same docids is in violation of the contract of the DocIdSetIterator#advance(...) / DocIdSetIterator#advanceExact(...) methods, which documents that unexpected behaviour can occur if target docid is lower than current docid position. Note that this is only an issue for synthetic source loader and not for stored source loader. And not when executing in row stride fashion which sometimes happen in compute engine and always happen in _search api. Closes elastic#117644
@@ -108,7 +108,12 @@ public DocValueFormat docValueFormat(String format, ZoneId timeZone) { | |||
|
|||
@Override | |||
public BlockLoader blockLoader(BlockLoaderContext blContext) { | |||
return new LongScriptBlockDocValuesReader.LongScriptBlockLoader(leafFactory(blContext.lookup())); | |||
boolean isSyntheticSource = SourceFieldMapper.isSynthetic(blContext.indexSettings()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we have two different implementations based on source mode?
|
||
@Override | ||
public final StoredFieldsSpec rowStrideStoredFieldSpec() { | ||
return StoredFieldsSpec.NEEDS_SOURCE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is needed because otherwise the follow error occurs:
[2024-12-03T04:52:11,002][WARN ][o.e.x.e.a.EsqlResponseListener] [test-cluster-0] Request failed with status [INTERNAL_SERVER_ERROR]: java.lang.IllegalStateException: found row stride readers [[RowStrideReaderWork[reader=ScriptLongs, builder=org.elasticsearch.compute.data.LongBlockBuilder@19f6840e, loader=org.elasticsearch.index.mapper.LongScriptBlockDocValuesReader$RowStrideLongScriptBlockLoader@6f332c9, offset=1], RowStrideReaderWork[reader=ScriptLongs, builder=org.elasticsearch.compute.data.LongBlockBuilder@12a1d61f, loader=org.elasticsearch.index.mapper.LongScriptBlockDocValuesReader$RowStrideLongScriptBlockLoader@7b7a5f6c, offset=2]]] without stored fields [StoredFieldsSpec[requiresSource=false, requiresMetadata=false, requiredStoredFields=[]]]
at org.elasticsearch.compute.lucene.ValuesSourceReaderOperator.loadFromSingleLeaf(ValuesSourceReaderOperator.java:254)
at org.elasticsearch.compute.lucene.ValuesSourceReaderOperator.process(ValuesSourceReaderOperator.java:143)
at org.elasticsearch.compute.operator.AbstractPageMappingOperator.getOutput(AbstractPageMappingOperator.java:76)
at org.elasticsearch.compute.operator.Driver.runSingleLoopIteration(Driver.java:258)
at org.elasticsearch.compute.operator.Driver.run(Driver.java:189)
at org.elasticsearch.compute.operator.Driver$1.doRun(Driver.java:378)
at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)
at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:34)
at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1023)
at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.base/java.lang.Thread.run(Thread.java:1575)
However it isn't really required. Since the SearchLook / SourceLoader will load stored fields and not compute engine.
…te engine and scripting infrastructure to load _source
Another issue is the thread switching, as the SyntheticSourceProvider is designed to work with only one thread. I think we need to address this before merging the change. It appears we are loading the source once for each script field, but I think it's okay to address the performance issue later.
|
…w SourceLoader for each runtime field used a the query.
…ocid data partitioning and accesses _source (either synthetic or stored source)
Hi @martijnvg, I've created a changelog YAML for you. |
Pinging @elastic/es-storage-engine (Team:StorageEngine) |
server/src/main/java/org/elasticsearch/search/lookup/ReinitializingSourceProvider.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two workarounds are good! Feel free to merge the PR. Thanks, Martijn!
This change addresses the following issues: Fields mapped as runtime fields not getting stored if source mode is synthetic. Address java.io.EOFException when an es|ql query uses multiple runtime fields that fallback to source when source mode is synthetic. (1) Address concurrency issue when runtime fields get pushed down to Lucene. (2) 1: ValueSourceOperator can read values in row striding or columnar fashion. When values are read in columnar fashion and multiple runtime fields synthetize source then this can cause the same SourceProvider evaluation the same range of docs ids multiple times. This can then result in unexpected io errors at the codec level. This is because the same doc value instances are used by SourceProvider. Re-evaluating the same docids is in violation of the contract of the DocIdSetIterator#advance(...) / DocIdSetIterator#advanceExact(...) methods, which documents that unexpected behaviour can occur if target docid is lower than current docid position. Note that this is only an issue for synthetic source loader and not for stored source loader. And not when executing in row stride fashion which sometimes happen in compute engine and always happen in _search api. 2: The concurrency issue that arrises with source provider if source operator executes in parallel with data portioning set to DOC. The same SourceProvider instance then gets access by multiple threads concurrently. SourceProviders implementations are not designed to handle concurrent access. Closes elastic#117644
This change addresses the following issues: Fields mapped as runtime fields not getting stored if source mode is synthetic. Address java.io.EOFException when an es|ql query uses multiple runtime fields that fallback to source when source mode is synthetic. (1) Address concurrency issue when runtime fields get pushed down to Lucene. (2) 1: ValueSourceOperator can read values in row striding or columnar fashion. When values are read in columnar fashion and multiple runtime fields synthetize source then this can cause the same SourceProvider evaluation the same range of docs ids multiple times. This can then result in unexpected io errors at the codec level. This is because the same doc value instances are used by SourceProvider. Re-evaluating the same docids is in violation of the contract of the DocIdSetIterator#advance(...) / DocIdSetIterator#advanceExact(...) methods, which documents that unexpected behaviour can occur if target docid is lower than current docid position. Note that this is only an issue for synthetic source loader and not for stored source loader. And not when executing in row stride fashion which sometimes happen in compute engine and always happen in _search api. 2: The concurrency issue that arrises with source provider if source operator executes in parallel with data portioning set to DOC. The same SourceProvider instance then gets access by multiple threads concurrently. SourceProviders implementations are not designed to handle concurrent access. Closes elastic#117644
…) (#118048) * Address mapping and compute engine runtime field issues (#117792) This change addresses the following issues: Fields mapped as runtime fields not getting stored if source mode is synthetic. Address java.io.EOFException when an es|ql query uses multiple runtime fields that fallback to source when source mode is synthetic. (1) Address concurrency issue when runtime fields get pushed down to Lucene. (2) 1: ValueSourceOperator can read values in row striding or columnar fashion. When values are read in columnar fashion and multiple runtime fields synthetize source then this can cause the same SourceProvider evaluation the same range of docs ids multiple times. This can then result in unexpected io errors at the codec level. This is because the same doc value instances are used by SourceProvider. Re-evaluating the same docids is in violation of the contract of the DocIdSetIterator#advance(...) / DocIdSetIterator#advanceExact(...) methods, which documents that unexpected behaviour can occur if target docid is lower than current docid position. Note that this is only an issue for synthetic source loader and not for stored source loader. And not when executing in row stride fashion which sometimes happen in compute engine and always happen in _search api. 2: The concurrency issue that arrises with source provider if source operator executes in parallel with data portioning set to DOC. The same SourceProvider instance then gets access by multiple threads concurrently. SourceProviders implementations are not designed to handle concurrent access. Closes #117644 * fixed compile error after backporting
… (#118049) * Address mapping and compute engine runtime field issues (#117792) This change addresses the following issues: Fields mapped as runtime fields not getting stored if source mode is synthetic. Address java.io.EOFException when an es|ql query uses multiple runtime fields that fallback to source when source mode is synthetic. (1) Address concurrency issue when runtime fields get pushed down to Lucene. (2) 1: ValueSourceOperator can read values in row striding or columnar fashion. When values are read in columnar fashion and multiple runtime fields synthetize source then this can cause the same SourceProvider evaluation the same range of docs ids multiple times. This can then result in unexpected io errors at the codec level. This is because the same doc value instances are used by SourceProvider. Re-evaluating the same docids is in violation of the contract of the DocIdSetIterator#advance(...) / DocIdSetIterator#advanceExact(...) methods, which documents that unexpected behaviour can occur if target docid is lower than current docid position. Note that this is only an issue for synthetic source loader and not for stored source loader. And not when executing in row stride fashion which sometimes happen in compute engine and always happen in _search api. 2: The concurrency issue that arrises with source provider if source operator executes in parallel with data portioning set to DOC. The same SourceProvider instance then gets access by multiple threads concurrently. SourceProviders implementations are not designed to handle concurrent access. Closes #117644
The previous fix to ensure that each thread uses its own SearchProvider wasn't good enough. When multiple threads access `ReinitializingSourceProvider` the simple thread accounting could still result in returned `SourceProvider` being used by multiple threads concurrently. The ReinitializingSourceProvider was introduced via elastic#117792 Closes elastic#118238
This reverts the workaround that was introduced in elastic#117792 to avoid EOF error when an es|ql query uses multiple runtime fields that fallback to source when source mode is synthetic. This is now covered by the `ReinitializingSourceProvider` workaround that covers that and the concurrency problem. With this change, the main code for the required workarounds are now in isolated in ReinitializingSourceProvider.
This PR addresses the following issues:
java.io.EOFException
when an es|ql query uses multiple runtime fields that fallback to source when source mode is synthetic. (1)1: ValueSourceOperator can read values in row striding or columnar fashion. When values are read in columnar fashion and multiple runtime fields synthetize source then this can cause the same
SourceProvider
evaluation the same range of docs ids multiple times. This can then result in unexpected io errors at the codec level. This is because the same doc value instances are used bySourceProvider
. Re-evaluating the same docids is in violation of the contract of the DocIdSetIterator#advance(...) / DocIdSetIterator#advanceExact(...) methods, which documents that unexpected behaviour can occur if target docid is lower than current docid position.Note that this is only an issue for synthetic source loader and not for stored source loader. And not when executing in row stride fashion which sometimes happen in compute engine and always happen in _search api.
2: The concurrency issue that arrises with source provider if source operator executes in parallel with data portioning set to DOC. The same
SourceProvider
instance then gets access by multiple threads concurrently.SourceProviders
implementations are not designed to handle concurrent access.Closes #117644