ESQL Support loading points from source into WKB blocks #103698
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
pr: 103698 | ||
summary: Reading points from source to reduce precision loss | ||
area: ES|QL | ||
type: enhancement | ||
issues: [] |
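The changelog summary does not say where the precision loss comes from. One plausible reading, stated here as an assumption, is that doc values hold each coordinate as a quantized integer, while `_source` keeps the original double. The sketch below illustrates that effect with a 32-bit latitude quantization. The class and method names are hypothetical, and the encoding is an illustration rather than the code this PR touches.

```java
// Illustrative sketch only: quantizing a latitude to a 32-bit integer and back.
// Assumption: 180 degrees are spread evenly over 2^32 integer steps.
final class LatitudeQuantizationSketch {
    // Degrees covered by one integer step.
    static final double LAT_STEP = 180.0 / (1L << 32);

    static int encode(double latitude) {
        if (latitude < -90.0 || latitude > 90.0) {
            throw new IllegalArgumentException("latitude out of range: " + latitude);
        }
        // 90.0 would map to 2^31, which does not fit in an int, so nudge it just below.
        if (latitude == 90.0) {
            latitude = Math.nextDown(latitude);
        }
        return (int) Math.floor(latitude / LAT_STEP);
    }

    static double decode(int encoded) {
        return encoded * LAT_STEP;
    }

    public static void main(String[] args) {
        double original = 51.123456789012345;
        double roundTripped = decode(encode(original));
        // The difference is small but nonzero, and always below one quantization step.
        System.out.println("original     = " + original);
        System.out.println("round trip   = " + roundTripped);
        System.out.println("error (deg)  = " + (original - roundTripped));
    }
}
```

Under this assumption the round-trip error stays below one step (about 4.2e-8 degrees), which is the kind of loss that reading the value from source would avoid.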
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -12,6 +12,7 @@ | |
import org.apache.lucene.index.SortedDocValues; | ||
import org.apache.lucene.index.SortedSetDocValues; | ||
import org.apache.lucene.util.BytesRef; | ||
import org.elasticsearch.common.geo.SpatialPoint; | ||
import org.elasticsearch.core.Releasable; | ||
import org.elasticsearch.search.fetch.StoredFieldsSpec; | ||
import org.elasticsearch.search.lookup.Source; | ||
|
@@ -348,6 +349,11 @@ interface BlockFactory { | |
*/ | ||
BytesRefBuilder bytesRefs(int expectedCount); | ||
|
||
/** | ||
* Build a builder to load {@link SpatialPoint}s backed by WKB in BytesRefBlock. | ||
*/ | ||
BytesRefBuilder geometries(int expectedCount); | ||
Review comment: Could we remove this and use
Reply: OK, I've removed that now, but could not remove GeometriesBlockLoader because it uses Geometries which does:
While the equivalent one in the BytesRefsBlockLoader does:
So the BytesRefs one assumes the incoming data is a String object, while the Geometries one assumes it is a |
||
|
||
/** | ||
* Build a builder to load doubles as loaded from doc values. | ||
* Doc values load doubles deduplicated and in sorted order. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -12,9 +12,17 @@ | |
import org.apache.lucene.index.SortedSetDocValues; | ||
import org.apache.lucene.util.BytesRef; | ||
import org.apache.lucene.util.UnicodeUtil; | ||
import org.elasticsearch.common.geo.SpatialPoint; | ||
import org.elasticsearch.geometry.Geometry; | ||
import org.elasticsearch.geometry.Point; | ||
import org.elasticsearch.geometry.utils.GeometryValidator; | ||
import org.elasticsearch.geometry.utils.WellKnownBinary; | ||
import org.elasticsearch.geometry.utils.WellKnownText; | ||
import org.elasticsearch.search.fetch.StoredFieldsSpec; | ||
|
||
import java.io.IOException; | ||
import java.nio.ByteOrder; | ||
import java.text.ParseException; | ||
import java.util.ArrayList; | ||
import java.util.List; | ||
|
||
|
@@ -131,6 +139,24 @@ public RowStrideReader rowStrideReader(LeafReaderContext context) { | |
} | ||
} | ||
|
||
public static class GeometriesBlockLoader extends SourceBlockLoader { | ||
Review comment: This makes me uneasy. Can't we reuse the BytesRef reader, since we are reading an array of bytes?
Reply: Right now we need this to get to the code in the 'Geometries' BlockSourceReader, but as that can get simplified, perhaps this can too. |
||
private final ValueFetcher fetcher; | ||
|
||
public GeometriesBlockLoader(ValueFetcher fetcher) { | ||
this.fetcher = fetcher; | ||
} | ||
|
||
@Override | ||
public Builder builder(BlockFactory factory, int expectedCount) { | ||
return factory.geometries(expectedCount); | ||
} | ||
|
||
@Override | ||
public RowStrideReader rowStrideReader(LeafReaderContext context) { | ||
return new Geometries(fetcher); | ||
} | ||
} | ||
|
||
private static class BytesRefs extends BlockSourceReader { | ||
BytesRef scratch = new BytesRef(); | ||
|
||
|
@@ -149,6 +175,41 @@ public String toString() { | |
} | ||
} | ||
|
||
private static class Geometries extends BlockSourceReader { | ||
|
||
Geometries(ValueFetcher fetcher) { | ||
super(fetcher); | ||
} | ||
|
||
@Override | ||
protected void append(BlockLoader.Builder builder, Object v) { | ||
if (v instanceof SpatialPoint point) { | ||
BytesRef wkb = new BytesRef(WellKnownBinary.toWKB(new Point(point.getX(), point.getY()), ByteOrder.LITTLE_ENDIAN)); | ||
Review comment: I am very confused here. Do we really generate points now? Do we really need this? The idea is that we only load WKB.
Reply: I think this was defensive coding, for the variety of loading paths we will see going forward. The source value fetcher was generating WKT (but I can change that to WKB), doc values were generating longs, and stored fields would generate points themselves (I assumed, but have not written support for that yet).
Reply: Let's remove the defensive coding. I think we should only accept WKB here.
Reply: ++ on keeping this list as small as possible. Accepting
Reply: Removed support for SpatialPoint |
||
((BlockLoader.BytesRefBuilder) builder).appendBytesRef(wkb); | ||
} else if (v instanceof String wkt) { | ||
try { | ||
|
||
// TODO: figure out why this is not already happening in the GeoPointFieldMapper | ||
Geometry geometry = WellKnownText.fromWKT(GeometryValidator.NOOP, false, wkt); | ||
if (geometry instanceof Point point) { | ||
BytesRef wkb = new BytesRef(WellKnownBinary.toWKB(point, ByteOrder.LITTLE_ENDIAN)); | ||
((BlockLoader.BytesRefBuilder) builder).appendBytesRef(wkb); | ||
} else { | ||
throw new IllegalArgumentException("Cannot convert geometry into point: " + geometry.type()); | ||
} | ||
} catch (IOException | ParseException e) { | ||
throw new IllegalArgumentException("Failed to parse point geometry: " + e.getMessage(), e); | ||
} | ||
} else { | ||
throw new IllegalArgumentException("Unsupported source type for point: " + v.getClass().getSimpleName()); | ||
} | ||
} | ||
|
||
@Override | ||
public String toString() { | ||
return "BlockSourceReader.Geometries"; | ||
} | ||
} | ||
|
||
public static class DoublesBlockLoader extends SourceBlockLoader { | ||
private final ValueFetcher fetcher; | ||
|
||
|
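The `Geometries` reader above converts each point to little-endian WKB before appending it to the block. As a rough illustration of what those bytes contain, here is a minimal sketch that hand-encodes a 2D point using only `java.nio`. It does not use Elasticsearch's `WellKnownBinary`; the class and method names are hypothetical. The layout follows the standard WKB point form: a one-byte byte-order marker, a 32-bit geometry type (1 for Point), then two 64-bit doubles.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustrative sketch only: a hand-rolled WKB encoder/decoder for 2D points.
final class PointWkbSketch {
    private static final byte LITTLE_ENDIAN_MARKER = 1; // WKB byte-order flag: 1 = little endian
    private static final int WKB_POINT_TYPE = 1;        // WKB geometry type code for Point

    // Encode (x, y) as 21 bytes: 1 marker byte + 4 type bytes + 2 * 8 coordinate bytes.
    static byte[] toWkb(double x, double y) {
        ByteBuffer buffer = ByteBuffer.allocate(1 + Integer.BYTES + 2 * Double.BYTES)
            .order(ByteOrder.LITTLE_ENDIAN);
        buffer.put(LITTLE_ENDIAN_MARKER);
        buffer.putInt(WKB_POINT_TYPE);
        buffer.putDouble(x);
        buffer.putDouble(y);
        return buffer.array();
    }

    // Decode a WKB point back into {x, y}, honoring the byte-order marker.
    static double[] fromWkb(byte[] wkb) {
        ByteBuffer buffer = ByteBuffer.wrap(wkb);
        byte marker = buffer.get();
        buffer.order(marker == LITTLE_ENDIAN_MARKER ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN);
        int type = buffer.getInt();
        if (type != WKB_POINT_TYPE) {
            throw new IllegalArgumentException("Not a WKB point, type code: " + type);
        }
        double x = buffer.getDouble();
        double y = buffer.getDouble();
        return new double[] { x, y };
    }

    public static void main(String[] args) {
        byte[] wkb = toWkb(12.5, -45.25);
        double[] xy = fromWkb(wkb);
        System.out.println("bytes = " + wkb.length + ", x = " + xy[0] + ", y = " + xy[1]);
    }
}
```

Because the doubles are stored verbatim, a round trip through WKB preserves the coordinates exactly, which fits the reviewers' preference for loading only WKB into the block.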
Review comment: Can we move this to AbstractPointGeometryFieldMapper and remove the override method in AbstractShapeGeometryFieldMapper?
Reply: There is no intermediate class common between Geo and Cartesian points, so that would require quite a bit of restructuring. Besides, the plan is to support geo_shape very soon, so we would just move back to this then anyway.
Reply: I thought AbstractPointGeometryFieldMapper is common for geo and cartesian points. I know it will change, but we should keep PRs to the minimum, please, or it makes reviewing very difficult.