Make GeoHash coordinate conversions ~40-100x faster #348

Conversation
Label | Latitude | Longitude | Lat Error (vs. input) | Lon Error (vs. input) |
---|---|---|---|---|
Test input | 30.5388942218 | 104.0555758833 | - | - |
Test IEEE-754 repr. | 30.5388942217999997 | 104.0555758832999942 | - | - |
main | 30.5388928949832916 | 104.0555772185325623 | 0.0000013268167081 | -0.0000013352325681 |
PR (1b59a0a) | 30.5388915538787842 | 104.0555745363235474 | 0.0000026679212155 | 0.0000013469764468 |
PR (calculate center) | 30.538892894983292 | 104.05557721853256 | 0.0000013268167081 | -0.0000013352325681 |
edit: and I finally realize, the de-quantized coordinate represents the minimum value, and we get the same coordinate as the main branch (which represented the center of this error bounding box) by adding half of the cell size.
Funny in hindsight, as I was earlier wondering why those error values were multiples of each other..
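To make that concrete, the arithmetic behind the "PR (calculate center)" row looks roughly like this (my own illustration, not the PR's code):

```csharp
// Illustration only (not Garnet's code): a 52-bit geohash interleaves 26 bits of
// latitude with 26 bits of longitude, so one latitude cell spans 180 / 2^26 degrees.
double latCellSize = 180.0 / (1 << 26);              // ~0.000002682209 degrees

// De-quantization gives the cell's minimum corner (the "PR (1b59a0a)" row above);
// shifting by half a cell recovers the center that the main branch reports.
double decodedMin = 30.5388915538787842;
double centerLat = decodedMin + latCellSize / 2.0;   // ~30.5388928949832916
```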
Premature optimization is..
A few test failures left; need to dive deeper into some rounding corner-cases. The diff looks incomprehensible to review, it's pretty much a rewrite 😅
adds "Investigate LightClient implementation" to todo list. It seems run some tests until timeout if it receives the expected amount of bytes For some reason, I remembered/thought that the hash integer was bit-cast i.e. I just realized the original implementation was not actually redis "52-bit" precision either 🤔 Redis seems to always "clear" out the last 2 bits for the textual representation output, so it always returns |
Force-pushed from 30947c3 to 7da6041
There's a small corner-case error in McLoughlin's quantization trick on which this PR is based. The quoted statement is wrong for the upper bound of the range: the biased exponent is no longer 1023 for 2.0, which the maximum coordinate values get rounded to.
There are a lot of problems with GeoHash in general around the corner values where implementations differ..
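To make the corner case concrete, here is a minimal sketch of the quantization trick as I understand it from the linked article (illustrative only, not the PR's exact code; the clamp value is an assumption):

```csharp
// Sketch of McLoughlin-style quantization (illustrative, not the PR's exact code).
static uint Quantize32(double value, double min, double max)
{
    // Scale into [1.0, 2.0]; only an input equal to `max` lands exactly on 2.0.
    double x = (value - min) / (max - min) + 1.0;

    // For x in [1.0, 2.0) the biased exponent is 1023, so the top 32 mantissa
    // bits are a direct 32-bit quantization of the position within the range.
    ulong bits = BitConverter.DoubleToUInt64Bits(x);
    uint q = (uint)(bits >> 20);

    // Corner case: 2.0 has biased exponent 1024 and an all-zero mantissa, so the
    // shift above would yield 0 instead of the largest code. Clamp explicitly.
    return x >= 2.0 ? uint.MaxValue : q;
}
```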
And to add to the confusion, the original latitude limits are "wrong". According to EPSG:3857 (also known as EPSG:3785, also known as EPSG:900913.. etc.) the latitude limits are narrower than the geohash ones. I won't touch the existing latitude limit in this PR.
Force-pushed from e3d3ccf to 80ebee4
Interesting stuff! Glad to see you are making progress here. :)
Force-pushed from 23cebf1 to b4ec75c
Added benchmarks from my laptop (Tiger Lake, i5-1135G7). Also wrote a little bit more about the PR in its description.
Added more background to the PR description about the nuances of geohash..
Changed the title: GeoHash coordinate conversions → GeoHash coordinate conversions ~40-150x faster
Changed the title: GeoHash coordinate conversions ~40-150x faster → GeoHash coordinate conversions ~40-100x faster
Out of curiosity, what required a force-push? 😅 (This broke the history experience in the GitHub UI. I still have this branch in its pre-force-push sync state if we want to restore that.)
No changes, actually. I was updating the branch with the rebase option to bring in the latest changes from main.
* Refactor Haversine-distance calculation
* See https://github.com/dotnet/runtime/blob/333fb71d54bd84256e740aa08f8b836d4cd71d98/src/libraries/System.Private.CoreLib/src/System/Numerics/ITrigonometricFunctions.cs#L65-L113
* Do not spill the coordinate ranges to heap when encoding/decoding
* Keep them in registers (or at most spill to stack)
* Also some other misc. simplification
* Add shared AsciiUtils to Garnet.common to simplify the unit conversions
* Slightly adjust GeoHash tests
* dotnet format
* Restore flag bit for GetGeoHashCode for now
* I will return to this method in a follow-up
* Optimize GeoToLongValue to use float quantization trick and do the Z-curve encoding more directly
* Credits to https://mmcloughlin.com/posts/geohash-assembly for the quantization approach!
* Further optimize Geohash & Base32 encoding and decoding
* Abuse IEEE-754 binary representation in the encoding too
* Implement Z-curve decoding more efficiently
* Add GetGeoErrorByPrecision to calculate the error at given bit precision (52 for us).
* Optimize base32 string encoding.
* Tests still fail. Will need to investigate more.
* typo
* Further clarify the quantization method
* Calculate the center of bounding-box
* Clarify dequantization method
* Make the bounding-box center fix-up use constants
* Add more test-data and restore original epsilon calc.
* tests: sqc8b49rnyt -> sqc8b49rnys
* tests: nsqdtr74hyu1 -> nsqdtr74hyu0
* Exponent is 1023, not 0
* Add USE_PDEP_PEXT switch for PDEP/PEXT Z-curve en/decode
* And define it to run the tests with it
* Use FusedMultiplyAdd to do (x+y)*z in one op to avoid intermediate rounding
* More accurate and faster, what's not to love
* Remove #define USE_PDEP_PEXT, tests passed
* Move GeoHash-specific unit tests to own file
* Mark Z-curve encode/decode with MethodImpl.AggressiveInlining
* A little extra encouragement for the JIT.
* Add GeoHash-specific benchmarks
* oops
* format
* Add UsePdepPext build switch.
* And add GeoHashBenchmark job with it enabled
* Avoid shifting by using already-shifted mask for the PDEP/PEXT
* Use AVX512 support to guard PDEP/PEXT usage.
* Add MemoryDiagnoser back
* Fix incorrect quantization approach
* Make it a little more clear what happens in the corner-case guard
* We might not even want cmov/csel, which might stall out-of-order execution. Doesn't matter what is emitted tbh.
* Remove not-needed pow2 trick
* Let the JIT do its thing, it's pretty good
* Adjust comments a bit
* Fix comment typos
* Adjust comments
* Adjust comments
* Add third-party notices to NOTICE.md
* The GeoHash class incorporates material from mmcloughlin/geohash and georust/geohash, both licensed under MIT License. Thank you for sharing!

---------

Co-authored-by: Yoganand Rajasekaran <60369795+yrajas@users.noreply.github.com>
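Several of the commits above concern the PDEP/PEXT Z-curve path. Below is a hedged sketch of the two interleaving strategies they refer to (my own illustration, not Garnet's code; the lat/lon bit ordering is an assumption): a BMI2 ParallelBitDeposit fast path and a portable spread-bits fallback.

```csharp
// Hedged sketch of Z-curve (Morton) interleaving, not Garnet's actual code.
// Which coordinate lands in the even bit positions is an assumption here.
using System.Runtime.Intrinsics.X86;

static ulong Interleave(uint latBits, uint lonBits)
{
    if (Bmi2.X64.IsSupported)
    {
        // PDEP deposits the source bits into the positions selected by the mask:
        // latitude into even bit positions, longitude into odd ones.
        return Bmi2.X64.ParallelBitDeposit(latBits, 0x5555555555555555UL)
             | Bmi2.X64.ParallelBitDeposit(lonBits, 0xAAAAAAAAAAAAAAAAUL);
    }

    // Portable fallback: spread each value so a zero sits between every bit.
    return Spread(latBits) | (Spread(lonBits) << 1);
}

// Classic "spread bits" helper: inserts a zero bit between every bit of the input.
static ulong Spread(uint x)
{
    ulong v = x;
    v = (v | (v << 16)) & 0x0000FFFF0000FFFFUL;
    v = (v | (v << 8))  & 0x00FF00FF00FF00FFUL;
    v = (v | (v << 4))  & 0x0F0F0F0F0F0F0F0FUL;
    v = (v | (v << 2))  & 0x3333333333333333UL;
    v = (v | (v << 1))  & 0x5555555555555555UL;
    return v;
}
```

On Zen 2 the microcoded PDEP/PEXT are slow, which is presumably why the UsePdepPext build switch and the hardware guard exist (see the "erratum" screenshot heading further below).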
Note
Credits for the quantization approach to https://mmcloughlin.com/posts/geohash-assembly (by @mmcloughlin)
and the de-quantization method to https://github.com/georust/geohash; big thank you to everyone contributing to both efforts!
This PR was originally meant to be a tiny optimization to avoid unnecessary heap allocations in the coordinate conversions, but I had a change of plans when I discovered an article about a much faster (de)quantization method by Michael McLoughlin, and this ended up being a GeoHash class rewrite. Please see his great article and explanation.

Many of the commits after 564891a are me discovering new exciting details about IEEE-754 floating-point rounding, confusing reference implementations (it turns out there's no single correct way to reference a location on Earth 😅), and other fun geohash corner-cases.
However, more notably, there's a tiny error in the quantization approach described in McLoughlin's excellent article, which I describe in this comment and fix in f364a96c..b4ec75c. The gist of it is that in the IEEE-754 double-precision format, the exponent assumption of 1023 does not hold for 2.0, which the maximum coordinate values get rounded to.

There should be no behavioral differences other than the last character of the textual representation, which one of the existing tests already ignored, understandably, due to nuances around its precision.
The textual representation should simply be the base-32 encoding (custom geohash alphabet) of the integer that represents a specific geohash, where more characters mean more bits of precision. Every encoded character gives 5 bits of precision. Very simple, right! I wish it was.

Garnet (and Redis) chose 52-bit precision. Why? Because we want to store the geohash integer in a sorted set, which allows us to utilize the geohash structure and do really efficient range queries. And because we want to reuse the existing sorted set, we need to store the geohash integer in an IEEE-754 double-precision score, which we can do safely when our integer is in the range [-2^53, 2^53].

Now, we can still encode this 52-bit precision integer in the base-32 alphabet just fine. The remaining 55-52=3 bits will be zero, and that's what Garnet has done, but Redis seems to zero out the last 2 bits and output the last character of the textual representation as 0.
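As an illustration of that textual encoding (a hedged sketch, not Garnet's actual implementation; the method name is mine): 11 characters carry 55 bits, so the 52-bit value is left-aligned and the trailing 3 bits become zero padding.

```csharp
// Hedged sketch, not Garnet's code: render a 52-bit geohash integer as 11 base-32
// characters. 11 chars * 5 bits = 55 bits, so the value is left-aligned by 3 bits.
static string ToGeohashString(ulong hash52)
{
    const string alphabet = "0123456789bcdefghjkmnpqrstuvwxyz"; // standard geohash base-32
    Span<char> chars = stackalloc char[11];
    ulong bits = hash52 << 3;                       // left-align 52 bits into 55 bits
    for (int i = 10; i >= 0; i--)
    {
        chars[i] = alphabet[(int)(bits & 0x1F)];    // 5 bits per character, last char first
        bits >>= 5;
    }
    return new string(chars);
}
```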
Redis made another choice too: they chose to follow EPSG:3857, which restricts the latitude limits. This makes sense, as the poles cause all sorts of problems in GIS applications (see the confusing history behind Web Mercator). These limits however do not apply to the "standard" geohash format, where valid latitudes are [-90, 90]. This difference also adds nuance to geohash implementations. Garnet currently accepts coordinates per the geohash limitations.

This PR also introduces AsciiUtils in Garnet.common as a place to share common ASCII manipulation logic (it acts more like a polyfill for the Ascii class on .NET 6, and a place where we can optimize further than what the BCL currently does).
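For flavor, the kind of helper such a class might contain looks like this (hypothetical, not Garnet's actual AsciiUtils API):

```csharp
// Hypothetical helper in the spirit of a shared ASCII utility class (NOT Garnet's
// actual AsciiUtils API): lower-case an ASCII byte span in place, no allocations.
static void ToLowerInPlace(Span<byte> ascii)
{
    for (int i = 0; i < ascii.Length; i++)
    {
        byte b = ascii[i];
        if ((uint)(b - 'A') <= 'Z' - 'A')   // range check for 'A'..'Z'
        {
            ascii[i] = (byte)(b | 0x20);    // set bit 5 to map 'A'..'Z' to 'a'..'z'
        }
    }
}
```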
This PR is strictly focused on the GeoHash class alone. I'll remove more unnecessary allocations in the actual RESP protocol parsing & encoding in a follow-up. There are some vectorization opportunities here, and the distance math can definitely be improved too.

Benchmarks (Tiger Lake, i5-1135G7)
(benchmark results for main and PR)

Benchmarks (Zen 2, Ryzen 7 3700X)

(benchmark results for main and PR)
Zen 2 PDEP/PEXT "erratum" in action
See that 48B heap allocation for the final string in GetGeoHashCode; we don't need that :)