Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change float to double in vec2d #652

Merged
merged 22 commits into from
Sep 14, 2022
Merged

Conversation

nrabinowitz
Copy link
Collaborator

Fixes #651

This changes float to double as suggested, and adds tests for cellsToLinkedMultiPolygon (which depends on geo coordinate precision) that confirm improved precision in the output.

H3Index indexes[] = {0, 0, 0, 0, 0, 0, 0};
int numHexes = 7;

for (int res = 1; res < 15; res++) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that even with this fix, both of these tests fail at res 0. I'm not sure where the additional imprecision lies for that case, but I think we need vertex mode to fix it.

@nrabinowitz
Copy link
Collaborator Author

This may address uber/h3-py#277 - it's probably worth checking whether that case is now covered.

@isaacbrodsky
Copy link
Collaborator

isaacbrodsky commented Aug 23, 2022

Please add to the CHANGELOG describing this change. As I understand it, the impact is that cellToBoundary results for distortion vertexes were calculated at a lower precision (and cellToLinkedMultiPolygon and edge boundary calculations that use the same code path).

Please also add a test for one or more of the specific cells in the issue. While cellsToLinkedMultiPolygon exercises this today, it will not cover this in the future because the plan is to change that function to not do coordinate comparisons.

@isaacbrodsky
Copy link
Collaborator

It looks like the tests are failing since distortion vertexes will be calculated slightly differently due to the precision change.

@nrabinowitz
Copy link
Collaborator Author

I see a few things in the tests:

  • I'm not sure how to fix the cell boundary tests without regenerating the test files with new random cells. That doesn't feel very kosher, but I suspect this will be tedious-to-impossible to do with edits.
  • This update breaks some of our test cases for h3ToGeoBoundary returns duplicate points for some hexagons #45. These are highly subject to FPE - I can see that of the 8 test cells that have 7 vertexes, 4 or 5 of them have 8 vertexes in the JS library. We might need a change to the cellToBoundary code here to do an almost-equal comparison instead of strict equality.
  • I get an error for directed edge boundary assertions: directedEdge.directedEdge_boundary: t_assert failed at /Users/nick/dev/h3/src/apps/testapps/testDirectedEdgeExhaustive.c:80, edgeBoundary.numVerts == revEdgeBoundary.numVerts, numVerts is equal for edge and reverse. I suspect this might be downstream from h3ToGeoBoundary returns duplicate points for some hexagons #45
  • I get failures in testH3CellArea, which looks like it's also a descriptive rather than a prescriptive test case; we can probably just update the test expectation.

Does any of this raise concerns? I don't think we've promised that the geo boundary would remain stable within FPE, but it seems possible this might come across as a breaking change for some users.

@nrabinowitz
Copy link
Collaborator Author

  • I'm not sure how to fix the cell boundary tests without regenerating the test files with new random cells. That doesn't feel very kosher, but I suspect this will be tedious-to-impossible to do with edits.

Actually, testing the output here what I see are failures in b1->numVerts == b2.numVerts, expected cell boundary count, which I'm pretty sure is #45 again (this makes sense, we use the vector intersection to test whether we need another vertex). I'm guessing that the improved precision is breaking our fix for that issue.

@coveralls
Copy link

coveralls commented Aug 24, 2022

Coverage Status

Coverage increased (+0.0003%) to 99.038% when pulling 2539aa2 on nrabinowitz:vec2d-double into db0a3e4 on uber:master.

@nrabinowitz
Copy link
Collaborator Author

Ok, I've fixed the tests:

  • For the extra-vertex issue, I changed from a strict equality check to an almost-equal check, using FLT_EPSILON as the threshold. This appears to have fixed all of the test failures stemming from incorrect vertex count, and I suspect will fix the JS issue where some of the extra vertexes still appear.
  • The boundary regression tests fail on coarse Class III resolutions due to changing boundaries. I regenerated the res 1, res 3, and random sample of res 5 test files with updated boundaries. I also added a README file covering the commands to regen these files, as it took me a little bit to track this down.

Please also add a test for one or more of the specific cells in the issue. While cellsToLinkedMultiPolygon exercises this today, it will not cover this in the future because the plan is to change that function to not do coordinate comparisons.

The res 1 and res 3 cells in the issue should be covered by the regen'd test files - I don't think we need extra tests for this (since the issue was not that the vertices were wrong, but that we might want to use double in that spot).

Now the question is: Do we want to do this? On the one hand, I'd rather rely on a more precise calculation, and I feel weird about the idea that the canonical boundaries are defined by a bad implementation detail. On the other hand, I don't know the ramifications of changing the boundaries, which are sort of meant to be canonical and unchanging.

One argument in favor is that presumably the more correct boundaries would enclose some points that are indexed to the cell but would otherwise fall outside the boundary with a point-in-spherical-poly check.

@@ -747,8 +747,8 @@ void _faceIjkToCellBoundary(const FaceIJK *h, int res, int start, int length,
adjacent hexagon edge will lie completely on a single icosahedron
face, and no additional vertex is required.
*/
bool isIntersectionAtVertex =
_v2dEquals(&orig2d0, &inter) || _v2dEquals(&orig2d1, &inter);
bool isIntersectionAtVertex = _v2dAlmostEquals(&orig2d0, &inter) ||

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this remove the only two usage of _v2dEquals

So I guess you could either rename _v2dAlmostEquals or get rid of _v2dEquals.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, will drop the exact version

Comment on lines +16 to +18
For a random sample of cells at a given res:

./bin/mkRandGeoBoundary -n 5000 -r 5 > tests/inputfiles/rand05cells.txt
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this specifically how rand05cells.txt was generated in this PR?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

@@ -297,4 +297,66 @@ SUITE(cellsToLinkedMultiPolygon) {

H3_EXPORT(destroyLinkedMultiPolygon)(&polygon);
}

TEST(kRingResolutions) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kRing?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sigh, will update

bool _v2dAlmostEquals(const Vec2d *v1, const Vec2d *v2) {
return fabs(v1->x - v2->x) < FLT_EPSILON &&
fabs(v1->y - v2->y) < FLT_EPSILON;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: newline at end of file

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can fix - is there a reason clang-format doesn't take care of that?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From some quick searches, clang-format does not do that.

@nrabinowitz
Copy link
Collaborator Author

So I tried for the better part of a day to construct a case where a point that indexed to the cell was not physically inside the cell boundary when the value was a float, but was inside the cell boundary when the value was double. My test case looks like this:

// For every vertex, take some points around the vertex. If they index
// to the cell, they should also be inside the physical boundary.
for (int i = 0; i < boundary.numVerts; i++) {
    for (int latOffset = -1; latOffset < 1; latOffset++) {
        for (int lngOffset = -1; lngOffset < 1; lngOffset++) {
            if (latOffset && lngOffset) {
                point.lat = boundary.verts[i].lat +
                            latOffset * FLT_EPSILON;
                point.lng = boundary.verts[i].lng +
                            lngOffset * FLT_EPSILON;

                H3Index cell2;
                t_assertSuccess(
                    H3_EXPORT(latLngToCell)(&point, 1, &cell2));
                if (cell2 == cell) {
                    // Check whether the point is physically inside
                    // the geo boundary
                    t_assert(
                        pointInsideGeoLoop(&geoloop, &bbox, &point),
                        "Boundary contains input point");
                }
            }
        }
    }
}

This has identified a number of cases with res 1 cells where the assertion fails, but in all cases it fails for both float and double 😭.

  • Any ideas for how I might fix this?
  • I can't spend much more time on it - do we think it's truly necessary for the PR? Or should we drop the PR?

I've verified that the only vertexes this affects are distortion vertexes - all others are unchanged.

@isaacbrodsky
Copy link
Collaborator

@nrabinowitz Doesn't pointInsideGeoLoop assume cartesian geometry? Wouldn't the test need to be using spherical point in polygon?

@nrabinowitz
Copy link
Collaborator Author

@nrabinowitz Doesn't pointInsideGeoLoop assume cartesian geometry? Wouldn't the test need to be using spherical point in polygon?

  • I don't think it would matter so near to the vertices
  • We don't have a spherical point-in-poly function, so I couldn't write a unit test for this

@nrabinowitz
Copy link
Collaborator Author

OK, I found a single point (one of the float-based distortion vertexes) that fails my test with float but passes with double. Presumably there are more, but a longer search would be required. This PR should now be ready for re-review.

Comment on lines +64 to 67
bool _v2dAlmostEquals(const Vec2d *v1, const Vec2d *v2) {
return fabs(v1->x - v2->x) < FLT_EPSILON &&
fabs(v1->y - v2->y) < FLT_EPSILON;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this function do comparisons in single precision float instead of double?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fabs takes and returns a double, so the comparison here is still in double precision I believe. FLT_EPSILON was chosen as a reasonable threshold here; a number of boundary tests fail when we use the stricter DBL_EPSILON threshold.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Wrong use of float instead of double?
4 participants