Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MultiGeometry binary predicate support. #1220

Draft
wants to merge 16 commits into
base: branch-23.08
Choose a base branch
from

Conversation

thomcom
Copy link
Contributor

@thomcom thomcom commented Jul 17, 2023

Description

Closes #1063
Closes #1210

MultiPoint and MultiLineString predicates, this PR.

An exception to this involves MultiLineStrings that can be merged by their endpoints. Shapely/GEOS appears to merge these MultiLineStrings before calling the binary predicates on them, producing different results for contains, covers, crosses, touches, and within. We can only duplicate this behavior by implementing a merge functionality of our own.

This PR has implementations of Multi geometry binary predicates for:
Point-MultiPoint
MultiPoint-Point
MultiPoint-MultiPoint
Point-MultiLinestring
MultiLineString-Point
LineString-MultiLinestring
MultiLinestring-LineString
MultiLineString-MultiLineString

Remaining to be tested and implemented are one Polygon configuration: Polygon-MultiLineString and the MultiPolygon configurations.

(Multi)LineString-(Multi)LineString tests have issues caused by non-simple MultiLineStrings - that is, any LineString that has self-intersection is non-simple.

crosses

Crosses compares the vertices of intersection with boundary vertices of the lhs and rhs, and is true only when the intersection vertices are not equal to boundary vertices of the lhs or rhs.

There are two fundamental issues in the crosses predicate that I'm still befuddled on:

  1. LineString "compaction": GEOS appears to connect sequential endpoints lines in a MultiLineString into a single LineString before computing intersections. This is the major issue facing this PR.

touches

The touches predicate appears to have the same fundamental issues as the crosses predicate and fails on the same cases. This is because touches compares vertices of intersection with boundary vertices of the lhs and rhs, and is true when the intersection vertices only equal boundary vertices.

MultiPolygon incomplete

Major improvements also need to be made to the Polygon.contains predicate in order to handle predicates with MultiPolygon types on the lhs. This is because MultiPolygon.contains({MultiLineString, MultiPolygon}) iff all geometries of the rhs are contained in any geometry of the lhs. This requires that improvements be made to the point-in-polygon and .contains routines to better handle many-to-many operations.

Quadtree point in polygon can be used with many-to-many operations, but will quickly create results too large for memory with large sets of lhs and rhs. A single MultiPolygon in the lhs needs to have a many-to-many point in polygon called against the points in its corresponding rhs, but does not need to compute point in polygon for all of the points in the other, non-corresponding rhss.

MultiPolygon predicates have begun implementation here:

https://github.com/thomcom/cuspatial/tree/feature/multipolygon-test-dispatch

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@thomcom thomcom self-assigned this Jul 17, 2023
@github-actions github-actions bot added the Python Related to Python code label Jul 17, 2023
@thomcom thomcom added 2 - In Progress Currenty a work in progress improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Jul 20, 2023
@thomcom thomcom marked this pull request as ready for review July 20, 2023 20:16
@thomcom thomcom requested a review from a team as a code owner July 20, 2023 20:16
@thomcom thomcom requested review from trxcllnt and isVoid July 20, 2023 20:16
# rhs is contained by lhs if all of the vertices in the
# intersection are in rhs. However, the intersection
# does not contain the vertex (0.5, 0.5) in the original rhs.
rhs_self_intersection = _basic_intersects_pli(rhs, rhs)
Copy link
Contributor

@isVoid isVoid Jul 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels like you need a separate API that handles linestring merging. Something like:

void merge_points_on_multilinestring_range(MultiLinestringRange range, OutputIt output);

Because calling the self intersection just to remove the additional points is too expensive.

points = _pli_points_to_multipoints(pli)
lines = _pli_lines_to_multipoints(pli)
# Optimization: only compute the subsequent boundaries and equalities
# of indexes that contain point intersections and do not contain line
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# of indexes that contain point intersections and do not contain line
# of indices that contain point intersections and do not contain line

Comment on lines +57 to +58
lhs_crosses = lhs_boundary_matches != points.sizes
rhs_crosses = rhs_boundary_matches != points.sizes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It says "none of the points of intersection is in the boundary of the geometry". AKA, there is 0 points in the intersection results that are in the boundary. Shouldn't this be

lhs_boundary_matches == 0

?

@@ -39,6 +42,15 @@ class OverlapsPredicateBase(EqualsPredicateBase):
pass


class LineStringLineStringOverlaps(BinPred):
def _preprocess(self, lhs, rhs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A small docstring helps:

Suggested change
def _preprocess(self, lhs, rhs):
def _preprocess(self, lhs, rhs):
"""Linestring overlaps with another linestring if the overlapping part of
two linestring is not the entire section of the linestring.
"""

Sounds correct?

"""
x x
| /
| //x
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought only backward slash needs escaping, does forward slash too?

assert (got.values_host == expected.values).all()


def test_point_multipoint(predicate): # noqa: F811
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is noqa needed? There's no import here.

@thomcom thomcom changed the title MultiGeometry tests for MultiPoint and MultiLineString MultiGeometry binary predicate support. Aug 2, 2023
@thomcom thomcom force-pushed the feature/multigeometry-test-dispatch branch from d5af1c0 to 1c8e980 Compare August 2, 2023 21:54
@thomcom thomcom marked this pull request as draft August 2, 2023 21:57
@github-actions github-actions bot added the ci label Aug 3, 2023
@thomcom thomcom removed the request for review from trxcllnt August 17, 2023 20:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2 - In Progress Currenty a work in progress ci improvement Improvement / enhancement to an existing function non-breaking Non-breaking change Python Related to Python code
Projects
Status: In Progress
2 participants