Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[oshdb-filter] discussion for PR #380 refactor oshdb-filter #385

Closed
wants to merge 1 commit into from

Conversation

rtroilo
Copy link
Member

@rtroilo rtroilo commented May 26, 2021

This PR is for discussion of some refactoring within the oshdb-filter module. It is based on PR #380
And should provide some suggestion as a starting point for a discussion:

  • As the Filter only adds static methods to the FilterExpression, we could extract it as a utility class and replace it with FilterExpression where ever it is referenced as a type

  • FilterInternal seem to just provide a default implementation for negate which we can pull to the FilterExpression interface, and remove FilterInternal

  • we could "pull" some classes like (ChangesetIdFilter|GeometryTypeFilter|ContributorIdFilter)(Equals|EqualsAnyOf|Range) together as they provide just "strategies" for filtering and no concrete class would be necessary. Like ContributionFilter, and GeometryFilter

  • Wrapping the Contribution Based filters with a Negatable does not work because Negatable "blindly" negate all filter methods also applyOSM which for the ContributionTypes should always return true, because of the possible "minor"-versions.

  • We should also take a look over some tests for example:

public void testGeometryTypeFilterPolygon() {
    FilterExpression expression = parser.parse("geometry:polygon");
    OSMEntity validWay = createTestOSMEntityWay(new long[]{1, 2, 3, 4, 1});
    assertTrue(expression.applyOSMGeometry(validWay, gf.createPolygon()));

Should this test not fail? It is a closed ring but it is missing the tags which leads to a "polygon", e.g. building=yes

- make Filter.class a utility class, replace Filter with FilterExpression
- remove intemediate interface/class FilterInternal, RangedFilter
- combine ContributionFilters together
- combine GeometryFilters together
- make negatable geometry/contribution-filter
@tyrasd
Copy link
Member

tyrasd commented May 27, 2021

Should this test not fail? It is a closed ring but it is missing the tags which leads to a "polygon", e.g. building=yes

this test should not fail, because applyOSMGeometry is called with a polygonal geometry. The oshdb-filter module was designed to be independent from possible alternative geometry building strategies, e.g. if one chooses to use a TagInterpreter which always returns isArea() -> false then the filter geometry:polygon would not match a closed-way-building polygon.

@tyrasd
Copy link
Member

tyrasd commented May 27, 2021

Wrapping the Contribution Based filters with a Negatable does not work because Negatable "blindly" negate all filter methods

Right, this is a major bug/oversight and also does not fulfill the contract defined for the method (see

* <p>Must be compatible with the result of {@link #applyOSM}, e.g. that it must not return false
* when <code>oshEntity.getVersions().….anyMatch(applyOSM)</code> would evaluate to true.</p>
). As far as I can see, it's independent from the contribution based filters, however.

also applyOSM which for the ContributionTypes should always return true, because of the possible "minor"-versions.

Right! For contribution/snapshot-based filters this is also true for applyOSM. 😬

Comment on lines +102 to +105

private Filter() {
// utility class
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this really needed for an abstract class?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No I don't think so, the private constructor is enough, like it is e.ge OSHEntityTimeUtils. Even sonarlint was complaining why I don't do 😂

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant, that the Filter should better not be abstract.

Copy link
Member

@tyrasd tyrasd May 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why I don't do

you mean convert the class to an interface?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, that is what sonarlint suggest ("because it is abstract class with only static methods"), but actually what it should be is rule-1118 Utility Class
So wrong here is that this class is abstract.

@@ -12,7 +11,7 @@
/**
* A filter condition which can be applied to an OSM entity.
*/
public interface Filter extends FilterExpression {
public abstract class Filter {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why the change from interface to abstract class?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change here is actually not to implement FilterExpression and to keep the Filter as a "sandalone" utility class, as it only provides static methods for creating FilterExpressions in various forms.

@tyrasd
Copy link
Member

tyrasd commented May 27, 2021

As the Filter only adds static methods to the FilterExpression, we could extract it as a utility class and replace it with FilterExpression where ever it is referenced as a type

At first glance I dislike this change in this PR. It seemingly unnecessarily destroys the distinction of FilterExpressions in operators (like BinaryOperator) and concrete filters (formerly the Filter interface). Was there a particular reason to remove this little bit of internal structure? Also, it used to be able to statically-type methods like normalize and ensure they return what they promise to return.

@tyrasd tyrasd added the question Further information is requested label May 27, 2021
@rtroilo
Copy link
Member Author

rtroilo commented May 27, 2021

At first glance I dislike this change in this PR. It seemingly unnecessarily destroys the distinction of FilterExpressions in operators (like BinaryOperator) and concrete filters (formerly the Filter interface). Was there a particular reason to remove this little bit of internal structure? Also, it used to be able to statically-type methods like normalize and ensure they return what they promise to return.

I see what you mean, but the normalize method is the only place where I think needs a distinction. And we still can distinct if it is a BinaryOperator or not, the question is then if we need to have this "layer" or not, no problem with that. But I also think it is not obvious where you need to/have to use a Filter or FilterExpression, this we need to make it clear.
Here is an example which I actually had and which will result in an IllegalStateException("unsupported state during filter normalization") without an obvious reason:

mapReducer.filter(new FilterExpression() {
  public FilterExpression negate() {
    throw new UnsupportedOperationException();
  }
  public boolean applyOSM(OSMEntity entity) {
    return ...// custom filter logic
  }
});

We could solve this, by letting the mapReducer.filter accept only the subtype Filter. Would this limit us in some way?

@rtroilo
Copy link
Member Author

rtroilo commented May 27, 2021

Should this test not fail? It is a closed ring but it is missing the tags which leads to a "polygon", e.g. building=yes

this test should not fail, because applyOSMGeometry is called with a polygonal geometry. The oshdb-filter module was designed to be independent from possible alternative geometry building strategies, e.g. if one chooses to use a TagInterpreter which always returns isArea() -> false then the filter geometry:polygon would not match a closed-way-building polygon.

That's true, because we check on the geometry and not actually on the osm it self. But let me bring the same example but add one more assert.

public void testGeometryTypeFilterPolygon() {
    FilterExpression expression = parser.parse("geometry:polygon");
    OSMEntity validWay = createTestOSMEntityWay(new long[]{1, 2, 3, 4, 1});
    assertTrue(expression.applyOSM(validWay)); // should this be true if Taginterpreter.isArea(entity) -> false
    assertTrue(expression.applyOSMGeometry(validWay, gf.createPolygon()));

I'm thinking about a query like "type:way and barrier=wall and geometry:polygon" on a contribution endpoint. This could build potentially alot of geometries even if it is clear in the osm filter that this entity never should result in a polygon.
We could use the TagInterpreter.isArea/isLine methods already within the GeometryTypeFilter.applyOSM.
What do you think?

I just saw that for this example "barrier=wall", the DefaultTagInterpreter tag this as polygon! Is this intended?
Because according to the osm wiki https://wiki.openstreetmap.org/wiki/Tag:barrier%3Dwall#Mapping_as_an_area barrier=wall should only be a polygon if there is also the tag "area=yes" present, but we whitelist "barrier=wall" for polygons.

{ "key": "barrier",
  "polygon": "whitelist",
  "values": ["city_wall", "ditch", "hedge", "retaining_wall", "wall", "spikes" ]}

@tyrasd
Copy link
Member

tyrasd commented May 31, 2021

I just saw that for this example "barrier=wall", the DefaultTagInterpreter tag this as polygon! Is this intended?

Well, OSM constantly changes 😅. A few years ago when I first compiled this polygon tag list, the OSM wiki didn't really mandate tagging area=yes for polygonal walls. Also, normally tags in the OSM wiki which have the icon don't require area=yes to be added (take a look at waterway=dam for example). Apparently, the requirements for polygonal barrier=walls were in the meantime made more strict. It's probably time to update our tag list, I guess.

a query like "type:way and barrier=wall and geometry:polygon" […] never should result in a polygon.

I wouldn't say never since any OSM entity can always have the area=yes tag to make it an area, even when its base tags don't do so (e.g. there are currently almost 20k barrier=wall OSM features mapped with area=yes). But anyway, yes, we could exclude some features in the geometry type filters during the applyOSM stage based on their tags.

What do you think?

I think this is an optimization which only few queries really benefit from and it would make the implementation more complex, because we need to somehow make sure that the same TagInterpreter is used both for filtering and later building of the geometries. Also, I'm not 100% happy with the currentl implemenation of the tag-interpretation anyway, so I'm hesitating to further expose and rely on those parts of the OSHDB if not really necessary.

@tyrasd
Copy link
Member

tyrasd commented May 31, 2021

We could solve this, by letting the mapReducer.filter accept only the subtype Filter. Would this limit us in some way?

Yes: then one couldn't use the method anymore to set non-"primitive" filters with boolean operators (e.g. mr.filter(filterParser.parse("… and …")).

Here is an example which I actually had and which will result in an IllegalStateException("unsupported state during filter normalization") without an obvious reason: […]

I guess a simple "solution" for this would be that the MapReducer.optimizeFilters1 method would just silently fail in such situations. Of course the actual optimization could not take place then, but that would IMHO be OK in such cases. 😏

Base automatically changed from filter-contributions to master June 12, 2021 14:33
@tyrasd tyrasd closed this Jun 18, 2021
@tyrasd tyrasd deleted the refactor-filter branch June 18, 2021 15:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants