[KeepRight] Use regex group capture to extract error details #5275

kymckay · 2018-08-31T22:11:47Z

As discussed here, this PR improves the extraction of details from error descriptions by directly using regex group capture instead of splitting the strings and using a lot of logic to figure out what's what (effectively re-implementing the same group capture behaviour).

Advantages of this approach:

We can extract exactly what we want, without relying on space characters to split the string cleanly.

No need to handle edge cases in the code like this:

iD/modules/util/keepRight/keepRight_error.js

Lines 66 to 73 in 74e06d4

    // handle special cases
    // error _170
    if (errorType === '_170') { return { var1: entity.description }; }

    // error _220
    if (errorType === '_220') { _220 = true; }

    if (errorType === '_401') { _401 = true; }

Less looping and fewer calls to regex .test()/.match() methods.
We're more explicit about what we expect in the error strings (e.g. \d+ shows that we expect a number, whereas {$1} didn't). This is a minor maintainability improvement.
Can easily handle errors that have multiple possible string templates (see types 312 and 282).
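
For illustration, here is a minimal sketch of group-capture extraction. The schema keys (`_x1`, `_x2`), the message templates and the `extractDetails` helper are all hypothetical and are not taken from errorSchema.json; the point is only that each error type can list one or more patterns, and the first pattern that matches supplies the details.

```javascript
// Hypothetical schema: keys and templates are made up for this sketch.
const exampleSchema = {
    // One template with two numeric details.
    _x1: { patterns: [/^way (\d+) has (\d+) nodes$/] },
    // Two possible templates for the same error type.
    _x2: {
        patterns: [
            /^tag "([^"]+)" is deprecated$/,
            /^tags "([^"]+)" and "([^"]+)" conflict$/
        ]
    }
};

// Returns the captured groups of the first matching template,
// or null when the type is unknown or no template matches.
function extractDetails(errorType, description) {
    const entry = exampleSchema[errorType];
    if (!entry) return null;

    for (const pattern of entry.patterns) {
        const match = pattern.exec(description);
        // match[0] is the whole string; the groups start at index 1.
        if (match) return match.slice(1);
    }
    return null;
}
```

Under these assumptions, `extractDetails('_x1', 'way 123 has 4 nodes')` gives `['123', '4']`, and both `_x2` templates are handled by the same loop with no special-case branches.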

Side effect of this approach:

I've changed the way that IDs are identified. The code to identify IDs and their type (node/way/relation) was a bit clumsy (and wouldn't correctly handle some cases), especially given some inconsistencies in the error messages it had to account for. It became clumsier still once the code was no longer looping over every word. So instead I've added an extra optional "IDs" key to the errorSchema.json which provides an array of strings whose indices correspond to the regex groups (see here). This lets us explicitly tell the code which groups capture an ID and what type of ID they are ("n"/"w"/"r").

It seems there are a couple of cases where strings can have an arbitrary list of node/way/relation IDs, depending on the geometry involved in the error. To get the IDs in these cases I'm capturing the whole list as a group and parsing it with dedicated code (see error types 211, 231 and 294).

To make life easier I've added a "regex" key to the errorSchema.json so that errors where the message is fixed aren't parsed for details at all (this saves us having to escape any special regex characters and avoids needlessly matching a fixed string for many errors). However, it might actually make more sense to just remove these messages altogether from the file since we don't need that data for anything.
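
A rough sketch of how these schema keys could fit together is below. The entry, its message template, the `pattern` key name, the use of an empty string for non-ID groups, and both helper functions are assumptions made for this example; only the "IDs" and "regex" key names and the "n"/"w"/"r" type letters come from the description above.

```javascript
// Hypothetical schema entry (not copied from errorSchema.json).
const exampleEntry = {
    regex: true,                       // parse the description for details
    pattern: /^node (\d+) has (\d+) tags and is near way (\d+)$/,
    IDs: ['n', '', 'w']                // group 1 is a node, group 3 is a way
};

// Pairs each captured group with its ID type, if it has one.
function extractWithIds(entry, description) {
    // Fixed messages carry no details, so skip matching entirely.
    if (!entry.regex) return [];

    const match = entry.pattern.exec(description);
    if (!match) return [];

    return match.slice(1).map((value, i) => {
        const idType = entry.IDs ? entry.IDs[i] : '';
        // Assumed convention: prefix the number with its type letter.
        return idType ? { value, id: idType + value } : { value };
    });
}

// Splits a captured list such as "#1, #22 and #333" into typed IDs.
function parseIdList(list, idType) {
    const ids = list.match(/\d+/g) || [];
    return ids.map(id => idType + id);
}
```

With this entry, `extractWithIds(exampleEntry, 'node 12 has 3 tags and is near way 45')` marks the first and third details as `n12` and `w45`, while the tag count stays a plain value. An arbitrary-length list is captured as one group and handed to `parseIdList`.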


kymckay · 2018-09-03T12:23:42Z

@thomas-hervey

I consider this PR now feature complete 😃

All error types now have details extracted via regex and as a bonus I've handled cases where arbitrary lists of IDs could be present so that they all become links.

Locations with some of the more complicated errors:

Some of the output messages on our end (in core.yaml) could use improvement for clarity and consistency (quoting keys and values for example), but I'll leave that for another PR.

thomas-hervey · 2018-09-05T16:45:24Z

Thank you very much @SilentSpike for the great refactor. I've reviewed these changes and I am fine merging them so long as there aren't parsing conflicts between the templates and the KeepRight.at server responses.
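
One possible way to look for such conflicts is to run a set of sample descriptions through the templates and list any that no template matches. This is only a sketch: `findUnmatched`, the shape of `patternsByType`, and the sample objects are hypothetical, and real samples would have to come from actual KeepRight responses.

```javascript
// Returns the samples whose description matches none of the
// patterns registered for their error type.
function findUnmatched(patternsByType, samples) {
    return samples.filter(({ errorType, description }) => {
        const patterns = patternsByType[errorType] || [];
        return !patterns.some(pattern => pattern.test(description));
    });
}

// Hypothetical data for illustration only.
const examplePatterns = {
    _x1: [/^way (\d+) has (\d+) nodes$/]
};
const exampleSamples = [
    { errorType: '_x1', description: 'way 1 has 2 nodes' },
    { errorType: '_x1', description: 'way 1 has two nodes' }
];

// Only the second sample is reported, since "two" is not a number.
const unmatched = findUnmatched(examplePatterns, exampleSamples);
```

The patterns must not use the `g` flag here, because `.test()` on a global regex keeps state between calls.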

kymckay added 11 commits August 31, 2018 21:42

Use regex group capture to extract error details

a233950

Convert preliminary amount of errors to regex

eb78395

This focuses on converting error types I found on the map for testing (can confirm the code is working!) as well as the more problematic cases from the old code to show that this approach can handle them easily.

Fix missing and unnecessary semicolons

4a74208

Convert all error message schema to regex

792d4ee

Add support for arbitrary number of details

3033142

Fix parsing of error type 231

40a9bd9

Adhere to style guide for variables

e7c30b8

Handle simple error descriptions explicitly

e7c31fb

This adds a flag to the error schema to explicitly say whether to parse the description with regex or not. Prevents us from having to escape special characters in fixed strings and is a minor optimisation.

Fix parsing of error type 211

25a358b

Fix parsing of error type 294

fdf27b7

Convert warnings to regex

d506a71

These aren't shown in the layer currently, but for future use this should work

Add missing error types 73, 74 and 75

a8f3311

thomas-hervey self-assigned this Sep 5, 2018

thomas-hervey merged commit ee35037 into openstreetmap:keep-right_QA Sep 5, 2018

kymckay deleted the keep-right-regex branch September 5, 2018 17:03

thomas-hervey mentioned this pull request Jan 10, 2019

KeepRight interface followup issues #5679

Open

5 tasks
