Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tests and fixes of potential duplicate consecutive node issues in way.js #3676

Closed
wants to merge 3 commits into from

Conversation

slhh
Copy link
Contributor

@slhh slhh commented Dec 24, 2016

I have added some tests for replaceNode, updateNode, addNode, and removeNode. These are partially failing with the original code, due to some potential issues in view of duplicate nodes.

Reworked code passing these tests is in the second commit.

In addition I have added some code comments.

Related issues are #1296 and #3211.

@bhousel
Copy link
Member

bhousel commented Jan 4, 2017

Hey @slhh sorry for taking so long to look at this. I think it's a pretty good improvement. I agree that range checking the index argument and removing the duplicates is worth doing.

One thing that worries me is that there may be places elsewhere in the code where we are calling addNode and updateNode expecting them to just use Array.splice, so any change in behavior from what Array.splice does might affect something else.

I normally hate exceptions, but I think in this case I'd almost rather throw an out of range exception when trying to call these functions with a weird index value. Rather than silently succeed by adding the node to the beginning or end or something like that. That way we can track down places in the code where the assumptions fall short.

  • with exceptions: "I tried to circularize a way and it made an error and iD stopped working"
  • silent failures: "I tried to circularize a way and the corner got a bit mangled but I just continued editing and told nobody."

I guess it's a judgement call, but I think the case for exceptions is better for us for debugging and better for OSM not getting bad data put into it. What do you think?

Again, overall this looks totally fine.. My worry is about other code in iD that assumes the methods internally use Array.splice or that contains subtle bugs.

@slhh
Copy link
Contributor Author

slhh commented Jan 5, 2017

One thing that worries me is that there may be places elsewhere in the code where we are calling addNode and updateNode expecting them to just use Array.splice, so any change in behavior from what Array.splice does might affect something else.

This worried me too, therefore, I tried to reproduce the splice behavior. All the index checks are intended to do this, and don't limit the useable index range.
Of course one difference remains which is the behavior in case of duplicates. A calling function expecting the updated node at a certain position or a certain length can fail. I think this pull request should not be merged just before a release.

I an earlier version I haven't supported the negative index range. The one direct test with negative index failed, but no additional higher level tests failed. And a little manually testing hasn't shown an issue.
Therefore, I think the negative index range is not used, but I can't be absolutely sure.

In case we want to limit the useable index range, which makes sense in my opinion, an exception seems to be useful. Can we throw it in case of debugging only? An exception in the field is only useful in case we want to address the resulting issue with high priority. We should not impair user experience just to fill the bug tracker without working on it.

Provided we prevent duplicate nodes in incoming data from the database, we may also throw an exception in case the functions have to remove duplicates. We do only need to compare the resulting lenght with the expected length. I'm not sure if we want to let calling code rely on way.js for duplicate node prevention or want to have all required rules for prevention in the calling code. In the later case an exception can be useful to find the bug.

@bhousel
Copy link
Member

bhousel commented Jan 5, 2017

In case we want to limit the useable index range, which makes sense in my opinion, an exception seems to be useful. Can we throw it in case of debugging only? An exception in the field is only useful in case we want to address the resulting issue with high priority. We should not impair user experience just to fill the bug tracker without working on it.

The other notable place we throw an exception in iD is in graph.entity, when calling code requests an entity which is no longer in the graph. This is always a programming error, and we don't want the user to continue editing with the graph in a bad state.

So yes, anytime a user reports something like that, it's very high priority to get it fixed.

Provided we prevent duplicate nodes in incoming data from the database, we may also throw an exception in case the functions have to remove duplicates.

I don't think we want an exception when removing duplicate consecutive nodes because I'm sure there are some cases of this in existing OSM data, which is pretty messy from imports, old bugs, even people just messing around with the api. For a little while there were even some ways referencing nodes that had been deleted (there is no referential integrity in the OSM database).

@pnorman
Copy link
Contributor

pnorman commented Jan 6, 2017

there is no referential integrity in the OSM database

Just to clarify, this is wrong, and in the way nodes table node_id is a foreign key: https://github.com/openstreetmap/openstreetmap-website/blob/c83f15821d5a48dd424cc5b534939d7f30a9fae0/db/structure.sql#L2184-L2189

@slhh
Copy link
Contributor Author

slhh commented Jan 6, 2017

@bhousel I have added a commit changing it to throw exceptions. Feel free to use it as you like.
The Travis build failed, but maybe you know why. It seems to have similar problems as the Windows build is used to have even with current master.

with exceptions: "I tried to circularize a way and it made an error and iD stopped working"

The best would be if iD were catching the exception, reporting the error, and resuming from the end of the last successful action. I don't know how complex this would be, at least my current knowledge is too limited to do it.

I don't think we want an exception when removing duplicate consecutive nodes

I would agree, but for the reason of making actions easier due to relying on way.js for duplicate prevention.

because I'm sure there are some cases of this in existing OSM data,

Of course they are, but that's not a reason, we just had to add some code eliminating the duplicate e.g. in the OSM service. That's why I wrote: "Provided we prevent duplicate nodes in incoming data from the database"

which is pretty messy from imports, old bugs, even people just messing around with the api.

You have missed iD's still existing bugs here. I have watched iD generating a duplicate when testing for my other pull request, which is nicely monitoring such duplicates.
It was just a somewhat failed click on an connecting node of three starting or ending highway, just the first click after starting iD. I recognized a little movement of the node and the undo stack shows something like way connected to line (in German). Selecting the node the new parent ways sections showed that the node has index 1 and 2 of the same one of the three ways. It wasn't a duplicate from the database and I haven't been able to reproduce the bug.

I don't think the problem was due to my code modification. There were only the modification of the sidebar, but not the changes to mode select at that stage. And the sidebar haven't been opened when the bug occurs.

@bhousel
Copy link
Member

bhousel commented Jan 6, 2017

You have missed iD's still existing bugs here. I have watched iD generating a duplicate when testing for my other pull request, which is nicely monitoring such duplicates.
It was just a somewhat failed click on an connecting node of three starting or ending highway, just the first click after starting iD. I recognized a little movement of the node and the undo stack shows something like way connected to line (in German). Selecting the node the new parent ways sections showed that the node has index 1 and 2 of the same one of the three ways. It wasn't a duplicate from the database and I haven't been able to reproduce the bug.

Good find though, can you open a separate issue for this? Maybe it's possible to double click around the endpoint of a way and have iD try to add a midpoint there that ends up merging into the existing node.

@bhousel
Copy link
Member

bhousel commented Jan 7, 2017

Just a quick update, I'm adding a lot of tests (like hundreds) and changing slightly the logic to handle different kinds of duplicates in different places along the nodelist, possibly involving connecting nodes on circular ways. Once we throw those into the mix, things get kind of complicated. It might take another day or two to merge fully, but this work should make these functions a lot safer in the end.

@slhh
Copy link
Contributor Author

slhh commented Jan 12, 2017

Good find though, can you open a separate issue for this?

@bhousel
I hope it it is fixed by this pull request. Do you really want a new issue? Maybe I can add the information to #1296 instead.

Maybe it's possible to double click around the endpoint of a way and have iD try to add a midpoint there that ends up merging into the existing node.

That was my first idea too, but I haven't been able to reproduce it. Maybe its needs to be too fast to be reproduced intentionally.

@bhousel
Copy link
Member

bhousel commented Jan 12, 2017

I hope it it is fixed by this pull request. Do you really want a new issue? Maybe I can add the information to #1296 instead.

Yes, reading #1296 again, it sounds like you found one of the pathways to cause it. You can add the info there and it hopefully will be fixed by this PR.

I wanted to merge this yesterday, but I realized that our area drawing mode does intentionally splice the node array using index -1 which breaks the assumption that addNode is always called with a positive range. This is related to #1369, so I'm going to attempt to fix that also by having the area drawing code add the node to an existing temp way where it belongs rather than recreating it each time.

bhousel added a commit that referenced this pull request Jan 12, 2017
These changes are needed now that `addNode`
* wants to preserve circularity
* automatically remove duplicates
* range checks index argument (can't call it with -1 anymore)
@bhousel bhousel added the wip Work in progress label Jan 13, 2017
@bhousel
Copy link
Member

bhousel commented Jan 16, 2017

Thanks @slhh for kicking this off.. I just merged the updated version!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants