Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Terms preset does not work properly #2756

Closed
bagage opened this issue Aug 21, 2015 · 11 comments
Closed

Terms preset does not work properly #2756

bagage opened this issue Aug 21, 2015 · 11 comments

Comments

@bagage
Copy link
Contributor

bagage commented Aug 21, 2015

If I understood it correctly, terms list in preset are synonyms for one kind of structure. For instance, given the French presets, Digue should be equivalent of Terre-plein (for key embankment). However if a select a line and type Digue, Terre-plein is the 32nd of 34 total results while it should have been the 1st.
The same goes for other words (WC is not leading to toilets, etc.), so am I misinterpreting these terms or is something wrong?

Using iceweasel 40.0 on Debian 4.1.5-1, same behavior on 2 different computers as well as on Windows/Firefox. French installed on both computers but one has iceweasel in French while the other has it in English.

@ghost
Copy link

ghost commented Aug 28, 2015

Same here.

Steps to reproduce

  1. Set your OSM account to Dutch (nl)
  2. Open the iD instance at osm.org
  3. Add a point
  4. Search for wc

Expected result

Toiletten is the top match.

Actual result

Toiletten isn't even in the list at all.

This seems to work fine for longer keys though: e.g. pinnen correctly resolves to Bankautomaat.

Note

The translation of terms that is included in this instance of iD is WC. I've just added more synonyms in Transifex.

@bagage
Copy link
Contributor Author

bagage commented Sep 1, 2015

OK, I think I found the reason behind that semi-random behavior. terms array must be filled with lowercase AND comma-separated (without space) synonyms.

For instance, amenity/toilets for French is:

"terms": "Toilettes, WC"

To work properly, it must be changed to:

"terms": "toilettes,wc"

I am not sure where the fix should be though. Some solutions could be:

  1. preset.terms should trim terms after splitting?
  2. collection.search should convert to lowercase everything when comparing 2 strings?
  3. format presets.json and all translated JSON so that preset.terms are correctly formatted. Must be done automatically after fetching new translations from Transifex (maybe in Makefile). This solution is not needed if both 1) and 2) are done.

@ghost
Copy link

ghost commented Sep 1, 2015

OK, I think I found the reason behind that semi-random behavior. terms array must be filled with lowercase AND comma-separated (without space) synonyms.

No, that should not have any impact. It is true that lowercase and without spaces is preferable, but it is not necessary to have the search work.

@bagage
Copy link
Contributor Author

bagage commented Sep 1, 2015

What is doing the work for it then? I tried on a local copy and had it working only by changing as mentioned. But I am not js programmer so I believe you. However I could not find the code handling that; can you point me in its direction? :)

@ghost
Copy link

ghost commented Sep 1, 2015

I haven't looked at that code myself, I was told so by an iD developer some time ago. IIRC it splits on commas and then calculates the Levenshtein distance. It may be that with the comma, the Levenshtein distance is bigger, but for longer words this does not matter too much?

@ghost
Copy link

ghost commented Sep 1, 2015

This seems to be the relevant code: https://github.com/openstreetmap/iD/blob/master/js/id/presets/collection.js#L75, but I am not really familiar with the code.

@bagage
Copy link
Contributor Author

bagage commented Sep 1, 2015

Well, Levenshtein distance is not the only method used; basically search is documented to do the following. It will compare the given value converted to lowercase:

  1. with all presets' name (lowercased)
  2. then all presets' terms (not lowercased)
  3. then all presets' tags (not lowercased)
  4. then levenshtein distance for all presets' name (maybe is it case insensitive, but not sure about that)
  5. then levenshtein distance for all presets' terms (maybe is it case insensitive, but not sure about that)
  6. then it concatenates all these result in the order above (not sure about that totally).

Levenshtein distance might find expected result sometimes - but not in first position (see my first post with Terre-plein and Digue) while when a given input value is exactly a term value, it should be the first in the results list . And moreover, Levenshtein may not behave correctly to trailing space nor case issues while this should not affect results list.

edit: renaming Digue to digue in fr.json make it appears first.

@ghost
Copy link

ghost commented Sep 1, 2015

Observation of the app's behaviour:

  1. changing the case does not affect the search results;
  2. putting a space in front of your word does, but it still doesn't find our toilets in the Dutch version.

@bagage
Copy link
Contributor Author

bagage commented Sep 1, 2015

What did you modify? In nl.json there is only one entry in terms which is WC. If you change it to:

            "terms": "wc"

Then recompile the application entirely with make clean && make (otherwise it seems to not be reloaded). Then launch python -m SimpleHTTPServer and try again. It is working for me, I do not see why we would get different behaviors.

@ghost
Copy link

ghost commented Sep 1, 2015

Seems you are right. Without capitals it finds it.

@ghost
Copy link

ghost commented Sep 1, 2015

I am working on this. I first changed js/id/presets/preset.js # 49:

/*from*/ return preset.t('terms', {'default': ''}).split(',');
/* to */ return preset.t('terms', {'default': ''}).toLowerCase().split(',');

and it worked but then I thought: why make the client do that every time, when we can settle this at make translations time. Now I am learning lodash to accomplish this.

@ghost ghost mentioned this issue Sep 1, 2015
@bhousel bhousel closed this as completed in 209b324 Sep 6, 2015
bhousel added a commit that referenced this issue Sep 6, 2015
Fix "terms" translations (closes #2756)
MarianoTucat added a commit to skedgo/iD that referenced this issue Sep 10, 2015
* upstream/master: (346 commits)
  Add test case for callback function returning false
  Correct coding style
  Make raw tag editor show docs for key=value (closes openstreetmap#2754)
  Improvements to access field (closes openstreetmap#2763)
  Disable fullscreen button, add keyboard shortcuts
  update tests to reflect changes in dd32ec3
  Correct API version in osmChange and changeset XML
  Fix "terms" translations (closes openstreetmap#2756)
  Remove Raven code (closes openstreetmap#2769)
  Less strict polygon intersection test in findOuter (closes openstreetmap#2755)
  Use different leaf_cycle/leaf_type for singular tree (closes openstreetmap#2753) And don't show "mixed" options for singular tree (closes openstreetmap#2752)
  Change caption "Access" -> "Allowed Access" (closes openstreetmap#2761)
  Fix broken link and other help improvements (closes openstreetmap#2760)
  don't try to label if centroid is undefined
  Remove unnecessary PhantomJS install step (as mocha-phantomjs is already included as dev dependency)
  switch jshint to eslint (closes openstreetmap#2733)
  Add make rule to npm install maki
  Replace 'X' with Cancel button on save panel (closes openstreetmap#2378)
  Add recycling:glass_bottles, recycling:plastic (closes openstreetmap#2730)
  Add preset for leisure=bowling_alley (closes openstreetmap#2734)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant