feat(locale): add street_name to en_US, en_GB and en #2371
Codecov Report
Additional details and impacted files
@@            Coverage Diff             @@
##             next    #2371      +/-   ##
==========================================
- Coverage   99.58%   99.58%   -0.01%
==========================================
  Files        2820     2823       +3
  Lines      253975   255522    +1547
  Branches     1103     1102       -1
==========================================
+ Hits       252922   254462    +1540
- Misses       1025     1032       +7
  Partials       28       28
If we are going to add these data, then they should appear in the street patterns as well.
You mean by simply including a reference to But then I would need to add
No, it should not include any suffixes. It should just add the street_names to the patterns.
So should we go this route?
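For illustration, the suggestion above (add `street_name` to the street patterns on its own, without a suffix) might look roughly like the sketch below. The data, the simplified `{{key}}` placeholders, and the `resolve` and `first` helpers are assumptions made for this example; they are not copied from the Faker source, whose real patterns and resolution logic may differ.

```typescript
// Hypothetical, trimmed-down locale data; all entries are placeholders.
type LocationData = Record<string, string[]>;

const locationData: LocationData = {
  first_name: ['Alice', 'Bob'],
  street_suffix: ['Street', 'Road'],
  street_name: ['Main Street', 'High Street'],
  // One existing-style pattern plus a new one that uses street_name alone,
  // with no suffix appended.
  street_pattern: ['{{first_name}} {{street_suffix}}', '{{street_name}}'],
};

// Replace every {{key}} token with an entry chosen by `pick`.
function resolve(
  pattern: string,
  data: LocationData,
  pick: (items: string[]) => string,
): string {
  return pattern.replace(/\{\{(\w+)\}\}/g, (_match: string, key: string) => {
    const values = data[key];
    if (values === undefined) {
      throw new Error(`Unknown placeholder: ${key}`);
    }
    return pick(values);
  });
}

// Deterministic picker for the example: always take the first entry.
const first = (items: string[]): string => items[0] ?? '';

const streets = locationData.street_pattern.map((p) =>
  resolve(p, locationData, first),
);
console.log(streets);
```

With a random picker instead of `first`, some generated streets would come straight from the curated `street_name` list while others would still be composed from a name and a suffix.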
I'm not convinced about using Wikipedia as a source. At least for en_GB, these are often "famous" streets associated with a specific city (there are noticeably many streets from London and Cambridge in the list). It might be better to pick more generic names that appear frequently in en, like Main Street or High Street.
Here is an alternative data source for en_GB: 400 common street names in the UK. I used https://www.ordnancesurvey.co.uk/products/os-open-names#get to get a list of ~900,000 street names, found the most frequent, then handpicked 400.
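The "found the most frequent" step could be sketched as a simple frequency count. This is only an assumption about how such a ranking might be done, not the script that was actually used; `mostFrequentNames` and the sample input are hypothetical, and the final handpicking remains a manual step.

```typescript
// Count how often each street name occurs and return the `limit` most frequent.
function mostFrequentNames(names: string[], limit: number): string[] {
  const counts = new Map<string, number>();
  for (const raw of names) {
    const trimmed = raw.trim();
    if (trimmed === '') continue; // skip blank records
    counts.set(trimmed, (counts.get(trimmed) ?? 0) + 1);
  }
  return Array.from(counts.entries())
    // Highest count first; ties are broken alphabetically for stable output.
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, limit)
    .map(([streetName]) => streetName);
}

// Placeholder input; a real run would read names from the downloaded dataset.
const sample = [
  'High Street',
  'Station Road',
  'High Street',
  'Main Street',
  'Station Road',
  'High Street',
];
const topNames = mostFrequentNames(sample, 2);
console.log(topNames);
```

The ranked output would then be reviewed by hand to drop names that are tied to a single place or otherwise stand out.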
Pardon me @matthewmayer but what exactly is your problem? Nevertheless, I still appreciate your feedback and can change the files accordingly.
Faker is often used for things like populating mockups, so it's good to avoid fake data that stands out too much. For example, we use common surnames like Smith, not celebrity names like "Arnold Schwarzenegger". Similarly, I think we should avoid "weird" street names: ones that exist only in a single place, contain profanity, etc.
Apologies if you felt I was stepping on your toes by proposing an alternative source of data. That was not my intention. The PR is good; I just think the Wikipedia category is not a great source, for the reasons listed above. Feel free to modify the original data or merge in some of my suggestions as you prefer.
No hard feelings. Just wanted to get this off my chest. My primary concern was the limited time I had to respond, especially considering that this project is OSS and not a commercial product. Again, your suggestions and alternative data sources were quite helpful so thank you for that.
Using a similar methodology and the US TIGER Shapefiles for all states of the US, here are just under 400 popular en_US street names
Approved, provided @matthewmayer's points are addressed 🙂
Thanks for reminding me that I still have a to-do here. I can get it done this afternoon.
14fca7b to a6c0012
Description
This PR adds street_name locale data to the location definitions of the locales en, en_US and en_GB. The following table shows the sources I used for each locale:
https://en.wikipedia.org/wiki/Category:Streets_in_the_United_States_by_state
https://en.wikipedia.org/wiki/Category:Streets_in_England
Links
This PR additionally fixes #2364.