GeoJSON for states #4784

Closed
michaelblyons opened this issue Dec 7, 2020 · 23 comments
Labels
considering Not Actionable - still considering if this is something we want

Comments

@michaelblyons
Collaborator

michaelblyons commented Dec 7, 2020

This is a poor man's extension to rapideditor/country-coder#19. (Related: rapideditor/country-coder#26) Is it worthwhile to make simple GeoJSON shapes that "loosely fit" states?

They would be useful for locationSet.include, but not really for exclude. I'm imagining something like

    {
      "displayName": "Some Regional Bus System",
      "locationSet": {
        "include": [
          "us-wa.lenient.geojson",
          "us-or.lenient.geojson",
          "us-id.lenient.geojson"
        ]
      },
      "tags": {
        "network": "Some Regional Bus System",
        "route": "bus"
      }
    },

It'd be really helpful, I imagine, for doing state police. Also, the country-code (text) search in NSI.guide would start working for locations tagged that way.

@michaelblyons michaelblyons added the question Not Actionable - just a question about something label Dec 7, 2020
@UKChris-osm
Collaborator

If we were going to go down the "*.geojson" road for States (and UK Counties 👍), could this data not be pulled from OSM's current boundaries, rather than loosely drawn?

@michaelblyons
Collaborator Author

If we were going to go down the "*.geojson" road for States (and UK Counties 👍), could this data not be pulled from OSM's current boundaries, rather than loosely drawn?

If there's a way to do this already, I'm all for it (👍). But the library used for NSI (if I understand correctly) does not have ISO 3166-2 codes since it would mean 10x-20x more shapes.

I presume the library would want to be strictly accurate. But here, I don't care if there's some overlap in adjoining shapes. Limiting things loosely, with simple 4-to-16-point shapes, is "good enough."

@UKChris-osm
Collaborator

Sorry for being vague - I meant couldn't we pull the boundary data (such as Florida) and save them as our own "*.geojson" features for use within the NSI?

@bhousel
Member

bhousel commented Dec 8, 2020

Just catching up on this...

It'd be really helpful, I imagine, for doing state police.

Yes, for state police it would be useful. Unioning the geometries together in the include is not very performant, but at least it is cached. For the case of a regional bus line, it would be much more performant to represent it as a rough polygon or box than to union 3 state polygons together.

Also, the country-code (text) search in NSI.guide would start working for locations tagged that way.

Ah yeah I need to fix that (#4077) - we shouldn't design the index around that country code search box.

If we were going to go down the "*.geojson" road for States (and UK Counties 👍), could this data not be pulled from OSM's current boundaries, rather than loosely drawn?

I don't think that would work - OSM's boundaries are way too detailed. We want to use the roughest polygon possible that answers the question "does this feature belong in here?"

I presume the library would want to be strictly accurate. But here, I don't care if there's some overlap in adjoining shapes. Limiting things loosely, with simple 4-to-16-point shapes, is "good enough."

I've drawn a bunch of these kinds of polygons and it is a lot of work. The shapes can be very undetailed around places where nobody lives, but then you will need to add a lot of points around more populated boundaries (like Kansas City or St. Louis).

@michaelblyons
Collaborator Author

Sorry for being vague - I meant couldn't we pull the boundary data (such as Florida) and save them as our own "*.geojson" features for use within the NSI?

@UKChris-osm Oh, I understand better now, thanks. I think the limiting factor is file size and whether that's a concern we have here. There are a couple states already in features/ and they're pretty minimal.

I've drawn a bunch of these kinds of polygons and it is a lot of work. The shapes can be very undetailed around places where nobody lives, but then you will need to add a lot of points around more populated boundaries (like Kansas City or St. Louis).

@bhousel If we take the OSM data, "grow" the boundaries by 10-30 km, and then simplify down to 30 points or so, I don't think it'd be a big problem. Ideally, we could keep any straight lines from the original shape, but if not 🤷.
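To illustrate only the "simplify" half of that idea (the "grow" step is not shown), here is a minimal sketch of Ramer-Douglas-Peucker simplification in plain JavaScript. It is not what NSI or location-conflation actually does; the function names and the tolerance are made up for the example, and distances are treated as planar degrees, which is only tolerable because the shapes are meant to be loose anyway.

```javascript
// Perpendicular distance from point p to the line through a and b.
// Planar approximation in degrees; good enough for "loose fit" shapes.
function perpendicularDistance(p, a, b) {
  const dx = b[0] - a[0];
  const dy = b[1] - a[1];
  const len = Math.hypot(dx, dy);
  // a and b coincide (e.g. a closed ring): fall back to point-to-point distance.
  if (len === 0) return Math.hypot(p[0] - a[0], p[1] - a[1]);
  return Math.abs(dy * p[0] - dx * p[1] + b[0] * a[1] - b[1] * a[0]) / len;
}

// Ramer-Douglas-Peucker: drop vertices that lie within `tolerance`
// of the straight line between the endpoints that are kept.
function simplifyLine(points, tolerance) {
  if (points.length < 3) return points.slice();
  const first = points[0];
  const last = points[points.length - 1];

  // Find the vertex farthest from the first-last line.
  let maxDist = 0;
  let index = 0;
  for (let i = 1; i < points.length - 1; i++) {
    const d = perpendicularDistance(points[i], first, last);
    if (d > maxDist) {
      maxDist = d;
      index = i;
    }
  }

  // Everything in between is close enough: keep only the endpoints.
  if (maxDist <= tolerance) return [first, last];

  // Otherwise keep the farthest vertex and simplify both halves.
  const left = simplifyLine(points.slice(0, index + 1), tolerance);
  const right = simplifyLine(points.slice(index), tolerance);
  return left.slice(0, -1).concat(right);
}
```

A larger tolerance yields fewer points. Hitting a fixed budget such as "30 points or so" would mean retrying with increasing tolerances, which this sketch leaves out.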

If some Missouri things get suggested on the Kansas side of KC, it's not a big deal, right? They'll only appear when you're typing something that matches anyway. The thing I'm looking to do here is keep "CSP" (Colorado State Patrol) separate from "CSP" (Connecticut State Police). If you live in Kansas City and there are two separate "First Bank"s on either side of the border, we should just displayName them so a mapper can pick the right one in the overlap region.
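As a rough sketch of how loose shapes would keep the two "CSP" entries apart: the code below uses two hypothetical bounding boxes (coordinates eyeballed, file names following the `*.lenient.geojson` pattern imagined at the top of this issue) and a plain ray-casting point-in-polygon test. It is only an illustration of the idea, not how iD or location-conflation resolve locationSets.

```javascript
// Ray-casting point-in-polygon test on a single ring of [lon, lat] pairs.
function pointInRing(point, ring) {
  const [x, y] = point;
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    // Does a horizontal ray from the point cross the edge (j -> i)?
    const crosses =
      (yi > y) !== (yj > y) &&
      x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Hypothetical loose boxes standing in for real .geojson files.
const looseShapes = {
  'us-co.lenient.geojson': [[-109.3, 36.8], [-101.8, 36.8], [-101.8, 41.2], [-109.3, 41.2], [-109.3, 36.8]],
  'us-ct.lenient.geojson': [[-73.9, 40.8], [-71.6, 40.8], [-71.6, 42.2], [-73.9, 42.2], [-73.9, 40.8]]
};

// Two entries that share the abbreviation but differ in displayName.
const items = [
  { displayName: 'CSP (Colorado State Patrol)', locationSet: { include: ['us-co.lenient.geojson'] } },
  { displayName: 'CSP (Connecticut State Police)', locationSet: { include: ['us-ct.lenient.geojson'] } }
];

// Return the displayNames of every item whose locationSet covers the point.
function suggestionsAt(point) {
  return items
    .filter(item => item.locationSet.include.some(id => pointInRing(point, looseShapes[id])))
    .map(item => item.displayName);
}

const nearDenver = suggestionsAt([-104.99, 39.74]);
const nearHartford = suggestionsAt([-72.68, 41.76]);
```

In an overlap region both entries would be returned, which is where distinct displayNames let the mapper pick the right one.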

For the case of a regional bus line, it would be much more performant to represent it as a rough polygon or box than to union 3 state polygons together.

Bummer.

@bhousel
Member

bhousel commented Dec 9, 2020

If some Missouri things get suggested on the Kansas side of KC, it's not a big deal, right?

I think it's not a big deal, but it's probably a big deal to someone somewhere. 😆

@UKChris-osm
Collaborator

I always assumed that iD could tell whenever you placed a node / way on the map that it was within a certain boundary or near a city as it would often suggest data based on that (such as which road name, post code or City to attach to a building within the address) and so adding state / county codes to the NSI would have been a simpler process.

If I were to create some rough outlines for England (Cornwall, Devon, Somerset etc) and wanted to use the Devon and Cornwall outline for Devon and Cornwall Police, could I apply two .geojson ("/gb/Dev.geojson" & "/gb/Corn.geojson") files to the entry, or would I have to create a merged "/gb/DevCorn.geojson" file for this?

@bhousel
Member

bhousel commented Dec 9, 2020

I always assumed that iD could tell whenever you placed a node / way on the map that it was within a certain boundary or near a city as it would often suggest data based on that (such as which road name, post code or City to attach to a building within the address) and so adding state / county codes to the NSI would have been a simpler process.

There are a few ways that iD accomplishes this, but they aren't very straightforward.

iD previously used the Nominatim API to know where in the world the user was editing. Now it includes the country-coder boundary data to get that answer quickly. Some background is in openstreetmap/iD#6941.

For anything where country coder isn't the right level of detail (state or local things), location-conflation can either generate a geojson for it or use a custom one that we supply - then iD can test whether that feature belongs where the user is editing.

So in terms of efficiency, most->least:

  • a locationSet with a single country-coder string in it {include: ['gb']}, iD already ships with this built in
  • a locationSet with multiple include/exclude - {include: ['fr', 'gb']}, location-conflation will generate it on the fly by merging them; this adds only some milliseconds, but it's not free.
  • a locationSet with a custom geojson - {include: ["corn.geojson"]}, it increases the amount of data iD needs to download, this adds many more milliseconds
  • at this point in the list we have some control over how detailed the geojsons are and how much space they take up, and how much downloading they will require and how many milliseconds it will take to merge them together. Exporting every state geometry at high detail to geojson and including multiples of them in the locationSets is a thing that will definitely work, but we're getting into bad-performance-territory.

So basically

  • use geojsons only when you have to
  • make them as small as possible (few points, reduced coordinate precision)
  • try not to include/exclude too many things in a locationSet
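For the "reduced coordinate precision" part, a minimal sketch of rounding every coordinate in a GeoJSON feature might look like the following. The helper names and the default of 3 decimals (roughly 100 m of latitude) are assumptions for the example, not a project rule.

```javascript
// Recursively round a GeoJSON coordinates array
// (works for Point, LineString, Polygon and MultiPolygon nesting).
function roundCoords(coords, decimals) {
  if (typeof coords[0] === 'number') {
    const factor = 10 ** decimals;
    return coords.map(n => Math.round(n * factor) / factor);
  }
  return coords.map(c => roundCoords(c, decimals));
}

// Return a copy of the feature with rounded coordinates; the input is not modified.
function reducePrecision(feature, decimals = 3) {
  return {
    ...feature,
    geometry: {
      ...feature.geometry,
      coordinates: roundCoords(feature.geometry.coordinates, decimals)
    }
  };
}
```

Rounding can turn neighbouring vertices into duplicates, so a real cleanup pass would probably also want to drop repeated points afterwards.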

@peternewman
Collaborator

Based on #4718 (comment), and I guess partly what @michaelblyons was angling at: is there an argument for adding shapes that are roughly the shape of the state/county and, crucially, named to match the ISO 3166-2 standard? Then people who don't want to or can't handle custom GeoJSON can see that our include was two files called GB-DEV.geojson and GB-CON.geojson and approximate those with their counterparts in ISO 3166-2. Meanwhile, we could merge them offline to keep our behaviour more performant if necessary.

@UKChris-osm
Collaborator

I've tried a couple of "*.geojson" files for Devon and Cornwall in England, hopefully I have done this correctly 👍

@michaelblyons
Collaborator Author

… approximate those to their pair in ISO 3166-2, meanwhile we could merge them offline to keep our behaviour more performant if necessary.

Broadly speaking, I like this idea. It does mean that we'd want to err on the side of slightly overlapping territory in the shapes. Otherwise, offline merging sends as many points as (or more than) the separate files would.

By way of example: if you look at the southeast end of california.geojson, the Arizona border is simplified. If we have an include: ["california.geojson", "arizona.geojson"] entry, the Arizona shape needs to either follow that border or overlap it. If it doesn't, the merged shape will have bubbles or very thin spikes, each of which will be extra coordinates to transfer. You could fill bubbles with additional [long, lat] includes, but that's a little tedious.

As I type this, I wonder a little bit whether someone else has already encountered such a problem and whether their shapes are public domain...

@bhousel
Member

bhousel commented Dec 11, 2020

My gut approach to this would be to grab a public database of US shapes, like found here, then load them into QGIS and use the topological editing feature to remove or adjust the shared vertices anywhere that we can simplify the shapes, and just spend a lot of time on making it good.

@bhousel
Member

bhousel commented Dec 11, 2020

My gut approach to this would be to grab a public database of US shapes, like found here, then load them into QGIS and use the topological editing feature to remove or adjust the shared vertices anywhere that we can simplify the shapes, and just spend a lot of time on making it good.

FWIW, I did spend some time on this today, but I gave up. Someone with more experience in QGIS or other similar tools could probably do this.

@bhousel bhousel added considering Not Actionable - still considering if this is something we want and removed question Not Actionable - just a question about something labels Dec 29, 2020
@UKChris-osm
Collaborator

Is there any performance gain / loss by including two or three single GPS points as a location set, and letting locationConflation build an area from that?

For example: 3-point radius

... or would a ".geojson" still be preferable?

@bhousel
Member

bhousel commented Jan 9, 2021

Is there any performance gain / loss by including two or three single GPS points as a location set, and letting locationConflation build an area from that? For example: 3-point radius
... or would a ".geojson" still be preferable?

That's very clever! I think a .geojson is still slightly preferable, just because it's more clear what it is doing, but they'd both be fast approaches.

I did improve the performance of the unioning recently, and took this comparison screenshot:
rapideditor/location-conflation#26 (comment)
Basically, the biggest hit to performance is unioning a lot of things (so, those locationSets where someone added like 20 countries). But even with that it performs well enough.

I've been adding this to iD the past few days, and it resolves them all in a few seconds. This means that for the first few seconds after iD starts up, a user won't have these presets available yet.

In practice it's surprisingly fast. The first time I tried it, I was planning to pay close attention to how long the task took. But iD showed me the "you have unsaved changes do you want to restore them" dialog box, and by the time I clicked "no" the background task had already completed! 😆

@UKChris-osm
Collaborator

That's very clever! I think a .geojson is still slightly preferable, just because it's more clear what it is doing, but they'd both be fast approaches.

I'd certainly agree that using a ".geojson" can make things clearer as you can give the area a meaningful name 👍

I was looking for a quicker way to add a little zone area that didn't require a ".geojson", as there were a couple of areas (such as London) where the 25km radius was a little too small to fit the whole London area in, or where there is one side of the area that sticks out a little and is out of the 25km radius.

Do you anticipate any future plans where locationConflation would enable a larger, or more customisable radius, such as:

  • {"include":[[-0.15,51.5,"small"]]}
  • {"include":[[-0.15,51.5,"large"]]}
  • {"include":[[-0.15,51.5,"6km"]]}
  • {"include":[[-0.15,51.5,"30km"]]}

@bhousel
Member

bhousel commented Jan 9, 2021

Do you anticipate any future plans where locationConflation would enable a larger, or more customisable radius, such as:

Yes, I could do that pretty quickly, and it would be useful for the Mattoon Burger King zone of exclusion too.

For history - originally I added this radius feature so that we could have all the YouthMappers chapters in the osm-community-index. It's awesome how much GeoJSON we can avoid shipping by letting people declare a point radius and having location-conflation build it on the fly.

@bhousel
Member

bhousel commented Mar 27, 2021

Do you anticipate any future plans where locationConflation would enable a larger, or more customisable radius, such as:

  • {"include":[[-0.15,51.5,"small"]]}
  • {"include":[[-0.15,51.5,"large"]]}
  • {"include":[[-0.15,51.5,"6km"]]}
  • {"include":[[-0.15,51.5,"30km"]]}

Hey @UKChris-osm! I released an update to location-conflation today to support an optional radius parameter like you suggested, so now we can use any size circles. 😄

The optional radius parameter is just a number in kilometers.
example: { include: [[-88.3726, 39.4818, 32]] }
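To give a feel for what "building a circle on the fly" from `[lon, lat, radius]` involves, here is a minimal standalone sketch. It is not location-conflation's actual implementation: the function name, the 32-step resolution, and the flat "about 111 km per degree of latitude" approximation are all assumptions for the example. The 25 km default is the one mentioned in this thread.

```javascript
const DEFAULT_RADIUS_KM = 25;     // default radius mentioned in this thread
const KM_PER_DEGREE_LAT = 111.32; // rough approximation, assumed for this sketch

// Build an approximate circular GeoJSON Polygon around [lon, lat, radiusKm?].
function circleFromPoint([lon, lat, radiusKm = DEFAULT_RADIUS_KM], steps = 32) {
  const dLat = radiusKm / KM_PER_DEGREE_LAT;
  // Degrees of longitude shrink with latitude, so widen the east-west extent.
  const dLon = dLat / Math.cos((lat * Math.PI) / 180);

  const ring = [];
  for (let i = 0; i < steps; i++) {
    const angle = (2 * Math.PI * i) / steps; // counterclockwise, starting due east
    ring.push([lon + dLon * Math.cos(angle), lat + dLat * Math.sin(angle)]);
  }
  ring.push(ring[0].slice()); // GeoJSON rings must repeat the first position

  return {
    type: 'Feature',
    properties: {},
    geometry: { type: 'Polygon', coordinates: [ring] }
  };
}

// The example from the comment above: a 32 km circle.
const exampleCircle = circleFromPoint([-88.3726, 39.4818, 32]);
```

Only three numbers need to be stored in the locationSet; the polygon is derived when needed, which is the saving being described.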

bhousel added a commit that referenced this issue Mar 27, 2021
@UKChris-osm
Collaborator

UKChris-osm commented Mar 27, 2021

You caught me out @bhousel as I had just run build, wiki and dist and was about to sync it when you updated :)

Is 25km still the default? UPDATE: I see that it is :)

@jdcarls2
Contributor

jdcarls2 commented Oct 4, 2021

I just ran into this when elaborating on a brand (Pinnacle Bank), and finding that there were in fact 2 bank brands in the US using this name, but in separate regions. Each bank's website helpfully displays a map of the states that they serve, and this seemed like a great way to differentiate the locations.

Reading back through this issue, I have considerable QGIS experience with things like topological editing. The few state geojson files I have added were individually simplified features, but it wouldn't be too much work to generate a file per state (or province, etc) with properly aligned features between each.

If there's real interest in having these sorts of features, I could add them to my pull request sometime this week probably. (Or would it be a separate pull request or something? Apologies, it's the first pull request I've done ever, and I'm still getting the hang of how Git works.)

@michaelblyons
Collaborator Author

If there's real interest in having these sorts of features,

From my perspective, I think that would be quite helpful indeed.

I could add them to my pull request sometime this week probably. (Or would it be a separate pull request or something? Apologies, it's the first pull request I've done ever, and I'm still getting the hang of how Git works.)

If I were in your place, I'd make a separate PR with all the borders that weren't necessary for your original Pinnacle Bank PR. Depending on how you do the shape naming, you may need to map pre-existing features to your new GeoJSON files. (If so, it'll just be a find-replace.)

Reading back through this issue, I have considerable QGIS experience with things like topological editing. The few state geojson files I have added were individually simplified features, but it wouldn't be too much work to generate a file per state (or province, etc) with properly aligned features between each.

Excellent! Please just recall that Brian asks that the shapes be as simple as reasonable: (If no longer relevant, Brian, jump in and say so. 😉 )

  • make them as small as possible (few points, reduced coordinate precision)

@bhousel
Member

bhousel commented Oct 4, 2021

@jdcarls2 said:

Reading back through this issue, I have considerable QGIS experience with things like topological editing. The few state geojson files I have added were individually simplified features, but it wouldn't be too much work to generate a file per state (or province, etc) with properly aligned features between each.

Whoa this would be great! I'd definitely welcome a PR with simplified seam-free geojson state shapes.
(Related: rapideditor/location-conflation#34 - @LaoshuBaby asked for this for China too!)

If there's real interest in having these sorts of features, I could add them to my pull request sometime this week probably. (Or would it be a separate pull request or something? Apologies, it's the first pull request I've done ever, and I'm still getting the hang of how Git works.)

Welcome to the project! 👋 Your PR looks really good - I would not have guessed that it's your first ever!
If you make the state shapes it's easier to review it as a separate PR that does only that.

(By the way, if you plan to make a bunch of PRs this month, you can sign up for the Hacktoberfest challenge and your PRs will count towards earning some swag - the name-suggestion-index project is participating!)

Also - if you want more help with setting up Git or making pull requests, we're happy to talk you through it. A bunch of us hang out on the OSM-US Slack: https://slack.openstreetmap.us/ in the #poi channel.

@michaelblyons said:

Excellent! Please just recall that Brian asks that the shapes be as simple as reasonable: (If no longer relevant, Brian, jump in and say so. 😉 )

Yes this is still true - we still need to download the shapes and union them together shortly after iD starts up. One promising development though: I'm working through a side-quest to move this work into a background thread (web worker), and it's going ok! facebook/Rapid#297

@bhousel
Member

bhousel commented Oct 13, 2021

Done in #5449, thanks again @jdcarls2 👏

This means that going forward we can use US states in defining locationSets! I'll start updating some of the existing items that I'm familiar with 👍
