
Web conference notes, 2019.08.22

Vladinthecity edited this page Sep 13, 2019 · 25 revisions

Attendees

Agenda

(continued from the 8.08.19 meeting)

  1. https://github.com/CityOfLosAngeles/mobility-data-specification/issues/345
  2. https://github.com/CityOfLosAngeles/mobility-data-specification/issues/334
  3. https://github.com/CityOfLosAngeles/mobility-data-specification/issues/341
  4. https://github.com/CityOfLosAngeles/mobility-data-specification/issues/315
  5. https://github.com/CityOfLosAngeles/mobility-data-specification/issues/281

Minutes

Transcription:

Today we are continuing what is left of the 8/08 agenda.

The active PR from the log rotation discussion will be discussed in the next call in early September.

Harmonizing things between the Provider and Agency APIs (Max), contextualizing:

Agency is meant to be a superset of Provider. It has a few different goals and is meant to be real-time focused, but it is also meant to be harmonious with Provider. Mistakes were made, and as we have gotten deeper into the implementation there are places where things don't line up. We are therefore compiling a list, to be broken out into separate items, some of which will be non-breaking and some of which will be breaking.

Hunter: Any questions? This is mostly about making the event models equivalent.

John from Lime: Given that the idea is that Agency is a superset of Provider, will the breaking changes be ones where the events of Agency are used instead of the events of Provider?

Max E & A: There is not a fixed strategy; there may be modest changes to either. In recognition that more people use Provider, the default will be to change Agency.

Hunter: Hopefully by the next call we will have a list of all the discrepancies. We are just behind on getting that done. The effort is to minimize breakages. (Possibly more notes on this thought.)

Max E & A: I am going to commit to enumerating everything I can find in this ticket before the next meeting of this august body, and everyone is encouraged to find anything I have missed.

Hunter: Closed ticket on Agency service area stuff: every city has some kind of geofence-like thing that is articulated with their policy. There was an open ticket (from the Paris service area construct) on how to make this more robust, PR 322: how to express a city policy in a single consumable API endpoint and document. Would love more feedback from folks who are responsible for and dealing with this. From operators: how do you currently consume the different policies from different cities? From cities: what requirements do you have, how do you like to publish that, and what limitations are you currently facing?

Hunter: Providers, how do you consume the different policies, e.g. L.A.'s different caps?

John from LIME: The city will provide a GeoJSON or shapefile with a description in an email. Sometimes they will use different GeoJSON descriptions. Very manual and oriented around the shapefile.

Hunter: So someone in the GIS section makes the shapefiles, and someone on the transportation policy side writes down the rules description. Then you reconcile on the back end into the system that gets represented to users.

John from LIME: Yes, that is correct. We usually start the conversation with what we currently support, but basically as you described.

Hunter: Do any other providers have a different example?

Crickets:

Hunter: John, is standardizing this something that would be easier for you to process and consume, or an additional burden?

John from LIME: An enum of types of service areas that we would digest, standardized to a format; GeoJSON would be great. How often would these policies be changing? Is there a schedule? Knowing the expectations around changes would be useful.
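As a rough sketch of the kind of standardized format being discussed (the `rule_type` enum values and property names below are illustrative assumptions, not anything agreed in the meeting or defined in the spec), a city could publish its zones as a GeoJSON FeatureCollection that providers validate against a shared enum:

```python
# Hypothetical enum of service-area rule types (illustrative, not from the spec).
RULE_TYPES = {"no_ride", "no_parking", "no_deployment", "speed_limit"}

def validate_service_area(feature_collection: dict) -> bool:
    """Check that a GeoJSON FeatureCollection tags each zone with a known rule type."""
    if feature_collection.get("type") != "FeatureCollection":
        return False
    for feature in feature_collection.get("features", []):
        props = feature.get("properties", {})
        if props.get("rule_type") not in RULE_TYPES:
            return False
    return True

example = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {"rule_type": "no_parking", "description": "Beach path"},
        "geometry": {"type": "Polygon",
                     "coordinates": [[[-118.49, 34.01], [-118.48, 34.01],
                                      [-118.48, 34.02], [-118.49, 34.01]]]},
    }],
}
print(validate_service_area(example))  # True
```

The point of the enum is exactly John's ask: a fixed, digestible vocabulary instead of free-form descriptions arriving by email.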

Max: Have you had a chance to look at the Policy API?

John from LIME: no

Max: We should have a presentation on the Policy API; a commitment to a presentation is something we should have.

John from LIME: Would this be part of agency or something else?

Max: Can be something outside of Agency.

Hunter: One of my comments is that this should be implementable with ArcGIS to keep complexity down: monthly/weekly updatable files. I know Kagen has some thoughts on this too, and some other cities as well. Would changing this from the manual process to something more standardized be something you would personally adopt?

Kagen from Santa Monica: Yes, we would be interested in expressing e.g. the no-ride zones, no-parking zones, and no-deployment zones that are currently on the ESRI open portal; we want to adopt a more standard approach to this. Two comments: 1) I definitely hear John and others on receiving these vast arrays of different types of geofiles, and I just want to be careful to support the common-use files, which are shapefiles. And, to Hunter's point, make this relatively simple to implement; all cities will have shapefile accessibility. 2) On scheduling the expectation of updates: our approach is that we update and expect companies to check for our updates. Scheduling may not work; we cannot schedule e.g. a water main break. We would be careful about prescribing something that may be too hard to implement.

Max: It's also explicitly not included in the spec, as it will vary from city to city.

Kagen: Definitely interested in looking at this further.

Hunter: Communication between cities and providers will not be superseded by this.

Kagen: E.g. a marathon or long-term projects should and could be scheduled.

Hunter: Please comment further on GitHub. Does anyone have further thoughts, design requirements, or things you don't want to see?

Question from chat: Do you think making service areas real-time would be a good idea?

Hunter: I think right now the expression of policies and the service areas API draft are very much real time; Max, please correct me if I am wrong.

Max: If you know the end date, you publish the end date; if you don't know the end date, you publish it without one, and it's on the agency to specify how often the provider checks in. That is the current design intent, and it is being used in L.A. currently. The city would be in the position to say what frequency providers need to check in at. It varies from city to city.

Hunter: Another question: why not incorporate this into the Provider API?

The reason why not: this is information created by the city government or regulator.

Max: If the city needs to communicate in real time, how do you do this? Are you going to ask providers to open a long poll or a web socket, or at what frequency should they poll?

Hunter: No answer for that. That is an issue to discuss further.

Max: totally agree.

Hunter: Any last questions concerning pain points, articulating this issue, or more work to be done?

Max: Please look at the Compliance API draft. The compliance engine that we built takes policy, the state of vehicles, and the geofences as input, and emits the degree to which you are not in compliance, or not following the rules. We are using this with our current geofences and DACs; this is all open source and available.
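The shape Max describes, policy plus vehicle state plus geofences in, degree of non-compliance out, can be sketched minimally. Everything here (the type names, the cap rule, the pre-resolved geofence flag) is an illustrative assumption, not the actual open-source engine:

```python
from dataclasses import dataclass

@dataclass
class Vehicle:
    vehicle_id: str
    in_service_area: bool  # assume geofence containment is already resolved upstream

def count_violations(vehicles: list, cap: int) -> dict:
    """Toy compliance check: a deployment-cap rule plus a geofence rule.
    Returns the degree of non-compliance per rule (counts over the limit)."""
    deployed = len(vehicles)
    out_of_area = sum(1 for v in vehicles if not v.in_service_area)
    return {
        "cap_excess": max(0, deployed - cap),
        "out_of_service_area": out_of_area,
    }

fleet = [Vehicle("a", True), Vehicle("b", False), Vehicle("c", True)]
print(count_violations(fleet, cap=2))  # {'cap_excess': 1, 'out_of_service_area': 1}
```

Note that the output is a measure of *how far* out of compliance the fleet is, not just a boolean, which matches the "degree to which you are not in compliance" framing above.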

Hunter: We will hold that discussion for a future presentation; it's not quite fully baked yet. Bill, are you on the call?

Bill: PR 341 is about indicating when status events are removed from the Provider API, e.g. the status change endpoint. Sometimes we get different things back if we query for the same time range; sometimes events have been removed, and there is no way to indicate that right now, so when we calculate metrics or a real-time map on top of status changes, things change a lot. It's nice to know what was true at a point in time and to be able to audit that. That is the motivation. My proposal is to add a field 'retraction time' (or choose another name) which would be null if the event is valid, and which you can then populate with a timestamp when you invalidate the event. There are other solutions, but this is the basic overview.
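A minimal sketch of Bill's proposal, assuming a placeholder field name `retraction_time` (the final name was explicitly left open) and millisecond timestamps as used elsewhere in MDS:

```python
import time

# Hypothetical status-change event carrying the proposed nullable field.
event = {
    "event_type": "trip_start",
    "event_time": 1566500000000,
    "retraction_time": None,  # None means the event is still valid
}

def retract(evt: dict) -> dict:
    """Mark an event as invalidated without deleting it, so audits still see it."""
    evt["retraction_time"] = int(time.time() * 1000)
    return evt

def is_valid(evt: dict) -> bool:
    return evt.get("retraction_time") is None

print(is_valid(event))   # True
retract(event)
print(is_valid(event))   # False
```

The key property is that retracted events remain queryable, so a consumer can reproduce what the feed looked like at any past moment and audit why a metric changed.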

Hunter: Would this be super burdensome? (I think it's only Lime.)

John Doe: What would be a case or decision that would lead to this? I am a little concerned about the back side.

John from LIME: We try to publish cases as soon as we can. E.g. a rider is connected via cell and scooter and both events are published or logged, and we may want to remove one of the events to make the data look cleaner. Brainstorm scenario.

Bill: Makes sense; I remember an example of a provider running an algorithm cleaning up events from the previous 24 hours.

John from LIME: E.g. we pick up a scooter for rebalancing, but after picking it up we realize it may be a maintenance event, and therefore we want to go back and change that event.

Caitlin from Remix: We are seeing the same things, query discrepancies, and we don't have the bandwidth to go back in time and re-scrape the data. How do we handle updates? Other than modifying the event, maybe publish a new event that contains the later update or delta of the old event, e.g. removing an event. Then there is never a case where things have silently changed, just a record of a change.

Bill: I mentioned something like that as a solution. The main issue with that is that there are non-localized events, e.g. x minus two weeks ago. A researcher may not pick up those events/corrections that get released.

Caitlin: Good point; they may not pick up corrections.

Ryan from Remix: An alternative: similar to the published-at timestamp, have an updated-at timestamp which could encompass creation, updating, and retraction, and then, as an alternative to querying over the actual event time, let clients query over the updated time. That way clients can get a stream of any records that were updated, created, or removed if they are in that mode of following the log, but alternately they can use the historical API.
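Ryan's alternative can be sketched as two query modes over the same records: one over event time (the historical view) and one over updated time (the follow-the-log view). Field names and the record layout here are assumptions for illustration:

```python
# Each record carries an "updated" timestamp covering creation, modification,
# and retraction, alongside the original event time.
events = [
    {"id": 1, "event_time": 100, "updated": 100},
    {"id": 2, "event_time": 150, "updated": 150},
    {"id": 1, "event_time": 100, "updated": 300},  # later correction to event 1
]

def query_by_updated(records, since):
    """Follow-the-log mode: every record created, changed, or retracted after `since`."""
    return [r for r in records if r["updated"] > since]

def query_by_event_time(records, start, end):
    """Historical mode: query over when the event actually happened."""
    return [r for r in records if start <= r["event_time"] < end]

print([r["id"] for r in query_by_updated(events, since=200)])   # [1]
print([r["id"] for r in query_by_event_time(events, 0, 200)])   # [1, 2, 1]
```

This addresses Bill's non-localized-events concern: the correction to event 1, applied long after the fact, still shows up in the updated-time stream even though its event time is two windows old.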

Alex from SF: I agree there are problems with the differences in the queries; I don't know how we got to this issue. We don't want to be in a position where we are updating back data. Can we agree on a time after which there are no more updates to query?

Unknown speaker: Can we do a combination of the proposed solutions: on one hand an updated-at timestamp on historical events, and then maybe a key to a future event, the delta event, that shows there has been a change, so one can see it when looking at the data historically. Another option is to treat the event stream as an append-only log; that would solve both use cases, and if we continue to treat the log as append-only it would overlap with Agency better.

Hunter: From our experience here: we had a provider whose API had some internal issues, missing data for 3 days and usurped data, and we had to do a backfill. Our current strategy for making sure we have a complete record is that every hour we query for the last 12 hours. Would be nice to get a nice ...

Unknown: This speaks to the tension between the original use of Provider and the new use of Provider as something near real time. We may need to spend more time on this.

Hunter: Three-legged stool: Provider for 3 days ago, GBFS for what's on the street right now, Agency for how to communicate. I know GBFS has a lot of potential changes; is anyone from their working groups on?

Hunter: A question for the companies that do real-time visualization: is there a reason why you went with reconstructing from Provider instead of scraping GBFS?

Unknown, maybe Remix: GBFS doesn't have any information on unavailable scooters; also, in general, having an API that we can ingest by polling is more accurate, and this is how we are thinking about Agency vs. Provider.

Caitlin from Remix: The historical aspect of Provider/Agency has been very helpful; we can't backfill from GBFS.

Unknown from GBFS: We could propose a change to GBFS to add broken vehicles; it wouldn't be far off.

Hunter: GBFS hasn't been doing updates in a long time, and it would be good to expand our solutions with other tools. Anyone on from MobilityData? If you don't know, MobilityData is the new consultant to NABSA for GBFS. I will reach out.

Unknown: Providers have expressed concern about this before, if it could not be verified.

Unknown: Red herring, immutable, historical data, timestamping, reprocessing data.

Unknown: Append-only log? (More recent data, integrity structure)

Unknown: Not opposed to that solution. A query referencing updated time?

Hunter: This is complicated; I'm trying to wrap my head around all these excellent ideas. Will add these notes to PRs 315 and 341.

Hunter: OpenAPI, Remix folks: any progress on this?

Caitlin: No progress on this. Demetri is out for the next two weeks; no progress until next time.

Hunter: I think there is broad consensus on having OpenAPI alongside all the JSON schemas. Any other strong feelings?

Unknown: Confused; are we thinking about getting rid of the JSON schemas and using OpenAPI? That was the question earlier.

Hunter: My understanding is we will keep the JSON schemas for the data types, but the API shape, things like the trips query parameters and start/end times, will be done in OpenAPI. If that is incorrect, let's reconcile that now.

Unknown: Can everything be in OpenAPI? I don't know.

Chris from BOLT: Happy to look into this some more, and to support both and deprecate one in favor of the other if there is a preference.

Unknown: Share notes with Demetri when he is back.

Hunter: Any last things? Alright, see you all in two weeks.
