Moving bikes showing up in HOPR free bike feed #201

Closed
mjarrett opened this issue Dec 3, 2019 · 7 comments

Comments

@mjarrett
Contributor

mjarrett commented Dec 3, 2019

I've noticed a problem in the free_bike_status.json feed for at least two HOPR bikeshare programs. I'm hoping to get confirmation that I'm reading the situation correctly and advice on how to address this.

I query the free bike feed roughly once a minute and record the location of all the bikes. For other systems I've looked at, bikes disappear from the feed as they're booked and reappear when they're available. In these HOPR feeds, the bikes remain in the feed and their movement can be tracked while in use. This doesn't conform to the spec, and it makes it difficult to write general analysis tools that work across all GBFS feeds.

I've been focusing on HOPR's UBC system (Vancouver, Canada) but have tested with HOPR Orlando and see the same behaviour. A simplified version of the code I'm using is below if you want to try to reproduce this (Python with Pandas).

import datetime as dt
import pandas as pd
import time
import urllib.request  # the request submodule must be imported explicitly
import json

def query_free_bikes():
    url = 'https://gbfs.hopr.city/api/gbfs/13/free_bike_status'
    with urllib.request.urlopen(url) as data_url:
        data = json.loads(data_url.read().decode())

    df = pd.DataFrame(data['data']['bikes'])
    df['bike_id'] = df['bike_id'].astype(str)

    # Index every row by the feed's last_updated value (POSIX seconds, UTC)
    df['time'] = pd.to_datetime(data['last_updated'], unit='s', utc=True)
    df = df.set_index('time')

    return df

# Every minute or so query the feed and add to dataframe
df = pd.DataFrame()
for i in range(60):
    df = pd.concat([df,query_free_bikes()])
    time.sleep(60)

# pivot dataframe to get location of each bike over time
df['coords'] = list(zip(df.lat,df.lon))
pdf = pd.pivot_table(df, values='coords',index='time',columns='bike_id', aggfunc='first')

The above code runs for an hour and produces a dataframe with the coordinates of each bike at each sampled time. An example of a moving bike:

2019-12-03 17:34:24+00:00            (49.26472, -123.259653333333)
2019-12-03 17:37:34+00:00            (49.26472, -123.259653333333)
2019-12-03 17:40:52+00:00            (49.26472, -123.259653333333)
2019-12-03 17:43:57+00:00            (49.26472, -123.259653333333)
2019-12-03 17:47:00+00:00    (49.2647777777778, -123.257742222222)
2019-12-03 17:50:03+00:00    (49.2602622025558, -123.252093005683)
2019-12-03 17:53:11+00:00    (49.2589688888889, -123.247982222222)
2019-12-03 17:56:18+00:00    (49.2589688888889, -123.247982222222)
2019-12-03 17:59:24+00:00    (49.2589688888889, -123.247982222222)
2019-12-03 18:02:26+00:00    (49.2589688888889, -123.247982222222)
2019-12-03 18:05:29+00:00    (49.2589688888889, -123.247982222222)
2019-12-03 18:08:32+00:00    (49.2589688888889, -123.247982222222)
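
Movement like this can also be flagged automatically rather than by eye. Below is a minimal sketch, assuming the long-format dataframe built above (indexed by time, with bike_id, lat and lon columns). The helper names `haversine_m` and `moved_between_samples` and the 50 m threshold are illustrative assumptions, not part of GBFS or of the code above; the threshold is only there to ignore GPS jitter on parked bikes.

```python
import math

import pandas as pd


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def moved_between_samples(samples, threshold_m=50.0):
    """Return rows where a bike is more than threshold_m from its previous sample.

    `samples` is assumed to be indexed by sample time and to have
    bike_id, lat and lon columns. threshold_m is an arbitrary cut-off.
    """
    s = samples.rename_axis('time').reset_index().sort_values(['bike_id', 'time'])
    # Previous position of the same bike (NaN for each bike's first sample)
    prev = s.groupby('bike_id')[['lat', 'lon']].shift(1)
    s['prev_lat'] = prev['lat']
    s['prev_lon'] = prev['lon']
    s = s.dropna(subset=['prev_lat', 'prev_lon']).copy()
    s['moved_m'] = [
        haversine_m(a, b, c, d)
        for a, b, c, d in zip(s['prev_lat'], s['prev_lon'], s['lat'], s['lon'])
    ]
    return s[s['moved_m'] > threshold_m]
```

Calling `moved_between_samples(df)` on the collected samples returns one row per jump, so a bike that stays listed while being ridden shows up as a run of consecutive rows.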

Any advice appreciated! I've tried getting in touch with HOPR but haven't received a reply from their support email address.

@barbeau
Member

barbeau commented Dec 3, 2019

@mjarrett Thanks for flagging this. Yikes - bikes in motion definitely shouldn’t be in the feed, especially due to privacy reasons.

https://github.com/NABSA/gbfs/blob/master/gbfs.md#free_bike_statusjson says:

Describes bikes that are not at a station and are not currently in the middle of an active ride.

Sounds like we need to be even more explicit in saying that this data isn’t allowed in feeds. I’m hoping that perhaps the bikes in motion are being rebalanced, and not actively in use by customers 🤞. This data still shouldn't be in the feed but is better than exposing customer ride data.

Has anyone else seen this type of bikes-in-motion data in feeds? Including bikes that are out-of-service being rebalanced?

@mjarrett
Contributor Author

mjarrett commented Dec 3, 2019

Thanks @barbeau! I should add: I'm doubtful that these are re-balancings (or short trips happening between samples). Bikes never seem to disappear from the feed.

I've uploaded a ~4 hr sample from HOPR Orlando here if anyone wants to take a look: https://docs.google.com/spreadsheets/d/1hFXKmkrlA2uoS0WpLVmBEmOfjyLjtT74SMQs-onhh8Y/edit?usp=sharing
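
For anyone checking the "never disappear" observation on their own samples, here is a rough sketch, assuming the same long-format dataframe (indexed by sample time, with a bike_id column). `presence_summary` is a hypothetical helper name. It counts how many snapshots each bike appears in and marks bikes that were listed in every one.

```python
import pandas as pd


def presence_summary(samples):
    """Summarise, per bike, how often it was listed across the sampled snapshots.

    `samples` is assumed to be indexed by sample time with a bike_id column.
    """
    s = samples.rename_axis('time').reset_index()
    # Rows are snapshot times, columns are bike ids, True where the bike was listed
    present = pd.crosstab(s['time'], s['bike_id']) > 0
    return pd.DataFrame({
        'snapshots_present': present.sum(),
        'always_present': present.all(),
    })
```

In a feed that follows the spec, a bike that is ridden during the sampling window should drop out of at least one snapshot, so `always_present` would be False for it. A long window where every bike is always present, combined with changing coordinates, points to the behaviour described in this issue.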

@barbeau
Member

barbeau commented Dec 3, 2019

@mjarrett Thanks, good to know. Would you mind taking a quick look at the Tampa HOPR feed to see if the same issue occurs?

https://gbfs.hopr.city/api/gbfs/8/free_bike_status

I'm more connected in Tampa than Orlando or Vancouver, so I may be able to get some traction here.

@garteli

garteli commented Dec 4, 2019

Issue resolved

@barbeau
Member

barbeau commented Dec 4, 2019

@garteli Could you elaborate a bit? Do you officially represent HOPR?

@mjarrett
Contributor Author

mjarrett commented Dec 4, 2019

I've received an email from HOPR also saying that the issue is resolved. A quick look at the UBC feed suggests a fix has been pushed out in the last hour or so (bikes are now disappearing and reappearing in the feed). I'll take a more thorough look this evening and check with the other systems.

@mjarrett
Contributor Author

mjarrett commented Dec 5, 2019

I've checked UBC, Orlando and Tampa, and to me it looks like free bike feeds are behaving as per the spec. As far as I'm concerned this has been resolved.
