Downloads historical data for Citi Bike stations and includes a set of scripts that generate various reports. Write-ups:
- http://afeld.me/nerdery/1515624
- http://www.nerve.com/infographics/most-popular-least-popular-citibike-stations-new-york-city
As noted on their data page, Citi Bike publishes a feed of their station information. This data is delivered in a popular format called JSON, and includes the intersection and coordinates of each of their 300+ stations, as well as the number of available bikes and the station capacity. A friend of mine, Abe Stanway, set up a system to request and store that data. It has been running for months, and he made that historical data public.
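As a rough illustration of what handling such a feed looks like, here is a minimal Ruby sketch that parses a station payload into structs. The payload is made up for the example, and the key names (`stations`, `label`, `available_bikes`, `capacity`, and so on) are assumptions rather than the feed's actual schema; this is also not necessarily how `scrape.rb` does it.

```ruby
require 'json'

# Made-up payload for illustration only. The key names are assumptions,
# not the real feed's schema.
SAMPLE_PAYLOAD = <<~PAYLOAD
  {"stations": [
    {"id": 1, "label": "Example St & Sample Ave",
     "latitude": 40.7, "longitude": -74.0,
     "available_bikes": 3, "capacity": 30},
    {"id": 2, "label": "Placeholder Pl & Demo Dr",
     "latitude": 40.71, "longitude": -73.99,
     "available_bikes": 0, "capacity": 25}
  ]}
PAYLOAD

# Hypothetical in-memory representation of one station reading.
Station = Struct.new(:id, :label, :latitude, :longitude, :available_bikes, :capacity)

# Parse the JSON text into an array of Station structs.
# `fetch` raises KeyError if an expected key is missing, which surfaces
# schema mismatches early instead of silently storing nils.
def parse_stations(json_text)
  JSON.parse(json_text).fetch('stations').map do |s|
    Station.new(
      s.fetch('id'),
      s.fetch('label'),
      s.fetch('latitude'),
      s.fetch('longitude'),
      s.fetch('available_bikes'),
      s.fetch('capacity')
    )
  end
end

stations = parse_stations(SAMPLE_PAYLOAD)
stations.each do |st|
  puts "#{st.label}: #{st.available_bikes}/#{st.capacity} bikes"
end
```

To adapt this to the real feed, swap in the keys it actually returns.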
After my scraping (a.k.a. data collector) script (a.k.a. code) retrieves the list of stations, it retrieves and stores the historical data for the past week (over three million records) in a local database. I then run another script to aggregate the data. Each station has one data point per minute saying how many bikes it has available, so the query is essentially asking "for each station, how much of the time were there fewer than two bikes available, and how much of the time were there fewer than two docks open?" The results get exported into a CSV, which is available on Google Docs.
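The aggregation described above can be sketched in plain Ruby. This is an illustration under stated assumptions, not the actual report code: the `Reading` struct, its field names, and the `availability_report` helper are hypothetical, and the real scripts query the SQLite database rather than an in-memory array.

```ruby
# Hypothetical record shape; the real schema in data.db may differ.
Reading = Struct.new(:station_id, :available_bikes, :available_docks)

# For each station, compute the fraction of readings with fewer than
# `threshold` bikes available and the fraction with fewer than
# `threshold` docks open. With evenly spaced readings, these fractions
# approximate "how much of the time" each condition held.
def availability_report(readings, threshold: 2)
  readings.group_by(&:station_id).map do |station_id, rows|
    total = rows.size.to_f
    {
      station_id: station_id,
      low_bikes: rows.count { |r| r.available_bikes < threshold } / total,
      low_docks: rows.count { |r| r.available_docks < threshold } / total
    }
  end
end

# Made-up readings for two stations.
readings = [
  Reading.new(1, 0, 30),
  Reading.new(1, 1, 29),
  Reading.new(1, 5, 25),
  Reading.new(1, 10, 20),
  Reading.new(2, 20, 1),
  Reading.new(2, 18, 3)
]

report = availability_report(readings)

# Emit CSV-style rows: station_id,low_bikes,low_docks
report.each do |row|
  puts format('%d,%.2f,%.2f', row[:station_id], row[:low_bikes], row[:low_docks])
end
```

Counting readings rather than measuring durations only works because the readings are evenly spaced; gaps in the collected data would skew the fractions.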
First, run the initial setup:
bundle
bundle exec ruby scrape.rb
The data will then be loaded into a SQLite3 database, data.db. You can explore it with sqlite3 data.db. To run a particular report:
bundle exec ruby reports/REPORT_NAME.rb
For example (8/22/13-8/29/13):
See all of them here.
- http://citibikenyc.com/system-data
- https://github.com/noneck/CitiBike-OpenData-Law/wiki/CitiBike-NYC-Tools-and-Apps
- https://github.com/edgar/citibikenyc
- http://appservices.citibikenyc.com/data2/stations.php
- http://data.citibik.es/
Special thanks to HowAboutWe for sponsoring a refresh of the work.