So I attended Transportation Camp in Berkeley last weekend. Frank Proulx shared his collection of bike counts, which I have since posted: a bunch from San Francisco (including a shapefile of bike counter locations), a few from Marin, and one from Oakland. It has been a minute since I had a real opportunity to play with data, so I indulged myself and dove into the bike counts from 66th and Telegraph. These are hourly counts spanning one year, from August 7, 2012 to August 7, 2013.
I first tried plotting the hourly count frequencies as a box plot:
This shows much heavier southbound traffic in the evening, from Berkeley toward Oakland. We also see a suspicious number and consistency of outliers:
But I didn’t think weekends could entirely explain that pattern. Sure enough, when you look at the hourly averages per month, you see there is something wrong with August.
It is not credible that the bike count peaks at 11 pm.
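For anyone who wants to run the same sanity check without plotting, here is a minimal sketch of the idea: average the counts by hour within each month, then look at which hour comes out on top. This is not my posted R code; it is a standard-library Python stand-in, and the function names and the toy data (including the six-hour clock offset in August) are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def hourly_means_by_month(records):
    """Return {(year, month): {hour: mean count}} from (timestamp, count) pairs."""
    sums = defaultdict(lambda: defaultdict(lambda: [0, 0]))  # hour -> [total, n]
    for ts, count in records:
        cell = sums[(ts.year, ts.month)][ts.hour]
        cell[0] += count
        cell[1] += 1
    return {
        month: {hour: total / n for hour, (total, n) in hours.items()}
        for month, hours in sums.items()
    }


def peak_hour_by_month(records):
    """Return {(year, month): hour of day with the highest mean count}."""
    means = hourly_means_by_month(records)
    return {month: max(hours, key=hours.get) for month, hours in means.items()}


def toy_shifted_records():
    """Made-up hourly counts that truly peak at 17:00.

    In August the recorded clock is assumed (hypothetically) to run six hours
    ahead, so the same peak is logged at 23:00.
    """
    records = []
    ts = datetime(2012, 8, 7)
    while ts < datetime(2012, 10, 1):
        true_hour = (ts.hour - 6) % 24 if ts.month == 8 else ts.hour
        count = max(0, 40 - 5 * abs(true_hour - 17))
        records.append((ts, count))
        ts += timedelta(hours=1)
    return records


if __name__ == "__main__":
    for month, hour in sorted(peak_hour_by_month(toy_shifted_records()).items()):
        print(month, "peaks at hour", hour)
```

A month whose peak hour lands late at night while its neighbors peak during the evening commute is a good hint that the clock, not the cyclists, is off.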
Finally, I plotted bike count against time of day for each day of the year, so that we can see all of it at once.
I drew gray lines at each Monday to test Matt’s theory. Indeed, he’s right: you can see that bike traffic drops off on the weekend. Christmas also stands out, as does Thanksgiving. But what is immediately obvious is that the timestamps may not have been synced correctly with the counts early in the period, from August through mid-September.
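The weekend drop-off can also be checked numerically by comparing average daily totals on weekdays against weekends. Again, this is only a rough Python sketch with made-up numbers and a hypothetical function name, not the R code I posted.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def mean_daily_total_by_day_type(records):
    """Return the mean daily total for 'weekday' and 'weekend' days."""
    daily = defaultdict(int)
    for ts, count in records:
        daily[ts.date()] += count

    groups = defaultdict(list)
    for day, total in daily.items():
        # date.weekday() is 0 for Monday through 6 for Sunday.
        kind = "weekend" if day.weekday() >= 5 else "weekday"
        groups[kind].append(total)

    return {kind: sum(totals) / len(totals) for kind, totals in groups.items()}


def toy_weekly_records():
    """Two weeks of made-up counts: 10 bikes an hour on weekdays, 4 on weekends."""
    records = []
    ts = datetime(2012, 10, 1)
    for _ in range(14 * 24):
        per_hour = 4 if ts.weekday() >= 5 else 10
        records.append((ts, per_hour))
        ts += timedelta(hours=1)
    return records


if __name__ == "__main__":
    print(mean_daily_total_by_day_type(toy_weekly_records()))
```

With real data you would want to drop the suspect August to mid-September stretch first, since days with shifted timestamps can smear counts across the midnight boundary.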
I’ve posted my code (R, ggplot, plyr) on GitHub, along with the bike counts and more accessible versions of the plots. If you take some time to play with the data, let me know what you find!