An investigation into ways to rebalance Citi Bike stations throughout New York City.
As has been recently documented by the press, one of the major challenges that Citi Bike is facing is the rebalancing of their stations. As origins and destinations of Citi Bike trips are not necessarily symmetrical during the day, Citi Bike has been forced to constantly move bikes around the city, taking them from full stations and delivering them to empty ones. This problem is both financially expensive and frustrating for Citi Bike users: many people complain about either not finding bikes at their stations of origin or not finding empty spots when they arrive at their final destinations.
To study this problem we have created a series of visualizations which should serve as a starting point for further analysis.
First, we visualized the average activity for weekdays in October 2013. As the first image (Citi Bike Hourly Activity) shows, the activity hotspots remain fairly constant throughout the day, especially between 10am and midnight, with most of the activity centered around Union Square. In addition, both Grand Central and Penn Station become strong hotspots during peak hours. Of interest, though, is the sudden shift that occurs around 5am, when the activity hotspots switch from the East Village/Lower East Side area to Grand Central and Penn Station. This is probably because, during most of the night, the stations in the East Village/Lower East Side remain more active than those in other areas, whereas during most of the day, and especially during peak hours, they are not as active as the stations around Union Square, Grand Central or Penn Station.
The second image is the same visualization at a faster speed.
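The averaging behind an hourly activity view like this one can be sketched in a few lines of Python. This is only an illustration, not the code used to produce the images: the `(start_station, start_time, end_station, end_time)` trip layout, the function name `average_weekday_activity`, the choice to count both trip starts and trip ends as activity, and the sample trips are all assumptions.

```python
from collections import defaultdict
from datetime import datetime


def average_weekday_activity(trips):
    """Return {(station, hour): mean events per weekday}.

    Assumption: "activity" counts both trip starts and trip ends,
    and the mean is taken over the weekdays that appear in the data.
    """
    totals = defaultdict(int)
    weekdays_seen = set()
    for start_station, start_time, end_station, end_time in trips:
        for station, when in ((start_station, start_time), (end_station, end_time)):
            if when.weekday() < 5:  # Monday=0 ... Friday=4
                totals[(station, when.hour)] += 1
                weekdays_seen.add(when.date())
    n_days = len(weekdays_seen) or 1
    return {key: count / n_days for key, count in totals.items()}


# Made-up sample trips (not real Citi Bike records).
sample_trips = [
    ("Station A", datetime(2013, 10, 1, 8, 5), "Station B", datetime(2013, 10, 1, 8, 25)),
    ("Station A", datetime(2013, 10, 2, 8, 10), "Station C", datetime(2013, 10, 2, 8, 30)),
    # Saturday trip, skipped by the weekday filter:
    ("Station A", datetime(2013, 10, 5, 8, 0), "Station C", datetime(2013, 10, 5, 8, 20)),
]
activity = average_weekday_activity(sample_trips)
```

Here `activity[("Station A", 8)]` is the average number of starts and ends at Station A during the 8am hour across the two weekdays in the sample.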
Next, we visualized overall patterns of origins and destinations. As image 3 (Citi Bike Hourly Balance) shows, the big hotspots of imbalance are mostly located around the East Village, the Lower East Side, Midtown East and West, and Union Square. However, these hotspots vary considerably throughout the day, and it is very difficult to detect smooth transitions other than during peak hours. Of note are a couple of big "jumps" between origins and destinations, one around 1-2pm in the East Village/Lower East Side and another around 5am in the same area.
The fourth image is the same visualization at a faster speed.
We also created a series of imbalance matrices (by hour of day) covering every station in the system. Using the same data as the animations above, the first matrix (Citi Bike Hourly Balance) clearly shows that the big imbalances happen, as expected, mainly between 6am and 10am (morning peak hour) and between 4pm and 8pm (evening peak hour). However, the imbalance at some stations starts and ends earlier, as at 8th Ave. & 31st Street, W 33rd Street & 7th Ave. and W 41st Street & 8th Ave. (more origins than destinations starting around 2pm). This matrix also shows that not all stations suffer from big imbalances during peak hours: stations like E 31st Street & 3rd Ave. or E 32nd Street & Park Ave. barely have any. You can download a high-res version of this matrix here.
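A station-by-hour imbalance matrix of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's own code: the trip layout `(start_station, start_hour, end_station, end_hour)`, the name `hourly_imbalance`, the sign convention (positive means more origins than destinations) and the sample trips are all hypothetical.

```python
from collections import defaultdict


def hourly_imbalance(trips):
    """Return {station: list of 24 values}, each value = origins - destinations.

    Assumed sign convention: positive means more bikes left than arrived.
    """
    matrix = defaultdict(lambda: [0] * 24)
    for start_station, start_hour, end_station, end_hour in trips:
        matrix[start_station][start_hour] += 1  # a bike leaves the origin
        matrix[end_station][end_hour] -= 1      # a bike arrives at the destination
    return dict(matrix)


# Made-up sample trips (not real Citi Bike records).
sample_hourly_trips = [
    ("Station A", 8, "Station B", 8),
    ("Station A", 8, "Station B", 9),
    ("Station B", 17, "Station A", 17),
]
imbalance = hourly_imbalance(sample_hourly_trips)
```

Each row of `imbalance` corresponds to one station and each of its 24 entries to one hour of the day. Because every trip adds one origin and one destination, the entries of the whole matrix sum to zero.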
Furthermore, as not all stations have the same level of activity, we produced two more matrices, both showing station imbalance, but this time relative to the overall hourly activity of each station. The first one (Imbalance matrix normalized by hourly activity) shows the imbalance as a percentage of the activity for that hour. This explains the large imbalances that appear late at night, when there are fewer trips and a higher chance that all of them are origins or all of them are destinations. However, it is still interesting to see that, as a percentage of overall activity, imbalances are higher during the morning peak hour than during the evening one.
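To make the normalization concrete, here is a small sketch of imbalance expressed as a percentage of hourly activity. The function name `normalized_imbalance`, the definition of activity as origins plus destinations, and the sign convention are assumptions made for illustration.

```python
def normalized_imbalance(origins, destinations):
    """Imbalance as a percentage of activity for one station-hour.

    Assumptions: activity = origins + destinations, and a positive
    result means more origins than destinations.
    Returns None when there were no trips at all.
    """
    activity = origins + destinations
    if activity == 0:
        return None
    return 100.0 * (origins - destinations) / activity


late_night = normalized_imbalance(2, 0)    # two trips, both origins
peak_hour = normalized_imbalance(60, 40)   # busy hour, mild surplus of origins
```

With these made-up counts, the quiet hour comes out at 100% while the busy hour comes out at 20%, which is why sparsely used late-night hours can look extreme in a normalized view.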
The second matrix (Activity and imbalance matrix) shows the imbalance as color and the overall activity as brightness, so we can see that in the hours between the peaks there is still a lot of activity, but it is mostly well balanced. We can also see that late at night, imbalanced as it may be, there is very little activity. Finally, some outlier stations combine a lot of activity with a large imbalance: in the morning, 8th Avenue and 31st Street, 17th Street and Broadway, Lafayette and 8th Street, and Pershing Square (north); and in the evening, 8th Avenue and 31st Street, 41st Street and 8th Avenue, and again Pershing Square (north). You can download both of these matrices at high res here and here.
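One possible way to encode imbalance as color and activity as brightness is sketched below using Python's standard `colorsys` module. The palette (red for more origins, blue for more destinations), the use of saturation for the size of the imbalance, and the name `cell_color` are assumptions; the actual matrix may use a different scheme.

```python
import colorsys


def cell_color(origins, destinations, max_activity):
    """Map one station-hour cell to an (r, g, b) tuple with components in 0..1.

    Assumed encoding: hue = direction of imbalance, saturation = share of
    activity that is imbalanced, brightness = activity relative to the maximum.
    """
    activity = origins + destinations
    if activity == 0 or max_activity == 0:
        return (0.0, 0.0, 0.0)  # no trips: black cell
    share = (origins - destinations) / activity   # ranges from -1.0 to 1.0
    hue = 0.0 if share >= 0 else 2.0 / 3.0         # red vs. blue (assumed palette)
    saturation = abs(share)
    brightness = min(activity / max_activity, 1.0)
    return colorsys.hsv_to_rgb(hue, saturation, brightness)


busy_balanced = cell_color(50, 50, 100)   # bright, colorless
busy_origins = cell_color(100, 0, 100)    # bright, fully saturated red
```

Under this encoding, a busy but balanced hour appears bright and colorless, while a quiet hour stays dark even if all of its trips go in one direction.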
Finally, we created hotspot maps for both the AM and the PM peak hours. As the maps below show, Citi Bike activity closely matches what we would expect to see in New York: the AM peak hour map shows people leaving residential neighborhoods (Lower East Side, East Village, Chelsea and Hell's Kitchen) and arriving in Midtown East and the Financial District, and the PM peak hour map shows the reverse. Of note, however, is that these two maps are not completely symmetrical, meaning that certain trips that happen in the morning do not have a counterpart in the evening, and vice versa. Also, some stations inside imbalance hotspots do not themselves show a large imbalance. These stations have been outlined on the maps and should be studied further. You can view high-res versions of these maps here: AM and PM.
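As a starting point for studying the asymmetry between the two peaks, one could add each station's AM and PM net imbalance: if the evening were an exact mirror of the morning, the sum would be zero. The sketch below assumes per-station net imbalances (origins minus destinations) are already available as dictionaries; the name `peak_asymmetry` and the sample numbers are hypothetical.

```python
def peak_asymmetry(am_imbalance, pm_imbalance):
    """Return {station: AM net imbalance + PM net imbalance}.

    A value of 0 means the PM peak exactly reverses the AM peak;
    any other value is the part of the flow that is not mirrored.
    """
    stations = set(am_imbalance) | set(pm_imbalance)
    return {
        station: am_imbalance.get(station, 0) + pm_imbalance.get(station, 0)
        for station in stations
    }


# Made-up net imbalances (origins - destinations), not real data.
am = {"Station A": 30, "Station B": -30}
pm = {"Station A": -30, "Station B": 10}
asymmetry = peak_asymmetry(am, pm)
```

In this made-up example Station A is perfectly mirrored, while Station B receives 20 more bikes in the morning than it sends out in the evening.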
Project director, data analysis and visualization: Juan Francisco Saldarriaga