I get asked a lot in my classes what it means to be data-driven. While there are plenty of examples, here’s a mini-example I hope will demonstrate something of what it means to be data-driven in government.
The Guardian recently had an interesting feature piece on the issue of walkability. A key feature for walkable communities is the presence of sidewalks. In the article, which notes a lack of sidewalks in Denver, an official is quoted as saying:
“At the current funding rate it would take 440 years to complete all the sidewalks in Denver.”
This quote was reported following the statement of this fact:
The city allocated $30m to improve sidewalks – although an estimated $1bn is needed.
Setting aside the questionable math that it would take over 4 centuries of effort by a major American city to put in place a rather simple piece of infrastructure, I think this is a great example of how data and good analysis can be useful to help find a workable solution to what seems a difficult problem, namely putting sidewalks in places where they serve a benefit to the community.
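To see why the math looks questionable, here is a quick back-of-the-envelope check. It is only a sketch: it assumes the $30m is spent every year and that the $1bn is a fixed total, neither of which the article states.

```python
# Figures quoted in the article.
TOTAL_NEEDED = 1_000_000_000  # estimated cost to complete the sidewalks
ALLOCATED = 30_000_000        # amount the city allocated
QUOTED_YEARS = 440            # the official's estimate

def years_to_complete(total, per_year):
    """Years needed if `per_year` dollars are spent every year (an assumption)."""
    return total / per_year

def implied_annual_spend(total, years):
    """Annual spend that would make the job take exactly `years` years."""
    return total / years

print(f"{years_to_complete(TOTAL_NEEDED, ALLOCATED):.1f} years")             # 33.3 years
print(f"${implied_annual_spend(TOTAL_NEEDED, QUOTED_YEARS):,.0f} per year")  # $2,272,727 per year
```

Under those assumptions the work would take roughly 33 years, not 440. A 440-year timeline implies spending closer to $2.3m a year, so the official was presumably working from a different funding figure than the $30m.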
First of all, it’s important to recognize a walkable city doesn’t need to have a sidewalk on every street. While that would be nice, the key is having sidewalks where people need them to be safe and where they want them for errands they would otherwise do by driving. This means there are corridors where sidewalks should go in order to increase safety and encourage walking. The outcome is safe, secure streets that everyone feels confident using to reach the parts of the city they want to get to.
There are many ways to measure this. The things that readily come to mind are:
- Satisfaction surveys
- Economic activity
- Pedestrian counts
I’m sure there are many others, but one of the most important, I’d argue, is accidents involving pedestrians. If an area has had an accident, there’s likely some reason the pedestrian was there and a reason they were hit. Assuming pedestrians are safer on sidewalks than on the street, the location of accidents involving pedestrians where there aren’t sidewalks nearby may be an indicator of areas that should be a priority for sidewalk construction. This would allow the City of Denver to identify priority areas to look at for initial investment in sidewalk construction as a public safety measure.
Looking at the 3,553 traffic accidents involving 1 or more pedestrians from 1 January 2012 through 18 September 2018, 390 of these accidents occurred 75 ft or further from a sidewalk (5 weren’t coded properly in the data). I chose 75 ft since it’s roughly the width of an average roadway. You could change that number to find something more appropriate.
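The distance filter described above can be sketched in a few lines. This is not the tool used for the numbers in this post; it is a minimal illustration that assumes accident points and sidewalk segments are already in a planar coordinate system measured in feet, and the sample data is made up. A real analysis would more likely use a GIS package.

```python
import math

THRESHOLD_FT = 75.0  # roughly the width of an average roadway; adjust as needed

def distance_to_segment(p, a, b):
    """Shortest distance from point p to the line segment a-b (planar feet)."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # degenerate segment: a and b are the same point
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def far_from_sidewalk(accidents, sidewalks, threshold=THRESHOLD_FT):
    """Return accidents whose nearest sidewalk is `threshold` feet away or more."""
    return [
        acc for acc in accidents
        if min((distance_to_segment(acc, a, b) for a, b in sidewalks),
               default=math.inf) >= threshold
    ]

# Made-up data: one sidewalk along the x-axis and four accident locations.
sidewalks = [((0, 0), (1000, 0))]
accidents = [(500, 10), (500, 75), (1200, 0), (-30, 40)]
print(far_from_sidewalk(accidents, sidewalks))  # [(500, 75), (1200, 0)]
```

Changing `THRESHOLD_FT` is all it takes to test a different cutoff.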
While each of these is a tragedy, it might be easy to claim this is still too many areas to focus on. Thankfully, the City of Denver has identified “Pedestrian Routes.” These are routes in the city identified as pedestrian corridors via field investigation. There were 30 pedestrian-involved accidents within 75 feet of a Pedestrian Route, indicating these might be good candidates to get the first round of financing to improve sidewalks.
We could improve on this very quick analysis by layering on the location of key facilities where walking is prevalent (outdoor shopping malls, schools, parks, etc.) and begin to map out priority areas with more specificity. These are areas where people have a strong desire to walk and are at a higher risk while doing so.
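One simple way to combine these layers is a weighted score per candidate corridor. The sketch below is illustrative only: the `Candidate` fields, the weights, and the sample corridors are all hypothetical, and choosing the weights is exactly the kind of decision that should be debated openly.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    accidents: int             # pedestrian-involved accidents nearby
    facilities: int            # schools, parks, shops, etc. nearby
    on_pedestrian_route: bool  # lies along a designated Pedestrian Route

# Hypothetical weights; these are assumptions, not values from any city policy.
WEIGHTS = {"accidents": 3.0, "facilities": 1.0, "route_bonus": 2.0}

def priority(c):
    """Weighted sum of the layers for one candidate corridor."""
    score = WEIGHTS["accidents"] * c.accidents + WEIGHTS["facilities"] * c.facilities
    if c.on_pedestrian_route:
        score += WEIGHTS["route_bonus"]
    return score

def rank(candidates):
    """Candidates ordered from highest to lowest priority."""
    return sorted(candidates, key=priority, reverse=True)

# Made-up corridors for illustration.
candidates = [
    Candidate("Corridor C", accidents=1, facilities=1, on_pedestrian_route=False),
    Candidate("Corridor A", accidents=2, facilities=1, on_pedestrian_route=True),
    Candidate("Corridor B", accidents=0, facilities=5, on_pedestrian_route=False),
]
print([c.name for c in rank(candidates)])  # ['Corridor A', 'Corridor B', 'Corridor C']
```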
Another approach is to look at this from a different point of view. Since sidewalks are a network problem (it does no good to have a stretch of sidewalk that doesn’t connect to anything), prioritize routes connecting these areas together, particularly along commonly used routes (such as the Pedestrian Routes). Then, most importantly, take this to the people who live and work in these areas for their input. That is an essential data point. Once you’ve mapped out the paths and estimated how much could reasonably be constructed in a given year, show the evolution of the system for their comment and feedback in a way that is accessible, such as an interactive map showing the network evolution over time. Bonus points if it can take in user input while it displays, like SoundCloud comments on a song. This, again, could take in any number of factors to weight priority.
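The network idea can be sketched as a year-by-year plan that only builds segments touching the already-connected network. This is a simple greedy illustration, not an optimal planner; the segment names, costs (in $m), priorities, and budget are all made up, and in practice the priorities would reflect both the data and community input.

```python
def phased_plan(segments, connected, annual_budget, years):
    """Greedy plan: each year, repeatedly build the highest-priority affordable
    segment that touches the connected network, until the budget runs out."""
    connected = set(connected)
    remaining = list(segments)
    plan = []
    for _ in range(years):
        budget = annual_budget
        built = []
        while True:
            options = [
                s for s in remaining
                if s["cost"] <= budget and (s["a"] in connected or s["b"] in connected)
            ]
            if not options:
                break
            best = max(options, key=lambda s: s["priority"])
            remaining.remove(best)
            connected.update((best["a"], best["b"]))
            budget -= best["cost"]
            built.append(best["name"])
        plan.append(built)
    return plan

# Hypothetical segments between destinations; costs in $m.
segments = [
    {"name": "S1", "a": "transit", "b": "school", "cost": 2, "priority": 9},
    {"name": "S2", "a": "school", "b": "park", "cost": 2, "priority": 7},
    {"name": "S3", "a": "park", "b": "shops", "cost": 3, "priority": 8},
    {"name": "S4", "a": "transit", "b": "shops", "cost": 4, "priority": 5},
]
print(phased_plan(segments, connected={"transit"}, annual_budget=4, years=3))
# [['S1', 'S2'], ['S3'], ['S4']]
```

The per-year lists are the kind of output that could drive a map showing the network growing over time.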
At the end of this, the City of Denver would have a much better idea of where to spend the $30 million it can allocate toward the $1bn necessary, so that it delivers value to citizens right away, based on data that is both quantitative (the number of pedestrian-involved accidents, number of businesses, volume of traffic, etc.) and qualitative (interviews, public meetings, focus groups, etc.).
This also points out the problem of dirty data. Of these 3,553 accidents, 5 weren’t coded properly and fell onto Null Island. While this is thankfully few, it speaks to the importance of having accurate, reliable data in order to make data-driven decisions.
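Records that land on Null Island, the point at latitude 0 and longitude 0, are easy to catch before any analysis. The sketch below assumes each record is a dictionary with hypothetical `lat` and `lon` keys; the actual field names in the Denver data may differ, and the sample records are made up.

```python
def split_null_island(records, tol=1e-9):
    """Separate records with missing or (0, 0) coordinates from usable ones."""
    good, bad = [], []
    for r in records:
        lat, lon = r.get("lat"), r.get("lon")
        missing = lat is None or lon is None
        if missing or (abs(lat) < tol and abs(lon) < tol):
            bad.append(r)
        else:
            good.append(r)
    return good, bad

# Made-up records: two plausible Denver-area points, one at (0, 0), one missing.
records = [
    {"id": 1, "lat": 39.74, "lon": -104.99},
    {"id": 2, "lat": 0.0, "lon": 0.0},
    {"id": 3, "lat": 39.70, "lon": -105.02},
    {"id": 4, "lat": None, "lon": None},
]
good, bad = split_null_island(records)
print(len(good), "usable;", len(bad), "need review")  # 2 usable; 2 need review
```

Setting the bad records aside for review, and not dropping them silently, keeps the count of miscoded rows visible.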
This is what it means to be data-driven: using data to drive decisions. The alternative is a whack-a-mole approach of picking an area at random, or based on some bias, that could end up putting resources where they aren’t making the most impact.