I think, like most people, I often use the adjective “data-driven” without having really thought about what it means. I often use it in my data analytics classes as shorthand for how government can be led by data without stopping to discuss the concept or even explain the term.
A recent webcast with Carl Anderson changed that. He was talking about how to create a data-driven organization based on his book. While he drew heavily from his experience in private industry, I began thinking about what this would mean in the context of government and realized I’ve been doing a disservice to my students by not unpacking the term. To be fair, I don’t think anyone has really done this in the context of governing, particularly along the lines that Anderson advocates in his book. There is an opportunity to develop Anderson’s ideas in the context of government at all levels, and I hope to contribute to that conversation.
I’m still working through the book and my ideas, but I think the most critical insight was his point that just because an organization collects data and uses statistics doesn’t make it data driven. In the context of NYC, the Mayor’s Office of Operations has been publishing the Mayor’s Management Report (MMR) since 1977 (a troubled time for NYC). While it’s changed form over the years, it stands as the statistical abstract of the city, providing metrics by which to measure the delivery of city services to the residents of NYC.
The existence of something like the MMR doesn’t automatically mean NYC is data driven. As Anderson points out, collecting data is not what makes an organization data driven; what matters is the institutionalization of a data culture. The features of this culture are that it is open and shares information, primarily through self-service applications that empower everyone in the organization to work with data. This is possible through broad data literacy and an objective, inquisitive culture that emphasizes testing ideas with data rather than just acting on assumptions. Everyone has a role to play in working with and creating value from data, as well as generating ideas and contributing to the mission of an organization, summed up in the quote:
“Do you have data to back that up?” should be a question that no one is afraid to ask and everyone is prepared to answer
In the context of city agencies, this means consolidated or at least interoperable data systems that enable users at all levels to easily access data in a usable form for analysis. It also means leadership that advocates the use of data as a strategic asset in the mission of the organization and engages the broader workforce in bringing data to the work of agencies, not just a select few “whiz kids”, but broadly across all areas of the organization and the city as a whole. This requires employees at all levels to be familiar with the basic tools for working with data and the techniques of data analysis. This doesn’t mean everyone is a data scientist; it means that everyone feels comfortable opening up Excel or whatever data application they prefer to download, manipulate, and visualize data for insights and understanding.
One aspect of government open data that often gets forgotten is how it works to make data available not only to the public, but also to other government employees. Open data is a model that can be used to design data systems for business as usual, not some afterthought or add-on to an existing system. This helps drive the creation of a true data-driven culture, rather than one that publishes statistics but keeps operating on assumptions, beliefs, and, as Anderson puts it, the Highest Paid Person’s opinion (HiPPO).
This isn’t to say the intuition and experience of those who’ve worked in their field for decades aren’t important, but it’s important to take that specific and highly subjective information and place it in a relevant context. This larger context comes from the data.
At the risk of getting too metaphysical, the reason we collect data in the first place is because it describes a phenomenon of importance. We collect browser history because it describes the phenomenon of how individuals use the internet, which is helpful for optimizing websites and driving online sales. In NYC, we collect information on parking tickets because the data describe the habits and behaviors of NYC drivers (at least in terms of where and how they park). We collect data on trash collection to understand the phenomenon of trash generation by residents and plan for its orderly collection and disposal.
The data doesn’t tell the whole story in each of these cases, but it tells an important part of the story. Importantly, it tells a larger story than is possible from the experience of any one person or small group of people. This is why we use data: to correct for the biases and assumptions of our own subjective experiences. In a city like New York, with a population of over 8.4 million and growing, managing for the present and future requires a dynamic understanding of context. This can only really come from a culture that values the sophisticated and robust collection, storage, and analysis of data, so that decisions at all levels can be informed by clear, objective analysis of the facts at hand.
In January, I’ll be teaching a data analytics class for the NYC DCAS Management Academy and I look forward to workshopping some of these ideas with them as we define what it means to have a data-driven culture within the organizations they lead. I look forward to the work and hope to blog about the results.
In the meantime, here are the slides from a presentation similar to the one Carl Anderson gave for O’Reilly: