This post is part of a series of thoughts following the 2016 International Open Data Conference and Open Cities Summit I recently attended in Madrid, Spain. See my introductory post on the conference and context for this series of posts.
What is Government Open Data?
A lot of the talks at the 2016 International Open Data Conference and Open Cities Summit assumed a shared understanding of open data. I don’t know that they were particularly wrong, but I found the experience a good opportunity to define the term for clarity as I explore some other important things that came up during the conference. I’m sure 80-90% of the participants could’ve quoted the Open Knowledge International definition by heart:
Open data is data that can be freely used, shared and built-on by anyone, anywhere, for any purpose.
While some sessions touched on the broad sweep of that definition (all data everywhere), most focused on the government open data that is near and dear to the heart of many of us. I think it’s important to take a moment to describe government open data and the value it provides in open societies, particularly to help inform later posts I hope to publish on open data, open government, and, in particular, what these things are not (big, smart, or inherently self-justifying).
1. Government Open Data Describes the Outcome of Government Actions
Our resident open data expert and blogger in New York City, Ben Wellington, has recently earned a lot of press analyzing parking ticket data from the New York Police Department. He found two spots that were improperly marked and were netting the city over $55,000 in yearly income. He also found a pattern of ticketing drivers who were lawfully parked.
In each of these cases, the laws concerning parking are openly available and citizens rightly expected that the government (in this case the NYPD) was properly following those laws, but the proof was in the data. Neither of these was malicious. The agencies involved had no incentive to do this analysis and dutifully responded by fixing the problems (including painting the spots to clearly indicate they weren’t legal parking spots and instituting a revised training policy). The data was the proof of action (appropriate or not) and opening it up allowed others to help keep the agencies accountable.
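For readers curious what this kind of analysis can look like in practice, here is a minimal sketch in Python. It is not the method used in the analysis described above; the records, field names, and dollar amounts are all made up for illustration. The idea is simply to total fines by location and rank the results so that unusually lucrative spots stand out for a closer look.

```python
from collections import defaultdict

# Hypothetical ticket records: the field names and values are invented
# for illustration and do not come from any real dataset.
tickets = [
    {"location": "Spot A", "fine": 115},
    {"location": "Spot A", "fine": 115},
    {"location": "Spot B", "fine": 65},
    {"location": "Spot A", "fine": 115},
    {"location": "Spot C", "fine": 65},
]


def revenue_by_location(records):
    """Sum fines per location and return (location, total) pairs, highest first."""
    totals = defaultdict(int)
    for record in records:
        totals[record["location"]] += record["fine"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranked = revenue_by_location(tickets)
for location, total in ranked:
    print(f"{location}: ${total}")
```

A ranking like this proves nothing on its own. A location at the top of the list may simply be a busy street, so the real work is checking the outliers against the posted rules.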
To summarize, laws describe what should happen in a society while data shows what did happen in the course of enforcing those laws. As a society, we openly publish laws to clearly communicate expectations. We should naturally want (demand?) government openly publish data to see how that expectation has been fulfilled and ensure their part of the social contract has been fulfilled.
2. Government Open Data is Society’s Official Record
Were I ever to be in a financial position to purchase property in New York City, one of my first stops would be to the NYC Open Data portal to look over data from the Departments of Finance and Buildings on ownership and violations, respectively, to help inform my decision about the purchase. As in many other cases, government (in this case the City of New York) has the official record of ownership. No matter what handshake deals or other transactions may have occurred in the crazy world of NYC real estate, if it wasn’t registered with the relevant NYC agency, it didn’t happen.
To summarize, the records of government are the foundation of our social, political, and economic reality. They should be freely available and easily accessible to all who participate in the reality these records create, in a way that doesn’t privilege the access of one group (property developers, real estate agents, etc.) over another (citizens, housing rights advocates, etc.), allowing each of us to help shape the reality in which we live.
3. Government Open Data Describes Critical Services
Having never been to Madrid before, I was dependent on Google Maps and other applications to help me plan my trips from the airport and around the city. As a non-Spanish speaker, it was important that this information be provided in English. All of the applications I used were powered by open data from the local transit authority (which gave a great presentation on its open data at IODC, by the way). The idea of not having this data readily available is inconceivable in the age we live in.
This is true for data on trash collection, snow plowing, school openings and closings, and a host of other services government provides. Were this data not available, I would have virtually no idea what services were available to me as a visitor or resident of Madrid. It would be as if the city, as a government tending to the needs of its citizens, didn’t exist.
Additionally, the release of data central to the social, political, or economic drama of our lives in an open, machine-readable way ensures this data can be combined with other important sources of information. Government often isn’t particularly good at creating integrated, multi-faceted presentations of important data, such as multi-modal travel planning that includes bike shares and car shares in addition to public transit options, or aggregating political speeches, campaign literature, and news articles with voter registration information and polling locations. They’re (pretty) good at collecting and disseminating the data that power those applications and, as a taxpayer, I’d prefer that was their focus of time and money, allowing others to do the work of making this useful to users.
To summarize, as a provider of critical services to citizens, visitors, and other stakeholders, government has a responsibility to make data on those services available in a way that not only allows those stakeholders to understand the provision of those services, but also allows intermediaries to integrate that information with other sources to increase value and expand the context for users, something government often isn’t well positioned or resourced to do.
4. Government Open Data is Produced at Taxpayer Expense
Another key element in understanding government open data is that, as a tax-financed enterprise, the work of collecting, storing, and disseminating data has already been paid for by the public and the public has a right to what they’ve paid for. While there are compelling arguments for not releasing some data collected by government that could violate the privacy of citizens or undermine security, the burden of explanation shouldn’t fall on the public to justify why something should be released, but on government to explain why something shouldn’t be.
Technical and logistical challenges are an insufficient reason not to provide this service. Collecting trash in a city of 8.4 million people is a challenge. Educating 1.1 million students presents a number of logistical issues. NYC government does this by finding ways to overcome these challenges and fulfill its responsibilities. Open data should be no different, whether it’s in New York City, Madrid, Sydney, Nairobi, or Buenos Aires.
In conclusion, government open data is a critical part of open societies. In addition to the compelling social and political interests, there are clear economic interests and issues of equity with respect to the release of data to the public. Government open data should ideally be provided to the public in ways that make the data easily accessible by as many stakeholders as possible without regard for their ultimate intent.
In the following post, I want to talk about the best means by which government can provide open data to the public to meet the elements discussed in this post. In the meantime, please feel free to share your thoughts in the comments section below or find me on Twitter.