“Big data” in government is the topic of the moment across the world. Public sector datasets have evolved to include large volumes of very diverse sources of information. They raise issues of privacy, security and information sharing, but also of design and training. To make good use of big data, public managers have to ask the right questions, not only of the data but also of the new generation of “data scientists” on whom they depend. This post discusses five key points about how public managers need to prepare for working with big data.
Big data matters both to managers in government and to scholars because the public sector is both one of the main generators and a guardian of big data.
A few weeks ago, I attended the panel on “Big data and (big) business” organised by the University of Bath. The panellists came from very diverse backgrounds including information security, mathematics, business analytics, marketing, visualisation, emergency management, broadcasting and accounting. Discussions about big data ranged from national security and competitive advantage for businesses to the implications of big data for our everyday lives. For example, Prof Julie Barnett raised concerns about communicating risks to the public regarding healthcare data, using the case of the recent NHS leaflet on “Better information means better care”. Drawing from these panel discussions and recent advances in the area, five key points emerge.
Where do the data come from and what is “big” about them?
Government datasets have always been “big”. What is new with big data is the complexity (many different variables) and the diversity of the information sources involved. For example, data from citizen surveys, structured data from transactions such as claims and payments, and data from historical records can be combined with information coming from every device and system in a local area. Apart from storage and processing power, the greater ability of systems to “talk” to each other (interoperability) has made possible many of the major advances in big data. Another important factor is that mobile devices can access services to pinpoint their own location very precisely.
With many more variables and sources of information, the traditional data analysis process changes. We no longer collect data only to answer specific questions. Now we have large volumes of data available, which then creates the need to …
Ask the right questions
Big data does not create problems of sophisticated data analysis. Rather, the problem is one of asking the right questions to enable public managers to make decisions based on intelligent use of evidence. That was always true, even when working with the “small” data used in forecasting, planning and service design. But with big data, finding associations between variables can be a small part of the process. Whereas social science methods usually start with specific questions, when using big data in government we need to experiment more broadly to understand the scope and relevance of a big dataset for decision-making. Finding the right questions will lead to answers that require far more than descriptive or inferential statistics.
Who will produce the answers?
Answers that mayors and ministers need from big data won’t consist of pure statistics. So a key part of a public manager’s job now is to learn how to communicate with data experts to get them to interrogate data and present answers in ways that provide policy makers with what they need. Data experts or scientists find insight in big data by merging sources, cleaning datasets, transforming variables and applying combinations of analytical techniques such as network analysis or data mining algorithms.
In recent years in business, there has been a buzz about the emergence of a profession of data scientists (see this 2012 Harvard Business Review article). These new professionals must balance technical skills with serendipity and creativity. This is most obviously true for data scientists working in data-driven journalism initiatives like those developed by the Guardian or the Times. In government, data scientists are likely to be recruited from among policy analysts, technology advisors or developers.
What will the answers look like?
Answers from big data involve statistics as well as advanced visualisation techniques (the Oxford Internet Institute has some great examples). But they are more than high-level conclusions. Although many central government departments might be more interested in aggregation than segmentation, local councils will use local data to reach individuals and particular communities. Moreover, intelligent use of big data calls for flexibility, so that even when we do not know the precise questions to ask, we can develop ways to find answers from continuous flows of data that come from different sources.
When the private sector faced similar problems, business intelligence research made major advances with interactive dashboards. However, these tools usually need to be supported by appropriate data structures. And those cost money.
So, as we work toward developing the right tools, we have to ask whether public managers should expect to use big data to make better decisions in private or whether big data form part of the open government agenda.
Big data are not open data but can foster innovation
Not all “big datasets” can be made public. Sometimes that is for reasons of client confidentiality. Sometimes there are reasons of national security or crime prevention, or the secrecy appropriate to policing operations which are still being planned. But often there are clear benefits when big data analysis is combined with open data principles. This article in the Guardian explains the difference and connection between open and big data very clearly. Colleagues Anneke Zuiderwijk and Marijn Janssen from the Netherlands discuss open data policies in their recent Government Information Quarterly article. Despite the many issues they identify, we have seen how open data can help us make better use of big data, as citizens and organisations can get involved and contribute their own intelligence, analytics or visualisation tools.
So the overarching question for every public manager is: how can their authority or agency use its own big data to stimulate and sustain public value?