The State of Open Data - Sectors and Communities
The chapters in this section explore sixteen different sectors and communities where open data has been applied.
The earliest advocates turned to open data because they faced particular problems. They were not seeking data in general, but rather specific datasets to help them solve those problems. In the years that have followed, a broad movement on open data has secured access to data on thousands of different topics. How useful this data has been in solving problems or meeting social challenges is dependent on both the data and on the particular problems and challenges that were targeted. Open data is not a one-size-fits-all solution, but instead plays out in different ways in different settings. As the chapters in this section will illustrate, to understand the state of open data, we need to look at open data in context, exploring the particular sectors where it has evolved and the communities that have developed around it.
There are very few sectors where open data might not have a role. However, to provide a broad overview of open data developments, the focus chapters in this section were selected based on an analysis of the agenda and discussions at recent editions of the International Open Data Conference (see Introduction), as well as themes identified in the 2015 Sustainable Development Goals (SDGs)1 and the categories of high-value data identified in the G8 Open Data Charter2 and global measurement tools (see Chapter 22). We have sought to select sectors at varying stages of progress, ranging from government finances (Chapter 10), where budget and subsidy datasets have had a pivotal role in shaping early work on open data, through to telecommunications (Chapter 14), a sector largely overlooked to date as an area of focus for open data initiatives. Our coverage is by no means comprehensive, and, inevitably, there are different choices that could have been made on the scope of each sector. Water and air quality, for example, could arguably have been addressed as sectors in their own right, although, in this volume, they find their place as sub-themes within the essay on the environment (Chapter 7).
The key advantage of a sectoral approach in a review of open data is that it requires us to take a step back and to understand open data in context. Understanding and intervening in the struggles around land ownership data (Chapter 12), for example, requires an appreciation of the different systems related to land ownership and a recognition of the role that records and data play in securing land rights. Progress on opening up corporate ownership data (Chapter 3) can also be better understood in the context of the global financial crisis and the search for policy responses at that point in time when “shovel-ready” open data approaches were available to draw on. Sectoral engagement with open data is far from inevitable but instead relies on the right combination of advocacy, infrastructure, and backing at key opportunity points. These opportunities can evolve quickly from external events, as in the 2008 financial crisis, or from the alignment of different stakeholder interests over time, such as with agriculture (Chapter 2), where a case can be made for opening up new pre-competitive space and a sectoral shift from closed to open models of data production and use.
The histories and horizons of open data vary from sector to sector. We have worked with the authors of each chapter to identify key dates in the development of open data in their sectors. These timelines are published as part of the online companion to this book. Taking this long view helps us to understand the way in which open data ideas enter into an existing landscape of data systems, political attitudes, stakeholder relationships, and programmes of action. In the crime and justice sector, for example (Chapter 4), the history of open data might have started with interactive crime mapping in 2005, but new technological approaches have to contend with long-established and localised legacy ICT systems and the conservative ethos of many judicial institutions. The crime and justice chapter also draws important attention to the way open data work unfolds between different branches of government, encouraging us to consider government stakeholders beyond just the executive branch.
A sectoral approach also allows us to look beyond the “usual suspects” who self-identify with open data to locate other important stakeholders who have, to date, been on the periphery of the open data discourse. In the health chapter, for example (Chapter 11), the creators of an open source health management information system (HMIS) emerge as central players whose actions, in tandem with national-level policy activity, can contribute to improvements in the availability of aggregated open health data. Chapters on education (Chapter 6) and geospatial data (Chapter 9) also identify key stakeholder groups (the open education working group and open geospatial community, respectively) who have had relatively weak links to wider open data communities in spite of their relevant expertise and knowledge. A sectoral approach also reveals common influences across sectors. Eleven of the sixteen chapters in this section, for example, mention either the Open Data Charter3 or the Open Government Partnership4 as an influence on open data advances, and nine chapters draw on evidence from the Open Data Barometer5 to understand progress.
Finally, a sectoral lens can help us to assess open data maturity and explore how embedded open data has become across a sector. To comprehensively assess the state of open data in a particular sector might require looking at the proportion of data generated in that sector which is ultimately available as open data, or it might involve an audit of use cases, identifying how far open data approaches have been adopted in addressing key sectoral challenges. While the chapters that follow are indicative rather than exhaustive, they show very different states of open data adoption. For example, the chapter on development assistance and humanitarian action (Chapter 5) suggests that the idea of open by default has become reasonably embedded in the sector, allowing stakeholders to shift their focus to developing and embedding more mature data-use practices. However, the chapter authors also note the ongoing challenge of building a data, and open data, culture in the sector, particularly given complex relationships between international, national, and local stakeholders. In the extractives sector (Chapter 8), work on governance, looking at issues such as contracts, tax, and royalty payments, has progressively integrated open data over the last decade, resulting in increased data availability and use. Yet, at the same time, the wider sector has seen a vast growth in proprietary data collection by commercial firms using emerging technologies, meaning that while the absolute quantity of open data available may have grown, the relative proportion of open to closed data has likely declined. A similar issue appears to be at play in the transport sector (Chapter 15), where route-planning apps have been a poster-child of the open data movement, but where the authors report that only a fraction of the data used to drive these apps is actually provided as open data. Even when open data is available, it may only cover a limited portion of the transportation experience. 
If only a small group of stakeholders has access to superior but restricted-access application programming interfaces (APIs), the ideal conditions for innovation in developing solutions will not emerge.
One factor evident throughout the chapters in this section (and indeed throughout this volume) is that while open data has a technical foundation, progress relies upon policy, people, and collaboration. Open data tends to enter the discourse of a sector through the actions of one or more small groups that are able to enrol a wider group around them to develop and explore the application of open data. These are the open data communities that this section also attempts to bring into focus.
The original working title for this section of the book was “Open data communities” rather than “Open data sectors and communities”. Yet, it became clear that for most chapters, there was an open question as to the extent to which a coherent and recognisable community could be said to exist around the chapter subject. For most, the idea of community invokes a group with some degree of shared values, attitudes, and goals, and whose members have some degree of interaction. Although there are many successful “thematic” open data communities, in some sectors there are many different groups, each with distinct agendas, and with varying levels of interconnection, whilst in other sectors the sense of a distinct open data community is much more nascent.
By looking at the extent of community networking within, and across, sectors, we bring into focus a number of the drivers for community cohesion, including levels of collaboration, learning, and progress on securing impact from open data. For example, in the broad accountability and anti-corruption field (Chapter 1), we find strong connections have been made between distinct communities of investigative journalists, open contracting and procurement specialists, and individuals acting under a “follow the money” banner. While often meeting separately, these groups also benefit from a high degree of fluidity and the exchange of ideas through events, multilateral meetings, and field-building publications. By contrast, although the crime and justice chapter (Chapter 4) identifies many individual projects looking at open data, there is little evidence of a sustained global or regional community pushing open data forward in this sector, and instead the landscape is made up of ad-hoc initiatives by governments or other stakeholders without the evidence of substantial community development. Using a community lens can highlight how differing sectoral cultures, and different levels of investment in community coordination, impact on the degree to which action has been mobilised to address open data.
A community lens also brings to the fore questions about the people involved in steering and shaping open data activity within particular domains, inviting an exploration of whether communities are diverse or whether they are globally representative. Ultimately, all of the chapters serve to illustrate that community building requires intentional effort and sustained investments of time, resources, and energy. For example, substantial efforts have gone into outreach and to providing travel support to enable participants from lower-income countries to participate in open data events, such as the International Open Data Conference,6 the GODAN Summit focusing on agriculture,7 Open Contracting global events,8 or meetings of the International Aid Transparency Initiative’s Technical Advisory Group.9 We should also note that global community building often requires bridging language barriers, and the flow of learning and conversation between different linguistic open data communities is worthy of further investigation.
Lastly, a community lens can be used to examine the position of an open data community within a wider sector as a whole. Are open data specialists simply talking to each other or are they reaching out to shape wider sectoral work? The picture is varied, although, in almost all cases, there are opportunities to improve the integration of open data practitioners into existing sectoral communities of practice and to leverage open data to broaden those communities. A level of cultural adaptation is generally required as open data communities interface with existing communities of practice. For example, the national statistics chapter (Chapter 13) calls for improved connections between open data and national statistics offices (NSOs), recognising the need to focus on building mutual respect and understanding between statistics professionals and open data communities. The urban development chapter (Chapter 16) also illustrates the challenges of inserting an open data community into the mainstream of the sector, where, although open data has become a central topic in community discussions of resilient cities, within the commercial-led smart-cities marketplace, open data is treated as a minor tool rather than a transformative agenda.
The chapters in this section identify hundreds of different organisations engaging with the open data agenda and many different projects opening data and putting it to use. However, they also reveal that increasing open data adoption and impact across a sector is by no means inevitable. The process of making data open and ensuring that datasets can serve a much wider range of use cases than those for which they were originally created has resulted in a myriad of issues around data quality and interoperability that are only now starting to be addressed. Many chapters also point to major bottlenecks caused by endemic capacity gaps around data analysis and use, as well as the limited deployment of strategic actions to connect data analysis with policy change. In many sectors, the full potential of open data is being missed, in part, due to a shortage of sustained specialist work on technical and policy challenges and difficulty in finding non-profit or for-profit models that can bring the extended focus needed to move beyond pilots into long-term projects and programmes.
What is clear, however, is that although, in 2009, open data was promoted as a general reform, today, it is primarily seen as an asset to be used in meeting specific goals (including the SDGs). This raises many new questions for the open data movement as a whole, including whether it can be said that there is even a single overarching open data movement or whether we have many divergent sectoral movements and communities. How can open data be used to go deeper into sectoral problem solving while still maintaining cross-cutting learning and connections between communities? The chapters that follow are intended to address these questions and more.
2: Cabinet Office. (2013). G8 Open Data Charter and Technical Annex. GOV.UK, 18 June. https://www.gov.uk/government/publications/open-data-charter/g8-open-data-charter-and-technical-annex