13. National Statistics

Open Data and National Statistics

Key Points

  • For national statistical offices (NSOs) and their partner agencies, open data provides a route to engage with a larger world of data-driven innovation and to demonstrate their relevance and value to the public.
  • Progress on making official statistics openly available has been slow, with quick wins missed and long-term investment lacking.
  • Greater engagement between open data and NSO communities is needed to drive cultural and practical changes, recognising the strengths that each brings to the data ecosystem in support of the Sustainable Development Goals.

How to cite this chapter

Swanson, E., Badiee, S., & Rudow, C. (2019) Open Data and National Statistics. In T. Davies, S. Walker, M. Rubinstein, & F. Perini (Eds.), The State of Open Data: Histories and Horizons. Cape Town and Ottawa: African Minds and International Development Research Centre.

Print version DOI: 10.5281/zenodo.2677837

Introduction

Data has the power to save lives, end poverty, protect the planet, and transform our world, but only if it is open and well used. This chapter is concerned with open, official statistics, which include some of the most important datasets that decision-makers need to create policies, design programmes, and monitor results. They are derived from data produced by governments as part of their official function. They provide a quantitative record of the country’s social, economic, and environmental condition.[1] Collected through censuses, surveys, and administrative records, official statistics are the product of national statistical systems, which are confederations of official agencies that in most countries are coordinated by a national statistical office (NSO).

Since they are produced by public bodies using public funds, official statistics should be considered public goods, capable of being used and reused for many purposes without diminishing their value to others, and available to be copied or reproduced by anyone. In economists’ terms, they are non-rivalrous and non-excludable. Making official statistics openly available is, therefore, economically efficient. Beyond satisfying economic theory, making official statistics openly available can stimulate innovative applications, encourage citizen engagement, and increase confidence in the statistical system as a whole.

Although their responsibilities differ from country to country, NSOs generally have the authority to set statistical standards, to design and implement large-scale data collection programmes, and to ensure the quality, reliability, and availability of official statistics. Through their links to other NSOs and to international statistical agencies, they contribute to, and benefit from, new techniques and common standards. Because of their centrality and the importance of statistics for setting policies and measuring outcomes, NSOs and national statistical systems should be at the forefront of the data revolution and the open data agenda. Where they lack explicit authority, they can, and should, lead by example. For NSOs and their partner agencies, open data is more than a dissemination strategy; embracing the principles of open data is an opportunity to engage with the larger world of data-driven innovation and to demonstrate their relevance to their own governments, the private sector, and the public at large.

There is an emerging, international consensus on the principles of open data, and much advice is available on how to make data open, but implementation of these principles has been difficult. Measurement of the availability of open data from official sources reveals slow progress at best. There are relatively low-cost actions that could make official statistics more open: providing data in machine-readable formats, making metadata available, and publishing open terms of use. However, producing larger and more complex datasets in response to the demands of the Sustainable Development Goals (SDGs) will require increasing the capacity of national statistical systems and securing additional resource commitments from governments to support robust, effective, independent, and open statistical systems.

National statistical offices, official statistics, and open data

As the coordinating body for a country’s national statistical system, NSOs are charged with identifying, collecting, processing, analysing, and disseminating official statistics on behalf of the government. NSOs are a part of government, but should be independent of partisan activities. Their independence is critical to their position as information brokers that need to build trust and remain free from influences that might bias their data or analyses. NSOs and the larger statistical system should, however, be responsive to the demands of policy-makers, who finance their budgets to meet their own, and the public’s, need for reliable information. These demands are not fixed. They grow and change as new challenges and opportunities present themselves.

National statistical systems are the repositories of two kinds of data: microdata, which are the unit records of censuses, surveys, and administrative datasets, and aggregate data or indicators. Microdata contains identifiable information about people, businesses, or other entities. Before this data can be made openly available, it must be anonymised or aggregated into public-use data and indicators. Access to the underlying microdata must be strictly controlled.
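
The aggregation step described above can be illustrated with a minimal sketch. It assumes hypothetical unit records and an assumed minimum cell size of 3; neither comes from this chapter, and real statistical disclosure control involves further safeguards (for example, suppressing additional cells so that hidden values cannot be recovered from totals).

```python
from collections import Counter

# Hypothetical unit records (microdata); field names and values are illustrative only.
records = (
    [{"region": "North", "sex": "F"}] * 3
    + [{"region": "North", "sex": "M"}] * 4
    + [{"region": "South", "sex": "F"}] * 3
    + [{"region": "South", "sex": "M"}] * 1
)

# Assumed disclosure threshold; an NSO would set this in its own policy.
MIN_CELL_SIZE = 3


def aggregate(records, keys, min_cell_size=MIN_CELL_SIZE):
    """Count records per combination of keys and suppress small cells.

    Cells with fewer than min_cell_size records are published as None,
    because very small counts could help identify individuals.
    """
    counts = Counter(tuple(record[key] for key in keys) for record in records)
    return {
        cell: (n if n >= min_cell_size else None)
        for cell, n in sorted(counts.items())
    }


public_table = aggregate(records, ["region", "sex"])
for cell, count in public_table.items():
    print(cell, "suppressed" if count is None else count)
```

Only the aggregated table would be published; the unit records themselves stay under controlled access.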

Guidance for NSOs is provided by the United Nations Fundamental principles of official statistics, a set of ten principles that set out the professional and scientific standards for NSOs.[2] The first principle, which arguably incorporates the remaining nine and embraces the core principle of open data, says that “official statistics that meet the test of practical utility are to be compiled and made available on an impartial basis by official statistical agencies to honour citizens’ entitlement to public information”. The sixth principle states that data on individuals “is to be strictly confidential and used exclusively for statistical purposes”. Weighing the public’s right to information against the possible privacy risks of certain microdata sets is a balancing act that all NSOs work to maintain.

As the data ecosystem expands, NSOs are expected to take a stronger coordinating role, encompassing new data sources, producers, and users, including both public and private actors. NSOs must also engage with a diverse set of stakeholders, including academic institutions, non-governmental organisations (NGOs), and bilateral and multilateral agencies in support of their research, development projects, and applications of open data. But many NSOs still lack the human, physical, and financial resources needed to perform even their traditional role. A report on the World Bank’s Statistical Capacity Indicators Database found that 39% of the 131 countries studied had a low statistical capacity. They lack a recent census, survey, complete civil registration and vital statistics system, or general statistical capacity.[3] The global community needs to be conscious of the varying capacities of NSOs, and create space for a variety of approaches based on technical capacity and country-level compatibility. There is no one-size-fits-all approach to building open data practices in NSOs around the world.

By lowering the transaction costs for disseminating data, open data can reduce the operational costs for NSOs, who will have an increasing role in coordinating and managing the data ecosystem. There are greater economic benefits to governments through the more efficient management of programmes, and to individuals and businesses through the use of data to create new products and services. In one of the earliest studies of the benefits of open data, Rufus Pollock estimated the welfare gains from opening data previously sold by the British government at GBP 1.6 to 6 billion.[4] A study of the European Union’s open data portal predicted that EUR 1.7 billion would be saved through public-sector efficiency gains from open data in the year 2020 alone.[5] Research on the opening of Landsat satellite data in the United States (US) points to similar financial benefits. Annual savings from open Landsat data for NGOs, the Federal Government, and the private sector are estimated at between USD 350 and 436 million.[6]

The degree of engagement with open data among NSOs varies widely. Some are leading, such as Mexico, Jamaica, and the Philippines. They are embracing open data by establishing open data portals, reviewing access to information laws and policies, and including open data in national budgeting and planning processes. Others have been slower to implement even the simplest open data policies.

International progress on open data

At the international level, there have been important steps taken toward open data. New standards, principles, and operating guidelines have been created; Open Knowledge International[7] and the Open Data Charter[8] have established a working definition of open data. The Cape Town global action plan for sustainable development data,[9] adopted at the first United Nations World Data Forum in 2017, includes open data among its key actions for innovation and the modernisation of national statistical systems. Open data was subsequently addressed at the 48th and 49th annual meetings of the United Nations Statistical Commission (UNSC), a meeting of chief statisticians from UN member states and the highest decision-making body on statistical activities. The UNSC discussions on open data from the 49th meeting, held in March 2018, showed that countries are starting to treat open data as a priority and trying to integrate it into their national strategies and budgeting processes, as well as seeking international technical and financial assistance. Further, discussions from the 49th UNSC resulted in the designation of a subgroup to recommend changes to incorporate open data concepts in the Fundamental principles of official statistics.

Beyond international advocacy for open data, practical steps to implement open data have been taken. A network of regional open data hubs has been developed by Open Data for Development (OD4D).[10] PARIS21 now includes open data in its recommendations on National Strategies for the Development of Statistics (NSDS)[11] and in its training programmes. The World Bank’s Open Data Readiness Assessment (ODRA)[12] helps countries identify gaps and opportunities for implementing open data. And NSOs are increasingly involved in international open data events, such as the International Open Data Conference (IODC). These are important advances that empower local actors to choose their own paths towards statistical development and learn from a growing network of open data actors.

The national and international policy developments are encouraging, but results must be measured by their impact on the availability and openness of official statistics. There is a consensus among projects measuring open data implementation that many countries have not fully adopted open data policies and practices and that implementation has been slow.[13] To accelerate progress, additional financial resources are needed to build capacity and modernise national statistical systems in low- and middle-income countries. Further, the value of data needs to be demonstrated to strengthen popular and political support for open data.

Key issues and challenges

Current state of open data for national statistics?

There are several quantitative indexes that measure the openness of government data. Among these are the Open Data Inventory (ODIN), the Open Data Barometer (ODB), and the Global Open Data Index (GODI). ODIN is designed to measure the openness of official statistics produced by national statistical systems and is the most appropriate index for this chapter. The ODB and GODI both include “national statistics” among the types of public information they evaluate, but they are more concerned with non-statistical datasets, such as government budgets, voting records, transportation timetables, weather information, and maps.[14] Despite the differences in the data incorporated in their assessments, all these indexes employ a similar definition of open data, based on the principles of the Open Data Charter[15] and the Open Definition.[16] The indexes also point to similar conclusions: there is a large gap between the success of some countries regarding open data and the failure of others. Many of the datasets that users seek are unavailable or not provided on open terms, and there has been little improvement in open data scores over the last four years.

The ODIN scores highlight the large differences in open access to official statistics between countries. The highest scoring country in the ODIN 2017 report, Denmark, scored 80 (out of 100), while the lowest scoring country, Chad, scored 3. The median score was 37. Similar disparities between high and low-to-middle income countries’ open data scores were found in the ODB.[17] Scores are typically correlated with a country’s GDP, but there are examples of relatively poor countries that provide open data on a large set of official statistics. In ODIN 2017, Rwanda, for example, had a higher score for data openness than one-third of the OECD countries. A few countries have made significant improvements. In 2017, Bulgaria’s ODIN score increased by 14 points, placing it in the top ten globally, because the NSO made more data available in machine-readable and non-proprietary formats, and revised its terms of use to make them more open.

Despite widespread support for open data, the open data indexes have not, on average, registered a significant improvement in the last few years. Figure 1 shows the average open data scores from the ODB, ODIN, and GODI indexes. To make these indexes more comparable, only countries that had a score in every year of the index’s study period were used. Small changes in methodology limit comparability over time,[18] but the general pattern is clear: average scores show no upward trend; if anything, progress toward open data appears to be levelling off.

Figure 1: Measuring open data index scores over time. Source: Data taken from the ODB, ODIN, and GODI indexes
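
As a rough illustration of the comparability rule used for Figure 1 (averaging only over countries with a score in every year of an index’s study period), the following sketch uses invented scores for three placeholder countries. The values and the function name are assumptions made for this example, not data from ODB, ODIN, or GODI.

```python
from statistics import mean

# Hypothetical scores by country and year; these are not real index values.
scores = {
    "Country A": {2015: 40, 2016: 44, 2017: 46},
    "Country B": {2015: 20, 2016: 22, 2017: 21},
    "Country C": {2016: 70, 2017: 75},  # no 2015 score, so it is excluded
}


def balanced_panel_means(scores):
    """Average scores per year over countries that were scored in every year."""
    years = sorted({year for by_year in scores.values() for year in by_year})
    complete = [
        country
        for country, by_year in scores.items()
        if all(year in by_year for year in years)
    ]
    return {year: mean(scores[country][year] for country in complete) for year in years}


yearly_means = balanced_panel_means(scores)
print(yearly_means)
```

Dropping Country C keeps the set of countries fixed from year to year, so a change in the average reflects changes in scores rather than changes in which countries were assessed.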

To have open data, you first need data. Without open data, it is difficult to demonstrate the value of data to policy-makers, and, without recognition of the value of data, progress toward complete and open data will remain slow. For many countries, this defines a nexus of problems: lack of focus on the demand side, lack of commitment, and lack of resources. There continues to be a mismatch in countries between data demand and supply. Like all service providers, NSOs must understand their clients. If members of government, businesses, and citizens cannot access the data they need, then they will go elsewhere or do without.[19] Beyond simply publishing data on their website or through a dedicated data portal, NSOs must engage with their clients, demonstrate the relevance and value of data, and provide tools and information that make the data more accessible. User surveys, feedback options, and monitoring web traffic are some of the methods that can be used to understand client needs.

The SDGs have increased the demands on NSOs as they require a comprehensive set of data from social, economic, and environmental sectors to measure progress toward the 2030 targets. This presents an opportunity for closing the gap between supply and demand since much of the data required for monitoring the SDGs depends upon the work of the national statistical system. But the 2017 ODIN report finds that critical datasets on the environment and gender are absent from some national data portals.[20] The lack of gender data is a particular obstacle to the SDG commitment to “Leave no one behind”,[21] which focuses on making disaggregated data available on gender, age, income, disability, and other important factors to make sure that the SDG targets are met for all segments of society. NSOs have an important role to play in closing these data gaps and meeting the demands of the SDGs.

Additional resources are needed for national statistical systems

Many national statistical systems are underfunded and lack the modern data infrastructure and statistical capacity necessary to meet the demands of the 2030 SDG Agenda. The Development co-operation report 2017[22] and The state of development data funding[23] report both find that funding levels for statistics are insufficient. Both recommend that the donor community (including multilateral, bilateral, and philanthropic organisations) adopt new financing strategies to provide more resources for data production and statistical capacity building. It is not just a matter of how much financing is given, but how it is given. As PARIS21’s project on Capacity Development 4.0 makes clear, better allocation of resources and coordination of donors’ programmes can increase the effectiveness of capacity-building programmes. The amounts needed are not large. Properly allocated and well used, an increase in support for statistics from 0.30% to 0.45% of official development assistance is needed to increase the statistical capacity to support the SDGs. National statistical systems with strong open data practices will have a positive effect on capacity-building efforts.

Increasing political support for open data

The countries that outperformed expectations in the open data indexes can provide important lessons on best practices. Countries like Rwanda, which has the highest ODIN score of any low-income country, or Mexico, which has developed a strong culture of support for open data and is consistently ranked highly in measures of open data, are good examples. Because many of the actions needed to make data open (e.g. open licensing and providing machine-readable formats) do not require large investments and are achievable with simple policy changes, it is often leadership and politics that keep data from being open.

NSOs are, by their design, supposed to be apolitical government organisations. Politics, however, often becomes entangled in NSO activities because official statistics can be used to justify funding from donors[24] or defend a politician’s governing record,[25] or because census statistics can be used for taxation and other functions of state power.[26] Because of NSOs’ apolitical nature, the leadership of these organisations often lacks, or does not want to use, the political capacity to push for an open data agenda.[27] Successful national movements for open data require a high-level commitment on behalf of the government (often at the head of state level), long-term planning to create continued political support in transition, and guiding political frameworks. With this political support, minor changes in policy and better dissemination tools could open data in many countries.

A rising open data star: Mexico

The Instituto Nacional de Estadística y Geografía (INEGI) in Mexico is opening data and leading the way in its region with high-level support from the Office of the President.[28] INEGI’s hard work moved the country into the ten most open countries in ODIN 2017, passing the United States. Mexico also consistently outperforms other countries in its region and other middle-income countries as measured by the ODB and GODI indexes. As a result, impactful open data programmes can be seen across the country, like Mejora Tu Escuela, a programme that displays school data and rankings to spur educational improvements by holding schools accountable.[29]

Demonstrating the value of data

The impact of open data on the economy, good governance, and democracy needs to be measured and communicated to the public, decision-makers, and politicians. If the value can be demonstrated, a virtuous cycle of data use can begin. People who use data will make better decisions. Data-based decisions will have more positive outcomes, and this will lead to greater data use and encourage additional funding for data and statistics. Broader use of data can also help NSOs improve the quality of their data. The more statistics are compared, contrasted, and combined with other data and information, the more light is shed on quality issues that may not have been identified previously.

The results from research studies on the use of open data for development are mixed: they show that data has the capacity to generate economic impacts, but that decision-makers often have difficulty incorporating data into their decision-making process. The Results Development Initiative[30] and the Avoiding data graveyards report[31] point to low use of data and open data platforms by decision-makers. Conversely, a survey from the United Nations Economic Commission for Europe finds there is a rising perception of the importance of data use and an increase in the citations of data in the countries surveyed.[32] More research is needed to understand the obstacles to, and incentives for, making better use of development data for public decision-making.

Leading the pack on open data in Africa: Rwanda

Rwanda has proven that, with a commitment to open data and some practical steps, low-income countries can open data. Rwanda has strategically invested in funding for statistics and open data.[33] As a result, the country earned the highest ODIN 2017 ranking for a low-income country. As a champion of open data, the country has also seen societal benefits like the open data land use portal that promotes land rights in the country. It especially benefits women, who are often cheated in land deals due to lack of access to land documentation.[34]

Conclusion

Taking stock of the state of open data for official statistics, we see that much progress has been made but that more is needed. International financial support for NSOs and a global push to demonstrate the value of open data for development could have dramatic effects on changing popular and political support for open data. However, there are also actions that NSOs can take to support open data in their own countries.

An important first step is to secure political and institutional support for open data within the government and to obtain the support of other stakeholders. This effort should be coordinated with a government-wide open data initiative, if possible. Legal frameworks and access to information policies should be reviewed and revised as necessary to support open data policies. Open data should be incorporated in countries’ NSDS, as well as in the planning and implementation of SDG national reporting platforms. For countries that have not already done so, an ODRA can be used to identify a roadmap for implementing open data. NSOs should champion open data in their own countries. Their perspectives and voices are needed at international discussions around open data, such as the IODC and United Nations World Data Forum.

Implementing open data programmes for existing datasets need not be expensive, and countries do not need to wait for additional funding to make progress. Data in PDF or image files can be converted to non-proprietary and machine-readable formats at little or no cost. Current production processes should be updated to go directly to machine-readable files, which will reduce costs over the long run. Metadata should be assembled and made available. And all data should be published under an open licence, such as a Creative Commons Public Domain (CC0) or Attribution Only (CC-BY) licence. These steps only require the political will to open data and few additional resources.
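
The low-cost steps above (a machine-readable, non-proprietary format, accompanying metadata, and an explicit open licence) might look like the following minimal sketch. The indicator values, field names, and metadata layout are hypothetical and chosen only for illustration; they do not follow any particular metadata standard, which an NSO would normally adopt.

```python
import csv
import io
import json

# Hypothetical indicator values; these are not real statistics.
rows = [
    {"year": 2016, "region": "North", "indicator": "literacy_rate", "value": 91.2},
    {"year": 2017, "region": "North", "indicator": "literacy_rate", "value": 92.0},
]


def to_csv(rows):
    """Serialise rows as CSV, a non-proprietary, machine-readable format."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def build_metadata(title, licence, fields):
    """Describe the dataset, its licence, and its fields as a JSON document."""
    return json.dumps({"title": title, "licence": licence, "fields": fields}, indent=2)


csv_text = to_csv(rows)
metadata_text = build_metadata(
    title="Example literacy indicator",  # illustrative title
    licence="CC-BY-4.0",                 # an open licence, stated explicitly
    fields={
        "year": "Reference year",
        "region": "Subnational region",
        "indicator": "Indicator code",
        "value": "Indicator value in percent",
    },
)
print(csv_text)
print(metadata_text)
```

In practice the two strings would be written to files published side by side, so that users receive the data, its description, and its terms of use together.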

Just as it is important to make the case for the value of data at the international level, it is also important at the country level. Open data expands the reach and influence of the national statistical system, increasing the value of official statistics to the government and to the public. Data that is open can be used and reused without diminishing its value, for mobile phone applications, analyses, and other uses. By following the “Leave no one behind” movement, NSOs can also build a broad coalition of all segments of society to make sure all people are included and can benefit from this data. Most NSOs are more focused on the technical aspects of running their organisations, but effort should also be put into spreading data success stories to the public to increase support for open data. Overall, open data can raise the profile of data and the profile of NSOs as trusted organisations that are responsive to national and international demands.

When these steps at the international and national levels are taken, the open data index scores will begin to improve, and, more importantly, citizens will start to see the promised benefits of open data and much needed movement toward the 2030 SDGs.

Eric Swanson

Open Data Watch

Eric Swanson is a co-founder of Open Data Watch, where he is the Director of Research. He is a globally recognised economist with a passion for analysing the most effective ways to use data for development.

Shaida Badiee

Open Data Watch

Shaida Badiee is a co-founder of Open Data Watch, where she directs the strategic planning, partnership, and fund-raising work. She is a Senior Advisor to Data2X focused on gender data, co-chairs the SDSN TReNDS group, is part of the Technical Advisory Group for the Global Partnership on Sustainable Development Data, is a member of the PARIS21 board, and serves on a number of other boards.

Caleb Rudow

Open Data Watch

Caleb Rudow is a research and data analyst at Open Data Watch and conducts research on open data funding, patterns of data use, and technical issues around open data policy.

Further Reading
References