Open data is often described as a non-rival good and inexhaustible resource. If I take a digital copy of a dataset, it doesn’t leave less data for you. This effectively costless sharing of open data is central to the logic that it should be made freely available and reusable, rather than treated as a finite resource to be hoarded. Land as a resource, however, is very different. Each use of land precludes use by others. Land is finite, and there is competition to control and exploit it. Potential users of land are often excluded by distance and by physical and legal barriers. Data also plays into this competition over land. Effective access to land data for one user may lead to a significant first-mover advantage and, thus, preclude other users from taking action vis-à-vis a parcel of land, even if they eventually have access to the same data.
When we also consider the natural resources that land provides, from the minerals underneath to the soil and crops on top, we can see that land can be managed well or can become degraded through over-exploitation. Unlike a digital dataset, where each different use can bring cumulative benefits, with land, there is a much more delicate balance to be struck. Yet, when it comes to understanding who owns or holds rights over land, the transactions that affect it, or how it is being managed, the word most often used is “murky”.1 Comprehensive and detailed information about land ownership is scarce.
Some of this is unsurprising. Land ownership patterns have developed over many centuries with overlapping systems of tenure, and, in many countries, these can involve feudal structures, traditional rights, common lands, leaseholds, and freeholds. The first registers of titles to land only emerged in the 1850s under colonial administrative predicaments, and many countries still lack centralised registers, let alone systems that have digitised full country-wide records. Unlike many other government databases that might be born-digital, such as those created by electronic monitoring of the distribution of welfare services, land (ownership) data is often stored in legacy, pre-digital information systems. Digitisation and verification of such legacy data is an expensive and extensive undertaking, especially in larger countries still migrating from a paper-based land records system. This implies, among other things, that land ownership data is costly to produce and maintain, even though relatively costless to share once digitised. Further, across the world, owners, custodians, and communities have a wide range of often complex and overlapping rights and responsibilities in relation to land, which are often not automatically captured by the simplified data representations used when land information systems are migrated from paper-based to digital records.
However, over recent decades, markets for land have globalised, and land has increasingly become a valuable asset class. This has led to vast, and often secretive, land deals taking place across the world with much remaining unknown about their scale and scope.2 At the same time, national and local debates over land rights have been unfolding, with local communities often fighting similar battles in parallel geographic silos. National-scale debates and movements have also brought into focus the importance of understanding land and land ownership. For example, the Constitutional Court of South Africa has recently handed down two landmark judgments upholding the land rights of women and communities affected by mining activities.3
Ultimately, the lack of transparency on land deals and the fragmented information landscape around land ownership present problems felt by government, citizens, civil society organisations, and the private sector. For example, without clear information, governments are unable to identify and evaluate policy interventions to stimulate housing development, developers cannot locate land to build on, and communities cannot monitor whether environmental protections are being upheld or claim their rights over geographical areas inhabited for generations. Taken together, these challenges have fed into calls for increased openness about land ownership, and they bring focus to the idea that open data can be used as a critical tool to address the land ownership transparency gap.
Land ownership and open data already have a history. When, in 2011, Michael Gurstein wrote his widely cited paper, “Open data: Empowering the empowered or effective data use for everyone?”, it was the release of land ownership information he turned to in order to ask his critical questions.4Drawing on the account by Solomon Benjamin et al. (2007) of the Bhoomi land reform project in Bangalore, he described how “the digitization and related digital access to land title had the direct effect of shifting power and wealth to those with the financial resources and skills to use this information in self–interested ways”.5Although Gurstein was cautious not to frame this as an argument against open data, but as one about the complementary interventions needed alongside it, the Bhoomi case has become iconic in open data discourse, frequently used to introduce the potential downsides of openness.
How far then have open data ideas progressed in relation to land ownership and governance? What is the current state of the art? And what lessons has the last decade provided? In the following sections, this chapter explores these questions through four lenses: first, with a look at cadastres and land registers, then at data on land deals and transactions, followed by data on land use, and finally, at how the land governance community is engaging with open data. In doing so, the chapter seeks to highlight how the topic of land ownership and open data provides a unique perspective on the challenges of building open data infrastructures and ecosystems in the context of unequally distributed power and wealth and how the power dynamics around data cannot be ignored.
Cadastres and land registers
Understanding land ownership generally relies upon two types of data: cadastres, which record the boundaries (formal or informal) of land parcels, and land registries, which record property rights and interests, and the details of ownership of particular parcels of land.6While some countries have unified systems, in others, there are separate systems for each function, different systems at each level of government, or distinct cadastres and registries maintained by individual agencies, such as government departments related to natural resources and mining.
Since they started tracking land ownership data, both the Open Data Index7and the Open Data Barometer8have reported it to be one of the least available categories of data. This has remained a consistent finding, even after the Open Data Index dataset definition was updated in 2016 to remove the requirement that open land ownership data should include identifiable property owners.9This revision, based on work with Cadasta Foundation, represented a more mature understanding in the open data community of the complex power dynamics and administrative structures around property ownership in different countries and the careful balance to be struck between privacy and transparency when it comes to land ownership records.
For example, in New Zealand, a detailed cadastre showing plots and the tenure type of each plot has been available since 2011 under Creative Commons licensing,10but access to data that includes ownership information requires users to agree to a separate licence for personal data.11In the United Kingdom (UK), individual title information can only be accessed for individual plots by purchasing title deeds, but a unified dataset of land held by commercial, corporate, and government owners was made available for free as bulk data in 2017, albeit under restrictive licensing terms that emphasise it should only be used for personal and non-commercial use, effective management of land, and prevention of crime.12Apart from transparency needs and privacy concerns, the significant commercial value of land data, especially of disaggregated data that incorporates ownership and land use information, shapes the decisions by land administration authorities regarding the opening of data as the New Zealand and UK cases illustrate.
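The distinction drawn above between cadastres and land registers, and between openly licensed parcel data and restricted ownership details, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class names, field names, and the rule of withholding the names of individual owners are hypothetical and do not describe the data model of any actual registry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CadastreParcel:
    """Boundary record: where a parcel is (hypothetical fields)."""
    parcel_id: str
    tenure_type: str   # e.g. "freehold", "leasehold", "customary"
    boundary_wkt: str  # parcel geometry as well-known text


@dataclass(frozen=True)
class RegistryEntry:
    """Rights record: who holds an interest in a parcel (hypothetical fields)."""
    parcel_id: str
    owner_name: str
    owner_kind: str    # "individual", "company" or "government"


def open_view(parcels, entries):
    """Join both sources on parcel_id, withholding names of individuals.

    Simplification: one registry entry per parcel. Real tenure systems
    often involve several overlapping rights over the same land.
    """
    by_parcel = {entry.parcel_id: entry for entry in entries}
    rows = []
    for parcel in parcels:
        entry = by_parcel.get(parcel.parcel_id)
        publishable = entry is not None and entry.owner_kind != "individual"
        rows.append({
            "parcel_id": parcel.parcel_id,
            "tenure_type": parcel.tenure_type,
            "owner_kind": entry.owner_kind if entry else None,
            "owner_name": entry.owner_name if publishable else None,
        })
    return rows


# Invented sample records, for illustration only.
parcels = [
    CadastreParcel("P-001", "freehold", "POLYGON((0 0, 0 1, 1 1, 1 0, 0 0))"),
    CadastreParcel("P-002", "leasehold", "POLYGON((1 0, 1 1, 2 1, 2 0, 1 0))"),
    CadastreParcel("P-003", "customary", "POLYGON((2 0, 2 1, 3 1, 3 0, 2 0))"),
]
entries = [
    RegistryEntry("P-001", "A. Resident", "individual"),
    RegistryEntry("P-002", "Example Holdings Ltd", "company"),
]
rows = open_view(parcels, entries)
```

In this sketch the parcel with no registry entry still appears in the open view, which reflects the point that boundary data can be published even where ownership records are missing or restricted. Whether withholding individual names is the right rule is, as the following discussion suggests, likely to depend on the country context.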
While Rufus Pollock’s arguments support the view that the model of charging users for access to land titles is economically inefficient and leads to a loss of societal benefits (as well as leading to inequality between those who can afford to build their own plot-by-plot view of land ownership and those who cannot),13others see selling access to data plot-by-plot as a reasonable restriction, judging that open access to the full dataset would be harmful in a way that selective access to records is not. Cadasta Foundation’s analysis of open land ownership data suggests, however, that the level of land ownership transparency that is appropriate is likely to be context dependent from country to country, noting that “the UK is a highly developed and relatively equitable country with a 150 year old land administration system that holds 24 million titles. Opening up data on property owners’ names in this context has very different risks and implications than in a country with less formal documentation, or where dispossession, kidnapping, and or death are real and pervasive issues.”14
Who uses land data?
United States (US) real-estate platform Zillow draws upon US housing transaction data to provide housing purchase and rental valuations and provides an open application programming interface (API) of government records it has digitised and converted into structured data. The business was valued at USD 540 million at the time of its IPO in 2011.15
In New Zealand, wind farm developers have taken advantage of machine-readable cadastral and land ownership data to speed up the process of identifying and planning new sites.16
Investigations by the New York Times uncovered the true owners of expensive New York apartments purchased through anonymous shell companies. The investigation helped lead to actions by the US Government to seize assets suspected to have been bought with money stolen from Malaysia’s sovereign wealth fund in the 1MDB scandal.17
Note: current use of land data is greatly limited by availability. A number of the cases illustrating what could be done with land data in this chapter have sourced their data through Right to Information (RTI) requests or other research, rather than having direct access to open land datasets. Of the 17 countries with more than a 0% score for open publication of land ownership data in the latest Open Data Index, five are from Asia, 11 from Europe, and one from the Caribbean region.18
Privacy and security issues aside, one of the biggest hurdles to increasing the availability of land ownership records is the fact that many have still not been digitised. For many decades, development banks, including the World Bank, have provided extensive financial support to national and subnational efforts to develop cadastres and land registries in developing and middle-income countries. It is notable, however, that none of these projects, even those recently established, appear to have any explicit open data component, talking at best only about online portals.19It is also worth noting that many digital land titling projects have taken decades longer than planned to complete and have struggled to overcome the considerable technical and logistical challenges of converting millions of paper records into digital forms.
Large-scale land digitisation projects also face critical questions about their tendency to adopt narrow ontologies, and to represent land in terms of simple ownership, rather than as a complex web of rights.20 Studies report that digitisation initiatives restructure not only data but the bureaucracy around it.21, 22 It is primarily this concern with the way digitisation took place, ignoring traditional land usage in favour of only a limited class of documented land rights and centralising power over land decisions within higher levels of government, that was arguably at the root of the Bhoomi case,23 with open access in situations of low literacy or low capacity of users to effectively use the digitised data presenting a secondary, albeit critical, complication.
For the millions of people around the world without secure title to their land, the official datasets and data structures used to judge land disputes represent a major source of power. But if open data is understood as more than a one-way flow of data from governments, and instead, as a means to allow citizens to create and publish data about their land ownership, opportunities exist to shift that balance of power and create records that can be used to support land claims. For example, tools developed by Cadasta Foundation support communities to document their own land use and rights data, adopting flexible data models and offering fine-grained control of what is, or is not, shared openly.24Where such systems are compatible with local legal regimes, they can give communities more control of land ownership evidence and offer a route to greater empowerment.
There have also been a number of announcements in the last few years of blockchain or distributed ledger-based alternatives to, or add-ons for, government land registry systems. Although these might, in theory, provide access to cryptographically secured and open land data,25they do not escape the need to determine the provenance of the information added to the ledger, and evidence of any blockchain-based land registers in operation, or achieving impacts on the ground, is vanishingly thin.26
Even when land registry data is collected and kept updated, three further barriers to open data access are commonly found: cost, infrastructure, and discoverability. In South Africa, for example, it is possible to browse a detailed cadastral map of property boundaries and tenure types online through a free portal,27but access to detailed data requires the payment of fees for each 100 or 200 parcels.28
Renee Sieber, in Chapter 9: Geospatial, also notes the increasing presence of private businesses in providing cadastral services, sometimes in return for exclusive rights to monetise the resulting data. In Europe, the 2007 INSPIRE Directive on geospatial data (see Chapter 32: European Union) has led to some progress on making cadastral records available as standardised open data,29 although users seeking to bring together data across countries are likely to be met with numerous technical errors, incompatible metadata, and broken APIs. The technical complexity of both producing and consuming cadastral data may also help explain why spot checks of Open Data Index and Open Data Barometer assessments reveal weaknesses in the accuracy of their measurements with respect to land ownership, with their researchers apparently struggling to consistently locate and assess the openness of cadastral data.30
In summary, open data ideas are relatively new within the long-established and politically charged field of land registration. While in some higher-income countries an early balance appears to have been struck between making cadastral data “open by default” and protecting the privacy rights of individual owners, there is a long way to go before the balance is struck for most countries, particularly when capacity to use data is also unevenly distributed. While the possibility of open data approaches allowing marginalised groups to take control of the representation of their own land rights is worthy of more focused research, the key technological need right now appears to be skills for grassroots data collection and management as opposed to innovations in specific database technology, such as blockchain or other distributed ledger solutions.
Data on land ownership is not only captured through static registries. Over the last decade, there has also been considerable interest in transaction data related to the buying and selling of land. This kind of data can reveal the value of land, show changing patterns of land ownership and use, and highlight risks related to money laundering and corruption.
Sources of land deal data range from national government records, such as the UK Land Registry Price Paid Dataset that lists residential property transactions,31to crowdsourced datasets, such as GRAIN32and Land Matrix,33created by a network of researchers drawing on crowdsourcing and media reports to provide a partial global view of prospective or completed land deals. This latter class of data has become the subject of some controversy, illustrating the tensions that can exist when creating datasets to support research and advocacy.
Founded in 2009 by a group involving the International Land Coalition (ILC), among others, LandMatrix.org launched a beta dataset of “land grabs” in April 2012, offering a downloadable list of locations and investors, along with the anticipated size of the area to be bought. This, along with data from GRAIN, helped to spark a number of academic papers and media reports on the phenomenon of land deals with a particular emphasis on deals in Africa. However, Oya (2013) has argued that the crowdsourced data lacked methodological rigour, and that a focus on generating “killer facts” through rapid research could ultimately undermine the work of researchers and advocacy organisations seeking to understand deals, providing “false precision” and generating data that would not be trusted by governments and businesses.34 Scoones et al. (2013) have described this as the “politics of evidence”.35 By 2013, revisions to the LandMatrix methodology and dataset structure to more clearly illustrate source information had responded to some of these critiques, suggesting a reasonably tight feedback loop between academic and activist communities. Although it appears work on open data around land deals peaked in 2012–13, both GRAIN and LandMatrix have continued data collection. LandMatrix, in particular, is preparing for a new version to be released with updated data and features, working through a network of regional focal point institutions, including the University of Pretoria in South Africa, the Asian Farmers’ Association for Sustainable Rural Development (AFA) in Asia, and the Foundation for Development in Justice and Peace (FUNDAPAZ) in Latin America.36
Oya’s critique of land grab databases also questioned the reliance on datasets alone and called for more mixed-methods and in-depth research. One tool responding to this has been OpenLandContracts.org,37which was launched in October 2015 by the Columbia Center on Sustainable Investment (CCSI) and builds on a platform created for extractives contract monitoring. This tool provides full text land deal documents and allows their annotation to create additional structured data. Szoke-Burke (2016) writes that the platform can encourage “more sustainable land-use practices and fresh opportunities for public participation in decision-making on [land] investments”.38
It is notable, however, that while the systematic publication of government procurement contracts has received considerable international attention (see Chapter 1: Accountability and anti-corruption), there has been much less policy focus on proactive publication of government land deals, even in light of substantial programmes of government land disposal in a number of countries. The UK, for example, has required local government agencies to prepare and publish open data on their land holdings, identifying surplus land which might be sold off for housing or property development. Yet there is no corresponding requirement to publish data on the land that has been sold off, who it was sold to, and how it is subsequently developed.39This fits with an emphasis in government policy on using data to support an emerging PropTech (Property Technology) sector,40rather than supporting public ownership of land.41In seeking to take a global look at this issue, we could not locate any sources indicating the extent to which different countries provide structured data on government land holdings, their purchases, and disposals.
Ultimately, when it comes to land deals, crowdsourced open data has been instrumental in generating debate. However, its use has also brought into relief the politics of data, leading organisations to seek a balance between rapid data-driven research and rigorous data collection that combines quantitative and qualitative perspectives. Data on government land deals is of particular interest; however, there appear, at present, to be few coordinated calls for its proactive publication.
Private Eye - Land deals data and offshore ownership
In 2015 and 2016, British satirical and current affairs magazine, Private Eye, investigated ownership of UK property through offshore companies using a mix of land registry and land transaction data, albeit obtained through Freedom of Information requests, taking advantage of journalistic privilege to draw on some copyright protected information. The magazine published an interactive map showing GBP 170 billion of UK property acquired by companies registered offshore over a ten-year period, highlighting how these structures were used for large-scale tax avoidance or provided secrecy vehicles that could facilitate money-laundering.42 The investigation helped spark plans to require foreign companies buying UK property to declare their beneficial owners43and the open release of the UK’s Overseas Company land ownership dataset.
From a sustainable development perspective, it is not so much land ownership that matters per se, but rather the use to which land is put (albeit noting that ownership has a big impact on the equitable or distorted distribution of benefits from that use). In recent years, there has been a step-change in the global availability of remote sensing data on land quality and its use. This has been accompanied by a number of local projects making use of geospatial tools to layer together land rights and land use information, guiding policy design and supporting community action. We also note promising examples that show how open data can be used to support citizens in accessing and enjoying the use of public lands.
Two sources have been instrumental in making it possible to zoom to any square mile on earth and access visualisations and open data on estimated soil quality, land cover, and land use. Openly licensed satellite data is the driver for platforms like soilgrids.org,44 which provides downloads under the Open Data Commons Open Database License (ODbL). However, recent experiments have also turned to crowdsourced OpenStreetMap data to generate land use maps, combining this with satellite data to offer usable land-use classifications across the world.45, 46, 47 Although there are still some methodological challenges in reconciling figures from crowdsourced and remote sensing datasets with national records, this data has the potential to be used in both planning and measuring development interventions, including by tracking the impact of development activity on soil health and land productivity.
The East West Management Institute’s (EWMI) Open Development Initiative (ODI) in the Mekong region48also draws on geospatial tools and a number of base maps as the background for curated datasets on concessions, oil and gas blocks, and registered Indigenous lands, supporting research into the relationship between different land users. Through the ODI, EWMI acts as a paradigmatic “infomediary”49with goals to “change public perceptions about information and build demand for more transparency, shift dynamics from debates over basic data, encourage independent analysis, and level the playing field in regard to information access”.50The breadth of scholarly literature citing ODI sources suggests this goal is being met. Notably, however, the data available on different ODI maps across the Mekong region varies with detailed government-sourced land use only available for Cambodia, while sites for Laos, Myanmar, Vietnam, and Thailand have to fall back on international sources. When it comes to concessions, data gaps are a global problem with the 2017 Resource Governance Index51finding that over 50% of the countries surveyed lacked any public cadastre of oil, gas, or mining concessions and licences.52
Along with land allocated for resource extraction, many countries have land allocated for national parks, reserves, and recreation areas. In the US, an online platform for finding campsites (hipcamp.com), a mass membership environmental charity (the Sierra Club), and Code for America have come together with over 50 other partners to advocate for US National and State parks to adopt an open data approach within their park reservation system.53Active since 2014, the group has proposed model language for Parks Services to include in contracts with third-party vendors and has offered to broker introductions between national park staff and open data experts.54The AccessLand.org project hopes to encourage all parks to create open APIs that will allow a variety of civic and entrepreneurial platforms to hook into their data to discover available facilities and facilitate the booking of park spaces.55
This last case draws attention once again to the interactive opportunities of open data about land by creating systems that not only present information but also support two-way engagement through data.
The land governance community
As the introductory section of this chapter describes, land governance debates often play out in very local contexts, leading to the creation of many grassroots communities, activist networks, and stakeholder groups. However, the land governance sector has a track record of organising internationally with multi-stakeholder networks such as the ILC56 and the Global Land Tool Network (GLTN),57 which emerged in 1995 and 2006, respectively.
In 2009, ILC and the consortium behind the experimental landtenure.info database58 launched plans for the Land Portal to be a clearinghouse for land governance information and data.59 The Land Portal quickly evolved to have a strong focus on open data and semantic linked open data standards, aggregating and repackaging existing indicator data and developing LandVoc as a flexible vocabulary for describing land governance documents and data.60 Active in advocacy for open data in the land governance sector,61 the Land Portal has taken a particular stance in its approach to both the sources of its data and the audience for the information that results from it.62 In its 2014 business plan, the Land Portal describes a focus on “supporting the efforts of the rural poor to gain equitable access to land by addressing a fragmentation of information resources on land, which makes it difficult and often prohibitively expensive to draw together reliable evidence in support of programs, advocacy campaigns or policy formulation, especially for grassroots organisations”.63 One of the datasets made available through the site is the Property Rights Index (Prindex), launched in 2016 and now covering 36 countries with measures to represent citizen perceptions of how secure their land rights are and to complement or challenge more formal technical measures of national tenure systems.64 Through a series of partnerships with grassroots groups in Latin America, Africa, and Asia, the Land Portal has also explored approaches to filling gaps in available information and data, seeking to redress the imbalance of an information ecosystem where the majority of data remains the product of powerful global players.65
Since the Sustainable Development Goals (SDGs) were established in 2015, the land governance community has been tracking the quality and availability of data required to measure progress against land-relevant targets and indicators. As of December 2018, of the 12 land-related indicators, only three have both an established methodology and regular data collection, with six indicators still lacking an established methodology. Of the “tier 2” indicators (methodology established, but no regular data collection), two relate to gender and one to inclusive access to public space for people of all ages, genders, and disabilities.66
Most recently, funding for the work of the Land Portal (and a number of other land governance data projects) has predominantly come from the UK Department for International Development’s LEGEND (Land: Enhancing Governance for Economic Development) programme,67from Omidyar Network,68and from partnerships with GODAN (Global Open Data for Agriculture and Nutrition: see Chapter 2: Agriculture). However, compared to the levels of support for specific open data initiatives in other sectors, such as agriculture or anti-corruption, resourcing for open data in land remains comparatively limited at present.
Overall, open data appears to still be a relatively niche issue within the land governance community. An increasing number of organisations in the sector have adopted open licences for their data and publications, and, in 2017, a number signed onto a Land Information Ecosystem Declaration,69yet broad mainstream recognition of the role of open data still appears limited. This may be because of the particular political slant adopted by advocates of open land data, or simply because data issues still feel distant from the concerns of actors involved in fighting local land governance battles.
When it comes to land ownership data, we are confronted by a transparency gap and a messy reality of patchy and overlapping recordkeeping and data systems. However, where data is available, solid foundations have been laid for a responsible data70approach to be taken, recognising that, where ownership records include personal data, “open by default” does not automatically apply. Ultimately, both data collection and data publication need to account for the political context and power dynamics in which they are undertaken, and recognise the way in which remote sensing and crowdsourcing can rapidly transform the overall data landscape.
Over the last decade, numerous examples have made it clear that when better land ownership and use data is made available in appropriate ways, and when it is connected with data on company ownership, agricultural practices, or Indigenous rights, it can generate substantial value realised through investigative journalism, community action, academic research, and by informing government strategies. Continued development of the critical and multi-method research skills needed to use land data effectively will be vital to unlocking further value in the future.
Looking ahead, there are three key areas for action. First, we need continued work to understand and create the conditions under which marginalised and disadvantaged groups are empowered to access and use data on land ownership to secure their property claims, to seek justice, and to address corruption. Not only is capacity building vital to make the most of land ownership data, but without capacity building to level the playing field between developers, PropTech firms, and existing land users, just outcomes from increasing openness cannot be taken for granted.
Second, donors and governments investing in the technical infrastructures for land governance should be incorporating open data terms into all their project plans, funding agreements, and contracts. This does not mean all data must be open by default, but rather that systems must be open data ready, and the proprietary control of land ownership and use data must be ruled out. Directing just a small percentage of the millions invested in land registry systems every year toward open data approaches could be transformative.
Lastly, we need to see much better baseline and monitoring data on current levels of openness around the world for cadastre, land registry, and land deal data. Current open data studies lack the depth and geographic coverage needed to allow accurate monitoring of progress. At a minimum, studies need to distinguish between data that covers all forms of tenure and data that is restricted to only corporate or government-owned land. With a better baseline, it should also be possible to foster stronger advocacy, calling for land registry and land deal open data to be published with purpose.
In closing, the key lesson to take away from looking at open data and land ownership is that political struggles over the collection, curation, and release of data are now part and parcel of political struggles related to land ownership and use. Although this is brought into sharp relief in the case of land, open data in each sector is equally likely to possess its own complex politics, and advocates taking a stand on open data should always consider the wider political context within which it is pursued.