Goal 2 of the Sustainable Development Goals (SDGs) commits United Nations member states both to achieve food and nutrition security and to promote sustainable agriculture. The world population is projected to exceed 9 billion people by 2050,1 and the corresponding growth in demand for food is exerting massive pressure on the use of water, land, and soil, which is further exacerbated by global warming. The majority of the world’s food is still harvested by smallholder farmers,2 many of whom are poor and food insecure themselves.3
Agriculture is a knowledge-intensive industry. Government and private sector-supported research and agricultural extension work (e.g. farmer education) are central to improving crop yields, understanding and implementing sustainable practices, and getting food to market. However, it is only in the past two decades that the agricultural sector has valued data as a tool for generating, sharing, and exploiting knowledge to improve yields, reduce losses, and increase overall agricultural business outcomes.
Rapid internet and mobile phone penetration, especially in the developing world, the accessibility of satellite and remote sensing data, and new data collection and analytical approaches all play a role in the “datification” of agriculture. While data-related opportunities are increasing, challenges still exist in the policy, ethical, and data standards domains, and key datasets remain absent or inaccessible. This is especially true in terms of nutrition-related data, which is largely under-utilised in the field of agriculture. Despite some progress in raising consumers’ awareness of the nutritional value of the food they consume, demand has not been significantly redirected to the production of more nutritious food, especially in the developing world.
Networks and leadership: A history of open data in agriculture
Work on open data in agriculture has emerged from a long history of knowledge management practice and international networking. Agricultural libraries in the United States (US) have been sharing bibliographical data since the 1940s. In the 1980s, the Food and Agriculture Organization (FAO) of the United Nations developed AGROVOC4 initially as a printed thesaurus of terms and later established it as the first real data standard (vocabulary) for an open agriculture information ecosystem. FAO also created the first network to support agricultural information sharing in 2003, known as GLOBAL.RAIS (Global Alliance of the Regional Agricultural Information Systems).5 In 2008, they launched the Coherence in Information for Agricultural Research for Development (CIARD) initiative,6 a global movement dedicated to open agricultural knowledge, working to align the efforts of national, regional, and international institutions, and to improve information sharing and services.
The importance of considering not only data, but open data, came to the fore in 2012, when the US convened an international conference on Open Data for Agriculture, the result of a G8 commitment, with an emphasis on making “reliable agricultural and related information available to African farmers, researchers, and policymakers”.7 This led to the creation of Global Open Data for Agriculture and Nutrition (GODAN) as a convening network to bring together public, private, and non-profit stakeholders to find ways to open up and use data more effectively.
GODAN was conceived to focus on awareness raising and advocacy as reflected in its statement of purpose,8 but, from the outset, it was found that change through advocacy results only when partners are brought together to debate the issues and obstacles to making open data for agriculture a reality, especially when they can draw on provocative policy-focused research and recommendations. An approach to “Convene, Equip, and Empower” now frames the overall GODAN theory of change.9
Other notable networks that advocate for open data in agriculture through high-level communications, research, and events include the Global Partnership for Sustainable Development Data, the Research Data Alliance, Global Forum for Agricultural Research, Presidents United to Solve Hunger, and AgriCord.10
A value chain perspective
When we consider the potential and use of open data in agriculture, there are numerous facets that reflect the breadth and diversity of the sector, especially when one also considers nutrition as a key element of the field. Whether it is food price data, geodata, plant genomes, country statistics, nutrition data, or data from a grassroots initiative to quantify food composition, published open datasets can be used by a wide variety of stakeholders to generate impact.11 The actors involved are similarly diverse. Consider, for example, the single value chain for cheese production illustrated below.
Cheese is made of milk produced with the involvement of feed producers, dairy farmers, transporters, and processing factories. Each actor has an interest in understanding the provenance of their inputs and the markets they operate in. Some of the production chain involves data that can be made open. In other cases, data will be seen as a commercial asset. Regulators may be interested in product traceability, nutritional content, and labelling, and in providing this information to consumers. Producers are also interested in investment opportunities and risk reduction. In this simple value chain, there are various ancillary datasets that may be considered pre-competitive, yet still have some commercial value (weather data, transportation data, genetic data on livestock, etc.). These datasets can inform production, allowing producers to adjust the sourcing of inputs or to modify the production process to improve both the quality and the volume of their crops. Openness is clearly a tool to facilitate the flow of data across this value chain and to realise the maximum potential of data, yet openness requires policy choices, private sector engagement, and consumer awareness. It also requires that consideration be given to how different actors will be able to use the data that becomes available based on its level of interoperability. This chapter will attempt to unpack a number of these issues in more depth.
Open data issues in agriculture
Agriculture is a complex sector, and it can be difficult to define its boundaries. Agriculture and food systems integrate seamlessly into other systems, such as ecology, human health, and the built environment. Sustainable agriculture is considered a “wicked problem”,12 where too many elements are involved for the problem ever to be considered “solved”. The data and metadata that are collected within agricultural systems are equally complex because they are generated by thousands of global stakeholders from multiple sectors, using an incredible range of types, formats, and ontologies. However, when we consider some of the primary forms and uses of agricultural data, such as research, production management, and statistical monitoring, we can start to map out some of the roles that different stakeholders play as illustrated in Figures 2 and 3.
Governments collect and share data in the form of national and international statistics (e.g. US National Agriculture Census13 and FAOSTAT14), but often also support farmers and agricultural practices by publishing key datasets used for ICT-enabled farm extension and to empower consumers in food supply chains. Governments may also provide policy-relevant open data, including data related to national standards and frameworks used by service providers who help farmers or processors meet regulatory requirements.15 Governments also use open data to promote transparency in their operations, with registers of land ownership a key example.16 They are able to use their regulatory power to collect, or require the publication of, key data from private actors. Since 2012, a number of governments have developed and implemented open data policies to help embed open data practice in their own organisations or use their role as donors, funders, and commissioners to bring open data into the mainstream of agricultural development work.
Larger agricultural businesses are increasingly interested in open data, and companies are exploring opportunities to act as both data producers and consumers.17 Some larger companies recognise that they are being held accountable by society and that greater transparency is a key foundation of their licence to operate.18 In 2014, with the support of the Open Data Institute, Syngenta, a multi-billion dollar firm, placed open data at the core of its transparency strategy;19 however, for many firms, operational “transparency” remains more opaque, with information buried in corporate reports and a lack of structured background data. This presents challenges not only for public scrutiny, but also for investors seeking to target more sustainable investments.20 Due to the nature and size of the value chains of larger corporate entities in the agri-food business that operate on a truly global level, they can have a significant impact on countries that lag behind in terms of reaching the SDGs.
Many farmers in developed countries are turning to data-based precision agriculture. Even in the developing world, farming involves increasing amounts of data collection and analysis. However, smallholder farmers often lack the technical capacity to manage or exploit the open data they create or that is provided by external producers. Instead, they often rely on intermediaries from the private sector or government. These intermediaries typically develop portals, apps, and tools that allow farmers to benefit from data on a range of topics, such as weather, infestations, or soil quality, that would otherwise be unavailable to them. Farmers’ organisations have raised concerns that, unless it is well governed, data from farmers could be exploited and used against their interests. In some countries, farmers have decided to take data management into their own hands by collectively developing portals and tools for themselves.
Academia and research have a long history of sharing data, and the cultural environment is shifting in a more open direction as open science is being embraced by more researchers,21 donors,22 and research networks.23,24 The FAIR (Findable, Accessible, Interoperable, Reusable) data principles25 have seen very rapid adoption in the scientific community, and open data has an important, albeit not exclusive, role within these principles. In partnership with international institutions, researchers have built a range of research infrastructure, including the European Open Science Cloud,26 and networks for the discovery of data, such as the CIARD Ring.27 The Interest Group on Agricultural Data (IGAD) at the Research Data Alliance (RDA)28 connects a global community of researchers in the agricultural domain to exchange state-of-the-art research data on agriculture. However, access to research data remains fragmented. Although good permanent repositories exist,29 it is not uncommon for data associated with a research project to be published, but then disappear when funding for the project tied to maintaining the data servers is no longer available.
Overall, although the supply of open data from all these different stakeholders is increasing, there remain large gaps, quality issues, and challenges in making data interoperable, as well as difficulties in establishing appropriate incentives for the stakeholders that are most relevant within the value chain.
Toward a global (open) data ecosystem for agriculture and food
Agricultural data includes social, environmental, physical, and financial factors. If viewed through the value chain, this includes inputs (fertilizer, pesticides, seeds), production (soil, weather, growth, land and water use), harvest (farmer income, yield, storage), and transport to market (food prices, road conditions, CO2 emissions). This data is collected using several methods: in-situ sensors, household surveys/interviews and on-the-ground collection, and, increasingly, through technology, such as satellites and drones, and sensors on farm equipment.
With all this data, what would it take to secure the best access to data for improving agriculture and food security? This is the question addressed by Syngenta and GODAN partners in articulating their vision for a global data ecosystem for agriculture and food.30 A global data ecosystem encompasses open standards and frameworks that enable decentralised data exchange. In an ideal open data ecosystem, all data, from geospatial to household surveys, could be layered together and used by any actor within the ecosystem. This is a socio-technical project: combining principles (such as the FAIR principles), technology, and stakeholder engagement.
Standards are explicit guidelines for the collection, management, and organisation of data. They can dramatically improve the interoperability of data between different stakeholders across agricultural value chains. Standards take many forms, including vocabularies, taxonomies, measurement protocols, data models, and equipment interfaces. The field of agriculture has long engaged in processes of standardisation for specific purposes, such as food safety, cross compliance of subsidies, machine engineering, and lab analysis, yet the existence of many sub-fields in agriculture has led to a proliferation of standards. These various standards have a surprisingly low degree of interoperability as they were developed to primarily serve the specific sub-fields; however, the need to use data from different sources for new applications (including big data and artificial intelligence applications) has made interoperability increasingly important. The starting point for greater interoperability is increased transparency on the development and use of current standards.
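The role of a shared vocabulary in making data interoperable can be illustrated with a minimal sketch. It assumes two data sources that describe the same crops using different local terms, and a small shared concept list against which both are mapped. The concept identifiers, term mappings, and records are invented for illustration and are not taken from AGROVOC, the Global Agricultural Concept Scheme, or any other real standard.

```python
from collections import defaultdict

# Hypothetical shared concept identifiers (not real AGROVOC or GACS URIs).
SHARED_CONCEPTS = {
    "maize": "concept:0001",
    "rice": "concept:0002",
}

# Each source keeps its own local terms and declares how they map
# onto the shared vocabulary. Both mappings are illustrative assumptions.
SOURCE_A_TERMS = {"corn": "maize", "paddy": "rice"}
SOURCE_B_TERMS = {"Zea mays": "maize", "Oryza sativa": "rice"}


def harmonise(records, local_terms):
    """Attach a shared concept id to every record whose crop term is mapped.

    Records with an unmapped term are returned separately so that gaps in
    the mapping stay visible instead of being silently dropped.
    """
    mapped, unmapped = [], []
    for record in records:
        shared_label = local_terms.get(record["crop"])
        if shared_label is None:
            unmapped.append(record)
        else:
            mapped.append({**record, "concept_id": SHARED_CONCEPTS[shared_label]})
    return mapped, unmapped


def total_tonnes_by_concept(*record_sets):
    """Sum production across sources once they share concept ids."""
    totals = defaultdict(float)
    for records in record_sets:
        for record in records:
            totals[record["concept_id"]] += record["tonnes"]
    return dict(totals)


# Made-up example records from the two sources.
source_a = [{"crop": "corn", "tonnes": 10.0}, {"crop": "paddy", "tonnes": 4.0}]
source_b = [{"crop": "Zea mays", "tonnes": 5.0}, {"crop": "sorghum", "tonnes": 2.0}]

mapped_a, unmapped_a = harmonise(source_a, SOURCE_A_TERMS)
mapped_b, unmapped_b = harmonise(source_b, SOURCE_B_TERMS)
totals = total_tonnes_by_concept(mapped_a, mapped_b)
```

Without the mapping step, "corn" and "Zea mays" would be counted as different crops. The unmapped "sorghum" record shows why transparency about which terms a standard covers is a precondition for combining data.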
In order for standards to be more useful for research and for decision-making, they must be online, open, and machine-readable. GODAN Action (see box below) has completed a mapping of agri-food standards31 and discovered that 16% of the standards are not online, only 56% are machine-readable, and only 21% are clearly available under open licences, thereby limiting their use for open data. The relative openness of standards is often related to the sub-field where they originated. For example, plant science standards are more likely to be open than soil-related standards, and supply chain standards are even less likely to be open.
GODAN and the Agricultural Information Management Standards (AIMS) initiative, hosted by FAO, have developed the VEST Registry32 to make standards more open and useful by cataloguing ontologies in use in different agricultural sub-fields.33 The RDA/IGAD,34 started in 2013, works specifically on methods to make agricultural data more interoperable across crop-specific themes (such as rice and wheat) by developing joint standardised vocabularies, such as the Global Agricultural Concept Scheme.35 Identifying and describing the standards in use provides a first step to increasing interoperability and rationalising standards; however, it is also important to increase widespread adoption of standards by embedding their use requirements in the development of guidelines and policies on open data.
GODAN Action36 is a three-year multi-sector project funded by the Department for International Development (DFID) in the UK and implemented by the Open Data Institute (ODI), GODAN, the Technical Centre for Agricultural and Rural Cooperation (CTA), Wageningen UR, and FAO, which aims to enable data users, practitioners, and intermediaries to work effectively with open data in the agriculture and nutrition sectors. GODAN Action works on three focal areas that will help overcome open data challenges: promoting standards and best practices, measuring open data impact, and building capacity with stakeholders. GODAN Action is applying these three focal areas to three specific data themes: weather data (2017), nutrition data (2018), and land use data (2019).
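The three criteria named above (online, machine-readable, openly licensed) lend themselves to a simple catalogue check. The sketch below is only an illustration of how such a mapping could be summarised: the record structure, the field names, and the four example entries are assumptions made for this example, and the resulting shares do not reproduce the GODAN Action figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StandardRecord:
    """One catalogue entry; the field names are illustrative assumptions."""
    name: str
    online: bool
    machine_readable: bool
    open_licence: bool


def share(records, attribute):
    """Return the fraction of records for which the boolean attribute is true."""
    if not records:
        return 0.0
    return sum(getattr(r, attribute) for r in records) / len(records)


def usable_for_open_data(records):
    """Names of standards that meet all three criteria at once."""
    return [
        r.name
        for r in records
        if r.online and r.machine_readable and r.open_licence
    ]


# Invented entries, not drawn from any real registry.
catalogue = [
    StandardRecord("plant-trait-vocabulary", True, True, True),
    StandardRecord("soil-sampling-protocol", True, True, False),
    StandardRecord("supply-chain-message-format", True, False, False),
    StandardRecord("legacy-lab-codes", False, False, False),
]

online_share = share(catalogue, "online")
machine_readable_share = share(catalogue, "machine_readable")
open_licence_share = share(catalogue, "open_licence")
fully_usable = usable_for_open_data(catalogue)
```

Reporting the three shares separately, as well as the list of standards that satisfy all of them, makes it clear that a standard being online does not by itself make it usable for open data.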
Over the past decade, open access and open data policies have become more prominent among governments and funders of agricultural programmes. The US and the United Kingdom (UK) made some of the first efforts toward the creation of open data policies. In 2013, US President Obama signed an executive order37 toward making data open by default, which led to the US Department of Agriculture’s (USDA) launch of the Food, Agriculture, and Rural virtual community38 on data.gov. The UK created its open data policy in 201239 and has since opened thousands of agriculture-related datasets through the Department for Environment, Food and Rural Affairs,40 and the European Union (EU) has undertaken similar work through the EU Open Data Portal.41 These examples illustrate the potential for public policy development in support of the publication of agriculturally relevant data.
Several governments in Africa are in the process of developing open data policies specifically for agriculture. In 2017, Kenya held a Ministerial Conference on Open Data for Agriculture and Nutrition, which culminated in the Nairobi Declaration, a 16-article statement on open data policy in agriculture and nutrition.42 The statement was signed by 15 African ministers, who have formed a network to develop policies for their respective countries. Francophone Africa is developing a similar network to support public policy development, the Conférence d’Afrique Francophone sur les Données Ouvertes (CAFDO).43
In 2016, a beta version of an International Open Data Charter Open Up Guide on Agriculture was published,44 setting out a call for all governments to adopt a focus on agriculture within their wider open data policies and providing guidance on policy and practice specifically in the agricultural domain. The full version of the Open Up Guide45 was subsequently launched in 2018 at the International Open Data Conference in Buenos Aires, Argentina.
Funders of agricultural research and development have developed open access policies, although these generally require only open journal publication of the research conclusions without necessarily requiring the underlying data to also be published as open data. Since 2012, the UK’s DFID, the US Agency for International Development (USAID), and the Gates Foundation, among others, have established policies that require their funded researchers to share both research publications and research data under conditions that permit access and reuse.46 However, a review of these policies in 2017 found they lacked clear open data definitions, suggesting a need to strengthen understanding of open data as a distinct concept alongside open access. There is also growing recognition that funded projects need support to understand and apply open data principles to their work, as well as access to technical data infrastructures to ease data publication and sharing. Several initiatives, such as the Gates Foundation-funded Initiative for Open Ag Funding,47 which ran from 2016 to 2018, have explored how to make programmatic data (financial and administrative data about funded programmes) open as well, building on the International Aid Transparency Initiative.48
A large gap also exists in the development of any coherent open data policy or practice among the private sector actors within the agricultural industry, although when put into place, such data policies would likely seek to balance open access with business interests, thereby limiting open data benefits and overall transparency.
The widely cited case of John Deere tractors has become a key reference point in discussions related to data ethics. These “smart machines” not only plough the soil, but also capture vast amounts of data, which, under their “terms of service”, are fed back to John Deere to analyse and exploit with no guarantee of benefits or data going back to the farmer.49 Cases like this50 have helped to spark an emphasis on data ethics in agriculture, exploring perceived power imbalances between farmers and big agribusiness and triggering initiatives, such as the EU Code of Conduct on Agricultural Data Sharing by Contractual Agreement,51 endorsed by hundreds of equipment manufacturers.
Data privacy and security issues relate to the management and use of personally identifiable data, whether it is photographic, geospatial, financial, or demographic. There are many issues and ongoing discussions underway related to the degree of access that industry, government, and research institutions should have to data on the choices (e.g. agricultural practice, land use, product use) that an individual farmer makes. The norm is that data should not be made open when farm and farmer data privacy and security are at risk. There is general acceptance that sensitive data can be made available at times if aggregated, but not at the individual level. Data collectors must make every effort to prevent data breaches and inform farmers how data about them is used.52 One initiative that is now gaining traction in opening up data across agricultural companies, such as tractor companies and farm sourcing corporations, is the Open Ag Data Alliance,53 which has built an open source framework to allow farmers to access and control their own data.
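The idea that sensitive farm data may be released in aggregate but not at the individual level can be sketched in a few lines. The example below groups farm-level yields by district and suppresses any district with fewer farms than a chosen threshold. The field names, the sample records, and the threshold of three farms are assumptions made for illustration; the source does not prescribe a method, and small-group suppression on its own is not a guarantee of privacy.

```python
from collections import defaultdict


def aggregate_yields(farm_records, min_farms=3):
    """Publishable district-level summary of farm-level yields.

    Districts with fewer than `min_farms` farms are left out entirely,
    because an average over one or two farms could reveal individual
    values. The default threshold is an illustrative assumption.
    """
    yields_by_district = defaultdict(list)
    for record in farm_records:
        yields_by_district[record["district"]].append(record["yield_t_per_ha"])

    summary = {}
    for district, values in yields_by_district.items():
        if len(values) < min_farms:
            continue  # suppress small groups rather than publish them
        summary[district] = {
            "farms": len(values),
            "mean_yield_t_per_ha": round(sum(values) / len(values), 2),
        }
    return summary


# Invented farm-level records; farm ids never reach the published summary.
farm_records = [
    {"farm_id": "f1", "district": "North", "yield_t_per_ha": 2.0},
    {"farm_id": "f2", "district": "North", "yield_t_per_ha": 3.0},
    {"farm_id": "f3", "district": "North", "yield_t_per_ha": 4.0},
    {"farm_id": "f4", "district": "South", "yield_t_per_ha": 5.5},
]

published = aggregate_yields(farm_records)
```

Here the single farm in "South" is withheld, while "North" is published as a count and a mean. In practice the threshold and the level of geographic detail would need to be agreed with farmers and their organisations.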
Data ownership and legal rights issues are a difficult and complex component of the data ethics debate within the agriculture domain.54 If data is to be increasingly made open by default, the sector would benefit from improved clarity around legal data ownership and governance frameworks. Legal issues that affect access to, and the use of, data at the international, national, and subnational level include copyright, database rights, technical protection measures, trade secrets, patents, plant breeders’ rights, privacy, and even tangible property rights.55 Within the sector, there is general agreement that farmers should steward their own data and that legal frameworks should be transparent, but the discussions are complex,56 and many worry that more stringent mechanisms around farm data ownership could hurt innovation.57
Responsible data relates to employing data in ways that do not increase power imbalances. Careful examination of context can result in data being opened, shared with a chosen group, or kept closed.58 Governments may publish data to improve accountability, as a policy instrument or as a service to citizens, especially if collection has been paid for by taxes. The Open Data Charter59 encourages governments to make their data “open by default” for this reason, but accepts that there may be cases when data cannot be opened.
There is growing recognition in the field that to release data responsibly, the effects on vulnerable communities, especially women, Indigenous populations, and migrant workers, must be considered.60 The sensitive information at issue in this case is not always personally identifiable information, but rather knowledge that, if made open, may allow others to profit from it to the detriment of others. For example, if data released indicates that women are managing or using land without obtaining the legal rights to do so, external actors may undertake to gain control of the land at the expense of the women.61 Trust between stakeholders around appropriate data responsibilities is important, but little guidance currently exists on best practices.
Preliminary work on issues of privacy, responsible data, and data ownership in agriculture has been carried out, and numerous farm organisations, manufacturers, and other entities have expressed interest in participating in further conversations around data ethics to build a new consensus, especially as it pertains to smallholder rights.62 This work is still at an early stage.
While smallholder farmers could benefit significantly from open data-driven knowledge on when and where to plant and harvest, and what current market prices are, at present it is highly resourced stakeholders who appear to be the primary beneficiaries of open data in agriculture. To ensure all stakeholders have the technical resources, knowledge, and capabilities to collect, publish, or reuse open data, efforts over the last few years have sought to overcome major capacity gaps among governments, data intermediaries, and farmers. For example, the GODAN Capacity Development Working Group and GODAN Action host webinars and provide a conversation space for those exploring how to use open data to create benefit for themselves or their organisations.63
Early learning from the field is showing that forming relationships among organisations and individuals, building trust, and ensuring a high diversity of stakeholders are all important in moving from awareness of open data to implementation of new business models and data use strategies. Researchers, governments, donors, NGOs, and farmers’ organisations have all discussed trust as an essential component of capacity development and willingness to commit to open data in agriculture.64 Evidence shows, however, that digital skills, including access to technology, access to the internet, and even simple word processing and spreadsheet management skills, are lacking in rural farming areas, especially in developing countries, and among women and vulnerable communities. To seek to address these issues, CTA has invested in IT capacity development efforts and e-Learning specifically for women and girls.65 As mobile phones are increasingly available in developing countries, advocates expect that skills will increase, especially in rural agricultural areas. However, it is also anticipated that more capacity development efforts will be needed to ensure that all farmers can access, use, and share open data, including through the use of mobile platforms.
As we have seen, agriculture is diverse, as is the potential for applying open data to support a range of activities in the sector, from providing remote sensing data for precision agriculture applications to bringing farming extension advice to smallholder farm owners. Although the stakeholders may look very different, overarching sector goals remain mostly unchanged: to grow nutritious food as efficiently as possible, balanced with the need to secure the basic livelihood of people everywhere, using successful business models. As outlined in this chapter, the burgeoning ecosystem for open agricultural data is only beginning to address a myriad of issues as evidenced by the series of discussions that took place at the GODAN Summit in 2016 (see Figure 4). In the light of a growing world population and ever-increasing pressures on resources, we need technological improvements and innovative approaches in many areas of agriculture and nutrition to meet this goal, and data will be central to that effort.
To date, the private sector has shown only minimal interest in publishing its data openly for reuse. A much greater emphasis on incentives and business models that encourage the release of open data at all levels of agricultural value chains is necessary. Both researchers and companies need to undergo a cultural shift from closed and proprietary to shared and open, recognising the value of open data in promoting innovation, cost-sharing, and improved value chain efficiencies. The extent to which the FAIR principles have caught on, at least in the rhetoric of the sector, is encouraging and highlights the value of communicating open data ideas as part of a broader normative agenda for advancing agriculture.
In 2018, the meaningful sharing of useful (both anonymised and identifiable) on-farm data was often curtailed by legitimate privacy concerns raised by farmers and their organisations, or by farm machinery and farm management systems that operate in a proprietary space. The open data community needs to increasingly involve stakeholders who are trusted by farmers, such as farm cooperatives, in order to promote innovation using on-farm data. Right now, for those wanting to innovate with data, obtaining large satellite datasets from governments or agents who have already adopted an open data policy is a lot easier than opening up on-farm data or nutrition data from surveys. Yet inclusive innovation also requires remote sensing to provide ground-truth data, highlighting the need for ongoing efforts to secure granular data about farms with the acceptance and support of farmers and their communities. As the agri-food industry increasingly needs a “licence to operate” from the public, it has begun to release more data on its sustainability performance. Early examples of this data publication and of the private sector’s involvement in tracking SDG progress are promising in that regard. In addition, open data for agriculture has focused almost exclusively on food security and has, thus far, neglected textiles and forestry, which bear a large environmental cost and should be priority areas for future focus.
The seeds are sown for the growth of open data in agriculture, but, as yet, the evidence of lasting impact is limited. Creating the right ecosystem will need more than awareness raising. It will require all stakeholders to grapple with challenging ethical issues by turning debates and discussions into consensus, capacity development, guidance, and common approaches that can be deployed at scale.