Why we invested in building an equitable data economy

By Sushant Kumar, Principal, Omidyar Network

A version of this post first appeared on LiveMint.com on August 13.

“Turn left at the next exit, in 200 meters you will reach your destination”, the voice on Google Maps running on her phone dutifully informed Meera Kumar. This was the last turn that she needed to take before reaching her ophthalmologist. As she navigated her car through her city’s notorious traffic, Meera marvelled at the convenience of the applications on her phone. Google Maps had saved her at least fifteen minutes by guiding her to the fastest route. It was perfect. She wondered how such an amazing utility was available for free. In fact, Meera had made a couple of voice calls in the morning, messaged her friends, posted a photo of her morning coffee, and had sought an appointment with an ophthalmologist, whom she had discovered online. All without paying a single dollar.

We have come to expect services on the Internet to be available for free. The top two companies providing a vast majority of these free services, Google and Facebook, rank number four and six in the list of the world’s top companies by market capitalization. It is well known that Google and Facebook make money from digital advertising. The advertisers pay for the Internet user’s attention and engagement, mainly through clicks. This business model utilizes users’ data to make predictions about their behaviour and to target them, with the goal of serving relevant content and advertising at the right time. For example, Meera had earlier seen an ad for an eyewear shop in her neighborhood because she had searched for an ophthalmologist.

Meera’s experience is a simplistic and benign representation of the “age of surveillance capitalism”, a term made prominent by Prof. Shoshana Zuboff from Harvard University through her book, which describes how large Internet companies have built tremendous power and revenue through the collection of data about users. The holy grail of this business model is to “think the thought before you do” and to predict your behaviour. The best way to achieve this is to collect and analyze all the data about your demographics, online searches, purchase history, friends, address, video-viewing patterns and much more, which accurately defines your digital presence and identity. Such deep understanding, combined with non-personal data such as weather and traffic patterns, has helped create viable online businesses. Tech giants in the fields of online advertising, ride sharing, the gig economy and ecommerce all thrive on data and consider it a competitive advantage.

Does this mean our data is the “raw material”, and should we get paid for it? After all, Facebook made a revenue of roughly $30 per user in 2019, as a global average. The answer lies in understanding the unique nature of data and its value, and in enabling tools of greater societal value creation, which are in early stages of theoretical and empirical exploration.

Data is unique in its conception. In order to rationalize policy-making around data, it has been compared with several things. The best known among them is that “data is the new oil”. In addition, data has been compared to labor, property, gold, coal, sunlight, poison, carbon dioxide, and other such comprehensible things. These comparisons seek to establish that data is a resource, which can be utilized for value creation (such as property, coal, oil, gold) or that it can lead to individual and public harms (like carbon dioxide, poison).

There are four characteristics of data and its value that uniquely complicate the task of attributing costs and benefits to it.

· First, from an economic lens, data does not diminish as it gets used, unlike resources such as oil or coal. Many copies of the same data can be made and can theoretically be used an unlimited number of times by several organizations. Data utilized for serving pesky ads can also be utilized for pandemic research.

· Second, data involves benefits or costs imposed on third parties that are not related to the transaction. For example, one person’s daily commute data when aggregated with the wider population’s data will help improve traffic prediction in Google Maps.

However, in some cases, this data can negatively impact individual privacy and well-being if made public. For example, data about individual members of a community can be aggregated to discriminate against an entire community. Data about the social network of an individual can be used to draw inferences about people who choose to keep their data private.

· Third, the value of data can be exponentially enhanced by combining it with other data sets. At the same time, the liabilities and costs associated with handling such data are also likely to increase manifold. Therefore, the understanding of value will need to account for all aspects of individual benefits and harms, societal welfare, and profits for businesses.

· Lastly, in most economic activity, there is an exchange of value; stakeholders create certain value and realize some economic return. Value creation is driven by at least five stakeholders: the technology company, the government, the community, the gig-economy worker, and the individual. The technology company that collects and processes data realizes value through the monetization of data. Individuals and communities receive the economic value of services from tech corporations (such as maps, news, email, networking), governments receive some tax revenues, and gig workers make their wages. However, it is increasingly clear that the current design of the data economy leads to concentration of profits with the technology giants and does not support realization of fair value for people and society.

Some online tools promise to provide a platform for users to sell their data in the open market. To test this, a journalist from Wired, Gregory Barber, conducted an experiment. He became his own data broker and signed up for several applications that promised cryptocurrency in exchange for his Facebook data, location info and health records. After six weeks of dedicated effort, Gregory’s earnings totaled 0.3 cents. This experiment is not a definitive assessment of the value of an individual’s data. In addition, the promise of “selling data” does not account for protecting the rights of users and for minimizing harms to the community.

There is some good news too. There is a growing class of tech services which promise to provide control, empowerment, and a minimization of harms to users. Personal data stores such as Solid, Digi.me, Hub of all Things (HAT) and Meeco promise their users the ability to store and manage permissions to their data. For example, Digi.me, in which Omidyar Network is an investor, allows users to create an encrypted library of their data that they can share with companies such as a loan provider for credit assessment or with insurance providers for a better premium. In addition, some personal data stores also allow users to make their data available securely in exchange for some rewards. These efforts are in early stages of development, maturity, and discovery of business models. However, they do offer the promise of decentralizing control of data away from the large tech players and of making more data available for generating public value. Greater control in the hands of users could lead us towards optimal outcomes, as research from Stanford University suggests. In this research, when users were given control over their data, they optimized for both privacy and consumption by keeping sensitive data private and making other data sets available to multiple companies, thereby improving market outcomes.

However, the responsibility of creating a fair data economy cannot be shouldered only by individuals’ actions. Individuals are usually not in a position to negotiate their rights and economic gains. First-time Internet users, especially in emerging markets, are increasingly transacting online and are vulnerable to harms. Omidyar Network is interested in exploring select policy ideas that can address the systemic flaws and nudge the data economy towards equitable outcomes for all stakeholders.

1. A progressive tax levied on monetization of data can serve to extract some value for the purpose of redistribution. Nobel Prize-winning economist Paul Romer has proposed that the revenue generated from the monetization of data should be subject to a progressive tax, similar to an excise tax, that can serve to disincentivize collection and monetization of user data. While the original proposal intends to minimize surveillance and not boost revenue generation, this tool can be explored for its efficacy in fixing systemic flaws and in aiding redistribution of economic gains. The progressive nature will ensure that early-stage innovators are not disadvantaged.

2. The idea of giving every citizen a “data dividend” is similar to universal basic income, albeit focused on data-related economic activity. This idea was proposed by California Governor Gavin Newsom and has been recently elevated by Andrew Yang’s Data Dividend Project. While the idea holds promise, the articulation of “data as property” is simplistic and fraught with deficiencies. A credible action plan for policy and political interventions is needed to make progress.

3. Most importantly, catalyzing greater societal value creation, such as finding a cure for cancer using data, can create immense value for societies, thereby increasing the size of the pie for everyone. The European Union’s data strategy report and the non-personal data (NPD) committee’s report in India have emphasized unlocking the value of data for society by enabling greater access to community data sets. Responsible stewardship of data through mechanisms such as cooperatives, trusts, exchanges, personal data stores, and account aggregators will ensure the right safeguards and enable greater data sharing by people. Individuals may be open to sharing health data for research, without expecting any monetary compensation, when offered a safe mechanism that prevents misuse.

In the last year, Omidyar Network has begun digging deeper into these issues and other areas through our support of researchers, startups, and industry experts willing to reimagine the social contract for data. Below are some examples of the organizations and individuals helping to advance this work globally:

· Aapti Institute — a global research institution that generates public, policy-relevant, actionable and accessible knowledge from the frontiers of tech and society, about our networked lives, to support the creation of a fair, free, and equitable society. With our support, they have been building a knowledge base for good data stewardship, much of which is housed at the Data Economy Lab.

· IT for Change — a nonprofit based in Bengaluru, India, which aims for a society where digital technologies contribute to human rights, social justice, and equity. Our support enables their policy research on data economy across nine countries in the global south, including examining how data policy frameworks balance the competing, and sometimes conflicting, interests of different stakeholders; identifying areas where digital startups can challenge the incumbents with novel and responsible business models; and ensuring privacy and personal data protection of individuals.

· Indraprastha Institute of Information Technology, Delhi (IIIT-D) — a research-oriented state university located in Delhi, India, with a focus on computer science and related fields. We are supporting the development of a data trust model, including governance and legal frameworks, to better steward the publicly available data about buses.

· UCL — The Institute for Innovation and Public Purpose is changing how public value is imagined, practiced, and evaluated to tackle societal challenges. With grant support, Mariana Mazzucato and Tim O’Reilly are leading research on economic rents, algorithmic monopolies, and competition policy to inform the global digital and data policy discussions.

· Richard Whitt — a former Googler with a passion for making the open Web a more trustworthy and accountable place for human beings. Richard is currently writing a series of essays highlighting various aspects of data governance presented by society’s encounter with COVID-19 and encouraging digital fiduciaries to help govern sensitive data-centric ecosystems, both now and in the post-pandemic world.

· Yale School of Management — The Tobin Center for Economic Policy supports research that offers a “counterweight to pure ideology; bears directly on questions of public policy; brings together research methods from a variety of methodological perspectives; and is shared effectively with opinion-shapers and policymakers”. In our collaboration, the center will focus on analyzing, translating, and recommending policies that promote equitable and fair data markets.

On balance, there are complex trade-offs associated with the data economy. We can all agree that individuals have a right to ask for a better bargain, a greater share of value, and the minimization of harms. The road to an equitable data economy is undoubtedly tied to questioning the structural dominance of the large tech enterprises.

Omidyar Network is a social change venture that reimagines critical systems, and the ideas that govern them, to build more inclusive and equitable societies.