On July 12, 2020, a committee of experts established by the government of India proposed a regulatory framework for ‘non-personal data’, which may well define the future of India’s digital policy.
The expert committee’s draft report has proposed an entirely new regulation for ‘non-personal data’, entailing an attempted definition of ownership rights over data with ‘communities’, and establishing a regulatory authority to make rules about data governance and use.
Non-personal data, such as data about the environment, production processes, or geospatial information, holds both public and economic value, but its collection and use can equally produce collective harms.
Take, for instance, the increasing ‘datafication’ of agriculture, in which data is collected about agricultural processes such as milk production, soil conditions, or pesticide use. This type of data is rarely about specific persons, and therefore does not fall under the scope of personal data protection law. However, who gets to collect, share and use agricultural data matters. Farmers could utilise such data to grow crops more efficiently, limit pesticide use and move away from monocultures. However, the same data can be collected from farmers without their knowledge or consent, and then used to monitor a farmer’s ability to pay for inputs, lock them into services and machinery, or manipulate commodity markets. In short, it can further increase a farmer’s dependence on suppliers of seeds, fertilizers and pesticides.
How then, do we ensure that the economic and societal value of this data is unlocked, while safeguarding against the various harms to the communities that relate to the data?
One of the highlights of the Expert Committee’s report is that it clearly recognises the need to address the imbalances in power between those collecting the data and those most affected by it, and promises to foreground the interests of communities. However, despite its seemingly radical agenda, it ultimately fails to adequately address the source of inequities in the digital economy, or justify the forms of regulation that have been proposed.
The Committee’s recommendations stem from its equation of data with a resource or commodity, one best governed through the allocation of ownership rights and by promoting its most economically efficient use. The Committee’s recommendations are focused on ‘unlocking’ value in existing forms of data, primarily through mechanisms for data sharing between businesses and government. This is an explication of the common refrain that ‘data is oil’ and a valuable commodity with economic potential.
However, this conception of data is limiting in many respects. First, it ignores that ‘data’ is not merely something that is ‘captured’ by existing data businesses or technology providers, but rather reflects the motives, assumptions and biases of the people and the firms which collect and utilise this information. The processes and business models of ‘datafication’ are mired in a reductionist conception of people and spaces as resources and commodities.
The top-down approach taken by the Committee is insufficient to challenge this problem. Bottom-up approaches must consider the technical standards, technologies and the institutions which collect and utilise information, and establish agency over each of these elements.
Second, defining rights in data as those of ‘ownership’ can limit the range of interests which can be reflected in data. In particular, we need to look beyond economic interests and also consider the various other rights or interests that may be involved in different forms of information. In the case of agricultural data, for example, we may consider the interests of farmers who could benefit from crop sowing data, or consider environmental interests in developing agricultural practices that are safe and sustainable. Ultimately, these are questions about the governance of data collection and processing, which must be decided by the communities to whom the data pertains.
A policy framework for the governance of data must take into account the overlapping and occasionally contested interests in such governance, which need to be accounted for not only at the level of the individual, but also at the level of collective governance.
Trust, but not blindly
As a solution to the twin problems of collective harm and lack of communal control over data, the Committee proposes to grant communities collective ownership and management rights over their data through institutions known as ‘data trusts’. The generally agreed-upon definition of a data trust is a legal relationship between trustees, who steward data rights; those who hand over their rights for stewarding by the trustees; and the beneficiaries of the trust. As an example: rights over agricultural data could be placed in a trust by the holders of those rights, to be stewarded for the benefit of the farmers (or for general public benefit). The key component of a data trust is the fact that trustees hold a fiduciary duty of loyalty. That is, they may act only in the interest of the beneficiaries.
Institutions like data trusts are an important addition to the conversation on forms of data governance. However, whether communal data rights and data trusts would indeed empower, instead of harm, communities depends in large part on their implementation. The Committee’s recommendations, unfortunately, raise more questions than they answer.
First of all, the report assumes an identifiable, pre-existing community, whose interests can be protected by allocating ownership rights to them. However, contemporary data analysis and automated decision making technologies challenge this notion. When it comes to data, communities could span virtual realms that do not adhere to geographical boundaries. Who then decides what constitutes a community? Is it the members themselves? Or would this decision be made by a central authority?
Another concern is the lack of consideration given to the question of who should be the ‘trustee’ of the data, exercising rights on behalf of the community. From the examples provided by the Committee, these could range from existing government agencies to voluntary groups. However, the Committee fails to account for independence and conflict of interest, which are crucial factors for the governance of a trust. For example, a government agency may wish to access information about a community to fulfil its own responsibilities, creating a conflict with the interests of the community.
Finally, nothing is said about how decisions will be made about how data is collected, shared and used. Data trusts are primarily a legal instrument, which leaves the problem of governance largely undetermined. In our view, a legitimate data trust should be one in which the various voices within a community are heard and decisions made by trustees are transparent and can be challenged by those who are affected by them.
On the whole, while it is laudable that important questions of data governance are being given consideration, the recommendations of the Committee are substantially underdeveloped. The Committee neither adequately assesses the reasons for the inequities in the data economy nor provides a workable framework for community-led governance of non-personal data. A mature policy on non-personal data that truly respects the rights of communities to control whether data relating to them is collected or shared needs to carefully consider questions of community representation, governance and legitimacy, and address questions of agency and self-determination.
Divij Joshi is a Mozilla Fellow working on tech policy in India. Anouk Ruhaak is a Mozilla Fellow working on collective data governance.