COVID-19: The Devil in the Data

The world is fast approaching 1.5 million confirmed infections. The United States alone accounts for about 400,000 cases. By contrast, India, whose population of 1.3 billion is more than four times that of the US, has only a little over 5,000. In between are a handful of European nation states, such as Spain, Italy, France, and Germany, each with more than 100,000. Then come China, Iran, and the UK, with 50,000 to 80,000 cases. In total, 209 countries are affected. Total fatalities, now approaching 100,000, make for even more depressing reading.

Numbers such as these are central to how we make sense of COVID-19. They present a picture of its global spread, help us make comparisons, and allow us to express our concern for others, even as we calibrate responses to safeguard our own health. But such numbers can also create a false sense of comfort by suggesting we know more than we do. Consider recent reports that some estimates exclude cases that are confirmed but asymptomatic. Such cases, other reports indicate, may comprise a quarter of all infections. Or consider that low infection numbers in many countries may be driven by faulty models that suggest no community spread and, therefore, a lack of urgency when it comes to testing.

In the absence of transparency about our data practices, numbers obfuscate as much as they illuminate. They allow states to justify the adoption of specific, sometimes draconian, measures and to make claims about their efficacy after the fact. They also provide fertile ground for disputes about the intentions of states and sub-state actors, serving to ratchet up domestic and international tensions, as accusations of culpability for actions real and imagined get bandied about. Data is clearly neutral in the most partisan of ways.

Data transparency typically consists of two related elements: availability and commensurability. What good is data if we hide it from others? But availability alone is not enough.
The data must also be commensurable (that is, derived using a set of common standards). In our present context, making data about infected cases available has only limited value if countries use different standards to define confirmed infections or if they choose not to universally test symptomatic individuals.

A local resident waits for the bus as the spread of the coronavirus disease continues, in New Orleans, Louisiana, April 7, 2020. Photo: Reuters/Carlos Barria

There are a number of reasons why both data availability and data commensurability have proven to be enduring challenges to global society. We can list them under three “I”s: Interests, Incapacity, and Incompetence. Interests may be national, institutional, or personal. Some states may under-report data to avoid embarrassment on the international stage, others so as not to compromise their legitimacy domestically. In some instances, data secrecy may also serve the interests of maintaining public order. Institutions and individuals too can act parochially, driven by incentives such as access to funding, influence, or some other form of advancement.

If interests suggest motivation, incapacity suggests structural issues. An incapacity to collect and disseminate data may stem from a lack of technical knowledge about best practices. It may also stem from a paucity or misallocation of resources. Finally, we can never exclude from any explanation the role of outright incompetence. All too often, what we encounter in the real world is a combination of all three. Interests, incapacity, and incompetence make for a heady cocktail, and every society exists along a spectrum, from the capable to the comical to the catastrophic.

Our quest for transnational data transparency is an old one. Shortly after the First World War, the League of Nations established an International Statistical Commission with the aim of sharing data across borders.
In the interwar years, the economist Simon Kuznets helped create what remains perhaps the most influential of statistically commensurable measures: Gross Domestic Product. Decades later, as the world emerged from the cumulative destruction of two world wars and a global depression, UN Secretary-General Trygve Lie renewed the call for commensurable data. In a speech delivered in 1947, Lie observed, “we cannot cure our troubles unless we know in the first place what those troubles are. Likewise, we cannot achieve international understanding… unless the peoples of the world are given the facts about each other.”

Only “clear and systematically organised facts,” Lie explained, could be “relied upon to measure resources and potentialities for progress and to direct policies and actions designed to achieve the objectives of all civilised people.”

In the years since, the UN has been at the heart of a range of efforts to generate commensurable data. Its System of National Accounts (SNA), first created in 1953, provides a common rubric through which to track and measure economic activity. The Human Development Index (HDI), around since 1990, offers a more comprehensive and representative assessment of individual well-being than GDP. But it is the World Health Organization (WHO), the UN agency tasked with dealing with global public health, that best encapsulates both the strides made and the struggles that remain. As the world’s apex public health institution, it plays a crucial role in coordinating global responses to pandemics like COVID-19. Yet, as an agency with no autonomous authority to collect data, it is entirely dependent on the transparency of member countries. And therein lies the rub. Witness its frequently confused statements on COVID-19, as it grapples with the incomplete and self-serving data provided by nation states.

To be sure, the fault lies not in the data, but in us.
Instead of valuing data transparency, we have permitted states and corporations to weaponise data and use it to extract from us our labour and our obedience. And so, instead of data transparency, we inhabit a world of data tyranny. The novel coronavirus has laid bare these fault lines, accentuating troubling global trends. The dangers are manifold, none greater than the growing threat of hyper-nationalism. The United States under Trump has taken great pains to use the label “Chinese virus” or “Wuhan virus”. Mike Pompeo’s insistence that the latter term be used in a joint communique of the G-7 resulted in a breakdown in negotiations. The PRC has responded by blaming the US for sending the virus to China in the first place. Elsewhere, such as in Hungary, leaders are using the pandemic as an excuse to arrogate dictatorial powers to themselves. An ugly corollary of such heightened jingoism is the growth of overt racism and mutual suspicion everywhere. Witness the increased incidents of anti-Chinese racism in India and the United States.

Hyper-nationalism thrives in a climate of data secrecy. It is also destructive. The twentieth century bears witness and offers warning. The hardening of physical and mental boundaries and the concomitant waning of mutual trust will also hamstring efforts to better understand COVID-19. It will lead to delays in assessment and in devising suitable remedies. The cost in lives and livelihoods, whether measured in financial, physical, or emotional terms, will be colossal. It will also be borne disproportionately by the most precarious among us: the poor, the minorities, and the elderly.

We may yet hope that the fault lines that now lie exposed will precipitate timely introspection. Perhaps we will course-correct and institute systemic and structural changes to our politics and our economics.
As the Bretton Woods consensus and the more recent neoliberal turn it facilitated unravel before our eyes, we can dream again of a more equitable, less environmentally destructive, and friendlier world. Whether we get there or not remains to be seen.

But at the very least, COVID-19 should help us recognise that we cannot set out down that path without data transparency, without, in Lie’s words, “the facts about each other”.

Arunabh Ghosh is a historian of modern China and teaches at Harvard University. He is the author of Making It Count: Statistics and Statecraft in the Early People’s Republic of China (Princeton, 2020).