VoxEU Column COVID-19

Data needs for shutdown policy

Decisions about whether to clamp down or ease up on social distancing hinge on how deadly and widespread the novel coronavirus is. But as this column discusses, neither is known because tests for the virus have focused on those showing severe symptoms and those at high risk. If the virus is still not widespread, then it is deadly and there is still time to implement measures – more severe than those currently in place in the US – to suppress it until a vaccine or treatment becomes available. If the virus is widespread, then the true death rate is low and cautiously opening up the economy becomes an option. Data from random testing of the population, which are still unavailable, are critical to informing this choice.

COVID-19 has thrust upon policymakers the nightmare problem of how best to save lives while limiting long-term economic damage arising from economic shutdown, social distancing, and the collapse of demand in many sectors. All available options are bad, but some are worse than others. The problem is, we don’t know which options are the least bad because we don’t know the true mortality rate or how prevalent the virus is in the population. 

SIR models

The workhorse epidemiological model is the so-called susceptible-infected-recovered (SIR) model. The SIR model describes the dynamics of an epidemic (Baldwin 2020). Those who have not had the disease (the susceptible) can contract it through contact with infected individuals; the infected either recover or die. The SIR model, augmented with mortality, elegantly describes complicated epidemiological dynamics with only three parameters: the contagion rate, the recovery rate, and the death rate among the infected. The model implies a single summary measure, R, the case reproduction number, which is the number of follow-on cases produced by a single case. Thus, the SIR model, augmented to account for mortality, traces out the path of the disease from its earliest cases to the point that it is either extinguished or immunity (either herd immunity or vaccination) is achieved. The SIR model, like any model, is a simplification, but it is a useful one that underlies nearly all the epidemiological projections of the path of COVID-19.
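The dynamics described above can be illustrated with a minimal sketch of a textbook SIR model augmented with deaths. This is not the code behind any projection cited in this column. The function names, the symbols beta, gamma and delta, the Euler time step, and all parameter values are assumptions chosen for illustration only.

```python
def simulate_sir(beta, gamma, delta, i0=1e-4, days=365, dt=0.1):
    """Euler-integrate a textbook SIR model with deaths (population shares).

    Assumed equations (not taken from the column):
        dS/dt = -beta * S * I
        dI/dt =  beta * S * I - (gamma + delta) * I
        dR/dt =  gamma * I
        dD/dt =  delta * I
    Returns a list of (day, S, I, R, D) tuples.
    """
    s, i, r, d = 1.0 - i0, i0, 0.0, 0.0
    path = [(0.0, s, i, r, d)]
    steps = int(round(days / dt))
    for k in range(1, steps + 1):
        new_infections = beta * s * i * dt   # susceptible -> infected
        new_recoveries = gamma * i * dt      # infected -> recovered
        new_deaths = delta * i * dt          # infected -> dead
        s -= new_infections
        i += new_infections - new_recoveries - new_deaths
        r += new_recoveries
        d += new_deaths
        path.append((k * dt, s, i, r, d))
    return path


def basic_reproduction_number(beta, gamma, delta):
    """Follow-on cases per case when nearly everyone is still susceptible."""
    return beta / (gamma + delta)


if __name__ == "__main__":
    # Hypothetical daily rates, chosen only to illustrate the mechanics.
    beta, gamma, delta = 0.25, 0.099, 0.001
    day, s, i, r, d = simulate_sir(beta, gamma, delta)[-1]
    print("R0:", basic_reproduction_number(beta, gamma, delta))
    print("share ever infected after one year:", 1.0 - s)
    print("share dead after one year:", d)
```

In this sketch the reproduction number falls over time as the susceptible share shrinks, which is how the model reaches herd immunity without any policy intervention.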

In a series of papers over the second half of March 2020, a number of economists have looked at the challenges of projections using SIR models (Atkeson 2020a, 2020b, Berger et al. 2020, Eichenbaum et al. 2020, Malkov 2020, Stock 2020). (The economists aren’t trying to muscle in on the epidemiologists; rather, if there ever was an all-hands-on-deck moment, this is it.) One would be forgiven for thinking that fitting an SIR model is simple: three unknown parameters (the transmission rate, the recovery rate, and the death rate) and three equations (infections, recoveries, deaths). The trouble is, for the coronavirus we don’t have data on infections or recoveries because there has not been the testing needed to estimate the infection rate in the overall population. In fact, even the data on deaths might not be highly reliable, especially early in the pandemic when deaths might have been attributed to flu or some other condition. Without data for at least two of the equations, we have one equation and three unknowns.
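One way to see the identification problem is to write out the early phase of a textbook SIR model with deaths. The notation here is an assumption, not taken from the papers cited: \beta is the transmission rate, \gamma the recovery rate, \delta the death rate among the infected, S and I the susceptible and infected shares, D cumulative deaths, and I_0 the unknown initial infected share.

```latex
\begin{aligned}
\frac{dI}{dt} &= \beta S I - (\gamma + \delta)\, I \approx (\beta - \gamma - \delta)\, I
  && \text{(early phase, } S \approx 1\text{)} \\
I(t) &\approx I_0\, e^{(\beta - \gamma - \delta)\, t} \\
\frac{dD}{dt} &= \delta\, I(t) \approx \delta I_0\, e^{(\beta - \gamma - \delta)\, t}
\end{aligned}
```

Under these assumptions, early deaths data pin down only the growth rate \beta - \gamma - \delta and the product \delta I_0. A deadly virus with few infections and a mild virus with many infections generate the same early path of deaths.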

Because tests for the virus have been in limited supply, they generally have been targeted to individuals who meet strict guidelines that combine severe symptoms, being in a high-risk group, or being a health care worker or first responder. As a result, those tested are more likely to have the virus than the general population, so the positive testing rates to date cannot be readily generalised. At the same time, there are some individuals, possibly many, who have or have had the virus but did not meet stringent guidelines for getting tested. Sometimes this is called the asymptomatic rate, although more precisely it is the fraction of infected who do not meet testing guidelines.

Sensitivity of projections to the asymptomatic rate

It turns out the asymptomatic rate is critical for projecting the epidemic dynamics and the policy response. Figure 1 shows the epidemic dynamics emerging from a simple SIR model for a particular policy path meant to approximate the current US shutdown, left in place through mid-June and then slowly lifted through the end of October, when social distancing returns to normal. In the top panel, the asymptomatic rate is low at 30% (the estimate of the asymptomatic rate in Nishiura et al. (2020) for 565 Japanese nationals evacuated from Wuhan, all of whom were tested). In the bottom panel, the asymptomatic rate is high at 86%, Li et al.’s (2020) estimate of the undetected infection rate in China.

Figure 1 Rates of symptomatic (left axis) and ever-infected (right axis)

Source: Stock (2020, Figure 3)

These simulations are too crude to provide reliable numerical values; the point is to illustrate the sensitivity to the unknown asymptomatic rate, that is, the fraction of infections missed under current testing guidelines.

In the two simulations, approximately three-fourths of the population has been infected by the end of this year; however, the rate of symptomatic infections – those sick enough to qualify for testing under current guidelines – is very different. If the asymptomatic rate is high, the true death rate is much lower, and allowing the virus to spread through the population results in less strain on hospitals and fewer deaths.
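The arithmetic linking the asymptomatic rate to the true death rate can be sketched as follows. It rests on a strong simplifying assumption that is not from the column: all deaths occur among cases that qualify for testing. The function name and the 2% fatality rate among confirmed cases are hypothetical; only the 30% and 86% asymptomatic rates come from the text.

```python
import math


def implied_infection_fatality_rate(case_fatality_rate, asymptomatic_rate):
    """Deaths per infection implied by deaths per confirmed case.

    If a share `asymptomatic_rate` of infections is never tested, confirmed
    cases are only (1 - asymptomatic_rate) of all infections, so the death
    rate per infection shrinks by that same factor.
    """
    if not 0.0 <= asymptomatic_rate < 1.0:
        raise ValueError("asymptomatic_rate must be in [0, 1)")
    return case_fatality_rate * (1.0 - asymptomatic_rate)


if __name__ == "__main__":
    hypothetical_cfr = 0.02  # illustrative value, not an estimate
    low = implied_infection_fatality_rate(hypothetical_cfr, 0.30)
    high = implied_infection_fatality_rate(hypothetical_cfr, 0.86)
    print("implied death rate per infection, 30% asymptomatic:", low)
    print("implied death rate per infection, 86% asymptomatic:", high)
    print("ratio:", low / high)
```

Whatever the fatality rate among confirmed cases, moving from a 30% to an 86% asymptomatic rate cuts the implied death rate per infection by a factor of five under this assumption, since 0.70 / 0.14 = 5.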

Policy without data

One could argue that the simulated policy underlying Figure 1 is suboptimal – less bad than the other bad alternatives – in both cases. While more work is needed to firm up this argument, it is useful to consider the two situations separately.

First, suppose that the asymptomatic rate is low. Then this ‘protracted status quo’ policy will lead to very many deaths, perhaps in the millions, and it could be preferable to take very strong action now to stamp out the virus, avoid those deaths, and wait until a vaccine is available. In the jargon of the model, doing so requires bringing R below 1 so that the rate of new infections (the attack rate) drops to zero.
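Why R = 1 is the threshold can be seen in a deliberately crude generation-by-generation sketch. It ignores the depletion of susceptibles and all randomness, and the function names and the values of R are hypothetical.

```python
def cases_by_generation(r, generations, seed=1.0):
    """New cases in each generation when every case causes r follow-on cases."""
    return [seed * r**n for n in range(generations + 1)]


def total_chain_size(r, seed=1.0):
    """Total cases eventually caused by `seed` initial cases, valid for r < 1."""
    if not 0.0 <= r < 1.0:
        raise ValueError("the chain only dies out when 0 <= r < 1")
    return seed / (1.0 - r)


if __name__ == "__main__":
    print("R = 0.5:", cases_by_generation(0.5, 3))  # shrinks toward zero
    print("R = 2.0:", cases_by_generation(2.0, 3))  # doubles each generation
    print("total cases per seed case at R = 0.5:", total_chain_size(0.5))
```

With R below 1 each generation of cases is smaller than the last and the outbreak peters out; with R above 1 it grows until immunity in the population slows it.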

Estimates from Wuhan (Wang et al. 2020) suggest that it is possible to bring R below 1. Wuhan’s measures included mandatory relocation of confirmed cases, presumptive cases, cases with fever, and close contacts to hotels, dorms, or pop-up hospitals; stringent social isolation measures; and mandatory thorough follow-up contact tracing. Wuhan’s restrictions were more stringent than those currently in place in Italy, where R is estimated still to exceed 1 (Abbott et al. 2020). In Singapore, mandatory contact tracing includes using a smart phone app that collects data on who else’s phone you have been close to and for how long, with jail for those who do not cooperate. Other tracing technology includes applying facial recognition software to security camera data. Is America ready for such measures?

Next, suppose that the asymptomatic rate is high. Then the protracted status quo policy has fewer deaths, but it keeps the US economy at its current state of partial shutdown into the early fall. The economic costs of so doing are enormous: the US unemployment rate is probably already at 8-10% and is likely to rise as demand falls further, and private forecasters are looking at second-quarter declines in GDP of an unprecedented 18-34% at an annual rate. The $2+ trillion relief package would need to be renewed, perhaps multiple times, saddling future generations with debt. With a high asymptomatic rate, reopening sooner and thus achieving herd immunity sooner would forestall immense economic damage.

Which course should we steer? Well, it depends in the first instance on the prevalence of the virus in the population, specifically the fraction of the infected that are ineligible for testing under current protocols. 

This asymptomatic rate, and with it the true infection rate, could be readily estimated through random sampling of the population. There are some technical difficulties – for example, we would not be able to force compliance – but economists and statisticians have developed a powerful toolkit for providing incentives to individuals to participate in experiments and to correct for the induced problem of sample selection.
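As a rough sketch of what such an estimate involves, the code below computes a prevalence estimate with a Wilson score interval from a simple random sample and backs out the implied share of infections missed by current testing. It assumes full compliance and a perfectly accurate test, so it sets aside the selection corrections mentioned above. The function names and all numbers are hypothetical.

```python
import math


def wilson_interval(positives, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n <= 0 or not 0 <= positives <= n:
        raise ValueError("need n > 0 and 0 <= positives <= n")
    p_hat = positives / n
    denom = 1.0 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


def undetected_fraction(confirmed_share, true_prevalence):
    """Share of infections that never appear among confirmed cases."""
    if not 0.0 < true_prevalence <= 1.0:
        raise ValueError("true_prevalence must be in (0, 1]")
    return 1.0 - confirmed_share / true_prevalence


if __name__ == "__main__":
    # Hypothetical survey: 5,000 randomly sampled people, 100 test positive.
    positives, n = 100, 5000
    prevalence = positives / n
    low, high = wilson_interval(positives, n)
    print("estimated prevalence:", prevalence, "interval:", (low, high))
    # Hypothetical: confirmed cases amount to 0.5% of the population.
    print("implied undetected fraction:", undetected_fraction(0.005, prevalence))
```

Because prevalence enters the denominator, the uncertainty in the sample estimate carries over directly to the implied undetected fraction, which is why sample size and representativeness matter.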

The real difficulties, however, appear not to be technical but political, or more precisely a conflict over whether scarce testing resources should be targeted exclusively to the sick who might be helped, or whether they should also be deployed to achieve broader public health and economic goals. Even calls for widespread testing as a surveillance system (e.g. Gottlieb et al. 2020) tend to focus on identifying cases as they arise and contact tracing, not on the widespread random testing needed for epidemiological modelling. Wide-scale testing has been rolled out in Iceland, but participation is voluntary and, not surprisingly, the early volunteers do not appear to be representative of the population (40% of the volunteers have cold symptoms). Norway and Germany are rolling out large-scale testing, and with proper design the results could be informative.

Another type of testing is for antibodies to the virus; this testing can be used to estimate the true rate of recovered individuals in the population. Here too, random testing is critical to get an estimate of the recovered rate.

Until a month ago, it was common for economists to feel awash in massive new data sets, such as credit card records, minute-by-minute electricity use data, and prices of millions of goods scraped from the Web. Now, decisions that could save millions of lives or prevent an economic catastrophe with effects that will ripple for decades hinge on the lack of data to estimate a single parameter – how widespread this virus really is.


References

Abbott, S et al. (2020), “Temporal variation in transmission during the COVID-19 outbreak in Italy,” CMMID Repository.

Atkeson, A (2020a), “What will be the economic impact of COVID-19 in the US? Rough estimates of disease scenarios,” NBER Working Paper 26867.

Atkeson, A (2020b), “How Deadly is COVID-19? Understanding the Difficulties with Estimation of its Fatality Rate,” manuscript, UCLA.

Baldwin, R (2020), “It’s not exponential: An economist’s view of the epidemiological curve,” 12 March.

Berger, D, K Herkenhoff, and S Mongey (2020), “An SEIR Infectious Disease Model with Testing and Conditional Quarantine,” Becker-Friedman Macro Finance Research Program Working Paper 2020-25.

Eichenbaum, M, S Rebelo, and M Trabandt (2020), “The macroeconomics of epidemics,” NBER Working Paper 26882.

Li, R et al. (2020), “Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV2),” Science.

Malkov, E (2020), “COVID-19 in the US: Estimates of Scenarios with Possibility of Reinfection,” manuscript, University of Minnesota and Federal Reserve Bank of Minneapolis.

Mizumoto, K et al. (2020), “Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the Diamond Princess cruise ship, Yokohama, Japan, 2020,” Eurosurveillance 25(10).

Nishiura, H et al. (2020), “Estimation of the asymptomatic ratio of novel coronavirus infections (COVID-19),” International Journal of Infectious Diseases, forthcoming.

Pueyo, T (2020), “Coronavirus: The Hammer and the Dance,” Medium, 19 March.

Qiu, J (2020), “Covert coronavirus infections could be seeding new outbreaks,” Nature News, 20 March.

Roser, M, H Ritchie, and E Ortiz-Ospina (2020), “Coronavirus Disease (COVID-19) Statistics and Research,” Our World in Data.

Stock, J H (2020), “Data Gaps and the Policy Response to the Novel Coronavirus,” NBER Working Paper 26902.

Wang, C et al. (2020), “Evolving Epidemiology and Impact of Non-pharmaceutical Interventions on the Outbreak of Coronavirus Disease 2019 in Wuhan, China,” 6 March.
