We know that goods that are expensive in some countries are cheap in others, and that accounting for these relative price differences is critical if we want to measure the wealth of nations properly. If all commodities were perfectly traded across borders, there would be a single world price for each. But some of the most important commodities are not traded across space (ranging from haircuts to medical services), and so converting local prices into a single currency at market exchange rates does not solve the problem.
The computation of purchasing power parities (PPPs) to take these differences in prices between countries into account has been one of the most celebrated successes of economic measurement. Currently, PPP adjustment is employed by the most widely used sources of cross-country data, including the World Development Indicators and the Penn World Tables. The creation of what economists consider to be reliable PPP adjustment has fuelled research into cross-country growth (Barro 1991, and the large number of papers that followed) as well as the measurement of the world distribution of income (Chen and Ravallion 2010, Sala-i-Martin 2006).
Concerns about purchasing power parity
There are, however, important concerns about PPP adjustment as it stands.
- Does adjusting for PPP provide better estimates of the underlying economy than using market exchange rates? This is especially relevant for poor countries. PPP adjustment attempts to improve on market exchange rates by capturing the prices of nontraded goods, but it introduces other biases. Among these are the inability to measure the difference in quality between goods using prices, and the mechanical overvaluation of consumption bundles because the relative prices used for valuing the bundles differ from the transacted prices. Dowrick and Akmal (2005) argue that these biases are serious enough that we should use alternative methods to compare the economies of different countries, and they develop metrics to do this by combining purchasing power parities and market exchange rate data. Almås (2012), on the other hand, uses survey-based data on food consumption to compare PPP-adjusted estimates of GDP (using the PPP available at the time) with the income levels that would rationalise food shares in typical Engel curves for food. She finds that while rich countries are well described by PPP-adjusted data, poor countries may be better described by estimates of GDP based on market exchange rates.
- Have successive rounds of price surveys generated improvements or deteriorations in our estimates of PPP? Purchasing power parities are computed by the International Comparison Program (ICP) using international price surveys. Until 1996, these largely focused on the developed world. For countries without survey data, prices were extrapolated using a regression-based procedure rather than being based on direct data from these countries. This changed dramatically following the 2005 round of surveys, which was the first to include China and several other large developing countries. The new estimates suggested that prices in developing countries were higher (hence, the economies of these countries were smaller) than was previously thought. This conclusion, in turn, was largely undone after the next round of surveys in 2011. The instability of the PPP estimates does not tell us whether the changes implemented in the 2005 or the 2011 surveys led to progress or regress. Methodological arguments can suggest plausible hypotheses for which price estimates, from which survey, are preferable – but it is difficult to reach firm conclusions without an independent benchmark for what the measurement goal may be.
The volatility of the successive price surveys raises a third question:
- Should we discard past price data when new data becomes available? The standard practice has been to revise both current and all past estimates of GDP once a new ICP price survey has been conducted, essentially throwing out the price data from the previous rounds. But discarding data appears to be suboptimal. Johnson et al. (2013) propose an alternative approach. It assigns prices from each ICP survey to the benchmark year of the survey and interpolates prices between the benchmark years. A version of this approach has now been incorporated into version 8.0 of the Penn World Tables (Feenstra et al. 2015) alongside the more conventional estimates based on the most recent PPP data. While Feenstra et al. caution that national accounts growth rates may be more accurate than the PPP-adjusted growth rates used in the multiple-benchmark approach, it is not clear whether the multiple-benchmark PPPs might outperform the most recent PPPs in a cross section.
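The multiple-benchmark idea can be illustrated with a minimal sketch. It assumes linear interpolation of log price levels between benchmark years; the source does not specify the interpolation rule, and the benchmark values and the function name `interpolate_price_levels` are hypothetical, chosen only for illustration.

```python
import numpy as np

# Hypothetical benchmark years and relative price levels (illustrative, not ICP data).
benchmark_years = np.array([1996, 2005, 2011])
benchmark_price_levels = np.array([0.40, 0.50, 0.46])


def interpolate_price_levels(years, bench_years, bench_levels):
    """Assign each benchmark price to its survey year and interpolate in between.

    Assumption: interpolation is linear in log price levels. Years outside the
    benchmark range would take the nearest benchmark value (np.interp's default).
    """
    log_levels = np.interp(years, bench_years, np.log(bench_levels))
    return np.exp(log_levels)


years = np.arange(1996, 2012)
series = interpolate_price_levels(years, benchmark_years, benchmark_price_levels)

for year, level in zip(years, series):
    print(year, round(float(level), 3))
```

By construction, the interpolated series reproduces each survey's price level in its own benchmark year, so no survey round is discarded. The conventional approach instead extrapolates a single (latest) benchmark to all years.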
Lights as a benchmark comparison
In a recent paper, we address these questions using satellite-recorded night-time lights as an independent benchmark for unobserved true income (Pinkovskiy and Sala-i-Martin 2018), following the methodology of our previous paper (Pinkovskiy and Sala-i-Martin 2016). If we had a measurement of GDP for which the error was uncorrelated with the measurement errors of the different PPP-adjusted GDPs, it would be straightforward to see which set of PPPs was better by comparing both of them to the independent measure.
In Pinkovskiy and Sala-i-Martin (2016), we argue that data on satellite-recorded night-time lights – studied by Elvidge et al. (1997, 1999, 2012) and in economics by Henderson et al. (2012) and Chen and Nordhaus (2011) – can be used to create this independent measurement. While errors in different versions of GDP come from errors in the underlying national accounts data (such as faulty assumptions about economic relationships like input-output tables), or from errors in calculating indicators of PPP between different currencies, errors in the relationship between night-time lights and economic output come from weather and atmospheric disturbances that affect how light is captured by the orbiting satellites.
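A stylised way to write down this independence argument is given below. The notation is an assumption made for illustration and is not taken from the papers: $y_i^{*}$ is unobserved true (log) income in country $i$, $\ell_i$ is (log) night-time lights, $z_{k,i}$ are the competing GDP measures, and $\varepsilon_i$ is assumed uncorrelated with both $y_i^{*}$ and the errors $u_{k,i}$.

```latex
\begin{align*}
\ell_i &= \beta\, y_i^{*} + \varepsilon_i, \\
z_{k,i} &= y_i^{*} + u_{k,i}, \qquad k = 1, 2, \\
\operatorname{Cov}(\ell_i, z_{k,i})
  &= \beta \operatorname{Cov}(y_i^{*}, z_{k,i}) + \operatorname{Cov}(\varepsilon_i, z_{k,i})
   = \beta \operatorname{Cov}(y_i^{*}, z_{k,i}).
\end{align*}
```

Under these assumptions, the noise in lights drops out of the covariance between lights and each GDP measure, so the comparison of the measures is not contaminated by how badly lights themselves are measured.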
Our method is straightforward: we run a regression of night-time lights on the two GDP measures being compared and treat the coefficients on the two measures as proportional to the weights that these measures should receive in an optimal estimator of underlying economic activity. If one measure has a much larger coefficient than the other, it should be preferred. Our methodology does not explain why the better measure is better, but it also does not require us to know anything about the way that the statistics that we are comparing were constructed, except that we can assume their measurement error to be independent of the measurement error in night-time lights.
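The regression step can be sketched on simulated data. This is a minimal illustration, not the specification used in the papers: the data are synthetic, the error variances are arbitrary, and the helper `relative_weights` is a hypothetical name. The only structural assumption carried over from the text is that the error in lights is independent of the errors in the two GDP measures.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated (not real) data in logs: unobserved true income, two noisy GDP
# measures, and night-time lights with an independent error.
true_income = rng.normal(0.0, 1.0, n)
gdp_a = true_income + rng.normal(0.0, 0.3, n)   # more accurate measure
gdp_b = true_income + rng.normal(0.0, 0.9, n)   # less accurate measure
lights = 1.2 * true_income + rng.normal(0.0, 0.5, n)


def relative_weights(lights, measure_1, measure_2):
    """Regress lights on both measures (with an intercept) and normalise the
    two slope coefficients so that they sum to one."""
    X = np.column_stack([np.ones_like(lights), measure_1, measure_2])
    coefs, *_ = np.linalg.lstsq(X, lights, rcond=None)
    b1, b2 = coefs[1], coefs[2]
    return b1 / (b1 + b2), b2 / (b1 + b2)


w_a, w_b = relative_weights(lights, gdp_a, gdp_b)
print(f"weight on measure A: {w_a:.2f}, weight on measure B: {w_b:.2f}")
```

In this simulated setup the less noisy measure receives the larger coefficient, which is the sense in which a larger coefficient signals a better measure. The sketch recovers the ranking without using any information about how either measure was constructed.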
Our night-time lights analysis gives clear-cut answers to the three questions above, and reconciles current practice with the concerns about PPP measurement.
- PPP-adjusted GDP is a better measure of unobserved true income than GDP at market exchange rates. More precisely, for the most recent (2011) PPP measures, it correlates better with night-time lights. In accordance with Almås (2012), however, PPPs based on the 1996 price survey are a worse measure of unobserved true income for poor countries than GDP at market exchange rates. After the 2005 and 2011 surveys, PPP-adjusted GDP became a better measure of unobserved true income than GDP at market exchange rates both for rich and for poor countries.
- PPPs have been steadily improving over time. At least, this is the case for the 2005 and 2011 rounds of ICP. On the other hand, we find that the underlying national accounts data that undergoes PPP adjustment has not been generally improving over time – it may even have deteriorated in some time periods on average. This is notwithstanding revisions to the national accounts data over time, some of which have been large, such as the recent Nigerian rebasing of its GDP (Economist 2014). We undoubtedly have better measures of economic activity than we had 15 years ago, but most of the improvement has come from better estimates of PPPs.
- GDP at the latest PPP is a much better estimator of underlying economic activity than GDP at the multiple-benchmark PPP. We compared night-time lights with the Penn World Tables GDP at 2011 PPP, and Penn World Tables GDP at the synthetic PPP used in the multiple benchmark series. This conclusion appears to be counterintuitive: we seem to find that it is optimal to ignore data. It can be understood, however, if we assume that the rate of the improvement in the quality of the price data is fast enough to outweigh the magnitude of the fluctuations in prices. We have presented a straightforward model of these processes that shows that it is optimal to use the latest price data to measure prices in the earliest years, because the measurement error problem is most significant for this data. Contemporaneous price data in later years is optimal, because measurement error is fairly low, and so it is less important to reduce it relative to matching the annual variation in price data. This is exactly the pattern that we find in the data.
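The trade-off between measurement error and price fluctuations can be illustrated with a toy calculation. It is not the model from the paper: it assumes that log prices follow a random walk with a constant annual innovation variance, that each survey measures its own year's prices with an error variance that falls across rounds, and all numbers are hypothetical.

```python
def estimate_mse(target_year, survey_year, survey_error_var, drift_var):
    """Mean squared error of using one survey's prices for a target year.

    Toy assumption: error = survey measurement error + accumulated price drift
    between the survey year and the target year (random-walk prices).
    """
    gap = abs(target_year - survey_year)
    return survey_error_var[survey_year] + drift_var * gap


# Hypothetical error variances: survey quality improves quickly across rounds.
survey_error_var = {1996: 0.30, 2005: 0.06, 2011: 0.03}
drift_var = 0.01  # hypothetical annual variance of true price changes
latest = max(survey_error_var)

preferred = {}
for year in sorted(survey_error_var):
    own = estimate_mse(year, year, survey_error_var, drift_var)
    new = estimate_mse(year, latest, survey_error_var, drift_var)
    preferred[year] = "latest" if new < own else "contemporaneous"
    print(year, f"own survey: {own:.2f}", f"latest survey: {new:.2f}", preferred[year])
```

With these illustrative numbers, the latest survey beats the contemporaneous one for the earliest benchmark year, where the old survey is very noisy. For the later years the contemporaneous survey is at least as good, because its error is already small relative to the accumulated price drift.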
A dispiriting conclusion?
Our conclusion is somewhat dispiriting because, intuitively, it would be better to have a GDP series that changes continuously with additional data, rather than requiring revisions to long-ago observations whenever an update is made. For now, it appears that price survey methodology is continuously improving, and improving so rapidly that current estimates of prices (or methods of their aggregation) may be better estimates of past prices than the estimates of those prices that were made at the time. Once our methods of estimating prices reach a steady state, it may become preferable to move to the approach of continuous variation of GDP estimates embodied in the multiple-benchmark series.
References
Almås, I (2012), "International Income Inequality: Measuring PPP bias by estimating Engel curves for food", American Economic Review 102(2): 1093-1117.
Barro, R J (1991), "Economic Growth in a Cross Section of Countries", Quarterly Journal of Economics 106(2): 407-443.
Chen, S, and M Ravallion (2010), "The Developing World is Poorer Than We Thought, but No Less Successful in the Fight Against Poverty", Quarterly Journal of Economics 125(4): 1577-1625.
Chen, X, and W D Nordhaus (2011), "Using Luminosity Data as a Proxy for Economic Statistics", Proceedings of the National Academy of Sciences 108(21): 8589-8594.
Dowrick, S, and M Akmal (2005), "Contradictory trends in global income inequality: A tale of two biases", Review of Income and Wealth 51(2): 201-229.
The Economist (2014), "Step Change: Revised Figures Show Nigeria Africa's Largest Economy", 12 April.
Elvidge, C D, K E Baugh, E A Kihn, H W Kroehl, and E R Davis (1997), "Mapping City Lights With Nighttime Data from the DMSP Operational Linescan System", Photogrammetric Engineering & Remote Sensing 63(6): 727-734.
Elvidge, C D, K E Baugh, J B Dietz, T Bland, P C Sutton, and H W Kroehl (1999), "Radiance Calibration of DMSP-OLS Low-Light Imaging Data of Human Settlements", Remote Sensing of Environment 68(1): 77-88.
Elvidge, C D, K E Baugh, S J Anderson, P C Sutton, and T Ghosh (2012), "The Night Light Development Index (NLDI): A Spatially Explicit Measure of Human Development from Satellite Data", Social Geography 7: 23-35.
Feenstra, R C, R Inklaar and M P Timmer (2015), "The Next Generation of the Penn World Table", American Economic Review 105(10): 3150-3182.
Henderson, J V, A Storeygard, and D N Weil (2012), "Measuring Economic Growth from Outer Space", American Economic Review 102(2): 994-1028.
Johnson, S, W Larson, C Papageorgiou, and A Subramanian (2013), "Is Newer Better? Penn World Table Revisions and Their Impact on Growth Estimates", Journal of Monetary Economics 60(2): 255-274.
Pinkovskiy, M L, and X Sala-i-Martin (2016), "Lights, Camera, ... Income! Illuminating the National Accounts-Household Surveys Debate", Quarterly Journal of Economics 131(2): 579-631.
Sala-i-Martin, X (2006), "The World Distribution of Income: Falling Poverty and… Convergence, Period", Quarterly Journal of Economics 121(2): 351-397.
Summers, R, and A Heston (1991), "The Penn World Table (Mark 5): an expanded set of international comparisons, 1950-1987", NBER working paper 1562.