The pioneering work of Irving Kravis, Alan Heston, and Robert Summers (1978), which led to the Penn World Table (PWT) data, was aimed at converting national measures of GDP and income into internationally comparable estimates. Cross-country comparisons could not be based on national GDP data because these were valued at domestic prices. Since some goods and especially services were known to be cheaper in poor countries compared to rich countries, adjustments needed to be made to the valuation of these goods and services so that they could be made internationally comparable. These adjustments were made by calculating common international prices – the so-called purchasing power parity (PPP) prices – for all goods and services. With these PPP adjustments, GDP could then be compared across countries.
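The effect of a PPP adjustment can be illustrated with a minimal sketch. All figures and names below (`poor`, `rich`, `ppp_rate`, and so on) are made up for illustration and are not taken from the PWT; the sketch assumes a single economy-wide PPP rate, whereas the actual adjustment works with prices for individual goods and services.

```python
# Hypothetical two-country illustration; none of these numbers come from the PWT.

def gdp_at_market_rate(gdp_local, exchange_rate):
    """GDP in US dollars, converted at the market exchange rate
    (local-currency units per dollar)."""
    return gdp_local / exchange_rate


def gdp_at_ppp(gdp_local, ppp_rate):
    """GDP in international dollars, converted at the PPP rate
    (local-currency units needed to buy what one dollar buys)."""
    return gdp_local / ppp_rate


# Assumption: goods are cheap in the poor country, so its PPP rate (4)
# is well below its market exchange rate (10).
poor = {"gdp_local": 1_000.0, "exchange_rate": 10.0, "ppp_rate": 4.0}
rich = {"gdp_local": 500.0, "exchange_rate": 1.0, "ppp_rate": 1.0}

market_ratio = (gdp_at_market_rate(rich["gdp_local"], rich["exchange_rate"])
                / gdp_at_market_rate(poor["gdp_local"], poor["exchange_rate"]))
ppp_ratio = (gdp_at_ppp(rich["gdp_local"], rich["ppp_rate"])
             / gdp_at_ppp(poor["gdp_local"], poor["ppp_rate"]))

print(market_ratio)  # rich/poor gap at market exchange rates
print(ppp_ratio)     # rich/poor gap after the PPP adjustment
```

In this made-up case the measured income gap narrows once lower prices in the poor country are taken into account.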
A puzzling feature is that data – especially for GDP growth but also for the level of GDP and the PPPs – for the same country at the same point in time change across successive versions of the PWT.1 A stark example is Equatorial Guinea’s growth rate. According to PWT 6.2, it was the second-fastest-growing of 40 African countries over the two and a half decades beginning in 1975. According to the previous version, PWT 6.1, however, it was the slowest-growing.
Doubts about existing GDP data quality have led many researchers such as Henderson, Storeygard, and Weil (2009) and Young (2009) to find other ways of measuring GDP.
In Johnson, Larson, Papageorgiou, and Subramanian (2009), we instead investigate the root causes and consequences of this data variability. More precisely, we attempt to systematically uncover the facts of the data variability across versions of the Penn World Table, offer explanations, and provide recommendations to researchers and policymakers on how to best use these data.2
Facts in the data
We focus on the differences between two seemingly minor revisions in the Penn World Table mark 6. Versions 6.1 and 6.2 use broadly similar national income data provided by country authorities and similar underlying methodology but produce growth estimates that can be very different.
High frequency means high variability; low GDP means high variability
First, we observe that GDP growth rates averaged across different time horizons exhibit much greater variability the shorter the time span (Figure 1). In all three panels, the mean difference is approximately zero, but in the annual panel, the standard deviation of the average GDP growth difference is 5.4 percentage points, versus 1.6 in the 10-year panel and 1.5 in the 29-year panel. In short, the data are more variable at higher frequencies, perhaps startlingly so.
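One mechanical reason why averaging reduces the gap between versions can be sketched with simulated data. The sketch assumes that the two versions differ by an independent random discrepancy in the level of GDP each year; this is a simplification chosen for illustration, not a description of how PWT revisions actually arise, and the panel sizes and noise parameters are arbitrary.

```python
import math
import random
import statistics


def avg_growth(levels, start, horizon):
    """Average annual log growth between year `start` and `start + horizon`."""
    return (math.log(levels[start + horizon]) - math.log(levels[start])) / horizon


def growth_gap_sd(panel_a, panel_b, horizon):
    """Standard deviation (percentage points) of the cross-version gap in
    average growth, using non-overlapping windows of length `horizon`."""
    gaps = []
    for a, b in zip(panel_a, panel_b):
        for start in range(0, len(a) - horizon, horizon):
            gap = avg_growth(b, start, horizon) - avg_growth(a, start, horizon)
            gaps.append(100 * gap)
    return statistics.stdev(gaps)


# Simulated panels: version B equals version A times an independent
# level discrepancy in every year (hypothetical parameters).
random.seed(0)
YEARS, COUNTRIES = 30, 50
panel_a, panel_b = [], []
for _ in range(COUNTRIES):
    level = 1000.0
    series_a, series_b = [], []
    for _ in range(YEARS):
        series_a.append(level)
        series_b.append(level * math.exp(random.gauss(0.0, 0.03)))
        level *= math.exp(random.gauss(0.02, 0.03))
    panel_a.append(series_a)
    panel_b.append(series_b)

sd_by_horizon = {h: growth_gap_sd(panel_a, panel_b, h) for h in (1, 10, 29)}
for horizon, sd in sd_by_horizon.items():
    print(horizon, round(sd, 2))
```

Because average growth over a window depends only on the two endpoint levels, a level discrepancy of a given size is divided by the length of the window. Under this independence assumption the gap keeps shrinking as the horizon lengthens; the reported figures fall much less between the 10-year and 29-year panels, which suggests that part of the actual discrepancy is persistent.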
Also, as predicted by Rao and Selvanathan (1992), data variability for a given country is negatively related to its size, which is proxied by total GDP. In the annual panel, the standard deviation for the bottom third of countries ranked by total GDP is 7.7 percentage points, versus 4.9 for the middle third and 1.9 for the top third. We also find that data variability is greater for historical estimates than for more recent estimates.
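A size breakdown of this kind could be computed along the following lines. This is a sketch under stated assumptions: the function name `sd_by_size_tercile`, the input layout, and the six-country data set are hypothetical, and the population standard deviation is used for simplicity.

```python
import statistics


def sd_by_size_tercile(countries):
    """Pool growth gaps by tercile of total GDP and return the standard
    deviation of the gaps within each tercile.

    `countries` is a list of (total_gdp, gaps) pairs, where `gaps` holds one
    country's annual growth differences between two data versions, in
    percentage points.
    """
    ranked = sorted(countries, key=lambda c: c[0])
    n = len(ranked)
    cuts = [0, n // 3, 2 * n // 3, n]
    result = {}
    for label, lo, hi in zip(("bottom", "middle", "top"), cuts, cuts[1:]):
        pooled = [gap for _, gaps in ranked[lo:hi] for gap in gaps]
        result[label] = statistics.pstdev(pooled)
    return result


# Hypothetical data: (total GDP, growth gaps between two versions).
sample = [
    (5, [-8.0, 8.0]),
    (10, [-6.0, 6.0]),
    (50, [-4.0, 4.0]),
    (80, [-5.0, 5.0]),
    (500, [-2.0, 2.0]),
    (900, [-1.0, 1.0]),
]

tercile_sd = sd_by_size_tercile(sample)
for label, sd in tercile_sd.items():
    print(label, round(sd, 2))
```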
Based on a robustness analysis of thirteen major cross-country growth studies, we show that the greater variability of high-frequency data means that studies that use annual data are less robust across PWT versions than studies that average data over longer horizons.
Figure 1. Differences in annual per capita GDP growth between PWT 6.2 and PWT 6.1
We find that the PWT methodology raises a more basic question about valuation. The rationale for the PWT is to come up with GDP level and growth data that are at common international (the so-called PPP) prices so that the data are comparable across countries. The methodology, however, leads to the construction of GDP growth estimates that are based not on common international prices but on a mixture of international and domestic prices. In this case, it is not obvious that the data are comparable across countries.
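A stylized sketch can show how such a mixture might arise and why it matters. It assumes, purely for illustration, that aggregate growth is a weighted average of component growth rates measured at domestic prices, with weights given by expenditure shares valued either at domestic or at international prices. The component names, growth rates, and expenditure figures are hypothetical, and the sketch is not a reproduction of the PWT's actual procedure.

```python
def aggregate_growth(component_growth, expenditures):
    """Share-weighted average of component growth rates."""
    total = sum(expenditures.values())
    return sum(expenditures[name] / total * rate
               for name, rate in component_growth.items())


# Component growth rates from national accounts, i.e. at domestic prices.
growth_domestic_prices = {"consumption": 0.02, "investment": 0.08, "government": 0.01}

# The same expenditures valued at two sets of prices (hypothetical). At
# international prices, investment is assumed to be valued lower and
# government services higher than at domestic prices.
expenditure_domestic = {"consumption": 60.0, "investment": 20.0, "government": 20.0}
expenditure_international = {"consumption": 50.0, "investment": 10.0, "government": 40.0}

growth_all_domestic = aggregate_growth(growth_domestic_prices, expenditure_domestic)
growth_mixed = aggregate_growth(growth_domestic_prices, expenditure_international)

print(round(growth_all_domestic, 4))  # domestic growth rates, domestic-price weights
print(round(growth_mixed, 4))         # domestic growth rates, international-price weights
```

In this setup, a revision to the international prices changes the weights, and therefore the aggregate growth rate, even when the underlying national accounts data are unchanged.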
As a practical matter, how should researchers deal with these problems of data variability and valuation? We offer a few concrete recommendations.
- Given the sensitivity of the growth results to data revisions at annual frequency, research using annual data should have a higher bar in demonstrating robustness to alternative data series.
- If researchers need to use annual growth data, it may be better to use national income accounts data, which are available in the World Bank’s World Development Indicators database. Although these data are not PPP-adjusted, PPP effects are less important for annual data, so the cost of forgoing PPP-adjusted data is smaller. Moreover, the alternative of using PWT growth data comes at the cost of highly variable GDP estimates.
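One way to apply these recommendations in practice is to re-estimate the same relationship on each available growth series and compare the results. The sketch below is a minimal illustration with a bivariate regression: the series labels, the data, the helper names (`ols_slope`, `slopes_by_series`, `is_robust`), and the tolerance are all hypothetical, and a real study would apply its own specification and robustness criterion.

```python
def ols_slope(x, y):
    """Slope from an ordinary least squares regression of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def slopes_by_series(regressor, growth_series):
    """Estimate the same regression on every alternative growth series."""
    return {name: ols_slope(regressor, growth)
            for name, growth in growth_series.items()}


def is_robust(slopes, tolerance):
    """True if all slopes share a sign and lie within `tolerance` of each other."""
    values = list(slopes.values())
    same_sign = all(v > 0 for v in values) or all(v < 0 for v in values)
    return same_sign and max(values) - min(values) <= tolerance


# Hypothetical regressor and three alternative growth series for five countries.
regressor = [0.0, 1.0, 2.0, 3.0, 4.0]
growth_series = {
    "version_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "version_b": [5.0, 1.0, 3.0, 2.0, 4.0],
    "national_accounts": [1.5, 2.0, 3.5, 3.5, 5.0],
}

slopes = slopes_by_series(regressor, growth_series)
for name, slope in slopes.items():
    print(name, round(slope, 2))
print(is_robust(slopes, tolerance=0.5))
```

Here the estimated slope changes sign between the two hypothetical versions, so the result would not pass even this simple check.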
1 While the makers of the Penn World Table have attached quality rankings to the data since the table’s inception, these rankings have been largely ignored. A notable exception is Mankiw, Romer, and Weil (1992), who, as a robustness exercise, estimate regressions that omit observations with poor data quality.
2 There has been a flurry of activity on this topic in recent years, including Ciccone and Jarocinski (2008), Deaton and Heston (2009), and Ponomareva and Katayama (forthcoming).
Ciccone, Antonio, and Marek Jarocinski (2008), “Determinants of Economic Growth: Will Data Tell?” European Central Bank Working Paper 852. Frankfurt: European Central Bank.
Deaton, Angus, and Alan Heston (forthcoming), “Understanding PPPs and PPP-Based National Accounts”, American Economic Journal: Macroeconomics.
Henderson, J. Vernon, Adam Storeygard, and David N. Weil (2009a), “Measuring Economic Growth from Outer Space”, NBER Working Paper 15199.
Henderson, J. Vernon, Adam Storeygard, and David N. Weil (2009b), “Measuring economic growth from outer space”, VoxEU.org, 2 September.
Johnson, S., W. Larson, C. Papageorgiou, and A. Subramanian (2009), “Is Newer Better? Penn World Table Revisions and Their Impact on Growth Estimates”, NBER Working Paper 15455.
Kravis, Irving B., Alan Heston, and Robert Summers (1978), “Real GDP Per Capita for More Than One Hundred Countries”, Economic Journal, 88: 215–242.
Mankiw, N. Gregory, David Romer, and David N. Weil (1992), “A Contribution to the Empirics of Economic Growth”, Quarterly Journal of Economics, 107: 407–437.
Ponomareva, Natalia, and Hajime Katayama (forthcoming), “Does the Version of the Penn World Tables Matter? An Analysis of the Relationship between Growth and Volatility”, Canadian Journal of Economics.
Rao, Prasada D.S., and E.A. Selvanathan (1992), “Computation of Standard Errors for Geary-Khamis Parities and International Prices: A Stochastic Approach”, Journal of Business and Economic Statistics, 10: 109–115.
Young, Alwyn (2009), “Real Consumption Measures for the Poorer Regions of the World”, London School of Economics working paper.