VoxEU Column Labour Markets Macroeconomic policy

What data on millions of US workers reveal about individual earnings risk

29 Apr 2015

Many policy design issues depend crucially on the nature of the idiosyncratic risks to labour income. The earnings dynamics literature has typically relied on an implicit or explicit assumption that earnings shocks are log-normally distributed. This column challenges the conventional wisdom by bringing in new evidence from a very large administrative dataset on US workers. It presents evidence suggesting income shocks exhibit substantial deviations from log-normality, and that shock persistence depends on income levels as well as the size and sign of the shock.

Authors

Jae Song

Fatih Karahan

Serdar Ozkan

Fatih Guvenen

Workers’ perceptions of unforeseeable labour-market events are central to many personal economic decisions. Both pleasant surprises, such as being offered a dream job or getting a promotion or a raise, and disappointments, such as job loss or health shocks, matter because these risks are hard to insure.

This is why such risks also lie at the heart of numerous economic policy questions – what determines the inequality of consumption and wealth? Which parts of society should fiscal stimulus policies target in order to maximise their impact on aggregate demand? And what is the optimal way to tax earnings? Addressing these questions requires a sound understanding of the nature of earnings risk over a career.

Despite a vast body of research¹, it is fair to say that many aspects of the nature of earnings risk remain unknown. For example, what does the probability distribution of earnings shocks look like? How well is it approximated by a log-normal distribution (an assumption often made out of convenience)? And, perhaps more important, how do the properties of shocks differ across low- and high-income workers and how do they change over the life cycle? What about the dynamics of earnings – do positive shocks exhibit different persistence from negative shocks?

In a recent paper (Guvenen et al 2015), we draw on a very large and confidential panel dataset from the US Social Security Administration to shed new light on the nature of the risks to labour income. The substantial sample size – more than 200 million observations from 1978 to 2010 – allows us to employ a fully nonparametric approach and take what amounts to high-resolution pictures of individual earnings histories.

Are earnings shocks log-normally distributed?

Since its inception in the late 1970s², the earnings dynamics literature has worked with the implicit or explicit assumption of a Gaussian framework, thereby making no use of higher-order moments beyond the variance-covariance matrix. One of the few exceptions is an important paper by Geweke and Keane (2000), who emphasize the non-Gaussian nature of earnings shocks.

In our paper we find that changes in earnings (over both short and long time horizons) display striking deviations from log-normality. The blue line in figure 1 shows the distribution of (log) income changes (left panel, annual change; right panel, five-year change) and compares it to that of a normal distribution chosen to have the same standard deviation as the data.

On both panels, notice the sharpness of the peak in the empirical density, how little mass there is on the ‘shoulders,’ and how long the tails are relative to the normal distribution. These three features of an empirical density are best summarised by a statistic called kurtosis. A common measure of kurtosis is the fourth standardised central moment of the distribution. The empirical density plotted in the left panel of figure 1 has a kurtosis of almost 18, much higher than that of a normal distribution, which is 3.
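As a minimal sketch of the statistic just described, the fourth standardised central moment can be computed directly from a list of log earnings changes. The function name `kurtosis` and the `sample` values below are hypothetical illustrations, not data from the study, and the sketch uses simple divide-by-n moments with no small-sample correction.

```python
from typing import Sequence


def kurtosis(changes: Sequence[float]) -> float:
    """Fourth standardised central moment (a normal distribution gives 3)."""
    n = len(changes)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(changes) / n
    # Population (divide-by-n) central moments; no small-sample correction.
    m2 = sum((x - mean) ** 2 for x in changes) / n
    m4 = sum((x - mean) ** 4 for x in changes) / n
    if m2 == 0:
        raise ValueError("kurtosis is undefined for constant data")
    return m4 / m2 ** 2


# Hypothetical log earnings changes: mostly tiny moves plus two large ones.
sample = [0.01, -0.02, 0.0, 0.03, -0.01, 0.02, -0.03, 0.0, 1.2, -1.5]
print(round(kurtosis(sample), 2))
```

A sample dominated by near-zero changes with a few very large ones, as in this toy list, yields a kurtosis above 3, which is the qualitative pattern the column reports.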

Figure 1. Histogram of log earnings changes (left panel, annual change; right panel, five-year change)

To provide a more familiar interpretation of these kurtosis values, we calculate measures of concentration. Table 1 reports the fraction of individuals experiencing a change in log earnings (of either sign) of less than a threshold equalling 0.05, 0.1, 0.2, 0.5 and 1. If the data were drawn from a normal density, only 8% of individuals would experience an annual change in earnings of less than 5%. This share in the data is 35%, showing a much higher concentration of earnings changes near zero. Furthermore, the last row shows that the probability that a worker will receive a very large shock (larger than 150 log points—an almost fivefold increase or an 80% drop) is 11.5 times higher in the data than under log-normality. To put it differently, in a given year, most individuals experience very small earnings shocks, and a small but non-negligible number experience very large shocks.
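The normal benchmark in this comparison can be sketched in a few lines. The standard deviation `SIGMA = 0.5` is an assumption made here for illustration (the column does not state it); it is chosen because, with a zero mean, it roughly reproduces the 8% figure quoted above. The helper names are hypothetical.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def share_within(threshold: float, sigma: float) -> float:
    """P(|x| < threshold) when x ~ Normal(0, sigma^2)."""
    return 2.0 * normal_cdf(threshold / sigma) - 1.0


SIGMA = 0.5  # assumed for illustration, not taken from the column

# Thresholds used in Table 1 of the column.
for t in (0.05, 0.1, 0.2, 0.5, 1.0):
    print(f"|change| < {t}: {share_within(t, SIGMA):.3f}")

# Share of changes larger than 150 log points in absolute value.
tail = 1.0 - share_within(1.5, SIGMA)
print(f"|change| > 1.5: {tail:.4f}")

# What a change of 150 log points means in levels.
print(f"gain factor: {math.exp(1.5):.2f}, loss: {1 - math.exp(-1.5):.1%}")
```

Under this assumed standard deviation, a 150-log-point change is a three-sigma event, which is why the normal benchmark assigns it so little probability.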

Table 1. Fraction of individuals with selected ranges of log earnings changes

Moreover, the average kurtosis masks significant heterogeneity across individuals: kurtosis increases with both age and the level of earnings. Prime-aged males with recent earnings of $100,000 (in 2005 dollars) face earnings shocks with a kurtosis as high as 35, whereas young workers with recent earnings of $10,000 face a kurtosis of only 5.

A second important deviation from log-normality is that the distribution of earnings shocks is not symmetric – it displays strong negative skewness. Specifically, large downward movements in earnings (disaster shocks) are more likely than large upward swings. Furthermore, shocks become more negatively skewed with higher earnings and with age. This worsening is due entirely to the fact that large upside earnings moves become less likely from age 25 to 45 and to the increasing disaster risk after age 45.
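Asymmetry of this kind is commonly measured by the third standardised central moment. The sketch below is one such measure under that assumption; the column does not specify which skewness statistic it uses, and the function name and inputs are hypothetical.

```python
from typing import Sequence


def skewness(changes: Sequence[float]) -> float:
    """Third standardised central moment (zero for a symmetric distribution)."""
    n = len(changes)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(changes) / n
    # Population (divide-by-n) central moments; no small-sample correction.
    m2 = sum((x - mean) ** 2 for x in changes) / n
    m3 = sum((x - mean) ** 3 for x in changes) / n
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5


# Hypothetical changes: one large drop and no matching large gain.
print(skewness([0.0, 0.0, 0.0, -4.0]))
# A symmetric set of changes for comparison.
print(skewness([-2.0, -1.0, 0.0, 1.0, 2.0]))
```

A negative value indicates that large downward moves outweigh large upward ones, which is the sense of "negative skewness" in the paragraph above.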

In a related study, Bonhomme and Robin (2010) analyse French earnings data over short panels and find the distribution of the transitory component to be left-skewed and leptokurtic. In our paper, we go beyond the overall distribution and find substantial variation in the degree of non-normality with age and earnings levels.

What do these non-normal features of the data mean for analyses of risk?

Although a complete answer is beyond the scope of our paper, consider the well-known thought experiment in which an individual is indifferent between (i) a gamble that changes consumption by a random proportion (1+δ), and (ii) a fixed payment π, the risk premium, to avoid the gamble. Let us compare two scenarios for the standard constant relative risk aversion utility function with a curvature of 10. In the first one, δ is drawn from a Gaussian distribution with zero mean and a standard deviation of 0.10. In the second, δ has the same mean and standard deviation but has a skewness coefficient of -2 and a kurtosis of 30 (consistent with our empirical findings). The solution for the risk premium, π, under different assumptions is displayed in table 2. As seen here, an individual would be willing to pay 22% of her average consumption to avoid the non-normal bet compared to only about 5% for the normal bet—an amplification of 450%.
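The thought experiment can be sketched numerically for gambles with finitely many outcomes. The code assumes baseline consumption is normalised to 1 and that π is expressed as a share of it, so that π solves u(1 − π) = E[u(1 + δ)] with CRRA utility. The function name and both toy gambles are hypothetical; they share a zero mean and a standard deviation of 0.10 but do not reproduce the distributions behind Table 2 (the skewed gamble has a skewness of −1.5 and a kurtosis of 3.25, not −2 and 30).

```python
import math
from typing import Sequence


def risk_premium(deltas: Sequence[float], probs: Sequence[float],
                 gamma: float) -> float:
    """Share of consumption pi solving u(1 - pi) = E[u(1 + delta)] under CRRA."""
    if len(deltas) != len(probs) or abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probs must match deltas and sum to one")
    if any(1.0 + d <= 0.0 for d in deltas):
        raise ValueError("consumption must stay positive")
    if gamma == 1.0:
        # Log utility: the certainty equivalent is the geometric mean.
        expected_log = sum(p * math.log(1.0 + d) for d, p in zip(deltas, probs))
        return 1.0 - math.exp(expected_log)
    power = 1.0 - gamma
    expected = sum(p * (1.0 + d) ** power for d, p in zip(deltas, probs))
    return 1.0 - expected ** (1.0 / power)


GAMMA = 10.0  # curvature used in the column's example

# Toy symmetric gamble: +/-10% with equal probability.
symmetric = risk_premium([-0.10, 0.10], [0.5, 0.5], GAMMA)

# Toy left-skewed gamble with the same mean (0) and standard deviation (0.10).
skewed = risk_premium([-0.20, 0.05], [0.2, 0.8], GAMMA)

print(f"symmetric: {symmetric:.4f}, left-skewed: {skewed:.4f}")
```

Even in this small example the left-skewed gamble commands a larger premium than the symmetric one with the same variance, which is the direction of the effect reported in the column.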

Table 2. Effect of skewness and kurtosis on risk premium

Furthermore, in a recent paper, Golosov et al (2014) study the quantitative importance of higher-order moments of earnings risk for optimal taxation. They show that negative skewness and excess kurtosis imply a top marginal tax rate on earnings that is substantially higher than under the assumption of Gaussian shocks with the same variance.

Asymmetric mean reversion

What about the dynamics of changes in earnings? Figure 2 plots impulse-response functions conditional on the recent earnings of individuals and on the size of the shocks. More specifically, it shows the 10-year income change (y-axis) following income shocks of different magnitudes (x-axis) for six income groups.³ Several patterns emerge. First, large shocks are more transitory than small shocks. Second, negative shocks to the lowest income group (solid blue line) are quite transitory with a mean reversion of about 80 percent, whereas positive shocks are quite persistent, with only about a 20 percent reversion to the mean after 10 years. As we move up the earnings distribution, the positive and negative branches of each graph rotate in opposite directions, so that for the highest earnings group, we have the opposite pattern: mean reversion for negative shocks is only about 20 percent, whereas this is about 75 percent for positive shocks.
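One way to read the mean-reversion percentages is as the share of the initial shock that is undone by the subsequent 10-year change. The sketch below encodes that reading; it is an interpretation assumed here rather than the paper's stated definition, with both quantities measured in log points and the function name hypothetical.

```python
def reverted_share(shock: float, subsequent_change: float) -> float:
    """Share of an initial shock undone by the later change (1.0 = fully reverted)."""
    if shock == 0.0:
        raise ValueError("reversion is undefined for a zero shock")
    # A later change of opposite sign to the shock counts as reversion.
    return -subsequent_change / shock


# Hypothetical case: a drop of 1.0 log points followed by a 0.8 recovery.
print(reverted_share(-1.0, 0.8))

# Hypothetical case: a gain of 0.5 log points followed by a 0.1 decline.
print(reverted_share(0.5, -0.1))
```

On this reading, a point on a line with slope −0.8 in the figure corresponds to 80 percent mean reversion, and a slope of −0.2 to 20 percent.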

Figure 2. Asymmetric mean reversion – Butterfly pattern

Earnings profile heterogeneity

Finally, we turn to another important aspect of income dynamics and study how average earnings growth varies over the life cycle and across individuals. For this purpose, we group individuals based on their lifetime earnings (LE), computed by summing their earnings from ages 25 through 60 using at least 33 years of data. Figure 3 plots the growth in average earnings between ages 25 and 55 against lifetime earnings percentiles. The median individual experiences an earnings growth of 38%, while for individuals in the 95th percentile, this figure is 230%; for those in the 99th percentile, this figure is almost 1,500%. This feature of the data turns out to be very difficult to match with standard models of idiosyncratic risk.
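The grouping variable and the growth figures can be sketched as follows. The function names and the dictionary layout (age mapped to annual earnings) are assumptions made for illustration; the age window and the 33-year minimum follow the description above, and the conversion to log growth is included only to relate the percentages to log points.

```python
import math
from typing import Dict, Optional


def lifetime_earnings(earnings_by_age: Dict[int, float],
                      first_age: int = 25, last_age: int = 60,
                      min_years: int = 33) -> Optional[float]:
    """Sum earnings over the age window; None if too few years are observed."""
    in_window = [e for age, e in earnings_by_age.items()
                 if first_age <= age <= last_age]
    if len(in_window) < min_years:
        return None
    return sum(in_window)


def growth_percent(earnings_start: float, earnings_end: float) -> float:
    """Percentage growth between two (group-average) earnings levels."""
    return 100.0 * (earnings_end / earnings_start - 1.0)


def log_growth(percent: float) -> float:
    """Convert percentage growth into growth in log points."""
    return math.log(1.0 + percent / 100.0)


# Growth figures quoted in the column, expressed in log points.
for pct in (38.0, 230.0, 1500.0):
    print(f"{pct:.0f}% growth = {log_growth(pct):.2f} log points")
```

Individuals with fewer than 33 observed years in the window return `None` and would be dropped before ranking the remainder into lifetime-earnings percentiles.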

How is earnings growth distributed across the different decades of the life cycle? Figure 4 answers this question by plotting, separately, earnings growth from ages 25 to 35, 35 to 45, and 45 to 55. First, across the board, the bulk of earnings growth happens during the first decade. In fact, for the median LE group, average earnings growth from ages 35 to 55 is zero (notice that the solid blue line and the grey line with circles overlap at LE50). Second, after age 45, the only groups that are experiencing growth, on average, are those in the top 2 percent of the LE distribution.

Figure 3. Life-cycle earnings growth rates, by lifetime earnings group

Figure 4. Log earnings growth over subperiods of life cycle

Summing up

A broader message from our findings is a call for researchers to reconsider the standard approach in the literature to studying earnings dynamics. The covariance matrix approach that dominates current work (whereby the variance-covariance matrix of earnings changes is the only set of moments considered in pinning down parameters) is too opaque and a bit mysterious – it is difficult to judge the economic implications of matching or missing certain covariances. Furthermore, the standard model in the literature assumes log-normal shocks, whereas this study shows important deviations from log-normality, in the form of very high kurtosis and negative skewness.

With the increasing availability of very large panel data sets, we believe that researchers' priority in choosing methods needs to shift from efficiency concerns to transparency. The approach adopted here is an example of the latter, and we believe it allows economists to be better judges of what each moment implies for the economic questions they have at hand.

References

Bonhomme, S and J M Robin (2010) “Generalized non-parametric deconvolution with an application to earnings dynamics”, The Review of Economic Studies, 77(2): 491–533.

Borovicka, J, L P Hansen and J A Scheinkman (2014) “Shock elasticities and impulse responses”, Working paper, University of Chicago.

Christiano, L J, M Eichenbaum and C L Evans (2005) “Nominal rigidities and the dynamic effects of a shock to monetary policy”, Journal of Political Economy, 113(1): 1–45.

Geweke, J and M Keane (2000) “An empirical analysis of earnings dynamics among men in the PSID: 1968-1989”, Journal of Econometrics, 96: 293–356.

Golosov, M, M Troshkin and A Tsyvinski (2014) “Redistribution and social insurance”, Working paper, Princeton University.

Guvenen, F (2009) “An empirical investigation of labor income processes”, Review of Economic Dynamics, 12(1): 58–79.

Hause, J C (1980) “The fine structure of earnings and the on-the-job training hypothesis”, Econometrica, 48(4): 1013–1029.

Meghir, C and L Pistaferri (2004) “Income variance dynamics and heterogeneity”, Econometrica, 72(1): 1–32.

Lillard, L A and Y Weiss (1979) “Components of variation in panel earnings data: American scientists 1960-70”, Econometrica, 47(2): 437–454.

Lillard, L A and R J Willis (1978) “Dynamic aspects of earnings mobility”, Econometrica, 46(5): 985–1012.

MaCurdy, T E (1982) “The use of time series processes to model the error structure of earnings in a longitudinal data analysis”, Journal of Econometrics, 18(1): 83–114.

Storesletten, K, C I Telmer and A Yaron (2004) “Consumption and risk sharing over the life cycle”, Journal of Monetary Economics, 51: 609–633.

Footnotes

1 Important contributions include but are not limited to Meghir and Pistaferri (2004), Storesletten et al (2004), and Guvenen (2009).

2 Earliest contributions include Lillard and Willis (1978), Lillard and Weiss (1979), Hause (1980), and MaCurdy (1982).

3 In this regard, our approach is in the spirit of the recent macroeconomics literature that views impulse responses as key to understanding time-series dynamics in aggregate data (e.g., Christiano et al 2005, Borovicka et al 2014).
