Microeconometrics
Much Ado
In recent years, more and more applied work in microeconometrics has
been conducted using disaggregated data at the household level, such as
the Family Expenditure Survey (FES). Such data can yield useful
information, but new problems arise in analysing them. Many household
surveys record "zero expenditures" when the household makes no
purchases of particular goods during the survey interview period. This
forces the investigator to consider the "truncation problem":
expenditures cannot be negative, yet the conventional regression model
allows this to occur. Another difficulty lies in the short duration of
the interview period: a household might not happen to purchase a
relatively durable good in this particular period, but may nevertheless
consume its services continuously.
How can methods of analysis be adapted to deal with these problems? The
importance of the topic prompted CEPR and the ESRC Econometric Study
Group to hold a workshop at the Centre on 30 November. It was organized
by CEPR Associate Director Professor Richard Blundell (University
College London) and brought together participants from a number of
British universities. The two papers presented at the workshop
approached the problem of "zero observations" from different angles
and provided interesting alternative solutions.
Some economic variables cannot by their very nature take on negative
values. In the microeconomic context, such variables could be
expenditure or labour supply. In the conventional linear regression
model, however, the dependent variable can take on any positive or
negative value. One well-known solution to this problem is provided by
the "Tobit model", a regression model in which the values of the
endogenous variable are truncated at zero, so that it takes only
non-negative values.
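The idea behind the Tobit model can be illustrated with a minimal sketch. It assumes the standard textbook form of the model with a single regressor and normal errors, in which a latent variable is recorded as zero whenever it would be negative; the function names, parameter values and simulated data are hypothetical and are not taken from the workshop papers. The log-likelihood adds the probability of a zero for each zero observation to the normal density for each positive one.

```python
import math
import random


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_loglik(alpha, beta, sigma, xs, ys):
    """Log-likelihood of y = max(0, alpha + beta*x + e), e ~ N(0, sigma^2)."""
    total = 0.0
    for x, y in zip(xs, ys):
        mu = alpha + beta * x
        if y <= 0.0:
            # Zero observation: probability that the latent value is <= 0.
            total += math.log(max(norm_cdf(-mu / sigma), 1e-300))
        else:
            # Positive observation: normal density of the latent value.
            z = (y - mu) / sigma
            total += -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)
    return total


def ols_slope(xs, ys):
    """Slope of a least-squares regression of y on x (with intercept)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def simulate(n, alpha, beta, sigma, seed):
    """Draw hypothetical data in which negative latent values are recorded as 0."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [max(0.0, alpha + beta * x + rng.gauss(0.0, sigma)) for x in xs]
    return xs, ys


if __name__ == "__main__":
    xs, ys = simulate(2000, alpha=-1.0, beta=1.0, sigma=1.0, seed=0)
    print("OLS slope on data with zeros:", ols_slope(xs, ys))
    print("Tobit log-likelihood at the true values:",
          tobit_loglik(-1.0, 1.0, 1.0, xs, ys))
```

In this simulated setting the least-squares slope falls well below the true value of 1 because of the many zeros, which is the kind of distortion the Tobit likelihood is designed to avoid. Maximizing `tobit_loglik` over its three parameters would give the Tobit estimates.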
Joanna Gomulka (LSE) noted in her paper "Gamma-Tobit: A Tobit Type
Model with Gamma-distributed Error Terms" that the assumption of
normally distributed error terms usually made in the Tobit model can
have important implications for the estimates of regression
coefficients. In order to obtain a more general framework, Gomulka
considered a family of distributions that contains the normal
distribution as a special case. She argued that this approach offered
much more flexibility, yet only required the estimation of one
additional parameter.
This modified Tobit model was then applied to models of household
expenditure on tobacco and alcohol. These are typically categories where
many zero expenditures are recorded in survey data, and the truncation
problem is thus potentially severe. Expenditure on these commodities is
particularly important for policy-makers because of their implications
for health and their importance as sources of government revenue.
Expenditure data are available from the FES and relate to some 53,000
households over the period 1970-80. From this main sample Gomulka and
her colleagues drew several random subsamples of about one-tenth this
size in order to compare "Tobit" and "modified-Tobit"
estimators. Their model explains the share of tobacco and alcohol in
total household expenditure in terms of total expenditure, time,
relative prices, the age and socioeconomic status of the head of the
household, and household composition.
For tobacco the difference between the results obtained under the
assumption of normal errors and those obtained under Gomulka's more
general distribution is relatively small. This indicates that the usual
Tobit model is not unreasonable for this commodity.
In the case of expenditures on alcohol, however, the outcome is
completely different. Here the results suggested an important departure
from the conventional model. Hence, forecasts made on the basis of the
modified model may differ substantially from those based on the usual
Tobit model. The modified model proposed by Gomulka had enough
additional flexibility to allow it to properly capture the particular
spread of the data and seemed a sounder basis for policy analysis.
Of course, other ways of extending the class of underlying distributions
could be considered. This proved to be one of the main issues in the
discussion of the paper. A more fundamental methodological question was
also raised: should we focus our attention on the properties of the
error term, the "unexplained" part of the dependent variable, or
on the performance of the explanatory variables in the model, such as
age and socioeconomic status?
The estimation of demand functions and Engel curves (the relationship
between an individual's consumption of a particular good and his total
consumption) has always been a central issue in applied
microeconometrics. The necessary budget data, disaggregated to household
level, are often taken from household surveys such as the FES. In
"Zero Expenditures and the Estimation of Engel Curves", Michael
Keen (Essex) stressed the short duration of the interview period in such
household surveys. For the FES, for instance, this period covers only
two weeks. Data from such surveys may be prone to serious
"measurement error", since a household may not happen to purchase
certain (more durable) commodities during the relatively short interview
period. In this context there is a crucial distinction between the
purchase of a good and its "consumption".
We know that zero observations may reflect either the relative
infrequency of purchase, a genuine lack of consumption, or systematic
under-reporting. In view of the prevalence of the zeros, Keen argued
that the second explanation was implausible. Moreover, under-reporting
in the FES data seems to be confined to certain goods such as tobacco
and alcohol. Therefore, Keen assumed in his further analysis that zero
expenditures arise from the infrequency of purchases. He argued this
enhanced both the clarity and the tractability of the subsequent
analysis. Keen also assumed the probability of purchase for any good was
equal for all consumers and independent of true consumption, and the
Engel curves were assumed to be linear. The restriction that the sum of
a household's consumption over all commodities should equal that
household's total consumption gave rise to a correlation between the
error and the explanatory variables in the model. Conventional
estimation methods are inappropriate in this case, and Keen argued that
they would lead to overestimation of the marginal propensity to consume
goods which were purchased infrequently. Keen proposed an estimator
using as an instrumental variable "normal income", obtained from
the FES, to correct for these difficulties.
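The mechanism can be illustrated with a small simulation. This is a stylized sketch under assumed values, not Keen's estimator or the FES data: the purchase probability, budget share, income distribution and function names are all hypothetical. One good is bought only in some survey periods, and in larger amounts when it is bought, so that recorded spending equals true consumption on average. Because the same recording error enters recorded total expenditure, a least-squares regression overstates the marginal propensity, while an instrument unrelated to purchase timing does not.

```python
import random


def simulate_survey(n, purchase_prob, budget_share, seed):
    """Simulate households with one infrequently purchased good (hypothetical values)."""
    rng = random.Random(seed)
    incomes, totals, spending = [], [], []
    for _ in range(n):
        income = rng.uniform(50.0, 150.0)                 # instrument: "normal income"
        consumption = 0.8 * income + rng.gauss(0.0, 5.0)  # true total consumption
        true_good = budget_share * consumption            # true consumption of the good
        # The good is bought with probability purchase_prob; the amount is scaled
        # by 1/purchase_prob so that recorded spending is correct on average.
        bought = rng.random() < purchase_prob
        recorded_good = true_good / purchase_prob if bought else 0.0
        recorded_other = consumption - true_good          # other goods, bought every period
        incomes.append(income)
        totals.append(recorded_other + recorded_good)     # recorded total expenditure
        spending.append(recorded_good)
    return incomes, totals, spending


def covariance(a, b):
    """Sample covariance of two equally long lists."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


def ols_slope(x, y):
    """Least-squares slope of y on x."""
    return covariance(x, y) / covariance(x, x)


def iv_slope(z, x, y):
    """Instrumental-variable slope of y on x, using z as the instrument."""
    return covariance(z, y) / covariance(z, x)


if __name__ == "__main__":
    incomes, totals, spending = simulate_survey(20000, 0.25, 0.2, seed=1)
    print("True marginal propensity: 0.2")
    print("Least-squares estimate:  ", ols_slope(totals, spending))
    print("Instrumental estimate:   ", iv_slope(incomes, totals, spending))
```

In this sketch the instrument works because income is related to true total consumption but not to whether the household happened to buy the good during the survey period.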
Is this procedure appropriate for the FES data? Keen used data on 195
one-parent families (with fewer than two working members) from the 1977
FES and found that the predicted overestimation with the conventional
methods did in fact occur. The large differences between the
conventional estimator and the one used by Keen suggest that the
measurement errors have seriously contaminated these data. However, two
of Keen's initial hypotheses did not seem to be confirmed by the data:
the independence between the probability of purchase and the level of
consumption was rejected for three commodity groups, and the hypothesis
of linear Engel curves seemed dubious for most goods. Nevertheless,
Keen's empirical results clearly demonstrated the importance of the
measurement error problem in the FES data.
In his presentation, Keen noted the simplicity and tractability of his
approach, even for the estimation of complete demand systems. He
remarked that Gomulka's work dealt mainly with explaining the demand of
"potential" consumers by differences in preferences, whereas his
own paper stressed the distinction between purchase and consumption.
The results of the papers suggested that both of these possible causes
for zero observations can, for some commodities, seriously influence
empirical results and the policy conclusions drawn from them. There was
a need to combine treatment of both problems in a more comprehensive
framework. In view of the rapidly growing use of highly disaggregated
survey data, which typically contain zero observations, this topic is
likely to inspire further research.