Discussion paper

DP17941 A two sample size estimator for large data sets

In GMM estimators, moment conditions with additive error terms involve an observed component and a predicted component. If the predicted component is computationally costly to evaluate, it may not be feasible to estimate the model with all the available data. We propose an estimator that uses the full data set for the computationally cheap observed component, but a reduced sample size for the predicted component. We show consistency and asymptotic normality, and we derive standard errors and a practical criterion for when our estimator is variance-reducing. We demonstrate the estimator's properties on a range of models through Monte Carlo studies and an empirical application to alcohol demand.
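
The idea of splitting the sample sizes can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy model y = exp(theta * x) + e, the use of x as its own instrument, the function name `two_sample_size_estimate`, and the bisection solver. The observed part of the moment, the mean of x * y, is averaged over all N observations, while the predicted part, the mean of x * exp(theta * x), is averaged over a random subsample of size n.

```python
import math
import random


def two_sample_size_estimate(x, y, subsample_idx, lo=-5.0, hi=5.0, tol=1e-8):
    """Solve (1/N) sum x_i*y_i - (1/n) sum_{i in S} x_i*exp(theta*x_i) = 0.

    Hypothetical toy example: one parameter, one moment condition,
    solved by bisection on the bracket [lo, hi].
    """
    # Observed component: cheap to compute, so use the full data set.
    observed = sum(xi * yi for xi, yi in zip(x, y)) / len(x)

    # Predicted component: treated as costly, so use only the subsample.
    xs = [x[i] for i in subsample_idx]

    def moment(theta):
        predicted = sum(xi * math.exp(theta * xi) for xi in xs) / len(xs)
        return observed - predicted

    # With x > 0 the moment is strictly decreasing in theta,
    # so bisection converges to the unique root inside the bracket.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if moment(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


random.seed(0)
N, n, theta0 = 50_000, 5_000, 1.0  # full and reduced sample sizes (assumed)
x = [random.uniform(0.1, 1.0) for _ in range(N)]
y = [math.exp(theta0 * xi) + random.gauss(0.0, 0.5) for xi in x]
subsample = random.sample(range(N), n)

theta_hat = two_sample_size_estimate(x, y, subsample)
print(f"theta_hat = {theta_hat:.4f} (true value {theta0})")
```

In this sketch each evaluation of the moment during the search touches only n observations, while the observed average is computed once on all N. The sketch does not compute standard errors or the variance-reduction criterion described in the abstract.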


O'Connell, M, H Smith and O Thomassen (2023), ‘DP17941 A two sample size estimator for large data sets’, CEPR Discussion Paper No. 17941. CEPR Press, Paris & London. https://cepr.org/publications/dp17941