Discussion paper

DP18507 Demand Estimation with Text and Image Data

We propose a demand estimation method that allows researchers to estimate substitution patterns from unstructured image and text data. We first employ a series of machine learning models to measure product similarity from products' images and textual descriptions. We then estimate a nested logit model with product-pair specific nesting parameters that depend on the image and text similarities between products. Our framework does not require collecting product attributes for each category and can capture product similarity along dimensions that are hard to account for with observed attributes. We apply our method to a dataset describing the behavior of Amazon shoppers across several categories and show that incorporating texts and images in demand estimation helps us recover a flexible cross-price elasticity matrix.
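As a rough illustration of the similarity-measurement step, the sketch below computes pairwise similarities between products from embedding vectors. It rests on assumptions the abstract does not state: that each product's image or text has already been mapped to a numeric embedding, and that cosine similarity is the similarity measure. The function names and the toy numbers are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def similarity_matrix(embeddings):
    """Pairwise cosine similarities for a list of embedding vectors."""
    return [[cosine_similarity(u, v) for v in embeddings] for u in embeddings]


# Hypothetical 3-dimensional embeddings for three products (made-up numbers;
# real image or text embeddings would have many more dimensions).
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

# sim[j][k] is the assumed similarity between products j and k.
sim = similarity_matrix(toy_embeddings)
```

In a model of the kind the abstract describes, a matrix like `sim` would enter as an input to the product-pair specific nesting parameters. The functional form linking similarity to those parameters is not given in the abstract and is not sketched here.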


Compiani, G, I Morozov and S Seiler (2023), ‘DP18507 Demand Estimation with Text and Image Data’, CEPR Discussion Paper No. 18507. CEPR Press, Paris & London. https://cepr.org/publications/dp18507