Machine Learning for Demand Estimation in Long Tail Markets
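To address the estimation bias caused by zero-sales products in long tail markets, the paper proposes a two-stage estimator: deep learning first predicts market shares, and a reweighting step then corrects the bias, yielding consistent and causally interpretable parameter estimates for pricing and assortment decisions.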
Random coefficient multinomial logit models are widely used to estimate customer preferences from sales data. However, these models can accommodate only products with positive sales, and the resulting selection leads to highly biased estimates in long tail markets, that is, markets where many products have zero or low sales. Such markets are increasingly common in online retail and other online marketplaces. In this paper, we propose a two-stage estimator that uses machine learning to correct for this bias. In the first stage, our method uses deep learning to predict the market shares of all products, with a neural network whose structure mirrors the data generating process of the random coefficient multinomial logit model. In the second stage, we use the first-stage predictions to reweight the observed shares in a way that corrects for the induced bias and maintains the causal interpretation of the structural model. We show that the estimated parameters are consistent as the number of markets grows. Our method performs well on simulated and real long tail data, producing accurate estimates of customer behavior. These improved estimates can subsequently be used to provide prescriptive policy recommendations on important managerial decisions such as pricing and assortment.

This paper was accepted by Gabriel Weintraub, revenue management and market analytics.

Supplemental Material: The data files are available at https://doi.org/10.1287/mnsc.2023.4893.
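The selection problem that motivates the estimator can be illustrated with a small simulation. The sketch below is not the paper's two-stage method. It assumes a plain multinomial logit (no random coefficients) with a single product characteristic, and it estimates the slope with the standard inversion log(s_j) - log(s_0), which is undefined for zero shares and therefore forces zero-sales products out of the sample. All function names, parameter values, and market sizes are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_market_shares(n_markets, n_products, n_customers, alpha, beta, rng):
    """Simulate plain multinomial logit markets with an outside option."""
    x = rng.normal(size=(n_markets, n_products))      # one product characteristic
    exp_u = np.exp(alpha + beta * x)                  # outside option utility is 0
    denom = 1.0 + exp_u.sum(axis=1, keepdims=True)
    # Column 0 is the outside option, columns 1..J are the inside products.
    probs = np.concatenate([1.0 / denom, exp_u / denom], axis=1)
    # Observed shares come from a finite number of customers per market.
    counts = np.stack([rng.multinomial(n_customers, p) for p in probs])
    return x, probs, counts / n_customers


def logit_inversion_slope(x, shares):
    """OLS slope of log(s_j) - log(s_0) on x, keeping only positive shares.

    Returns the slope and the fraction of product-market pairs dropped.
    """
    outside = np.broadcast_to(shares[:, :1], x.shape)
    inside = shares[:, 1:]
    keep = (inside > 0) & (outside > 0)               # log is undefined at zero
    y = np.log(inside[keep]) - np.log(outside[keep])
    design = np.column_stack([np.ones(keep.sum()), x[keep]])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1], 1.0 - keep.mean()


# Hypothetical long tail setting: many products, few customers per market.
rng = np.random.default_rng(0)
x, probs, shares = simulate_market_shares(
    n_markets=200, n_products=50, n_customers=200, alpha=-4.0, beta=1.0, rng=rng
)
slope_exact, _ = logit_inversion_slope(x, probs)      # noise-free benchmark
slope_selected, dropped = logit_inversion_slope(x, shares)
print(f"slope from true choice probabilities: {slope_exact:.3f}")
print(f"slope from observed positive shares:  {slope_selected:.3f}")
print(f"fraction of products dropped:         {dropped:.3f}")
```

With the true choice probabilities the inversion recovers the slope, whereas with observed shares the low-appeal products enter the regression only in markets where they happened to sell, which pulls the estimated slope toward zero. This is the bias that the reweighting stage described in the abstract is meant to correct.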