Discretizing Unobserved Heterogeneity
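Summary: For settings where population heterogeneity is not discrete, the paper studies two-step grouped fixed-effects estimators that first classify individuals into groups by k-means clustering and then estimate group-specific heterogeneity. It derives their asymptotic properties and proposes a data-driven rule for choosing the number of groups.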
We study discrete panel data methods where unobserved heterogeneity is revealed in a first step, in environments where population heterogeneity is not discrete. We focus on two-step grouped fixed-effects (GFE) estimators, where individuals are first classified into groups using k-means clustering, and the model is then estimated allowing for group-specific heterogeneity. Our framework relies on two key properties: heterogeneity is a function (possibly nonlinear and time-varying) of a low-dimensional continuous latent type, and informative moments are available for classification. We illustrate the method in a model of wages and labor market participation, and in a probit model with time-varying heterogeneity. We derive asymptotic expansions of two-step GFE estimators as the number of groups grows with the two dimensions of the panel. We propose a data-driven rule for the number of groups, and discuss bias reduction and inference.
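To make the two-step procedure concrete, the following is a minimal sketch on simulated data, not the implementation used in the paper. Everything specific in it is an assumption made for illustration: the linear model y_it = beta * x_it + alpha_i + u_it, a scalar and time-invariant alpha_i equal to the latent type, the choice of individual time averages of y and x as classification moments, a fixed number of groups K (the data-driven rule is not reproduced), and the helper names simulate_panel, kmeans, and slope_with_groups. The first step clusters individuals on their moments with a plain Lloyd-type k-means; the second step estimates beta with group-specific intercepts.

```python
import random


def simulate_panel(n, t, beta, rng):
    """Simulate y_it = beta * x_it + alpha_i + u_it (illustrative design)."""
    xs, ys = [], []
    for _ in range(n):
        xi = rng.gauss(0.0, 1.0)   # continuous latent type (assumed scalar)
        alpha = xi                  # heterogeneity as a function of the type
        x_i = [xi + rng.gauss(0.0, 1.0) for _ in range(t)]  # x correlated with type
        y_i = [beta * x + alpha + rng.gauss(0.0, 0.5) for x in x_i]
        xs.append(x_i)
        ys.append(y_i)
    return xs, ys


def kmeans(points, k, rng, n_iter=50):
    """Plain Lloyd iterations; returns one group label per point."""
    centers = rng.sample(points, k)  # initialise at k distinct data points
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest center in squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda g: sum((p - c) ** 2 for p, c in zip(pt, centers[g])))
            for pt in points
        ]
        # Update step: each center becomes the mean of its members.
        for g in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == g]
            if members:  # an empty group keeps its previous center
                centers[g] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def slope_with_groups(xs, ys, labels):
    """OLS slope of y on x with group-specific intercepts (within-group demeaning)."""
    sums = {}
    for x_i, y_i, g in zip(xs, ys, labels):
        sx, sy, cnt = sums.get(g, (0.0, 0.0, 0))
        sums[g] = (sx + sum(x_i), sy + sum(y_i), cnt + len(x_i))
    num = den = 0.0
    for x_i, y_i, g in zip(xs, ys, labels):
        sx, sy, cnt = sums[g]
        mx, my = sx / cnt, sy / cnt
        for x, y in zip(x_i, y_i):
            num += (x - mx) * (y - my)
            den += (x - mx) ** 2
    return num / den


rng = random.Random(0)
N, T, K, BETA = 200, 10, 10, 1.0  # K is fixed by hand here, not data-driven

xs, ys = simulate_panel(N, T, BETA, rng)

# Step 1: classify individuals on moments (time averages of y and x).
moments = [(sum(y_i) / T, sum(x_i) / T) for x_i, y_i in zip(xs, ys)]
labels = kmeans(moments, K, rng)

# Step 2: estimate the slope allowing for group-specific intercepts.
beta_gfe = slope_with_groups(xs, ys, labels)
# Benchmark: pooled OLS that ignores heterogeneity (a single group).
beta_pooled = slope_with_groups(xs, ys, [0] * N)

print(f"true beta = {BETA}, pooled OLS = {beta_pooled:.3f}, two-step GFE = {beta_gfe:.3f}")
```

In this design the regressor is correlated with the latent type, so pooled OLS is biased, while grouping on the moments absorbs most of the heterogeneity with only K intercepts. Because the groups only approximate a continuous type, some approximation error remains; this is the kind of bias that the asymptotic expansions and the choice of the number of groups address.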