Finite Mixtures of Multivariate Contaminated Normal Censored Regression Models
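To handle data that contain outliers and censoring (some observations are only partially observed because they are restricted), a new finite mixture model is proposed together with a parameter estimation algorithm; it can be used to recover censored data and detect outliers.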
Model-based clustering becomes more complex as outliers become more prevalent, and the difficulty is compounded when measurements are censored by limits of detection or quantification. This paper introduces a finite mixture of multivariate contaminated normal (FM-MCN) distributions tailored to censored data, referred to hereafter as the FM-MCNC model. The FM-MCNC model is then extended to the multivariate linear regression problem, yielding the FM-MCN censored regression (FM-MCNCR) model. To estimate the model parameters, we devise an analytically tractable alternating expectation conditional maximization (AECM) algorithm. We also present an information matrix-based formula for approximating the asymptotic standard errors of the parameter estimates. Importantly, the AECM algorithm serves a dual role: upon convergence it not only yields the parameter estimates but also, as a by-product, recovers censored measurements and detects outlying data points. The efficacy and advantages of the proposed methodology are illustrated through a series of simulations and two real-life data examples.
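As a rough illustration of how a contaminated normal component can flag outliers, the sketch below evaluates a single, fully observed multivariate contaminated normal density and the posterior probability that a point is "good". It assumes the common parametrization in which `alpha` is the proportion of good points and `eta > 1` inflates the covariance of the contaminating part; the paper's own notation may differ. Censoring, the mixture over components, and the regression structure are all left out, and the function names are hypothetical.

```python
import numpy as np

def mvn_logpdf(y, mu, sigma):
    """Log-density of a multivariate normal N(mu, sigma) at one point y."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    chol = np.linalg.cholesky(np.asarray(sigma, dtype=float))
    z = np.linalg.solve(chol, y - mu)             # whitened residual
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))  # log|sigma|
    p = y.shape[0]
    return -0.5 * (p * np.log(2.0 * np.pi) + log_det + z @ z)

def mcn_parts(y, mu, sigma, alpha, eta):
    """Weighted densities of the 'good' and 'bad' parts (assumed parametrization)."""
    sigma = np.asarray(sigma, dtype=float)
    good = alpha * np.exp(mvn_logpdf(y, mu, sigma))
    bad = (1.0 - alpha) * np.exp(mvn_logpdf(y, mu, eta * sigma))
    return good, bad

def mcn_pdf(y, mu, sigma, alpha, eta):
    """Contaminated normal density: same mean, covariance inflated by eta for outliers."""
    good, bad = mcn_parts(y, mu, sigma, alpha, eta)
    return good + bad

def prob_good(y, mu, sigma, alpha, eta):
    """Posterior probability that y comes from the non-inflated ('good') part."""
    good, bad = mcn_parts(y, mu, sigma, alpha, eta)
    return good / (good + bad)

# Illustrative values only, not taken from the paper.
mu = np.zeros(2)
sigma = np.eye(2)
alpha, eta = 0.9, 10.0
p_center = prob_good([0.0, 0.0], mu, sigma, alpha, eta)
p_far = prob_good([6.0, 6.0], mu, sigma, alpha, eta)
```

Under this sketch, a point would be flagged as an outlier when its probability of being good falls below 0.5; a fitted model would compute the analogous quantity per mixture component from the converged estimates.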