Enhancing house price forecasting with hybrid data-driven feature extraction model
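A VMD-RF feature extraction model is proposed that combines spatial dependency and contextual factors to generate a unified housing market health index, achieving an R² of 0.97 on the Selangor house price dataset and providing more accurate house price forecasts for homebuyers, investors, and policymakers.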
Accurate house price estimation is crucial for stakeholders such as homeowners, investors, and policymakers. While current models excel at capturing spatial dependencies, they often overlook the influence of contextual factors such as macroeconomic conditions and property-specific attributes. To address this gap, the Variational Mode Decomposition and Random Forest (VMD-RF) feature extraction model is proposed. It considers both spatial dependency factors (i.e., area, Gross Domestic Product (GDP), Consumer Price Index (CPI), and population) and contextual factors (i.e., tenure, age, and interest rate) in generating a unified housing market health index. Specifically, VMD decomposes each feature into intrinsic mode functions (IMFs), revealing underlying data patterns. The IMFs from all features are concatenated, and Random Forest performs dimensionality reduction, consolidating them into a single feature. This feature improves the model’s predictive power, as evidenced by a higher Spearman’s correlation coefficient (0.9652) with house prices. The contribution of VMD-RF lies in its enhanced prediction accuracy within a simpler machine learning framework, avoiding the complexity of ensemble methods. Validated on the Selangor house price dataset, both VMD-RF-SVR and VMD-RF-LSTM achieve competitive coefficients of determination (R²) of 0.97. Furthermore, the VMD-RF-SVR model outperforms VMD-RF-LSTM, as evidenced by the lower median of its residual distribution.
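The two headline metrics above, Spearman's correlation coefficient and the coefficient of determination (R²), can be illustrated with a minimal pure-Python sketch. This is not the authors' evaluation code: the function names (`average_ranks`, `pearson`, `spearman`, `r_squared`) and the toy values in the usage lines are assumptions made for illustration and do not come from the Selangor dataset.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Toy values (hypothetical, for illustration only).
health_index = [0.2, 0.5, 0.4, 0.9, 0.7]
house_price = [310.0, 420.0, 400.0, 650.0, 530.0]
predicted = [300.0, 430.0, 410.0, 640.0, 520.0]

rho = spearman(health_index, house_price)
r2 = r_squared(house_price, predicted)
```

Spearman's coefficient depends only on rank order, so it rewards a feature that moves monotonically with house prices even when the relationship is nonlinear. R² instead measures how much of the price variance the fitted model's predictions account for.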