Enhancing Generalization in Evolutionary Feature Construction for Symbolic Regression Through Vicinal Jensen Gap Minimization

IEEE Transactions on Evolutionary Computation · 2025
Cited by 1
ABS 4

Summary

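This study proposes an evolutionary feature construction framework that controls overfitting by jointly optimizing empirical risk and the vicinal Jensen gap. Its effectiveness is validated on 58 datasets, where it outperforms other complexity measures and 15 machine learning algorithms.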

Abstract

Genetic programming-based feature construction has achieved significant success in recent years as an automated machine learning technique to enhance learning performance. However, overfitting remains a challenge that limits its broader applicability. To improve generalization, we prove that vicinal risk, estimated through noise perturbation or mixup-based data augmentation, is bounded by the sum of empirical risk and a regularization term: either finite difference or the vicinal Jensen gap. Leveraging this decomposition, we propose an evolutionary feature construction framework that jointly optimizes empirical risk and the vicinal Jensen gap to control overfitting. Since datasets may vary in noise levels, we develop a noise estimation strategy to dynamically adjust regularization strength. Furthermore, to mitigate manifold intrusion, where data augmentation may generate unrealistic samples that fall outside the data manifold, we propose a manifold intrusion detection mechanism. Experimental results on 58 datasets demonstrate the effectiveness of Jensen gap minimization compared to other complexity measures. Comparisons with 15 machine learning algorithms further indicate that genetic programming with the proposed overfitting control strategy achieves superior performance.
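
The decomposition mentioned in the abstract can be illustrated with a small numerical sketch. This is not the paper's implementation, and the paper's exact definition of the vicinal Jensen gap may differ. The sketch assumes squared loss, mixup with a fixed mixing weight `lam`, and a gap defined as the squared difference between the prediction at a mixed input and the mix of the two predictions. The function name `mixup_terms` and all parameters are hypothetical. Under these assumptions, the elementary inequality (a + b)^2 <= 2a^2 + 2b^2 together with the convexity of the square gives: vicinal risk <= 2 * empirical risk + 2 * gap.

```python
import numpy as np

def mixup_terms(f, X, y, lam, rng):
    """Return (vicinal_risk, empirical_risk, gap) under squared loss.

    Assumptions (illustrative, not taken from the paper):
    - each sample is paired with a partner through a random permutation,
    - a single fixed mixing weight lam in [0, 1] is used,
    - the gap is the mean squared Jensen gap of f over the mixed pairs.
    """
    j = rng.permutation(len(X))               # partner index for each sample
    x_mix = lam * X + (1.0 - lam) * X[j]      # mixup inputs
    y_mix = lam * y + (1.0 - lam) * y[j]      # mixup targets
    f_i, f_j, f_mix = f(X), f(X[j]), f(x_mix)
    f_interp = lam * f_i + (1.0 - lam) * f_j  # mix of the two predictions
    vicinal_risk = np.mean((f_mix - y_mix) ** 2)
    empirical_risk = np.mean((f_i - y) ** 2)
    gap = np.mean((f_mix - f_interp) ** 2)    # zero when f is linear
    return vicinal_risk, empirical_risk, gap

# Usage: a linear model has no gap, a strongly nonlinear one does.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)
linear = lambda A: A @ np.array([1.0, -2.0, 0.5])
wiggly = lambda A: np.sin(5.0 * A[:, 0]) * A[:, 1] ** 2
for name, model in (("linear", linear), ("wiggly", wiggly)):
    v, e, g = mixup_terms(model, X, y, 0.3, np.random.default_rng(1))
    print(name, v, e, g, v <= 2 * e + 2 * g)
```

In this reading, the gap measures how far a constructed model departs from linear behavior between training points, which is why penalizing it alongside the empirical risk acts as a regularizer.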

Symbolic regression · Evolutionary computation · Feature construction · Machine learning · Generalization