BlindSMOTE: Synthetic Minority Oversampling Based Only on Evolutionary Computation
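Summary: Proposes BlindSMOTE, a synthetic minority oversampling method based solely on evolutionary computation. It places no restrictions on how new samples are generated, incorporates majority class undersampling, and outperforms more than 90 existing strategies on 85 datasets.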
One of the most common problems in data mining applications is the uneven distribution of classes, which appears in many real-world scenarios. The class of interest is often highly underrepresented in the given dataset, which harms the performance of most classifiers. One of the most successful methods for addressing the class imbalance problem is to oversample the minority class using synthetic samples. Since the original algorithm, the synthetic minority oversampling technique (SMOTE), introduced this method, numerous versions have emerged, each of which is based on a specific hypothesis about where and how to generate new synthetic instances. In this paper, we propose a different approach based exclusively on evolutionary computation that imposes no constraints on the creation of new synthetic instances. Majority class undersampling is also incorporated into the evolutionary process. A thorough comparison involving three classification methods, 85 datasets, and more than 90 class-imbalance strategies shows the advantages of our proposal.
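For context, the kind of constraint that SMOTE-style methods place on new instances can be illustrated with a minimal sketch of the classic interpolation step: each synthetic point is drawn on the line segment between a minority sample and one of its k nearest minority neighbours. This shows only the well-known baseline idea, not the evolutionary procedure proposed here; the function name `smote_sample` and its parameters are hypothetical and chosen for illustration.

```python
import math
import random


def smote_sample(minority, k=3, n_new=5, seed=0):
    """Generate n_new synthetic points by SMOTE-style interpolation.

    Illustrative baseline only (hypothetical helper, not the paper's method).
    `minority` is a list of equal-length numeric tuples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        base = rng.choice(minority)
        # Rank the remaining minority samples by Euclidean distance to it.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        # Choose one of the k nearest neighbours at random.
        neighbour = rng.choice(others[:k])
        # Interpolate: base + gap * (neighbour - base), gap uniform in [0, 1).
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


if __name__ == "__main__":
    minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    for point in smote_sample(minority, k=2, n_new=3, seed=1):
        print(point)
```

Every point produced this way lies on a segment joining two existing minority samples. That restriction on where synthetic instances may appear is the type of hypothesis the abstract says the proposed approach does not impose.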