Incremental Multiview Clustering With Continual Information Bottleneck Method

IEEE Transactions on Systems, Man, and Cybernetics: Systems · 2024
Cited by 4
ABS 3

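Summary

The paper proposes a continual information bottleneck method for clustering multiview data whose views arrive incrementally over time. A knowledge library and a shared encoder preserve cross-view consistency and remove redundant information, and the method outperforms 19 baselines on nine benchmark datasets.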

Abstract

Multiview clustering (MVC) provides a natural formulation for generating clusters from multiview data and is fundamental to many industrial tasks, such as autonomous driving, defect detection, and multisensor information fusion, as part of the foundation models. Most existing MVC methods assume that the data of multiple views are available during the clustering process. However, this is a very strong assumption and is impractical when the views arrive incrementally over time. In addition, if existing MVC approaches are applied directly to a clustering setting with incremental views, the massive redundant information in each view might limit knowledge sharing between historical and newly arrived views. To solve these problems, this article presents a continual information bottleneck (CIB) method, which addresses the incremental MVC problem by maximally preserving the consistency of a sequence of views while removing the redundant information in each view. In particular, to facilitate knowledge transfer from historical views to the incoming one, we build a knowledge library that stores representative samples from the historical views. When a new view is added, we first construct a view-specific encoder with information-theoretic constraints to learn a compact and discriminative representation, in which redundant information in the new view is eliminated. Then, to capture the consistency information between the historical views and the new view, a shared encoder is devised after retrieving the global neighbors in the library for the new view; this is done by contrasting the cluster assignments and the feature representations simultaneously. Finally, a unified objective function is devised to simultaneously optimize the knowledge library and the clustering process, in which the knowledge library is updated by maximizing the mutual information between the new view and all historical ones so that it keeps track of knowledge about the earlier views. Extensive experiments on nine multiview benchmarks have verified the superiority of the CIB method over 19 baselines.
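
Two of the steps named in the abstract, retrieving neighbors from a knowledge library and measuring the agreement between cluster assignments, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names `retrieve_neighbors` and `assignment_mutual_information`, the use of cosine similarity, and the plug-in mutual information estimate are all assumptions made here for illustration, and the paper's actual objective may differ.

```python
import numpy as np


def retrieve_neighbors(library, queries, k=1):
    """Indices of the k most similar library rows for each query row.

    Assumption: similarity is cosine similarity between feature vectors.
    """
    lib = library / np.linalg.norm(library, axis=1, keepdims=True)
    qry = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    sim = qry @ lib.T                      # (n_queries, n_library)
    return np.argsort(-sim, axis=1)[:, :k]  # most similar first


def assignment_mutual_information(p_new, p_old, eps=1e-12):
    """Plug-in estimate of I(C_new; C_old) from paired soft assignments.

    p_new, p_old: (n_samples, n_clusters) arrays whose rows sum to 1.
    Row i of p_new and row i of p_old describe the same sample
    (for example, a new-view sample and its retrieved library neighbor).
    """
    joint = p_new.T @ p_old / p_new.shape[0]   # joint over cluster pairs
    marg_new = joint.sum(axis=1, keepdims=True)
    marg_old = joint.sum(axis=0, keepdims=True)
    log_ratio = np.log(joint + eps) - np.log(marg_new + eps) - np.log(marg_old + eps)
    return float(np.sum(joint * log_ratio))


# Hypothetical toy data: a two-item library and two new-view samples.
library = np.array([[1.0, 0.0], [0.0, 1.0]])
queries = np.array([[0.9, 0.1], [0.2, 0.8]])
neighbor_idx = retrieve_neighbors(library, queries, k=1)

# Identical balanced one-hot assignments share log(2) nats of information.
agree = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
mi_agree = assignment_mutual_information(agree, agree)

# Statistically independent assignments share (almost) none.
other = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
mi_indep = assignment_mutual_information(agree, other)
```

In this toy setting a higher value of the estimate means the cluster assignments of the new view and of its retrieved neighbors agree more, which is the kind of consistency the abstract says the method tries to preserve; maximizing such a quantity during training would require a differentiable framework rather than plain NumPy.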

Multiview clustering · Incremental learning · Information bottleneck · Data mining · Artificial intelligence