Multiple Linked Tensor Factorization

Journal of Computational and Graphical Statistics · 2026
Cited by 0 · Top 5% among papers in the same journal and year
ABS 3

Summary

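Proposes a multiple linked tensor factorization method that extends the CP decomposition to simultaneously reduce the dimension of multiple multi-way arrays and approximate their underlying signals. The method automatically identifies shared and individual components, supports imputation of missing data, and is suited to integrative analysis of multi-source, multi-omics data.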

Abstract

In biomedical research and other fields, it is now common to generate high-content data that are both multi-source and multi-way. Multi-source data are collected from different technologies while multi-way data are collected over multiple dimensions, yielding multiple tensor arrays. Integrative analysis of these data is needed, e.g., to capture and synthesize different facets of complex biological systems. However, despite growing interest in multi-source and multi-way factorization techniques, methods that handle data that are both multi-source and multi-way are limited. We propose Multiple Linked Tensors Factorization (MULTIFAC), extending the CANDECOMP/PARAFAC (CP) decomposition to simultaneously reduce the dimension of multiple multi-way arrays and approximate underlying signals. We first introduce a CP factorization with L2 penalties on the latent factors, leading to rank sparsity. When extended to multiple linked tensors, this automatically reveals latent components that are shared across tensors or individual to each tensor. We also extend the decomposition algorithm to its expectation–maximization (EM) version to handle incomplete data with imputation. Extensive simulation studies demonstrate MULTIFAC’s ability to approximate underlying signals, identify shared and unshared structures, and impute missing data. The approach yields an interpretable decomposition on multi-way multi-omics data for a study on early-life iron deficiency.
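To make the ingredients named in the abstract concrete, the following is a minimal sketch of a penalized CP factorization of a single three-way array with EM-style imputation. It is not the authors' MULTIFAC algorithm. It assumes a squared L2 (ridge) penalty on each factor matrix, fits by alternating least squares, and handles one tensor only, so it shows neither the rank-sparsity behavior nor the shared/individual structure across linked tensors. The names `penalized_cp_em`, `khatri_rao`, `unfold`, and the parameters `rank`, `lam`, and `n_iter` are hypothetical and chosen for illustration.

```python
import numpy as np


def khatri_rao(A, B):
    """Column-wise Kronecker product; row index is i_A * B.shape[0] + i_B."""
    r = A.shape[1]
    return (A[:, None, :] * B[None, :, :]).reshape(-1, r)


def unfold(X, mode):
    """Mode-n unfolding, remaining axes flattened in C order."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)


def penalized_cp_em(X, rank, lam=0.1, n_iter=200, seed=0):
    """Ridge-penalized CP via ALS for a 3-way array; NaN marks missing entries.

    Assumption: squared L2 penalty lam * ||U||_F^2 on each factor matrix.
    Requires n_iter >= 1.
    """
    rng = np.random.default_rng(seed)
    mask = ~np.isnan(X)
    # Crude starting imputation: fill missing entries with the observed mean.
    X_filled = np.where(mask, X, np.nanmean(X))
    factors = [rng.standard_normal((n, rank)) for n in X.shape]

    for _ in range(n_iter):
        # M-step-like update: one ALS sweep on the completed tensor.
        for mode in range(3):
            U, V = [factors[m] for m in range(3) if m != mode]
            K = khatri_rao(U, V)                  # matches unfold() ordering
            G = (U.T @ U) * (V.T @ V)             # equals K.T @ K
            rhs = unfold(X_filled, mode) @ K
            # Ridge solution: rhs @ inv(G + lam * I); G is symmetric.
            factors[mode] = np.linalg.solve(G + lam * np.eye(rank), rhs.T).T
        # E-step-like update: replace missing entries with current fitted values.
        X_hat = np.einsum('ir,jr,kr->ijk', *factors)
        X_filled = np.where(mask, X, X_hat)

    return factors, X_hat
```

Calling `penalized_cp_em(X, rank=2, lam=1e-3)` on an array `X` with `np.nan` in the missing positions returns the three factor matrices and the fitted tensor, whose values at the missing positions serve as imputations. Extending the sketch to several tensors that share a mode would require a common factor matrix for that mode, which is left out here.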

Biomedicine · Multi-source data analysis · Tensor factorization · Multi-omics integration