Leveraged Matrix Completion With Noise
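This work studies the problem of completing a low-rank matrix from partial observations. It uses leverage scores to design nonuniform sampling, achieves the same theoretical guarantees as uniform sampling under more relaxed assumptions, and allows the observed data to contain a small amount of noise.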
Completing low-rank matrices from subsampled measurements has received much attention in the past decade. Existing works indicate that $\mathcal{O}(nr\log^{2}(n))$ samples are required to theoretically guarantee the completion of an $n \times n$ noisy matrix of rank $r$ with high probability, under two quite restrictive assumptions: 1) the underlying matrix must be incoherent and 2) the observations follow the uniform distribution. This restrictiveness is partially due to ignoring the roles of the leverage score and the oracle information of each element. In this article, we employ leverage scores to characterize the importance of each element and significantly relax the assumptions to: 1) no other structural assumptions are imposed on the underlying low-rank matrix and 2) the observed elements depend appropriately on their importance, as measured by the leverage score. Under these assumptions, instead of uniform sampling, we devise a nonuniform (biased) sampling procedure that can reveal the "importance" of each observed element. Our proofs are supported by a novel approach that phrases sufficient optimality conditions based on the Golfing scheme, which may be of independent interest to the wider community.
Our theoretical findings show that we can provably recover an unknown $n \times n$ matrix of rank $r$ from about $\mathcal{O}(nr\log^{2}(n))$ entries, even when the observed entries are corrupted by a small amount of noise. The empirical results align precisely with our theory.
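
To make the idea of leverage-based sampling concrete, the following is a minimal sketch, not the procedure from this article. It assumes the standard normalized leverage scores $\mu_i = (n/r)\,\|U^{\top} e_i\|_2^2$ and $\nu_j = (n/r)\,\|V^{\top} e_j\|_2^2$ of a rank-$r$ matrix with singular vectors $U$ and $V$, and it assumes an observation probability of the form $p_{ij} = \min\{1,\, c_0 (\mu_i + \nu_j)\, r \log^{2}(n) / n\}$, a form commonly used in the leveraged-sampling literature. The function names and the constant `c0` are hypothetical and chosen only for illustration.

```python
import numpy as np


def leverage_scores(M, r):
    """Normalized row and column leverage scores of the rank-r part of M."""
    n1, n2 = M.shape
    U, _, Vt = np.linalg.svd(M, full_matrices=False)
    U_r = U[:, :r]        # top-r left singular vectors
    V_r = Vt[:r, :].T     # top-r right singular vectors
    mu = (n1 / r) * np.sum(U_r ** 2, axis=1)  # row leverage scores
    nu = (n2 / r) * np.sum(V_r ** 2, axis=1)  # column leverage scores
    return mu, nu


def sampling_probabilities(mu, nu, r, c0=1.0):
    """Assumed rule: p_ij = min(1, c0 * (mu_i + nu_j) * r * log^2(n) / n)."""
    n = max(len(mu), len(nu))
    p = c0 * (mu[:, None] + nu[None, :]) * r * np.log(n) ** 2 / n
    return np.minimum(p, 1.0)


def sample_mask(p, rng):
    """Observe entry (i, j) independently with probability p[i, j]."""
    return rng.random(p.shape) < p


# Illustrative run on a synthetic rank-3 matrix (all values are assumptions).
rng = np.random.default_rng(0)
n, r = 200, 3
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
mu, nu = leverage_scores(M, r)
P = sampling_probabilities(mu, nu, r, c0=0.05)
mask = sample_mask(P, rng)
```

In this sketch, rows and columns with larger leverage scores are observed more often, which is the sense in which the sampling is biased toward "important" elements. Note that the scores are computed here from the full matrix, which plays the role of oracle information; in practice they would have to be estimated or supplied.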