论文标题
更简单的格拉曼尼亚优化
Simpler Grassmannian optimization
论文作者
论文摘要
Grassmannian $ \ operatorName {gr}(k,n)$有两种广泛使用的模型,作为正交矩阵$ \ operatoTorname {o}(o}(n)/(\ propatatorNAME {\ propatatorName {o}(o}(k)\ times \ times \ times \ opervients \ opervients \ operviftion \ opervients \ k k o} $, $ \ {p \ in \ mathbb {r}^{n \ times n}:p^{\ mathsf {t}} = p = p = p^2,\; \ operatatorName {tr}(p)= k \} $。前者是多种优化的标准,具有给出数值稳定算法的优点,但必须使用等效矩阵类别的缺点。后者在编码理论和概率中广泛使用,具有使用实际矩阵(而不是等价类别)的优点,但是使用投影矩阵在数值上是不稳定的。我们提出了一种既有优势又遭受任何缺点的替代方案。通过表示$ k $维的子空间为对称的正交矩阵,$ 2K-N $,我们获得\ [\ [\ propatatorName {gr}(k,n)\ cong \ cong \ cong \ {q \ in \ in \ operatatOrnAme {o propatatorname {o} \ operatatorName {tr}(q)= 2k -n \}。 \]与其他两个模型一样,我们表明差分几何对象和操作 - 切线矢量,度量,正常矢量,指数图,地球,平行传输,梯度,Hessian等)具有可计算使用标准数值线性代数的封闭形式的分析表达式。 In the proposed model, these expressions are considerably simpler, a result of representing $\operatorname{Gr}(k,n)$ as a linear section of a compact matrix Lie group $\operatorname{O}(n)$, and can be computed with at most one QR decomposition and one exponential of a special skew-symmetric matrix that takes only $O(nk(n-k))$ time.特别是,我们完全避免在最陡峭的下降,结合梯度,准牛顿和牛顿的牛顿方法中的特征和奇异价值分解。
There are two widely used models for the Grassmannian $\operatorname{Gr}(k,n)$, as the set of equivalence classes of orthogonal matrices $\operatorname{O}(n)/(\operatorname{O}(k) \times \operatorname{O}(n-k))$, and as the set of trace-$k$ projection matrices $\{P \in \mathbb{R}^{n \times n} : P^{\mathsf{T}} = P = P^2,\; \operatorname{tr}(P) = k\}$. The former, standard in manifold optimization, has the advantage of giving numerically stable algorithms but the disadvantage of having to work with equivalence classes of matrices. The latter, widely used in coding theory and probability, has the advantage of using actual matrices (as opposed to equivalence classes) but working with projection matrices is numerically unstable. We present an alternative that has both advantages and suffers from neither of the disadvantages; by representing $k$-dimensional subspaces as symmetric orthogonal matrices of trace $2k-n$, we obtain \[ \operatorname{Gr}(k,n) \cong \{Q \in \operatorname{O}(n) : Q^{\mathsf{T}} = Q, \; \operatorname{tr}(Q) = 2k -n\}. \] As with the other two models, we show that differential geometric objects and operations -- tangent vector, metric, normal vector, exponential map, geodesic, parallel transport, gradient, Hessian, etc -- have closed-form analytic expressions that are computable with standard numerical linear algebra. In the proposed model, these expressions are considerably simpler, a result of representing $\operatorname{Gr}(k,n)$ as a linear section of a compact matrix Lie group $\operatorname{O}(n)$, and can be computed with at most one QR decomposition and one exponential of a special skew-symmetric matrix that takes only $O(nk(n-k))$ time. In particular, we completely avoid eigen- and singular value decompositions in our steepest descent, conjugate gradient, quasi-Newton, and Newton methods for the Grassmannian.