
Online Multiview Representation Learning: Dropping Convexity for Better Efficiency

Abstract

Multiview representation learning is widely used for latent factor analysis. It arises naturally in many data analysis, machine learning, and information retrieval applications to model dependent structure between a pair of data matrices. For computational convenience, existing approaches usually formulate multiview representation learning as a convex optimization problem, whose global optimum can be obtained by certain algorithms in polynomial time. However, ample empirical evidence shows that heuristic nonconvex approaches also achieve good computational performance and convergence to the global optima, although theoretical justification has been lacking. This gap between theory and practice motivates us to study a nonconvex formulation of multiview representation learning, which can be solved efficiently by two stochastic gradient descent (SGD) methods. Theoretically, by analyzing the dynamics of the algorithms through diffusion processes, we establish global rates of convergence to the global optima with high probability. Numerical experiments are provided to support our theory.
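To make the online, nonconvex setting concrete, the following is a minimal sketch (not the paper's exact algorithm) of projected SGD for a CCA-style bilinear multiview objective, max over unit-norm u, v of E[(u^T x)(v^T y)], updating the two view directions from one sample pair per step. The step size, constraint handling, and the toy data stream are illustrative assumptions.

```python
import numpy as np

def online_multiview_sgd(stream, dim_x, dim_y, eta=0.01, n_steps=10_000, seed=0):
    """Projected SGD for a nonconvex, CCA-style multiview objective (sketch)."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(dim_x); u /= np.linalg.norm(u)
    v = rng.standard_normal(dim_y); v /= np.linalg.norm(v)
    for _, (x, y) in zip(range(n_steps), stream):
        # Stochastic gradients of the bilinear objective (u^T x)(y^T v).
        grad_u = (v @ y) * x
        grad_v = (u @ x) * y
        u = u + eta * grad_u
        v = v + eta * grad_v
        # Project back onto the unit sphere (the nonconvex constraint set).
        u /= np.linalg.norm(u)
        v /= np.linalg.norm(v)
    return u, v

def toy_stream(d=20, seed=1):
    """Two correlated views sharing a one-dimensional latent factor."""
    rng = np.random.default_rng(seed)
    a, b = rng.standard_normal(d), rng.standard_normal(d)
    while True:
        z = rng.standard_normal()
        yield (a * z + 0.1 * rng.standard_normal(d),
               b * z + 0.1 * rng.standard_normal(d))

u_hat, v_hat = online_multiview_sgd(toy_stream(), dim_x=20, dim_y=20)
```

With data of this form, the recovered directions align (up to sign) with the latent loading vectors a and b; the paper's analysis concerns the global convergence rate of such stochastic updates despite the nonconvex constraint set.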
