Provable Non-convex Optimization for Learning Parametric Models
Author: Kai Zhong (Ph.D.)
Publisher:
Published: 2018
Total Pages: 866
ISBN-13:
Non-convex optimization plays an important role in recent advances in machine learning. Many machine learning tasks are performed by solving a non-convex optimization problem, which is in general NP-hard. Heuristics, such as stochastic gradient descent, are employed to solve non-convex problems and work decently well in practice despite the lack of general theoretical guarantees. In this thesis, we study a series of non-convex optimization strategies and prove that they lead to the globally optimal solution for several machine learning problems, including mixed linear regression, one-hidden-layer (convolutional) neural networks, non-linear inductive matrix completion, and low-rank matrix sensing. At a high level, we show that the non-convex objectives formulated for the above problems have a large basin of attraction around the global optima when the data have benign statistical properties. Therefore, local search heuristics, such as gradient descent or alternating minimization, are guaranteed to converge to the global optima if initialized properly. Furthermore, we show that spectral methods can efficiently initialize the parameters so that they fall into the basin of attraction. Experiments on synthetic datasets and real applications support our theoretical analyses and illustrate the superiority of the proposed methods.
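The recipe the abstract describes is a two-stage pipeline: a spectral method produces an initialization inside the basin of attraction, and a local search heuristic such as gradient descent then converges to a global optimum. As a minimal sketch of that pipeline for one of the listed problems, low-rank matrix sensing, the Python below recovers a rank-r PSD matrix from random linear measurements. The Gaussian measurement model, problem sizes, step size, and iteration count are illustrative assumptions, not details taken from the thesis.

import numpy as np

# Linear measurements y_i = <A_i, M>.
def sense(M, A):
    return np.array([np.sum(Ai * M) for Ai in A])

# Stage 1: spectral initialization. For i.i.d. Gaussian A_i, the surrogate
# (1/m) * sum_i y_i * A_i concentrates around the ground truth M*, so its
# top-r eigenspace gives a factor U0 close to the basin of attraction.
def spectral_init(y, A, r):
    m = len(A)
    S = sum(yi * Ai for yi, Ai in zip(y, A)) / m
    S = (S + S.T) / 2
    vals, vecs = np.linalg.eigh(S)
    top = np.argsort(vals)[::-1][:r]
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))

# Stage 2: local search. Plain gradient descent on the factored objective
# f(U) = (1/2m) * sum_i (<A_i, U U^T> - y_i)^2.
def gradient_descent(y, A, U, steps=500, lr=0.01):
    m = len(A)
    for _ in range(steps):
        res = sense(U @ U.T, A) - y
        grad = sum(ri * (Ai + Ai.T) @ U for ri, Ai in zip(res, A)) / m
        U = U - lr * grad
    return U

# Toy instance (hypothetical sizes): recover a random rank-2 PSD matrix.
rng = np.random.default_rng(0)
n, r, m = 20, 2, 600
U_star = rng.normal(size=(n, r))
M_star = U_star @ U_star.T
A = [rng.normal(size=(n, n)) for _ in range(m)]
y = sense(M_star, A)

U = gradient_descent(y, A, spectral_init(y, A, r))
print("relative error:", np.linalg.norm(U @ U.T - M_star) / np.linalg.norm(M_star))

A random initialization may also succeed in practice, but the guarantee sketched in the abstract is attached to the properly initialized iterate: the spectral stage is what places U inside the region where local search provably converges.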