- This event has passed.
Peter Richtarik (Edinburgh)
May 31 @ 2:00 pm - 3:00 pm
Joint Theoretical Physics and Applied Mathematics Seminar
Title: Stochastic reformulations of linear systems and efficient randomized algorithms
We propose a new paradigm for solving linear systems with a very large number of equations. In our paradigm, the system is first reformulated into a stochastic problem, and then solved with a suitable (typically randomized) algorithm. Our stochastic reformulation is flexible as it depends on a user-defined parameter in the form of a distribution defining an ensemble of random matrices. The choice of the distribution directly influences the “condition number” of the reformulation, which leads to the novel concept of “randomized reconditioning”. We give necessary and sufficient conditions for the reformulation to be exact, i.e., for the solution set of the stochastic problem to be identical to the solution set of the linear system. We also show that the reformulation can be equivalently seen as a stochastic optimization problem, stochastically preconditioned linear system, stochastic fixed-point problem and as a probabilistic intersection problem. For instance, the condition number of the reformulation is equal to the condition number of the stochastically preconditioned linear system, and to the condition number of associated with the Hessian of the objective function appearing the stochastic optimization reformulation. Further, we propose and analyze basic, parallel and accelerated stochastic algorithms for solving the reformulated problem, with linear convergence rates. The methods have natural and sometimes surprising interpretations from the viewpoint of each of the four reformulations. For instance, the methods can be interpreted as basic, parallel and accelerated variants of stochastic gradient descent, stochastic Newton descent, stochastic projection method and stochastic fixed-point method. The complexity of the basic variants scales linearly with the condition number of the reformulation, while the accelerated variants scale with the square root of the condition number. Moreover, all our methods lend themselves to a natural dual interpretation as “stochastic subspace ascent” methods, a novel class of optimization algorithms not analyzed before. Stochastic dual coordinate ascent and stochastic dual Newton ascent arise in special cases. We prove global linear convergence of all our algorithms. Further, we highlight a close connection to recent algorithmic developments in machine learning through casting the problem as an instance of the Empirical Risk Minimization problem in a new regime not studied before.
The above development can be extended to matrix inversion. In particular, we develop and analyze a broad family of stochastic/randomized algorithms for inverting a matrix, with specialized variants maintaining symmetry and/or positive definiteness of the iterates. All methods in the family converge globally and linearly, with explicit rates. In special cases, we obtain stochastic block variants of several quasi-Newton updates, including bad Broyden (BB), good Broyden (GB), Powell-symmetric-Broyden (PSB), Davidon-Fletcher-Powell (DFP) and Broyden-Fletcher-Goldfarb-Shanno (BFGS). Ours are the first stochastic versions of these updates shown to converge to an inverse of a fixed matrix. Through a dual viewpoint we uncover a fundamental link between quasi-Newton updates and approximate inverse preconditioning. Further, we develop an adaptive variant of randomized block BFGS, where we modify the distribution underlying the stochasticity of the method throughout the iterative process to achieve faster convergence. Further, for rectangular and non-invertible matrices, variants of our methods can be shown to converge to the Moore-Penrose pseudoinverse.