This focused monograph presents a study of subgradient algorithms for constrained minimization problems in a Hilbert space. The book is of interest for experts in applications of optimization to engineering and economics. The goal is to obtain a good approximate solution of the problem in the presence of computational errors. The discussion takes into consideration the fact that each iteration of every algorithm consists of several steps and that the computational errors for different steps are, in general, different. The book is especially useful for the reader because it contains solutions to a number of difficult and interesting problems in numerical optimization. The subgradient projection algorithm is one of the most important tools in optimization theory and its applications. An optimization problem is described by an objective function and a set of feasible points. For this algorithm each iteration consists of two steps. The first step requires a calculation of a subgradient of the objective function; the second requires a calculation of a projection on the feasible set. The computational errors in each of these two steps are different. This book shows that the algorithm discussed generates a good approximate solution if all the computational errors are bounded from above by a small positive constant. Moreover, if the computational errors for the two steps of the algorithm are known, one can find out what approximate solution can be obtained and how many iterations are needed for this. In addition to their mathematical interest, the generalizations considered in this book have a significant practical meaning.
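The two-step iteration described above can be illustrated with a small sketch. It is not taken from the monograph: the test problem (minimizing the l1 distance to the point c = (2, 0) over the closed unit ball of R^2, whose exact minimizer is (1, 0) with value 1), the constant step size, and the names `bounded_error`, `delta_f` and `delta_p` are all assumptions chosen for illustration. The subgradient is perturbed by an error of norm at most `delta_f` and the projection by an error of norm at most `delta_p`, so the two steps carry different error bounds.

```python
import math
import random


def bounded_error(rng, delta, n):
    """Return a random vector in R^n whose Euclidean norm is at most delta."""
    v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(vi * vi for vi in v))
    if norm == 0.0 or delta == 0.0:
        return [0.0] * n
    radius = delta * rng.random()
    return [radius * vi / norm for vi in v]


def l1_objective(x, c):
    """f(x) = ||x - c||_1, a convex nonsmooth function (illustrative choice)."""
    return sum(abs(xi - ci) for xi, ci in zip(x, c))


def l1_subgradient(x, c):
    """One subgradient of f at x: the componentwise sign of x - c."""
    return [float((xi > ci) - (xi < ci)) for xi, ci in zip(x, c)]


def project_unit_ball(y):
    """Exact Euclidean projection onto the closed unit ball."""
    norm = math.sqrt(sum(yi * yi for yi in y))
    return list(y) if norm <= 1.0 else [yi / norm for yi in y]


def inexact_subgradient_projection(c, delta_f, delta_p,
                                   step=0.01, iters=2000, seed=0):
    """Subgradient projection method with a separate error bound per step."""
    rng = random.Random(seed)
    n = len(c)
    x = [0.0] * n
    best_x, best_f = list(x), l1_objective(x, c)
    for _ in range(iters):
        # Step 1: subgradient, known only up to an error of norm <= delta_f.
        err_f = bounded_error(rng, delta_f, n)
        g = [gi + ei for gi, ei in zip(l1_subgradient(x, c), err_f)]
        # Step 2: projection, known only up to an error of norm <= delta_p.
        y = [xi - step * gi for xi, gi in zip(x, g)]
        err_p = bounded_error(rng, delta_p, n)
        x = [pi + ei for pi, ei in zip(project_unit_ball(y), err_p)]
        fx = l1_objective(x, c)
        if fx < best_f:
            best_x, best_f = list(x), fx
    return best_x, best_f


if __name__ == "__main__":
    x_best, f_best = inexact_subgradient_projection([2.0, 0.0], 1e-3, 1e-3)
    print(x_best, f_best)
```

Because of the projection error, the returned point may lie slightly outside the feasible set (by at most `delta_p`), so the sketch only yields an approximate solution in that sense.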
This monograph presents the main complexity theorems in convex optimization and their corresponding algorithms. It begins with the fundamental theory of black-box optimization and proceeds to guide the reader through recent advances in structural optimization and stochastic optimization. The presentation of black-box optimization, strongly influenced by the seminal book by Nesterov, includes the analysis of cutting plane methods, as well as (accelerated) gradient descent schemes. Special attention is also given to non-Euclidean settings (relevant algorithms include Frank-Wolfe, mirror descent, and dual averaging) and to their relevance in machine learning. The text provides a gentle introduction to structural optimization with FISTA (to optimize a sum of a smooth and a simple non-smooth term), saddle-point mirror prox (Nemirovski's alternative to Nesterov's smoothing), and a concise description of interior point methods. In stochastic optimization it discusses stochastic gradient descent, mini-batches, random coordinate descent, and sublinear algorithms. It also briefly touches upon convex relaxation of combinatorial problems and the use of randomness to round solutions, as well as random-walk-based methods.
This book is an abridged version of the two volumes "Convex Analysis and Minimization Algorithms I and II" (Grundlehren der mathematischen Wissenschaften Vol. 305 and 306). It presents an introduction to the basic concepts in convex analysis and a study of convex minimization problems (with an emphasis on numerical algorithms). The "backbone" of both volumes was extracted, and some material was deleted that was deemed too advanced for an introduction or too closely attached to numerical algorithms. Some exercises were included and, finally, the index has been considerably enriched, making the book an excellent choice for the purpose of learning and teaching.
In recent years much attention has been given to the development of automatic systems of planning, design and control in various branches of the national economy. Quality of decisions is an issue which has come to the forefront, increasing the significance of optimization algorithms in mathematical software packages for automatic systems of various levels and purposes. Methods for minimizing functions with discontinuous gradients are gaining in importance and the experts in the computational methods of mathematical programming tend to agree that progress in the development of algorithms for minimizing nonsmooth functions is the key to the construction of efficient techniques for solving large scale problems. This monograph summarizes to a certain extent fifteen years of the author's work on developing generalized gradient methods for nonsmooth minimization. This work started in the department of economic cybernetics of the Institute of Cybernetics of the Ukrainian Academy of Sciences under the supervision of V.S. Mikhalevich, a member of the Ukrainian Academy of Sciences, in connection with the need for solutions to important, practical problems of optimal planning and design. In Chap. 1 we describe basic classes of nonsmooth functions that are differentiable almost everywhere, and analyze various ways of defining generalized gradient sets. In Chap. 2 we study in detail various versions of the subgradient method, show their relation to the methods of Fejer-type approximations and briefly present the fundamentals of ε-subgradient methods.
The primary goal of this book is to provide a self-contained, comprehensive study of the main first-order methods that are frequently used in solving large-scale problems. First-order methods exploit information on values and gradients/subgradients (but not Hessians) of the functions composing the model under consideration. With the increase in the number of applications that can be modeled as large or even huge-scale optimization problems, there has been a revived interest in using simple methods that require low iteration cost as well as low memory storage. The author has gathered, reorganized, and synthesized (in a unified manner) many results that are currently scattered throughout the literature, many of which cannot typically be found in optimization books. First-Order Methods in Optimization offers a comprehensive study of first-order methods with the theoretical foundations; provides plentiful examples and illustrations; emphasizes rates of convergence and complexity analysis of the main first-order methods used to solve large-scale problems; and covers both variables and functional decomposition methods.
The book is devoted to the study of approximate solutions of optimization problems in the presence of computational errors. It contains a number of results on the convergence behavior of algorithms in a Hilbert space, which are known as important tools for solving optimization problems. The research presented in the book is the continuation and the further development of the author's book Numerical Optimization with Computational Errors (Springer, 2016), cited below as [NOCE]. Both books study the algorithms taking into account computational errors, which are always present in practice. The main goal is, for a known computational error, to find out what approximate solution can be obtained and how many iterates one needs for this. The main difference between this new book and the 2016 book is that the present book takes into consideration the fact that each iteration of every algorithm consists of several steps and that the computational errors for different steps are, in general, different. This fact, which was not taken into account in the previous book, is indeed important in practice. For example, each iteration of the subgradient projection algorithm consists of two steps. The first step is a calculation of a subgradient of the objective function, while in the second one we calculate a projection on the feasible set. In each of these two steps there is a computational error, and these two computational errors are different in general. It may happen that the feasible set is simple and the objective function is complicated. As a result, the computational error made when one calculates the projection is essentially smaller than the computational error of the calculation of the subgradient. Clearly, the opposite case is possible too. Another feature of this book is a study of a number of important algorithms which appeared recently in the literature and which are not discussed in the previous book. This monograph contains 12 chapters. Chapter 1 is an introduction.
In Chapter 2 we study the subgradient projection algorithm for minimization of convex and nonsmooth functions. We generalize the results of [NOCE] and establish results which have no prototype in [NOCE]. In Chapter 3 we analyze the mirror descent algorithm for minimization of convex and nonsmooth functions, under the presence of computational errors. For this algorithm each iteration consists of two steps. The first step is a calculation of a subgradient of the objective function, while in the second one we solve an auxiliary minimization problem on the set of feasible points. In each of these two steps there is a computational error. We generalize the results of [NOCE] and establish results which have no prototype in [NOCE]. In Chapter 4 we analyze the projected gradient algorithm with a smooth objective function under the presence of computational errors. In Chapter 5 we consider an algorithm which is an extension of the projected gradient algorithm used for solving linear inverse problems arising in signal/image processing. In Chapter 6 we study the continuous subgradient method and the continuous subgradient projection algorithm for minimization of convex nonsmooth functions and for computing the saddle points of convex-concave functions, under the presence of computational errors. None of the results of this chapter has a prototype in [NOCE]. In Chapters 7-12 we analyze, under the presence of computational errors, several algorithms which were not considered in [NOCE]. Again, each step of an iteration has a computational error, and we take into account that these errors are, in general, different. An optimization problem with a composite objective function is studied in Chapter 7. A zero-sum game with two players is considered in Chapter 8. A predicted decrease approximation-based method is used in Chapter 9 for constrained convex optimization. Chapter 10 is devoted to minimization of quasiconvex functions.
Minimization of sharp weakly convex functions is discussed in Chapter 11. Chapter 12 is devoted to a generalized projected subgradient method for minimization of a convex function over a set which is not necessarily convex. The book is of interest for researchers and engineers working in optimization. It can also be useful in preparing courses for graduate students. The main feature of the book which appeals specifically to this audience is the study of the influence of computational errors on several important optimization algorithms. The book is of interest for experts in applications of optimization to engineering and economics.
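The two steps of a mirror descent iteration mentioned for Chapter 3 (a subgradient, then an auxiliary minimization problem on the feasible set) can be sketched in a setting where the auxiliary problem has a closed form. The setting is an assumption made for illustration, not the book's: the feasible set is the probability simplex, the mirror map is the negative entropy, and the objective is f(x) = max_i c_i x_i. With this choice the auxiliary problem reduces to a multiplicative update followed by normalization. The sketch uses exact arithmetic up to floating point and does not model the computational errors studied in the book; the function names are hypothetical.

```python
import math


def max_objective(x, c):
    """f(x) = max_i c_i * x_i, a convex nonsmooth function (illustrative)."""
    return max(ci * xi for ci, xi in zip(c, x))


def max_subgradient(x, c):
    """One subgradient of f at x: c_j * e_j for an index j attaining the max."""
    j = max(range(len(x)), key=lambda i: c[i] * x[i])
    g = [0.0] * len(x)
    g[j] = c[j]
    return g


def entropic_mirror_step(x, g, step):
    """Solve argmin over the simplex of step * <g, z> + KL(z, x).

    For the negative-entropy mirror map the minimizer is known in closed
    form: z_i is proportional to x_i * exp(-step * g_i).
    """
    w = [xi * math.exp(-step * gi) for xi, gi in zip(x, g)]
    total = sum(w)
    return [wi / total for wi in w]


def mirror_descent(c, step=0.005, iters=20000):
    """Entropic mirror descent on the simplex, returning the best iterate."""
    n = len(c)
    x = [1.0 / n] * n  # start at the uniform distribution
    best_x, best_f = list(x), max_objective(x, c)
    for _ in range(iters):
        g = max_subgradient(x, c)             # step 1: subgradient
        x = entropic_mirror_step(x, g, step)  # step 2: auxiliary problem
        fx = max_objective(x, c)
        if fx < best_f:
            best_x, best_f = list(x), fx
    return best_x, best_f


if __name__ == "__main__":
    print(mirror_descent([1.0, 2.0, 4.0]))
```

For c = (1, 2, 4) the minimum over the simplex is attained where the three terms c_i x_i are equal, which gives the optimal value 4/7; the best iterate approaches this value from above.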
Proximal Algorithms discusses proximal operators and proximal algorithms, and illustrates their applicability to standard and distributed convex optimization in general and many applications of recent interest in particular. Much like Newton's method is a standard tool for solving unconstrained smooth optimization problems of modest size, proximal algorithms can be viewed as an analogous tool for nonsmooth, constrained, large-scale, or distributed versions of these problems. They are very generally applicable, but are especially well-suited to problems of substantial recent interest involving large or high-dimensional datasets. Proximal methods sit at a higher level of abstraction than classical algorithms like Newton's method: the base operation is evaluating the proximal operator of a function, which itself involves solving a small convex optimization problem. These subproblems, which generalize the problem of projecting a point onto a convex set, often admit closed-form solutions or can be solved very quickly with standard or simple specialized methods. Proximal Algorithms discusses different interpretations of proximal operators and algorithms, looks at their connections to many other topics in optimization and applied mathematics, surveys some popular algorithms, and provides a large number of examples of proximal operators that commonly arise in practice.
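As a small illustration of the closed-form case mentioned above, the sketch below evaluates two proximal operators. Both are standard facts rather than excerpts from the text: the proximal operator of lam * ||x||_1 is componentwise soft thresholding, and the proximal operator of the indicator function of a box is the Euclidean projection onto that box. The function names and the helper `prox_objective_l1` are hypothetical.

```python
def soft_threshold(v, lam):
    """Proximal operator of lam * ||x||_1, applied componentwise.

    It minimizes lam * ||x||_1 + 0.5 * ||x - v||^2 over x.
    """
    out = []
    for vi in v:
        if vi > lam:
            out.append(vi - lam)
        elif vi < -lam:
            out.append(vi + lam)
        else:
            out.append(0.0)
    return out


def project_box(v, lo, hi):
    """Proximal operator of the indicator of the box [lo, hi]^n.

    This is the Euclidean projection, the special case that proximal
    operators generalize.
    """
    return [min(max(vi, lo), hi) for vi in v]


def prox_objective_l1(x, v, lam):
    """Objective that soft_threshold(v, lam) minimizes, for checking."""
    return (lam * sum(abs(xi) for xi in x)
            + 0.5 * sum((xi - vi) ** 2 for xi, vi in zip(x, v)))


if __name__ == "__main__":
    print(soft_threshold([3.0, -0.5, 1.0], 1.0))
    print(project_box([2.0, -3.0, 0.5], -1.0, 1.0))
```

Evaluating either operator costs one pass over the vector, which is what makes such subproblems cheap enough to serve as the base operation of an iterative method.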
This book provides a comprehensive and accessible presentation of algorithms for solving convex optimization problems. It relies on rigorous mathematical analysis, but also aims at an intuitive exposition that makes use of visualization where possible. This is facilitated by the extensive use of analytical and algorithmic concepts of duality, which by nature lend themselves to geometrical interpretation. The book places particular emphasis on modern developments, and their widespread applications in fields such as large-scale resource allocation problems, signal processing, and machine learning. The book is aimed at students, researchers, and practitioners, roughly at the first year graduate level. It is similar in style to the author's 2009 "Convex Optimization Theory" book, but can be read independently. The latter book focuses on convexity theory and optimization duality, while the present book focuses on algorithmic issues. The two books share notation, and together cover the entire finite-dimensional convex optimization methodology. To facilitate readability, the statements of definitions and results of the "theory book" are reproduced without proofs in Appendix B.
In the last few years, algorithms for convex optimization have revolutionized algorithm design, both for discrete and continuous optimization problems. For problems like maximum flow, maximum matching, and submodular function minimization, the fastest algorithms involve essential methods such as gradient descent, mirror descent, interior point methods, and ellipsoid methods. The goal of this self-contained book is to enable researchers and professionals in computer science, data science, and machine learning to gain an in-depth understanding of these algorithms. The text emphasizes how to derive key algorithms for convex optimization from first principles and how to establish precise running time bounds. This modern text explains the success of these algorithms in problems of discrete optimization, as well as how these methods have significantly pushed the state of the art of convex optimization itself.
Submodular functions are relevant to machine learning for at least two reasons: (1) some problems may be expressed directly as the optimization of submodular functions and (2) the Lovász extension of submodular functions provides a useful set of regularization functions for supervised and unsupervised learning. In this monograph, we present the theory of submodular functions from a convex analysis perspective, presenting tight links between certain polyhedra, combinatorial optimization and convex optimization problems. In particular, we show how submodular function minimization is equivalent to solving a wide variety of convex optimization problems. This allows the derivation of new efficient algorithms for approximate and exact submodular function minimization with theoretical guarantees and good practical performance. By listing many examples of submodular functions, we review various applications to machine learning, such as clustering, experimental design, sensor placement, graphical model structure learning or subset selection, as well as a family of structured sparsity-inducing norms that can be derived and used from submodular functions.
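To make the Lovász extension mentioned above concrete, the sketch below evaluates it with the standard sorting (greedy) formula: the components of w are sorted in decreasing order and each is weighted by the marginal gain of adding its index to the set built so far. The example set function (the cut function of a small path graph) and the names `lovasz_extension` and `make_cut_function` are illustrative assumptions, not material from the monograph.

```python
def lovasz_extension(F, w):
    """Evaluate the Lovász extension of the set function F at the vector w.

    F maps a frozenset of indices to a number. Indices are visited in order
    of decreasing w, and each w[i] is multiplied by the marginal gain
    F(S with i added) - F(S).
    """
    order = sorted(range(len(w)), key=lambda i: -w[i])
    chosen = set()
    prev = F(frozenset())
    value = 0.0
    for i in order:
        chosen.add(i)
        cur = F(frozenset(chosen))
        value += w[i] * (cur - prev)
        prev = cur
    return value


def make_cut_function(edges):
    """Cut function of an undirected graph, a classical submodular function."""
    def cut(A):
        return sum(1 for (u, v) in edges if (u in A) != (v in A))
    return cut


if __name__ == "__main__":
    cut = make_cut_function([(0, 1), (1, 2)])
    print(lovasz_extension(cut, [3.0, 1.0, 2.0]))
```

On indicator vectors the extension reproduces the set function itself, and for a cut function it coincides with the total variation, the sum of |w_u - w_v| over the edges, which is one way a regularizer arises from a submodular function.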