This book introduces the Learning-Augmented Network Optimization (LANO) paradigm, which interconnects network optimization with the emerging AI theory and algorithms and has been receiving a growing attention in network research. The authors present the topic based on a general stochastic network optimization model, and review several important theoretical tools that are widely adopted in network research, including convex optimization, the drift method, and mean-field analysis. The book then covers several popular learning-based methods, i.e., learning-augmented drift, multi-armed bandit and reinforcement learning, along with applications in networks where the techniques have been successfully applied. The authors also provide a discussion on potential future directions and challenges.
This handbook presents state-of-the-art research in reinforcement learning, focusing on its applications in the control and game theory of dynamic systems and future directions for related research and technology. The contributions gathered in this book deal with challenges faced when using learning and adaptation methods to solve academic and industrial problems, such as optimization in dynamic environments with single and multiple agents, convergence and performance analysis, and online implementation. They explore means by which these difficulties can be solved, and cover a wide range of related topics including: deep learning; artificial intelligence; applications of game theory; mixed modality learning; and multi-agent reinforcement learning. Practicing engineers and scholars in the field of machine learning, game theory, and autonomous control will find the Handbook of Reinforcement Learning and Control to be thought-provoking, instructive and informative.
This text presents a modern theory of analysis, control, and optimization for dynamic networks. Mathematical techniques of Lyapunov drift and Lyapunov optimization are developed and shown to enable constrained optimization of time averages in general stochastic systems. The focus is on communication and queueing systems, including wireless networks with time-varying channels, mobility, and randomly arriving traffic. A simple drift-plus-penalty framework is used to optimize time averages such as throughput, throughput-utility, power, and distortion. Explicit performance-delay tradeoffs are provided to illustrate the cost of approaching optimality. This theory is also applicable to problems in operations research and economics, where energy-efficient and profit-maximizing decisions must be made without knowing the future. Topics in the text include the following: - Queue stability theory - Backpressure, max-weight, and virtual queue methods - Primal-dual methods for non-convex stochastic utility maximization - Universal scheduling theory for arbitrary sample paths - Approximate and randomized scheduling theory - Optimization of renewal systems and Markov decision systems Detailed examples and numerous problem set questions are provided to reinforce the main concepts. Table of Contents: Introduction / Introduction to Queues / Dynamic Scheduling Example / Optimizing Time Averages / Optimizing Functions of Time Averages / Approximate Scheduling / Optimization of Renewal Systems / Conclusions
A recent development in SDC-related problems is the establishment of intelligent SDC models and the intensive use of LMI-based convex optimization methods. Within this theoretical framework, control parameter determination can be designed and stability and robustness of closed-loop systems can be analyzed. This book describes the new framework of SDC system design and provides a comprehensive description of the modelling of controller design tools and their real-time implementation. It starts with a review of current research on SDC and moves on to some basic techniques for modelling and controller design of SDC systems. This is followed by a description of controller design for fixed-control-structure SDC systems, PDF control for general input- and output-represented systems, filtering designs, and fault detection and diagnosis (FDD) for SDC systems. Many new LMI techniques being developed for SDC systems are shown to have independent theoretical significance for robust control and FDD problems.
This book covers not only foundational materials but also the most recent progresses made during the past few years on the area of machine learning algorithms. In spite of the intensive research and development in this area, there does not exist a systematic treatment to introduce the fundamental concepts and recent progresses on machine learning algorithms, especially on those based on stochastic optimization methods, randomized algorithms, nonconvex optimization, distributed and online learning, and projection free methods. This book will benefit the broad audience in the area of machine learning, artificial intelligence and mathematical programming community by presenting these recent developments in a tutorial style, starting from the basic building blocks to the most carefully designed and complicated algorithms for machine learning.
The purpose of this book is to develop in greater depth some of the methods from the author's Reinforcement Learning and Optimal Control recently published textbook (Athena Scientific, 2019). In particular, we present new research, relating to systems involving multiple agents, partitioned architectures, and distributed asynchronous computation. We pay special attention to the contexts of dynamic programming/policy iteration and control theory/model predictive control. We also discuss in some detail the application of the methodology to challenging discrete/combinatorial optimization problems, such as routing, scheduling, assignment, and mixed integer programming, including the use of neural network approximations within these contexts. The book focuses on the fundamental idea of policy iteration, i.e., start from some policy, and successively generate one or more improved policies. If just one improved policy is generated, this is called rollout, which, based on broad and consistent computational experience, appears to be one of the most versatile and reliable of all reinforcement learning methods. In this book, rollout algorithms are developed for both discrete deterministic and stochastic DP problems, and the development of distributed implementations in both multiagent and multiprocessor settings, aiming to take advantage of parallelism. Approximate policy iteration is more ambitious than rollout, but it is a strictly off-line method, and it is generally far more computationally intensive. This motivates the use of parallel and distributed computation. One of the purposes of the monograph is to discuss distributed (possibly asynchronous) methods that relate to rollout and policy iteration, both in the context of an exact and an approximate implementation involving neural networks or other approximation architectures. Much of the new research is inspired by the remarkable AlphaZero chess program, where policy iteration, value and policy networks, approximate lookahead minimization, and parallel computation all play an important role.
Reinforcement learning (RL) and adaptive dynamic programming (ADP) has been one of the most critical research fields in science and engineering for modern complex systems. This book describes the latest RL and ADP techniques for decision and control in human engineered systems, covering both single player decision and control and multi-player games. Edited by the pioneers of RL and ADP research, the book brings together ideas and methods from many fields and provides an important and timely guidance on controlling a wide variety of systems, such as robots, industrial processes, and economic decision-making.