Rollout, Policy Iteration, and Distributed Reinforcement Learning

Author: Dimitri Bertsekas

Publisher: Athena Scientific

Published: 2021-08-20

Total Pages: 498

ISBN-13: 1886529078

The purpose of this book is to develop in greater depth some of the methods from the author's recently published textbook Reinforcement Learning and Optimal Control (Athena Scientific, 2019). In particular, we present new research relating to systems involving multiple agents, partitioned architectures, and distributed asynchronous computation. We pay special attention to the contexts of dynamic programming/policy iteration and control theory/model predictive control. We also discuss in some detail the application of the methodology to challenging discrete/combinatorial optimization problems, such as routing, scheduling, assignment, and mixed integer programming, including the use of neural network approximations within these contexts. The book focuses on the fundamental idea of policy iteration, i.e., start from some policy, and successively generate one or more improved policies. If just one improved policy is generated, this is called rollout, which, based on broad and consistent computational experience, appears to be one of the most versatile and reliable of all reinforcement learning methods. In this book, rollout algorithms are developed for both discrete deterministic and stochastic DP problems, together with distributed implementations in both multiagent and multiprocessor settings that aim to take advantage of parallelism. Approximate policy iteration is more ambitious than rollout, but it is a strictly off-line method, and it is generally far more computationally intensive. This motivates the use of parallel and distributed computation. One of the purposes of the monograph is to discuss distributed (possibly asynchronous) methods that relate to rollout and policy iteration, both in the context of an exact implementation and in that of an approximate implementation involving neural networks or other approximation architectures.
Much of the new research is inspired by the remarkable AlphaZero chess program, where policy iteration, value and policy networks, approximate lookahead minimization, and parallel computation all play an important role.


An Introduction to Stochastic Processes

Author: Edward P.C. Kao

Publisher: Courier Dover Publications

Published: 2019-12-18

Total Pages: 451

ISBN-13: 0486837920

Incorporating computer use into the teaching and learning of stochastic processes, this text takes an applications- and computer-oriented approach rather than a mathematically rigorous one. A solutions manual is available to instructors upon request. 1997 edition.


On the Continuous Dependence with Respect to Sampling of the Linear Quadratic Regulator Problem for Distributed Parameter Systems

Author: Institute for Computer Applications in Science and Engineering

Publisher:

Published: 1990

Total Pages: 48

ISBN-13:

This report establishes the convergence of solutions to the discrete-time (sampled) linear quadratic regulator problem and the associated Riccati equation for infinite-dimensional systems to the solutions of the corresponding continuous-time problem and equation as the length of the sampling interval tends to zero (equivalently, as the sampling rate tends to infinity). Both the finite and infinite time horizon problems are studied. In the finite time horizon case, strong continuity of the operators which define the control system and performance index, together with a stability and consistency condition on the sampling scheme, is required. For the infinite time horizon problem, in addition, the sampled systems must be stabilizable and detectable, uniformly with respect to the sampling rate. Classes of systems for which this condition can be verified are discussed. Results of numerical studies involving the control of a heat/diffusion equation, a hereditary or delay system, and a flexible beam are presented and discussed.


Optimisation in Economic Analysis

Author: Gordon Mills

Publisher: Routledge

Published: 2014-04-04

Total Pages: 216

ISBN-13: 1317833627

One of the fundamental economic problems is that of making the best use of limited resources. As a result, mathematical optimisation methods play a crucial role in economic theory. Covering the use of such methods in applied and policy contexts, this book deals not only with the main techniques (linear programming, nonlinear optimisation and dynamic programming), but also emphasises the art of model-building and discusses fields such as optimisation over time.


Optimization in Economics and Finance

Author: Bruce D. Craven

Publisher: Springer Science & Business Media

Published: 2005

Total Pages: 184

ISBN-13: 9780387242798

This book extends optimization techniques in a form that may be adopted for modeling social choice problems. The models presented offer possible representations of a society's social choice for an allocation that maximizes welfare and the utilization of resources. A computer program, SCOM, is presented for computing social choice models by optimal control.


Stochastic Simulation: Algorithms and Analysis

Author: Søren Asmussen

Publisher: Springer Science & Business Media

Published: 2007-07-14

Total Pages: 490

ISBN-13: 0387690336

Sampling-based computational methods have become a fundamental part of the numerical toolset of practitioners and researchers across an enormous number of different applied domains and academic disciplines. This book provides a broad treatment of such sampling-based methods, as well as accompanying mathematical analysis of the convergence properties of the methods discussed. The reach of the ideas is illustrated by discussing a wide range of applications and the models that have found wide usage. The first half of the book focuses on general methods; the second half discusses model-specific algorithms. Exercises and illustrations are included.