Monte Carlo Planning and Reinforcement Learning for Large Scale Sequential Decision Problems

Author: John Michael Mern

Publisher:

Published: 2021

Total Pages:

ISBN-13:

Autonomous agents have the potential to perform tasks that would otherwise be too repetitive, difficult, or dangerous for humans. Solving many of these problems requires reasoning over sequences of decisions in order to reach a goal. Autonomous driving, inventory management, and medical diagnosis and treatment are all examples of important real-world sequential decision problems. Approximate solution methods such as reinforcement learning and Monte Carlo planning have achieved superhuman performance in some domains. In these methods, agents learn good actions to take in response to inputs.

Problems with many widely varying inputs or possible actions remain challenging to solve efficiently without extensive problem-specific engineering. One of the key challenges in solving sequential decision problems is efficiently exploring the many different paths an agent may take; for most problems, it is infeasible to test every possible path. Many existing approaches explore paths using simple random sampling, but problems in which many different actions may be taken at each step often require more efficient exploration to be solved. Large, unstructured input spaces can also challenge conventional learning approaches: agents must learn to recognize inputs that are functionally similar while simultaneously learning an effective decision strategy. As a result of these challenges, learning agents are often limited to solving tasks in virtual domains where very large numbers of trials can be conducted relatively safely and cheaply. Furthermore, when problems are solved using black-box models such as neural networks, the resulting decision-making policy is impossible for a human to meaningfully interpret. This can also limit the use of learning agents to low-regret tasks such as image classification or video game playing.

The work in this thesis addresses the challenges of learning in large-space sequential decision problems. The thesis first considers methods to improve the scaling of deep reinforcement learning and Monte Carlo tree search. We present neural network architectures for the common case of exchangeable object inputs in deep reinforcement learning; the presented architecture accelerates learning by efficiently sharing learned representations among objects of the same type. The thesis then addresses methods to efficiently explore large action spaces in Monte Carlo tree search. We present two algorithms, PA-POMCPOW and BOMCP, that improve search by guiding exploration toward actions with good expected performance or information gain. We then propose methods to improve the use of offline-learned policies within online Monte Carlo planning through importance sampling and experience generalization. Finally, we study methods to interpret learned policies and expected search performance: we present a method to represent high-dimensional policies with interpretable local surrogate trees, and we propose bounds on the error rates of Monte Carlo estimation that can be numerically calculated from empirical quantities.
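
To ground the contrast the abstract draws between simple random sampling and guided exploration, below is a minimal, generic UCT-style selection rule. This is only a sketch of the baseline that tree-search methods build on, not the thesis's PA-POMCPOW or BOMCP; the node representation and exploration constant are illustrative assumptions.

```python
# Minimal sketch of UCB-guided action selection in Monte Carlo tree search.
# This is the generic UCT rule, not the thesis's PA-POMCPOW or BOMCP; the
# 'children' structure and constant c are illustrative assumptions.
import math


def uct_select(children, c=1.4):
    """Pick the child with the highest upper-confidence score.

    children: list of dicts with keys 'visits' (int) and 'value' (total return).
    """
    total_visits = sum(ch["visits"] for ch in children)

    def score(ch):
        if ch["visits"] == 0:
            return float("inf")  # try every action at least once
        exploit = ch["value"] / ch["visits"]  # mean observed return
        explore = c * math.sqrt(math.log(total_visits) / ch["visits"])
        return exploit + explore

    return max(children, key=score)
```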


Algorithms for Reinforcement Learning

Author: Csaba Szepesvári

Publisher: Springer Nature

Published: 2022-05-31

Total Pages: 89

ISBN-13: 3031015517

Reinforcement learning is a learning paradigm concerned with learning to control a system so as to maximize a numerical performance measure that expresses a long-term objective. What distinguishes reinforcement learning from supervised learning is that only partial feedback is given to the learner about the learner's predictions. Further, the predictions may have long-term effects through influencing the future state of the controlled system. Thus, time plays a special role. The goal in reinforcement learning is to develop efficient learning algorithms, as well as to understand the algorithms' merits and limitations. Reinforcement learning is of great interest because of the large number of practical applications that it can be used to address, ranging from problems in artificial intelligence to operations research and control engineering. In this book, we focus on those algorithms of reinforcement learning that build on the powerful theory of dynamic programming. We give a fairly comprehensive catalog of learning problems, describe the core ideas, survey a large number of state-of-the-art algorithms, and then discuss their theoretical properties and limitations. Table of Contents: Markov Decision Processes / Value Prediction Problems / Control / For Further Exploration
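
As a concrete anchor for the dynamic-programming theory the blurb refers to, here is a minimal value-iteration sketch on a tiny, made-up two-state MDP. The transition table is an illustrative assumption, not an example from the book.

```python
# Value iteration on a toy MDP: repeatedly apply the Bellman optimality
# operator until the state values converge. The MDP below is hypothetical.
# P[s][a] is a list of (probability, next_state, reward) transitions.
P = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {"stay": [(1.0, 1, 2.0)], "go": [(1.0, 0, 0.0)]},
}
gamma = 0.9
V = {s: 0.0 for s in P}

for _ in range(200):  # iterate toward the fixed point of the Bellman operator
    V = {
        s: max(
            sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
            for outcomes in P[s].values()
        )
        for s in P
    }

print(V)  # approximate optimal state values
```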


Ensemble Monte-Carlo Planning

Author: Paul Arthur Lewis

Publisher:

Published: 2011

Total Pages: 100

ISBN-13:

Monte-Carlo planning algorithms such as UCT make decisions at each step by intelligently expanding a single search tree given the available time and then selecting the best root action. Recent work has provided evidence that it can be advantageous to instead construct an ensemble of search trees and make a decision according to a weighted vote. However, these prior investigations considered only the application domains of Go and Solitaire and were limited in the scope of ensemble configurations considered. In this work, we conduct a large-scale empirical study of ensemble Monte-Carlo planning using the UCT algorithm in a set of five additional diverse and challenging domains. In particular, we evaluate the advantages of a broad set of ensemble configurations in terms of space and time efficiency in both parallel and sequential time models. Our results show that ensembles are an effective way to improve performance given a parallel model, can significantly reduce space requirements, and in some cases may improve performance in a sequential model. Additionally, this work produced an open-source planning library.
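
A minimal sketch of the weighted root vote described above, assuming a hypothetical single-tree planner run_uct that returns root-action visit counts; the actual ensemble configurations studied in this work vary tree count and budget more broadly.

```python
# Sketch of ensemble Monte-Carlo planning: build several independent UCT
# trees and pick the root action with the largest combined vote. run_uct is
# a hypothetical stand-in for a single-tree planner that returns a
# {action: visit_count} dict for the root node.
from collections import Counter


def ensemble_decide(run_uct, state, n_trees=8, budget_per_tree=1000):
    votes = Counter()
    for _ in range(n_trees):
        root_stats = run_uct(state, budget=budget_per_tree)
        total = sum(root_stats.values())
        for action, visits in root_stats.items():
            votes[action] += visits / total  # each tree casts one normalized vote
    return votes.most_common(1)[0][0]  # action with the highest combined vote
```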


Efficient Algorithms for High-dimensional Data-driven Sequential Decision-making

Author: Yilun Chen

Publisher:

Published: 2021

Total Pages: 0

ISBN-13:

The general framework of sequential decision-making captures various important real-world applications ranging from pricing and inventory control to public healthcare and pandemic management. It is central to operations research and operations management, often boiling down to solving stochastic dynamic programs (DPs). The ongoing big-data revolution allows decision makers to incorporate relevant data in their decision-making processes, which in many cases leads to significant performance upgrades and revenue increases. However, such data-driven decision-making also poses fundamental computational challenges, because it generally demands large-scale, more realistic, and more flexible (thus complicated) models. As a result, the associated DPs become computationally intractable due to the curse of dimensionality. We overcome this computational obstacle for three specific sequential decision-making problems, each subject to a distinct combinatorial constraint on its decisions: optimal stopping, sequential decision-making with limited moves, and online bipartite max-weight independent set. Assuming sample access to the underlying model (analogous to a generative model in reinforcement learning), our algorithms can output ε-optimal solutions (policies and approximate optimal values) for any fixed error tolerance ε, with computational and sample complexity both scaling polynomially in the time horizon and essentially independent of the underlying dimension. Our results prove for the first time the fundamental tractability of certain sequential decision-making problems with combinatorial structure (including the notoriously challenging high-dimensional optimal stopping problem), and our approach may potentially bring forth efficient algorithms with provable performance guarantees in more sequential decision-making settings.
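
To make the optimal stopping setting concrete, here is a naive nested Monte Carlo sketch that uses only sample access to the dynamics, mirroring the generative-model assumption above. Note that its cost grows exponentially in the horizon, which is exactly the intractability the dissertation's polynomial-complexity algorithms overcome; the random-walk dynamics and payoff function are illustrative assumptions.

```python
# Naive nested Monte Carlo backward induction for a toy optimal stopping
# problem. Exponential in the horizon; shown only to make the problem
# statement concrete, not as the dissertation's method.
import random


def sample_next(x):
    return x + random.gauss(0.0, 1.0)  # toy random-walk dynamics (assumed)


def reward(x):
    return max(x, 0.0)  # payoff collected if we stop now (assumed)


def stop_value(x, steps_left, n_samples=20):
    """Estimate the optimal stopping value from state x."""
    if steps_left == 0:
        return reward(x)
    continuation = sum(
        stop_value(sample_next(x), steps_left - 1, n_samples)
        for _ in range(n_samples)
    ) / n_samples
    return max(reward(x), continuation)  # stop or continue, whichever is better


print(stop_value(0.0, steps_left=3))
```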


Optimization in Large Scale Problems

Author: Mahdi Fathi

Publisher: Springer Nature

Published: 2019-11-20

Total Pages: 333

ISBN-13: 3030285650

This volume provides resourceful thinking and insightful management solutions to the many challenges that decision makers face in their predictions, preparations, and implementations of the key elements that our societies and industries need to adopt as they move toward digitalization and smartness. The discussions within the book aim to uncover the sources of large-scale problems in socio-industrial dilemmas and the theories that can support these challenges, as well as the question of how these theories might transition to real applications. Reflecting the viewpoints expressed by several practitioners and academics, this book aims to provide a learning platform that spotlights open questions together with related case studies. The relationship between Industry 4.0 and Society 5.0 provides the basis for the expert contributions in this book, highlighting the uses of analytical methods such as mathematical optimization, heuristic methods, decomposition methods, stochastic optimization, and more. The book will prove useful to researchers, students, and engineers in different domains who encounter large-scale optimization problems, and will encourage them to undertake research in this timely and practical field. The book is split into two parts: the first covers a general perspective on the challenges facing a smart society and industry, while the second covers several case studies and operations research solutions for large-scale challenges specific to various industry- and society-related phenomena.


From Bandits to Monte-Carlo Tree Search

Author: Rémi Munos

Publisher:

Published: 2014

Total Pages: 129

ISBN-13: 9781601987679

This work covers several aspects of the optimism in the face of uncertainty principle applied to large scale optimization problems under finite numerical budget. The initial motivation for the research reported here originated from the empirical success of the so-called Monte-Carlo Tree Search method popularized in Computer Go and further extended to many other games as well as optimization and planning problems. Our objective is to contribute to the development of theoretical foundations of the field by characterizing the complexity of the underlying optimization problems and designing efficient algorithms with performance guarantees.
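
The monograph's starting point is the multi-armed bandit, and the loop below is a minimal UCB1 sketch of the optimism-in-the-face-of-uncertainty principle it builds on. The Bernoulli arms and their means are illustrative assumptions.

```python
# Minimal UCB1 bandit loop: play each arm once, then always pull the arm
# with the highest optimistic (mean + confidence bonus) index.
import math
import random

true_means = [0.2, 0.5, 0.8]  # hidden Bernoulli arm qualities (assumed)
counts = [0] * len(true_means)
sums = [0.0] * len(true_means)

for t in range(1, 2001):
    if 0 in counts:
        arm = counts.index(0)  # initialize: try every arm once
    else:
        arm = max(
            range(len(true_means)),
            key=lambda a: sums[a] / counts[a]
            + math.sqrt(2 * math.log(t) / counts[a]),
        )
    payoff = 1.0 if random.random() < true_means[arm] else 0.0
    counts[arm] += 1
    sums[arm] += payoff

print(counts)  # most pulls should concentrate on the best arm
```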


An Introduction to Sequential Monte Carlo

Author: Nicolas Chopin

Publisher: Springer Nature

Published: 2020-10-01

Total Pages: 378

ISBN-13: 3030478459

This book provides a general introduction to Sequential Monte Carlo (SMC) methods, also known as particle filters. These methods have become a staple for the sequential analysis of data in such diverse fields as signal processing, epidemiology, machine learning, population ecology, quantitative finance, and robotics. The coverage is comprehensive, ranging from the underlying theory to computational implementation, methodology, and diverse applications in various areas of science. This is achieved by describing SMC algorithms as particular cases of a general framework, which involves concepts such as Feynman-Kac distributions, and tools such as importance sampling and resampling. This general framework is used consistently throughout the book. Extensive coverage is provided on sequential learning (filtering, smoothing) of state-space (hidden Markov) models, as this remains an important application of SMC methods. More recent applications, such as parameter estimation of these models (through e.g. particle Markov chain Monte Carlo techniques) and the simulation of challenging probability distributions (in e.g. Bayesian inference or rare-event problems), are also discussed. The book may be used either as a graduate text on Sequential Monte Carlo methods and state-space modeling, or as a general reference work on the area. Each chapter includes a set of exercises for self-study, a comprehensive bibliography, and a “Python corner,” which discusses the practical implementation of the methods covered. In addition, the book comes with an open source Python library, which implements all the algorithms described in the book, and contains all the programs that were used to perform the numerical experiments.
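
Since the blurb describes SMC as importance sampling plus resampling, the sketch below implements a bare-bones bootstrap particle filter for a toy linear-Gaussian state-space model. The model and parameters are illustrative assumptions; the book's companion open-source Python library provides full-featured implementations.

```python
# Bare-bones bootstrap particle filter: propagate, weight by the observation
# likelihood, estimate, resample. The AR(1)-plus-Gaussian-noise model is an
# illustrative assumption, not the book's API.
import math
import random


def bootstrap_filter(observations, n_particles=500, obs_std=1.0, trans_std=1.0):
    particles = [random.gauss(0.0, 1.0) for _ in range(n_particles)]
    filtered_means = []
    for y in observations:
        # propagate each particle through the (assumed) transition kernel
        particles = [random.gauss(0.9 * x, trans_std) for x in particles]
        # importance weights from the Gaussian observation density
        weights = [math.exp(-0.5 * ((y - x) / obs_std) ** 2) for x in particles]
        total = sum(weights)
        weights = [w / total for w in weights]
        filtered_means.append(sum(w * x for w, x in zip(weights, particles)))
        # multinomial resampling combats weight degeneracy
        particles = random.choices(particles, weights=weights, k=n_particles)
    return filtered_means
```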


Sequential Decision-Making in Musical Intelligence

Author: Elad Liebman

Publisher: Springer Nature

Published: 2019-10-01

Total Pages: 224

ISBN-13: 3030305198

Over the past 60 years, artificial intelligence has grown from an academic field of research to a ubiquitous array of tools used in everyday technology. Despite its many recent successes, certain meaningful facets of computational intelligence have yet to be thoroughly explored, such as the wide array of complex mental tasks that humans carry out easily yet are difficult for computers to mimic. A prime example of a domain in which human intelligence thrives, but machine understanding is still fairly limited, is music. Over recent decades, many researchers have used computational tools to perform tasks such as genre identification, music summarization, music database querying, and melodic segmentation. While these are all useful algorithmic solutions, we are still a long way from constructing complete music agents able to mimic (at least partially) the complexity with which humans approach music. One key aspect that has not been sufficiently studied is sequential decision-making in musical intelligence. Addressing this gap, the book focuses on two aspects of musical intelligence: music recommendation and multi-agent interaction in the context of music. Though motivated primarily by music-related tasks, and focusing largely on people's musical preferences, the work presented in this book also establishes that insights from music-specific case studies can be applicable in other concrete social domains, such as content recommendation. Showing the generality of insights from musical data in other contexts provides evidence for the utility of music domains as testbeds for the development of general artificial intelligence techniques. Ultimately, this book demonstrates the overall value of taking a sequential decision-making approach in settings previously unexplored from this perspective.


Reinforcement Learning, second edition

Author: Richard S. Sutton and Andrew G. Barto

Publisher: MIT Press

Published: 2018-11-13

Total Pages: 549

ISBN-13: 0262352702

The significantly expanded and updated new edition of a widely used text on reinforcement learning, one of the most active research areas in artificial intelligence. Reinforcement learning, one of the most active research areas in artificial intelligence, is a computational approach to learning whereby an agent tries to maximize the total amount of reward it receives while interacting with a complex, uncertain environment. In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the field's key ideas and algorithms. This second edition has been significantly expanded and updated, presenting new topics and updating coverage of other topics. Like the first edition, this second edition focuses on core online learning algorithms, with the more mathematical material set off in shaded boxes. Part I covers as much of reinforcement learning as possible without going beyond the tabular case for which exact solutions can be found. Many algorithms presented in this part are new to the second edition, including UCB, Expected Sarsa, and Double Learning. Part II extends these ideas to function approximation, with new sections on such topics as artificial neural networks and the Fourier basis, and offers expanded treatment of off-policy learning and policy-gradient methods. Part III has new chapters on reinforcement learning's relationships to psychology and neuroscience, as well as an updated case-studies chapter including AlphaGo and AlphaGo Zero, Atari game playing, and IBM Watson's wagering strategy. The final chapter discusses the future societal impacts of reinforcement learning.
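
Of the tabular algorithms the blurb lists as new to this edition, Expected Sarsa has a particularly compact update rule; the sketch below shows it with an epsilon-greedy target policy, a common but not the only possible choice.

```python
# Sketch of the tabular Expected Sarsa update: bootstrap from the expected
# next-state value under the current (here epsilon-greedy) policy rather
# than from a single sampled next action. The epsilon-greedy weighting is
# an illustrative assumption.
def expected_sarsa_update(Q, s, a, r, s_next, actions,
                          alpha=0.1, gamma=0.99, eps=0.1):
    """Q maps (state, action) pairs to values; actions lists legal actions."""
    q_next = [Q.get((s_next, b), 0.0) for b in actions]
    best = max(range(len(actions)), key=lambda i: q_next[i])
    # epsilon-greedy probabilities: eps spread uniformly, the rest on the argmax
    probs = [eps / len(actions) + (1.0 - eps) * (i == best)
             for i in range(len(actions))]
    expected_value = sum(p * q for p, q in zip(probs, q_next))
    q_sa = Q.get((s, a), 0.0)
    Q[(s, a)] = q_sa + alpha * (r + gamma * expected_value - q_sa)
```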


From Bandits to Monte-Carlo Tree Search

Author: Rémi Munos

Publisher: Now Pub

Published: 2014

Total Pages: 146

ISBN-13: 9781601987662

Covers the optimism in the face of uncertainty principle applied to large scale optimization problems under finite numerical budget. The initial motivation for this research originated from the empirical success of the Monte-Carlo Tree Search method popularized in Computer Go and further extended to other games, optimization, and planning problems.