Subset and Sample Selection for Graphical Models: Gaussian Processes, Ising Models and Gaussian Mixture Models

Author: Satyaki Mahalanabis

Publisher:

Published: 2012

Total Pages: 234

ISBN-13:

"Probabilistic Graphical Models are a popular method of representing complex joint distributions in which stochastic dependence between subsets of random variables is expressed in terms of a graph. In many scenarios, random samples from a graphical model are only partially observed - either only a few random variables may be observed for any sample, or the values of some variables (called hidden variables) are missing from each sample unless explicitly queried. This dissertation considers the following general problem: how to select a small subset of variables to observe for a given sample (called subset selection) or a small subset of samples for which to observe the hidden variables (called sample selection) so as to accurately predict the value of the unobserved variables? We investigate this question for 3 widely-studied classes of graphical models: Gaussian Processes, Ising Models, and Gaussian Mixture Models. We prove that finding an optimal subset selection strategy is NP-hard even for a restricted class of Gaussian Processes, called Gaussian Free Fields (GFF). We give a dynamic programming algorithm for Gaussian Processes on bounded tree-width graphs, which yields a fully polynomial time approximation scheme for the case of GFFs on such graphs. For general Gaussian Processes on bounded tree-width graphs, our algorithm's running time depends polynomially on the condition number of the covariance matrix. We also give a greedy constant-factor approximation algorithm for GFFs on arbitrary graphs. We consider both adaptive and non-adaptive subset selection for Ising Models. For the simple 1-dimensional ferromagnetic Ising Model, we demonstrate that adaptive strategies outperform non-adaptive strategies, and give a simple adaptive strategy whose error is at most a constant times that of the optimal adaptive strategy for the same observation budget. We prove that it is NP-hard to compute an optimal non-adaptive strategy for ferromagnetic Ising Models on general graphs. For mixture models, we define a 'maximum-a-posteriori' oracle and discuss how it differs from other oracle models. Then we demonstrate the advantage provided by this oracle by giving an algorithm which estimates the parameters of a mixture of high-dimensional spherical Gaussians under a weaker separation condition and more efficiently than known unsupervised algorithms"--Leaves iv-v.


Finite Mixture Models

Author: Geoffrey McLachlan

Publisher: John Wiley & Sons

Published: 2004-03-22

Total Pages: 419

ISBN-13: 047165406X

An up-to-date, comprehensive account of major issues in finite mixture modeling. This volume provides an up-to-date account of the theory and applications of modeling via finite mixture distributions. With an emphasis on the applications of mixture models in both mainstream analysis and other areas such as unsupervised pattern recognition, speech recognition, and medical imaging, the book describes the formulations of the finite mixture approach, details its methodology, discusses aspects of its implementation, and illustrates its application in many common statistical contexts. Major issues discussed in this book include identifiability problems, actual fitting of finite mixtures through use of the EM algorithm, properties of the maximum likelihood estimators so obtained, assessment of the number of components to be used in the mixture, and the applicability of asymptotic theory in providing a basis for the solutions to some of these problems. The author also considers how the EM algorithm can be scaled to handle the fitting of mixture models to very large databases, as in data mining applications. This comprehensive, practical guide:

* Provides more than 800 references, 40% published since 1995
* Includes an appendix listing available mixture software
* Links statistical literature with machine learning and pattern recognition literature
* Contains more than 100 helpful graphs, charts, and tables

Finite Mixture Models is an important resource for both applied and theoretical statisticians as well as for researchers in the many areas in which finite mixture models can be used to analyze data.


Multiscale Gaussian Graphical Models and Algorithms for Large-scale Inference

Author: Myung Jin Choi (Ph. D.)

Publisher:

Published: 2007

Total Pages: 123

ISBN-13:

Graphical models provide a powerful framework for stochastic processes by representing dependencies among random variables compactly with graphs. In particular, multiscale tree-structured graphs have attracted much attention for their computational efficiency as well as their ability to capture long-range correlations. However, tree models have limited modeling power that may lead to blocky artifacts. Previous works on extending trees to pyramidal structures resorted to computationally expensive methods to get solutions due to the resulting model complexity. In this thesis, we propose a pyramidal graphical model with rich modeling power for Gaussian processes, and develop efficient inference algorithms to solve large-scale estimation problems. The pyramidal graph has statistical links between pairs of neighboring nodes within each scale as well as between adjacent scales. Although the graph has many cycles, its hierarchical structure enables us to develop a class of fast algorithms in the spirit of multipole methods. The algorithms operate by guiding far-apart nodes to communicate through coarser scales and considering only local interactions at finer scales. The consistent stochastic structure of the pyramidal graph provides great flexibility in designing and analyzing inference algorithms. Based on emerging techniques for inference on Gaussian graphical models, we propose several different inference algorithms to compute not only the optimal estimates but also approximate error variances. In addition, we consider the problem of rapidly updating the estimates based on some new local information, and develop a re-estimation algorithm on the pyramidal graph. Simulation results show that this algorithm can be applied to reconstruct discontinuities blurred during the estimation process or to update the estimates to incorporate a new set of measurements introduced in a local region.


Algorithms and Applications for Gaussian Graphical Models Under the Multivariate Totally Positive Constraint of Order 2

Author: Uma Roy (M. Eng.)

Publisher:

Published: 2019

Total Pages: 72

ISBN-13:

We consider the problem of estimating an undirected Gaussian graphical model when the underlying distribution is multivariate totally positive of order 2 (MTP2), a strong form of positive dependence. A large body of methods has been proposed for learning undirected graphical models without the MTP2 constraint. A major limitation of these methods is that their consistency guarantees in the high-dimensional setting usually require a particular choice of a tuning parameter, which is unknown a priori in real-world applications. We show that an undirected graphical model under MTP2 can be learned consistently without any tuning parameters. We evaluate this new estimator on synthetic and real-world financial data sets, showing that it outperforms other methods in the literature with tuning parameters. We further explore applications of estimators in the MTP2 setting to covariance estimation for finance. In particular, the very well-explored optimal Markowitz portfolio allocation problem requires a precise estimate of the covariance matrix of returns. By exploiting the fact that the returns of assets are typically positively dependent, we propose a new estimator based on MTP2 estimation and show that this estimator outperforms (in terms of out-of-sample risk) baseline methods including shrinkage techniques and explicitly providing market factors on stock-market data spanning over thirty years.


Network Psychometrics with R

Author: Adela-Maria Isvoranu

Publisher: Taylor & Francis

Published: 2022-04-28

Total Pages: 261

ISBN-13: 100054107X

A systematic, innovative introduction to the field of network analysis, Network Psychometrics with R: A Guide for Behavioral and Social Scientists provides a comprehensive overview of and guide to both the theoretical foundations of network psychometrics as well as modelling techniques developed from this perspective. Written by pioneers in the field, this textbook showcases cutting-edge methods in an easily accessible format, accompanied by problem sets and code. After working through this book, readers will be able to understand the theoretical foundations behind network modelling, infer network topology, and estimate network parameters from different sources of data. This book features an introduction on the statistical programming language R that guides readers on how to analyse network structures and their stability using R. While Network Psychometrics with R is written in the context of social and behavioral science, the methods introduced in this book are widely applicable to data sets from related fields of study. Additionally, while the text is written in a non-technical manner, technical content is highlighted in textboxes for the interested reader. Network Psychometrics with R is ideal for instructors and students of undergraduate and graduate level courses and workshops in the field of network psychometrics as well as established researchers looking to master new methods. This book is accompanied by a companion website with resources for both students and lecturers.


Probability on Graphs

Author: Geoffrey Grimmett

Publisher: Cambridge University Press

Published: 2018-01-25

Total Pages: 279

ISBN-13: 1108542999

This introduction to some of the principal models in the theory of disordered systems leads the reader through the basics, to the very edge of contemporary research, with the minimum of technical fuss. Topics covered include random walk, percolation, self-avoiding walk, interacting particle systems, uniform spanning tree, random graphs, as well as the Ising, Potts, and random-cluster models for ferromagnetism, and the Lorentz model for motion in a random medium. This new edition features accounts of major recent progress, including the exact value of the connective constant of the hexagonal lattice, and the critical point of the random-cluster model on the square lattice. The choice of topics is strongly motivated by modern applications, and focuses on areas that merit further research. Accessible to a wide audience of mathematicians and physicists, this book can be used as a graduate course text. Each chapter ends with a range of exercises.


Bayesian Reasoning and Machine Learning

Author: David Barber

Publisher: Cambridge University Press

Published: 2012-02-02

Total Pages: 739

ISBN-13: 0521518148

A practical introduction perfect for final-year undergraduate and graduate students without a solid background in linear algebra and calculus.


An Introduction to Conditional Random Fields

Author: Charles Sutton

Publisher: Now Pub

Published: 2012

Total Pages: 120

ISBN-13: 9781601985729

An Introduction to Conditional Random Fields provides a comprehensive tutorial aimed at application-oriented practitioners seeking to apply CRFs. The monograph does not assume previous knowledge of graphical modeling, and so is intended to be useful to practitioners in a wide variety of fields.