In multivariate data analysis, regression techniques predict one set of variables from another, while principal component analysis (PCA) finds a subspace of minimal dimensionality that captures the largest variability in the data. How can regression analysis and PCA be combined in a beneficial way? Why and when is it a good idea to combine them?
Constrained Principal Component Analysis and Related Techniques shows how constrained principal component analysis (CPCA) offers a unified framework for regression analysis and PCA. The book begins with four concrete examples of CPCA that provide you with a basic understanding of the technique and its applications. It gives a detailed account of projection and singular value decomposition. The author then describes the basic data requirements, models, and analytical tools for CPCA and their immediate extensions. He also introduces techniques that are special cases of or closely related to CPCA and discusses several topics relevant to practical uses of CPCA. The book concludes with a technique that imposes different constraints on different dimensions, along with its analytical extensions. Features: presents an in-depth, unified theoretical treatment of CPCA; contains implementation details and many real application examples; offers material for methodologically oriented readers interested in developing statistical techniques of their own; keeps the use of complicated iterative methods to a minimum; gives an overview of computer software for CPCA in the appendix; and provides MATLAB® programs and data on the author's website.
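The core computation this description alludes to can be sketched in a few lines: an external analysis that projects the data onto the space spanned by a constraint matrix, followed by an internal analysis that applies SVD (PCA) to the projected parts. The sketch below follows that reading and is not the book's implementation; the matrices Z and G and the two-component truncation are illustrative assumptions.

```python
# A minimal sketch of the CPCA idea (external analysis followed by
# internal analysis), assuming a data matrix Z and a row-constraint
# matrix G; variable names are illustrative, not from the book.
import numpy as np

rng = np.random.default_rng(0)
Z = rng.standard_normal((50, 6))        # n x p data matrix
G = rng.standard_normal((50, 2))        # n x s external row constraints

# External analysis: split Z into the part explained by G and a residual,
# using the orthogonal projector onto the column space of G.
P = G @ np.linalg.pinv(G)               # projector G (G'G)^{-1} G'
Z_explained = P @ Z
Z_residual = Z - Z_explained

# Internal analysis: PCA (via SVD) of the explained part.
U, s, Vt = np.linalg.svd(Z_explained, full_matrices=False)
scores = U[:, :2] * s[:2]               # component scores, first 2 dims
loadings = Vt[:2].T                     # component loadings
```

The residual part Z_residual can be analyzed the same way, which is what makes the framework subsume both regression (the projection step) and PCA (the decomposition step).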
Principal component analysis is probably the oldest and best known of the techniques of multivariate analysis. It was first introduced by Pearson (1901), and developed independently by Hotelling (1933). Like many multivariate methods, it was not widely used until the advent of electronic computers, but it is now well entrenched in virtually every statistical computer package. The central idea of principal component analysis is to reduce the dimensionality of a data set in which there are a large number of interrelated variables, while retaining as much as possible of the variation present in the data set. This reduction is achieved by transforming to a new set of variables, the principal components, which are uncorrelated, and which are ordered so that the first few retain most of the variation present in all of the original variables. Computation of the principal components reduces to the solution of an eigenvalue-eigenvector problem for a positive-semidefinite symmetric matrix. Thus, the definition and computation of principal components are straightforward but, as will be seen, this apparently simple technique has a wide variety of different applications, as well as a number of different derivations. Any feelings that principal component analysis is a narrow subject should soon be dispelled by the present book; indeed, some quite broad topics which are related to principal component analysis receive no more than a brief mention in the final two chapters.
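Since the passage reduces PCA to an eigenproblem for a positive-semidefinite symmetric matrix (the sample covariance matrix), a short numerical sketch may help fix ideas; the array shapes and random data below are purely illustrative.

```python
# Principal components as eigenvectors of the (positive-semidefinite,
# symmetric) sample covariance matrix, ordered by decreasing eigenvalue.
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 5))       # observations x variables

Xc = X - X.mean(axis=0)                 # center the variables
S = (Xc.T @ Xc) / (len(X) - 1)          # sample covariance matrix

eigvals, eigvecs = np.linalg.eigh(S)    # eigh: solver for symmetric matrices
order = np.argsort(eigvals)[::-1]       # sort by decreasing variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

pc_scores = Xc @ eigvecs                # uncorrelated principal components
explained = eigvals / eigvals.sum()     # proportion of variance retained
```

Keeping only the first few columns of pc_scores is exactly the dimensionality reduction the passage describes: the retained components are uncorrelated and capture the largest share of the total variance.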
This book provides a comprehensive introduction to the latest advances in the mathematical theory and computational tools for modeling high-dimensional data drawn from one or multiple low-dimensional subspaces (or manifolds) and potentially corrupted by noise, gross errors, or outliers. This challenging task requires the development of new algebraic, geometric, statistical, and computational methods for efficient and robust estimation and segmentation of one or multiple subspaces. The book also presents interesting real-world applications of these new methods in image processing, image and video segmentation, face recognition and clustering, and hybrid system identification. This book is intended to serve as a textbook for graduate students and beginning researchers in data science, machine learning, computer vision, image and signal processing, and systems theory. It contains ample illustrations, examples, and exercises and is made largely self-contained by three appendices that survey basic concepts and principles from statistics, optimization, and algebraic geometry used in the book. René Vidal is a Professor of Biomedical Engineering and Director of the Vision Dynamics and Learning Lab at The Johns Hopkins University. Yi Ma is Executive Dean and Professor at the School of Information Science and Technology at ShanghaiTech University. S. Shankar Sastry is Dean of the College of Engineering, Professor of Electrical Engineering and Computer Science, and Professor of Bioengineering at the University of California, Berkeley.
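As a toy illustration of the single-subspace case that such methods build from, the sketch below fits one low-dimensional subspace to noisy high-dimensional data by SVD. The dimensions, noise level, and variable names are assumptions for illustration; the multi-subspace segmentation methods the book develops go well beyond this building block.

```python
# Fitting a single d-dimensional subspace to noisy D-dimensional data
# via SVD: the leading left singular vectors span the estimated subspace.
import numpy as np

rng = np.random.default_rng(2)
d, D, n = 2, 10, 200                    # subspace dim, ambient dim, samples
basis_true = np.linalg.qr(rng.standard_normal((D, d)))[0]
X = basis_true @ rng.standard_normal((d, n))   # points on the subspace
X += 0.05 * rng.standard_normal((D, n))        # additive noise

U, s, Vt = np.linalg.svd(X, full_matrices=False)
basis_est = U[:, :d]                    # estimated orthonormal basis

# Relative reconstruction error: how well the subspace explains the data.
err = np.linalg.norm(X - basis_est @ (basis_est.T @ X)) / np.linalg.norm(X)
```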
This book expounds the principle and related applications of nonlinear principal component analysis (PCA), which is a useful method for analyzing data with mixed measurement levels. In the part dealing with the principle, after a brief introduction to ordinary PCA, a PCA for categorical data (nominal and ordinal) is introduced as nonlinear PCA, in which an optimal scaling technique is used to quantify the categorical variables. Alternating least squares (ALS) is the main algorithm in the method. Multiple correspondence analysis (MCA), a special case of nonlinear PCA, is also introduced. All formulations in these methods are integrated in a unified manner as matrix operations. Because data at any measurement level can be treated consistently as numerical data and ALS is a very powerful tool for estimation, the methods can be utilized in a variety of fields such as biometrics, econometrics, psychometrics, and sociology. In the applications part of the book, four applications are introduced: variable selection for mixed measurement levels data, sparse MCA, joint dimension reduction and clustering methods for categorical data, and acceleration of ALS computation. The variable selection methods in PCA that were originally developed for numerical data can be applied to any measurement level by using nonlinear PCA. Sparseness and joint dimension reduction and clustering for nonlinear data, the results of recent studies, are extensions obtained by the same matrix operations used in nonlinear PCA. Finally, an acceleration algorithm is proposed to reduce the computational cost of the ALS iteration in nonlinear multivariate methods. This book thus presents the usefulness of nonlinear PCA, which can be applied to data at different measurement levels in diverse fields. It also covers the latest topics, including extensions of traditional statistical methods, newly proposed nonlinear methods, and computational efficiency in these methods.
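A rough sketch of the ALS idea described here: alternate between a PCA step on the currently quantified data and an optimal-scaling step that re-quantifies each categorical variable to fit the low-rank reconstruction. The quantification rule below (category means of the reconstruction, nominal variables only) is a deliberate simplification for illustration and should not be taken as the book's algorithm.

```python
# Alternating least squares (ALS) for a simplified nonlinear PCA:
# alternate a rank-r PCA fit with an optimal-scaling re-quantification
# of each nominal variable.
import numpy as np

rng = np.random.default_rng(3)
n, p, r = 100, 4, 2
cats = rng.integers(0, 3, size=(n, p))  # categorical data, 3 levels each
Q = rng.standard_normal((n, p))         # initial quantified scores

for _ in range(50):
    Q -= Q.mean(axis=0)                 # center each quantified variable
    Q /= Q.std(axis=0)                  # standardize to unit variance
    U, s, Vt = np.linalg.svd(Q, full_matrices=False)
    Z = U[:, :r] * s[:r] @ Vt[:r]       # rank-r PCA reconstruction
    for j in range(p):                  # optimal-scaling step:
        for c in np.unique(cats[:, j]):
            mask = cats[:, j] == c      # every case in category c gets the
            Q[mask, j] = Z[mask, j].mean()  # same optimal quantification
```

Each half-step is a least-squares problem given the other, which is why the alternation monotonically improves the fit; ordinal variables would additionally require a monotone regression in the quantification step.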
This is the first textbook that allows readers who may be unfamiliar with matrices to understand a variety of multivariate analysis procedures in matrix form. By explaining which models underlie particular procedures and what objective function is optimized to fit the model to the data, it enables readers to rapidly comprehend multivariate data analysis. Arranged so that readers can intuitively grasp the purposes for which multivariate analysis procedures are used, the book also offers clear explanations of those purposes, with numerical examples preceding the mathematical descriptions. Supporting the modern matrix formulations by highlighting singular value decomposition among the theorems of matrix algebra, this book is useful for undergraduate students who have already learned introductory statistics, as well as for graduate students and researchers who are not familiar with matrix-intensive formulations of multivariate data analysis. The book begins by explaining fundamental matrix operations and the matrix expressions of elementary statistics. Then, it offers an introduction to popular multivariate procedures, with each chapter featuring increasingly advanced levels of matrix algebra. Further, the book includes six chapters on advanced procedures, covering advanced matrix operations and recently proposed multivariate procedures, such as sparse estimation, together with a clear explication of the differences between principal component and factor analysis solutions. In a nutshell, this book allows readers to gain an understanding of the latest developments in multivariate data science.
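Because this description singles out singular value decomposition as the central theorem behind the matrix formulations, a brief numerical illustration of the Eckart-Young property may be useful; the data and the chosen rank below are arbitrary.

```python
# Eckart-Young: truncating the SVD gives the best least-squares rank-k
# approximation of a matrix, the fact underlying matrix-based PCA.
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((20, 8))

U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 3
X_k = U[:, :k] * s[:k] @ Vt[:k]         # best rank-k approximation

# The squared Frobenius error equals the sum of the discarded
# squared singular values.
assert np.isclose(np.linalg.norm(X - X_k) ** 2, np.sum(s[k:] ** 2))
```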
Ever-greater computing technologies have given rise to an exponentially growing volume of data. Today, massive data sets (with potentially thousands of variables) play an important role in almost every branch of modern human activity, including networks, finance, and genetics. However, analyzing such data has presented a challenge for statisticians.
The first part of the book gives a general introduction to key concepts in algebraic statistics, focusing on methods that are helpful in the study of models with hidden variables. The author uses tensor geometry as a natural language to deal with multivariate probability distributions, develops new combinatorial tools to study models with hidden data, and describes the semialgebraic structure of statistical models. The second part illustrates important examples of tree models with hidden variables. The book discusses the underlying models and related combinatorial concepts of phylogenetic trees, as well as the local and global geometry of latent tree models. It also extends previous results to Gaussian latent tree models. This book shows you how both combinatorics and algebraic geometry enable a better understanding of latent tree models. It contains many results on the geometry of the models, including a detailed analysis of identifiability and the defining polynomial constraints.
The 13th International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2010, was held in Beijing, China, from 20-24 September 2010. The venue was the China National Convention Center (CNCC), China's largest and newest conference center, with excellent facilities and a prime location in the heart of the Olympic Green, adjacent to characteristic constructions like the Bird's Nest (National Stadium) and the Water Cube (National Aquatics Center). MICCAI is the foremost international scientific event in the field of medical image computing and computer-assisted interventions. The annual conference has a high scientific standard by virtue of the threshold for acceptance, and accordingly MICCAI has built up a track record of attracting leading scientists, engineers, and clinicians from a wide range of technical and biomedical disciplines. This year, we received 786 submissions, well in line with the previous two conferences in New York and London. Three program chairs and a program committee of 31 scientists, all with a recognized standing in the field of the conference, were responsible for the selection of the papers. The review process was set up such that each paper was considered by the three program chairs, two program committee members, and a minimum of three external reviewers. The review process was double-blind, so the reviewers did not know the identity of the authors of the submission. After a careful evaluation procedure, in which all controversial and gray-area papers were discussed individually, we arrived at a total of 251 accepted papers for MICCAI 2010, of which 45 were selected for podium presentation and 206 for poster presentation. The acceptance percentage (32%) was in keeping with that of previous MICCAI conferences. All 251 papers are included in the three MICCAI 2010 LNCS volumes.