This paper proposes a novel shrinkage estimator for high-dimensional covariance matrices that extends the Oracle Approximating Shrinkage (OAS) of Chen et al. (2009) to target the diagonal elements of the sample covariance matrix. We derive a closed-form solution for the shrinkage parameter and show by simulation that, when the diagonal elements of the true covariance matrix exhibit substantial variation, our method reduces the mean squared error (MSE) compared with the original OAS, which targets an average variance. The improvement is larger when the true covariance matrix is sparser. Our method also reduces the MSE of the inverse covariance matrix estimate.
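The abstract does not reproduce the paper's closed-form shrinkage parameter, but the general shape of such an estimator is easy to sketch: shrink the sample covariance toward its own diagonal, Sigma_hat = (1 - rho) * S + rho * diag(S). The Python sketch below is a minimal illustration, not the paper's method; the plug-in intensity rho uses the standard OAS formula of Chen et al. as a stand-in, and the function name and simulation setup are assumptions.

```python
import numpy as np

def shrink_to_diagonal(X, rho=None):
    """Shrink the sample covariance of X (n samples x p features) toward
    its own diagonal: Sigma_hat = (1 - rho) * S + rho * diag(S).

    If rho is None, a plug-in intensity is computed with the standard
    OAS formula of Chen et al. (a stand-in; the paper derives its own
    closed form for the diagonal target)."""
    n, p = X.shape
    S = np.cov(X, rowvar=False)           # sample covariance (p x p)
    if rho is None:
        tr_S = np.trace(S)
        tr_S2 = np.trace(S @ S)
        num = (1.0 - 2.0 / p) * tr_S2 + tr_S**2
        den = (n + 1.0 - 2.0 / p) * (tr_S2 - tr_S**2 / p)
        rho = min(1.0, num / den)         # clip the intensity to [0, 1]
    target = np.diag(np.diag(S))          # diagonal target keeps each variance
    return (1.0 - rho) * S + rho * target, rho

# Example: heterogeneous true variances, n < p
rng = np.random.default_rng(0)
Sigma = np.diag(rng.uniform(0.5, 5.0, size=50))   # sparse truth: diagonal
X = rng.multivariate_normal(np.zeros(50), Sigma, size=30)
Sigma_hat, rho = shrink_to_diagonal(X)
```

When the true variances are heterogeneous, shrinking toward diag(S) preserves that heterogeneity, whereas the scaled-identity target of the original OAS flattens it; this is the intuition behind the abstract's reported MSE gains.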
Methods for estimating sparse and large covariance matrices. Covariance and correlation matrices play fundamental roles in every aspect of the analysis of multivariate data collected from a variety of fields including business and economics, health care, engineering, and environmental and physical sciences. High-Dimensional Covariance Estimation provides accessible and comprehensive coverage of the classical and modern approaches for estimating covariance matrices as well as their applications to the rapidly developing areas lying at the intersection of statistics and machine learning. Recently, the classical sample covariance methodologies have been modified and improved upon to meet the needs of statisticians and researchers dealing with large correlated datasets. High-Dimensional Covariance Estimation focuses on the methodologies based on shrinkage, thresholding, and penalized likelihood with applications to Gaussian graphical models, prediction, and mean-variance portfolio management. The book relies heavily on regression-based ideas and interpretations to connect and unify many existing methods and algorithms for the task. High-Dimensional Covariance Estimation features chapters on: Data, Sparsity, and Regularization; Regularizing the Eigenstructure; Banding, Tapering, and Thresholding Covariance Matrices; Sparse Gaussian Graphical Models; and Multivariate Regression. The book is an ideal resource for researchers in statistics, mathematics, business and economics, computer sciences, and engineering, as well as a useful text or supplement for graduate-level courses in multivariate analysis, covariance estimation, statistical learning, and high-dimensional data analysis.
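Of the methodologies the book covers, thresholding is the quickest to illustrate. Below is a minimal sketch of soft-thresholding the off-diagonal entries of a sample covariance matrix; the function name and the choice to leave variances untouched are illustrative assumptions, not code from the book.

```python
import numpy as np

def soft_threshold_cov(S, lam):
    """Soft-threshold the off-diagonal entries of a covariance estimate S:
    entries with |S_ij| <= lam are set to zero, larger ones shrunk toward zero."""
    T = np.sign(S) * np.maximum(np.abs(S) - lam, 0.0)
    np.fill_diagonal(T, np.diag(S))      # keep the variances untouched
    return T

# Example: small, noisy off-diagonal entries of a sample covariance are zeroed out
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 10))
S_sparse = soft_threshold_cov(np.cov(X, rowvar=False), lam=0.2)
```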
Praise for the first edition: "[This book] succeeds singularly at providing a structured introduction to this active field of research. ... it is arguably the most accessible overview yet published of the mathematical ideas and principles that one needs to master to enter the field of high-dimensional statistics. ... recommended to anyone interested in the main results of current research in high-dimensional statistics as well as anyone interested in acquiring the core mathematical skills to enter this area of research." —Journal of the American Statistical Association. Introduction to High-Dimensional Statistics, Second Edition preserves the philosophy of the first edition: to be a concise guide for students and researchers discovering the area and interested in the mathematics involved. The main concepts and ideas are presented in simple settings, thereby avoiding unessential technicalities. High-dimensional statistics is a fast-evolving field, and much progress has been made on a large variety of topics, providing new insights and methods. Offering a succinct presentation of the mathematical foundations of high-dimensional statistics, this new edition: offers revised chapters from the previous edition, with many additional materials on important topics including compressed sensing, estimation with convex constraints, the slope estimator, simultaneously low-rank and row-sparse linear regression, and aggregation of a continuous set of estimators; introduces three new chapters on iterative algorithms, clustering, and minimax lower bounds; provides enhanced appendices, with the addition of the Davis-Kahan perturbation bound and of two simple versions of the Hanson-Wright concentration inequality; covers cutting-edge statistical methods including model selection, sparsity and the Lasso, iterative hard thresholding, aggregation, support vector machines, and learning theory; provides detailed exercises at the end of every chapter with collaborative solutions on a wiki site; and illustrates concepts with simple but clear practical examples.
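One of the methods listed above, iterative hard thresholding, is compact enough to sketch here. The version below is a minimal illustration for sparse linear regression, with an assumed conservative step size and a user-supplied sparsity level; it is not the book's code.

```python
import numpy as np

def iht(A, y, k, n_iter=200):
    """Iterative hard thresholding for y ~ A x with a k-sparse x:
    gradient step on the least-squares loss, then keep the k largest
    coefficients in magnitude (a minimal version; see the book for theory)."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # conservative step size
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = x + step * A.T @ (y - A @ x)     # gradient step
        keep = np.argsort(np.abs(x))[-k:]    # indices of the k largest entries
        mask = np.zeros_like(x, dtype=bool)
        mask[keep] = True
        x[~mask] = 0.0                       # hard-threshold the rest
    return x

# Example: recover a 5-sparse signal from noisy linear measurements
rng = np.random.default_rng(0)
A = rng.normal(size=(80, 200)) / np.sqrt(80)
x_true = np.zeros(200)
x_true[:5] = 1.0
y = A @ x_true + 0.01 * rng.normal(size=80)
x_hat = iht(A, y, k=5)
```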
An outgrowth of the author's extensive experience teaching senior and graduate level students, this is both a thorough introduction and a solid professional reference. The material covered has been developed based on a 35-year research program associated with such systems as the Landsat satellite program and later satellite and aircraft programs. The book covers existing aircraft and satellite programs as well as several future programs, and an Instructor's Manual presenting detailed solutions to all the problems in the book is available from the Wiley editorial department.
This book provides an introduction to the mathematical and algorithmic foundations of data science, including machine learning, high-dimensional geometry, and analysis of large networks. Topics include the counterintuitive nature of data in high dimensions, important linear algebraic techniques such as singular value decomposition, the theory of random walks and Markov chains, the fundamentals of and important algorithms for machine learning, algorithms and analysis for clustering, probabilistic models for large networks, representation learning including topic modelling and non-negative matrix factorization, wavelets and compressed sensing. Important probabilistic techniques are developed including the law of large numbers, tail inequalities, analysis of random projections, generalization guarantees in machine learning, and moment methods for analysis of phase transitions in large random graphs. Additionally, important structural and complexity measures are discussed such as matrix norms and VC-dimension. This book is suitable for both undergraduate and graduate courses in the design and analysis of algorithms for data.
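To give a flavor of the probabilistic techniques mentioned, here is a minimal sketch of a Gaussian random projection, whose distance-preservation analysis (in the Johnson-Lindenstrauss spirit) is among the topics listed; the dimensions and the single distance check are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, k = 100, 10_000, 400                   # theory: k ~ O(log n / eps^2)
X = rng.normal(size=(n, d))                  # points in high dimension
R = rng.normal(size=(d, k)) / np.sqrt(k)     # Gaussian random projection
Y = X @ R                                    # projected points

# Pairwise distances are approximately preserved
i, j = 0, 1
orig = np.linalg.norm(X[i] - X[j])
proj = np.linalg.norm(Y[i] - Y[j])
print(f"distance ratio after projection: {proj / orig:.3f}")  # close to 1
```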
Fundamental topics and new methods in time series analysis. Analysis of Financial Time Series provides a comprehensive and systematic introduction to financial econometric models and their application to modeling and prediction of financial time series data. It utilizes real-world examples and real financial data throughout the book to apply the models and methods described. The author begins with basic characteristics of financial time series data before covering three main topics: analysis and application of univariate financial time series; the return series of multiple assets; and Bayesian inference in finance methods. Timely topics and recent results include: Value at Risk (VaR); high-frequency financial data analysis; Markov Chain Monte Carlo (MCMC) methods; derivative pricing using jump diffusion with closed-form formulas; VaR calculation using extreme value theory based on a non-homogeneous two-dimensional Poisson process; and multivariate volatility models with time-varying correlations. Ideal as a fundamental introduction to time series for MBA students or as a reference for researchers and practitioners in business and finance, Analysis of Financial Time Series offers an in-depth and up-to-date account of these vital methods.
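As a taste of the first topic listed, a one-day Value at Risk can be computed by historical simulation in a few lines. The sketch below uses synthetic fat-tailed returns and a 99% level as illustrative assumptions; the book's treatment goes much further (e.g., extreme value theory on real data).

```python
import numpy as np

rng = np.random.default_rng(2)
returns = rng.standard_t(df=5, size=2500) * 0.01   # synthetic fat-tailed daily returns

alpha = 0.99                                        # 99% one-day VaR
var_99 = -np.quantile(returns, 1 - alpha)           # loss exceeded on ~1% of days
print(f"99% one-day historical VaR: {var_99:.4f}")
```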
Random matrices now play a role in many areas of theoretical, applied, and computational mathematics. It is therefore desirable to have tools for studying random matrices that are flexible, easy to use, and powerful. Over the last fifteen years, researchers have developed a remarkable family of results, called matrix concentration inequalities, that achieve all of these goals. This monograph offers an invitation to the field of matrix concentration inequalities. It begins with some history of random matrix theory; it describes a flexible model for random matrices that is suitable for many problems; and it discusses the most important matrix concentration results. To demonstrate the value of these techniques, the presentation includes examples drawn from statistics, machine learning, optimization, combinatorics, algorithms, scientific computing, and beyond.
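To show what such results buy in practice, the sketch below numerically compares the spectral norm of a sum of independent, bounded, zero-mean random matrices with the matrix Bernstein expectation scale sqrt(2 v log(2d)) + L log(2d)/3, a standard bound from this literature; the experimental setup (random-sign rank-one matrices) is an assumption chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
d, N = 100, 1000

# Independent zero-mean matrices X_i = sign_i * u_i u_i^T with ||X_i|| = 1
U = rng.normal(size=(N, d))
U /= np.linalg.norm(U, axis=1, keepdims=True)        # random unit vectors
signs = rng.choice([-1.0, 1.0], size=N)
Z = (U * signs[:, None]).T @ U                        # sum_i sign_i u_i u_i^T

v = N / d            # ||sum E[X_i^2]|| = N * ||E[u u^T]|| = N/d for uniform u
L = 1.0              # uniform bound on ||X_i||
bernstein = np.sqrt(2 * v * np.log(2 * d)) + L * np.log(2 * d) / 3

print(f"||sum X_i||     = {np.linalg.norm(Z, 2):.2f}")
print(f"Bernstein scale = {bernstein:.2f}")           # empirical norm sits below it
```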
How can one make forecasts that (1) satisfy constraints, such as accounting identities, and (2) are smooth over time? Solving this common forecasting problem manually is resource-intensive, and the existing literature provides little guidance on how to achieve both objectives. This paper proposes a new method to smooth mixed-frequency multivariate time series subject to constraints by integrating minimum-trace reconciliation and the Hodrick-Prescott filter. With linear constraints, the method has a closed-form solution, which is convenient in a high-dimensional environment. Three examples show that the proposed method can reproduce the smoothness of professional forecasts subject to various constraints and slightly improve forecast performance.
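The closed-form claim can be illustrated in the single-series case, where smoothing with a Hodrick-Prescott penalty under linear constraints reduces to one KKT solve. The sketch below is a minimal stand-in for the paper's method (which additionally applies minimum-trace reconciliation across mixed-frequency series); the penalty weight and the sum-to-annual-total constraint are illustrative assumptions.

```python
import numpy as np

def constrained_hp(y, lam, A, b):
    """Smooth y with an HP-filter penalty subject to linear constraints A x = b:
    minimize ||y - x||^2 + lam * ||D2 x||^2  s.t.  A x = b, where D2 is the
    second-difference operator. With linear constraints the solution is the
    closed-form solve of a single KKT system."""
    T = len(y)
    D2 = np.diff(np.eye(T), n=2, axis=0)              # (T-2) x T second differences
    H = np.eye(T) + lam * D2.T @ D2                   # Hessian of the objective
    m = A.shape[0]
    KKT = np.block([[H, A.T], [A, np.zeros((m, m))]])
    rhs = np.concatenate([y, b])
    sol = np.linalg.solve(KKT, rhs)
    return sol[:T]                                    # smooth, constraint-satisfying path

# Example: 12 monthly values constrained to sum to a known annual total
rng = np.random.default_rng(4)
y = np.cumsum(rng.normal(size=12)) + 10
A = np.ones((1, 12))
x = constrained_hp(y, lam=10.0, A=A, b=np.array([120.0]))
print(x.sum())   # 120.0 up to floating point
```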
The aim of the book is to introduce basic concepts, main results, and widely applied mathematical tools in the spectral analysis of large dimensional random matrices. The core of the book focuses on results established under moment conditions on random variables using probabilistic methods, and is thus easily applicable to statistics and other areas of science. The book introduces fundamental results, most of them investigated by the authors, such as the semicircular law of Wigner matrices, the Marcenko-Pastur law, the limiting spectral distribution of the multivariate F matrix, limits of extreme eigenvalues, spectrum separation theorems, convergence rates of empirical distributions, central limit theorems of linear spectral statistics, and the partial solution of the famous circular law. While deriving the main results, the book simultaneously emphasizes the ideas and methodologies of the fundamental mathematical tools, among them being: truncation techniques, matrix identities, moment convergence theorems, and the Stieltjes transform. Its treatment is especially fitting to the needs of mathematics and statistics graduate students and beginning researchers, having a basic knowledge of matrix theory and an understanding of probability theory at the graduate level, who desire to learn the concepts and tools in solving problems in this area. It can also serve as a detailed handbook on results of large dimensional random matrices for practical users. This second edition includes two additional chapters, one on the authors' results on the limiting behavior of eigenvectors of sample covariance matrices, another on applications to wireless communications and finance. While attempting to bring this edition up-to-date on recent work, it also provides summaries of other areas which are typically considered part of the general field of random matrix theory.
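One of the fundamental results mentioned, the Marcenko-Pastur law, is straightforward to check numerically: the eigenvalues of a sample covariance matrix built from iid entries concentrate on a known support. The dimensions and the density-grid comparison below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 2000, 500                          # aspect ratio c = p/n = 0.25
X = rng.normal(size=(n, p))               # iid entries, unit variance
S = X.T @ X / n                           # sample covariance
eigs = np.linalg.eigvalsh(S)

c = p / n
lo, hi = (1 - np.sqrt(c)) ** 2, (1 + np.sqrt(c)) ** 2   # MP support edges
print(f"empirical range: [{eigs.min():.3f}, {eigs.max():.3f}]")
print(f"MP support:      [{lo:.3f}, {hi:.3f}]")          # ranges nearly coincide

# Marcenko-Pastur density on the support, for comparison with a histogram of eigs
grid = np.linspace(lo, hi, 200)
mp_density = np.sqrt((hi - grid) * (grid - lo)) / (2 * np.pi * c * grid)
```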