Transposable Regularized Covariance Models with Applications to High-dimensional Data

Author: Genevera Irene Allen

Published: 2010

High-dimensional data is becoming more prevalent with new technologies in biomedical sciences, imaging and the Internet. Such data often contains complex relationships between and among sets of variables. When arranged in the form of a matrix, this data is transposable, meaning that the rows, the columns or both can be treated as features. To model transposable data, we present a modification of the matrix-variate normal, the mean-restricted matrix-variate normal, and introduce transposable regularized covariance models by placing penalties on inverse covariance matrices. We give theoretical results that exploit the structure of our transposable models to yield computationally feasible algorithms for parameter estimation and calculation of conditional expectations. These contributions make the matrix-variate normal accessible for application to high-dimensional data. We apply our model to two problems: missing data imputation and large-scale inference with the matrix-variate normal distribution. Examples, simulations and results are given using the Netflix movie-rating data and microarrays, demonstrating the flexibility and functionality of our transposable models.


High-Dimensional Covariance Estimation

Author: Mohsen Pourahmadi

Publisher: John Wiley & Sons

Published: 2013-06-24

Total Pages: 204

ISBN-13: 1118034295

Methods for estimating sparse and large covariance matrices. Covariance and correlation matrices play fundamental roles in every aspect of the analysis of multivariate data collected from a variety of fields including business and economics, health care, engineering, and environmental and physical sciences. High-Dimensional Covariance Estimation provides accessible and comprehensive coverage of the classical and modern approaches for estimating covariance matrices as well as their applications to the rapidly developing areas lying at the intersection of statistics and machine learning. Recently, the classical sample covariance methodologies have been modified and improved upon to meet the needs of statisticians and researchers dealing with large correlated datasets. High-Dimensional Covariance Estimation focuses on the methodologies based on shrinkage, thresholding, and penalized likelihood with applications to Gaussian graphical models, prediction, and mean-variance portfolio management. The book relies heavily on regression-based ideas and interpretations to connect and unify many existing methods and algorithms for the task. High-Dimensional Covariance Estimation features chapters on: Data, Sparsity, and Regularization; Regularizing the Eigenstructure; Banding, Tapering, and Thresholding; Covariance Matrices; Sparse Gaussian Graphical Models; and Multivariate Regression. The book is an ideal resource for researchers in statistics, mathematics, business and economics, computer sciences, and engineering, as well as a useful text or supplement for graduate-level courses in multivariate analysis, covariance estimation, statistical learning, and high-dimensional data analysis.


Large Covariance and Autocovariance Matrices

Author: Arup Bose

Publisher: CRC Press

Published: 2018-07-03

Total Pages: 272

ISBN-13: 1351398164

Large Covariance and Autocovariance Matrices brings together a collection of recent results on sample covariance and autocovariance matrices in high-dimensional models and novel ideas on how to use them for statistical inference in one or more high-dimensional time series models. The prerequisites include knowledge of elementary multivariate analysis, basic time series analysis and basic results in stochastic convergence. Part I is on different methods of estimation of large covariance matrices and auto-covariance matrices and properties of these estimators. Part II covers the relevant material on random matrix theory and non-commutative probability. Part III provides results on limit spectra and asymptotic normality of traces of symmetric matrix polynomial functions of sample auto-covariance matrices in high-dimensional linear time series models. These are used to develop graphical and significance tests for different hypotheses involving one or more independent high-dimensional linear time series. The book should be of interest to people in econometrics and statistics (large covariance matrices and high-dimensional time series), mathematics (random matrices and free probability) and computer science (wireless communication). Parts of it can be used in post-graduate courses on high-dimensional statistical inference, high-dimensional random matrices and high-dimensional time series models. It should be particularly attractive to researchers developing statistical methods in high-dimensional time series models. Arup Bose is a professor at the Indian Statistical Institute, Kolkata, India. He is a distinguished researcher in mathematical statistics and has been working in high-dimensional random matrices for the last fifteen years. He has been editor of Sankhyā for several years and has been on the editorial board of several other journals. 
He is a Fellow of the Institute of Mathematical Statistics, USA and all three national science academies of India, as well as the recipient of the S.S. Bhatnagar Award and the C.R. Rao Award. His first book Patterned Random Matrices was also published by Chapman & Hall. He has a forthcoming graduate text U-statistics, M-estimates and Resampling (with Snigdhansu Chatterjee) to be published by Hindustan Book Agency. Monika Bhattacharjee is a post-doctoral fellow at the Informatics Institute, University of Florida. After graduating from St. Xavier's College, Kolkata, she obtained her master’s in 2012 and PhD in 2016 from the Indian Statistical Institute. Her thesis in high-dimensional covariance and auto-covariance matrices, written under the supervision of Dr. Bose, has received high acclaim.


Wavelets and Multiscale Analysis

Author: Jonathan Cohen

Publisher: Springer Science & Business Media

Published: 2011-03-01

Total Pages: 345

ISBN-13: 0817680950

Since its emergence as an important research area in the early 1980s, the topic of wavelets has undergone tremendous development on both theoretical and applied fronts. Myriad research and survey papers and monographs have been published on the subject, documenting different areas of applications such as sound and image processing, denoising, data compression, tomography, and medical imaging. The study of wavelets remains a very active field of research, and many of its central techniques and ideas have evolved into new and promising research areas. This volume, a collection of invited contributions developed from talks at an international conference on wavelets, is divided into three parts: Part I is devoted to the mathematical theory of wavelets and features several papers on wavelet sets and the construction of wavelet bases in different settings. Part II looks at the use of multiscale harmonic analysis for understanding the geometry of large data sets and extracting information from them. Part III focuses on applications of wavelet theory to the study of several real-world problems. Overall, the book is an excellent reference for graduate students, researchers, and practitioners in theoretical and applied mathematics, or in engineering.


Handbook of Statistical Genomics

Author: David J. Balding

Publisher: John Wiley & Sons

Published: 2019-07-09

Total Pages: 1828

ISBN-13: 1119429250

A timely update of a highly popular handbook on statistical genomics. This new, two-volume edition of a classic text provides a thorough introduction to statistical genomics, a vital resource for advanced graduate students, early-career researchers and new entrants to the field. It introduces new and updated information on developments that have occurred since the 3rd edition. Widely regarded as the reference work in the field, it features new chapters focusing on statistical aspects of data generated by new sequencing technologies, including sequence-based functional assays. It expands on previous coverage of the many processes between genotype and phenotype, including gene expression and epigenetics, as well as metabolomics. It also examines population genetics and evolutionary models and inference, with new chapters on the multi-species coalescent, admixture and ancient DNA, as well as genetic association studies including causal analyses and variant interpretation. The Handbook of Statistical Genomics focuses on explaining the main ideas, analysis methods and algorithms, citing key recent and historic literature for further details and references. It also includes a glossary of terms, acronyms and abbreviations, and features extensive cross-referencing between chapters, tying the different areas together. With heavy use of up-to-date examples and references to web-based resources, this continues to be a must-have reference in a vital area of research. 
Provides much-needed, timely coverage of new developments in this expanding area of study. Numerous brand-new chapters, for example covering bacterial genomics, microbiome and metagenomics. Detailed coverage of application areas, with chapters on plant breeding, conservation and forensic genetics. Extensive coverage of human genetic epidemiology, including ethical aspects. Edited by one of the leading experts in the field along with rising stars as his co-editors. Chapter authors are world-renowned experts in the field and newly emerging leaders. The Handbook of Statistical Genomics is an excellent introductory text for advanced graduate students and early-career researchers involved in statistical genetics.


Level Sets and Extrema of Random Processes and Fields

Author: Jean-Marc Azais

Publisher: John Wiley & Sons

Published: 2009-02-17

Total Pages: 407

ISBN-13: 0470434635

A timely and comprehensive treatment of random field theory with applications across diverse areas of study. Level Sets and Extrema of Random Processes and Fields discusses how to understand the properties of the level sets of paths as well as how to compute the probability distribution of their extremal values, two general classes of problems that arise in the study of random processes and fields and in related applications. This book provides a unified and accessible approach to these two topics and their relationship to classical theory and Gaussian processes and fields, and the most modern research findings are also discussed. The authors begin with an introduction to the basic concepts of stochastic processes, including a modern review of Gaussian fields and their classical inequalities. Subsequent chapters are devoted to Rice formulas, regularity properties, and recent results on the tails of the distribution of the maximum. Finally, applications of random fields to various areas of mathematics are provided, specifically to systems of random equations and condition numbers of random matrices. Throughout the book, applications are illustrated from various areas of study such as statistics, genomics, and oceanography, while other results are relevant to econometrics, engineering, and mathematical physics. The presented material is reinforced by end-of-chapter exercises of varying difficulty. Most fundamental topics are addressed in the book, and an extensive, up-to-date bibliography directs readers to existing literature for further study. Level Sets and Extrema of Random Processes and Fields is an excellent book for courses on probability theory, spatial statistics, Gaussian fields, and probabilistic methods in real computation at the upper-undergraduate and graduate levels. 
It is also a valuable reference for professionals in mathematics and applied fields such as statistics, engineering, econometrics, mathematical physics, and biology.


Mathematical Methods in Biomedical Imaging and Intensity-Modulated Radiation Therapy (IMRT)

Author: Yair Censor

Publisher: Edizioni della Normale

Published: 2008-08-14

Total Pages: 0

ISBN-13: 9788876423147

This book contains papers presented by leading experts at the "Interdisciplinary Workshop on Mathematical Methods in Biomedical Imaging and Intensity-Modulated Radiation Therapy (IMRT)" held at the Centro di Ricerca Matematica (CRM) Ennio De Giorgi at Pisa, Italy, from October 15 to 19, 2007. The interdisciplinary book consists of research and review papers by leading experts and practitioners in biomedical imaging and intensity-modulated radiation therapy (IMRT).


Statistical Learning with Sparsity

Author: Trevor Hastie

Publisher: CRC Press

Published: 2015-05-07

Total Pages: 354

ISBN-13: 1498712177

Discover New Methods for Dealing with High-Dimensional Data. A sparse statistical model has only a small number of nonzero parameters or weights; therefore, it is much easier to estimate and interpret than a dense model. Statistical Learning with Sparsity: The Lasso and Generalizations presents methods that exploit sparsity to help recover the underl


Introduction to Single Cell Omics

Author: Xinghua Pan

Publisher: Frontiers Media SA

Published: 2019-09-19

Total Pages: 129

ISBN-13: 2889459209

Single-cell omics is an advancing frontier that stems from the sequencing of the human genome and the development of omics technologies, particularly genomics, transcriptomics, epigenomics and proteomics, with sensitivity now improved to the single-cell level. The new generation of methodologies, especially next-generation sequencing (NGS) technology, plays a leading role in genomics-related fields; however, conventional omics techniques require a large number of cells, usually on the order of millions, which is hardly attainable in some cases. More importantly, harnessing the power of omics technologies and applying them at the single-cell level is crucial, since every cell is specific and unique, and almost every cell population in every system, derived either in vivo or in vitro, is heterogeneous. Deciphering the heterogeneity of the cell population hence becomes critical for recognizing the mechanism and significance of the system. However, without an extensive examination of individual cells, a bulk analysis of a cell population would give only an average output of the cells and neglect the differences among them. Single-cell omics seeks to study a number of individual cells in parallel for the different dimensions of their molecular profiles on a genome-wide scale, providing unprecedented resolution for interpreting both the structure and function of an organ, tissue or other system, as well as the interaction (and communication) and dynamics of single cells or subpopulations of cells and their lineages. Importantly, single-cell omics enables the identification of a minor subpopulation of cells that may play a critical role in a biological process over a dominant subpopulation, such as in a cancer or a developing organ. It provides an ultra-sensitive tool for clarifying specific molecular mechanisms and pathways and revealing the nature of cell heterogeneity. 
It also empowers the clinical investigation of patients when only a very low quantity of cells is available for analysis, such as noninvasive cancer screening with circulating tumor cells (CTC), noninvasive prenatal diagnostics (NIPD) and preimplantation genetic testing (PGT) for in vitro fertilization. Single-cell omics greatly promotes the understanding of life at a more fundamental level and brings vast applications in medicine. Accordingly, single-cell omics is also called single-cell analysis or single-cell biology. Within only a couple of years, single-cell omics, especially transcriptomic sequencing (scRNA-seq) and whole-genome and exome sequencing (scWGS, scWES), has become robust and broadly accessible. Beyond the existing technologies, multiplexing barcode design and combinatorial indexing technology, either in combination with a microfluidic platform, as exemplified by Drop-seq, or independent of microfluidics using a regular PCR plate, have recently given us a greater capacity for single-cell analysis, scaling from one single cell to thousands of single cells in a single test. Unique molecular identifiers (UMIs) allow the amplification bias among the original molecules to be corrected faithfully, resulting in reliable quantitative measurement of omics in single cells. Of late, a variety of single-cell epigenomics analyses have become sophisticated, particularly single-cell chromatin accessibility (scATAC-seq) and CpG methylation profiling (scBS-seq, scRRBS-seq). High-resolution single-molecule fluorescence in situ hybridization (smFISH) and its revolutionary variants (e.g. seqFISH and MERFISH), in addition to spatial transcriptome sequencing, allow the native relationships of the individual cells of a tissue to be clarified visually and quantitatively in 3D or 4D. In addition, CRISPR/Cas9 editing-based in vivo lineage tracing methods enable the dynamic profile of a whole developmental process to be accurately displayed. 
Multi-omics analysis facilitates the study of the multi-dimensional regulation and relationships of different elements of the central dogma in a single cell, and permits a clear dissection of the complicated omics heterogeneity of a system. Last but not least, technical noise, biological noise, sequence dropout, and batch effects pose a huge challenge to the bioinformatics of single-cell omics. While significant progress in data analysis has been made, revolutionary theories and algorithms for single-cell omics are still awaited. Indeed, single-cell analysis exerts considerable impact on fields of biological study, particularly cancer, neurons and the nervous system, stem cells, embryonic development and the immune system; it also strongly motivates pharmaceutical R&D, clinical diagnosis and monitoring, and precision medicine. This book summarizes the recent developments and general considerations of single-cell analysis, with a detailed presentation of selected technologies and applications. Starting with experimental design for single-cell omics, the book then emphasizes the heterogeneity of cancer and other systems. It also introduces the basic methods and key facts of bioinformatics analysis. Second, the book summarizes two types of popular technologies, the fundamental tools for single-cell isolation and the developments in single-cell multi-omics, followed by descriptions of FISH technologies; other popular technologies are not covered here because they have been described extensively elsewhere in the recent literature. Finally, the book illustrates an elastomer-based integrated fluidic circuit that connects single-cell functional studies, combining stimulation, response, imaging and measurement, with corresponding single-cell sequencing. This is a model system for single-cell functional genomics. 
In addition, it reports a pipeline for single-cell proteomics with an analysis of the early development of the Xenopus embryo, a single-cell qRT-PCR application that defined the subpopulations related to cell cycling, and a new method for synergistic assembly of single-cell genomes with sequencing of amplification products generated by phi29 DNA polymerase. Given the tremendous progress of single-cell omics in recent years, the topics covered here are incomplete, but each individual topic is well addressed, of significant interest, and beneficial to scientists working in or affiliated with this field.