This series of books collects a diverse array of work that provides the reader with theoretical and applied information on data analysis methods, models, and techniques, along with appropriate applications. Volume 1 begins with an introductory chapter by Gilbert Saporta, a leading expert in the field, who summarizes the developments in data analysis over the last 50 years. The book is then divided into three parts: Part 1 presents clustering and regression cases; Part 2 examines grouping and decomposition, GARCH and threshold models, structural equations, and SME modeling; and Part 3 presents symbolic data analysis, time series and multiple choice models, modeling in demography, and data mining.
The scientific field of data analysis is constantly expanding due to the rapid growth of the computer industry and the wide applicability of computational and algorithmic techniques, in conjunction with new advances in statistical, stochastic and analytic tools. There is a constant need for new, high-quality publications to cover the recent advances in all fields of science and engineering. This book is a collective work by a number of leading scientists, computer experts, analysts, engineers, mathematicians, probabilists and statisticians who have been working at the forefront of data analysis and related applications. The chapters of this collaborative work represent a cross-section of current concerns, developments and research interests in the above scientific areas. The collected material has been divided into appropriate sections to provide the reader with both theoretical and applied information on data analysis methods, models and techniques, along with related applications.
This book covers recent developments in correlated data analysis. It utilizes the class of dispersion models as marginal components in the formulation of joint models for correlated data. This enables the book to cover a broader range of data types than the traditional generalized linear models. The reader is provided with a systematic treatment for the topic of estimating functions, and both generalized estimating equations (GEE) and quadratic inference functions (QIF) are studied as special cases. In addition to the discussions on marginal models and mixed-effects models, this book covers new topics on joint regression analysis based on Gaussian copulas.
Handbook of Statistical Analysis and Data Mining Applications, Second Edition, is a comprehensive professional reference book that guides business analysts, scientists, engineers and researchers, both academic and industrial, through all stages of data analysis, model building and implementation. The handbook helps users discern technical and business problems, understand the strengths and weaknesses of modern data mining algorithms and employ the right statistical methods for practical application. This book is an ideal reference for users who want to address massive and complex datasets with novel statistical approaches and be able to objectively evaluate analyses and solutions. It has clear, intuitive explanations of the principles and tools for solving problems using modern analytic techniques and discusses their application to real problems in ways accessible and beneficial to practitioners across several areas, from science and engineering to medicine, academia and commerce.
- Includes input by practitioners for practitioners
- Includes tutorials in numerous fields of study that provide step-by-step instruction on how to use supplied tools to build models
- Contains practical advice from successful real-world implementations
- Brings together, in a single resource, all the information a beginner needs to understand the tools and issues in data mining to build successful data mining solutions
- Features clear, intuitive explanations of novel analytical tools and techniques, and their practical applications
To understand the world around us, as well as ourselves, we need to measure many things, many variables, many properties of the systems and processes we investigate. Hence, data collected in science, technology, and almost everywhere else are multivariate: a data table with multiple variables measured on multiple observations (cases, samples, items, process time points, experiments). This book describes a remarkably simple, minimalistic and practical approach to the analysis of data tables (multivariate data). The approach is based on projection methods, namely PCA (principal components analysis) and PLS (projection to latent structures), and the book shows how this works in science and technology for a wide variety of applications. In particular, it is shown how the great information content in well-collected multivariate data can be expressed in terms of simple but illuminating plots, facilitating the understanding and interpretation of the data. The projection approach applies to a variety of data-analytical objectives, namely (i) summarizing and visualizing a data set, (ii) multivariate classification and discriminant analysis, and (iii) finding quantitative relationships among the variables. This works with any shape of data table, with many or few variables (columns), many or few observations (rows), and complete or incomplete data tables (missing data). In particular, projections handle data matrices with more variables than observations very well, and the data can be noisy and highly collinear. Authors: The five authors are all connected to the Umetrics company (www.umetrics.com), which has developed and sold software for multivariate analysis since 1987 and supports customers with training and consultations. Umetrics' customers include most large and medium-sized companies in the pharmaceutical, biopharm, chemical, and semiconductor sectors.
Data Analysis for Omic Sciences: Methods and Applications, Volume 82, shows how these types of challenging datasets can be analyzed. Examples of applications in real environmental, clinical and food analysis cases help readers disseminate these approaches. Chapters of note include an Introduction to Data Analysis Relevance in the Omics Era, Omics Experimental Design and Data Acquisition, Microarrays Data, Analysis of High-Throughput RNA Sequencing Data, Analysis of High-Throughput DNA Bisulfite Sequencing Data, Data Quality Assessment in Untargeted LC-MS Metabolomic, Data Normalization and Scaling, Metabolomics Data Preprocessing, and more.
- Presents the best reference book for omics data analysis
- Provides a review of the latest trends in transcriptomics and metabolomics data analysis tools
- Includes examples of applications in research fields, such as environmental, biomedical and food analysis
Methods and Applications of Longitudinal Data Analysis describes methods for the analysis of longitudinal data in the medical, biological and behavioral sciences. It introduces basic concepts and functions including a variety of regression models, and their practical applications across many areas of research. Statistical procedures featured within the text include:
- descriptive methods for delineating trends over time
- linear mixed regression models with both fixed and random effects
- covariance pattern models on correlated errors
- generalized estimating equations
- nonlinear regression models for categorical repeated measurements
- techniques for analyzing longitudinal data with non-ignorable missing observations
Emphasis is given to applications of these methods, using substantial empirical illustrations, designed to help users of statistics better analyze and understand longitudinal data. Methods and Applications of Longitudinal Data Analysis equips both graduate students and professionals to confidently apply longitudinal data analysis to their particular discipline. It also provides a valuable reference source for applied statisticians, demographers and other quantitative methodologists.
- From novice to professional: this book starts with the introduction of basic models and ends with the description of some of the most advanced models in longitudinal data analysis
- Enables students to select the correct statistical methods to apply to their longitudinal data and avoid the pitfalls associated with incorrect selection
- Identifies the limitations of classical repeated measures models and describes newly developed techniques, along with real-world examples
It is difficult to imagine that the statistical analysis of compositional data has been a major issue of concern for more than 100 years. It is even more difficult to realize that so many statisticians and users of statistics are unaware of the particular problems affecting compositional data, as well as their solutions. The issue of "spurious correlation", as the situation was phrased by Karl Pearson back in 1897, affects all data that measure parts of some whole, such as percentages, proportions, ppm and ppb. Such measurements are present in all fields of science, including geology, biology, environmental sciences, forensic sciences, medicine and hydrology. This book presents the history and development of compositional data analysis along with Aitchison's log-ratio approach. Compositional Data Analysis describes the state of the art both in theory and in applications across the different fields of science. Key Features: Reflects the state of the art in compositional data analysis. Gives an overview of the historical development of compositional data analysis, as well as basic concepts and procedures. Looks at advances in algebra and calculus on the simplex. Presents applications in different fields of science, including genomics, ecology, biology, geochemistry, planetology, chemistry and economics. Explores connections to correspondence analysis and the Dirichlet distribution. Presents a summary of three available software packages for compositional data analysis. Supported by an accompanying website featuring R code. Applied scientists working on compositional data analysis in any field of science, both academics and professionals, will benefit from this book, along with graduate students in any field of science working with compositional data.
This is a revised and greatly expanded version of the previous second edition of the book. "Pharmacokinetic and Pharmacodynamic Data Analysis" provides an introduction to pharmacokinetic and pharmacodynamic concepts using simple illustrations and reasoning. It describes ways in which pharmacokinetic and pharmacodynamic theory may be used to give insight into modeling questions and how these questions can in turn lead to new knowledge. This book differentiates itself from other texts in this area in that it bridges the gap between relevant theory and the actual application of the theory to real-life situations. The book is divided into two parts: the first introduces fundamental principles of PK and PD concepts and principles of mathematical modeling, while the second provides case studies obtained from the drug industry and academia. Topics in the first part include a discussion of the statistical principles of model fitting, including how to assess the adequacy of the fit of a model, as well as strategies for selection of time points to be included in the design of a study. The first part also introduces basic pharmacokinetic and pharmacodynamic concepts, including an excellent discussion of effect compartment (link) models as well as indirect response models. The second part of the text includes over 70 modeling case studies. These include a discussion of the selection of the model, derivation of initial parameter estimates and interpretation of the corresponding output. Finally, the authors discuss a number of pharmacodynamic modeling situations including receptor binding models, synergy, and tolerance models (feedback and precursor models). This book will be of interest to researchers, graduate students and advanced undergraduate students in the PK/PD area who wish to learn how to analyze biological data and build models and to become familiar with new areas of application. In addition, the text will be of interest to toxicologists interested in learning about determinants of exposure and performing toxicokinetic modeling. The inclusion of numerous exercises and models makes it an excellent primary or adjunct text for traditional PK courses taught in pharmacy and medical schools. A diskette included with the text contains all of the exercises and solutions using WinNonlin.