Discrete or count data arise in experiments where the outcome variables are the numbers of individuals classified into unique, non-overlapping categories. This book describes the statistical models used in the analysis and summary of such data, and provides an introduction to the subject for graduate students and practitioners needing a review of the methodology. It includes topics not covered in depth elsewhere, such as the negative multinomial distribution, the many forms of the hypergeometric distribution, and coordinate-free models.
An Applied Treatment of Modern Graphical Methods for Analyzing Categorical Data
Discrete Data Analysis with R: Visualization and Modeling Techniques for Categorical and Count Data presents an applied treatment of modern methods for the analysis of categorical data, both discrete response data and frequency data. It explains how to use graphical meth
This book focuses on statistical methods for the analysis of discrete failure times. Failure time analysis is one of the most important fields in statistical research, with applications affecting a wide range of disciplines, in particular demography, econometrics, epidemiology and clinical research. Although there is a large variety of statistical methods for failure time analysis, many techniques are designed for failure times that are measured on a continuous scale. In empirical studies, however, failure times are often discrete, either because they have been measured in intervals (e.g., quarterly or yearly) or because they have been rounded or grouped. The book covers well-established methods like life-table analysis and discrete hazard regression models, but also introduces state-of-the-art techniques for model evaluation, nonparametric estimation and variable selection. Throughout, the methods are illustrated by real-life applications, and relationships to survival analysis in continuous time are explained. Each section includes a set of exercises on the respective topics. Various functions and tools for the analysis of discrete survival data are collected in the R package discSurv that accompanies the book.
This book is devoted to biased sampling problems (also called choice-based sampling in econometrics parlance) and over-identified parameter estimation problems. Biased sampling problems appear in many areas of research, including medicine, epidemiology and public health, the social sciences and economics. The book addresses a range of important topics, including case and control studies, causal inference, missing data problems, meta-analysis, renewal process and length biased sampling problems, capture and recapture problems, case cohort studies, exponential tilting genetic mixture models, etc. The goal of this book is to make it easier for PhD students and new researchers to get started in this research area. It will be of interest to all those who work in the health, biological, social and physical sciences, as well as those who are interested in survey methodology and other areas of statistical science.
The Statistical Analysis of Discrete Data provides an introduction to current statistical methods for analyzing discrete response data. The book can be used as a course text for graduate students and as a reference for researchers who analyze discrete data. The book's mathematical prerequisites are linear algebra and elementary advanced calculus. It assumes a basic statistics course which includes some decision theory, and knowledge of classical linear model theory for continuous response data. Problems are provided at the end of each chapter to give the reader an opportunity to apply the methods in the text, to explore extensions of the material covered, and to analyze data with discrete responses. In the text examples, and in the problems, we have sought to include interesting data sets from a wide variety of fields including political science, medicine, nuclear engineering, sociology, ecology, cancer research, library science, and biology. Although there are several texts available on discrete data analysis, we felt there was a need for a book which incorporated some of the myriad recent research advances. Our motivation was to introduce the subject by emphasizing its ties to the well-known theories of linear models, experimental design, and regression diagnostics, as well as to describe alternative methodologies (Bayesian, smoothing, etc.); the latter are based on the premise that external information is available. These overriding goals, together with our own experiences and biases, have governed our choice of topics.
The linear mixed model has become the main parametric tool for the analysis of continuous longitudinal data, as the authors discussed in their 2000 book. Without putting too much emphasis on software, the book shows how the different approaches can be implemented within the SAS software package. The authors received the American Statistical Association's Excellence in Continuing Education Award based on short courses on longitudinal and incomplete data at the Joint Statistical Meetings of 2002 and 2004.
The thirteen papers in "Structural Analysis of Discrete Data" are previously unpublished major research contributions solicited by the editors. They have been specifically prepared to fulfill the two-fold purpose of the volume: first, to provide the econometrics student with an overview of the present extent of the subject and to delineate the boundaries of current research, both in terms of methodology and applications. "Coordinated publication of important findings" should, as the editors state, "lower the cost of entry into the field and speed dissemination of recent research into the graduate econometrics classroom." A second purpose of the volume is to communicate results largely reported in the econometrics literature to a wider community of researchers to whom they are directly relevant, including applied econometricians, statisticians in the area of discrete multivariate analysis, specialists in biometrics, psychometrics, and sociometrics, and analysts in various applied fields such as finance, marketing, and transportation. The papers are grouped into four sections: "Statistical Analysis of Discrete Probability Models," with papers by the editors and by Steven Cosslett; "Dynamic Discrete Probability Models," consisting of two contributions by James Heckman; "Structural Discrete Probability Models Derived from Theories of Choice," with papers by Daniel McFadden, Gregory Fischer and Daniel Nagin, Steven Lerman and Charles Manski, and Moshe Ben-Akiva and Thawat Watanatada; and "Simultaneous Systems Models with Discrete Endogenous Variables," with contributions by Lung-Fei Lee, Jerry Hausman and David Wise, Dale Poirier, Peter Schmidt, and Robert Avery. Among the applications treated are income maintenance experiments, physician behavior, consumer credit, and intra-urban location and transportation.
This book describes the new generation of discrete choice methods, focusing on the many advances that are made possible by simulation. Researchers use these statistical methods to examine the choices that consumers, households, firms, and other agents make. Each of the major models is covered: logit, generalized extreme value, or GEV (including nested and cross-nested logits), probit, and mixed logit, plus a variety of specifications that build on these basics. Simulation-assisted estimation procedures are investigated and compared, including maximum simulated likelihood, method of simulated moments, and method of simulated scores. Procedures for drawing from densities are described, including variance reduction techniques such as antithetics and Halton draws. Recent advances in Bayesian procedures are explored, including the use of the Metropolis-Hastings algorithm and its variant Gibbs sampling. The second edition adds chapters on endogeneity and expectation-maximization (EM) algorithms. No other book incorporates all these fields, which have arisen in the past 25 years. The procedures are applicable in many fields, including energy, transportation, environmental studies, health, labor, and marketing.
Researchers in fields ranging from biology and medicine to the social sciences, law, and economics regularly encounter variables that are discrete or categorical in nature. While there is no dearth of books on the analysis and interpretation of such data, these generally focus on large sample methods. When sample sizes are not large or the data are
A much-needed introduction to the field of discrete-valued time series, with a focus on count-data time series
Time series analysis is an essential tool in a wide array of fields, including business, economics, computer science, epidemiology, finance, manufacturing and meteorology, to name just a few. Despite growing interest in discrete-valued time series, especially those arising from counting specific objects or events at specified times, most books on time series give short shrift to that increasingly important subject area. This book seeks to rectify that state of affairs by providing a much-needed introduction to discrete-valued time series, with particular focus on count-data time series. The main focus of this book is on modeling. Throughout, numerous examples are provided illustrating models currently used in discrete-valued time series applications. Statistical process control, including various control charts (such as cumulative sum control charts), and performance evaluation are treated at length. Classic approaches like ARMA models and the Box-Jenkins program are also featured, with the basics of these approaches summarized in an Appendix. In addition, data examples, with all relevant R code, are available on a companion website.
Provides a balanced presentation of theory and practice, exploring both categorical and integer-valued series
Covers common models for time series of counts as well as for categorical time series, and works out their most important stochastic properties
Addresses statistical approaches for analyzing discrete-valued time series and illustrates their implementation with numerous data examples
Covers classical approaches such as ARMA models, Box-Jenkins program and how to generate functions
Includes dataset examples with all necessary R code provided on a companion website
An Introduction to Discrete-Valued Time Series is a valuable working resource for researchers and practitioners in a broad range of fields, including statistics, data science, machine learning, and engineering. It will also be of interest to postgraduate students in statistics, mathematics and economics.