Originally published in 1990, the first edition of Subset Selection in Regression filled a significant gap in the literature, and its critical and popular success has continued for more than a decade. Thoroughly revised to reflect progress in theory, methods, and computing power, the second edition promises to continue that tradition. The author ha
In the course of one's research, the expediency of meeting contractual and other externally imposed deadlines too often seems to take priority over what may be more significant research findings in the longer run. Such is the case with this volume which, despite our best intentions, has been put aside time and again since 1971 in favor of what seemed to be more urgent matters. Despite this delay, to our knowledge the principal research results and documentation presented here have not been superseded by other publications. The background of this endeavor may be of some historical interest, especially to those who agree that research is not a straightforward, mechanistic process whose outcome or even direction is known in ad vance. In the process of this brief recounting, we would like to express our gratitude to those individuals and organizations who facilitated and supported our efforts. We were introduced to the Beale, Kendall and Mann algorithm, the source of all our efforts, quite by chance. Professor Britton Harris suggested to me in April 1967 that I might like to attend a CEIR half-day seminar on optimal regression being given by Professor M. G. Kendall in Washington. D. C. I agreed that the topic seemed interesting and went along. Had it not been for Harris' suggestion and financial support, this work almost certainly would have never begun.
The process of developing predictive models includes many stages. Most resources focus on the modeling algorithms but neglect other critical aspects of the modeling process. This book describes techniques for finding the best representations of predictors for modeling and for nding the best subset of predictors for improving model performance. A variety of example data sets are used to illustrate the techniques along with R programs for reproducing the results.
This User’s Guide is a resource for investigators and stakeholders who develop and review observational comparative effectiveness research protocols. It explains how to (1) identify key considerations and best practices for research design; (2) build a protocol based on these standards and best practices; and (3) judge the adequacy and completeness of a protocol. Eleven chapters cover all aspects of research design, including: developing study objectives, defining and refining study questions, addressing the heterogeneity of treatment effect, characterizing exposure, selecting a comparator, defining and measuring outcomes, and identifying optimal data sources. Checklists of guidance and key considerations for protocols are provided at the end of each chapter. The User’s Guide was created by researchers affiliated with AHRQ’s Effective Health Care Program, particularly those who participated in AHRQ’s DEcIDE (Developing Evidence to Inform Decisions About Effectiveness) program. Chapters were subject to multiple internal and external independent reviews. More more information, please consult the Agency website: www.effectivehealthcare.ahrq.gov)
Discover New Methods for Dealing with High-Dimensional DataA sparse statistical model has only a small number of nonzero parameters or weights; therefore, it is much easier to estimate and interpret than a dense model. Statistical Learning with Sparsity: The Lasso and Generalizations presents methods that exploit sparsity to help recover the underl
The essential introduction to the theory and application of linear models—now in a valuable new edition Since most advanced statistical tools are generalizations of the linear model, it is neces-sary to first master the linear model in order to move forward to more advanced concepts. The linear model remains the main tool of the applied statistician and is central to the training of any statistician regardless of whether the focus is applied or theoretical. This completely revised and updated new edition successfully develops the basic theory of linear models for regression, analysis of variance, analysis of covariance, and linear mixed models. Recent advances in the methodology related to linear mixed models, generalized linear models, and the Bayesian linear model are also addressed. Linear Models in Statistics, Second Edition includes full coverage of advanced topics, such as mixed and generalized linear models, Bayesian linear models, two-way models with empty cells, geometry of least squares, vector-matrix calculus, simultaneous inference, and logistic and nonlinear regression. Algebraic, geometrical, frequentist, and Bayesian approaches to both the inference of linear models and the analysis of variance are also illustrated. Through the expansion of relevant material and the inclusion of the latest technological developments in the field, this book provides readers with the theoretical foundation to correctly interpret computer software output as well as effectively use, customize, and understand linear models. This modern Second Edition features: New chapters on Bayesian linear models as well as random and mixed linear models Expanded discussion of two-way models with empty cells Additional sections on the geometry of least squares Updated coverage of simultaneous inference The book is complemented with easy-to-read proofs, real data sets, and an extensive bibliography. A thorough review of the requisite matrix algebra has been addedfor transitional purposes, and numerous theoretical and applied problems have been incorporated with selected answers provided at the end of the book. A related Web site includes additional data sets and SAS® code for all numerical examples. Linear Model in Statistics, Second Edition is a must-have book for courses in statistics, biostatistics, and mathematics at the upper-undergraduate and graduate levels. It is also an invaluable reference for researchers who need to gain a better understanding of regression and analysis of variance.
This book constitutes the thoroughly refereed post-conference proceedings of the Joint Meeting of the 2nd Luxembourg-Polish Symposium on Security and Trust and the 19th International Conference Intelligent Information Systems, held as International Joint Confererence on Security and Intelligent Information Systems, SIIS 2011, in Warsaw, Poland, in June 2011. The 29 revised full papers presented together with 2 invited lectures were carefully reviewed and selected from 60 initial submissions during two rounds of selection and improvement. The papers are organized in the following three thematic tracks: security and trust, data mining and machine learning, and natural language processing.
"Learning Statistics with R" covers the contents of an introductory statistics class, as typically taught to undergraduate psychology students, focusing on the use of the R statistical software and adopting a light, conversational style throughout. The book discusses how to get started in R, and gives an introduction to data manipulation and writing scripts. From a statistical perspective, the book discusses descriptive statistics and graphing first, followed by chapters on probability theory, sampling and estimation, and null hypothesis testing. After introducing the theory, the book covers the analysis of contingency tables, t-tests, ANOVAs and regression. Bayesian statistics are covered at the end of the book. For more information (and the opportunity to check the book out before you buy!) visit http://ua.edu.au/ccs/teaching/lsr or http://learningstatisticswithr.com
This important book describes procedures for selecting a model from a large set of competing statistical models. It includes model selection techniques for univariate and multivariate regression models, univariate and multivariate autoregressive models, nonparametric (including wavelets) and semiparametric regression models, and quasi-likelihood and robust regression models. Information-based model selection criteria are discussed, and small sample and asymptotic properties are presented. The book also provides examples and large scale simulation studies comparing the performances of information-based model selection criteria, bootstrapping, and cross-validation selection methods over a wide range of models.