Model-Free Prediction and Regression

Author: Dimitris N. Politis

Publisher: Springer

Published: 2015-11-13

Total Pages: 256

ISBN-10: 3319213474

The Model-Free Prediction Principle expounded upon in this monograph is based on the simple notion of transforming a complex dataset to one that is easier to work with, e.g., i.i.d. or Gaussian. As such, it restores the emphasis on observable quantities, i.e., current and future data, as opposed to unobservable model parameters and estimates thereof, and yields optimal predictors in diverse settings such as regression and time series. Furthermore, the Model-Free Bootstrap takes us beyond point prediction in order to construct frequentist prediction intervals without resort to unrealistic assumptions such as normality.
Prediction has been traditionally approached via a model-based paradigm, i.e., (a) fit a model to the data at hand, and (b) use the fitted model to extrapolate/predict future data. Due to both mathematical and computational constraints, 20th century statistical practice focused mostly on parametric models. Fortunately, with the advent of widely accessible powerful computing in the late 1970s, computer-intensive methods such as the bootstrap and cross-validation freed practitioners from the limitations of parametric models, and paved the way towards the ‘big data’ era of the 21st century. Nonetheless, there is a further step one may take, i.e., going beyond even nonparametric models; this is where the Model-Free Prediction Principle is useful.
Interestingly, being able to predict a response variable Y associated with a regressor variable X taking on any possible value seems to inadvertently also achieve the main goal of modeling, i.e., trying to describe how Y depends on X. Hence, as prediction can be treated as a by-product of model-fitting, key estimation problems can be addressed as a by-product of being able to perform prediction. In other words, a practitioner can use Model-Free Prediction ideas in order to additionally obtain point estimates and confidence intervals for relevant parameters leading to an alternative, transformation-based approach to statistical inference.


Clinical Prediction Models

Author: Ewout W. Steyerberg

Publisher: Springer

Published: 2019-07-22

Total Pages: 574

ISBN-10: 3030163997

The second edition of this volume provides insight and practical illustrations on how modern statistical concepts and regression methods can be applied in medical prediction problems, including diagnostic and prognostic outcomes. Many advances have been made in statistical approaches towards outcome prediction, but a sensible strategy is needed for model development, validation, and updating, such that prediction models can better support medical practice. There is an increasing need for personalized evidence-based medicine that uses an individualized approach to medical decision-making. In this Big Data era, there is expanded access to large volumes of routinely collected data and an increased number of applications for prediction models, such as targeted early detection of disease and individualized approaches to diagnostic testing and treatment. Clinical Prediction Models presents a practical checklist that needs to be considered for development of a valid prediction model. Steps include preliminary considerations such as dealing with missing values; coding of predictors; selection of main effects and interactions for a multivariable model; estimation of model parameters with shrinkage methods and incorporation of external data; evaluation of performance and usefulness; internal validation; and presentation formatting. The text also addresses common issues that make prediction models suboptimal, such as small sample sizes, exaggerated claims, and poor generalizability. The text is primarily intended for clinical epidemiologists and biostatisticians. Including many case studies and publicly available R code and data sets, the book is also appropriate as a textbook for a graduate course on predictive modeling in diagnosis and prognosis. While practical in nature, the book also provides a philosophical perspective on data analysis in medicine that goes beyond predictive modeling. 
Updates to this new and expanded edition include:
• A discussion of Big Data and its implications for the design of prediction models
• Machine learning issues
• More simulations with missing ‘y’ values
• Extended discussion on between-cohort heterogeneity
• Description of ShinyApp
• Updated LASSO illustration
• New case studies


Practical Statistics for Data Scientists

Author: Peter Bruce

Publisher: O'Reilly Media, Inc.

Published: 2017-05-10

Total Pages: 322

ISBN-10: 1491952911

Statistical methods are a key part of data science, yet very few data scientists have any formal statistics training. Courses and books on basic statistics rarely cover the topic from a data science perspective. This practical guide explains how to apply various statistical methods to data science, tells you how to avoid their misuse, and gives you advice on what's important and what's not. Many data science resources incorporate statistical methods but lack a deeper statistical perspective. If you’re familiar with the R programming language, and have some exposure to statistics, this quick reference bridges the gap in an accessible, readable format. With this book, you’ll learn:
• Why exploratory data analysis is a key preliminary step in data science
• How random sampling can reduce bias and yield a higher quality dataset, even with big data
• How the principles of experimental design yield definitive answers to questions
• How to use regression to estimate outcomes and detect anomalies
• Key classification techniques for predicting which categories a record belongs to
• Statistical machine learning methods that “learn” from data
• Unsupervised learning methods for extracting meaning from unlabeled data


Fundamentals of Clinical Data Science

Author: Pieter Kubben

Publisher: Springer

Published: 2018-12-21

Total Pages: 218

ISBN-10: 3319997130

This open access book comprehensively covers the fundamentals of clinical data science, focusing on data collection, modelling and clinical applications. Topics covered in the first section on data collection include: data sources, data at scale (big data), data stewardship (FAIR data) and related privacy concerns. Aspects of predictive modelling using techniques such as classification, regression or clustering, and prediction model validation are covered in the second section. The third section covers aspects of (mobile) clinical decision support systems, operational excellence and value-based healthcare. Fundamentals of Clinical Data Science is an essential resource for healthcare professionals and IT consultants intending to develop and refine their skills in personalized medicine, using solutions based on large datasets from electronic health records or telemonitoring programmes. The book promises “no math, no code” and explains the topics in a style that is optimized for a healthcare audience.


Regression and Other Stories

Author: Andrew Gelman

Publisher: Cambridge University Press

Published: 2021

Total Pages: 551

ISBN-10: 110702398X

A practical approach to using regression and computation to solve real-world problems of estimation, prediction, and causal inference.


Statistical Regression and Classification

Author: Norman Matloff

Publisher: CRC Press

Published: 2017-09-19

Total Pages: 439

ISBN-10: 1351645897

Statistical Regression and Classification: From Linear Models to Machine Learning takes an innovative look at the traditional statistical regression course, presenting a contemporary treatment in line with today's applications and users. The text takes a modern look at regression:
* A thorough treatment of classical linear and generalized linear models, supplemented with introductory material on machine learning methods.
* Since classification is the focus of many contemporary applications, the book covers this topic in detail, especially the multiclass case.
* In view of the voluminous nature of many modern datasets, there is a chapter on Big Data.
* Has special Mathematical and Computational Complements sections at ends of chapters, and exercises are partitioned into Data, Math and Complements problems.
* Instructors can tailor coverage for specific audiences such as majors in Statistics, Computer Science, or Economics.
* More than 75 examples using real data.
The book treats classical regression methods in an innovative, contemporary manner. Though some statistical learning methods are introduced, the primary methodology used is linear and generalized linear parametric models, covering both the Description and Prediction goals of regression methods. The author is just as interested in Description applications of regression, such as measuring the gender wage gap in Silicon Valley, as in forecasting tomorrow's demand for bike rentals. An entire chapter is devoted to measuring such effects, including discussion of Simpson's Paradox, multiple inference, and causation issues. Similarly, there is an entire chapter on parametric model fit, making use of both residual analysis and assessment via nonparametric analysis.
Norman Matloff is a professor of computer science at the University of California, Davis, and was a founder of the Statistics Department at that institution. His current research focus is on recommender systems, and applications of regression methods to small area estimation and bias reduction in observational studies. He is on the editorial boards of the Journal of Statistical Computation and the R Journal. An award-winning teacher, he is the author of The Art of R Programming and Parallel Computation in Data Science: With Examples in R, C++ and CUDA.


Interpretable Machine Learning

Author: Christoph Molnar

Publisher: Lulu.com

Published: 2020

Total Pages: 320

ISBN-10: 0244768528

This book is about making machine learning models and their decisions interpretable. After exploring the concepts of interpretability, you will learn about simple, interpretable models such as decision trees, decision rules and linear regression. Later chapters focus on general model-agnostic methods for interpreting black box models, such as feature importance and accumulated local effects, and on explaining individual predictions with Shapley values and LIME. All interpretation methods are explained in depth and discussed critically. How do they work under the hood? What are their strengths and weaknesses? How can their outputs be interpreted? This book will enable you to select and correctly apply the interpretation method that is most suitable for your machine learning project.


An Introduction to the Advanced Theory and Practice of Nonparametric Econometrics

Author: Jeffrey S. Racine

Publisher: Cambridge University Press

Published: 2019-06-27

Total Pages: 436

ISBN-10: 1108757286

Interest in nonparametric methodology has grown considerably over the past few decades, stemming in part from vast improvements in computer hardware and the availability of new software that allows practitioners to take full advantage of these numerically intensive methods. This book is written for advanced undergraduate students, intermediate graduate students, and faculty, and provides a complete teaching and learning course at a more accessible level of theoretical rigor than Racine's earlier book co-authored with Qi Li, Nonparametric Econometrics: Theory and Practice (2007). The open source R platform for statistical computing and graphics is used throughout in conjunction with the R package np. Recent developments in reproducible research are emphasized throughout, with appendices devoted to helping the reader get up to speed with R, R Markdown, TeX and Git.


Machine Learning and Data Science Blueprints for Finance

Author: Hariom Tatsat

Publisher: O'Reilly Media, Inc.

Published: 2020-10-01

Total Pages: 426

ISBN-10: 1492073008

Over the next few decades, machine learning and data science will transform the finance industry. With this practical book, analysts, traders, researchers, and developers will learn how to build machine learning algorithms crucial to the industry. You'll examine ML concepts and over 20 case studies in supervised, unsupervised, and reinforcement learning, along with natural language processing (NLP). Ideal for professionals working at hedge funds, investment and retail banks, and fintech firms, this book also delves deep into portfolio management, algorithmic trading, derivative pricing, fraud detection, asset price prediction, sentiment analysis, and chatbot development. You'll explore real-life problems faced by practitioners and learn scientifically sound solutions supported by code and examples. This book covers:
• Supervised learning regression-based models for trading strategies, derivative pricing, and portfolio management
• Supervised learning classification-based models for credit default risk prediction, fraud detection, and trading strategies
• Dimensionality reduction techniques with case studies in portfolio management, trading strategy, and yield curve construction
• Algorithms and clustering techniques for finding similar objects, with case studies in trading strategies and portfolio management
• Reinforcement learning models and techniques used for building trading strategies, derivatives hedging, and portfolio management
• NLP techniques using Python libraries such as NLTK and scikit-learn for transforming text into meaningful representations


Regression Analysis and Linear Models

Author: Richard B. Darlington

Publisher: Guilford Publications

Published: 2016-08-22

Total Pages: 689

ISBN-10: 1462527981

Emphasizing conceptual understanding over mathematics, this user-friendly text introduces linear regression analysis to students and researchers across the social, behavioral, consumer, and health sciences. Coverage includes model construction and estimation, quantification and measurement of multivariate and partial associations, statistical control, group comparisons, moderation analysis, mediation and path analysis, and regression diagnostics, among other important topics. Engaging worked-through examples demonstrate each technique, accompanied by helpful advice and cautions. The use of SPSS, SAS, and STATA is emphasized, with an appendix on regression analysis using R. The companion website (www.afhayes.com) provides datasets for the book's examples as well as the RLM macro for SPSS and SAS. Pedagogical Features:
* Chapters include SPSS, SAS, or STATA code pertinent to the analyses described, with each distinctively formatted for easy identification.
* An appendix documents the RLM macro, which facilitates computations for estimating and probing interactions, dominance analysis, heteroscedasticity-consistent standard errors, and linear spline regression, among other analyses.
* Students are guided to practice what they learn in each chapter using datasets provided online.
* Addresses topics not usually covered, such as ways to measure a variable’s importance, coding systems for representing categorical variables, causation, and myths about testing interaction.