Practical Statistics for Data Scientists

Author: Peter Bruce

Publisher: "O'Reilly Media, Inc."

Published: 2017-05-10

Total Pages: 322

ISBN-13: 1491952911

Statistical methods are a key part of data science, yet very few data scientists have any formal statistics training. Courses and books on basic statistics rarely cover the topic from a data science perspective. This practical guide explains how to apply various statistical methods to data science, tells you how to avoid their misuse, and gives you advice on what's important and what's not. Many data science resources incorporate statistical methods but lack a deeper statistical perspective. If you’re familiar with the R programming language, and have some exposure to statistics, this quick reference bridges the gap in an accessible, readable format. With this book, you’ll learn:

• Why exploratory data analysis is a key preliminary step in data science
• How random sampling can reduce bias and yield a higher quality dataset, even with big data
• How the principles of experimental design yield definitive answers to questions
• How to use regression to estimate outcomes and detect anomalies
• Key classification techniques for predicting which categories a record belongs to
• Statistical machine learning methods that “learn” from data
• Unsupervised learning methods for extracting meaning from unlabeled data


Foundations of Data Science

Author: Avrim Blum

Publisher: Cambridge University Press

Published: 2020-01-23

Total Pages: 433

ISBN-13: 1108617360

This book provides an introduction to the mathematical and algorithmic foundations of data science, including machine learning, high-dimensional geometry, and analysis of large networks. Topics include the counterintuitive nature of data in high dimensions, important linear algebraic techniques such as singular value decomposition, the theory of random walks and Markov chains, the fundamentals of and important algorithms for machine learning, algorithms and analysis for clustering, probabilistic models for large networks, representation learning including topic modelling and non-negative matrix factorization, wavelets and compressed sensing. Important probabilistic techniques are developed including the law of large numbers, tail inequalities, analysis of random projections, generalization guarantees in machine learning, and moment methods for analysis of phase transitions in large random graphs. Additionally, important structural and complexity measures are discussed such as matrix norms and VC-dimension. This book is suitable for both undergraduate and graduate courses in the design and analysis of algorithms for data.


Statistics for the Health Sciences

Author: Christine Dancey

Publisher: SAGE

Published: 2012-03-19

Total Pages: 588

ISBN-13: 1446291235

Statistics for the Health Sciences is a highly readable and accessible textbook on understanding statistics for the health sciences, both conceptually and via the SPSS programme. The authors give clear explanations of the concepts underlying statistical analyses and descriptions of how these analyses are applied in health science research without complex maths formulae. The textbook takes students from the basics of research design, hypothesis testing and descriptive statistical techniques through to more advanced inferential statistical tests that health science students are likely to encounter. The strengths and weaknesses of different techniques are critically appraised throughout, and the authors emphasise how they may be used both in research and to inform best practice care in health settings. Exercises and tips throughout the book allow students to practice using SPSS. The companion website provides further practical experience of conducting statistical analyses. Features include:

• multiple choice questions for both student and lecturer use
• full PowerPoint slides for lecturers
• practical exercises using SPSS
• additional practical exercises using SAS and R

This is an essential textbook for students studying beginner and intermediate level statistics across the health sciences.


All of Statistics

Author: Larry Wasserman

Publisher: Springer Science & Business Media

Published: 2013-12-11

Total Pages: 446

ISBN-13: 0387217363

Taken literally, the title "All of Statistics" is an exaggeration. But in spirit, the title is apt, as the book does cover a much broader range of topics than a typical introductory book on mathematical statistics. This book is for people who want to learn probability and statistics quickly. It is suitable for graduate or advanced undergraduate students in computer science, mathematics, statistics, and related disciplines. The book includes modern topics like non-parametric curve estimation, bootstrapping, and classification, topics that are usually relegated to follow-up courses. The reader is presumed to know calculus and a little linear algebra. No previous knowledge of probability and statistics is required. Statistics, data mining, and machine learning are all concerned with collecting and analysing data.


Practical Statistics Simply Explained

Author: Dr. Russell A. Langley

Publisher: Courier Corporation

Published: 2013-04-26

Total Pages: 419

ISBN-13: 0486317277

Primer on how to draw valid conclusions from numerical data using logic and the philosophy of statistics rather than complex formulae. Discusses averages and scatter, investigation design, more. Problems, solutions.


How Not to Be Wrong

Author: Jordan Ellenberg

Publisher: Penguin Press

Published: 2014-05-29

Total Pages: 480

ISBN-13: 1594205221

A brilliant tour of mathematical thought and a guide to becoming a better thinker, How Not to Be Wrong shows that math is not just a long list of rules to be learned and carried out by rote. Math touches everything we do; it's what makes the world make sense. Using the mathematician's methods and hard-won insights, minus the jargon, professor and popular columnist Jordan Ellenberg guides general readers through his ideas with rigor and lively irreverence, infusing everything from election results to baseball to the existence of God and the psychology of slime molds with a heightened sense of clarity and wonder. Armed with the tools of mathematics, we can see the hidden structures beneath the messy and chaotic surface of our daily lives. How Not to Be Wrong shows us how. (Publisher's description.)


Statistics Without Math

Author: William E. Magnusson

Publisher: Sinauer Associates Incorporated

Published: 2004

Total Pages: 136

ISBN-13: 9780878935062

Statistics without Math is not your typical statistics book; nor is it designed to serve as a substitute for conventional statistical texts. Experience with ecology students and researchers has shown that too much mathematical detail diverts attention away from basic logical concepts, resulting in errors in sampling design, data analysis, and comprehension of the ecological literature. Hence, this book starts with real-world observations and explains how statistics may be used as a practical tool to answer questions about them, and to clearly communicate these results. The book targets intermediate-level statistics (given short shrift in most books and courses), and teaches concepts with a minimum of mathematical detail, instead using simple graphs and, where necessary, analogy. This approach, class-tested for many years by the authors, has revolutionized students' ability to understand statistics.


The Statistical Analysis of Experimental Data

Author: John Mandel

Publisher: Courier Corporation

Published: 2012-06-08

Total Pages: 434

ISBN-13: 048613959X

First half of book presents fundamental mathematical definitions, concepts, and facts while remaining half deals with statistics primarily as an interpretive tool. Well-written text, numerous worked examples with step-by-step presentation. Includes 116 tables.