This introduction to the MDL Principle provides a reference accessible to graduate students and researchers in statistics, pattern classification, machine learning, and data mining, to philosophers interested in the foundations of statistics, and to researchers in other applied sciences that involve model selection.
A source book for state-of-the-art MDL, including an extensive tutorial and recent theoretical advances and practical applications in fields ranging from bioinformatics to psychology.
Information Theory and Statistics: A Tutorial is concerned with applications of information theory concepts in statistics, in the finite alphabet setting. The topics covered include large deviations, hypothesis testing, maximum likelihood estimation in exponential families, analysis of contingency tables, and iterative algorithms with an "information geometry" background. Also, an introduction is provided to the theory of universal coding, and to statistical inference via the minimum description length principle motivated by that theory. The tutorial does not assume the reader has an in-depth knowledge of Information Theory or statistics. As such, Information Theory and Statistics: A Tutorial, is an excellent introductory text to this highly-important topic in mathematics, computer science and electrical engineering. It provides both students and researchers with an invaluable resource to quickly get up to speed in the field.
This book describes how model selection and statistical inference can be founded on the shortest code length for the observed data, called the stochastic complexity. This generalization of the algorithmic complexity not only offers an objective view of statistics, where no prejudiced assumptions of 'true' data generating distributions are needed, but it also in one stroke leads to calculable expressions in a range of situations of practical interest and links very closely with mainstream statistical theory. The search for the smallest stochastic complexity extends the classical maximum likelihood technique to a new global one, in which models can be compared regardless of their numbers of parameters. The result is a natural and far reaching extension of the traditional theory of estimation, where the Fisher information is replaced by the stochastic complexity and the Cramer-Rao inequality by an extension of the Shannon-Kullback inequality. Ideas are illustrated with applications from parametric and non-parametric regression, density and spectrum estimation, time series, hypothesis testing, contingency tables, and data compression.
No statistical model is "true" or "false," "right" or "wrong"; the models just have varying performance, which can be assessed. The main theme in this book is to teach modeling based on the principle that the objective is to extract the information from data that can be learned with suggested classes of probability models. The intuitive and fundamental concepts of complexity, learnable information, and noise are formalized, which provides a firm information theoretic foundation for statistical modeling. Although the prerequisites include only basic probability calculus and statistics, a moderate level of mathematical proficiency would be beneficial.
This open access book constitutes the proceedings of the 18th International Conference on Intelligent Data Analysis, IDA 2020, held in Konstanz, Germany, in April 2020. The 45 full papers presented in this volume were carefully reviewed and selected from 114 submissions. Advancing Intelligent Data Analysis requires novel, potentially game-changing ideas. IDA’s mission is to promote ideas over performance: a solid motivation can be as convincing as exhaustive empirical evaluation.
Introduces machine learning and its algorithmic paradigms, explaining the principles behind automated learning approaches and the considerations underlying their usage.
Briefly, we review the basic elements of computability theory and prob ability theory that are required. Finally, in order to place the subject in the appropriate historical and conceptual context we trace the main roots of Kolmogorov complexity. This way the stage is set for Chapters 2 and 3, where we introduce the notion of optimal effective descriptions of objects. The length of such a description (or the number of bits of information in it) is its Kolmogorov complexity. We treat all aspects of the elementary mathematical theory of Kolmogorov complexity. This body of knowledge may be called algo rithmic complexity theory. The theory of Martin-Lof tests for random ness of finite objects and infinite sequences is inextricably intertwined with the theory of Kolmogorov complexity and is completely treated. We also investigate the statistical properties of finite strings with high Kolmogorov complexity. Both of these topics are eminently useful in the applications part of the book. We also investigate the recursion theoretic properties of Kolmogorov complexity (relations with Godel's incompleteness result), and the Kolmogorov complexity version of infor mation theory, which we may call "algorithmic information theory" or "absolute information theory. " The treatment of algorithmic probability theory in Chapter 4 presup poses Sections 1. 6, 1. 11. 2, and Chapter 3 (at least Sections 3. 1 through 3. 4).
This book constitutes the post-conference proceedings of the 5th International Conference on Machine Learning, Optimization, and Data Science, LOD 2019, held in Siena, Italy, in September 2019. The 54 full papers presented were carefully reviewed and selected from 158 submissions. The papers cover topics in the field of machine learning, artificial intelligence, reinforcement learning, computational optimization and data science presenting a substantial array of ideas, technologies, algorithms, methods and applications.