Discovery in Physics

Author: Katharina Morik

Publisher: Walter de Gruyter GmbH & Co KG

Published: 2022-12-31

Total Pages: 364

ISBN-10: 311078596X

Machine learning has been part of artificial intelligence since its beginning. Certainly, only a perfect being could show intelligent behavior without learning. All others, be they humans or machines, need to learn in order to enhance their capabilities. In the 1980s, learning from examples and the modeling of human learning strategies were investigated in concert. The formal statistical basis of many learning methods was put forward later and is still an integral part of machine learning. Neural networks have always been in the toolbox of methods. Integrating all the pre-processing, exploitation of kernel functions, and transformation steps of a machine learning process into the architecture of a deep neural network increased the performance of this model type considerably. Modern machine learning is challenged on the one hand by the amount of data and on the other hand by the demand for real-time inference. This leads to an interest in computing architectures and modern processors. For a long time, machine learning research could take the von Neumann architecture for granted. All algorithms were designed for the classical CPU, and issues of implementation on a particular architecture were ignored. This is no longer possible. The time for investigating machine learning and computational architecture independently is over. Computing architecture has experienced a similarly rapid development, from mainframes and personal computers in the last century to very large compute clusters on the one hand and ubiquitous computing with embedded systems in the Internet of Things on the other. Cyber-physical systems’ sensors produce a huge amount of streaming data that needs to be stored and analyzed, and their actuators need to react in real time. This clearly establishes a close connection with machine learning.
Cyber-physical systems and systems in the Internet of Things consist of diverse components, heterogeneous in both hardware and software. Modern multi-core systems, graphics processors, memory technologies, and hardware-software co-design offer opportunities for better implementations of machine learning models. Machine learning and embedded systems together now form a field of research that tackles leading-edge problems in machine learning, algorithm engineering, and embedded systems. Machine learning today needs to make the resource demands of learning and inference meet the resource constraints of the computer architectures and platforms used. A large variety of algorithms for the same learning method and, moreover, diverse implementations of an algorithm for particular computing architectures optimize learning with respect to resource efficiency while keeping some guarantees of accuracy. The trade-off between decreased energy consumption and an increased error rate, to give just one example, needs to be shown theoretically for both model training and model inference. Pruning and quantization are ways of reducing the resource requirements by either compressing or approximating the model. In addition to memory and energy consumption, timeliness is an important issue, since many embedded systems are integrated into large products that interact with the physical world. If results are delivered too late, they may have become useless. As a result, real-time guarantees are needed for such systems. To utilize the available resources (e.g., processing power, memory, and accelerators) efficiently with respect to response time, energy consumption, and power dissipation, different scheduling algorithms and resource management strategies need to be developed. This book series addresses machine learning under resource constraints as well as the application of the described methods in various domains of science and engineering.
Turning big data into smart data requires many steps of data analysis: methods for extracting and selecting features, filtering and cleaning the data, joining heterogeneous sources, aggregating the data, and learning predictions all need to scale up. The algorithms are challenged on the one hand by high-throughput data and gigantic data sets, as in astrophysics, and on the other hand by high dimensions, as in genetic data. Resource constraints are given by the relation between the demands for processing the data and the capacity of the computing machinery. The resources are runtime, memory, communication, and energy. Novel machine learning algorithms are optimized with regard to minimal resource consumption. Moreover, learned predictions are applied to program executions in order to save resources. The three books have the following subtopics:
Volume 1: Machine Learning under Resource Constraints - Fundamentals
Volume 2: Machine Learning under Resource Constraints - Discovery in Physics
Volume 3: Machine Learning under Resource Constraints - Applications
Volume 2 is about machine learning for knowledge discovery in particle and astroparticle physics. The instruments of these fields, e.g., particle accelerators or telescopes, gather petabytes of data. Here, machine learning is necessary not only to process the vast amounts of data and to detect the relevant examples efficiently, but also as part of the knowledge discovery process itself. The physical knowledge is encoded in simulations that are used to train the machine learning models. At the same time, the interpretation of the learned models serves to expand the physical knowledge. This results in a cycle of theory enhancement supported by machine learning.
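
The description above names pruning and quantization as ways of compressing or approximating a model. As a rough illustration only, and not code taken from the book, the sketch below applies magnitude pruning and symmetric 8-bit quantization to a plain list of weights. The function names, the `keep_fraction` parameter, and the choice of these two particular variants are assumptions made for the example.

```python
def prune_by_magnitude(weights, keep_fraction):
    """Compress: zero out all but the largest-magnitude weights.

    keep_fraction is a hypothetical knob: the share of weights to keep.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    keep = max(1, round(len(weights) * keep_fraction))
    # Indices of the `keep` weights with the largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    top = set(order[:keep])
    return [w if i in top else 0.0 for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Approximate: map floats to integer codes in [-127, 127] plus one scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from integer codes."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    weights = [0.5, -0.02, 0.9, 0.01, -0.7, 0.03]
    pruned = prune_by_magnitude(weights, keep_fraction=0.5)
    codes, scale = quantize_int8(pruned)
    print(pruned, codes, dequantize(codes, scale))
```

In this sketch each stored weight shrinks from a float to a one-byte code, and the rounding error per weight is at most half the scale. That is the kind of resource-versus-accuracy trade-off the description refers to.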


Machine Learning under Resource Constraints - Discovery in Physics

Author: Katharina Morik

Publisher: Walter de Gruyter GmbH & Co KG

Published: 2022-12-31

Total Pages: 406

ISBN-10: 3110786133

The three volumes of Machine Learning under Resource Constraints address novel machine learning algorithms that are challenged by high-throughput data, by high dimensions, or by complex structures of the data. Resource constraints are given by the relation between the demands for processing the data and the capacity of the computing machinery. The resources are runtime, memory, communication, and energy. Hence, modern computer architectures play a significant role. Novel machine learning algorithms are optimized with regard to minimal resource consumption. Moreover, learned predictions are executed on diverse architectures to save resources. The series provides a comprehensive overview of novel approaches to machine learning research that consider resource constraints, as well as the application of the described methods in various domains of science and engineering. Volume 2 covers machine learning for knowledge discovery in particle and astroparticle physics. The instruments of these fields, e.g., particle detectors or telescopes, gather petabytes of data. Here, machine learning is necessary not only to process the vast amounts of data and to detect the relevant examples efficiently, but also as part of the knowledge discovery process itself. The physical knowledge is encoded in simulations that are used to train the machine learning models. At the same time, the interpretation of the learned models serves to expand the physical knowledge. This results in a cycle of theory enhancement supported by machine learning.


Fundamentals

Author: Katharina Morik

Publisher: Walter de Gruyter GmbH & Co KG

Published: 2022-12-31

Total Pages: 506

ISBN-10: 3110785943

Volume 1 of Machine Learning under Resource Constraints establishes the foundations of this new field. It covers all the steps from data collection, summarization, and clustering to the different aspects of resource-aware learning, i.e., hardware, memory, energy, and communication awareness. Several machine learning methods are examined with respect to their resource requirements and to how their scalability can be enhanced on diverse computing architectures, ranging from embedded systems to large computing clusters.


Applications

Author: Katharina Morik

Publisher: Walter de Gruyter GmbH & Co KG

Published: 2022-12-31

Total Pages: 478

ISBN-10: 3110785986

Volume 3 of Machine Learning under Resource Constraints describes how resource-aware machine learning methods and techniques are used to solve real-world problems. The book provides numerous specific application examples. In the areas of health and medicine, it demonstrates how machine learning can improve risk modelling, diagnosis, and treatment selection for diseases. Diverse real-time applications in electronics and steel production as well as milling show that machine-learning-supported quality control during the manufacturing process in a factory can reduce material and energy costs and save testing time. Additional application examples show how machine learning can make traffic, logistics, and smart cities more efficient and sustainable. Finally, mobile communications can benefit substantially from machine learning, for example by uncovering hidden characteristics of the wireless channel.


Physics-Aware Tiny Machine Learning

Author: Swapnil Sayan Saha

Publisher:

Published: 2023

Total Pages: 0

ISBN-13:

Tiny machine learning has enabled Internet of Things platforms to make intelligent inferences for time-critical and remote applications from unstructured data. However, realizing edge artificial intelligence systems that can perform long-term high-level reasoning and obey the underlying system physics, rules, and constraints within the tight platform resource budget is challenging. This dissertation explores how rich, robust, and intelligent inferences can be made on extremely resource-constrained platforms in a platform-aware and automated fashion. Firstly, we introduce a robust training pipeline that handles sampling rate variability, missing data, and misaligned data timestamps through intelligent data augmentation techniques during training time. We use a controlled jitter in window length and add artificial misalignments in data timestamps between sensors, along with masking representations of missing data. Secondly, we introduce TinyNS, a platform-aware neurosymbolic architecture search framework for the automatic co-optimization and deployment of neural operators and physics-based process models. TinyNS exploits fast, gradient-free, and black-box Bayesian optimization to automatically construct the most performant learning-enabled, physics, and context-aware edge artificial intelligence program from a search space containing neural and symbolic operators within the platform resource constraints. To guarantee deployability, TinyNS receives hardware metrics directly from the target hardware during the optimization process. Thirdly, we introduce the concept of neurosymbolic tiny machine learning, where we showcase recipes for defining the physics-aware tiny machine learning program synthesis search space from five neurosymbolic program categories. Neurosymbolic artificial intelligence combines the context awareness and integrity of symbolic techniques with the robustness and performance of machine learning models. 
We develop parsers to automatically write microcontroller code for neurosymbolic programs and showcase several previously unseen TinyML applications. These include onboard physics-aware neural-inertial navigation, on-device human activity recognition, on-chip fall detection, neural-Kalman filtering, and co-optimization of neural and symbolic processes. Finally, we showcase techniques to personalize and adapt tiny machine learning systems to the target domain and application. We illustrate the use of transfer learning, resource-efficient unsupervised template creation and matching, and foundation models as pathways to realize generalizable, domain-aware, and data-efficient edge artificial intelligence systems.
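
The abstract above describes training-time augmentation with a controlled jitter in window length, artificial timestamp misalignment between sensors, and masking of missing data. The sketch below is one possible reading of those three steps, not the dissertation's actual pipeline. The function name `augment_window`, all parameter names and defaults, and the use of a constant `mask_value` to represent missing samples are assumptions made for illustration.

```python
import random


def augment_window(streams, start, window_len, rng,
                   max_len_jitter=4, max_shift=2, drop_prob=0.1, mask_value=0.0):
    """Cut one augmented training window from several sensor streams.

    streams maps a sensor name to a list of samples at the same nominal rate.
    All parameter names and default values here are hypothetical.
    """
    # 1) Controlled jitter in window length (shared by all sensors).
    length = max(1, window_len + rng.randint(-max_len_jitter, max_len_jitter))
    window = {}
    for name, samples in streams.items():
        # 2) Artificial misalignment: shift each sensor's start independently.
        shift = rng.randint(-max_shift, max_shift)
        begin = min(max(0, start + shift), max(0, len(samples) - length))
        segment = samples[begin:begin + length]
        # 3) Missing data: replace randomly dropped samples with a mask value.
        window[name] = [mask_value if rng.random() < drop_prob else x
                        for x in segment]
    return window


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the augmentation is reproducible
    streams = {"accel": list(range(100)), "gyro": list(range(100, 200))}
    print(augment_window(streams, start=40, window_len=16, rng=rng))
```

Setting `max_len_jitter`, `max_shift`, and `drop_prob` to zero turns the augmentation off and returns the plain window, which makes the effect of each step easy to inspect in isolation.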


Machine Learning and Knowledge Discovery in Databases

Author: Massih-Reza Amini

Publisher: Springer Nature

Published: 2023-03-16

Total Pages: 669

ISBN-10: 3031264193

The multi-volume set LNAI 13713 to 13718 constitutes the refereed proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2022, which took place in Grenoble, France, in September 2022. The 236 full papers presented in these proceedings were carefully reviewed and selected from a total of 1060 submissions. In addition, the proceedings include 17 Demo Track contributions. The volumes are organized in topical sections as follows: Part I: Clustering and dimensionality reduction; anomaly detection; interpretability and explainability; ranking and recommender systems; transfer and multitask learning; Part II: Networks and graphs; knowledge graphs; social network analysis; graph neural networks; natural language processing and text mining; conversational systems; Part III: Deep learning; robust and adversarial machine learning; generative models; computer vision; meta-learning, neural architecture search; Part IV: Reinforcement learning; multi-agent reinforcement learning; bandits and online learning; active and semi-supervised learning; private and federated learning; Part V: Supervised learning; probabilistic inference; optimal transport; optimization; quantum, hardware; sustainability; Part VI: Time series; financial machine learning; applications; applications: transportation; demo track.


Large-Scale Machine Learning in the Earth Sciences

Author: Ashok N. Srivastava

Publisher: CRC Press

Published: 2017-08-01

Total Pages: 314

ISBN-10: 1315354462

From the Foreword: "While large-scale machine learning and data mining have greatly impacted a range of commercial applications, their use in the field of Earth sciences is still in the early stages. This book, edited by Ashok Srivastava, Ramakrishna Nemani, and Karsten Steinhaeuser, serves as an outstanding resource for anyone interested in the opportunities and challenges for the machine learning community in analyzing these data sets to answer questions of urgent societal interest... I hope that this book will inspire more computer scientists to focus on environmental applications, and Earth scientists to seek collaborations with researchers in machine learning and data mining to advance the frontiers in Earth sciences." --Vipin Kumar, University of Minnesota
Large-Scale Machine Learning in the Earth Sciences provides researchers and practitioners with a broad overview of some of the key challenges at the intersection of Earth science, computer science, statistics, and related fields. It explores a wide range of topics and provides a compilation of recent research on the application of machine learning in Earth science. Making predictions based on observational data is a theme of the book, which includes chapters on the use of network science to understand and discover teleconnections in extreme climate and weather events, as well as on structured estimation in high dimensions. The use of ensemble machine learning models to combine predictions of global climate models using information from spatial and temporal patterns is also explored. The second part of the book features a discussion of statistical downscaling in climate with state-of-the-art scalable machine learning, as well as an overview of methods to understand and predict the proliferation of biological species due to changes in environmental conditions. The problem of using large-scale machine learning to study the formation of tornadoes is also explored in depth.
The last part of the book covers the use of deep learning algorithms to classify very-high-resolution images, as well as the unmixing of spectral signals in remote sensing images of land cover. In the final chapter, the authors apply long-tail distributions to geoscience resources.


Investigating Explanation-Based Learning

Author: Gerald DeJong

Publisher: Springer Science & Business Media

Published: 2012-12-06

Total Pages: 447

ISBN-10: 1461536022

Explanation-Based Learning (EBL) can generally be viewed as substituting background knowledge for the large training set of exemplars needed by conventional or empirical machine learning systems. The background knowledge is used automatically to construct an explanation of a few training exemplars. The learned concept is generalized directly from this explanation. The first EBL systems of the modern era were Mitchell's LEX2, Silver's LP, and De Jong's KIDNAP natural language system. Two of these systems, Mitchell's and De Jong's, have led to extensive follow-up research in EBL. This book outlines the significant steps in EBL research of the Illinois group under De Jong. This volume describes theoretical research and computer systems that use a broad range of formalisms: schemas, production systems, qualitative reasoning models, non-monotonic logic, situation calculus, and some home-grown ad hoc representations. This has been done consciously to avoid sacrificing the ultimate research significance in favor of the expediency of any particular formalism. The ultimate goal, of course, is to adopt (or devise) the right formalism.