Real-Time Data Analytics for Large Scale Sensor Data

Author: Himansu Das

Publisher: Academic Press

Published: 2019-08-31

Total Pages: 298

ISBN-13: 0128182423

Real-Time Data Analytics for Large-Scale Sensor Data covers the theory and applications of hardware platforms and architectures, the development of software methods, techniques and tools, applications, governance and adoption strategies for the use of massive sensor data in real-time data analytics. It presents the leading-edge research in the field and identifies future challenges in this fledgling research area. The book captures the essence of real-time IoT-based solutions that require a multidisciplinary approach for catering to on-the-fly processing, including methods for high-performance stream processing, adaptive streaming adjustment, uncertainty handling, latency handling, and more. It examines IoT applications, the design of real-time intelligent systems, and how to manage the rapid growth of the large volume of sensor data; discusses intelligent management systems for applications such as healthcare, robotics and environment modeling; and provides a focused approach towards the design and implementation of real-time intelligent systems for the management of sensor data in large-scale environments.


Demand-based Data Stream Gathering, Processing, and Transmission

Author: Jonas Traub

Publisher: BoD – Books on Demand

Published: 2021-04-09

Total Pages: 208

ISBN-13: 3752671254

This book presents an end-to-end architecture for demand-based data stream gathering, processing, and transmission. The Internet of Things (IoT) consists of billions of devices which form a cloud of network-connected sensor nodes. These sensor nodes supply a vast number of data streams with massive amounts of sensor data. Real-time sensor data enables diverse applications including traffic-aware navigation, machine monitoring, and home automation. Current stream processing pipelines are demand-oblivious, which means that they gather, transmit, and process as much data as possible. In contrast, a demand-based processing pipeline uses requirement specifications of data consumers, such as failure tolerances and latency limitations, to save resources. Our solution unifies the way applications express their data demands, i.e., their requirements with respect to their input streams. This unification allows for multiplexing the data demands of all concurrently running applications. On sensor nodes, we schedule sensor reads based on the data demands of all applications, which saves up to 87% in sensor reads and data transfers in our experiments with real-world sensor data. Our demand-based control layer optimizes the data acquisition from thousands of sensors. We introduce time coherence as a fundamental data characteristic. Time coherence is the delay between the first and the last sensor read that contribute values to a tuple. A large-scale parameter exploration shows that our solution scales to large numbers of sensors and operates reliably under varying latency and coherence constraints. On stream analysis systems, we tackle the problem of efficient window aggregation. We contribute a general aggregation technique, which adapts to four key workload characteristics: stream (dis)order, aggregation types, window types, and window measures.
Our experiments show that our solution outperforms alternative solutions by an order of magnitude in throughput, which prevents expensive system scale-out. We further derive data demands from visualization needs of applications and make these data demands available to streaming systems such as Apache Flink. This enables streaming systems to pre-process data with respect to changing visualization needs. Experiments show that our solution reliably prevents overloads when data rates increase.
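The time coherence measure defined in the description above can be illustrated with a minimal Python sketch. It only restates the definition (the delay between the first and the last sensor read that contribute values to a tuple) and is not the book's implementation. The names SensorRead, time_coherence_ms, and meets_constraint, the sample readings, and the use of millisecond timestamps are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class SensorRead:
    """One sensor reading (hypothetical structure, for illustration only)."""
    sensor_id: str
    timestamp_ms: int  # assumption: read time in milliseconds
    value: float


def time_coherence_ms(reads: Sequence[SensorRead]) -> int:
    """Delay between the first and the last read contributing to one tuple."""
    if not reads:
        raise ValueError("a tuple needs at least one sensor read")
    timestamps = [r.timestamp_ms for r in reads]
    return max(timestamps) - min(timestamps)


def meets_constraint(reads: Sequence[SensorRead], max_coherence_ms: int) -> bool:
    """Check a tuple against an application's coherence limit (assumed form)."""
    return time_coherence_ms(reads) <= max_coherence_ms


# Three reads that are combined into a single output tuple.
tuple_reads = [
    SensorRead("temperature", 1_000, 21.5),
    SensorRead("humidity", 1_040, 0.43),
    SensorRead("pressure", 1_015, 1013.2),
]

print(time_coherence_ms(tuple_reads))     # 40
print(meets_constraint(tuple_reads, 50))  # True
```

A smaller value means the values in a tuple were observed closer together in time, which is why an application might state an upper bound on it as part of its data demand.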


Big Data Analytics for Sensor-Network Collected Intelligence

Author: Hui-Huang Hsu

Publisher: Morgan Kaufmann

Published: 2017-02-02

Total Pages: 328

ISBN-13: 012809625X

Big Data Analytics for Sensor-Network Collected Intelligence explores state-of-the-art methods for using advanced ICT to perform intelligent analysis on sensor-collected data. The book shows how to develop systems that automatically detect natural and human-made events, how to examine people’s behaviors, and how to unobtrusively provide better services. It begins by exploring big data architecture and platforms, covering the cloud computing infrastructure and how data is stored and visualized. The book then explores how big data is processed and managed, the key security and privacy issues involved, and the approaches used to ensure data quality. In addition, readers will find a thorough examination of big data analytics, analyzing statistical methods for data analytics and data mining, along with a detailed look at big data intelligence, ubiquitous and mobile computing, and designing intelligent systems based on context and situation. Indexing: The books of this series are submitted to EI-Compendex and SCOPUS. The volume contains contributions from noted scholars in computer science and electrical engineering from around the globe, provides a broad overview of recent developments in sensor-collected intelligence, and is edited by a team of leading thinkers in big data analytics.


Tributary

Author: Yadid Ayzenberg

Publisher:

Published: 2016

Total Pages: 173

ISBN-13:

State-of-the-art technology has made it possible to monitor various physiological signals for prolonged periods. Using wearable sensors, individuals can be monitored; sensor data can be collected and stored in digital format, transmitted to remote locations, and analyzed at later times. This technology may open the door to a multitude of exciting and innovative applications. We could learn the effects of the environment and of our day-to-day choices on our physiology. Does the number of hours we sleep affect our mood during the following day? Is our performance impacted by the times we schedule our recreational activities? Does physical activity affect our quality of sleep? Do these choices have an impact on chronic conditions? The proliferation of smartphones and wearable sensors is creating very large data sets that may contain useful information. Gartner claims that the Internet of Things installed base will grow to 26 billion units by 2020. However, the magnitude of generated data creates new challenges as well. Processing and analyzing these large data sets in an efficient manner requires advanced computational tools. The challenge is that as more data are collected, they become more computationally expensive to process, requiring novel algorithmic techniques and parallel architectures. Traditional analysis techniques do not scale adequately, and in many cases researchers are required to create customized environments. This thesis explores and extends the affordances of warehouse-scale computing for interactivity and pliability of large-scale time-series data sets. In the first part of the thesis, I describe a theoretical framework for distributed processing of time-series data that is implementation-invariant and may be implemented on an existing distributed computation infrastructure.
Next, I present a detailed architecture and implementation of the theoretical framework, which was deployed on several clusters, as well as an in-depth analysis of the user-interface design considerations and the user experience design process. In the second part of the thesis, I present a system evaluation that consists of two parts. The first part is a quantitative characterization of the system performance in a variety of scenarios that included different dataset and cluster sizes. The second part contains the results of a qualitative user study: researchers were asked to use the system to analyze data that they had collected in their own studies and to participate in an ethnographic study on their experience. This study reveals that distributed computing holds great potential for accelerating scientific research utilizing large-scale sensor data sets, providing new ways to see patterns in large sets of data, and much speedier analyses.


Improving Computational and Human Efficiency in Large-scale Data Analytics

Author: Kexin Rong

Publisher:

Published: 2021

Total Pages:

ISBN-13:

Network telemetry, sensor readings, and other machine-generated data are growing exponentially in volume. Meanwhile, the computational resources available for processing this data -- as well as analysts' ability to manually inspect it -- remain limited. As the gap continues to widen, keeping up with the data volumes is challenging for analytic systems and analysts alike. This dissertation introduces systems and algorithms that focus the limited computational resources and analysts' time in modern data analytics on a subset of relevant data. The dissertation comprises two parts that focus on improving the computational and human efficiency in data analytics, respectively. In the first part of this dissertation, we improve the computational efficiency of analytics by combining precomputation and sampling techniques to select a subset of data that contributes the most to query results. We demonstrate this concept with two approximate query processing systems. PS3 approximates aggregate SQL queries with weighted, partition-level samples based on precomputed summary statistics, whereas HBE approximates kernel density estimations using precomputed hash indexes as smart data samplers. Our evaluation shows that both systems outperform uniform sampling, the best-known result for these queries, with practical precomputation overheads. PS3 enables a 3 to 70x speedup under the same accuracy as uniform partition sampling, with less than 100 KB of storage overhead per partition; HBE offers up to a 10x improvement in query time compared to the second-best method with comparable precomputation time. In the second part of this dissertation, we improve the human efficiency of analytics by automatically identifying and summarizing unusual behaviors in large data streams to reduce the burden of manual inspections. We demonstrate this approach through two monitoring applications for machine-generated data.
First, ASAP is a visualization operator that automatically smooths time series in monitoring dashboards to highlight large-scale trends and deviations. Compared to presenting the raw time series, ASAP decreases users' response time for identifying anomalies by up to 44.3% in our user study. We subsequently describe FASTer, an end-to-end earthquake detection system that we built in collaboration with seismologists at Stanford University. By pushing down domain-specific filtering and aggregation into the analytics workflows, FASTer significantly improves the speed and quality of earthquake candidate generation, scaling the analysis from three months of data from a single sensor to ten years of data over a network of sensors. The contributions of this dissertation have had real-world impact. ASAP has been incorporated into open-source tools such as Graphite, TimescaleDB Toolkit, and NPM module downsample. ASAP has also directly inspired an auto smoother for the real-time dashboards at the monitoring service Datadog. FASTer is open-source and has been used by researchers worldwide. Its improved scalability has enabled the discovery of hundreds of new earthquake events near the Diablo Canyon nuclear power plant in California.


Managing and Mining Sensor Data

Author: Charu C. Aggarwal

Publisher: Springer Science & Business Media

Published: 2013-01-15

Total Pages: 547

ISBN-13: 1461463092

Advances in hardware technology have led to an ability to collect data with the use of a variety of sensor technologies. In particular, sensor nodes have become cheaper and more efficient, and have even been integrated into everyday devices such as mobile phones. This has led to a much larger scale of applicability and mining of sensor data sets. The human-centric aspect of sensor data has created tremendous opportunities in integrating social aspects of sensor data collection into the mining process. Managing and Mining Sensor Data is a contributed volume by prominent leaders in this field, targeting advanced-level students in computer science as a secondary textbook or reference. Practitioners and researchers working in this field will also find this book useful.


Data Science and Big Data Computing

Author: Zaigham Mahmood

Publisher: Springer

Published: 2016-07-05

Total Pages: 332

ISBN-13: 3319318616

This illuminating text/reference surveys the state of the art in data science, and provides practical guidance on big data analytics. Expert perspectives are provided by authoritative researchers and practitioners from around the world, discussing research developments and emerging trends, presenting case studies on helpful frameworks and innovative methodologies, and suggesting best practices for efficient and effective data analytics. Features: reviews a framework for fast data applications, a technique for complex event processing, and agglomerative approaches for the partitioning of networks; introduces a unified approach to data modeling and management, and a distributed computing perspective on interfacing physical and cyber worlds; presents techniques for machine learning for big data, and identifying duplicate records in data repositories; examines enabling technologies and tools for data mining; proposes frameworks for data extraction, and adaptive decision making and social media analysis.


Real-Time Analytics

Author: Byron Ellis

Publisher: John Wiley & Sons

Published: 2014-06-23

Total Pages: 432

ISBN-13: 1118838025

Construct a robust end-to-end solution for analyzing and visualizing streaming data. Real-time analytics is the hottest topic in data analytics today. In Real-Time Analytics: Techniques to Analyze and Visualize Streaming Data, expert Byron Ellis teaches data analysts technologies to build an effective real-time analytics platform. This platform can then be used to make sense of the constantly changing data that is beginning to outpace traditional batch-based analysis platforms. The author is among the very few leading experts in the field. He has a prestigious background in research, development, analytics, real-time visualization, and Big Data streaming and is uniquely qualified to help you explore this revolutionary field. Moving from a description of the overall analytic architecture of real-time analytics to using specific tools to obtain targeted results, Real-Time Analytics leverages open source and modern commercial tools to construct robust, efficient systems that can provide real-time analysis in a cost-effective manner. The book includes a deep discussion of streaming data systems and architectures; instructions for analyzing, storing, and delivering streaming data; tips on aggregating data and working with sets; and information on data warehousing options and techniques. Real-Time Analytics includes in-depth case studies for website analytics, Big Data, visualizing streaming and mobile data, and mining and visualizing operational data flows. The book's "recipe" layout lets readers quickly learn and implement different techniques. All of the code examples presented in the book, along with their related data sets, are available on the companion website.


Smart Sensor Networks

Author: Umang Singh

Publisher: Springer Nature

Published: 2021-09-01

Total Pages: 233

ISBN-13: 3030772144

This book provides IT professionals, educators, researchers, and students with a compendium of knowledge on smart sensors and devices, types of sensors, data analysis and monitoring with the help of smart sensors, decision making, the impact of machine learning algorithms, and artificial intelligence-related methodologies for data analysis and understanding of smart applications in networks. Smart sensor networks play an important role in the establishment of network devices which can easily interact with the physical world through a plethora of sensors for collecting and monitoring the surrounding context and environment information. Apart from military applications, smart sensor networks are used in many civilian applications nowadays, and there is a need to manage a high volume of demands in related applications. This book comprises 9 chapters and presents valuable insight from original research and review articles on the latest achievements that contribute to the field of smart sensor networks and their usage in real-life applications like smart city, smart home, e-healthcare, smart social sensing networks, etc. Chapters illustrate technological advances and trends, examine research opportunities, highlight best practices and standards, and discuss applications and adoption. Some chapters also provide holistic and multiple perspectives while examining the impact of smart sensor networks and the role of data analytics, data sharing, and its control along with future prospects.


Smart Grid using Big Data Analytics

Author: Robert C. Qiu

Publisher: John Wiley & Sons

Published: 2017-01-23

Total Pages: 630

ISBN-13: 1118716809

This book is aimed at students in communications and signal processing who want to extend their skills in the energy area. It describes power systems and why these backgrounds are so useful to the smart grid, wireless communications being very different from traditional wireline communications.