Temporal Data Mining

Author: Theophano Mitsa

Publisher: CRC Press

Published: 2010-03-10

Total Pages: 398

ISBN-13: 1420089773

From basic data mining concepts to state-of-the-art advances, this book covers the theory of the subject as well as its application in a variety of fields. It discusses the incorporation of temporality in databases as well as temporal data representation, similarity computation, data classification, clustering, pattern discovery, and prediction. The book also explores the use of temporal data mining in medicine and biomedical informatics, business and industrial applications, web usage mining, and spatiotemporal data mining. Along with various state-of-the-art algorithms, each chapter includes detailed references and short descriptions of relevant algorithms and techniques described in other references.


High Performance Discovery In Time Series

Author: Dennis Elliott Shasha

Publisher: Springer Science & Business Media

Published: 2013-11-09

Total Pages: 195

ISBN-13: 1475740468

This monograph is a technical survey of concepts and techniques for describing and analyzing large-scale time-series data streams. Some topics covered are algorithms for query by humming, gamma-ray burst detection, pairs trading, and density detection. Included are self-contained descriptions of wavelets, fast Fourier transforms, and sketches as they apply to time-series analysis. Detailed applications are built on a solid scientific basis.


High Performance Discovery In Time Series

Author: Dennis Elliott Shasha

Publisher: Springer Science & Business Media

Published: 2004-06-03

Total Pages: 210

ISBN-13: 9780387008578

Time-series data (data arriving in time order, or a data stream) can be found in fields such as physics, finance, music, networking, and medical instrumentation. Designing fast, scalable algorithms for analyzing single or multiple time series can lead to scientific discoveries, medical diagnoses, and perhaps profits. High Performance Discovery in Time Series presents rapid-discovery techniques for finding portions of time series with many events (e.g., gamma-ray scatterings) and finding closely related time series (e.g., highly correlated price and return histories, or musical melodies). A typical time-series technique may compute a "consensus" time series from a collection of time series and use regression analysis to predict future time points. By contrast, this book aims at efficient discovery in time series, rather than prediction, and its novelty lies in its algorithmic contributions and its simple, practical algorithms and case studies. It presumes familiarity with only basic calculus and some linear algebra.

Topics and Features:

* Presents efficient algorithms for discovering unusual bursts of activity in large time-series databases
* Describes the mathematics and algorithms for finding correlation relationships between thousands or millions of time series across fixed or moving windows
* Demonstrates strong, relevant applications built on a solid scientific basis
* Outlines how readers can adapt the techniques for their own needs and goals
* Describes algorithms for query by humming, gamma-ray burst detection, pairs trading, and density detection
* Offers self-contained descriptions of wavelets, fast Fourier transforms, and sketches as they apply to time-series analysis

This new monograph provides a technical survey of concepts and techniques for describing and analyzing large-scale time-series data streams. It offers essential coverage of the topic for computer scientists, physicists, medical researchers, financial mathematicians, musicologists, and researchers and professionals who must analyze massive time series. In addition, it can serve as an ideal text/reference for graduate students in many data-rich disciplines.


Time Series Databases

Author: Ted Dunning

Publisher: O'Reilly Media

Published: 2014

Total Pages: 0

ISBN-13: 9781491914724

Time series data is of growing importance, especially with the rapid expansion of the Internet of Things. This concise guide shows you effective ways to collect, persist, and access large-scale time series data for analysis. You'll explore the theory behind time series databases and learn practical methods for implementing them. Authors Ted Dunning and Ellen Friedman provide a detailed examination of open source tools such as OpenTSDB and new modifications that greatly speed up data ingestion.

You'll learn:

* A variety of time series use cases
* The advantages of NoSQL databases for large-scale time series data
* NoSQL table design for high-performance time series databases
* The benefits and limitations of OpenTSDB
* How to access data in OpenTSDB using R, Go, and Ruby
* How time series databases contribute to practical machine learning projects
* How to handle the added complexity of geo-temporal data

For advice on analyzing time series data, check out Practical Machine Learning: A New Look at Anomaly Detection, also from Ted Dunning and Ellen Friedman.


Mining of Massive Datasets

Author: Jure Leskovec

Publisher: Cambridge University Press

Published: 2014-11-13

Total Pages: 480

ISBN-13: 1107077230

Now in its second edition, this book focuses on practical algorithms for mining data from even the largest datasets.


Proceedings of the Seventh SIAM International Conference on Data Mining

Author: Chid Apte

Publisher: Proceedings in Applied Mathema

Published: 2007

Total Pages: 674

ISBN-13:

The Seventh SIAM International Conference on Data Mining (SDM 2007) continues a series of conferences whose focus is the theory and application of data mining to complex datasets in science, engineering, biomedicine, and the social sciences. These datasets challenge our abilities to analyze them because they are large and often noisy. Sophisticated, high-performance, and principled analysis techniques and algorithms, based on sound statistical foundations, are required. Visualization is often critically important, and tuning for performance is a significant challenge. The appropriate levels of abstraction, which would allow end-users to exploit sophisticated techniques and clearly understand both the constraints and the interpretation of results, are still something of an open question.