Data Management: a gentle introduction

Author: Bas van Gils

Publisher: Van Haren

Published: 2020-03-03

Total Pages: 355

ISBN-13: 9401805555

The overall objective of this book is to show that data management is an exciting and valuable capability that is worth time and effort. More specifically, it aims to achieve the following goals: 1. To give a “gentle” introduction to the field of DM by explaining and illustrating its core concepts, based on a mix of theory, practical frameworks such as TOGAF, ArchiMate, and DMBOK, as well as results from real-world assignments. 2. To offer guidance on how to build an effective DM capability in an organization. This is illustrated by various use cases, linked to the previously mentioned theoretical exploration as well as the stories of practitioners in the field. The primary target group is busy professionals who “are actively involved with managing data”. The book is also aimed at (Bachelor’s/Master’s) students with an interest in data management. The book is industry-agnostic and should be applicable in different industries such as government, finance, and telecommunications. Typical roles for which this book is intended: data governance office/council, data owners, data stewards, people involved with data governance (data governance board), enterprise architects, data architects, process managers, business analysts, and IT analysts. The book is divided into three main parts: theory, practice, and closing remarks. Furthermore, the chapters are as short and to the point as possible and make a clear distinction between the main text and the examples. A reader who is already familiar with the topic of a chapter can easily skip it and move on to the next.


Data Management: a gentle introduction

Author: Bas van Gils

Publisher: Van Haren

Published: 2020-03-03

Total Pages: 301

ISBN-13: 9401805520

The overall objective of this book is to show that data management is an exciting and valuable capability that is worth time and effort. More specifically, it aims to achieve the following goals: 1. To give a “gentle” introduction to the field of DM by explaining and illustrating its core concepts, based on a mix of theory, practical frameworks such as TOGAF, ArchiMate, and DMBOK, as well as results from real-world assignments. 2. To offer guidance on how to build an effective DM capability in an organization. This is illustrated by various use cases, linked to the previously mentioned theoretical exploration as well as the stories of practitioners in the field. The primary target group is busy professionals who “are actively involved with managing data”. The book is also aimed at (Bachelor’s/Master’s) students with an interest in data management. The book is industry-agnostic and should be applicable in different industries such as government, finance, and telecommunications. Typical roles for which this book is intended: data governance office/council, data owners, data stewards, people involved with data governance (data governance board), enterprise architects, data architects, process managers, business analysts, and IT analysts. The book is divided into three main parts: theory, practice, and closing remarks. Furthermore, the chapters are as short and to the point as possible and make a clear distinction between the main text and the examples. A reader who is already familiar with the topic of a chapter can easily skip it and move on to the next.


Missing Data

Author: Paul D. Allison

Publisher: SAGE Publications

Published: 2024-05-08

Total Pages: 100

ISBN-13: 1071962523

Sooner or later, anyone who does statistical analysis runs into problems with missing data, in which information for some variables is missing for some cases. Why is this a problem? Because most statistical methods presume that every case has information on all the variables to be included in the analysis. Using numerous examples and practical tips, this book offers a nontechnical explanation of the standard methods for missing data (such as listwise or casewise deletion) as well as two newer (and better) methods, maximum likelihood and multiple imputation. Anyone who has been relying on ad hoc methods that are statistically inefficient or biased will find this book a welcome and accessible solution to their problems with handling missing data.


A Gentle Introduction to Effective Computing in Quantitative Research

Author: Harry J. Paarsch

Publisher: MIT Press

Published: 2016-05-06

Total Pages: 777

ISBN-13: 0262333996

A practical guide to using modern software effectively in quantitative research in the social and natural sciences. This book offers a practical guide to the computational methods at the heart of most modern quantitative research. It will be essential reading for research assistants needing hands-on experience; students entering PhD programs in business, economics, and other social or natural sciences; and those seeking quantitative jobs in industry. No background in computer science is assumed; a learner need only have a computer with access to the Internet. Using the example as its principal pedagogical device, the book offers tried-and-true prototypes that illustrate many important computational tasks required in quantitative research. The best way to use the book is to read it at the computer keyboard and learn by doing. The book begins by introducing basic skills: how to use the operating system, how to organize data, and how to complete simple programming tasks. For its demonstrations, the book uses a UNIX-based operating system and a set of free software tools: the scripting language Python for programming tasks; the database management system SQLite; and the freely available R for statistical computing and graphics. The book goes on to describe particular tasks: analyzing data, implementing commonly used numerical and simulation methods, and creating extensions to Python to reduce cycle time. Finally, the book describes the use of LaTeX, a document markup language and preparation system.


Executing Data Quality Projects

Author: Danette McGilvray

Publisher: Academic Press

Published: 2021-05-27

Total Pages: 378

ISBN-13: 0128180161

Executing Data Quality Projects, Second Edition presents a structured yet flexible approach for creating, improving, sustaining and managing the quality of data and information within any organization. Studies show that data quality problems are costing businesses billions of dollars each year, with poor data linked to waste and inefficiency, damaged credibility among customers and suppliers, and an organizational inability to make sound decisions. Help is here! This book describes a proven Ten Steps approach that combines a conceptual framework for understanding information quality with techniques, tools, and instructions for practically putting the approach to work, with the end result of high-quality trusted data and information, so critical to today's data-dependent organizations. The Ten Steps approach applies to all types of data and all types of organizations: for-profit in any industry, non-profit, government, education, healthcare, science, research, and medicine. This book includes numerous templates, detailed examples, and practical advice for executing every step. At the same time, readers are advised on how to select relevant steps and apply them in different ways to best address the many situations they will face. The layout allows for quick reference with an easy-to-use format highlighting key concepts and definitions, important checkpoints, communication activities, best practices, and warnings. The experiences of actual clients and users of the Ten Steps provide real examples of outputs for the steps plus highlighted, sidebar case studies called Ten Steps in Action.
This book uses projects as the vehicle for data quality work and uses the word broadly to include: 1) focused data quality improvement projects, such as improving data used in supply chain management, 2) data quality activities in other projects such as building new applications and migrating data from legacy systems, integrating data because of mergers and acquisitions, or untangling data due to organizational breakups, and 3) ad hoc use of data quality steps, techniques, or activities in the course of daily work. The Ten Steps approach can also be used to enrich an organization's standard SDLC (whether sequential or Agile) and it complements general improvement methodologies such as Six Sigma or Lean. No two data quality projects are the same, but the flexible nature of the Ten Steps means the methodology can be applied to all. The new Second Edition highlights topics such as artificial intelligence and machine learning, Internet of Things, security and privacy, analytics, legal and regulatory requirements, data science, big data, data lakes, and cloud computing, among others, to show their dependence on data and information and why data quality is more relevant and critical now than ever before.
- Includes concrete instructions, numerous templates, and practical advice for executing every step of the Ten Steps approach
- Contains real examples from around the world, gleaned from the author's consulting practice and from those who implemented based on her training courses and the earlier edition of the book
- Allows for quick reference with an easy-to-use format highlighting key concepts and definitions, important checkpoints, communication activities, and best practices
- A companion Web site includes links to numerous data quality resources, including many of the templates featured in the text, quick summaries of key ideas from the Ten Steps methodology, and other tools and information that are available online


SAS Applications Programming

Author: Frank C. DiIorio

Publisher: Cengage Learning

Published: 1991

Total Pages: 706

ISBN-13:

Intended for use as a core text or to supplement any introductory or intermediate level statistics course, this book presents the basics of the SAS system in a well-paced, structured, non-threatening manner. It provides an introduction to the SAS system for data management, analysis, and reporting using the subset of the language ideally suited for beginning students, while at the same time serving as a useful reference for intermediate or advanced users. Students learn the language's power and flexibility with many real-world examples drawn from the author's industry experience. Beginning with an overview of the system, this text shows students how to read data, perform simple analyses, and produce simple reports. More complex topics are carefully introduced, guiding students to manage multiple datasets and write custom reports. More advanced statistical techniques such as correlation, regression, and analysis of variance are presented in later chapters.


A General Introduction to Data Analytics

Author: João Moreira

Publisher: John Wiley & Sons

Published: 2018-07-18

Total Pages: 352

ISBN-13: 1119296242

A guide to the principles and methods of data analysis that does not require knowledge of statistics or programming. A General Introduction to Data Analytics is an essential guide to understanding and using data analytics. This book is written using easy-to-understand terms and does not require familiarity with statistics or programming. The authors, noted experts in the field, explain the intuition behind the basic data analytics techniques. The text also contains exercises and illustrative examples. Designed to be easily accessible to non-experts, the book motivates the need to analyze data. It explains how to visualize and summarize data, and how to find natural groups and frequent patterns in a dataset. The book also explores predictive tasks, whether classification or regression. Finally, the book discusses popular data analytic applications, like mining the web, information retrieval, social network analysis, working with text, and recommender systems. The learning resources offer:
- A guide to the reasoning behind data mining techniques
- A unique illustrative example that extends throughout all the chapters
- Exercises at the end of each chapter and larger projects at the end of each of the text’s two main parts
Together with these learning resources, the book can be used as a guide for a 13-week course, one chapter per course topic. The book was written in a format that allows non-mathematicians, non-statisticians and non-computer scientists interested in an introduction to data science to understand the main data analytics concepts. A General Introduction to Data Analytics is a basic guide to data analytics written in highly accessible terms.


Data Management at Scale

Author: Piethein Strengholt

Publisher: O'Reilly Media, Inc.

Published: 2020-07-29

Total Pages: 404

ISBN-13: 1492054739

As data management and integration continue to evolve rapidly, storing all your data in one place, such as a data warehouse, is no longer scalable. In the very near future, data will need to be distributed and available for several technological solutions. With this practical book, you’ll learn how to migrate your enterprise from a complex and tightly coupled data landscape to a more flexible architecture ready for the modern world of data consumption. Executives, data architects, analytics teams, and compliance and governance staff will learn how to build a modern scalable data landscape using the Scaled Architecture, which you can introduce incrementally without a large upfront investment. Author Piethein Strengholt provides blueprints, principles, observations, best practices, and patterns to get you up to speed.
- Examine data management trends, including technological developments, regulatory requirements, and privacy concerns
- Go deep into the Scaled Architecture and learn how the pieces fit together
- Explore data governance and data security, master data management, self-service data marketplaces, and the importance of metadata


A Survivor's Guide to R

Author: Kurt Taylor Gaubatz

Publisher: SAGE Publications

Published: 2014-04-22

Total Pages: 489

ISBN-13: 1483346889

Focusing on developing practical R skills rather than teaching pure statistics, Dr. Kurt Taylor Gaubatz’s A Survivor’s Guide to R provides a gentle yet thorough introduction to R. The book is structured around critical R tasks, and focuses on applied knowledge, rather than abstract concepts. Gaubatz’s easy-to-read approach helps students with little or no background in statistics or programming to develop real-world R skills through straightforward coverage of R objects and functions. Focusing on real-world data, the challenges of dataset construction, and the use of R’s powerful graphing tools, the guide is written in an accessible, sympathetic, even humorous style that ensures students acquire functional R skills they can use in their own projects and carry into their work beyond the classroom.


Metadata Management for Information Control and Business Success

Author: Guy V. Tozer

Publisher: Artech House Publishers

Published: 1999

Total Pages: 360

ISBN-13:

By describing how to establish metadata management within an organization, this book provides examples of data structure architectures and reviews issues associated with metadata management in relation to the Internet and data warehousing. It helps readers control the factors that make data usable throughout an organization and manage data so that it becomes a valuable corporate asset. The book examines real-world business departments that can benefit from this approach and ways in which sets ...