This book covers important aspects of fundamental research in data provenance and data management (DPDM), including provenance representation and querying, as well as practical applications in such domains as clinical trials, bioinformatics and radio astronomy.
This book constitutes the revised selected papers of the 4th International Provenance and Annotation Workshop, IPAW 2012, held in Santa Barbara, CA, USA, in June 2012. The 14 full papers, 4 demo papers and 12 poster papers presented were carefully reviewed and selected from 49 submissions. The papers are organized in topical sections on documents databases; the Web; reconstruction; science applications; and demonstrations.
This open access book summarises the latest developments on data management in the EU H2020 ENVRIplus project, which brought together more than 20 environmental and Earth science research infrastructures into a single community. It provides readers with a systematic overview of the common challenges faced by research infrastructures and how a ‘reference model guided’ engineering approach can be used to achieve greater interoperability among such infrastructures in the environmental and Earth sciences. The 20 contributions in this book are structured in 5 parts on the design, development, deployment, operation and use of research infrastructures. Part one provides an overview of the state of the art of research infrastructure and relevant e-Infrastructure technologies; part two discusses the reference model guided engineering approach; part three presents the software and tools developed for common data management challenges; part four demonstrates the software via several use cases; and part five discusses sustainability and future directions.
This encyclopedia will be an essential resource for our times, reflecting the fact that we are currently living in an expanding data-driven world. Technological advancements and other related trends are contributing to the production of an astoundingly large and exponentially increasing collection of data and information, referred to in popular vernacular as “Big Data.” Social media and crowdsourcing platforms and various applications (“apps”) are producing reams of information from the instantaneous transactions and input of millions and millions of people around the globe. The Internet-of-Things (IoT), which is expected to comprise tens of billions of objects by the end of this decade, is actively sensing real-time intelligence on nearly every aspect of our lives and environment. The Global Positioning System (GPS) and other location-aware technologies are producing data that is specific down to particular latitude and longitude coordinates and seconds of the day. Large-scale instruments, such as the Large Hadron Collider (LHC), are collecting massive amounts of data on our planet and even distant corners of the visible universe. Digitization is being used to convert large collections of documents from print to digital format, giving rise to large archives of unstructured data. Innovations in technology, in the areas of Cloud and molecular computing, Artificial Intelligence/Machine Learning, and Natural Language Processing (NLP), to name only a few, are also greatly expanding our capacity to store, manage, and process Big Data. In this context, the Encyclopedia of Big Data is being offered in recognition of a world that is rapidly moving from gigabytes to terabytes to petabytes and beyond.
While large data sets have long been around and in use in a variety of fields, the era of Big Data in which we now live departs from the past in a number of key respects, and with this departure comes a fresh set of challenges and opportunities that cut across and affect multiple sectors and disciplines, and the public at large. With expanded analytical capacities at hand, Big Data is now being used for scientific inquiry and experimentation in nearly every discipline, if not all, from the social sciences to the humanities to the natural sciences, and more. Moreover, the use of Big Data has been well established beyond the Ivory Tower. In today’s economy, businesses simply cannot be competitive without engaging Big Data in one way or another in support of operations, management, planning, or simply basic hiring decisions. In all levels of government, Big Data is being used to engage citizens and to guide policy making in pursuit of the interests of the public and society in general. The changing nature of Big Data also raises new issues and concerns related to, for example, privacy, liability, security, access, and even the veracity of the data itself. Given the complex issues attending Big Data, there is a real need for a reference book that covers the subject from a multi-disciplinary, cross-sectoral, comprehensive, and international perspective. The Encyclopedia of Big Data will address this need and will be the first of such reference books to do so. Featuring some 500 entries, from "Access" to "Zillow," the Encyclopedia will serve as a fundamental resource for researchers and students, for decision makers and leaders, and for business analysts and purveyors. Developed for those in academia, industry, and government, and others with a general interest in Big Data, the encyclopedia will be aimed especially at those involved in its collection, analysis, and use.
Ultimately, the Encyclopedia of Big Data will provide a common platform and language covering the breadth and depth of the topic for different segments, sectors, and disciplines.
The 7 revised full papers, 11 revised medium-length papers, 6 revised short papers, and 7 demo papers presented together with 10 poster/abstract papers describing late-breaking work were carefully reviewed and selected from numerous submissions. Provenance has been recognized to be important in a wide range of areas including databases, workflows, knowledge representation and reasoning, and digital libraries. Thus, many disciplines have proposed a wide range of provenance models, techniques, and infrastructure for encoding and using provenance. The papers investigate many facets of data provenance, process documentation, data derivation, and data annotation.
Innovative technologies are changing the way research is performed, preserved, and communicated. Managing Scientific Information and Research Data explores how these technologies are used and provides detailed analysis of the approaches and tools developed to manage scientific information and data. Following an introduction, the book is then divided into 15 chapters discussing the changes in scientific communication; new models of publishing and peer review; ethics in scientific communication; preservation of data; discovery tools; discipline-specific practices of researchers for gathering and using scientific information; academic social networks; bibliographic management tools; information literacy and the information needs of students and researchers; the involvement of academic libraries in eScience and the new opportunities it presents to librarians; and interviews with experts in scientific information and publishing.
- Promotes innovative technologies for creating, sharing and managing scientific content
- Presents new models of scientific publishing, peer review, and dissemination of information
- Serves as a practical guide for researchers, students, and librarians on how to discover, filter, and manage scientific information
- Advocates for the adoption of unique author identifiers such as ORCID and ResearcherID
- Looks into new tools that make scientific information easy to discover and manage
- Shows what eScience is and why it is becoming a priority for academic libraries
- Demonstrates how Electronic Laboratory Notebooks can be used to record, store, share, and manage research data
- Shows how social media and the new area of Altmetrics increase researchers' visibility and measure attention to their research
- Directs to sources for datasets
- Provides directions on choosing and using bibliographic management tools
- Critically examines the metrics used to evaluate research impact
- Aids strategic thinking and informs decision making
The amount of data in everyday life has been exploding. This data increase has been especially significant in scientific fields, where substantial amounts of data must be captured, communicated, aggregated, stored, and analyzed. Cloud Computing with e-Science Applications explains how cloud computing can improve data management in data-heavy fields such as bioinformatics, earth science, and computer science. The book begins with an overview of cloud models supplied by the National Institute of Standards and Technology (NIST), and then:
- Discusses the challenges imposed by big data on scientific data infrastructures, including security and trust issues
- Covers vulnerabilities such as data theft or loss, privacy concerns, infected applications, threats in virtualization, and cross-virtual machine attack
- Describes the implementation of workflows in clouds, proposing an architecture composed of two layers—platform and application
- Details infrastructure-as-a-service (IaaS), platform-as-a-service (PaaS), and software-as-a-service (SaaS) solutions based on public, private, and hybrid cloud computing models
- Demonstrates how cloud computing aids in resource control, vertical and horizontal scalability, interoperability, and adaptive scheduling
Featuring significant contributions from research centers, universities, and industries worldwide, Cloud Computing with e-Science Applications presents innovative cloud migration methodologies applicable to a variety of fields where large data sets are produced. The book provides the scientific community with an essential reference for moving applications to the cloud.
A volume in the three-volume Remote Sensing Handbook series, Remote Sensing of Water Resources, Disasters, and Urban Studies documents the scientific and methodological advances that have taken place during the last 50 years. The other two volumes in the series are Remotely Sensed Data Characterization, Classification, and Accuracies, and Land Reso
This book constitutes the refereed proceedings of the Fifth VLDB Workshop on Secure Data Management, SDM 2008, held in Auckland, New Zealand, on August 24, 2008, in conjunction with VLDB 2008. The 11 full papers were selected for publication in the book from 32 submissions. In addition, 3 position papers and a keynote paper are included. The papers are organized in topical sections on database security, trust management, privacy protection, and security and privacy in healthcare.
Libraries organize information, and data is information, so it is natural that librarians should help people who need to find, organize, use, or store data. Organizations need evidence for decision making; data provides that evidence. Inventors and creators build upon data collected by others. All around us, people need data. Librarians can help increase the relevance of their library to the research and education mission of their institution by learning more about data and how to manage it. Data Management will guide readers through:
- Understanding data management basics and best practices.
- Using the reference interview to help with data management.
- Writing data management plans for grants.
- Starting and growing a data management service.
- Finding collaborators inside and outside the library.
- Collecting and using data in different disciplines.