Introduction to Information Retrieval

Author: Christopher D. Manning

Publisher: Cambridge University Press

Published: 2008-07-07

Total Pages:

ISBN-10: 1139472100

Class-tested and coherent, this textbook teaches classical and web information retrieval, including web search and the related areas of text classification and text clustering from basic concepts. It gives an up-to-date treatment of all aspects of the design and implementation of systems for gathering, indexing, and searching documents; methods for evaluating systems; and an introduction to the use of machine learning methods on text collections. All the important ideas are explained using examples and figures, making it perfect for introductory courses in information retrieval for advanced undergraduates and graduate students in computer science. Based on feedback from extensive classroom experience, the book has been carefully structured in order to make teaching more natural and effective. Slides and additional exercises (with solutions for lecturers) are also available through the book's supporting website to help course instructors prepare their lectures.


Dynamic Information Retrieval Modeling

Author: Grace Hui Yang

Publisher: Morgan & Claypool Publishers

Published: 2016-06-01

Total Pages: 146

ISBN-10: 1627055266

Big data and human-computer information retrieval (HCIR) are changing IR. They capture the dynamic changes in the data and dynamic interactions of users with IR systems. A dynamic system is one which changes or adapts over time or a sequence of events. Many modern IR systems and data exhibit these characteristics which are largely ignored by conventional techniques. What is missing is an ability for the model to change over time and be responsive to stimulus. Documents, relevance, users and tasks all exhibit dynamic behavior that is captured in data sets typically collected over long time spans and models need to respond to these changes. Additionally, the size of modern datasets enforces limits on the amount of learning a system can achieve. Further to this, advances in IR interface, personalization and ad display demand models that can react to users in real time and in an intelligent, contextual way. In this book we provide a comprehensive and up-to-date introduction to Dynamic Information Retrieval Modeling, the statistical modeling of IR systems that can adapt to change. We define dynamics, what it means within the context of IR and highlight examples of problems where dynamics play an important role. We cover techniques ranging from classic relevance feedback to the latest applications of partially observable Markov decision processes (POMDPs) and a handful of useful algorithms and tools for solving IR problems incorporating dynamics. The theoretical component is based around the Markov Decision Process (MDP), a mathematical framework taken from the field of Artificial Intelligence (AI) that enables us to construct models that change according to sequential inputs. We define the framework and the algorithms commonly used to optimize over it and generalize it to the case where the inputs aren't reliable. 
We explore the topic of reinforcement learning more broadly and introduce another tool known as a Multi-Armed Bandit which is useful for cases where exploring model parameters is beneficial. Following this we introduce theories and algorithms which can be used to incorporate dynamics into an IR model before presenting an array of state-of-the-art research that already does, such as in the areas of session search and online advertising. Change is at the heart of modern Information Retrieval systems and this book will help equip the reader with the tools and knowledge needed to understand Dynamic Information Retrieval Modeling.


Estimating the Query Difficulty for Information Retrieval

Author: David Carmel

Publisher: Morgan & Claypool Publishers

Published: 2010

Total Pages: 77

ISBN-10: 160845357X

Many information retrieval (IR) systems suffer from a radical variance in performance when responding to users' queries. Even for systems that succeed very well on average, the quality of results returned for some of the queries is poor. Thus, it is desirable that IR systems will be able to identify "difficult" queries so they can be handled properly. Understanding why some queries are inherently more difficult than others is essential for IR, and a good answer to this important question will help search engines to reduce the variance in performance, hence better servicing their customer needs. Estimating the query difficulty is an attempt to quantify the quality of search results retrieved for a query from a given collection of documents. This book discusses the reasons that cause search engines to fail for some of the queries, and then reviews recent approaches for estimating query difficulty in the IR field. It then describes a common methodology for evaluating the prediction quality of those estimators, and experiments with some of the predictors applied by various IR methods over several TREC benchmarks. Finally, it discusses potential applications that can utilize query difficulty estimators by handling each query individually and selectively, based upon its estimated difficulty.
Table of Contents: Introduction - The Robustness Problem of Information Retrieval / Basic Concepts / Query Performance Prediction Methods / Pre-Retrieval Prediction Methods / Post-Retrieval Prediction Methods / Combining Predictors / A General Model for Query Difficulty / Applications of Query Difficulty Estimation / Summary and Conclusions


Multimedia Information Retrieval

Author: Stefan Rueger

Publisher: Morgan & Claypool Publishers

Published: 2010

Total Pages: 155

ISBN-10: 160845097X

Supporting users in their resource discovery mission when hunting for multimedia material is not a technological indexing problem alone. We look at interactive ways of engaging with repositories through browsing and relevance feedback, roping in geographical context, and providing visual summaries for videos. The book concludes with an overview of state-of-the-art research projects in the area of multimedia information retrieval, which gives an indication of the research and development trends and, thereby, a glimpse of the future world.


Visual Information Retrieval Using Java and LIRE

Author: Mathias Lux

Publisher: Springer Nature

Published: 2022-05-31

Total Pages: 96

ISBN-10: 3031022823

Visual information retrieval (VIR) is an active and vibrant research area, which attempts to provide means for organizing, indexing, annotating, and retrieving visual information (images and videos) from large, unstructured repositories. The goal of VIR is to retrieve matches ranked by their relevance to a given query, which is often expressed as an example image and/or a series of keywords. During its early years (1995-2000), the research efforts were dominated by content-based approaches contributed primarily by the image and video processing community. During the past decade, it was widely recognized that the challenges imposed by the lack of coincidence between an image's visual contents and its semantic interpretation, also known as the semantic gap, required a clever use of textual metadata (in addition to information extracted from the image's pixel contents) to make image and video retrieval solutions efficient and effective. The need to bridge (or at least narrow) the semantic gap has been one of the driving forces behind current VIR research. Additionally, other related research problems and market opportunities have started to emerge, offering a broad range of exciting problems for computer scientists and engineers to work on. In this introductory book, we focus on a subset of VIR problems where the media consists of images, and the indexing and retrieval methods are based on the pixel contents of those images -- an approach known as content-based image retrieval (CBIR). We present an implementation-oriented overview of CBIR concepts, techniques, algorithms, and figures of merit. Most chapters are supported by examples written in Java, using Lucene (an open-source Java-based indexing and search implementation) and LIRE (Lucene Image REtrieval), an open-source Java-based library for CBIR.
Table of Contents: Introduction / Information Retrieval: Selected Concepts and Techniques / Visual Features / Indexing Visual Features / LIRE: An Extensible Java CBIR Library / Concluding Remarks


Lectures on Information Retrieval

Author: Maristella Agosti

Publisher: Springer

Published: 2003-05-15

Total Pages: 320

ISBN-10: 3540453687

Information Retrieval (IR) is concerned with the effective and efficient retrieval of information based on its semantic content. The central problem in IR is the quest to find the set of relevant documents, among a large collection containing the information sought, satisfying a user's information need usually expressed in a natural language query. Documents may be objects or items in any medium: text, image, audio, or indeed a mixture of all three. This book presents 12 revised lectures given at the Third European Summer School in Information Retrieval, ESSIR 2000, held at the Villa Monastero, Varenna, Italy, in September 2000. The first part of the book is devoted to the foundation of IR and related areas; the second part on advanced topics addresses various current issues, from usability aspects to Web searching and browsing.


Click Models for Web Search

Author: Aleksandr Chuklin

Publisher: Morgan & Claypool Publishers

Published: 2015-07-01

Total Pages: 117

ISBN-10: 1627056483

With the rapid growth of web search in recent years, the problem of modeling its users has started to attract more and more attention from the information retrieval community. This has several motivations. By building a model of user behavior we are essentially developing a better understanding of a user, which ultimately helps us to deliver a better search experience. A model of user behavior can also be used as a predictive device for non-observed items such as document relevance, which makes it useful for improving search result ranking. Finally, in many situations experimenting with real users is just infeasible and hence user simulations based on accurate models play an essential role in understanding the implications of algorithmic changes to search engine results or presentation changes to the search engine result page. In this survey we summarize advances in modeling user click behavior on a web search engine result page. We present simple click models as well as more complex models aimed at capturing non-trivial user behavior patterns on modern search engine result pages. We discuss how these models compare to each other, what challenges they have, and what ways there are to address these challenges. We also study the problem of evaluating click models and discuss the main applications of click models.


Information Concepts

Author: Gary Marchionini

Publisher: Morgan & Claypool Publishers

Published: 2010-06-06

Total Pages: 105

ISBN-10: 1598299638

Information is essential to all human activity, and information in electronic form both amplifies and augments human information interactions. This lecture surveys some of the different classical meanings of information, focuses on the ways that electronic technologies are affecting how we think about these senses of information, and introduces an emerging sense of information that has implications for how we work, play, and interact with others. The evolutions of computers and electronic networks and people's uses and adaptations of these tools manifest a dynamic space called cyberspace. Our traces of activity in cyberspace give rise to a new sense of information as instantaneous identity states that I term proflection of self. Proflections of self influence how others act toward us. Four classical senses of information are described as context for this new form of information. The four senses selected for inclusion here are the following: thought and memory, communication process, artifact, and energy. Human mental activity and state (thought and memory) have neurological, cognitive, and affective facets. The act of informing (communication process) is considered from the perspective of human intentionality and technical developments that have dramatically amplified human communication capabilities. Information artifacts comprise a common sense of information that gives rise to a variety of information industries. Energy is the most general sense of information and is considered from the point of view of physical, mental, and social state change. This sense includes information theory as a measurable reduction in uncertainty. This lecture emphasizes how electronic representations have blurred media boundaries and added computational behaviors that yield new forms of information interaction, which, in turn, are stored, aggregated, and mined to create profiles that represent our cyber identities.
Table of Contents: The Many Meanings of Information / Information as Thought and Memory / Information as Communication Process / Information as Artifact / Information as Energy / Information as Identity in Cyberspace: The Fifth Voice / Conclusion and Directions


Advances in Information Retrieval

Author: Djoerd Hiemstra

Publisher: Springer Nature

Published: 2021-03-26

Total Pages: 808

ISBN-10: 3030721132

This two-volume set LNCS 12656 and 12657 constitutes the refereed proceedings of the 43rd European Conference on IR Research, ECIR 2021, held virtually in March/April 2021, due to the COVID-19 pandemic. The 50 full papers presented together with 11 reproducibility papers, 39 short papers, 15 demonstration papers, 12 CLEF lab description papers, 5 doctoral consortium papers, 5 workshop abstracts, and 8 tutorial abstracts were carefully reviewed and selected from 436 submissions. The accepted contributions cover the state of the art in IR: deep learning-based information retrieval techniques, use of entities and knowledge graphs, recommender systems, retrieval methods, information extraction, question answering, topic and prediction models, multimedia retrieval, and much more.


Private Information Retrieval

Author: Xun Yi

Publisher: Morgan & Claypool Publishers

Published: 2013-09-01

Total Pages: 116

ISBN-10: 1627051546

This book deals with Private Information Retrieval (PIR), a technique allowing a user to retrieve an element from a server in possession of a database without revealing to the server which element is retrieved. PIR has been widely applied to protect the privacy of the user in querying a service provider on the Internet. For example, by PIR, one can query a location-based service provider about the nearest car park without revealing his location to the server. The first PIR approach was introduced by Chor, Goldreich, Kushilevitz and Sudan in 1995 in a multi-server setting, where the user retrieves information from multiple database servers, each of which has a copy of the same database. To ensure user privacy in the multi-server setting, the servers must be trusted not to collude. In 1997, Kushilevitz and Ostrovsky constructed the first single-database PIR. Since then, many efficient PIR solutions have been discovered. Beginning with a thorough survey of single-database PIR techniques, this text focuses on the latest technologies and applications in the field of PIR. The main categories are illustrated with recently proposed PIR-based solutions by the authors. Because of the latest treatment of the topic, this text will be highly beneficial to researchers and industry professionals in information security and privacy.