Class-tested and coherent, this textbook teaches classical and web information retrieval, including web search and the related areas of text classification and text clustering from basic concepts. It gives an up-to-date treatment of all aspects of the design and implementation of systems for gathering, indexing, and searching documents; methods for evaluating systems; and an introduction to the use of machine learning methods on text collections. All the important ideas are explained using examples and figures, making it perfect for introductory courses in information retrieval for advanced undergraduates and graduate students in computer science. Based on feedback from extensive classroom experience, the book has been carefully structured in order to make teaching more natural and effective. Slides and additional exercises (with solutions for lecturers) are also available through the book's supporting website to help course instructors prepare their lectures.
A collection of papers proposing, developing, and implementing logical IR models. After an introductory chapter on non-classical logic as the appropriate formalism with which to build IR models, papers are divided into groups on three approaches: logical models, uncertainty models, and meta-models. Topics include preferential models of query by navigation, a logic for multimedia information retrieval, logical imaging and probabilistic information retrieval, and an axiomatic aboutness theory for information retrieval. Can be used as a text for a graduate course on information retrieval or database systems, and as a reference for researchers and practitioners in industry. Annotation copyrighted by Book News, Inc., Portland, OR
An introduction to information retrieval, the foundation for modern search engines, that emphasizes implementation and experimentation. Information retrieval is the foundation for modern search engines. This textbook offers an introduction to the core topics underlying modern search technologies, including algorithms, data structures, indexing, retrieval, and evaluation. The emphasis is on implementation and experimentation; each chapter includes exercises and suggestions for student projects. Wumpus—a multiuser open-source information retrieval system developed by one of the authors and available online—provides model implementations and a basis for student work. The modular structure of the book allows instructors to use it in a variety of graduate-level courses, including courses taught from a database systems perspective, traditional information retrieval courses with a focus on IR theory, and courses covering the basics of Web retrieval. In addition to its classroom use, Information Retrieval will be a valuable reference for professionals in computer science, computer engineering, and software engineering.
Chapter 1 places into perspective a total Information Storage and Retrieval System. This perspective introduces new challenges to the problems that need to be theoretically addressed and commercially implemented. Ten years ago commercial implementation of the algorithms being developed was not realistic, allowing theoreticians to limit their focus to very specific areas. Bounding a problem is still essential in deriving theoretical results. But the commercialization and insertion of this technology into systems like the Internet that are widely being used changes the way problems are bounded. From a theoretical perspective, efficient scalability of algorithms to systems with gigabytes and terabytes of data, operating with minimal user search statement information, and making maximum use of all functional aspects of an information system need to be considered. The dissemination systems using persistent indexes or mail files to modify ranking algorithms and combining the search of structured information fields and free text into a consolidated weighted output are examples of potential new areas of investigation. The best way for the theoretician or the commercial developer to understand the importance of problems to be solved is to place them in the context of a total vision of a complete system. Understanding the differences between Digital Libraries and Information Retrieval Systems will add an additional dimension to the potential future development of systems. The collaborative aspects of digital libraries can be viewed as a new source of information that dynamically could interact with information retrieval techniques.
Information retrieval is the science concerned with the effective and efficient retrieval of documents starting from their semantic content. It is employed to fulfill some information need from a large number of digital documents. Given the ever-growing amount of documents available and the heterogeneous data structures used for storage, information retrieval has recently faced and tackled novel applications. In this book, Melucci and Baeza-Yates present a wide-spectrum illustration of recent research results in advanced areas related to information retrieval. Readers will find chapters on e.g. aggregated search, digital advertising, digital libraries, discovery of spam and opinions, information retrieval in context, multimedia resource discovery, quantum mechanics applied to information retrieval, scalability challenges in web search engines, and interactive information retrieval evaluation. All chapters are written by well-known researchers, are completely self-contained and comprehensive, and are complemented by an integrated bibliography and subject index. With this selection, the editors provide the most up-to-date survey of topics usually not addressed in depth in traditional (text)books on information retrieval. The presentation is intended for a wide audience of people interested in information retrieval: undergraduate and graduate students, post-doctoral researchers, lecturers, and industrial researchers.
In order to be effective for their users, information retrieval (IR) systems should be adapted to the specific needs of particular environments. The huge and growing array of types of information retrieval systems in use today is on display in Understanding Information Retrieval Systems: Management, Types, and Standards, which addresses over 20 typ
With the increased use of technology in modern society, high volumes of multimedia information exists. It is important for businesses, organizations, and individuals to understand how to optimize this data and new methods are emerging for more efficient information management and retrieval. Information Retrieval and Management: Concepts, Methodologies, Tools, and Applications is an innovative reference source for the latest academic material in the field of information and communication technologies and explores how complex information systems interact with and affect one another. Highlighting a range of topics such as knowledge discovery, semantic web, and information resources management, this multi-volume book is ideally designed for researchers, developers, managers, strategic planners, and advanced-level students.
The wealth of information accessible on the Internet has grown exponentially since its advent. This mass of content must be systemically sifted to glean pertinent data, and the utilization of the collective intelligence of other users, or social information retrieval, is an innovative, emerging technique. Social Information Retrieval Systems: Emerging Technologies & Applications for Searching the Web Effectively provides relevant content in the areas of information retrieval systems, services, and research; covering topics such as social tagging, collaborative querying, social network analysis, subjective relevance judgments, and collaborative filtering. Answering the increasing demand for authoritative resources on Internet technologies, this Premier Reference Source will make an indispensable addition to any library collection.
With the proliferation of huge amounts of (heterogeneous) data on the Web, the importance of information retrieval (IR) has grown considerably over the last few years. Big players in the computer industry, such as Google, Microsoft and Yahoo!, are the primary contributors of technology for fast access to Web-based information; and searching capabilities are now integrated into most information systems, ranging from business management software and customer relationship systems to social networks and mobile phone applications. Ceri and his co-authors aim at taking their readers from the foundations of modern information retrieval to the most advanced challenges of Web IR. To this end, their book is divided into three parts. The first part addresses the principles of IR and provides a systematic and compact description of basic information retrieval techniques (including binary, vector space and probabilistic models as well as natural language search processing) before focusing on its application to the Web. Part two addresses the foundational aspects of Web IR by discussing the general architecture of search engines (with a focus on the crawling and indexing processes), describing link analysis methods (specifically Page Rank and HITS), addressing recommendation and diversification, and finally presenting advertising in search (the main source of revenues for search engines). The third and final part describes advanced aspects of Web search, each chapter providing a self-contained, up-to-date survey on current Web research directions. Topics in this part include meta-search and multi-domain search, semantic search, search in the context of multimedia data, and crowd search. The book is ideally suited to courses on information retrieval, as it covers all Web-independent foundational aspects. Its presentation is self-contained and does not require prior background knowledge. It can also be used in the context of classic courses on data management, allowing the instructor to cover both structured and unstructured data in various formats. Its classroom use is facilitated by a set of slides, which can be downloaded from www.search-computing.org.