LMF Lexical Markup Framework

Author: Gil Francopoulo

Publisher: John Wiley & Sons

Published: 2013-05-06

Total Pages: 268

ISBN-10: 1118712595

The community responsible for developing lexicons for Natural Language Processing (NLP) and Machine Readable Dictionaries (MRDs) started its ISO standardization activities in 2003. These activities resulted in the ISO standard Lexical Markup Framework (LMF). After selecting and defining a common terminology, the LMF team had to identify the common notions shared by all lexicons in order to specify a common skeleton (called the core model) and understand the various requirements coming from different groups of users.

The goals of LMF are to provide a common model for the creation and use of lexical resources, to manage the exchange of data between and among these resources, and to enable the merging of a large number of individual electronic resources to form extensive global electronic resources. The various types of individual instantiations of LMF can include monolingual, bilingual or multilingual lexical resources. The same specifications can be used for small and large lexicons, both simple and complex, as well as for both written and spoken lexical representations. The descriptions range from morphology, syntax and computational semantics to computer-assisted translation. The languages covered are not restricted to European languages, but apply to all natural languages. The LMF specification is now a success and numerous lexicon managers currently use LMF in different languages and contexts.

This book starts with the historical context of LMF, before providing an overview of the LMF model and the Data Category Registry, which provides a flexible means for applying constants like /grammatical gender/ in a variety of different settings. It then presents concrete applications and experiments on real data, which are important for developers who want to learn about the use of LMF.

Contents

1. LMF – Historical Context and Perspectives, Nicoletta Calzolari, Monica Monachini and Claudia Soria.
2. Model Description, Gil Francopoulo and Monte George.
3. LMF and the Data Category Registry: Principles and Application, Menzo Windhouwer and Sue Ellen Wright.
4. Wordnet-LMF: A Standard Representation for Multilingual Wordnets, Piek Vossen, Claudia Soria and Monica Monachini.
5. Prolmf: A Multilingual Dictionary of Proper Names and their Relations, Denis Maurel and Béatrice Bouchou-Markhoff.
6. LMF for Arabic, Aida Khemakhem, Bilel Gargouri, Kais Haddar and Abdelmajid Ben Hamadou.
7. LMF for a Selection of African Languages, Chantal Enguehard and Mathieu Mangeot.
8. LMF and its Implementation in Some Asian Languages, Takenobu Tokunaga, Sophia Y.M. Lee, Virach Sornlertlamvanich, Kiyoaki Shirai, Shu-Kai Hsieh and Chu-Ren Huang.
9. DUELME: Dutch Electronic Lexicon of Multiword Expressions, Jan Odijk.
10. UBY-LMF – Exploring the Boundaries of Language-Independent Lexicon Models, Judith Eckle-Kohler, Iryna Gurevych, Silvana Hartmann, Michael Matuschek and Christian M. Meyer.
11. Conversion of Lexicon-Grammar Tables to LMF: Application to French, Éric Laporte, Elsa Tolone and Matthieu Constant.
12. Collaborative Tools: From Wiktionary to LMF, for Synchronic and Diachronic Language Data, Thierry Declerck, Piroska Lendvai and Karlheinz Mörth.
13. LMF Experiments on Format Conversions for Resource Merging: Converters and Problems, Marta Villegas, Muntsa Padró and Núria Bel.
14. LMF as a Foundation for Servicized Lexical Resources, Yoshihiko Hayashi, Monica Monachini, Bora Savas, Claudia Soria and Nicoletta Calzolari.
15. Creating a Serialization of LMF: The Experience of the RELISH Project, Menzo Windhouwer, Justin Petro, Irina Nevskaya, Sebastian Drude, Helen Aristar-Dry and Jost Gippert.
16. Global Atlas: Proper Nouns, From Wikipedia to LMF, Gil Francopoulo, Frédéric Marcoul, David Causse and Grégory Piparo.
17. LMF in U.S. Government Language Resource Management, Monte George.

About the Authors

Gil Francopoulo works for Tagmatica (www.tagmatica.com), a company in Paris, France, specializing in software development in the field of linguistics and documentation in the semantic web, as well as for Spotter (www.spotter.com), a company specializing in media and social media analytics.


Linked Lexical Knowledge Bases

Author: Iryna Gurevych

Publisher: Springer Nature

Published: 2022-06-01

Total Pages: 124

ISBN-10: 3031021622

This book conveys the fundamentals of Linked Lexical Knowledge Bases (LLKB) and sheds light on their different aspects from various perspectives, focusing on their construction and use in natural language processing (NLP). It characterizes a wide range of both expert-based and collaboratively constructed lexical knowledge bases. Only basic familiarity with NLP is required and this book has been written for both students and researchers in NLP and related fields who are interested in knowledge-based approaches to language analysis and their applications. Lexical Knowledge Bases (LKBs) are indispensable in many areas of natural language processing, as they encode human knowledge of language in machine readable form, and as such, they are required as a reference when machines attempt to interpret natural language in accordance with human perception. In recent years, numerous research efforts have led to the insight that to make the best use of available knowledge, the orchestrated exploitation of different LKBs is necessary. This allows us to not only extend the range of covered words and senses, but also gives us the opportunity to obtain a richer knowledge representation when a particular meaning of a word is covered in more than one resource. Examples where such an orchestrated usage of LKBs proved beneficial include word sense disambiguation, semantic role labeling, semantic parsing, and text classification. This book presents different kinds of automatic, manual, and collaborative linkings between LKBs. A special chapter is devoted to the linking algorithms employing text-based, graph-based, and joint modeling methods. Following this, it presents a set of higher-level NLP tasks and algorithms, effectively utilizing the knowledge in LLKBs. Among them, you will find advanced methods, e.g., distant supervision, or continuous vector space models of knowledge bases (KB), that have become widely used at the time of this book's writing. 
Finally, multilingual applications of LLKBs, such as cross-lingual semantic relatedness and computer-aided translation, are discussed, as well as tools and interfaces for exploring LLKBs, followed by conclusions and future research directions.


Lexical Conflict

Author: Danko Šipka

Publisher: Cambridge University Press

Published: 2015-09-18

Total Pages: 265

ISBN-10: 1316395685

The first practical study of its kind, Lexical Conflict presents a taxonomy of cross-linguistic lexical differences, with thorough discussion of zero equivalence, multiple equivalence and partial equivalence across languages. Illustrated with numerous examples taken from over one hundred world languages, this work is an exhaustive exploration of cross-linguistic and cross-cultural differences, presenting guidelines and solutions for the lexicographic treatment of these differences. The text combines theoretical and applied linguistic perspectives to create an essential guide for students, researchers and practitioners in linguistics, anthropology, cross-cultural psychology, translation, interpretation and international marketing.


The Routledge Handbook of Lexicography

Author: Pedro A. Fuertes-Olivera

Publisher: Routledge

Published: 2017-10-02

Total Pages: 987

ISBN-10: 135159964X

The Routledge Handbook of Lexicography provides a comprehensive overview of the major approaches to lexicography and their applications within the field. This Handbook features key case studies and cutting-edge contributions from an international range of practitioners, teachers, and researchers. Analysing the theory and practice of compiling dictionaries within the digital era, the 47 chapters address the core issues of: The foundations of lexicography, and its interactions with other disciplines including Corpus Linguistics and Information Science; Types of dictionaries, for purposes such as translation and teaching; Innovative specialised dictionaries such as the Oenolex wine dictionary and the Online Dictionary of New Zealand Sign Language; Lexicography and world languages, including Arabic, Hindi, Russian, Chinese, and Indonesian; The future of lexicography, including the use of the Internet, user participation, and dictionary portals. The Routledge Handbook of Lexicography is essential reading for researchers and students working in this area.


Language, Culture, Computation: Computational Linguistics and Linguistics

Author: Nachum Dershowitz

Publisher: Springer

Published: 2014-12-05

Total Pages: 882

ISBN-10: 3642453279

This Festschrift volume is published in honor of Yaacov Choueka on the occasion of his 75th birthday. The present three-volume liber amicorum, several years in gestation, honours this outstanding Israeli computer scientist and is dedicated to him and to his scientific endeavours. Yaacov's research has had a major impact not only within the walls of academia, but also in the daily life of lay users of the technology that originated from his research. An especially amazing aspect of the temporal span of his scholarly work is that half a century after his influential research from the early 1960s, a project in which he is currently involved is proving to be a sensation, as will become apparent from what follows. Yaacov Choueka began his research career in the theory of computer science, dealing with basic questions regarding the relation between mathematical logic and automata theory. From formal languages, Yaacov moved to natural languages. He was a founder of natural-language processing in Israel, developing numerous tools for Hebrew. He is best known for his primary role, together with Aviezri Fraenkel, in the development of the Responsa Project, one of the earliest full-text retrieval systems in the world. More recently, he has headed the Friedberg Genizah Project, which is bringing the treasures of the Cairo Genizah into the Digital Age. This third part of the three-volume set covers a range of topics related to language, ranging from linguistics to applications of computation to language, using linguistic tools. The papers are grouped in topical sections on: natural language processing; representing the lexicon; and neologisation.


Features

Author: Greville G. Corbett

Publisher: Cambridge University Press

Published: 2012-10-11

Total Pages: 341

ISBN-10: 1107026237

A unique examination of the features of language: how features vary between languages and also how they work.


Towards the Multilingual Semantic Web

Author: Paul Buitelaar

Publisher: Springer

Published: 2014-11-13

Total Pages: 339

ISBN-10: 3662435853

To date, the relation between multilingualism and the Semantic Web has not yet received enough attention in the research community. One major challenge for the Semantic Web community is to develop architectures, frameworks and systems that can help in overcoming national and language barriers, facilitating equal access to information produced in different cultures and languages. As such, this volume aims at documenting the state-of-the-art with regard to the vision of a Multilingual Semantic Web, in which semantic information will be accessible in and across multiple languages. The Multilingual Semantic Web as envisioned in this volume will support the following functionalities: (1) responding to information needs in any language with regard to semantically structured data available on the Semantic Web and Linked Open Data (LOD) cloud, (2) verbalizing and accessing semantically structured data, ontologies or other conceptualizations in multiple languages, (3) harmonizing, integrating, aggregating, comparing and repurposing semantically structured data across languages and (4) aligning and reconciling ontologies or other conceptualizations across languages. The volume is divided into three main sections: Principles, Methods and Applications. The section on “Principles” discusses models, architectures and methodologies that enrich the current Semantic Web architecture with features necessary to handle multiple languages. The section on “Methods” describes algorithms and approaches for solving key issues related to the construction of the Multilingual Semantic Web. The section on “Applications” describes the use of Multilingual Semantic Web based approaches in the context of several application domains. 
This volume is essential reading for all academic and industrial researchers who want to embark on this new research field at the intersection of various research topics, including the Semantic Web, Linked Data, natural language processing, computational linguistics, terminology and information retrieval. It will also be of great interest to practitioners who are interested in re-examining their existing infrastructure and methodologies for handling multiple languages in Web applications or information retrieval systems.


Multiword expressions in lexical resources

Author: Voula Giouli

Publisher: Language Science Press

Published: 2024-06-17

Total Pages: 372

ISBN-10: 3961104700

This volume contains chapters that paint the current landscape of multiword expression (MWE) representation in lexical resources, in view of their robust identification and computational processing. Both large-size general lexica and smaller MWE-centred ones are included, with special focus on the representation decisions and mechanisms that facilitate their usage in Natural Language Processing tasks. The presentations go beyond the morpho-syntactic description of MWEs, into their semantics. One challenge in representing MWEs in lexical resources is ensuring that the variability, along with the extra features required by the different types of MWEs, can be captured efficiently. In this respect, recommendations for representing MWEs in mono- and multilingual computational lexicons have been proposed; these focus mainly on the syntactic and semantic properties of support verbs and noun compounds and on their proper encoding.


The Swedish FrameNet++

Author: Dana Dannélls

Publisher: John Benjamins Publishing Company

Published: 2021-11-26

Total Pages: 349

ISBN-10: 9027258481

Large computational lexicons are central NLP resources. Swedish FrameNet++ aims to be a versatile full-scale lexical resource for NLP containing many kinds of linguistic information. Although focused on Swedish, this ongoing effort, which includes building a new Swedish framenet and recycling existing lexicons, has offered valuable insights into general aspects of lexical-resource building for NLP, which are discussed in this book: computational and linguistic problems of lexical semantics and lexical typology, the nature of lexical items (words and multiword expressions), achieving interoperability among heterogeneous lexical content, NLP methods for extending and interlinking existing lexicons, and deploying the new resource in practical NLP applications. This book is targeted at everyone with an interest in lexicography, computational lexicography, lexical typology, lexical semantics, linguistics, computational linguistics and related fields. We believe it should be of particular interest to those who are or have been involved in language resource creation, development and evaluation.


The Language Grid

Author: Toru Ishida

Publisher: Springer Science & Business Media

Published: 2011-07-29

Total Pages: 303

ISBN-10: 364221178X

There is increasing interaction among communities with multiple languages, so we need services that can effectively support multilingual communication. The Language Grid is an initiative to build an infrastructure that allows end users to create composite language services for intercultural collaboration. The aim is to support communities in creating customized multilingual environments by using language services to overcome local language barriers. The stakeholders of the Language Grid are the language resource providers, the language service users, and the language grid operators who coordinate the former. This book includes 18 chapters in six parts that summarize various research results and associated development activities on the Language Grid. The chapters in Part I describe the framework of the Language Grid, i.e., service-oriented collective intelligence, used to bridge providers, users and operators. Two kinds of software are introduced, the service grid server software and the Language Grid Toolbox, and code for both is available via open source licenses. Part II describes technologies for service workflows that compose atomic language services. Part III reports on research work and activities relating to sharing and using language services. Part IV describes various applications of language services as applicable to intercultural collaboration. Part V contains reports on applying the Language Grid for translation activities, including localization of industrial documents and Wikipedia articles. Finally, Part VI illustrates how the Language Grid can be connected to other service grids, such as DFKI's Heart of Gold and smart classroom services at Tsinghua University in Beijing. The book will be valuable for researchers in artificial intelligence, natural language processing, services computing and human-computer interaction, particularly those who are interested in bridging technologies and user communities.