Recent Advances in Multiword Units in Machine Translation and Translation Technology

Author: Johanna Monti

Publisher: John Benjamins Publishing Company

Published: 2024-11-15

Total Pages: 276

ISBN-13: 9027246386

The investigation of phraseology through corpus-based and computational approaches holds significant relevance for various professionals, including translators, interpreters, terminologists, lexicographers, language instructors, and learners. Computational Phraseology, and in particular the computational analysis of multiword expressions (also known as multiword units), has gained prominence in recent years and is essential for a number of Natural Language Processing and Translation Technology applications. The failure to detect these units automatically could result in incorrect and problematic automatic translations and could hinder the performance of applications such as text summarisation and web search. Against this background, the volume offers 13 articles carefully selected and organised into two parts: ‘Computational treatment of multiword units’ and ‘Corpus-based and linguistic studies in phraseology’. The contributions not only highlight the latest advancements in computational and corpus-based phraseology but also reiterate its vital role in all areas of language technologies, including basic and applied research.


Multiword Units in Machine Translation and Translation Technology

Author: Ruslan Mitkov

Publisher: John Benjamins Publishing Company

Published: 2018-07-15

Total Pages: 271

ISBN-13: 9027264201

The correct interpretation of Multiword Units (MWUs) is crucial to many applications in Natural Language Processing but is a challenging and complex task. In recent years, the computational treatment of MWUs has received considerable attention but there is much more to be done before we can claim that NLP and Machine Translation (MT) systems process MWUs successfully. This volume provides a general overview of the field with particular reference to Machine Translation and Translation Technology and focuses on languages such as English, Basque, French, Romanian, German, Dutch and Croatian, among others. The chapters of the volume illustrate a variety of topics that address this challenge, such as the use of rule-based approaches, compound splitting techniques, MWU identification methodologies in multilingual applications, and MWU alignment issues.


Computational and Corpus-Based Phraseology

Author: Gloria Corpas Pastor

Publisher: Springer Nature

Published: 2022-09-21

Total Pages: 252

ISBN-13: 303115925X

This book constitutes the refereed proceedings of the 4th International Conference on Computational and Corpus-Based Phraseology, Europhras 2022, held in Malaga, Spain, in September 2022. The 16 full papers presented in this book were carefully reviewed and selected from 59 submissions. The papers in this volume cover a number of topics including general corpus-based approaches to phraseology, phraseology in translation and cross-linguistic studies, phraseology in language teaching and learning, phraseology in specialized languages, phraseology in lexicography, cognitive approaches to phraseology, the computational treatment of multiword expressions, and the development, annotation, and exploitation of corpora for phraseological studies.


Corpora in Translation and Contrastive Research in the Digital Age

Author: Julia Lavid-López

Publisher: John Benjamins Publishing Company

Published: 2021-12-15

Total Pages: 353

ISBN-13: 9027259682

Corpus-based contrastive and translation research are areas that keep evolving in the digital age, as the range of new corpus resources and tools expands, opening up to different approaches and application contexts. The current book contains a selection of papers which focus on corpora and translation research in the digital age, outlining some recent advances and explorations. After an introductory chapter which outlines language technologies applied to translation and interpreting with a view to identifying challenges and research opportunities, the first part of the book is devoted to current advances in the creation of new parallel corpora for under-researched areas, the development of tools to manage parallel corpora or as an alternative to parallel corpora, and new methodologies to improve existing translation memory systems. The contributions in the second part of the book address a number of cutting-edge linguistic issues in the area of contrastive discourse studies and translation analysis on the basis of comparable and parallel corpora in several languages such as English, German, Swedish, French, Italian, Spanish, Portuguese and Turkish, thus showcasing the linguistic diversity covered in these recent investigations. Given the multiplicity of topics, methodologies and languages studied in the different chapters, the book will be of interest to a wide audience working in the fields of translation studies, contrastive linguistics and the automatic processing of language.


Idiom Treatment Experiments in Machine Translation

Author: Dimitra Anastasiou

Publisher: Cambridge Scholars Publishing

Published: 2010-09-13

Total Pages: 265

ISBN-13: 1443825409

In 1975, Searle stated that one should speak idiomatically unless there is some good reason not to do so. Fillmore, Kay, and O’Connor in 1988 defined an idiomatic expression or construction as something that a language user could fail to know while knowing everything else in the language. Our language is rich in conversational phrases, idioms, metaphors, and general expressions used with a metaphorical meaning. These idiomatic expressions pose a particular challenge for Machine Translation (MT), because their translation for the most part does not work literally, but logically. The present book shows how idiomatic expressions can be recognized and correctly translated with the help of a bilingual idiom dictionary (English-German), a monolingual (German) corpus, and morphosyntactic rules. The work focuses on the field of Example-based Machine Translation (EBMT). A theory of idiomatic expressions with their syntactic and semantic properties is provided, followed by the practical part of the book, which describes how the hybrid EBMT system METIS-II is able to correctly process idiomatic expressions. A comparison of METIS-II with three commercial systems shows that idioms are not impossible to translate, as was predicted in 1952: “The only way for a machine to treat idioms is—not to have idioms!” This book furnishes plenty of examples of idiomatic phrases and provides the foundation for how MT systems can process and translate idioms by means of simple linguistic resources.


The Pragmatics of Multiword Terms

Author: Melania Cabezas-García

Publisher: Taylor & Francis

Published: 2024-02-29

Total Pages: 173

ISBN-13: 1003845568

This book explores the pragmatics of specialized language with a focus on multiword terms, complex phrases characterized by sequences of nouns or adjectives whose meaning is clarified in the unspecified but implicit links between them, with implications for their use and translation. The volume adopts an innovative approach rooted in Frame-Based Terminology which allows for the analysis of multiword term formation (compound terms in specialized language, such as horizontal-axis wind turbine) from an integrated semantic and pragmatic perspective. The book features data from a corpus on wind power in English, Spanish, and French comprising such specialized texts as research articles, books, reports, and PhD theses to consider term extraction and the identification of terminological correspondences. Cabezas-García highlights the ways in which pragmatic analysis is an integral part of understanding multiword terms, due to the necessary inference of information implicit within them, with applications for future research on pragmatics and specialized language more broadly. This book will be of interest to students and researchers in pragmatics, semantics, corpus linguistics, and terminology.


Lexical Collocation Analysis

Author: Pascual Cantos-Gómez

Publisher: Springer

Published: 2018-08-21

Total Pages: 145

ISBN-13: 3319925822

This book re-examines the notion of word associations, more precisely collocations. It attempts to come to a potentially more generally applicable definition of collocation and how to best extract, identify and measure collocations. The book highlights the role played by (i) automatic linguistic annotation (part-of-speech tagging, syntactic parsing, etc.), (ii) using semantic criteria to facilitate the identification of collocations, (iii) multi-word structures, instead of the widespread assumption of bipartite collocational structures, for capturing the intricacies of the phenomenon of syntagmatic attraction, (iv) considering collocation and valency as near neighbours in the lexis-grammar continuum and (v) the mathematical properties of statistical association measures in the automatic extraction of collocations from corpora. This book is an ideal guide to the use of statistics in collocation analysis and lexicography, as well as a practical text for developing skills in the application of computational lexicography. Lexical Collocation Analysis: Advances and Applications begins with a proposal for integrating both collocational and valency phenomena within the overarching theoretical framework of construction grammar. Next, the book makes the case for integrating advances in syntactic parsing and in collocational analysis. Chapter 3 offers an innovative look at complementing corpus data and dictionaries in the identification of specific types of collocations consisting of restricted predicate-argument combinations. This strategy complements corpus collocational data with network analysis techniques applied to dictionary entries. Chapter 4 explains the potential of collocational graphs and networks both as a visualization tool and as an analytical technique. Chapter 5 introduces MERGE (Multi-word Expressions from the Recursive Grouping of Elements), a data-driven approach to the identification and extraction of multi-word expressions from corpora. Finally, the book concludes with an analysis and evaluation of factors influencing the performance of collocation extraction methods in parsed corpora.


The Routledge Handbook of Translation and Technology

Author: Minako O'Hagan

Publisher: Routledge

Published: 2019-08-23

Total Pages: 644

ISBN-13: 1315311232

The Routledge Handbook of Translation and Technology provides a comprehensive and accessible overview of the dynamically evolving relationship between translation and technology. Divided into five parts, with an editor's introduction, this volume presents the perspectives of users of translation technologies, and of researchers concerned with issues arising from the increasing interdependency between translation and technology. The chapters in this Handbook tackle the advent of technologization at both a technical and a philosophical level, based on industry practice and academic research. Containing over 30 authoritative, cutting-edge chapters, this is an essential reference and resource for those studying and researching translation and technology. The volume will also be valuable for translators, computational linguists and developers of translation tools.


Mobile Speech and Advanced Natural Language Solutions

Author: Amy Neustein

Publisher: Springer Science & Business Media

Published: 2013-02-03

Total Pages: 373

ISBN-13: 1461460182

"Mobile Speech and Advanced Natural Language Solutions" presents a discussion of the most recent advances in intelligent human-computer interaction, including fascinating new study findings on talk-in-interaction, which is the province of conversation analysis, a subfield in sociology/sociolinguistics, a new and emerging area in natural language understanding. Editors Amy Neustein and Judith A. Markowitz have recruited a talented group of contributors to introduce the next generation of natural language technologies for practical speech processing applications that serve the consumer’s need for well-functioning natural language-driven personal assistants and other mobile devices, while also addressing businesses’ need for better-functioning IVR-driven call centers that yield a more satisfying experience for the caller. This anthology is aimed at two distinct audiences: one consisting of speech engineers and system developers; the other of linguists and cognitive scientists. The text builds on the experience and knowledge of each of these audiences by exposing them to the work of the other.


Formalising Natural Languages with Nooj 2014

Author: Mario Monteleone

Publisher: Cambridge Scholars Publishing

Published: 2015-10-13

Total Pages: 260

ISBN-13: 1443884642

This volume is composed of 22 peer-reviewed contributions selected from among the 52 presentations submitted for the 2014 International NooJ Conference held at the University of Sassari, Italy. NooJ is a linguistic development environment that allows linguists to formalize a wide range of linguistic phenomena, and then test, adapt, share and accumulate each elementary description so as to build linguistic “modules”, that is, structured libraries of linguistic resources. NooJ is also used as a corpus processor that can launch sophisticated queries over large corpora of texts, in order to produce various results, including concordances, statistical analyses, information extraction, and automatic translation. NooJ is used in many research centers all over the world, and linguistic modules are available for more than 20 languages. NooJ is also used by a growing number of software companies to develop various Natural Language Processing applications. Johanna Monti is Associate Professor at the University of Sassari, Italy, where she teaches Translation Studies, Computational Linguistics, and Machine-Translation and Computer-Aided Translation. She has acted as a member of the scientific committees of various renowned international conferences on Natural Language Processing, and as external evaluator for the Italian Ministry for Education, Universities and Research (MIUR) and the Horizon 2020 programme.