Working with Portuguese Corpora

Author: Tony Berber Sardinha

Publisher: A&C Black

Published: 2014-04-10

Total Pages: 347

ISBN-13: 1472570014

Although Portuguese is one of the main world languages and researchers have been working on Portuguese electronic text collections for decades (e.g. Kelly, 1970; Biderman, 1978; Bacelar do Nascimento et al., 1984; see Berber Sardinha, 2005), this is the first volume in English that encapsulates the exciting and cutting-edge corpus linguistic work being done with Portuguese language corpora on different continents. The book includes chapters by leading corpus linguists dealing with Portuguese corpora across the world, and their contributions explore various methods and how they are applicable to a wide range of language issues. The book is divided into six sections, each covering a key issue in Corpus Linguistics: lexis and grammar, lexicography, language teaching and terminology, translation, corpus building and sharing, and parsing and annotation. Together these sections present the reader with a broad picture of the field.


Linguistic Corpora and Big Data in Spanish and Portuguese

Author: Miguel Calderón Campos

Publisher: Walter de Gruyter GmbH & Co KG

Published: 2024-10-21

Total Pages: 238

ISBN-13: 3110781468

In recent decades, corpus linguistics has experienced tremendous development in the Hispanic world, along two opposite but complementary approaches: increase in corpus size (corpus linguistics as Big Data) and improvement in document selection and data annotation (corpus linguistics as High Quality Data). The first approach has led to the creation of massive corpora such as EsTenTen; at the same time, it has promoted the use of the web and social networks as corpora. The second approach has given rise to specialized corpora such as Post Scriptum or Oralia Diacrónica del español (ODE). The contributions gathered in this volume combine both methods in order to exploit their advantages and to overcome their possible limitations. On the one hand, the volume addresses the creation and design of small corpora focused on data quality; on the other hand, it offers case studies that make use of both specialized corpora and massive data extracted from the web. Highlighting the complementary nature of both methods is the main idea of this book.


Text, Speech, and Dialogue

Author: Petr Sojka

Publisher: Springer

Published: 2016-09-02

Total Pages: 565

ISBN-13: 3319455109

This book constitutes the refereed proceedings of the 19th International Conference on Text, Speech, and Dialogue, TSD 2016, held in Brno, Czech Republic, in September 2016. The 62 papers presented together with 3 abstracts of invited talks were carefully reviewed and selected from 127 submissions. They focus on topics such as corpora and language resources; speech recognition; tagging, classification and parsing of text and speech; speech and spoken language generation; semantic processing of text and speech; integrating applications of text and speech processing; automatic dialogue systems; as well as multimodal techniques and modelling.


Out of Corpora

Author: Hasselgård

Publisher: BRILL

Published: 2023-11-20

Total Pages: 377

ISBN-13: 9004653686

Main headings: Introduction. - I. Representing language use. - II. Grammar and lexis in English corpora. - III. Contrastive and translation studies. - IV. English abroad. - List of Stig Johansson's publications (selection).


Designing and Evaluating Language Corpora

Author: Jesse Egbert

Publisher: Cambridge University Press

Published: 2022-04-14

Total Pages: 299

ISBN-13: 1009254758

Corpora are ubiquitous in linguistic research, yet to date, there has been no consensus on how to conceptualize corpus representativeness and collect corpus samples. This pioneering book bridges this gap by introducing a conceptual and methodological framework for corpus design and representativeness. Written by experts in the field, it shows how corpora can be designed and built in a way that is both optimally suited to specific research agendas, and adequately representative of the types of language use in question. It considers questions such as 'what types of texts should be included in the corpus?', and 'how many texts are required?' – highlighting that the degree of representativeness rests on the dual pillars of domain considerations and distribution considerations. The authors introduce, explain, and illustrate all aspects of this corpus representativeness framework in a step-by-step fashion, using examples and activities to help readers develop practical skills in corpus design and evaluation.


Computational Processing of the Portuguese Language

Author: Aline Villavicencio

Publisher: Springer

Published: 2018-09-14

Total Pages: 507

ISBN-13: 331999722X

This book constitutes the refereed proceedings of the 13th International Conference on Computational Processing of the Portuguese Language, PROPOR 2018, held in Canela, RS, Brazil, in September 2018. The 42 full papers, 3 short papers and 4 other papers presented in this volume were carefully reviewed and selected from 92 submissions. The papers are organized in topical sections named: Corpus Linguistics, Information Extraction, Language Applications, Language Resources, Sentiment Analysis and Opinion Mining, Speech Processing, and Syntax and Parsing.


Computational Processing of the Portuguese Language

Author: Helena Caseli

Publisher: Springer

Published: 2012-03-09

Total Pages: 443

ISBN-13: 3642288855

This book constitutes the thoroughly refereed proceedings of the 8th International Workshop on Computational Processing of the Portuguese Language, PROPOR 2012, held in Coimbra, Portugal, in April 2012. The 24 revised full papers and 23 revised short papers presented were carefully reviewed and selected from 86 submissions. The papers cover phonology, morphology and POS tagging, acquisition, language resources, linguistic description, syntax and parsing, semantics, opinion analysis, natural language processing applications, speech production and phonetics, speech resources, and speech processing and applications.


Proceedings of the VIIth GSCP International Conference. Speech and Corpora

Author: Massimo Pettorino

Publisher: Firenze University Press

Published: 2012

Total Pages: 488

ISBN-13: 8866553514

The 7th International Conference of the Gruppo di Studi sulla Comunicazione Parlata, dedicated to the memory of Claire Blanche-Benveniste, chose Speech and Corpora as its main theme. The 235 authors came from 21 countries and 95 institutions, and their papers cover many different languages. The 89 papers of this volume reflect the themes of the conference: spoken corpora compilation and annotation, together with the related technological fields; the relation between prosody and pragmatics; speech pathologies; and various papers on phonetics, speech and linguistic analysis, pragmatics and sociolinguistics. Many papers are also dedicated to speech and second language studies. The online publication with FUP allows direct access to sound and video linked to papers (when downloaded).


Translation-based corpus studies

Author: Diana Santos

Publisher: BRILL

Published: 2016-08-09

Total Pages: 185

ISBN-13: 900433372X

This book presents a model for describing translation performance as a basis for contrastive linguistics, in the realm of tense and aspect. It is based on extensive corpus studies investigating the differences between English and Portuguese using authentic translations in the two directions. In method and substance, the book features several original claims, trying to achieve a balance between theoretical issues and the presentation of concrete translation data. In addition, it deals with computational applications of parallel corpora. Translation-based corpus studies should thus be appropriate for translator education, and for introducing contrastive semantics and the methodology of corpus linguistics to students of linguistics and computer science. Researchers in tense and aspect, translation, and corpus linguistics are, nevertheless, the book’s primary audience.


Computational Processing of the Portuguese Language

Author: Vládia Pinheiro

Publisher: Springer Nature

Published: 2022-03-17

Total Pages: 447

ISBN-13: 3030983056

This book constitutes the proceedings of the 15th International Conference on Computational Processing of the Portuguese Language, PROPOR 2022, held in Fortaleza, Brazil, in March 2022. The 36 full papers presented together with 4 short papers were carefully reviewed and selected from 88 submissions. They are grouped in topical sections on speech processing; resources and evaluation; natural language processing applications; semantics; natural language processing tasks; and multilinguality.