Cluster Analysis for Corpus Linguistics

Author: Hermann Moisl

Publisher: Walter de Gruyter GmbH & Co KG

Published: 2015-02-24

Total Pages: 319

ISBN-10: 3110393174

The standard scientific methodology in linguistics is empirical testing of falsifiable hypotheses. As such, the process of hypothesis generation is central, and involves formulation of a research question about a domain of interest and statement of a hypothesis relative to it. In corpus linguistics the domain is text, and generation involves abstraction of data from text, data analysis, and formulation of a hypothesis based on inference from the results. Traditionally this process has been paper-based, but the advent of electronic text has increasingly rendered it obsolete, both because the size of digital corpora is now at or beyond the limit of what can efficiently be used in the traditional way, and because the complexity of data abstracted from them can be impenetrable to understanding. Linguists are increasingly turning to mathematical and statistical computational methods for help, and cluster analysis is one such method. It is used across the sciences for hypothesis generation by identifying structure in data that are too large or complex, or both, to be interpretable by direct inspection. This book aims to show how cluster analysis can be used for hypothesis generation in corpus linguistics, thereby contributing to a quantitative empirical methodology for the discipline.


Corpus Linguistics and Statistics with R

Author: Guillaume Desagulier

Publisher: Springer

Published: 2017-11-17

Total Pages: 359

ISBN-10: 3319645722

This textbook examines empirical linguistics from a theoretical linguist’s perspective. It provides both a theoretical discussion of what quantitative corpus linguistics entails and detailed, hands-on, step-by-step instructions to implement the techniques in the field. The statistical methodology and R-based coding from this book teach readers the basic and then more advanced skills to work with large data sets in their linguistics research and studies. Massive data sets are now more than ever the basis for work that ranges from usage-based linguistics to the far reaches of applied linguistics. This book presents much of the methodology in a corpus-based approach. However, the corpus-based methods in this book are also essential components of recent developments in sociolinguistics, historical linguistics, computational linguistics, and psycholinguistics. Material from the book will also be appealing to researchers in digital humanities and the many non-linguistic fields that use textual data analysis and text-based sensorimetrics. Chapters cover topics including corpus processing, frequency data, and clustering methods. Case studies illustrate each chapter with accompanying data sets, R code, and exercises for use by readers. This book may be used in advanced undergraduate courses, graduate courses, and self-study.


Statistics in Corpus Linguistics

Author: Vaclav Brezina

Publisher: Cambridge University Press

Published: 2018-09-20

Total Pages: 317

ISBN-10: 1107125707

A comprehensive and accessible introduction to statistics in corpus linguistics, covering multiple techniques of quantitative language analysis and data visualisation.


Cluster Analysis for Corpus Linguistics

Author: Hermann Moisl

Publisher: Walter de Gruyter

Published: 2015-01-16

Total Pages: 381

ISBN-13: 9783110363821

The rapidly growing volume of digital natural language text and the complexity of data abstracted from it have increasingly rendered traditional corpus linguistic analytical methodology obsolete. This book describes a cluster analytic methodology for generating linguistic hypotheses on the basis of data abstracted from language corpora.


Corpus Linguistics and the Web

Author: Marianne Hundt

Publisher: Rodopi

Published: 2007

Total Pages: 313

ISBN-10: 9042021284

Using the Web as Corpus is one of the recent challenges for corpus linguistics. This volume presents a current state-of-the-art discussion of the topic. The articles address practical problems, such as suitable linguistic search tools for accessing the web and the question of register variation, and probe into methods for culling data from the web. The book also offers a wide range of case studies, covering morphology, syntax, and lexis, as well as synchronic and diachronic variation in English. These case studies make use of the two approaches to the web in corpus linguistics: web-as-corpus and web-for-corpus-building. The case studies demonstrate that web data can provide useful additional evidence for a broad range of research questions.


The Routledge Handbook of Corpus Linguistics

Author: Anne O'Keeffe

Publisher: Routledge

Published: 2022-02-08

Total Pages: 684

ISBN-10: 0429632649

The second edition of The Routledge Handbook of Corpus Linguistics provides an updated overview of a dynamic and rapidly growing area with a widely applied methodology. Over a decade on from the first edition of the Handbook, this collection of 47 chapters from experts in key areas offers a comprehensive introduction to both the development and use of corpora as well as their ever-evolving applications to other areas, such as digital humanities, sociolinguistics, stylistics, translation studies, materials design, language teaching and teacher development, media discourse, discourse analysis, forensic linguistics, second language acquisition and testing. The new edition updates all core chapters and includes new chapters on corpus linguistics and statistics, digital humanities, translation, phonetics and phonology, second language acquisition, social media and theoretical perspectives. Chapters provide annotated further reading lists and step-by-step guides as well as detailed overviews across a wide range of themes. The Handbook also includes a wealth of case studies that draw on some of the many new corpora and corpus tools that have emerged in the last decade. Organised across four themes, moving from basic start-up topics such as corpus building and design to analysis, application and reflection, this second edition remains a crucial point of reference for advanced undergraduates, postgraduates and scholars in applied linguistics.


Ancient Texts and Modern Readers

Author:

Publisher: BRILL

Published: 2019-06-07

Total Pages: 393

ISBN-10: 9004402918

The chapters of this volume address a variety of topics that pertain to modern readers’ understanding of ancient texts, as well as tools or resources that can facilitate contemporary audiences’ interpretation of these ancient writings and their language. In this regard, they cover subjects related to the fields of ancient Hebrew linguistics and Bible translation. The chapters apply linguistic insights and theories to elucidate elements of ancient texts for modern readers, investigate how ancient texts help modern readers to interpret features in other ancient texts, and suggest ways in which translations can make the language and conceptual worlds of ancient texts more accessible to modern readers. In so doing, they present the results of original research, identify new lines and topics of inquiry, and make novel contributions to modern readers’ understanding of ancient texts. Contributors are Alexander Andrason, Barry L. Bandstra, Reinier de Blois, Lénart J. de Regt, Gideon R. Kotzé, Geoffrey Khan, Christian S. Locatell, Kristopher Lyle, John A. Messarra, Cynthia L. Miller-Naudé, Jacobus A. Naudé, Daniel Rodriguez, Eep Talstra, Jeremy Thompson, Cornelius M. van den Heever, Herrie F. van Rooy, Gerrit J. van Steenbergen, Ernst Wendland, Tamar Zewi.


The Cambridge Handbook of English Corpus Linguistics

Author: Douglas Biber

Publisher: Cambridge University Press

Published: 2015-06-25

Total Pages: 757

ISBN-10: 1316298701

The Cambridge Handbook of English Corpus Linguistics (CHECL) surveys the breadth of corpus-based linguistic research on English, including chapters on collocations, phraseology, grammatical variation, historical change, and the description of registers and dialects. The most innovative aspects of the CHECL are its emphasis on critical discussion, its explicit evaluation of the state of the art in each sub-discipline, and the inclusion of empirical case studies. While each chapter includes a broad survey of previous research, the primary focus is on a detailed description of the most important corpus-based studies in this area, with discussion of what those studies found, and why they are important. Each chapter also includes a critical discussion of the corpus-based methods employed for research in this area, as well as an explicit summary of new findings and discoveries.


An A–Z of Applied Linguistics Research Methods

Author: Shawn Loewen

Publisher: Bloomsbury Publishing

Published: 2017-09-16

Total Pages: 224

ISBN-10: 1137403225

Featuring an extensive set of entries covering all aspects of research methodology, ranging from basic to more advanced topics, this is an essential reference for applied linguists everywhere. Explanations of key concepts and techniques are fully cross-referenced and presented in bite-sized chunks, making it easy for users to look up specific terms quickly or have a brief refresher on methodological practices and related issues. Concepts are further illustrated by real-life examples drawn from current linguistics research. This is ideal for undergraduate and postgraduate students studying applied linguistics or TESOL modules.


Statistics for Corpus Linguistics

Author: Michael Oakes

Publisher: Edinburgh University Press

Published: 2019-08-06

Total Pages: 304

ISBN-10: 1474471382

This book in the Edinburgh Textbooks in Empirical Linguistics series is a comprehensive introduction to the statistics currently used in corpus linguistics. Statistical techniques and corpus applications - whether oriented towards linguistics or language engineering - often go hand in glove, and corpus linguists have used an increasingly wide variety of statistics, drawing on techniques developed in a great many fields. This is the first one-volume introduction to the subject.