Corpora are used widely in linguistics, but not always wisely. This book attempts to frame corpus linguistics systematically as a variant of the observational method. The first part introduces the reader to the general methodological discussions surrounding corpus data as well as the practice of doing corpus linguistics, including issues such as the scientific research cycle, research design, extraction of corpus data and statistical evaluation. The second part consists of a number of case studies from the main areas of corpus linguistics (lexical associations, morphology, grammar, text and metaphor), surveying the range of issues studied in corpus linguistics while at the same time showing how they fit into the methodology outlined in the first part.
Corpus linguistics is the study of language data on a large scale: the computer-aided analysis of very extensive collections of transcribed utterances or written texts. This textbook outlines the basic methods of corpus linguistics, explains how the discipline of corpus linguistics developed and surveys the major approaches to the use of corpus data. It uses a broad range of examples to show how corpus data has led to methodological and theoretical innovation in linguistics in general. Clear and detailed explanations lay out the key issues of method and theory in contemporary corpus linguistics. A structured and coherent narrative links the historical development of the field to current topics in 'mainstream' linguistics. Practical tasks and questions for discussion at the end of each chapter encourage students to test their understanding of what they have read, and an extensive glossary provides easy access to definitions of technical terms used in the text.
This theoretical and practical guide to using corpus linguistic techniques in stylistic analysis focuses on how to use off-the-shelf corpus software, such as AntConc, Wmatrix, and the Brigham Young University (BYU) corpus interface.
This alphabetic guide provides definitions and discussion of key terms used in corpus linguistics. Corpus data is being used in a growing number of English and Linguistics departments which have no record of past research with corpus data. This is the first comprehensive glossary of the many specialist terms in corpus linguistics and will be useful for corpus linguists and non-corpus linguists alike. Clearly written by a team of experienced academics in the field, the glossary provides full coverage of both traditional and contemporary terminology.
Corpus Linguistics and The Study of Literature provides a theoretical introduction to corpus stylistics and also demonstrates its application by presenting corpus stylistic analyses of literary texts and corpora. The first part of the book addresses theoretical issues such as the relationship between subjectivity and objectivity in corpus linguistic analyses and criteria for the evaluation of results from such analyses, and also discusses units of meaning in language. The second part of the book takes this theory and applies it to Northanger Abbey by Jane Austen and to two corpora: one consisting of Austen's six novels, the other of texts that are contemporary with Austen. The analyses demonstrate the impact of various features of text on literary meanings and how corpus tools can extract new critical angles. This book will be a key read for upper level undergraduates and postgraduates working in corpus linguistics and in stylistics on linguistics and language studies courses. The editorial board includes: Paul Baker (Lancaster), Frantisek Cermak (Prague), Susan Conrad (Portland), Geoffrey Leech (Lancaster), Dominique Maingueneau (Paris XII), Christian Mair (Freiburg), Alan Partington (Bologna), Elena Tognini-Bonelli (Siena and TWC), Ruth Wodak (Lancaster), and Feng Zhiwei (Beijing). The Corpus and Discourse series consists of two strands. The first, Research in Corpus and Discourse, features innovative contributions to various aspects of corpus linguistics and a wide range of applications, from language technology via the teaching of a second language to a history of mentalities. The second strand, Studies in Corpus and Discourse, comprises key texts bridging the gap between social studies and linguistics. Although equally academically rigorous, this strand will be aimed at a wider audience of academics and postgraduate students working in both disciplines.
This textbook examines empirical linguistics from a theoretical linguist’s perspective. It provides both a theoretical discussion of what quantitative corpus linguistics entails and detailed, hands-on, step-by-step instructions to implement the techniques in the field. The statistical methodology and R-based coding from this book teach readers the basic and then more advanced skills to work with large data sets in their linguistics research and studies. Massive data sets are now more than ever the basis for work that ranges from usage-based linguistics to the far reaches of applied linguistics. This book presents much of the methodology in a corpus-based approach. However, the corpus-based methods in this book are also essential components of recent developments in sociolinguistics, historical linguistics, computational linguistics, and psycholinguistics. Material from the book will also be appealing to researchers in digital humanities and the many non-linguistic fields that use textual data analysis and text-based sensorimetrics. Chapters cover topics including corpus processing, frequency data, and clustering methods. Case studies illustrate each chapter with accompanying data sets, R code, and exercises for use by readers. This book may be used in advanced undergraduate courses, graduate courses, and self-study.
This book provides an up-to-date survey of current issues and approaches in corpus linguistics in the form of twenty-two recent research articles. The articles cover a wide range of topics illustrating the diversity of research that is characteristic of corpus linguistics today. Central themes are the relationship between theory, intuition and corpus data and the role of corpora in linguistic research. The majority of the articles are empirical studies of specific aspects of English, ranging from lexis and grammar to discourse and pragmatics. Other areas explored are language variation, language change and development, language learning, cross-linguistic comparisons of English and other languages, and the development of linguistic software tools. The contributors to the volume include some of the leading figures in the field such as M.A.K. Halliday, John Sinclair, Geoffrey Leech and Michael Hoey. The theoretical and methodological issues addressed in the volume demonstrate clearly the steady advance of an expanding discipline inspired by an empirical, usage-based approach to the study of language. The volume is essential reading for researchers and students interested in the use of computer corpora in linguistic research.
English Corpus Linguistics is a step-by-step guide to creating and analyzing linguistic corpora. It begins with a discussion of the role that corpus linguistics plays in linguistic theory, demonstrating that corpora have proven to be very useful resources for linguists who believe that their theories and descriptions of English should be based on real rather than contrived data. Charles F. Meyer goes on to describe how to plan the creation of a corpus, how to collect and computerize data for inclusion in a corpus, how to annotate the data that are collected, and how to conduct a corpus analysis of a completed corpus. The book concludes with an overview of the challenges that corpus linguists face to make both the creation and analysis of corpora much easier undertakings than they currently are. Clearly organized and accessibly written, this book will appeal to students of linguistics and English language.