Authorship Attribution

Author: Patrick Juola

Publisher: Now Publishers Inc

Published: 2008

Total Pages: 116

ISBN-10: 160198118X

Authorship Attribution surveys the history and present state of the discipline, presenting comparative results where available. It also provides a theoretical and empirically tested basis for further work. Many modern techniques are described and evaluated, along with practical insights for novices and experts alike.


Versification and Authorship Attribution

Author: Petr Plecháč

Publisher: Charles University in Prague, Karolinum Press

Published: 2021-07-01

Total Pages: 96

ISBN-10: 8024648717

The technique known as contemporary stylometry uses different methods, including machine learning, to discover a poem’s author based on features like the frequencies of words and character n-grams. However, there is one potential textual fingerprint stylometry tends to ignore: versification, or the very making of language into verse. Using poetic texts in three different languages (Czech, German, and Spanish), Petr Plecháč asks whether versification features like rhythm patterns and types of rhyme can help determine authorship. He then tests his findings on two unsolved literary mysteries. In the first, Plecháč distinguishes the parts of the Elizabethan verse play The Two Noble Kinsmen written by William Shakespeare from those written by his coauthor, John Fletcher. In the second, he seeks to solve a case of suspected forgery: how authentic was a group of poems first published as the work of the nineteenth-century Russian author Gavriil Stepanovich Batenkov? This book of poetic investigation should appeal to literary sleuths the world over.


Machine Learning Methods for Stylometry

Author: Jacques Savoy

Publisher: Springer Nature

Published: 2020-09-28

Total Pages: 286

ISBN-10: 3030533603

This book presents methods and approaches used to identify the true author of a doubtful document or text excerpt. It provides a broad introduction to text categorization problems grounded in stylistic features (such as authorship attribution, profiling the psychological traits of an author, and detecting fake news). Specifically, machine learning models are presented in detail as valuable tools for verifying hypotheses or revealing significant patterns hidden in datasets. Stylometry is a multi-disciplinary field combining linguistics with both statistics and computer science. The content is divided into three parts. The first, which consists of the first three chapters, offers a general introduction to stylometry, its potential applications, and its limitations. It also introduces the running example used to illustrate the concepts discussed throughout the remainder of the book. The four chapters of the second part are devoted more to computer science, with a focus on machine learning models. Their main aim is to explain machine learning models for solving stylometric problems. Several general strategies used to identify, extract, select, and represent stylistic markers are explained. As deep learning represents an active field of research, information on neural network models and word embeddings applied to stylometry is provided, as well as a general introduction to the deep learning approach to solving stylometric questions. The third part illustrates the application of the previously discussed approaches in real cases: an authorship attribution problem, seeking to discover the secret hand behind the nom de plume Elena Ferrante, an Italian writer known worldwide for her My Brilliant Friend saga; author profiling, in order to identify whether a set of tweets was generated by a bot or a human being and, in the latter case, whether the author is a man or a woman; and an exploration of stylistic variations over time using US political speeches covering a period of ca. 230 years. A solutions-based approach is adopted throughout the book, and explanations are supported by examples written in R. To complement the main content and discussions on stylometric models and techniques, examples and datasets are freely available on the author’s GitHub page.


Scalability Issues in Authorship Attribution

Author: Kim Luyckx

Publisher: ASP / VUBPRESS / UPA

Published: 2011-08

Total Pages: 197

ISBN-10: 9054878231

Provides an in-depth and systematic study of the so-called scalability issues in authorship attribution -- the task that aims to identify the author of a text, given a model of authorial style based on texts of known authorship. Computational authorship attribution does not rely on in-depth reading, but rather automates the process. This book investigates the behavior of a text categorization approach to the task when confronted with scalability issues. By addressing the issues of experimental design, data size, and author set size, the dissertation demonstrates whether the approach taken is valid in experiments with limited or sufficient data, and with small or large sets of authors.


Machine Learning for Authorship Attribution and Cyber Forensics

Author: Farkhund Iqbal

Publisher: Springer Nature

Published: 2020-12-04

Total Pages: 158

ISBN-10: 3030616754

The book first explores the cybersecurity landscape and the inherent susceptibility of online communication systems such as e-mail, chat conversations, and social media to cybercrime. Common sources and resources of digital crimes, their causes and effects, and the emerging threats to society are illustrated. The book not only explores the growing need for cybersecurity and digital forensics but also investigates relevant technologies and methods to meet that need. Knowledge discovery, machine learning, and data analytics are explored for collecting cyber-intelligence and forensic evidence on cybercrimes. Online communication documents, which are the main source of cybercrimes, are investigated from two perspectives: the crime and the criminal. AI and machine learning methods are applied to detect illegal and criminal activities such as bot distribution, drug trafficking, and child pornography. Authorship analysis is applied to identify potential suspects and their sociolinguistic characteristics. Deep learning, together with frequent pattern mining and link mining techniques, is applied to trace the potential collaborators of the identified criminals. Finally, the aim of the book is not only to investigate the crimes and identify the potential suspects but also to collect solid and precise forensic evidence to prosecute the suspects in a court of law.


Authorship attribution in Turkish Texts

Author: Hülya Kocagül Yüzer

Publisher: Artsürem

Published: 2022-12-31

Total Pages: 221

ISBN-10: 6057228502

The latest developments in computer technology have created new ways to share information without limits of time and space. Computer technologies have not only made life easier and more accessible for users, but they have also opened up a new arena for illegal activities. These activities have found an opportunity to spread via e-mails, websites, Internet chat rooms, forum pages, and social networking websites (like Facebook, Twitter, Instagram). Online contributors do not need to provide information such as their real names, the city where they live, age, or gender in order to share their opinions, and such feelings of anonymity encourage criminal activities. Thus, disputed authorship cases have become one of the main challenges of the technological era. This research is a corpus-based simulated authorship casework application in Turkish. Texts for the corpora were collected from a collaborative online encyclopaedia, Eksi Sozluk (Sour Times), and from Twitter. The corpus consists of 900 texts from 52 authors in total; of these, 105 texts belong to seven authors from Twitter. Two methodological approaches, qualitative and statistical, were applied, following Grant’s (2013) approach. Ten different tests were applied, depending on the various parameters that are forensically possible in real-world cases. Accordingly, the role of feature type and of size (including the number of candidate authors, text size, and a limited number of texts per author) and, finally, cross-genre application were tested. The analyses revealed that such a combined approach gave promising results in some tests, successfully attributing authorship in Turkish. The findings indicate that there is potential to attribute texts of unknown authorship in Turkish, with significant implications for the broader application of forensic authorship attribution techniques to Turkish texts.
Keywords: Authorship Attribution, Turkish, Forensic Linguistics, Authorship Analysis


The Ascension of Authorship

Author: Jed Wyrick

Publisher: Harvard University Department of Comparative Literature

Published: 2004

Total Pages: 536

ISBN-13:

Tracing the history of the idea of the author beginning with attribution practices of Second Temple and Rabbinic Judaism, Wyrick argues that the fusion of Jewish and Hellenistic approaches to attribution helped lead to Augustine's reinvention of the writer of scripture as an author whose texts were governed by both divine will and human intent.


Cognitive Approach to Natural Language Processing

Author: Bernadette Sharp

Publisher: Elsevier

Published: 2017-05-31

Total Pages: 236

ISBN-10: 008102343X

As natural language processing spans many different disciplines, it is sometimes difficult to understand the contributions and the challenges that each of them presents. This book explores the special relationship between natural language processing and cognitive science, and the contribution of computer science to these two fields. It is based on recent research papers submitted to the international workshops on Natural Language Processing and Cognitive Science (NLPCS), a series launched in 2004 in an effort to bring together natural language researchers, computer scientists, and cognitive and linguistic scientists to collaborate and advance research in natural language processing. The chapters cover areas related to language understanding, language generation, word association, word sense disambiguation, word predictability, text production, and authorship attribution. This book will be relevant to students and researchers interested in the interdisciplinary nature of language processing. It discusses the problems and issues that researchers face, providing an opportunity for developers of NLP systems to learn from cognitive scientists, cognitive linguistics, and neurolinguistics. It also provides a valuable opportunity to link the study of natural language processing to the understanding of the cognitive processes of the brain.


Automating Open Source Intelligence

Author: Robert Layton

Publisher: Syngress

Published: 2015-12-03

Total Pages: 224

ISBN-10: 012802917X

Algorithms for Automating Open Source Intelligence (OSINT) covers the gathering of information and the extraction of actionable intelligence from openly available sources, including news broadcasts, public repositories, and, more recently, social media. As OSINT has applications in crime fighting, state-based intelligence, and social research, this book presents recent advances in text mining, web crawling, and other algorithms that can largely automate this process. The book is beneficial to both practitioners and academic researchers, with discussions of the latest advances in applications, a coherent set of methods and processes for automating OSINT, and interdisciplinary perspectives on the key problems identified within each discipline. Drawing upon years of practical experience and using numerous examples, editors Robert Layton, Paul Watters, and a distinguished list of contributors discuss evidence accumulation strategies for OSINT, named entity resolution in social media, analyzing social media campaigns for group size estimation, surveys and qualitative techniques in OSINT, and geospatial reasoning of open data. The book presents a coherent set of methods and processes for automating OSINT. It focuses on algorithms and applications, allowing the practitioner to get up and running quickly. It includes fully developed case studies on the digital underground and predicting crime through OSINT. It discusses the ethical considerations when using publicly available online data.


Big Data Analytics

Author:

Publisher: Elsevier

Published: 2015-08-04

Total Pages: 391

ISBN-10: 0444634975

While the term Big Data is open to varying interpretation, it is quite clear that the Volume, Velocity, and Variety (3Vs) of data have impacted every aspect of computational science and its applications. The volume of data is increasing at a phenomenal rate, and a majority of it is unstructured. With big data, the volume is so large that processing it using traditional database and software techniques is difficult, if not impossible. The drivers are the ubiquitous sensors, devices, social networks, and the all-pervasive web. Scientists are increasingly looking to derive insights from the massive quantity of data to create new knowledge. In common usage, Big Data has come to refer simply to the use of predictive analytics or certain other advanced methods to extract value from data, with no particular data size required. Challenges include analysis, capture, curation, search, sharing, storage, transfer, visualization, and information privacy. While there are challenges, there are huge opportunities emerging in the fields of Machine Learning, Data Mining, Statistics, Human-Computer Interfaces, and Distributed Systems to address ways to analyze and reason with this data. The edited volume focuses on the challenges and opportunities posed by "Big Data" in a variety of domains and on how statistical techniques and innovative algorithms can help glean insights and accelerate discovery. Big data has the potential to help companies improve operations and make faster, more intelligent decisions. The volume offers a review of big data research challenges from diverse areas of scientific endeavor, a rich perspective on a range of data science issues from leading researchers, and insight into the mathematical and statistical theory underlying the computational methods used to address big data analytics problems in a variety of domains.