Details

A Corpus-Based Analysis of Using Function Words in English Forensic Authorship Attribution

A Case of Political Journalism Disputes
1st edition

by: Khalid Shakir Hussein, Eman Abdul Kareem
CHF 36.00
Publisher:	Grin Verlag
Format:	PDF
Published:	14.02.2018
ISBN/EAN:	9783668637474
Language:	English
Number of pages:	124


This eBook is supplied without copy protection.

Descriptions

Title description

Case study from the year 2017 in the subject English Language and Literature Studies - Linguistics, language: English, abstract: Advances in computational linguistics and statistics have had a clear impact on the emergence of corpus linguistics and on the sophistication of its applications, which now address not only purely linguistic issues but also real-life problems. One such area is authorship attribution.

Authorship attribution is a field of study concerned with identifying the most likely author of an anonymous or disputed document from a set of suspected authors. To this end, several methodologies, techniques, and approaches have been devised and repeatedly assessed on various data sets to verify their effectiveness. Although the literature shows no consensus on which methodology is best, all authorship attribution studies rest on the assumption that each author has a particular "linguistic fingerprint", which can be captured by detecting and measuring the linguistic clues hidden in their authorial style.

Adopting an experimental framework, this study attempts to gauge the discriminating and clustering power of the selected methodology on a particular type of data: samples of political journal articles. The compiled corpus is a special-purpose one, strictly controlled for genre, register, and date of publication. It comprises eleven samples extracted from eleven articles, ranging in length from 1,101 to 1,113 words; three are taken as test (hypothetically questioned) samples and the remaining eight as training samples. The corpus represents the journalistic writing of four authors.
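
To make the training/test setup concrete, here is a minimal sketch of one common way to use function words for attribution. It is not the study's own procedure, which this description does not specify. The function-word list, the cosine distance, the per-author averaging, and the names `profile`, `cosine_distance` and `attribute` are all assumptions made for illustration.

```python
from collections import Counter
import math
import re

# Hypothetical, deliberately short function-word list (an assumption;
# the list used in the study is not given in this description).
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "but", "with", "on"]


def profile(text):
    """Return the relative frequency of each function word in a text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[w] / total for w in FUNCTION_WORDS]


def cosine_distance(u, v):
    """1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def attribute(test_text, training):
    """Return the author whose averaged training profile is closest.

    `training` maps an author name to a list of that author's sample texts.
    """
    test_vec = profile(test_text)
    best_author, best_dist = None, float("inf")
    for author, texts in training.items():
        vecs = [profile(t) for t in texts]
        # Average the author's sample profiles component by component.
        centroid = [sum(col) / len(vecs) for col in zip(*vecs)]
        dist = cosine_distance(test_vec, centroid)
        if dist < best_dist:
            best_author, best_dist = author, dist
    return best_author
```

In a setup like the one described, `training` would hold the eight training samples grouped under the four authors, and `attribute` would be called once for each of the three questioned samples.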