We investigate the usefulness of part-of-speech (POS) annotation as a tool in the study of sociolinguistic variation and genre evolution. We analyse how POS ratios change over time in the Parsed Corpus of Early English Correspondence (c.1410–1681), which social groups lead the changes, and whether the changes can be connected to colloquialisation with regard to reduced complexity or an increasingly involved style. While we find gentry-led colloquialisation in terms of noun and verb frequencies as well as evidence for gendered styles, the results on structural complexity are more mixed. We argue that POS annotation can be a useful tool when complemented by a thorough textual analysis, but that more fine-grained categories are needed to reach firmer conclusions.