Variationist sociolinguistics and corpus-based variationist linguistics: overlap and cross-pollination potential

Published online by Cambridge University Press: 20 June 2017

Benedikt Szmrecsanyi*
Affiliation:
KU Leuven

Abstract

The paper surveys overlap between corpus linguistics and variationist sociolinguistics. Corpus linguistics is customarily defined as a methodology that bases claims about language on usage patterns in collections of naturalistic, authentic speech or text. Because this is what is typically done in variationist sociolinguistics work, I argue that variationist sociolinguists are by definition corpus linguists, though of course the reverse is not true: the variationist method entails more than merely analyzing usage data, and not all corpus analysts are interested in variation. But that being said, a considerable and arguably increasing number of corpus linguists not formally trained in variationist sociolinguistics are explicitly concerned with variation and engage in what I call corpus-based variationist linguistics (CVL). I first discuss what unites or divides work in CVL and in variationist sociolinguistics. In a plea to cross subdisciplinary boundaries, I subsequently identify three research areas where variationist sociolinguists may draw inspiration from work in CVL: conducting multi-variable research, paying more attention to probabilistic grammars, and taking more seriously the register-sensitivity of variation patterns.

Résumé

Cet article explore le chevauchement entre la linguistique de corpus et la sociolinguistique variationniste. La linguistique de corpus est typiquement définie comme une méthodologie qui fonde ses affirmations linguistiques sur les régularités de l'usage émergeant des collectes de données orales ou textuelles naturalistes et authentiques. Puisque c'est ce qui se fait généralement en sociolinguistique variationniste, je soutiens que les sociolinguistes variationnistes sont par définition des linguistes de corpus, bien que l'inverse ne soit pas vrai: la méthode variationniste implique davantage que le seul fait d'analyser les données de l'usage, et tous les analystes de corpus ne s'intéressent pas à la variation. Ceci étant dit, un nombre grandissant de linguistes de corpus n'ayant pas été formellement formés en sociolinguistique variationniste s'intéressent explicitement à la variation et s'investissent dans ce que j'appelle la linguistique variationniste basée sur les corpus (en anglais, corpus-based variationist linguistics ou CVL). Je discute d'abord ce qui unit et ce qui divise la linguistique variationniste basée sur les corpus et la sociolinguistique variationniste. Dans un appel visant à franchir les frontières des sous-disciplines, j'identifie ensuite trois domaines de recherche où les sociolinguistes variationnistes peuvent s'inspirer des travaux en linguistique variationniste basée sur les corpus: effectuer des recherches multi-variables, accorder plus d'attention aux grammaires probabilistes et prendre plus au sérieux la sensibilité au registre des modèles de variation.

Type
Articles
Copyright
© Canadian Linguistic Association/Association canadienne de linguistique 2017 

Footnotes

I am grateful to Jeroen Claes, Jason Grafmiller, Lars Hinrichs, Laurel MacKenzie, and two anonymous referees for helpful feedback on earlier versions of this paper. The usual disclaimers apply.
