Alexandre François (LATTICE – CNRS, ENS-PSL, USN, Paris)

Dialexification: an empirical tool for observing trends in semantic shift

Far from being random, semantic change in the lexicon shows recurring trends across language families (Juvonen & Koptjevskaja-Tamm 2016). For instance, the shift seize → understand is attested in Greek (AGk καταλαμβάνω ‘seize’ → MGk καταλαβαίνω ‘understand’), in Romance (Lat. capere ‘seize, hold’ → It. capire ‘understand’), in Germanic (P-Gmc *fatōną → Gmn fassen ‘seize’, Faroese fata ‘understand’), in Turkic (P-Tkc *tut- → Uyghur tutmaq ‘hold, grab’, Azeri tutmaq ‘hold; understand’), in Oceanic (POc *alap → Lo-Toga ole ‘take’, Mwotlap lep ‘take, get; understand’), and so on. To detect and compare these recurring shifts, we need to gather in one place the abundant etymological knowledge that has been accumulated over decades for various language phyla. The EvoSem project aims to create such a cross-linguistic database, by bringing together etymological resources for a broad array of families and synthesizing them in the form of semantic graphs and tables (Dehouck et al. 2023).

Because EvoSem aims to be fully empirical, it needs to address one issue: the glosses of reconstructed etyma are inherently speculative. Our solution is to set protoglosses aside, and to take as our unit of observation the meanings of attested forms within a cognate set, i.e., all words descended from the same etymon. The new concept of dialexification captures the relation of “semantic cognacy”, that is, the link between the meanings of cognate forms. Thus, seize and understand are dialexified in Greek, Romance, Germanic, Turkic, etc. While the relevant locus for colexification (François 2008) was a word in one language in synchrony, the locus of dialexification is the cognate set, in a group of languages. This approach indirectly captures the effects of diachrony, while remaining fully empirical. This talk will discuss the concept of dialexification, and present its dedicated database EvoSem [https://tiny.cc/EvoSem], a resource that already contains 31,000 concepts and 18,000 etyma from 115 proto-languages.
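The contrast between the two notions can be illustrated with a minimal Python sketch. It is not EvoSem's actual schema or code: the data layout and the function names (`colexified_pairs`, `dialexified_pairs`, `count_dialexifications`) are hypothetical, the glosses are simplified from the examples above, and the sketch assumes that any two distinct meanings attested anywhere in a cognate set count as dialexified (including pairs carried by a single word).

```python
from itertools import combinations
from collections import Counter

# Toy cognate sets built from the examples in this abstract.
# The key is only a label: the gloss of the etymon itself is never used.
COGNATE_SETS = {
    "P-Gmc *fatōną": [
        ("German", "fassen", {"seize"}),
        ("Faroese", "fata", {"understand"}),
    ],
    "P-Tkc *tut-": [
        ("Uyghur", "tutmaq", {"hold", "grab"}),
        ("Azeri", "tutmaq", {"hold", "understand"}),
    ],
    "POc *alap": [
        ("Lo-Toga", "ole", {"take"}),
        ("Mwotlap", "lep", {"take", "get", "understand"}),
    ],
}


def colexified_pairs(reflexes):
    """Meaning pairs expressed by one and the same word (locus: the word)."""
    pairs = set()
    for _language, _form, meanings in reflexes:
        pairs.update(frozenset(p) for p in combinations(sorted(meanings), 2))
    return pairs


def dialexified_pairs(reflexes):
    """Meaning pairs attested anywhere in the cognate set (locus: the set)."""
    all_meanings = set().union(*(meanings for _, _, meanings in reflexes))
    return {frozenset(p) for p in combinations(sorted(all_meanings), 2)}


def count_dialexifications(cognate_sets):
    """Count, for each meaning pair, how many cognate sets dialexify it."""
    counts = Counter()
    for reflexes in cognate_sets.values():
        counts.update(dialexified_pairs(reflexes))
    return counts
```

In this toy data, seize and understand are dialexified in the Germanic set although no single word colexifies them; pairs that recur across many cognate sets in a real database would be the candidates for recurring semantic shifts.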

 

References

Dehouck, Mathieu, Alexandre François, Siva Kalyan, Martial Pastor & David Kletz. 2023. EvoSem, a database of polysemous cognate sets. In Nina Tahmasebi et al. (eds), Proceedings of the 4th Workshop on Computational Approaches to Historical Language Change, 66–75. Singapore: Association for Computational Linguistics.

François, Alexandre. 2008. Semantic maps and the typology of colexification: Intertwining polysemous networks across languages. In M. Vanhove (ed.), From Polysemy to Semantic Change: Towards a typology of lexical semantic associations (Studies in Language Companion Series 106), 163–215. Amsterdam: Benjamins.

Juvonen, Päivi & Maria Koptjevskaja-Tamm (eds). 2016. The lexical typology of semantic shifts (Cognitive Linguistics Research 58). Berlin: Walter de Gruyter.