
Moran, Steven, Research Unit: Quantitative Language Comparison, LMU, Munich, Germany
Prokic, Jelena, Research Unit: Quantitative Language Comparison, LMU, Munich, Germany

Dogon languages are spoken predominantly in eastern Mali in West Africa. The Dogon were made famous by Marcel Griaule, a French anthropologist who pioneered ethnography in France and worked with the Dogon between 1931 and 1956. He reported that the Dogon had advanced astronomical knowledge of the Sirius binary star system, knowledge that cannot be obtained without a telescope. Since then, the Dogon have been shrouded in controversy and mystery.

As late as 1989, Dogon appeared in reference books as if it were a single language, e.g. Bendor-Samuel 1989 (as found in Blench 2005). The standard encyclopedic reference on the world’s languages, the Ethnologue,1 now lists 14 Dogon languages (Lewis 2009), but this figure is too low. An extensive sociolinguistic survey by Hochstetler et al. (2004) estimated no fewer than 17 distinct languages and described the language family as highly internally divided. Since 2004, much initial survey work on Dogon has been undertaken by Roger Blench, Denis Douyon and Jeffrey Heath’s Dogon languages project,2 which has led to the ‘discovery’ of a web of divergent dialects, some of which have been raised to the status of distinct languages based on standard linguistic criteria. Thus, the current Dogon linguistic situation is far from transparent. Dogon languages are severely under-described, many are highly endangered, and the genealogical position of all of them is poorly established (Blench 2005; Heath 2008).

The Dogon languages project provides a tentative but detailed inventory of known Dogon languages. It currently lists 20 distinct languages grouped (crudely) into eight geographical regions, with no implications for genealogical subgrouping. The internal structure of the Dogon language family is unknown, as is the number of mutually unintelligible languages it contains. In fact, the Ethnologue gives a flat genealogical tree. The position of the Dogon languages relative to other African language families is also unclear because of Dogon’s unique typological characteristics. Its lineage has long been disputed, as summarized in Table 1 (see note 3).

Table 1: Historical classification of Dogon
Year     Classification (language family)  Author
1924     Nigéro-Sénégalais                 Delafosse
1941     Voltaic (Eng. Gur)                Homburger
1948     Voltaic; Gurunsi                  Baumann & Westermann
1951     Mandé                             Holas
1952     Mandé                             Delafosse
1952     Gur (Fr. Voltaic)                 Westermann & Bryan
1953     Voltaic                           Bertho
1953     Non-classified                    de Tressan
1950/60  Gur                               Calame-Griaule
1963     Gur                               Greenberg
1971     Gur                               Bendor-Samuel
1981     Voltaic                           Manessy
1981     Volta-Congo                       Bendor-Samuel
1993     Unresolved; non-classified        Galtier
1994     Unresolved; non-classified        Plungian and Tembiné
2000     Ijo-Congo                         Williamson and Blench
2009     Volta-Congo                       Lewis

In this paper, we combine computational methods that have been applied successfully in bioinformatics to decode DNA and determine the genetic relatedness of humans, and we apply them to language data in an attempt to shed light on the prehistory of the Dogon languages and determine their genealogical subgroupings. The comparative method of historical linguistics (the study of language change to reconstruct the genealogical relatedness of languages) is laborious and time-consuming: it involves identifying cognates (words that share a common etymological origin) through their shared meanings and regular sound correspondences (e.g. English ‘is’, German ‘ist’, Latin ‘est’, Proto-Indo-European ‘*esti’).

We show how recent advances in quantitative methods for language comparison can be applied to digital data of (endangered) languages to automate the discovery of regular sound correspondences and cognate forms. Statistical techniques applied to these results then allow for new insights into the origin and evolution of human languages. We use distance-based methods such as Levenshtein distance (Levenshtein 1965) and character-based methods such as Bayesian Markov Chain Monte Carlo inference (Page & Holmes 2006) to induce language family trees from lexical data. We present the first subgrouping hypothesis for the Dogon language family and discuss the limits of current quantitative approaches, where the state of the art in computational historical linguistics is heading, and what we can hope to glean from its application to linguistic diversity.
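To make the distance-based approach concrete, the following is a minimal sketch in Python, not the procedure actually used in this study. It computes Levenshtein distance between word forms, averages the length-normalised distances over shared concepts to obtain a distance between two languages, and joins languages into a tree by average-linkage (UPGMA) clustering. The choice of UPGMA, the length normalisation, the function names, and the word lists for "LangA", "LangB" and "LangC" are all assumptions made for illustration; the word forms are invented and are not Dogon data. The character-based Bayesian approach is not shown.

```python
from itertools import combinations


def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete char_a
                               current[j - 1] + 1,       # insert char_b
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def language_distance(words_a, words_b):
    """Mean length-normalised edit distance over concepts found in both lists."""
    shared = [concept for concept in words_a if concept in words_b]
    if not shared:
        raise ValueError("the two word lists share no concepts")
    total = 0.0
    for concept in shared:
        x, y = words_a[concept], words_b[concept]
        total += levenshtein(x, y) / max(len(x), len(y))
    return total / len(shared)


def upgma(wordlists):
    """Average-linkage clustering; returns the tree as nested tuples of names."""
    langs = sorted(wordlists)
    leaf = {frozenset(pair): language_distance(wordlists[pair[0]], wordlists[pair[1]])
            for pair in combinations(langs, 2)}
    # Each cluster is (subtree, list of member languages).
    clusters = [(lang, [lang]) for lang in langs]

    def average(c1, c2):
        pairs = [(x, y) for x in c1[1] for y in c2[1]]
        return sum(leaf[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: average(clusters[p[0]], clusters[p[1]]))
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


# Invented toy word lists (hypothetical languages, not real data).
toy = {
    "LangA": {"water": "nami", "fire": "tolo", "stone": "kiru"},
    "LangB": {"water": "nami", "fire": "tolu", "stone": "kiru"},
    "LangC": {"water": "sepa", "fire": "gado", "stone": "kimu"},
}
tree = upgma(toy)
print(tree)
```

In this toy example LangA and LangB differ in a single segment and are therefore joined first, with LangC attached afterwards. Raw edit distance treats every segment substitution as equally costly and cannot distinguish inherited similarity from borrowing, which is one of the limits of distance-based methods that a character-based analysis of cognate sets is intended to address.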


Bendor-Samuel, J., E. J. Olsen, and A. R. White (1989). Dogon. In The Niger-Congo languages: a classification and description of Africa’s largest language family. Lanham: UP of America, pp. 169-177.

Blench, R. (2005). A survey of Dogon languages in Mali: overview. Ogmios 26: 14-15.

Heath, J. (2008). A grammar of Jamsay. Berlin: Mouton de Gruyter.

Hochstetler, J. L., J. Durieux, and E. Durieux-Boon (2004). Sociolinguistic survey of the Dogon language area. SIL International.

Levenshtein, V. (1965). Binary codes capable of correcting deletions, insertions and reversals. Doklady Akademii Nauk SSSR 163: 845-848.

Lewis, M. P. (2009). Ethnologue: languages of the world, Sixteenth edition. SIL International.

Page, R. D. M., and E. C. Holmes (2006). Molecular evolution: a phylogenetic approach. Malden: Blackwell.




3. See Hochstetler et al. (2004) and Hantgan’s Dogon bibliography for references at: