Rooting Language Family Trees


Figure 1: “The genealogical tree of world languages,” compiled by Alexander Militarev based on research by the Moscow School of Comparative Linguistics, including the author’s own research on the Afroasiatic/Afrasian/Semito-Hamitic macrofamily. Published as appendix 3 in Alexander Militarev, The Jewish Conundrum in World History. Academic Studies Press, Reference Library of Jewish Intellectual History, Boston 2010.


Historian of science Judith Kaplan studies the encounters of linguists, anthropologists and biologists in their search for human origins.

World Tree

“Linguists dig deeper into origins of language”—so ran the title of a 1987 feature in the science section of the New York Times. Just as “paleontologists ponder their fossils” and “archeologists turn over ancient stones,” its author noted, linguists had recently joined in the quest for human origins “seeking ultimately to reconstruct the primordial language, the mother tongue of all humans.”

The article invoked the universal appeal of human prehistory, even though its claims were not entirely true. Practitioners did indeed “dig deeper” into the evolutionary history of language during the second half of the twentieth century, proposing “long-range” genealogies that reached expansively across space and through time. But very few of them were willing to stake a claim when it came to the ultimate Ursprache, thought to have originated some 50,000 years ago. How human language originated (and whether this happened once or several times), how early language was structured and used, and how it diffused and diversified across the globe: these questions loomed large over the work of long-range comparative linguistics in the late 1980s and early 1990s. Through newspaper reports, public television programming, and interdisciplinary appropriations, they were formulated at the intersection of specialist and general concerns. Publicity was actively pursued by committed “long-rangers,” who found themselves marginalized in an academic world dominated by more circumspect goals.


Figure 2: Map from Gray & Atkinson, “Language-tree divergence times support the Anatolian theory of Indo-European origin,” Nature 426, p. 437. This was one of the first widely publicized attempts to apply computational methods from evolutionary biology to historical linguistics.

Building on my doctoral research into the flowering of comparative-historical linguistics between 1874 and 1918, I analyze these developments in my project, Big Data and the Reconstruction of Linguistic Prehistory. The project pays careful attention to the political, cultural, philosophical, and methodological stakes involved in deep linguistic reconstruction. With respect to politics and culture, for instance, I explore potential correlations between references to linguistic monogenesis and the ‘un-freezing’ of the Cold War. Philosophically and methodologically, I show that the controversies surrounding proposed macro-families (for example “Nostratic,” “Amerind,” and even “proto-World”) have prompted historical linguists to reassess the foundations of their scientific practice (Figure 1). What constitutes the “cutting edge” in (pre)historical linguistics? Does progress mean tackling bigger mysteries or specifying existing models with greater precision? Are there limits to what linguists can know scientifically? Can the Comparative Method yield trustworthy results at any time depth, or does the swift rate of language change at some point render it useless? Long-range interventions have further sparked debate on the nature of data and the relative merits of quantitative versus qualitative methods. These debates are beginning to win support for the incorporation of interdisciplinary, data-intensive practices, as reflected in journal publications and graduate training (Figure 2).

This project seeks to historicize a flurry of contemporary data-driven activities in the study of linguistic prehistory, exemplified by projects such as the Evolution of Human Languages database of the Santa Fe Institute and the Automated Similarity Judgment Program of the Max Planck Institute for Evolutionary Anthropology. It traces these back to late nineteenth-century efforts to generalize the methods of comparative-historical linguistics from the Indo-European family to other families worldwide. It then considers the origins of “lexicostatistics” in American linguistic anthropology of the 1950s and 1960s, which was presented as a critical emendation to earlier comparative work. From there, it describes the development of an inclusive Nostratic phylogeny, linking Indo-European, Altaic, Uralic, and Kartvelian languages in a very broad genetic grouping, among members of the “Moscow School,” which coalesced around the work of V. M. Illich-Svitych in the mid-1960s (Figure 3). Moving to the North American context, it recounts the controversies surrounding Joseph Greenberg’s classificatory overhaul of African and American languages using the method of “mass” or “multilateral comparison,” a pencil-and-paper kind of Big Data approach. The survey concludes in the mid-1980s, when the qualitative Soviet and quantitative American traditions began to converge. This collaboration was forged against the backdrop of heightened public interest in the sciences of human origins, sparked by the announcement of the “Out of Africa” theory.


Figure 3: Map from the first volume of V. M. Illich-Svitych’s Dictionary of Nostratic (1971), p. 45. It motivated subsequent activities of the “Moscow School” and is one of the first sustained works of macro-comparative linguistics.

Big Data and the Reconstruction of Linguistic Prehistory contends that controversies engendered by long-range linguistic reconstruction can tell historians a great deal about tacit standards of evidence and method in the language sciences. More broadly, the project sheds new light on the collection, comparison, and classification of Big Data over the last 150 years.

Further Information

Website of the project: Big Data and the Reconstruction of Linguistic Prehistory

Website of the Working Group: Historicizing Big Data
