1
|
Yang S, Sun X, Jin L, Zhang M. Inferring language dispersal patterns with velocity field estimation. Nat Commun 2024; 15:190. [PMID: 38167834 PMCID: PMC10761963 DOI: 10.1038/s41467-023-44430-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2023] [Accepted: 12/11/2023] [Indexed: 01/05/2024] Open
Abstract
Reconstructing the spatial evolution of languages can deepen our understanding of the demic diffusion and cultural spread. However, the phylogeographic approach that is frequently used to infer language dispersal patterns has limitations, primarily because the phylogenetic tree cannot fully explain the language evolution induced by the horizontal contact among languages, such as borrowing and areal diffusion. Here, we introduce the language velocity field estimation, which does not rely on the phylogenetic tree, to infer language dispersal trajectories and centre. Its effectiveness and robustness are verified through both simulated and empirical validations. Using language velocity field estimation, we infer the dispersal patterns of four agricultural language families and groups, encompassing approximately 700 language samples. Our results show that the dispersal trajectories of these languages are primarily compatible with population movement routes inferred from ancient DNA and archaeological materials, and their dispersal centres are geographically proximate to ancient homelands of agricultural or Neolithic cultures. Our findings highlight that the agricultural languages dispersed alongside the demic diffusions and cultural spreads during the past 10,000 years. We expect that language velocity field estimation could aid the spatial analysis of language evolution and further branch out into the studies of demographic and cultural dynamics.
Affiliation(s)
- Sizhe Yang
  - State Key Laboratory of Genetic Engineering, Center for Evolutionary Biology, and Collaborative Innovation Center for Genetics and Development, School of Life Sciences, Fudan University, Shanghai, 200438, China
- Xiaoru Sun
  - Human Phenome Institute, Fudan University, Shanghai, 200438, China
  - Ministry of Education Key Laboratory of Contemporary Anthropology, Department of Anthropology and Human Genetics, School of Life Sciences, Fudan University, Shanghai, 200438, China
- Li Jin
  - State Key Laboratory of Genetic Engineering, Center for Evolutionary Biology, and Collaborative Innovation Center for Genetics and Development, School of Life Sciences, Fudan University, Shanghai, 200438, China
  - Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Menghan Zhang
  - Institute of Modern Languages and Linguistics, Fudan University, Shanghai, 200433, China
  - Research Institute of Intelligent Complex Systems, Fudan University, Shanghai, 200433, China
|
2
|
List JM, Forkel R. Automated identification of borrowings in multilingual wordlists. Open Res Eur 2022; 1:79. [PMID: 37645101 PMCID: PMC10445856 DOI: 10.12688/openreseurope.13843.3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/03/2021] [Indexed: 08/31/2023]
Abstract
Although lexical borrowing is an important aspect of language evolution, there have been few attempts to automate the identification of borrowings in lexical datasets. Moreover, none of the solutions which have been proposed so far identify borrowings across multiple languages. This study proposes a new method for the task and tests it on a newly compiled large comparative dataset of 48 South-East Asian languages from Southern China. The method yields very promising results, while it is conceptually straightforward and easy to apply. This makes the approach a perfect candidate for computer-assisted exploratory studies on lexical borrowing in contact areas.
Affiliation(s)
- Johann-Mattis List
  - Department of Linguistic and Cultural Evolution, Max Planck Institute for Evolutionary Anthropology, Leipzig, Thüringen, 04103, Germany
- Robert Forkel
  - Department of Linguistic and Cultural Evolution, Max Planck Institute for Evolutionary Anthropology, Leipzig, Thüringen, 04103, Germany
|
3
|
Miller JE, Tresoldi T, Zariquiey R, Beltrán Castañón CA, Morozova N, List JM. Using lexical language models to detect borrowings in monolingual wordlists. PLoS One 2020; 15:e0242709. [PMID: 33296372 PMCID: PMC7725347 DOI: 10.1371/journal.pone.0242709] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2020] [Accepted: 11/07/2020] [Indexed: 11/25/2022] Open
Abstract
Lexical borrowing, the transfer of words from one language to another, is one of the most frequent processes in language evolution. In order to detect borrowings, linguists make use of various strategies, combining evidence from various sources. Despite the increasing popularity of computational approaches in comparative linguistics, automated approaches to lexical borrowing detection are still in their infancy, disregarding many aspects of the evidence that is routinely considered by human experts. One example for this kind of evidence are phonological and phonotactic clues that are especially useful for the detection of recent borrowings that have not yet been adapted to the structure of their recipient languages. In this study, we test how these clues can be exploited in automated frameworks for borrowing detection. By modeling phonology and phonotactics with the support of Support Vector Machines, Markov models, and recurrent neural networks, we propose a framework for the supervised detection of borrowings in mono-lingual wordlists. Based on a substantially revised dataset in which lexical borrowings have been thoroughly annotated for 41 different languages from different families, featuring a large typological diversity, we use these models to conduct a series of experiments to investigate their performance in mono-lingual borrowing detection. While the general results appear largely unsatisfying at a first glance, further tests show that the performance of our models improves with increasing amounts of attested borrowings and in those cases where most borrowings were introduced by one donor language alone. Our results show that phonological and phonotactic clues derived from monolingual language data alone are often not sufficient to detect borrowings when using them in isolation. Based on our detailed findings, however, we express hope that they could prove to be useful in integrated approaches that take multi-lingual information into account.
Affiliation(s)
- John E. Miller
  - Artificial Intelligence/Engineering, Pontificia Universidad Católica del Perú, San Miguel, Lima, Peru
- Tiago Tresoldi
  - Department of Linguistic and Cultural Evolution, Max Planck Institute for the Science of Human History, Jena, Germany
- Roberto Zariquiey
  - Humanities Department, Pontificia Universidad Católica del Perú, San Miguel, Lima, Peru
- César A. Beltrán Castañón
  - Artificial Intelligence/Engineering, Pontificia Universidad Católica del Perú, San Miguel, Lima, Peru
- Natalia Morozova
  - Department of Linguistic and Cultural Evolution, Max Planck Institute for the Science of Human History, Jena, Germany
- Johann-Mattis List
  - Department of Linguistic and Cultural Evolution, Max Planck Institute for the Science of Human History, Jena, Germany
|
4
|
Rockmore DN, Fang C, Foti NJ, Ginsburg T, Krakauer DC. The cultural evolution of national constitutions. J Assoc Inf Sci Technol 2017. [DOI: 10.1002/asi.23971] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Indexed: 11/10/2022]
Affiliation(s)
- Daniel N. Rockmore
  - Department of Computer Science; Dartmouth College; Hanover NH 03755 USA
  - Department of Mathematics; Dartmouth College; Hanover NH 03755 USA
  - The Santa Fe Institute; Santa Fe NM 87501 USA
  - The Neukom Institute for Computational Science, Dartmouth College; Hanover NH 03755 USA
- Chen Fang
  - Department of Computer Science; Dartmouth College; Hanover NH 03755 USA
- Nicholas J. Foti
  - Department of Statistics; University of Washington; Seattle WA 98195-4322 USA
- Tom Ginsburg
  - University of Chicago Law School, The University of Chicago; Chicago IL 60637 USA
|
5
|
Maurits L, Forkel R, Kaiping GA, Atkinson QD. BEASTling: A software tool for linguistic phylogenetics using BEAST 2. PLoS One 2017; 12:e0180908. [PMID: 28796784 PMCID: PMC5552126 DOI: 10.1371/journal.pone.0180908] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 03/11/2017] [Accepted: 06/22/2017] [Indexed: 11/18/2022] Open
Abstract
We present a new open source software tool called BEASTling, designed to simplify the preparation of Bayesian phylogenetic analyses of linguistic data using the BEAST 2 platform. BEASTling transforms comparatively short and human-readable configuration files into the XML files used by BEAST to specify analyses. By taking advantage of Creative Commons-licensed data from the Glottolog language catalog, BEASTling allows the user to conveniently filter datasets using names for recognised language families, to impose monophyly constraints so that inferred language trees are backward compatible with Glottolog classifications, or to assign geographic location data to languages for phylogeographic analyses. Support for the emerging cross-linguistic linked data format (CLDF) permits easy incorporation of data published in cross-linguistic linked databases into analyses. BEASTling is intended to make the power of Bayesian analysis more accessible to historical linguists without strong programming backgrounds, in the hopes of encouraging communication and collaboration between those developing computational models of language evolution (who are typically not linguists) and relevant domain experts.
Affiliation(s)
- Luke Maurits
  - School of Psychology, University of Auckland, Auckland, New Zealand
- Robert Forkel
  - Department of Linguistic and Cultural Evolution, Max Planck Institute for the Science of Human History, Jena, Germany
- Gereon A Kaiping
  - Leiden University Centre for Linguistics, Leiden University, Leiden, the Netherlands
- Quentin D Atkinson
  - School of Psychology, University of Auckland, Auckland, New Zealand
  - Department of Linguistic and Cultural Evolution, Max Planck Institute for the Science of Human History, Jena, Germany
|
6
|
Shiers N, Aston JA, Smith JQ, Coleman JS. Gaussian tree constraints applied to acoustic linguistic functional data. J Multivariate Anal 2017. [DOI: 10.1016/j.jmva.2016.09.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
|
7
|
Willems M, Lord E, Laforest L, Labelle G, Lapointe FJ, Di Sciullo AM, Makarenkov V. Using hybridization networks to retrace the evolution of Indo-European languages. BMC Evol Biol 2016; 16:180. [PMID: 27600442 PMCID: PMC5012036 DOI: 10.1186/s12862-016-0745-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 03/30/2016] [Accepted: 08/17/2016] [Indexed: 11/24/2022] Open
Abstract
Background: Curious parallels between the processes of species and language evolution have been observed by many researchers. Retracing the evolution of Indo-European (IE) languages remains one of the most intriguing intellectual challenges in historical linguistics. Most of the IE language studies use the traditional phylogenetic tree model to represent the evolution of natural languages, thus not taking into account reticulate evolutionary events, such as language hybridization and word borrowing which can be associated with species hybridization and horizontal gene transfer, respectively. More recently, implicit evolutionary networks, such as split graphs and minimal lateral networks, have been used to account for reticulate evolution in linguistics.
Results: Striking parallels existing between the evolution of species and natural languages allowed us to apply three computational biology methods for reconstruction of phylogenetic networks to model the evolution of IE languages. We show how the transfer of methods between the two disciplines can be achieved, making necessary methodological adaptations. Considering basic vocabulary data from the well-known Dyen’s lexical database, which contains word forms in 84 IE languages for the meanings of a 200-meaning Swadesh list, we adapt a recently developed computational biology algorithm for building explicit hybridization networks to study the evolution of IE languages and compare our findings to the results provided by the split graph and galled network methods.
Conclusion: We conclude that explicit phylogenetic networks can be successfully used to identify donors and recipients of lexical material as well as the degree of influence of each donor language on the corresponding recipient languages. We show that our algorithm is well suited to detect reticulate relationships among languages, and present some historical and linguistic justification for the results obtained. Our findings could be further refined if relevant syntactic, phonological and morphological data could be analyzed along with the available lexical data.
Electronic supplementary material: The online version of this article (doi:10.1186/s12862-016-0745-6) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Matthieu Willems
  - Department of Computer Science, Université du Québec à Montréal, Case postale 8888, succursale Centre-ville, Montréal, Québec, H3C 3P8, Canada
- Etienne Lord
  - Department of Computer Science, Université du Québec à Montréal, Case postale 8888, succursale Centre-ville, Montréal, Québec, H3C 3P8, Canada
  - Department of Biological Sciences, Université de Montréal, C.P. 6128 succ. Centre-Ville, Montreal, Quebec, H3C 3J7, Canada
- Louise Laforest
  - Department of Computer Science, Université du Québec à Montréal, Case postale 8888, succursale Centre-ville, Montréal, Québec, H3C 3P8, Canada
- Gilbert Labelle
  - Department of Mathematics, Université du Québec à Montréal, Case postale 8888, succursale Centre-ville, Montréal, Québec, H3C 3P8, Canada
- François-Joseph Lapointe
  - Department of Biological Sciences, Université de Montréal, C.P. 6128 succ. Centre-Ville, Montreal, Quebec, H3C 3J7, Canada
- Anna Maria Di Sciullo
  - Department of Linguistics, Université du Québec à Montréal, Case postale 8888, succursale Centre-ville, Montréal, Québec, H3C 3P8, Canada
- Vladimir Makarenkov
  - Department of Computer Science, Université du Québec à Montréal, Case postale 8888, succursale Centre-ville, Montréal, Québec, H3C 3P8, Canada
|
8
|
List JM, Pathmanathan JS, Lopez P, Bapteste E. Unity and disunity in evolutionary sciences: process-based analogies open common research avenues for biology and linguistics. Biol Direct 2016; 11:39. [PMID: 27544206 PMCID: PMC4992195 DOI: 10.1186/s13062-016-0145-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 05/22/2016] [Accepted: 08/06/2016] [Indexed: 11/13/2022] Open
Abstract
Background: For a long time biologists and linguists have been noticing surprising similarities between the evolution of life forms and languages. Most of the proposed analogies have been rejected. Some, however, have persisted, and some even turned out to be fruitful, inspiring the transfer of methods and models between biology and linguistics up to today. Most proposed analogies were based on a comparison of the research objects rather than the processes that shaped their evolution. Focusing on process-based analogies, however, has the advantage of minimizing the risk of overstating similarities, while at the same time reflecting the common strategy to use processes to explain the evolution of complexity in both fields.
Results: We compared important evolutionary processes in biology and linguistics and identified processes specific to only one of the two disciplines as well as processes which seem to be analogous, potentially reflecting core evolutionary processes. These new process-based analogies support novel methodological transfer, expanding the application range of biological methods to the field of historical linguistics. We illustrate this by showing (i) how methods dealing with incomplete lineage sorting offer an introgression-free framework to analyze highly mosaic word distributions across languages; (ii) how sequence similarity networks can be used to identify composite and borrowed words across different languages; (iii) how research on partial homology can inspire new methods and models in both fields; and (iv) how constructive neutral evolution provides an original framework for analyzing convergent evolution in languages resulting from common descent (Sapir’s drift).
Conclusions: Apart from new analogies between evolutionary processes, we also identified processes which are specific to either biology or linguistics. This shows that general evolution cannot be studied from within one discipline alone. In order to get a full picture of evolution, biologists and linguists need to complement their studies, trying to identify cross-disciplinary and discipline-specific evolutionary processes. The fact that we found many process-based analogies favoring transfer from biology to linguistics further shows that certain biological methods and models have a broader scope than previously recognized. This opens fruitful paths for collaboration between the two disciplines.
Reviewers: This article was reviewed by W. Ford Doolittle and Eugene V. Koonin.
Electronic supplementary material: The online version of this article (doi:10.1186/s13062-016-0145-2) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Johann-Mattis List
  - CRLAO/EHESS, 2 rue de Lille, Paris, 75007, France
  - Equipe AIRE, UMR 7138, Laboratoire Evolution Paris-Seine, Université Pierre et Marie Curie, 7 quai St Bernard, Paris, 75005, France
- Jananan Sylvestre Pathmanathan
  - Equipe AIRE, UMR 7138, Laboratoire Evolution Paris-Seine, Université Pierre et Marie Curie, 7 quai St Bernard, Paris, 75005, France
- Philippe Lopez
  - Equipe AIRE, UMR 7138, Laboratoire Evolution Paris-Seine, Université Pierre et Marie Curie, 7 quai St Bernard, Paris, 75005, France
- Eric Bapteste
  - Equipe AIRE, UMR 7138, Laboratoire Evolution Paris-Seine, Université Pierre et Marie Curie, 7 quai St Bernard, Paris, 75005, France
|
9
|
Duda P, Zrzavý J. Human population history revealed by a supertree approach. Sci Rep 2016; 6:29890. [PMID: 27431856 PMCID: PMC4949479 DOI: 10.1038/srep29890] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 03/16/2016] [Accepted: 06/23/2016] [Indexed: 01/01/2023] Open
Abstract
Over the past two decades numerous new trees of modern human populations have been published extensively but little attention has been paid to formal phylogenetic synthesis. We utilized the "matrix representation with parsimony" (MRP) method to infer a composite phylogeny (supertree) of modern human populations, based on 257 genetic/genomic, as well as linguistic, phylogenetic trees and 44 admixture plots from 200 published studies (1990-2014). The resulting supertree topology includes the most basal position of S African Khoisan followed by C African Pygmies, and the paraphyletic section of all other sub-Saharan peoples. The sub-Saharan African section is basal to the monophyletic clade consisting of the N African-W Eurasian assemblage and the consistently monophyletic Eastern superclade (Sahul-Oceanian, E Asian, and Beringian-American peoples). This topology, dominated by genetic data, is well-resolved and robust to parameter set changes, with a few unstable areas (e.g., West Eurasia, Sahul-Melanesia) reflecting the existing phylogenetic controversies. A few populations were identified as highly unstable "wildcard taxa" (e.g. Andamanese, Malagasy). The linguistic classification fits rather poorly on the supertree topology, supporting a view that direct coevolution between genes and languages is far from universal.
Affiliation(s)
- Pavel Duda
  - Department of Zoology, Faculty of Science, University of South Bohemia, České Budějovice, Czech Republic
  - Center for Theoretical Study, Charles University and Academy of Sciences of the Czech Republic, Prague, Czech Republic
- Jan Zrzavý
  - Department of Zoology, Faculty of Science, University of South Bohemia, České Budějovice, Czech Republic
|
10
|
Valverde S, Solé RV. Punctuated equilibrium in the large-scale evolution of programming languages. J R Soc Interface 2016; 12:rsif.2015.0249. [PMID: 25994298 DOI: 10.1098/rsif.2015.0249] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Indexed: 11/12/2022] Open
Abstract
The analogies and differences between biological and cultural evolution have been explored by evolutionary biologists, historians, engineers and linguists alike. Two well-known domains of cultural change are language and technology. Both share some traits relating the evolution of species, but technological change is very difficult to study. A major challenge in our way towards a scientific theory of technological evolution is how to properly define evolutionary trees or clades and how to weight the role played by horizontal transfer of information. Here, we study the large-scale historical development of programming languages, which have deeply marked social and technological advances in the last half century. We analyse their historical connections using network theory and reconstructed phylogenetic networks. Using both data analysis and network modelling, it is shown that their evolution is highly uneven, marked by innovation events where new languages are created out of improved combinations of different structural components belonging to previous languages. These radiation events occur in a bursty pattern and are tied to novel technological and social niches. The method can be extrapolated to other systems and consistently captures the major classes of languages and the widespread horizontal design exchanges, revealing a punctuated evolutionary path.
Affiliation(s)
- Sergi Valverde
  - ICREA-Complex Systems Lab, Universitat Pompeu Fabra, Dr Aiguader 80, 08003 Barcelona, Spain
  - Institut de Biologia Evolutiva, CSIC-UPF, Pg Maritim de la Barceloneta 37, 08003 Barcelona, Spain
- Ricard V Solé
  - ICREA-Complex Systems Lab, Universitat Pompeu Fabra, Dr Aiguader 80, 08003 Barcelona, Spain
  - Institut de Biologia Evolutiva, CSIC-UPF, Pg Maritim de la Barceloneta 37, 08003 Barcelona, Spain
  - Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA
|
11
|
Abstract
Phylogenetic models, originally developed to demonstrate evolutionary biology, have been applied to a wide range of cultural data including natural language lexicons, manuscripts, folktales, material cultures, and religions. A fundamental question regarding the application of phylogenetic inference is whether trees are an appropriate approximation of cultural evolutionary history. Their validity in cultural applications has been scrutinized, particularly with respect to the lexicons of dialects in contact. Phylogenetic models organize evolutionary data into a series of branching events through time. However, branching events are typically not included in dialectological studies to interpret the distributions of lexical terms. Instead, dialectologists have offered spatial interpretations to represent lexical data. For example, new lexical items that emerge in a politico-cultural center are likely to spread to peripheries, but not vice versa. To explore the question of the tree model’s validity, we present a simple simulation model in which dialects form a spatial network and share lexical items through contact rather than through common ancestors. We input several network topologies to the model to generate synthetic data. We then analyze the synthesized data using conventional phylogenetic techniques. We found that a group of dialects can be considered tree-like even if it has not evolved in a temporally tree-like manner but has a temporally invariant, spatially tree-like structure. In addition, the simulation experiments appear to reproduce unnatural results observed in reconstructed trees for real data. These results motivate further investigation into the spatial structure of the evolutionary history of dialect lexicons as well as other cultural characteristics.
Affiliation(s)
- Yugo Murawaki
  - Department of Advanced Information Technology, Graduate School of Information Science and Electrical Engineering, Kyushu University, Fukuoka, Japan
|
12
|
Kressing F. Lateral and Vertical Transfer in Biology, Linguistics and Anthropology: An Account of Widely Neglected Ideas in the Formation of Evolutionary Theories. Evol Biol 2015. [DOI: 10.1007/s11692-015-9330-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/23/2022]
|
13
|
Winter B. Spoken language achieves robustness and evolvability by exploiting degeneracy and neutrality. Bioessays 2014; 36:960-7. [DOI: 10.1002/bies.201400028] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Indexed: 11/10/2022]
Affiliation(s)
- Bodo Winter
  - Cognitive and Information Sciences; University of California; Merced CA USA
|
14
|
Abstract
Words are built from smaller meaning bearing parts, called morphemes. As one word can contain multiple morphemes, one morpheme can be present in different words. The number of distinct words a morpheme can be found in is its family size. Here we used Birth-Death-Innovation Models (BDIMs) to analyze the distribution of morpheme family sizes in English and German vocabulary over the last 200 years. Rather than just fitting to a probability distribution, these mechanistic models allow for the direct interpretation of identified parameters. Despite the complexity of language change, we indeed found that a specific variant of this pure stochastic model, the second order linear balanced BDIM, significantly fitted the observed distributions. In this model, birth and death rates are increased for smaller morpheme families. This finding indicates an influence of morpheme family sizes on vocabulary changes. This could be an effect of word formation, perception or both. On a more general level, we give an example on how mechanistic models can enable the identification of statistical trends in language change usually hidden by cultural influences.
|
15
|
Linguistic phylogenies support back-migration from Beringia to Asia. PLoS One 2014; 9:e91722. [PMID: 24621925 PMCID: PMC3951421 DOI: 10.1371/journal.pone.0091722] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Received: 08/14/2013] [Accepted: 02/15/2014] [Indexed: 11/19/2022] Open
Abstract
Recent arguments connecting Na-Dene languages of North America with Yeniseian languages of Siberia have been used to assert proof for the origin of Native Americans in central or western Asia. We apply phylogenetic methods to test support for this hypothesis against an alternative hypothesis that Yeniseian represents a back-migration to Asia from a Beringian ancestral population. We coded a linguistic dataset of typological features and used neighbor-joining network algorithms and Bayesian model comparison based on Bayes factors to test the fit between the data and the linguistic phylogenies modeling two dispersal hypotheses. Our results support that a Dene-Yeniseian connection more likely represents radiation out of Beringia with back-migration into central Asia than a migration from central or western Asia to North America.
|
16
|
List JM, Nelson-Sathi S, Geisler H, Martin W. Networks of lexical borrowing and lateral gene transfer in language and genome evolution. Bioessays 2014; 36:141-50. [PMID: 24375688 PMCID: PMC3910147 DOI: 10.1002/bies.201300096] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Indexed: 11/11/2022]
Abstract
Like biological species, languages change over time. As noted by Darwin, there are many parallels between language evolution and biological evolution. Insights into these parallels have also undergone change in the past 150 years. Just like genes, words change over time, and language evolution can be likened to genome evolution accordingly, but what kind of evolution? There are fundamental differences between eukaryotic and prokaryotic evolution. In the former, natural variation entails the gradual accumulation of minor mutations in alleles. In the latter, lateral gene transfer is an integral mechanism of natural variation. The study of language evolution using biological methods has attracted much interest of late, most approaches focusing on language tree construction. These approaches may underestimate the important role that borrowing plays in language evolution. Network approaches that were originally designed to study lateral gene transfer may provide more realistic insights into the complexities of language evolution.
Affiliation(s)
- Johann-Mattis List
  - Research Center Deutscher Sprachatlas, Philipps-University Marburg, Marburg, Germany
- Shijulal Nelson-Sathi
  - Institute of Molecular Evolution, Heinrich-Heine University Düsseldorf, Düsseldorf, Germany
- Hans Geisler
  - Institute of Romance Languages and Literature, Heinrich-Heine University Düsseldorf, Düsseldorf, Germany
- William Martin
  - Institute of Molecular Evolution, Heinrich-Heine University Düsseldorf, Düsseldorf, Germany
|
17
|
Keller DB, Schultz J. Connectivity, not frequency, determines the fate of a morpheme. PLoS One 2013; 8:e69945. [PMID: 23922865 PMCID: PMC3726735 DOI: 10.1371/journal.pone.0069945] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 04/19/2013] [Accepted: 06/18/2013] [Indexed: 11/25/2022] Open
Abstract
Morphemes are the smallest meaningful parts of words and therefore represent a natural unit to study the evolution of words. To analyze the influence of language change on morphemes, we performed a large scale analysis of German and English vocabulary covering the last 200 years. Using a network approach from bioinformatics, we examined the historical dynamics of morphemes, the fixation of new morphemes and the emergence of words containing existing morphemes. We found that these processes are driven mainly by the number of different direct neighbors of a morpheme in words (connectivity, an equivalent to family size or type frequency) and not its frequency of usage (equivalent to token frequency). This contrasts words, whose survival is determined by their frequency of usage. We therefore identified features of morphemes which are not dictated by the statistical properties of words. As morphemes are also relevant for the mental representation of words, this result might enable establishing a link between an individual's perception of language and historical language change.
Affiliation(s)
- Jörg Schultz
- Department of Bioinformatics, Biocenter, University of Würzburg, Würzburg, Germany
|
18
|
O’Brien MJ, Collard M, Buchanan B, Boulanger MT. Trees, thickets, or something in between? Recent theoretical and empirical work in cultural phylogeny. Isr J Ecol Evol 2013. [DOI: 10.1080/15659801.2013.825431] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 12/25/2022]
Abstract
Anthropology has always had as one of its goals the explanation of human cultural diversity across space and through time. Over the past several decades, there has been a growing appreciation among anthropologists and other social scientists that the phylogenetic approaches that biologists have developed to reconstruct the evolutionary relationships of species are useful tools for building and explaining patterns of human diversity. Phylogenetic methods offer a means of creating testable propositions of heritable continuity – how one thing is related to another in terms of descent. Such methods have now been applied to a wide range of cultural phenomena, including languages, projectile points, textiles, marital customs, and political organization. Here we discuss several cultural phylogenies and demonstrate how they were used to address long-standing anthropological issues. Even keeping in mind that phylogenetic trees are nothing more than hypotheses about evolutionary relationships, some researchers have argued that when it comes to cultural behaviors and their products, tree building is theoretically unwarranted. We examine the issues that critics raise and find that they in no way sound the death knell for cultural phylogenetic work.
Affiliation(s)
- Mark Collard
- Human Evolutionary Studies Program and Department of Archaeology, Simon Fraser University
- Briggs Buchanan
- Department of Anthropology, University of Missouri
- Human Evolutionary Studies Program and Department of Archaeology, Simon Fraser University
|
19
|
Towner MC, Grote MN, Venti J, Borgerhoff Mulder M. Cultural macroevolution on neighbor graphs: vertical and horizontal transmission among Western North American Indian societies. Hum Nat 2013; 23:283-305. [PMID: 22791406 DOI: 10.1007/s12110-012-9142-z] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Indexed: 11/26/2022]
Abstract
What are the driving forces of cultural macroevolution, the evolution of cultural traits that characterize societies or populations? This question has engaged anthropologists for more than a century, with little consensus regarding the answer. We develop and fit autologistic models, built upon both spatial and linguistic neighbor graphs, for 44 cultural traits of 172 societies in the Western North American Indian (WNAI) database. For each trait, we compare models including or excluding one or both neighbor graphs, and for the majority of traits we find strong evidence in favor of a model which uses both spatial and linguistic neighbors to predict a trait's distribution. Our results run counter to the assertion that cultural trait distributions can be explained largely by the transmission of traits from parent to daughter populations and are thus best analyzed with phylogenies. In contrast, we show that vertical and horizontal transmission pathways can be incorporated in a single model, that both transmission modes may indeed operate on the same trait, and that for most traits in the WNAI database, accounting for only one mode of transmission would result in a loss of information.
Affiliation(s)
- Mary C Towner
- Department of Zoology, Oklahoma State University, Stillwater, 74078, USA.
|
20
|
Bouckaert R, Lemey P, Dunn M, Greenhill SJ, Alekseyenko AV, Drummond AJ, Gray RD, Suchard MA, Atkinson QD. Mapping the origins and expansion of the Indo-European language family. Science 2012; 337:957-60. [PMID: 22923579 DOI: 10.1126/science.1219669] [Citation(s) in RCA: 184] [Impact Index Per Article: 15.3] [Indexed: 11/03/2022]
Abstract
There are two competing hypotheses for the origin of the Indo-European language family. The conventional view places the homeland in the Pontic steppes about 6000 years ago. An alternative hypothesis claims that the languages spread from Anatolia with the expansion of farming 8000 to 9500 years ago. We used Bayesian phylogeographic approaches, together with basic vocabulary data from 103 ancient and contemporary Indo-European languages, to explicitly model the expansion of the family and test these hypotheses. We found decisive support for an Anatolian origin over a steppe origin. Both the inferred timing and root location of the Indo-European language trees fit with an agricultural expansion from Anatolia beginning 8000 to 9500 years ago. These results highlight the critical role that phylogeographic inference can play in resolving debates about human prehistory.
Affiliation(s)
- Remco Bouckaert
- Department of Computer Science, University of Auckland, Auckland 1142, New Zealand
|
21
|
Walker RS, Wichmann S, Mailund T, Atkisson CJ. Cultural phylogenetics of the Tupi language family in lowland South America. PLoS One 2012; 7:e35025. [PMID: 22506065 PMCID: PMC3323632 DOI: 10.1371/journal.pone.0035025] [Citation(s) in RCA: 54] [Impact Index Per Article: 4.5] [Received: 09/14/2011] [Accepted: 03/12/2012] [Indexed: 11/28/2022] Open
Abstract
BACKGROUND: Recent advances in automated assessment of basic vocabulary lists allow the construction of linguistic phylogenies useful for tracing dynamics of human population expansions, reconstructing ancestral cultures, and modeling transition rates of cultural traits over time. METHODS: Here we investigate the Tupi expansion, a widely-dispersed language family in lowland South America, with a distance-based phylogeny based on 40-word vocabulary lists from 48 languages. We coded 11 cultural traits across the diverse Tupi family including traditional warfare patterns, post-marital residence, corporate structure, community size, paternity beliefs, sibling terminology, presence of canoes, tattooing, shamanism, men's houses, and lip plugs. RESULTS/DISCUSSION: The linguistic phylogeny supports a Tupi homeland in west-central Brazil with subsequent major expansions across much of lowland South America. Consistently, ancestral reconstructions of cultural traits over the linguistic phylogeny suggest that social complexity has tended to decline through time, most notably in the independent emergence of several nomadic hunter-gatherer societies. Estimated rates of cultural change across the Tupi expansion are on the order of only a few changes per 10,000 years, in accord with previous cultural phylogenetic results in other language families around the world, and indicate a conservative nature to much of human culture.
Affiliation(s)
- Robert S Walker
- Department of Anthropology, University of Missouri, Columbia, Missouri, United States of America.
|
22
|
Tools from evolutionary biology shed new light on the diversification of languages. Trends Cogn Sci 2012; 16:167-73. [DOI: 10.1016/j.tics.2012.01.007] [Citation(s) in RCA: 91] [Impact Index Per Article: 7.6] [Received: 12/05/2011] [Revised: 01/18/2012] [Accepted: 01/20/2012] [Indexed: 01/04/2023]
|
23
|
Atkinson QD. Response to Comments on “Phonemic Diversity Supports a Serial Founder Effect Model of Language Expansion from Africa”. Science 2012. [DOI: 10.1126/science.1210005] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 11/02/2022]
Affiliation(s)
- Quentin D. Atkinson
- Department of Psychology, University of Auckland, Private Bag 92019, Auckland 1142, New Zealand
- Institute of Cognitive and Evolutionary Anthropology, University of Oxford, 64 Banbury Road, Oxford OX2 6PN, UK
|
24
|
Bowern C, Epps P, Gray R, Hill J, Hunley K, McConvell P, Zentz J. Does lateral transmission obscure inheritance in hunter-gatherer languages? PLoS One 2011; 6:e25195. [PMID: 21980394 PMCID: PMC3181316 DOI: 10.1371/journal.pone.0025195] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.3] [Received: 05/03/2011] [Accepted: 08/29/2011] [Indexed: 11/18/2022] Open
Abstract
In recent years, linguists have begun to increasingly rely on quantitative phylogenetic approaches to examine language evolution. Some linguists have questioned the suitability of phylogenetic approaches on the grounds that linguistic evolution is largely reticulate due to extensive lateral transmission, or borrowing, among languages. The problem may be particularly pronounced in hunter-gatherer languages, where the conventional wisdom among many linguists is that lexical borrowing rates are so high that tree building approaches cannot provide meaningful insights into evolutionary processes. However, this claim has never been systematically evaluated, in large part because suitable data were unavailable. In addition, little is known about the subsistence, demographic, ecological, and social factors that might mediate variation in rates of borrowing among languages. Here, we evaluate these claims with a large sample of hunter-gatherer languages from three regions around the world. In this study, a list of 204 basic vocabulary items was collected for 122 hunter-gatherer and small-scale cultivator languages from three ecologically diverse case study areas: northern Australia, northwest Amazonia, and California and the Great Basin. Words were rigorously coded for etymological (inheritance) status, and loan rates were calculated. Loan rate variability was examined with respect to language area, subsistence mode, and population size, density, and mobility; these results were then compared to the sample of 41 primarily agriculturalist languages. Though loan levels varied both within and among regions, they were generally low in all regions (mean 5.06%, median 2.49%, and SD 7.56), despite substantial demographic, ecological, and social variation. Amazonian levels were uniformly very low, with no language exhibiting more than 4%. 
Rates were low but more variable in the other two study regions, in part because of several outlier languages where rates of borrowing were especially high. High mobility, prestige asymmetries, and language shift may contribute to the high rates in these outliers. No support was found for claims that hunter-gatherer languages borrow more than agriculturalist languages. These results debunk the myth of high borrowing in hunter-gatherer languages and suggest that the evolution of these languages is governed by the same type of rules as those operating in large-scale agriculturalist speech communities. The results also show that local factors are likely to be more critical than general processes in determining high (or low) loan rates.
Affiliation(s)
- Claire Bowern
- Department of Linguistics, Yale University, New Haven, Connecticut, United States of America.
|
25
|
Matthews LJ, Tehrani JJ, Jordan FM, Collard M, Nunn CL. Testing for divergent transmission histories among cultural characters: a study using Bayesian phylogenetic methods and Iranian tribal textile data. PLoS One 2011; 6:e14810. [PMID: 21559083 PMCID: PMC3084691 DOI: 10.1371/journal.pone.0014810] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.3] [Received: 07/15/2010] [Accepted: 03/18/2011] [Indexed: 11/18/2022] Open
Abstract
BACKGROUND: Archaeologists and anthropologists have long recognized that different cultural complexes may have distinct descent histories, but they have lacked analytical techniques capable of easily identifying such incongruence. Here, we show how Bayesian phylogenetic analysis can be used to identify incongruent cultural histories. We employ the approach to investigate Iranian tribal textile traditions. METHODS: We used Bayes factor comparisons in a phylogenetic framework to test two models of cultural evolution: the hierarchically integrated system hypothesis and the multiple coherent units hypothesis. In the hierarchically integrated system hypothesis, a core tradition of characters evolves through descent with modification and characters peripheral to the core are exchanged among contemporaneous populations. In the multiple coherent units hypothesis, a core tradition does not exist. Rather, there are several cultural units consisting of sets of characters that have different histories of descent. RESULTS: For the Iranian textiles, the Bayesian phylogenetic analyses supported the multiple coherent units hypothesis over the hierarchically integrated system hypothesis. Our analyses suggest that pile-weave designs represent a distinct cultural unit that has a different phylogenetic history compared to other textile characters. CONCLUSIONS: The results from the Iranian textiles are consistent with the available ethnographic evidence, which suggests that the commercial rug market has influenced pile-rug designs but not the techniques or designs incorporated in the other textiles produced by the tribes. We anticipate that Bayesian phylogenetic tests for inferring cultural units will be of great value for researchers interested in studying the evolution of cultural traits including language, behavior, and material culture.
Affiliation(s)
- Luke J Matthews
- Department of Human Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America.
|