This paper infers the processes of development and change of grammatical gender in Indo-Aryan languages using phylogenetic comparative methods. Forty-eight Indo-Aryan languages are coded based on 44 presence-absence features relating to gender marking on verbs, adjectives, personal pronouns, demonstrative pronouns, and possessive pronouns. A Bayesian Reversible Jump Hyper Prior analysis, which infers the evolutionary dynamics of changes between feature values, gives results that are consistent with historical linguistic and typological studies on gender systems in Indo-Aryan languages and predicts the evolutionary trends of the features included in the dataset.
The Chapacuran language family, with three extant members and nine historically attested lects, has yet to be classified following modern standards in historical linguistics. This paper presents an internal classification of these languages by combining both the traditional comparative method (CM) and Bayesian phylogenetic inference (BPI). We identify multiple systematic sound correspondences and 285 cognate sets of basic vocabulary using the available documentation. These allow us to reconstruct a large portion of the Proto-Chapacuran phonemic inventory and identify tentative major subgroupings. The cognate sets form the input for the BPI analysis, which uses a stochastic Continuous-Time Markov Chain to model the change of these cognate sets over time. We test various models of lexical substitution and evolutionary clocks, and use ethnohistorical information and data collection dates to calibrate the resulting trees. The CM and BPI analyses produce largely congruent results, suggesting a division of the family into three different clades.
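As a rough illustration of the Continuous-Time Markov Chain idea mentioned for the Chapacuran analysis, the sketch below computes transition probabilities for a single cognate set treated as a binary trait (absent = 0, present = 1). The two-state formulation, the function name `binary_ctmc_probs`, and the `gain` and `loss` rate parameters are assumptions made for this example; the abstract does not say which substitution models were actually tested.

```python
import math


def binary_ctmc_probs(gain, loss, t):
    """Transition matrix P(t) of a two-state CTMC (hypothetical helper).

    gain: rate of 0 -> 1 (cognate set acquired), loss: rate of 1 -> 0.
    Returns [[P00, P01], [P10, P11]] after elapsed time t.
    """
    total = gain + loss
    decay = math.exp(-total * t)
    p01 = gain / total * (1.0 - decay)  # absent -> present
    p10 = loss / total * (1.0 - decay)  # present -> absent
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]


if __name__ == "__main__":
    # Illustrative rates only; not estimates from any dataset.
    for row in binary_ctmc_probs(gain=0.2, loss=0.8, t=1.5):
        print([round(p, 4) for p in row])
```

Over long time spans both rows approach the stationary distribution, which is why older splits carry less signal about the ancestral state of any single cognate set.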
There are two competing hypotheses for the origin of the Indo-European language family. The conventional view places the homeland in the Pontic steppes about 6000 years ago. An alternative hypothesis claims that the languages spread from Anatolia with the expansion of farming 8000 to 9500 years ago. We used Bayesian phylogeographic approaches, together with basic vocabulary data from 103 ancient and contemporary Indo-European languages, to explicitly model the expansion of the family and test these hypotheses. We found decisive support for an Anatolian origin over a steppe origin. Both the inferred timing and root location of the Indo-European language trees fit with an agricultural expansion from Anatolia beginning 8000 to 9500 years ago. These results highlight the critical role that phylogeographic inference can play in resolving debates about human prehistory.
Recent environmental humanities scholarship has argued that environmental illness (EI) memoirs perform important cultural work by recasting health as an environmental issue. In this article, I show how EI autobiography hearkens back to a longer tradition of health travel with deep colonial resonances. I explore such connections by means of a comparative analysis of two health travel narratives: The first, A Winter in the West Indies and Florida, is an anonymous tract by a self-described 'northern invalid' dealing with his travels to the Caribbean as a remedy for his chronic pulmonary problems during the late 1830s. The second, drawn from a collection by disability activist Aurora Levins Morales, details the author's healing journey to Cuba during the summer of 2009. I argue that, while A Winter points forward to modern sociobiology, Levins Morales's narrative should be read as issuing from a biosocial community of EI sufferers. Finally, attending to the continuities and differences between EI autobiographies may deepen current debates on trans-corporeality, which tend to assume a direct relation between non-dualistic epistemologies and somatic ethics. In this sense, the article can be read as a commentary on overly rights-based approaches to illness and disability in the present biochemical age.
In this paper, we investigate the phenomenon of pronominal politeness in the Indo-European languages and demonstrate that the processes of change of pronominal systems related to politeness follow two evolutionary regimes, one inside the 'Standard Average European' (SAE) linguistic area and another outside of it. Historical processes of language change differ at different levels of linguistic structure. In general, we presume that lower level, unconscious aspects of language change slowly over phylogenetic time, giving rise to patterns of relationship that can often be described as a family tree. Aspects of language that are consciously manipulated by speakers are expected to vary at a faster rate and to diffuse within areas of contact. Politeness is a social phenomenon, so we expect these systems to be highly susceptible to areal norms of interaction. We show that the similarities of SAE politeness systems can be accounted for with a model of convergence due to parallel evolution in a shared (social-demographic) environment, rather than by genealogical relatedness or borrowing. By quantifying and testing factors determining rates of structural change, we offer a novel and realistic approach that can explain similarities between distantly related languages sharing the same environment.
A major argument against the feasibility of reconstructing syntax for proto-stages is the widely discussed lack of directionality of syntactic change. In a recent typology of changes in argument structure constructions based on Germanic (Barðdal 2015), several different, yet opposing, changes are reported. These include, among others, processes sometimes called dative sickness, nominative sickness, and accusative sickness. In order to tease apart the roles of the different processes, we have carried out a phylogenetic trait analysis on a predefined data set of twelve predicates found across the Germanic phyla using the MULTISTATE method. This is, as far as we are aware, the first application of the MULTISTATE method (Pagel et al. 2004) in historical syntax. The results clearly favor one of the models, the dative sickness model, over any other model, as this model is the only one that can accurately account for both the observed diversity of case frames and the independently proposed philological reconstructions. Methods of evolutionary trait analysis can be used to model evolutionary paths of argument structure constructions, and they provide the perfect testing ground for hypotheses arrived at through philological reconstruction, based on classical historical-comparative methods.
Our species displays remarkable linguistic diversity. Although the uneven distribution of this diversity demands explanation, the drivers of these patterns have not been conclusively determined. We address this issue in two steps: First, we review previous empirical studies whose authors have suggested environmental, geographical, and sociocultural drivers of linguistic diversification. However, contradictory results and methodological variation make it difficult to draw general conclusions. Second, we outline a program for future research. We suggest that future analyses should account for interactions among causal factors, the lack of spatial and phylogenetic independence of the data, and transitory patterns. Recent analytical advances in biogeography and evolutionary biology, such as simulation modeling of diversity patterns, hold promise for testing four key mechanisms of language diversification proposed here: neutral change, population movement, contact, and selection. Future modeling approaches should also evaluate how the outcomes of these processes are influenced by demography, environmental heterogeneity, and time.
Aim: Two fundamental questions about human language demand answers: why are so many languages spoken today and why is their geographical distribution so uneven? Although hypotheses have been proposed for centuries, the processes that determine patterns of linguistic and cultural diversity remain poorly understood. Previous studies, which relied on correlative, curve-fitting approaches, have produced contradictory results. Here we present the first application of process-based simulation modelling, derived from macroecology, to examine the distribution of human groups and their languages. Location: The Australian continent is used as a case study to demonstrate the power of simulation modelling for identifying processes shaping the diversity and distribution of human languages. Methods: Process-based simulation models allow investigators to hold certain factors constant in order to isolate and assess the impact of modelled processes. We tested the extent to which a minimal set of processes determines the number and spatial distribution of languages on the Australian continent. Our model made three basic assumptions based on previously proposed, but untested, hypotheses: groups fill unoccupied spaces, rainfall limits population density and groups divide after reaching a maximum population. Results: Remarkably, this simple model accurately predicted the total number of languages (average estimate 406, observed 407), and explained 56% of spatial variation in language richness on the Australian continent. Main conclusions: Our results present strong evidence that current climatic conditions and limits to group size are important processes shaping language diversity patterns in Australia. Our study also demonstrates how simulation models from macroecology can be used to understand the processes that have shaped human cultural diversity across the globe.
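The three assumptions of the Australian simulation (groups fill unoccupied spaces, rainfall limits population density, groups divide after reaching a maximum population) can be sketched on a toy grid. Everything below is an assumption made for illustration and is not the authors' model: the function name `simulate_languages`, the `people_per_rain_unit` scaling, the rectangular grid, and the simplification that a group stops growing and a new group is seeded once the population cap is reached.

```python
import random


def simulate_languages(rainfall, max_population, people_per_rain_unit=1.0, seed=0):
    """Toy range-filling simulation (hypothetical, not the published model).

    rainfall: 2D list of non-negative numbers, one value per grid cell.
    Returns (owner_grid, number_of_groups).
    """
    rng = random.Random(seed)
    rows, cols = len(rainfall), len(rainfall[0])
    owner = [[None] * cols for _ in range(rows)]
    free = {(r, c) for r in range(rows) for c in range(cols)}
    group_id = 0
    while free:
        # Assumption 1: a new group appears in some unoccupied cell.
        frontier = [rng.choice(sorted(free))]
        population = 0.0
        # Assumption 3 (simplified): growth stops at the population cap.
        while frontier and population < max_population:
            cell = frontier.pop(rng.randrange(len(frontier)))
            if cell not in free:
                continue  # already claimed via another path
            free.discard(cell)
            r, c = cell
            owner[r][c] = group_id
            # Assumption 2: rainfall sets how many people a cell supports.
            population += rainfall[r][c] * people_per_rain_unit
            for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nb in free:
                    frontier.append(nb)
        group_id += 1
    return owner, group_id


if __name__ == "__main__":
    dry = [[1.0] * 6 for _ in range(6)]
    wet = [[4.0] * 6 for _ in range(6)]
    print("groups on dry grid:", simulate_languages(dry, 8.0)[1])
    print("groups on wet grid:", simulate_languages(wet, 8.0)[1])
```

In this sketch wetter cells support more people, so a group reaches its cap over a smaller area and more groups fit on the same grid, which is the qualitative link between rainfall and language richness that the abstract describes.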
Understanding how and why language subsystems differ in their evolutionary dynamics is a fundamental question for historical and comparative linguistics. One key dynamic is the rate of language change. While it is commonly thought that the rapid rate of change hampers the reconstruction of deep language relationships beyond 6,000-10,000 y, there are suggestions that grammatical structures might retain more signal over time than other subsystems, such as basic vocabulary. In this study, we use a Dirichlet process mixture model to infer the rates of change in lexical and grammatical data from 81 Austronesian languages. We show that, on average, most grammatical features actually change faster than items of basic vocabulary. The grammatical data show less schismogenesis, higher rates of homoplasy, and more bursts of contact-induced change than the basic vocabulary data. However, there is a core of grammatical and lexical features that are highly stable. These findings suggest that different subsystems of language have differing dynamics and that careful, nuanced models of language change will be needed to extract deeper signal from the noise of parallel evolution, areal readaptation, and contact.
Collections of sayings of the desert fathers and mothers are extant in manuscripts in many languages and are organized differently. They are ‘fixed-content miscellanies’ (FCM): they include material that belongs to the same genre, but is variable both when it comes to appearance and order. Distance measurement methods are particularly suitable for large text traditions including variable content in the so-called mixed-content miscellanies, such as recipes, anthological compilations of shorter text passages, or catalogues, but can also be suitable for text genres like collections of sayings, which are equally variable in appearance and order of sayings, even though the genre is fixed; hence ‘fixed-content miscellanies’. In the article, collections of sayings in seven languages were compared using four distance measurement methods. Each segment of the sayings was given a unique id to be comparable. The first method used, the Jaccard distance measure, disregards the linear order of items and instead considers each collection compared only as a ‘bag of stories’. In two other methods used (Birnbaum and Levenshtein methods), the order in which the narratives of each saying appear is compared. All three methods yielded interesting results, but the collections that were apparently closely related were clustered together so tightly that it was not possible to make more nuanced analyses. In order to remove false negatives, particulars concerning lacunae in the material were taken into account in the proposed modified Levenshtein method, the fixed-content miscellanies (FCM)-Levenshtein method. By applying the FCM-Levenshtein method, previously unknown relations between collections witnessed in different languages could be detected.
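The contrast between the order-insensitive Jaccard measure and the order-sensitive Levenshtein measure can be seen in a minimal sketch that treats each collection as a list of saying ids. The ids (`"s1"`, `"s2"`, ...) and function names are made up for this example, and only the standard textbook forms of the two measures are shown; the Birnbaum and FCM-Levenshtein variants are not reproduced here because the abstract does not define them.

```python
def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| on the sets of ids; order is ignored."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0
    return 1.0 - len(set_a & set_b) / len(union)


def levenshtein(a, b):
    """Minimum number of insertions, deletions, and substitutions
    needed to turn sequence a into sequence b; order matters."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # Two hypothetical collections with the same sayings in reverse order.
    coll_a = ["s1", "s2", "s3"]
    coll_b = ["s3", "s2", "s1"]
    print(jaccard_distance(coll_a, coll_b))  # same 'bag of stories'
    print(levenshtein(coll_a, coll_b))       # order differs
```

Two collections with identical content but different ordering are indistinguishable under Jaccard yet separated under Levenshtein, which is the reason for using both kinds of measure side by side.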
The consequences of the Neolithic transition in Europe, one of the most important cultural changes in human prehistory, are a subject of great interest. However, its effect on prehistoric and modern-day people in Iberia, the westernmost frontier of the European continent, remains unresolved. We present, to our knowledge, the first genome-wide sequence data from eight human remains, dated to between 5,500 and 3,500 years before present, excavated in the El Portalon cave at Sierra de Atapuerca, Spain. We show that these individuals emerged from the same ancestral gene pool as early farmers in other parts of Europe, suggesting that migration was the dominant mode of transferring farming practices throughout western Eurasia. In contrast to central and northern early European farmers, the Chalcolithic El Portalon individuals additionally mixed with local southwestern hunter-gatherers. The proportion of hunter-gatherer-related admixture into early farmers also increased over the course of two millennia. The Chalcolithic El Portalon individuals showed greatest genetic affinity to modern-day Basques, who have long been considered linguistic and genetic isolates linked to the Mesolithic, whereas all other European early farmers show greater genetic similarity to modern-day Sardinians. These genetic links suggest that Basques and their language may be linked with the spread of agriculture during the Neolithic. Furthermore, all modern-day Iberian groups except the Basques display distinct admixture with Caucasus/Central Asian and North African groups, possibly related to historical migration events. The El Portalon genomes uncover important pieces of the demographic history of Iberia and Europe and reveal how prehistoric groups relate to modern-day people.
Recent studies have detailed a remarkable degree of genetic and linguistic diversity in Northern Island Melanesia. Here we utilize that diversity to examine two models of genetic and linguistic coevolution. The first model predicts that genetic and linguistic correspondences formed following population splits and isolation at the time of early range expansions into the region. The second is analogous to the genetic model of isolation by distance, and it predicts that genetic and linguistic correspondences formed through continuing genetic and linguistic exchange between neighboring populations. We tested the predictions of the two models by comparing observed and simulated patterns of genetic variation, genetic and linguistic trees, and matrices of genetic, linguistic, and geographic distances. The data consist of 751 autosomal microsatellites and 108 structural linguistic features collected from 33 Northern Island Melanesian populations. The results of the tests indicate that linguistic and genetic exchange have erased any evidence of a splitting and isolation process that might have occurred early in the settlement history of the region. The correlation patterns are also inconsistent with the predictions of the isolation by distance coevolutionary process in the larger Northern Island Melanesian region, but there is strong evidence for the process in the rugged interior of the largest island in the region (New Britain). There we found some of the strongest recorded correlations between genetic, linguistic, and geographic distances. We also found that, throughout the region, linguistic features have generally been less likely to diffuse across population boundaries than genes. The results from our study, based on exceptionally fine-grained data, show that local genetic and linguistic exchange are likely to obscure evidence of the early history of a region, and that language barriers do not particularly hinder genetic exchange. In contrast, global patterns may emphasize more ancient demographic events, including population splits associated with the early colonization of major world regions.
The coevolution of genes and languages has been a subject of enduring interest among geneticists and linguists. Progress has been limited by the available data and by the methods employed to compare patterns of genetic and linguistic variation. Here, we use high-quality data and novel methods to test two models of genetic and linguistic coevolution in Northern Island Melanesia, a region known for its complex history and remarkable biological and linguistic diversity. The first model predicts that congruent genetic and linguistic trees formed following serial population splits and isolation that occurred early in the settlement history of the region. The second model emphasizes the role of post-settlement exchange among neighboring groups in determining genetic and linguistic affinities. We rejected both models for the larger region, but found strong evidence for the post-settlement exchange model in the rugged interior of its largest island, where people have maintained close ties to their ancestral lands. The exchange (particularly genetic exchange) has obscured but not completely erased signals of early migrations into Island Melanesia, and such exchange has probably obscured early prehistory within other regions. In contrast, local exchange is less likely to have obscured evidence of population history at larger geographic scales.
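Correlations between genetic, linguistic, and geographic distance matrices, as reported for Northern Island Melanesia, are commonly assessed with a permutation procedure such as the Mantel test. The sketch below is a generic, stdlib-only version of that test, offered as an assumption about how such a comparison could be run; the source does not state which statistic the authors used, and the function names and toy matrices are invented for the example.

```python
import math
import random
from itertools import combinations


def _upper(matrix, order):
    """Upper-triangle entries of a square matrix after relabelling rows/cols."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i, j in combinations(range(n), 2)]


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def mantel(d1, d2, permutations=999, seed=0):
    """One-sided Mantel test: returns (observed r, permutation p-value)."""
    rng = random.Random(seed)
    identity = list(range(len(d1)))
    fixed = _upper(d2, identity)
    observed = pearson(_upper(d1, identity), fixed)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel populations in d1 only
        if pearson(_upper(d1, order), fixed) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)


if __name__ == "__main__":
    # Toy data: five populations on a line; 'linguistic' distance is
    # simply twice the 'geographic' distance (illustrative only).
    pos = [0, 1, 2, 3, 5]
    geo = [[abs(a - b) for b in pos] for a in pos]
    ling = [[2 * v for v in row] for row in geo]
    print(mantel(geo, ling))
```

Permuting population labels rather than individual cells preserves the dependence among entries that share a population, which is why a plain correlation test on the matrix entries would overstate significance.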
This chapter investigates the fit of genetic, phenotypic, and linguistic data to two well-known models of population history. The first of these models, termed the population fissions model, emphasizes population splitting, isolation, and independent evolution. It predicts that genetic and linguistic data will be perfectly tree-like. The second model, termed isolation by distance, emphasizes genetic exchange among geographically proximate populations. It predicts a monotonic decline in genetic similarity with increasing geographic distance. While these models are overly simplistic, deviations from them were expected to provide important insights into the population history of northern Island Melanesia. The chapter finds scant support for either model because the prehistory of the region has been so complex. Nonetheless, the genetic and linguistic data are consistent with an early radiation of proto-Papuan speakers into the region followed by a much later migration of Austronesian speaking peoples. While these groups subsequently experienced substantial genetic and cultural exchange, this exchange has been insufficient to erase this history of separate migrations.
The Dravidian language family consists of about 80 varieties (Hammarström H. 2016 Glottolog 2.7) spoken by 220 million people across southern and central India and surrounding countries (Steever SB. 1998 In The Dravidian languages (ed. SB Steever), pp. 1-39: 1). Neither the geographical origin of the Dravidian language homeland nor its exact dispersal through time is known. The history of these languages is crucial for understanding prehistory in Eurasia, because despite their current restricted range, these languages played a significant role in influencing other language groups including Indo-Aryan (Indo-European) and Munda (Austroasiatic) speakers. Here, we report the results of a Bayesian phylogenetic analysis of cognate-coded lexical data, elicited first-hand from native speakers, to investigate the subgrouping of the Dravidian language family, and provide dates for the major points of diversification. Our results indicate that the Dravidian language family is approximately 4500 years old, a finding that corresponds well with earlier linguistic and archaeological studies. The main branches of the Dravidian language family (North, Central, South I, South II) are recovered, although the placement of languages within these main branches diverges from previous classifications. We find considerable uncertainty with regard to the relationships between the main branches.
In each semantic domain studied to date, there is considerable variation in how meanings are expressed across languages. But are some semantic domains more likely to show variation than others? Is the domain of space more or less variable in its expression than other semantic domains, such as containers, body parts, or colours? According to many linguists, the meanings expressed in grammaticised expressions, such as (spatial) adpositions, are more likely to be similar across languages than meanings expressed in open class lexical items. On the other hand, some psychologists predict there ought to be more variation across languages in the meanings of adpositions than in the meanings of nouns. This is because relational categories, such as those expressed as adpositions, are said to be constructed by language, whereas object categories expressed as nouns are predicted to be "given by the world". We tested these hypotheses by comparing the semantic systems of closely related languages. Previous cross-linguistic studies emphasise the importance of studying diverse languages, but we argue that a focus on closely related languages is advantageous because domains can be compared in a culturally- and historically-informed manner. Thus we collected data from 12 Germanic languages. Naming data were collected from at least 20 speakers of each language for containers, body-parts, colours, and spatial relations. We found the semantic domains of colour and body-parts were the most similar across languages. Containers showed some variation, but spatial relations expressed in adpositions showed the most variation. The results are inconsistent with the view expressed by most linguists. Instead, we find meanings expressed in grammaticised expressions are more variable than meanings in open class lexical items.
The use of computational methods to assign absolute datings to language divergence is receiving renewed interest, as modern approaches based on Bayesian statistics offer alternatives to the discredited techniques of glottochronology. The datings provided by these new analyses depend crucially on the use of calibration, but the methodological issues surrounding calibration have received comparatively little attention. Especially underappreciated is the extent to which traditional historical linguistic scholarship can contribute to the calibration process via loanword analysis. Aiming at a wide audience, we provide a detailed discussion of calibration theory and practice, evaluate previously used calibrations, recommend best practices for justifying calibrations, and provide a concrete example of these practices via a detailed derivation of calibrations for the Uralic language family. This article aims to inspire a higher quality of scholarship surrounding all statistical approaches to language dating, and especially closer engagement between practitioners of statistical methods and traditional historical linguists, with the former thinking more carefully about the arguments underlying their calibrations and the latter more clearly identifying results of their work which are relevant to calibration, or even suggesting calibrations directly.
This paper presents the Uralic Areal Typology Online (UraTyp 1.0), a typological dataset of 35 Uralic languages and a total of 360 features, mainly covering the levels of morphology, syntax, and phonology. The features belong to two different datasets: 195 features’ definitions originate from the Grambank (GB) database, developed for comparison of world language typology, whereas 165 features (UT) have been designed specifically to describe the typological variation within the Uralic language family. We present a series of analyses of the dataset demonstrating its scope and possibilities. The complete data set correctly identifies the main Uralic subgroups in a Principal Components Analysis, whereas GB data alone is insufficiently granular to detect this family-internal structure. Similar analyses limited to various typological subdomains also give variable results. A model-based admixture analysis identifies four distinct areas of historical interaction: Saami, Finnic, the Volga area and Ob-Ugric.
Similarities between languages can be due to 1) homoplasies because of a limited design space, 2) common ancestry, and 3) contact-induced convergence. Typological or structural features cannot prove genealogy, but they can provide historical signals that are due to common ancestry or contact (or both). Following a brief summary of results obtained from the comparison of 160 structural features from 121 languages (Reesink, Singer & Dunn 2009), we discuss some issues related to the relative dependencies of such features: logical entailment, chance resemblance, typological dependency, phylogeny and contact. This discussion focusses on the clustering of languages found in a small sample of 11 Austronesian and 8 Papuan languages of eastern Indonesia, an area known for its high degree of admixture.