A Historical Database of Sociocultural Evolution

The origin of human ultrasociality—the ability to cooperate in huge groups of genetically unrelated individuals—has long interested evolutionary and social theorists, but there has been little systematic empirical research on the topic. The Historical Database of Sociocultural Evolution, which we introduce in this article, brings the available historical and archaeological data together in a way that will allow hypotheses concerning the origin of ultrasociality to be tested rigorously. In addition to describing the methodology informing the set-up of the database, our article introduces four hypotheses that we intend to test using the database. These hypotheses focus on the resource base, warfare, ritual, and religion, respectively. Ultimately the aim of our database is to offer a ‘rapid discovery science’ route to the study of the past. We believe our approach is not only highly complementary with existing traditions of enquiry in history and archaeology but will extend their intellectual scope and explanatory power.


Introduction
Until about ten thousand years ago all humans lived in small-scale societies characterized by face-to-face cooperation. Today the vast majority of people live in very large-scale anonymous societies, typically organized as states. The functioning of large-scale complex societies is only possible on the basis of cooperation among their members, or at least among some of them, some of the time. When the degree of cooperation in a society declines below the threshold necessary for its survival, the society falls apart, sometimes peacefully (as happened in the dissolution of Czechoslovakia in 1993), but more often violently (as is the case with today's 'failed states').
Cooperation in large-scale societies can take many forms. Examples include volunteering for the army when one's country is attacked, willingly paying taxes, organizing sporting activities for a community's children, helping strangers by donating to the local food-bank, and refusing to take bribes. In each case, cooperation produces a 'public good', that is, a good that benefits many members of a group, including those who decline to engage in such activities, while the costs of cooperation are borne privately (for example, one can be killed defending the country). This form of sociality, the ability to cooperate in huge groups of genetically unrelated individuals, is termed ultrasociality (Campbell 1983). As far as we know, ultrasociality is unique to humans. Although eusociality in social insects (bees, ants, termites) superficially resembles the sort of large-scale cooperation that characterizes human city-states and nations, the mechanisms must be radically different, because social insect colonies are composed of genetically highly related individuals, and are therefore more easily understood in terms of basic kin-selection mechanisms.
The origin of ultrasociality can be viewed as a major evolutionary transition (Maynard Smith and Szathmáry 1995). The other major evolutionary transitions are the development of chromosomes from independent replicators, the emergence of eukaryotic cells from prokaryotic cells, the evolution of multicellular organisms from unicellular organisms, and the development of eusocial colonies. Ultrasociality is the most recent major evolutionary transition.
How ultrasociality evolved presents a serious puzzle for both evolutionary and social theorists (Richerson and Boyd 1998), and has been a question that thinkers have struggled with throughout recorded history, from ancient China and Greece to the present. Nevertheless, we still do not have a generally accepted answer. One issue is that there is not even a theoretical framework for investigating this question on which the majority of researchers could agree (we discuss one such framework, cultural multilevel selection, in a later section). Another important factor is that the enormous amount of historical (including archaeological) information that exists has never been brought together in a way that would allow the various hypotheses that have been put forward to explain the origin of ultrasociality to be tested rigorously.
With the latter point in mind, we are creating a historical database that will enable us (and others) to test theories about the processes responsible for the rise of large-scale societies. The database will bring together, in a systematic form, what is currently known about the sociopolitical organization of complex human societies. It will also be used in analyses to determine how characteristics of large-scale socioeconomic organization vary with culture, institutions, world region and historical period, and whether there are any universal features that all complex societies share.

Testing Rival Theories with Historical Data
The general methodology we are using is not novel; it is essentially the tried-and-tested scientific method. Our approach involves the following steps:
• Specify the main question and identify empirical patterns that need to be explained.
• Develop a set of competing hypotheses that make different empirical predictions. This phase may require building mathematical models as an intermediate step, for example when theories postulate a complex set of interrelated processes.
• Code the data and perform statistical analyses of the resulting database to determine which hypotheses most parsimoniously explain the empirical patterns.
While the overall logic of this approach is quite simple, in practice it takes a lot of effort, and much thought, to make it work.

The 'Bottom-Up' Theory
As indicated above, the main goal of our research network is to explain the transition from small-scale, simple societies to large-scale, complex societies. This general question can be approached in a variety of ways, and here we give several examples of such approaches.
The first approach focuses on the empirical observation that the evolution of complex societies happened at different rates in different regions. Some regions were quite precocious in developing large-scale societies, others lagged. Before the recent globalization during which complex societies spread over the whole globe, a number of geographical regions had only small-scale societies of hunter-gatherers, while others experienced recurrent 'chiefly cycles' (Anderson 1996, Marcus 1998) without making the transition to more complex, state-based societies. Such spatio-temporal variation in the rates of social evolution provides a rich testing ground for comparing predictions from rival theories.
Probably the most prevalent current theory of how complex societies evolved focuses on resources. Put simply, the adoption of agriculture created a resource base capable of sustaining high population densities and an extensive division of labor. It also generated the capacity to produce 'surplus.' These developments made possible cities, where populations did not need to grow their own food, and specialized classes of managers and rulers were able to emerge. According to one theory of complex society formation, a rich resource base is not only a necessary condition, but also a sufficient one. We refer to this as the 'bottom-up' theory because it treats social complexity as a sort of 'superstructure' on the material resource base.
The resource hypothesis has been widely discussed by anthropologists, archaeologists, and historians (Childe 1950, White 1959, Service 1962, Diamond 1997, Johnson and Earle 2000, 2003, Kennett et al. 2012). In its pure form, this hypothesis features prominently in the theory of 'cultural materialism' (White 1959), which maintains that social evolution was driven by technological advances in the ability of societies to harness energy. Many other anthropological theories elaborate on this basic idea and posit a variety of additional mechanisms. A particularly sophisticated version was outlined by Johnson and Earle (2000). These authors consider how intensification of production affects economic, military, and ideological dynamics, and how the interaction between these processes results in the rise of chiefdoms, states, and empires.
Another recent version of the resource hypothesis draws on the growing body of data indicating that there have been frequent climatic fluctuations during the Holocene (Kennett et al. 2012). According to this hypothesis, better climate raises agricultural yields, enabling the rise of complex societies, while poor climate reduces the resource base, which in extreme cases can cause societal collapse. While different versions of the resource hypothesis put a greater emphasis on one or another mechanism, there is one clear implication of the bottom-up dynamics that is common to all of them: there should be a strong correlation between increases in the resource base and transitions to more complex societies (although there may be time lags between these two developments).
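The shared implication stated above, a correlation between resource increases and later complexity that allows for a time lag, can be illustrated with a minimal sketch. The two series, the function names, and the range of lags are all hypothetical and chosen purely for illustration; they are not data or methods from the database. A real analysis would also have to deal with autocorrelation within each series, which inflates simple correlations.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def lagged_correlation(predictor, response, lag):
    """Correlate the predictor at time t with the response at time t + lag."""
    if lag == 0:
        return pearson(predictor, response)
    return pearson(predictor[:-lag], response[lag:])

# Hypothetical index values, one per time step (e.g., per century).
resource_base = [1, 1, 2, 4, 4, 5, 7, 7, 8, 9]
# Constructed so that complexity follows the resource base two steps later.
complexity = [1, 1, 1, 1, 2, 4, 4, 5, 7, 7]

# Scan a few candidate lags and keep the one with the strongest correlation.
best_lag = max(range(0, 4),
               key=lambda k: lagged_correlation(resource_base, complexity, k))
print(best_lag)
```

Because the toy response series was built as a two-step delayed copy of the predictor, the scan recovers a lag of two; with real data the lag structure would itself be something to estimate.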
Before we discuss alternative theories, it is important to emphasize that investigating the correlation between advances in productive technologies and transitions to greater social complexity is only the beginning. Different theories, even though assuming bottom-up dynamics, may stress different mechanisms. For example, Service (1962) focused on redistributive aspects of chiefdoms, while Carneiro's (1970, 1981) theory emphasized warfare and circumscription. While we do not pursue such questions here, it is worth keeping in mind that such clear-cut theoretical disagreements can be tested empirically with the database we are building, provided we can figure out how to unambiguously code such concepts as, for example, 'circumscription.'

Cultural Multilevel Selection
A problem with most versions of the resource hypothesis is that they presume that somehow societies will find ways of solving collective action problems that inevitably arise when large groups of people need to cooperate in the production of public goods (Olson 1965, Hardin 1968). If such mechanisms are not specified, the theory must be incomplete.
One conceptual framework for understanding the evolution of social complexity that addresses this issue head-on is cultural group or multilevel selection (CMLS) theory (Sober and Wilson 1991, Richerson and Boyd 1998, 2005, Okasha 2007, Wilson and Wilson 2007). At its core CMLS theory proposes that human ultrasociality, and the cultural traits necessary to sustain it (most importantly, social norms and institutions), arose as a result of competition between cultural groups (Richerson and Boyd 1998, Richerson and Henrich 2012). Groups can outcompete one another in a variety of ways, including enhanced internal population growth, exportation (deliberate or not) of cultural forms, forcible assimilation, and outright elimination of competing groups, or some combination of these mechanisms. According to CMLS theory, collective action problems are solved by societies adopting prosocial norms and institutions. Although such solutions are usually costly, they spread because groups with such cultural traits tend to outcompete groups lacking them. A consequence of this logic is that prosocial norms and institutions will be gradually lost if between-group competition is relaxed. It should be noted that, although CMLS theory has recently been gaining ground, it remains rather controversial, and many evolutionary scientists continue to reject it, as witnessed by recent exchanges following the lead articles by Steven Pinker at the Edge and by David Sloan Wilson at the Social Evolution Forum (Wilson 2013).
Of particular interest to CMLS theorists are ultrasocial institutions playing a role in the integration of the largest-scale human groups, that is, institutions that enabled the transition from middle-scale societies (simple and complex chiefdoms) initially to archaic states and then to large-scale empires and modern nation-states. Ultrasocial institutions are characterized by a tension between the benefits they yield at the higher level of social organization and the costs borne by lower-level units. As a result, fragmentation into lower-level units should typically lead to a loss of such institutions. For example, when a territorial state fragments into a multitude of province-sized political units organized as complex chiefdoms, we expect that such ultrasocial institutions as governance by professional bureaucracies, or education systems producing literate elites, would be gradually eliminated from the system. Since fragmentation into smaller-scale units is something that has occurred repeatedly in human history, this observation provides us with an empirical basis for distinguishing ultrasocial institutions from others. Given that institutions are locally stable equilibria, however, we should not expect an immediate effect of fragmentation. Rather, the loss of ultrasocial institutions should be a long-term and stochastic process, with different lower-level units 'flipping' from one equilibrium to another at random times.
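The expectation of gradual, stochastic loss can be made concrete with a toy simulation. The sketch below assumes, purely for illustration, that after fragmentation each lower-level unit independently abandons the institution with a fixed probability per time step and never regains it; the function name, the number of units, and the flip probability are hypothetical and not estimates from any data.

```python
import random

def simulate_institution_loss(n_units=50, flip_prob=0.02, n_steps=200, seed=0):
    """Return, for each time step, how many units still retain the institution.

    Assumption: each unit flips to the 'no institution' equilibrium
    independently with probability flip_prob per step, and the flip
    is irreversible.
    """
    rng = random.Random(seed)
    retains = [True] * n_units
    counts = []
    for _ in range(n_steps):
        # A unit keeps the institution only if it had it and does not flip now.
        retains = [kept and rng.random() >= flip_prob for kept in retains]
        counts.append(sum(retains))
    return counts

trajectory = simulate_institution_loss()
# Print the number of retaining units every 50 steps.
print(trajectory[::50])
```

Under these assumptions the expected share of units retaining the institution after t steps is (1 - flip_prob) ** t, so the decline is slow and staggered across units rather than immediate, which is the qualitative pattern the paragraph above describes.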
While bottom-up theories of agricultural productivity see the invention of agriculture as the primary catalyst for the development of ultrasocial societies, some CMLS-based theories argue that the causality goes the other way around: a complex undertaking such as agriculture requires cooperation on a scale that far exceeds what small-scale societies are capable of. According to this view, inter-group pressures or other cultural innovations must have preceded, or at least accompanied, the development of agriculture-based large-scale societies. Our database project is being constructed with an eye toward assessing the most prominent current theories concerning what these additional factors may have been, namely the pressure of warfare, and the creation of certain forms of ritual and religion.

Warfare as a CMLS Force
The most important form of competition for historical (and prehistoric) societies was warfare. Thus, CMLS theory predicts that intensification of warfare should be followed, after a suitable time lag (again, because cultural evolution is not instantaneous), by a transition to larger-scale, more complex societies.
When we speak of 'intensification' of warfare, we use this term in a special way. What matters for the warfare-as-CMLS hypothesis is not how many people are killed or how much booty plundered. Rather, it is the question of how likely it is that warfare would result in a decisive victory that will have consequences for cultural change. These 'consequences' can take a variety of forms. In extreme cases, one of the sides in a conflict can be physically eliminated because its members are killed, or sold into slavery. Another possibility is for a conquered group to gradually lose its cultural identity as a result of such processes as linguistic assimilation and religious conversion. Yet another possibility is for the vanquished group to preserve its cultural identity while adopting certain traits of the victorious group. For example, a series of military defeats may prompt a group to imitate cultural traits of more successful groups. It is also possible that the conquerors will adopt cultural traits of the conquered population, even perhaps becoming assimilated to their language and converting to their religion. From the point of view of CMLS, it does not matter whether cultural traits that win in competition belong to the conquerors or the conquered. What all these outcomes share is that one set of cultural traits is replaced with another set.
To test this theory we need to develop a set of proxies, that is, variables that correlate with warfare intensity (or the intensity of CMLS). Previous work has identified three classes of such variables (Turchin 2009, 2011). First, a variety of technological innovations serves to make warfare more decisive. The paradigmatic example is the development of gunpowder artillery in early modern Europe, which made medieval fortifications obsolete (Roberts 1956, Tilly 1990, Parker 1996). The losing side could no longer wait behind impregnable walls until the winners departed. A direct consequence of this innovation (although many other forces also played a role) was a gradual territorial consolidation of European states, from many hundreds in the fifteenth century to roughly 30 in the nineteenth century. Another consequence was rapid cultural evolution of both military and political technologies, such as the development of a new (and very expensive) type of fortification that could resist siege cannon, the so-called Trace Italienne, or 'Star Fort.' The introduction of gunpowder is not the only example of a military revolution (and intense cultural evolution) resulting from a technological innovation. Other examples include the spread of chariots and cavalry, new metals (bronze, iron, and steel) for weapons and armor, and new projectile weapons (such as the compound bow, the cross-bow, the catapult, and the trebuchet) (Turchin 2009).
The second class of variables affecting warfare intensity is geographical. Rugged terrain is easier to defend, and thus warfare should be more decisive in the plains than in hills and mountains. Navigable rivers, narrow straits, and internal seas enable rapid movement of navies and armies. On the other hand, dense forests impede troop movements. Essentially, any geographical feature that facilitates movement of armies and supplies should increase the intensity of competition between cultural groups.
The third class of variables relates to the willingness to wage 'total war' against the enemy. Is it permissible to slaughter noncombatants, destroy settlements, deport or even exterminate populations? Historical evidence suggests that when the cultural distance between the groups involved in a conflict is large, it becomes easier to dehumanize the enemy and to perpetrate against them what would be considered an atrocity if it were directed at a culturally similar group (Turchin 2011). Another question is whether people on the losing side should be forcibly converted or assimilated. Thus, certain ideologies, for example exclusive proselytizing religions (Stark 1996), may be associated with more intense styles of warfare.

Ritual as a CMLS Force
Many forms of cooperation required to maintain cultural groups, large and small, or to motivate wars between them, depend crucially on the cultivation of social cohesion. Although people can be forced to pay taxes, obey the law, or lay down their lives on the battlefield, coercion alone has limitations. A citizen who believes that the functions of the state should be funded from the public purse, that its judicial institutions have legitimacy, or that its wars are just, will be a more reliable contributor to these and other forms of collective enterprise. And the same principle applies also in much smaller groups, even down to the level of the conjugal unit, where resource pooling and child rearing require commitment and not just fear of punishment. In all human groups, whether large or small, political or religious, commercial or charitable, public or domestic, rituals play a crucial role in generating the social cohesion necessary for their effective operation. The superior efficacy of ritual cohesion, as opposed to legal or military compulsion, is a theme that can be found throughout the world's historical cultures; in early China, for instance, it was the primary focus of debates concerning political legitimacy (Cook 2004, Slingerland 2008). However, the specific manner in which rituals accomplish the end of social cohesion or shared values differs markedly depending on the scale on which cooperation is required and the levels of self-sacrifice for the group that are needed.
In the case of small groups, in which individuals face high risks and strong temptations to defect, self-sacrifice is often motivated through participation in dysphoric rituals, capable of promoting allegiances to a 'band of brothers' that can be even stronger than those found among actual kin. Examples include the ordeals of initiation cults, millenarian sects, and vision quests. Such 'imagistic' rituals (Whitehouse 2004, Atkinson and ...) are typically emotionally intense events that are experienced rarely (in some cases only once in a lifetime). The intensity of such rituals is exaggerated by extreme forms of deprivation, bodily mutilation and flagellation, and psychological trauma based around participation in shocking acts. These practices are very widespread in small-scale tribal societies (Whitehouse 1996), modern rebel groups (Whitehouse and McQuinn 2012), and some ancient civilizations (Whitehouse and Hodder 2010).
Experiments show that imagistic rituals typically involve intrinsically puzzling (causally opaque) procedures that trigger intense reflection (Richert et al. 2005). Such reflection appears to be an essential element in the process, producing perceptions not only of shared experience but also of shared insight and understanding, thereby strengthening relational ties among coparticipants as part of the creation of 'identity fusion' (Swann et al. 2012).
Until around ten thousand years ago all human groups were small-scale and based upon relational ties. Success in between-group competition probably depended largely on local fusion within face-to-face coalitions. Groups achieving high levels of fusion would have been better able to defend themselves against predation and also to appropriate resources from less cohesive groups. Changing climatic conditions in the late Holocene, however, enabled some groups (e.g. in the Middle East and Mediterranean) to colonize fertile valleys where between-group competition was greatly reduced. A current hypothesis is that as inter-group conflict diminished, the need for dysphoric rituals and local fusion also disappeared. From this point on, increasingly large portions of humanity were living in societies much too large for everybody to know each other, still less to be fused with them.
This new kind of society was based primarily on identification rather than fusion, and thus on categorical rather than relational ties (Swann et al. 2012). Collective rituals probably played a part in this process, but we propose that some of these new rituals were unlike any that had occurred before in the cultural repertoire: for the first time, rituals in these much larger societies were organized around daily or weekly cycles and the emotions evinced were far less intense. High-frequency ritual (or routinization) is a hallmark of world religions and their offshoots, but is also characteristic of a great many regional religions and ideological movements (Whitehouse 2000). Routinized rituals play a major role in the formation of large-scale identities, enabling strangers to recognize each other as members of a common in-group, facilitating trust and cooperation on a scale that would otherwise be impossible (Whitehouse 1995, 2004). Routinization heralds not only the first large-scale societies, but also the first complex political systems in which roles and offices are understood to be detachable from the persons who occupy them. Some routinized traditions, however, manage to get the best of both worlds: a mainstream tradition, constructed around regular worship under the surveillance of an ecclesiastical hierarchy, may tolerate much more colorful local practices involving rare, dysphoric rituals (such as self-flagellation at Easter parades in the Philippines or walking on red hot coals among the Anastenaria of Northern Greece). While these localized practices produce highly solidary groups distinct from the mainstream tradition, the resulting fusion can be extended to the larger community, rejuvenating commitment to its unremitting regime of repetitive rituals (Whitehouse 1995). Other patterns are also possible, however. For example, according to the 'pendulum-swing theory' of Islam (Ibn Khaldun 1958, Gellner 1969), rural tribes fused by high-arousal rituals formed the most formidable small military units in Muslim society, capable of periodically toppling urban elites, whose more routinized rituals and doctrinal beliefs failed to generate the kind of cohesion needed to mount an effective defense. Other major patterns include periodic splintering and reformation (Pyysiäinen 2004).

Religion as a CMLS Force
This hypothesis links the emergence of ultrasociality to religious beliefs and behaviors. It postulates that religious beliefs and behaviors have been maintained and strengthened because certain groups succeeded in integrating them into packages of cultural elements (beliefs, rituals, devotions). Such cultural packages deepened group solidarity by incentivizing trust and cooperation with supernatural punishments and rewards (Wilson 2002, Norenzayan and Shariff 2008, Norenzayan et al. MS). The gradual assembly of this cultural package, the hypothesis contends, was not only a key to the origin of large-scale societies, but also provides a convincing answer to the historical question of why religions with moralistic gods, rather rare amongst the panoply of human religious variety, have spread at the expense of other types of religion. This CMLS-based theory suggests that cultural groups with religions that best promote within-group cooperation and harmony tend to outcompete other groups. The hypothesized link between religion, group identity, and morality also potentially explains the persistence of religious belief in the face of countervailing evolutionary pressures, and lends credence to arguments postulating a link between moral evaluations and some sort of 'moral realism', the metaphysical grounding of prosocial norms.
We intend to use the database to test the claim that cultural evolutionary processes may have shaped the spread and recombination of certain representations into 'packages' of religious beliefs, institutions, and practices that served to extend and galvanize the human sphere of cooperation, trust, and exchange by a variety of mechanisms (Irons 1991, Wilson 2002, Sosis and Alcorta 2003, Turchin 2003b, Henrich 2009, Shariff et al. 2010). These include, but are not limited to:
• Supernatural Monitoring, Rewards, and Punishment, or the belief in omniscient supernatural watchers who monitor cooperation and trust among strangers (Norenzayan and Shariff 2008).
• Moral Realism, or the belief that one's moral intuitions are grounded in the metaphysical structure of the universe, which both explains their psychological force and justifies their imposition on others (Taylor 1989, Haidt and Kesebir 2010).
• Credibility Enhancing Displays, or hard-to-fake commitment displays: the seemingly costly rituals, devotions, and other actions that may effectively transmit and signal commitment to observers (Sosis and Alcorta 2003, Henrich 2009).
While some, or all, of these features are often taken to be typical of 'religions' in general, there is reason to suspect that they actually represent relatively novel products of a long cultural evolutionary process that has created a linkage between prosociality, morality, rituals, and deep commitments to supernatural agents or principles.

Analytical Approaches
The general purpose of the database is to provide the empirical basis for testing theories about the evolution of ultrasociality. In the previous section we discussed four such hypotheses, addressing the role of resources, warfare, ritual, and religion. This section describes how tests of these (and other) hypotheses can be conducted. Our ultimate goal is to publish, before we have collected any data, the set of predictions that we plan to test. Thus, the results of the analysis, when it is performed, will constitute 'strong inference' (Platt 1964), because although we are not predicting the future, we are predicting data that are not currently known.
It is important to note that, when viewed through the lens of cultural evolution, the variables we will code may play different roles, and this affects our analytical approaches. Cultural evolution can be defined as temporal change in the frequency of cultural traits. Ritual and religious variables are therefore examples of cultural traits that evolve. For instance, the belief in omniscient and omnipotent supernatural watchers may spread at the expense of the belief in omniscient, but relatively ineffective supernatural watchers, or the belief that such supernatural watchers do not exist.
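The definition of cultural evolution as temporal change in trait frequency translates directly into a simple tabulation over coded records. The following is a minimal sketch: the record layout (polity, period, trait present or absent), the polity labels, and the values are all hypothetical and are not drawn from the database.

```python
from collections import defaultdict

# Hypothetical coded observations: (polity, period as a calendar year, trait present?).
# The trait might be, say, belief in morally concerned supernatural watchers.
observations = [
    ("P1", -500, False), ("P2", -500, False), ("P3", -500, True), ("P4", -500, False),
    ("P1", 0, True), ("P2", 0, False), ("P3", 0, True), ("P4", 0, False),
    ("P1", 500, True), ("P2", 500, True), ("P3", 500, True), ("P4", 500, False),
]

def trait_frequency(obs):
    """Return {period: share of coded polities in which the trait is present}."""
    counts = defaultdict(lambda: [0, 0])  # period -> [n_present, n_total]
    for _polity, period, present in obs:
        counts[period][0] += int(present)
        counts[period][1] += 1
    return {period: n_present / n_total
            for period, (n_present, n_total) in sorted(counts.items())}

freq = trait_frequency(observations)
print(freq)  # {-500: 0.25, 0: 0.5, 500: 0.75}
```

In this toy example the trait spreads from a quarter to three quarters of the coded polities over the three periods; it is this kind of trajectory that selection forces such as warfare would then be asked to explain.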
Warfare can also be treated as something that evolves (for example, a military technology, such as a catapult, may be invented and spread with time). But warfare also plays a logically distinct role in the CMLS theory, being a selection force, that is, a process that explains why other cultural traits are spreading or disappearing. Characteristics of the resource base are akin to warfare, in that they are a condition (in some variants of the bottom-up hypothesis a necessary and sufficient condition) for the rise of social complexity. This is an important distinction because, within the framework of the CMLS theory, warfare, religion, and ritual are not mutually incompatible explanatory forces. Certain religious or ritual traits may function as the proximate reason why a large-scale society can exist without falling apart, while the ultimate reason can be identified as intergroup competition, perhaps taking the form of warfare. Although the distinction between proximate and ultimate mechanisms in cultural evolution is not absolute, these are nevertheless useful concepts in helping us design statistical approaches, and especially to interpret their results.
There is another way in which the hypotheses that we will test are nonexclusive. We can think of religious and ritual traits as 'devices' for nurturing and sustaining collective solidarity, which is necessary to solve collective action problems in large-scale cooperating societies. This means that, to a certain degree, different traits may substitute for each other, and we do not necessarily expect that having any particular trait would be a necessary condition for the transition to the next level of social complexity. We are undeterred by such complexities. Clearly, the analysis of the database will be a protracted process conducted by many teams of analysts. For the near future, however, it is sufficient to start designing empirical tests that contrast the predictions of the four hypotheses that we have discussed in the previous section. At the same time, our task as database designers is to ensure that its structure is open and flexible, and expandable without any artificial limits. This approach will ensure that the database will eventually acquire the rich and diverse data that will be needed to tease apart the various possibilities discussed above, among potentially many others.

Approaches to Testing the Bottom-up versus Warfare Hypothesis
As a concrete example of how we intend to use the database, it would be helpful to contrast two of the hypotheses we have discussed: resources versus warfare. Again, these two factors can (and probably did) work together in the evolution of large-scale societies. For conceptual clarity, however, we begin by formulating them as alternative hypotheses. We also focus on one set of patterns, spatio-temporal variation in social complexity over the course of human history. Accordingly, the goal of analysis is to determine which of the following outcomes is supported by the data:
• Simply knowing how the resource base varied in time and space is sufficient to predict the dynamics of the response variable (social complexity), whereas including measures of warfare/CMLS does not add anything to the explanation.
• Alternatively, the resource base may serve only as a threshold variable (e.g., presence or absence of agriculture), so that within areas where it exceeds the threshold all variation is explained by warfare/CMLS, whereas variation in the amount of resources (above the threshold) offers no additional predictive power.
• Both types of variables are necessary to capture spatio-temporal variability in social complexity. In that case, we need to estimate the relative contributions of the resource base and warfare.
• Lastly, it is possible that neither predictor variable has a statistically significant effect on the response variable. This is the null model.
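One way to picture the comparison among these outcomes is as a contest between nested regression models. The sketch below is an illustration under strong assumptions, not the analysis the project will run: it uses synthetic data in which both predictors matter by construction, ordinary least squares, and the Akaike information criterion (AIC) as the selection rule; all variable names are hypothetical. It covers only the linear versions of the outcomes (the threshold variant is omitted) and ignores the spatial and temporal autocorrelation that real historical data would exhibit.

```python
import math
import random

def ols_rss(predictors, y):
    """Least-squares fit of y on an intercept plus the given predictor columns.
    Returns the residual sum of squares (RSS)."""
    n = len(y)
    X = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    k = len(X[0])
    # Normal equations (X'X) beta = X'y, solved by Gaussian elimination.
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        b[c], b[p] = b[p], b[c]
        for r in range(c + 1, k):
            f = A[r][c] / A[c][c]
            for j in range(c, k):
                A[r][j] -= f * A[c][j]
            b[r] -= f * b[c]
    beta = [0.0] * k
    for c in reversed(range(k)):  # back substitution
        beta[c] = (b[c] - sum(A[c][j] * beta[j] for j in range(c + 1, k))) / A[c][c]
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in X]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

def aic(rss, n, n_coefficients):
    """Gaussian AIC up to an additive constant; lower is better."""
    return n * math.log(rss / n) + 2 * n_coefficients

# Synthetic data (an assumption): complexity depends on both predictors plus noise.
rng = random.Random(42)
n = 200
resources = [rng.gauss(0, 1) for _ in range(n)]
warfare = [rng.gauss(0, 1) for _ in range(n)]
complexity = [r + w + rng.gauss(0, 0.5) for r, w in zip(resources, warfare)]

models = {
    "null": [],
    "resources only": [resources],
    "warfare only": [warfare],
    "resources + warfare": [resources, warfare],
}
scores = {name: aic(ols_rss(cols, complexity), n, len(cols) + 1)
          for name, cols in models.items()}
best = min(scores, key=scores.get)
print(best)
```

Because the synthetic response was generated from both predictors, the combined model wins here. With real data the ranking is the empirical question, and a penalized criterion such as AIC is one way of making 'most parsimoniously explain' operational.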

Defining the Response Variable
So far we have talked about our response (social complexity) and predictor (resources and warfare) variables as though it is unproblematic to measure them for historical societies. Of course, this is not the case, and a discussion of how we can operationalize these variables is in order. We will use social complexity to illustrate our approach, but other variables will be treated in the same manner. Social complexity is a multidimensional variable, and researchers from different disciplines define it in different ways. Our approach is inclusive in that we make an attempt to code all (within reason) aspects of what different disciplines understand by social complexity (see the Appendix).
The first set of variables relates to the scale of societies: the total population, the size of the largest urban center, and the extent of territory controlled by the state (if the society is organized as a territorial state). Next come measures of hierarchical or vertical complexity. These focus on the number of control/decision levels in the administrative, religious, and military hierarchies. Another bundle of variables focuses on what may be called 'horizontal complexity.' Such measures quantify the extent or elaborateness of the division of labor in the economy and polity (for example, the estimated number of professions, or the number of different specialized workshops). A related class of variables codes for the characteristics of the bureaucracy and the judicial system. Informational complexity is coded by the characteristics of the writing and record-keeping (more generally, informational) systems. We also record whether the society has developed sophisticated literature, including history, philosophy, and fiction. Lastly, there are various proxy variables that archaeologists use to identify social complexity in the absence of written records. These include the presence of monumental buildings, the number of levels in the settlement hierarchy, and the presence of structures specialized for government, religion/ritual, and economic activity. Some variables in the list overlap with each other and so are redundant. This redundancy is by design: for many historical societies we will be unable to code a substantial proportion of the variables, and redundant variables can then serve as proxies for those with which they are highly correlated and for which data are lacking.
A more general question is whether there is some fundamental metric that can be applied to all societies along a spectrum from 'simple' to 'complex.' This is an empirical issue, and we will resolve it by running a statistical analysis on the data, once we have coded enough societies. For example, we can subject the data to Principal Component Analysis, and determine whether most of the variation is captured by one or by several principal components. In the simplest case, we will use the first principal component as our response variable, since, by definition, it will capture the most variance among multiple measures of social complexity. If there are important dimensions of social complexity that the first principal component does not capture, those too can be subjected to statistical analysis along the lines described below.
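The Principal Component Analysis step can be illustrated with a minimal sketch. The society-by-variable table below is invented purely for illustration (hypothetical columns: log population, log size of the largest settlement, number of hierarchy levels, presence of writing), and the function names are ours. The sketch standardizes the variables and extracts the first component by power iteration on the correlation matrix, which is one of several ways to compute it.

```python
import math

def standardize(rows):
    """Center each column and scale it to unit sample variance."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]

def first_principal_component(rows, iterations=200):
    """Return (loadings, share of variance, scores) for the first PC."""
    z = standardize(rows)
    n, p = len(z), len(z[0])
    # Correlation matrix of the standardized variables.
    corr = [[sum(r[a] * r[b] for r in z) / (n - 1) for b in range(p)]
            for a in range(p)]
    # Power iteration converges to the leading eigenvector.
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * corr[a][b] * v[b]
                     for a in range(p) for b in range(p))
    # Each society's score is its position along the first component.
    scores = [sum(r[j] * v[j] for j in range(p)) for r in z]
    # The trace of a correlation matrix is p, so eigenvalue / p is the share.
    return v, eigenvalue / p, scores

# Hypothetical coded societies (illustrative values, not real data).
societies = [
    [2.0, 1.8, 1, 0],
    [3.1, 2.5, 2, 0],
    [4.0, 3.2, 3, 0],
    [5.2, 4.1, 4, 1],
    [6.1, 4.8, 5, 1],
    [7.0, 5.5, 6, 1],
]
loadings, share, scores = first_principal_component(societies)
print("loadings:", [round(x, 3) for x in loadings])
print("share of variance on PC1:", round(share, 3))
```

If the share of variance on the first component is high, a single complexity score per society is a reasonable summary; if it is not, further components would need to be examined as well.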
Resource and warfare variables will receive a similar treatment. With resources, it may be possible to summarize them with a single measure, such as the carrying capacity (i.e. the maximum population size that can be sustainably supported in an area given its soil and climatic conditions, and assuming the technology level appropriate for the time period). Warfare, on the other hand, is likely to be multi-dimensional. Its technological and geographical aspects, for example, may be sufficiently uncorrelated to suggest using them as independent predictor variables (again, this is an empirical issue and will be resolved prior to running the regression analyses).

Analysis
Once we have our principal variables, the analysis will proceed along conventional lines (Turchin 2003a, 2005). Let X(i,t) be the response variable, the first principal component of social complexity in a spatial location i (e.g., Sicily, or the Valley of Oaxaca) during a given time period t (e.g. 2000-1900 BCE or 100-125 CE). Analogously, Y(i,t) and Z(i,t) are the predictor variables, measures of the resource base and warfare intensity at location i and time period t (there can be more than one Y and Z, if a single measure does not capture all important aspects of these predictor variables). The general model, then, is a nonlinear regression of the following form:

$$X(i,t) = f\big[X(i,t-\tau_1),\; Y(i,t-\tau_2),\; Z(i,t-\tau_3),\; \text{spatial autocorrelation terms},\; \text{the error term}\big] \qquad (1)$$

Here the various τ refer to time lags. This formula looks somewhat intimidating, but in the simplest form the model may look something like this linear regression:

$$X(i,t) = A_1 X(i,t-1) + A_2 Y(i,t-1) + A_3 Z(i,t-1) + \varepsilon(i,t) \qquad (2)$$

where the various τ (time lags) have all been set to 1 time step (e.g., a century). What this model says is that social complexity in any particular place (i) and any particular time step (t) depends on social complexity, resources, and warfare intensity a time step earlier (t-1). A1, A2, etc., are regression coefficients and ε(i,t) stands for error and autocorrelation terms. The reason we include the first term, X(i,t-1), in the model is to allow for the very real possibility that social complexity can build up only gradually. For example, in order for X(i,t) to increase to 4, X(i,t-1) during the previous century already needs to be at least 3. An alternative model would focus on increments in social complexity from one time period to the next, e.g., by taking ΔX(i,t) = X(i,t) - X(i,t-1) as the response, but such a specification is formally equivalent to Eqn (1). There is an additional possibility that social complexity may go through endogenously driven cycles. Such internally generated oscillatory dynamics can be captured by including in the right-hand side additional lagged terms, such as X(i,t-2).
The basic idea behind the analysis is simple. For example, if we use Eqn (2), do we need both terms (the ones involving Y and Z) in the model, or can we dispense with one (or both) of them? This result will tell us which of the hypotheses in the list above is supported by the data. In practice, however, there are a number of complexities that will need to be taken care of.
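To make the simplest linear, lag-1 form of the model concrete, here is a minimal sketch that regresses X(i,t) on X(i,t-1), Y(i,t-1), and Z(i,t-1) by ordinary least squares. Everything in it is an illustrative assumption: the two locations, the series values, and the 'true' coefficients are invented, the data are generated without noise so that the fit recovers the coefficients exactly, and no intercept is fitted (a column of 1.0 could be added to the design to include one).

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def ols(design, y):
    """Least-squares coefficients from the normal equations."""
    p = len(design[0])
    xtx = [[sum(r[a] * r[b] for r in design) for b in range(p)]
           for a in range(p)]
    xty = [sum(r[a] * yi for r, yi in zip(design, y)) for a in range(p)]
    return solve(xtx, xty)

def simulate(x0, ys, zs, coef):
    """Generate X(t) from X(t-1), Y(t-1), Z(t-1) with known coefficients."""
    a1, a2, a3 = coef
    xs = [x0]
    for t in range(1, len(ys)):
        xs.append(a1 * xs[t - 1] + a2 * ys[t - 1] + a3 * zs[t - 1])
    return xs

def lagged_rows(panel):
    """Pair each X(i,t) with the predictors one time step earlier."""
    design, response = [], []
    for series in panel.values():
        xs, ys, zs = series["X"], series["Y"], series["Z"]
        for t in range(1, len(xs)):
            design.append([xs[t - 1], ys[t - 1], zs[t - 1]])
            response.append(xs[t])
    return design, response

TRUE_COEF = (0.6, 0.3, 0.5)  # hypothetical A1, A2, A3
inputs = {  # hypothetical (X at t=0, resource series Y, warfare series Z)
    "location_1": (0.5, [1.0, 1.2, 1.1, 1.5, 1.4, 1.8],
                   [0.0, 1.0, 0.5, 2.0, 1.0, 0.0]),
    "location_2": (1.0, [0.5, 0.4, 0.9, 0.7, 1.0, 1.3],
                   [2.0, 0.0, 1.5, 0.5, 0.0, 1.0]),
}
panel = {name: {"X": simulate(x0, ys, zs, TRUE_COEF), "Y": ys, "Z": zs}
         for name, (x0, ys, zs) in inputs.items()}
design, response = lagged_rows(panel)
a1, a2, a3 = ols(design, response)
print("estimated A1, A2, A3:", round(a1, 4), round(a2, 4), round(a3, 4))
```

With real data the fit would not be exact, and the question of interest becomes whether the Y and Z coefficients can be dropped without materially worsening the fit.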
First, this modeling approach deals explicitly with temporal autocorrelations (by including lagged X in the regression analysis), but because our data have spatio-temporal structure, the analysis needs to address spatial autocorrelations as well. There are two ways of dealing with this issue. One is to use the Simultaneous Autoregressive (SAR) modeling approach. Another is to model spatial structure in the data directly by including in the model terms involving X, Y, and Z at other spatial locations (this approach will become more feasible as our spatial coverage becomes denser).
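The second, direct approach can be sketched as follows: for each location, compute a weighted average of a variable at neighboring locations in the previous time step, and add that 'spatial lag' as an extra predictor. The location names, values, and weights here are hypothetical, and row-standardized weights are only one possible choice.

```python
def spatial_lag(previous, weights):
    """Weighted mean of the previous-step values at each location's neighbors.

    previous: {location: value at t-1}
    weights:  {location: {neighbor: nonnegative weight}}
    Locations without neighbors get None, so they can be handled explicitly.
    """
    lagged = {}
    for place, neighbours in weights.items():
        total = sum(neighbours.values())
        if total == 0:
            lagged[place] = None
            continue
        lagged[place] = sum(w * previous[nb]
                            for nb, w in neighbours.items()) / total
    return lagged

# Hypothetical complexity scores at t-1 and hypothetical neighbor weights
# (for instance, inverse distances or shared-border indicators).
x_previous = {"A": 1.0, "B": 2.0, "C": 4.0}
neighbour_weights = {
    "A": {"B": 1.0, "C": 1.0},
    "B": {"A": 2.0, "C": 1.0},
    "C": {},
}
x_spatial_lag = spatial_lag(x_previous, neighbour_weights)
print(x_spatial_lag)
```

The resulting values would enter the regression as one more column alongside the lagged X, Y, and Z terms for the location itself.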
Second, we need to allow for the possibility of nonlinear effects. For example, the effect of a predictor variable could be ∩-shaped: initially rising and later falling. A simple way to test for such eventualities is by adding quadratic terms.
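As a small illustration of the quadratic-term test, the sketch below fits a response to a predictor and its square. The data are invented and follow an exact ∩-shaped curve, so the fitted coefficient on the squared term is negative and the turning point can be read off the coefficients. The names and values are assumptions made for the example only.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def ols(design, y):
    """Least-squares coefficients from the normal equations."""
    p = len(design[0])
    xtx = [[sum(r[a] * r[b] for r in design) for b in range(p)]
           for a in range(p)]
    xty = [sum(r[a] * yi for r, yi in zip(design, y)) for a in range(p)]
    return solve(xtx, xty)

# Hypothetical predictor (e.g., warfare intensity) and response values
# that rise and then fall: response = 4 * z - z ** 2.
predictor = [0.0, 1.0, 2.0, 3.0, 4.0]
outcome = [4.0 * z - z ** 2 for z in predictor]

# Design matrix with an intercept, a linear term, and a quadratic term.
design = [[1.0, z, z ** 2] for z in predictor]
b0, b1, b2 = ols(design, outcome)

# A negative quadratic coefficient indicates an inverted-U shape;
# the peak of the fitted curve lies at -b1 / (2 * b2).
peak = -b1 / (2.0 * b2)
print("quadratic coefficient:", round(b2, 4), "peak at:", round(peak, 4))
```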
Third, our database is likely to contain 'holes.' That is, there are likely to be many gaps in our knowledge about the values of variables in any particular time and location. This missing data problem can be dealt with by the statistical method of multiple imputation (White et al. 2011).
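Multiple imputation produces several completed copies of the dataset, each analyzed separately, and the results are then combined. The sketch below shows only the combining step, using the pooling formulas commonly known as Rubin's rules. The coefficient estimates and their variances are invented numbers standing in for the output of the same regression run on three imputed datasets.

```python
import statistics

def pool_rubin(estimates, variances):
    """Combine per-imputation estimates and their squared standard errors.

    Returns (pooled estimate, total variance), where the total variance is
    the mean within-imputation variance plus (1 + 1/m) times the
    between-imputation variance.
    """
    m = len(estimates)
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)
    between = statistics.variance(estimates)  # sample variance (m - 1)
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total

# Hypothetical warfare coefficients from m = 3 imputed datasets.
coef_estimates = [0.40, 0.50, 0.60]
coef_variances = [0.01, 0.01, 0.01]
pooled_coef, pooled_var = pool_rubin(coef_estimates, coef_variances)
print("pooled estimate:", round(pooled_coef, 4))
print("pooled standard error:", round(pooled_var ** 0.5, 4))
```

The between-imputation term is what carries the extra uncertainty due to the missing values: the more the imputed datasets disagree, the wider the pooled standard error.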
Finally, the analysis is likely to involve fitting multiple regression models, varying in the number of parameters. It is clear that the more predictor variables are employed in the regression, the higher is the proportion of variance 'explained.' Thus, we need to guard against 'overfitting,' i.e. fitting overly complex models. The standard approach in such applications is to use the Akaike Information Criterion (Burnham and Anderson 1998), which penalizes models for employing too many parameters.
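A minimal sketch of such a comparison, under several stated assumptions: the data are simulated (with a fixed random seed) so that both resources and warfare affect complexity, the lagged complexity term is omitted for brevity, and the least-squares form of AIC, n ln(RSS/n) + 2k, is used, which is valid up to an additive constant for models with normally distributed errors. The four candidates correspond to the null, resources-only, warfare-only, and combined models.

```python
import math
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def residual_sum_of_squares(design, y):
    """Fit by least squares (normal equations) and return the RSS."""
    p = len(design[0])
    xtx = [[sum(r[a] * r[b] for r in design) for b in range(p)]
           for a in range(p)]
    xty = [sum(r[a] * yi for r, yi in zip(design, y)) for a in range(p)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * xj for b, xj in zip(beta, row))) ** 2
               for row, yi in zip(design, y))

def aic(design, y):
    """Least-squares AIC; k counts coefficients plus the error variance."""
    n, k = len(y), len(design[0]) + 1
    return n * math.log(residual_sum_of_squares(design, y) / n) + 2 * k

# Simulated data (hypothetical): both predictors matter by construction.
rng = random.Random(0)
n_obs = 40
resources = [rng.uniform(0, 10) for _ in range(n_obs)]
warfare = [rng.uniform(0, 10) for _ in range(n_obs)]
complexity = [0.5 * y + 0.5 * z + rng.gauss(0, 0.5)
              for y, z in zip(resources, warfare)]

candidates = {
    "null": [[1.0] for _ in range(n_obs)],
    "resources only": [[1.0, y] for y in resources],
    "warfare only": [[1.0, z] for z in warfare],
    "resources + warfare": [[1.0, y, z] for y, z in zip(resources, warfare)],
}
aic_scores = {name: aic(d, complexity) for name, d in candidates.items()}
best_model = min(aic_scores, key=aic_scores.get)
for name, score in sorted(aic_scores.items(), key=lambda kv: kv[1]):
    print(f"{name:22s} AIC = {score:8.2f}")
```

Lower AIC is better. Because the simulated data were built from both predictors, the combined model should come out ahead here; with real data the ranking is the empirical result of interest.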

Testing Other Hypotheses
Although the actual analysis of the database will be labor- and computer-intensive, and will require sophisticated statistical tools, researchers in fields such as econometrics and population dynamics have developed methods of dealing with complexities such as those discussed above. Thus, we will not need to develop new statistical techniques. Our main task is to 'populate' the database as densely as possible and to make it as informative as possible (which also means that we need to code a great variety of societies to capture the whole extent of variation). Once this is done (or, at least, once the database achieves the critical mass), the rest is a matter of implementing existing techniques.
Statistical testing of the ritual and religion hypotheses will employ regression models similar to Eqn (1). The most logical procedure is to add these variables in a stepwise fashion. In other words, we start with a model such as Eqn (1), which only includes terms that have been found to have a statistically significant effect on the response variable, and then add terms to the right-hand side representing ritual and religion effects. For example, a ritual term may encapsulate the important features of the most frequent widespread ritual in the 'culturally dominant religious tradition' (either a state religion/official cult or, if none exists, the religion/cult with a majority of adherents), including the frequency with which it is performed, its inclusiveness (how widespread it is), and the degree of cohesion that it engenders. Similarly, a religion term would code for the prevalence and costliness of Credibility Enhancing Displays (CREDs) in the culturally dominant religious tradition.
Three comments are in order. First, care is needed to ensure that the same data do not appear on both sides of the regression equation. For example, our broader measures of social complexity include such variables as the size and costliness of specialized buildings. However, such buildings can also serve as CREDs, and including this measure on both sides of the regression equation can lead to spurious correlations. We can deal with such factors in two ways. One is to define social complexity in the narrow sense, for example, by focusing only on its scale and hierarchical complexity aspects (see Appendix). The second approach is to omit any aspects of social complexity that can cause spurious correlations, and recalculate the first principal component from such a reduced list.
Second, recall that intensity of warfare-as-CMLS is an evolutionary selection force, while prevalence of doctrinal rituals, or of CREDs, is a cultural trait. This means that we need to analyze the possible causal chains involved in the rise of social complexity. For example, does warfare intensity cause both the frequency of such cultural elements to increase and, independently, select for larger scale, more complex societies? Or is the frequency with which such cultural innovations arise independent of warfare, but once they arise, they make societies possessing them more competitive? The point here is that causation can work in complex ways, and having an explicit temporal component in the database enhances our ability to tease apart these possibilities at the analysis stage.
Third, we can (and should) investigate the possibility that cultural elements such as doctrinal rituals and CREDs increase the longevity of complex societies. In other words, instead of making them more competitive in their interactions with rival polities, they make polities possessing them more stable to internal perturbations. (Of course, it is also possible that both processes, enhancement of competitive ability and of internal stability, are affected by the same ritual and religious traits.) Again, such a question can be approached with an appropriate analysis of the database. But instead of looking at the probability that social complexity increases, we need to examine what factors are associated with a reduced probability of complexity decreases. In other words, it would be very interesting to determine what cultural characteristics prevent, or at least stave off, societal collapse.

Conclusions
Some years ago Randall Collins (1994) pointed out that the natural sciences are typically characterized by rapid discovery of new phenomena and a high degree of consensus among the practitioners, once the research front has moved away. For example, biologists generally agree that Darwin's version of the evolutionary theory has decisively won over that of Lamarck. The social sciences, by contrast, exhibit low levels of consensus even on core issues. Each new generation of scholars is as likely to reject the ideas of their predecessors as to endorse them. Collins argued that rapid discovery and high consensus are related: "high consensus results because there is higher social prestige in moving ahead to new research discoveries than by continuing to dispute the interpretation of older discoveries." Although Collins was skeptical about the ability of social science to break out of this mold and transform itself into a rapid-discovery science, we think that he was unduly pessimistic. Consider the anthropological database called the Standard Cross-Cultural Sample, or SCCS (Murdock and White 1969). The SCCS codes 186 cultures for a great variety of social, economic, and political variables. The introduction of this database was a truly transformative event in cross-cultural research. It held out the prospect of transforming cultural anthropology into a rapid-discovery science characterized not by cyclic development (in which each new generation rejects the insights of their elders) but by a cumulative growth of knowledge. The SCCS made knowledge accumulation possible in at least two ways. First, although Murdock and White initially coded only a few dozen variables, over the last four decades other researchers have added hundreds of additional variables. The total count currently approaches 2000 variables.
Another way in which knowledge can accumulate is by different teams of investigators analyzing the data in the SCCS and correlating them with data from other databases (for example, linguistic and economic). Naturally, many articles challenge and even reject the results of previous analyses, but that is simply the self-correcting process typical of any scientific field, including the natural sciences. Cumulative progress takes place when newer analyses use improved methodologies, or bring additional data into consideration, instead of simply rejecting what came before.
According to Google Scholar, between 70 and 80 articles used data from the SCCS every year during the last decade. Overall, more than 1200 analyses of the SCCS data have been published since its introduction in 1969. And this impressive body of research has accumulated despite several serious limitations of the database that restrict its application to the study of sociocultural evolution.
The most important limitation is that the SCCS is a synchronic or static database. In other words, it codes the characteristics of any particular society at a single point in time. Sociocultural evolution, however, is all about change. Anthropologists and other social scientists have designed clever approaches to get around this problem, for example, using the methods of phylogenetic analysis developed in evolutionary biology (Mace and Holden 2005, Currie and Mace 2009, Fortunato and Mace 2009). But it would make much more sense to code and analyze how societies evolved in time-after all, the data are there.
Another limitation is that the SCCS is dominated by stateless societies (79 cases) and societies with minimal states (50 cases). There are only 34 societies with large states. Such a sample makes sense from the point of view of a cultural anthropologist who is particularly interested in small-scale societies. But the SCCS is not really suitable for testing hypotheses about the evolutionary transitions between small-scale and large-scale societies. Additionally, the SCCS codes only 186 cultures (so the maximum sample size is n = 186). Initially this was done in order to overcome Galton's problem, which arises because the units of analysis in cross-cultural studies, 'cultures,' are not statistically independent. Autocorrelations between different cultures, which may arise as a result of common descent or cross-cultural borrowing, invalidate standard statistical tests. However, the attempt to avoid Galton's problem by selecting a sample of cultures did not work, because the autocorrelations were still there. And it is not even necessary, as modern statistical approaches allow us to deal with Galton's problem at the analysis stage, rather than by throwing data away (Eff and Dow 2009, White et al. 2011).
Other anthropological and archaeological datasets overcome some limitations of the SCCS. For example, the Ethnographic Atlas (Murdock 1967) coded 1167 societies. However, like the SCCS it is a static database. Peregrine's (2003) Atlas of Cultural Evolution and the related Encyclopedia of Prehistory (Peregrine and Ember 2001) attempt to capture the time dimension, but their time step is one thousand years. The huge corpus of knowledge about past societies collectively possessed by academic historians is almost entirely in a form that is inaccessible to analysts (stored in historians' brains or scattered over heterogeneous notes and publications). Its huge potential for advancing the study of social evolution has been largely untapped.
The preceding review is not a warrant for pessimism. All the datasets that we mentioned, and those we did not, have been highly useful resources. We point out their limitations with the goal of overcoming them in the database that we are building. Clearly, our database will also be limited in some ways. Nevertheless, it should have the same transformative effect on the field of historical social science that the SCCS provided for cross-cultural research.
The importance of the database will not be limited to the world of academic science. Sometimes it is forgotten that our own modern societies did not suddenly appear 20 or 50 years ago-instead they evolved over many centuries and millennia. History matters. For example, recent research indicates that the degree of economic development today is strongly correlated with that of 1500 CE, which in turn was influenced by 1000 BCE, and perhaps even by conditions that obtained 10,000 years ago (Acemoglu et al. 2001, Diamond and Bellwood 2003, Olsson and Hibbs 2003, Comin et al. 2010, Acemoglu and Robinson 2012). The ability of populations to construct and maintain viable states is also strongly conditioned by history. For example, the efficiency of provincial governments in Italy is strongly correlated with the vibrancy of civic life in the province during the Renaissance (Putnam et al. 1993). In fact, the roots of the North-South split in the ability to cooperate may go all the way to the times of the Late Roman Empire (Turchin 2006). The ability of different regions of Afghanistan to maintain local peace and order is similarly strongly conditioned on the previous history of state building in the region (Barfield 2010). Furthermore, the abilities to cooperate in the political and the economic spheres are probably interrelated (Acemoglu and Robinson 2012). For example, Bockstette et al. (2002) showed that state antiquity is significantly correlated with political stability, institutional quality, and the rate of economic growth between 1960 and 1995. The potential implications for public policy makers are obvious. What we are proposing here is a new way of analyzing the human past with the aim of explaining core features of sociocultural evolution scientifically. This effort will not replace traditional forms of historiography or archaeology. Rather, it will greatly extend their intellectual scope, explanatory potential, and relevance to the contemporary world.
We offer an open invitation to our colleagues with specialist knowledge of historical regions and periods to join us in establishing this new and ambitious research program. In doing so we will be laying foundations together for research for generations to come.