Friends from Afar: The Taiping Rebellion, Cultural Proximity and Primary Schooling in the Lower Yangzi, 1850-1949

Yu Hao‡
Peking University, School of Economics

Melanie Meng Xue§
UCLA Anderson School of Management

This Version: June 2016

Abstract

This paper tests the hypothesis that the cultural distance between migrants and natives impedes public goods provision. The Taiping Rebellion was a shock that caused groups without a history of shared governance to be relocated into the same region. We use a unique historical dataset of surnames in the Lower Yangzi of China to construct a measure of the cultural distance between migrants and natives (MNCD). We find that a one-standard-deviation increase in MNCD is associated with a decrease of over 0.19 public primary schools per 10,000 persons in the early 20th century. Results survive various robustness checks and an instrumental variable analysis exploiting pre-existing cultural distances between natives and the nearby population. Evidence from the timing of MNCD taking effect suggests that the primary mechanism runs from migrant-native cultural distance through the quality of collective decision-making to modern primary education.

Keywords: Cultural Distance; Primary Education; Local Public Goods; Quasi-Exogenous Migration

JEL Codes: D72, J15, N45, N95, O15, Z1

We thank the editor Debin Ma, two anonymous referees, as well as Ying Bai, Zhiwu Chen, Qiang Chen, Gregory Clark, Christian Dippel, Mark Koyama, James Kung, Nan Li, Kris Mitchener, Jean-Laurent Rosenthal, Tuan Hwee Sng, Romain Wacziarg, Noam Yuchtman and conference participants at the All-UC Conference on “Frontiers in Chinese Economic History”, Chinese University of Hong Kong, Shandong University, Xiamen University and Peking University for helpful comments and suggestions. All remaining errors are the responsibility of the

I. INTRODUCTION

An extensive literature documents the negative impact of population heterogeneity on public goods provision (Easterly and Levine, 1997; Alesina, Baqir, and Easterly, 1999; Alesina and Ferrara, 2005). However, recent research suggests that the lack of a history of shared and centralized governance between groups is just as likely to be responsible for the adverse outcomes associated with the coexistence of different ethnic groups (Gennaioli and Rainer, 2007; Michalopoulos and Papaioannou, 2013). This raises the question of whether ethnic cleavages or artificial jurisdictions have caused poor economic performance. Dippel (2014) contributes to the debate by showing that a lack of a history of shared governance can negatively affect even ethnically and linguistically homogeneous populations. We instead show that even for previously detached groups, cultural distance can matter for the coexistence of different ethnic groups.

We exploit variation in cultural distance between previously detached groups following an external shock: the Taiping Rebellion. We use a unique dataset of Chinese surnames of approximately 100,000 individuals over the course of 150 years. The same dataset also allows us to build common measures of population heterogeneity such as fractionalization and polarization.

The Taiping Rebellion (1850-1864) was a massive civil war in South China that constituted a one-time shock to the population makeup. The rebellion led to the loss of 17 million lives in the Lower Yangzi, or half of the native population (Cao, 1998). After the war, migrants flocked into the region and began to coexist with natives. This shock created two groups without a history of shared governance or prior interaction in a region. Cultural proximity between migrants and natives varied.
We hypothesize that cultural distance between migrants and natives (“MNCD”), who lived in the same community after the rebellion, had a negative impact on public goods provision.

We provide the historical context to show that migration was plausibly exogenous to the cultural distance between migrants and natives. First, migrants moved into the area with little prior contact with natives. Migrants were not selected based on their cultural proximity with the natives (as is the case with chain migration). Ex-ante sorting was minimal. Second, in traditional China, where ancestral land was of cultural prominence, natives were not able to move out freely in response to the arrival of migrants whose preferences differed. Hence ex-post self-sorting was not a concern either.¹ To further establish causality, we introduce an instrumental variable approach exploiting variation in pre-existing native-nearby cultural distance.

¹ Compared to Ager and Brückner (2013), we use arguably more exogenous migration as a treatment, as natives and migrants had few opportunities to engage in ex-ante screening or ex-post self-sorting.

We go on to test our hypothesis that a greater migrant-native cultural distance lowers public goods provision. Our proxy for public goods provision is the number of public primary schools at the county level. In the baseline model, we find that a one-standard-deviation increase in MNCD is associated with a decrease of 0.19 public primary schools per 10,000 persons between 1900 and 1910. That is a fifth of the mean number of public primary schools by population, or 40% of the standard deviation. We then include in the controls share of arable land, distance to the Grand Canal, distance to the Yangtze River, distance to the provincial capital, and distance to Shanghai. We show that MNCD wins horse races against alternative explanatory variables including the traditional fractionalization index and the polarization index. For robustness, we control for initial conditions and interventions in education (missionary activities and temple conversion), and confront possible effects of war (battle exposure, demographic shock and human capital shock) on schools.

While our key finding is that cultural distance has an independent effect on public goods provision outcomes, we find evidence that the negative effects of the cultural distance between surname groups can be mitigated by the history of shared governance. This finding rests mainly on the horse-race results of MNCD against other measures of population heterogeneity that ignore the history of shared governance within native surname groups.

We provide suggestive evidence on the mechanisms through which MNCD prevented the establishment of public primary schools. First, we exploit institutional features of early 20th century China to form a testable hypothesis: MNCD should have the strongest effect on lower-primary and dual-primary schools, since (a.) MNCD should matter the most in an environment of self-governance, (b.) villages were traditionally self-governed, and (c.) villages (and townships) were responsible for the building of lower-primary schools, and sometimes, dual-primary schools.² Consistent with our prediction, we find that MNCD only affects lower-primary and dual-primary schools, not upper-primary and secondary schools. Second, we exploit time variation in institutions through the first half of the 20th century.
Massive institutional changes in 20th century China provide an excellent laboratory to observe the impact of MNCD. In our sample, we have periods of decentralization and centralization of the education system, when decisions to educate children were made locally and when those decisions were made by the national or provincial government, respectively. This allows us to use evidence from timing to interpret our finding. We expect to see a larger effect of MNCD during the period featuring more self-governance and more decentralization of fiscal authority. Consistent with this prediction, we find the effect of MNCD on modern education is pronounced in the early 20th century but is muted in China for much of the 20th century under autocratic rule and fiscal centralization. Soon after the centralization of the educational system in 1927, MNCD no longer had a significant effect on schools. We conclude that MNCD resulted in fewer primary schools being built due to lower quality of collective decision-making in local communities.

² At the time, the entire phase of primary education was divided into two: upper- and lower-primary education. “Upper primary education” refers to the more advanced stage of primary education. Most schools specialized in either upper- or lower-primary education. Those providing both upper and lower primary education were called “dual-primary schools”.

Our study builds on the literature on the relationship between diversity of individual preferences and public goods provision. Alesina, Baqir, and Easterly (1999) show theoretically that the median distance from the preference of the median voter can be considered as an indication of how polarized preferences are. The model predicts that public goods provision will be adversely affected in a polarized society characterized by two separate groups with relatively homogeneous preferences within the group, but very distinct preferences across groups. More recent work shows that in the process of decentralization and redistricting, the benefits of reduced diversity can be undone if the newly governed population is highly polarized (Bazzi et al., 2015). In our paper, we use the cultural distance between migrants and natives as a proxy for the difference in preferences between these two groups. We find that cultural distances between groups indeed matter for public goods provision, whereas the traditional fragmentation measure that assigns the same distance to all groups does not produce the same effects.

Our study also contributes to the literature on the effect of genetic dissimilarity on economic development. Ashraf and Galor (2013) find both beneficial and detrimental effects of diversity on productivity, and conclude that an intermediate level of diversity is the most conducive to economic development. Desmet, Le Breton, Ortuño-Ortín, and Weber (2011) link genetic distance to the stability and breakup of nations, and provide empirical support for the use of genetic distance as a proxy for cultural heterogeneity. Spolaore and Wacziarg (2009) show that genetic distance affects income differences across countries through a barrier effect to the diffusion of development from the world technological frontier. Our paper similarly uses genetic distance as a proxy for cultural distance, and focuses on the public goods provision consequences of greater genetic and cultural distances between groups.

This paper is organized as follows. Section II explains the historical context.
Section III discusses data sources and the basis for constructing our measure of migrant-native cultural distance. Section IV summarizes our baseline results and the comparison of migrant-native cultural distance to the fractionalization and polarization of the population. Section V introduces a number of robustness checks, accounting for initial conditions, interventions in education and war-related conditions. Section VI comprises an instrumental variable analysis. Section VII identifies the quality of collective decision-making as a possible channel for migrant-native cultural distance to influence the supply of modern primary education. Section VIII concludes the paper.

II. HISTORICAL CONTEXT

A The Taiping Rebellion

The Taiping Rebellion was a massive civil war in South China which lasted from 1850 to 1864. At least 17 million people, or half of the populace, died in the Lower Yangzi (Cao, 1998; Cao and Li, 2000). Battles broke out throughout the Lower Yangzi and all counties, with the exception of Shanghai, were occupied for at least 3 months. The area around Nanjing, the capital city of the Taiping regime since 1853, had lingering conflicts for over ten years. The most prosperous and important cities in the Lower Yangzi, Hangzhou and Suzhou, were occupied by the Taiping army after 1860. Shanghai, protected by foreign powers, was the least affected, and it served as a shelter for over 200,000 refugees (Ge, 2002a, pp. 62–63). Famine and plague followed the battles. So did mass migration.

A.1 In-Flow Migration

Migration occurred both during the Taiping Rebellion itself and in the aftermath of the rebellion. While migration internal to the Lower Yangzi was certainly common, post-Taiping migration was best characterized by long-distance migration from North China and from the Middle Yangzi River. A crucial difference between pre- and post-Taiping migration is that pre-Taiping migration was largely driven by income differences and job opportunities, and was based on ethnic bonds and geographic proximity (Li, 2011), whereas post-Taiping migration originated from a very wide range of geographic areas and featured diverse economic and cultural backgrounds. Another difference is the scale and pace of migration: post-Taiping migration was far more rapid and broader in scale. For this reason, in this paper we focus on post-Taiping migration.³

The mass migration led to conflicts between natives and migrants, and between different migrant groups. In villages and townships conflicts arose as a result of clashing preferences and interests, different dialects, skills and social customs. Conflicts, as documented in local gazetteers, took place over a wide range of issues such as usage of public water, property rights of ownerless land and eligibility for imperial exams (Ge, 2002a, pp. 303-308).

A.2 The Economic and Political Consequences of the Taiping Rebellion

The Taiping Rebellion constituted a multi-dimensional shock to the region. Most likely, it affected primary schools in more than one way. Those effects could be at play on both the supply and demand side of education.
In Section V.C, we provide a quantitative analysis of how various outcomes of the Taiping Rebellion might have affected the building of primary schools in the early 20th century.

The rebellion damaged local infrastructure. In the Jiaxin prefecture of Zhejiang, 21% of Buddhist and Taoist temples were destroyed by rebels affiliated with the God Worshiping Cult (Li, 2002). County public schools and private schools, where lower degree holders received further instruction to prepare for the higher level exams, were also destroyed or damaged in large numbers. The destruction of local infrastructure could arguably have undermined the resources useful to the launch of modern schools fifty years after the rebellion. That said, it is widely documented that most temples and schools were restored shortly afterwards and even more were built in the late 19th century. For example, Li (2002) found that 98 temples were destroyed by the rebels but 220 were built (or restored) within twenty years after the rebellion because living standards were even better than before the war and trade and commercial networks were quickly restored. Kuhn (2002), to the contrary, interpreted this trend (along with the rise of local charity organizations) as the rise of local gentry at the provincial and county level overseeing local public affairs, which eventually led to the formalization of local self-governance in the early 20th century.

³ The provincial governments advertised all around China for migrants and depicted the Lower Yangzi as a ‘kingdom of free land’ and the ‘land of opportunities’. Farmers from Henan, Anhui, Hubei, Hunan, North Jiangsu, and South Zhejiang came for a better living (Ge, 2002a, pp. 100-106). After 1900, industrialization drew massive numbers of immigrants into the urban area of Shanghai (Junya and Wright, 2010). Its population increased fourfold from 1907 to 1947.

The rebellion dismantled kinship networks. Clans used to provide financial aid for clan members to receive education. During the rebellion rich families migrated to the urban area with their less well-off relatives left behind in the countryside. Clans also lost land property to the war, the rent from which was assigned as public funds for supporting education (Li, 1981). As discussed by Xu and Yao (2015), kinship networks are an alternative to formal institutions in providing public goods by effectively overcoming free-riding problems. However, in the context of education reform in the 1900s, strong kinship networks can be a double-edged sword. Clans sometimes would prefer the option of funding informal and private tutoring exclusively enjoyed by clan members to establishing a school accessible by both clan and non-clan members. A weaker kinship network, in that case, may have reduced within-kinship public goods but enhanced cross-kinship public goods.

The rebellion led to huge population losses, which induced higher land-labor ratios and higher wages (Cao and Chen, 2002). High wages forced war-stricken areas to abandon subsistence agriculture and switch to labor-saving technologies and industries. In Wujin and Wuxi, the silk industry superseded rice farming to be the largest employer in the rural area (Mickey and Shiroyama, 2009). Lin and Li (2014) show that areas with a larger impact of war saw more modern industrial enterprises in the late 19th century and had a higher level of urbanization in the 1930s.
In addition, the rebellion inadvertently created political room for institutions in favor of modernization to take root. Pro-reform officials were assigned to post-war provinces. They established formal institutions to promote industrialization.

B Educational Reforms: From Traditional to Modern Education

Fifty years after the Taiping Rebellion, the Qing Government put forward an educational reform. The abolition of the imperial exam system went hand in hand with the attempt to establish a western-style, modern school system.

Prior to 1905, education focused on Confucian classics and aimed at preparing students for the imperial examinations. The traditional educational system included two stages: mass primary education aimed at basic literacy and talent spotting, and more advanced education that drilled candidates selected from the first stage to pass the exams (Leung, 1994). In the late 19th century growing economic openness gave rise to higher demand for education in science, technology, and other non-exam skills (Yuchtman, 2015). Attempts by missionaries to build modern schools began in some coastal cities as early as the 1860s. But only after the abolition of the exam system in 1905 did modern education begin to expand. The Ministry of Education was established, and Offices of Provincial Education were founded, along with county-level agencies known as “Education Exhorting Offices” (quan xue suo).

Educational reform was not a smooth process. Despite ambitious political and educational reforms, few things changed on the ground. For villages and townships, the process of building modern schools was slow and painful. County officials often found it difficult to raise county taxes, and make within-county transfers to ensure universal primary schools. Clans sometimes would prefer to provide direct financial aid to clan members for them to take cheaper informal and private tutorships, rather than establish a school open to both clan and non-clan members. More details concerning both the institutional features of the traditional exam system and of the modern education system are included in the appendix (Appendix D and Appendix E).

III. DATA AND MEASUREMENT

Table A-1 provides summary statistics of all the variables used in the paper and their data sources. Below we focus on the underlying logic of our independent variable, migrant-native cultural distance.

A Independent Variable: Migrant-Native Cultural Distance

Our independent variable is the cultural distance between migrants and natives. We rely on surname data to construct our measure. To be specific, we use differences in the surname mix to proxy for the cultural distance between migrants and natives:

$$\mathrm{MNCD}_i = 1 - \text{normalized isonomy}_{N,M+N,i}, \quad \text{where} \quad \text{normalized isonomy}_{N,M+N,i} = \frac{\sum_{k}^{S} P_{k,N,i}\,P_{k,M+N,i}}{\sqrt{\sum_{k}^{S} P_{k,N,i}^{2}\,\sum_{k}^{S} P_{k,M+N,i}^{2}}}. \tag{1}$$

$S$ is the number of surnames in the two groups. $P_{k,N,i}$ and $P_{k,M+N,i}$ are the relative frequencies of surname $k$ within natives and within the entire population including natives and migrants.⁴ The isonomy between the native population and the entire population, $\sum_{k}^{S} P_{k,N,i} P_{k,M+N,i}$, measures how likely any individual randomly drawn from within natives bears the same surname as one drawn from within the entire population. We normalize it with the isonomy of the native population, $\sum_{k}^{S} P_{k,N,i}^{2}$, and the isonomy of the entire population, $\sum_{k}^{S} P_{k,M+N,i}^{2}$. MNCD captures how culturally dissimilar natives and migrants were.
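As a rough illustration of Equation (1), the sketch below computes MNCD for a single county from two lists of surnames: one sampled before migration (natives) and one sampled after migration (natives and migrants). It is not the authors' code. The function names `relative_frequencies` and `mncd` and the toy surname lists are hypothetical, and the sketch assumes the form MNCD = 1 − normalized isonomy, with the normalization being the square root of the product of the two within-group isonomies.

```python
from collections import Counter
from math import sqrt


def relative_frequencies(surnames):
    """Map each surname to its share of the sample (the P_k terms)."""
    if not surnames:
        raise ValueError("surname sample must not be empty")
    counts = Counter(surnames)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def mncd(native_surnames, post_migration_surnames):
    """Hypothetical helper: 1 minus the normalized isonomy between the
    pre-migration sample and the post-migration sample of one county."""
    p_native = relative_frequencies(native_surnames)
    p_all = relative_frequencies(post_migration_surnames)

    # Cross isonomy: chance that a random native and a random member of the
    # post-migration population share a surname.
    cross = sum(p * p_all.get(name, 0.0) for name, p in p_native.items())

    # Within-group isonomies used for normalization.
    iso_native = sum(p * p for p in p_native.values())
    iso_all = sum(p * p for p in p_all.values())

    return 1.0 - cross / sqrt(iso_native * iso_all)


# Toy data (made up): four natives, then the same natives plus four migrants.
natives = ["Wang", "Wang", "Li", "Li"]
after = natives + ["Chen", "Chen", "Chen", "Chen"]
print(round(mncd(natives, after), 4))
```

Under these assumptions the measure is 0 when the surname mix is unchanged by migration and 1 when the two samples share no surnames, so larger values indicate migrants whose surname distribution differs more from that of the natives.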
Figure 1 illustrates migrant-native cultural distance in the Lower Yangzi.

Figure 1: Migrant-Native Cultural Distance

Our approach is in line with Bai and Kung (2011, 2015); Li (2011); Spolaore and Wacziarg (2009). Li (2011) uses surname distances between pairs of countries or regions at a given time to measure multilateral genetic and cultural distance. Following Du et al. (1997), Bai and Kung (2011) and Bai and Kung (2015) use isonomy (similarity in surname distribution between any pair of populations) to approximate genetic and cultural distance across regions. They show that this surname-based measure is strongly correlated with a measure of genetic distance based on the frequency distribution of the A, B and O alleles of the ABO gene at the province level. They also find that the surname-based measure is strongly correlated with a measure of cultural distance based on dialects. Similar to Spolaore and Wacziarg (2009), they find that the smaller the genetic or cultural distance from a region to the technological frontier, the faster technology diffused to that region and hence the faster its growth. In this paper, we adopt the same measure as in Bai and Kung (2011); but instead of using isonomy between two regions, we use isonomy between the before-migration population (natives) and the after-migration population (natives and migrants) to proxy for the consanguinity between natives and migrants.

⁴ In practice, we extract migrant-native cultural distance from the distance in the surname distribution of a county’s population before and after the rebellion, with the assumption that the surname mix of the natives remained relatively stable during that period. It is clear that the total population of natives declined after the rebellion, but as long as the proportion of each surname population to the total population did not change, our assumption remains valid.

We do not have data on which surnames were associated with migrants, but it is generally the case that migrants as a group were different from natives in terms of surname distribution, especially when they came from far away. To obtain the surname distribution of a county before migration, we hand-collect the following data from county chronicles: (a.) name lists of civilians who died during the rebellion (1851-1865), (b.) exam degree holders (1645-1850), (c.) surnames of chaste women, and (d.) surnames of their husbands if they are also recorded. The number of records for each county ranges from 800 to 3000. To obtain the surname distribution of a county after migration, we use the following sources: (a.) surnames of dead soldiers (1927-1953) and (b.) college students born in that county who graduated between 1900 and 1949. The number of surnames from the above sources ranges from 500 to 2500 per county. More details can be found in Appendix B.

We draw surnames of individuals from a wide range of social backgrounds to improve the representativeness of the surname sample. One concern might be that these samples seem too small to correctly estimate the true surname distribution of the population if each surname accounts for a small fraction of the population. However, this is not the case for China. In each county the largest few surnames each account for more than 5% of the population. To correctly estimate the surname distribution of the population, at least for the largest few surnames, one needs a sample of only a few hundred individuals. For the six counties where we do have a small-sample problem, we do not include them in most of our regressions. As counties with a small sample are not random, we impute our outcome variable for those six counties, and run regressions on the sample including them as well.

One way to check the validity of our measure is to cross-check it with measures reflecting cultural or ethnolinguistic divisions. One indicator of cultural divisions in the Lower Yangzi is linguistic enclaves. Before the rebellion, almost the entire Lower Yangzi spoke Wu. After the rebellion, migrants settled in this area, giving rise to thousands of Mandarin-speaking villages and communities (Huang, 2004; Cao, 2006; Simmons et al., 2006). In Figure 2, we mark counties with linguistic enclaves where Mandarin (guan hua) is used. As shown on the map, counties with some of the highest values of MNCD harbor linguistic enclaves even today.
This enhances our confidence in the validity of our measure.

Figure 2: Mandarin Linguistic Enclaves in the Lower Yangzi (Wu-Speaking). Source: Cao (2006)

Ideally, we would also like to construct a weighted fragmentation index that takes into account whether individuals with the same surname are both natives, both migrants, or one native and one migrant. The fractionalization of the population comes down to the dissimilarity of the options preferred by different individuals. Rather than simply stipulating that options are either similar or dissimilar, Bossert et al. (2003) propose alternative frameworks that permit more degrees of similarity between options. A generalized index of fractionalization is described in Bossert et al. (2011). Drawing on the insights from Bossert et al. (2003), Alesina and Ferrara (2005) and Caselli and Coleman (2013), we can assume the distance between two individuals with the same surname, if one is a migrant and the other is a native, to be positive; and the distance between two individuals with the same surname, both being natives or both being migrants, to be zero. Unfortunately, without individual-level or surname-level data on migrant-native status, our ability to operationalize this index with our data is limited.

B Control Variables

To account for other factors shaping modern education, we include in the baseline controls the primary school rate in 1880, population size and urbanization rate. Culturally dissimilar migrants might be selected into places with economic conditions that are not in favor of public goods provision and schooling. So in the full set of controls, we include ruggedness, share of arable land, agricultural suitability, distance to the Yangtze River, distance to the Grand Canal, distance to the provincial capital and distance to Shanghai to capture other differences in economic conditions between counties. For robustness, we account for initial conditions (the number of charitable organizations by 1850, and population and population density in 1820) and for measures of the impact of war: battle exposure (months), % of elderly and youth (under 20 or over 40), % of adult men (between 20 and 40) and a measure of differences in human capital between migrants and natives which we call the “human capital shock”. We infer that natives who were able to set up more charitable organizations have higher social capital, which is likely correlated with both the type of migrants they admit and retain, as well as with their own ability to provide public education. We also discuss other potential shocks to education, such as missionary activities, measured by the log of one plus communicants per 10,000, and temple conversion, measured by the log of one plus the number of temple-converted schools. We expect both variables to be positively associated with our dependent variable.

IV. BASELINE RESULTS

We use an OLS model to estimate the impact of migrant-native cultural distance on the number of public primary schools:

$$\#\text{public primary schools per 10,000 persons}_i = \alpha + \beta\,\mathrm{MNCD}_i + X_i\Omega + \varepsilon_i. \tag{2}$$

The dependent variable #public primary schools per 10,000 persons is the number of public primary schools per 10,000 persons right after the educational reform.⁵ $\mathrm{MNCD}_i$ is the migrant-native cultural distance in county $i$. MNCD exclusively focuses on the cultural distance between migrants and natives, two groups with no history of shared governance.⁶ $X_i$ is a vector of county-level controls with coefficients $\Omega$. $\varepsilon_i$ is a disturbance term.

Table 1 summarizes estimates of the effect of migrant-native cultural distance on public primary schools. With all controls, a one-standard-deviation increase in MNCD reduces the number of public primary schools by approximately 0.18 schools per 10,000 persons, which is equal to a fifth of the mean or 40% of the standard deviation. We show an unconditional regression of primary schools during the decade of 1900-1910 on MNCD in Column 1. The relationship is both negative and significant. In Columns 2 and 3 we add population and urbanization rate sequentially. In Alesina et al. (1999), a larger population means greater economies of scale in providing public goods, but a higher transaction cost in raising taxes. Urbanization may enhance the economic return to education, affecting the demand side of education. In Column 4, we include basic education access in 1880. We interpret this as a measure of the stock of human capital in an area, and as a summary statistic of those slow-moving components in local culture and institutions that shape the decision to receive education in the long run. By isolating the influence of past educational achievement under the private education system, we are one step closer to focusing on the impact of MNCD on public goods provision.
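To make the one-standard-deviation reading of β in Equation (2) concrete, the following sketch fits a bivariate OLS (no controls) and rescales the slope by the sample standard deviation of the regressor. It is a minimal illustration, not the paper's estimation code: the function names and the five data points are made up, and none of the numbers come from the paper's dataset.

```python
from statistics import mean, stdev


def ols_slope_intercept(x, y):
    """Closed-form bivariate OLS: returns (alpha, beta) for y = alpha + beta * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = my - beta * mx
    return alpha, beta


def one_sd_effect(x, y):
    """Change in y associated with a one-standard-deviation increase in x."""
    _, beta = ols_slope_intercept(x, y)
    return beta * stdev(x)


# Hypothetical county-level values (illustrative only).
mncd_values = [0.1, 0.2, 0.3, 0.4, 0.5]      # regressor: MNCD
schools_per_10k = [1.2, 1.0, 0.9, 0.7, 0.6]  # outcome: public primary schools per 10,000

effect = one_sd_effect(mncd_values, schools_per_10k)
print("effect of +1 SD of MNCD:", effect)
print("as a share of the outcome mean:", effect / mean(schools_per_10k))
print("as a share of the outcome SD:", effect / stdev(schools_per_10k))
```

Dividing the rescaled effect by the mean or by the standard deviation of the outcome gives the kind of "share of the mean" and "share of the standard deviation" comparisons reported in the text. Adding the control vector would require a multivariate solver, which is omitted here for brevity.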
From Column 5 to Column 10, we introduce geographic controls such as ruggedness, share of arable land and agricultural suitability, distance to the Grand Canal, distance to the Yangzi River, distance to Shanghai, and distance to the provincial capital.⁷ Share of arable land and agricultural suitability can proxy the opportunity cost of receiving modern education. Distance to the Grand Canal and distance to the Yangzi River are used to proxy the market potential and access to trade. Distance to the provincial capital is included to account for the reach of provincial government or state capacity. Shanghai, w

