Policy Research Working Paper 9738

Trade, Internal Migration, and Human Capital
Who Gains from India’s IT Boom?

Devaki Ghose

Development Economics
Development Research Group
July 2021

Abstract

How do trade shocks affect welfare and inequality when human capital is endogenous? Using an external information technology demand shock and detailed internal migration data from India, this paper first documents that both information technology employment and engineering enrollment responded to the rise in information technology exports. Information technology employment responded more when nearby regions had a higher share of college-age population. The paper then develops a quantitative spatial equilibrium model featuring two new channels: higher education choice and differential costs of migrating for college and work. The framework is used to quantify the aggregate and distributional effects of the information technology boom and perform counterfactuals. Without endogenous education, the estimated aggregate welfare gain from the export shock would have been about a third as large and regional inequality twice as large. Reducing barriers to mobility for education, such as reducing in-state quotas for students at higher education institutes, would substantially reduce inequality in the gains from the information technology boom across districts.

This paper is a product of the Development Research Group, Development Economics. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The author may be contacted at dghose@worldbank.org.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues.
An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team

Trade, Internal Migration, and Human Capital: Who Gains from India’s IT Boom?∗

Devaki Ghose

JEL Classification: F16, F63, I24, J24, R12
Keywords: trade, human capital, inequality, migration, gravity, education

∗ Devaki Ghose, World Bank, DECRG (corresponding author: dghose@worldbank.org). I thank my advisers Kerem Cosar, Treb Allen, John Mclaren, James Harrigan; discussants Stephen Yeaple, Richard Kneller, Aradhya Sood, Mathilde Munoz; and for valuable comments and conversations I thank Andrew Bernard, Emily Blanchard, Sheetal Sekhri, Sandip Sukhtankar, Jonathan Colmer, Nina Pavcnik, Robert Staiger, B. Ravikumar, Rob Johnson, James Feyrer, Nick Tsivanidis, Sharat Ganapati, Eric Young, Gaurav Khanna, my classmates, and conference participants at UVA, Dartmouth College, Federal Reserve Bank of St. Louis, and numerous other universities. I thank the Indian Census for helping me access confidential data. I thank Gauri Kartini Shastry, Maggi Liu, and Paul Novosad for sharing data. I thank the Bankard Fund for Political Economy and Center for Global Inquiry and Innovation for financial support. Any remaining errors are my own.

1 Introduction

New economic opportunities that arise from globalization are often accompanied by a rising demand for different types of skills.
Inequalities in local access to education and jobs, along with mobility frictions, make it costly for individuals in some regions to acquire education or pursue better job opportunities. These frictions could be particularly large in developing countries. To what extent do these frictions limit the gains from trade and exacerbate inequality? What policies can help reduce these inequalities? The main challenge in answering these questions is disentangling the different ways in which individuals respond to these opportunities, such as choosing the sector and the locations of work and education, and the interdependence between these decisions.

In this paper, I analyze the effects of trade on welfare and inequality when education choice is endogenous and when there are mobility frictions to access both education and work. Combining detailed spatial and migration data, I document that IT employment and engineering enrollment responded to the rise in Indian IT exports in the late 1990s, and this response was heterogeneous across regions. Consistent with these stylized facts, I develop and quantify a spatial equilibrium model that adds two new margins of response relative to the existing trade and spatial literature: first, agents can acquire new skills and second, they can migrate internally to acquire these skills. I find that without higher education choice, estimated aggregate welfare gains from the IT boom would have been about a third as large and estimated regional inequality would have been twice as large. Restricting individuals to go to college only in their home districts (i.e., not allowing for mobility for education) increases regional inequality by a factor of about 1.45.¹

The paper begins by providing a set of stylized facts about the labor market consequences of the IT boom using spatially granular sectoral labor and education data that I compiled and a unique census dataset tracking migration flows between Indian districts, disaggregated by reasons for migration.
From 1998 to 2008, while Indian IT as a fraction of total service exports increased from 15% to 40%, engineering enrollment as a fraction of total enrollment more than doubled, and total college enrollment increased three-fold. I document two salient stylized facts: 1) IT employment and engineering enrollment respond positively to IT exports, with IT employment responding more when nearby regions have higher engineering enrollment and exports; and 2) distance affects migration, and individuals migrate more for work than for education. State borders restrict migration flows for education more than those for work, reflecting state-level barriers to mobility for education, such as in-state quotas for students at higher education institutes.

¹ Regional inequality is measured by the coefficient of variation in welfare gains.

Consistent with these stylized facts, I develop a quantitative spatial equilibrium model that allows individuals to make education and work decisions in two stages. In the first stage, they decide what and where to study, accounting for access to higher education and job opportunities. In the second stage, individuals choose the sector and location of work. The first- and second-stage decisions generate, respectively, the education and employment responses documented by stylized fact 1. To my knowledge, this is the first paper to allow for and estimate differential mobility costs for work and education. The estimated mobility costs are much higher for education than for work, consistent with stylized fact 2.

In the model, sector-specific trade shocks, such as the Indian IT boom, change the relative returns to occupations across locations depending on two factors: 1) the location’s comparative advantage in that sector and 2) the location’s connectivity to other locations. The changes in the relative returns to occupations affect an individual’s incentives to invest in different skill types.
Skill investments are constrained by the local availability of higher education and the costs of moving to regions with colleges. Thus, regions differ in how much skilled labor they can access and, consequently, in how much they can expand IT production. To the extent that the external demand shock and historical regional differences in comparative advantage are not correlated with unobserved productivities that are determined from the supply side, I can leverage the IT boom to estimate the structural model parameters, such as the elasticity of software exports to software prices. Differences in local access to jobs and education, along with differential moving costs for work and education, generate regional inequalities in the welfare gains from the IT boom.

People face differential migration costs when they move for education or for work. The dependence of job opportunities on skill levels makes it challenging to separately estimate work mobility costs using migration data that do not track the skill level of the migrant. I use the two-stage structure of the model to explicitly account for such dependence and use the unique census data that track why people move to estimate these two costs separately.² I find that the mobility costs across districts, measured as the disutility from moving, are 2 percentage points higher for education than for work. Estimated state border effects are large: the cost of crossing a state border is 2.17 times larger than that of migrating within a state, and this effect is about 2.135% larger when the reason for migration is education. There are several reasons why the mobility cost of education could differ from that of work, for example, policy-induced mobility barriers. In India (as in many other countries like the United States and China), there are state quotas in higher education institutes for in-state students. This policy could result in higher costs of crossing state borders for education than for work.
² To my knowledge, this is only the second project to use these highly confidential data and the first that uses them to estimate migration costs. Kone et al. (2018) were the first to use these data to show how migration flows relate to geographic and cultural distances in India.

Compared to a benchmark quantitative model with fixed skill types, I find significantly different aggregate and distributional consequences of trade across regions after incorporating the mechanism of endogenous education choice. Almost two-thirds of the gains in average welfare are driven by the ability to change skills: without endogenous education, for every one percent increase in IT exports during the boom, welfare increases on average by 0.05%, compared with an average gain of 0.16% in the endogenous education case. Over the course of the IT boom from 1997 to 2000, when IT exports increased by 168%, this translates into a large difference in welfare gains. The rise in inter-regional welfare inequality due to the IT boom, measured as the coefficient of variation in regional welfare at the origin district of an individual, is twice as large in the fixed-skills model as in the model with endogenous education choice. The key mechanisms leading to higher aggregate welfare and lower welfare inequality in the endogenous education model, compared with the fixed-skills model, are the ability to acquire skills and to move across regions for education.

How important is this mobility cost for education that I introduce in a spatial equilibrium model? To quantify the importance of mobility costs for education, I restrict individuals to attend college in their home districts. This counterfactual increases regional inequality by 45%. The gap between the welfare gains of the worst-off and the best-off districts increases by 56%.

The question of how to reduce inequality across both regions and skill groups lies at the heart of many policy debates.
This paper suggests policy interventions in the education market that can reduce trade-driven regional inequality, but not by moving jobs directly. The policy of reducing in-state quotas for students at colleges can reduce the migration costs for education by reducing the border costs of moving for education. In a counterfactual, eliminating all border costs of moving for education reduces regional inequality in welfare gains by more than half. Reducing inter-state barriers to education can significantly increase access to education for out-of-state students. This can increase the opportunity for people from remote regions to gain access to education and migrate to areas with more high-skilled jobs. Although this policy does not reduce inequality in the distribution of employment, it reduces inequality in the distribution of welfare by increasing access to education. This underscores the importance of general equilibrium effects induced by the expansion of exports, which requires us to give more consideration to the interactions of trade, education, and labor markets.

This paper makes three contributions. First, I introduce human capital acquisition decisions in a general equilibrium economic geography model. The general equilibrium aspect is important, since human capital takes time to respond to employment opportunities, during which both people and goods can move. Second, to my knowledge, this is the first paper to estimate the mobility costs for work and education separately and show that these costs are quantitatively different. The unique Census data tracking migration flows disaggregated by reason, obtained through an agreement with the Indian government, made this estimation possible. I show that access to both jobs and education are individually important for determining the spatial dispersion in the gains from trade.
Third, the framework is well-suited for analyzing the effects of policy-induced spatial frictions to moving for higher education, such as in-state quotas at colleges. Reducing these barriers substantially decreases the impact of the export shock on regional inequality. The results underscore the potential for education policies to distribute the gains from globalization more equally.

The model builds on a large theoretical literature in the fields of international trade, economic geography, labor, and migration. Similarly to Caliendo et al. (2019), Fuchs (2018), and Kucheryavyy et al. (2016), the model features multiple sectors. Like Allen et al. (2018) and Tsivanidis (2018), the model features agents with heterogeneous skill types. However, unlike the above models, the theory developed here endogenizes the formation of skills across space. Costly labor mobility relates this paper to the class of gravity migration models, such as those by Allen et al. (2018), Tombe and Zhu (2019), Fan (2019), and Bryan and Morten (2019), that feature multiple sectors or regions with costly mobility of goods and people. Kone et al. (2018) use the Indian migration data to provide evidence of how migration flows relate to distance and cultural differences. Imbert and Papp (2020) provide evidence that the seasonal cost of migration from rural to urban India is very high, such that wage differences between the rural and urban sectors can persist. Differently from these papers, I provide separate estimates for mobility costs by reasons for migration.

A few structural trade models study endogenous human capital acquisition in trade. Khanna and Morales (2017) study how US immigration policy and the internet boom affected aggregate welfare in both the United States and India in a dynamic setting with international migration.
In contrast, this paper studies the regional distributional consequences of the IT boom, quantifying how costs of migration contributed to regional inequality induced by the IT boom. Compared to Ferriere et al. (2018), who build a dynamic multi-region model of international trade with heterogeneous households, incomplete credit markets, and costly endogenous skill acquisition, this paper, in a static setting, additionally features costly mobility for education. A few other theoretical works in this literature, such as Danziger (2017), focus on quantifying the overall response of endogenous education to trade without considering regional differences. Seminal works on the dynamic Heckscher-Ohlin (HO) model, which embed endogenous factor formation in response to trade in the classic HO framework, include Stiglitz (1970), Findlay and Kierzkowski (1983), and Borsook (1987). Consistent with this literature, I demonstrate that trade can strengthen a country’s initial comparative advantage by changing the incentives to acquire skills, and thereby reduce regional inequality in the gains from trade.

Endogenizing education relates my model to the class of human capital accumulation models prominent in the education and labor literature. In these models, forward-looking individuals make education decisions based on labor market returns and costs of tuition (Jones and Kellogg (2014), Johnson (2013), and Lee (2005)). Compared to this class of models, which require keeping track of a large number of state variables, I use a simpler two-stage model that allows me to tractably incorporate many regions and bilateral migration flows between these regions.

Given the emphasis in the trade literature on the effect of exports and trade liberalization on the skill premium, there has been relatively little research on the effect of trade on skill acquisition. A number of empirical studies such as Atkin (2016), Blanchard and Olney (2017), Edmonds et al.
(2010), Greenland and Lopresti (2016), Shastry (2012), Liu (2017), and Oster and Steinberg (2013) focus on the impact of trade on primary and secondary education. Exceptions to these are Li (2018) and Khanna and Morales (2017), which study the response of college enrollment to high-skill export shocks. More evidence has emerged recently (Li (2019), Hou and Karayalcin (2019), Ma et al. (2019)). Complementing this literature, I provide reduced-form evidence about the response of tertiary enrollment to shocks in the high-tech sector in a large developing country, and I document regional heterogeneity in this response.

2 Data

A major constraint in studying the effects of IT export growth on human capital acquisition in the presence of costly migration is the lack of employment and education data, disaggregated at the sector of work and field of education level, combined with the absence of detailed migration data. To this end, I use three sources to collect data on India’s IT sector and access confidential Indian Census data to obtain district-to-district migration flows by reasons for migration.

2.1 Data on the Indian IT sector

I use three rounds of Economic Census data (1998, 2005, and 2013) to obtain data on total IT employment across all districts of India. While the Economic Census has the advantage of covering all Indian firms and hence reporting total employment, its data are not disaggregated by level of education. To supplement this information, I use data from the National Sample Survey (NSS) rounds 50, 55, 60, 61, 62, 64, 66, and 68.³ These surveys record information on the sector and location of occupation as well as the field of study. The drawback of the NSS data is that it represents only a small sample, and hence does not contain a lot of important sector-level information. However, it does report multipliers on each unit of observation, which, in the NSS, is an individual.
This allows me to obtain the unbiased ratios of engineers, non-engineers, and both college-educated and non-college-educated individuals in each sector of employment. By multiplying these ratios with total employment from the Economic Census, one can recover the distribution of the population by field of study and sector of employment in each district of India. Data on wages by sector of occupation and field of education are also obtained from the NSS, supplemented with data from the Economic Census. For more details on combining NSS with Economic Census, see Appendix A.3. Appendix Table A1 reports the daily average district-level raw wages in INR and the average district-level employment respectively. From these tables, observe that both the wages and employment of college-educated and non-college-educated workers increased more in high-skill intensive industries between the pre- and post-boom periods compared to those in manufacturing.

As an additional source, for the reduced-form analysis, I supplement the IT employment and wage data with data on IT exports from NASSCOM (the leading trade association of the software industry in India) directories 1992, 1995, 1998, 1999, 2002, and 2003. The strength of the NASSCOM dataset is that it contains data on “95% of all registered IT firms in India”.⁴ NASSCOM also contains data on IT employment, and this information is divided according to whether employees are technical employees (that is, associated directly with the provision and delivery of IT services) or non-technical employees (all other employees). Several papers have used the NASSCOM data, which is the most comprehensive source of data on Indian IT firms; among these, Tharakan et al. (2005) and Shastry (2012) are notable.⁵

Throughout the paper, 1995-1998 is referred to as the pre-boom period and 2001-2011 is referred to as the post-boom period.
The relatively longer window chosen for the post-boom period reflects the fact that it takes at least 2 or 3 years to prepare for college and at least 4 years to complete a college degree. Thus, the effect of the IT boom on enrollment and graduation will be observed with a lag.

³ These rounds correspond to years 1993, 1999, 2004, 2005, 2007, 2009, and 2012.
⁴ Source: NASSCOM.
⁵ Tharakan et al. (2005) cross-checked the quality of NASSCOM software exports data by comparing yearly aggregate software exports from India from International Data Corporation with the figures obtained from NASSCOM data and claim that they are of comparable magnitude.

2.2 Data on internal migration

The National Census of India for 2001 is the main data source for internal migration in India. An individual is a migrant, according to the Census, “if the place in which he is enumerated during the census is other than his place of immediate last residence” (Census, 2001). The Census includes additional questions based on the last residence criteria. These questions include reason for migration, such as marriage, education, or employment; the urban/rural status of the last residence’s location; and the duration of stay in the current residence since migration. This level of disaggregation is crucial for separately estimating the costs of migration due to education and work. Publicly available Census data only report the destination district and whether the migrant’s origin is in the same state or outside the state, aggregated over all reasons for migration. I obtained the more disaggregated data through a special agreement with the Census of India. More information on this data can be found in Section 4, and descriptive evidence about the proportion of people migrating for work and education can be found in Table X.

2.3 Other data

Data on the linguistic distance of each Indian district from Hindi was obtained from Shastry (2012).
Construction of the index, which is key to my empirical strategy, is detailed in Shastry (2012) and in Online Appendix A.2 of this paper. The linguistic distance I use is calculated by ethnolinguists based on the similarity of grammar and cognates. For example, the English word “daughter” is “dokhtar” in Persian and “nuer” in Mandarin Chinese. While Persian and English are both part of the Indo-European language family, Chinese is derived from the Sino-Tibetan language family. Linguistic distance between Persian and English is therefore lower than between Chinese and English or Chinese and Persian.

In India, languages differ across regions. The 1961 Census of India documented speakers of 1,652 languages from five language families. There can be wide linguistic diversity between districts, and most people adopt a second language that is a widely accepted speaking medium across districts. Of all multilingual people who were not native speakers, 60 percent chose to learn Hindi and 56 percent chose English (Shastry (2012)).

Shastry (2012) proxies English-learning costs as linguistic distance from Hindi relative to English. She shows that since a necessary condition for employment in the IT industry is fluency in English, the IT industry expands more in districts that have a higher proportion of English speakers, as proxied by linguistic distance of that district to English relative to Hindi.

Data on the college-age population, college enrollment, and literacy are collected from the decadal Census data of 2001 and 2011. Detailed summary statistics for enrollment are reported in Table A2 in the Online Appendix. The most notable is the rise in engineering enrollment. Between the pre- and the post-boom period, the proportion of engineers in total college enrollment more than doubled from 5% to 11%. During this time, the total number of people enrolled in college also increased three-fold.
Thus, the total number of students studying engineering also increased in absolute numbers.

3 Background of India’s IT growth

While the last two decades have witnessed a world-wide expansion of IT and consequent increase in demand for computing skills, this expansion has been disproportionately larger for India than for any other country in the world (International Trade Center (2017)). Panel A of Figure III plots the growth in IT exports over time, where the value of IT exports in 1993 has been normalized to one. This figure shows that IT exports from India have been steadily increasing since 1993, but a large jump occurred in the late 1990s and early 2000s, when normalized software exports increased by more than 76% in one year. Figure III shows that during this period, IT employment as a fraction of total employment was also rising. From 1998 to 2000, IT employment as a fraction of total employment almost doubled. Engineering as a fraction of total enrollment was also generally increasing, but the largest jump occurred after 2000.

While many factors are responsible for the growth of IT in India, the lack of domestic demand for IT means that the sector’s growth is constrained by the growth in world demand for Indian IT. This constraint was eased during the late 1990s and early 2000s, when several major events suddenly escalated demand for Indian IT. The Y2K phenomenon dominated from 1998 to 2000, along with the earlier dot-com boom and, later on, the dot-com bust. In order to solve Y2K-related computer problems, commonly known as the “Y2K bugs”, IT firms started offshoring large parts of their work to developing countries such as India.⁶ The dot-com boom was a historic economic bubble and period of excessive speculation that occurred from roughly 1995 to 2000; it was marked by extreme growth in the use and adoption of the Internet.
The dot-com bust caused many firms in the United States (two-thirds of India’s IT market) and elsewhere to slash their IT budgets, prompting even more outsourcing to India (Economist (2003)).

Most notably, technological progress in the worldwide Internet, which had been underway for some time, was responsible for bringing world outsourcing demand to Indian firms. As Khanna and Morales (2017) note:

The absence of world-wide Internet during the 1980s meant that on-site work (“body-shopping”) dominated, because otherwise software had to be transported on tapes that faced heavy import duties. But in 1992, satellite links were set up in Software Technology Parks (STP), negating the need for some kinds of on-site work, and this boosted the offshoring of work to India. In 1993, the shift from B-1 to H-1 visas in the US further lowered the incentives to hire Indian engineers for on-site work, as they were to be paid the prevailing market wage.

⁶ Before 2000, all computers stored dates using only the last two digits of a year. The Y2K problem refers to the problem that can occur in computer systems as the year 1900 becomes indistinguishable from 2000. The majority of programs with Y2K problems were business applications written in a 40-year-old language called COBOL (UC Berkeley (1999)). While COBOL programming was already obsolete in US universities, in India it was still a part of the regular course curriculum (Mathur (2006)).

While world-wide events such as the Y2K shock, the dot-com boom and bust, and changes in US H-1B visa policies provided considerable external demand stimuli for the growth of the Indian IT sector, certain factors inherent to India are responsible for this expansion of Indian IT exports. It is generally agreed that the availability of low-cost, high-skill human resources has given India a comparative advantage in the IT sector over its competitor nations (Kapur (2002)).
Moreover, much of the population (over 60%) is under 25, and India has one of the largest pools of technical graduates in the world. India also has a large English-speaking population due to its British legacy, and this fact is considered one of the key ingredients in the success of IT. As Shastry (2012) has shown, IT firms in India are located mostly in regions with a larger English-speaking population. A natural advantage of India is its time difference with the United States, which is one of India’s biggest customers for IT services; this enables India to offer overnight services to the United States, effectively creating round-the-clock working hours for outsourcing firms (Carmel and Tjia (2005)).

The growth of Indian IT is the result of much more than a single transitory demand shock that temporarily catapulted the sector upward. With the expansion in Indian IT exports, Indian IT employment continued to increase. Wages peaked during the sudden expansion of the late 1990s and early 2000s. Arguably, in response to rising IT employment opportunities, engineering enrollment began to increase after 2000, as shown in Figure III.

4 Reduced-form facts

In this section, I present four facts about internal migration and the relationships between IT exports, regional employment, and enrollment over the short run and the long run in India. I use the expansion of IT during 1998-2002, largely driven by external demand shocks as described in Section 3, to study the labor market effects in the long run, that is, between 2005 and 2011. The choice of this time frame is dictated by the fact that an engineering degree takes at least four years to complete, and thus any effect on the labor market related to skill acquisition will occur after 2004-2005.

Fact 1: IT employment and engineering enrollment positively respond to exports.
To understand how IT employment and engineering enrollment changed across regions after the IT boom, I estimate the following event study specification:

$$Y_{dt} = \alpha_t + \gamma_d + \chi_d * t + \beta_t \, Exports_{d,1995} + \epsilon_{dt} \tag{1}$$

where $Y_{dt}$ is standardized IT employment or standardized engineering enrollment in district $d$ at time $t$. $Exports_{d,1995}$ is the proportion of software exports from district $d$ in the year 1995 out of total Indian IT exports in 1995. $\alpha_t$ are time fixed effects that capture any factors that are common to all districts at time $t$. $\gamma_d$ are district fixed effects that capture any factors that are fixed over time in district $d$. $\chi_d * t$ is a district-level time trend capturing any linear trend in the outcome variable at the district level. Standard errors are clustered at the state-year level, with alternative clustering assumptions at the state and district levels explored in Online Appendix B, Table A4. The findings do not change.

The idea is that districts which initially had stronger connections with the rest of the world, as measured by the proportion of software exports in 1995, will gain more from the expansion in world demand for Indian IT than districts that had little or no connection with the rest of the world. In alternative specifications reported in Table A5 in Online Appendix B, following Shastry (2012), I instrument the initial software exports with the historical linguistic distance of a district from English. The conclusions do not change, although the coefficients are smaller.

In panel A of Figure IV, I plot the estimated coefficients along with the confidence intervals for the years 1995-2013. From this figure, we can see that post-1998, IT employment increased more in districts that had a higher level of software exports in 1995. This effect is significant in all years available in the data from 1999-2013. The insignificant coefficients for 1995 and 1998, the pre-boom years, indicate the absence of pre-trends.
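As a rough illustration of the logic behind equation (1), the sketch below recovers year-specific exposure coefficients from a tiny made-up panel. It is a simplification and not the paper's estimation code: it assumes a balanced panel, omits the district-specific trend $\chi_d * t$ and the clustered standard errors, and the function name `event_study_slopes` and all numbers are hypothetical. In that simplified setting, the two-way fixed effects estimate of $\beta_t$ relative to a base year equals the cross-district slope of the change in the outcome since the base year on initial exposure.

```python
# Hypothetical helper; simplified relative to equation (1):
# no district-specific trends and no clustered standard errors.
def event_study_slopes(outcome, exposure, base_year):
    """Return {year: beta_t - beta_base} for a balanced district-year panel.

    outcome:  {district: {year: Y_dt}}
    exposure: {district: time-invariant exposure, e.g. the 1995 export share}
    """
    districts = sorted(outcome)
    years = sorted(outcome[districts[0]])
    x = [exposure[d] for d in districts]
    x_bar = sum(x) / len(x)
    var_x = sum((xi - x_bar) ** 2 for xi in x)
    slopes = {}
    for t in years:
        # Differencing against the base year removes the district fixed effect.
        dy = [outcome[d][t] - outcome[d][base_year] for d in districts]
        dy_bar = sum(dy) / len(dy)
        # Demeaning within the year absorbs the year fixed effect.
        cov = sum((xi - x_bar) * (yi - dy_bar) for xi, yi in zip(x, dy))
        slopes[t] = cov / var_x
    return slopes


# Made-up, noiseless data for illustration only.
exposure = {"A": 0.00, "B": 0.10, "C": 0.40, "D": 0.25}
district_fe = {"A": 1.0, "B": -0.5, "C": 2.0, "D": 0.3}
year_fe = {1995: 0.0, 1998: 0.2, 2005: 0.5, 2013: 0.9}
true_beta = {1995: 0.0, 1998: 0.0, 2005: 1.5, 2013: 3.0}
outcome = {
    d: {t: district_fe[d] + year_fe[t] + true_beta[t] * exposure[d] for t in year_fe}
    for d in exposure
}
slopes = event_study_slopes(outcome, exposure, base_year=1998)
```

In this toy panel the pre-boom coefficient for 1995 is zero by construction, which mirrors the way flat pre-period coefficients are read as an absence of pre-trends.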
In panel B of Figure IV, I plot the response of engineering enrollment at 10-year intervals, as the available census data allow. As the graph shows, engineering enrollment has also been rising since 2001. Since the Census data are available only at decadal intervals, I cannot show the pre-trend estimates for enrollment. In Table A6 in Online Appendix B, I show the regression results for both the response of engineering enrollment and engineering as a percentage of total enrollment. In both cases, engineering enrollment responds in districts with higher historical IT exports after the boom.

Fact 2: The effects are heterogeneous. Employment responds more when nearby regions have higher engineering enrollment and higher IT exports. The heterogeneous effects are stronger in the long run.

In equation 2 below, I add an interaction term between the number of students enrolled in engineering in 1991 and the proportion of software exports from district $d$ in 1995. The estimated coefficient $\delta_t$ is plotted in Figure V. $\delta_t$ measures the differential response of IT employment between the pre- and post-boom periods depending on the historical level of engineering college enrollment in 1991, in districts that already had software exports in 1995.

$$Y_{dt} = \alpha_t + \gamma_d + \beta_t \, Exports_{d,1995} + \chi_t \, Enrollment_{d,1991} + \delta_t \, Exports_{d,1995} \times Enrollment_{d,1991} + \epsilon_{dt} \tag{2}$$

Figure V shows that, conditional on the level of software exports, after 1998 IT employment responds more in districts that, in 1991, had more enrolled engineering students in nearby districts of the same state. The intuition, formalized in the model, is that in these districts it is easier to expand future IT production because more college-educated engineering graduates are available in close proximity. Regression results are reported in Online Appendix B in Table A7.

Map VI shows the spatial distribution of IT employment as a proportion of total employment and engineering enrollment as a proportion of total enrollment in 2011.
The map shows that there is a positive correlation between the percentage of people employed in the IT sector and the percentage of people enrolled in engineering at the district level. While the contemporaneous correlation in 2011 is 0.38, the corresponding correlation between the proportion of IT employment in 2005 and the proportion of engineering enrollment in 2011 is 0.43. Districts in the south have a higher proportion of both, while districts in the north and northeast are relatively deprived of both.

I next establish a set of facts related to the costs of migration over distance for both work and education.

Fact 3: Migration declines with distance. In addition, state borders negatively affect migration, and this effect is significantly larger when people migrate for education than when they migrate for work or for any other reason.

Using the Poisson pseudo maximum likelihood procedure (PPML), I estimate (3),⁷ similar to Kone et al. (2018). PPML is a non-linear estimation procedure which performs better than a log-log estimation in the presence of zeros and has been traditionally used in the estimation of migration gravity equations (Santos Silva and Tenreyro (2006)).

⁷ Kone et al. (2018) ran this specification for 585 × 584 district pairs, excluding the own district. Following the literature on gravity estimation (see, e.g., Bryan and Morten (2019)), I estimate it on a 585 × 585 sample, including the own district.

$$l_{oj} = C + f_j + f_o + \beta_1 \ln(Dist_{oj}) + \beta_2 \, lang_{oj} + \gamma_1 D^{diff-NBR}_{oj} + \gamma_2 D^{same-NBR}_{oj} + \gamma_3 D^{same-notNBR}_{oj} + \epsilon_{oj} \tag{3}$$

where $l_{oj}$ is the stock of migrants who moved from district $o$ to district $j$ for education (column 1 in Table IX), for work (column 2), or for other reasons (column 3). $Dist_{oj}$ is a measure of geographic distance between two districts.⁸ For bilateral distance between any two districts, I use the geodesic (flight) distance between the geographic centers of districts $o$ and $j$.
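To make the distance variable concrete, the following is a minimal sketch of a great-circle calculation between two centroids. It is an illustration only: it uses the spherical haversine formula rather than the geodesic on an ellipsoidal earth model described in the paper's footnote, and the function name, the mean earth radius, and the example coordinates are assumptions introduced here, not the district centroids used in the estimation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean earth radius; an assumption of the spherical model


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    # Haversine term: squared half-chord length between the two points.
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


# Illustrative coordinates only (roughly Bengaluru and Delhi city centers).
example_km = great_circle_km(12.97, 77.59, 28.61, 77.21)
log_distance = math.log(example_km)  # the regressor that enters the gravity equation
```

For pairs of districts within a country the spherical approximation is typically close to the ellipsoidal geodesic, but the two are not identical.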
All the variables included in the gravity specification are obtained from the calculations by Kone et al. (2018).⁹ $lang_{oj}$ denotes the likelihood that any two individuals from districts $o$ and $j$ are able to communicate in a common language. This is given by:

$$CommonLanguage_{oj} = \sum_{l} s^{l}_{o} \cdot s^{l}_{j}$$

where $s^{l}_{o}$ is the share of people from district $o$ having mother tongue $l$. There are three contiguity variables: $D^{diff-NBR}_{oj}$ is a dummy variable that takes the value 1 if districts $o$ and $j$ are in different states but are neighbors; $D^{same-NBR}_{oj}$ is a dummy variable that is equal to 1 if districts $o$ and $j$ are in the same state and are neighbors; $D^{same-notNBR}_{oj}$ is a dummy variable that is equal to 1 if districts $o$ and $j$ are in the same state but are not neighbors. The base group is 'not in the same state and not neighbors'.

⁸ Other reasons include marriage, business, and other unclassified reasons.

⁹ Geodesic distance is the length of the shortest curve between two points along the surface of a mathematical model of the earth, here measured between the districts' geographical centers (denoted as distance centroids).

The difference between $\gamma_1$ and $\gamma_2$ gauges the role of state borders. Table IX shows that the coefficient on the same-state-neighbor dummy is larger than the different-state-neighbor coefficient in every column, and this difference is statistically significant. The effect of state borders also differs substantially depending on the reason for migration. One reason why the state border dummy is so important when people migrate for education is the policy of reserving a large proportion of seats in public as well as private colleges for in-state students. Most state colleges have home state quotas of 50%, with such limits being as high as 85% for some states.¹⁰ While such
quotas also exist for jobs and thus create significant hurdles for moving across states, the employment quotas are more specific and less ubiquitous than the in-state education quotas.

¹⁰ Support for the 85% reservation policy started in Maharashtra in 2011 with the backing of nationalist state parties.

Fact 4: Individuals migrate more for work than for education, and the distributions of flows for work and for education across districts differ accordingly.

Figure VII shows the histogram of migration flows by reason for migration. The x-axis plots the percentage of people who migrated for work and for education out of the total number of migrants at the destination district. The y-axis plots the number of destination districts with the corresponding percentages. As is clear from these very different and almost non-overlapping distributions, out of the total migrant population in most destination districts, a much higher percentage migrated for work than for education.

Facts 3 and 4 are also borne out by Table X. Reading off column 3, out of all migrants who moved out of their district of past residence in the last 10 years, 48% migrated for work, but only 3% migrated for education. Column 4 shows that out of all individuals who migrated for education, only 31% crossed state borders, while over half of those migrating for work did so.

Informed by these four facts, the next section presents a general equilibrium model featuring many locations, costs of movement of people and goods between locations, and costly human capital acquisition decisions.

5 A Quantitative spatial equilibrium model with endogenous education choice

There are discrete locations $d \in D$, where $D$ includes the many regions within a country, in this case, districts within India. There is also the Rest of the World (RoW) that these regions trade with. The small open economy assumption holds.
The regions differ from each other in their distances to other regions and the RoW and in the distribution of population eligible to attend college, that is, individuals who have already completed high school. Individuals in each region make decisions in two stages. In the first stage, they decide whether or not to go to college and, if they go to college, what field to study and in what location. There are $F$ fields individuals can choose to study, such as engineering. In the second stage, given their education decisions, individuals decide where and in which sector to work. There is a representative firm in each sector in each location, and within each sector the firm in each location produces a different variety which is costly to trade across locations, as in an Armington setup. Each worker is endowed with a unit of labor, which they supply inelastically. There are $S$ sectors in the economy. Online Appendix D describes the assumptions implicit in this modeling structure in detail, including their justifications and implications.

5.1 Individuals

Utility of an individual $i$ who attained college education in field of study $f$ from region $o_2$ and then works in sector $S$ in region $d$ depends on wages, amenities, migration costs, price indices, and idiosyncratic productivity shocks, and is given by (suppressing the individual subscript $i$ from utility for expositional clarity):

$$V_{o_2 f,dS} = \frac{w_{f,dS}}{P_d} \cdot u_{f,dS} \cdot \eta_i \cdot \mu^{2}_{o_2 d} \tag{4}$$

where $w_{f,dS}$ is the wage of a worker in region $d$ with a degree in field $f$ who is working in sector $S$, and $u_{f,dS}$ is the amenity of living in region $d$ for a worker with degree $f$ working in $S$, henceforth referred to as a type $(f, S)$ worker. $P_d$ is the cost of living in region $d$, which is endogenously determined as described in Section 5.2. $(1 - \mu^{2}_{o_2 d})$ is the utility cost of migrating from $o_2$ to work in $d$.
The idiosyncratic productivity shocks for each individual $i$, $\eta_{i o_2 f,dS}$, are drawn from a Frechet distribution

$$F(\eta_{i o_2 f,dS}) = \exp\left(-\eta_{i o_2 f,dS}^{-\theta}\right)$$

where $\theta$ determines the dispersion of the Frechet productivity shocks.

Utility cost of education

To add workers' education choice, I introduce a utility cost of education. Let $a_{o_2 f}$ denote the net amenity of studying $f$ in $o_2$, which includes the unobserved preferences for studying $f$ in $o_2$ and the money and opportunity cost of education. In other words, it is the fraction of utility lost in order to study field $f$ in region $o_2$. People who choose not to go to school earn income $w_{u,dS}$ and people who go to school earn a normalized stipend of 1. Let $\zeta_{i o_2 f}$ denote the idiosyncratic preference shock of individual $i$ for the chosen field $f$ in location $o_2$, where

$$G(\zeta_{i o_2 f}) = \exp\left(-\zeta_{i o_2 f}^{-\gamma}\right)$$

and $\gamma$ again determines the dispersion of amenities of studying $f$ in $o_2$. There is also a migration cost incurred in moving from one's location of birth $o_1$ to one's location of study $o_2$, denoted by $(1 - \mu^{1}_{o_1 o_2})$. Thus, the utility of an individual $i$ born in $o_1$ who chooses to study field $f$ in location $o_2$ and then decides to work in sector $S$ in region $d$ is given by (suppressing the individual subscript $i$ in utility for expositional clarity):

$$U_{o_1 o_2 f,dS} = \left( \mu^{1}_{o_1 o_2} \cdot \frac{a_{o_2 f}\, w_{u,o_2 S}}{P_{o_2}} \cdot \zeta_i \right)^{I_U} \left( \frac{w_{f,dS}}{P_d} \cdot u_{f,dS} \cdot \eta_i \cdot \mu^{2}_{o_2 d} \right)^{(1 - I_U)} \tag{5}$$

where $I_U$ is the weight placed on period 1 utility.¹¹ $w_{u,o_2}$ is the wage earned by unskilled workers in stage 1 in region $o_2$. $w_{u,o_2}$ is 1 if the person is not employed in stage 1; in other words, people who are not working in stage 1 earn a normalized stipend of one. The derivation of this utility is given in Online Appendix C.1.

Migration decisions for education and work

When choosing the location and field of education in stage 1, the individual takes into account her expected utility from stage 2. She does not know the exact utility in stage 2 since the idiosyncratic productivity shock is not yet observed.
We thus solve the individual's problem backwards. In stage 2, given the choice of location and field of education (sector of work for an unskilled person), the individual chooses her sector of occupation ($S$) and location ($d$) to solve:

$$\arg\max_{d,S} \left[ \frac{w_{f,dS}}{P_d} \cdot u_{f,dS} \cdot \eta_i \cdot \mu^{2}_{o_2 d} \,\middle|\, o_1, o_2, f \right]$$

Given the Frechet distribution of the idiosyncratic productivity shock, the proportion of people with a degree in $f$ from region $o_2$ who go to region $d$ to work in sector $S$ is given by:

$$m_{o_2 f,dS} = \frac{\left( \frac{w_{f,dS}}{P_d} \cdot u_{f,dS} \cdot \mu^{2}_{o_2 d} \right)^{\theta}}{\Phi_{o_2 f}} \tag{6}$$

where $\Phi_{o_2 f} = \sum_{d'} \sum_{S'} \left( \frac{w_{f,d'S'}}{P_{d'}} \cdot u_{f,d'S'} \cdot \mu^{2}_{o_2 d'} \right)^{\theta}$.

$\Phi_{o_2 f}$ is a measure of access to jobs for an individual from $o_2$ with degree $f$. It summarizes the expected value of all the job opportunities available to a person from $o_2$ with a degree in $f$, taking into account costs of migration and the distribution of job opportunities. In stage 1, the individual maximizes $E(U_{i o_1 o_2 f,dS})$ by choosing $(o_2, f)$.

¹¹ Here, being born in $o_1$ is equivalent to completing non-tertiary education in $o_1$.

Proposition 1: If $\eta \sim Frechet(\theta)$, then $\eta^{\alpha} \sim Frechet\left(\frac{\theta}{\alpha}\right)$.
Proof: See Online Appendix C.5.

Proposition 2: If $\eta_i \sim Frechet(\theta)$, then $E\left(\max_i (a_i \times \eta_i)\right) = \left( \sum_i a_i^{\theta} \right)^{\frac{1}{\theta}} \Gamma\left(1 - \frac{1}{\theta}\right)$.
Proof: See Online Appendix C.5.

Using Propositions 1 and 2, the maximization problem of an individual in stage 1, stated on the left-hand side, reduces to:

$$\max_{o_2,f} \left( \frac{a_{o_2 f}\, w_{u,o_2}}{P_{o_2}} \cdot \mu^{1}_{o_1 o_2} \cdot \zeta_{i o_2 f} \right)^{I_U} E\left[ \max_{d,S} \left( \frac{w_{f,dS}}{P_d} \cdot u_{f,dS} \cdot \eta_i \cdot \mu^{2}_{o_2 d} \right)^{(1 - I_U)} \,\middle|\, o_1, o_2, f \right]$$
$$= \max_{o_2,f} \left( \frac{a_{o_2 f}\, w_{u,o_2}}{P_{o_2}} \cdot \mu^{1}_{o_1 o_2} \cdot \zeta_{i o_2 f} \right)^{I_U} \Gamma\left(1 - \frac{1 - I_U}{\theta}\right) \Phi_{o_2 f}^{\frac{1 - I_U}{\theta}} \tag{7}$$

since

$$E\left[ \max_{d,S} \left( \frac{w_{f,dS}}{P_d} \cdot u_{f,dS} \cdot \eta_i \cdot \mu^{2}_{o_2 d} \right)^{(1 - I_U)} \,\middle|\, o_1, o_2, f \right] = \Phi_{o_2 f}^{\frac{1 - I_U}{\theta}} \, \Gamma\left(1 - \frac{1 - I_U}{\theta}\right) \tag{8}$$

is the expected income prior to drawing match productivities for workers trained in field $f$ at location $o_2$.
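The stage 2 choice probabilities in equation 6 can be illustrated with a small numerical sketch. The code below computes the job-access term and the implied shares for a single origin and field, and applies Proposition 2 to obtain the expected maximum. All inputs (two districts "A" and "B", two sectors, the wage, amenity, and migration values, and the function name `job_shares`) are hypothetical and chosen only for illustration; they are not estimates from the paper.

```python
import math


def job_shares(real_wage, amenity, mu2, theta):
    """Equation 6 for one (origin o2, field f) pair.

    real_wage, amenity, mu2: dicts keyed by (district, sector).
    mu2 is the stage-2 migration factor in (0, 1]; 1 means no migration cost.
    Returns (shares, phi), where phi is the job-access term Phi_{o2 f}.
    """
    attract = {k: (real_wage[k] * amenity[k] * mu2[k]) ** theta for k in real_wage}
    phi = sum(attract.values())
    shares = {k: v / phi for k, v in attract.items()}
    return shares, phi


# Hypothetical inputs: the origin is district A, so moving to B is costly.
real_wage = {("A", "IT"): 2.0, ("A", "other"): 1.0, ("B", "IT"): 3.0, ("B", "other"): 1.0}
amenity = {k: 1.0 for k in real_wage}
mu2 = {("A", "IT"): 1.0, ("A", "other"): 1.0, ("B", "IT"): 0.5, ("B", "other"): 0.5}
theta = 2.0

shares, phi = job_shares(real_wage, amenity, mu2, theta)

# Proposition 2: expected value of the best option, before shocks are drawn.
expected_max = math.gamma(1 - 1 / theta) * phi ** (1 / theta)
```

In this example the higher IT wage in B is more than offset by the migration factor, so the IT job at home attracts the largest share.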
The proportion of people living in $o_1$ who study $f$ in region $o_2$ is then given by:

$$l_{o_1 o_2 f} = \frac{\left[ \left( \frac{a_{o_2 f}\, w_{u,o_2}}{P_{o_2}} \, \mu^{1}_{o_1 o_2} \right)^{I_U} \Phi_{o_2 f}^{\frac{1 - I_U}{\theta}} \right]^{\frac{\gamma}{I_U}}}{\Phi_{o_1}} \tag{9}$$

where $\Phi_{o_1} = \sum_{o_2', f'} \left[ \left( \frac{a_{o_2' f'}\, w_{u,o_2'}}{P_{o_2'}} \, \mu^{1}_{o_1 o_2'} \right)^{I_U} \Phi_{o_2' f'}^{\frac{1 - I_U}{\theta}} \right]^{\frac{\gamma}{I_U}}$.

$\Phi_{o_1}$ is a measure of access to education for an individual from $o_1$. It summarizes the expected returns from all the different types of education an individual from $o_1$ can access, taking into account the costs of migration and the distribution of work and education amenities. The separate measures of access to education and jobs, as summarized by $\Phi_{o_1}$ and $\Phi_{o_2 f}$ respectively, are the novel contributions of the theoretical model to the existing spatial economics literature. While measures of market access for factors and jobs are fairly common in the literature, the notion of a separate access to higher education is an innovation that helps us quantify the welfare gains from trade in a world with unequal access to education and costly migration for college.

5.2 Firms

There is perfect competition in the production of each variety. The representative firm in sector $S$ in location $d$ produces a variety of the sector $S$ good using both high-skilled labor $L_{hdS}$ and low-skilled labor $L_{ldS}$, combined in a nested CES constant returns to scale production function:

$$Q_{dS} = \left( Q_{hdS}^{\frac{\rho_S - 1}{\rho_S}} + Q_{ldS}^{\frac{\rho_S - 1}{\rho_S}} \right)^{\frac{\rho_S}{\rho_S - 1}} \tag{10}$$

where

$$Q_{hdS} = \left( \sum_{f \in college} A_{f,dS} \left( \tilde{L}_{f,dS} \right)^{\frac{\rho_{hS} - 1}{\rho_{hS}}} \right)^{\frac{\rho_{hS}}{\rho_{hS} - 1}} \tag{11}$$

and

$$Q_{ldS} = \left( \sum_{f \in nocollege} A_{f,dS} \left( \tilde{L}_{f,dS} \right)^{\frac{\rho_{lS} - 1}{\rho_{lS}}} \right)^{\frac{\rho_{lS}}{\rho_{lS} - 1}}$$

where $\tilde{L}_{f,dS} = \bar{\eta}_{f,dS} L_{f,dS} = \Gamma\left(1 - \frac{1}{\theta}\right) L_{f,dS}$ is the effective labor supply.

The Armington structure of the model delivers a cost of living index $P_d$ for each region, where $P_d = \prod_S P_{dS}^{\alpha_S}$ and $(P_{dS})^{1 - \sigma_S} = \sum_j (\tau_{jdS}\, p_{jS})^{1 - \sigma_S}$.

5.3 External trade

The country exports a tradeable good to the RoW, where each region of the country produces a variety of the tradeable good, and, in turn, imports an importable good from the RoW. The country is a price-taker in the world market, so the price of the importable good is given.
The income of the RoW is also exogenously given.¹² Gravity determines the level of trade between each region of the country and the RoW. People can move within the country but not outside the country. The demand for IT exports from region $d$ ($E_{d,IT}$) is given by:

$$E_{d,IT} = \underbrace{\frac{(\tau_{d,IT}\, p_{d,IT})^{1 - \sigma_{IT}}}{\sum_{d'} (\tau_{d',IT}\, p_{d',IT})^{1 - \sigma_{IT}}}}_{\text{gravity}} E_{IT} \tag{12}$$

where $p_{d,IT}$ is the price of the IT variety from region $d$, and $\tau_{d,IT}$ are the costs of exporting IT to the RoW, mostly consisting of communication and management costs. $E_{IT}$ is the RoW's income spent on the IT sector. Using equation 12, we can solve for IT prices in each district:

$$p_{d,IT} = \left( \frac{E_{d,IT}}{E_{IT}} \right)^{\frac{1}{1 - \sigma_{IT}}} \frac{\left( \sum_{d'} (\tau_{d',IT}\, p_{d',IT})^{1 - \sigma_{IT}} \right)^{\frac{1}{1 - \sigma_{IT}}}}{\tau_{d,IT}} \tag{13}$$

¹² In theory, the income of the RoW consists of income from sales to itself, the domestic country, and all other countries. In this particular empirical setting, given that US exports to India constitute a negligible proportion of total US income, this assumption is tenable.

5.4 Internal trade

The sectors other than IT and the importable goods sector are all internally traded. The gravity equations determining the flows of these internally traded sectors are given by:

$$Y_{dS} = \sum_j X_{djS} = \sum_j \tau_{dj}^{1 - \sigma_S}\, p_{dS}^{1 - \sigma_S}\, P_j^{\sigma_S - 1}\, E_{jS} \tag{14}$$

$$E_{jS} = \sum_k X_{kjS} = \sum_k \tau_{kj}^{1 - \sigma_S}\, p_{kS}^{1 - \sigma_S}\, P_j^{\sigma_S - 1}\, E_{jS} \tag{15}$$

Equation 14 states that the income of sector $S$ in region $d$ equals the sum of sales from sector $S$ in region $d$ to all districts. Equation 15 states that the expenditure of region $j$ on the sector $S$ good must equal the sum of purchases of good $S$ from all regions.

5.5 Equilibrium

For each region (in our analysis, district), equilibrium in the steady state is defined as a set of sectoral employment levels by field of study ($L_{f,dS}$), field-wise college enrollment ($L_{o_2 f}$), wages ($w_{f,dS}$), prices ($P_d$), and quantities ($Q_{dS}$).
For each district, the equilibrium takes as given population, productivities, amenities, bilateral migration costs of studying and working according to fields of education and sectors of employment, and trade costs between domestic districts and between domestic districts and the RoW. It also takes as given the parameters governing the dispersion of productivity shocks ($\theta$) and amenity shocks ($\gamma$), expenditure shares ($\alpha$), the elasticity of substitution between high-skilled and low-skilled workers ($\rho_S$), and that between different types of high-skilled workers ($\rho_{hS}$). The steady-state equilibrium is governed by the following equations describing goods and labor market clearing:

1. Given productivities and the initial distribution of population, the quantity produced in each location is determined by the production functions.

2. Given quantities produced in each location, trade costs, and exogenously given world income spent on IT, from equation 13 the price of the tradeable good (here, IT) is given by market clearing for the tradeable good $S$ in each region $d$:

$$p_{S,d} = \frac{1}{Q_{S,d}} \, \frac{(p_{S,d}\, \tau_{S,d,RoW})^{1 - \sigma_S}}{\sum_{d'} (p_{S,d'}\, \tau_{S,d',RoW})^{1 - \sigma_S}} \, Y_{RoW} \tag{16}$$

3. Given quantities produced in each location, trade costs, and the price of the tradeable good, from equations 14 and 15 the prices of the externally non-tradeable but internally tradeable goods $S$ are given by market clearing:

$$p_{S,d} = \frac{1}{Q_{S,d}} \sum_j \tau_{dj}^{1 - \sigma_S}\, p_{S,d}^{1 - \sigma_S}\, P_j^{\sigma_S - 1}\, \alpha \left( \sum_S p_{S,j} Q_{S,j} \right) \tag{17}$$

where the price index satisfies $P_d^{1 - \sigma_S} = \sum_j \tau_{jd}^{1 - \sigma_S}\, p_{S,j}^{1 - \sigma_S}$, $\left( \sum_S p_{S,j} Q_{S,j} \right)$ is the income of region $j$,¹³ and $\alpha$ is the proportion of income spent on the good.

4. Given prices of both tradeable and non-tradeable goods, the wages of workers with field of education $f$ working in industry $S$ in region $d$ are given by:

$$w_{f,dS} = p_S \, A_{f,dS} \, Q_{Sd}^{\frac{1}{\rho_S}} \, Q_{hSd}^{\frac{1}{\rho_{hS}} - \frac{1}{\rho_S}} \left( L_{f,dS}\, \Gamma\left(1 - \frac{1}{\theta}\right) \right)^{-\frac{1}{\rho_{hS}}} \tag{18}$$

5. Given wages and prices, migration flows for education determine the population distribution of skill at each location.
The number of people who study field $f$ in location $o_2$, summing over all origin locations $o_1$, is given by:

$$L_{o_2 f} = \sum_{o_1} l_{o_1, o_2 f}\, L_{o_1} \tag{19}$$

where $L_{o_1}$ is the college-eligible population in $o_1$.

6. Given wages, prices, and the distribution of skill in each region, the distribution of people with skill $f$ working in industry $S$ in region $d$ is given by:

$$L_{f,dS} = \sum_{o_2} m_{o_2 f,dS}\, L_{o_2 f}$$

7. In the steady state, the initial distribution of population working in different industries with different skill levels is equal to the final distribution.

This completes the description of equilibrium in this model.¹⁴

¹³ Note that conditions 2 and 3 automatically ensure that the trade balance condition is maintained. Summing over $d$ in condition 2, one can easily see that the sales from IT in the domestic country equal the amount of income spent on IT goods in the foreign country. For balanced trade, the amount of income spent on IT in the foreign country also has to equal the amount of income spent on imports by the domestic country. Condition 3 uses the fact that the income spent on non-IT goods by each region is a proportion $\alpha$ of its income, that is, $\alpha(p_{IT} q_{IT} + p_{NonIT} q_{NonIT})$. This implies that $(1 - \alpha)(p_{IT} q_{IT} + p_{NonIT} q_{NonIT})$ is spent on imports. Since condition 2 ensures that the sum of imports is equal to the value of sales from IT, trade balance is maintained.

¹⁴ In Online Appendix C.3, I show that a competitive general equilibrium exists.

5.6 Summary of the mechanics of the model

This section describes how a rise in the demand for the externally traded good, in this case IT, affects employment, education, and ultimately welfare of individuals in different regions within the country.
The rise in IT export demand translates into differential changes in IT real wages across regions, depending on the region's geographic location, which determines how difficult it is to migrate there, and the region's comparative advantage in IT, as measured by historical regional software exports and the region's linguistic distance from English. People start moving into regions where the net utility from migration is higher. Given two regions with exactly the same cost of migration, individuals will move to the region where IT real wages are higher. Given the same real wages in two regions, individuals will move to the region that is less costly to reach. This is where the mobility costs for work matter. This part of the model resembles a specific factors model in that engineers are in greater demand in the IT sector. Up to this point, this is a standard spatial model with no changes in skills.

But then the rise in real wages changes the incentives for higher education, especially engineering, since engineers are more intensively hired in the IT sector. Individuals who are closer to skilled jobs or who are closer to good education facilities are more likely to get educated. Thus, enrollment rises. This is where the mobility costs for education matter, and this is the new component that I add to existing spatial models. This generates a Heckscher-Ohlin (HO) type response to changes in skilled wages.

6 Identification and estimation

In this section, I estimate the structural parameters that determine the migration and IT trade costs and measure the expenditure shares on goods using available expenditure data. I then use the estimated parameters and the measured quantities, along with the available data on employment, wages, migration, and enrollment, to back out the unknown amenities and productivities consistent with the model. Adapting the model for estimation, I assume $F = 3$, where $f \in F$.
$f$ can be a college degree in engineering, a college degree in any other field (henceforth referred to as non-engineering), or no college degree at all. There are two types of high-skilled workers: those who complete a college degree in engineering and those who complete a college degree but not in engineering. There is only one type of low-skilled worker: those who do not go to college. There are 7 sectors in the economy ($S = 7$): agriculture and allied activities, manufacturing, wholesale and retail trade, a non-tradeable sector, services other than IT, the IT sector, and an importable sector. IT is only consumed by the RoW. Goods in the importable sector are not produced domestically but are consumed domestically. Goods in the other sectors are all traded internally between districts, except in the non-tradeable sector, which includes, for example, construction.

6.1 Estimation of migration costs

6.1.1 Estimation of migration costs due to education

In this section, the migration costs of people moving to acquire college education are estimated. Taking the logarithm on both sides of equation 9, the ideal gravity equation of flows of people from $o_1$ to $o_2$ who move to study field $f$ would be:

$$\ln(l_{o_1 o_2 f}) = \gamma \ln(\mu^{1}_{o_1 o_2}) + \gamma \frac{(1 - I_U)}{I_U\, \theta} \ln(\Phi_{o_2 f}) - \ln(\Phi_{o_1}) + a_{o_1 o_2 f} \tag{20}$$

where $f$ is engineering, non-engineering, or no college. The equation states that the proportion of people who move from $o_1$ to $o_2$ to study field $f$ depends on:

i) the expected return from studying $f$ in $o_2$, $\ln(\Phi_{o_2 f})$;
ii) the bilateral migration costs of moving from $o_1$ to $o_2$, given by $\mu^{1}_{o_1 o_2}$;
iii) the geographic advantage of the origin district, determined by its proximity to regions with good job and education opportunities ($\Phi_{o_1}$);
iv) $a_{o_1 o_2 f}$, an error term which captures any unobserved preferences for education that vary by origin and destination.
However, flows of people disaggregated by reason for migration, origin, destination, and field of study are not available. Instead, the data report the number of people migrating for education from every origin district $o_1$ to every destination district $o_2$, aggregated across all fields of education $f$ that they chose to study. Aggregating equation 20 across all fields of education and taking the logarithm on both sides, one gets the following equation:

$$\ln(l_{o_1 o_2}) = \gamma \ln(\mu^{1}_{o_1 o_2}) - \ln(\Phi_{o_1}) + \gamma \ln\left( \sum_f (a_{o_2 f})^{I_U}\, \Phi_{o_2 f}^{\frac{1 - I_U}{\theta}} \right) \tag{21}$$

Following the migration gravity literature, I parameterize the costs of migration in equation 20 so that they depend on geographic and cultural distances:

$$\ln(\mu^{1}_{o_1 o_2}) = \lambda_1 \ln(DistCentroid_{o_1 o_2}) + \lambda_2\, lang_{o_1 o_2} + \lambda_3 D^{diff-NBR}_{o_1 o_2} + \lambda_4 D^{same-NBR}_{o_1 o_2} + \lambda_5 D^{same-notNBR}_{o_1 o_2} \tag{22}$$

where $DistCentroid_{o_1 o_2}$ measures the distance between district centroids and $lang_{o_1 o_2}$ measures the proportion of people speaking a common language in districts $o_1$ and $o_2$. If two districts belong to different states but share a border, $D^{diff-NBR} = 1$. If two districts belong to the same state and also share a border, $D^{same-NBR} = 1$. If two districts belong to the same state and are not neighbors, $D^{same-notNBR} = 1$.

The estimating equation becomes:

$$\log(l_{o_1 o_2}) = f_{o_1} + f_{o_2} - \tilde{\lambda} \ln dist_{o_1 o_2} + \epsilon_{o_1 o_2} \tag{23}$$

where $f_{o_2} = \gamma \log\left( \sum_f (a_{o_2 f})^{I_U}\, \Phi_{o_2 f}^{\frac{1 - I_U}{\theta}} \right)$, $f_{o_1} = -\log \Phi_{o_1}$, $\ln dist_{o_1 o_2}$ is the vector of distance measures in equation 22, $\lambda = (\lambda_1, \lambda_2, \lambda_3, \lambda_4, \lambda_5)$, and $\tilde{\lambda} = -\gamma \lambda$ combines the elasticity of migration flows to migration costs and the elasticity of migration costs to distance. $\epsilon_{o_1 o_2}$ includes any measurement error or random factors not correlated with the distance measures. $(\mu^{1}_{o_1 o_2})^{\gamma}$ can be estimated using equations 22 and 23 in the usual gravity estimation framework with origin fixed effects, destination fixed effects, and bilateral measures of distance.
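A gravity regression of this form can be sketched as follows. The example fits a Poisson pseudo-maximum-likelihood model with origin and destination dummies and log distance by Newton steps on the Poisson score. It is a stand-in for whatever PPML routine was actually used, written in plain Python so that it is self-contained. The four districts, their distances, the fixed effects, and the elasticity of -0.6 used to generate the flows are all hypothetical, and the language and contiguity regressors of equation 22 are omitted for brevity.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= factor * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def ppml(X, y, iters=50):
    """Poisson pseudo-maximum likelihood: E[y | X] = exp(X beta)."""
    k = len(X[0])
    beta = [0.0] * k
    beta[0] = math.log(sum(y) / len(y))  # start the constant at log(mean flow)
    for _ in range(iters):
        mu = [math.exp(sum(b * x for b, x in zip(beta, row))) for row in X]
        score = [sum(row[j] * (yi - mi) for row, yi, mi in zip(X, y, mu)) for j in range(k)]
        hess = [[sum(row[i] * row[j] * mi for row, mi in zip(X, mu)) for j in range(k)]
                for i in range(k)]
        step = solve(hess, score)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-10:
            break
    return beta


# Hypothetical symmetric distances (km) between four districts.
dist = {(0, 1): 100.0, (0, 2): 300.0, (0, 3): 800.0,
        (1, 2): 250.0, (1, 3): 600.0, (2, 3): 150.0}
f_o = [5.0, 5.5, 4.5, 5.2]   # hypothetical origin effects
f_d = [1.0, 0.5, 1.5, 0.8]   # hypothetical destination effects
TRUE_ELASTICITY = -0.6       # used only to generate the example flows

X, y = [], []
for o in range(4):
    for j in range(4):
        if o == j:
            continue
        ln_d = math.log(dist[(min(o, j), max(o, j))])
        # Columns: constant, origin dummies (1-3), destination dummies (1-3), log distance.
        row = ([1.0] + [float(o == k) for k in (1, 2, 3)]
               + [float(j == k) for k in (1, 2, 3)] + [ln_d])
        X.append(row)
        y.append(math.exp(f_o[o] + f_d[j] + TRUE_ELASTICITY * ln_d))

beta = ppml(X, y)
distance_elasticity = beta[-1]
```

Because the example flows are generated without noise, the fitted distance coefficient should coincide with the elasticity used to generate them; with real migration counts, the same estimator also accommodates zero flows, which is the reason for preferring it over a log-log regression.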
Given bilateral migration data on the number of people moving from district $o_1$ to district $o_2$ to acquire education, together with bilateral geographic and cultural distances, the composite parameter $\tilde{\lambda}$ is identified in the cross-section by the elasticity of migration flows to distances. The key assumption required for the identification of $\tilde{\lambda}$ is that the unobserved error term $\epsilon_{o_1 o_2}$, which is not derived from the model and does not represent any structural object, is random measurement error and is uncorrelated with bilateral district-to-district cultural and geographic distances. Regression 23 thus gives an estimate of

$$(\mu^{1}_{o_1 o_2})^{\gamma} = \exp\left( \gamma \lambda_1 \ln(DistCentroid_{o_1 o_2}) + \gamma \lambda_2\, lang_{o_1 o_2} + \gamma \lambda_3 D^{diff-NBR}_{o_1 o_2} + \gamma \lambda_4 D^{same-NBR}_{o_1 o_2} + \gamma \lambda_5 D^{same-notNBR}_{o_1 o_2} \right)$$

The results of the estimation are given in Table XI, column 1. According to these estimates, for every 1% increase in distance, migration for education falls by 0.60%. The interpretation of all the columns and coefficients is discussed in subsection 6.1.2, after the estimation of work migration costs.

6.1.2 Joint estimation of migration due to work and amenities

To estimate the migration costs for work, the ideal regression would be the log of equation 6, the migration flow equation for work:

$$\log(m_{o_2 f,dS}) = \theta \log(w_{f,dS}) + \theta \log \mu^{2}_{o_2 d} - \theta \log(P_d) + \theta \log u_{f,dS} - \log \Phi_{o_2 f} + \epsilon_{o_2 f,dS} \tag{24}$$

This relates the proportion of people from $o_2$ with degree $f$ who move to $d$ to work in sector $S$, ($m_{o_2 f,dS}$), to the wages of workers with degree $f$ working in sector $S$ in location $d$ ($w_{f,dS}$), bilateral migration costs ($\mu^{2}_{o_2 d}$), destination-specific prices ($P_d$), and the option value of a degree from location $o_2$ in field $f$, ($\Phi_{o_2 f}$). The option value summarizes the job opportunities available to a person who has completed degree $f$ in location $o_2$. The error term, $\epsilon_{o_2 f,dS}$, represents random measurement errors.
However, the available data do not report bilateral migration flows disaggregated by degree of education and sector of work. The data give the number of people who moved from district $o_2$ to district $d$ for work, aggregated across all fields of education and all sectors of work. Summing across all fields $f$ and sectors $S$, the estimable regression equation is given by:

$$\ln(m_{o_2 d}) = \theta \log\left( \sum_{f,S} \frac{w_{f,dS}}{\Phi_{o_2 f}} \right) + \theta \log(\mu^{2}_{o_2 d}) - \theta \log P_d + \theta \sum_{f,S} (\log u_{f,dS}) + \epsilon_{o_2 d} \tag{25}$$

This relates the flow of people who move from location $o_2$ to location $d$ for work to the average wage in location $d$ in field $f$, weighted by the option value of studying $f$ ($\Phi_{o_2 f}$). Since the option value of education varies by origin ($o_2$) and field of education ($f$), the relative attractiveness of a destination is no longer separable into just the origin and destination fixed effects, as in traditional gravity models. The problem is that the option value of education contains the unobserved migration costs $\mu^{2}_{o_2 d}$ and amenities ($u_{f,dS}$), and so this relative attractiveness is not known. If we treat this relative attractiveness of a destination as unknown, the presence of unobserved migration costs in the error will bias the estimate of $\theta$.

First, for a relatively remote district $o_2$, a rise in the bilateral migration cost to $d$ will reduce migration to $d$, but by less than for a district that is relatively well-connected to regions with employment opportunities, since people from the remote district have fewer options to choose from. Even this effect will differ according to an individual's skill level, depending on how valuable destination $d$ is for that skill group. Thus, the presence of this unaccounted-for and unknown remoteness measure in the error term will bias the estimate of the elasticity of migration costs downward.
On the other hand, for people in well-connected locations, if the migration cost to a particular district falls, they can more easily turn to other districts compared to their more remote counterparts, and this effect varies according to their field of training $f$. For districts in well-connected locations, the elasticity of migration to migration costs is thus over-estimated. On aggregate, it remains an empirical question as to which effect dominates. The costs of migration depend on distance:

$$\log \mu^{2}_{o_2 d} = \zeta_1 \log(DistCentroid_{o_2 d}) + \zeta_2 \log lang_{o_2 d} + \zeta_3 D^{diff-NBR}_{o_2 d} + \zeta_4 D^{same-NBR}_{o_2 d} + \zeta_5 D^{same-notNBR}_{o_2 d} \tag{26}$$

Rewriting the estimating equation by inserting the migration cost in terms of distance,

$$\ln(m_{o_2 d}) = \theta \log\left( \sum_{f,S} \frac{w_{f,dS}}{\Phi_{o_2 f}} \right) - \tilde{\zeta} \log dist_{o_2 d} - \theta \log P_d + \theta \sum_{f,S} (\log u_{f,dS}) + \epsilon_{o_2 d} \tag{27}$$

where $\log dist_{o_2 d}$ is the vector of distances mentioned above, $\tilde{\zeta} = -\theta \zeta$, and $\zeta = (\zeta_1, \zeta_2, \zeta_3, \zeta_4, \zeta_5)$.

I use a nested non-linear least squares approach to estimate $\tilde{\zeta}$. The idea is to explicitly account for the effect of the unobserved option value of education by location and degree, thereby correcting the source of the bias in traditional gravity estimation. After accounting for the unobserved option value of education as I will describe below, the moment condition that identifies $\tilde{\zeta}$ is:

$$E(\epsilon_{o_2 d} \mid \ln dist_{o_2 d}) = 0$$

The assumption is that after accounting for the unobserved attractiveness of the destination region relative to the origin region, the remaining unobserved term is white noise, uncorrelated with migration costs.
Since the option value of education contains the unobserved amenities, in practice, for estimation, I use a nested non-linear least squares procedure in which I make a guess of migration costs and use labor market data on workers across fields, sectors of occupation, and locations to back out the unknown amenities, as shown in equation 28 below.¹⁵ Using the fact that employment in sector $S$ of people with field of education $f$ in region $d$ is the sum of individuals with a degree in field $f$ migrating to $d$ to work in $S$, we get the following expression for the unique district and field of education amenities:

$$u_{f,dS} = \frac{\left( L_{f,dS} \Big/ \sum_{o_2} \dfrac{(\mu^{2}_{o_2 d})^{\theta}\, L_{o_2 f}}{\sum_{d'} \sum_{S'} (\mu^{2}_{o_2 d'})^{\theta} \left( \frac{w_{f,d'S'}}{P_{d'}} \right)^{\theta} u_{f,d'S'}^{\theta}} \right)^{\frac{1}{\theta}}}{\dfrac{w_{f,dS}}{P_d}} \tag{28}$$

¹⁵ See the equations in Online Appendix C.4 for detailed derivations.

In the outer loop, I choose migration cost parameters to minimize the distance between the bilateral migration flows predicted by the model in equation 27 and those observed in the data in 2001, the only year for which such detailed migration data are available. Given the assumption that migration costs do not change during the period under study, I use the estimated migration costs and the distribution of employment after the boom to recover unknown amenities. As unknown amenities are recovered in the last step from the distribution of population in each location, I update the amenities and re-estimate equation 27 until the migration costs converge. Note that the estimation of unknown amenities requires an estimate of $\theta$, which is described in Section 6.1.3.

In the same way, given estimated migration costs for education and the option value of education, one can use the population with and without college degrees in each location to solve for the unknown quantities $a_{o_2 f}$, which include net quality as well as unobserved preferences for education. The results of the estimation procedure are given in Table XI.
Columns 1 and 2 report the estimation results of the traditional PPML gravity regression where the reasons for migration are education and work, respectively. In the third column, I report the estimates for work using the non-linear least squares method. Reading off columns 1 and 3 of Table XI, a 1% increase in distance reduces migration by 0.60% when the reason for migration is education and by 0.96% when the reason for migration is work. These estimates are comparable with findings in the traditional gravity literature. For example, Bryan and Morten (2019) find that a 1% increase in distance leads to a 0.7% reduction in the proportion migrating. Note that migration cost estimates from the literature, including Bryan and Morten (2019), do not separate migration costs by reason for migration, so their estimates could be a weighted average of the elasticities of migration flows to distance for work and for education.16

With these estimates, the average iceberg cost of migrating for work within a state, across either neighboring or non-neighboring districts, is 0.88. This means that migrants have to be compensated about 88% more to make up for lost utility when they migrate for work across districts within a state. This loss includes many different factors, such as cultural differences, loss of home network, and transportation costs. The cost is about 90% when people move for education.

16 They do not separately account for cultural differences. As long as geographic distances are positively correlated with cultural distances, their estimates would be over-stating the effect of physical distance.
The notable point from Table XI is that the effect of state borders is large, and especially so for education: crossing a state border makes migration 2.17 times costlier when migrating for work, and about 2.135 times costlier when the reason for migration is education.17 Many state-level policies, such as quotas for in-state students, create barriers to mobility that differ according to the reasons for migration. These estimates suggest that there could be scope for policy interventions in reducing the mobility costs across state borders.

In the literature, only combined estimates of migration costs aggregated across reasons for migration are available, and these estimates vary across countries. For example, Tombe and Zhu (2019) find that migrants, on average, have to be compensated 82% more than non-migrants in China in 2000, and this cost is almost 1.75 times larger when workers migrate across provinces than when they stay within their own province. Bryan and Morten (2019) find that in Indonesia, on average, migrant workers have to be paid 38% more if they were to receive the same wage as non-migrant workers. They estimate a much lower cost, only 15%, for the United States. It would be interesting to compare the finding of a substantially larger migration cost for education relative to work in India with other countries, if such data become available.

6.1.3 Estimation of elasticity of migration flows to migration costs

The elasticity of migration flows to migration costs is the dispersion parameter θ that governs the variance of the idiosyncratic component of workers' productivity draws. The higher the value of θ, the lower the variance in productivity, and thus the more alike workers are. This means that workers tend to respond more similarly to changes in migration costs compared to when they are more heterogeneous in their productivities.
Thus, for a given rise in migration cost, the higher is θ, the larger is the fall in migration. Following Fan (2019), I use the variance in the wage distribution of stayers, that is, the wage distribution of people who do not migrate for work, to identify θ. Using the properties of the Frechet distribution, it can be shown that the productivity distribution of stayers also follows a Frechet distribution where the mean varies by field of education, sector of work, and location of degree. Refer to Online Appendix C.5, Proposition 3, for the proof. For any $(f, d, S)$ combination, the wage observed in the data is the effective wage $\tilde{w}_{f,dS}$, where

$$\tilde{w}_{i f,dS} = w_{f,dS}\, \eta_{i f,dS}$$

Taking logs on both sides,

$$\ln(\tilde{w}_{i f,dS}) = F_{f,dS} + \ln(\eta_{i f,dS})$$

where $F_{f,dS}$ is a sector of job, field of education, and district fixed effect, which is a combination of the average wage per effective unit of labor and the average productivity of stayers. The variance of the exponentiated residuals ($\eta_{i f,dS}$) identifies θ, which turns out to be 2.61. This is very similar to the estimate of Fan (2019), who used the same method to estimate these elasticities to be within the tight range of 2.50 to 2.73. The assumption is that after controlling for field of education, sector, and location of work, the remaining variation in individual wages for those who stay back in the same location is due to variation in the idiosyncratic component, which can include factors such as ability, talent, and family background.

17 The state border effects are calculated as $\exp(5.381-3.339)^{1/2.61}$ for work and $\exp(3.646-2.339)^{1/1.637}$ for education. The estimated elasticities of migration costs with respect to work and education are 2.61 and 1.637 respectively, as shown in Section 6.1.3.
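To see why the dispersion of stayers' residual wages is informative about θ, the following sketch simulates Frechet productivity draws and recovers the shape parameter from them. It relies on a standard property that is assumed here for convenience: if η is Frechet with shape θ, then ln η is Gumbel with variance π²/(6θ²). This log-based moment is a simplification and may differ from the exact moment of exponentiated residuals used in the estimation; the sample size, seed, and function names are hypothetical.

```python
import math
import random


def simulate_log_frechet(theta, n, seed=0):
    """Draw n values of ln(eta), with eta ~ Frechet(shape=theta, scale=1).

    If E ~ Exponential(1), then E ** (-1 / theta) is Frechet(theta),
    so ln(eta) = -ln(E) / theta.
    """
    rng = random.Random(seed)
    return [-math.log(rng.expovariate(1.0)) / theta for _ in range(n)]


def theta_from_log_residuals(log_residuals):
    """Invert Var(ln eta) = pi**2 / (6 * theta**2) to get theta."""
    n = len(log_residuals)
    mean = sum(log_residuals) / n
    variance = sum((x - mean) ** 2 for x in log_residuals) / n
    return math.pi / math.sqrt(6.0 * variance)


# Simulate residual log wages of stayers under a known theta, then recover it.
draws = simulate_log_frechet(theta=2.61, n=200_000)
theta_hat = theta_from_log_residuals(draws)
print(f"recovered theta: {theta_hat:.3f}")
```

A higher θ compresses the simulated log residuals, which is the sense in which less wage dispersion among stayers maps into a larger migration elasticity.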
Given the estimates of θ (elasticity of migration flows to migration costs for work) and ζ (elasticity of migration flows to distance for work) from Section 6.1.2, it is possible to separately identify the elasticity of migration flows to migration costs for education (γ). The assumption required for this identification is the following: the elasticity of migration costs to geographic distance is the same irrespective of the reason for migration, once institutional boundaries such as state borders and neighboring-district dummies have been accounted for.

Note that this assumption does not require the elasticity of migration flows to distance to be the same for work and education. In fact, these elasticities are very different, as estimated above. It only requires the costs of migration to respond to geographic distances in exactly the same way, once we have accounted for state-specific institutional barriers such as heterogeneous state policies for work and for education. A violation of this assumption would be any factor that increases or decreases the migration costs for education relative to work over the same geographic distance, for example the provision of special transportation for students.

By assumption, $\zeta_1 = \lambda_1$. Thus,

$$\frac{\lambda}{\gamma} = \frac{\zeta}{\theta}$$

where λ is the elasticity of migration flows to distance for education. Given θ = 2.61, ζ = .961, and λ = .602, the above identity, rearranged as $\gamma = \lambda\theta/\zeta$, yields γ = 1.636. This completes the description of my estimation strategy for migration costs.

6.2 Trade costs

6.2.1 Trade costs in the IT sector

In this model, IT is the only good traded with the RoW and it is not consumed domestically. Taking the logarithm on both sides of equation 13, the gravity equation expressing IT trade flows as a function of IT prices and comparative advantage, and dropping IT from the notation, I get the following estimating equation:

$$\ln\left(\frac{E_{d,t}}{E_{RoW,t}}\right) = C + (1-\sigma_{IT})\ln(\tau_d) + (1-\sigma_{IT})\ln(p_{d,t}) \qquad (29)$$

where $C = -(1-\sigma_{IT})\ln\left(\sum_d (\tau_d p_d)^{1-\sigma_{IT}}\right)$ is a quantity that is constant across districts.
Following Shastry (2012) and Banerjee and Duflo (2000), I parameterize the costs of exporting IT as a function of the linguistic distance of district d from English and the prior software exports in 1995. Let $dist_{d,IT}$ be the vector denoting the linguistic distance of each district from English and the proportion of historical software exports from d, measured using 1995 export data.

$$(1-\sigma_{IT})\ln(\tau_d) = \kappa_{IT}\ln(dist_d) \qquad (30)$$

The historical comparative advantage of a district in this sector depends on the prior links of a district to the RoW, measured by the proportion of software exports historically exported from that district. Prior connections, through building reputation, play an important role in determining the volume of transactions in this sector (Banerjee and Duflo (2000)). Shastry (2012) showed that the linguistic distance of each regional language spoken in a district from English determines the cost of learning English for individuals in that district. Since English proficiency is a necessary skill in this industry, the comparative advantage of a district also depends on the linguistic distance of the district from English.

Note that price is unobserved since it includes the unobserved productivities. Using the structure of the production function and marginal cost pricing, price can be log-linearly decomposed into its known and unknown (productivity) components.
After taking first differences, this decomposition helps us express equation 29 in the form below: Ed,t ˜ne,d,t ) ∆ ln( ) = (1 − σIT )∆OCdt + ∆ ln(A (31) ERoW,t where: 1 1−ρIT (−ρ ) ph OCdt =ln((˜d,t ˜l,d,tIT (wl,d,t )1−ρIT ) (1−ρIT ) ) is the observable part of Marginal +x 29 Cost (MC) ρh,IT 1 (ρ h,IT −1 )( ρIT −1) 1 Ane,d,IT Wl,d,IT (Ll,d,IT ) ρIT ˜l,d,IT = x 1 −1 = 1 ρ ˜h p ˜ ρIT S,IT (Qh,d,IT ) IT Al,d,IT 1 1−ρIT A 1−ρ 1−ρ ˜h p d,IT e,d,IT = (( Ane,d,IT )ρh,IT we,d,IT h,IT h,IT 1−ρh,IT + wne,d,IT ) Due to the firm’s first order conditions, the ratio of productivities is a function of 1−ρIT ˜h known wages and employment, and therefore, both p d,IT ˜l,d,IT are also functions and x of observables. ˜ne,d,t ) = ∆ (1−σIT ) ρh,IT (ρIT −1) ln Ane,d,t . The full derivation can be found in ∆ ln(A (1−ρIT ) 1−ρh,IT Online Appendix C.2. Intuitively, in equilibrium, how responsive IT exports are to changes in the prices of IT depends on the elasticity of substitution between different varieties of IT products (σIT ), where each variety corresponds to a region. The lower the elasticity of substitution, the more difficult it is to switch to a different variety as the price of a particular variety rises, and the less responsive is IT demand to IT prices. Since in general equilibrium, the unobserved district specific productivities in the error term also determine the marginal cost of production, σIT cannot be recovered through a linear regression of IT exports on the observed part of marginal cost. I construct an instrument by leveraging the IT boom of the late 1990s and early 2000. As demand for IT increased, the prices of IT increased in all regions that produce IT. However, the capacities of IT production differ across regions. In particular, regions that are better connected geographically to other populous regions could expand supply more because people can migrate more easily into these regions and thus the supply of labor in these regions is more elastic. 
Also, regions with higher historical software exports had a comparative advantage in IT exports, as discussed in this section.

To formalize this intuition, I develop an instrument by interacting a measure of labor supply for each region (defined below) with the historical software exports of a region. A measure of labor supply access for each region is summarized by

$$LMA_d = \sum_o L_o \left(\frac{1}{distance_{o,d}}\right)^{-1}$$

The instrument, referred to as $I_d$, is formally defined below:

$$I_d = LMA_d \times HistoricalSoftwareExport$$

Regions that are better connected historically and where the potential labor supply is high will see lower increases in marginal costs and hence lower increases in prices. On the other hand, regions that have potentially high labor supply but are not historically connected will have relatively larger increases in MC.

This estimation requires the assumption that changes in the productivity of non-engineers in the IT sector between the pre- and post-2000 boom periods are uncorrelated with the pre-period exports and the remoteness of a region during the period of the IT boom. The unobserved productivities, by model construction, do not depend on historical software exports or on the historical distribution of college-educated workers. These productivities are the residual quantities that explain the deviation of predicted output from actual output, after these known quantities are taken into account. These historical factors, in turn, are not affected by future changes in productivities. However, to account for the fact that in reality district-level productivities in the IT sector can be affected by these factors, I additionally run the following specifications. First, I include controls for the geographic remoteness of a district, as measured by the average log distance of a district to other districts. Second, I include state fixed effects to account for any differential growth in productivities across states.
In Table XII, column 1, I report the results from the OLS estimation. Columns 2, 3, and 4 report the results from the IV estimation. The regression in column 3 controls for the remoteness of districts, and the regression in column 4 additionally controls for state fixed effects, which in a first-difference equation implies controlling for a linear state-level time trend. Export demand responds negatively to changes in observable prices (MCs). In the most demanding specification, reported in column 4, I find that a 1% increase in prices leads to a .45% fall in demand, which translates into an elasticity of substitution of 1.45. The first stage is reported in Table XII, panel B. The instruments are strong across all specifications (F-stats of 210.37, 116.20, and 47.66). The IV has the right sign: marginal cost is lower in districts that have greater access to college-educated workers and are more connected abroad.

The estimated elasticity of substitution, at 1.45, is low compared to the elasticities of substitution between different varieties that the literature has estimated (between 3 and 5). There are a couple of caveats when comparing $\sigma_{IT}$ to the estimates from the literature. First, the literature has mostly estimated the elasticities of substitution between varieties traded internationally. Second, my estimate is specifically for the IT industry, for which there is no known estimate in the literature. The low elasticity of substitution between the IT varieties of different regions could be justified on the grounds that these regions specialize in very different types of tasks, such as data processing, software development, and multimedia graphics, as reported in the NASSCOM software data.
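The step from the regression coefficient to the elasticity of substitution can be written out explicitly. This is only a restatement of the arithmetic above, assuming the column 4 coefficient on the observable part of marginal cost is read as the price elasticity of export demand in equation 31:

```latex
\begin{align*}
\frac{\partial\, \Delta \ln\left(E_{d,t}/E_{RoW,t}\right)}{\partial\, \Delta OC_{dt}}
  &= 1 - \sigma_{IT} = -0.45 \\
\Rightarrow \quad \sigma_{IT} &= 1 + 0.45 = 1.45
\end{align*}
```

Because $\sigma_{IT} > 1$, the coefficient $(1-\sigma_{IT})$ is negative, which is consistent with export demand falling when observable marginal cost rises.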
31 6.2.2 Trade costs in the non-IT sector The iceberg transport cost is taken to be, τod = distance1 od , calibrating the distance elasticity to the canonical value of -1 (Head and Mayer (2014)).18 6.3 Quantifying sector-specific productivities We use the equality of marginal costs to prices to back out the unobserved amenities, after calculating the observable part of MC that depends on known wages and employment, following the long tradition in urban economics (see for example Allen and Arkolakis (2014), Allen et al. (2018)). As show in Online Appendix C.2 and subsection 6.2, price can be rewritten in the following way: −ρh,S (ρS −1) (pS,k ) 1−ρS 1−ρh,S = Ak ,ne,S ph (˜ 1−ρS ˜− +x ρS 1−ρS (32) S,k l,S,k (wl,S,k ) ) (−ρ ) Since (pd,IT )1−ρIT , p ˜hd,IT 1−ρIT , x IT ˜l,d,IT , (wl,d,IT )1−ρIT ) , ρh,IT are all known, we can recover Ane,d,IT using 1 1−ρIT (−ρ ) ph (˜d,IT +x IT ˜l,d,IT (wl,d,IT )1−ρIT ) 1−ρIT ρh,IT ρh,IT −1 Ane,d,IT = ) (33) pd,IT Intuitively, how the magnitude of estimated prices differ from that of the observed com- ponents of marginal cost consisting of the information on wages and employment, helps determine productivities. Note that (pd,IT ) is known by recovering it from equation 13, 1−ρIT (−ρ ) ˜h given estimated trade costs σIT and exports. p d,IT and x IT ˜l,d,IT are known as they are functions of observables. Finally we recover the productivity of low-skilled workers in the IT sector Al,d,IT in all locations by using the firm’s first order condition below: ρh,IT 1 (ρ h,IT −1 )( ρIT −1) 1 Ane,d,IT Wl,d,IT (Ll,d,IT ) ρIT ˜l,d,IT = x 1 −1 = 1 ρ ˜h p ˜ ρIT d,IT (Qh,d,IT ) IT Al,d,IT and then recover Ae,d,IT by using the firm’s first order conditions. 
To recover prices in the internally traded sectors, use equations 14 and 15, the identities that state that the income of sector S in district d equals the sum of exports from sector S in district d to all other districts, and the expenditure of district d on sector S good must equal the sum of imports of good S from all other districts, respectively. Combining these two equations, prices can be expressed as: 18 The only estimate for India is from Donaldson (2018) who estimates it to be -1.69 from Colonial India. Since connectivity has much improved since then, taking the classic estimate seems more appropriate in this case. 32 −σS 1−σS 1−σS −1 EjS p1 d,S = (τdj )1−σS ( τkj pkS ) (34) j k YdS where S is any sector other than IT and the importable goods sector. Income of each region in each sector YdS is obtained by summing over the products of wage bill and employment. Expenditure of each region on sector S goods EdS is calculated given share of GDP spent on sector S good. Internal trade costs τjdS are calculated given distances between districts and σS from the literature. Productivities in the internally traded sector can be recovered in exactly the same way as in the IT sector described above. 6.4 Calibration from the literature The elasticity of substitution between engineers and non-engineers is calibrated to 2 across all sectors Ryoo and Rosen (2004). The elasticity of substitution between high and low skilled labor (college and non-college graduates) is taken to be 1.7 from Khanna and Morales (2017) which apply Card and Lemieux (2001) methodology to Indian data and find the estimate to be consistent with the literature (such as in Katz and Murphy (1992), Card and Lemieux (2001) and Goldin and Katz (2007)). The elasticity of substitution σ between different types of goods traded internally within India is taken to be 5 following Simonovska and Waugh (2014). Several other papers estimate elasticities of substitution that are close. 
For example, Van Leemput (2021) estimates the elasticity of substitution between different types of agricultural goods in India as 5.6. The weight on current period utility (IU) is taken as 0.53, which corresponds to an intertemporal elasticity of substitution of 0.9, which is standard in the literature. The share of consumption expenditure on agriculture ($\alpha_2 = 0.38$), on manufacturing ($\alpha_3 = 0.16$), on non-traded goods ($\alpha_s = 0.37$), on high-skill services ($\alpha_1 = 0.07$), and on imports ($1-\alpha_1-\alpha_2-\alpha_3-\alpha_s = 0.02$) are all obtained from the official Government of India National Statistics. The price of US imports is normalized to 1. Table XIII summarizes the parameter values.

7 Quantification and counterfactuals

7.1 Quantifying the effect of the IT boom on welfare

I first use the model to quantify the effects of the IT boom on the Indian economy. To do that, I use data on the changes in IT exports between 1997 and 2000, holding the estimated parameters and the model fundamentals (i.e., the exogenous amenities and productivities) constant between the pre- and post-boom periods. This is the period during which major international demand shocks led IT exports to expand by more than 50% annually. Between 1997 and 2000, the share of IT in total GDP rose from 1.2% to 2.6%. Holding fixed the amenities and the productivities estimated in the model, I change the value of RoW demand for Indian IT products.19

In the model with endogenous skills, welfare, obtained directly from the individual utility maximization problem in equation 7, is given by:

$$\Gamma\left(1-\frac{1}{\gamma}\right)\left(\Phi_{o_1}\right)^{\frac{1}{\gamma}}$$

where $\Phi_{o_1}$ measures the access to higher education for college-eligible individuals from $o_1$.
$$\Phi_{o_1} = \sum_{o_2, f}\left(\left(a_{o_2 f}\,\mu^{1}_{o_1 o_2}\right)^{IU}\,\Phi_{o_2 f}^{\frac{(1-IU)}{\theta}}\right)^{\frac{\gamma}{IU}}$$

where access to education, in turn, depends on education amenities, the connectivity of a district o, and the job opportunities available to individuals from district o. The ex-ante expected welfare measure in the model is thus succinctly summarized by this measure of access to education, which varies across regions.

When skill levels are fixed, regional welfare depends on the access to jobs for each skill group, weighted by the distribution of skills. In this case, welfare, again derived from the individual utility maximization in equation 7, is given by:

$$\Gamma\left(1-\frac{1}{\theta}\right)\left(\Phi_{Eng}^{\frac{1}{\theta}}\, propEng + \Phi_{NonEng}^{\frac{1}{\theta}}\, propNonEng + \Phi_{Un}^{\frac{1}{\theta}}\, propUn\right)$$

where $\Phi_s$ measures access to jobs for skill group s, with s = engineers, non-engineers, and unskilled. With fixed skill levels, for every percentage increase in IT exports, welfare increases on average by .05%. Given that during the period of the IT boom IT exports increased on average by more than 50% per annum, this translates into an average welfare gain of 2.5% per annum.

19 The model fundamentals, i.e., the amenities and productivities, are estimated using the post-period data, since the post-period data are more widely available across regions.

Using the full general equilibrium model with endogenous skill acquisition, costs of migration for both education and work, and costs of moving goods internally, I find that for every percentage increase in IT exports welfare increases on average by .16%, that is, three times the increase in welfare with fixed skill levels. The average masks substantial variation across districts, with individuals born in districts with good access to jobs and education gaining as much as .51%, while their counterparts in remote districts experienced gains as low as 0.05% for every percentage increase in IT exports.
Over the long run, when the supply of education can respond fully (that is, after at least 5 years, since an undergraduate college degree in engineering takes 4 years), the average increase in welfare is about 27%.

Figure VIII below plots the histogram of the percentage increase in regional welfare for every percentage increase in IT exports, when education choice is endogenous and when it is not. As is clear from the two histograms, the regional inequality in welfare is much higher when education choice is not endogenous. Regional inequality without endogenous education, as measured by the coefficient of variation (CV), is .73, almost twice as large as the CV with endogenous education (.37). The gains are positively correlated with the amenities for education, productivities in the IT sector, and the geographical connectivity of a district d, roughly approximated by the sum of inverse distances of d from all other districts. Districts in large states with good facilities for engineering education and connectivity to regions with jobs experienced the largest gains. This underscores the importance of accounting for endogenous skill acquisition in a general equilibrium setting.

7.2 Quantifying the importance of mobility for work and education

To quantify the importance of mobility for education separately from work, I run a counterfactual where the option to move for education is shut off. As Figure IX shows, regional inequality is much larger in this case compared to the baseline model, but naturally lower than in the case where the option to acquire education is completely shut off (Figure VIII). The CV in regional welfare inequality with the option to acquire education only in one's home district (.53) lies in between the CVs in the baseline model (.36) and the model without any endogenous education (.73).
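For readers less familiar with the inequality measure used here, the following is a minimal sketch of how a coefficient of variation is computed from district-level gains. The two lists of gains are hypothetical numbers chosen only to show that a more dispersed distribution yields a higher CV; they are not the paper's data, and the use of the population (rather than sample) standard deviation is an assumption.

```python
import math


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return math.sqrt(variance) / mean


# Hypothetical district welfare gains (% per 1% rise in IT exports).
gains_dispersed = [0.01, 0.02, 0.05, 0.12]   # stand-in for a scenario with unequal gains
gains_compressed = [0.10, 0.14, 0.16, 0.24]  # stand-in for a scenario with more equal gains

cv_dispersed = coefficient_of_variation(gains_dispersed)
cv_compressed = coefficient_of_variation(gains_compressed)
print(f"CV dispersed: {cv_dispersed:.2f}, CV compressed: {cv_compressed:.2f}")
```

Because the CV divides by the mean, it compares dispersion across scenarios whose average gains differ, which is why it is a natural summary when the counterfactuals also change the average welfare gain.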
An advantage of separately estimating the mobility costs for work and education is that it shows which of these two costs matters more for the regional inequality in the welfare gains from the IT boom. To do this, I first quantify the welfare gains assuming that the two costs are exactly the same (situation 1). I then conduct a second counterfactual, reducing only the migration costs for work (situation 2), and a third counterfactual, reducing only the migration costs for education (situation 3). I find that reducing the mobility costs for education is more important in reducing regional inequality due to trade: the CV is .4 percentage points higher and the aggregate welfare gain from the IT boom is about 1 percentage point lower in situation 2 compared to situation 3.

The intuitive explanation for why the mobility costs for education are more important in reducing inequality than the mobility costs for work is that ease of mobility for education offers more flexibility. An individual can move for education to a college located closer to IT jobs, thereby reducing the importance of the second-stage mobility costs for work. However, a lower mobility cost for work does not matter much if a higher barrier to moving for education restricts an individual from acquiring education in the first place.

7.3 Counterfactual policy: Reducing border costs of moving for education

In the particular case of India, the widespread prevalence of in-state student quotas for higher educational institutions, reflected in the significantly higher costs of crossing state borders for education relative to work, increases the potential for districts in larger states with good educational facilities to gain more from the IT boom.
Given that migration costs in India are among the highest of the available migration cost estimates across countries, the geographical connectivity of the district also plays an important role in determining the welfare gains of the district.20

There seems to be an obvious policy intervention in the education market (for example, a reduction of state quotas for education that reduces border costs) that is easier to implement than labor market policies that aim to move jobs. The existing magnitude of quotas in higher education institutes in India is large: most state colleges have home state quotas of 50%, with some being as high as 85%.21 In the counterfactual, we look at the effects of removing all the border costs of moving for education.22 Figure X plots the histogram of regional inequality in welfare gains with and without any border effects for education.

20 For example, many districts in Uttar Pradesh, the largest state of India in terms of land area and also the number of colleges, gained more than the average district.

21 Support for the 85% reservation policy started in Maharashtra from the year 2011 with the backing of nationalist state parties. The size of the state quota varies by state and by whether the university in question is public or private, but in general it is a substantial proportion of the total class size (Kone et al. (2018)). For example, in Haryana, private universities also have to reserve 25% of their seats for students domiciled in Haryana. Such quotas are in no way unique to India, and exist in many other countries, including the United States and China.

22 The ideal experiment from a policy perspective would be to collect data on quotas at the state level and relate these quotas to the estimated border effects. But given that there is no systematic information on quotas available, I conduct a policy counterfactual of just reducing the border effects.
As is clear from the histogram, reducing the border effects for education reduces the regional inequality in the welfare gains from the IT boom. The CV falls from .36 to .15, a more than 10 pp decline.

The above counterfactual suggests that reducing state quotas, which would significantly reduce the border costs in education mobility, could help in significantly reducing the mobility costs for education. The previous counterfactual showed that reducing the costs of moving for education is more important in reducing the regional inequality in the welfare gains from an export boom than reducing the costs of moving for work. The findings of this paper thus demonstrate that even though a lot of emphasis has been placed on labor market policies for workers displaced by trade, and on the geographic inequality in the distribution of jobs, policies in the education market can be at least as effective in addressing some of the regional inequality concerns from trade. Making access to quality education more affordable, such as making it easier for residents of Assam to study in Delhi, will go a long way in addressing some of these concerns.

8 Conclusion

This paper assesses the aggregate and distributional consequences of the human capital response to trade for the spatial distribution of welfare. In answering this question, the paper makes three contributions. First, it introduces human capital acquisition decisions in a general equilibrium model with multiple locations. It shows that studying the effects of trade on the labor market without taking into account endogenous skill acquisition can underestimate the aggregate welfare gains from trade. Second, a key innovation of this paper compared to the existing literature on trade and migration is introducing potentially different migration costs for education and work.
Using confidential and unique district-to-district Indian migration data disaggregated by reasons for migration, this paper provides the first separate estimates of mobility costs by reasons for migration. I show that quantifying both of these costs separately is important, as these costs can significantly alter the aggregate and regional welfare gains from trade depending on their relative magnitudes. Third, as a result of studying the interaction of education and labor market choices in the presence of changes in export-driven employment opportunities, this paper is able to suggest new forms of policy intervention to reduce inequality in the regional welfare gains from trade, such as reforms in the higher education sector.

Despite a lot of interest surrounding the IT boom and its effect on geographic inequality in India, the lack of disaggregated data made it challenging to quantify its effect on overall economic growth. This paper also takes a step forward in collecting district-level data and building a general equilibrium model to quantify the effect of the IT boom on skill acquisition and the regional distribution of welfare gains in India. These gains are attenuated by high costs of mobility for education and for work across Indian districts, leaving scope for policy interventions in both the education and labor markets that have the potential to reduce regional inequality as well as increase aggregate welfare.

There is scope for future work to further the research agenda presented in this paper by studying the regional welfare implications of endogenous education choice with trade in a dynamic framework, which can trace how welfare changes during the transition from the short to the long run. The challenge will be to devise a way to tackle the large state space that arises as people migrate across regions and over time for work and education.
Part I
Appendix for online publication only

Table of Contents

A Data Appendix
  A.1 Education and labor market data
  A.2 Linguistic distance
  A.3 Matching NSS with Economic Census: Missing value imputation
B Reduced Form Facts
  B.1 Stylized facts 1 and 2
C Theoretical Derivations
  C.1 Worker's problem
  C.2 Firm's problem
  C.3 Existence of equilibrium proof
  C.4 Unknown amenities
  C.5 Proofs of Frechet propositions
D Model Assumptions, Extensions, and validity

A Data Appendix

A.1 Education and labor market data

Table A1: Summary statistics of labor market outcomes at district level

                                  Pre Boom                Post Boom
Variable                          Mean      Std. Dev.     Mean      Std. Dev.
Labor market: Wages
  High-skill services
    College-educated              363.64    336.98        467.18    243.15
    Non college-educated          171.02    80.28         275.17    235.45
  Manufacturing
    College-educated              324.25    570.00        307.96    209.08
    Non college-educated          115.08    111.37        152.21    104.28
Labor market: Employment
  High-skill services
    College-educated              7256      16526         9664      16125
    Non college-educated          7574      13553         11750     204375
  Manufacturing
    College-educated              25276     43491         30799     49503
    Non college-educated          34231     61375         44380     66592

Note: Wages are measured at the daily district level in local currency, rupees.

Table A2: Summary statistics of annual district-wise enrollment

Education                         Pre Boom    Post Boom
Engineers/college enrollment      5.10%       11.45%
College enrollment                12,404      32,632

A.2 Linguistic distance

I summarize the way Shastry (2012) described the construction of this index in her paper.
The 1961 Census of India documented speakers of 1652 languages from five language families. Much linguistic diversity is between districts. A district’s primary language is native to 83 percent of residents on average, ranging from 22 percent to 100 percent. Most people thus adopt a second language that is a widely accepted speaking medium across districts. Of all multi-linguals who were not native speakers, 60 percent chose to learn Hindi and 56 percent chose English. Consider an individual who is not a native Hindi speaker. All else equal, whether this individual learns Hindi or English as a second language depends on the relative cost of learning each language, which in turn depends on her mother tongue. Someone whose mother tongue is similar (not similar) to Hindi will find Hindi easier (more difficult) to learn relative to English.

To quantify the relative cost of learning Hindi versus English, Shastry constructed three measures of the linguistic distance of each native language from Hindi. The first measure classifies languages into five “degrees” of linguistic distance from Hindi based on cognates, grammar, and syntax (see Table A3). The second measure is the percent of words from a core list that are cognates of Hindi words. The third measure is based on language family trees from the Ethnologue database. These measures are highly correlated: 0.935 between degrees and percent cognates and 0.903 between degrees and nodes.
Table A3: Measures of linguistic distance to Hindi

| Degrees | % Cognates | Nodes | % Native Speakers | Sample Languages |
|---|---|---|---|---|
| 0 | 100 | 0 | 0.456 | Hindi, Urdu |
| 1 | 67.1 | 5 | 0.084 | Gujarati, Punjabi, Rajasthani |
| 2 | 56.4 | 6.5 | 0.076 | Konkani, Marathi |
| 3 | 64.1 | 7 | 0.133 | Assamese, Bengali, Bihari, Oriya |
| 4 | 53.3 | 7.3 | 0.005 | Kashmiri, Sindhi, Sinhalese |
| 5 | 5 | 12.5 | 0.244 | All non-Indo-European languages |

Source: Shastry (2012).

From the 1991 Census of India, Shastry calculates a district’s linguistic distance from Hindi in two ways: (1) the population-weighted average distance of all native languages from Hindi, and (2) the population share of languages at least 3 degrees away from Hindi. All the analysis that follows is conducted with measure (2), but the results are robust to using measure (1) instead.

Shastry proxies English-learning costs by linguistic distance from Hindi. One may think the natural proxy is linguistic distance from English, but it is the relative cost of learning Hindi and English that should determine which language one learns. A native Hindi speaker can choose to learn English as a second language at a much lower cost than a non-native speaker whose language is close to Hindi. There is therefore a non-monotonicity in the relationship: native Hindi speakers are more likely to learn English, but speakers of languages close to Hindi learn Hindi rather than English. As distance to Hindi rises, the probability of learning English as a second language rises, except at distance 0. Shastry (2012) shows that such a relation holds. From now on, I use “linguistically distant from Hindi” and “linguistically closer to English” interchangeably.

A.3 Matching NSS with Economic Census: Missing value imputation

The NSS is a sample, as opposed to the Census, which is a complete enumeration. In the NSS, individuals are sampled so that the sample is representative.
However, there are many cells in the NSS where no individual working in sector S with a degree in s in district d has been interviewed, even though according to the Census there are individuals working in sector S with a degree in s in district d. To impute these missing values I use a weighted k-nearest-neighbors (kNN) procedure, where the weights can be uniform (i.e., only the reciprocal of the distance) or given by a Gaussian, Epanechnikov, cosine, or other isotropic kernel. This machine learning technique uses training data to choose the value of “k” that minimizes the sum of squared distances between the actual and predicted values, where the predicted values are obtained by taking a weighted average of the variable’s values for the “k” nearest neighbors. I have used the uniform kernel as the weight here.

B Reduced Form Facts

B.1 Stylized facts 1 and 2

The table below reports the regression results underlying the total IT employment graph reported in the reduced form facts section.

Table A4: Response of the IT sector

Dependent variable in columns (1) to (4): total employment. Each regressor enters two of the four columns; its estimates are listed in column order, with t-statistics in parentheses.

| Regressor | First estimate | Second estimate |
|---|---|---|
| 1998 × Software Exports 1995 | 0.028 (0.22) | 0.028*** (5.73) |
| 1999 × Software Exports 1995 | 0.175 (1.58) | 0.175*** (4.28) |
| 2002 × Software Exports 1995 | 0.361*** (3.37) | 0.361*** (3.10) |
| 2003 × Software Exports 1995 | 0.373*** (3.31) | 0.373*** (2.79) |
| 2005 × Software Exports 1995 | 0.310** (2.18) | 0.310 (1.54) |
| 2013 × Software Exports 1995 | 0.395** (1.99) | 0.395 (1.63) |
| 1998 × AnyExports | 0.129** (2.40) | 0.129** (2.14) |
| 1999 × AnyExports | 0.932*** (3.11) | 0.932*** (3.25) |
| 2002 × AnyExports | 1.858*** (2.81) | 1.858*** (2.91) |
| 2003 × AnyExports | 1.823*** (2.61) | 1.823** (2.65) |
| 2005 × AnyExports | 1.773** (2.31) | 1.773** (2.59) |
| 2013 × AnyExports | 3.849** (2.19) | 3.849** (2.42) |

Clustering, columns (1) to (4): District, District, State, State. Observations: 3880 in each column.

Table A5: Response of IT employment, with historical software exports predicted by linguistic distances

Panel A: Second Stage

| Regressor | (1) Standardized Employment |
|---|---|
| 1995 × Standardized Historical Export | 0.000 (.) |
| 1998 × Standardized Historical Export | 0.033 (0.08) |
| 1999 × Standardized Historical Export | 0.318 (0.97) |
| 2002 × Standardized Historical Export | 0.645** (1.91) |
| 2003 × Standardized Historical Export | 0.667* (1.88) |
| 2005 × Standardized Historical Export | 0.665** (1.92) |
| 2013 × Standardized Historical Export | 0.955 (1.58) |
| Constant | -0.08*** (-3.47) |
| N | 3880 |

Panel B: First Stage

| Regressor | (1) Standardized Employment |
|---|---|
| Hindi speakers | 0.115*** (2.62) |
| English speakers | 1020.29*** (23.46) |
| Linguistic distance | 0.007 (0.614) |
| Constant | -0.18*** (-3.30) |
| N | 2731 |
| F(3, 2727) | 184.29 |
| Adjusted R-squared | 0.167 |

Table A6: Response of the Education Sector

Dependent variable: Engineers in columns (1) to (3), Proportion of Engineers in columns (4) to (6). Each regressor enters three of the six columns; its estimates are listed in column order, with t-statistics in parentheses.

| Regressor | First estimate | Second estimate | Third estimate |
|---|---|---|---|
| AnyExports | 0.000 (.) | 0.000 (.) | 0.000 (.) |
| 2001 × AnyExports | 1.449 (1.40) | 1.449** (2.43) | 1.449** (2.23) |
| 2011 × AnyExports | 4.998*** (3.08) | 4.998** (2.51) | 4.998** (2.18) |
| Historical Software Exports | 0.000 (.) | 0.000 (.) | 0.000 (.) |
| 2001 × Historical Software Exports | 0.104 (1.56) | 0.104** (2.39) | 0.104** (2.43) |
| 2011 × Historical Software Exports | 0.348*** (3.33) | 0.348** (2.23) | 0.348** (2.36) |

N, columns (1) to (6): 1726, 1726, 1726, 1724, 1724, 1724.

Table A7: Heterogeneous Response of IT Employment

Dependent variable in columns (1) to (6): total employment. Each regressor enters three of the six columns; its estimates are listed in column order, with t-statistics in parentheses.

| Regressor | First estimate | Second estimate | Third estimate |
|---|---|---|---|
| 1998 × AnyExports × Engineers 1991 | 0.012* (1.85) | 0.012 (1.51) | 0.012* (1.84) |
| 1999 × AnyExports × Engineers 1991 | 0.018*** (3.00) | 0.018*** (2.95) | 0.018*** (2.97) |
| 2002 × AnyExports × Engineers 1991 | 0.041*** (2.76) | 0.041** (2.47) | 0.041** (2.50) |
| 2003 × AnyExports × Engineers 1991 | 0.045*** (2.75) | 0.045*** (2.60) | 0.045** (2.65) |
| 2005 × AnyExports × Engineers 1991 | 0.045*** (2.76) | 0.045** (2.41) | 0.045** (2.51) |
| 2013 × AnyExports × Engineers 1991 | 0.052 (1.25) | 0.052 (1.09) | 0.052 (1.29) |
| 1998 × Historical Software Exports × Engineers 1991 | 0.000 (0.92) | 0.000 (0.41) | 0.000** (2.28) |
| 1999 × Historical Software Exports × Engineers 1991 | 0.001** (2.26) | 0.001** (2.53) | 0.001** (2.24) |
| 2002 × Historical Software Exports × Engineers 1991 | 0.004*** (4.70) | 0.004*** (6.56) | 0.004*** (6.09) |
| 2003 × Historical Software Exports × Engineers 1991 | 0.005*** (5.53) | 0.005*** (7.79) | 0.005*** (7.39) |
| 2005 × Historical Software Exports × Engineers 1991 | 0.005*** (3.95) | 0.005*** (2.95) | 0.005*** (3.68) |
| 2013 × Historical Software Exports × Engineers 1991 | -0.002 (-0.93) | -0.002 (-0.62) | -0.002 (-0.77) |

Clustering, columns (1) to (6): State-time, District, State, State-time, District, State. N: 3838 in each column.

Table A8: Gravity estimation by reason for migration, replicating Table 8 from Kone et al. (2018)

| | (1) Marriage | (2) Work or Business | (3) Move with Family | (4) Education |
|---|---|---|---|---|
| log distance centroid | -1.626*** (-48.78) | -1.481*** (-50.19) | -1.454*** (-44.60) | -1.207*** (-47.62) |
| common | 1.077*** (15.65) | 0.499*** (7.01) | 0.941*** (15.40) | 1.338*** (19.14) |
| Same state; neighbors | 2.265*** (40.98) | 1.508*** (24.20) | 1.681*** (24.41) | 2.359*** (43.37) |
| Same state; not neighbors | 0.881*** (19.09) | 1.226*** (22.52) | 1.148*** (20.72) | 1.777*** (38.25) |
| Different state; neighbors | 2.167*** (34.24) | 1.054*** (14.27) | 1.368*** (18.39) | 1.036*** (12.86) |
| N | 341640 | 341640 | 341640 | 341640 |

t-statistics reported in parentheses.

C Theoretical Derivations

C.1 Worker’s problem

The problem of worker i, educated in $o_2$ in field f, who goes to work in d in sector S is given by:

\[
\max \prod_S C_S^{\alpha_S}, \qquad C_S = \Big(\sum_k c_{kdS}^{\frac{\sigma-1}{\sigma}}\Big)^{\frac{\sigma}{\sigma-1}}, \qquad \text{s.t.}\quad \sum_k \sum_S p_{kdS}\,c_{kdS} = W_{f,dS}\,\eta_{i o_2 f,dS}\,\mu_{o_2 d}.
\]

This yields the consumption of variety k of good S for an individual who got his degree in $o_2$ and moved to d to work in occupation S:

\[
c_{ikf,dS} = (p_{kdS})^{-\sigma}\,P_{dS}^{\sigma-1}\,\big(\alpha_S W_{f,dS}\,\eta_{if,dS}\,\mu_{o_2 d}\big).
\]

Assuming iceberg transportation costs, $p_{kdS} = \tau_{kdS}\,p_{kS}$, this becomes:

\[
c_{ikf,dS} = (\tau_{kdS}\,p_{kS})^{-\sigma}\,P_{dS}^{\sigma-1}\,\big(\alpha_S W_{f,dS}\,\eta_{if,dS}\,\mu_{o_2 d}\big).
\]

Using the above quantities, the worker’s indirect utility in stage 2 is derived as:

\[
U_{i o_2 f,dS} = \frac{W_{f,dS}\,u_d\,\eta_{if,dS}\,\mu_{o_2 d}}{\prod_S P_{dS}^{\alpha_S}} \tag{35}
\]

The indirect utility for stage 1 can be derived very similarly, which gives a combined stage 1 and stage 2 utility of the following form:

\[
U_{o_1 o_2 f,dS} = \left(\mu^1_{o_1 o_2}\cdot\frac{a_{o_2 f}\,w_{u,o_2 S}}{P_{o_2}}\cdot\zeta_i\right)^{I_U}\left(\frac{w_{f,dS}}{P_d}\cdot u_{f,dS}\cdot\eta_i\cdot\mu^2_{o_2 d}\right)^{1-I_U} \tag{36}
\]

where $P_{dS} = \sum_{d} \tau_{kdS}\,p_{kS}$.

C.2 Firm’s problem

Here, we solve a perfectly competitive firm’s profit maximization problem, from which the derivation of equation 31 in the paper follows. The firm’s profit maximization problem for sector S is given by:

\[
\max_{L_{sdS}\;\forall d,s}\; P_{dS}\,Q_{dS} - \sum_{s\in \text{college}} w_{sdS}\,\tilde L_{sdS} - \sum_{s\in \text{nocollege}} w_{sdS}\,\tilde L_{sdS}
\]

where

\[
Q_{dS} = \Big(Q_{hdS}^{\frac{\rho_S-1}{\rho_S}} + Q_{ldS}^{\frac{\rho_S-1}{\rho_S}}\Big)^{\frac{\rho_S}{\rho_S-1}}, \qquad
Q_{hdS} = \Big(\sum_{s\in \text{college},k} A_{sdS}\,(\tilde L_{sdS})^{\frac{\rho_{hS}-1}{\rho_{hS}}}\Big)^{\frac{\rho_{hS}}{\rho_{hS}-1}},
\]

and in theory we can have

\[
Q_{ldS} = \Big(\sum_{s\in \text{nocollege}} A_{sdS}\,(\tilde L_{sdS})^{\frac{\rho_{lS}-1}{\rho_{lS}}}\Big)^{\frac{\rho_{lS}}{\rho_{lS}-1}}.
\]

For this paper, I use only one type of unskilled labor.
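And thus, here $Q_{ldS} = A_{ldS}\,\tilde L_{ldS}$. Differentiating with respect to $\tilde L_{sdS}$, where s = e or ne,

\[
P_{dS}\,\frac{\rho_S}{\rho_S-1}\,Q_{dS}^{\frac{1}{\rho_S}}\,\frac{\rho_S-1}{\rho_S}\,Q_{hdS}^{-\frac{1}{\rho_S}}\,\frac{\rho_{h,S}}{\rho_{h,S}-1}\Big(\sum_{s\in \text{college},k} A_{sdS}\,(\tilde L_{sdS})^{\frac{\rho_{h,S}-1}{\rho_{h,S}}}\Big)^{\frac{1}{\rho_{h,S}-1}} A_{sdS}\,\frac{\rho_{h,S}-1}{\rho_{h,S}}\,(\tilde L_{sdS})^{-\frac{1}{\rho_{h,S}}} = w_{sdS}.
\]

Simplifying,

\[
P_{dS}\,Q_{dS}^{\frac{1}{\rho_S}}\,Q_{hdS}^{-\frac{1}{\rho_S}}\Big(\sum_{s\in \text{college},k} A_{sdS}\,(\tilde L_{sdS})^{\frac{\rho_{h,S}-1}{\rho_{h,S}}}\Big)^{\frac{1}{\rho_{h,S}-1}} A_{sdS}\,(\tilde L_{sdS})^{-\frac{1}{\rho_{h,S}}} = w_{sdS}.
\]

Simplifying further,

\[
P_{dS}\,Q_{dS}^{\frac{1}{\rho_S}}\,Q_{hdS}^{\frac{1}{\rho_{h,S}}-\frac{1}{\rho_S}}\,A_{sdS}\,(\tilde L_{sdS})^{-\frac{1}{\rho_{h,S}}} = w_{sdS}.
\]

In the empirical model, we use engineers and non-engineers as two types of skilled labor. Denoting s = e and s = ne for engineers and non-engineers respectively, one can derive the following first-order condition:

\[
\frac{w_{e,d,S}}{w_{ne,d,S}} = \frac{A_{e,d,S}\,L_{e,d,S}^{-\frac{1}{\rho_{h,S}}}}{A_{ne,d,S}\,L_{ne,d,S}^{-\frac{1}{\rho_{h,S}}}} \tag{37}
\]

Under the assumption that all productivities are drawn from the same Frechet distribution, and that firms do not know worker productivities, the first-order condition does not contain effective labor, only labor. We thus get the following estimating equation:

\[
\frac{w_{e,d,S}}{w_{ne,d,S}} = \frac{A_{e,d,S}\,L_{e,d,S}^{-\frac{1}{\rho_{h,S}}}}{A_{ne,d,S}\,L_{ne,d,S}^{-\frac{1}{\rho_{h,S}}}}.
\]

Denote

\[
(p_{dS})^{1-\rho_S} = (p^h_{dS})^{1-\rho_S} + (p^l_{dS})^{1-\rho_S}, \qquad
p^l_{dS} = \frac{w_{ldS}}{A_{ldS}}, \qquad
(p^h_{dS})^{1-\rho_{h,S}} = \sum_s A_{sdS}^{\rho_{h,S}}\,w_{sdS}^{1-\rho_{h,S}}.
\]

Using the firm’s first-order condition for high-skilled labor, we can rewrite the high-skill price index as:

\[
\begin{aligned}
(p^h_{dS})^{1-\rho_{h,S}}
&= \sum_s A_{sdS}^{\rho_{h,S}}\Big(P_{dS}\,Q_{dS}^{\frac{1}{\rho_S}}\,Q_{hdS}^{\frac{1}{\rho_{h,S}}-\frac{1}{\rho_S}}\,A_{sdS}\,(\tilde L_{sdS})^{-\frac{1}{\rho_{h,S}}}\Big)^{1-\rho_{h,S}} \\
&= \Big(P_{dS}\,Q_{dS}^{\frac{1}{\rho_S}}\,Q_{hdS}^{\frac{1}{\rho_{h,S}}-\frac{1}{\rho_S}}\Big)^{1-\rho_{h,S}} \sum_s A_{sdS}^{\rho_{h,S}}\Big(A_{sdS}\,(\tilde L_{sdS})^{-\frac{1}{\rho_{h,S}}}\Big)^{1-\rho_{h,S}} \\
&= \Big(P_{dS}\,Q_{dS}^{\frac{1}{\rho_S}}\,Q_{hdS}^{\frac{1}{\rho_{h,S}}-\frac{1}{\rho_S}}\Big)^{1-\rho_{h,S}} \sum_s A_{sdS}\,(\tilde L_{sdS})^{\frac{\rho_{h,S}-1}{\rho_{h,S}}} \\
&= \Big(P_{dS}\,Q_{dS}^{\frac{1}{\rho_S}}\,Q_{hdS}^{\frac{1}{\rho_{h,S}}-\frac{1}{\rho_S}}\,Q_{hdS}^{-\frac{1}{\rho_{h,S}}}\Big)^{1-\rho_{h,S}} \\
&= \Big(P_{dS}\,Q_{dS}^{\frac{1}{\rho_S}}\,Q_{hdS}^{-\frac{1}{\rho_S}}\Big)^{1-\rho_{h,S}}.
\end{aligned}
\]

Thus we get the following equation for high-skilled labor:

\[
p^h_{dS} = P_{dS}\,Q_{dS}^{\frac{1}{\rho_S}}\,Q_{hdS}^{-\frac{1}{\rho_S}} \tag{38}
\]

I now solve the first-order condition for low-skilled workers.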
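\[
\max_{L_{ldS}}\; P_{dS}\Big(Q_{hdS}^{\frac{\rho_S-1}{\rho_S}} + (A_{ldS}\,\tilde L_{ldS})^{\frac{\rho_S-1}{\rho_S}}\Big)^{\frac{\rho_S}{\rho_S-1}} - \sum_s w_{sdS}\,\tilde L_{sdS} - w_{ldS}\,\tilde L_{ldS}
\]

For low-skilled labor, taking the first-order condition, we get:

\[
\begin{aligned}
w_{ldS} &= P_{dS}\,\frac{\rho_S}{\rho_S-1}\Big(Q_{hdS}^{\frac{\rho_S-1}{\rho_S}} + (A_{ldS}\,\tilde L_{ldS})^{\frac{\rho_S-1}{\rho_S}}\Big)^{\frac{\rho_S}{\rho_S-1}-1}\,\frac{\rho_S-1}{\rho_S}\,(A_{ldS}\,\tilde L_{ldS})^{-\frac{1}{\rho_S}}\,A_{ldS} \\
&= P_{dS}\,Q_{dS}^{\frac{1}{\rho_S}}\,(A_{ldS}\,\tilde L_{ldS})^{-\frac{1}{\rho_S}}\,A_{ldS} \\
&= P_{dS}\,Q_{dS}^{\frac{1}{\rho_S}}\,Q_{ldS}^{-\frac{1}{\rho_S}}\,A_{ldS}.
\end{aligned}
\]

Since $p^l_{dS} = w_{ldS}/A_{ldS}$, it follows that $p^l_{dS} = P_{dS}\,Q_{dS}^{\frac{1}{\rho_S}}\,Q_{ldS}^{-\frac{1}{\rho_S}}$. Combining this with (38), we get the following equation:

\[
\frac{p^h_{dS}}{p^l_{dS}} = \left(\frac{Q_{hdS}}{Q_{ldS}}\right)^{-\frac{1}{\rho_S}} \tag{39}
\]

Thus,

\[
\ln\!\left(\frac{p^h_{dS}}{p^l_{dS}}\right) = -\frac{1}{\rho_S}\,\ln\!\left(\frac{Q_{hdS}}{Q_{ldS}}\right) \tag{40}
\]

Note, however, that these are not observable quantities due to the presence of unobserved productivity. Substituting the definitions of the price and quantity indices,

\[
\ln\!\left(\frac{\big(\sum_s A_{sdS}^{\rho_{h,S}}\,w_{sdS}^{1-\rho_{h,S}}\big)^{\frac{1}{1-\rho_{h,S}}}}{A_{ldS}^{-1}\,w_{ldS}}\right)
= -\frac{1}{\rho_S}\,\ln\!\left(\frac{\big(\sum_{s\in \text{college},k} A_{sdS}\,(\tilde L_{sdS})^{\frac{\rho_{h,S}-1}{\rho_{h,S}}}\big)^{\frac{\rho_{h,S}}{\rho_{h,S}-1}}}{A_{ldS}\,\tilde L_{ldS}}\right) \tag{41}
\]

For ease of notation, I now use s = e and s = ne for engineers and non-engineers respectively. Factoring $A_{ne,d,S}$ out of both sums,

\[
\ln\!\left(\frac{A_{ne,d,S}^{\frac{\rho_{h,S}}{1-\rho_{h,S}}}\Big(\big(\tfrac{A_{e,d,S}}{A_{ne,d,S}}\big)^{\rho_{h,S}}\,w_{e,d,S}^{1-\rho_{h,S}} + w_{ne,d,S}^{1-\rho_{h,S}}\Big)^{\frac{1}{1-\rho_{h,S}}}}{A_{ldS}^{-1}\,w_{ldS}}\right)
= -\frac{1}{\rho_S}\,\ln\!\left(\frac{A_{ne,d,S}^{\frac{\rho_{h,S}}{\rho_{h,S}-1}}\Big(\tfrac{A_{e,d,S}}{A_{ne,d,S}}\,\tilde L_{e,d,S}^{\frac{\rho_{h,S}-1}{\rho_{h,S}}} + \tilde L_{ne,d,S}^{\frac{\rho_{h,S}-1}{\rho_{h,S}}}\Big)^{\frac{\rho_{h,S}}{\rho_{h,S}-1}}}{A_{ldS}\,\tilde L_{ldS}}\right) \tag{42}
\]

This implies,

\[
\begin{aligned}
&\ln\!\left(\frac{\Big(\big(\tfrac{A_{e,d,S}}{A_{ne,d,S}}\big)^{\rho_{h,S}}\,w_{e,d,S}^{1-\rho_{h,S}} + w_{ne,d,S}^{1-\rho_{h,S}}\Big)^{\frac{1}{1-\rho_{h,S}}}}{w_{ldS}}\right) - \frac{\rho_{h,S}}{\rho_{h,S}-1}\,\ln(A_{ne,d,S}) + \ln(A_{ldS}) \\
&\quad = -\frac{1}{\rho_S}\,\ln\!\left(\frac{\Big(\tfrac{A_{e,d,S}}{A_{ne,d,S}}\,\tilde L_{e,d,S}^{\frac{\rho_{h,S}-1}{\rho_{h,S}}} + \tilde L_{ne,d,S}^{\frac{\rho_{h,S}-1}{\rho_{h,S}}}\Big)^{\frac{\rho_{h,S}}{\rho_{h,S}-1}}}{\tilde L_{ldS}}\right) + \frac{1}{\rho_S}\,\ln(A_{ldS}) - \frac{1}{\rho_S}\,\frac{\rho_{h,S}}{\rho_{h,S}-1}\,\ln(A_{ne,d,S}).
\end{aligned}
\]

Thus, collecting the productivity terms on the right-hand side,

\[
\begin{aligned}
&\ln\!\left(\frac{\Big(\big(\tfrac{A_{e,d,S}}{A_{ne,d,S}}\big)^{\rho_{h,S}}\,w_{e,d,S}^{1-\rho_{h,S}} + w_{ne,d,S}^{1-\rho_{h,S}}\Big)^{\frac{1}{1-\rho_{h,S}}}}{w_{ldS}}\right) \\
&\quad = -\frac{1}{\rho_S}\,\ln\!\left(\frac{\Big(\tfrac{A_{e,d,S}}{A_{ne,d,S}}\,\tilde L_{e,d,S}^{\frac{\rho_{h,S}-1}{\rho_{h,S}}} + \tilde L_{ne,d,S}^{\frac{\rho_{h,S}-1}{\rho_{h,S}}}\Big)^{\frac{\rho_{h,S}}{\rho_{h,S}-1}}}{\tilde L_{ldS}}\right)
+ \frac{\rho_{h,S}}{\rho_{h,S}-1}\Big(1-\frac{1}{\rho_S}\Big)\ln(A_{ne,d,S}) - \Big(1-\frac{1}{\rho_S}\Big)\ln(A_{ldS}) \\
&\quad = -\frac{1}{\rho_S}\,\ln\!\left(\frac{\Big(\tfrac{A_{e,d,S}}{A_{ne,d,S}}\,\tilde L_{e,d,S}^{\frac{\rho_{h,S}-1}{\rho_{h,S}}} + \tilde L_{ne,d,S}^{\frac{\rho_{h,S}-1}{\rho_{h,S}}}\Big)^{\frac{\rho_{h,S}}{\rho_{h,S}-1}}}{\tilde L_{ldS}}\right)
+ \Big(1-\frac{1}{\rho_S}\Big)\ln\!\left(\frac{A_{ne,d,S}^{\frac{\rho_{h,S}}{\rho_{h,S}-1}}}{A_{ldS}}\right).
\end{aligned}
\]

Note that all the quantities in this equation are observable once the productivity ratio $A_{e,d,S}/A_{ne,d,S}$ is replaced using the first-order condition (37); the equation is then a regression of known quantities with the unobserved productivities as residuals.

Recover IT prices, then non-IT. Use the following equation for S = IT:

\[
(p_{S,k})^{1-\rho_S} = (p^h_{S,k})^{1-\rho_S} + (p^l_{S,k})^{1-\rho_S},
\]

where $(p^h_{S,k})^{1-\rho_{h,S}} = \sum_s A_{k,s,S}^{\rho_{h,S}}\,w_{k,s,S}^{1-\rho_{h,S}}$ and $p^l_{S,k} = A_{k,l,S}^{-1}\,w_{l,S,k}$. Thus we can write the price as:

\[
\begin{aligned}
(p_{S,k})^{1-\rho_S}
&= \Big(A_{k,e,S}^{\rho_{h,S}}\,w_{k,e,S}^{1-\rho_{h,S}} + A_{k,ne,S}^{\rho_{h,S}}\,w_{k,ne,S}^{1-\rho_{h,S}}\Big)^{\frac{1-\rho_S}{1-\rho_{h,S}}} + \big(A_{k,l,S}^{-1}\,w_{l,S,k}\big)^{1-\rho_S} \\
&= A_{k,ne,S}^{\frac{\rho_{h,S}(1-\rho_S)}{1-\rho_{h,S}}}\left(\Big(\big(\tfrac{A_{k,e,S}}{A_{k,ne,S}}\big)^{\rho_{h,S}}\,w_{k,e,S}^{1-\rho_{h,S}} + w_{k,ne,S}^{1-\rho_{h,S}}\Big)^{\frac{1-\rho_S}{1-\rho_{h,S}}} + \Big(\frac{A_{k,ne,S}^{\frac{\rho_{h,S}}{\rho_{h,S}-1}}}{A_{k,l,S}}\,w_{l,S,k}\Big)^{1-\rho_S}\right) \\
&= A_{k,ne,S}^{\frac{-\rho_{h,S}(\rho_S-1)}{1-\rho_{h,S}}}\left((\tilde p^h_{S,k})^{1-\rho_S} + \Big(\frac{A_{k,ne,S}^{\frac{\rho_{h,S}}{\rho_{h,S}-1}}}{A_{k,l,S}}\,w_{l,S,k}\Big)^{1-\rho_S}\right) \\
&= A_{k,ne,S}^{\frac{-\rho_{h,S}(\rho_S-1)}{1-\rho_{h,S}}}\left((\tilde p^h_{S,k})^{1-\rho_S} + \Big(\tilde x_{l,S,k}^{\frac{\rho_S}{\rho_S-1}}\,w_{l,S,k}\Big)^{1-\rho_S}\right) \\
&= A_{k,ne,S}^{\frac{-\rho_{h,S}(\rho_S-1)}{1-\rho_{h,S}}}\left((\tilde p^h_{S,k})^{1-\rho_S} + \tilde x_{l,S,k}^{-\rho_S}\,(w_{l,S,k})^{1-\rho_S}\right).
\end{aligned} \tag{43}
\]

The term in the bracket is a function of known quantities. To see this, note first that

\[
(\tilde p^h_{S,k})^{1-\rho_S} = \Big(\big(\tfrac{A_{e,S,k}}{A_{ne,S,k}}\big)^{\rho_{h,S}}\,w_{e,S,k}^{1-\rho_{h,S}} + w_{ne,S,k}^{1-\rho_{h,S}}\Big)^{\frac{1-\rho_S}{1-\rho_{h,S}}}
\]

is known from (37), and that

\[
\tilde x_{S,l,k} = \frac{A_{k,ne,S}^{\left(\frac{\rho_{h,S}}{\rho_{h,S}-1}\right)\left(\frac{1}{\rho_S}-1\right)}}{A_{l,S,k}^{\frac{1}{\rho_S}-1}}.
\]

From (39), we get:

\[
p^h_{S,k}\,Q_{k,S,h}^{\frac{1}{\rho_S}} = p^l_{S,k}\,Q_{k,S,l}^{\frac{1}{\rho_S}}.
\]

Or, substituting prices and quantities in terms of their observable components,

\[
\tilde p^h_{S,k}\,(\tilde Q_{k,S,h})^{\frac{1}{\rho_S}}\,A_{k,ne,S}^{\left(\frac{\rho_{h,S}}{\rho_{h,S}-1}\right)\left(\frac{1}{\rho_S}-1\right)} = \frac{W_{l,S,k}}{A_{l,S,k}}\,(A_{l,S,k}\,L_{l,S,k})^{\frac{1}{\rho_S}}.
\]

Thus,

\[
\frac{A_{k,ne,S}^{\left(\frac{\rho_{h,S}}{\rho_{h,S}-1}\right)\left(\frac{1}{\rho_S}-1\right)}}{A_{l,S,k}^{\frac{1}{\rho_S}-1}} = \frac{W_{l,S,k}\,(L_{l,S,k})^{\frac{1}{\rho_S}}}{\tilde p^h_{S,k}\,(\tilde Q_{k,S,h})^{\frac{1}{\rho_S}}},
\]

where the left-hand side is the term denoted $\tilde x_{S,l,k}$.

Going back to equation (43),

\[
\begin{aligned}
(p_{d,IT})^{1-\rho_{IT}}
&= \Big(A_{e,d,IT}^{\rho_{h,IT}}\,w_{e,d,IT}^{1-\rho_{h,IT}} + A_{ne,d,IT}^{\rho_{h,IT}}\,w_{ne,d,IT}^{1-\rho_{h,IT}}\Big)^{\frac{1-\rho_{IT}}{1-\rho_{h,IT}}} + \big(A_{l,d,IT}^{-1}\,w_{l,d,IT}\big)^{1-\rho_{IT}} \\
&= A_{ne,d,IT}^{\frac{-\rho_{h,IT}(\rho_{IT}-1)}{1-\rho_{h,IT}}}\left((\tilde p^h_{d,IT})^{1-\rho_{IT}} + \tilde x_{l,d,IT}^{-\rho_{IT}}\,(w_{l,d,IT})^{1-\rho_{IT}}\right)
\end{aligned} \tag{44}
\]

where

\[
\tilde x_{l,d,IT} = \frac{A_{ne,d,IT}^{\left(\frac{\rho_{h,IT}}{\rho_{h,IT}-1}\right)\left(\frac{1}{\rho_{IT}}-1\right)}}{A_{l,d,IT}^{\frac{1}{\rho_{IT}}-1}} = \frac{W_{l,d,IT}\,(L_{l,d,IT})^{\frac{1}{\rho_{IT}}}}{\tilde p^h_{d,IT}\,(\tilde Q_{h,d,IT})^{\frac{1}{\rho_{IT}}}}
\]

and

\[
(\tilde p^h_{d,IT})^{1-\rho_{IT}} = \Big(\big(\tfrac{A_{e,d,IT}}{A_{ne,d,IT}}\big)^{\rho_{h,IT}}\,w_{e,d,IT}^{1-\rho_{h,IT}} + w_{ne,d,IT}^{1-\rho_{h,IT}}\Big)^{\frac{1-\rho_{IT}}{1-\rho_{h,IT}}}.
\]

Due to the firm’s first-order conditions, the ratio of productivities is a function of known wages and employment, and therefore both $(\tilde p^h_{d,IT})^{1-\rho_{IT}}$ and $\tilde x_{l,d,IT}$ are also functions of observables. Substituting this, the estimating equation becomes:

\[
\ln\!\left(\frac{E_{d,t}}{E_t}\right) = C + \kappa_{IT}\,\ln(dist_{d,RoW,IT}) + (1-\sigma_{IT})\,\ln\!\left(\Big((\tilde p^h_{d,t})^{1-\rho_{IT}} + \tilde x_{l,d,t}^{-\rho_{IT}}\,(w_{l,d,t})^{1-\rho_{IT}}\Big)^{\frac{1}{1-\rho_{IT}}}\right) - \frac{\sigma_{IT}-1}{\rho_{IT}-1}\cdot\frac{\rho_{h,IT}\,(\rho_{IT}-1)}{1-\rho_{h,IT}}\,\ln(A_{ne,d,t}).
\]

Taking first differences, we get equation 31 in the paper.

C.3 Existence of equilibrium proof

To show the existence of equilibrium I use the following theorem, proved in Allen et al (2019).

Theorem 1: Consider any $N \times K$ system of equations $F: \mathbb{R}^{N\times K}_{++} \rightarrow \mathbb{R}^{N\times K}_{++}$:

\[
F(x)_{ik} \equiv \sum_j K_{ij,k} \prod_{l=1}^{K} (x_{j,l})^{\alpha_{k,l}} \prod_{l=1}^{K} (x_{i,l})^{\lambda_{k,l}} \prod_{m=1}^{M} Q_m(x_j)^{\gamma_{k,m}} \prod_{m=1}^{M} Q_m(x_i)^{\kappa_{k,m}}
\]

where $Q_m(\cdot)$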
are nested CES aggregating functions: 1 1 βm βm 1 1 δm,l Qm (xj ) ≡ (xj,n )δm,l l∈Sm | Sm | n∈Tl | Tn | 55 where δm,l > 0 and βm > 0 for all m and l, Kijk , Ul , Tj,n are all strictly positive parameter values; Sm and Tl,m are (weak) subsets of 1, ...., K ; and {αk,l , λk,l , γk,m , κk,p } are all real-valued. M K M M If maxk∈{1,...,K } ( m=1 | γk,m | + l=1 | αk,l | + m=1 | λk,m | + m=1 | κk,m |) < 1, then there exists an unique fixed point F (x∗) = x∗ I can show that the equilibrium system of equations in my model falls into the framework considered by theorem 1. The equilibrium conditions that govern enrollment are: Lo2 = lo1 o2 f Lo1 o1 ,f γ (1−IU ) ao2 f µ1 o1 o2 IU IU ( Po2 ) Φo2 fθ Lo2 = γ Lo1 o1 ,f Φ IU o1 γ (1−IU ) IU ao2 f IU −γ IU (µ1 IU o1 o2 ) Φo2 f θ Lo2 ( ) = γ Lo1 Po 2 o1 ,f Φo IU 1 γ (ao2 f µ1 o1 o ) IU (1−IU ) γ IU 2 Φ IU o1 = Φo θ Po 2 2f o2 ,f Let the following hold for some value of κ −γ ao f IU γ Lo2 ( 2 )IU = κf (Φo2 ) IU Po 2 Thus, we get, γ (1−IU ) IU γ ( µ1 IU o1 o2 ) Φo2 f θ κf (Φo2 ) IU = γ Lo1 o1 ,f Φo IU 1 γ (1−IU ) IU ( µ1 IU o1 o2 ) Φo2 f θ = γ Lo1 o1 ,f Φo IU 1 γ (1−IU ) IU ao1 ,f γ = (µ1 IU o1 o2 ) Φo2 f θ κf L− 1 o1 ( )Lo1 o1 ,f P o1 56 Simplifying, γ ( ao 1 f µ 1 o1 o2 ) IU (1−IU ) γ IU Φ IU o2 = Φ o2 f θ o1 ,f P o1 The two equations then just boil down to one. 
This allows us to consider a single non linear equation: −γ ao2 f IU IU (ao1 f µ1 o1 o2 ) IU (1−IU ) γ IU Lo2 = κf ( ) Φo2 fθ (45) Po 2 o1 ,f P o1 Substitute Wf d S uf d S (Φo2 f ) = µθ o2 d ( )θ d f Pd −γ ao2 f IU IU ( ao 1 f µ 1 o1 o2 ) IU Wf d S uf d S (1−IU ) γ IU Lo2 = κf ( ) (µθ o2 d ( )θ ) θ P o2 o1 ,f,d ,f Po 1 Pd The equilibrium condition in the internally traded sector is given by: σS 1−σS 1−σS σS −1 YdS QdS = τdj Pj Yj j We can rewrite the internal gravity equation 14 as: 1−σ 1−σ σ −1 YdS = τdj pdS PJ (αj Yj ) j −σ Multiplying both sides by Q1 dS , we get, −σ 1−σ 1−σ σ −1 1−σ YdS Q1 dS = τdj pdS Pj QdS (αS Yj ) j 1−σ σ −1 1−σ = τdj Pj YdS (αS Yj ) j Simplifying, the above: σ 1−σ 1−σ σ −1 YdS S QdS = τdj Pj (αS Yj ) j 57 We can rewrite 15, 1−σ 1−σ 1−σS σS −1 Pj1−σ = τjk pkS QkS QkS k 1−σ 1−σ σ −1 = τjk YkS QkS k Suppose that the following relationship holds true for some scalar κ σ 1−σ 1−σ YdS QdS = κPd In that case, as I show below, I can express equations 14 and 15 as a single equation. 
Equation 14 is given below: −σ 1−σ σ −1 σ YdS Q1 dS = τdj Pj (αS yj ) j −σ 1−σ σ Substituting YdS Q1 dS = κPd in the above we get back equation 15 1−σ 1−σ 1−σ −1 Pd = τdj (YjS )Qσ jS j This allows us to consider a single non-linear equation: −σ 1−σ 1−σ −1 σ YdS Q1 dS = κ τdj (YjS )Qσ jS j ∈N Now substitute the price index −1 σ 1−σ Pd = (κ) 1−σ YdS QdS in 45 −γ ao 2 f IU IU (ao1 f µ1 o1 o2 ) IU Lo2 = κf −1 σ −1 σ 1−σ 1−σ (κ) 1−σ Yo2 S Qo2 S o1 ,f (κ) Yo1 S Qo1 S 1−σ γ Wf d S uf d S θ (1−IU ) IU ( µθ o2 d ) θ d f Pd Simplifying, 58 −γ ao 2 f IU IU ( ao 1 f µ 1 o1 o2 ) IU Lo2 = κf ( −1 σ ) −1 σ 1−σ 1−σ (κ) 1−σ Yo2 S Qo2 S o1 ,f (κ) 1−σ Yo1 S Qo1 S γ Wf d S uf d S θ (1−IU ) IU ( µθ o2 d ( )) θ d f Pd −γ ao 2 f IU ( ao 1 f µ 1 o1 o2 ) IU = κf ( −1 σ )IU −1 σ 1−σ 1−σ o1 ,f,d ,f (κ) 1−σ Yo 2 S Q o 2 S (κ) 1−σ Yo1 S Qo1 S γ Wf d S uf d S (1−IU ) IU (µθ o2 d ( −1 σ )) θ θ (κ) 1−σ Yd1−σ S Qd S Finally, substitute the expression for wages, 1−ρf 1 ρh,S − ρ1 −1 ρf Wf d S = Yd S Qd Qhd S Af d S ˜fd (L S ) ρh S S −γ ao 2 f IU IU (ao1 f µ1 o1 o2 ) IU Lo2 = κf ( −1 σ ) −1 σ 1−σ 1−σ o1 ,f,d ,f (κ) 1−σ Yo2 S Qo2 S (κ) 1−σ Yo1 S Qo1 S 1−ρf 1 ρ − ρ1 −1 ρf Yd S Qd h,S Qhd S Af d S ˜fd (L S ) ρh uf d S (1−IU ) γ IU ( µθ o2 d ( S S −1 σ )θ ) θ (κ) 1−σ Yd1−σ S Qd S We are thus able to express the equilibrium conditions in the form required for theorem 1. An equilibrium thus exists by the contraction mapping theorem. C.4 Unknown amenities Given the distribution of population in each region, estimated migration costs and real wages, unknown region, field of education and sector specific amenities are backed out. 
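As a rough illustration of how amenities can be backed out from observed population shares, the sketch below assumes a deliberately simplified, single-origin version of the location choice, in which the share of workers choosing destination d is proportional to $(\mu_d \cdot W_d/P_d \cdot u_d)^{\theta}$. The function names, the single-origin simplification, and all numbers are hypothetical and chosen for illustration only; they are not the paper's estimation code. Amenities are identified only up to scale, so the first location is used as the normalization.

```python
import numpy as np


def choice_shares(u, real_wage, mig_cost, theta):
    """Forward model (assumed): Frechet location-choice shares implied by amenities u."""
    attract = (np.asarray(mig_cost) * np.asarray(real_wage) * np.asarray(u)) ** theta
    return attract / attract.sum()


def invert_amenities(shares, real_wage, mig_cost, theta):
    """Recover amenities, up to scale, from observed choice shares.

    Inverts share_d proportional to (mig_cost_d * real_wage_d * u_d) ** theta.
    """
    shares = np.asarray(shares, dtype=float)
    u = shares ** (1.0 / theta) / (np.asarray(mig_cost) * np.asarray(real_wage))
    return u / u[0]  # normalise by the first location


# Hypothetical inputs, for illustration only
theta = 3.0
real_wage = np.array([1.0, 1.4, 0.8])   # W_d / P_d
mig_cost = np.array([1.0, 0.6, 0.7])    # mu_d in (0, 1]; 1 means no migration cost
u_true = np.array([1.0, 0.5, 2.0])

shares = choice_shares(u_true, real_wage, mig_cost, theta)
u_hat = invert_amenities(shares, real_wage, mig_cost, theta)
```

In this toy setting the inversion returns the amenities used to generate the shares, relative to the first location. The full model sums over origins and sectors, so the same idea would have to be applied to the corresponding multi-origin share equation.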
The equilibrium population in location d of workers with degree in s working in sector S is given by: 59 Wf,dS uf,dS θ µ2 θ o2 d ( Pd )f Lf,dS = W u Lf o 2 j 2 θ f,d S f,d S o2 o 2 S µo 2 d ( Pd )θ Wf,dS θ µ2 θ o2 d s ( Pd )s = uθ f,dS W Lf o 2 2 θ f,d S uf,d S (46) k o2 d S µo 2 d ( Pd )θ θ ( Wf,dS )θ µ2 o 1 2d Pd (Lf,dS /( k o2 2 Wf,d S uf,d S Lf o2 )) θ θ µo d ( )θ d S P 2 d uf,dS = Wf,dS Pd To back out the unknown amenities for education, I use the population of people with degrees in field s in each location: γ (1−IU ) ao f wu,o µ1 IU ( 2 Po 2 o1 o2 )IU Φo2 fθ 2 Lo2 f = lo1 o2 f Lo1 = L0 (47) o1 o Φo1 γ (1−IU ) IU where Φo1 = o2 ,f (ao2 f µ1 IU o1 o2 ) Φo2 f θ . In the same way as above, we can back out the amenities for education. C.5 Proofs of Frechet propositions θ Proposition 1: If η ∼ F rechet(θ), then η α ∼ F rechet( α ) Proof: Since η ∼ F rechet(θ), thus Fη (x) = P (η ≤ x) = (exp(−x−θ )) Let z = η α Fz (x) = P (z ≤ x) = P (η α ≤ x) 1 = P (η ≤ (x) α ) 1 = exp(−(x α )−θ ) −θ = exp(−(x α )) −θ Thus z follows Frechet with dispersion parameter α 60 1 1 Proposition 2: If ηi ∼ F rechet(θ), then E (maxi (ai × ηi )) = ( i aθ i ) Γ(1 − θ ) θ Proof: Let zi = maxi (ai ηi ) FZi (z ) = P r(Zi ≤ z ) = P r(ai ηi ≤ z ∀i) z = P r(ηi ≤ , ∀i) ai z = Πi F ( ) ai z −θ = Πi exp(− ) ai 1 = exp(−z −θ ( ( )−θ )) i ai −θ = exp(−( aθ i ))z i z = maxi (ai ηi ) thus follows a Frechet distribution with dispersion parameter θ and position parameter ( i aθ i) According to the properties of the Frechet distribution, the mean of z will thus be 1 1 E (z ) = E (maxi (ai ηi )) = ( i aθ i ) Γ(1 − θ ) θ To understand how the propositions apply to the maximization problem at hand, 1−IU consider (1 − IU ) = α, ai = (wf,dS · Pd · uf,dS · µ2 o2 d ) Proposition 3: The productivity distribution of workers who do not move for work has the same dispersion parameter θ. 
Proof: For workers staying in their hometown, since $\mu^2_{o_2 d} = 1$, the first period utility is given by:

\[
V_{o_2 f,o_2 S} = \frac{w_{f,dS}}{P_d}\cdot u_{f,dS}\cdot \eta_i \;\Big|\; o_1, o_2, f
\]

Hence the distribution of productivity draws for workers choosing to stay in $o_2$ is given by:

\[
\begin{aligned}
F_{\eta_{i,o_2 o_2}}(z) &= F(\eta_{i,o_2 o_2} \le z) \\
&= F\!\left(\frac{V_{o_2 f,o_2 S}}{\frac{w_{f,o_2 S}}{P_{o_2}}\cdot u_{f,o_2 S}} \le z\right) \\
&= F_{V_{o_2 f,o_2 S}}\!\left(\frac{w_{f,o_2 S}}{P_{o_2}}\cdot u_{f,o_2 S}\cdot z\right).
\end{aligned}
\]

$F_{V_{o_2 f,o_2 S}}$ is also a Frechet distribution with dispersion parameter θ. For different regions, the productivity distributions of stayers have different means, but their dispersions are the same.

D Model Assumptions, Extensions, and validity

Implicit in the model are the following assumptions:

1. This is a demand side model of education, such that demand for education is only constrained by the lack of availability of good quality education, as measured by amenities in the model, but not by the availability of education per se. This assumption is in line with the institutional background of engineering education in India. It has been consistently documented in news, research, and policy articles that the supply of engineering education far exceeds demand, if quality is ignored. This is due to the existence of a huge number of private colleges that sprang up after India began privatizing education in the late 1980s and 1990s. [23]

2. I have assumed that while students can migrate for college education inside the country, they cannot migrate internationally. During that time period, less than 1% of students who pursued higher education went abroad (Pande and Yan (2016)). Therefore, since the main purpose of this research is to understand the effect of the IT boom on regional inequality inside the country, I think this margin is small enough to be ignored. However, in a simple extension, international migration is introduced in the model by adding one region to which people can migrate but from which people cannot migrate to India. This region is closer to some regions of India and further from others. Education facilities and job opportunities are both better in this region than in any region of India. The introduction of this region increases both aggregate welfare and regional inequality.

3. The baseline model assumes that there is no endogenous agglomeration and congestion, as in the spatial literature (see for example Allen and Arkolakis (2014)). However, congestion exists in the model mechanism insofar as the prices of non-tradeable goods respond differentially in different regions, rising more in congested regions. The model is easily extendable to allow for endogenous amenities and productivities: $A = \bar{A}L^{\alpha}$ and $U = \bar{U}L^{-\beta}$. I take α = 0.3 and β = −0.2, following Allen and Arkolakis (2014). For this parametric configuration, overall inequality as a result of the IT boom increases.

4. I have assumed that migration costs differ only by reason for migration and that there are no differences in migration costs across skill types. The gravity literature on migration has mostly estimated an aggregate migration cost that is the same across skill types or reasons (see for example Bryan and Morten (2019)). The very disaggregated nature of the data by reason for migration helps me estimate disaggregated migration costs by reason, but I do not have data that are further disaggregated by migration flows across skill types. However, the model is easily extendable to different migration costs across skill types. Suppose that unskilled workers have a higher migration cost than skilled workers. In this extension, unskilled workers are worse off than in the baseline model, where both types of workers have the same migration costs. This is because skilled and unskilled workers are complements in the production function: as skilled workers start migrating out of districts that did not see much of the IT boom, the marginal productivity of the unskilled workers who remain falls.

[23] See https://bit.ly/3iMgyLP, https://bit.ly/3cNd3RE, Kapur et al. (2008), Forbes (2013). My own numbers, computed on the basis of intake and enrollment numbers from the All India Council of Technical Education (AICTE), show that intake (capacity declared by the colleges and approved by AICTE) almost always exceeded enrollment, except in top tier colleges.

Given model parameters and the exogenous amenities and productivities, the model makes predictions about equilibrium employment and enrollment across districts, all of which are observable quantities in the data. In this section, I validate the model by first showing that the model generated data can replicate the reduced form facts established in section 5. Using the model generated data, I repeat the reduced form regression, rewritten below, and plot the coefficients $\beta_t$ in Figure I.

\[
Y_{dt} = \alpha_t + \gamma_d + \chi_d \cdot t + \beta_t\,Exports_{d,1995} + \epsilon_{dt} \tag{48}
\]

where $Y_{dt}$ is IT employment or engineering enrollment in district d at time t. $Exports_{d,1995}$ is the proportion of software exports from district d in the year 1995 out of total Indian IT exports in 1995. $\alpha_t$ are time fixed effects that capture any factors that are common to all districts at time t. $\gamma_d$ are district fixed effects that capture any factors that are fixed over time in district d. $\chi_d \cdot t$ is a district-level time trend capturing any linear trend in the outcome variable at the district level.

Figure I: Response of IT Employment Across Regions with Different Levels of Software Exports

NOTE: This graph plots the confidence intervals for the year by year response of standardized IT employment over the pre and the post boom periods in districts that had any IT exports in 1995 compared to districts that did not, using model generated data.
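The event-study specification in equation (48) can be sketched in code. The example below is a minimal illustration on simulated data, not the paper's estimation code: the function name, the simulated panel, and the coefficient values are all hypothetical, and the district-specific trend term is left out to keep the sketch short. It regresses the outcome on district dummies, year dummies, and year-by-exports interactions, with the base year's interaction normalized to zero.

```python
import numpy as np


def event_study_betas(y, district, year, exports, base_year):
    """OLS of y on district dummies, year dummies and year x exports terms.

    Returns {year: coefficient} for the year x exports interactions of every
    year except base_year, whose coefficient is normalised to zero.
    """
    districts = np.unique(district)
    years = [int(t) for t in np.unique(year) if t != base_year]
    cols = [(district == d).astype(float) for d in districts]      # district fixed effects
    cols += [(year == t).astype(float) for t in years]             # year fixed effects
    cols += [(year == t).astype(float) * exports for t in years]   # year x exports interactions
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return dict(zip(years, coef[-len(years):]))


# Simulated panel with hypothetical parameter values
rng = np.random.default_rng(0)
n_districts = 50
panel_years = np.array([1995, 1998, 2002, 2005])
district = np.repeat(np.arange(n_districts), len(panel_years))
year = np.tile(panel_years, n_districts)
exports = rng.uniform(0.0, 1.0, n_districts)[district]   # 1995 export share by district
true_beta = {1995: 0.0, 1998: 0.1, 2002: 0.4, 2005: 0.5}

y = (rng.normal(size=n_districts)[district]               # district effects
     + 0.2 * (year - 1995)                                # common time effects
     + np.array([true_beta[int(t)] for t in year]) * exports
     + rng.normal(scale=0.01, size=year.size))            # idiosyncratic noise

betas = event_study_betas(y, district, year, exports, base_year=1995)
```

Plotting the entries of `betas` against the year, together with confidence intervals from clustered standard errors (not computed in this sketch), would give a figure of the kind described in the text.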
From this figure, just like in the reduced form estimation, we can see that post-1998, IT employment increased more in districts that had a higher level of software exports in 1995. In Figure II, I plot the response of engineering enrollment and here also the reduced form results are replicated: post 2000, engineering enrollment increased in districts with higher level of software exports. 64 Figure II: Response of Engineering Enrollment Across Regions with Different Levels of Software Exports NOTE: This graph plots the confidence intervals for the year by year response of standardized engineering employment over the pre and the post boom periods in districts that had any IT exports in 1995 compared to districts that did not using model generated data 65 References Allen, T. and C. Arkolakis (2014). Trade and the topography of the spatial economy. Quarterly Journal of Economics . Allen, T., C. d. C. Dobbin, and M. Morten (2018, nov). Border Walls. Technical report, National Bureau of Economic Research, Cambridge, MA. Atkin, D. (2016, aug). Endogenous skill acquisition and export manufacturing in Mexico. American Economic Review 106 (8), 2046–2085. Banerjee, A. V. and E. Duflo (2000). Reputation effects and the limits of contracting: A study of the Indian software industry. Quarterly Journal of Economics 115 (3), 989–1017. Blanchard, E. J. and W. W. Olney (2017, may). Globalization and human capital in- vestment: Export composition drives educational attainment. Journal of International Economics 106, 165–183. Borsook, I. (1987). Earnings, ability and international trade. Journal of International Economics 22 (3-4), 281–295. Bryan, G. and M. Morten (2019, jul). The Aggregate Productivity Effects of Internal Migration: Evidence from Indonesia. Journal of Political Economy , 000–000. Caliendo, L., M. Dvorkin, and F. Parro (2019). Trade and Labor Market Dynamics: General Equilibrium Analysis of the China Trade Shock. Econometrica 87 (3), 741– 835. Card, D. and T. 
Lemieux (2001). Can falling supply explain the rising return to college for younger men? A cohort-based analysis. Quarterly Journal of Economics 116 (2), 705–746. Carmel, E. and P. Tjia (2005). Offshoring Information Sourcing and Outsourcing to a Global Workforce. Cambridge University Press. Danziger, E. (2017, jul). Skill acquisition and the dynamics of trade-induced inequality. Journal of International Economics 107, 60–74. Donaldson, D. (2018, apr). Railroads of the Raj: Estimating the impact of transportation infrastructure. American Economic Review 108 (4-5), 899–934. Economist (2003). America’s pain, India’s gain. 66 Edmonds, E. V., N. Pavcnik, and P. Topalova (2010, oct). Trade adjustment and human capital investments: Evidence from indian tariff reform. American Economic Journal: Applied Economics 2 (4), 42–75. Fan, J. (2019, jul). Internal geography, labor mobility, and the distributional impacts of trade. American Economic Journal: Macroeconomics 11 (3), 252–288. Ferriere, A., G. Navarro, and R. Reyes-Heroles (2018). Escaping the Losses from Trade: The Impact of Heterogeneity on Skill Acquisition *. Technical report. Findlay, R. and H. Kierzkowski (1983, dec). International Trade and Human Capital: A Simple General Equilibrium Model. Journal of Political Economy 91 (6), 957–978. Forbes, N. (2013). India’s higher education opportunity’. Economic Reform in India: Challenges, Prospects, and Lessons , 260–72. Fuchs, S. (2018). The Spoils of War: Trade Shocks during WWI and Spain’s Regional Development Job Market Paper *. Technical report. Goldin, C. and L. Katz (2007, nov). Long-Run Changes in the U.S. Wage Structure: Narrowing, Widening, Polarizing. Technical report, National Bureau of Economic Research, Cambridge, MA. Greenland, A. and J. Lopresti (2016, may). Import exposure and human capital adjust- ment: Evidence from the U.S. Journal of International Economics 100, 50–60. Head, K. and T. Mayer (2014). Gravity Equations: Workhorse,Toolkit, and Cookbook. 
In Handbook of International Economics, Volume 4, pp. 131–195. Elsevier B.V.

Hou, Y. and C. Karayalcin (2019, aug). Exports of primary goods and human capital accumulation. Review of International Economics.

Imbert, C. and J. Papp (2020). Costs and benefits of rural-urban migration: Evidence from India. Journal of Development Economics 146, 102473.

International Trade Center (2017). TRADE IMPACT FOR GOOD BRICS COUNTRIES: EMERGING PLAYERS IN GLOBAL SERVICES TRADE. Technical report.

Johnson, M. T. (2013, oct). Borrowing constraints, college enrollment, and delayed entry. Journal of Labor Economics 31 (4), 669–725.

Jones, B. F. (2014). The Human Capital Stock: A Generalized Approach. American Economic Review 104 (11), 3752–3777.

Kapur, D. (2002). The causes and consequences of India’s IT boom. India Review.

Kapur, D., P. B. Mehta, et al. (2008). Mortgaging the future? Indian higher education. Suman Bery, Barry Bosworth, Arvind Panagariya, 101.

Katz, L. F. and K. M. Murphy (1992, feb). Changes in Relative Wages, 1963-1987: Supply and Demand Factors. The Quarterly Journal of Economics 107 (1), 35–78.

Khanna, G. and N. Morales (2017, sep). The IT Boom and Other Unintended Consequences of Chasing the American Dream. SSRN Electronic Journal.

Kone, Z. L., M. Y. Liu, A. Mattoo, C. Ozden, and S. Sharma (2018, jul). Internal borders and migration in India. Journal of Economic Geography 18 (4), 729–759.

Kucheryavyy, K., G. Lyn, and A. Rodríguez-Clare (2016, aug). Grounded by Gravity: A Well-Behaved Trade Model with Industry-Level Economies of Scale. Technical report, National Bureau of Economic Research, Cambridge, MA.

Lee, C. (2005). Labor Market Status of Older Males in the United States, 1880-1940. Social Science History 29 (1), 77–105.

Li, B. (2018, sep). Export expansion, skill acquisition and industry specialization: evidence from China. Journal of International Economics 114, 346–361.

Li, L. (2019).
Skill-biased Imports, Human Capital Accumulation, and the Allocation of Talent. Technical report.

Liu, M. Y. (2017). How does globalization affect educational attainment? Evidence from China. Technical report.

Ma, S., Y. Liu, and M. Zhou (2019, aug). Trade, educational costs, and skill acquisition. Review of International Economics.

Mathur, S. K. (2006). Indian Information Technology Industry: Past, Present and Future & A Tool for National Development. Technical Report 2.

Oster, E. and B. M. Steinberg (2013, sep). Do IT service centers promote school enrollment? Evidence from India. Journal of Development Economics 104, 123–135.

Pande, A. and Y. Yan (2016). Migration of students from India and China: A comparative view. South Asian Survey 23 (1), 69–92.

Ryoo, J. and S. Rosen (2004, feb). The engineering labor market. Journal of Political Economy 112 (1).

Santos Silva, J. M. and S. Tenreyro (2006, nov). The log of gravity. Review of Economics and Statistics 88 (4), 641–658.

Shastry, G. K. (2012). Human capital response to globalization: Education and information technology in India. Journal of Human Resources 47 (2), 287–330.

Simonovska, I. and M. E. Waugh (2014, jan). The elasticity of trade: Estimates and evidence. Journal of International Economics 92 (1), 34–50.

Stiglitz, J. E. (1970, may). Factor Price Equalization in a Dynamic Economy. Journal of Political Economy 78 (3), 456–488.

Tharakan, P. K., I. Van Beveren, and T. Van Ourti (2005, nov). Determinants of India’s software exports and goods exports. Review of Economics and Statistics 87 (4), 776–780.

Tombe, T. and X. Zhu (2019). Trade, migration, and productivity: A quantitative analysis of China. In American Economic Review, Volume 109, pp. 1843–1872. American Economic Association.

Tsivanidis, N. (2018). The Aggregate and Distributional Effects of Urban Transit Infrastructure: Evidence from Bogotá’s TransMilenio. Technical report.

UC Berkeley (1999).
8.02.99 - UC Berkeley computer scientists attack Y2K bug with new program to find millennium glitches in C applications.

Van Leemput, E. (2021). A passage to India: Quantifying internal and external barriers to trade. Journal of International Economics 131, 103473.

Tables

Table IX: PPML gravity estimation on district to district migration by reason for migration

                                          (1)          (2)          (3)
                                          Education    Work         Other reasons
log distance between district centers     -0.585***    -0.567***    -0.752***
                                          (-71.67)     (-48.87)     (-60.02)
common language                           0.656***     0.478***     0.335***
                                          (8.05)       (5.11)       (4.52)
Same state; neighboring districts         3.577***     3.002***     3.126***
                                          (64.25)      (39.58)      (35.16)
Same state; not neighboring districts     2.559***     2.088***     1.935***
                                          (47.51)      (29.10)      (25.72)
Different state, neighboring districts    2.422***     2.737***     2.845***
                                          (32.31)      (37.56)      (33.66)
N                                         342225       342225       342225

NOTE: The table shows the PPML estimation results, differentiated by reason for migration. t-statistics are reported in parentheses.

Table X: Migration Flows by Reason for Migration.

Reason for Migration    No. of Migrants    Percentage    Out of State    Percentage
Work                    18,901,992         48            9,771,841       52
Education               1,150,798          3             359,029         31
Other                   19,746,588         49            7,200,884       36
Total                   39,799,378         100           17,331,754      44

NOTE: Column 1 lists the reason for migration. Column 2 lists the number of people migrating out of their district of previous residence by reason for migration in 2001. Column 3 shows the percentage distribution of migrants by reason for migration. Column 4 shows the number of people who migrated out of their own state of birth by reason for migration. Column 5 shows the percentage of people migrating out of their state of birth among the total number of migrants by reason for migration. Data source is the 2001 Census migration data.
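The shares in Table X can be recomputed from its counts. The sketch below is only an illustrative check, not part of the paper's analysis. It reads the Education and Other counts as 1,150,798, 359,029, and 7,200,884, an assumption about digit grouping that the column totals support. Under that reading the out-of-state percentages in column 5 are reproduced after rounding, while the column 3 shares for Work and Other come out within about one percentage point of the printed values.

```python
# Counts transcribed from Table X (2001 Census migration data).
# The digit grouping of the Education and Other counts is an assumption.
migrants = {"Work": 18_901_992, "Education": 1_150_798, "Other": 19_746_588}
out_of_state = {"Work": 9_771_841, "Education": 359_029, "Other": 7_200_884}


def share(part, whole):
    """Return part / whole expressed as a percentage."""
    return 100.0 * part / whole


total_migrants = sum(migrants.values())
total_out_of_state = sum(out_of_state.values())

# For each reason: (share of all migrants, share of that group leaving the state).
rows = {
    reason: (share(n, total_migrants), share(out_of_state[reason], n))
    for reason, n in migrants.items()
}
rows["Total"] = (100.0, share(total_out_of_state, total_migrants))

for reason, (pct_of_all, pct_out) in rows.items():
    print(f"{reason:<10} {pct_of_all:5.1f}% of migrants, {pct_out:5.1f}% out of state")
```

The totals row is derived rather than typed in, so a mismatch with the printed totals would signal a transcription problem in one of the counts.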
Table XI: Estimation of response of district to district migration flows to distances by reason for migration

                                          (1)          (2)                  (3)
                                          Education    Work: Traditional    Work: NLS
log distance between district centers     -.602***     -.554***             -.961***
                                          (-62.04)     (-55.55)             (-8.233)
common language                           .307***      .393***              .174***
                                          (17.77)      (3.00)               (.476)
Same state; neighbors                     3.646***     3.158***             5.381***
                                          (45.48)      (36.13)              (3.419)
Same state; not neighbors                 2.379***     2.125***             3.925***
                                          (27.49)      (23.39)              (2.678)
Different state, neighbors                2.339***     2.737***             3.339***
                                          (17.16)      (35.17)              (2.033)
N                                         280900       280900               280900

NOTE: The table shows the PPML estimation results, differentiated by reason for migration. t-statistics are reported in parentheses. Columns 1 and 2 report the traditional PPML estimation results. Column 3 reports the results for work migration estimation using non-linear least squares.

Table XII: Trade Cost Estimation

Panel A: Second Stage

                         (1)         (2)          (3)          (4)
                         Exports     Exports      Exports      Exports
                         OLS         IV           IV           State-level trend
Observable MC            -0.44***    -0.38***     -.36***      -.45***
                         (-10.96)    (-7.88)      (-9.53)      (-6.66)
State-time Trend         Yes         No           No           Yes
Remoteness as control    No          No           Yes          Yes
IV                       No          Yes          Yes          Yes
First Stage F-stat:                  210.37***    116.20***    47.66***
N                        523         523          523          523

Panel B: First Stage

                         (1)              (2)              (3)              (4)
                         Observable MC    Observable MC    Observable MC    Observable MC
                         OLS              IV               IV               State-level trend
IV                       -                -2.40e-08***     -2.31e-08***     -2.46e-08***
                         -                (-13.12)         (-9.53)          (-6.44)
State-time Trend         Yes              No               No               Yes
Remoteness as control    No               No               Yes              Yes
IV                       No               Yes              Yes              Yes
First Stage F-stat:                       210.37***        116.20***        47.66***
N                        523              523              523              523

Robust standard errors are used. t-statistics are reported in parentheses.
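Tables XI and XII report t-statistics rather than standard errors. As a reading aid, the sketch below backs out the implied standard error (coefficient divided by t-statistic) and an approximate 95% interval for the Panel A coefficients on Observable MC. The helper names and the 1.96 normal critical value are assumptions made for illustration, and because the printed coefficients are rounded, the implied standard errors are approximate.

```python
# (coefficient, t-statistic) pairs transcribed from Table XII, Panel A,
# row "Observable MC", keyed by column number.
panel_a = {
    1: (-0.44, -10.96),
    2: (-0.38, -7.88),
    3: (-0.36, -9.53),
    4: (-0.45, -6.66),
}

Z_95 = 1.96  # normal approximation for a two-sided 95% interval (assumption)


def implied_se(coef, t_stat):
    """Standard error implied by a coefficient and its t-statistic."""
    return abs(coef / t_stat)


def conf_interval(coef, t_stat, z=Z_95):
    """Approximate confidence interval: coef +/- z * implied standard error."""
    se = implied_se(coef, t_stat)
    return coef - z * se, coef + z * se


for col, (coef, t_stat) in panel_a.items():
    lo, hi = conf_interval(coef, t_stat)
    print(f"column ({col}): se ~ {implied_se(coef, t_stat):.3f}, "
          f"95% CI ~ [{lo:.3f}, {hi:.3f}]")
```

The same two helpers apply to any coefficient and t-statistic pair in Tables IX and XI.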
Table XIII: Summary of estimated parameter values

Parameter                            Value    Source
Productivity dispersion (θ)          2.61     Estimated
Education amenity dispersion (γ)     1.636    Estimated
IT trade elasticity (σIT)            1.45     Estimated
ρS *                                 1.41     Katz and Murphy (1992)
ρhS *                                2        Ryoo and Rosen (2004)
Internal trade elasticity (σS)       5        Simonovska and Waugh (2014)
Agriculture share                    .38      Ministry of Statistics, Govt of India
Manufacturing share                  .16      -
High-skill services                  .07      -
Other services                       .37      -

Figures

Figure III: Growth in software exports, IT employment, and engineering enrollment

(a) Panel A: normalized software exports by year, 1993 to 2002.
(b) Panel B: normalized enrollment/employment in Eng/CS by year, 1990 to 2009. Series: IT, Engineering.

Notes: Panel A shows the growth in IT exports over time, with IT exports normalized to their 1993 levels. It was generated using data on software exports compiled by Richard Heeks. Panel B shows the growth in IT employment and engineering enrollment over time, with both normalized to their 1990-1991 levels. It was generated using IT employment data from NSSO, NASSCOM, and the Economic Census, and engineering enrollment data from the Population Census.

Figure IV: Response of IT employment and engineering enrollment across districts with different levels of software exports

(a) Panel A: 95% confidence interval by year (1995, 1998, 1999, 2002, 2003, 2005, 2013).
(b) Panel B: 95% confidence interval by year (1991, 2001, 2011).

Notes: These figures plot the confidence intervals for the year-by-year response of standardized IT employment (panel A) and engineering enrollment (panel B) over the pre- and post-boom periods in districts that had any IT exports in 1995 compared to districts that did not. Note that engineering enrollment, gathered from the Population Census, is only available at decadal intervals.
Figure V: Heterogeneous Response of IT Employment

(Plot: 95% confidence interval by year: 1995, 1998, 1999, 2002, 2003, 2005, 2013.)

Notes: This graph plots the confidence intervals for the year-by-year heterogeneous response of IT employment to differences in historical college enrollment among districts that had any existing level of IT exports over the pre- and post-boom periods. The unit is per ’000 engineering students.

Figure VI: Spatial Distribution of IT Employment and Engineering Enrollment in 2011

(a) Panel A: map legend bins (0.16,20.75], (0.08,0.16], (0.04,0.08], [0.00,0.04], No data.
(b) Panel B: map legend bins (2.12,13.44], (0.96,2.12], (0.51,0.96], [0.09,0.51].

Notes: These figures show the percentage distribution of employment in the IT sector out of total employment (panel A) and the percentage distribution of engineering enrollment out of total enrollment (panel B).

Figure VII: Histogram of Work-flows by Reason for Migration.

(Histogram: x-axis, percentage out of total migrants in destination; y-axis, number of destination districts. Series: Education, Work.)

Notes: On the x-axis, this histogram plots the proportion of migrants in a district that migrated for work and education, shown in pink and blue respectively. On the y-axis, it plots the number of destination districts with the corresponding proportions of migrants for work and education. Data source is the 2001 Census migration data.

Figure VIII: Distribution of Welfare Gains from the IT Boom with and without endogenous education

(Histogram: y-axis, density. Series: no endogenous education, baseline model.)

NOTE: The histograms show the distributions of regional welfare gains, as measured by the percentage change in welfare for every percentage change in IT exports, with and without endogenous education choice.
Figure IX: Distribution of Welfare Gains from the IT Boom with and without education mobility

(Histogram: y-axis, density. Series: no education mobility, baseline model.)

NOTE: The histograms show the distributions of regional welfare gains, as measured by the percentage change in welfare for every percentage change in IT exports, with and without education mobility.

Figure X: Distribution of Welfare Gains from the IT Boom with and without border effects for education migration costs

(Histogram: y-axis, density. Series: no border effects for education, baseline model.)

NOTE: The histograms show the distributions of welfare gains from the IT boom with and without border effects in education mobility cost.