Determinants of City Growth in Brazil

Daniel da Mata*, Uwe Deichmann, J. Vernon Henderson, Somik V. Lall, and Hyoung Gun Wang

* DIRUR, Instituto de Pesquisa Econômica Aplicada (IPEA), Brasilia
Development Research Group, The World Bank, Washington DC
Department of Economics, Brown University, Providence, RI

Abstract

In this paper, we examine the determinants of Brazilian city growth between 1970 and 2000. We consider a model of a city, which combines aspects of standard urban economics and the new economic geography literatures. For the empirical analysis, we construct a dataset of 123 Brazilian agglomerations, and estimate aspects of the demand and supply side as well as a reduced form specification that describes city sizes and their growth. Our main findings are that increases in rural population supply, improvements in inter-regional transport connectivity and education attainment of the labor force have strong impacts on city growth. We also find that local crime and violence, measured by homicide rates, impinge on growth. In contrast, a higher share of private sector industrial capital in the local economy stimulates growth. Using the residuals from the growth estimation, we also find that cities that better administer local land use and zoning laws have higher growth. Finally, our policy simulations show that diverting transport investments from large cities toward secondary cities does not provide significant gains in terms of national urban performance.

World Bank Policy Research Working Paper 3723, September 2005

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly.
The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the view of the World Bank, its Executive Directors, or the countries they represent. Policy Research Working Papers are available online at http://econ.worldbank.org.

Acknowledgements

This paper is a product of a joint research program between the World Bank and the Instituto de Pesquisa Econômica Aplicada (IPEA), Brasilia. This research has been partly funded by a World Bank research grant and by the Urban Cluster of the World Bank's Latin America and Caribbean Region, and is also an input to the World Bank's urban strategy for Brazil. We have benefited from discussions with Carlos Azzoni, Pedro Cavalcanti Ferreira, Ken Chomitz, Dean Cira, Marianne Fay, Mila Freire, João Carlos Magalhães, Maria da Piedade Morais, Marcelo Piancastelli, Zmarak Shalizi, Christopher Timmins and Alexandre Ywata de Carvalho. All errors are the authors'. A preliminary version of the paper was presented at the World Bank/IPEA Urban Research Symposium in Brasilia (April 2005).

1. BACKGROUND AND MOTIVATION

Why are some cities more successful than their peers? Is the 'success' of individual cities driven by factors mostly external to any city's immediate control (location, growth in market potential, being a port in a period of national trade growth, national level decentralization and improved governance), or do individual city policies and politics influence growth and development? Disentangling the relative contribution of regional and local efforts is important for understanding the potential of alternative policy interventions for stimulating growth of cities across the national urban system. At this time, there is very little research examining the effectiveness of local and national policy environments on urban growth in developing countries.
Brazil is a highly urbanized country: 80 percent of its population lives in urban centers and 90 percent of GDP is created in cities. According to estimates by the UN Population Division for Brazil, the entire growth in population that is expected over the next three decades will be in cities, and the national urbanization rate is expected to rise to over 90 percent (UN 2003). This will add about 63 million people to Brazil's cities, and total urban population will be over 200 million. This population growth is occurring across the Brazilian urban system (Table 1; see also Lemos et al. 2003). Of the 123 major urban agglomerations in Brazil, only three were above 2 million people in 1970 versus ten in 2000. In the middle of the size distribution in 2000, there were 52 agglomerations with population between 250,000 and 2 million people compared to 25 in 1970. Thus, not only is the scale of urbanization a major concern, but the distribution of population across the urban hierarchy will also challenge policy makers to devise appropriate policies for cities of different sizes. Across the urban system, there will be a need to meet backlogs in infrastructure, service delivery, and amenity provision, as well as to accommodate further growth.

In addition to population increases across the urban system, fiscal and administrative decentralization has increased the role of individual cities in attracting investments and in providing services that are responsive to the needs of local residents. Brazil is one of the most decentralized among developing countries. The 1988 Constitution established municipalities as the third level of government, and provided states and municipalities with more revenue raising power and freedom to set tax rates. However, many local governments have limited administrative and institutional capacity, and have not been able to effectively use their autonomy to improve service delivery or attract new investment.
A recent study by the World Bank (World Bank 2002) identifies maximizing urban competitiveness from agglomeration economies and minimizing congestion costs from negative externalities as key challenges facing national and local governments in Brazil.

Against this backdrop of rapid population growth and decentralization of administrative and fiscal responsibilities, it becomes essential to identify what types of interventions stimulate growth of individual cities. In addition, we want to find out the consequences of favoring investments in secondary cities on aggregate efficiency and economic growth. There is an ongoing debate in Brazil's policy circles over whether the largest agglomerations have become too big, leading to significant negative externalities of crime, social conflict, and high land costs, and whether policies should be designed to actively stem the growth of these large agglomerations and favor investments in secondary cities. It is however not clear if net agglomeration economies in large cities can be offset by incentives and other measures to divert growth to smaller cities.

In this paper, we consider a model of a city, which consists of a demand side (what utility levels a city can pay out) and a supply side (what utilities people demand to live in a city). We estimate aspects of the demand and supply side, and then a reduced form equation that describes city sizes and their growth. For the empirical analysis, we construct a dataset of Brazilian agglomerations to examine city growth between 1970 and 2000. Much of the underlying data come from the Brazilian Bureau of Statistics (IBGE) Population Censuses of 1970, 1980, 1991, and 2000. For the estimation, we make use of GMM and spatial GMM techniques to correct for endogeneity in the presence of spatially autocorrelated errors.
Our main findings are that increases in rural population supply, and improvements in inter-regional transport connectivity and education attainment of the labor force have strong impacts on city growth. Both labor force quality improvements and base period education attainment matter significantly for growth. In terms of local characteristics, we find that local crime and violence and a higher representation of public industrial capital in the city lower city growth rates.

The rest of the paper is organized as follows. Section 2 provides the model and estimation framework of urban demand and population supply models. The models presented in this section combine traditional urban modeling with concepts from the new economic geography literature. In Section 3, we discuss findings from the empirical analysis and focus our attention on identifying the main determinants of city growth. Section 4 provides results from simulations that examine if investments in secondary cities stimulate growth. Section 5 concludes.

2. MEASURING CITY GROWTH

In this paper, we examine the local and regional determinants of city growth in Brazil. Urban growth is represented by both individual city productivity growth and city population growth, which are different indicators of city "success" and represent two interconnected dimensions of successful urban growth. However, before we can look at any individual city's success, we need to understand the broader context in which the economy as a whole is changing. Cities from an economic perspective represent the way modern production is carried out in a country and, as such, reflect what is occurring in the country as a whole. Production composition of cities varies by city size, where different types of goods are best produced in bigger versus smaller cities.
If national output composition changes, altered by changing trade demand or domestic demand that changes with economic growth, then demand moves away from goods produced in smaller types of cities and those cities will suffer a setback. Some will falter; others will adjust what they produce and perhaps upgrade, moving up the urban hierarchy. Which ones adjust well may depend on "luck", but it may also depend on observable attributes such as education of the labor force. A better educated labor force may allow for more nimble adjustment and up-scaling of products produced, which is called the reinvention hypothesis. Similarly the skill composition of the labor force will vary across cities in systematic ways, as output composition and skill needs vary. More generally, national productivity growth comes from productivity growth within cities, which engender the close social-spatial interactions inherent in innovation, knowledge accumulation and technological improvements. To understand individual city success, we need to account for the external, national factors driving urban changes, as well as to understand the sources of local productivity growth.

At the same time we need to be able to measure when cities are being "successful" versus less successful and what drives success. Much of success may be driven by conditions external to the city, as just noted. In addition to demand changes, changes in national institutions may matter. For example, providing smaller cities with greater autonomy in local public sector decision making and greater access to fiscal resources may make it easier for smaller cities to finance the infrastructure and public sector services demanded by firms (transport and telecommunications) and by higher skilled workers (e.g., better schools), and to compete successfully with bigger cities for certain industries.
In terms of city level conditions, better run cities with more efficient use of public sector revenues will be more attractive to both firms and migrants. And better run cities will co-ordinate better with local businesses to help service their needs and make them more productive. So part of measuring city success is measuring what local producer and consumer amenities are valued and what cities are better at providing these amenities.

In related work, Glaeser et al. (1995) examined how urban growth of U.S. cities between 1960 and 1990 is related to various urban characteristics in 1960, such as their location, initial population, initial income, past growth, output composition, unemployment, inequality, racial composition, segregation, size and nature of government, and the educational attainment of their labor force. They showed that income and population growth are (1) positively related to initial schooling, (2) negatively related to initial unemployment, and (3) negatively related to the initial share of employment in manufacturing. Racial composition and segregation are not correlated with later city population growth. Government expenditures (except for sanitation) are also not associated with subsequent growth. However, per capita government debt is positively correlated with later growth.¹

In a long run analysis, Beeson et al. (2001) examine the location and growth of the U.S. population using county-level census data from 1840 and 1990. They showed that access to transportation networks, either natural (oceans) or produced (railroads), was an important source of growth over the period.² In addition, industry mix (share of employment in commerce and manufacturing), educational infrastructure, and weather have promoted population growth.

In a recent paper for developing countries, Au and Henderson (2004) took a slightly different approach.
They modeled and estimated net urban agglomeration economies for cities in China, which can be represented by inverted-U shapes of net output or value-added per worker against city employment. They found urban agglomeration benefits are high: real incomes per worker rise sharply with increases in city size from a low level, level out nearer the peak, and then decline very slowly past the peak. The inverted-U shifts with industrial composition across the urban hierarchy of cities. Peak sizes are larger for more service-oriented cities and smaller for manufacturing-intensive cities. In addition, (domestic) market potential and accumulated FDI per worker have significant and beneficial effects on city productivity, measured by value-added per worker. However, percentage of high school graduates, distances to a major highway and to navigable rivers, and kilometers of paved road per person have no effects, once market potential is controlled for.

¹ They attributed this correlation to higher expected growth, which made it cheaper to borrow, or to governments investing heavily in infrastructure to serve that growth.
² Transportation network is represented by a group of dummy variables indicating ocean, mountain, confluence of two rivers, railroads, and canals.

We now describe the model and estimation strategy employed in our analysis. The data used for the analysis have been produced through a joint research program between IPEA, Brasilia and the World Bank. Detailed descriptions of the variables and their sources are provided in Appendix C, and a descriptive overview of Brazilian city growth is in da Mata et al. (2005).

There is no official statistical or administrative entity in Brazil that reflects the concept of a city or urban agglomeration that is appropriate for economic analysis. Socioeconomic data in Brazil tend to be available for municípios, the main administrative level for local policy implementation and management. Municípios, however, vary in size.
In 2000, São Paulo município had a population of more than ten million, while many other municípios had only a few thousand residents. Furthermore, many functional agglomerations consist of a number of municípios, and the boundaries of these units change over time. Our analysis therefore adapts the concepts of agglomerations from a comprehensive urban study by IPEA, IBGE and UNICAMP (2002), resulting in a grouping of municípios to form 123 urban agglomerations (Figure 1). Throughout this paper we refer to these units of analysis as agglomerations, urban areas, or cities.

Model and estimation strategy

The model consists of a demand side (what utility levels a city can pay out) and a supply side (what utilities people demand to live in a city). We estimate aspects of the demand and supply side, and then a reduced form equation that describes city sizes and their growth. In the end the focus is on the last item.

Demand side

The demand side is given by the schedule of utility levels a city can offer workers, as city size increases. A prime determinant of that is income, $I$, which consists of wage income and income from rents and other non-labor sources. In addition, in an indirect utility function we also have a vector of items, $Q_i$, such as commuting costs, housing rents, local taxes, and local public services and amenities, so that

$$U_i = U^D(I_i, Q_i) \qquad (1)$$

For wage income there is a wage rate component and then a work effort component discussed momentarily. The wage rate component comes from value of marginal productivity relationships, where

$$w_i = w(MP_i, r_i, e_i, N_i) \qquad (2)$$

In (2) $r$ is the rental rate on capital, $e$ is the quality or education level of workers, $MP$ is market potential reflecting the demand for a city's output and hence the price it receives, and $N$ is a measure of scale, such as city employment. $MP$ from the new economic geography and monopolistic competition literature has a specific form with components we cannot measure. We make two adjustments.
First we use "nominal" market potential, which is simply the distance-discounted sum of total incomes of all MCAs in Brazil for city $i$, or

$$MP_i = \sum_{j,\, j \neq i} \frac{TI_j}{\tau_{ij}} \qquad (3)$$

where $TI_j$ is total income and $\tau_{ij}$ represents the transport cost between $i$ and $j$.³ The calculation of market potential is described in Appendix B, where we use distance as the measure of transport costs. However, travel times and costs vary by more than distance. For 1968, 1980 and 1995, Brazil has a measure of the transport cost from each city to its state capital. As our second adjustment, we divide that variable by distance from the city to the state capital to get a city specific measure of local transport costs which producers in a city face in selling in the local region. The variable "inter-city transport costs", $\tau_i$, will be determined by intercity road infrastructure investment.

The major items from urban theory affecting worker well-being, apart from the wage rate, are rents and commuting costs. Commuting costs are time costs, of which part will be reflected in lost work time or energy for work, and part in out-of-pocket commuting costs. So total wage income is a function of both the wage rate and hours and energy available to work, where the latter will be negatively affected by commuting times. Housing costs are tricky, since higher housing rents are also reflected in higher non-labor income earned by landowners.

For demand side estimation, what we know from the data is total income per worker in each city. We model that as a function of the determinants of the wage rate and then factors affecting work time/energy and housing rental income. Both are a function of city size. In sum we estimate:

$$I_i = I^D(MP_i, \tau_i, e_i, N_i) \qquad (4)$$

The scale variable, $N$, captures three things: scale externality effects on wage rates, increasing housing rental incomes, and reduced work time/energy.

³ The MCAs (Minimum Comparable Areas) are groups of municípios. The detailed description is in Appendix C.
The sign of the scale variable is therefore uncertain: if cities are at a size where the commuting cost aspects of urban living weigh heavily, at the margin increases in scale could detract from incomes. That will be the case in our estimation. A negative scale effect is also good for "stability" given that supply curves are upward sloping; being on the rising part of the "demand curve" can be problematic and also makes sign interpretations in the city size equation more difficult, as discussed later.

Population Supply

The population supply relationship we estimate has population supplied to a city increasing in utility offered per worker, which we approximate by income per worker. This will tell us the supply elasticity of people to a city. In addition, supply is shifted by attributes, $Z_i$, of the surrounding area, that is, by substitute places to work for population in the area. We have supply to a city of population from nearby rural areas. It is decreasing in surrounding rural incomes, where we use a gravity measure of surrounding rural incomes, and it is increasing in surrounding rural population supply, where again we use a gravity measure of surrounding rural population. The calculation details are in Appendix B. The supply equation is given by

$$N_i = N^S(U^s(I_i), Z_i), \quad \text{where } \partial N^S/\partial I > 0,\ \partial N^S/\partial Z > 0 \qquad (5)$$

Note the inverse we will use later is

$$I_i = I^S(N_i, Z_i), \quad \text{where } \partial I^S/\partial N > 0,\ \partial I^S/\partial Z < 0 \qquad (6)$$

City Size Level and Growth Equations

The final estimating equation comes from equating income demand and supply equations in (4) and (6) and solving for $N$ to get

$$N_i = N(MP_i, \tau_i, e_i, Z_i), \quad \text{where } \partial N/\partial MP > 0,\ \partial N/\partial \tau < 0,\ \partial N/\partial e > 0,\ \partial N/\partial Z > 0 \qquad (7)$$

The negative sign on $\tau_i$ reflects that higher inter-city transport costs lower the income a city can offer. Also by differentiating (4) and (6) we can show

$$dN = \frac{-(\partial I^S/\partial Z)\,dZ + (\partial I^D/\partial MP)\,dMP + (\partial I^D/\partial \tau)\,d\tau + (\partial I^D/\partial e)\,de}{\partial I^S/\partial N - \partial I^D/\partial N} \qquad (8)$$

Note that $\partial I^S/\partial Z < 0$, and that $\partial I^S/\partial N - \partial I^D/\partial N > 0$ for "stability", which is helped by the fact that empirically in Table 2 (discussed momentarily) $\partial I^D/\partial N < 0$.

3. DETERMINANTS OF GROWTH - DEMAND AND SUPPLY SIDES

Having described the model and estimation strategy in Section 2, we now discuss the main findings from demand, supply, and city growth models. Results from estimating the demand side model (equation 4) are presented in Table 2, pooling three years (1980, 1991, and 2000). We focus on the GMM-IV results in column 1, which are from the two-step efficient GMM in the presence of arbitrary heteroskedasticity and arbitrary within-state correlation.⁴ We also give OLS results in column 2. In columns 1 and 2 the scale measure is total workers in each city. In column 3, population instead of total workers is used to represent urban scale. The instruments along with statistical test results are listed in the footnotes. The GMM results of columns 1 and 3 pass specification tests for the listed variables, and average partial R2's (average partial F's) are .44 and .43 (52.7 and 51.6) respectively, which are relatively strong.⁵

⁴ The results are almost identical to 2SLS ones. All the GMM estimations in this paper are the two-step efficient GMM in the presence of arbitrary heteroskedasticity and arbitrary within-state correlation.

In column 4, we provide the effects on outcomes of a one standard deviation increase in covariates. All variables have large impacts on total income per worker. For average schooling and Ln(market potential), one standard deviation increases (1.26 and 1.01) increase total income per worker by 37.5% and 36.5%. Also for Ln(number of workers) and Ln(intercity-transport costs), reductions of one standard deviation (-1.13 and -.344) increase total income per worker by 34.4% and 7.4% respectively. Of course for covariates in log form we already have elasticities. The inter-city transport costs variable is significant although it can be fragile. For intercity-transport costs we use the 1980 value for years 1980 and 1991, and we use the 1995 value for 2000.
We give zero values to Ln(intercity-transport costs) of state capital cities and add to covariates a dummy variable indicating state capitals. Results for transport costs to São Paulo are much more fragile and have not been included in the specifications reported in Table 2. Finally, note the strong negative scale effects at the margin, suggesting we are on the downward sloping portion of inverted U's (of income against city size) as we should be.⁶ We had no success in estimating a quadratic specification or interacting scale with the manufacturing to service ratio, to examine interactions between city scale and industrial composition.

⁵ Partial R2 is a squared partial correlation between the excluded instruments and the endogenous regressor in question, and the F-test of the excluded instruments corresponds to this partial R2.
⁶ Theory suggests that, under free migration within a country, if particular cities are not at the peak of their inverted U's, they will be to the right of the peak, due to either "stability" conditions in migration-labor markets or conditions on what constitutes a Nash equilibrium in migration decisions (Au and Henderson, 2004; Duranton and Puga, 2004).

Growth or differenced versions of this equation and the population supply one have very poor IV results, which is mainly due to a weak instrument problem. For the growth specifications, we only focus on the final reduced form specification (Table 5).

Results for population supply are provided in Table 3. Again, for the estimation we pool three years (1980, 1991, and 2000). Columns 1 and 2 give the GMM-IV and then OLS results. The instruments, listed in the footnote of the table, pass specification tests and produce strong first-stage regression results. All terms have strong coefficients with the expected signs. In column 1, a 1% increase in a city's total income per capita increases city population by 2.4%.
The gravity measures of surrounding rural population supply and rural income opportunities have the expected opposite effects with similar magnitudes. A 1% increase in surrounding rural population supply increases city population by 5.9%, and a 1% increase in surrounding rural income opportunities decreases city population by 5.2%. Thus, city populations are very sensitive to rural population supply and earning opportunities.

In columns 3-5, we present supply elasticities by year. The coefficients of all three covariates increase over time, indicating increasing mobility. Population supply to a city has become more elastic to changes in attributes of the city and nearby rural areas. However, even in 2000, the elasticity, 2.9, is far from the elasticity implied by perfect mobility.⁷

⁷ Under perfect labor mobility, we expect a horizontal population supply curve. All the cities offer the same utility level, and city sizes are only determined by demand-side factors.

City Size Results

Results for city size from estimating equation (7) are given in Table 4. Column 1 gives GMM-IV results, column 2 OLS, and column 3 the effects of a one standard deviation increase in covariates on city size. For instruments, we use 1970 values and time-invariant variables.⁸ Again the instruments pass specification tests, and show strong first-stage regression results.

If the reduced form results are indeed from combining demand and supply sides, we expect the coefficient estimates in Table 4 to be consistent with the imputed values from the demand side (Table 2) and the supply side (Table 3). The imputed values can be calculated using (8), such that

$$c_i = \frac{dN}{dQ} = \frac{\partial I^D/\partial Q}{\partial I^S/\partial N - \partial I^D/\partial N} = \frac{b_i}{(1/a_1) - b_4}$$

$$c_j = \frac{dN}{dZ} = \frac{-\,\partial I^S/\partial Z}{\partial I^S/\partial N - \partial I^D/\partial N} = \frac{-a_j}{(1/a_1) - b_4}$$

where $c_i$, $c_j$ are reduced form coefficient estimates in Table 4, $b_i$ the demand side estimates of Table 2, and $a_j$ the supply side estimates of Table 3.
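The imputation formulas above can be checked with a short Python sketch. It assumes a linear (log-linear) demand schedule and inverse supply schedule, and it treats the supply-side inputs as inverse-supply slopes, i.e. as values of $\partial I^S/\partial Z$. These are simplifying assumptions made for illustration, the function and variable names are hypothetical, and the coefficient values are invented rather than taken from Tables 2 to 4.

```python
def imputed_reduced_form(b, b4, a1, a_inv):
    """Imputed reduced-form coefficients from demand and supply estimates.

    b     -- dict of demand-side coefficients on shifters (dI^D/dQ)
    b4    -- demand-side scale coefficient (dI^D/dN)
    a1    -- supply elasticity of population to income, so dI^S/dN = 1/a1
    a_inv -- dict of inverse-supply coefficients on shifters (dI^S/dZ)
    Keys of b and a_inv are assumed to be distinct.
    """
    denom = 1.0 / a1 - b4
    if denom <= 0:
        raise ValueError("stability requires 1/a1 - b4 > 0")
    imputed = {name: coef / denom for name, coef in b.items()}
    imputed.update({name: -coef / denom for name, coef in a_inv.items()})
    return imputed


def equilibrium_size(b, b4, a1, a_inv, x, z):
    """Solve demand = inverse supply for N in the linear case.

    Demand:         I = sum_k b[k] * x[k] + b4 * N
    Inverse supply: I = N / a1 + sum_j a_inv[j] * z[j]
    """
    demand_shift = sum(b[k] * x[k] for k in b)
    supply_shift = sum(a_inv[j] * z[j] for j in a_inv)
    return (demand_shift - supply_shift) / (1.0 / a1 - b4)


# Hypothetical coefficients, for illustration only.
b = {"ln_market_potential": 0.3}
a_inv = {"ln_rural_pop_supply": -0.8}
c = imputed_reduced_form(b, b4=-0.4, a1=2.5, a_inv=a_inv)

# A unit increase in a demand shifter moves equilibrium N by the imputed c.
n0 = equilibrium_size(b, -0.4, 2.5, a_inv,
                      {"ln_market_potential": 1.0}, {"ln_rural_pop_supply": 1.0})
n1 = equilibrium_size(b, -0.4, 2.5, a_inv,
                      {"ln_market_potential": 2.0}, {"ln_rural_pop_supply": 1.0})
```

With a negative scale coefficient `b4`, the denominator is larger than `1/a1` alone, so the imputed effects are damped relative to the case with no scale effect.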
The comparison with imputed values, noted in the footnote, confirms a rough consistency between Tables 2 to 4.⁹

⁸ The instruments are semi-arid area dummy, port dummy, illiteracy rate (1970), ln(industrial capital per worker, 1970), ln(distance to state capital)*ln(market pot. agric. land availability, 1970), ln(humidity), ln(avg. temperature), ln(rural pop. supply, 1970), ln(rural income opportunities, 1970), ln(market potential, 1970), and state capital and time dummies.

⁹ Imputed and estimated reduced form coefficients:

Variable                          Formula            Imputed [from Tables 2 (3) and 3 (1)]   Table 4 (1)
Ln(market potential)              b1/(1/a1-b4)        0.468                                   2.693
Ln(inter-city trans. costs)       b2/(1/a1-b4)       -0.250                                  -1.395
Average Schooling                 b3/(1/a1-b4)        0.381                                   0.220
Ln(rural pop. supply)             -a2/(1/a1-b4)       3.053                                   1.661
Ln(rural income opportunities)    -a3/(1/a1-b4)      -3.468                                  -3.664

Table 4 suggests two things. First, market potential for goods, the rural population supply, and rural income opportunities have significant effects on city populations with roughly similar magnitudes. A 1% increase in market potential and rural population increase city size by 2.7% and 1.7% respectively. In comparison, a 1% decrease in rural income opportunities would increase city size by 3.7%. Second, intercity-transport costs and educational attainment (average schooling) are also important, although GMM-IV results are somewhat fragile.

Growth Results

Next we turn to growth equations, where we difference the reduced form equation (7). While in principle results should be the same, a differenced equation has three possible advantages and one drawback. First, a growth formulation allows us to separate out labor force quality improvements from the effect of education on technology (knowledge accumulation spillovers). The latter is inferred from the effect on city growth of base period education levels, in a common specification in the growth literature.
Second, while the levels formulation we estimated passes specification tests, one might have strong priors that there are time invariant unobservables affecting city size that are difficult to instrument for; differencing removes these. Third, a growth formulation allows us conceptually to move beyond the equilibrium static allocation framework used in the specification to test for growth effects where adjustment processes are involved. The drawback in differencing equations is that the effects of variables which have small changes over time may be poorly estimated, given lack of variation in the data.

Table 5-1 shows the GMM-IV and OLS growth results, pooling the 1991-1980 and 2000-1991 differenced versions of equation (7). For instruments, we add to the IV list of Table 4 ln(distance to São Paulo), ln(transport costs to São Paulo, 1968), and ln(transport costs to state capital, 1968). All covariates, except changes in rural income opportunities, have strong coefficients with the expected signs. The poor performance of rural income opportunities is most probably due to the limited variance in the data over time, as discussed next.

Relative to the levels equation in Table 4, the growth equation coefficients reported in column 1 are similar for market potential and (change in) schooling. However, results for changes in rural situation variables and transport costs differ in magnitude. For ln(rural population supply) and ln(rural income opportunities), not only is there little variation, but the two variables are also strongly negatively correlated.¹⁰ So the high coefficient on ln(rural population supply) may be picking up some of the effect of ln(rural income opportunities). For the inter-city transport cost variable, differences over time may be poorly measured. While we instrument for this variable, the instruments include historical levels of the same measure, and therefore may be subject to the same measurement issues.
As a result, reductions in inter-city transport costs have a much smaller effect in the growth estimation. Nevertheless coefficients are consistent in sign with those of the level equation in Table 4.

¹⁰ The correlation coefficients are -.719 (for 1991-1980) and -.481 (for 2000-1991).

In examining the results in Table 5-1, we focus on column 3. The main difference between the GMM results in columns 1 and 3 is that we introduce base period population and manufacturing to service ratios in the latter specification. Controlling for population allows for dynamic adjustment to steady state levels from the base, and introducing industrial composition allows for adjustment relative to changes in national output composition. For results in column 3, the instrument list readily passes the specification test. First stage regressions for the covariates have average partial R2's and F's of respectively .52 and 2852, which are strong for differenced covariates. For differenced intercity-transport costs, we use the difference between 1995 and 1980 for 2000-1991, and the difference between 1980 and 1968 for 1991-1980.

We find that increases in rural population supply, market potential of goods, and labor force quality improvements (measured by changes in educational attainment) increase the growth rate of city population. As a new effect, educational attainment in the base period increases city population growth rates afterwards, confirming spillover effects of knowledge accumulation. But as noted above, reductions in intercity-transport costs have only a moderate effect on the city population growth rate. A 10% decrease in intercity-transport costs increases city population growth by .9% over a decade. Initial city size has a negative coefficient, suggesting some conditional convergence in population growth across cities. Also, cities with high manufacturing ratios in the base period experience faster growth.
We also find that once base period population and industrial composition are controlled for, state capitals are growing faster than other cities.

In Table 5-2, we introduce two additional local characteristics to the specification in Table 5-1, column 3. These are (1) the ratio of public industry capital to total industry capital stock in 1980 [11] and (2) base period homicide rates. The main difference between the GMM results in column 3 of Table 5-1 and those in Table 5-2 is that the change in market potential is now statistically significant only at about the 20 percent level. Other results are consistent with those reported in Table 5-1. The GMM results suggest that homicide rates and a higher share of public industry capital have a detrimental effect on city growth. For example, a 10% increase in base period homicide rates reduces city growth by 1.1% over the next decade. The findings on public industrial capital accumulation suggest that public investment in industry tends to crowd out private investment (at least in the short term), and the potential inefficiency of state enterprises may also deter economic growth.[12]

[11] Total industry capital includes both public and private industry capital stocks. The capital stock data come from Morandi and Reis (2004). Owing to data limitations, we use the capital stock in 1980, which is the most recent year available.

Decomposing City Growth

In Table 6, we decompose the city population growth results of Table 5-1 (3) into the contributions of each covariate. We focus on the covariates which are statistically significant. The contribution of each covariate is calculated as a fitted value (the mean value multiplied by the estimated coefficient) relative to the sum of all the fitted values. Column 5 shows the overall contributions for all cities. There is a strong negative effect of city size in the base period (-83.4%).
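The decomposition rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it takes the coefficients a_i and the pooled means b_i as printed in Table 6, and the names TABLE6 and decompose are hypothetical. Because the printed means are rounded, the shares match the published ones only approximately.

```python
# Coefficients a_i (Table 5-1, column 3) and pooled means b_i, as printed in Table 6.
TABLE6 = {
    "Ln(rural pop. supply market potential)": (9.429, -0.006),
    "Ln(market potential)": (1.284, 0.346),
    "Average schooling (t-1)": (0.071, 4.568),
    "Average schooling": (0.384, 1.208),
    "Ln(inter-city transport costs)": (-0.089, -0.215),
    "State capital dummy": (0.154, 0.171),
    "Ln(population) (t-1)": (-0.047, 12.339),
    "Manu / service (t-1)": (0.140, 0.406),
}


def decompose(table):
    """Return (c, shares), where c is the sum of a_i * b_i and
    shares[name] is 100 * a_i * b_i / c for each covariate."""
    fitted = {name: a * b for name, (a, b) in table.items()}
    c = sum(fitted.values())
    shares = {name: 100.0 * value / c for name, value in fitted.items()}
    return c, shares


c, shares = decompose(TABLE6)
for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
print(f"c = {c:.3f}")
```

Shares computed this way sum to 100 by construction, so a large negative contribution (such as base period population) has to be offset by positive contributions elsewhere.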
This effect is offset by increases in market potential (63.8%) and educational attainment (66.7%), along with base period educational attainment (46.7%), which affects local technology growth. The estimated effects of market potential and technology spillovers support the new economic geography emphasis on local markets and the endogenous growth literature's emphasis on human capital accumulation. These results are also consistent with the cross-country findings in Henderson and Wang (2005).[13] Columns 6 and 7 compare the city growth decompositions of large versus small cities. We find no major difference in these effects across city size.

[12] La Porta and López-de-Silanes (1999) showed that privatization in Mexico in the 1980s and 1990s led to a significant improvement in firm performance, as profitability increased 24 percentage points and converged to levels similar to those of private firms.

[13] Henderson and Wang (2005) analyze how urbanization in a country is accommodated by increases in numbers versus population sizes of cities. Using a worldwide dataset on all metro areas over 100,000 population from 1960-2000, they show that market potential, educational attainment, and the degree of democratization strongly affect growth in both city numbers and individual city sizes.

Robustness Tests - Spatial Dependence

Interaction among cities due to trading and technological linkages is likely to influence city growth. In the presence of technology spillovers, copycat policy adoption, and inter-regional transport connectivity, growth in any given city will be related to growth in other cities in the urban system, and the impact of these spillovers is likely to be higher among cities which are geographically close to each other. Many of these interactions, however, are not observed in the data that we have been able to compile, and are thus relegated to the error term.
In the presence of spatial autocorrelation, standard errors from the city growth estimation are likely to be inaccurate, and the various estimations may suffer from efficiency problems. We therefore test whether the clustered estimation results of Tables 2 to 5-2 are robust to residual spatial dependence. Tests for spatial dependence (Moran's I and Geary's C) show that there is residual spatial autocorrelation in the error terms. To address this issue, we employ the GMM methodology of Conley (1999), which uses weighted averages of spatial autocovariance terms to correct the standard errors of the parameter estimates for possible dependence based on location. This approach is robust to misspecification of the degree of spatial correlation among the units. In this nonparametric application, the researcher can specify a cutoff point beyond which spatial dependence is thought to be unimportant. We use the latitude and longitude of the agglomeration centroid as coordinate variables. Cutoffs are set to 1.5 standard deviations of latitude and longitude (10.23 and 8.20), which correspond to 900 miles. Thus, the weight given to spatial correlation between cities declines linearly and is zero beyond 1.5 standard deviations of latitude and longitude. Appendix Tables A to D report the two-step spatial GMM and spatial OLS results which correspond to each specification of Tables 2 to 5-2. In general, we find that the GMM results are robust and that the spatial GMM results are very similar to the clustered ones.

Decomposition of City Growth Residuals

We now use the residuals from the GMM estimation in Table 5-2 (1), and examine whether they have any systematic association with time-invariant local characteristics. Our main interest is in examining whether local management or governance and inter-industry linkages are associated with city growth.
In principle, an autonomous local government would actively work to provide local public goods for its constituents, and develop policies to stimulate growth and manage externalities. For our analysis, we have two measures of local government efforts: (1) the existence of laws to collect the IPTU (property tax), and (2) the percentage of the population under land zone laws. In terms of inter-industry linkages, we expect a clustered or densely populated region to provide a rich environment for competition and collaboration among firms and workers in the region, which leads to economic growth. As Saxenian (1994) observed, regional development is more pronounced in a region consisting of many small firms than in one consisting of a few large firms.[14] A city with a rich set of forward and backward linkage industries performs better than an enclave, that is, a small pocket of firms. We measure the density of economic activities by (1) ln(no. firms relative to workers) = ln(no. formal firms / no. workers in formal firms), and (2) ln(population density).

[14] Saxenian (1994) examined the different regional economic performances of Silicon Valley in California and Route 128 in Massachusetts. Dense social networks and an open labor market in Silicon Valley have facilitated informal communication and collaborative practices, and produced a regional network-based industrial system. The Route 128 region, in contrast, is dominated by autarkic (self-sufficient) corporations that internalize a wide range of productive activities. She concluded that this difference in regional socio-economic structure accounts for the divergent prosperity of the two regional economies, in spite of their common origins in postwar military spending and university-based research, and even though they enjoyed roughly the same employment levels in 1975.

The basic estimation results from decomposing the residuals of Table 5-2 (1) are reported in Table 7. The basic structure is that city growth residuals between years (t-1) and t are affected by city characteristics in year (t-1). However, when data for year (t-1) are not available, we use the city characteristics in year t, assuming long-lasting persistence of city characteristics across years. In any case, the estimation results should be interpreted as associations between contemporaneous variables rather than as causal relationships.

We find that population growth is higher in cities with better enforcement of land use and zoning laws: the estimates suggest that city growth is associated with increases in the percentage of city population under land zone laws.[15] However, we do not find any statistically significant association between city growth and the existence of laws to collect the IPTU (property tax). This is most likely because there is almost no variation in the IPTU collection data: most cities have laws to collect the property tax. A richer set of inter-industry linkages is also associated with growth: the OLS coefficient for the number of (formal) firms relative to (formal) workers is statistically significant and has the expected sign. A higher number of firms relative to workers stimulates competition and collaboration among firms and workers in a city, and is associated with higher city growth.

[15] We obtain a similar result when we use a dummy variable indicating that more than 50% of the population is under land zone laws.

4. POLICIES FAVORING SECONDARY CITIES

Using the results from the regressions of city growth, let us consider the following policy experiment. There is considerable policy debate in Brazil over the view that investments need to be directed towards secondary cities to stimulate local economic development and limit the growth of the largest metropolitan areas. However, the impact of these initiatives on overall economic growth and urban efficiency is unclear.
Suppose the Brazilian government invests in transportation infrastructure in order to decrease inter-city transport costs. An issue is whether favoring investments in small cities vis-à-vis large cities increases overall productivity growth, and therefore overall economic growth in Brazil. To make the analysis tractable, we first assume that the amount of transportation investment needed to reduce inter-city transport costs (per mile) by one unit is proportional to city population. So a one unit decrease in inter-city transport costs for a city of 1 million is assumed to cost the same amount of government expenditure as the same decrease for 10 cities of 100,000 people each. In 2000, the largest city, São Paulo, had 17.9 million residents, which is equivalent to the total population of the 88 smallest cities (Table 8). The total population of the 7 largest cities is the same as that of the remaining 116 smaller cities (our data consist of 123 cities). Our assumption implies that the total transportation investment needed to decrease transport costs by one unit for São Paulo would also reduce transport costs by one unit for the 88 smallest cities, if invested in those cities.

Table 2 (3) describes the determinants of income per worker, in which average schooling, market potential, city population, and inter-city transport costs affect income per worker. From this equation, we can calculate the total urban income in Brazil, such that

$$\text{total urban income} = \sum_{i=1}^{123} \text{income per worker}_i \times \text{no. workers}_i \approx \sum_{i=1}^{123} \exp\left(X_i \hat{b}_{GMM}\right) \times \text{no. workers}_i ,$$

where $X_i \hat{b}_{GMM}$ is the fitted value of ln(income per worker) for city $i$ from Table 2 (3).

Now suppose the government invests in transportation infrastructure. In Table 8, we compare the effect on total urban income of investments favoring big cities versus small cities. The first column is the total urban income relative to the baseline income when infrastructure investments favor the largest cities, specifically a ½ standard deviation (.4) decrease in the inter-city transport costs of the largest cities. The baseline income is the predicted value of Table 2 (3).
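The logic of this experiment can be sketched as follows, under stated assumptions: the three cities and their incomes and worker counts are hypothetical, the coefficient -0.178 on ln(inter-city transport costs) is taken from Table 2 (3), and total_urban_income is an illustrative helper rather than the authors' procedure. All other covariates are held fixed, so a treated city's predicted income per worker is simply scaled by exp(coefficient × shock). The two hypothetical small cities together have as many workers as the large one, which mimics the equal-cost assumption above (with workers standing in for population).

```python
import math

BETA_TRANSPORT = -0.178  # Table 2 (3): coefficient on ln(inter-city transport costs)
SHOCK = -0.4             # change in ln(transport costs): the 1/2 s.d. decrease in the text


def total_urban_income(cities, treated=()):
    """Sum income per worker times workers over cities; cities named in
    `treated` receive the transport cost shock, the others stay at baseline."""
    total = 0.0
    for name, (income_per_worker, workers) in cities.items():
        factor = math.exp(BETA_TRANSPORT * SHOCK) if name in treated else 1.0
        total += income_per_worker * factor * workers
    return total


# Hypothetical data: (baseline predicted income per worker, number of workers).
cities = {
    "large": (900.0, 1_000_000),
    "small_a": (600.0, 500_000),
    "small_b": (600.0, 500_000),
}

baseline = total_urban_income(cities)
favor_large = 100.0 * total_urban_income(cities, {"large"}) / baseline
favor_small = 100.0 * total_urban_income(cities, {"small_a", "small_b"}) / baseline
print(f"favoring large: {favor_large:.3f}% of baseline")
print(f"favoring small: {favor_small:.3f}% of baseline")
print(f"difference (b - a): {favor_small - favor_large:.3f} %p")
```

In this toy setup the ranking of the two options depends only on baseline income levels, because the same proportional gain is applied to either group; in the paper the comparison uses the fitted values for all 123 cities.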
The second column is the total urban income when the same amounts are invested in the smallest cities to decrease those cities' transport costs by the same magnitude (.4). We experiment with several combinations of cities in Table 8. The simulation results show that there are very small differences in total urban income from favoring small cities vis-à-vis large cities. These income differences mostly range from about 0.3 to 0.7 percentage points (%p) of total urban income in 2000. The difference is highest when we favor the 104 smallest cities vis-à-vis the two largest cities (.698%p). These results suggest that there are no major gains in terms of overall urban income from diverting investments from the largest cities to secondary cities.

5. SUMMARY AND CONCLUSIONS

In this paper, we have examined the determinants of Brazilian city growth between 1970 and 2000. For the analysis, we constructed a dataset of 123 agglomerations, and examined factors that influence wages and labor supply. Our main findings are the following. (1) Increases in rural population supply are a major driver of city growth. (2) Inter-regional transport improvements that lead to increases in the market potential of goods and reduce inter-city transport costs stimulate growth. In fact, we find that increases in market potential have the strongest impact on city growth. (3) Improvements in labor force quality and the spillover effects of knowledge accumulation (measured by initial levels of educational attainment) have strong growth impacts.

In terms of inter-regional transport improvements, the Brazilian government has made significant investments in infrastructure to integrate the national economy and lower business costs in peripheral regions. Most of the improvements in the road network occurred between the 1950s and 1980s, leading to a significant reduction in transportation and logistics costs.
Castro (2002) measures the benefits of improvements in highway infrastructure from 1970 to 1995 as the change in equivalent paved road distance from each municipality to the state capital of São Paulo, accounting for the construction of the network as well as the difference in vehicle operating costs between earth/gravel and paved roads. He shows that transport cost reductions were quite significant for the Northern region and the Central region state of Mato Grosso, with numbers ranging from 3,000 to 5,000 equivalent kilometers of paved road. Average reductions fall to the 1,000 km range in the Central region states of Goiás and Mato Grosso do Sul, the southern states, and the coastal northeastern states. Using this measure, Castro (2002) finds that the reduction in interregional transport costs was one of the major determinants of both the expansion of agricultural production to the central regions of Brazil after the 1960s and the increases in the country's agricultural productivity.

In terms of city level characteristics, we find that local homicide rates have a negative impact on city growth rates. In addition, cities with high shares of public industrial capital also experience slower growth. Thus, there is considerable scope for local initiatives to reduce the costs imposed by crime and violence, along with local economic development programs to improve access to finance for small and medium sized businesses. Our decompositions of city growth residuals tentatively show that local land use and zoning enforcement is positively associated with city growth, as is the presence of a diverse set of inter-industry linkages. One of the major limitations in our efforts to identify the contribution of local characteristics to city growth has been the lack of longitudinal data, which makes it difficult to draw causal inferences. It would be useful to get better data on historic land use and zoning regulations, as well as local public goods, services, and amenities.
In further work, we hope to collect additional data on city level characteristics to better identify their impacts on city growth.

6. REFERENCES

Alesina, A. and D. Rodrik (1994), "Distribution Politics and Economic Growth," The Quarterly Journal of Economics, 109, 456-490.
Beeson, P., D. DeJong and W. Troesken (2001), "Population Growth in U.S. Counties, 1840-1990," Regional Science and Urban Economics, 31, 669-699.
Castro, N. (2002), "Transportation costs and Brazilian agricultural production: 1970-1996," Texto para Discussão - NEMESIS - LXVI, http://ssrn.com/author=243495, Social Science Research Network.
Conley, T. (1999), "GMM Estimation with Cross Sectional Dependence," Journal of Econometrics, 92, 1-45.
Da Mata, D., U. Deichmann, V. Henderson, S. Lall, and H. Wang (2005), Examining the Growth Patterns of Brazilian Cities. Mimeo.
Duranton, G. and D. Puga (2004), "Micro-Foundations of Urban Agglomeration Economies," in J. V. Henderson and J.F. Thisse (eds.) Handbook of Regional and Urban Economics, Vol 4. North-Holland.
Galor, O. and J. Zeira (1993), "Income Distribution and Macroeconomics," Review of Economic Studies, 60, 35-52.
Glaeser E., J. Scheinkman and A. Shleifer (1995), "Economic Growth in a Cross-Section of Cities," Journal of Monetary Economics, 36, 117-143.
Henderson, J. V. and H.G. Wang (2005), "Urbanization and City Growth: the Role of Institutions," Brown University, mimeo.
Henderson, J.V. and C.C. Au (2004), "Are Chinese Cities Too Small?," Brown University, mimeo.
Hummels, D. (2001), "Toward a Geography of Trade Costs," Purdue University, mimeo.
IPEA, IBGE, and UNICAMP (2002), Configuração Atual e Tendências da Rede Urbana, Serie Configuração Atual e Tendências da Rede Urbana, Instituto de Pesquisa Econômica Aplicada, Instituto Brasileiro de Geografia e Estatística, Universidade Estadual de Campinas, Brasilia.
Lemos, M., Moro, S., Biazi, E., Crocco, M. (2003), A Dinâmica urbana das Regiões Metropolitanas Brasileiras. Economia Aplicada, 7, 1:213-244.
Korenman, S. and D. Neumark (2000), "Cohort Crowding and Youth Labor Markets: A Cross-National Analysis," in D. Blanchflower and R. Freeman, Youth Employment and Joblessness in Advanced Countries, University of Chicago Press, pp. 57-105.
La Porta, R. and F. López-de-Silanes (1999), "The Benefits of Privatization: Evidence From Mexico," The Quarterly Journal of Economics, 114, 1193-1242.
Morandi, L. and E. Reis (2004), "Estoque De Capital Fixo No Brasil, 1950-2002," Anais do XXXII Encontro Nacional de Economia, Proceedings of the 32nd Brazilian Economics Meeting.
Persson, T. and G. Tabellini (1994), "Is Inequality Harmful for Growth?," American Economic Review, 84, 600-622.
Saxenian, A. (1994), Regional Advantage: Culture and competition in Silicon Valley and Route 128, Harvard University Press.
United Nations (2003), World Urbanization Prospects.
Weil, D. (2005), Economic Growth, Addison-Wesley.
World Bank (2003), Brazil: Equitable, Competitive and Sustainable - Contributions for Debate. World Bank, Washington DC.

Figure 1: Urban Agglomerations by population size. Source: IPEA, IBGE.

Table 1: City Size Distribution

Population size | 1970 | 1980 | 1991 | 2000
> 5 million | 2 | 2 (note 1) | 3 (note 2) | 3
2 million - 5 million | 1 | 3 | 7 | 7
1 million - 2 million | 4 | 5 | 5 | 8
500,000 - 1 million | 5 | 10 | 15 | 14
250,000 - 500,000 | 16 | 21 | 23 | 30
100,000 - 250,000 | 44 | 43 | 44 | 46
< 100,000 | 51 | 39 | 26 | 15
Total number of cities | 123 | 123 | 123 | 123
Average size | 350,857 | 507,242 | 657,602 | 788,222
Min | 20,864 | 41,454 | 76,816 | 86,720
Max | 8,139,705 | 12,588,745 | 15,444,941 | 17,878,703

Note 1: "São Paulo" and "Rio de Janeiro".
Note 2: "Porto Alegre" is newly added.

Table 2. Demand Side: Determinants of Income Per Worker (a, b, c)
(robust standard errors in parentheses)

Variable | (1) GMM-IV | (2) OLS | (3) GMM-IV | (4) The effect of increase in covariate, based on (1)
Average Schooling | 0.298*** (0.032) | 0.280*** (0.026) | 0.271*** (0.033) | 0.375
Ln(market potential) | 0.363*** (0.080) | 0.048** (0.018) | 0.333*** (0.070) | 0.365
Ln(no. workers) [ln(population) for (3)] | -0.304*** (0.095) | 0.005 (0.016) | -0.290*** (0.079) | -0.344
Ln(inter-city transport costs) | -0.216* (0.112) | 0.016 (0.032) | -0.178* (0.092) | -0.074
state capital dummy | 0.019 (0.146) | -0.090 (0.062) | 0.075 (0.144) |
time dummies | Yes | Yes | Yes |
Observations | 369 | 369 | 369 |
R2 | | 0.807 | |
Hansen J statistic (overidentification test) | 1.593 | | 1.439 |
(p-value) | (0.661) | | (0.696) |
Average of Partial R2 | 0.435 | | 0.425 |
Average of Partial F's | 52.67 | | 51.58 |

*** significant at 1% level; ** significant at 5% level; * significant at 10% level.
a. The instruments are semi-arid area dummy, ln(distance to state capital), ln(distance to São Paulo), manufacturing/service employment ratio (1970), infant mortality (1970), ln(humidity), average years of schooling (1970), state capital and time dummies.
b. GMM estimates are from the two-step efficient GMM in the presence of arbitrary heteroskedasticity and arbitrary intra-group (within-state) correlation.
c. OLS regressions are with robust cluster standard errors. We assume the observations may be correlated within states, but would be independent between states.

Table 3. Population Supply (a, b, c)
(robust standard errors in parentheses)

Variable | (1) GMM-IV | (2) OLS | (3) GMM-IV (1980) | (4) GMM-IV (1991) | (5) GMM-IV (2000)
Ln(income per capita) | 2.370*** (0.683) | 1.813*** (0.378) | 1.830*** (0.569) | 2.636*** (0.704) | 2.886*** (0.933)
Ln(rural income opportunities: market potential) | -5.151*** (1.454) | -4.152*** (0.819) | -4.821*** (1.457) | -5.316*** (1.354) | -5.624*** (1.824)
Ln(rural pop. supply market potential) | 5.851*** (1.368) | 4.878*** (0.752) | 5.559*** (1.378) | 5.978*** (1.281) | 6.317*** (1.705)
time dummies | Yes | Yes | No | No | No
R2 | | 0.745 | | |
Hansen J statistic (overidentification test) | 1.909 | | 1.297 | 1.148 | 1.655
(p-value) | (.591) | | (0.730) | (0.765) | (0.647)
Average of Partial R2 | 0.657 | | 0.691 | 0.644 | 0.662
Average of Partial F's | 55.50 | | 34.41 | 37.48 | 64.29

*** significant at 1% level; ** significant at 5% level; * significant at 10% level.
a.
The instruments are semi-arid area dummy, ln(distance to São Paulo), ln(market pot. agric. land availability, 1970), port dummy, ln(per capita capital stock, 1970), southern region and time dummies.
b. GMM estimates are from the two-step efficient GMM in the presence of arbitrary heteroskedasticity and arbitrary intra-group (within-state) correlation.
c. OLS regressions are with robust cluster standard errors. We assume the observations may be correlated within states, but would be independent between states.

Table 4. City Size Equations (a, b, c, d)
(robust standard errors in parentheses)

Variable | (1) GMM-IV | (2) OLS | (3) The effect of increase in covariate, based on (1)
Ln(rural pop. supply) | 1.661*** (0.643) | 1.216*** (0.425) | 1.558
Ln(rural income opportunities) | -3.664*** (0.894) | -1.999*** (0.600) | -3.701
Ln(market potential) | 2.693*** (0.916) | 1.426** (0.586) | 2.720
Average Schooling | 0.220** (0.091) | 0.231** (0.106) | 0.277
Ln(inter-city transport costs) | -1.395*** (0.337) | 0.081 (0.110) | -0.480
State capital dummy | -0.260 (0.395) | 1.091*** (0.170) |
time dummies | Yes | Yes |
Observations | 369 | 369 |
R2 | | 0.801 |
Hansen J statistic (overidentification test) | 1.770 | |
(p-value) | (.880) | |
Average of Partial R2 | .477 | |
Average of Partial F's | 129.47 | |

*** significant at 1% level; ** significant at 5% level; * significant at 10% level.
a. The instruments are semi-arid area dummy, port dummy, illiteracy rate (1970), ln(industry capital per worker, 1970), ln(distance to state capital)*ln(market pot. agric. land availability, 1970), ln(humidity), ln(avg. temperature), ln(rural pop. supply, 1970), ln(rural income opportunities, 1970), ln(market potential, 1970), and state capital and time dummies.
b. GMM estimates are from the two-step efficient GMM in the presence of arbitrary heteroskedasticity and arbitrary intra-group (within-state) correlation.
c. OLS regressions are with robust cluster standard errors. We assume the observations may be correlated within states, but would be independent between states.
d. Average of Partial R2 and Partial F's are for average schooling and Ln(inter-city transport costs). Market potential and gravity measures are almost completely correlated with those in 1970 (Partial R2's are around .99).

Table 5-1. City Size Growth Equation (a, b, c)
(robust standard errors in parentheses)

Variable | (1) GMM-IV | (2) OLS | (3) GMM-IV | (4) OLS
Ln(rural pop. supply market potential) | 9.188*** (2.309) | 3.216*** (0.892) | 9.429*** (2.410) | 3.064*** (0.631)
Ln(rural income opportunities: market potential) | 0.756 (0.883) | 0.364 (0.517) | 0.358 (0.728) | 0.198 (0.317)
Ln(market potential) | 2.294*** (0.761) | 2.860*** (0.798) | 1.284** (0.512) | 2.738*** (0.551)
Average schooling (t-1) | 0.078*** (0.021) | 0.021 (0.014) | 0.071*** (0.013) | 0.021 (0.012)
Average schooling | 0.275* (0.141) | 0.067* (0.033) | 0.384*** (0.104) | 0.097*** (0.033)
Ln(inter-city transport costs) | -0.078** (0.035) | -0.092** (0.037) | -0.089*** (0.026) | -0.088** (0.037)
state capital dummy | 0.016 (0.036) | 0.080*** (0.024) | 0.154*** (0.035) | 0.129*** (0.037)
Ln(population) (t-1) | | | -0.047*** (0.009) | -0.018* (0.010)
Manu / service (t-1) | | | 0.140*** (0.027) | 0.096*** (0.019)
time dummies | Yes | Yes | Yes | Yes
Observations | 246 | 246 | 246 | 246
R2 | | 0.364 | | 0.403
Hansen J statistic (overidentification test) | 5.786 | | 8.204 |
(p-value) | (.565) | | (.514) |
Average of Partial R2 | .412 | | .526 |
Average of Partial F's | 395.70 | | 2852.4 |

*** significant at 1% level; ** significant at 5% level; * significant at 10% level.
a. For (1), instruments are the IV list of Table 4, ln(distance to São Paulo), ln(transport costs to São Paulo, 1968), and ln(transport costs to state capital, 1968). For (3), we drop ln(industry capital per worker, 1970) from (1), and add ln(population, 1970), manu/service ratio (1970), manu/service ratio (1970)*ln(population, 1970), manu/service ratio (1970)*ln(income per capita, 1970), and manu/service ratio (1970)*ln(market potential, 1970).
b.
GMM estimates are from the two-step efficient GMM in the presence of arbitrary heteroskedasticity and arbitrary intra-group (within-state) correlation.
c. OLS regressions are with robust cluster standard errors. We assume the observations may be correlated within states, but would be independent between states.

Table 5-2. City Size Growth Equation (continued) (a, b, c)
(robust standard errors in parentheses)

Variable | (1) GMM-IV | (2) OLS
Ln(rural pop. supply market potential) | 5.727** (2.488) | 3.227*** (0.684)
Ln(rural income opportunities: market potential) | -0.534 (0.917) | 0.229 (0.359)
Ln(market potential) | 1.546 (1.257) | 2.127*** (0.355)
Average schooling (t-1) | 0.064*** (0.016) | 0.035*** (0.011)
Average schooling | 0.323** (0.138) | 0.093** (0.034)
Ln(inter-city transport costs) | -0.082* (0.043) | -0.059 (0.036)
state capital dummy | 0.139*** (0.036) | 0.113*** (0.030)
Ln(population) (t-1) | -0.044*** (0.008) | -0.023** (0.008)
Manu / service (t-1) | 0.067** (0.032) | 0.066** (0.027)
Ln(homicide / pop) (t-1) | -0.115*** (0.033) | -0.092*** (0.025)
Public industry capital / total industry capital in 1980 | -0.764** (0.298) | -0.780 (0.502)
time dummies | Yes | Yes
Observations | 245 | 245
R2 | | 0.469
Hansen J statistic (overidentification test) | 5.549 |
(p-value) | (.698) |
Average of Partial R2 | .498 |
Average of Partial F's | 3014.5 |

*** significant at 1% level; ** significant at 5% level; * significant at 10% level.
a. Public industry capital / total industry capital (1980) is assumed to be exogenous by adding it to the IV list of (3).
b. GMM estimates are from the two-step efficient GMM in the presence of arbitrary heteroskedasticity and arbitrary intra-group (within-state) correlation.
c. OLS regressions are with robust cluster standard errors. We assume the observations may be correlated within states, but would be independent between states.

Table 6. Decomposition of City Size Growth

Variable | Coef. of Table 5-1 (3), a_i | Mean (a) b_i: Total | Mean b_i: Large cities (b) | Mean b_i: Small cities (b) | Decomposition of city growth (a_i × b_i / c), %: Total | Decomposition, %: Large cities (b) | Decomposition, %: Small cities (b)
No. cities | | 123 | 61 | 62 | | |
Ln(city pop) | | 0.226 | 0.264 | 0.188 | | |
Ln(rural pop. supply market potential) | 9.429 | -0.006 | -0.005 | -0.008 | -8.5 | -6.5 | -10.6
Ln(market potential) | 1.284 | 0.346 | 0.346 | 0.345 | 63.8 | 62.2 | 65.5
Average schooling (t-1) | 0.071 | 4.568 | 4.773 | 4.366 | 46.7 | 47.4 | 45.9
Average schooling | 0.384 | 1.208 | 1.215 | 1.201 | 66.7 | 65.3 | 68.2
Ln(inter-city transport costs) | -0.089 | -0.215 | -0.191 | -0.239 | 2.8 | 2.4 | 3.1
State capital dummy | 0.154 | 0.171 | 0.344 | 0.000 | 3.8 | 7.4 | 0.0
Ln(population) (t-1) | -0.047 | 12.339 | 13.172 | 11.520 | -83.4 | -86.6 | -80.1
Manu / service (t-1) | 0.140 | 0.406 | 0.428 | 0.385 | 8.2 | 8.4 | 8.0
c = Σ_i a_i × b_i (sum) | | 0.695 | 0.715 | 0.676 | 100.0 | 100.0 | 100.0

a. Means are for 2000-1991 and 1991-1980. For average schooling (t-1), it is for 1991 and 1980.
b. We define large (small) cities as those with greater (less) than median city population in each year.

Table 7. Regression of City Growth Residuals (a, b)
(robust standard errors in parentheses)

Variable | (1) OLS
Laws to collect property tax | 0.035 (0.042)
% of pop under land zone law | 0.050*** (0.014)
Ln(no. formal firms / no. workers in formal firms) | 0.046* (0.024)
Ln(pop density) | 0.001 (0.007)
Small city dummy | -0.044*** (0.015)
time dummies | Yes
Observations | 245
R2 | 0.093

*** significant at 1% level; ** significant at 5% level; * significant at 10% level.
a. Small city dummy has a value of 1 if a city has less than median city population in each year.
b. OLS regressions are with robust cluster standard errors. We assume the observations may be correlated within states, but would be independent between states.

Table 8. Policy Simulation: favoring largest cities versus smallest ones
(½ standard deviation (.4) decrease in inter-city transport costs in 2000)

Scenario | Total urban income relative to the baseline income (%): favoring largest cities (a) | Total urban income relative to the baseline income (%): favoring smallest cities (b) | Comparison (b-a, %p)
1 largest vs. 88 smallest | 102.072 | 102.763 | 0.691
2 largest vs. 104 smallest | 103.761 | 104.458 | 0.698
3 largest vs. 109 smallest | 105.227 | 105.550 | 0.323
4 largest vs. 112 smallest | 106.072 | 106.413 | 0.341
5 largest vs. 113 smallest | 106.651 | 106.715 | 0.064
6 largest vs. 115 smallest | 107.020 | 107.517 | 0.497
7 largest vs. 116 smallest | 107.679 | 108.033 | 0.354

Appendix A. Means and Standard Deviations of Variables
(N = 369, 123 cities for 3 years)

Variable | Mean | Standard deviation
Ln(income per worker) | 6.53 | .279
Average schooling | 5.13 | 1.26
Ln(market potential) | 27.3 | 1.01
Ln(inter-city trans. costs: 1980, excluding state capitals) | .857 | .344
Ln(no. workers) | 11.5 | 1.13
Ln(population) | 12.4 | 1.12
Ln(rural pop. supply market potential) | 20.2 | .938
Ln(rural income opportunities: market potential) | 12.4 | 1.01

Appendix B. Market potential measures

(1) Basic Market Potential

Market potential of agglomeration i is defined as the sum of its member MCAs' market potential. Therefore the market potential of agglomeration i in year t is

$$\sum_{k \in i} \sum_{j=1}^{3659} \frac{y_j(t) \times pop_j(t)}{\left(A\, d_{k,j}^{\delta}\right)^{\sigma-1}},$$

where $y_j(t)$ is per capita income of MCA j in year t, and $pop_j(t)$ is the population of MCA j in year t. $d_{i,j}$ is the distance between MCA i and j (in 100 miles). The distance of own MCA, $d_{i,j}$ with $j = i$, is the average distance to the city center, which is equal to $\frac{2}{3}\sqrt{area/\pi}$. $\sigma$ is assumed to be 2, $\delta$ is 0.3 (0.22 between two port cities), and A is such that $A\, d_{i,j}^{0.3} = 1$ for the smallest land area city (Au and Henderson, 2004; Hummels, 2001).

(2) Incomes offered in local rural areas competing with own city for local population

The gravity measure of surrounding rural per capita incomes is a market potential measure of agglomeration i in year t, such that

$$\sum_{k \in i} \sum_{\substack{j=1 \\ j \notin i}}^{3659} \frac{\text{rural } GDP_j(t) / \text{rural } pop_j(t)}{\left(A\, d_{k,j}^{\delta}\right)^{\sigma-1}}.$$

The MP calculation does not include the rural per capita MCA incomes of the same agglomeration. All parameters are the same as in (1). Rural GDPs of (1970, 1980, 1985, and 1996) are assigned to those of (1970, 1980, 1991, and 2000).

(3) Potential supply of people to the city from local rural areas

The gravity measure of surrounding rural population is also a market potential measure of agglomeration i in year t, such that

$$\sum_{k \in i} \sum_{\substack{j=1 \\ j \notin i}}^{3659} \frac{\text{rural } pop_j(t)}{\left(A\, d_{k,j}^{\delta}\right)^{\sigma-1}}.$$

The MP calculation is the same as (2).

(4) Market potential measure of agricultural land availability

The agricultural land market potential is calculated in the same way as (1), such that

$$\sum_{k \in i} \sum_{j=1}^{3659} \frac{\text{agri land}_j(t)}{\left(A\, d_{k,j}^{\delta}\right)^{\sigma-1}},$$

where $\text{agri land}_j(t)$ is the agricultural area of MCA j in year t. All parameters are the same as the previous ones.

Appendix C. Data sources and definitions

There is no official definition of "city" or "agglomeration" in Brazil. The lowest administrative level consists of more than 5000 municípios. However, these vary greatly in size, and many functional economic and population agglomerations consist of a number of municípios. In this paper, we therefore follow the example of a study of Brazilian urban dynamics by IPEA, IBGE and UNICAMP (2002). It defined agglomerations based on their place in the urban hierarchy, from "World Cities" (São Paulo and Rio de Janeiro) to subregional centers. For each agglomeration, this study identified the municípios that were a functional part of the urban area. The municípios belonging to each agglomeration were then further classified into eight categories according to how tightly they are integrated in the agglomeration, from "maximum" to "very weak". The main criteria used in these classifications were centrality, function as a center of decision making, degree of urbanization, complexity and diversification of the urban areas, and diversification of services. These were measured by a range of census and other variables such as employed population in urban activities, urbanization rate, and population density.
We modified this classification slightly by also adding smaller municípios to existing agglomerations if their population exceeded 75,000 and more than 75 percent of their residents lived in urban areas in 1991, or if they were completely enclosed by an agglomeration.

The agglomeration definitions developed by IPEA, IBGE and UNICAMP (2002) are based on município boundaries valid at the time of the Brazilian Population Census of 1991 and the Population Count of 1996, while our study captures dynamics from 1970 to 2000. During this time, many new municípios were created by splitting or re-arranging existing ones. In fact, the number of municípios increased from 3951 to 5501 during these three decades. To create a consistent panel of agglomerations for the 1970 to 2000 period, we therefore used the Minimum Comparable Area (MCA) concept as implemented by IPEA researchers. MCAs group municípios in each of the four census years so that their boundaries do not change during the study period. All data have then been aggregated to match these MCAs. The resulting data set represents 123 urban agglomerations that consist of a total of 447 MCAs.

The sources for the majority of data employed in this paper are the Brazilian Bureau of Statistics (IBGE) Population and Housing Censuses of 1970, 1980, 1991 and 2000. We used the full Brazilian census counts to get information about total population and housing conditions (urbanization rate). Other data were collected only for a sample of households. We used this census sample information for income, industrial composition, education, piped water provision, and electricity availability. The sample sizes varied across census years (1970: 25 percent; 1980: 25 percent; 1991: 12.5 percent; 2000: 5 percent), but all are representative at the município level, and thus are also reliable at the MCA level employed in this study. Income figures are compiled from monthly data, deflated to 2000 reais (R$).
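To make the MCA-to-agglomeration aggregation concrete, the following minimal Python sketch computes the basic market potential of Appendix B (1) for an agglomeration by summing over its member MCAs. It is an illustration under stated assumptions, not the code used for this paper: the function names, the dictionary layout, and the toy numbers are hypothetical; A is computed from the own-distance of the smallest-area MCA, which is one reading of the normalization in Appendix B; and the lower distance exponent for pairs of port cities (0.22) is omitted for brevity.

```python
import math

SIGMA = 2.0   # value stated in Appendix B
DELTA = 0.3   # distance exponent stated in Appendix B (port-pair value 0.22 omitted)


def own_distance(area):
    """Average distance to the center of a circular MCA: (2/3) * sqrt(area / pi)."""
    return (2.0 / 3.0) * math.sqrt(area / math.pi)


def scale_factor(mcas):
    """A such that A * d**DELTA = 1 for the MCA with the smallest land area (assumed reading)."""
    smallest = min(m["area"] for m in mcas.values())
    return 1.0 / own_distance(smallest) ** DELTA


def market_potential(members, mcas, dist):
    """Basic market potential of an agglomeration, summed over its member MCAs."""
    a = scale_factor(mcas)
    total = 0.0
    for k in members:                  # member MCAs of the agglomeration
        for j, m in mcas.items():      # every MCA in the country, including k itself
            d = own_distance(m["area"]) if j == k else dist[frozenset((k, j))]
            total += m["income_pc"] * m["pop"] / (a * d ** DELTA) ** (SIGMA - 1.0)
    return total


# Hypothetical toy data: areas in (100 miles)^2, distances in units of 100 miles.
toy_mcas = {
    "a": {"income_pc": 1.0, "pop": 100.0, "area": math.pi * 2.25},
    "b": {"income_pc": 2.0, "pop": 50.0, "area": math.pi * 9.0},
}
toy_dist = {frozenset(("a", "b")): 4.0}

mp_a = market_potential(["a"], toy_mcas, toy_dist)        # agglomeration made of MCA a only
mp_ab = market_potential(["a", "b"], toy_mcas, toy_dist)  # agglomeration made of both MCAs
```

Because the agglomeration measure is a sum over member MCAs, an agglomeration with more members accumulates a larger value even when the surrounding incomes and populations are unchanged, which is why the sketch keeps the member list separate from the national list of MCAs.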
The transportation costs (a proxy for transportation connectivity) between all Brazilian municipalities and the nearest state capital, and between all Brazilian municipalities and São Paulo, come from Professor Newton De Castro at the Federal University of Rio de Janeiro and are available at www.ipeadata.gov.br. The port and Brazilian region dummies are from the Bureau of Statistics (IBGE) Municipalities Profile of 1999. Homicides are from the DATASUS dataset of the Brazilian Ministry of Health. Local government expenditures are from the Brazilian Treasury dataset of 1991 and 2000. Formal employment data are from the RAIS dataset of the Brazilian Ministry of Labor. The Morandi and Reis (2004) capital stock data employed in our analysis come from the Brazilian Economic Censuses of 1970, 1975 and 1980.

Appendix D. Robustness test for spatial dependence

Table A. Demand Side: Determinants of Income Per Worker (notes a, b)
(standard errors corrected for spatial dependence in parentheses)

                                                  (1)          (2)          (3)
                                                  Spatial GMM  Spatial OLS  Spatial GMM
Average Schooling                                  0.286***     0.280***     0.260***
                                                  (0.032)      (0.023)      (0.030)
Ln(market potential)                               0.404***     0.048***     0.371***
                                                  (0.083)      (0.016)      (0.069)
Ln(no. workers) [ln(population) for (3)]          -0.318***     0.005       -0.304***
                                                  (0.113)      (0.018)      (0.092)
Ln(inter-city transport costs)                    -0.246**      0.016       -0.218**
                                                  (0.122)      (0.024)      (0.102)
state capital dummy                               -0.010       -0.090**      0.041
                                                  (0.157)      (0.039)      (0.143)
time dummies                                      Yes          Yes          Yes
Observations                                      369          369          369
Hansen J statistic (overidentification test)       0.884                     0.901

*** significant at 1% level; ** significant at 5% level; * significant at 10% level.
a. The instruments are semi-arid area dummy, ln(distance to state capital), ln(distance to São Paulo), manufacturing/service employment ratio (1970), infant mortality (1970), ln(humidity), average years of schooling (1970), state capital and time dummies.
b. Coordinate variables are latitude and longitude.
Cutoffs are 1.5 standard deviations of latitude and longitude (10.23, and 8.20), which correspond to about 900 miles.

Table B. Population Supply (notes a, b)
(standard errors corrected for spatial dependence in parentheses)

                                                  (1)          (2)          (3)          (4)          (5)
                                                  Spatial GMM  Spatial OLS  Spatial GMM  Spatial GMM  Spatial GMM
Ln(income per capita)                              2.539***     1.813***     1.846***     2.771***     3.072***
                                                  (0.624)      (0.359)      (0.476)      (0.613)      (0.879)
Ln(rural income opportunities: market potential)  -5.536***    -4.152***    -4.873***    -5.638***    -6.040***
                                                  (1.445)      (0.830)      (1.285)      (1.334)      (1.849)
Ln(rural pop. supply market potential)             6.231***     4.878***     5.615***     6.313***     6.719***
                                                  (1.376)      (0.788)      (1.223)      (1.276)      (1.755)
time dummies                                      Yes          Yes          No           No           No
Observations                                      369          369          123          123          123
Hansen J statistic (overidentification test)       1.355                     1.014        1.463        1.684

*** significant at 1% level; ** significant at 5% level; * significant at 10% level.
a. The instruments are semi-arid area dummy, ln(distance to São Paulo), ln(market pot. agric. land availability, 1970), port dummy, ln(per capita capital stock, 1970), southern region and time dummies.
b. Coordinate variables are latitude and longitude. Cutoffs are 1.5 standard deviations of latitude and longitude (10.23, and 8.20), which correspond to about 900 miles.

Table C. City Size Equations (notes a, b)
(standard errors corrected for spatial dependence in parentheses)

                                                  (1)          (2)
                                                  Spatial GMM  Spatial OLS
Ln(rural pop. supply)                              1.706***     1.216***
                                                  (0.635)      (0.386)
Ln(rural income opportunities)                    -3.317***    -1.999***
                                                  (0.864)      (0.462)
Ln(market potential)                               2.322***     1.426***
                                                  (0.660)      (0.468)
Average Schooling                                  0.181*       0.231**
                                                  (0.099)      (0.112)
Ln(inter-city transport costs)                    -1.346***     0.081
                                                  (0.280)      (0.083)
State capital dummy                               -0.211        1.091***
                                                  (0.330)      (0.187)
time dummies                                      Yes          Yes
Observations                                      369          369
Hansen J statistic (overidentification test)       1.659

*** significant at 1% level; ** significant at 5% level; * significant at 10% level.
a.
The instruments are semi-arid area dummy, port dummy, illiteracy rate (1970), ln(industry capital per worker, 1970), ln(distance to state capital)*ln(market pot. agric. land availability, 1970), ln(humidity), ln(avg. temperature), ln(rural pop. supply, 1970), ln(rural income opportunities, 1970), ln(market potential, 1970), and state capital and time dummies.
b. Coordinate variables are latitude and longitude. Cutoffs are 1.5 standard deviations of latitude and longitude (10.23, and 8.20), which correspond to about 900 miles.

Table D-1. City Size Growth Equation (notes a, b)
(standard errors corrected for spatial dependence in parentheses)

                                                  (1)          (2)          (3)          (4)
                                                  Spatial GMM  Spatial OLS  Spatial GMM  Spatial OLS
Ln(rural pop. supply market potential)             8.894***     3.216***     5.590***     3.064***
                                                  (2.078)      (0.703)      (1.790)      (0.639)
Ln(rural income opportunities: market potential)   2.300        0.364       -0.700        0.198
                                                  (1.834)      (0.389)      (0.738)      (0.271)
Ln(market potential)                               1.837        2.860***     3.956***     2.738***
                                                  (1.266)      (0.674)      (0.953)      (0.606)
Average schooling (t-1)                            0.036        0.021        0.063***     0.021*
                                                  (0.027)      (0.013)      (0.016)      (0.012)
Average schooling                                  0.115        0.067**      0.604***     0.097***
                                                  (0.117)      (0.031)      (0.116)      (0.026)
Ln(inter-city transport costs)                    -0.121***    -0.092***    -0.132**     -0.088***
                                                  (0.044)      (0.027)      (0.051)      (0.025)
state capital dummy                                0.080**      0.080***     0.220***     0.129***
                                                  (0.033)      (0.026)      (0.037)      (0.033)
Ln(population) (t-1)                                                        -0.057***    -0.018*
                                                                            (0.009)      (0.010)
Manu / service (t-1)                                                         0.190***     0.096***
                                                                            (0.033)      (0.018)
time dummies                                      Yes          Yes          Yes          Yes
Observations                                      246          246          246          246
Hansen J statistic (overidentification test)       3.582                     5.381

*** significant at 1% level; ** significant at 5% level; * significant at 10% level.
a. For (1), instruments are the IV list of Table 4, ln(distance to São Paulo), ln(transport costs to São Paulo, 1968), and ln(transport costs to state capital, 1968).
For (3), we drop ln(industry capital per worker, 1970) from (1), and add ln(population, 1970), manu/service ratio (1970), manu/service ratio (1970)*ln(population, 1970), manu/service ratio (1970)*ln(income per capita, 1970), and manu/service ratio (1970)*ln(market potential, 1970).
b. Coordinate variables are latitude and longitude. Cutoffs are 1.5 standard deviations of latitude and longitude (10.23, and 8.20), which correspond to about 900 miles.

Table D-2. City Size Growth Equation (continued) (notes a, b)
(standard errors corrected for spatial dependence in parentheses)

                                                  (5)          (6)
                                                  Spatial GMM  Spatial OLS
Ln(rural pop. supply market potential)             5.815***     3.227***
                                                  (1.779)      (0.655)
Ln(rural income opportunities: market potential)  -0.632        0.229
                                                  (0.720)      (0.244)
Ln(market potential)                               1.257        2.127***
                                                  (0.890)      (0.480)
Average schooling (t-1)                            0.066***     0.035***
                                                  (0.016)      (0.010)
Average schooling                                  0.489***     0.093***
                                                  (0.092)      (0.024)
Ln(inter-city transport costs)                    -0.107**     -0.059**
                                                  (0.047)      (0.025)
state capital dummy                                0.183***     0.113***
                                                  (0.038)      (0.025)
Ln(population) (t-1)                              -0.056***    -0.023***
                                                  (0.008)      (0.009)
Manu / service (t-1)                               0.131***     0.066***
                                                  (0.031)      (0.022)
Ln(homicide / pop) (t-1)                          -0.105***    -0.092***
                                                  (0.031)      (0.023)
Public industry capital /                          0.006       -0.780*
  total industry capital in 1980                  (0.385)      (0.425)
time dummies                                      Yes          Yes
Observations                                      245          245
Hansen J statistic (overidentification test)       3.945

*** significant at 1% level; ** significant at 5% level; * significant at 10% level.
a. Public industry capital / total industry capital (1980) is assumed to be exogenous by adding it to the IV list of (3).
b. Coordinate variables are latitude and longitude. Cutoffs are 1.5 standard deviations of latitude and longitude (10.23, and 8.20), which correspond to about 900 miles.
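The role of the coordinate cutoffs in note b can be illustrated with a short Python sketch. It assumes a product Bartlett kernel in latitude and longitude, a common choice for spatially corrected standard errors; the tables above do not state which kernel was used, so this is an assumption, as are the function names and the assignment of 10.23 to latitude and 8.20 to longitude (taken from the order in which note b lists them).

```python
CUT_LAT = 10.23  # cutoff in degrees of latitude (from note b; assignment assumed)
CUT_LON = 8.20   # cutoff in degrees of longitude (from note b; assignment assumed)


def bartlett_weight(lat_i, lon_i, lat_j, lon_j, cut_lat=CUT_LAT, cut_lon=CUT_LON):
    """Product Bartlett kernel: falls linearly to zero at the cutoff in each coordinate."""
    dlat = abs(lat_i - lat_j)
    dlon = abs(lon_i - lon_j)
    if dlat >= cut_lat or dlon >= cut_lon:
        return 0.0  # pairs beyond either cutoff are treated as uncorrelated
    return (1.0 - dlat / cut_lat) * (1.0 - dlon / cut_lon)


def spatial_meat(coords, scores, cut_lat=CUT_LAT, cut_lon=CUT_LON):
    """Kernel-weighted sum of score cross-products for a scalar moment condition."""
    total = 0.0
    for (lat_i, lon_i), s_i in zip(coords, scores):
        for (lat_j, lon_j), s_j in zip(coords, scores):
            w = bartlett_weight(lat_i, lon_i, lat_j, lon_j, cut_lat, cut_lon)
            total += w * s_i * s_j
    return total
```

With all cities farther apart than the cutoffs, `spatial_meat` reduces to the sum of squared scores, which is the heteroskedasticity-robust case; nearby cities add cross-product terms, which is how the correction allows for correlated shocks across neighboring agglomerations.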