WPS3723

                              Determinants of City Growth in Brazil



                   Daniel da Mata*, Uwe Deichmann, J. Vernon Henderson,
                              Somik V. Lall, and Hyoung Gun Wang


                * DIRUR, Instituto de Pesquisa Econômica Aplicada (IPEA), Brasilia
                  Development Research Group, The World Bank, Washington DC
                    Department of Economics, Brown University, Providence, RI




                                                  Abstract
In this paper, we examine the determinants of Brazilian city growth between 1970 and 2000. We
consider a model of a city, which combines aspects of standard urban economics and the new
economic geography literatures. For the empirical analysis, we constructed a dataset of 123
Brazilian agglomerations, and estimate aspects of the demand and supply side as well as a
reduced form specification that describes city sizes and their growth. Our main findings are that
increases in rural population supply, improvements in inter-regional transport connectivity and
education attainment of the labor force have strong impacts on city growth. We also find that
local crime and violence, measured by homicide rates, impinge on growth. In contrast, a higher
share of private sector industrial capital in the local economy stimulates growth. Using the
residuals from the growth estimation, we also find that cities that better administer local land use
and zoning laws have higher growth. Finally, our policy simulations show that diverting transport
investments from large cities toward secondary cities does not provide significant gains in terms
of national urban performance.

World Bank Policy Research Working Paper 3723, September 2005

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the
exchange of ideas about development issues. An objective of the series is to get the findings out quickly,
even if the presentations are less than fully polished. The papers carry the names of the authors and should
be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely
those of the authors. They do not necessarily represent the view of the World Bank, its Executive Directors,
or the countries they represent. Policy Research Working Papers are available online at
http://econ.worldbank.org.

Acknowledgements

This paper is a product of a joint research program between the World Bank and the Instituto de Pesquisa
Econômica Aplicada (IPEA), Brasilia. This research has been partly funded by a World Bank research grant
and by the Urban Cluster of the World Bank's Latin America and Caribbean Region, and is also an input to
the World Bank's urban strategy for Brazil. We have benefited from discussions with Carlos Azzoni, Pedro
Cavalcanti Ferreira, Ken Chomitz, Dean Cira, Marianne Fay, Mila Freire, João Carlos Magalhães, Maria da
Piedade Morais, Marcelo Piancastelli, Zmarak Shalizi, Christopher Timmins and Alexandre Ywata de
Carvalho. All errors are the authors'. A preliminary version of the paper was presented at the World Bank/
IPEA Urban Research Symposium in Brasilia (April 2005).




    1. BACKGROUND AND MOTIVATION


        Why are some cities more successful than their peers? Is the 'success' of

individual cities driven by factors mostly external to any city's immediate control

(location, growth in market potential, being a port in a period of national trade growth,

national level decentralization and improved governance), or do individual city policies

and politics influence growth and development? Disentangling the relative contribution

of regional and local efforts is important for understanding the potential of alternate

policy interventions for stimulating growth of cities across the national urban system. At

this time, there is very little research examining the effectiveness of local and national

policy environments on urban growth in developing countries.

        Brazil is a highly urbanized country: 80 percent of its population lives in urban

centers and 90 percent of GDP is created in cities. According to estimates by the UN

Population Division for Brazil, the entire growth in population that is expected over the

next three decades will be in cities where the national urbanization rate is expected to rise

to over 90 percent (UN 2003). This will add about 63 million people to Brazil's cities,

and total urban population will be over 200 million. This population growth is occurring

across the Brazilian urban system (Table 1; see also Lemos et al. 2003). Of the 123 major

urban agglomerations in Brazil, only three were above 2 million people in 1970 versus

ten in 2000. In the middle of the size distribution in 2000, there were 52 agglomerations

with population between 250,000 and 2 million people compared to 25 in 1970. Thus, not

only is the scale of urbanization a major concern, but the distribution of population across

the urban hierarchy will also challenge policy makers to devise appropriate policies for





cities of different sizes. Across the urban system, there will be need to meet backlogs in

infrastructure, service delivery, and amenity provision, as well as accommodate further

growth.

        In addition to population increases across the urban system, fiscal and

administrative decentralization has increased the role of individual cities in attracting

investments and in providing services that are responsive to the needs of local residents.

Brazil is one of the most decentralized developing countries. The 1988 Constitution

established municipalities as the third level of government, and provided states and

municipalities with more revenue raising power and freedom to set tax rates. However,

many local governments have limited administrative and institutional capacity, and have

not been able to effectively use their autonomy to improve service delivery or attract new

investment. A recent World Bank study (World Bank 2002) finds that

maximizing urban competitiveness from agglomeration economies and minimizing

congestion costs from negative externalities are key challenges facing national and local

governments in Brazil.

        Against this backdrop of rapid population growth and decentralization of

administrative and fiscal responsibilities, it becomes essential to identify what types of

interventions stimulate growth of individual cities. In addition, we want to find out the

consequences of favoring investments in secondary cities on aggregate efficiency and

economic growth. There is an ongoing debate in Brazil's policy circles that the largest

agglomerations have become too big, leading to significant negative externalities of

crime, social conflict, and high land costs, and that policies should be designed to actively

stem the growth of these large agglomerations and favor investments in secondary cities.





It is, however, not clear whether net agglomeration economies in large cities can be offset by

incentives and other measures to divert growth to smaller cities.

        In this paper, we consider a model of a city, which consists of a demand side--

what utility levels a city can pay out--and a supply side--what utilities people demand to

live in a city. We estimate aspects of the demand and supply side; and then a reduced

form equation that describes city sizes and their growth. For the empirical analysis, we

construct a dataset of Brazilian agglomerations to examine city growth between 1970 and

2000. Much of the underlying data come from the Brazilian Bureau of Statistics (IBGE)

Population Censuses of 1970, 1980, 1991, and 2000. For the estimation, we make use of

GMM and spatial GMM techniques to correct for endogeneity in the presence of spatially

autocorrelated errors. Our main findings are that increases in rural population supply, and

improvements in inter-regional transport connectivity and education attainment of the

labor force have strong impacts on city growth. Both labor force quality improvements

and base period education attainment matter significantly for growth. In terms of local

characteristics, we find that local crime and violence and a higher representation of public

industrial capital in the city lower city growth rates.

        The rest of the paper is organized as follows. Section 2 provides the model and

estimation framework of urban demand and population supply models. The models

presented in this section combine traditional urban modeling with concepts from the new

economic geography literature. In Section 3, we discuss findings from the empirical

analysis and focus our attention on identifying main determinants of city growth. Section

4 provides results from simulations that examine whether investments in secondary cities

stimulate growth. Section 5 concludes.





    2. MEASURING CITY GROWTH


        In this paper, we examine the local and regional determinants of city growth in

Brazil. Urban growth is represented by both individual city productivity growth and city

population growth, which are different indicators of city "success" and represent two

interconnected dimensions of successful urban growth. However, before we can look at

any individual city's success, we need to understand the broader context in which the

economy as a whole is changing. Cities from an economic perspective represent the way

modern production is carried out in a country and, as such, reflect what is occurring in the

country as a whole.

        Production composition of cities varies by city size, where different types of

goods are best produced in bigger versus smaller cities. If national output composition

changes, altered by changing trade demand or domestic demand that changes with

economic growth, then demand moves away from goods produced in smaller types of

cities and those cities will suffer a setback. Some will falter; others will adjust what they

produce and perhaps upgrade, moving up the urban hierarchy. Which ones adjust well

may depend on "luck", but it may also depend on observable attributes such as education

of the labor force. A better educated labor force may allow for more nimble adjustment

and up-scaling of products produced, the so-called reinvention hypothesis.

Similarly the skill composition of the labor force will vary across cities in systematic

ways, as output composition and skill needs vary. More generally, national productivity

growth comes from productivity growth within cities, which engender the close social-

spatial interactions inherent in innovation, knowledge accumulation and technological





improvements. To understand individual city success, we need to account for the

external, national factors driving urban changes, as well as to understand the sources of

local productivity growth.

        At the same time we need to be able to measure when cities are being

"successful" versus less successful and what drives success. Much of success may be

driven by conditions external to the city, as just noted. In addition to demand changes,

changes in national institutions, for example providing smaller cities with greater

autonomy in local public sector decision making and greater access to fiscal resources,

may make it easier for smaller cities to finance the infrastructure and public sector

services demanded by firms (transport and telecommunications) and by higher skilled

workers (e.g., better schools) and compete successfully with bigger cities for certain

industries. In terms of city-level conditions, better run cities with more efficient use of

public sector revenues will be more attractive to both firms and migrants. And better run

cities will co-ordinate better with local businesses to help service their needs and make

them more productive. So part of measuring city success is measuring what local

producer and consumer amenities are valued and what cities are better at providing these

amenities.

        In related work, Glaeser et al. (1995) examined how urban growth of the U.S.

cities between 1960 and 1990 is related to various urban characteristics in 1960, such as

their location, initial population, initial income, past growth, output composition,

unemployment, inequality, racial composition, segregation, size and nature of

government, and the educational attainment of their labor force. They showed that income

and population growth are (1) positively related to initial schooling, (2) negatively





related to initial unemployment, and (3) negatively related to the initial share of

employment in manufacturing. Racial composition and segregation are not correlated

with later city population growth. Government expenditures (except for sanitation) are

also not associated with subsequent growth. However, per capita government debt is

positively correlated with later growth.1

         In a long run analysis, Beeson et al. (2001) examine the location and growth of

the U.S. population using county-level census data from 1840 to 1990. They showed that

access to transportation networks, either natural (oceans) or produced (railroads), was an

important source of growth over the period.2 In addition, industry mix (share of

employment in commerce and manufacturing), educational infrastructure, and weather

have promoted population growth.

         In a recent paper for developing countries, Au and Henderson (2004) took a

slightly different approach. They modeled and estimated net urban agglomeration

economies for cities in China, which can be postulated by inverted-U shapes of net output

or value-added per worker against city employment. They found urban agglomeration

benefits are high: real incomes per worker rise sharply with increases in city size from a

low level, level out nearer the peak, and then decline very slowly past the peak. The

inverted-U shifts with industrial composition across the urban hierarchy of cities. Peak

sizes are larger for more service-oriented cities and smaller for manufacturing-intensive

cities. In addition, (domestic) market potential and accumulated FDI per worker have

significant and beneficial effects on city productivity, measured by value-added per



1 They attributed this correlation to higher expected growth, which made it cheaper to borrow, or
to governments investing heavily in infrastructure to serve that growth.
2 The transportation network is represented by a group of dummy variables indicating ocean, mountain,
confluence of two rivers, railroads, and canals.



worker. However, percentage of high school graduates, distances to a major highway and

to navigable rivers, and kilometers of paved road per person have no effects, once market

potential is controlled for.

        We now describe the model and estimation strategy employed in our analysis.

The data used for the analysis have been produced through a joint research program

between IPEA, Brasilia and the World Bank. Detailed descriptions of the variables and

their sources are provided in Appendix C, and a descriptive overview of Brazilian city

growth is in da Mata et al. (2005). There is no official statistical or administrative entity

in Brazil that reflects the concept of a city or urban agglomeration that is appropriate for

economic analysis. Socioeconomic data in Brazil tend to be available for municípios, the

main administrative level for local policy implementation and management. Municípios,

however, vary in size. In 2000, São Paulo município had a population of more than ten

million, while many other municípios had only a few thousand residents. Furthermore,

many functional agglomerations consist of a number of municípios, and the boundaries of

these units change over time. Our analysis therefore adapts the concepts of

agglomerations from a comprehensive urban study by IPEA, IBGE and UNICAMP

(2002) resulting in a grouping of municípios to form 123 urban agglomerations (Figure

1). Throughout this paper we refer to these units of analysis as agglomerations, urban

areas, or cities.





Model and estimation strategy

         The model consists of a demand side--what utility levels a city can pay out--and

a supply side--what utilities people demand to live in a city. We estimate aspects of the

demand and supply side; and then a reduced form equation that describes city sizes and

their growth. In the end the focus is on the last item.

Demand side

         The demand side is given by the schedule of utility levels a city can offer workers,

as city size increases. A prime determinant of that is income, I, which consists of wage

income and income from rents and other non-labor sources. In addition in an indirect

utility function we also have a vector of items, $Q_i$, such as commuting costs, housing

rents, local taxes, and local public services and amenities, so that

$$U_i^D = U(I_i, Q_i) \qquad (1)$$

         For wage income there is a wage rate component and then a work effort

component discussed momentarily. The wage rate component comes from value of

marginal productivity relationships, where

$$w_i = w(MP_i, r_i, e_i, N_i) \qquad (2)$$

In (2), r is the rental rate on capital, e is the quality or education level of workers, MP is

market potential reflecting the demand for a city's output and hence the price it receives,

and N is a measure of scale, such as city employment. MP from the new economic

geography and monopolistic competition literature has a specific form with components

we cannot measure. We make two adjustments. First, we use "nominal" market potential,

which is simply the distance discounted sum of total incomes of all MCAs in Brazil for

city i , or




$$MP_i = \sum_{j,\, j \neq i} \frac{TI_j}{\tau_{ij}} \qquad (3)$$

TI is total income and $\tau_{ij}$ represents the transport cost between i and j.3 The calculation of

market potential is described in Appendix B, where we use distance as the measure of

transport costs. However, travel times and costs vary by more than distance. For

1968, 1980 and 1995, Brazil has a measure of the transport cost from each city to its state

capital. We divide that variable by distance from the city to the state capital to get a city

specific measure of local transport costs which producers in a city face in selling in the

local region. The variable "inter-city transport costs", $\tau_{ii}$, will be determined by intercity

road infrastructure investment.
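
As a rough illustration of equation (3) and of the per-kilometer transport cost just described, the sketch below computes both for a few hypothetical areas. It is not the procedure of Appendix B: it assumes straight-line distance as the transport cost measure with a simple inverse-distance discount, and the function names, incomes, coordinates and cost figures are invented for the example.

```python
import math


def nominal_market_potential(incomes, coords):
    """Return MP_i = sum over j != i of TI_j / d_ij for every area i.

    incomes: total income TI_j of each area (hypothetical values).
    coords:  (x, y) location of each area; straight-line distance d_ij
             stands in for the transport cost between i and j (assumption).
    """
    potentials = []
    for i, (xi, yi) in enumerate(coords):
        total = 0.0
        for j, (xj, yj) in enumerate(coords):
            if j == i:
                continue  # an area's own income is excluded, as in (3)
            total += incomes[j] / math.hypot(xi - xj, yi - yj)
        potentials.append(total)
    return potentials


def unit_transport_cost(cost_to_capital, distance_to_capital):
    """Transport cost to the state capital divided by distance to it."""
    return cost_to_capital / distance_to_capital


# Hypothetical data: three areas located along a line.
incomes = [100.0, 200.0, 400.0]
coords = [(0.0, 0.0), (10.0, 0.0), (30.0, 0.0)]
mp = nominal_market_potential(incomes, coords)
local_cost = unit_transport_cost(cost_to_capital=50.0, distance_to_capital=200.0)
```

The middle area has the lowest income-weighted remoteness from the large third area and the small first area, so its market potential depends on both; moving an area closer to a high-income neighbor raises its value.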

        The major items from urban theory affecting worker well-being, apart from the

wage rate, are rents and commuting costs. Commuting costs are time costs, of which part

will be reflected in lost work time or energy for work, and part in out-of-pocket

commuting costs. So total wage income is a function of both the wage rate and hours and

energy available to work, where the latter will be negatively affected by commuting times.

Housing costs are tricky, since higher housing rents are also reflected in higher non-labor

income earned by landowners.

        For demand side estimation, what we know from the data is total income per

worker in each city. We model that as a function of the determinants of the wage rate and

then factors affecting work time/energy and housing rental income. Both are a function of

city size. In sum we estimate:

$$I_i = I^D(MP_i, \tau_{ii}, e_i, N_i) \qquad (4)$$



3 The MCAs (Minimum Comparable Areas) are groups of municípios. A detailed description is in
Appendix C.



        The scale variable, N, captures three things: scale externality effects on wage

rates, increasing housing rental incomes, and reduced work time/energy. As such its sign

is uncertain--if cities are at a size where the commuting cost aspects of urban living

weigh heavily, at the margin increases in scale could detract from incomes. That will be

the case in our estimation (which is also good for "stability" given supply curves are

upward sloping--being on the rising part of the "demand curve" can be problematical

and also makes sign interpretations in the city size equation more difficult as discussed

later).

Population Supply

        The population supply relationship we estimate has population supplied to a city

increasing in utility offered per worker, which we approximate by income per worker.

This will tell us the supply elasticity of people to a city. In addition, supply is shifted by

attributes, $Z_i$, of the surrounding area--or substitutes of places to work for population in

the area. We have supply to a city of population from nearby rural areas. It is decreasing

in surrounding rural incomes where we use a gravity measure of surrounding rural

incomes, and it is increasing in surrounding rural population supply where again we use a

gravity measure of surrounding rural population. The calculation details are in Appendix

B.

        The supply equation is given by

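$$N_i = N^S(U^s(I_i), Z_i), \quad \text{where } \partial N^S/\partial I > 0,\ \partial N^S/\partial Z > 0 \qquad (5)$$

Note the inverse we will use later is

$$I_i = I^S(N_i, Z_i), \quad \text{where } \partial I^S/\partial N > 0,\ \partial I^S/\partial Z < 0. \qquad (6)$$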





City Size Level and Growth Equations

          The final estimating equation comes from equating income demand and supply

equations in (4) and (6) and solving for N to get

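$$N_i = N(MP_i, \tau_{ii}, e_i, Z_i), \quad \text{where } \partial N/\partial MP > 0,\ \partial N/\partial \tau_{ii} > 0,\ \partial N/\partial e > 0,\ \partial N/\partial Z > 0. \qquad (7)$$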

Also by differentiating (4) and (6) we can show


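$$dN = \frac{-(\partial I^S/\partial Z)\,dZ + (\partial I^D/\partial MP)\,dMP + (\partial I^D/\partial \tau_{ii})\,d\tau_{ii} + (\partial I^D/\partial e)\,de}{\partial I^S/\partial N - \partial I^D/\partial N} \qquad (8)$$

Note that $\partial I^S/\partial Z < 0$, and that $\partial I^S/\partial N - \partial I^D/\partial N > 0$ for "stability", which is helped by the fact

that empirically, in Table 2 (discussed momentarily), $\partial I^D/\partial N < 0$.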
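
For readers who want the intermediate steps, the derivation of (8) can be sketched as follows. This is a reconstruction from equations (4) and (6) only; city subscripts are dropped and the transport cost symbol $\tau$ is our notation for the inter-city transport cost variable.

```latex
\begin{align*}
% Equilibrium: income a city can offer (4) equals income required by supply (6)
I^D(MP, \tau, e, N) &= I^S(N, Z) \\
% Totally differentiate both sides
\frac{\partial I^D}{\partial MP}\,dMP
 + \frac{\partial I^D}{\partial \tau}\,d\tau
 + \frac{\partial I^D}{\partial e}\,de
 + \frac{\partial I^D}{\partial N}\,dN
 &= \frac{\partial I^S}{\partial N}\,dN + \frac{\partial I^S}{\partial Z}\,dZ \\
% Collect the dN terms on one side
\left(\frac{\partial I^S}{\partial N} - \frac{\partial I^D}{\partial N}\right) dN
 &= -\frac{\partial I^S}{\partial Z}\,dZ
 + \frac{\partial I^D}{\partial MP}\,dMP
 + \frac{\partial I^D}{\partial \tau}\,d\tau
 + \frac{\partial I^D}{\partial e}\,de
\end{align*}
```

Dividing through by the bracketed term gives (8). Because that term is positive under the stability condition, each covariate moves city size in the same direction as its numerator term; for example, since $\partial I^S/\partial Z < 0$, an increase in Z raises N.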




    3. DETERMINANTS OF GROWTH - DEMAND AND SUPPLY SIDES



          Having described the model and estimation strategy in Section 2, we now discuss

the main findings from demand, supply, and city growth models. Results from estimating

the demand side model (equation 4) are presented in Table 2, pooling three years (1980,

1991, and 2000). We focus on the GMM-IV results in column 1, which are from the two-

step efficient GMM in the presence of arbitrary heteroskedasticity and arbitrary within-

state correlation.4 We also give OLS results in column 2. In columns 1 and 2 the scale

measure is total workers in each city. In column 3, population instead of total workers is

used to represent urban scale. The instruments along with statistical test results are listed

in the footnotes. The GMM results of columns 1 and 3 pass specification tests for the

listed variables, and average partial R2's (average partial F's) are .44 and .43 (52.7 and



4 The results are almost identical to the 2SLS ones. All GMM estimations in this paper use two-step
efficient GMM, robust to arbitrary heteroskedasticity and arbitrary within-state correlation.



51.6) respectively, which are relatively strong.5 In column 4, we provide the effects on

outcomes of a one standard deviation increase in covariates. All variables have large

impacts on total income per worker. For average schooling and Ln(market potential), one

standard deviation increases (1.26 and 1.01) raise total income per worker by 37.5%

and 36.5%, respectively. For Ln(number of workers) and Ln(intercity-transport costs), reductions

of one standard deviation (-1.13 and -0.344) raise total income per worker by 34.4%

and 7.4%, respectively. Of course, for covariates in log form the coefficients are already elasticities.

         The inter-city transport costs variable is significant, although it can be fragile. For

intercity-transport costs we use the 1980 value for years 1980 and 1991, and we use the

1995 value for 2000. We assign zero values to Ln(intercity-transport costs) for state capital

cities and add to the covariates a dummy variable indicating state capitals. Results for

transport costs to São Paulo are much more fragile and have not been included in the

specifications reported in Table 2.

         Finally, note the strong negative scale effects at the margin, suggesting we are on

the downward sloping portion of inverted U's (of income against city size) as we should

be.6 We had no success in estimating a quadratic specification or interacting scale with

the manufacturing to service ratio, to examine interactions between city scale and

industrial composition.




5 Partial R2 is a squared partial correlation between the excluded instruments and the endogenous regressor
in question, and the F-test of the excluded instruments corresponds to this partial R2.
6 Theory suggests that, under free migration within a country, if particular cities are not at their peak of
inverted U's, they will be to the right of the peak, due to either "stability" conditions in migration-labor
markets or conditions on what constitutes a Nash equilibrium in migration decisions (Au and Henderson,
2004; Duranton and Puga, 2004).



          Growth or differenced versions of this equation and the population supply one

have very poor IV results, which is mainly due to a weak instrument problem. For the

growth specifications, we only focus on the final reduced form specification (Table 5).


          Results for population supply are provided in Table 3. Again, for the estimation

we pool three years (1980, 1991, and 2000). Columns 1 and 2 give the GMM-IV and then

OLS results. The instruments, listed in the footnote of the table, pass specification tests

and produce strong first-stage regression results. All terms have strong coefficients with the

expected signs. In column 1, a 1% increase in a city's total income per capita increases city

population by 2.4%. The gravity measures of surrounding rural population supply and

rural income opportunities have the expected opposite effects with similar magnitudes. A

1% increase in surrounding rural population supply increases city population by 5.9%,

and a 1% increase in surrounding rural income opportunities decreases city population by

5.2%. Thus, city populations are very sensitive to rural population supply and earning

opportunities.
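
To make the elasticity reading concrete, the following minimal sketch converts the elasticities quoted above (2.4, 5.9 and -5.2) into implied percentage changes in city population. It assumes a constant-elasticity (log-log) relation, which is our reading of the Ln(...) specification rather than something verified against Table 3; the function name is hypothetical.

```python
def pct_change_in_population(elasticity, pct_change_in_x):
    """Exact percentage response of N when N is proportional to X**elasticity.

    For small changes this is close to elasticity * pct_change_in_x, which is
    the approximation behind statements such as "a 1% increase ... raises
    city population by 2.4%".
    """
    return ((1.0 + pct_change_in_x / 100.0) ** elasticity - 1.0) * 100.0


# Elasticities as quoted in the text for column 1 of Table 3.
income_effect = pct_change_in_population(2.4, 1.0)
rural_pop_effect = pct_change_in_population(5.9, 1.0)
rural_income_effect = pct_change_in_population(-5.2, 1.0)

# For larger shocks the linear approximation understates the exact response.
large_shock_exact = pct_change_in_population(2.4, 10.0)
large_shock_linear = 2.4 * 10.0
```

The two rural effects have opposite signs and similar absolute size, which is the pattern the text describes.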



          In columns 3-5, we present supply elasticities by year. The coefficients of all

three covariates increase over time, indicating increasing mobility. Population supply to a

city has become more elastic to changes in attributes of the city and nearby rural areas.

However, even in 2000, the elasticity, 2.9, is far from the elasticity implied by perfect mobility.7




7 Under perfect labor mobility, we expect a horizontal population supply curve. All cities offer the same
utility level, and city sizes are determined only by demand-side factors.



City Size Results

         Results for city size from estimating equation (7) are given in Table 4. Column 1

gives GMM-IV results, column 2 OLS, and column 3 the effects of a one standard

deviation increase in covariates on city size. For instruments, we use 1970 values and

time-invariant variables.8 Again the instruments pass specification tests, and show strong

first-stage regression results.

         If the reduced form results are indeed from combining demand and supply sides,

we expect the coefficient estimates in Table 4 to be consistent with the imputed values

from the demand side (Table 2) and the supply side (Table 3). The imputed values can be

calculated using (8), such that


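$$c_i = \frac{dN}{dQ} = \frac{\partial I^D/\partial Q}{\partial I^S/\partial N - \partial I^D/\partial N} = \frac{b_i}{1/a_1 - b_4}$$

$$c_j = \frac{dN}{dZ} = \frac{-(\partial I^S/\partial Z)}{\partial I^S/\partial N - \partial I^D/\partial N} = \frac{-a_j}{1/a_1 - b_4}$$

where $c_i, c_j$ are the reduced form coefficient estimates in Table 4, $b_i$ the demand side estimates of

Table 2, and $a_j$ the supply side estimates of Table 3. The comparison with imputed values, noted in

the footnote, confirms a rough consistency between Tables 2 to 4.9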
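
The demand-side imputation can be sketched in a few lines. The function below implements only the ratio b_i / (1/a_1 - b_4) stated above; its name and the numerical inputs are hypothetical and are not the estimates of Tables 2 and 3.

```python
def imputed_reduced_form(b, a1, b4):
    """Imputed reduced form coefficient c = b / (1/a1 - b4).

    b  : demand-side coefficient on the covariate of interest
    a1 : supply elasticity of population with respect to income, so 1/a1 is
         the slope of the inverse supply curve
    b4 : demand-side coefficient on city scale
    """
    denominator = 1.0 / a1 - b4
    if denominator <= 0:
        # The text requires this term to be positive for "stability".
        raise ValueError("stability condition 1/a1 - b4 > 0 is violated")
    return b / denominator


# Hypothetical values chosen so that the denominator equals 1.
c_example = imputed_reduced_form(b=0.3, a1=2.0, b4=-0.5)

# A more negative scale coefficient enlarges the denominator and therefore
# shrinks the imputed reduced form effect.
c_steeper = imputed_reduced_form(b=0.3, a1=2.0, b4=-1.0)
```

A negative scale coefficient b4, as found in Table 2, keeps the denominator positive whatever the value of a1 > 0.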




8 The instruments are semi-arid area dummy, port dummy, illiteracy rate (1970), ln(industrial capital per
worker, 1970), ln(distance to state capital)*ln(market pot. agric. land availability, 1970), ln(humidity),
ln(avg. temperature), ln(rural pop. supply, 1970), ln(rural income opportunities, 1970), ln(market potential,
1970), and state capital and time dummies.
9 Imputed [from Tables 2 (3) and 3 (1)] versus estimated [Table 4 (1)] coefficients:

Variable                          Formula           Imputed    Table 4 (1)
Ln(market potential)              b1/(1/a1-b4)        0.468       2.693
Ln(inter-city trans. costs)       b2/(1/a1-b4)       -0.250      -1.395
Average Schooling                 b3/(1/a1-b4)        0.381       0.220
Ln(rural pop. supply)            -a2/(1/a1-b4)        3.053       1.661
Ln(rural income opportunities)   -a3/(1/a1-b4)       -3.468      -3.664





        Table 4 suggests two things. First, market potential for goods, the rural population

supply, and rural income opportunities have significant effects on city populations with

roughly similar magnitudes. A 1% increase in market potential or in rural population supply

increases city size by 2.7% or 1.7%, respectively. In comparison, a 1% decrease in rural

income opportunities would increase city size by 3.7%. Second, intercity-transport costs

and educational attainment (average schooling) are also important, although GMM-IV

results are somewhat fragile.



Growth Results


        Next we turn to growth equations, where we difference the reduced form equation

(7). While in principle results should be the same, a differenced equation has three

possible advantages and one drawback. First, a growth formulation allows us to separate

out labor force quality improvements from the effect of education on technology

(knowledge accumulation spillovers). The latter is inferred from the effect on city growth

of base period education levels, in a common specification in the growth literature.

Second, while the levels formulation we estimated passes specification tests, one might

have strong priors that there are time invariant unobservables affecting city size that are

difficult to instrument for; differencing removes these. Third, a growth formulation

allows us conceptually to move beyond the equilibrium static allocation framework used

in the specification to test for growth effects where adjustment processes are involved.

The drawback in differencing equations is that the effects of variables which have small

changes over time may be poorly estimated, given lack of variation in the data.
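
The second advantage, that differencing removes time-invariant unobservables, can be illustrated with simulated data. The sketch below is not based on the paper's dataset: the sample size, the true coefficient, and the error structure are all hypothetical, and plain OLS stands in for the GMM-IV estimator.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cities, beta = 500, 0.5             # hypothetical sample size and true effect

# A time-invariant city effect that is correlated with the covariate.
city_effect = rng.normal(size=n_cities)
x0 = city_effect + rng.normal(size=n_cities)    # covariate in the base period
x1 = x0 + rng.normal(size=n_cities)             # covariate one period later
y0 = beta * x0 + city_effect + 0.1 * rng.normal(size=n_cities)
y1 = beta * x1 + city_effect + 0.1 * rng.normal(size=n_cities)


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    xc, yc = x - x.mean(), y - y.mean()
    return float(xc @ yc / (xc @ xc))


# Pooled levels regression: the omitted city effect biases the slope.
slope_levels = ols_slope(np.concatenate([x0, x1]), np.concatenate([y0, y1]))
# Differenced regression: the city effect cancels out of y1 - y0.
slope_diff = ols_slope(x1 - x0, y1 - y0)
print(f"levels: {slope_levels:.2f}, differenced: {slope_diff:.2f}, true: {beta}")
```

In this setup the levels estimate is pushed away from the true value by the omitted city effect, while the differenced estimate is not.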





        Table 5-1 shows the GMM-IV and OLS growth results pooling 1991-1980 and

2000-1991 differenced equation years for equation (7). For instruments, we add to the IV

list of Table 4 ln(distance to São Paulo), ln(transport costs to São Paulo, 1968), and

ln(transport costs to state capital, 1968). All covariates, except changes in rural income

opportunities, have strong coefficients of the expected sign. The poor performance of rural

income opportunities is most probably due to the limited variance in the data over time,

as discussed next.

        Relative to the levels equation in Table 4, the growth equation coefficients

reported in column 1 are similar for market potential and the change in schooling. However,

results for changes in the rural variables and in transport costs differ in magnitude. For

ln(rural population supply) and ln(rural income opportunities), not only is there little

variation, but the two variables are also strongly negatively correlated.10 So the high coefficient

on ln(rural population supply) may be picking up some of the effect of ln(rural income

opportunities). For the inter-city transport cost variable, differences over time may be

poorly measured. While we instrument for this variable, the instruments include historical

levels of the same measure, and therefore may be subject to the same measurement

issues. As a result, reductions in inter-city transport costs have a much smaller effect in

the growth estimation. Nevertheless, coefficients are consistent in sign with those of the

level equation in Table 4.

        In examining the results in Table 5-1, we focus on column 3. The main difference

between the GMM results in columns 1 and 3 is that we introduce base period population

and manufacturing to service ratios in the latter specification. Controlling for population

allows for dynamic adjustment to steady state levels from the base, and introducing

10 The correlation coefficients are -.719 (for 1991-1980) and -.481 (for 2000-1991).



industrial composition allows for adjustment relative to changes in national output

composition. For results in column 3, the instrument list readily passes the specification

test. First stage regressions for the covariates have average partial R2's and F's of

respectively .52 and 2852, which are strong for differenced covariates. For differenced

intercity-transport costs, we use the difference between 1995 and 1980 for 2000-1991;

and the difference between 1980 and 1968 for 1991-1980.

         We find that increases in rural population supply and in the market potential of goods, together with labor

force quality improvements (measured by changes in educational attainment), increase the

growth rate of city population. As a new effect, educational attainment in the base period

increases city population growth rates afterwards, confirming spillover effects of

knowledge accumulation. But as noted above, reductions in intercity-transport costs have

only a moderate effect on the city population growth rate. A 10% decrease in intercity-transport

costs increases city population growth by .9% over a decade. Initial city size has a

negative coefficient, suggesting some conditional convergence in population growth

across cities. Also, cities with high manufacturing ratios in the base period experience

faster growth. We also find that once base period population and industrial composition

are controlled for, state capitals are growing faster than other cities.

         In Table 5-2, we introduce two additional local characteristics to the specification

in Table 5-1, column 3. These are (1) ratio of public industry capital to total industry

capital stock in 198011 and (2) base period homicide rates. The main difference between

the GMM results in column 3, Table 5-1 and those from Table 5-2 is that the statistical

significance for the change in market potential drops to 20 percent. Other results are


11 Total industry capital includes both public and private industry capital stocks. The capital stock data
come from Morandi and Reis (2004). Due to data limitations, we use the capital stock in 1980, which is the
most recent year available.



consistent with those reported in Table 5-1. The GMM results suggest that homicide rates

and a higher share of public industry capital have a detrimental effect on city

growth. For example, a 10% increase in base period homicide rates reduces city growth

by 1.1% over the next decade. The findings on public industrial capital accumulation

suggest that public investment in industry tends to crowd out private investment (at least

in the short term), and the potential inefficiency of state enterprises may also deter

economic growth.12

Decomposing City Growth

         In Table 6, we decompose the city population growth results of Table 5-1 (3) into

contributions of each covariate. We focus on the covariates which are statistically

significant. The contribution of each covariate is calculated as a fitted value (the mean

value multiplied by the estimated coefficient) relative to the sum of all the fitted values.

Column 5 shows the overall contributions for all cities. There is a strong negative effect

of city size in base period (-83.4%). This effect is compensated by increases in market

potential (63.8%) and educational attainment (66.7%), along with base period's

educational attainment (46.7%) which affects local technology growth.
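
The decomposition rule described above (each covariate's mean multiplied by its estimated coefficient, expressed relative to the sum of all such fitted values) can be sketched as follows. The coefficients and sample means used here are hypothetical placeholders, not the estimates of Table 5-1 (3) or the values behind Table 6.

```python
def growth_contributions(coefs, means):
    """Share of each covariate: coef * mean, divided by the sum of all such terms."""
    fitted = {name: coefs[name] * means[name] for name in coefs}
    total = sum(fitted.values())
    return {name: 100.0 * value / total for name, value in fitted.items()}


# Hypothetical coefficients and sample means, for illustration only.
coefs = {"base_ln_population": -0.04, "d_ln_market_potential": 0.8, "d_schooling": 0.2}
means = {"base_ln_population": 12.0, "d_ln_market_potential": 0.5, "d_schooling": 2.0}

shares = growth_contributions(coefs, means)
for name, pct in shares.items():
    print(f"{name:24s} {pct:7.1f}%")
```

Because the shares are normalized by the sum of fitted values, they add up to 100%, and a covariate with a negative fitted value (such as base period city size) enters with a negative share that the others must offset.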

         The estimated effects of market potential and technology spillovers support the

new economic geography emphasis on local markets and the endogenous growth

literature emphasis on human capital accumulation. These results are also consistent with

cross country findings in Henderson and Wang (2005).13 Columns 6 and 7 compare city


12 La Porta and López-de-Silanes (1999) showed that privatization in Mexico in the 1980s and 1990s led to a
significant improvement in firm performance, as profitability increased 24 percentage points and converged
to levels similar to those of private firms.
13 Henderson and Wang (2005) analyze how urbanization in a country is accommodated by increases in
numbers versus population sizes of cities. Using a worldwide dataset on all metro areas over 100,000
population from 1960-2000, they show market potential, educational attainment, and the degree of
democratization strongly affect growth in both city numbers and individual city sizes.



growth decompositions of large versus small cities. We find no major difference in these

effects across city size.



Robustness Tests - Spatial Dependence

        Interaction among cities due to trading and technological linkages is likely to

influence city growth. In the presence of technology spillovers, copycat policy adoption,

and inter-regional transport connectivity, growth in any given city will be related to growth in other

cities in the urban system, and the impact of these spillovers is likely to be higher among

cities which are geographically close to each other. Many of these interactions, however,

are not observed in the data that we have been able to compile, and are thus relegated to

the error term. In the presence of spatial autocorrelation, standard errors from the

city growth estimation are likely to be inaccurate, and the estimates themselves may be

inefficient.

        To address this issue, we test whether the clustered estimation results of Tables 2

to 5-2 are robust to residual spatial dependence. Tests for spatial dependence (Moran's I

and Geary's C) show that there is residual spatial autocorrelation in the error terms. To

correct for it, we employ the GMM methodology of Conley (1999), which

uses weighted averages of spatial autocovariance terms to correct the standard errors of

parameter estimates for possible dependence based on location. This approach is

robust to misspecification of the degree of spatial correlation among the units. In this

nonparametric application, the researcher can specify a cutoff point beyond which spatial

dependence is thought to be unimportant. We use latitude and longitude of the

agglomeration centroid as coordinate variables. Cutoffs are set to be 1.5 standard

deviations of latitude and longitude (10.23 and 8.20, respectively), which correspond to 900 miles.



Thus, the weight given to the covariance between two cities declines linearly with distance and is zero beyond 1.5 standard

deviations of latitude and longitude.
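
To make the correction concrete, the following is a minimal sketch of a Conley-type covariance estimator for OLS. It assumes a product of two linearly declining (Bartlett) kernels, one in latitude and one in longitude, with the cutoffs quoted above. It does not reproduce the two-step spatial GMM used for the appendix tables, and the data are simulated; the function name `conley_ols` and all inputs are hypothetical.

```python
import numpy as np


def conley_ols(y, X, lat, lon, cut_lat, cut_lon):
    """OLS coefficients with spatially robust standard errors.

    The weight on the cross product of observations i and j is the product of
    two Bartlett (linearly declining) kernels, one in latitude and one in
    longitude; it is zero once either coordinate gap reaches its cutoff.
    """
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta

    d_lat = np.abs(lat[:, None] - lat[None, :])
    d_lon = np.abs(lon[:, None] - lon[None, :])
    weights = np.clip(1 - d_lat / cut_lat, 0, None) * np.clip(1 - d_lon / cut_lon, 0, None)

    scores = X * resid[:, None]               # x_i * e_i, one row per city
    meat = scores.T @ weights @ scores        # sum over i, j of w_ij x_i e_i e_j x_j'
    cov = XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(cov))


# Simulated example: 123 hypothetical cities with random coordinates.
rng = np.random.default_rng(1)
n = 123
lat = rng.uniform(-30, 0, n)
lon = rng.uniform(-60, -35, n)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 0.5]) + rng.normal(size=n)

beta, se_spatial = conley_ols(y, X, lat, lon, cut_lat=10.23, cut_lon=8.20)
# With negligible cutoffs no city has neighbours, which gives White-type errors.
_, se_white = conley_ols(y, X, lat, lon, cut_lat=1e-9, cut_lon=1e-9)
print(beta, se_spatial, se_white)
```

Shrinking the cutoffs toward zero removes all cross-city terms, so the estimator collapses to the usual heteroskedasticity-robust standard errors; this gives a simple check on the implementation.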

        Appendix Tables A to D report the two-step spatial GMM and spatial OLS results

which correspond to each specification of Tables 2 to 5-2. In general we find that the

GMM results are robust and the spatial GMM results are very similar to the clustered

ones.

Decomposition of City Growth Residuals


        We now use the residuals from the GMM estimations in Table 5-2 (1), and

examine if they have any systematic association with time invariant local characteristics.

Our main interest is in examining whether local management or governance and inter-industry

linkages are associated with city growth. In principle, autonomous local government

would actively work to provide local public goods for its constituents, and develop

policies to stimulate growth and manage externalities. For our analysis, we have two

measures of local government efforts: (1) the existence of laws to collect the IPTU (property

tax), and (2) the percentage of the population under land zone laws.



        In terms of inter-industry linkages, we expect a clustered or densely populated

region to provide a rich environment for competition and collaboration among firms and

workers in the region, which leads to economic growth. As Saxenian (1994) observed,

regional development is more pronounced in a region consisting of many small firms

than in one consisting of a few large firms.14 A city with a rich set of forward and backward linkage



14Saxenian (1994) examined different regional economic performances between Silicon Valley in
California and Route 128 in Massachusetts. Dense social networks and open labor market in Silicon Valley
have facilitated informal communication and collaborative practices, and produced a regional network-



industries performs better than an enclave, that is, a small pocket of firms. We measure the

density of economic activities by (1) ln(no. firms relative to workers) = ln(no. formal

firms / no. workers in formal firms), and (2) ln(population density).

         The basic estimation results from decomposing the residuals of Table 5-2 (1) are

reported in Table 7. The basic structure is that city growth residuals between years (t-1) and t

are related to city characteristics in year (t-1). However, when data for year (t-1)

are not available, we use the city characteristics in year t, assuming long-lasting

persistence of city characteristics across years. In any case, the estimation results should

be interpreted as associations between contemporaneous variables rather than as causal relationships.

         We find that population growth is higher in cities with better enforcement of land

use and zoning laws: the estimates suggest that city growth is associated with a higher

percentage of the city population under land zone laws.15 However, we do not find any

statistically significant association between city growth and the existence of laws to collect

the IPTU (property tax). This is most likely because there is almost no variation in the IPTU

collection data: most cities have laws to collect the property tax. A richer set of

inter-industry linkages is also associated with growth: the OLS coefficient for the number of

(formal) firms relative to (formal) workers is statistically significant and has the expected

sign. A higher number of firms relative to workers may stimulate competition and

collaboration among firms and workers in a city, and is associated with higher city

growth.


based industrial system. The Route 128 region, in contrast, is dominated by autarkic (self-sufficient)
corporations that internalize a wide range of productive activities. She concluded that this difference in
regional socio-economic structure accounts for the divergent prosperity of two regional economies, in spite
of their common origins in postwar military spending and university-based research, and even though they
enjoyed roughly the same employment levels in 1975.
15We can get a similar result when we use a dummy variable indicating more than 50% of population is
under land zone laws.



    4. POLICIES FAVORING SECONDARY CITIES


        Using the results from the regressions of city growth, let us consider the following

policy experiment. There is considerable policy debate in Brazil that investments need to

be directed towards secondary cities to stimulate local economic development and limit

the growth of the largest metropolitan areas. However, the impact of these initiatives on

overall economic growth and urban efficiency is unclear.

        Suppose the Brazilian government invests in transportation infrastructure in order

to decrease inter-city transport costs. The question is whether favoring investments in small

cities vis-à-vis large cities increases overall productivity growth, and therefore leads to higher

overall economic growth in Brazil. To make the analysis tractable, we first assume that

the amount of transportation investment needed to reduce inter-city transport costs (per

mile) by one unit is proportional to city population. So a one unit decrease in inter-city transport costs

for a city of 1 million is assumed to require the same government expenditure as

a one unit decrease in each of 10 cities of 100,000 people.

        In 2000, the largest city, São Paulo, had 17.9 million residents, which is

equivalent to the total population of the 88 smallest cities (Table 8). The total population

of the 7 largest cities is the same as that of the remaining 116 smaller cities (our data consist of

123 cities). Our assumption implies that the total transportation investment needed to decrease

transport costs by one unit for São Paulo would also reduce transport costs by one unit in each of

the 88 smallest cities, if invested in those cities.
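
Under the proportional-cost assumption, the number of small cities that can be treated for the price of one large city is found by accumulating populations from the smallest city upward. The sketch below uses made-up populations that mirror the 1 million versus 10 times 100,000 example, not the data of Table 8; the function name is hypothetical.

```python
from itertools import accumulate


def equivalent_small_cities(populations):
    """Number of smallest cities whose combined population first reaches the largest city's."""
    ordered = sorted(populations)
    largest = ordered[-1]
    for count, running_total in enumerate(accumulate(ordered[:-1]), start=1):
        if running_total >= largest:
            return count
    return None  # the other cities together are smaller than the largest one


# Hypothetical city populations (not the paper's data).
populations = [1_000_000] + [100_000] * 15
print(equivalent_small_cities(populations))  # prints 10
```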

        Table 2 (3) describes the determinants of income per worker, in which average

schooling, market potential, city population, and inter-city transport costs affect income

per worker. From this equation, we can calculate the total urban income in Brazil as

\[
\text{total urban income}
  = \sum_{i=1}^{123} \text{income per worker}_i \times \text{no. workers}_i
  = \sum_{i=1}^{123} X_i \hat{b}_{GMM} \times \text{no. workers}_i .
\]


        Now suppose the government invests in transportation infrastructure. In Table 8,

we compare the effect on total urban income of investments favoring big cities versus

small cities. The first column is the total urban income relative to the baseline income

when infrastructure investments favor the largest cities, specifically a ½ standard deviation

(.4) decrease in the inter-city transport costs of the largest cities. The baseline income is the

predicted value of Table 2 (3). The second column is the total urban income when the

same amounts are invested in the smallest cities to decrease those cities' transport cost by

the same magnitude (.4). We experiment with several combinations of cities in Table 8.
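
The comparison can be sketched in a few lines. The example assumes that the dependent variable of Table 2 (3) is ln(income per worker), so that a change in ln(transport costs) scales predicted income per worker by exp(b × change), and it holds city populations fixed. The city data are hypothetical, and only the transport cost coefficient (-0.178) is taken from Table 2 (3).

```python
import math

B_TRANSPORT = -0.178   # Table 2 (3): coefficient on ln(inter-city transport costs)


def income_ratio(cities, treated, d_ln_cost):
    """Total urban income after the investment, relative to the baseline.

    cities:  list of (income_per_worker, workers) tuples
    treated: set of indices whose ln(transport cost) changes by d_ln_cost
    """
    baseline = sum(inc * w for inc, w in cities)
    scenario = sum(
        inc * (math.exp(B_TRANSPORT * d_ln_cost) if i in treated else 1.0) * w
        for i, (inc, w) in enumerate(cities)
    )
    return scenario / baseline


# Hypothetical data: one large city and ten small ones with the same total workforce.
cities = [(12.0, 1_000_000)] + [(8.0, 100_000)] * 10

favor_large = income_ratio(cities, treated={0}, d_ln_cost=-0.4)
favor_small = income_ratio(cities, treated=set(range(1, 11)), d_ln_cost=-0.4)
print(f"favor large: {favor_large:.4f}, favor small: {favor_small:.4f}")
```

In this toy setup the gap between the two scenarios comes entirely from the difference in baseline income per worker between the treated groups, since both groups have the same number of workers and receive the same cost reduction.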

        The simulation results show that there are very small differences in total urban

income from favoring small cities vis-à-vis large cities. These income differences range

from about 0.3 to 0.7 percentage points of total urban income growth in 2000. The difference is highest when

we favor the 104 smallest cities vis-à-vis the two largest cities (.698 percentage points). These

results indicate that there are no major gains in terms of overall urban income from diverting

investments from the largest cities to secondary cities.


    5. SUMMARY AND CONCLUSIONS

        In this paper, we have examined the determinants of Brazilian city growth

between 1970 and 2000. For the analysis, we constructed a dataset of 123

agglomerations, and examined factors that influence wages and labor supply. Our main

findings are the following. (1) Increases in rural population supply are a major driver of

city growth. (2) Inter-regional transport improvements that lead to increases in the market




potential of goods and reduce inter-city transport costs stimulate growth. In fact, we find

that increases in market potential have the strongest impact on city growth. (3)

Improvements in labor force quality and the spillover effects of knowledge accumulation

(measured by initial levels of education attainment) have strong growth impacts.

        In terms of inter-regional transport improvements, the Brazilian government has

made significant investments in infrastructure to integrate the national economy and

lower business costs in peripheral regions. Most of the improvements in the road network

occurred between the 1950s and 1980s, leading to a significant reduction in transportation

and logistics costs. Castro (2002) measures the benefits of improvements in highway

infrastructure from 1970-1995 as the change in equivalent paved road distance from each

municipality to the state capital of São Paulo, accounting for the construction of the

network as well as the difference in vehicle operating costs between earth/gravel and

paved roads. He shows that transport cost reductions were quite significant for the

Northern region and the Central region state of Mato Grosso, with reductions ranging from

3,000 to 5,000 equivalent kilometers of paved road. Average reductions fall to the 1,000

km range in the Central region states of Goiás and Mato Grosso do Sul, the southern

states, and the coastal northeastern states. Using this measure, Castro (2002) finds that the

reduction in interregional transport costs was one of the major determinants of both the

expansion of agricultural production to the central regions of Brazil after the 1960s as

well as increases in the country's agricultural productivity.

        In terms of city level characteristics, we find that local homicide rates have a

negative impact on city growth rates. In addition, cities with high shares of public

industrial capital also experience slower growth. Thus, there is considerable scope for





local initiatives to reduce the costs imposed by crime and violence, along with local

economic development programs to improve access to finance for small and medium

sized businesses.

        Our decompositions of city growth residuals tentatively show that local land use

and zoning enforcement is positively associated with city growth, as is the presence of a

diverse set of inter-industry linkages. One of the major limitations in our efforts to

identify the contribution of local characteristics to city growth has been the lack of

longitudinal data, which makes it difficult to draw causal inferences. It would be useful

to get better data on historical land use and zoning regulations, as well as local public

goods, services, and amenities. In further work, we hope to collect additional data on city

level characteristics to better identify their impacts on city growth.





   6. REFERENCES


Alesina, A. and D. Rodrik (1994), "Distributive Politics and Economic Growth," The Quarterly
        Journal of Economics, 109, 456-490.

Beeson, P., D. DeJong and W. Troesken (2001), "Population Growth in U.S. Counties, 1840-
        1990," Regional Science and Urban Economics, 31, 669-699.

Castro, N. (2002), "Transportation Costs and Brazilian Agricultural Production: 1970-1996," Texto
        para Discussão - NEMESIS - LXVI, http://ssrn.com/author=243495, Social Science
        Research Network.

Conley, T. (1999), "GMM Estimation with Cross Sectional Dependence," Journal of
        Econometrics, 92, 1-45.

Da Mata, D., U. Deichmann, V. Henderson, S. Lall, and H. Wang (2005). Examining the Growth
        Patterns of Brazilian Cities. Mimeo.

Duranton, G. and D. Puga (2004), "Micro-Foundations of Urban Agglomeration Economies," in
        J. V. Henderson and J.F. Thisse (eds.) Handbook of Regional and Urban Economics, Vol
        4. North-Holland.

Galor, O. and J. Zeira (1993), "Income Distribution and Macroeconomics," Review of Economic
        Studies, 60, 35-52.

Glaeser E., J. Scheinkman and A. Shleifer (1995), "Economic Growth in a Cross-Section of
        Cities," Journal of Monetary Economics, 36, 117-143.

Henderson, J. V. and H.G. Wang (2005), "Urbanization and City Growth: the Role of
        Institutions," Brown University, mimeo.

Henderson, J.V. and C.C. Au (2004), "Are Chinese Cities Too Small?," Brown University,
        mimeo.

Hummels, D. (2001), "Toward a Geography of Trade Costs", Purdue University, mimeo.

IPEA, IBGE, and UNICAMP (2002), Configuração Atual e Tendências da Rede Urbana, Serie
        Configuração Atual e Tendências da Rede Urbana, Instituto de Pesquisa Econômica
        Aplicada, Instituto Brasileiro de Geografia e Estatistica, Universidade Estadual de
        Campinas, Brasilia.

Lemos, M., S. Moro, E. Biazi, and M. Crocco (2003), "A Dinâmica Urbana das Regiões
        Metropolitanas Brasileiras," Economia Aplicada, 7, 1:213-244.

Korenman, S. and D. Neumark (2000), "Cohort Crowding and Youth Labor Markets: A Cross-
        National Analysis," in D. Blanchflower and R. Freeman, Youth Employment and
        Joblessness in Advanced Countries, University of Chicago Press, pp. 57-105.

La Porta, R. and F. López-de-Silanes (1999), "The Benefits of Privatization: Evidence From
        Mexico," The Quarterly Journal of Economics, 114, 1193-1242.

Morandi, L. and E. Reis (2004), "Estoque De Capital Fixo No Brasil, 1950-2002," Anais do
        XXXII Encontro Nacional de Economia, Proceedings of the 32nd Brazilian Economics
        Meeting.

Persson, T. and G. Tabellini (1994), "Is Inequality Harmful for Growth?," American Economic
        Review, 84, 600-622.





Saxenian, A. (1994), Regional Advantage: Culture and Competition in Silicon Valley and Route
       128, Harvard University Press.

United Nations (2003). World Urbanization Prospects.

Weil, D. (2005). Economic Growth, Addison-Wesley.

World Bank (2003). Brazil: Equitable, Competitive and Sustainable - Contributions for Debate.
       World Bank, Washington DC.





Source: IPEA, IBGE


Figure 1: Urban Agglomerations by population size





Table 1: City Size Distribution

 Population size                  1970        1980        1991        2000
 > 5 million                         2       2 (1)       3 (2)           3
 2 million - 5 million               1           3           7           7
 1 million - 2 million               4           5           5           8
 500,000 - 1 million                 5          10          15          14
 250,000 - 500,000                  16          21          23          30
 100,000 - 250,000                  44          43          44          46
 < 100,000                          51          39          26          15
 Total number of cities            123         123         123         123
 Average size                  350,857     507,242     657,602     788,222
 Min                            20,864      41,454      76,816      86,720
 Max                         8,139,705  12,588,745  15,444,941  17,878,703
(1) São Paulo and Rio de Janeiro.
(2) Porto Alegre is newly added.





                       Table 2. Demand Side: Determinants of Income Per Worker (a, b, c)
                              (robust standard errors in parentheses)

                                      (1)               (2)                (3)            (4)
                                   GMM-IV              OLS              GMM-IV       Effect of an increase
                                                                                      in covariate,
                                                                                      based on (1)
    Average Schooling             0.298***          0.280***           0.271***          0.375
                                    (0.032)          (0.026)            (0.033)
   Ln(market potential)           0.363***          0.048**            0.333***          0.365
                                    (0.080)          (0.018)            (0.070)
     Ln(no. workers)              -0.304***           0.005            -0.290***         -0.344
  [ln(population) for (3)]          (0.095)          (0.016)            (0.079)
Ln(inter-city transport costs)      -0.216*           0.016             -0.178*          -0.074
                                    (0.112)          (0.032)            (0.092)
   state capital dummy               0.019           -0.090               0.075
                                    (0.146)          (0.062)            (0.144)


      time dummies                   Yes               Yes                Yes
       Observations                  369               369                369
             R2                                       0.807
     Hansen J statistic
 (overidentification test)           1.593                                1.439

         (p-value)                  (0.661)                             (0.696)
   Average of Partial R2             0.435                                0.425
  Average of Partial F's             52.67                                51.58
*** significant at 1% level; ** significant at 5% level; * significant at 10% level.

 a.   The instruments are semi-arid area dummy, ln(distance to state capital), ln(distance to São
      Paulo), manufacturing/service employment ratio (1970), infant mortality (1970), ln(humidity),
      average years of schooling (1970), state capital and time dummies.
 b.   GMM estimates are from the two-step efficient GMM in the presence of arbitrary
      heteroskedasticity and arbitrary intra-group (within-state) correlation.
 c.   OLS regressions are with robust cluster standard errors. We assume the observations may be
      correlated within states, but would be independent between states.





                                        Table 3. Population Supply (a, b, c)
                                   (robust standard errors in parentheses)

                                      (1)            (2)               (3)            (4)           (5)

                                  GMM-IV             OLS           GMM-IV         GMM-IV         GMM-IV
                                                                    (1980)          (1991)        (2000)
   Ln(income per capita)          2.370***        1.813***         1.830***        2.636***      2.886***
                                   (0.683)         (0.378)          (0.569)         (0.704)       (0.933)
Ln(rural income opportunities:    -5.151***       -4.152***        -4.821***      -5.316***      -5.624***
      market potential)            (1.454)         (0.819)          (1.457)         (1.354)       (1.824)
Ln(rural pop. supply market       5.851***        4.878***         5.559***        5.978***      6.317***
          potential)               (1.368)         (0.752)          (1.378)         (1.281)       (1.705)


       time dummies                  Yes             Yes               No             No            No
             R2                                     0.745
      Hansen J statistic
   (overidentification test)        1.909                            1.297           1.148         1.655

          (p-value)                 (.591)                          (0.730)         (0.765)       (0.647)
    Average of Partial R2           0.657                            0.691           0.644         0.662
    Average of Partial F's           55.50                           34.41           37.48         64.29
*** significant at 1% level; ** significant at 5% level; * significant at 10% level.

      a.   The instruments are semi-arid area dummy, ln(distance to São Paulo), ln(market pot. agric. land
           availability, 1970), port dummy, ln(per capita capital stock, 1970), southern region and time
           dummies.
      b.   GMM estimates are from the two-step efficient GMM in the presence of arbitrary
           heteroskedasticity and arbitrary intra-group (within-state) correlation.
      c.   OLS regressions are with robust cluster standard errors. We assume the observations may be
           correlated within states, but would be independent between states.





                             Table 4. City Size Equations (a, b, c, d)
                          (robust standard errors in parentheses)

                                           (1)                 (2)               (3)
                                       GMM-IV                OLS          Effect of an increase
                                                                             in covariate,
                                                                             based on (1)
      Ln(rural pop. supply)            1.661***           1.216***              1.558
                                         (0.643)           (0.425)
  Ln(rural income opportunities)       -3.664***          -1.999***             -3.701
                                         (0.894)           (0.600)
      Ln(market potential)             2.693***            1.426**              2.720
                                         (0.916)           (0.586)
       Average Schooling                0.220**            0.231**              0.277
                                         (0.091)           (0.106)
  Ln(inter-city transport costs)       -1.395***             0.081              -0.480
                                         (0.337)           (0.110)
      State capital dummy                -0.260           1.091***
                                         (0.395)           (0.170)


          time dummies                    Yes                 Yes
          Observations                     369                 369
               R2                                            0.801
        Hansen J statistic
     (overidentification test)            1.770

            (p-value)                    (.880)
      Average of Partial R2               .477
      Average of Partial F's             129.47
   *** significant at 1% level; ** significant at 5% level; * significant at 10% level.

a. The instruments are semi-arid area dummy, port dummy, illiteracy rate (1970), ln(industry
   capital per worker, 1970), ln(distance to state capital)*ln(market pot. agric. land availability,
   1970), ln(humidity), ln(avg. temperature), ln(rural pop. supply, 1970), ln(rural income
   opportunities, 1970), ln(market potential, 1970), and state capital and time dummies.
b. GMM estimates are from the two-step efficient GMM in the presence of arbitrary
   heteroskedasticity and arbitrary intra-group (within-state) correlation.
c. OLS regressions are with robust cluster standard errors. We assume the observations may be
   correlated within states, but would be independent between states.
d. Average of Partial R2 and Partial F's are for average schooling and Ln(inter-city transport
   costs). Market potential and gravity measures are almost completely correlated with those in
   1970 (Partial R2's are around .99).





                             Table 5-1. City Size Growth Equation (notes a-c)
                               (robust standard errors in parentheses)

                                         (1)               (2)               (3)             (4)
                                     GMM-IV               OLS            GMM-IV              OLS
 Ln(rural pop. supply market         9.188***          3.216***          9.429***         3.064***
           potential)                 (2.309)           (0.892)           (2.410)          (0.631)
 Ln(rural income opportunities:        0.756             0.364             0.358            0.198
      market potential)               (0.883)           (0.517)           (0.728)          (0.317)
    Ln(market potential)             2.294***          2.860***           1.284**         2.738***
                                      (0.761)           (0.798)           (0.512)          (0.551)
  Average schooling (t-1)            0.078***            0.021           0.071***           0.021
                                      (0.021)           (0.014)           (0.013)          (0.012)
     Average schooling                0.275*            0.067*           0.384***         0.097***
                                      (0.141)           (0.033)           (0.104)          (0.033)
 Ln(inter-city transport costs)      -0.078**          -0.092**          -0.089***        -0.088**
                                      (0.035)           (0.037)           (0.026)          (0.037)
     state capital dummy               0.016           0.080***          0.154***         0.129***
                                      (0.036)           (0.024)           (0.035)          (0.037)
     Ln(population) (t-1)                                                -0.047***         -0.018*
                                                                          (0.009)          (0.010)
     Manu / service (t-1)                                                0.140***         0.096***
                                                                          (0.027)          (0.019)


        time dummies                    Yes               Yes               Yes              Yes
        Observations                    246               246               246              246
              R2                                         0.364                              0.403
      Hansen J statistic
  (overidentification test)            5.786                               8.204

           (p-value)                   (.565)                              (.514)
    Average of Partial R2               .412                                .526
    Average of Partial F's            395.70                              2852.4
*** significant at 1% level; ** significant at 5% level; * significant at 10% level.

a.  For (1), instruments are the IV list of Table 4, ln(distance to São Paulo), ln(transport costs to São
    Paulo, 1968), and ln(transport costs to state capital, 1968). For (3), we drop ln(industry capital per
    worker, 1970) from (1), and add ln(population, 1970), manu/service ratio (1970), manu/service
    ratio(1970)*ln(population, 1970), manu/service ratio(1970)*ln(income per capita, 1970), and
    manu/service ratio(1970)*ln(market potential, 1970).
b.  GMM estimates are from the two-step efficient GMM in the presence of arbitrary
    heteroskedasticity and arbitrary intra-group (within-state) correlation.
c.  OLS regressions are with robust cluster standard errors. We assume the observations may be
    correlated within states, but would be independent between states.





                       Table 5-2. City Size Growth Equation (continued) (notes a-c)
                               (robust standard errors in parentheses)

                                                           (1)                (2)
                                                       GMM-IV                OLS
              Ln(rural pop. supply market               5.727**           3.227***
                         potential)                     (2.488)             (0.684)
             Ln(rural income opportunities:             -0.534               0.229
                      market potential)                 (0.917)             (0.359)
                   Ln(market potential)                  1.546             2.127***
                                                        (1.257)             (0.355)
                 Average schooling (t-1)               0.064***            0.035***
                                                        (0.016)             (0.011)
                    Average schooling                   0.323**            0.093**
                                                        (0.138)             (0.034)
              Ln(inter-city transport costs)            -0.082*             -0.059
                                                        (0.043)             (0.036)
                   state capital dummy                 0.139***           0.113***
                                                        (0.036)             (0.030)
                   Ln(population) (t-1)                -0.044***           -0.023**
                                                        (0.008)             (0.008)
                   Manu / service (t-1)                 0.067**             0.066**
                                                        (0.032)             (0.027)
                 Ln(homicide / pop) (t-1)              -0.115***          -0.092***
                                                        (0.033)             (0.025)
                 Public industry capital /             -0.764**             -0.780
               total industry capital in 1980           (0.298)             (0.502)


                       time dummies                       Yes                 Yes
                       Observations                       245                 245
                            R2                                               0.469
                     Hansen J statistic
                 (overidentification test)               5.549

                         (p-value)                       (.698)
                   Average of Partial R2                  .498
                   Average of Partial F's               3014.5
       *** significant at 1% level; ** significant at 5% level; * significant at 10% level.

a. Public industry capital / total industry capital (1980) is assumed to be exogenous by adding it to
   the IV list of (3).
b. GMM estimates are from the two-step efficient GMM in the presence of arbitrary
   heteroskedasticity and arbitrary intra-group (within-state) correlation.
c. OLS regressions are with robust cluster standard errors. We assume the observations may be
   correlated within states, but would be independent between states.





                        Table 6. Decomposition of City Size Growth

                            Coef. of            Mean b_i (a)                Decomposition of city growth
                            Table 5-1                                            (a_i × b_i / c), %
                            (3), a_i     Total     Large       Small        Total     Large       Small
                                                   cities (b)  cities (b)             cities (b)  cities (b)
 No. cities                              123       61          62
 Ln(city pop)                            0.226     0.264       0.188

 Ln(rural pop. supply        9.429      -0.006    -0.005      -0.008        -8.5      -6.5       -10.6
   market potential)
 Ln(market potential)        1.284       0.346     0.346       0.345        63.8      62.2        65.5
 Average schooling (t-1)     0.071       4.568     4.773       4.366        46.7      47.4        45.9
 Average schooling           0.384       1.208     1.215       1.201        66.7      65.3        68.2
 Ln(inter-city transport    -0.089      -0.215    -0.191      -0.239         2.8       2.4         3.1
   costs)
 State capital dummy         0.154       0.171     0.344       0.000         3.8       7.4         0.0
 Ln(population) (t-1)       -0.047      12.339    13.172      11.520       -83.4     -86.6       -80.1
 Manu / service (t-1)        0.140       0.406     0.428       0.385         8.2       8.4         8.0

 c = Σ_i (a_i × b_i)                     0.695     0.715       0.676
 Sum                                                                       100.0     100.0       100.0
a. Means are for the periods 2000-1991 and 1991-1980. For average schooling (t-1), the means are for
   1991 and 1980.
b. Cities are classified as large (small) if their population is above (below) the median city
   population in each year.
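
The arithmetic behind the "Total" columns of Table 6 can be retraced with a short sketch. It multiplies each coefficient a_i from Table 5-1, column (3), by the total-sample mean b_i and expresses the product as a share of c, the sum of all products. The inputs are the rounded figures printed in the tables, so the computed shares differ slightly from the published ones; the names `TERMS` and `decompose` are illustrative only.

```python
# Coefficients a_i (Table 5-1, column 3) and total-sample means b_i (Table 6).
# These are the rounded values printed in the tables, not the underlying data.
TERMS = {
    "ln(rural pop. supply market potential)": (9.429, -0.006),
    "ln(market potential)": (1.284, 0.346),
    "average schooling (t-1)": (0.071, 4.568),
    "average schooling": (0.384, 1.208),
    "ln(inter-city transport costs)": (-0.089, -0.215),
    "state capital dummy": (0.154, 0.171),
    "ln(population) (t-1)": (-0.047, 12.339),
    "manu / service (t-1)": (0.140, 0.406),
}


def decompose(terms):
    """Return c = sum_i a_i * b_i and each term's share of c in percent."""
    contributions = {name: a * b for name, (a, b) in terms.items()}
    c = sum(contributions.values())
    shares = {name: 100.0 * value / c for name, value in contributions.items()}
    return c, shares


c, shares = decompose(TERMS)

if __name__ == "__main__":
    print(f"c = {c:.3f}")
    for name, share in shares.items():
        print(f"{name:42s} {share:7.1f}%")
```

By construction the shares sum to 100 percent; offsetting positive and negative terms, such as schooling and lagged population, can each exceed the net total in absolute value.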





                            Table 7. Regression of City Growth Residuals (notes a, b)
                                 (robust standard errors in parentheses)

                                                                      (1)
                                                                     OLS
                           Laws to collect property tax              0.035
                                                                    (0.042)
                          % of pop under land zone law             0.050***
                                                                    (0.014)
                               Ln(no. formal firms /                0.046*
                           no. workers in formal firms)             (0.024)
                                 Ln(pop density)                     0.001
                                                                    (0.007)
                                 Small city dummy                 -0.044***
                                                                    (0.015)


                                  time dummies                        Yes
                                   Observations                       245
                                        R2                           0.093
           *** significant at 1% level; ** significant at 5% level; * significant at 10% level.

a. The small city dummy equals 1 if a city's population is below the median city population in that year.
b. OLS regressions are with robust cluster standard errors. We assume the observations may be
   correlated within states, but would be independent between states.





      Table 8. Policy Simulation: favoring largest cities versus smallest ones
      (½ standard deviation (.4) decrease in inter-city transport costs in 2000)

                               Total urban income relative to the
                                      baseline income (%)
      Comparison              Favoring largest   Favoring smallest         (b-a, %p)
                                  cities (a)          cities (b)
1 largest vs. 88 smallest         102.072              102.763               0.691
2 largest vs. 104 smallest        103.761              104.458               0.698
3 largest vs. 109 smallest        105.227              105.550               0.323
4 largest vs. 112 smallest        106.072              106.413               0.341
5 largest vs. 113 smallest        106.651              106.715               0.064
6 largest vs. 115 smallest        107.020              107.517               0.497
7 largest vs. 116 smallest        107.679              108.033               0.354





Appendix A. Means and Standard Deviations of Variables (N= 369, 123 cities for 3 years)



                        Variable                          Mean      Standard deviation

               Ln(income per worker)                      6.53           .279
               Average schooling                          5.13           1.26
               Ln(market potential)                       27.3           1.01
               Ln(inter-city trans. costs: 1980,
                  excluding state capitals)               .857           .344
               Ln(no. workers)                            11.5           1.13
               Ln(population)                             12.4           1.12
               Ln(rural pop. supply market potential)     20.2           .938
               Ln(rural income opportunities:
                  market potential)                       12.4           1.01





Appendix B. Market potential measures

(1) Basic Market Potential

Market potential of agglomeration i is defined as the sum of its member MCAs' market
potential. Therefore the market potential of agglomeration i in year t is

\[
\sum_{k_i \in i} \; \sum_{j=1}^{3659}
  \frac{y_j(t) \times pop_j(t)}{\left( A \, d_{k_i,j}^{\delta} \right)^{\sigma - 1}} ,
\]

where y_j(t) is per capita income of MCA j in year t, and pop_j(t) is the population of MCA j in
year t. d_{i,j} is the distance between MCA i and j (in units of 100 miles). The own-MCA distance
d_{i,i} is the average distance to the city center, which is equal to \frac{2}{3}\sqrt{area/\pi}.
\sigma is assumed to be 2, \delta is 0.3 (0.22 between two port cities), and A is such that
A d_{i,j}^{0.3} = 1 for the smallest land area city (Au and Henderson, 2004; Hummels, 2001).
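
As a rough illustration of measure (1), the sketch below computes the double sum for a toy set of MCAs. It rests on several assumptions that are not taken from the paper: the dictionary layout, the function names, the use of the own-MCA distance of the smallest-area MCA to calibrate A, and areas expressed in squared units of 100 miles so that they are commensurate with the distances.

```python
import math

SIGMA = 2.0         # exponent parameter, set to 2 in the text
DELTA = 0.3         # distance exponent in the transport cost term
DELTA_PORTS = 0.22  # distance exponent between two port cities


def own_distance(area):
    """Average distance to the centre of a disc of the given area: (2/3) * sqrt(area / pi)."""
    return (2.0 / 3.0) * math.sqrt(area / math.pi)


def calibrate_scale(mcas):
    """Pick A so that A * d**0.3 = 1 for the smallest-area MCA (own distance is an assumption)."""
    smallest_area = min(m["area"] for m in mcas.values())
    return 1.0 / own_distance(smallest_area) ** DELTA


def market_potential(members, mcas, dist, scale):
    """Sum y_j * pop_j / (A * d**delta)**(sigma - 1) over member MCAs k and all MCAs j."""
    total = 0.0
    for k in members:
        for j, m in mcas.items():
            if j == k:
                d, delta = own_distance(m["area"]), DELTA
            else:
                both_ports = mcas[k]["port"] and m["port"]
                d, delta = dist(k, j), (DELTA_PORTS if both_ports else DELTA)
            cost = scale * d ** delta
            total += m["income_pc"] * m["pop"] / cost ** (SIGMA - 1.0)
    return total


# Toy data for two hypothetical MCAs; distances are in units of 100 miles.
toy = {
    "k1": {"income_pc": 10.0, "pop": 100.0, "area": math.pi * 2.25, "port": False},
    "k2": {"income_pc": 5.0, "pop": 200.0, "area": math.pi * 9.0, "port": False},
}
A = calibrate_scale(toy)
mp = market_potential(["k1"], toy, lambda k, j: 1.0, A)
```

In this toy case the own distance of "k1" is 1, so A is 1 and each of the two MCAs contributes 1000 to the total.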

(2) Incomes offered in local rural areas competing with own city for local population

The gravity measure of surrounding rural per capita incomes is a market potential measure of
agglomeration i in year t, such that

\[
\sum_{k_i \in i} \; \sum_{\substack{j=1 \\ j \notin i}}^{3659}
  \frac{rural\ GDP_j(t) \, / \, rural\ pop_j(t)}{\left( A \, d_{k_i,j}^{\delta} \right)^{\sigma - 1}} .
\]

The MP calculation does not include the rural per capita incomes of MCAs in the same
agglomeration. All parameters are the same as in (1). Rural GDPs for 1970, 1980, 1985, and 1996
are assigned to the census years 1970, 1980, 1991, and 2000, respectively.

(3) Potential supply of people to the city from local rural areas

The gravity measure of surrounding rural population is also a market potential measure of
agglomeration i in year t, such that

\[
\sum_{k_i \in i} \; \sum_{\substack{j=1 \\ j \notin i}}^{3659}
  \frac{rural\ pop_j(t)}{\left( A \, d_{k_i,j}^{\delta} \right)^{\sigma - 1}} .
\]

The MP calculation is the same as (2).
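
Measures (2) and (3) differ from (1) mainly in that MCAs belonging to the agglomeration itself are left out of the inner sum. A minimal sketch of that exclusion follows; the function names, the `cost` callback (standing in for the transport cost term of (1)), and the toy numbers are all hypothetical.

```python
# Rural GDP year assigned to each census year, as stated in the text.
RURAL_GDP_YEAR = {1970: 1970, 1980: 1980, 1991: 1985, 2000: 1996}


def rural_gravity(members, rural_value, cost, sigma=2.0):
    """Sum rural_value[j] / cost(k, j)**(sigma - 1) over member MCAs k and outside MCAs j."""
    member_set = set(members)
    return sum(
        rural_value[j] / cost(k, j) ** (sigma - 1.0)
        for k in members
        for j in rural_value
        if j not in member_set  # skip MCAs of the same agglomeration
    )


def rural_income_pc(rural_gdp, rural_pop):
    """Rural GDP per rural inhabitant for each MCA."""
    return {j: rural_gdp[j] / rural_pop[j] for j in rural_gdp}


# Toy data: k1 and k2 form the agglomeration; j1 and j2 lie outside it.
members = ["k1", "k2"]
rural_pop = {"k1": 100.0, "k2": 50.0, "j1": 8.0, "j2": 4.0}
rural_gdp = {"k1": 1000.0, "k2": 500.0, "j1": 16.0, "j2": 12.0}


def flat_cost(k, j):
    """Placeholder transport cost term, constant across pairs for the toy example."""
    return 2.0


pop_supply = rural_gravity(members, rural_pop, flat_cost)  # measure (3)
income_opportunities = rural_gravity(  # measure (2)
    members, rural_income_pc(rural_gdp, rural_pop), flat_cost
)
```

Passing the cost term as a function keeps the exclusion logic separate from the distance and port parameters, which the text says are shared with (1).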

(4) Market potential measure of agricultural land availability

The agricultural land market potential is calculated in the same way as (1), such that

\[
\sum_{k_i \in i} \; \sum_{j=1}^{3659}
  \frac{agri\ land_j(t)}{\left( A \, d_{k_i,j}^{\delta} \right)^{\sigma - 1}} ,
\]

where agri land_j(t) is the agricultural area of MCA j in year t. All parameters are the same as
in the previous measures.





Appendix C. Data sources and definitions

There is no official definition of "city" or "agglomeration" in Brazil. The lowest administrative
level consists of more than 5000 municípios. However, these vary greatly in size, and many
functional economic and population agglomerations consist of a number of municípios. In this
paper, we therefore follow the example of a study of Brazilian urban dynamics by IPEA, IBGE
and UNICAMP (2002). It defined agglomerations based on their place in the urban hierarchy,
from "World Cities" (São Paulo and Rio de Janeiro) to subregional centers. For each
agglomeration, this study identified the municípios that were a functional part of the urban area.
The municípios belonging to each agglomeration were then further classified into eight categories
according to how tightly they are integrated in the agglomeration, from "maximum" to "very
weak". The main criteria used in these classifications were centrality, function as a center of
decision making, degree of urbanization, complexity and diversification of the urban areas, and
diversification of services. These were measured by a range of census and other variables such as
employed population in urban activities, urbanization rate, and population density. We modified
this classification slightly by also adding smaller municípios to existing agglomerations if their
population exceeded 75,000 and more than 75 percent of their residents lived in urban
areas in 1991, or if they were completely enclosed by an agglomeration.
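
The modification rule at the end of this paragraph can be written as a simple predicate. This is only a sketch: the function and argument names are hypothetical, and it assumes that the population threshold, like the urban share, refers to 1991, which the text does not state explicitly.

```python
def added_to_agglomeration(pop_1991, urban_share_1991, enclosed):
    """Return True if a município meets the stated criteria for being added.

    pop_1991: population (assumed here to be measured in 1991)
    urban_share_1991: fraction of residents living in urban areas in 1991
    enclosed: True if the município is completely enclosed by an agglomeration
    """
    meets_size_rule = pop_1991 > 75_000 and urban_share_1991 > 0.75
    return meets_size_rule or enclosed


examples = [
    added_to_agglomeration(90_000, 0.80, False),  # large and urban
    added_to_agglomeration(90_000, 0.60, False),  # large but not urban enough
    added_to_agglomeration(20_000, 0.40, True),   # small but enclosed
]
```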


The agglomeration definitions developed by IPEA, IBGE and UNICAMP (2002) are based on
município boundaries valid at the time of the Brazilian Population Census of 1991 and the
Population Count of 1996, while our study captures dynamics from 1970 to 2000. During this
time, many new municípios were created by splitting or re-arranging existing ones. In fact, the
number of municípios increased from 3951 to 5501 during these three decades. To create a
consistent panel of agglomerations for the 1970 to 2000 period, we therefore used the Minimum
Comparable Area (MCA) concept as implemented by IPEA researchers. MCAs group municípios
in each of the four census years so that their boundaries do not change during the study period.
All data were then aggregated to match these MCAs. The resulting data set represents 123
urban agglomerations that consist of a total of 447 MCAs.
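
The two-step aggregation described above, from municípios to MCAs and from MCAs to agglomerations, might look like the following for an additive variable such as population. The identifiers and lookup tables are invented for illustration; the actual município-to-MCA concordance is the one implemented by IPEA and is not reproduced here.

```python
from collections import defaultdict


def aggregate(values, mapping):
    """Sum values by the group each unit maps to; units missing from mapping are skipped."""
    totals = defaultdict(float)
    for unit, value in values.items():
        group = mapping.get(unit)
        if group is not None:
            totals[group] += value
    return dict(totals)


# Hypothetical population counts and concordances for one census year.
municipio_pop = {"m1": 120.0, "m2": 30.0, "m3": 50.0, "m4": 10.0}
municipio_to_mca = {"m1": "mca1", "m2": "mca1", "m3": "mca2", "m4": "mca3"}
mca_to_agglomeration = {"mca1": "agg1", "mca2": "agg1"}  # mca3 is not part of any agglomeration

mca_pop = aggregate(municipio_pop, municipio_to_mca)
agglomeration_pop = aggregate(mca_pop, mca_to_agglomeration)
```

Simple summation only suits counts and totals; rates and averages would need to be rebuilt from their aggregated numerators and denominators.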


The sources for the majority of data employed in this paper are the Brazilian Bureau of Statistics
(IBGE) Population and Housing Censuses of 1970, 1980, 1991 and 2000. We used the full
Brazilian census counts to get information about total population and housing conditions
(urbanization rate). Other data were collected only for a sample of households. We used this
census sample information for income, industrial composition, education, piped water provision,
and electricity availability. The sample sizes varied across census years (1970: 25 percent; 1980:
25 percent; 1991: 12.5 percent; 2000: 5 percent), but all are representative at the município
level, and thus are also reliable at the MCA level employed in this study. Income figures are
compiled from monthly data, deflated to year-2000 reais (R$).


The transportation costs (a proxy for transportation connectivity) between all Brazilian
municipalities and the nearest state capital, and between all Brazilian municipalities and São
Paulo, come from Professor Newton De Castro at the Federal University of Rio De Janeiro and
are available at www.ipeadata.gov.br.


The port and Brazilian region dummies are from the Bureau of Statistics (IBGE)
Municipalities Profile of 1999. Homicides are from the DATASUS / Brazilian Ministry of Health
dataset. Local government expenditures are from the Brazilian Treasury dataset of 1991 and
2000. Formal employment data are from the RAIS dataset / Brazilian Ministry of Labor. The
Morandi and Reis (2004) capital stock data employed in our analysis come from the Brazilian
Economic Censuses of 1970, 1975 and 1980.





                    Appendix D. Robustness test for spatial dependence

                 Table A. Demand Side: Determinants of Income Per Worker (notes a, b)
                (standard errors corrected for spatial dependence in parentheses)

                                              (1)              (2)                (3)
                                        Spatial GMM        Spatial OLS      Spatial GMM
            Average Schooling             0.286***          0.280***          0.260***
                                            (0.032)          (0.023)            (0.030)
           Ln(market potential)           0.404***          0.048***          0.371***
                                            (0.083)          (0.016)            (0.069)
              Ln(no. workers)            -0.318***            0.005           -0.304***
          [ln(population) for (3)]          (0.113)          (0.018)            (0.092)
       Ln(inter-city transport costs)     -0.246**           0.016             -0.218**
                                            (0.122)          (0.024)            (0.102)
            state capital dummy             -0.010          -0.090**            0.041
                                            (0.157)          (0.039)            (0.143)


               time dummies                  Yes              Yes                Yes
                Observations                 369               369               369
             Hansen J statistic
         (overidentification test)           0.884                               0.901

      *** significant at 1% level; ** significant at 5% level; * significant at 10% level.

a. The instruments are semi-arid area dummy, ln(distance to state capital), ln(distance to São Paulo),
   manufacturing/service employment ratio (1970), infant mortality (1970), ln(humidity), average
   years of schooling (1970), state capital and time dummies.
b. Coordinate variables are latitude and longitude. Cutoffs are 1.5 standard deviations of latitude and
   longitude (10.23, and 8.20), which correspond to about 900 miles.
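
To make the role of the cutoffs concrete, the sketch below assigns a weight to each pair of cities from their coordinate differences, with zero weight once either difference reaches its cutoff. The product of linearly declining (Bartlett-type) windows used here is an assumption, chosen as one common kernel for spatially corrected standard errors; the note above states the cutoffs but not the kernel, and the function name is illustrative.

```python
LAT_CUTOFF = 10.23  # 1.5 standard deviations of latitude, from the table note
LON_CUTOFF = 8.20   # 1.5 standard deviations of longitude, from the table note


def spatial_weight(lat1, lon1, lat2, lon2,
                   lat_cutoff=LAT_CUTOFF, lon_cutoff=LON_CUTOFF):
    """Weight on the error cross-product of two cities (assumed product Bartlett window).

    The weight is 1 for coincident points, declines linearly in each coordinate
    difference, and is 0 once either difference reaches its cutoff.
    """
    dlat = abs(lat1 - lat2)
    dlon = abs(lon1 - lon2)
    if dlat >= lat_cutoff or dlon >= lon_cutoff:
        return 0.0
    return (1.0 - dlat / lat_cutoff) * (1.0 - dlon / lon_cutoff)


same_point = spatial_weight(-23.5, -46.6, -23.5, -46.6)
half_lat = spatial_weight(0.0, 0.0, LAT_CUTOFF / 2.0, 0.0)
too_far = spatial_weight(0.0, 0.0, 11.0, 0.0)
```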





                                         Table B. Population Supply (notes a, b)
                        (standard errors corrected for spatial dependence in parentheses)

                                       (1)             (2)             (3)            (4)              (5)
                                 Spatial GMM      Spatial OLS     Spatial GMM Spatial GMM Spatial GMM
   Ln(income per capita)            2.539***        1.813***        1.846***       2.771***        3.072***
                                     (0.624)         (0.359)         (0.476)        (0.613)         (0.879)
Ln(rural income opportunities:     -5.536***       -4.152***       -4.873***       -5.638***      -6.040***
      market potential)              (1.445)         (0.830)         (1.285)        (1.334)         (1.849)
Ln(rural pop. supply market         6.231***        4.878***        5.615***       6.313***        6.719***
           potential)                (1.376)         (0.788)         (1.223)        (1.276)         (1.755)


       time dummies                   Yes              Yes             No             No               No
        Observations                  369              369            123             123             123
      Hansen J statistic
   (overidentification test)          1.355                           1.014          1.463           1.684

*** significant at 1% level; ** significant at 5% level; * significant at 10% level.

    a.    The instruments are semi-arid area dummy, ln(distance to São Paulo), ln(market pot. agric. land
          availability, 1970), port dummy, ln(per capita capital stock, 1970), southern region and time
          dummies.
    b.    Coordinate variables are latitude and longitude. Cutoffs are 1.5 standard deviations of latitude and
          longitude (10.23, and 8.20), which correspond to about 900 miles.





                                 Table C. City Size Equations (notes a, b)
               (standard errors corrected for spatial dependence in parentheses)

                                                        (1)                 (2)
                                                   Spatial GMM        Spatial OLS
                   Ln(rural pop. supply)             1.706***           1.216***
                                                      (0.635)            (0.386)
              Ln(rural income opportunities)         -3.317***         -1.999***
                                                      (0.864)            (0.462)
                   Ln(market potential)              2.322***           1.426***
                                                      (0.660)            (0.468)
                    Average Schooling                 0.181*             0.231**
                                                      (0.099)            (0.112)
               Ln(inter-city transport costs)        -1.346***            0.081
                                                      (0.280)            (0.083)
                   State capital dummy                -0.211            1.091***
                                                      (0.330)            (0.187)


                      time dummies                      Yes                Yes
                       Observations                     369                369
                     Hansen J statistic
                  (overidentification test)            1.659

       *** significant at 1% level; ** significant at 5% level; * significant at 10% level.

a. The instruments are semi-arid area dummy, port dummy, illiteracy rate (1970), ln(industry capital
   per worker, 1970), ln(distance to state capital)*ln(market pot. agric. land availability, 1970),
   ln(humidity), ln(avg. temperature), ln(rural pop. supply, 1970), ln(rural income opportunities,
   1970), ln(market potential, 1970), and state capital and time dummies.
b. Coordinate variables are latitude and longitude. Cutoffs are 1.5 standard deviations of latitude and
   longitude (10.23, and 8.20), which correspond to about 900 miles.





                             Table D-1. City Size Growth Equation (notes a, b)
                 (standard errors corrected for spatial dependence in parentheses)

                                        (1)                (2)               (3)             (4)
                                  Spatial GMM        Spatial OLS       Spatial GMM      Spatial OLS
 Ln(rural pop. supply market         8.894***          3.216***          5.590***        3.064***
           potential)                 (2.078)           (0.703)            (1.790)        (0.639)
 Ln(rural income opportunities:        2.300             0.364             -0.700          0.198
      market potential)               (1.834)           (0.389)            (0.738)        (0.271)
    Ln(market potential)               1.837           2.860***          3.956***        2.738***
                                      (1.266)           (0.674)           (0.953)         (0.606)
  Average schooling (t-1)              0.036             0.021           0.063***          0.021*
                                      (0.027)           (0.013)           (0.016)         (0.012)
     Average schooling                 0.115           0.067**           0.604***        0.097***
                                      (0.117)           (0.031)           (0.116)         (0.026)
 Ln(inter-city transport costs)     -0.121***         -0.092***           -0.132**       -0.088***
                                      (0.044)           (0.027)           (0.051)         (0.025)
     state capital dummy              0.080**          0.080***          0.220***        0.129***
                                      (0.033)           (0.026)           (0.037)         (0.033)
     Ln(population) (t-1)                                                -0.057***        -0.018*
                                                                          (0.009)         (0.010)
     Manu / service (t-1)                                                0.190***        0.096***
                                                                          (0.033)         (0.018)


        time dummies                    Yes               Yes               Yes             Yes
        Observations                    246               246               246              246
      Hansen J statistic
  (overidentification test)            3.582                               5.381

*** significant at 1% level; ** significant at 5% level; * significant at 10% level.

a. For (1), instruments are the IV list of Table 4, ln(distance to São Paulo), ln(transport costs to São
    Paulo, 1968), and ln(transport costs to state capital, 1968). For (3), we drop ln(industry capital per
    worker, 1970) from (1), and add ln(population, 1970), manu/service ratio (1970), manu/service
    ratio(1970)*ln(population, 1970), manu/service ratio(1970)*ln(income per capita, 1970), and
    manu/service ratio(1970)*ln(market potential, 1970).
b. Coordinate variables are latitude and longitude. Cutoffs are 1.5 standard deviations of latitude and
    longitude (10.23, and 8.20), which correspond to about 900 miles.





                        Table D-2. City Size Growth Equation (continued) (notes a, b)
                 (standard errors corrected for spatial dependence in parentheses)

                                                          (5)                 (6)
                                                     Spatial GMM         Spatial OLS
              Ln(rural pop. supply market              5.815***           3.227***
                         potential)                     (1.779)             (0.655)
             Ln(rural income opportunities:             -0.632               0.229
                      market potential)                 (0.720)             (0.244)
                   Ln(market potential)                  1.257             2.127***
                                                        (0.890)             (0.480)
                 Average schooling (t-1)               0.066***            0.035***
                                                        (0.016)             (0.010)
                    Average schooling                  0.489***            0.093***
                                                        (0.092)             (0.024)
              Ln(inter-city transport costs)           -0.107**            -0.059**
                                                        (0.047)             (0.025)
                   state capital dummy                 0.183***           0.113***
                                                        (0.038)             (0.025)
                   Ln(population) (t-1)                -0.056***          -0.023***
                                                        (0.008)             (0.009)
                   Manu / service (t-1)                0.131***           0.066***
                                                        (0.031)             (0.022)
                 Ln(homicide / pop) (t-1)              -0.105***          -0.092***
                                                        (0.031)             (0.023)
                 Public industry capital /               0.006              -0.780*
               total industry capital in 1980           (0.385)             (0.425)


                       time dummies                       Yes                Yes
                       Observations                       245                245
                     Hansen J statistic
                 (overidentification test)               3.945

       *** significant at 1% level; ** significant at 5% level; * significant at 10% level.

a. Public industry capital / total industry capital (1980) is treated as exogenous and is therefore
   added to the IV list of (3).
b. Coordinate variables are latitude and longitude. Cutoffs are 1.5 standard deviations of latitude and
   longitude (10.23 and 8.20, respectively), which correspond to about 900 miles.
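
The significance markers in the table can be checked approximately from each coefficient and its standard error. The sketch below assumes a two-sided test against standard normal critical values (1.645, 1.960, 2.576); the paper does not state the reference distribution, and the function name `stars` is hypothetical. Because the reported figures are rounded to three decimals, cases close to a threshold may not reproduce exactly.

```python
def stars(coef, se):
    """Return the significance marker implied by |coef / se|.

    Assumes two-sided tests with standard normal critical values;
    this is an illustrative convention, not taken from the paper.
    """
    z = abs(coef / se)
    if z >= 2.576:   # 1% level
        return "***"
    if z >= 1.960:   # 5% level
        return "**"
    if z >= 1.645:   # 10% level
        return "*"
    return ""


# Entries from columns (5) and (6) of Table D-2 (continued).
rural_pop_gmm = stars(5.815, 1.779)      # rural pop. supply, Spatial GMM
transport_gmm = stars(-0.107, 0.047)     # inter-city transport costs, Spatial GMM
public_cap_ols = stars(-0.780, 0.425)    # public industry capital share, Spatial OLS
rural_inc_ols = stars(0.229, 0.244)      # rural income opportunities, Spatial OLS
```

For these four entries the ratios are roughly 3.27, 2.28, 1.84, and 0.94, which agree with the markers shown in the table.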



