JOBS WORKING PAPER
Issue No. 21

THE DETERMINANTS OF FIRM LOCATION IN TANZANIA
Javier Sanchez-Reaza

P155013

Activities under the Let’s Work Partnership are supported by grants under the Jobs Umbrella Multidonor Trust Fund and/or IFC Let’s Work Multidonor Trust Fund.

© 2018 International Bank for Reconstruction and Development / The World Bank. 1818 H Street NW, Washington, DC 20433, USA. Telephone: 202-473-1000; Internet: www.worldbank.org. Some rights reserved.

This work is a product of the staff of The World Bank with external contributions. The findings, interpretations, and conclusions expressed in this work do not necessarily reflect the views of The World Bank, its Board of Executive Directors, or the governments they represent. The World Bank does not guarantee the accuracy of the data included in this work. The boundaries, colors, denominations, and other information shown on any map in this work do not imply any judgment on the part of The World Bank concerning the legal status of any territory or the endorsement or acceptance of such boundaries. Nothing herein shall constitute or be considered to be a limitation upon or waiver of the privileges and immunities of The World Bank, all of which are specifically reserved.

Rights and Permissions

This work is available under the Creative Commons Attribution 3.0 IGO license (CC BY 3.0 IGO) http://creativecommons.org/licenses/by/3.0/igo. Under the Creative Commons Attribution license, you are free to copy, distribute, transmit, and adapt this work, including for commercial purposes, under the following conditions:

Attribution: Please cite the work as follows: Javier Sanchez-Reaza. 2018. “The Determinants of Firm Location in Tanzania.” World Bank, Washington, DC. License: Creative Commons Attribution CC BY 3.0 IGO.
Translations: If you create a translation of this work, please add the following disclaimer along with the attribution: This translation was not created by The World Bank and should not be considered an official World Bank translation. The World Bank shall not be liable for any content or error in this translation.

Adaptations: If you create an adaptation of this work, please add the following disclaimer along with the attribution: This is an adaptation of an original work by The World Bank. Views and opinions expressed in the adaptation are the sole responsibility of the author or authors of the adaptation and are not endorsed by The World Bank.

Third-party content: The World Bank does not necessarily own each component of the content contained within the work. The World Bank therefore does not warrant that the use of any third-party-owned individual component or part contained in the work will not infringe on the rights of those third parties. The risk of claims resulting from such infringement rests solely with you. If you wish to re-use a component of the work, it is your responsibility to determine whether permission is needed for that re-use and to obtain permission from the copyright owner. Examples of components can include, but are not limited to, tables, figures, or images.

All queries on rights and licenses should be addressed to World Bank Publications, The World Bank Group, 1818 H Street NW, Washington, DC 20433, USA; fax: 202-522-2625; e-mail: pubrights@worldbank.org.

Images: © World Bank Tanzania. Further permission required for reuse.

ABSTRACT

This paper identifies the factors that affect the location of firms in Tanzania. Using a negative binomial econometric strategy to address data gaps in firm location at the ward level, the paper groups factors into firm characteristics, market features, and two types of agglomeration economies that capture economies of scale external to the firm.
The benefits of agglomeration may stem from specialization within and among firms (referred to in the literature as localization economies) or from diversification across firms (referred to as urbanization economies). The distinction between these two lies at the heart of the discussion on firm location. Regression results indicate that, of the various factors tested, the most important determinant of firm location is the job diversification aspect of urbanization economies. Other contributing factors are localization economies (job specialization), competitive markets, and market access. Based on these findings, policymakers seeking to foster agglomeration could orient policies toward promoting firm entry within cities, complementary investments in urban infrastructure and the urban pool of labor, regulations that support competition, and improvements in market access for large cities. But localization economies are also significant in Tanzania, and could be encouraged through investment in smaller population centers and by increasing competition and market access beyond the primary urban centers of Dar-es-Salaam and Arusha.

ACKNOWLEDGEMENTS

This report was prepared by the World Bank Group’s (WBG) Jobs Group. The author is Javier Sanchez-Reaza (Task Team Leader). The report was prepared under the general direction and ongoing support of David Robalino and Ian Walker. The author is particularly grateful to World Bank Country Director, Bella Bird, the Country Program Coordinator, Preeti Arora, and the Program Leader for Social Protection and Jobs, Gayle Martin, for their support. The author acknowledges the rich comments provided by the peer reviewers of this document: Yutaka Yoshino (Program Leader, AFCE1), Nancy Lozano (Senior Economist, GSU10) and Nadia Belhaj Hassine Belghith (Senior Economist, GPV01). The author is also grateful for comments provided by Tom Farole, Alvaro Gonzalez, Elizabeth Ruppert-Bulmer and Ian Walker.
This report is part of the Let’s Work Program Tanzania (P155013). The author is thankful to the partners in the Let’s Work Program for feedback and advice. The Let’s Work Partnership in Tanzania is made possible through a grant from the World Bank’s Jobs Umbrella Multidonor Trust Fund (MDTF), which is supported by the Department for International Development/UK AID, the Swiss Secretariat for Economic Affairs (SECO), the Private Infrastructure Development Group (PIDG), and the Governments of Norway, Germany, Austria, the Austrian Development Agency, and the Swedish International Development Cooperation Agency.

CONTENTS

ABSTRACT
ACKNOWLEDGEMENTS
ABBREVIATIONS
INTRODUCTION
    Motivation, Objectives and Units of Analysis
SNAPSHOT OF THE FIRM LANDSCAPE IN TANZANIA
METHODOLOGY
THE MODEL AND CHOICE OF VARIABLES
REGRESSION RESULTS
CONCLUSIONS AND POLICY DISCUSSION
ANNEX A
REFERENCES

ABBREVIATIONS

BRS      Business Registry Survey
CDM      Count-Data Models
HHI      Hirschman-Herfindahl Index
HIID     Hachman Index of Industrial Diversity
ISIC     International Standard Industrial Classification
NBREG    Negative Binomial Regression
NBS      National Bureau of Statistics
NEG      New Economic Geography
PHC      Population and Housing Census
SAGCOT   Southern Agricultural Growth Corridor of Tanzania
WBG      World Bank Group
ZINB     Zero-Inflated Negative Binomial Regression

INTRODUCTION

The location of production in space has long been an underlying feature of trade and economic growth theory. For Adam Smith, the main condition for growth was the division of labor induced by trade (Evans, 1989). When traditional trade models were developed by Heckscher (1919) and Ohlin (1933), trade theory was founded upon the general localization theory (Ohlin, 1933: 243). The places where production is located implicitly underpin our understanding of trade and growth.

After a long interval during which spatial considerations were largely ignored in the literature, new technical insights brought space explicitly into economic models. Beginning in the 1990s, the New Economic Geography (NEG) emerged as a framework for rigorously addressing the role of space. The “core-periphery” model incorporates Dixit-Stiglitz monopolistic competition approaches that view concentration and dispersion of economic activity as the result of two countervailing forces: centripetal and centrifugal (Krugman, 1991 and 1992; Krugman and Venables, 1995; Fujita, Krugman and Venables, 1999).
Firms choose to locate where they can find productive advantages. If firms can produce efficiently on their own, reaping the gains of (internal) economies of scale, why do they cluster in space? Clustering raises the demand for land in a given place, which would presumably increase land prices and lead to congestion costs. The answer may lie in economies of scale that are external to the firm. These external economies that attract firms to particular places have been referred to as agglomeration economies, and the literature has disentangled what Marshall (1920) said was “in the air”.

Duranton and Puga (2004) classify agglomeration economies into sharing, matching and learning. Sharing covers (i) indivisible goods and facilities, (ii) the gains from variety (diversification), (iii) the gains from individual specialization, and (iv) risks. Matching in labor and product markets generates agglomeration economies by improving (i) the quality of matches between workers and employers, and between producers and consumers, and/or (ii) the chances of matching. Learning arises because proximity amongst economic agents facilitates both knowledge generation and knowledge diffusion. In addition, agglomeration economies can extend in at least three dimensions, namely geographically, industrially and temporally (Rosenthal and Strange, 2004).

Two types of agglomeration economies have dominated the discussion in the literature: localization economies and urbanization economies (Glaeser et al., 1992). One of the leading factors that contribute to agglomeration is the economies of scale that accrue to the firm but are external to its production process. These external economies of scale arise due to proximity to other firms.
But whether those external economies are due to proximity to firms in the same industry or to firms in other sectors is still under debate.1

1 Henderson (1986) argues that external economies of scale are based on localization and not urbanization economies. For Nakamura (1985), which type of external economies explains concentration depends on city size and location.

Localization economies (or Marshall-Arrow-Romer externalities) are external to the firm, but internal to the industry and the region. Localization economies are based on the gains from specialization: intra-industry specialization, labor market economies for workers with industry-specific training, or knowledge spillovers in that industry that speed up the adoption of innovation (Henderson, 1983). Ultimately, gains are realized when firms can find backward and forward linkages that allow them to specialize jobs in a narrower set of tasks and products, and that lead to a quicker adoption of innovation in an industry.

Urbanization economies are external to the firm and the industry, but internal to the region (Moomaw, 1988). The concept, introduced by Chinitz (1961) and Jacobs (1961, 1969), posits that firms derive benefits (so-called Jacobs externalities) from the presence of firms in other industries, for example by improving their operation through complementary industries such as business services or logistics.

The distinction between the two types of agglomeration economies is important for understanding how firms benefit from clustering, and whether the main pull factor behind industrial location patterns is regional industrial diversity or specialization (Jordaan and Sanchez-Reaza, 2006).

MOTIVATION, OBJECTIVES AND UNITS OF ANALYSIS

A better understanding of firm location can help explain jobs dynamics in Tanzania.
The World Bank’s Jobs Diagnostic shows that 77 percent of firms employ between one and four workers and are thus micro enterprises. Together, micro and small firms make up 96 percent of all firms in the country (World Bank, forthcoming). Given that only a few large firms are present in the country, places with a greater number of firms can become places where jobs are created. Understanding what makes firms locate in particular places can help identify places with potential for jobs.

But to understand firm location, it is crucial to identify what type of agglomeration economies favor firm concentration. The paper starts with the idea that firms concentrate in space to seize the benefits of agglomeration economies. In the case of Tanzania, that concentration is evident, as explored in the following section. Exploring which type of agglomeration economies explains firm location is useful in two ways. First, given the large share that micro and small firms represent in Tanzania, a higher number of firms is associated with a higher number of jobs. Second, the distinction can shed some light on whether policies aimed at fostering specialization, such as cluster policies, or those aimed at diversification, such as urbanization, are associated with a greater number of firms in given locations.

The agglomeration framework described above is used to examine firm-location determinants in Tanzania. Tanzania’s economic development path was largely based on State intervention until the 1990s, when a more export-oriented strategy was adopted. During the pre-1990s period of a communalist approach to development, primary activities were the most important source of jobs and growth.2 It is possible that the communalist approach helped consolidate traditional urban centers such as Arusha and Dar-es-Salaam.
However, it is uncertain whether the subsequent policy approach of economic liberalization, starting in the nineties, spurred firm location in other urban centers or in rural areas.

2 From the 1967 Arusha Declaration to the mid-eighties, Tanzania embraced Ujamaa Socialism. Under the leadership of Julius Nyerere, Tanzania became independent in 1961. In the eyes of its leader, there were only minor changes that were at odds with his model of society. Ujamaa Socialism was designed with the idea of a communalist society based on freedom, self-reliance, and family (Nord et al., 2009).

BOX 1. DATA DESCRIPTION

The analysis uses the 2014/15 Business Register Survey (BRS), the most comprehensive and updated data set on firms in Mainland Tanzania. The 2014/15 BRS includes nearly four times the observations in the 2005 Integrated Business Survey, and almost twice the observations in the 2010 Integrated Business Survey. Differences in sectoral coverage preclude accurate comparisons across the three surveys.

Wards are the smallest administrative level for which data is available. Tanzania’s Constitution divides the country into 30 regions. Each region is then divided into districts, and these in turn into divisions. The latter are made up of wards, which are the smallest unit of analysis for which enterprise data is available and one of the lowest levels of government. According to Tanzania’s National Bureau of Statistics (NBS), Mainland Tanzania is divided into 3,312 wards (see Annex A, Figure A1).

The 2014/15 BRS provides data on 154,618 firms. Although district names are missing for 1,565 firms and ward information is missing for 697 firms, it is possible to assign almost all missing location data using information from the area field (a lower territorial unit than the ward). Our sample captures over 99 percent of the firms from the Business Register Survey, and is representative at the regional level (Figure 1).
Figure 1. Share of firms included in the sample by region, 2014/15
Source: Own calculations based on data from the NBS (2016).
Note: For 2014/15 a sample of 153,509 firms is used. The following regions are not displayed in the figure due to lack of firm data: 51. Kaskazini Unguja, 52. Kusini Unguja, 53. Mjini Magharibi, 54. Kaskazini Pemba, and 55. Kusini Pemba.

The research question is what drives firm location. To be precise, the paper aims at understanding what regional characteristics determine the number of firms at the ward level. To do that, the chosen unit of analysis is the ward. Tanzania is divided into thirty-one regions or mikoa. Mainland Tanzania comprises, since 2016, twenty-five regions. Since the paper aims at understanding the links among firms, a regional level of analysis was too coarse: even the smallest of regions (Dar es Salaam) covers nearly 1,400 square kilometers. Regions are further subdivided into 169 districts, and 157 of those are in Mainland Tanzania. Although districts were a viable unit of analysis, some of the districts included population centers that were deemed too large to contain viable backward and forward linkages. For instance, Kinondoni Municipal Council’s population stood at nearly 1.8 million people in 2012. Similarly, 2012 population figures for Temeke Municipal Council and Geita District Council stood at nearly 1.4 million and over 800 thousand respectively. As a result, the ward level, with over 3,300 wards, was chosen as the most precise unit of analysis.

Although the geographical extension of wards can vary widely in Tanzania, wards do not vary that much in terms of demographics. In fact, the geographical size of wards was determined by population size, which renders them somewhat homogenous; each ward cannot exceed 21 thousand people.

However, the spatial precision offered by the ward could come at a cost in identifying some firm linkages.
Wards undoubtedly offer the best spatial precision on which the paper could base its analysis. However, as some wards might be small, particularly in larger urban areas, there is a greater chance that firm linkages extending beyond the ward are left out of the estimation. The paper therefore faces a clear trade-off between spatial precision and the estimation of linkages. I chose spatial precision since, inevitably, some linkages will go beyond the ward, the district and even the region. A greater precision for firm linkages could be achieved with dedicated surveys that aim at describing their types, benefits and geographic extent, all of which are beyond the scope of the paper.3

3 Linkages across wards can be best addressed through spatial econometric techniques that assume spatial auto-correlation, but these are beyond the scope of this paper.

The analysis below explores several factors that may affect firm location in Tanzania. The paper considers factors linked to firm characteristics, market characteristics, and the relative roles of localization and urbanization economies in driving the observed agglomeration. The remainder of the paper begins with a snapshot of the current landscape of firms in Tanzania, discusses the methodology, describes the econometric model and choice of explanatory variables, presents the results, and concludes with a discussion of policy implications.

SNAPSHOT OF THE FIRM LANDSCAPE IN TANZANIA

Firms in Tanzania are relatively balanced across manufacturing, commerce and services. Each of the three major sectors accounts for around one-third of the sample (Figure 2): manufacturing (35 percent), commerce (34 percent), and services (29 percent). Other sectors in the sample with minor shares were agriculture-fisheries-mining (0.66 percent) and construction (0.32 percent).

Figure 2. Sample firms by sector, 2014/15
Source: Own calculations based on data from the NBS (2016).
Note: Includes 153,509 firms, which represent over 99 percent of the 2014/15 Business Register Survey.

Tanzanian firms are scattered around the country, but some cities exhibit higher concentration. The highest concentration and density of firms is found in the commercial capital, Dar-es-Salaam (Figure 3). Other urban areas also show high degrees of firm concentration. Urban areas in the SAGCOT corridor, including Dodoma, Morogoro, Iringa and Mbeya, are notable for denser firm concentrations. Some northern urban areas also stand out: Arusha, Bukoba and Mwanza.

Figure 3. Number of firms and firm density per ward, 2014/15

Larger firms are more spatially concentrated than smaller firms. The 2014/15 BRS data indicates that micro firms are present in a large part of the country (see Panel a in Figure 4). Small firms are more concentrated than micro firms (Panel b in Figure 4), and medium-sized firms are more concentrated still (Panel c in Figure 4). Large firms show the highest degree of concentration, notably in and around Dar-es-Salaam and a few other wards in the country (Panel d in Figure 4).4

4 Firms face challenges to grow. The World Bank’s Jobs Diagnostic for Tanzania argues that firms continue to be predominantly micro and small, particularly in manufacturing and commerce. While young and small firms may signal a healthy economy, large firms in Tanzania are crucial for non-farm employment. Large firms account for most industrial employment and value added, but they also represent an overwhelming share of their markets. A near absence of medium-sized firms may signal that micro and small firms are unable to grow and fill the void (World Bank, forthcoming). Although large firms make a disproportionate contribution to non-farm employment, the bulk of jobs lie with micro and small firms.

Figure 4. Number of firms by size per ward, 2014/15
Panel a. Micro firms per ward
Panel b. Small firms per ward
Panel c. Medium-sized firms per ward
Panel d. Large firms per ward
Source: Own calculations based on data from the NBS (2016).

What leads firms to concentrate in these particular locations? Most firm concentration is in or around urban areas. But not all urban areas are the same. On the one hand, cities of all sizes allow for market coordination, and provide firms with a pooled labor market. On the other hand, agglomeration economies operate differently from one city to the next, and depend on the role of the linkages among firms. More precisely, specialization in a set of activities can be beneficial for a range of reasons, some more relevant than others in particular locations. Specialization may be beneficial due to (i) gains in productivity achieved by narrowing production processes to certain activities, (ii) competitive advantages that other firms have in related and complementary activities, or (iii) improvements that arise from competition. Alternatively, firms may derive greater benefits from being part of or accessing the diverse economic base present in cities. In other words, the diversity in economic activities may bring about better business services that enhance productivity and promote innovation in diversified production structures.

METHODOLOGY

This section presents a model to test whether agglomeration economies are a significant determinant of firm location in Tanzania, and which type of agglomeration is at work: localization and specialization, or urbanization and diversification. It discusses the selection of explanatory variables, not only for measuring the impact of agglomeration economies, but also for capturing other factors that influence the spatial concentration of firms.

The paper uses a non-linear model that is well suited to handling zeros in the data. Because firms tend to concentrate in space, not all wards in Mainland Tanzania had firms registered by the 2014/15 survey. In 2014/15, the BRS had around 220 wards with no firms (Figure 5).
Measuring the attributes that attract firms to a particular location requires departing from linear models in order to handle zeros appropriately.5 The paper therefore employs Count-Data Models (CDM).6 Since the number of firms in a location is not normally distributed, linearity is not assumed.

Figure 5. Frequency of firms per ward in Mainland Tanzania, 2014/15 (horizontal axis: number of firms)
Source: Own calculations based on data from the NBS (2016).

5 Given the characteristics of count data, using the Ordinary Least Squares (OLS) method results in biased, inefficient, and inconsistent estimates (Long, 1997). Indeed, count data can result in skewed distributions cut off at zero, making it unreasonable to assume that the response variables and resulting errors follow a normal distribution. Thus, the problem of non-linearity should be handled through non-linear functions that transform the expected value of the count variable into a linear function of the explanatory variables (e.g. Count Data Models).

6 The number of firms per ward is a nonnegative integer variable, also called an event count.

The analysis requires a methodological choice between two models that can handle zeros appropriately. An absence of firms (a zero value in the count of firms in a ward) can arise for two different reasons: lack of survey coverage in a particular ward, or a genuine absence of firms in that ward. In data sets with excessive numbers of zeros, the conditional variance can be much larger than the conditional mean, a phenomenon called over-dispersion in the literature. Because Poisson models assume that the conditional variance of the dependent variable is equal to the conditional mean, they would fit the data poorly. We therefore consider two alternative models: (i) zero-inflated negative binomial regression (ZINB), and (ii) negative binomial regression (NBREG).
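As a rough illustration of the over-dispersion argument above, the sketch below compares the variance and the mean of a small set of hypothetical ward-level firm counts. The numbers are invented for illustration and are not BRS data, and the check is unconditional (it ignores covariates), so it is only a first diagnostic and not the test used in the paper.

```python
from statistics import mean, pvariance

# Hypothetical firm counts per ward (illustrative only, not BRS data):
# several empty wards, many small counts, and a few dense urban wards.
firms_per_ward = [0, 0, 0, 1, 2, 2, 3, 5, 8, 13, 40, 250]


def dispersion_ratio(counts):
    """Return variance / mean of the counts.

    A Poisson variable has a ratio of 1; values well above 1
    point to over-dispersion.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("mean is zero; the ratio is undefined")
    return pvariance(counts) / m


ratio = dispersion_ratio(firms_per_ward)
zero_share = sum(1 for c in firms_per_ward if c == 0) / len(firms_per_ward)

print(f"mean = {mean(firms_per_ward):.1f}")
print(f"variance / mean = {ratio:.1f}")
print(f"share of wards with no firms = {zero_share:.2f}")
```

A ratio far above one, as in this made-up sample, is the kind of pattern that motivates moving from a Poisson model to a negative binomial specification.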
The presence of over-dispersion in the Tanzania data would typically justify the use of the zero-inflated negative binomial regression, but the NBREG model is a better fit for the data.7 The selected NBREG model will have two parts. A logit model is first used to test the reasons for the presence of zeros, followed by a count data model, in our case the NBREG, to perform the count process.

7 NBREG models can also be used for over-dispersed count data. The NBREG can be considered a generalization of Poisson regression, since it has the same mean structure as Poisson regression plus an extra parameter to model the over-dispersion. If the conditional distribution of the outcome variable is over-dispersed, the confidence intervals for the NBREG are likely to be wider than those from a Poisson regression model, because the Poisson model understates the standard errors. Based on the Vuong test, the NBREG model is preferred to the ZINB because it fits the Tanzania data better.

THE MODEL AND CHOICE OF VARIABLES

An NBREG model will be tested on the BRS data for the 2014/15 period. The dependent variable is the number of firms in Tanzania at the ward level. The model includes variables for firm and market characteristics, as well as agglomeration economies. Explanatory variables can be grouped into firm characteristics (e.g., size), place characteristics (e.g., quality of the local labor force), market characteristics (e.g., access and competition), localization economies including specialization, and urbanization economies (e.g., population density, diversity).
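For readers less familiar with count models, a negative binomial regression of the kind described above is commonly written with a log link and a quadratic variance function. The block below is the standard NB2 parameterization, given here as an assumption for orientation only; the symbols x_i, β and α are illustrative and the paper's own specification may differ.

```latex
% Standard NB2 parameterization (an assumption; not taken from the paper).
% x_i: vector of ward-level covariates, beta: coefficients,
% alpha: over-dispersion parameter.
\begin{aligned}
\mathbb{E}[F_i \mid x_i] &= \mu_i = \exp\!\left(x_i^{\top}\beta\right), \\
\operatorname{Var}[F_i \mid x_i] &= \mu_i + \alpha\,\mu_i^{2}, \qquad \alpha \ge 0 .
\end{aligned}
```

Setting α = 0 recovers the Poisson case, in which the conditional variance equals the conditional mean; a positive α lets the variance exceed the mean, which is what over-dispersed counts require.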
The model specification for the number of firms in space is given by equation (1), where:

F_i = number of firms in ward i
s_i = firm size in ward i
c_i = competition in ward i
rd_i = road density in ward i
hk_i = human capital in ward i
As_i = specialization in agriculture in ward i
Ms_i = specialization in manufacturing in ward i
Ss_i = specialization in services in ward i
pd_i = population density in ward i
u_i = dummy variable for urban wards
D_i = economic diversity in ward i

The inclusion of firm size reflects the underlying industrial organization in which firms operate. The model recognizes that context is key. Firm, market and place characteristics are crucial factors that can help explain the number of firms. The first set of localization economies is captured by the industrial structure and organization concepts underlined by Marshall (1920). Economies of scale are modelled using the number of employees per firm (size of firms, s_i).

Road density and transport costs affect market access. Generally speaking, firms would find it more profitable to concentrate production in places where skills are abundant, where backward and forward linkages with other firms are present, and where they can benefit from public goods, and then ship final goods to places with relatively lower availability of those factors. The NEG literature argues that different concentration/dispersion scenarios emerge at different levels of transportation costs. Using iceberg costs à la Samuelson (1954), Krugman (1991, 1992 and 1998), Venables (1993) and Fujita, Krugman and Venables (1999) show that concentration may emerge when those costs are reduced, for instance by improving road access to markets. The literature also indicates that at high transport costs, production would disperse throughout space.
At intermediate levels of transport costs, a number of unstable equilibria can emerge that make production economically feasible at different locations.8

Market access will be proxied by road density by ward (rd_i), since a denser road network improves market access. Following Banerjee et al. (2012), road density is computed as the total length of roads within each ward divided by the ward’s land area. In the case of Tanzania, secondary roads that connect regions, and tertiary roads that connect wards across their region, seem to be denser away from main urban centers (Figure 6). One possible explanation for this counter-intuitive result is that, except for a few places, urban centers may have internal roads but may be poorly connected to their hinterland. Indeed, only a few cities are connected by primary roads (Figure 6, Panel a). Including secondary and tertiary roads increases the number of wards connected through roads (Figure 6, Panel b). Road density per ward also increases when secondary and tertiary roads are included (Figure 6, Panels c and d). The model will define road density per ward using secondary and tertiary roads. Road density seems to be positively associated with firm size (although only weakly, as shown in Figure 7).

Firms locate near a pooled labor market, but the level of skills is likely to matter. The NEG literature argues that a pooled labor market is one of the centripetal forces contributing to firm concentration (Krugman, 1991; Fujita, Krugman and Venables, 1999). However, what matters is not only the absolute number of workers, but also the concentration of relevant skills. The Jobs Diagnostic for Tanzania found that the likelihood of wage employment increases with the level of education (World Bank, forthcoming). It is therefore important to distinguish the agglomeration impacts of a skilled vs. unskilled labor pool. Data on human capital stocks is limited in Tanzania.
The 2012 Population and Housing Census (PHC), which covers 159 districts, does not offer statistically reliable estimates at the ward level. Similarly, the sampling frame for the 2014 Integrated Labor Force Survey (ILFS) was based on the 2012 PHC.⁹ As a result, the model uses the number of secondary schools per 10,000 people at the ward level as a proxy for human capital.

8 These equilibria are unstable because a change in either factor of production (e.g. through migration or FDI) may trigger a dynamic of concentration in favor of a few places.

9 Although the three-stage sample design in the ILFS used enumeration areas (EAs, a level lower than the ward), the sample was designed to provide reliable information for Dar es Salaam, for other urban areas and for rural areas (in their aggregate), with no opportunity to disaggregate data to the ward level without losing statistical significance. On the qualitative side, survey data used in the Uwezo reports for 2011, 2014 and 2017 rely on a similar methodology that randomly selects EAs and allows inferences on the population only at the Dar es Salaam, urban and rural levels, not at a territorial (ward) level.

Figure 6. Road density by type, 2016
Panel a. Primary roads. Panel b. Primary and secondary roads. Panel c. Tertiary roads. Panel d. Secondary and tertiary roads.
Source: Own calculations based on data from Openstreetmaps (2016)

Figure 7. Road density and firm size
Source: Own calculations based on data from the NBS (2012).

Firms will likely concentrate where markets are competitive. According to Porter (1990), local rivalry fosters the concentration of firms. Bishop and Gripaios (2010) argue that, in the case of Great Britain, strong local competition favors job growth at the regional level. According to Lazzeretti, Boix and Capone (2012), and de Dominicis, Arbia and de Groot (2013), large firms tend to enhance firm concentration by attracting other firms.
To capture the degree of competition in the model, we employ the inverse of the Hirschman-Herfindahl Index (HHI) as a proxy.¹⁰ In Tanzania, higher levels of local competition are present in only a few wards, namely those near Dar-es-Salaam and Arusha. The World Bank (forthcoming) found low levels of competition in the vast majority of wards across Tanzania (Figure 8). At face value, competition appears to be critical for jobs and firm size (Figure 9).

10 Since the HHI measures the extent of market concentration, a high HHI score means a high level of concentration and therefore a low level of competition. The inverse of the HHI can therefore be employed to proxy for competition. Variable ci represents the inverse of the HHI based on: ci = (2) where L = employment, i = ward, j = firm and k = industry.

Figure 8. Level of competition per ward, 2014/15
Source: Own calculations based on data from the NBS (2016).

Specialization trends are a reflection of localization economies. It is standard in the literature to measure specialization in relative terms. Regional specialization is therefore measured here relative to other industries and to the national economy. A region is deemed to be specialized in a given industry when that industry’s employment share is higher in the region than in the national economy.¹¹ A specialization index Sik is thus defined for each ward and industry as follows: (3) where L is employment, i is the ward, and k is the industry.

11 The paper includes specialization variables for the three main sectors. Although a finer level of disaggregation (4-digit) was possible, that choice would entail over 50 additional variables for manufacturing alone, which would complicate subsequent interpretation.

Figure 9. Competition and firm size, 2014/15
Source: Own calculations based on data from the NBS (2016).
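As an illustration of how the two employment-based indices described above could be computed, the following is a minimal Python sketch. It rests on assumptions, because the exact expressions in equations (2) and (3) are not reproduced here: the HHI is taken as the sum of squared firm employment shares within a ward (ignoring the industry index k), and the specialization index is taken as a standard location quotient, that is, an industry's share of ward employment divided by its share of national employment. The function names and the sample figures are hypothetical and are not drawn from the BRS data.

```python
def inverse_hhi(firm_employment):
    """Inverse Herfindahl index for one ward (assumed form).

    firm_employment: list of employee counts, one entry per firm in the ward.
    Returns 1 / sum(share_j ** 2), where share_j is firm j's share of ward jobs.
    """
    total = sum(firm_employment)
    if total <= 0:
        raise ValueError("ward must have positive total employment")
    hhi = sum((jobs / total) ** 2 for jobs in firm_employment)
    return 1.0 / hhi


def specialization_index(ward_jobs, national_jobs, industry):
    """Location-quotient style specialization index (assumed form).

    ward_jobs, national_jobs: dicts mapping industry name -> employment.
    Returns (ward share of industry) / (national share of industry).
    """
    ward_share = ward_jobs[industry] / sum(ward_jobs.values())
    national_share = national_jobs[industry] / sum(national_jobs.values())
    return ward_share / national_share


# Hypothetical ward with four equally sized firms.
print(inverse_hhi([10, 10, 10, 10]))

# Hypothetical employment by sector in one ward and in the country.
ward = {"agriculture": 10, "manufacturing": 30, "services": 60}
country = {"agriculture": 100, "manufacturing": 100, "services": 800}
print(specialization_index(ward, country, "manufacturing"))
```

Under these assumptions, a ward with four equally sized firms has an inverse HHI of 4 and a single-firm ward has an inverse HHI of 1, so higher values indicate more competition. A specialization value above 1 marks a ward as specialized in that industry in the sense defined in the text.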
Specialization in some sectors may affect firm location. In the case of Tanzania, given that the 2014/15 Business Register Survey (BRS) data only provide information about physically established firms, as opposed to economic production, only a few wards are specialized in agricultural jobs in registered firms, and these are in the northwest of the country (Figure 10, Panel a). In contrast, specialization in manufacturing is present in a much larger set of wards, particularly in and around the larger cities and in the Southern Agricultural Growth Corridor of Tanzania (SAGCOT) (Figure 10, Panel b).¹² Job specialization in registered service-sector firms is high in only a few places (Figure 10, Panel c). Specialization in agriculture and services appears to be only weakly associated with average firm size (Figure 11, Panels a and c). In contrast, specialization in manufacturing is negatively associated with average firm size (Figure 11, Panel b), suggesting that manufacturing jobs predominate in wards with more small firms. One possible explanation for this result is that Tanzania’s manufacturing sector is characterized by micro and small firms and lacks, for the most part, the larger manufacturers linked to global markets that are found in industrialized countries.

12 Established at the World Economic Forum Africa summit in May 2010, the Southern Agricultural Growth Corridor of Tanzania (SAGCOT) is a public-private partnership whose objective is to transform agriculture in Tanzania’s southern corridor. The SAGCOT corridor covers approximately one-third of mainland Tanzania, from Dar-es-Salaam to the northern areas of Zambia and Malawi. The SAGCOT partnership aims at enhancing food security and ensuring improved livelihoods for smallholder farmers and their communities in a sustainable manner by catalyzing responsible private-sector-led agricultural development.

Figure 10. Specialization indexes, 2014/15
Panel a. Specialization in agriculture. Panel b. Specialization in manufacturing. Panel c. Specialization in services.
Source: Own calculations based on data from the NBS (2016).

Urbanization economies are measured using various indicators. The potential size of the market, as conceived by Ohlin (1933) and Hoover (1937), is modelled using population density (population per km2). As expected, population density reflects the urban nature of cities in Tanzania, but density spreads beyond cities, particularly in northern regions (Figure 12). We alternatively test a dummy variable for the wards that are officially classified as urban (only core urban wards are identifiable, however; Figure 13). Both variables aim to capture the degree to which urbanization affects firm location.

An index of economic diversity is used to capture the role of each ward’s industrial diversity in determining the number of firms located there. The Hachman Index of Industrial Diversity (HIID) will test for the second type of agglomeration economies, namely, urbanization economies. To measure ward-level industrial diversity in Tanzania in 2014/15 while addressing the main drawback of the fractionalization index,¹³ we use the HIID, which is measured as the inverse of the weighted sum of the location quotients, by industrial sector (measured at ISIC 4 digits), for a given ward, across all sectors.¹⁴

Figure 11. Specialization and firm size, 2014/15
Panel a. Agriculture. Panel b. Manufacturing. Panel c. Services.
Source: Own calculations based on data from the NBS (2016).

13 Easterly and Levine (1997) introduced the concept and measurement of group diversity in the context of ethnic divisions to explain differences in economic growth across countries.
Initially measured as the sum of all squared individual ethnolinguistic shares in a given territory, the fractionalization index became a reference in measuring diversity in other contexts. However, one of the problems with that measure is that the number of groups into which the population is (arbitrarily) subdivided affects the maximum level of fractionalization. The HIID is an alternative to that measure.

14 Formally, HIID of ward i using industries I at 4 digits (m) is defined as follows: (4) A location quotient (LQ) is the fraction of the ward’s number of firms in a given industrial sector (4-digit ISIC) over the total firms in the ward, divided by the fraction of the national firms in the same industrial sector over the total firms in the country. The LQs are weighted by the share of firms by ward in a given industrial sector.

Diversification appears to be moderate across the country. Only a handful of wards surrounding Dar es Salaam and in the SAGCOT corridor exhibit diversity higher than 0.55 on a scale that is capped at 1 (Figure 14). The observed moderate degree of diversity beyond these wards is contrary to expectations, and may be due to the disproportionate market power of larger firms. Logarithmic values of the HIID are negatively associated with log values of average jobs in the ward (Figure 15). That is, diversity declines with greater average employment. This suggests that the presence of larger firms, which raises average employment, may make diversity more difficult, perhaps because of the market power they exert in their regions.

Figure 12. Population density, 2014/15
Source: Own calculations based on data from the NBS (2016).

Figure 13. Wards classified as urban, 2012
Source: Own calculations based on data from the NBS (2016).

Figure 14. Hachman index of industrial diversity, 2014/15
Source: Own calculations based on data from the NBS (2016).

Figure 15. Diversity and firm size, 2014/15
Source: Own calculations based on data from the NBS (2016).

REGRESSION RESULTS

The regression results, reported in Table 1, test a number of specifications to determine the correlates of firm location and the relative roles of localization and urbanization economies. The main conclusions are described below.

Competition is key for firm location. The results in all model specifications suggest that a healthy level of competition among firms is important to the number of firms operating in a given ward. The coefficient values on the competition variable (ci) are positive and statistically significant. When we control for specialization in services (in regression specification 4), however, the coefficient for competition halves in value. This may be because the services sector exhibits greater competition than others. Indeed, specialization in services (Ssi) and competition (ci) exhibit one of the largest statistically significant correlation coefficients among the bivariate correlations between independent variables (Table 2).

Access to markets is correlated with firm location. Firms need access to markets to sell their products, and roads are one way to provide that access. Across all model specifications, the coefficient on road density (rdi) is positive and statistically significant, but smaller in magnitude and therefore less important than other significant variables. One explanation could be that, although roads are essential, underinvestment and the resulting poor road quality may limit how well road density captures access to markets.
Another explanation is that road density is useful in determining firm location in urban settings, where access to markets is greater, but less effective in determining firm location for other types of wards.

The availability of human capital is not a significant correlate of firm location. Access to secondary schooling, as measured by the number of secondary schools per 10,000 population, is not statistically significant. Contrary to what we might expect, urban areas and the dominance of Dar es Salaam are not good indicators of access to more secondary schools (Figure 16). One possible explanation is that secondary schools in urban areas may enroll a greater number of students. An alternative explanation is that investment in building schools was compensatory, favoring wards in rural areas. Interestingly, the number of secondary schools per ward does not seem to be associated with average firm size (Figure 17).¹⁵ It is also possible that the proxy (number of secondary schools) fails to reflect human capital conditions. One alternative would have been enrolment in or attainment of secondary schooling. However, education-related data from the NBS (e.g. 2012 Housing and Population Census, Literacy and Education Monographs) are available only at the regional level of disaggregation.

Wards with larger average firm size are slightly more attractive for entering firms, but the effect is small. The variable that captures average firm size per ward is positively correlated with the concentration of firms, but the coefficient value is small. When the urban dummy is added, however, firm size loses its significance.

15 Results not shown in the paper found weak relationships between secondary schools on the one hand, and number of firms or total jobs on the other hand.

Figure 16. Secondary schools per 10,000 people, 2012
Source: Own calculations based on data from the NBS (2012).
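Because the specifications are count models whose coefficients measure changes in log(count) (Table 1, Note 4), magnitudes can be easier to read after exponentiation. The sketch below is illustrative rather than part of the paper's method: it assumes a log link, so that exp(coefficient) is the multiplicative change in the expected number of firms for a one-unit increase in a regressor, and it uses three Nbreg1 point estimates from Table 1. The function names are hypothetical.

```python
import math


def incidence_rate_ratio(coef):
    """Multiplicative change in the expected count for a one-unit increase.

    Assumes a log-link count model, so the ratio is exp(coef).
    """
    return math.exp(coef)


def percent_change(coef, delta=1.0):
    """Percent change in the expected count when the regressor rises by delta."""
    return (math.exp(coef * delta) - 1.0) * 100.0


# Point estimates taken from the Nbreg1 column of Table 1.
nbreg1 = {
    "competition": 0.2636,
    "road_density": 0.0044,
    "firm_size": 0.0074,
}

for name, coef in nbreg1.items():
    ratio = incidence_rate_ratio(coef)
    change = percent_change(coef)
    print(f"{name}: ratio = {ratio:.4f}, change per unit = {change:.2f}%")
```

Read this way, a coefficient of zero corresponds to a ratio of 1 (no change), and comparisons across variables still depend on the units in which each regressor is measured.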
Localization economies are a significant determinant of firm location, but not with respect to manufacturing. Specialization in agriculture and in services both played a role in increasing firm density, each having positive and statistically significant coefficients. The coefficient value was much greater for services specialization (NBREG 4 and 8) than for specialization in agriculture (NBREG 2 and 6). This suggests that the presence of other firms in the same industry (at ISIC 4-digits) generates localization economies by, for example, creating a network of backward and forward linkages, nurturing a pooled labor market of relevant skills, and fostering knowledge spillovers from other firms. In the case of manufacturing, by contrast, specialization does not generate these types of positive externalities. In other words, the presence of other manufacturing firms in the same industry does not entail backward and forward linkages or the other benefits of localization economies. A weaker coefficient for specialization is not uncommon even in OECD countries; Bishop and Gripaios (2010) found a similar result. In fact, a meta-analysis by Melo, Graham and Noland (2009) using over 700 estimations in 34 studies found evidence both for stronger localization effects and for stronger urbanization effects. According to Melo, Graham and Noland (2009), the results depend on the study characteristics, including country effects, industrial coverage, specification of agglomeration economies, and the presence of controls.

Urbanization economies emerge as the single most important factor explaining firm location in Tanzania. All three variables used to capture urbanization economies (population density (pdi), urban dummy (ui) and industrial diversity (Di)) show positive and statistically significant coefficients, and those for the urban dummy and industrial diversity are far larger than the coefficients on the other explanatory variables.
While population density’s coefficient is statistically significant, it is much weaker than the results for the urban dummy variable. The strongest association across all specifications was for industrial diversity, implying that firms benefit from the presence of other firms, not necessarily in their own industry, but in other sectors. When we consider the bivariate correlations between independent variables, we find that diversity is negatively correlated with competition and firm size, and positively associated with specialization in manufacturing (Table 2). The positive relationship with specialization in manufacturing might arise because part of the diversity occurs within manufacturing sub-sectors, not only across sectors. A notable relationship between competition and employment is also clear in Table 2. At first, it may seem to be a problem of collinearity between the employment-based competition variable and the firm-size variable. However, at the low levels of competition observed in Tanzania, firms may find it difficult to grow and may instead survive at a very small scale (see the rationale between competition and firm exit in Hallward-Driemeier, 2009).

Figure 17. Secondary schools and firm size
Source: Own calculations based on data from the NBS (2012).

Table 1. Regression results

| Group | Variable | Nbreg1 | Nbreg2 | Nbreg3 | Nbreg4 | Nbreg5 | Nbreg6 | Nbreg7 | Nbreg8 |
|---|---|---|---|---|---|---|---|---|---|
| | Constant | 1.689*** (0.0656) | 1.6744*** (0.0655) | 1.7217*** (0.0857) | 1.6867*** (0.0630) | 1.6450*** (0.0582) | 1.6435*** (0.058) | 1.5750*** (0.0758) | 1.6413*** (0.0563) |
| Firm & market context | Firm size | 0.0074* (0.0026) | 0.0068* (0.0025) | 0.0074** (0.0026) | 0.0086** (0.0026) | 0.0019 (0.0017) | 0.001 (0.0017) | 0.0018 (0.0017) | 0.0028 (0.0018) |
| Firm & market context | Competition | 0.2636*** (0.0315) | 0.2350*** (0.0313) | 0.2558*** (0.0341) | 0.1129** (0.0333) | 0.2177*** (0.0262) | 0.1832*** (0.0263) | 0.2350*** (0.0289) | 0.1003*** (0.0283) |
| Firm & market context | Secondary schools per 10,000 people | -0.0052 (0.0066) | -0.0044 (0.0066) | -0.0051 (0.0066) | -0.0041 (0.0068) | -0.0093 (0.0048) | -0.0089 (0.0048) | -0.0095 (0.0048) | -0.0085 (0.0049) |
| Firm & market context | Roads density | 0.0044*** (0.0005) | 0.0046*** (0.0005) | 0.0044*** (0.0005) | 0.0041*** (0.0005) | 0.0047*** (0.0004) | 0.0047*** (0.0004) | 0.0047*** (0.0004) | 0.0044*** (0.0004) |
| Localization | Agriculture (specialization) | | 0.0152*** (0.0026) | | | | 0.0145*** (0.058) | | |
| Localization | Manufacturing (specialization) | | | -0.0196 (0.0857) | | | | 0.0277 (0.0193) | |
| Localization | Services (specialization) | | | | 0.4459*** (0.0393) | | | | 0.3387*** (0.0349) |
| Urbanization | Population density | 0.0001*** (0) | 0.0001*** (0) | 0.0001*** (0) | 0.0001*** (0) | | | | |
| Urbanization | Urban wards (dummy) | | | | | 1.3324*** (0.0398) | 1.3396*** (0.0394) | 1.3391*** (0.041) | 1.2831*** (0.0393) |
| Urbanization | Diversity (HIID, ISIC 4 digits) | 6.6319*** (0.1807) | 6.8600*** (0.1845) | 6.6124*** (0.1838) | 6.0481*** (0.1864) | 5.3261*** (0.1714) | 5.5554*** (0.1739) | 5.3431*** (0.1716) | 5.0248*** (0.1725) |
| | Pseudo R-squared | 0.0752 | 0.0767 | 0.0752 | 0.0793 | 0.0988 | 0.1008 | 0.0988 | 0.1018 |
| | Observations | 3311 | 3311 | 3311 | 3311 | 3311 | 3311 | 3311 | 3311 |
| | LR chi2 | 2350 | 2398.08 | 2350.36 | 2477.75 | 3085.98 | 3149.88 | 3088.04 | 3179.61 |
| | Log likelihood | -14447.8 | -14423.76 | -14447.62 | -14383.93 | -14079.81 | -14047.86 | -14078.78 | -14033 |

Note 1: Dependent variable (number of firms per ward), size (number of employees per firm in each ward).
Note 2: Standard errors in parentheses.
Note 3: Asterisks represent p-values: p<0.10 (*), p<0.05 (**), p<0.01 (***).
Note 4: Coefficients need to be interpreted as follows. The expected change in log(count) for a one-unit increase in size is XX holding other variables constant.

Table 2. Correlation matrix

Fi si ci rdi hki Asi Msi Ssi pdi ui Di
Number of firms Fi 1 -0.0578* 0.5176* -0.1091* 0.1910* 0.4891* 0.4361* 0.3520*
Firm size si 1 0.2406* 0.1850* -0.0923* 0.1299* 0.0515* -0.0961*
Competition ci 1 0.0352* 0.0517* 0.1672* -0.3896* 0.5533* 0.0379* -0.1976*
Road density rdi 1 0.0466* -0.0369* -0.0631* 0.1124* 0.6091* 0.3954* 0.1027*
Human capital hki 1
Specialization in agriculture Asi 1 -0.1076* -0.0391* -0.2005*
Specialization in manufacturing Msi 1 -0.7550* 0.2540*
Specialization in services Ssi 1 0.0688* 0.1235*
Population density pdi 1 0.3977* 0.0991*
Urban wards ui 1 0.2559*
Diversity Di 1

Note 1: This table displays the correlation between variables, the number of observations for each correlation and the level of significance. This table displays only correlations with a significance level of .1 or better (i.e. lower), and displays a star with each correlation that is significant at .05 or better.
Note 2: Standard errors in parentheses.

CONCLUSIONS AND POLICY DISCUSSION

The preceding analysis illustrates that when markets are competitive and function well, firms are more likely to be present in a given location. The results also suggest that physical connectivity, such as roads that allow access to markets, is an important element in explaining firm location. In addition to market competition and access, firms locate in the presence of other firms, as they benefit from linkages to firms with similar and complementary activities. Both types of agglomeration economies (localization and diversification) are significant determinants of firm location. Localization economies linked to specialization in services, and to a lesser degree agriculture, play a role in attracting firms to cluster in particular locations. Urban centers are the largest draw for firms, however, owing to the diversification of activities, which has large positive spillovers.
Although more work and analysis are required for full recommendations, these results suggest that a policy conversation on the following aspects could be useful:

• The need to facilitate the process of urbanization. The diversification that comes with urbanization can encourage firm location and, as a consequence, employment. Firms in proximity to each other share a pooled labor market and the matching of skills that takes place there (Duranton and Puga, 2004). But firms also share productive infrastructure that enables them to achieve economies of scale (e.g. power grid) or that allows them to access markets (e.g. ports and airports).

• The opportunity that lies in fostering specialization. By targeting specialization, stronger backward and forward linkages in the same sector may emerge, thereby promoting the location of more firms. Not only would additional jobs be created, but the resulting specialization within firms would likely improve job quality through more specialized skills that may raise productivity. At higher efficiency levels, workers may eventually receive higher wages. A local strategy based on localization economies might be particularly useful in towns and tertiary cities, where the scope of skills and specialization could be narrower.

• Addressing insufficient competition and limited market access may require a combination of policy and investment. Increasing competition could begin with a review of the existing incentives framework for firms, in particular those related to licensing and innovation. Reducing the burden of firm registration, compliance and exit requirements could encourage more dynamic firm entry and exit that responds to market demand. Physical access to markets through roads is important not only within urban centers but also to link rural producers and consumers to market towns as well as larger urban centers.
• Further work is needed. While the paper can shed some light on broad policy areas for discussion, it mostly constitutes a step in using spatial data to explore policy-relevant questions. More work is needed to: (i) set out the implications for policy in greater detail and in light of evidence provided by other work; (ii) use alternative measures of access to markets, such as a distance measure weighted by population or travel time by type of road to different population centers; (iii) data permitting, use sales market shares rather than employment to construct competition indices; (iv) explore the use of other methods, such as spatial econometrics, to determine the role of neighborhood effects and spatial autocorrelation; and (v) explore nonlinearities with respect to the role of competition.

ANNEX A

Figure A1. Mainland Tanzania wards, 2012
Source: Based on Tanzania National Bureau of Statistics http://nbs.go.tz/

REFERENCES

Anselin, L. (1988), Spatial Econometrics: Methods and Models, Dordrecht: Springer Science+Business Media.
Bishop, P. and P. Gripaios (2010), “Spatial Externalities, Relatedness and Sector Employment Growth in Great Britain”, Regional Studies 44(4): 443-54.
Chinitz, B. (1961), “Contrasts in Agglomeration: New York and Pittsburgh”, The American Economic Review 51(2): 279-289.
De Dominicis, L., Arbia, G. and H. De Groot (2013), “Concentration of Manufacturing and Service Sector Activities in Italy: Accounting for Spatial Dependence and Firm Size Distribution”, Regional Studies 47(3): 405-418.
Duranton, G. and D. Puga (2004), “Micro-Foundations of Urban Agglomeration Economies”, in V. Henderson and J. F. Thisse (eds.) Handbook of Regional and Urban Economics, vol. 4: 2063–17, Amsterdam: North-Holland.
Easterly, W. and R. Levine (1997), “Africa’s Growth Tragedy: Policies and Ethnic Divisions”, The Quarterly Journal of Economics 112(4): 1203-50.
Evans, H. D. (1989), Comparative Advantage and Growth: Trade and Development in Theory and Practice, London: Harvester Wheatsheaf.
Fujita, M., Krugman, P. and A. Venables (1999), The Spatial Economy, Cambridge, MA: The MIT Press.
Glaser, S. (2017), “A Review of Spatial Econometric Models for Count Data”, Hohenheim Discussion Papers in Business, Economics and Social Sciences, No. 19-2017, Universität Hohenheim.
Glaeser, E. L., Kallal, H. D., Scheinkman, J. A. and A. Shleifer (1992), “Growth in Cities”, Journal of Political Economy 100: 1126-1152.
Hallward-Driemeier, M. (2009), “Who Survives? The Impact of Corruption, Competition and Property Rights across Firms”, Policy Research Paper 5084, The World Bank Group.
Henderson, J. V. (1983), “Industrial Bases and City Sizes”, American Economic Review 73: 164-69.
Henderson, J. V. (1986), “Efficiency of Resource Usage and City Size”, Journal of Urban Economics 19: 47-70.
Hoover, E. M. (1937), Location Theory and the Shoe and Leather Industries, Cambridge, MA: Harvard University Press.
Jacobs, J. (1961), The Death and Life of Great American Cities, New York: Vintage Books.
Jacobs, J. (1969), The Economy of Cities, New York: Random House.
Jordaan, J. A. and J. Sanchez-Reaza (2006), “Trade Liberalization and Location: Empirical Evidence for Mexican Manufacturing Industries”, The Review of Regional Studies 36(3): 279-303.
Krugman, P. (1998), “What's New About the New Economic Geography?”, Oxford Review of Economic Policy 14(2): 7–17, https://doi.org/10.1093/oxrep/14.2.7
Krugman, P. (1992), “A Dynamic Spatial Model”, National Bureau of Economic Research, Working Paper No. 4219.
Krugman, P. (1991), “Increasing Returns and Economic Geography”, Journal of Political Economy 99(3): 483-99.
Krugman, P. and A. Venables (1995), “Globalization and the Inequality of Nations”, Quarterly Journal of Economics 110(4): 857-80.
Lazzeretti, L., Boix, R. and F. Capone (2012), “Why do Creative Industries Cluster? An Analysis of the Determinants of Clustering of Creative Industries”, in L. Lazzeretti (ed.) Creative Industries and Innovation in Europe, London: Routledge.
Long, J. S. (1997), Regression Models for Categorical and Limited Dependent Variables, Advanced Quantitative Techniques in the Social Sciences, London: Sage Publications.
Marshall, A. (1920), Principles of Economics, 8th edition, London: Macmillan and Co.
Melo, P. C., Graham, D. J. and R. B. Noland (2009), “A meta-analysis of estimates of urban agglomeration economies”, Regional Science and Urban Economics 39: 332-42.
Moomaw, R. L. (1988), “Agglomeration Economies: Localization or Urbanization?”, Urban Studies 25: 150-161.
Nakamura, R. (1985), “Agglomeration Economies in Urban Manufacturing Industries: A Case of Japanese Cities”, Journal of Urban Economics 17: 108-124.
NBS (2016), Business Register Survey 2014/15, Dar-es-Salaam: National Bureau of Statistics Tanzania.
NBS (2012), Census of Population and Housing 2012, Dar-es-Salaam: National Bureau of Statistics Tanzania.
Nord, R., Sobolev, Y., Dunn, D., Hajdenberg, A., Hobdari, N., Maziad, S. and S. Roudet (2009), Tanzania: The Story of an African Transition, Washington, DC: International Monetary Fund.
Ohlin, B. (1933), Interregional and International Trade, Cambridge, MA: Harvard University Press.
Openstreetmaps (2016), Tanzania’s secondary and tertiary roads, accessed online at https://www.openstreetmap.org/search?query=tanzania#map=10/-6.6742/38.8244
Rosenthal, S. and W. Strange (2004), “Evidence on the Nature and Sources of Agglomeration Economies”, in V. Henderson and J. F. Thisse (eds.) Handbook of Regional and Urban Economics, vol. 4: 2119–72, Amsterdam: North-Holland.

Address: 1776 G St, NW, Washington, DC 20006
Website: http://www.worldbank.org/en/topic/jobsanddevelopment
Twitter: @WBG_Jobs
Blog: https://blogs.worldbank.org/jobs/