WPS7547

Policy Research Working Paper 7547

A Detailed Anatomy of Factor Misallocation in India

Gilles Duranton
Ejaz Ghani
Arti Grover Goswami
William Kerr

Macroeconomics and Fiscal Management Global Practice Group
January 2016

Abstract

This paper complements the results of earlier work on factor misallocation. The paper first expands the methodology and provides two important decompositions for the main indices. The main result is that factor and output misallocation across districts is at least as important as misallocation within districts. Second, the paper provides an exploration of the service sector that complements earlier work on manufacturing. The analysis shows that labor plays a fundamental role for misallocation in services, whereas land is the determining factor in manufacturing. Third, the paper expands our earlier work on the effects of policies on misallocation by looking at a much broader range of policies, and finds strong evidence of their effects on misallocation. Finally, the paper takes steps towards the identification of the causal effect of misallocation on output per worker by developing a novel instrumental variable approach and a simulation approach that allows for checking the consistency of the empirical results.

This paper is a product of the Macroeconomics and Fiscal Management Global Practice Group. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://econ.worldbank.org. The authors may be contacted at EGhani@WorldBank.org.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished.
The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team

A Detailed Anatomy of Factor Misallocation in India

Gilles Duranton∗‡, University of Pennsylvania
Ejaz Ghani∗¶, World Bank
Arti Grover Goswami∗†, World Bank
William Kerr∗§, Harvard Business School

Key words: agglomeration, firm selection, productivity, cities
JEL classification: C52, R12, D24

∗ Financial support from the World Bank, CEPR Private Enterprise Development in Low-Income Countries programme, and the Zell Lurie Center for Real Estate at the Wharton School is gratefully acknowledged. We appreciate the comments and guidance from Ed Glaeser, Sandile Hlatshwayo, Vijay Jagannathan, Somik Lall, Jevgenijs Steinbuks, and conference and seminar participants. We are also grateful to Yoichiro Kimura for helping us to understand the repeal of ULCRA and sharing his data with us. Ben Hyman's help with the changes in industry classifications was also greatly appreciated. Yeayeun Park and Yu Wang provided us with excellent research assistance. The views expressed here are those of the authors and not of any institution they may be associated with.
‡ Wharton School, University of Pennsylvania, 3620 Locust Walk, Philadelphia, PA 19104, USA. E-mail: duranton@wharton.upenn.edu; website: https://real-estate.wharton.upenn.edu/profile/21470/.
¶ E-mail: eghani@worldbank.org.
† E-mail: agrover1@worldbank.org.
§ E-mail: wkerr@hbs.edu; website: http://www.hbs.edu/faculty/Pages/profile.aspx?facId=337265.

1. Introduction

We provide complements to the results of Duranton, Ghani, Grover Goswami, and Kerr (2015) along several dimensions:
• We expand our methodology and provide two important decompositions for our main indices.
• We empirically implement these two decompositions in a variety of contexts.
• We provide an exploration of the service sector that complements our earlier work on manufacturing.
• We expand our earlier work on the effects of policies on misallocation by looking at a much broader range of policies.
• We take steps towards the identification of the causal effect of misallocation on output per worker.
• We provide simulation results to check the consistency of our empirical results.

These extensions, complements, and alternative applications are important for a number of reasons. First, the methodology proposed by Duranton et al. (2015) is novel. Given its broad potential applicability, it certainly warrants further exploration. The two developments that we propose here are of particular interest. The first is a factor decomposition, which shows precisely how, in theory, factor misallocation affects output misallocation. This decomposition also implies an extremely sharp quantitative prediction regarding the sum of the coefficients in the regression of output misallocation on the misallocation of the different factors of production. We can verify this prediction empirically. This verification provides a further validation of our approach and an important insight about the role of land in manufacturing.

The second decomposition is a within/between group decomposition. Our initial approach was concerned with a single level of aggregation: Indian districts. Our decomposition shows how misallocation at different levels of aggregation can be linked to each other.
This within/between decomposition shows that aggregate misallocation is the sum of the share-weighted misallocation of each group (e.g., each district) and a between-group misallocation term (e.g., misallocation between districts). That allows us to distinguish between different forms of misallocation. The high level of misallocation in India documented initially by Hsieh and Klenow (2009) could be due either to too many workers residing in poorly productive areas or to more productive establishments everywhere being unable to expand their employment. These are two very different forms of misallocation. They probably call for different corrective policies. It is important to distinguish between them and measure them appropriately. The decomposition we propose here is a bridge between our earlier approach, which was uniquely concerned with misallocation within districts, and the recent work of Hsieh and Moretti (2015), which focuses solely on misallocation across locations.1

We also expand our previous empirical work in several directions. The first regards the service sector. Studying manufacturing presents a number of advantages, including greater data availability, estimation methods that have been used for a long time, and its possible unique importance for development. However, it represents only a small share of the economy in India. It is important to expand our analysis to services even though data are sparser and arguably more difficult to exploit when it comes to estimating production functions.

In Duranton et al. (2015), we assessed the effects of two policies on misallocation. These were the repeal of the Urban Land Ceilings Regulation Act (ULCRA) and changes in stamp duty taxation. Studying narrow but well-defined policies, which are expected to affect primarily the land market, is useful. This is all the more useful when, as in the case of the repeal of ULCRA, a good case can be made that policy changes were exogenous.
It remains unclear however how much external validity there is in our initial exploration of these two policies. Here, we generalize our previous work to two sets of state-level policies and three industry policies. To explore them in a consistent framework, we work at the level of individual industries within states instead of aggregating industries within districts. This alternative level of aggregation also provides us with a useful check on our earlier results. More specifically, the five policies we study are land reforms (Besley and Burgess, 2000), labor reforms (Besley and Burgess, 2004), industry delicencing, industry tariff changes, and the liberalization of foreign direct investment (all studied by Aghion, Burgess, Redding, and Zilibotti, 2008).

1 See also Hsieh and Klenow (2014) for evidence regarding the life cycle of plants and the lack of growth of Indian establishments as they age. These facts are consistent with a major role for misallocation within districts.

Finally, our earlier exploration was mostly descriptive in spirit. Here we take two steps towards a more causal analysis. Finding instruments for misallocation when trying to assess how misallocation affects output per worker is intrinsically difficult. For instance, the policy changes we explore above affect misallocation. Through changes in misallocation, we expect these policies to affect output per worker. However, it is difficult to argue that these policies will affect output per worker only through misallocation (and other channels that can be controlled for). If these policies have a direct effect on productivity, they cannot be used appropriately as instruments for misallocation. We expect a similar negative argument will apply to any contextual variable. Some local characteristics will affect district misallocation but we also suspect they will affect local productivity directly. Instead, we use past misallocation to instrument for current misallocation.
While this instrument might fail because past misallocation may be correlated with contemporaneous determinants of output per worker, this bias should remain small when we control for permanent characteristics of districts through district fixed effects. Additionally, we also use as instruments predictors of misallocation based on the industry composition of manufacturing in each district.

We also report results from simulations that seek to mimic the type of economy that we study. That is, we consider a hypothetical set of establishments for which the production function is as we estimate and for which factors and estimated productivity are as in the data we use. Then, we re-simulate such an economy by marginally changing the misallocation of one factor. This allows us to assess the effect of this marginal change on output per worker. The main drawback is that implementing this approach without too much complexity requires imposing strong distributional assumptions.

2. Misallocation: A primer

This section provides a quick refresher for the concepts and main notations used in Duranton et al. (2015). Please refer to this paper for further details.

As argued in Duranton et al. (2015), the productive efficiency of an economy depends on three main proximate factors. The first is the amount of inputs which are available. The importance of factor and capital deepening was highlighted and first articulated by Solow (1956). The second proximate factor of growth is the level of technological development and the ability of an economy to generate more output out of the same inputs. While the development of better technologies has been a focus of an extremely large literature for many years, Romer (1990) was an important milestone, linking the development of new technologies and factor accumulation.
The third proximate factor of growth is the ability of an economy to allocate more inputs to more productive establishments.2 In most economic models involving production by heterogeneous producers, the most productive ones should be given access to more inputs given their ability to generate proportionately more output. In a frictionless economy, we expect an efficient allocation of inputs across establishments. Establishments in real economies face serious frictions that lead to an inefficient factor allocation. Although this idea has been around for a long time, Restuccia and Rogerson (2008) and Hsieh and Klenow (2009) should be credited for articulating it in a formal model and for making the case regarding its quantitative relevance.3

2 We refer to production units as establishments to be consistent with the data we use below.
3 We think of factor accumulation, technological improvement, and factor allocation as proximate factors of growth. In turn, they result from deeper causes of growth such as institutions, culture, and policies.

In Duranton et al. (2015), we extend the misallocation methodology initially provided by Olley and Pakes (1996). The production function of establishment $i$ is:

$$Y_i = e^{\varphi_i}\, T_i^{\alpha} K_i^{\beta} L_i^{\gamma}, \tag{1}$$

where $Y_i$ is value added, which we refer to as output, $\varphi_i$ is the (log) total factor productivity of establishment $i$, $T$ is land and buildings, $K$ is other fixed assets, and $L$ is labor. Duranton et al. (2015) explicitly separate land and buildings from other fixed assets such as machines because this distinction is important in their (manufacturing) context.

Next, define the share-weighted aggregate productivity of a group $g$ of $n_g$ establishments as:

$$\Phi_g = \sum_{i=1}^{n_g} s_{ig}\,\varphi_i, \tag{2}$$

where $s_{ig}$ is the share of establishment $i$ in group $g$. For now we consider the share of establishment $i$ to be its output share: $s_{ig} = Y_i / \sum_{j=1}^{n_g} Y_j$.

Following Olley and Pakes (1996), it is particularly insightful to note that:

$$\Phi_g = \bar{\varphi}_g + \sum_{i=1}^{n_g} (s_{ig} - \bar{s}_g)(\varphi_i - \bar{\varphi}_g) = \bar{\varphi}_g + n_g\,\mathrm{cov}(s_{ig}, \varphi_i), \tag{3}$$

where $\bar{\varphi}_g \equiv \frac{1}{n_g}\sum_{i=1}^{n_g} \varphi_i$ is the unweighted productivity mean across establishments in group $g$ and $\bar{s}_g\ (= 1/n_g)$ is the mean establishment share. Then, we can define misallocation in group $g$ as

$$M_g = -(\Phi_g - \bar{\varphi}_g) = -n_g\,\mathrm{cov}_g(s_{ig}, \varphi_i). \tag{4}$$

This misallocation index is a negative function of the covariance between the share and productivity of establishments so that misallocation is minimised when the covariance between shares and productivity is highest. Relative to usual practice in past work and to ease the reading of our results, equation (4) multiplies the 'misallocation' covariance term by minus one so that an increase in the index corresponds to an increase in misallocation. It is useful to note that $M = 0$ corresponds to output being uncorrelated with establishment productivity while more negative values correspond to less misallocation (and positive covariances between output and productivity). The measure of allocation defined by equation (4) is also equal to the difference between unweighted and weighted productivity, where the weights are the establishment shares of output.

Relative to extant literature, Duranton et al. (2015) expand the OP approach in two directions. First, the shares used above in the computation of misallocation are output shares. Alternative misallocation indices can be computed by weighting establishments by their share of any given factor of production instead of output. That is, we can build measures of allocation efficiency for output, $M^Y$, employment, $M^L$, land and buildings, $M^T$, other fixed assets, $M^K$, or any combination of these. These different measures of misallocation are conceptually distinct.
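To make the index concrete, the following is a minimal sketch of equations (2) to (4) in Python with NumPy. It is an illustration, not the authors' code: the function name `misallocation` and the four-establishment toy numbers are hypothetical. The same function yields $M^Y$, $M^L$, $M^T$, or $M^K$ depending on which quantity is used to form the shares.

```python
import numpy as np

def misallocation(base, log_tfp):
    """Misallocation index M = -(Phi - mean(phi)) of equation (4).

    base:    positive quantities (output, employment, ...) whose shares
             within the group serve as weights.
    log_tfp: log total factor productivity of each establishment.
    """
    x = np.asarray(base, dtype=float)
    phi = np.asarray(log_tfp, dtype=float)
    s = x / x.sum()                      # shares s_ig, summing to one
    weighted = float(np.sum(s * phi))    # Phi_g, equation (2)
    unweighted = float(phi.mean())       # unweighted mean productivity
    return -(weighted - unweighted)      # M_g, equation (4)

# Hypothetical toy group of four establishments (illustrative numbers only).
phi = np.array([0.0, 0.5, 1.0, 1.5])
output = np.array([1.0, 2.0, 3.0, 6.0])  # larger where productivity is higher
labor = np.array([5.0, 5.0, 5.0, 5.0])   # unrelated to productivity

m_output = misallocation(output, phi)    # negative: output follows productivity
m_labor = misallocation(labor, phi)      # zero: equal shares, no covariance

# Equivalent covariance form of equation (4), using a population covariance.
s_out = output / output.sum()
m_cov = -len(phi) * np.cov(s_out, phi, bias=True)[0, 1]
```

In this toy group the output index is negative while the employment index is exactly zero, which is the sense in which indices built on different shares can diverge.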
A given factor of production might, for instance, be more misallocated than another depending on the relative frictions affecting input markets.

The second innovation of Duranton et al. (2015) is to compute misallocation indices for subnational units. At this point it is important to clarify what a measure of misallocation within a district captures. Loosely speaking, computing the misallocation of, say, employment in each district separately yields a measure of the misallocation of workers within each district, taking as given the distribution of employment across districts. The misallocation of employment within districts is thus only one component of the misallocation of employment in the entire country. We return to this important point below in section 3.2.

Then, Duranton et al. (2015) estimate three separate regressions. The first is a long difference-in-difference regression that relates changes in misallocation to policy changes:

$$\Delta M_d = A_0 + A_1 \Delta P_d + X_d A_2 + \epsilon_d, \tag{5}$$

where $\Delta M_d$ is a change in misallocation (for output or any given factor) in district $d$, $P_d$ is a policy indicator variable for district $d$, $X_d$ is a vector of district $d$ characteristics that might be correlated with $P_d$, and $\epsilon_d$ is an error term. More specifically, Duranton et al. (2015) explore the effects of two policy changes on the misallocation of output and the misallocation of land and buildings. The first is the abrupt repeal of the Urban Land Ceilings Regulation Act (ULCRA) by the federal government of India in 1999. The second is the changes in stamp duty taxation in Indian states over 1989-2010. Because changes in stamp duties (a tax that needs to be paid when transacting real property) may have been determined simultaneously with changes in misallocation, we use an extensive set of control variables. The case for the exogeneity of the repeal of ULCRA is much stronger. We nonetheless control for possible confounding factors in this case as well.
Both policies are associated with changes in local misallocation. This provides an important check on our computed measures of misallocation. This also shows how misallocation may be caused by policies that create stronger frictions in input markets.

In the analysis below, we return to regression (5) and explore a broader range of policies including land and labor reforms over 1985-1997 and the liberalization of industry licences, trade, and foreign direct investment over the same period.

The second estimation performed by Duranton et al. (2015) is a regression of output misallocation on factor misallocation:

$$M_d^Y = B_0 + B_1 M_d^L + B_2 M_d^T + B_3 M_d^K + \mu_d, \tag{6}$$

where the dependent variable is output misallocation in district $d$ and the three explanatory variables are misallocation indices for the three inputs, employment ($L$), land and buildings ($T$), and other fixed assets ($K$). The first main result associated with this regression in Duranton et al. (2015) is that output misallocation is strongly associated with input misallocation. A one unit increase in factor misallocation is typically associated with about a one unit increase in output misallocation in Indian districts. Below we provide a new way to interpret this result by showing how output misallocation can be expressed as a function of factor misallocation. When establishments operate with constant returns to scale, a one-for-one increase in output misallocation should follow from an increase in misallocation for all factors. Below, we verify this result further using data for services and manufacturing data used at a level of aggregation different from Duranton et al. (2015).

The second main result of Duranton et al. (2015) is that the misallocation of land and buildings plays a particularly important role in explaining the misallocation of output in manufacturing. We partly confirm this result, obtained for districts, when using instead specific industries at the state level. We also provide results for services.
This time, the fundamental driver of output misallocation is the misallocation of employment, while the misallocation of land and buildings plays little to no role.

The final estimation performed by Duranton et al. (2015) is a regression of log output per worker $y_d$ on factor misallocation:

$$y_d = B_0 + B_1 M_d^L + B_2 M_d^T + B_3 M_d^K + \nu_d. \tag{7}$$

Duranton et al. (2015) find large effects of factor misallocation on output per worker. Below, we confirm these large effects for a different treatment of manufacturing and for services. We also confirm that the misallocation of land and buildings plays a fundamental role in manufacturing while it has a much smaller effect on output per worker in services.

3. Two decompositions

3.1 Factor decomposition

This first decomposition is helpful to understand regression (6) and be able to interpret its results further. Dropping group indices for simplicity, recall that output misallocation can be written as:

$$M^Y = -n\,\mathrm{cov}(s_i^Y, \varphi_i) = -\sum_{i=1}^{n} s_i^Y (\varphi_i - \bar{\varphi}), \tag{8}$$

where $s_i^Y = Y_i / Y = Y_i / \sum_j Y_j$ is the share of output of establishment $i$ in total output.

Recall that output for establishment $i$ is given by $Y_i = e^{\varphi_i} K_i^{\alpha} L_i^{\beta} T_i^{\gamma}$ (relative to equation (1), the exponents are relabeled here so that $\alpha$, $\beta$, and $\gamma$ attach to other fixed assets, labor, and land and buildings, respectively). Profit maximization implies $r K_i = \alpha Y_i$, $w L_i = \beta Y_i$, and $R T_i = \gamma Y_i$, where $r$ is the interest rate on other fixed assets, w is the wage, and $R$ is the rental rate of land and buildings. Aggregating across all establishments, it is easy to obtain $r K = \alpha Y$, $w L = \beta Y$, and $R T = \gamma Y$, where $K \equiv \sum_j K_j$, $L \equiv \sum_j L_j$, and $T \equiv \sum_j T_j$ are the aggregate values of other fixed assets, labor, and land and buildings, respectively. When establishments operate under constant returns to scale (i.e., $\alpha + \beta + \gamma = 1$), free entry then implies $Y_i = r K_i + w L_i + R T_i$.
It then follows from these results that

$$s_i^Y = \frac{r K_i}{Y} + \frac{w L_i}{Y} + \frac{R T_i}{Y} = \frac{r K_i}{r K/\alpha} + \frac{w L_i}{w L/\beta} + \frac{R T_i}{R T/\gamma} = \alpha \frac{K_i}{K} + \beta \frac{L_i}{L} + \gamma \frac{T_i}{T} = \alpha s_i^K + \beta s_i^L + \gamma s_i^T, \tag{9}$$

where $s_i^K$, $s_i^L$, and $s_i^T$ are the shares of establishment $i$ in other fixed assets, labor, and land and buildings, respectively. Substituting equation (9) into equation (8) and using the linearity of the sum, we can rewrite equation (8) as:

$$M^Y = \alpha M^K + \beta M^L + \gamma M^T, \tag{10}$$

where $M^K$, $M^L$, and $M^T$ are indices of factor misallocation that correspond to $M^Y$ but use the factor shares instead of output shares.

We can note that this is a theoretical analogue to regression (6). Hence, by regressing output misallocation on factor misallocations, we should recover the factor shares $\alpha$, $\beta$, and $\gamma$. Interestingly, we note that in the results of Duranton et al. (2015), the sum of the coefficients on the factor misallocation terms is always close to unity. This is consistent with establishments operating not far from constant returns.

We also note that the factor shares obtained when estimating the empirical counterpart to equation (10) differ from the factor shares obtained when we estimate establishment productivity. As pointed out by Duranton et al. (2015), land and buildings represent about 13% of the revenue of establishments but estimating equation (10) implies a factor share for land and buildings of 40 to 60%.

There are three possible reasons for that. The first is that the price of land and buildings is, at the margin, much higher than what establishments pay on average. A second interpretation is that factor choices are not made simultaneously but sequentially, with land and buildings being chosen first. Then labor and capital become functions of land and buildings and, unsurprisingly, the estimation of the empirical counterpart to equation (10) indicates a much greater importance of land and buildings than we observe in terms of expenditure.
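The prediction that a regression of output misallocation on factor misallocations recovers the factor shares can be illustrated with a small simulation. The sketch below is hypothetical: the values of `alpha`, `beta`, and `gamma`, the 200 synthetic districts, and the lognormal factor holdings are assumptions chosen for illustration, and equation (9) is imposed by construction rather than derived from optimizing establishments.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, beta, gamma = 0.3, 0.5, 0.2   # assumed shares for K, L, T (sum to one)

def misallocation(shares, phi):
    """M = -(sum_i s_i * phi_i - mean(phi)), as in equations (4) and (8)."""
    return -(np.sum(shares * phi) - phi.mean())

factor_m, output_m = [], []
for _ in range(200):                 # 200 hypothetical districts
    n = 50                           # establishments per district
    phi = rng.normal(size=n)         # log productivity
    # Arbitrary positive factor holdings, normalised into shares.
    s_k, s_l, s_t = (v / v.sum() for v in rng.lognormal(size=(3, n)))
    s_y = alpha * s_k + beta * s_l + gamma * s_t   # equation (9), imposed
    factor_m.append([misallocation(s_k, phi),
                     misallocation(s_l, phi),
                     misallocation(s_t, phi)])
    output_m.append(misallocation(s_y, phi))

# OLS of output misallocation on a constant and the three factor indices.
X = np.column_stack([np.ones(len(factor_m)), np.array(factor_m)])
coef, *_ = np.linalg.lstsq(X, np.array(output_m), rcond=None)
intercept, b_k, b_l, b_t = coef
```

Under these assumptions the estimated coefficients equal the imposed shares up to floating-point error and sum to one. With real data, a gap between estimated coefficients and expenditure shares is what the interpretations discussed in the text seek to explain.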
A third possibility is that land may be a fundamentally important asset to obtain the external financing necessary to buy other fixed assets or to pay workers at the end of the month.

3.2 Within and between misallocation: Theory

Our second decomposition is a between/within group decomposition. Before going any further, recall that in Duranton et al. (2015), we compute indices of misallocation at the district level. As mentioned above, computing the misallocation of, say, employment in each district separately yields a measure of the misallocation of workers within each district, taking as given the distribution of employment across districts.

To be more precise, consider a population of establishments and a partition of this population into $m$ groups denoted $g = 1, 2, \ldots, m$. For instance, the population of establishments might be all establishments in our sample and each district may form a separate group. Alternatively, one may partition the population of establishments into the organized and unorganized sectors, different industries, etc. By analogy with equation (4), which defines misallocation within group $g$, we can define misallocation for the entire population of establishments as

$$M = -(\Phi - \bar{\varphi}) = -n\,\mathrm{cov}(s_i, \varphi_i) = -\left(\sum_{i=1}^{n} s_i \varphi_i - \frac{1}{n}\sum_{i=1}^{n} \varphi_i\right), \tag{11}$$

where $n$ is the total number of establishments. Using the fact that the share of an establishment in the population is equal to the share of that establishment in its group times the share of that group in the population, $s_i = s_{ig}\, s_g$, we can rewrite equation (11) as:

$$M = -\left(\sum_g \sum_{i \in g} s_g s_{ig} \varphi_i - \frac{1}{n}\sum_g \sum_{i \in g} \varphi_i\right) = -\left(\sum_g s_g \left(-M_g + \frac{1}{n_g}\sum_{i \in g} \varphi_i\right) - \frac{1}{n}\sum_g \sum_{i \in g} \varphi_i\right), \tag{12}$$

where the last simplification arises directly from the definition of $M_g$ in equation (4). After using the definition of $\bar{\varphi}_g$ above and further simplifications, we can rewrite expression (12) as:

$$M = \sum_g s_g M_g - \sum_g \left(s_g - \frac{n_g}{n}\right) \bar{\varphi}_g. \tag{13}$$

Because $-\sum_g \left(s_g - \frac{n_g}{n}\right) = 0$, as the shares of both factors and establishments sum to unity, equation (13) can be finally written as:

$$M = \tilde{M} + M^{\Delta} \tag{14}$$

with

$$\tilde{M} \equiv \sum_g s_g M_g \quad \text{and} \quad M^{\Delta} \equiv -\sum_g \left(s_g - \frac{n_g}{n}\right)(\bar{\varphi}_g - \bar{\varphi}).$$

Hence, total misallocation for the population of establishments can be decomposed into the sum of two simple terms. The first, $\tilde{M}$, is within-group misallocation summed across groups, weighting them by their share $s_g$. The second term, $M^{\Delta}$, is a measure of between-group misallocation. It is actually the between-group analogue of the covariance term used in equation (3). This makes strong intuitive sense: total misallocation is the sum of the misallocation of each group, where each group is weighted by its factor share, to which we add the misallocation across groups.

For instance, it is possible to imagine situations where more productive establishments employ more workers within each district, but where districts that host more productive establishments on average have access to substantially fewer workers than less productive districts. This would be a case of a low level of misallocation within districts but a high level of misallocation across districts. While most of our analysis is concerned with misallocation within districts, we pay some attention to the between component below. We also note that within- and between-district misallocations probably have very different root causes and most likely call for different policies.

3.3 Within and between misallocation: Female- and male-owned establishments

Our first application of this decomposition concerns the differences between female- and male-owned businesses in the unorganized sector. We use the fact that the NSS reports the gender of business owners in the unorganized sector. As women suffer extensive discrimination in India, both as workers and as employers, it is important to assess how much this discrimination affects misallocation in India. The results are reported in table 1.
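As a minimal sketch of how the within and between components reported in such a table could be computed from establishment-level data, the following implements equations (4), (11), and (14) in Python with NumPy. The function name `decompose`, the two group labels, and the simulated data are hypothetical and serve only to check that the two components add up to total misallocation.

```python
import numpy as np

def decompose(base, log_tfp, group):
    """Return (total, within, between) misallocation, as in equation (14)."""
    x = np.asarray(base, dtype=float)
    phi = np.asarray(log_tfp, dtype=float)
    group = np.asarray(group)
    n = len(phi)
    s = x / x.sum()                                   # population shares s_i
    total = -(np.sum(s * phi) - phi.mean())           # equation (11)
    within, between = 0.0, 0.0
    for g in np.unique(group):
        idx = group == g
        s_g = s[idx].sum()                            # share of group g
        s_ig = s[idx] / s_g                           # shares inside group g
        phi_g = phi[idx].mean()                       # unweighted group mean
        m_g = -(np.sum(s_ig * phi[idx]) - phi_g)      # equation (4)
        within += s_g * m_g                           # contribution to M-tilde
        between += -(s_g - idx.sum() / n) * (phi_g - phi.mean())  # M-delta
    return total, within, between

# Simulated example: two hypothetical groups with different mean productivity.
rng = np.random.default_rng(1)
group = np.repeat(["A", "B"], [300, 700])
phi = rng.normal(size=1000) + np.where(group == "A", 0.5, 0.0)
employment = rng.lognormal(size=1000)     # drawn independently of productivity
total, within, between = decompose(employment, phi, group)
```

The identity `total == within + between` holds up to floating-point error for any partition, so the same function can be applied to districts, sectors, or industries.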
Let us focus our discussion on a specific year, 2010. Although the exact numbers differ, the results for the other years follow very similar patterns. The misallocation index for output is -0.687. This is very close to the overall district mean for 2010 of -0.70. By the decomposition proposed in equation (14), -0.687 is the sum of within-group weighted misallocation of -0.605 and between-group misallocation of -0.082. Hence, there is more output misallocation between groups than within groups. Using the formulas for $\tilde{M}$ and $M^{\Delta}$ given above, we can compute that female-owned establishments, which represent about 22% of establishments, produce only about 8% of output. This seems like an extremely low ratio of output to the number of establishments. This major imbalance may be at the root of much misallocation.

However, the misallocation of output does not come from an abnormally low ratio of output to the number of establishments. Quite the opposite: this ratio is actually abnormally large. To understand this, note that the difference in estimated productivity between male- and female-owned establishments documented in the last two columns of table 1 is about 0.57 log points. This is an extremely large productivity gap. Hence, should the two groups of establishments be similar in other respects, an extremely small share of output (and factors) for female-owned establishments would be desirable from an efficiency point of view. When a group of establishments is much less productive, efficiency requires this group to receive only a tiny share of factors.4 To be clear, this last statement concerns only efficiency and remains true only when the two groups of establishments are not too different in terms of misallocation within their group.

This caveat regarding symmetry across groups is important. The fourth and fifth columns of table 1 report the misallocation of output within male- and female-owned establishments.
They clearly show a greater amount of misallocation among male-owned establishments. Female-owned establishments have access to less than 12% of employment, 10% of other fixed assets, and less than 9% of land and buildings. They manage to produce about 8% of output despite being immensely less productive. Given their much lower productivity, we would expect a set of representative female-owned establishments to produce only about 6.5% of total output. However, female-owned establishments produce 8% of total output because of less misallocation among them relative to male-owned establishments.

Hence, the overall effect of female-owned establishments on the misallocation of output is ambiguous. On the one hand, they receive too many factors given their extremely low productivity (or, perhaps more accurately, too many female workers may be stuck working for female-owned establishments).5 This contributes to more misallocation. On the other hand, misallocation is less among female-owned establishments, which contributes to less misallocation in the unorganized sector.

4 How tiny depends on the degree of differentiation between goods. See Melitz (2003) and subsequent literature for more on this.

Two further results in table 1 are worth noting. First, the lower level of output misallocation for female-owned establishments is the outcome of a lower level of factor misallocation. While for output, misallocation was worse across groups than within groups, the opposite is true for factors. For factors, female-owned establishments unambiguously contribute to less misallocation because (i) between-group misallocation is lower than within-group misallocation and (ii) there is less misallocation among female-owned establishments.
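As a rough consistency check on the between-group figure for output in 2010, the formula for $M^{\Delta}$ can be evaluated with the rounded numbers quoted in this section. The sketch below rests on two assumptions that the text does not state explicitly: that the 0.57 log-point gap refers to unweighted group means, and that the rounded shares (22% of establishments, 8% of output) are accurate enough for the purpose.

```python
# Rounded 2010 figures quoted in the text for the unorganized sector.
share_est_female = 0.22   # female-owned share of establishments, n_g / n
share_out_female = 0.08   # female-owned share of output, s_g
tfp_gap = 0.57            # male minus female mean log productivity
                          # (assumed to be a gap in unweighted group means)

# (output share, establishment share) for each group.
groups = {
    "female": (share_out_female, share_est_female),
    "male": (1.0 - share_out_female, 1.0 - share_est_female),
}

# Group mean productivities; the female mean is normalised to zero because
# only differences from the overall mean enter the formula.
phi_bar = {"female": 0.0, "male": tfp_gap}
phi_all = sum(n_share * phi_bar[g] for g, (_, n_share) in groups.items())

# Between-group term: M_delta = -sum_g (s_g - n_g/n) * (phi_bar_g - phi_bar).
m_between = -sum(
    (s_g - n_share) * (phi_bar[g] - phi_all)
    for g, (s_g, n_share) in groups.items()
)
```

With these inputs the between-group term is about -0.08, in line with the value of -0.082 quoted above; the small gap is to be expected given the rounding of the inputs.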
The reason why within-group misallocation is less severe for output is that even if establishments receive factors independently of their productivity, the most productive establishments will produce much more output than the less productive establishments. Here, despite extremely large productivity differences between female- and male-owned establishments, within-group heterogeneity is even greater.

5 This last point is certainly consistent with female-owned establishments being more labor intensive and having access to less in terms of land and buildings.

3.4 Within and between misallocation: Organized and unorganized sector

We repeat the same decomposition exercise but perform it on the organized and unorganized sectors instead of male- and female-owned establishments. The results are reported in table 2. Again, we focus our attention on 2010. The results for the other years of data exhibit similar patterns. Starting with the misallocation of output, we can note that, although the between-group component $M^{\Delta}$ is indicative of more misallocation than the within-group component, $\tilde{M}$, they are both of roughly similar magnitude. This is in sharp contrast with the previous decomposition.

For factors of production, a different set of patterns holds: within-group misallocation is much worse than between-group misallocation. The within-group component is slightly negative for land and buildings and positive for employment and other fixed assets. Overall, the misallocation indices for factors are close to zero. This indicates an almost complete lack of correlation between factor use and productivity. On the other hand, factors are allocated much more efficiently between groups. Establishments in the organized sector, which are immensely more productive than establishments in the unorganized sector, have access, on average, to a much larger quantity of factors of production.
Then, comparing misallocation within the organized sector with misallocation within the unorganized sector, we do not observe large differences between them. In 2010, there was slightly less misallocation in the organized sector for land and buildings but slightly more misallocation for employment and other fixed assets. These roughly similar numbers for factor misallocation are consistent with comparable misallocation indices for output across the two sectors. The differences in misallocation for previous years are sometimes larger but reflect again the lack of a clear ordering of the within-sector misallocation indices. We also repeated the same exercise for each state and each district and found very similar results.

The conclusion regarding this decomposition is simple. The organized and unorganized sectors are roughly equally misallocated. The allocation of factors across the two sectors seems much more efficient as the more productive organized sector enjoys the use of disproportionately more factors.

3.5 Within and between misallocation: Industries and districts

Our next step is to decompose misallocation within and between industries at the district level. To do so, we must make two minor changes relative to our previous decompositions. First, we decompose within each district our indices of misallocation into a within-industry term and a between-industry term. It would not be informative to report numbers for misallocation of each district. Instead, we report means across districts. Hence, we should think of the results discussed below as applying to an average district. Second, all our tfp estimations have so far followed the usual practice of normalizing mean industry productivity to zero. This normalization implies that the between term in the decomposition of misallocation is expected to be close to zero so that M ≈ M̃.6 This is of course uninformative about misallocation across industries.
The reason why mean industry tfp is usually normalized to zero is that we estimate different factor shares for different industries. As a result, it is unclear what an industry mean tfp really captures. We are not overly worried by this issue in our context because the factor shares are fairly constant across industries as discussed in Duranton et al. (2015). It is also the case that, when we compute misallocation directly at the district level without first aggregating within industries, we obtain misallocation indices that are very close to those we obtain when first aggregating within industries and then averaging across industries depending on their weight.

The results are reported in table 3. As in previous tables, the results are very similar across years. Hence, we can focus our discussion on 2010. First, because of the change of computation method for total factor productivity, the misallocation indices we obtain are not directly comparable with our earlier indices. Nonetheless, the main descriptive features for total misallocation within districts are the same as previously. Namely: misallocation is less important for output than for factors; among factors, misallocation is higher for employment and land and buildings than for other fixed assets; there is also a slight trend over time towards a worsening of misallocation.

Turning to the decomposition, the within and the between components are roughly of the same magnitude. Put slightly differently, misallocation appears to be as much an effect of factors being misallocated across establishments of heterogeneous productivity within industries as it is an effect of factors being misallocated across industries. Again, some caution is needed here because measurement problems could affect one term of the decomposition more than the other.

6 It would be exactly zero when decomposing within and between industries at the national level. Patterns of Ricardian comparative advantage and sampling issues may make this term different from zero at the district level.

3.6 Within and between misallocation: Country, states, and districts

We now turn to the decomposition of national misallocation into districts. As already argued, national misallocation may be about more productive establishments locally being unable to access factors and grow or about factors not being able (or willing) to move to more productive areas. These are two different phenomena, which arguably call for different policy responses.

The results for this geographic decomposition are reported in table 4. As with previous decompositions, the results are extremely stable over time so that we can focus our discussion on 2010 again. Many of the patterns already observed above still hold. For instance, there is less misallocation for output than for factors and factors are roughly equally misallocated.

The first striking new result is that national misallocation reported in the first column is often very close to the unweighted mean of district misallocation. Hence, to a large extent, the national picture about misallocation is a close reflection of what happens in an average district. Then, we can also note that the share-weighted average for misallocation across districts, M̃, is typically higher than the unweighted mean. Larger districts, where output is higher and factors more abundant, are also more misallocated districts. As shown by column 5, there is a lot of variation in misallocation across districts.

At the same time we also observe that the between-district term, M∆, is negative and of about the same magnitude as the difference between the weighted and unweighted mean district misallocation. This explains why national misallocation is about the same as district misallocation. The negative sign of M∆ suggests that more productive districts produce more output and receive more factors of production per establishment.
Because larger districts with more factors and more establishments are also often more productive districts, we end up in the following situation.7 Larger and more productive districts make national misallocation worse because of their worse misallocation. At the same time, they have more factors, which reduces national misallocation. The two effects roughly offset each other so that national misallocation is not very different from misallocation in an average district.

Finally, we can also see that the between-district misallocation term, M∆, is much higher than the within-district misallocation term, M̃. This is despite large differences in tfp across districts, as shown by the last column of table 4, which reports the standard deviation of district tfp. Values of between-district misallocation this close to zero are suggestive of a lot of misallocation of factors of production across districts.

In table 5, we report results for a similar decomposition but, this time, for districts within states. The reported results are averages across states. They show very similar patterns to the previous decomposition of districts within the country. Mean district misallocation within states is very close to mean state misallocation. Again, larger districts appear more misallocated so that the weighted mean of district misallocation within states (the cross-state mean of M̃) is higher than its unweighted mean. It is also the case that between-district misallocation within states is high.

Finally, table 6 reports results for the decomposition of national misallocation within and between states. The results are highly consistent with those of the previous two decompositions and, if anything, even more marked. The between-state misallocation term is close to zero, suggesting that establishments in more productive states do not receive on average more inputs.
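To make the roles of M̃ and M∆ concrete, the following sketch recomputes a decomposition of this kind on made-up numbers. Since the formal definitions of section 3.1 are not restated here, it assumes that the misallocation index is the negative of an Olley-Pakes covariance term (the unweighted mean of tfp minus its share-weighted mean), that M̃ weights each group's index by the group's share of the total, and that M∆ collects the remainder. The function names, the toy data, and the two groups A and B are illustrative assumptions and not values from the tables.

```python
import numpy as np


def misallocation(shares, tfp):
    """Assumed index: unweighted mean tfp minus share-weighted mean tfp.

    Zero means shares are uncorrelated with tfp; negative means more
    productive establishments hold larger shares.
    """
    shares = np.asarray(shares, dtype=float)
    tfp = np.asarray(tfp, dtype=float)
    shares = shares / shares.sum()  # renormalize so shares sum to one
    return tfp.mean() - np.sum(shares * tfp)


def decompose(shares, tfp, group):
    """Split the overall index into a within term and a between term."""
    shares = np.asarray(shares, dtype=float)
    shares = shares / shares.sum()
    tfp = np.asarray(tfp, dtype=float)
    group = np.asarray(group)
    n = len(tfp)

    within = 0.0   # share-weighted mean of group indices (M-tilde)
    between = 0.0  # between-group term (M-delta)
    group_indices = []
    for g in np.unique(group):
        mask = group == g
        group_share = shares[mask].sum()
        m_g = misallocation(shares[mask], tfp[mask])
        group_indices.append(m_g)
        within += group_share * m_g
        # Gap between the group's share of establishments and its share
        # of output (or factors), valued at the group's mean tfp.
        between += (mask.sum() / n - group_share) * tfp[mask].mean()
    return within, between, float(np.mean(group_indices))


# Hypothetical data: six establishments in two groups (e.g. districts).
tfp = [0.5, -0.2, 0.1, 1.0, 0.8, -0.4]
shares = [0.1, 0.1, 0.1, 0.3, 0.3, 0.1]
group = ["A", "A", "A", "B", "B", "B"]

total = misallocation(shares, tfp)
within, between, unweighted_mean = decompose(shares, tfp, group)
print(total, within, between, unweighted_mean)
```

Under these assumptions the identity M = M̃ + M∆ holds exactly, and M∆ is negative whenever groups with higher mean tfp hold a share of output or factors above their share of establishments, which is the reading given to the negative sign of M∆ above.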
Taking all these results together, two major conclusions emerge. First, the misallocation picture remains the same whether we look at the country, state, or district level. This is an important message, which suggests that much can be learnt from the cross-section of districts. Second, there is a lot of misallocation between districts within states and between states. This is consistent with the perception that India suffers from a major problem regarding factor mobility (see for instance Munshi and Rosenzweig, 2016). Importantly, the results appear to be the same for a mostly immobile factor such as land and buildings as for supposedly more mobile factors such as employment or other fixed assets.8

7 This is confirmed by column 6 which shows that (unweighted) mean district tfp is negative whereas the national mean is normalized to zero.

4. An exploration of the service sector

In this section we duplicate the analysis of Duranton et al. (2015) for the service sector. This is a useful exercise that addresses a number of questions:
- Can we confirm that factor misallocation affects output misallocation?
- Can we confirm the importance of factor misallocation on output per worker?
- Did policies affect factor misallocation?
- Is land as important for services as it is for manufacturing?

4.1 Data and descriptive statistics

Beginning with the 57th round during July 2001 to June 2002, the nsso conducted a survey of unorganized enterprises in the service sector. This survey was repeated as part of the 63rd round of the National Sample Survey (nss) during July 2006 to June 2007. The 57th round broadly covered all unorganized service sector enterprises engaged in the activities of hotels and restaurants; transport, storage and communication; real estate, renting and business activities; education; health and social work; and other community, social and personal service activities.
The 63rd round additionally covered financial intermediation services, which the 57th round did not. Conversely, specific sectors such as non-mechanised transport activities related to transport via railways and activities of business, employers and professional organizations were covered in the 57th round but excluded from the 63rd round. Neither survey covered the service sector enterprises pursuing the activities of wholesale and retail trade, repair of motor vehicles, motorcycles and personal and household goods; public administration and defence; private households with employed persons; and extra-territorial organizations and bodies. As in the case of the nsso surveys for manufacturing establishments, the sample stratification in services also includes both the district level and the two-digit nic sectors.9

8 While land is immobile, this is not the case for buildings before their location is decided. We must also keep in mind that we are talking about manufacturing land which represents only a small proportion of total land. Its supply should arguably be somewhat flexible unlike the total supply of land. As an additional point, note also that the immobility of land might be driving the immobility of the other factors. While this is true to some extent, it is hard to believe that land and other factors are perfect complements.

The survey covered the whole of the Indian Union except (i) Leh (Ladakh), Kargil, Punch and Rajauri districts of Jammu & Kashmir, (ii) interior villages situated beyond 5 km of a bus route in Nagaland, and (iii) villages of Andaman and Nicobar Islands which remain inaccessible throughout the year. Thus the corresponding State/UT level estimates and the all-India results presented here relate to the areas covered by the survey.
As in the case of manufacturing surveys, the services counterpart of the nsso survey collects analogous establishment-level information on output, inputs (raw materials and other inputs), fixed assets, value of land and buildings, employment, energy consumption and so on. The sample stratification is also similar to that of the nsso manufacturing surveys. Altogether, a total of 190,142 service sector enterprises, considering both list frame and area frame, were surveyed in the 63rd round while the 57th round generated 359,702 establishment-level observations. Thus, we have a total of 549,844 observations across the two years. We note that industries with two-digit National Industrial Classification codes 61 (inland, sea and coastal water transport), 73 (research on natural sciences, engineering, social sciences and humanities) and 90 (sanitation services) have fewer than 400 observations each year and are hence not suitable for any industry-level computations, such as total factor productivity estimations. Thus, we drop these observations, and are left with 359,050 observations in 2000 and 189,584 in 2005, giving us a total of 548,634.

9 Furthermore, as the nic definitions have changed over time, we make them consistent using concordances that come with the data.

For our calculation of productivity and misallocation metrics, we need to further clean up the data by dropping observations belonging to unknown district names, and small and conflict states. These include Andaman and Nicobar Islands, Dadra & Nagar Haveli, Daman & Diu, Jammu & Kashmir, Tripura, Manipur, Meghalaya, Mizoram, Nagaland and Assam. The resulting sample that we work with for productivity and misallocation metrics consists of 490,948 observations across both years. Finally, we drop observations with negative value added, missing raw materials value and missing values of fixed assets. Thus, we are left with 464,285 observations in total, with 308,475 corresponding to the year 2000 and 155,810 from 2005.
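The sample counts above can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the text. The variable names are illustrative, the labelling of the 57th and 63rd rounds as 2000 and 2005 follows the shorthand used above, and the number of observations dropped at each step is derived here rather than reported by the survey.

```python
# Observation counts as quoted in the text
# (57th round labelled "2000", 63rd round labelled "2005").
raw = {"2000": 359_702, "2005": 190_142}
after_industry_drop = {"2000": 359_050, "2005": 189_584}
after_geography_drop_total = 490_948
final = {"2000": 308_475, "2005": 155_810}

raw_total = sum(raw.values())
industry_total = sum(after_industry_drop.values())
final_total = sum(final.values())

# Observations removed at each cleaning step (derived, not quoted).
dropped_small_industries = raw_total - industry_total
dropped_geography = industry_total - after_geography_drop_total
dropped_missing_or_negative = after_geography_drop_total - final_total

print(raw_total, industry_total, final_total)
print(dropped_small_industries, dropped_geography, dropped_missing_or_negative)
```

The three totals match those stated in the text, which suggests that the quoted per-round counts are internally consistent.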
This leaves us with over 380 districts each year. See table 7 for some descriptive statistics.

To estimate total factor productivity, our preferred specification regresses establishment value added on employment and total fixed assets with ols. We check our results with an alternative specification where we use three factors of production to distinguish between land and buildings and other fixed assets. This second specification is closer to the spirit of what we do below. Nonetheless, we do not use it as our baseline because breaking down capital into different components will introduce more measurement error and may lead to less precise productivity estimates. In practice, taking one or the other estimate of tfp makes minimal difference as the two are highly correlated, as we shall see below. In future work, we should be able to use at least one more year of data and use more refined productivity estimates as in Duranton et al. (2015).

4.2 Results for the determinants of misallocation

We start by estimating regression (6) for services. The results are reported in table 8. The same regressions are estimated in both panels. The main difference is that the regressions in panel B are weighted by district employment whereas those in panel A are unweighted.

Column 1 regresses the misallocation of output for services in Indian districts on corresponding measures of misallocation for employment, land and buildings and other fixed assets. The estimated coefficient on employment misallocation is high at 0.80 and highly significant. The coefficient on misallocation of other fixed assets is much lower at 0.14 but nonetheless significant. The coefficient on the misallocation of land and buildings is of the same magnitude as that on other fixed assets at 0.13 but insignificant. The coefficient on employment misallocation in panel B, where we use district employment to weight observations, is slightly lower than the corresponding unweighted coefficient.
The coefficient on the misallocation of other fixed assets is virtually unchanged while the coefficient on the misallocation of land and buildings is higher at 0.20 and significant at 5%. In column 2, we consider all forms of capital together. The coefficient on employment misallocation is marginally higher while that on all fixed assets approaches the sum of the coefficients of its two components in the previous column. In column 3, we add district fixed effects to the regression of column 1. All the estimated coefficients decrease, but only slightly. This leads to a loss of significance for land and buildings and other fixed assets. The coefficient on employment misallocation remains highly significant. In column 4, we again repeat our baseline estimation of column 1 but measure output per worker through gross revenue per worker instead of value added and find similar results. In column 5, we use a measure of total factor productivity estimated with three production factors instead of two. The differences with column 1 are small. The coefficient on the misallocation of employment increases while those on the other factors decrease. Finally, columns 6 and 7 use measures of misallocation computed using the covariance of output or factor shares with output per worker, which then serves as the measure of productivity. These two columns yield substantially different results with much higher coefficients on the misallocation of land and buildings and other fixed assets. On the other hand, the coefficient on the misallocation of labor is now insignificant with point estimates close to zero.

To understand the results of these last two columns and why they differ so much from those reported in the previous columns, the first point to note is that output per worker and total factor productivity are less highly correlated than one might have thought. The correlation between output per worker at the establishment level and our preferred measure of ols tfp is 0.35.
At the same time, the correlation between our preferred measure of tfp with two factors and one estimated with three factors of production (distinguishing between land and buildings and other fixed assets) is above 0.90. Hence, even though services are highly labor intensive, other factors still matter. Put slightly differently, differences in output per worker will measure differences in capital intensity more than differences in productivity. This will be especially true when employment is poorly allocated across establishments. Hence the covariance between employment and output per worker will measure essentially the same thing as the covariance between land and buildings (or other fixed assets) and output per worker, only less well. As a result, when both types of covariances are used in the same regression, that between employment and output per worker (i.e., employment misallocation) will have little to no explanatory power.

The first key conclusion we can draw from table 8 is that the sum of the coefficients on factor misallocation is close to one unless we measure establishment productivity with output per worker, in which case it ranges between 0.75 and one. This finding mirrors our result in Duranton et al. (2015) for manufacturing. As we show in section 3.1, under a Cobb-Douglas production function we can decompose the misallocation of output among a group of establishments into the sum of their factor misallocations, each weighted by its share in the production function. Hence, under constant returns to scale, we expect the sum of the misallocation coefficients to be close to one in regression (6). This provides a useful validation of our misallocation indices and a check on the quality of the data.

The second key conclusion we draw from table 8 is that the misallocation of employment appears to play a key role in determining output misallocation. This is consistent with the high levels of labor intensity of most service industries.
This result is also very much in contrast with the result that, in manufacturing, the key driver of the misallocation of output is the misallocation of land and buildings.

A number of further points are also worth noting. First, the coefficient on the misallocation of land and buildings shows a greater tendency to be significant when the regressions weight districts by their initial employment level. This finding mirrors similar findings in Duranton et al. (2015) and is consistent with the notion that the misallocation of land and buildings matters more in denser districts where the pressure on land is perhaps greater. There is also a potential worry that measures of productivity for services may be extremely noisy, more so than for manufacturing. The fact that in columns 1-5 the sum of misallocation coefficients is close to one is reassuring in this respect. Another validation comes from the fairly high levels of R2, especially in the weighted regressions. In these latter regressions for services, the R2 are roughly similar to those obtained for manufacturing.

In results not reported here, we also experimented with a more dynamic approach to the relationship between output and factor misallocation by regressing changes in output misallocation over time on the initial level of output misallocation and initial factor misallocation. Because misallocation is not well measured, the change in output misallocation over time is strongly negatively correlated with its initial level. In Duranton et al. (2015), we found some evidence that for manufacturing the initial misallocation of land and buildings breeds subsequent output misallocation in denser districts. In services, there is modest evidence that something similar happens with the initial misallocation of employment, which affects the subsequent misallocation of output.

In tables 9 and 10, we also briefly investigate the policy determinants of misallocation in services across Indian districts.
As in Duranton et al. (2015), we estimate a version of regression (5) where we use changes in the planning regulations that affected major Indian cities. The Parliament of India enacted the Urban Land (Ceiling and Regulation) Act (ulcra) in 1976. Its main objective was to limit the concentration of urban land. It imposed ceiling limits on holdings of vacant land, prohibited many transfers of land and buildings, and restricted building construction in 64 of the largest urban agglomerations (central cities and their suburbs). For almost a quarter century, ulcra practically halted legal development of land by the private sector in urban areas unless exemptions were obtained (Srinivas, 1991, Sridhar, 2010). The regulation and market constraints reduced the incentives of landholders to invest in building construction. Thus, a large proportion of establishments were both land and building constrained by way of ulcra.

The Repeal Act in 1999 gave state governments the right to repeal ulcra. As argued in Duranton et al. (2015), this reform was mostly unanticipated. A number of states and uts had repealed ulcra by 2003, including Delhi, Gujarat, Haryana, Karnataka, Madhya Pradesh, Orissa, Punjab, Rajasthan, and Uttar Pradesh. By contrast, Andhra Pradesh, Assam, Bihar, Maharashtra, and West Bengal kept ulcra in effect until 2008.

To assess the effects of the repeal of ulcra on misallocation, we estimate a series of regressions in the spirit of regression (5) where we consider the district-level change in misallocation for services from 2000 to 2005 as the dependent variable. Our key explanatory variable is an indicator variable for states that repealed ulcra early, as listed above. Unlike for manufacturing, we find very little effect of the repeal of ulcra on output or land and building misallocation in table 9. In all cases the coefficient on the repeal of ulcra is positive, indicating a worsening of misallocation.
However, this coefficient is significant in only three cases. This constitutes modest evidence of a perverse effect of the repeal of ulcra on misallocation. For manufacturing, the results of Duranton et al. (2015) indicate a strong, negative, significant effect of the repeal of ulcra on misallocation. It may be that the repeal of ulcra helped land-intensive manufacturing establishments expand, which reduced misallocation in manufacturing. Alternatively, the repeal of ulcra may have made it easier for unproductive manufacturing establishments to convert their land and sell it. Their exit may have reduced misallocation in manufacturing. Many entrants in the service sector using this newly available land may not have been the most productive establishments within services, either because these were very young establishments for which measured productivity is often low or because establishments that can buy or lease land may be less productive large incumbents.10

10 It may also be that the adjustment towards less misallocation takes longer and that these policy changes initially led to more misallocation. See below for further discussion of such effects.

We confirm this perhaps surprising finding by looking at the effects of changes in stamp duties. Stamp duties are taxes collected whenever real property is transacted. High stamp duties impose high compliance costs on taxpayers and lead to widespread avoidance through under-reporting. This, in turn, adversely affects the possibility of using land as collateral for construction financing. High stamp duties also discourage land transactions and, as a consequence, reduce the supply of land on the market. Although stamp duties are high in India by international standards, there is a lot of variation across states and time. Table 10 uses a difference-in-difference estimation similar to table 9 to consider the relationship between changes in stamp duties at the state level and changes in local misallocation in services.
We consider changes in misallocation from 2000 to 2005 as the outcome variable and changes in stamp duties over the same period as the core explanatory variable.

Table 10 reports that lower stamp duties are associated with more misallocation. While this finding is not particularly strong, it is in sharp contrast with the opposite finding for manufacturing in Duranton et al. (2015). This suggests again that policy changes that reduce frictions in the land market reduce misallocation in the manufacturing sector but appear to mildly increase misallocation in services.

4.3 Results for output per worker

To finish with services, we turn to regression (7) and estimate the effect of factor misallocation on output per worker. The results are reported in table 11. Panel A reports results for a pooled cross-section of districts. Panel B reports results for the same set of districts but weights them by their initial employment. Because the results are essentially the same across both panels, we only discuss the results from panel A.

Column 1 regresses output per worker in services for 2000 and 2005 on the misallocation of land and buildings, the misallocation of other fixed assets, and the misallocation of employment. The coefficient on the misallocation of employment is negative, large in magnitude, and highly significant. More misallocation of employment is associated with less output per worker. The coefficient of -0.88 indicates that a 0.1 increase in the index of employment misallocation is associated with a decrease of about 9% in output per worker. Alternatively, a one standard deviation increase (0.2) in the misallocation of employment is associated with a decline of 16% in output per worker. The coefficients on the misallocation of land and buildings and on that of other fixed assets are both positive and significant, albeit of much smaller magnitude.
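As a reading aid for the magnitudes just quoted, the sketch below converts the coefficient of -0.88 into percentage changes. It assumes that the dependent variable of regression (7) is the logarithm of output per worker, which is not stated in this section. Under that assumption a change of Δ in the index shifts output per worker by exp(-0.88 Δ) - 1, with -0.88 Δ as the usual first-order approximation. The function name is illustrative.

```python
import math

COEF_EMPLOYMENT = -0.88  # coefficient quoted for column 1 of table 11


def percent_effect(coef, delta):
    """Return (linear approximation, exact log-based change), in percent.

    Assumes the outcome is in logs, so that the exact proportional
    change is exp(coef * delta) - 1.
    """
    linear = 100.0 * coef * delta
    exact = 100.0 * (math.exp(coef * delta) - 1.0)
    return linear, exact


for delta in (0.1, 0.2):  # 0.2 is the standard deviation quoted in the text
    linear, exact = percent_effect(COEF_EMPLOYMENT, delta)
    print(f"delta={delta}: linear {linear:.1f}%, exact {exact:.1f}%")
```

Under this assumption, a 0.1 increase corresponds to roughly -8.8% (linear) or -8.4% (exact), and a 0.2 increase to roughly -17.6% (linear) or -16.1% (exact), in line with the figures of about 9% and 16% quoted above.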
Because the coefficients on land and buildings and on other fixed assets are strongly negative when estimated alone and because our misallocation indices are highly correlated across factors, this regression is likely to suffer from multicollinearity.11

Column 2 considers misallocation for all fixed assets and yields essentially similar results. Adding district fixed effects in column 3 makes the coefficient on employment misallocation insignificant. We need to remain cautious when interpreting this result since our panel for services is extremely short. It is likely that we lack the statistical power to estimate this regression properly. Considering an alternate measure of tfp estimated using three factors of production instead of two in column 4 makes little difference relative to our baseline estimation in column 1. Using yet another measure of productivity, output per worker, to compute misallocation in columns 5 and 6 leads to roughly similar results for land and buildings and for other fixed assets. The coefficients for employment misallocation are lower.12

These results are again in contrast with those of Duranton et al. (2015) for manufacturing. For services, the key measure of misallocation that explains output per worker is the misallocation of employment whereas for manufacturing it was the misallocation of land and buildings.

To conclude on services, we can return to the questions raised at the beginning of this section and provide answers to them.
- Can we confirm that factor misallocation affects output misallocation?
Yes. Although the relative importance of factors differs in services relative to manufacturing, we find strong evidence that the misallocation of factors has a strong effect on the misallocation of output. In line with the factor decomposition of the misallocation of output proposed above, we find that the sum of the estimated coefficients is close to one. This is an important validation of our approach.
- Can we confirm the importance of factor misallocation on output per worker?
Yes. Again, the relative importance of factors differs in services relative to manufacturing but we do find a strong effect of misallocation on output per worker. Overall we find effects that are similar to those found for manufacturing.
- Is land as important for services as it is for manufacturing?
No. While the misallocation of land and buildings plays a major role in manufacturing, it seems to play at best a minor role in services. On the other hand, in services it is the misallocation of employment that appears to be the main determinant of output misallocation.
- Did policies affect factor misallocation?
Yes. Our results are nonetheless fairly subtle because we examine policies that primarily affect the land market. The functioning of the land market is fundamental for manufacturing and, perhaps, secondary for services. In light of this, we find that policies that were found to reduce misallocation in manufacturing had, if anything, the opposite effect on services.

11 For the indices used in column 1, the pairwise correlations between the misallocation of employment and either land and buildings or other fixed assets are about 0.60.

12 As per the argument made above, empirically the covariance between employment and output per worker will not constitute a good measure of the covariance between productivity and output per worker.

To sum up, the key takeaway from our analysis of the service sector is that factor misallocation affects output in services in essentially the same way as it does in manufacturing but the crucial factor in services is labor instead of land for manufacturing. Since distortions are arguably worse for the land market than for the labor market, this raises the possibility that this may be a key determinant of India’s success in services relative to manufacturing.

5. Robustness: effects of policies on misallocation

5.1 Background on policy changes

In Duranton et al. (2015), we explored the consequences on factor and output misallocation in Indian districts of the repeal of ulcra and stamp duty changes. There is some evidence that lower stamp duties and the repeal of ulcra both led to less misallocation in manufacturing. As just shown above, the results are different for services. There are two potential worries with this analysis. First, these policy changes may have been determined simultaneously with changes in misallocation. Although a good case can be made that the repeal of ulcra came as a surprise, matters may be less clear for changes in stamp duty taxation. The second worry is that these policies are particular. We expect them to affect mostly the property markets. It is unclear whether our findings for these two policies can be generalized to policies affecting other factors of production.

With this in mind, we now turn to a broader set of policies affecting labor markets, land markets, entry into manufacturing, foreign investment, and international trade. The regulation of the land and labor markets in India is made at the state level. There is a lot of cross-sectional variation between states with, perhaps at one extreme, a state like Gujarat where the outlook is pro-market, and at the other, West Bengal, which was historically ruled by communist-led coalitions. There have also been many changes over the years. These changes were recorded by Besley and Burgess (2000) for the land market and Besley and Burgess (2004) for the labor markets. These data series were subsequently expanded to 1997 by Aghion et al. (2008).

The coding of labor market changes by Besley and Burgess is straightforward for our purpose. Any change in labor market regulations in a state for any given year is assessed to be either pro-worker or pro-employer.
We expect pro-worker changes to create more frictions on the labor market and increase misallocation. We expect pro-employer changes to work in the opposite direction.

Matters are more complex for the land reforms. First, in large part they concerned rural land since Besley and Burgess (2000) find much larger effects of these reforms in rural areas than in urban areas. These reforms may have affected misallocation in manufacturing through the location choices of manufacturing establishments and may be part of the explanation behind the decentralisation of manufacturing in India (Ghani, Goswami, and Kerr, 2012).

Land reforms are also more heterogeneous and their effects on the functioning of land markets less obvious. A first group of reforms concerned the regulation of tenancy. These reforms attempted to give clearer rights to tenants. While greater rights for tenants can lead landowners to hold on to parcels that are inefficiently used, they may also provide greater incentives to tenants to invest more in their parcels. A second group of reforms attempted to abolish intermediaries. These reforms may have improved the efficiency of land markets. A third group of reforms attempted to consolidate land holdings, which may also have improved efficiency. Finally, as with urban land, some reforms attempted to impose ceilings on land holdings with a view to redistributing land. These reforms may have worsened frictions on the land market.

It is important to keep in mind Besley and Burgess’s (2000) conclusions. Overall, land reforms in India have had positive effects in terms of poverty reduction and agricultural wages. However, in many instances agricultural output was reduced. This pattern is consistent with the conversion of agricultural land to manufacturing use. Besley and Burgess (2000) also find that most of the effects of land reforms can be traced back to changes in tenancy regulation and to the abolition of intermediaries.

Next we turn to industry delicencing.
After independence, Indian manufacturing industries were subjected to a strict licencing system. ‘Raj licences’ were an essential tool for the planning of the Indian economy by the government. Entry, expansions, significant changes in capacity, relocations, and even product changes were all governed by a strict system of permits by a government willing to fine-tune the development of manufacturing in India. A first wave of reforms occurred after the rise to power of Rajiv Gandhi in the mid 1980s, following his mother’s assassination. These reforms were largely unexpected. A second wave of reforms occurred in the early 1990s when Congress returned to power following the assassination of Rajiv Gandhi and a balance of payments crisis put pressure on the incoming government to liberalise. Industrial licencing had effectively disappeared in a large majority of industries by the mid 1990s. Together with delicencing, the Indian government also reduced the barriers to foreign direct investment and lowered tariffs. For delicencing, foreign direct investment, and tariffs, we follow Aghion et al. (2008).

5.2 Effects of policies on misallocation

To assess the effects of these reforms, we build a panel of industries across states and compute changes in output misallocation between 1989 and 2000. This is our dependent variable in the regressions reported in table 12. Note that relative to our analysis of services in section 4 or to the results presented in Duranton et al. (2015), we work with different units, industries within states instead of an aggregate of industries within districts. Below, we assess to what extent this different level of analysis affects our results. Our main explanatory variables are the reforms described above, which we aggregate over 1985-1997. Hence, our estimations regress changes in misallocation between 1989 and 2000 on institutional and policy changes between 1985 and 1997. We also consider longer lags for changes in misallocation below.
Because changes in misallocation are often noisy and because the initial level of misallocation may be correlated with what triggered some of the reforms, we also control for the initial level of misallocation in our regressions.

From the regressions reported in table 12, we can draw a number of conclusions regarding the effects of institutional and policy changes. The first one regards the effects of labor reforms. Stricter pro-worker regulations led to worse misallocation. This result is robust across most specifications. It is only when we break down our sample and distinguish between the organized and the unorganized sector that the effect of labor reforms becomes insignificant. This is perhaps unsurprising since labor reforms are thought to have pushed many establishments into the unorganized (informal) sector (Besley and Burgess, 2004).13 More generally, this negative effect of pro-worker labor reforms is fully consistent with the conclusion of Besley and Burgess (2004) who find that these reforms were generally harmful to industrial development in India.

13 See Duranton, Gobillon, and Overman (2005) for a more detailed discussion of the possible pitfalls associated with looking at misallocation over only one part of the establishment size distribution.

The second strong result that emerges from table 12 regards land reforms. These reforms led to less misallocation in manufacturing. As with labor reforms, it is only when we look within the organized sector or within the unorganized sector that the effects are insignificant. As argued above, the net effect of land reforms is in theory ambiguous since some particular reforms were expected to worsen misallocation, while others were expected to improve misallocation. Empirically, the positive effects appear to dominate.
We note that this conclusion is consistent with that of Besley and Burgess (2000) who find that land reforms improved the plight of agricultural workers in India, even though agricultural output declined. It would be interesting to assess how land reforms affected the location of economic activity in India to gain further insight about the positive effects of land reforms on misallocation. This is, however, beyond our scope here.

The third and fourth results are perhaps more surprising. We find that higher tariffs are associated with less misallocation while FDI liberalization is associated with more misallocation. These results may seem surprising as we might have expected entry by foreign establishments and greater import competition to have improved misallocation, not worsened it. These effects are moderately large. Using the coefficients of column 1 of table 12, tariff changes led to an increase in misallocation of about 0.1 when evaluated at the mean whereas FDI liberalization in an industry increased misallocation by 0.065.

When duplicating these regressions over a longer time period for the change in misallocation, an interesting pattern emerges. When looking at changes in misallocation over 1989-2005 instead of 1989-2000, the ‘perverse’ findings essentially disappear for FDI and are much attenuated for trade liberalization. When looking at changes in misallocation over 1989-2010, the reversal gets even stronger. The results are presented in table 13. The effects of trade liberalization on misallocation become insignificant and those for FDI are now strongly negative. Hence, after a longer time period, FDI liberalization is associated with a strong decline in misallocation. Because many other things may have happened in the intervening 2000-2010 period, we must of course remain cautious when interpreting this finding.
Interestingly perhaps, the effects of labor and land reforms are mostly unchanged when we take 2010 as the end of the study period instead of 2000.

Finally, the last result regards delicencing. In table 12, delicencing is associated with a modest and insignificant negative effect on misallocation. As shown in table 13, by 2010 these effects are much stronger and highly significant.14

These results are consistent with two well documented patterns. First, reforms like trade liberalization require some time before everything fully adjusts. For instance, Trefler (2004) shows that the US-Canada free trade agreement led first to losses in Canadian manufacturing before the gains eventually materialised. Initially many establishments had to exit and large numbers of manufacturing jobs disappeared. It is only after a long period that winning establishments managed to offset these job losses, productivity was significantly up, and consumers gained through lower prices. While this adjustment may have taken up to 10 years in Canada, it may not be surprising that it took even longer in India where factor markets are much less flexible.

This explanation does not, however, explain why the initial effect of these reforms was to make factor misallocation worse. The simplest models of trade liberalization with heterogeneous establishments (e.g., Melitz, 2003) suggest that the least productive establishments (arguably responsible for most misallocation) should exit first. This need not be the case. As shown by Holmes and Stevens (2014) in the US case, the surge in Chinese imports over the last 20 years led, in many industries, to the exit of the bigger establishments. This was because these bigger establishments were much more exposed to import competition from China, while smaller establishments catering to more specific market segments remained in business.
Put differently, mainstream furniture producers were, for the most part, wiped out by Chinese imports, while speciality furniture producers were much less affected, even though some of them are poorly productive workshops in rural parts of America.

In results not reported here, we also conducted a series of robustness checks where we started the reform period either earlier (1981) or later (1989) and also estimated the same regressions as in tables 12 and 13 but without controlling for initial misallocation. The results for these alternative specifications are consistent with those reported here.

14 This is not true however when we examine only the organized or the unorganized sector. As argued above, much caution is needed to interpret these regressions because we look at misallocation only over one part of the size distribution of establishments.

We draw two important conclusions from this analysis so far. First, the results of our work on the repeal of ULCRA and on stamp-duty taxation extend to other, broader policies affecting labor, land, international trade and investment, and the regulation of industry. This provides a further validation of the measures of misallocation we developed. This also gives support to the idea that policies can have large effects on misallocation. Second, some policies may take very long to show their long-run effects. Because factor markets in India are subject to considerable frictions, delicencing or trade or FDI liberalization may lead first to a worsening of misallocation before it improves.

5.3 Robustness check: Effects of factor misallocation on output misallocation for industries and states

We now estimate again regression (6) using the same observations as above: industries by state. More precisely, we regress the misallocation of output on the misallocation of the different factors of production. The results are reported in table 14.
In the baseline estimation of column 1, we find that in panel A the coefficient on the misallocation of employment is equal to 0.60 while the coefficients for the misallocation of land and buildings and for the misallocation of other fixed assets are equal to 0.16. All three coefficients are highly significant. Next, we consider different groupings of factors, a different way to measure establishment output (revenue instead of value added), and add fixed effects for industries and/or states. The results are essentially unchanged. Estimating the same regressions by weighting observations by initial employment leads to equally strong results in panel B of table 14. The only difference is that the coefficient on the misallocation of employment and that on the misallocation of land and buildings are now about equally large.

Importantly, the coefficients on factor misallocation usually sum to between 0.9 and 1.0. This is close to the theoretical prediction of 1 discussed above. This result provides yet another validation for this prediction. Again, this lends credence to both our approach and the data we use.

Relative to Duranton et al. (2015) who work at the level of districts, using industries and states leads to a lesser importance of the misallocation of land and buildings and a greater importance of the misallocation of employment. This difference should nonetheless be kept in perspective because the relative sizes of the coefficients obtained here are somewhat sensitive to the exact weighting scheme used to estimate the regressions. We also use the data at a different degree of aggregation. The misallocation of land and buildings may be more important when we look at a finer degree of spatial resolution, districts instead of states.

5.4 Robustness check: Effects of factor misallocation on output per worker for industries and states

Turning to regression (7), we now assess the effect of factor misallocation on output per worker.
In column 1 of table 15, we regress output per worker on misallocation for employment, land and buildings, and other fixed assets. The coefficient for the misallocation of employment is negative and highly significant. The coefficient on the misallocation of land and buildings is also negative and significant but smaller by nearly 50%. The coefficient on other fixed assets is instead positive and highly significant. Although of opposite sign, its magnitude is about the same as that on the misallocation of land and buildings.

As can be seen from the other columns of the same table, these patterns are extremely robust except for the significance of the misallocation of other fixed assets. When imposing both industry and state fixed effects, the coefficient on employment misallocation remains at -0.3 while that on the misallocation of land and buildings remains at -0.2. It is only when we look at the organized sector separately that the patterns are somewhat different, with some differences across the two sectors. For the unorganized sector, the effects of employment and land and buildings misallocation are stronger while those of other fixed assets misallocation are positive and highly significant.

A first explanation for these slightly conflicting results and the estimation of positive coefficients is the presence of multicollinearity problems. The correlation between the misallocation of land and buildings and the misallocation of other fixed assets is 0.72. The correlation between the misallocation of land and buildings and the misallocation of employment is 0.63. Finally, the correlation between the misallocation of employment and the misallocation of other fixed assets is 0.58. Because of these high pairwise correlations, it is difficult to separately identify the effects of each factor misallocation.

To minimize the importance of multicollinearity, table 16 essentially mirrors table 15 but ignores the misallocation of other fixed assets.
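As a rough way to gauge the multicollinearity concern, the three pairwise correlations quoted above can be turned into variance inflation factors (VIFs). This diagnostic is not used in the paper; the sketch below is only illustrative. It assumes that the three misallocation indices are the only regressors (ignoring fixed effects and any other controls) and uses the fact that, for standardized regressors, the VIFs are the diagonal elements of the inverse correlation matrix.

```python
import numpy as np

# Pairwise correlations quoted in the text.
# Order of variables: employment, land and buildings, other fixed assets.
corr = np.array([
    [1.00, 0.63, 0.58],
    [0.63, 1.00, 0.72],
    [0.58, 0.72, 1.00],
])

# For standardized regressors, VIF_j is the j-th diagonal element
# of the inverse of the correlation matrix.
vif = np.diag(np.linalg.inv(corr))

labels = ["employment", "land and buildings", "other fixed assets"]
for name, value in zip(labels, vif):
    # sqrt(VIF) is the factor by which the standard error is inflated
    # relative to the case of uncorrelated regressors.
    print(f"{name}: VIF = {value:.2f}, s.e. inflation = {np.sqrt(value):.2f}")
```

The square root of each VIF gives the factor by which the standard error of the corresponding coefficient is inflated relative to a hypothetical case with uncorrelated regressors.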
Looking at column 5 which includes both industry and state fixed effects, the coefficient on the misallocation of employment is about -0.3 whereas that on the misallocation of land and buildings is about -0.2, which is very close to the coefficients obtained in the previous table.

While we no longer estimate positive coefficients when ignoring other fixed assets, the estimated effects differ somewhat from those in Duranton et al. (2015) at the district level. For states and industries, the misallocation of land and buildings appears to play less of a role while that of employment plays a bigger role relative to estimations performed across industries at the sector level.15 A part of the explanation here is that the misallocation of labor worsens as more aggregate units of analysis are considered due to the extremely low mobility of labor in India (Munshi and Rosenzweig, 2016).

15 We also find that when weighting the regressions by initial employment, the levels of significance of the estimated misallocation coefficients drop. This is most likely because the manufacturing sector is dominated by a few large industries.

6. Robustness: Towards asserting causality and further checks

6.1 Causality issues in the misallocation on misallocation regressions

The analysis proposed in Duranton et al. (2015) was mostly descriptive in spirit and made no attempt at relying on a plausibly exogenous source of variation to establish a chain of causal events except when looking at the policy determinants of misallocation. Regressions (6) and (7) were estimated using ordinary least squares. Let us first focus on equation (6) which regresses output misallocation on the misallocation of employment, land and buildings, and other fixed assets.

A first possible worry is that some local ‘constraint’ may lead a group of more productive establishments to produce less than they otherwise would. That would be a case of the misallocation of output driving the misallocation of factors.
Possible examples of such constraints include local permits, discrimination by consumers against certain establishments, etc. As a result, these establishments will choose to employ less of each factor. Because our results when estimating regression (6) are the same in cross-section and in the within dimension, such ‘constraints’ need to be thought of as shocks, not permanent features that prevent more productive establishments from expanding. But then establishments should react by adjusting first the more flexible factors of production, labor, or perhaps, given the rigidities of the Indian labor market, other fixed factors. So, the misallocation of output should breed the misallocation of the more flexible factors. Instead, we observe the opposite. The strongest association is between the misallocation of output and the misallocation of land and buildings.

Another route to explore is to instrument factor misallocation. As argued above, finding appropriate instruments for factor misallocation is challenging. A policy may affect the misallocation of one factor but it may also directly impose unobserved constraints on production. Instead, we instrument factor misallocation in a given period by the misallocation of the same factor in the previous period, that is, about 5 years before. For this instrumenting strategy to be valid, we need past factor misallocation to cause current factor misallocation. This can be assessed empirically. We can confirm this is the case, at least in some specifications. We also require past factor misallocation to be correlated with current output misallocation only through current factor misallocation or through one of the controls included in the regression. While lags in the production process or anticipated factor adjustments may invalidate this strategy, using such instruments increases the distance between the variation to be explained for contemporary output and the variation used to explain it.
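The logic of instrumenting current misallocation with its own lag can be illustrated with a minimal two-stage least squares sketch. Everything below is synthetic and hypothetical: the variable names, the data generating process, and the "true" coefficient of 0.9 are assumptions made for illustration, not the paper's data or estimates. The sketch also reports the first-stage F statistic, which is the usual diagnostic for the weak-instrument problem.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Synthetic data (all hypothetical).
z = rng.normal(size=n)                      # past factor misallocation (instrument)
u = rng.normal(size=n)                      # unobserved shock hitting both x and y
x = 0.6 * z + 0.8 * u + rng.normal(size=n)  # current factor misallocation
y = 0.9 * x + 1.0 * u + rng.normal(size=n)  # current output misallocation

def ols(dep, regressors):
    """Least-squares coefficients of dep on regressors."""
    return np.linalg.lstsq(regressors, dep, rcond=None)[0]

const = np.ones(n)
Z = np.column_stack([const, z])
X = np.column_stack([const, x])

# Naive OLS: biased here because x is correlated with the shock u.
beta_ols = ols(y, X)[1]

# First stage: regress the endogenous regressor on the instrument.
pi = ols(x, Z)
x_hat = Z @ pi

# First-stage F statistic (one instrument, so F equals the squared t-statistic).
resid = x - x_hat
sigma2 = resid @ resid / (n - 2)
var_pi = sigma2 * np.linalg.inv(Z.T @ Z)[1, 1]
first_stage_F = pi[1] ** 2 / var_pi

# Second stage: regress the outcome on the fitted values.
beta_iv = ols(y, np.column_stack([const, x_hat]))[1]

print(f"OLS: {beta_ols:.3f}  IV: {beta_iv:.3f}  first-stage F: {first_stage_F:.1f}")
```

The second-stage point estimate from this manual procedure is the 2SLS estimate, but the standard errors from a naive second-stage regression would not be valid; a dedicated IV routine should be used for inference.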
The results are reported in table 17. The main issue is that the instruments are often weak, which leads to noisy and insignificant estimates. It is only in column 2, where only two factors of production (employment and all fixed assets) are considered, and in column 4 that the instruments are strong. The results in column 4 remain noisy. We note however that the IV results of column 2 are very close to their corresponding OLS estimation in Duranton et al. (2015). The results of columns 1 and 3 are also close to their corresponding OLS estimates despite larger standard errors.

Overall, three elements support the notion that the results of regression (6) are causal. First, the sum of the estimated OLS coefficients is in line with the prediction of our model where factor misallocation breeds output misallocation. Second, should output and factor misallocations be simultaneously determined, we would expect a stronger correlation of output misallocation with the factors that are easier to adjust. The opposite occurs in the data. Finally, a tentative IV strategy never refutes our initial findings and in a few cases closely matches them.

6.2 Causality issues in the output per worker on misallocation regressions

We now turn to a similar instrumenting strategy for regression (7) where we assess the effect of factor misallocation on output per worker. Again, we worry about the simultaneity of misallocation and output per worker. The difficulty with finding instruments for misallocation is that determinants of factor or output misallocation may also affect output per worker through a number of different channels. That would be the case for instance with the policies that we have explored above. They will affect output per worker through their effects on misallocation. However, they may also affect output per worker through a variety of other channels or perhaps even directly. A similar argument will hold for any contextual variable that affects misallocation.
Such a variable may also have a direct effect on productivity or an indirect effect through a number of alternative channels. This invalidates the use of policy and contextual variables as instruments.

Instead, we use past misallocation to instrument for current misallocation. While this instrument might fail because past misallocation may be correlated with contemporaneous determinants of output per worker, this bias should remain small when we control for permanent characteristics of districts through district fixed effects.

The main challenge is that regression (7) contains three potentially endogenous explanatory variables. They are fairly highly correlated among themselves and so are their lagged values. As a result, IV estimates with three instrumented measures of misallocation are unstable. Instead, we use measures of misallocation only one at a time, focusing mainly on the misallocation of output.

The results are reported in table 18. In OLS, the estimated coefficient on the misallocation of output is -0.74. Instrumenting it with its past value raises its magnitude slightly to -0.94 in column 1. Adding district fixed effects lowers it to -0.46. Weighting by initial employment raises it again to -1.14 while computing misallocation directly across all industries without going through an intermediate stage of industry aggregation leads to a coefficient of -0.55. Using instead the misallocation of employment or that of land and buildings leads to very similar point estimates. Because of larger standard errors, the coefficients when including district fixed effects are insignificant when using factor misallocation instead of output misallocation as the explanatory variable. Even though there is some variation in the estimated effects of misallocation across specifications, they remain in the -0.5 to -1 range, which is the same as the one we obtain when estimating these regressions using OLS.
We experiment with another instrument that we construct using the tendencies of different industries to be misallocated. More specifically, recall that before computing misallocation at the district level, we first compute it for each district and industry. From that intermediate output, we can estimate the tendency of each industry to be misallocated using industry fixed effects. For each district and industry, we can then multiply the share of each industry by the tendency for this industry to be misallocated and sum across all industries in the district. For each district, that gives us an expected misallocation index based on the local composition of economic activity and the national tendency of industries to be misallocated. We can also use a variant of that instrumental variable where we lag the industry composition by one period.

The results are reported in table 19. As can be seen from this table, these instruments are often weak but yield estimates for the effect of the misallocation of output on output per worker that are consistent with our OLS estimates above.

6.3 Simulations

The general idea of the simulations performed here is the following.16 From the data, we know the misallocation of all factors of production and the distribution of total factor productivity across establishments. We also know how factors covary across establishments. Finally, we also know the factor shares that were estimated empirically together with total factor productivity. We can then simulate hypothetical economies with a given amount of factor misallocation to match the data. We can also simulate alternative hypothetical economies with a marginally different amount of misallocation for one particular factor. We can then compute the change of output associated with the change in misallocation for one factor. The hypothetical effects of marginal changes in factor misallocation can then be compared with the effects we estimate with regression (7).
This exercise, which can also be conducted to assess the effect of factor misallocation on output misallocation, allows us to assess whether our empirical results are in line with the predictions of a model for which the production function is as we estimate it and factors are distributed as in the data at hand.

To implement this approach, the main problem is that there is no unique mapping from our empirical measures of misallocation and covariances between factors and productivity across establishments to hypothetical distributions of factors across establishments. We must thus make distributional assumptions. For the problem to remain easily computable, we need to impose lognormal distributions.17

16 We are very grateful to Yu Wang for her help with the modelling.
17 Even when imposing such assumptions, there is no guarantee of uniqueness as made clear below.

Our problem can be stated more precisely as follows. There are 5 variables, output Y, productivity ϕ, land and buildings T, other fixed assets K, and employment L, which are linked by the production function described by equation (1), which we repeat here:

\[ Y_i = e^{\phi_i} \, T_i^{\alpha} \, K_i^{\beta} \, L_i^{\gamma}. \]

The marginal distributions of ϕ, T, K, and L and the correlations between any pair of these variables can be observed in the data.

The objective of the algorithm is to recover the joint distribution of (ϕ, T, K, L) from the marginal distributions (which we observe) and their correlations (which we also observe). To keep the problem tractable, we assume that (ϕ, T, K, L) follows a multivariate lognormal distribution.

To understand the essence of our approach, suppose a multivariate distribution X_i ∼ N(µ_i, σ_ii) with covariances σ_ij. Then, the following joint distribution can satisfy the constraint:

\[ f_X = \frac{1}{\sqrt{(2\pi)^4 |\Sigma|}} \exp\left( -0.5 \, (X - \mu)' \, \Sigma^{-1} (X - \mu) \right), \tag{15} \]

where X is a vector of multivariate lognormally distributed variables, µ the vector of their means and Σ_ij = σ_ij.
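A minimal sketch of this setup draws establishments from a multivariate normal distribution in logs and pushes them through the production function (1). All numerical values below (factor shares, means, standard deviations, correlations, number of establishments) are hypothetical placeholders rather than the calibrated values of the paper. The sketch also assumes, for simplicity, that it is the productivity term e^ϕ (rather than ϕ itself) that is lognormal, so that ϕ, log T, log K and log L are jointly normal.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical factor shares for land and buildings, other fixed assets, labor.
alpha, beta, gamma = 0.2, 0.3, 0.5

# Hypothetical moments of (phi, log T, log K, log L).
mu = np.array([0.0, 1.0, 1.0, 2.0])
sd = np.array([0.5, 1.0, 1.0, 0.8])
corr = np.array([
    [1.0, 0.3, 0.3, 0.3],
    [0.3, 1.0, 0.5, 0.5],
    [0.3, 0.5, 1.0, 0.5],
    [0.3, 0.5, 0.5, 1.0],
])
cov = corr * np.outer(sd, sd)  # Sigma_ij = rho_ij * sd_i * sd_j

# Draw simulated establishments in logs.
n_establishments = 100_000
logs = rng.multivariate_normal(mu, cov, size=n_establishments)
phi, log_T, log_K, log_L = logs.T

# Production function (1) in logs: log Y = phi + alpha log T + beta log K + gamma log L.
log_Y = phi + alpha * log_T + beta * log_K + gamma * log_L
Y = np.exp(log_Y)
L = np.exp(log_L)

# Aggregate output per worker in the simulated economy (sum of Y over sum of L).
log_output_per_worker = np.log(Y.sum() / L.sum())
print(f"log aggregate output per worker: {log_output_per_worker:.3f}")
```

Changing one entry of `corr` between productivity and a factor, redrawing, and recomputing the aggregate gives the kind of marginal comparison that the simulations are meant to deliver.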
To illustrate the working of our approach, consider a pair (X_1, X_2) of bivariate lognormal random variables with correlation coefficient ρ derived from the bivariate normal with marginal means µ_1 and µ_2, standard deviations σ_1 and σ_2, and correlation coefficient ρ_N. Then, it is well known that:

\[ \rho = \frac{\exp(\rho_N \sigma_1 \sigma_2) - 1}{\sqrt{\left(\exp(\sigma_1^2) - 1\right)\left(\exp(\sigma_2^2) - 1\right)}}. \tag{16} \]

This means that given ρ, µ_1, µ_2, σ_1, and σ_2, we can recover at least one joint distribution function for the underlying normal random variables, and therefore, the joint distribution function of lognormally distributed variables. This approach can be easily extended to four variables.

The steps of the algorithm are the following:

1. Take one year of data for both the organized and the unorganized sectors.
2. Calibrate the production function (1) using parameters estimated in Duranton et al. (2015).
3. Calibrate the distributions of ϕ, L, K, and T. This can be done in four different ways:
(a) Full-sample correlation calculated from data (correlation 1).
(b) Average within-district correlation calculated from data (correlation 2).
(c) Full-sample correlation between ϕ and (L, K, T) and zero between L, K, and T (correlation 3).
(d) Average within-district correlation between ϕ and (L, K, T) and zero between L, K, and T (correlation 4).
4. For each of the four cases, simulate random numbers for these four variables.
5. Calculate the following quantities associated with this baseline: ΣY/ΣL, M_Y, M_L, M_T, and M_K.
6. Slightly change the correlations of (ϕ, L), (ϕ, K), (ϕ, T), respectively.
7. Simulate random numbers for these four variables again.
8. Calculate the same five indices.
9. Calculate Δ(ΣY/ΣL)/ΔM_X and ΔM_Y/ΔM_X for X = L, K, T.
10. The number of simulations is 800.

In the interest of space, the full results (4 correlations and 5 years of data with different sampling options) are not reported here. Table 20 reports only results for 2010.
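The calibration of the underlying normal correlation from an observed lognormal correlation amounts to inverting equation (16). The sketch below shows that inversion for a single pair of variables and checks it by simulation. The function names, the target correlation of 0.4 and the standard deviations are hypothetical choices made for illustration; they are not taken from the paper.

```python
import numpy as np

def lognormal_corr(rho_n, s1, s2):
    """Equation (16): correlation of two lognormals given the normal correlation."""
    return (np.exp(rho_n * s1 * s2) - 1.0) / np.sqrt(
        (np.exp(s1 ** 2) - 1.0) * (np.exp(s2 ** 2) - 1.0)
    )

def normal_corr(rho, s1, s2):
    """Invert equation (16): normal correlation that yields lognormal correlation rho."""
    arg = 1.0 + rho * np.sqrt((np.exp(s1 ** 2) - 1.0) * (np.exp(s2 ** 2) - 1.0))
    if arg <= 0.0:
        raise ValueError("no underlying normal correlation matches this target")
    rho_n = np.log(arg) / (s1 * s2)
    if abs(rho_n) > 1.0:
        raise ValueError("implied normal correlation lies outside [-1, 1]")
    return rho_n

# Hypothetical target: lognormal correlation of 0.4 with log standard deviations 0.8 and 1.0.
rho_target, s1, s2 = 0.4, 0.8, 1.0
rho_n = normal_corr(rho_target, s1, s2)

# Check by simulation: draw bivariate normals, exponentiate, measure the correlation.
rng = np.random.default_rng(1)
cov = np.array([
    [s1 ** 2, rho_n * s1 * s2],
    [rho_n * s1 * s2, s2 ** 2],
])
draws = np.exp(rng.multivariate_normal([0.0, 0.0], cov, size=500_000))
rho_sim = np.corrcoef(draws.T)[0, 1]

print(f"normal correlation: {rho_n:.3f}, simulated lognormal correlation: {rho_sim:.3f}")
```

Not every target correlation is attainable for given standard deviations, which is why the inversion raises an error when the implied normal correlation is undefined or outside the unit interval.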
Recall that we can consider four different correlations among factors and productivity: (1) those that we observe in the data in the aggregate, (2) those that we observe on average within districts, (3) the aggregate misallocation measures that we observe in the data but setting the covariances across factors to zero, and (4) the average misallocation measures that we observe within districts but setting covariances across factors to zero.

The first important result from table 20 is a large negative effect of the misallocation of land and buildings. Across all simulations, the coefficient on the misallocation of land and buildings on the log of output per worker is equal to -0.54. In the simulations that correspond most closely to our data, which are in panel C (where sample size is most restricted and where we consider only the within-district correlations), the coefficient coming out of the simulations is -0.44. In the estimations of Duranton et al. (2015), the baseline coefficient on the misallocation of land and buildings estimated across all years is equal to -0.64 and to -0.49 when imposing district fixed effects. We note that the match is less good when we use simulations for other years of data.

For the misallocation of employment, the average coefficient coming from the simulations is -0.12. In the results reported by Duranton et al. (2015), this estimated coefficient is -0.10 in the baseline and -0.09 with district fixed effects. It is insignificant in both cases.

For the misallocation of other fixed assets, the average coefficient from the simulations is -0.32. In the results reported by Duranton et al. (2015), this estimated coefficient is close to zero but negative and significant. These empirical results are mostly driven by a few districts with highly negative values of this misallocation index for some years. When using another measure of TFP, a coefficient of -0.24 is obtained.
Hence, despite some differences across years in the simulations, and a less good match for other fixed assets, the simulations coincide reasonably well with our empirical results. Of course, the match is far from perfect but finding coefficients of similar magnitude is somewhat remarkable given how little structure we impose on the data we simulate.

We reach the following conclusion. Using instrumental variables, we show above that the estimates of Duranton et al. (2015) regarding the effect of misallocation on output per worker are unlikely to be driven by reverse causality. We show here that the allocation of factors across establishments, which we measure through the covariances between factors and TFP and among factors, is able to generate the same type of effect as we estimate from the data. This shows that the results we find are those that we should find in a situation where factor misallocation is a causal force behind output per worker. It seems unlikely that a missing variable that affects output per worker and factor misallocation would affect them in just the exact way that would preserve the expected structural effect of factor misallocation on output per worker.

7. Conclusions

We can draw a number of conclusions from this wide-ranging analysis.

First, we proposed a new factor decomposition of our misallocation indices. In regressions of output misallocation on factor misallocation, the estimated coefficients are as predicted by this decomposition. This provides an important validation of the quality of our data and our methodology.

Second, we also proposed a new within/between decomposition. We applied it to a variety of dimensions. Our main results are the following. The national economy is very much like a larger version of a typical district when it comes to misallocation. There are nonetheless some interesting patterns across districts. Larger and more productive districts appear to be more misallocated.
However, they also receive proportionately more factors. These two effects offset each other in determining national misallocation. We also find that within-district misallocation takes place within industries as well as across industries, to the extent that productivity differences across industries can be well measured. Interestingly, both the organized and unorganized sectors contribute about equally to misallocation. Importantly, the organized sector enjoys the use of disproportionately more factors, which is efficient given the superior productivity of this sector. Finally, we also uncovered interesting patterns regarding the way gender plays into misallocation.

Third, we also analysed data for services to mirror our previous analysis for manufacturing. Just like for manufacturing, we found strong evidence that factor misallocation breeds output misallocation in services. Similarly, we also found that misallocation had a strong negative effect on output per worker. There is an important difference between manufacturing and services regarding land and buildings. The key factor behind misallocation in manufacturing is land and buildings, while labor appears to matter most for services.

We also greatly expanded our work on the effect of economic policies on misallocation. We provided evidence that labor reforms towards greater worker protection worsened misallocation whereas land reforms improved it. We also investigated a number of changes in the regulation of industries. We found that delicencing led to a reduction in misallocation, just like the liberalization of foreign direct investment. Interestingly, these effects took a long time to materialize, and we found evidence of short-term effects often going in the opposite direction. In addition, note that to explore these policies we had to consider different types of observations: industries within states as opposed to all industries aggregated in districts.
This allowed us to check further that factor misallocation breeds output misallocation and that misallocation reduces output per worker.

We also deepened our previous analysis and took some steps towards the estimation of the causal effect of misallocation on output per worker. We developed an instrumental variable approach using lags of misallocation and predictions of misallocation based on the local industry composition of activity interacted with the tendency of different industries to be misallocated. This approach essentially confirmed our previous results. We also developed a new simulation approach where we replicate an economy with a similar joint distribution of factors and productivity as a typical district in our data. We then make a small change to a measure of misallocation in a counterfactual simulation to assess how much this change in misallocation affects output per worker. Again, the simulation results are broadly consistent with our empirical findings.

References

Aghion, Philippe, Robin Burgess, Stephen J. Redding, and Fabrizio Zilibotti. 2008. The unequal effects of liberalization: Evidence from dismantling the license Raj in India. American Economic Review 98(4):1397–1412.
Besley, Timothy and Robin Burgess. 2000. Land reform, poverty reduction, and growth: Evidence from India. Quarterly Journal of Economics 115(2):389–430.
Besley, Timothy and Robin Burgess. 2004. Can labor regulation hinder economic performance? Evidence from India. Quarterly Journal of Economics 119(1):91–134.
Duranton, Gilles, Ejaz Ghani, Arti Grover Goswami, and William Kerr. 2015. The misallocation of land and other factors of production in India. World Bank Working Paper.
Duranton, Gilles, Laurent Gobillon, and Henry G. Overman. 2005. Does local taxation affect business’ decision. Work in progress, London School of Economics.
Ghani, Ejaz, Arti Grover Goswami, and William R. Kerr. 2012. Is India’s manufacturing sector moving away from cities? Working Paper 17992, National Bureau of Economic Research.
Holmes, Thomas J. and John J. Stevens. 2014. An alternative theory of the plant size distribution, with geography and intra- and international trade. Journal of Political Economy 122(2):369–421.
Hsieh, Chang-Tai and Peter J. Klenow. 2009. Misallocation and manufacturing TFP in China and India. Quarterly Journal of Economics 124(4):1403–1448.
Hsieh, Chang-Tai and Peter J. Klenow. 2014. The life cycle of plants in India and Mexico. Quarterly Journal of Economics 129(3):1035–1084.
Hsieh, Chang-Tai and Enrico Moretti. 2015. Why do cities matter? Local growth and aggregate growth. Processed, University of California, Berkeley.
Melitz, Marc J. 2003. The impact of trade on intra-industry reallocations and aggregate industry productivity. Econometrica 71(6):1695–1725.
Munshi, Kaivan and Mark Rosenzweig. 2016. Networks and misallocation: Insurance, migration, and the rural-urban wage gap. American Economic Review 106(1):46–98.
Olley, G. Steven and Ariel Pakes. 1996. The dynamics of productivity in the telecommunication equipment industry. Econometrica 64(6):1263–1297.
Restuccia, Diego and Richard Rogerson. 2008. Policy distortions and aggregate productivity with heterogeneous establishments. Review of Economic Dynamics 11(4):707–720.
Romer, Paul M. 1990. Endogenous technical change. Journal of Political Economy 98(5(2)):S71–S102.
Solow, Robert M. 1956. A contribution to the theory of economic growth. Quarterly Journal of Economics 70(1):65–94.
Sridhar, Kala S. 2010. Impact of land use regulations: Evidence from India’s cities. Urban Studies 47(7):1541–71.
Srinivas, Lakshmi. 1991. Land and politics in India: Working of Urban Land Ceiling Act, 1976. Economic and Political Weekly 26(5):2482–2484.
Trefler, Daniel. 2004. The long and short of the Canada-U.S. Free Trade Agreement. American Economic Review 94(4):870–895.
Table 1: Within-Between decomposition for male- and female-owned establishments

Year  Type                 M        M_tilde  M_delta  M_female  M_male   Female TFP  Male TFP
                           (1)      (2)      (3)      (4)       (5)      (6)         (7)
1994  Output              -0.775   -0.740   -0.035   -1.090    -0.717   -0.348       0.061
      Employment           0.005    0.028   -0.023   -0.064     0.037   -0.348       0.061
      Land and buildings   0.223    0.265   -0.042   -0.082     0.281   -0.348       0.061
      Other fixed assets  -0.239   -0.199   -0.040   -0.800    -0.167   -0.348       0.061
2000  Output              -0.831   -0.755   -0.076   -1.291    -0.713   -0.483       0.120
      Employment          -0.127   -0.074   -0.052   -0.171    -0.062   -0.483       0.120
      Land and buildings  -0.150   -0.076   -0.074   -0.409    -0.049   -0.483       0.120
      Other fixed assets  -0.241   -0.170   -0.071   -0.813    -0.113   -0.483       0.120
2005  Output              -1.321   -1.233   -0.088   -1.718    -1.186   -0.453       0.141
      Employment          -0.215   -0.155   -0.061   -0.250    -0.140   -0.453       0.141
      Land and buildings  -0.243   -0.165   -0.077   -0.559    -0.118   -0.453       0.141
      Other fixed assets  -0.535   -0.450   -0.085   -1.036    -0.389   -0.453       0.141
2010  Output              -0.687   -0.605   -0.082   -1.123    -0.560   -0.441       0.128
      Employment          -0.062    0.000   -0.063   -0.148     0.020   -0.441       0.128
      Land and buildings  -0.092   -0.015   -0.077   -0.138    -0.003   -0.441       0.128
      Other fixed assets   0.024    0.091   -0.067   -0.419     0.151   -0.441       0.128

Table 2: Within-Between decomposition for the organised and unorganised sectors

Year  Type                 M        M_tilde  M_delta  M_organised  M_unorganised  TFP_organised  TFP_unorganised
                           (1)      (2)      (3)      (4)          (5)            (6)            (7)
1994  Output              -1.526   -0.739   -0.786   -0.737       -0.913          0.797         -0.106
      Employment          -0.887   -0.214   -0.673   -0.242       -0.033          0.797         -0.106
      Land and buildings  -0.888   -0.174   -0.713   -0.207        0.146          0.797         -0.106
      Other fixed assets  -1.03    -0.234   -0.796   -0.235        0.891          0.797         -0.106
2000  Output              -1.676   -0.823   -0.852   -0.822       -0.935          0.866         -0.096
      Employment          -1.059   -0.305   -0.754   -0.327       -0.143          0.866         -0.096
      Land and buildings  -1.063   -0.229   -0.834   -0.228       -0.253          0.866         -0.096
      Other fixed assets  -1.237   -0.377   -0.86    -0.377       -0.446          0.866         -0.096
2005  Output              -1.29    -0.816   -0.474   -0.808       -1.563          0.481         -0.205
      Employment          -0.63    -0.176   -0.454   -0.173       -0.245          0.481         -0.205
      Land and buildings  -0.679   -0.203   -0.476   -0.202       -0.416          0.481         -0.205
      Other fixed assets  -0.786   -0.308   -0.478   -0.306       -0.89           0.481         -0.205
2010  Output              -1.264   -0.781   -0.483   -0.781       -0.654          0.486         -0.157
      Employment          -0.447    0.017   -0.464    0.019       -0.043          0.486         -0.157
      Land and buildings  -0.626   -0.149   -0.477   -0.151       -0.034          0.486         -0.157
      Other fixed assets  -0.406    0.079   -0.485    0.079       -0.145          0.486         -0.157

Table 3: Within-Between decomposition for industries in districts
Cross-districts means of (unweighted):

Year  Type                 M        M_tilde  M_delta
1989  Output              -1.35    -1.00    -0.35
      Employment          -0.53    -0.58     0.05
      Land and buildings  -0.80    -0.67    -0.13
      Other fixed assets  -0.94    -0.70    -0.24
1994  Output              -1.27    -0.99    -0.28
      Employment          -0.42    -0.41     0.00
      Land and buildings  -0.53    -0.44    -0.09
      Other fixed assets  -1.16    -0.97    -0.19
2000  Output              -1.42    -0.97    -0.45
      Employment          -0.50    -0.38    -0.12
      Land and buildings  -0.82    -0.53    -0.29
      Other fixed assets  -1.16    -0.70    -0.46
2005  Output              -1.37    -0.86    -0.51
      Employment          -0.52    -0.44    -0.08
      Land and buildings  -0.77    -0.43    -0.34
      Other fixed assets  -1.09    -0.57    -0.52
2010  Output              -1.19    -0.67    -0.52
      Employment          -0.49    -0.29    -0.19
      Land and buildings  -0.59    -0.29    -0.30
      Other fixed assets  -1.00    -0.38    -0.62

Table 4: Within-Between decomposition for districts in the country

Year  Type                 M        M_tilde  M_delta  M_d_mean  M_d_sd   M_tfp_mean  M_tfp_sd
                           (1)      (2)      (3)      (4)       (5)      (6)         (7)
1994  Output              -1.728   -1.62    -0.107   -1.631     0.787   -0.029       0.365
      Employment          -0.887   -0.805   -0.082   -0.737     0.635   -0.029       0.365
      Land and buildings  -0.976   -0.847   -0.130   -0.948     0.778   -0.029       0.365
      Other fixed assets  -1.216   -1.116   -0.101   -1.317     2.468   -0.029       0.365
2000  Output              -1.954   -1.834   -0.12    -1.883     0.933   -0.072       0.365
      Employment          -1.301   -1.213   -0.089   -1.057     0.811   -0.072       0.365
      Land and buildings  -1.304   -1.179   -0.125   -1.337     0.958   -0.072       0.365
      Other fixed assets  -1.471   -1.38    -0.091   -1.462     0.985   -0.072       0.365
2005  Output              -1.848   -1.661   -0.187   -1.953     0.937   -0.128       0.401
      Employment          -1.103   -0.946   -0.157   -1.179     0.833   -0.128       0.401
      Land and buildings  -1.129   -0.974   -0.155   -1.316     1.059   -0.128       0.401
      Other fixed assets  -1.208   -1.085   -0.122   -1.468     0.934   -0.128       0.401
2010  Output              -1.788   -1.608   -0.18    -1.766     0.900   -0.091       0.344
      Employment          -0.974   -0.792   -0.182   -1.052     0.780   -0.091       0.344
      Land and buildings  -1.101   -0.930   -0.171   -1.215     0.909   -0.091       0.344
      Other fixed assets  -0.899   -0.783   -0.116   -1.372     0.950   -0.091       0.344

Table 5: Within-Between decomposition for districts across states
Cross-states means of (unweighted):

Year  Type                 M        M_tilde  M_delta  M_d_mean
                           (1)      (2)      (3)      (4)
1994  Output              -1.595   -1.548   -0.047   -1.501
      Employment          -0.690   -0.634   -0.056   -0.626
      Land and buildings  -0.824   -0.762   -0.062   -0.771
      Other fixed assets  -1.026   -0.988   -0.038   -1.194
2000  Output              -1.795   -1.700   -0.096   -1.776
      Employment          -1.041   -0.957   -0.084   -0.976
      Land and buildings  -1.184   -1.094   -0.090   -1.217
      Other fixed assets  -1.332   -1.249   -0.083   -1.350
2005  Output              -1.682   -1.566   -0.115   -1.758
      Employment          -0.877   -0.757   -0.119   -0.986
      Land and buildings  -0.976   -0.880   -0.096   -1.120
      Other fixed assets  -1.031   -0.935   -0.096   -1.247
2010  Output              -1.723   -1.604   -0.119   -1.721
      Employment          -0.889   -0.754   -0.135   -1.008
      Land and buildings  -1.074   -0.959   -0.115   -1.173
      Other fixed assets  -1.024   -0.933   -0.091   -1.350

Table 6: Within-Between decomposition for states within the country

Year  Type                 M        M_tilde  M_delta  M_s_mean
                           (1)      (2)      (3)      (4)
1994  Output              -1.728   -1.662   -0.065   -1.599
      Employment          -0.887   -0.838   -0.049   -0.693
      Land and buildings  -0.976   -0.899   -0.078   -0.828
      Other fixed assets  -1.216   -1.157   -0.059   -1.029
2000  Output              -1.954   -1.934   -0.020   -1.797
      Employment          -1.301   -1.298   -0.003   -1.041
      Land and buildings  -1.304   -1.282   -0.022   -1.185
      Other fixed assets  -1.471   -1.460   -0.010   -1.332
2005  Output              -1.848   -1.768   -0.079   -1.686
      Employment          -1.103   -1.072   -0.031   -0.878
      Land and buildings  -1.129   -1.065   -0.064   -0.979
      Other fixed assets  -1.208   -1.158   -0.050   -1.034
2010  Output              -1.788   -1.746   -0.042   -1.725
      Employment          -0.974   -0.944   -0.030   -0.890
      Land and buildings  -1.101   -1.065   -0.036   -1.076
      Other fixed assets  -0.899   -0.872   -0.027   -1.025

Table 7: Descriptive
statistics for the service sector

                                                       Plant count     Share in        Share in other  Share in land   Share in
                                                                       employment      fixed assets    and buildings   output
Industry                                                 2000    2005    2000    2005    2000    2005    2000    2005    2000    2005
Hotels and Restaurants                                  70009   26976    0.28    0.21    0.35    0.28    0.28    0.26    0.28    0.17
Land Transport                                          53520   28793    0.11    0.10    0.09    0.10    0.03    0.02    0.09    0.06
Auxiliary Transport Activities                           5268    1844    0.02    0.03    0.03    0.05    0.06    0.05    0.04    0.09
Post and Telecommunications                             27093   21342    0.06    0.08    0.02    0.02    0.09    0.05    0.02    0.02
Real estate activities                                   2580    2557    0.01    0.01    0.01    0.03    0.01    0.05    0.01    0.01
Renting of machinery, equipment and household goods     13413    4577    0.03    0.02    0.02    0.01    0.03    0.02    0.01    0.01
Computers and related activities                          954     924    0.01    0.02    0.02    0.02    0.01    0.02    0.06    0.08
Other business activities                               24592    9711    0.07    0.06    0.04    0.03    0.10    0.09    0.06    0.07
Education and training                                  27481   10093    0.16    0.14    0.11    0.20    0.15    0.15    0.11    0.19
Health and social work                                  29770   10539    0.12    0.08    0.22    0.17    0.13    0.12    0.28    0.11
Activities of religious and other organizations           986    2327    0.00    0.01    0.01    0.02    0.00    0.01    0.00    0.00
Recreational and sporting activities                     7295    2255    0.04    0.02    0.07    0.03    0.03    0.03    0.03    0.02
Domestic and personal services                          45514   23804    0.10    0.11    0.01    0.01    0.08    0.07    0.03    0.04
Financial services                                              10068            0.10            0.03            0.05            0.14
Plant counts to be confirmed

Table 8: Value added misallocation in services as a function of factor misallocation
Columns: (1) Baseline disaggregated estimation; (2) Considering aggregate fixed assets; (3) Including district fixed effects; (4) Considering output misallocation as DV; (5) Considering OLS TFP with 3 factors; (6) Output per worker to compute misallocation; (7) (6) including district fixed effects

A. Unweighted estimations
Land and building misallocation 0.131 0.202 0.106 0.036 0.444+++ 0.426+++
(0.085) (0.124) (0.083) (0.062) (0.058) (0.082)
Other fixed assets misallocation 0.139+++ 0.083 0.163+++ 0.095+++ 0.372+++ 0.292+++
(0.035) (0.057) (0.039) (0.025) (0.038) (0.048)
Employment misallocation 0.800+++ 0.869+++ 0.733+++ 0.780+++ 1.031+++ 0.024 0.020
(0.101) (0.110) (0.184) (0.105) (0.090) (0.044) (0.060)
Total fixed assets misallocation 0.230+++
(0.067)
Observations 763 763 763 763 763 763 763
Adjusted R-squared 0.366 0.361 0.483 0.403 0.404 0.684 0.752

B. Weighting districts by initial employment levels
Land and building misallocation 0.205++ 0.176 0.170++ 0.091 0.521+++ 0.429+++
(0.081) (0.156) (0.080) (0.075) (0.086) (0.128)
Other fixed assets misallocation 0.152+++ 0.110+ 0.152+++ 0.119+++ 0.365+++ 0.320+++
(0.038) (0.062) (0.043) (0.032) (0.058) (0.071)
Employment misallocation 0.753+++ 0.796+++ 0.587++ 0.752+++ 0.900+++ 0.120++ 0.051
(0.109) (0.115) (0.245) (0.128) (0.110) (0.055) (0.080)
Total fixed assets misallocation 0.289+++
(0.064)
Observations 762 762 762 762 762 762 762
Adjusted R-squared 0.592 0.579 0.645 0.593 0.627 0.781 0.842

Notes: Estimations quantify the relationship between value added misallocation levels and that of factor inputs. Observations are district-year values. Regressions include year fixed effects and report standard errors clustered by district.

Table 9: Changes in misallocation in services following the repeal of ULCRA, 2000-2005
Columns: (1) Baseline estimation with extended controls; (2) Using basic set of control variables only; (3) Using Hsieh-Klenow metric; (4) (1) weighted by logged initial employment; (5) (1) considering OLS TFP with 3 factors; (6) (5) weighted by logged initial employment

A. Change in misallocation for land and buildings
ULCRA repeal 0.042+ 0.047+ n.a. 0.041+ 0.028 0.026
(0.020) (0.026) (0.020) (0.022) (0.023)
Adjusted R-squared 0.365 0.360 0.369 0.264 0.262

B. Change in misallocation for value added
ULCRA repeal 0.047 0.041 0.036 0.039 0.025 0.017
(0.057) (0.055) (0.026) (0.056) (0.065) (0.063)
Adjusted R-squared 0.483 0.483 0.113 0.483 0.454 0.457

Notes: Estimations quantify the change in misallocation levels surrounding the repeal of the Urban Land (Ceiling and Regulation) Act.
Estimations are cross-sectional regressions that include 252 Indian districts. The outcome variable in Panel A is the change in land and building misallocation from 2000 to 2005. The outcome variable in Panel B considers misallocation of value added. The primary explanatory variable is a (0,1) indicator variable for a state that repeals ULCRA by 2003. Estimations control for the initial value of the studied misallocation. Basic controls further include 12 initial traits of districts: log population density, log population, log share of urban population, log built-up area, the log share of built-up area, percent graduates, an infrastructure composite index, log minimum travel time to the 10 largest cities, a measure of local age profiles, the share of population with access to banking, the male-female sex ratio, and the share of district population in scheduled castes and tribes. The infrastructure composite index considers the population share with telecom access, the share with power access, the share of villages with paved roads, and the percent share with safe water. Extended controls further include log distance to national highway, log distance to state highway, log distance to railroads, and log distance to the closest metropolitan area. Regressions are unweighted and report standard errors clustered by state.

Table 10: Changes in misallocation in services associated with stamp duty changes, 2000-2003
Columns: (1) Baseline estimation with extended controls; (2) Using basic set of control variables only; (3) Using Hsieh-Klenow metric; (4) (1) weighted by logged initial employment; (5) (1) considering OLS TFP with 3 factors; (6) (5) weighted by logged initial employment

A. Misallocation for land and buildings
Stamp duty -0.007+++ -0.007+++ n.a. -0.006+++ -0.009+++ -0.009+++
(0.002) (0.002) (0.002) (0.002) (0.002)
Observations 308 308 308 308 308
Adjusted R-squared 0.370 0.368 0.373 0.270 0.270

B. Misallocation for value added
Stamp duty -0.004 -0.004 0.002 -0.004 -0.004 -0.004
(0.003) (0.003) (0.002) (0.003) (0.003) (0.003)
Observations 308 308 308 308 308 308
Adjusted R-squared 0.454 0.451 0.091 0.452 0.415 0.416

Notes: Estimations quantify the change in misallocation levels surrounding adjustments in state-level stamp duties. The outcome variable in Panel A is land and building misallocation from 2000 to 2003. The outcome variable in Panel B considers misallocation of value added. The primary explanatory variable is the state-level stamp duty imposed on land transactions. Estimations include state and time fixed effects. Basic and extended controls are the same as defined in Table 4. Regressions are unweighted and report standard errors clustered by state.

Table 11: Labor productivity in services as a function of factor misallocation
Columns: (1) Baseline disaggregated estimation; (2) Considering aggregate fixed assets; (3) Including district fixed effects; (4) Considering OLS TFP with 3 factors; (5) Output per worker to compute misallocation; (6) (5) including district fixed effects

A. Unweighted estimations
Land and building misallocation 0.289+++ 0.296+++ 0.128 0.294+++ 0.358+++
(0.082) (0.108) (0.078) (0.053) (0.069)
Other fixed assets misallocation 0.096++ -0.029 0.066++ 0.136+++ 0.061
(0.039) (0.062) (0.032) (0.030) (0.044)
Employment misallocation -0.879+++ -0.857+++ -0.212 -0.674+++ -0.414+++ -0.234+++
(0.120) (0.133) (0.187) (0.110) (0.054) (0.053)
Total fixed assets misallocation 0.312+++
(0.065)
Observations 763 763 763 763 763 763
Adjusted R-squared 0.457 0.458 0.687 0.439 0.540 0.743

B.
Weighting districts by initial employment levels
Land and building misallocation 0.433+++ 0.241+ 0.259++ 0.416+++ 0.373+++
(0.119) (0.135) (0.114) (0.077) (0.093)
Other fixed assets misallocation 0.083 0.028 0.050 0.112+++ 0.129++
(0.050) (0.091) (0.050) (0.040) (0.063)
Employment misallocation -1.021+++ -1.087+++ -0.067 -0.790+++ -0.563+++ -0.196+++
(0.108) (0.113) (0.257) (0.102) (0.074) (0.069)
Total fixed assets misallocation 0.403+++
(0.074)
Observations 762 762 762 762 762 762
Adjusted R-squared 0.528 0.527 0.757 0.514 0.622 0.820

Notes: See Table 8.

Table 12: Changes in output misallocation 1989-2000 and policy/institutional changes 1985-1997
Columns: (1) Baseline estimation; (2) Balanced panel; (3) State reforms only; (4) State reforms with industry FE; (5) Industry reforms only; (6) Industry reforms with state FE; (7) Organised sector; (8) Unorganised sector

Initial misallocation -0.499+++ -0.475+++ -0.493+++ -0.813+++ -0.496+++ -0.491+++ -0.779+++ -0.748+++
(0.019) (0.021) (0.019) (0.021) (0.019) (0.019) (0.048) (0.046)
Tariff changes -0.002+++ -0.002+++ -0.002+++ -0.002+++ -0.000 -0.000
(0.000) (0.000) (0.000) (0.000) (0.000) (0.000)
Delicencing -0.030 -0.073++ -0.029 -0.027 -0.018 -0.025
(0.030) (0.033) (0.030) (0.030) (0.036) (0.036)
Changes in labour strictness 0.054++ 0.057+ 0.056++ 0.044++ -0.028 -0.038
(0.026) (0.030) (0.027) (0.022) (0.030) (0.030)
Changes in FDI restrictiveness 0.065++ 0.051 0.067++ 0.064++ 0.105++ 0.110+++
(0.033) (0.036) (0.033) (0.032) (0.042) (0.041)
Land reforms -0.088+++ -0.092+++ -0.089+++ -0.083+++ -0.016 -0.012
(0.016) (0.018) (0.016) (0.014) (0.018) (0.018)
Observations 2,585 2,086 2,585 2,585 2,585 2,585 2,556 2,367
R-squared 0.234 0.223 0.223 0.479 0.222 0.250 0.410 0.387

Standard errors in parentheses. +++ p<0.01, ++ p<0.05, + p<0.1

Table 13: Changes in output misallocation 1989-2010 and policy/institutional changes 1985-1997
Columns: (1) Baseline estimation; (2) Balanced panel; (3) State reforms only; (4) State reforms with industry FE; (5) Industry reforms only; (6) Industry reforms with state FE; (7) Organised sector; (8) Unorganised sector

Initial misallocation -0.628+++ -0.645+++ -0.617+++ -0.926+++ -0.627+++ -0.606+++ -0.911+++ -0.876+++
(0.020) (0.022) (0.020) (0.023) (0.020) (0.021) (0.049) (0.048)
Tariff changes -0.000 0.000 -0.000 -0.000 -0.001 -0.000
(0.000) (0.000) (0.000) (0.000) (0.001) (0.000)
Delicencing -0.113+++ -0.116+++ -0.113+++ -0.104+++ -0.025 0.003
(0.033) (0.035) (0.033) (0.032) (0.043) (0.039)
Changes in labour strictness 0.067++ 0.055+ 0.067++ 0.058++ -0.018 -0.035
(0.029) (0.031) (0.029) (0.024) (0.032) (0.031)
Changes in FDI restrictiveness -0.184+++ -0.115+++ -0.184+++ -0.178+++ 0.073 0.102++
(0.035) (0.038) (0.036) (0.035) (0.049) (0.046)
Land reforms -0.057+++ -0.045++ -0.057+++ -0.058+++ 0.004 0.005
(0.018) (0.019) (0.018) (0.015) (0.020) (0.019)
Observations 2,569 2,083 2,569 2,569 2,569 2,569 2,527 2,344
R-squared 0.283 0.298 0.274 0.514 0.280 0.307 0.421 0.407

Standard errors in parentheses. +++ p<0.01, ++ p<0.05, + p<0.1

Table 14: Value added misallocation as a function of factor misallocation
Columns: (1) Baseline estimation; (2) Considering aggregate fixed assets; (3) Including state fixed effects; (4) Including industry fixed effects; (5) Including both state and industry fixed effects; (6) Considering output misallocation as DV; (7) Organised; (8) Unorganised

A.
Unweighted estimations
Employment misallocation 0.602+++ 0.641+++ 0.602+++ 0.541+++ 0.556+++ 0.575+++ 0.536+++ 0.841+++
(0.063) (0.062) (0.062) (0.064) (0.065) (0.057) (0.064) (0.076)
Land and building misallocation 0.158+++ 0.160+++ 0.153+++ 0.160+++ 0.199+++ 0.144+++ 0.095+++
(0.049) (0.049) (0.049) (0.049) (0.045) (0.047) (0.036)
Other fixed assets misallocation 0.162+++ 0.157+++ 0.145+++ 0.131+++ 0.227+++ 0.112+++ 0.325+++
(0.038) (0.037) (0.037) (0.036) (0.037) (0.039) (0.035)
Total fixed assets misallocation 0.276+++
(0.045)
Observations 2,648 2,648 2,648 2,648 2,648 2,648 2,609 2,405
R-squared 0.639 0.634 0.647 0.691 0.695 0.757 0.391 0.470

B. Weighting districts by initial employment levels
Employment misallocation 0.389+++ 0.493+++ 0.396+++ 0.335+++ 0.337+++ 0.370+++ 0.225++ 0.677+++
(0.110) (0.116) (0.107) (0.084) (0.094) (0.095) (0.092) (0.113)
Land and building misallocation 0.351+++ 0.328+++ 0.253+++ 0.273+++ 0.320+++ 0.256+++ 0.267+++
(0.076) (0.085) (0.073) (0.077) (0.072) (0.060) (0.055)
Other fixed assets misallocation 0.150+++ 0.165+++ 0.196+++ 0.203+++ 0.250+++ 0.171+++ 0.304+++
(0.041) (0.051) (0.054) (0.070) (0.046) (0.050) (0.043)
Total fixed assets misallocation 0.373+++
(0.078)
Observations 2,648 2,648 2,648 2,648 2,648 2,648 2,609 2,405
R-squared 0.721 0.707 0.727 0.770 0.768 0.823 0.311 0.508

Notes: Estimations quantify the relationship between value added misallocation levels and that of factor inputs. Observations are industry-state-year values. Regressions include year fixed effects, are unweighted, and report standard errors clustered by district. +++ p<0.01, ++ p<0.05, + p<0.1

Table 15: Labour productivity as a function of factor misallocation
Columns: (1) Baseline estimation; (2) Considering aggregate fixed assets; (3) Including state fixed effects; (4) Including industry fixed effects; (5) Including both state and industry fixed effects; (6) Considering output misallocation as DV; (7) Organised; (8) Unorganised

A.
Unweighted estimations
Employment misallocation -0.319+++ -0.301+++ -0.374+++ -0.262+++ -0.287+++ -0.471+++ -0.421+++ -1.993+++
(0.084) (0.083) (0.084) (0.060) (0.069) (0.064) (0.078) (0.186)
Land and building misallocation -0.173++ -0.193++ -0.146++ -0.180+++ -0.244+++ -0.157++ -0.449+++
(0.081) (0.077) (0.059) (0.061) (0.059) (0.073) (0.098)
Other fixed assets misallocation 0.147++ 0.178+++ -0.065 -0.024 -0.084+ -0.023 0.251+++
(0.067) (0.063) (0.045) (0.046) (0.046) (0.061) (0.081)
Total fixed assets misallocation -0.027
(0.058)
Observations 2,648 2,648 2,648 2,648 2,648 2,648 2,609 2,405
R-squared 0.151 0.148 0.233 0.549 0.508 0.533 0.061 0.143

B. Weighting districts by initial employment levels
Employment misallocation -0.338 -0.525 -0.393++ -0.124 -0.329+ -0.253 -0.567+++ -2.225+++
(0.264) (0.396) (0.159) (0.227) (0.185) (0.253) (0.178) (0.338)
Land and building misallocation -0.598 -0.334 -0.170 0.012 -0.508 -0.737 -1.033+++
(0.478) (0.240) (0.140) (0.160) (0.357) (0.558) (0.175)
Other fixed assets misallocation 0.595+ 0.410++ -0.103 -0.017 0.356 0.699 0.400+++
(0.342) (0.173) (0.123) (0.132) (0.262) (0.518) (0.109)
Total fixed assets misallocation 0.185
(0.121)
Observations 2,648 2,648 2,648 2,648 2,648 2,648 2,609 2,405
R-squared 0.193 0.157 0.377 0.658 0.647 0.136 0.091 0.220

Notes: Estimations quantify the relationship between output per worker levels and that of factor inputs. Observations are industry-state-year values. Regressions include year fixed effects, are unweighted, and report standard errors clustered by district. +++ p<0.01, ++ p<0.05, + p<0.1

Table 16: Labour productivity as a function of factor misallocation, restricted regressions
Columns: (1) Baseline estimation; (2) Considering aggregate fixed assets; (3) Including state fixed effects; (4) Including industry fixed effects; (5) Including both state and industry fixed effects; (6) Considering output misallocation as DV; (7) Organised; (8) Unorganised

A.
Unweighted estimations
Employment misallocation -0.279+++ -0.326+++ -0.280+++ -0.293+++ -0.493+++ -0.427+++ -1.795+++
(0.081) (0.080) (0.057) (0.066) (0.061) (0.076) (0.168)
Land and building misallocation -0.051 -0.045 -0.197+++ -0.200+++ -0.312+++ -0.175+++ -0.280+++
(0.064) (0.062) (0.047) (0.052) (0.051) (0.058) (0.090)
Value added misallocation -0.280+++
(0.047)
Observations 2,648 2,648 2,648 2,648 2,648 2,648 2,609 2,405
R-squared 0.149 0.150 0.230 0.548 0.508 0.532 0.061 0.137

B. Weighting districts by initial employment levels
Employment misallocation -0.288 -0.350++ -0.139 -0.331+ -0.224 -0.462++ -1.902+++
(0.267) (0.172) (0.214) (0.179) (0.254) (0.216) (0.317)
Land and building misallocation -0.089 0.025 -0.258+ -0.003 -0.204 -0.125 -0.741+++
(0.261) (0.182) (0.142) (0.144) (0.197) (0.193) (0.178)
Value added misallocation -0.359
(0.328)
Observations 2,648 2,648 2,648 2,648 2,648 2,648 2,609 2,405
R-squared 0.153 0.152 0.359 0.657 0.647 0.120 0.033 0.205

Notes: Estimations quantify the relationship between output per worker levels and that of factor inputs. Observations are industry-state-year values. Regressions include year fixed effects, are unweighted, and report standard errors clustered by district.
+++ p<0.01, ++ p<0.05, + p<0.1

Table 17: Value added misallocation as a function of factor misallocation, TSLS regressions
Columns: (1) Baseline disaggregated estimation; (2) Considering aggregate fixed assets; (3) Including district fixed effects; (4) Without industry aggregation step; (5) Using OLS TFP metrics; (6) Considering output misallocation as DV

Land and building misallocation 0.467++ 0.556+++ 1.535 1.141+ 0.031
(0.182) (0.163) (1.862) (0.658) (0.202)
Other fixed assets misallocation 0.011 0.001 2.387 -0.021 0.024
(0.015) (0.001) (4.209) (0.273) (0.028)
Employment misallocation 0.586++ 0.458++ 0.838 -1.373 -0.117 0.667+++
(0.252) (0.196) (0.534) (2.171) (0.451) (0.243)
Total fixed assets misallocation 0.642+++
(0.136)
Observations 1451 1451 1451 1451 1455 1451
Overid Stat (Hansen J Stat) 0.000 0.000 0.000 0.000 0.000 0.000
Overid P value
First stage Stat 0.009 28.162 4.269 32.538 1.852 0.009

Notes: Estimations quantify the relationship between value added misallocation levels and that of factor inputs. Observations are district-year values.

Table 18: Output per worker as a function of factor misallocation, TSLS regressions
Columns: (1) Baseline disaggregated estimation; (2) Including district fixed effects; (3) Baseline disaggregated estimation; (4) Including district fixed effects; (5) Baseline disaggregated estimation; (6) Including district fixed effects; (7) (2) weighted by initial employment; (8) (1) Without industry aggregation step

A. Unweighted estimations
Value added misallocation -0.942+++ -0.463+ -1.141++ -0.546+++
(0.250) (0.269) (0.513) (0.178)
Land and building misallocation -1.184+++ -0.421
(0.281) (0.267)
Employment misallocation -1.102+++ -0.700
(0.398) (0.676)
Observations 1451 1451 1451 1451 1451 1451 1442 1451
First stage Stat 215.3 44.2 209.5 51.7 273.1 13.2 38.5 232.8

Notes: Estimations quantify the relationship between value added misallocation levels and that of factor inputs. Observations are district-year values.
TSLS regressions include year fixed effects, are unweighted, and report standard errors clustered by district. The instruments are the lagged values of factor misallocation.

Table 19: Output per worker as a function of misallocation, alternative TSLS regressions
Columns: (1) Baseline disaggregated estimation; (2) (1) Without industry aggregation step; (3) Baseline disaggregated estimation; (4) Including district fixed effects; (5) (1) Without industry aggregation step; (6) Including district fixed effects; (7) (2) weighted by initial employment; (8) (1) Without industry aggregation step

Output misallocation 9.956 -3.826+ -7.299+ -1.552+ -2.649+++ -0.905+++ -0.376 -0.586+++
(10.715) (1.954) (4.309) (0.902) (0.847) (0.247) (0.259) (0.180)
Instruments
Industry composition Y Y Y Y Y
Lagged industry composition Y Y Y
Lagged misallocation Y Y Y
Observations 1816 1816 1455 1455 1455 1451 1442 8.682
First stage Stat 1.13 4.25 2.415 3.487 9.57 14.697 3.284 3.284
Overid P value 0.000 0.070 0.003

Notes: Estimations quantify the relationship between output per worker and output misallocation. Observations are district-year values. TSLS regressions include year fixed effects, are unweighted, and report standard errors clustered by district.
Table 20: Simulation results on 2010 data

Effects of a change in
misallocation of:     Correlation 1  Correlation 2  Correlation 3  Correlation 4

Panel A: All establishments
Employment              0.00          -0.01          -0.19          -0.23
Land and buildings     -0.58          -0.47          -0.58          -1.04
Other fixed assets     -0.41          -0.24          -0.30          -0.66

Panel B: More than 5 establishments per industry and districts
Employment             -0.14          -0.22          -0.06          -0.02
Land and buildings     -0.46          -0.21          -0.69          -0.49
Other fixed assets     -0.44          -0.38          -0.31          -0.39

Panel C: More than 10 establishments per industry and districts
Employment             -0.08          -0.24          -0.08          -0.19
Land and buildings     -0.36          -0.44          -0.56          -0.56
Other fixed assets      0.28          -0.35          -0.18          -0.50
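
The average simulation coefficients quoted in the discussion of Table 20 (-0.54 for land and buildings, -0.12 for employment, and -0.32 for other fixed assets) can be recovered from the table itself. The short Python sketch below assumes that these averages are simple unweighted means over the three panels and the four correlation scenarios; this assumption is not stated explicitly in the text, but it reproduces the reported figures. The dictionary layout and the helper name `average_coefficient` are illustrative choices and not part of the original analysis.

```python
# Coefficients transcribed from Table 20 (simulations on 2010 data).
# Each list holds Correlation 1 to Correlation 4, in that order.
TABLE_20 = {
    "A": {
        "employment": [0.00, -0.01, -0.19, -0.23],
        "land_and_buildings": [-0.58, -0.47, -0.58, -1.04],
        "other_fixed_assets": [-0.41, -0.24, -0.30, -0.66],
    },
    "B": {
        "employment": [-0.14, -0.22, -0.06, -0.02],
        "land_and_buildings": [-0.46, -0.21, -0.69, -0.49],
        "other_fixed_assets": [-0.44, -0.38, -0.31, -0.39],
    },
    "C": {
        "employment": [-0.08, -0.24, -0.08, -0.19],
        "land_and_buildings": [-0.36, -0.44, -0.56, -0.56],
        "other_fixed_assets": [0.28, -0.35, -0.18, -0.50],
    },
}


def average_coefficient(table, factor):
    """Unweighted mean over all panels and scenarios (hypothetical helper)."""
    values = [v for panel in table.values() for v in panel[factor]]
    return sum(values) / len(values)


# Averages across the 12 simulations for each factor, rounded as in the text.
averages = {f: round(average_coefficient(TABLE_20, f), 2) for f in TABLE_20["A"]}

# Panel C, Correlation 2 (within-district correlations): the case described
# in the text as closest to the data.
within_district_panel_c = TABLE_20["C"]["land_and_buildings"][1]

print(averages)
print(within_district_panel_c)
```

Under this assumption, the sketch yields -0.12 for employment, -0.54 for land and buildings, and -0.32 for other fixed assets, and the within-district entry of panel C for land and buildings is -0.44, matching the values discussed in the text.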