ISSN 0258-6770 (PRINT) ISSN 1564-698X (ONLINE)

THE WORLD BANK ECONOMIC REVIEW

Downloaded from https://academic.oup.com/wber/issue/31/2 by Joint Bank-Fund Library user on 09 August 2019

Volume 31 • 2017 • Number 2

Property Rights for Fishing Cooperatives: How (and How Well) Do They Work?
Octavio Aburto-Oropeza, Heather M. Leslie, Austen Mack-Crane, Sriniketh Nagavarapu, Sheila M.W. Reddy, and Leila Sievanen

When Winners Feel Like Losers: Evidence from an Energy Subsidy Reform
Oscar Calvo-Gonzalez, Barbara Cunha, and Riccardo Trezzi

Does Input-Trade Liberalization Affect Firms’ Foreign Technology Choice?
Maria Bas and Antoine Berthou

Long-term Gains from Electrification in Rural India
Dominique van de Walle, Martin Ravallion, Vibhuti Mendiratta, and Gayatri Koolwal

The Changing Structure of Africa’s Economies
Xinshen Diao, Kenneth Harttgen, and Margaret McMillan

Does Child Sponsorship Pay off in Adulthood? An International Study of Impacts on Income and Wealth
Bruce Wydick, Paul Glewwe, and Laine Rutledge

Political Connections and Tariff Evasion: Evidence from Tunisia
Bob Rijkers, Leila Baghdadi, and Gael Raballand

Pension Coverage for Parents and Educational Investment in Children: Evidence from Urban China
Ren Mu and Yang Du

Prices, Engel Curves, and Time-Space Deflation: Impacts on Poverty and Inequality in Vietnam
John Gibson, Trinh Le, and Bonggeun Kim

Willing but Unable? Short-term Experimental Evidence on Parent Empowerment and School Quality
Elizabeth Beasley and Elise Huillery

Providing Policy Makers with Timely Advice: The Timeliness-Rigor Trade-off
Clive Bell and Lyn Squire

On the Effects of Enforcement on Illegal Markets: Evidence from a Quasi-Experiment in Colombia
Daniel Mejía, Pascual Restrepo, and Sandra V. Rozo

academic.oup.com/wber

THE WORLD BANK ECONOMIC REVIEW

EDITORS
Eric Edmonds, Dartmouth College
Nina Pavcnik, Dartmouth College

CO-EDITORS
Francisco H. G. Ferreira, World Bank
Karla Hoff, World Bank
Leora Klapper, World Bank
Aart C. Kraay, World Bank
David J. McKenzie, World Bank
Luis Servén, World Bank

The editorial team would like to thank former editor Andrew Foster for overseeing the review process of the articles in this issue.

ASSISTANT TO THE EDITOR
Marja Kuiper

EDITORIAL BOARD
Lori Beaman, Northwestern University, USA
Haroon Bhorat, University of Cape Town, South Africa
Nicholas Bloom, Stanford University, USA
Asli Demirgüç-Kunt, World Bank
Ashwini Deshpande, Delhi University, India
Pascaline Dupas, Stanford University, USA
Eduardo Engel, University of Chile, Chile
Claudio Ferraz, Pontifícia Universidade Católica do Rio de Janeiro (PUC-Rio), Brazil
Frederico Finan, University of California, Berkeley, USA
Andrew Foster, Brown University, USA
Beata Javorcik, Oxford University, UK
Aprajit Mahajan, University of California, Berkeley, USA
Stelios Michalopoulos, Brown University, USA
Ahmed M. Mobarak, Yale University, USA
Yaw Nyarko, New York University, USA
Albert Park, Hong Kong University of Science and Technology (HKUST), Hong Kong
Sandip Sukhtankar, Virginia University, USA
Romain Wacziarg, University of California, Los Angeles, USA
Christopher Woodruff, Oxford University, UK

The World Bank Economic Review is a professional journal used for the dissemination of research in development economics broadly relevant to the development profession and to the World Bank in pursuing its development mandate. It is directed to an international readership among economists and social scientists in government, business, international agencies, universities, and development research institutions.
The Review seeks to provide the most current and best research in the field of quantitative development policy analysis, emphasizing policy relevance and operational aspects of economics, rather than primarily theoretical and methodological issues. Consistency with World Bank policy plays no role in the selection of articles.

The Review is managed by one or two independent editors selected for their academic excellence in the field of development economics and policy. The editors are assisted by an editorial board composed in equal parts of scholars internal and external to the World Bank. World Bank staff and outside researchers are equally invited to submit their research papers to the Review.

For more information, please visit the Web sites of the Economic Review at Oxford University Press at academic.oup.com/wber and at the World Bank at www.worldbank.org/research/journals.

Instructions for authors wishing to submit articles are available online at academic.oup.com/wber. Please direct all editorial correspondence to the Editor at wber@worldbank.org.

SUBSCRIPTIONS: A subscription to The World Bank Economic Review (ISSN 0258-6770) comprises 3 issues. Prices include postage; for subscribers outside the Americas, issues are sent air freight.

Annual Subscription Rate (Volume 31, 3 Issues, 2017):
Institutions. Print edition and site-wide online access: £224/$337/€337; print edition only: £205/$309/€309; site-wide online access only: £166/$252/€251.
Corporate. Print edition and site-wide online access: £336/$504/€504; print edition only: £309/$462/€462; site-wide online access only: £250/$373/€374.
Personal. Print edition and individual online access: £53/$81/€81.
US$ rate applies to US & Canada, Euros€ applies to Europe, UK£ applies to UK and Rest of World. There may be other subscription rates available; for a complete listing, see https://academic.oup.com/wber/subscribe.
Readers with mailing addresses in non-OECD countries and in socialist economies in transition are eligible to receive complimentary subscriptions on request by writing to the UK address below.

Full prepayment in the correct currency is required for all orders. Orders are regarded as firm, and payments are not refundable. Subscriptions are accepted and entered on a complete volume basis. Claims cannot be considered more than four months after publication or date of order, whichever is later. All subscriptions in Canada are subject to GST. Subscriptions in the EU may be subject to European VAT. If registered, please supply details to avoid unnecessary charges. For subscriptions that include online versions, a proportion of the subscription price may be subject to UK VAT. Personal rates are applicable only when a subscription is for individual use and are not available if delivery is made to a corporate address.

BACK ISSUES: The current year and two previous years’ issues are available from Oxford University Press. Previous volumes can be obtained from the Periodicals Service Company, 11 Main Street, Germantown, NY 12526, USA. E-mail: psc@periodicals.com. Tel: (518) 537-4700. Fax: (518) 537-5899.

CONTACT INFORMATION: Journals Customer Service Department, Oxford University Press, Great Clarendon Street, Oxford OX2 6DP, UK. E-mail: jnls.cust.serv@oup.com. Tel: +44 (0)1865 353907. Fax: +44 (0)1865 353485.
In the Americas, please contact: Journals Customer Service Department, Oxford University Press, 2001 Evans Road, Cary, NC 27513, USA. E-mail: jnlorders@oup.com. Tel: (800) 852-7323 (toll-free in USA/Canada) or (919) 677-0977. Fax: (919) 677-1714.
In Japan, please contact: Journals Customer Service Department, Oxford University Press, Tokyo, 4-5-10-8F Shiba, Minato-ku, Tokyo, 108-8386, Japan. E-mail: custserv.jp@oup.com. Tel: +81 3 5444 5858.
Fax: +81 3 3454 2929.

POSTAL INFORMATION: The World Bank Economic Review (ISSN 0258-6770) is published three times a year, in February, June, and October, by Oxford University Press for the International Bank for Reconstruction and Development/THE WORLD BANK. Send address changes to The World Bank Economic Review, Journals Customer Service Department, Oxford University Press, 2001 Evans Road, Cary, NC 27513-2009. Periodicals postage paid at Cary, NC and at additional mailing offices. Communications regarding original articles and editorial management should be addressed to The Editor, The World Bank Economic Review, The World Bank, 3, Chemin Louis Dunant, CP66 1211 Geneva 20, Switzerland.

ENVIRONMENTAL AND ETHICAL POLICIES: Oxford Journals, a division of Oxford University Press, is committed to working with the global community to bring the highest quality research to the widest possible audience. Oxford Journals will protect the environment by implementing environmentally friendly policies and practices wherever possible. Please see https://academic.oup.com/journals/pages/about_us/ethical_policies for further information on environmental and ethical policies.

DIGITAL OBJECT IDENTIFIERS: For information on DOIs and to resolve them, please visit www.doi.org.

PERMISSIONS: For information on how to request permissions to reproduce articles or information from this journal, please visit https://academic.oup.com/journals/pages/access_purchase/rights_and_permissions.

ADVERTISING: Advertising, inserts, and artwork enquiries should be addressed to Advertising and Special Sales, Oxford Journals, Oxford University Press, Great Clarendon Street, Oxford, OX2 6DP, UK. Tel: +44 (0)1865 354767; Fax: +44 (0)1865 353774; E-mail: jnlsadvertising@oup.com.
DISCLAIMER: Statements of fact and opinion in the articles in The World Bank Economic Review are those of the respective authors and contributors and not of the International Bank for Reconstruction and Development/THE WORLD BANK or Oxford University Press. Neither Oxford University Press nor the International Bank for Reconstruction and Development/THE WORLD BANK make any representation, express or implied, in respect of the accuracy of the material in this journal and cannot accept any legal responsibility or liability for any errors or omissions that may be made. The reader should make her or his own evaluation as to the appropriateness or otherwise of any experimental technique described.

PAPER USED: The World Bank Economic Review is printed on acid-free paper that meets the minimum requirements of ANSI Standard Z39.48-1984 (Permanence of Paper).

INDEXING AND ABSTRACTING: The World Bank Economic Review is indexed and/or abstracted by CAB Abstracts, Current Contents/Social and Behavioral Sciences, Journal of Economic Literature/EconLit, PAIS International, RePEc (Research in Economic Papers), and Social Services Citation Index.

COPYRIGHT © 2017 The International Bank for Reconstruction and Development/THE WORLD BANK
All rights reserved; no part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise without prior written permission of the publisher or a license permitting restricted copying issued in the UK by the Copyright Licensing Agency Ltd, 90 Tottenham Court Road, London W1P 9HE, or in the USA by the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923.

Typeset by Cenveo Publisher Services, Bangalore, India; Printed by The Sheridan Press.
The World Bank Economic Review, 31(2), 2017, 295–328
doi: 10.1093/wber/lhw001
Advance Access Publication Date: March 10, 2016
Article

Downloaded from https://academic.oup.com/wber/article-abstract/31/2/295/2897307 by Sectoral Library Rm MC-C3-220 user on 08 August 2019

Property Rights for Fishing Cooperatives: How (and How Well) Do They Work?

Octavio Aburto-Oropeza, Heather M. Leslie, Austen Mack-Crane, Sriniketh Nagavarapu, Sheila M.W. Reddy, and Leila Sievanen

Abstract

Devolving property rights to local institutions has emerged as a compelling management strategy for natural resource management in developing countries. The use of property rights among fishing cooperatives operating in Mexico’s Gulf of California provides a compelling setting for theoretical and empirical analysis. A dynamic theoretical model demonstrates how fishing cooperatives’ management choices are shaped by the presence of property rights, the mobility of resources, and predictable environmental fluctuations. More aggressive management comes in the form of the cooperative leadership paying lower prices to cooperative members for their catch, as lower prices disincentivize fishing effort. The model’s implications are empirically tested using three years of daily logbook data on prices and catches for three cooperatives from the Gulf of California. One cooperative enjoys property rights while the other two do not. There is empirical evidence in support of the model: compared to the other cooperatives, the cooperative with strong property rights pays members a lower price, pays especially lower prices for less mobile species, and decreases prices when environmental fluctuations cause population growth rates to fall. The results from this case study demonstrate the viability of cooperative management of resources but also point toward quantitatively important limitations created by the mismatch between the scale of a property right and the scale of a resource.
JEL classification: O13, Q20, Q22, Q50, Q56, Q57

Octavio Aburto-Oropeza is an assistant professor at Scripps Institution of Oceanography, University of California-San Diego (maburto@ucsd.edu). Heather M. Leslie is director and Libra Associate Professor at Darling Marine Center, University of Maine (heather.leslie@maine.edu). Austen Mack-Crane is a research assistant at the Center on Social Dynamics and Policy, Brookings Institution (wmack-crane@brookings.edu). Sriniketh Nagavarapu (corresponding author) is a senior policy associate at Acumen, LLC (sri@acumenllc.com). Sheila M.W. Reddy is a senior scientist for sustainability at The Nature Conservancy (sreddy@tnc.org). Leila Sievanen is an associate scientist at California Ocean Science Trust (leila.sievanen@oceansciencetrust.org).

We appreciate outstanding research assistance from Gustavo Hinojosa Arango, Juan José Cota Nieto, Alexandra Sanchez, Alexander Lobert, Florencia Borrescio-Higa, Ashley Anderson, Steven Hagerty, and Katherine Wong. For invaluable advice, we thank Chris Costello, Robert Deacon, Andrew Foster, Vernon Henderson, Kaivan Munshi, and seminar participants at Brown’s Population Studies and Training Center, Brown’s Spatial Structures in the Social Sciences, the UC-Santa Barbara fisheries working group, the MIT/Harvard Environment and Development seminar, and the UC-Berkeley ARE seminar. We are grateful for financial support from the Institute at Brown for Environment and Society at Brown University and the US National Science Foundation Coupled Natural and Human Systems program (NSF Award GEO-11114964). Remaining errors are our own.

Author contributions: 1) Design of research: Leslie, Nagavarapu, Reddy; 2) Quantitative data: Aburto, Reddy; 3) Qualitative data: Leslie, Reddy, Sievanen; 4) Theoretical model: Mack-Crane, Nagavarapu; 5) Empirical analysis: Mack-Crane, Nagavarapu, Reddy; 6) Writing: Leslie, Mack-Crane, Nagavarapu, Reddy.
A supplemental appendix to this article is available at https://academic.oup.com/wber.

© The Author 2016. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

There is widespread concern for the health of global fisheries: a recent study estimated that 28–33% of fisheries are over-exploited and 7–13% are collapsed (Branch et al. 2011).¹ Since Gordon (1954) and Scott (1956), the static and dynamic externalities that lead to over-extraction have been well understood. However, policy makers and researchers have struggled with how best to ensure that these externalities are properly internalized, particularly in low-income countries. In this paper, we use a blend of theory and empirics to better understand whether, when, and how assigning property rights to fishing cooperatives can resolve the externality problem.²

Assigning cooperatives the exclusive right to fish a spatially delineated area, a specific case of a Territorial Use Right Fishery (TURF) (see Wilen et al. 2012), is an attractive concept. It potentially improves upon much more common solutions to the externality problem, such as catch shares, tradable quotas, and marine reserves (Hilborn et al. 2005; Deacon 2012). While these more common systems have been associated with improved ecological and economic outcomes (Hilborn et al. 2005; Costello et al. 2008; Lester et al. 2009), they require intensive monitoring and enforcement that may be relatively costly. In contrast, cooperatives may be able to leverage social ties to monitor and enforce spatial boundaries at relatively low cost.
Moreover, cooperatives can allocate fishing effort across space and time in a manner that avoids both closing a fishery and the race for especially profitable fishing areas or times that is inherent with transferable quotas. A cooperative’s ability to coordinate its members’ actions could also reduce races to fish within a TURF that is assigned to a less cohesive group of fishers (Costello and Kaffine 2010). Finally, a cooperative could facilitate greater provision of public goods that reduce members’ private fishing costs, such as information on the best fishing locations or shared equipment.

Despite these advantages, there are still two major challenges to the effective use of property rights by fishing cooperatives, and empirical evidence on these challenges is scarce. First, the scale of the property right may not match the scale of the resource, thereby giving a cooperative much less incentive and ability to manage its exclusive rights (Ostrom 1990; White and Costello 2011).³ The “scale of the resource” refers to the degree of geographic mobility of the organisms targeted by fishers. Second, and less often discussed in the literature, cooperative management may not adapt effectively in the face of environmental variability. For instance, cooperatives may need to ensure members some minimum level of well-being in order to retain members; this may hinder a cooperative’s ability to dramatically cut fishing effort when environmental conditions negatively impact fish populations.
In this paper, we develop a dynamic model of cooperative decision-making and use rich data from Mexico’s Gulf of California fisheries to examine how fishing cooperatives exercise property rights, with a focus on three issues: 1) Whether cooperatives with property rights manage a resource differently from those without; 2) How such differences depend on the scale of the resource; and 3) How such differences respond to predictable environmental fluctuations associated with ENSO (El Niño/Southern Oscillation) events. Mexico is a natural setting for the analysis. The country’s marine ecosystems are rich in biodiversity, which provides an opportunity to examine how fishing cooperatives shift behavior when fishing on species that vary in key traits, such as mobility. Moreover, ENSO events have important impacts on fisheries, with the direction and magnitude of these impacts differing among species.

¹ “Over-exploited” refers to stocks less than half of maximum sustainable yield (MSY), while “collapsed” is defined as stocks less than one-fifth of MSY.
² Following Deacon (2012), we define fishing cooperatives as “an association of harvesters that collectively holds rights to control some or all of its members’ fishing activities.” Cooperatives are quite common; for example, Deacon (2012) points to at least 400 cooperatives in Bangladesh and over 12,000 in India.
³ A similar problem arises if a local user group does not have control over a species that is ecologically connected to the resource it controls. For instance, fishermen outside the group may fish species that are preyed on by species controlled by the group.

We begin our analysis with a dynamic model of cooperative decision-making. Cooperative leadership can engage in more aggressive management by lowering the price that is paid to cooperative members for a specific species and, therefore, disincentivizing fishing of that species. The model yields three testable implications for how cooperative price and resulting catch change in response to external factors. Specifically, compared to other cooperatives, a cooperative with stronger property rights manages effort more aggressively, decreases the relative aggressiveness of management if a species is highly mobile, and restricts effort more when environmental forcing (e.g., ENSO events) limits the population growth rate of a species. Throughout, by “strong property rights” we mean the ability to exclude noncooperative fishers from the cooperative’s established fishing grounds.

We test these implications using daily logbook data from three cooperatives in the Gulf of California region, in northwest Mexico. One cooperative, operating on the Pacific coast, retains an exclusive concession for some species and is able to exclude outside fishermen for all other species (Cota-Nieto 2010; McCay et al. 2014). The other two cooperatives are located close to La Paz, the state capital of Baja California Sur (B.C.S.), and compete with other cooperatives and noncooperative fishermen for fish (Basurto et al. 2013; Sievanen 2014). Analysis using the fishing team-level logbook data reveals that cooperative members respond to cooperatives’ chosen prices as posited by the model. Exploiting the fact that one cooperative has stronger property rights than the other two, we use the cooperative-level price and catch data to demonstrate empirical support for the model’s three implications. The difference in price and catch between the cooperative with strong property rights and the other two cooperatives is large, and the cooperative with property rights disincentivizes effort more aggressively than the other two when growth rates are likely to be small.
But the magnitudes of the estimates also indicate that the difference in management aggressiveness across the cooperatives shrinks in economically meaningful ways when resource scale is large and growth rates are high.

Given the small number of cooperatives in the analysis, one should be cautious in extrapolating these findings to other settings. Instead, we view our results as improving the general understanding of when and how cooperative-based property rights can be effective. Our work complements the rich literature on community-based resource management institutions. Ostrom (1990) reviews case studies of institutions and derives a set of principles that differentiate those that are successful. We examine the role of some key principles from the Ostrom framework, such as clearly defined boundaries and effective rule enforcement. However, rather than making binary assessments of “success,” we empirically quantify the influence of property rights on economic outcomes. Gutiérrez et al. (2011) consider case studies of fishing cooperatives in particular and find predictors of success, including the existence of quotas, enforcement institutions, long-term planning, and resource mobility. Recent economics literature examines the decision-making of villages or other local user groups regarding other resources (e.g., Edmonds [2002] and Foster and Rosenzweig [2003] on forests). In contrast to these studies, we examine the short-term dynamics of resource management, illuminating the mechanisms that institutions may use to achieve management goals. We do so in the context of fisheries, which are characterized by important spatial externalities and environmental fluctuations not relevant to some other natural resources.

The theoretical literature on optimal fisheries management strategies considers these challenges.
For instance, Costello and Kaffine (2010) and White and Costello (2011) examine the implications of spatial externalities in area-based property rights, driven by movement of species across large ranges. Reed (1975), Parma and Deriso (1990), Costello et al. (2001), and Carson et al. (2009) look at how management may respond to temporary or permanent environmental changes. Our paper examines similar issues but introduces an important complication arising from cooperative leaders’ need to ensure returns high enough to retain members. More importantly, our focus is on empirically testing our theoretical model and presenting quantitative evidence on cooperative decision-making.

Three important, recent papers examine fishing cooperatives empirically. Deacon et al. (2008) and Deacon et al. (2013) develop a model incorporating concerns specific to cooperatives and then empirically test this model. They examine the intraseasonal allocation of fishing effort across space, time, and fishers in a salmon fishery with one cooperative and independent fishers. Our work provides less detail on the location of fishing effort and instead focuses on the allocation of effort across time in the face of species-specific differences in mobility and large-scale environmental oscillations that cycle over several years. Ovando et al. (2013) use a survey of 67 cooperatives from around the world to examine what management tools cooperatives use and how this is shaped by differing economic, political, and ecological contexts.
We focus on a particular management instrument, the choice of what price to pay cooperative members, and we complement our empirical analysis with a detailed theoretical model that delivers clear predictions for how this instrument should respond to a variety of circumstances.

The paper proceeds as follows. Section I describes the setting of Mexico’s fisheries in more detail. Section II describes the data used in the empirical analysis. Section III develops the theoretical model and derives three testable implications. Section IV uses the model to develop empirical tests of these implications and presents empirical results from these tests. Section V concludes.

I. Mexico’s Gulf of California Fisheries

Bordering five Mexican states, the Gulf of California is one of the most biologically productive areas of the world’s oceans.⁴ While the region’s remarkable biodiversity has considerable conservation value, it also is of substantial social and economic importance. The states surrounding the Gulf contribute 71% of Mexico’s total fisheries volume and 57% of total value (OECD 2006). As in many parts of the world, the Gulf’s fleet is characterized by small-scale subsistence or commercial fishing on small two- or three-person boats. Small-scale fisheries are a major source of employment and income, as well as a safety net in times of economic or environmental uncertainty (Pauly 1997; Allison and Ellis 2001; Basurto and Coleman 2010). However, a number of commercially valuable species have declined in recent years due to several factors, including improved technology, population and income growth, and increased export opportunities (Sala et al. 2004; Saenz-Arroyo et al. 2005; Dong et al. 2004).

Several ecological factors make the Gulf of California an appropriate focus for our study.
The species targeted by small-scale fishers have diverse life histories, ranging from those with fairly high site fidelity (e.g., lobster [Acosta 1999]) to those that move extensively as larvae (e.g., shrimp [Calderon-Aguilera et al. 2003]) or adults (e.g., tuna [Schaefer et al. 2007]). This variation allows for an analysis of how cooperatives deal differently with species that vary in their mobility.

Moreover, the region’s terrestrial and marine ecosystems respond dramatically to ENSO (El Niño/Southern Oscillation) events, which occur every several years (Polis et al. 2002; Velarde et al. 2004). During El Niño years, ocean waters warm, upwelling slows, and rainfall increases, with important implications for fisheries species (Velarde et al. 2004; Aburto-Oropeza et al. 2007). While ocean productivity varies temporally (both with ENSO and other sources of climatic variability) and spatially throughout the Gulf region, we find remarkable coherence in the variability of mean concentration of chlorophyll a, a common proxy for marine primary productivity (Mann and Lazier 2005), in the vicinity of the three cooperatives for which we have logbook data (Leslie et al. 2015). ENSO’s significant role enables us to explicitly test the influence of periodic environmental shocks on cooperatives’ decision-making. ENSO may affect species through three channels: recruitment and growth of juveniles, growth of adults, and movement of adults. Here we focus on the recruitment channel.

⁴ In addition to the Gulf proper, here we also consider the Pacific Coast of B.C.S. as part of the “Gulf region,” in keeping with previous work as in COBI/TNC (2006).

Fishing cooperatives have had a long history in the Gulf of California, and Mexico more broadly, and continue to be a major factor in the fishing industry today.
Under the 1947 Fisheries Law, cooperatives had exclusive concessions to the eight most commercially valuable species and often had rights to bays, estuaries, or lagoons adjacent to their lands (DeWalt 2001; Young 2001). In addition to cooperatives, the fisheries law created two other classes of fishermen: permisionarios, who are private individuals or corporate entities with permits to catch, and sell to the open market, species for which cooperatives do not hold concessions; and pescadores libres, who have rights to fish within cooperatives’ concessions for subsistence only, but are also allowed to fish for permisionarios (Young 2001).

To encourage private investment in fisheries, the 1992 Fisheries Law took exclusive rights for the eight species away from the cooperatives and made it possible for permisionarios to fish and sell them (SEPESCA 1992; Ibarra 1996; Villa 1996; Ibarra et al. 1998; Ibarra et al. 2000; Young 2001). Consequently, in the present-day system, independent fishers (i.e., permisionarios) are able to fish most species and sell their catch as long as they are able to acquire permits to do so. The acquisition of these permits involves important costs, including the administrative burden of applying for a permit, interactions with government officials, travel to (often distant) administrative offices, and the financial cost of the permit itself.

Despite this change, fishing cooperatives continue to play an important role in these fisheries. Our review of the literature, field visits, and conversations with researchers at Centro para la Biodiversidad Marina y la Conservacion (CBMC) have revealed that cooperative membership entails a series of restrictions on behavior, a specific form of compensation, and a potentially attractive set of benefits.
In terms of restrictions on behavior, cooperative members are more constrained than those fishers who are not cooperative members. They are nominally bound to the rules of the cooperative, which determine how, when, and where to fish (see Reddy et al. 2013). A one-time membership payment and an agreement to sell only to the cooperative are also typical (J. J. Cota-Nieto, pers. com., 2014). While enforcement of these restrictions varies among cooperatives, social ties may aid in enforcement. Cooperative membership requirements vary, both contemporaneously and historically, but typically, cooperative members live in the community and are often sons of prior members (e.g., Petterson 1980).

Cooperative membership also entails a specific form of compensation. Cooperative leaders will negotiate with a buyer to supply a certain amount of product. They then set a price and quantity for that species and pay those fishers who return with product that price, which is some fraction of the market price (Reddy et al. 2013; J. J. Cota-Nieto, pers. com., 2014). Importantly, prices are used in combination with direct restrictions or quotas. Cooperative leaders have a sense of what price is required to fill a quota and can lower the price to avoid incentivizing fishing past a quota (G. Hinojosa-Arango, pers. com., 2014). In this sense, the price paid to cooperative members by the cooperative leaders is one form of controlling the effort of cooperative members. Cooperative leaders can use prices as a management tool, in addition to more direct restrictions, to help ensure a certain amount of fishing effort and, ultimately, catch.

Given that the price paid to cooperative members is below the market price, the cooperative accrues revenue that can be used to generate various benefits of cooperative membership.
This revenue is used to pay administrative costs that aid the cooperative as a whole, which include the salaries of cooperative officials, travel and legal expenses, and taxes (McGuire 1983). Benefits to members from these administrative efforts include access to fishing permits, gear, state subsidies, and shared resources for processing, marketing, and reporting catch (Petterson 1980; Basurto et al. 2013; McCay et al. 2014; Sievanen 2014). Access to permits is one of the primary reasons for joining a cooperative, according to La Paz area fishers (e.g., Sievanen 2014), and thus, those fishers who do not have the financial or social capital to acquire permits as individuals (as the permisionarios do) are more likely to join cooperatives. Revenue may be paid out in annual bonuses, which may be based on fishermen’s total annual catch, equal for all members, or determined by some other rule (McGuire 1983). Finally, in some cooperatives, members also enjoy income security through sources such as retirement benefits or credit (McCay et al. 2014; G. Hinojosa-Arango, pers. com., Feb. 2012).

In the theoretical model below, we model these benefits received by cooperative members in two ways: (i) a lump sum payment reflecting discounts on equipment (including boats and motors), credit, or annual bonuses from the cooperative leadership; and (ii) a factor reducing the costs of fishing, reflecting access to fishing permits and state subsidies for fuel, as well as the absence of costs associated with searching for buyers. To the extent that the size of an annual bonus is dependent on annual catch, the bonus is not appropriately classified as a lump sum payment, as it affects incentives for effort.
We do not have information on how often such catch-dependent bonuses occur, but it is important to note from the above discussion that other forms of compensation also make up the lump sum payment.

While the features above are generally shared by many cooperatives in the Gulf of California area, there are key differences among the three cooperatives for which we have daily logbook data and conduct empirical analysis. Figure 1 shows the location of these cooperatives. Pichilingue is located on the outskirts of La Paz, the largest city in the state and a major market in the region, and Sargento is a short drive away from La Paz. Punta Abreojos, a member of the federation of cooperatives in Northwest Baja California known as FEDECOOP, is on the Pacific side of the peninsula, adjacent to other cooperatives from FEDECOOP. For the purposes of the empirical work below, we utilize the fact that Abreojos effectively has more secure property rights than Pichilingue and Sargento and can therefore manage its own members and restrict access for nonmembers more easily. This arises for five reasons.

First, Abreojos fishes in a relatively isolated area, while Pichilingue and Sargento fish in areas that have many cooperatives and fishers (McCay et al. 2014). There is a larger pool of potential fishers for Pichilingue and Sargento to compete with. Second, Punta Abreojos and the other cooperatives in FEDECOOP successfully retained exclusive fishing rights to lobster, abalone, snails, and a few other species even after 1992 (McCay et al. 2014; Cota-Nieto 2010; J. J. Cota-Nieto, pers. com., May 2012). The ten cooperatives in FEDECOOP each have separate, clearly defined polygons in which no other cooperative of FEDECOOP and no non-FEDECOOP fisher can fish these species, unless it is for subsistence purposes.
Third, even though these polygons were originally designed for the species with exclusive concessions, in practice they provide clear boundaries for other species as well (Cota-Nieto 2010). Cooperatives with adjacent polygons may fish for species without exclusive concessions in the neighbor’s polygon, but this typically involves negotiated agreements between the leaders of the two cooperatives involved (J. J. Cota-Nieto, pers. com., May 2015). Fourth, FEDECOOP has created a system in which fishers in each of the member cooperatives are expected to spend a fraction of their time in monitoring and vigilance efforts to enforce spatial restrictions (McCay et al. 2014). Fifth, FEDECOOP members can be removed from their cooperatives if they fail to sell exclusively to their cooperative or fail to comply with other rules (McCay et al. 2014). The lost benefits from eviction could be much more substantial for Abreojos than for the La Paz cooperatives because of the consistently high value of FEDECOOP fisheries (ensured by productive waters, local monitoring, FEDECOOP’s employment of fisheries scientists, and FEDECOOP’s engagement with the state) (McCay et al. 2014). While exclusive sale to the cooperative may be a nominal requirement for La Paz area cooperative fishers, the cooperative leadership in La Paz does not have the same degree of control over its members to ensure that sales of product outside the cooperative do not occur (J. J. Cota-Nieto, pers. com., May 2014).

To be sure, illegal fishing still occurs in the areas fished by the FEDECOOP cooperatives, but for these five reasons the scale of illegal fishing is likely to be less on the Pacific side of the Gulf. We therefore view Abreojos as an “exclusive” cooperative and Pichilingue and Sargento as “nonexclusive” cooperatives for the purposes of testing our hypotheses below.

Figure 1.
Map of Cooperatives’ Locations
Source: Authors’ map produced using ArcGIS.

Finally, it is important to understand whether any given fisher or cooperative can influence the market price of fish through their catch decisions. There are a large number of fishers in the area. As of 2010, based on data compiled by the National Commission of Aquaculture and Fishing (CONAPESCA), the number of fishers in La Paz alone was estimated at 974 people (748 cooperative members plus 226 unregistered fishers) (Leslie et al. 2015). According to data collected from La Paz fish markets by researchers from CBMC and the Scripps Institution of Oceanography, the three cooperatives we studied (Punta Abreojos, Pichilingue, and Sargento) were estimated to provide approximately 12%, 5%, and 8%, respectively, of the fisheries product sold in the La Paz market (Sanchez, Nieto, Osorio, Erisman, Moreno-Baez, and Aburto-Oropeza 2015). Abreojos is more focused on the export market, however. These numbers come from an effort to enumerate the major players in La Paz markets and may be an overestimate, as smaller sellers are not easy to find. The size of the market shares suggests that these cooperatives will have some market power, but we believe the shares are small enough that this is not a first-order concern.

II. Data and Descriptive Statistics

The empirical analysis uses daily logbook data from the three fishing cooperatives noted above: Pichilingue, Sargento, and Abreojos. Daily data on catches from fishing teams in each cooperative were recorded from January 1, 2007, to December 31, 2009.
Catch records include a team identifier (for Abreojos and Pichilingue only), the common name of the species caught, the weight of the catch (kilograms), and the price per kilogram offered by the cooperative (pesos). The composition of the species fished by the cooperatives partially reflects the biogeography of the Pacific vs. the Gulf coast of B.C.S.; however, there is still substantial overlap, thereby allowing a comparison of the behavior of the different types of cooperatives for a given species.

The logbook data have information on the prices cooperatives paid to their fishermen but, unfortunately, do not have information on the price at which the cooperative sold the catch in the market. Using the Sistema Nacional de Informacion e Integracion de Mercados (SNIIM), available from the Mexican government, we have collected data on daily market prices in La Paz for as many species and dates as possible.5 Using the dates in the cooperative logbooks, these market prices are matched to the cooperative purchases. In cases where a market price is not available for a particular date, the average price for the corresponding week or month is substituted instead (depending on availability).

To examine whether market prices in La Paz are driven by external forces that are exogenous to supply factors in the vicinity of La Paz, we use the SNIIM to obtain market prices from La Nueva Viga, a large national fish market in Mexico City connecting sources to distributors. The La Nueva Viga data contain information on marine fish, crustaceans, freshwater fish, and mollusks/others. B.C.S. is listed as a source only for the fourth category. This, coupled with the fact that other sources of La Nueva Viga catch have only a partial overlap of species with La Paz, limits the number of species that can be matched to the logbook data. In cases where a market price is not available for a particular date in the logbook, the average price for the corresponding week or month is again used.
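The price-matching rule described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function name `match_market_price`, the dictionary layout, and the decision to try the weekly average before the monthly average are assumptions (the text says only that the week or month average is used, depending on availability). The prices in `example_prices` are made up for illustration.

```python
from datetime import date
from statistics import mean


def match_market_price(prices, species, day):
    """Return a market price for (species, day), falling back to averages.

    `prices` is a hypothetical dict mapping (species, date) -> price per kg.
    Fallback order (an assumption): exact date, same ISO week, same month.
    Returns None when the species has no price in that month.
    """
    exact = prices.get((species, day))
    if exact is not None:
        return exact

    # All dated prices recorded for this species.
    dated = [(d, p) for (s, d), p in prices.items() if s == species]

    # Average over the same ISO (year, week) as the logbook date.
    week_key = tuple(day.isocalendar())[:2]
    week = [p for d, p in dated if tuple(d.isocalendar())[:2] == week_key]
    if week:
        return mean(week)

    # Otherwise average over the same calendar month.
    month = [p for d, p in dated if (d.year, d.month) == (day.year, day.month)]
    if month:
        return mean(month)

    return None


# Made-up prices for one species, used only to exercise the fallback logic.
example_prices = {
    ("pulpo", date(2008, 3, 3)): 30.0,
    ("pulpo", date(2008, 3, 5)): 34.0,
    ("pulpo", date(2008, 3, 20)): 40.0,
}
```

With these inputs, a logbook entry dated March 4, 2008 falls in the same ISO week as the first two prices and receives their average, while an entry dated March 12 has no same-week price and receives the monthly average.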
All cooperative and market prices are converted into 2010 Mexican pesos using a Consumer Price Index obtained from the OECD.

We aim to understand how cooperative pricing responds to natural variation that alters population growth rates. The Oceanic Niño Index (ONI) is a three-month running mean of an underlying measure of ENSO cycles, which alter ocean temperature. Data on ONI are publicly available from the National Weather Service Climate Prediction Center.6 These data are matched to every cooperative purchase in the logbook data. The second half of 2007 and first half of 2008 were marked by a “cold episode” (more negative values), while the second half of 2009 saw the onset of a “warm episode” (more positive values).

Warmer ocean temperatures have three potential effects on organisms. First, they negatively affect juvenile recruitment of some species and positively affect recruitment of others. These population growth rate effects ultimately impact catch with a lag that generally ranges from one to seven years. Second, warmer temperatures can either positively or negatively affect adult population abundance by causing individuals to migrate. Third, adult size may be affected through changes in individual growth. The model focuses on population growth rates and does not incorporate the other two effects for the sake of tractability. Therefore, the empirical tests also focus on the first effect.

5 Available at http://www.economia-sniim.gob.mx/i_default.asp.
6 Available at http://www.cpc.noaa.gov/products/analysis_monitoring/ensostuff/ensoyears.shtml.

Finally, we conducted a thorough review of the ecological literature to construct a detailed classification of species on two dimensions.
First, we classify species as “large scale” (i.e., large scale of movement, or highly mobile) or “small-scale” (i.e., small scale of movement, or less mobile). This classification is based primarily on knowledge of the movement of adult organisms, rather than on knowledge of larval dispersal. Second, we create a variable equal to 1 if higher ONI has a positive effect on recruitment (and hence population growth rates) and equal to -1 if higher ONI has a negative effect on recruitment. If the effect on recruitment is unknown to us or if there is no effect on recruitment, we set the variable to 0.

Table 1 provides summary statistics on the final merged data set.7 The first panel shows summary statistics for the three cooperatives together, the second panel shows statistics for Pichilingue and Sargento only, and the third panel shows statistics for Abreojos only.

Table 1. Summary Statistics

All cooperatives
Variable                      N      Mean   Std. dev.     Min.      Max.
Log catch                  5015     4.436     2.062     -1.609    11.258
Log coop price             4310     2.848     1.002      0.729     6.034
Log mkt price              3325     3.365     0.832      0.729     6.061
Large scale (0/1)          5014     0.373     0.484      0.000     1.000
ONI                        5015    -0.181     0.776     -1.400     1.800
Recruit effect (-1/0/1)    5014     0.039     0.527     -1.000     1.000
ONI X recruit effect       5014    -0.018     0.426     -1.800     1.800

Pichilingue and Sargento
Variable                      N      Mean   Std. dev.     Min.      Max.
Log catch                  3091     4.170     1.646      0.000     8.790
Log coop price             3005     3.228     0.708      1.640     6.034
Log mkt price              2145     3.476     0.714      0.950     6.054
Large scale (0/1)          3090     0.312     0.463      0.000     1.000
ONI                        3091    -0.197     0.754     -1.400     1.800
Recruit effect (-1/0/1)    3090     0.067     0.608     -1.000     1.000
ONI X recruit effect       3090    -0.023     0.475     -1.800     1.800

Abreojos
Variable                      N      Mean   Std. dev.     Min.      Max.
Log catch                  1924     4.864     2.537     -1.609    11.258
Log coop price             1305     1.975     1.032      0.729     5.907
Log mkt price              1180     3.163     0.982      0.729     6.061
Large scale (0/1)          1924     0.470     0.499      0.000     1.000
ONI                        1924    -0.155     0.811     -1.400     1.800
Recruit effect (-1/0/1)    1924    -0.005     0.355     -1.000     1.000
ONI X recruit effect       1924    -0.010     0.331     -1.800     1.800

7 Observations where the market price is less than the cooperative price are dropped due to concerns about measurement of market prices. This affects only 762 transactions out of approximately 42,000 individual fishing transactions for which we have cooperative price data.

While the logbook data are at the individual fishing transaction level, the data are aggregated in our tests of the model’s implications to the species-year-month-week level by averaging prices and totaling catch. This is because our own field visits suggested that cooperatives do not usually alter prices on a day-to-day basis.8 Therefore, in all three tables of descriptive statistics, an observation is at the aggregated level.

There is a significant amount of variation in total catch, market prices, and cooperative prices (table 1). Comparing the second and third panels of table 1, the log of cooperative price and log of total catch show a marked difference between the exclusive and nonexclusive cooperatives, despite the fact that average market prices are similar. The log of cooperative price and the log of catch also have a higher coefficient of variation for the exclusive cooperative. The final rows of each panel show the range of variation for the “Recruit Effect” variable and the “ONI X Recruit Effect” interaction.
The majority of observations have a value of 0 for the “Recruit Effect” variable, but approximately 800 observations exhibit a positive recruitment effect, and more than 500 observations exhibit a negative effect (figure 2). This variation permits a test of the model’s third implication regarding population growth rates.

8 Based on daily transaction data, we do sometimes observe multiple prices paid for a given species on a particular day. It is difficult to know whether this is measurement error or variation in price due to differences in time of day (e.g., a higher price for a more inconvenient time).

Figure 2. Frequency Distribution of ONI Effect on Recruits
Source: Authors’ analysis based on cooperative database described in the main text.

To understand the relationship between market prices in La Paz and market prices external to that fishing area, we use the daily transaction data to estimate a regression of La Paz prices on La Nueva Viga prices, separately for each species for which there are data from both sources. Of the 55,841 daily transactions across Abreojos, Sargento, and Pichilingue, the species that match across the logbooks and La Nueva Viga account for only 4,208 transactions. Moreover, some of these transactions have no information on La Paz market prices. In table 2, we run OLS regressions of the log La Paz price on the log La Nueva Viga price by species common name.

The results paint a mixed picture. For five of the nine common names, there is a positive and statistically significant relationship between La Nueva Viga prices and La Paz prices. For these, the R-squared ranges from 0.016 to 0.877, depending on the species. Two of the nine common names have coefficients that are not statistically distinguishable from zero. The remaining two common names have negative and statistically significant relationships. Moreover, these relationships are not simply an artifact of La Nueva Viga sourcing particular species from La Paz. Of the four species that are sometimes sourced from B.C.S. in the data (almeja, calamar, ostion, and pulpo), three have a positive coefficient while one has a negative and significant coefficient. Of the species not sourced from B.C.S., two have a positive coefficient, with cazon having especially high representation in the logbook data. While this exercise is starkly limited by data challenges, it suggests that the prices for some species are not just locally determined in La Paz.

Table 2. Relationship Between La Paz and La Nueva Viga (LNV) Prices

                Common name of species (in Spanish)
Variable        Almeja     Calamar    Camaron    Cazon      Jurel      Lisa       Mojarra    Ostion     Pulpo
LNV log price   0.313**    0.606***   -0.146     0.102***   -0.231***  0.145***   0.124      -0.207***  0.123***
                (0.133)    (0.026)    (0.133)    (0.020)    (0.012)    (0.027)    (0.085)    (0.025)    (0.040)
Constant        -0.182     0.484***   5.769***   2.777***   3.691***   1.971***   1.851***   5.939***   3.637***
                (0.449)    (0.069)    (0.655)    (0.073)    (0.035)    (0.076)    (0.292)    (0.113)    (0.172)
Obs             42         80         35         1528       1903       110        152        113        103
R2              0.121      0.877      0.035      0.016      0.170      0.207      0.014      0.373      0.085

Note: All specifications use OLS with the log of the price in La Paz as the dependent variable.

III. Theoretical Model

We first provide an overview of the model and then lay out the details in separate subsections below. We consider a single-species fishery with one cooperative and a continuum of fishers who are characterized by heterogeneous fishing costs. Each time period in the model is divided into two stages. In the first stage, the cooperative chooses a per unit price $P_c$ to pay cooperative members for their catch. The cooperative buys catch from its members at a price $P_c$ and sells that catch on the market at a higher price $P_m$.
The retained earnings are used to pay for operating costs, pay cooperative leadership, and provide lump-sum transfers back to cooperative members. In the second stage, individual fishers decide whether or not to be in the cooperative, as well as how much fishing effort to exert. Cooperative members can only sell to the cooperative; if they want to sell to the open market, they must leave the cooperative. This gives the cooperative a limited amount of monopsony power.9 The fundamental tradeoff facing the cooperative is that it can increase future stocks by decreasing $P_c$ in the current period (and thereby disincentivizing current effort); but this will lower current earnings and, in the case of a cooperative without exclusive fishing rights, induce some fishers to leave the cooperative and fish independently.

9 In practice, cooperative members could also fish outside the cooperative, against the cooperative’s wishes. This is likely limited in the three cooperatives used in our analysis, but could definitely be a concern for other cooperatives. Our only goal is to illustrate the influence of fishers’ decisions to fish independently versus fish with the cooperative on the cooperative’s decisions, and we introduce this feature in the simplest way possible.

We make four crucial, additional assumptions in the model. First, individual fishers are atomistic: just as individual consumers and producers take market prices as given in the standard competitive model, each individual fisher takes aggregate fishing effort across all fishers and fishery stock as given. Pursuing a model in which individual fishers engage in strategic considerations would be an interesting extension, but we use the simpler approach because of the large number of individual fishers in these fisheries. In addition, the simplification allows for a focus on the basic tradeoff facing the cooperative.

Second, we find an equilibrium in which the highest cost fishers sort into the cooperative. Importantly, low costs in the model represent not just fishing skill, but also how easy it is for fishers to acquire permits for catch, transport catch to market, and purchase fuel and other equipment. Fishers who join the cooperative can lower their costs because the cooperative can acquire permits, transport all catch to market and find buyers, share gear, and coordinate harvesting activities (Ovando et al. 2013). The cost formulation below has the property that high cost fishers obtain a greater benefit from these cooperative activities.

Third, fishers can costlessly move in and out of the cooperative. This assumption, in combination with the assumption of atomistic fishers and no labor-leisure tradeoff for fishers, will ensure that fishers choose whether or not to be in the cooperative in any period $t$ based on a simple comparison of profits in the cooperative versus outside the cooperative in period $t$ alone.10 This substantially simplifies the dynamic problem. Our work in the study areas suggests that it is not in fact costless to move back and forth between cooperative and independent fishing, but we believe the mathematical simplification makes this assumption worthwhile and keeps the focus on our issue of primary concern: how the cooperative trades off between current returns and conservation and how this is affected by the economic and environmental setting.

Fourth, we assume that market prices do not depend on the cooperative’s decisions. As noted above, the cooperatives in our data are important players in the La Paz market, but none of their market shares are above 12%.
There is also evidence above that some of the prices in the La Paz market are driven in important ways by external forces, though those results vary markedly by species and speak to only a fraction of the species caught by our three cooperatives.

The following subsections lay out the details of the model and then develop three testable implications.

The Evolution of Stock

There are both static and dynamic externalities associated with a fisher exerting more effort in a particular time period. The “static” externality comes from the fact that when a fisher increases effort today, he decreases the ease with which other fishers can harvest from today’s stock. The “dynamic” externality comes from the fact that when fishers exert effort today, they reduce the available stock in future periods. We formulate a simple model that captures both externalities.

As in Deacon et al. (2013), we assume that each unit of fishing effort extracts a fixed proportion $\theta$ of the remaining stock. This does not depend on whether a fisher is fishing individually or in a cooperative. If $X_t$ is the initial stock at the beginning of the period, the stock after aggregate effort $H_t$ has been applied across all fishers is $(1-\theta)^{H_t} X_t$, and the overall quantity extracted is $Q_t = \left(1 - (1-\theta)^{H_t}\right) X_t$. Fishers extract simultaneously, and fisher $i$ receives catch $q_{it}$ in proportion to his effort $h_{it}$: $q_{it} = \frac{h_{it}}{H_t} Q_t$. Both an individual’s catch and the marginal return to an individual’s fishing effort are decreasing in the effort of all other fishers.11

10 The model also implicitly assumes that fishers cannot save. This assumption will affect the model only if there is a labor-leisure tradeoff. Without a value for leisure, fishers with savings vehicles would still choose cooperative membership and hours to maximize profits in every period separately.

11 The derivative of fisher $i$’s catch with respect to all others’ effort is $\frac{\partial q_i}{\partial H_{-i}} = \frac{h_i}{H}\left(\frac{\partial Q}{\partial H_{-i}} - \frac{Q}{H}\right)$. The term in parentheses is negative, as can be shown by using a second-order Taylor series expansion of $(1-\theta)^H$ about $H = 0$. To sign the cross-partial, take the derivative of $\frac{\partial q_i}{\partial H_{-i}}$ with respect to $h_i$. Re-arranging terms shows that the cross-partial is $\frac{h_i}{H}\frac{\partial^2 Q}{\partial h_i\,\partial H_{-i}} + \frac{H_{-i} - h_i}{H^2}\left(\frac{\partial Q}{\partial H_{-i}} - \frac{Q}{H}\right)$, which is clearly negative if $H_{-i} \ge h_i$.

Next, we specify how current stocks and harvests translate into future stocks. Two common choices are the Ricker model and the Beverton-Holt model. In both formulations, harvest and stock growth are sequential: The initial stock in a period is harvested, and the remaining population leads to the new stock. Researchers typically use the Beverton-Holt model in settings where recruitment is relatively insensitive to population size because of density-dependent mortality (Clark 1990, 207–9). Since we would like harvest to have important effects on stocks, we instead use the Ricker model (Clark 1990, 199, 202): $X_t = (X_{t-1} - Q_{t-1})\, e^{r_t\left(1 - \frac{X_{t-1} - Q_{t-1}}{K}\right)}$, where $r_t$ denotes an intrinsic rate of growth and $K$ is the carrying capacity.12 Substituting in the expression for aggregate catch yields the following equation for the evolution of stock as a function of effort:

$$X_t = (1-\theta)^{H_{t-1}} X_{t-1}\, e^{r_t\left(1 - \frac{(1-\theta)^{H_{t-1}} X_{t-1}}{K}\right)} \tag{1}$$

We do not use the most common fishery model, the Gordon-Schaefer model.13 We are interested in the dynamic trajectory of prices and harvest toward steady state, and finding an analytical expression for the optimal trajectory is generally not possible when we complicate the cooperative’s problem by introducing a joining decision. This forces us to use a numerical solution procedure, so it is natural to use a discrete time stock-recruitment model.
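As a numerical illustration of the stock dynamics in equation (1), the sketch below applies harvest and then Ricker growth within a period. The function names and any parameter values passed to them are illustrative assumptions, not the calibration used in the paper; `theta` stands for the fixed proportion of the remaining stock that each unit of effort extracts.

```python
import math


def harvest(stock, total_effort, theta):
    """Split the initial stock into (aggregate catch, escapement).

    Escapement is the stock left after total_effort units of effort have each
    removed a proportion theta of whatever stock remained.
    """
    escapement = (1.0 - theta) ** total_effort * stock
    return stock - escapement, escapement


def individual_catch(own_effort, total_effort, total_catch):
    """Catch is shared in proportion to each fisher's share of total effort."""
    return own_effort / total_effort * total_catch


def next_stock(stock, total_effort, theta, growth_rate, capacity):
    """Next period's stock: harvest first, then Ricker growth of the escapement."""
    _, escapement = harvest(stock, total_effort, theta)
    return escapement * math.exp(growth_rate * (1.0 - escapement / capacity))
```

Two properties are easy to confirm with this sketch: with zero effort, a stock equal to the carrying capacity reproduces itself, and catch plus escapement always add up to the initial stock.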
Fortunately, Eberhardt (1977) shows that the Ricker growth model is mathematically related to the standard continuous time logistic growth model under certain additional assumptions. Moreover, our way of expressing the static externality is related to the discrete time analog of the Gordon-Schaefer harvest function.14

12 Under some choices of $r_t$ and $K$, this function can lead to limit-cycle oscillations without steady convergence to any stable equilibrium when there is no harvesting. Our choices of parameters in the numerical simulations ensure this does not occur (Clark 1990, 202).

13 The Gordon-Schaefer model is a continuous time model in which the growth in stock follows the differential equation $\frac{dX}{dt} = rX\left(1 - \frac{X}{K}\right) - aHX$, where $H$ is harvesting effort, $K$ is carrying capacity, $r$ is the intrinsic growth rate, and $a$ is a parameter governing the return to effort.

14 Assuming that harvest and stock growth are sequential and focusing on a constant harvest $H$ for the period $t-1$ to $t$, solve the differential equation $\frac{dX}{dt} = -aHX$ for $X(t)$. The solution implies $X(t) = e^{-aH} X(t-1)$. This is equivalent to our $(1-\theta)^{H} X(t-1)$ for some $\theta \in (0, 1)$.

Individual Fisher and Cooperative Optimization Problems

We assume that in any period $t_0$, fisher $i$ chooses, $\forall t = t_0, \ldots, T$, whether to be in the cooperative sector ($D_{it} = 1$) or not ($D_{it} = 0$) and what hours to work in the cooperative sector ($h_{itC}$) or the independent sector ($h_{itI}$). In doing so, she takes current and future cooperative and market prices as given. She solves the problem:

$$\max_{h_{itI},\, h_{itC},\, D_{it}} \sum_{t=0}^{T} \left\{ \delta^t \left[ P_{mt} \frac{h_{itI}}{H_t} \left(1 - (1-\theta)^{H_t}\right) X_t - \frac{1}{\alpha_i} h_{itI}^2 \right] (1 - D_{it}) + \delta^t \left[ P_{ct} \frac{h_{itC}}{H_t} \left(1 - (1-\theta)^{H_t}\right) X_t - \frac{1}{\alpha_i + \beta} h_{itC}^2 + S_t \right] D_{it} \right\}$$

where $\delta$ is the discount rate, $P_{mt}$ is the market price, $P_{ct}$ is the cooperative price, $H_t$ is total effort, and $S_t$ is a lump-sum payment from the cooperative based on the revenue it accrues from the difference in the market price and the cooperative price. Cooperative fishers are atomistic and therefore take $H_t$ as given, in addition to prices. This will imply that they take $S_t$ as given. (We state how $S_t$ is related to prices, stocks, and aggregate effort below in the cooperative’s problem.) Heterogeneity across fishers comes through $\alpha_i$, which denotes the inverse of the cost of effort for fisher $i$. If the fisher remains independent, the cost function is $c_{itI} = \frac{1}{\alpha_i} h_{itI}^2$. If, instead, fisher $i$ joins the cooperative, the cost function is $c_{itC} = \frac{1}{\alpha_i + \beta} h_{itC}^2$, with $\beta > 0$. $\beta$ represents the reduction in costs derived from fishing inside the cooperative.

The cooperative maximizes the present discounted value of current and future harvests from period 0 to period $T$, minus the present value of members’ costs from period 0 to period $T$:

$$\max_{P_{ct}} \sum_{t=0}^{T} \delta^t \left( P_{mt}\, Q_{ct}(P_{ct}) - \int_{i \in \text{coop}} \frac{1}{\alpha_i + \beta} \left[ h^*_{itC}(P_{ct}) \right]^2 g(\alpha_i)\, d\alpha_i \right)$$

where $\delta$ is the discount rate, $Q_{ct}(P_{ct})$ is the aggregate catch by cooperative members as a function of the cooperative price, $h^*_{itC}(P_{ct})$ is the optimal effort choice of each member as a function of the cooperative price, and $g(\alpha_i)$ is the probability density function of $\alpha_i$. We assume that $\alpha_i$ has an exponential distribution: $g(\alpha_i) = e^{-\alpha_i}$. The term $i \in \text{coop}$ indicates that the integral is taken over those who choose to be members. When solving the model below, we will show that fishers with $\alpha_i$ below a threshold $\alpha^*_t$ will select into the cooperative, and this threshold is a function of $P_{ct}$. Total quantity caught by the cooperative is:

$$Q_{ct} = \frac{H_{ct}}{H_t} \left(1 - (1-\theta)^{H_t}\right) X_t \tag{2}$$

where $H_{ct}$ is the total effort expended in the cooperative.
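Because each fisher takes total effort, the stock, and prices as given, the per-period effort choice in the problem above is a concave quadratic in own effort and has a closed-form solution from its first-order condition. The sketch below works this out numerically. The function names are illustrative assumptions; `cost_inverse` stands for the inverse cost parameter of an independent fisher, or that parameter plus the cooperative cost reduction for a member, and `transfer` stands for the lump-sum payment, which does not affect the effort choice.

```python
def catch_per_effort(stock, total_effort, theta):
    """Catch obtained per unit of own effort, given everyone's total effort."""
    return (1.0 - (1.0 - theta) ** total_effort) * stock / total_effort


def period_profit(effort, price, cost_inverse, cpue, transfer=0.0):
    """Revenue minus quadratic effort cost, plus any lump-sum transfer."""
    return price * cpue * effort - effort ** 2 / cost_inverse + transfer


def optimal_effort(price, cost_inverse, cpue):
    """Effort solving the first-order condition
    price * cpue - 2 * effort / cost_inverse = 0."""
    return cost_inverse * price * cpue / 2.0


def maximized_profit(price, cost_inverse, cpue, transfer=0.0):
    """Profit at the optimal effort: cost_inverse * (price * cpue)**2 / 4."""
    return cost_inverse * (price * cpue) ** 2 / 4.0 + transfer
```

Evaluating `period_profit` at `optimal_effort(...)` reproduces `maximized_profit(...)`, which has the same quadratic-in-price form as the profit expressions used in the equilibrium analysis.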
The lump-sum transfer $S_t$ is then:

$$S_t = \frac{P_{mt} Q_{ct} - P_{ct} Q_{ct}}{\int_{i \in coop} g(\alpha_i)\, d\alpha_i}\, f(\beta) \tag{3}$$

where the first term is total accrued revenue divided by the mass of fishers selecting into the cooperative, while $f(\beta)$ represents the share of revenues per member remaining after expending money to produce $\beta$; this includes expenditures for transporting goods to market, acquiring permits, lobbying the government for fuel subsidies, etc.

Equilibrium

To find a subgame perfect Nash equilibrium of the model, first note that individual fishers’ strictly dominant strategy in any subgame is to select $(D_{it}, h_{itC}, h_{itI})$ in every period to maximize profits in that period. Because each fisher takes $H_t$ as given, the period problem is quadratic in own effort; for an independent fisher, for example, the first-order condition gives $h^*_{itI} = \alpha_i P_{mt} \left(1 - (1-\theta)^{H_t}\right) X_t / (2H_t)$. The fisher profits corresponding to the optimal effort choices in independent and cooperative fishing, respectively, are:

$$\Pi_{iIt} = \frac{P_{mt}^2\, \alpha_i}{4 H_t^2} \left(1 - (1-\theta)^{H_t}\right)^2 X_t^2 \tag{4}$$

$$\Pi_{iCt} = \frac{P_{ct}^2\, (\alpha_i + \beta)}{4 H_t^2} \left(1 - (1-\theta)^{H_t}\right)^2 X_t^2 + S_t \tag{5}$$

Fisher $i$ will choose to be in the cooperative in period $t$ ($D_{it} = 1$) if and only if $\Pi_{iCt} \geq \Pi_{iIt}$, and then exerts profit-maximizing effort given that choice.

Fishers’ strategies imply expressions for aggregate effort (in the cooperative, in the independent sector, and overall) and for lump-sum transfers, as a function of the cooperative price.
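The closed-form profits in equations 4 and 5 can be checked numerically. The sketch below is illustrative only: it assumes the quadratic period problem described above, treats total effort as given, and uses hypothetical values for $\alpha_i$, $H_t$, and the cooperative price (the function names are likewise introduced here for illustration and are not from the paper).

```python
import math

def catch_per_effort(H, X, theta):
    """Catch per unit of effort, (1 - (1 - theta)**H) * X / H, taken as given by each fisher."""
    return (1.0 - (1.0 - theta) ** H) * X / H

def profit(h, price, skill, H, X, theta):
    """Period profit: revenue price * A * h minus the effort cost h**2 / skill."""
    return price * catch_per_effort(H, X, theta) * h - h ** 2 / skill

def optimal_effort(price, skill, H, X, theta):
    """First-order condition of the quadratic problem: h* = skill * price * A / 2."""
    return 0.5 * skill * price * catch_per_effort(H, X, theta)

def closed_form_profit(price, skill, H, X, theta):
    """Equation 4 with skill = alpha_i; equation 5 net of S_t with skill = alpha_i + beta."""
    return price ** 2 * skill / (4.0 * H ** 2) * (1.0 - (1.0 - theta) ** H) ** 2 * X ** 2

# Hypothetical illustrative values: alpha_i, H, and P_c are assumptions, not estimates.
P_m, P_c, alpha_i, beta, H, X, theta = 70.0, 50.0, 1.2, 1.5, 5.0, 0.5, 0.1

h_ind = optimal_effort(P_m, alpha_i, H, X, theta)
h_coop = optimal_effort(P_c, alpha_i + beta, H, X, theta)
profit_ind = profit(h_ind, P_m, alpha_i, H, X, theta)
profit_coop_net_of_transfer = profit(h_coop, P_c, alpha_i + beta, H, X, theta)
```

Adding $S_t$ to `profit_coop_net_of_transfer` and comparing the sum with `profit_ind` reproduces the membership rule $\Pi_{iCt} \geq \Pi_{iIt}$ for a single fisher.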
Aggregate effort in the cooperative is:

$$H_{ct} = \frac{P_{ct}}{2 H_t} \left(1 - (1-\theta)^{H_t}\right) X_t \cdot \int_{i \in coop} g(\alpha_i)(\alpha_i + \beta)\, d\alpha_i. \tag{6}$$

Correspondingly, aggregate effort in the independent sector is given by:

$$H_{It} = \frac{P_{mt}}{2 H_t} \left(1 - (1-\theta)^{H_t}\right) X_t \cdot \int_{i \notin coop} g(\alpha_i)\, \alpha_i\, d\alpha_i. \tag{7}$$

Writing the identity $H_t = H_{ct} + H_{It}$ gives us an implicit formula for $H_t$ in any given time period $t$:

$$H_t = \frac{1 - (1-\theta)^{H_t}}{2 H_t} \left[ P_{ct} X_t \int_{i \in coop} g(\alpha_i)(\alpha_i + \beta)\, d\alpha_i + P_{mt} X_t \int_{i \notin coop} g(\alpha_i)\, \alpha_i\, d\alpha_i \right]. \tag{8}$$

This equation has a unique solution $H_t$.15 Using equations 2, 3, and 6, we see that the lump-sum transfer is:

$$S_t = f(\beta)(P_{mt} - P_{ct}) \left(1 - (1-\theta)^{H_t}\right)^2 X_t^2\, \frac{P_{ct} \cdot \int_{i \in coop} g(\alpha_i)(\alpha_i + \beta)\, d\alpha_i}{2 H_t^2 \cdot \int_{i \in coop} g(\alpha_i)\, d\alpha_i}. \tag{9}$$

We look for an equilibrium in which all fishers $i$ with $\alpha_i < \alpha^*$ select into the cooperative. This is a natural solution to expect for two reasons: cooperative members face a “tax” on catch since $P_{ct} < P_{mt}$, and members receive the benefit of $\beta$. The tax most negatively impacts the high-$\alpha$ fishers, while $\beta$ disproportionately benefits the low-$\alpha$ fishers. Both effects make cooperative membership most enticing for the low-$\alpha$ fishers. Substituting equation 9 into equation 5, comparing the result with equation 4, using the exponential pdf for $g(\alpha_i)$, and simplifying, the cooperative knows that all fishers $i$ for whom the following is true will join the cooperative:

$$P_{ct}^2\, \beta + 2 f(\beta)(P_{mt} - P_{ct}) P_{ct} (1 + \beta) \geq \alpha \left[ P_{mt}^2 - P_{ct}^2 + 2 f(\beta)(P_{mt} - P_{ct}) P_{ct}\, \frac{1}{e^{\alpha} - 1} \right]. \tag{10}$$

It is possible to show that the set of $\alpha$ for which this inequality is satisfied indeed takes the form $\alpha \in [0, \alpha^*]$.16

In equilibrium, the cooperative selects $P_{ct}$ for every period taking individual fishers’ strategies as given.
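For a given cooperative price, the membership threshold and total effort can be computed with two one-dimensional root searches. The following is a minimal sketch, assuming the exponential density $g(\alpha) = e^{-\alpha}$, threshold selection on $[0, \alpha^*]$, and $0 < P_{ct} < P_{mt}$; bisection is a choice made here for transparency, and the function names and the example price are hypothetical.

```python
import math

def threshold_sides(a, P_c, P_m, beta, f_beta):
    """Left- and right-hand sides of the membership condition (equation 10) at alpha = a."""
    wedge = 2.0 * f_beta * (P_m - P_c) * P_c
    lhs = P_c ** 2 * beta + wedge * (1.0 + beta)
    rhs = a * (P_m ** 2 - P_c ** 2) + wedge * a / math.expm1(a)
    return lhs, rhs

def coop_threshold(P_c, P_m, beta, f_beta, iters=200):
    """alpha* at which the condition holds with equality (assumes 0 < P_c < P_m, beta > 0)."""
    lo, hi = 1e-12, 1.0
    while threshold_sides(hi, P_c, P_m, beta, f_beta)[1] < threshold_sides(hi, P_c, P_m, beta, f_beta)[0]:
        hi *= 2.0                      # the right-hand side is increasing and unbounded
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lhs, rhs = threshold_sides(mid, P_c, P_m, beta, f_beta)
        lo, hi = (mid, hi) if rhs < lhs else (lo, mid)
    return 0.5 * (lo + hi)

def effort_rhs(H, P_c, P_m, X, beta, theta, a_star):
    """Right-hand side of equation 8 when members are the fishers with alpha in [0, a_star]."""
    coop_mass = (1.0 + beta) - (a_star + beta + 1.0) * math.exp(-a_star)  # int_0^a* (a+beta) e^-a da
    ind_mass = (a_star + 1.0) * math.exp(-a_star)                         # int_a*^inf a e^-a da
    harvested = -math.expm1(H * math.log1p(-theta))                       # 1 - (1 - theta)**H
    return harvested / (2.0 * H) * X * (P_c * coop_mass + P_m * ind_mass)

def total_effort(P_c, P_m, X, beta, theta, a_star, iters=200):
    """Positive solution H of H = effort_rhs(H, ...), found by bisection."""
    lo = 1e-12
    hi = math.sqrt(0.5 * X * (P_c * (1.0 + beta) + P_m)) + 1.0   # safe upper bracket
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid < effort_rhs(mid, P_c, P_m, X, beta, theta, a_star):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical example: P_c = 50 is an assumption; the other values follow the simulation parameters.
P_c, P_m, X, beta, theta, f_beta = 50.0, 70.0, 0.5, 1.5, 0.1, 0.5
a_star = coop_threshold(P_c, P_m, beta, f_beta)
H_t = total_effort(P_c, P_m, X, beta, theta, a_star)
```

Under these assumptions the threshold depends only on prices, $\beta$, and $f(\beta)$, so it can be computed before solving for $H_t$.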
Given this form of selection into the cooperative and the fishers’ optimal effort choices, the cooperative’s problem in any subgame beginning at period $t_0$ becomes:

$$\max_{P_{ct_0}, \ldots, P_{cT}} \sum_{t = t_0}^{T} \delta^{t - t_0} \cdot \left(2 P_{mt} P_{ct} - P_{ct}^2\right) \frac{\left(1 - (1-\theta)^{H_t}\right)^2 X_t^2}{4 H_t^2} \int_0^{\alpha^*} (\alpha_i + \beta)\, g(\alpha_i)\, d\alpha_i, \tag{11}$$

subject to the constraints:

$$X_t = (1-\theta)^{H_{t-1}} X_{t-1} \exp\left( r_t \left( 1 - \frac{(1-\theta)^{H_{t-1}} X_{t-1}}{K} \right) \right) \tag{12}$$

$$H_t = \frac{\left(1 - (1-\theta)^{H_t}\right) X_t}{2 H_t} \left( P_{ct} \int_0^{\alpha^*} g(\alpha_i)(\alpha_i + \beta)\, d\alpha_i + P_{mt} \int_{\alpha^*}^{+\infty} g(\alpha_i)\, \alpha_i\, d\alpha_i \right) \tag{13}$$

$$P_{ct}^2\, \beta + 2 f(\beta)(P_{mt} - P_{ct}) P_{ct} (1 + \beta) = \alpha^* \left[ P_{mt}^2 - P_{ct}^2 + 2 f(\beta)(P_{mt} - P_{ct}) P_{ct}\, \frac{1}{e^{\alpha^*} - 1} \right]. \tag{14}$$

15 First multiply both sides by $H_t$ so that the resulting modified equation takes the form $H_t^2 = [1 - (1-\theta)^{H_t}] A$, where $A$ is a function of other terms in the model. Note that $H_t^2$ is continuous, has a derivative of zero at $H_t = 0$, and the derivative is strictly increasing with $H_t$. In contrast, the derivative of $1 - (1-\theta)^{H_t}$ is continuous, strictly greater than zero at $H_t = 0$ as long as $\theta > 0$, and this derivative is strictly decreasing with $H_t$. It follows that the modified equation has a unique positive solution. While 0 is a solution of the modified equation, it is not a solution of the original equation. Therefore, the original equation has a unique solution.

16 To do so, note first that the left-hand side is not dependent on $\alpha$. The limit of the right-hand side as $\alpha$ approaches zero is, after an application of L’Hôpital’s rule, $2 f(\beta)(P_{mt} - P_{ct}) P_{ct}$, which is less than the left-hand side. The limit of the right-hand side as $\alpha$ approaches $+\infty$ is $+\infty$. Moreover, the derivative of the right-hand side is always positive. The derivative is $P_m^2 - P_c^2 + 2 f(\beta)(P_m - P_c) P_c\, \frac{e^{\alpha} - 1 - e^{\alpha}\alpha}{(e^{\alpha} - 1)^2}$. A sufficient condition for this to be positive is that $e^{2\alpha} - e^{\alpha} > e^{\alpha}\alpha$, or simply $e^{\alpha} > \alpha + 1$. But the latter expression follows immediately from a Taylor series expansion of $e^{\alpha}$ about 0. It follows that there is only one crossing of the right-hand side and left-hand side, at a point $\alpha^*$, and all fishers with $\alpha_i \leq \alpha^*$ select into the cooperative.

To solve this dynamic programming problem for the cooperative’s price trajectory and develop basic implications of the model, we discretize the stock level and apply a numerical backward induction algorithm. That is, we begin with the last period $T$ and find the optimal choice of $P_{cT}$ and optimal value of the objective function for period $T$ at each possible stock level $X_T$. We then move to period $T-1$ and find the optimal choice of $P_{cT-1}$ and optimal value of the objective function from period $T-1$ onwards at each possible stock level $X_{T-1}$, given the continuation values from the previous step. We then move to the previous period and so on. In each step, we solve for $H_t$ and $\alpha^*_t$ using equations 13 and 14. This structure means that the computation time is linear in the number of periods and the number of stock buckets.

The computation uses 70 periods and discretizes the stock into 1000 values between 0 and 1. Only the simulated outcomes for periods 10–60 appear in the figures below, as the behavior at the beginning of the cooperative’s problem is influenced markedly by the initial stock, and the behavior near time $T$ is influenced by the desire to draw down stock rapidly.

We use the following parameters in all simulation results presented here: $X_0 = 0.5$, $K = 1$, $r(t) = 0.3 + 0.3 \sin(2t)$, $\delta = 0.95$, $\theta = 0.1$, $P_m = 70$, $g(\alpha) = e^{-\alpha}$, $\beta = 1.5$, $f(\beta) = 0.5$. We normalize stock to a carrying capacity of 1 because no other parameters are denominated in the same units, and so we expect that its absolute size does not affect behavior.
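The backward induction described above can be sketched compactly for the simpler exclusive-cooperative case, in which every fisher is a member and $\int_0^{\infty} (\alpha + \beta) e^{-\alpha}\, d\alpha = 1 + \beta$. This is an illustrative sketch rather than the authors’ code: the price grid, the nearest-grid-point treatment of next period’s stock, and the small default grid sizes are all assumptions made here so that the example runs quickly (the text reports 70 periods and 1000 stock values).

```python
import math

def solve_effort(A, theta, iters=80):
    """Positive root of H**2 = (1 - (1 - theta)**H) * A, by bisection (cf. footnote 15)."""
    log_keep = math.log1p(-theta)                      # log(1 - theta) < 0
    lo, hi = 1e-12, math.sqrt(A) + 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        # mid**2 - (1 - (1 - theta)**mid) * A, written with expm1 for accuracy
        if mid * mid + math.expm1(mid * log_keep) * A < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def backward_induction(T=12, n_stock=31, n_price=20, P_m=70.0, beta=1.5,
                       theta=0.1, delta=0.95, K=1.0):
    """Exclusive-cooperative price policy on a stock grid (grid sizes are assumptions)."""
    def growth(t):
        return 0.3 + 0.3 * math.sin(2 * t)             # r(t) from the parameter list

    stocks = [K * (i + 1) / n_stock for i in range(n_stock)]      # grid on (0, K]
    prices = [P_m * (j + 1) / n_price for j in range(n_price)]    # candidate prices in (0, P_m]
    value_next = [0.0] * n_stock                       # no value after the last period
    policy = []
    for t in range(T, -1, -1):
        value_now, price_now = [], []
        for X in stocks:
            best_value, best_price = -math.inf, None
            for P_c in prices:
                H = solve_effort(0.5 * X * P_c * (1.0 + beta), theta)    # eq. 13, all fishers in coop
                harvested = -math.expm1(H * math.log1p(-theta))          # 1 - (1 - theta)**H
                payoff = ((2.0 * P_m * P_c - P_c ** 2) * (harvested * X) ** 2
                          / (4.0 * H * H) * (1.0 + beta))                # period term of eq. 11
                escaped = (1.0 - harvested) * X
                X_next = escaped * math.exp(growth(t + 1) * (1.0 - escaped / K))   # eq. 12
                j = min(n_stock - 1, max(0, round(X_next / K * n_stock) - 1))      # nearest grid point
                total = payoff + delta * value_next[j]
                if total > best_value:
                    best_value, best_price = total, P_c
            value_now.append(best_value)
            price_now.append(best_price)
        policy.append(price_now)
        value_next = value_now
    policy.reverse()                                   # policy[t][i]: price in period t at stocks[i]
    return stocks, policy, value_next

stocks, policy, value_at_0 = backward_induction()
```

Each period visits every stock grid point once, which is the sense in which the run time grows linearly in the number of periods and stock buckets. A nonexclusive version would additionally solve equation 14 for $\alpha^*_t$ at each candidate price before solving equation 13.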
On the other hand, $\alpha$ and $P_m$ both factor into the revenues and costs faced by fishers, and so their relationships are important. The distribution of $\alpha$ was chosen to allow some nuance in the proportion of the fishers who select into the cooperative, while $P_m$ was set to ensure that the level of harvest would be positive. We choose $\delta$ to represent a cooperative that gives significant weight to future harvests. To investigate the recruitment effects of the ENSO cycle (a cyclic fluctuation that completes one full cycle over the course of multiple years) on the cooperative price trajectory, we let $r_t$ be a sine function of the time variable.

Testable Implications

The simulated choices of log cooperative prices over time appear in figure 3. Figure 4 is structured analogously and shows the resulting log cooperative catch over time.17 In all panels of the figures, the left axis provides the population growth rate $r$. Below, we explain each panel of the figure, the mechanics of the simulations underlying the panel, and the resulting testable implication.

Implication 1: Price Levels

We begin by comparing the two types of models suggested above: a cooperative with endogenous membership coexisting with independent fishers (“nonexclusive” cooperative) and a cooperative operating with no independent sector (“exclusive” cooperative). To operationalize the “exclusive” cooperative model, we simply use the relationship $H_t = H_{ct}$, replace $\alpha^*$ with $\infty$, and solve the model in the same way as described above.

Chosen log prices are strikingly lower for the exclusive cooperative than the nonexclusive cooperative across all periods (figure 3, panel 1). The economic intuition for this result is clear: the exclusive cooperative has more capability of managing stocks and exercises this capability by lowering fishing effort through lower prices.
The nonexclusive cooperative, on the other hand, faces fishing pressure from the independent sector and knows that lowering prices could induce members to join the independent sector; for both reasons, the nonexclusive cooperative exerts higher fishing effort by setting higher prices.18 Since healthier stocks are produced by more aggressive management, the exclusive cooperative tends to have higher log catch than the nonexclusive cooperative (figure 4, panel 1).

17 Because of the carrying capacity and initial stock choices, catch is always between 0 and 1. Consequently, the log of catch is negative.

Figure 3. Evolution of Cooperative Log Price Over Time
Notes: Panel 1 shows the population growth rate $r$ and log price choices for exclusive and nonexclusive cooperatives. Panel 2 shows how species scale (“small range” versus “large range”) affects the difference in log price between the nonexclusive cooperative and the exclusive cooperative. Panel 3 shows how the difference between the nonexclusive log price and the exclusive log price responds to fluctuations in the population growth rate.
Source: Authors’ analysis based on the theoretical model and simulation methods described in the main text.

Figure 4. Evolution of Cooperative Log Catch Over Time
Notes: Panel 1 shows the population growth rate $r$ and log catch for exclusive and nonexclusive cooperatives. Panel 2 shows how species scale (“small range” versus “large range”) affects the difference in log catch between the nonexclusive cooperative and the exclusive cooperative. Panel 3 shows how the difference between the nonexclusive log catch and the exclusive log catch responds to fluctuations in the population growth rate.
Source: Authors’ analysis based on the theoretical model and simulation methods described in the main text.

Implication 2: High Mobility vs. Low Mobility

In addition, we examine the effect of the extent of species mobility (“species scale”) on cooperative decision-making. We use the model described above for species where individuals exhibit relatively little movement. For species where individual organisms exhibit high geographic mobility, we assume the stock in any given period is subject to some amount of fishing effort $H_{et}$ that is external to the given fishery, so that $H_t = H_{ct} + H_{It} + H_{et}$. We calculate $H_{et}$ as in equation (7), assuming that this external effort comes from a population of the same size and skill distribution as the focal population; the only difference is that we assume all fishers operate independently in this external sector.

The difference between the nonexclusive cooperative price and the exclusive cooperative price for species with a large scale of movement (“large range”) is generally smaller than the difference for small-scale species (“small range”) (figure 3, panel 2). For a large-scale species (that is, one that is highly mobile) a local property right has less meaning, as users outside the local area will have an impact on stocks. Accordingly, the exclusive cooperative should behave more like the nonexclusive cooperative, and exert less control on effort by paying a higher price, in cases where the relevant species has a large geographic range.
As should be expected from this reasoning, the difference in log catch between the nonexclusive cooperative and the exclusive cooperative is especially large in magnitude for the small-range species (figure 4, panel 2).

Implication 3: Changes in Population Growth Rates

Exclusive cooperative prices covary more markedly with population growth rates than nonexclusive cooperative prices (figure 3, panel 1). In fact, the correlation between prices and population growth rate is 0.81 for small-range species fished by exclusive cooperatives, which is statistically significantly different from zero at the 1% level. In contrast, the correlation for nonexclusive cooperatives is only 0.03 and is statistically indistinguishable from zero.19

Restricting attention to small-range species, where management is most effective, the difference in log prices between the nonexclusive cooperative and the exclusive cooperative oscillates with the population growth rate, so that in times of low projected growth, the exclusive cooperative sets a low price relative to the nonexclusive cooperative (figure 3, panel 3). The intuition is again clear: when low growth is projected, the exclusive cooperative wants to conserve the resource by cutting back on fishing effort; in contrast, the nonexclusive cooperative is confronted with an independent sector and the threat that some of its members will leave to the independent sector if it manages effort too aggressively.

The consequences for differences in catch across the cooperatives are more subtle than in the case of implications 1 and 2. The reason is that catch is a function of both cooperative price and stock, and current and future growth rates affect both price and stock in complicated and potentially offsetting ways. The peaks in catch differences occur prior to the peaks in price differences; this is because price differences are at their highest when growth rates (and hence stocks) are relatively low (figure 4, panel 3).
Therefore, we do not have a sharp testable implication for how differences in log catch are correlated with changing growth rates.

18 The price level for the exclusive cooperative increases as period 60 nears. As the cooperative nears the final period, it draws down its stocks by increasing the price.

19 Correlations are computed using only periods 10–60. To verify that this pattern has to do with sustainable management and not just the exclusive nature of the cooperative, we also compute this correlation for an alternative model in which the exclusive cooperative cannot predict growth rates. We find that the correlation is statistically indistinguishable from zero in this case. In the case of large-range species, this correlation cannot be distinguished from zero for either type of cooperative.

Influence of Assumptions on Testable Implications

To summarize, the three testable implications are: 1) An exclusive cooperative will on average pay lower prices to its members than a nonexclusive cooperative but will have higher catch; 2) The gap in prices and catch will on average be smaller in magnitude for species that have a larger scale of movement; 3) The gap in prices will rise when population growth rates fall and fall when population growth rates rise. Here, we briefly speculate about how altering key assumptions of the model would affect our main theoretical results.

First, consider the assumption that low-$\alpha$ (high-cost) fishers sort into the cooperative, while others stay out. Suppose instead that fishers with the lowest costs sorted into the cooperative. This could be the case, for example, if high-cost fishers do not benefit as much from the equipment, information, and marketing ability provided by the cooperative.
In this case, a cooperative without exclusive rights would face a somewhat different problem. An increase in the current buying price would still increase current profits at the expense of future profits, but the marginal fisher that enters the cooperative would now be worse, so that the marginal profit from that fisher would be less. This suggests that the cooperative has an incentive to manage its stock more aggressively than what we see above. Correspondingly, the differences between an exclusive and nonexclusive cooperative would be less. Any differences we do see in the empirical work could therefore be an underestimate of the influence of mechanisms from our model.

Second, consider the assumption that exiting and entering the cooperative is costless. It may be the case that, instead, when a fisher leaves the cooperative, the cooperative makes it prohibitively costly for them to return. This will affect the results for nonexclusive cooperatives, where the joining decision plays a role. The change will give cooperatives an additional lever with which to keep members from leaving to take advantage of short-term profit opportunities outside the cooperative. This makes the cooperative more willing to manage its stock aggressively and depress buying prices when necessary. This reasoning suggests that, if this assumption were changed, the nonexclusive cooperative would behave more like the exclusive one. Again, any differences we do see in the empirical work could therefore be an underestimate of the influence of mechanisms from our model.

Third, consider the assumption that cooperatives cannot influence the market price. If this were not the case, cooperatives would have an additional consideration: an increase (decrease) in the cooperative buying price would tend to decrease (increase) the market price. There would then be an incentive to keep production low in order to increase prices.
This effect will tend to depress average cooperative buying prices in both exclusive and nonexclusive cooperatives, but if both types of cooperatives are selling into the same market, it is difficult to predict which type would see the larger change. The relationship between this effect and scale or growth rates is even more complicated. For instance, if growth rates are low, the cooperative knows that stocks will be relatively low in the future. With market power, the cooperative will have less incentive to recoup stocks compared to our model above. But again, it is difficult to predict whether this will affect exclusive or nonexclusive cooperatives more. We discuss this assumption again when we explore alternative explanations for our empirical results below.

IV. Empirical Analysis

This section develops empirical specifications from the theory in order to test a key assumption of the model and the model’s three implications. The theory provides specific guidance as to what methods are appropriate for these tests and how estimated coefficients should be interpreted. In a few dimensions, the theory is too simple to be applied literally to the empirical work. For instance, the theory uses a single-species model, while in reality each cooperative fishes many different species. Moving to a multispecies model would entail adding significant complexity but would be a valuable avenue for future research. Here, we view the cooperative as performing the optimization above independently for each species.

Testing Assumption of Members’ Response to Cooperative Prices

To test the assumption that cooperative members increase their catch in response to an increase in the cooperative price, we use the daily transaction data with one observation for each recorded sale by a fishing team.
In the model, a cooperative member $i$’s catch in period $t$ is given by the following:

$$q^*_{ict} = \frac{P_{ct}(\alpha_i + \beta)}{2} \left[ \frac{\left(1 - (1-\theta)^{H_t}\right) X_t}{H_t} \right]^2. \tag{15}$$

Taking logs gives the following:

$$\log(q^*_{ict}) = -\log(2) + \log(P_{ct}) + 2\log\left(1 - (1-\theta)^{H_t}\right) - 2\log(H_t) + 2\log(X_t) + \log(\alpha_i + \beta). \tag{16}$$

The first two terms on the right-hand side pose no complications for estimation, but the remaining terms do. First, $X_t$, $\beta$, and $\alpha_i$ are unobservable to us. Second, $H_t$ is implicitly a function of other quantities from the model (see equation 8 above). In the case of either a nonexclusive cooperative or the exclusive cooperative with a large-range species, $H_t$ is a function of $P_{ct}$, $P_{mt}$, $X_t$, and parameters $\beta$ and $\theta$. For an exclusive cooperative fishing a small-range species, the same is true, except $H_t$ is not a function of $P_{mt}$. These considerations and equation 16 motivate the following log-linear approximation:

$$\log(q^*_{ict}) = \delta_i + \alpha_0 \log(P_{ct}) + \alpha_1 \log(P_{mt}) + \epsilon_{ict}, \tag{17}$$

where $\delta_i$ is a fishing-team fixed effect capturing $\alpha_i$, $\beta$, and $\theta$, and $\epsilon_{ict}$ is a residual. The key identification concern is that the residual is clearly correlated with $P_{ct}$ (and, perhaps, with $P_{mt}$ as well). This is because the residual contains $X_t$, and $X_t$ is chosen in part by the cooperative when it sets $P_{ct}$. In addition, the residual may contain an important factor that is outside the model: time-varying shocks to the cost of fishing.

If one can address this concern, then an estimate of $\alpha_0$ includes two economic items: the direct, positive impact of increasing the cooperative price on an individual’s catch; and the negative, indirect impact coming from the resulting increase in $H_t$.20 The term $\alpha_1$ captures the negative impact of increasing $P_{mt}$ through the resulting increase in $H_t$. Our primary goal is to verify that the net impact of the cooperative price on catch, $\alpha_0$, is positive.
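To make the role of the fixed effect $\delta_i$ in equation 17 concrete, the sketch below applies the within (group-demeaning) estimator to synthetic data. Everything in it is hypothetical: the data are simulated with assumed coefficients (0.7 and −0.3), the group labels stand in for whatever fixed-effect cell is used, and the function name is introduced here only for illustration.

```python
import random
from collections import defaultdict

def within_ols(groups, y, x1, x2):
    """OLS of y on (x1, x2) after subtracting group means (fixed-effects estimator)."""
    sums = defaultdict(lambda: [0.0, 0.0, 0.0, 0])
    for g, a, b, c in zip(groups, y, x1, x2):
        s = sums[g]
        s[0] += a; s[1] += b; s[2] += c; s[3] += 1
    dy, d1, d2 = [], [], []
    for g, a, b, c in zip(groups, y, x1, x2):
        sy, s1, s2, n = sums[g]
        dy.append(a - sy / n); d1.append(b - s1 / n); d2.append(c - s2 / n)
    s11 = sum(u * u for u in d1)
    s22 = sum(v * v for v in d2)
    s12 = sum(u * v for u, v in zip(d1, d2))
    s1y = sum(u * w for u, w in zip(d1, dy))
    s2y = sum(v * w for v, w in zip(d2, dy))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

# Synthetic data: 200 groups of 5 observations; the group effect is correlated with log P_c.
rng = random.Random(0)
TRUE_A0, TRUE_A1 = 0.7, -0.3          # assumed values for illustration only
groups, log_q, log_pc, log_pm = [], [], [], []
for g in range(200):
    effect = rng.gauss(0.0, 1.0)
    for _ in range(5):
        pc = 1.0 + 0.5 * effect + rng.gauss(0.0, 0.3)
        pm = rng.gauss(2.0, 0.3)
        groups.append(g)
        log_pc.append(pc)
        log_pm.append(pm)
        log_q.append(effect + TRUE_A0 * pc + TRUE_A1 * pm + rng.gauss(0.0, 0.05))

a0_hat, a1_hat = within_ols(groups, log_q, log_pc, log_pm)
```

Because the simulated group effect is correlated with the log cooperative price, a pooled regression would be biased, while demeaning within groups removes the effect. The quantity of interest in the text is the sign of the coefficient on the log cooperative price.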
A positive net impact is a necessary prerequisite to the model’s assumption that the cooperative can control its members’ effort by changing cooperative buying prices.

We use the following idea to address the identification concerns: unlike in the model, changes in $P_{ct}$ and changes in $X_t$ and fishing costs do not happen at exactly the same point in time. We assume that in reality, when the cooperative changes $P_{ct}$, both individual catch $q^*_{ict}$ and aggregate effort $H_t$ respond to the change immediately; however, the stock $X_t$ and unobservable time-varying shocks to fishing do not respond immediately. Under this identifying assumption, if we can examine a narrow enough window around the price change, we can reasonably assume that the expected value of the stock $X_t$ and unobservable fishing costs is similar on either side of the price change.

To operationalize this idea, we estimate equation 17 using fixed effects at the fishing team-species-month level and the fishing team-species-week level. The fishing team is not identified in the Sargento data, so we restrict the analysis to Pichilingue and Abreojos. Unlike in the model, in reality members may not fish for certain species at certain times. Consequently, some members have zero catch for a species in particular periods. The results below first examine the responsiveness of catch to prices including just the intensive margin, and then include both the intensive and extensive margin.

20 We are very appreciative of an anonymous referee who provided this important insight.

The first set of results appears in table 3, which examines fishers who catch a positive amount of a species. The first panel (columns 1–3) uses the Pichilingue sample, while the second panel (columns 4–6) uses the Abreojos sample.
Both panels are structured analogously: the first column estimates equation 17 using fixed effects at the fishing team-species-month level and conventional standard errors, the second column does the same but clusters standard errors at the fishing team-species-month level, and the third column uses fixed effects and clustered standard errors at the fishing team-species-week level. Moving from the first to second column shows the effect of clustering, while moving from the second to third column shows the effect of focusing on a narrower time interval.21

Table 3. Responsiveness of Catch to Price: Intensive Margin

Dep. variable: log catch

| Variable | Pichilingue (1) | Pichilingue (2) | Pichilingue (3) | Abreojos (4) | Abreojos (5) | Abreojos (6) |
|---|---|---|---|---|---|---|
| Log coop price | 0.563** | 0.563 | 0.782** | 0.670*** | 0.670*** | 0.704*** |
| | (0.220) | (0.345) | (0.352) | (0.029) | (0.051) | (0.046) |
| Log mkt price | −0.599 | −0.599 | 0.364 | 0.310 | 0.310 | 1.124** |
| | (1.319) | (1.244) | (1.335) | (0.221) | (0.274) | (0.505) |
| Obs | 2618 | 2618 | 2618 | 23,586 | 23,586 | 23,586 |
| Fixed effects | Sp-M-FT | Sp-M-FT | Sp-wk-FT | Sp-M-FT | Sp-M-FT | Sp-wk-FT |
| Clustering | none | Sp-M-FT | Sp-wk-FT | none | Sp-M-FT | Sp-wk-FT |
| Num. groups | 1493 | 1493 | 2110 | 3169 | 3169 | 7153 |
| Within-R2 | 0.006 | 0.006 | 0.018 | 0.025 | 0.025 | 0.041 |

Note: All specifications use linear fixed effects estimation. First three columns use only Pichilingue observations, and next three columns use only Abreojos observations. Columns 1, 2, 4, and 5 include fixed effects at the species-month-fishing team level, while columns 3 and 6 include fixed effects at the species-week-fishing team level. “Num. groups” indicates the number of unique combinations at each level. Standard errors clustered at the level indicated in the “Clustering” row.

The coefficients on log cooperative price are generally positive and statistically significantly different from zero at conventional levels, though the P-value in column 2 increases to 0.103.
The estimated elasticities of catch with respect to price range from 0.563 to 0.782, with more stability for Abreojos across specifications. The coefficients on the market price are negative in columns 1–2 as expected given the discussion above, but cannot be distinguished from zero in any of the columns except one. The one exception is column 6, where we see an unexpected positive sign. As noted in the context section above, Abreojos may have nonprice mechanisms with which to induce members to fish; the significant positive coefficient is consistent with this, and may suggest that our model captures only one mechanism through which cooperatives control effort.22

The results that incorporate both the intensive and the extensive margin appear in table 4. To incorporate both margins, we consider the exponentiated form of equation 17:

$$q^*_{ict} = \exp\left(\delta_i + \alpha_0 \log(P_{ct}) + \alpha_1 \log(P_{mt}) + \epsilon_{ict}\right). \tag{18}$$

21 For identification, the fixed effects model does not use fixed effect groups that only have one observation. For Pichilingue, 19–37% of groups have more than one observation (depending on the specification), and for Abreojos 70–77% of groups have more than one observation.

22 We also estimated these specifications without including the market price, even though this specification is only consistent with the theory in the case of small-range species at Abreojos. In these cases, when standard errors are clustered, the coefficient on log cooperative price is still positive and significant for Abreojos at the 1% level, ranging from 0.718 to 0.739. The Pichilingue coefficient falls to 0.375 (week-level regressions) and 0.205 (month-level regressions), with P-values of 0.141 and 0.335, respectively. Similar qualitative patterns hold when performing the same exercise for the specifications examining both the intensive and extensive margin in table 4.

Table 4. Responsiveness of Catch to Price: Including Zero Catch

Dep. variable: catch

| Variable | Pichilingue (1) | Pichilingue (2) | Pichilingue (3) | Abreojos (4) | Abreojos (5) | Abreojos (6) |
|---|---|---|---|---|---|---|
| Log coop price | 0.441*** | 0.441 | 0.543 | 1.489*** | 1.489*** | 1.500*** |
| | (0.024) | (0.548) | (0.512) | (0.007) | (0.172) | (0.147) |
| Log mkt price | −2.566*** | −2.566 | −0.791 | −0.455*** | −0.455 | 0.898 |
| | (0.091) | (1.568) | (1.855) | (0.020) | (0.553) | (0.664) |
| Obs | 9341 | 9341 | 4458 | 60,497 | 60,497 | 41,695 |
| Fixed effects | Sp-M-FT | Sp-M-FT | Sp-wk-FT | Sp-M-FT | Sp-M-FT | Sp-wk-FT |
| Clustering | none | Sp-M-FT | Sp-wk-FT | none | Sp-M-FT | Sp-wk-FT |
| Num. groups | 1345 | 1345 | 1470 | 3147 | 3147 | 6966 |

Note: All specifications use fixed effects Poisson estimation. First three columns use only Pichilingue observations, and next three columns use only Abreojos observations. Columns 1, 2, 4, and 5 include fixed effects at the species-month-fishing team level, while columns 3 and 6 include fixed effects at the species-week-fishing team level. “Num. groups” indicates the number of unique combinations at each level used in estimation. Standard errors clustered at the level indicated in the “Clustering” row.

Given that this equation is consistent with the conditional expectation function of the fixed effects Poisson model, and given that a large fraction of observations are zeros, we estimate the model using the Poisson quasi-maximum likelihood estimator with fixed effects at the fishing team-time-species level.23 As discussed in Wooldridge (2010), ch. 18, and Burgess et al. (2012), the Poisson model is a quasi-maximum likelihood estimator that yields consistent estimates as long as the conditional expectation is correctly specified, regardless of the exact distribution of the underlying error.

Table 4 is structured analogously to table 3. The coefficients can be interpreted as the elasticity of the conditional expectation of catch with respect to price.
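The property that the Poisson quasi-maximum likelihood estimator only needs the conditional mean to be right can be illustrated with a small simulation. The sketch below is a simplified, single-regressor version written for this purpose, not the estimation code behind table 4: it maximizes the fixed-effects Poisson conditional log-likelihood by Newton steps, and the simulated outcome is deliberately not Poisson (it is zero half the time). All names and parameter values are hypothetical.

```python
import math
import random

def fe_poisson_slope(groups, iters=50, tol=1e-12):
    """Slope of a one-regressor fixed-effects Poisson model via the conditional likelihood.

    groups: list of lists of (x, y) pairs that share one fixed effect.
    """
    b = 0.0
    for _ in range(iters):
        score, info = 0.0, 0.0
        for obs in groups:
            total_y = sum(y for _, y in obs)
            if total_y == 0.0:
                continue                       # all-zero groups carry no information
            shift = max(b * x for x, _ in obs)
            w = [math.exp(b * x - shift) for x, _ in obs]
            s = sum(w)
            mean_x = sum(wi * x for wi, (x, _) in zip(w, obs)) / s
            var_x = sum(wi * (x - mean_x) ** 2 for wi, (x, _) in zip(w, obs)) / s
            score += sum(y * (x - mean_x) for x, y in obs)
            info += total_y * var_x
        step = max(-1.0, min(1.0, score / info))   # damped Newton step
        b += step
        if abs(step) < tol:
            break
    return b

# Synthetic data: the conditional mean is c_g * exp(1.5 * x), but outcomes are 0 or twice the mean.
rng = random.Random(0)
TRUE_ELASTICITY = 1.5                          # assumed value for illustration only
sample = []
for _ in range(2000):
    c_g = math.exp(rng.gauss(0.0, 0.5))        # group fixed effect
    obs = []
    for _ in range(5):
        x = rng.random()                       # stands in for a log price
        mean = c_g * math.exp(TRUE_ELASTICITY * x)
        obs.append((x, mean * rng.choice([0.0, 2.0])))
    sample.append(obs)

elasticity_hat = fe_poisson_slope(sample)
```

When $x$ is a log price, the slope is the elasticity of expected catch with respect to that price, which matches the interpretation given in the text. Groups whose outcomes are all zero drop out of the estimation automatically.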
The coefficient on log cooperative price is positive and significant for every column except columns 2 and 3. This means that the extensive margin effect appears stronger for Abreojos than Pichilingue. Except in one case, the coefficient on market price is negative (and significant in columns 1 and 4).24

23 The observations row in the table shows the number used for estimation; observations that do not show variation within group are not used. Among the observations used for estimation, 6871 (month specification) and 2480 (week specification) observations are zero for Pichilingue. The corresponding numbers for Abreojos are 36,933 and 18,296.

24 The exception is again column 6, the week specification for Abreojos, where the coefficient is positive but statistically indistinguishable from zero.

A complementary empirical approach, presented in the supplemental appendix available at https://academic.oup.com/wber, examines changes in catch at the time of discrete events when there is a large and sustained change in cooperative prices for a species. Such events are difficult to pinpoint in the data. Still, the signs of estimated coefficients are consistent with those above, though not always statistically significant. Therefore, there is evidence that cooperative members respond to cooperative buying prices in the way posited by the model.

Implication 1: Price Levels Across Cooperatives

Next, we test the model’s implications for how cooperatives choose buying prices. For this portion of the analysis, we first aggregate the data to the cooperative-week-species level, taking the sum of catch and the average log price across the week. The first implication of the model is that, if market prices are constant and growth rates and $X_0$ are the same across cooperatives, then on average the exclusive cooperative will pay lower prices to its members than a nonexclusive cooperative but will have higher catch.
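The aggregation step described above (sum of catch and average log price within each cooperative-week-species cell) can be sketched as follows. The record layout and field names are assumptions made for illustration, and the sample records are made-up values, not data from the paper.

```python
import math
from collections import defaultdict

def aggregate_weekly(transactions):
    """Collapse sale records to cooperative-week-species cells.

    Each record is a dict with hypothetical keys: coop, week, species, catch, price.
    Returns total catch and the mean of log price for every cell.
    """
    cells = defaultdict(lambda: {"catch": 0.0, "log_price_sum": 0.0, "n": 0})
    for rec in transactions:
        cell = cells[(rec["coop"], rec["week"], rec["species"])]
        cell["catch"] += rec["catch"]
        cell["log_price_sum"] += math.log(rec["price"])
        cell["n"] += 1
    return {
        key: {"total_catch": c["catch"], "mean_log_price": c["log_price_sum"] / c["n"]}
        for key, c in cells.items()
    }

# Made-up records for illustration only.
records = [
    {"coop": "coop_A", "week": 1, "species": "sp1", "catch": 10.0, "price": 20.0},
    {"coop": "coop_A", "week": 1, "species": "sp1", "catch": 5.0, "price": 5.0},
    {"coop": "coop_B", "week": 1, "species": "sp1", "catch": 7.0, "price": 30.0},
]
weekly = aggregate_weekly(records)
```

Averaging the log of price, rather than taking the log of the average price, follows the wording in the text; the two differ whenever prices vary within a week.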
To operationalize this comparison, note that in our forward-looking model without uncertainty, the cooperative price in one period $P_{ct}$ will be a function of the exogenous variables for every period, the initial stock $X_0$, and the cost parameter. Therefore, if we denote the exclusive cooperative by $E$ and the nonexclusive cooperative by $N$, we have for $d = E, N$: $\log(P_{ctd}) = g_d(\log(P_{m0}), \ldots, \log(P_{mT}), r_0, \ldots, r_T, X_0, \beta)$, where the $g_d(\cdot)$ are functions. In a more general case, the cost term $\beta$ could also vary by time. A first-order Taylor series approximation of the equations about the expected log market price $\overline{\log(P_m)}$ and the expected growth rate $\bar{r}$ yields:

$$\log(P_{ctd}) = \bar{g}_d + g_{dp0}\left(\log(P_{m0}) - \overline{\log(P_m)}\right) + \ldots + g_{dpT}\left(\log(P_{mT}) - \overline{\log(P_m)}\right) \tag{19}$$

$$\qquad + g_{dr0}(r_0 - \bar{r}) + \ldots + g_{drT}(r_T - \bar{r}) + \epsilon_{td} \tag{20}$$

where $\bar{g}_d$ is the function evaluated at $\overline{\log(P_m)}$ and $\bar{r}$ and contains $X_0$. The terms $g_{dpt}$ and $g_{drt}$ give the derivatives of $g_d$ with respect to $\log(P_{mt})$ and $r_t$ evaluated at the mean values of these variables, respectively. Finally, $\epsilon_{td}$ contains both approximation error and period-cooperative-specific shocks to cost (if $\beta$ is allowed to vary by time).
If $D_{td}$ is a dummy variable equal to 1 when $d = E$ and 0 otherwise, then the two equations can be combined:

$$\log(P_{ctd}) = \bar{g}_N + (\bar{g}_E - \bar{g}_N) D_{td} + \sum_{t=0}^{T} \Big[ g_{Nrt}(r_t - \bar{r}) + D_{td}(g_{Ert} - g_{Nrt})(r_t - \bar{r}) \Big] \tag{21}$$

$$\qquad + \sum_{t=0}^{T} \Big[ g_{Npt}\big(\log(P_{mt}) - \overline{\log(P_m)}\big) + D_{td}(g_{Ept} - g_{Npt})\big(\log(P_{mt}) - \overline{\log(P_m)}\big) \Big] \tag{22}$$

$$\qquad + \varepsilon_{tN} + D_{td}(\varepsilon_{tE} - \varepsilon_{tN}) \tag{23}$$

This motivates the following regression equation:

$$\log(P_{ctd}) = a_0 + a_1 D_{td} + u_{td} \tag{24}$$

where:

$$u_{td} = \sum_{t=0}^{T} \Big[ g_{Npt}\big(\log(P_{mt}) - \overline{\log(P_m)}\big) + D_{td}(g_{Ept} - g_{Npt})\big(\log(P_{mt}) - \overline{\log(P_m)}\big) \Big]$$

$$\qquad + \sum_{t=0}^{T} \Big[ g_{Nrt}(r_t - \bar{r}) + D_{td}(g_{Ert} - g_{Nrt})(r_t - \bar{r}) \Big] + \varepsilon_{tN} + D_{td}(\varepsilon_{tE} - \varepsilon_{tN})$$

Since both the exclusive and nonexclusive cooperative see the same market prices and growth rates in any period, the expected value of $u_{td}$ conditional on $D_{td}$ is just $E\big(\varepsilon_{tN} + D_{td}(\varepsilon_{tE} - \varepsilon_{tN}) \mid D_{td}\big)$.

Implication 1 of the model for prices is that $(\bar{g}_E - \bar{g}_N) = a_1 < 0$. The above reasoning makes clear what the threats to interpreting $a_1$ in this way are. First, $E\big(\varepsilon_{tN} + D_{td}(\varepsilon_{tE} - \varepsilon_{tN}) \mid D_{td}\big)$ may not be zero. To take an example, fishing costs $b$ may differ between exclusive and nonexclusive cooperatives because of differences in species caught. Another possibility is that, since Abreojos fishes some species that the other cooperatives do not, the two types of cooperatives systematically see different market prices for output. To deal with this important issue, we use species-time fixed effects. The fixed effects permit a comparison of cooperative prices within species-time period. Second, $X_0$ may not be the same across cooperatives within species.25

25 Coastal ocean productivity varies temporally, due to seasonal and longer-term drivers (e.g., ENSO). Ocean productivity also varies spatially between the Pacific and Gulf coasts of BCS (Lluch-Cota et al. 2010; Leslie et al. 2015), which could create differences in the productivity of fish populations. These effects are still not well understood.

In this
case, $a_1$ reflects both the mechanism stressed in the model simulations and the difference in initial stocks. We discuss these identification concerns further below.

Finally, we can follow the same reasoning as above to develop an estimating equation for catch:

$$\log(Q_{ctd}) = b_0 + b_1 D_{td} + v_{td} \tag{25}$$

Implication 1 of the model for catch is that $b_1 > 0$. The identification concerns noted above are applicable here as well.

We estimate equations 24 and 25 using fixed effects at the species-quarter, species-month, and species-week levels.26 From the point of view of flexibly capturing time-varying unobservable costs, the species-week specification is preferable. However, it uses a more limited subset of the data for identification. Reassuringly, the results are very similar with all three approaches.

The left panel of table 5 shows the results for log cooperative price, while the right panel shows the results for log weekly catch. Within species and time period, prices are more than one log point lower in Abreojos than in the other cooperatives (columns 1–3). The magnitude is quite similar across the columns. This suggests that within a species-quarter combination, omitted determinants of log cooperative prices that vary by week or month are not strongly correlated with the Abreojos dummy. The coefficient on the Abreojos dummy in the catch specifications also has the expected sign (columns 4–6). Within species and time period, log catch is substantially higher in Abreojos than in the other cooperatives.

Table 5.
Implication 1: Price and Catch Across Cooperatives

| Variable | Log price (1) | Log price (2) | Log price (3) | Log catch (4) | Log catch (5) | Log catch (6) |
|---|---|---|---|---|---|---|
| Abreojos | -1.095*** | -1.112*** | -1.124*** | 0.984*** | 1.037*** | 1.002*** |
| | (0.029) | (0.023) | (0.016) | (0.183) | (0.148) | (0.125) |
| Constant | 3.180*** | 3.185*** | 3.189*** | 4.059*** | 4.039*** | 4.052*** |
| | (0.009) | (0.007) | (0.005) | (0.070) | (0.057) | (0.048) |
| Obs | 4310 | 4310 | 4310 | 5015 | 5015 | 5015 |
| Num. groups | 626 | 1379 | 3505 | 703 | 1572 | 4081 |
| Within-R2 | 0.790 | 0.840 | 0.916 | 0.035 | 0.047 | 0.081 |
| Fixed effects | Sp-qtr | Sp-month | Sp-week | Sp-qtr | Sp-month | Sp-week |

Note: Sample includes observations at the weekly level from Abreojos, Pichilingue, and Sargento. The omitted category is Pichilingue/Sargento. All specifications use linear fixed effects at the species-quarter (sp-qtr), species-month (sp-month), or species-week (sp-week) level. Standard errors are clustered at the level of the fixed effect.

Above, we raised a number of endogeneity concerns. The stability of the coefficient estimates across the columns in table 5 may alleviate some of these concerns. Moreover, some of these concerns are less problematic when one considers both the price and catch results simultaneously. For instance, the catch results could be driven by the fact that initial stocks of all species are exogenously higher on the Pacific side of B.C.S. (near Abreojos) compared to the Gulf of California side (near Pichilingue and Sargento). But this by itself would not explain the negative coefficient in the price regressions. Similarly, differences in the number of members across Abreojos and the other cooperatives could explain the differences in catch but not necessarily the differences in prices. If market prices are systematically lower for Abreojos than for the other cooperatives even within a species (since Abreojos sells in part to different markets), this could explain the price results, but not the catch results.

26 The sample in both regressions uses only weeks in which at least one catch was recorded in the logbooks.
Standard errors are clustered at the level of the fixed effect.

Nevertheless, there is still a class of relevant endogeneity concerns: If the area around Abreojos is more productive ecologically, fishing costs could simply be lower for Abreojos; this could lead to lower cooperative prices and higher catch totals. The next subsection shows that, while this effect may be at work, it does not fully capture the patterns in the data.

Implication 2: Role of Species Scale in Price Gaps

Next, we examine the second implication of the model: The gap in prices and catch between cooperatives with exclusive property rights and cooperatives without exclusive rights will be smaller in magnitude for species that have a larger scale of movement.

The development of an empirical specification to test implication 2 is similar to the case of implication 1. Instead of allowing the functions $g(\cdot)$ to differ based only on whether the cooperative is exclusive or not, there are now four possible combinations $(d, x)$, $d = E, N$, $x = L, S$: exclusive, large scale (EL); exclusive, small scale (ES); nonexclusive, large scale (NL); and nonexclusive, small scale (NS). If $L_x$ is a dummy variable that is 1 for large scale species and 0 otherwise, the Taylor series expansion about the scale-specific expected growth rates and expected log prices yields:

$$\log(P_{ctdx}) = \bar{g}_{NS} + (\bar{g}_{ES} - \bar{g}_{NS}) D_{td} + (\bar{g}_{NL} - \bar{g}_{NS}) L_x + \big[(\bar{g}_{EL} - \bar{g}_{NL}) - (\bar{g}_{ES} - \bar{g}_{NS})\big] D_{td} L_x + u_{tdx} \tag{26}$$

where $u_{tdx}$ is a function of prices, growth rates, and $\varepsilon_{tdx}$ analogous to the one in the previous subsection, except now including the scale-specific expected growth rates and expected log prices, $D_{td}$, $L_x$, and the interaction of the two.
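To see why the coefficient on the interaction term in equation 26 is a double difference, it may help to write out the four cell means. The short derivation below is a sketch that sets $D_{td}$ and $L_x$ to each of their four combinations and assumes, for illustration only, that the conditional mean of $u_{tdx}$ is zero in every cell.

```latex
\begin{align*}
E[\log(P_{ctdx}) \mid D_{td}=0, L_x=0] &= \bar{g}_{NS} \\
E[\log(P_{ctdx}) \mid D_{td}=1, L_x=0] &= \bar{g}_{NS} + (\bar{g}_{ES} - \bar{g}_{NS}) = \bar{g}_{ES} \\
E[\log(P_{ctdx}) \mid D_{td}=0, L_x=1] &= \bar{g}_{NS} + (\bar{g}_{NL} - \bar{g}_{NS}) = \bar{g}_{NL} \\
E[\log(P_{ctdx}) \mid D_{td}=1, L_x=1] &= \bar{g}_{ES} + (\bar{g}_{NL} - \bar{g}_{NS})
   + \big[(\bar{g}_{EL} - \bar{g}_{NL}) - (\bar{g}_{ES} - \bar{g}_{NS})\big] = \bar{g}_{EL} \\[4pt]
\underbrace{(\bar{g}_{EL} - \bar{g}_{NL})}_{\text{gap, large scale}}
 - \underbrace{(\bar{g}_{ES} - \bar{g}_{NS})}_{\text{gap, small scale}}
 &= \text{coefficient on } D_{td} L_x
\end{align*}
```

Under that assumption, the interaction coefficient is the exclusive-minus-nonexclusive gap for large scale species minus the same gap for small scale species.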
This motivates the following regression equation:

$$\log(P_{ctdx}) = a_0 + a_1 D_{td} + a_2 L_x + a_3 D_{td} L_x + u_{tdx} \tag{27}$$

Assuming again that the two types of cooperatives see the same market prices and growth rate for a given species, the expectation of $u_{tdx}$ conditional on $D_{td}$ and $L_x$ simplifies to the following:

$$E\Big(\varepsilon_{NS} + (\varepsilon_{ES} - \varepsilon_{NS}) D_{td} + (\varepsilon_{NL} - \varepsilon_{NS}) L_x + \big[(\varepsilon_{EL} - \varepsilon_{NL}) - (\varepsilon_{ES} - \varepsilon_{NS})\big] D_{td} L_x \,\Big|\, D_{td}, L_x\Big)$$

Implication 2 of the model is that, with all else held equal, $a_3 > 0$. As above, there are two types of threats to interpreting $a_3$ as reflecting the model’s mechanisms. First, the expectation in the expression just above may not be zero. This could happen, for instance, if the difference in fishing costs between the exclusive and nonexclusive cooperative varies by the scale of the species. Since one source of this issue is differences in the type of species caught, we again use species-time fixed effects. The second type of identification concern is that the difference in $X_0$ between the exclusive cooperative and the nonexclusive cooperatives could vary depending on scale.

For both identification concerns, the crucial issue is whether the difference between the exclusive and nonexclusive cooperatives varies by scale. For example, a difference in fishing costs between exclusive and nonexclusive cooperatives that is invariant to species scale biases the estimate of $a_1$ but not the estimate of $a_3$. In this sense, the test of implication 2 is more robust than that of implication 1 and is analogous to a differences-in-differences approach.

Analogous reasoning leads to an estimating equation for catch:

$$\log(Q_{ctdx}) = b_0 + b_1 D_{td} + b_2 L_x + b_3 D_{td} L_x + v_{tdx} \tag{28}$$

Implication 2 of the model for catch is that $b_3 < 0$.
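The fixed effects version of the price regression above can be illustrated with a small simulation. The sketch below is not the authors’ estimation code: it generates synthetic data with assumed coefficients (a gap of -1.2 for small scale species and an interaction of 0.15), demeans every variable within a hypothetical species-period cell, and runs ordinary least squares on the demeaned data. The helper name `within_ols` and all column names are invented for the example. Because the large scale dummy is constant within a species-period cell, the fixed effects absorb it, so only the cooperative dummy and the interaction enter the regression.

```python
import numpy as np
import pandas as pd

def within_ols(df, y, xs, group):
    """OLS after demeaning y and xs within each fixed-effect group (illustrative helper)."""
    cols = [y] + xs
    demeaned = df[cols] - df.groupby(group)[cols].transform("mean")
    beta, *_ = np.linalg.lstsq(demeaned[xs].to_numpy(), demeaned[y].to_numpy(), rcond=None)
    return dict(zip(xs, beta))

# Synthetic data: the coefficients -1.2 and 0.15 are assumptions for the demo,
# not estimates from the paper.
rng = np.random.default_rng(0)
rows = []
for species, large in [("a", 0), ("b", 0), ("c", 1), ("d", 1)]:
    for period in range(50):
        cell_effect = rng.normal()  # species-period shock shared by both cooperative types
        for D in (0, 1):            # D = 1 for the exclusive cooperative
            log_p = 3.2 + cell_effect - 1.2 * D + 0.15 * D * large + rng.normal(scale=0.05)
            rows.append({"cell": f"{species}-{period}", "D": D, "L": large, "log_price": log_p})

df = pd.DataFrame(rows)
df["DxL"] = df["D"] * df["L"]

# L itself is omitted: it does not vary within a species-period cell.
est = within_ols(df, "log_price", ["D", "DxL"], "cell")
print(est)
```

The shared `cell_effect` plays the role of any species-period shock (for example, to market prices or costs) that hits both cooperative types equally; demeaning removes it, which is the sense in which the comparison is made within species and time period.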
As in the previous subsection, we estimate equations 27 and 28 using fixed effects first at the species-quarter level, then the species-month level, and finally at the species-week level.27 The left panel of table 6 shows the regressions for log price, and the right panel shows the regressions for log weekly catch.

27 The sample in both regressions uses only weeks in which at least one catch was recorded in the logbooks. Standard errors are clustered at the level of the fixed effect.

Table 6. Implication 2: Price/Catch Differences by Scale

| Variable | Log price (1) | Log price (2) | Log price (3) | Log catch (4) | Log catch (5) | Log catch (6) |
|---|---|---|---|---|---|---|
| Abreojos | -1.174*** | -1.188*** | -1.187*** | 1.347*** | 1.354*** | 1.240*** |
| | (0.029) | (0.020) | (0.014) | (0.293) | (0.217) | (0.173) |
| Abreojos X large scale | 0.168*** | 0.167*** | 0.149*** | -0.778** | -0.705** | -0.561** |
| | (0.058) | (0.046) | (0.035) | (0.351) | (0.289) | (0.246) |
| Constant | 3.178*** | 3.183*** | 3.185*** | 4.060*** | 4.044*** | 4.062*** |
| | (0.009) | (0.007) | (0.005) | (0.069) | (0.056) | (0.047) |
| Obs | 4309 | 4309 | 4309 | 5014 | 5014 | 5014 |
| Num. groups | 625 | 1378 | 3504 | 702 | 1571 | 4080 |
| Within-R2 | 0.795 | 0.845 | 0.920 | 0.041 | 0.053 | 0.088 |
| Fixed effects | Sp-qtr | Sp-month | Sp-week | Sp-qtr | Sp-month | Sp-week |

Note: Sample includes observations at the weekly level from Abreojos, Pichilingue, and Sargento. The omitted category is Pichilingue/Sargento. All specifications use linear fixed effects at the species-quarter (sp-qtr), species-month (sp-month), or species-week (sp-week) level. Standard errors are clustered at the level of the fixed effect.

There is again stability in the coefficients across the various specifications. As expected, the coefficient on the Abreojos dummy, which reflects the price gap for small scale species, is always negative and significant (columns 1–3). More interesting is the coefficient on the interaction between the Abreojos dummy and the large scale dummy.
As predicted by the theory, this coefficient is positive and statistically different from zero. The magnitude suggests that the gap in prices between Abreojos and the other cooperatives is reduced by roughly 0.15–0.17 log points (about 15–17%) when considering species that are more highly mobile (columns 1–3).

The results for catch also confirm the theory: While weekly catch is higher in Abreojos than in the other cooperatives, this difference is cut roughly in half for large scale species (columns 4–6). This is consistent with the idea that Abreojos exerts less control of effort over large scale species relative to small scale species.

These results also narrow the class of alternative explanations that can capture the data. For example, if lower fishing costs near Abreojos than near the other cooperatives are driving the results, then it must be the case that the cost difference is lower for large scale species than for small scale species.

Implication 3: Changes in Growth Rates

The third implication of the model is that the difference in prices between a nonexclusive cooperative and an exclusive cooperative will rise when population growth rates fall, and fall when population growth rates rise. Essentially, the exclusive cooperative acts more aggressively to limit effort when growth rates are projected to be low.

To develop an estimating equation to test the prediction, first consider the cooperative’s maximization problem in equation 11 above. Let $\lambda_t$ be the Lagrange multiplier on the constraint for the $X_{t+1}$ stock equation, and note that $H_t$ is an implicitly defined function of $(X_t, P_{ct}, P_{mt})$ and $a^*$ is an implicitly defined function of $(P_{ct}, P_{mt})$. The first-order condition with respect to $P_{ct}$ can then be written in general as $G(P_{ct}, P_{mt}, X_t, r_{t+1}, \lambda_t, \delta, b, \theta) = 0$, where $G(\cdot)$ is a function and $(\delta, b, \theta)$ give the discount rate, costs of fishing, and the extraction rate from fishing effort.
This means that $\log(P_{ctd})$ for a cooperative of type $d = E, N$ can be written in general as $\log(P_{ctd}) = g_d\big(\log(P_{mt}), X_{td}, r_{t+1}, \lambda_{td}, \delta, b, \theta\big)$, where this could be further generalized to allow for time-varying and cooperative-varying costs by substituting $b_{td}$ for $b$.

We begin again with a Taylor series approximation of this function about the expected values of all arguments:

$$\log(P_{ctd}) = \bar{g}_d + g_{dp}\big(\log(P_{mt}) - \overline{\log(P_m)}\big) + g_{dr}(r_{t+1} - \bar{r}) \tag{29}$$

$$\qquad + g_{dx}(X_{td} - \bar{X}_d) + g_{d\lambda}(\lambda_{td} - \bar{\lambda}_d) + \varepsilon_{td} \tag{30}$$

where $\bar{g}_d$ is the function evaluated at the expected values and all other bars indicate expected values. The additional subscripts on $g$ indicate derivatives with respect to a variable, evaluated at the expected values. Here, $\varepsilon_{td}$ contains approximation error and, if $b$ is allowed to vary by time and cooperative type, period- and cooperative-specific shocks to fishing costs (similarly, $\varepsilon_{td}$ could reflect shocks to other parameters).

Both cooperatives see the same values of $P_{mt}$ and $r_{t+1}$. Let $D_{td}$ equal 1 when $d = E$ and 0 otherwise. Then, similarly to above, we have:

$$\log(P_{ctd}) = \bar{g}_N + D_{td}(\bar{g}_E - \bar{g}_N) + \big(g_{Np} + D_{td}(g_{Ep} - g_{Np})\big)\big(\log(P_{mt}) - \overline{\log(P_m)}\big) \tag{31}$$

$$\qquad + \big(g_{Nr} + D_{td}(g_{Er} - g_{Nr})\big)(r_{t+1} - \bar{r}) + \big(g_{N\lambda} + D_{td}(g_{E\lambda} - g_{N\lambda})\big)(\lambda_{td} - \bar{\lambda}_d) \tag{32}$$

$$\qquad + \big(g_{Nx} + D_{td}(g_{Ex} - g_{Nx})\big)(X_{td} - \bar{X}_d) + \varepsilon_{td} \tag{33}$$

Implication 3 of the model concerns the difference in the correlation between $\log(P_{ct})$ and $\log(r_{t+1})$ across exclusive and nonexclusive cooperatives, holding market prices, fishing costs, and initial stock constant. This derivation makes several challenges clear. First, there are items in $\varepsilon_{td}$ that are potentially correlated with the observable variables. Time-varying costs, for example, may be related to market prices if these prices are locally determined.
Another possibility is the differential effect of species scale across cooperatives, as shown above. Second, differences in two key unobservables (the stock $X_{td}$ and the Lagrange multiplier $\lambda_{td}$) reflect differences in market prices, fishing costs, and initial stock. Third, and finally, direct data on growth rates are not available. Instead, we use ONI as a proxy variable. Since ONI increases growth rates for some species and reduces growth rates for others, as discussed above, we must account for species-specific responses to ONI in the specification. Our classification of these responses will introduce measurement error.

To deal with these issues, we estimate the following specification in the empirical analysis using data at the species-week level for each cooperative:

$$\log(P_{ctdx}) = a_0 + a_1 D_{td} + a_2 D_{td} L_x + a_3 \log(P_{mtx}) + a_4 D_{td} \log(P_{mtx}) + a_5 \mathrm{ONI}_t \tag{34}$$

$$\qquad + a_6 D_{td} \mathrm{ONI}_t + a_7 D_{td} R_x + a_8 R_x \mathrm{ONI}_t + a_9 D_{td} R_x \mathrm{ONI}_t + \delta_{xp} + u_{tdx} \tag{35}$$

where $x$ is the species subscript and $L_x$ is the dummy for large scale as in the discussion of implication 2. $\mathrm{ONI}_t$ is our ONI measure (varying at the month level). Including $D_{td} \mathrm{ONI}_t$ allows for baseline differences across geographic areas in response to ONI for species whose recruits have no known response to ONI; the adults of these species may still respond to ONI. $R_x$ is a variable equal to -1 if ONI has negative effects on recruitment of species $x$, equal to 1 if ONI has positive effects, and equal to 0 if there is no established consensus. Finally, $\delta_{xp}$ is a species-time period-specific fixed effect, where a time period is a month in the preferred specifications below.

Implication 3 of the model is that $a_9 > 0$: the $a_9$ coefficient indicates the difference across cooperatives in responding to other species’ positive or negative responses to ONI. Our identifying assumption is that, after controlling for the other observables in equation 34, no factor in $u_{tdx}$ leads to differential effects of ONI across cooperatives.
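The regressors in equations 34 and 35 are products of a few underlying variables, and writing them out can make the role of $a_9$ easier to follow. The sketch below builds them for a toy panel. Everything in it is hypothetical: the species names, the recruitment coding in `recruit_effect`, the ONI values, and the column names are invented for illustration and are not taken from the paper’s data.

```python
import pandas as pd

# Hypothetical coding of R_x: -1 if ONI lowers recruitment, +1 if it raises it,
# 0 if there is no established consensus.
recruit_effect = {"species_a": -1, "species_b": 1, "species_c": 0}

def add_interactions(df):
    """Add the interaction regressors of equations 34-35 to a copy of df."""
    out = df.copy()
    out["R"] = out["species"].map(recruit_effect)
    out["DxL"] = out["D"] * out["L"]                   # a_2
    out["DxlogPm"] = out["D"] * out["log_pm"]          # a_4
    out["DxONI"] = out["D"] * out["oni"]               # a_6
    out["DxR"] = out["D"] * out["R"]                   # a_7
    out["RxONI"] = out["R"] * out["oni"]               # a_8
    out["DxRxONI"] = out["D"] * out["R"] * out["oni"]  # a_9, the coefficient of interest
    return out

# Toy panel: one exclusive (D = 1) and one nonexclusive (D = 0) row per species.
panel = pd.DataFrame({
    "species": ["species_a", "species_a", "species_b", "species_b", "species_c", "species_c"],
    "D": [1, 0, 1, 0, 1, 0],
    "L": [0, 0, 1, 1, 0, 0],
    "log_pm": [3.0, 3.0, 4.0, 4.0, 2.5, 2.5],
    "oni": [0.8, 0.8, -0.5, -0.5, 0.8, 0.8],
})
design = add_interactions(panel)
print(design[["species", "D", "R", "DxONI", "DxR", "RxONI", "DxRxONI"]])
```

Since ONI varies only by month, the columns `oni` and `RxONI` are constant within a species-month cell; with species-month fixed effects they would be absorbed, leaving only the terms interacted with the cooperative dummy separately identified. The triple interaction is nonzero only for the exclusive cooperative and only for species with a classified recruitment response.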
The remaining threats to identification must take a very particular form. For instance, $X_{tdx}$ and $\lambda_{tdx}$ are omitted from the estimating equation. The cooperative dummy $D_{td}$ and the species-time period-specific fixed effect $\delta_{xp}$ capture the components of these unobservables that are additively separable between these items, so the remaining problem comes from components that are cooperative-species-specific. An example that could generate the patterns we see in the data is that the initial stock $X_0$ might be higher near Abreojos only for those species that respond positively to ONI. While we cannot prove an explanation like this is not at work, below we show that our basic results are robust to a number of changes in the specification.

Table 7 shows the results from estimating versions of equation 34. The first column shows the base specification, using fixed effects at the species-month level. The first two rows show the results for the Abreojos main effect and the interaction with the large scale species dummy. These coefficients have the expected signs, given the discussion of implications 1 and 2 above. The third and fourth rows show how the market prices are correlated with cooperative prices for each cooperative. The next two rows contain the interactions of Abreojos with ONI and the Recruit Effect ($R_x$ from above). Finally, the last row contains the estimate of $a_9$. This shows that when growth rates change due to ONI, the price difference between Abreojos and other cooperatives moves in the predicted direction. The coefficient $a_9$ implies a 16% price change in response to a one standard deviation change in ONI.

Table 7.
Implication 3: Effect of Growth Rates on Price Differences

Dep Variable: Log Cooperative Price

| Variable | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| Abreojos | -0.780*** | -0.835*** | -0.960*** | -1.189*** | -1.181*** | -1.188*** |
| | (0.133) | (0.173) | (0.233) | (0.023) | (0.032) | (0.017) |
| Abreojos X large scale | 0.157*** | 0.139*** | 0.162*** | 0.158*** | 0.162*** | 0.141*** |
| | (0.043) | (0.045) | (0.052) | (0.044) | (0.056) | (0.033) |
| Log mkt price | 0.395*** | 1.022 | -2.440 | | | |
| | (0.117) | (0.845) | (2.406) | | | |
| Abreojos X log mkt price | -0.135*** | -0.114* | -0.085 | | | |
| | (0.050) | (0.060) | (0.075) | | | |
| Abreojos X ONI | 0.056* | 0.071** | 0.047 | 0.065** | 0.064* | 0.064*** |
| | (0.034) | (0.035) | (0.045) | (0.031) | (0.037) | (0.023) |
| Abreojos X recruit effect | 0.793*** | 0.722*** | 0.652*** | 0.393*** | 0.274*** | 0.403*** |
| | (0.176) | (0.198) | (0.232) | (0.063) | (0.085) | (0.049) |
| Abreojos X ONI X recruit effect | 0.211*** | 0.193** | 0.196** | 0.208*** | 0.097 | 0.228*** |
| | (0.074) | (0.075) | (0.086) | (0.074) | (0.090) | (0.055) |
| Estimation method | FE | FE-IV | FE-IV | FE | FE | FE |
| Fixed effects | Sp-mth | Sp-mth | Sp-mth | Sp-mth | Sp-qtr | Sp-week |
| Obs | 2871 | 2600 | 2538 | 2871 | 2871 | 2871 |
| Num. groups | 770 | 575 | 560 | 770 | 340 | 2152 |
| Within-R2 | 0.905 | | | 0.902 | 0.876 | 0.941 |

Note: Sample includes observations at the weekly level from Abreojos, Pichilingue, and Sargento. The omitted cooperative category is Pichilingue/Sargento. Columns 2 and 3 treat log market price as endogenous and instrument for it using the one period and two period lag of log market price, respectively. Standard errors are clustered at the level of the fixed effect.

A serious concern with this result is that market prices may be correlated with fishing costs. Above, we saw that our three cooperatives are not dominant players in the La Paz market, so they are unlikely to be able to affect market prices directly with their own actions. Moreover, for at least some species, market prices in La Paz seem to be related to prices in a market that should be relatively unaffected by supply-side issues in B.C.S.
Still, if at least some market prices are locally determined, a large positive shock to the fishing costs of every player in B.C.S. may cause a large positive shock to market prices as well. Our species-month fixed effect deals with this issue in part; but it could still be the case that a weekly shock in costs relative to the monthly mean is associated with a weekly shock to market prices. This endogeneity could then cause a bias in the coefficient of interest that would be difficult to sign.

We deal with this issue in two ways. First, we instrument for market prices and the interaction of market prices with the Abreojos dummy. As our instruments we use a lag of market prices and the interaction of the lag with the Abreojos dummy. Column 2 uses the one period lag, while column 3 uses the two period lag. The estimates of $a_9$ are quite similar to the estimate in column 1. Nevertheless, this solution has several problems. If weekly deviations of costs from the monthly mean are correlated across time, then the period $t$ shock in the residual could be correlated with the instruments. Even if the exclusion restriction is satisfied, these are not strong instruments. The Kleibergen-Paap LM statistic for column 2 is large enough that we can reject the null hypothesis of underidentification at the 10% level, but we fail to reject with the corresponding Wald statistic. In column 3, we fail to reject with both statistics.

Therefore, we also estimate the standard fixed effects model without incorporating the market prices. While this deviates from the theoretically inspired specification, it is still useful: If the remaining coefficients change dramatically, this suggests that the endogeneity of market prices could cause large biases. The results appear in column 4.
Reassuringly, $a_9$ is still positive, statistically different from zero, and of similar magnitude.

Finally, we test the sensitivity of the results to the level of the fixed effect. With a less flexible set of fixed effects (species-quarter), $a_9$ falls in size and becomes statistically indistinguishable from zero (column 5). However, with species-week fixed effects, $a_9$ is again positive and significant, and the magnitude is closely comparable to the baseline specification (column 6).28

Alternative Explanations

There are historical and geographic differences between Abreojos and the other two cooperatives. An important concern is that the empirical patterns above reflect these other differences, rather than the difference in property rights. The tests of implications 2 and 3 help greatly in this regard. The empirical specifications testing these implications are essentially difference-in-difference models: while the Abreojos dummy may be endogenous to price and catch levels, the estimated coefficients on the key interaction terms are unbiased as long as the source of endogeneity does not differ by species scale or by species-specific responses to environmental oscillations.

One specific concern, for example, is that Abreojos’s distance from La Paz leads to higher costs for selling catch, and this cost is taken out of payments to cooperative members. This would explain the negative coefficient on the Abreojos dummy in the buying price regressions above. However, this explanation is problematic. First, the cost difference is unlikely to be large: Abreojos has streamlined methods of transporting, processing, and marketing catch through its operations in La Paz and Ensenada, as well as FEDECOOP’s exports. Second, the explanation cannot capture the fact that the price gap between Abreojos and Pichilingue/Sargento is smaller for large-scale species.
One could supplement the above alternative explanation with the idea that, relative to the other cooperatives, Abreojos faces a smaller disadvantage in marketing catch for its exports to the United States and elsewhere. If large-scale species are more often exported, this could explain the fact that the price gap between Abreojos and Pichilingue/Sargento is smaller for large-scale species. However, this explanation is also problematic. Of the species for which we have market price data, Abreojos’s most salient exports are lobster and sea bass, and both species are small-scale.29 Moreover, this alternative explanation cannot explain our third finding, that the price gap is responsive to ONI and the specific effect of ONI (positive or negative) on a particular species.

28 Throughout, we have used the sample for which we have nonmissing cooperative prices and nonmissing market prices. In specifications not shown here, we show that $a_9$ continues to be positive and significant with species-week fixed effects when this sample is broadened to all observations with nonmissing cooperative prices. However, $a_9$ is smaller and insignificant with this sample when using species-month fixed effects.

29 By “salient,” we mean products advertised on the FEDECOOP website: http://www.fedecoop.com.mx/. Abalone, another small-scale species, is a key export of Abreojos, but we do not have market price data for this species.

The geographic differences between Abreojos and the other cooperatives suggest there may be differential stock endowments across the two areas.
This could explain our results for differential catch, but such an alternative explanation would have to take a particular form to capture both of our empirical results: specifically, the stock advantage of the Pacific side would have to be relatively larger for small scale species than for large scale species. Existing work suggests the presence of differences in ocean productivity in the Pacific versus Gulf sides of the Baja Peninsula; unfortunately, however, not much is known (as far as we are aware) about how these differences manifest differentially across small scale and large scale species. Therefore, this alternative explanation for our catch results may be plausible, and we do not have evidence to support or refute it at this time. Nevertheless, we believe our model is the most plausible explanation that captures all of the empirical patterns we have seen for price and catch in one unified framework.

V. Conclusion

In this paper, we set out to understand how cooperatives use property rights to control fishing effort, and how this use is shaped by key characteristics of the targeted fish populations: scale of individual movement and responses to large environmental fluctuations. Using rich logbook data from three cooperatives, including one that enjoys strong property rights, we find support for the implications of our theoretical model. The differences between the cooperative with property rights and the ones without rights, as well as the shrinking of these differences when resources have a large scale or when growth rates are high, are economically significant.

These results highlight the value of linking theory with empirical analysis in order to examine the reciprocal interactions between natural resources and resource users.
Our approach focuses on the economic mechanisms underlying these interactions and thereby complements more descriptive, existing analyses of fishery outcomes. Our results are also relevant beyond fisheries, as they illustrate how the decisions of resource users embedded in local institutions are mediated by characteristics of the resource and external dynamic factors. However, given that our logbook data are restricted to only three cooperatives, our conclusions are necessarily provisional. We hope that our analysis points the way to future investigation with more representative data.

More generally, integration of the connections between resources and resource users may well increase the effectiveness of state policies in coastal areas, whether focused on environmental stewardship, economic development, or both. Indeed, CONAPESCA’s recently developed National Program of Inspection and Vigilance (Programa Nacional de Inspección y Vigilancia) demonstrates the government’s interest in involving the leaders of local user groups in fisheries management; the program calls for the formation of state committees that include representatives of national, state, and local governments, as well as representatives of fisher groups.30 Our results suggest that even greater devolution of authority to local users, by granting property rights, could be a viable management strategy in the right circumstances. By recognizing the influence that local institutions like Mexico’s fishing cooperatives may have on both ecological and economic outcomes, policymakers will be better able to craft proactive, ecosystem-based policies that sustain both marine resources and the human communities that rely on them.

References

Aburto-Oropeza, O., E. Sala, G. Paredes, A. Mendoza, and E. Ballesteros. 2007. “Predictability of Reef Fish Recruitment in a Highly Variable Nursery Habitat.” Ecology 88 (9), 2220–28.
30 For details, see http://www.conapesca.sagarpa.gob.mx/wb/cona/programa_nacional_de_inspeccion_vigilancia_

Acosta, C. A. 1999. “Benthic Dispersal of Caribbean Spiny Lobsters among Insular Habitats: Implications for the Conservation of Exploited Marine Species.” Conservation Biology 13 (3), 603–12.
Allison, E. H., and F. Ellis. 2001. “The Livelihoods Approach and Management of Small-Scale Fisheries.” Marine Policy 25 (5), 377–88.
Basurto, X., A. Bennett, A. Weaver, S. Rodriguez-Van Dyck, and J.-S. Aceves-Bueno. 2013. “Cooperative and Noncooperative Strategies for Small-Scale Fisheries’ Self-Governance in the Globalization Era: Implications for Conservation.” Ecology and Society 18 (38).
Basurto, X., and E. Coleman. 2010. “Institutional and Ecological Interplay for Successful Self-Governance of Community-Based Fisheries.” Ecological Economics 69 (5), 1094–103.
Branch, T. A., O. P. Jensen, D. Ricard, Y. Ye, and R. Hilborn. 2011. “Contrasting Global Trends in Marine Fishery Status Obtained from Catches and from Stock Assessments.” Conservation Biology 25 (4), 777–86.
Burgess, R., M. Hansen, B. A. Olken, P. Potapov, and S. Sieber. 2012. “The Political Economy of Deforestation in the Tropics.” Quarterly Journal of Economics 1707–54.
Calderon-Aguilera, L., S. Marinone, and E. Aragón-Noriega. 2003. “Influence of Oceanographic Processes on the Early Life Stages of the Blue Shrimp (Litopenaeus stylirostris) in the Upper Gulf of California.” Journal of Marine Systems 39 (1), 117–28.
Carson, R. T., C. Granger, J. Jackson, and W. Schlenker. 2009. “Fisheries Management under Cyclical Population Dynamics.” Environmental and Resource Economics 42 (3), 379–410.
Clark, C. W. 1990.
Mathematical Bioeconomics: Optimal Management of Renewable Resources. Hoboken, NJ: John Wiley and Sons Inc.
Costello, C., S. D. Gaines, and J. Lynham. 2008. “Can Catch Shares Prevent Fisheries Collapse?” Science 321 (5896), 1678–81.
Costello, C., and D. T. Kaffine. 2010. “Marine Protected Areas in Spatial Property-Rights Fisheries.” Australian Journal of Agricultural and Resource Economics 54 (3), 321–41.
Costello, C., S. Polasky, and A. Solow. 2001. “Renewable Resource Management with Environmental Prediction.” Canadian Journal of Economics/Revue Canadienne d’Économique 34 (1), 196–211.
Cota-Nieto, J. 2010. Descripción histórica y reciente de las pesquerías artesanales de Punta Abreojos B.C.S., México. Periodo 2000–2007. Thesis, Universidad Autónoma de Baja California Sur.
Deacon, R. T. 2012. “Managing Fisheries by Assigning Rights to Harvester Cooperatives.” Review of Environmental Economics and Policy 6 (2), 258–77.
Deacon, R. T., D. P. Parker, and C. Costello. 2008. “Improving Efficiency by Assigning Harvest Rights to Fishery Cooperatives: Evidence from the Chignik Salmon Co-op.” Arizona Law Review 50 (2).
Deacon, R. T., D. P. Parker, and C. Costello. 2013. “Reforming Fisheries: Lessons from a Self-Selected Cooperative.” Journal of Law and Economics 56 (1).
DeWalt, B. R. 2001. “Community Forestry and Shrimp Aquaculture in Mexico: Social and Environmental Issues.” Paper presented at the Latin American Studies Association, Washington, DC.
Dong, D., B. W. Gould, and H. M. Kaiser. 2004. “Food Demand in Mexico: An Application of the Amemiya-Tobin Approach to the Estimation of a Censored Food System.” American Journal of Agricultural Economics 86 (4), 1094–107.
Eberhardt, L. 1977. “Relationship between Two Stock-Recruitment Curves.” Journal of the Fisheries Research Board of Canada 34 (3), 425–28.
Edmonds, E. V. 2002. “Government-Initiated Community Resource Management and Local Resource Extraction from Nepal’s Forests.” Journal of Development Economics 68 (1), 89–115.
Foster, A. D., and M. R. Rosenzweig.
2003. Agricultural Development, Industrialization and Rural Inequality, Technical report, Cambridge, Massachusetts: Harvard University. Gordon, H. S. 1954. “The Economic Theory of a Common-Property Resource: The Fishery.” The Journal of Political Economy 62(2), 124–142. Gutie´ rrez, N. L., R. Hilborn, and O. Defeo. 2011. “Leadership, Social Capital and Incentives Promote Successful Fisheries.” Nature 470 (7334), 386–89. Hilborn, R., J. K. Parrish, and K. Litle. 2005. “Fishing Rights or Fishing Wrongs?” Reviews in Fish Biology and Fisheries 15 (3), 191–99. The World Bank Economic Review 327 Ibarra, A. A. 1996. “Fisheries Trade under NAFTA and a Comparison with the EU,” Technical report, Instituto Nacional De La Pesca, Mexico City. Ibarra, A. A., C. Reid, and A. Thorpe. 1998. Neo-Liberalism and the Latin “Blue Revolution” Fisheries Development in Chile, Mexico and Peru, University of Portsmouth, Centre for the Economics and Management of Aquatic Downloaded from https://academic.oup.com/wber/article-abstract/31/2/295/2897307 by Sectoral Library Rm MC-C3-220 user on 08 August 2019 Resources. ———. 2000. “The Political Economy of Marine Fisheries Development in Peru, Chile and Mexico.” Journal of Latin American Studies 32 (02), 503–27. Leslie, H. M., X. Basurto, M. Nenadovic, L. Sievanen, K. C. Cavanaugh, J. J. Cota-Nieto, and B. E. Erisman et alet al. 2015. “Operationalizing the Social-Ecological Systems Framework to Assess Sustainability.” PNAS 112 (19), 5979–84. Lester, S. E., B. S. Halpern, K. Grorud-Colvert, J. Lubchenco, B. I. Ruttenberg, S. D. Gaines, S. Airame ´ , and R. R. Warner. 2009. “Biological Effects within No-Take Marine Reserves: A Global Synthesis.” Marine Ecology Progress Series 384, 33–46. Lluch-Cota, S., A. Pares-Sierra, V. Magana-Rueda, F. Arreguin-Sanchez, G. Bazzino, H. Herrera-Cervantes, and D. Lluch-Belda. 2010. “Changing Climate in the Gulf of California.” Progress In Oceanography 87, 114–26. Mann, K., and J. Lazier. 2005. 
Dynamics of Marine Ecosystems: Biological-Physical Interactions in the Oceans, Wiley-Blackwell. McCay, B. J., F. Micheli, G. Ponce-D ıaz, G. Murray, G. Shester, S. Ramirez-Sanchez, and W. Weisman. 2014. “Cooperatives, Concessions, and Co-management on the Pacific Coast of Mexico.” Marine Policy 44, 49–59. McGuire, T. R. 1983. “The Political Economy of Shrimping in the Gulf of California.” Human Organization 42 (2), 132–45. OECD 2006. Agricultural and Fisheries Policies in Mexico: Recent Achievements, Continuing the Reform Agenda, Organisation for Economic Cooperation and Development Publishing. Ostrom, E. 1990. Governing the Commons: The Evolution of Institutions for Collective Action, Cambridge University Press: Cambridge (UK). Ovando, D. A., R. T. Deacon, S. E. Lester, C. Costello, T. Van Leuvan, K. McIlwain, and C. Kent Strauss et alet al. 2013. “Conservation Incentives and Collective Choices in Cooperative Fisheries.” Marine Policy 37, 132–40. Parma, A. M., and R. B. Deriso. 1990. “Experimental Harvesting of Cyclic Stocks in the Face of Alternative Recruitment Hypotheses.” Canadian Journal of Fisheries and Aquatic Sciences 47 (3), 595–610. Pauly, D. 1997. Small-Scale Fisheries in the Tropics: Marginality, Marginalization, and Some Implications for Fisheries Management, in E. Pikitch, D. Hupert, and M. Sissenwine, eds, “Global trends: Fisheries Management.” American Fisheries Society Symposium 20: Bethesda, Maryland, 40–49. Petterson, J. S. 1980. “Fishing Cooperatives and Political Power: A Mexican Example.” Anthropological Quarterly 64–74. Polis, G., M. Rose, F. Sanchez Pinero, P. Stapp, and W. Anderson. 2002. “Island Food Webs,” in A New Island Biogeography of the Sea of Cortes. New York: Oxford UP, 280–362. Reddy, S. M., A. Wentz, O. Aburto-Oropeza, M. Maxey, S. Nagavarapu, and H. M. Leslie. 2013. “Evidence of Market-Driven Size-Selective Fishing and the Mediating Effects of Biological and Institutional Factors.” Ecological Applications 23 (4), 726–41. 
Reed, W. J. 1975. “A Stochastic Model for the Economic Management of a Renewable Animal Resource.” Mathematical Biosciences 22, 313–37. aenz-Arroyo, A., C. M. Roberts, J. Torre, M. Carin S ıquez-Andrade. 2005. “Rapidly Shifting ~ o-Olvera, and R. R. Enr Environmental Baselines among Fishers of the Gulf of California.” Proceedings of the Royal Society B: Biological Sciences 272 (1575), 1957–62. Sala, E., O. Aburto-Oropeza, M. Reza, G. Paredes, and L. G. Lo pez-Lemus. 2004. “Fishing Down Coastal Food Webs in the Gulf of California.” Fisheries 29 (3), 19–25. Sanchez, A., J. J. C. Nieto, I. M. Osorio, B. Erisman, M. Moreno-Baez, and O. Aburto-Oropeza. 2015. Caracterizacion de las cadenas productivas pesqueras – baja california sur, mexico (primera fase la paz). Database, Centro para la Biodiversidad Marina y la Conservacion (CMBC) and Scripps Institution of Oceanography. Schaefer, K. M., D. W. Fuller, and B. A. Block. 2007. “Movements, Behavior, and Habitat Utilization of Yellowfin Tuna (Thunnus albacares) in the Northeastern Pacific Ocean, Ascertained through Archival Tag Data.” Marine Biology 152 (3), 503–25. 328 Aburto-Oropeza, Leslie, Mack-Crane, Nagavarapu, Reddy, and Sievanen Scott, A. 1956. “The Fishery: The Objectives of Sole Ownership.” The Journal of Political Economy 63 (2), 116–24. SEPESCA 1992. Legal Framework for Fisheries 1992, Technical report, Secretaria de Pesca, Mexico City. Sievanen, L. 2014. “How do small-scale fishers adapt to environmental variability? lessons from Baja California Sur, Mexico.” Maritime Studies. 13:9:doi:10.1186/s40152-40014-40009-40152. Downloaded from https://academic.oup.com/wber/article-abstract/31/2/295/2897307 by Sectoral Library Rm MC-C3-220 user on 08 August 2019 Velarde, E., E. Ezcurra, M. A. Cisneros-Mata, M. F. Lav ın, and M. F. 2004. “Seabird Ecology, El Nin ~o Anomalies, and Prediction of Sardine Fisheries in the Gulf of California.” Ecological Applications 14 (2), 607–15. Villa, A. 1996. 
The World Bank Economic Review, 31(2), 2017, 329–350
doi: 10.1093/wber/lhv058
Advance Access Publication Date: October 22, 2015
Article

Downloaded from https://academic.oup.com/wber/article-abstract/31/2/329/2897300 by Sectoral Library Rm MC-C3-220 user on 08 August 2019

When Winners Feel Like Losers: Evidence from an Energy Subsidy Reform

Oscar Calvo-Gonzalez, Barbara Cunha, and Riccardo Trezzi

Abstract

In 2011 the Government of El Salvador implemented a reform to the liquefied petroleum gas (LPG) subsidy that increased the welfare of households in all but the top two deciles of the income distribution. However, the reform turned out to be rather unpopular, including among winners. This paper relies on ad hoc household surveys conducted before the implementation and in the following two-and-a-half years to test which factors help explain the puzzle. The analysis uses probit regressions to show that misinformation (a negativity bias by which people with limited information inferred negative consequences), mistrust of the government’s ability to implement the policy, and political priors explain most of the (un)satisfaction before implementation.
Perceptions improved gradually, and significantly so, over time as receipt of the subsidy induced households to update their initial priors, although political biases remained significant throughout the entire period. The results suggest several implications with respect to policy reforms in cases where agents have limited information.

JEL classification: H23, H24, O54

Oscar Calvo-Gonzalez (corresponding author) is practice manager in the Poverty and Equity Global Practice of The World Bank; his email address is ocalvogonzalez@worldbank.org. Barbara Cunha is a senior economist with The World Bank. Riccardo Trezzi is an economist with the Board of Governors of the Federal Reserve System. We are grateful to La Prensa Grafica for making available the data for the public opinion surveys conducted in 2011 and 2012 and especially to Edwin Segura for his help and insights regarding public opinion polling in El Salvador. We are also grateful to Guillermo Raul Beylis, Augusto de la Torre, Marianne Fay, Auguste Tano Kouame, Alice Kuegler, Carlos Rodriguez Castelan, Adrien Vogt-Schilb, to the participants of the microeconomic shadow talks at the University of Cambridge, to three anonymous referees, and the editor of this journal for numerous comments and suggestions. Riccardo Trezzi is also grateful to the University of Cambridge for financial support. All errors and omissions are ours. Disclaimer: the findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the Board of Governors of the Federal Reserve System, the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

© The Author 2015. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

I. Introduction

The government of your country announces a certain policy change. How do you know if it will benefit you or not? There are many things you may take into account. For example, what do the media and different political parties say about it? Who do you trust? What are your beliefs about the government’s capacity to deliver? The list of factors to consider could go on. But what happens when your assessment is wrong? Sometimes winners may believe they are losers. This paper explores empirically one such case. More specifically, it analyzes the determinants of citizens’ satisfaction with a reform of the liquefied petroleum gas (LPG) subsidy in El Salvador, a reform that was expected to improve the welfare of around three-quarters of the population but which was initially unpopular.

The theoretical literature has long recognized the potential importance of such a scenario. If individuals are uncertain about the benefits of a policy change, this can lead to a status quo bias, in which a policy that would benefit the majority of the population is not adopted (Fernandez and Rodrik 1991). Under individual-specific uncertainty, an increase in the number of expected winners could reduce the probability that a reform is approved, until it reaches a critical threshold and becomes an overwhelming majority (Jain and Mukand 2003). More generally, the inability of the policymaker to persuade the electorate of the benefits of a policy change has featured in a large number of political economy models (reviewed by Drazen 2000).
Recently, the literature has turned its attention to the importance that the dynamics of learning about reform outcomes may have on support for reforms (van Wijnbergen and Willems 2014). Others have pointed out that political support for reforms can vary dramatically over time, which may alleviate the problem of status quo bias because governments may be able to withstand long periods of low popularity, as long as political support recovers before election day (Veldkamp 2009). While the theoretical literature is abundant, it has proven difficult to identify these effects empirically.1

This paper contributes to the literature in three ways. First, it documents a case of a reform that benefited the majority of the population but was initially unpopular. Second, it uses new survey data to identify the factors that help explain this puzzle. Third, it analyzes how the main factors driving the popularity of the reform evolved once the reform was implemented. The policy change that is the focus of this study is the reform of the gas subsidy implemented in El Salvador in April 2011. To study it, we rely on six consecutive surveys, one conducted before the implementation of the reform and the remaining five afterwards.

The reform implied the removal of the price subsidy for liquefied petroleum gas (LPG), resulting in a price increase for consumers from $5.10 to $13.60 per 25 lb. bottle of LPG, the most common fuel used for cooking by Salvadorans at home. In place of the price subsidy the authorities introduced a monthly income transfer of $8.50 to households with an electricity consumption of less than 200 kWh per month. This was a relatively high cut-off, as around 94 percent of households with access to electricity consumed less than the eligibility threshold. For these households the monthly subsidy of $8.50 (equivalent to a bottle of LPG) was provided through the electricity bill.
Households without access to electricity were entitled to a government-issued card that would allow them to collect the monthly $8.50. Overall, an incidence analysis of the reform suggests that around two-thirds of households benefited from the reform (Tornarolli and Vazquez 2012).

Ex-ante, the winners from the reform included two groups. First, households that did not use LPG as cooking fuel would now benefit from the $8.50 per month. Since LPG use was not as common among the lowest income groups as among the richest, this facet of the reform was particularly pro-poor. In particular, based on the household surveys, among the households in the lowest decile of the income distribution, only 21 percent used LPG before the reform, while the corresponding figure for the second lowest decile was 37 percent. For these households a common fuel for domestic use was firewood. Second, households that consumed less than one bottle of LPG per month would also benefit from the reform. On the losing side, some of the richest households in El Salvador would become ineligible for the LPG subsidy on account of their high electricity consumption or because they could collect the subsidy for only one of their properties. Households that consumed more than one bottle of LPG per month would also lose out.

Nevertheless, the reform proved to be, at least initially, highly unpopular. In January 2011 just one-third of the electorate favored the upcoming reform, and in August of the same year, fewer than 45 percent of people declared themselves either “satisfied” or “very satisfied” with it. The satisfaction rate continued to increase during the following year and a half before stabilizing at around 65 percent, the level observed in our last survey in September 2013.

1 One of the few examples in the literature is Harrison (2013), who discusses the British Columbia carbon tax reform.
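
The winner and loser groups described above follow from simple arithmetic on the figures quoted in the text. The sketch below is illustrative only and is not part of the authors' analysis: the function names are hypothetical, LPG consumption is assumed not to respond to the price change, and a household without electricity is represented by `None` on the assumption that it registers for the government-issued card.

```python
# Illustrative arithmetic only. Amounts are in US cents to avoid rounding issues.
OLD_PRICE_CENTS = 510    # subsidized price of a 25 lb. bottle before April 2011
NEW_PRICE_CENTS = 1360   # price of a 25 lb. bottle after the reform
TRANSFER_CENTS = 850     # monthly transfer paid to each eligible household
KWH_THRESHOLD = 200      # eligibility requires consuming less than this per month


def is_eligible(kwh_per_month):
    """Eligibility rule as described in the text.

    None stands for a household without electricity (assumed to hold the
    government-issued card); otherwise consumption must be below 200 kWh.
    """
    return kwh_per_month is None or kwh_per_month < KWH_THRESHOLD


def monthly_net_change(bottles_per_month, kwh_per_month):
    """Monthly gain (+) or loss (-) in US dollars relative to the old scheme.

    Assumes the household buys the same number of bottles as before.
    """
    extra_cost = (NEW_PRICE_CENTS - OLD_PRICE_CENTS) * bottles_per_month
    transfer = TRANSFER_CENTS if is_eligible(kwh_per_month) else 0
    return (transfer - extra_cost) / 100


# A household cooking with firewood gains the full transfer.
print(monthly_net_change(0, 80))
# An eligible household using exactly one bottle per month breaks even.
print(monthly_net_change(1, 80))
# A high-electricity household using one bottle loses the old price subsidy.
print(monthly_net_change(1, 250))
```

Under these assumptions an eligible household gains whenever it uses less than one bottle per month and loses whenever it uses more, which matches the two winner groups and the two loser groups listed in the text.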
In our empirical investigation we answer two research questions. First, what are the factors driving the unpopularity of the reform before and in the aftermath of its implementation? Second, what variables account for the relatively high popularity of the reform two years after it was implemented? In selecting potential explanatory variables to include in our analysis we draw from the recent literature on energy subsidies, which has suggested a number of common barriers to successful subsidy reform, including lack of information about the magnitude and shortcomings of subsidies and lack of government credibility and administrative capacity (IMF 2013). Widespread media and information campaigns appear to have played an important role in successful reform efforts in countries such as Ghana, Namibia, and the Philippines, and the need for public information campaigns has been identified as a key lesson learned from country case studies (Vagliasindi 2013). Case studies on the political economy of reform in sectors other than energy also highlight the importance of providing information to citizens about the benefits and costs of different policy choices (Fritz et al. 2012).

In addition, there is a long-standing body of literature showing that supporters of the party in government have a more positive attitude towards government policies (Anderson and Tverdova 2001; Anderson and LoTempio 2002). Finally, creative solutions, such as advance compensatory payments deposited in new bank accounts for households, have allowed governments to circumvent lack of trust among citizens when implementing subsidy reform efforts, as in the case of Iran (IMF 2013).
Fuel subsidies are often large and have proven to be very hard to reform, prompting some in the literature to refer to them as constituting a “policy trap” (Bril-Mascarenhas and Post 2015). To the best of our knowledge, our paper provides the first empirical test of the role that information and trust in the government’s capacity play in explaining support for an energy subsidy reform.

We identify three main factors that explain the evolution of the popularity of the reform: the individual’s level of information (which is especially relevant ex-ante), his/her trust in the government’s ability to implement the reform effectively (or the ability to deliver the subsidy ex-post), and his/her political views. Using probit regressions we test the marginal effect of each of these factors in the six surveys, respectively conducted in January 2011, May 2011, August 2011, May 2012, August 2012, and September 2013. Our results, robust to a large set of checks, suggest three main conclusions. First, we show that in January 2011, before implementation, the level of information about the reform, expectations about the ability of the government to deliver, and political priors help explain most of the overall satisfaction rate. On average, around 70 percent of the variance of the dependent variable is captured by our main regressors. Second, we show that the increase in the satisfaction rate over time is essentially driven by the ability of the government to deliver the subsidy. Throughout the five surveys following April 2011, the significance and magnitude of the coefficient identifying the above effect progressively increase. Finally, we show a nonmarginal effect of political partisanship in the perception of the reform not only before the reform was implemented but also throughout the entire period of analysis.

Our findings may be useful for those considering subsidy reforms.
The starting point of a reform cannot be to assume that accurate information is widely known or that departures from perfect information are unbiased. Surveying the extent of information and categorizing attitudes so as to inform any public information campaign are worth undertaking (as suggested by Fritz et al. 2012). Any efforts at informing the public would then need to be evaluated against that baseline. In some cases the timing of reforms may need to be adjusted if the priors that individuals hold suggest that reform efforts would be premature. In those cases, emphasis could be put first on affecting the information landscape. Piloting of reforms could also help governments test, learn, and adapt their interventions (Haynes et al. 2012). The timing of releasing information about any upcoming reform is also to be carefully planned to minimize the need for adjustments that could add to the confusion and undermine the credibility of the reform efforts.

The rest of the paper is organized as follows. Section II explains the details of the gas subsidy reform. Section III describes the dataset. Section IV explains our empirical models. Sections V and VI present our results and robustness checks, respectively. Finally, section VII concludes.

II. The Reform

In April 2011, the government of El Salvador implemented a substantial reform of the subsidy for gas to improve targeting and to alleviate fiscal pressures. The main element of the reform, as we will see further below, involved eliminating the price subsidy (so that now gas would be sold at market price) and introducing a compensatory cash transfer to eligible households. Liquefied petroleum gas (LPG) is one of the most common fuels used for cooking in El Salvador, with around 70 percent of households using LPG in their homes.
Around 75 percent of all LPG is sold bottled, the form that households consume, the rest being sold in bulk to industry. LPG bottles are sold in 10, 20, 25, 35, and 100 lb. sizes. The most widely used bottle is the 25 lb. bottle, with an 85 percent market share of all LPG sold.

The government had been subsidizing bottled LPG since 1974. Until April 2011 the authorities set a maximum retail price for the 10, 20, 25, and 35 lb. bottles so that consumers faced a fixed, government-controlled price for the bottle. Prices for consumers had remained unchanged from 1996 to April 2008, when the consumer price of a 25 lb. LPG bottle was increased from $4.15 to $5.10, a price that was relatively well known by the population at large. As of April 2011 the amount of subsidy that the government paid per 25 lb. bottle was $8.50, bringing the hypothetical market price, without subsidy, of a 25 lb. bottle to $13.60. Knowledge by the population of this price without subsidy was, as we will see later, more limited.

The LPG subsidy scheme proved to be increasingly costly. As the international oil price increased in the 2000s, the fiscal cost of the LPG subsidy soared from around $10 million in 2004 to $109 million in 2007 (or around 0.5 percent of GDP). Despite the increase in the consumer price to $5.10, the fiscal cost of the LPG subsidy kept increasing, especially after international energy prices rebounded after 2009. In 2010, the last full year in which the subsidy scheme described above was in force, the fiscal cost of the LPG subsidy reached $154 million (or 0.7 percent of GDP). The goal of the gas benefit had been to subsidize domestic consumption by Salvadoran households, but leakages happened. Smugglers would buy the subsidized bottles in El Salvador and ship them illegally to neighboring countries that did not subsidize LPG. In January 2011 the market price of a 25 lb.
bottle of gas was around $16 in neighboring Guatemala and around $12 in Honduras and Nicaragua. Gas that was legally imported from Guatemala to El Salvador was shipped back illegally to Guatemala after having been retailed for household consumption in El Salvador.

The LPG subsidy scheme was also regressive. While 70 percent of all households used LPG for cooking, the use was not as widespread among the poor. The exclusion error was high: around 47 percent of households in the bottom 40 percent of the income distribution did not receive the LPG subsidy because they did not consume gas. If we restrict ourselves to the bottom decile of income, 67 percent of households did not cook with LPG (Artana and Navajas 2008). As a result the subsidy was poorly targeted, with the households in the bottom 40 percent of the income distribution receiving only 27 percent of the entire benefits of the subsidy. Think tanks and international organizations had been highlighting these issues for some time. A 2006 World Bank report argued that “there is no social or economic justification to keep the current gas subsidy” on account of its fiscal cost and the many inclusion and exclusion errors (World Bank 2006).

The reform implemented in April 2011 changed drastically the way the LPG subsidy was provided. Instead of subsidizing prices at the point of sale, the new mechanism delivered an income transfer to a large set of eligible households. As a result of this change, the consumer price (kept regulated by the government) increased from $5.10 (the subsidized price) to $13.60 (the price without subsidy). Individual households received a transfer of $8.50 per month, provided they were eligible. The eligibility requirement was consuming less than 200 kWh of electricity per month, a criterion that was meant to exclude the highest income brackets of the population from receiving the gas subsidy.
Property owners who had two or more properties registered with the electricity company could only collect the subsidy once. Households that lacked electricity needed to register at a governmental office (at least one per department) and provide their address so that the household received a card that entitled it to collect the monthly $8.50. There were 3,744 collection points spread out over the entire country, mostly banks and financial institutions. No eligibility criteria were required other than providing information to ensure that the household was receiving the subsidy through only one instrument (i.e., through the electricity bill or via the government-issued card). Although the paperwork was relatively simple, it still had to be submitted at the office, which might require traveling for some households.

For those receiving the subsidy through the electricity bill the mechanism was as follows. The subsidy came in the form of a barcode at the bottom of the electricity bill that people had to take to the bank; the teller would then scan the barcode and give the consumer the choice of applying it against the electricity bill or cashing it. Most people applied the subsidy to the bill, which then got discounted. Those with the electricity bill under a direct debit scheme (which very few people had) still had to go to the bank and get the teller to scan the barcode and give them the cash.

It is worth highlighting that the reform had been announced long before its implementation, and it underwent a few adjustments along the way. Within the first 100 days after taking office in June 2009, the government announced the intention to “rationalize” or “focalize” the LPG subsidy.
However, the plans materialized only in late 2010, when the specifics of the reform described above were introduced to the public. Between December 2010 and February 2011, the eligibility criteria based on electricity consumption were revised (the maximum consumption limit increased from 99 kWh to 200 kWh). The eligibility criteria for individuals without electricity or individual electricity meters did not change, but subsistence businesses became eligible for the subsidy shortly after implementation began. These groups faced a few challenges in registering and getting the electronic card. Issues frequently raised by individuals included long lines and lack of adequate information at the registration center. These implementation adjustments could potentially have affected individuals’ views about the reform.

An incidence analysis of the April 2011 reform shows that the new scheme was substantially more pro-poor than the one in existence until then. First, the poor households that did not use LPG as cooking fuel would now benefit from the $8.50 per month. Second, some of the richest households in El Salvador would become ineligible for the LPG subsidy on account of their high electricity consumption or because they would collect the subsidy for only one of their properties. Based on electricity billing records, around 6 percent of households consumed 200 or more kWh per month. In addition, the incentive to smuggle was removed, and consumption of LPG in El Salvador decreased by 15 percent in 2011 compared to 2010, while it increased in Guatemala by 7 percent (CEPAL 2012).

An important aspect of the reform was that the amount of the subsidy was now limited to $8.50 per month to each eligible household regardless of how many bottles per month the household purchased. Those households that consumed more than one bottle per month would be worse off under the new scheme. This could potentially hurt some of the poor.
However, a survey conducted in May 2011 indicated that one 25 lb. bottle of LPG was enough to cover monthly consumption for 80 percent of households. Among the 20 percent of households for which one 25 lb. bottle was not enough to cover their monthly consumption needs, the majority (almost 70 percent) were households with a monthly income above the average. Finally, since the subsidy was for household consumption, not industrial use, indirect effects on the price level and other second-order effects can be thought of as relatively minor in this case.

To sum up, the incidence analysis suggests that the winners of the reform included the poor that did not use LPG but would now receive $8.50 per month and any household that consumed less than 25 lb. of LPG per month. The losers would be the top 6 percent of households in electricity consumption (with an electricity consumption higher than 200 kWh), the owners of more than one property, those households that consumed more than one 25 lb. bottle of LPG per month, as well as the smugglers and the distributors (who saw a decline in volumes). The overall incidence by income decile of the LPG subsidy before and after the April 2011 reform is shown in figure 1.

Figure 1. Share of Households that Received the LPG Subsidy in Each Income Decile
Source: Tornarolli and Vazquez (2012).

The incidence analysis is based on de jure eligibility for the subsidy, not on whether people actually received the subsidy. This is an important distinction because it quickly became apparent that a number of households that were entitled to the subsidy did not cash it in. According to the authorities’ estimates, around 70,000 households did not collect the subsidy even though they were legally entitled to it.
While we do not have information on the income level of these households, the anecdotal evidence available suggests that these were households with access to electricity and relatively high income. A common explanation for this behavior is that they did not think it was fair to claim a benefit that was meant for the poor. It is also possible that the benefit was lower than the opportunity cost of cashing it (going to the bank, queuing, etc.). As a result, the de jure incidence analysis may underestimate the pro-poor nature of the reform. At the same time it is also possible that some poor households were not well informed about the benefit or were unable to prove their eligibility, which would have reduced the pro-poor nature of the reform. However, there is no evidence that a large number of poor households were unduly excluded. Therefore, we believe that on balance the incidence analysis based on de jure eligibility provides a good approximation of the actual direct impact on households.

The fiscal implication of the reform for the government was relatively limited. On the one hand, some savings materialized because of reduced smuggling as well as the loss of eligibility for the gas subsidy of the few households above the qualifying thresholds. On the other hand, government expenditure increased because households that consumed little or no gas now received the full amount of the monthly benefit. In aggregate, these opposing effects offset each other. Also, for the month of April 2011, the two regimes operated in parallel, implying a small temporary increase in government spending. As a reference, the 2012 government spending on the gas subsidy was virtually identical to the 2010 level (US$135.6 million or 0.57 percent of GDP).

The Puzzle

While the reform benefited a large part of the population (and proportionally more the poor), it initially proved to be highly unpopular.
A nationally representative survey conducted in late January 2011 showed that 70 percent of Salvadorans disapproved of the planned LPG subsidy reform. Criticism came from a variety of angles. Some critics stood to lose from the reform, such as LPG distributors, who would see their sales volume decrease and who would eventually mount a short-lived and ineffective strike in late May 2011. Other criticisms came from more unlikely quarters. In early February 2011 the Archbishop of San Salvador, Monsignor José Luis Escobar Alas, expressed his concern that “the poor may be left out” and asked the authorities to reconsider the reform plan. This was not an off-the-cuff remark. The Archbishop spoke to the press on a variety of issues regarding the LPG subsidy reform on at least three different occasions in February 2011 and would continue to do so after the reform was implemented in April 2011. As Monsignor Escobar Alas himself put it: “We have expressed on several occasions our concern that the so-called focalization of the liquefied gas subsidy is, with all due respect to the Government or the Ministry of Economy, not the right measure.”2 It is worthwhile spending some time on the Archbishop’s views as those of someone who has legitimacy among the population at large. In fact, the most recent survey of the Latin American Public Opinion Poll shows that Salvadorans have more confidence in the church than in any state institution, with the exception of the armed forces.

The dissatisfaction with the reform of the LPG subsidy apparently played a role in the Congressional elections held on March 11, 2012, in which the ruling party (Frente Farabundo Martí para la Liberación Nacional, FMLN) suffered significant losses.
Reflecting on the reasons for the electoral outcome, representatives from both the FMLN and the opposition party ARENA agreed that the gas subsidy reform had played a role. The head of the ARENA group in Congress, Donato Vaquerano, argued that the subsidy reform had been a “colossal mistake” by the FMLN. The head of party organization at the FMLN, José Luis Merino, recognized that the subsidy reform “undoubtedly had an effect among urban sectors who have resented the measure.”3 The president of Congress, Sigfrido Reyes, also of the FMLN, lamented that the defeat of his party had been due in part to “serious mistakes [including] the change in the gas subsidy [which] increased tremendously the price of gas for domestic use.”4 Other analysts also agreed that the gas subsidy reform had “hurt” the FMLN.5

Even more puzzling than the overall lack of popularity was the fact that the reform was particularly unpopular among the poor. For example, in January 2011 among those respondents in the bottom 40 percent of the income distribution, only 28 percent were satisfied, while among those with an income in the top decile, almost 50 percent were satisfied with the reform. Overall, the satisfaction of respondents that were expected to benefit from the reform was no different from the satisfaction of those that were expected to lose (see table 1).6

Table 1. Satisfaction Rate: January 2011

                                                       Satisfaction rate (percent)
Total population                                       30.0
  (a1) Bottom 40 percent of income distribution        28.0
  (a2) Rest of income distribution                     33.2
“Losers” or “Winners”
  (b1) Consume more than 200 kWh per month (losers)    26.9
  (b2) Consume less than 200 kWh per month (winners)   30.5

2 It is also worth noting that under the new scheme churches were not eligible for receiving the benefit.
3 Agencia EFE, March 15, 2012.
4 Agencia EFE, March 12, 2012.
5 René Portillo, dean of the Universidad Tecnológica de El Salvador, as quoted by the newspaper La Opinión on March 13, 2012.
6 A t-test fails to reject the null that the satisfaction rate among “losers” is equal to that of “winners.”

Yet the puzzle faded over time, slowly but surely. The popularity of the reform improved substantially but only after many months of implementation. The evolution of satisfaction rates is shown in figure 2. While in January 2011 only 30 percent of people were satisfied with the upcoming reform, by August 2012 the share had more than doubled to 66 percent. This pattern was observed also among the poor, who went from a 28 percent satisfaction rate in January 2011 to 68 percent in August 2012. In short, it took many months before the reform became popular among the majority of the population. In May 2011 the overall satisfaction with the reform was better but still low at around 40 percent, even though by then households had been receiving the benefit under the new scheme for two months. It appears that people’s negative priors about the reform were only slowly adjusted.

Figure 2. Satisfaction Rate over Time
Note: People answering either “satisfied” or “very satisfied.”
Source: Authors’ own calculation.

III. Data

This study uses data from six waves of household surveys conducted by La Prensa Gráfica, the largest newspaper in El Salvador. The survey reflects the regular practice of that institution of polling people’s views on political and social issues. As discussions about the proposed subsidy reform had become a contentious political issue, the newspaper decided to start polling, devoting a module of its periodic survey to the reform. The waves considered by the study (January 2011, May 2011, August 2011, May 2012, August 2012, and September 2013) include a special module on the gas subsidy reform implemented in April 2011.
The surveys were conducted through face-to-face interviews. Each survey between January 2011 and August 2012 includes a total of 1,200 adult respondents drawn from a stratified random sample using the population census as the sampling frame. The wave of September 2013 polled 610 respondents. The samples were designed to be nationally representative with a margin of error of ±2.9 percent and a 95 percent confidence level.7

7 As in other Latin American countries, nationally representative household surveys tend to underrepresent households at the top end of the income distribution.

The January 2011 survey, conducted prior to the implementation of the subsidy reform, includes a more comprehensive set of questions, which allows exploring different dimensions associated with ex-ante satisfaction with the reform. The survey included direct questions assessing individuals’ satisfaction with the reform (whether individuals consider the reform a good or bad idea, what positive consequences they thought the reform would bring about, what negative consequences they thought the reform would bring about, whether they considered that the reform would have mainly positive or negative consequences for families like them); their level of information about it (how much they estimated the real cost of a 25 lb. bottle of LPG without subsidy to be, how informed they considered themselves to be about the reform, what they understood the subsidy reform to be); their trust in the government fulfilling its commitment (whether they thought that they would receive the compensation that the government had promised); and their political views (party for which they voted in the last election, whether their political views are aligned with the government party or the opposition).
In addition, the survey collected information on cooking fuel and electricity consumption patterns, which allow us to identify potential “losers” from the reform. Finally, the survey collected a variety of individual and household characteristics that are used as controls (see appendix for details).

Surveys between May 2011 and August 2012 collected only a subset of the questions asked in the January 2011 survey. These included questions about the level of satisfaction with the subsidy reform implemented, whether an individual received the benefit, political views, cooking fuel and electricity consumption patterns, and household characteristics. The September 2013 survey expanded on this subset by including questions about the ex-post level of information (whether individuals know the electricity threshold that qualifies for the subsidy or how often it is distributed) and questions about the mechanism through which the benefit was received (whether it was received through the electricity bill, electronic card, or withdrawn in cash). These two sets of variables are used as part of the robustness checks to verify whether, even years after the implementation, there are misconceptions about the benefit and whether issues such as salience play a significant role (table 2 provides descriptive statistics on the satisfaction rates across surveys).

Table 2. Descriptive Statistics

                               Jan 2011     May 2011     Aug 2011     May 2012     Aug 2012     Sep 2013
                            Rate # Obs.  Rate # Obs.  Rate # Obs.  Rate # Obs.  Rate # Obs.  Rate # Obs.
Overall satisfaction        30.0  1,200  43.2  1,200  44.9  1,200  50.2  1,200  66.0  1,200  64.3    610
Satisfaction rates conditional on:
Being a loser from reform   26.9     48  39.5    129  42.5    216  45.5    226  59.3    322  59.5    136
Gender: males               37.7    493  43.4    499  45.6    550  48.2    439  65.0    535  66.6    283
Age: under 40 years         29.2    594  39.4    585  44.6    647  50.5    522  64.4    635  65.1    301
Cooking method
  Propane gas               29.2    864  43.8    911  53.0    963  49.6    901  65.9    977  64.3    510
  Kerosene                  25.0      4  50.0      8  60.0      5  50.0      8  71.4      7 100.0      1
  Electricity               46.1     13  50.0     10   0.0      8  50.0      6  83.3      6  33.3      3
  Wood                      33.5    140  37.8    111  94.7    190  51.7    126  65.7    114  69.2     13
Political party
  ARENA                     18.8    254  33.8    260  44.2    287  42.1    273  52.7    290  55.0    120
  FMLN                      44.1    344  57.7    296  50.5    261  57.8    292  76.9    342  71.3    164
Education level
  None                      22.9    135  55.4    119  34.7    141  57.3     82  71.9    114  65.6     32
  Sixth grade               29.0    306  48.6    290  44.3    338  51.7    224  71.2    282  62.5    104
  Ninth grade               28.0    182  41.9    193  49.1    242  49.1    171  65.9    217  69.1    107
  High school               31.5    241  37.6    255  39.8    271  44.3    255  64.2    302  62.7    188
  Higher than high school   41.0    153  36.4    152  52.4    157  54.5    153  50.5    166  63.4     89
Income level (per month)
  Less than $150            29.0    372  51.3    308  36.8    385  52.5    328  70.4    408  66.6    139
  $150–$250                 26.6    263  38.7    261  50.5    275  50.5    178  64.3    219  63.5    129
  $251–$450                 27.7    162  41.4    176  43.4    205  48.2    143  52.6    167  62.8     70
  $451–$750                 29.4     85  40.5     79  47.5     82  46.1     65  71.6     81  58.8     51
  $751–$1,000               45.7     35  32.0     25  62.5     40  50.0     38  57.6     59  61.2     31
  More than $1,000          52.0     25  44.0     25  42.3     26  44.7     38  56.1     41  85.7     21

Note: Satisfaction rates are calculated only considering defined opinions (“NS/NR” answers and missing observations are dropped).
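The loser/winner comparison for January 2011 can be illustrated with a large-sample two-proportion z-test, used here as a stand-in for the t-test the text mentions. The following is a minimal sketch, not the authors' procedure: the satisfaction rates (26.9 and 30.5 percent) come from table 1 and the loser count (48) from table 2, while the winner count (984, that is, the 1,032 defined opinions used in the January 2011 regressions minus 48) is an assumption made only for illustration, since the group sizes actually used in the test are not reported.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(p1, n1, p2, n2):
    """Return (z, two-sided p-value) for H0: the two population shares are equal."""
    # Pooled share under the null of equal satisfaction rates.
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    # Standard error of the difference in shares under the null.
    se = sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# Rates from table 1; n2 (winners) is a hypothetical group size, see above.
z, p_value = two_proportion_z_test(p1=0.269, n1=48, p2=0.305, n2=984)
print(f"z = {z:.2f}, p-value = {p_value:.2f}")
```

Under these assumed group sizes the test does not reject equality of the two rates at conventional levels, which is the pattern described in footnote 6.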
A descriptive analysis of the satisfaction with the reform indicates no obvious links between being a winner (loser) of the reform and being satisfied (unsatisfied) with the reform.8 Throughout the surveys a simple t-test fails to reject the null that the satisfaction rates are equal across groups (e.g., between losers and winners). Also, the overall lack of popularity of the reform does not appear to differ significantly across a range of characteristics such as income or the source of fuel for cooking. For example, the approval of the reform was highest among the richest respondents to the survey conducted in January 2011 (around 50 percent of those in the top 15 percent of income were satisfied with the reform compared to 30 percent for the rest of the sample), but the reform turned out to be more popular among poor people in later surveys (for instance in May 2012). Males appear slightly more favorable than females at any point in time but, again, not significantly so. Unsurprisingly, there was a visible difference across education levels in January 2011 (more educated people responded more favorably to the policy change), but already in May 2012, the difference across levels was not significant.

Rather, the level of information about the reform appears to be associated with the views of the respondents. Most of the respondents had limited information and acknowledged as much. In January 2011 only 18 percent of people considered themselves to be well (or very well) informed about the policy change. The lack of information was also reflected in the fact that only 15 percent of the respondents correctly identified the true price of LPG in the absence of a subsidy.
Around 25 percent of the population did not know what the price without subsidy would be and a further 22 percent underestimated it by more than five dollars (see figure 3). It is worth underlining that US$5 was the subsidized price at the time and US$13 was the unsubsidized price as reported by the media in January 2011.9 Finally, those that considered themselves to be informed had different priors about the consequences of the reform (see table 3).

Figure 3. Kernel Density: Perception about Unsubsidized Price (January 2011 Survey)
Note: Kernel = Epanechnikov, Bandwidth = 0.34.
Source: Authors’ own calculation.

8 In general we consider “losers” those households that were not eligible for the gas subsidy (because they consumed more than 200 kWh of electricity per month) or those that qualified for the subsidy but consumed more than one bottle of LPG per month. We consider “winners” the remaining households. However, for the first survey (January 2011), we consider “losers” only households with a monthly electricity consumption of more than 200 kWh since the question about LPG usage was not included in that survey. Please note that this limitation does not bias our econometric results since the dummy “loser” is not statistically significant in any of the regressions.
9 For instance, “El Faro” in December 2010 reported an expected price of US$12.6 for the month of January.

Table 3. Expected Consequences from the Reform (January 2011 Survey)

Informed people (satisfaction rate = 53.7 percent)
                         Positive consequences   Negative consequences
Identify at least one     45.80                   44.50
Identify none              9.68                    9.35
NS/NR                     44.52                   46.13
Total                    100.00                  100.00

Uninformed people (satisfaction rate = 24.0 percent)
                         Positive consequences   Negative consequences
Identify at least one     13.10                   69.36
Identify none             40.40                    1.22
NS/NR                     46.95                   29.42
Total                    100.00                  100.00

Among those respondents that were well informed (top panel of table 3), the satisfaction with the reform was relatively high at 54 percent (compared to 24 percent among the badly informed). Among the well-informed, 46 percent mentioned that the reform would have at least one positive effect, for example improving the lives of the poor. At the same time, those well informed were also able to come up with negative effects at a similar rate (45 percent mentioned at least one negative effect). In contrast, among those that were badly informed (bottom panel), only 13 percent mentioned any positive effects, while 69 percent were able to identify a negative consequence.10

The finding that the uninformed more easily came up with negative than positive consequences of the reform may be related to what is known in the psychology literature as negativity bias or positive-negative asymmetry. While the concept refers to a broad range of psychological phenomena, it has also been found to apply to information processing. In a survey of the extensive literature on the issue, Baumeister et al. (2001) conclude that “bad information is processed more thoroughly than good.” This may help to explain why some survey respondents could come up with examples of negative impacts more readily than positive ones.
It is worth stressing that the survey results suggest that information about the reform is linked with a lower negativity bias.

Satisfaction with the reform also differed depending on the respondents’ trust in the government’s intention and ability to deliver on the proposed reform. In particular, satisfaction with the reform was higher among those respondents who thought that the government would be able to deliver on its promise (42 percent) than among those who had no trust (22 percent). Satisfaction with the reform was also higher among those leaning politically with the government (44.1 percent) than among those who favored the political opposition (18.9 percent). This may reflect the well-established fact that people assimilate information in a way that is skewed in the direction of support for their antecedent beliefs (Glaeser and Sunstein 2013). In our context, such biased assimilation of information may simply take the form of supporters of the political party in government paying more attention to or believing the positive aspects of the reform proposed by their party.

10 It is also worth noting that while the number of “NS/NR” answers is quite high, a t-test comparing the shares of “NS/NR” answers across the observable covariates of households in the dataset fails to reject the null of equal means of the distributions. The only two variables for which the answers “NS/NR” are statistically different across groups are “age” and “level of education.” Unsurprisingly, younger and more educated people display a lower share of “NS/NR” answers than the rest of the population. It is also worth highlighting that this does not represent an issue for our empirical specification since we control for these two observables.

While our dataset provides unique information about individuals’ perception and knowledge about the subsidy reform, it has a few caveats worth mentioning.
First, the surveys are not a panel but separate cross sections. We are not able to track changes in satisfaction for the same individual in different periods, and have to rely on changes in representative samples of the population. Second, some variables, such as the level of information of the respondent about the reform, are not included in the intermediate surveys, which prevents us from following their evolution over time. Finally, different surveys were collected at different times of the year, and some control variables such as income or occupation could have been affected by seasonality. It is unlikely, however, that our main variables of interest (satisfaction, information, access to the benefit, and political partisanship) are significantly affected by seasonal effects.

IV. The Empirical Model

In our empirical analysis we aim to quantify the effect of different factors affecting the satisfaction with the reform before and after its implementation. The analysis before implementation explores the role of three main factors of interest: (i) the level of information about the reform (variable “Information”), (ii) whether the respondent trusts the government’s ability to deliver the subsidy (variable “Delivery”), and (iii) the political partisanship of the individual (variable “Partisanship”). “Information” is a dummy variable taking the value of “1” if the respondent declares themselves “informed” or “well informed” about the upcoming reform. “Delivery” captures the expectations of getting the subsidy conditional on qualifying for it, in other words conditional on a level of monthly electricity usage below the threshold. Finally, “Partisanship” is a dummy variable taking the value of “1” if the respondent is a voter of FMLN, President Mauricio Funes’ party.
In order to avoid endogeneity issues (support for the FMLN could be endogenous to the satisfaction with the reform), we consider the preference expressed at the 2009 general elections, before the reform was announced.

Our dependent variable is a dummy taking the value of “1” if the respondent expressed a view that the proposed reform was either a “very good” or “good” idea and “0” otherwise. We do not consider individuals without a defined opinion of the reform (those answering “NS/NR”) and drop them from the regressions. This choice reduces, although only marginally, the statistical power of our estimates, but it does not affect the significance of our results. In robustness checks we relax this assumption and show the results are insensitive to this choice.

In the analysis after implementation we have a slightly different set of explanatory variables. One might expect that the level of information about the reform would increase significantly once it was implemented. Using the September 2013 survey, we can observe directly how well informed individuals were about the nature of the reform and whether this was relevant for the individuals’ satisfaction with the reform. We consider “informed” the respondents who correctly reported the electricity consumption threshold to qualify for the subsidy.11 The variable “Delivery” now captures whether the respondent actually received the subsidy or not. The variable “Partisanship” remains unchanged. “Delivery” and “Partisanship” are available for all five surveys after implementation.
Following the nature of our dependent variable, we employ a probit model estimated using standard maximum likelihood techniques.12 Our baseline model can be formally expressed as

$$Y_i = \alpha_i + \beta_1 \text{Information} + \beta_2 \text{Delivery} + \beta_3 \text{Partisanship} + \eta' X_i + \delta' Z_i + \varepsilon_i, \qquad (1)$$

where $\alpha_i$ is a constant term, $\eta'$ and $\delta'$ are vectors of coefficients, $X_i$ is a matrix containing controls describing personal characteristics of the respondent, $Z_i$ is a matrix of geographical dummies, and $\varepsilon_i$ is an error term. The coefficients of interest are $\beta_1$, $\beta_2$, and $\beta_3$. Following the classical approach of limited dependent variable regressions, we report (for the two main surveys: January 2011 and September 2013) the marginal effects of the three main regressors keeping all other variables at their mean values. Finally, in order to overcome the traditional issues of the $R^2$ in probit models, for each regression we report two alternative measures of goodness of fit: the percent of correctly predicted (PCP) observations13 and the “receiver operating characteristic” (ROC) curve, which avoids the arbitrary PCP cutoff used to classify the observations.

11 In robustness check exercises (not reported in the paper but available upon request) we allow for a margin of error (up to US$5) in the responses and show that our results are fully robust in this dimension.
12 Results using logit regressions are virtually identical to the ones presented in the next section.

V. Results

The results of our baseline regressions are reported in table 4 for the January 2011 survey, in table 5 for the four intermediate surveys, and in table 6 for the September 2013 survey. The marginals for the three main regressors are plotted in figures 4 to 6 for the January 2011 survey and in figures 7 to 9 for the September 2013 survey.
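For a dummy regressor in a probit model, the effect on the predicted probability is the discrete change Φ(z0 + β) − Φ(z0), where Φ is the standard normal distribution function, z0 is the index with the dummy set to zero, and all other variables are held fixed. The sketch below illustrates this arithmetic with Python's standard library. The coefficients are the first-column estimates for “Partisanship” and “Delivery” in table 6; the baseline probabilities (0.62 and 0.35) are assumptions taken from the approximate figures quoted in the discussion of the September 2013 results, because the estimated constant is not reported. The function name is hypothetical, and the block illustrates the formula rather than reproducing the authors' calculation.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def probit_discrete_change(baseline_prob, coefficient):
    """Predicted probability after a dummy switches from 0 to 1 in a probit.

    baseline_prob is the predicted probability with the dummy at 0; the
    probit index behind it is recovered with the inverse normal CDF.
    """
    z0 = STD_NORMAL.inv_cdf(baseline_prob)
    return STD_NORMAL.cdf(z0 + coefficient)


# Coefficients from table 6 (first column); baseline probabilities are assumed.
p_partisanship = probit_discrete_change(baseline_prob=0.62, coefficient=0.337)
p_delivery = probit_discrete_change(baseline_prob=0.35, coefficient=1.104)
print(f"Partisanship: 0.62 -> {p_partisanship:.2f}")
print(f"Delivery:     0.35 -> {p_delivery:.2f}")
```

With these inputs the partisanship effect comes out at roughly 12 percentage points and the delivery effect at around 40, close to the magnitudes discussed for the September 2013 survey.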
In each table we report the results of model (1) allowing for different controls. For the January 2011 and the September 2013 surveys, the model is run four times: the first including our main regressors plus a constant term only, the second controlling for individual characteristics, the third adding a set of geographical dummies, and the fourth including all controls. For the four intermediate surveys (May 2011, August 2011, May 2012, and August 2012), we allow for two specifications: the first including a constant term only and the second including all controls.

Table 4. Baseline: January 2011 Survey

                 (1)         (2)         (3)         (4)
Information      0.564***    0.526***    0.589***    0.555***
                 (0.103)     (0.108)     (0.112)     (0.118)
Delivery         0.551***    0.588***    0.606***    0.617***
                 (0.089)     (0.094)     (0.097)     (0.103)
Partisanship     0.505***    0.503***    0.561***    0.568***
                 (0.088)     (0.092)     (0.099)     (0.104)
Constant         Yes         Yes         Yes         Yes
Personal                     Yes                     Yes
Dummies                                  Yes         Yes
Observations     1032        1032        1032        1032
PCP              0.70        0.72        0.74        0.77
ROC              0.71        0.75        0.77        0.80
AIC              1143.8      1142.5      1221.7      1222.4
BIC              1163.6      1310.3      1641.0      1789.4

*** Indicates significance at 0.1% level, ** at 1% level and * at 5% level.

We start the description of our results by considering the January 2011 survey (table 4). The three regressors of interest enter significantly at 1 percent level in all models. While the sign and significance of the coefficients in the regressions are informative, the magnitude has no specific meaning. For this reason we report the marginal effects of each variable, keeping all other variables at their median values. The marginals for “Information,” “Delivery,” and “Partisanship” are shown in figures 4, 5, and 6. Being informed about the reform increases the probability of being satisfied by around 20 percentage points. Similar effects are found for the variable “Delivery” and the variable “Partisanship.” It is also possible to estimate the joint effect of the three variables.
While the unconditional satisfaction rate is 30.0 percent (on a total of 1,032 observations), it increases to 50.0 percent for informed people (204 observations). If we further condition on being confident about receiving the transfer (99 observations in total), meaning that the variable “Delivery” = 1, the satisfaction rate increases to 59.6 percent. Finally, if we also condition on being an FMLN voter, the satisfaction rate jumps to 75.0 percent.

13 The procedure involves three steps: first, run the model and estimate $\hat{Y}_i$; second, classify as a “1” any observation with a predicted probability higher than 0.5; finally, calculate the PCP measure as $PCP = (100 \cdot \text{Correct Predictions})/N$, where a correct prediction arises if $\hat{Y}_i = Y_i$.

Table 5. Baseline: Four Intermediate Surveys

               May 2011              August 2011           May 2012              August 2012
               (1)        (2)        (1)        (2)        (1)        (2)        (1)        (2)
Delivery       0.502***   0.233*     1.127***   0.856***   0.599***   0.605***   0.899***   0.965***
               (0.093)    (0.116)    (0.084)    (0.109)    (0.098)    (0.116)    (0.087)    (0.104)
Partisanship   0.489***   0.463***   0.152      0.135      0.251**    0.269*     0.429***   0.464***
               (0.087)    (0.104)    (0.093)    (0.107)    (0.090)    (0.109)    (0.091)    (0.107)
Constant       Yes        Yes        Yes        Yes        Yes        Yes        Yes        Yes
Controls                  Yes                   Yes                   Yes                   Yes
Observations   1040       1040       1166       1166       954        954        1104       1104
PCP            0.62       0.73       0.69       0.79       0.70       0.74       0.71       0.72
ROC            0.63       0.80       0.67       0.85       0.70       0.74       0.69       0.75
AIC            1364.0     1314.4     1152.0     1000.0     1166.4     1188.7     1286.0     1341.4
BIC            1378.8     1848.2     1166.5     1484.7     1181.0     1595.1     1301.0     1925.1

*** Indicates significance at 0.1% level, ** at 1% level and * at 5% level. Complementary statistics (PCP, ROC, AIC, and BIC) refer to probit regressions.

As a measure of goodness of fit we rely on the PCP and on the area under the ROC curve. These two synthetic measures are reported at the bottom of table 4 together with the Akaike (AIC) and Bayesian (BIC) information criteria.
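The PCP procedure in footnote 13, and the cutoff-free alternative based on the ROC curve, can be sketched as follows. This is a minimal illustration in plain Python: the outcome and predicted-probability vectors are made-up toy data, not survey values, and the function names are hypothetical. The area under the ROC curve is computed through its pairwise interpretation, namely the share of (satisfied, unsatisfied) pairs in which the satisfied respondent receives the higher predicted probability, with ties counted as one half.

```python
def percent_correctly_predicted(y_true, p_hat, cutoff=0.5):
    """PCP = 100 * (correct predictions) / N, classifying as 1 when p_hat > cutoff."""
    if len(y_true) != len(p_hat) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    predicted = [1 if p > cutoff else 0 for p in p_hat]
    correct = sum(1 for y, y_pred in zip(y_true, predicted) if y == y_pred)
    return 100.0 * correct / len(y_true)


def area_under_roc(y_true, p_hat):
    """Share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [p for y, p in zip(y_true, p_hat) if y == 1]
    negatives = [p for y, p in zip(y_true, p_hat) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one observation of each class")
    score = 0.0
    for p_pos in positives:
        for p_neg in negatives:
            if p_pos > p_neg:
                score += 1.0
            elif p_pos == p_neg:
                score += 0.5
    return score / (len(positives) * len(negatives))


# Toy data (hypothetical): observed satisfaction and fitted probabilities.
y_obs = [1, 0, 1, 1, 0, 0, 1, 0]
p_fit = [0.8, 0.3, 0.6, 0.4, 0.2, 0.7, 0.9, 0.1]

pcp = percent_correctly_predicted(y_obs, p_fit)
auc = area_under_roc(y_obs, p_fit)
print(pcp, auc)
```

Changing `cutoff` changes the PCP but leaves the area under the ROC curve untouched, which is why the latter is reported as a complement.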
Although the estimates of the coefficients of interest are slightly different, the goodness of fit of all models is high (ranging between 0.7 and 0.8), indicating a high power of the model to explain the variance of the dependent variable.

Table 6. Baseline: September 2013 Survey

                 (1)         (2)         (3)         (4)
Information      0.202       0.131       0.166       0.101
                 (0.141)     (0.151)     (0.143)     (0.153)
Delivery         1.104***    1.260**     1.096***    1.276**
                 (0.137)     (0.435)     (0.139)     (0.438)
Partisanship     0.337**     0.335*      0.344**     0.343*
                 (0.129)     (0.138)     (0.130)     (0.138)
Constant         Yes         Yes         Yes         Yes
Personal                     Yes                     Yes
Dummies                                  Yes         Yes
Observations     527         527         527         527
PCP              0.72        0.73        0.72        0.72
ROC              0.69        0.74        0.70        0.74
AIC              621.7       632.3       622.6       633.4
BIC              638.8       747.1       648.2       756.7

*** Indicates significance at 0.1% level, ** at 1% level and * at 5% level.

Figure 4. Marginal Effect: Information (Jan ’11 Survey)
Note: Shaded area indicates the 95% confidence intervals.
Source: Authors’ own calculation.

Figure 5. Marginal Effect: Delivery (Jan ’11 Survey)
Note: Shaded area indicates the 95% confidence intervals.
Source: Authors’ own calculation.

Figure 6. Marginal Effect: Partisanship (Jan ’11 Survey)
Note: Shaded area indicates the 95% confidence intervals.
Source: Authors’ own calculation.

Figure 7. Marginal Effect: Information (Sep ’13 Survey)
Note: Shaded area indicates the 95% confidence intervals.
Source: Authors’ own calculation.

The results of the regressions run on the four intermediate surveys are reported in table 5.
The focus of these regressions is on the significance of the variables “Delivery” and “Partisanship.” For each survey we run model (1) twice, the first time including a constant term only and the second time adding all controls. The point estimate of the variable “Partisanship” is positive and progressively more significant: the coefficient is not significant in the August 2011 survey but it is in the May 2012 survey and it becomes significant at the 0.1 percent level in the August 2012 one. On the other hand, the variable “Delivery” remains significant in all surveys. Overall, our results show that over time the personal characteristics lose power in explaining the satisfaction with the reform in favor of our main regressors.

Figure 8. Marginal Effect: Delivery (Sep ’13 Survey)
Note: Shaded area indicates the 95% confidence intervals.
Source: Authors’ own calculation.

Figure 9. Marginal Effect: Partisanship (Sep ’13 Survey)
Note: Shaded area indicates the 95% confidence intervals.
Source: Authors’ own calculation.

Finally, the results for the September 2013 survey are reported in table 6. As for the January 2011 survey, the regressions are run four times, each time specifying a different set of controls. The only two significant variables are “Partisanship” and “Delivery.” In all models the variable “Delivery” is significant at the 1 percent level, showing a large impact on the dependent variable. The variable “Partisanship” enters significantly at the 1 percent level in all models except for model 4, where it is significant at the 5 percent level. On the other hand, the level of information (“Information”) is not significant in any model. The marginal effects are shown in figures 7, 8, and 9.
The marginal impact of the political partisanship (“Partisanship”) is significant, increasing the probability of satisfaction from around 62 percent to around 74 percent. Even larger is the marginal effect of the variable “Delivery,” which increases the probability of being satisfied from around 35 percent to around 75 percent. Finally, because the marginal probabilities might be misleading for binary covariates, we also report (in figures 10, 11, and 12) the treatment effects over time for the same variables, together with their respective 95 percent confidence intervals. The evidence emerging from these charts largely confirms our previous analysis.

Figure 10. Marginal Effect: Information over Time
Note: Shaded area indicates the 95% confidence intervals.
Source: Authors’ own calculation.

Figure 11. Marginal Effect: Delivery over Time
Note: Shaded area indicates the 95% confidence intervals.
Source: Authors’ own calculation.

Figure 12. Marginal Effect: Partisanship over Time
Note: Shaded area indicates the 95% confidence intervals.
Source: Authors’ own calculation.

VI. Robustness checks

In order to further validate the main results, this section proposes a number of robustness checks considering different samples, alternative definitions for the variables of interest, and additional controls.14 Given the number of questions in the surveys, we run our checks using the January 2011 and September 2013 surveys. All tables containing robustness check results are shown in the appendix.

The first set of checks considers different samples, including the observations omitted in the baseline (those for which the respondent replied “NS/NR”).
We treat these individuals as unsatisfied and assign a value of “0” in the corresponding entry of the dependent variable.15 The number of observations increases to 1,200 for the January 2011 survey and to 610 for the September 2013 survey, both significantly larger than in the baseline case. The results of this check, reported in the appendix, are extremely close to the baseline in terms of significance and magnitude of the coefficients. All coefficients remain significant, and in most cases the magnitude of the coefficients differs only at the margin. Therefore, the baseline results are fully robust to changes in the sample considered.

The second set of checks considers alternative definitions for the three variables of interest: political partisanship, information, and delivery. We start by checking whether the partisanship effect applies only to FMLN voters or whether it also applies to voters of ARENA (the major opposition party).16 In this check the variable “Partisanship” is a dummy variable taking the value of “1” if the respondent voted for the ARENA party in the 2009 elections and “0” otherwise. The results, shown in the appendix, are similar to the baseline. The estimated coefficients of the variable “Partisanship” are highly significant for both surveys. They have the expected negative sign, and their magnitude (in absolute value terms) is in line with the baseline case. These results suggest that the marginal effect induced by political partisanship (either in favor of or against the governing party) is a nontrivial contribution to satisfaction.

14 We also explored different specifications of the model including all possible interactions among the main regressors (which are not presented in this section but can be shared upon request). The coefficients of the interaction terms (estimated following Ai and Norton (2003)) are not statistically significant.
15 While in principle the “NS/NR” responses could be treated both ways (as “satisfied” or “unsatisfied” people), we think that treating them as unsatisfied is more appropriate given how widespread the opposition to the reform was.
16 FMLN and ARENA are the two major parties of the country, representing almost 70 percent of preferences in the last elections.

We further check our definition of “information.” The baseline case considers “informed” those individuals who explicitly reported being “informed” or “well informed” about the subsidy reform. This robustness check redefines the variable “Information” using a different question in the 2011 survey. This variable reflects whether the respondent is informed not only about the reform but also about the unsubsidized gas price or, in other words, the value of the subsidy. In this check the variable “Information” is the same dummy variable as in the baseline; however, it is conditional on knowing the correct unsubsidized price. The respondent is considered informed about the unsubsidized price if his/her answer is less than US$3 away (in either direction) from the “true” price. This choice is more restrictive than the baseline case. While in the baseline 216 people are considered “informed,” in this robustness check only 121 are considered so. The results of the regressions with the more restrictive definition of “Information” are reported in the appendix. The focus in this case is on the coefficient of the variable “Information.” The magnitude of the estimated coefficient is slightly lower than in the baseline, but the estimates remain significant at the one percent level in all models. Furthermore, the estimated coefficients of the other two regressors remain highly significant.
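The stricter definition of “Information” described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, its arguments, and the price used in the usage lines are hypothetical and do not come from the survey; the only elements taken from the text are the baseline self-reported dummy and the requirement that the stated price be less than a given band away from the “true” unsubsidized price.

```python
def informed_dummy(self_reported_informed, stated_price, true_price, band=3.0):
    """Return 1 if the respondent counts as informed under the stricter rule.

    Hypothetical helper (not from the paper): the respondent must report being
    informed AND state an unsubsidized price strictly less than `band` US$
    away, in either direction, from the true price.
    """
    if stated_price is None:
        # No price stated: the respondent cannot satisfy the price condition.
        return 0
    within_band = abs(stated_price - true_price) < band
    return int(bool(self_reported_informed) and within_band)


# Usage with a made-up "true" price of 15.0 (illustrative value only).
TRUE_PRICE = 15.0
close_guess = informed_dummy(True, 14.0, TRUE_PRICE)           # inside the 3 US$ band
far_guess = informed_dummy(True, 19.0, TRUE_PRICE)             # outside the 3 US$ band
far_guess_wide = informed_dummy(True, 19.0, TRUE_PRICE, 5.0)   # inside a 5 US$ band
not_informed = informed_dummy(False, 15.0, TRUE_PRICE)         # fails the baseline dummy
```

Keeping the band as a parameter makes it straightforward to rerun the same recoding with a wider tolerance.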
We also consider a wider error band (US$5) and report the results in the appendix. The baseline scenario is also confirmed in this case.

Finally, we consider a slightly different definition of the variable “Delivery.” In the baseline analysis for the January 2011 survey, the variable “Delivery” is a dummy variable taking the value of “1” if the respondent declares themselves confident about receiving the subsidy under the announced new scheme. However, it could be the case that the respondent answers “no” because she or he does not qualify for the transfer. This robustness exercise considers under the variable “Delivery” only individuals with a monthly usage of electricity below the qualifying threshold, that is, those that qualify for the benefit. This choice reduces the number of people confident in the delivery (meaning the number of observations taking the value of “1”) from 373 (baseline) to 305. The results are reported in the appendix. The coefficient of “Delivery” is only marginally lower than in the baseline, but the variable remains significant at the one percent level. Overall, these results suggest that the baseline findings are fully robust to the alternative definitions of delivery, information, and partisanship and that these variables play an important role in individuals’ satisfaction.

VII. Conclusion

In this paper we analyze the determinants of citizens’ satisfaction with a reform of the gas subsidy in El Salvador. The reform was expected to improve the welfare of around three-quarters of the population but turned out to be highly unpopular. We contribute in three ways. First, we document a case of a reform that benefited the majority of the population but was initially unpopular. Second, we use new survey data to identify the factors that help explain this puzzle.
Third, using probit models we test the marginal effects of three key observables: the individual’s level of information (which is especially relevant ex-ante), his/her trust in the government’s ability to implement the reform effectively (or the ability to deliver the subsidy ex-post), and his/her political views. We show that in January 2011 (before implementation) the level of information about the reform, the expectations on the ability of the government to deliver, and political priors help explain most of the overall satisfaction rate. On average, around 70 percent of the variance of the dependent variable is captured by our main regressors. We also show that the increase in the satisfaction rate over time is essentially driven by the ability of the government to deliver the subsidy. Throughout the five surveys following April 2011, the significance and magnitude of the coefficient identifying the above effect progressively increase. Finally, we show a nonmarginal effect of political partisanship in the perception of the reform not only before the reform was implemented but also throughout the entire period of analysis.

Overall, our findings suggest that the level of satisfaction with the reform could potentially have been affected by actions to increase the information of individuals. It is important to stress that such efforts could have played a role without necessarily modifying the content of the reform. In this sense the findings of our paper point to issues that go beyond the political economy of reform as it is often understood, that is, in the sense of identifying winners and losers.
Our paper suggests that exploring factors that may affect why an individual considers himself or herself to be a winner or a loser is an under-studied yet worthwhile effort for understanding the success or failure of policy reforms.

Finally, our findings point to three policy implications. First, to the extent possible, piloting a new delivery mechanism for a subsidy such as the one analyzed here could help identify implementation issues that could affect the overall credibility of the government to deliver. Second, given the importance of information in our findings, a clear and consistent public communication strategy is critical. This implies not only making information available but also verifying that this information was, indeed, understood by the public. Third, our results favor widespread rather than targeted communication efforts. Although we find a persistent impact of partisan views on the perception of the reform, such partisan views do not attenuate or amplify the effect of other variables, such as being better informed.

References

Ai, C., and E. C. Norton. 2003. “Interaction Terms in Logit and Probit Models.” Economics Letters 80 (1): 123–29.
Anderson, C. J., and A. J. LoTempio. 2002. “Winning, Losing and Political Trust in America.” British Journal of Political Science 32 (02): 335–51.
Anderson, C. J., and Y. V. Tverdova. 2001. “Winners, Losers, and Attitudes about Government in Contemporary Democracies.” International Political Science Review 22 (4): 321–38.
Artana, D., and F. Navajas. 2008. “Analisis y rediseno de los subsidios en El Salvador.” Unpublished manuscript.
Baumeister, R., E. Bratslavsky, C. Finkenauer, and K. D. Vohs. 2001. “Bad is Stronger than Good.” Review of General Psychology 5: 323–70.
Bril-Mascarenhas, T., and A. E. Post. 2015. “Policy Traps: Consumer Subsidies in Post-Crisis Argentina.” Studies in Comparative International Development 50 (1): 98–120.
CEPAL. 2012. “Centroamerica: Estadisticas de Hidrocarburos.” Technical report, Comision Economica para America Latina y el Caribe.
Drazen, A. 2000. “Political Economy in Macroeconomics.” Princeton University Press.
Fernandez, R., and D. Rodrik. 1991. “Resistance to Reform: Status Quo Bias in the Presence of Individual-Specific Uncertainty.” American Economic Review 81 (5): 1146–55.
Fritz, V., B. Levy, and R. Ort. 2012. “Problem-Driven Political Economy Analysis: The World Bank Experience.” Technical report, The World Bank.
Glaeser, E. L., and C. R. Sunstein. 2013. “Why Does Balanced News Produce Unbalanced Views?” NBER Working Papers 18975, National Bureau of Economic Research, Inc.
Harrison, K. 2013. “The Political Economy of British Columbia’s Carbon Tax.” OECD Environment Working Papers.
Haynes, L., O. Service, B. Goldacre, and D. Torgerson. 2012. “Test, Learn, Adapt: Developing Public Policy with Randomised Controlled Trials.” Technical report, Cabinet Office.
IMF. 2013. “Energy Subsidy Reform: Lessons and Implications.” Technical report, International Monetary Fund.
IMF. 2013. “Case Studies on Energy Subsidy Reform: Lessons and Implications.” Technical report, International Monetary Fund.
Jain, S., and S. W. Mukand. 2003. “Redistributive Promises and the Adoption of Economic Reform.” American Economic Review 93 (1): 256–64.
Tornarolli, L., and E. Vazquez. 2012. “Incidencia distributiva de los subsidios en El Salvador.” Technical report, Interamerican Development Bank.
Vagliasindi, M. 2013. “Implementing Energy Subsidy Reforms: Evidence from Developing Countries.” Technical report, The World Bank.
van Wijnbergen, S., and T. Willems. 2014. “Learning Dynamics and the Support for Economic Reforms: Why Good News can be Bad.” World Bank Economic Review 28 (3).
Veldkamp, L. 2009. “Learning about Reform: Time-Varying Support for Structural Adjustment.” International Review of Economics & Finance 18 (2): 192–206.
World Bank. 2006. “Infrastructure Service Provision in El Salvador: Fighting Poverty, Resuming Growth.” Technical report, The World Bank Group.

The World Bank Economic Review, 31(2), 2017, 351–384
doi: 10.1093/wber/lhw062
Advance Access Publication Date: December 10, 2016
Article

Does Input-Trade Liberalization Affect Firms’ Foreign Technology Choice?

Maria Bas and Antoine Berthou

Abstract

This paper studies the impact of input-trade liberalization on firms’ decision to upgrade foreign technology embodied in imported capital goods. Our empirical analysis is motivated by a simple theoretical framework of endogenous technology adoption, heterogeneous firms and imported inputs. The model predicts a positive effect of input tariff reductions on firms’ technology choice to source capital goods from abroad. This effect is heterogeneous across firms depending on their initial productivity level. Relying on India’s trade liberalization episode in the early 1990s, we demonstrate that the probability of importing capital goods is higher for firms producing in industries that have experienced greater cuts in tariffs on intermediate goods. Only those firms in the middle range of the initial productivity distribution have benefited from input-trade liberalization to upgrade their technology.

JEL classification: F10, F12, F14
Key words: Input-trade liberalization, firms’ decision to import capital goods, firm heterogeneity and Indian firm-level data

I. Introduction

Trade liberalization has, in the past two decades, produced steady growth in imports of intermediate and capital goods across countries. The endogenous-growth literature has provided theoretical arguments for the role of foreign intermediate inputs in enhancing economic growth and productivity gains (Ethier 1979, 1982; Grossman and Helpman 1991; Rivera-Batiz and Romer 1991).1 The specific influence of trade in capital goods on economic growth has also been emphasized in a number of theoretical and empirical works (Lee 1995; Eaton and Kortum 2001; Goh and Olivier 2002). Importing capital goods is found to be a relevant channel of foreign technology transfers and R&D spillovers across countries (Xu and Wang 1999). Trade liberalization is therefore expected to improve economic growth by decreasing the cost of both foreign intermediate goods and capital equipment (Amiti and Konings 2007; Topalova and Khandelwal 2011; Goldberg et al. 2010; Eaton and Kortum 2001).

Maria Bas (corresponding author) is a Professor at University of Paris 1; her email is maria.bas@univ-paris1.fr. Antoine Berthou is an economist at Banque de France and associate researcher at CEPII; his email is antoine.berthou@banque-france.fr. We thank the editor and two anonymous referees for their comments. We have benefited from discussions with Andrew Bernard, Pamela Bombarda, Lorenzo Caliendo, Pierre-Philippe Combes, Arnaud Costinot, Matthieu Crozet, David Dorn, Swati Dhingra, Lionel Fontagne, Juan Carlos Hallak, Emeric Henry, Sebastien Jean, Gianmarco Ottaviano, Philippe Martin, Thierry Mayer, Marc Melitz, John Morrow, Nina Pavcnik, Sandra Poncet, Ina Simonovska, Vanessa Strauss-Kahn, Cristina Terra, Thierry Verdier, Eric Verhoogen, and seminar participants of CEPII, Paris School of Economics, Université Dauphine, North American Econometric Society Winter meetings 2014, AEA Meeting Philadelphia, Asian Econometric Society Meeting 2013 in Singapore, Royal Economic Society Meeting, Royal Holloway 2013, and 47th annual conference of the Canadian Economic Association, Montreal, Canada 2013. We are responsible for any remaining errors. This research does not reflect the views of the Banque de France. A supplemental appendix to this article is available at https://academic.oup.com/wber.

1 Recent firm-level studies have confirmed that input-trade liberalization played a key role in firm productivity growth (Schor 2004; Amiti and Konings 2007; Topalova and Khandelwal 2011), the ability to introduce new products in the domestic market (Goldberg et al. 2010), export performance (Bas 2012; Bas and Strauss-Kahn 2015), and mark-up changes (DeLoecker et al. 2016). Other works highlight a positive link between imports of intermediate goods and firm productivity (Kasahara and Rodrigue 2008; Halpern et al. 2015).

© The Author 2016. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

This paper investigates the link between input-trade liberalization and foreign technology adoption embodied in imports of capital goods. Input-trade liberalization may affect technology adoption through a direct channel: the reduction of tariffs on capital goods decreases their price and allows firms to import a larger volume of these goods. In this work, we take a different perspective and focus on an indirect channel. We look at the effect of tariff cuts affecting variable inputs on firms’ decision to import capital goods. We emphasize unexplored mechanisms through which trade liberalization affects firms’ technology choice: a supply shock of input tariff reductions and a complementarity channel between imported variable intermediate goods and capital equipment.

Such complementarity is observed in our microdata of Indian firms used in the empirical analysis. We first show that only a subset of firms in our sample import capital goods, and almost all of them also import intermediate inputs. This feature of the data suggests that importing capital goods is associated with a technological investment decision. Moreover, the firms that import both intermediate inputs and capital equipment goods are more productive, suggesting a complementarity between imported inputs and foreign technology in the production process.

Our empirical analysis is motivated by a simple model of heterogeneous firms, endogenous technology adoption, and imported inputs that captures these main features of the Indian data. The aim of the theoretical model is to rationalize the channels through which input-trade liberalization affects firms’ decision to upgrade foreign technology embodied in imported capital goods. Input-trade liberalization reduces the costs of imported intermediate inputs and allows firms to decrease their marginal costs and increase their profitability. In the presence of a fixed cost of technology adoption, heterogeneous firms, and complementarity between imported inputs and high foreign technology, the model yields two main testable implications. First, input tariff reductions increase the probability of importing capital goods. Second, the effect of input-trade liberalization on firms’ technology choice is heterogeneous across firms depending on their initial productivity level. Firms that will benefit from input-trade liberalization are those with a high productivity level using low technology embodied in domestic capital goods before input tariff cuts.

We then test the model implications using the Indian firm-level dataset, Prowess, over the 1989–1997 period. These data were collected by the Centre for Monitoring the Indian Economy (CMIE). The Prowess dataset provides information on imports distinguished by type of goods (capital equipment, intermediate goods, and final goods). To establish the causal link between the availability of imported intermediate goods and firms’ decision to import capital goods, we rely on the unilateral trade reform that took place in India at the beginning of the 1990s as a part of the “Eighth Five-Year Plan.”2 We depart from previous studies of input-trade liberalization by distinguishing tariffs on variable inputs from tariffs on capital equipment products. The empirical identification strategy disentangles the direct effects of tariffs on capital goods and the indirect effects of tariffs on other variable intermediate goods on firms’ decision to import capital equipment goods from abroad.

2 Section V describes the policy instruments applied by the Indian government during this reform.

Using effectively applied most favored nation (MFN) tariff data and the input-output matrix, we construct tariff measures on variable inputs and on capital goods separately. We first present evidence that our tariff measures are free of reverse causality concerns. We extend the previous works in the literature and show that input tariff changes are uncorrelated with initial firm and industry characteristics relevant for our analysis during the trade reform under the “Eighth Five-Year Plan.” We then exploit this exogenous variation in input tariffs across industries to identify the effect of the availability of foreign variable intermediate goods on firms’ decision to import capital goods, taking into account changes in specific tariffs on capital goods. The empirical findings confirm the theoretical predictions.
Firms producing in industries with larger input tariff cuts have a higher probability of importing capital goods. Our results imply that the average input tariff reduction during the 1989–1997 period, 27 percentage points, produces a 4.6 percent increase in the probability of importing capital goods for the average firm importing intermediate goods. These results take into account the direct effect of capital goods tariff changes. We then investigate whether the impact of input-trade liberalization is heterogeneous across firms. Only those firms in the middle range of the productivity distribution import capital goods after input tariff reductions. Firms in the middle range of the initial productivity distribution increase their probability of sourcing capital goods by almost ten percent. As predicted by the model, our findings suggest that the least productive firms do not benefit from input tariff cuts to upgrade foreign technology. Nor do input tariff changes affect the most productive firms, which might have already adopted the foreign technology before input tariff cuts.

These results are robust to specifications that control for industry- and firm-observable characteristics that could be related to tariff changes and might change over time. We also take into account other possible explanations related to the incentives of Indian firms to adopt foreign technology. We show that our results remain robust when we explicitly control for other reforms that took place in India, foreign demand shocks (export channel), and changes in firms’ financial health that also affect firms’ decisions to import capital goods. Our findings are also robust and stable to other sensitivity tests. First, we investigate whether reductions in tariffs on intermediate goods are associated with the decision to start sourcing capital goods from abroad when we restrict our sample to firms that have not imported capital goods in the previous years.
Second, the previous findings also remain stable when we exclude foreign- or state-owned firms from the sample. Finally, we also find a positive effect of input-trade liberalization on the intensive margin of imports of capital goods.

These findings contribute to the literature on trade liberalization and firms’ technology choice. Most of the existing theoretical studies focus on the effects of foreign demand shocks on firms’ technology or quality upgrading. They look at demand shocks related to final goods tariff changes affecting exports in bilateral trade agreements or expansion of other export opportunities (Yeaple 2005; Verhoogen 2008; Bustos 2011; Aw et al. 2011; Lileeva and Trefler 2010; Costantini and Melitz 2008; Bas and Ledezma 2015). The contribution of this paper to this literature is to focus on an unexplored channel through which trade liberalization might also affect firms’ technology choice, namely, a supply shock related to changes in the costs of imported intermediate inputs. Changes in tariffs on intermediate goods might affect firms’ performance and thereby firms’ technology upgrading decision through multiple mechanisms: reduction of production costs, foreign technology transfer, and complementarity between imported intermediate inputs and high technology. Our findings show that input tariff changes are also an important factor in explaining firms’ technology choices.

Our results also complement the existing evidence regarding the microeconomic effects of input-trade liberalization on firm performance. Concerning the case of India, input tariff cuts have contributed significantly to firm productivity growth and also to the ability of firms to introduce new products. Topalova and Khandelwal (2011) show that input-trade liberalization improved firm productivity by 4.8 percent in India, while Goldberg et al. (2010) demonstrate that input-tariff cuts in India account, on average, for 31 percent of the new products introduced by domestic firms. They also show evidence of the direct effect of import tariff cuts on intermediate inputs in India during the same period under analysis. They find that tariff declines have a more pronounced impact on the extensive margin of imported intermediate products relative to final goods. DeLoecker et al. (2016) show that trade liberalization reduces prices and that output tariff cuts have pro-competitive effects. They find that price reductions are small relative to the declines in marginal costs due to the input-tariff liberalization. Recent studies have focused on the role of input-trade liberalization in shaping firms’ export performance. Using firm-level data from Argentina, Bas (2012) finds that firms producing in industries with larger input-tariff cuts have a greater probability of entering the export market. Bas and Strauss-Kahn (2015) show that Chinese firms that have benefited from input tariff cuts bought more expensive inputs and raised their export prices. These findings suggest that input-trade liberalization induces firms to upgrade their inputs at low cost in order to raise the quality of their exported products.

The next section describes the main empirical facts on Indian firms importing intermediate inputs and capital equipment goods. Section III presents a simple theoretical framework of endogenous foreign technology adoption that reflects the main features of the data and rationalizes the mechanisms through which input-trade liberalization affects firms’ decision to upgrade technology. Section IV describes the testable empirical implications. Section V presents the trade-policy background in India, the estimation strategy, and the empirical results. Section VI explores alternative explanations.
Section VII introduces several robustness tests. Section VIII explores the additional gains from input-trade liberalization. The last section concludes.

II. Empirical Motivation

Before analyzing the relationship between input-trade liberalization and firms’ decision to upgrade foreign technology embodied in imported capital goods, this section provides a first inspection of the data. We document several empirical facts on firms sourcing intermediate inputs and capital equipment goods from foreign countries that will guide the assumptions of our theoretical model.

Only a small subset of Indian firms produces with foreign technology embodied in imported capital goods. During the period 1989–1997, only 38 percent of firms in the sample import capital goods while most of the firms (73 percent) import intermediate goods. Moreover, firms import intermediate inputs on a yearly basis, while they import capital goods more sporadically. Looking at the firms that source both foreign goods reveals that almost all firms that import capital goods (99 percent) also purchase imported intermediate goods. The fact that only half of the firms that import intermediate goods are able to source imported capital equipment goods suggests that the decision to source capital goods from abroad is related to a technological choice that involves a fixed investment cost.3

Empirical fact 1: A large proportion of firms imports intermediate goods, while only a subset of those firms also imports capital equipment goods.

3 Using detailed product-level data on imports by Indian manufacturing plants, Fernandes et al. (2012) show the existence of fixed costs of importing.

This small subset of firms that produces with imported capital goods technology performs better than non-importers of capital equipment goods. Table 1 shows estimations of importer premia of capital goods.
We regress firms’ sales, capital stock, wage-bill, profits, and the share of imported inputs (imports of intermediates over total inputs) on a dummy variable equal to one for firms with positive values of imports of capital goods (importer of capital goods) and zero for those firms that do not import (non-importer of capital goods), including industry and year-fixed effects. The results show that, within an industry-year, firms that import capital goods have larger sales, capital stock, wage-bill, profits, and imported input share.

Table 1. Importers vs. Non-Importers of Capital Goods: Importer of Capital Goods Premia
The dependent variable is described in each column

                             (1)        (2)        (3)         (4)        (5)
                             Sales      Capital    Wage-bill   Profits    Imported inputs share
Importer of capital goods    1.356***   1.517***   1.350***    1.558***   0.085***
                             (0.043)    (0.045)    (0.049)     (0.051)    (0.014)
Year fixed effects           Yes        Yes        Yes         Yes        Yes
Industry fixed effects       Yes        Yes        Yes         Yes        Yes
Observations                 14,680     14,647     14,680      11,945     14,680
R-squared                    0.224      0.283      0.202       0.214      0.005

Notes: The dependent variable is described in the head of each column; all of those variables are expressed in logarithm terms except for the share of imported inputs. The table shows regressions of each firm performance measure on a dummy variable equal to one if the firm imports capital goods in year t and zero otherwise. All regressions include industry- and year-fixed effects. Heteroskedasticity-robust standard errors clustered at the firm level are reported in parentheses. ***, **, and * indicate significance at the one, five, and ten percent levels respectively.

Empirical fact 2: Firms producing with foreign technology embodied in imported capital goods have larger sales, capital stock, are more profitable, and have a higher share of imported inputs than non-importers of capital goods.
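The importer premia regressions described above can be sketched as follows. This is a minimal illustration, not the authors’ code: the function name and the data are hypothetical, the data are synthetic and noise-free so that the premium is recovered exactly, and the sketch returns only the point estimate (it does not compute the firm-clustered standard errors reported in table 1).

```python
import numpy as np


def importer_premium(log_outcome, importer, industry, year):
    """OLS coefficient on an importer dummy with industry and year fixed effects.

    Hypothetical helper: fixed effects enter as dummy variables, dropping one
    industry and one year as reference categories.
    """
    y = np.asarray(log_outcome, dtype=float)
    d = np.asarray(importer, dtype=float)
    industry = np.asarray(industry)
    year = np.asarray(year)

    columns = [d, np.ones_like(d)]  # importer dummy and constant
    for g in np.unique(industry)[1:]:  # industry dummies (first one omitted)
        columns.append((industry == g).astype(float))
    for t in np.unique(year)[1:]:  # year dummies (first one omitted)
        columns.append((year == t).astype(float))

    X = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[0]


# Synthetic firm-year data, for illustration only (not the Prowess data).
rng = np.random.default_rng(0)
n = 200
industry = rng.integers(0, 4, size=n)
year = rng.integers(1989, 1998, size=n)
importer = rng.integers(0, 2, size=n)
# Log sales built with a premium of 1.3 plus industry and year effects.
log_sales = 1.3 * importer + 0.5 * industry + 0.1 * (year - 1989)

premium = importer_premium(log_sales, importer, industry, year)
```

Because the outcome is in logarithms, the coefficient on the dummy is read as a log-point difference between importers and non-importers of capital goods within the same industry and year.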
We also observe that imports of intermediates and capital goods are positively correlated. Firms that import intermediate inputs have a greater probability of sourcing capital equipment from foreign countries. Table 2 presents a set of simple estimations of the probability of importing capital equipment goods as a function of firms’ imported inputs intensity (the ratio of imported inputs over total inputs). We look at the relationship between the decision to import capital goods and the imported input intensity of the firm across firms within the same three-digit industry (columns 1 and 2) and within-firm over time (columns 3 and 4).

Column (1) suggests that, comparing firms producing in the same industry, firms that import intermediate goods are more likely to import capital equipment goods. Column (2) includes a control variable of firm size (wage-bill).4 Looking at within-firm variation over time, columns (3) and (4) show that the decision to upgrade foreign technology embodied in imported capital goods is positively correlated with firms’ imported input intensity. This descriptive evidence suggests that there exists a certain complementarity between imported capital goods and intermediate inputs in India.

4 We rely on wage-bill as a measure of firms’ size since total employment is not reported in the Prowess dataset.

Table 2. Complementarity Between Imports of Capital Goods and Intermediate Inputs
Dependent variable: Importer of capital goods status

                                   (1)        (2)        (3)        (4)
Imported input share               0.032***   0.058***   0.046***   0.047***
                                   (0.004)    (0.004)    (0.005)    (0.005)
Firm size                                     Yes                   Yes
Industry 3 digit fixed effects     Yes        Yes
Firm fixed effects                                       Yes        Yes
Year fixed effects                 Yes        Yes        Yes        Yes
Observations                       14,680     14,680     14,680     14,680
R-squared                          0.262      0.326      0.092      0.294

Notes: The dependent variable is a dummy for firm i having positive imports of capital goods in year t. The imported input share is the ratio of imported inputs over total inputs. Firm size is measured by the logarithm of wage-bill and it is included in columns (2) and (4). Heteroskedasticity-robust standard errors clustered at the firm level are reported in parentheses. ***, **, and * indicate significance at the one, five, and ten percent levels respectively.

Empirical fact 3: The decision of sourcing capital goods from abroad is positively correlated with imports of intermediate goods.

As a final step, we explore whether this complementarity between foreign technology and imported intermediate goods translates into a higher global efficiency of the firm in the production process. We investigate whether importing both capital and intermediate goods improves firms’ total factor productivity (TFP) by estimating a production function relying on the methodology developed by Levinsohn and Petrin (2003) (henceforth LP). The LP approach controls for simultaneity bias in the estimation of firms’ production function.5 LP builds on the Olley and Pakes (1996) methodology, which develops a two-stage method to control for unobserved firm productivity.6 We modify the LP-OP estimation by incorporating the importer status of capital goods and intermediate inputs in the production function estimation.
We also control for the volume of imports of capital goods and/or intermediates, so that importer status does not simply pick up the fact that firms importing both capital goods and intermediates tend to have a larger volume of imports, which can affect total efficiency.7 Table A1 in the online appendix reports the results. In column (2), we include the dummy variables indicating whether the firm imports only intermediates or both inputs and capital equipment goods. Firms producing with foreign inputs and domestic capital goods have greater TFP (18 percent) relative to firms using only domestic inputs. The estimates also show that firms producing with both foreign inputs and imported capital goods are 22 percent more productive than non-importers.8

5 Simultaneity arises because firms’ variable input demands and unobserved productivity are positively correlated: the firm-specific productivity is known by the firm but not by the econometrician, and firms respond to productivity shocks by modifying their purchases of variable inputs.

6 Levinsohn and Petrin (2003) build upon the idea of Olley and Pakes, using primary input demand (electricity) instead of the investment decision to control for unobserved productivity shocks. Their rationale lies in the idea that investment data are often missing or lumpy, whereas data on raw inputs are of better quality, thus guaranteeing strict monotonicity without efficiency loss. The Prowess dataset reports information on electricity inputs, so we rely on the LP methodology.

7 We thank an anonymous referee for suggesting this control variable.

8 Note that this evidence gives just an empirical motivation for the model assumption of complementarity between imported capital goods and intermediate inputs. Although the production function is estimated pooling industries, the estimation includes three-digit industry fixed effects.

Empirical fact 4: Producing with both imported inputs and foreign capital equipment goods improves firms’ global efficiency in the production process.

Given such complementarity, input-trade liberalization should affect firms’ decision to upgrade foreign technology in imported capital goods. The average tariff on variable imported intermediate goods fell 27 percentage points between 1989 and 1997.9 As input tariffs drop, the share of firms importing capital equipment goods increases in most industries. As can be seen in figure A1 (in the online appendix), within each two-digit industry, the largest input-tariff reductions and the greatest expansion of the share of capital goods importers occur at the same time.

Empirical fact 5: As average input tariffs fall, the share of firms importing capital goods increases.

The next section develops a simple model that rationalizes these empirical facts to explain the role of input-trade liberalization in firms’ decision to upgrade foreign technology embodied in imported capital goods.

III. Theoretical Motivation

Previous models of heterogeneous firms and technology or quality upgrading focus on the impact of foreign demand shocks, mainly through export variable cost changes, on firms’ decision to upgrade their technology/quality. Yeaple (2005) develops a trade model of heterogeneous skills, technology choice, and ex-post heterogeneous firms. In this model, trade liberalization through a reduction of trade variable costs enhances technology adoption and skill upgrading. Verhoogen (2008) presents a model of firm heterogeneity and quality differentiation in which more productive firms produce high-quality goods for the export market. Expansion of export opportunities leads more productive firms to upgrade the quality of their goods for the export market.
Bustos (2011) builds on Yeaple (2005) and Melitz (2003) to develop a trade model of heterogeneous firms and endogenous technology adoption. In her model, trade-variable cost reductions increase expected export revenues and enhance technology upgrading. Bas and Ledezma (2015) extend the Melitz (2003) model by including an additional stage of investment choice over a continuous support that determines firm productivity. In this model, trade liberalization also affects investment choice and productivity through an expansion of foreign demand. Other works that include fixed costs of innovation or technology upgrading in a Melitz-type model are Aw et al. (2011), Lileeva and Trefler (2010), and Costantini and Melitz (2008). In those models, trade liberalization also shapes technology choice via a foreign demand channel through changes in trade variable costs affecting final goods.

Our model is also related to Kugler and Verhoogen (2011), who extend the Melitz (2003) heterogeneous-firm model to include endogenous input- and output-quality choices. They add a domestic intermediate-input sector that produces inputs of different qualities. They consider two scenarios. In the first one, input quality and firm capability draws are complements in generating output quality. In the second scenario, they assume fixed costs of quality upgrading and that producing high-quality output requires high-quality inputs.10 Only in this second scenario does firms’ quality choice depend on the scale of the market to which the plant sells. This second variant of the model therefore predicts that an exogenous increase in market access induces quality investments.11 Given that inputs are only domestically produced, trade liberalization will not affect production factor costs.

We depart from these models of trade, heterogeneous firms, and technology/quality upgrading that focus on foreign demand shocks related to final goods trade variable cost changes and expansion of market access.
Our focus is instead on a supply shock, namely, variations in the relative production costs associated with input-trade liberalization. Assuming that firms produce their final product with both domestic and imported intermediate inputs and that high technology is biased toward foreign inputs, trade liberalization through input tariff reductions affects the relative costs of foreign inputs and thus firms’ profitability and the incentives for technology adoption. Kasahara and Lapham (2013) also introduce imported intermediate goods and a fixed cost of importing into a Melitz-type model to investigate the simultaneous choice of exporting final goods and importing intermediates. Amiti and Davis (2012) build on Kasahara and Lapham (2013) to explain the effects of input- and output-trade liberalization on firms’ wages. Bombarda and Gamberoni (2013) develop a Melitz-type model, including an intermediate goods sector producing differentiated varieties for domestic and foreign markets, to explain the impact of relaxing rules of origin. However, they assume that intermediate goods producers are trade frictionless.

9 Input tariffs are computed as the tariff on variable intermediate goods other than capital equipment goods at the three-digit industry level, as described in section V.

10 Hallak and Sivadasan (2013) also consider fixed costs of quality upgrading. They develop a model of trade with two dimensions of firm heterogeneity (productivity and calibre, the ability to develop high-quality products with lower fixed outlays). In this model, exporters have more incentives to invest in quality upgrading due to a higher total demand and because trade costs decrease with quality. Thus, trade liberalization enhances quality upgrading.

11 For simplicity, the authors assume that there are no trade variable costs.
These models do not take into account how imported input tariffs affect firms’ technology choice.

Our model is closely related to the recent framework developed by Boler et al. (2015) of heterogeneous firms, endogenous R&D choice, and international sourcing of intermediate goods. In their setting, the complementarity between imported inputs and R&D investments arises from a scale effect: on the one hand, lower R&D costs raise average productivity and firm size, increasing the number of imported inputs, and, on the other hand, importing intermediate goods reduces marginal costs, making it easier to incur the fixed costs of R&D. In the empirical analysis, they test the first implication by exploiting the implementation of an R&D tax credit in Norway and show that this reform stimulated not only R&D investments but also imports of intermediates, which contributed to productivity growth. Our focus is instead on how input-trade liberalization affects firms’ foreign technology choice embodied in imported capital goods. Assuming that foreign technology is biased toward imported intermediates and that importing capital goods involves fixed costs, we show that input-trade liberalization fosters foreign technology adoption, and that the effect of input-tariff cuts is heterogeneous across firms depending on their initial productivity level. Since we want to emphasize this imported input channel, for the sake of simplicity we abstract from the export side of the story and from the effects of trade liberalization through variations in trade-variable costs affecting final goods, which are already well documented in the literature.

Set-up of the model

The aim of this section is to motivate our empirical analysis by introducing a simple model of heterogeneous firms, endogenous technology adoption, and imported inputs based on Melitz (2003). The assumptions of the model capture the empirical facts described in the previous section.
The theory rationalizes the mechanisms through which input-trade liberalization affects firms’ decision to upgrade technology embodied in imported capital equipment goods.

Preferences

The representative household allocates consumption among the range of differentiated varieties of final goods $\omega$. Consumer preferences are assumed to take the Constant Elasticity of Substitution (CES) utility function:

\[
U = \left[ \int_{\omega \in \Omega} q(\omega)^{\frac{\sigma-1}{\sigma}} \, d\omega \right]^{\frac{\sigma}{\sigma-1}},
\]

where $\sigma > 1$ is the elasticity of substitution between two varieties and $\Omega$ the set of available varieties. The optimal demand function for each differentiated variety is given by

\[
q(\omega) = Q \left[ \frac{p(\omega)}{P} \right]^{-\sigma},
\]

where $Q \equiv U$ is the aggregate consumption of available varieties, $P$ the price index, $p(\omega)$ the price set by a firm, and $R = PQ$ aggregate revenue. The price index dual to the CES utility function is

\[
P = \left[ \int_{\omega \in \Omega} p(\omega)^{1-\sigma} \, d\omega \right]^{\frac{1}{1-\sigma}}.
\]

Production

There are two sectors in the economy. One sector produces a homogeneous domestic intermediate input $x_d$ under constant returns to scale and perfect competition, with a unit labor requirement. Labor is inelastically supplied and the wage is used as a numeraire. Since this homogeneous intermediate goods sector is characterized by perfect competition, the price of domestic inputs equals the marginal cost of producing the input: $p_x = w = 1$. Similar to previous works on heterogeneous firms and imported intermediate goods (Kasahara and Lapham [2013] and Amiti and Davis [2012]), we assume that intermediate goods are available in the country in a fixed, exogenously determined measure.12 This sector also produces domestic capital equipment goods $k_d$ under perfect competition and constant returns to scale, with a unit labor requirement. The price of domestic capital goods is then equal to one.
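The CES demand system in the Preferences paragraph above can be checked numerically on a finite set of varieties. The sketch below replaces the integral over varieties with a sum, which is our simplification, and confirms two properties that follow from the stated formulas: spending adds up to aggregate revenue, and utility equals the aggregate quantity index. Function names and the numbers in the usage example are hypothetical.

```python
def price_index(prices, sigma):
    """Discrete analogue of the CES price index:
    P = (sum of p^(1 - sigma))^(1 / (1 - sigma))."""
    return sum(p ** (1.0 - sigma) for p in prices) ** (1.0 / (1.0 - sigma))


def demands(prices, sigma, revenue):
    """Optimal demand for each variety, q = Q * (p / P)^(-sigma), with Q = R / P."""
    big_p = price_index(prices, sigma)
    big_q = revenue / big_p
    return [big_q * (p / big_p) ** (-sigma) for p in prices]


def utility(quantities, sigma):
    """CES utility over a finite set of varieties."""
    exponent = (sigma - 1.0) / sigma
    return sum(q ** exponent for q in quantities) ** (1.0 / exponent)


if __name__ == "__main__":
    prices = [1.0, 1.5, 2.0]   # hypothetical prices of three varieties
    sigma = 4.0                # elasticity of substitution, must exceed one
    revenue = 100.0            # aggregate spending R
    q = demands(prices, sigma, revenue)
    spending = sum(p * qi for p, qi in zip(prices, q))
    print("spending:", spending)
    print("utility:", utility(q, sigma), "Q = R / P:", revenue / price_index(prices, sigma))
```

Cheaper varieties receive more demand, and the gap widens as the elasticity of substitution rises, which is why lower-cost firms earn larger revenues in this setting.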
The other sector produces a continuum of differentiated final goods under monopolistic competition. In this sector, there is a continuum of firms, which all differ in terms of their initial productivity level $\varphi$. Each firm produces a distinct horizontally differentiated variety of final good in a monopolistic competition market structure. The production of each variety of final good $q$ involves a fixed production cost $f$ in terms of labor.13 Firms combine intermediate inputs $x$ and capital equipment goods $k$ to produce the final good in a Cobb-Douglas technology with factor shares $\eta$ and $1-\eta$: $q(\varphi) = \varphi x^{\eta} k^{1-\eta}$. Firms produce using both domestic $x_d$ and imported $x_m$ inputs combined in a CES function with an elasticity of substitution between the two types of inputs equal to $\theta = \frac{1}{1-\alpha}$. Domestic and imported inputs are imperfect substitutes, $0 < \alpha < 1$ and $1 < \theta < \infty$. To keep the model simple, we assume that all firms use both intermediate goods. This assumption is in line with empirical fact 1 described in the previous section.

\[
x = \left( x_{di}^{\alpha} + \gamma_i^{\alpha} x_{mi}^{\alpha} \right)^{\frac{1}{\alpha}} \quad \text{for } i = \{l, h\} \tag{2}
\]

Firms can produce the final good with a low or a high technology, denoted by subscripts $l$ and $h$. Low technology is embodied in domestic capital goods $k_d$ and is available to all firms. High technology is characterized by imported capital goods $k_m$ and implies incurring an additional fixed technology adoption cost $f_h$ in terms of labor.14 Empirical fact 1 suggests that sourcing capital equipment goods from abroad in India involves a fixed investment cost that only a few firms can afford. The fixed cost of importing capital goods represents an investment in a new and more advanced technology that reduces marginal costs of production. The parameter $\gamma_i$ represents the complementarity between imported intermediate inputs and imported capital goods (empirical facts 3 and 4).
The high value of this factor is only available to firms that pay the fixed foreign technology cost. Therefore, firms producing with high technology embodied in imported capital goods combine both types of capital goods in a Cobb-Douglas function $k = k_m^{\beta} k_d^{1-\beta}$ and increase their efficiency due to the complementarity in the production process between imported inputs and imported capital goods, with $\gamma_h > 1$. Firms producing only with low domestic technology have $k = k_d$ and $\gamma_l = 1$. The complementarity between imported inputs and imported capital goods yields a higher efficiency in the production process, reducing firms’ marginal costs. This complementarity translates into an imported-input biased foreign technology.15 Given that imported and domestic intermediate goods are imperfect substitutes, the complementarity assumption implies that firms producing with high technology embodied in imported capital goods are imported-input intensive, and firms producing with low technology represented by domestic capital goods are domestic-input intensive. The evidence presented in the previous section suggests that imported intermediate inputs are complementary with foreign technology embodied in imported capital goods for Indian firms.

12 This assumption of a fixed measure of intermediate goods allows us to focus on the cost reduction channel of intermediate goods trade in a tractable way, avoiding the possible multiple equilibria of models like Venables (1996).

13 This assumption allows us to study the decision of firms that face homogeneous fixed costs.

14 The assumption that the fixed technology adoption cost is also measured in terms of labor allows us to study the technology choice of firms that face homogeneous fixed costs.

15 Note that this complementarity is similar to the one present in the trade-induced skill-biased technological change models. The main difference is that such models do not explain supply shocks driven by trade liberalization and associated with changes in the price of production factors. They focus instead on demand-side shocks related to trade variable cost reductions in final goods that increase firms’ output demand and then the relative demand for skilled labor.

Each firm chooses its price to maximize its profits subject to a demand curve with constant elasticity $\sigma$. The equilibrium price reflects a constant mark-up over marginal cost:

\[
p_i(\varphi) = \frac{\sigma}{\sigma-1} \frac{c_i}{\varphi} \tag{3}
\]

In this model, marginal cost can be divided into an intrinsic productivity term $\varphi$ and a cost index $c_i$, which combines the prices of intermediate and capital goods. Final good producers are price-takers in intermediate-input and capital equipment goods markets. The price of imported inputs and capital goods takes into account the input tariff $\tau_m$ and the capital goods tariff $\tau_k$, respectively. Since the price of domestic intermediate and capital goods is equal to the wage, which is used as a numeraire, the cost index for the low- and high-technology firms can be expressed as a function of the complementarity parameter and the input and capital goods tariffs:

\[
c_l = \left( 1 + \tau_m^{\frac{\alpha}{\alpha-1}} \right)^{\frac{\eta(\alpha-1)}{\alpha}}
\quad \text{and} \quad
c_h = \tau_k^{\beta(1-\eta)} \left( 1 + \left( \frac{\tau_m}{\gamma_h} \right)^{\frac{\alpha}{\alpha-1}} \right)^{\frac{\eta(\alpha-1)}{\alpha}}.
\]

High-technology firms pay a fixed technology cost that allows them to reduce their marginal cost by increasing their efficiency through the complementarity between imported inputs and imported capital goods ($\gamma_h$). We assume that the efficiency parameter of imported capital goods $\gamma_h$ is higher than its additional variable cost $\tau_k$. The cost index of high-technology firms $c_h$ is then lower than that of low-technology firms $c_l$.
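The cost indices described above can be evaluated directly for given tariffs. The sketch below codes the unit cost of the CES input bundle combined with the Cobb-Douglas capital term, dropping the Cobb-Douglas constants as the text does, and adds the implied cost share of imported inputs within the intermediate bundle, which is our own derivation from the CES structure. All parameter values are hypothetical.

```python
def cost_index(tau_m, gamma=1.0, tau_k=1.0, beta=0.0, alpha=0.5, eta=0.6):
    """Cost index of a firm facing input tariff tau_m.

    Low technology: gamma = 1 and beta = 0 (no imported capital goods).
    High technology: gamma > 1 and beta > 0, with capital goods tariff tau_k.
    """
    e = alpha / (alpha - 1.0)  # equals 1 - theta, which is negative
    input_cost = (1.0 + (tau_m / gamma) ** e) ** (eta / e)
    capital_cost = tau_k ** (beta * (1.0 - eta))
    return capital_cost * input_cost


def imported_input_cost_share(tau_m, gamma=1.0, alpha=0.5):
    """Share of imported inputs in spending on the intermediate bundle
    (assumption: standard CES cost shares with effective price tau_m / gamma)."""
    e = alpha / (alpha - 1.0)
    weight = (tau_m / gamma) ** e
    return weight / (1.0 + weight)


if __name__ == "__main__":
    tau_m, tau_k, gamma_h, beta = 1.3, 1.1, 1.5, 0.3   # hypothetical values
    c_l = cost_index(tau_m)
    c_h = cost_index(tau_m, gamma=gamma_h, tau_k=tau_k, beta=beta)
    print("c_l:", c_l, "c_h:", c_h)
    print("import share, low technology:", imported_input_cost_share(tau_m))
    print("import share, high technology:", imported_input_cost_share(tau_m, gamma=gamma_h))
```

With these illustrative values the high-technology cost index is lower and the high-technology firm spends a larger share on imported inputs, which matches the imported-input bias described in the text.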
The ratio $\frac{c_h}{c_l}$ is determined by:

\[
\frac{c_h}{c_l} = \tau_k^{\beta(1-\eta)} \left( \frac{\tau_m^{\frac{\alpha}{1-\alpha}} + 1}{\tau_m^{\frac{\alpha}{1-\alpha}} + \gamma_h^{\frac{\alpha}{1-\alpha}}} \right)^{\frac{\eta(1-\alpha)}{\alpha}} \tag{4}
\]

This ratio expresses the relative cost of high-technology firms to low-technology firms. The relative cost $\frac{c_h}{c_l}$ is an increasing function of input tariffs. Partially differentiating equation (4) with respect to the input tariffs ($\tau_m$), we find that $\partial \left( \frac{c_h}{c_l} \right) / \partial \tau_m > 0$ since $0 < \alpha < 1$ and $\gamma_h > 1$. The lower the input tariffs, the lower the relative unit costs of firms using the high technology vis-a-vis low-technology firms. This result is explained by the fact that using high technology in imported capital goods improves the efficiency of production through the use of foreign inputs. Adopting the high technology induces a technical change that is biased toward the use of foreign inputs given the substitutability between domestic and imported intermediate goods in the CES production function. This makes the production process more sensitive to input tariff changes. The relative cost $\frac{c_h}{c_l}$ is also an increasing function of capital goods tariffs. A reduction of tariffs on capital goods reduces the relative costs of using foreign high technology.

The ratio of the high-technology unit cost to the low-technology unit cost expressed in equation (4) is the key variable in this model that captures the differential effect of input-tariff changes on firms’ revenues and profits. Combining the demand and the price function, firms’ revenues are given by

\[
r_i(\varphi) = \left( \frac{p_i(\varphi)}{P} \right)^{1-\sigma} R = A \, c_i^{1-\sigma} \varphi^{\sigma-1},
\]

where $R$ is the aggregate revenue and $A = P^{\sigma-1} R \left( \frac{\sigma-1}{\sigma} \right)^{\sigma-1}$ is an index for market demand. High-technology firms’ revenues can be written as a function of revenues of low-technology firms: $r_h(\varphi) = r_l(\varphi) \left( \frac{c_h}{c_l} \right)^{1-\sigma}$. Hence, firms that upgrade technology by importing capital goods have a relative cost advantage that allows them to raise their revenues by the term $\left( \frac{c_h}{c_l} \right)^{1-\sigma}$. Note that this term is higher than one since the elasticity of substitution among final goods is $\sigma > 1$ and $c_l > c_h$.
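The sign of the derivative of the relative cost with respect to the input tariff is stated above without intermediate steps. One way to see it, sketched here under the text's assumptions (a CES parameter between zero and one and a complementarity parameter above one), uses three shorthand symbols, $e$, $t$, and $g$, which are ours and do not appear in the original.

```latex
% Shorthand introduced for illustration only: e, t, g.
% tau_m: input tariff; tau_k: capital goods tariff; gamma_h > 1: complementarity
% parameter; 0 < alpha < 1; eta and beta are the Cobb-Douglas shares.
\begin{align*}
e &\equiv \frac{\alpha}{1-\alpha} > 0, \qquad
t \equiv \tau_m^{e}, \qquad
g \equiv \gamma_h^{e} > 1, \\
\frac{c_h}{c_l} &= \tau_k^{\beta(1-\eta)} \left( \frac{t+1}{t+g} \right)^{\eta/e}, \\
\frac{\partial}{\partial t} \left( \frac{t+1}{t+g} \right) &= \frac{g-1}{(t+g)^{2}} > 0, \qquad
\frac{dt}{d\tau_m} = e \, \tau_m^{e-1} > 0, \\
\frac{\partial (c_h / c_l)}{\partial \tau_m}
&= \frac{\eta}{e} \, \frac{c_h}{c_l} \, \frac{t+g}{t+1} \, \frac{g-1}{(t+g)^{2}} \, e \, \tau_m^{e-1}
= \eta \, \frac{c_h}{c_l} \, \frac{(g-1) \, \tau_m^{e-1}}{(t+1)(t+g)} > 0.
\end{align*}
```

Every factor in the last expression is positive, and the sign hinges on $g > 1$: without the complementarity advantage the input tariff would leave the relative cost unchanged.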
Profits for both types of firms are given by

\[
\pi_l(\varphi) = \frac{r_l(\varphi)}{\sigma} - f
\quad \text{and} \quad
\pi_h(\varphi) = \left( \frac{c_h}{c_l} \right)^{1-\sigma} \frac{r_l(\varphi)}{\sigma} - f - f_h.
\]

Given that the price is a constant mark-up over marginal costs, in this model firms with a higher productivity draw using high technology set lower prices than low-technology firms due to a better exogenous productivity draw ($\varphi$) and a higher input efficiency thanks to the complementarity between imported intermediate goods and imported capital goods ($\gamma_h$). Since demand is elastic, these lower prices imply that more productive firms using foreign technology embodied in imported capital goods also have larger revenues and profits relative to those firms producing only with domestic capital goods (consistent with empirical fact 2).

Firms’ decisions

The decision to exit or stay and produce

Firms have to pay a sunk entry cost $f_e$ to enter the market before they know what their productivity level will be. Entrants then draw their productivity $\varphi$ from a common distribution with density $g(\varphi)$, support $[0, \infty)$, and cumulative distribution $G(\varphi)$. After observing its productivity draw, each firm decides whether to stay and produce or to exit the market. Since there is a fixed production cost $f$, only those firms with enough profits to afford this cost can produce. The profits of the marginal firm that decides to stay and produce with low technology are equal to zero: $\pi_l(\varphi_l^{*}) = 0$. The value $\varphi_l^{*}$ is the survival productivity cutoff to produce with low technology. This cutoff is determined by the following condition:

\[
\pi_l(\varphi_l^{*}) = \frac{r_l(\varphi_l^{*})}{\sigma} - f = \frac{A}{\sigma} c_l^{1-\sigma} \left( \varphi_l^{*} \right)^{\sigma-1} - f = 0 \tag{5}
\]

Equation (5) implies that the survival productivity cutoff to produce with low technology is determined by $\left( \varphi_l^{*} \right)^{\sigma-1} = f c_l^{\sigma-1} \frac{\sigma}{A}$.
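The zero-profit condition for the marginal low-technology firm can be illustrated with a few lines of code. The sketch below takes the market demand index as given, although in the model it is determined in equilibrium, and confirms that profits are zero at the cutoff, negative below it, and positive above it. Names and numbers are hypothetical.

```python
def revenue(phi, cost_index, demand_index, sigma):
    """Revenue r(phi) = A * c^(1 - sigma) * phi^(sigma - 1)."""
    return demand_index * cost_index ** (1.0 - sigma) * phi ** (sigma - 1.0)


def profit_low(phi, c_l, demand_index, sigma, f):
    """Profit of a low-technology firm: r_l(phi) / sigma - f."""
    return revenue(phi, c_l, demand_index, sigma) / sigma - f


def survival_cutoff(c_l, demand_index, sigma, f):
    """Productivity at which low-technology profit is exactly zero."""
    return (f * sigma * c_l ** (sigma - 1.0) / demand_index) ** (1.0 / (sigma - 1.0))


if __name__ == "__main__":
    # Hypothetical values; the demand index A is treated as exogenous here.
    c_l, demand_index, sigma, f = 0.7, 10.0, 4.0, 1.0
    cutoff = survival_cutoff(c_l, demand_index, sigma, f)
    print("survival cutoff:", cutoff)
    print("profit at cutoff:", profit_low(cutoff, c_l, demand_index, sigma, f))
```

A higher fixed cost or a higher cost index raises the cutoff, while stronger market demand lowers it.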
All firms that have a productivity draw lower than the survival cutoff ($\varphi < \varphi_l^{*}$) are not able to pay the fixed production cost; they make losses and exit the market. Firms with a productivity draw greater than the survival cutoff ($\varphi > \varphi_l^{*}$) stay in the market and produce.

The decision to adopt high technology

If a firm decides to stay in the market once it has received its productivity draw, it may also decide to upgrade its technology by importing capital goods to reduce its marginal costs on the basis of its profitability. Technology choice is endogenously determined by the initial productivity draw. Firms with a more favorable productivity draw have a higher potential payoff from adopting the high technology that is biased toward foreign inputs, and hence are more likely to find incurring the fixed technology cost worthwhile. Thus, firms that will upgrade technology are the most productive ones, whose increase in revenues due to the adoption of high technology enables them to pay the fixed technology cost to import capital goods. Technology adoption allows firms to increase their profitability through the complementarity channel between imported intermediate goods and imported capital goods in the production process.16 The indifference condition for the marginal firm to acquire the new and more advanced foreign technology is given by $\pi_h(\varphi_h^{*}) = \pi_l(\varphi_h^{*})$:

\[
\frac{r_h(\varphi_h^{*}) - r_l(\varphi_h^{*})}{\sigma} = f_h \tag{6}
\]

The high-technology productivity cutoff $\varphi_h^{*}$ is the minimum productivity level for the marginal firm that is able to adopt the high technology and import capital goods. Equation (6) implies that $\left( \varphi_h^{*} \right)^{\sigma-1} = \frac{f_h \sigma}{\left( c_h^{1-\sigma} - c_l^{1-\sigma} \right) A}$. By combining equation (5) with (6), we obtain $\varphi_h^{*}$ as an implicit function of $\varphi_l^{*}$:

\[
\varphi_h^{*} = \varphi_l^{*} \left( \frac{f_h}{f} \right)^{\frac{1}{\sigma-1}} \left[ \left( \frac{c_h}{c_l} \right)^{1-\sigma} - 1 \right]^{\frac{1}{1-\sigma}} \tag{7}
\]

where the relative unit cost $\frac{c_h}{c_l}$ is a function of input and capital goods tariffs and the complementarity parameter between imported inputs and capital goods, as determined in equation (4).

16 Firms’ technology adoption decision takes place after they discover their productivity draw. There is no other uncertainty or additional time discounting apart from the probability of exit ($\delta$). Thus, firms are indifferent between paying the one-time investment cost $F_h$ or paying the amortized per-period portion of this cost in every period, $f_h = \delta F_h$.

The sorting of firms by technology status depends on the relationship between fixed costs of production, fixed costs of technology adoption, and variable costs of importing intermediate inputs and capital goods. If fixed costs of adopting the high technology are lower than fixed production costs, all firms will use the high technology. The parameter condition that ensures that $\varphi_h^{*} > \varphi_l^{*}$ is given by $f_h > f \left[ \left( \frac{c_h}{c_l} \right)^{1-\sigma} - 1 \right]$.

We are interested in determining how changes in input tariffs affect firms’ decision to upgrade technology depending on their productivity levels. This question can be answered by investigating the impact of input-tariff changes on the high-technology productivity cutoff $\varphi_h^{*}$. Equation (7) shows that input tariffs affect the high-technology productivity cutoff through a direct effect captured by the relative unit costs of high technology vis-a-vis low technology and through an indirect effect captured by the impact of input tariffs on the survival productivity cutoff $\varphi_l^{*}$. Hence, to determine the high-technology productivity cutoff, we need to solve first for the equilibrium level of the survival productivity cutoff. This is done in the next section.
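The link between the two cutoffs and the sorting condition can be made concrete with a short calculation. The sketch below computes the ratio of the high-technology cutoff to the survival cutoff from the ratio of fixed costs and the relative unit cost, following equation (7), and checks the indifference condition of equation (6) for the marginal adopter. The demand index is again treated as given, and all names and values are hypothetical.

```python
def revenue_gain(rel_cost, sigma):
    """Proportional revenue gain from adopting high technology:
    (c_h / c_l)^(1 - sigma) - 1, positive whenever c_h < c_l."""
    return rel_cost ** (1.0 - sigma) - 1.0


def cutoff_ratio(fh_over_f, rel_cost, sigma):
    """Ratio of the high-technology cutoff to the survival cutoff, equation (7)."""
    gain = revenue_gain(rel_cost, sigma)
    if gain <= 0.0:
        raise ValueError("high technology must lower unit costs (c_h < c_l)")
    return (fh_over_f / gain) ** (1.0 / (sigma - 1.0))


def sorting_holds(fh_over_f, rel_cost, sigma):
    """True when only a subset of surviving firms adopts high technology,
    that is, when f_h > f * [(c_h / c_l)^(1 - sigma) - 1]."""
    return fh_over_f > revenue_gain(rel_cost, sigma)


if __name__ == "__main__":
    sigma, rel_cost = 4.0, 0.9   # hypothetical values
    for fh_over_f in (1.0, 0.2):
        print(fh_over_f, cutoff_ratio(fh_over_f, rel_cost, sigma),
              sorting_holds(fh_over_f, rel_cost, sigma))
```

When the sorting condition fails, the computed ratio drops below one, so every surviving firm would find adoption worthwhile.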
Industry equilibrium

Two conditions determine the equilibrium value of $\varphi_l^{*}$: the free entry condition (FE) and the zero cutoff profit condition (ZCP).17 The FE condition represents a relationship between the average profits and the low-technology productivity cutoff level, where the average profits are an increasing function of the cutoff. In equilibrium, where entry is unrestricted, the net value of entry is equal to zero. Once firms pay the sunk entry costs, entrants draw their productivity from a known Pareto distribution with density $g(\varphi) = k \frac{\varphi_{min}^{k}}{\varphi^{k+1}}$, where $\varphi_{min} > 0$ is the lower bound of the support of the productivity distribution and $k$ is a shape parameter. The Pareto cumulative distribution function is $G(\varphi) = 1 - \left( \frac{\varphi_{min}}{\varphi} \right)^{k}$.18

\[
\tilde{\pi} = \frac{\delta f_e}{1 - G(\varphi_l^{*})} = \delta f_e \left( \frac{\varphi_l^{*}}{\varphi_{min}} \right)^{k} \quad \text{(FE)} \tag{8}
\]

Under the ZCP condition, average profits are a decreasing function of the cutoff.

\[
\tilde{\pi} = \rho_l \, \pi_l(\tilde{\varphi}_l) + \rho_h \, \pi_h(\tilde{\varphi}_h) \quad \text{(ZCP)} \tag{9}
\]

where $\tilde{\varphi}_l$ and $\tilde{\varphi}_h$ correspond to the average productivity levels of firms producing with low and high technology, which depend on the productivity cutoff levels. $\rho_h = \frac{1 - G(\varphi_h^{*})}{1 - G(\varphi_l^{*})} = \left( \frac{\varphi_h^{*}}{\varphi_l^{*}} \right)^{-k}$ and $\rho_l = 1 - \rho_h$ represent the ex-ante probabilities of using high and low technology.

Combining the FE (equation [8]) and ZCP conditions (equation [9]), we can solve for the equilibrium survival productivity cutoff. Derivations are detailed in the appendix:

\[
\left( \varphi_l^{*} \right)^{k} = \frac{\sigma-1}{k - (\sigma-1)} \left[ \frac{ f + \left( \left[ \left( \frac{c_h}{c_l} \right)^{1-\sigma} - 1 \right] \frac{f}{f_h} \right)^{\frac{k}{\sigma-1}} f_h }{ \delta f_e } \right] \varphi_{min}^{k} \tag{10}
\]

where $k > \sigma - 1$ and the relative unit cost $\frac{c_h}{c_l}$ is a function of input tariffs $\tau_m$, capital goods tariffs $\tau_k$, and the complementarity parameter $\gamma_h$, as determined in equation (4). In this model, the equilibrium productivity cutoff $\varphi_l^{*}$ is a function of the input tariffs, the fixed production and high-technology costs, and the complementarity technology parameter.

17 All aggregate variables are defined in the appendix.

18 Assuming that productivity draws are Pareto distributed implies that firm size and variable profits are also Pareto distributed with a shape parameter $k/(\sigma-1)$. The condition for average variable profits to be finite is that $k > \sigma - 1$. Axtell (2001) provides empirical evidence that the Pareto distribution is a good approximation of the firm size distribution.

IV. Input-Trade Liberalization and Technology Upgrading

Theoretical mechanisms

This simple model yields two main predictions related to the determinants of the probability of importing capital goods. The probability of adopting high technology embodied in imported capital goods is determined by the relationship between the two productivity cutoffs defined in equation (7): $\rho_h = \left( \varphi_h^{*} / \varphi_l^{*} \right)^{-k}$. This equation shows that the probability of upgrading technology is a function of fixed production costs, fixed costs of high technology, input and capital goods tariffs, and the complementarity parameter. Input tariff cuts increase the likelihood that firms upgrade to high technology.

Proposition 1: The probability of adopting high technology by importing capital goods $\rho_h$ is a decreasing function of the input tariff: $\partial \rho_h / \partial \tau_m < 0$.

Using equation (7), we can express this probability as a function of the relative unit cost of high technology, which depends on input tariffs: $\rho_h = \left( f_h / f \right)^{-\frac{k}{\sigma-1}} \left[ \left( c_h / c_l \right)^{1-\sigma} - 1 \right]^{\frac{k}{\sigma-1}}$. From equation (4), we know that $\partial \left( \frac{c_h}{c_l} \right) / \partial \tau_m > 0$ since $0 < \alpha < 1$ and $\gamma_h > 1$; thereby, $\partial \rho_h / \partial \tau_m < 0$, since $\sigma > 1$.19

This model also predicts a heterogeneous effect of input-trade liberalization on firms’ technology choice. The assumptions of firm heterogeneity and fixed costs of high-technology adoption imply that those firms that will be able to benefit from input-trade liberalization are the most productive firms using low technology before input-tariff cuts.
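Proposition 1 and the closed-form survival cutoff can be explored numerically. The sketch below codes the relative unit cost as a function of tariffs, the adoption probability implied by the two cutoffs under Pareto draws, and the equilibrium survival cutoff obtained by equating free entry and average profits. The average-profit expression is our own reading of the closed form under the Pareto assumption, and all parameter values are hypothetical.

```python
def relative_cost(tau_m, tau_k, gamma_h, alpha, eta, beta):
    """Relative unit cost c_h / c_l as a function of tariffs, equation (4)."""
    e = alpha / (1.0 - alpha)
    ces_term = (tau_m ** e + 1.0) / (tau_m ** e + gamma_h ** e)
    return tau_k ** (beta * (1.0 - eta)) * ces_term ** (eta / e)


def adoption_probability(fh_over_f, rel_cost, sigma, k):
    """Probability of using high technology: (phi_h / phi_l)^(-k)."""
    gain = rel_cost ** (1.0 - sigma) - 1.0
    if not 0.0 < gain < fh_over_f:
        raise ValueError("requires c_h < c_l and the sorting condition")
    return (fh_over_f / gain) ** (-k / (sigma - 1.0))


def survival_cutoff(f, f_h, f_e, delta, rel_cost, sigma, k, phi_min):
    """Equilibrium survival cutoff from free entry and average profits."""
    if k <= sigma - 1.0:
        raise ValueError("average profits are finite only if k > sigma - 1")
    share_high = adoption_probability(f_h / f, rel_cost, sigma, k)
    scale = (sigma - 1.0) / (k - (sigma - 1.0))
    average_profit = scale * (f + share_high * f_h)
    return phi_min * (average_profit / (delta * f_e)) ** (1.0 / k)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to satisfy the model's conditions.
    common = dict(tau_k=1.1, gamma_h=1.5, alpha=0.5, eta=0.6, beta=0.3)
    for tau_m in (1.3, 1.0):
        rc = relative_cost(tau_m, **common)
        print(tau_m, rc, adoption_probability(2.0, rc, 4.0, 4.0),
              survival_cutoff(1.0, 2.0, 1.0, 0.1, rc, 4.0, 4.0, 1.0))
```

In this illustration a lower input tariff reduces the relative unit cost and raises the adoption probability, in line with Proposition 1.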
Using equations (7) and (10) to determine the high-technology productivity cutoff, we know that this cutoff decreases with input-tariff reductions. Input-trade liberalization induces the highest-productivity firms producing with low technology to switch to high technology.

Proposition 2: The high-technology productivity cutoff $\varphi_h^{*}$ is an increasing function of input tariffs: $\partial \varphi_h^{*} / \partial \tau_m > 0$.

Proof. See appendix.

We focus on two testable predictions derived from propositions 1 and 2, which are in line with empirical fact 5 presented in the previous section. These testable implications are presented in the next section. Input tariff reductions also induce a selection effect of the most productive firms in this model. The least productive firms producing with low technology intensive in domestic inputs will lose competitiveness and market shares relative to high-technology firms due to input-trade liberalization. Indeed, input tariff reductions imply an increase in the relative costs of domestic inputs vis-a-vis foreign intermediate goods. This is shown formally in the appendix. Unfortunately, the Indian dataset that we exploit in the empirical analysis is not suitable to test this prediction: since firms are under no legal obligation to report to the data collecting agency, the Prowess data do not allow us to identify entry and exit of firms.

Testable implications

In the empirical analysis, we focus on firms’ technological decision to import capital equipment goods in India. The simple model presented in the previous section yields two testable implications on the relationship between changes in input tariffs and firms’ decision to upgrade technology embodied in foreign capital goods. Input tariff cuts imply a reduction of the relative costs of foreign inputs vis-a-vis domestic ones.
Taking into account that the high technology embodied in imported capital goods is biased toward imported inputs and the substitutability between intermediate goods, input-trade liberalization in this framework enhances the cost advantage of high-technology firms. Thus, input tariff cuts reduce the relative unit costs of using high technology, increasing the profits of high-technology firms relative to low-technology firms and creating incentives to upgrade technology embodied in imported capital goods. Proposition 1 shows that input tariff reductions increase the likelihood that firms adopt the foreign high technology by importing capital goods.

[19] The model also predicts that the probability of importing capital goods is a decreasing function of capital goods tariffs. In the empirical analysis presented in the following sections, we take into account the direct role of capital goods tariffs.

Testable implication 1: Input-trade liberalization has a positive effect on firms’ decision to import capital goods.

Which firms decide to upgrade to foreign technology after input-trade liberalization? The effect of input tariff reductions on firms’ technology choice is heterogeneous across firms depending on their initial productivity level $\varphi$. Proposition 2 shows that the high-technology productivity cutoff $\varphi_h^*$ decreases with input tariff reductions. Figure 1 illustrates the impact of input-trade liberalization on firms’ technology choice for firms with different productivity levels. Input tariff cuts reduce the high-technology productivity cutoff, allowing the most productive firms producing with domestic low technology before input-trade liberalization to upgrade their technology embodied in imported capital goods ($\varphi_h^{*\prime} < \varphi < \varphi_h^*$).
These firms will experience an increase in the expected profits from high technology due to input tariff reductions, which allows them to cover the fixed technology adoption costs.

Figure 1. Heterogeneous Effect of Input-Trade Liberalization on Firms’ Technology Choice (panel A: before input liberalization; panel B: after input liberalization)

Testable implication 2: The effect of input-trade liberalization is heterogeneous across firms. The firms that will benefit from input tariff cuts to upgrade to foreign technology embodied in imported capital goods are firms in the middle range of the productivity distribution.

In the following sections, we test these empirical implications using the episode of India’s trade liberalization at the beginning of the 1990s.

V. Empirical Analysis

Data

The Indian firm-level dataset is compiled from the Prowess database by the Centre for Monitoring the Indian Economy (CMIE).[20] This database contains information from the income statements and balance sheets of listed companies comprising more than 70 percent of the economic activity in the organized industrial sector of India. Collectively, the companies covered in Prowess account for 75 percent of all corporate taxes collected by the government of India. The database is thus representative of large and medium-sized Indian firms. As previously mentioned, this dataset has already been used in several studies on the performance of Indian firms.[21]

[20] The CMIE is an independent economic center of India that provides services of primary data collection through analytics and forecasting. Further information can be found at http://www.cmie.com/.

The dataset covers the period 1989–1997, and the information varies by year.
It provides quantitative information on sales, capital stock, income from financial and non-financial sources, consumption of raw material and energy, compensation to employees (wage bill), and ownership group.[22] This dataset allows us to estimate firm total factor productivity (TFP) using the Levinsohn and Petrin (2003) methodology. The Prowess database provides detailed information on imports by category of goods: finished goods, intermediate goods, and capital goods. In our main empirical specification, we use imports of capital goods (machinery and equipment) to measure foreign technology. We are not able to test directly for the impact of imported capital goods depending on the country of origin (e.g., developed vs. developing countries), since the Prowess dataset does not include the origin country of imported goods. However, one realistic assumption for a developing country like India is that most imports of capital goods are sourced from more advanced economies and are thus a good proxy for a modern, high technology. India’s imports of capital goods at the HS6 product level by country of origin reveal that about 70 percent of these imports came from developed countries in the period 1989–1997.[23]

Input-trade liberalization might also allow firms to access high-quality inputs. Using detailed firm-product level data for Colombia, Kugler and Verhoogen (2009) compare the prices of domestic and imported inputs and provide evidence that higher-quality inputs may be relatively more available internationally. Due to data constraints, we are not able to look at the effects of input liberalization on quality upgrading. The Prowess database does not provide any information on quantities with which to compute unit values as a proxy for the quality of intermediate goods.
Although we cannot observe the quality of intermediate goods, for imported capital goods we can infer that they are more advanced or of higher quality than domestic capital equipment goods produced in India, since most imports of capital goods come from developed economies.

Our sample contains information for around 3,744 firms in organized industrial activities in the manufacturing sector for the period 1990–1997. Since we lag the control variables, our estimating sample starts in 1990, and the total number of firm-year observations is 14,425. In order to keep a constant sample throughout the paper and to establish the stability of the point estimates, we keep firms that report information on all the firm- and industry-level control variables. Although our panel of firms is unbalanced, there is no statistical difference in the average firm characteristics between the initial year and the final year of our sample.

[21] See Topalova and Khandelwal (2011), Topalova (2004), Goldberg et al. (2010), Goldberg et al. (2009), Alfaro and Chari (2009), DeLoecker et al. (2016).

[22] Variables are deflated with industry-specific wholesale price indices from India’s national accounts statistics.

[23] We used the BACI database provided by the CEPII as well as the Broad Economic Categories (BEC) classification of HS6 products by intermediates, capital goods, and consumption goods.

Input-tariff data

To identify the impact of input-trade liberalization on firms’ foreign technology choice, we use input tariffs at the three-digit NIC industry level. Tariff data are provided by WITS (World Bank) and correspond to India’s effectively applied most favored nation (MFN) import tariffs with respect to the rest of the world at the ISIC (rev 2) industry level.[24] In order to identify the effect of input tariff changes on firms’ decision to import capital goods, we construct different tariff measures for capital goods and for variable intermediate goods. In this sense, we depart from previous studies on input-trade liberalization by considering both variable inputs and capital goods in the construction of input tariffs. This methodology allows us to disentangle the indirect effects of tariffs on intermediate goods on firms’ decision to import capital goods from the direct effects of tariffs on capital goods.

[24] We use correspondence tables to convert tariffs into ISIC rev 3.1, which matches almost perfectly with the NIC three-digit classification. This dataset is available at http://wits.worldbank.org/wits/.

For each three-digit industry, s, we generate a capital goods tariff as the weighted average of tariffs on the capital goods used in the production of final goods of that three-digit industry, where the weights reflect the share of capital goods of the final goods industry in total expenditures on capital goods using India’s input-output matrix in 1993. We rely on fixed input weights and a pre-sample year input-output matrix to avoid possible endogeneity concerns between variations in input weights and industry and firm performance. Using a disaggregated input-output matrix, 14 out of a total of 52 industries are classified as capital goods.[25] Similarly, for each industry, s, we generate an input tariff as the weighted average of tariffs on all the other intermediate goods (excluding capital goods) used in the production of final goods of that industry, where the weights reflect the input industry’s share of the output industry’s total expenditures on other inputs using India’s input-output matrix in 1993.

We compute input (capital goods) tariffs as $\tau_{st} = \sum_z a_{zs} \tau_{zt}$, where $a_{zs}$ is the value share of input (capital) z in the production of output in the three-digit industry s.
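The weighting scheme can be sketched in a few lines. The function below is a minimal illustration of the formula rather than the authors’ code: the function name and input names are hypothetical, while the tariff rates and value shares are the ones used in the illustration in the text.

```python
def industry_input_tariff(tariffs, shares):
    """Weighted average of input tariffs: tau_st = sum_z a_zs * tau_zt.

    `tariffs` maps each input z to its tariff (in percent) and
    `shares` maps each input z to its value share a_zs in industry s.
    """
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("value shares should sum to one")
    return sum(shares[z] * tariffs[z] for z in shares)


# Hypothetical three-input industry; input names are made up.
tariffs = {"input_1": 5.0, "input_2": 10.0, "input_3": 15.0}
shares = {"input_1": 0.10, "input_2": 0.30, "input_3": 0.60}
print(industry_input_tariff(tariffs, shares))
```

Because the shares come from a fixed pre-sample input-output matrix, only the tariff dictionary would change from year to year in an application of this kind.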
Take, for example, an industry that uses three different intermediate goods in the production of a final good. Suppose that the intermediate goods face tariffs of 5, 10, and 15 percent and have value shares of 0.10, 0.30, and 0.60, respectively. Using this methodology, the input tariff for this industry is 12.5 percent (5 × 0.10 + 10 × 0.30 + 15 × 0.60).

Trade liberalization in India

The main feature of trade reform in India was the substantial trade-integration process experienced in the 1990s. In this section, we describe India’s trade liberalization process and the trade-policy instruments that were applied.

India’s trade policy during the 1970s and 1980s was characterized by the license raj. This trade system was grounded in trade protection policies with an emphasis on import substitution. It was very restrictive, with high levels of nominal tariffs and import licenses in almost all sectors.

A unilateral trade-reform plan was launched in the early 1990s as a consequence of the debt crisis and as part of an IMF program. Trade liberalization was at the core of the structural reforms launched during the “Eighth Five-Year Plan” period from 1992 to 1997. Under this plan, gradual tariff cuts were applied in all sectors at the same time that non-tariff barriers and licenses were removed. As Topalova and Khandelwal (2011) emphasize, after 1997 tariff changes were not as uniform, and the issue of potential endogeneity of trade protection might be present in the period of the “Ninth Five-Year Plan.” For this reason, we restrict our analysis to the 1989–1997 period.

During this period, India also became a member of the WTO (World Trade Organization), in 1995. One of India’s commitments when it decided to join the WTO was to continue the process of trade liberalization started in the early 1990s. From 1995, India started implementing Uruguay Round commitments, which were completed in 2005 (see India’s Trade Policy Review by the WTO in 2007).
Average input tariffs declined by 27 percentage points during the period, while capital goods tariffs were reduced by only ten percentage points. This descriptive evidence suggests that changes in variable input and capital goods tariffs were heterogeneous. They were also weakly correlated.[26]

[25] Capital goods industries are tractors and agriculture machinery, industrial machinery, industrial machinery (others), office computing machines, other non-electrical machinery, electrical industrial machinery, communication equipment, other electrical machinery, electronic equipment, ships and boats, rail equipment, motor vehicles, motorcycles, and other transport equipment.

[26] The correlation between average output tariffs and input tariffs is 0.45 and between output and capital goods tariffs is 0.01.

There is also significant variation in movements in input tariffs by industry over the 1989–1997 period. At the two-digit industry level, the industries that experienced the greatest input tariff cuts are cloth, plastic, machinery, wood, and paper (figure A1 in the online appendix). At the more disaggregated three-digit industry level, there is even more variation in input tariffs.

Exogenous input-tariff variations

One of the challenges in investigating the relationship between input-tariff reductions and firms’ decisions to upgrade foreign technology embodied in imported capital goods is potential reverse causality between tariff changes and firms’ import choices, which would bias our estimates.[27] In this case, changes in input tariffs could reflect some omitted industry characteristics. One way of addressing this issue is to test whether tariff changes are exogenous to initial industry and firm characteristics.
Similar to previous work analyzing the effects of trade liberalization on different firm performance measures (Topalova and Khandelwal 2011), we first regress changes in input tariffs on a number of industry characteristics computed as the size-weighted average of firms’ characteristics in the initial year of our sample. Table A2 in the online appendix shows the coefficients on the change in input tariffs and capital goods tariffs (1989–1997) from industry-level regressions of initial industry characteristics (sales, capital stock, wage bill, imports of intermediates, and capital goods) on these tariff changes. The estimates confirm that input-tariff changes between 1989 and 1997 were uncorrelated with initial industry-level outcomes in 1989. As such, it seems unlikely that firms producing in industries with greater input-tariff cuts were able to lobby for these lower tariffs.

Next, following the analysis of Goldberg et al. (2010), we provide additional evidence that input tariff changes between 1989 and 1997 were uncorrelated with the initial firm performance measures in 1989 that we consider in this analysis. Table A3 in the online appendix shows estimates from regressing firm characteristics in 1989, such as the importer status, the share of imported capital goods over total sales, the logarithm of capital stock, and firm TFP, on the variation in input tariffs and capital goods tariffs across industries between 1989 and 1997. Had the government targeted specific firms or industries during trade liberalization, we would expect tariff changes to be correlated with initial firm performance. However, the correlation is insignificant.

This evidence suggests that the government did not take into account pre-reform trends in firms’ imports of capital goods and other performance measures when deciding to reduce tariffs during the trade reform at the beginning of the 1990s.
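The logic of these exogeneity checks can be sketched as a bivariate regression of an initial characteristic on the subsequent tariff change. The sketch below is an illustration under stated assumptions, not a replication of tables A2 and A3: the data are made up, the function name is hypothetical, and the standard error uses the conventional homoskedastic formula.

```python
import math


def ols_slope(x, y):
    """Bivariate OLS of y on x with an intercept.

    Returns (slope, standard error) using the conventional
    homoskedastic formula.
    """
    n = len(x)
    if n < 3 or n != len(y):
        raise ValueError("need at least three paired observations")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope, math.sqrt(rss / (n - 2) / sxx)


# Hypothetical industry-level data: tariff change 1989-1997 (as a fraction)
# and an initial-year characteristic (log sales in 1989).
tariff_change = [-0.35, -0.20, -0.30, -0.15, -0.25, -0.40]
log_sales_1989 = [4.1, 3.8, 4.5, 4.0, 3.6, 4.2]
slope, se = ols_slope(tariff_change, log_sales_1989)
print(f"slope = {slope:.3f}, s.e. = {se:.3f}, t = {slope / se:.2f}")
```

A small t-statistic in a check of this kind is what the text describes as tariff changes being uncorrelated with initial outcomes.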
Input tariff cuts and firm decision to import capital goods

Using specific tariffs on inputs (different from capital goods tariffs), we investigate the relationship between the availability of imported intermediate goods and firms’ decision to upgrade foreign technology embodied in imported capital goods. To test the first implication of the model, we estimate the probability that firm i imports capital goods in year t using the following linear probability model:

$$\text{Importer}(k)_{ist} = \gamma_1\, \text{Input}\,\tau_{s,t-1} + \gamma_2 Z_{s,t-1} + \gamma_3 X_{i,t-1} + \mu_i + \upsilon_t + \varepsilon_{ist} \qquad \text{(I)}$$

Here $\text{Importer}(k)_{ist}$ is a dummy variable for firm i producing in industry s having positive imports of capital goods in year t. $\text{Input}\,\tau_{s,t-1}$ represents the input tariffs of industry s in year t − 1. We have already shown that we rely on exogenous changes in tariffs that are not correlated with initial firm or industry characteristics. Moreover, we use lagged tariff values to ensure that contemporaneous firms’ decisions cannot affect past values of tariffs. $Z_{s,t-1}$ is a set of industry-level control variables and $X_{i,t-1}$ is a set of firm-level observable characteristics varying over time. All specifications include firm fixed effects, $\mu_i$, that take into account unobservable and time-invariant firm characteristics, and year fixed effects, $\upsilon_t$, that control for macroeconomic shocks affecting all firms and industries in the same way. Since tariffs vary at the three-digit industry level over time, the errors are corrected for clustering at the three-digit industry level.

[27] Karacaovali (2011) shows theoretically and empirically how productivity at the industry level could affect tariff rates at the sectoral level.

As discussed above, input-tariff changes are not correlated with either initial firm characteristics or industry characteristics during the period 1989–1997.
To deal with additional concerns of reverse causality and omitted variables, we introduce different control variables at the industry level that may affect firms’ decisions to import capital goods and could be correlated with input-tariff changes. The $\gamma_1$ coefficient on input tariffs might otherwise simply be picking up the effects of variations in tariffs on capital goods. The simple model presented in the previous section also predicts that the probability of importing capital goods is a decreasing function of capital goods tariffs. Hence, we first include India’s import tariffs on capital goods to capture the direct effects of variations in tariffs affecting capital equipment products on firms’ decision to import those capital goods. Second, all specifications also include tariffs on final goods. This variable captures foreign competition pressures. Finally, we also include a Herfindahl index at the sectoral level to control for domestic competition. Note that, in order to keep the theoretical framework simple and rationalize the effects of input tariffs on foreign technology adoption, the model abstracts from these competition channels. They should, however, be included in the empirical estimation to avoid omitted variable concerns.

Next, we also explicitly take into account changes in observable firm characteristics that could affect firms’ import patterns. Using the same dataset, Bas and Berthou (2012) found evidence of a positive correlation between firms’ decision to import capital goods and firms’ capital intensity. We therefore expect that non-importing Indian firms that experienced significant growth in their capital intensity during the period under analysis were more likely to import capital goods. $X_{i,t-1}$ is a set of firm-level controls such as firms’ capital intensity and the age of the firm.
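The estimation strategy behind equation (I) can be sketched with a within (two-way demeaning) estimator and industry-clustered standard errors. This is a simplified sketch under loud assumptions, not the authors’ code: it uses a single regressor, a simulated balanced panel in which every number is hypothetical, no additional controls, and a cluster sandwich variance without small-sample correction. All function and variable names are illustrative.

```python
import random
from collections import defaultdict


def demean_two_way(rows, key):
    """Subtract firm and year means and add back the grand mean.

    This removes firm and year fixed effects exactly only in a
    balanced panel, which is what the simulated data below are.
    """
    firm_sum, firm_n = defaultdict(float), defaultdict(int)
    year_sum, year_n = defaultdict(float), defaultdict(int)
    for r in rows:
        firm_sum[r["firm"]] += r[key]
        firm_n[r["firm"]] += 1
        year_sum[r["year"]] += r[key]
        year_n[r["year"]] += 1
    grand = sum(r[key] for r in rows) / len(rows)
    return [
        r[key]
        - firm_sum[r["firm"]] / firm_n[r["firm"]]
        - year_sum[r["year"]] / year_n[r["year"]]
        + grand
        for r in rows
    ]


def lpm_within(rows):
    """Within estimate of the tariff coefficient and its clustered s.e."""
    y = demean_two_way(rows, "imports_k")
    x = demean_two_way(rows, "input_tariff_lag")
    sxx = sum(v * v for v in x)
    beta = sum(a * b for a, b in zip(x, y)) / sxx
    # Sandwich variance with industries as clusters (no small-sample correction).
    score = defaultdict(float)
    for r, xv, yv in zip(rows, x, y):
        score[r["industry"]] += xv * (yv - beta * xv)
    variance = sum(s * s for s in score.values()) / sxx ** 2
    return beta, variance ** 0.5


# Simulated balanced panel; every number here is hypothetical.
rng = random.Random(0)
rows = []
for firm in range(200):
    industry = firm % 10
    firm_effect = rng.uniform(0.2, 0.5)
    for year in range(1990, 1998):
        # Lagged tariff: falls faster in some industries than in others.
        tariff_lag = 0.8 - 0.01 * (industry + 1) * (year - 1 - 1989)
        prob = firm_effect - 0.15 * tariff_lag + 0.01 * (year - 1990)
        rows.append({
            "firm": firm, "industry": industry, "year": year,
            "input_tariff_lag": tariff_lag,
            "imports_k": 1.0 if rng.random() < prob else 0.0,
        })

beta, se = lpm_within(rows)
print(f"input tariff coefficient: {beta:.3f} (clustered s.e. {se:.3f})")
```

With an unbalanced panel such as Prowess, simple double demeaning is no longer exact, and a dedicated fixed-effects routine would be the safer choice.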
The Prowess dataset contains the year of creation of the firm, which allows us to compute the age of the firm.[28]

Table 3 shows the estimation results for equation (I) using a within-firm estimator. These results show the impact of lower input tariffs on the decision to import capital goods. In column (1), the coefficient on the input tariffs is negative and significant at the 1 percent level, indicating that the drop in input tariffs between 1989 and 1997 increased the probability of importing capital goods. The estimated input tariff coefficient is robust to the inclusion of the MFN tariffs for final goods set by India (column [2]). We also introduce tariffs on capital goods to be sure that the input tariffs are not just capturing the effect of changes in direct tariffs on imported capital equipment products (column [2]). Not surprisingly, reductions in tariffs on capital goods enhance the probability of upgrading foreign technology embodied in imported capital goods. More interestingly, the inclusion of capital goods tariffs does not pick up the indirect effect of reductions of tariffs on intermediate inputs. We next include additional industry- and firm-level variables to control for industry- and firm-observable characteristics that vary over time and that could be related to input tariffs. The coefficient of interest on the input tariff is robust and stable when we control for domestic competition measured by the Herfindahl index (column [3]) and for the age of the firm and firm capital intensity (column [4]). The coefficient on input-tariff changes remains negative, significant, and stable. It is very similar in size to the estimate shown in column (1).

[28] The Prowess dataset does not report consistent information on the number of employees.

Table 3.
Input-Tariff Liberalization and Firms’ Decision to Import Capital Goods

Dependent variable: dummy equal to one if firm i imports capital goods in t.

| | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| Input tariff(s)(t-1) | −0.166*** | −0.149** | −0.152** | −0.152** | −0.151** | −0.012 |
| | (0.056) | (0.060) | (0.060) | (0.060) | (0.069) | (0.078) |
| Input tariff(s)(t-1) × Imported inputs > 0 | | | | | | −0.170** |
| | | | | | | (0.066) |
| Capital goods tariff(s)(t-1) | | −0.165** | −0.168** | −0.170** | −0.168* | −0.173** |
| | | (0.073) | (0.073) | (0.073) | (0.086) | (0.083) |
| Output tariff(s)(t-1) | | −0.057 | −0.057 | −0.053 | −0.026 | −0.030 |
| | | (0.047) | (0.047) | (0.047) | (0.041) | (0.040) |
| Herfindahl index(s)(t-1) | | | 0.044 | 0.045 | 0.022 | 0.024 |
| | | | (0.049) | (0.049) | (0.059) | (0.059) |
| Age(t-1) | | | | −0.004 | −0.016 | −0.011 |
| | | | | (0.028) | (0.033) | (0.034) |
| Capital intensity(t-1) | | | | 0.013** | 0.014** | 0.015* |
| | | | | (0.006) | (0.007) | (0.007) |
| Imported inputs > 0 | | | | | 0.333*** | 0.405*** |
| | | | | | (0.028) | (0.041) |
| Firm fixed effects | Yes | Yes | Yes | Yes | Yes | Yes |
| Year fixed effects | Yes | Yes | Yes | Yes | Yes | Yes |
| Observations | 14,680 | 14,680 | 14,680 | 14,680 | 14,680 | 14,680 |
| R-squared | 0.036 | 0.036 | 0.036 | 0.036 | 0.093 | 0.094 |

Notes: The dependent variable is a dummy for firm i having positive imports of capital goods in year t. Output tariff(s)(t-1) are MFN applied tariffs from the WITS-WB dataset at the three-digit industry level, and input and capital goods tariffs are constructed separately using these output tariffs and India’s 1993 input-output matrix. Imported inputs > 0 is a dummy equal to one if the firm imports intermediate goods. The Herfindahl index measures the concentration of sales in the industry. Capital intensity is measured by the capital stock over sales of the firm. The Prowess dataset reports the year of creation of the firm, which allows us to construct the age of the firm. Heteroskedasticity-robust standard errors are reported in parentheses. Errors are corrected for clustering at the three-digit industry level.
***, **, and * indicate significance at the one, five, and ten percent levels, respectively.

If the availability of foreign intermediate goods induces firms to start importing capital goods, we would expect the effect of lower input tariffs to be greater for firms that actually import intermediate inputs. Columns (5) and (6) carry out this test. First, we include a dummy variable equal to one if the firm imports intermediate goods. Firms sourcing inputs from abroad are more likely to also import capital goods (column [5]), confirming the previous descriptive evidence on technological complementarity between imported inputs and foreign technology.[29] Next, we introduce an interaction between the input tariff and the importer of intermediate goods status (column [6]).

Comparing the coefficients of tariffs on capital goods with those on intermediate goods (column [6]) reveals that the indirect effect of input tariff cuts on the probability of upgrading capital goods is of a similar magnitude to the direct effect of reducing capital goods tariffs. The estimated coefficient on variable input tariffs implies that a ten percentage point fall in input tariffs leads to a 1.5 to almost 1.7 percentage point increase in the probability of importing capital goods, for the average firm and for those actually importing intermediate goods, respectively. Between 1989 and 1997, input tariffs declined, on average, by 27 percentage points, with an associated implied increase in the probability of importing capital goods of about 4.6 percentage points for the average firm importing intermediate goods. These findings suggest that the additional gains from input-trade liberalization thanks to the complementarity channel of imported inputs and capital goods are non-negligible.

[29] As previously mentioned, unfortunately the Indian firm-level dataset does not allow us to test the quality upgrading channel of imported inputs or to distinguish the country of origin of imports of intermediate goods in order to provide a better assessment of the complementarity mechanism.

The heterogeneous effects of input tariff cuts

The simple model presented in section III shows that input-trade liberalization affects firms differently according to their initial productivity. Most firms with a high productivity level might already import capital goods before input tariff cuts, while the least productive firms might not be able to afford the fixed cost of importing capital goods despite input tariff changes. The model predicts that firms using low technology before the reform that have a productivity level close to the high-technology productivity cutoff will benefit from input tariff reductions, which allow them to cover the sunk costs of importing capital goods. We explore in this section whether the impact of input-tariff changes on firms’ decision to import capital goods depends on previous firm productivity.

To investigate the heterogeneous effect of input-trade liberalization on firms’ decision to import capital goods, we introduce interactions between input-tariff changes and quantiles of firms’ TFP in the initial year of the sample. We rely on firms’ initial TFP to avoid potential endogeneity issues between firm performance and imports of capital goods. Firms are divided into three initial TFP quantiles: the first represents the least productive firms (those with an initial TFP lower than the 33rd percentile), the second covers firms in the middle range of initial productivity (between the 33rd and 66th percentiles), and the last represents the firms with high initial productivity (with an initial TFP higher than the 66th percentile).[30] We then interact the input tariff with the firms’ initial TFP quantiles.
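The grouping and interaction step described above can be sketched as follows. This is an illustrative sketch only: the firm identifiers, TFP values, and tariff value are hypothetical, the percentile rule (linear interpolation between order statistics) is an assumption, and the function names are not taken from the paper.

```python
def percentile(sorted_values, p):
    """Percentile by linear interpolation between order statistics."""
    pos = p * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def tfp_group(tfp, low_cut, high_cut):
    """Assign a firm to the low, medium, or high initial-TFP group."""
    if tfp < low_cut:
        return "low"
    if tfp <= high_cut:
        return "medium"
    return "high"


def interaction_terms(input_tariff, group):
    """Input tariff interacted with each group dummy (zero outside own group)."""
    return {g: (input_tariff if g == group else 0.0)
            for g in ("low", "medium", "high")}


# Hypothetical initial-year TFP for six firms.
initial_tfp = {"A": 0.8, "B": 1.1, "C": 1.5, "D": 2.0, "E": 2.6, "F": 3.3}
ordered = sorted(initial_tfp.values())
low_cut, high_cut = percentile(ordered, 0.33), percentile(ordered, 0.66)
groups = {firm: tfp_group(v, low_cut, high_cut) for firm, v in initial_tfp.items()}
print(groups)
print(interaction_terms(0.45, groups["C"]))
```

Because group membership is fixed at its initial-year value, the group dummies themselves are absorbed by firm fixed effects and only the interactions vary over time.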
We estimate the following linear probability model for the decision to import capital goods:

$$\text{Importer}(k)_{ist} = \sum_{q=1}^{3} \chi_q \left( \text{Input}\,\tau_{s,t-1} \times Q^q_{is} \right) + \gamma_2 Z_{s,t-1} + \gamma_3 X_{i,t-1} + \mu_i + \upsilon_t + \varepsilon_{ist} \qquad \text{(II)}$$

Here $\text{Importer}(k)_{ist}$ is a dummy variable for firm i in three-digit industry s having positive imports of capital goods in year t. Firms are classified into three groups of initial TFP indexed by q: $Q^1_{is}$ is a dummy variable for firm i belonging to the group of the least productive firms, and so on. $\text{Input}\,\tau_{s,t-1} \times Q^q_{is}$ are the interaction terms between the three groups of firms’ TFP and the input tariff. We include the same industry-level (output tariffs, capital goods tariffs, and Herfindahl index) and firm-level (age, capital intensity, and the intermediate goods importer status) controls as in the previous estimations. The dummy variables for each group of firm initial TFP are excluded from the estimation since they are collinear with the firm fixed effects.

The estimation results for equation (II) are presented in table 4. Note that, in this specification, we restrict the sample to firms that are present in the initial year, and so the number of observations is reduced. Column (1) reports as a benchmark the baseline estimates on the sample of firms that are present in the initial year. Columns (2) to (4) introduce the interaction terms between input tariffs and firms’ initial TFP quantiles. The impact of input tariffs on the probability of importing capital goods is only significant for firms in the middle range of the initial productivity distribution. This result is consistent with the predictions of our model. Since firms face fixed sunk costs of importing capital goods, only those firms that were not importing capital goods before the input-tariff reform and that are productive enough to pay the fixed costs of importing are able to import capital goods thanks to the reduction of input tariffs.
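The magnitudes quoted for tables 3 and 4 appear to follow from scaling a point estimate by a tariff change. The sketch below reproduces that arithmetic under the assumption that tariffs enter the regressions as fractions, so that a coefficient multiplied by a cut in percentage points gives a change in the import probability in percentage points. The function name is hypothetical; the coefficients are the ones reported in the tables.

```python
def implied_change(coefficient: float, tariff_cut_pp: float) -> float:
    """Implied change in the import probability, in percentage points,
    for a tariff cut of `tariff_cut_pp` percentage points."""
    return -coefficient * tariff_cut_pp


# Coefficients as reported in tables 3 and 4; tariff cuts from the text.
cases = {
    "table 3, average firm, 10 pp cut": (-0.152, 10),
    "table 3, importers of inputs, 10 pp cut": (-0.170, 10),
    "table 3, importers of inputs, 27 pp cut": (-0.170, 27),
    "table 4, medium initial TFP, 27 pp cut": (-0.346, 27),
}
for label, (coef, cut) in cases.items():
    print(f"{label}: {implied_change(coef, cut):.1f} pp")
```

These are back-of-the-envelope figures based on point estimates only and ignore sampling uncertainty.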
The estimated coefficient implies that the 27 percentage point fall in input tariffs during the period leads to an increase of almost 10 percentage points in the probability of importing capital goods for firms in the middle range of the initial productivity distribution.

[30] Firm TFP is estimated using the Levinsohn and Petrin (2003) methodology.

Table 4. The Heterogeneous Effects of Input-Tariff Liberalization on Firms’ Decision to Import Capital Goods

Dependent variable: dummy equal to one if firm i imports capital goods in t.

(1) (2) (3) (4)
Input tariff(s)(t-1): −0.201*
  (0.102)
Input tariff(s)(t-1) × Low initial TFP: −0.111 −0.145 −0.149
  (0.111) (0.105) (0.104)
Input tariff(s)(t-1) × Medium initial TFP: −0.315*** −0.345*** −0.346***
  (0.110) (0.104) (0.103)
Input tariff(s)(t-1) × High initial TFP: −0.043 −0.067 −0.058
  (0.164) (0.146) (0.151)
Capital goods tariff(s)(t-1): −0.181* −0.150 −0.170* −0.176*
  (0.095) (0.108) (0.096) (0.095)
Output tariff(s)(t-1): −0.040 −0.055 −0.041 −0.033
  (0.047) (0.052) (0.048) (0.051)
Imported inputs > 0: 0.344*** 0.344*** 0.344***
  (0.029) (0.029) (0.029)
Age(t-1): 0.022 0.031
  (0.057) (0.061)
Capital intensity(t-1): 0.018 0.019
  (0.012) (0.012)
Herfindahl index(s)(t-1): 0.004 0.013
  (0.067) (0.069)
Firm fixed effects: Yes Yes Yes Yes
Year fixed effects: Yes Yes Yes Yes
Observations: 7,861 7,861 7,861 7,861
R-squared: 0.087 0.038 0.088 0.088

Notes: The dependent variable is a dummy for firm i having positive imports of capital goods in year t. Input tariff(s)(t-1) are interacted with quantiles of firm TFP in the initial year of the sample. Firm TFP is estimated using the Levinsohn and Petrin (2003) methodology. All control variables are defined in table 3. Industry control variables (output tariffs, capital goods tariffs, and the Herfindahl index) are included in all specifications.
Heteroskedasticity-robust standard errors are reported in parentheses. Errors are corrected for clustering at the three-digit industry level. ***, **, and * indicate significance at the one, five, and ten percent levels, respectively.

VI. Alternative Explanations

There are other potential explanations for the incentives of Indian firms to upgrade foreign technology embodied in imported capital goods over the 1989–1997 period, with input-trade liberalization being one of them. In this section, we discuss and examine four alternative explanations: (i) other reforms that took place in India during this period; (ii) learning effects; (iii) foreign demand shocks (export channel); and (iv) firms’ financial health. First, we describe our strategies for taking these alternative factors into account in the estimations. We then present evidence showing that our previous findings remain stable when including these factors, suggesting that the input tariff cuts channel is an important factor determining firms’ foreign technology upgrading decision.

Other reforms in India

During the 1990s, India experienced structural reforms in several areas of the economy. In order to test whether the coefficient on input tariffs is picking up the effects of other reforms that took place in India, we carry out alternative sensitivity tests. Table A4 in the online appendix presents the results. The benchmark estimation presented in column (5) of table 3 is reported in column (1).

Next, we include in column (2) industry-year fixed effects to take into account all unobservable characteristics varying over time that could affect industries. In this case, only the interaction term between the input tariff and the importer of intermediate goods status variable is included.
The coefficient of the interaction term is negative and significant, and the magnitude is slightly smaller relative to the one found in the baseline specification reported in column (1).31

Since other reforms like labor market regulations were introduced at the beginning of the 1990s at the state level, we introduce region-year fixed effects to control for unobservable characteristics affecting the 21 Indian states in columns (3) and (4). As can be seen, the coefficient of interest on input tariffs and on the interaction term between input tariffs and the initial quartiles of firm TFP remain robust and stable to the inclusion of region-year fixed effects. The point estimates of input tariffs remain robust relative to the ones presented in the baseline specifications in tables 3 and 4.

Overall, these results confirm that our previous findings do not suffer from omitted variable bias related to other policy reforms that took place in India.

Learning effects

A learning-by-importing channel could also explain the relationship between input-trade liberalization and foreign technology upgrading in imports of capital goods. Firms that import intermediate inputs might learn about sourcing countries and providers, and it is easier for them to start importing capital goods with the new information acquired than it is for firms that face foreign sellers for the first time. Note that testing this mechanism directly requires further information on the country of origin of imports that is not available for the Indian dataset. We present here a test for this channel that relies on past import experience on intermediate goods as a proxy for learning effects. The previous specification is extended to include an interaction term between the intermediate goods tariff and the number of prior years in which the firm imported intermediate goods.
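The experience proxy described above can be sketched in a few lines. This is only an illustration under stated assumptions: the record layout, the function name `prior_import_experience`, and the tariff values are hypothetical and are not taken from the paper or from the Prowess dataset.

```python
def prior_import_experience(records):
    """Count, for each firm-year, the number of PRIOR years with positive input imports.

    records: iterable of (firm, year, imports_of_inputs) tuples (hypothetical layout).
    Returns a dict keyed by (firm, year).
    """
    experience = {}
    years_imported = {}  # running count of importing years per firm
    for firm, year, imports in sorted(records):  # sorts by firm, then year
        # Experience is measured before the current year's import status is added.
        experience[(firm, year)] = years_imported.get(firm, 0)
        if imports > 0:
            years_imported[firm] = years_imported.get(firm, 0) + 1
    return experience


# Hypothetical firm-year observations: (firm, year, value of imported inputs).
records = [
    ("A", 1989, 0.0), ("A", 1990, 5.0), ("A", 1991, 0.0), ("A", 1992, 2.0),
    ("B", 1989, 1.0), ("B", 1990, 1.5), ("B", 1991, 0.0),
]
# Hypothetical lagged input tariff faced in each year (one industry for simplicity).
lagged_input_tariff = {1989: 0.90, 1990: 0.85, 1991: 0.60, 1992: 0.45}

experience = prior_import_experience(records)
# Interaction term between the lagged input tariff and prior import experience.
interaction = {key: lagged_input_tariff[key[1]] * value for key, value in experience.items()}

for key in sorted(experience):
    print(key, experience[key], round(interaction[key], 3))
```

Counting only prior years keeps the proxy predetermined with respect to the current year's import decision, which is one plausible reading of "number of prior years" in the text.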
Results are presented in columns (5) and (6) of table A4 in the online appendix. The coefficient measuring past import experience is positive but not significant, and the interaction term with input tariffs is negative but also not significant. Nevertheless, this alternative channel is not picking up the effects of our main variable of interest: input tariff cuts still have a significant effect on the probability of importing capital goods on average, and it is mainly firms in the middle range of the productivity distribution that benefit from input-tariff liberalization to upgrade foreign technology in imported capital goods.32

Foreign demand shocks

In the simple theoretical framework presented in section III, we emphasize the imported input channel as the main mechanism through which trade liberalization affects firms’ decision to upgrade foreign technology. For the sake of simplicity, we did not take into account the export side of the story and the effects of trade liberalization through variations in variable trade costs affecting final goods, which are already well documented in the theoretical literature (Yeaple 2005; Bustos 2011).

Expansion of export opportunities due to foreign demand shocks might also increase the incentives for firms to upgrade foreign technology embodied in imported capital goods. Moreover, importing intermediate inputs might lead to higher exports, as documented in the empirical literature (Feng et al. 2016; Bas 2012; Bas and Strauss-Kahn 2015). Higher export profits would allow firms to overcome the fixed cost of importing capital goods. If input tariff changes are positively correlated with export performance or with variations in output tariffs set by India’s trading partners, our previous empirical findings might just be picking up the effects of foreign demand shocks.
The industry-year fixed effects included in the estimations of the previous section already address this issue since they capture all unobservable shocks at the industry level varying over time. In this section, we provide additional evidence that foreign demand shocks at the sectoral level, captured by export tariffs, are not driving our results. We control for this alternative explanation by including in the previous specifications the average effectively applied tariff at the three-digit NIC industry level set by the rest of the world on India (export tariff) during the 1989–1997 period, taken from the WITS dataset (World Bank). Columns (1) to (3) of table A5 in the online appendix report the results. The effect of the export tariff is negative but not significant. The coefficient of interest on the input tariffs remains robust and stable in all specifications when we take into account the role of foreign demand. This finding suggests that the supply-side mechanism emphasized in this paper is also an important channel through which trade liberalization affects technology upgrading.

31 Note that, in the specification in which we include industry-year and firm-fixed effects in column (2), the effect of input tariff and initial quantiles of firm TFP will be completely subsumed by the fixed effects.
32 We thank an anonymous referee for suggesting this test.

Firms’ financial health

In previous work, we have shown that firms’ financial health is an important determinant of firms’ decision to import capital goods in India (Bas and Berthou 2012). We investigate whether the previous findings are driven by an omitted variable bias related to firms’ financial health. The previous estimations are extended to include lagged values of the leverage ratio (borrowings over total assets) of the firm.
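As a rough illustration of the control just described, the lagged leverage ratio can be built from firm-year balance sheet items. The field names and figures below are hypothetical, and the handling of missing years (no lag when the previous year is absent) is an assumption rather than something the paper specifies.

```python
def lagged_leverage(rows):
    """Return {(firm, year): leverage in year - 1}, or None when the lag is unavailable.

    rows: list of dicts with keys firm, year, borrowings, total_assets (hypothetical names).
    """
    # Leverage ratio = borrowings over total assets, computed firm-year by firm-year.
    leverage = {
        (row["firm"], row["year"]): row["borrowings"] / row["total_assets"]
        for row in rows
        if row["total_assets"] > 0
    }
    # The lag is looked up by calendar year, so gaps in the panel yield None.
    return {(firm, year): leverage.get((firm, year - 1)) for (firm, year) in leverage}


# Hypothetical balance sheet data.
rows = [
    {"firm": "A", "year": 1990, "borrowings": 20.0, "total_assets": 100.0},
    {"firm": "A", "year": 1991, "borrowings": 30.0, "total_assets": 120.0},
    {"firm": "A", "year": 1993, "borrowings": 45.0, "total_assets": 150.0},
]
lags = lagged_leverage(rows)
for key in sorted(lags):
    print(key, lags[key])
```

Looking the lag up by year rather than by row position avoids silently pairing 1993 with 1991 when 1992 is missing from the panel.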
Columns (4) to (6) of table A5 in the online appendix present the findings. As in our previous study, we find that firms’ financial health is an important determinant of firms’ decision to upgrade foreign technology. Nevertheless, our coefficient of interest on input tariffs is not affected by the inclusion of firms’ financial variables.

VII. Other Robustness Tests

The decision to start importing capital goods

We explore the robustness of our baseline specification when we restrict our sample to firms that have not imported capital goods in the previous years. The estimates from linear probability estimations of equations (I) and (II) with firm- and year-fixed effects for the restricted sample of firms that have not imported capital goods in the previous year or two years are reported in table 5. In these cases, the coefficients on input tariff are larger in magnitude than in the baseline specification. We should keep in mind that this could be due to the reduction of the sample size from 14,425 to around 9,200 or 5,500 observations.

Table 5. The Decision to Start Importing Capital Goods
Dependent variable: dummy equal to one if firm i imports capital goods in t.
Columns: (1) (2) Non importer in the last year; (3) (4) Non importer in the last two years

Input tariff(s)(t-1): −0.210** (0.099); −0.249* (0.133)
Output tariff(s)(t-1): 0.069 (0.046); 0.088 (0.065); 0.083 (0.060); 0.094 (0.077)
Capital goods tariff(s)(t-1): −0.248** (0.118); −0.234* (0.124); −0.376** (0.172); −0.443** (0.201)
Input tariff(s)(t-1) × Low initial TFP: −0.262 (0.153); −0.289 (0.182)
Input tariff(s)(t-1) × Medium initial TFP: −0.304** (0.127); −0.393** (0.186)
Input tariff(s)(t-1) × High initial TFP: −0.082 (0.155); −0.269 (0.187)
Herfindahl index(s)(t-1): Yes; Yes; Yes; Yes
Firm level controls: Yes; Yes; Yes; Yes
Firm fixed effects: Yes; Yes; Yes; Yes
Year fixed effects: Yes; Yes; Yes; Yes
Observations: 9,228; 4,245; 5,439; 2,758
R-squared: 0.029; 0.031; 0.056; 0.066

Notes: The dependent variable is a dummy for firm i having positive imports of capital goods in year t. All control variables are defined in table 3. Heteroskedasticity-robust standard errors are reported in parentheses. Errors are corrected for clustering at the three-digit industry level. ***, **, and * indicate significance at the one, five, and ten percent levels, respectively.

The role of firm ownership

In this section, we investigate whether firms’ ownership is driving our previous results. Previous studies on multinational firms show that foreign firms in developing countries tend to use more advanced technologies and be more productive relative to domestic firms (Javorcik 2004). In general, the fact that foreign companies are more efficient and use more advanced technology could potentially explain our results. Foreign affiliates might benefit more from input tariff changes to upgrade foreign technology embodied in imported capital goods since they have connections with headquarters located abroad. In order to address this issue, we carry out two different tests.
First, we test for the possibility that foreign spillovers are driving our findings: if multinational companies benefit the most from input-trade liberalization, there could also be foreign technology transfer to domestic firms. We include in the previous specification a variable measuring the number of foreign affiliates in the region (Indian state) and industry where the firm is producing and an interaction term between this variable and input tariffs. Columns (1) and (2) of table 6 present the results. The presence of multinational affiliates increases the probability that Indian firms upgrade their technology, and the interaction term suggests that input tariff cuts have a greater effect on firms located in states and industries that have experienced an increase in the number of foreign affiliates (column 1). Once we control for this potential alternative explanation, our coefficient of interest on input tariffs alone is lower in magnitude but still significant and negative. Moreover, the heterogeneous effect of input tariff cuts depending on firms’ initial TFP is robust and stable (column 2).

Table 6. The Role of Firm Ownership
Dependent variable: dummy equal to one if firm i imports capital goods in t.
Columns: (1) (2) MNF spillovers; (3) (4) Without MNF firms; (5) (6) Private firms

Input tariff(s)(t-1) × MNF(r,s,t): −0.086* (0.044); −0.072 (0.058)
MNF(r,s,t): 0.057* (0.034); 0.070 (0.044)
Input tariff(s)(t-1): −0.119* (0.070); −0.185** (0.081); −0.150** (0.067)
Input tariff(s)(t-1) × Low initial TFP: −0.130 (0.107); −0.205 (0.119); −0.163 (0.105)
Input tariff(s)(t-1) × Middle initial TFP: −0.323*** (0.106); −0.416*** (0.115); −0.340*** (0.103)
Input tariff(s)(t-1) × High initial TFP: −0.025 (0.158); −0.089 (0.161); −0.039 (0.143)
Capital goods and output tariff(s)(t-1): Yes; Yes; Yes; Yes; Yes; Yes
Herfindahl index(s)(t-1): Yes; Yes; Yes; Yes; Yes; Yes
Industry level controls: Yes; Yes; Yes; Yes; Yes; Yes
Firm level controls: Yes; Yes; Yes; Yes; Yes; Yes
Firm fixed effects: Yes; Yes; Yes; Yes; Yes; Yes
Year fixed effects: Yes; Yes; Yes; Yes; Yes; Yes
Observations: 14,680; 7,861; 13,300; 6,817; 14,247; 7,593
R-squared: 0.094; 0.089; 0.097; 0.093; 0.096; 0.091

Notes: The dependent variable is a dummy for firm i having positive imports of capital goods in year t. MNF(r,s,t) is the logarithm of the number of foreign affiliates located in the same region (r) and industry (s). All other control variables are defined in table 3. Heteroskedasticity-robust standard errors are reported in parentheses. Errors are corrected for clustering at the three-digit industry level. ***, **, and * indicate significance at the one, five, and ten percent levels, respectively.

Second, we exclude multinational firms from our sample in columns (3) and (4) of table 6. Our coefficients of interest on input tariff (column [3]) and on the interaction term between input tariff and the initial firm TFP quantile (column [4]) remain robust when we restrict the sample to domestic firms, suggesting that input-trade liberalization matters for non-multinational firms.
Moreover, previous works using the same firm-level dataset have emphasized the role of state-owned firms relative to private companies in India (Topalova 2004; Alfaro and Chari 2009). One could argue that state-owned companies might have greater lobbying power to induce the government to reduce tariffs on those goods that they use as intermediate inputs in the production of final goods. In order to address this issue, we restrict the sample to private firms in columns (5) and (6). The point estimates of input tariff (column [5]) and the interaction term between input tariff and the initial firm TFP quantile (column [6]) remain robust and stable for the sample of private firms.

VIII. Additional Gains From Input-Trade Liberalization

The previous results show that lower tariffs on intermediate inputs increase the probability of importing capital goods in addition to the direct effect of lowering tariffs on capital goods. Those findings suggest that the potential gains from trade liberalization might be larger than in previous studies when one takes into account the additional gains from input tariff cuts on the decision to import capital goods (the indirect channel). This section discusses the additional firm-level gains from input-trade liberalization that arise through the complementarity between variable foreign inputs and capital goods.

The intensive margin of imports of capital goods

If imports of intermediate goods are complementary with imports of capital goods, we expect that input tariff reductions will also lead to larger volumes of imports of capital goods. One concern that arises in the estimation of the determinants of the intensive margin of imports of capital goods is that this variable is observed only over some interval of its support.
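The censoring concern can be illustrated with a small simulation. This is a hedged sketch on synthetic data, not the authors' Tobit estimation: the data-generating process, the true slope of one, and all names are assumptions chosen only to show how dropping the non-positive outcomes distorts an OLS slope.

```python
import random

random.seed(0)
TRUE_SLOPE = 1.0  # assumed effect of the regressor on the latent outcome


def ols_slope(xs, ys):
    """Slope of a simple OLS regression of ys on xs (with intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Latent outcome: observed in full only in this artificial setting.
xs = [random.gauss(0.0, 1.0) for _ in range(20_000)]
latent = [TRUE_SLOPE * x + random.gauss(0.0, 1.0) for x in xs]

# OLS on the full latent sample recovers the assumed slope.
slope_latent = ols_slope(xs, latent)

# OLS on strictly positive outcomes only, as happens when zeros drop out of a log specification.
positive = [(x, y) for x, y in zip(xs, latent) if y > 0]
slope_positive = ols_slope([x for x, _ in positive], [y for _, y in positive])

print(f"slope, full latent sample:   {slope_latent:.3f}")
print(f"slope, positive values only: {slope_positive:.3f}")
```

In this setup the slope estimated on the selected sample is pulled toward zero, which is the kind of inconsistency that motivates a censored-regression approach.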
An OLS estimation of the logarithm of imports of capital goods will exclude the zero import values, leading to sample selection bias and inconsistent parameter estimates, as the censored sample is not representative of the entire sample of Indian firms. To address this issue, we present Tobit estimates with imports of capital goods shares on the left-hand side, explicitly taking censoring into account by treating the zero values as left-censored.33 Tobit models with individual fixed effects have an incidental parameters problem and are generally biased (Greene 2003). We thus report results from both pooled Tobit, without unobserved effects, and random-effects Tobit.34 Table 7 presents the results. Columns (1) and (2) show the marginal effects at the sample mean from pooled Tobit estimation of tariffs on imports of capital goods shares, and columns (3) and (4) report the results from random-effects Tobits. The coefficient of interest on input tariffs is negative and significant in all specifications, implying that input-trade liberalization increases the share of imports of capital goods.

33 The predicted values from Tobit estimations account for the lower limit of the censored data. We should keep in mind that Tobit estimation relies on the assumption of homoskedastic, normally distributed errors for consistency.
34 In the random-effects Tobit, firm unobserved heterogeneity is assumed to be part of the composite error. Random-effects Tobits are unbiased if firm characteristics are exogenous (uncorrelated with the regressors). Honore (1992) has developed a semiparametric method dealing with this issue, which captures unobserved time-invariant individual heterogeneity. He proposes a trimmed least squares estimator of censored regression models. Nevertheless, this semiparametric estimator for fixed-effect Tobits is not suitable here due to the relatively small sample size.

Table 7.
Input-Trade Liberalization and the Intensive Margin of Imports of Capital Goods
Dependent variable: the share of imported capital goods over total imports of firm i in t.
Columns: (1) (2) Pooled Tobit; (3) (4) Random effects Tobit

Input tariff(s)(t-1): −0.378*** (0.066); −0.362*** (0.066); −0.218*** (0.064); −0.225*** (0.064)
Capital goods and output tariff(s)(t-1): Yes; Yes; Yes; Yes
Random effects: Yes; Yes
Industry fixed effects: Yes; Yes
Year fixed effects: Yes; Yes; Yes; Yes
Observations: 10,800; 10,800; 10,800; 10,800
Log likelihood: −5533; −5412; −4457; −4436
Sigma u: 0.352; 0.348; 0.267; 0.261
Sigma e: 0.264; 0.264

Notes: The dependent variable is the share of imported capital goods over total sales of firm i in year t. All specifications include capital goods and output tariffs and the Herfindahl index. Columns (2) and (4) also include firm-level controls. All control variables are defined in table 3. Heteroskedasticity-robust standard errors are reported in parentheses. Errors are corrected for clustering at the three-digit industry level. ***, **, and * indicate significance at the one, five, and ten percent levels, respectively.

Input-trade liberalization, firm profitability, sales, and electricity

Next, we explore the relationship between input-tariff cuts and other firm outcomes such as firms’ profits, sales, and electricity consumption. Previous literature has already shown that India’s trade liberalization yielded larger gains from new imported varieties of intermediate goods (Goldberg et al. 2010). The simple theoretical model presented in section III emphasizes that input-tariff reductions allow firms to increase their profits and sales to afford the high technology.
We thus estimate equation (I) with the logarithm of firms’ profits and sales as dependent variables, and we include an interaction term between input tariffs and a dummy variable equal to one when the firm imports capital goods. Since the estimation includes firm and year fixed effects, the coefficient on the interaction term captures the effect of input tariff cuts on firms’ profits and sales for firms that start importing capital goods. Table 8 presents the results.

Table 8. Additional Gains from Input-Trade Liberalization
Dependent variable: logarithm of profits or sales or electricity of firm i in t.
Columns: (1) (2) Profits; (3) (4) Sales; (5) (6) Electricity consumption

Importer capital goods: 0.128*** (0.025); 0.308*** (0.057); 0.098*** (0.010); 0.268*** (0.029); 0.091*** (0.011); 0.213*** (0.027)
Input tariff(s)(t-1) × Importer capital goods: −0.427*** (0.123); −0.409*** (0.055); −0.297*** (0.057)
Input tariff(s)(t-1): −0.200 (0.308); 0.169 (0.113); −0.194* (0.106)
Capital goods and output tariff(s)(t-1): Yes; Yes; Yes; Yes; Yes; Yes
Herfindahl index(s)(t-1): Yes; Yes; Yes; Yes; Yes; Yes
Industry level controls: Yes; Yes; Yes; Yes; Yes; Yes
Firm level controls: Yes; Yes; Yes; Yes; Yes; Yes
Firm fixed effects: Yes; Yes; Yes; Yes; Yes; Yes
Year fixed effects: Yes; Yes; Yes; Yes; Yes; Yes
Observations: 11,945; 11,945; 14,680; 14,680; 14,662; 14,662
R-squared: 0.178; 0.182; 0.604; 0.607; 0.708; 0.709

Notes: The dependent variable is the logarithm of firms’ profits (columns 1 and 2), sales (columns 3 and 4), or electricity consumption (columns 5 and 6) in year t. All control variables are defined in table 3. Heteroskedasticity-robust standard errors are reported in parentheses. Errors are corrected for clustering at the three-digit industry level. ***, **, and * indicate significance at the one, five, and ten percent levels, respectively.

Columns (1) and (3) show that firms that upgrade their foreign technology embodied in imports of capital goods increase their sales and profits, as predicted by the model.
Input-tariff reductions lead to greater sales and profits, of about 4%, for firms that start importing capital goods (columns [2] and [4]).

Finally, we test the assumption that Indian firms are upgrading their technology by importing capital goods. More advanced technologies are more likely to be reliant on electricity. Thus, we expect Indian firms to increase their consumption of electricity as a result of input tariff cuts and importing capital goods. Columns (5) and (6) of table 8 show that this was indeed the case.35

IX. Concluding Remarks

The main contribution of this paper to the literature on the microeconomic effects of input-trade liberalization on firm performance is to investigate the efficiency gains from input tariff cuts on firms’ decision to source capital goods from abroad.

We motivate our empirical analysis with a simple theoretical model of heterogeneous firms that explains the channels through which changes in tariffs on intermediate goods might affect firms’ decision to upgrade foreign technology in imported capital goods. Assuming that imported intermediate inputs and foreign technology are complementary and that technology upgrading entails fixed costs, the model predicts a positive effect of reductions in tariffs on intermediate goods on firms’ choice to adopt a foreign technology. The impact of input-trade liberalization is heterogeneous across firms depending on their initial productivity level.

Using Indian firm-level data and the trade liberalization episode of the early 1990s, we test the main implications of the model. Our findings demonstrate that the probability of importing capital goods is higher for firms producing in industries that have experienced greater cuts in tariffs on intermediate goods.
Looking at the heterogeneous effect of input-trade liberalization, we find that only those firms in the middle range of the productivity distribution have benefited from input tariff cuts, as predicted by the model. These empirical findings are robust to alternative specifications that control for imported capital goods tariffs, other reforms, and industry and firm characteristics.

35 We thank an anonymous referee for suggesting this test.

Appendix

Aggregation

The low-technology average productivity level $\tilde{\varphi}_l$ and the ex-ante weighted average productivity level of foreign high-technology firms $\tilde{\varphi}_h$ are given by:

$$\tilde{\varphi}_l \equiv \left[ \frac{1}{G(\varphi_h^*) - G(\varphi_l^*)} \int_{\varphi_l^*}^{\varphi_h^*} \varphi^{\sigma-1} g(\varphi)\, d\varphi \right]^{\frac{1}{\sigma-1}} = \varphi_l^*\, t^{\frac{1}{\sigma-1}} \left[ \frac{1 - n^{-k+\sigma-1}}{1 - n^{-k}} \right]^{\frac{1}{\sigma-1}} \quad \text{if } \varphi_l^* \le \varphi < \varphi_h^*$$

$$\tilde{\varphi}_h \equiv \left[ \frac{1}{1 - G(\varphi_h^*)} \int_{\varphi_h^*}^{\infty} \varphi^{\sigma-1} g(\varphi)\, d\varphi \right]^{\frac{1}{\sigma-1}} = \varphi_h^*\, t^{\frac{1}{\sigma-1}} \quad \text{if } \varphi \ge \varphi_h^*$$

where $t = \frac{k}{k-(\sigma-1)}$ and $n = \left[ \frac{f_h}{f} \, \frac{1}{\left(\frac{c_h}{c_l}\right)^{1-\sigma} - 1} \right]^{\frac{1}{\sigma-1}}$.

The ex-post average productivity of foreign high-technology firms takes into account the increase in the firms’ efficiency due to the acquisition of the more advanced technology complementary with imported intermediate inputs. The adoption of the high technology allows these firms to reduce their unit costs and raise their market shares by the term $\left(\frac{c_h}{c_l}\right)^{1-\sigma}$. Notice that average revenues of high-technology firms can be expressed as $r_h(\tilde{\varphi}_h) = r_l(\tilde{\varphi}_h) \left(\frac{c_h}{c_l}\right)^{1-\sigma}$. Therefore, the weighted average productivity index of the industry $\tilde{\varphi}_T$ represents the market shares of all types of firms:

$$\tilde{\varphi}_T^{\,\sigma-1} = \frac{1}{M} \left[ M_l (\tilde{\varphi}_l)^{\sigma-1} + M_h \left(\frac{c_h}{c_l}\right)^{1-\sigma} (\tilde{\varphi}_h)^{\sigma-1} \right].$$

The number of firms producing with low technology, $M_l = \rho_l M$, and those producing with high technology, $M_h = \rho_h M$, are determined by the total number of firms $M$ and the probabilities of using low and high technology: $\rho_h = \frac{1 - G(\varphi_h^*)}{1 - G(\varphi_l^*)} = \left(\varphi_h^* / \varphi_l^*\right)^{-k}$ and $\rho_l = 1 - \rho_h$.
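The closed forms for the two average productivity levels can be checked numerically. The sketch below assumes a Pareto productivity distribution with shape k and lower bound φ_min, which is what the expression for ρ_h implies, and it takes n to be the cutoff ratio φ*_h/φ*_l. All parameter values are illustrative assumptions, not values from the paper.

```python
import math

# Illustrative parameter values (assumptions, not taken from the paper).
SIGMA, K, PHI_MIN = 4.0, 5.0, 1.0   # finite averages require K > SIGMA - 1
PHI_L, PHI_H = 1.2, 2.0             # survival and high-technology cutoffs


def pdf(phi):
    """Pareto density g(phi) with shape K and lower bound PHI_MIN."""
    return K * PHI_MIN**K * phi ** (-K - 1.0)


def cdf(phi):
    """Pareto distribution function G(phi)."""
    return 1.0 - (PHI_MIN / phi) ** K


def integrate(func, lower, upper, steps=100_000):
    """Midpoint rule on a logarithmic grid, which suits power-law integrands."""
    log_lower = math.log(lower)
    width = (math.log(upper) - log_lower) / steps
    total = 0.0
    for i in range(steps):
        phi = math.exp(log_lower + (i + 0.5) * width)
        total += func(phi) * phi * width  # d(phi) = phi * d(log phi)
    return total


def moment(phi):
    """Integrand phi^(sigma - 1) * g(phi)."""
    return phi ** (SIGMA - 1.0) * pdf(phi)


power = 1.0 / (SIGMA - 1.0)

# Numerical averages; the infinite upper limit is truncated far out in the tail.
num_l = (integrate(moment, PHI_L, PHI_H) / (cdf(PHI_H) - cdf(PHI_L))) ** power
num_h = (integrate(moment, PHI_H, PHI_H * 1e6) / (1.0 - cdf(PHI_H))) ** power

# Closed forms, with n taken as the cutoff ratio (an assumption of this sketch).
t = K / (K - (SIGMA - 1.0))
n = PHI_H / PHI_L
closed_l = PHI_L * (t * (1.0 - n ** (-K + SIGMA - 1.0)) / (1.0 - n ** (-K))) ** power
closed_h = PHI_H * t**power

print(f"low technology:  numerical {num_l:.6f}  closed form {closed_l:.6f}")
print(f"high technology: numerical {num_h:.6f}  closed form {closed_h:.6f}")
```

The logarithmic grid is a convenience for the heavy right tail; any quadrature that handles the truncated upper limit accurately would serve the same purpose.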
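The low- and high-technology average productivity levels and the aggregate productivity index define all the aggregate variables. The price index of the industry is determined by:

$$P^{1-\sigma} = M_l \int_{\varphi_l^*}^{\varphi_h^*} (p_l)^{1-\sigma} \mu_l(\varphi)\, d\varphi + M_h \int_{\varphi_h^*}^{\infty} (p_h)^{1-\sigma} \mu_h(\varphi)\, d\varphi = \left( \frac{\sigma}{\sigma-1}\, c_l \right)^{1-\sigma} \left[ M_l (\tilde{\varphi}_l)^{\sigma-1} + M_h \left(\frac{c_h}{c_l}\right)^{1-\sigma} (\tilde{\varphi}_h)^{\sigma-1} \right]$$

Using the aggregate productivity $\tilde{\varphi}_T$, the price index can be expressed as $P^{1-\sigma} = M \left( \frac{\sigma}{\sigma-1}\, c_l \right)^{1-\sigma} \tilde{\varphi}_T^{\,\sigma-1} = M\, p(\tilde{\varphi}_T)^{1-\sigma}$.

Proof of the equilibrium survival productivity cutoff

The FE (8) and ZCP (9) conditions jointly determine the equilibrium cutoff level ($\varphi_l^*$). In order to obtain this cutoff, we use the technology productivity cutoff, the average productivity for low- and high-technology firms ($\tilde{\varphi}_l$, $\tilde{\varphi}_h$), and the probability of using low and high technology ($\rho_l$, $\rho_h$). The equilibrium cutoff level ($\varphi_l^*$) is given by:

$$\frac{(\varphi_l^*)^k}{\varphi_{min}^k}\, \delta f_e = \frac{1}{M} \left[ \frac{1}{\sigma} \left[ M_l \int_{\varphi_l^*}^{\varphi_h^*} r_l(\varphi) \mu_l(\varphi)\, d\varphi + M_h \int_{\varphi_h^*}^{\infty} r_h(\varphi) \mu_h(\varphi)\, d\varphi \right] - M f - M_h f_h \right]$$

Solving for low- and high-technology revenues and using $M_l = \rho_l M$, $M_h = \rho_h M$, $r_l = A c_l^{1-\sigma} \varphi_i^{\sigma-1}$, $r_h = A c_h^{1-\sigma} \varphi_i^{\sigma-1}$, and using equation (5) to determine $A$, so as to express average profits as a function of the productivity cutoff, yields:

$$\frac{(\varphi_l^*)^k}{\varphi_{min}^k}\, \delta f_e = \left[ \rho_l \left( \frac{\tilde{\varphi}_l}{\varphi_l^*} \right)^{\sigma-1} + \rho_h \left( \frac{c_l}{c_h} \right)^{\sigma-1} \left( \frac{\tilde{\varphi}_h}{\varphi_l^*} \right)^{\sigma-1} - 1 \right] f - \rho_h f_h$$

Substituting the average productivity for low and high technology ($\tilde{\varphi}_l$, $\tilde{\varphi}_h$) and using the high-productivity cutoff defined in equation (7) yields:

$$\varphi_l^* = \left[ \frac{\sigma-1}{k-(\sigma-1)} \cdot \frac{ f + \left[ \left(\frac{c_h}{c_l}\right)^{1-\sigma} - 1 \right]^{\frac{k}{\sigma-1}} \left( \frac{f_h}{f} \right)^{-\frac{k}{\sigma-1}} f_h }{ \delta f_e } \right]^{\frac{1}{k}} \varphi_{min} \tag{A.1}$$

This cutoff, $\varphi_l^*$, then determines the high-technology productivity cutoff level $\varphi_h^*$ defined in equation (7).

Proof of proposition 2

This high-technology productivity cutoff is an increasing function of the input tariff ($\tau_m$).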
Keeping in mind that $\frac{c_h}{c_l}$ is an increasing function of $\tau_m$, we take the partial derivative of the technological productivity cutoff ($\varphi_h^*$) determined in equation (7) with respect to $\tau_m$:

$$\frac{\partial \varphi_h^*}{\partial \tau_m} = \frac{\varphi_h^*}{\varphi_l^*} \left[ \frac{\partial \varphi_l^*}{\partial \tau_m} + \varphi_l^*\, \frac{ \frac{\partial (c_h/c_l)}{\partial \tau_m} }{ \left[ \left(\frac{c_h}{c_l}\right)^{1-\sigma} - 1 \right] \left(\frac{c_h}{c_l}\right)^{\sigma} } \right] \tag{A.2}$$

Next, we partially differentiate equation (A.1) for $\varphi_l^*$ with respect to $\tau_m$ to obtain $\frac{\partial \varphi_l^*}{\partial \tau_m}$:

$$\frac{\partial \varphi_l^*}{\partial \tau_m} = (-1)\, \frac{\partial (c_h/c_l)}{\partial \tau_m} \left(\frac{c_h}{c_l}\right)^{-\sigma} \left[ \left(\frac{c_h}{c_l}\right)^{1-\sigma} - 1 \right]^{\frac{k}{\sigma-1} - 1} \left( \frac{f_h}{f} \right)^{-\frac{k}{\sigma-1}} \frac{f_h}{\delta f_e}\, \frac{\sigma-1}{k-(\sigma-1)}\, (\varphi_l^*)^{1-k}\, \varphi_{min}^k < 0 \tag{A.3}$$

Since $\frac{\partial (c_h/c_l)}{\partial \tau_m} > 0$, $\left(\frac{c_h}{c_l}\right)^{1-\sigma} > 1$, and $\frac{\sigma-1}{k-(\sigma-1)} > 0$, it follows that $\frac{\partial \varphi_l^*}{\partial \tau_m} < 0$.

Plugging equation (A.3) into equation (A.2), a sufficient condition for $\frac{\partial \varphi_h^*}{\partial \tau_m} > 0$ is:

$$(\varphi_l^*)^k > \frac{\sigma-1}{k-(\sigma-1)} \left[ \left(\frac{c_h}{c_l}\right)^{1-\sigma} - 1 \right]^{\frac{k}{\sigma-1}} \left( \frac{f_h}{f} \right)^{-\frac{k}{\sigma-1}} \frac{f_h}{\delta f_e}\, \varphi_{min}^k \tag{A.4}$$

To prove that this condition holds, we plug into equation (A.4) the survival productivity cutoff $\varphi_l^*$ as determined in equation (10), and we obtain: $f > 0$.

Table A1. Production Function Estimates
Dependent variable: output of firm i in year t.
Columns: (1) (2)

Wage-bill: 0.416*** (0.010); 0.402*** (0.010)
Materials: 0.180*** (0.007); 0.176*** (0.010)
Capital stock: 0.365*** (0.029); 0.347*** (0.040)
Importer of capital and inputs: 0.221*** (0.023)
Only importer of inputs: 0.183*** (0.021)
Total imports of inputs and/or capital: 0.190*** (0.026)

Notes: The table reports estimates of a production function relying on the Levinsohn and Petrin (2003) methodology using electricity expenditures to control for unobserved productivity shocks. All variables are expressed in logarithms. The estimation includes industry- and year-fixed effects. The number of observations is 19,138. Heteroskedasticity-robust standard errors are reported in parentheses. ***, **, and * indicate significance at the one, five, and ten percent levels, respectively.
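The comparative statics in the proof of proposition 2 can be checked numerically. The sketch below evaluates the survival cutoff from (A.1) and the high-technology cutoff over a grid of relative unit costs c_h/c_l, which the text states is increasing in the input tariff. The parameter values are illustrative assumptions, and since equation (7) lies outside this section, the cutoff ratio used here is assumed to equal the expression for n given in the Aggregation subsection.

```python
# Illustrative parameters (assumptions, not values from the paper).
SIGMA, K = 4.0, 5.0                        # requires K > SIGMA - 1
F, F_H, DELTA_FE, PHI_MIN = 1.0, 5.0, 1.0, 1.0


def cutoffs(cost_ratio):
    """Return (survival cutoff, high-technology cutoff) for a relative unit cost c_h/c_l < 1."""
    gain = cost_ratio ** (1.0 - SIGMA) - 1.0          # (c_h/c_l)^(1 - sigma) - 1 > 0
    exponent = K / (SIGMA - 1.0)
    # Term in brackets of (A.1), divided by delta * f_e.
    bracket = (F + gain**exponent * (F_H / F) ** (-exponent) * F_H) / DELTA_FE
    phi_l = ((SIGMA - 1.0) / (K - (SIGMA - 1.0)) * bracket) ** (1.0 / K) * PHI_MIN
    # Assumed form of the technology cutoff: phi_h = n * phi_l, with n as in the appendix.
    phi_h = phi_l * (F_H / (F * gain)) ** (1.0 / (SIGMA - 1.0))
    return phi_l, phi_h


# A higher input tariff raises c_h/c_l, so the grid moves in the direction of a tariff increase.
ratios = [0.60 + 0.05 * i for i in range(7)]
results = [cutoffs(ratio) for ratio in ratios]

for ratio, (phi_l, phi_h) in zip(ratios, results):
    print(f"c_h/c_l = {ratio:.2f}   survival cutoff = {phi_l:.4f}   high-tech cutoff = {phi_h:.4f}")
```

With these values the survival cutoff falls and the high-technology cutoff rises as c_h/c_l increases, in line with the signs stated for (A.3) and for proposition 2.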
Figure A1. The Share of Firms Importing Capital Goods as Input Tariffs Fall
Source: Authors’ calculation based on tariff data from WITS and the Prowess dataset.

Table A2. Tariff Reductions Between 1989 and 1997 and Pre-Reform Industrial Characteristics
Columns: (1) (2) (3) (4) (5)

Panel A. Dependent variable: change in input tariffs between 1989-1997
Sales(s,1989): 0.004 (0.008)
Capital stock(s,1989): 0.003 (0.009)
Wages(s,1989): 0.004 (0.009)
Imports capital goods(s,1989): 0.007 (0.008)
Imports inputs(s,1989): 0.001 (0.005)

Panel B. Dependent variable: change in capital goods tariffs between 1989-1997
Sales(s,1989): 0.002 (0.002)
Capital stock(s,1989): 0.001 (0.002)
Wages(s,1989): 0.002 (0.002)
Imports capital goods(s,1989): 0.003 (0.003)
Imports inputs(s,1989): 0.001 (0.002)

Observations: 47; 47; 47; 47; 47

Notes: The dependent variable is the change in input tariffs between 1989 and 1997. The table shows regressions at the three-digit industry level of changes in input tariffs on different industry-level characteristics. All regressions include indicators for industry-use type (consumer goods, capital, and intermediates). All industry-level variables are expressed in logarithms. Heteroskedasticity-robust standard errors are reported in parentheses.

Table A3. Initial Firm Characteristics in 1989 and Input Tariff Changes Between 1989-1997
Columns: (1) Importer of K; (2) Imports K/sales; (3) Capital stock; (4) TFP

Panel A
Δ Input tariffs(s,97-89): 0.071 (0.385); −0.005 (0.092); 0.709 (0.500); 0.081 (0.335)

Panel B
Δ Capital goods tariffs(s,97-89): −0.270 (0.254); 0.100 (0.778); 1.483 (4.221); −1.299 (2.821)

Observations: 676; 676; 676; 676

Notes: The dependent variables in each column are the initial firm-level outcomes in 1989.
The table shows the coefficients on changes in input tariffs between 1990 and 1996 from firm-level regressions of initial firm characteristics on input tariff changes and two-digit industry-fixed effects. Firm-level variables are expressed in logarithms except for the importer of capital goods dummy and the ratio of imports of capital goods over total sales. Heteroskedasticity-robust standard errors are reported in parentheses. Errors are corrected for clustering at the three-digit industry level.

Table A4. Other Reforms in India and Past Input Import Experience
Dependent variable: dummy equal to one if firm i imports capital goods in t.
Columns: (1) (2) (3) (4) (5) (6)

Input tariff(s)(t-1) × imported inputs>0: −0.170** (0.066); −0.158** (0.065); −0.156** (0.066)
Imported inputs>0: 0.405*** (0.041); 0.397*** (0.039); 0.404*** (0.042); 0.348*** (0.031); 0.334*** (0.028); 0.348*** (0.027)
Input tariff(s)(t-1): −0.011 (0.078); 0.018 (0.078); −0.130* (0.067)
Input tariff(s)(t-1) × Low initial TFP: −0.112 (0.101); −0.235 (0.124)
Input tariff(s)(t-1) × Medium initial TFP: −0.296*** (0.103); −0.288*** (0.106)
Input tariff(s)(t-1) × High initial TFP: −0.005 (0.121); −0.029 (0.161)
Capital goods tariff(s)(t-1): −0.173** (0.084); −0.141 (0.093); −0.147 (0.095); −0.179* (0.093); −0.193* (0.114)
Output tariff(s)(t-1): −0.029 (0.040); −0.045 (0.047); −0.053 (0.057); −0.028 (0.042); −0.040 (0.062)
Input tariff(s)(t-1) × experience importer inputs: −0.051 (0.049); −0.008 (0.074)
Experience importer inputs: 0.032 (0.021); 0.005 (0.031)
Herfindahl index(s)(t-1): Yes; Yes; Yes; Yes; Yes; Yes
Firm level controls: Yes; Yes; Yes; Yes; Yes; Yes
Firm fixed effects: Yes; Yes; Yes; Yes; Yes; Yes
Year fixed effects: Yes; Yes; Yes; Yes; Yes; Yes
Industry-year fixed effects: No; Yes; No; No; No; No
Region-year fixed effects: No; No; Yes; Yes; No; No
Observations: 14,680; 14,680; 14,430; 7,825; 14,430; 7,825
R-squared: 0.094; 0.130; 0.135; 0.156; 0.094; 0.090

Notes: The dependent variable is a dummy for firm i having positive imports of capital goods in year t.
All control variables are defined in table 3 in the main text. Heteroskedasticity-robust standard errors are reported in parentheses. Errors are corrected for clustering at the three-digit industry level. ***, **, and * indicate significance at the one, five, and ten percent levels, respectively.

Table A5. Demand Shocks and Firms’ Financial Health
Dependent variable: dummy equal to one if firm i imports capital goods in t.
Columns: (1) (2) (3) (4) (5) (6)

Export tariff(s)(t-1): −0.049 (0.061); −0.045 (0.059); −0.034 (0.074)
Leverage(t-1): −0.155*** (0.036); −0.155*** (0.036); −0.092 (0.059)
Input tariff(s)(t-1): −0.162** (0.074); −0.024 (0.083); −0.139** (0.064); 0.000 (0.076)
Input tariff(s)(t-1) × imported inputs>0: −0.170** (0.067); −0.170** (0.067)
Imported inputs>0: 0.334*** (0.029); 0.406*** (0.041); 0.346*** (0.029); 0.332*** (0.029); 0.405*** (0.040); 0.344*** (0.029)
Input tariff(s)(t-1) × Low initial TFP: −0.163 (0.103); −0.140 (0.101)
Input tariff(s)(t-1) × Medium initial TFP: −0.361*** (0.105); −0.343*** (0.099)
Input tariff(s)(t-1) × High initial TFP: −0.071 (0.157); −0.058 (0.149)
Output tariff(s)(t-1): −0.026 (0.042); −0.030 (0.042); −0.033 (0.052); −0.027 (0.040); −0.031 (0.039); −0.035 (0.051)
Capital goods tariff(s)(t-1): −0.174* (0.089); −0.179** (0.086); −0.184* (0.096); −0.157* (0.083); −0.161* (0.080); −0.170* (0.092)
Capital goods and output tariff(s)(t-1): Yes; Yes; Yes; Yes; Yes; Yes
Herfindahl index(s)(t-1): Yes; Yes; Yes; Yes; Yes; Yes
Firm level controls: Yes; Yes; Yes; Yes; Yes; Yes
Firm fixed effects: Yes; Yes; Yes; Yes; Yes; Yes
Year fixed effects: Yes; Yes; Yes; Yes; Yes; Yes
Observations: 14,425; 14,425; 7,731; 14,678; 14,678; 7,861
R-squared: 0.093; 0.094; 0.088; 0.096; 0.097; 0.089

Notes: The dependent variable is a dummy for firm i having positive imports of capital goods in year t. All control variables are defined in table 3 in the main text.
Heteroskedasticity-robust standard errors are reported in parentheses. Errors are corrected for clustering at the three-digit industry level. ***, **, and * indicate significance at the one, five, and ten percent levels, respectively.

The World Bank Economic Review, 31(2), 2017, 385–411
doi: 10.1093/wber/lhv057
Advance Access Publication Date: October 14, 2015
Article

Long-term Gains from Electrification in Rural India

Dominique van de Walle, Martin Ravallion, Vibhuti Mendiratta, and Gayatri Koolwal

Abstract

We know surprisingly little about the long-run impacts of household electrification. This paper studies the impacts on consumption in rural India over a 17-year period, allowing for both internal and external (village-level) effects. Under our identifying assumptions, electrification brought significant consumption gains for households who acquired electricity for their own use. We also find evidence of a dynamic effect of village connectivity for households without electricity themselves. This is suggestive of an external effect, which also comes with a shift in consumption spending suggestive of status concerns among those still without electricity. Labor earnings were an important channel of impact. This was mainly through extra work by men.
There was no effect on average wage rates.

JEL classification: H54, O12, O13

A great many people still do not have electricity. In much of the developing world, households continue to rely on traditional sources of fuel for lighting, heating, and cooking. Indeed, it is estimated that 1.3 billion people in 2009 had no access to electricity.1

Rural households can spend a lot of time collecting and preparing fuel for domestic use. Many continue to cook with wood and biomass (mainly dung) with deleterious effects on the health of family members. Across countries, these time and health burdens are thought to be higher for women and the children under their care.

Over recent decades, governments and donors have made a concerted effort to bring more efficient sources of energy, particularly electricity, to rural households. It is often simply assumed that electrification will result in significant welfare gains for households and particularly for the women within them. Referring to policy discussions in the 1970s and 1980s, Barnes and Binswanger (1986, 26) note the “blind faith placed in rural electrification.” Some observers have expressed skepticism on the claimed benefits of electrification over other energy sources.2 There are concerns about both internal and external validity of past evidence. In an extensive review of the literature, the World Bank’s Independent Evaluation Group (2008, xvii) concluded that “the evidence base remains weak for many of the claimed benefits of rural electrification.” Similarly, Bernard (2010, 41) writes that “While funding for rural electrification programs often rests on their supposed impacts on such outcomes as health, education, or poverty level, there is still very little empirical evidence to substantiate them.”

Dominique van de Walle is lead economist in the Research Department of the World Bank; her email address is dvandewalle@worldbank.org. Martin Ravallion (corresponding author) is with the Department of Economics, Georgetown University and NBER; his email address is mr1185@georgetown.edu. Vibhuti Mendiratta is a PhD Student with the Paris School of Economics; her email address is vibhutim@gmail.com. Gayatri Koolwal is a consultant with the World Bank; her email address is koolwalg@gmail.com. The findings, interpretations, and conclusions of this paper should not be attributed to the World Bank or any affiliated organization. Grants from the World Bank’s Gender Action Plan and Research Support Budget are gratefully acknowledged. The authors are also grateful to Douglas Barnes, Shahid Khandker, the Review’s editor Andrew Foster, and three anonymous referees for their many useful comments.

1 See World Energy Outlook (WEO) 2011, published by the International Energy Agency. The WEO estimates that in 2009 25% of the population of the developing world was still without electricity, and the proportion rises to 37% in rural areas. In South Asia, the corresponding proportions are 32% and 40%, while in Sub-Saharan Africa they are 69% and 86%. It is estimated that 70–77% of energy consumption in Sub-Saharan Africa over 1980–2005 came from wood fuel (Kebede et al. 2010).

© The Author 2015. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

A number of issues make identification of the household welfare impacts of rural electrification difficult. Three stand out. First, there is the potential for electricity acquisition to be jointly determined with outcomes or correlated with omitted variables.
Second, nonrandom placement is likely to entail sensitivity to the choice of regression controls. Third, there are likely to be external effects of electrification in the village, which could well bring benefits to an individual household even if it does not have electricity itself. Indeed, some part of the externality may also be asymmetric in that certain types of benefits accrue to those who do not already have electricity much more than to those who do. For example, the entertainment benefits of having a neighbor with an electricity-powered television and fans are likely to be considerably greater for a household without electricity than for one that has electricity. In contrast, lit village streets are likely to be advantageous to all local households.

Recent studies have recognized the issue of endogenous placement, though sensitivity to controls and external effects are less often discussed. It has become common to exploit geographic variables for identification under conditional independence assumptions. Those assumptions appear to be more plausible in some applications than in others. Randomized approaches are difficult with rural electrification because of the large scale and political nature of these efforts.3 Observational panel data studies help address these concerns, to the extent that the endogeneity can be fully accounted for in the correlation between placement and time-invariant factors that can be differenced out using the panel data. However, the acquisition of electricity within the time period is still likely to be endogenous to changes in outcomes at the household level. This has been dealt with in the literature either by adding controls for initial conditions that are likely to be correlated with subsequent trajectories or by using an instrumental variables estimator.
Both methods require a conditional independence assumption, namely that the error term in the outcomes regression must be conditionally independent of either placement or its instrumental variable (IV).

This paper studies the effects of India’s large expansion in rural household electrification on household living standards as measured by consumption. A contribution is that we distinguish the internal effects of household electrification from the external effect of village electrification. Asymmetry in the external effects between households with and without their own electricity plays a role in our identification strategy, as does (time-varying) proximity to power generating plants. We find long-term consumption gains from household acquisition of electricity but also show that there are strong positive external effects of village connection to the grid for households without electricity themselves. Our estimated consumption gains are lower than the past estimates in this setting that addressed endogeneity. We argue that this reflects biases when ignoring these external effects, and latent geographic heterogeneity. In particular, the use of village placement as an instrument for household placement is also seen to be a source of sizeable bias.

2 See, for example, the discussion in Mathur and Mathur (2005) with regard to India’s “Electricity for All” program. Also see the discussion in Barnes and Binswanger (1986).

3 Two recent exceptions are ongoing studies that look at specific features of new electricity programs in rural Sub-Saharan Africa, where only 5 percent of households on average have electricity connections. These are described in Bernard (2010).

There is only so far we can go in understanding the mechanisms linking electrification to household consumption.
One potentially important mechanism (emphasized in some of the literature reviewed later) is through the labor market, notably labor supply and wage rates. To test whether this is an important mechanism, we also use labor supply and wage rates as the dependent variables.

After reviewing the literature in the following section, we discuss our data in section II; an online statistical addendum is also available with further details. The model and identifying assumptions are outlined in section III. Our results are presented in section IV, while section V summarizes the lessons learnt.

I. Arguments and Evidence from the Literature

The evidence on the economic gains from rural household level electrification is somewhat mixed. Some papers have reported evidence of seemingly large impacts on household consumption, income and other dimensions of welfare in developing countries (Khandker, Barnes, and Samad 2009, 2013; Khandker, Barnes, Samad, and Minh 2009; Khandker et al. 2012). Using data for India, Khandker et al. (2012) claim proportionate impacts of household electrification on income of the order of 25–50%.4 Also using data for India, the results of Chakravorty et al. (2014) suggest a 9% gain in rural nonfarm incomes. Other studies, including Bensch et al. (2011) using data for Rwanda, do not find such impacts.

Labor supply responses figure prominently in past arguments. World Bank (2004) and Mathur and Mathur (2005) point to large differences in time allocation between rural Indian households with electricity and those without it, although concerns about endogenous selection clearly cloud the inferences that can be drawn.
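The concern about endogenous selection can be made concrete with a small simulation. The sketch below is not the authors' estimator: it is a minimal illustration, with entirely hypothetical names and parameter values, of how a time-invariant household trait that drives both consumption and electricity acquisition inflates a simple comparison of households with and without electricity, and how differencing two panel rounds removes that particular source of bias. It assumes that no household is electrified at baseline and that selection operates only through the time-invariant trait; it does not address selection on changes in outcomes.

```python
import random
from statistics import mean


def simulate_selection(n=20000, true_effect=0.10, seed=0):
    """Return (cross-section gap, first-difference estimate) for simulated data.

    Hypothetical setup: log consumption = trait + time trend + effect * electricity + noise.
    """
    rng = random.Random(seed)
    late_elec, late_none = [], []  # second-round log consumption by status
    diff_elec, diff_none = [], []  # change in log consumption by status

    for _ in range(n):
        # Time-invariant trait (for example, unobserved wealth) that raises
        # consumption and also the chance of acquiring electricity.
        trait = rng.gauss(0, 1)
        gets_elec = trait + rng.gauss(0, 1) > 0

        # Baseline round: nobody has electricity yet.
        y_base = trait + rng.gauss(0, 0.5)
        # Later round: common trend of 0.3 plus the true effect if electrified.
        y_late = trait + 0.3 + true_effect * gets_elec + rng.gauss(0, 0.5)

        if gets_elec:
            late_elec.append(y_late)
            diff_elec.append(y_late - y_base)
        else:
            late_none.append(y_late)
            diff_none.append(y_late - y_base)

    # Naive comparison using the later round only: contaminated by the trait.
    cross_section = mean(late_elec) - mean(late_none)
    # Comparing changes over time differences out the trait.
    first_diff = mean(diff_elec) - mean(diff_none)
    return cross_section, first_diff


cross_section, first_diff = simulate_selection()
print(f"cross-section gap:         {cross_section:.3f}")
print(f"first-difference estimate: {first_diff:.3f}")
```

In this stylized setting the cross-section gap is far larger than the assumed effect of 0.10, while the first-difference estimate is close to it. If acquisition also responded to household-specific shocks in the later round, differencing alone would no longer suffice.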
One popular hypothesis on the benefits of household electrification is that by relieving the time burdens in collecting and preparing fuel, household electricity leads rural women to engage in market-based work (Dinkelman 2011; Kohlin et al. 2011). A number of studies show that the introduction of household electrical appliances—by raising women’s productivity in domestic work—can account for a large share of the increase in married American women’s labor force participation in the 20th century (see, for example, Greenwood et al. 2005; Coen-Pirani et al. 2010).5 Dinkelman (2011) and Grogan and Sadanand (2012) find similar effects on female employment (and not on male employment) for South Africa and Nicaragua, respectively. Dinkelman attributes this to the use of electric stoves and other time-saving appliances.

Household electrification presumably increases the productivity of domestic production relative to other uses of time. However, this productivity effect may well be weak in poor rural economies. The evidence indicates that, with rare exceptions, rural households in developing countries use electricity first and foremost for lighting, followed by powering televisions and fans (Bernard 2010; Barnes 2007; IEG 2008). Southern Africa appears to be unusual among developing countries in that rural households commonly use electricity for cooking (IEG 2008). The relevance of the Dinkelman (2011) results for South Africa to other developing countries, where most women continue to use traditional fuels and technologies for domestic tasks, is thus unclear. In some contexts, electricity may reduce the cost of lighting and is certain to improve lighting quality over traditional kerosene lighting devices.6 However, since biofuels and firewood continue to be used for cooking in most rural settings, collection time is unlikely to be hugely affected, contrary to some claims (ADB 2010; Mathur and Mathur 2005; Barnes and Sen 2004).
In India, there is also evidence that, given erratic electricity supply, reliance on kerosene for lighting is maintained alongside electricity (Mathur and Mathur 2005; Rehman et al. 2005).

4 These are coefficients on household electrification in regressions for the log of income, using village electrification rates as the instrumental variable. Estimates vary with the conditional quantile. Impacts were somewhat lower for consumption but still sizable for the upper consumption groups.

5 Contrast the American view with that of a judge in Japan who famously said: “modern appliances are partly responsible for failed marriages because they give women time to contemplate” (Hendry 2010).

6 A typical device is a homemade or locally produced wick lamp. This is known to have low luminous efficiency and to generate smoke with potentially adverse health effects.

While the productivity effect may be weak in poor rural settings, there is another implication of household electrification that could well be more important. Electric lighting extends the time available for activities that need good lighting, thus enabling a rearrangement of tasks to evening hours. Household members can then continue their enterprise work, domestic duties, homework, and reading into the evening with potential positive effects on earnings and living standards. For example, studies for Bangladesh suggest that with lighting, people spend a greater share of their evening hours in household-based income-generating activities (see Barkat et al. 2002; Chowdhury 2010). Electrification may foster small home-based enterprises (such as ironing and sewing services) with longer hours for productive work.
Constraints on electricity supply are believed to be a significant impediment to small-scale enterprise development and rural industrialization in this setting, and there is supportive evidence in the work of Alby et al. (2013).7

Leisure activities too can be reallocated, and this may well matter as much for men as for women. Without electricity, a significant share of leisure time will no doubt be in daylight hours, in which case it competes with labor supply. With electric light, there can be a substitution of male leisure from daylight hours to night time; for example, instead of hanging out with the village men at the tea shop during daylight hours, men can sit at home and watch TV in the evening once electricity is available. By freeing up daylight hours for work outside the home or own-farm, household electrification may well make regular salaried work more feasible.

The literature has suggested other ways through which household electrification might increase welfare. Health benefits can arise as electric lighting reduces the pollution from using candles or kerosene (IEG 2008). Traditional biomass fuels for cooking and kerosene lamps generate indoor air pollution that is a recognized health hazard, with health costs that are disproportionately borne by women and children.8

The implications for household expenditure on energy have also been prominent in the literature, often with claims about benefits from lower energy expenditures due to electrification (see, for example, Mathur and Mathur 2005, in the context of rural India). However, while a lower price (per unit of energy consumed) can be expected to generate a welfare gain, there can be no presumption that total expenditure on energy provides an inverse welfare indicator. Spending on energy may increase due to electrification, and this can be a good thing. Substitution amongst energy sources can also be expected.
Heltberg (2004) documents evidence of a switch to modern (nonsolid) cooking fuels with rural electrification in eight developing countries including India. There is evidence that the highly subsidized kerosene ration that many Indian households receive is substituted for cooking once electricity is used for lighting (Heltberg 2004), although kerosene lamps and candles remain a common back-up given erratic electricity supply (Rehman et al. 2005). Such substitution can still entail welfare gains, of course; for example, even kerosene is considered a cleaner fuel for cooking stoves than traditional biomass.

The role of television is not well understood and may be more important than past economic analyses have allowed for. Television viewing may improve women’s domestic productivity and welfare through greater knowledge (see, for example, Kohlin et al. 2011; ADB 2010). Research has documented significant effects on fertility from information about modern contraception gained from watching TV (IEG 2008; Peters and Vance 2011). By the same argument, other health related behaviors may alter and lead to better family nutrition, reduce child morbidity and result in overall health improvements.

Any improved productivity for one member is likely to have implications for other members through reallocations within the household. Schultz (1993), for example, discusses how changes in home-based technology such as electricity can reduce household dependence on girls’ labor and the opportunity cost of sending girls to school.

7 Evidence on the effects of electrification on rural industrialization in India can also be found in Binswanger et al. (1993) and Rud (2012).

8 See Dasgupta et al. (2006), using data for Bangladesh. Also see Duflo et al. (2008) for a review of the evidence on the health effects of indoor air pollution.
Past claims about the household-level impacts of electrification on employment are hard to reconcile with the classic characterization of an under-developed rural economy as having a large labor surplus, as in the famous Lewis (1954) model. In a setting with a large excess supply of labor one would not normally expect a purely supply-side change such as a household’s electrification to directly increase employment. If the Lewis model is right, then the channel would have to involve relaxing the external quantity constraints on employment at the household level. However, it is important in this setting to distinguish formal (regular or casual) wage work from self-employment. The quantity constraints may well apply to the former but not the latter. For example, Dinkelman (2011) conjectures that the female employment effects (in a setting with unusually high unemployment rates) are through self-employment, although her data (from census employment questions) do not allow this type of work to be identified.

The main focus of past research has been on the “internal” effect of electrification within the household. There will also be external effects of village-level electrification. A few papers treat the community as the unit of analysis and estimate the total effect on all households, whether or not they have an internal electricity connection (Dinkelman 2011; Lipscomb et al. 2013). Here we separate out the internal from the external effect.9 The external effect can be expected to operate through wages, prices, or other income and also through any quantity constraint on labor supply in a model with a labor surplus.

We can distinguish two types of external effects. Symmetric external effects exist when village electrification gives similar gains to households who are themselves electrified and to those who are not.
Examples include potential benefits such as safer streets, changing social norms, and general equilibrium effects on wages and employment opportunities. By contrast, asymmetric external effects are substantially greater for households without electricity. An example is shared lighting. If your household already has its own lighting, it may matter rather little whether or not your neighbors have light, but if you do not have light yourself, having neighbors with light can make a big difference; for example, you can send your child to study there in the evening, watch the neighbors’ TV, or store your perishables in their refrigerator.

The village externality can also operate through domestic production possibilities. The domestic production activity need not be physically located within the household in question but might reside instead within a friend or neighbor’s dwelling. Thus a household still without electricity can benefit by using the electric sewing machine (say) of a neighbor who has acquired an electric connection, possibly with some compensation to the neighbor. This is another example of an asymmetric external effect.

Village-level external effects can also arise through status-seeking behavior. Having electricity in one’s home in a typical Indian village is conspicuous and conveys a sense of status. Those without electricity (for some exogenous reason) may then respond by changing their own consumption behavior, spending more on other status-conveying goods to compensate; Frank (1997) discusses such behavior and points to supportive evidence. These goods would clearly need to also be conspicuous goods, rather than (say) food consumed within the home. Spending on celebrations and festivals is a plausible example. The welfare implications of such behavior are unclear.
As Frank (1997) argues, such social effects on consumption behavior may have little or no lasting effect on welfare because everyone keeps trying to keep up their relative status in a race that leaves everyone spending too much on such goods. Rao (2001) describes the importance of status-related spending in South Indian villages, though Rao emphasizes the more positive roles that such spending can play in building social capital, which can be welfare enhancing.

9 To our knowledge, the only other paper that does this is Khandker et al. (2013), which separates the internal from the external gains from connection to the electricity grid in Vietnam. It finds evidence of external effects on household consumption, labor earnings, and the schooling of girls (but not boys).

Geographic variables have often been treated as a source of exogenous variation in past efforts to identify the household-level impacts of electrification. Grogan (2012) uses distance to the power source, namely hydroelectric dams. Khandker, Barnes, and Samad (2009) use household proximity to an electricity line. Similarly, Chakravorty et al. (2014) use the density of transmission lines in the district of residence. Coen-Pirani et al. (2010) and Khandker et al. (2012) use the local geographic mean electrification (or appliance-ownership) rate as the IV for household electrification. A number of papers have been influenced by the identification strategy used by Duflo and Pande (2007) exploiting geological/topological features of the land; Duflo and Pande use local river gradient interacted with predicted district level dam construction as an IV (in their case for dam placement). Dinkelman (2011) and Grogan and Sadanand (2012) use local land gradient as the IV for electricity placement.10 Lipscomb et al. (2013) use the time path of estimated electricity network expansion costs as the grid expands to more costly areas. Some of these papers also include location fixed effects to control for time-invariant factors that may affect both the generation of electricity and outcomes.

II. Setting and Data

India’s early, post-independence, economic plans gave priority to the use of electricity for urban industry in an effort to develop capital-intensive domestic production. Efforts to bring electricity to rural households were delayed until the 1970s and initially focused on connections for irrigation pump sets. Based on India’s National Sample Surveys, only around 18 percent of rural households had an electricity connection for household use in 1982, while close to 70 percent did so twenty years later, although with wide variation across regions. The expansion of the grid was initially influenced by population density (households in large cities were among the first to be connected) but also favored areas where natural resources were abundant for generating electricity (rivers for hydroelectric power, and coal for thermal power). Connections in dwellings expanded rapidly during the 1980s and 1990s. The bulk of rural households get their electricity from a wired connection to above-ground power lines, which in turn are connected to village transformers that are linked via a feeder (or medium-voltage) line to power substations further away on the grid.

According to official statistics, more than 90 percent of villages are currently electrified, in that they have a feeder connection to the grid (International Energy Agency 2002). However, far from all households in these villages are connected to the electricity grid, and the extent of household connections varies widely on a regional basis. Since the 1960s, five main regions (the North, East, West, Northeast, and South) have comprised and managed India’s grid (Pandey 2007).
Each state and union territory of India falls in one of these regions; power is also shared across regions.

A key question is how India’s dramatic increase in household-level electrification impacted household economic behavior. The India Rural Economic and Demographic Survey (REDS) covers this key period of rising household-level electrification and includes detailed information for a panel of households. This appears to be the only long-period household panel data set available for a rural economy that underwent extensive electrification. While providing a unique opportunity for assessing the long-term impacts of electrification, the use of the data for this purpose still requires a number of assumptions, which we outline later.

We use the 1981–82 and 1998–99 (henceforth 1982 and 1999) rounds of the REDS, conducted by the National Council for Applied Economic Research (NCAER).¹¹ The two rounds form a panel of 6,008 households across 242 villages in 15 states. The REDS was initially designed to be representative of rural India, excluding states in significant conflict.¹² Given the long time period covered by the panel, current representativeness of the survey data for rural India can be questioned and we can only make inferences for the baseline panel sample at subsequent dates.

10 Flatter land makes it cheaper to lay cables (Dinkelman 2011).

11 There was a first round of the REDS (called ARIS) in 1970–71 but its questionnaire was more restricted in scope than that for the subsequent rounds. There is also a more recent round for 2006, which is not yet publicly available. However, for the questions of interest to us, the 1982–1999 panel covers the key period.
In both rounds, the household survey collected data on education, health, marital status, labor supply (main and secondary occupations), sources of farm and nonfarm income, access to infrastructure and facilities, consumption expenditure and assets, agricultural production, and land owned and inherited.¹³ Accompanying village surveys for both rounds elicited community access to facilities, infrastructure, population characteristics, prices, and wage rates by activity for men and women.

For each round, we construct a binary indicator of whether the household has electricity. Only the 1999 survey directly asked whether the household had electricity in the home. In the earlier survey, households were asked about their ownership of consumer durables that use electricity and of electric irrigation pumps. We construct an indicator of household electrification over time that is equal to one in 1982 if the household reports having an electric appliance or pump, and equal to one in 1999 if the household reports having an electricity connection in the home, incurring consumption expenditures on electricity, or owning an electric pump.¹⁴

In a few cases, it is unclear whether an appliance requires electricity to run. For example, early radios and sewing machines were run without electricity. The evidence for rural India in 1982 convinced us that such appliances were typically not electricity driven then. We constructed alternative definitions of household electrification, both conservative (assuming such appliances were not electricity dependent) and liberal (assuming they were). In this paper we will use the more conservative definition, which we believe is more likely to approximate the correct measure. However, we tested sensitivity to using the broader measure as well and found this made negligible difference to the main results.
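The construction of the electrification indicator described above can be sketched in code. This is a minimal illustration only, not the authors' implementation: the record types and field names (`owns_electric_appliance`, `owns_ambiguous_appliance`, `has_home_connection`, and so on) are hypothetical and do not correspond to actual REDS variable names.

```python
from dataclasses import dataclass


@dataclass
class Report1982:
    """Hypothetical 1982 record; field names are illustrative, not REDS variables."""
    owns_electric_appliance: bool   # durable that unambiguously needs electricity
    owns_ambiguous_appliance: bool  # e.g. an early radio or sewing machine
    owns_electric_pump: bool        # electric irrigation pump


@dataclass
class Report1999:
    """Hypothetical 1999 record; field names are illustrative, not REDS variables."""
    has_home_connection: bool
    electricity_expenditure: float  # consumption expenditure on electricity
    owns_electric_pump: bool


def electrified_1982(r: Report1982, liberal: bool = False) -> int:
    """Conservative definition by default; `liberal` also counts ambiguous appliances."""
    flag = r.owns_electric_appliance or r.owns_electric_pump
    if liberal:
        flag = flag or r.owns_ambiguous_appliance
    return int(flag)


def electrified_1999(r: Report1999) -> int:
    """One if any of the three 1999 conditions named in the text holds."""
    return int(
        r.has_home_connection
        or r.electricity_expenditure > 0
        or r.owns_electric_pump
    )


# A household owning only a radio counts as electrified under the liberal
# definition but not under the conservative one.
radio_only = Report1982(False, True, False)
print(electrified_1982(radio_only), electrified_1982(radio_only, liberal=True))
```

Keeping the liberal definition behind a flag makes the sensitivity check a one-argument change.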
We do not know from the survey whether the household’s electricity was obtained from a private generator or through connection to the grid, although our expectation is that private generators were relatively rare. We include households that report having an electricity connection for purely agricultural purposes since it is not clear that we can separate out the effects of electricity in the home versus that on the household’s farm. Irrigation connections may well also have ramifications for time use and allocation. However, we also tested the sensitivity of our results to excluding households whose only connection is for agricultural purposes; this made very little difference to the results.¹⁵

Using our main electrification indicator, 24 percent of rural households had electricity in 1982, while 71 percent did in 1999. Our figures accord reasonably well with national trends as reported in other sources.¹⁶

12 The 1982 round surveyed a total of 4,979 households across 250 villages. (The survey excluded Assam because of an insurgency at the time. In the 1999 round, Jammu and Kashmir was excluded due to unrest there.) The 1999 round covered all surviving 1982 households and added a small random sample of new households from the same villages. Together with household division since 1982, this results in a sample of 7,474 households. The increase in the number of households with baseline information in 1982 and hence in the 1982–1999 panel is explained by household splits over time.

13 Both survey rounds also collected data from women on their time allocation and that of their children. This is likely to be quite noisy data, especially in measuring changes over time (not helped by changes in the questionnaire for reporting time allocation). We decided that the data were not usable for our purpose.

14 There are 122 households for which our electricity indicator is positive in 1982 and zero in 1999.
Of these, 42 split off from a mother household that maintains its electricity status over time; 42 split off from an initial household with electricity to a number of households of which none have electricity in 1999. The remaining 38 households did not split between rounds and we can only conclude that they became un-electrified over the period studied, perhaps through loss of a private generator or an illegal connection.

15 The results are reported in table A2 of the online Statistical Addendum.

16 Using National Sample Survey (NSS) data for India, Pachauri and Muller (2008) estimate that 18% of rural households had electricity in 1981 (for both household and agricultural use). Restricting their sample to just those fifteen states in the REDS panel, the share is 20%. The comparable figure in our analysis is a bit higher at 24% for households with electricity connections for agriculture and owning an electric appliance. The 55th NSS round (1999–00) indicates that 66% of rural households across the fifteen states in the REDS panel use electricity for domestic lighting, while our figure based on the REDS 1999 is 71%.

Table 1 presents the share of panel households with electricity across the fifteen states in both rounds. There is considerable regional variation. In 1982, household electrification was less common in the East than elsewhere, especially the North (where many hydropower stations are found). Although many more households had received electricity by 1999, access was still limited in the East, especially Bihar, eastern parts of Madhya Pradesh, and Uttar Pradesh.

Table 1. Proportion of Rural Households with Electricity by State

                               1982               1999
State                  N    Mean Std. dev.     Mean Std. dev.
Andhra Pradesh       350    0.23      0.42     0.76      0.43
Bihar                277    0.03      0.17     0.08      0.28
Gujarat              463    0.29      0.45     0.94      0.24
Haryana              353    0.26      0.44     0.87      0.34
Himachal Pradesh      94    0.32      0.47     1.00      0.00
Karnataka            500    0.11      0.32     0.86      0.35
Kerala               330    0.39      0.49     0.91      0.29
Madhya Pradesh       627    0.22      0.42     0.70      0.46
Maharashtra          310    0.30      0.46     0.79      0.41
Orissa               337    0.27      0.44     0.58      0.49
Punjab               228    0.68      0.47     1.00      0.00
Rajasthan            639    0.32      0.48     0.70      0.46
Tamil Nadu           458    0.36      0.40     0.90      0.30
Uttar Pradesh        783    0.08      0.27     0.38      0.49
West Bengal          259    0.05      0.22     0.54      0.50
Total panel         6008    0.24      0.43     0.71      0.45

Notes: There were two more households in Andhra Pradesh in 1999 and two fewer in Tamil Nadu. A household is defined as being electrified if it owned an unambiguously electricity-run appliance or an electric irrigation pump set in 1982 and if it reports having electricity, incurring expenditures on electricity, or owning an electric pump set in 1999.
Source: 1982 and 1999 REDS panel.

Table 2 gives a cross-tabulation of the numbers of households in the panel according to whether they have electricity. We see that 49% (2,928/6,008) of the sampled panel households went from being non-electrified in 1982 to being electrified by 1999. (Very few went in the other direction.) The share of households with electricity for their own use rose from 24% to 71%.

The table also gives the cross-tabulations with village electrification, as reported in the village survey for REDS, which asked what year the village was electrified. The latter is not defined precisely, but we presume this to imply that the village has a feeder connection to the grid. (Nor does “village electrification” mean it had a reliable flow of electricity.) The expansion in village electrification is evident; in 1982, 27% of sampled households lived in villages without electricity; this had declined to 5% by 1999. The bulk of households with electricity lived in villages deemed to be electrified at both dates, although “non-grid” sources appear to have been relatively more important in 1982; 90% of households with electricity in 1982 lived in electrified villages as compared to 99% in 1999. Also notable is that 86% of those who did not have their own electricity at either date lived in electrified villages. An external effect for such households could thus entail large gains.

Table 2 also suggests that household demand-side factors have been important in electricity expansion for residential use. Amongst those households without electricity in 1982, 64% had acquired electricity by 1999. However, it is clear that the acquisition of electricity by the village was not the only reason; indeed, 76% of the 2,928 households who acquired electricity for their own use over the period lived in villages that were already deemed to be electrified, presumably through a grid connection. (As one would expect, the villages that were connected in 1982 all stayed connected.)

Table 2. Change in Rural Household and Village Electrification between 1982 and 1999

                              1982: Household has electricity?
                                         No                     Yes
                              Village has electricity?Village has electricity?
1999
Household has   Village has
electricity?    electricity?       Yes      No   Total     Yes      No   Total   Total
No              Yes                880     518   1,398      83      34     117   1,515
                No                   0     233     233       0       5       5     238
                Total              880     751   1,631      83      39     122   1,753
Yes             Yes              2,223     658   2,881   1,223     102   1,325   4,206
                No                   0      47      47       0       2       2      49
                Total            2,223     705   2,928   1,223     104   1,327   4,255
Total                            3,103   1,456   4,559   1,306     143   1,449   6,008

Source: 1982 and 1999 REDS panel households. Also see Table 1 notes.
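The percentages quoted in the discussion of Table 2 can be recomputed from the cell counts. The sketch below transcribes the twelve non-zero interior cells of Table 2; the helper names (`COUNTS`, `total`, `pct`) and the 0/1 coding are introduced here purely for illustration.

```python
# Cell counts transcribed from Table 2.
# Key: (household 1982, village 1982, household 1999, village 1999), 1 = has electricity.
COUNTS = {
    (0, 1, 0, 1): 880,  (0, 0, 0, 1): 518,  (0, 0, 0, 0): 233,
    (1, 1, 0, 1): 83,   (1, 0, 0, 1): 34,   (1, 0, 0, 0): 5,
    (0, 1, 1, 1): 2223, (0, 0, 1, 1): 658,  (0, 0, 1, 0): 47,
    (1, 1, 1, 1): 1223, (1, 0, 1, 1): 102,  (1, 0, 1, 0): 2,
}
FIELDS = ("hh82", "vil82", "hh99", "vil99")


def total(**conditions):
    """Number of households whose cell matches every given condition."""
    return sum(
        n for key, n in COUNTS.items()
        if all(key[FIELDS.index(f)] == v for f, v in conditions.items())
    )


def pct(numerator, denominator):
    """Share in whole percentage points."""
    return round(100 * numerator / denominator)


panel = total()
acquired = total(hh82=0, hh99=1)   # gained electricity between rounds
never = total(hh82=0, hh99=0)      # no own electricity at either date

shares = {
    "acquired electricity, share of panel": pct(acquired, panel),
    "acquired, share of those without in 1982": pct(acquired, total(hh82=0)),
    "acquired, village already electrified in 1982": pct(total(hh82=0, hh99=1, vil82=1), acquired),
    "never electrified, village electrified by 1999": pct(total(hh82=0, hh99=0, vil99=1), never),
    "never electrified, village connected 1982-99": pct(total(hh82=0, hh99=0, vil82=0, vil99=1), never),
    "in non-electrified village, 1982": pct(total(vil82=0), panel),
    "in non-electrified village, 1999": pct(total(vil99=0), panel),
}
for label, value in shares.items():
    print(f"{label}: {value}%")
```

Encoding each cell by its four binary attributes lets every share in the text be expressed as a ratio of two filtered sums.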
Moreover, 32% of the 1,631 households who did not have electricity at either date lived in villages that acquired a connection over the period. The evident importance of household-demand factors from table 2 reaffirms the importance of treating household electrification as endogenous.

We are interested in the dynamic impacts of village electrification separately from the current impacts of household access. Past grid connection may bring cumulative gains over time, including for households not directly connected to the electricity source. Using these data, we construct a variable for the years since the village has been electrified. Given a few discrepancies in dates across the two rounds, we rely primarily on the 1982 data, which we deem likely to be more reliable given recall bias, but we use information from the later survey when the information is missing in the baseline and whenever electrification occurred more recently.

We first examine the impact of electrification on real total consumption per capita expressed in 1998 rupees. We also consider impacts on components of spending including fuel expenditures per capita.¹⁷ The consumption variables are fully comparable across the two survey rounds.¹⁸ Whether electricity has an impact on the ownership of kerosene-run stoves is of interest given the claims that subsidized kerosene is substituted for cooking when electricity powers lighting.

The survey contains information on the primary and secondary activities of adults. We construct household-level variables for the average number of eight-hour work days per individual male or female adult member in casual wage work, regular wage work, and nonagricultural and agricultural self-employment during the year preceding the survey round.¹⁹ For our purposes here, we define adults to be those aged between sixteen and fifty-five. Table 3 presents summary statistics for our outcome variables.
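The rule for dating village electrification described above (prefer the 1982 report, fall back on the 1999 report when the baseline is missing or when electrification came after the baseline) might be coded roughly as follows. This is one possible reading of the text, not the authors' code; the function and argument names are hypothetical.

```python
from typing import Optional

BASELINE_YEAR = 1982
FOLLOW_UP_YEAR = 1999


def year_village_electrified(
    year_from_1982_survey: Optional[int],
    year_from_1999_survey: Optional[int],
) -> Optional[int]:
    """Pick the electrification year, preferring the baseline report.

    Assumption: a missing year (None) in the 1982 survey means either that the
    information was not recorded or that the village was not yet electrified.
    """
    if year_from_1982_survey is not None and year_from_1982_survey <= BASELINE_YEAR:
        return year_from_1982_survey
    # Baseline missing, or electrification happened after the baseline round.
    return year_from_1999_survey


def years_electrified(
    year_from_1982_survey: Optional[int],
    year_from_1999_survey: Optional[int],
) -> int:
    """Years of village electrification as of the 1999 round (0 if never electrified)."""
    year = year_village_electrified(year_from_1982_survey, year_from_1999_survey)
    if year is None or year > FOLLOW_UP_YEAR:
        return 0
    return FOLLOW_UP_YEAR - year


# Conflicting reports: the baseline date wins.
print(years_electrified(1970, 1972))
# Village connected after the baseline round: the later survey is used.
print(years_electrified(None, 1990))
```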
17 We are not able to examine fuel expenditures net of that spent on electricity, as these expenditures were not disaggregated in the first survey round. Fuel expenditures include the imputed value of own-produced fuel.

18 The online statistical addendum provides more information on the construction of these variables.

19 There are some measurement errors as person-specific days worked in certain activities run in excess of 365 days in a year. We restrict the number of potential person-days to 365 for each activity as well as the total person-days spent in all activities combined for every individual household member.

Table 3. Summary Statistics for Household Level Outcome Variables, by Electricity Status and Year

Panel households only

                                                                1982                                    1999
                                                    No electricity         Electricity      No electricity         Electricity
Variable                                            Mean        SD      Mean        SD      Mean        SD      Mean        SD
Consumption variables (1998 Rs.)
Total consumption per capita                    3,414.34  1,939.49  5,687.66  4,537.10  5,133.11  2,694.52  9,163.87  7,725.03
Log total consumption per capita                   8.002     0.514     8.466     0.569     8.437     0.446     8.945     0.552
Food consumption per capita                     2,079.66  1,217.14  3,222.68  2,421.59  2,755.13  1,211.84  3,784.94  1,779.25
Nonfood, no-fuel consumption per capita         1,144.17    967.55  2,158.66  2,822.53  2,039.29  1,701.03  4,846.63  6,381.94
Clothing & footwear expenditures per capita       377.11    271.93    626.16    562.58    458.89    296.24    745.90    606.00
Entertainment expenditures per capita              40.60     63.95     84.54    148.51     75.47    126.71    154.78    252.68
Ceremonies expenditures per capita                 56.01    299.83    117.60    845.80    195.94    769.71    644.11  2,439.41
Travel expenditures per capita                     55.58     79.83    100.82    134.70    105.87    139.25    230.25    335.64
Education expenditures per capita                  43.12    108.22    124.75    320.10     77.32    194.52    290.05    757.58
Health expenditures per capita                     97.29    155.06    147.10    207.81    205.59    248.82    364.34    634.46
Domestic help expenditures per capita               3.94     34.45     47.25    362.52      3.02     86.39     22.03    269.49
Repairs to housing expenditures per capita          0.81     10.44      2.04     21.66      0.99     13.21      8.69     76.46
Fuel expenditure per capita                       190.51    175.19    306.33    318.88    338.70    286.16    532.30    418.19
Kerosene stove ownership share                      0.16      0.36      0.41      0.49      0.28      0.45      0.56      0.50
Days of (nondomestic) work per adult (aged 16–55)
Total days of work by women                        78.39     79.18     69.23     72.51     72.73     78.28     69.23     77.83
Total days of work by men                         149.75     98.23    129.60     96.74    172.04     93.13    165.77    106.54
Days of regular wage work women                     0.95     12.39      3.93     29.80      0.87     15.27      5.18     39.24
Days of regular wage work men                      21.41     64.22     35.03     78.74     19.25     68.46     48.70    108.99
Days of casual wage work women                     28.57     64.17      6.65     32.89     42.29     71.32     23.14     57.00
Days of casual wage work men                       62.33     93.92     15.41     48.89    104.62     98.82     50.36     86.52
Days of agricultural self-employment by women      45.08     53.12     55.45     58.34     28.03     41.25     39.90     50.92
Days of agricultural self-employment by men        57.07     59.59     63.47     65.60     38.09     46.36     52.82     59.28
Days nonagricultural self-employment by women       4.07     18.97      3.48     17.35      1.75     16.71      1.33     14.42
Days nonagricultural self-employment by men        11.66     45.74     18.16     61.99     11.44     47.36     17.29     60.12
Observations                                                 4,557               1,449               1,753               4,253

Share panel households with electricity: 1982, mean 0.24 (SD 0.43); 1999, mean 0.71 (SD 0.45).

Notes: Consumption aggregates are expressed in 1998 prices using a deflator obtained from NCAER. A household is defined as being electrified if it owned an unambiguously electricity-run appliance or an electric irrigation pump set in 1982 and if it reports having electricity, incurring expenditures on electricity, or owning an electric pump set in 1999. A household is involved in an activity if at least one family member aged sixteen and over is involved in the activity either as their primary or secondary activity. The number of days worked in each activity is calculated based on the total hours reported in the activity during the last year, divided by eight. Days are then expressed as a household mean for male and female members aged sixteen to fifty-five. The data are unweighted.
Source: 1982 and 1999 REDS panel.
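The days-of-work construction described in the notes to Table 3 and in footnote 19 can be sketched as follows. The record layout (`sex`, `age`, `hours`) and the activity labels are hypothetical, and capping the all-activity total by simple truncation at 365 is an assumption, since the text does not say how the cap on the total is applied.

```python
HOURS_PER_DAY = 8
MAX_DAYS = 365
ADULT_AGES = range(16, 56)  # ages sixteen to fifty-five inclusive


def activity_days(hours_by_activity):
    """Annual hours per activity converted to eight-hour days, each capped at 365."""
    return {a: min(h / HOURS_PER_DAY, MAX_DAYS) for a, h in hours_by_activity.items()}


def total_days(hours_by_activity):
    """Days across all activities; assumption: the total is truncated at 365."""
    return min(sum(activity_days(hours_by_activity).values()), MAX_DAYS)


def household_mean_days(members, sex):
    """Mean total days per adult member of the given sex (None if there are none)."""
    adults = [m for m in members if m["sex"] == sex and m["age"] in ADULT_AGES]
    if not adults:
        return None
    return sum(total_days(m["hours"]) for m in adults) / len(adults)


# Hypothetical household: two adult women, one woman above the age range, one adult man.
members = [
    {"sex": "F", "age": 30, "hours": {"casual_wage": 800, "farm_self": 400}},
    {"sex": "F", "age": 45, "hours": {"casual_wage": 4000}},  # 500 days, capped at 365
    {"sex": "F", "age": 60, "hours": {"farm_self": 1600}},    # outside the 16-55 range
    {"sex": "M", "age": 20, "hours": {"regular_wage": 2000, "farm_self": 1600}},
]
print(household_mean_days(members, "F"), household_mean_days(members, "M"))
```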
In choosing controls, we have been guided in part by past literature. In the closest prior study for India, Khandker et al. (2012) control for household demographic characteristics, maximum education, land and nonland assets, and infrastructure. The REDS allows a similar set of controls. Our base-year household-level controls include demographic characteristics (age, age squared, and marital status of the household head, disability/illness in the household, size and age/gender composition, religion, and caste); maximum years of schooling of any adult; and wealth variables (dummies for landownership and whether the house is built with bricks, and inherited land amounts). Community variables include access to facilities and infrastructure, local agricultural and nonagricultural wages for men and women, the Muslim population share, characteristics of land tenure and cultivable land, a dummy indicating below-normal crop yields in the preceding year, and inequality as measured by the mean log deviation of household consumption in the community. We also include controls for some changes in household characteristics (including size and composition and the maximum education) and in community characteristics over time. There are endogeneity concerns about some of these controls to which we will return. We do not control for changes in whether the house is built with bricks, land inheritance amounts, or changes in village access to facilities, all of which are clearly affected by access to electricity. Finally we include a dummy variable to indicate whether the household split between the survey rounds.
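For readers unfamiliar with the inequality control, the mean log deviation (MLD) of a set of positive values is the average of ln(mean / x_i), which is zero under perfect equality and rises with dispersion. The sketch below uses this standard definition; whether the authors apply weights or any adjustment is not stated in the text, so the unweighted form here is an assumption, and the function name is illustrative.

```python
import math


def mean_log_deviation(values):
    """Unweighted MLD: (1/N) * sum of ln(mean / x_i) over strictly positive x_i."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("MLD requires a non-empty list of strictly positive values")
    mean = sum(values) / len(values)
    return sum(math.log(mean / v) for v in values) / len(values)


# Hypothetical per capita consumption levels in two villages.
equal_village = [100.0, 100.0, 100.0]
unequal_village = [1.0, 3.0]
print(mean_log_deviation(equal_village), mean_log_deviation(unequal_village))
```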
Physical proximity to power generating plants is known to have influenced the placement of grid infrastructure (see, for example, Chaurey et al. 2004). For identification purposes, we use data on the location of power generating plants obtained from the 2004–5 CO2 Baseline Database for the Indian Power Sector, published by the Central Electricity Authority of India. This public database has detailed information on each power plant in India, across different types of power (hydropower versus thermal, for example) and generating capacity. The data include the date at which each power plant became active. Using the GPS locations of the REDS survey villages, we calculated the straight-line distance from each village to the nearest power plant for 1965 (as a benchmark before the REDS survey, during the time when rural electricity generation capacity was beginning to expand) and for 1975.

III. Specification and Identification

We postulate two distinct ways that a rural household can benefit from electrification. First, it can gain from village electrification, even if it does not have electricity itself, though the extent of that gain may depend on whether the household itself has electricity. Second, the household can gain directly from having electricity inside the home. We first outline the model specification and then discuss identification.

Model Specification

To quantify these two factors we exploit the unusually long time period in the REDS panel. This allows us to observe household consumption at a base date (1982), denoted $C_{ij82}$ for household $i$ in village $j$, and at the follow-up survey date seventeen years later (1999), denoted $C_{ij99}$.
To help motivate our model specification, consider first a standard difference-in-difference (DD) specification giving the change in (log) consumption for household $i$ in village $j$ as a function of the change in electrification:

$$\Delta \ln C_{ij99} = \ln(C_{ij99}/C_{ij82}) = \alpha + \gamma (E_{ij99} - E_{ij82}) + \varepsilon_{ij} \tag{1}$$

Here $E_{ij99}$ and $E_{ij82}$ denote whether household $i$ in village $j$ has electricity in 1999 and 1982 respectively. This allows household electrification to yield a proportionate consumption gain represented by the term $\gamma (E_{ij99} - E_{ij82})$. Household fixed effects in the levels of log consumption are differenced out in deriving equation (1). The specification in (1) allows for a time effect in the levels of consumption, represented by $\alpha$. In the simple DD set-up, the innovation error term, $\varepsilon_{ij}$, is assumed to be orthogonal to $(E_{ij99} - E_{ij82})$. There are two groups of concerns about this specification in this context.

Dynamic Effects

The first concern is that there may be dynamic returns to electrification. There are two aspects to consider. One is that the returns to household electrification vary over time. This can be addressed by allowing for a vector of control variables ($X_{ij}$) that includes $E_{ij82}$. We test this, as described further below. If we cannot reject the null hypothesis that the coefficient on $E_{ij82}$ in the vector $X_{ij}$ is zero then the relationship between consumption growth and household electrification is said to be homogeneous, meaning that the effect is the same at the two dates.

Another possibility within this overall concern about dynamics is that this specification does not allow explicitly for a cumulative effect of electrification. In the underlying levels equation (for which equation 1 is the differenced model) current consumption only depends on current electrification.
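For concreteness, one levels equation consistent with this description can be written out and differenced. This is an illustrative sketch rather than a formula given in the paper: the household fixed effect $\eta_{ij}$, the period effect $\delta_t$, and the idiosyncratic error $\nu_{ijt}$ are symbols introduced here as assumptions.

```latex
\begin{aligned}
\ln C_{ijt} &= \eta_{ij} + \delta_t + \gamma E_{ijt} + \nu_{ijt},
  \qquad t \in \{82, 99\}, \\
\Delta \ln C_{ij99} &= \ln C_{ij99} - \ln C_{ij82}
  = (\delta_{99} - \delta_{82}) + \gamma \,(E_{ij99} - E_{ij82})
  + (\nu_{ij99} - \nu_{ij82}).
\end{aligned}
```

Under these assumptions the intercept of equation (1) is the difference in period effects, $\delta_{99} - \delta_{82}$, the error term of equation (1) is $\nu_{ij99} - \nu_{ij82}$, and the fixed effect $\eta_{ij}$ drops out. Consumption in each year depends only on electrification status in that same year, which is why a cumulative effect of past access has no place in this form.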
If there is also a cumulative gain from past access then this will depend on the time period in which the household had electricity. As a practical matter, we do not know from the data when the household gained electricity. What we do know is when the village was connected to the grid. Note also that village electrification can have both an internal (dynamic) gain for a household with electricity and an external gain to those who do not have electricity themselves.

To allow for a dynamic effect of past electrification we postulate that village electrification allows household consumption to grow at a higher rate. Let $T_j$ denote the number of years that village $j$ has been connected to the grid. We expect that the gain in consumption from village electrification will also depend on whether the household has electricity itself. Our prior here is that there will be a larger external effect for those who do not have their own electricity connection. Thus we would ideally augment equation (1) with an extra term of the form $(\beta_0 + \beta_1 E_{ij,99-T_j}) T_j$, where $E_{ij,99-T_j} = 1$ if the household has electricity at the time of connection to the grid $T_j$ years earlier and $E_{ij,99-T_j} = 0$ otherwise. As a practical matter, however, we do not observe whether the household had electricity at the time of the grid connection, as we only observe this for the initial and the final survey dates. We replace $E_{ij,99-T_j}$ by $E_{ij99}$. (This will probably produce some attenuation bias in our estimates of $\beta_1$.) We considered the alternative of replacing $E_{ij,99-T_j}$ by $E_{ij82}$. However, since we will be using $E_{ij82}$ as an instrumental variable for $E_{ij99}$ this makes little difference: either one is using $E_{ij82}$ or a predicted value based on $E_{ij82}$. Notice that the dynamic gain becomes $\beta_0 (1 - E_{ij,99-T_j})$ under the restriction $\beta_0 + \beta_1 = 0$, in which case the private return to village grid connection is zero if one already has electricity privately.
We do not assume that this is the case but rather test the null hypothesis that $\beta_0 + \beta_1 = 0$ as a restricted form.

Endogeneity of Electricity Acquisition

There are two reasons why one can question the assumption that $\mathrm{Cov}[\varepsilon_{ij}, (E_{ij99} - E_{ij82})] = 0$ in (1). The first relates to observable sources of endogeneity, while the second relates to unobservable ones. With regard to the first, there may be systematic heterogeneity in the observable baseline characteristics of households and in changes over time in exogenous characteristics that jointly influence consumption growth and the acquisition of electricity, implying that $\mathrm{Cov}[\varepsilon_{ij}, (E_{ij99} - E_{ij82})] \neq 0$ in (1). To address this concern we augment equation (1) to include a vector of household and community characteristics as controls, which include changes in exogenous characteristics.

Thus, on incorporating the dynamic effect, we have the following augmented DD model of consumption growth between the beginning and final survey dates:

$$\Delta \ln C_{ij99} = \alpha + (\beta_0 + \beta_1 E_{ij99}) T_j + \gamma (E_{ij99} - E_{ij82}) + \pi X_{ij} + \varepsilon_{ij} \tag{2}$$

(We use the same specification for labor supply.) In this equation we see the two distinct ways that electrification can matter. The first is through village electrification, the benefits of which can depend on own-electrification; this is the term $(\beta_0 + \beta_1 E_{ij99}) T_j$. We interpret this as the external effect. The second (partial equilibrium) channel is the direct idiosyncratic effect of the household acquiring electricity within the period, $\gamma (E_{ij99} - E_{ij82})$.

The second endogeneity concern is that there may well be latent characteristics (or changes in characteristics) in the error term that are correlated with electricity acquisition, again implying that $\mathrm{Cov}[\varepsilon_{ij}, (E_{ij99} - E_{ij82})] \neq 0$. We next turn to our identifying assumptions in addressing this concern.
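To see why a latent factor in the error term matters, a small simulation may help. It is a stylized illustration on synthetic data, not the paper's estimator: it uses the simple specification in equation (1) with no controls and no village term, a single instrument (baseline electrification, made exogenous by construction), and invented parameter values. All names (`simulate`, `TRUE_GAMMA`, and so on) are hypothetical.

```python
import random


def simulate(n=20000, gamma=0.30, seed=0):
    """Simulate (consumption growth, change in electrification, baseline status)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        e82 = 1 if rng.random() < 0.24 else 0  # baseline status, exogenous by construction
        u = rng.gauss(0.0, 1.0)                # latent factor raising both take-up and growth
        if e82:
            e99 = 1                            # simplification: connections are never lost
        else:
            e99 = 1 if 0.2 + u + rng.gauss(0.0, 1.0) > 0 else 0
        d_e = e99 - e82
        growth = 0.40 + gamma * d_e + 0.5 * u + rng.gauss(0.0, 0.5)
        rows.append((growth, d_e, e82))
    return rows


def cov(xs, ys):
    """Population covariance of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


TRUE_GAMMA = 0.30
growth, d_e, e82 = zip(*simulate(gamma=TRUE_GAMMA))

# OLS slope of growth on the change in electrification: biased because d_e
# is correlated with the latent factor u.
ols_estimate = cov(d_e, growth) / cov(d_e, d_e)

# Single-instrument IV (Wald ratio) using baseline status as the instrument.
iv_estimate = cov(e82, growth) / cov(e82, d_e)

print(f"true={TRUE_GAMMA:.2f}  OLS={ols_estimate:.2f}  IV={iv_estimate:.2f}")
```

In this set-up the OLS slope overstates the effect because households with a favorable latent draw both acquire electricity and grow faster, while the instrumented ratio stays close to the true value. The result depends entirely on the instrument being unrelated to the latent factor, which the simulation imposes and which real data cannot guarantee.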
Identifying Assumptions

While geographic variables are plausible predictors of electricity placement at the household level, the validity of the exclusion restrictions is never beyond question. Household electrification in the geographic area is a questionable IV since we postulate the existence of external benefits (including to non-electrified households), such as through greater employment opportunities or general equilibrium price effects on the village economy. Geographic proximity to an electricity line is also questionable given endogenous placement of those lines.²⁰

We shall use instead long lags of proximity to the primary power-generating source from which the transmission lines emanate. While the lines and substations that emanate from that source are endogenously placed, the source itself can be more plausibly treated as independent of household outcomes conditional on placement and other covariates. Of course, judgments on the plausibility of the identification strategy must also depend on what other control variables are used, given that the estimator is making a conditional independence assumption. By exploiting the relatively rich REDS data on village attributes and the (long) panel structure we believe that our exclusion restriction is defensible.

We make two identifying assumptions. The first assumption is as follows:

Identifying Assumption 1

The endogeneity of household electrification is confined to the acquisition of electricity over the time period, i.e., $E_{ij99}$ is endogenous but $E_{ij82}$ is exogenous and excludable from the vector $X_{ij}$ in (2). Thus $E_{ij82}$ is used as an IV for $E_{ij99} - E_{ij82}$.

We acknowledge that this assumption can be questioned. Although exogeneity of $E_{ij82}$ is more plausible over a long period (seventeen years in our case), the length of the time period also makes it more likely that the returns to electrification may have changed (as noted already).
This casts doubt on the homogeneity condition and (hence) the excludability of the initial value in a difference-in-difference specification. In view of this concern, we use a second IV to test the validity of assumption 1. Given an over-identifying IV, the homogeneity restriction is testable. The sequential structure to the identification plays an important role in that we do not estimate impacts unless the first assumption is deemed to be valid. For this purpose we draw on our second identifying assumption: Identifying Assumption 2 Access to electricity depends in part on historical proximity to the primary power-generating plants, which does not influence outcomes independently of electrification or the other controls, including other exogenous geographic variables. 20 Chakravorty et al. (2014) argue that their use of the local density to federally placed transmission lines is a better in- strument than the state-placed distribution systems, but the concern remains; as Chakravorty et al. note, the placement of the transmission lines from power generating plants is a matter of choice, and past placement is likely to have re- sponded to the potential for local nonfarm development. 398 van de Walle, Ravallion, Mendiratta, and Koolwal In implementing assumption 2, we use the distance in kilometers from the village to the nearest power generating plant within the state in 1965 and 1975. As compared to distance to the nearest power sub- station (which supplies feeder lines to villages and is hence located closer to villages), distance to the nearest power generating plant is more likely to be orthogonal to unobserved factors affecting household outcomes and electricity access. 
Villages do not shift location in rural India, and even by 1999, power-generating plants were often several hundred kilometers away from villages (with means of 177 km in 1965 and 139 km in 1999). Also, power-generating plants are typically located on engineering criteria and do not appear to have any special economic salience beyond that. So it does not seem plausible that proximity to a power station (let alone 10–20 years ago) would have any lasting independent effect on household consumption. Nonetheless, to help relieve concerns about omitted geographic factors correlated with proximity to power-generating plants, we allow a complete set of district fixed effects, as well as control variables for village characteristics, in the vector X. We include initial conditions (1982 values) of the household and community characteristics detailed in section II as well as changes in their value over time. We tested separate specifications dropping variables that are potentially endogenous (such as changes in household land and housing material and in community access to facilities21). Our preferred model, and the one we focus on, omits these variables. The online addendum gives supplementary summary statistics, including for the control variables.

Armed with the extra IV, we test the null hypothesis that electrification has the same effect at each date, conditional on other controls; that is, we test whether $E_{ij82}$ should be included in the control vector $X_{ij}$. It is this homogeneity restriction that allows us to use $E_{ij82}$ as an IV for $E_{ij99}$. In other words, this is the test of the exclusion restriction for this IV, which is only possible given that we have the second IV. If the test fails then we will not employ our estimator.22

IV. Results

We first discuss the results for consumption, before turning to the role of labor earnings as the channel of impact.

Total Consumption

The first stage regressions for electrification are given in table 4. As expected based on table 2, there are a number of significant demand-side factors in the expansion of household electrification over the period (including religion, caste, schooling, and wealth variables). Even so, the instruments perform well. Distance to the nearest power-generating plant in each of the two years we examine has a strong and significant impact on household electrification. The two distance variables enter with about the same coefficient but opposite signs, implying that households living in villages for which the distance to a power station fell more (a larger difference between the 1965 distance and the 1975 distance) were more likely to gain access to electricity.23

21 Including these potentially endogenous household and community change variables results in a slightly higher IV estimate of the internal household effect but no difference to the external village effect.

22 Strictly, our model is still identified if the test fails. However, with only the second, geographic IV left, we do not feel that this would adequately capture the exogenous variation in household electrification.

23 Note that the signs switch between columns (1) and (2) in table 4 since the dependent variable in (1) reflects whether a household is not electrified in 1999.

Table 4. First Stage Estimates

| Variables | (1) Years of village elec. 99 × Household not electrified 99 | (2) Change in HH electrification |
| --- | --- | --- |
| Log household size (initial condition) | -0.064 [-0.13] | 0.001 [0.06] |
| Age of head (initial condition) | -0.128 [-1.47] | 0.006 [1.54] |
| Age sq. of head (initial condition) | 0.001 [1.49] | -0.000 [-1.55] |
| Head is divorced/widowed (initial condition) | 0.519 [0.75] | -0.051* [-1.81] |
| A HH member 15+ has chronic illness/disability (initial condition) | -0.947** [-2.10] | 0.035* [1.81] |
| Head is Hindu (initial condition) | -1.432* [-1.91] | 0.067** [2.48] |
| Head is SC/ST (initial condition) | 2.442*** [5.19] | -0.066*** [-3.50] |
| Max yrs. of schooling of any adult 15+ (initial condition) | -0.452*** [-3.75] | 0.020*** [4.39] |
| Max yrs. of schooling of any adult 15+ squared (initial condition) | -0.003 [-0.46] | 0.000 [0.86] |
| Share of women aged 16–55 (initial condition) | -1.144 [-0.61] | 0.037 [0.55] |
| Share of men aged 16–55 (initial condition) | 2.241 [1.22] | -0.085 [-1.38] |
| Share of girls aged 7–15 (initial condition) | 0.986 [0.42] | 0.011 [0.13] |
| Share of boys aged 7–15 (initial condition) | 0.224 [0.10] | -0.068 [-0.81] |
| Share of girls aged 0–6 (initial condition) | 0.213 [0.07] | -0.006 [-0.05] |
| Share of boys aged 0–6 (initial condition) | 1.586 [0.69] | -0.081 [-0.90] |
| Head moved from outside village (initial condition) | 1.167 [1.35] | -0.075* [-1.73] |
| HH owns land (initial condition) | -1.755*** [-3.18] | 0.089*** [4.15] |
| Inherited land (100s acres) (initial condition) | -0.023 [-0.77] | 0.001 [0.76] |
| House made from bricks/cement (initial condition) | -0.714* [-1.86] | 0.038** [2.19] |
| Mean log deviation of total village consumption (initial condition) | -6.488* [-1.66] | 0.344** [2.14] |
| Village crop yield below normal (initial condition) | 1.434 [1.64] | -0.012 [-0.29] |
| Share of Muslims in village (initial condition) | 0.256 [0.14] | 0.012 [0.23] |
| Share of village area under cultivation (initial condition) | -2.465*** [-2.64] | 0.124*** [3.41] |
| Large landowners lease out land (initial condition) | 0.259 [0.40] | -0.008 [-0.33] |
| Km to nearest pucca road (initial condition) | 0.007 [0.99] | -0.001*** [-2.60] |
| Km to block HQ (initial condition) | -0.002 [-0.72] | 0.000 [0.70] |
| Nearest market within 2 km of village (initial condition) | -1.068 [-1.32] | 0.022 [0.73] |
| Cooperative in village (initial condition) | 1.352** [2.18] | -0.022 [-0.87] |
| Health facility within 5 km (initial condition) | -0.462 [-0.93] | -0.009 [-0.50] |
| Primary school within 2 km (initial condition) | -0.152 [-0.26] | 0.007 [0.26] |
| Secondary school within 2 km (initial condition) | 0.785 [1.27] | -0.015 [-0.52] |
| Community has a public drinking water tap (initial condition) | -0.638 [-1.08] | 0.007 [0.35] |
| Ag. wage rate for women (initial condition) | -0.149 [-1.11] | -0.003 [-0.60] |
| Non-ag. wage rate for women (initial condition) | -0.034 [-0.38] | -0.004 [-1.37] |
| Ag. wage rate for men (initial condition) | -0.046 [-0.46] | 0.007* [1.83] |
| Non-ag. wage rate for men (initial condition) | 0.139** [2.44] | -0.002 [-1.00] |
| HH split in 1998 (y = 1, n = 0) | -0.683** [-2.28] | 0.033*** [2.66] |
| Log HH size, diff. | -0.765** [-2.18] | 0.021 [1.29] |
| Age of head, diff. | -0.035 [-0.69] | 0.003 [1.57] |
| Age of head sq., diff. | 0.000 [0.69] | -0.000 [-1.48] |
| Head is divorced/widowed, diff. | 0.618 [1.48] | -0.028* [-1.66] |
| HH member 15+ has chronic illness/disability, diff. | -0.734** [-2.11] | 0.022* [1.83] |
| Max yrs of schooling of any adult, diff. | -0.270*** [-3.21] | 0.013*** [3.58] |
| Max yrs of schooling of any adult sq., diff. | -0.004 [-0.88] | 0.000 [1.04] |
| Share of women aged 16–55, diff. | -1.030 [-0.89] | 0.035 [0.79] |
| Share of men aged 16–55, diff. | 1.039 [0.95] | -0.053 [-1.44] |
| Share of girls aged 7–15, diff. | 0.335 [0.24] | 0.010 [0.18] |
| Share of boys aged 7–15, diff. | -0.352 [-0.30] | -0.027 [-0.55] |
| Share of girls aged 0–6, diff. | 1.442 [0.95] | -0.047 [-0.78] |
| Share of boys aged 0–6, diff. | 1.723 [1.21] | -0.079 [-1.41] |
| HH owns land (y = 1, n = 0), diff. | -2.527*** [-6.11] | 0.109*** [6.80] |
| Village mean log deviation of total consumption, diff. | -0.864 [-0.38] | 0.149 [1.55] |
| Crop yield below normal, diff. | 1.408** [2.41] | -0.066** [-2.18] |
| Share of Muslims in village, diff. | 1.101 [0.72] | 0.103** [2.30] |
| Distance to block headquarters, diff. | 0.003 [0.91] | -0.000 [-0.23] |
| Years in 1999 since village was electrified | 0.367*** [11.63] | 0.003*** [3.13] |
| Interaction: years in 1999 of village electrification × HH is electrified in 1982 | -0.083*** [-2.75] | -0.003*** [-2.89] |
| Identifying instruments | | |
| Distance to power plant in 1965 | -2.132** [-2.50] | 0.138*** [4.74] |
| Distance to power plant in 1975 | 1.880*** [2.66] | -0.081*** [-3.71] |
| HH is electrified in 1982 (y = 1, n = 0) | 1.099 [1.22] | -0.894*** [-23.19] |
| Constant | 3.493 [0.90] | 0.049 [0.35] |
| Observations | 5,954 | 5,954 |
| R-squared | 0.440 | 0.606 |
| F-test of excluded instruments, F(5,241) | 28.80 | 913.04 |
| Prob. | 0.0000 | 0.0000 |

Notes: Robust t-statistics in brackets; *** p < 0.01, ** p < 0.05, * p < 0.1. A household is defined as being electrified if it owned an unambiguously electricity-run appliance or an electric irrigation pump set in 1982 and if it reports having electricity, incurring expenditures on electricity, or owning an electric pump set in 1999.
Source: Authors’ calculations using 1982 and 1999 REDS panel.

Table 5 provides the detailed total consumption regression results for the various specifications, as well as the homogeneity tests.
The simple DD method attributes a 17.7% increase in consumption to the household acquisition of electricity (1.0% per annum).24 Evaluated at the mean consumption of those households who did not have electricity in 1982, this represents a gain of Rs. 604.30 per person per year. However, the simple DD method appears to greatly overestimate the consumption gains attributable to electrification. Our IV estimator gives an implied consumption gain of 8.8% (0.5% per annum), representing a gain of Rs. 300.3 per person per year. These numbers suggest sizeable bias due to endogenous acquisition of electricity by more (latently) wealthy families.

24 Note that the regression coefficient of 0.163 is the change in log consumption. The ratio of consumption in 1999 to that in 1982 is then exp(0.163) = 1.177.

Table 5. Detailed Results for the Panel Data Regressions for Household Consumption

| Variables | (1) OLS, simple DD | (2) OLS | (3) OLS with restriction imposed | (4) IV | (5) IV with restriction imposed |
| --- | --- | --- | --- | --- | --- |
| Years village has been electrified 1999 | | 0.001 [0.60] | | 0.024* [1.96] | |
| Interaction: years village electrified in 99 × HH electrified in 99 | | 0.002 [1.14] | | -0.034* [-1.80] | |
| Interaction: years village electrified in 99 × HH not electrified in 99 | | | -0.001 [-0.79] | | 0.008** [2.00] |
| Change in HH electrification between 99 and 82 | 0.163*** [6.85] | 0.100*** [4.51] | 0.105*** [4.93] | 0.051 [1.31] | 0.084*** [2.97] |
| Log HH size (initial condition) | | 0.047 [1.51] | 0.045 [1.45] | 0.046 [1.25] | 0.047 [1.50] |
| Age of head (initial condition) | | 0.001 [0.16] | 0.001 [0.20] | 0.006 [0.91] | 0.002 [0.39] |
| Age sq. of head (initial condition) | | -0.000 [-0.12] | -0.000 [-0.15] | -0.000 [-0.90] | -0.000 [-0.37] |
| Head is divorced/widowed (initial condition) | | -0.012 [-0.34] | -0.012 [-0.34] | -0.031 [-0.65] | -0.017 [-0.47] |
| HH member 15+ has chronic illness/disability (initial condition) | | -0.052* [-1.95] | -0.051* [-1.92] | -0.014 [-0.41] | -0.042 [-1.60] |
| Head is Hindu (initial condition) | | 0.039 [1.10] | 0.041 [1.17] | 0.090* [1.77] | 0.051 [1.45] |
| Head is SC/ST (initial condition) | | -0.027 [-0.96] | -0.030 [-1.09] | -0.114** [-1.98] | -0.049* [-1.69] |
| Max yrs. of schooling of any adult 15+ (initial condition) | | -0.009 [-1.43] | -0.009 [-1.35] | 0.008 [0.66] | -0.005 [-0.77] |
| Max yrs. of schooling sq. of any adult 15+ (initial condition) | | 0.001** [2.51] | 0.001** [2.53] | 0.001** [2.51] | 0.001** [2.59] |
| Share of women aged 16–55 (initial condition) | | 0.032 [0.32] | 0.032 [0.33] | 0.071 [0.54] | 0.042 [0.41] |
| Share of men aged 16–55 (initial condition) | | 0.021 [0.24] | 0.018 [0.21] | -0.066 [-0.59] | -0.001 [-0.01] |
| Share of girls aged 7–15 (initial condition) | | -0.017 [-0.14] | -0.015 [-0.12] | -0.056 [-0.38] | -0.028 [-0.22] |
| Share of boys aged 7–15 (initial condition) | | -0.050 [-0.43] | -0.046 [-0.40] | -0.060 [-0.46] | -0.053 [-0.47] |
| Share of girls aged 0–6 (initial condition) | | 0.284** [1.98] | 0.290** [2.02] | 0.276 [1.44] | 0.281* [1.88] |
| Share of boys aged 0–6 (initial condition) | | -0.080 [-0.63] | -0.075 [-0.59] | -0.132 [-0.90] | -0.095 [-0.75] |
| Head moved from outside village (initial condition) | | 0.050 [0.83] | 0.051 [0.84] | 0.000 [0.00] | 0.037 [0.61] |
| HH owns land (initial condition) | | -0.089*** [-3.27] | -0.088*** [-3.25] | -0.017 [-0.33] | -0.070** [-2.32] |
| Inherited landholdings (100s acres) (initial condition) | | -0.014*** [-3.54] | -0.014*** [-3.52] | -0.013*** [-2.98] | -0.013*** [-3.41] |
| House made from bricks/cement (initial condition) | | -0.127*** [-3.74] | -0.126*** [-3.73] | -0.099*** [-2.72] | -0.120*** [-3.57] |
| Village mean log deviation of total consumption (initial condition) | | -0.184 [-0.74] | -0.132 [-0.52] | 0.007 [0.02] | -0.148 [-0.60] |
| Crop yield below normal (initial condition) | | 0.084* [1.88] | 0.091** [2.05] | 0.026 [0.42] | 0.067 [1.45] |
| Share of Muslims in village (initial condition) | | -0.076 [-0.77] | -0.052 [-0.53] | -0.064 [-0.54] | -0.079 [-0.80] |
| Share of village area under cultivation (initial condition) | | -0.217*** [-3.26] | -0.215*** [-3.21] | -0.135* [-1.71] | -0.196*** [-2.93] |
| Large landowners lease out land (initial condition) | | -0.094** [-2.14] | -0.094** [-2.12] | -0.100** [-2.11] | -0.096** [-2.19] |
| Km to nearest pucca road (initial condition) | | -0.000 [-0.51] | -0.000 [-0.62] | -0.000 [-0.75] | -0.000 [-0.57] |
| Km to block HQ (initial condition) | | -0.000 [-0.22] | -0.000 [-0.10] | 0.000 [0.15] | -0.000 [-0.15] |
| Nearest market within 2 km of village (initial condition) | | -0.020 [-0.38] | -0.027 [-0.53] | 0.024 [0.38] | -0.006 [-0.12] |
| Cooperative in village (initial condition) | | -0.056 [-1.64] | -0.058* [-1.66] | -0.101* [-1.90] | -0.068* [-1.91] |
| Health facility within 5 km (initial condition) | | 0.042 [1.51] | 0.044 [1.60] | 0.061 [1.64] | 0.047 [1.63] |
| Primary school within 2 km (initial condition) | | 0.071 [1.53] | 0.067 [1.41] | 0.079* [1.69] | 0.075 [1.64] |
| Secondary school within 2 km (initial condition) | | -0.039 [-1.03] | -0.039 [-1.00] | -0.066 [-1.49] | -0.046 [-1.25] |
| Community has a public drinking water tap (initial condition) | | -0.011 [-0.27] | -0.012 [-0.28] | 0.012 [0.27] | -0.005 [-0.12] |
| Ag. wage rate for women (initial condition) | | 0.009 [1.33] | 0.009 [1.32] | 0.012 [1.61] | 0.010 [1.48] |
| Non-ag. wage rate for women (initial condition) | | 0.005 [0.78] | 0.006 [0.94] | 0.006 [0.88] | 0.005 [0.79] |
| Ag. wage rate for men (initial condition) | | 0.004 [0.73] | 0.006 [0.99] | 0.005 [0.77] | 0.004 [0.68] |
| Non-ag. wage rate for men (initial condition) | | 0.009** [2.22] | 0.009** [2.18] | 0.003 [0.65] | 0.008* [1.90] |
| HH split in 1998 (y = 1, n = 0) | | -0.050*** [-2.98] | -0.050*** [-2.99] | -0.022 [-0.87] | -0.042** [-2.42] |
| Log HH size, diff. | | -0.428*** [-21.35] | -0.429*** [-21.38] | -0.400*** [-14.10] | -0.420*** [-20.27] |
| Age of head, diff. | | 0.004 [1.31] | 0.004 [1.34] | 0.006 [1.55] | 0.004 [1.43] |
| Age of head sq., diff. | | -0.000 [-1.03] | -0.000 [-1.05] | -0.000 [-1.29] | -0.000 [-1.15] |
| Head is divorced/widowed, diff. | | -0.012 [-0.59] | -0.012 [-0.57] | -0.035 [-1.16] | -0.018 [-0.85] |
| HH member 15+ has chronic illness/disability, diff. | | 0.058*** [3.59] | 0.059*** [3.64] | 0.086*** [3.41] | 0.065*** [3.98] |
| Max yrs of schooling of any adult, diff. | | -0.011*** [-2.67] | -0.011*** [-2.61] | -0.001 [-0.12] | -0.009** [-2.04] |
| Max yrs of schooling of any adult sq., diff. | | 0.002*** [6.20] | 0.002*** [6.21] | 0.002*** [5.84] | 0.002*** [6.34] |
| Share of women aged 16–55, diff. | | 0.162*** [3.04] | 0.162*** [3.03] | 0.202*** [2.60] | 0.172*** [3.03] |
| Share of men aged 16–55, diff. | | 0.231*** [4.51] | 0.231*** [4.52] | 0.193*** [3.07] | 0.221*** [4.41] |
| Share of girls aged 7–15, diff. | | -0.046 [-0.67] | -0.040 [-0.59] | -0.052 [-0.60] | -0.049 [-0.70] |
| Share of boys aged 7–15, diff. | | 0.067 [0.99] | 0.069 [1.03] | 0.078 [0.98] | 0.069 [1.02] |
| Share of girls aged 0–6, diff. | | -0.112 [-1.35] | -0.109 [-1.31] | -0.161 [-1.55] | -0.126 [-1.48] |
| Share of boys aged 0–6, diff. | | -0.200*** [-2.72] | -0.196*** [-2.67] | -0.258*** [-2.68] | -0.217*** [-2.86] |
| HH owns land (y = 1, n = 0), diff. | | 0.084*** [4.30] | 0.084*** [4.32] | 0.179*** [3.10] | 0.108*** [4.61] |
| Village mean log deviation of total consumption, diff. | | 0.136 [0.79] | 0.170 [0.97] | 0.116 [0.63] | 0.122 [0.71] |
| Crop yield below normal, diff. | | 0.003 [0.10] | 0.003 [0.10] | -0.047 [-1.03] | -0.010 [-0.30] |
| Share of Muslims in village, diff. | | 0.050 [0.52] | 0.052 [0.54] | 0.015 [0.14] | 0.040 [0.42] |
| Distance to block headquarters, diff. | | 0.000 [0.12] | 0.000 [0.25] | -0.000 [-0.44] | -0.000 [-0.07] |
| Constant | 0.607 [25.80] | -0.162 [-0.83] | -0.094 [-0.50] | -0.159 [-0.74] | -0.181 [-0.94] |
| District fixed effects | No | Yes | Yes | Yes | Yes |
| Observations | 6,006 | 5,954 | 5,954 | 5,954 | 5,954 |
| R-squared | 0.019 | 0.466 | 0.466 | 0.209 | 0.449 |

Homogeneity test, F(1,241): 2.90 (Prob. 0.090) for the OLS estimates; 2.04 (Prob. 0.154) for the IV estimates.
Notes: A household is defined as being electrified if it owned an unambiguously electricity-run appliance or an electric irrigation pump set in 1982 and if it reports having electricity, incurring expenditures on electricity, or owning an electric pump set in 1999. Robust t-statistics in brackets. Clustering is at the village level. *** p < 0.01, ** p < 0.05, * p < 0.1. District fixed effects are included except in column (1). Identifying IVs are as in table 4.
Source: Authors’ calculations using 1982 and 1999 REDS panel.

Continuing to focus on our IV estimator, we could not reject the null hypothesis that $\beta_0 + \beta_1 = 0$ in all but one of the regressions (entertainment expenditure per capita), so we give results with this restriction imposed (or note its rejection). This implies that village electrification had no significant effect on our household-level outcomes if the household already had electricity (beyond the internal effect captured by the term $\gamma(E_{ij99} - E_{ij82})$). The dynamic benefits of village electrification are largely confined to those households without electricity. Given that we cannot reject the null that $\beta_0 + \beta_1 = 0$, we are justified in using $E_{ij82}$ as an IV. However, the dynamic effect is not estimated with much precision; the 95% confidence interval includes the possibility that it is virtually zero. Comparisons between the magnitudes of the external and internal effects are thus uncertain.
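To see why the restriction leaves a single village-level term, it may help to write out the two village terms explicitly. In the sketch below, the coefficients on the two village terms are written as $\beta_0$ and $\beta_1$, and $V_j$ is shorthand introduced here for the number of years the village has been electrified in 1999 (it is not notation taken from the paper). The two terms are assumed to enter linearly, as the row labels of table 5 suggest.

```latex
\beta_0 V_j + \beta_1 V_j E_{ij99}
  \;=\; \beta_0 V_j \,(1 - E_{ij99})
  \qquad \text{when } \beta_0 + \beta_1 = 0 .
```

Under the restriction, the village term is nonzero only for households with $E_{ij99} = 0$, which corresponds to the single interaction (years of village electrification times household not electrified in 1999) reported in columns (3) and (5) of table 5.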
There are also possible concerns with these results related to the other control variables. In particular, we have included controls for the change in (log) household size and the change in the maximum years of schooling. The assumption that these variables are exogenous can be questioned. It could also be argued that these are channels of influence for electrification, leading us to underestimate the impact. Against these concerns, there is a potential omitted variable bias in excluding these controls.

We tried dropping the control for the change in log household size and instead using the change in log consumption per equivalent single person as the dependent variable, where the number of equivalent persons was assumed to be the square root of household size. The coefficient on the change in log household size was close to -0.5, which motivated this assumption.25 This made little difference to the results for electrification; the coefficient on the change in household electrification in the IV estimate fell slightly (to 0.083 with a t-ratio of 2.98), while the coefficient on the village effect and its standard error were almost identical.

There was greater sensitivity to dropping the controls for changes in the household’s maximum education. Then the coefficient on household electrification fell to 0.065 with a t-ratio of 2.22, while the coefficient and standard error for the village effect increased (0.009; t = 2.41). Not controlling for education (a key factor in other income sources) may well be imparting a downward bias on the estimated impact of household electrification.
This bias stems from a “catching up” process in the spread of electrification, whereby it tended to be the less well educated households in 1998 who had become electrified, the relatively well-off having already acquired electricity in the base year.26

We found no sign of village effects until we used our IV estimator with homogeneity imposed. Then a significant effect emerges for those households without their own electricity. The annualized consumption gain from village electrification for households who are not electrified is 0.8%. (Note that $\beta_0$ can be interpreted as the annual growth rate of consumption attributed to village electrification for a household that is not electrified. By contrast, the coefficient on the household becoming electrified gives the total impact over the period.)

We also estimated a specification with extra controls for changes in village characteristics related to village accessibility, infrastructure, and institutions.27 There are concerns about the possible endogeneity of these extra controls, but there are also concerns about omitted variable biases. This specification shows a somewhat stronger village externality of electrification (the coefficient rose to 0.009, t = 2.34). The coefficient on household electrification in this augmented model was 0.079 (t = 2.81).

From a methodological perspective, it is notable that our IV estimator implies large selection biases in either single or double difference estimators.28 From the difference in means of log consumption per capita, the single difference estimate (log consumption for households with electricity less that for those without it) is 0.508, implying a selection bias of 0.424 (the difference between 0.508 and our IV estimate of 0.084). The selection bias accounts for 83% of the observed difference. The DD estimator reduces the bias considerably, bringing it down to 0.079 (the difference between 0.163 and our IV estimate of 0.084 from table 5).
Nonetheless, even for the DD estimator, the selection bias is about as large as our impact estimate.

25 The coefficient in the restricted IV regression is -0.42 (t = -20.3); see column 5 of table 5.

26 This is evident when we regress the change in the maximum years of education on the change in household electrification controlling for all other covariates used in the main regression. The regression coefficient is negative and significant (at the 1% level).

27 The extra variables included the changes in the distances to a paved road, market, health clinic, and school, and changes in the presence of a public water tap and an agricultural cooperative.

28 The following calculations use the fact that the observed difference between mean outcomes for those receiving a program and those who do not is identically equal to the true causal impact (the treatment effect for those treated) plus the selection bias (the difference in counterfactual outcomes between those receiving the program and those not).

Table 6. Impacts of Household and Village Electrification on Consumption and Its Components

| Outcome | Simple DD: change in household electrification | IVE: change in household electrification | IVE: years of village electrification in 1999 × household not electrified in 1999 |
| --- | --- | --- | --- |
| Total consumption expenditure per capita (log) | 0.163*** (6.85) | 0.084*** (2.97) | 0.008** (2.00) |
| Food expenditure per capita (log) | 0.098*** (4.42) | 0.095*** (3.42) | -0.0002 (-0.06) |
| Fuel expenditure per capita (log) | 0.174*** (3.50) | 0.229*** (4.89) | -0.001 (-0.06) |
| Nonfood, nonfuel expenditure per capita (log) | 0.260*** (8.01) | 0.084** (2.26) | 0.013*** (2.77) |
| Of which: | | | |
| Clothing & footwear | 0.228*** (5.56) | 0.115** (2.34) | 0.004 (0.55) |
| Entertainment | 0.882*** (2.85) | – | – |
| Ceremonies | 0.654* (1.72) | -0.606* (-1.80) | 0.099* (1.77) |
| Travel | 0.471** (2.52) | -0.091 (-0.40) | -0.072 (-1.45) |
| Education | 0.454* (1.73) | 0.439 (1.31) | 0.020 (0.52) |
| Health | 0.251** (2.09) | 0.056 (0.43) | -0.0004 (-0.02) |
| Domestic help | 0.320** (2.28) | 0.127 (0.59) | 0.018 (0.69) |
| Repairs to housing | 0.147 (1.28) | -0.095 (-0.64) | -0.009 (-0.48) |
| Owns a kerosene stove (1 = yes; 0 = no) | 0.133*** (6.57) | 0.131*** (4.92) | 0.002 (0.64) |

Notes: The table gives the regression coefficients for electrification for each outcome variable, with t-statistics in parentheses. DD: “difference-in-difference.” IVE: instrumental variables estimates of a panel data model treating the acquisition of electricity as endogenous, using distances in 1965 and 1975 to the nearest power plant and whether the household was electrified in 1982 as IVs, and allowing for the external village effect. Controls and district effects included. See text for further details. ***: 1%; **: 5%; *: 10%. The restriction passes in all cases but one (entertainment expenditure per capita). Also see notes to table 3.
Source: Authors’ calculations using 1982 and 1999 REDS panel.
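The percentage gains and the selection-bias decomposition discussed before table 6 can be reproduced with a few lines of arithmetic. The sketch below is illustrative only. It takes the coefficients on household electrification from table 5 and the single-difference estimate of 0.508 quoted in the text. The 17-year interval and the simple division used for the per-annum figures are assumptions made here, and the helper names (`pct_gain`, `selection_bias`) are hypothetical.

```python
import math

# Estimates quoted in the text and in table 5 (changes in log consumption).
SINGLE_DIFF = 0.508  # difference in mean log consumption, electrified vs. not
DD_COEF = 0.163      # simple DD, column (1)
IV_COEF = 0.084      # IV with restriction imposed, column (5)
YEARS = 1999 - 1982  # assumed interval for the per-annum figures


def pct_gain(log_change):
    """Proportionate gain implied by a change in log consumption."""
    return math.exp(log_change) - 1.0


def selection_bias(observed, causal):
    """Observed difference minus the causal (IV) estimate."""
    return observed - causal


dd_gain = pct_gain(DD_COEF)
iv_gain = pct_gain(IV_COEF)
single_bias = selection_bias(SINGLE_DIFF, IV_COEF)
dd_bias = selection_bias(DD_COEF, IV_COEF)
bias_share = single_bias / SINGLE_DIFF

print(f"DD gain: {dd_gain:.1%} total, {dd_gain / YEARS:.1%} per annum")
print(f"IV gain: {iv_gain:.1%} total, {iv_gain / YEARS:.1%} per annum")
print(f"Single-difference selection bias: {single_bias:.3f} ({bias_share:.0%})")
print(f"DD selection bias: {dd_bias:.3f} vs. IV estimate {IV_COEF:.3f}")
```

The same `pct_gain` conversion applied to the IV coefficient for fuel expenditure in table 6 (0.229) gives the increase of about 25.7% cited in the discussion of consumption components.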
Components of Consumption

Table 6 gives results for the components of consumption. The table first gives the simple DD estimate of the effect of electrification for each outcome variable. Next, the table gives our estimates of the same parameter based on equation (2), allowing for the village-level external effect with controls and treating the household’s acquisition of electricity by 1999 as endogenous. The last column of table 6 gives the corresponding estimates of the external effect from each regression.

We find a significant internal effect of electrification on each of the three main categories: food, fuel, and other (nonfood, nonfuel) expenditures. The impacts are robust to allowing for the endogeneity of household acquisition of electricity. When we break up nonfood, nonfuel spending further, the simple DD estimates show significant internal effects on all components other than repairs to the house. However, only the effects on spending on clothing and on ceremonies (the latter with a negative sign) remain significant when allowing for endogeneity. The strong positive internal effect on fuel expenditures implies about a 25.7% increase as a result of the household acquisition of electricity.

We find significant external effects on nonfood, nonfuel spending. This is suggestive of the social effects on consumption behavior discussed in section II, whereby households without electricity themselves shift their spending toward consumer goods that display affluence (unlike food, which is typically consumed in the privacy of one’s home). We investigated this further using a finer breakdown of nonfood consumption.
Most suggestive of such a social effect is a (just) significant and positive external effect on spending on ceremonies (a coefficient of 0.099, t = 1.77); in contrast, the (just) significant internal effect is negative (-0.606, t = -1.80), suggesting a substitution away from such spending for households who acquire electricity. There were no significant external effects on clothing and footwear, entertainment, domestic help, housing repairs, health, or education spending.29 We would not expect a village effect on fuel spending, and that is confirmed by our results in table 6.

So the external, village-level effect for nonelectrified households entailed a change in the composition of spending away from food and fuel toward other goods, while the internal (household-level) effects were distributed across food and nonfood expenditures. The households who do not have electricity themselves spend more on conspicuous consumption, while those acquiring electricity for their own use tend to divert their spending toward less conspicuous food and nonfood items.

While these point estimates are suggestive of large effects of village electrification on conspicuous consumption by households who do not have electricity themselves, we should again caution that this external effect is not estimated with the same level of statistical precision as the internal effect.

Electrification also increased the ownership of kerosene stoves (table 6), consistent with the claim that subsidized kerosene rations are switched to cooking when electricity becomes available for lighting (Heltzberg 2004). Here too we would probably not expect a village external effect, and this is borne out by the results (table 6).

Labor Supply and Wage Rates

Turning to labor supply, a more complex picture emerges, involving the substitution of some activities for others.
Table 7 gives the results (in the same format as table 6) for the various categories of labor supply that can be identified in the REDS survey data and compared over time. (Again the homogeneity restriction performed well and so was imposed.)

In the simple DD estimates, casual wage work decreases for both men and women, though with a larger impact on men. Using our IV estimator, an extra 14.6 days per year of regular wage work for men is attributed to household electrification. This extra work came mainly from reduced casual wage work (8.4 days). Thus our results indicate a significant substitution in male labor supply from casual to regular work attributed to electrification. Using the IV estimator there is no significant effect of “own electrification” on self-employment in either farm or nonfarm activities for either men or women. There is evidence of a significant switch out of agricultural self-employment through the external dynamic effect. One interpretation of these findings is that electricity allows men to switch leisure time from daylight hours to night time, allowing a more regular supply of labor, as required by salaried work.

For women, there are no significant labor supply impacts. The only positive effect, and the largest effect at 4.2 days (although still not significant), is on casual wage work, suggestive of women taking up the casual work vacated by men. There are signs that this came in part from reduced days of regular wage work and of farm and nonfarm self-employment, although again, these displacement effects are not statistically significant.

In summary, we find that household electrification increased male labor supply. Men’s regular wage work increased, much of it coming from casual wage work, and some from other activities, including leisure.

29 This is with the homogeneity restriction imposed.

Table 7. Impacts of Household and Village Electrification on Labor Supply

| Labor supply in days per year | Simple DD: change in household electrification | IVE: change in household electrification | IVE: years of village electrification in 1999 × household not electrified in 1999 |
| --- | --- | --- | --- |
| Days of regular wage work, women | -0.026 (-0.03) | -2.389 (-1.17) | 0.113 (0.62) |
| Days of regular wage work, men | 10.384*** (3.33) | 14.581** (2.72) | 0.320 (0.54) |
| Days of casual wage work, women | -5.374** (-2.13) | 4.230 (1.25) | -0.075 (-0.19) |
| Days of casual wage work, men | -21.140*** (-5.45) | -8.415* (-1.66) | -1.162* (-1.71) |
| Days of agricultural self-employment by women | 2.252 (0.96) | -2.668 (-0.89) | -0.899* (-1.86) |
| Days of agricultural self-employment by men | 9.503*** (3.07) | -2.841 (-0.73) | -1.261** (-2.46) |
| Days of nonagricultural self-employment by women | -1.859** (-2.42) | -1.834 (-1.23) | 0.292* (1.79) |
| Days of nonagricultural self-employment by men | 1.848 (0.83) | 3.319 (0.88) | 0.524 (1.60) |

Note: See table 6.
Source: Authors’ calculations using 1982 and 1999 REDS panel.

Using the wage rates reported in the village survey we can also test whether there is any sign that village electrification affected wage rates. We used the data on harvest wage rates to compare the changes in mean wage rates for those villages that became electrified between the two survey rounds with those for the villages that already had electricity in 1982. The DD estimate gave a small gain of Rs. 4.19 per day (in 1999 prices) for women and a small loss of Rs. 0.93 for men; however, neither is significantly different from zero (standard errors of Rs. 3.87 and 3.68). We also repeated this analysis using the number of years the village was electrified as the treatment variable; again, there was no significant impact on wages.
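As a rough check on the wage comparison, the t-ratios implied by the reported DD estimates and standard errors can be computed directly. This is a minimal sketch using only the four numbers quoted in the text. The cutoff of 1.96 is an assumption made here (a conventional two-sided 5% critical value under a normal approximation), not a figure from the paper, and the helper name `t_ratio` is hypothetical.

```python
# DD estimates of the change in daily harvest wage rates (Rs., 1999 prices),
# stored as (estimate, standard error) pairs quoted in the text.
WAGE_DD = {
    "women": (4.19, 3.87),
    "men": (-0.93, 3.68),
}
CRITICAL_5PCT = 1.96  # assumed two-sided normal critical value


def t_ratio(estimate, std_error):
    """Ratio of an estimate to its standard error."""
    return estimate / std_error


for group, (estimate, std_error) in WAGE_DD.items():
    t = t_ratio(estimate, std_error)
    verdict = "significant" if abs(t) > CRITICAL_5PCT else "not significant"
    print(f"{group}: t = {t:.2f} ({verdict} at the 5% level)")
```

Both ratios fall well short of the assumed cutoff, in line with the statement that neither estimate is significantly different from zero.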
We find no evidence in these data that village electrification increased real wage rates. This does not, of course, rule out impacts on the demand for labor.

When our IV estimates of the impacts on days of labor supply are valued at the sample mean wage rates, the implied impact on total labor earnings in 1999 is Rs. 2,230 per household.30 The mean impact on household consumption implied by our IVE results is Rs. 1,721. So the implied aggregate consumption gain is 77% of the implied income gain. The remaining gap has two possible explanations. The first is the existence of foregone income from market labor supply due to displaced activities within the household. In terms of the model in section II, this would entail that the extra market labor supply comes from domestic work in producing marketable commodities. The second explanation is savings; by this interpretation, about one quarter of the income gain was saved or invested directly. We cannot say which of these explanations is the right one. However, some support for the first explanation can be found in past estimates of the income foregone in taking up new employment opportunities provided by public works in India. Datt and Ravallion (1994) estimate a foregone income of about one quarter of the gross wage rate in Maharashtra. Dutta et al. (2014) estimate a mean foregone income of about one third of the gross wage rate in Bihar.

30 The online addendum provides details on the wage rates used for imputation. A potentially contentious assumption we make is in using the casual wage rate for agricultural work in valuing the income effect of the impacts on days of self-employment in agriculture. As a sensitivity test we also tried using 50% of the casual wage rates; the impact on total earnings rose to Rs. 2,406.
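The comparison between the consumption and earnings gains can be laid out as simple arithmetic. The sketch below uses only the figures quoted in the text. The variable names are illustrative, and treating the foregone-income shares from the public works studies as directly comparable to the gap is an assumption made here for illustration.

```python
# Figures quoted in the text (Rs. per household).
EARNINGS_GAIN = 2230.0     # implied impact on total labor earnings
CONSUMPTION_GAIN = 1721.0  # implied impact on household consumption

# Foregone income as a share of the gross wage in the cited studies.
FOREGONE_SHARES = {
    "Maharashtra (Datt and Ravallion 1994)": 1 / 4,
    "Bihar (Dutta et al. 2014)": 1 / 3,
}


def consumption_share(consumption_gain, earnings_gain):
    """Share of the earnings gain that shows up as extra consumption."""
    return consumption_gain / earnings_gain


share = consumption_share(CONSUMPTION_GAIN, EARNINGS_GAIN)
gap = 1.0 - share

print(f"Consumption gain as a share of the earnings gain: {share:.0%}")
print(f"Remaining gap: {gap:.0%}")
for study, foregone in FOREGONE_SHARES.items():
    relation = "at least as large as" if foregone >= gap else "smaller than"
    print(f"{study}: foregone share {foregone:.0%} is {relation} the gap")
```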
These observations suggest that our gap between the mean earnings gain and the gain in consumption could be explained by foregone income even if there is no impact on Downloaded from https://academic.oup.com/wber/article-abstract/31/2/385/2897299 by LEGVP Law Library user on 08 August 2019 savings. Turning now to the effects of village level electrification, we find a small increase in women’s non- farm self-employment due to village electrification. The effect is only a third of a day; but recalling that it is annualized, it would add up to an extra week of such work over the period. We also find somewhat larger, but still small, significant decreases in men’s casual wage work and in agricultural self- employment by both men and women attributable to village electrification. On the whole, the estimated village effects on labor supply of connecting the village to the electricity grid are small. V. Conclusions While positive claims are often heard in the literature and policy discussions about the consumption and income gains from household electrification in developing countries, there is surprisingly little rigorous evidence on the long-term impacts, as required for assessing the economic benefits of public investment in expanding access to electricity for the great many people in the world today who do not have such access. The literature has also largely ignored the distinction between internal (household-level) impacts and external, village-level, effects. This paper has tried to fill these gaps in knowledge by studying the household-level impacts on con- sumption and labor supply of a period of huge expansion in rural electrification in India in the 1980s and 1990s. We have distinguished the dynamic growth effect of village electrification from the direct, idiosyncratic, household effect on consumption and labor supply. 
It is hoped that this study of the long-run impacts of expanding household access to electricity will have salience not only for India’s continuing efforts to expand electrification but also for the many other developing countries where this is still ahead.
We find a significant “internal” impact on consumption of household acquisition of electricity during the period. Under our identifying assumptions, the consumption gain is not as large as some past estimates, but it is still sizeable at around 0.5% per annum. We also find evidence of a dynamic effect of village electrification, which is asymmetric in that it favors households without electricity. This is suggestive of an external effect, which also comes with a shift in consumption spending away from food toward goods such as ceremonies, possibly associated with an attempt to maintain status among those without their own electricity.
In exploring the channels of impact, we find evidence of effects on labor supply, with electrification resulting in extra regular wage work for men. Our findings on the direct, household-level, impact on labor supply do not support the idea of a rural economy in labor surplus, whereby only the demand side matters to employment. Assuming that the income gains are fully consumed, our imputations of the income gains generated by the labor supply effects using gross wage rates suggest that about one quarter of the impact of electrification on gross earnings is lost due to foregone incomes stemming from displaced activities within the household. However, we cannot rule out the possibility that a share of the income gain from electrification is saved or directly invested.

References

Alby, P., J.-J. Dethier, and S. Straub. 2013. “Firms Operating under Electricity Constraints in Developing Countries.” World Bank Economic Review 27 (1): 109–32.
Asian Development Bank (ADB). 2010.
“Asian Development Bank’s Assistance for Rural Electrification in Bhutan – Does Electrification Improve the Quality of Rural Life?” ADB Impact Evaluation Study. Ref Num: IES: BHU 2010–27.
Barkat, A., S. H. Khan, M. Rahman, S. Zaman, A. Poddar, S. Halim, N. Ratna, M. Majid, A. K. M. Maksud, A. Karim, and S. Islam. 2002. “Economic and Social Impact Evaluation Study of the Rural Electrification Program in Bangladesh.” Human Development Research Center, NCERA International, Dhaka.
Barnes, D. F. 2007. “The Challenge of Rural Electrification.” In D. F. Barnes, ed., The Challenge of Rural Electrification: Strategies for Developing Countries. Washington, DC: Resources for the Future.
Barnes, D. F., and H. Binswanger. 1986. “Impact of Rural Electrification and Infrastructure on Agricultural Changes 1966–1980.” Economic and Political Weekly 21 (1): 26–34.
Barnes, D. F., and M. Sen. 2004. “The Impact of Energy on Women’s Lives in Rural India.” UNDP/ESMAP.
Bensch, G., J. Kluve, and J. Peters. 2011. “Impacts of Rural Electrification in Rwanda.” Journal of Development Effectiveness 3 (4): 567–88.
Bernard, T. 2010. “Impact Analysis of Rural Electrification Projects in Sub-Saharan Africa.” World Bank Research Observer 27 (1): 33–51.
Binswanger, H., S. Khandker, and M. Rosenzweig. 1993. “How Infrastructure and Financial Institutions Affect Agricultural Output and Investment in India.” Journal of Development Economics 41 (2): 337–66.
Chakravorty, U., M. Pelli, and B. U. Marchand. 2014. “Does the Quality of Electricity Matter? Evidence from Rural India.” Fondazione Eni Enrico Mattei Working Paper 11.2014.
Chaurey, A., M. Ranganathan, and P. Mohanty. 2004. “Electricity Access for Geographically Disadvantaged Rural Communities: Technology and Policy Insights.” Energy Policy 32: 1693–705.
Chowdhury, S. K. 2010.
“Impact of Infrastructures on Paid Work Opportunities and Unpaid Work Burdens on Rural Women in Bangladesh.” Journal of International Development 22: 997–1017.
Coen-Pirani, D., A. Leon, and S. Lugauer. 2010. “The Effect of Household Appliances on Female Labor Force Participation: Evidence from Micro Data.” Labour Economics 17 (3): 503–13.
Dasgupta, S., M. Huq, M. Khaliquzzaman, K. Pandey, and D. Wheeler. 2006. “Who Suffers from Indoor Air Pollution? Evidence from Bangladesh.” Health Policy and Planning 21 (6): 444–58.
Datt, G., and M. Ravallion. 1994. “Transfer Benefits from Public Works Employment.” Economic Journal 104: 1346–69.
Dinkelman, T. 2011. “The Effects of Rural Electrification on Employment: New Evidence from South Africa.” American Economic Review 101 (7): 3078–108.
Duflo, E., M. Greenstone, and R. Hanna. 2008. “Indoor Air Pollution, Health and Economic Well-Being.” Surveys and Perspectives Integrating Environment and Society 1 (1): 7–16.
Duflo, E., and R. Pande. 2007. “Dams.” Quarterly Journal of Economics 122 (2): 601–46.
Dutta, P., R. Murgai, M. Ravallion, and D. van de Walle. 2014. Right to Work? Assessing India’s Employment Guarantee Scheme in Bihar. Equity and Development Series, World Bank, Washington, DC.
Frank, R. H. 1997. “The Frame of Reference as a Public Good.” Economic Journal 107: 1832–47.
Greenwood, J., A. Seshadri, and M. Yorukoglu. 2005. “Engines of Liberation.” Review of Economic Studies 72 (1): 109–33.
Grogan, L. 2012. “Household Electrification, Fertility and Employment: Evidence from the Colombian Censuses.” Mimeo, University of Guelph, Canada.
Grogan, L., and A. Sadanand. 2012. “Electrification and Labour Supply in Poor Households: Evidence from Nicaragua.” World Development 43: 252–65.
Hendry, J. 2010. Marriage in Changing Japan: Community and Society. Routledge, New York.
Heltberg, R. 2004. “Fuel Switching: Evidence from Eight Developing Countries.” Energy Economics 26: 869–87.
International Energy Agency. 2011.
World Energy Outlook. Organization for Economic Cooperation and Development, Paris.
International Energy Agency. 2002. Electricity in India: Providing Power for the Millions. Organization for Economic Cooperation and Development, Paris.
Independent Evaluation Group (IEG). 2008. The Welfare Impacts of Rural Electrification: A Reassessment of the Costs and Benefits. An IEG Impact Evaluation. World Bank, Washington, DC.
Kebede, E., J. Kagochi, and C. M. Jolly. 2010. “Energy Consumption and Economic Development in Sub-Sahara Africa.” Energy Economics 32 (3): 532–37.
Khandker, S., D. F. Barnes, and H. Samad. 2009. “Welfare Impacts of Rural Electrification: A Case Study from Bangladesh.” World Bank Policy Research Working Paper No. 4859, World Bank, Washington, DC.
Khandker, S., D. F. Barnes, and H. Samad. 2013. “Welfare Impacts of Rural Electrification: A Panel Data Analysis in Vietnam.” Economic Development and Cultural Change 61 (3).
Khandker, S., D. F. Barnes, H. Samad, and N. H. Minh. 2009. “Welfare Impacts of Rural Electrification: Evidence from Vietnam.” World Bank Policy Research Working Paper No. 5057, World Bank, Washington, DC.
Khandker, S., H. Samad, R. Ali, and D. Barnes. 2012. “Who Benefits Most from Rural Electrification?” World Bank Policy Research Working Paper No. 6095, World Bank, Washington, DC.
Kohlin, G., E. Sills, S. Pattanayak, and C. Wilfong. 2011. “Energy, Gender and Development: What are the Linkages? Where is the Evidence?” World Bank Policy Research Working Paper No. 5800, World Bank, Washington, DC.
Lewis, A. 1954. “Economic Development with Unlimited Supplies of Labor.” The Manchester School of Economic and Social Studies 22: 139–91.
Lipscomb, M., A. M. Mobarak, and T. Barham. 2013.
“Development Effects of Electrification: Evidence from the Topographic Placement of Hydropower Plants in Brazil.” American Economic Journal: Applied Economics 5 (2): 200–31.
Mathur, J. K., and D. Mathur. 2005. “Dark Homes and Smoky Hearths: Rural Electrification and Women.” Economic and Political Weekly 40 (7): 638–43.
Pachauri, S., and A. Muller. 2008. “A Regional Decomposition of Domestic Electricity Consumption in India: 1980–2005.” Presented at IAEE conference, Istanbul, Turkey, June.
Pandey, V. 2007. “Electricity Grid Management in India: An Overview.” Electrical India 47 (11).
Peters, J., and C. Vance. 2011. “Rural Electrification and Fertility – Evidence from Côte d’Ivoire.” Journal of Development Studies 47 (5): 753–66.
Rao, V. 2001. “Celebrations as Social Investments: Festival Expenditures, Unit Price Variation and Social Status in Rural India.” Journal of Development Studies 38 (1): 71–97.
Rehman, I. H., P. Malhotra, R. Chandra Pal, and P. B. Singh. 2005. “Availability of Kerosene to Rural Households: A Case Study from India.” Energy Policy 33: 2165–74.
Rud, J. P. 2012. “Electricity Provision and Industrial Development: Evidence from India.” Journal of Development Economics 97 (2): 352–67.
Schultz, T. P. 1993. “Returns to Women’s Education.” In Elizabeth M. King and M. Anne Hill, eds., Women’s Education in Developing Countries: Barriers, Benefits, and Policies. World Bank Publications.
World Bank. 2004. “The Impact of Energy on Women’s Lives in Rural India.” World Bank, Washington, DC.
The World Bank Economic Review, 31(2), 2017, 412–433
doi: 10.1093/wber/lhw070
Advance Access Publication Date: April 6, 2017
Article
Downloaded from https://academic.oup.com/wber/article-abstract/31/2/412/3110990 by LEGVP Law Library user on 08 August 2019

The Changing Structure of Africa’s Economies
Xinshen Diao, Kenneth Harttgen, and Margaret McMillan

Abstract
Using data from the Groningen Growth and Development Center’s Africa Sector Database and the Demographic and Health Surveys, we show that much of Africa’s recent growth and poverty reduction has been associated with a substantive decline in the share of the labor force engaged in agriculture. This decline is most pronounced for rural females over the age of 25 who have a primary education; it has been accompanied by a systematic increase in the productivity of the labor force, as it has moved from low productivity agriculture to higher productivity services and manufacturing. We also show that, although the employment share in manufacturing is not expanding rapidly, in most of the low-income African countries the employment share in manufacturing has not peaked and is still expanding, albeit from very low levels. More work is needed to understand the implications of these shifts in employment shares for future growth and development in Africa south of the Sahara.
JEL classification: C80, N17, O14, O40, O55
Key words: Structural change, Labor productivity, Africa

It cannot be denied that Africa¹ has come a long way over the past 15 years. As recently as 2000, the front cover of The Economist proclaimed Africa “the hopeless continent” (The Economist 2000). Yet recent evidence suggests that the continent is anything but hopeless. Although there is some debate as to the magnitude of the decline, it is clear that the share of the population living below the poverty line fell significantly over the past decade and a half (Sala-i-Martin and Pinkovskiy 2010, McKay 2013, Page and Shimeles 2014).
In addition to the decline in monetary poverty, several researchers have documented a general decline in infant mortality rates and increased access to education (McKay 2013, Page and Shimeles 2014).
Xinshen Diao is a Senior Research Fellow at the International Food Policy Research Institute; her email address is x.diao@cgiar.org. Kenneth Harttgen is Senior Researcher in Development Economics at ETH Zurich NADEL’s Center for Development and Cooperation; his email address is kenneth.harttgen@nadel.ethz.ch. Margaret McMillan (corresponding author) is Professor of Economics, Tufts University, a Senior Research Fellow at the International Food Policy Research Institute, and a Faculty Research Associate at the NBER; her email address is Margaret.McMillan@tufts.edu. The research for this article was financed by the African Development Bank, CGIAR’s research program Policies, Institutions, and Markets (PIM), and the Economic and Social Research Council (ESRC) in cooperation with the UK government’s Department for International Development (DFID) as part of the DFID/ESRC Growth program, grant agreement ES/J00960/1, PI Margaret McMillan. The authors thank Matthew Johnson and Inigo Verduzco-Gallo for excellent research assistance and Doug Gollin, David Lagakos, and Michael Waugh for providing data. The authors would also like to thank Alan Gelb, Adam Storeygard, Doug Gollin, Remi Jedwab, William Masters, Jan Rielander, Dani Rodrik, Abebe Shimeles, Erik Thorbecke, and Enrico Spolaore for helpful comments. A supplemental appendix to this article is available at https://academic.oup.com/wber.
¹ Africa in this paper refers only to countries in Africa south of the Sahara.
© The Author 2017. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Average growth rates have been positive for the first time in decades and, in some of the fastest-growing economies, have exceeded six percent per annum; moreover, these growth rates are likely to be underestimated. Young (2012) found that, since the early 1990s, real consumption in Africa has grown between 3.4 and 3.7 percent per year, or three to four times the 0.9–1.1 percent growth reported using national accounts data; he dubbed this an “African growth miracle.”²
The reasons behind this success are not well understood. The main contribution of this paper is to show that there has been a substantial decline in the share of the labor force engaged in agriculture across much of Africa south of the Sahara (SSA). Previous researchers have shown that agriculture is by far the least productive sector in Africa (McMillan and Rodrik 2011, Gollin, Lagakos, and Waugh 2014) and that income and consumption are lower in agriculture than in any other sector (Gollin, Lagakos, and Waugh 2014). Researchers have also noted that real consumption is growing in Africa (Young 2012) and that poverty is falling (McKay 2013, Page and Shimeles 2014). To our knowledge, this paper is the first to connect these improvements in living standards to important occupational changes.
Before proceeding further, a word about the data is in order, because much has been written about the poor quality of statistics in Africa³ and because the results presented in this paper depend heavily upon the quality of the data. To be as transparent as possible, this paper only uses publicly available data.⁴ Thus, the two main data sources for this paper are the Africa Sector Database,⁵ produced by the Groningen Growth and Development Center (GGDC), and the Demographic and Health Surveys (DHS) (ICF International 2016).
The GGDC database, which covers 11 African countries, was last updated in October 2014. The GGDC database includes all the countries used in McMillan and Rodrik (2011) plus two additional countries, Botswana and Tanzania. A big advantage of the GGDC data is that they cover employment and value-added at the sector level going back to 1960. These data were obtained from national statistical offices as well as from libraries across Europe (GGDC 2013). The employment data are consistent over time and are comparable to the value-added data in the national accounts calculations because they are constructed using census data. Using the census data has the added benefit of capturing activity in the informal sector. However, because census data are not collected on a regular basis, growth rates in employment by sector are obtained using labor force surveys.
Using the GGDC data to compute average labor productivity by sector raises two potential measurement issues. The first, and the one that has gotten the most attention in the literature,⁶ is that the quality of the data collected by national statistical agencies in Africa has been poor. We address this issue, at least in part, by cross-checking our estimates of changes in employment shares using the GGDC data with changes in employment shares computed using the DHS data. The DHS data are collected by enumerators working for a US-based consulting firm and are generally thought to be of very high quality. A comparison of changes in employment shares across datasets reveals remarkable consistency across the two datasets. Our confidence in the estimates of value-added at the sectoral level is bolstered by the following facts. First, the African countries included in the GGDC database are the countries in Africa with the strongest national statistical offices, and these countries have been collecting national accounts data for some time.⁷ Second, researchers at the GGDC specialize in providing consistent and harmonized measures of sectoral value-added, and our view is that this expertise lends credibility to these numbers. Finally, using LSMS surveys, researchers have shown that sectoral measures of value-added based on national accounts data are highly correlated with sectoral measures of consumption (Gollin, Lagakos, and Waugh 2014).
² Harttgen, Klasen, and Vollmer (2013) found no evidence supporting the claim of an African growth miracle that extends beyond what has been reported in gross domestic product per capita and consumption figures. They argue that trends in assets can provide biased proxies for trends in income or consumption growth.
³ For recent critiques of African data, see papers by Devarajan (2013) and Jerven and Johnston (2015).
⁴ A previous version of this paper used additional data provided by researchers at the International Monetary Fund. Because these data are not publicly available, and because we do not have access to the original datasets, we decided not to use these countries. Most, but not all, of these countries are included in the Demographic and Health Surveys.
⁵ This dataset can be accessed at http://www.rug.nl/research/ggdc/data/africa-sector-database and was constructed with the financial support of the ESRC and the DFID as part of the DFID/ESRC Growth program, grant agreement ES/J00960/1, PI Margaret McMillan.
⁶ See, for example, the special issue of the Review of Income and Wealth, Special Issue: Measuring Income, Wealth, Inequality, and Poverty in Sub Saharan Africa: Challenges, Issues, and Findings, October 2013, 59, Supplement S1: S1–S200.
A second concern stems from the measurement of labor inputs.
Ideally, instead of using the measured number of workers employed in a sector, we would use the number of hours worked in a sector. This would correct for biases associated with the seasonality of agriculture that might lead to an underestimation of agricultural labor productivity. This is a serious issue, and, for the purposes of this paper, we rely on work by Duarte and Restuccia (2010), who show that, in a sample of 29 developed and developing countries, the correlation between hours worked and employment shares is close to one, and by Gollin, Lagakos, and Waugh (2014), who show that correcting labor productivity measures for hours worked does not overturn the result that labor productivity in agriculture is significantly lower than labor productivity in the rest of the economy. Note that this does not mean that there are no off-farm activities in rural areas that bring in less income than, for example, farming. In fact, this is highly likely in very poor economies where a large share of economic activity is of a subsistence nature.⁸
The analysis begins by asking whether it is reasonable to compare structural change in Africa to structural change in other regions during the same period. Average incomes in Africa are significantly lower than in East Asia, Latin America, and all other regions. If countries at different stages of development tend to exhibit different patterns of structural change, the differences between Africa and other developing regions may be a result of their different stages of development. Motivated by this possibility, this paper explores how the level of employment shares across sectors in African countries compares to the level in other countries, controlling for levels of income. The findings show that African countries fit quite well into the pattern observed in other countries, with some minor exceptions.
In other words, given current levels of income per capita in Africa, the share of the labor force in agriculture, services, and industry is roughly what would be expected.
Having confirmed that, in 1990, most African countries were characterized by high employment shares in agriculture, we turn to an investigation of changes in agricultural employment shares. For the eight low-income countries in the GGDC dataset, the share of the labor force engaged in agriculture from 2000 to 2010 declined by an average of 9.33 percentage points. Over this same period and for the same countries, the employment share in manufacturing expanded by 1.46 percentage points, and the employment share in services expanded by 6.13 percentage points. Combining these data on employment shares with data on value-added, we show that for the period 2000–2010, labor productivity in these eight low-income African countries grew at an unweighted annual average of 2.8 percent; 1.57 percentage points of this labor productivity growth was attributable to structural change. We report the unweighted averages because the weighted average is dominated by Nigeria in the low-income sample and by South Africa in the high-income sample. By contrast, for 1990–1999, labor productivity growth was close to zero, and structural change was growth-reducing. In the three high-income countries in the GGDC Africa Sector Database, labor productivity growth was similar to that in the eight low-income countries, but it was entirely accounted for by within-sector productivity growth.
⁷ Zambia appears to be an exception.
⁸ Using LSMS-ISA data, McCullough (2015) finds that correcting for hours worked reduces the gap between labor productivity in agriculture and in other activities significantly, but she provides no explanation for the large difference between her results and the results of Gollin, Lagakos, and Waugh (2014).
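The split of labor productivity growth into a within-sector part and a structural-change part can be illustrated with a short sketch. It assumes the shift-share decomposition of McMillan and Rodrik (2011): the change in aggregate productivity equals sectoral productivity changes weighted by initial employment shares, plus changes in employment shares weighted by end-of-period sectoral productivity. The paper's own procedure is set out in section III and may differ in detail. The three sectors and every number in the example are hypothetical and are not taken from the GGDC data; only the last line uses figures from the text.

```python
def decompose_productivity_change(shares_start, shares_end, prod_start, prod_end):
    """Split the change in aggregate labor productivity into two terms.

    within:     productivity changes weighted by initial employment shares
    structural: employment-share changes weighted by end-of-period productivity
    """
    sectors = list(shares_start)
    within = sum(shares_start[s] * (prod_end[s] - prod_start[s]) for s in sectors)
    structural = sum(prod_end[s] * (shares_end[s] - shares_start[s]) for s in sectors)
    return within, structural


# Hypothetical employment shares (sum to one) and labor productivity levels.
shares_start = {"agriculture": 0.70, "manufacturing": 0.05, "services": 0.25}
shares_end = {"agriculture": 0.60, "manufacturing": 0.07, "services": 0.33}
prod_start = {"agriculture": 1000.0, "manufacturing": 4000.0, "services": 3000.0}
prod_end = {"agriculture": 1100.0, "manufacturing": 4200.0, "services": 3100.0}

# Aggregate productivity is the employment-weighted sum of sectoral productivity.
aggregate_start = sum(shares_start[s] * prod_start[s] for s in shares_start)
aggregate_end = sum(shares_end[s] * prod_end[s] for s in shares_end)

within, structural = decompose_productivity_change(
    shares_start, shares_end, prod_start, prod_end
)
print(f"total change: {aggregate_end - aggregate_start:.1f}")
print(f"within-sector: {within:.1f}, structural change: {structural:.1f}")

# Figures from the text: 2.8 percent annual growth, of which 1.57 points are
# structural change, so the within-sector remainder is the difference.
implied_within_points = round(2.8 - 1.57, 2)
```

The two terms add up to the total change by construction, which is why the reported 2.8 percent growth and 1.57 points of structural change imply a within-sector contribution of about 1.23 points for the eight low-income countries.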
Although these results are encouraging, they only capture the experience of 11 countries in Africa. Thus, an important goal of this paper is to expand the sample of countries to include more of the poorer countries in Africa. To this end, this paper uses the DHS, which are nationally representative surveys designed to collect detailed information on child mortality, health, and fertility, as well as on households’ durables and quality of dwellings. In addition, the DHS include information on gender, age, location, education, employment status, and occupation of women and their partners between the ages of 15 and 59. Importantly, the design and coding of variables (especially variables on the type of occupation, educational achievements, household assets, and dwelling characteristics) are generally comparable across countries and over time. Finally, the sample includes considerable regional variation: 90 surveys are available for 31 African countries, and, for most countries, multiple surveys (up to six) were conducted between 1993 and 2012.
Using the DHS, this paper shows that the changes in agricultural employment shares in the sample of African countries for which there is overlap between the GGDC and the DHS are similar. It then shows that, between 1998 and 2014, the share of the labor force employed in agriculture for the countries in the DHS sample decreased by about ten percentage points. In addition, there is a significant degree of within- and cross-country heterogeneity in the changes in agricultural employment shares. Within countries, the decline in the employment share in agriculture is most pronounced for poor, uneducated females in rural areas.
Across countries, the most rapid decline occurred for rural females in Cameroon and Mozambique, while in Mali, Zimbabwe, and Madagascar there was an increase in the share of women who reported agriculture as their primary occupation.
This work is related to work by Gollin, Lagakos, and Waugh (2014). Using contemporary data for 151 developing countries, including several from Africa, they confirmed the persistence of a sizable agricultural productivity gap as well as a gap in income and consumption. Based on these results, they concluded that there should be large economic gains associated with a reduction in the share of employment in agriculture. Our paper differs in that it takes as given the agricultural productivity gap and shows a significant decline in the share of employment in agriculture across much of the continent.
This paper is also related to work by Duarte and Restuccia (2010) and Herrendorf, Rogerson, and Valentinyi (2014), who found that structural change is a fundamental feature of economic growth. This structural transformation continues until farm and nonfarm productivity converge, which typically occurs only at high levels of per capita income. In the United States, for example, the exodus of labor from agriculture did not end until the mid-1990s. At lower levels of income, countries that pull themselves out of poverty also exhibit positive structural change.⁹ The main difference between our work and these two papers is that they do not include Africa.
Most closely related to the present paper are recent studies by McMillan and Rodrik (2011) and McMillan, Rodrik, and Verduzco-Gallo (2014). Like Gollin, Lagakos, and Waugh (2014), these two studies by McMillan and others document a significant gap in productivity between agriculture and other sectors of the economy.
McMillan, Rodrik, and Verduzco-Gallo (2014) showed that structural change in Africa contributed negatively to growth during the 1990s and then positively to growth during 2000–2005. However, these studies have two important limitations. First, the sample of African countries used is not representative of the poorest African countries; rather, the countries are, on average, richer, and the populations are more educated and healthier when compared with the rest of Africa. Second, the data in these studies do not paint an accurate picture of the most recent economic activity in Africa because the samples used stop in 2005.
⁹ The converse is not true, however. All countries with structural change do not also achieve poverty reduction. Structural change into protected or subsidized sectors comes at the expense of other activities and is therefore not associated with sustained growth out of poverty for the population as a whole. Structural change is effective at reducing poverty only when people move from lower to higher productivity activities.
In summary, section I of this paper describes the GGDC data. Section II documents a number of stylized facts to situate Africa within the recent literature on structural change. Section III outlines the methodology and the data used for measuring structural change. It also describes recent patterns of labor productivity growth across regions and countries. Section IV describes the DHS. It then uses these data to explore the robustness of the results presented in section III. Section V concludes.

I. Groningen Growth and Development Center Data

To analyze the patterns of structural change and labor productivity growth in Africa relative to the rest of the world, this paper uses the ten-sector database produced by researchers at the Groningen Growth and Development Center (GGDC). The data were last updated in January 2015 (Timmer, de Vries, and de Vries 2015), which is the version used here. Note that the Africa data in the paper by McMillan and Rodrik (2011) were collected by McMillan and helped generate interest in producing a longer time series of harmonized data for Africa. These data consist of sectoral and aggregate employment and real value-added statistics for 39 countries covering the period up to 2010 and, for some countries, to 2011 or 2012. Of the countries included, 30 are developing countries, and nine are high-income countries. The countries and their geographical distribution are shown in table S.1 (in supplemental appendix, available at https://academic.oup.com/wber), along with some summary statistics. As table S.1 shows, labor productivity gaps between different sectors are typically large in developing countries; this is particularly true for poor countries with mining enclaves where few people tend to be employed at very high labor productivity. The countries in our sample range from Ethiopia, with an average labor productivity over 2000–2010 of $1,400 (at 2005 purchasing power parity [PPP] dollars), to the United States, where average labor productivity over this same period is almost 60 times as large ($83,235). The data include 11 African countries, nine Latin American countries, ten Asian countries, and nine high-income countries. China shows the fastest overall productivity growth rate (10.38 percent per annum from 2000 to 2010). At the other extreme, Italy, Singapore, Mexico, and Venezuela experienced negative labor productivity growth rates over this same period.
The sectoral breakdown used in the rest of this paper is shown in table S.2 (in supplemental appendix). Apart from mining and utilities, which are highly capital-intensive and create relatively few jobs, the sectors with the highest average labor productivity for 2000–2010 are transport services, business services, and manufacturing; the sector with the lowest average labor productivity is agriculture. The developed countries tend to have the highest average labor productivity across all ten sectors while countries in Africa have the lowest productivity levels across all ten sectors with the exception of mining.
An important question regarding data of this sort is how well they account for the informal sector. The data for value-added come from national accounts, and, as mentioned by Timmer and de Vries (2007, 2009), the coverage of such data varies from country to country. While all countries make an effort to track the informal sector, obviously the quality of the data can vary greatly. On employment, Timmer and de Vries (2007, 2009) relied on household surveys (namely, population censuses) for total employment levels and their sectoral distribution; they used labor force surveys for the growth in employment between census years. Census data and other household surveys tend to have more complete coverage of informal employment. In short, a rough characterization of the data would be that the employment numbers in the GGDC dataset broadly coincide with actual employment levels, regardless of formality status, while the extent to which value-added data include or exclude the informal sector heavily depends on the quality of national sources. For a detailed explanation of the protocols followed to compile the GGDC 10-Sector database, refer to Timmer, de Vries, and de Vries (2015) and “Sources and Methods” at the database’s web page: http://www.ggdc.net/databases/10_sector.htm.
We would, of course, like to have data for more African countries. In the absence of additional data for Africa, however, table S.3 (in supplemental appendix) reports the characteristics of the African countries in the GGDC sample and compares them to the characteristics of all countries in Africa. All of the data used for the comparisons in table S.3 come from the World Bank’s World Development Indicators. The GGDC sample includes 11 out of 48 countries from SSA. The statistics in column (2) of table S.3 indicate that the African countries in the GGDC sample have significantly higher GDP per capita, lower infant mortality rates, more years of primary and secondary schooling, and bigger populations, and are generally less reliant on agricultural raw material exports and resource rents than countries in SSA taken as a group. A discussion of the DHS sample appears in section IV of this paper, which expands on the Africa sample to include more of its poor countries.

II. Fitting Africa Into the Recent Literature on Structural Change

Among the earliest and most central insights of the literature on economic development is the fact that development entails structural change (Lewis 1955). In most poor countries, large numbers of people live in rural areas and devote most of their time to the production of food for home consumption and local markets. In richer countries, by contrast, relatively few people work in agriculture. This is a robust and long-recognized feature of the cross-sectional data from different countries (Chenery and Taylor 1968). It is also a feature of the historical experience of development in almost all rich countries.
For example, Duarte and Restuccia (2010) found that, over their sample period, structural change played a substantial role in the productivity catch-up of developing countries in their sample relative to the United States. As predicted, the gains are particularly dramatic in the sectors with international trade. They found in their sample that productivity differences in agriculture and industry between the rich and developing countries have narrowed substantially, while productivity in services has remained significantly lower in developing countries relative to rich countries. Thus, developing countries with the most rapid growth rates have typically reallocated the most labor into high-productivity manufacturing, allowing aggregate productivity to catch up.10 Duarte and Restuccia (2010) concluded that rising productivity in industry, combined with a shift in employment shares from agriculture into industry, explains 50 percent of the catch-up in aggregate productivities among developing countries over their sample period of 1950–2006.

Some stylized facts of the pattern of structural change over the course of development have emerged from the literature on structural change. As countries grow, the share of economic activity in agriculture monotonically decreases, and the share in services monotonically increases. The share of activity in manufacturing appears to follow an inverted U-shape; it increases during low stages of development as capital is accumulated and then decreases for high stages of development where higher incomes drive demand for services, and labor costs make manufacturing difficult. Herrendorf, Rogerson, and Valentinyi (2014) documented this pattern for a panel of mostly developed countries over the past two centuries while Duarte and Restuccia (2010) documented a similar process of structural change among 29 countries for 1956–2004.
10 Conversely, where the manufacturing sector stagnates and structural transformation primarily involves the reallocation of workers into lower productivity sectors, aggregate productivity growth is slower, especially among developing countries whose productivity in services remains low relative both to agriculture in other countries and to other sectors within the country.

African countries have been largely absent from empirical analyses in this literature. Thus, there is little evidence on how structural change has played out in African countries since achieving independence half a century ago. A major reason for this has been the absence of data, as the economic data needed to undertake such analysis have been largely unreliable or nonexistent for most African countries. A deeper reason is poverty itself. Until recently, few African countries had enjoyed the sustained economic growth needed to trace out the patterns of structural transformation achieved in earlier decades elsewhere. The start of the 21st century saw the dawn of a new era in which African economies grew as fast as, or faster than, the rest of the world’s economies.

Examining the recent process of structural change in Africa and how it has interacted with economic growth could yield significant benefits. For one, the theory and stylized facts of structural change offer several predictions about the allocation of the factors of production for countries at different stages of development. In addition, because SSA is now by far the poorest region of the world, including African countries could enrich the current understanding of how structural change has recently played out around the world.
Perhaps more important, and most pertinent to this paper, such an analysis could offer insight regarding the continent’s recent economic performance: both its prolonged period of weak economic growth since the 1970s and its period of stronger growth over the past decade.

This paper uses the GGDC data to study the evolution of the distribution of employment between sectors across levels of income experienced in Africa and how it compares with the patterns seen historically in other regions over the course of development. Using as a baseline the patterns seen in other regions historically helps gauge the extent to which structural change in Africa compares with what would be “expected” based on its income levels. Following Duarte and Restuccia (2010) and Herrendorf, Rogerson, and Valentinyi (2014), we start by aggregating the ten sectors in the GGDC Africa Sector Database (GGDC-ASD) into three main categories: agriculture, industry, and services. This is accomplished as follows:

1. Manufacturing, mining, construction, and public utilities are combined into “industry.”
2. Wholesale and retail trade; transport and communication; finance and business services; and community, social, personal, and government services are combined into “services.”
3. “Agriculture” is left as-is.11

In addition to these three sectors, we add a fourth category: manufacturing. For purposes of comparability with the results in Duarte and Restuccia (2010) and Herrendorf, Rogerson, and Valentinyi (2014), we also measure “development” using the log of GDP per capita in international dollars from Maddison (2010).

Figure 1 plots employment shares in agriculture, services, industry, and manufacturing on the y-axis and log GDP per capita on the x-axis for the 11 African countries in the GGDC sample for 1960–2010. The share of employment in agriculture decreases with income while the shares of employment in services and industry both increase with income.
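The three-way aggregation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the sector labels follow the figure 3 legend and footnote 15 rather than the actual GGDC variable names, and the employment counts are hypothetical.

```python
# Assumed sector labels (taken from the figure 3 legend and footnote 15);
# the GGDC files may use different names or codes.
SECTOR_TO_GROUP = {
    "agriculture": "agriculture",
    "mining": "industry",
    "manufacturing": "industry",
    "utilities": "industry",
    "construction": "industry",
    "trade services": "services",       # wholesale and retail trade
    "transport services": "services",   # transport and communication
    "business services": "services",    # finance and business services
    "government services": "services",
    "personal services": "services",    # community, social, and personal services
}


def employment_shares(employment):
    """Return broad-sector employment shares from sector-level employment counts."""
    total = sum(employment.values())
    shares = {"agriculture": 0.0, "industry": 0.0, "services": 0.0}
    for sector, workers in employment.items():
        shares[SECTOR_TO_GROUP[sector]] += workers / total
    # Manufacturing is tracked as a fourth category; it is also counted in industry.
    shares["manufacturing"] = employment["manufacturing"] / total
    return shares


# Hypothetical employment counts (thousands of workers) for one country-year.
example = {
    "agriculture": 600, "mining": 10, "manufacturing": 60, "utilities": 5,
    "construction": 25, "trade services": 150, "transport services": 30,
    "business services": 20, "government services": 60, "personal services": 40,
}
shares = employment_shares(example)
for name, value in shares.items():
    print(f"{name}: {value:.1%}")
```

Because manufacturing is a subset of industry, the four reported shares do not sum to one; only agriculture, industry, and services do.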
These patterns are consistent with those documented by Duarte and Restuccia (2010) and Herrendorf, Rogerson, and Valentinyi (2014) for the rest of the world. Figure 1 also indicates, for Africa, the inverted-U shape for industry that was documented in Duarte and Restuccia (2010) and Herrendorf, Rogerson, and Valentinyi (2014), although this shape seems to be driven mostly by Botswana (green triangles), Mauritius (purple dots), and South Africa (blue triangles). Mauritius is the only country in the Africa sample with a log GDP per capita at or exceeding 9.0, the threshold identified by Herrendorf, Rogerson, and Valentinyi (2014) at which deindustrialization has occurred in the rest of the world, excluding Africa but including many other developing countries. The pattern for manufacturing appears to be similar to the pattern for industry, although, as is discussed next, regression analysis reveals a difference in the two patterns.

11 This aggregation is consistent with that used in Duarte and Restuccia (2010), who also used the pre-Africa GGDC database (along with other sources) to construct their dataset.

Figure 1. Employment Shares by Main Economic Sector, Africa 1960–2010
[Four scatter panels: Agriculture, Industry, Services, and Manufacturing. Each panel plots employment share (%) on the y-axis against log GDP per capita (1990 international $) on the x-axis, with separate markers for BWA, MUS, ZAF, and the rest of Africa.]
Note: For estimation results, see table S.4. GGDC Africa sample includes Botswana, Ethiopia, Ghana, Kenya, Malawi, Mauritius, Nigeria, Senegal, South Africa, Tanzania, and Zambia.
Sources: Maddison (2010) GDP version 2013; GGDC dataset (Timmer, de Vries, and de Vries 2015); authors’ calculations.

Table S.4 (in supplemental appendix) reports results of regressions that test for the shape of these relationships. All specifications include country-fixed effects and the log of GDP per capita; the regressions for industry and manufacturing also include the square of the log of GDP per capita to capture the inverted U-shape documented for non-African countries. The results in columns (1) through (3) confirm that the patterns uncovered in our Africa sample are similar to those uncovered for other countries; that is, the employment share in agriculture is decreasing in the log of GDP per capita and that in services is increasing in the log of GDP per capita. For industry, the results in column (3) are indicative of an inverted U-shaped relationship. However, the results in column (4) indicate that the relationship between log GDP per capita and the employment share in manufacturing is first decreasing and then increasing.

Columns (5) through (8) of table S.4 separate the “rich” African countries in the sample (Botswana, Mauritius, and South Africa) from the “poor” African countries in the sample by interacting log GDP per capita (and its square for industry and manufacturing) with dummy variables for rich and poor Africa. The differences between the rich African countries and the poor African countries in the Africa sample are visually evident in figure 1; table S.1 also indicates the significant gap in economywide labor productivity between the rich African countries and the rest of the countries in the Africa sample.
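As a reading aid, the baseline specification described above can be written out as follows. The notation is an assumption introduced here for illustration ($s^{j}_{c,t}$ for the employment share of sector $j$ in country $c$ and year $t$, $y_{c,t}$ for GDP per capita, $\alpha_c$ for the country fixed effect); the paper itself does not display the estimating equation.

```latex
s^{j}_{c,t} = \alpha_c + \beta_1 \ln y_{c,t} + \beta_2 \left(\ln y_{c,t}\right)^2 + \varepsilon_{c,t},
\qquad \beta_2 \equiv 0 \ \text{for agriculture and services.}
```

Under this form, an inverted U-shape corresponds to $\beta_1 > 0$ and $\beta_2 < 0$, a U-shape to $\beta_1 < 0$ and $\beta_2 > 0$, and in either case the turning point lies at $\ln y^{*} = -\beta_1 / (2\beta_2)$.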
The results in columns (5) and (6) of table S.4 show very little difference in the coefficients on log GDP per capita in the regressions of the employment share in agriculture and services between the rich Africa sample and the poor Africa sample. For example, in poor Africa, a one percent increase in GDP per capita reduces the employment share in agriculture by 0.20 percentage points, while in rich Africa, a one percent increase in GDP per capita reduces the employment share in agriculture by 0.22 percentage points. The results in columns (7) and (8) confirm the differences between the rich African countries and the poor African countries that are shown in figure 1. In particular, the inverted U-shape for industry appears to peak earlier for poor countries than for rich countries. In manufacturing, the signs on log GDP per capita and its square are reversed for the rich African countries.

We also investigate the phenomenon of “premature deindustrialization” in Africa, as described by Rodrik (2016), who found that the share of employment in manufacturing in developing countries is peaking at lower levels of GDP per capita than it did in today’s industrialized countries. Among the 11 African countries in our sample, eight have incomes well below the level of income at which the manufacturing employment share begins to decline as identified by Herrendorf, Rogerson, and Valentinyi (2014).12 Also, in five countries (Ethiopia, Kenya, Malawi, Senegal, and Tanzania) the employment share in manufacturing is still growing.
Of the high-income countries in the Africa sample (Mauritius, Botswana, and South Africa), Mauritius appears to have followed a path much like the high-income East Asian countries in the sample in that manufacturing’s share of employment and value-added reached very high levels and has only recently been replaced by similarly or more productive services. In short, it seems difficult to make the case that Africa is deindustrializing. Thus, with the possible exceptions of Botswana and South Africa, recent patterns of employment shares in Africa appear to fit the stylized facts of other regions’ historical development.13

Although figure 1 and the results in table S.4 suggest that the patterns of employment allocation and income for agriculture, services, industry, and manufacturing are qualitatively similar to the stylized facts based on the experience of other regions, it may be that they differ quantitatively. For instance, although figure 1 confirms that the agricultural employment share and services employment share in Africa decrease and increase, respectively, with the level of income, it could be that the level of agricultural or services employment in Africa is higher than in other regions, perhaps because of resource endowments or productivity levels. Directly comparing the relationship between income levels and the distribution of employment in Africa with other regions over the past several decades indicates whether the process of structural change in Africa is playing out differently than we would expect given current levels of income.

12 GDP per capita in the majority of African countries is also well below the lower threshold of around $6,000 (in 1990 US$) identified by Rodrik (2016) as the turning point for employment deindustrialization.

13 Although Ghana had an employment share in manufacturing of around 14 percent in 1978, its current level of real GDP per capita is quite a bit lower than the income level at which manufacturing employment would be expected to peak, regardless of whether Rodrik’s (2016) threshold or that identified by Herrendorf, Rogerson, and Valentinyi (2014) is used. Thus, in principle, the employment share in manufacturing should continue to grow.

Figure 2. Employment Shares in Africa Compared with Non-Africa Sample, 1960–2010
[Four scatter panels: Agriculture, Industry, Services, and Manufacturing. Each panel plots employment share (%) on the y-axis against log GDP per capita (1990 international $) on the x-axis, with separate markers for Africa and non-Africa observations.]
Note: For estimation results, see table S.4. GGDC full sample includes 39 countries (see table S.1 for the list of the countries).
Sources: Maddison (2010) GDP version (2013); GGDC dataset (Timmer, de Vries, and de Vries 2015); authors’ calculations.

Figure 2 displays employment shares in agriculture, industry, services, and manufacturing on the y-axis and log GDP per capita on the x-axis simultaneously for our sample of African countries and for the rest of the countries in the GGDC sample for the period 1960–2010. As indicated by the legend, red dots in the figure denote African countries, and blue dots denote all other countries in the sample. Two features of the data are immediately evident from the figure. First, in recent years, per capita incomes in most African countries in our sample are among the lowest seen in most of the world since 1960.
Second, the distributions of employment among the African countries appear to fit quite well with those seen over the past six decades in other regions.

To obtain a more precise measure of the differences between our Africa sample and the rest of the world, we regress employment shares on the log of GDP per capita, an interaction between the log of GDP per capita and an Africa dummy, and, for industry and manufacturing only, the square of the log of GDP per capita and its interaction with the Africa dummy. The results of these regressions are reported in columns (1) through (4) of table 1. In the case of agriculture, the coefficient of –0.04 on the interaction term indicates that the employment share in agriculture is falling faster as income increases in Africa as compared with the rest of the world. In other words, the line is steeper, but the magnitude of the difference is small. In the case of services, there is no statistically or economically meaningful difference between Africa and the rest of the world, as a one percent increase in GDP per capita is associated with a 0.18 percentage point increase in the employment share in services.

Table 1. Regression Results for Figure 2: GDP and Employment Shares, Full Sample
Columns (1)–(4): Africa vs. rest of world. Columns (5)–(8): Africa (rich & poor vs. rest of world). Robust standard errors in parentheses.

| | (1) Agriculture | (2) Services | (3) Industry | (4) Manufacturing | (5) Agriculture | (6) Services | (7) Industry | (8) Manufacturing |
|---|---|---|---|---|---|---|---|---|
| lngdp | –0.176*** (0.014) | 0.175*** (0.019) | 0.973*** (0.187) | 0.757*** (0.166) | –0.176*** (0.014) | 0.175*** (0.019) | 0.973*** (0.187) | 0.757*** (0.167) |
| lngdp2 | | | –0.056*** (0.011) | –0.045*** (0.010) | | | –0.056*** (0.011) | –0.045*** (0.010) |
| lngdp × Africa | –0.042* (0.021) | –0.022 (0.023) | –0.774*** (0.194) | –0.858*** (0.170) | | | | |
| lngdp2 × Africa | | | 0.047*** (0.011) | 0.054*** (0.010) | | | | |
| lngdp × AfricaPoor | | | | | –0.025 (0.039) | –0.041 (0.032) | –0.341 (0.368) | –0.382 (0.229) |
| lngdp2 × AfricaPoor | | | | | | | 0.015 (0.026) | 0.018 (0.015) |
| lngdp × AfricaRich | | | | | –0.046** (0.022) | –0.017 (0.024) | –0.763*** (0.193) | –0.877*** (0.175) |
| lngdp2 × AfricaRich | | | | | | | 0.047*** (0.011) | 0.055*** (0.010) |
| Observations | 1873 | 1873 | 1873 | 1873 | 1873 | 1873 | 1873 | 1873 |
| R-squared | 0.636 | 0.585 | 0.473 | 0.422 | 0.636 | 0.586 | 0.474 | 0.424 |

Source: Maddison GDP version (2013); GGDC dataset (Timmer, de Vries, and de Vries 2015); authors’ calculations.
Note: Robust standard errors in parentheses. *** p < 0.01, ** p < 0.05, * p < 0.1. Industry includes manufacturing, mining, construction, and public utilities.

There does appear to be a significant difference between Africa and the rest of the world when it comes to industry and manufacturing. In particular, adding the coefficients on the interactions of log GDP per capita and its square with the Africa dummy to the corresponding coefficients for the rest of the world (columns [3] and [4] of table 1) yields the results in columns (3) and (4) of table S.4. The implication is that, at lower levels of income, the rest of the world has higher employment shares in industry than does Africa, and the inverted U-shape in industry for Africa peaks at a lower employment share in industry. However, once poor Africa is separated from rich Africa, the difference persists only for rich Africa.
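The coefficient addition described above, and the turning points it implies, can be reproduced from the rounded estimates in columns (3) and (4) of table 1. The sketch below is illustrative arithmetic only: the helper names are hypothetical, and because the published coefficients are rounded to three decimals the implied turning points (especially for Africa, where the quadratic term is close to zero) are highly sensitive and should not be read as results reported by the authors.

```python
def turning_point(b1, b2):
    """Log GDP per capita at which b1*x + b2*x**2 reaches its peak or trough."""
    if b2 == 0:
        raise ValueError("a linear relationship has no turning point")
    return -b1 / (2.0 * b2)


def africa_coefficients(lngdp, lngdp2, lngdp_x_africa, lngdp2_x_africa):
    """Add the Africa interaction terms to the rest-of-world coefficients."""
    return lngdp + lngdp_x_africa, lngdp2 + lngdp2_x_africa


# Rounded estimates copied from table 1, columns (3) and (4).
industry = dict(lngdp=0.973, lngdp2=-0.056, lngdp_x_africa=-0.774, lngdp2_x_africa=0.047)
manufacturing = dict(lngdp=0.757, lngdp2=-0.045, lngdp_x_africa=-0.858, lngdp2_x_africa=0.054)

for name, coef in [("industry", industry), ("manufacturing", manufacturing)]:
    row_tp = turning_point(coef["lngdp"], coef["lngdp2"])
    a1, a2 = africa_coefficients(**coef)
    shape = "inverted U (peak)" if a2 < 0 else "U (trough)"
    print(f"{name}: rest of world peaks at lngdp = {row_tp:.2f}; "
          f"Africa b1 = {a1:.3f}, b2 = {a2:.3f}, {shape} at lngdp = {turning_point(a1, a2):.2f}")
```

With these rounded inputs, the rest-of-world peak falls near a log GDP per capita of 8.7 for industry and 8.4 for manufacturing, while the summed Africa coefficients are positive then negative for industry (an inverted U) and negative then positive for manufacturing (first decreasing, then increasing).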
In rich Africa (Botswana, Mauritius, and South Africa), the inverted U-shape in industry is to the left of the inverted U-shape for the rest of the world (column [7] of table 1). Also, in rich Africa the employment share in manufacturing is first falling in income and then rising at an increasing rate; in other words, at the levels of GDP per capita observed in the data over the past 50 years, the pattern follows more or less an upward sloping line.14 By contrast, the size and significance of the interaction terms that include poor Africa (columns [5]–[8] of table 1) indicate that the patterns observed in poor Africa appear to be similar to the patterns observed in the rest of the world.

Figure 3 illustrates that, among the 11 African countries in the GGDC sample, the productivity gaps are indeed enormous across sectors. Each bin in the figure corresponds to one of the nine sectors in the dataset,15 with the width of the bin corresponding to the sector’s share of total employment and the height corresponding to the sector’s labor productivity level as a fraction of average labor productivity. Agriculture, at 35 percent of average productivity, has the lowest productivity by far; manufacturing productivity is 1.7 times the economywide average, and that in mining is 16.8 times the average. Furthermore, the figure makes evident that the majority of employment in the African sample is in the most unproductive sectors, with roughly two-thirds of the labor force in the two sectors with below-average productivity (agriculture and personal services). Based on this figure, it appears that the potential for structural change to contribute to labor productivity growth is still quite large.

14 Although the coefficients in the regression suggest a U-shaped relationship, when we plug actual log GDP per capita into the fitted equation the relationship is more linear than U-shaped.

15 Figure 3 excludes government services.

Figure 3. Labor Productivity Gaps in Africa, 2010
[Bar chart: sector-relative labor productivity (economywide labor productivity = 100) on the y-axis against cumulative share of total employment (%) on the x-axis, for agriculture, personal services, trade services, construction, manufacturing, transport services, business services, utilities, and mining. Data labels shown in the figure: 35, 69, 111, 162, 171, 325, 422, 484, and 1,681.]
Note: The sector-relative labor productivity and sector share of employment are calculated using the weighted average for the region; the country data are in 2005 purchasing power parity dollars. The total employment considers only the employment in the private sector.
Source: GGDC datasets (Timmer, de Vries, and de Vries 2015); authors’ calculations.

The productivity gaps described here refer to differences in average labor productivity. When markets work well and structural constraints do not bind, productivities at the margin should be equalized. Under a Cobb-Douglas production function specification, the marginal productivity of labor is the average productivity multiplied by the labor share. Thus, if labor shares differ greatly across economic activities, then comparing average labor productivities can be misleading. The fact that average productivity in mining is so high, for example, simply indicates that the labor share in this capital-intensive sector is quite small. In the case of other sectors, however, there does not appear to be a clearly significant bias. Once the share of land is taken into account, for example, it is not obvious that the labor share in agriculture is significantly lower than in manufacturing (Mundlak, Butzer, and Larson 2012).
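The link between marginal and average productivity invoked above can be made explicit. The notation is introduced here for illustration only ($Y_i$, $K_i$, and $L_i$ for output, capital, and labor in sector $i$, and $\alpha_i$ for the sector's labor share); the paper states the result in words.

```latex
Y_i = A_i K_i^{1-\alpha_i} L_i^{\alpha_i}
\;\Longrightarrow\;
\frac{\partial Y_i}{\partial L_i} = \alpha_i \frac{Y_i}{L_i},
\qquad
\frac{\partial Y_m / \partial L_m}{\partial Y_a / \partial L_a}
= \frac{\alpha_m}{\alpha_a} \cdot \frac{Y_m / L_m}{Y_a / L_a}.
```

When the labor shares in manufacturing ($m$) and agriculture ($a$) are similar, the ratio $\alpha_m / \alpha_a$ is close to one, and the gap in average products carries over almost one for one to the gap in marginal products.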
Therefore, the fourfold difference in average labor productivity between manufacturing and agriculture does point to large gaps in marginal productivity.

An additional concern with the data presented in figure 3 is that the productivity gaps may be mismeasured. For example, differences in hours worked or human capital per worker could be driving the observed productivity gaps. However, in a recent paper, Gollin, Lagakos, and Waugh (2014) used microdata to take into account sectoral differences in hours worked and human capital, as well as alternative measures of sectoral income; after doing so, they still found large differences in productivity between agriculture and other sectors of the economy. The agricultural productivity gaps for SSA (presented by country in appendix 3 of their paper) range from a low of 1.14 in Lesotho all the way to 8.43 for Gabon.

Thus, our preliminary analysis reveals some important stylized facts about countries in Africa. First, when the patterns of employment in Africa are compared to the patterns observed in other regions across levels of development, the pattern among our sample follows that seen in other regions for agriculture and services, that is, the agricultural employment share is decreasing in income while the services employment share is increasing in income. Second, when the levels of employment shares are compared to the levels observed in other countries, the levels of employment shares in agriculture and services approximate the levels observed in other countries at similar levels of income. Third, all of this holds for industry and manufacturing in the eight low-income African countries.
Fourth, in Botswana, Mauritius, and South Africa, the patterns in industry are similar but the levels differ, and, in the case of manufacturing, the relationship between income and employment shares follows more of an upward sloping line than an inverted U-shape. Fifth, Africa is still, by far, one of the poorest regions of the world. And finally, structural change remains a potent source of labor productivity growth in much of SSA.

There are a number of reasons to believe that structural change might have been delayed in much of Africa, and it is only relatively recently that much of Africa has begun to grow rapidly. Part of this had to do with the rise in commodity prices that began in the early 2000s, although Africa is also starting to reap the benefits of economic reforms and improved governance. For example, three of the fastest-growing countries in Africa (Ethiopia, Rwanda, and Tanzania) continue to grow rapidly despite the decline in commodity prices. In fact, according to the World Economic Outlook 2016 published by the IMF, economic growth in Africa in 2015 only slowed down in a handful of oil exporters and is expected to rebound by 2021.

To explore the nature of Africa’s recent growth, we investigate structural change in Africa, including the most recent period in history for which data are available: 2000–2010. This most recent period is important because it was during this time that Africa experienced the strongest growth in four decades. The key question is whether this growth was accompanied by labor productivity growth and structural change.

III. Patterns of Structural Change Across Regions and Countries

This section begins by describing the methodology used to measure structural change. This is followed by a description of patterns of structural change across the following country groupings for 1990–1999 and for 2000–2010: Africa, Asia, Latin America, and the Organization for Economic Co-operation and Development (OECD) countries.
The section concludes with a discussion of the heterogeneous experiences across the African continent.

Measuring Structural Change

Labor productivity growth can be achieved in one of two ways. First, productivity can grow within existing economic activities through capital accumulation or technological change. Second, labor can move from low-productivity to high-productivity activities, increasing overall labor productivity in the economy. This can be expressed using the following decomposition:

$$\Delta P_t = \sum_{i=n} \theta_{i,t-k}\,\Delta p_{i,t} + \sum_{i=n} p_{i,t}\,\Delta \theta_{i,t}, \tag{1}$$

where $P_t$ and $p_{i,t}$ refer to economywide and sectoral labor productivity levels, respectively, and $\theta_{i,t}$ is the share of employment in sector $i$. The $\Delta$ operator denotes the change in productivity or employment shares between $t-k$ and $t$. The first term in the decomposition is the weighted sum of productivity growth within individual sectors, where the weights are the employment share of each sector at the beginning of the period. Following McMillan and Rodrik (2011), we call this the “within” component of productivity growth. The second term captures the productivity effect of labor reallocations across different sectors. It is essentially the inner product of productivity levels (at the end of the period), with the change in employment shares across sectors. When changes in employment shares are positively correlated with productivity levels, this term will be positive. Structural change will increase economywide productivity growth. Also following McMillan and Rodrik (2011), we call this second term the “structural change” term.

The second term in equation (1) could be further decomposed into a static and dynamic component of structural change as in de Vries, Timmer and de Vries (2015).
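A minimal sketch of equation (1), together with the optional static and dynamic split of its second term, is given below. The two-sector numbers are hypothetical and chosen only so that the arithmetic can be checked by hand; the function and variable names are illustrative and do not come from the paper.

```python
def decompose(p0, p1, theta0, theta1):
    """Return the (within, structural change) terms of equation (1).

    p0, p1: sectoral labor productivity at t-k and t.
    theta0, theta1: sectoral employment shares at t-k and t.
    """
    within = sum(theta0[i] * (p1[i] - p0[i]) for i in p0)
    structural = sum(p1[i] * (theta1[i] - theta0[i]) for i in p0)
    return within, structural


def split_structural(p0, p1, theta0, theta1):
    """Split the structural change term into static and dynamic parts."""
    static = sum(p0[i] * (theta1[i] - theta0[i]) for i in p0)
    dynamic = sum((p1[i] - p0[i]) * (theta1[i] - theta0[i]) for i in p0)
    return static, dynamic


# Hypothetical two-sector economy: labor moves from agriculture to manufacturing.
p0 = {"agriculture": 1.0, "manufacturing": 4.0}
p1 = {"agriculture": 1.2, "manufacturing": 4.4}
theta0 = {"agriculture": 0.7, "manufacturing": 0.3}
theta1 = {"agriculture": 0.6, "manufacturing": 0.4}

within, structural = decompose(p0, p1, theta0, theta1)
static, dynamic = split_structural(p0, p1, theta0, theta1)

# Economywide productivity is the employment-weighted sum of sectoral productivity.
delta_P = sum(theta1[i] * p1[i] for i in p1) - sum(theta0[i] * p0[i] for i in p0)

print(f"within = {within:.2f}, structural = {structural:.2f}, total = {delta_P:.2f}")
print(f"static = {static:.2f}, dynamic = {dynamic:.2f}")
```

In this toy example the two terms sum exactly to the change in economywide productivity, and the static and dynamic parts sum to the structural change term. Agriculture's own contribution to the dynamic part is negative (its productivity rises while its employment share falls), even though the move of workers toward manufacturing raises economywide productivity.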
As in McMillan and Rodrik (2011), we choose not to do this because the dynamic structural change component of the structural change term is often negative but difficult to interpret. For example, when agricultural productivity growth is positive and the labor share in agriculture is falling, the term is negative, even though, on average, the movement of workers out of agriculture to other more productive sectors of the economy makes a positive contribution to structural change and economywide labor productivity growth. Moreover, structural change is, by its very nature, a dynamic phenomenon; thus, we find it counterintuitive to label a part of structural change static.

The decomposition we use clarifies how partial analyses of productivity performance within individual sectors (for example, manufacturing) can be misleading when there are large differences in labor productivities ($p_{i,t}$) across economic activities. In particular, a high rate of productivity growth within a sector can have quite ambiguous implications for overall economic performance if the sector’s share of employment shrinks rather than expands. If the displaced labor ends up in activities with lower productivity, economywide growth will suffer and may even turn negative.

This decomposition can be used to study broad patterns of structural change within a country and across countries. An example of this type of analysis can be found in McMillan and Rodrik (2011). Individual components of the decomposition such as labor shares and within-sector changes in productivity can also be used at the country level to dig deeper into where structural change is or is not taking place and to gain a deeper understanding of the country-specific factors that drive structural change.
For example, if we know that the expansion of manufacturing is a characteristic of structural change in a particular country, we could use more detailed data on manufacturing to pinpoint which specific industries expanded, how many people were employed, and whether specific events or policies contributed to the expansion or contraction of a particular sector. For country-specific analyses of this type, refer to Structural Change, Fundamentals, and Growth: A Framework and Country Studies (forthcoming), edited by McMillan, Rodrik, and Sepulveda.

Structural Change in Africa in Comparison to Latin America and Asia

The previous discussion indicated that the distribution of employment levels across sectors in our Africa sample is fairly similar to what would be “expected” based on current levels of income. We now investigate the changes in employment shares within African countries and the effect of those changes on economywide labor productivity. The analysis begins using the GGDC sample, breaking the period into two: 1990–1999 and 2000–2010. As previously noted, the early 1990s in Africa were still a period of adjustment. The period starting around 2000 marks the beginning of a rapid acceleration in growth rates across much of the continent.

Table 2 presents the central findings on patterns of structural change for 1990–1999 and 2000–2010 for four groups of countries: Latin America, SSA, Asia, and high-income countries. Results are presented by country for the Africa sample; weighted and unweighted averages for all four groups of the countries appear in the bottom four panels of table 2. The most striking result is Africa’s turnaround.
Between 1990 and 1999, and using the weighted average, structural change was a drag on economywide labor productivity growth in Africa; this result is largely driven by Nigeria and was a result of structural change in the wrong direction. From 2000 to 2010, however, structural change contributed between 0.93 and 1.25 percentage points to economywide labor productivity growth in Africa, depending on whether weighted or simple averages are used. If only the eight low-income African countries are considered, structural change contributed 1.57 percentage points to economywide labor productivity growth. Moreover, overall labor productivity growth in Africa was second only to Asia, where structural change continued to play a positive role. The biggest difference between low-income Africa and Asia for 2000–2010 is that Asia experienced significantly greater within-sector productivity growth.

Table 2. Decomposition of Labor Productivity Growth, 1990–2010 (Using GGDC Data)

                                                   1990–1999                           2000–2010
                                             Total      Within  Structural       Total      Within  Structural
                                                        sector      change                  sector      change
Botswana                                      1.58        1.82       –0.25        2.17        2.81       –0.64
Ethiopia                                      0.17       –0.70        0.87        4.52        2.22        2.30
Ghana                                         3.20        2.53        0.67        2.68        2.07        0.61
Kenya                                        –1.65       –4.38        2.74        0.68        0.81       –0.13
Malawi                                        1.53       –0.22        1.75        1.67       –1.53        3.20
Mauritius                                     3.47        2.42        1.05        3.41        2.91        0.50
Nigeria                                      –0.23       10.68      –10.91        4.59       –0.91        5.49
Senegal                                       0.23       –0.74        0.97        1.11       –0.03        1.14
South Africa                                 –0.57       –0.45       –0.12        2.90        2.92       –0.02
Tanzania                                      1.07        0.49        0.58        4.03        0.31        3.72
Zambia                                       –3.05       –1.87       –1.19        3.24        2.71        0.54
SSA weighted average                         –0.40        0.68       –1.08        2.54        1.60        0.93
SSA weighted ave. excluding Nigeria           0.67        0.00        0.67        1.79        0.54        1.25
SSA simple average                            0.52        0.25        0.27        2.82        1.69        1.13
Africa low-income, simple average             0.16       –1.13        0.28        2.81        1.24        1.57
Africa high-income, simple average            1.49        1.26        0.23        2.83        2.88       –0.05
Asia weighted average                         4.84        3.59        1.26        6.58        5.38        1.20
Asia simple average                           3.98        3.20        0.79        3.37        2.97        0.39
LA weighted average                           0.76        0.87       –0.11        1.61        1.18        0.44
LA simple average                             0.91        0.77        0.15        0.08        0.01        0.07
High-income countries weighted ave.           1.46        1.32        0.13        1.23        1.26       –0.04
High-income countries simple ave.             1.54        1.64       –0.10        0.84        1.09       –0.25

Note: SSA = Africa south of the Sahara; LA = Latin America. The regional weighted averages are calculated using the regional data for sector value-added and sector labor employment. The sector value-added data of GGDC are converted into 2005 purchasing power parity dollars. Because of the size of Nigeria, its effect on the SSA weighted average results is large when Nigeria’s growth rate differs from that of other countries. Excluding Nigeria brings the weighted average results closer to the simple average results. Africa low-income countries include Ethiopia, Ghana, Kenya, Malawi, Nigeria, Senegal, Tanzania, and Zambia, and high-income countries include Botswana, Mauritius, and South Africa.
Sources: GGDC dataset (Timmer, de Vries, and de Vries 2015); authors’ calculation. Employment data for Tanzania are adjusted according to the 2012 census (National Bureau of Statistics 2014); data for Zambia are adjusted according to Resnick and Thurlow (forthcoming).

Of course, the country-specific results for Africa presented in table 2 indicate a great deal of heterogeneity across the countries in the sample. Between 2000 and 2010, economywide labor productivity growth was highest in the low-income countries of Ethiopia, Nigeria, and Tanzania. In all three of these countries, structural change was growth-enhancing and was responsible for the majority of labor productivity growth.
By contrast, in the three richest countries in the Africa sample (Botswana, South Africa, and Mauritius), labor productivity growth is almost exclusively accounted for by within-sector productivity growth. This finding is not surprising given the relatively low shares of agricultural employment in each of these three countries.

Like McMillan and Rodrik (2011), we find that structural change has made very little contribution (positive or negative) to the overall growth in labor productivity in the high-income countries in the sample. This result is as expected, because intersectoral productivity gaps tend to diminish during the course of development. Even though many of these advanced economies have experienced significant structural change during this period, with labor moving predominantly from manufacturing to service industries, this (on its own) has made little difference to productivity overall. What determines economywide performance in these economies is, by and large, how productivity fares in each sector.

We can gain further insight into the results by looking at the sectoral details by region for the developing countries in the sample. All numbers reported are simple averages across countries in each of the four groups. The four panels of figure 4 show changes in employment shares for 2000–2010, relative labor productivity for 2010, and initial employment shares by sector for 2000. Sectors are generally ranked from highest to lowest employment share in 2000. Employment shares in 2000 are denoted by triangles, and the value of the shares is noted on the right y-axis. Clearly, countries in Africa started with the highest employment share in agriculture in 2000, at close to 60 percent for all of the African countries and 70 percent for the low-income African countries.
The next highest initial employment shares in agriculture were in Asia, at more than 40 percent, and in Latin America, at less than 20 percent. By this measure, African countries clearly had (and still have) the most to gain from structural change.

Figure 4. Relative Labor Productivity (2010), Employment Shares (2000), and Change in Employment Shares (2000–2010)
[Figure not reproduced. Panels: (a) SSA all countries; (b) SSA low-income countries; (c) Asian developing countries; (d) Latin America. Left axis: change in employment share in 2000–10 (%) and relative Lprody (economywide Lprody = 1). Right axis: employment share in 2000 (%).]
Notes: SSA = Africa south of the Sahara; Lprody = labor productivity. (1) For part (a), SSA all countries includes Botswana, Ethiopia, Ghana, Kenya, Malawi, Mauritius, Nigeria, Senegal, South Africa, Tanzania, and Zambia. For part (b), SSA low-income countries exclude Botswana, Mauritius, and South Africa. For part (c), Asian developing countries include China, India, Indonesia, Malaysia, Philippines, and Thailand. For part (d), Latin America includes Argentina, Bolivia, Brazil, Chile, Colombia, Costa Rica, Mexico, Peru, and Venezuela. (2) Relative Lprody means sector labor productivity divided by economywide labor productivity. (3) In 2010, the economywide labor productivity averaged $10,342 for SSA all countries, $6,006 for SSA low-income countries, $13,416 for Asian developing countries, and $28,088 for Latin American countries (all measured in 2005 purchasing power parity dollars). The simple average is used in the calculation in the figure.
Source: GGDC datasets (Timmer, de Vries, and de Vries 2015); authors’ calculations.

In all four country groups, the share of employment in agriculture fell, with the decline greatest in low-income Africa at 9.3 percentage points. The manufacturing employment share increased only in the low-income African countries, while it actually fell in the developing Asian countries and in Latin America. In all African countries, an examination of the purple diamonds indicates that average labor productivity in the sectors where employment is expanding was higher than average labor productivity in agriculture. Indeed, this is what drives the growth decomposition results presented in table 2. However, the expansion of the employment share in trade services is largest. Although this sector’s average productivity is currently higher than that in agriculture, it is not clear that this gap will be maintained if more and more workers shift into this sector. Also, in all African countries, relative labor productivity in mining and utilities is extremely high. However, these sectors are highly capital intensive and unable to absorb large numbers of workers, which can be seen by examining the employment shares in 2000 by sector as denoted by the red diamonds.
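To make the mechanics behind table 2 concrete, the following minimal sketch applies the two-term decomposition of McMillan and Rodrik (2011) in levels: the within-sector term weights each sector’s productivity change by its initial employment share, and the structural change term weights each sector’s change in employment share by its end-of-period productivity. The three sectors, the function names, and all numbers are hypothetical and chosen only for illustration; they are not drawn from the GGDC data, and the results in table 2 are expressed as annual growth rates rather than level changes.

```python
def economywide(shares, prod):
    """Economywide labor productivity: employment-share-weighted sector productivity."""
    return sum(shares[s] * prod[s] for s in shares)


def decompose(shares_start, shares_end, prod_start, prod_end):
    """Return (within, structural) contributions to the change in
    economywide labor productivity between two periods."""
    sectors = list(shares_start)
    # Within-sector term: productivity change weighted by the initial employment share.
    within = sum(shares_start[s] * (prod_end[s] - prod_start[s]) for s in sectors)
    # Structural change term: change in employment share weighted by
    # end-of-period productivity.
    structural = sum(prod_end[s] * (shares_end[s] - shares_start[s]) for s in sectors)
    return within, structural


# Hypothetical economy (illustrative numbers only): labor leaves low-productivity
# agriculture for manufacturing and services.
shares_2000 = {"agriculture": 0.70, "manufacturing": 0.05, "services": 0.25}
shares_2010 = {"agriculture": 0.60, "manufacturing": 0.07, "services": 0.33}
prod_2000 = {"agriculture": 1.0, "manufacturing": 4.0, "services": 3.0}
prod_2010 = {"agriculture": 1.1, "manufacturing": 4.2, "services": 3.0}

within, structural = decompose(shares_2000, shares_2010, prod_2000, prod_2010)
total = economywide(shares_2010, prod_2010) - economywide(shares_2000, prod_2000)

print(f"within-sector term:     {within:.3f}")
print(f"structural change term: {structural:.3f}")
print(f"total change:           {total:.3f}")
```

In this hypothetical case the two terms sum exactly to the total change, and the structural change term is positive because the sectors that gain employment share are more productive than agriculture, which is the pattern the text describes for Africa in 2000–2010.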
So far, this analysis has revealed that structural change became growth enhancing in Africa during 2000–2010 and that, with the exception of manufacturing, the analysis for the other three regions remains largely similar to results presented in McMillan and Rodrik (2011). For the 11 African countries in the GGDC sample, annual labor productivity grew by an (unweighted) average of 2.82 percent, and structural change contributed an (unweighted) average of 1.13 percentage points to overall labor productivity growth. Put differently, from 2000 to 2010, structural change accounted for 40 percent of Africa’s annual labor productivity growth (1.13/2.82 ≈ 0.40). This positive contribution of structural change to economywide growth paints a somewhat more optimistic picture of growth in Africa than did the results in McMillan and Rodrik (2011) and is more consistent with the results in McMillan, Rodrik, and Verduzco-Gallo (2014).

The remaining sections of this paper dig into the robustness of these results using an alternative source of data for employment shares: the Demographic and Health Surveys (DHS). The paper then turns to a discussion of the broader implications of the results presented here.

IV. Using the DHS to Understand Structural Change

Our first objective in this section of the paper is to use the DHS data to check the robustness of the results we obtained on changes in employment shares in the previous section of the paper. There are eight countries included in both the GGDC dataset and the DHS dataset. In addition, since structural change should be most pronounced in countries with the highest share of the labor force in agriculture, and because these are almost always the poorest countries, using the DHS has the added advantage of giving us a window into what is happening in the very poor African countries. The statistics in table S.3 confirm that the GGDC sample is biased toward the richer countries in Africa.
Thus, to incorporate more of the poorer African countries into this analysis, we turn to the DHS. This section explains both the advantages and the limitations of the DHS and then provides an analysis.

The DHS Data

Although the DHS is not designed as a labor force survey, it does contain a module on employment status and occupation for women and men between the ages of 15 and 59. Because information on men is not provided for all DHS countries and survey rounds, this paper only uses surveys that include both women and men. Table S.5 (in the supplemental appendix) provides a list of the surveys, by country, used in this analysis. In total, the sample contains information for about 750,000 women and 250,000 men. Because the samples are nationally representative, they include employment in both formal and informal sectors. However, the data do not appear to be well suited to distinguishing between the two, because many of the questions that could be used to do so were left unanswered.

An advantage of the DHS for analyzing determinants and trends of occupation types across countries and over time is that the design and coding of variables (especially those on type of occupation, educational achievements, household assets, and dwelling characteristics) are generally comparable across countries and survey rounds (see table S.6 in the supplemental appendix for a list of questions by survey round). At the household level, the DHS provides information on household socioeconomic characteristics, household structure, and family composition, enabling analysis of the distribution and determinants of occupation types by socioeconomic characteristics and of changes in the distribution over time.
Note that this does not mean the original DHS files are free of “recode” errors; we corrected these kinds of errors, and details of this procedure are available upon request.

A second and important advantage of the DHS data is that, in addition to an individual’s occupation, the data contain information on the individual’s gender, age, educational status, and location. Thus, for example, the data enable an examination of changes in occupational status by location, gender, age, and educational status.

A disadvantage of DHS data is that household income and expenditures are not included, although available information on household assets can be used to construct an asset index to proxy for individual or household welfare. In addition, measures of nutrition, health, and education can be combined with information on assets to gain a more complete measure of wellbeing.

This paper restricts the sample to African countries for which at least two DHS surveys are available, allowing us to analyze trends over time. The large coverage of countries and survey years leaves a sample size of 24 African countries, capturing the period from 1998 to 2014. As was done to check the representativeness of the GGDC sample in section II, we compare the countries in the DHS sample to all countries in Africa to assess the bias in the DHS sample. The results of this analysis are presented in column (3) of table S.3. A comparison of average infant mortality rates and education levels shows no statistical difference between the countries in the DHS sample and the rest of SSA.
However, the countries in the DHS sample have an average level of GDP per capita that is significantly lower than the overall average for Africa, which is not surprising given that the DHS are funded by the United States Agency for International Development and that the mandate is to focus on the poorest countries in the world.

As noted by Young (2012), the raw DHS files include coding errors; therefore, the data need to be examined on a country-by-country basis to ensure accuracy. The most glaring coding error was for Mali in 2006, when agricultural workers were accidentally classified as military workers. Coding errors such as this indicate that it is not a good idea to take the aggregate statistics provided by DHS on the Internet at face value. It also explains why, for example, some researchers have found the aggregate data on occupational shares published on the website to be unreliable. A detailed description of the way in which we arrived at our final sample is available upon request.

To assign individuals to occupational categories, we rely on the question on occupation for women and men. The DHS provides a grouped occupation variable that relies on the question that asks what the respondent mainly does for work.16 The DHS sorts responses into one of eight categories: (1) not working; (2) professional/technical/managerial/clerical; (3) sales; (4) agricultural (self-employed); (5) agricultural (employee); (6) household and domestic services; (7) skilled manual; and (8) unskilled manual. For this paper, we adjust these categories in the following ways. We combine categories (4) and (5) into a group called “agricultural occupation.” We would have liked to keep these two categories separate, but there were not enough surveys in which this type of information was collected.
We combine categories (3) and (6) into a group called “services.” We combine categories (7) and (8) into a group called “industry.” We retain category (2) in its original form and rename it “professional.” Finally, we retain category (1) in its original form only for adults 25 and older and split this category into “in school” and “not working” for youth aged 15–24 years. Thus, in total, we have five occupational categories for adults: agriculture, services, industry, professional, and not working, plus a sixth category of “in school” for youth (those aged 15–24 years).

16 Variable v717: “What is your occupation, that is, what kind of work do you mainly do?”

Changes in Occupational Shares over Time and across Countries in Africa

The first goal in this subsection is to check whether the changes in employment shares reported in section III are also apparent in the DHS data. To this end, we compare changes in employment shares in agriculture as reported in the GGDC data with changes in employment shares in agriculture in the DHS data for the eight countries for which the samples overlap. These countries are Ethiopia, Ghana, Kenya, Malawi, Nigeria, Senegal, Tanzania, and Zambia. Because the GGDC data are annual, the agricultural employment share from the GGDC is matched to the exact survey year in the DHS. For example, the DHS surveys in Kenya were conducted in 1998 and 2009; the agricultural employment shares in these two years are paired with the GGDC agricultural employment shares for 1998 and 2009. Figure 5 presents the results of this analysis.

Figure 5. Comparison of Changes in Agriculture Employment Shares: GGDC versus DHS
[Figure not reproduced. Bars for Ethiopia, Ghana, Kenya, Malawi, Nigeria, Senegal, Tanzania, Zambia, the simple average, and the weighted average; series: GGDC, post 2000 and DHS, post 2000; vertical axis from 0 to –18.]
Note: Because the GGDC data are annual, the data for the survey years in the relevant DHS country are matched to the corresponding year in the GGDC dataset.

Because the DHS occupational categories do not correspond directly to those reported by the GGDC (except for agriculture), this analysis of the DHS data begins by focusing on the share of the population engaged in agriculture. Table S.7 (in the supplemental appendix) shows, by country, period, and gender, the percentage of the population who report that their primary occupation is agriculture. Since the DHS were done in waves but in different years for different countries, the period is broken into two intervals that correspond roughly to waves 3 and 4 (1998–2005) and waves 5 and 6 (2006–2014). This analysis focuses on the latter period because this is when growth in Africa picked up; therefore, we expect to see the most significant declines in agricultural employment shares during this time. In the rare event that two surveys were conducted in one of the subperiods, the employment shares represent a simple average across survey years. Results are broken out by gender because women often report that they are not working. In addition, this exercise focuses on workers age 25 and older to avoid confounding the results with children who may be in school.

We begin by drawing the reader’s attention to the averages at the bottom of table S.7. Average 1 is the unweighted average; average 2 is the labor force weighted average across countries. This discussion focuses on the labor force weighted averages, or average 2. For males and females combined, the share of respondents who reported that their primary occupation is agriculture fell from around 61 percent to 51 percent, a decline of roughly 10 percentage points.
This finding is similar to the percentage point decline in the share of population working in agriculture in the low-income African countries in the GGDC sample (9.3 percentage points). Interestingly, this decline was more pronounced for women than for men. Thus, we can conclude, with some degree of confidence, that there has been a significant decline in the share of the labor force engaged in agriculture in Africa starting around 2000.

The second thing made clear by table S.7 is the enormous cross-country heterogeneity in employment shares in agriculture and in changes in employment shares in agriculture. For example, focusing on the most recent period (2006–2014), the share of females engaged in agriculture in Rwanda was 80.8 percent while the share of females engaged in agriculture in Namibia was only 3.0 percent. The differences are equally striking for males; the share of the male population working in agriculture was 74.0 percent in Ethiopia while it was only 9.7 percent in Namibia. Although in almost all countries the share of the labor force engaged in agriculture fell, in Madagascar the share of the labor force engaged in agriculture increased for both women and men. This increase in Madagascar is consistent with the increase in poverty in Madagascar over the same period as pointed out by Arndt, McKay, and Tarp (2016). Although not the central focus of this paper, it is worth noting that the cross-country heterogeneity has important policy implications, some of which have been described in recent work by Dercon and Gollin (2014).

There is also significant heterogeneity across subgroups of the population. Figure 6 shows the average ten-year change in employment shares in selected occupations based on the DHS data.
As previously noted, the occupations are grouped into the following categories: agriculture, services, professional, industry, and not working. For youth, there is the additional category of “in school.” Agriculture includes subsistence farmers and commercial farmers. Unfortunately, details about occupations are not provided on a consistent enough basis to create more disaggregated occupation codes. Services include, but are not limited to, secretaries and typists, sales clerks, street vendors, drivers, and traditional healers. Professional occupations include, but are not limited to, business owners, engineers, financiers, teachers, doctors, health professionals, lawyers, and civil servants. Industry includes skilled and unskilled manual labor. Unskilled manual labor includes, but is not limited to, garbage collectors, construction workers, and factory workers. Skilled manual labor includes, but is not limited to, masons, mechanics, blacksmiths, telephone installers, and tailors.

Figure 6. Average Change in the Probability of Working in Selected Occupation Types
[Figure not reproduced. Panels: ten-year change in occupation for women (age 25+), men (age 25+), women (age 15–24), and men (age 15–24); series: urban, rural, total; categories: agriculture, service, professional, industry, and not working, plus in school for the age 15–24 panels.]
Note: Subsample of all male and female (aged 25 plus and 15–24, respectively) agricultural workers who currently are not attending school. Results are based on annual averages obtained from country-specific data for 1998–2005 and 2006–2014, which are then multiplied by 10 to get the average ten-year change in employment shares.
Source: DHS datasets (ICF International 2016); authors’ calculations.

Figure 6 shows the average ten-year change in the share of the population working in each occupation by population subgroup. Because the interval between surveys varies across countries and because this analysis describes general trends, the following procedure is used to obtain estimates of the ten-year change in employment shares: for each country and occupation, we compute the change in the employment share between the first survey year and the last survey year. We then divide this change by the number of years to get an annual change in the employment share for each country and occupation subgroup. We then multiply these annual changes by ten to get the average ten-year change in employment shares by country and occupation subgroup. To create the average across countries, we compute a labor force weighted average of the ten-year changes for each occupation subgroup.

The results in figure 6 are presented separately for females and males by age group. Within these groups, results are presented for both rural and urban dwellers. In each panel of the figure, urban is shaded in red diagonals, rural is shaded in blue vertical lines, and the total change in the predicted employment share is denoted with a dashed line with black diamonds.

The patterns that emerge are generally consistent with the patterns presented in figure 4(b) with additional nuances for population subgroups. For example, there is a decline in employment shares in agriculture for men and women age 25 and older; the decline is larger for females than for males.
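The annualization procedure behind figure 6 can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the two countries, their survey years, agricultural employment shares, and labor force sizes are hypothetical placeholders, not values from the DHS.

```python
def ten_year_change(first_year, first_share, last_year, last_share):
    """Annualize the change in an employment share between the first and
    last survey years, then scale it to a ten-year change."""
    annual_change = (last_share - first_share) / (last_year - first_year)
    return 10 * annual_change


def weighted_average(values, weights):
    """Labor force weighted average across countries."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical inputs for one occupation (agriculture):
# (first survey year, share, last survey year, share, labor force in millions)
surveys = {
    "Country A": (2000, 0.70, 2010, 0.62, 30.0),
    "Country B": (2003, 0.50, 2011, 0.46, 10.0),
}

changes = [ten_year_change(*row[:4]) for row in surveys.values()]
weights = [row[4] for row in surveys.values()]
average_change = weighted_average(changes, weights)

for country, change in zip(surveys, changes):
    print(f"{country}: ten-year change = {change:+.4f}")
print(f"labor force weighted average = {average_change:+.4f}")
```

Scaling every country to a common ten-year horizon keeps countries with surveys eight years apart comparable to those with surveys ten or more years apart, at the cost of assuming that the change was spread evenly over the interval.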
In addition, and not surprisingly, the biggest declines occurred in rural areas. A second pattern that emerges and that is consistent with the results in figure 4(b) is an overall increase in the predicted share of employment in services, including professional services. One of the most interesting patterns in the figure is the fairly large increase in the share of rural youth in school. Although it is fairly well established that primary school enrollment rates have been rising in many African countries, the less well-known fact documented here is that this is not just an urban phenomenon.

Finally, figure 7 shows changes in agricultural employment shares by educational status, gender, and location for the population age 25–59. The left axis and the blue bars show changes in employment shares while the red dotted line and the right axis show initial employment shares. All values reported are labor force weighted averages; the procedure for obtaining the ten-year average is the same as the procedure used to obtain the results in figure 6.

Figure 7. Agricultural Employment Share (%) by Level of Education for Population Age 25–59, 1998–2005 and 2006–2014
[Figure not reproduced. Panels: rural female, rural male, urban female, urban male; categories: no education, primary, secondary or higher. Left axis: change in employment share between 1998–2005 and 2006–2014 (%). Right axis: agricultural employment share in 1998–2005 (%).]
Notes: Results are based on annual averages obtained from country-specific data for 1998–2005 and 2006–2014, which are then multiplied by ten to get the average ten-year change in employment shares. Weighted averages are computed using the size of each country’s labor force.
Source: DHS datasets (ICF International 2016); authors’ calculations.

All four panels show that the employment share in agriculture is highest among the cohort of the population with no education. Employment shares in agriculture declined the most among rural females with a primary education (–12.11 percent). However, the decline was also pronounced for rural females with no education (–10.34 percent). Not surprisingly, there was very little movement in employment shares in agriculture among urban males.

V. Conclusion

Africa has been largely absent from empirical work on structural change. This paper aims to fill that gap. It begins by documenting a number of stylized facts. First, recent patterns of employment shares in Africa appear to fit the stylized facts of the historical development in other regions. In other words, controlling for income, the quantitative patterns of employment shares in Africa are roughly what would be expected based on what has transpired elsewhere.
Second, between 2000 and 2010, structural change contributed 1.57 percentage points to annual labor productivity growth in low-income African countries. Moreover, overall labor productivity growth in Africa was second only to Asia, where structural change continued to play an important positive role. There is, however, an important difference between the two regions: the share of employment in manufacturing in developing Asian countries is more than double the share of employment in manufacturing in low-income African countries.

As in other developing regions, structural change in SSA has been characterized by a significant decline in the share of the labor force engaged in agriculture. This is a positive development, because agriculture has been, on average, the least productive sector in the economies of Africa. However, unlike other developing regions, structural change in Africa has not yet been accompanied by a significant expansion in the share of the labor force employed in manufacturing. Instead, the reduction in the employment share in agriculture has been matched by a sizable increase in the share of the labor force engaged in services and a modest increase in manufacturing employment in low-income African countries. These stylized facts are robust to alternative data sources. In particular, data from the DHS are used to check our estimates of changes in employment shares; similar patterns were found.

These results are encouraging and point to reasons for the real income growth in many African countries south of the Sahara and for the poverty reduction documented by Sala-i-Martin and Pinkovskiy (2010), Young (2012), McKay (2013), and Page and Shimeles (2014). However, it is important to recognize that, unlike in East Asia, the employment share in manufacturing is not expanding rapidly in Africa. In East Asia’s economies, the rapid expansion of labor-intensive manufacturing for export accelerated structural change-led growth.
Although manufacturing has an important role to play in the economies of Africa, it seems unlikely that it will play the same role in Africa’s economies that it played in East Asia’s economies. This is not necessarily bad news; it simply highlights the importance of investing in things like human capital and infrastructure, which can raise productivity levels in all sectors of the economy.

References

Arndt, C., A. McKay, and F. Tarp. 2016. “Two Cheers for the African Growth Renaissance (but not Three).” In C. Arndt, A. McKay, and F. Tarp, eds. Growth and Poverty in Sub-Saharan Africa. Oxford: Oxford University Press, 11–40.
Chenery, H.B., and L. Taylor. 1968. “Development Patterns among Countries and Over Time.” Review of Economics and Statistics 50 (4): 966–1006.
Devarajan, S. 2013. “Africa’s Statistical Tragedy.” Review of Income and Wealth 59 (special issue): S9–S15.
Dercon, S., and D. Gollin. 2014. “Agriculture in African Development: A Review of Theories and Strategies.” Annual Review of Resource Economics 6: 471–92.
de Vries, G.J., M.P. Timmer, and K. de Vries. 2015. “Structural Transformation in Africa: Static Gains, Dynamic Losses.” The Journal of Development Studies 51 (6): 674–88.
Duarte, M., and D. Restuccia. 2010. “The Role of the Structural Transformation in Aggregate Productivity.” Quarterly Journal of Economics 125 (1): 129–73.
Easterly, W. 2001. The Elusive Quest for Growth: Economists’ Adventures and Misadventures in the Tropics. Cambridge: MIT Press.
The Economist. “Hopeless Africa.” May 11, 2000. http://www.economist.com/node/333429. Accessed on April 11, 2015.
International Monetary Fund. October 2016. World Economic Outlook. Washington: International Monetary Fund.
Gollin, D., D. Lagakos, and M.E. Waugh. 2014. “The Agricultural Productivity Gap.” Quarterly Journal of Economics 129 (2): 939–93.
Harttgen, K., S. Klasen, and S. Vollmer. 2013. “An African Growth Miracle? Or: What Do Asset Indices Tell Us about Trends in Economic Performance?” Review of Income and Wealth 59 (Special Issue): S37–S60.
Herrendorf, B., R. Rogerson, and A. Valentinyi. 2014. “Growth and Structural Transformation.” In P. Aghion and S. Durlauf, eds. Handbook of Economic Growth 2. Amsterdam: North-Holland, 855–941.
ICF International. 2016. The DHS Program: Demographic and Health Surveys. Download of datasets from: http://dhsprogram.com/data/. Accessed on April 11, 2016.
Jerven, M., and D. Johnston. 2015. “Statistical Tragedy in Africa? Evaluating the Data Base for African Economic Development.” Journal of Development Studies 51 (2): 111–5.
Lewis, A.W. 1955. The Theory of Economic Growth. London: Allen and Unwin.
Maddison, A. 2010. Statistics on World Population, GDP, and Per Capita GDP, 1-2008 AD. Groningen: University of Groningen. Download of datasets from http://www.ggdc.net/maddison/oriindex.htm. Accessed on June 11, 2016.
McCullough, E.B. 2015. Labor Productivity and Employment Gaps in Sub-Saharan Africa. World Bank Policy Research Working Paper 7234.
McKay, A. 2013. “Growth and Poverty Reduction in Africa in the Last Two Decades: Evidence from an AERC Growth-Poverty Project and Beyond.” Journal of African Economies 22 (suppl. 1): i49–i76.
McMillan, M., and D. Rodrik. 2011. “Globalization, Structural Change, and Productivity Growth.” In M. Bacchetta and M. Jansen, eds. Making Globalization Socially Sustainable. Geneva: International Labour Organization and World Trade Organization.
McMillan, M., D. Rodrik, and C. Sepúlveda, eds. Forthcoming. Structural Change, Fundamentals, and Growth: A Framework and Country Studies. Washington DC: International Food Policy Research Institute.
McMillan, M., D. Rodrik, and I. Verduzco-Gallo. 2014.
“Globalization, Structural Change, and Productivity Growth, with an Update on Africa.” World Development 63: 11–32.
Mundlak, Y., R. Butzer, and D.F. Larson. 2012. “Heterogeneous Technology and Panel Data: The Case of the Agricultural Production Function.” Journal of Development Economics 99 (1): 139–49.
National Bureau of Statistics, Ministry of Finance, and Office of Chief Government Statistician, Ministry of State, President’s Office, State House and Good Governance, Tanzania. 2014d. “Tanzania 2012 Census: Basic Demographic and Socio-Economic Profile,” Dar es Salaam and Zanzibar, Tanzania, April 2014.
Page, J., and A. Shimeles. 2014. Aid, Employment, and Poverty Reduction in Africa. WIDER Working Paper 2014/043. Helsinki: UN University World Institute for Development Economics Research.
Resnick, D., and J. Thurlow. Forthcoming. “The Political Economy of Zambia’s Recovery: Structural Change without Transformation?” In M. McMillan, D. Rodrik, and C. Sepúlveda, eds. Structural Change, Fundamentals, and Growth, Chapter 6. Washington DC: International Food Policy Research Institute.
Rodrik, D. 2016. “Premature Deindustrialization.” Journal of Economic Growth 21 (1): 1–33.
Sala-i-Martin, X., and M. Pinkovskiy. 2010. African Poverty Is Falling . . . Much Faster Than You Think! NBER Working Paper 15775. Cambridge: National Bureau of Economic Research.
Timmer, M.P., and G.J. de Vries. 2007. A Cross-Country Database for Sectoral Employment and Productivity in Asia and Latin America, 1950–2005. Groningen Growth and Development Centre Research Memorandum 98. Groningen, Netherlands: University of Groningen.
Timmer, M.P., and G.J. de Vries. 2009. “Structural Change and Growth Accelerations in Asia and Latin America: A New Sectoral Data Set.” Cliometrica 3 (2): 165–90.
Timmer, M.P., G.J. de Vries, and K. de Vries. 2015. “Patterns of Structural Change in Developing Countries.” In J. Weiss and M. Tribe, eds. Routledge Handbook of Industry and Development, 65–83.
World Bank. 2016.
World Development Indicators. Download of dataset from http://databank.worldbank.org/data/reports.aspx?source=world-development-indicators. Accessed on April 11, 2016.
Young, A. 2012. “The African Growth Miracle.” Journal of Political Economy 120: 696–739.

The World Bank Economic Review, 31(2), 2017, 434–458
doi: 10.1093/wber/lhv081
Advance Access Publication Date: February 18, 2016
Article

Does Child Sponsorship Pay off in Adulthood? An International Study of Impacts on Income and Wealth

Bruce Wydick, Paul Glewwe, and Laine Rutledge

Abstract

We estimate the impact of international child sponsorship on adult income and wealth of formerly sponsored children using data on 10,144 individuals in six countries. To identify causal effects, we utilize an age-eligibility rule followed from 1980 to 1992 that limited sponsorship to children twelve years old or younger when the program was introduced in a village, allowing comparisons of sponsored children with older siblings who were slightly too old to be sponsored. Estimations indicate that international child sponsorship increased monthly income by $13–17 over an untreated baseline of $75, principally from inducing higher future labor market participation. We find evidence for positive impacts on dwelling quality in adulthood and modest evidence of impacts on ownership of consumer durables in adulthood, limited to increased ownership of mobile phones. Finally, our results also show modest effects of child sponsorship on childbearing in adulthood.

JEL classification: O15, O22, J31, J13, J24, D03

I. Introduction

Millions of households in wealthy countries support nonprofit organizations whose aim is to alleviate poverty in the developing world.
Bruce Wydick is a professor in the Department of Economics at the University of San Francisco, 2130 Fulton Street, San Francisco, CA 94117-1080; his email address is wydick@usfca.edu; Paul Glewwe (corresponding author) is a professor in the Department of Applied Economics at the University of Minnesota, 1994 Buford Ave, St. Paul, MN 55108; his email address is pglewwe@umn.edu; Laine Rutledge is a doctoral student in the Department of Economics at the University of Washington, 319 Savery Hall, Seattle, WA 98195-3330; her email address is lainemr@u.washington.edu. We would like to thank Wess Stafford, Joel Vanderhart, Scott Todd, Alistair Sim, Herbert Turyatunga, Jose-Ernesto Mazariegos, Ester Battz, Noel Pabiona, Rowena Campos, Sofia Florance, Catherine Mbotela, Sam Wambugu, Boris Zegarra, Marcela Bakir, and other local Compassion staff and enumerators in Bolivia, Guatemala, India, Kenya, the Philippines, and Uganda for logistical help and support in carrying out our field research. Thanks to graduate students Joanna Chu, Ben Bottorff, Jennifer Meredith, Phillip Ross, and Herman Ramirez for outstanding work in the field. We also appreciate support and helpful comments from Christian Ahlin, Michael Anderson, Jesse Anttila-Hughes, Chris Barrett, Jere Behrman, Michael Carter, Alessandra Cassar, Pascaline Dupas, Giacomo De Giorgi, Alain de Janvry, Fred Finan, Pauline Grosjean, Phil Keefer, David Levine, Jeremy Magruder, Craig McIntosh, David McKenzie, Ted Miguel, Douglas Miller, Jeff Nugent, Jon Robinson, Elizabeth Sadoulet, John Strauss, and seminar participants at the University of California at Berkeley, Stanford University, the World Bank, the University of Southern California, the University of California at Davis, the Georgia Institute of Technology, the 2010 and 2011 Pacific Conferences for Development Economics, and the Institutions, Behavior, and the Escape from Persistent Poverty (IBEPP) conference at Cornell University. Finally, we thank the editor, Andrew Foster, and three anonymous referees for very helpful comments. We are grateful to BASIS/USAID for substantial funding for this project, to two generous South Korean donors, and to the University of San Francisco’s graduate program in International and Development Economics.

© The Author 2016. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

But only recently has a growing body of research in development economics begun to rigorously evaluate the impact of these programs on their intended beneficiaries.1 International child sponsorship is one of the most popular approaches taken by ordinary households in wealthy countries to help impoverished children overseas. We estimate that there are 9.14 million sponsored children in the world today, the vast majority of whom are sponsored by individuals and families in wealthy countries.2 Donors typically contribute $25–40 per month to sponsor a child.
In many cases child sponsorship organizations use these funds to provide school uniforms, tuition, nutritious meals, and programming that directly benefits sponsored children. Other types of sponsorship programs pool funds to invest in programming and infrastructure that benefit children in the community more broadly.3

For many individuals, child sponsorship represents their most direct contact with the poor in developing countries. Donors are drawn to child sponsorship because of the personalization of the relationship between sponsor and child. But whether child sponsorship actually benefits sponsored children has remained an open question. In Wydick, Glewwe, and Rutledge (2013), a companion paper to this research, we find international child sponsorship to have a statistically significant and positive effect on educational outcomes in all six survey countries (Bolivia, Guatemala, Kenya, Uganda, India, and the Philippines). Sponsorship during childhood increased the probability of secondary school completion by 12–18 percentage points over a 44.5% baseline, and increased completed years of schooling by 1.03–1.45 years. Sponsorship also increased adult white-collar employment by 6.5 percentage points over an 18.5% baseline, as well as the probability of being a community leader.

Previous research has studied the impacts of various programs on children’s persistence in school in developing countries. Examples include Drèze and Kingdon (2001) and Kremer and Vermeersch (2004), who find positive impacts of school meal programs on school attendance in India and Kenya, respectively. In a randomized trial, Evans, Kremer, and Ngatia (2008) find a nearly 40% reduction in absenteeism from the random provision of free school uniforms in Kenya, while Kremer, Miguel, and Thornton (2009) estimate that a merit scholarship program for girls boosted attendance by 5 percentage points. Aside from our research, the only other investigation of the impacts of international child sponsorship is Kremer, Moulin, and Namunyu (2003). In that paper the authors assess the impact of a Dutch child-sponsorship program, finding that even a relatively low-cost program focused on the provision of school uniforms and textbooks to each child caused sponsored children to advance a third of a grade farther in schooling completion.

In this paper we present results for the impacts of child sponsorship on the adult income4 and wealth of children sponsored through one of the leading international child sponsorship organizations. An understanding of these impacts is important for the millions of individuals in wealthy countries involved in international child sponsorship, individuals who are likely to view their contributions as an investment in these overseas children that yields tangible economic returns in the future, when they are adults. But it is also important for governments in countries implementing similar programs that work directly with impoverished children, helping them to understand whether direct investments in child development are financially sustainable by virtue of the positive impacts on the future incomes of beneficiaries. Thus we ask the question: Does international child sponsorship pay off for children in adulthood?

1 See for example Cristia et al. (2012) evaluating the One Laptop Per Child Program, Rawlins et al. (2014) on the nutritional impacts of dairy cows and meat goats donated via the Heifer Project, and the analysis of Blattman et al. (2014) on cash transfers.
2 We estimate this figure based on a comprehensive internet search across multiple languages for sponsorship programs. For details on how the 9.14 million figure was compiled, see Wydick, Glewwe, and Rutledge (2013).
3 Of the top ten child sponsorship organizations, a more direct child-centered approach is taken by Compassion International, ChildFund, Children International, Christian Foundation for Children and Aging, and Bornefonden. The community-centered approach is favored by World Vision, Plan International, Kindernothilfe, Save the Children, and SOS Children’s Villages.
4 Our study examines changes in labor income, which includes wages paid by an employer, income earned by an entrepreneur from a small enterprise, or income from farming. We do not study income earned from capital holdings, as these were deemed to be insignificant for the great majority of the households in our sample. In our study, the terms “labor income” and “wages” have a similar meaning but are not exactly the same; we use the term “wages” as conditional on working status and “labor income” in contexts that are unconditional on working status, e.g., labor income increases when one enters the workforce and begins to earn a wage.

II. Methodology

Fieldwork for our six-country study took place from 2008 to 2010. We obtained initial enrollment lists from village projects that were rolled out from 1980 to 1992 by Compassion International, the world’s third largest sponsorship organization, which currently sponsors over 1.3 million children in 26 countries. Compassion’s child sponsorship program is a very intensive intervention in the lives of impoverished children. Children typically begin sponsorship at age 4–6 and continue into their mid-teens.
Many sponsored children attend retreats together with program staff that focus on the nurture of their spiritual and moral values as well as their aspirations. Because children typically spend about 8–10 hours per week after school and on Saturdays participating in the program, and because the average duration of sponsorship is 9.3 years, sponsored children spend on average slightly more than four thousand hours participating in Compassion programming over the course of their childhood. Average years of participation by country are given in table 1.

Table 1. Years of Sponsorship in Program by Country

Country              Mean    Std. Dev.   Sample size
Uganda               11.02   3.53        188
Guatemala             6.64   2.55        357
Philippines           7.12   4.79        237
India                10.86   3.53        221
Kenya                10.13   3.43        543
Bolivia               9.48   3.78        288
All six countries     9.30   3.93        1,834

Note: weighted mean is presented.

Across the countries in which it is implemented, the Compassion program contains many similar elements. In each country Compassion uses funding to provide tuition fees for children, several nutritious meals per week, basic healthcare, school uniforms, and an after-school tutoring program. The tutoring program not only helps sponsored children with homework and gives them additional academic instruction, but also emphasizes spiritual and character formation and the development of schooling and vocational aspirations and self-esteem. Note that vocational training was not included in any of these six countries. The program has changed slightly since the time of the study and varies somewhat between countries. In the past, Compassion worked in tandem with local schools (which is true of our data from Guatemala and the Philippines) but more recently has operated through local church-affiliated tutoring centers. The children we study in India and Bolivia were involved in the program at a time when small cash transfers were given to parents of participating children; this was not the case in the other countries.
Other than these differences the program across countries is highly standardized.

Through the use of local enumerators, we were able to locate 93.5% of the families of these formerly sponsored children, who by the time of the survey were aged seventeen to forty-three. Our field personnel were unaffiliated with Compassion, in order to reduce bias in the responses of our subjects. We administered our survey first-hand to households of formerly sponsored children, a random sample of nonparticipating households in nineteen program villages, and a random sample of households in thirteen neighboring, nonprogram villages. The survey questionnaire was administered to family members (typically parents or adult siblings) present at the time of the survey, and data were obtained on all adult siblings in the household cohort, including the nonsponsored siblings of sponsored children. We also administered the survey to 50–75 randomly selected households with children in a similar age cohort that did not participate in the program in program villages, as well as 50–75 randomly selected households with children in a similar age cohort in nearby nonprogram villages. Overall, our data contain information on educational and vocational outcomes, monthly labor income, consumer good ownership, and dwelling quality on 1,860 formerly sponsored children, 3,704 of their unsponsored siblings, 2,136 individuals of a similar age from nonparticipating families in villages where the Compassion program operated, and 2,444 individuals from similar, nearby villages without the Compassion program.

There are several empirical challenges to estimating the program’s causal effects on future income and wealth. First, there may be nonrandom selection of households with eligible children into the program. Second, since a limited number of children per household were eligible for sponsorship (ranging from one in the African countries to three in the Latin American countries), intrahousehold selection of children for sponsorship may not be random. Third, there may be spillover effects from sponsored children onto their siblings or onto other children in the village, which complicates the estimation. Lastly, when estimating impacts on future labor income, it is useful to distinguish between impacts on employment and impacts on wages, conditional on employment.

To identify causal effects of child sponsorship, we use a program age-eligibility requirement, which stipulated that only children twelve and under could enter the program in any year, including the program’s first year in a village. Figure 1 shows the program’s strong adherence to this rule. Because a child’s age at the time of program rollout in his or her village is independent of adult life outcomes, except via its impact on program participation, we can use the age-eligibility rule as an instrumental variable that allows one to account (and test) for nonrandom intrahousehold selection of children for sponsorship. To address possible endogenous household selection into the program, we present estimates with household fixed effects, which control for unobserved differences in parenting behavior and household environments.5 Implicitly, this compares life outcomes of children who were age-eligible for sponsorship with their siblings who were too old for sponsorship when the program arrived in their village.

Figure 1. Sponsorship as a Function of Age When Program Started

5 We found that program staff usually selected households for participation, and then parents chose which children would be sponsored.

The regression estimates allow for the possibility of spillovers.
Dummy variables are included for:
(a) sponsored children, who were twelve or younger when the program started in their villages (denoted by \(T = 1\));
(b) program participants’ siblings who were twelve or younger when the program began in their villages and, while eligible, were not selected for sponsorship (denoted \(D_1^{\le 12} = 1\));
(c) program participants’ siblings who were 13–16 years of age when the program arrived in their villages and thus were ineligible (\(D_1^{13-16} = 1\));
(d) individuals in non-Compassion households in program villages who were twelve or younger at program introduction (\(D_2^{\le 12} = 1\));
(e) individuals in non-Compassion households in program villages age 13–16 at program introduction (\(D_2^{13-16} = 1\));
(f) individuals twelve or younger in non-Compassion villages when the program started in a neighboring village (\(D_3^{\le 12} = 1\)); and
(g) individuals 13–16 in non-Compassion villages when the program began in a neighboring village (\(D_3^{13-16} = 1\)).
Individuals seventeen or older in nonprogram villages are the omitted category. The household fixed-effects equation for child \(i\) in household \(j\) is

\[
y_{ij} = \alpha_1 D_{1ij}^{\le 12} + \alpha_2 D_{1ij}^{13-16} + \tau \left( D_{1ij}^{\le 12} \times T_i \right) + \beta_1 D_{2ij}^{\le 12} + \beta_2 D_{2ij}^{13-16} + \gamma_1 D_{3ij}^{\le 12} + \gamma_2 D_{3ij}^{13-16} + X_{ij}'\varphi + \theta_j + \varepsilon_{ij} \tag{1}
\]

where \(y_{ij}\) measures labor income or wealth, \(X_{ij}\) is a vector of controls that include age, gender, birth order, and oldest child, and \(\theta_j\) is a household fixed effect.
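As a rough illustration of how a household fixed effect such as the one in equation (1) is removed in estimation, the sketch below demeans the outcome and the regressors within each household before running OLS, so that only differences between siblings identify the coefficients. It is a minimal sketch, not the authors' code: the function name `within_household_ols`, the toy numbers, and the single-regressor setup are all hypothetical.

```python
import numpy as np


def within_household_ols(y, X, household_ids):
    """Estimate slopes by OLS after demeaning y and X within each household.

    Demeaning removes any household-constant term (the fixed effect in
    equation (1)), so identification comes only from sibling differences.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    # Map each household label to an integer group index 0..G-1.
    _, group = np.unique(np.asarray(household_ids), return_inverse=True)
    group = group.ravel()
    counts = np.bincount(group)

    def demean(v):
        # Subtract the household mean from every member of the household.
        means = np.bincount(group, weights=v) / counts
        return v - means[group]

    y_dm = demean(y)
    X_dm = np.column_stack([demean(X[:, k]) for k in range(X.shape[1])])
    coef, *_ = np.linalg.lstsq(X_dm, y_dm, rcond=None)
    return coef


# Hypothetical toy data: income = 2 * treated + a household-specific level.
treated = np.array([[0.0], [1.0], [0.0], [1.0]])
income = np.array([10.0, 12.0, 100.0, 102.0])
household = np.array(["a", "a", "b", "b"])
tau_hat = within_household_ols(income, treated, household)[0]
print(round(tau_hat, 6))
```

In this toy example the two households differ by 90 in their income level, yet the within-household estimate recovers the slope of 2, because the household level is differenced out.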
Assuming that spillovers (a) occur only within villages and (b) affect age-eligible, but not age-ineligible, siblings of sponsored children, then \([\alpha_1 - \alpha_2] - [\gamma_1 - \gamma_2]\) captures spillovers from sponsored children onto nonsponsored siblings age twelve and younger and \([\beta_1 - \beta_2] - [\gamma_1 - \gamma_2]\) captures spillovers onto age-eligible children in non-Compassion households in program villages.6

To address possibly endogenous selection of age-eligible children within families we use instrumental variable estimation. The oldest age-eligible sibling was sponsored most often, followed by the second-oldest age-eligible sibling, third oldest, etc., so the instruments are interaction terms between three age-at-program-rollout categories (4 years and under, 5–8, 9–12) and dummy variables for oldest age-eligible sibling, second-oldest age-eligible sibling, and younger age-eligible siblings, yielding a vector of nine instruments.7 In the first stage, the probability of sponsorship, \(\hat{T}_i\), is estimated using these instruments and the vector of controls; \(\hat{T}_i\) replaces the treatment variable (\(T_i\)) in (1) in a second-stage regression.

Estimating the impact of sponsorship on monthly wages involves another challenge: wages are unobserved for the 61% in the sample who were not working. This suggests the use of Heckman (1979) estimation for the wage impact regressions, which uses a probit employment equation to generate an Inverse Mills ratio for each observation in a second-stage wage regression. Given certain assumptions, this removes bias from the censored wage variable (the dependent variable in the second equation). Using this approach allows us to decompose overall labor income impacts of child sponsorship into the impact from formerly sponsored children obtaining employment and the impact on wages conditional on employment.
These two effects are seen by differentiating the expected average wage, \(E(w)\), where

\[
E(w) = \Phi(z'\gamma) \cdot w(x'\beta), \tag{2}
\]

and \(\Phi(z'\gamma)\) is the probability that an individual works and earns a wage, based on characteristics \(z\), and \(w(x'\beta)\) is the individual’s wage, conditional on working, based on characteristics \(x\). To estimate \(\beta\) without assuming arbitrary functional forms, \(z\) should have one or more variables that are excluded from \(x\); we use the individual’s number of children, which strongly affects the probability of employment but should have relatively little effect on wages.8 Both \(x\) and \(z\) include the sponsorship (treatment) variable, \(T\). Differentiating (2) with respect to \(T\), and setting variables to their means, gives

\[
\frac{\partial E(w)}{\partial T} = \frac{\partial \Phi(z'\gamma)}{\partial T} \cdot w(\bar{x}'\beta) + \frac{\partial w}{\partial T} \cdot \Phi(\bar{z}'\gamma). \tag{3}
\]

The first term gives the impact of sponsorship on income from its employment effect; the second is the impact of sponsorship on wages, conditional on employment.

6 Spillovers onto nonsponsored siblings may reflect extra income available from sponsorship, role model effects, and parental reallocation of assistance to nonsponsored children.
7 An identical set of instruments was used in Wydick et al. (2013).

Both terms in (3) are obtained using Heckman’s method; \(\Phi(z'\gamma)\) is estimated using a probit specification, and \(w(x'\beta)\) is essentially equation (1). To calculate the standard errors of the employment effect (the first term in (3)), a bootstrapping procedure is used; estimates of \(\partial \Phi(z'\gamma) / \partial T\) and of the average wage are obtained from a random draw (with replacement) from the sample. These two estimates are multiplied for each bootstrap iteration and (household-level clustered) standard errors are obtained from five hundred bootstrapped replications.
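The decomposition in equation (3) and a household-clustered bootstrap of the kind described above can be sketched as follows. This is an illustrative sketch under strong simplifying assumptions, not the authors' procedure: the function names, the toy data, and the round input numbers are hypothetical, and the employment effect in the bootstrap example is approximated by a raw difference in employment rates rather than by a probit marginal effect.

```python
import numpy as np


def decompose_income_effect(d_employment, mean_wage, d_wage, employment_prob):
    """Return the two terms of equation (3): employment and wage channels."""
    employment_channel = d_employment * mean_wage
    wage_channel = d_wage * employment_prob
    return employment_channel, wage_channel


def cluster_bootstrap_se(estimator, data, household_ids, n_reps=500, seed=0):
    """Standard error of estimator(data) from resampling whole households."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    ids = np.asarray(household_ids)
    unique_ids = np.unique(ids)
    rows_by_household = {h: np.flatnonzero(ids == h) for h in unique_ids}
    draws = []
    for _ in range(n_reps):
        # Draw households with replacement, keeping all members together.
        sampled = rng.choice(unique_ids, size=unique_ids.size, replace=True)
        rows = np.concatenate([rows_by_household[h] for h in sampled])
        draws.append(estimator(data[rows]))
    return float(np.std(draws, ddof=1))


def employment_channel_estimate(d):
    """Simplified first term of (3): employment gap times mean wage of workers.

    Columns of d (hypothetical layout): treated, working, wage.
    """
    treated = d[:, 0] == 1
    employment_gap = d[treated, 1].mean() - d[~treated, 1].mean()
    mean_wage = d[d[:, 1] == 1, 2].mean()
    return employment_gap * mean_wage


# Hypothetical round numbers, chosen only to show the arithmetic.
emp_term, wage_term = decompose_income_effect(0.08, 200.0, 6.0, 0.5)

# Hypothetical toy sample: four households, one treated sibling in each.
toy = np.array([
    [1, 1, 120.0], [0, 0, 0.0],
    [1, 1, 90.0], [0, 1, 80.0],
    [1, 0, 0.0], [0, 1, 70.0],
    [1, 1, 110.0], [0, 0, 0.0],
])
toy_households = [0, 0, 1, 1, 2, 2, 3, 3]
se = cluster_bootstrap_se(employment_channel_estimate, toy, toy_households)
print(emp_term, wage_term, se)
```

Resampling households rather than individuals keeps siblings together in each replication, which is what clustering the standard errors at the household level amounts to in a bootstrap.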
Similarly, the impact of sponsorship on wages (the second term in (3)) is the product of the estimate of \(\partial w / \partial T\) and mean labor market participation; for each bootstrapped sample (with replacement), the entire estimation procedure, which combines estimates of the probit equation with those of the wage equation, is implemented, and five hundred bootstrap replications are used to obtain standard errors.

For all 10,144 individuals in the study, interviewers attempted to obtain current labor income, which in our use of the term covers fixed wages paid by an employer, itinerant wages, estimated monthly income from a small business, or income from farming; all of this we refer to as wages. We did not collect data on nonlabor income (such as returns on assets), which were infrequently realized among the low-income households in our sample. For 83% of those who were reported to be working for any wage at all, they or their family members reported the wages of the individual. For the remaining 17% no one could provide a wage figure, but for nearly all of these individuals, family members knew their completed schooling and current occupation. Using data on education, occupation, gender, and age (and country fixed effects), labor income was imputed for all individuals in the sample. Two estimates of labor income impacts from sponsorship were thus implemented; one drops the 17% of the sample without wage data, and the other imputes labor income values to all individuals in the sample, including those with observed wages.9 A hybrid in which we impute labor income only for missing observations yields estimates very similar to the latter estimates. Assuming that any imputation errors are independent of the explanatory variables in equation (1), estimates using imputed values are consistent and unbiased (Wooldridge 2010, 77).
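One way to see why this independence assumption matters is the short sketch below. The notation is introduced here only for illustration and is not in the original text: \(\hat{y}_{ij}\) is imputed labor income, \(e_{ij}\) is the imputation error, and \(W_{ij}'\delta\) is shorthand for all regressors and coefficients on the right-hand side of equation (1) other than the household effect and the error term.

```latex
\begin{align*}
\hat{y}_{ij} &= y_{ij} + e_{ij}, \\
\hat{y}_{ij} &= W_{ij}'\delta + \theta_j + (\varepsilon_{ij} + e_{ij}), \\
E[\varepsilon_{ij} + e_{ij} \mid W_{ij}, \theta_j]
  &= E[\varepsilon_{ij} \mid W_{ij}, \theta_j] + E[e_{ij} \mid W_{ij}, \theta_j] = 0 .
\end{align*}
```

The last line holds if the original error is mean-zero given the regressors and the imputation error is unrelated to them (a nonzero average imputation error would be absorbed by the household effect). The imputation error then only adds to the composite error term and leaves the coefficients in \(\delta\) unaffected in expectation; if the imputation error were correlated with sponsorship status, this argument would not go through.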
To carry out estimations by country and gender we use the imputed labor income, which yields slightly lower (yet more precise) impact estimates than do our directly reported wage data.

To examine the impact of child sponsorship on adult wealth, we examine two broad categories: indicators of current dwelling quality and current ownership of common consumer durable goods and land. The dwelling quality measures include the presence of an indoor toilet, electricity, walls constructed of sturdy materials (e.g., wood or concrete rather than mud or sticks), high quality roofs (constructed from tile, concrete, or high-quality wood, rather than thatch, leaves, or low-quality corrugated iron), and high quality floors (concrete, wood, or tile, rather than dirt floors or floors made from other natural materials). For the second wealth proxy, information was collected on ownership of mobile phones, bicycles, motorcycles, automobiles, and land.

To address issues of over-testing and joint-testing of related hypotheses, two types of indices were created. The first simply weights each of the five variables within a category equally; OLS and GMM IV estimations are then carried out on these simple indices. Second, for each of these two categories of variables we created an Anderson (2008) summary index. This index is created by de-meaning each of the dependent variables in the respective category \(j\) (\(j \in \{\text{dwelling}, \text{consumer goods}\}\)), then weighting each observation by the sum of its row entries across the inverted variance-covariance matrix of the dependent variables in the group. Specifically, each observation \(i\) in group \(j\) receives a weight (index score) of \(s_{ij} = (\mathbf{1}'\Sigma^{-1}\mathbf{1})^{-1}(\mathbf{1}'\Sigma^{-1}y_{ij})\), where \(\mathbf{1}\) is an \(m \times 1\) column vector of 1’s, \(\Sigma^{-1}\) is the \(m \times m\) inverted covariance matrix, and \(y_{ij}\) is the \(m \times 1\) vector of outcomes for individual \(i\). Relative to the simple index, the Anderson Index gives more weight to dependent variables within the grouping that are least correlated with other variables and hence embody the greatest degree of unique information.

8 It is possible that there are unobserved characteristics that reduce individuals’ wages and also influence their fertility (e.g., tastes for children, or lower labor productivity, which reduces the opportunity cost of raising children). This could cause the error term in the wage equation to be correlated with the number of children, invalidating number of children as an identifying variable in the selection equation. This seems unlikely for men, but it may occur for women. To check the robustness of our results, we tried two alternative approaches. First, we used “electrified household” as the excluded variable. Such households are more likely to be near labor markets so that individuals in them are more likely to be employed, but it is unlikely that this variable directly affects wages. Second, we used no exclusion restriction and thus relied on the assumed normality of the probit error term to identify our selection term. These two approaches yielded very similar results and thus suggest that our findings are robust to different identifying assumptions.
9 The few individuals whose imputed wages are less than or equal to zero are assigned nonworking status.

III. Results

Table 2 provides summary statistics for the data. Monthly labor income is $16.67 higher among those who were sponsored as children (p < 0.01).
This mainly reflects a higher employment rate (54.5% versus 47.9%) among formerly sponsored individuals (p < 0.01). This is evident in figure 2: conditional on positive labor income, (log) income is only slightly higher among the formerly sponsored, but formerly sponsored individuals show many more positive labor income observations.[10] Figure 3 illustrates the program's impact in a discontinuity diagram; nonparametric estimation shows that labor income is somewhat higher for individuals in untreated (relative to treated) households among those over age twelve when the program began. However, among those twelve or younger when the program rolled out, income is clearly higher in treated households.

[10] To show the density of incomes equal to zero, we specify log income as log(income + 1).

Table 2. Summary Statistics

| Variable | Full sample | Sponsored | Unsponsored | Difference | p-value |
| --- | --- | --- | --- | --- | --- |
| Age | 29.82 | 26.51 | 30.56 | 4.05 | 0.000 |
| Sex | 0.504 | 0.481 | 0.509 | −0.028 | 0.013 |
| Birth order | 3.041 | 3.006 | 3.049 | −0.043 | 0.389 |
| Mother's educ. (years) | 4.85 | 4.94 | 4.83 | 0.11 | 0.298 |
| Uganda | 0.080 | 0.102 | 0.075 | 0.027 | 0.001 |
| Guatemala | 0.167 | 0.192 | 0.162 | 0.030 | 0.002 |
| Philippines | 0.141 | 0.129 | 0.143 | −0.014 | 0.122 |
| India | 0.159 | 0.119 | 0.168 | −0.049 | 0.000 |
| Kenya | 0.304 | 0.296 | 0.306 | −0.010 | 0.418 |
| Bolivia | 0.145 | 0.158 | 0.142 | 0.016 | 0.089 |
| Working = 1 | 0.491 | 0.545 | 0.479 | 0.066 | 0.000 |
| Monthly income ($US) | 77.96 | 91.53 | 74.86 | 16.67 | 0.008 |
| Monthly income ($US), working = 1 | 198.13 | 194.25 | 199.24 | −4.99 | 0.649 |
| Monthly income (imputed, $US) | 90.63 | 104.13 | 87.62 | 16.51 | 0.000 |
| Monthly income (imputed, $US), working = 1 | 170.25 | 177.48 | 168.43 | 9.05 | 0.009 |
| Housing quality index (simple) | 2.81 | 2.88 | 2.80 | 0.08 | 0.014 |
| Consumer good index (simple) | 1.26 | 1.25 | 1.27 | −0.02 | 0.378 |
| Sample size | 10,011 | 1,819 | 8,192 | | |

Figure 2. Differences in Log Income, Sponsored vs. Unsponsored

Figure 3. Monthly Income as a Function of Eligibility

Impact on Income

The impacts of child sponsorship on adult labor income by current age are best seen visually. Figure 4 presents nonparametric estimations of the labor market income trajectories of sponsored (upper line) and unsponsored (lower line) individuals; the impact of sponsorship appears to increase over time (bandwidth = 1, Epanechnikov kernel). While differences in income are small in the twenties and early thirties, they grow substantially from the mid-thirties to the mid-forties, beyond which our data no longer contain observations of older formerly sponsored individuals.

Figure 4. Growth in Labor Income Gap Over Time, Sponsored vs. Nonsponsored

Heckman estimates of income impacts are provided in table 3 in columns (1) through (12). The first row of table 3 presents the marginal impacts of child sponsorship on the probability of working. Column (1) gives estimates without household fixed effects and omits missing income observations, column (2) adds household fixed effects, column (3) uses household fixed effects with imputed labor income (and thus includes observations with missing income data), and columns (4), (5), and (6) provide bootstrapped IV-Heckman estimates that are analogous to the estimates in columns (1), (2), and (3). These estimates are from instrumental variable (IV) estimations in which we regress treatment on our vector of instruments and controls in the first stage and then carry out the Heckman estimation in the second stage, bootstrapping clustered standard errors at the household level for the entire process with 500 replications. The first three columns yield estimated marginal effects of 0.096, 0.079, and 0.068 for the first-stage (probit) estimations. The second three columns yield estimates of 0.116, 0.191, and 0.186 of the marginal effect of the program on employment. All of the former are significant at p < 0.01 and the latter at p < 0.05 and p < 0.10. While the IV estimates are higher than the estimates in columns (1)–(3), Hausman tests cannot reject the null hypothesis that the standard probit estimate is consistent (the lowest p-value is 0.136).

The second row of table 3 provides estimates of \(\frac{\partial \Phi(z'\gamma)}{\partial T} \cdot \bar{w}(x'\beta)\) in (3), the increased income from sponsorship via greater employment. These impacts range from $12.81 per month in column (3) to $38.00 in column (5). Because we cannot reject the consistency of the standard probit estimates (Hausman test t = 1.49), we emphasize the average impact in columns (1) to (3), which is $15.23.

The third row provides second-stage (Heckman) estimates of \(\partial w / \partial T\), the impact of sponsorship on wages conditional on employment.[11] Only the $6.06 estimate in column (3) is significant (p < 0.05); although two of the three IV-Heckman estimates are much larger, they are very imprecise. Thus, it is only over the whole sample (including observations with missing income data) that there is evidence that sponsorship raises incomes conditional upon employment, and when the income impacts in the third row are combined with the probability of employment in the fourth row, \(\frac{\partial w}{\partial T} \cdot \bar{\Phi}(z'\gamma)\), all estimates are insignificant except for the $2.64 estimate (p < 0.10) in column (3), although again two of the three IV point estimates are considerably larger. Overall, these estimates are consistent with the density functions in figure 2: the impact of child sponsorship on income operates primarily via increased employment, rather than via increased wages among those already employed.

[11] The log of labor income is typically used in such regressions, but for decomposition of income effects between the Heckman equation of new entrants into the labor force and the marginal income increases conditional on employment, it is more convenient to use levels rather than logs. We checked our results using log income, and they are similar, but this specification becomes very cumbersome analytically and yields little benefit.

Table 3. Impact on Monthly Labor Income: Heckman Estimates

| | (1) Heckman, missing omitted, no FE | (2) Heckman, missing omitted, HH FE | (3) Heckman, obs. imputed, HH FE | (4) IV-Heckman, missing omitted, no FE | (5) IV-Heckman, missing omitted, HH FE | (6) IV-Heckman, obs. imputed, HH FE |
| --- | --- | --- | --- | --- | --- | --- |
| Heckman selection, \(\partial \Phi(z'\gamma) / \partial T\) | 0.096*** (0.017) | 0.079*** (0.0155) | 0.068*** (0.014) | 0.116* (0.066) | 0.191** (0.090) | 0.186** (0.080) |
| Selection impact on income, \(\frac{\partial \Phi(z'\gamma)}{\partial T} \cdot \bar{w}(x'\beta)\) | $17.25*** (3.12) | $15.63*** (3.08) | $12.81*** (2.71) | $23.65* (13.22) | $38.00** (17.76) | $35.31** (15.33) |
| Marginal wage impact given w > 0, \(\partial w / \partial T\) | −$1.17 (9.85) | −$5.39 (10.79) | $6.06** (3.10) | $40.16 (50.05) | $48.99 (50.79) | $10.23 (15.21) |
| Marginal wage impact on income, \(\frac{\partial w}{\partial T} \cdot \bar{\Phi}(z'\gamma)\) | −$0.46 (4.02) | −$2.12 (4.25) | $2.64* (1.38) | $15.80 (23.84) | $19.28 (19.76) | $4.46 (6.69) |
| Lambda | −18.85*** (3.94) | −56.12 (41.19) | −10.55*** (3.60) | −19.37** (8.03) | −57.80 (52.76) | −10.31 (7.50) |
| Observations | 8,389 | 8,389 | 10,004 | 8,389 | 8,389 | 10,004 |
| Mean w, untreated | $74.86 (196.07) | $74.86 (196.07) | $74.86 (196.07) | $74.86 (196.07) | $74.86 (196.07) | $74.86 (196.07) |
| Mean w given w > 0, untreated | $199.24 (278.47) | $199.24 (278.47) | $199.24 (278.47) | $199.24 (278.47) | $199.24 (278.47) | $199.24 (278.47) |

By country: selection impact on income, \(\frac{\partial \Phi(z'\gamma)}{\partial T} \cdot \bar{w}(x'\beta)\). Each column follows column (3), OLS, HH FE.

| | (7) Uganda | (8) Guatemala | (9) Philippines | (10) India | (11) Kenya | (12) Bolivia |
| --- | --- | --- | --- | --- | --- | --- |
| Sponsored | $7.19 (7.82) | $27.63** (8.35) | $17.01* (9.54) | $37.61*** (6.47) | $1.61 (3.57) | $8.19 (6.09) |
| Observations | 809 | 1,680 | 1,407 | 1,599 | 3,051 | 1,458 |
| Mean w, untreated | $36.90 (144.30) | $56.65 (104.66) | $115.33 (253.02) | $131.96 (167.22) | $31.22 (89.62) | $57.73 (120.46) |
| Mean w given w > 0, untreated | $154.99 (264.39) | $193.34 (104.68) | $301.78 (333.85) | $198.28 (169.90) | $111.62 (140.65) | $165.36 (154.36) |

Notes: Heckman estimations include controls for age, gender, sibling order, and oldest sibling. Selection on Impact multiplies marginal effect of first-stage tobit by E[w | w > 0] for sample ($178.07). First-stage F-test for instrumental variable estimation yields F = 225.5 (p < 0.001). Hausman test for exogeneity of treatment (child sponsorship) fails to reject null that child sponsorship is exogenous (p = 0.1363). Bootstrapped clustered standard errors of parameter estimates, and standard deviations of means, are shown in parentheses. *** p < 0.01, ** p < 0.05, * p < 0.1.

Columns (7) through (12) in table 3 show impacts on income due to the increased probability of employment from child sponsorship, replicating the estimate in column (3) for each country. Impacts are highest in India ($37.61, p < 0.01), Guatemala ($27.63, p < 0.05), and the Philippines ($17.01, p < 0.10).
Although estimates are positive in every country, they are lower and statistically insignificant in Bolivia ($8.19), Uganda ($7.19), and Kenya ($1.61).[12] Joint tests for differences across continents indicate significantly lower impacts in Africa than in Asia and Latin America, likely due to comparatively low economic opportunity in these two African countries. This is true even though educational impacts in our companion paper were found to be much stronger in the African countries; this could reflect low returns to education in rural areas of Africa.

[12] Other estimation models yield similar estimates. For example, Ordinary Least Squares (OLS) estimates for the full sample, including those who are not working, and over a variety of specifications, yield significant (p < 0.01) estimates ranging from $16.60 to $19.05. While these OLS estimates combine both employment selection and marginal wage effects, they may be biased because they omit the Inverse Mills ratio, which is included in the Heckman estimation as a right-hand-side regressor. Tobit estimates, which are not preferred because they assume the impacts of the explanatory variables to be the same for the selection and marginal wage effects, yield significant (p < 0.01) estimates between $12.53 and $24.51.

In table 4 we disaggregate our Heckman estimations by gender and find that the monthly labor income effects from the increased probability of employment are nearly identical in the (first-stage) standard probit estimations ($12.60 for men, $12.75 for women, both p < 0.01). (These correspond to our estimates in column (3) of table 3.) IV estimates for both are larger but insignificant. There is a positive impact on wages conditional on employment for men ($6.74, p < 0.01), but this effect is zero for women. IV estimates of the marginal effect on men's wages are very large, $33.15 and $53.08, but imprecisely measured. All effects for the impact on the marginal wage for women are small and insignificant. Thus sponsorship yields an increase in girls' future labor income of $12.75, resulting solely from greater labor market participation. But the total impact from sponsorship on boys' future labor income is $19.34: $12.60 from higher labor market participation and $6.74 from higher wages conditional upon labor market participation.

Table 4. Impact on Monthly Labor Income by Gender: Heckman Estimates

| | (1) Heckman, all observations, HH FE, men | (2) IV-Heckman, missing omitted, no FE, men | (3) IV-Heckman, missing omitted, HH FE, men | (4) Heckman, all observations, HH FE, women | (5) IV-Heckman, missing omitted, no FE, women | (6) IV-Heckman, missing omitted, HH FE, women |
| --- | --- | --- | --- | --- | --- | --- |
| Heckman selection, \(\partial \Phi(z'\gamma) / \partial T\) | 0.063*** (0.023) | 0.104 (0.134) | 0.258 (0.205) | 0.073*** (0.022) | 0.093 (0.071) | 0.138 (0.106) |
| Selection impact on income, \(\frac{\partial \Phi(z'\gamma)}{\partial T} \cdot \bar{w}(x'\beta)\) | $12.60*** (4.44) | $22.03 (27.60) | $51.78 (41.95) | $12.75*** (3.90) | $18.07 (13.50) | $26.50 (20.01) |
| Marginal wage impact given w > 0, \(\partial w / \partial T\) | $12.90*** (4.14) | $69.44 (83.70) | $111.18 (87.69) | $0.43 (4.62) | −$3.77 (72.58) | $20.97 (54.89) |
| Marginal wage impact on income, \(\frac{\partial w}{\partial T} \cdot \bar{\Phi}(z'\gamma)\) | $6.74*** (2.21) | $33.16 (41.36) | $53.08 (41.57) | $0.15 (1.65) | −$1.16 (21.35) | $6.48 (16.72) |
| Lambda | −5.41*** (1.71) | −13.48*** (4.95) | −109.50 (24.79) | −35.9*** (6.19) | −27.05 (50.32) | −42.37 (101.1) |
| Observations | 5,048 | 4,197 | 4,197 | 4,956 | 4,192 | 4,956 |
| Mean w, untreated | $100.70 (185.69) | $96.98 (195.92) | $96.98 (195.92) | $47.54 (114.82) | $58.83 (183.47) | $58.83 (183.47) |
| Mean w given w > 0, untreated | $201.89 (220.71) | $203.12 (242.57) | $203.12 (242.57) | $175.02 (162.01) | $190.29 (289.65) | $190.29 (289.65) |

Notes: Heckman estimations include controls for age, gender, sibling order, and oldest sibling. Selection on Impact multiplies marginal effect of first-stage tobit by E[w | w > 0] for sample by gender. First-stage F-tests for instrumental variable estimation yield F = 110 and 116 for men and women, respectively (p < 0.001). Bootstrapped clustered standard errors of parameter estimates, and standard deviations of means, are shown in parentheses. *** p < 0.01, ** p < 0.05, * p < 0.1.

We test for income spillovers onto unsponsored siblings and other children of eligible age within program villages using a joint test of the significance of the linear combinations \([a_1 - a_2] - [c_1 - c_2]\) and \([b_1 - b_2] - [c_1 - c_2]\) from (1) but find no evidence of either positive or negative spillover effects in either case (p = 0.987 and 0.195, respectively). Thus we conclude that the benefits of international sponsorship on adult income appear to be limited to the sponsored child.

Figure 5 shows that the impacts on income appear to be smallest for children of both the least educated and the most educated mothers. The largest difference in income between sponsored and nonsponsored children occurs among children of mothers with a primary school education, which is perhaps enough education to offer complementary support to the sponsorship program but not so much that the counterfactual levels of education for children would be high even without sponsorship.

Figure 5. Impact on Labor Income between Sponsored and Nonsponsored by Mother's Education (Bandwidth = 0.5)

Because most of our impact on income is on the extensive margin (labor market participation) rather than the intensive margin (higher wages), one possibility is that child sponsorship simply encouraged individuals on the margin of labor market participation to move into the labor market. Perhaps the sponsorship program merely raised aspirations for labor market activity rather than genuinely increasing the returns to labor market participation. In this case, the income gains we show here might be only slightly higher than the opportunity cost from work that generates no income, such as raising children or subsistence farming.

One way to test whether the income gains from child sponsorship truly increased income or simply substituted income for nonwage opportunity costs of similar value is to estimate the increase in labor market value of sponsorship via its impact on greater schooling completion. In this sense we estimate the individual terms of

\[ \frac{\partial E(w)}{\partial T} = \left( \frac{\partial \Phi(z'\gamma)}{\partial s} \cdot \bar{w}(x'\beta) + \frac{\partial w}{\partial s} \cdot \bar{\Phi}(z'\gamma) \right) \cdot \frac{\partial s}{\partial T}, \qquad (4) \]

where \(s\) is years of schooling (more precisely, highest grade attained). Equation (4) is similar to equation (3) except that it measures program impacts on expected wages via the impact of the program on years of completed schooling.

Schooling exhibits significant impacts on labor market participation and the number of children an individual has as an adult. Table 5 shows that each additional year of schooling increases labor market participation by 2.3 percentage points (3.2 for women and 1.4 for men). Each additional year of schooling also reduces the number of children, as shown in the negative binomial estimations in columns (4)–(6), with coefficients of −0.094 for women and −0.054 for men.

Table 5. Schooling Impacts on Labor Market Participation and Number of Children

| | (1) All | (2) Men | (3) Women | (4) All | (5) Men | (6) Women |
| --- | --- | --- | --- | --- | --- | --- |
| Outcome | Labor market participation | Labor market participation | Labor market participation | Number of children | Number of children | Number of children |
| Estimator | OLS, household fixed effects | OLS, household fixed effects | OLS, household fixed effects | Negative binomial regression | Negative binomial regression | Negative binomial regression |
| Years of schooling | 0.023*** (0.002) | 0.014*** (0.004) | 0.032*** (0.003) | −0.074*** (0.005) | −0.054*** (0.007) | −0.094*** (0.007) |
| Observations | 8,348 | 4,180 | 4,168 | 9,955 | 5,025 | 4,930 |
| R-squared | 0.045 | 0.034 | 0.032 | 0.303 | 0.298 | 0.303 |

Notes: Regressions include controls for age, gender, and sibling order. OLS estimations incorporate household fixed effects, but negative binomial regressions do not incorporate household fixed effects due to integer constraints. Alphas are significant at p < 0.01 in all negative binomial estimations. Clustered standard errors at the household level are in parentheses. *** p < 0.01, ** p < 0.05, * p < 0.1.

To check whether the program truly increased the financial returns to labor market participation via its impact on schooling, we present results from a three-stage procedure in table 6. In the first stage, we estimate the impact of child sponsorship on schooling (essentially replicating the results in our companion paper) through both OLS and GMM IV estimation. The second stage estimates a Heckman wage equation on the nontreated individuals in our sample as a function of total years of education and covariates, in which we obtain an estimate of the monthly income gains from an added year of schooling on the nontreated. In the third stage, we multiply the impact coefficients from the first and second stages by the expected monthly wage, that is, (sponsorship program impact on schooling) × (schooling impact on labor market participation) × (mean monthly labor market wage), to obtain the mean impact of child sponsorship on wages via its impact on schooling and the added value that additional years of schooling yield in the labor market, bootstrapping the entire process with five hundred replications.
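The third-stage product described above can be illustrated with a short Python sketch. It is only an approximation under stated assumptions: the inputs are rounded point estimates copied from tables 3, 5, and 6, the participation effect is the linear-probability coefficient from table 5 rather than the probit marginal effect used in the estimation, the choice of the untreated working-sample mean as the wage term is a guess, and the function name `impact_via_schooling` is hypothetical. Because the paper estimates the stages jointly, the product lands near, but not exactly on, the corresponding entry in table 6.

```python
def impact_via_schooling(schooling_gain, effect_per_year, scale):
    """Third-stage product: (program impact on schooling)
    x (schooling impact on the outcome, per year) x (scaling term)."""
    return schooling_gain * effect_per_year * scale


# Assumed inputs: rounded values copied from the tables, not re-estimated.
SCHOOLING_GAIN = 1.11           # years of schooling from sponsorship (table 6, column 1)
PARTICIPATION_PER_YEAR = 0.023  # participation gain per year of schooling (table 5, column 1)
MEAN_WAGE_WORKING = 199.24      # mean monthly wage, untreated and working (table 3)

participation_channel = impact_via_schooling(
    SCHOOLING_GAIN, PARTICIPATION_PER_YEAR, MEAN_WAGE_WORKING
)

# Table 6 reports the two channels separately for columns (1)-(6);
# its bottom row is the sum of the two channels.
via_participation = [5.03, 6.56, 3.54, 6.21, 6.01, 3.83]
via_marginal_wage = [3.01, 3.92, 2.36, 4.11, 8.37, 5.33]
total_via_schooling = [
    round(p + w, 2) for p, w in zip(via_participation, via_marginal_wage)
]

print(round(participation_channel, 2))
print(total_via_schooling)
```

The first printed value is about 5.09, close to the $5.03 reported in column (1) of table 6; the second line reproduces the bottom row of that table from its two component rows.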
These estimates of the impact of the sponsorship program via the effects of education on labor market participation are given in the middle set of estimates in table 6, and they show an increase in monthly wages for men between $3.54 and $6.56; the estimates for women lie between $3.83 and $6.21.

The third row of estimations repeats this exercise for the income impact of the program via the higher wages that the added schooling from the program yields conditional upon employment, that is, (sponsorship program impact on schooling) × (schooling impact on marginal wage, conditional on employment) × (probability of employment). Estimates here range from $2.36 to $3.92 for men and from $4.11 to $8.37 for women. The bottom row of the table shows the total income impacts that accrue to sponsored children simply via added education, $8.04 to $10.48 per month overall, a little lower for men and a little higher for women, thus representing perhaps two-thirds of the impact of the program. Notably, however, these estimates indicate a higher impact for women on the marginal wage conditional on employment than is actually realized in the program. This may be because young women who finish the program show high rates of entry into the labor market but choose to undertake relatively low-paid positions commensurate with the program's focus on human service rather than on higher adult incomes, a message which may have taken root more deeply among formerly sponsored women than among formerly sponsored men.

Table 6. Estimated Program Impact on Income via Labor Market Effects from Added Schooling

| Variables | (1) OLS fixed effects | (2) IV fixed effects | (3) Men: OLS fixed effects | (4) Men: IV fixed effects | (5) Women: OLS fixed effects | (6) Women: IV fixed effects |
| --- | --- | --- | --- | --- | --- | --- |
| Dep. variable: years of schooling [a] | | | | | | |
| Sponsored | 1.11*** (1.21) | 1.45*** (0.406) | 1.14*** (1.38) | 2.00*** (5.16) | 1.09*** (0.145) | 0.695 (0.487) |
| R-squared | 0.063 | 0.047 | 0.056 | 0.042 | 0.076 | 0.072 |
| Income impact of program via schooling effect on labor market participation [b] | | | | | | |
| Specification | OLS × probit × avg. wage, fixed effects | IV × probit × avg. wage, fixed effects | Men: OLS × probit × avg. wage, FE | Men: IV × probit × avg. wage, FE | Women: OLS × probit × avg. wage, FE | Women: IV × probit × avg. wage, FE |
| Estimated impact | $5.03*** (0.743) | $6.56** (2.83) | $3.54*** (0.902) | $6.21* (3.36) | $6.01*** (1.09) | $3.83 (3.38) |
| Income impact of program via schooling effect on labor market wage [c] | | | | | | |
| Specification | OLS × Heckman × prob(emp.), fixed effects | IV × Heckman × prob(emp.), fixed effects | Men: OLS × Heckman × prob(emp.), fixed effects | Men: IV × Heckman × prob(emp.), fixed effects | Women: OLS × Heckman × prob(emp.), fixed effects | Women: IV × Heckman × prob(emp.), fixed effects |
| Years of schooling | $3.01* (1.75) | $3.92** (2.13) | $2.36* (1.44) | $4.11 (2.80) | $8.37* (4.62) | $5.33 (5.08) |
| Heckman's lambda | −5.62** (2.44) | −5.62** (2.44) | 11.66 (7.55) | 11.66 (7.55) | −0.90*** (0.24) | −0.90*** (0.24) |
| Observations | 8,348 | 8,348 | 4,180 | 4,180 | 4,168 | 4,168 |
| Sponsorship impact via schooling (earnings from employment + earnings from marginal wage) | $8.04 | $10.48 | $5.90 | $10.32 | $14.38 | $9.16 |

[a] OLS and instrumental variables estimates include controls for age, gender, sibling order, oldest sibling, and household-level fixed effects. First-stage F-tests for instrumental variable estimation yield F = 225.5 (all), 111.0 (men), and 116.7 (women), respectively (p < 0.001).
[b] Reported coefficients stem from product of (sponsorship program impact on schooling) × (schooling impact on labor market participation) × (mean labor market wage). Joint estimates include controls for age, gender, sibling order, oldest sibling, and household-level fixed effects. Coefficients are estimated jointly; bootstrapped standard errors clustered at the household level (500 replications).
[c] Reported coefficients stem from product of (sponsorship program impact on schooling) × (schooling impact on marginal wage, conditional on employment) × (probability of employment). Joint estimates include controls for age, gender, sibling order, oldest sibling, and household-level fixed effects. Coefficients are estimated jointly; bootstrapped standard errors clustered at the household level (500 replications).
*** p < 0.01, ** p < 0.05, * p < 0.1.

In many instances the decision to enter the labor force is commensurate with other demographic decisions regarding, for example, marriage and childbearing. Figure 6 shows kernel densities of the number of children in the families of formerly sponsored and unsponsored individuals, now adults; the diagram indicates smaller families in adulthood among the formerly sponsored. Although figure 7 does not appear to show substantial differences in marriage rates over age by sponsorship status for either gender, table 7 presents modest evidence that sponsorship may slightly reduce the probability of marriage. Whereas in other estimations spillover effects to siblings were found to be insignificant and so were omitted from our regression tables, here we find evidence of significant spillovers and therefore present results that account for spillover effects onto (younger) siblings. We find that sponsorship causes a roughly 3 percentage point reduction in the probability of marriage at the time of the survey (p < 0.10), with effects reasonably uniform across age groups, perhaps having a slightly larger impact on marriage from age seventeen to twenty-one, although this is measured with noise.

Figure 6. Kernel Density of Number of Children by Sponsorship Status

Figure 7. Probability of Marriage by Sponsorship, Sex, and Age

Table 7. Sponsorship Impacts on Marriage and Number of Children

| Variables | (1) LP model, HH FE | (2) LP w/ FE, interact. | (3) GMM-IV FE, w/ interact. | (4) Neg. binomial | (5) Neg. binomial, w/ interact. | (6) Neg-Bin-IV, w/ interact. |
| --- | --- | --- | --- | --- | --- | --- |
| Outcome | Prob. married | Prob. married | Prob. married | Number of children | Number of children | Number of children |
| OLS, household FE: | | | | | | |
| Sponsored | −0.029* (0.016) | −0.038 (0.094) | −1.431 (1.085) | −0.018 (0.038) | 0.157 (0.230) | 1.074** (0.422) |
| Sponsored & age 17–21 | | −0.042 (0.045) | −0.032 (0.045) | | | |
| Sponsored & age 22–27 | | 0.001 (0.022) | −0.012 (0.023) | | −0.077 (0.065) | 0.580*** (0.199) |
| Sponsored & age 28–33 | | −0.012 (0.024) | −0.030 (0.027) | | −0.010 (0.047) | 0.562*** (0.175) |
| Sponsored & age 34–39 | | −0.021 (0.041) | −0.077 (0.058) | | 0.005 (0.085) | 0.292 (0.191) |
| Sponsored & age 40–45 | | | | | −0.256* (0.145) | −0.265 (0.307) |
| Intrahousehold spillovers, (a1 − a2) − (c1 − c2) | −0.041 (0.039) | −0.029 (0.039) | −0.001 (0.043) | −0.152** (0.069) | −0.152*** (0.065) | −0.388*** (0.105) |
| Program impact with household spillovers, s + (a1 − a2) − (c1 − c2) | −0.071* (0.039) | −0.067 (0.101) | 1.430 (1.608) | −0.170*** (0.068) | 0.005 (0.240) | 0.685* (0.410) |
| s + (a1 − a2) − (c1 − c2), age 17–21 | | −0.071 (0.059) | −0.031 (0.064) | | | |
| s + (a1 − a2) − (c1 − c2), age 22–27 | | −0.027 (0.044) | −0.011 (0.047) | | −0.229*** (0.087) | 0.191 (0.157) |
| s + (a1 − a2) − (c1 − c2), age 28–33 | | −0.041 (0.042) | −0.029 (0.043) | | −0.162** (0.074) | 0.138 (0.119) |
| s + (a1 − a2) − (c1 − c2), age 34–39 | | −0.050 (0.056) | −0.077 (0.061) | | −0.146 (0.101) | −0.095 (0.143) |
| s + (a1 − a2) − (c1 − c2), age 40–45 | | | | | −0.407*** (0.156) | −0.653*** (0.264) |
| Observations | 10,001 | 10,001 | 10,001 | 10,004 | 10,004 | 10,004 |
| R-squared | 0.108 | 0.144 | 0.125 | | | |

Notes: Regressions include controls for age, gender, and sibling order. OLS and IV estimations incorporate household fixed effects. Negative binomial regressions display coefficients (not marginal effects) and omit household fixed effects, with alpha significant at the 1% level in all regressions (rejecting null of Poisson distribution). Impacts on "Sponsored & Agegroup" are the sum of coefficients on Sponsored added to the coefficient of Sponsored x Agegroup and joint tests of these two coefficients. GMM-IV estimations instrument for schooling using sponsorship program. First-stage instrumental variable estimations yield F = 297.12. Clustered standard errors at the household level in parentheses. *** p < 0.01, ** p < 0.05, * p < 0.10.

In the negative binomial estimations in columns (4) to (6) of table 7 we see sponsorship yielding a reduction in family size. This table presents negative binomial estimation coefficients, but these magnitudes translate to marginal effects (accounting for spillovers) of 0.25 fewer children overall and 0.64 fewer children for formerly sponsored individuals aged 40–45 at the time of the survey, with lower point estimates for the younger adults. The impacts on childbearing are likely greater on older individuals simply because birth rates were much higher when these individuals were sponsored back in the early 1980s, and so the impact of greater labor force entry has a stronger effect on birth rates. This can be seen in figure 8, for example, where birth rates are only slightly lower for the formerly sponsored at younger ages but fall considerably among the older cohort. We do not present impacts by gender; they are virtually identical for men and women.

Figure 8. Number of Children in Adulthood by Age

The increase in income from child sponsorship occurs through different career choices of formerly sponsored men and women.
Table 8 presents multinomial logit estimates of the impacts of sponsorship on vocational trajectories, separately by gender. Sponsorship of boys leads them in adulthood into two main types of jobs: K–12 teachers and employees in lower-skill technology jobs, such as work in call centers, with some evidence of an increase in blue-collar employment. Formerly sponsored men are roughly 64% more likely to be teachers in adulthood relative to the counterfactual and are about 44% more likely to have a semiskilled technology job or work in a call center. Sponsorship of girls makes them 50% less likely to be involved in agriculture as adults, 55% more likely to have a clerical job, 60% more likely to work in finance or for a large corporation, 148% more likely to have a semiskilled technology job (starting from a very low baseline), and 93% more likely to be a nurse or health professional.

Wealth Impacts

Finally, we consider the impact of child sponsorship on indicators of wealth in adulthood. Results show that individuals who were sponsored as children live in better houses as adults. Both the simple index and the Anderson index indicate significant impacts of sponsorship on adult dwelling quality. Specifically, OLS (linear probability model) household fixed-effect estimates in table 9 indicate that sponsorship increases the probability that a home has electricity by 2.9 percentage points, raises the probability of having improved walls by 2.5 percentage points, and raises the probability of having improved floors by 1.9 percentage points. GMM-IV estimates are often smaller and all insignificant for specific improvements, but larger and significant for both dwelling indices.

Does child sponsorship increase consumer durable ownership in adulthood? The only asset with a statistically significant effect is the mobile phone: sponsorship raises the probability of ownership by 5.4 percentage points in the OLS estimate and 18.3 percentage points in the IV estimate (baseline of 76.8%).
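The relative effects quoted in this section follow from dividing a marginal effect by the corresponding untreated baseline share. The Python sketch below redoes that arithmetic with the rounded values printed in table 8 and with the mobile phone estimates above. The helper name `relative_change` is hypothetical, and reading the 76.8% baseline as the untreated ownership rate is an assumption. Small gaps relative to the percentages in the text are expected because the inputs are rounded.

```python
def relative_change(marginal_effect, baseline_share):
    """Percent change relative to the untreated baseline share."""
    return 100.0 * marginal_effect / baseline_share


# (marginal effect, baseline untreated share), rounded values from table 8.
table8 = {
    ("men", "Teaching"): (0.0274, 0.043),
    ("men", "Semi-skill tech, call centers"): (0.012, 0.027),
    ("women", "Agriculture"): (-0.021, 0.042),
    ("women", "Clerical, sales"): (0.025, 0.045),
    ("women", "Finance and large business"): (0.012, 0.020),
    ("women", "Semi-skill tech, call centers"): (0.011, 0.0074),
    ("women", "Nursing, health, hospital"): (0.014, 0.015),
}

relative_effects = {
    key: relative_change(effect, baseline)
    for key, (effect, baseline) in table8.items()
}

# Mobile phone ownership: percentage-point effects on a 76.8% baseline.
phone_ols = relative_change(5.4, 76.8)
phone_iv = relative_change(18.3, 76.8)

for (gender, job), pct in relative_effects.items():
    print(f"{gender:6s} {job:32s} {pct:7.1f}%")
print(f"mobile phone, OLS: {phone_ols:.1f}%  IV: {phone_iv:.1f}%")
```

With these inputs the teaching effect for men comes to roughly 64% and the agriculture effect for women to −50%, in line with the figures in the text. The same ratio puts the OLS mobile phone effect at about 7% of baseline ownership.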
We find no evidence that sponsorship increased ownership of bicycles, motorcycles, vehicles, or land; the IV coef- ficients on both consumer good summary indices are insignificant. Tests for household and village level spillovers (not shown) find no significant effects on aggregated dwelling indicators (p ¼ 0.237 and p ¼ 0.523, respectively) or consumer durables (p ¼ 0.333 and p ¼ 0.536). Although our research on educational impacts provides evidence for spillovers onto younger sib- lings, particularly in secondary school completion (Wydick, Glewwe, and Rutledge 2013), we find no ev- idence of income or wealth spillovers in our data. Tables 10 and 11 disaggregate wealth impact estimations by gender. Not surprisingly, impacts on dwelling quality appear to be higher for formerly sponsored men than for women. OLS estimates for Table 8. Impacts on Adult Vocation by Gender: Marginal Effects, Multinomial Logit Estimations Men Occupational Category MN Logit coefficients Marginal effects Baseline untreated Occupational category MN logit coefficients Marginal effects Baseline untreated 1 Agriculture 0.042 0.0068 0.048 8 Small business 0.053 À0.0052 0.045 (0.23) (0.011) (0.25) (0.0105) 2 Construction, day labor 0.497** 0.012 0.046 9 Ministry, pastoral 0.477 0.0026 0.0043 (0.256) (0.0089) (0.484) (0.0041) 3 Clerical, sales 0.290 0.0050 0.047 10 Finance and large business 0.329 0.0045 0.027 (0.249) (0.009) (0.287) (0.0078) 4 Blue collar 0.349** 0.019 0.104 11 Police, army, security, fire 0.471 0.0056 0.022 The World Bank Economic Review (0.166) (0.015) (0.350) (0.0064) 5 Personal services 0.390* 0.012 0.063 12 Professional, doctor, lawyer 0.059 À0.0025 0.024 (0.216) (0.011) (0.331) (0.0072) 6 Teaching 0.920*** 0.0274*** 0.043 13 Semi-skill tech, call centers 0.618** 0.012* 0.027 (0.233) (0.0079) (0.288) (0.0071) 7 Government 0.831 0.0044 0.0078 14 Nursing, health, hospital À0.500 À0.0026 0.0043 (0.506) (0.0033) (0.83) (0.0032) Women Occupational category MN logit coefficients 
Marginal effects Baseline untreated Occupational category MN logit coefficients Marginal effects Baseline untreated 1 Agriculture À0.393 À0.021** 0.042 8 Small business À0.234 À0.012 0.028 (0.26) (0.010) (0.277) (0.0091) 2 Construction, day labor 1.180* 0.0036* 0.0049 9 Ministry, pastoral 0.361 0.0004 0.0012 (0.657) (0.0021) (1.12) (0.0018) 3 Clerical, sales 0.637*** 0.025*** 0.045 10 Finance and large business 0.721** 0.012** 0.020 (0.210) (0.0094) (0.31) (0.0060) 4 Blue collar 0.530** 0.0127* 0.027 11 Police, army, security, fire À0.617 À0.0027 0.0041 (0.265) (0.0076) (0.844) (0.0031) 5 Personal services 0.578 0.013* 0.025 12 Professional, doctor, lawyer À0.760 À0.0018 0.0098 (0.268) (0.0073) (0.488) (0.0046) 6 Teaching 0.371 0.018 0.063 13 Semi-skill tech, call centers 1.199*** 0.011*** 0.0074 (0.181) (0.011) (0.387) (0.0037) 7 Government 0.671 0.0022 0.0045 14 Nursing, health, hospital 0.955*** 0.014*** 0.015 (0.641) (0.0025) (0.33) (0.0051) Notes: Estimations include fixed effects at the household level. Marginal effects, dy/dx, are from corresponding multinomial logit estimations; control variables are gender, age, age2, birth order, and oldest child. Number of ob- servations ¼ 4,956. Psuedo R2 ¼ 0.0165, Chi-squared p < 0.0001. 451 Downloaded from https://academic.oup.com/wber/article-abstract/31/2/434/2897305 by LEGVP Law Library user on 08 August 2019 452 Table 9. 
Impact on Adult Wealth Variables Dwelling Quality (1) (2) (3) (4) (5) (6) (7) Indoor toilet Electricity in home Improved walls Improved roof Improved floor Simple dwelling index Anderson dwelling index OLS, household FE: Sponsored 0.009 0.029*** 0.025** 0.004 0.019** 0.082*** 0.034* (0.006) (0.007) (0.012) (0.006) (0.009) (0.017) (0.020) Observations 9,477 9,490 7,863 8,554 8,614 10,004 10,004 R-squared 0.006 0.008 0.011 0.004 0.012 0.013 0.009 GMM-IV, household FE: Sponsored À0.017 0.041 0.006 À0.001 0.070 0.232** 0.192* (0.028) (0.035) (0.049) (0.030) (0.049) (0.093) (0.106) Observations 9,477 9,490 7,863 8,554 8,614 10,004 10,004 Variables Consumer Durables (8) (9) (10) (11) (12) (13) (14) Mobile phone Owns bike Owns motorcycle Owns car Owns land Simple consumer index Anderson consumer index OLS, household FE: Sponsored 0.054*** 0.015 0.010 À0.000 0.003 0.089*** 0.003 (0.012) (0.010) (0.009) (0.007) (0.009) (0.025) (0.029) Observations 9,884 9,856 9,906 9,880 9,444 10,004 10,004 R-squared 0.047 0.044 0.036 0.023 0.047 0.097 0.047 GMM-IV, household FE: Sponsored 0.183*** 0.004 0.019 0.000 À0.006 À0.004 0.024 (0.055) (0.052) (0.040) (0.036) (0.046) (0.128) (0.145) Observations 9,883 9,856 9,906 9,880 9,444 10,004 10,004 R-squared 0.085 0.060 0.106 0.029 0.063 0.149 0.077 Notes: Regressions include controls for age, gender, and sibling order. OLS and IV estimations incorporate household fixed effects.). First-stage F-test for instrumental variable estimation yields F ¼ 95.82 (p < 0.001). Clustered standard errors at the household level in parentheses. *** p < 0.01, ** p < 0.05, * p < 0.1. Whenever one of the components is missing for the particular house characteristic, we replace it with the mean for the index variables. This was done due to concern that creating an index only for individuals that had none of the constituent categories missing would lead to a nonrepresentative sample. 
The “improved walls” variable, which is mildly significant, has the most missing observations. Dropping that variable slightly lowers the estimated impact of the index variables.

Table 10. Impact on Adult Wealth, Formerly Sponsored Men

Dwelling Quality
                                OLS, household FE                      GMM-IV, household FE
Outcome                         Sponsored          Obs.     R-squared  Sponsored          Obs.
(1) Indoor toilet               0.021** (0.009)    4,829    0.009      -0.043 (0.039)     4,829
(2) Electricity in home         0.030** (0.013)    4,833    0.008      0.079 (0.053)      4,833
(3) Improved walls              0.019 (0.018)      4,007    0.011      -0.076 (0.063)     4,007
(4) Improved roof               0.014* (0.008)     4,377    0.010      -0.066 (0.048)     4,377
(5) Improved floor              0.045*** (0.015)   4,413    0.017      0.111 (0.084)      4,413
(6) Simple dwelling index       0.114*** (0.029)   5,048    0.024      0.132 (0.133)      5,048
(7) Anderson dwelling index     0.081*** (0.031)   5,048    0.013      0.047 (0.149)      5,048

Consumer Durables
                                OLS, household FE                      GMM-IV, household FE
Outcome                         Sponsored          Obs.     R-squared  Sponsored          Obs.
(8) Mobile phone                0.073*** (0.021)   4,986    0.039      -0.048 (0.098)     4,985
(9) Owns bike                   0.015 (0.020)      4,971    0.008      -0.005 (0.090)     4,971
(10) Owns motorcycle            0.012 (0.014)      4,993    0.018      -0.029 (0.062)     4,993
(11) Owns car                   -0.000 (0.012)     4,979    0.025      0.031 (0.060)      4,979
(12) Owns land                  -0.010 (0.015)     4,806    0.066      -0.068 (0.069)     4,806
(13) Simple consumer index      0.094** (0.043)    5,048    0.067      -0.154 (0.202)     5,048
(14) Anderson consumer index    -0.018 (0.050)     5,048    0.038      -0.007 (0.231)     5,048

Notes: Regressions include controls for age, gender, and sibling order. OLS and IV estimations incorporate household fixed effects. First-stage F-test for instrumental variable estimation yields F = 76.48 (p < 0.001). Clustered standard errors at the household level in parentheses. *** p < 0.01, ** p < 0.05, * p < 0.1.
Table 11. Impact on Adult Wealth, Formerly Sponsored Women

Dwelling Quality
                                OLS, household FE                      GMM-IV, household FE
Outcome                         Sponsored          Obs.     R-squared  Sponsored          Obs.
(1) Indoor toilet               0.008 (0.009)      4,648    0.010      -0.020 (0.025)     4,648
(2) Electricity in home         0.029** (0.013)    4,657    0.007      -0.007 (0.028)     4,657
(3) Improved walls              0.026 (0.019)      3,856    0.010      0.034 (0.049)      3,856
(4) Improved roof               -0.001 (0.009)     4,177    0.014      0.011 (0.028)      4,177
(5) Improved floor              0.009 (0.016)      4,201    0.014      0.004 (0.047)      4,201
(6) Simple dwelling index       0.052* (0.030)     4,956    0.006      0.122 (0.091)      4,956
(7) Anderson dwelling index     -0.002 (0.030)     4,956    0.010      0.166 (0.107)      4,956

Consumer Durables
                                OLS, household FE                      GMM-IV, household FE
Outcome                         Sponsored          Obs.     R-squared  Sponsored          Obs.
(8) Mobile phone                0.058*** (0.018)   4,898    0.040      0.103 (0.065)      4,898
(9) Owns bike                   0.012 (0.012)      4,885    0.005      0.002 (0.044)      4,885
(10) Owns motorcycle            0.018 (0.012)      4,913    0.005      0.050 (0.042)      4,913
(11) Owns car                   0.003 (0.010)      4,901    0.025      -0.011 (0.035)     4,901
(12) Owns land                  0.006 (0.013)      4,638    0.057      0.021 (0.048)      4,638
(13) Simple consumer index      0.107*** (0.035)   4,956    0.053      0.143 (0.129)      4,956
(14) Anderson consumer index    0.029 (0.041)      4,956    0.027      0.031 (0.145)      4,956

Notes: Regressions include controls for age, gender, and sibling order. OLS and IV estimations incorporate household fixed effects. First-stage F-test for instrumental variable estimation yields F = 78.88 (p < 0.001). Clustered standard errors at the household level in parentheses. *** p < 0.01, ** p < 0.05, * p < 0.1.
men are positive on every dwelling category (indoor toilet, electrification of the household, improved walls, improved roof, and improved floor) and statistically significant for every category except improved walls. Both the simple dwelling index and the Anderson dwelling index are also strongly significant. However, IV estimates for men are all insignificant. OLS estimates for women indicate that sponsorship appears to affect only the probability of living in a home with electricity, and even this is not significant for the IV estimates. These differences in effects by gender presumably reflect the larger overall impact of sponsorship on the incomes of formerly sponsored men, and they may also reflect that a husband’s income has a stronger influence on the type of dwelling in adulthood.

Impacts on consumer durables are virtually identical between formerly sponsored men and women: estimations indicate a strong and significant impact on cell phone ownership for both genders but no effect on ownership of a bicycle, motorcycle, car, or land.

Another measure of wealth in adulthood relates to where an individual resides as an adult. Does a grown adult remain in her parents’ home, in a rented home, or in an owned home? Typically, living in a home apart from parents is desirable after a certain age, especially for married couples, but this is not always economically feasible. Baseline values among the untreated show 46.8% living in the parents’ home, 23.8% living in a rented home, and 29.4% living in a home owned by the individual or jointly owned with a spouse.
Table 12 shows multinomial logit estimations indicating that formerly sponsored individuals are much less likely to remain living in their parents’ home (the base category). There is some indication that women are more likely to live in an owned home, but marginal effects are insignificant. Instead of living in their parents’ home as adults at the time of the survey, sponsored individuals are 4.8 percentage points more likely to live in a home they rent themselves, an effect about two percentage points higher for men than for women.

Table 12. Impacts on Home Residence: Multinomial Logit Estimations (Base Category: Living in Parent’s Home)

Sample and outcome                         Multinomial logit coeff.  Marginal effect    Observations  Pseudo R-squared
(1) All individuals, live in rented home   0.362*** (0.085)          0.048*** (0.013)   8,365         0.170
(2) All individuals, live in owned home    0.161* (0.088)            0.00 (0.13)
(3) Men, live in rented home               0.382*** (0.118)          0.057*** (0.019)   4,288         0.166
(4) Men, live in owned home                0.089 (0.128)             -0.011 (0.019)
(5) Women, live in rented home             0.318*** (0.121)          0.035** (0.017)    4,077         0.181
(6) Women, live in owned home              0.211* (0.121)            0.011 (0.018)

Notes: Coefficients and marginal effects are for the Sponsored variable, with standard errors in parentheses. Observations and pseudo R-squared are reported once per sample. Multinomial logit estimations include controls for age, gender, oldest child, sibling order, number of siblings, mother’s education, father’s education, and country fixed effects. Data on residence not obtained in Uganda. Baseline values among untreated: 46.8% living in parent’s home, 23.8% living in rented home, 29.4% living in an owned home. *** p < 0.01, ** p < 0.05, * p < 0.10.

IV. Conclusion

International child sponsorship is a leading form of individual contact and financial assistance between ordinary people in developed countries and the poor in developing countries, yet little has been known about the impact of these programs on the economic outcomes in adulthood of sponsored children.
Our more conservative Heckman estimates from a six-country study of 10,144 individuals show that child sponsorship is responsible for increases in monthly income of about $13–17 over an unconditional baseline of $75, or an increase of 17.3–22.9%. This effect of child sponsorship on future labor income is due principally to sponsored children entering the labor market as adults who would not have done so otherwise, particularly for women. We find that men realize an added $6.74 of monthly income from higher wages conditional upon employment but that estimations on the impact on wages of women cannot reject a null hypothesis of zero.13

Given that the cost of sponsorship to sponsors was $28 per month during the time in the 1980s and 1990s when the individuals in our study were sponsored and that the average length of sponsorship was 9.3 years, a monthly income increase of $13–17 over an average lifetime of work implies a modest financial rate of return to child sponsorship of 3.7–5.0%.

Our estimations find significant impacts of child sponsorship on proxies for adult wealth: sponsored children, especially males, are more likely as adults to live in better housing (homes with electricity and with roof and floor made of superior construction materials). Formerly sponsored men are more likely to live in homes with indoor plumbing. Impacts on adult consumer good ownership, however, are more modest and appear to be limited to substantially greater ownership of mobile phones among both formerly sponsored men and women. We also present (modest) evidence suggesting that sponsored children have fewer children in adulthood, along with much stronger evidence that both formerly sponsored men and women are less likely to live with their parents in adulthood.

Figure 9. Plausible Chain of Causal Effects from Sponsorship

13 As pointed out by a referee, income increases due to increased labor market participation do not account for the implicit cost of reduced leisure time, and since at the margin those two uses of time should be approximately equal, the net benefit of child sponsorship via increased labor market participation may be close to zero. However, this ignores the fact that sponsorship can have a variety of non-economic effects that make leisure time more valuable, such as improved health and overall greater psychological well-being. While we cannot estimate those effects with our data, they are likely to be positive, and so even if all of the increased income is due solely to increased labor force participation, the net benefit of the program is unlikely to be zero. In addition, some of our estimates in tables 3, 4, and 6 find positive impacts on wages from sponsorship, and these can serve as a lower bound of the “net” impact of sponsorship on income.

What about child sponsorship, in particular Compassion’s approach to child sponsorship, could be responsible for these significant effects on income and wealth in adulthood? In related research using a separate sample of currently sponsored children, we explore the hypothesis that child sponsorship may improve adult incomes not merely through relieving external constraints that improve schooling access, nutrition, and health, but also through addressing internal constraints related to imparting a greater level of hopefulness about the future and instilling greater aspirations for schooling and adult vocation.
Using data on currently sponsored children, we find in Glewwe, Ross, and Wydick (2016) a causal link between child sponsorship and elevated educational and vocational aspirations among children in Kenya, and higher levels of happiness, self-efficacy, and hopefulness based on a quantitative analysis of children’s self-portrait drawings in Indonesia. Although it is not yet possible to definitively identify these increased aspirations as a causal channel to the positive impact from sponsorship on income and wealth we find in this study, what is clear from our three pieces of research on child sponsorship is that child sponsorship increases aspirations and that child sponsorship also improves adult economic outcomes. We present a diagram in figure 9 of what appears to us to be the causal channel for the effects we observe from child sponsorship on income and wealth in adulthood.

Most conditional and unconditional cash transfer programs, and many, if not most, educational interventions, do not seek to directly address internal constraints of children, which are also related to the fostering of noncognitive (socio-emotional) skills. Our findings on the impacts of child sponsorship raise the possibility that this may constitute a missed opportunity. Taken together, our results suggest that development programs that relieve tangible external constraints, while simultaneously addressing the internal constraints faced by the poor, may realize stronger impacts than programs that address external constraints alone, thus providing a basis for experimenting with new programs that embody these joint characteristics and for important future research.

References

Anderson, M. 2008. “Multiple Inference and Gender Differences in the Effects of Early Intervention: A Reevaluation of the Abecedarian, Perry Preschool, and Early Training Projects.” Journal of the American Statistical Association 103 (484): 1481–95.
Blattman, C., N. Fiala, and S. Martinez. 2014.
“Generating Skilled Self-Employment in Developing Countries: Experimental Evidence from Uganda.” Quarterly Journal of Economics 129 (2): 697–752.
Cristia, J., P. Ibarrarán, S. Cueto, A. Santiago, and E. Severin. 2012. “Technology and Child Development: Evidence from the One Laptop per Child Program.” IZA working paper No. 6104, Bonn, Germany.
Drèze, J., and G. Kingdon. 2001. “School Participation in Rural India.” Review of Development Economics 5 (1): 1–24.
Evans, D., M. Kremer, and M. Ngatia. 2008. “The Impact of Distributing School Uniforms on Children’s Education in Kenya.” World Bank Working Paper, Washington, DC.
Glewwe, P., P. Ross, and B. Wydick. 2016. “Developing Hope among Impoverished Children: Using Child Self-Portraits to Measure Poverty Program Impacts.” Working Paper, University of Minnesota and University of San Francisco.
Heckman, J. 1979. “Sample Selection Bias as a Specification Error.” Econometrica 47 (1): 153–60.
Kremer, M., E. Miguel, and R. Thornton. 2009. “Incentives to Learn.” Review of Economics and Statistics 91 (3): 437–56.
Kremer, M., and C. Vermeersch. 2004. “School Meals, Educational Attainment, and School Competition: Evidence from a Randomized Evaluation.” Policy Research Paper 3523. World Bank, Policy Research Department, Washington, DC.
Kremer, M., S. Moulin, and R. Namunyu. 2003. “Decentralization: A Cautionary Tale.” Working paper no. 10, Poverty Action Lab, Cambridge, MA.
Rawlins, R., S. Pimkina, C. Barrett, S. Pederson, and B. Wydick. 2014. “Got Milk? The Impact of Heifer International’s Livestock Donation Programs in Rwanda.” Food Policy 44 (2): 202–13.
Wooldridge, J. 2010. Econometric Analysis of Cross Section and Panel Data. Cambridge, MA: MIT Press.
Wydick, B., P. Glewwe, and L. Rutledge. 2013. “Does Child Sponsorship Work?
A Six-Country Study of Impacts on Adult Life Outcomes.” Journal of Political Economy 121 (2): 393–436.

The World Bank Economic Review, 31(2), 2017, 459–482
doi: 10.1093/wber/lhv061
Advance Access Publication Date: November 18, 2015
Article
Downloaded from https://academic.oup.com/wber/article-abstract/31/2/459/2897303 by International Monetary Fund user on 08 August 2019

Political Connections and Tariff Evasion: Evidence from Tunisia

Bob Rijkers, Leila Baghdadi, and Gael Raballand

Abstract

Are politically connected firms more likely to evade taxes? This paper presents evidence suggesting firms owned by President Ben Ali and his family were more prone to evade import tariffs. During Ben Ali’s reign, evasion gaps, defined as the difference between the value of exports to Tunisia reported by partner countries and the value of imports reported at Tunisian customs, were correlated with the import share of connected firms. This association was especially strong for goods subject to high tariffs, and driven by underreporting of unit prices, which diminished after the revolution. Consistent with these product-level patterns, unit prices reported by connected firms were lower than those reported by other firms and declined faster with tariffs than those of other firms. Moreover, privatization to the Ben Ali family was associated with a reduction in reported unit prices, whereas privatization per se was not.

JEL classification: D73, F13, H26

Bob Rijkers (corresponding author) is an economist in the Development Economics Research Group at the World Bank; his email address is brijkers@worldbank.org. Leila Baghdadi is an associate professor at the Tunis Business School, University of Tunis, Tunisia, where she holds a WTO chair; her email address is leila.baghdadi@tbs.rnu.tn. Gael Raballand is senior public sector specialist at the World Bank; his email address is graballand@worldbank.org. Disclaimer: The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development / World Bank and its affiliated organizations or those of the executive directors of the World Bank or the governments they represent. The paper benefited from KCP funding for the project “Job Creation, Structural Change, and Economic Development in MENA with Lessons from East Asia” (TF 014655, project code P127907). This work was also sponsored by the Economic Research Forum (ERF) and has benefitted from both their financial and intellectual support. The contents and recommendations do not necessarily reflect ERF’s views. We would like to thank Caroline Duclos for exceptional research assistance and Rebekka Grun, Melise Jaud, and especially Antonio Nucifora for help obtaining access to the data. We also benefitted from useful comments from Lotfi Ayadi, Varanya Chaubey, Caroline Freund, Bernard Hoekman, Phil Keefer, Beata Javorcik, Chahir Zaki, and seminar participants at the World Bank, the Economic Research Forum’s Annual Conference, and the Centre for the Study of African Economies, Oxford University. We are very grateful to the Tunisian Institut National de la Statistique for generously hosting us for several months, helping assemble the database, and providing incisive feedback. A supplemental appendix to this article is available at https://academic.oup.com/wber.

© The Author 2015. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Are politically connected entrepreneurs more likely to evade tariffs? At issue are not only inequity and fiscal losses but also inefficiency, since tariff evasion endows perpetrators with a cost advantage over compliant firms that is not based on performance. The question is especially relevant for developing countries, as they tend to be disproportionately reliant on revenues collected by customs to finance public expenditures (Jean and Mitaritonna 2010),1 are characterized by a greater prevalence of problematic state-business relationships (Faccio 2006), and often have weaker tax collection capacity (Slemrod and Yithzaki 1999).

Using a unique dataset that identifies importing firms owned by former president Ben Ali and his family and confiscated in the aftermath of the Jasmin Revolution, this paper examines whether politically connected enterprises were more likely to evade tariffs during Ben Ali’s tenure. Their behavior is compared with that of public firms and other private sector firms, which fall into two types: “onshore” firms and so-called “offshore” firms, which are exempted from having to pay tariffs on goods used in the production of exported output (or output sold to other offshore firms) and consequently have less incentive to evade tariffs. Such “offshore” firms can sell up to 30% of their output in the domestic market. If they choose to do so, they have to pay import tariffs on the share of production sold domestically.

More specifically, politically connected enterprises are identified in firm-product-source country customs data spanning the universe of import transactions.
These are merged with UNCTAD data on HS6 exports to Tunisia by country and year to compute evasion gaps (Fisman and Wei 2004), defined as exports to Tunisia reported by partner countries minus imports of the same product from that source country reported in Tunisia. Such evasion gaps are a useful proxy for the amount of imports that are underreported (or misreported) and thus not taxed (appropriately) and hence have become a standard indicator of tariff evasion (Fisman et al. 2008; Jean and Mitaritonna 2010). If connected firms are more prone to evade tariffs, evasion gaps should increase with the share of imports accounted for by connected firms. In addition, the correlation between the import share of connected firms and evasion gaps should strengthen with the tariff rate, since evasion is more lucrative when taxes are high. Moreover, gaps can be expected to decrease in product-source lines in which connected firms were operating once those firms lose their connections as a result of being confiscated.

The data also enable us to examine whether the elasticities of connected firms’ reported import values, quantities, and prices with respect to tariffs differ from those of nonconnected firms, which is our second test for tariff evasion. If Ben Ali firms are more likely to evade tariffs, one would expect their reported imports to decrease disproportionately faster with tariff increases than those of other firms. Comparing the responsiveness of reported unit prices and import quantities across different groups of firms helps assess to what extent evasion occurs through underreporting of prices and mis- or underreporting of quantities. In addition, a difference-in-difference strategy comparing the import declarations of firms privatized to the Ben Ali family with those of firms privatized to non–Ben Ali family members is deployed to identify the impact of becoming connected on reported unit prices.
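The product-level test described above can be illustrated with a minimal Python sketch. It assumes the log-difference form of the gap (log of partner-reported exports minus log of Tunisian-reported imports) and a simple value-weighted import share for connected firms. The function names, the toy HS6 code, the firm labels, and all numbers are hypothetical and are not taken from the paper's data.

```python
import math
from collections import defaultdict


def evasion_gaps(partner_exports, reported_imports):
    """Log gap per (hs6, source, year) cell: log(exports) - log(imports).

    Cells missing on either side, or with non-positive values, are skipped
    because the log difference is undefined for them (an assumption of this
    sketch, not necessarily how the paper treats such cells).
    """
    gaps = {}
    for cell, exports in partner_exports.items():
        imports = reported_imports.get(cell)
        if imports is None or exports <= 0 or imports <= 0:
            continue
        gaps[cell] = math.log(exports) - math.log(imports)
    return gaps


def connected_import_share(firm_imports, connected_firms):
    """Share of each cell's reported import value declared by connected firms."""
    total = defaultdict(float)
    connected = defaultdict(float)
    for (firm, hs6, source, year), value in firm_imports.items():
        cell = (hs6, source, year)
        total[cell] += value
        if firm in connected_firms:
            connected[cell] += value
    return {c: connected[c] / total[c] for c in total if total[c] > 0}


# Hypothetical toy data: one product-source-year cell.
cell = ("870323", "FRA", 2008)
partner_exports = {cell: 120.0}      # value reported by the exporting country
firm_imports = {                     # values declared at Tunisian customs
    ("firm_A", *cell): 25.0,         # hypothetical connected firm
    ("firm_B", *cell): 75.0,         # hypothetical other private firm
}
reported_imports = {cell: sum(firm_imports.values())}

gaps = evasion_gaps(partner_exports, reported_imports)
shares = connected_import_share(firm_imports, {"firm_A"})
print(gaps[cell], shares[cell])
```

A positive gap means partner countries report more exports than Tunisian customs records as imports. The test described in the text asks whether such gaps rise with the connected share, particularly where tariffs are high.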
1 Estimates by Cantens et al. (2012) suggest that in low-income countries such as Cameroon customs revenues account for between 30% and 60% of state revenues and that the importance of customs revenues as a source of income declines as countries get richer.

Tunisia, a small open economy at the forefront of the Arab Spring, provides an interesting case study of which firms are most likely to engage in tariff evasion for several reasons. To start, the Ben Ali family had extensive business interests, and several of its members have been convicted for (ab)using their political power for personal gain, limiting competition in the process (Rijkers et al. 2014b). Establishing leading positions in the import of key consumer products including cars and electronics appears to have been an important strategy by which the family reaped rents (Beau and Graciet 2009; World Bank 2014a). Second, Tunisia has been one of the most successful exporters in Northern Africa and is rather reliant on imports, which amount to roughly half of its GDP, with customs revenues accounting for 9% of fiscal revenues. Third and related, its export success is to a large extent due to tax regulations that stipulate that the so-called offshore firms do not have to pay import duties on goods that they (use to produce goods that they) export. As alluded to above, this diminishes their incentives to engage in tariff evasion. Fourth, Tunisian customs authorities are considered among the most corrupt of all government institutions by Tunisian citizens and companies (ATCP 2015; ITCEQ 2012).
While the Tunisian customs code is consistent with best practices defined by the World Customs Organization,2 its implementation is discretionary (World Bank 2014b).3 The combination of a bewildering complexity of import regimes allowing for suspension of import duties and very weak administrative controls, including a severely defunct IT system, renders effective enforcement challenging. Last but not least, Tunisia has excellent administrative data and authorities willing to share them, enabling us to unambiguously identify politically connected firms as those that were confiscated in the aftermath of the Jasmin Revolution.

By providing evidence suggesting that politically connected firms are more likely to evade tariffs, this paper contributes to and combines different strands of literature. To start, a large literature, pioneered by Bhagwati (1964, 1967) and popularized by Fisman and Wei (2004), has used discrepancies between trade flows reported by trading partners to show how tariff evasion varies with product characteristics, tariff rates (Javorcik and Narcisko 2008; Fisman and Wei 2009), enforcement (Mishra et al. 2008), customs organization, and country characteristics such as the level of corruption or bureaucratic efficiency (Jean and Mitaritonna 2010; Carrère and Grigoriou 2015). However, to the authors’ knowledge, no research yet examines which types of entrepreneurs are most likely to evade import taxation.4

Second, by unveiling an additional mechanism by which firms may benefit from political connections, the paper helps explain why political connections tend to be valued highly (Fisman 2001; Faccio 2006; Faccio et al. 2006; Johnson and Mitten 2003).
Though quantifying the costs associated with tariff evasion is challenging, a highly conservative back-of-the-envelope calculation presented in section V suggests that between 2002 and 2009, underreporting of unit prices alone enabled Ben Ali–owned firms to evade 1.2 billion USD worth of import taxes more than other private firms would have; this estimate does not consider other types of tax fraud, such as smuggling or underreporting of quantities. It also does not consider the indirect costs of this type of fraud, in terms of a lack of transparency and market distortions limiting incentives to invest and hampering efficiency. Our findings also resonate with earlier studies on trade-related corruption by Mobaraq and Purbasari (2008) and Khandelwal et al. (2013), which demonstrated that politically connected firms were prone to firm-specific preferential treatment in being granted exclusive and inefficient import licenses.

Third, by showing evidence suggestive of abuse of power by the ruling elite, our results also contribute to the literature on state-business relationships in the Middle East and Northern Africa (Acemoglu et al. 2014; Diwan et al. 2014) and (indirectly) the economics of the Arab Spring (Campante and Chor 2012; Malik and Awadallah 2013).

The remainder of this paper is organized as follows. The next section briefly reviews related literature and discusses why firms might differ in their propensity to evade tariffs. Testing strategies are presented in section two. Data and descriptive statistics are presented in section three. Product-level regressions are presented in section four, while firm-level regressions are presented in section five. Section six concludes.

2 A new customs code compliant with “best practices” was adopted in 2009, just months before the Jasmin Revolution.
3 Like customs authorities in many other countries, Tunisian customs maintains a three-track system whereby customs declarations allocated to the “green” corridor are allowed to pass through customs without any inspection, customs declarations allocated to the “yellow” corridor face document inspections, and customs declarations in the “red” corridor are subject to physical inspections. Customs officers, however, were reluctant to disclose both which firms were allocated to these various corridors and the criteria used to make such allocations. In all cases, containers have to go through a scanner due to a 100% scanning policy, which means that there is a minimal control for any container. Document checks are carried out systematically. Any firm is therefore subject to controls and hence to discretion from customs officers.

4 In answering this question, we also contribute to the nascent literature on which firms are most likely to evade taxes (see, e.g., Slemrod 2007 and Slemrod and Yithzaki 2002 for overviews of the literature).

I. Why Do Firms Differ in Their Propensity to Evade Tariffs?

Models of tax compliance predict that tax evasion, the circumvention of taxes through illegal practices, increases with the tax rate and decreases with the probability of detection, penalties for evasion, risk aversion, and opportunities to avoid taxes through legal means (see Alm 1999; Andreoni et al. 1998; Slemrod 2007; and Slemrod and Yithzaki 2002 for reviews of the literature). These factors are likely to vary across different types of firms.

Starting with punishment, detection, and risk aversion, connected firms faced both a lower risk of being caught and lower penalties conditional on being caught, and they might have been less risk averse, as connected entrepreneurs were on average wealthier than nonconnected ones.
Customs officials who had been working in the risk management unit during the Ben Ali era told us that fraudulent behavior by Ben Ali firms was less likely to be reported, in part because of career concerns and fear of retaliation from the family. Ben Ali entrepreneurs also appear to have been very well informed of customs risk management practices and control criteria. According to customs officials we interviewed, they “continuously adapted their tactics in response to newly introduced [anti-corruption] measures.”

Connected firms may also have enjoyed greater opportunities for tax avoidance (through legal means); anecdotal evidence suggests connected firms may have had privileged access to duty-exempted import regimes.5 Tax exemptions to promote exports were also very important for offshore firms; since they do not have to pay import tariffs for goods used to manufacture exports (or sold to other offshore firms), their incentives to evade are limited compared to those faced by onshore firms. Moreover, such tariff exemptions may make detecting evasion by offshore firms relatively more challenging than detecting fraud by onshore firms.

State-owned firms might have had weaker incentives to evade tariffs because they have softer budget constraints and may not be purely profit-oriented and because the compensation of managers in public sector firms does not covary with firm profits to the same extent as is the case in private sector firms (Brockmeyer et al. 2015).6 Moreover, financial reporting is typically weak in Tunisian SOEs, and overinvoicing imports is often used to reduce possible taxable profits, which can in turn help generate extra funding for the SOE (Banque Mondiale 2014). At the same time, the relative inefficiency of SOEs may incentivize evasion to stay cost-competitive. In sum, managers of public sector firms face mixed incentives.
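The direction of these predictions can be made concrete with a stylized sketch. It is not the paper's model: it assumes a risk-neutral importer, a fine proportional to the undeclared value, and that the evaded duty is still collected when evasion is detected. The function name and all parameter values are hypothetical.

```python
def expected_gain_from_evasion(undeclared_value, tariff_rate, detection_prob, fine_rate):
    """Expected net gain from hiding `undeclared_value` of imports.

    With probability (1 - detection_prob) the firm keeps the unpaid duty;
    with probability detection_prob it pays the duty anyway plus a fine of
    fine_rate * undeclared_value (a hypothetical penalty rule).
    """
    duty_saved = (1.0 - detection_prob) * tariff_rate * undeclared_value
    expected_fine = detection_prob * fine_rate * undeclared_value
    return duty_saved - expected_fine


# Hypothetical parameters: an ordinary firm, a firm facing lower detection
# risk and lighter penalties, and an ordinary firm importing a low-tariff good.
ordinary = expected_gain_from_evasion(100.0, tariff_rate=0.30, detection_prob=0.10, fine_rate=0.50)
connected = expected_gain_from_evasion(100.0, tariff_rate=0.30, detection_prob=0.02, fine_rate=0.10)
low_tariff = expected_gain_from_evasion(100.0, tariff_rate=0.05, detection_prob=0.10, fine_rate=0.50)
print(ordinary, connected, low_tariff)
```

Under these assumptions the expected gain rises with the tariff rate and falls with the detection probability and the fine, which matches the direction of the predictions summarized in this section. Risk aversion and legal avoidance opportunities are deliberately left out of the sketch.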
Since connected entrepreneurs arguably have some sway over policy making, one may wonder why they do not simply exempt themselves from having to pay tariffs at all, or use their political clout to remove (or at least lower) tariffs. In practice, Ben Ali firms did not on average face lower tariffs, as is demonstrated in section III. One possible explanation is that they benefitted to some extent from tariffs, as these reduce foreign competition (Grossman and Helpman 1994; Goldberg and Maggi 1999). This explanation is consistent with evidence suggesting that sectors in which connected firms were active were more likely to be subjected to new restrictions on foreign investment (Rijkers et al. 2014b), which enhanced the market power of connected firms. Moreover, evasion is all the more advantageous when competitors are forced to pay higher taxes. Reducing tariffs would help not only connected firms but also their competitors. Another possible explanation is that the scope for influencing tariff setting might have been limited by Tunisia’s WTO membership and international trade agreements, which predated the aggressive expansion of the Ben Ali business empire after the turn of the millennium.

5 For example, the customs regime “vente à quai” was predominantly used by ENNAKL, a firm owned by Ben Ali’s son-in-law Sakhr El Matri, and allowed firms to import goods destined for sale on the domestic market without having to pay import taxes. Unfortunately, analyzing the (ab)use of such duty suspension regimes is beyond the scope of this paper due to data limitations. Note, however, that if connected firms used duty-exempted import regimes more intensively, detecting evasion would be more difficult.

6 Evidence from the United States suggests private firms have higher proposed tax deficiency ratios than public firms (see Slemrod 2007 and the references therein).
In short, connected entrepreneurs may be more likely to evade tariffs because they face a lower risk of being caught and lower penalties conditional on being caught and because they might be less risk averse. At the same time, they might also be more successful at lobbying for tax breaks, which would diminish their incentives to evade tariffs. Offshore firms are able to avoid taxes when they export and thus have weaker incentives for evasion, while public sector firms face mixed incentives.

II. Econometric Strategy

Two complementary strategies are deployed to test whether connected firms are more likely to evade taxes. The first, product-level, approach relates trade evasion gaps to the relative importance of Ben Ali owned importers and examines how such gaps changed in the aftermath of the Jasmin Revolution. The second, firm-level, approach assesses whether the value, quantity, and price of connected firms’ imports decline more rapidly with tariffs than those of nonconnected firms. A difference-in-difference strategy comparing privatizations to the Ben Ali family with other privatizations is deployed to isolate the impact of becoming connected on reported unit prices.

Do Evasion Gaps Increase with the Presence of Connected Firms?

If imports into Tunisia are reported correctly, then they must be close to reported exports to Tunisia. Since exports are typically reported as Free on Board7 (FOB), that is, excluding transport and insurance costs, while recorded imports are often calculated in terms of Cost Insurance Freight8 (CIF), small discrepancies reflecting transportation and insurance costs are expected.9 Yet, it is not clear why such discrepancies should be systematically correlated with tariffs once product and destination characteristics are controlled for and tax compliance is perfect.
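To make the expected sign of such discrepancies concrete, the following minimal Python sketch takes the log difference between exports reported by a partner country and imports reported by Tunisian customs for a single product-source-year cell. The function name and all numbers are hypothetical illustrations, not values from the data.

```python
import math

def log_trade_gap(exports_fob, imports_cif):
    """Log exports reported by the partner minus log imports reported in Tunisia."""
    if exports_fob <= 0 or imports_cif <= 0:
        raise ValueError("both flows must be strictly positive to take logs")
    return math.log(exports_fob) - math.log(imports_cif)

# Hypothetical flows in a common currency (assumed numbers).
# Correct reporting: CIF imports slightly exceed FOB exports, so the gap is small and negative.
honest_gap = log_trade_gap(exports_fob=100.0, imports_cif=105.0)

# Underreported imports: the declared import value falls short of partner exports, so the gap is positive.
underreported_gap = log_trade_gap(exports_fob=100.0, imports_cif=60.0)
```

Under correct reporting the gap should hover slightly below zero; a gap that rises with the tariff rate is what the product-level approach treats as suggestive of evasion.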
7 The seller loads the goods on board the ship nominated by the buyer. The seller must clear the goods for export.

8 The seller must pay the costs and freight including insurance to bring the goods to the port of destination.

9 Such discrepancies are typically negative (meaning reported imports in Tunisia are higher than exports reported by partners) and can be amplified by classification errors and/or exchange rate fluctuations. Additional discrepancies may arise because different countries use different accounting systems.

By contrast, a correlation between tariffs and trade gaps, defined here as the difference between exports to Tunisia reported by source countries and imports reported into Tunisia at the HS6-country-year level, is suggestive of tariff evasion. If Ben Ali firms are more likely to evade tariffs than other firms, then the evasion gap should be higher when Ben Ali firms are present, especially for products subject to high tariffs. To assess whether trade gaps vary with the prevalence of Ben Ali importers in a particular product-source line, the following specification is estimated:

\[
\begin{aligned}
\mathit{TradeGap}_{pst} ={}& \log(E_{pst}) - \log(I_{pst}) \\
={}& \beta_{T}\log(\mathit{Tariff}_{pst}+1) + \beta_{BA}\,\mathit{BenAli}_{pst} + \beta_{O}\,\mathit{Offshore}_{pst} + \beta_{P}\,\mathit{Public}_{pst} \\
&+ \beta_{TBA}\,\mathit{BenAli}_{pst}\times\log(\mathit{Tariff}_{pst}+1) + \beta_{TP}\,\mathit{Public}_{pst}\times\log(\mathit{Tariff}_{pst}+1) \\
&+ \beta_{TO}\,\mathit{Offshore}_{pst}\times\log(\mathit{Tariff}_{pst}+1) + \beta_{N}\log(\mathit{NTB}_{pt}+1) + u_{ps} + s_{t} + e_{pst},
\end{aligned}
\tag{1}
\]

where $E_{pst}$ stands for exports to Tunisia of product p reported by partner country s at time t, $I_{pst}$ stands for imports of product p from country s reported by Tunisian customs at time t, $u_{ps}$ is a vector of source-country-product dummies, $s_{t}$ is a vector of time dummies, and $\mathit{BenAli}_{pst}$ is a proxy for the share of imports of product p from country s in year t imported by politically connected firms. Similarly, $\mathit{Public}_{pst}$ is a proxy for the import share of public enterprises, and $\mathit{Offshore}_{pst}$ a proxy for the import share of offshore firms. Note that, implicitly, the omitted category here is onshore private firms. If Ben Ali firms are more prone to tariff evasion, especially for products that are subject to high tariffs, then $\beta_{TBA} > 0$. Similarly, one might hypothesize public enterprises and offshore firms to be less likely to evade tariffs, in which case $\beta_{TO} < 0$ and $\beta_{TP} < 0$. To examine whether potential evasion is due to underreporting of prices or mis- and/or underreporting of quantities, the same regression is estimated using evasion quantity gaps and evasion unit price gaps as dependent variables.

Four different proxies for the import shares of different types of firms are used, notably: (i) the aggregate share of the value of all reported imports in a given HS6-country-year cell, (ii) the aggregate share of the total reported quantity imported in that cell, and (iii) the share of all firms that report importing product p from country s at time t. None of these proxies is ideal; the latter implicitly assumes homogeneity across importers within an HS6-country pair. The first two measures arguably better account for the relative size of different types of firms, yet they may themselves be affected by differential underreporting and/or misclassification. Indeed, a problem with this identification strategy is that underreporting or even nonreporting of imports can be very difficult to detect. In the extreme case in which Ben Ali firms imported goods but simply did not report them at all, they would not appear in the data altogether.
To remedy this problem, (iv) the predicted import share of the different types of firms is used as an alternative proxy, which is constructed using a combination of tax and firm census data and an input-output table that does not rely on reported imports but rather on production data (see the supplemental appendix or the working paper version of this article (Rijkers et al. 2015) for details and results using these alternative proxies).

One potential limitation of this strategy is that evasion gaps may be misattributed to the Ben Ali family when in fact their competitors are the ones who are evading tariffs. To address this limitation, we not only resort to firm-level analysis but also exploit the fact that the Jasmin Revolution was associated with a loss of connections for the confiscated firms. Since firm-level and tariff data are not available for the post–2009 period, the relative importance of the different categories of firms is fixed at their 2009 levels. A difference-in-difference variant of equation (1) is estimated:

\[
\begin{aligned}
\mathit{TradeGap}_{pst} ={}& \beta_{T}\log(\mathit{Tariff}_{pst}+1) + \beta_{PT}\,\mathit{Post}_{t}\log(\mathit{Tariff}_{ps09}+1) \\
&+ \beta_{BA}\,\mathit{BenAli}_{pst} + \beta_{PBA}\,\mathit{Post}_{t}\,\mathit{BenAli}_{ps09} + \beta_{O}\,\mathit{Offshore}_{pst} + \beta_{PO}\,\mathit{Post}_{t}\,\mathit{Offshore}_{ps09} \\
&+ \beta_{P}\,\mathit{Public}_{pst} + \beta_{PP}\,\mathit{Post}_{t}\,\mathit{Public}_{ps09} \\
&+ \beta_{TBA}\,\mathit{BenAli}_{pst}\log(\mathit{Tariff}_{pst}+1) + \beta_{PTBA}\,\mathit{Post}_{t}\,\mathit{BenAli}_{ps09}\log(\mathit{Tariff}_{ps09}+1) \\
&+ \beta_{TP}\,\mathit{Public}_{pst}\log(\mathit{Tariff}_{pst}+1) + \beta_{PTP}\,\mathit{Post}_{t}\,\mathit{Public}_{ps09}\log(\mathit{Tariff}_{ps09}+1) \\
&+ \beta_{TO}\,\mathit{Offshore}_{pst}\log(\mathit{Tariff}_{pst}+1) + \beta_{PTO}\,\mathit{Post}_{t}\,\mathit{Offshore}_{ps09}\log(\mathit{Tariff}_{ps09}+1) \\
&+ \beta_{N}\log(\mathit{NTB}_{pt}+1) + \beta_{PN}\,\mathit{Post}_{t}\log(\mathit{NTB}_{p10}+1) + u_{ps} + s_{t} + e_{pst}.
\end{aligned}
\tag{2}
\]

Since the last year for which NTB data are available is 2010, the values for 2011, 2012, and 2013 are all imputed using the 2010 value. If connected firms were more likely to engage in tariff evasion, this effect should diminish after the Jasmin Revolution, especially in product lines subject to high tariffs, where evasion is expected to be greatest; in this case $\beta_{PTBA} < 0$.
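The way the post-revolution interaction terms are built from shares and tariffs frozen at their 2009 levels can be sketched as follows. This is an illustrative Python sketch only: the function and key names are hypothetical, only the Ben Ali terms are shown, and treating 2011 as the first post-revolution year is an assumption rather than something stated in the text.

```python
def did_terms(year, log_tariff, ben_ali_share, log_tariff_09, ben_ali_share_09,
              first_post_year=2011):
    """Build the Ben Ali regressors of the difference-in-difference variant.

    first_post_year is an assumed cutoff for the Post dummy.
    """
    post = 1 if year >= first_post_year else 0
    if year > 2009:
        # Firm-level and tariff data end in 2009, so later years reuse 2009 values.
        log_tariff, ben_ali_share = log_tariff_09, ben_ali_share_09
    return {
        "log_tariff": log_tariff,
        "ben_ali": ben_ali_share,
        "ben_ali_x_tariff": ben_ali_share * log_tariff,
        "post_x_tariff": post * log_tariff_09,
        "post_x_ben_ali": post * ben_ali_share_09,
        "post_x_ben_ali_x_tariff": post * ben_ali_share_09 * log_tariff_09,
    }

# Hypothetical product-source cell observed before and after the revolution.
pre_terms = did_terms(2008, 3.0, 0.20, 3.2, 0.25)
post_terms = did_terms(2012, None, None, 3.2, 0.25)
```

All post-period terms are zero before the cutoff, so the coefficient on the triple interaction captures how the tariff sensitivity of gaps in Ben Ali product lines changed afterwards.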
Do Connected Firms Report Lower Import Values, Quantities, and Prices (When Tariffs are Higher)?

If Ben Ali firms are more likely to evade tariffs than nonconnected firms, their reported imports can be expected to decline more strongly with tariffs than those of nonconnected firms. To test whether their reported imports are indeed differentially sensitive to tariffs, a simple import demand function is estimated at the product-source country level:

\[
\begin{aligned}
\ln I_{ijpst} ={}& \beta_{Y}\log Y_{it} + \beta_{L}\log L_{it} + \beta_{A}\log \mathit{age}_{it} + \beta_{BA}\,\mathit{BenAli}_{i} + \beta_{O}\,\mathit{Offshore}_{i} + \beta_{P}\,\mathit{Public}_{i} \\
&+ \beta_{T}\log(\mathit{Tariff}_{pst}+1) + \beta_{TBA}\,\mathit{BenAli}_{i}\times\log(\mathit{Tariff}_{pst}+1) \\
&+ \beta_{TP}\,\mathit{Public}_{i}\times\log(\mathit{Tariff}_{pst}+1) + \beta_{TO}\,\mathit{Offshore}_{i}\times\log(\mathit{Tariff}_{pst}+1) \\
&+ \beta_{N}\log(\mathit{NTB}_{pt}+1) + a_{j} + u_{ps} + s_{t} + e_{pst},
\end{aligned}
\tag{3}
\]

where $I_{ijpst}$ are imports by firm i operating in sector j of product p from country s at time t, $Y_{it}$ is firm i’s output in year t, $L_{it}$ is the number of workers employed by firm i at time t, $a_{j}$ is a vector of five-digit activity dummies (Ben Ali importers are active in seventy distinct five-digit sectors), and $u_{ps}$ is a vector of product-source dummies. Controlling for output, employment, firm age, and detailed activity helps control for size heterogeneity and differences in technology across different types of importers. In addition, inclusion of product-source dummies mitigates potential bias associated with sorting into importing particular products from particular countries. Section V also presents specifications in which industry-product-source-year dummies are included (in which case the effects of tariffs and nontariff measures cannot be separately identified) to control for industry-product-source specific shocks. Of focal interest is the coefficient $\beta_{TBA}$; if Ben Ali firms are more likely to underreport or misclassify imports, one would expect a negative coefficient.
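Because a reported import value is the product of a reported quantity and a unit price, its log splits exactly into a quantity part and a price part, which is why values, quantities, and prices can be examined separately. A minimal Python sketch of that identity, with hypothetical numbers and a hypothetical function name:

```python
import math

def decompose_log_value(value, quantity):
    """Split log import value into log quantity and log unit price."""
    unit_price = value / quantity
    return math.log(quantity), math.log(unit_price)

# Hypothetical declaration: 300 units with a total declared value of 1200.
log_quantity, log_unit_price = decompose_log_value(value=1200.0, quantity=300.0)
```

Underinvoicing would show up in the unit price component, whereas underreporting or misclassifying goods would show up in the quantity component.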
These regressions are estimated using import values, quantities, and prices as dependent variables. The identifying assumption underpinning this strategy is that once their output, labor usage, age, and sector are accounted for, there is no reason why Ben Ali firms would import less of goods that are subject to higher tariffs than other firms importing the same product from the same source country in the same year would, other than their proclivity to evade taxes.10 This assumption might be flawed, for instance if Ben Ali firms are better negotiators and more successful at bargaining for lower prices for products subject to high tariffs. Alternatively, if it were easier for Ben Ali firms to substitute away from using high-tariff imports (perhaps because they are more likely to obtain import licenses) than for other firms, then a negative coefficient bT might be misinterpreted as evidence for tariff evasion. The strategy obviously also does not detect pure smuggling, though it helps detect differences due to misclassification and underinvoicing of imports conditional on reporting.

Another potential limitation is that political connections may be endogenous; the Ben Ali family may have bought or set up firms that were particularly cost efficient and/or had a comparative advantage in navigating the complex Tunisian bureaucracy. To examine this issue, we exploit the fact that five of the connected firms were privatized into the Ben Ali family, rendering it feasible to compare their pre- and postprivatization customs declarations for the same firm by source-country. By comparing the evolution of the unit prices of these firms with those of firms privatized to non–Ben Ali family members, we attempt to isolate the impact of becoming connected from the impact of changing public to private ownership.
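The logic of the comparison described above can be illustrated with a simple two-by-two difference in mean log unit prices. This Python sketch is a stylized stand-in that ignores the fixed effects used in the formal specification; the function name and all prices are hypothetical.

```python
from statistics import mean

def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Change in mean log price for the treated group minus the change for controls."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change

# Hypothetical log unit prices for the same product-source pairs.
# "Treated": firms privatized to the Ben Ali family; "control": other privatized firms.
effect = did_estimate(
    treated_pre=[2.0, 2.2], treated_post=[1.8, 2.0],
    control_pre=[2.1, 2.3], control_post=[2.1, 2.3],
)
```

In this made-up example the treated firms' reported log prices fall by 0.2 while the controls' stay flat, so the estimate is about -0.2, the sign one would expect if becoming connected led to underreporting of prices.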
Formally, the following difference-in-difference strategy is adopted:

\[
\log P_{ipst} = \beta_{Priv}\,\mathit{PostPrivatization}_{it} + \beta_{BAPriv}\,\mathit{PostBenAliPrivatization}_{it} + u_{ips} + e_{ipst},
\tag{4}
\]

where $P_{ipst}$ is the unit price reported by firm i importing product p from source country s at time t, $u_{ips}$ is a firm-product-source country fixed effect, and $\mathit{PostPrivatization}_{it}$ and $\mathit{PostBenAliPrivatization}_{it}$ are dummy variables that take the value 1 after a firm has been privatized or privatized to the Ben Ali family, respectively, and zero otherwise. Identification is thus based on comparing the unit prices of the same product imported from the same source country by the same firm before and after becoming privately owned. If Ben Ali firms are more likely to underreport, then one would expect $\beta_{BAPriv} < 0$. The sample of competitors is confined to firms operating in the same five-digit industries as (to be) privatized firms that report importing the same products from the same origin in the same years as (to be) privatized firms in at least two different calendar years.

10 If for some other reason than tariff evasion Ben Ali firms import relatively more goods subject to high tariffs, the coefficient estimates on the tariff will be spuriously attributed to evasion.

III. Data and Descriptive Statistics

In the aftermath of the Tunisian revolution, assets of the Ben Ali clan were confiscated. The confiscation process, which is still ongoing, affects 114 individuals, including Ben Ali himself, his relatives, and his in-laws, and concerns the period from 1987 until the outbreak of the revolution.
We obtained from the Tunisian authorities a list of 662 firms that were owned by the Ben Ali clan and confiscated in the aftermath of the revolution (before December 2014) and were able to identify 206 of these firms as importers by merging the business register, the Répertoire National des Entreprises (RNE), with annual firm-HS6 product-origin data on import transactions for the period 2000–2009 from Tunisian customs. The RNE contains information on the age, sector, and employment of all registered nonagricultural firms operating in Tunisia, including firms not employing any salaried workers, that is, the self-employed (see Rijkers et al. 2014a). Moreover, it has information on firms’ tax status and whether firms are publicly or privately owned. These data are complemented with information on output declared to the Tunisian tax authorities from the Tunisian Ministry of Finance. Thus, not all Ben Ali–owned importers are identified. However, the vast majority are included in our sample, which is most likely skewed towards the largest and economically most relevant firms since these are easier to identify (and hence also easier to confiscate).

In order to calculate evasion gaps and assess how they relate to tariffs, the Tunisian HS6-product-origin import data are merged with bilateral tariff data by product from WITS, which are available for 2002–2009 but missing for 2007 and 2009; for these two years, tariff data from 2006 and 2008, respectively, are used instead. Data on nontariff measures by product are available for the period from 2000 to 2010, and information on imports and exports to Tunisia by HS6 product and year is taken from COMTRADE for the period from 2002 to 2013.
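The treatment of the two missing tariff years amounts to carrying the previous year's tariff forward. A minimal Python sketch, with a hypothetical function name and made-up tariff rates for a single product-source pair:

```python
def fill_missing_tariffs(tariff_by_year, first_year=2002, last_year=2009):
    """Fill gaps in a year -> tariff mapping with the preceding year's value."""
    filled = {}
    for year in range(first_year, last_year + 1):
        if year in tariff_by_year:
            filled[year] = tariff_by_year[year]
        elif year - 1 in filled:
            # Missing year: reuse the tariff from the year before.
            filled[year] = filled[year - 1]
    return filled

# Hypothetical tariff rates (percent); 2007 and 2009 are absent, as in the WITS data.
observed = {2002: 40.0, 2003: 40.0, 2004: 38.0, 2005: 38.0, 2006: 36.0, 2008: 30.0}
tariffs = fill_missing_tariffs(observed)
```

One consequence of this imputation is that measured tariffs cannot change between 2006 and 2007 or between 2008 and 2009, which reduces the within-product variation available for identification.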
To ensure the results reflect systematic mismatches, rather than erratically reported incidental transactions, the sample is confined to (i) products that account for more than 0.01 percentage points of cumulative total exports to Tunisia reported by partners or more than 0.01 percentage points of cumulative imports reported in Tunisia over this period.11 In addition, (ii) the focus is on the top fifteen source countries in terms of total import value reported in Tunisia or total export value declared in source countries12 and (iii) HS6-source country-year combinations for which both reported imports in Tunisia and reported exports by partner countries are positive.13 The resulting sample comprises 1,386 products and sixteen countries,14 which cover 69.75% of all exports to and 61.03% of all imports declared in Tunisia. In addition, product-level information on nontariff measures from the World Bank (Malouche et al. 2013) is used.

Table 1 presents descriptive statistics for different types of firms. The 206 connected importing firms are on average larger and more diversified than other private firms; while they comprise only 0.7% of all importing firms, they account for 2.3% of all imports over the period considered. Even more striking, 124 public firms (0.44% of all firms in our sample) together account for more than a quarter of all import value over the period. By contrast, private “onshore” importers, which represent nearly three-quarters of all importing firms, tend to be the smallest and the least diversified, accounting for 38% of reported imports. “Offshore” importers that specialize in exports are comparatively large. Note also that connected firms do not, on average, face lower tariffs or fewer nontariff measures.

Table 1. Descriptive Statistics on Firm Characteristics: By Type of Firm

| Firm type | Statistic | Ben Ali | Offshore | Onshore | Public | All |
|---|---|---|---|---|---|---|
| Economic significance | | | | | | |
| Number of firms | N | 206 | 7074 | 20869 | 124 | 28273 |
| % of all firms | % | 0.73 | 25.02 | 73.81 | 0.44 | 100 |
| % of overall imports | % | 2.29 | 32.53 | 38.64 | 26.54 | 100 |
| % of source-product-years with at least one importer of this type | % | 7.22 | 36.62 | 81.16 | 12.40 | |
| Firm characteristics, by firm-year | N | 865 | 27926 | 82691 | 782 | 112264 |
| Y | Mean | 14.389 | 13.268 | 13.138 | 17.057 | 13.202 |
| | Sd | 2.690 | 1.934 | 2.021 | 2.684 | 2.036 |
| L | Mean | 2.436 | 2.983 | 1.925 | 2.470 | 2.198 |
| | Sd | 2.180 | 1.861 | 1.630 | 3.102 | 1.770 |
| Age | Mean | 9.279 | 7.607 | 12.755 | 30.197 | 11.562 |
| | Sd | 10.776 | 6.746 | 11.250 | 18.585 | 10.731 |
| Log total imports | Mean | 12.690 | 12.421 | 11.391 | 14.011 | 11.676 |
| | Sd | 2.444 | 2.410 | 2.092 | 3.438 | 2.245 |
| # source countries | Mean | 5.318 | 3.294 | 3.267 | 8.944 | 3.329 |
| | Sd | 5.900 | 3.646 | 3.696 | 9.604 | 3.812 |
| # products | Mean | 25.383 | 13.127 | 9.256 | 46.219 | 10.600 |
| | Sd | 43.069 | 18.219 | 16.556 | 73.868 | 18.712 |
| # source countries*products | Mean | 36.091 | 17.139 | 12.687 | 80.004 | 14.444 |
| | Sd | 65.748 | 29.027 | 28.400 | 146.05 | 31.984 |
| By source-country and year | N | 31219 | 478618 | 1049102 | 62563 | 1621502 |
| Log imports per source-country | Mean | 9.115 | 9.457 | 9.184 | 9.585 | 9.279 |
| | Sd | 8.789 | 9.166 | 8.912 | 9.177 | 8.992 |
| Log (1+Tariffs) | Mean | 3.540 | 3.432 | 3.334 | 3.259 | 3.366 |
| | Sd | 0.692 | 0.997 | 0.831 | 0.751 | 0.883 |
| Log (1+NTBs) | Mean | 0.060 | 0.024 | 0.037 | 0.058 | 0.035 |
| | Sd | 0.309 | 0.202 | 0.255 | 0.293 | 0.243 |

Source: Authors’ analysis based on data from Tunisian customs, the World Bank, COMTRADE, WITS, and La Commission Nationale de Gestion d’Avoirs et des Fonds objets de Confiscation ou de Récupération (see appendix A).

11 This reduces the sample from 5,449 to 1,493 products, which together account for 91.05% of exports to Tunisia declared by partners and for 92.05% of imports declared in Tunisia.

12 These countries account for 84.26% of import value declared in Tunisia over the period and for 78.92% of all exports to Tunisia reported by partners. Focusing only on products that account for more than 0.01 percentage points of cumulative imports reported in Tunisia or 0.01 percentage points of cumulative exports to Tunisia reduces these numbers to 76.97% and 73.32%, respectively.

13 Such source-product combinations account for 81.44% of all exports to Tunisia reported by partners and 73.65% of imports reported in Tunisia. We also examined whether the likelihood of imports being “orphaned,” that is, existing in Tunisian customs declarations without having a corresponding matching declaration in the alleged source country, was related to the importance of Ben Ali firms but could not reject the null hypothesis that this was not the case. Results are omitted to conserve space but available upon request.

14 The countries are Algeria, Argentina, China, France, Germany, Italy, Japan, Libya, the Netherlands, the Russian Federation, Spain, Turkey, the United States of America, Ukraine, and the United Kingdom. Note that we have sixteen rather than fifteen countries because we are including countries that are in the top fifteen source countries, either based on imports declared in Tunisia or based on exports to Tunisia reported by partner countries. While the rankings of different source countries using these different criteria usually line up well, there are a few exceptions where this is not the case. Notably, Argentina is ranked as the fifteenth largest source country based on Tunisian import data, but the sixteenth largest source country based on export data reported by partners. By contrast, Japan is considered the fifteenth largest source country based on import data reported in Tunisia but the twenty-first largest importer based on data reported by partners.
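A sample of more than fifteen countries can arise because the country selection takes the union of two top-fifteen lists, one per ranking criterion, and those lists need not coincide. The Python sketch below illustrates this with a hypothetical toy example that uses a cutoff of two instead of fifteen; the function names and totals are made up.

```python
def top_k(totals, k):
    """Return the k keys with the largest totals."""
    return set(sorted(totals, key=totals.get, reverse=True)[:k])

def source_country_sample(import_totals, export_totals, k=15):
    """Countries in the top k by Tunisian-reported imports OR partner-reported exports."""
    return top_k(import_totals, k) | top_k(export_totals, k)

# Hypothetical cumulative trade values for three countries.
imports_reported_in_tunisia = {"A": 10.0, "B": 8.0, "C": 5.0}
exports_reported_by_partners = {"A": 9.0, "B": 4.0, "C": 7.0}

sample = source_country_sample(
    imports_reported_in_tunisia, exports_reported_by_partners, k=2
)
```

Here the two rankings disagree on the second-ranked country, so the union contains three countries even though each list has only two.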
Table 2 presents descriptive statistics on average evasion gaps at the source-country-year level for the entire sample and by dominant importer type, by tariff level. It discriminates between, respectively, Ben Ali, offshore, onshore, and public sector–dominated products, depending on which type of firm is the dominant importer based on aggregate import value; that is, goods for which Ben Ali (public/onshore/offshore) firms account for more than 50% of the value of all reported imports from a given country in a given year are classified as Ben Ali (public/onshore/offshore) dominated. Goods for which there is not a single dominant importer are classified into a residual “mixed” category.

Table 2. Evasion Gaps
Descriptive statistics on evasion gaps (log exports reported by partner minus log imports reported in Tunisia at the HS6-source-country-year level)

| | All: Mean | All: Std. dev. | All: N | Ben Ali: mean | Offshore: mean | Onshore: mean | Public: mean | Residual: mean |
|---|---|---|---|---|---|---|---|---|
| Evasion gap: values (log export value reported by partner minus log import value reported in Tunisia) | | | | | | | | |
| All | −0.001 | 1.875 | 49347 | 0.356 | 0.071 | −0.036 | −0.234 | 0.098 |
| High tariff | 0.099 | 1.939 | 24896 | 0.797 | 0.090 | 0.094 | −0.101 | 0.170 |
| Low tariff | −0.104 | 1.801 | 24451 | −0.108 | 0.033 | −0.136 | −0.293 | 0.048 |
| N | | | | 760 | 16068 | 29692 | 2106 | 721 |
| Evasion gap: quantities (log export quantity reported by partner minus log import quantity reported in Tunisia) | | | | | | | | |
| All | −0.015 | 2.172 | 48724 | 0.134 | 0.080 | −0.083 | 0.129 | 0.087 |
| High tariff | 0.080 | 2.173 | 24646 | 0.318 | 0.104 | 0.037 | 0.389 | 0.127 |
| Low tariff | −0.112 | 2.167 | 24080 | −0.056 | 0.033 | −0.176 | 0.014 | 0.059 |
| N | | | | 749 | 15,908 | 29,303 | 2,059 | 705 |
| Evasion gap: unit prices (log unit price reported by partner minus log unit price reported in Tunisia) | | | | | | | | |
| All | 0.020 | 1.026 | 48724 | 0.216 | −0.006 | 0.056 | −0.357 | 0.028 |
| High tariff | 0.022 | 0.951 | 24646 | 0.465 | −0.017 | 0.065 | −0.473 | 0.048 |
| Low tariff | 0.018 | 1.098 | 24080 | −0.042 | 0.014 | 0.049 | −0.306 | 0.014 |
| N | | | | 749 | 15,908 | 29,303 | 2,059 | 705 |

Notes: Sample is confined to product-source-year combinations (i) in which imports reported in Tunisia and exports reported in partner countries are strictly positive, (ii) from countries that are among the top fifteen source countries in terms of cumulative import value (either as declared by partners or as declared by Tunisian customs) over the period 2002–2009, and (iii) products that account for at least 0.01 percentage points of import value (either as declared by partners or as declared by Tunisian customs) over this period. In this sample, Ben Ali–dominated goods account for 4.75% of aggregate import value, offshore-dominated goods for 44.17%, onshore-dominated goods for 35.63%, state-dominated goods for 14.56%, and mixed goods for 1.04%.
Source: Authors’ analysis based on data described in the text.

Average log evasion gaps in terms of import value are very small. The value of imports recorded in Tunisia is on average 0.1 percent higher than the export value reported in partner countries.
Arguably more informative are differences in evasion gaps between goods subject to high tariffs and goods subject to low tariffs. Average evasion gaps for goods subject to high tariffs, that is, subject to a tariff rate of at least 36 percent, are positive and approximately 9.9 percent. By contrast, gaps for goods subject to low tariffs are minus 10.4 percent. Thus, goods subject to higher tariffs seem more prone to tax evasion, and such evasion is likely taking place through misclassification of goods.

More striking are the differences across dominant importers. Log evasion gaps for products for which onshore firms are the dominant importer are −0.04 on average, but 0.09 for goods subject to high tariffs. Log evasion gaps for goods for which offshore firms are the dominant importer are around 0.07 and do not seem to vary much with tariffs. By contrast, log gaps for Ben Ali–dominated products are on average approximately 0.36 and a striking 0.80 for goods subject to high tariffs. Gaps are consistently negative for goods for which public firms are the most prominent importers, which is consistent with overinvoicing of imports. Thus, prima facie, the results are suggestive of evasion of tariffs by politically connected firms, as well as overreporting of imports by public firms.

Average log evasion gaps in terms of quantities and prices, presented in panels B and C, respectively, are also small (−0.015 and 0.020, respectively). Quantity gaps are on average somewhat higher for goods subject to high tariffs (0.08) and negative for goods subject to low tariffs (−0.11), which hints at misclassification. The difference in average evasion price gaps between goods subject to low and high tariffs is very small.
The standard deviation of price gaps is much lower than that of quantity gaps. Evasion strategies seem to vary by firm type; while all firm types appear to misclassify goods to some extent (as is evidenced by the fact that across the different dominant importer types evasion gaps are consistently higher for goods subject to high tariffs than for goods subject to low tariffs), price underreporting is most egregious in Ben Ali–dominated product lines subject to high tariffs, for which gaps are as high as 0.47.15

Can differences between connected and other firms operating in the same five-digit industry and importing the same product from the same country in the same year be detected? Table 3 presents information on average firm-level import values, quantities, and unit prices, normalized by the product-source-industry-year average (which is normalized to be equal to 1), and documents descriptive statistics consistent with the product-level patterns; the declared value of Ben Ali firms’ imports exceeds that of the average firm by 18%, and their declared import quantities are 21% higher than the average. Yet, their reported unit prices are on average 4.8% lower than those reported by a representative firm. For goods subject to low tariffs, Ben Ali firms’ import prices are on a par with those of other firms, whereas for goods subject to high tariffs, their reported prices are 8.1% lower than the average. Onshore firms on average report lower import values and quantities, but their reported prices are very close to average prices. Public firms pay unit prices that are on average 7.4% higher than those of other firms that simultaneously import the same HS6 product from the same country.

To sum up, exploratory descriptive statistics are indicative of considerable tariff evasion. They also suggest that connected firms are more likely to evade tariffs than other firms and that they were more likely to use undervaluation of prices as an evasion strategy than other firms.
By contrast, underreporting and/or misclassification of import quantities seems to have been an evasion strategy that all types of firms engaged in. The next sections test these hypotheses more rigorously by estimating the models discussed in section II.

15 Quantity gaps are also high for these product lines, yet the highest average quantity gaps are observed for goods for which public firms are the dominant importers, which are subject to high tariffs. By contrast, price gaps for goods predominantly imported by public firms are on average negative (irrespective of the tariff); the overreporting of import values for products predominantly imported by SOEs is thus driven by overreporting of prices, rather than overreporting of quantities (and all the more remarkable given the existence of nontrivial average quantity gaps for such products).

Table 3. Firm Level Reported Import Values, Quantities, and Prices, by Source-Origin
Mean import values, quantities, and prices normalized by product-origin-year-industry weighted average (product-origin-year-industry average = 1)

| | Ben Ali (N = 8140) | Offshore (N = 15880) | Onshore (N = 23775) | Public (N = 237) | All (N = 48,032): Std. dev. | All: N |
|---|---|---|---|---|---|---|
| Mean import value | | | | | | |
| All | 1.183 | 1.030 | 0.934 | 1.403 | 1.383 | 48032 |
| High tariffs | 1.214 | 1.032 | 0.923 | 1.458 | 1.440 | 33147 |
| Low tariffs | 1.131 | 1.020 | 0.955 | 1.355 | 1.246 | 14885 |
| Mean import quantity | | | | | | |
| All | 1.205 | 1.009 | 0.923 | 1.340 | 1.466 | 48032 |
| High tariffs | 1.253 | 1.010 | 0.911 | 1.414 | 1.518 | 33147 |
| Low tariffs | 1.124 | 1.007 | 0.960 | 1.273 | 1.342 | 14885 |
| Mean unit prices | | | | | | |
| All | 0.952 | 1.022 | 1.014 | 1.086 | 0.791 | 48032 |
| High tariffs | 0.919 | 1.027 | 1.023 | 1.056 | 0.764 | 33147 |
| Low tariffs | 1.008 | 1.000 | 1.000 | 1.114 | 0.851 | 14885 |

Notes: Sample is confined to (five-digit) sector-product-source combinations in which (i) at least one Ben Ali firm and at least one other firm of a different type are simultaneously importing during the same year, (ii) for which tariff data exist, (iii) in which imports reported in Tunisia and exports reported in partner countries are strictly positive, (iv) from countries that are among the top fifteen source countries in terms of cumulative import value (either as declared by partners or as declared by Tunisian customs) over the period 2002–2009, and (v) products that account for at least 0.01 percentage points of import value (either as declared by partners or as declared by Tunisian customs) over this period. In addition, (vi) observations that fall in the top and bottom 1% of normalized prices, quantities, and values are excluded. The sample comprises nine public firms, 1,787 onshore firms, 1,145 offshore firms, and 113 BA firms.
Source: Authors’ analysis based on data described in the text.

IV. Results: Evasion Gaps

This section first presents the main results, before examining to what extent tariff evasion was driven by misrepresenting quantities or prices. The section ends with an analysis of the evolution of evasion gaps in the aftermath of the Revolution.

Basic Results

Table 4 examines the determinants of log evasion gaps using two different models; in the first, simplistic model (presented in columns 1–3), the evasion gap is modeled as a function of the tariff rate and the share of imports accounted for by, respectively, Ben Ali firms, offshore firms, and public enterprises, with onshore firms being the reference group.
The second, and preferred, model (referred to as the “interacted” model, presented in columns 4–6) adds interaction terms between the tariff and the value share of Ben Ali importers, offshore importers, and public importers. The coefficient on the tariff can thus (loosely)16 be interpreted as providing a crude approximation to the evasion elasticity of onshore firms. Standard errors are clustered at the product level. Progressively more elaborate sets of dummies are added when moving from the left side of the table to the right; columns 1 and 4 include country-year effects, columns 2 and 5 add product fixed effects, while columns 3 and 6 control for both country-year and country-product fixed effects. Although the specification in column 6 is in principle our preferred specification as it offers the most rigorous test for differences in evasion across firms, it is important to bear in mind that including both country-year and country-product fixed effects absorbs a lot of the var- iation of interest. 16 The approximation is crude because the value shares used to proxy the importance of different types of firms are them- selves endogenous to evasion. The World Bank Economic Review 471 Table 4. 
The Determinants of Evasion Value Gaps
The determinants of evasion gaps
Dependent variable: log evasion value gaps by HS6-source country-year

                            (1)        (2)        (3)        (4)        (5)        (6)
                            coef/se    coef/se    coef/se    coef/se    coef/se    coef/se
Log (Tariff+1)              0.058***   0.012      0.018      0.078***   0.040      0.050
                            (0.022)    (0.051)    (0.053)    (0.023)    (0.053)    (0.056)
Ben Ali %                   0.419***   0.422***   0.174      -0.850*    -0.534     -0.759
                            (0.133)    (0.120)    (0.150)    (0.474)    (0.462)    (0.480)
Offshore %                  -0.034     -0.288***  -0.464***  0.246      0.150      0.011
                            (0.061)    (0.051)    (0.06)     (0.182)    (0.164)    (0.219)
Public %                    -0.312***  -0.339***  -0.459***  -0.229     -0.612     -0.621
                            (0.076)    (0.082)    (0.089)    (0.324)    (0.404)    (0.436)
Ben Ali % * Log (Tariff+1)                                   0.370***   0.279**    0.275**
                                                             (0.133)    (0.13)     (0.136)
Offshore % * Log (Tariff+1)                                  -0.080     -0.130***  -0.143**
                                                             (0.053)    (0.047)    (0.061)
Public % * Log (Tariff+1)                                    -0.025     0.085      0.050
                                                             (0.095)    (0.120)    (0.130)
Log (NTB+1)                 0.016      -0.064     -0.078     0.013      -0.061     -0.076
                            (0.022)    (0.040)    (0.050)    (0.022)    (0.040)    (0.050)
Country*year FE             Yes        Yes        Yes        Yes        Yes        Yes
Product FE                             Yes                              Yes
Country*product FE                                Yes                              Yes
N                           49347      49347      49347      49347      49347      49347
Number of products          1386       1386       1386       1386       1386       1386
R2                          0.036      0.258      0.627      0.037      0.258      0.627
Adjusted R2                 0.034      0.234      0.534      0.035      0.235      0.534

Notes: Standard errors are clustered by product. ***p < 0.01, **p < 0.05, *p < 0.10. The sample is confined to product-source-year combinations (i) in which imports reported in Tunisia and exports reported in partner countries are strictly positive, (ii) from countries that are among the top fifteen source countries in terms of cumulative import value (either as declared by partners or as declared by Tunisian customs) over the period 2002–2009, and (iii) products that account for at least 0.01 percentage points of cumulative import value (either as declared by partners or as declared by Tunisian customs) over this period.
Source: Authors’ analysis based on data described in the text.

The results presented in column 1 are consistent with substantial tariff evasion; the tariff rate is a strongly significant predictor of the evasion gap, and the estimated evasion elasticity is 0.058. Once product and country-product dummies are included (columns 2 and 3), the coefficient on the tariff drops and loses statistical significance. This is presumably due, at least in part, to the fact that tariffs do not vary dramatically across countries and over time and that, out of necessity, tariffs were assumed equal to those of the preceding years for the two years for which tariff data were missing altogether (see the appendix). The ability to nonetheless identify significant interactions between the tariff rate and the import shares accounted for by different firms (columns 4–6) is due to variability in these import shares (rather than the tariffs themselves) over time.

Turning to the main result, connected firms are more likely to evade tariffs. To start with, evasion gaps are strongly positively correlated with the share of import value accounted for by Ben Ali firms in the simple specification (column 1). This association is robust to controlling for product fixed effects (column 2) but not to country-product fixed effects (column 3). More importantly, the interaction between the Ben Ali proxy and the tariff measure is consistently positive and significant in the preferred interacted specifications (columns 4–6), consistent with the hypothesis that connected firms are more likely to evade tariffs. This result is robust to controlling for product (column 5) and even country-product fixed effects (column 6). The regressions thus strongly reject the null hypothesis that connected firms do not differ in their evasion propensity from other firms and are instead consistent with connected firms evading more.
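To make the interacted model concrete, the following minimal Python sketch evaluates the tariff elasticity of the evasion gap implied by the column 6 point estimates in table 4 for a given mix of importers. The function name and the example shares are illustrative assumptions, not part of the authors' analysis. The sketch uses point estimates only and ignores their standard errors; the baseline tariff coefficient in column 6 is itself not statistically significant.

```python
# Point estimates copied from column 6 of table 4 (interacted model with
# country-year and country-product fixed effects). Standard errors are ignored.
COEF_TARIFF = 0.050              # Log (Tariff+1); onshore firms are the reference group
COEF_BEN_ALI_X_TARIFF = 0.275    # Ben Ali % * Log (Tariff+1)
COEF_OFFSHORE_X_TARIFF = -0.143  # Offshore % * Log (Tariff+1)
COEF_PUBLIC_X_TARIFF = 0.050     # Public % * Log (Tariff+1)


def implied_elasticity(ben_ali_share, offshore_share=0.0, public_share=0.0):
    """Derivative of the log evasion gap with respect to Log (Tariff+1).

    Shares are import value shares between 0 and 1; whatever is left over is
    attributed to onshore firms. Hypothetical helper, for illustration only.
    """
    if min(ben_ali_share, offshore_share, public_share) < 0:
        raise ValueError("shares must be non-negative")
    if ben_ali_share + offshore_share + public_share > 1.0 + 1e-12:
        raise ValueError("shares cannot sum to more than 1")
    return (COEF_TARIFF
            + COEF_BEN_ALI_X_TARIFF * ben_ali_share
            + COEF_OFFSHORE_X_TARIFF * offshore_share
            + COEF_PUBLIC_X_TARIFF * public_share)


if __name__ == "__main__":
    for label, shares in [("onshore only", (0.0, 0.0, 0.0)),
                          ("Ben Ali only", (1.0, 0.0, 0.0)),
                          ("offshore only", (0.0, 1.0, 0.0))]:
        print(f"{label}: {implied_elasticity(*shares):.3f}")
```

Under these point estimates the implied elasticity rises from 0.050 where only onshore firms import to 0.325 where Ben Ali firms account for all imports, and it is -0.093 where offshore firms do.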
By contrast, offshore firms seem less likely to engage in tariff evasion; once product and product-source dummies are conditioned on, the coefficient on offshore importers is negative and statistically significant in the simple specification, and the interaction between the tariff and the import share of offshore firms is significantly negative in the interacted specification. Evasion gaps are also significantly negatively correlated with the share of imports accounted for by public firms in the simple specification without interaction terms, pointing towards possible overinvoicing of imports by such firms.

Misrepresentation of Quantities or Prices?

To assess whether the documented evasion gaps are due to (i) underreporting or misclassification of import quantities or (ii) underreporting of prices, table 5 presents estimates of the same regressions using as dependent variables log evasion gaps in weights (columns 1–4) and unit prices (columns 5–8). It replicates the models that condition only on country-year dummies (presented in columns 1, 3, 5, and 7), as well as the models that condition on both country-year and country-product dummies (presented in columns 2, 4, 6, and 8).

The results presented in column 1 are consistent with misclassification and underreporting of quantities; quantity gaps significantly increase with tariffs, with a 10% increase in tariffs being associated with an increase in evasion gaps of approximately 0.56 percentage points. Once product-country dummies are introduced (column 2), the coefficient on the tariff drops and becomes statistically insignificant, presumably in part because there is not a lot of temporal variation in tariffs at the product-country level in the majority of our sample.
Different types of firms do not appear to differ substantially in their propensity to underreport quantities, as the coefficients on the import shares of Ben Ali, offshore, and public firms, and their interactions with the tariff level, are not statistically significant. Thus, the hypothesis that different types of firms misreport quantities to the same extent is not rejected.

By contrast, firms do appear to differ in the extent to which they misrepresent prices; unit price discrepancies vary strongly with the import shares of different types of importers, even though they are not significantly correlated with the tariff rate. Connected firms seem more likely to underreport prices since price gaps are strongly and significantly increasing with the share of Ben Ali firms in the simple specification and, moreover, the interaction between the share of imports accounted for by Ben Ali firms and the tariff is consistently statistically significant in our preferred specification. For offshore firms, again, the opposite pattern is observed; such firms appear less likely to engage in price underreporting.[17] Note also that unit price gaps are strongly decreasing with the share of imports claimed by public firms.

In sum, the results are consistent with the hypothesis that politically connected firms are more likely to evade tariffs than other firms and suggest that underreporting of unit prices is the main driver of the association between evasion value gaps and Ben Ali presence. By contrast, such underreporting is significantly less prevalent where offshore firms account for a larger share of import value, especially when tariffs are high. Public firms seem to overreport imports.
The results also suggest that misreporting of import quantities was an important evasion mechanism for all types of firms, even though the null hypothesis that such misreporting did not vary across firms is not rejected (perhaps in part because the variance in quantity gaps is larger than the variance in unit price gaps). Robustness checks using alternative proxies for the import share of different types of firms, alternative sample restrictions, and alternative specifications can be found in the supplemental appendix or the working paper version of this article (Rijkers et al. 2015).

[17] Gaps are negatively correlated with the share of imports claimed by offshore firms in the simple specification and with the interaction between the offshore share and the tariff in the interacted specification.

Table 5. The Determinants of Evasion Gap Quantity and Price Gaps
Evasion gaps: prices or quantities?
Dependent variable: Log quantity gap (by HS6-country-year) in columns 1–4 (Q); Log price gap (by HS6-country-year) in columns 5–8 (P)

                            Q          Q          Q          Q          P          P          P          P
                            (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)
                            coef/se    coef/se    coef/se    coef/se    coef/se    coef/se    coef/se    coef/se
Log (Tariff+1)              0.056**    -0.016     0.063**    -0.007     -0.001     0.031      0.012      0.052
                            (0.025)    (0.063)    (0.026)    (0.067)    (0.009)    (0.030)    (0.011)    (0.033)
Ben Ali %                   0.104      -0.150     0.021      -0.009     0.297***   0.321***   -0.890***  -0.786**
                            (0.134)    (0.168)    (0.548)    (0.595)    (0.081)    (0.102)    (0.227)    (0.324)
Offshore %                  0.034      -0.372***  0.123      -0.221     -0.068***  -0.092**   0.124*     0.218*
                            (0.067)    (0.070)    (0.204)    (0.226)    (0.020)    (0.038)    (0.064)    (0.117)
Public %                    0.093      0.006      0.041      -0.361     -0.418***  -0.470***  -0.295*    -0.274
                            (0.091)    (0.119)    (0.408)    (0.638)    (0.050)    (0.069)    (0.170)    (0.308)
Ben Ali % * Log (Tariff+1)                        0.024      -0.041                           0.347***   0.327***
                                                  (0.153)    (0.170)                          (0.069)    (0.093)
Offshore % * Log (Tariff+1)                       -0.026     -0.045                           -0.055***  -0.093***
                                                  (0.058)    (0.064)                          (0.017)    (0.033)
Public % * Log (Tariff+1)                         0.016      0.114                            -0.037     -0.061
                                                  (0.119)    (0.189)                          (0.051)    (0.091)
Log (NTB+1)                 0.026      -0.024     0.025      -0.023     -0.012     -0.047*    -0.014     -0.045*
                            (0.025)    (0.060)    (0.025)    (0.060)    (0.011)    (0.025)    (0.011)    (0.026)
Country*year FE             Yes        Yes        Yes        Yes        Yes        Yes        Yes        Yes
Country*product FE                     Yes                   Yes                   Yes                   Yes
N                           48724      48724      48724      48724      48724      48724      48724      48724
Number of products          1386       1386       1386       1386       1386       1386       1386       1386
R2                          0.032      0.611      0.032      0.611      0.034      0.495      0.036      0.496
Adjusted R2                 0.030      0.513      0.030      0.513      0.032      0.368      0.033      0.369

Notes: Standard errors are clustered by product. ***p < 0.01, **p < 0.05, *p < 0.10. The sample is confined to product-source-year combinations (i) in which imports reported in Tunisia and exports reported in partner countries are strictly positive, (ii) from countries that are among the top fifteen source countries in terms of cumulative import value (either as declared by partners or as declared by Tunisian customs) over the period 2002–2009, and (iii) products that account for at least 0.01 percentage points of cumulative import value (either as declared by partners or as declared by Tunisian customs) over this period.
Source: Authors’ analysis based on data described in the text.

The Evolution of Evasion Gaps after the Revolution

What happened to evasion gaps after the Jasmin Revolution, which involved the ousting of President Ben Ali? Table 6 presents descriptive statistics on the evolution of mean evasion gaps before and after the revolution, distinguishing between products dominated by Ben Ali firms and other products. Note that the sample is confined to products that were imported from the same source country both before and after the Jasmin Revolution.
Table 6 documents the change in average evasion gaps after the revolution, showing that they decreased, though not significantly, by approximately 16.2% on average in product lines where Ben Ali firms had been dominant (column 1). By contrast, they increased significantly, by 5.7% on average, in other product lines (column 2). Thus, the difference in average evasion gaps between previously Ben Ali–dominated product lines and other product lines narrowed substantially, by 21.9% (column 3), and this change was statistically significant at the 10% level.

Table 6. The Evolution of Evasion Gaps After the Revolution
The evolution of log evasion gaps in the aftermath of the revolution
Columns: (1) previously Ben Ali–dominated products; (2) other products; (1-2) difference-in-difference

                   (1)          (2)          (1-2)
Change in average log value gaps
  All              -0.162       0.057**      -0.219*
  High tariff      -0.164       0.061**      -0.225
  Low tariff       -0.094       0.058*       -0.152
  N                858          55,705       55,933
Change in average log weight gap
  All              0.050        0.093***     -0.042
  High tariff      0.311        0.086**      0.224
  Low tariff       -0.175       0.103***     -0.278*
  N                835          53,988       54,823
Change in average log price gap
  All              -0.165***    -0.028**     -0.138
  High tariff      -0.434***    -0.033**     -0.401***
  Low tariff       0.121        -0.023       0.144
  N                835          53,988       54,823

Notes: ***p < 0.01, **p < 0.05, *p < 0.10. Tests for whether differences in means are statistically significant are based on regressions of the form log gap = a1 + b1*Post + b2*BA dominated + b3*Post*BA dominated, where Post is a dummy taking the value 1 for years after 2010 and 0 otherwise, and BA dominated is a dummy variable indicating whether Ben Ali firms accounted for more than 50% of reported imports. Standard errors are clustered at the product level.
The sample is confined to product-source-year combinations (i) in which imports reported in Tunisia and exports reported in partner countries are strictly positive, (ii) imported both at least once between 2002 and 2009 and at least once between 2010 and 2013, (iii) from countries that are among the top fifteen source countries in terms of cumulative import value (either as declared by partners or as declared by Tunisian customs) over the period 2002–2009, and (iv) products that account for at least 0.01 percentage points of import value (either as declared by partners or as declared by Tunisian customs) over the period 2002–2009.
Source: Authors’ analysis based on data described in the text.

The increase in value gaps in non–Ben Ali–dominated product lines has been driven by a significant increase in quantity gaps, which is consistent with a rise in informal trade with Libya and Algeria documented by Ayadi et al. (2013). Price gaps declined significantly on average. They declined most rapidly in previously Ben Ali–dominated product lines subject to high tariffs.

Table 7 presents regressions that replicate our preferred specifications but now interact all variables with a post-revolution indicator that takes the value 1 for 2011, 2012, and 2013 and zero otherwise. Due to data limitations, all explanatory variables for this period are imputed using the latest available data, which for the firm-level proxies and the tariff is 2009, while it is 2010 for the nontariff measures. The results presented in columns 1–3 show that after the revolution, price gaps decreased significantly faster the greater the share of imports accounted for by Ben Ali firms during his reign.
By contrast, evasion gaps increased more quickly in product lines in which public firms and offshore firms had been important importers.

Table 7. The Evolution of Evasion Gaps
The evolution of evasion gaps
Dependent variable: Log value gap (V), Log quantity gap (Q), Log price gap (P)

                                V          Q          P          V          Q          P
                                (1)        (2)        (3)        (4)        (5)        (6)
                                coef/se    coef/se    coef/se    coef/se    coef/se    coef/se
Post*Log (Tariff+1)             0.004      0.010      -0.013     -0.057**   -0.053*    -0.015
                                (0.022)    (0.024)    (0.012)    (0.026)    (0.03)     (0.016)
Post*Ben Ali %                  -0.051     0.120      -0.221*    -0.282     -1.123*    0.765*
                                (0.179)    (0.194)    (0.123)    (0.469)    (0.619)    (0.420)
Post*Offshore %                 0.045      -0.007     0.048*     -0.474**   -0.518**   -0.041
                                (0.062)    (0.068)    (0.029)    (0.230)    (0.248)    (0.119)
Post*Public %                   0.196      -0.116     0.366***   -0.805     -1.440     0.589*
                                (0.130)    (0.161)    (0.081)    (0.600)    (0.877)    (0.301)
Post*Ben Ali %*Log (Tariff+1)                                    0.076      0.382**    -0.283**
                                                                 (0.141)    (0.179)    (0.120)
Post*Offshore %*Log (Tariff+1)                                   0.150**    0.152**    0.025
                                                                 (0.063)    (0.068)    (0.032)
Post*Public %*Log (Tariff+1)                                     0.307*     0.397      -0.069
                                                                 (0.176)    (0.255)    (0.087)
Post*Log (NTB+1)                -0.028     -0.020     0.006      -0.022     -0.020     0.006
                                (0.021)    (0.024)    (0.012)    (0.022)    (0.025)    (0.012)
Simple controls                 Yes        Yes        Yes        Yes        Yes        Yes
Interacted controls                                              Yes        Yes        Yes
Country*year FE                 Yes        Yes        Yes        Yes        Yes        Yes
Country*product FE              Yes        Yes        Yes        Yes        Yes        Yes
N                               55933      54823      54823      55933      54823      54823
Number of products              1217       1217       1217       1217       1217       1217
R2                              0.524      0.237      0.358      0.525      0.500      0.358
Adjusted R2                     0.470      0.217      0.283      0.470      0.441      0.283

Simple controls: Ben Ali %, Offshore %, Public %, Log (NTB+1), Log (Tariff+1). Interacted controls: Ben Ali %*Log (Tariff+1), Offshore %*Log (Tariff+1), Public %*Log (Tariff+1). All controls are held constant at their last observed pre-Revolution levels in the post-2010 period.
Notes: Standard errors are clustered by product. ***p < 0.01, **p < 0.05, *p < 0.10.
The sample is confined to product-source-year combinations (i) in which imports reported in Tunisia and exports reported in partner countries are strictly positive and (ii) which were imported both before and after the revolution, (iii) from countries that are among the top fifteen source countries in terms of cumulative import value (either as declared by partners or as declared by Tunisian customs) over the period 2002–2009, and (iv) products that account for at least 0.01 percentage points of import value (either as declared by partners or as declared by Tunisian customs) over this period.
Source: Authors’ analysis based on data described in the text.

The interacted specifications, presented in columns 4–6, show that unit price gaps diminished especially rapidly with the presence of Ben Ali firms for products subject to higher tariffs. By contrast, quantity gaps appear to have increased for such products.

To summarize, after the Revolution, evasion gaps diminished, though not significantly, in product lines where Ben Ali firms had been dominant, whereas they increased significantly in other product lines. This led to a significant reduction (albeit at the 10% level) in the difference in average evasion gaps between previously Ben Ali–dominated product lines and other product lines. This reduction was driven by a significant reduction in price gaps in product lines subject to high tariffs where Ben Ali firms had been dominant.

V. Firm-Level Results

One drawback of testing for differential tariff evasion using evasion gaps is that such gaps are only observed at the product level, whereas the focus of this paper is on assessing the evasion propensities of different groups of firms.
The complementary firm-level testing strategy discussed in section II mitigates this drawback and assesses whether and, if so, how reported import values, quantities, and prices of the various types of firms vary differentially with tariffs; this is the objective of the first part of section V. The second part assesses the impact of firms becoming owned by connected entrepreneurs on the evolution of reported unit prices.

Differential Elasticities with Respect to Tariffs

The firm-source-product level regressions presented in table 8 mimic the product-level regressions, but the explanatory variables are modified to account for firm-level differences; the explanatory variables are the log of the tariff, firm-type dummies, and interactions between the firm-type dummies and the log of the tariff, as well as the log of the number of nontariff barriers. Onshore firms are the reference category. Dependent variables are, respectively, the log of total import value (columns 1 and 4), the log of import quantity (columns 2 and 5), and the log of import price (columns 3 and 6), measured at the firm-product-country-year level. The specifications presented in columns 1–3 control for country-year, product, and sector dummies, while those presented in columns 4–6 include product-country-sector-year dummies, which sets a high bar for identification but precludes separate identification of the impact of tariffs and nontariff measures, as these do not vary within product-source-years.

The results accord with conventional economic wisdom; import values reported by onshore firms (the reference category) decline significantly with tariffs (column 1), mostly because their reported import quantities tend to decrease as tariffs rise (column 2), although the latter association is not statistically significant. By contrast, their reported unit prices do not appear to vary with tariffs; the coefficient on the tariff in the unit price regression presented in column 3 is very close to zero.
The import demands of offshore firms are significantly less elastic with respect to tariffs, presumably because they are exempted from having to pay them. Other explanatory variables also have the expected sign; firms that produce higher levels of output import significantly more both in terms of reported value and quantities but do not pay significantly higher or lower unit prices.

Of focal interest, import values reported by Ben Ali firms do not appear more elastic with respect to tariffs than those of onshore and offshore firms, but their reported import prices decline significantly as tariffs rise. By contrast, their reported quantities are significantly less elastic with respect to tariffs than those of onshore firms. These results are robust to controlling for source-product-year-industry fixed effects (columns 4–6).

Overall, these firm-level results are consistent with tariff evasion through underreporting of unit prices by connected firms.

Table 8.
Elasticity of Reported Import Values, Quantities, and Prices with Respect to Tariffs at Firm-Product-Source Country Level
The elasticity of imports with respect to tariffs at the firm-product-source country level

Dependent variable:             Log V      Log Q      Log P      Log V      Log Q      Log P
                                (1)        (2)        (3)        (4)        (5)        (6)
                                coef/se    coef/se    coef/se    coef/se    coef/se    coef/se
Log (Tariff+1)                  -0.133*    -0.133     0.001
                                (0.073)    (0.098)    (0.059)
Ben Ali * Log (Tariff+1)        0.098      0.211***   -0.113**   0.092      0.194***   -0.102*
                                (0.062)    (0.078)    (0.057)    (0.071)    (0.074)    (0.054)
Offshore * Log (Tariff+1)       0.251***   0.152*     0.099      0.193*     0.042      0.150**
                                (0.066)    (0.081)    (0.055)    (0.081)    (0.092)    (0.060)
Public * Log (Tariff+1)         -0.070     0.223      -0.293**   0.127      0.562***   -0.434***
                                (0.144)    (0.183)    (0.129)    (0.170)    (0.140)    (0.141)
Ben Ali firm                    -0.279     -0.498*    0.218      -0.185     -0.425*    0.239
                                (0.226)    (0.289)    (0.190)    (0.242)    (0.249)    (0.156)
Offshore                        0.251***   0.152*     0.099*     -0.372     -0.009     -0.363*
                                (0.213)    (0.300)    (0.214)    (0.277)    (0.358)    (0.205)
Public                          1.168**    0.177      0.991**    0.633      -0.877*    1.510***
                                (0.515)    (0.651)    (0.460)    (0.664)    (0.499)    (0.495)
Log Y                           0.136***   0.142***   -0.006     0.157***   0.159***   -0.002
                                (0.027)    (0.029)    (0.016)    (0.022)    (0.024)    (0.016)
Log L                           0.100***   0.029      0.071**    0.099***   0.055*     0.044
                                (0.024)    (0.033)    (0.029)    (0.021)    (0.029)    (0.027)
Log Age                         -0.028     -0.009     -0.019     -0.037     -0.023     -0.014
                                (0.027)    (0.042)    (0.030)    (0.028)    (0.040)    (0.026)
Log (NTB+1)                     -0.039     -0.091     0.052
                                (0.057)    (0.072)    (0.045)
N                               48032      48032      48032      48032      48032      48032
Number of firms                 3052       3052       3052       3052       3052       3052
Sector FE                       Yes        Yes        Yes
Country*Year FE                 Yes        Yes        Yes
Product FE                      Yes        Yes        Yes
Product*Country*Sector*Year FE                                   Yes        Yes        Yes
R2                              0.363      0.577      0.703      0.471      0.672      0.790
Adjusted R2                     0.352      0.569      0.698      0.377      0.614      0.753

Notes: Standard errors are robust and clustered by firm. ***p < 0.01, **p < 0.05, *p < 0.10.
The sample is confined to (five-digit) sector-product-source combinations in which (i) at least one Ben Ali firm and at least one other firm of a different type are simultaneously importing during the same year, (ii) for which tariff data exist, (iii) in which imports reported in Tunisia and exports reported in partner countries are strictly positive, (iv) from countries that are among the top fifteen source countries in terms of cumulative import value (either as declared by partners or as declared by Tunisian customs) over the period 2002–2009, and (v) products that account for at least 0.01 percentage points of import value (either as declared by partners or as declared by Tunisian customs) over this period. In addition, (vi) observations that fall in the top and bottom 1% of normalized prices, quantities, and values are excluded. The sample comprises nine public firms, 1,787 onshore firms, 1,145 offshore firms, and 113 Ben Ali firms.
Source: Authors’ analysis based on data described in the text.

How costly was tariff evasion by connected firms? Accurately quantifying the costs of tariff evasion is challenging, but one admittedly crude yet conservative method to answer this question is to assume that unit price differentials between connected firms and the median import price reported by other private sector firms importing the same product from the same country at the same time are due to evasion. By multiplying that price differential by the quantity imported by connected firms and the tariff rate, one can arrive at an estimate of the additional tax loss associated with being connected (private firms on average evade; this measure allows us to assess how much more connected firms evaded than other firms).
Doing this and summing over all import transactions by connected firms in our database for which tariff data exist and for which there is at least one counterpart declaration by a private firm suggests that in 2009 alone, connected firms evaded approximately 217 million US dollars’ worth of taxes more than other firms would have. Over the period 2002–2009, they cumulatively evaded 1.2 billion US dollars’ worth of taxes more than other private firms would have.

These estimates are very conservative. If, instead of using median prices reported by other firms, the average price reported by other firms is used, the number rises to 2.6 billion USD. Also, this price discrepancy can only be calculated when tariff data are available and when at least one private sector firm is importing the product at the same time as Ben Ali firms; such observations account for less than half the value of all import transactions reported by connected firms between 2000 and 2009. Most importantly, this method only considers underreporting of prices, not other types of tax fraud; if Ben Ali firms were able to grant themselves exemptions, then they would not need to underreport, in which case they are likely to report higher prices than nonconnected firms. It also does not consider underreporting of quantities, including smuggling; that is, goods that are simply not declared to Tunisian customs do not feature in this calculation. Moreover, this calculation does not consider the indirect cost of tariff evasion in terms of inefficiency and a basic lack of transparency. The cost advantage that such tariff evasion endows connected firms with distorts investment incentives and undermines competition.
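The back-of-the-envelope calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names, argument names, and the numbers in the example are hypothetical, the tariff rate is expressed as a fraction of declared value, and negative price gaps are left unfloored because the text does not say how such cases were treated.

```python
from statistics import median


def excess_tax_loss(connected_price, other_prices, quantity, tariff_rate):
    """Tariff revenue forgone on one import declaration by a connected firm.

    connected_price: unit price declared by the connected firm
    other_prices:    unit prices declared by other private firms for the same
                     product, source country, and year (at least one required)
    quantity:        quantity imported by the connected firm
    tariff_rate:     ad valorem tariff as a fraction (0.25 means 25%)
    """
    if not other_prices:
        raise ValueError("need at least one counterpart declaration")
    benchmark = median(other_prices)         # median price of other private firms
    price_gap = benchmark - connected_price  # positive if the connected firm declares less
    return price_gap * quantity * tariff_rate


def total_excess_tax_loss(declarations):
    """Sum the loss over (connected_price, other_prices, quantity, tariff_rate) tuples."""
    return sum(excess_tax_loss(*d) for d in declarations)


if __name__ == "__main__":
    # Made-up numbers: the benchmark median is 10.0, so the gap is 2.0 per unit.
    print(excess_tax_loss(8.0, [9.0, 10.0, 12.0], quantity=1000, tariff_rate=0.25))
```

Swapping `median` for `statistics.mean` corresponds to the less conservative variant mentioned in the text, which benchmarks against the average price reported by other firms.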
During the Ben Ali era, expansion of the market share of connected firms was associated with reduced entry, higher exit, and stronger concentration (Rijkers et al. 2014b).

Do Firms Report Lower Prices after Becoming Connected?

The observation that import and unit prices reported by connected firms decline more rapidly than those of nonconnected firms could be an artifact of selection; for example, Ben Ali entrepreneurs may have bought firms that were more cost-effective or more likely to engage in evasion to start with. To assess whether this explains the patterns documented above, a difference-in-difference strategy comparing the evolution of unit prices of firms privatized to the Ben Ali family with other privatizations is used. Note that power is limited due to the relatively small number of privatizations (twenty-five) for which imports of the same product from the same source country are observed both pre- and postprivatization. The results are presented in table 9, with dummy variables indicating whether a firm was (i) privatized (in year t) and (ii) privatized to the Ben Ali family (in year t) as key explanatory variables. The dependent variable is the log unit price. All specifications include firm-source country-product fixed effects. Identification is thus based on comparing the evolution of prices reported by the same firm for the same product net of time-varying source country-product specific shocks. Column 1 presents estimates in which standard errors are clustered at the firm-product level, while in columns 2 and 3 standard errors are clustered at the firm level. Column 3 excludes one firm that was privatized to the Ben Ali clan and known to have made extensive use of duty suspension regimes and thus had limited incentives to evade tariffs by underreporting prices.

Privatizations to the Ben Ali family are associated with a decline in unit prices of approximately 18%, whereas privatizations per se are not.
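Because the dependent variable is the log unit price, the coefficients in table 9 are in log points, and reading them directly as percentages is an approximation that loosens as the coefficients grow. The short Python sketch below applies the standard exact conversion, 100 * (exp(b) - 1), to the two Ben Ali privatization coefficients reported in table 9; the helper name is a hypothetical choice for illustration.

```python
import math


def log_points_to_percent(coefficient):
    """Exact percentage change implied by a dummy coefficient in a log-linear model."""
    return 100.0 * math.expm1(coefficient)


if __name__ == "__main__":
    # Coefficients on Post-Privatization*Ben Ali Owned, copied from table 9.
    for column, coefficient in [("columns 1 and 2", -0.179), ("column 3", -0.556)]:
        print(f"{column}: {log_points_to_percent(coefficient):.2f}%")
```

With these inputs the implied declines in reported unit prices are roughly 16% and 43%, respectively.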
The price decrease associated with becoming connected is significant at the 5% level when standard errors are clustered at the firm-product level (column 1) but insignificant when standard errors are clustered at the firm level (column 2). Once one of the privatized Ben Ali firms that made extensive use of duty suspension regimes is excluded (column 3), however, the coefficient on Ben Ali privatizations drops considerably to -0.56 and becomes statistically significant at the 5% level, even when standard errors are conservatively clustered at the firm level. Thus, the results suggest that privatization per se was not on average associated with a reduction in reported unit prices, but privatization to politically connected entrepreneurs was, though it should be borne in mind that these results are based on a small number of privatizations.

Table 9. Import Prices: Before and after Being Privatized to the Ben Ali Family
The Evolution of Log Unit Prices Before and After Privatization
Dependent variable: log price

Standard errors clustered by:     Firm-product    Product         Product, excluding outlier firm
                                  (1)             (2)             (3)
                                  coef/se         coef/se         coef/se
Post-Privatization                -0.003          -0.003          -0.008
                                  (0.051)         (0.069)         (0.071)
Post-Privatization*Ben Ali Owned  -0.179**        -0.179          -0.556**
                                  (0.080)         (0.230)         (0.280)
Firm*Product*Source Country FE    Yes             Yes             Yes
Product*Source Country*Year FE    Yes             Yes             Yes
N                                 55452           55452           38409
Firms                             1016            1016            984
Privatized firms                  23              23              22
Privatized to the Ben Ali family  5               5               4
# products                        365             365             295
# firm*products                   9382            9382            6374
R2                                0.849           0.849           0.857
Adjusted R2                       0.794           0.794           0.794

Notes: Robust standard errors in parentheses. ***p < 0.01, **p < 0.05, *p < 0.10. The sample is confined to source-product-years in which (to be) privatized firms are importing, and source-product combinations imported by at least one privatized firm both at least once before and at least once after privatization. Competitor firms are included if they import at least twice from the same source country-product combination and operate in the same sector as (to be) privatized firms.
Source: Authors’ analysis based on data described in the text.

VI. Conclusion

While it is often assumed that politically connected firms are more likely to evade taxes, empirical examination of this hypothesis has been hampered by the difficulties associated with obtaining data on political connections and demonstrating tax evasion. Using unique data on firms with ownership connections to the Ben Ali family confiscated in the aftermath of Tunisia’s Jasmin Revolution, this paper documents evidence suggesting such politically connected firms were more likely to evade tariffs. To start, evasion gaps measured at the source country-product-year level were strongly correlated with the import share accounted for by firms owned by the family.
The correlation between the import share of connected firms and evasion gaps was especially strong for goods subject to high tariffs and was due to underreporting of prices. While misreporting of quantities was an important evasion tactic for all types of firms, the hypothesis that connected firms were no more or less likely than other firms to underreport quantities cannot be rejected.

Higher evasion gaps in product lines dominated by Ben Ali firms appear to be driven by their higher propensity to underreport prices; average unit prices reported by Ben Ali firms were lower than those reported by other firms importing the same product and, moreover, declined significantly faster with tariffs than those reported by nonconnected firms. In addition, privatizations to the Ben Ali family are associated with reductions in reported unit prices, whereas privatizations per se are not. Last but not least, after the ousting of Ben Ali, unit price gaps diminished especially rapidly in high-tariff product lines in which Ben Ali firms had been dominant.

The evidence suggests tariff evasion in Tunisia led to considerable fiscal losses but also resulted in substantial inequality and a lack of transparency. Politically connected entrepreneurs, who were well off, seem to have been especially likely to profit from tariff evasion. This endowed them with a cost advantage over compliant firms that was not based on efficiency or performance. According to conservative estimates, underreporting of unit prices alone enabled Ben Ali firms to evade at least 1.2 billion USD worth of import taxes on account of their connections between 2002 and 2009.

While the Jasmin Revolution has drastically diminished uncompetitive regulatory privileges enjoyed by the Ben Ali family, it has not put a halt to tariff evasion. On the contrary, tariff evasion in Tunisia has escalated since the Jasmin Revolution.

Appendix A.
Data Construction

| Variable | Description | Source |
|---|---|---|
| Firm types | | |
| Ben Ali firm | Dummy variable taking the value 1 if the firm is owned, fully or in part, by a member of the Ben Ali clan. | CC and MoF |
| Offshore firm | Dummy variable taking the value 1 if a firm is privately owned by entities other than the Ben Ali clan and operates in the tax regime “totalement exportatrice”, commonly referred to as the “offshore” sector. Firms in this tax regime do not have to pay output and import taxes, provided they export at least 70% of their output or sell it to other “offshore” firms. | INS |
| Onshore firm | Dummy variable taking the value 1 if a firm is privately owned and neither operates in the “totalement exportatrice” tax regime nor is owned by the Ben Ali clan. | INS |
| Public firm | Dummy variable taking the value 1 if a firm is state owned. | INS |
| Firm characteristics | | |
| Y | Output as reported in the firm’s annual tax declaration. | MoF |
| L | Number of salaried employees (annual average over four quarters). | INS |
| Age | The age of the firm, defined as the difference between the current year and the year in which it first registered. | INS |
| Sector | Classification of a firm’s main activity based on the Nomenclature d’Activités Tunisienne (NAT) 1996 five-digit classification, the most disaggregated sector classification available in Tunisia. | INS |
| Post Privatization | Dummy variable that takes the value 1 after a firm has been privatized and zero while the firm is still public or is never privatized. | MoF |
| Post-Privatization*Ben Ali Owned | Dummy variable that takes the value 1 after a firm has been privatized and is owned by the Ben Ali family, and zero while the firm is still public, is never privatized, or is privatized to owners not belonging to the Ben Ali clan. | MoF |
| Trade policy data | | |
| Tariff | Bilateral tariff between Tunisia and a partner country at year t for a certain product (HS6); for years in which such information is missing, the last year for which it is available is used. Tariff data are missing altogether for 2007 and 2009; for these years we use the values from 2006 and 2008, respectively. | WITS |
| Nontariff Barrier (NTB) | The sum of all import-related nontariff barriers at the product-year level. The raw data contain the year of creation of these nontariff measures, but not the year of their removal. | WITS |
| Trade | | |
| Exports | Exports of a certain product (HS6) from a specific country (s) at a year t. | COMTRADE |
| Imports | Imports of a certain product (HS6) from a specific country (s) at a year t. | COMTRADE |
| Evasion gaps | | |
| Value gap | Log export value reported by partner minus log import value reported in Tunisia, measured at the HS6-country-year level. | COMTRADE |
| Quantity gap | Log export quantity reported by partner minus log import quantity reported in Tunisia, measured at the HS6-country-year level. | COMTRADE |
| Price gap | Log export unit price reported by partner minus log import unit price reported in Tunisia, measured at the HS6-country-year level. | COMTRADE |
| Definition of tariff categories | | |
| High tariff | The tariff rate is greater than or equal to 36. | |
| Low tariff | The tariff rate is strictly smaller than 36. | |

INS = Institut National de la Statistique, MoF = Tunisian Ministry of Finance, CC = La Commission Nationale de Gestion d’Avoirs et des Fonds objets de Confiscation ou de Récupération.

The World Bank Economic Review, 31(2), 2017, 483–503
doi: 10.1093/wber/lhv060
Advance Access Publication Date: November 3, 2015
Article

Pension Coverage for Parents and Educational Investment in Children: Evidence from Urban China

Ren Mu and Yang Du

Abstract

When social security is established to provide pensions to parents, their reliance upon children for future financial support decreases, and their need to save for retirement also falls. In this study, the expansion of pension coverage from the state sector to the non-state sector in urban China is used as a quasi-experiment to analyze the intergenerational impact of social security on education investments in children. In a difference-in-differences framework, a significant increase in the total education expenditure is found to be attributable to pension expansion. The results are unlikely to be driven by other observable trends. They are robust to the inclusion of a large set of control variables and to different specifications, including one based on the instrumental variable method.

JEL classification: I29, J32

The old-age security motive for fertility is well recognized in the developing world. In countries where the capital market is underdeveloped and important social welfare institutions are lacking, individuals without insurance and old-age security have the incentive to invest in the future in the form of children (e.g., Cain 1983; Nugent 1987).1 As the transfer from children to parents and the survival of parents are both positively correlated with the level of children’s education (Cai et al. 2006; Lei et al. 2011; Lillard and Willis 1997; Zimmer et al.
2007), support in old age may also constitute a motive for providing children with human capital as part of an explicit or implicit intergenerational contract (Pollak 1988; Raut 1990).2 This old-age security motive for investment in children’s education is possibly more credible when the number of children a family can have is limited by strict family planning policies, like those implemented in China.

Ren Mu (corresponding author) is an associate professor at the Bush School of Government and Public Service, Texas A&M University, 4220 TAMU, College Station, TX 77843 USA; her email address is rmu@tamu.edu. Yang Du is a professor at the Institute of Population and Labor Economics, Chinese Academy of Social Sciences, 5 Jianguomennei Dajie, Beijing 100732, China; his email address is duyang@cass.org.cn. For very useful comments the authors would like to thank Yonghong An, Fang Cai, Jin Feng, Lisa Kahn, Quan Li, Xiaobo Lu, Lori Taylor, Jeffrey Wooldridge, and the seminar participants in Beijing University, Texas A&M University, Population Association of America Annual Meetings in 2013, the Chinese Economists Society 2013 Annual Meeting, as well as the journal’s editor and three referees. Feng Huang has provided excellent research assistance. A supplemental appendix to this article is available at https://academic.oup.com/wber.

1 Consequently, the growth in social security and other transfer payments to the elderly is believed to have contributed to the decline in fertility (e.g., Hohm 1975).
2 Even parents who leave sizable bequests and do not need support in old age from children can indirectly save for old age by investing in children’s education and then reducing bequests when elderly (Becker 1992).

© The Author 2015. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
In this case, parents’ choice is not on child quantity but on child quality.3

When social security is established to provide pensions to parents in old age, their reliance upon children for future financial support decreases. At the same time, their need to save for retirement also falls. The declining role of children in the support of elderly parents and the decreasing need of parents to save may simultaneously influence parents’ decisions to invest in their children’s education, but in opposite directions: the former implies a decline in the investment, whereas the latter may lead to an increase.

This paper is one of the first studies to assess the intergenerational impact of social security expansion on a family’s education expenditures on children. For identification purposes, a pension reform in urban China is used as a quasi-natural experiment. The reform entails an expansion of pension coverage from employees in the state sector to those in the non-state sector. Employees in the state sector, including those in government agencies, publicly financed social services (e.g., schools, youth organizations, and health care providers), and state-owned enterprises (SOEs), have long been covered by social security. However, until 1999, non-state employers were not required to enroll their employees in the public pension program (State Council 1999, 2000).

In the empirical analysis, a difference-in-differences framework is applied, in which trends in expenditures for the education of children who have a parent employed in the state sector are used to gauge counterfactual trends for what education expenditures would have been for children with both parents in the non-state sector.
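The difference-in-differences comparison described above can be sketched in a few lines of Python. This is only a minimal illustration of the two-group, two-period logic, not the authors' specification: the records, the variable names (`non_state`, `post`), and all expenditure values are made up for the example, and no control variables are included.

```python
from statistics import mean

# Hypothetical records: (non_state, post, education_expenditure).
# non_state = 1 if both parents work in the non-state sector (the group
# newly exposed to pension coverage), 0 if a parent is in the state sector.
# post = 1 for the later survey wave. All numbers are invented.
records = [
    (0, 0, 100.0), (0, 0, 120.0),
    (0, 1, 130.0), (0, 1, 150.0),
    (1, 0, 80.0), (1, 0, 100.0),
    (1, 1, 140.0), (1, 1, 160.0),
]

def cell_mean(rows, non_state, post):
    """Average expenditure within one sector-by-wave cell."""
    return mean(y for g, t, y in rows if g == non_state and t == post)

def did_estimate(rows):
    """Change for the non-state group minus change for the state group."""
    treated_change = cell_mean(rows, 1, 1) - cell_mean(rows, 1, 0)
    control_change = cell_mean(rows, 0, 1) - cell_mean(rows, 0, 0)
    return treated_change - control_change

print(did_estimate(records))
```

The state-sector change stands in for the trend the non-state group would have followed without the reform, which is why the credibility of the estimate rests on the two sectors sharing other trends.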
First, it is demonstrated that other trends, such as those in medical insurance, wages, bonus income, housing ownership and values, working hours, household income per capita, and household size, are indistinguishable between the two sectors. Moreover, it is illustrated that the determinants of an individual working in the state or non-state sector do not change over time. A statistically significant increase in family education expenditures attributable to the pension reform is then shown. The result is robust to the inclusion of a large set of control variables and to different specifications, including one based on the instrumental variable method.

This paper contributes to the knowledge of motivation for parental investment in children. In Becker’s classic model (1974), parents are altruistic.4 In later models of bequests (Bernheim et al. 1985) and parental investments in children (Raut 1990), parents are depicted as making strategic decisions based on expected service or transfers from children. Empirically, some studies support the altruistic motive hypothesis (e.g., McGarry and Schoeni 1995), while others question it or show a mixture of altruistic and exchange motives in parents (e.g., Cox 1990). This study provides new evidence in this inquiry.

This study also relates to the literature on the intergenerational effects of welfare reform on education. Studies of changes in the US national welfare system in the 1990s, aimed at promoting adult employment and reducing long-term dependence on public assistance, have generally yielded positive findings with regard to educational attainment in adolescents (Miller and Zhang 2012) and young children (Duncan and Chase-Lansdale 2001; Morris et al. 2005; Zaslow et al. 2002). These findings in the United States have advanced the understanding of the effects of social programs on education.
One open question remains, however, as to whether these findings can generalize to developing countries, where parents’ investment in their children’s human capital may be motivated by old-age security. This study provides evidence on this issue from a less-developed country.

3 As suggested by the influential quantity-quality model in Becker and Lewis (1973), parents may increase child quality when they have fewer children by allocating more resources to each child. This quality gain can happen in the absence of old-age motive.
4 Altruistic transfers from parents to children may also be made to offset inequality in children’s earnings (Behrman et al. 1982).

In addition, this study advances debate and discussion on the impact of pension provisions in developing countries in general and on pension reform in China in particular. Several studies analyzing the gender-specific impact of old-age pensions in South Africa found that pensions both reduced the labor supply of prime-age individuals and positively affected the nutritional status of children (Bertrand et al. 2003; Duflo 2003). In contrast, much of the discussion on China has focused on the limitations of the pension system and ways to improve its efficiencies (e.g., Li 2011; Salditt et al. 2008; Zhao and Xu 2002). Evidence of policy impacts, however, is lacking, with the exception of a recent study that relates higher household savings to pension reform in state-owned enterprises (Feng et al. 2011).

The remainder of the paper is organized as follows: Section I provides a review of the cultural and institutional background for elderly support in China. Section II presents a simple conceptual framework linking pension coverage for parents and their decisions for investing in their children’s education.
Section III introduces data and reports summary statistics. Section IV discusses the empirical approach, section V presents the results, and section VI concludes.

I. Background: Elderly Support and the Pension System in Urban China

According to Confucian teaching, filial piety is the root of all virtue. The practice of honoring parents, including providing material means of supporting them in old age, is therefore inextricably ingrained in the moral fiber of Chinese society. Since 1950, taking care of the elderly has become more than just an informal social norm: it has been formally codified as a legal obligation for children.5 Not surprisingly, family was, and still is, the primary provider of nursing care and financial support for the elderly.6

The traditional role of family in elder care is fraught with new challenges, largely the result of smaller family size and greater labor mobility. The Chinese “one-child” policy, effectively enforced during the past three decades, has reduced family size. Concerns have been raised about the viability of the family support system because the young can now go anywhere for job opportunities. Parents doubt that filial duty will remain a motivational force for their singletons, and they also worry about the burden of caring for four elderly parents placed upon young couples (Fong 2004).

Notwithstanding these concerns and the decline of multigenerational coresidence over time, several studies conclude that adult children in China are still responsive to the needs of their parents. Rural children will forgo migration opportunities and stay in their home villages when one or both of their elderly parents are sick (Giles and Mu 2007). Urban children provide more monetary transfers to parents when needed (Cai, Giles, and Meng 2006).
More recently, a study based on the new China Health and Retirement Longitudinal Study emphasizes that living arrangements are such that care from a child, either co-resident or in a nearby community, is readily available for more than 70% of the elderly (Lei et al. 2011).

Compared to family support, private saving is a secondary option for providing for the elderly. This option is likely to be viable only for high-income families because small savers are predominant among the current elderly. In 1995, the value of financial assets was less than half the annual earnings for 53% of the urban population aged fifty-five or older (Jackson and Howe 2004). The percentage of households with wealth-to-income ratios of more than two at retirement is predicted to remain small in the future (Takayama 2002). However, with the housing boom in cities since the early 2000s, housing is likely to become the biggest source of wealth for urban residents. While such assets may have limited liquidity, they can still be leveraged to finance consumption if needed.

5 The Marriage Law of 1950 states that children should support elderly parents, and the Constitution of 1954 emphasizes that children have a duty to support parents (see Fang, Wang, and Song 1992; World Bank 1994). The Marriage Law (2001) further emphasizes this responsibility and endows elderly parents with the right to sue children for support if they fail to provide assistance.
6 Under the mode of traditional filial piety in China, sons are responsible for the care of elderly parents. As shown in Ebenstein and Leung (2010), parents without a son are more likely to participate in old-age pension programs in rural areas.

The third source of old-age income is the social pension. Since the early 1950s, urban residents who worked in government, government-sponsored nonprofit institutions, and SOEs have been entitled to social security, including pension coverage.
Economic restructuring in SOEs eventually relieved the enterprises from the burden of providing social security for their employees. This change was made possible by several policy reforms. In 1991, following experiments in a number of provinces and municipalities, the State Council outlined the first major pension reform.7 The idea was to promote the integration of pensions for enterprise workers at the provincial level. The social pooling remained a “pay-as-you-go” (PAYG) system, in which mandatory contributions from enterprises and payroll taxes from workers were collected to pay retirees.

With concerns about PAYG system sustainability for a rapidly aging population, further reform of the pension system for enterprise workers was initiated in the mid-1990s.8 As a result, a mandatory system consisting of social pooling and individual accounts was established. The social pooling pillar, based on PAYG, and the individual accounts, purported to be fully funded, were financed with contributions from both individuals and employers.9 Voluntary pensions outside the mandatory system (including enterprise annuity plans, individual savings, and other pension plans organized by industries or localities) were also encouraged (Barr and Diamond 2010).

Although employees of SOEs were covered by the system described above, coverage for workers outside the state sector was very limited. To facilitate SOE reforms, as well as to improve the financial balance of the pension program, the next step in pension reform was to expand the pension pool and encourage contributions from employers and workers in the non-state sector (Zhao and Xu 2002).
In 1999, the State Council called for accelerating the inclusion of non-state enterprise workers in the pension pool (State Council 1999); non-state enterprises comprise collective, foreign-invested, private, and other types of enterprises. Since then, pension coverage in non-state enterprises has increased, even though it remains far lower than in the state sector due to non-compliance (Zhao and Xu 2002).

Public employees are covered by a different system that depends not on worker contributions but rather on benefits based on a short period of earnings at the end of a career (Barr and Diamond 2010). Compared to the pension program for enterprises, the system for public employment has undergone little change.

In sum, employees in the state sector, including SOEs and the public sector, have always been entitled to pension coverage, even though the structure of the pension program for SOE employees has gone through many changes. Only in the early 2000s were workers in non-state sectors brought into the public pension system.

II. The Link between Pension and Education Expenditure

Investment in children’s human capital is largely made by parents. As pointed out in Becker and Murphy (1988), even altruistic parents have to consider the trade-off between consumption and the human capital of children; parents must reduce their own consumption (including leisure) to acquire the time and resources they spend on childcare, education, training, and health.

7 See the 1991 State Council Resolution on the Reform of the Pension System for Enterprise Workers.
8 The two crucial policy documents are the 1995 State Council Circular on Deepening the Reform of the Old-Age Pension System for Enterprise Employees and the 1997 State Council Document No. 26.
9 In 2001, individual contribution rates in Fuzhou, Shanghai, Shenyang, Wuhan, and Xi’an, the five cities under study in this paper, were 2%, 6%, 8%, 5%, and 6%, respectively. In 2005, they all reached 8%.
While individual contributions were phased in during 2001–2005, employer contributions decreased slightly in three cities (Shanghai, Shenyang, and Wuhan). In 2005, employer contribution rates varied across cities from 20% in Shenyang and Xi’an to 25% in Fuzhou.

Consider a simple two-period model in which parents have a time-separable concave utility function defined over their family consumption in period one ($C_1$) and over their own consumption and their child’s level of well-being in period two ($C_2$ and $V$):

$$U_1(C_1) + \beta U_2(C_2, \alpha V(e)) \qquad (1)$$

where $\beta \in (0,1)$ is the discount factor and $\alpha$ is the weight the parents place on the child’s welfare. The child’s level of well-being is assumed to be a function of his or her education ($e$), and $C_1 \ge 0$, $C_2 \ge 0$, and $e \ge 0$.

In period one, the parents use their monetary income ($y$) for family consumption, child education ($e$), mandatory pension contributions ($\bar{m}$), and saving ($s$). In period two, their consumption is determined by their saving, pension income ($P$), and net transfer from their child ($T$). Assume the return on saving is $R$. Let the pension income be based upon the mandatory contributions ($\bar{m}$), augmented by a parameter ($\tau$), mainly to capture the fact that employers contribute to the pension fund as well. It follows that $P = \tau \bar{m}$ and $\tau > R$. The transfer is a linear function of the child’s education, that is, $T = \gamma e$.

The parameter $\gamma$, assumed to be exogenous in the model, essentially measures parents’ perceived private return to education and can be positive, zero, or negative. The case $\gamma \le 0$ is not unlikely to occur for two reasons. First, educated young people have higher labor mobility and thus may be less attached to their parents, giving less old-age support, particularly in terms of instrumental care.
Second, being an only child as a result of the one-child policy is associated with being less trustworthy and less conscientious (Cameron et al. 2013). Higher mobility of the more educated and the behavior traits of the young may leave parents with less expectation for old-age support from their children, resulting in $\gamma$ possibly being zero or even negative.

The budget constraints for parents are

$$C_1 + e + s + \bar{m} = y \qquad (2)$$

and

$$C_2 = Rs + T + P \qquad (3)$$

The maximization solution yields the following equations:

$$\beta\gamma U_2^{1} + \beta U_2^{2} \alpha V' = U_1' \qquad (4)$$

$$U_2^{2} \alpha V' + \gamma U_2^{1} = R U_2^{1} \qquad (5)$$

where the partial derivatives of the utility functions are denoted by superscripts. The intertemporal first-order condition in equation (4) dictates that the trade-off between $C_1$ and $e$ is such that the utility loss of a one-unit reduction in $C_1$ is equal to the present value of the utility gains from a one-unit increase in $e$ for time two. The utility gains include the net gain from the consumption change due to transfer ($\beta\gamma U_2^{1}$) and the gain due to an increase in the child’s welfare ($\beta U_2^{2} \alpha V'$). Similarly, equation (5) shows the utility trade-off between saving and education expenditure.

The effect of expanding pension coverage (a change in $\bar{m}$) on $e$ is studied by applying the implicit function theorem to equation (4) and obtaining the following comparative statics:

$$\frac{de}{d\bar{m}} = -\frac{\beta\alpha\tau U_2^{12} V' + \beta\gamma\tau U_2^{11} + U_1''}{2\beta\alpha\gamma U_2^{12} V' + \beta\alpha^2 U_2^{22} V' V' + \beta\alpha U_2^{2} V'' + \gamma^2 \beta U_2^{11} + U_1''} \qquad (6)$$

In this equation, $U_2^{12}$ denotes how parents’ marginal utility of consumption changes with a child’s welfare. It is reasonable to assume it is nonnegative ($U_2^{12} \ge 0$). Given the concavity of the utility functions, it is clear that $U_2^{11} < 0$, $U_1'' < 0$, $U_2^{22} < 0$, and $V'' < 0$. With this general framework, three cases are examined, each of which uniquely defines parental motives in investing in the education of their child:

Case 1. Parents only have exchange motives.
The child’s well-being carries zero weight in the parents’ utility function ($\alpha = 0$), but, if possible, the parents will rely on the child’s transfer for consumption in the later period ($\gamma > 0$). Given equation (5), it is easy to see that when the perceived return to education is not as high as the market interest rate ($\gamma < R$), parents will not invest in their child’s education, that is, $e = 0$. This decision will not be affected by pension coverage. When $\gamma > R$, parents would choose to invest in their child’s education ($e > 0$). The comparative statics in equation (6) reduce to:

$$\frac{de}{d\bar{m}} = -\frac{\beta\gamma\tau U_2^{11} + U_1''}{\gamma^{2}\beta U_2^{11} + U_1''} \tag{7}$$

Therefore, it follows that $de/d\bar{m} < 0$. This prediction implies that if nonaltruistic parents invest in their child’s education at all, access to future pension income would reduce their education expenditure.

Case 2. Parents are altruistic and have no exchange motives. They value their child’s well-being ($\alpha > 0$) but don’t expect to receive a positive transfer from the child in period two ($\gamma \le 0$). In this case, the sign of $de/d\bar{m}$, as given in equation (6), is undetermined. The model thus cannot predict how education spending changes with the expansion of pension coverage if parents act purely altruistically.

Case 3. Parents care about their child’s welfare ($\alpha > 0$), but they also depend on the child for consumption in period two ($\gamma \ge 0$). The change in $e$ with respect to the change in $\bar{m}$ is the same as given in equation (6). Hence the sign of $de/d\bar{m}$ is also undetermined.

In summary, the model unambiguously predicts that nonaltruistic parents decrease education expenditure for their child in response to new pension availability. Parents with altruistic motives may either increase or reduce the education expenditure.
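The Case 1 prediction can be checked by hand. The steps below are a sketch that uses only the symbols already defined in the model, and they assume that saving $s$ is held fixed when $\bar{m}$ changes, which is what applying the implicit function theorem to equation (4) alone appears to imply.

```latex
% Case 1: \alpha = 0, so equation (4) reduces to
\beta\gamma\,U_2^{1}(C_2) = U_1'(C_1),
\qquad C_1 = y - e - s - \bar{m}, \quad C_2 = Rs + \gamma e + \tau\bar{m}.

% Differentiate both sides with respect to \bar{m}, holding s fixed:
\beta\gamma\,U_2^{11}\Bigl(\gamma\,\frac{de}{d\bar{m}} + \tau\Bigr)
  = -\,U_1''\Bigl(\frac{de}{d\bar{m}} + 1\Bigr)

% Collect the terms in de/d\bar{m}:
\frac{de}{d\bar{m}}\,\bigl(\gamma^{2}\beta U_2^{11} + U_1''\bigr)
  = -\bigl(\beta\gamma\tau U_2^{11} + U_1''\bigr)

% With \gamma > 0, \tau > 0, U_2^{11} < 0 and U_1'' < 0, both brackets are
% negative, so their ratio is positive and de/d\bar{m} < 0, as in equation (7).
```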
Even though the model doesn’t make definite predictions regarding the behavior of altruistic parents, it clearly predicts that parents with exchange motives only will decrease education expenditure in response to pension coverage expansion. If empirical evidence shows that education expenditure increases as a result of parents getting pension coverage, it would indicate that parents must have altruistic motives when investing in their child’s education.

III. Data and Summary Statistics

The data used in this paper are from the China Urban Labor Survey I and II (CULS1 and CULS2), conducted in 2001 and 2005, respectively, by the Institute for Population and Labor Economics at the Chinese Academy of Social Sciences. The survey covers five major cities: Fuzhou, Shanghai, Shenyang, Wuhan, and Xi’an. The proportional population sampling approach is used for both waves. Within each city, an average of ten registered urban households in each of seventy and fifty neighborhood clusters (shequ) were surveyed in 2001 and 2005, respectively.10 Each household head was asked questions about family members. Family members above age sixteen who were no longer in school were interviewed individually. The sample includes 3,499 households from the 2001 survey and 2,505 households from the 2005 survey.

10 The number of neighborhood clusters drawn is proportional to the population size of street districts (jiedao). On average, three neighborhood clusters are randomly sampled for each street district.

In the final analysis sample, which contains observations of children aged 1–18, there are 988 observations in 2001 and 850 in 2005. About 63% of young children aged 1–5 are enrolled in daycare. The school enrollment rate is 98% for children aged 6–15, and 95% for the 16–18 group.
With such high enrollment rates, it would be difficult to detect an impact on education, if any, based on extensive measures of school outcomes such as enrollment or dropout rates. Intensive measures of education input, that is, the actual amount of money spent on education and related activities for each child, offer more variation for examining the impact on investment in education. For this purpose, an education expenditure module is included in the household survey, which collects information on tuition, school fees, and expenditures on interest classes (xingqu ban, instructional programs for a variety of subjects ranging from Olympic math and English to chess and dancing) for each child who is either in daycare or in school.

Yearly education expenditures did not change much between 2001 and 2005, with an average of about 2,600 yuan (table 1).11 Tuition and school fees averaged about 2,110 yuan and accounted for 81% of total education expenditures in 2001; the amount was 1,935 yuan in 2005, and its share in total expenditure dropped to 77%. The decrease in tuition and school fees was likely driven by policy, namely the “one fee system” adopted in fall 2004 by three cities (Shanghai, Wuhan, and Shenyang) and designed to curb excessive fee collection by public schools.12 Under this system, public primary and junior high schools in 2005 could charge students only once, for two items: school fees, and textbook and notebook fees. The fee amount was not set by schools but determined uniformly by the Education Bureau in each province, based on a formula adjusted to school levels. Even though the school fees regulated and set by the government decreased, the amount of money spent annually on tutors or interest classes and extracurricular activities increased from 486 yuan to 591 yuan.
In addition, proportionally more children had tutors or engaged in extracurricular activities in 2005 (54%) than in 2001 (49%). In light of the income increase during this period,13 more spending on these educational activities outside of school is not surprising. Besides the income effect, additional spending may also reflect a peer effect: with only one child, parents “compete” to invest in their children in order to enhance their labor and marriage market competitiveness (Wei and Zhang 2011). In addition, both elite urban private schools and the public embrace the belief that extracurricular activities can help foster self-confidence in children and stimulate their interest in learning (Lin 2007).

With regard to pension coverage for parents, 55% of the mothers and 67% of the fathers were enrolled in an employer-based pension program in 2001, increasing to 61% and 70%, respectively, in 2005. More parents worked in the non-state sector in 2005 than in 2001, with a significant increase in private enterprises.

The samples in the two survey rounds are very similar with respect to child age and gender. The children are about twelve years old on average, and half of them are girls. The parent samples are also comparable over the two years in terms of age and years of schooling. Parents in the 2005 sample appear to be less likely to have had a rural Hukou at age sixteen14 and are more likely to live in the current city.

11 The exchange rate was 8.27 yuan per US dollar. The expenditure of 2,600 yuan is equal to 314 US dollars.

12 This practice was a mandate following a regulation jointly issued by the Ministry of Education, the National Development and Reform Committee, and the Ministry of Finance in March 2004.
The regulation document was titled “Opinions Regarding Implementing the ‘One Fee System’ during the Mandatory Schooling Phase Nationally.” This document can be accessed through the website of the Ministry of Education at: http://202.205.177.9/edoas/website18/48/info21948.htm (accessed March 1, 2012).

13 The average household income in the sample was 29,029 yuan (3,510 US dollars) in 2001 and 32,829 yuan (3,970 US dollars) in 2005.

14 Hukou status (rural vs. urban) at age sixteen may account for variations in the quality of education the parents had, because urban schools are generally better than rural schools (e.g., Paine and Fang, 2007). The place of residence at age sixteen, together with the current Hukou status, captures whether parents have local roots and thus have access to a family support system locally.

Table 1. Summary Statistics

| | 2001 Mean | 2001 SD | 2005 Mean | 2005 SD |
|---|---|---|---|---|
| Educational expenditures | | | | |
| Total education expenditure | 2596.215 | (3364.62) | 2529.50 | (2591.22) |
| - Tuition and fees | 2109.75 | (3041.66) | 1935.47 | (2267.26) |
| - Tutors and interest classes | 486.465 | (1455.73) | 590.77 | (1114.30) |
| Has a tutor or enrolled in an interest class | 0.54 | (0.50) | 0.49 | (0.50) |
| Parent pension coverage | | | | |
| Mother with pension coverage | 0.55 | (0.48) | 0.61 | (0.49) |
| Father with pension coverage | 0.674 | (0.42) | 0.70 | (0.46) |
| Parent employment type | | | | |
| Mother in non-state sector | 0.448 | (0.50) | 0.612 | (0.49) |
| - Collective enterprises | 0.093 | (0.29) | 0.056 | (0.23) |
| - Private enterprises | 0.334 | (0.47) | 0.482 | (0.50) |
| - Foreign enterprises | 0.024 | (0.15) | 0.057 | (0.23) |
| Mother is self-employed | 0.021 | (0.14) | 0.075 | (0.26) |
| Father in non-state sector | 0.369 | (0.48) | 0.497 | (0.50) |
| - Collective enterprises | 0.085 | (0.28) | 0.043 | (0.20) |
| - Private enterprises | 0.253 | (0.44) | 0.363 | (0.48) |
| - Foreign enterprises | 0.025 | (0.16) | 0.051 | (0.22) |
| Father is self-employed | 0.031 | (0.17) | 0.091 | (0.29) |
| Child characteristics | | | | |
| Age | 12.615 | (4.09) | 11.543 | (5.12) |
| Gender | 0.505 | (0.50) | 0.508 | (0.50) |
| Parent characteristics | | | | |
| Mother’s age | 39.463 | (5.31) | 39.087 | (6.25) |
| Father’s age | 41.898 | (5.25) | 41.617 | (6.13) |
| Mother’s years of schooling | 11.181 | (2.70) | 11.595 | (2.50) |
| Father’s years of schooling | 11.597 | (2.97) | 11.798 | (2.60) |
| Mother had a rural Hukou before age 16 | 0.19 | (0.39) | 0.141 | (0.34) |
| Father had a rural Hukou before age 16 | 0.145 | (0.35) | 0.111 | (0.31) |
| Mother didn’t live in this city before age 16 | 0.258 | (0.44) | 0.145 | (0.35) |
| Father didn’t live in this city before age 16 | 0.202 | (0.40) | 0.114 | (0.32) |
| Mother has a local Hukou | 0.926 | (0.26) | 0.92 | (0.27) |
| Father has a local Hukou | 0.938 | (0.24) | 0.94 | (0.25) |
| Maternal grandmother’s years of schooling | 4.586 | (4.44) | 5.231 | (4.04) |
| Maternal grandfather’s years of schooling | 3.995 | (4.23) | 4.873 | (4.24) |
| Paternal grandmother’s years of schooling | 6.798 | (4.53) | 7.306 | (4.71) |
| Paternal grandfather’s years of schooling | 6.291 | (4.61) | 6.966 | (4.61) |
| Number of observations | 988 | | 850 | |

Source: Authors’ analysis based on data described in the text.

In both years, 92–94% of the parents have a local Hukou of the city they live in. The differences in parent characteristics are not statistically significant between the two samples.

The locally weighted regression lines for rates of pension coverage show that the coverage rates for the state sector exhibit little systematic change from 2001 to 2005 (figure 1). They average around 80%, with older employees having higher coverage rates. For the non-state sector, the average rates jump from about 40% to a little over 60%. The increase applies to employees of all ages. It also applies to workers in all types of enterprises in the non-state sector (figure 2). On average, workers in foreign-invested firms had the highest pension coverage, 64%, to begin with, and it further increased to 79% in 2005. Workers in private enterprises enjoyed the largest increase in pension coverage, from about 20% to 70%.
A moderate increase from 62% to 70% is observed among workers in collective enterprises.

Figure 1. Pension Coverage by Sector and Year
Source: CULS1 and CULS2. State sector includes government agencies, publicly financed social services (e.g., schools, youth organizations, and health care providers), and state-owned enterprises (SOEs). The non-state sector includes collectively owned or controlled enterprises, privately owned or controlled enterprises (with 8 or more employees), and foreign-owned or controlled enterprises.

Figure 2. Change in Pension Coverage for the Non-state Sector
Source: CULS1 and CULS2.

IV. Empirical Framework

As outlined in the conceptual framework, pension availability may have opposing effects on investments in children’s education; therefore, its net effect is inherently an empirical question. In this section, the empirical strategies for identification are explained, followed by a discussion of the data.

Main Identification Strategy

Because the pension coverage rate was never close to 100%, some effects may have operated through the expectation of being able to enroll in the public pension system, such as reductions in savings and increases in expenditures for those in the non-state sector during the period. The focus here is therefore on reduced-form (intent-to-treat) estimates. The net effect of pension expansion on parents’ investments in their children’s education is estimated in a difference-in-differences (DD) framework:

$$E_{ict} = \beta_1 \mathit{NonState}_{ict} + \beta_2 \mathit{Year2005} + \beta_3 \mathit{NonState}_{ict} \times \mathit{Year2005} + X'_{ict}\beta_4 + H'_{ict}\beta_5 + D_{c \times t} + \varepsilon_{ict} \tag{8}$$

where $E_{ict}$ denotes the education investments in child $i$ in city $c$ at time $t$.
The variable $\mathit{NonState}_{ict}$ is a binary variable indicating whether both parents are currently or were last employed in the non-state sector.15 The post-pension-reform trend ($\mathit{Year2005}$) is estimated using children whose parent(s) are employed in or retired from the state sector. The treatment effect of pension reform is measured by the coefficient ($\beta_3$) on the interaction between $\mathit{NonState}_{ict}$ and the post-reform trend. $X'_{ict}$ is a vector of child characteristics, including age, gender, and school level.16

A vector of parental characteristics and family background is denoted by $H'_{ict}$. The inclusion of parents’ family background as controls is important in mitigating a potential omitted variable bias in the estimation. For example, if the academic ability of a child is positively correlated with the amount of investment made by parents, an upward bias would result if ability is inheritable and high-ability parents are more likely to work for employers who enroll in the public pension program. In addition, parents’ local roots may be related to which sector they work in and to their child care decisions. Therefore, parents’ Hukou status and residence location at the age of sixteen, their current Hukou status, and their own parents’ education are all included as control variables, in addition to their age and education. The city-specific year dummies ($D_{c \times t}$) are included to control for city-specific policy changes during this period. For example, the contribution rates for pensions changed during this time and varied across cities. The idiosyncratic error term is denoted by $\varepsilon_{ict}$.
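Without covariates, the coefficient on the interaction term in equation (8) equals the 2001–2005 change in the mean outcome for non-state families minus the same change for state families. The minimal Python sketch below illustrates that calculation. The function name `dd_estimate`, the tuple layout, and all numbers are hypothetical and are not taken from CULS; the sketch leaves out the controls, the city-specific year dummies, and the clustered standard errors used in the paper.

```python
from statistics import mean


def dd_estimate(rows):
    """Difference-in-differences from four cell means.

    rows: iterable of (nonstate, year2005, outcome) tuples, where the
    first two entries are 0/1 indicators (hypothetical layout).
    """
    cells = {}
    for nonstate, year2005, outcome in rows:
        cells.setdefault((nonstate, year2005), []).append(outcome)
    m = {key: mean(values) for key, values in cells.items()}
    change_nonstate = m[(1, 1)] - m[(1, 0)]  # change over time, non-state
    change_state = m[(0, 1)] - m[(0, 0)]     # change over time, state
    return change_nonstate - change_state


# Made-up log education expenditures, two children per cell.
rows = [
    (0, 0, 7.9), (0, 0, 8.1),  # state sector, 2001
    (0, 1, 8.2), (0, 1, 8.4),  # state sector, 2005
    (1, 0, 7.6), (1, 0, 7.8),  # non-state sector, 2001
    (1, 1, 8.0), (1, 1, 8.2),  # non-state sector, 2005
]

beta3 = dd_estimate(rows)
print(round(beta3, 3))
```

In this made-up example the non-state mean rises by 0.4 log points and the state mean by 0.3, so the DD estimate is about 0.1.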
15 The non-state sector includes three types of enterprises: collectively owned or controlled enterprises, privately owned or controlled enterprises (with 8 or more employees), and foreign-owned or controlled enterprises. The expansion of the urban employee pension program applied to employees in these types of enterprises. The state sector includes government agencies, social service units (e.g., schools, youth organizations, and health care providers), and state-owned or controlled enterprises. We exclude observations with two self-employed parents. For children with one self-employed parent, the sector is defined by the sector of the parent who is not self-employed.

16 The vast majority of the children are the only child of their family. In our sample, the average number of siblings is 0.084. Given the small variation in the number of siblings, we don’t include this variable in the regressions. But the results are robust to including the variable.

17 We have access to 20% of the observations in the total sample, and they account for 0.95% of the Chinese population.

An Alternative Identification

To provide additional robustness checks on the estimation results of equation (8), the 2005 Chinese National Intercensal Population Survey17 is used in addition to CULS, and the instrumental variable method is applied to estimate the following equation:

$$E_{ic} = \gamma_1 \mathit{Pension}_i + X'_{ic}\gamma_2 + H'_{ic}\gamma_3 + D_c + \varepsilon_{ic} \tag{9}$$

in which pension coverage for the parents of child $i$, denoted by $\mathit{Pension}_i$, is directly included as a covariate. It is a binary variable equal to 1 if at least one parent has enrolled in the pension program, and 0 otherwise. Two instrumental variables for $\mathit{Pension}_i$ are constructed: the predicted probabilities of enrolling in the public pension program for the mother and the father.
The predicted probabilities are the average enrollment rates for each age/education/gender/employer type18/city cell calculated from the 2005 population survey. This instrument varies by age, education, gender, employer ownership type, and city, and each of these factors is controlled for linearly in equation (9). The identification comes only from their interactions, which are assumed to be exogenous to parental decisions on investment in education.19 One obvious disadvantage of this method in this study is that only the 2005 data can be used, with a much smaller sample; therefore, this approach is used only as a robustness check.

V. Results

The SOE reforms that de-linked social services from individual employers also changed other employee benefits. Therefore it is important to first check that trends in observable factors other than pension coverage do not differ between the two sectors.

Trends in Related Employee Benefits and Other Variables in the Two Sectors

The medical reform in 1997 involved establishing a unified “urban employee basic health insurance scheme” and widening coverage of health insurance for the urban employed (Liu 2002). If the effects of the SOE reforms lasted until the study period, then it is possible that employment benefits other than pensions evolved differently in the two sectors during 2001–2005. Further, differing expectations of employers across the state and non-state sectors in terms of hours of work may affect how much time parents can invest in after-school programs and activities. The availability and cost of childcare facilities may also be different for employees of the two sectors, which may drive differential needs for grandparents as care givers.
If any of these factors exhibits different trends over time in the two sectors, the identification assumption, namely that, conditional on covariates, pension coverage is the only systematic factor that has a differential impact on children who have parents in the non-state sector, will be violated. Although this assumption is not directly testable, the same double-difference framework can be used to examine whether the trends in other social policies, income measures, and household compositions during this period are indeed indistinguishable between the two sectors. The following equation is estimated to examine these trends:

$$Y_{ict} = \alpha_1 \mathit{NonState}_{ict} + \alpha_2 \mathit{Year2005} + \alpha_3 \mathit{NonState}_{ict} \times \mathit{Year2005} + Z'_{ict}\alpha_4 + D_{c \times t} + \varepsilon_{ict} \tag{10}$$

where $i$ indexes the working-age individual and $Y_{ict}$ stands for the aforementioned outcomes: medical insurance, monthly wage, yearly bonus, housing ownership, housing value, hours of work, household income per capita, household size, and whether the individual lives with parents. The vector $Z'_{ict}$ contains covariates including the individual’s age and years of schooling. A small and insignificant estimate of $\alpha_3$ supports the view that other trends are not systematically different in the state and non-state sectors.

18 There are eight categories for employer types in the 2005 population survey: land contractor, government agencies and social service units, state-owned or controlled enterprises, collective enterprises, the self-employed, private enterprises, other enterprises, and others. We exclude the land contractor category and combine “other enterprises” and “others” in the calculation.

19 Others have used prevailing averages as instrumental variables in previous studies on health insurance in the US (e.g., Cutler and Gruber 1996; Gruber and McKnight 2003).

Estimations for working-age women and men are separately presented in table 2.
Both women and men in the non-state sector are more likely to be enrolled in the pension program in 2005 relative to 2001, compared to their counterparts in the state sector. But the trends in medical insurance are indistinguishable between the two sectors. Non-state sector employees in general have lower wages, bonuses, and household income per capita, but the changes over time in these outcomes are similar in the two sectors. Both housing ownership and housing values have increased substantially over time, but the same trends apply to both sectors.20 Employees in the non-state sector work longer hours than those in the state sector, but the difference in working hours stays similar in these two years. Lastly, there is no differential trend in either household size or the probability of living with a parent.

Table 2. Estimations of Pension Coverage and Potential Confounding Variables

| | Pension | Medical insurance | Monthly wage (log) | Yearly bonus (log) | Housing ownership |
|---|---|---|---|---|---|
| Panel A. Women (age 24–55) | | | | | |
| Non-state sector × year 2005 | 0.191*** (0.055) | 0.141 (0.096) | 0.115 (0.078) | -0.046 (0.049) | -0.060 (0.101) |
| Non-state sector | -0.265*** (0.038) | -0.393*** (0.080) | -0.125*** (0.040) | -0.927*** (0.216) | 0.030 (0.070) |
| Year 2005 | 0.035 (0.023) | -0.030 (0.037) | 0.182*** (0.042) | 0.119 (0.345) | 0.283*** (0.066) |
| Number of observations | 4,156 | 4,156 | 4,156 | 4,156 | 4,156 |
| Panel B. Men (age 24–60) | | | | | |
| Non-state sector × year 2005 | 0.271*** (0.048) | 0.023 (0.078) | 0.051 (0.089) | 0.033 (0.055) | -0.030 (0.102) |
| Non-state sector | -0.391*** (0.036) | -0.344*** (0.067) | -0.024 (0.029) | -1.086*** (0.254) | -0.027 (0.054) |
| Year 2005 | 0.009 (0.015) | 0.015 (0.034) | 0.259*** (0.023) | 0.058 (0.450) | 0.278*** (0.064) |
| Number of observations | 4,547 | 4,547 | 4,547 | 4,547 | 4,547 |

| | Housing value (log) | Working hours | Household income per capita (log) | Household size | Living with parent(s) |
|---|---|---|---|---|---|
| Panel A. Women (age 24–55) | | | | | |
| Non-state sector × year 2005 | -0.489 (0.558) | 0.035 (0.027) | 0.063 (0.054) | 0.047 (0.141) | 0.047 (0.046) |
| Non-state sector | 0.121 (0.398) | 0.053*** (0.016) | -0.172*** (0.031) | 0.002 (0.106) | 0.013 (0.031) |
| Year 2005 | 8.145*** (0.298) | 0.020** (0.008) | -0.029 (0.030) | 0.148* (0.078) | 0.132*** (0.020) |
| Number of observations | 4,156 | 3,059 | 4,153 | 4,156 | 4,156 |
| Panel B. Men (age 24–60) | | | | | |
| Non-state sector × year 2005 | -0.336 (0.761) | 0.042 (0.027) | 0.076 (0.061) | -0.074 (0.136) | -0.052 (0.047) |
| Non-state sector | -0.017 (0.325) | 0.055*** (0.014) | -0.074** (0.030) | 0.017 (0.085) | 0.050 (0.032) |
| Year 2005 | 8.269*** (0.297) | 0.038*** (0.010) | 0.015 (0.033) | 0.136* (0.082) | 0.087*** (0.024) |
| Number of observations | 4,547 | 3,801 | 4,539 | 4,547 | 4,547 |

Notes: Standard errors are in parentheses. Self-employed individuals (including family workers) and individuals working in firms with fewer than eight employees are not included in the analysis. Other variables included but not reported are age, years of schooling, city fixed effects, and city-specific time dummies. ***significant at 1% level; **significant at 5% level; *significant at 10% level.
Source: Authors’ analysis based on data described in the text.

The evidence presented in table 2 doesn’t suggest that factors other than pension coverage have no impact on education investment. Instead it shows that trends in those factors do not differ between the two sectors.
Hence, it is unlikely that any differential changes in education investment over time between parents working in the two sectors would be caused by the outcomes examined in table 2.

Main Results from the DD Estimations

The estimates of the overall effects of pension reform on education expenditure are reported in table 3. The top panel contains results for the logarithm of the total education expenditure, and the lower panel shows whether or not the child had tutors or participated in extracurricular classes.

Table 3. Double-Difference Estimations of Educational Investment for Children Age 1–18

| | (1) | (2) | (3) | (4) | (5) |
|---|---|---|---|---|---|
| Panel A: Total education expenditures (log) | | | | | |
| Parents in non-state sector × year 2005 | 0.177 (0.112) | 0.106*** (0.040) | 0.085* (0.046) | 0.109** (0.050) | 0.115** (0.057) |
| Parents in non-state sector | -0.328*** (0.095) | -0.299*** (0.036) | -0.190*** (0.043) | -0.165*** (0.044) | -0.155*** (0.053) |
| Year 2005 | 0.296*** (0.046) | 0.314*** (0.046) | 0.339*** (0.053) | 0.303*** (0.064) | 0.274*** (0.074) |
| Panel B: Has tutors or enrolled in interest classes | | | | | |
| Parents in non-state sector × year 2005 | 0.058 (0.041) | 0.063 (0.041) | 0.081** (0.037) | 0.065** (0.033) | 0.065** (0.031) |
| Parents in non-state sector | -0.122*** (0.032) | -0.121*** (0.033) | -0.134*** (0.028) | -0.063*** (0.019) | -0.056*** (0.016) |
| Year 2005 | 0.215*** (0.030) | 0.203*** (0.031) | 0.210*** (0.034) | 0.225*** (0.041) | 0.203*** (0.047) |
| City-specific year dummies | Yes | Yes | Yes | Yes | Yes |
| Child age, gender, and school level | No | Yes | Yes | Yes | Yes |
| Age and education of parents | No | No | Yes | Yes | Yes |
| Parent current Hukou status & Hukou and residential location at age 16 | No | No | No | Yes | Yes |
| Grandparents’ education | No | No | No | No | Yes |
| Number of observations | 1,838 | 1,838 | 1,838 | 1,838 | 1,838 |

Notes: Standard errors are in parentheses. ***significant at 1% level; **significant at 5% level; *significant at 10% level. Standard errors are clustered by city, year, and employers’ ownership type.
Source: Authors’ analysis based on data described in the text.
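The outcome in Panel A of table 3 is in logs, so its coefficients are in log points, and reading them as percentages is an approximation that is close for small values. As a hedged illustration (a standard conversion, not a calculation reported by the authors), the sketch below turns a log-point coefficient b into an exact percentage change, 100 × (exp(b) − 1). The function name is hypothetical.

```python
import math


def log_points_to_percent(b):
    """Exact percentage change implied by a coefficient b on a log outcome."""
    return 100.0 * (math.exp(b) - 1.0)


# Interaction coefficients from table 3, Panel A, columns (2) to (5).
for b in (0.106, 0.085, 0.109, 0.115):
    print(b, round(log_points_to_percent(b), 1))
```

For the column (5) estimate of 0.115, the exact conversion gives roughly 12.2%, slightly above the 11.5% obtained by reading the coefficient directly.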
20 Changes in expected future housing wealth may also influence the household’s education expenditure through the intertemporal budget constraint. We don’t have enough information to examine expected future wealth and its changes across state and non-state workers. But our data suggest that workers in the state sector were more likely than workers in the non-state sector to have older and smaller housing units, built and previously owned by their work unit. Compared to housing units later built by real estate developers, the housing units built by state-owned factories or organizations before the housing reform (in 1994) were more likely to be demolished. Therefore, workers in the state sector seemed to be better positioned to benefit from significant payouts when their homes were scheduled for demolition. So it is likely that the expected housing wealth is higher for workers in the state sector. This would bias against finding the result that parents in the non-state sector spend more over time on education. The descriptive statistics of housing characteristics by sector are reported in table S1 of the supplementary appendix.

Table 4. Double-Difference Estimations of Two Types of Education Expenditures for Children Age 1–18

| | (1) | (2) | (3) | (4) | (5) |
|---|---|---|---|---|---|
| Panel A: Tuition and school fees (log) | | | | | |
| Parents in non-state sector × year 2005 | 0.149 (0.126) | 0.040 (0.050) | 0.040 (0.053) | 0.033 (0.057) | 0.029 (0.062) |
| Parents in non-state sector | -0.201* (0.103) | -0.158*** (0.047) | -0.120** (0.050) | -0.102* (0.056) | -0.091 (0.063) |
| Year 2005 | 0.099 (0.077) | 0.131** (0.055) | 0.147*** (0.056) | 0.146** (0.059) | 0.127* (0.067) |
| Panel B: Expenditures on tutoring and interest classes (log) | | | | | |
| Parents in non-state sector × year 2005 | 0.431 (0.292) | 0.589** (0.249) | 0.453** (0.226) | 0.449** (0.216) | 0.474** (0.216) |
| Parents in non-state sector | -0.964*** (0.191) | -1.036*** (0.171) | -0.499*** (0.110) | -0.452*** (0.086) | -0.414*** (0.084) |
| Year 2005 | 1.612*** (0.194) | 1.586*** (0.233) | 1.695*** (0.287) | 1.556*** (0.319) | 1.485*** (0.328) |
| City-specific year dummies | Yes | Yes | Yes | Yes | Yes |
| Child age, gender, and school level | No | Yes | Yes | Yes | Yes |
| Age and education of parents | No | No | Yes | Yes | Yes |
| Parent current Hukou status & Hukou and residential location at age 16 | No | No | No | Yes | Yes |
| Grandparents’ education | No | No | No | No | Yes |
| Number of observations | 1,838 | 1,838 | 1,838 | 1,838 | 1,838 |

Notes: Standard errors are in parentheses. ***significant at 1% level; **significant at 5% level; *significant at 10% level. Standard errors are clustered by city, year, and employers’ ownership type.
Source: Authors’ analysis based on data described in the text.

Column 1 of table 3 shows results from the basic difference-in-differences model, including the full set of city and year fixed effects. Pension reform is associated with an increase in education expenditure, but the estimate is not statistically significant. In column 2, with additional controls for child age, gender, and school level, the estimate is 0.106, significant at the 1% level.
The result shows that, compared to education expenditure for children with one parent in the state sector, education expenditure for children with two parents in the non-state sector increased by 10.6% during this period. The specification is further augmented with parents’ individual characteristics and their family backgrounds as covariates, including parents’ age and education (column 3), parents’ Hukou status (column 4), and, finally, grandparents’ education (column 5). With this extensive list of covariates, the impact estimate ranges from 8.5% to 11.5%, significant at the 10% level in column 3 and at the 5% level in columns 4 and 5.21

21 The regression results are robust to the inclusion of three additional variables as controls: the number of siblings of the child, the number of mother’s siblings, and the number of father’s siblings.

The results for tutoring and extracurricular classes are sensitive to the inclusion of different controls. However, the estimate remains significant at the 5% level with the full set of controls, showing that over time the likelihood of having tutors and attending special interest classes is 6.5 percentage points higher for children with parents in the non-state sector than for children with a parent in the state sector. When the effects on the two components of total education expenditures are examined separately (tuition and school fees, and tutors and extracurricular classes; table 4), the impact coefficient for tuition and fees, with the full set of controls, is not significantly different from zero. Note that the specific public schools attended by the K–9 children are determined mainly by the school district of their residence. With the implementation of the “one fee” system, the amount of tuition and school fees is capped in the public schools; thus, it is not surprising that no impact is found on tuition and school fees.
The estimated impact on expenditures on tutors and extracurricular classes is large in magnitude (about 47%)22 and significant at the 5% level. These results imply that the rise in total education expenditure occurs primarily because of increased participation in and expenses on educational activities outside regular schools.

Trends in other benefits (health insurance policy, wage, bonus income, working hours, housing values, etc.) have been shown to be essentially the same for both state and non-state sectors. Therefore, the observed impact on education expenditure for children whose parents are employed in the non-state sector cannot be driven by any of these factors. With the large set of control variables included in the regressions, it can be concluded that the effects found on education investments can be ascribed to the expansion of pension coverage to the non-state sector.23

Could the Results be Driven by State-Owned Enterprise (SOE) Restructuring?

In the mid-1990s, the SOEs underwent various reforms. Given that the more educated and younger victims of the SOE restructuring were more likely to be reemployed (Giles et al. 2006b), and considering the extent to which the restructuring process might still have been ongoing during 2001–2005, one may be concerned that the differences in worker composition between the two sectors changed during this time, so that the composition consistency underlying the double-difference approach would be violated. In fact, the state-sector restructuring was carried out at a fast pace, and by the end of 2001, some 86% of industrial SOEs had been through restructuring (Garnaut et al. 2006). Therefore, SOE reforms would not be expected to cause different worker composition in the two sectors between 2001 and 2005.
To examine this issue empirically, a linear probability model of parents’ likelihood of working in the non-state sector is estimated on their individual characteristics and family background. It shows that the individual determinants are the same in 2001 and 2005 (table 5). Specifically, compared to workers in the state sector, workers in the non-state sector tend to be younger, less educated, and less likely to have local Hukou. It is worth noting that in 2005, employees in the private sector do not have disproportionately more public sector experience than in 2001. As the insignificant interaction terms between the year 2005 dummy and the individual and family characteristics show, there is no observable difference between the two years in terms of selection into the non-state sector.

22 Even with this sizable increase, the spending on tutors and extracurricular activities accounts for approximately 1.6% of annual household disposable income in 2005. In terms of the share of household income, the magnitude of the estimate seems to be reasonable.

23 The main results focused on here are based on the reduced-form effects of parents being in the non-state sector in 2005. When we use this interaction as an instrumental variable for the direct measure of any parent having pension coverage, we find that the estimated effects are consistent with the reduced-form estimation and larger in magnitude. The results are reported in table S2 of the supplementary appendix. As a robustness check, we also include interactions between the 2005 dummy variable and parental characteristics (age, education, current and previous Hukou status) as extra controls to allow the impact of characteristics on educational spending to change over time. The results, reported in table S3 of the supplementary appendix, are similar to the results in tables 3 and 4.

Table 5.
Probabilities of Working in the Non-State Sector

                                                      Father              Mother
Age                                                   −0.00 (0.00)        −0.011** (0.01)
Years of schooling                                    −0.04*** (0.00)     −0.046*** (0.01)
Number of siblings                                    0.00 (0.01)         0.002 (0.01)
Rural Hukou before age 16                             −0.00 (0.05)        0.060 (0.05)
Didn’t live in this city before age 16                −0.09** (0.05)      −0.087* (0.04)
Having local Hukou                                    −0.30*** (0.05)     −0.230*** (0.04)
Previously employed in the state sector               0.07** (0.03)       0.039 (0.03)
Father’s years of schooling                           0.00 (0.00)         −0.005 (0.00)
Mother’s years of schooling                           −0.01 (0.00)        −0.009*** (0.00)
Shanghai                                              −0.10* (0.05)       −0.075 (0.05)
Wuhan                                                 0.03 (0.04)         0.013 (0.05)
Shenyang                                              0.07 (0.05)         0.115** (0.05)
Fuzhou                                                0.04 (0.04)         0.035 (0.04)
Year 2005                                             0.10 (0.20)         −0.454* (0.25)
Age × year 2005                                       0.00 (0.00)         0.010 (0.01)
Years of schooling × year 2005                        −0.01 (0.01)        0.001 (0.01)
Number of siblings × year 2005                        0.03 (0.02)         0.016 (0.01)
Rural Hukou before age 16 × year 2005                 0.04 (0.08)         0.032 (0.08)
Didn’t live in this city before age 16 × year 2005    −0.09 (0.08)        0.064 (0.07)
Having local Hukou × year 2005                        −0.05 (0.07)        −0.034 (0.06)
Previously employed in the state sector × year 2005   −0.07 (0.05)        −0.055 (0.05)
Mother’s years of schooling × year 2005               −0.00 (0.01)        0.005 (0.01)
Father’s years of schooling × year 2005               0.00 (0.01)         0.000 (0.00)
Shanghai × year 2005                                  0.10 (0.08)         0.031 (0.08)
Wuhan × year 2005                                     0.06 (0.07)         0.008 (0.07)
Shenyang × year 2005                                  −0.05 (0.07)        −0.119* (0.07)
Fuzhou × year 2005                                    0.01 (0.06)         −0.170*** (0.06)
Constant                                              1.09** (0.14)       1.66*** (0.17)
Number of observations                                1,838               1,838

Notes: Standard errors in parentheses. ***significant at 1% level; **significant at 5% level; *significant at 10% level. Standard errors are robust to heteroskedasticity.
Source: Authors’ analysis based on data described in the text.
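The logic of the check in table 5 can be sketched as a linear probability model with year interactions: if selection into the non-state sector did not change between survey years, the interaction coefficients should be close to zero. The Python sketch below is a minimal illustration under strong assumptions: the data are hypothetical, only one characteristic (a local-Hukou indicator) is included, and the small `ols` helper is an illustrative stand-in for a regression package. It does not compute the heteroskedasticity-robust standard errors on which the significance tests in the table rely.

```python
def ols(X, y):
    """Least squares via the normal equations and Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]

# Hypothetical cells: (local_hukou, year_2005) -> non-state indicators.
cells = {
    (0, 0): [1, 1, 1, 0],
    (1, 0): [1, 0, 0, 0],
    (0, 1): [1, 1, 1, 1],
    (1, 1): [1, 1, 0, 0],
}

X, y = [], []
for (hukou, y2005), outcomes in cells.items():
    for nonstate in outcomes:
        # Regressors: constant, local Hukou, year 2005, and their interaction
        X.append([1.0, float(hukou), float(y2005), float(hukou * y2005)])
        y.append(float(nonstate))

beta = ols(X, y)
for name, b in zip(["constant", "local Hukou", "year 2005",
                    "local Hukou x year 2005"], beta):
    print(f"{name:26s} {b: .3f}")
```

In this constructed example the non-state share rises in 2005 for both groups, but the gap between them is unchanged, so the interaction coefficient is zero while the year coefficient is not.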
Beginning in the late 1990s, one means of shedding workers from the state sector during the enterprise reform was to allow them to retire early.24 As it is not uncommon for early retirees to return to work, the 2005 sample may contain cases where a worker was a former retiree, collecting a pension from the state sector while working in the non-state sector. To address this concern, the observations with mothers older than forty are dropped and the model is re-estimated on the resulting, much smaller, sample; the results remain similar.25

24 Giles et al. (2006) document that nearly 40% of women between the ages of forty and forty-nine who lost employment during the SOE restructuring were able to retire and receive pensions.

25 The results are reported in appendix table S4. The results on total education expenditure show that, compared to education expenditure for children with one parent in the state sector, the expenditure for children with two parents in the non-state sector increases by 19% during this period. The increase in total education expenditure seems to be driven by more spending on tutoring and interest classes, a finding consistent with the results from the total sample. The impact on having tutors, though, is no longer significant. These results are likely to reflect the impacts on younger children, as the average age of children in this sample is 8.9 years versus 12.1 years in the total sample.

Table 6. Parents in State-Owned Enterprises as a Control Group

Dependent variables, in the column order of the table header: (1) Total education expenditure (log); (2) Has tutors or enrolled in interest classes; (3) Tuition and school fees (log); (4) Expenditures on tutoring and interest classes (log).

                                                       (1)         (2)         (3)         (4)
Parents in non-state sector × year 2005                0.081**     0.019       0.470**     0.068**
                                                       (0.041)     (0.065)     (0.213)     (0.031)
Parents in non-state sector                            −0.308***   −0.110*     −0.369***   −0.044**
                                                       (0.035)     (0.061)     (0.100)     (0.017)
Parent in state-owned enterprises × year 2005          −0.148      −0.079      0.025       0.015
                                                       (0.110)     (0.109)     (0.371)     (0.052)
Parent in state-owned or state-controlled enterprises  −0.040      −0.085      0.199       0.026
                                                       (0.031)     (0.066)     (0.258)     (0.040)
Year 2005                                              0.348***    0.138       1.501***    0.196***
                                                       (0.050)     (0.087)     (0.315)     (0.048)
City-specific year dummies                             Yes         Yes         Yes         Yes
Child age, gender, and school level                    Yes         Yes         Yes         Yes
Age and education of parents                           Yes         Yes         Yes         Yes
Parent current Hukou status & Hukou and
  residential location at age 16                       Yes         Yes         Yes         Yes
Grandparents’ education                                Yes         Yes         Yes         Yes
Number of observations                                 1,838       1,838       1,838       1,838

Notes: Standard errors in parentheses. ***significant at 1% level; **significant at 5% level; *significant at 10% level. Standard errors are clustered by city, year, and employers’ ownership type.
Source: Authors’ analysis based on data described in the text.

Finally, a related concern regards pension wealth. A 1997 pension reform (State Council 1997) reduced the pension wealth of employees of SOEs, who were offered lower replacement rates than before (Salditt et al. 2008; Sin 2005). The decline in pension wealth is found to be associated with higher savings and reduced expenditures on children’s education and health in the late 1990s (Chamon et al. 2010; Feng et al. 2011). If such an effect still existed during 2001–2005, it would be wrong to attribute the observed increase in education investment to the change in pension coverage in the non-state sector. To allow for the potential impact on children with parents employed in SOEs, an additional “treatment” group of children with parents in SOEs is created.
The results based on this new specification (table 6) show that education expenditures on children with parents in SOEs are no different than for children in the control group, whose parents are employed in government or related organizations. The results further confirm that the observed impact is likely to be driven by the change in pension coverage. One possible reason that the impact of changes in pension wealth is not found in the data might be that the reform affecting pension wealth started in 1997, and the “shock” induced by the reform is likely to be strongest shortly after the reform (for example, in 1999, as studied in Feng et al. 2011), whereas its impact may have abated by 2001–2005.

Could Change in School Quality and Expansion of College Enrollment Explain the Results?

Another concern is that, over time, children whose parents are in the non-state sector may no longer have equal access to quality schools; therefore, their parents would be more likely to invest in after-school tutors in the later years. If that were true, what has been identified would be driven by the differential change in school quality available to the two sectors. To examine this issue, the impact is estimated separately for children whose parents do not have a college degree and for those who have at least one college-educated parent. If the results were driven by deteriorating school quality for the non-state sector, then it might be that parents in the non-state sector, particularly those with higher levels of education, spend more on tutors and extracurricular classes over time. However, if the results were driven by income effects brought about by increased coverage in the non-state sector, then the effect would be more pronounced for families with tighter budget constraints. If the income effect dominates, a larger impact is expected for children whose parents are relatively less educated.
With the stratified sample, it is found that the pension reform has no effect on children with more-educated parents or from families less likely to be budget-constrained. The impact estimates remain significant for children whose parents are relatively less educated, and they are substantial in magnitude. These results imply that possible differential access to quality schools by sector cannot explain the results.26

With the increase in university enrollment, particularly since 1999, the chance of high school students going to college increased by about 13% according to Li and Xing (2010). So it is likely that parents who had little expectation of enrolling their children in college in 2001 started to believe that college enrollment was now possible. If the expansion of educational opportunities at the college level had a strong effect on parents working in the non-state sector, then this might be reflected in the main results. To investigate whether the results are driven by the increase in university enrollment during this time, the sample is stratified by child age into groups (pre-school age 1–5; elementary and middle school age 6–15; high school age 16–18).27 If the better prospect of going to college was the major motivation for the observed increase in education expenditure, the effect on children of high-school age would be expected to be at least as large as the effect on younger children. However, the results show that the effect is not significant for high-school age children but is significant for younger children. The results suggest that the expansion of college enrollment is unlikely to be an alternative explanation of the main results.
Results from the IV Estimates

Finally, estimates are presented based on the alternative identification with the instrumental variable (IV) method as outlined in equation (9). In this estimation, whether or not at least one parent has enrolled in the public pension program is directly controlled for, using the average enrollment rates for the parents’ age/education/gender/employer type/city cell calculated from the 2005 population survey as instruments. The IV results are presented in table 7, together with the F statistic for excluded instruments and the Hansen J statistic for the overidentification test.28

Table 7. Instrumental Variable Estimations Based on the 2005 CULS and the 2005 Mini Census

Dependent variables, in the column order of the table header: (1) Total education expenditure (log); (2) Has tutors or enrolled in interest classes; (3) Tuition and school fees (log); (4) Expenditures on tutoring and interest classes (log).

                                          (1)       (2)       (3)       (4)
Parent(s) Have Pension Coverage           0.228*    0.132     0.257     0.360**
                                          (0.117)   (0.286)   (0.363)   (0.179)
F-test on excluded instruments            21.050    21.050    21.050    21.050
Prob > F                                  0.000     0.000     0.000     0.000
Over-identification: Hansen J statistic   2.283     0.163     0.156     1.770
Chi-sq p-val                              0.131     0.687     0.693     0.183
City fixed effects                        Yes       Yes       Yes       Yes
Child age, gender, and school level       Yes       Yes       Yes       Yes
Age and education of parents              Yes       Yes       Yes       Yes
Parent current Hukou status & Hukou and
  residential location at age 16          Yes       Yes       Yes       Yes
Grandparents’ education                   Yes       Yes       Yes       Yes
Number of observations                    850       850       850       850

Notes: Standard errors in parentheses. ***significant at 1% level; **significant at 5% level; *significant at 10% level. Standard errors are robust to heteroskedasticity. The pension variable measures whether or not at least one parent is covered by a public pension program; it is instrumented by two variables: mother’s probability of enrolling in the public pension program and father’s enrollment probability. The probabilities are predicted from a sample consisting of 20% of the observations of the 2005 Chinese national intercensal population survey, based on the city of residence, age, education level, and employer’s ownership type.
Source: Authors’ analysis based on data described in the text.

These first-stage tests are passed easily for all four measures of education investments. The IV estimate for expenditures on tutors and extracurricular classes is positive but no longer significant, yet the coefficients on pension are precisely estimated for the total expenditure and the likelihood of having tutors and participating in extracurricular activities. The results show that children whose parents have pension coverage enjoy 24% more education expenditures and are 38% more likely to have a tutor or participate in extracurricular activities than the children whose parents are not enrolled in the public pension program. Compared to the results based on the DD framework, these two estimated impacts are larger in magnitude, and they further support the conclusion that parental enrollment in the public pension system is conducive to more investment in children’s education.

26 The results are reported in appendix table S5.

27 The results are reported in appendix table S6.

28 As the generated average enrollment rates are used as instruments and not as regressors, no adjustments are needed in calculating the asymptotic standard errors and test statistics (Wooldridge 2010, chapter 6).

VI. Conclusions

No consensus appears in the economics literature regarding what motivates parents to make transfers to or investments in children. This study provides new evidence on this question.
It presents an analysis of the impact of an expansion of the public pension program in urban China on children’s education expenditures. The conceptual framework predicts that nonaltruistic parents would decrease education expenditure for their children in response to the pension program expansion. The empirical analysis finds that this policy change significantly increases total education expenditure, which suggests that in urban China parents’ spending on children’s human capital cannot be entirely driven by an old-age security motive but is likely to be motivated by altruistic concern for their children.

At the same time, the results also imply that without access to a public pension, altruistic parents do have to face the trade-off between saving for old age and spending on children’s education. With the expansion of pension coverage, more parents are able to spend more on education. The evidence presented in this paper shows that social security reform affects intergenerational transfers in the form of education investment. An assessment of the impact of social security reform would be incomplete if it did not include such intergenerational effects. With the basic pension system gradually being established in rural China, it would be interesting to examine the effects in rural areas, where the role of the family in elder care is arguably traditionally stronger.

References

Barr, N., and P. Diamond. 2010. “Pension Reform in China: Issues, Options and Recommendations.” Working Paper, the Department of Economics, MIT.
Becker, G. 1974. “A Theory of Social Interactions.” Journal of Political Economy 82 (6): 1063–93.
Becker, G. 1992. “The Economic Way of Looking at Life.” Nobel Lecture in Economics.
Becker, G., and H. Lewis. 1973. “On the Interaction between the Quantity and Quality of Children.” Journal of Political Economy 81 (2): S279–S288.
Becker, G., and K. Murphy. 1988. “The Family and the State.” Journal of Law and Economics 31 (1): 1–18.
Behrman, J. R., R. A.
Pollak, and P. Taubman. 1982. “Parental Preferences and Provision for Progeny.” Journal of Political Economy 90 (1): 52–73.
Bernheim, B. D., A. Shleifer, and L. H. Summers. 1985. “The Strategic Bequest Motive.” Journal of Political Economy 93 (6): 1045–76.
Bertrand, M., S. Mullainathan, and D. Miller. 2003. “Public Policy and Extended Families: Evidence from Pensions in South Africa.” The World Bank Economic Review 17 (1): 27–50.
Cai, F., J. Giles, and X. Meng. 2006. “How Well do Children Insure Parents against Low Retirement Income? An Analysis Using Survey Data from Urban China.” Journal of Public Economics 90 (12): 2229–55.
Cain, M. T. 1983. “Fertility as an Adjustment to Risk.” Population and Development Review 9: 688–702.
Cameron, L., N. Erkal, L. Gangadharan, and X. Meng. 2013. “Little Emperors: Behavioral Impacts of China’s One-Child Policy.” Science 340 (6130): 272–73.
Chamon, M., K. Liu, and E. Prasad. 2010. “Income Uncertainty and Household Savings in China.” NBER Working Paper 16565.
Cox, D. 1990. “Intergenerational Transfers and Liquidity Constraints.” Quarterly Journal of Economics 105 (1): 187–217.
Cutler, D., and J. Gruber. 1996. “Does Public Insurance Crowd out Private Insurance?” Quarterly Journal of Economics 111: 391–430.
Duflo, E. 2003. “Grandmothers and Granddaughters: Old-age Pensions and Intrahousehold Allocation in South Africa.” The World Bank Economic Review 17 (1): 1–25.
Duncan, G. J., and P. L. Chase-Lansdale. 2001. “Welfare Reform and Children’s Well-Being.” In R. M. Blank, and R. Haskins, eds., The New World of Welfare. Washington, DC: Brookings Institution Press: 391–412.
Ebenstein, A., and S. Leung. 2010. “Son Preference and Access to Social Insurance: Evidence from China’s Rural Pension Program.” Population and Development Review 36 (1): 47–70.
Fang, Y., C. Wang, and Y. Song. 1992.
“Support for the Elderly in China.” In K. Kendig, A. Hashimoto, and L. Copport, eds., Family Support for the Elderly: The International Experience. New York: Oxford University Press.
Feng, J., L. He, and H. Sato. 2011. “Public Pension and Household Saving: Evidence from Urban China.” Journal of Comparative Economics 30: 470–85.
Fong, V. 2004. Only Hope. Stanford: Stanford University Press.
Garnaut, R., L. Song, and Y. Yao. 2006. “Impact and Significance of State-Owned Enterprise Restructuring in China.” The China Journal 55: 35–63.
Giles, J., and R. Mu. 2007. “Elder Parent Health and the Migration Decision of Adult Children: Evidence from Rural China.” Demography 265–88.
Giles, J., A. Park, and F. Cai. 2006a. “How has Economic Restructuring Affected China’s Urban Workers.” The China Quarterly 185: 61–95.
Giles, J., A. Park, and F. Cai. 2006b. “Reemployment of Dislocated Workers in Urban China: The Roles of Information and Incentives.” Journal of Comparative Economics 34: 582–607.
Gruber, J., and R. McKnight. 2003. “Why did Employee Health Insurance Contributions Rise?” Journal of Health Economics 22: 1085–104.
Hohm, C. F. 1975. “Social Security and Fertility: An International Perspective.” Demography 12 (4): 629–44.
Jackson, R., and N. Howe. 2004. “The Graying of the Middle Kingdom: The Demographics and Economics of Retirement Policy in China.” Washington, DC: The Center for Strategic and International Studies.
Lei, X., J. Strauss, M. Tian, and Y. Zhao. 2011. “Living Arrangements of the Elderly in China: Evidence from CHARLS.” IZA Discussion Paper No. 6249.
Li, S. 2011. “Issues and Options for Social Security Reform in China.” China: an International Journal 9 (1): 72–109.
Li, S., and C. Xing. 2010. “China’s Higher Education Expansion and its Labor Market Consequences.” IZA Discussion Paper No. 4974.
Lillard, L., and R. Willis. 1997. “Motives for Intergenerational Transfers: Evidence from Malaysia.” Demography 34 (1): 115–34.
Lin, J. 2007.
“Emergence of Private Schools in China: Context, Characteristics, and Implications.” In E. Hannum, and A. Park, eds., Education and Reform in China. New York: Routledge.
Liu, Y. 2002. “Reforming China’s Urban Health Insurance System.” Health Policy 60: 133–50.
McGarry, K., and R. F. Schoeni. 1995. “Transfer Behavior in the Health and Retirement Study: Measurement and the Redistribution of Resources within the Family.” Journal of Human Resources 30 (0): S184–S226.
Miller, A. R., and L. Zhang. 2012. “Intergenerational Effects of Welfare Reform on Educational Attainment.” Journal of Law and Economics 55: 437–76.
Morris, P., G. Duncan, and E. Clark-Kauffman. 2005. “Child Well-Being in an Era of Welfare Reform: The Sensitivity of Transition in Development to Policy Change.” Developmental Psychology 41 (6): 919–32.
Nugent, J. B. 1987. “The Old-Age Security Motive for Fertility.” Population and Development Review 11: 75–97.
Paine, L., and Y. Fang. 2007. “Supporting China’s Teachers: Challenges in Reforming Professional Development.” In E. Hannum, and A. Park, eds., Education and Reform in China. New York: Routledge.
Pollak, R. A. 1988. “Tied Transfers and Paternalistic Preferences.” American Economic Review 78 (2): 240–44.
Raut, L. K. 1990. “Capital Accumulation, Income Distribution and Endogenous Fertility in an Overlapping Generations General Equilibrium Model.” Journal of Development Economics 34 (1/2): 123–50.
Salditt, F., P. Whiteford, and W. Adema. 2008. “Pension Reform in China.” International Social Security Review 63: 47–71.
Sin, Y. 2005. “Pension Liabilities and Reform Options for Old Age Insurance.” World Bank Working Paper No. 2005–1.
State Council. 1991. Resolution on the Reform of the Pension System for Enterprise Workers.
State Council. 1997.
A Decision on Establishing a Unified Basic Pension System for Enterprise Workers.
State Council. 1999. Tentative Rules on the Payment of Social Security Dues.
State Council. 2000. An Experiment to Perfect Urban Social Security System.
Takayama, N. 2002. “Pension Reform of PRC: Incentives, Governance and Policy Options.” Paper presented at the ADB Institute Fifth Anniversary Conference on the Challenges and New Agenda for PRC, Tokyo, Japan.
Wei, S., and X. Zhang. 2011. “The Competitive Saving Motive: Evidence from Rising Sex Ratios and Savings in China.” Journal of Political Economy 119 (3): 511–64.
Wooldridge, J. 2010. Econometric Analysis of Cross Section and Panel Data, 2nd ed. Cambridge, MA: The MIT Press.
World Bank. 1994. Averting the Old Age Crisis. New York: Oxford University Press for the World Bank.
Zaslow, M. J., K. A. Moore, J. L. Brooks, P. A. Morris, K. Tout, Z. A. Redd, and C. A. Emig. 2002. “Experimental Studies of Welfare Reform and Children.” The Future of Children 12 (1): 78–95.
Zhao, Y., and J. Xu. 2002. “China’s Urban Pension System: Reforms and Problems.” Cato Journal 21 (3): 395–414.
Zimmer, Z., L. G. Martin, M. Ofstedal, and Y. Chuang. 2007. “Education of Adult Children and Mortality of their Elderly Parents in Taiwan.” Demography 44 (2): 289–305.

The World Bank Economic Review, 31(2), 2017, 504–530
doi: 10.1093/wber/lhv082
Advance Access Publication Date: January 25, 2016
Article
Downloaded from https://academic.oup.com/wber/article-abstract/31/2/504/2897306 by Joint Bank-Fund Library user on 08 August 2019

Prices, Engel Curves, and Time-Space Deflation: Impacts on Poverty and Inequality in Vietnam

John Gibson, Trinh Le, and Bonggeun Kim

Abstract

Many developing countries lack spatially disaggregated price data. Some analysts instead use “no-price” methods, in which a food Engel curve is used to derive the deflator as the one needed for nominally similar households to have equal food shares in all regions and time periods.
This method cannot be tested in countries where it is used as a spatial deflator since they lack suitable price data. In this paper, data from Vietnam are used to test this method against benchmarks provided by multilateral price indexes calculated from repeated spatial price surveys. Deflators from a food Engel curve appear to be a poor proxy for deflators obtained from multilateral price indexes. To the extent that such price indexes reliably compare real living standards over time and space, these results suggest that estimates of the level, location, and change in poverty and inequality would be distorted if the Engel method deflator was used in their stead.

JEL classification: D12, E31, O15

John Gibson (corresponding author) is a professor of economics at University of Waikato, Hamilton, New Zealand; his email address is jkgibson@waikato.ac.nz. Trinh Le is a research fellow at Motu Economic and Public Policy Research, Wellington, New Zealand; her email address is Trinh.Le@motu.org.nz. Bonggeun Kim is a professor of economics at Seoul National University, Seoul, Korea; his email address is bgkim07@snu.ac.kr. The authors are grateful for assistance from Valerie Kozel, Ian Hinsdale, Nguyen Tam Giang, Tinh Doan, and Geua Boe-Gibson, along with many staff from the General Statistics Office. Helpful comments were received from the editor and three anonymous referees and from seminar audiences at Monash and Otago. All remaining errors are those of the authors.

© The Author 2016. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

I. Introduction

Reliable data on real welfare over time and space in poor countries are rare. Statistical agencies mostly focus on the temporal Consumer Price Index (CPI), which lets one compare changes in, but not levels of, prices over space. Few poor countries have a spatial price index, despite their weak infrastructure and limited market integration permitting large spatial price differences.1 Without consistent time-space comparisons of living standards, it is unclear if reports of rising inequality in some developing countries (e.g., China) reflect spatially diverging prices more than growing disparities in real welfare levels. Debates about where and by how much poverty has fallen also depend critically on accurate cost-of-living comparisons over time and space.

1 Gibson (2013) provides examples of the priority that statistical agencies in poor countries give to collecting nominal living standards data over price data, despite both types of data being needed for measuring real welfare.

Amongst ways to spatially deflate in countries without spatial prices, the most startling results use a food Engel curve to calculate the deflator that lets different nominal incomes have the same real standard of living (based on the same food share). This adapts a method developed for temporal comparisons by Hamilton (2001), who estimated Engel curves to back out the implied true price index and real income growth over time.2 Almås (2012) uses Hamilton’s method for spatial comparisons; assuming a unique food Engel curve for the world, gaps between the food share for a particular country and the base country imply bias in the Purchasing Power Parity (PPP) statistics of the Penn World Table. Correcting for PPP bias raises global inequality by at least one-quarter. Almås and Johnsen (2012) apply the same method to China but add a time dimension to uncover spatial bias in real growth rates.
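The core of the Engel curve approach described above can be sketched in a few lines of Python. The sketch assumes a deliberately simplified linear Engel curve, w = a + b(ln y − ln P_r), where w is the food share, y is nominal expenditure, and P_r is the unobserved price level of region r relative to a base region. Regressing w on ln y and a region dummy then gives a dummy coefficient d = −b ln P_r, so that ln P_r = −d / b. The functional form, the data, the parameter values, and the helper `ols` are all illustrative assumptions; the published applications also control for relative food prices and household characteristics, which are omitted here.

```python
import math

def ols(X, y):
    """Least squares via the normal equations and Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]

# Hypothetical "true" parameters used only to generate noiseless data.
A_TRUE, B_TRUE = 1.0, -0.1          # Engel curve intercept and slope
TRUE_PRICE = {0: 1.0, 1: 1.25}      # region 1 is 25% more expensive

X, w = [], []
for region, price in TRUE_PRICE.items():
    for income in (500.0, 1000.0, 2000.0, 4000.0):
        share = A_TRUE + B_TRUE * (math.log(income) - math.log(price))
        X.append([1.0, math.log(income), float(region)])
        w.append(share)

a_hat, b_hat, d_hat = ols(X, w)
p_hat = math.exp(-d_hat / b_hat)    # implied deflator for region 1
print(f"Implied price level of region 1 relative to region 0: {p_hat:.3f}")
```

Because the data are generated without noise from the assumed Engel curve, the recovered deflator matches the price level built into the data. With real survey data, any regional difference in food shares not explained by the included regressors is attributed to prices, which is the feature of the method that the comparison with multilateral price indexes puts to the test.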
China’s CPI seems too low in rural areas and too high in urban areas; the Engel curve deflator shows a 44% rise in the rural cost-of-living from 1995 to 2002 and no change in the urban cost-of-living, versus CPI increases of 8% and 11%.3 Correcting this bias raises the rural cost-of-living from 60% of the urban level in 1995 to 87% by 2002, and one-half of apparent poverty reduction in rural China disappears.4 Likewise, an Engel curve deflator for India gives a fall in rural poverty of just 5% between 2005 and 2010, versus a 20% fall at official poverty lines that are allegedly time-space consistent (Almås, Kjelsrud, and Somanathan 2013). Remarkable gaps occur in the records for some states; for example, the official lines show that the poverty rate fell in rural West Bengal from 38% to 29% while the Engel curve deflator has Bengalese poverty rising from 67% to 70%.

The record of recent progress in poverty reduction for poor countries may need to be revised if these results based on Engel curve deflators are correct. But it is hard to know how credible these findings are since China and India lack spatially disaggregated price surveys, preventing comparison of Engel curve deflators with multilateral price indexes.5 In this paper we conduct just such a comparison, using high quality data from Vietnam. Specifically, we use the 2010 and 2012 Vietnam Household Living Standards Surveys (VHLSS) and market prices from spatial cost of living surveys fielded in conjunction with the VHLSS. In each year, prices of up to one hundred goods and services were surveyed in sixteen hundred different markets, with surveyors given detailed pictures of the desired specifications to ensure consistency over time and space.

The spatial deflators and spatially disaggregated estimates of temporal inflation derived from the food Engel curve are a poor proxy for the deflators obtained from the multilateral price indexes.
The Engel curve deflators suggest costs of living in some rural regions exceed those of the capital city. The derived changes in the cost of living from 2010 to 2012 vary widely over space, with the Engel curve suggesting deflation in some regions while the multilateral indexes give regional price changes of between 14% and 26% (the CPI rose 26% between the 2010 and 2012 surveys). These differences matter to conclusions about the location, level, and trend in poverty and inequality. For example, if the Engel curve deflator is used, the national Gini index rises to 0.47 from the 0.43 calculated in nominal terms; in contrast, using the multilateral price indexes leads to a lower Gini of 0.40 in real terms. The Engel deflator also causes the headcount poverty rate to be ten percentage points higher and skews the poverty profile to finding more rural poverty, especially in some regions that are already poorest.

2 Applications of the temporal method include studies of the historical United States (Costa 2001; and Logan 2009) and contemporary Australia (Barrett and Brzozowski 2010), Brazil (Filho and Chamon 2012), Canada (Beatty and Larsen 2005), China (Chamon and Filho 2014), Indonesia (Olivia and Gibson 2013), Japan (Higa 2013), Korea (Chung, Kim and Gibson 2010), Mexico (Filho and Chamon 2012), New Zealand (Gibson and Scobie 2011), Norway (Larsen 2007), and Russia (Gibson, Stillman and Le 2008).

3 Chamon and Filho (2014) use Urban Household Survey data for ten Chinese provinces and estimate an upward bias in the CPI of about one percentage point per year over 1993–2005 but do not consider any spatial deflation.

4 Specifically, for the $1 a day poverty line, the fall in the rural poverty gap between 1995 and 2002 is −0.67 using the CPI deflator but only −0.32 using the food Engel curve deflator.

5 Household expenditure surveys in both countries allow unit values to be constructed (mainly for foods), but these are best treated as a proxy for quality rather than for price (Gibson 2013). Unit values have been widely used in India to calculate spatial multilateral price indexes for urban and rural sectors and states, most recently by Deaton and Dupriez (2011) and Majumder, Ray, and Sinha (2012). The most widely used spatial deflator for China is from Brandt and Holz (2006), who used provincial CPI data from 1990 to price national rural and urban expenditure baskets (containing 40–60 items; with 40% of the rural basket using urban prices since rural prices were missing). The annual rate of change in the CPI for each province was then used by Brandt and Holz to extend from the base year back to 1984 and forward to 2004, which likely causes time-space inconsistencies, as demonstrated for the example of Russia by Gluschenko (2006).

Our results cast doubt on the Engel curve method, but its proponents may claim that our multilateral price indexes do not get the right cost-of-living and are a poor benchmark. The Engel method relies on food shares falling as income rises, so preferences cannot be homothetic. Thus any benchmark price index consistent with homothetic preferences may be considered an unfair test. But a “fair” test of the Engel curve method is hard to design. Beatty and Crossley (2012) show that this method gives the true cost of living for an unknown household whose expenditure gives zero utility at base period prices. As Nakamura et al. (2015) note, the Engel curve method recovers the change in the cost-of-living for a household that may be anywhere in the income distribution; thus it is hard to see what crucial experiment could compare the Engel curve deflator with another deflator.
Even a fixed-weight cost of goods index that does not imply homothetic preferences, like a Laspeyres index, may be a poor benchmark for such a test since one would not know whether to weight it democratically, plutocratically, or at some other point in the income distribution so as to best match the unknown reference household of the Engel curve method.6

Our strategy for testing the Engel method is more pragmatic. The issue of the preference framework that gives a price index consistent with the cost-of-living changes recovered by the Engel curve method is hard to resolve since the reference household is unidentified. But we note that researchers are not using the Engel curve method as a spatial deflator because they are guided by hypothesis tests that this is a more preference-compatible deflator.7 Instead, this method is used for time-space deflation because spatially disaggregated prices are unavailable. For example, Almås and Johnsen (2012, 2) motivate their food Engel curve deflator study by stating:

“Why is it necessary to produce new price indices? First, data on prices in China are scarce. To our knowledge, there are no official and available price indices that allow for cross-province comparisons, and price data on specific goods are extremely limited.”

Hence natural benchmarks are the sort of price indexes that a statistics office would use if price data were available. Typically, this would be a fixed-weight index, like a Laspeyres, for temporal deflation. For spatial deflation a statistics office might use a variable-weight superlative index, like a Törnqvist, since substitution bias is likely a bigger concern over space than over time given that relative prices do not vary much over the short to medium term (Van Veelen and Van der Weide 2008).
Our Weighted Country Product Dummy (WCPD) testing framework allows both fixed-weight and variable-weight price indexes to be calculated (along with their standard errors), and we apply these dual benchmarks to evaluate the performance of the Engel curve method.

6 The plutocratic weights for the CPI treat dollars equally and thus treat people unequally since some people have many more dollars than others. Deaton (1988) shows that in the United States the CPI weights are representative for a household above the 75th percentile of the expenditure distribution, while Ley (2005) shows that higher inequality, differences in consumption patterns by income groups, and greater variance in individual price behavior all contribute to the gap between plutocratic and democratic price indexes.

7 This contrasts with cross-country studies of PPPs for measuring global poverty where, for example, Ackland et al. (2013) test the hypothesis of common homothetic preferences and find that in samples from the 1996 and 2005 ICP they cannot reject homothetic preferences for about 70% of countries.

On top of the empirical results there are good reasons to doubt the Engel curve method. Anything varying over space and affecting food shares but omitted from Engel curve regressions gets attributed to price differences between areas. For example, calorie needs and food shares are high for hard working rural folk; equally poor but sedentary urbanites seem better off due to their lower food shares (Deaton and Dupriez 2011). Likewise, another Engel curve study for China found a much richer set of covariates than those of Almås and Johnsen (including temperature) were relevant to food shares and were correlated with spatial variables (Gong and Meng 2008).
While these factors could be included in Engel curve regressions, they almost never are, yet they reflect long-standing spatial differences that likely vary much more than do short-term changes when food budget shares are compared over time. Thus the omitted variables bias problem is potentially much worse when the Engel curve method is used over space than over time.

Another problem for the Engel curve method is that food shares will vary with relative prices, but if there is no spatial price survey, then relative prices are unobserved. In temporal uses of the Engel curve method, the relative inflation rate for food versus nonfood is used, but this is no help for spatial comparisons. While unit values (surveyed expenditures divided by quantities) are sometimes used to proxy for prices, it is rare for surveys to get quantities (and hence unit values) of most nonfood items. Moreover, unit values for food will be systematically biased over space because they average over a different quality mix in net consuming areas than in net producing areas. This follows from the Alchian-Allen effect: fixed charges for transport, storage, or processing alter the relative price of quality over time and space (Gibson and Kim 2015).

Even allowing for all of these threats to the Engel curve method, a proponent could simply note that the current results show big gaps between what standard price indexes show and what the Engel curve shows. In temporal CPI bias studies these gaps are taken as evidence of problems with the standard price indexes, and results from the Engel curve method are treated as closer to the truth. In the current study, the gaps are treated as evidence that results from the Engel curve method are further from the truth. The authors have all published CPI bias studies (e.g., Gibson et al. 2008 and Chung et al. 2010), so a valid question is why we have switched sides, as it were. There are four reasons: a conventional price index may be more reliable over space than over time; the converse is likely to be true of the Engel curve method; there is more corroborating evidence available for assessing temporal CPI bias than for spatial deflators; and the point raised by Beatty and Crossley (2012) about the unknown reference household of the Engel curve method potentially raises an important caveat to prior results on CPI bias.

In terms of the first point, various sources of bias in price indexes, such as quality change, delayed introduction of new goods, and unaccounted for substitution of outlets and commodities, likely are bigger problems for a temporal index than for a spatial one. For example, superlative indexes can deal with commodity substitution bias over space since base and current region budget shares are available but not over time (except retrospectively) since contemporaneous budget shares are not available for current period price index calculations. New and improved quality goods may be accessed in different regions and there need not be outlet substitution bias notwithstanding the challenge of finding similar types of outlets in urban and rural areas when surveying prices. Second, comparing Engel curves over time is likely more reliable than comparing them over space, since household survey design is usually stable over the short term and average characteristics of respondents that might affect measurement error also will be fairly stable over time. In contrast, countries might use different methods to survey urban and rural households, and even if the same method is used it may be de facto different (e.g., diary surveys in illiterate rural areas often degrade to unstructured recall, while they may be truer to design in literate urban areas).
This matters since key Engel curve parameters are sensitive to differences in how survey questions are posed and answered.8 Third, Engel curve results on CPI bias are often corroborated by analyses of durables ownership or by comparing subjective reports of well-being over time. In contrast, there is only a diffuse prior about expected patterns of prices over space, except perhaps that prices should be higher in nominally richer areas due to the Balassa-Samuelson effect, although the opposite can be claimed (Muller 2002). Absent corroborating analyses, the burden of proof for relying on the Engel curve deflator for spatial comparisons should be higher than it is for temporal comparisons.9

8 Gibson et al. (2015) randomly assign eight different consumption surveys to households in Tanzania and find the coefficient on real income in the food Engel curve varies by a factor of three between survey designs. This is one of two coefficients that determines the Engel curve deflators, so this fragility suggests that variation over space in survey design or in characteristics such as respondent’s education, wealth, and food acquisition opportunities, which correlate with measurement errors, may spuriously affect the deflators derived from Engel curves.

The remainder of the paper is structured as follows. Section II describes the context and data, paying particular attention to the spatially disaggregated prices that are rarely available for large developing countries. The multilateral price indexes and Engel curve methods are set out in section III. Estimation results and comparisons between the various deflators are in section IV, and the impacts of different deflators on poverty and real inequality are described in section V.
A limited cost-benefit evaluation is in section VI, while the conclusions are in section VII.

II. Context and Data Description

Over the past two decades Vietnam has conducted eight household living standards surveys that have been widely used to monitor progress in poverty reduction. In the first of these, the 1992/93 Vietnam Living Standards Survey (VLSS), prices in local markets were surveyed to provide one source of information on regional differences in the cost-of-living. The poverty lines calculated in 1993 suggested urban prices were 20% above rural prices, while the highest cost region (of seven then demarcated) had costs of living about 35% above the lowest cost region. The next VLSS, in 1998, fielded a price survey just in rural areas, with poverty line updating in urban areas relying on prices already collected for the CPI. The next four surveys (the Vietnam Household Living Standards Surveys [VHLSS] of 2002, 2004, 2006, and 2008) relied solely on already-collected CPI prices to update rural and urban poverty lines and spatial deflators.

There were several concerns with using temporal index prices to calculate a spatial index. Vietnam’s CPI is ostensibly national in scope but the prices used to form the spatial deflators were from just forty of Vietnam’s sixty provinces. Also, the outlet sample for the CPI is not spatially representative since outlets need to be easily accessible (some item prices are observed every ten days) and in areas of dense demand so that target specifications are always in stock. Moreover, the CPI changed in 2006 to let provinces pick item specifications that suited the peculiarities of local demand, rather than using nationally consistent specifications, so reported spatial price differences thereafter may have included quality differences.
In general, spatial variation in the cost of living is unlikely to be accurately measured with data collected for a temporal index, and in this regard the situation in Vietnam was similar to other large developing countries.10

In light of these concerns, the Prices Department of the General Statistics Office (GSO) introduced a new spatial cost of living index (SCOLI) in 2010, based on a price survey fielded in 1,588 communes (almost one-fifth of the total).11 Surveying overlapped with the VHLSS, which another GSO department was running in the same communes (and others) at the same time, ensuring that the budget shares needed for the SCOLI relate to the same period as the prices. To maintain consistency over space, the price surveyors were given detailed photographs of each of the sixty-four goods and services that were the specifications to be priced. The surveyors were required to find examples in the market of similar size and quality to what was pictured, weigh them, and record prices per metric unit (unless they were in standard packaging of known weight or were a service). The sampled prices were to be obtained from three different vendors in each locality; this quota was met in almost 90% of the item-market combinations and the price index calculations described below dealt with the remaining cases of missing data.

9 An exception is Almås et al. (2013) who attempt to provide corroborating evidence by comparing patterns of calorie sources and self-reported hunger around the poverty lines based on their Engel curve deflator.

10 For example, the main spatial deflator used in China, due to Brandt and Holz (2006), was developed from prices collected for the CPI.

11 Vietnam’s communes are the lowest level administrative unit, averaging about 10,000 people or 2,500 households.

The price surveying for the SCOLI was repeated in 2012, again in conjunction with the VHLSS but with an expanded scope.
Specifically, the number of goods and services to be priced increased to 101, with seven food items and thirty nonfood goods and services added to the basket, and the number of communes surveyed increased to 1,644. The prices in approximately one-half of the communes were surveyed in March and for the other half in September, with the subsamples in both rounds being nationally representative and matching rounds 1 and 3 of the four-round VHLSS. In 2010, prices had been surveyed in all communes that were part of the second (September) round of the three-round VHLSS and in a randomly drawn subset of the communes in the third (December) round. Analyses of the prices from both years reveal that spatial patterns in prices do not vary within-year, and for almost all items, the variation in prices over space is much greater than the between-month variation.

The nominal welfare variables and the data for the Engel curve analysis (food budget shares and covariates other than prices) come from the 2010 and 2012 VHLSS. For both surveys the consumption modules use a thirty-day recall of food purchases and consumption from own-production and gifts, another recall of spending during festive periods on twenty-four food and drink groups, a thirty-day recall for twenty-eight frequently purchased nonfood items, and an annual recall for thirty-six other items. The only change in 2012 was that three of the fifty-four food groups from 2010 (rice, cooking oil and lard, and outdoor meals) were split (high and low quality rice, oil separate from lard, and meals by whether household members were at home or away).
This slightly finer disaggregation may prompt recall of some forgotten food spending, so food shares in 2012 may be slightly higher than otherwise and people may appear poorer (and so a higher price index will be derived) than if there had been no change in design.

The 2010 and 2012 VHLSS marked a break from prior surveys. A “usual month” format and less comprehensive consumption aggregate than in 2010 were features of the prior surveys, which maintained definitions from 1993.12 This link to the past caused growing understatement of consumption and overstatement of the food share as Vietnam got richer and people diversified away from a food-based budget.13 For example, just 78% of comprehensive consumption in the 2010 VHLSS would be counted under the 1993 definition, and the average food share would be 54% rather than 46% (World Bank 2012). Correspondingly, the poverty line was also changed in 2010, raising it to VND 653,000 per person per month (US$2.26 per day in 2005 PPP terms). Under this line, 21% of Vietnam’s population in 2010 was counted as poor, with headcount poverty rates of 27% in rural areas and 6% in urban areas. The much lower poverty line used previously had seen headcount poverty rates fall to 15% by 2008 (from 58% in 1993).

With these new baseline measures of consumption and poverty in place, the challenge for statistical authorities in Vietnam is to make consistent time-space comparisons of real living standards, inequality, and poverty in the future. While the SCOLI program may continue, the earlier VLSS experience and the current situation in most developing countries is that spatial price surveys are not fielded, even as part of household living standards surveys. Moreover, the SCOLI in 2010 was donor funded, and absent this support, the GSO may revert to using the CPI to calculate spatial deflators, so there is interest in how “no-price” methods of deriving spatial deflators perform.
The experience of Vietnam in 2010 and 2012, where there is a benchmark from a comprehensive, spatially disaggregated price survey, therefore gives a rare opportunity to assess how well a “no-price” method, such as the food Engel curve, works in practice.

12 Usual month recall is based on reporting the number of months in which the item is usually consumed by the household, the usual expenditure in those months, and the quantity usually consumed.

13 Annex 2.1 of World Bank (2012) summarizes the differences between the comprehensive consumption aggregate and the one which held fast to the 1993 definition of consumption.

III. Methods

When researchers deflate for welfare analysis they typically want an empirical approximation to the true cost-of-living index (COLI): the ratio of minimum expenditure at alternative prices to minimum expenditure at base prices holding the standard of living constant. There are three broad approaches, according to Dumagan and Mount (1997) and Breur and von der Lippe (2011): use a price index with known biases, such as the Laspeyres, that gives a bound to the COLI; use a superlative index formula such as the Törnqvist, which is closer to the true COLI (due to less substitution bias) if preferences are homothetic but has an income bias if preferences are not homothetic; and econometrically estimate demand equations for a set of goods, from which the theoretical expenditure functions that are numerator and denominator of the COLI can be derived.
While the demand systems approach can handle nonhomothetic preferences and has early examples from developing countries (e.g., Ravallion and van de Walle 1991), it has proved difficult to carry out in practice and is not widely used so we do not consider it further.14 In contrast, the Laspeyres is used in probably all countries for their CPI while there are active debates in some countries about switching from this to a superlative price index, as was recommended by the “Boskin Commission” on the CPI in the United States.

The known biases in a Laspeyres index are substitution biases from not accounting for consumers moving towards items (or outlets) that are relatively cheaper than in the base period or region, quality change bias when higher prices for improved goods wrongly get treated as inflation, delayed entry of new goods missing the rapid fall in price early in the product lifecycle, and biases due to the formula used to aggregate individual price observations into an index of price relativities. For spatial deflation the quality change and new goods biases should matter less than item substitution bias since new and improved goods are, in principle, available in all regions at the same time.

A superlative index allows changes in the basket between the base period or region and the current period/region and so accounts for consumer substitution, while the Laspeyres index continues to price the base period or region basket. But using budget shares from two periods or regions has a potential problem; these may not refer to the same standard of living. In contrast, a fixed-weights Laspeyres index refers to the base period or region standard of living. The potential “income bias” in the superlative index will not happen in the special case of homothetic preferences, with budget shares not changing with income. But observed behavior, such as falling food shares as income rises, is inconsistent with homothetic preferences.
The income bias of the superlative index may exceed the Laspeyres substitution bias and may be positive or negative whereas substitution bias only overstates changes in the cost of living between the base period or region and the current one (Dumagan and Mount 1997).

14 Oulton (2012) proposes an algorithm based on principal components to overcome a problem for the econometric approach of too many parameters to estimate for the available data. This enables compensated budget shares to be derived econometrically in order to hold utility constant at some reference level for the nonhomothetic case with only the same data requirements as needed for conventional index numbers.

For time-space deflation, a multilateral index method is needed to calculate regional and temporal price levels jointly so as to ensure transitivity (Hill 2004). The two main methods used for PPPs in cross-country studies are Geary-Khamis (GK), used in the Penn World Table, and EKS (Eltető, Köves, and Szulc), used by the World Bank. The GK method lets subindexes add to a total, which is useful for deflating GDP and its components but is less needed for comparing levels of living (Deaton, Friedman, and Alatas 2004). Moreover, the GK index uses plutocratic weights and does not allow for substitution effects; these are undesirable features if comparing household living standards over space. The EKS allows substitution because it uses underlying Fisher indexes (geometric means of a Paasche and Laspeyres) which are superlative in the sense of being an exact cost-of-living index for some homothetic utility function that is a flexible functional form, allowing a fully general matrix of price substitution effects (Deaton et al. 2004).15 While less well known than either GK or EKS, another multilateral method is the Weighted Country Product Dummy (WCPD), which allows for substitution effects, for democratic weights, and for reversibility (which matters in spatial comparisons since there is no natural base country or region, unlike for temporal comparisons). Deaton et al. (2004) recommend EKS and WCPD for work on the measurement of living standards.

Weighted Country Product Dummy (WCPD) Method

The Country Product Dummy (CPD) method is a hedonic regression, proposed by Summers (1973) to deal with missing data in international comparisons, where the only characteristic of a commodity is the commodity itself. This humble origin belies a very useful framework for making price comparisons because with appropriate choice of expenditure or quantity weights one can derive several bilateral price indexes, including those of Dutot, Jevons, Törnqvist, and Walsh (Diewert 2005), and also a multilateral system that is an expenditure-share weighted geometric form of the Geary-Khamis index (Rao 2005). This set of price indexes includes both fixed-weight ones and variable-weight superlative indexes. Rao (2004) argues strongly in favor of both weighted and unweighted CPD methods, which also let various regression techniques be used to handle data-related problems and allow standard errors of the PPPs to be calculated. We use the WCPD framework to provide benchmarks for evaluating the deflators provided by the food Engel curve method.
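The contrast drawn in this section between a fixed-weight Laspeyres index and a superlative Fisher index (the geometric mean of a Laspeyres and a Paasche) can be illustrated with a minimal sketch. The two goods, their prices, and their quantities below are hypothetical and chosen only for illustration; they are not taken from the VHLSS or the SCOLI price survey, and the function names are our own.

```python
import math

def laspeyres(p_base, p_cur, q_base):
    """Cost of the base basket at current prices relative to base prices."""
    num = sum(p_cur[k] * q_base[k] for k in q_base)
    den = sum(p_base[k] * q_base[k] for k in q_base)
    return num / den

def paasche(p_base, p_cur, q_cur):
    """Cost of the current basket at current prices relative to base prices."""
    num = sum(p_cur[k] * q_cur[k] for k in q_cur)
    den = sum(p_base[k] * q_cur[k] for k in q_cur)
    return num / den

def fisher(p_base, p_cur, q_base, q_cur):
    """Geometric mean of the Laspeyres and Paasche indexes."""
    return math.sqrt(laspeyres(p_base, p_cur, q_base) * paasche(p_base, p_cur, q_cur))

# Hypothetical example: the price of rice doubles and consumers substitute toward fish.
p_base = {"rice": 1.0, "fish": 2.0}
p_cur = {"rice": 2.0, "fish": 2.0}
q_base = {"rice": 10.0, "fish": 5.0}
q_cur = {"rice": 6.0, "fish": 8.0}

L = laspeyres(p_base, p_cur, q_base)
P = paasche(p_base, p_cur, q_cur)
F = fisher(p_base, p_cur, q_base, q_cur)
print(f"Laspeyres={L:.4f}  Paasche={P:.4f}  Fisher={F:.4f}")
```

In this made-up case the Laspeyres index lies above the Fisher index because it keeps pricing the base basket after consumers have moved away from the good whose relative price rose, which is the substitution bias described in the text.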
The WCPD works as follows: for J regions, K goods, and T periods, the relationship between the prices of goods in different regions and periods is assumed to follow:

$$p_{k,j,t} = \rho_{j,t}\,\eta_k\,u_{k,j,t} \tag{1}$$

where $\rho_{j,t}$ is the price level in region j and period t relative to the base region/period, $\eta_k$ is the price level of good k relative to the base good, and $u_{k,j,t}$ is a random disturbance term. The price parameters ($\rho_{j,t}$ and $\eta_k$) in equation (1) can be directly estimated in a log-linear regression model, using the K×J×T prices from a spatially disaggregated price survey:

$$\sqrt{w_{k,j,t}}\,\ln p_{k,j,t} = \hat{\phi} + \sum_{j=1}^{J} \ln\rho_{j,0}\,\sqrt{w_{k,j,0}}\,D_{j,0} + \sum_{t=1}^{T}\sum_{j=0}^{J} \ln\rho_{j,t}\,\sqrt{w_{k,j,t}}\,D_{j,t} + \sum_{k=1}^{K} \eta_k\,\sqrt{w_{k,j,t}}\,D_k + u_{k,j,t} \tag{2}$$

where the weight $w_{k,j,t}$ for good k in region j and period t is described below, $D_{j,t}$ is a dummy variable for region j and period t, $D_k$ is a dummy for good k, and $\hat{\phi}$ is the intercept plus the coefficient of the omitted base category dummies.

15 EKS methods impose transitivity in the following way: first, make bilateral comparisons between all possible pairs of countries and then take the nth root of the product of all possible Fisher indices between n countries. Deaton and Dupriez (2011, 4) note that multilateral price indexes required for spatial work are typically not consistent with the inflation rates in local CPIs and so need to be calculated regularly, not just once, and updated by the local CPIs. The repeated implementation of the SCOLI for Vietnam in 2010 and 2012 fits with this requirement.

We use two types of weights so as to generate two price indexes for evaluating deflators from the food Engel curve. These represent two of the three broad approaches to approximating a cost-of-living index, with the demand systems approach not used. First, we use a variable-weight price index, by using as weights $(s_{kj,t} + s_{k0,1})/2$, where $s_{kj,t}$ is the average budget share of item k, in region j, and time t, and $s_{k0,1}$ is the average budget share for item k in the base period in region 0 (which we set to be the urban sector of the Red River region, where Hanoi is). We refer to this price index as WCPD-vw (for variable-weight), which gives estimated time-space deflators:

$$\ln\rho_{j,t} = \sum_{k=1}^{K} \left(\frac{s_{kj,t} + s_{k0,1}}{2}\right) \ln\!\left[\frac{p_{kj,t}}{p_{k0,1}}\right] \tag{3a}$$

The WCPD-vw allows for substitution since it uses budget shares from both the base region and period and also from the current region and period, but it exactly measures the cost of living only for homothetic preferences. Therefore, we also use a fixed-weight index that does not rely on homothetic preferences but is subject to substitution bias, by using $s_{k0,1}$ as the weight for all periods and regions. The time-space deflators for the WCPD-fw (for fixed-weight) index are:

$$\ln\rho_{j,t} = \sum_{k=1}^{K} \left(s_{k0,1}\right) \ln\!\left[\frac{p_{kj,t}}{p_{k0,1}}\right] \tag{3b}$$

Intuitively, WCPD-fw is a Laspeyres-like index but it is not exact. Selvanathan (1991) shows how an appropriately weighted linear regression lets one calculate a Laspeyres; the difference is that our WCPD models are log-linear. Nevertheless, the deflator in equation (3b) gives an alternative testing framework that does not depend on homothetic preferences and is close to the sort of price index that a statistics office would likely report if they had disaggregated price data.
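As a rough illustration of the variable-weight and fixed-weight deflators in equations (3a) and (3b), the sketch below computes both for one comparison region against a base region. It assumes the share-weighted sums are log price levels, so the deflator is their exponential; the three goods, their prices, and their budget shares are hypothetical, and the function names are ours rather than anything used in the paper.

```python
import math

def wcpd_vw_deflator(prices, shares, prices_base, shares_base):
    """Variable-weight deflator in the form of equation (3a):
    log price relatives weighted by the average of current and base budget shares."""
    log_level = sum(
        0.5 * (shares[k] + shares_base[k]) * math.log(prices[k] / prices_base[k])
        for k in prices_base
    )
    return math.exp(log_level)  # assumption: the sum is a log price level

def wcpd_fw_deflator(prices, prices_base, shares_base):
    """Fixed-weight deflator in the form of equation (3b):
    log price relatives weighted by base-region budget shares only."""
    log_level = sum(
        shares_base[k] * math.log(prices[k] / prices_base[k]) for k in prices_base
    )
    return math.exp(log_level)  # assumption: the sum is a log price level

# Hypothetical base region (budget shares sum to one) and one comparison region.
prices_base = {"rice": 10.0, "pork": 50.0, "fuel": 20.0}
shares_base = {"rice": 0.2, "pork": 0.3, "fuel": 0.5}
prices_rural = {"rice": 8.0, "pork": 45.0, "fuel": 22.0}
shares_rural = {"rice": 0.4, "pork": 0.3, "fuel": 0.3}

vw = wcpd_vw_deflator(prices_rural, shares_rural, prices_base, shares_base)
fw = wcpd_fw_deflator(prices_rural, prices_base, shares_base)
print(f"WCPD-vw deflator: {vw:.4f}")
print(f"WCPD-fw deflator: {fw:.4f}")
```

In this made-up example the comparison region spends more on the good that is relatively cheap there, so the variable-weight deflator comes out below the fixed-weight one; the gap is the kind of substitution effect the fixed-weight index ignores.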
Engel Curve Method

In the original formulation of Hamilton (2001), the budget share of food at home for household i in region j and time period t, $w_{i,j,t}$, is treated as a linear function of the logarithm of real household income, a relative price term and control variables:

$$w_{i,j,t} = \phi + \gamma\left(\ln P_{F,j,t} - \ln P_{N,j,t}\right) + \beta\left(\ln Y_{i,j,t} - \ln P_{j,t}\right) + X'\theta + u_{i,j,t} \tag{4}$$

where $P_{F,j,t}$, $P_{N,j,t}$, and $P_{j,t}$ are the true but unobserved prices of food, nonfood, and all goods, Y is total expenditure (a permanent income proxy), X represents control variables, and u the disturbance. The true cost of living is a geometric weighted average of food and nonfood prices:

$$\ln P_{j,t} = \alpha \ln P_{F,j,t} + (1-\alpha)\ln P_{N,j,t} \tag{5}$$

Hamilton assumed prices of good G (food, nonfood, or all goods) are measured with error,

$$\ln P_{G,j,t} = \ln P_{G,j,0} + \ln\left(1+\Pi_{G,j,t}\right) + \ln\left(1+E_{G,t}\right), \tag{6}$$

where $\Pi_{G,j,t}$ is the cumulative percentage increase in the CPI-measured price of good G from period 0 to period t and $E_{G,t}$ is the period-t cumulative measurement error in the price index since that base period. Inserting equation (6) into equation (4) gives:

$$\begin{aligned} w_{i,j,t} = {} & \phi + \gamma\left[\ln\left(1+\Pi_{F,j,t}\right) - \ln\left(1+\Pi_{N,j,t}\right)\right] + \beta\left[\ln Y_{i,j,t} - \ln\left(1+\Pi_{j,t}\right)\right] + X'\theta \\ & + \gamma\left[\ln\left(1+E_{F,t}\right) - \ln\left(1+E_{N,t}\right)\right] - \beta\ln\left(1+E_{t}\right) \\ & + \gamma\left(\ln P_{F,j,0} - \ln P_{N,j,0}\right) - \beta\ln P_{j,0} + u_{i,j,t}. \end{aligned} \tag{7}$$

An estimable version of equation (7) using a time-series of cross-sectional household budget data and a temporal CPI for food, nonfood, and all consumption is:

$$\begin{aligned} w_{i,j,t} = {} & \hat{\phi} + \gamma\left[\ln\left(1+\Pi_{F,j,t}\right) - \ln\left(1+\Pi_{N,j,t}\right)\right] + \beta\left[\ln Y_{i,j,t} - \ln\left(1+\Pi_{j,t}\right)\right] + X'\theta \\ & + \sum_{t=1}^{T}\delta_t D_t + \sum_{j=1}^{J}\delta_j D_j + u_{i,j,t} \end{aligned} \tag{8}$$

where $D_t$ are time dummy variables, $D_j$ are regional dummies, and $\hat{\phi}$ is the intercept from equation (7), plus the coefficients of the omitted time and region dummies. In the usual time series usage, the coefficients on the time dummy variables, $\delta_t$, are key to the measurement of deflator bias; the calculation of real income should already have put households observed in different years on the same cost-of-living basis so there should be no temporal “drift” in the residual food share. These dummy coefficients capture relative price effects, differential bias for food and nonfood, and overall deflator bias scaled by the coefficient on income:

$$\delta_t = \gamma\left[\ln\left(1+E_{F,t}\right) - \ln\left(1+E_{N,t}\right)\right] - \beta\ln\left(1+E_t\right). \tag{9}$$

If the degree of CPI-bias in food and nonfood is approximately equal (or if relative price movements have only small effects on food budget shares) then Hamilton (2001) shows that:

$$\ln\left(1+E_t\right) \approx -\delta_t/\beta \tag{10}$$

with the cumulative CPI bias at period t, $E_t$, just a simple ratio of coefficients: $1 - \exp(-\hat{\delta}_t/\hat{\beta})$.

To adapt this method to time-space deflation in the manner of Almås and coauthors, it requires three changes to the framework. First, rather than having a vector of spatial dummies, $D_j$, and a separate vector of temporal dummies, $D_t$, time-space dummies $D_{j,t}$, which equal 1 for region j and period t, are needed so temporal patterns can vary across spatial units and spatial patterns can vary over time. Second, the relative price of food has to be measured at a more spatially disaggregated level, which we here call area, a, otherwise the $\hat{\gamma}$ is identified from the same regional and temporal variation as the time-space dummies and perfect collinearity will result. The third change is that income and the relative price of food need to be in nominal terms so that what was previously calculated from the dummy variable coefficients as deflator bias is now the estimate of the omitted true cost of living $P_{j,t}$.
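The step from estimated coefficients to a price level can be sketched as follows. Taking the approximation in equation (10) and reading the dummy coefficient as capturing the omitted cost of living (the third change described above), the implied price level for a region-period is the exponential of minus the dummy coefficient divided by the income coefficient. The coefficient values, region labels, and expenditure figure below are hypothetical and are not estimates from the paper.

```python
import math

def engel_price_level(dummy_coef, income_coef):
    """Price level implied by a food Engel curve: exp(-dummy_coef / income_coef).

    dummy_coef  -- coefficient on the region-period dummy (zero for the base)
    income_coef -- coefficient on log nominal expenditure (negative if the food
                   share falls as expenditure rises)
    """
    if income_coef == 0:
        raise ValueError("income coefficient must be non-zero")
    return math.exp(-dummy_coef / income_coef)

# Hypothetical coefficients, for illustration only.
income_coef = -0.10
dummy_coefs = {"base": 0.0, "rural_A": 0.02, "rural_B": -0.01}

levels = {r: engel_price_level(d, income_coef) for r, d in dummy_coefs.items()}

# Deflating the same nominal expenditure by each implied price level.
nominal = 1000.0
real = {r: nominal / level for r, level in levels.items()}

for r in levels:
    print(f"{r}: price level {levels[r]:.4f}, real expenditure {real[r]:.1f}")
```

With a negative income coefficient, a region whose households show a higher food share than base-region households at the same nominal expenditure gets a positive dummy coefficient and hence an implied price level above one. This is also why any omitted determinant of food shares that varies over space feeds directly into the deflator.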
After these changes, the estimating equation is:

$$\begin{aligned}
w_{i,a,j,t} = {} & \hat{\phi} + \gamma\left(\ln P^{*}_{F,a,j,t} - \ln P^{*}_{N,a,j,t}\right) + \beta\ln Y_{i,a,j,t} + X'\theta \\
& + \sum_{j=1}^{J}\delta_{j,0}D_{j,0} + \sum_{t=1}^{T}\sum_{j=0}^{J}\delta_{j,t}D_{j,t} + u_{i,a,j,t}
\end{aligned} \tag{11}$$

where the starred terms are nominal price indexes for food and nonfood, and the intercept $\hat{\phi}$ now includes the coefficient of only a single omitted dummy, $D_{0,0}$. The estimated PPP for the price level in region $j$ and time period $t$ is then calculated as:

$$P_{j,t} = \exp\left(-\delta_{j,t}/\beta\right). \tag{12}$$

IV. Estimation Results and Implied Deflators

The deflators are estimated for Vietnam’s six broad regions (see figure 1), with the cost of living allowed to vary between urban and rural sectors within regions. As a first step, the prices had to map to average budget shares from the VHLSS, which has 120 commodity groups, while prices for fewer groups were surveyed. Budget shares for some groups are combined to match the prices, and the reverse also occurs, with fourteen groups having multiple prices mapping to a single budget share; these prices are first aggregated to budget-share level. In some cases, especially residual categories such as “other vegetables,” the mapping was from the prices of closely related items (e.g., for all of the specific items in the broad group whose residual category was lacking a price). Finally, eleven minor items, which in total accounted for just three percent of the average budget, had no prices available, and these are ignored in the analysis. The definition of the items that were priced and the consumption group(s) they map to are reported in appendix table 1 based on the data available from the 2010 survey. For most of the analysis we use this mapping, since the greater disaggregation afforded by the 2012 price survey cannot be used when working with the pooled data from 2010 and 2012.
But using the more disaggregated items makes almost no difference to the spatial deflators since the additional items priced in 2012 have small budget shares and have regional price relativities that were similar to the relativities for the substitute item(s) that had been used in their stead in 2010.

Figure 1. The Six Broad Regions of Vietnam

Another modeling choice concerns use of imputed prices for item-market combinations with the target specification missing (13% of all cases). The imputations used regressions of the price of the target specification on prices of alternatives gathered in the survey, controlling for regional fixed effects and brand name fixed effects (for unbranded items, quasi-brands are formed by dividing into intervals based on unit prices). To show the effect of including imputed values (and other modeling choices discussed below), a bilateral Törnqvist index is reported in appendix table 2. This has the advantage of simplicity and also matches what the GSO and World Bank (2012) used in their poverty analysis. The regional deflators in columns (1) and (2) are almost the same with or without imputed values, so the imputed values are used for the remaining analysis.

One important price not observed was rents or the user cost of housing services. Instead, econometric analysis of the VHLSS housing module enabled a hedonic house value equation to be estimated. Regional and temporal variation in reported dwelling values (conditioning on over 60 variables) is used as a proxy for prices. Values are used because there is almost no rental activity recorded in the VHLSS, precluding use of either actual rents or hypothetical rents as measures of regional and temporal price relativities for housing.
In the third column of appendix table 2, the price index that results from estimating the housing equation on pooled data for 2010 and 2012 is reported, which compares with the index in column (2) where the housing equation is estimated just on VHLSS data for 2010. This makes almost no difference to the spatial patterns, so the pooled housing value equation is used in the results that follow.

The final modeling issue is whether the more aggregated mapping from prices to budget shares based on the 2010 survey gives different results than using the finer mapping based on the 37 extra items priced in 2012. To study this issue we first generate a price index for regions in 2012 on a 2010 base (column (4)), create inflation factors for each region (column (5)), and then rebase the 2012 regional price differences. The spatial patterns are very similar to 2010, with a correlation between the two years of 0.97. The final column of appendix table 2 has the spatial price index for 2012 if prices of the 37 additional items available that year are used. This is almost identical to what results from keeping the level of aggregation from 2010, with a correlation between the deflators in columns (6) and (7) of 0.997. Thus basing the analysis on the more aggregated mapping of prices to budget shares from 2010 should not be a source of bias.

The estimates of the main coefficients for the WCPD and food Engel curve regressions (equations (2) and (11)) are reported in table 1. There are two sets of results, first considering prices for items that cover all consumption (and the food share based on that) and then for an aggregate and food share that excludes housing and durable goods. Housing and durables have a combined budget share of almost one-fifth but are only lightly covered in the price survey, with housing prices from a hedonic regression and just a single specification for durables (a Samsung 21 inch television, although the 2012 survey added a DVD player and a motorcycle).
By comparing Engel curve and WCPD deflators with and without housing and durables, we can assess whether any failure of the “no-price” Engel curve method to match the indexes from the WCPD is driven by these major items for which it is difficult to obtain surveyed prices.

In addition to the coefficients reported, the Engel curve regression includes as covariates household size; four demographic ratios (for children, youth, elderly, and migrants); the gender, age, sector of activity, and education of the household head; and prices for two types of street meals, which are a close substitute for food at home, the numerator of the food share in the dependent variable. Including these prices of close substitutes (and the relative price of food, whose coefficient is reported in table 1) favors the Engel curve method; typically these would be unobserved absent a price survey because unit values are unavailable for street meals and nonfoods given that household surveys usually just obtain reports on the quantities of well-defined food groups. Since the WCPD models include many more covariates (the fixed effects for each item) the summary statistics reported are adjusted-R2, which range from 0.57 to 0.61 for the all-consumption aggregate and from 0.66 to 0.67 when housing and durables are excluded. The adjusted-R2 is lower for the food Engel curve, at 0.58 (and 0.45 without housing and durables).16

Table 1. Key Coefficients from WCPD and Food Engel Curve Regressions

Columns 1 to 3 use all consumption; columns 4 to 6 exclude housing and durables. Row labels give sector, region, and survey year. Standard errors are in parentheses.

| Sector, region, year | All: WCPD-vw | All: WCPD-fw | All: Engel | Excl.: WCPD-vw | Excl.: WCPD-fw | Excl.: Engel |
|---|---|---|---|---|---|---|
| Urban Mid-Northern Mountains 2010 | -0.081 (0.028)** | -0.084 (0.030)** | 0.016 (0.007)* | -0.024 (0.019) | -0.018 (0.019) | 0.015 (0.008) |
| Urban North-Central Coast 2010 | -0.147 (0.029)*** | -0.146 (0.030)*** | -0.025 (0.005)*** | -0.095 (0.019)*** | -0.093 (0.019)*** | -0.028 (0.007)*** |
| Urban Central Highlands 2010 | -0.111 (0.029)*** | -0.104 (0.030)*** | -0.040 (0.007)*** | -0.090 (0.019)*** | -0.088 (0.019)*** | -0.054 (0.010)*** |
| Urban Southeast 2010 | -0.025 (0.029) | -0.019 (0.030) | -0.042 (0.005)*** | -0.025 (0.019) | -0.020 (0.019) | -0.071 (0.007)*** |
| Urban Mekong Delta 2010 | -0.180 (0.028)*** | -0.186 (0.030)*** | -0.017 (0.006)** | -0.127 (0.019)*** | -0.125 (0.019)*** | -0.034 (0.008)*** |
| Rural Red River 2010 | -0.148 (0.028)*** | -0.147 (0.030)*** | -0.009 (0.005) | -0.108 (0.019)*** | -0.110 (0.019)*** | 0.012 (0.008) |
| Rural Mid-Northern Mountains 2010 | -0.107 (0.028)*** | -0.123 (0.030)*** | 0.045 (0.005)*** | -0.047 (0.019)* | -0.047 (0.019)* | 0.039 (0.006)*** |
| Rural North-Central Coast 2010 | -0.203 (0.028)*** | -0.228 (0.030)*** | -0.002 (0.006) | -0.136 (0.019)*** | -0.142 (0.019)*** | -0.011 (0.008) |
| Rural Central Highlands 2010 | -0.164 (0.028)*** | -0.179 (0.030)*** | 0.004 (0.006) | -0.110 (0.019)*** | -0.116 (0.019)*** | -0.007 (0.009) |
| Rural Southeast 2010 | -0.157 (0.028)*** | -0.160 (0.030)*** | -0.038 (0.006)*** | -0.103 (0.019)*** | -0.102 (0.019)*** | -0.058 (0.007)*** |
| Rural Mekong Delta 2010 | -0.231 (0.028)*** | -0.252 (0.030)*** | 0.015 (0.005)** | -0.173 (0.019)*** | -0.180 (0.019)*** | 0.000 (0.008) |
| Urban Red River 2012 | 0.191 (0.029)*** | 0.191 (0.030)*** | 0.025 (0.005)*** | 0.231 (0.019)*** | 0.231 (0.019)*** | -0.001 (0.012) |
| Urban Mid-Northern Mountains 2012 | 0.122 (0.029)*** | 0.125 (0.030)*** | 0.011 (0.007) | 0.223 (0.019)*** | 0.229 (0.019)*** | -0.009 (0.012) |
| Urban North-Central Coast 2012 | 0.059 (0.029)* | 0.059 (0.030)* | 0.003 (0.005) | 0.142 (0.019)*** | 0.144 (0.019)*** | -0.026 (0.010)* |
| Urban Central Highlands 2012 | 0.101 (0.029)*** | 0.107 (0.030)*** | -0.005 (0.007) | 0.164 (0.019)*** | 0.163 (0.019)*** | -0.037 (0.012)** |
| Urban Southeast 2012 | 0.112 (0.029)*** | 0.111 (0.030)*** | -0.023 (0.005)*** | 0.154 (0.019)*** | 0.151 (0.019)*** | -0.072 (0.010)*** |
| Urban Mekong Delta 2012 | 0.001 (0.028) | -0.006 (0.030) | 0.000 (0.006) | 0.102 (0.019)*** | 0.101 (0.019)*** | -0.040 (0.009)*** |
| Rural Red River 2012 | 0.050 (0.029) | 0.053 (0.030) | 0.013 (0.005)** | 0.125 (0.019)*** | 0.129 (0.019)*** | 0.014 (0.008) |
| Rural Mid-Northern Mountains 2012 | 0.074 (0.028)** | 0.073 (0.030)* | 0.056 (0.005)*** | 0.175 (0.019)*** | 0.186 (0.019)*** | 0.025 (0.009)** |
| Rural North-Central Coast 2012 | 0.002 (0.028) | -0.012 (0.030) | 0.010 (0.005) | 0.103 (0.019)*** | 0.105 (0.019)*** | -0.018 (0.008)* |
| Rural Central Highlands 2012 | 0.056 (0.028)* | 0.050 (0.030) | 0.030 (0.006)*** | 0.148 (0.019)*** | 0.152 (0.019)*** | -0.007 (0.010) |
| Rural Southeast 2012 | 0.010 (0.028) | 0.007 (0.030) | 0.012 (0.006)* | 0.096 (0.019)*** | 0.094 (0.019)*** | -0.022 (0.009)** |
| Rural Mekong Delta 2012 | -0.063 (0.028)* | -0.085 (0.030)** | 0.015 (0.005)** | 0.046 (0.019)* | 0.044 (0.019)* | -0.022 (0.007)** |
| Log total hhold expenditure | | | -0.138 (0.002)*** | | | -0.137 (0.002)*** |
| Log relative price of food | | | 0.021 (0.006)*** | | | -0.054 (0.041) |
| Constant | 0.051 (0.028) | 0.055 (0.032) | 1.218 (0.052)*** | -0.015 (0.018) | -0.013 (0.020) | 1.163 (0.061)*** |
| Observations | 1,920 | 1,920 | 18,798 | 1,872 | 1,872 | 18,798 |
| Adjusted R-squared | 0.574 | 0.612 | 0.576 | 0.664 | 0.666 | 0.454 |

Note: The WCPD regressions include as unreported covariates seventy-nine fixed effects for each commodity (seventy-seven if excluding housing and durables) and differ according to whether they use fixed weights (fw) or variable weights (vw). The unreported coefficients for the Engel curve regression are on household size and four demographic ratios (for children, youths, elderly, and migrants), the gender, age, sector of activity, and education of the household head, and prices for foods eaten away from home. Robust standard errors in (), with statistical significance denoted as: ***p < 0.001, **p < 0.01, *p < 0.05.

The spatial variation in the cost-of-living can be observed from the size and significance of the dummy variable coefficients for each region and sector. According to the WCPD results (regardless of weights), the only area in 2010 without a significantly lower cost of living than the base region of urban Red River (Hanoi) is the urban Southeast, which has Ho Chi Minh City. The region and sector with the lowest cost-of-living is the rural Mekong Delta, which is Vietnam’s rice bowl. Except for the Red River and Southeast regions, the between-region differences in the cost-of-living exceed the urban-rural differences within region, reflecting the fact that, apart from Hanoi and Ho Chi Minh City, most cities are small and not highly differentiated from their rural hinterland. These patterns are quite similar to those found in 1993 with the VLSS, which is consistent with the regional variations in the cost-of-living changing only slowly, since they reflect climate, infrastructure, population density, topography, and other factors that are unlikely to vary much in the short-run.

The time-space price indexes derived from the WCPD and Engel curve methods are reported in table 2.
Also reported is a test of the hypothesis that the price index for a particular region, sector, and year from the Engel curve method differs statistically significantly from WCPD-vw (using * to denote significance) or from WCPD-fw (using # for significance). There are 13 (out of 23) such occurrences of significant differences when the full consumption aggregate is used, and these are the same regardless of which benchmark is used. If housing and durables are excluded there are 19 (18) significant differences between the Engel curve deflators and those from WCPD-vw (WCPD-fw). It appears that an abbreviated consumption aggregate (from dropping two major items whose prices are hard to survey) does not improve the performance of the Engel curve method in terms of matching benchmark price indexes that are typical of what statistics offices would report if they had a survey of spatially disaggregated prices. Therefore, in the rest of the paper, the comparisons use the “all consumption” results, which also lets us match to the published poverty and inequality estimates for 2010 that are based on this same comprehensive consumption aggregate.

16 This exceeds the average R2 in the Engel curves of Almås and coauthors (0.44), so any poor performance of the Engel curve deflators here should not be due to a poorly specified regression model.

Table 2. Time-Space Price Indexes from WCPD and Food Engel Curves (Urban Red River in 2010 = 100)

Columns 1 to 3 use all consumption; columns 4 to 6 exclude housing and durables. Row labels give sector, region, and survey year. Standard errors are in parentheses.

| Sector, region, year | All: WCPD-vw | All: WCPD-fw | All: Engel | Excl.: WCPD-vw | Excl.: WCPD-fw | Excl.: Engel |
|---|---|---|---|---|---|---|
| Urban Mid-Northern Mountains 2010 | 92.2 (2.6) | 91.9 (2.7) | 112.0**,## (5.5) | 97.6 (1.8) | 98.2 (1.9) | 111.7*,# (6.5) |
| Urban North-Central Coast 2010 | 86.4 (2.5) | 86.4 (2.6) | 83.7 (3.2) | 91.0 (1.7) | 91.1 (1.7) | 81.2*,# (4.0) |
| Urban Central Highlands 2010 | 89.5 (2.6) | 90.1 (2.7) | 74.7**,### (3.8) | 91.4 (1.7) | 91.6 (1.8) | 67.4***,### (5.0) |
| Urban Southeast 2010 | 97.5 (2.8) | 98.1 (2.9) | 73.5***,### (2.9) | 97.5 (1.8) | 98.0 (1.9) | 59.5***,### (3.2) |
| Urban Mekong Delta 2010 | 83.5 (2.4) | 83.1 (2.5) | 88.1 (3.9) | 88.1 (1.7) | 88.2 (1.7) | 78.2* (4.6) |
| Rural Red River 2010 | 86.2 (2.5) | 86.3 (2.6) | 93.7 (3.3) | 89.7 (1.7) | 89.5 (1.7) | 109.4**,## (6.6) |
| Rural Mid-Northern Mountains 2010 | 89.9 (2.5) | 88.5 (2.6) | 138.8***,### (5.5) | 95.4 (1.8) | 95.4 (1.8) | 133.3***,### (6.4) |
| Rural North-Central Coast 2010 | 81.6 (2.3) | 79.6 (2.4) | 98.6***,## (4.0) | 87.3 (1.6) | 86.8 (1.7) | 92.0 (5.1) |
| Rural Central Highlands 2010 | 84.9 (2.4) | 83.6 (2.5) | 103.1***,## (4.8) | 89.6 (1.7) | 89.0 (1.7) | 95.0 (6.1) |
| Rural Southeast 2010 | 85.5 (2.4) | 85.3 (2.5) | 75.6*,## (3.3) | 90.2 (1.7) | 90.3 (1.7) | 65.6***,### (3.6) |
| Rural Mekong Delta 2010 | 79.4 (2.2) | 77.8 (2.3) | 111.6***,### (4.3) | 84.1 (1.6) | 83.5 (1.6) | 100.3**,## (6.0) |
| Urban Red River 2012 | 121.1 (3.5) | 121.1 (3.6) | 119.6 (4.4) | 125.9 (2.4) | 125.9 (2.4) | 99.4**,## (8.7) |
| Urban Mid-Northern Mountains 2012 | 112.9 (3.2) | 113.4 (3.4) | 108.6 (5.7) | 125.0 (2.4) | 125.7 (2.4) | 93.4***,### (8.5) |
| Urban North-Central Coast 2012 | 106.1 (3.0) | 106.1 (3.1) | 101.8 (3.9) | 115.2 (2.2) | 115.5 (2.2) | 82.9***,### (6.2) |
| Urban Central Highlands 2012 | 110.6 (3.2) | 111.3 (3.3) | 96.4*,## (5.0) | 117.8 (2.2) | 117.7 (2.3) | 76.3***,### (6.6) |
| Urban Southeast 2012 | 111.8 (3.2) | 111.7 (3.3) | 84.8***,### (3.3) | 116.6 (2.2) | 116.3 (2.3) | 59.0***,### (4.3) |
| Urban Mekong Delta 2012 | 100.1 (2.9) | 99.4 (2.9) | 100.1 (4.4) | 110.7 (2.1) | 110.7 (2.1) | 74.5***,### (5.1) |
| Rural Red River 2012 | 105.1 (3.0) | 105.5 (3.1) | 110.0 (3.7) | 113.3 (2.1) | 113.7 (2.2) | 110.4 (6.3) |
| Rural Mid-Northern Mountains 2012 | 107.7 (3.1) | 107.5 (3.2) | 149.9***,### (5.9) | 119.1 (2.3) | 120.4 (2.3) | 119.7 (8.1) |
| Rural North-Central Coast 2012 | 100.2 (2.8) | 98.8 (2.9) | 107.4 (4.0) | 110.8 (2.1) | 111.0 (2.1) | 88.0***,### (4.9) |
| Rural Central Highlands 2012 | 105.7 (3.0) | 105.1 (3.1) | 123.9**,# (5.8) | 116.0 (2.2) | 116.4 (2.2) | 95.0**,## (7.1) |
| Rural Southeast 2012 | 101.1 (2.9) | 100.7 (3.0) | 109.2 (4.5) | 110.0 (2.1) | 109.9 (2.1) | 84.9***,### (5.3) |
| Rural Mekong Delta 2012 | 93.9 (2.7) | 91.8 (2.7) | 111.1***,## (4.3) | 104.7 (2.0) | 104.5 (2.0) | 85.0***,### (4.2) |

Note: Price indexes follow equations (3a) for WCPD-vw and (3b) for WCPD-fw and equation (12) for the food Engel curve method. Robust standard errors in (). Statistically significant differences between the Engel curve price index and the WCPD-vw (WCPD-fw) price index for a region and time period are denoted as: ***p < 0.001, **p < 0.01, *p < 0.05 (###p < 0.001, ##p < 0.01, #p < 0.05).

Figure 2. Comparison of Spatial Deflators (for 2010): Urban Red River = 100

Figure 3. Comparison of Inflation Estimates

The different spatial patterns for the WCPD and Engel curve deflators are illustrated in figures 2a and 2b, using the results for 2010 (based on the first eleven rows and first three columns of table 2). The Engel curve deflator implies that several rural areas have higher costs of living than in the capital city, markedly so in the case of the Mid-Northern Mountains region where, according to the Engel curve, the price level in 2010 is up to 39% above the urban Red River price level.
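As a rough check on how equation (12) links the two tables, the short sketch below applies exp(-delta/beta) to a few of the all-consumption Engel curve coefficients published in table 1 and compares the results with the corresponding table 2 indexes. The function and variable names are ours, and the small gaps that remain are presumably due to the coefficients being published to three decimal places; that explanation is an assumption rather than something the paper states.

```python
import math

# Coefficient on log total household expenditure, table 1 (all consumption, Engel).
BETA_ENGEL = -0.138

# (Engel dummy coefficient from table 1, published Engel index from table 2)
CELLS = {
    "Urban Southeast 2010": (-0.042, 73.5),
    "Rural Mid-Northern Mountains 2010": (0.045, 138.8),
    "Rural Mekong Delta 2010": (0.015, 111.6),
    "Urban Red River 2012": (0.025, 119.6),
}


def engel_index(delta, beta=BETA_ENGEL):
    """Equation (12), scaled so that the base cell (urban Red River 2010) is 100."""
    return 100.0 * math.exp(-delta / beta)


for name, (delta, published) in CELLS.items():
    implied = engel_index(delta)
    print(f"{name}: implied {implied:.1f}, published {published:.1f}")
```

A positive dummy coefficient combined with a negative income coefficient gives an index above 100, which is how the higher residual food shares in the rural Mid-Northern Mountains translate into an apparently higher cost of living than in Hanoi.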
Poverty maps show poverty is increasingly concentrated in the Northern Mountains (World Bank 2012), which is a region that could aptly be described as Vietnam’s Appalachia given its inaccessibility and topography. It is surprising to consider that such a region could have the highest cost of living in the whole nation, especially with housing included in the comparison. Also surprising is the position of the rural Mekong Delta as having the second highest cost-of-living, given that this is Vietnam’s rice bowl, with rice moving out of this region to feed the rest of the country. The correlations between the benchmarks and the Engel curve estimates of the spatial price indexes for 2010 are -0.16 for WCPD-vw and -0.22 for WCPD-fw.

The spatial deflators affect estimates of the level and location of poverty and the gap between nominal and real inequality, while estimated cost-of-living changes affect assessment of overall progress in raising living standards and escaping poverty. Once again, the experience of inflation implied by the Engel curve deflator is unrelated to the record of inflation given by the WCPD benchmarks, with zero correlation between benchmarks and Engel estimates (figures 3a and 3b). The cost-of-living increase between the 2010 and 2012 surveys ranges from 15% to 25% with the WCPD-vw and from 14% to 26% with the WCPD-fw, with the least increase in the urban Southeast and the most in the rural Central Highlands. A much more varied experience of inflation is shown by the Engel curve deflator, with cost-of-living increases ranging from -3% to 45%. Such changes appear unlikely because of the arbitrage opportunities that they imply.
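The inflation ranges quoted above can be approximately reproduced from the all-consumption columns of table 2 by taking the ratio of each 2012 index to its 2010 counterpart. The sketch below does this for the cells at the ends of each range; the dictionary layout and function name are illustrative, and because the published indexes are rounded to one decimal place the results can differ from the quoted percentages by up to about one percentage point.

```python
# Table 2, all consumption: (index in 2010, index in 2012) for selected cells.
WCPD_VW = {
    "Urban Southeast": (97.5, 111.8),
    "Rural Central Highlands": (84.9, 105.7),
}
ENGEL = {
    "Urban Mid-Northern Mountains": (112.0, 108.6),
    "Rural Southeast": (75.6, 109.2),
    "Rural Mekong Delta": (111.6, 111.1),
}


def inflation_pct(index_2010, index_2012):
    """Percentage change in the cost of living between the two surveys."""
    return 100.0 * (index_2012 / index_2010 - 1.0)


for label, table in (("WCPD-vw", WCPD_VW), ("Engel", ENGEL)):
    for cell, (p10, p12) in table.items():
        print(f"{label:8s} {cell:30s} {inflation_pct(p10, p12):6.1f}%")
```

The two WCPD-vw cells sit at roughly 15% and 25%, while the Engel curve cells span from about -3% to the mid-40s, with the rural Mekong Delta essentially unchanged.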
For example, the Engel curve has the cost of living in the rural Southeast rising by 45%, which is three times faster than for that region’s urban sector and also contrasts with an estimate of an unchanging price level in the neighboring rural Mekong Delta. Such big price rises in the rural Southeast would be expected to attract food out of the rural Mekong Delta and industrial goods out of Ho Chi Minh City in order to moderate the price increases in the rural Southeast. Indeed, according to the WCPD deflator, the average gap between rural and urban inflation within a region is just 1.6 percentage points using variable weights or two percentage points using fixed weights, suggesting that price changes in urban areas and their hinterland largely move together. But for the Engel curve deflator, the average gap in the inflation experience of the rural and urban sectors within a region is thirteen percentage points, which seems unlikely to be true.

V. Impacts on Inequality and Poverty

The Engel curve deflator interprets the higher average food shares of households living in poor rural areas as evidence of a high cost-of-living in these areas. Consequently, the level of inequality appears higher in real terms than in nominal terms if the Engel curve deflator is used. In contrast, the WCPD price indexes show real inequality to be less than nominal inequality because regions and sectors that are nominally richer are found to have a higher price level; this positive relationship between nominal incomes and the price level is consistent with the Balassa-Samuelson effect.

Figure 4. Lorenz Curves for 2010, with and without Spatial Deflation
These differing effects of deflation are illustrated in figure 4 in the form of nominal and real Lorenz curves for Vietnam in 2010, with only the variable-weight version of the WCPD deflator used since the fixed-weight version gives very similar results. The Gini coefficients corresponding to these Lorenz curves are 0.427 for nominal consumption, 0.404 for real consumption when the WCPD deflator is used, and 0.465 if the Engel curve deflator is used. Thus, the use of the Engel curve deflator would introduce a bias of six Gini points into the measurement of real inequality, which is a relatively large effect.

A similar distortion is introduced into estimates of the overall poverty rate and the location of poverty. The existing evidence on poverty in Vietnam is that rural regions are poorer than urban regions (World Bank 2012). The Engel curve deflator exacerbates this effect by suggesting that three rural regions (the Mekong Delta, the Central Highlands, and the Mid-Northern Mountains) have a higher cost-of-living than even in Hanoi. The effects of this deflator are shown in table 3, which describes poverty in 2010 nationally and in the urban and rural sectors using the Engel curve and WCPD deflators from table 2. We use the $P_\alpha$ class of poverty measures of Foster, Greer, and Thorbecke (1984):

$$P_{\alpha} = \frac{1}{n}\sum_{i=1}^{q}\left(\frac{g_{i}}{z}\right)^{\alpha},$$

where $n$ is the total population, incomes are ordered from $i = 1$ as the poorest and $q$ are poor, $z$ is the poverty line, and $g_i$ is the poverty gap, $g_i = z - y_i$ ($y_i$ is per capita consumption in the $i$th household).17 For $\alpha = 0$, $P_0$ is the head-count index, for $\alpha = 1$, $P_1$ is the poverty gap index, and for $\alpha = 2$, $P_2$ is the squared poverty gap or poverty severity index.
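A minimal sketch of the Foster-Greer-Thorbecke formula follows, using a toy list of per capita consumption values and a toy poverty line that are invented for illustration. It is unweighted, which is an assumption made for simplicity; estimates from a survey such as the VHLSS would need to use the sampling weights.

```python
def fgt(consumption, z, alpha):
    """P_alpha = (1/n) * sum over the poor of ((z - y_i) / z) ** alpha.

    `consumption` holds real per capita consumption for each person,
    `z` is the poverty line, and only people with y_i < z count as poor.
    """
    n = len(consumption)
    return sum(((z - y) / z) ** alpha for y in consumption if y < z) / n


# Toy data (hypothetical): four people and a poverty line of 100.
toy_consumption = [40.0, 80.0, 120.0, 200.0]
toy_line = 100.0

headcount = fgt(toy_consumption, toy_line, 0)   # share of people who are poor
poverty_gap = fgt(toy_consumption, toy_line, 1)  # average normalized shortfall
severity = fgt(toy_consumption, toy_line, 2)     # average squared shortfall

print(headcount, poverty_gap, severity)
```

Deflating consumption by a higher price index lowers real consumption and so moves people below the line and deepens the gaps of those already poor, which is why the choice of deflator feeds directly into all three measures.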
The $P_\alpha$ class additively decomposes contributions from each subgroup to the total level of poverty, which is reported in the table as the “share” of poverty. Another useful manipulation of the $P_\alpha$ measures is to calculate the “risk” of poverty, which is the poverty rate for a particular subgroup relative to the overall average, and this is also reported.

Table 3. FGT Poverty Measures for the Rural and Urban Sectors in 2010, Comparing Three Deflators

| | Headcount (α = 0): Rate | Share | Risk | Poverty gap index (α = 1): Rate | Share | Risk | Poverty severity index (α = 2): Rate | Share | Risk |
|---|---|---|---|---|---|---|---|---|---|
| WCPD, variable weights | | | | | | | | | |
| Vietnam | 0.271 | 1.00 | 1.000 | 0.079 | 1.00 | 1.000 | 0.034 | 1.00 | 1.000 |
| Rural | 0.354 | 0.92 | 1.305 | 0.105 | 0.93 | 1.324 | 0.045 | 0.94 | 1.337 |
| Urban | 0.075 | 0.08 | 0.277 | 0.019 | 0.07 | 0.233 | 0.007 | 0.06 | 0.202 |
| WCPD, fixed weights | | | | | | | | | |
| Vietnam | 0.262 | 1.00 | 1.000 | 0.077 | 1.00 | 1.000 | 0.032 | 1.00 | 1.000 |
| Rural | 0.341 | 0.92 | 1.301 | 0.101 | 0.93 | 1.320 | 0.043 | 0.94 | 1.334 |
| Urban | 0.075 | 0.08 | 0.286 | 0.018 | 0.07 | 0.241 | 0.007 | 0.06 | 0.210 |
| Engel curve deflator | | | | | | | | | |
| Vietnam | 0.367 | 1.00 | 1.000 | 0.130 | 1.00 | 1.000 | 0.063 | 1.00 | 1.000 |
| Rural | 0.490 | 0.94 | 1.335 | 0.177 | 0.96 | 1.361 | 0.087 | 0.97 | 1.375 |
| Urban | 0.075 | 0.06 | 0.205 | 0.019 | 0.04 | 0.144 | 0.007 | 0.03 | 0.111 |

17 The poverty line of VND 653,000 used by World Bank (2012) is in national average prices of January 2010. In contrast the deflators used here are based on urban Red River prices, for surveys centred on October 2010, for which the equivalent poverty line is VND 881,000.

If the Engel curve deflator is used to adjust for regional and sectoral differences in the cost-of-living, it makes the national headcount poverty rate appear ten percentage points higher than if either WCPD deflator is used (37% versus 27%). This upward bias comes entirely from the rural sector, where the Engel curve deflator causes poverty to be overstated with proportionate biases of 39% in the headcount, 68% in the poverty gap, and 92% in the poverty severity index.
The basic pattern of the poverty profile is not altered by using the Engel curve deflator (poverty is overwhelmingly rural in Vietnam), but the policy response to finding that one-half (49%) of rural dwellers live in households below the poverty line, and that the risk of being poor in urban areas is just one-tenth the national risk (for the poverty severity index), is likely to be quite different from the response to finding just over one-third of the rural population poor, which is what is revealed when either a variable-weight or fixed-weight WCPD deflator is used.

Finally, the Engel curve deflator also biases assessment of progress in reducing poverty, in this case showing much faster progress between 2010 and 2012 than is likely (table 4). Recall that the Engel curve implied lower inflation (and even deflation) for most regions and sectors than what the WCPD indexes show (figure 3). In fact, only the rural Southeast and the urban Central Highlands, containing just 8% of Vietnam’s people, had Engel curve inflation higher than WCPD inflation. Consequently, much of the growth in nominal consumption between 2010 and 2012 is treated as real growth, and so the fall in poverty seems faster than it actually was. For example, the headcount poverty rate appears to fall by eleven percentage points between 2010 and 2012, compared with a seven percentage point decline when either WCPD index is used. For the other poverty measures shown in table 4, the change in poverty using the Engel curve deflated consumption is twice as large as the change using the other two deflators.
These other poverty measures include the average exit time measure of Morduch (1998), which shows the expected number of years to escape poverty with constant and uniform growth (assumed 3% per annum here).18 Use of the Engel curve deflator would lead one to find a three-year fall between 2010 and 2012 in the average time expected to escape poverty, and such progress may induce a false sense of achievement for Vietnam’s policy makers when compared with the actual record (based on either WCPD deflator) of just over a one-year reduction in expected poverty exit time.

Table 4. Poverty Comparisons for 2010 and 2012 (a)

| | H (α = 0) | PG (α = 1) | PS (α = 2) | T3% | T3%/H |
|---|---|---|---|---|---|
| WCPD, variable weights | | | | | |
| 2010 | 27.1 (0.6) | 7.9 (0.2) | 3.4 (0.1) | 3.6 (0.1) | 13.2 (0.3) |
| 2012 | 20.0 (0.5) | 5.4 (0.2) | 2.1 (0.1) | 2.3 (0.1) | 11.7 (0.3) |
| Change | -7.1 (0.6) | -2.6 (0.2) | -1.3 (0.1) | -1.2 (0.1) | -1.5 (0.3) |
| WCPD, fixed weights | | | | | |
| 2010 | 26.2 (0.6) | 7.7 (0.2) | 3.2 (0.1) | 3.4 (0.1) | 13.1 (0.3) |
| 2012 | 19.6 (0.5) | 5.2 (0.2) | 2.1 (0.1) | 2.3 (0.1) | 11.6 (0.3) |
| Change | -6.6 (0.6) | -2.4 (0.2) | -1.2 (0.1) | -1.2 (0.1) | -1.5 (0.3) |
| Engel curve deflator | | | | | |
| 2010 | 36.7 (0.6) | 13.0 (0.2) | 6.3 (0.2) | 6.2 (0.2) | 16.9 (0.3) |
| 2012 | 25.5 (0.5) | 7.9 (0.2) | 3.5 (0.1) | 3.6 (0.1) | 14.1 (0.3) |
| Change | -11.2 (0.6) | -5.1 (0.2) | -2.8 (0.1) | -2.6 (0.1) | -2.8 (0.3) |

Note: The first three columns are FGT poverty measures and the last two are average exit time measures. “H” is headcount index, “PG” is poverty gap index, “PS” is poverty severity index, “T3%” is the average exit time measure of Morduch (1998) at a 3% annual real growth rate, and “T3%/H” is the average exit time amongst the poor.
(a) Standard errors in () are adjusted for the stratification, clustering, and weighting of the data.

18 The exit time measure has the same properties as the poverty severity index (sensitivity to distribution amongst the poor) but allows an intuitive interpretation. It is calculated as $T_g = \frac{1}{N}\sum_{j=1}^{q}\left[\ln(z) - \ln(y_j)\right]/g$, where a constant and uniform growth rate $g$ would see person $j$ below the poverty line take $t_g$ years to reach the poverty line (the expected value, $T_g$, includes an exit time of zero for the nonpoor). For the average poor person, it takes $T_g/H$ years to escape poverty.

VI. Cost-Benefit Analysis of Spatially Disaggregated Price Surveys

The results in sections IV and V show that deflators from the food Engel curve appear to be a poor proxy for those obtained from the WCPD benchmark price indexes; compared to those benchmarks, estimates of the level, location, and change in poverty would be distorted if the Engel method deflator was used. We also note that researchers may turn to the Engel curve method, in part, because needed prices for time-space deflation are unavailable. In this section we join these two points in a cost-benefit analysis that asks the following question: could an analysis in the absence of prices (and instead using the Engel curve method to get the deflators) be so incorrect, and therefore so costly, that it would have been better for a government to spend the money to gather the spatially disaggregated prices needed for the first-best deflators?

Cost-benefit analyses of data infrastructure in poor countries are sorely lacking (Jerven 2014), in part because it is difficult to link data to outcomes, and it is not clear if bad policies are any less likely with better data. Despite those caveats, we proceed as follows: we assume that the goal of the price survey is to deflate in order to measure the total poverty gap, so that Vietnam’s authorities can budget the exact amount to eliminate poverty (with costless and perfectly targeted transfers). We obtain this budgetary figure from the product of the poverty gap index, the value of the poverty line, and the size of the population. The results from table 3 show that, if the Engel curve deflator is used, the poverty gap index in 2010 was 0.130, while it was just 0.079 if the WCPD-vw price index is used. The difference in the total value of the poverty gap is US$2.4b (at market exchange rates since World Bank funding to the GSO for the SCOLI survey was also at market exchange rates). Even if we assume just a one percent social loss from paying poverty alleviation funds to nonpoor people, the overstated poverty gap would have a social cost of US$24 million per year. In contrast, the cost of the SCOLI survey was just US$0.25m, and the survey runs only every second year. If mistargeted transfers are treated as more socially costly than one cent in the dollar, the benefit-cost ratio for spending money to get the needed spatial price data is even higher. If we use results from 2012, when the Engel curve does not overstate the poverty gap so much, the difference in the total value of the poverty gap is US$1.3b, and it still greatly exceeds the cost of the survey even at a one percent social loss rate.

While these are little more than back of the envelope calculations, they have some basis in the history of the SCOLI surveys in Vietnam. A growing concern about unreliable poverty results due to spatial deflators being formed from inappropriate temporal price indexes caused the World Bank and the GSO to invest in a new program of surveys. This program of work was of such use that Vietnam self-funded the 2012 survey since it also helped answer other policy questions, such as setting cost-of-living adjustments for public sector wages in major cities. Moreover, simple as they are, these calculations give an order of magnitude to the question of how costly it could be for a country if a “no-price” analysis was treated seriously by a government that had the wherewithal to undertake large scale transfer programs.
Even if a government did not design transfers based on deflators coming from a food Engel curve, there is a hidden cost when researchers develop and use “no-price” methods. In our opinion, researchers are implicitly saying “we don’t need price data” when they use “no-price” methods like the food Engel curve method of deflation. This reduces the demand placed on the statistics agency to provide higher quality and more extensive price data and price indexes. Governments respond to pressure from constituents, including economists and other researchers. If more demands were made for the right sort of price data, and if the cost of not having such data were shown, better outcomes might result than those that come from using “no-price” methods.

VII. Conclusions

In this paper we assess the performance of a “no-price” method of deflating for cost-of-living differences over time and space. Such methods are relied upon by some researchers because many large developing countries do not have spatially disaggregated price surveys. Yet such countries are exactly the places where spatial deflation is needed, since it is implausible to assume that prices are the same everywhere given high internal transport costs and an absence of major brands and chain stores setting prices on a national basis. Moreover, for developing countries emerging from a planned-economy past, like China and Vietnam, spatial price differences may be growing since urbanization and the development of urban land price differentials are starting from a low base, and so the need for spatial deflation is unlikely to be reduced in the near future.
The method assessed here relies on estimating a food Engel curve and defining the deflator as that needed for nominally similar households to have the same food budget shares in all regions and time periods. This method has been widely used in the literature examining bias in temporal price indexes, where it generally yields results that concur with what theory and other empirical approaches have suggested, in terms of the CPI being an upwardly biased measure of changes in the true cost of living. But there is much less guidance from either theory or from other empirical approaches about spatial deflators. The Balassa-Samuelson effect leads one to predict that the price level will be higher in regions and sectors where nominal incomes are higher, so spatial deflation should show less inequality, but one also can conceive of pathways by which people living in poor areas face higher costs (Muller 2002). Consequently, with more diffuse priors about spatial patterns in the cost of living, any empirical evidence, including from “no-price” approaches like the Engel curve method, may be quite influential. This makes the experience of Vietnam in 2010 and 2012, where there is a benchmark from comprehensive, spatially disaggregated price surveys, an important opportunity for assessing how well such “no-price” methods work in practice.

Our results show that spatial deflators and spatially disaggregated estimates of temporal inflation derived when the food Engel curve method is applied in Vietnam in 2010 and 2012 are poor proxies for the deflators obtained from two benchmark price indexes that rely on spatially disaggregated prices. Based on these benchmarks, substantial distortion in estimates of the level, location, and change in poverty and inequality would occur if Engel method deflators were used in Vietnam.
This scope for potentially wrong inferences leads us to conclude that while Engel curve methods may be a useful tool, amongst several, for examining bias in temporal deflators, they are unlikely to proxy for the multilateral price indexes that would be calculated from spatially disaggregated price surveys. Even in the temporal context, a concern exists that the Engel curve method is recovering changes in the cost of living for an unknown household that could be anywhere in the income distribution. As such, deflators based on food Engel curves do not appear to provide the reliable evidence needed to account for time-space differences in the cost of living, and there may be no substitute for spatially disaggregated price surveys in large developing countries.

Appendix Table 1. Mapping of Prices and Budget Shares

Code  Consumption survey group                 Avg budget share  Price survey item/specification
101   Plain rice                               0.082             White rice #1 (lower quality); White rice #2 (premium variety)
102   Sticky rice                              0.005             Sticky rice
103   Maize                                    0.001             —
104   Cassava                                  0.000             —
105   Potatoes                                 0.001             —
106   Bread, flour                             0.002             White bread
107   Instant noodles                          0.007             Instant noodles
108   Fresh rice noodles                       0.002             Fresh rice noodles
109   Vermicelli                               0.001             (a)
110   Pork                                     0.051             Pork: Rump; Pork: Belly
111   Beef                                     0.009             Beef; Fresh beef rib
112   Buffalo meat                             0.001             (b)
113   Chicken                                  0.024             Battery chicken meat; Live free range chicken; Free range chicken meat
114   Duck and other poultry                   0.006             Whole local duck
115   Other types of meat                      0.002             (c)
116   Processed meat                           0.005             Pork pie
117   Cooking oil, lard                        0.009             Cooking oil; Lard
118   Fresh shrimp, fish                       0.036             Carp; Salt-water shrimp; Fresh-water shrimp
119   Dried shrimp and fish                    0.004             Dried fish
120   Other seafood                            0.003             (d)
121   Eggs                                     0.007             Chicken eggs
122   Tofu                                     0.005             Tofu
123   Peanuts, sesame                          0.001             —
124   Beans of various kinds                   0.001             (e)
125   Fresh peas                               0.002             Fresh peas
126   Water morning glory                      0.004             Water morning glory
127   Kohlrabi                                 0.001             (f)
128   Cabbage                                  0.002             Cabbage
129   Tomatoes                                 0.002             Tomatoes
130   Other vegetables                         0.013             (g)
131   Oranges                                  0.002             Oranges
132   Bananas                                  0.003             Bananas
133   Mangoes                                  0.001             Mangoes
134   Other fruits                             0.010             (h)
135   Fish sauce                               0.005             Fish sauce
136   Salt                                     0.001             Salt
137   MSG                                      0.002             (i)
138   Glutamate                                0.004             (i)
139   Sugar                                    0.005             White sugar
140   Confectionery                            0.005             Fruit candies
141   Condensed milk                           0.007             Condensed milk
142   Ice cream, yoghurt, other dairy          0.002             (j)
143   Fresh milk                               0.004             —
144   Alcohol                                  0.006             Vodka
145   Beer                                     0.004             Bottled beer #1 (Northern brand); Bottled beer #2 (Southern brand)
146   Bottled and canned water, soft drinks    0.002             Soft drink; Fruit juice; Bottled water
147   Instant coffee                           0.001             (k)
148   Coffee powder                            0.001             Powdered coffee
149   Instant tea powder                       0.000             (l)
150   Other dried tea                          0.005             Dried tea
151   Cigarettes, waterpipe tobacco            0.010             Cigarettes #1 (Northern brand); Cigarettes #2 (Southern brand)
152   Betel leaves, areca nuts                 0.000             —
153   Outdoor meals                            0.074             Outdoor meals - breakfast; Outdoor meals - lunch/dinner
154   Other food and drinks                    0.013             (m)
201   Pocket money for children                0.009             (n)
204   Petrol                                   0.034             Petrol
205   Kerosene                                 0.001             Kerosene
212   Other types of fuel                      0.030             (o)
213   Deposit fees for vehicles                0.002             —
214   Matches, candles, fire stones, lighters  0.001             (p)
215   Soap, detergent                          0.007             Washing detergent
216   Dish washing liquid                      0.003             (q)
217   Shampoo, conditioner                     0.005             Shampoo
218   Bath soap, shower gel                    0.002             Soap
219   Skin care and cosmetics products         0.002             (r)
220   Tooth paste and brush                    0.004             Toothpaste
221   Toilet paper, razor                      0.002             Toilet paper
222   Books, newspapers, magazines for adults  0.001             Notebook
223   Books, newspapers for children           0.000             Notebook
224   Fresh, nonworship flowers                0.000             —
226   Regular worship activities               0.006             —
227   Haircut, hairdressing                    0.005             Men’s haircut; Ladies’ haircut
228   Other daily expenditures                 0.007             —
300   Nonfood, annual spending                 0.058             Tailoring; Puncture repair
400   Gifts for special occasions              0.012             —
dur   Durables (user cost)                     0.088             DVD player
edu   Education-related spending               0.036             Notebook; School fee for public high school
hlth  Health-related spending                  0.043             Paracetamol; Flu medicine
util  Utilities                                0.023             Electricity tariffs
rent  Rent                                     0.161             Hedonic regression on dwelling values

Notes: Average budget shares use democratic weights applied to the 2010–2012 pooled VHLSS dataset. For items with multiple prices per consumption survey group (separated by semicolons above), the price relativities are averaged before mapping to the budget shares. The 11 items with “—” have no prices and are ignored in the analysis. Items marked with a letter in parentheses use prices of similar items as follows: (a) fresh rice noodles; (b) beef; (c) beef, pork, chicken, and duck; (d) carp, shrimp, and dried fish; (e) fresh peas; (f) cabbage; (g) peas, water morning glory, cabbage, and tomatoes; (h) oranges, bananas, and mangoes; (i) salt; (j) condensed milk; (k) powdered coffee; (l) dried tea; (m) all foods; (n) instant noodles, candies, beef noodle soup, notebooks, and school fees; (o) petrol and kerosene; (p) cigarettes; (q) washing detergent; and (r) shampoo and soap.

Appendix Table 2.
Impact of Various Modeling Assumptions on Spatial Price Indexes and Inflation Rates

Columns: (1) No imputed prices; (2) Imputed prices; (3) Pooled house value equation; (4) 2012 on 2010 base; (5) Inflation since 2010; (6) Rebased to 2012; (7) Adding more prices.

Region                            (1)    (2)    (3)    (4)    (5)    (6)    (7)
Urban Red River                 100.0  100.0  100.0  118.7   18.7  100.0  100.0
Urban Mid-Northern Mountains     82.1   81.5   81.2   98.3   21.1   82.8   82.6
Urban North-Central Coast        77.9   77.1   77.1   94.6   22.7   79.7   77.9
Urban Central Highlands          86.5   86.7   86.2  104.9   21.7   88.3   88.7
Urban Southeast                  96.2   97.6   97.8  110.6   13.1   93.1   92.2
Urban Mekong Delta               74.8   74.2   74.4   87.3   17.3   73.5   73.6
Rural Red River                  80.7   80.0   78.8   95.8   21.6   80.7   79.8
Rural Mid-Northern Mountains     80.5   79.5   79.0   94.7   19.9   79.8   78.9
Rural North-Central Coast        71.4   70.6   70.0   85.8   22.6   72.3   70.3
Rural Central Highlands          77.3   77.1   76.4   94.6   23.8   79.7   79.3
Rural Southeast                  77.4   77.8   77.4   91.4   18.1   77.0   75.6
Rural Mekong Delta               70.5   70.0   69.8   79.7   14.2   67.1   67.2

Notes: The inflation factor reported is the change in the average price level for a region and sector from the 2010 survey (centered on October) to the 2012 survey (centered on June), so it is not an annual rate of inflation. The additional prices added in column (7) are for thirty more nonfoods and seven more foods, which were included in the 2012 price survey but not the 2010 survey.

References

Ackland, R., S. Dowrick, and B. Freyens. 2013. “Measuring Global Poverty: Why PPP Methods Matter.” Review of Economics and Statistics 95 (3): 813–824.
Almås, I. 2012. “International Income Inequality: Measuring PPP Bias by Estimating Engel Curves for Food.” American Economic Review 102 (2): 1093–117.
Almås, I., and A. Johnsen. 2012. “The Cost of Living in China: Implications for Inequality and Poverty.” Working Paper, Department of Economics, Norwegian School of Economics.
Almås, I., A. Kjelsrud, and R. Somanathan. 2013.
“A Behaviour-based Approach to the Estimation of Poverty in India.” CESifo Working Paper No. 4122.
Barrett, G., and M. Brzozowski. 2010. “Using Engel Curves to Estimate the Bias in the Australian CPI.” Economic Record 86 (272): 1–14.
Beatty, T., and E. Larsen. 2005. “Using Engel Curves to Estimate Bias in the Canadian CPI as a Cost of Living Index.” Canadian Journal of Economics 38 (2): 482–99.
Beatty, T., and T. Crossley. 2012. “Lost in Translation: What do Engel Curves tell us About the Cost of Living?” Mimeo, University of Minnesota.
Brandt, L., and C. Holz. 2006. “Spatial Price Differences in China: Estimates and Implications.” Economic Development and Cultural Change 55 (1): 43–86.
Breuer, C., and P. von der Lippe. 2011. “Problems of Operationalizing the Concept of a Cost-of-Living Index.” MPRA Paper No. 32902.
Chamon, M., and I. Filho. 2014. “Consumption Based Estimates of Urban Chinese Growth.” China Economic Review 29 (1): 126–37.
Chung, C., J. Gibson, and B. Kim. 2010. “CPI Mis-measurements and Their Impacts on Economic Management in Korea.” Asian Economic Papers 9 (1): 1–15.
Costa, D. 2001. “Estimating Real Income in the United States from 1888 to 1994: Correcting CPI Bias Using Engel Curves.” Journal of Political Economy 109 (6): 1288–310.
Deaton, A., J. Friedman, and V. Alatas. 2004. “Purchasing Power Parity Exchange Rates from Household Survey Data: India and Indonesia.” Princeton Research Program in Development Studies Working Paper.
Deaton, A. 1998. “Getting Prices Right: What Should be Done?” Journal of Economic Perspectives 12 (1): 37–46.
Deaton, A., and O. Dupriez. 2011. “Spatial Price Differences Within Large Countries.” Mimeo, Princeton University.
Diewert, E. 2005. “Weighted Country Product Dummy Variable Regressions and Index Number Formulae.” Review of Income and Wealth 51 (4): 561–70.
Dumagan, J., and T. Mount. 1997.
“Re-examining the Cost-of-Living Index and the Biases of Price Indices.” Department of Commerce Working Paper ESA/OPD, 97–95.
Filho, I., and M. Chamon. 2012. “The Myth of Post-reform Income Stagnation: Evidence from Brazil and Mexico.” Journal of Development Economics 97 (2): 368–86.
Gibson, J. 2013. “The Crisis in Food Price Data.” Global Food Security 2 (2): 97–103.
Gibson, J., K. Beegle, J. de Weerdt, and J. Friedman. 2015. “What Does Variation in Survey Design Reveal About the Nature of Measurement Errors in Household Consumption?” Oxford Bulletin of Economics and Statistics 77 (3): 466–74.
Gibson, J., and B. Kim. 2015. “Hicksian Separability Does Not Hold Over Space: Implications for the Design of Household Surveys and Price Questionnaires.” Journal of Development Economics 114 (1): 34–40.
Gibson, J., and G. Scobie. 2010. “Using Engel Curves to Estimate CPI Bias in a Small, Open, Inflation-targeting Economy.” Applied Financial Economics 20 (17): 1327–35.
Gibson, J., S. Stillman, and T. Le. 2008. “CPI Bias and Real Living Standards in Russia During the Transition.” Journal of Development Economics 87 (1): 140–60.
Gluschenko, K. 2006. “Biases in Cross-Space Comparisons Through Cross-Time Price Indexes: The Case of Russia.” BOFIT Discussion Paper No. 9.
Gong, H., and X. Meng. 2008. “Regional Price Differences in Urban China 1986–2001: Estimation and Implication.” Discussion Paper No. 3621, Institute for the Study of Labor (IZA), Bonn.
Hamilton, B. 2001. “Using Engel’s Law to Estimate CPI Bias.” American Economic Review 91 (3): 619–30.
Higa, K. 2013. “Estimating Upward Bias in the Japanese CPI Using Engel’s Law.” Global COE Hi-Stat Discussion Paper Series No. 295, Hitotsubashi University.
Hill, R. 2004. “Constructing Price Indexes Across Space and Time: The Case of the European Union.” American Economic Review 94 (5): 1379–409.
Jerven, M.
2014. “Benefits and Costs of the Data for Development Targets for the Post-2015 Development Agenda.” Working Paper, Copenhagen Consensus Center.
Larsen, E. R. 2007. “Does the CPI Mirror the Cost of Living? Engel’s Law Suggests Not in Norway.” Scandinavian Journal of Economics 109 (1): 177–95.
Ley, E. 2005. “Whose Inflation? A Characterization of the CPI Plutocratic Gap.” Oxford Economic Papers 57 (3): 634–46.
Logan, T. 2009. “Are Engel Curve Estimates of CPI Bias Biased?” Historical Methods 42 (3): 97–110.
Majumder, A., R. Ray, and K. Sinha. 2012. “Calculating Rural-Urban Food Price Differentials from Unit Values in Household Expenditure Surveys: A Comparison with Existing Methods and A New Procedure.” American Journal of Agricultural Economics 94 (5): 1218–35.
Morduch, J. 1998. “Poverty, Economic Growth, and Average Exit Time.” Economics Letters 59 (3): 385–90.
Muller, C. 2002. “Prices and Living Standards: Evidence from Rwanda.” Journal of Development Economics 68 (1): 187–203.
Nakamura, E., J. Steinsson, and M. Liu. 2015. “Are Chinese Growth and Inflation Too Smooth? Evidence from Engel Curves.” American Economic Journal: Macroeconomics (forthcoming).
Olivia, S., and J. Gibson. 2013. “Using Engel curves to Measure CPI Bias for Indonesia.” Bulletin of Indonesian Economic Studies 49 (1): 85–101.
Oulton, N. 2012. “How to Measure Living Standards and Productivity.” Review of Income and Wealth 58 (3): 424–56.
Rao, P. 2004. “The Country-Product-Dummy Method: A Stochastic Approach to the Computation of Purchasing Power Parities in the ICP.” University of Queensland, Australia.
Rao, P. 2005. “On the Equivalence of Weighted Country-Product-Dummy (CPD) Method and the Rao-System for Multilateral Price Comparisons.” Review of Income and Wealth 51 (4): 571–80.
Ravallion, M., and D. Van De Walle. 1991. “Urban-rural Cost-of-Living Differentials in a Developing Economy.” Journal of Urban Economics 29 (1): 113–27.
Selvanathan, E. 1991.
“Standard Errors for Laspeyres and Paasche Index Numbers.” Economics Letters 35 (1): 35–38.
Summers, R. 1973. “International Price Comparisons Based upon Incomplete Data.” Review of Income and Wealth 19 (1): 1–16.
Van Veelen, M., and R. van der Weide. 2008. “A Note on Different Approaches to Index Number Theory.” American Economic Review 98 (4): 1722–30.
World Bank. 2012. Well Begun, Not Yet Done: Vietnam’s Remarkable Progress on Poverty Reduction and the Emerging Challenges. Hanoi: World Bank.

The World Bank Economic Review, 31(2), 2017, 531–552
doi: 10.1093/wber/lhv064
Advance Access Publication Date: January 6, 2016
Article
Downloaded from https://academic.oup.com/wber/article-abstract/31/2/531/2897782 by Joint Bank-Fund Library user on 08 August 2019

Willing but Unable? Short-term Experimental Evidence on Parent Empowerment and School Quality

Elizabeth Beasley and Elise Huillery

Abstract

Giving power over school management and spending decisions to communities has been a favored strategy to increase school quality, but its effectiveness may depend on local capacity. Grants are one form of such a transfer of power. Short-term responses to a grant to school committees in Niger show that parents increased participation and responsibility, but these efforts did not improve quality on average. Enrollment in the lowest grades increased and school resources improved, but teacher absenteeism increased, and there was no measured impact on test scores. An analysis of heterogeneous impacts and spending decisions provides additional insight into these dynamics. Overall, the findings suggest that programs based on parent participation should take levels of community capacity into account: even when communities are willing to work to improve their schools, they may not be able to do so.
The short-term nature of the experiment reduces the extent to which the results can be generalized.

JEL classification: O15, C93, I21

Elizabeth Beasley is the coordinator of the CEPREMAP Well-Being Observatory; Elise Huillery (corresponding author) is an assistant professor in the Sciences Po Department of Economics and a J-PAL affiliate; her email address is elise.huillery@sciencespo.fr. Cornelia Jesse led the implementation of this project and contributed substantially to its design, and we are deeply indebted to her for her leadership. We thank Yann Algan, Bruno Crépon, Esther Duflo, Pascaline Dupas, Pierre de Galbert, Emeric Henry, Cornelia Jesse, Florian Maynéris, Miguel Urquiola, and the seminar participants at J-PAL Europe, Sciences Po, Columbia University, Oxford University, UCL, Paris I, and the Journées d’Economie Publique Louis-André Gérard-Varet for helpful comments and discussions. We also thank several anonymous referees for thoughtful and detailed suggestions. We thank Adama Ouedraogo for his support throughout the project and are grateful to Pierre de Galbert for excellent project management and Gabriel Lawin for data collection and management and also Elizabeth Linos, Andrea Lepine, and Hadrien Lanvin for research assistance. We thank the Government of Niger and the staff of the Ministry of Education for their collaboration, in particular Amadou Tchambou, Yacouba Djibo Abdou, Salou Moussa, and Damana Issaka. Mathieu Brossard was central in the initial conception and design of the project. Much of the project was carried out when Elizabeth Beasley was at J-PAL Europe and she thanks J-PAL for their support. Finally, and most importantly, we gratefully acknowledge the parents, staff, and pupils of the schools for the time and information they shared with us. This work was supported by the World Bank and the donor partners of the Education for All Fast Track Initiative through the Education Program Development Fund. All errors remain our own, and the opinions expressed in this paper are ours alone and should not be attributed to the institutions with which we are affiliated, the World Bank, or the Government of Niger. A supplemental appendix to this article is available at https://academic.oup.com/wber.

© The Author 2016. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

The dramatic expansion of access to schools in the last two decades is the result of an unprecedented effort to increase education in poor countries. However, the quality of education is often low. One common strategy to improve quality is through improved management and oversight and in particular by increasing involvement of parents and the community (World Bank 2003). Community-based management policies have been widely adopted throughout the world over the past decade (see Barrera-Osorio et al. 2009 for an overview).1

Grants to school committees, that is, putting money under the control of parents, are one potential way to increase school quality directly, by increasing school resources, and indirectly, by spurring parent participation. For this to work, parents must have the time, energy, and capacity to participate in school management effectively. Given the heavy investment in such programs, it is important to understand whether, and under which circumstances, they can actually work.

This paper provides evidence from a field experiment on the short-term impact of a program to encourage parent participation in school management through grants to school committees in a context of low parent authority and capacity.
In Niger, levels of education among adults are extremely low: 70% of Nigeriens aged 15–44 in 2010 had no education,2 and the system for education is very hierarchical and centralized. In a pilot program to improve school quality, the Ministry of Education of Niger, in partnership with the World Bank, gave grants to school committees that had been trained in school management with the aim of increasing parent involvement. A randomized evaluation was incorporated into the pilot project to provide information for scale-up. Detailed data from one thousand schools (split into five hundred treatment and five hundred control schools) were collected to assess the impact of the grant on parent empowerment, school management, and school quality. An important limitation of the study is that it provides only short-term evidence on behavioral responses: the first grant arrived in late 2007 and was meant to continue for several years, but the evaluation ended in 2009. The survey was administered during April and May of 2008, and administrative data were collected at the beginning of the 2008–2009 school year. This paper thus documents the short-term dynamics of an anticipated long-term program.

On average, parents were willing to increase their participation in school management, but educational quality did not improve in a meaningful way as a result of this participation. There is an overall positive impact of the grant program on parents’ involvement and responsibility: communities with the grant participated more and took on more responsibilities than those without the grant, although the average community did not engage in supervising teacher presence. Parents did not reduce their own contributions in response to the grant.
The impact on school management is mixed: cooperation between school stakeholders improved, but overall accountability did not change. Spending shows both expected and unexpected changes: there was more spending on infrastructure but also on school festivals, playground equipment, and, most unexpectedly, agricultural projects, which were probably noneducational but intended to make a profit.

Finally, school quality did not improve with these changes, at least in the short term. There were subsequent improvements in infrastructure and health resources, as well as an increase in participation at the lowest grades (fewer dropouts in 2007/2008 and increased enrollment in grade two in 2008/2009), but there is no evidence of a change in test scores. We cannot exclude the possibility of a downward bias in the estimate of test score impact due to differential dropouts, but the lack of change in test scores at levels that had no participation changes supports the finding of no impact on test scores. Teachers decreased their effort in response to the grant, which can be attributed to the fact that some teachers have a preference for a centralized government and might be reluctant to collaborate with parents, especially when parents do not spend the money on projects that make the teacher’s life easier.

1 School-based management programs have been implemented in Argentina, Australia, Bangladesh, Canada, Guatemala, Honduras, Hong Kong, India, Lebanon, Lesotho, Macedonia, Madagascar, Mexico, Nicaragua, the Philippines, Senegal, Serbia, Sri Lanka, the Gambia, the United Kingdom, and the United States (Duflo et al. 2015).
2 World Development Indicators, World Bank, source: International Institute for Applied Systems Analysis (IIASA).

The paper then examines heterogeneous effects along several different dimensions and highlights three interesting patterns.
First, in situations where the school committee is educated or has experience in another community organization (both of which we take as proxies for real authority), parents increased monitoring of teacher attendance in response to the grant, though this did not mitigate the negative effect of the grants on teacher attendance. Second, in small (one-teacher) schools, school committees spent on items that benefited the teacher, and teacher attendance increased in response to the grant in these schools. These results together suggest that teachers’ responses to parent participation depend on whether parents are acting in opposition to, or alliance with, the teachers. Third, rural schools used some of the grant to invest in agricultural opportunities, while urban schools did not but invested in school infrastructure instead.

This study is related to two strands of the economics literature: parent participation and school resources. Previous evidence on the effectiveness of programs to increase quality via increased parent participation is mixed. Banerjee et al. (2010) report that providing information to parents about the school committee and training the community to measure educational performance in India had no impact on the activity of school committees and, therefore, no impact on education outcomes. Duflo et al. (2015) find that training to empower parents helped mitigate the negative response of regular teachers to the addition of a contract teacher. In Madagascar, Lassibille et al. (2010) found that facilitating community/school interactions, combined with streamlining management practices, had positive impacts on attendance and learning.
Other studies have supplied evidence that empowering the community to manage schools improves school quality, though these papers generally do not include random variation in treatment assignment, and so the identification is weaker. Bryk et al. (1998) and Hess (1999) have argued that student achievement improved in Chicago after the implementation of reform involving the community in school management, and Di Gropello (2006) reviews four school-based management programs in Latin America and concludes that school-based management models have led generally to greater community empowerment and teacher effort. Participation in school management may also be linked to social capital more generally: Sawada and Ishii (2012) employ matching and instrumental variable approaches to measure the impact of the COGES program itself in neighboring Burkina Faso and find increases in social capital measured using several different tools, including field experiments.

Another group of studies points to heterogeneity in the performance of participatory programs, and in the effect of decentralization more generally. Blimpo et al. (2015) find that training school committees had no impact on learning except in schools where school committee members were educated. Pradhan et al. (2014) find that an intervention to empower parents was effective only when combined with an intervention fostering the ties between the school committee and a local governing body. Decentralization of secondary school management in Argentina led to higher test scores in provinces with higher managerial capacity and lower test scores in provinces with lower managerial capacity (Galiani and Schargrodsky 2002). Galiani and Perez-Truglia (2013) review the empirical literature on the effect of school decentralization on educational outcomes and find that the better-off communities tend to profit more from decentralization than poor communities. Using panel estimation on PISA data, Hanushek et al.
(2013) estimate that increasing school autonomy is associated with lower student performance in countries with generally lower performance and higher student performance in countries with generally higher performance. While the context of rural Niger is likely to be substantially different from these contexts, there is good reason to anticipate that there may be heterogeneous impacts of parent participation.

Previous studies on increasing school resources have found that doing so may crowd out the contributions of other actors. For example, parents in Romania decreased time spent on homework when their child was admitted to a better school (Pop-Eleches and Urquiola 2013). In Zambia and India, households decreased spending for education when they anticipated an increase in school funding (Das et al. 2013). In Kenya, civil-servant teachers decreased presence at school when the school committee hired an extra teacher (Duflo et al. 2015).

This paper contributes specifically to the literature on heterogeneity by showing that authority and capacity are important prerequisites for parents to undertake the more difficult aspects of management and that cooperation between parents and teachers (rather than confrontation) may be key. An overall message is that parents will not always or even generally make optimal spending and management decisions to increase quality. It may be costly and time-consuming, parents may not have good information about how schools work and thus may not make optimal decisions, and it may be very difficult to put pressure on teachers to improve service quality.
It may be particularly difficult since capacity depends on parent power vis-à-vis teachers, or “real authority” in the terms of Aghion and Tirole (1997), who underscore the fact that formal authority (the right to make decisions) need not imply real authority (effective control over decisions).³

A major limitation of the paper is the short-term nature of the findings. Long-term follow-up was impossible, so different results might have emerged after one or two more years. However, the results presented here are still useful: first, they give evidence about the barriers that communities may face at the beginning of participatory programs, and second, the richness of the data on spending decisions, contributions, involvement and responsibility, and the link to community characteristics gives some insight into the mechanisms at work within communities when making school management decisions.

The remainder of the paper is organized as follows. Section I presents some background information on education in Niger and describes the school grant experiment. Section II presents the data and estimation strategy, and section III the empirical results. Section IV concludes.

I. Background and Experimental Design

The grant program sought to empower parent school committees in a context where parents traditionally had very little control over their children’s schooling and where overall levels of learning were quite low. The experimental design was incorporated to give information on program effectiveness prior to an intended scale-up.

Background on Education in Niger

Niger made remarkable progress in access to education in the decade prior to this evaluation: the number of children enrolled in primary school had more than doubled from 656,000 in 2000 to 1,554,102 in 2008, and net enrollment had risen from 27% to 49% in the same period.
However, only 44% of children who began primary school finished all grades, and only 43% of the sixth graders who took the national exam at the end of primary school passed it.⁴

The education system in Niger has traditionally been fairly hierarchical and rigid. Inherited from French colonization, the system replicates the French education system: highly centralized, with little, if any, room for local community participation. Unlike other systems, where the school might be supervised by a local governmental body, at the time of the evaluation there was generally no way for the local community to determine school policy or practice. Schools depended entirely on the hierarchical chain that originated in the Ministry of Education (except for some local fundraising, but these efforts were undertaken only when needs were not provided for by the Ministry).

³ Policies of de jure autonomy do not always lead to de facto autonomy (King and Ozler 2005), and so participation may not be meaningful if communities have no actual power and may even increase inequality by “leaving the poor behind” (Galiani et al. 2008).

⁴ The situation has continued to improve in terms of access to education: in 2011, net enrollment in primary school was 62%, and primary completion rates had risen to 46%.

In 2006 the Ministry of Education in Niger introduced school committees in all public primary schools in order to improve quality.
These school committees (called the COGES) were designed to involve parents and community members in the school, improve accountability, improve management, and thus enhance access to and quality of education.⁵ As discussed in the introduction, the establishment of local community groups for the purpose of improving public service provision via community participation is a strategy that many country governments and civil society organizations advocate. In many respects, the circumstances of Niger make a strong case for school-based management: low population density, vast distances, and limited transportation, information, and communications infrastructure make supervision of primary schools by the central government (or its regional structures) very costly, and the timely transmission of information to and from the central authorities for planning purposes is challenging.

In the districts where this program was carried out, the COGES were trained by several different organizations in financial management, governance (elections), and project planning. In 2006, many of the newly created and trained school committees were not actively engaged in school matters, nor did they develop a school improvement plan for the year. To spur school committee involvement and activity, the Ministry of Education introduced school grants to give the committees an incentive to meet, plan, and undertake activities. The grants were expected to improve school management through increased parental participation and accountability, to improve school infrastructure and the quality of education, and to potentially increase enrollment rates and learning. The pilot project was carried out as a randomized evaluation in order to provide reliable information on impact prior to national scale-up.
The Ministry selected the regions of Zinder and Tahoua because the COGES there were already functional and had received basic training on planning and financial management, whereas COGES in the other six regions of Niger had not been trained yet. However, the context of these two regions is specific, even relative to the rest of Niger. The Zinder region is culturally similar to Northern Nigeria, with a relatively conservative Muslim population that has lower rates of formal schooling. On the other hand, the Tahoua region is a nomadic region where formal education poses a challenge because the nomadic population (the Tuareg and the Fulani) may often rely on children for herding. In both cases, one may expect parents to adhere less to formal schooling than in other regions in Sub-Saharan Africa.

Experimental Design

The evaluation design included 1,000 schools in Tahoua and Zinder, randomly selected out of the 2,609 total public primary schools in those districts. Once these 1,000 schools were determined to be representative of the total pool of schools in those districts, half were randomly assigned to receive the grants and became the treatment group. The other 500 schools served as a control group. Both randomizations were stratified on inspection (a geographical administrative unit), existing support for the school committee (e.g., existing programs or sponsorship by NGOs), and whether the school was indicated as being in a rural or urban area in administrative data. Strata were constructed by grouping the schools into inspections, then within each inspection into whether or not the school had existing support, and then within each of those groups, whether the school was in a rural or urban area. This gave fifty strata. Schools were assigned a random number between zero and one, and within each stratum they were sorted by this random number, with the first half being assigned to treatment and the second to control. Data from the Administrative School Census in 2005–2006 (the school census is described below) were used to confirm balance between control and treatment schools along various observable characteristics (data from 2006–2007 were not yet available at the time of sampling in August 2007). The balance checks for the randomization and p-values for the test of equality of means across control and treatment are presented in table 1, and show no statistically significant differences.

⁵ These school committees consist of six representatives, including the school director, who serves as secretary, and parent representatives. The parents are supposed to elect the representatives, who may also be the leaders of the Parent Association (APE), which includes all parents, and the Mother’s Association (AME), which includes all mothers. In practice, the composition of the COGES varies by school. School committees are supposed to be responsible for the management of people working at the school (e.g., monitoring of teacher attendance and performance), financial resources (e.g., school meal funds), and material resources (e.g., purchase and management of textbooks and supplies). One of the school committee’s central tasks is to draft an annual school improvement plan that includes its projects, activities, budget, and timelines to guide its work for the school year. The school committee works in parallel with the APE and AME. Additional details and background are given in appendix S1 (available at https://academic.oup.com/wber).

Table 1.
Balance of Pre-Program School Characteristics

| Variable | (1) Control N | (2) Control mean | (3) Treatment N | (4) Treatment mean | (5) Difference in means (C-T) | (6) p-value of difference in means |
|---|---|---|---|---|---|---|
| Pupil characteristics | | | | | | |
| Enrollment 07/08 | 500 | 149.6 | 500 | 141.72 | -7.88 | 0.28 |
| % Girls in 07/08 | 500 | 0.38 | 500 | 0.38 | -0.01 | 0.26 |
| % Passed exam in 07/08 | 262 | 0.45 | 224 | 0.42 | -0.03 | 0.28 |
| Teacher characteristics | | | | | | |
| Number of teachers | 490 | 3.87 | 494 | 3.55 | -0.32 | 0.13 |
| % of teachers civil servants | 490 | 0.2 | 494 | 0.2 | 0 | 0.91 |
| Physical infrastructure | | | | | | |
| Number of buildings in 07/08 | 490 | 3.91 | 494 | 3.68 | -0.23 | 0.17 |
| Number of latrines in 07/06 | 500 | 0.89 | 500 | 0.82 | -0.08 | 0.55 |
| Water Access in 06/07 | 500 | 0.09 | 500 | 0.11 | 0.01 | 0.53 |
| Electricity in 06/07 | 500 | 0.01 | 500 | 0.02 | 0.01 | 0.22 |
| COGES characteristics | | | | | | |
| COGES sponsored in 07/08 | 500 | 0.57 | 500 | 0.55 | -0.01 | 0.70 |
| COGES exists in 06/07 | 500 | 0.88 | 500 | 0.9 | 0.02 | 0.32 |
| Location | | | | | | |
| Tahoua | 500 | 0.52 | 500 | 0.51 | -0.01 | 0.85 |
| Distance to inspection | 500 | 41.1 | 500 | 38.59 | -2.5 | 0.17 |
| Distance to health center | 476 | 8.24 | 461 | 8.95 | 0.7 | 0.61 |

Source: Ministry of Education Administrative Data. The data from 07/08 are reported in November (prior to the intervention) and are used when available; otherwise data from 06/07 are used. “Sponsored” COGES are those that have some sort of official sponsor or support group (such as an NGO).

The original project plan called for recurrent grants to schools for three consecutive school years, to be distributed at the beginning of each school year to support COGES activities. The Ministry of Education and the Ministry of Finance jointly worked out the grant transfer mechanism, consisting of a direct release of funds from the national treasury into the accounts of the two regional education authorities (i.e., one hierarchical level down from the national government). The funds were then transferred to the inspection level and then to the COGES. The transfers from the regional authorities and below were made in cash and recorded using signed receipts, which were submitted to the Ministry of Finance.
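The within-stratum assignment described under Experimental Design, and the kind of comparison reported in table 1, can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the stratum labels, the rule for odd-sized strata, and the large-sample z-test are all hypothetical choices, since the paper does not report its exact procedure or test statistic.

```python
import math
import random
from collections import defaultdict
from statistics import mean, variance


def assign_within_strata(schools, seed=0):
    """Map each school id to 1 (treatment) or 0 (control).

    `schools` is a list of (school_id, stratum) pairs. The stratum label is
    hypothetical here, e.g. a tuple (inspection, has_support, is_urban).
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for school_id, stratum in schools:
        # Each school draws a random number between zero and one.
        strata[stratum].append((rng.random(), school_id))
    assignment = {}
    for members in strata.values():
        members.sort()  # order schools within the stratum by their draw
        # Assumption: with an odd stratum size the extra school goes to control.
        half = len(members) // 2
        for rank, (_, school_id) in enumerate(members):
            assignment[school_id] = 1 if rank < half else 0
    return assignment


def balance_check(control, treatment):
    """Treatment-minus-control difference in means and a two-sided p-value.

    Assumption: unequal-variance z-test with a normal approximation, which is
    reasonable only for large groups; the paper does not state its exact test.
    """
    diff = mean(treatment) - mean(control)
    se = math.sqrt(variance(control) / len(control)
                   + variance(treatment) / len(treatment))
    p_value = math.erfc(abs(diff / se) / math.sqrt(2))
    return diff, p_value


# Synthetic example: 1,000 schools spread evenly over 10 made-up strata.
schools = [(i, (i % 5, i % 2)) for i in range(1000)]
assignment = assign_within_strata(schools)
print(sum(assignment.values()))  # number of schools assigned to treatment
```

Sorting on an independent uniform draw inside each stratum guarantees that treatment and control are balanced on the stratifying variables by construction, so a check like `balance_check` is informative mainly for characteristics that were not used to build the strata.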
In the first year, rather than receiving the grants at the beginning of the year as planned, the five hundred COGES received the grants during December 2007 and January 2008, with the school year already in full swing, due to logistical difficulties with the transfer. The grants were not immediately distributed during the 2008–2009 school year, due to problems with the transfer mechanism.⁶ Due to these issues and political disruptions in 2009, the evaluation was terminated after only one year. As a consequence, this evaluation covers only one year of the grant (the 2008/2009 grant was eventually distributed to some schools, after the evaluation had ended).

⁶ The regional authorities were unable to obtain the actual funds from the local treasury due to a liquidity issue at the local treasury level.

The size of the grant was based on the size of the school (the number of classrooms), and the average was 209 USD per school, or 1.83 USD per student. The grant was a relatively modest amount that was determined by considerations of financial sustainability in view of a potential extension of the program by the government. For the purposes of comparison, the control schools raised a little over 0.60 USD per year per student from the parents on average and had an overall budget of around 199 USD including donations from private NGOs, and so the grant is relatively large compared to the usual fundraising and about equivalent to the annual amount of money available for school projects (note that in principle most school inputs such as teachers and books were provided in kind by the central government and so not included in this 199 USD; if they were, the grant would be smaller than the overall operating budget of the schools).
For an idea of the practical scale, the amount of the grant was not, except in the very largest schools, sufficient to build an additional classroom. This grant amount is smaller than grants provided to school committees in most other evaluations: Blimpo et al. (2015) use a grant of 500 USD per school in Gambia; Gertler, Patrinos, and Rodríguez-Oreggia (2010) use grants of 500 USD to 700 USD per school in Mexico; and Pradhan et al. (2014) evaluate a grant of 326 USD (to be followed with another grant of 544 USD) per school in Indonesia.

About a month before the grant arrived, all five hundred treatment schools (and school committees) received a letter informing them of the grant program and its objectives, and the grant amount allocated to their school. The letter also included general guidelines on the use of the grants, but the specific activity to be supported by the grants was to be decided on by the school committee.⁷ One copy of this letter was distributed to the school director and a second copy to the president of the school committee before the arrival of the grants. Compliance was satisfactory: the grants arrived in 498 of the 500 program schools, 492 in the exact amount allocated to them and six in a different amount (see appendix S2 for further details on compliance).

II. Data and Empirical Strategy

Multiple sources contain rich information on potential treatment outcomes and community characteristics that can be used to generate estimates of treatment impact and heterogeneous impact using a simple intent-to-treat (ITT) framework.
Data

Data come from three sources: (i) administrative data on primary schools (the Ministry of Education’s annual school census, also called administrative data), (ii) an evaluation survey administered to school staff and two members of the school committee at treatment and control schools (the 2008 School Survey), and (iii) a financial control survey administered to one member of the school committee in a subset of treatment schools.

The Ministry of Education in Niger administers an annual census of all primary schools, including community schools and madrassas (Koranic schools), which provides data on enrollment, teacher characteristics, school facilities and resources, and community characteristics. This paper uses the 2006/2007, 2007/2008, and 2008/2009 censuses. Each census is collected in the fall of the school year (for example, the 2008/2009 census contains the information reported by the schools in fall of 2008).

⁷ One randomly selected group of schools received a slightly more restrictive list of potential expenditures, and another group received a warning that their projects might be audited. Analysis of spending patterns did not show any difference between these groups.

In addition to the administrative data, the Ministry and the World Bank worked with a local NGO to prepare a detailed school survey (the 2008 School Survey) to be administered to the one thousand schools included in the experiment in April/May 2008, five to six months after grant distribution, to understand the immediate effects of the grant. This questionnaire included information on school infrastructure and resources, pupil enrollment and attendance, school improvement plan, school committee functioning and membership, and school activities.
It also asked detailed questions about the level of education and personal wealth of the school committee members. Three tests were also administered at this time: a math test, a French test, and an oral exam. The oral exam was administered to the youngest pupils (grades one and two). Teachers’ physical presence at that visit was also recorded. The visit took place on a day when the school was supposed to be open but was not announced in advance.

Finally, a financial survey was administered to eighty-five randomly selected treatment schools in January/February 2009, asking detailed questions about the receipt and spending of the grants, any problems with the administration of the grant, and use of the grants (including the existence of a receipt for each expense).

Use of the Grants

The school committees used the grants in a variety of ways. Eighty-five schools were randomly selected for a detailed questionnaire on grant arrival and spending. The most common use was for material inputs such as construction and office supplies, and other uses included investment projects, health and sanitation projects, and transportation. Overall, the largest share of spending of the grant was in construction, representing about a third of the total amount spent (figure 1). Construction activities included building classrooms, but communities also constructed lodging for teachers, latrines, school enclosures, and other buildings. Other projects included electrification or producing copies of exams. About fifteen percent of schools surveyed used at least part of the grant on some sort of agricultural investment project. It is unclear whether the loans or small business projects were profitable.

Figure 1. Reported Use of Grant Money, by Total Amount Spent
Source: Financial Control Questionnaire in eighty-five randomly selected treatment schools.
Outcomes

The analysis uses many different indicators of parent participation to draw general conclusions about the experiment’s impact. In order to simplify interpretation and to guard against cherry-picking of results, it presents results for indices that aggregate information over multiple outcome variables (following Kling et al. 2007). The aggregation also improves statistical power to detect effects that go in the same direction within a domain. The summary index Y is defined to be the equally weighted average of z-scores of its components, with the sign of each measure oriented so that more beneficial outcomes have higher scores. The z-scores are calculated by subtracting the control group mean and dividing by the control group standard deviation. Thus, each component of the index has mean 0 and standard deviation 1 for the control group. The index is the average of the nonmissing components, as long as the school has a valid response to at least two components. If only one component is available (or if no components are available), the school is dropped.

Different types of outcomes are calculated in this way: parent participation, school management, and school quality. For each outcome, several indices are constructed. The details and full list of component variables for each index are given in appendix S3. For parent participation in school, the paper uses indices of parent contributions (e.g., school fees), involvement (e.g., going to meetings), responsibility (e.g., in charge of supplies), and teacher oversight (e.g., monitoring teacher attendance).
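The construction of the summary index described above can be illustrated with a short sketch. The function name, the data layout (one list per component, with `None` for a missing response), and the use of the sample standard deviation are assumptions made for illustration; the paper specifies only the standardization against the control group, the equal weighting, and the two-component minimum.

```python
from statistics import mean, stdev


def summary_index(components, is_control, min_components=2):
    """Equally weighted average of control-standardized z-scores.

    components: dict mapping a component name to a list of values, one per
        school, already signed so that higher is better; None marks a
        missing response.
    is_control: list of booleans, one per school.
    Returns one index value per school, or None when the school has fewer
    than `min_components` valid components and is therefore dropped.
    """
    z_scores = {}
    for name, values in components.items():
        control_values = [v for v, c in zip(values, is_control)
                          if c and v is not None]
        mu = mean(control_values)
        sd = stdev(control_values)  # assumption: sample standard deviation
        z_scores[name] = [None if v is None else (v - mu) / sd
                          for v in values]

    index = []
    for j in range(len(is_control)):
        valid = [z[j] for z in z_scores.values() if z[j] is not None]
        index.append(mean(valid) if len(valid) >= min_components else None)
    return index


# Toy data: four schools, the first two in the control group.
components = {
    "meetings_held": [0, 2, 4, 1],
    "topics_discussed": [1, 3, None, 5],
}
index = summary_index(components, is_control=[True, True, False, False])
print(index)
```

In the toy data the third school has only one valid component, so it receives no index value, which mirrors the rule that schools with fewer than two valid components are dropped.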
School management is measured by two indices, accountability (e.g., keeping records) and cooperation (e.g., reported conflicts), and also by total spending across eight possible spending categories: infrastructure, supplies and textbooks, pupil educational support (e.g., remedial courses), pupil health, teacher support (e.g., housing), COGES expenses (e.g., travel to regional meetings), school festivals and playground, and investments in agriculture. Finally, the effect of the grant on school quality is measured by four indices: infrastructure (e.g., number of desks), materials (e.g., textbooks), health resources (e.g., first aid kit), and teacher effort (e.g., teacher attendance). Data for infrastructure, materials, and health resources come from the 2008/2009 annual administrative database, collected in the fall of 2008, and so reflect changes between eight and ten months after receipt of the grants.

The paper also uses data on dropouts, enrollment, and test scores in order to examine the ultimate objective of increasing pupil participation and learning. Participation in education is measured by the number of dropouts reported by the school to surveyors at the April/May 2008 questionnaire and the change in enrollment from fall 2007 to fall 2008 reported to the Ministry of Education in the annual administrative censuses.

The paper uses two limited measures of actual learning. First, test scores are obtained from a test administered to pupils during the April/May 2008 questionnaire. The test was administered to three grades, ten pupils per grade. The pupils were supposed to be sampled from those who were enrolled at the beginning of the year, but in practice the ten pupils appear to have been selected from the pupils present on that day. The test score data have further quality problems, including identical copies submitted by some grades in some schools.
There is no evidence that the problems are correlated with treatment; they appear instead to be related to insufficient oversight of the examiners, so it is possible that the quality problems only add noise. However, as discussed below, the fact that participation is higher in the treatment schools and test takers were sampled from those present on that day leads to concerns of attrition bias in the test scores (if more children stayed in school in the treatment group, then the impact on test scores may be biased downwards). The results are therefore considered as second-order evidence.

The overall results are nonetheless informative about the general level of education in rural Niger, and some examples are provided here to help give the reader a better idea of the context. In general, after discarding duplicate and suspect observations, pupils got about one third of questions correct. For example, the following questions were asked:

• Grade one: The interviewer asked the pupils to pick up a red crayon and a blue crayon out of a pile containing pieces of chalk of different colors: three white, one red, one blue, one yellow, and one green. 45% of pupils were able to do this.
• Grade four: Pupils were asked to place the following numbers in order, from smallest to largest: 807; 708; 788; 800. 24% of pupils were able to do this.
• Grade six: Pupils were asked to change an adjective from the masculine to the feminine form (Un nouveau maitre ==> Une ____________ maitresse). 29% of pupils were able to do this.

The second measure of learning comes from the annual administrative censuses, which report the number of candidates for the national end-of-primary school exam and the number who passed.
Results for the end of the 2007/08 school year were reported on the 2008/09 census.⁸ On average, slightly over half of the schools presented at least one student for the end of sixth grade test (recall that most schools do not have all grades).

Interaction Variables

The sample size was chosen to be large enough to allow testing for heterogeneous treatment effects by community characteristics, and this was one of the initial objectives of the study.⁹ The dimensions chosen for measurement of heterogeneous effects are those that are likely to affect parent response to the grant or that have policy relevance: education, experience in other organizations, wealth of the COGES, whether the school is in an urban or rural area, and whether it is a one-teacher school. Descriptive statistics and balance information for the interaction variables are given in table 2.

Table 2. Community Characteristics Used for Heterogeneous Treatment Effect Analysis

| Variable | Control obs. | Treatment obs. | p-value of difference in attrition | Control mean | Treatment mean | Difference in means (C-T) | p-value of difference in means |
|---|---|---|---|---|---|---|---|
| Educated COGES member | 369 | 370 | 0.94 | 0.317 | 0.305 | 0.012 | 0.73 |
| Experienced COGES member | 369 | 370 | 0.94 | 0.209 | 0.227 | -0.018 | 0.55 |
| Average wealth of COGES (PCA) | 360 | 358 | 0.89 | -0.586 | -0.674 | 0.088 | 0.42 |
| One-teacher school | 499 | 497 | 0.32 | 0.122 | 0.145 | -0.023 | 0.29 |
| Urban school | 500 | 500 | | 0.108 | 0.110 | -0.002 | 0.92 |

Sources: Ministry of Education Administrative Data and 2008 School Survey. Observations at school level. Educated COGES member = 1 if at least one member of the COGES completed primary school. Experienced COGES member = 1 if at least one member is also a member of another community organization. Average wealth is negative because the PCA was carried out with the sample including teachers, who tend to be richer than the parents.
The p-value of the difference in attrition is calculated by creating a dummy variable equal to 1 if the data are missing for a particular school and then calculating the p-value of the difference in this variable between groups.

Our intuition is that COGES with higher levels of education and experience in other organizations are likely to have higher capacity to manage schools. To make sure that these dimensions are not merely proxying for wealth, wealth is also included as an interaction term (and it is possible that wealthy communities might react differently, either because they have more real authority or because they can leverage a larger supplemental contribution from the community). The distinction between urban and rural schools is important for education planners in general, and it is also important to ensure that the other interaction terms are not just proxies for the urban-rural divide. Finally, one-teacher schools present a unique situation in terms of the power dynamics between the teachers and the parents, and these very small schools are also of relevance to education planners. Further details on the construction of these interaction terms are given in appendix S3.

⁸ Schools choose which of their sixth grade students would sit for the exam. There is no evidence that schools were penalized in any way for a low pass rate.

⁹ The analysis plan was not registered in a secure independent register in 2007 when the experiment was designed, as is best practice today.

Attrition

There is some attrition in the datasets. Each year, a handful of schools do not return the administrative data questionnaire, or the questionnaires are improperly filled out, leading to missing data for 3% of the schools for the infrastructure index and 1.4% of the schools for 2008/09 enrollment.
The April/May 2008 survey was conducted on the basis of unannounced visits, which meant that many schools were found closed. In addition, some schools were not visited due to security concerns, and still others closed early that year because the summer rainy season began early and many children went to the fields with their parents to work. As a result, data from the evaluation questionnaire are available for only 814 of the 1,000 schools.

Differences in the proportion of schools with missing outcome variables are tested by treatment group as a whole and subdivided by district, urban and rural, and whether the school had external support (for example, NGO sponsorship) prior to the project. Results are reported in appendix S2, table A1. Eighty-four tests on treatment and interaction between treatment and subgroups yield one statistically significant difference (at the 10% level or higher), which is well within the amount that would be expected with random attrition. The comparability between treatment and control groups is thus intact. As to external validity, there are more schools missing in the region where security was a concern (Tahoua, in the north).

Empirical Strategy

The estimations present intent-to-treat effects as measured by the differences in the means of school outcomes between schools initially assigned to the treatment group and schools initially assigned to the control group. Let $T$ be an indicator for treatment group assignment and let $X$ be a vector of covariates. Estimation of the intent-to-treat effect $\beta$ is from the following equation:

$$Y_j = \beta T_j + X_j \gamma + \epsilon_j \qquad (1)$$

where $Y_j$ is the outcome of school $j$. The covariates $X_j$ are included to improve estimation precision and include whether the school is urban, the total proportion of girls in 2007/08, the total enrollment in 2007/08, whether the school was supported by an outside NGO in 2006/07, and the inspection (a geographic/administrative unit).
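As a rough illustration of how equation (1) might be estimated, the sketch below fits the regression by ordinary least squares and computes heteroskedasticity-robust standard errors. Several choices here are assumptions rather than details taken from the paper: the use of OLS, the HC1 small-sample correction, the function names, and the toy data. It is written in plain Python so that every step of the sandwich formula is visible.

```python
def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    k = len(m)
    aug = [list(map(float, row)) + [float(i == j) for j in range(k)]
           for i, row in enumerate(m)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def matmul(a, b):
    """Multiply two matrices given as sequences of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def itt_ols(y, X):
    """Return (coefficients, HC1 robust standard errors) for y = X b + e.

    X is a list of rows; the caller supplies the constant column. Placing
    the treatment dummy in column 1 is a convention of this sketch.
    """
    n, k = len(X), len(X[0])
    Xt = list(zip(*X))  # columns of X
    bread = invert(matmul(Xt, X))
    xty = [sum(c * yi for c, yi in zip(col, y)) for col in Xt]
    coef = [sum(b * v for b, v in zip(row, xty)) for row in bread]
    resid = [yi - sum(c * x for c, x in zip(coef, row))
             for row, yi in zip(X, y)]
    # Sandwich "meat": sum over schools of e_j^2 * x_j x_j'
    meat = [[sum(e * e * row[a] * row[b] for row, e in zip(X, resid))
             for b in range(k)] for a in range(k)]
    cov = matmul(matmul(bread, meat), bread)
    se = [(n / (n - k) * cov[a][a]) ** 0.5 for a in range(k)]
    return coef, se


# Toy data: three control and three treatment schools, no covariates.
outcome = [1.0, 2.0, 3.0, 2.0, 4.0, 6.0]
treated = [0, 0, 0, 1, 1, 1]
design = [[1.0, float(t)] for t in treated]
coef, se = itt_ols(outcome, design)
print(coef[1], se[1])  # ITT estimate and its robust standard error
```

With only a constant and the treatment dummy, the coefficient on treatment is simply the difference between the treatment and control means; adding the covariates in $X_j$ as further columns leaves the interpretation unchanged but can tighten the standard error.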
All regressions use robust standard errors.¹⁰ The absolute magnitudes are in units of the outcome’s standard deviation (based on the control group), so the estimate shows the treatment effect in terms of standard deviations.

Heterogeneous Treatment Effects Along Community Characteristics

In the second step, intent-to-treat effects are estimated with an interaction term to determine whether the average treatment effect on parent and teacher behavior varies with community characteristics, using the following regression specification:

$$Y_j = \beta T_j + \theta (C_j \times T_j) + \sigma C_j + X_j \gamma + \epsilon_j \qquad (2)$$

where $C_j$ denotes a given community characteristic. In this case $\theta$ is the additional (or reduced) impact for schools with characteristic $C_j$. We include an indicator for urban schools and the interaction of this indicator with the treatment assignment for each characteristic whose correlation with being located in an urban area is above 0.1, to disentangle the effect of this characteristic from the effect of being located in an urban area.

¹⁰ An alternative specification uses dummy variables for the strata used in random selection, which were defined using a dummy variable for urban, the total enrollment in 2005/06, and support by an outside NGO in 2005/06. This specification does not substantially change the results, but increases precision of some coefficient estimates and decreases precision of others.

III. Results

On average, parents did not reduce their own contributions in response to the grant and increased their involvement in and responsibility over school management, although they did not go so far as to enforce rules on teacher attendance. At the same time, school committees increased investment in infrastructure (buildings and the school enclosure) and school festivals and invested in agricultural projects.
Accountability did not change, but reported cooperation with a number of school stakeholders improved as a result of the grant. These effects did not, however, create a path to school quality improvement. While infrastructure and health resources improved and pupil participation increased a bit among the youngest, teacher attendance declined on average, perhaps because of resentment over parent empowerment, and no impact is found on test scores. Particular impacts on the detailed components of each index are given in appendix S4. Appendix S5 provides a model that explains the results of this paper and the existing results in the literature.

Parent Participation

The grants did not change parent contributions to schools (table 3, column 1). The contribution index mean of the treatment group is statistically and economically similar to the mean of the control group. The analysis of the component variables (funds collected per pupil, in-kind donations, and official fees charged) shows that neither financial nor in-kind contributions were affected by the grant (table A2). This result contrasts with previous studies showing that parents decreased their contributions in response to an increase in school resources (Das et al. 2013; Pop-Eleches and Urquiola 2013).¹¹

Table 3.
Grant Impact on Participation, Management, and Quality Indices (1) (2) (3) (4) (5) Parent Parent Parent Teacher Accountability contribution involvement responsibility oversight index index index index index Treatment À0.0117 0.0600* 0.0586* 0.0266 0.0127 (0.0490) (0.0321) (0.0353) (0.0389) (0.0351) Constant À0.141 À0.00756 À0.0889 0.335** À0.219* (0.167) (0.117) (0.129) (0.159) (0.124) Observations 782 922 780 778 806 R-squared 0.056 0.059 0.051 0.110 0.124 Control group mean À0.00709 À0.0355 À0.0191 0.00229 0.00325 (6) (7) (8) (9) (10) Cooperation Infrastructure Materials Health Teacher Index Index Index Index Effort Index Treatment 0.0661** 0.0414* À0.0439 0.0469* À0.0237 (0.0306) (0.0236) (0.0350) (0.0270) (0.0435) Constant À0.220** À0.454*** À0.402** À0.396*** 0.484*** (0.103) (0.0936) (0.171) (0.114) 0.158) Observations 777 978 826 933 784 R-squared 0.078 0.164 0.174 0.238 0.213 Control group mean À0.00756 À2.98e-09 À0.00411 1.26e-08 À0.00712 Sources: Ministry of Education Administrative Data and 2008 School Survey. Robust standard errors in parentheses. ***p < 0.01, **p < 0.05, *p < 0.10. Regressions control for whether the school is in a rural or urban area, total enrollment in 07/08, proportion of girls in 07/08, whether the school had NGO support prior to the grant, and inspection fixed effects. Details on the component variables and the impact of treatment on each component variable for each index are given in the appendix S3. 11 An alternative interpretation would be that this result derives from the fact that we measure only the first year of the grant, and so parents did not have time to change their own contribution of inputs (see Das et al. 2013, where crowd- ing out was greater when a school grant was anticipated than when it was unanticipated). This is unlikely since the parents were notified in advance of the grants arrival. 
Note that in general the amount of cash income available to schools is obtained through parental contributions. An important consequence is that the grant increased the cash on hand for schools and thus the possibility for investment.

The parent involvement index increased (table 3, column 2), as did all of the individual components, although no change in any individual component is significant: the number of meetings was higher, the time elapsed since the last meeting was shorter, the number of topics addressed in the meetings was larger, and attendance at the last meeting was higher (appendix S4, table A3). Overall, the mean of the parent involvement index in the treatment group is 0.06 standard deviations larger than the mean of the control group, and this effect is significant at the 10% level.

The impact of grants on parent responsibility in school management is reported in table 3, column 3. The overall effect of the grants is positive: the mean of the index of the treatment group is almost 0.06 standard deviations above the mean of the control group. The analysis of the detailed variables composing the index shows some small increases in the proportion of school committees in charge of infrastructure, collecting financial contributions, and spending financial contributions. None of these increases is statistically significant (although some of the p-values are close to conventional significance levels), while the effect on the index itself is significant at the 10% level (appendix S4, table A4).

There is no overall impact on parent supervision of teachers (table 3, column 4).
Changes in the proportion of school committees that discuss teacher behavior in school committee meetings, declare that they are active in increasing teacher attendance and improving education quality, declare that they monitor teacher attendance, and take remedial actions against teachers are small and insignificant (appendix S4, table A5). No trend emerges from these variables, and so there is no change in the teacher oversight index.

School Management

While there is no impact of the grant on school accountability overall (table 3, column 5), the analysis of the detailed components shows a 13% increase in the proportion of schools that could present a register for fund collection for examination and a 21% increase in the proportion of schools that could present a register for fund expenses for examination. These increases might simply be the direct consequence of the fact that schools in the treatment group received money from the government and had something to record, rather than an overall change in accountability (appendix S4, table A6). However, the grant changed neither the use of other registers nor the frequency of minutes, which suggests that the increased involvement and responsibility of parents did not lead to a higher demand for transparency and record keeping.

Overall, the cooperation between the school committee and different actors improved (table 3, column 6): school committees are significantly more likely to report support from the community (+5 percentage points), from the teachers (+3 percentage points), and from the parent committee (+5 percentage points) (appendix S4, table A7). The proportions of school committees reporting support from local authorities, school administration, educational advisors, and inspection are also consistently larger, although these differences are not significant.
As a result, the mean of the cooperation index for the treatment group is almost 0.07 standard deviations above the mean of the control group, significant at the 5% level. One explanation for the positive effect of grants on cooperation between school stakeholders and school committees is that placing resources under the control of the school committee increased respect for its activities. The positive effect of the grant on the cooperation between the school committee and the different actors may be important when considering the short-term nature of the experiment. It echoes the short-term effect of a similar program on social capital observed in Burkina Faso (Sawada and Ichii 2012).

Treatment schools increased spending on infrastructure, festivals and playgrounds, and investments in agriculture. The absolute and percent differences in amounts budgeted for a given type of project in treatment schools compared to control schools are presented in figure 2 (significant differences in dark grey, nonsignificant in light grey). The amount budgeted for a given type of project was significantly larger for infrastructure, festivals and playgrounds, and investments in agriculture (table 4): the amount budgeted for infrastructure was about 25% larger in the treatment group (107,705 FCFA (215 USD) versus 86,119 FCFA (172 USD), significant at the 5% level), the amount budgeted for festivals and playgrounds was sixfold greater than in the control group (1,031 FCFA (about 2 USD) versus 166 FCFA (0.33 USD), significant at the 1% level), and the amount budgeted for investments in agriculture was fourfold greater (2,416 FCFA (5 USD) versus 583 FCFA (about 1 USD), significant at the 1% level).
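The relative comparisons above can be checked directly from the quoted amounts. The short sketch below recomputes the absolute difference, the ratio, and the percent difference for each category; the function name and the dictionary layout are illustrative only, and the figures are the FCFA amounts quoted in the text.

```python
def relative_difference(treatment_amount, control_amount):
    """Return (ratio, percent difference) of the treatment amount relative to control."""
    ratio = treatment_amount / control_amount
    return ratio, 100.0 * (ratio - 1.0)


# Amounts budgeted in FCFA as quoted in the text: (treatment group, control group).
budgets_fcfa = {
    "infrastructure": (107_705, 86_119),
    "festivals and playgrounds": (1_031, 166),
    "investments in agriculture": (2_416, 583),
}

for category, (treat, control) in budgets_fcfa.items():
    ratio, pct = relative_difference(treat, control)
    print(f"{category}: +{treat - control:,} FCFA, "
          f"ratio {ratio:.2f}, {pct:.0f}% larger than control")
```

Expressing each difference relative to the control-group amount is what makes the small absolute increases for festivals and agriculture look large in proportional terms.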
Note that the difference, while large relative to the amount spent in control schools on these activities, is small compared to the entire amount of the grant, so the bulk of the grant was not used on school festivals, playgrounds, and agricultural investments. The size of the increase in infrastructure spending in absolute terms (21,586 FCFA, or about 43 USD) is much larger than the increases in agriculture and festival and playground expenses (1,833 FCFA (a bit less than 4 USD) and 865 FCFA (almost 2 USD), respectively).

Figure 2. Conditional Differences in Spending between Treatment and Control Groups
Source: 2008 School Survey. Conditional differences show the size of the coefficient on treatment from a regression including controls for whether the school is in a rural or urban area, total enrollment in 07/08, proportion of girls in 07/08, whether the school had NGO support prior to the grant, and inspection fixed effects. Light bars indicate that the difference is not significant.

Table 4. Impact on Spending Decisions
Dependent variable: Amount of money spent on...

| | (1) Infrastructure and equipment | (2) Supplies and textbooks | (3) Pupil educational support | (4) Pupil health | (5) Teacher support | (6) COGES expenses | (7) School festivals and playground | (8) Investments in agriculture | (9) Total amount |
|---|---|---|---|---|---|---|---|---|---|
| Treatment | 21,586** | 3,222 | 1,435 | 1,253 | -1,086 | 32.14 | 864.8*** | 1,833*** | 28,512*** |
| | (9,121) | (1,981) | (1,369) | (2,154) | (1,331) | (300.6) | (285.5) | (658.5) | (9,993) |
| Constant | -24,197 | 836.7 | -763.1 | -13,404* | 1,489 | 524.5 | -1,599** | -861.4 | -34,994 |
| | (38,103) | (8,622) | (4,031) | (8,062) | (4,576) | (1,046) | (765.0) | (1,098) | (41,928) |
| Observations | 726 | 733 | 734 | 734 | 734 | 738 | 736 | 731 | 698 |
| R-squared | 0.127 | 0.156 | 0.087 | 0.051 | 0.019 | 0.039 | 0.039 | 0.047 | 0.157 |
| Control group mean | 86,119 | 11,631 | 6,058 | 8,711 | 4,352 | 782.7 | 165.8 | 582.9 | 115,898 |

Sources: 2008 School Survey.
Robust standard errors in parentheses. ***p < 0.01, **p < 0.05, *p < 0.10. Regressions control for whether the school is in a rural or urban area, total enrollment in 07/08, proportion of girls in 07/08, whether the school had NGO support prior to the grant, and inspection fixed effects. The dependent variable is the amount in FCFA spent by COGES in the corresponding category of activities, as declared by the president of COGES in the April/May 2008 survey. Infrastructure and equipment includes expenses related to classrooms, desks, chairs, blackboards, school enclosure and security, and cleaning. Supplies and textbooks includes expenses for notebooks, pens, and textbooks. Pupil educational support includes expenses such as additional courses, awareness campaigns to increase enrollment, and academic rewards. Pupil health includes expenses related to nutrition and health, such as drinkable water, meals, latrines, and drugs. Teacher support includes expenses benefiting teachers, such as teacher housing, furniture, supplies, guide books, and salary. COGES expenses includes expenses related to COGES meetings, contributions to "COGES communal," and inspector visits. School festivals and playground includes expenses such as graduation ceremonies, parties, and soccer balls. Investments in agriculture includes fields, crops, and livestock, unrelated to education activities.

The investments in agriculture do not seem to have been made in the interest of a single person, which might be considered theft of resources, but rather as an investment on the part of the school (since they were recorded in the school ledger). One interpretation of the investment in agricultural projects is that credit in many areas of Niger is severely constrained. There may be profit opportunities from investment in agriculture (either in terms of raising crops or arbitraging prices for inputs or food products), but since isolated areas suffer from low levels of credit, these profitable opportunities are unexploited.
If the COGES is aware of these opportunities and is patient, it may be optimal for the long-term interest of the school to invest the windfall cash grant rather than spend it on educational inputs immediately. However, one cannot be sure that these investments were made for the profit of the school, and they may not have benefited the pupils in any way.12

Finally, school committees had spent just above a quarter of the grant at the time of the April/May 2008 questionnaire: the average increase in the total spending amount is 28,512 FCFA (57 USD), while the average grant is 104,500 FCFA (209 USD). This finding indicates that about five months after the grants arrived in treatment schools, the school committees had not yet used the remaining three-quarters of the grant. Together with the types of spending induced by the grant, these results suggest that the school budget constraint is not immediately binding: a large part of the grant is still unused, and some money is spent on leisure and agricultural spending, which seem nonessential for purely educational purposes. Also, the amount budgeted for teacher support is unchanged (the average amount in the treatment schools is even lower than in the control schools, although the difference is not significant), which is striking in a context where teachers suffer from long delays in the payment of their salary. Similarly, it seems surprising that the grant did not change the amount of money spent on supplies and textbooks, pupil educational support like remedial courses, or pupil health expenses, in a context where school equipment is very poor and pupils do not perform well at the primary school final exam. Overall, the impact of the grant on school expenses suggests that in the context of Niger, parents might not have sufficient information to make investments that are likely to improve school quality. Other explanations, which may simultaneously be true, are that parents were saving the grant in the face of uncertain future cash flows (see Sabarwal et al. 2014), that they were saving money in order to offset fees in the following year, or that they were saving money for lumpy investments.

12 Future researchers examining local school management and activities should consider collecting data on school festivals, as well as school business investments, as potential targets of school spending. These expenditures were not foreseen and so detailed questions on these expenditures (for example, the number and type of school festivals, or the anticipated return of investment projects) were not included in the questionnaire, nor were questions about the local credit market.

School Quality

Improvements are observed only for infrastructure and health resources, alongside small increases in participation at the lowest grades. There is no improvement in materials or in teacher effort. On the contrary, there is a small decrease in teacher attendance. There is no evidence that test scores increased in response to the program.

In the slightly longer term (one year after the treatment) there is a small improvement in the infrastructure index of schools: a 0.04 standard deviation increase in the index for infrastructure quality (table 3, column 7), significant at the 10% level. This is largely driven by increases in the number of classrooms and the construction of walls around the compound (appendix S4, table A8).13 The increase in the number of new classrooms amounts to 0.12 of a standard deviation, representing an additional 0.08 new classrooms per school in the treatment group over 0.28 new classrooms per school in the control group (a 29% increase).
The increase in the proportion of schools with walls around the compound (enclosure) amounts to 0.18 of a standard deviation, with 9 percentage points more in the treatment group over 34% in the control group (a 26% increase).

There is no overall impact on the materials available at the schools (books and classroom materials such as rulers, protractors, and maps) (table 3, column 8; appendix S4, table A9).

There is a small (0.05 standard deviations) increase in the index of health resources (table 3, column 9), significant at the 10% level. This increase is driven by increases in health information sessions (34% versus 30% of schools), first aid kits (12% versus 9% of schools), micronutrient supplementation (25% versus 22% of schools), and deworming (64% versus 62% of schools), though none of the individual components of the health index is significant on its own (appendix S4, table A10).

There is no effect of the grant on the number of days when class was cancelled because teachers were on strike, nor on the opinion of the school committee on teacher assiduousness and punctuality, but a decrease in teacher presence is observed in the treatment group: around 4 percentage points less than the average of 76% presence in the control group, significant at the 10% level (table 5). Teachers thus responded to increased resources under the control of parents with a reduction in their own inputs. Informal feedback from the field suggested that those teachers who felt the central government should make education decisions disliked that the communities were in charge of the grant, and they may have felt resentful that the grants undermined their authority (as representatives of the central government).
In addition, the decreased teacher presence might also be related to the fact that the average school committee did not spend the grant on expenses supporting the teachers (teacher housing, furniture, supplies, guide books, and salary), even though school committees had not spent the entire grant at the time of the survey. As a consequence, teachers might have had the impression that parents were not capable of wisely investing the money allocated to them and might have been resentful. Any such resentfulness might have been exacerbated by the ongoing pay disputes between the teachers and the government at that time (in many cases, teachers' salaries had been substantially delayed or teachers had not been paid).

13 These items were also projects that were frequently reported by the schools as projects undertaken using the grant money.

Table 5. Impact on Teacher Effort

| | (1) Days on strike | (2) Teacher is present | (3) COGES opinion of teacher effort | (4) Teacher effort index |
|---|---|---|---|---|
| Treatment | -0.541 | -0.0382* | -0.0220 | -0.0237 |
| | (0.490) | (0.0227) | (0.0253) | (0.0435) |
| Constant | -2.071 | 0.937*** | 3.656*** | 0.484*** |
| | (2.292) | (0.0738) | (0.0932) | (0.158) |
| Observations | 706 | 799 | 734 | 784 |
| R-squared | 0.127 | 0.248 | 0.134 | 0.213 |
| Control group mean | 4.592 | 0.760 | 3.617 | -0.00712 |

Sources: 2008 School Survey.
Robust standard errors in parentheses. ***p < 0.01, **p < 0.05, *p < 0.10. Regressions control for whether the school is in a rural or urban area, total enrollment in 07/08, proportion of girls in 07/08, whether the school had NGO support prior to the grant, and inspection fixed effects. Days on strike is the number of days that the school was closed due to teachers striking in 2007/2008.
Teacher is present is the school average of the dummy variable equal to 1 if a teacher is physically present on the day of the visit (on a day when the school was supposed to be open). If the school was closed, all teachers were counted as absent. The Teacher effort index is the average of the z-scores of the variables in columns (1) to (3), oriented so that more beneficial outcomes have higher values.

There is no change in enrollment or dropout overall (table 6), but there is a positive impact at the lowest grade levels. The grant program reduced dropouts from grade one at the end of the 2007/2008 school year (2% versus 3% in the control schools) (column 4 of table 6A), a finding that is matched by an increase in enrollment in grade two at the beginning of the 2008/2009 school year (thirty-three versus thirty pupils in the control schools) (column 5 of table 6B).

Table 6. Impact on Dropout and Enrollment

A: Dependent variable: Dropout as reported at school visit in Spring 2008

| | (1) Total | (2) Total girls | (3) Total boys | (4) Grade 1 | (5) Grade 2 | (6) Grade 3 | (7) Grade 4 | (8) Grade 5 | (9) Grade 6 |
|---|---|---|---|---|---|---|---|---|---|
| Treatment | -0.00559 | -0.206 | -0.00469 | -0.0136* | -0.00646 | -0.00791 | -0.00778 | 0.00264 | 0.00139 |
| | (0.00520) | (0.212) | (0.00609) | (0.00758) | (0.0107) | (0.00582) | (0.0100) | (0.00849) | (0.00987) |
| Constant | 0.0723*** | 0.775 | 0.0908*** | 0.0366** | 0.0613** | 0.0678*** | 0.143** | 0.115** | 0.0891** |
| | (0.0165) | (0.662) | (0.0224) | (0.0183) | (0.0291) | (0.0240) | (0.0570) | (0.0455) | (0.0384) |
| Observations | 748 | 754 | 753 | 531 | 434 | 525 | 454 | 381 | 466 |
| R-squared | 0.059 | 0.036 | 0.055 | 0.038 | 0.042 | 0.046 | 0.090 | 0.068 | 0.104 |
| Control group mean | 0.0359 | 0.366 | 0.0379 | 0.0296 | 0.0328 | 0.0295 | 0.0364 | 0.0313 | 0.0508 |

B: Dependent variable: Enrollment as reported in 2008/09 Administrative Data

| | (1) Total | (2) Total girls | (3) Total boys | (4) Grade 1 | (5) Grade 2 | (6) Grade 3 | (7) Grade 4 | (8) Grade 5 | (9) Grade 6 |
|---|---|---|---|---|---|---|---|---|---|
| Treatment | 1.366 | 0.505 | 0.862 | -0.604 | 3.256** | -0.471 | -0.541 | 0.366 | -0.639 |
| | (2.445) | (1.254) | (1.654) | (1.502) | (1.376) | (1.174) | (1.190) | (1.019) | (0.962) |
| Constant | 37.56** | -21.01*** | 58.57*** | 34.47*** | -1.052 | 5.214 | 1.546 | -1.388 | -1.225 |
| | (15.14) | (7.562) | (9.652) | (6.267) | (6.441) | (4.881) | (4.534) | (3.911) | (3.925) |
| Observations | 988 | 988 | 988 | 988 | 988 | 988 | 988 | 988 | 988 |
| R-squared | 0.901 | 0.880 | 0.866 | 0.470 | 0.545 | 0.546 | 0.484 | 0.520 | 0.540 |
| Control group mean | 160.3 | 65.70 | 94.63 | 40.09 | 29.95 | 23.87 | 26.22 | 20.98 | 19.22 |

Sources: Ministry of Education Administrative Data and 2008 School Survey.
Robust standard errors in parentheses. ***p < 0.01, **p < 0.05, *p < 0.10. Regressions control for whether the school is in a rural or urban area, total enrollment in 07/08, proportion of girls in 07/08, whether the school had NGO support prior to the grant, and inspection fixed effects. Table 6A gives the impact of the treatment on dropout rates in the spring of 2008. Schools without a particular grade level are missing. Some schools did not provide breakdowns by sex. Table 6B gives the impact of treatment on enrollment in the fall of 2008 (the academic year following the treatment). Schools that have zero pupils at a given grade level (because they are missing a particular level) are counted as zeros.

The fact that participation increases only for the youngest pupils suggests that participation is more elastic when the child is young. This might be because the opportunity cost of time is higher for older children.14

The number of candidates presented for the end of primary school exam at the end of the 2007/2008 school year, the pass rate for the end of primary school exam, and the math, French, and oral tests administered during the April/May 2008 questionnaire visit were not affected (table 7). Since participation increased (or fewer children dropped out) in the lowest grades, one cannot rule out a downward bias due to attrition.
However, the fact that test scores remained unchanged in the higher grades, where there was no change in participation, supports the finding that there was no improvement in learning.

Table 7. Impact on Test Scores

| | (1) Oral | (2) Math | (3) French | (4) End primary pass rate |
|---|---|---|---|---|
| Treatment | -0.101 | -0.0351 | -0.0338 | -0.0244 |
| | (0.0749) | (0.0588) | (0.0586) | (0.0227) |
| Constant | -0.0252 | -0.159 | 0.0648 | 0.525*** |
| | (0.261) | (0.209) | (0.221) | (0.0706) |
| Observations | 499 | 763 | 739 | 557 |
| R-squared | 0.200 | 0.200 | 0.251 | 0.177 |
| Control group mean | 0.00828 | 0.00545 | 0.0145 | 0.614 |

Source: 2008 School Survey.
Robust standard errors in parentheses. ***p < 0.01, **p < 0.05, *p < 0.10. Regressions control for whether the school is in a rural or urban area, total enrollment in 07/08, proportion of girls in 07/08, whether the school had NGO support prior to the grant, and inspection fixed effects. Oral, Math, and French test scores are normalized scores from the World Bank administered exam in the spring of 2008. Oral tests were given only to pupils in grades one and two. The end primary pass rate is the percent of students from the school who passed the exam at the end of grade six at the end of 2008 (administrative data).

Heterogeneous Treatment Effects

The paper now examines heterogeneous effects along the dimensions identified above. Due to space limitations, we do not present the detailed regression tables in the paper, but they are available from the authors upon request.

There are two overall messages from this analysis. The first is that the most difficult management task, monitoring teachers, was undertaken only by educated COGES or those with experience in other organizations, that is, those with higher capacity.
The second is that, in one-teacher schools, there was a greater threat of teacher strikes, more of the grant was spent on items that benefited the teachers in some way, and, perhaps as a consequence of spending on items that benefited teachers, teacher presence increased slightly.

Education of the COGES

Communities where the school committees were educated increased their supervision of teacher attendance in response to the grant. Educated school committees are 9 percentage points more likely to supervise teacher presence if the school was treated, significant at the 10% level. However, the increased monitoring did not reduce teacher absenteeism, suggesting that parents were not able to effectively confront teachers.

14 The fact that only younger grades were impacted is evidence that the change in enrollment is not due to intentional misreporting by grant schools. In addition, the finding is replicated across two different types of data collections and at two different periods.

In terms of spending, educated COGES who received grants focused investments on infrastructure, perhaps to the detriment of other types of spending.15 COGES without educated members, on the other hand, increased spending on Health Resources and Pupil Educational Support.16 The negative impact of the grant on money for Pupil Educational Support and the health resources index might reflect that educated COGES increased expenses in infrastructure, which are generally lumpy investments, and might have required the school to spend less on other items.

There is also a negative impact of the grant on math and French test scores in schools with educated COGES (about one-third of a standard deviation, significant at the 5% level for math and 10% level for French).
This negative impact of the grant on learning in schools with educated COGES, who focused spending on infrastructure, echoes the findings in the literature that providing more-of-the-same educational inputs typically has no impact on learning, whereas interventions such as remedial education and rewards are more effective at increasing learning (Kremer et al. 2013). Educated COGES may not have made the optimal choice because they decreased spending on pupil educational support, perhaps to finance the lumpy infrastructure investments.

Experienced COGES

Schools where the COGES has at least one member who is also a member of another community organization increased monitoring of teacher attendance in response to the grant. These schools are also those that enjoyed the increases in the cooperation index, whereas schools with no member who is also a member of another community organization had no increases.17

Wealth of the COGES

Parent responsibility increased more in wealthy communities.18 We note that the results for wealth are different from the results for educated and experienced COGES, showing that the effects we find for education and experience are not merely proxies for wealth.

One-Teacher Schools

One-teacher schools seem to have made a different choice than larger schools, with important effects: they budgeted more money for expenses related to Teacher Support.19 This may be because there was more threat of striking from the teachers: one-teacher schools in the treatment group lost 1.3 days more to teacher strikes than one-teacher schools in the control group (significant at the 10% level).

15 Note that while educated COGES budgeted more money for infrastructure (58,755 FCFA (117 USD), significant at the 5% level), the increases in infrastructure in the following year were felt primarily in schools with noneducated COGES: the coefficient on the interaction term of treatment and education is negative (-0.08 SD) and significant at the 5% level.
One possible reason, if the data on spending are accurate, is that the projects undertaken by educated COGES in response to the grant might have been larger and taken more time, so that they were not yet completed at the time that data on infrastructure were collected.

16 For Health Resources, the treatment coefficient for the noneducated COGES is 0.06 SD, significant at the 10% level, while the coefficient for the interaction term is -0.12, significant at the 10% level, suggesting zero or negative impact of the grants on health resources in the educated COGES. For Pupil Educational Support, schools with noneducated COGES increased spending (3,639 FCFA (7 USD), significant at the 5% level), but there is no impact (or possibly a negative impact) for schools with educated COGES (the coefficient on the interaction term is -8,215 FCFA (16 USD), significant at the 5% level).

17 For monitoring teacher attendance, the coefficient on the interaction term is 0.11, significant at the 5% level, and for cooperation, the coefficient is 0.07, significant at the 10% level.

18 Each standard deviation increase in wealth is associated with an additional 0.05 standard deviation increase in the parent responsibility index in response to the grant, significant at the 5% level.

19 The coefficient on the interaction term is 8,985 FCFA (18 USD), significant at the 5% level.

Perhaps as a result, one-teacher schools are the only schools not to suffer from the negative impact of the grants on teacher attendance on the day of the visit.20 In fact, the size of the coefficient on the interaction term suggests that teacher attendance actually increased in one-teacher schools.
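The reading of the interaction coefficients used here can be made explicit with a small worked calculation: the implied point estimate for the subgroup with the characteristic is the sum of the treatment coefficient and the interaction coefficient. The sketch below applies this to the coefficients reported in footnotes 16, 20, and 21. The function name is hypothetical, and the sums are point estimates only; the footnotes report the significance of each coefficient separately, not of the sum.

```python
def subgroup_effects(treatment_coef, interaction_coef):
    """Implied treatment effects from a treatment-by-characteristic interaction.

    Returns (effect when the characteristic is 0, effect when it is 1).
    """
    return treatment_coef, treatment_coef + interaction_coef


# (treatment coefficient, interaction coefficient) as reported in the footnotes.
reported = {
    "teacher presence, one-teacher schools (fn. 20)": (-0.06, 0.17),
    "infrastructure index, one-teacher schools (fn. 21)": (0.06, -0.17),
    "health resources index, educated COGES (fn. 16)": (0.06, -0.12),
}

for label, (treat, inter) in reported.items():
    without, with_char = subgroup_effects(treat, inter)
    print(f"{label}: {without:+.2f} without the characteristic, "
          f"{with_char:+.2f} with it")
```

A positive sum for teacher presence is what underlies the statement that attendance rose in one-teacher schools, while the negative sums for infrastructure and health resources correspond to the "zero or negative impact" wording in the footnotes.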
This suggests that by transferring some of the grant to teachers, or at least to investments that benefit teachers, the one-teacher schools limited the reduced teacher attendance associated with the grant in other schools. However, infrastructure in one-teacher schools did not improve, in contrast to other schools.21

Urban and Rural Schools

Increases in in-kind contributions are driven by parents in urban schools.22 The increase in the parent responsibility index is also driven by increases in urban rather than rural schools.23 Only schools located in rural areas increased their spending on agricultural investments.24 This may be because credit constraints are less severe in urban areas, but no data are available to confirm this.

IV. Conclusion

The short-run impact of grants to school committees in Niger was to increase cooperation and participation along several dimensions without crowding out parent financial contributions. The implication of this finding is that one way to potentially avoid the crowding out due to increased inputs found in other experiments is to involve parents in the management of the funds. Increased parent participation also came with a small increase in young pupil participation.

However, more pessimistically, while the parents were willing to try to improve quality by participating, they were not able to do so, at least in the short run. One possible reason for this is that, in this context, parents (the majority of whom did not go to school) do not have sufficient information to make investments that are likely to improve quality. In particular, most investments focused on buildings, rather than extra lessons or materials, and these investments did not translate into improved learning. On average, teachers decreased their effort in response to the grant to the COGES.
This finding reinforces other evidence in the literature of negative teacher reactions to participatory programs and highlights the importance of taking this potential reaction into account in policy planning.

The heterogeneous impact analysis, while second-order, yields potentially helpful insights for understanding the impact of the program and considering future programs. The most difficult type of participation, monitoring teachers, was attempted only by educated or experienced school committees. This suggests that participation initiatives need to take the capacity and authority of the intended participants into account. In addition, one-teacher schools that invested in the teacher's working conditions and/or made some type of transfer to the teacher actually increased teacher attendance. This finding suggests that teachers' negative reaction to parent participation might be reversed when parents are "on the side" of the teachers. Finally, rural school committees as well as noneducated school committees invested a small part of the grant in agriculture, perhaps because they did not prioritize education or because they invested the money in order to get more funds for the school in the future. We highlight this finding so that future programs might be aware of it and collect more data to understand what schools might do with grants and the role that education preferences and credit constraints play in those decisions.

20 The coefficient on the interaction term is 0.17, significant at the 5% level, and the coefficient on the treatment term is -0.06, significant at the 5% level.

21 Infrastructure may have even degraded: the coefficient on the interaction term is -0.17, significant at the 1% level, while the coefficient on the treatment variable is 0.06, significant at the 5% level. Note that since the grant was based on the size of the school, one-teacher schools received smaller grants. They may then have been pushed away from investment in infrastructure since the lump sum was not enough to start a project.

22 Urban schools were 17% more likely to have made in-kind contributions, significant at the 10% level.

23 The coefficient on the interaction term is almost 0.3 standard deviations, significant at the 1% level, whereas the coefficient on treatment alone in the interaction specification is near zero.

24 Rural areas increased spending on agricultural investments by 2,046 FCFA (4 USD), significant at the 1% level, and the interaction term for urban schools is -1,755 FCFA (3.5 USD), significant at the 5% level.

These findings are from an evaluation that ended prematurely. As such, their generalizability is limited, even though they give some insight into what may be the immediate barriers to a community's ability to effectively leverage grant programs.

There are four key policy implications of the findings in this paper. First, on some measures, participatory programs can be successful: parents increased their participation in school management in response to the grant without immediately reducing their contributions. Second, on the other hand, there is no reason to assume that parents will make wise spending and management decisions. Third, capacity matters for difficult tasks, as in this case the parents with education or experience were those able to supervise teacher attendance. Finally, teachers may respond to parent empowerment by reducing effort, and avoiding this may require ensuring that teachers also benefit in some way.

References

Aghion, P., and J. Tirole. 1997. "Formal and Real Authority in Organizations." Journal of Political Economy 105 (1): 1–29.
Banerjee, A., R.
Banerji, E. Duflo, R. Glennerster, and S. Khemani. 2010. “Pitfalls of Participatory Programs: Evidence from a Randomized Evaluation in Education in India.” American Economic Journal: Economic Policy 2 (1): 1–30. Barrera-Osorio, F., T. Fasih, H. Patrinos, and L. Santibanez. 2009. “Decentralized Decision-Making in Schools, The Theory and Evidence on School-Based Management.” Washington, DC: The World Bank. Blimpo, M., D. Evans, and N. Lahire. 2015. “Parental Human Capital and Effective School Management: Evidence from the Gambia.” World Bank Policy Research Working Paper N 7238. Bryk, A., Y. M. Thum, J. Easton, and S. Luppescu. 1998. “Academic Productivity of Chicago Public Elementary Schools: A Technical Report Sponsored by the Consortium on Chicago School Research.” Mimeo. University of Chicago. Das, J., S. Dercon, J. Habyarimana, P. Krishnan, K. Muralidharan, and V. Sundararaman. 2013. “School Inputs, Household Substitution, and Test Scores.” American Economic Journal: Applied Economics 5 (2): 29–57. Di Gropello, E. 2006. “A Comparative Analysis of School-based Management in Central America.” World Bank Working Paper, 72. Duflo, E., P. Dupas, and M. Kremer. 2015. “School Governance, Teacher Incentives, and Pupil-Teacher Ratios: Experimental Evidence from Kenyan Primary Schools.” Journal of Public Economics 123: 92–110. Galiani, S., P. Gertler, and E. Schargrodsky. 2008. “School Decentralization: Helping the Good Get Better, but Leaving the Poor Behind.” Journal of Public Economics 92 (10): 2106–20. Galiani, S., and E. Schargrodsky. 2002. “Evaluating the Impact of School Decentralization on Educational Quality.” Economia 2 (2): 275–314. Galiani, S., and R. Perez-Truglia. 2013. “School Management in Developing Countries.” CEDLAS, Working Papers 0147, Universidad Nacional de La Plata. Hanushek, E. A., S. Link, and L. Woessmann. 2013. “Does school autonomy make sense everywhere? Panel estimates from PISA.” Journal of Development Economics 104: 212–32. Hess, A. 1999. 
“Expectations, Opportunities, Capacity and Will: The Four Essential Components of Chicago School Reform.” Educational Policy 13 (4): 494–517. King, E., and B. Ozler. 2005. “What’s Decentralization Got to Do With Learning? School Autonomy and Student Performance.” Discussion Paper No. 054, Interfaces for Advanced Economic Analysis, Kyoto University. Kling, J. R., J. B. Liebman, and L. F. Katz, 2007. “Experimental Analysis of Neighborhood Effects.” Econometrica 75 (1): 83–119. 552 Beasley and Huillery Kremer, M., C. Brannen, and R. Glennerster. 2013. “The Challenge of Education and Learning in the Developing World.” Science 340 (6130): 297–300. Lassibille, G., J. Tan, C. Jesse, and T. Van Nguyen. 2010. “Managing for Results in Primary Education in Madagascar: Evaluating the Impact of Selected Workflow Interventions.” World Bank Economic Review 24 (2): 303–29. Downloaded from https://academic.oup.com/wber/article-abstract/31/2/531/2897782 by Joint Bank-Fund Library user on 08 August 2019 Pop-Eleches, C., and M. Urquiola. 2013. “Going to a Better School: Effects and Behavioral Responses.” American Economic Review 103 (4): 1289–324. Pradhan, M., D. Suryadarma, A. Beatty, M. Wong, A. Alishjabana, A. Gaduh, and R. P. Artha. 2014. “Improving Educational Quality through Enhancing Community Participation: Results from a Randomized Field Experiment in Indonesia.” American Economic Journal: Applied Economics 6 (2): 105–26. Sabarwal, S., D. Evans, and A. Marshak. 2014. “The Permanent Input Hypothesis: The Case of Textbooks and (No) Student Learning in Sierra Leone.” World Bank Policy Research Working Paper 7021. Sawada, Y., and T. Ishii. 2012. “Do Community-Managed Schools Facilitate Social Capital Accumulation? Evidence from the COGES Project in Burkina Faso.” JICA-RI Working Paper 42. Bank, World. 2003. World Development Report 2004: Making Services Work for Poor People. New York, NY: Oxford University Press. 
The World Bank Economic Review, 31(2), 2017, 553–569
doi: 10.1093/wber/lhw005
Advance Access Publication Date: March 22, 2016
Article

Downloaded from https://academic.oup.com/wber/article-abstract/31/2/553/2897315 by Joint Bank-Fund Library user on 08 August 2019

Providing Policy Makers with Timely Advice: The Timeliness-Rigor Trade-off

Clive Bell and Lyn Squire

Abstract

Policy makers bemoan the lack of research findings to guide urgent decisions, whereas researchers’ professional code puts rigor first. This article argues that provisional assessments, produced early in the research cycle, can bridge the gap. Numerous case studies point to the importance of early interaction with policy makers and the delivery of brief, policy-focused papers; but preliminary analyses may be flawed and so increase the chances of a wrong decision. This article demonstrates analytically that a preliminary assessment, supported by the offer of more refined research, provides an option that is superior, on average, to the current practice of submitting a final report at the end of the research cycle. Where practical implementation is concerned, it calls for donor-funded subsidies to promote the use of provisional assessments and for a rapid, independent, professional review process to ensure their quality. While the research-policy exchange in developing countries is a complex, context-specific phenomenon, the proposal offered here holds out some promise of improving decisions in the public sphere under a wide range of circumstances.

JEL classification: D61, H43, O20, O21, O22

If economists could manage to get themselves thought of as humble, competent people on a level with dentists, that would be splendid.
J. M. Keynes, Essays in Persuasion, 1931.

Clive Bell is professor of economics, emeritus, at the University of Heidelberg and Adjunct Senior Researcher at CMI, P.O. Box 6033, N-5892 Bergen, Norway; his email address is: clive.bell@urz.uni-heidelberg.de. Lyn Squire (corresponding author) is the former president of the Global Development Network, 4, Vasant Kunj Institutional Area, New Delhi-110070, India; his email address is lynsquire@yahoo.com. We are much indebted to Pierre Jacquet for valuable and extensive comments on an earlier draft. Subsequently, helpful suggestions from Fred Carden, Randy Filer, Stefan Klasen, Halvor Mehlum, Lant Pritchett, Indira Rajaraman, Alan Winters, participants in CMI’s annual development seminar, and especially two anonymous referees led to improvements in various parts of the text. We bear sole responsibility for all remaining errors of analysis and opinion.

© The Author 2016. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

I. Introduction

The development community has embraced the notion of so-called evidence-based policy, which is generally understood to be public policy informed by rigorously established, objective evidence (see, e.g., Pawson 2006; Nutley et al. 2007; Carden 2009). This general desideratum is certainly not new, but it was popularized lately by the Blair Government in the United Kingdom (Cabinet Office 1999a, 1999b). Official pronouncements have their value, but the link between research and policy is often tenuous at best.

The focus of this article is the interaction between researchers and policy makers in developing countries. Descriptions of this process usually stress its complexity and context-specificity and point to a multitude of factors, other than research, that influence decisions (Court et al. 2005; Livny et al. 2006; Carden 2009).
Policy makers have to worry about the interests of relevant constituents, pressure from external agencies, and personal career ambitions; they may have to contend with weak implementing agencies, corruption, and even violence; they may prefer to draw on their own experience and knowledge or that of their immediate advisors; and they may feel that local research is unreliable or based on data that are limited, inaccurate, or unavailable when required. In sum, even where research appears to play a role, it will only be one, and usually not the most important, in a complex array of factors influencing policy.

If the aim is to make research more influential, efforts to remove, or at least weaken, the impediments to a vigorous research-policy interaction merit serious consideration. The contribution of this article is to propose a way of overcoming one of these obstacles: the frequent unavailability of research when required for urgent policy decisions. While research must be sufficiently rigorous that, if acted upon, there is a reasonable expectation it will improve the policy decision, it must also be timely, in the sense that it has to reach the policy maker before the decision is made. This article argues that timeliness of results has received less attention than research quality, and less than it merits.

Meeting accepted professional standards has been the watchword for the research community. In consequence, efforts to feed advice and guidance into the policy-making process are usually undertaken only at the end of the research cycle, after the research has been completed and vetted. While there is an obvious rationale for this (no one wants results based on badly flawed analysis or grossly inaccurate data reaching the policy maker’s desk), the approach runs the risk that results arrive too late to be of value.
Indeed, decision makers worldwide constantly voice frustration with the slowness of the research production process. The ODI/SciDev.Net International Survey on the Science-Development Policy Interface, for example, reports that “A general consensus from the expert interviews was that a major challenge is the narrow focus and long time-scales of scientific research compared with political priorities” (Jones et al. 2008, 20). Similarly, an ODI Briefing Paper notes that “Policymakers bemoan the inability of many researchers to make their findings accessible and digestible in time for policy decisions. Practitioners often just get on with things” (Overseas Development Institute 2004, 1).

A possible remedy is to encourage the submission of preliminary results early in the research cycle in order to increase the likelihood of their reaching the policy maker before the decision has been taken. It is this thought that leads to the main proposal of this article: the introduction of what we term provisional assessments. As explained more fully in section II, provisional assessments are short papers focused on a current policy issue and delivered to the policy maker early in the research process. With timeliness at a premium, the preliminary analysis presented in these submissions should be based entirely on existing, easily accessible information and routine calculations, with evidence from other countries or extrapolation of historical trends being called upon to supplement whatever hard data are at hand. In consequence, a provisional assessment should also indicate whether its findings are sufficiently robust to warrant an immediate decision, and if not, what additional evidence and analysis are needed. Ideally, the latter conclusion should lead to a request from the policy maker for a more thorough investigation, accompanied by an indication of when results are required if they are to be useful.
If, as might first appear, provisional assessments simply provide an additional arrow in the researcher’s quiver to be used when and where needed in the effort to inform policy, then, apart from a few remarks on implementation, no further discussion would be required. Provisional assessments come, however, with a potential cost. The danger is that the policy maker acts immediately on receiving one. Had the researcher bided her time, completed her research as thoroughly as possible, and only then submitted a final report, its findings might have pointed to a quite different policy. Meanwhile, the policy maker, who is buffeted by the pressures of office, might have made his decision well before the final report reaches his desk. In short, there is a trade-off between timeliness and rigor.

Section III addresses this trade-off analytically. To our knowledge, it is one of the first attempts to go beyond the research-policy literature’s reliance on case studies by offering a rigorous analytical framework. It is demonstrated that a provisional assessment, backed by the promise of a more complete assessment to follow if the provisional one points to significant uncertainty about the policy’s social profitability, is superior to the current practice of submitting only a final report.

Section IV deals with implementation. It explores ways of promoting the use of provisional submissions, currently a missing component in the research-policy chain, initially by means of inexpensive, donor-funded subsidies. It also proposes checks on provisional assessments in order to ensure quality and reduce the chances of an incorrect decision, even if that decision is taken immediately on receipt of the preliminary analysis.
The section also discusses ways of increasing the likelihood that the initial approach to the policy maker does in fact lead to whatever further research is recommended in the provisional assessment. These practical measures buttress our claim that provisional assessments, if widely used, could have a significant and favorable impact on policy in many situations. Section V summarizes the argument and proposals.

II. Provisional Assessments

Rigor Reigns

The primacy of rigor over timeliness is not surprising given that the incentives prevailing within the economics profession promote rigor before all else. There are well-established mechanisms to ensure that professional standards are met through the widespread use of peer reviews; and researchers, who naturally seek professional advancement, are keen to submit their research to a thorough, independent vetting by other scholars. There is, in contrast, no such history of individual or collective action from within the profession to promote timeliness in relation to policy making.

As one illustration, consider the widely held view that randomized controlled experiments are superior to other techniques of impact evaluation such as multivariate regression and propensity score matching, even though these other options offer the prospect of yielding results more quickly. Thus, the website of the Abdul Latif Jameel Poverty Action Lab notes that “Randomized evaluations are often deemed the gold standard of impact evaluation, because they consistently produce the most accurate results” (http://www.povertyactionlab.org/about-j-pal).1

To compound matters, the instruments currently used to reach policy makers, namely the media, dedicated seminars, and policy briefs,2 are not well designed to ensure timeliness, since they are almost invariably brought into play only after the research has been completed and reviewed.
Thus, the main vehicles now used to strengthen the impact of research on policy reinforce, or at least coincide with, the researcher’s commitment to rigor, in that both imply an interaction with the policy maker that follows research, and therefore raise the question of whether as much as possible is being done to ensure the timely delivery of results.

A Way Forward

The experience of the International Development Research Centre (IDRC), perhaps the leading international organization as far as linking research and policy is concerned, as well as that of other institutions, suggests a way forward. In particular, Carden (2009, 45), drawing on a detailed review of 23 IDRC-funded research projects and other evidence, stresses two points: first, the importance of initiating a dialogue with the policy maker as early as possible in the research process; and second, the value of submitting short papers tightly focused on the policy question under consideration, as opposed to the more traditional, methodology-heavy accounts typically produced at the end of a research project. It is this combination of early interaction and policy-focused submissions that leads to the main proposal of this article: the introduction of what we term provisional assessments.

1 Similar statements abound: the mission statement of the Coalition for Evidence-Based Policy asserts that “evidence of effectiveness generally cannot be considered definitive without ultimate confirmation in well-conducted randomized controlled trials” (http://coalition4evidence.org/). The Report of the Evaluation Gap Working Group observes that “most questions about the impact of social programs require collecting data over years. Valid evidence of a program’s effectiveness often cannot be produced in less time” (Savedoff et al. 2006, 25).
2 For a useful overview of policy briefs, see Jones and Walsh (2006).
The type of research for which provisional assessments are intended is typically country-specific, issue-specific, and employs techniques such as cost-benefit analysis, cost-effectiveness analysis, and impact evaluations, possibly supported by econometrics or partial and even general equilibrium models.3 The following case study provides an example of how a provisional assessment can be used to address the type of well-defined policy question we have in mind.

The issue considered by Annor-Amevor et al. (2012) is the low pass rate for the Basic Education Certificate Examination achieved by Ghanaian children in junior high schools, a matter of then current concern in Ghana.4 In particular, the authors assess the merits of participatory remedial classes for those who have failed the examination, combined with an opportunity to re-sit without repeating previous school grades. Besides useful background information, the study provides information on four key magnitudes:

• Size of the population likely to be affected. The authors estimate that 41,686 children failed the examination in the five regions under study.
• Costs. They compile a detailed estimate of expected costs, concluding that the cost per pupil will be $254.50.
• Impact. Based on experience elsewhere, they assume that the pass rate will increase from 62 to 77 percent as a result of the intervention.
• Benefits. Using current Ghanaian data on earnings, they project that a certificate-holder will earn 20 percent more than someone without the certificate.

Armed with this information, the authors conduct a cost-benefit analysis and arrive at a positive net present value for the proposed intervention.
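The arithmetic behind a cost-benefit calculation of this kind can be sketched in a few lines. The sketch below is illustrative only and does not reproduce the study's own computation: the cost, pass-rate, and earnings-premium figures are those quoted above, whereas the baseline annual earnings, the working horizon, and the discount rate are hypothetical placeholders, as is the assumption that the 15-point gain in the pass rate applies to each enrolled pupil.

```python
def npv_per_pupil(cost, pass_gain, premium, annual_earnings, years, rate):
    """Expected net present value of enrolling one pupil in remedial classes.

    Benefit: added chance of passing, times the discounted stream of extra
    earnings attributed to holding the certificate. Cost: paid up front.
    """
    annuity = sum(1.0 / (1.0 + rate) ** year for year in range(1, years + 1))
    return pass_gain * premium * annual_earnings * annuity - cost


# Figures quoted in the text.
PUPILS = 41_686              # children who failed the examination
COST_PER_PUPIL = 254.50      # USD
PASS_GAIN = 0.77 - 0.62      # pass rate assumed to rise from 62 to 77 percent
PREMIUM = 0.20               # projected earnings premium for certificate-holders

# Hypothetical placeholders, NOT taken from the study.
ANNUAL_EARNINGS = 1_000.0    # USD per year without the certificate
YEARS = 30                   # working horizon
RATE = 0.10                  # social discount rate

# Outlay if every one of the pupils were enrolled (an assumption).
total_cost = PUPILS * COST_PER_PUPIL


def sensitivity(pass_gains, premiums):
    """NPV per pupil over a grid of impact and benefit assumptions."""
    return {
        (gain, prem): npv_per_pupil(
            COST_PER_PUPIL, gain, prem, ANNUAL_EARNINGS, YEARS, RATE
        )
        for gain in pass_gains
        for prem in premiums
    }


if __name__ == "__main__":
    print(f"Total cost if all pupils enrol: {total_cost:,.2f} USD")
    grid = sensitivity([0.10, PASS_GAIN, 0.20], [0.10, PREMIUM, 0.30])
    for (gain, prem), value in sorted(grid.items()):
        print(f"pass gain {gain:.2f}, premium {prem:.2f}: NPV {value:,.2f} USD")
```

Varying the two "soft" inputs, impact and benefits, over a grid is one simple way to see how far the sign of the net present value depends on them; the sign obtained here depends entirely on the placeholder values.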
They also undertake a sensitivity analysis, especially with respect to impact and benefits, which suggests that the initiative is worth undertaking for a wide range of estimates. The example highlights the three characteristics that are likely to be typical of provisional assessments: focus on a well-specified policy issue; use of standard techniques; and reliance on readily available information.

The Timeliness-Rigor Trade-off

The Ghanaian example illustrates the trade-off between timeliness and rigor. All required information for this analysis could be obtained from existing sources because the analysts were willing to draw not only on “hard” evidence (coverage and costs) but also on the “soft” kind, in the form of international experience (impact) and extrapolation (benefits). This emphasis on ready availability should not be understood to rule out the possibility of collecting original data; but the key point is that the information underlying a provisional assessment, whatever the source, has to be marshalled quickly.5 Even though this approach will usually entail reliance on data that are soft and potentially unreliable, professional integrity is in no way compromised. All that is required is that the provisional assessment states clearly which inputs into the analysis have been established fairly accurately, which have been based on historical or international experience,6 and how use of the latter affects the robustness of the findings.

In the Ghanaian example, the robustness checks suggest that the downside risk of sacrificing full rigor is negligible, but this cannot hold in general. In evaluating the potential of provisional assessments, therefore, it would be desirable to have some sense of the size of this downside risk and, going a step further, whether the sacrifice of rigor involved in providing such assessments likely nullifies their advantage in terms of timeliness, a task to which we now turn.

3 Another kind of research attempts to improve our understanding of development in the broad. The findings of such research typically enter the policy-making process indirectly and only gradually. They first appear in, and then percolate through, the pages of academic journals, those important devices for quality control. Only then do they enter policy fora and permeate policy discussions.
4 This is one of twelve country studies in a research project entitled “Strengthening Institutions to Improve Public Expenditure Accountability,” managed by the Global Development Network and Results for Development.
5 An example involving the use of specially collected data is provided by an analysis of the extension of the Nicaraguan Social Security Institute’s health insurance program to informal sector workers. The easily ascertained fact that the proposed cost of the insurance was considerably more than current, out-of-pocket health expenditures clearly pointed to the likelihood of low take-up rates, an indication subsequently confirmed by a controlled randomization experiment (Thorton et al. 2010).

III. Analyzing the Trade-off

A project or program is under consideration. Both the exact timing of the simple “yes-no” decision of whether to adopt it and its social profitability are uncertain. There are two actors, a policy maker and a researcher, hereinafter denoted by D and R. As “insiders” in their respective fields, the former is better informed about timing, the latter about social profitability, but neither is perfectly so.
An exchange of information, however incomplete or indirect, may improve the chances that the decision, when it does come, will be the right one, in the sense that the chosen course of action is welfare-superior when viewed ex ante at the time of decision.

There is always an alternative to the action in question, and consequently a cost of choosing wrongly. The net social benefit of making the correct decision, when it occurs, is assumed to be given, and is therefore omitted from the analysis. In this connection, it should be recalled that otherwise identical projects or policies initiated at different dates are indeed different. In general, therefore, timing will affect the level of social profitability ex ante as well as an outcome ex post. Both parties are assumed to be aware of this fact, though R will be able to formulate it more exactly when assessing the probability that an affirmative decision at a particular time $t$ will be correct.

An assumption about how to treat uncertain outcomes in public decision-making is also needed. Let the assumptions for the validity of the Arrow-Lind (1971) theorem hold, namely, that the project’s net returns be evenly spread over a large population and statistically independent of those of other projects. It is then valid to use expected values in making decisions, even though some outcomes are poor when viewed ex post. For simplicity, we also omit the social cost of producing the advice from the formal analysis.7

Let the variate $T$ denote the date on which the decision occurs. At the outset (at time $t = 0$), D’s subjective prior is that $T$ has the continuous distribution function $H(t)$, whose support is $[t_1, t_2]$. In the absence of the specific advice which will be treated below, his own knowledge, understanding, predilections, and such other opinion, expert or otherwise, at his disposal combine to yield the probability $q_t$ that the decision, if it occurs on the date $t$, will be the correct one.
Along with political pressures, these various influences are themselves likely to fluctuate over time. This does not, however, imply that, for any given $t$, $q_t$ is “fuzzy.” In this, we follow Elga (2010), who argues that rational agents must have sharp subjective prior probabilities: for each and every $t \in [t_1, t_2]$, D has an exact ex ante $q_t \in [0, 1]$. At the outset, therefore, D assesses the corresponding ex ante probability that the decision will be correct to be

$$Q = \int_{t_1}^{t_2} q_t \, dH(t). \tag{1}$$

Aware that the issue is on D’s agenda but initially uninformed about his $H(t)$ and $q_t$, R undertakes to analyze it.8 The longer she devotes herself to this task, the better founded the advice will be, and the more compelling will be the way in which she frames the evidence and arguments. The advice will therefore improve as time passes, especially in the sense of increasing the probability that the correct decision will be taken, provided the memorandum is actually on D’s desk when the decision point arrives. Timeliness, then, is essential if her efforts to make a persuasive case are not to be in vain. When weighing all these considerations, she has her own subjective priors at the outset, $F(t)$ and $p_t$, respectively.

6 Two recent articles address issues surrounding the use of estimates of impacts drawn from other countries. Dhaliwal et al. (2012) provide guidance on how to compare results based on RCTs, using evaluations of educational programs in multiple countries as an example. Pritchett and Sandefur (2013) add the important point that context matters, arguing that experimental results from the right context are currently a better guide to policy than nonexperimental ones from a different context.
7 To a fairly good approximation, this is the value of the researcher’s time devoted to the task when priced at her shadow wage rate.
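As an illustration of equation (1), the prior probability Q can be evaluated numerically once a distribution for the decision date and a profile for the probability of a correct decision are specified. The sketch below assumes, purely for illustration, that H is uniform on its support and that q_t rises linearly over the interval; neither assumption comes from the article, and the function names are hypothetical. The same computation would apply to R's priors F(t) and p_t.

```python
def prior_prob_correct(q, t1, t2, n=1000):
    """Approximate Q = integral of q_t dH(t) over [t1, t2] by the midpoint rule.

    H is ASSUMED uniform on [t1, t2], so dH(t) = dt / (t2 - t1).
    """
    width = (t2 - t1) / n
    density = 1.0 / (t2 - t1)
    return sum(q(t1 + (i + 0.5) * width) * density * width for i in range(n))


def q_linear(t, t1=1.0, t2=3.0):
    """Hypothetical q_t: rises linearly from 0.5 at t1 to 0.6 at t2."""
    return 0.5 + 0.1 * (t - t1) / (t2 - t1)


if __name__ == "__main__":
    print(round(prior_prob_correct(q_linear, 1.0, 3.0), 6))
```

With a uniform H, Q is simply the average of q_t over the support, which for the linear profile above is the midpoint value 0.55.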
Even if the research results are delivered in time, whether they are used depends on D’s response. In particular, it is arguable that the gap between the memorandum’s arrival and the actual decision matters. Advice that arrives far in advance may suffer a loss of effectiveness simply because it appears a bit dated when that moment of decision does arrive. There is even the mundane possibility that the memorandum will get mislaid or overlooked, as D’s office struggles to deal with the stream of matters demanding his attention. Working against these hazards is the fact that a memorandum’s early arrival gives him the opportunity to reflect on it, which should make it more influential.

D and R face a common hazard, namely, that there will be a “changing of the guard” before the decision actually occurs.9 The new regime may take the form of a new government or, if D is a senior bureaucrat, his replacement through regular rotation or otherwise. As D and R reflect on the possible timing of the decision and the associated probability that it will be correct, each must also form a prior probability of whether a changing of the guard will take place, and if so, when and with what consequences for timing and a happy outcome.

There are three broad possibilities, though D and R may well differ in their assessments of whether and when such a change will occur, and of the resulting consequences. First, there is a firm conviction that the decision will occur before any changing of the guard. Second, if such a change could happen, its date can still be forecast without error, as might well be the case in a stable democracy with a fixed electoral cycle and any stable, comparatively independent bureaucracy. Then the forms of $H(t)$ and $q_t$ will reflect D’s assessment of how the event will influence them, and the same applies to R’s $F(t)$ and $p_t$. Note that this includes the special case in which the said date precedes the prior estimate of the earliest possible date for a decision.
The remainder of this section is formulated on the basis that one or other of these two broad possibilities holds.

Turning to the third possibility, what happens if a change of regime within the relevant time interval cannot be ruled out and its timing is uncertain? In principle, D must formulate a pair $H(t)$ and $q_t$ for each date on which such a change can occur with positive probability, together with the associated prior probability that it will occur on the date in question. The resulting compound lottery will tax most people’s powers of analysis, so D searches for a plausible simplification.

Assumption. The event in question occurs on the date $t_d \in (t_1, t_2)$ with probability $p_d$, or not at all. If it occurs, $H(t)$ is unaltered up to $t = t_d$ and is amended to $H^d(t)$ over the remaining, revised interval $[t_d, t_2^d]$; and $q_t$ is likewise amended.

8 We assume that the interaction between the researcher and the policy maker is initiated by the former, this being, in our view, the more realistic representation. The analytical apparatus could, of course, be re-jigged to make the policy maker the instigator, a desirable extension that we discuss, among others, in section IV.
9 We are indebted to a referee for urging us to examine this possibility.

Note that the problem only arises when $t_d$ lies in the said interval; otherwise, either the first or second broad possibilities set out above will apply. The single possible date represents a sort of average, albeit one weighted by whatever information is at his disposal. In a very uncertain setting, he might well choose the halfway point. Let the same assumption apply, mutatis mutandis, to R’s view of things.
The Appendix sets out how the basic elements $Q$ and (see immediately below) R’s $p_t(s)$ and $P(s)$ must be reformulated to cover this third possibility; the extension to two or more possible dates is obvious.

With these considerations in mind, let R’s prior probability that a memorandum submitted at time $s$ will result in a correct decision occurring at time $t$ be as follows:

$$
p_t(s) =
\begin{cases}
1 - (1 - p_t)\,\phi(s, t - s) > p_t & \text{if } 0 < s \le t, \\
p_t & \text{otherwise,}
\end{cases}
\tag{2}
$$

where the “influence function” $\phi$ reduces $1 - p_t$, the probability of an erroneous decision. This function is assumed to be decreasing in its first argument (research quality), but it may be decreasing or increasing in the second (D’s response to the gap between the delivery of the advice and taking the decision). The assumption that a submission that arrives before or at the point of decision makes a correct decision more probable, but not certain, is plausible. For if R is competent, even her preliminary analysis is very likely, on average, to improve on whatever process yields D’s unassisted $q_t$. This assumption implies that $\phi < 1$ if $s > 0$, with $\phi(0, t) = 1$. Observe also that the larger is $p_t$, the smaller is the effect of a specific piece of advice, $p_t(s)$ being at most unity.

A Single Submission

Let R obtain no additional information about the timing of the policy decision in the course of her work, so that she gains nothing by postponing her decision about when to submit her memorandum, and she will hold fast to the plan $s$ chosen at the very start. If she waits beyond $t_1$, it is possible that she will submit her advice too late. This must be weighed against the improvement in its quality that would result from taking longer over it.
The plan $s$ yields her ex ante probability that the policy decision will be the correct one:

$$
\begin{aligned}
P(s) &= \int_{t_1}^{s} p_t \, dF(t) + \int_{s}^{t_2} \left[ 1 - (1 - p_t)\,\phi(s, t - s) \right] dF(t) \\
&= \int_{t_1}^{t_2} p_t \, dF(t) + \int_{s}^{t_2} (1 - p_t)\left[ 1 - \phi(s, t - s) \right] dF(t).
\end{aligned}
\tag{3}
$$

The trade-off is described exactly by differentiating with respect to $s$:

$$
P'(s) = -(1 - p_s)\left[ 1 - \phi(s, 0) \right] F'(s) - \int_{s}^{t_2} (1 - p_t)(\phi_1 - \phi_2)\, dF(t),
\tag{4}
$$

where $\phi_i$ denotes the partial derivative of $\phi$ with respect to its $i$th argument. The first expression on the right-hand side is the loss in terms of $P$ caused by spending a little more time on the memorandum, and so increasing the chances that it will arrive too late; the second expression is the corresponding gain arising from the improved analysis, if it arrives on time. As noted above, the assumption that advice becomes more influential as its quality improves implies that $\phi_1 < 0$. While $\phi_2$ may take either sign, it is very plausible that, on balance, better analysis will reduce the chances that the wrong decision will follow a timely submission, that is, $d\phi = (\phi_1 - \phi_2)\,ds < 0$.

Let

$$
s^* \equiv \arg\max_{s \in [t_1, t_2]} P(s)
$$

denote R’s optimal plan. It is clear from equation (4) that waiting until the very last minute ($s = t_2$) is not optimal. Thus, a researcher’s natural inclination to delay any policy input until all refinements have been incorporated, and all conceivable robustness checks undertaken, may very well defeat the purpose of influencing the decision. Indeed, the possibility that, at the other extreme, taking no chances at all is optimal, that is, $s^* = t_1$, cannot be ruled out. In fact, if $\phi_1 - \phi_2 \ge 0$, then that is exactly what R should do.

A Provisional Submission

The preceding subsection sets out the costs and benefits of delaying a single submission when R cannot acquire additional information about the timing of the policy decision.
We now allow her to test the waters by submitting a provisional assessment, accompanied by the offer of a more definitive analysis to follow if required. This provisional assessment also goads D into playing an active role in the ‘game’ between them. We shall prove that the resulting option has a positive value, relative to a single submission, in the sense of increasing the probability that D will decide correctly.10

Let R submit a provisional assessment at time $s \in [t_1, t_2]$. If the decision has already been taken, she is so informed, and there is nothing more to be done. If, on the contrary, the decision is still due, let D respond in just one of three ways.

First, with her memorandum now lying on his desk, he waits no longer and makes his decision, and R is informed accordingly. It is rather likely that D’s precipitate action is prompted by the memorandum telling him what he wanted to hear, but its professional content still raises the probability that the decision is correct; for the memorandum reflects R’s assessment of the project’s social profitability, albeit a provisional one. On the basis of the information available to her, the probability that the decision is correct is $1 - (1 - p_s)\,\phi(s, 0)$, with a corresponding improvement over $p_s$ of $(1 - p_s)(1 - \phi(s, 0))$.

Second, D expresses a clear interest in a more definitive assessment and sets a deadline $t^* \in (s, t_2]$, which both parties regard as fixed, and the final memorandum is submitted at $t^*$. This more measured response requires D to choose a deadline in the light of his subjective priors at $t = s$ concerning the timing of the decision and the probability that it will be correct. His decision problem is therefore analogous to R’s in the case of a single submission. Let there be no new information about timing, other than the fact that the decision is still outstanding, so that $H(t)$ is truncated at $t = s$.
Since R will have worked on her assessment throughout the period up to $t = t^*$, let its effect on D’s prior at $t = s$ that the decision, when it comes, will be correct take the form analogous to equation (2):

$$
q_t(t^*) =
\begin{cases}
1 - (1 - q_t)\, w(t^*, t - t^*) & \text{if } s < t^* \le t, \\
1 - (1 - q_t)\, w(s, t - s) & \text{otherwise,}
\end{cases}
\tag{5}
$$

which allows for the possibility that events will force a decision before the deadline $t^*$ and so confine R’s influence on the proceedings to her provisional memorandum. Corresponding to R’s ex ante probability $P$, D’s $Q$ at $t = s$ is given by

$$
\left[ 1 - H(s) \right] Q(t^*, s) = \int_{s}^{t^*} \left[ 1 - (1 - q_t)\, w(s, t - s) \right] dH(t) + \int_{t^*}^{t_2} \left[ 1 - (1 - q_t)\, w(t^*, t - t^*) \right] dH(t).
\tag{6}
$$

10 The provisional assessment is most unlikely to pass muster with a refereed journal. We return to the issue of quality in section IV.

The trade-off confronting D when choosing $t^*$ so as to maximize $Q(t^*, s)$ is also analogous to that facing R in the case of a single submission and need not be set out in detail. Let $t^{*0}$ denote the maximizer in question. Since D’s course of action necessarily involves $t^{*0} > s$, it follows that it must yield an improvement not only over receiving no advice from R at any stage but also over just her provisional assessment at $t = s$.

R is privy only to D’s specified deadline. While she could infer something about D’s priors at $t = s$, such speculation will serve little or no purpose in this round, though it might do in the future should they have further dealings. She therefore labours away until submitting her final assessment at $t^{*0}$. Having done so, she achieves a total improvement over her ex ante prior $p_{t^{*0}}$ of $(1 - p_{t^{*0}})(1 - \phi(t^*, 0))$, as she sees things.
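Because D’s choice of deadline mirrors R’s single-submission problem, a small numerical sketch of the latter may help to fix ideas. The Python below evaluates equation (3) on a grid and locates the maximizer $s^*$. Everything specific in it is an assumption made for illustration and is not taken from the text: a uniform $F$ on $[t_1, t_2] = [1, 3]$, a constant prior $p_t = 0.5$, and a hypothetical influence function $\phi(s, \cdot) = e^{-s}$, which satisfies $\phi(0, \cdot) = 1$ and $\phi_1 < 0$ but ignores the delay ($\phi_2 = 0$). The function names are likewise hypothetical.

```python
import math

# All numbers and functional forms below are illustrative assumptions.
T1, T2 = 1.0, 3.0   # assumed earliest and latest possible decision dates
P_PRIOR = 0.5       # assumed constant prior p_t of a correct decision
A = 1.0             # assumed rate at which extra research time lowers phi


def phi(s, delay):
    """Hypothetical influence function: phi(0, .) = 1, decreasing in s.

    The delay argument is ignored here (phi_2 = 0) to keep the sketch simple.
    """
    return math.exp(-A * s)


def density(t):
    """Assumed uniform density of the decision date on [T1, T2]."""
    return 1.0 / (T2 - T1) if T1 <= t <= T2 else 0.0


def prob_correct(s, steps=2000):
    """Equation (3): P(s), by midpoint-rule integration over t in [T1, T2]."""
    h = (T2 - T1) / steps
    total = 0.0
    for i in range(steps):
        t = T1 + (i + 0.5) * h
        if t < s:
            value = P_PRIOR  # the decision precedes the memorandum
        else:
            value = 1.0 - (1.0 - P_PRIOR) * phi(s, t - s)
        total += value * density(t) * h
    return total


def best_plan(grid=200):
    """Grid search for s* = argmax P(s) over [T1, T2]."""
    candidates = [T1 + (T2 - T1) * k / grid for k in range(grid + 1)]
    return max(candidates, key=prob_correct)


if __name__ == "__main__":
    s_star = best_plan()
    print("s* =", round(s_star, 2), "P(s*) =", round(prob_correct(s_star), 4))
    print("P(t1) =", round(prob_correct(T1), 4), "P(t2) =", round(prob_correct(T2), 4))
```

Under these assumed forms the maximizer lies a little above $t_1$ and well short of $t_2$, in line with the observation that waiting until the very last minute is not optimal. A different choice of $F$ or $\phi$ could move $s^*$ to the corner $t_1$.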
D’s third possible response to R’s provisional assessment is to send her a polite, noncommittal acknowledgement, so that she gains only the information that the decision is yet to come. She must now decide how much more time to invest in producing a final memorandum, with no firm deadline. It is the remaining uncertainty in the event of such an acknowledgement at $s$ that makes the problem of backward induction, and hence of determining the optimal plan for $s$ at $t = 0$, nontrivial, in principle at least. The corresponding optimal timing of her final submission, denoted by $n^0(s)$, must be found in order to establish the optimal choice of $s$ at $t = 0$.

She is now back in the setting of a single submission, with starting point $t = s$. It is plausible that learning that the decision has yet to be made will have no effect on her priors other than truncating $F(t)$ at $t = s$. Let this be so. Then the effect of a provisional submission that draws a noncommittal response on the probability that the final decision will be correct will be given by the upper branch of equation (2), unless it is followed up by a timely final submission, in which event $n$ will replace $s$ in $\phi(\cdot)$ and thus represent the improvement in the quality of the advice. Conditional on the event that such an acknowledgement is received at $s$, the choice $n \in [s, t_2]$ yields, analogously to equation (3),

$$
\left[ 1 - F(s) \right] P(n, s) = \int_{s}^{t_2} \left[ 1 - (1 - p_t)\,\phi(s, t - s) \right] dF(t) - \int_{n}^{t_2} (1 - p_t)\left[ \phi(n, t - n) - \phi(s, t - s) \right] dF(t),
\tag{7}
$$

where

$$
n^0(s) \equiv \arg\max_{n \in [s, t_2]} P(n, s).
$$

The associated first-order condition for an interior maximum is analogous to equation (4):

$$
-(1 - p_n)\left[ \phi(s, n - s) - \phi(n, 0) \right] F'(n) - \int_{n}^{t_2} (1 - p_t)\left[ \phi_1(n, t - n) - \phi_2(n, t - n) \right] dF(t) = 0.
\tag{8}
$$

A comparison of the two conditions establishes that $n^0(s)$ is indeed dependent on $s$ when the latter is the date on which the provisional assessment is submitted.
To be more precise, consider the choice $n = s < t_2$, which is effectively the same as making a single, final submission at $s$. The first term in brackets on the left-hand side of equation (8) vanishes, and the integral is negative, by virtue of $\phi_1 - \phi_2 < 0$. Hence, $n^0(s) > s$, and the difference between the values yielded by submitting at $n^0(s)$ and $s$, respectively, yields the (positive) option value of a provisional submission for any choice of $s \in (t_1, t_2)$, conditional on R receiving a noncommittal reply.

The size of the option value varies, in particular, with $s$ and the behavior of $\phi$. Inspection of equation (7) reveals that the option value will be small if $s$ is sufficiently close to $t_2$. For a delay in making the provisional submission reduces the time in which to make the most of the resulting option; and it lowers the probability that the option will arise in the first place. It is also seen that the option value is smaller, ceteris paribus, if $\phi$ is decreasing in its second argument; for the interruption at the point $n$ reduces the delay from $t - s$ to $t - n$ thereafter. If, conversely, $\phi$ increases with the delay, the interruption will be advantageous.

Now R’s choice of $s$ also arguably affects the probabilities of getting the responses listed above, conditional on the decision yet to be taken. That is to say, the act of making a provisional submission also affects the timing of the decision whose outcome it intends to influence. It is plausible that the closer $s$ approaches the upper limit $t_2$, the greater is the chance that it will “trigger” the decision, an outcome whose probability is denoted by $p_1(s)$. The opposite holds for the probability of getting a noncommittal response, $p_3(s)$; for the earlier the provisional submission is made, the sketchier it is likely to be, and to appear.
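The option value conditional on a noncommittal reply can also be illustrated numerically. The sketch below evaluates equation (7), divided by $1 - F(s)$, on a grid of follow-up dates $n$, finds $n^0(s)$, and reports $P(n^0(s), s) - P(s, s)$. As before, the specifics are assumptions for illustration only and do not come from the text: $F$ is uniform (so that, given $t \ge s$, the decision date is uniform on $[s, t_2]$ with $t_2 = 3$), the prior is a constant $p_t = 0.5$, and the hypothetical influence function is $\phi(n, \cdot) = e^{-n}$ with no dependence on the delay. All function names are hypothetical.

```python
import math

# All numbers and functional forms below are illustrative assumptions.
T2 = 3.0        # assumed latest possible decision date
P_PRIOR = 0.5   # assumed constant prior p_t of a correct decision
A = 1.0         # assumed rate at which extra research time lowers phi


def phi(quality_time, delay):
    """Hypothetical influence function; the delay argument is ignored."""
    return math.exp(-A * quality_time)


def conditional_prob(n, s, steps=1000):
    """P(n, s) from equation (7), i.e. after dividing by 1 - F(s).

    With a uniform F, the decision date given t >= s is uniform on [s, T2],
    so the integral reduces to an average over midpoints of [s, T2].
    """
    h = (T2 - s) / steps
    total = 0.0
    for i in range(steps):
        t = s + (i + 0.5) * h
        if t < n:
            # Decision falls before the follow-up: only the provisional memo counts.
            total += 1.0 - (1.0 - P_PRIOR) * phi(s, t - s)
        else:
            # Decision falls after the follow-up: the final memo counts.
            total += 1.0 - (1.0 - P_PRIOR) * phi(n, t - n)
    return total / steps


def best_follow_up(s, grid=200):
    """Grid search for n0(s) = argmax P(n, s) over [s, T2]."""
    candidates = [s + (T2 - s) * k / grid for k in range(grid + 1)]
    return max(candidates, key=lambda n: conditional_prob(n, s))


def option_value(s):
    """Gain from following up at n0(s) rather than treating s as final."""
    n0 = best_follow_up(s)
    return conditional_prob(n0, s) - conditional_prob(s, s)


if __name__ == "__main__":
    for s in (1.0, 2.0, 2.9):
        print("s =", s, "n0(s) =", round(best_follow_up(s), 3),
              "option value =", round(option_value(s), 4))
```

With these assumed forms, $n^0(s)$ lies strictly above $s$ and the option value shrinks as $s$ approaches $t_2$, which matches the two qualitative claims made above. The sketch leaves out the response probabilities $p_1(s)$ and $p_3(s)$, which would require further assumptions.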
At all events, R’s plan to submit a provisional assessment at $s$, with her rational anticipation of choosing $n^0(s)$ in the event of receiving a noncommittal acknowledgement, involves the following possible outcomes: (i) the provisional memorandum arrives too late, with probability $F(s)$; and (ii) it arrives before the decision, with probability $1 - F(s)$, and then elicits one of the three said responses from D, with respective probabilities $p_i(s)$, $i = 1, 2, 3$, where $\sum_{i=1}^{3} p_i(s) = 1$.

There is, however, one further complication to be resolved before R chooses $s$ at the outset, namely, that the contingent deadline $t^{*0}$ has yet to be revealed. The latter certainly depends on $s$. R knows nothing about D’s priors at this stage, so it is natural for her to assume that this date, denoted by $t_e$, is uniformly distributed on the interval $[s, t_2]$. Yet whatever be her prior distribution of $t_e$ in relation to $s$, R’s ex ante prior probability that the decision, when it does come, will be correct is

$$
P(s, n^0(s)) = \int_{t_1}^{s} p_t \, dF(t) + \left[ 1 - F(s) \right] \times \Big\{ p_1(s)\big(1 - (1 - p_s)\,\phi(s, 0)\big) + p_2(s)\, E\big[ 1 - (1 - p_{t_e})\,\phi(t_e, 0) \big] + p_3(s)\, P(n^0, s) \Big\},
\tag{9}
$$

where $E$ is the expectation operator. Inspection of equation (9) reveals that, as R sees things, the probability of securing a correct decision with a provisional submission is necessarily superior to that with a single, final submission if $p_1(s) < 1$. For a single submission is equivalent to declining an invitation to submit a final assessment against a fixed deadline, or, in the event of receiving only a noncommittal reply, to choosing $n = s$ whatever be the choice of $s$, and it has been proved above that neither course of action is ever optimal.

The choice of $s$ itself is influenced by various considerations.
Under the above assumptions, a provisional submission before $t_2$ always yields, with strictly positive probability, the opportunity both to deliver the (constrained) best advice and to adjust the plan optimally when uncertainty about the timing of D’s decision still remains. Against this, there is the drawback that a provisional submission will, with positive probability, precipitate an immediate decision. For its arrival may focus D’s attention on the matter in question, especially if there is a good chance that a leak might arouse the interest of other influential parties and provide them with ammunition. If $\phi(\cdot)$ is decreasing in its second argument, such an early triggering of the decision will decrease the probability that it will be the right one. For the delay $t - s$ between the provisional submission and the decision then has a favorable effect on $\phi$, whenever there is such a delay.

The above proof of the superiority of a provisional over a single, final submission implies that the trigger effect cannot, on average, outweigh the others.11

11 In particular instances, the decision will indeed be triggered, with an attendant adverse effect; but the claim of superiority relates to a whole series of “trials.” The appeal to the Arrow-Lind theorem is important here.

To grasp why this is so, consider the limiting case in which the trigger effect occurs with certainty ($p_1 = 1$, $p_2 = p_3 = 0$) for all $s$; for then there is effectively neither an option nor an invitation with a fixed deadline, and hence no advantage to a provisional submission. Indeed, all provisional submissions become final ones. There are, of course, consequences for the optimal choice of $s$, but that is another matter.
In this connection, it should also be noted that in the case of a single submission, there is neither a trigger effect nor even the interim information as to whether the decision has been taken. In equation (9), the trigger effect is expressed by the term involving $\phi(s, 0)$, but the corresponding term in equation (3), $\phi(s, t - s)$, contains the delay $t - s$, and some delay is virtually certain. The two models are not therefore fully comparable, and the intuition afforded by the one cannot be applied wholesale to the other.

While the potential triggering effect of a provisional submission cannot overturn the main result of this section, it can greatly reduce the superiority of the plan $(s, n^0(s))$ over a single, final submission. Observe that the option value arises from the second and third terms within braces in equation (9). Now suppose D’s behavior tends to be of the hair-trigger kind, so that the triggering effect approaches a near certainty even if R attempts to keep it at bay by making a provisional submission soon after $t_1$. Since the remaining terms within braces on the right-hand side of equation (9) approach zero as $p_1$ approaches unity, the claim follows.

More generally, all of the above considerations duly appear in the first-order condition for $P(s, n^0(s))$ to take a maximum, whose baroque details need not be laid out here.12 Suffice it to say that the said condition, together with condition (8), characterizes the optimal plan $(s^0, n^0(s^0))$.

Models of this kind are best thought of as a means of ordering one’s ideas through a private conversation, with the model as partner. No one in her right mind is going to set about calculating the optimal plan $(s^0, n^0(s^0))$ on the lines set out above: researchers have better things to do with their time. Yet the analysis points to conclusions of practical importance.
In order to exploit fully the benefits of a provisional submission, and to minimize the danger of incorrect decisions, researchers need an appropriate framework of incentives, policy makers (and donors) need an adequate quality-control mechanism, and both parties need some mechanism for encouraging communication. Institutional support in these areas would surely help all parties to allocate their time and energies more efficiently where promoting better decisions is concerned. These themes will be pursued in the next section.

IV. Provisional Assessments in Practice

The first part of section III demonstrated formally the rather obvious, albeit routinely ignored, point that delaying a single submission until the underlying research has been refined to the fullest possible extent is unlikely to be a wise strategy, since there are very good chances that the policy maker will have acted before the advice arrives. If this happens, his decision will not be informed by any specific guidance from the research community, and the resources devoted to the research will be wasted, at least from an immediate policy-making perspective. In these circumstances, an alternative approach, in the form of an early, provisional submission with the promise of a more refined analysis to follow, may well be preferable. This thought led to the analysis in the second part of section III and thence to the key result that the said alternative is in general superior to a single, final submission.

Several factors caution against concluding that this result points to a simple way of improving the impact of research on policy. First, as noted in the Introduction, the research-policy interaction is a messy business and decidedly context-specific. The analytical framework presented above necessarily abstracts from this complexity and captures only selected aspects of reality.
Second, and also noted in the Introduction, the incentives currently confronting researchers deter provisional submissions and favor more polished outputs aimed at refereed journals. Third, the key result (that a provisional submission, coupled with the offer of a more thorough analysis, is superior to a single submission) holds only in a probabilistic sense: it is correct on average over a long run of trials, but not necessarily in each case. We address these three points in turn. We begin with some comments on the analytical framework’s value added and its applicability in a wider range of circumstances than may be apparent at first sight. We then discuss how incentives may be changed to promote greater reliance on provisional assessments. Finally, we outline some measures to increase the likelihood of a positive outcome and counter the risk of an early and incorrect decision based on faulty analysis and evidence.

12 They are available from the authors on request.

Value Added and Applicability

The model in section III was designed specifically to illustrate the trade-off between timeliness and rigor and to allow an exploration of the merits of provisional assessments. The intention was to cut away the tangle of real-world complexity and context-specificity to expose the essential tension between meeting professional research standards and serving policy makers in a timely manner. This analytical approach departs from the reliance on case studies that has to date dominated research on the research-policy link. The strength of case studies lies in their ability to delve into the rich detail of specific examples, but they are not easily generalizable and usually lack analytical rigor.
The model presented above builds on, and brings analytical confirmation to, the results of existing, case-specific research by demonstrating analytically the general merit of two findings that have emerged from observation in a variety of different circumstances: first, successful research-policy episodes are often characterized by early interaction between researcher and policy maker; and second, short, focused inputs are usually the most effective means of communication. The value added of the model lies primarily in its rigorous demonstration that these two elements, when combined in what we have called provisional assessments, do indeed yield outcomes superior on average to those yielded by the traditional approach of waiting until the research is complete in all respects before sharing results with policy makers.

The framework also offers an approach to thinking about the interaction between policy-makers and researchers in an orderly way. For it is flexible enough to be more widely applicable than its formal structure might at first suggest. For example, the initial exchange between researcher and policy maker could lead to several rounds during the decision process rather than the single exchange assumed in the model. Likewise, the researcher-led version in section III could be reversed to allow for an opening approach from the policy maker. Or the provisional assessment could be initiated after the main research study has begun if there is still scope for adjusting the delivery date of its results.13 There is a place for other agents, too. Thus, if the policy maker has little interest in research and ignores the provisional assessment, the researcher could make her preliminary results available to the media and NGOs in the hope of influencing the decision by arousing public opinion.

It should also be remarked that the nature of the decision to be taken, as formulated in section III, is not especially restrictive.
Instead of a straightforward yes-no decision on a single project or program, the set of choices can be extended to an array of specific, competing alternatives, in addition to the unspecified alternative, in any cost-benefit analysis that corresponds to rejecting the single proposal if its net present value at shadow prices is negative. To give examples, the tendering process often yields different designs of the same project, be it a port or a primary school. Likewise, a policy intervention may allow different sequences ex ante. The timing of the analysis can also vary. Thus, a program or policy that stalls during implementation may present the policy maker with a cancel-or-redesign decision that could benefit from a provisional assessment.14 Similarly, but earlier in the project cycle, the analysis could be applied to a pilot project. Indeed, any specific decision that can be subjected to empirical analysis involving standard economic tools is a possible candidate.

13 In the event that the provisional assessment convincingly questions the merit of the program or policy, then the research study should be abandoned. This is a risk, but the common-sense message here is: undertake a provisional assessment before embarking on a major research exercise.

Leveling the Playing Field

As argued in the Introduction, the ruling incentives within the economics profession encourage researchers to refine and verify their analyses as fully as possible before sharing their results with policy makers. It follows that if the aim is now to promote greater use of provisional assessments, it will be necessary to modify researchers’ behavior by means of new incentives specifically designed to counter existing ones. The obvious way of leveling the playing field is to subsidize provisional assessments.
Yet it is unlikely that policy makers in developing countries would be willing to pay for these before a solid track-record has been established. Consequently, a third party (the obvious candidate is the international donor community, which is already an important funder of research in developing countries) has to be the source of the subsidy, at least initially. The payment per submission would have to motivate a significant number of researchers to participate and would therefore have to exceed the marginal cost of delivering the input. The cost of producing a preliminary assessment, given its nature, is likely to be small. Even with some additional premium to guarantee adequate participation, therefore, the total cost of the subsidy would remain modest. The appropriate size of this premium could be determined through experimentation, with donors adjusting the amount until sufficient numbers of provisional assessments are tendered.

To ensure that the subsidy does indeed result in an early submission to the policy maker, donors could undertake two specific actions. First, they could process requests for funding, and disburse funds, quickly. Second, the initial grant could be limited to the costs of the study, with the premium paid only on confirmation that the provisional assessment has been delivered to the policy maker within some reasonable, and precisely specified, time frame.

Additionally, and to encourage interaction between researcher and policymaker, donors could give priority to those proposals for which the researcher can produce a written endorsement from the relevant policy maker15 or where the policy maker initiates the process himself. Such expressions of interest would demonstrate both that contact between the parties has been established and that the policy maker can be expected to act on the analysis when it arrives. In some cases, a policy analyst inside a ministry may undertake the preliminary research for a provisional assessment.
This attractive practice could be encouraged by extending the subsidy to government analysts wherever legal rules and bureaucratic procedures allow.

Whatever the origin of a research-policy interaction, if policy makers share their timetable with researchers following receipt of a provisional assessment, then the likelihood that any ensuing final submissions will arrive on time should increase significantly. Donors should be especially interested in promoting this outcome, since it ensures that research output at a more refined stage is actually feeding into the policy-making process. This highly desirable result could be encouraged by declaring that requests to fund full-scale research projects will receive preferential treatment if they have been preceded by a provisional assessment. The exact advantage (speedier turn-around or more generous funding) could be decided by individual donors; but whatever its form, the knowledge that funds are more likely to be forthcoming for a fully fledged research proposal provided a provisional assessment has been completed will serve as a further and powerful inducement for researchers to interact with policy makers early in the research cycle.

14 Variants of the project in question may suggest themselves, whether from experience or even institutionalized learning in the course of execution itself, as advocated vigorously by Pritchett et al. (2012).

15 The International Initiative for Impact Evaluation uses this approach. Researchers are asked to include in any application for funds an endorsement from the relevant policy maker.

Since time is of the essence, the administrative procedures for funding both provisional assessments and full proposals that have been preceded by a provisional assessment should be streamlined, with quick turn-around for the former and almost automatic funding for the latter.
This departure from the thorough vetting of full proposals should be balanced by a systematic effort to monitor each stage of the research-policy interaction from start to finish. This would provide reassurance that, simplified procedures notwithstanding, the approach is yielding the desired result in terms of the frequency with which policy makers move from provisional assessments as a first step, to final submissions as a contingent second step, and then to policy action, or otherwise indicate the need for fine-tuning. An attractive side-benefit of this monitoring program is that the information so generated would create a unique data base capable of expanding and deepening the profession’s understanding of the research-policy interaction well beyond that afforded by sporadic case studies.

Reducing the Risk of Errors

The introduction of provisional assessments would improve policy making in the large; but, in specific cases, the advice may increase the probability that the wrong decision is made. The initial analysis, precisely because it is a first cut, may be so faulty as to point the policy maker in the wrong direction, whereas the alternative, waiting for the full analysis, would provide better guidance (assuming it arrives in time to be of use). The quality of provisional assessments is, therefore, an issue to be addressed.

Confidence in quality can be increased by means of an independent, professional review of each provisional assessment prior to its delivery to the policy maker.16 Thus, researchers receiving grants to conduct preliminary assessments could be required to submit their assessments to a review covering such basic aspects of their analysis as specification of the policy question, use of data, choice of analytical tool, and robustness checks.
The reviewer, who should be of another nationality and drawn from a panel of experts, could also offer more general advice (on alternative designs, for example) drawn from both the published and the so-called grey international literature dealing with whatever program or policy is being evaluated. Making the premium to be paid to the researcher conditional on such a review would reduce the chances of unsatisfactory advice reaching the policy maker and adversely affecting policy decisions.

The drawback with this suggestion is, of course, that reviews inevitably take time, and yet timeliness is essential if provisional assessments are to fulfill their intended purpose. To minimize any hold-up, reviewers could be paid a fee per review, provided it is delivered within a specified time. The academic community, drawing on its well-established expertise in controlling quality by means of double-blind refereeing prior to publication, is well placed to contribute to this more specialized review process. Furthermore, compared with submissions to professional journals, provisional assessments are short, focused pieces and should therefore admit of a speedy review.

It is noteworthy that there are journals in other fields, such as medicine and branches of the natural sciences, that enforce a very rapid review, with accepted articles appearing in a matter of weeks after submission.17 Such journals devoted to policy questions would serve our present purposes admirably; but there are none, and there seems little prospect of any unaided births of this kind. Donors might wish to act as midwives, but editors would have to be sure of keeping them at arm’s length subsequently.

16 This point is not confined to provisional assessments; it applies equally to all existing means of reaching the policy-maker (e.g., policy briefs), which are often delivered without review. In their effort to reach policy-makers, many development specialists fail to subject research-based advice to careful scrutiny before delivering it, when in fact such research should probably receive more scrutiny than research submitted to a professional journal. Weiss, in her foreword to Carden (2009), makes this point when she rightly queries the latter’s apparent belief that “research is always good and right.”

17 We are indebted to a referee for drawing this alternative to our attention.

Even if adequate quality is assured, the pitfalls do not end with the submission of a provisional assessment. A policy maker may simply ignore its analysis and instead use its arrival as an excuse to move forward with a long-preferred, but quite possibly incorrect, decision. This is the worst sort of trigger effect: the decision is based on personal experience, a yearning for advancement, political pressure, or other factors quite unconnected with the research findings in question. One effective way of reducing this risk is to distinguish a priori those policy makers who are most likely to use a provisional assessment on its merits from those who are not.

The best, and perhaps only sensible, solution to this problem is to leave the tactical details to local researchers, who are far better acquainted with local policy makers’ inclinations and behavior. These researchers will have the model and the option of submitting a provisional assessment in the back of their minds, but how they actually conduct the business will depend on the particular policy maker in question. Given her unique knowledge, the local researcher is the one best placed to identify those policy makers most likely to act prudently upon receipt of a provisional assessment. Once she has done so, her knowledge will be valuable at the next stage, if a more refined analysis is needed.
For in practice, she will not be in the corset of the highly asymmetrically held information posited in section III. Rather she will likely have a prior, however fuzzy, regarding the form of what is termed the policy maker’s “influence function” (the $w(\cdot)$ of section III) and even some feeling for his view of the timing and quality of the decision (the $H(t)$ and $q_t$ functions). These, too, should improve the probability of a correct decision.

The requirement that a researcher seek the relevant policy maker’s endorsement before receiving the premium will also sharpen her incentives. In effect, such a requirement functions as a method of “informed targeting,” one that relies on local researchers’ knowledge regarding which local policy makers to approach and how best to conduct business with them if asked. In like manner, only more so, interactions initiated by policy makers or policy analysts can be thought of as stemming from a form of self-selection that can be expected to lead more often than not to better decisions. It is difficult to know how effective these methods of targeting may be; but even if they fail on occasion, the researcher always has recourse to the media and public opinion to exert pressure on politicians who misuse or ignore the findings of research.

As noted in section III, the possibility of a changing of the guard will be on both parties’ minds, and it will surely crop up in any discussion between them. An attraction of our proposed scheme is that it provides some measure of protection against this particular hazard in the policy-making process. If they see eye to eye on the likelihood of this event and it is not on the distant horizon, there will be an additional incentive to deliver a provisional assessment in short order, in the hope of tying the hands of any new regime, however the assessment turns out.
Yet this potential advantage comes with a drawback; for if R senses that her view of the social profitability of the project is likely to be rather different from D’s, she may delay matters, precisely in the hope of a changing of the guard. Such strategic behavior by both parties cannot be ruled out.

V. Concluding Remarks

Researchers, policy makers, and donors are constantly seeking ways to improve the impact of research on policy. This article offers a practical way of contributing to this goal by encouraging the delivery of more timely inputs to the decision maker in the form of provisional assessments. This proposal draws on two observations emerging from the considerable case-study material now available on the interaction between research and policy. The evidence suggests that early interaction and short, policy-focused inputs are often associated with successful outcomes. Our call for a much greater reliance on provisional assessments builds on, and is fully consistent with, these observations. By establishing analytically the superiority, on average, of a provisional assessment, accompanied by the offer of a more refined analysis, over a single, final submission, this article strengthens the argument in favor of early and focused policy inputs. In this instance, the happy outcome is that both case-study evidence and analytical rigor point in the same direction.

The second, related contribution of the article is to address the chief difficulties of implementing such a scheme. Provisional assessments are not widely used at present, so there is little experience to go on. It is argued that prevailing incentives are the main cause, and that inexpensive donor-funded subsidies are the right remedy to get things started. Ensuring quality is a related concern.
To deal with it, the paper calls for a fast but rigorous independent review of all provisional assessments and the tying of funds for more thorough research proposals to the timely submission of a related provisional assessment. The reshaping of incentives coupled with quality control can be thought of as supply-side measures that should encourage more reliance on provisional assessments, as a demand-side response. This, we have argued, will improve the quality of decisions in the public sphere.

VI. Appendix

It is assumed that on date $t^d$, and no other, there may be a changing of the guard, and then with probability $p^d$. Under this assumption, equation (1) in the text becomes

$$
Q = \int_{t_1}^{t^d} q_t \, dH(t) + (1 - p^d) \int_{t^d}^{t_2} q_t \, dH(t) + p^d \int_{t^d}^{t_2^d} q_t^d \, dH^d(t). \tag{10}
$$

If D has a poor opinion of the likely new regime or his own successor, $q_t^d$ will be markedly smaller than $q_t$. R revises her priors $F(t)$ and $p_t$ in analogous fashion. Note that her priors $t^r$ and $p^r$ can also differ from D’s. If the event in question occurs, equation (2) in the text becomes

$$
p_t^r(s) =
\begin{cases}
1 - (1 - p_t^r)\,\phi(s, t - s) > p_t & \text{if } t^r < s \le t, \\
p_t^r & \text{otherwise.}
\end{cases} \tag{11}
$$

Equation (3) now takes the form, compactly expressed in terms of $p_t(s)$ and $p_t^r(s)$,

$$
P(s) = (1 - p^r) \int_{t_1}^{t_2} p_t(s) \, dF(t) + p^r \left[ \int_{t_1}^{t^r} p_t(s) \, dF(t) + \int_{t^r}^{t_2^r} p_t^r(s) \, dF^r(t) \right]. \tag{12}
$$

Note that substitution for $p_t(s)$ and $p_t^r(s)$ from equations (2) and (11), respectively, will yield $P(s)$ in extensive form. Like D, R will have her own views about how a changing of the guard will affect the timing and correctness of the decision.

References

Annor-Amevor, E. A., and C. E. Afudego. 2012. “Simulating Alternatives for Increasing Basic Education Certificate Examination Pass Rate in Junior High Schools in Ghana,” mimeo, Integrated Social Development Centre, Ghana.
Arrow, K. J., and R. C. Lind. 1971. “Uncertainty and the Value of Public Investment Decisions,” American Economic Review, 60: 364–78.
Cabinet Office. 1999a.
“Modernising Government,” The Stationery Office, London.
Cabinet Office. 1999b. “Professional Policy Making for the Twenty First Century,” Report by Strategic Policy Making Team, London.
Carden, F. 2009. Knowledge to Policy: Making the Most of Development Research, Sage Publications India Pvt. Ltd, New Delhi.
Court, J., I. Hovland, and J. Young. 2005. Bridging Research and Policy in Development: Evidence and the Change Process. London: ITDG.
Elga, A. 2010. “Subjective Probabilities Should be Sharp.” Philosophers’ Imprint, 10 (5): 1–11.
Dhaliwal, I., E. Duflo, R. Glennerster, and C. Tulloch. 2012. “Comparative Cost-Effectiveness Analysis to Inform Policy in Developing Countries: A General Framework with Applications for Education,” Conference Paper, Abdul Latif Jameel Poverty Action Lab, MIT.
Jones, A. M., L. Squire, and R. Thomas. 2010. “Evaluating Innovative Health Programs,” Health Economics 19, Supplement 1, September.
Jones, N., and C. Walsh. 2006. “Policy Briefs as a Communication for Development Research,” Overseas Development Institute, London.
Jones, N., H. Jones, and C. Walsh. 2008. “Political Science? Strengthening Science-Policy Dialogue in Developing Countries,” Overseas Development Institute, London.
Livny, E., A. Mehendale, and A. Vanags. 2006. “Bridging the Research Policy Divide in Developing and Transition Countries: Analytical Lessons and Proposals for Action,” Global Development Network, mimeograph, November.
Nutley, S. M., I. Walter, and H. T. O. Davies. 2007. Using Evidence: How Research Can Inform Public Services, The Policy Press, Bristol.
Overseas Development Institute. 2004. “Bridging Research and Policy in International Development,” ODI Briefing Paper, London.
Pawson, R. 2006. Evidence-Based Policy: A Realist Perspective, Sage Publications Ltd, London.
Pritchett, L., S.
Samji, and J. Hammer. 2012. “It’s All About MeE: Using Structured Experiential Learning (‘e’) to Crawl the Design Space,” UNU-WIDER Working Paper No. 2012/104, Helsinki.
Pritchett, L., and J. Sandefur. 2013. “Context Matters for Size: Why External Validity Claims and Development Practice Don’t Mix,” Working Paper 336, Center For Global Development, Washington, DC.
Savedoff, W. D., R. Levine, and N. Birdsall. 2006. “When Will We Ever Learn? Improving Lives Through Impact Evaluation,” Center For Global Development, Washington, DC.
Thornton, R. L., L. E. Hatt, E. M. Field, M. Islam, F. S. Dia, and M. A. Gonzalez. 2010. “Social security health insurance for the informal sector in Nicaragua: A randomized evaluation,” Health Economics 19, Supplement 1, September.

The World Bank Economic Review, 31(2), 2017, 570–594
doi: 10.1093/wber/lhv051
Advance Access Publication Date: November 23, 2015
Article

On the Effects of Enforcement on Illegal Markets: Evidence from a Quasi-Experiment in Colombia*

Daniel Mejía, Pascual Restrepo, and Sandra V. Rozo

Abstract

This paper studies the effects of enforcement on illegal behavior in the context of a large aerial spraying program designed to curb coca cultivation in Colombia. In 2006, the Colombian government pledged not to spray a 10 km band around the frontier with Ecuador due to diplomatic frictions arising from the possibly negative collateral effects of this policy on the Ecuadorian side of the border. We exploit this variation to estimate the effect of spraying on coca cultivation by regression discontinuity around the 10 km threshold and by conditional differences in differences.
Our results suggest that spraying one additional hectare reduces coca cultivation by 0.022 to 0.03 hectares; these effects are too small to make aerial spraying a cost-effective policy for reducing cocaine production in Colombia.

JEL classification: H00, K42, O21

* Daniel Mejía (corresponding author) is a professor in the Economics Department at the Universidad de los Andes; his email is dmejia@uniandes.edu.co. Pascual Restrepo is a PhD candidate at MIT; his email is pascual@mit.edu. Sandra V. Rozo is an assistant professor in the Finance and Business Economics Department at USC Marshall School of Business; her email is sandra.rozo@marshall.usc.edu. We thank the editor, Andrew Foster, and two anonymous referees for their valuable comments and suggestions. We are indebted to SIMCI, at the United Nations Office on Drugs and Crime (UNODC) in Colombia, for their invaluable collaboration in providing the data used in this study. We would also like to thank Adriana Lleras-Muney, Adriana Camacho, Leopoldo Ferguson, Paola Giuliano, and Leah Boustan for their suggestions. We are also grateful to the participants of the economics seminar at UCLA and Universidad de los Andes for the comments received.

© The Author 2015. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Illegal activities such as counterfeiting, tax evasion, and the operation of illegal drug markets remain a serious problem throughout the world. Yet, there is still an open debate about how to fight these problems, what strategies to use, and the costs that different enforcement strategies entail. On one side, the economic analysis of crime sees the decision to engage in illegal activities as rational and, as such, shaped by incentives and penalties (see Becker 1968 and Stigler 1970). The central prediction is that enforcement reduces crime to the extent that it increases its costs. On the other side of the debate, social scientists and pundits have raised several concerns about the role of enforcement. In particular, critics argue that criminals may be irrational, myopic, or predisposed to illegal behavior (Menninger 1968); that extrinsic penalties crowd out intrinsic motivations (Frey 1997); or that enforcement may backfire if it conveys information about widespread illegal behavior (Benabou and Tirole 2003, 2006). The debate also suffers from a scarcity of evidence, in part because the exogenous variation in enforcement strategies required to uncover their causal impact on crime and illegal activities is rarely available.

Our paper contributes to the growing literature aimed at disentangling and quantifying the causal effects of enforcement on crime and illicit activities. We use the war on drugs in Colombia as a case study, focusing on the role that aerial spraying with herbicides has on illegal coca cultivation. At least since 1996, Colombia has been the world’s largest cocaine producer and grower of coca crops (the raw input for producing cocaine). Coca cultivation takes place in remote areas of the country with little institutional presence, where farmers face the risk of being detected and their illegal crops being aerially sprayed with herbicides. When coca crops are sprayed with herbicides they are partially lost, which in principle should increase the cost of this illegal activity and reduce the farmers’ incentive to pursue it. Our goal in this paper is to assess the effectiveness of this form of enforcement in reducing coca cultivation.
We exploit the geographic and time variation in aerial spraying induced by a diplomatic friction between the governments of Ecuador and Colombia around 2006. In 2000, the Ecuadorian government alleged that Colombian aerial spraying campaigns near the frontier were causing health problems, productivity losses, and environmental damage in their territory. In response, the Colombian government committed to completely stop aerial spraying campaigns within a 10 km band around the international frontier with Ecuador at the beginning of 2006. The Colombian government broke its commitment at the end of 2006 and continued spraying within the band throughout 2007, though considerably less so than before. However, aerial spraying in the 10 km band from the frontier completely stopped in 2008, in response to a lawsuit filed by the Ecuadorian government against its Colombian counterpart in international courts.

We use satellite and georeferenced data of coca cultivation and aerial spraying on 1-square-km (100 hectares) grid cells between 2000 and 2010 and estimate the effect of aerial spraying using two complementary methodologies. First, we use a fuzzy regression discontinuity design and compare coca cultivation in cells near both sides of the 10 km threshold. We show that aerial spraying changed discontinuously at the 10 km threshold during the years in which Colombia agreed not to spray the exclusion area, while other covariates and policies did not. We use this geographic discontinuity in enforcement to identify its effect on coca cultivation.

Additionally, we report results obtained using a conditional differences in differences estimation. In particular, we compare the cultivation of illicit crops in cells within the exclusion area to that in similar cells located 10 to 20 km away from the frontier, an area that continued to be sprayed throughout the years in our sample.
Both groups of cells were exposed to aerial spraying before 2006, but after that, only the latter group of grid cells continued to be sprayed. Our estimator attributes the differential change in cultivation after 2006 across both areas to the change in enforcement. To guarantee the comparability of both groups, we control for coca cultivation and spraying before the intervention using a variety of parametric and nonparametric estimators.

Consistent with the view that illegal behavior is a rational choice, we find significant (but very small) deterrent effects of spraying on coca cultivation. The regression discontinuity estimates imply that cells in the sprayed area near the cutoff were 10% more likely to be sprayed than nearby cells in the exclusion area. Farmers responded by planting 0.3 fewer hectares of coca per square kilometer in the sprayed region. Similarly, our estimates using the conditional differences in differences estimator suggest that the areas that were exposed to aerial spraying after 2006 faced an approximately 10% higher likelihood of being sprayed and, as a result, had on average 0.22 fewer hectares of coca per square kilometer (relative to cells in the region not sprayed). Both methodologies suggest that spraying one additional hectare reduces coca cultivation by 0.022 to 0.03 hectares in a given year.

Our findings confirm the key insight from the economics of crime for this particular context: enforcement in the form of a higher likelihood of being sprayed with herbicides dissuades farmers from growing illegal crops. However, these effects are too small and aerial spraying of illicit crops is too costly. In particular, our largest point estimates suggest that to reduce coca cultivation by 1 hectare, approximately 33 additional hectares must be sprayed every year. However, it is very likely that coca cultivation is in part displaced by aerial spraying campaigns, making the 33 hectares a lower bound.
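The per-hectare effects quoted above follow from dividing each change in cultivation by the corresponding change in spraying. The short Python sketch below reproduces that arithmetic. It rests on one simplifying assumption that is ours and not stated in the text: a 10% higher likelihood of spraying in a 100-hectare grid cell is read as 10 additional hectares sprayed. The function and constant names are illustrative only.

```python
# Back-of-the-envelope ratios from the figures quoted in the text.
# Assumption (ours): a 0.10 higher spraying probability in a 100-hectare
# cell is treated as 10 additional hectares sprayed per cell.

CELL_AREA_HA = 100.0   # one 1-square-km grid cell, in hectares
FIRST_STAGE = 0.10     # difference in the likelihood of being sprayed


def effect_per_hectare_sprayed(reduction_ha, first_stage=FIRST_STAGE,
                               cell_area_ha=CELL_AREA_HA):
    """Hectares of coca removed per additional hectare sprayed."""
    extra_sprayed_ha = first_stage * cell_area_ha
    return reduction_ha / extra_sprayed_ha


# Reductions in coca per square kilometer reported for each design.
rd_effect = effect_per_hectare_sprayed(0.30)    # regression discontinuity
did_effect = effect_per_hectare_sprayed(0.22)   # differences in differences

# Hectares that must be sprayed to remove one hectare of coca.
rd_hectares_needed = 1.0 / rd_effect
did_hectares_needed = 1.0 / did_effect

print(f"RD:  {rd_effect:.3f} ha removed per ha sprayed, "
      f"{rd_hectares_needed:.1f} ha sprayed per ha removed")
print(f"DiD: {did_effect:.3f} ha removed per ha sprayed, "
      f"{did_hectares_needed:.1f} ha sprayed per ha removed")
```

Under this reading, the larger (regression discontinuity) estimate corresponds to the figure of roughly 33 hectares, while the smaller estimate implies that even more spraying is needed per hectare of coca removed.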
The average direct cost to the United States of spraying one hectare of coca crops in Colombia is estimated to be about $750 (cited in Walsh et al. 2008). According to official sources, for each dollar the United States spends on the spraying program, the Colombian government spends about $2.2 protecting the spraying crews and cleaning up the area before they carry out these campaigns. Thus, the joint cost of spraying 33 hectares of coca, and reducing coca cultivation by 1 hectare per year, is about $79,200, out of which the United States pays at least $24,750. As we show in greater detail in the paper, these numbers imply that the marginal cost to the United States of reducing cocaine supply in retail markets by 1 kg by subsidizing aerial spraying policies in Colombia is about $1.6 million, which is in the ballpark of the costs reported by Mejía and Restrepo (2013) using a different methodology. This is significantly higher than the marginal cost of reducing cocaine consumption in the United States using other policies, such as interdiction efforts in Colombia ($175,000; see Mejía and Restrepo 2013), or treatment and prevention policies in the United States ($8,250 and $68,750, respectively; see MacCoun and Reuter 2001). It is also high when compared to the retail price of 1 kg of pure cocaine in U.S. retail markets, which ranges from $100,000 to $150,000.

In addition to providing evidence on the link between enforcement and illegal behavior, estimating the impact of aerial spraying on coca cultivation is important for several reasons. First, Colombia has been a key player in the international drug trade during the last thirty years.
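The cost figures quoted in the preceding paragraph can be checked with a short calculation. The Python sketch below uses only numbers stated in the text ($750 per hectare sprayed, about $2.2 of Colombian spending per U.S. dollar, and 33 hectares sprayed per hectare of coca removed); the variable names are illustrative and not taken from the paper.

```python
# Reproduces the cost arithmetic from the figures quoted in the text.

US_COST_PER_HA = 750.0             # U.S. direct cost of spraying one hectare
COL_SPEND_PER_US_DOLLAR = 2.2      # Colombian spending per U.S. dollar spent
HA_SPRAYED_PER_HA_REMOVED = 33     # from the largest point estimate

# U.S. share of the cost of removing one hectare of coca per year.
us_cost = US_COST_PER_HA * HA_SPRAYED_PER_HA_REMOVED

# Joint cost: every U.S. dollar is matched by 2.2 Colombian dollars.
joint_cost = us_cost * (1.0 + COL_SPEND_PER_US_DOLLAR)

print(f"U.S. cost per hectare of coca removed:  ${us_cost:,.0f}")
print(f"Joint cost per hectare of coca removed: ${joint_cost:,.0f}")
```

Because 33 hectares is itself described as a lower bound on the spraying needed, both totals should be read as lower bounds as well.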
During our period of analysis, it was the world’s main cocaine producer, supplying nearly 70% of cocaine and holding a similar share of coca crops among the Andean countries (see UNODC 2012). Thus, reducing the supply of Colombian cocaine has the potential to decrease the availability of cocaine and its associated harms throughout the world. Second, aerial spraying is the largest anti-drug program implemented in Colombia. It entails not only resources from the local government but also from the United States. In particular, since the beginning of Plan Colombia in 2000 (the largest cooperative effort between the United States and a source country to curb drug supply and improve security conditions), both countries have spent more than $3 billion on this program. Finally, illegal behavior in Colombia is pervasive, and understanding its nature and how to contain it is a key policy challenge.

The rest of the paper is organized as follows. Section I describes the related literature; section II describes the natural experiment that we exploit in order to identify the causal impact of aerial spraying on coca cultivation and the data. Sections III and IV present the estimates on the effects of spraying. Section V discusses the main results and presents a cost-benefit analysis of the aerial spraying program in Colombia. Finally, section VI concludes.

I. Related Literature

Our paper is related to two branches of the economics literature. First, it is related to the literature on the effects of enforcement on crime. This topic goes back to the seminal contributions of Becker (1968), Stigler (1970), and Ehrlich (1973). The main implication of these papers is that enforcement (in the form of fines, tighter punishments, or a higher probability of detection) reduces crime and illegal behavior. Yet, testing this proposition is challenging, as it requires credible sources of exogenous variation in enforcement.
Otherwise, the fact that enforcement reacts to crime creates a misleading upward bias in the estimated effect of enforcement on crime. Initially, the economics literature failed to find empirical support for this proposition (see Cameron 1988, Marvell and Moody 1996, and Eck and Maguire 2000 for surveys of the early literature), but many of these contributions were plagued with endogeneity issues.

Recent Literature on the Efficacy of Enforcement

Recent studies have addressed identification more carefully and find some support for the deterrence effect of enforcement on some types of crimes and behavior. For instance, Marvell and Moody (1996) find that within-state increases in the number of police officers reduce crime in the United States. Levitt (1997) uses electoral cycles as an instrument for police hiring and finds significant reductions in crime when more policemen are hired.1 Corman and Mocan (2000) use high frequency changes in the number of police officers and find that police reduce burglaries but have no effect on other crime categories. Di Tella and Schargrodsky (2004) exploit the exogenous reallocation of police forces across Buenos Aires as a result of a terrorist attack on a Jewish Center. They find a large and localized deterrent effect of more police presence on car thefts. A similar strategy is used by Draca, Machin, and Witt (2011), who also find evidence of deterrence effects by exploiting police reallocation in London after the terrorist attacks of 2005. Evans and Owens (2007) use state grants to fund Community Oriented Policing Services (COPS) as an instrument for the number of police officers. They find that higher police presence reduces auto thefts, burglaries, robberies, and aggravated assaults.
Buonanno and Mastrobuoni (2012) exploit delays created by a centralized police hiring system in Italy to estimate the effect of police officers on local crime, finding deterrence effects in some crime categories. Finally, Garcia, Mejía, and Ortega (2013) study the randomized introduction of a police-training program among small localities in the four largest cities in Colombia. They find that the intervention significantly reduces crime, not by increasing the police force but by improving its quality and engagement with the community.

Another body of literature focuses on the effects of enforcement or characteristics related to the likelihood of detection on soft crime or tax evasion.2 For example, Bar-Ilan and Sacerdote (2001) find that the introduction of traffic cameras and changes in fines reduced driving infractions. Dubin, Graetz, and Wilde (1987) and Beron, Tauchen, and Witte (1992) present evidence that higher audit rates modestly increase reported income for some groups of taxpayers. In this same area, Klepper and Nagin (1989) find that noncompliance rates are related to the traceability, deniability, and ambiguity of the items being declared, which are in turn related to the probability that evasion will be detected and punished; and Kagan (1989) presents evidence that compliance is greater among people whose income is directly reported to the IRS and who therefore have fewer opportunities to cheat.

We contribute to this literature by cleanly identifying the effect of enforcement on illegal behavior in the context of the war on drugs and illicit crop cultivation in Colombia. The strength of our empirical exercise relies not only on our identification strategy but also on the precision of our data on illicit crop cultivation and enforcement activities.
In particular, we observe satellite data on coca cultivation in small 1-square-km cells and information on the exact location of aerial spraying campaigns, closely monitored by the army and police using GPS devices that are built into the aircraft used in the aerial spraying program in Colombia. Consistent with the previous findings in the literature, our results suggest that farmers respond to a greater likelihood of enforcement in an area by reducing illicit coca cultivation there, but we show that the effects are too small to make the spraying program a cost-effective policy to reduce cocaine supply.

Literature on the Efficacy of Anti-drug Interventions

Secondly, our paper is also related to the applied economics literature on the cost-effectiveness of anti-drug policies. The main challenge in this area is that anti-drug interventions typically take place on a large scale; hence, it is difficult to obtain appropriate counterfactuals. One approach, followed by Mejía and Restrepo (2012, 2013), is to construct and calibrate economic models of illegal drug markets to understand and quantify the main forces and determinants of the cost-effectiveness of different supply-reduction strategies. They find that spraying illicit crops is costly and ineffective relative to policies aimed at seizing drug shipments. However, both strategies are costly relative to demand reduction policies in consumer countries.

1 See also the criticism by McCrary (2002) and the reply by Levitt (2002).
2 For a thorough review of the empirical literature, see Andreoni, Erard, and Feinstein (1998).

Other papers in the literature have focused on estimating the impact of spraying campaigns on coca cultivation by using geographic and time variation. For example, Moreno-Sanchez et al. (2003) and Dion and Russler (2008) use departmental data from Colombia and find a positive correlation between the levels of spraying and the presence of coca crops.
However, these results are likely to be driven by simultaneity bias in their estimates. Recent studies have attempted to address these endogeneity concerns. For example, Moya (2005) deals with selection on observables using matching techniques and finds that spraying does not have a significant effect on coca crops in Colombian municipalities. Bogliacino and Naranjo (2012) also exploit within-municipality variation and find that eradication does not reduce coca production. They estimate a system of equations in which aerial spraying and cultivation are determined simultaneously, identified by the restrictive assumption that crime rates and the number of internal refugees are uncorrelated with coca cultivation, but do create political pressure for spraying campaigns. Reyes (2014) instruments spraying with the distance to the closest military base and finds evidence that aerial spraying increases illicit crops. However, his results require the location of military bases to be exogenous. A different approach is implemented by Ibañez and Carlsson (2010), who elicit farmers’ preferences over risk and the profitability of other crops by using hypothetical questions. Consistent with our findings and the economic model of crime, farmers in Putumayo claim they would reduce coca plantations if they faced a higher risk of eradication. However, the implied reductions are small, as we confirm using a different methodology. Finally, Rozo (2014) instruments spraying with the interaction of the distance between each 1-square-km cell (or coca producer) and the nearest border of a protected area and U.S. international anti-drug expenditures.
She exploits the fact that, by governmental mandate, protected areas cannot be sprayed with herbicides due to environmental and social concerns and finds that aerial spraying has a negative and significant effect on the hectares of coca cropped and their yield. In addition, she documents negative unintended consequences on the socio-economic conditions of coca-producing areas.

This paper contributes to the existing evidence by estimating the effects of aerial spraying programs using a sharp natural experiment, which in our view provides a credible source of exogenous variation. In addition, we use cost figures to back up a lower bound for the cost-effectiveness of these programs. We find that despite reducing cultivation, aerial spraying is too costly to be a cost-effective anti-narcotic strategy. In particular, demand reduction policies in the United States or interdiction campaigns provide the same benefits in terms of supply reduction at much lower costs.

II. Natural Experiment and Data

Following the large increase in coca cultivation that took place in Colombia after 1994 and the increasing involvement of illegal armed groups in these activities, the governments of Colombia and the United States launched Plan Colombia in September of 1999, a large anti-narcotics program aimed at reducing the Colombian cocaine supply. Under this program, the United States government disbursed close to $540 million per year between 2000 and 2008 in subsidies to the Colombian armed forces to fight against the production and trafficking of drugs. Additionally, the Colombian government spent close to $810 million per year during the same period in the fight against illegal drug production and trafficking (GAO 2008). Total expenditures on the military component of Plan Colombia represented close to $1.35 billion per year, corresponding to about 1.2% of the country’s annual GDP, making it the largest anti-drug intervention in a producing country.
The strategies implemented under Plan Colombia included aerial spraying campaigns, manual eradication, control of chemical precursors used in the processing of coca leaf into cocaine, detection and destruction of cocaine processing laboratories, and seizure of drug shipments en route to foreign countries. Aerial spraying has been by far the main anti-drug strategy in terms of financial resources invested. On average, 128,000 hectares per year were sprayed with herbicides, of which almost half were located in Putumayo and Nariño, the two Colombian departments (states) bordering Ecuador, where our empirical analysis is centered. Figure 1 shows the evolution of coca cultivation, aerial spraying with herbicides, and manual eradication between 2000 and 2010, for the whole country (left panel) and for the departments of Nariño and Putumayo (right panel).

Figure 1. Coca Cultivation and Aerial Spraying in Colombia (Left Panel) and Nariño and Putumayo (Right Panel)
Source: Data from the United Nations Office of Drugs and Crime, UNODC.

Spraying campaigns are carried out by American contractors, such as DynCorp, using small aircraft. Coca crops are sprayed with chemicals containing glyphosate, such as Roundup, the commercial name of the herbicide used in the spraying program in Colombia. The herbicide is absorbed through the plant foliage and is effective only on growing plants (e.g., it is not effective in preventing seeds from germinating). It kills the plant by inhibiting its growth. Though Roundup was designed to kill weeds and grasses, including coca bushes, it may also affect legal crops that are not glyphosate-resistant. Aerial spraying with glyphosate is targeted at areas where coca crops have been detected using satellite images.
When planting coca crops, farmers face the risk of having their crops destroyed by the herbicides used in aerial spraying. Given this risk, they may still grow coca bushes and try their luck, or mitigate the effects of the herbicide using a variety of techniques. For instance, farmers can spray molasses on the coca bushes to prevent the herbicide from penetrating the foliage and killing the plant. In addition, they can cut the stem of the plant a few hours after the fumigation event, enabling the plant to grow back a few months later. Finally, farmers can relocate their crops to areas less likely to be sprayed. However, these alternatives are costly and force some farmers to cultivate legal crops instead, which are, in principle, not targeted by spraying campaigns.

Because aerial spraying campaigns typically target areas with a high prevalence of coca plantations, traditional estimates of the effect of spraying on cultivation are biased upwards. In this paper, we solve this problem and identify the effects of aerial spraying using a natural experiment. In particular, we exploit a diplomatic friction between the governments of Colombia and Ecuador, which concluded with a commitment by the Colombian government not to carry out spraying campaigns within a 10 km strip along the international border with Ecuador, starting in 2006.

From the beginning of Plan Colombia, the Ecuadorian government complained of alleged adverse effects of spraying on the health of its population, the environment, livestock, and legal crops near the border area. In 2006, the Colombian government announced that it would discontinue aerial spraying within a 10 km band along the international frontier with Ecuador within Colombian territory. However, the Colombian government reversed course at the end of 2006 and resumed spraying the area.
As a result of this noncompliance with the initial agreement, the Ecuadorian government filed a lawsuit against Colombia in the International Court of Justice in The Hague. After the suit was filed on March 31, 2008, the Colombian government stopped all spraying campaigns within the 10 km strip. The implementation of this exclusion area generated geographical and time variation in the likelihood of aerial spraying, which we exploit to identify its effects. Figure 2 shows a map of the exclusion strip and its location in Colombia.

Figure 2. Map of the Frontier between Colombia and Ecuador Illustrating the Sprayed and Exclusion Areas
Source: Authors’ illustration.

Data

We employ a unique panel of data on the location of coca crops within 1-square-kilometer (or 100-hectare) cells from 2000 to 2010. The data are collected and processed by the United Nations Office for Drugs and Crime (UNODC) in Colombia and come from satellite images. Using these images, UNODC estimates the number of hectares with coca cultivation detected on each grid by the end of each year. We also use cell-level data on the location of aerial spraying campaigns for the same period. These data are collected by the Colombian police using GPS devices, and they record the exact location of the plane when the spraying valves are open. Using these observations we code a dummy of whether a grid was sprayed or not, for each year from 2000 to 2010.[3] We restrict our sample to all grid points with centroids located within 20 km of the international frontier with Ecuador. Our sample includes 10,880 cells. We refer to the 5,613 cells within 10 km of the frontier as the exclusion region, since this is the area that Colombia agreed not to spray. In contrast, we refer to the 5,275 cells located 10 km to 20 km from the frontier as the sprayed area, as these cells were sprayed throughout our period of analysis. Both regions are depicted in figure 2.

[3] Moreover, we use data on whether manual eradication campaigns took place on each grid, covering the 2007–10 subperiod. These data are obtained from GPS devices used by manual eradication teams.

To summarize the data, figure 3 presents the likelihood of aerial spraying and the average number of hectares with coca crops per square kilometer in both regions from 2000 to 2010. As anticipated above, the figure reveals similar patterns before 2006, with the exception of 2004. However, and consistent with the description of the natural experiment we exploit, a significant gap opens up beginning in 2006, when the Colombian government first agreed to reduce the spraying campaigns in the exclusion area. The data show that the Colombian government still sprayed the exclusion area in 2006 and 2007, but considerably less than the sprayed region, where about 20% of the cells were sprayed. However, spraying of the exclusion area falls to virtually zero from 2008 onward, the year in which Ecuador filed the lawsuit against Colombia in international courts.

Figure 3. Coca Cultivation and Likelihood of Aerial Spraying in the Exclusion (Dotted Line) and Sprayed Areas (Solid Line) from 2000 to 2010
Source: Authors’ analysis based on data from United Nations Office of Drugs and Crime, UNODC.
Notes: 95% confidence intervals for the mean in each group are presented in gray.

The data on cultivation reveal a sharp decline from 2000 to 2004, during the first years of Plan Colombia, from about 3 hectares per square kilometer to about 0.6.
However, in 2006, and later in 2009 and 2010, cultivation increased in the exclusion region relative to the sprayed area, though only mildly.

III. Fuzzy Regression Discontinuity Approach

In this section we employ a fuzzy regression discontinuity design to evaluate the impact of aerial spraying on coca cultivation. We exploit the exogenous rule applied by the Colombian government since 2006 to stop aerial spraying within 10 km of the international frontier, which became particularly binding from 2008 onward.

In our setting, the forcing variable is the distance from the centroid of each cell $i$ to the international frontier with Ecuador, capturing its geographic location. We normalize the forcing variable to take the value of zero at the 10 km cutoff, and denote it by $\hat{D}_i$, where $\hat{D}_i = D_i - 10$ km.[4] Our discussion above implies that in the remaining sample of cells, there should be a discontinuity in aerial spraying around $\hat{D}_i = 0$ in 2006 and from 2008 onwards, assuming that the Colombian government fulfilled its commitment strictly during these years.

Let $S_{it}$ be a dummy equal to 1 if grid $i$ was sprayed during year $t$. We start by exploring whether the diplomatic friction led to a discontinuity in aerial spraying near the 10 km band. To do so, we estimate the model:

$$
S_{it} = \pi_{0t} + \pi_{1t}\, 1\{\hat{D}_i > 0\} + f_t(\hat{D}_i, X_i) + \varepsilon_{it}, \tag{1}
$$

for specific years, $t$, or pooling different years together. Here, $f_t$ is a polynomial in the forcing variable and other geographic characteristics of the cell (including longitude and latitude), which captures the continuous variation of the likelihood of enforcement over different cells. Essentially, it represents the conditional expectation of the policy based solely on the geographic characteristics of a cell. The coefficient $\pi_{1t}$ measures any policy discontinuity around the cutoff.
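To make the first-stage regression in equation 1 concrete, the following is a minimal sketch on simulated data, not the authors' code or data. It regresses a spraying dummy on the above-cutoff indicator and a cubic polynomial in the normalized distance, using only the Python standard library. The sample size, the true jump of 0.11, the baseline probabilities, and the helper names (`solve`, `ols`, `first_stage_jump`, `simulate_cells`) are all illustrative assumptions. Latitude, longitude, fixed effects, and clustered standard errors are deliberately left out.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def first_stage_jump(dist, sprayed, window=3.0):
    """Coefficient on 1{dist > 0}, controlling for a cubic polynomial in dist."""
    rows, y = [], []
    for d, s in zip(dist, sprayed):
        if abs(d) <= window:  # keep cells within the window around the cutoff
            above = 1.0 if d > 0 else 0.0
            rows.append([1.0, above, d, d ** 2, d ** 3])
            y.append(s)
    return ols(rows, y)[1]


def simulate_cells(n=60_000, true_jump=0.11, seed=0):
    """Hypothetical cells: spraying probability jumps by true_jump at dist = 0."""
    rng = random.Random(seed)
    dist = [rng.uniform(-3.0, 3.0) for _ in range(n)]
    sprayed = []
    for d in dist:
        p = 0.05 + true_jump * (d > 0) + 0.01 * d
        sprayed.append(1.0 if rng.random() < p else 0.0)
    return dist, sprayed


dist, sprayed = simulate_cells()
jump = first_stage_jump(dist, sprayed)
print(f"estimated jump in spraying probability: {jump:.3f}")
```

The design choice worth noting is that the jump is read off as the coefficient on the indicator, while the polynomial absorbs smooth variation in spraying with distance. Higher-order polynomials make that coefficient noisier, which is one reason to compare several specifications.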
Our discussion above implies that we expect $\pi_{1t} > 0$ for $t \geq 2006$, and in particular since 2008, when spraying was reduced virtually to zero in the exclusion area. In addition, we control for municipality fixed effects and year effects (when necessary), and we report standard errors robust against heteroskedasticity and serial correlation within cells throughout.

The main challenge in estimating equation 1 is to specify a flexible model for the conditional expectation function $f_t(\hat{D}_i, X_i)$. Our first approach is to use a cubic polynomial in $D_i$, together with quadratic terms for latitude and longitude, aimed at capturing the variation of $S_{it}$ over space. We present estimates using this approach in columns 1 and 2 of table 1. In column 1 we estimate equation 1 on the sample of cells with centroids within 3 km of the discontinuity, and in column 2 we further restrict the sample to cells within 2.5 km. By restricting the sample of cells used, we rely less on the particular parametrization of the conditional expectation function. Alternatively, in columns 3 and 4 we approximate the conditional expectation using a local quadratic regression with different choices of bandwidth and a triangular kernel. In column 3 we use the Imbens and Kalyanaraman (2012) optimal bandwidth (labeled IK throughout), which in our setting equals $b = 4.7$ km.[5]

[4] In this exercise, we exclude from our sample all cells that had their centroid in the first 500 m around the cutoff value since they have a significant portion of their territory in both the exclusion and the sprayed area. Thus, we only compare cells near the 10 km cutoff lying entirely on one side or the other. We obtain similar results using the cells within 500 m of the cutoff to estimate the conditional expectation of cultivation and the likelihood of spraying as a function of the distance to the cutoff.
In these models, we add separate dummies for cells within 0 to 500 m away from the cutoff in the exclusion area and cells within 0 to 500 m away from the cutoff in the sprayed area.

[5] Though the optimal bandwidth varies estimate by estimate, all are very close to the 4.7 km one used here. Thus, we use this bandwidth and label it as the optimal bandwidth throughout.

Table 1. Estimates of the Discontinuity in Spraying and Coca Cultivation around the 10 km Cutoff (Sprayed Minus Exclusion Area)

| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) |
|---|---|---|---|---|---|---|---|---|
| Outcome | Spraying | Spraying | Spraying | Spraying | Cultivation | Cultivation | Cultivation | Cultivation |
| Model for conditional expectation | Cubic polynomial | Cubic polynomial | Local model | Local model | Cubic polynomial | Cubic polynomial | Local model | Local model |
| Grids within / bandwidth | ±3 km | ±2.5 km | b = IK | b = 3 km | ±3 km | ±2.5 km | b = IK | b = 3 km |
| Panel A: Pooled-years estimates | | | | | | | | |
| Difference after 2006 | 0.112*** | 0.086*** | 0.107*** | 0.098*** | -0.479*** | -0.644*** | -0.473*** | -0.501** |
| | (0.023) | (0.029) | (0.019) | (0.033) | (0.138) | (0.172) | (0.121) | (0.204) |
| Observations | 13265 | 10560 | 22340 | 13265 | 13265 | 10560 | 22340 | 13265 |
| Difference after 2008 | 0.117*** | 0.098*** | 0.107*** | 0.089** | -0.371*** | -0.454*** | -0.337*** | -0.424** |
| | (0.025) | (0.032) | (0.021) | (0.036) | (0.117) | (0.140) | (0.104) | (0.168) |
| Observations | 7959 | 6336 | 13404 | 7959 | 7959 | 6336 | 13404 | 7959 |
| Panel B: Year-by-year estimates | | | | | | | | |
| Difference in 2006 | 0.079 | 0.079 | 0.085* | 0.109 | -0.840* | -1.403** | -0.794** | -0.966 |
| | (0.051) | (0.066) | (0.045) | (0.076) | (0.463) | (0.584) | (0.403) | (0.691) |
| Difference in 2007 | 0.128** | 0.057 | 0.130** | 0.116 | -0.444 | -0.455 | -0.558** | -0.268 |
| | (0.059) | (0.074) | (0.051) | (0.085) | (0.305) | (0.374) | (0.268) | (0.428) |
| Difference in 2008 | 0.119*** | 0.086 | 0.108*** | 0.068 | -0.440** | -0.641*** | -0.397** | -0.644** |
| | (0.046) | (0.058) | (0.039) | (0.066) | (0.213) | (0.247) | (0.186) | (0.287) |
| Difference in 2009 | 0.096*** | 0.077* | 0.068** | 0.080 | -0.210 | -0.263 | -0.195 | -0.140 |
| | (0.036) | (0.044) | (0.030) | (0.050) | (0.161) | (0.190) | (0.142) | (0.227) |
| Difference in 2010 | 0.137*** | 0.132** | 0.144*** | 0.120* | -0.463** | -0.460* | -0.421** | -0.488 |
| | (0.045) | (0.058) | (0.039) | (0.065) | (0.216) | (0.266) | (0.194) | (0.325) |
| Observations per year | 2653 | 2112 | 4468 | 2653 | 2653 | 2112 | 4468 | 2653 |

Notes: The table presents regression discontinuity estimates of the difference in spraying (columns 1 to 4) and cultivation (columns 5 to 8) around the 10 km cutoff. Panel A presents estimates pooling several years together, as indicated by the row labels. Panel B presents year-by-year estimates. Columns 1, 2, 5, and 6 use a global approximation to the conditional expectation based on a cubic polynomial and restrict the sample to grids within 3 km and 2.5 km of the discontinuity. Columns 3, 4, 7, and 8 use a quadratic local regression to approximate the conditional expectation and set a bandwidth of 4.7 km (following Imbens and Kalyanaraman 2012) or 3 km. Standard errors robust against heteroskedasticity and serial correlation within cells are reported in parentheses. Estimates with *** are significant at the 1% level, those with ** at the 5% level, and those with * at the 10% level.
Source: Authors’ analysis based on data from United Nations Office of Drugs and Crime, UNODC.
Since one may be concerned that this is too large, we also set the bandwidth to $b = 3$ km in column 4.

The estimates in column 1 show that, since 2006, cells in the sprayed area near the 10 km cutoff were 11.2 percentage points more likely to be sprayed than similar cells in the exclusion region (standard error = 0.023). We find similar estimates when we pool the years 2008 to 2010 together (when the spraying campaigns were reduced to nearly zero in the exclusion area) and in the remaining models in columns 2 to 4. Importantly, we find a similar pattern year by year in panel B. Though the estimates are less precise, they do point to a discontinuity in the likelihood of aerial spraying since 2006 around the 10 km band near Ecuador. Consistent with our discussion, the discontinuity becomes clearer from 2008 onward, when Colombia stopped aerial spraying campaigns in the exclusion band altogether.

The previous results can be seen graphically. Figure 4 shows the local behavior of the likelihood of aerial spraying on both sides of the 10 km cutoff for cells within 2.5 km of the discontinuity. To ease the graphical analysis, we plot cells by the distance of their border to the cutoff.[6] The plots reveal a clear discontinuity in the likelihood of spraying after 2008 and, to a lesser extent, for 2006 and 2007.

Figure 4. Probability of Aerial Spraying around the 10 km Cutoff during Years in Which Colombia Committed Not to Spray the Excluded Region (2006 to 2010)
Source: Authors’ analysis based on data from United Nations Office of Drugs and Crime, UNODC.
When we pool the years 2006 to 2010 (after the diplomatic friction began), we observe a clear discontinuity in the likelihood of spraying.

We now investigate the consequences of the discontinuity in spraying on coca cultivation. Let $Y_{it}$ be the hectares with coca crops in cell $i$ in year $t$, measured with satellite images at the end of each year. We estimate the following specification:

$$
Y_{it} = \gamma_{0t} + \gamma_{1t}\, 1\{\hat{D}_i > 0\} + f_t^{Y}(\hat{D}_i, X_i) + \nu_{it}, \tag{2}
$$

for different years, $t$. Here, $f_t^{Y}$ is another polynomial approximating the conditional expectation of cultivation, based solely on geographic characteristics (distance to the frontier, latitude, and longitude). The coefficient $\gamma_{1t}$ measures any discontinuity around the cutoff.

Columns 5 to 8 of table 1 present estimates of the difference in coca cultivation at the discontinuity, $\gamma_{1t}$, for several years separately and for pooled years. Each column presents estimates based on different approximations of the conditional expectation function analogous to the ones used in columns 1 to 4. Consistent with the results on spraying, we find evidence of reductions in cultivation in the sprayed area since 2006, relative to the exclusion area. In this case, the estimates in column 5 suggest that cultivation was reduced by about 0.48 hectares per square kilometer in cells near the cutoff in the sprayed region relative to the exclusion area since 2006 (standard error = 0.14). The estimates in columns 6 through 8 confirm our findings. When we focus on the years from 2008 onward, we also find significant, though slightly smaller, differences in cultivation. The results are not very precise when the sample is divided by years, but they are mostly significant at the 10% level, or nearly so, and all point to a reduction in cultivation in the sprayed relative to the exclusion area.
Figure 5 also presents these results graphically (the construction of these figures is analogous to that of figure 4). The above results suggest that the enforcement of the 10 km exclusion area created a discontinuity in enforcement around the cutoff from 2008 onward and, less so, in 2006 and 2007. The discontinuity in spraying caused divergent illegal behavior on both sides of the cutoff.

Throughout, we present standard errors clustered at the cell level and robust against heteroskedasticity. In this case, the validity of our standard errors requires the error terms $\varepsilon_{it}$ and $\nu_{it}$ to be uncorrelated across cells. We believe this is a good starting point, since the flexible polynomials $f_t$ and $f_t^{Y}$, which include detailed geographical information, already purge the errors of sources of spatial correlation depending on geographical proximity. Moreover, municipality fixed effects also remove sources of spatial correlation related to unobserved differences across municipalities.

In addition, we also present standard errors allowing for spatial correlation in square brackets below our main estimates in the top panel. These standard errors, based on Conley (1999), allow $\varepsilon_{it}$ and $\nu_{it}$ to be correlated among cells within 5 km of each other and also permit these error terms to be serially correlated. Overall, they are close to our traditional standard errors, lending support to our inference.

The validity of our estimates requires our approximation of the conditional expectations to be valid, so that we are capturing a true discontinuity. To support our approach, we take advantage of the timing of the diplomatic friction and show that there is no discontinuity in spraying or in cultivation before 2006. Figure 6 shows that, when we pool all years before 2006 together, there is no evidence of a discontinuity.
This suggests that our estimates in table 1 are not driven by unobserved characteristics of cells or a misspecification of the conditional expectation, as these would show up as a discontinuity in the years before 2006. Moreover, these findings suggest that the particular choice of the exclusion area was rather arbitrary and was not strategically aimed at certain cells with particular cultivation dynamics, at least not near the 10 km cutoff.

Figure 5. Coca Cultivation around the 10 km Cutoff during Years in Which Colombia Committed Not to Spray the Excluded Region
Source: Authors’ analysis based on data from United Nations Office of Drugs and Crime, UNODC.

We further explore this point by plotting year-by-year estimates of the difference around the 10 km threshold in spraying and cultivation, together with their standard errors, in figure 7. We focus on the specification obtained with the local quadratic polynomial and $b = 3$ km, which is the more demanding one. Consistent with our motivation, we find no evidence of discontinuities in any year before 2006, with most point estimates being small (relative to the yearly estimates for 2006 to 2010) and close to zero.[7]

[6] This is defined as $\hat{D}_i - 500$ m on the right of the cutoff and $\hat{D}_i + 500$ m on the left. By doing so, we remove from the figure the 500-meter band around the cutoff that we excluded from the estimation sample. We use a quadratic polynomial to approximate the local behavior on each side of the 10 km cutoff.

[7] Other specifications yield similar patterns. However, as we allow larger bandwidths or use polynomials with lower degrees, we obtain larger point estimates for the difference in cultivation during 2004. Our view is that, although on average our approximation for the conditional expectation function is valid, it may fail for particular years.
This potential failure lends support to complementary approaches, such as the conditional difference-in-differences estimator we analyze below.

Figure 6. Probability of Spraying and Coca Cultivation around the 10 km Cutoff before the Diplomatic Friction Leading to a Suspension of Aerial Spraying Campaigns Near Ecuador
Source: Authors’ analysis based on data from United Nations Office of Drugs and Crime, UNODC.

Figure 7. Year-by-Year Estimates of the Difference in Cultivation and Aerial Spraying Near the 10 km Cutoff, Both before and after the Diplomatic Friction
Source: Authors’ analysis based on data from United Nations Office of Drugs and Crime, UNODC.

An alternative test of our specification consists of conducting placebo tests around random cutoffs that experienced no policy changes. To do so, we draw a random sample of 10,000 cutoffs from 10 to 20 km away from the frontier. We then estimate a discontinuity in cultivation and spraying at each cutoff, pooling data from 2006 to 2010.[8] Figure 8 plots the empirical distribution of these estimates. As expected, they have mean zero but are also remarkably close to zero for all artificially created cutoffs. For both spraying and cultivation, our estimates in table 1, pooling the years 2006 to 2010, are clearly in the tails of both empirical distributions, indicating that they are unlikely to be spurious.

[8] Again, we focus on the specification using the local quadratic approximation to the conditional expectation. We obtain very similar results for the other specifications and for different choices of bandwidths.

Figure 8.
Distribution of Discontinuity Estimates Obtained from Placebo Cutoffs Located 10 to 20 km away from the Frontier
Source: Authors’ analysis based on data from United Nations Office of Drugs and Crime, UNODC.

Finally, we conduct two additional tests showing that other observed covariates do not exhibit discontinuities near the 10 km cutoff. In particular, figure 9 shows there is no discontinuity in the likelihood of manual eradication, an alternative policy for which we have georeferenced data since 2007. Thus, the decrease in aerial spraying in the exclusion area was not compensated for by an increase in manual eradication, and hence our estimates reflect only the causal effect of the policy change in spraying. Moreover, the right panel of this figure shows there is no discontinuity in terms of altitude around the cutoff, which is a key predictor of yields (see Mejía and Restrepo 2013b).

Figure 9. Probability of Manual Eradication and Altitude around the 10 km Cutoff
Source: Authors’ analysis based on data from United Nations Office of Drugs and Crime, UNODC.

All these tests make us confident that our estimates provide a reasonable approximation to the conditional expectation of spraying and cultivation based on the location of cells and that we are capturing a real discontinuity near the 10 km threshold induced by the diplomatic friction since 2006.

Estimating Local Treatment Effects

To quantify the treatment effect of spraying on illegal coca cultivation, we compute 2SLS estimates using the discontinuity in spraying as the instrument. In particular, we estimate the fuzzy regression discontinuity model:

$$
Y_{it} = \beta_0 + \beta_1 S_{it} + f(\hat{D}_i, X_i) + \upsilon_{it}, \tag{3}
$$

instrumenting $S_{it}$ with the dummy $1\{\hat{D}_i > 0\}$ (so that equation 1 corresponds to the first stage).

Table 2 presents our estimates using different approximations of the conditional expectation $f(\hat{D}_i, X_i)$. Columns 1 to 3 present results obtained by pooling all years from 2006 to 2010, while columns 4 to 6 present estimates for the years 2008 to 2010, when Colombia effectively stopped spraying in the exclusion area. In panel A we present estimates approximating $f$ with a linear function of distance to the 10 km cutoff, latitude, and longitude. We restrict the sample to cells within 3 km, 2.75 km, and 2.5 km of the cutoff in different columns, as indicated in the bottom rows of the panel. Our results in column 1 indicate that a 10 percentage point increase in the likelihood of aerial spraying reduces cultivation by 0.35 hectares per square kilometer (standard error = 0.14); this result is in line with the reduced form estimates in the previous section. The remaining columns yield similar estimates.

Table 2. Fuzzy RD Estimates of the Local Average Treatment Effect of Spraying on Cultivation around the 10 km Cutoff

| | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| Sample | 2006 to 2010 | 2006 to 2010 | 2006 to 2010 | 2008 to 2010 | 2008 to 2010 | 2008 to 2010 |
| Panel A: Linear global polynomial | | | | | | |
| Effect of spraying | -3.508** | -3.084** | -2.918** | -2.257** | -1.869* | -2.313* |
| | (1.403) | (1.369) | (1.455) | (1.050) | (1.014) | (1.203) |
| Excluded instrument F statistic | 41.6 | 42.7 | 35.7 | 44.8 | 47.7 | 35.0 |
| Panel B: Quadratic global polynomial | | | | | | |
| Effect of spraying | -3.756*** | -3.472*** | -3.260** | -2.392** | -2.102** | -2.561** |
| | (1.345) | (1.338) | (1.418) | (1.019) | (0.996) | (1.186) |
| Excluded instrument F statistic | 41.8 | 42.7 | 35.8 | 44.6 | 47.7 | 35.1 |
| Panel C: Cubic global polynomial | | | | | | |
| Effect of spraying | -4.144** | -5.394* | -7.298* | -3.085* | -4.843* | -4.539 |
| | (2.112) | (3.229) | (4.383) | (1.673) | (2.803) | (2.781) |
| Excluded instrument F statistic | 15.1 | 8.0 | 5.6 | 15.9 | 7.8 | 7.0 |
| Sample of grids (panels A to C) | ±3 km | ±2.75 km | ±2.5 km | ±3 km | ±2.75 km | ±2.5 km |
| Observations (panels A to C) | 13265 | 11885 | 10560 | 7959 | 7131 | 6336 |
| Panel D: Local linear regression | | | | | | |
| Effect of spraying | -2.980*** | -3.183*** | -3.628** | -1.790** | -1.933** | -2.526** |
| | (1.019) | (1.163) | (1.537) | (0.765) | (0.889) | (1.206) |
| Excluded instrument F statistic | 74.0 | 58.2 | 34.5 | 84.8 | 64.8 | 37.6 |
| Panel E: Local quadratic regression | | | | | | |
| Effect of spraying | -3.179*** | -3.591** | -3.791* | -1.900** | -2.402** | -3.241* |
| | (1.233) | (1.516) | (2.178) | (0.944) | (1.167) | (1.926) |
| Excluded instrument F statistic | 30.0 | 20.8 | 9.8 | 30.5 | 21.5 | 9.1 |
| Bandwidth (panels D and E) | b = IK | b = 4 km | b = 3 km | b = IK | b = 4 km | b = 3 km |
| Observations (panels D and E) | 22340 | 18495 | 13265 | 13404 | 11097 | 7959 |

Notes: The table presents fuzzy regression discontinuity estimates of the effect of differential aerial spraying on cultivation around the 10 km cutoff. Columns 1 to 3 pool the years 2006 to 2010, while columns 4 to 6 pool the years 2008 to 2010. Results use different approximations of the conditional expectation, as indicated at the top of each panel. Moreover, sample restrictions and the choice of bandwidth are indicated at the bottom of each panel. In all models we include municipality and year specific intercepts. Standard errors robust against heteroskedasticity and serial correlation within cells are reported in parentheses. Estimates with *** are significant at the 1% level, those with ** at the 5% level, and those with * at the 10% level.
Source: Authors’ analysis based on data from United Nations Office of Drugs and Crime, UNODC.

In panel B we use a quadratic polynomial and in panel C a cubic polynomial. Despite the similar findings and the fact that we restrict our sample to be close to the cutoff, one may be concerned that these high-order polynomials end up giving more weight to observations far from the cutoff (see Imbens and Gelman 2012). To address these concerns we present results using a linear and a quadratic local approximation to the conditional expectation. We use the Imbens and Kalyanaraman (2012) optimal bandwidth (4.7 km) in columns 1 and 4 and smaller bandwidths of 4 km and 3 km in the remaining columns, as indicated in the bottom row of the panel. Reassuringly, we find similar estimates in this case and for different choices of bandwidths. If anything, only the precision seems to change, as expected, when we use smaller bandwidths.

Overall, our estimates indicate a negative local average treatment effect of the likelihood of aerial spraying on coca cultivation. Our estimates suggest that a 10 percentage point increase in the likelihood of aerial spraying reduces coca cultivation by about 0.3 hectares per square kilometer, though the results vary slightly depending on the specification. In practice, we believe that part of our estimate captures the possibility that coca farmers reallocate their crops to the exclusion area, which seems reasonable given the proximity between cells.
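As an aside on the mechanics of the fuzzy RD estimator in equation 3, the sketch below uses simulated data to illustrate that, with a single instrument and the same controls in both stages, the 2SLS coefficient equals the reduced-form jump in cultivation divided by the first-stage jump in spraying. Everything here is an illustrative assumption rather than the authors' data or code: the sample size, the true effect of -3, the first-stage jump of 0.3, the unobserved trait `v`, and the helper names. The specification uses only a linear term in distance and omits latitude, longitude, fixed effects, and standard errors.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def simulate(n=100_000, seed=1):
    """Hypothetical cells in which spraying lowers cultivation by 3 hectares."""
    rng = random.Random(seed)
    dist, spray, coca = [], [], []
    for _ in range(n):
        d = rng.uniform(-3.0, 3.0)
        above = 1.0 if d > 0 else 0.0
        v = rng.random()  # unobserved trait affecting both spraying and cultivation
        s = 1.0 if v < 0.1 + 0.3 * above else 0.0
        y = 1.5 - 3.0 * s + 0.1 * d + 2.0 * (v - 0.5) + rng.gauss(0.0, 0.5)
        dist.append(d)
        spray.append(s)
        coca.append(y)
    return dist, spray, coca


def fuzzy_rd(dist, spray, coca):
    """Return (Wald/2SLS estimate, first-stage jump, reduced-form jump)."""
    rows = [[1.0, 1.0 if d > 0 else 0.0, d] for d in dist]
    first_stage = ols(rows, spray)[1]   # jump in spraying probability at the cutoff
    reduced_form = ols(rows, coca)[1]   # jump in cultivation at the cutoff
    return reduced_form / first_stage, first_stage, reduced_form


dist, spray, coca = simulate()
beta_hat, first_stage, reduced_form = fuzzy_rd(dist, spray, coca)
print(f"first stage {first_stage:.3f}, reduced form {reduced_form:.3f}, "
      f"fuzzy RD estimate {beta_hat:.2f}")
```

Scaling a coefficient of this kind by 0.10 gives the effect of a 10 percentage point change in the likelihood of spraying, which is how the coefficients of roughly -3 in table 2 translate into the reduction of about 0.3 hectares per square kilometer quoted in the text.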
This implies that we may be overstating the real effect of spraying on overall cultivation, but it does not overturn our conclusion that farmers respond rationally to the increase in enforcement; it simply suggests another margin of response. However, one additional piece of evidence suggests that reallocation may not be that pervasive. When we estimate the effect of the discontinuity in enforcement on the likelihood of cultivation (the extensive margin, rather than the intensive margin), we find no effects (not reported to save space). This suggests that cultivation in the exclusion region increases within cells, and not because farmers move to new cells in this area. In any case, we cannot entirely rule out reallocation of coca crops, and our point estimates remain an upper bound of the effects of spraying on coca cultivation.

IV. Conditional Differences in Differences Estimates

Let $T_i = 1\{\hat{D}_i > 0\}$ and let $\bar{Y}_i$ be the average cultivation in grid $i$ from 2000 to 2005. We are interested in estimating the treatment effect of $T_i$ on $Y_{it} - \bar{Y}_i$, for $t \geq 2006$ (or $t \geq 2008$, when Colombia effectively stopped spraying in the exclusion area). This effect informs us about changes in cultivation brought about by the differences in enforcement induced by the diplomatic friction.

In the previous section we exploited the geographical discontinuity in enforcement around the 10 km cutoff to identify this effect. We relied on comparing cells near the cutoff and controlling for potential differences using a smooth function of their geographic location. In this section, we present complementary estimates exploiting within-cell variation and use the data from 2000 to 2005 to construct counterfactuals for cultivation and spraying in a given cell.
In particular, we exploit the following conditional independence assumption (CIA):

$$Y_{it}^{d} - \bar{Y}_i \perp T_i \mid Z_i, \{Y_{it'}, S_{it'}\}_{t' \leq 2005}, \qquad (4)$$

where $Y_{it}^{d}$ is the potential cultivation in cell $i$ and year $t$ for cells in the sprayed area, $d = 1$, and cells in the exclusion area, $d = 0$. The assumption states that once we condition on the whole history of cultivation and spraying in a cell ($\{Y_{it'}, S_{it'}\}_{t' \leq 2005}$) and on cell characteristics $Z_i$ (including a polynomial in altitude, which determines yields, and municipality fixed effects), the change in potential cultivation from 2006 onward would be equal for cells in the sprayed and exclusion areas (in the absence of changes in spraying).9 In other words, the assumption requires that all differences between cells in the exclusion and sprayed areas are captured by their history of spraying and cultivation and the observed geographic characteristics. Any change in cultivation not predicted by these observables is attributed to changes in aerial spraying since 2006.

9 We use the change in potential cultivation instead of its level to remove any permanent difference between cells not captured by the conditioning set. In theory, we do not need to remove the average cultivation, as this is already in the conditioning set. In practice, this helps to control for potential sources of misspecification in the model for the propensity score used below.

We believe this is a plausible assumption. Including lags of cultivation and spraying takes into account the fact that, before 2006, cultivation behaved differently in the two areas (see figure 10, which plots the difference in spraying and cultivation between the sprayed and exclusion areas for each year).
This implies a violation of the assumptions behind traditional difference-in-differences estimates and requires us to condition on the observed paths of cultivation and spraying, as we do here.

Figure 10. Difference in Coca Cultivation and Spraying between the Sprayed and Exclusion Areas from 2000 to 2010
Source: Authors’ analysis based on data from United Nations Office of Drugs and Crime, UNODC.

Estimating the Average Treatment Effect on the Treated

We exploit the above CIA in several ways to estimate the effect of being in the sprayed region during years in which Colombia agreed not to spray the exclusion area. First, we start by running the regression:

$$Y_{it} - \bar{Y}_i = \beta_t T_i + \delta_t + \sum_{t' < 2006} \Gamma \cdot (Y_{it'}, S_{it'})' + \Theta Z_i' + \epsilon_{it}, \quad \forall t \geq 2006. \qquad (5)$$

Here, $\beta_t$ identifies the effect of being in the sprayed region (relative to the exclusion region) during year $t$ as long as the conditional expectation of the outcome is linear in the covariates. We compute standard errors clustered at the cell level and robust to heteroskedasticity.

Column 1 in table 3 presents the regression estimates on cultivation separately by year and also pools together the years 2006 to 2010 and 2008 to 2010, when Colombia effectively stopped spraying the exclusion area. Our estimates show that cultivation fell in the sprayed area relative to the exclusion region in all years since 2006, and the effects are all significant at traditional levels. Pooling the years 2006 to 2010 together, we find that cells in the sprayed area had 0.24 fewer hectares of coca per square km (standard error = 0.015) as a consequence of the spraying; pooling 2008 to 2010 gives 0.15 fewer hectares (standard error = 0.012). As before, we also present standard errors robust against spatial correlation among cells within 5 km of each other. These are presented in square brackets, only below the pooled estimates in table 3. Though, in this case, they are considerably larger than the traditional ones, they do not change any of our conclusions.

Table 3.
Conditional Differences in Differences Estimate of Being in Sprayed Region

Columns 1 to 4: estimates for cultivation. Columns 5 to 8: estimates for spraying. Columns 2 to 4 and 6 to 8: controlling for propensity score.

| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) |
|---|---|---|---|---|---|---|---|---|
| Panel A: Pooled-years estimates | | | | | | | | |
| Pooling 2006 to 2010 | -0.239*** | -0.227*** | -0.150*** | -0.222*** | 0.077*** | 0.100*** | 0.104*** | 0.095*** |
| | (0.015) | (0.022) | (0.021) | (0.021) | (0.002) | (0.004) | (0.004) | (0.004) |
| | [0.053] | [0.052] | [0.024] | [0.052] | [0.011] | [0.011] | [0.007] | [0.011] |
| Observations | 54400 | 54400 | 54400 | 54400 | 54400 | 54400 | 54400 | 54400 |
| Pooling 2008 to 2010 | -0.153*** | -0.140*** | -0.067*** | -0.132*** | 0.104*** | 0.124*** | 0.122*** | 0.119*** |
| | (0.012) | (0.019) | (0.020) | (0.018) | (0.003) | (0.004) | (0.005) | (0.004) |
| | [0.053] | [0.052] | [0.024] | [0.052] | [0.011] | [0.011] | [0.007] | [0.011] |
| Observations | 32640 | 32640 | 32640 | 32640 | 32640 | 32640 | 32640 | 32640 |
| Panel B: Year-by-year estimates | | | | | | | | |
| Estimate for 2006 | -0.581*** | -0.571*** | -0.517*** | -0.579*** | 0.057*** | 0.072*** | 0.082*** | 0.072*** |
| | (0.045) | (0.051) | (0.048) | (0.050) | (0.005) | (0.006) | (0.007) | (0.006) |
| Estimate for 2007 | -0.154*** | -0.146*** | -0.032 | -0.137*** | 0.019*** | 0.054*** | 0.073*** | 0.047*** |
| | (0.042) | (0.043) | (0.039) | (0.040) | (0.006) | (0.009) | (0.009) | (0.008) |
| Estimate for 2008 | -0.091*** | -0.059** | 0.022 | -0.049* | 0.129*** | 0.159*** | 0.155*** | 0.150*** |
| | (0.023) | (0.029) | (0.033) | (0.028) | (0.005) | (0.007) | (0.008) | (0.006) |
| Estimate for 2009 | -0.204*** | -0.218*** | -0.156*** | -0.206*** | 0.079*** | 0.105*** | 0.101*** | 0.100*** |
| | (0.017) | (0.022) | (0.022) | (0.020) | (0.004) | (0.005) | (0.006) | (0.005) |
| Estimate for 2010 | -0.163*** | -0.143*** | -0.066** | -0.141*** | 0.102*** | 0.110*** | 0.111*** | 0.108*** |
| | (0.022) | (0.026) | (0.027) | (0.025) | (0.005) | (0.005) | (0.006) | (0.005) |
| Observations per year | 10880 | 10880 | 10880 | 10880 | 10880 | 10880 | 10880 | 10880 |

Notes: The table presents conditional differences in differences estimates of the effect of being in the sprayed region (relative to the exclusion areas)
on cultivation and spraying. Columns 1 and 5 present linear regressions. In columns 2 and 6 we reweight the data using the estimated propensity score. In columns 3 and 7 we stratify on the estimated propensity score. Finally, in columns 4 and 8 we match observations on the propensity score. The propensity score is estimated with a probit model using data for cultivation and spraying from 2000 to 2005 as explanatory variables. The top panel presents results pooling several years together, while the bottom panel presents year-by-year estimates. Standard errors robust against heteroskedasticity and serial correlation within cells are reported in parentheses. Estimates with *** are significant at the 1% level, those with ** at the 5% level, and those with * at the 10% level. Source: Authors’ analysis based on data from United Nations Office of Drugs and Crime, UNODC.

Consistency of the previous estimates requires the conditional expectation of cultivation and aerial spraying to be linear in the covariates. To relax this assumption, we follow several strategies in which we control nonparametrically for the propensity score, $\lambda_i = P[T_i = 1 \mid Z_i, \{Y_{it'}, S_{it'}\}_{t' \leq 2005}]$. We estimate the propensity score, $\hat{\lambda}_i$, using a probit model (not reported to save space), making this approach semiparametric.

In column 2 we reweight the regression in equation 5 by the propensity score (see Hirano, Imbens, and Ridder 2003). In particular, we weight observations in the sprayed area by $\pi/(1 - \pi)$, where $\pi$ is the fraction of cells in this area, and observations in the exclusion area by $\hat{\lambda}_i/(1 - \hat{\lambda}_i)$, with $\hat{\lambda}_i$ being the estimated propensity score of the grid. This method ensures that all covariates are balanced and set to the distribution of the sprayed region.
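The reweighting scheme for column 2 can be illustrated with a minimal sketch. It is not the authors' code: the toy data, the function names `att_weights` and `weighted_gap`, and the use of a known (rather than probit-estimated) propensity score are all assumptions made for illustration, and the sketch compares weighted means instead of running the full regression in equation 5.

```python
import numpy as np


def att_weights(treated, pscore):
    """Weights that map exclusion-area cells onto the sprayed-area distribution.

    Sprayed cells get the constant pi / (1 - pi), where pi is the share of
    sprayed cells; exclusion cells get pscore / (1 - pscore).
    """
    treated = np.asarray(treated, dtype=bool)
    pscore = np.asarray(pscore, dtype=float)
    pi = treated.mean()
    return np.where(treated, pi / (1.0 - pi), pscore / (1.0 - pscore))


def weighted_gap(values, treated, weights):
    """Weighted mean among sprayed cells minus weighted mean among exclusion cells."""
    values = np.asarray(values, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    weights = np.asarray(weights, dtype=float)
    mean_sprayed = np.average(values[treated], weights=weights[treated])
    mean_exclusion = np.average(values[~treated], weights=weights[~treated])
    return mean_sprayed - mean_exclusion


# Toy data (hypothetical): one binary pre-period covariate x, eight cells.
x = np.array([0, 0, 0, 0, 1, 1, 1, 1])
treated = np.array([1, 0, 0, 0, 1, 1, 1, 0], dtype=bool)
# Propensity score taken as known here: P(T=1 | x=0) = 0.25, P(T=1 | x=1) = 0.75.
pscore = np.where(x == 1, 0.75, 0.25)
# Outcome built with a true effect of -0.5 for sprayed cells.
y = 2.0 * x - 0.5 * treated

w = att_weights(treated, pscore)
raw_gap = weighted_gap(y, treated, np.ones_like(w))  # unweighted comparison
balance_gap = weighted_gap(x, treated, w)            # covariate gap after reweighting
att_estimate = weighted_gap(y, treated, w)           # reweighted outcome gap
print(raw_gap, balance_gap, att_estimate)
```

In this toy example the unweighted gap has the wrong sign because sprayed cells have higher values of the covariate, while the reweighted comparison balances the covariate and recovers the effect built into the data.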
Once reweighted, the regression estimate equals the average treatment effect on the treated, that is, on cells in the area currently sprayed.10 Besides reweighting by the propensity score, we also control linearly for all covariates in the regression. This is known as a double-robust regression, as it provides consistent estimates as long as either the propensity score is correctly specified or the linear covariates are an accurate model for the conditional expectation of spraying and cultivation (see Imbens and Wooldridge 2009). As can be seen in column 2, the results change little relative to column 1, suggesting that the linear controls were already capturing most of the relevant heterogeneity in cultivation and spraying dynamics before the intervention.11

10 We can also estimate the average treatment effect, but this requires weighting the observations in the sprayed area by $1/\hat{\lambda}_i$. However, there are cells with very low estimated propensity scores that make this exercise imprecise. In any case, our results are similar. Moreover, the average treatment effect on the treated seems like the relevant object to evaluate the actual effect of spraying on cultivation.

The role of reweighting the data by the propensity score can be seen graphically in figure 11. We plot the difference in cultivation and spraying for each year after reweighting the data using the propensity score as described above. The right panel shows that cultivation is now balanced between the sprayed and exclusion areas before 2006. A similar pattern emerges for spraying, although dynamics were already roughly balanced in the raw data.

Figure 11.
Re-Weighted Difference in Coca Cultivation and Spraying between the Sprayed and Exclusion Areas from 2000 to 2010 Relative to the Average before 2006
Notes: We weight observations in the exclusion area by the estimated likelihood ratio based on the estimated propensity score.
Source: Authors’ analysis based on data from United Nations Office of Drugs and Crime, UNODC.

In column 3 we follow another strategy and stratify on the propensity score as in Angrist (1998) and Dehejia and Wahba (1999). In particular, we group observations by their propensity score in twenty equal bins covering the (0,1) interval. The $j$-th bin contains grids with an estimated propensity score between $(j - 1) \times 0.05$ and $j \times 0.05$. For each bin, we estimate equation 5 separately and use a weighted average of these estimates to obtain an estimate for $\beta_t$. We obtain the variance of this estimate as a weighted sum of the variances for each bin as well. We weight each estimate by the number of observations in the bin from the sprayed region. This guarantees that we estimate the average treatment effect on the treated. This approach has the advantage of not imposing any functional form on the conditional expectation as a function of the propensity score, but of course is limited by the size of our bins. Again, we control locally for all covariates when estimating equation 5 for each bin. This partly controls for differences in the propensity score within bins and misspecification of the propensity score. Our results are similar to the basic regression estimates in column 1, though we find a smaller reduction in cultivation.

Finally, in column 4 we do kernel matching on the propensity score. This works by finding, for each grid in the sprayed region, others in the exclusion area within a band around its estimated propensity score and weighting them by a triangular kernel that assigns less weight to distant grids.
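As a rough illustration of the kernel step just described, the sketch below computes triangular-kernel weights for a single sprayed cell and averages the resulting matched gaps. The function names, the toy numbers, and the bandwidth of 0.1 are hypothetical; the text does not report the bandwidth used, and the sketch omits the linear covariate controls.

```python
import numpy as np


def triangular_weights(p_sprayed, p_exclusion, bandwidth):
    """Normalized triangular-kernel weights on exclusion cells for one sprayed cell."""
    distance = np.abs(np.asarray(p_exclusion, dtype=float) - p_sprayed) / bandwidth
    raw = np.clip(1.0 - distance, 0.0, None)  # zero weight outside the band
    total = raw.sum()
    return raw / total if total > 0 else raw


def kernel_matching_att(y_sprayed, p_sprayed, y_exclusion, p_exclusion, bandwidth):
    """Average gap between each sprayed cell and its kernel-weighted comparison."""
    y_exclusion = np.asarray(y_exclusion, dtype=float)
    gaps = []
    for y_i, p_i in zip(y_sprayed, p_sprayed):
        w = triangular_weights(p_i, p_exclusion, bandwidth)
        if w.sum() == 0:
            continue  # no exclusion cell inside the band: this sprayed cell is dropped
        gaps.append(y_i - np.dot(w, y_exclusion))
    return float(np.mean(gaps))


# Hypothetical propensity scores and outcomes for four exclusion cells.
p_exclusion = [0.50, 0.55, 0.70, 0.90]
y_exclusion = [1.2, 1.8, 5.0, 9.0]

weights = triangular_weights(0.50, p_exclusion, bandwidth=0.1)
att = kernel_matching_att([1.0], [0.50], y_exclusion, p_exclusion, bandwidth=0.1)
print(weights, att)
```

Only the two exclusion cells within the band receive positive weight, and the closer one counts more, which is the sense in which the kernel "assigns less weight to distant grids".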
Reweighting the regression using these weights produces an estimate of the average treatment effect on the treated. The reweighting guarantees that every grid in the sprayed region is compared to an average of grids with similar propensity scores in the exclusion region and, thus, controls nonparametrically for the propensity score. Again, we also include the covariates in the regression linearly, which controls partly for differences in the propensity score within the kernel of an observation. Our results vary little with respect to the traditional regression estimates in column 1.

11 We report the usual regression standard errors clustering at the grid level. These errors ignore the fact that the propensity score is estimated in a previous stage. However, as suggested by Hirano, Imbens, and Ridder (2003), these standard errors are actually conservative relative to adjusted ones. We compute an alternative set of bootstrapped standard errors taking into account the estimation of the propensity score and obtain slightly smaller standard errors (not reported).

Columns 5 to 8 present analogous estimates using the likelihood of spraying as the dependent variable. We find a large increase in the likelihood of spraying in the sprayed region, especially after 2008, and a smaller increase in 2006 and 2007, consistent with the fact that the diplomatic friction began in 2006 but became binding only from 2008 onwards. When pooling the years 2006 to 2010 together in column 5, we find that cells in the sprayed area were indeed 7.7 percentage points more likely to be sprayed (standard error = 0.002), independently of their past paths of cultivation. Thus, the diplomatic friction caused a clear change in the likelihood of enforcement among cells on different sides of the 10 km band near the frontier.
Overall, the estimates in this section suggest that grids in the sprayed region were approximately 10 percentage points more likely to be sprayed from 2006 to 2010, independently of their past levels of cultivation or observable characteristics. As a consequence, farmers in the region reduced cultivation by approximately 0.22 hectares per square kilometer.12

V. Cost Benefit Analysis of Aerial Spraying

As discussed in the introduction, the aerial spraying program is the largest component among the supply reduction efforts implemented under Plan Colombia. Between 2000 and 2008, $585 million was allocated to the eradication program, whereas $62.5 million was allocated to air interdiction, $89.3 million to coastal and river interdiction by the military forces, and $152.7 million to interdiction activities carried out by the Colombian Police (see U.S. Government Accountability Office, GAO 2008).

Our regression discontinuity estimates suggest that a 10 percentage point increase in the likelihood of spraying a 1 square km grid leads to a reduction of approximately 0.3 hectares of coca. Our conditional differences in differences estimates point to a reduction of approximately 0.22 hectares. Since 1 square kilometer contains 100 hectares, a 10 percentage point increase in the likelihood of spraying corresponds to about 10 additional hectares sprayed per grid, so these estimates imply that to reduce cultivation by 1 hectare during a given year, between 33 and 45 hectares would have to be sprayed. It is estimated that the average direct cost to the United States per hectare sprayed is about $750 (see Walsh et al. 2008). Thus, reducing cultivation by 1 hectare through financing spraying campaigns costs the United States between $24,750 and $33,750. Additionally, for every dollar spent by the United States, Colombia spends about 2.2 dollars (aerial spraying campaigns are jointly financed by the two countries), making the overall cost range between $79,200 and $108,000 per hectare of coca crops reduced.
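The per-hectare cost arithmetic above can be retraced with a short sketch. The inputs are the figures quoted in the text; the function names are hypothetical, and truncating to whole hectares is an assumption made here because it reproduces the 33 and 45 hectares reported.

```python
import math

HECTARES_PER_KM2 = 100
US_COST_PER_HA_SPRAYED = 750   # dollars per hectare sprayed, as cited in the text
COLOMBIA_PER_US_DOLLAR = 2.2   # Colombian spending per U.S. dollar, as cited in the text


def hectares_sprayed_per_hectare_reduced(prob_increase_pp, reduction_ha):
    """Expected hectares sprayed per hectare of coca reduced in a 1 km2 cell."""
    # A 10 percentage point increase on a 100-hectare cell is 10 expected hectares.
    extra_ha_sprayed = prob_increase_pp * HECTARES_PER_KM2 / 100
    # Assumption: truncate to whole hectares, which matches the figures in the text.
    return math.floor(extra_ha_sprayed / reduction_ha)


def cost_per_hectare_reduced(ha_sprayed):
    """Return (U.S. cost, U.S. plus Colombian cost) per hectare of coca reduced."""
    us_cost = ha_sprayed * US_COST_PER_HA_SPRAYED
    total_cost = us_cost * (1 + COLOMBIA_PER_US_DOLLAR)
    return us_cost, total_cost


for label, reduction in [("regression discontinuity", 0.3), ("conditional DiD", 0.22)]:
    ha = hectares_sprayed_per_hectare_reduced(10, reduction)
    us_cost, total_cost = cost_per_hectare_reduced(ha)
    print(f"{label}: {ha} ha sprayed, US ${us_cost:,.0f}, total ${total_cost:,.0f}")
```

The larger estimated reduction (0.3 hectares) gives the lower bound of each cost range, and the smaller one (0.22 hectares) gives the upper bound.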
To put these numbers in perspective, the coca leaf in one hectare produces about 1.2 kg of cocaine per harvest, with a farmgate market value of about $4,200.

12 The fact that this is smaller than our regression discontinuity estimates could be due to two things. First, they are, in theory, different objects. In columns 2 to 4 of table 3 we estimate an average treatment effect on the sprayed grids (and the regression in column 1 produces a mix between the ATT and ATE), while the regression discontinuity estimates a local effect. Second, there may be more reallocation of crops to the exclusion region near the 10 km cutoff. This implies that this indirect margin is more relevant for the regression discontinuity estimates, making them overstate the deterrent effect.

From a drug policy perspective, it is more informative to calculate the benefits in terms of the reduction of kilograms of cocaine in consumer markets. We do not have estimates of the social benefits of such reductions, but at least we can compare the cost to that of other policies achieving a similar objective. To do so, we use the estimates in Mejía and Restrepo (2013) obtained by calibrating a model of downstream cocaine markets. The authors find that a 1% reduction in coca cultivation reduces cocaine in consumer markets by 0.0025%. This elasticity is small for several reasons. First, cultivation represents only a small fraction of the total market value of cocaine in consumer markets. Thus, an increase in the price of coca leaf caused by spraying translates into a small increase in consumer prices. Second, demand is inelastic, so the small increase in prices barely affects consumption.
Finally, downstream markets adjust to the shock by substituting towards cheaper inputs of production, such as better chemical precursors and technologies to produce more cocaine per hectare, by demanding more cocaine from other source countries, or by switching to better transportation techniques, partially offsetting the effect of the shock on the supply of cocaine.

Total coca cultivation in Colombia was about 80,000 hectares during our period of analysis. Reducing this by 1% (800 hectares) would cost the United States $20–$27 million per year. However, this investment would reduce the supply of cocaine in its territory only by about 0.0025%, which equals 12.5 kg. This implies that the marginal cost to the United States of reducing retail quantities of cocaine by 1 kg by subsidizing aerial spraying in Colombia is on the order of $1.6–$2.16 million. These are large magnitudes but are similar to the estimates reported by Mejía and Restrepo (2013) using an entirely different methodology. To put them in perspective, the retail price of cocaine is about $150,000 per kilogram.

The conclusion from this exercise is that aerial spraying is a very costly policy from a supply-reduction perspective. In particular, the policy is significantly more costly than other alternatives achieving the same objective. The marginal cost to the United States of reducing retail quantities of cocaine by 1 kg is estimated at $175,000 by subsidizing interdiction policies in Colombia (Mejía and Restrepo 2013), or $8,250 and $68,750 by funding treatment and prevention efforts, respectively, in the United States (MacCoun and Reuter 2001). Thus, despite being able to reduce coca cultivation by affecting farmers’ incentives, aerial spraying has only small effects on cultivation.
These effects translate to even smaller effects on downstream markets for the reasons emphasized above, making it a costly supply-reduction policy. If on top of that we add the share of the costs paid by Colombia and the alleged negative effects on health (Camacho and Mejía 2015), other legal crops, the environment (see Relyea 2005 and Dávalos et al. 2011), and the socio-economic conditions of coca-producing areas (see Rozo 2014), the policy looks even less favorable.

VI. Concluding Remarks

In this paper we explored the deterrent effects of enforcement on illegal behavior. We did so in the context of illegal coca cultivation in Colombia. We find that aerial spraying of coca crops, a particular type of enforcement aimed at disrupting the production of an illicit good (cocaine), induces farmers to reduce coca cultivation. Our findings are aligned with the key insight from the economic analysis of crime, suggesting that the decision to engage in illegal activities is rational and, as such, responds to the likelihood of enforcement.

Our main contribution is to present a clean and credible source of identification for the effects of enforcement on illegal markets. In particular, we exploit a diplomatic friction between the governments of Colombia and Ecuador over the possible negative effects of spraying campaigns in the Colombian territory bordering Ecuador. This diplomatic friction ended in a compromise in which the Colombian government agreed to stop spraying campaigns with glyphosate within a 10 km band along the border with Ecuador in 2006.

We use a regression discontinuity design, exploiting the arbitrary 10 km cutoff, and a conditional differences in differences estimator comparing similar cells with different treatment probabilities to uncover the causal effects of spraying on coca cultivation. Both methodologies point to a negative and significant effect of the program on coca production.
In particular, both methodologies show that cells in the region that continued to be sprayed were approximately 10 percentage points more likely to be sprayed than cells in the exclusion area. In consequence, coca cultivation decreased in this region by about 0.3 hectares (regression discontinuity estimates) or 0.22 hectares (conditional differences in differences estimate) per square kilometer.

Despite reducing coca cultivation, aerial spraying in Colombia has only small effects in downstream markets. We estimate that reducing Colombian coca cultivation by 1% (about 800 hectares) would cost the United States between $20 and $27 million per year. However, this investment would reduce the supply of cocaine in its territory by only 12.5 kg per year. Hence, the cost of reducing cocaine retail supply by 1 kg via aerial spraying campaigns is at least $1.6 million. Other policies, such as treatment and prevention, or interdiction efforts in Colombia, would be significantly more cost effective in curbing drug supply.

While this version of the paper was being completed, the Colombian government announced that it would stop the aerial spraying program. The decision was taken based on the possible health effects that the program might be having on the populations exposed to the herbicide used in the aerial spraying campaigns. The findings in this paper indicate that, on top of its negative collateral consequences on health, the aerial spraying program is not a cost-effective strategy in reducing cocaine supply. Thus, the Colombian government’s decision is unlikely to cause a large surge in cocaine supply.

References

Abadie, A., and G. Imbens. 2002.
“Instrumental Variables Estimates of the Effect of Subsidized Training on the Quantiles of Trainee Earnings.” Econometrica 70 (1): 91–117.
———. 2006. “Large Sample Properties of Matching Estimators for Average Treatment Effects.” Econometrica 74 (1): 235–67.
———. 2011. “Matching on the Estimated Propensity Score.” National Bureau of Economic Research, Working Paper N.15301.
Abadie, A., J. Angrist, and G. Imbens. 2010. “On the failure of Bootstrap for Matching Estimators.” Econometrica 76 (6): 1537–57.
Andreoni, J., B. Erard, and J. Feinstein. 1998. “Tax Compliance.” Journal of Economic Literature 36 (2): 818–60.
Angrist, J. 1998. “Estimating the Labor Market Impact of Voluntary Military Service Using Social Security Data on Military Applicants.” Econometrica 66 (2): 249–88.
Angrist, J., and J.-S. Pischke. 2009. Mostly Harmless Econometrics. Princeton University Press, New Jersey.
Bar-Ilan, A., and B. Sacerdote. 2001. “The Response to Fines and Probability of Detection in a Series of Experiments.” National Bureau of Economic Research, Working Paper No. 8638.
Becker, G. S. 1968. “Crime and Punishment: An Economic Approach.” Journal of Political Economy 76 (2): 169–217.
Benabou, R., and J. Tirole. 2003. “Intrinsic and Extrinsic Motivation.” The Review of Economic Studies 70 (3): 489–520.
———. 2006. “Incentives and Prosocial Behavior.” The American Economic Review 96 (5): 1652–78.
Beron, K., H. Tauchen, and A. Witte. 1992. “The Effect of Audits and Socioeconomic Variables on Compliance.” In J. Slemrod, ed., Why People Pay Taxes. Ann Arbor: Univ. of Michigan Press.
Bogliacino, F., and A. Naranjo. 2012. “Coca Leaves Production and Eradication: A General Equilibrium Analysis.” Economics Bulletin 32 (1): 382–97.
Buonanno, P., and G. Mastrobuoni. 2012. “Police and Crime: Evidence from Dictated Delays in Centralized Police Hiring.” Institute for the Study of Labor, Discussion Paper N. 6477.
Caliendo, M., and S. Kopeinig. 2005.
“Some Practical Guidance for the Implementation of Propensity Score Matching.” Institute for the Study of Labor, Discussion Paper N. 1588.
Camacho, A., and D. Mejía. 2015. “The Health Consequences of Aerial Spraying of Illicit Crops: The Case of Colombia.” Center for Global Development WP 408, June.
Cameron, S. 1988. “The Economics of Crime Deterrence: A Survey of Theory and Evidence.” Kyklos 41 (2): 301–23.
Conley, T. G. 1999. “GMM Estimation with Cross Sectional Dependence.” Journal of Econometrics 92 (1): 1–45.
Corman, H., and N. H. Mocan. 2000. “A Time-Series Analysis of Crime, Deterrence, and Drug Abuse in New York City.” American Economic Review 90 (3): 584–604.
Dehejia, R. 2004. “Program Evaluation as a Decision Problem.” Journal of Econometrics 125: 141–73.
Dehejia, R., and S. Wahba. 1999. “Causal Effects in Nonexperimental Studies: Reevaluating the Evaluation of Training Programs.” Journal of the American Statistical Association 94 (448): 1053–62.
Di Tella, R., and E. Schargrodsky. 2004. “Do Police Reduce Crime? Estimates Using the Allocation of Police Forces After a Terrorist Attack.” American Economic Review 94 (1): 115–33.
Dion, M. L., and K. Russler. 2008. “Eradication efforts, the State, Displacement and Poverty: Explaining Coca Cultivation in Colombia During Plan Colombia.” Journal of Latin American Studies 40 (3): 399–421.
Draca, M., S. Machin, and R. Witt. 2011. “Panic on the Streets of London: Police, Crime and the July 2005 Terror Attacks.” American Economic Review 101 (5): 2157–81.
Dubin, J., M. Graetz, and L. Wilde. 1987. “Are We a Nation of Tax Cheaters? New Econometric Evidence on Tax Compliance.” American Economic Review 77: 240–45.
Eck, J., and E. Maguire. 2000. “Have Changes in Policing Reduced Violent Crime? An Assessment of the Evidence.” In A. Blumstein, and J.
Wallman, eds., The Crime Drop in America. New York: Cambridge University Press, 207–65.
Ehrlich, I. 1973. “Participation in Illegitimate Activities: A Theoretical and Empirical Investigation.” Journal of Political Economy 81 (3): 521–65.
Evans, W., and E. Owens. 2007. “COPS and Crime.” Journal of Public Economics 91 (1–2): 181–201.
Frey, B. 1997. Not Just for the Money: An Economic Theory of Personal Motivation. Cheltenham, UK: Edward Elgar Publishing.
Garcia, J. F., D. Mejía, and D. Ortega. 2013. “Police Reform, Training and Crime: Experimental evidence from Colombia’s Plan Cuadrantes.” Documento CEDE.
Gelman, A., and G. Imbens. 2014. “Why High-Order Polynomials should not be used in Regression Discontinuity Designs.” NBER, Working Paper No. 20405.
GAO. 2008. “PLAN COLOMBIA Drug Reduction Goals Were Not Fully Met, but Security has Improved.” United States Government Accountability Office, document 00-71, October.
Heckman, J., H. Ichimura, J. Smith, and P. Todd. 1998. “Characterizing Selection Bias Using Experimental Data.” Econometrica 66: 1017–98.
Hirano, K., G. Imbens, and G. Ridder. 2003. “Efficient Estimation of Average Treatment Effects Using the Estimated Propensity Score.” Econometrica 71 (4): 1161–89.
Hsiang, S. 2010. “Temperatures and Cyclones Strongly Associated with Economic Production in the Caribbean and Central America.” Proceedings of the National Academy of Sciences 107 (35).
Ibanez, M., and F. Carlsson. 2010. “A Survey-Based Choice on Coca Cultivation.” Journal of Development Economics 93: 249–73.
Imbens, G., and J. Wooldridge. 2009. “Recent Developments in the Econometrics of Program Evaluation.” Journal of Economic Literature 47 (1): 5–86.
Imbens, G., and K. Kalyanaraman. 2012. “Optimal Bandwidth Choice for the Regression Discontinuity Estimator.” The Review of Economic Studies Advance Access, pp: 1–27.
Imbens, G., and T. Lemieux. 2008. “Regression Discontinuity Designs: A Guide to Practice.” Journal of Econometrics 142 (2): 615–35.
Kagan, R.
1989. “On the Visibility of Income Tax Law Violations.” In J. A. Roth, and J. T. Scholz, eds., Taxpayer Compliance: Social Science Perspectives, 107.
Klepper, S., and D. Nagin. 1989. “Tax Compliance and Perceptions of the Risks of Detection and Criminal Prosecution.” Law & Society Review 23: 209–40.
Lee, D. S., and T. Lemieux. 2009. “Regression Discontinuity Design in Economics.” National Bureau of Economic Research, Working Paper 14723.
Levitt, S. 1997. “Using Electoral Cycles in Police Hiring to Estimate the Effect of Police on Crime.” American Economic Review 87 (3): 270–90.
———. 2002. “Using Electoral Cycles in Police Hiring to Estimate the Effect of Police on Crime: Reply.” American Economic Review 92 (4): 1244–50.
Ludwig, J., and D. Miller. 2005. “Does Head Start Improve Children’s Life Chances? Evidence from a Regression Discontinuity Design.” National Bureau of Economic Research, Working Paper 11702.
Marvell, T., and C. Moody. 1996. “Specification Problems, Police Levels, and Crime Rates.” Criminology 34 (4): 609–46.
McCrary, J. 2002. “Using Electoral Cycles in Police Hiring to Estimate the Effect of Police on Crime: Comment.” American Economic Review 92 (4): 1236–43.
———. 2008. “Manipulation of the Running Variable in the Regression Discontinuity Design: A Density Test.” Journal of Econometrics 142 (2): 698–714.
MacCoun, R. J., and P. Reuter. 2001. Drug War Heresies: Learning from Other Vices, Times, and Places. Cambridge, UK: Cambridge University Press.
Mejía, D., and D. Rico. 2011. “The Microeconomics of Cocaine Production and Trafficking in Colombia.” In A. Gaviria, and D. Mejía, eds., Anti-drug Policies in Colombia: Successes, Failures and Lost Opportunities, ch. 1. Bogotá: Ediciones UniAndes.
Mejía, D., and P. Restrepo. 2012.
“The War on Illegal Drugs in Producer and Transit Countries: A Simple Analytical Framework.” In C. Costa Storti, and P. De Grauwe, eds., Illicit Trade and the Global Economy, ch. 10. Cambridge, MA: CESifo–MIT Press.
———. 2013. “The Economics of the War on Illegal Drug Production and Trafficking.” Documento CEDE No. 54.
———. 2013b. “Bushes and Bullets: Illegal Cocaine Markets and Violence in Colombia.” Documento CEDE No. 53.
Menninger, K. 1968. The Crime of Punishment. New York: Viking Press.
Moreno-Sanchez, R., D. Kraybill, and S. Thompson. 2003. “An Econometric Analysis of Coca Eradication Policy in Colombia.” World Development 31 (2): 375–83.
Moya, A. 2005. Impacto de la Erradicación Forzosa y el Desarrollo Alternativo Sobre los Cultivos de Hoja de Coca. Masters thesis. Facultad de Economía, Universidad de Los Andes, Bogotá, Colombia.
Relyea, R. 2005. “The Impact of Insecticides and Herbicides on Biodiversity and Productivity of Aquatic Communities.” Ecological Society of America 15 (2): 618–27.
Reyes, L. 2014. “Estimating the Causal Effect of Forced Eradication on Coca Cultivation in Colombian Municipalities.” World Development 61: 70–84.
Rosenbaum, P., and D. Rubin. 1983. “The Central Role of the Propensity Score in Observational Studies for Causal Effects.” Biometrika 70 (1): 41–50.
Rozo, S. V. 2014. “On the Unintended Consequences of Anti-Drug Programs in Producing Countries.” Association for Public Policy Analysis and Management. http://www.appam.org/assets/1/7/On_the_Unintended_Effects_of_Spraying.pdf
Stigler, G. J. 1970. “The Optimum Enforcement of Laws.” Journal of Political Economy 78: 526–36.
Stuart, E. 2010. “Matching Methods for Causal Inference: A Review and a Look Forward.” Statistical Science 25 (1): 1–21.
UNODC. 2012. World Drug Report. New York: United Nations Publications.
Walsh, J., G. Sanchez, and Y. Salinas. 2008.
“La Aspersión Aérea de Cultivos de Uso Ilícito en Colombia: Una Estrategia Fallida.” Washington Office for Latin America, Washington, DC.

THE WORLD BANK, 1818 H Street, NW, Washington, DC 20433, USA. World Wide Web: http://www.worldbank.org/ E-mail: wber@worldbank.org ISBN 978-0-19-880920-3