THE WORLD BANK ECONOMIC REVIEW
Volume 19, Number 2, 2005
Editor: Jaime de Melo, University of Geneva
The World Bank Economic Review (ISSN 0258-6770) is published by Oxford University Press for the International Bank for Reconstruction and Development / THE WORLD BANK.
Copyright © The International Bank for Reconstruction and Development / THE WORLD BANK 2005. All rights reserved.
The Varieties of Resource Experience: Natural Resource Export Structures and the Political Economy of Economic Growth

Jonathan Isham, Michael Woolcock, Lant Pritchett, and Gwen Busby

Many oil, mineral, and plantation crop-based economies experienced a substantial deceleration in growth following the commodity boom and bust of the 1970s and early 1980s. This article illustrates how countries dependent on point source natural resources (those extracted from a narrow geographic or economic base, such as oil and minerals) and plantation crops are predisposed to heightened economic and social divisions and weakened institutional capacity. This in turn impedes their ability to respond effectively to shocks, which previous studies have shown to be essential for sustaining rising levels of prosperity. Analysis of data on classifications of export structure, controlling for a wide array of other potential determinants of governance, shows that point source exporting countries and coffee and cocoa exporting countries do relatively poorly across an array of governance indicators. These governance effects are not associated simply with being a natural resource exporter. Countries with natural resource exports that are diffuse (relying primarily on livestock and agricultural produce from small family farms) do not show the same strong effects and have had more robust growth recoveries.

Jonathan Isham is assistant professor of Economics and Environmental Studies at Middlebury College; his email address is jisham@middlebury.edu. Michael Woolcock is senior social scientist in the Development Research Group at the World Bank and lecturer in Public Policy at the Kennedy School of Government at Harvard University; his email address is mwoolcock@worldbank.org. Lant Pritchett is lead socioeconomist in the South Asia Social Development Unit at the World Bank; his email address is lpritchett@worldbank.org.
Gwen Busby is a graduate student in the Department of Forest Resources at Oregon State University; her email address is gwenbusby@oregonstate.edu. The authors thank William Easterly, Dani Kaufmann, Michael Ross, and Michael Schott for their rapid and informative sharing of data and ideas and Richard Auty, Jean-Philippe Stijns, Phani Wunnava, four anonymous referees, and participants at seminars at Cornell University, Lancaster University, Middlebury College, the World Bank, the University of Cambridge, and United Nations University, World Institute for Development Economics Research (UNU/WIDER) for useful comments. They also thank Maya Tudor for research assistance and the World Bank's Research Support Budget and Middlebury College's Department of Economics and Program in Environmental Studies for research support. A previous draft of this article (Woolcock, Pritchett, and Isham 2001) was prepared for and sponsored by the UNU/WIDER Project on Resource Abundance and Economic Growth.

THE WORLD BANK ECONOMIC REVIEW, VOL. 19, NO. 2, pp. 141-174. doi:10.1093/wber/lhi010. Advance Access publication September 28, 2005. © The Author 2005. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oupjournals.org.

The rentier state is a state of parasitic, decaying capitalism, and this circumstance cannot fail to influence all the socio-political conditions of the countries concerned.
-Vladimir Lenin, Imperialism, the Highest Stage of Capitalism

It matters whether a state relies on taxes from extractive industries, agricultural production, foreign aid, remittances, or international borrowing because these different sources of revenues, whatever their relative economic merits or social import, have powerful (and quite different) impact on the state's institutional development and its abilities to employ personnel, subsidize social and economic programs, create new organizations, and direct the activities of private interests. Simply stated, the revenues a state collects, how it collects them, and the uses to which it puts them define its nature.
-Terry Karl, The Paradox of Plenty

It is useful to contrast the conduct of governments in resource-rich nations with that of governments in nations less favorably endowed. In both, governments search for revenues; but they do so in different ways. Those in resource-rich economies tend to secure revenues by extracting them; those in resource-poor nations, by promoting the creation of wealth. Differences in natural endowments thus appear to shape the behavior of governments.
-Robert Bates, Prosperity and Violence

Is oil wealth a blessing or a curse? Norway provides an encouraging example, but Azerbaijanis are rightly concerned whether their country can handle the potential bonanza from newly discovered oil fields. While government officials have promised that oil revenues will go to schools, hospitals, and roads, no formal plans are in the offing; meanwhile, neighboring Caspian Sea nations are despotically ruled, ethnically divided, and weakened by corruption, problems some fear will be made worse by oil.¹ The controversy over construction of the oil pipeline in Chad demonstrates that even in an extraordinarily poor country, not all believe that additional wealth pouring into government coffers will lead to better times.
Similarly, after the recent discovery of oil reserves off the coast of São Tomé and Príncipe, the leader of a short-lived coup demanded that the oil revenues be used to benefit the nation's entire population. After the government was restored, Prime Minister Maria das Neves stated: "Oil could be our heaven, purgatory or hell; it all depends on how São Tomé faces up to this challenge" (Agence France Press 2004).

Both resource scarcity and abundance have been cited as a primary cause of civil war. Some have argued that land scarcity is behind the Rwandan conflicts (Diamond 2005; Klare 2001), but resource-rich countries have not escaped civil strife. Countries such as Angola have been embroiled in conflict since the mid-1970s, and the problem there is not scarce land but rather abundant sources of oil and some of the world's best diamonds (Campbell 2002). Just as revenues from diamonds, timber, coffee, and gold in the eastern half of the country strengthened (then) Zaire's elite, revenues from coltan (columbite-tantalite) are now strengthening the rebel Rally for Congolese Democracy.² Rebels in Sierra Leone are financed by revenues from diamond mines and may be fighting over nothing else but control over the mines.

1. According to the former chief UN representative in Azerbaijan, "This wealth . . . will create a lot of problems. It will increase the already substantial gap between the rich and poor, and eventually it will affect political stability" (Kinzer 1999).

2. Coltan has recently been declared the "wonder mineral of the moment" (Vick 2001, p. A1). When processed, it is vital for the manufacture of capacitors and other high-tech products.
What mechanisms might explain the conditions under which resource abundance becomes a problem rather than part of a solution to development? This article adds to the burgeoning literature on natural resources and performance by documenting one way in which countries' sources of export revenue affect economic growth.³ The novelty in this article is to show that export concentration in what is here called point source natural resources (those extracted from a narrow geographic or economic base, such as oil, minerals such as copper and diamonds, and plantation crops such as sugar and bananas) is strongly associated with weak public institutions, which are, in turn, strongly associated with slower growth.⁴ This article presents econometric evidence to support the hypothesis not only that institutional capacity to handle shocks is a determinant of economic growth since the commodity shocks of the 1970s and 1980s (Rodrik 1999) but also that institutional capacity itself varies and that export structures influence socioeconomic and political institutions.

The growth performance facts that the analysis is trying to (partially) explain are shown in figures 1 and 2. Smoothed over three years, the median annual growth rate of GDP per capita for 90 developing economies from the early 1960s to the late 1970s was consistently above 2 percent (figure 1). But since 1980, developing economies have endured a growth collapse of Grand Canyon proportions, with growth well below 1 percent for the early 1980s and remaining below 2 percent until the mid-1990s. The collapse is even more striking when the growth performance is shown for the 90 developing economies classified by their export structure (defined shortly) in 1985 (figure 2). Countries that were exporters of manufactures have experienced no growth deceleration.
All natural resource exporters suffered substantial slowdowns, but the deceleration was much more severe and lasted much longer for point source and coffee or cocoa exporters than for countries whose principal exports were diffuse. Why? This article focuses on the variety of growth experiences associated with reliance on different sources of export revenue. It shows that the composition of natural resource exports influences the quality of political institutions and that these in turn shape growth performance.

FIGURE 1. Smoothed Median per Capita Growth Rates in 90 Developing Economies, 1955-97
Source: Woolcock and others (2001).

FIGURE 2. Smoothed Median Growth Rates for 90 Developing Economies, 1957-97
Source: Woolcock and others (2001).

3. The most recent literature on the effects of natural resources on growth includes Auty (1995, 2001b); Leamer and others (1999); Leite and Weidmann (1999); Ross (1999, 2001); Sachs and Warner (1995, 1999); Stijns (2001); Nugent and Robinson (2001); Gylfason (2001); Gylfason and Zoega (2001, 2002); Lederman and Maloney (2002); Easterly and Levine (2002); Murshed (2003); Sala-i-Martin and Subramanian (2003); Neumayer (2004); and Papyrakis and Gerlagh (2004).

4. Rodrik and others (2004) and Rigobon and Rodrik (2004) are the latest in a decade-long set of publications to establish, with cross-sectional data, the connection between institutions and economic performance.

Given the distinguished roster of theoretical and econometric publications that have addressed connections among natural resources, institutions, and economic performance, four caveats apply to this article's place in the literature. First, the article makes no claim that
natural resources affect growth solely through institutions: Dutch disease has been convincingly documented since at least Corden and Neary (1982). Second, the article does not offer any novel claims about the relative importance of institutions for economic performance: for at least a decade a range of econometric studies have fruitfully explored this link (from Knack and Keefer 1995 through Rigobon and Rodrik 2004). Third, the article does not suggest that the empirical results reported herein are the test of some particular model; rather, they are consistent with a variety of possible models.⁵ Finally, the results focus on modern economic history, rather than seeking to explain longer term growth trajectories (though, as will be shown, the arguments are broadly consistent with those presented in studies that do undertake such a challenge).

The article next discusses the literature on natural resources and growth, in particular the range of hypotheses that are consistent with a link between resource composition and governance. It then discusses the two measures of export structure and shows the link between these and indicators of governance, completing the circle by showing the link between indicators of governance and economic growth. The final section offers some speculations for policy.
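The smoothed median growth series described for figures 1 and 2 can be illustrated with a short sketch. It takes the cross-country median growth rate in each year and then applies a centered three-year moving average. This is only one plausible reading of "smoothed over three years"; the function name, the input layout, and the toy numbers are assumptions made for illustration and are not the authors' procedure or data.

```python
from statistics import median


def smoothed_median_growth(growth_by_country, window=3):
    """Centered moving average of the yearly cross-country median growth rate.

    growth_by_country: {country: {year: growth rate in percent}} (hypothetical layout).
    Assumes the years are consecutive and that `window` is odd.
    """
    # Collect every year that has at least one observation.
    years = sorted({y for series in growth_by_country.values() for y in series})
    # Median across the countries observed in each year.
    yearly_median = {
        y: median(s[y] for s in growth_by_country.values() if y in s)
        for y in years
    }
    half = window // 2
    smoothed = {}
    # Years at either edge lack a full window and are skipped.
    for i in range(half, len(years) - half):
        window_years = years[i - half: i + half + 1]
        smoothed[years[i]] = sum(yearly_median[y] for y in window_years) / window
    return smoothed


# Toy illustration with made-up numbers (not taken from the article).
toy = {
    "A": {1980: 2.0, 1981: 4.0, 1982: 0.0},
    "B": {1980: 1.0, 1981: 2.0, 1982: 3.0},
    "C": {1980: 3.0, 1981: 0.0, 1982: -3.0},
}
result = smoothed_median_growth(toy)
```

Applying the same function separately to the countries in each export category would give one series per category, in the spirit of figure 2.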
Over the past decade, a distinguished body of empirical literature has emerged in support of arguments that development trajectories are shaped by institutions and that institutional form and quality, in turn, are deeply embedded in history and geography.⁶ This work suggests that combinations of climate (disease vectors, rainfall levels, temperature), topography (soil and mineral quality, access to ports), and labor (degrees of scarcity and compliance) in the early colonial period interacted in different places with the profitability of natural resources to make it more or less necessary (or feasible) to build governance institutions geared toward the subjugation and control of a domestic population by an expatriate minority. In Latin America and Africa, this process led to the construction of highly concentrated and authoritarian political structures ("extraction colonies"), whereas in North America (except Mexico), Australia, and New Zealand it gave rise to more open and dispersed political structures that concomitantly accorded greater civic freedoms and stronger property rights ("settler colonies").⁷ Where extractive institutions were initially laid down, they soon consolidated themselves in ways that reduced the likelihood that over time they would have an interest in generating (or in being subjected to countervailing pressures to generate) either more diverse revenue (export) streams or more open political structures.

If this is so, one could plausibly argue that attempts to measure export structures and institutional quality in the late twentieth century, as is done here, are merely capturing paths of development laid down many decades before: endowments (broadly defined) may have had an important initial role, but in the intervening centuries it is the prevailing political institutions that have determined the export structures, not the other way around. For some, this leads to an interpretation that contemporary Russia and Mexico export oil not only because that consolidates the political power of prevailing elites but also because the associated long-standing fragility of their institutions (of all kinds) precludes the possibility of generating wealth from more technologically sophisticated (or diverse) sources.

5. Admittedly, this deviates from much of existing economic practice, but it does so deliberately. A common approach for journal articles is to write down one particular structural model that highlights one particular way in which resources affect politics, work out the comparative statics of that model only, and then test whether the comparative static predictions of that model are consistent with the data. If they are, a claim is then made that this validates that particular model. This is methodologically flawed. Any particular empirical test does not validate a particular model; it only rejects the class of models that are incapable of producing the associations in the data and therefore validates all models that are capable of generating the particular comparative static prediction. The following section shows that a large class of models predicts that economic structure conditions political structure, with subsequent feedbacks from the resulting political and institutional structures onto economic performance.

6. See, for example, Sachs and Warner (1999); Engerman and Sokoloff (1997); Acemoglu and others (2001, 2002, 2003); and Easterly and Levine (2002). This paragraph summarizes the general line of argument emerging from this work.

7. The concepts of extraction and settler colonies come from Acemoglu, Johnson, and Robinson (2001).
This account is fine insofar as it provides some novel and revisionist (because earlier generations of development economists confidently believed abundant natural resources to be a solid basis of prosperity) stylized facts of modern economic history, but it is less well positioned to explain variations in the development trajectories of countries with ostensibly similar "initial conditions." India, for example, was quintessentially an extraction colony, yet it now has a vibrant economy dominated by manufacturing and services. Argentina and Venezuela at the turn of the twentieth century, long after colonization ended, were among the richest countries in the world, yet they are now ranked below the top 60. Bangladesh has always been poor, but textiles provide its largest source of export revenue. This account is also singularly unhelpful in the realm of offering contemporary policy advice. (What can a low-income country do to avoid the "resource curse" if it happens to discover a large oil field?)⁸

More important, the emerging long-run storyline connecting institutional history to resource endowments is not inconsistent with several alternative (and more focused) explanations for the role of natural resource endowments in shaping growth trajectories. Ross's (2001) excellent empirical investigation into the mechanisms by which oil undermines democracy, for example, outlines several possible channels, or effects, through which oil influences political outcomes. Three such mechanisms drawn from his analysis are discussed here: a rentier effect, a delayed modernization effect, and an entrenched inequality effect, all consistent with a negative link between particular types of natural resources and government capacity.

8. From a strictly econometric point of view, it should also be noted that instruments (such as settler mortality) used to control for the initial quality of colonial institutions have recently been called into question.
Rentier Effects

Political scientists generally, and area specialists in particular, argue that certain natural resources undermine development through what they call rentier effects (Ross 2001).⁹ When revenues can be easily extracted from a few easily controlled sources, there are three consequences. First, for any given revenue target the state has less need for taxation of the population, and without the pressure for taxation the state has less need to develop mechanisms of deep control of the citizenry. By the same token, citizens have less incentive to create mechanisms of accountability and to develop the deep civil society and horizontal social associations that many feel are the preconditions of democracy (see, among others, Inglehart 1997; Lipset 1959; Moore 1966; Putnam 1993). Second, with the "exogenous" revenues the government can mollify dissent through a variety of mechanisms (buying off critics, providing the population with benefits, infrastructure projects, patronage, or outright graft). Third, the state has the resources to pursue direct repression and violence against dissenters.

Delayed Modernization

For influential scholars such as Moore (1966), the story of wealth, power, and political and economic transformation begins with some small group of elites owning the most valuable resources (usually land), from which they extract a surplus from the peasants (through serfdom, slavery, or feudal exactions). But then economic circumstances change so that industrialization becomes necessary. Modernization requires that some of the surpluses be transferred from existing activities to new industrial activities, that at least some of the labor be moved to the new activities, and that a more sophisticated system be put in place to manage the political pressures generated by urbanization and the demands of new semi-professional urban dwellers and business groups.
This combination of economic transformations sets off a series of shifts in political power that can lead in various directions, depending on how the coalitions play out between landed elite and rural producer and among urban labor, new industrialists, and the urban middle class. This process can move rapidly or more slowly and can lead to representative democracy, fascism, corporatism, Marxist dictatorship, or oligarchy (Moore 1966).

9. Some historians of the early modern state (such as Tilly 1992) argue that the increasing cost of modern armies led to greater demands on the state's ability to raise revenues, which led to one of several outcomes. States with access to foreign resources (the Spanish Crown, for example) did not have to extract resources from the domestic population and so did not develop any of the forms of the modern state. In other cases an accommodation was reached between the sovereign and other classes granting permission or assistance in taxation (England is the classic case), an increasingly powerful sovereign extracted resources directly (France is the classic case), or an inability to mobilize revenues because of conflicts between sovereign and nobles meant that eventually one got subsumed (Hungary and Poland are the classic cases). More recently, Acemoglu and others (2001, 2002) have used similar arguments in which the mortality of settlers plays a crucial role in determining the structure of economic production and hence institutions. In high-mortality environments, settlers concentrated only on rent extraction from high value-added products and hence did not "invest" in developing high-quality government institutions.

States controlling a point source resource would resist industrialization because it means creating alternative sources of power (urban labor, urban middle class, urban industrialists), which, as their power grows, will want to tax away (or just confiscate) the quasi-rents from the natural resources.
In the cross-section of levels, this implies that countries that today are still dominated by point source products are also likely to be dominated by elite politics of one type or another. In this case the high-income countries of the Organisation for Economic Co-operation and Development would be included in the analysis, because they have successfully made the transition from agricultural production to industrialization (and beyond) and in the process have created functioning democratic polities (although along very different paths: the U.K.-U.S. path to democracy is very different from the French, Prussian-German, or Japanese paths). Indeed, viewed over the span of the past hundred years, it is only quite recently that resource-poor countries have become systematically wealthier than resource-rich countries (see Auty 2001a, p. 5).

Political scientists have long argued that natural resource-dependent states tend to thwart secular modernization pressures (higher levels of urbanization, education, and occupational specialization) because their budget revenues are derived from a small workforce that deploys sophisticated technical skills that can be acquired only abroad (oil, for example, is extracted largely by foreign, not domestic, firms). As a result, neither economic imperatives nor workers themselves generate pressures for increased literacy, labor organizations, and political influence. Concomitantly, citizens are less able to effectively and peacefully voice their collective interests, preferences, and grievances (even in nominally democratic countries, such as Jamaica and Zimbabwe). In short, resource abundance simultaneously strengthens states and weakens societies and thus yields, or at least perpetuates, low levels of development (see also Migdal 1988).¹⁰

Entrenched Inequality

The entrenched inequality effect refers to the ways in which export composition influences economic and political outcomes by affecting the social structure.
Economic historians Engerman and Sokoloff (1997; see also Sokoloff and Engerman 2000, 2002) argue that the diverging growth trajectories of South and North America over the past 200 years can be explained in part by the types of crops grown, the property rights regimes enacted to secure their sale, and the timing and nature of decolonization.¹¹ In North America, crops such as wheat and corn were grown on small family farms, cultivatable land was relatively abundant, decolonization occurred early, and innovative property rights ensured that land (and assets more generally) could be sold on an open market. In South America, by contrast, crops such as sugar, coffee, and cocoa were grown on large plantations, cultivatable land was relatively scarce, decolonization occurred late, and property rights were weak. Landed elites were able to amass great personal fortunes, resist more democratic reforms, and consolidate power. During economic downswings, vested interests thus resist reforms that would diversify the economy because this would create rivals competing for labor and government influence.¹²

10. There are many variations in the way resources delay modernization, all relating to different connections between states and elites. The state can own the rents and a regime of rentier autocrats emerges, as with Algeria and Nigeria. Or rentier capitalists can effectively own the state, as in Angola and El Salvador, and oligarchic regimes emerge.

Certain types of natural resources are thus predisposed to influence the long-run level of development. North America's resource base enabled it to become rich, but South America's did not.¹³ As Frieden (1999, p. 22) writes in his account of economic growth in modern Latin America: "Economic characteristics of assets determine the policy preferences of their owners. . . . The incentive to lobby increases with the specificity of the asset."

Entrenched inequality has social dimensions as well.
Some areas of geographic space are conducive to large-scale production (plantation agriculture). In these regions relationships tend to bind producers to a social superior (noble, landowner), and the horizontal relationships among producers tend to be ones of distrust, producing a social structure that is conducive to "bad" politics (clientelism) and to "bad" governance (because citizens cannot cooperate to demand better services from the state). This pattern is in contrast to other areas of geographic space, which are conducive to smallholder production on individually owned plots and in which horizontal interactions among producers tend to be relationships of equality.

11. The Engerman-Sokoloff account of continental divergence is one based on factor endowments broadly defined and implicates primarily the role of labor scarcity (Hoff 2003).

12. See Tornell and Lane (1999) for a model of how special interests can dampen economic growth. On the institutional side, their argument is very much in the spirit of this article. They also note (echoing Barro 1997) that one possible explanation for the distributive struggle in many countries is the attempt to appropriate rents generated by natural resource endowments.

13. For instance, one of several possible channels is that proposed by the late Carlos Diaz-Alejandro, who is said to have conjectured informally to his students that at least some of the difference in political and economic evolution between Argentina and the United States could be explained by the fact that in Argentina land gets better from west (last settled) to east (first settled), whereas in the United States land gets better from east (first settled) to west (last settled). Hence in Argentina, population growth led to larger and larger rents on the good land that was already divided. Access to new land was available only for less attractive property, and redistribution would involve existing, very valuable lands.
In the United States, by contrast, the western expansion moved people onto superior land. This meant that the system of property rights was developed as new and valuable lands were being brought into the economy (De Soto 2000); redistribution or taxation of the rents on existing land was thus of almost no interest.

Implications of All Three Stories

The links between particular types of natural resources and government capacity exhibit three common (but distinctive) elements. First, all involve some connection from the structure of economic production, particularly the structure of exporting activity, to some measure of the capacity and quality of government. Second, natural resource production characteristics matter, not just natural resource exports. The geographic pattern of production is important, particularly as it affects the ease with which the state can control and extract rents. Whereas others have focused (rightly) on dimensions of natural resource exports such as lack of diversification and exposure to secular declines in terms of trade (and volatility),14 this article stresses the effect of exports on political and social structures and only then indirectly on economic performance. Thus, as others have shown, although it is possible for the state to extract rents from all forms of natural resources (through bottlenecks along the transport chain, for example), point source resources are far more susceptible to capture (whether through marketing boards, control of line ministries, or direct procurement) than diffuse resources, as the opening country vignettes demonstrate. Third, though many of the growth stories involve very long-run effects, there is also a connection with changes in growth rates through the combination of weak institutions and shocks.

The hypothesis and the related empirical strategy can therefore be stated as follows.
Different types of natural resource endowments matter for economic growth. In particular, countries dependent on point source natural resources and plantation crops are predisposed to heightened social divisions and weakened institutional capacity. This in turn impedes their ability to respond effectively to shocks, which previous studies have shown to be essential for sustaining rising levels of prosperity. Export structures influence various measures of political and institutional performance, and these measures of political and institutional performance condition growth performance during 1974-97, a period of massive deceleration in growth in developing economies.

Again, this approach does not attempt to differentiate among the various models whereby resource endowments affect political and institutional structures or among the various models whereby these structures affect growth.

14. Note that the classification "diffuse" exporter as used in the analysis here concerns the conditions of production of any given commodity, not diversification across different commodities.

The link between endowments and export structure is taken as given: countries with oil are more likely to export oil, and countries can only export crops such as coffee and cocoa if they have appropriate climates. This link has a reasonable base in theory and evidence. The measures of the quality of government that are used in the analysis are typically from the 1980s and 1990s. Export structure is from prior to that period so that, at least with respect to post-1980 growth and currently assessed institutional quality, export structure is predetermined.
The weak link in determining the chain of causation is that it is possible that historical factors affect institutions (as already discussed) and that this in turn determines whether a country will develop a capacity to produce and export manufactures, and hence the link between poor governance and exports is caused by poor governance. However, this argument is much less compelling than the argument based on different types of natural resource exports. Moreover, given that the geopolitical and economic importance of certain types of natural resources is relatively new (the surging global demand for oil, and the debilitating economic "shocks" to which that has given rise, are largely coincident with the postcolonial era) and that geology largely determines natural resource location (but not demand), it seems reasonable to regard the link between endowments and export structures as largely exogenous.

Data on Export Composition

To test this hypothesis, export structures were classified according to their natural resource base using two methods. First, data on the leading exports of every country in 1985 with a GNP per capita of less than $10,000 and a population greater than 1 million were taken from the Handbook of International Trade and Development Statistics (UNCTAD 1988). Countries were classified into four types on the basis of their top two exports at the Standard International Trade Classification (SITC) three-digit level:

Manufacturing exporters, which have relied on exports of manufactures (without regard to labor or capital intensity).

Diffuse exporters, which have relied primarily on livestock and agricultural produce grown on small family farms (rice and wheat, for example).

Point source exporters, which have relied primarily on fuels, minerals, and plantation crops (such as sugar).

Coffee and cocoa exporters, which have relied primarily on these two commodities.
(Classifying them as either point source or diffuse proved problematic because these crops can be grown on either plantations or small family farms, but because these tree crops take many years to reach maturity and are immobile, they are potentially susceptible to rent extraction from smallholders through marketing boards.)

Judgments by country and commodity experts were used when there was some ambiguity about a country's classification. The countries used in this analysis and their classifications are presented in appendix table A.1, along with their first and second most important exports and each export's share in total exports.15

The second method was to compute four indices of net export shares that mirror the four categories of exports by type: manufacturing, diffuse, point source, and coffee and cocoa. In constructing these four indices, data for 1980 from the World Trade Analyzer (Statistics Canada 2002) were used to aggregate SITC codes at the two-digit level for subcategories of exports into the four export categories, following the approach of Leamer and others (1999). The net export share for each subcategory is calculated as net exports (exports minus imports) of subcategory i divided by the sum of the absolute value of net exports across all subcategories (following the procedure in Leamer and others 1999). The four indices are then calculated as the sum of the net export shares for each subcategory in each of the four categories. By construction, these indices have a range of -1 to 1, with a higher number indicating a greater reliance on that category for export earnings.16

Reassuringly, the two methods give similar results, as shown by the means of these four indices across all countries and according to the UNCTAD-based classifications (table 1). The manufactures index is much higher for the UNCTAD-based manufacturing exporters, compared with the three resource exporters (moving down the first column of the table).
The diffuse, point source, and coffee and cocoa indices are highest for each of the corresponding set of UNCTAD-based classifications (the entries in the last three rows of table 1 that match each row's own category) and higher than the other row entries in the same category.17

15. There are several borderline cases on which reasonable judgments could differ. Wherever possible, such borderline cases were classified to err on the side that would be "against" the hypothesis. For instance, should Botswana be considered a point source exporter because of its diamonds or a diffuse exporter because of its cattle? Acemoglu and others (2003) have argued that the social structures that emerged from cattle raising were an important part of the Botswana success story (and why it was able to resist the pressures of diamond exports). For this study Botswana is classified as a point source exporter, which weakens the case that such dependence is adverse for institutional quality and growth. In other cases, subtle judgments had to be made, and it is unlikely that they affected the overall results because they were not based on performance. For instance, although Burkina Faso and Mali both export cotton (regarded as a point source export) and live animals (regarded as diffuse) as their two major exports, Burkina Faso is classified as a point source exporter and Mali is classified as a diffuse exporter because its share of live animals is substantially higher than that of Burkina Faso.

16. For additional detail on the rationale behind the groupings, see Leamer (1984). The authors thank Peter Schott for providing this information.

17. The classification of countries also produces reasonable results when compared with standard sources such as World Development Indicators (World Bank 1999). Over the 15-year period before the oil shock, manufactures were only 10.6 percent of exports for resource exporters compared with 46.8 percent for merchandise exporters.

TABLE 1. Mean of the Indices of Net Export Shares by Export Composition and Natural Resource Base of Selected Developing Economies

                              Statistics Canada-Based Trade Data for 1980
UNCTAD-Based                  Manufactures   Diffuse   Point Source   Coffee and
Classification                Index          Index     Index          Cocoa Index
All countries                 -0.34           0.03      0.11           0.06
Manufacturing exporters       -0.02          -0.05     -0.12           0.01
Diffuse exporters             -0.38           0.08     -0.04           0.04
Point source exporters        -0.35           0.01      0.28           0.04
Coffee and cocoa exporters    -0.43           0.06     -0.02           0.16

Note: Means of selected export- and trade-related data for 90 developing economies.
Source: Authors' calculations based on UNCTAD (1988) and Statistics Canada (2002) data.

TABLE 2. Average Annual GDP per Capita Growth Rates by Export Composition and Period

                                                      Resource Exporters
                              Manufactures                        Point     Coffee and
Period                        Exporters      All       Diffuse    Source    Cocoa
1957-97                        4.16           1.43**    1.74       1.57      0.76
1957-74                        3.56           2.54      2.03       3.08      1.73
1975-97                        4.58           0.65**    1.60       0.51      0.08
Difference, 1975-97
  less 1957-74                 1.02          -1.89     -0.43      -2.57     -1.65

**Significant at the 1 percent level for Mann-Whitney test of similar distributions in resource-poor and resource-exporter countries.
Source: Authors' estimations based on data indicated in appendix table A.2.

Differences in Growth and Institutional Quality across Export Categories

A return to the growth story introduced in figures 1 and 2 shows that since 1974, growth rates in developing economies have been massively different between manufactures exporters (4.58 percent) and natural resource exporters (0.65 percent), differing by almost 4 percentage points annually (table 2).
Whereas growth among manufactures exporters increased by 1 percentage point between 1957-74 and 1975-97, growth among resource exporters decelerated by almost 2 percentage points (1.89), together accounting for 3 percentage points of the difference. (Growth rate differences of this magnitude maintained over time have enormous implications: if two countries begin with equal income today, the country that grows 3 percentage points faster would be nearly twice as rich in only 22 years.) More important to the hypothesis that the type of exports, measured by the four indices, affects economic growth through political and social institutions, growth rates are also significantly different (using the Mann-Whitney test) among types of resource exporters. Diffuse exporters did almost as well as before the oil shock, with growth decelerating by only 0.43 percentage point, whereas growth decelerated by 1.65 percentage points for coffee and cocoa exporters and by 2.57 percentage points for point source exporters.

TABLE 3. Institutional Quality and Export Composition among 90 Developing Economies

                                            Resource Exporters
           Manufactures                               Point     Coffee and
           Exporters      All        Diffuse          Source    Cocoa
Variable   (n=9)          (n=81)     (n=18)           (n=45)    (n=18)

Rule of law
Political stability and violence
Government effectiveness
Absence of corruption
Voice and accountability
Regulatory burden
Law and order tradition
Quality of the bureaucracy
Political rights
Civil liberties
Property rights and rule-based governance

*Significant at the 5 percent level for Mann-Whitney test of similar distributions in resource-poor and resource-exporter countries.
**Significant at the 1 percent level for Mann-Whitney test of similar distributions in resource-poor and resource-exporter countries.
Source: Authors' estimations based on data indicated in appendix table A.2.
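The compounding arithmetic behind the growth-gap illustration above can be checked with a few lines of Python. This is a minimal sketch that assumes annual compounding, equal starting incomes, and a constant gap in growth rates; the function names `relative_income` and `years_to_double` are illustrative and do not come from the article.

```python
import math


def relative_income(growth_gap_pct: float, years: int) -> float:
    """Income of the faster-growing economy relative to the slower one
    after `years`, assuming equal starting incomes.

    Assumption: the gap is compounded annually as (1 + gap) ** years,
    which is exact when the slower economy does not grow and a close
    approximation when its growth rate is small.
    """
    return (1.0 + growth_gap_pct / 100.0) ** years


def years_to_double(growth_gap_pct: float) -> float:
    """Years needed for the faster grower to become twice as rich."""
    return math.log(2.0) / math.log(1.0 + growth_gap_pct / 100.0)


# The article's illustration: a 3-percentage-point gap held for 22 years.
ratio_22 = relative_income(3.0, 22)
doubling = years_to_double(3.0)

print(f"Income ratio after 22 years: {ratio_22:.2f}")
print(f"Years needed to reach a ratio of 2: {doubling:.1f}")
```

Under these assumptions the ratio after 22 years is about 1.9, and a full doubling takes between 23 and 24 years.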
Averages were also compared across these exporter classifications for 11 institutional variables that have been used as indicators of "institutional quality" in the empirical growth literature (table 3).18 By these variables institutional quality is unquestionably higher among manufactures exporters. The indicator is lower among the resource exporters in all cases, and for six of the variables the difference is statistically significant. However, the differences across types of resource exporters are not impressive: although diffuse exporters tend to have better institutional quality, the differences are small and not statistically significant.

18. These institutional variables were recently used in a set of papers on the institutional determinants of economic growth; see, among others, Knack and Keefer (1995); Rodrik (1999); Kaufmann and others (2000); Dollar and Kraay (2003); Ritzen and others (2000); and Easterly (2001). Growth rate data for 1957-97 were compiled from the Penn World Table version 6.1 (Heston and others 2002) and the World Development Indicators (World Bank 1999). Measures of social and political data were adapted from Kaufmann and others (2002), Easterly (2001), and World Bank (2002).

The analysis now moves beyond the simple cross-tabulations, and the continuous indices of export composition are used to estimate a two-equation system. In the first equation, institutional variables are endogenously determined by different types of natural resource intensity (point source, diffuse, coffee and cocoa) and by other correlates of institutional quality that have been proposed in the literature (table 4). In the second equation, growth is then determined by institutions (as well as initial income, education, and the other usual suspects from the growth regression literature). Unless otherwise noted, all of the regression results reported are from the two-stage system detailed here.
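The logic of the two-equation system can be illustrated with a small simulation. The sketch below is not the article's estimation: it uses synthetic data, a single hypothetical export index as the identifying variable, and plain two-stage least squares written in NumPy rather than three-stage least squares. All variable names and parameter values are assumptions chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000  # synthetic sample; far larger than the article's 62-66 countries

# Hypothetical data-generating process (all numbers are made up).
export_index = rng.normal(size=n)  # stand-in for a resource export index
control = rng.normal(size=n)       # stand-in for a usual growth determinant
true_inst = -1.0 * export_index + 0.5 * control + rng.normal(size=n)
observed_inst = true_inst + rng.normal(size=n)  # indicator measured with error
growth = 1.5 * true_inst + 0.3 * control + rng.normal(size=n)


def ols(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Least-squares coefficients for y = X b + e."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


const = np.ones(n)

# First equation: institutions explained by the export index and the control.
Z = np.column_stack([const, export_index, control])
fitted_inst = Z @ ols(Z, observed_inst)

# Second equation: growth explained by fitted institutions and the control.
X_iv = np.column_stack([const, fitted_inst, control])
beta_iv = ols(X_iv, growth)[1]

# Naive OLS for comparison: growth on the noisy indicator directly.
X_ols = np.column_stack([const, observed_inst, control])
beta_ols = ols(X_ols, growth)[1]

print(f"OLS estimate:  {beta_ols:.2f}")
print(f"2SLS estimate: {beta_iv:.2f} (effect assumed in the simulation: 1.50)")
```

In this simulated setting the export index affects growth only through institutions, so the two-stage estimate should land close to the assumed effect, while the OLS estimate is pulled toward zero by the measurement error in the institutional indicator.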
First, an equation is estimated for each of six indicators of institutional quality measured in the 1990s, $I_{ij}$ (rule of law, political stability and violence, government effectiveness, absence of corruption, regulatory framework, and property rights and rule-based governance), as a linear function of the four indices of net export composition measured in 1980, $NR_{ki}$, plus five other relatively predetermined variables, written here as $Z_{mi}$ (English language, European language, distance from equator, predicted trade share, and ethnolinguistic fractionalization); all of the usual growth determinants, $X_i$ (lagged GDP per capita, lagged secondary school achievement, the Sachs-Warner indicator of trade openness, changes in the terms of trade, and the share of primary exports to GDP); and a set of regional dummy variables:19

$$ I_{ij} = \beta_0^{j} + \sum_{k=1}^{4} \beta_k^{j}\, NR_{ki} + \sum_{m=1}^{5} \gamma_m^{j}\, Z_{mi} + \beta_X^{j}\, X_i + \text{Region dummy variables} + \varepsilon_i^{j}, \qquad j = 1, \ldots, 6 \qquad (1) $$

Growth over the period 1975-97 is then estimated as a linear function of an endogenously determined indicator of institutional quality (included one at a time), the same usual growth determinants, and the set of regional dummy variables. Three-stage least squares (3SLS) is used to estimate this system of equations:20

19. One reaction to this strategy is to wish for instruments for the instruments; see, for example, Rigobon and Rodrik (2004), who exploit the structural variance in two country subsamples to account for possible endogeneity. Using export structures to identify the impact of political and institutional quality variables on growth, as here, leads to the complaint that export structure is not exogenous and, in particular, that countries that developed good institutions, even if they were richly endowed with natural resources (such as Australia and Norway), ceased being primary goods exporters and hence export structure is endogenous to institutions.
The point is not, however, that exporters of manufactures have better institutions than primary exporters do (as others have tried to show), but rather that the composition of the types of natural resource exports has political effects over and above the manufactures versus primary goods distinction. On this point there can be more confidence that the variation across exports is exogenous and endowment driven and that, as such, using substantially lagged export structures to explain institutional performance goes some way toward resolving the identification question.

20. This follows the example of Barro (1997) and many others in the growth literature. 3SLS estimates are more efficient than instrumental variable estimates if the error terms are correlated and the system is not exactly identified. As noted by an anonymous referee, some of the recent growth literature has used recursive seemingly unrelated regression (SUR) estimation for models that include natural resource endowments (for example, Gylfason and Zoega 2001), which is one approach to addressing the simultaneity problem that might otherwise bias such results. There were no substantial differences in the reported results when the model was reestimated using recursive SUR.

TABLE 4.
The Effect of Natural Resource Endowment on Six Institutional Variables

                                  (1)            (2)            (3)            (4)            (5)            (6)
                                  Rule           Political      Government     Absence of     Regulatory     Property Rights and
                                  of Law         Stability      Effectiveness  Corruption     Framework      Rule-Based Governance
Manufactures index                -0.02           0.05          -0.48           0.02          -0.49           ...
                                  (0.23)         (0.29)         (0.26)         (0.24)         (0.30)          ...
Diffuse index                     -0.08          -0.27          -0.39          -0.21           0.05           ...
                                  (0.34)         (0.45)         (0.40)         (0.36)         (0.45)          ...
Point source index                -1.48**        -2.09**        -1.47**        -0.95**        -1.09**         ...
                                  (0.26)         (0.33)         (0.30)         (0.26)         (0.34)          ...
Coffee and cocoa index            -2.05**        -3.26**        -1.64*         -1.41*         -1.60           ...
                                  (0.69)         (0.87)         (0.82)         (0.71)         (0.89)          ...
Ethnic fractionalization           0.0027         0.0018         0.0027         0.0004         0.0022         ...
                                  (0.0023)       (0.0029)       (0.0027)       (0.0023)       (0.0029)        ...
Predicted trade share              0.06           0.06          -0.09           0.12           0.04           ...
                                  (0.10)         (0.13)         (0.12)         (0.11)         (0.13)          ...
Latitude                          -0.0058        -0.0005        -0.0007        -0.0047         0.0007         ...
                                  (0.0044)       (0.0057)       (0.0051)       (0.0045)       (0.0056)        ...
English language                   0.10          -0.44           0.04           0.09          -0.08           ...
                                  (0.29)         (0.36)         (0.33)         (0.29)         (0.37)          ...
European language                  0.92**         1.18**         1.11**         0.96**         0.99*          ...
                                  (0.32)         (0.40)         (0.37)         (0.32)         (0.42)          ...
GDP per capita                     0.127**        0.195**        0.150**        0.060          0.058          ...
                                  (0.046)        (0.059)        (0.053)        (0.047)        (0.061)         ...
Secondary school achievement       0.024*         0.001          0.029**        0.034**        0.029*         ...
                                  (0.010)        (0.012)        (0.011)        (0.010)        (0.013)         ...
Trade openness                     0.55**         0.31           0.34           0.17           0.43           ...
                                  (0.20)         (0.25)         (0.23)         (0.20)         (0.27)          ...
Change in terms of trade          -0.01          -0.09          -0.14          -0.60**         0.37          -0.45*
                                  (0.18)         (0.23)         (0.23)         (0.18)         (0.24)         (0.20)
Share of primary exports/GDP       1.27*          1.46*          1.23*          0.15           0.89           1.77**
                                  (0.53)         (0.68)         (0.61)         (0.55)         (0.71)         (0.52)
Sub-Saharan Africa                 0.18           0.60           0.44           0.28           0.25           0.15
                                  (0.28)         (0.36)         (0.33)         (0.29)         (0.37)         (0.31)
Europe and Middle East             0.87**         0.95*          0.95**         0.78**         0.58           0.79*
                                  (0.29)         (0.37)         (0.34)         (0.30)         (0.39)         (0.33)
Latin America                     -0.49           0.21          -0.37          -0.36           0.03          -0.22
                                  (0.33)         (0.42)         (0.38)         (0.34)         (0.43)         (0.36)
East Asia                          0.16           0.43           0.51           0.08           0.41           0.23
                                  (0.29)         (0.38)         (0.34)         (0.30)         (0.40)         (0.31)
Adjusted R-squared                 0.71           0.65           0.63           0.63           0.51           0.64
Sample size                        66             65             64             64             66             62

*Significant at the 5 percent level.
**Significant at the 1 percent level.
Note: GDP per capita (1975) was adjusted for purchasing power parity. Numbers in parentheses are standard errors.
Source: Authors' estimations based on data indicated in appendix table A.2.

$$ \text{Growth}_i = \alpha_0 + \alpha_1\, I_{ij} + \alpha_2\, X_i + \eta_i \qquad (2) $$

Two of the growth determinants deserve particular attention: the terms of trade, to be sure that the regression is not simply capturing the effect of falling terms of trade, and the share of primary exports in GDP, as was done in a pair of influential papers by Sachs and Warner (1995, 1999). They argue that having abundant natural resources makes a country less competitive in manufacturing exports and that manufacturing exports have some features, such as learning spillovers, that make them "extra good" for growth. Originally, the thought was that the channel through institutions might better explain the presence of the "primary share" in a growth regression.
However, including the regional dummy variables in a sample of developing economies already makes the pure primary exports variable statistically insignificant. Even so, the share of primary exports to GDP is included as a growth regressor, because this ensures that the impacts of export structure are due to the composition of primary export types and not simply to the fact that any natural resource has the same impact.21

Estimation of equation 1, to establish whether measures of the natural resource endowment (using the four indices derived from Statistics Canada 2002 data) predict the nature of socioeconomic and political institutions,22 shows that neither the manufactures index nor the diffuse index is a statistically significant predictor of any of the six institutional variables (see table 4). In contrast, the point source index is statistically significant in all six specifications: all else being equal, an increased dependence on point source natural resources is associated with much worse institutions. The coffee and cocoa index is significant in specifications 1-4.23 As for the other regressors in this model, European language, secondary school achievement, and Europe and the Middle East are also robust statistically significant predictors of this set of institutional variables.

The share of primary exports in GDP is a positive and significant predictor of institutions as well, which seems to raise questions about the net effect of exporting certain kinds of primary goods.24 To diagnose this result, the share of primary exports in GDP in the model was replaced with the share of exports in GDP. The results were almost identical statistically. Next, both of these variables were included as regressors in the model. Neither variable was significant on its own, but they were jointly significant. These results suggest that it is the presence of exporting of any kind that has an independent and positive effect on institutions. Higher exporters are more plugged into globalized markets, and countries can only be plugged into globalized markets if they respect rule of law, property rights, and other institutional indicators of good governance.

21. An anonymous referee suggested verifying that the inclusion of exports as a share of GDP in the model has no effect on the reported results; that was found to be the case. Likewise, investment is not a significant determinant of growth (as found by Gylfason 2001) in models in which the regional dummy variables are included.

22. From 62 to 66 of the 90 countries that are used to derive tables 1 and 2 have the required data to estimate these models. The countries included in these estimations are noted with an asterisk in the second column of appendix table A.1.

23. The p-values for specifications 5 and 6 are 0.07 and 0.12, respectively.

24. The authors thank an anonymous referee for drawing their attention to this point.

TABLE 5. The Relative Magnitude of the Effect of the Natural Resource Endowment Variables on Institutions

                                  (1)        (2)           (3)             (4)           (5)          (6)
                                  Rule of    Political     Government      Control of    Regulatory   Property Rights and
                                  Law        Instability   Effectiveness   Corruption    Framework    Rule-Based Governance
Point source index                -0.58      -0.71         -0.57           -0.41         -0.38        -0.46
Coffee and cocoa index            -0.27      -0.37         -0.21           -0.20         -0.18        -0.13
European language                  0.53       0.59          0.63            0.61          0.50         0.60
GDP per capita                     0.32       0.44          0.38            0.17          0.13         0.17
Secondary school achievement       0.26       0.01          0.31            0.40          0.28         0.35
Trade openness                     0.25       0.12          0.15            0.08          0.17         0.00
Share of primary exports in GDP    0.21       0.21          0.20            0.03          0.13         0.27

Note: Figures are the equivalent of beta coefficients from three-stage least squares estimation.
Source: Authors' estimations based on data indicated in appendix table A.2.

What are the relative magnitudes of the effects of the significant regressors in this equation?
Table 5 lists the 3SLS equivalent of standardized beta coefficients.25 The values for the point source index (from -0.38 to -0.71) are either the largest (columns 1 and 2) or second largest (columns 3-6) compared with the values for European language and the other significant variables.26 The values for the coffee and cocoa index (from -0.13 to -0.37) are generally comparable to those of GDP per capita.

25. Figures are calculated as the product of the coefficient and the standard deviation (from the regression sample) of the listed variable, divided by the standard deviation of the dependent variable.

26. The Europe and Middle East dummy variable was excluded from this comparison.

27. Here and with the beta coefficient calculations above, the standard deviations from the regression sample are used, as listed in appendix table A.2.

What are the absolute magnitudes of the effects of the natural resource variables? A country whose point source index fell by a standard deviation (= 0.266),27 the approximate difference between Angola (0.70) and Cameroon (0.42), would increase rule of law by 0.39; a country whose coffee and cocoa index fell by a standard deviation (= 0.088), the approximate difference between Colombia (0.22) and Ecuador (0.14), would increase rule of law by 0.18. Because the standard deviation of rule of law is 0.68, these represent substantial institutional improvements. To illustrate, the estimated effect of a decrease of 1 standard deviation in the point source index and in the coffee and cocoa index yields a total change in rule of law of 0.57, based on the calculations already done. This is equivalent to the difference between Saudi Arabia (0.19) and Taiwan, China (0.75). These overall results, in both relative and absolute magnitudes, are consistent with the first hypothesis that both point source and coffee and cocoa dependence are critical determinants of socioeconomic institutions.
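The magnitude calculations above can be reproduced directly from the reported figures. The sketch below takes the rule of law coefficients from table 4 (-1.48 for the point source index and -2.05 for the coffee and cocoa index) and the standard deviations quoted in the text; the function and variable names are illustrative and are not the authors'.

```python
def standardized_beta(coef: float, sd_regressor: float, sd_outcome: float) -> float:
    """Beta coefficient as defined in footnote 25: coefficient times the
    regressor's standard deviation, divided by the outcome's standard deviation."""
    return coef * sd_regressor / sd_outcome


def effect_of_one_sd_fall(coef: float, sd_regressor: float) -> float:
    """Change in the outcome when the regressor falls by one standard deviation."""
    return -coef * sd_regressor


SD_RULE_OF_LAW = 0.68                          # standard deviation quoted in the text
POINT_SOURCE = {"coef": -1.48, "sd": 0.266}    # table 4, column 1
COFFEE_COCOA = {"coef": -2.05, "sd": 0.088}    # table 4, column 1

ps_effect = effect_of_one_sd_fall(POINT_SOURCE["coef"], POINT_SOURCE["sd"])
cc_effect = effect_of_one_sd_fall(COFFEE_COCOA["coef"], COFFEE_COCOA["sd"])
total_effect = ps_effect + cc_effect

ps_beta = standardized_beta(POINT_SOURCE["coef"], POINT_SOURCE["sd"], SD_RULE_OF_LAW)
cc_beta = standardized_beta(COFFEE_COCOA["coef"], COFFEE_COCOA["sd"], SD_RULE_OF_LAW)

print(f"Rule of law gain, point source index down 1 SD:     {ps_effect:.2f}")
print(f"Rule of law gain, coffee and cocoa index down 1 SD: {cc_effect:.2f}")
print(f"Combined gain:                                      {total_effect:.2f}")
print(f"Beta coefficients (point source, coffee and cocoa): {ps_beta:.2f}, {cc_beta:.2f}")
```

Rounded to two decimals, these match the 0.39, 0.18, and 0.57 quoted in the text and the column 1 entries of table 5 (-0.58 and -0.27).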
Table 6 presents the results of estimating the growth equation to show the strong impact of institutions on post-1974 growth.28 Five of the six ordinary least squares (OLS) models (specification 1) suggest that institutions are a positive and significant determinant of economic growth among these developing economies from 1975 to 1997. By contrast, when the four natural resource indices are used as the identifying instruments (specification 2), the estimation results for all six institutional variables are significant, and the point estimate is also greater than the OLS point estimate (which is consistent with the presence of a plausible degree of measurement error in the indicators of institutional quality). When the five other relatively predetermined variables (English language, European language, distance from equator, predicted trade share, and ethnolinguistic fractionalization) are added to the natural resource instrument set (specification 3), the results are broadly similar.

The presence of alternative valid instruments for institutions allows the testing of the "exclusion" restriction: that export composition affects growth only insofar as it affects institutions.29 Intuitively, the test is an F-test of the inclusion of the four export composition indices in the growth regression with a consistent estimate for the effect of institutions (Davidson and MacKinnon 1993; Hausman and Taylor 1981). The tests show no evidence that export composition should be included in the growth regression (appendix table A.3).

The results reported in this section constitute the econometric punch line of this article. First, it is not just natural resource exports that lead to lower quality institutions but a particular type of natural resource exports. Both point source export dependence and coffee and cocoa export dependence are negatively associated with national socioeconomic institutions. This is consistent with the long-run stories of institutional determination.30 Second, the results using this method reconfirm what others have found: institutions, which are endogenously determined by the nature of natural resource dependence, are significant determinants of growth. Third, and a bit more speculatively, the hypothesis cannot be rejected that the only impact of export structure on growth is through institutions.

28. In table 6, the top line is taken from appendix table A.3 and the remaining five lines are taken from tables similar to appendix table A.3 where rule of law was replaced first by political stability, then by government effectiveness, and so on.

29. It is usually difficult, if not impossible, to find instruments that are correlated with the regressors but not with economic growth, due to the inherent endogeneity of macroeconomic variables (Temple 1999). Following Hall and Jones (1999), however, the clearly exogenous English language, European language, and distance from equator instruments are ideal for the endogenous regressors of interest, the institutional variables.

30. This is of course also consistent with the possibility that long-run institutions affect export composition in the 1980s, and certainly being a manufactures exporter rather than being a natural resource-dependent exporter in the 1980s is strongly dependent on long-run trajectories. The innovation in this article is distinguishing among types of exports.

TABLE 6. The Effect of the Institutions on Economic Growth, 1974-97

Estimation procedure and instrument set:
(1) OLS
(2) 2SLS; instruments: natural resources
(3) 2SLS; instruments: language variables, equator distance, trade share, fractionalization, and natural resources

                                  (1)        (2)        (3)
Rule of law                       1.33**     1.36*      1.30**
                                  (0.33)     (0.50)     (0.44)
                                  66         66         66
Political instability             0.68*      0.79*      0.79*
                                  (0.27)     (0.37)     (0.35)
                                  65         65         65
Government effectiveness          1.14*      1.56**     1.35**
                                  (0.32)     (0.56)     (0.46)
                                  64         64         64
Control of corruption             0.79       1.59*      1.35*
                                  (0.40)     (0.81)     (0.64)
                                  64         64         64
Regulatory framework              1.00**     1.85**     1.55**
                                  (0.30)     (0.70)     (0.57)
                                  66         66         66
Property rights and               1.51**     2.50**     1.66**
rule-based governance             (0.38)     (0.82)     (0.54)
                                  62         62         62

*Significant at the 5 percent level.
**Significant at the 1 percent level.
Note: Numbers in parentheses are SEs. Each reported set of results is the result of including just one of the indicators of institutional quality in the growth regression in equation 2.
Source: Authors' estimations based on data indicated in appendix table A.2.

What are the implications of this two-stage effect? It was reported above that a large change in the composition of a country's natural resource endowment (a 1 standard deviation change in point source dependence and coffee and cocoa dependence) is associated with a relatively large improvement in the measures of socioeconomic institutions. How might such an improvement translate into a change in economic growth? The estimated effect on economic growth of a 1 standard deviation decrease in the point source index and of the coffee and cocoa index, through better institutions, was calculated using the results from table 4 and specification 3 in table 6. These calculations yield an annual increase in per capita growth of between 0.51 percentage point and 0.75 percentage point. Using the median of these figures (0.68 percentage points), this translates, all else being equal, into a GDP per capita that is 19 percent higher 25 years later among countries with better institutions than among countries with worse institutions.

IV. DISCUSSION AND CONCLUSION

At first glance these are stultifying results for the policymaker.
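The growth arithmetic reported at the end of the preceding section can be checked with a short calculation. The sketch below is only an illustration: it assumes plain annual compounding of the extra growth over 25 years, and the function name is hypothetical rather than anything taken from the article. Under that assumption, the 0.68 percentage point figure gives a cumulative gain of roughly 18 to 19 percent, in line with the reported number.

```python
def cumulative_income_gain(extra_growth_pp: float, years: int) -> float:
    """Relative gain in GDP per capita from sustaining extra annual growth.

    extra_growth_pp is expressed in percentage points per year
    (0.68 means 0.68 percentage points). Assumes simple annual
    compounding, which is an assumption of this sketch.
    """
    annual_factor = 1.0 + extra_growth_pp / 100.0
    return annual_factor ** years - 1.0


if __name__ == "__main__":
    # Lower bound, median, and upper bound quoted in the text.
    for pp in (0.51, 0.68, 0.75):
        gain = cumulative_income_gain(pp, 25)
        print(f"{pp:.2f} pp per year over 25 years -> {gain:.1%} higher GDP per capita")
```

Running the loop shows how sensitive the 25-year gap is to the assumed growth differential across the quoted range.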
As with Putnam's (1993) results on medieval guilds and choral societies, it is hard to imagine how a policymaker interested in accelerating growth can change what is here identified as one possible underlying cause of poor performance: that a country's natural resource endowment makes for poor institutions.31

What options are available to the policymaker? The World Bank (1998) illustrates the power of institutions in development assistance and identifies what donors should (and, more important, should not) do in the face of varied institutional performance among potential aid recipients. The results here suggest how entrenched, and environmentally determined, poor institutions can be (compare with Wade 1988, at the micro level). So these results raise further cautions about casual attempts at institutional reform (Murshed 2003).

Poor institutions are deeply rooted. Where others (such as Acemoglu and others 2004; Pritchett 2000; Rodrik 1999) have shown how important institutional quality and social inclusion are to managing long-run growth generally and growth volatility in particular, these results push the chain of causation one step further back, showing that, as asserted by Karl (1997, p. 13) in the opening quotation, "the revenues a state collects, how it collects them, and the uses to which it puts them" do indeed "define its nature." Institutions surely matter a lot, but the results here are consistent with models in which types of natural resource endowments and the export structures to which they give rise (rather than "geography")32 play a large role in shaping what kinds of institutional forms exist and persist.

There are possibilities for structuring the influences once they are identified, but even this is not obvious. In Chad, for instance, outside actors (notably the World Bank) have placed institutional conditions on the use of resources from the oil pipeline that they are helping finance.
Perhaps this will work, but as this project began, money was still going, defiantly, to purchase arms (Thurow 2003). In contrast, in Qatar the head of state recognizes that natural resource-based revenues, and the institutions that they have sustained, are likely to weaken in the near future. Accordingly, he is attempting reform from within and has decreed that Qatar will become a democracy (Weaver 2000). In some cases, donors can, if they are lucky, gently nudge along such reforms. At the very least, donors should not maintain (perceived) "lifeline" aid that prevents nascent reforms from even getting started.

More optimistic and constructive are proposals, such as those made by The Economist (2003) and Sala-i-Martin and Subramanian (2003), among others, to make publicly available all revenues and expenditures associated with natural resource rents. Greater transparency and citizen accountability, as in other realms of public management reform, are key. Regarding client countries as mere repositories for the steady flow of highly valued (economically and geopolitically) natural resources such as oil and diamonds, rather than as genuine partners in the development process, likely undermines such reform efforts.

31. Similarly, in their recent article on the primacy of institutions over geography for economic growth, Rodrik and others (2004, p. 157) observe that "the operational guidance that our central result on the primacy of institutional quality yields is extremely meager."

32. On the relatively small (and insignificant) direct effects of geography compared with institutions, see Rodrik and others (2004).

TABLE A-1.
Details on the Export Classifications Derived from UNCTAD Data

Columns: export classification; economy; year; first and second most important exports; SITC code for first export; SITC code for second export; share of total exports (%); share of category exports (%)

Manufacturing
  Bangladesh: Woven textiles, textile
  China: Vehicles parts, knitwear
  *Hong Kong, China: Manufacturing
  *India: Pearls, clothing
  *Korea, Rep. of: Ships, clothing
  Nepal: Floor cover, clothing
  Singapore: Manufacturing
  Taiwan, China
  Turkey: Clothing, textiles

Diffuse
  *Argentina: Wheat, oilseeds, and nuts
  *Myanmar: Rice, wood
  *Gambia: Oil seeds, vegetable oils
  Guinea-Bissau: Fruit
  *Honduras: Fruit, coffee
  Lesotho
  *Malaysia: Crude petroleum, vegetable oil
  *Mali: Cotton, live animals
  Mozambique: Fish, fruit
  *Pakistan: Cotton, rice
  Panama: Fruit, fish
  *Philippines: Special transactions and commodities, vegetable oil
  *Senegal: Fish, vegetable oils
  Somalia: Live animals, fruit
  *Sri Lanka: Tea, clothing
  *Thailand: Rice, vegetables
  *Uruguay: Wood, meat
  *Zimbabwe: Tobacco, pig iron

Point source
  *Algeria: Petroleum products, crude petroleum
  Angola: Crude petroleum, petroleum products
  *Benin: Cotton, cocoa
  *Bolivia: Tin, gas
  Botswana: Diamonds
  *Burkina Faso: Cotton, live animals
  *Chad: Cotton, live animals
  *Chile: Copper, nonferrous ore
  *Congo: Crude petroleum, petroleum products
  *Dominican Repub.: Sugar, pig iron
  *Ecuador: Crude petroleum, coffee
  *Egypt: Crude petroleum, cotton
  Fiji: Sugar
  *Gabon: Crude petroleum, wood
  Guinea
  *Guyana
  *Indonesia: Crude petroleum, gas
  *Iran: Crude petroleum, tapestry
  Iraq: Crude petroleum, fruit
  *Jamaica: Inorganic elements, nonferrous metals
  *Jordan: Fertilizers (crude), fertilizer (manufactured)
  Liberia: Iron, rubber
  *Malawi: Tobacco, tea
  *Mauritania: Iron, fish
  *Mauritius: Sugar, clothing
  *Mexico, 1985: Crude petroleum, petroleum products
  *Morocco, 1985: Fertilizers, inorganic elements
  Namibia
  *Niger, 1981: Uranium, live animals
  *Nigeria, 1985: Crude petroleum, cocoa
  Oman
  Papua New Guinea, 1985: Nonferrous metal, coffee
  *Paraguay, 1985: Cotton, oil
  *Peru, 1985: Petrol, nonferrous metal
  Saudi Arabia
  *Sierra Leone, 1985: Pearl, nonferrous metal
  *South Africa, 1985: Special, coal
  Sudan, 1985: Cotton, oil seeds
  *Syria, 1985: Crude petroleum, petroleum products
  *Togo, 1985: Fertilizers, cocoa
  *Trinidad & Tobago
  *Tunisia, 1985: Crude petroleum, clothing
  *Venezuela, 1985: Crude petroleum products
  *Zaire, 1985: Copper, crude petroleum
  *Zambia, 1985: Copper, zinc

Coffee and cocoa
  *Brazil, 1985: Coffee, petroleum products
  *Burundi, 1985: Coffee, tea
  *Cameroon, 1986: Coffee, cocoa
  Central African Repub.: Coffee, wood
  *Colombia: Coffee, petroleum products
  *Costa Rica: Coffee, fruit
  *Côte d'Ivoire: Cocoa, coffee
  *El Salvador: Coffee, sugar
  Ethiopia: Coffee, hides
  *Ghana: Cocoa, aluminum
  *Guatemala: Coffee, crude vegetable materials
  *Haiti: Coffee, clothing
  *Kenya: Coffee, tea
  *Madagascar: Coffee, spices
  *Nicaragua: Coffee, cotton
  *Rwanda: Coffee, tin
  *Tanzania: Coffee, cotton
  *Uganda: Coffee, hides

*Countries with required data to estimate the regression models.
Note: See the text for a description of the classification methodology.
Source: Authors' calculations based on export classification data from UNCTAD (1988).

TABLE A-2. Data Names and Sources

Columns: variable; year or years; source; entire sample (sample size, mean, SD); regression sample (sample size, mean, SD)

Dependent variable
  Per capita growth rate: World Bank 1999, Heston and others 2002

Natural resource variables
  Manufactures index: Statistics Canada 2002
  Diffuse index: Statistics Canada 2002
  Point source index: Statistics Canada 2002
  Coffee and cocoa index: Statistics Canada 2002

Possible determinants of institutions
  Ethnic fractionalization: Easterly 2001
  Predicted trade share: Hall and Jones 1999
  Latitude: Hall and Jones 1999
  English language: Hall and Jones 1999
  European language: Hall and Jones 1999

Possible determinants of economic growth
  GDP per capita: World Bank 2002
  Secondary school achievement: Barro and Lee 1995
  Trade openness: Sachs and Warner 1995
  Change in terms of trade: World Bank 2002
  Share of primary exports/GDP: Sachs and Warner 1995
  Sub-Saharan Africa: World Bank 2002
  Europe and Middle East: World Bank 2002
  Latin America: World Bank 2002
  East Asia: World Bank 2002

Institutions
  Rule of law: Kaufmann and others 2002
  Political instability and violence: Kaufmann and others 2002
  Government effectiveness: Kaufmann and others 2002
  Control of corruption: Kaufmann and others 2002
  Voice and accountability: Kaufmann and others 2002
  Regulatory burden: Kaufmann and others 2002
  Law and order tradition (a): Easterly 2001
  Quality of the bureaucracy (a): Easterly 2001
  Political rights (b): Easterly 2001
  Civil liberties (b): Easterly 2001
  Property rights and rule-based governance (c): World Bank 2002

(a) Based on International Country Risk Guide data.
(b) Based on Freedom House data.
(c) Based on World Bank Country Policy and Institutional Assessment.
NA, not applicable.

TABLE A-3. Determinants of Economic Growth, 1974-97

Columns: (1) OLS; (2) 2SLS; (3) 2SLS; (4) 2SLS; (5) 2SLS; (6) 2SLS

Variables: Constant; Rule of law; GDP per capita; Secondary school achievement; Trade openness; Change in terms of trade; Share of primary exports in GDP; Sub-Saharan Africa; Europe and Middle East; Latin America; East Asia; Adjusted R-squared; Sample size

Instruments
  (2) Language variables and equator distance
  (3) Natural resources
  (4) Language variables, equator distance, and natural resources
  (5) Language variables, equator distance, trade share, and fractionalization
  (6) Language variables, equator distance, trade share, fractionalization, and natural resources

Hausman test (p-value), columns (2) to (6): 0.95, 0.57, 0.89, 0.96, 0.92
Overidentification test (p-value), columns (2) to (6): 0.97, 0.30, 0.72, 0.90, 0.76
Hausman-Taylor test (p-value): 0.64, 0.51, 0.66

*Significant at the 5 percent level.
**Significant at the 1 percent level.
Note: Dependent variable is the annual growth rate of GDP, 1975-97.

REFERENCES

Acemoglu, Daron, Simon Johnson, and James Robinson. 2001. "The Colonial Origins of Comparative Development: An Empirical Investigation." American Economic Review 91(5):1369-1401.
Acemoglu, Daron, Simon Johnson, and James Robinson. 2002. "Reversal of Fortune: Geography and Institutions in the Making of the Modern World Income Distribution." Quarterly Journal of Economics 117(4):1231-94.
Acemoglu, Daron, Simon Johnson, and James Robinson. 2003. "An African Success Story: Botswana." In Dani Rodrik, ed., In Search of Prosperity. Princeton, N.J.: Princeton University Press.
Acemoglu, Daron, Simon Johnson, and James Robinson. 2004. "Institutions as the Fundamental Cause of Long-Run Growth." NBER Working Paper 10481. National Bureau of Economic Research, Cambridge, Mass.
Agence France Press. 2004. "Sao Tome to Regulate Use of New Oil Riches." February 18. Available online at www.gasandoil.com/goc/news/nta41093.htm.
Auty, Richard. 1995. Patterns of Development: Resources, Policy and Economic Growth. London: Edward Arnold.
Auty, Richard. 2001a. "Introduction and Overview." In R. M. Auty, ed., Resource Abundance and Economic Development. New York: Oxford University Press.
Auty, Richard, ed. 2001b. Resource Abundance and Economic Development. New York: Oxford University Press.
Barro, Robert. 1997. Determinants of Economic Growth: A Cross-Country Empirical Study. Cambridge, Mass.: MIT Press.
Bates, Robert. 2000. Prosperity and Violence: The Political Economy of Development. New York: Norton.
Campbell, Greg. 2002. Blood Diamonds: Tracing the Deadly Path of the World's Most Precious Stones. Boulder, Colo.: Westview Press.
Corden, Max, and J. P. Neary. 1982. "Booming Sector and De-industrialisation in a Small Open Economy." Economic Journal 92(December):825-48.
Davidson, R., and J. MacKinnon. 1993. Estimation and Inference in Econometrics. New York: Oxford University Press.
De Soto, Hernando. 2000. The Mystery of Capital: Why Capitalism Succeeds in the West and Fails Everywhere Else. New York: Basic Books.
Diamond, Jared. 2005. Collapse: How Societies Choose to Fail or Succeed. New York: Penguin Books.
Dollar, David, and Aart Kraay. 2003. "Institutions, Trade, and Growth." Journal of Monetary Economics 50(1):133-62.
Easterly, William. 2001. "The Middle Class Consensus and Economic Development." Journal of Economic Growth 6(4):317-35.
Easterly, William, and Ross Levine. 2002. "Tropics, Germs, and Crops: How Endowments Influence Economic Development." Journal of Monetary Economics 50(1):3-39.
The Economist. 2003. "The Devil's Excrement." May 22.
Engerman, Stanley, and Kenneth Sokoloff. 1997. "Factor Endowments, Institutions, and Differential Paths of Growth among New World Economies: A View from Economic Historians of the United States." In Stephen Haber, ed., How Latin America Fell Behind. Stanford, Calif.: Stanford University Press.
Frieden, Jeffrey A. 1992. Debt, Development, and Democracy: Modern Political Economy and Latin America, 1965-1985. Princeton, N.J.: Princeton University Press.
Gylfason, Thorvaldur. 2001. "Natural Resources, Education, and Economic Development." European Economic Review 45(4-6):847-59.
Gylfason, Thorvaldur, and Gylfi Zoega. 2001. "Natural Resources and Economic Growth: The Role of Investment." Working Paper, University of Iceland, Department of Economics.
Gylfason, Thorvaldur, and Gylfi Zoega. 2002. "Inequality and Economic Growth: Do Natural Resources Matter?" CESifo Working Paper 712. Munich, Germany.
Hall, Robert, and Charles Jones. 1999. "Why Do Some Countries Produce so Much More Output per Worker than Others?" Quarterly Journal of Economics 114(1):83-116.
Hausman, Jerry, and W. Taylor. 1981. "Panel Data and Unobservable Individual Effects." Econometrica 49(6):1377-98.
Heston, Alan, Robert Summers, and Bettina Aten. 2002. "Penn World Table Version 6.1." Center for International Comparisons, University of Pennsylvania.
Hoff, Karla. 2003. "Paths of Institutional Development: A View from Economic History." World Bank Research Observer 18(2):205-26.
Inglehart, Ronald. 1997. Modernization and Postmodernization: Cultural, Economic, and Political Change in 43 Societies. Princeton, N.J.: Princeton University Press.
Karl, Terry. 1997. The Paradox of Plenty: Oil Booms and Petro-States. Berkeley: University of California Press.
Kaufmann, Daniel, Aart Kraay, and Pablo Zoido-Lobatón. 2000. "Aggregating Governance Indicators." Policy Research Working Paper 2195. World Bank, Washington, D.C.
Kaufmann, Daniel, Aart Kraay, and Pablo Zoido-Lobatón. 2002. "Governance Matters II." Policy Research Working Paper 2772. World Bank, Washington, D.C.
Kinzer, Stephen. 1999. "Riches May Roil Caspian Nations." Washington Post, January 2.
Klare, Michael. 2001. Resource Wars: The New Landscape of Global Conflict. New York: Henry Holt.
Knack, Stephen, and Philip Keefer. 1995. "Institutions and Economic Performance: Cross-Country Tests Using Alternative Institutional Measures." Economics and Politics 7(November):207-27.
Leamer, Edward. 1984. Sources of Comparative Advantage. Cambridge, Mass.: MIT Press.
Leamer, Edward, Hugo Maul, Sergio Rodriguez, and Peter Schott. 1999. "Does Natural Resource Abundance Increase Latin American Income Inequality?" Journal of Development Economics 59(1):3-42.
Lederman, Daniel, and William Maloney. 2002. "Open Questions about the Link between Natural Resources and Economic Growth: Sachs and Warner Revisited." Working Paper. World Bank, Washington, D.C.
Leite, Carlos, and Jens Weidmann. 1999. "Does Mother Nature Corrupt? Natural Resources, Corruption, and Economic Growth." IMF Working Paper 99/85. Washington, D.C.
Lipset, Seymour Martin. 1959. "Some Social Requisites of Democracy: Economic Development and Political Legitimacy." American Political Science Review 53(1):69-105.
Migdal, Joel. 1988. Strong Societies and Weak States: State-Society Relations and State Capacities in the Third World. Princeton, N.J.: Princeton University Press.
Moore, Barrington. 1966. Social Origins of Dictatorship and Democracy. Boston, Mass.: Beacon Press.
Murshed, S. Mansoob. 2003. "When Does Natural Resource Abundance Lead to a Resource Curse?" Working Paper. Institute of Social Studies, The Hague.
Neumayer, Eric. 2004. "Does the 'Resource Curse' Hold for Growth in Genuine Income as Well?" World Development 32(10):1627-40.
Nugent, Jeffrey, and James Robinson. 2001. "Are Endowments Fate? On the Political Economy of Comparative Institutional Development." CEPR Discussion Paper 1311. Centre for Economic Policy Research, London.
Papyrakis, Elissaios, and Reyer Gerlagh. 2004. "The Resource Curse Hypothesis and Its Transmission Channels." Journal of Comparative Economics 32(1):181-93.
Pritchett, Lant. 2000. "Understanding Patterns of Economic Growth: Searching for Hills Amongst Plateaus, Mountains, and Plains." World Bank Economic Review 14(2):221-50.
Putnam, Robert. 1993. Making Democracy Work: Civic Traditions in Modern Italy. Princeton, N.J.: Princeton University Press.
Rigobon, Roberto, and Dani Rodrik. 2004. "Rule of Law, Democracy, Openness, and Income: Estimating the Interrelationships." NBER Working Paper 10750. National Bureau of Economic Research, Cambridge, Mass.
Ritzen, Josef, William Easterly, and Michael Woolcock. 2000. "Social Cohesion, Institutions, and Growth." Policy Research Working Paper 2448. World Bank, Washington, D.C.
Rodrik, Dani. 1999. "Where Did All the Growth Go? External Shocks, Social Conflict, and Growth Collapses." Journal of Economic Growth 4(4):385-412.
Rodrik, Dani, Arvind Subramanian, and Francesco Trebbi. 2004. "Institutions Rule: The Primacy of Institutions over Integration and Geography in Economic Development." Journal of Economic Growth 9(2):131-65.
Ross, Michael. 1999. "The Political Economy of the Resource Curse." World Politics 51(1):297-322.
Ross, Michael. 2001. "Does Oil Hinder Democracy?" World Politics 53(3):325-61.
Sachs, Jeffrey, and Andrew Warner. 1995. "Natural Resource Abundance and Economic Growth." NBER Working Paper 5398. National Bureau of Economic Research, Cambridge, Mass.
Sachs, Jeffrey, and Andrew Warner. 1999. "The Big Push, Natural Resource Booms, and Growth." Journal of Development Economics 59(1):43-76.
Sala-i-Martin, Xavier, and Arvind Subramanian. 2003. "Addressing the Natural Resource Curse: An Illustration from Nigeria." NBER Working Paper 9804. National Bureau of Economic Research, Cambridge, Mass.
Sokoloff, Kenneth, and Stanley Engerman. 2000. "Institutions, Factor Endowments, and Paths of Development in the New World." Journal of Economic Perspectives 14(3):217-32.
Sokoloff, Kenneth, and Stanley Engerman. 2002. "Factor Endowments, Inequality, and Paths of Development among New World Economies." Economia 3(1):41-109.
Statistics Canada. 2002. World Trade Analyzer 1985-2000. Ottawa. CD-ROM.
Stijns, Jean-Philippe. 2001. "Natural Resource Abundance and Economic Growth Revisited." Economics Working Paper Archive at Washington University, St. Louis, Mo.
Temple, Jonathan. 1999. "The New Growth Evidence." Journal of Economic Literature 37(1):112-56.
Thurow, Roger. 2003. "In War on Poverty, Chad's Pipeline Plays an Unusual Role." Wall Street Journal, June 24.
Tilly, Charles. 1992. Coercion, Capital, and European States: AD 990-1992. New York: Blackwell.
Tornell, Aaron, and Philip R. Lane. 1999. "The Voracity Effect." American Economic Review 89(1):22-46.
UNCTAD (United Nations Conference on Trade and Development). 1988. Handbook of International Trade and Development Statistics. New York: United Nations.
Vick, Karl. 2001. "Vital Ore Funds Congo's War." Washington Post, March 19.
Wade, Robert. 1988. Village Republics: Economic Conditions for Collective Action in South India. New York: Cambridge University Press.
Weaver, Mary Anne. 2000. "Democracy by Decree." New Yorker, November 20.
Woolcock, Michael, Lant Pritchett, and Jonathan Isham. 2001. "The Social Foundations of Poor Economic Growth in Resource-Rich Economies." In R. M. Auty, ed., Resource Abundance and Economic Development. New York: Oxford University Press.
World Bank. 1998. Assessing Aid: What Works, What Doesn't, and Why. New York: Oxford University Press.
World Bank. 1999. World Development Indicators 1999. Washington, D.C.
World Bank. 2002. "New Data on Property Rights and Rule Based Governance." Online document available at www1.worldbank.org/publicsector/indicators.htm.

Attaching Workers through In-Kind Payments: Theory and Evidence from Russia

Guido Friebel and Sergei Guriev

External shocks may cause a decline in the productivity of fixed capital in certain regions of an economy. Exogenous obstacles to migration make it hard for workers in those regions to reallocate to more prosperous regions. In addition, firms may devise "attachment" strategies to keep workers from moving out of a local labor market. When workers are compensated in kind, they find it difficult to raise the cash needed for migration. This endogenous obstacle to migration has not yet been considered in the literature. The article shows that the feasibility of attachment depends on the inherited structure of local labor markets: attachment can exist in equilibrium only if the labor market is sufficiently concentrated. Attachment is beneficial for both employers and employees but hurts the unemployed and the self-employed. An analysis of matched household-firm data from the Russian Federation corroborates the theory.

Economies are sometimes hit by massive shocks such as trade liberalization, economic integration or secession, terms of trade collapse, war, and the fall of communism. These events have one thing in common: they dramatically affect the productivity of capital in different sectors.
Formerly profitable enterprises, sometimes entire industries, decline, and others grow. When industries are localized, resources ought to reallocate across regions in response to such shocks. In particular, one would expect a large relocation of workers. In a perfect world this reallocation should be swift, but in the real world there are important obstacles slowing it down, such as social norms, risk aversion, and underdeveloped housing markets.1

Guido Friebel is maître de conférences at the School for Advanced Studies in the Social Sciences (EHESS) and fellow at the Institute of Industrial Economics (IDEI) of Université des Sciences Sociales de Toulouse; his email address is friebel@cict.fr. Sergei Guriev is Human Capital Foundation associate professor of Corporate Finance, New Economic School, Moscow; his email address is sguriev@nes.ru. Both authors are research affiliates at the Center for Economic Policy Research. A draft of this article was awarded a Gold Medal at the Annual Conference of the Global Development Network, Tokyo, in 2000. The article builds on and replaces Friebel and Guriev (2000). The authors are grateful to the Editor and three anonymous referees, Gérard Roland for his continuous encouragement and advice, and to Lee Alston, Dan Berkowitz, Micael Castanheira, John Earle, Allison Garrett, Roman Inderst, Joep Konings, Stas Kolenikov, Patrick Legros, Meg Meyer, Espen Moen, Viktor Polterovitch, Michael Raith, Asa Rosen, Klara Sabirianova, Etienne Wasmer, and Katia Zhuravskaya for comments and support. They thank three anonymous referees and Jaime de Melo for comments and suggestions. They also acknowledge the comments of colleagues at seminars in Berlin, Brussels, Caen, Carnegie-Mellon, Prague, Cergy-Pontoise, Dortmund, Leuven, Moscow, Paris, Pittsburgh, Prague, Stockholm, Toulouse, and Urbana-Champaign and conferences in Beijing, Bristol, Moscow, Sinaia, Seattle, Tokyo, and Voronezh. They thank Sergei Golovan, Dmitry Kvassov, and Daniil Manaenkov for excellent research assistance. The authors also acknowledge the support of the European Union's Technical Assistance to the Commonwealth of Independent States program, the Wallander Foundation, and the European Union's Training and Mobility of Researchers program. They also thank the organizers of the Russian Longitudinal Monitoring Survey.

THE WORLD BANK ECONOMIC REVIEW, VOL. 19, NO. 2, pp. 175-202 doi:10.1093/wber/lhi012 Advance Access publication October 5, 2005 © The Author 2005. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org.

There may also be strong endogenous forces that slow labor reallocation. Firms may devise "attachment" strategies to keep workers from moving out of a local labor market. With sunk investment costs, firms want to benefit as much as possible from their depreciating capital and thus need the labor to match it. Firms can attach workers through nonmonetary forms of compensation. When capital markets are imperfect, workers must have cash to finance the costs associated with migration. But when they are compensated through in-kind payments or fringe benefits, they are forced to consume and cannot save the cash needed for migration.

At first glance, it might seem that attachment can work only in monopsonistic local labor markets. Firms ought to compete for workers not only in the level of compensation but also in the type of compensation. By offering cash wages, firms would poach workers from other firms that pay nonmonetary compensation. The model developed to explore this notion shows that it is not true. Attachment can be sustained as a noncollusive equilibrium in an oligopsonistic market provided that the number of firms in the local labor market is sufficiently small.
The model shows that attachment is not only good for firms but also good for employees. However, it hurts the unemployed and the self-employed. The model predicts that too little labor market competition may, through attachment, obstruct labor relocation and the capacity of an economy to adjust to external shocks. It also predicts that too little labor market competition may, through attachment, create an externality for workers in segmented labor markets.

The intuition of the results is as follows. In a two-period model, workers are subject to a productivity shock that may make migration worth their while. The labor market has a given number of jobs, and there are job-specific matching frictions. Worker-firm matches survive for only one period. Whenever a firm opens a vacancy, it faces uncertainty about whether it will be filled. It is this uncertainty about finding a matching worker in the second period that provides the rationale for paying nonmonetary wages in the first period: employers like to retain workers in the local market to keep labor supply thick.

1. See Roland (2000) for a survey of related literature. This article is concerned with the strategies firms undertake to reduce outward mobility in their labor markets. This is related to the problem of attracting workers through in-kind compensation as a safeguard against firm opportunism in the labor market, a topic that has been analyzed in the literature on company towns (see, for instance, the discussion in Williamson 1985). The most important difference in perspective is that no effort is made here to explain why workers would move into segmented labor markets. The question of interest is why they may find it hard to move away.

Employers cannot bind workers to the firm because matches are dissolved after each period. There is an important distinction between this attachment to a market and ties to a specific firm, which have been analyzed before.2 In this form of attachment an employer's benefit from attaching workers must be shared with its competitors. This creates an externality leading to the collapse of the attachment equilibrium when the number of local employers, N, increases above a certain threshold. A current employer internalizes only 1/N of the benefits of attaching the worker, but it bears all the costs. To make a worker accept an attachment contract, the firm must compensate each worker for the foregone option to migrate. This premium is independent of N, but the attachment benefits for the current employer are decreasing in N. When the number of firms reaches a certain level, the costs outweigh the benefits, and attachment ceases to be an equilibrium outcome.

The intuition for the welfare results is straightforward. In the attachment equilibrium employed workers create a negative externality for the unemployed. Each worker who accepts an attachment contract makes it harder for the unemployed to find a job in the second period. Attachment decreases total welfare in the local economy unless there is a substantial labor shortage.

In the model presented here, it is the presence of matching frictions that makes attachment desirable for employers. There may be many other reasons why employers prefer more rather than less labor supply. The model does not hinge on the precise motive for attachment. Efficiency wages would lead to similar results regarding attachment as the ones generated in the matching model. It should also be clear that this study is not intended to contribute to the search literature. Rather, it tries to set up a simple model that can generate predictions on how competition between employers, in the form of labor contracts, affects workers' geographic mobility and welfare.

We use data for the Russian Federation in the second half of the 1990s to test the model.
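The threshold argument above (each employer captures only 1/N of the attachment benefit but pays the full premium) can be illustrated with a deliberately stylized sketch. It is not the authors' model: the total benefit and the premium are hypothetical constants chosen for illustration, and the only feature taken from the text is that the employer's share falls as 1/N while the premium does not depend on N.

```python
import math


def attachment_pays_off(total_benefit: float, premium: float, n_firms: int) -> bool:
    """True when one employer's 1/N share of the benefit covers the premium.

    total_benefit and premium are hypothetical numbers, not estimates
    from the article.
    """
    return total_benefit / n_firms >= premium


def max_firms_supporting_attachment(total_benefit: float, premium: float) -> int:
    """Largest number of local employers for which attachment still pays off."""
    return math.floor(total_benefit / premium)


if __name__ == "__main__":
    benefit, premium = 10.0, 2.5  # illustrative values only
    for n in range(1, 7):
        print(n, attachment_pays_off(benefit, premium, n))
    print("threshold:", max_firms_supporting_attachment(benefit, premium))
```

With these illustrative numbers, attachment pays off for up to four employers and stops paying off at five, which mirrors the qualitative claim that attachment survives only in sufficiently concentrated markets.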
Productivity in many regionally concentrated industries has shifted dramatically since the beginning of the transition, with some regions enjoying high growth rates, whereas others have experienced output declines of more than 50 percent (Berkovitz and DeJong 1999). Yet the rate of interregional migration is very low, around 1 percent a year (Goskomstat 2000), even lower than before the transition, when it was roughly 4 percent. Analysis of the Russian data reveals that many workers receive their compensation, fully or in part, in nonmonetary form.

We use the Russian Longitudinal Monitoring Survey (RLMS)3 to investigate the two main predictions of the model. First, after controlling for personal, firm-level, and regional characteristics, the propensity of workers to leave a region should be an increasing function of the competitiveness of the local labor market. Higher labor market concentration is shown to reduce geographic mobility, a result that is significant and robust to various specifications. An increase in labor market concentration by one standard deviation can reduce the propensity of an individual to leave by up to 3.6 percentage points.

2. See Salop and Salop (1976) for a model of how firms use backlogged wages to reduce worker turnover.

3. For more information on the RLMS, see www.cpc.unc.edu/projects/rlms and Zohoori and others (1998).

Second, after controlling for regional and personal characteristics and the financial situation of firms (another important potential determinant of in-kind payments), the model predicts that in-kind payments should be more frequent in more concentrated local labor markets. We find corroborating evidence using a subset of the RLMS that was matched with firm data.4 A one-standard-deviation increase in market concentration increases the probability of in-kind payments by at least 3 percentage points.
We also discuss why our theory appears better suited than alternative explanations for understanding the regression results.

This section discusses the literature on the interlinkage of markets and labor tying in developing economies, oligopsony (monopsony) in the labor market, and the Russian economy.

At first glance, the structure of the proposed model bears some resemblance to the literature on interlinked markets: credit market imperfections and reduced labor mobility feature in both. The literature on interlinkages has been motivated by many observations from developing economies in which people often conduct business with the same partners in several markets. Landlords, for instance, not only employ workers but also often provide them with credit, and traders not only buy crops from farmers but also often provide the farmers with seeds or credit to buy seeds. The literature presents a number of explanations for such bundling (see the survey by Bell 1988). Many explanations build on the idea that interlinking transactions can help overcome agency problems. For instance, when workers have no other collateral than their work, "pure" money lenders have no use for it, whereas farmer/money lenders do.

4. The authors thank Klara Sabirianova for providing these matched data.

The model entails no agency considerations, and firms do not interact with workers on more than one market. Rather, firms want to ensure their labor input (in a manner similar to that described by Bardhan 1983, who argues that employers benefit from labor tying because it ensures labor supply in peak times). Firms in the model offer in-kind payments to reduce geographic mobility, and workers are willing to accept in-kind contracts if the value of the provided goods is at least equal to their outside options plus the option value of migrating, which they forgo if they accept in-kind payments. Hence, attachment contracts create a surplus for any firm-worker pair.
However, as explained before, in-kind payments impose an externality on the pool of unemployed workers. The question thus arises whether interlinkages, particularly tying, are good for workers.5 Again, the framework here differs from the existing labor-tying literature in that it considers imperfect competition in the labor market and involuntary unemployment, both important problems in transition and developing economies. The theory underlying the model offers a simple explanation of why tying may be bad for the unemployed but not the employed. In the model sufficient competition thus has an important role: it makes attachment collapse, and it protects the unemployed from welfare losses due to the attachment of employees. These effects are absent in models of labor tying that assume either labor market monopsony (Bardhan 1983) or perfect competition (Mukherjee and Ray 1995).

There is also a small but growing body of literature that uses concepts from industrial organization to analyze labor markets. Boal and Ransom (1997) and Bhaskar and others (2002) show that certain labor market phenomena can be explained only if firms hold market power. Bhaskar and colleagues argue, however, that it is unrealistic to assume conventional monopsony: employers do compete with each other. Somewhat similar to the examples discussed by Bhaskar and others, cases of intermediate competition are of particular interest here. If there is perfect competition, attachment does not pay off. If there is monopsony, attachment is costless: because the worker has no choice, the firm does not need to compensate the worker for the forgone option to migrate. The problem becomes interesting in the case of oligopsony. Like Stevens (1994), this article looks at the provision of training and at poaching in a model with imperfect competition.

Several studies look at interregional migration and the demonetization of worker compensation in Russia.
Jarocinska and Wörgötter (2000) and Andrienko and Guriev (2004) show that there are substantial wage differences across regions and yet little interregional mobility. This points to the presence of frictions in the labor market. A few studies have examined demonetization of worker compensation as a source of such frictions. Commander and Schankerman (1997) have analyzed Russian firms' practice of providing social services to their workers. They argue that the absence of a public social security network reduces worker mobility, because workers fear exclusion from firm-provided social services. Their argument applies to mobility in the same labor market, not to mobility across segmented local labor markets. Also, it presumes that firms are worker-controlled. Grosfeld and others (2001) relate the segmentation of the Russian labor market with respect to skills to the provision of fringe benefits. Earle and Sabirianova (2000, 2002) look at wage arrears as an equilibrium outcome between firms in a given local labor market. They argue that one firm's decision not to pay wages may be a strategic complement to the decisions of other firms.

5. See, for instance, Schaffner (1995), who argues that landlords subject workers to "servility" and restrict their information to maintain servile relationships.

This article is related to that literature inasmuch as it looks at demonetization as a result of firm strategies, but its focus is different. The other publications do not provide a theory of the impact of market structure on the feasibility of demonetization strategies. Nor do they focus on territorial mobility, as this article does. The main interest here is to study how market structure affects territorial mobility and thus the ability of an economy to adjust to shocks. Inherited labor market structures slow the reallocation of labor. As a result, local labor markets remain segmented.
Thus, the theory proposed here also contributes to the understanding of regional disintegration in Russia, which has attracted considerable interest in the economics literature. Blanchard and Shleifer (2001) argue that Russia performs poorly in comparison with China because in Russia weak central institutions fail to curb the rent-seeking behavior of regional and local governments. If Tiebout competition were feasible, efforts to recentralize (such as those undertaken by the Putin administration) would not be necessary. Workers who live in concentrated labor markets cannot vote with their feet. These are most likely the workers who are subject to the least efficient local governments. Hence, by undermining Tiebout competition, attachment contributes to regional disintegration. Berkovitz and DeJong (1999) have shown that Russia has "internal borders" erected by regional governments so that they can pursue their political interests. The model shows that labor markets are subject to internal borders similar to those in product markets.6

While Russia is a good testing ground for the theory, attachment seems to be a more general phenomenon, making the theory relevant beyond transition economies. Throughout economic history firms have devised strategies to reduce the territorial mobility of workers. Examples include company towns and the truck system7 and labor-tying arrangements in rural economies. Paternalism in the Southern states of the United States following the Civil War is another example. Alston and Ferrie (1993, 1999) show that when slaves were freed, rural employers had to cope with high turnover. Southern landlords had to limit competition among themselves to prevent Northern capital from moving to the South. Farmers created a web of social control mechanisms, in-kind payments, services, and protection from racist-inspired violence. They also exerted political power to keep Northern influence out of their labor market. During World War I and following restrictive immigration legislation in the 1920s, as immigration from outside the United States slowed and outmigration of former slaves became a threat, landlords used state legislatures and paternalistic benefits to limit outmigration. This strategic behavior prevailed until production became less labor-intensive, and long-term investments of workers and farmers in the fertility of the soil became less important.

6. This is also in line with Ericson's (2000) view of the Russian economy as "post-soviet industrial feudalism." The attachment mechanism posited here shows why quasi-feudal structures in Russia have emerged and how they can be sustained. It is also important to stress that one should be most worried about the welfare of those outside the quasi-feudal arrangements: the unemployed and the self-employed.
7. The truck system was widely used, particularly in the United Kingdom and United States. Workers were obliged to buy their goods in company stores and often became heavily indebted, making it difficult for them to move away (see Hilton 1960).

Industrial firms in Russia are experiencing a similar transition, and firms appear to be reacting in a similar way. Kornai (1992) has argued that the dependence of workers on the Communist Party and on their firm was a constituent element of communism. The collapse of communism freed individuals from party dominance. Workers should then have been able to move to where they are most productive, rather than remain where Stalin wanted them (or their parents) to be. But attachment strategies appear to make this reallocation a slow and complicated task.

Consider a local economy with N identical firms and two periods. First-period labor supply is a continuum of workers, normalized to L1. Second-period labor supply, L2, is endogenous. Labor contracts cover only the current period. There is also a geographically distant labor market, the "central" labor market, which is competitive.
To find a job there, a worker incurs migration and search costs, T. Labor productivity in the central market is subject to a shock: with probability p, the wage wm net of the costs of migration exceeds R, the productivity of a worker in the local labor market: With probability 1 - p, the wage in the central market is low (for simplicity, it is assumed to be zero) so that migration does not pay off.

The costs of migration must be paid up front. Thus at the beginning of the second period the worker needs at least T units of cash to migrate. Workers who are unemployed in the first period receive no wages and cannot migrate. The ability to migrate for workers who have a job in the first period depends on their first-period employment contracts. If they agreed on a standard cash contract, they have enough cash to migrate (the cash wage is assumed to exceed T in equilibrium) and receive utility wc. If they agreed on a contract specifying compensation in nonmonetary form (an attachment contract) that provides utility wa, they cannot migrate. For simplicity, firms are assumed to bear no additional cost of paying salary in kind relative to a monetary salary that provides an equivalent utility to the worker. This assumption does not affect the main results.

Timing

For the first period, workers and firms are randomly matched, with a worker matched with a firm no more than once. Workers who do not find a match remain unemployed for the first period.8 For any match the worker and firm bargain individually over wages. Assuming that bargaining is efficient, the joint surplus is maximized by agreeing on either a cash contract or an attachment contract. First-period production takes place, workers and firms receive their payoffs, and all matches dissolve. The unemployed get nothing.

For the second period, workers migrate or not depending on whether migrating pays off for them and whether they have the necessary cash.
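The second-period migration rule just described can be restated compactly. The following Python sketch is only an illustration under the assumptions stated in the text (migration costs T are paid up front; unemployed and attached workers hold no cash). The names `Contract`, `cash_on_hand`, and `migrates`, as well as the numbers in the usage note, are hypothetical and do not come from the article.

```python
from enum import Enum


class Contract(Enum):
    """First-period status of a worker (labels are illustrative)."""
    UNEMPLOYED = "unemployed"   # no first-period wage, hence no cash
    CASH = "cash"               # standard cash contract
    ATTACHMENT = "attachment"   # compensation paid in kind, hence no cash


def cash_on_hand(contract: Contract, cash_wage: float) -> float:
    """Cash a worker holds at the start of the second period."""
    return cash_wage if contract is Contract.CASH else 0.0


def migrates(contract: Contract, cash_wage: float,
             migration_cost: float, central_wage_high: bool) -> bool:
    """A worker migrates only if the central wage is high AND T can be paid up front."""
    can_pay = cash_on_hand(contract, cash_wage) >= migration_cost
    return central_wage_high and can_pay
```

With a cash wage of 1.0 and T = 0.4, only the worker on a cash contract leaves, and only in the high-wage state, which occurs with probability p. Attached and unemployed workers stay in the local market in both states.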
The remaining workers (including those who were unemployed in the first period) are matched according to the same matching technology. Workers and firms bargain about the second-period wage. Because there are only two periods, attention can be restricted to cash wages. Second-period production takes place, and workers and firms receive their payoffs.

Matching, Bargaining, and Second-Period Labor Supply

Matching is assumed to take place according to a standard matching function (see Petrongolo and Pissarides 2001). The number of successful matches between workers and firms, M, is determined by a matching function with constant returns to scale: with l = L/J denoting the number of workers per job, P(l) is the probability that a firm will fill a vacancy, and γ(l) = P(l)/l is the probability that any given worker will find a job. According to the assumptions above, P(l) is an increasing function (approaching 1 as l goes to infinity), and γ is a decreasing function (approaching 0 as l goes to infinity). Thus, M(L, J) < L and M(L, J) N**.

The intuition for Proposition 1 is as follows (see also figure 1). Given efficient bargaining, any worker-firm match chooses the contract that maximizes the joint surplus. Inspection of equation 9 shows that the value of attachment (the left side of the equation) increases with the impact the attachment of workers has on the firm's probability of filling a vacancy in the second period, pP'(L1 - p[M(L1, J) - q]), and with R, the productivity of labor in the local market. Each firm internalizes only 1/N of this attachment benefit, as matches are destroyed in the beginning of the second period (attachment is market specific, not firm specific). However, a worker accepts an attachment contract only when the first-period wage includes compensation for the value of the forgone option to migrate, p[wm - T - (1/2)γ(L2)R]. When N increases, the left side of equation 9 decreases, and the right side remains constant. Ultimately, the cost
Ultimately, the cost Friebel and Guriev 185 FIGURE 1. Share of Attachment Contracts in the First Period in Equilibrium as a Function of Number of Employers in the Local Labor Market, N 1 Share of attachment contracts q /M(L,A Full attachment 1 of attachment dominates the benefits for the individual firm. This free-riding effect makes attachment collapse.'1 The model assumes that all firms are symmetric. Suppose instead that firms differ in the stock of capital and therefore in the number of vacancies. Then there will be equilibria in which large firms attach and smaller ones do not. Indeed, the smaller the firm, the larger the free-rider problem. The benefit of attachment per worker is proportional to the firms' employment, whereas the cost of attachment is the same for all firms. Formally, the only change in equation 9 is that 11N is replaced by the firm's share in local employment. Notice that the attachment policies of the larger firms impose a negative externality on the employees of small firms (as well as on the unemployed and the self-employed).The employees of the small firms are not attached and leave with probability p; however, with probability 1 - p they stay and have to face tougher competition for jobs in the second period. The results are robust to relaxing the assumption that all first-period matches are dissolved. Suppose that the first-period matches are destroyed only .with probability A. Then the firms can in principle offer long-term contracts to economize on search frictions in the second period. In the absence of firm- specific investment the contracts will specify the second-period wage just to cover the second-period option. Also, because it is unrealistic to assume 11. The equilibrium in Proposition 1is unique for a given N. This follows from the concavity of P(1). 
If B(1) were convex and ~ ( 1 declined sufficiently slowly with I, ) N* > N** then, the structure of the equilibrium would be as follows: (1)if N < N**,there exists a unique equilibrium with full attachment q =M(L,,1);(2)if N E (N**,N*),there exist at least three equilibria: a stable equilibrium with full attachment, q = M(L1,1);a stable equilibrium without attachment q = 0; and at least one unstable equilibrium with partial attachment with q solving equation 10; (3) if N > N*, there exists a unique equilibrium without attachment, q = 0. commitment on the worker's part, long-term contracts alone cannot protect firms from losing workers to the central labor market (equation 1).To attach workers, firms have to rely on in-kind contracts. Simple calculations yield the condition for the attachment equilibrium which becomes equation 9 as X -+1. Also, firms' commitment to long-term contracts is limited in an environment with high volatility, financial constraints, and high discount rates. Bertrand (2004) shows that import competition reduces U.S. firms' ability to stick to implicit contracts that shield workers from market volatility; financial pressures make it harder to respect long-term wage commitments. Denisova and others (1998) refer to court statistics to show that even under formal labor contracts Russian firms managed to get away with delaying wage payments for months. Under conditions of double-digit inflation, wage arrears were equivalent to renegotiating wages downward. The workers won 95 percent wage arrears lawsuits against firms, but the court rulings were almost never enforced. The results are also robust to changes in the allocation of bargaining power. If the worker gets a < 1 percent of the joint surplus, then condition 10 becomes: P(L2) N < w"-T -- (l-a)R a-~(L2)' (l- a) The properties of equilibrium do not change even if the worker has no bargaining power (a= 0); the only difference is that the attached workers do not benefit from attachment. 
In the unlikely case where the worker has full bargaining power (α = 1), attachment never occurs: the benefit of attachment is trivial, and so is the right side of equation 10.

One can also analyze the case where the bargaining power is endogenous to local labor market conditions, with the worker's bargaining power α decreasing in unemployment and increasing in N. This would strengthen the results. Indeed, the effect of unemployment on bargaining power provides the firm with even stronger incentives to attach workers to increase its surplus in the second period. The link between labor market competition and bargaining power also works in the same direction: as the number of firms increases, attachment becomes even less likely as the firm expects to appropriate a lower share of returns to attachment.

Welfare

Because of the assumptions of efficient bargaining and equal allocation of bargaining power between worker and firm in a match, it is clear that workers who are employed in the first period and firms cannot lose from attachment. However, the unemployed of the first period suffer as a result of attachment. A proportion p of the workers with attachment contracts would migrate if they had cash contracts instead. Under attachment, they stay and reduce the probability that the unemployed will find a job in the second period. Thus, the fact that employed workers accept attachment contracts imposes an externality on the unemployed.12

How does the local economy as a whole fare under attachment? Consider the sum of the utilities (for clarity, the assumption that J is normalized to 1 is dropped here):
The derivative with respect to q is
Equation 16 shows that attachment decreases welfare only if unemployment in the second period is sufficiently high: L2/J > l*, where
This result reveals the welfare implications of attachment.
Attachment is beneficial to the local economy because it increases matching efficiency, but it is costly because potentially mobile workers forgo the option to earn higher wages outside. The beneficial effect is more important if there is a shortage of workers in the second period (if L2/J is low). The cost of attachment is high if there is high unemployment (L2/J is high), however, because the marginal worker has only a small effect on the efficiency of matching and each worker's local expected payoff is very low.13

If workers and firms could write enforceable debt contracts, it would be possible for firms from the central labor market to finance workers' migration from the local labor market. However, because workers have no collateral and indentured servitude contracts cannot be enforced, such contracts would be infeasible: the worker would default on the debt after arriving in the central labor market. Entry of firms would be a second possibility. However, although the capital costs of incumbent firms are sunk, new entrants would have to pay a fixed cost, which, if high enough, would prevent firms from entering.

12. This is similar to Rama and Scott (1999), where the dominant firm's employment decisions also have a negative effect on outsiders (small firms): downsizing the monopsony increases the pool of people looking for jobs in the local labor market, thereby suppressing wages and local demand.
13. The formally more correct expression of a social planner's problem would be to maximize welfare by choosing whether to ban attachment. The results here show that banning attachment increases welfare if there is high unemployment in the second period.

Finally, matching frictions are not the only reason firms might like to attach workers.
In an alternative model based on efficiency wages, greater local labor supply makes it cheaper rather than easier for firms to fill their vacancies.14

This section describes the Russian labor market in the second half of the 1990s and presents regression results using Russian data, including potential alternative explanations and counterarguments.

Characteristics of the Russian Labor Market

Several features of the Russian labor market are important to this analysis.

DEMONETIZATION OF WORKERS' COMPENSATIONS. In the Soviet Union firms provided a wide range of nonmonetary benefits to their workers, including hospitals, housing, childcare, and education. By presidential decrees all assets related to the provision of such services had to be transferred to municipalities, but firms still provide some social services. In concentrated local labor markets firms own up to 85 percent of the social assets (Healey and others 1998). A survey of 93 enterprises reports that firms even invest in new types of facilities to provide fringe benefits (Tratch and others 1996). A recent survey of 400 firms confirms widespread ownership of social assets and investment in new ones (Haaparanta and others 2003). Even more striking, a survey of 200 firms shows that in-kind substitutes for wages were on the rise (Biletsky and others 1999). In 1991, 3 percent of surveyed firms provided in-kind payments; by 1998, 27 percent did.

In-kind payments are a novel phenomenon, but the provision of fringe benefits could be attributed to the behavioral inertia of paternalistic managers. However, a survey of managers of 142 enterprises by the Russian Center for Public Opinion Research (VCIOM 1997) indicates that the provision of fringe benefits follows the strategic patterns highlighted in the model: only 37 percent of respondents continued to run the social assets of their firm because of Soviet traditions, whereas 51 percent did so to retain workers. Juurikkala and Lazareva (2004) show that provision of social services reduces employee turnover.

Besides the fringe benefits, Russian workers in the second half of the 1990s saw an explosion of explicit in-kind payments.15 As discussed in Clarke (2000), wages (and wage arrears) were commonly paid in the firms' outputs, food, and even manure (McMahon 2001). The widespread demonetization of the economy reduced the transaction costs of barter exchange for the firms, but the cost remained high for the workers. As Clarke (2000) argues, workers who were paid in kind were effectively forced to withdraw from the market economy and to engage in barter exchange.

14. These and other results mentioned but not reported in the article are available from the authors.
15. This article does not discuss the decline of in-kind wages in recent years. As the 1998 meltdown drove real interest rates down, the barter economy disappeared and in-kind transactions became more costly for firms. That in turn raised the cost of in-kind employee compensation. According to the RLMS data, the level of in-kind compensation has been declining steadily since 2000.

LOW MOBILITY ACROSS REGIONS AND LABOR MARKET SEGMENTATION. There are huge productivity differences across regions in Russia, which would be expected to result in a massive reallocation of workers. Heleniak (1999), for instance, estimates the stock of potential migrants from the Russian north alone at 2 million people. But during the decade of transition, interregional migration in Russia remained fairly constant at about 1 percent a year (Andrienko and Guriev 2004, based on official data). This is surprisingly low, considering that migration rates were at about 4-5 percent before transition. Soviet-style industrialization resulted in geographic concentration of industrial activity, and local employment was often concentrated in one or very few large plants.
Goskomstat (2000) data show that since the outset of transition, labor market segmentation has steadily increased. Consider the ratio of unemployed people to vacancies by economic regions and administrative regions (oblasts). In the Central Region the ratio was roughly 8 to 1 in 1993, increasing to 13 to 1 in 1996, and dropping again to 8 to 1 in 1997. In the Eastern Siberian Region the ratio grew from 18 to 1 in 1993 to 76 to 1 in 1997. More striking, the ratios vary dramatically even within economic regions and across the smaller oblasts, as shown by a comparison of four administrative regions and Moscow, all in the Central Region, the most developed and densely populated economic region (table 1). The difference between Moscow and Ryazan oblast, for example, increased between 1993 and 1997, and by 1997 the ratio was 48 times higher in Ryazan than in Moscow. Also, Andrienko and Guriev (2004) discuss evidence on the lack of convergence across oblasts in both real income and unemployment during the 1990s.

TABLE 1. Ratio of Unemployed to Vacancies in the Central Economic Region

Region            1993   1994   1995   1996   1997
Bryansk oblast      58    158     58     62     84
Vladimir oblast     18     28     34     46     38
Moscow City          4      3      3      2      1
Ryazan oblast       24     28     48     42     48
Tula oblast          6     15     18     31     32

Source: Authors' calculations based on official Goskomstat data for respective years.

SCOPE FOR MIGRATION. Why are workers from Ryazan, a town barely 200 km from Moscow, not moving to the capital? An obvious answer is that migration may not be worth the cost. A rough estimate of the costs of migration suggests that this may indeed be the case. We collected data on rents, transportation costs, and monthly salaries in rubles for up to 10 occupations for 28 Russian towns and cities, using job advertisements in newspapers in October 2000.16 A simple back-of-the-envelope calculation for Moscow and Ryazan indicates that there is scope for migration, in particular for qualified workers.
However, the associated costs (due to relatively high rents in Moscow and registration and moving expenses) are substantial, amounting to half a year to a year's wages in Moscow, and they must be paid up front. With Ryazan salaries not much above the minimum living standard, the in-kind payments are a serious if not an insurmountable obstacle to migration.

Data and Empirical Results

The model implies two empirical predictions. More competition in a local labor market should result in more migration and in reduced frequency of nonmonetary compensation for workers.

In the absence of micro migration data, data are taken from the RLMS, a representative data set on Russian households. The RLMS is not a panel data set, but interviews in round VI (winter 1995/96) and round VII (winter 1996/97) were conducted in the same dwellings. For respondents who had moved between the two rounds, interviewers were supposed to find out about their new residence, provided they had not left the community. Former respondents who had left the community were not followed up. The analysis uses data on working age individuals who were employed during round VI.

For both hypotheses the main independent variable is a labor market competition index, CR4, which represents the percentage of the labor force employed by the four largest employers in the local labor market, constructed using Goskomstat's Registry of Russian Industrial Enterprises (the annual census of Russian enterprises) for 1995. A larger CR4 is tantamount to more concentration (less competition) in the labor market. From the RLMS's 38 primary sampling units, or communities, individual communities were defined so that each is a local labor market. Where the primary sampling unit is a standalone urban or rural settlement, concentration was calculated at the level of the sampling unit. Where the primary sampling unit is a part of a large city, concentration was calculated for the citywide labor market rather than the district labor market.
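To make the concentration measure concrete, here is a minimal Python sketch of a four-employer concentration ratio as the text defines it: the share of the local labor force employed by the four largest employers. The function name `cr4` and the employment figures are hypothetical. The article's own index is built from the Goskomstat registry, and its exact construction may differ from this sketch.

```python
def cr4(employment_by_firm, labor_force):
    """Share of the local labor force employed by the four largest employers.

    employment_by_firm: iterable of employee counts, one entry per employer.
    labor_force: total local labor force (same units as the counts).
    Returns a fraction between 0 and 1; multiply by 100 for a percentage.
    """
    if labor_force <= 0:
        raise ValueError("labor_force must be positive")
    # Sort employers from largest to smallest and keep at most four.
    top_four = sorted(employment_by_firm, reverse=True)[:4]
    return sum(top_four) / labor_force


# Hypothetical town: one dominant plant and several small employers.
town_employment = [5000, 1200, 800, 400, 300, 100]
town_cr4 = cr4(town_employment, labor_force=10_000)
```

Under this definition a district of a large city would be passed the citywide employment list and labor force, not the district's own.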
Treating the citywide labor market as the unit is consistent with a casual understanding of commuting distance in Russia.

DOES HIGHER LABOR MARKET CONCENTRATION RESULT IN LESS MIGRATION? The dependent variable move takes a value of 0 if an interviewed individual in round VI lived in the same community in round VII and a value of 1 if interviewers were unable to find that individual in the same community in round VII. The category move = 1 thus also includes nonrespondents and people who died between the two rounds, meaning that it is an imperfect measure of regional mobility.17

16. The full list is available from the authors.
17. According to Goskomstat, the mortality rate in Russia was roughly 1.5 percent in 1995. Thus sample distortion due to nonrespondents is more substantial than that due to mortality.

Control variables were also drawn from the RLMS: personal characteristics, job characteristics, household characteristics, and proxies for subjective well-being (for instance, satisfaction with life, intention to change job or to move away from a community). We collected additional information on the economy of each community. All nominal variables were deflated by a local consumer price index (CPI) that uses price information on 25 basic goods from the RLMS and weighs them according to the Goskomstat methodology. Descriptive statistics for the most important variables are shown in table 2. (See the appendix for definitions of the variables used.)

TABLE 2. Descriptive Statistics, RLMS Round VI
Columns: Variable, Number of Observations, Mean, SD, Minimum, Maximum.
Variables: move; hhincome, def; jobsyr; edyrs; age; male; married; aprent; nkids7-18; CR4; inkind; cash-cl; cash-sales; c6bank; c6telphp; c6roads; wantmove.
Source: Authors' calculations based on data from RLMS rounds VI (1995/96) and VII (1996/97) and Goskomstat Registry of Russian Industrial Enterprises for 1995 and 1996.

We ran regressions with all potentially interesting personal, household, and job characteristics, but results are presented only for variables that are jointly significant. Table 3 reports on the results for various probit specifications for move. The results show the marginal effect of a change in the respective independent variable on an individual's likelihood of moving (computed at the average value of the respective variable). The first specification includes dummy variables for the primary sampling unit and provides a useful benchmark. Because CR4 is a linear combination of primary sampling unit dummy variables, specification 2 replaces the primary sampling unit dummy variables with the respective CR4 and controls for the eight large economic regions, including a special dummy variable for Moscow.

TABLE 3. Probit (dF/dx) Estimations for Move, RLMS Round VI
Columns: Variable, Specification 1, Specification 2, Specification 3, Specification 4, Specification 5.
hhincome: 0.021* (0.011)
jobsyr: -0.0012 (0.001)
edyrs: 0.003 (0.002)
age: -0.002*** (0.001)
male: 0.061*** (0.008)
married: -0.027 (0.018)
aprent: 0.374*** (0.060)
nkids7-18: -0.0115 (0.010)
Primary sampling unit dummy variables: significant
Regional dummy variable: significant, significant, significant, significant
Further rows: CR4; cash-cl; cash-sales; Number of observations; Log likelihood; pseudo-R2.
*Significant at the 10 percent level.
**Significant at the 5 percent level.
***Significant at the 1 percent level.
Note: Numbers in parentheses are standard errors adjusted for clustering at the primary sampling unit level (specifications 1, 2, 5-7) or at the firm level (specifications 3 and 4). See text for details.
Source: Authors' calculations based on data from RLMS rounds VI (1995/96) and VII (1996/97) and Goskomstat Registry of Russian Industrial Enterprises for 1995 and 1996.

A comparison of specifications 1 and 2 in table 3 shows only slight differences.
The positive sign for monthly household income, deflated by the local CPI, is in line with the theory that highlights the importance of liquidity constraints for moving decisions. After controlling for personal and job characteristics, individuals with higher income should be less willing to leave. Thus the positive sign suggests that the liquidity effect of a higher income dominates the income effect.18 Longer tenure in the firm (jobsyr) makes workers less mobile, a fact that can be reconciled with the presence of relation-specific human capital. Education, measured in years (edyrs), influences moving decisions positively. Older and married people move with lower probability, as do people with children ages 7-18. Men have a higher propensity to move, as do individuals living in rented flats.19

The major lesson from specification 2 is that, as predicted, higher labor market concentration as measured by CR4 has a large negative impact on individuals' moving decisions: a one standard deviation (0.29) increase in CR4 results in a 3.6-percentage-point decrease in an individual's probability of moving. Given that move = 1 holds for only 17 percent of surveyed individuals, this effect is sizable, equal to roughly one-fifth of the sample moving rate.

DOES HIGHER LOCAL LABOR MARKET CONCENTRATION INCREASE THE PROBABILITY OF IN-KIND PAYMENTS? The dependent variable used to investigate this prediction is a binary indicator of whether a person received in-kind payments.20 Specification 1 in table 4 shows that although most personal characteristics have no significant impact, CR4 has a significant positive impact on the occurrence of in-kind payments, in line with the theory. It could be argued that firms that are more cash-constrained may be forced to pay wages in nonmonetary form (inkind) and that firm liquidity is correlated with CR4. We have explored this using matched worker-firm data for a subset of individuals from the RLMS.
We have used two proxies for the financial constraints facing a firm: cash-cl, defined as the ratio of the firm's cash holdings at the time of the survey (end of 1995) to its current liabilities at the same date, and cash-sales, defined as the ratio of cash holdings to annual sales. Though these variables restrict the sample to fewer than 1,000 individuals, and thus the results should be interpreted with caution, the results do support the proposed theory (the third and fourth columns in table 3 and the second and third columns in table 4). Both CR4 and cash-cl have the expected signs and are statistically significant, whereas cash-sales has the expected sign but is not significant. The influence of CR4 on inkind increases slightly with the inclusion of these variables, but the main point is that concentration affects the probability of in-kind payments positively, providing additional support for the theory. The regression also shows that personal characteristics have a negligible effect on the occurrence of nonmonetary compensation.

18. It would have been preferable to look at the stock of household savings, but that information is not available in the RLMS. Regressions are reported for household income rather than for individual salaries, because the former is a better measure of liquidity. Nonetheless, regressions were also run with monthly salary; the respective coefficient is positive and significant as well.

19. This can be interpreted as a sign that people who move more often prefer to live in rented flats rather than to own their home (or to live in company dormitories). However, apartment rental is also a potential proxy for the cash individuals hold, because in Russia rental flats are usually of higher quality and more expensive than the other forms of housing.

20. The magnitude of these payments is unknown. Information is also unavailable on the potential provision of social services that are considered to be of a larger magnitude than in-kind payments.

TABLE 4. Probit (dF/dx) Estimations for Inkind, RLMS Round VI

Variable                   Specification 1   Specification 2   Specification 3   Specification 4
hhincome                   0.003             0.003             0.004             0.000
                           (0.009)           (0.015)           (0.022)           (0.012)
jobsyr                     -0.000            0.000             0.001             -0.000
                           (0.000)           (0.001)           (0.001)           (0.000)
edyrs                      -0.005***         -0.004            -0.005            -0.004**
                           (0.002)           (0.002)           (0.003)           (0.002)
age                        -0.000            0.000             -0.000            -0.000
                           (0.001)           (0.001)           (0.001)           (0.001)
male                       0.016             -0.001            0.001             0.013
                           (0.009)           (0.011)           (0.016)           (0.010)
married                    -0.027            0.001             0.001             -0.004
                           (0.018)           (0.018)           (0.025)           (0.012)
aprent                     0.003             0.001             0.001             0.044*
                           (0.010)           (0.025)           (0.036)           (0.024)
nkids7-18                  -0.015***         0.015*            0.020*            0.016***
                           (0.005)           (0.009)           (0.012)           (0.006)
Regional dummy variables   significant       significant       significant       significant
CR4                        0.093***          0.101**           0.148**           ...
                           (0.029)           (0.049)           (0.067)           ...
cash-cl                                      -0.393***
                                             (0.132)
cash-sales                                                     -0.914
                                                               (0.689)
c6bank                                                                           ...
Number of observations     3910              948               891               ...
Log likelihood             -1062             -272              -269              ...
pseudo-R2                  0.062             0.152             0.140             ...

*Significant at the 10 percent level. **Significant at the 5 percent level. ***Significant at the 1 percent level.
Note: Numbers in parentheses are standard errors adjusted for clustering at the primary sampling unit level (specifications 1 and 4) or at the firm level (specifications 2 and 3). See text for specification details. Entries shown as "..." are not legible.
Source: Authors' calculations based on data from RLMS rounds VI (1995/96) and VII (1996/97) and Goskomstat Registry of Russian Industrial Enterprises for 1995 and 1996.

ADDITIONAL REGRESSIONS. The mobility variable is of rather low quality. Move = 1 contains both migrants and nonrespondents.
Direct identification was impossible, but a round VI question on whether respondents intended to move in the following 12 months proved to be a good predictor of move = 1: the probability was 42 percent for those who had indicated an intention to move and 15 percent for the rest of the sample. Specification 6 in table 3 shows the results when individuals who did not intend to move but had move = 1 were removed from the sample, because they are more likely to be nonrespondents.21 The results show a lower magnitude for CR4, but it remains significant, and the explanatory power more than doubles compared with specification 5. Specification 7 reports the results for a subsample of individuals who had reported in round VI that they intended to move. The coefficient for CR4 is significant and very large, but because the sample size shrinks to 292 individuals, care must be taken not to overinterpret the results.

The determinants of the intention to move were also estimated (results are not reported here). The intention to move was not found to depend on in-kind payments (controlling for income, apartment rental, and so on). Mobility was found to depend on inkind, controlling for intention to move.

We have run regressions with different additional controls and on different subsamples. In all cases, the results were similar to those already discussed. Among others, we have looked at alternative measures of income, such as individual wages rather than household income. To control for liquidity at a more aggregated level, we used the ratio of per capita monetary income, deflated by the minimum living standard in the region, as well as deflated per capita bank deposits in the region. To control for potential size effects, we ran separate regressions for small and large towns and regressions in which Moscow and St. Petersburg were dropped from the sample. Other regressions were run separately for towns with high and low concentrations (with CR4 above and below 0.5).
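The concentration measure behind this split can be made concrete with a short sketch. Following the definition given in the appendix, CR4 is the share of the four biggest employers in total employment in the primary sampling unit. The function names, the treatment of a town with CR4 exactly equal to 0.5, and the employment figures are assumptions made for illustration only.

```python
def cr4(employment_by_employer):
    """Share of the four largest employers in total employment."""
    sizes = sorted(employment_by_employer, reverse=True)
    if any(size < 0 for size in sizes):
        raise ValueError("employment counts must be non-negative")
    total = sum(sizes)
    if total <= 0:
        raise ValueError("total employment must be positive")
    # Slicing handles towns with fewer than four employers.
    return sum(sizes[:4]) / total


def concentration_group(cr4_value, threshold=0.5):
    """Label a local labor market as high or low concentration.

    A value exactly at the threshold is placed in the low group; the text
    says only "above and below 0.5", so this tie rule is an assumption.
    """
    return "high" if cr4_value > threshold else "low"


# Hypothetical employment counts by employer in two towns.
company_town = [500, 300, 100, 50, 30, 20]
diversified_town = [100] * 10

for name, town in [("company town", company_town),
                   ("diversified town", diversified_town)]:
    value = cr4(town)
    print(name, round(value, 2), concentration_group(value))
```

In the hypothetical company town the four largest employers account for 950 of 1,000 jobs, so the town falls in the high-concentration group; in the diversified town they account for 400 of 1,000.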
We also ran the regressions controlling for occupations (nine occupations as classified by the RLMS), but they turned out to be insignificant and had no effect on the relationship between CR4 and mobility.

21. If these individuals are not counted as migrants, the share of those who leave falls to 4 percent, which is comparable to the official national average for gross outgoing mobility (2.1 percent). Moreover, the data set is biased in favor of migration because it consists of the potentially most mobile category of people. Also, the data set covers nonregistered mobility, which is said to be quite large.

Finally, the effect of in-kind payments on outmigration was estimated in various specifications: separately, jointly with concentration, by two-stage least squares (inkind instrumented by CR4), and as a system of seemingly unrelated equations. In all specifications in-kind payments negatively influence outmigration, and in almost all specifications the coefficient is significant.

Whenever the effect of both inkind and CR4 on move is studied, the coefficient for CR4 decreases in absolute value, but it remains significant. This implies that in-kind payments are only one of the channels through which CR4 influences outmigration. Other potential channels are wage arrears and fringe benefits. The results for regressions with wage arrears were also similar to those already reported. Yet even wage arrears and in-kind payments together do not fully explain the effect of concentration on outmigration. This hints at the importance of fringe benefits, for which the RLMS does not collect data. Juurikkala and Lazareva (2004) use data from a different survey to show that ownership of social assets by firms reduces employee turnover, consistent with the predictions here.

EVIDENCE FROM SUBSEQUENT RLMS ROUNDS. The main regressions refer to RLMS rounds VI and VII. Data from round V could not be used because of triple-digit inflation in 1995.
Using subsequent rounds is also problematic for a number of reasons. First, firm-level data were available only for round VI. Second, there are no data on the variable for intention to move (wantmove). This question was dropped from round IX onward, and the intervals between rounds VII and VIII and between rounds VIII and IX were increased to two years from the one year between rounds VI and VII. Nonetheless, basic specifications were estimated for rounds VII-X. Table 5 reports the cross-section results from those rounds as well as an estimation of the Cox proportional hazard model for migration. To make the results comparable across rounds, household income was deflated using a regional price index rather than a price index at the primary sampling unit level. Price data at the primary sampling unit level are very incomplete, so it is not feasible to construct an index that would be consistent over time. Table 6 reports results for in-kind payments. The results are similar, although in two later rounds the coefficient on CR4 is marginally insignificant.

TABLE 5. Probit (dF/dx) Estimations for Move, RLMS Rounds VI-X

Variable                  VI          VII         VIII        IX          X           Cox(a)
hhincdef                  0.149***    3.930**     3.082*      1.523**     0.586*      1.322***
                          (0.049)     (1.833)     (1.685)     (0.644)     (0.356)     (0.134)
jobsyr                    -0.002      -0.002**    -0.001      -0.002*     -0.002***   -0.003
                          (0.001)     (0.001)     (0.001)     (0.001)     (0.001)     (0.003)
edyrs                     0.001       0.003       0.008**     0.005***    0.002       0.003
                          (0.001)     (0.003)     (0.004)     (0.002)     (0.003)     (0.005)
age                       -0.003***   -0.002**    -0.003***   -0.002*     -0.001**    -0.006***
                          (0.001)     (0.001)     (0.001)     (0.001)     (0.001)     (0.002)
male                      0.140***    0.139***    0.107***    0.084***    0.073***    0.311***
                          (0.015)     (0.013)     (0.011)     (0.012)     (0.009)     (0.046)
married                   -0.075***   -0.071***   -0.082***   -0.069***   ...         ...
                          (0.014)     (0.022)     (0.022)     (0.017)     ...         ...
aprent                    0.218***    0.132*      0.199***    0.093*      ...         ...
                          (0.073)     (0.079)     (0.057)     (0.051)     (0.033)     (0.077)
nkids7                    -0.057**    -0.057***   -0.036      -0.056***   -0.013      -0.084
                          (0.027)     (0.021)     (0.026)     (0.016)     (0.019)     (0.052)
CR4                       -0.166*     -0.199*     -0.219**    -0.158      -0.146      -0.706**
                          (0.086)     (0.104)     (0.093)     (0.124)     (0.090)     (0.317)
Regional dummy variable   significant significant significant significant significant significant
Number of observations    4074        3761        3739        3797        4198        19569
Log likelihood            -2525       -2313       -2094       -1892       -1809
pseudo-R2                 0.092       0.080       0.096       0.109       0.083

*Significant at the 10 percent level. **Significant at the 5 percent level. ***Significant at the 1 percent level.
a. A Cox proportional hazard model for the risk that move = 1.
Note: Numbers in parentheses are standard errors adjusted for clustering at the primary sampling unit level. See text for details. Entries shown as "..." are not legible, and the significance markers are partly illegible.
Source: Authors' calculations based on data from RLMS rounds VI-X.

TABLE 6. Probit (dF/dx) Estimations for Inkind, RLMS Rounds VI-X

Variable                   VI          VII         VIII        IX          X
hhincdef                   -0.055**    -2.586*     -2.509      -0.927**    -2.001***
                           (0.025)     (1.499)     (1.649)     (0.369)     (0.213)
jobsyr                     0.000       0.000       0.001       0.000       -0.001
                           (0.001)     (0.001)     (0.001)     (0.001)     (0.001)
edyrs                      0.001       -0.009***   -0.008***   -0.006***   ...
                           (0.001)     (0.002)     (0.002)     (0.002)     (0.002)
age                        0.000       0.000       0.003***    0.000       0.001
                           (0.001)     (0.001)     (0.001)     (0.000)     (0.000)
male                       0.018**     0.037**     0.026**     0.023**     0.023***
                           (0.009)     (0.016)     (0.013)     (0.009)     (0.008)
married                    0.004       -0.011      -0.017      0.006       -0.016**
                           (0.013)     (0.014)     (0.012)     (0.007)     (0.008)
aprent                     0.029       -0.034      0.028       -0.010      0.000
                           (0.026)     (0.023)     (0.027)     (0.013)     (0.014)
nkids7-18                  0.006       0.007       -0.023      -0.001      0.004
                           (0.009)     (0.008)     (0.017)     (0.011)     (0.010)
CR4                        0.093***    0.071**     0.116*      0.137***    0.126***
                           (0.031)     (0.051)     (0.067)     (0.032)     (0.023)
Regional dummy variables   significant significant significant significant significant
Number of observations     ...
Log likelihood             ...
pseudo-R2                  ...

*Significant at the 10 percent level. **Significant at the 5 percent level. ***Significant at the 1 percent level.
Note: Numbers in parentheses are standard errors adjusted for clustering at the primary sampling unit level. See text for details. Entries shown as "..." are not legible, and the significance markers are partly illegible.
Source: Authors' calculations based on data from RLMS rounds VI-X.

Alternative Explanations

The fact that CR4 negatively affects the likelihood of outmigration and at the same time positively affects the likelihood of in-kind payments corroborates the theory. Several alternative explanations and counterarguments are discussed next.

First, other theories could also predict that migration would decrease with labor market concentration. The observed impact of labor market concentration on mobility could be owing to firms' greater market power in more concentrated labor markets. Employers' market power may result in lower wages, making migration harder to finance. When wages are regressed on CR4 and relevant controls, the effect of concentration is indeed negative, significant, and quite large: in various specifications individual wages decrease by 0.4 to 0.5 percent when CR4 increases by 1 percent. Empirically, however, this explanation can be distinguished from that presented in this article because CR4 is found to affect mobility controlling for income (either household income, as in table 3, or individual wages) and because our theory also predicts the effect of labor market concentration on the composition of wages, which is consistent with the evidence (table 4).
To reinforce that argument, mobility was also regressed on both inkind and wages with relevant controls but excluding CR4. The results support the proposed theory, whereas the alternative explanation fails in some specifications: the coefficient on inkind is always negative and significant, while the effect of wages is not significant after controlling for willingness to move.

Second, higher rates of labor market concentration might be correlated with higher product market concentration. Then, when CR4 is high, there are more rents that can be shared between managers and workers, which, all else being equal, makes current employment more attractive. As mentioned, however, the evidence is not consistent with this explanation: a higher concentration of market power results in lower rather than higher wages.

Third, there may be economies of scale in the provision of fringe benefits such as hospitals, housing, and schools. Then, a higher CR4 could be an indicator of better provision of fringe benefits that compensate for potentially lower monetary wages. One could, in principle, test this theory, which would predict low outflows and high inflows for concentrated local labor markets (whereas the theory proposed here predicts both low outflows and low inflows). Population changes on the local level are not available, but survey evidence suggests that workers are not very keen to move into local labor markets with high concentration, whereas many want to leave but do not have the financial means to do so.22 The impact of living standard proxies that are not highly correlated with CR4 was explored to examine this argument: the availability of bank services, the quality of telecommunication services, and the quality of roads in the primary sampling units (specification 5 in table 3). Although these variables matter, they reduce the magnitude and significance of the results for CR4 only marginally.

22. In a survey of students and disabled, unemployed, and retired individuals residing in Russia's north, 54-68 percent (for various categories) responded that they would be willing to leave the region, but only 3-11 percent said that they would have sufficient financial means to cover the migration costs fully or partially (Heleniak 1999).

IV. CONCLUDING REMARKS

In the theory of attachment presented here, low migration arises endogenously owing to the strategic behavior of oligopsonistic firms. The attachment contracts that emerge in concentrated local labor markets are beneficial for firms and employees but impose a negative externality on the unemployed. The theory fits Russia in the second half of the 1990s, when many local labor markets were oligopsonistic, worker compensation was demonetized, and migration was low. In line with the theory, an analysis of household and firm data shows that higher labor market concentration decreases the outflow of workers and increases the occurrence of in-kind payments.

There are several implications for the Russian economy, but the theory is also of a more general nature. In particular, it points to a path dependency with respect to the structure of labor markets. Regional disparities may remain in economies facing large shocks because a few firms dominate the labor market, not only because of exogenous frictions.

The following list describes the key variables used in the regression analysis.

Personal characteristics: male (dummy variable, equals 1 if male); married (dummy variable, equals 1 if the respondent is married); edyrs (years spent on education); age (age in years).

Intention to move: wantmove (dummy variable, equals 1 if respondent indicates the intention to move in the coming year).
Household characteristics: hhincome (household income); aprent (dummy variable, equals 1 if the respondent rents housing); nkids7-18 (number of children ages 7-18 in the household).

Job characteristics: jobsyr (number of years spent in the firm); inkind (dummy variable, equals 1 if respondent received in-kind payments in the last month); arr (dummy variable, equals 1 if respondent had wage arrears in the last month).

Employer characteristics: cash-cl (ratio of firm's liquid assets to current liabilities as of December 31, 1995); cash-sales (ratio of firm's liquid assets as of December 31, 1995, to annual sales for 1996).

Geographic characteristics: PSU (primary sampling unit, 38 communities represented in the sample); CR4 (labor market concentration ratio at the primary sampling unit level: the share of the four biggest employers in total employment in the primary sampling unit); region (regional dummy variables for eight regions: Moscow and St. Petersburg, Central and Central Blacksoil region, North and Northwest, Volga, East Siberia and Far East, North Caucasus, Western Siberia, and Urals).

Respondent absent from primary sampling unit in round VII: move (dummy variable, equals 1 if person is not found in the same community next year).

Community characteristics: c6bank (availability of bank offices); c6telphp (phone lines per 100 people); c6roads (quality of roads).

REFERENCES

Alston, L., and J. Ferrie. 1993. "Paternalism in Agricultural Labor Contracts in the US South: Implications for the Growth of the Welfare State." American Economic Review 83(4):852-76.
-. 1999. Paternalism and the American Welfare State: Economics, Politics and Institutions in the US South, 1865-1965. Cambridge: Cambridge University Press.
Andrienko, Y., and S. Guriev. 2004. "Determinants of Interregional Labor Mobility in Russia: Evidence from Panel Data." Economics of Transition 12(1):1-27.
Bardhan, P. 1983. "Labor Tying in a Poor Agrarian Economy." Quarterly Journal of Economics 98(3):501-14.
Bell, C. 1988. "Credit Market and Interlinked Transactions." In H. Chenery and T. N. Srinivasan, eds., Handbook of Development Economics, vol. 1. Amsterdam: North-Holland Elsevier.
Berkovitz, D., and D. DeJong. 1999. "Russia's Internal Border." Regional Science and Urban Economics 29(5):633-49.
Bertrand, M. 2004. "From the Invisible Handshake to the Invisible Hand? How Import Competition Changes the Employment Relationship." Journal of Labor Economics 22(4):723-66.
Bhaskar, V., A. Manning, and T. To. 2002. "Oligopsony and Monopsonistic Competition in Labor Markets." Journal of Economic Perspectives 16(2):155-74.
Biletsky, S., D. Brown, J. Earle, I. Komarov, and K. Sabirianova. 1999. "Inside the Transforming Firm: Report on a Survey of Manufacturing Enterprises in Russia." Upjohn Institute for Employment Research, Kalamazoo, Mich.
Blanchard, O., and A. Shleifer. 2001. "Federalism with and without Political Centralization: China versus Russia." IMF Staff Papers 48(4):171-80.
Boal, W., and M. Ransom. 1997. "Monopsony in the Labor Market." Journal of Economic Literature 35(1):86-112.
Burdett, K., S. Shi, and R. Wright. 2001. "Pricing and Matching." Journal of Political Economy 109(5):1060-85.
Clarke, S. 2000. "The Household in a Non-Monetary Market Economy." In P. Seabright, ed., The Vanishing Ruble: Barter Networks and Non-Monetary Transactions in Post-Soviet Societies. Cambridge: Cambridge University Press.
Commander, S., and M. Schankerman. 1997. "Enterprise Restructuring and Social Benefits." Economics of Transition 5(1):1-24.
Denisova, I., G. Friebel, and E. Sadovnikova. 1998. "The Russian Labor Market: Urgent Needs for Reform." Russian Economic Trends 7(2):9-14.
Earle, J., and K. Sabirianova. 2000. "Equilibrium Wage Arrears: Theoretical and Empirical Analysis of Institutional Lock-in." William Davidson Institute Working Paper 321. Ann Arbor, Mich.
-. 2002. "How Late to Pay? Understanding Wage Arrears in Russia." Journal of Labor Economics 20(3):661-707.
Ericson, R. 2000. "The Post-Soviet Russian Economic System: An Industrial Feudalism?" In Tuomas Komulainen and Iikka Korhonen, eds., Russian Crisis and Its Effects. Helsinki: Kikimora.
Friebel, G., and S. Guriev. 2000. "Why Russian Workers Do Not Move: Attachment of Workers through In-kind Payments." CEPR Discussion Paper 2368. Center for Economic Policy and Research, Washington, D.C.
Goskomstat. 2000. Socialnoe Polozhenie i Uroven Zhizni v Rossii [Social Situation and Living Standards in Russia]. Moscow (in Russian).
Grosfeld, I., C. Senik-Leygonie, T. Verdier, S. Kolenikov, and E. Paltseva. 2001. "Workers' Heterogeneity and Risk Aversion: A Segmentation Model of the Russian Labor Market." Journal of Comparative Economics 29(1):230-56.
Haaparanta, Pertti, Tuuli Juurikkala, Olga Lazareva, Jukka Pirttila, Laura Solanko, and Ekaterina Zhuravskaya. 2003. "Firms and Public Service Provision in Russia." Working Paper 40. Center for Economic and Financial Research, Moscow.
Healey, N., V. Leksin, and A. Shvedov. 1998. "Privatisation and Enterprise-Owned Social Assets." Russian Economic Barometer 7(2):18-38.
Heleniak, T. 1999. "Migration from the Russian North during the Transition Period." World Bank Social Protection Discussion Paper 9925. World Bank, Washington, D.C.
Hilton, G. 1960. The Truck System. Cambridge: Heffer.
Jarocinska, E., and A. Worgotter. 2000. "Regional and Sectoral Skill Premia in Russia." Working Paper. Institute for Advanced Studies, Vienna.
Juurikkala, T., and O. Lazareva. 2004. "The Role of Social Benefits in the Employment Strategies of Russian Firms." Helsinki School of Economics.
Kornai, J. 1992. The Socialist System: The Political Economy of Communism. Princeton, N.J.: Princeton University Press.
McMahon, C. 2001. "Idea to Pay Docs Manure Offers Whiff of Russian Reality." Chicago Tribune, May 31.
Mukherjee, A., and D. Ray. 1995. "Labor Tying." Journal of Development Economics 47(2):207-39.
Petrongolo, B., and C. Pissarides. 2001. "Looking into the Black Box: A Survey of the Matching Function." Journal of Economic Literature 39(2):390-431.
Rama, M., and K. Scott. 1999. "Labor Earnings in One-Company Towns: Theory and Evidence from Kazakhstan." World Bank Economic Review 13(2):185-209.
Roland, G. 2000. Transition and Economics: Politics, Markets and Firms. Cambridge, Mass.: MIT Press.
Salop, J., and S. Salop. 1976. "Self-Selection and Turnover in the Labor Market." Quarterly Journal of Economics 90(3):619-27.
Schaffner, J. 1995. "Attached Farm Labor, Limited Horizons and Servility." Journal of Development Economics 47(2):241-70.
Stevens, M. 1994. "A Theoretical Model of On-the-Job Training with Imperfect Competition." Oxford Economic Papers 46(4):537-62.
Tratch, I., M. Rein, and A. Worgotter. 1996. "Social Asset Restructuring in Russian Enterprises: Results of a Survey in Selected Russian Regions." In OECD, The Changing Social Benefits in Russian Enterprises. Paris: Center for Co-operation with the Economies in Transition.
VCIOM (Russian Center for Public Opinion Research). 1997. Monitoring Obschestvennogo Mnenia: Ekonomicheskie i Socialnye Peremeny [Monitoring Public Opinion: Economic and Social Change]. Various issues, Moscow (in Russian).
Williamson, O. 1985. The Economic Institutions of Capitalism. New York: Free Press.
Zohoori, N., T. Mroz, B. Popkin, E. Glinskaya, S. Lokshin, D. Mancini, P. Kozyreva, M. Kosolapov, and M. Swafford. 1998. "Monitoring the Economic Transition in the Russian Federation and Its Implications for the Demographic Crisis: The Russian Longitudinal Monitoring Survey." World Development 26(11):1977-93.

Child Health and Economic Crisis in Peru

Christina Paxson and Norbert Schady

The effect of macroeconomic crises on child health is a topic of great policy importance. This article analyzes the impact of a profound crisis in Peru on infant mortality.
It finds an increase of about 2.5 percentage points in the infant mortality rate for children born during the crisis of the late 1980s, which implies that about 17,000 more children died than would have in the absence of the crisis. Accounting for the precise source of the increase in infant mortality is difficult, but it appears that the collapse in public and private expenditures on health played an important role.

Over the past two decades, a large number of countries, including Argentina, Indonesia, Mexico, Peru, and Russia, experienced economic crises that led to sharp reductions in incomes and living standards. A growing body of literature has examined whether these crises had adverse effects on health outcomes. To the extent that crises lead to declines in health outcomes, it is important to identify the specific mechanisms that are responsible, with an eye toward developing policies that can ameliorate adverse health effects in the future.

This article considers how economic shocks affect health by examining the effect of one crisis, that experienced by Peru in the late 1980s, on infant mortality. The Peruvian case is noteworthy because the crisis was unusually sharp: per capita gross domestic product (GDP) declined by 30 percent, and real wages in the capital city of Lima fell by more than 80 percent. The sheer depth of the economic collapse makes Peru a useful place in which to study the health effects of economic crises. In addition, Peru has good information on infant mortality from a set of household surveys, the Demographic and Health Surveys (DHS), collected at regular intervals since 1986. The surveys are used to construct a time series on infant mortality that spans the period before, during, and after the economic crisis. These data can then be used to analyze the extent to which changes in infant mortality are departures from preexisting trends and whether there is a return to these trends after the crisis.

Christina Paxson is professor of economics and public affairs at Princeton University; her email address is cpaxson@princeton.edu. Norbert Schady is senior economist in the Development Research Group (Public Services Team) at the World Bank; his email address is nschady@worldbank.org. The authors thank Harold Alderman, Anne Case, Jaime de Melo, Francisco Ferreira, Jed Friedman, three anonymous referees, and seminar participants at Princeton University and the World Bank for comments. They also thank Pedro Francke and Jaime Saavedra for help in making data available. An earlier version of this article is available as World Bank Policy Research Working Paper 3260, online at www.worldbank.org.

THE WORLD BANK ECONOMIC REVIEW, VOL. 19, NO. 2, pp. 203-223. doi:10.1093/wber/Bhi011. Advance Access publication September 21, 2005. © The Author 2005. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oupjournals.org.

Before discussing the details of the Peruvian case, it is useful to consider why economic crises might affect infant mortality. One possibility is that crises prompt households to reduce spending on inputs to child health, including nutritious foods or medical care for mothers and infants. Another possibility is that crises cause public health services to deteriorate, which may increase the price of health care or reduce its quality.

Although economic crises may have an effect on infant mortality in developing economies, they do not have to. Governments can implement programs that mitigate the health effects of crises, and households may be able to smooth consumption or at least buffer expenditures on goods that protect health.
Furthermore, families could avoid infant deaths by delaying fertility until the crisis has passed (Ashton and others 1984; Ben-Porath 1973; Coale 1984; Stein and others 1975). Deferred fertility may lead to more widely spaced births and to fewer births to very young women, which lowers mortality (Palloni and Hill 1997). Whether economic crises adversely affect health is therefore an empirical question.

The literature suggests that the relationship between infant mortality and economic fluctuations varies a great deal by country. Evidence from the United States shows that infant mortality decreases during recessions due to changes in maternal behavior, shifts in the composition of women giving birth, and declines in air pollution (Chay and Greenstone 2003; Dehejia and Lleras-Muney 2004; Ruhm 2000). Studies of poorer countries with sharper economic fluctuations have yielded mixed results. The collapse in income in many countries of the former Soviet Union in the 1990s was associated with dramatic increases in adult mortality, particularly from alcoholism and suicide, but no obvious change in child health (Brainerd 1998, 2001; Brainerd and Cutler 2005; Shkolnikov and others 1998). During the 1998 Indonesian financial crisis, infant mortality increased by about 1.4 percentage points (Rukumnuaykit 2003). In Latin America the financial crisis of the late 1990s in Argentina did not affect the infant mortality rate (Rucci 2004), whereas the economic crises in Mexico in the 1980s and 1990s increased mortality for the very young and the elderly (Cutler and others 2002).1 The results for Peru presented here are most consistent with those from Indonesia and Mexico in that the economic crisis is shown to have had a large effect on infant mortality.

1. Similarly mixed results can be found in studies that examine the effects of crises on children's anthropometric outcomes. See, for example, Jensen (2000) on Côte d'Ivoire; Yamano and others (2003) on Ethiopia; Hoddinott and Kinsey (2001) on Zimbabwe; Foster (1995) and del Ninno and Lundberg (2002) on Bangladesh; and Frankenberg and others (1999), Cameron (2002), Waters and others (2003), and Strauss and others (2002) on Indonesia.

FIGURE 1. Economic Indicators for Peru, 1978-99
[Panels: GDP per capita (1995 US$); wages in Lima (1994 soles). Horizontal axis: year.]
Source: Data on per capita GDP and inflation, World Bank databases; data on wages, annual labor force surveys conducted in Lima since 1986, provided by Jaime Saavedra, director of Grupo de Análisis para el Desarrollo.

Figure 1 shows GDP per capita, wages (in Lima only), and inflation in the 1980s and 1990s. Each indicator provides clear evidence of a macroeconomic collapse in 1988. The reasons for this crisis (a "heterodox" stabilization program involving reduced foreign debt payments, wage increases, and job creation programs that quickly proved unsustainable) and the impact that the crisis had on poverty and education outcomes have been documented elsewhere (Glewwe and Hall 1994; Schady 2004).

Figure 1 makes the depth of the crisis obvious. Real GDP per capita contracted by almost 30 percent between 1987 and 1990 and did not begin to recover until 1993. The collapse in wages in Lima was even more dramatic, with a fall in real wages of more than 80 percent between 1987 and 1990 and a gradual recovery thereafter. Data from multipurpose income and consumption surveys conducted in Lima in 1985 and 1990 suggest that per capita consumption in 1990 was less than half its 1985 level, although there appears to have been a recovery in consumption thereafter (Glewwe and Hall 1994; Schady 2004).
Inflation skyrocketed during the crisis, rising from 86 percent a year in 1987 to almost 7,500 percent in 1990, before falling to 410 percent in 1991 and 74 percent in 1992.

By any measure, the extent of the economic collapse in Peru in the late 1980s is staggering. Indeed, as a consequence of this crisis, per capita GDP and real wages in 2000 were still well below their 1987 levels, despite respectable growth rates during most of the 1990s. Consider, as points of comparison, crises in the 1990s in Argentina, Indonesia, Mexico, and Russia. In Argentina the 1998-2002 crisis resulted in an 18 percent reduction in per capita GDP and a 32.4 percent reduction in wages in 2002 (McKenzie and Schargrodsky 2004). In Indonesia per capita GDP fell by 12 percent in 1997, and per capita consumption in 1999 was 23 percent below its 1996 value (World Bank 2004). In Mexico the 1995 crisis resulted in a 6.3 percent reduction in per capita GDP and a 19 percent reduction in per capita consumption (McKenzie 2004). Only the collapse of the Russian economy presents a crisis of a magnitude similar to that of Peru: between 1992 and 1998 per capita GDP fell by 29 percent, and monthly income contracted by 43 percent (Mroz and others 2001).²

Two additional points about Peru's crisis are noteworthy. First, although the beginning of the crisis (1988) is clear, the end is not. Data on wages and inflation suggest that the recovery began in 1991, whereas the data on per capita GDP suggest that the recovery began in earnest in 1993. The results here are consistent with a shorter crisis, one that had an effect on child health between 1988 and 1990. Second, the GDP data show an economic crisis earlier in the 1980s, involving a 14 percent contraction in per capita GDP in 1983. These points are discussed next.
2. All changes in per capita GDP are calculated from World Bank databases.

The Peru Demographic and Health Surveys

The main data source for this article is Peru's DHS. The surveys sampled 4,999 women ages 15-49 in 1986, 15,882 women in 1991/92, 28,951 women in 1996, and 27,843 women in 2000 (see www.measuredhs.com). The surveys are nationally representative, although in 1986 and 1991/92 some areas were not surveyed due to high levels of terrorist activity.³ All four surveys included a set of questions on the date of birth, current vital statistics, and the date of death (if deceased) of all children ever born to the respondent. More extensive information was collected on children born to the respondents within five years of the survey. The 1991/92, 1996, and 2000 survey data contain information on circumstances surrounding the births of children younger than 60 months old and on the heights and weights of children who were still living. All the surveys also collected information on a range of household sociodemographic characteristics, including urban status, maternal education, housing characteristics, and ownership of durable goods.

In addition to the DHS, administrative data on health expenditures and the number of terrorist incidents as well as household survey data on consumption patterns from the 1985/86 and 1991 Peru Living Standards Measurement Study (LSMS) were used. These are discussed in more detail shortly.

This section begins by examining how infant mortality rates evolved over the 1980s and 1990s. Retrospective birth and death histories from each DHS were used to construct mortality rates, by date of birth, in the first and second half of each calendar year from 1978 to 1999. The main measure of mortality is an indicator for whether a child died at age 12 months or younger, referred to as infant mortality. This definition, rather than the standard definition of mortality for children younger than 12 months of age, was chosen because of "age heaping" in reports of mortality. However, the results reported are not sensitive to this choice. Results for mortality rates for children age 1 month or younger, referred to as neonatal mortality, and age 6 months or younger are also shown.

Mortality rates were constructed using the sample weights provided in the survey. To avoid problems with censored data, information on children born within 23 months of the survey was discarded when calculating mortality rates for children age 12 months or younger. In theory only information on children born within 12 months of the survey should have been discarded, because it is unknown whether these children survived past 12 months. However, a more conservative approach was adopted in this article because of age heaping. Similarly, records for children born within 5 months of the survey were discarded when computing 1-month mortality rates, as were those for children born within 11 months of the survey when computing 6-month mortality rates. Results are very similar when a less conservative approach to censoring is used.

3. In 1986 three departments (Ayacucho, Apurimac, and Huancavelica) with 6 percent of the population were excluded. In 1991/92 special precautions, including escorts of enumerators by the army or police, were taken in high-terrorism "emergency areas." Despite these efforts, 66 districts with approximately 5 percent of the population were excluded due to security concerns. Terrorism abated by the mid-1990s and did not pose a problem for the 1996 and 2000 surveys.

Although each DHS is representative of women ages 15-49 at the time of the survey, it is not representative of all births (and child deaths) in earlier years.
For example, the mothers who were 15-49 years old at the time of the 2000 DHS were 5-39 years old in 1990, and any births and deaths reported for that year occurred when the women were in this (younger) age range. In theory, this feature of the data could bias measures of the infant mortality rate in either direction, with the direction of bias depending on whether the children of the older mothers who were excluded had higher or lower average infant mortality rates than the younger age group that was included. Because mortality rates may be highest for the youngest mothers, information on births that occurred when the mother was younger than 15 years old was discarded. An additional source of bias is error in recalling the dates of more distant births and deaths. To reduce problems of recall error, information on births that occurred more than 12 years before the survey was not used. The results are not, however, sensitive to these choices of maternal age ranges and recall periods. Finally, maternal mortality will bias the estimates of infant mortality. There is no information in the sample on births to mothers who died before the survey, because these women are not alive to be sampled. If their children were at higher risk of death, the infant mortality estimates reported in this article would be too low.

Infant mortality rates were first calculated from each DHS separately, so mortality rates computed for the same date of birth but using different rounds of the DHS could be compared (figure 2). The results have two important features. First, the patterns of infant mortality rates by date of birth are similar across surveys. Thus, there do not appear to be systematic biases in the rates calculated using up to 12 years of retrospective information on births. Second, there is a sharp increase in the infant mortality rate around 1990.
This increase, which appears in data from the 1991/92, 1996, and 2000 surveys, begins with children born in the second half of 1989 and peaks for children born in the first half of 1990. This increase in the infant mortality rate, from approximately 50 per 1,000 live births to 75, is large. The Peruvian population was nearly 22 million in 1990, with a crude birth rate of 31.73 per 1,000 people, implying that nearly 700,000 children were born in 1990 (U.S. Census Bureau 2004). The rise in the mortality rate observed during the crisis implies there were 17,184 "excess" infant deaths among children born in 1990. The fact that the mortality spike appears in all three surveys indicates that it is not the result of sampling error.

Because each DHS yields similar infant mortality rates for children born at the same date but recorded in different surveys, it makes sense to average mortality rates across surveys. The results for the neonatal mortality rate, the rate of mortality in the first 6 months of life, and the infant mortality rate are shown in figure 3. A comparison of figure 3 with figure 1 shows that the spike in mortality among children born in 1990 coincides with the worst of the economic crisis, when per capita GDP was falling to its lowest levels and real wages had not yet recovered. A similar spike is observed in 1983, when Peru experienced a smaller economic crisis. However, the spike in infant mortality in 1983 appears in data from the 1986 DHS but not in data from the 1991/92 DHS (see figure 2). Because the 1986 survey was quite small and the estimates of mortality based on these data are imprecise, this spike provides much less clear evidence of a possible increase in mortality in 1982-83. Mortality and per capita GDP are clearly inversely related over this time period: a regression of the log of the infant mortality rate on the log of per capita GDP, including a time trend, implies that the elasticity of infant mortality with respect to per capita GDP is -0.64 (t = 2.35). (This regression allows for first-order serial correlation of the error terms.)

FIGURE 2. Infant Mortality Rates, by Survey Year, 1978-99
[Figure: one series per survey; horizontal axis is year of birth (mortality shown for first and second half of each birth year).]
Source: 1986, 1991/92, 1996, and 2000 DHS.

FIGURE 3. Child Mortality Rates, Average from all Survey Years, by Age Group, 1978-99
[Figure: horizontal axis is year of birth (mortality shown for first and second half of each birth year).]
Note: The figure reflects unweighted averages of the infant mortality rates from the relevant surveys, with sample weights used to construct mortality rates for each survey.
Source: 1986, 1991/92, 1996, and 2000 DHS.

Figure 3 also indicates that the increase in mortality in 1990 was not confined to infants in specific age ranges. Children born in the second half of 1989 through 1990 were more likely to die in the first month of life. They were also more likely to die in the first 6 and 12 months of life. This is not a mechanical result of a higher mortality rate in the first month of life. For example, of children born in the first half of 1990 who survived at least 1 month, 20 per 1,000 died between ages 1 and 6 months, in contrast to the conditional death rate of 8 per 1,000 for children born in the first half of 1988. Similarly, the mortality rate of those age 6-12 months (conditional on survival to 6 months) rose from 14 per 1,000 to 25 between the two periods.

Using the 1996 and 2000 surveys, which have comparable region codes, it is also possible to examine whether the increase in mortality appears in urban and rural areas and in the coast, highlands, and jungle regions. The infant mortality spike appears in all areas except the jungle, where the estimates are very imprecise due to small sample sizes. As will be discussed, this is important because it rules out explanations for the increase in mortality that affect only some parts of the country.
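The excess-death figure quoted earlier in this section can be checked with back-of-envelope arithmetic. The sketch below uses only the rounded inputs given in the text (a population of about 22 million, a crude birth rate of 31.73 per 1,000 people, and a rise in infant mortality from roughly 50 to 75 per 1,000 live births). The function and variable names are illustrative and are not the authors'. With these rounded inputs the result lands near 17,000 rather than exactly on the reported 17,184, presumably because the reported figure was computed from unrounded values.

```python
def excess_infant_deaths(population, crude_birth_rate_per_1000,
                         baseline_imr_per_1000, crisis_imr_per_1000):
    """Return (implied births, excess infant deaths) for one birth cohort.

    Births are population times the crude birth rate; excess deaths are
    births times the rise in the infant mortality rate (both per 1,000).
    """
    births = population * crude_birth_rate_per_1000 / 1000.0
    rise_per_1000 = crisis_imr_per_1000 - baseline_imr_per_1000
    excess = births * rise_per_1000 / 1000.0
    return births, excess


# Rounded inputs taken from the text; not the authors' exact figures.
births_1990, excess_1990 = excess_infant_deaths(
    population=22_000_000,
    crude_birth_rate_per_1000=31.73,
    baseline_imr_per_1000=50,
    crisis_imr_per_1000=75,
)
print(f"implied births: {births_1990:,.0f}")
print(f"implied excess infant deaths: {excess_1990:,.0f}")
```

The same function makes it easy to see how sensitive the estimate is to the assumed baseline rate: each additional point of mortality per 1,000 births shifts the estimate by roughly 700 deaths.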
Vital statistics data on the registered number of deaths, by age group, can also be used to inspect mortality trends in the 1980s and 1990s. These statistics show no increase in the number of reported deaths in 1989-91. However, the vital statistics data for Peru are not reliable. The Pan American Health Organization (PAHO) estimates that fewer than half of all deaths are recorded in Peru, and the number of recorded deaths is lowest in the poorest departments. For example, in Amazonas, Ayacucho, Huancavelica, and Loreto fewer than a quarter of deaths are reported, compared with more than three-quarters in the three wealthiest departments of Ica, Lima, and Tacna (PAHO 1998). A comparison of the vital statistics data with the number of infant deaths calculated from the 1992 DHS using the appropriate survey expansion factors suggests that the vital statistics covered 63 percent of infant deaths in 1988, 65 percent in 1989, 50 percent in 1990, and 47 percent in 1991. Underreporting in Peru's vital statistics data seems to be a serious problem, especially if, as seems likely, the estimates of infant mortality from the DHS are downward-biased because the sample does not include children born to very young or very old mothers or to mothers who died before the survey. Moreover, coverage of the vital statistics data appears to have worsened during the crisis, possibly because of budget cuts in the Ministry of Health, which is responsible for collection and verification of the data, and because of less use of health facilities; both are documented below.

The evidence presented in the previous section indicates that infant mortality increased during the economic crisis in Peru.
Although many factors affect infant mortality (for example, maternal education and knowledge about basic health practices and nutrition, water supply, nutritional content of food intake, and access to and quality of health services for children and their mothers), there are two main channels through which the crisis could have worked. First, the crisis could have caused public health services to deteriorate. Second, it could have led to reductions in household expenditures on inputs to child health, including nutritious foods or medical care for mothers and infants. This section presents evidence on the importance of each of these factors. In addition, it examines whether the increase in mortality was driven by a change in the composition of women giving birth and whether it could have been due to other factors, such as a cholera outbreak or increases in terrorist activity, that happened to coincide with the crisis.

Declines in Healthcare Use

Public health expenditures fell sharply during the economic crisis: by 58 percent between 1985 and 1990, declining from 4.3 percent of the budget to 3 percent (figure 4). One consequence of deep budget cuts in health (combined with high inflation) was a reduction in real wages for health workers, which led to labor unrest. Ministry of Health workers went on strike in March-July 1991, which forced public hospitals and clinics to close, and again in early 1992 (Associated Press 1991, 1992).

It seems likely that the decline in public health expenditures during the crisis would have led to reductions in the use of health services. This can be examined using the DHS data. The 1991/92 and 1996 DHS asked where all children born within 59 months of the survey were delivered and how many antenatal health visits the mother had while pregnant. This information is used to examine whether there were increases in home births and declines in antenatal care during the crisis.
Specifically, for each DHS, the following model was estimated:

(1)  $Y_{ibt} = \alpha_t + X_i \beta + \gamma Z_{ibt} + \varepsilon_{ibt}$

where $Y_{ibt}$ is an outcome (number of antenatal visits or an indicator for home delivery) for child $b$ born to mother $i$ in year $t$, and $X_i$ is a set of maternal characteristics that are assumed not to change over time, including level of schooling (with no school omitted), age group (with ages 15-19 omitted), and whether the mother lived in an urban area at the time of the survey (the mother's location at the date of birth is unknown). Because mothers may choose different levels of healthcare for first births, an indicator $Z_{ibt}$ is included for first births. The parameters of interest are the terms $\alpha_t$, which capture differences in the outcome across years, controlling for maternal and child characteristics. Equation 1 is estimated using linear regression models, including either mother-specific random effects or mother-specific fixed effects. In the fixed-effects models the time-invariant, mother-specific variables are necessarily excluded.⁴

FIGURE 4. Public Health Spending, 1978-2000
[Figure not reproduced.]
Note: Public health expenditure includes all expenditures of the Ministry of Health, including centrally administered and locally executed programs. It does not include expenditures on health by local governments, which are negligible in Peru, or expenditures made by the health insurance system that covers formal sector workers. No data on private health expenditures over time are available.
Source: Data provided by Pedro Francke, professor at Pontificia Universidad Católica del Perú.

4. For the dichotomous outcomes, conditional logit models were also estimated, with similar results. In theory, the two surveys could have been pooled and equation 1 estimated using the combined sample. This was not done for two reasons. First, when mother-specific fixed effects are included, it is not possible to identify changes in the outcomes that occurred across the survey years, because mothers in the 1996 survey were not asked about the outcome measures for births occurring before 1992. For these results pooling the data would yield identical results to those presented shortly. Second, the question on where the child was delivered was coded somewhat differently between the two surveys, and the responses to this question may not be completely comparable. In 1996 a new category of "birth in the midwife's home" was added. It is not clear whether these births would have been coded as "home births" or as "other" in 1991/92. In addition, the coding of types of births at public and private facilities other than the home changed between 1991/92 and 1996, so that it is not possible to construct consistent series on other places of birth.

Table 1 provides descriptive statistics on healthcare and birth outcomes. The table shows that the average number of antenatal visits in both surveys is roughly 3.5, whereas slightly more than half of births take place at home in both survey years. Table 2 reports results from estimations of equation 1, specifically, values of $\alpha_t$ for 1988-91 when using the 1991/92 data and for 1993-96 when using the 1996 data. The left side of the table shows that the number of antenatal visits fell steadily from 1987 through 1991 and increased steadily from 1992 to 1996. Focusing, for example, on the random effects results, women who gave birth in 1991 (many of whom were pregnant in 1990) had 0.28 fewer antenatal visits than those in 1987, whereas women who gave birth in 1992 (many of whom were pregnant in 1991) had 0.38 fewer visits than those in 1996. Note that this sort of seesaw pattern is not consistent with any obvious form of recall bias (for example, if women remember fewer antenatal visits for pregnancies that occurred further in the past).
The right side of the table shows that the fraction of home births was highest in 1990 (using the 1991/92 survey) and 1992 (using the 1996 survey).

The results in table 2 suggest that there were important declines in the use of health services during the years in which the crisis was most profound. These declines could have occurred either because of declines in public expenditures on health, as shown in figure 4, or because declines in household incomes made it more difficult for households to make co-payments at health facilities.⁵ It is impossible to distinguish between these two possibilities with the available data.

5. In health facilities run by the Ministry of Health, labor costs are heavily subsidized, but drugs and medical inputs are financed from user fees and are charged to the user at full cost plus a markup. Co-payments also apply to users of facilities run by the public health insurance system, which covers formal sector employees (World Bank 1999).

TABLE 1. Descriptive Statistics for Healthcare and Birth Outcomes
Statistic    1991/92 DHS    1996 DHS
Average number of antenatal visits
Share of births at home (%)
Share of births that are first births (%)
Observations
Number of births in 5 years preceding the survey    9,027
Number of mothers    6,193
Number of mothers with two or more births in 5 years preceding the survey    2,392
Number of mothers with two or more births who had change in number of antenatal visits    1,254
Number of mothers with two or more births who had change in "birth at home"    307
Note: SDs are in parentheses. Means are calculated with the expansion factors in the survey.
Source: 1991/92 and 1996 DHS.

Changes in Household Consumption Patterns

Because the crisis entailed large reductions in household income and consumption, it is possible that households were unable to protect expenditures on items of importance in determining child health. This issue cannot be examined using the DHS, both because information on births is retrospective and because expenditure information was not collected. Instead, the 1985/86 and 1991 Peru LSMS surveys are used to analyze patterns of consumption before and during the crisis.

The 1985/86 LSMS was a nationwide, multipurpose household survey. By contrast, the 1991 LSMS survey covered only Lima, the urban areas of the coast and the highlands (but not the jungle), and the rural areas of the highlands (but not the coast or the jungle). There are serious concerns with the quality of data from the rural highlands in the 1991 LSMS survey, as detailed in Schady (2004). Moreover, the 1985/86 and 1991 LSMS surveys were not conducted in the same months of the year, and seasonal differences in consumption patterns are likely to be important in rural areas, where a large part of food consumption comes from own-food production. The analysis here is therefore limited to comparisons of the urban areas of Lima, the coast, and the highlands. The 1994 and 1997 LSMS surveys are not used because there were important changes in the way they collected information on the consumption of food and nonfood items, which makes comparisons between 1991 and later surveys problematic.

The 1985/86 and 1991 LSMS surveys asked respondents whether they purchased a particular item and, in the case of food items, semidurables, and services, the amount spent on each item. No questions were asked about quantities, and the extremely high rate of inflation during the crisis makes it impossible to accurately deflate expenditures to real terms.
The focus here is therefore on whether specific goods were purchased by the household, specifically on the share of households in a survey that reported consuming a given food item in the past two weeks, purchasing a given semidurable or service in the past three months, and purchasing a given durable in the last three years. (In the case of food items households are coded as having consumed a particular item if they reported purchasing it or providing it themselves "from their own store, business, or plot.")

The results from these calculations suggest that, by and large, there were no important changes in consumption patterns of food during the crisis. Consumption of some items (bread, potatoes, yams, yucca, poultry, eggs, oil, margarine, legumes, and fresh vegetables) increased between 1985/86 and 1991, whereas consumption of others (maize, cookies, cake, other meat products, fish, seafood, milk, dairy products, and frozen, dried, or canned vegetables and fruits) decreased. There is no clear substitution out of "expensive" sources of protein (for example, meat, poultry, fish, and seafood). Moreover, the magnitude of the changes is generally quite small: only in the case of dairy products other than milk does the share of households that reported consuming it appear to fall by a large amount (from 72 percent to 39 percent).

TABLE 2. Healthcare and Birth Outcomes

                                      Number of Antenatal Visits            Home Birth
                                      Mother-Level      Mother-Specific     Mother-Level      Mother-Specific
Estimation Method                     Random Effects    Fixed Effects       Random Effects    Fixed Effects

1991/92 DHS (birth year = 1987 is the omitted category)ᵃ
Birth year = 1988                     -0.014 (0.084)    -0.033 (0.102)      -0.007 (0.011)     0.007 (0.013)
Birth year = 1989                      0.017 (0.020)     0.032 (0.095)       0.001 (0.010)     0.016 (0.013)
Birth year = 1990                     -0.151 (0.086)    -0.064 (0.106)       0.031 (0.011)     0.042 (0.014)
Birth year = 1991                     -0.277 (0.086)    -0.208 (0.107)       0.014 (0.011)     0.037 (0.014)
First birth                            0.630 (0.076)     0.711 (0.100)      -0.074 (0.010)    -0.035 (0.013)
Test: Year effects jointly 0 (p-value) 0.001             0.149               0.004             0.006

1996 DHS (birth year = 1992 is the omitted category)
Birth year = 1993                      0.064 (0.057)     0.035 (0.074)      -0.017 (0.008)    -0.015 (0.010)
Birth year = 1994                      0.121 (0.054)     0.104 (0.066)      -0.027 (0.007)    -0.029 (0.009)
Birth year = 1995                      0.264 (0.056)     0.256 (0.073)      -0.041 (0.007)    -0.034 (0.010)
Birth year = 1996                      0.382 (0.062)     0.410 (0.080)      -0.047 (0.008)    -0.039 (0.011)
First birth                            0.523 (0.054)     0.517 (0.076)      -0.092 (0.007)    -0.066 (0.010)
Test: Year effects jointly 0 (p-value) 0.000             0.000               0.000             0.000

ᵃ The 1991/92 DHS was completed by March 1, 1992, and there were only 55 reported births in 1992. These are excluded from this analysis.
Note: SEs are in parentheses. Linear probability models are used throughout. Mother-level random effects models also control for the mother's level of schooling (with no school omitted), mother's age at the time of the survey (with ages 15-19 omitted), and whether the mother lived in an urban area.
Source: Authors' calculations based on 1991/92 and 1996 DHS.

By comparison, consumption of all semidurables and services fell, and some of the changes are large.
The share of households that reported purchasing child and adult clothing or footwear is 8 percentage points lower in 1991 than in 1985/86 in every category, and the share of households that reported purchasing medicines dropped by almost half. Declines over the period in the share of households that reported spending on healthcare are even larger, although the questions were not asked the same way in both surveys. Finally, purchases of durable goods show a more mixed picture: in crisis years households were more likely to purchase some items (such as radios, televisions, and other electronics) and less likely to purchase others (cars, motorbikes, and machines such as sewing or weaving machines, floor waxing machines, and washing machines). Purchasing durable goods may be a reasonable way for households to protect their income during hyperinflation, so these findings are not surprising.

Although this analysis of changes in consumption patterns is by no means definitive (it is unknown how much households consumed of each item and whether households substituted cheaper or less nutritious alternatives in a given category), it suggests that households did not seriously change their patterns of food consumption; instead, they drastically cut back on medicines and healthcare expenditures. This pattern is consistent with a smaller income elasticity of demand for food than for healthcare expenditures, as seems likely, or with models in which health investments act as a form of saving or consumption smoothing (Strauss and Thomas 1998; see also McKenzie 2004 and Stillman and Thomas 2004 for evidence that there were no large declines in food expenditures during crises in Mexico and Russia, respectively). Healthcare spending by households could have also been affected by disruptions in the public sector. Some health services may not have been available because of health worker strikes, while the reduction in public health expenditures could have increased the price of health services.
Maternal Selection and Infant Health

Another explanation for the decline in child health during the crisis is changes in the composition of women giving birth. Infant mortality rates vary by sociodemographic group, with lower rates observed for women who have more education and live in urban areas and higher rates observed for very young mothers. In theory, the spike in mortality in 1990 could be due to a relative increase in the number of high-risk women giving birth during the economic crisis, although this runs counter to the evidence from the United States (Dehejia and Lleras-Muney 2004).

To examine this hypothesis, Oaxaca-type decompositions of the changes in infant mortality across years were calculated. This involves estimating linear regressions for each year of birth, from 1978 to 1999:

(2)  $M_{it} = X_{it} \beta_t + \varepsilon_{it}$

where $M_{it}$ is an indicator for whether a child born in year $t$ to mother $i$ died in the first year of life and $X_{it}$ is a set of maternal characteristics, including level of schooling, age, and whether she lived in an urban area at the time of the survey, all coded in the same way as in the estimations of equation 1.⁶ Next, the parameter estimates are used to decompose changes in the mortality rate between years:

(3)  $\Delta M_t = [\bar{X}_{t-1} (\hat{\beta}_t - \hat{\beta}_{t-1})] + [(\bar{X}_t - \bar{X}_{t-1}) \hat{\beta}_t]$

where $\Delta M_t$ is the change in the mortality rate between children born in years $t$ and $t-1$ and $\bar{X}_t$ represents (appropriately weighted) means of the maternal characteristics in year $t$. The first term in brackets measures time effects, that is, changes in the mortality rate between years, holding the average characteristics of mothers fixed at the previous year's values. The second term in brackets measures selection effects, that is, the change in the mortality rate attributed to changes in the average characteristics of women giving birth. If changes in the composition of women giving birth account for patterns of mortality over time, a large part of the changes in mortality should be due to selection effects.

Estimates of equation 2, not shown, yield unsurprising results.
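The decomposition just described, which splits the year-to-year change in mean mortality into a time effect (coefficients change, mean maternal characteristics held at the previous year's values) and a selection effect (mean characteristics change), can be sketched in a few lines. This is a minimal illustration, not the authors' code: the coefficient and mean vectors below are invented numbers, the three characteristics (a constant, a rural share, and a low-schooling share) are placeholders, and the estimation step that would produce the coefficients from survey data is omitted.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def decompose_change(xbar_prev, xbar_curr, beta_prev, beta_curr):
    """Split the change in predicted mean mortality between two years.

    time_effect: change in coefficients, evaluated at last year's means.
    selection_effect: change in mean characteristics, evaluated at this
    year's coefficients. The two terms sum exactly to the total change.
    """
    beta_change = [bc - bp for bc, bp in zip(beta_curr, beta_prev)]
    xbar_change = [xc - xp for xc, xp in zip(xbar_curr, xbar_prev)]
    time_effect = dot(xbar_prev, beta_change)
    selection_effect = dot(xbar_change, beta_curr)
    total = dot(xbar_curr, beta_curr) - dot(xbar_prev, beta_prev)
    return total, time_effect, selection_effect


# Hypothetical inputs: [constant, share rural, share with low schooling].
xbar_prev = [1.0, 0.40, 0.50]
xbar_curr = [1.0, 0.42, 0.51]
beta_prev = [0.030, 0.020, 0.025]
beta_curr = [0.045, 0.025, 0.030]

total, time_effect, selection_effect = decompose_change(
    xbar_prev, xbar_curr, beta_prev, beta_curr
)
print(f"total change:     {total:.4f}")
print(f"time effect:      {time_effect:.4f}")
print(f"selection effect: {selection_effect:.4f}")
```

The exact additivity holds because a linear regression with a constant reproduces the sample mean of the outcome at the mean of the regressors; a nonlinear model such as probit would not split so cleanly.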
Infant mortality is systematically higher for women with less education, especially in the 1980s and early 1990s, for the youngest women (ages 15-19) and oldest women (ages 40-49) (although differences in infant mortality across maternal age groups are smaller and less precisely estimated than those across education groups), and for women in rural areas. These differences make it possible for shifts in the composition of women giving birth to have sizable effects on the overall infant mortality rate. But the results also show that infant mortality increased during the crisis for all groups, which is prima facie evidence that the increase in infant mortality during the crisis was not due solely to compositional changes.

The results of the decomposition exercise are shown in figure 5, which graphs year-to-year changes in infant mortality, along with the time effects and selection effects from equation 3. There is some evidence of a shift toward high-risk mothers in 1990 and toward low-risk mothers in 1991, but the time effects account for the bulk of the observed changes in mortality, both in the crisis years and across the entire time period. These results indicate that the selection of high-risk women in or out of pregnancy cannot account for the year-to-year changes in infant mortality observed.

6. Ordinary least squares estimates rather than probit estimates are used because ordinary least squares estimates produce exact linear decompositions. However, probit models yield very similar results.

FIGURE 5. Decomposition of Change in Infant Mortality into Time Effects and Selection Effects
[Figure: total change in mortality, by year of birth.]
Source: Authors' calculations based on 1986, 1991/92, 1996, and 2000 DHS.

Cholera and Other Diseases

An alternative explanation for the deterioration in child health is that adverse circumstances happened to coincide with the economic crisis.
An example is cholera, which broke out along the coast north of Lima in January 1991. Coastal areas of Peru were affected first, but the disease rapidly spread throughout the country and by the summer to neighboring countries (Colwell 1996). The number of recorded cases of cholera in Peru was 322,562 in 1991 (approximately 1.5 percent of the population) and 210,836 in 1992, after which the disease abated. There were 2,909 deaths reported in 1991 and 727 in 1992 (PAHO 2003), although these numbers are somewhat unreliable because the primary symptom of cholera, diarrhea, is associated with a number of diseases, especially in childhood.

In theory, the cholera epidemic could have caused large increases in infant mortality, but three pieces of evidence suggest that it was not responsible for the spike in infant mortality observed during the economic crisis. First, the magnitude of the cholera epidemic was simply not large enough. The estimated 17,000 excess infant deaths among children born in 1990 is an order of magnitude higher than the total number of cholera deaths (2,909) reported for individuals of all ages in Peru in 1991. Even with gross underreporting of cholera deaths, it is not likely that cholera was responsible for the bulk of the increase in infant mortality.

Second, the timing and age distribution of the mortality spike suggest that cholera was not responsible. The World Health Organization (WHO) notes that in endemic areas cholera is mainly a disease of young children, although "breastfeeding infants are rarely affected" (WHO 2000). Breastfeeding offers protection by reducing a child's exposure to infected water and food. In addition, some evidence indicates that antibodies in breast milk protect against cholera (Glass and others 1983; Hanson and others 2003).
But the results in figure 3 indicate that children born in 1990 had high rates of mortality in the first month and the first 6 months of life, even though breastfeeding would have protected many of these children. (The DHS data indicate that the median length of time for breastfeeding in Peru is 15 months, and only 10 percent of children are breastfed for 5 or fewer months.) More important, the upward spike in infant mortality is apparent among children born in the first half of 1990, who died before the cholera epidemic began.

Finally, diseases other than cholera are unlikely to explain the mortality increase. PAHO (1998) reported steady increases in malaria cases between 1989 and 1996; malaria in Peru affected only some areas of the jungle and the coast. The distribution of mortality and the timing of the increase in malaria cases do not coincide with the spike in mortality. The last measles epidemic in Peru occurred in 1992 and resulted in 263 reported deaths. Again, neither the timing of the outbreak nor the magnitude of the epidemic is a plausible explanation for the increase in infant mortality in 1990. A dengue epidemic that took place in 1990 coincides with the increase in mortality, but the total number of reported dengue cases in that year (9,623) is substantially smaller than the estimated number of excess infant deaths. Moreover, like malaria, dengue affects only the jungle and some coastal areas in Peru, whereas the increase in mortality during the crisis was nationwide.

Terrorism

Terrorism could explain some of the increase in child mortality if it hampered the government's ability to deliver health services in affected areas or if it happened to bias the estimates of infant mortality due to lack of coverage of some areas. The Peruvian Ministry of the Interior has data for 1989-95 on the number of terrorist incidents broken down by department and by year (INEI 1996). These data show an increase in the number of terrorist incidents roughly coinciding with the economic crisis.
The total number of reported terrorist incidents increased from 2,489 in 1987 to 3,149 in 1989, stayed at roughly the same level between 1989 and 1992, and dropped sharply from 2,995 in 1992 to 1,232 in 1995. Terrorism was highly concentrated in some areas of Peru, predominantly in departments in the central and southern highlands, as well as in Lima.

Using the data on the number of terrorist incidents, departments were classified as having either high or low rates of terrorism, with high rates of terrorism defined as more than 0.1 incidents per 1,000 people in every year between 1989 and 1995 or more than 0.2 incidents per 1,000 people in any year. This classification yielded roughly equal numbers of respondents in high- and low-terrorism departments.7 Infant mortality series were then estimated for each group with the 1996 and 2000 DHS. (The 1996 and 2000 surveys have consistent geographic identifiers and national coverage.) These calculations suggest that the increase in infant mortality during the economic crisis occurred in both high- and low-terrorism departments; if anything, the increase was larger in the departments with low terrorism (figure 6).

7. High-terrorism departments were Ancash, Apurimac, Ayacucho, Huancavelica, Huanuco, Junin, Lima, Pasco, San Martin, and Ucayali. Low-terrorism departments were Amazonas, Arequipa, Cajamarca, Cusco, Ica, La Libertad, Lambayeque, Loreto, Madre de Dios, Moquegua, Piura, Puno, Tacna, and Tumbes.

FIGURE 6. Infant Mortality by High- and Low-Terrorism Departments, 1996 and 2000
(Figure not reproduced. Horizontal axis: year of birth, with mortality shown for the first and second half of each birth year.)
Source: Authors' calculations based on INEI 1996 and 1996 and 2000 DHS.

It is not possible to rule out an indirect effect of terrorism on infant mortality, for example, if expenditures on the military diverted funds that would otherwise have gone to healthcare.
However, the results show that disruptions in access to healthcare or changes in the composition of the samples caused by terrorism cannot account for the changes in infant mortality observed.

IV. CONCLUSION

The extent to which macroeconomic crises affect child health is an important policy question. This article shows that the infant mortality rate increased by 2.5 percentage points during a deep economic crisis in Peru in the late 1980s. As a result, there were more than 17,000 excess infant deaths. Infant mortality peaked in 1990, when real wages hit rock bottom, healthcare use had fallen dramatically, and public health expenditures were at their lowest level in decades.

The available data do not allow for a complete parsing out of the causes of the increase in infant mortality, particularly because information on the economic circumstances of households over the crisis period is limited. As a whole, however, the evidence supports the hypothesis that the collapse in public and private expenditures on health contributed to the observed increases in infant mortality. There is no evidence that the unexpectedly high levels of infant mortality were due to changes in the consumption of food, changes in the composition of women giving birth, outbreaks of infectious disease, or terrorism.

Social expenditures in Latin American countries tend to be procyclical (De Ferranti and others 2000). The fact that the increase in infant mortality in Peru appears to be at least in part a result of a decline in healthcare use suggests there may be scope for public policies to protect households during macroeconomic crises. Reforms to budgeting processes (for example, to establish contingency funds for social expenditures during economic downturns) could be important to minimize the effects of future crises on child health.

Compared with the changes in mortality during crisis periods documented in other countries, the change in infant mortality in Peru is large.
In Argentina the financial collapse of the late 1990s did not result in increases in infant mortality (Rucci 2004). In Indonesia the 1998 financial crisis was associated with an increase in infant mortality of about 1.4 percentage points (Rukumnuaykit 2003). In Mexico macroeconomic crises in the 1980s and 1990s were associated with increases in child mortality relative to trend rates (Cutler and others 2002). Although the collapse of Russia's economy led to increases in adult mortality, there were no changes in infant mortality rates (Brainerd 1998, 2001; Brainerd and Cutler 2005; Shkolnikov and others 1998). The estimates for Peru suggest a high elasticity of infant mortality with respect to income (0.64).

There are several possible explanations for the cross-country differences in the effects of crises on infant mortality. One is that the data on vital statistics, which are used for the analysis of infant mortality trends in Argentina, Mexico, and Russia, are too inaccurate to pick up changes in infant mortality. The fact that increases in mortality in Peru are observed with the DHS data but not with the vital statistics data lends some credence to this hypothesis, although the quality of the vital statistics data in richer countries such as Russia is likely to be far superior to that in Peru. Other explanations for these differences across countries could be the depth of the crisis, which was particularly severe in the case of Peru, or the extent to which healthcare expenditures changed; in Argentina, for example, healthcare expenditures do not appear to have fallen during the crisis (Rucci 2004). Future research on the reliability of different sources of mortality data and on the importance of changes in household income and consumption relative to changes in public expenditures on health and other services would be important for the design of policies to protect child health during macroeconomic crises.

REFERENCES

Ashton, Basil, Kenneth Hill, Alan Piazza, and Robin Zeitz. 1984. "Famine in China, 1958-61." Population and Development Review 10(4):613-46.
Associated Press. 1991. "Peru State Health Workers Suspend Strike after Four Months." July 21.
Associated Press. 1992. "Peru: Health Workers Go on Strike as Cholera Continues to Spread." February 11.
Ben-Porath, Yoram. 1973. "Short-Term Fluctuations in Fertility and Economic Activity in Israel." Demography 10(2):185-204.
Brainerd, Elizabeth. 1998. "Market Reform and Mortality in Transition Economies." World Development 26(11):2013-27.
Brainerd, Elizabeth. 2001. "Economic Reform and Mortality in the Former Soviet Union: A Study of the Suicide Epidemic in the 1990s." European Economic Review 45(4-6):1007-19.
Brainerd, Elizabeth, and David M. Cutler. 2005. "Autopsy of an Empire: Understanding Mortality in Russia and the Former Soviet Union." Journal of Economic Perspectives 19(1):107-30.
Cameron, Lisa. 2002. "The Impact of the Indonesian Financial Crisis on Children: Data from 100 Villages Survey." Policy Research Working Paper 2799. World Bank, Washington, D.C.
Chay, Kenneth Y., and Michael Greenstone. 2003. "The Impact of Air Pollution on Infant Mortality: Evidence from Geographic Variation in Pollution Shocks Induced by a Recession." Quarterly Journal of Economics 118(3):1121-67.
Coale, Ansley J. 1984. Rapid Population Change in China, 1952-1982. Committee on Population and Demography 27. Washington, D.C.: National Academy Press.
Colwell, Rita. 1996. "Global Climate and Infectious Disease: The Cholera Paradigm." Science 274(5295):2025-31.
Cutler, David M., Felicia Knaul, Rafael Lozano, Oscar Méndez, and Beatriz Zurita. 2002. "Financial Crisis, Health Outcomes and Ageing: Mexico in the 1980s and 1990s." Journal of Public Economics 84(2):279-303.
De Ferranti, David, Guillermo E. Perry, Indermit S. Gill, and Luis Servén. 2000. Securing Our Future in a Global Economy. Washington, D.C.: World Bank.
Dehejia, Rajeev, and Adriana Lleras-Muney. 2004. "Booms, Busts, and Babies' Health." Quarterly Journal of Economics 119(3):1091-130.
Del Ninno, Carlo, and Mattias Lundberg. 2002. "Treading Water: Long-Term Impact of the 1998 Flood on Nutrition in Bangladesh." World Bank, Washington, D.C.
Foster, Andrew D. 1995. "Prices, Credit Markets and Child Growth in Low-Income Rural Areas." Economic Journal 105(430):551-70.
Frankenberg, Elizabeth, Duncan Thomas, and Kathleen Beegle. 1999. "The Real Costs of Indonesia's Economic Crisis: Preliminary Findings from the Indonesia Family Life Surveys." Labor and Population Working Paper Series 99-04. RAND Corporation, Santa Monica, Calif.
Glass, R. I., A. M. Svennerholm, B. J. Stoll, M. R. Khan, K. M. Hossain, M. I. Huq, and J. Holmgren. 1983. "Protection against Cholera in Breast-Fed Children by Antibodies in Breast Milk." New England Journal of Medicine 308(23):1389-92.
Glewwe, Paul, and Gillette Hall. 1994. "Poverty, Inequality and Living Standards during Unorthodox Adjustment: The Case of Peru, 1985-1990." Economic Development and Cultural Change 42(4):689-717.
Hanson, Lars, Marina Korotkova, Samuel Lundin, Liljana Haversen, Sven-Arne Silfverdal, Inger Mattsby-Baltzer, Birgitta Strandvik, and Esbjorn Telemo. 2003. "The Transfer of Immunity from Mother to Child." Annals of the New York Academy of Sciences 987:199-206.
Hoddinott, John, and Bill Kinsey. 2001. "Child Growth in the Time of Drought." Oxford Bulletin of Economics and Statistics 63(4):409-36.
INEI (Instituto Nacional de Estadística e Informática). 1996. Perú: Estadísticas de la Criminalidad, 1994-96. Lima.
Jensen, Robert. 2000. "Agricultural Volatility and Investments in Children." American Economic Review 90(2):399-404.
McKenzie, David J. 2004. "The Consumer Response to the Mexican Peso Crisis." Stanford University, Department of Economics, Palo Alto, Calif.
McKenzie, David J., and Ernesto Schargrodsky. 2004. "Buying Less, but Shopping More: Changes in Consumption Patterns during a Crisis." Stanford University, Department of Economics, Palo Alto, Calif.
Mroz, Thomas, Laura Henderson, and Barry Popkin. 2001. "Monitoring Economic Conditions in the Russian Federation: The Russia Longitudinal Monitoring Survey 1992-2000." University of North Carolina at Chapel Hill, Carolina Population Center.
PAHO (Pan American Health Organization). 1998. Health in the Americas, 1998 Edition. Volume II. Washington, D.C.
PAHO. 2003. "Cholera: Number of Cases and Deaths in the Americas (1991-2001, by country and year)." Washington, D.C. Available online at www.paho.org/english~hcp/ha/eer/cholera-1991-2001.htm; accessed March 3, 2003.
Palloni, Alberto, and Kenneth Hill. 1997. "The Effects of Economic Changes on Mortality by Age and Cause: Latin America, 1950-90." In Georges Tapinos, Andrew Mason, and Jorge Bravos, eds., Demographic Responses to Economic Adjustment in Latin America. Paris: International Union for the Scientific Study of Population.
Rucci, Graciana. 2004. "The Role of Macroeconomic Crisis on Births and Infant Health: The Argentine Case." University of California, Los Angeles, Department of Economics.
Ruhm, Christopher. 2000. "Are Recessions Good for Your Health?" Quarterly Journal of Economics 115(2):617-50.
Rukumnuaykit, Pungpond. 2003. "Crises and Child Health Outcomes: The Impacts of Economic and Drought/Smoke Crises on Infant Mortality and Birthweight in Indonesia." Michigan State University, Department of Economics, East Lansing, Mich.
Schady, Norbert R. 2004. "Do Macroeconomic Crises Always Slow Human Capital Accumulation?" World Bank Economic Review 18(2):131-54.
Shkolnikov, Vladimir, Giovanni Cornia, David Leon, and France Mesle. 1998. "Causes of the Russian Mortality Crisis: Evidence and Interpretations." World Development 26(11):1995-2011.
Stein, Zena, Mervyn Susser, Gerhard Saenger, and Francis Marolla. 1975.
Famine and Human Development: The Dutch Hunger Winter of 1944-45. New York: Oxford University Press.
Stillman, Steven, and Duncan Thomas. 2004. "The Effect of Economic Crises on Nutritional Status: Evidence from Russia." Discussion Paper 1092. Institute for the Study of Labor, Bonn, Germany.
Strauss, John, and Duncan Thomas. 1998. "Health, Nutrition, and Economic Development." Journal of Economic Literature 36(2):766-817.
Strauss, John, Kathleen Beegle, Agus Dwiyanto, Yulia Herawati, Daan Pattinasarany, Elan Satriawan, Bondan Sikoki, Sukamdi, and Firman Witoelar. 2002. "Indonesian Living Standards Three Years after the Crisis: Evidence from the Indonesia Family Life Survey." RAND Corporation, Santa Monica, Calif.
U.S. Census Bureau. 2004. International Data Base. Online at www.census.gov/ipc/www/idbacc.html.
Waters, Hugh, Fadia Saadah, and Menno Pradhan. 2003. "The Impact of the 1997-98 East Asian Economic Crisis on Health and Health Care in Indonesia." Health Policy and Planning 18(2):172-81.
WHO (World Health Organization). 2000. "Cholera." Fact Sheet 107. Geneva. Available online at www.who.int/mediacentre/factsheets/fs107/en/index.html.
World Bank. 1999. "Peru: Improving Health Care for the Poor." Washington, D.C.
World Bank. 2004. "East Asia Update: Regional Overview: Scaling up Poverty Reduction: Lessons and Challenges from China, Indonesia, Korea and Malaysia." Washington, D.C.
Yamano, Takashi, Harold Alderman, and Luc Christiaensen. 2003. "Child Growth, Shocks, and Food Aid in Rural Ethiopia." Policy Research Working Paper 3128. World Bank, Washington, D.C.

Entrepreneurship Selection and Performance: A Meta-Analysis of the Impact of Education in Developing Economies

Justin van der Sluis, Mirjam van Praag, and Wim Vijverberg

This meta-analytical review of empirical studies of the impact of schooling on entrepreneurship selection and performance in developing economies looks at variations in impact across specific characteristics of the studies.
A marginal year of schooling in developing economies raises enterprise income by an average of 5.5 percent, which is close to the average return in industrial countries. The return varies, however, by gender, rural or urban residence, and the share of agriculture in the economy. Furthermore, more educated workers typically end up in wage employment and prefer nonfarm entrepreneurship to farming. The education effect that separates workers into self-employment and wage employment is stronger for women, possibly stronger in urban areas, and also stronger in the least developed economies, where agriculture is more dominant and literacy rates are lower.

The theory of human capital posits that one of the main drivers of investment in schooling is the notion that schooling produces skills that raise worker productivity and income. Education, therefore, is thought to be beneficial for economic growth. The development literature thus includes numerous studies that attempt to quantify the rate of return to education. Psacharopoulos (1994) has brought together the evidence from 140 studies from around the world in a way that allows both international comparisons and trend analyses. However, almost without exception, returns to schooling refer to the returns employees generate from their years at school (Bennell 1996). By contrast, the literature on measurement of the rate of return to schooling in entrepreneurship or its most common empirical equivalent, self-employment, is actually still poorly defined, both for industrial countries (van der Sluis and others 2003) and for developing economies, the subject of this study.1

Justin van der Sluis is a Ph.D. student at the University of Amsterdam and the Tinbergen Institute; his email address is j.vandersluis@uva.nl. Mirjam van Praag is professor of entrepreneurship at the University of Amsterdam and research fellow at the Tinbergen Institute; her email address is c.m.vanpraag@uva.nl. Wim Vijverberg is professor of economics and political economy at the University of Texas, Dallas, and research fellow at the Institute for the Study of Labor (IZA); his email address is vijver@utdallas.edu. The authors gratefully acknowledge the financial support of the World Bank and comments from the editor and anonymous referees.

THE WORLD BANK ECONOMIC REVIEW, VOL. 19, NO. 2, pp. 225-261, doi:10.1093/wber/lhi013. Advance Access publication September 28, 2005. © The Author 2005. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oupjournals.org.

The objective here is to assess whether and to what extent schooling affects entrepreneurship entry and performance in developing economies.2 The analysis brings together more than 80 studies that measure these effects. A careful reading of the studies reveals that a simple summary is problematic because definitions of variables, empirical models, and data sources differ so much. Thus meta-analytical techniques based on factors that characterize each study are used to investigate the education effects.

A meta-analytical approach yields a quantitative assessment of the literature that complements the more standard literature survey, which highlights particular high-quality pieces of research. A meta-analytical review forces precise comparisons of the research practices and methodologies applied in the studies and provides a quantitative explanation of the variation among the many research outcomes.3 Such analysis permits a deeper understanding of the gaps and opportunities in this rather poorly developed area of entrepreneurship research.
The results are compared with both the better developed studies on employee returns to schooling and the equally poorly developed studies on entrepreneurs in developed economies.

By assembling the evidence from entrepreneurship studies, this study also contributes a piece to the puzzle of the relationship between education and economic growth. The evidence summarized by Psacharopoulos (1994) pertains mostly to one type of microeconomic study, that of wage employment, which tends to yield more positive results than macroeconomic research into the returns to education (Bosworth and Collins 2003). This article summarizes microeconomic studies pertaining to entrepreneurship and asks what factors cause variation among these studies.

This topic is clearly of great relevance. Researchers and practitioners alike are fully aware of the contributions of entrepreneurs to the economy. Entrepreneurs generate a substantial part of national income and employment in most countries. Small enterprises form a large, flexible buffer between salaried employment and incorporated businesses. Moreover, entrepreneurship may generate benefits for society through the development and maintenance of human and social capital that occur when entrepreneurial activity takes place.

1. In a parallel paper, van der Sluis and others (2003) examine the same set of relationships in industrial countries. The analysis of industrial country studies is not easily fused with that of developing economy studies, however, because in developing economies the general level of schooling is lower and a substantial portion of the labor force works on the farm.

2. For more general surveys on entrepreneurship, see Mead and Liedholm (1998), King and McGrath (1999), and Kiggundu (2002). For a survey on the separate literature on the impact of education on farm production, see Jamison and Lau (1982) and Lockheed and others (1987). This literature is also summarized in a meta-analysis by Phillips (1994).
This article does not cover the role of education in agricultural activities.

3. Space constraints preclude describing individual studies in detail, which would itself be a useful contribution.

In developing economies, the size and economic importance of the entrepreneurial sector have long been underestimated. In line with Lewis (1954) and Ranis and Fei (1964), studies of economic development have emphasized agriculture and industry. The work by Harris and Todaro (1969) illustrates that workers shifting from agriculture to industry may face a period of unemployment or may be forced to provide for themselves through a low-productivity household enterprise. A 1972 International Labour Organization report (ILO 1972) extended that notion and labeled such enterprises the "informal sector." The concept has proved to be one of the more influential ideas in development economics for right or wrong reasons (see, for example, House 1984; Mead and Morrison 1996; Peattie 1987). As defined, the informal sector covered all economic activity that was hidden from official oversight and that tended not to be very productive. Soon, the perception ruled that the informal sector consisted mainly of small enterprises, unable to make any significant contribution to national economic growth and undesirable for anyone striving to make a decent living.

Although many small enterprises appear unproductive, in the late 1980s large-scale household surveys began to uncover much hidden entrepreneurial activity.4 It was found that small enterprises make useful contributions to household income, and that some blossom into large operations. With household entrepreneurship found to be so extensive, researchers began to ask what determines the income from household enterprises: for instance, what does schooling contribute? If employment in the formal sector is so much more desirable, why do people want to start a household enterprise?
There are good theoretical reasons to presume that education is a determinant of entrepreneurship selection and entrepreneurial success (discussed later) in developing economies. If education in fact improves entrepreneurial performance and results in more entrepreneurs, that would justify appropriate investments in education. This article assesses whether and to what extent schooling affects entrepreneurship selection and performance and evaluates the state of the art of research of this kind.

The article first summarizes the economic theory on the relationship of entrepreneurship entry, performance, and educational attainment. It then describes the data gathering and the characteristics of the database, as well as current research into the relationship between schooling and entrepreneurship entry and performance. Next, it details the construction of subsamples for the meta-analysis used to explain cross-study differences in the relationship between schooling and entrepreneurship (entry and performance). The results from the meta-analysis on performance are then compared with the findings for developed economies, and the relationships between schooling and entrepreneurship selection are examined.

4. In contrast, studies of entrepreneurship relied on surveys sampled from lists of registered (and therefore larger scale) enterprises and therefore presented a biased view of entrepreneurship: only successful entrepreneurs would grow and eventually register. For an example in Côte d'Ivoire, see Vijverberg (1992); for an example across Botswana, Kenya, Malawi, Swaziland, and Zimbabwe, see Mead (1994) and Mead and Liedholm (1998).

The theoretical literature proposes several determinants of entrepreneurship selection and performance. Among them are attitude to risk, access to capital, labor market experience, economic conditions, family background, psychological traits, income diversification, access to credit, and education.
This section briefly reviews the theoretical arguments on the relationship between schooling and entrepreneurship.

Education as a Determinant of Entrepreneurship Selection and Performance

The level of education might influence the propensity to become self-employed through several channels (Le 1999). Education enhances managerial ability, which increases the probability of entrepreneurship (Calvo and Wellisz 1980; Lucas 1978). Working in the opposite direction, higher levels of education might generate better options (more lucrative paid wage employment under better working conditions) and thus decrease the likelihood of entrepreneurship. It remains unclear what the predicted effect of these offsetting forces might be.

Education may also influence entrepreneurship performance in several ways. According to the Mincerian specification of the determinants of individual earnings, the main factors affecting earnings are schooling and experience. This specification and the implied positive returns to schooling have found empirical support in the wage sector. This reasoning would seem to apply in other occupational sectors as well, such as entrepreneurship, but little systematic work has been done on the subject.

Schooling is acknowledged both for its productive effect on the quality or quantity of labor supplied, as assumed by Mincer, and for its value as a signal of productive ability in labor markets without complete information (Riley 2002; Spence 1973). For entrepreneurs, the education signal may be helpful in dealing with clients, suppliers, bankers, and so on, and thus raise productivity.

Integrated Models of Choice and Performance

Another type of model simultaneously explains the occupational choice and performance of labor market participants. In these structural models the division between entrepreneurs and wage labor turns on the distribution of individual characteristics among the utility-maximizing population.
In Lucas (1978) and van Praag and Cramer (2001) this characteristic is individual entrepreneurial ability as determined by, for instance, education. In such models education generates higher levels of expected entrepreneurial ability, which cause higher levels of expected entrepreneurial performance (in terms of profit and firm size). This higher level of expected performance, and thus of income and nonmonetary returns, increases the expected utility attached to entrepreneurship and thereby favors this occupational choice. Similarly, Vijverberg (1986, 1993) models occupational choice as a time allocation problem in which people choose from different income-generating activities (see also Roy 1951). Education has different effects on productivity for different activities, and people with different education backgrounds may have varying preferences for those activities. Thus, education affects sorting outcomes, but the net direction of the impact is an empirical matter.

Education may affect sorting outcomes in several other ways. First, it interacts with the seasonality of on-farm work. During the slack agricultural season, many farm workers seek off-farm employment, but the scarcity of jobs (due partially to the lack of tolerance of nonfarm business ventures for seasonal fluctuations) forces farm workers to enter some sort of nonfarm self-employment activity (Haggblade and others 1989; Lanjouw and Lanjouw 2001). Manual jobs that lend themselves to short-term self-employment require less education, because more educated workers establish themselves in more full-time activities. Second, households seek to diversify their income. They may operate a nonfarm enterprise to offset uncertainty in farming outcomes, and the education level of household members may determine who does what kind of work (see De Janvry and Sadoulet 2000; Haggblade and others 2002; Lanjouw and Lanjouw 2001; Reardon and others 2000).
Third, education is associated with greater household wealth. Because credit markets function poorly, nonfarm enterprises depend on farm income to finance their operations and investment (Lanjouw and Lanjouw 2001; Reardon and others 2000). The start-up of entrepreneurial activity is often financed with family assets rather than with loans, and loans themselves are easier to get if the household has some wealth to offer as security (Paulson and Townsend 2001). In all this, education helps people perceive economic opportunities (Schultz 1980).

Thus there are many economic reasons to explain how education affects entrepreneurship choice and entrepreneurial performance. The word choice is used loosely here. There are also push factors that take people from agriculture into nonfarm self-employment: failed harvest, population pressure, rationed wage jobs. But this is not a random evolution of the rural economy either, because it could easily be argued that education guides this sorting process as well. As Le (1999, p. 386) notes, educational attainment is "one of the major theoretical determinants."

In building a database for meta-analysis, the first concern is coverage: how representative of the literature are the collected documents (Nijkamp and Poot 2002)? The aim of this study is complete coverage of empirical studies that estimate a quantified relationship between entrepreneurship (entry or performance) and education. Because the relevant literature is widely scattered, several restrictions are imposed: to be included in the database, the studies must be written in English, be written for an academic audience, pertain to developing economies or economies in transition, and have been published after 1980 and before June 2003, the date by which construction of the database was completed.5 The data search covered journal articles, book chapters, books, and working papers, which is a wide net to cast.
Working papers and other unpublished papers are included because that was the only way of incorporating the most recent research output, and it enlarges the sample.6

The first avenue of search was the Internet. Web of Science was the primary source for published journal articles. The primary search engines for working papers were the Social Science Research Network, Working Papers in Economics, and the working paper series of well-known research institutes such as the National Bureau of Economic Research, the World Bank, the World Institute for Development Economics Research, the Institute for the Study of Labor, and the Centre for the Study of African Economies at Oxford University. The second avenue of search for both published and unpublished documents was a scan through the references of each sampled paper. The Web of Science, which has a citations search function, was also used to find all other articles (in the journals covered) that refer to the studies already captured in the sample.

This search resulted in the collection of 84 studies, each with at least one valid observation on the quantified relationship between schooling and entrepreneurship entry or selection (a transition to entrepreneurship) or on the quantified relationship between schooling and performance (earnings, duration). The studies are listed in the second part of the reference list. Altogether, the 84 studies yielded 203 observations, 161 of them from published sources. Among the 203 observations, 129 (64 percent) examine performance, 19 (9 percent) investigate entry into entrepreneurship, and 55 (27 percent) specify the dependent variable as "being self-employed." This last category is a stock (rather than flow) variable that is a hybrid of entry (everyone who is self-employed has entered this occupational status) and performance (it generates an overrepresentation of survivors). These stock studies are therefore kept in a separate category.
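The breakdown of the 203 observations reported above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, not part of the authors' analysis: the counts are taken from the text, while the category labels and the helper name `share_percent` are illustrative choices made here.

```python
# Observation counts as reported in the text. The labels are illustrative
# names chosen for this sketch, not the authors' variable names.
counts = {
    "performance": 129,
    "entry": 19,
    "stock (being self-employed)": 55,
}

total = sum(counts.values())


def share_percent(n: int, total: int) -> int:
    """Return n as a share of total in whole percent, rounding half up."""
    return (200 * n + total) // (2 * total)


shares = {label: share_percent(n, total) for label, n in counts.items()}

# Published versus unpublished sources, using the 161 figure from the text.
published = 161
unpublished = total - published

for label, n in counts.items():
    print(f"{label}: {n} of {total} ({shares[label]} percent)")
print(f"published: {published}, unpublished: {unpublished}")
```

Under these inputs the three categories sum to the stated total, and the rounded shares match the percentages quoted in the text.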
This description of the studies in the database considers such facets as the definition of the primary variables of interest (entrepreneurial outcomes and education), the type of data used, and the analytical techniques employed. A subsequent section focuses more explicitly on the evidence of the relationship between entrepreneurial outcomes and education.

5. The 1980 cutoff is imposed for practical reasons of access, but it is virtually innocuous because this literature really got going only in the mid-1980s.

6. Older working papers are particularly difficult to find. Consequently, the database contains only one working paper from before 1995. To prevent double counting and to preserve the independence of observations, checks were conducted to determine whether working papers later appeared as publications (sometimes with a different title or authorship).

Measurement of Entrepreneurship, Enterprise Performance, and Education

One of the challenges in performing the meta-analysis is that researchers on entrepreneurship defined key variables of interest to this study (entrepreneurship, enterprise performance, and education) in different ways. Such variation in definitions demands great care in the design of the conceptual framework that synthesizes the available evidence in this field of research.

Although empirical definitions of entrepreneurship are fairly comparable to each other (and much more prosaic than those used in theories that refer to the innovative free mind of the resourceful spiritual entrepreneur), with most researchers defining entrepreneurs as self-employed, how to model the entrepreneurship choice is more problematic. Should the characteristics of the entrepreneur be contrasted against those of everyone else, employees, or (self-employed) farmers? Is entrepreneurship a binary choice, or should the entrepreneur be viewed as someone who chooses from many alternatives?
Table 1 illustrates the wide variety of choices that researchers have made. Most studies look at nonfarm entrepreneurship as a binary option, although another large group of studies models it as a multinomial choice. But this is only part of the complexity. A binomial model contrasts self-employment against a single alternative, but the 38 binomial studies used 6 different alternatives (see table 1). The multinomial models are spread out over seven alternatives while omitting the two most frequently used options of the binomial models. In addition there is an important discrepancy in the choice of the unit of analysis: some studies look at the choice of individuals, others at that of households. Needless to say, there is little homogeneity among the studies.

TABLE 1. Key Variables: Entry/Stock: Type of Comparisons Made
Columns: Logit/Probit; Multinomial Logit; Maximum Likelihood; Total
Rows (comparison of nonfarm entrepreneurship with): Any other form of employment; Other employment + nonemployed; Wage workers + nonemployed; Wage employment; Farming; Nonemployed; Unemployed; No entry; Migrant work; Contract/piece rate work; Total
a. This total reflects the number of log-odds combinations derived from the 26 studies using a multinomial model.
Source: Authors' analysis based on literature search described in the text.

Similarly, the literature has not yet converged on standard definitions of performance (see table 2) and educational (table 3) achievement. Of the 129 observations on performance, 70 (54 percent) focus on self-employment earnings defined in various ways, 16 percent on inputs (typically employment) as a measure of size or growth, and 15 percent on duration or survival.7 The largest number of studies model education as a straightforward linear variable reflecting years of schooling.
Some studies embellish the relationship using squared years or spline functions that allow a different slope at different schooling levels. Many other studies use dummy variables indicating level of schooling rather than years needed to attain the various levels. Of these, a few studies distinguish lower from upper elementary schooling and lower from upper secondary schooling.8 A small number of studies distinguish training and apprenticeships as less formal ways of investing in skills.

Through all of this it is important to keep in mind that schooling systems in developing countries are highly heterogeneous (Kurian 1988). In some developing economies, cognitive skills by grade level are at a par with those in industrial countries; in others basic reading, writing, and arithmetic skills are still weak at the end of elementary school (Lee and Barro 2001, p. 485). Lack of uniformity in measures of schooling may generate additional problems for a quantified meta-analysis of the relationship between schooling and entrepreneurship, with the variation in the quality of education leading to a spread in measured effects.9

TABLE 2. Key Variables: Performance: Variation across Studies
Columns: Linear; Logarithmic; Total
Rows: Earnings, income, profit (Hourly monthly; Annual; Unspecified); Inputs/size; Technical efficiency index; Duration/survival; Other; Total
a. Two studies examined annual performance measures but not in a linear or logarithmic manner.
Source: Authors' analysis based on literature search described in the text.

TABLE 3. Key Variables: Education: Variation across Studies
Columns: Entry; Stock; Performance; Total
Years of schooling
  Entered linearly 9 20 43
  Entered in quadratic form 9 15
  Entered in spline form 9
Dummy variables
  Lower elementary 1 7 7
  Upper elementary 1 7 7
  Elementary 1 19 18
  Lower secondary 7 9 11
  Upper secondary 7 9 11
  Secondary 2 15 29
  Postsecondary 3 15 19
  College graduate 9 9
  Postgraduate 3 2
  Master's degree 3
  PhD degree 2
Other human capital variables
  Dummy variable: literate 6 4
  Training 1 1 20
  Apprenticeship 1 8 6
Source: Authors' analysis based on literature search described in the text.

7. Ten studies first derive a technical efficiency index from a production frontier analysis and then examine this index as an indicator of entrepreneurial success. Eight studies analyze other performance measures, such as self-employment income as a share of total household income, a private benefit-cost ratio, the growth rate of profits, or a business diversification index.

8. The motivation is threefold. One might speculate that there is a threshold effect of education, such that benefits are gained only when cognitive skills reach a minimum level that is achieved after, say, three years of schooling, as appears to be the case in agriculture (Phillips 1994). Furthermore, given the low rates of schooling, especially among the older generation in many parts of the world, it makes sense to distinguish among these levels of schooling. In addition, unlike in some industrial countries, in developing countries secondary schooling is often broken into two levels (Kurian 1988).

9. Not reported in table 3 are three studies that count the number of people in the household in each of several education categories, one study that refers to a British-type O-level and A-level education, and five studies that did not include any education variable but used information on training or apprenticeships.

An additional complexity arises from the use of different estimation strategies. The meta-analysis distinguishes structural studies from reduced-form studies of the same relationship. Several researchers have acknowledged that self-employment is an endogenous choice, dependent on the expected performance of the enterprise or on the utility from income expected to be gained from the enterprise. Failing to account for such selectivity effects may well bias estimates of return to education. Studies labeled "structural" for the meta-analysis attempt to incorporate at least some kind of a deliberate occupational choice of labor force participants.10 Twenty-four of the 129 performance observations (14 percent) are structural. Almost none of the stock and entry studies are structural. However, none of the studies attempts to address the endogeneity of the schooling decision of individuals. This is striking, because doing so is becoming common practice in studies that measure the effect of education on wage employees (see Ashenfelter and others 1999).

The Studies and Their Data Samples

Entrepreneurship research is well known for its multidisciplinary character. Control variables may differ by discipline, which could influence the estimated impact of education. The database includes studies from five academic fields: general economics (13.7 percent), labor economics and education (9.3 percent), development economics (57.8 percent), small business and entrepreneurship (12.4 percent), and management and sociology (6.8 percent). Detailed tabulations indicate that studies of structural performance are overrepresented in general economics and development economics journals, whereas stock studies are overrepresented in labor and development economics journals. Reduced-form performance models appear more frequently in the small business and entrepreneurship journals and in working papers. Studies of entrepreneurship entry are primarily found in the management and sociology category.

There are also some noteworthy trends in the literature. In particular, analyses of entrepreneurship have become more popular in recent years, although this interest appears to be waning.
Perhaps more important, however, is the trend in the nature of the research in this field: structural form studies are a more recent phenomenon.

Sub-Saharan Africa dominates the geographical distribution of studies of entrepreneurship entry and performance, contributing 72 of 203 observations (36.5 percent). This is followed by Latin America and the Caribbean (31.5 percent), East and Southeast Asia (12.3 percent), Eastern European economies in transition (9.9 percent), North Africa and the Middle East (5.4 percent), and South Asia (4.9 percent).

Figure 1 depicts the sample size in each of the four different types of studies in the database, sorted by sample size from smallest to largest. Several studies did not report sample size; these are represented by the horizontal offset of the four charted lines (for example, 16 of the 105 studies on reduced-form performance did not report the size of their sample). Stock and entry studies tend to use larger samples than performance studies, for the simple reason that they include the nonself-employed to study the entrepreneurship choice. Seventy-three of the 89 reduced-form performance studies that reported sample size contained fewer than 1,000 observations, with one as low as 30. The trend is toward larger data sets: in general, the correlation between the size of the sample and the year the sample was gathered was positive (0.15).

FIGURE 1. Sample Size of Study Types in the Database
[Horizontal axis: study, ranked by sample size. Series: entry; stock; performance (reduced form); performance (structural form).]
Source: Authors' analysis based on literature search described in the text.

10. Many choices could fall under this heading. For example, it is often assumed that the individual is working anyway and that the only choice to be modeled is whether to be self-employed as opposed to working for a wage. However, this choice model could be augmented with many other choices: whether to work, whether to work in the public sector or the private sector, whether to work for a large corporate organization or for a smaller business with an environment similar to one's own enterprise, whether to work in an urban area or in a more rural setting, whether to quit schooling to take a job, and so on. Obviously, no study includes all of these features. The point is that structural studies attempt to remove the bias caused by ignoring one or more of these choices, but one could easily think of other omitted selectivity factors that still may bias the estimated returns to schooling. Therefore, although any comparison between reduced-form and structural model estimates has obvious limitations, the comparison of various structural studies is not entirely straightforward either.

In doing research on entrepreneurship choice and performance, there are good reasons for studying men and women separately because they face different constraints and act on different opportunities. At the same time, a household enterprise is often a joint activity of household members, with men and women working together for the benefit of the household more than of the enterprise. Stock and entry studies separate the sample by gender more often than performance studies do, but they are also more likely to treat the entrepreneurship choice as a household outcome than are performance studies (table 4). More than half of the performance studies use mixed-gender samples, frequently because the unit of observation is the enterprise rather than the entrepreneur.11

11. The appropriate education measure then pertains to the entrepreneur, the leader of the enterprise.

TABLE 4. Gender of Study Respondents

Type of Study               Female   Male    Both Male    Household   Total   Number of
                            Only     Only    and Female                       Observations
Entry                        5.6     61.1      16.7         16.7       100        18
Stock                       25.0     25.0      37.5         12.5       100        56
Performance, reduced form   11.8     23.5      62.7          2.0       100       102
Performance, structural     20.8     16.7      50.0         12.5       100        24
Total                       16.0     26.5      50.0          7.5       100       200

Source: Authors' analysis based on literature search described in the text.

IV. IMPLEMENTING THE META-ANALYSIS

Meta-analysis is a quantitative tool that is applied to synthesize previous research findings that share common aspects that can be addressed statistically. The set of meta-analytical techniques has been developed and applied mainly in the medical and natural sciences. Rare examples of the application of these techniques in economics are Phillips (1994), Card and Krueger (1995), Ashenfelter and others (1999), Nijkamp and Poot (2002), and van der Sluis and others (2003).

Constructing Subsets for the Meta-Analysis

Regression techniques are used in the analysis to "explain" the effect of schooling, referred to as b, by the various characteristics Z that were gathered for each study (equation 1). It is important to select subsets of studies for this regression analysis with homogeneous definitions for education, enterprise performance, and entry/stock. The less variation there is across studies in the measurement of these variables, the more meaningful the meta-analysis results. Insisting on strict homogeneity, however, reduces the subset of eligible studies. In particular, the requirement of homogeneity conflicts with the fractured definitions of stock/entry, performance, and education. For example, studies that specify enterprise income in linear form end up in a different subset than those that use a logarithmic form; studies using years of schooling are separated from those specifying a set of dummy variables, and so on. Many of the resulting subsets are therefore too small to permit estimation of equation 1. There is, however, a way to pool small subsets in meaningful ways.
The t-statistics of b reflect the sign and significance of the estimated relationship, where it does not matter so much whether the dependent variable is measured in linear or logarithmic form. Better yet, one may pool across all forms of performance measures, as long as the parameter estimates and t-statistics are recorded in such a way that the hypothesized effect of education points in the same direction. That approach is followed here. First, all the effects of schooling are recoded for performance measures for which "the more, the better" does not hold: exit from self-employment and the hazard out of self-employment. Next, a recoded variable t* is defined that takes a value of 0 for observations that find a significantly negative effect, 1 for those that find an insignificantly negative effect, 2 for those that find an insignificantly positive effect, and 3 for those that find a significantly positive effect. This ordered variable is then regressed on characteristics of the studies by means of the ordered probit model (equation 2). The advantage of this approach is that it allows different entrepreneurial performance indicators to be merged into a single analysis.

Thus three subsamples of performance studies are employed: studies that measure performance as log income and can use a quantitative approach (estimation of equation 1); studies measuring performance in terms of any sort of income; and pooled performance studies, measured in any manner. The variation in the effect of education on performance in the second and third subsamples can be estimated qualitatively only, by means of the ordered probit model (equation 2).

Independence of observations is a crucial statistical requirement in building suitable subsets for regression analysis. At issue is whether studies represent independent measurements of the impact of education. Detailed examination suggests that some of the studies may violate the independence assumption.
In many cases multiple observations of the same type (for instance, performance) come from a publication that uses a single data set. For example, many studies report several estimated models in an effort to demonstrate the robustness of the results. To preserve independence, only one of the estimated models is selected in these cases: the one that best shows the estimated relationship. However, where a single study presents separate estimates for men and women, for example, both are included in the database, because independence is not in danger.12 If several studies use the same data source and (roughly) the same subsample to examine the same entrepreneurial outcome, independence requires that only one study be retained.13 For this analysis this requirement led to the dropping of one observation (Lanjouw 1999).14

12. Two estimates drawn for different subsamples of a single study (or from two studies by the same author) inherit a scientific approach from a single source and might therefore still be correlated, statistically speaking, from the perspective of a meta-analysis. The use of the term independence of observations pertains to the statistical independence of the samples that generate the estimated education effects.

13. In a few other cases there is some overlap in the samples, such as one study that uses several rounds of a survey of which another study uses only one year. The gain in independence among observations was judged to be less than the loss of an observation, so no studies were dropped in such circumstances.

14. For different reasons, four studies are eliminated at this stage as well. Vijverberg (1995) and Henderson (1983) reported the effect of education interacted with some other variable; Honig (1996, 1998) reported t-statistics that were so high as to be implausible in light of the recorded R-squared statistic.
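The recoding of t-statistics into the ordered variable t* described in this subsection can be sketched as follows. This is a minimal illustration, not the authors' code: the class and function names are hypothetical, the 1.96 threshold is the one shown in the table headings, and treating a t-statistic of exactly zero as "insignificantly positive" is an assumption made here for completeness.

```python
from dataclasses import dataclass

# Two-sided 5 percent critical value, as used in the tables of the text.
CRITICAL_T = 1.96


@dataclass
class Observation:
    """One estimated education effect (hypothetical container)."""
    t_stat: float
    # False for outcomes such as exit or the hazard out of self-employment,
    # where a larger value signals worse performance.
    more_is_better: bool = True


def recode_t(obs: Observation) -> int:
    """Map a t-statistic to the ordered category t* in {0, 1, 2, 3}."""
    # Flip the sign so that a positive value always means education helps.
    t = obs.t_stat if obs.more_is_better else -obs.t_stat
    if t < -CRITICAL_T:
        return 0  # significantly negative
    if t < 0:
        return 1  # insignificantly negative
    if t <= CRITICAL_T:
        return 2  # insignificantly positive (t == 0 placed here by assumption)
    return 3      # significantly positive


# Hypothetical t-statistics, not taken from any study in the database.
sample = [Observation(2.5), Observation(-0.5), Observation(-2.5, more_is_better=False)]
codes = [recode_t(obs) for obs in sample]
```

The resulting integer codes could then serve as the dependent variable of an ordered probit routine in whatever statistical package is at hand.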
Determinants in Meta-Analytical Models

Because the subset sizes are not very large, parsimonious meta-analytical models are needed. There is always a temptation to create rich specifications and many "what if" scenarios, but subset sizes of at most 70 studies do not permit rich models. The effects that are estimated should be interpreted cautiously.

Theory provides limited guidance in generating hypotheses about the determinants of the returns to schooling for entrepreneurs or of the relationship between entry and schooling. If anything, theory provides only loose guidance for the following questions. Given that schooling is thought to be more beneficial in more vibrant economies, is the return to schooling or the effect of schooling on entry higher in East Asia than in Africa, for example? Is it higher in urban areas than in rural areas? Is the return to schooling increasing over time? Theory also predicts that part of the returns to education derive from optimal choices of enterprise inputs and sector of economic activity. Thus, studies that control for input use or business sector should be reporting smaller estimated returns. This may also bear on the effect of using a structural model, although the impact of education on selection into entrepreneurship is not clear a priori, as argued earlier. Thus, eliminating self-selection bias by estimating a structural model instead of a reduced-form model may well affect the relationship between schooling and performance (or between schooling and entry), but the direction is not predicted by theory.

In other aspects the meta-analysis is more exploratory. Are returns higher for men than for women? Is there any distinction in the effect of education across schooling levels? Does the performance measure selected affect the estimated return to schooling? Do estimates vary by sample size, by the scientific weight of the journal, or by the field of the journal in which the study is published?
Is there something like a publication or reporting bias in the sense that there is an overrepresentation of significant results? The following sections describe the explanatory variables used in the meta-analytical models.

SAMPLE CHARACTERISTICS OF OBSERVATIONS. The first group of control variables controls for three sample characteristics: the region of the world from which the sample is taken, the percentage of women in the sample, and the percentage of individuals in the sample living in urban areas.15

USE OF SECTOR AND INPUT CONTROL VARIABLES. As already hypothesized, the estimated impact of education may depend on whether the study includes controls for sector of business (or whether the sample comprises enterprises in a single industry), and whether the regression model incorporates enterprise inputs.

15. Sometimes, when the sample consisted of a mix of rural and urban residents and when the study did not supply the relevant statistics, the general ratio applying to the working population in the relevant country was used, based on World Bank statistics. Moreover, when the percentage of women was not reported, a value of 0.50 was used when the unit of observation was the household (when the research question was whether the household operates an enterprise), a value of 0.2 when there was a sense that the majority of respondents would be men, or World Bank statistics on the female labor force participation rates.

MACROECONOMIC CONDITIONS. A variable is included that indicates the (earliest) year from which the observations in the sample have been drawn, to capture temporal effects associated with technology, level of development, and similar aspects.
Other variables include the sectoral composition of the economy (agriculture, industry, services); income per capita, a more direct measure of development; gross investment, as human capital could be either a substitute for or a complement to physical capital; and rates of illiteracy, because human capital may be more precious in a context where people are less skilled. These variables are drawn from World Bank (2003). An effort was made to incorporate information about the competitiveness of the business climate as reported by the World Competitiveness Scoreboard outcome (IMD 2002), but this database included only 19 of the 50 countries in this study, too few to permit a meaningful analysis.

CHARACTERISTICS OF SOURCE DOCUMENTS. Forty-two of the 203 observations are drawn from working papers or book chapters. Where feasible, the analysis includes a dummy variable indicating whether the observation was found in a journal rather than a working paper or book. Preliminary analysis showed no effect of the branch of journal in which a study is published, so this variable is omitted. The impact factor of the journal is included in the equations as a proxy for journal quality. Studies published in the better quality journals should be of better quality, so if all better quality studies report higher (or lower) schooling effects, there is reason to believe that the "true" effect is indeed higher (or lower) than the simple average effect suggests. For journals without an impact factor and for working papers and books, the impact factor is set to 0.

PUBLICATION AND REPORTING BIAS. As Ashenfelter and others (1999) point out, it is possible that the observed universe of published results reflects solely studies with statistically significant results. Studies that failed to find a statistically significant rejection of the null hypothesis of no effect might not have been published.
It is also possible that authors systematically drop insignificant determinants from their models and that a statistically insignificant education effect leads them to omit all education variables. If so, such studies would not appear in the database. To test for such bias, the standard errors of the parameter estimate for education in the observed studies are included in the analysis. If there is no publication bias, standard errors should have no significant relationship with the coefficients. If there is publication bias, standard errors should exhibit a positive relationship with the coefficient of the schooling measure b.16

16. Hedges (1992) offers another approach to the study of publication bias. His method is briefly explored later to confirm the findings here.

In the ordered probit models of equation 2, the standard error cannot be used as a control variable because the dependent variables in these subsets are based on the t-statistics, which are partly determined by the standard errors. Therefore, the square root of the sample size N is included in these models because the standard error of the parameter estimate declines with the sample size at a rate of N^0.5. Consider the effect of N^0.5. If the true effect of the education variable is positive, its t-statistic is more likely to be significantly positive in a larger sample, and therefore the parameter on N^0.5 in the ordered probit model should be positive (and along the same lines, if the true effect is negative, the impact of N^0.5 will be negative). If the true effect of the education variable is 0, its t-statistic should hover around 0, no matter what the sample size, so that the parameter on N^0.5 should be near 0. If publication bias exists, studies with small data samples will also be reporting statistically significant education effects, so that, regardless of whether the true education effect is positive or negative, the parameter on N^0.5 should be near 0.
It is this situation, therefore, that is indicative of publication bias. Yet caution applies. If the true effect of education varies such that it is positive in some places (samples) and negative in others, the parameter on N^0.5 could be near zero even in the absence of publication bias.

PERFORMANCE MEASURES. The meta-analytical regression models that are estimated on the subsets that pool all performance measures control for the types of performance measures used to assess whether the impact of schooling differs across those measures. After some preliminary attempts, a dummy variable for earnings-related performance measures is included in these equations.

ESTIMATION METHODS AND TYPES OF DATA. Finally, a dummy variable is included for whether a structural model has been used. In the few studies that used panel data, the panel aspect appears to be largely ignored, so this distinction is not explored further.

This section explores the link between education and performance. The next looks at entrepreneurial choice.

Effect of Schooling

The preponderance of the evidence supports a relationship between schooling and entrepreneurship performance (table 5). But although the effect is positive, it is not always easily teased from the data. Perhaps the most successful specification is the one that uses years of schooling entered linearly. Thirty-three of 40 observations are positive, 19 of them significantly so. Attempts to uncover nonlinearity are not particularly successful, unless the evidence is interpreted as indicating that upper secondary schooling yields greater returns than primary or lower secondary schooling. But this is not convincing yet. Entering schooling in a quadratic form yields only insignificant parameter estimates (15 studies).17

TABLE 5. Impact of Education on Entrepreneurial Performance
Columns: Negative, significant (t < -1.96); Negative, insignificant; Positive, insignificant; Positive, significant (t > 1.96); Number of observations
Rows, years of schooling: All years, entered linearly; Primary years; All secondary years; Lower secondary years; Upper secondary years; Years of schooling squared
Rows, dummy variables: Lower primary; Upper primary; All primary; Lower secondary; Upper secondary; All secondary; Post-secondary; College
Rows, other human capital variables: Training; Apprenticeships
Source: Authors' analysis based on literature search described in the text.

Studies that use dummy variables generate the same kind of evidence. It should be noted that the database records the dummy variable effects in comparison with the base category of no schooling, even if a study actually used another group as the base category.18 Upper elementary schooling shows clearer evidence of positive returns compared with lower elementary schooling, supporting the threshold notion that has been found in agriculture (Phillips 1994). However, the impact of lower secondary schooling is not more positive than that of upper elementary schooling, and indeed the impact for postsecondary and college education is not always different from that of no schooling.

17. Note that with years of schooling entered in quadratic form, the overall effect of education is not recorded in table 5, nor is the effect of years per se, which is not interpretable without reference to the squared term.

18. This is not a fully innocuous choice, even if most studies use no schooling as the comparison group. People who have no education may be relatively heterogeneous in that some of them may not have had access to schooling in their youth and therefore have more native ability on average than those who chose to forgo schooling.
Yet six of seven estimates for upper secondary schooling are significantly positive, which is consistent with the summary of the years of schooling effect, already mentioned. In all, therefore, the relationship appears to be positive but not strongly evident.

A small number of studies incorporate information about training and apprenticeships in the regression model. This is done by counting years or by using dichotomous indicator variables. As informal, heterogeneous means of accumulating human capital, both training and apprenticeships are difficult to capture precisely in a questionnaire. Across studies, mild evidence of a positive relationship emerges.

The most common model uses the log of earnings, enterprise income, or profits as the dependent variable.19 On top of this, many of these studies use years of schooling as the measure of education. In such a model, b represents the proportional increase in income resulting from a marginal year of schooling. The estimates of all of these studies have been pooled. In cases where a splined relationship was estimated, the slope of each segment of the education-income function is treated as a separate observation: in effect, such studies yield several estimates of b pertaining to different levels of education. Because it is impossible to characterize the correlation among these estimates from the information in the studies, the slopes of the segments are assumed to be uncorrelated. Moreover, quadratic and interacted relationships are evaluated at the means (or, if means were not provided, at six years of schooling and 35 years of age).20 In all, then, there is a subsample of 49 observations (tables 6 and 7). The average value of b is 5.5 percent, a very plausible rate of return per year of schooling. The standard deviation of 6.4 percent indicates a large spread around this mean across the 49 observations.
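To make concrete how a single b is obtained from a quadratic or interacted specification, the derivation below writes out the marginal effect of schooling and evaluates it at the fallback values of six years of schooling and 35 years of age. It is a sketch under stated assumptions: the symbols y (income), S (years of schooling), A (age), and the coefficients are introduced here for illustration and are not the notation of any particular study, and the numerical coefficients in the last line are hypothetical.

```latex
% Illustrative log-income specification with a quadratic and an interaction term
\begin{align*}
\ln y &= \alpha + \beta_1 S + \beta_2 S^2 + \beta_3 (S \times A) + \dots + \varepsilon \\[4pt]
% Proportional increase in income from a marginal year of schooling
b(S, A) &= \frac{\partial \ln y}{\partial S} = \beta_1 + 2\beta_2 S + \beta_3 A \\[4pt]
% Evaluated at the fallback values S = 6 and A = 35
b(6, 35) &= \beta_1 + 12\,\beta_2 + 35\,\beta_3 \\[4pt]
% Hypothetical coefficients: beta_1 = 0.020, beta_2 = 0.002, beta_3 = 0.0004
b(6, 35) &= 0.020 + 12(0.002) + 35(0.0004) = 0.058
\end{align*}
```

With only a quadratic term, the interaction coefficient is zero and the expression reduces to the first two terms; with only an interaction, the quadratic coefficient is zero.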
Explaining the Variation in Effects

To find the cause of this spread, several regression models are estimated, all variations of a base model that includes the proportion of women and urbanites in the study sample. The small sample size precludes using all variables at the same time. Thus, the base model is expanded by adding variables one (or one group) at a time.

19. The use of the logarithm, together with an assumption that hours of work are predetermined, allows such studies to be pooled regardless of the time dimension of the income concept. Estimation results from linear and log-linear models might also be made comparable by expressing the education effect in elasticity form or by standardizing the parameter estimates with the standard deviation of earnings, but that requires descriptive statistics on earnings and education values that are often not reported in the studies.

20. t-statistics were similarly adjusted, although that must be done informally since not all of the necessary information is available. For example, in a quadratic specification, if the linear and quadratic terms are both positive and the linear term tends toward statistical significance, a simple linear model most often yields a significant positive parameter estimate. But if the linear term is positive and the squared term has a negative parameter such that a fully inverted U-shape results, a simple linear model yields an insignificant coefficient. The exact t-statistic could be computed only if the covariance were reported, but that is never the case.

TABLE 6.
Meta-Analysis of the Effect of Years of Schooling on Performance, Description of Subsample by Performance Measure

                                   Log Income   Any Income   Any Performance
Number of studies                      49           52             69
Average value of b                   0.055
Standard deviation of b              0.064
Significant negative (t < -1.96)                     0              1
Insignificant negative                              10             21
Insignificant positive                              18             29
Significant positive (t > 1.96)                     24             28

Source: Authors' analysis based on literature search described in the text.

The base model shows higher returns for women and in urban areas, by about 4 percentage points each (table 7). The gender effect is also observed in developed economies (van der Sluis and others 2003). With the inclusion of additional factors considered (groups A through M), the main finding is that b tends to be higher in studies that report a less precise estimate, which is consistent with the notion of a publication bias, but the effect of the standard error of b is significant at the 10 percent level only. The method designed by Hedges (1992) is used to explore this issue. The results show that the odds that a study with a statistically insignificant parameter estimate appears in the literature are only 0.65, but this is not statistically different from 1.0 at a p-value of 0.21.21 Again, there is only weak evidence of publication bias. Thus, the entrepreneurship literature appears to be more tolerant of insignificant estimates than the parallel literature on the returns to education in wage earnings (Ashenfelter and others 1999), perhaps because a priori expectations among researchers and journal editors are not as strong.

Other explanations for the variation in b are weak. Estimated returns may be lower when the regression model includes sector dummy variables, thus conforming to expectations. Returns may be declining over time. Returns in Southeast Asia may be slightly lower than elsewhere. There is a hint that returns are lower in countries with a higher concentration of industry and higher where services contribute more to GDP.
Returns may be lower in higher income countries. Human and physical capital appear to be substitutes, as the return diminishes in countries with greater rates of gross investment. However, none of these effects attains statistical significance.

21. This pertains to the version of Hedges's model that merely estimates the average rate of return. This model suggests an average return of 4.4 percent and a variability across studies of 3.5 percentage points. In the base model the unexplained variability across studies already declines to 2.9 percent. Moreover, the estimate of the odds rises to 0.69 with a p-value of 0.30, reinforcing the conclusion that the evidence of publication bias is weak.

[TABLE 7. Meta-Analysis of the Effect of Years of Schooling on Performance, Meta-Analytical Regression Analysis of b and t*. Columns: b (ordinary least squares), t* (ordered probit), t* (ordered probit), each with coefficient and t-statistic. Rows: base model (intercept; proportion female; proportion urban; N^0.5); added variables, in groups (include inputs?; include sector?; year of sample; structural model?; published in journal?; impact factor; standard error of b; earnings model; Sub-Saharan Africa; North Africa and Middle East; South Asia; Southeast Asia, the omitted category among the regional variables in group G; Latin America; Eastern Europe; proportion illiterate; agriculture/GDP; industry/GDP; services/GDP; income per capita, rescaled; gross investment/GDP). Cell values are not recoverable. Source: Authors' analysis based on literature search described in the text.]

Next, the subset is expanded to include all studies that examine enterprise income, earnings, or profits, whether in linear or logarithmic form, and the sign and significance of the estimated relationship are examined through an ordered probit model of t* (table 7, column 2).
Once again, the higher the proportion of women and urbanites in the sample, the more likely it is that the study finds a positive and significant education effect. The base model also contains the square root of the sample size. Its effect is positive and significant, consistent with the hypothesis that education raises income and unsupportive of publication bias.22 Among the other determinants examined, there is weak evidence that part of the effect of education plays through allocative input choices. Moreover, estimates tend to be more positive and significant in structural models.

When the subset is expanded to include all studies of enterprise performance that specify years of schooling, the main added finding is that enterprise earnings models may well find a higher proportion of significant positive education effects than studies of enterprise survival or technological efficiency, for example. This phenomenon is also observed in developed economies (van der Sluis and others 2003).

The entire exercise is repeated using studies that specify education through a series of dummy variables and with education sublevels combined into primary, secondary, and postsecondary schooling (tables 8 and 9). Once again, the base category is no schooling. On average, primary education yields a 19 percent (= e^0.174 - 1) gain, which is comparable to the 5.5 percent annually shown in table 6, because the average person in this category has less than 6 years of schooling. Entrepreneurs with secondary schooling earn 34 percent (= e^0.294 - 1) more than unschooled individuals, which on an annual basis is a bit lower than the returns reported in table 6. Those with postsecondary schooling gain 140 percent. Each of these averages is accompanied by a large standard deviation: the variation among studies is large. Which factors are associated with this variation? The subsets are too small (15, 20, and 11 studies) to examine this issue with meta-analytical regressions or Hedges's model.
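The percentage gains quoted above for the three schooling levels follow from the usual conversion of a dummy-variable coefficient in a log-income model, gain = e^b - 1. The short Python sketch below reproduces that arithmetic from the average coefficients reported in table 8; the function and variable names are illustrative assumptions.

```python
import math


def dummy_to_percent_gain(b):
    """Percentage income gain implied by a dummy coefficient b in a
    log-income model, relative to the base category (no schooling)."""
    return (math.exp(b) - 1.0) * 100.0


# Average values of b by education level, as reported in table 8.
average_b = {"primary": 0.174, "secondary": 0.294, "postsecondary": 0.874}

# Rounded to whole percentages, as in the text.
gains = {level: round(dummy_to_percent_gain(b)) for level, b in average_b.items()}
```

The conversion matters most for large coefficients: for postsecondary schooling the coefficient of 0.874 log points corresponds to a gain well above 87 percent.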
The regression estimates should therefore be taken with a grain of salt. Only the base model results and a few more systematically significant factors are reported in table 9.23 Returns to primary schooling appear lower in urban areas and in economies with larger service sectors. Secondary schooling returns are higher in more recent samples and in societies with more illiterate people, a larger agricultural sector, and a smaller industrial sector. But the strongest determinant is the effect of the standard error, suggesting a strong publication bias (consider also the insignificance of N^0.5): large estimates with large standard errors are tolerated, but smaller estimates had better be more precise.

22. The Hedges method cannot be applied to the ordered probit model of t*.

23. No regressions were attempted to explain the variation in postsecondary returns. Furthermore, other factors, as listed in tables 8 and 9, were entered but did not provide sufficient explanatory power. The few that occasionally mattered are mentioned in the discussion.

Adding all other income studies yields subsets of 20, 27, and 13 studies, all of them still small. Pooling all performance studies that use dummy variables for schooling raises the subsets to reasonable sizes (27, 43, and 25 studies).24 For primary schooling the negative urban effect weakens. Larger (or more precisely estimated) education effects occur in societies with lower literacy rates and more extensive agricultural activity. For secondary schooling, the effect appears smaller for women and somewhat larger in agricultural societies. Other factors show little correlation with b. Returns to postsecondary schooling appear lower in regressions that include controls for inputs and perhaps for sector and are higher in illiterate agricultural societies and when estimated using a structural model. As before, the impact is also stronger on earnings than on other performance measures.

24. The meta-analytical results of the all-income samples are broadly similar to those of the all-performance samples.

In sum, there is some evidence of higher returns for women and in urban areas, as well as in agricultural societies where literacy rates are lower. Inserting controls for inputs removes the allocative portion of the gain that education generates. Adding sector dummy variables removes another choice-related portion of the returns. Publication bias is evident in secondary schooling dummy variables. This suggests that researchers are using the dummy variables only if the results for secondary schooling are positive and significant. When designing a model, it is desirable to leave open the possibility of a nonlinear education effect; although quadratic and splined functions are usually ineffective in detecting nonlinearity, dummy variables may be able to indicate a nonlinear relationship.

TABLE 8. Meta-Analysis of the Effect of Schooling Dummy Variables on Performance, Description of Subsample by Education Level and Performance Measure

                                  Primary, Combined      Secondary, Combined    Postsecondary, Combined
                                  Log       Any          Log       Any          Log       Any
                                  Income    Performance  Income    Performance  Income    Performance
Number of studies                 15        27           20        43           11        25
Average value of b                0.174                  0.294                  0.874
SD of b                           0.232                  0.254                  0.487
Significant negative (t < -1.96)            1                      1                      1
Insignificant negative                      4                      7                      3
Insignificant positive                      14                     18                     9
Significant positive (t > 1.96)             8                      17                     12

Note: The categories "primary education" and "secondary education" combine lower and upper levels. Postsecondary education includes college.
Source: Authors' analysis based on literature search described in the text.

[TABLE 9. Meta-Analysis of the Effect of Schooling Dummy Variables on Performance, Meta-Analytical Regression Analysis of b and t*. Column groups as extracted: Primary, Combined; Secondary, Combined; Secondary, Combined; each with b (ordinary least squares) and t* (ordered probit), reported as coefficient and t-statistic. Rows: base model (intercept; proportion female; proportion urban; N^0.5); added variables, in groups (E. SE of b; F. earnings model; H. proportion illiterate; I. agriculture/GDP). Surviving values (coefficient, t-statistic; column assignment not recoverable): intercept 0.372, 2.87; proportion female 0.018, 0.14; proportion urban -0.269, 1.82; SE of b -0.422, 0.27; proportion illiterate 0.656, 1.05; agriculture/GDP 0.651, 1.19. For brevity, other groups of variables are omitted because they mattered less. Note: The categories "primary education" and "secondary education" combine lower and upper levels. Postsecondary education includes college. Source: Authors' analysis based on literature search described in the text.]

VI. THE EFFECT OF SCHOOLING ON THE CHOICE OF ENTREPRENEURSHIP

The second dimension of the impact of schooling deals with the choice of entrepreneurship. The logical focus of an analysis of this choice is the behavior of individuals as they make their career decisions: Who starts a business? Who goes into wage employment? Such studies of entry are relatively scarce, largely because of data limitations and researchers' inattention. The majority of studies examine being (rather than becoming) self-employed. Because the database is small, the stock and entry studies are aggregated. The estimated effects for stock and entry studies are also examined briefly to see whether they differ.

The literature offers many empirical models. Estimation methods vary from simple binomial models to elaborate structural models.25 Studies also use many different comparison categories (see tables 1-3), because the choices available to people living in developing areas are broad. All this makes comparison across studies tedious.26 To make parameter estimates comparable, they are expressed in terms of the marginal impact on the probability of nonfarm self-employment.
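Note 26 describes two rules of thumb for judging the significance of a difference between two multinomial logit coefficients when only their separate t-statistics are reported. A minimal Python sketch of those two rules follows. The function names and labels are illustrative assumptions, the 1.96 threshold is the one used in the tables, and combinations that the note does not spell out are deliberately left undetermined rather than guessed.

```python
def classify(t, crit=1.96):
    """Label a t-statistic as 'sig+', 'sig-' or 'insig' at the given threshold."""
    if t > crit:
        return "sig+"
    if t < -crit:
        return "sig-"
    return "insig"


def assumed_difference(b_j, t_j, b_k, t_k):
    """Return (b_j - b_k, assumed significance) under the two rules in note 26.

    Rule 1: b_j significantly positive and b_k insignificant or significantly
            negative -> difference assumed significantly positive.
    Rule 2: both significantly positive -> difference assumed insignificant
            (its sign is simply the sign of b_j - b_k).
    Any other combination returns None for the significance label.
    """
    diff = b_j - b_k
    c_j, c_k = classify(t_j), classify(t_k)
    if c_j == "sig+" and c_k in ("insig", "sig-"):
        return diff, "significantly positive"
    if c_j == "sig+" and c_k == "sig+":
        return diff, "insignificant"
    return diff, None


# Hypothetical coefficients and t-statistics for two activity categories.
example = assumed_difference(0.10, 2.5, 0.02, 0.8)
```

These are assumptions about significance rather than tests: an exact test would need the covariance between the two estimates, which the studies do not report.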
25. More advanced studies recognize that nonagricultural self-employment is one alternative among several and therefore that a multinomial choice model is preferable. The econometric model of choice is the multinomial logit model in which the parameters are identified relative to a base category. A multinomial probit model avoids the independence of irrelevant alternatives assumption but is more difficult to estimate. A nested multinomial logit model does not always produce plausible nesting structures.

26. However, different studies use different base categories (for example, farming, wage employment, or nonemployment), so the estimated schooling parameter of the self-employment selection equation cannot be compared across studies. Given the possible categories j (= 1, ..., J, in general), studies report the estimates and t-statistics (or standard errors) of β_j and β_k, but not of β_j - β_k for every combination of j and k. Of course, one may, and the meta-analysis does, compute estimated values of β_j - β_k, but studies do not provide enough information to conclude anything definite about the significance level of this difference. Because the aim of this study is to understand the impact of education on the self-employment choice in detail, some reasonable assumptions are made to enable evaluation of the significance levels of the impact of education on the choice between every combination of economic activities. For example, if β_j is significantly positive and β_k is insignificantly different from zero (or is significantly negative), β_j - β_k is assumed to be significantly positive. Or if both β_j and β_k are significantly positive, β_j - β_k is assumed to be insignificantly different from zero (but still positive if β_j - β_k > 0).

Table 10 describes the evidence from studies that analyze the impact of education on the self-employment choice, sorted by base category. The table shows only studies for which at least one type of education variable has been used more than five times. There is considerable consistency among studies that specify education in years of schooling and studies that employ education category variables. Overall, relative to a heterogeneous set of other forms of employment (panel A), education lowers the likelihood of nonfarm self-employment by an average of 1.3 percentage points per year of schooling. The effect is frequently statistically significant. The contrast with wage employment (panel C) is much more sharply negative, at a 6.8 percentage point decline per year. Moreover, a rise in schooling level pulls people out of farming (panel D) at a rate of 8.1 percent per year of schooling. Relative to a combination of nonemployment and all alternative forms of employment, education may weakly favor nonfarm self-employment (panel B). In combination, panels A and B suggest that more educated individuals are less likely to be nonemployed than to be engaged in a nonfarm enterprise, but this tends to be contradicted by the more often negative estimates summarizing the explicit comparisons between nonemployment and nonfarm self-employment (panel E). Finally, panel F explicitly compares entry and nonentry, but the 10 observations are from a single study on transitions into entrepreneurship after communism in Hungary. This evidence shows an ambiguous effect of education. Altogether, schooling is associated with a distinct sorting in the labor market of developing countries.

[TABLE 10. Education and Entrepreneurship Choice: Descriptive Analysis. Impact on probability of nonfarm self-employment. Columns: education variable, number of observations, mean, SD, significant negative, insignificant negative, insignificant positive, significant positive. Panels: A. Relative to any other form of employment; B. Relative to any other form of employment or nonemployment; C. Relative to wage employment; D. Relative to farming (rows in panels A through D: years; primary, combined; secondary, combined; postsecondary, combined); E. Relative to nonemployment (rows: years; primary; secondary; postsecondary); F. Relative to no entry into nonfarm self-employment (row: secondary). Cell values are not recoverable. Note: The categories "primary education" and "secondary education" combine lower and upper levels. Postsecondary education includes college. Source: Authors' analysis based on literature search described in the text.]

The comparison of wage employment and self-employment has been studied most often, although the subsets of studies are still quite small. Nevertheless, because the schooling impact appears fairly stable, as evident in table 10, with proper caution the subsets permit a deeper meta-analysis (table 11). As before, various hypotheses are explored from a base model that includes the proportion of the sample that is female and that resides in urban areas. The base model results suggest that as the level of education rises, a woman is more likely than a man to choose wage employment over nonfarm entrepreneurship. Similarly, a more educated urban resident is more likely to select a wage job than a more educated rural resident, though the estimated difference is not as strong as between the genders. Adding variables one at a time helps explain the variation among studies.27 The year in which the data were generated proves irrelevant: the global level of technology or globalization has no discernible impact on local labor market sorting patterns. In regard to the macroeconomic variables, the meta-analysis of the effect of years of schooling yields a distinctly different association pattern than the education category analyses.
The years of schooling model suggests that b is more negative in agricultural societies with higher illiteracy: because educated workers are scarcer, education opens up more opportunities in wage employment. The education category models show no such association. The years of schooling model finds that b is less negative when the economy is more industrial; the education category models show the opposite. The education category models also show that b becomes smaller but remains negative in service-oriented economies; the association in the years of schooling model is inconclusive. Per capita income matters in the years of schooling model (reducing the size of b, which remains negative) but is largely absent in the education category models. The only macroeconomic variable on which all models agree is gross investment, which reduces the negative value of b, suggesting that in a faster growing economy there are more entrepreneurial opportunities for more educated individuals. Overall, the macroeconomic variables show a more plausible effect in the years of schooling model. Some of the estimated effects in the education category models are exceedingly large, possibly indicating the kind of spurious effects sometimes found in regressions run on small samples.28

In sum, the literature shows that more educated workers typically end up in wage employment, shunning nonfarm entrepreneurship. This effect is stronger for women, possibly stronger in urban areas, and also stronger in developing economies, where agriculture is more dominant and literacy rates are lower. Relative to farming, however, more educated workers seek out nonfarm entrepreneurship opportunities.

27. Regressions not reported in the table examined the field of the journal publication, which did not matter, and the differences among regions, which showed significant parameters, but the sample is too small to draw any economic conclusions.

[TABLE 11. Education and Entrepreneurship Choice: Meta-Analysis of the Choice between Wage Employment and Nonfarm Self-Employment. Columns: Years (n = 24); Primary, Combined (n = 19); Secondary, Combined (n = 24); Postsecondary, Combined (n = 16); each with coefficient and t-statistic. Rows: A. Base model (intercept; proportion female; proportion urban). B. Added variables (A. year of sample; B. structural model?; C. study of entry?; D. published in journal?, impact factor; E. SE of b; F. proportion illiterate; G. agriculture/GDP; H. industry/GDP; I. services/GDP; J. income per capita, rescaled; K. gross investment/GDP). Cell values are not recoverable. Note: The dependent variable is the parameter estimate (b) of the specified education variable in the model that explains the choice between wage employment (as the base) and nonfarm self-employment. The number in parentheses at the head of each column is the number of observations in each regression model. The categories "primary education" and "secondary education" combine lower and upper levels. Postsecondary education includes college. The meta-analytical regression models are estimated by ordinary least squares.]

VII. CONCLUSIONS AND SUGGESTIONS FOR FURTHER RESEARCH

The meta-analysis shows that an additional year of schooling raises enterprise income in developing economies by an average of 5.5 percent. This is somewhat lower than the return to education in wage employment in developing areas, which ranges from 7.2 percent per additional year of schooling on average to more than 11 percent (Psacharopoulos 1994).29 The pattern and values are similar to those for the United States, where the average return to schooling in entrepreneurial pursuits is 6.1 percent, compared with 7-9 percent for returns to wage employment. The return in developing economies tends to be higher for women, as in industrial countries, and for urban residents, but also higher in agricultural societies where literacy rates are lower.
As is to be expected, the measured return is sensitive to model specification: for example, inserting controls for inputs removes the allocative portion of the gain that education generates.

28. Whether the author formulated a structural model or examined entry rather than stock makes no difference. Publication bias is not evident. The standard error of b is not included in the education category equations because this standard error cannot be derived from studies in which the authors used a different base category for their multinomial logit model than wage employment.

29. Psacharopoulos (1994, p. 1330) also reported a 10.8 percent average rate of return to education in self-employment, which, as found here, is less than the 12.2 percent average return in "dependent employment." This value of 10.8 percent is a summary of estimates from a small early entrepreneurship literature and is, given the large standard error in tables 8 and 9, not necessarily out of line with the average of 5.5 percent found here.

With respect to entrepreneurship choice, the descriptive summary of the effect of education indicates that more educated workers typically end up in wage employment, shunning nonfarm entrepreneurship. Relative to farming, however, more educated workers seek out nonfarm entrepreneurship opportunities. For reasons of sample size limitations, a meta-analysis to explain the heterogeneity of results is feasible only for the comparison between wage employment and nonfarm self-employment. The education effect that separates workers out of self-employment into wage employment is stronger for women, possibly stronger in urban areas, and also stronger in less developed economies, where agriculture is more dominant and literacy rates are lower. Many studies report that uneducated women are concentrated in low-income sectors of food commerce and textiles.
Thus, it appears that education leads women toward the more rewarding opportunities that are to be found not in higher income entrepreneurial activities but in wage jobs.

The differential in the labor market sorting process that education brings about in developing economies stands in contrast with the lack of relationship between an individual's schooling level and the probability of selection into entrepreneurship found for industrial countries. Economic theory points out that education could have opposing effects on entrepreneurship entry, which may play out differently in developing economies: for example, the opportunity set in developing economies is larger, as are the income differentials between the sectors.

Bosworth and Collins (2003), in discussing the discrepancy between macroeconomic and microeconomic studies of the returns to education, debate the difference between social and private returns, the problem of properly measuring education, and the difficulty of comparing the quality of education across countries. The meta-analysis here adds two elements to this list. First, evidence of systematic heterogeneity in the returns to education as implied by the outcomes of the meta-analysis suggests that the macroeconomic parameter measuring the returns to education should be treated as a function of societal characteristics or as a random (rather than a constant) coefficient. Second, there is some evidence of publication bias, suggesting that the published positive microeconomic evidence might be overstating the true returns to education.

As benchmarked against the common practice in the returns-to-schooling research in the employment literature, the state of the art of research into the effect of education on entrepreneurship is somewhat disappointing. Though much effort has been directed toward the issue, many lacunae remain: issues that have not been addressed or that have been addressed inadequately.
A first drawback is the lack of homogeneity in the definitions of schooling, performance, and entrepreneurship. Only about 35 percent of the studies use simple years of schooling as the measure of educational attainment. Most researchers use a widely varying set of dummy variables for specific levels of schooling in their entrepreneurship performance and entry equations: the comparison group varies so much across studies that it is difficult to generalize. The same holds true for performance, for which various definitions are used (by themselves useful), and for entrepreneurship selection, where comparison groups are quite varied. It should be acknowledged that many of the data sets used to study self-employment activities are not specifically designed to analyze entrepreneurship. Still, to build a body of knowledge, researchers ought to pay more attention both to the systematic operationalization of entrepreneurship concepts and to the reporting of the results (including, it must be said, a proper description of the data). Novelty for the sake of product differentiation does not have as much value in the scientific realm as it does in the marketplace.

A second issue that should receive more attention is the role of ability and other often unobserved factors in determining entrepreneurial selection and performance. It is quite plausible that the "effect" of schooling that is typically estimated is not completely causal and is therefore biased: ability and other factors might increase performance and also lead to more schooling, thus potentially leading to a spurious positive effect of schooling on performance. A deeper theoretical concern is that schooling itself is endogenous to performance in the labor market: although future earnings are not the only reason to pursue an education, the prospect of earning higher incomes induces many students to stay in school longer (see, for example, Glewwe 1999).
In the established returns-to-schooling literature that focuses on wage employment, this issue is well recognized. Whenever the data permit, researchers attempt to correct for the ability bias and the endogeneity of schooling by including measures of innate ability in the specifications, by using instrumental variables, or by running controlled experiments, as in twin studies. In the entrepreneurship counterpart of this literature, none of the studies mentions the endogeneity of schooling,30 and there are virtually no studies that incorporate any kind of ability measure.31 Of the 129 studies on performance that were surveyed, 19 percent corrected for selection biases and 81 percent did not. Omission of such a correction should be acknowledged in the type of recommendations made by these studies.

Furthermore, the standard model that many researchers use to correct for selectivity can be questioned. The multinomial logit model assumes that each person makes one and only one labor market choice. However, it is clear that many labor market participants, including the nonfarm self-employed, are active in more than one sector at a time or during the course of a year (Lanjouw 2001; Vijverberg 1992; Vijverberg and Haughton 2002). The multinomial logit model is not suited to analyzing such behavioral patterns (nor, by extension, are the logit and probit models), and the selectivity correction may therefore not be appropriate either.32

30. This kind of analysis is further complicated by the fact that because of cultural differences, financial constraints, or availability of schools, people with zero years of schooling may be a heterogeneous group.

31. Exceptions are Escher and others (2002), who found that cognitive ability mattered but did not control for educational attainment, and Vijverberg (1999), which found an insignificant effect for various ability measures.

32. One attempt to develop a solution to this issue is found in Vijverberg (1986). A related issue that has never been discussed is the potential bias resulting from nonrepresentative participation in samples. It might well be the case that more successful entrepreneurs do not take the time to fill out questionnaires or, conversely, that poorly performing entrepreneurs are unwilling to reveal their bad state of affairs in a questionnaire.

For employees the distinction between the effects of general education and specific education is quite well known. For entrepreneurship much remains to be explored about the type of curriculum, the effect of training, and the benefits of apprenticeships. It may well be true that the type of curriculum is more important than the level of schooling, and it is conceivable that both curriculum and level of schooling affect entrepreneurial outcomes through productivity (human capital theory) and sorting (screening or signaling theory; see Wolpin 1977 and Riley 2002). A limited number of related studies include prior entrepreneurship experience of the respondent or of the respondent's parents as a determinant of the likelihood that the person pursues entrepreneurial interests, but the precise process that leads to this choice remains unclear and likely constitutes a fruitful area of future research.

Finally, entrepreneurship might be described as the process of bringing inputs, technologies, and output markets together. An important part of this process is acquiring financial capital during start-up and expansion. Obtaining credit might be viewed as one of the many dimensions of entrepreneurial performance, and indeed there is little doubt that schooling is related to the likelihood of getting loans (Bigsten and others 2000; McKernan 2002; Parker and van Praag 2003; Raturi and Swamy 1999).
Although there is thus a close link with this branch of the finance literature, a review of the precise role of education is left as a future task.

To summarize, many challenges remain in the study of the relationship between entrepreneurship and education, both qualitatively and quantitatively. This article should stimulate efforts toward a deeper, more robust understanding of the role of education in determining the decision to become an entrepreneur and in determining the returns to education among entrepreneurs. If this meta-analysis demonstrates anything, it is that in developing economies the likelihood of becoming an entrepreneur seems to rise with formal education at low levels and to fall at higher levels, and that performance is positively related to the education pursued. It is well known that schooling raises wage earnings, but if schooling is also positively related to entrepreneurship performance, this makes a stronger case for investment in human capital through schooling (perhaps including lifelong learning) at all levels.

References

Studies that contribute to the database are listed separately, including those that are referenced in the text.

Ashenfelter, Orly, Colm Harmon, and Hessel Oosterbeek. 1999. "A Review of the Schooling/Earnings Relationship with Tests for Publication Bias." Labour Economics 6(4):453-70.
Bennell, Paul. 1996. "Rates of Return to Education: Does the Conventional Pattern Prevail in Sub-Saharan Africa?" World Development 24(1):183-99.
Bigsten, Arne, Paul Collier, Stefan Dercon, Marcel Fafchamps, Bernard Gauthier, Jan Willem Gunning, Måns Söderbom, Abena Oduro, Remco Oostendorp, Cathy Pattillo, Francis Teal, and Albert Zeufack. 2000. "Credit Constraints in Manufacturing Enterprises in Africa." Discussion Paper 24. Oxford University, Institute for the Study of African Economies, Oxford.
Bosworth, Barry P., and Susan M. Collins. 2003. "The Empirics of Growth: An Update." Brookings Papers on Economic Activity 2:113-206.
Calvo, G., and S. Wellisz. 1980. "Technology, Entrepreneurs and Firm Size." Quarterly Journal of Economics 95(4):663-78.
Card, D., and A. B. Krueger. 1995. "Time-Series Minimum-Wage Studies: A Meta-Analysis." American Economic Review 85(2):238-43.
Escher, Susanne, Rafal Grabarkiewicz, Michael Frese, Gwenda van Steekelenburg, Maartje Lauw, and Christian Friedrich. 2002. "The Moderator Effect of Cognitive Ability on the Relationship between Planning Strategies and Business Success of Small Scale Business Owners in South Africa: A Longitudinal Study." Journal of Developmental Entrepreneurship 7(3):305-18.
Glewwe, Paul. 1999. "A Method for Estimating the Determinants of Schooling Outcomes." In P. Glewwe, ed., The Economics of School Quality in Developing Countries: An Empirical Study of Ghana. New York: St. Martin's Press.
Haggblade, Steven, Peter Hazell, and James Brown. 1989. "Farm-Nonfarm Linkages in Rural Sub-Saharan Africa." World Development 17(8):1173-201.
Haggblade, Steven, Peter Hazell, and Thomas Reardon. 2002. "Strategies for Stimulating Poverty-Alleviating Growth in the Rural Nonfarm Economy in Developing Countries." EPTD Discussion Paper 92. International Food Policy Research Institute, Environment and Production Technology Division, and World Bank, Rural Development Department, Washington, D.C. Available online at www.ifpri.org/divs/eptd/dp/eptdp92.htm.
Harris, J. R., and M. Todaro. 1969. "Migration, Unemployment and Development: A Two-Sector Model." American Economic Review 60(1):126-42.
Hedges, L. V. 1992. "Modeling Publication Effects in Meta-Analysis." Statistical Science 7(2):246-55.
House, William J. 1984. "Nairobi's Informal Sector: Dynamic Entrepreneurs or Surplus Labor?" Economic Development and Cultural Change 32(2):277-302.
IMD. 2002. World Competitiveness Yearbook 2002. Available online at www.imd.ch.
ILO (International Labour Organization). 1972. Employment, Incomes, and Equality: A Strategy for Productive Employment in Kenya. Geneva: International Labour Office.
Jamison, Dean T., and Lawrence J. Lau. 1982. Farmer Education and Farm Efficiency. Baltimore, Md.: Johns Hopkins University Press.
Kiggundu, Moses N. 2002. "Entrepreneurs and Entrepreneurship in Africa: What Is Known and What Needs to Be Done." Journal of Developmental Entrepreneurship 7(3):239-58.
King, K., and S. McGrath. 1999. Enterprise in Africa: Between Poverty and Growth. London: Intermediate Technology.
Kurian, George T. 1988. The World Education Encyclopedia. Oxford: Facts on File.
Lanjouw, Jean O., and Peter Lanjouw. 2001. "The Rural Non-Farm Sector: Issues and Evidence from Developing Countries." Agricultural Economics 26(1):1-23.
Le, A. T. 1999. "Empirical Studies of Self-Employment." Journal of Economic Surveys 13(4):381-416.
Lee, Jong-Hwa, and Robert J. Barro. 2001. "Schooling Quality in a Cross-Section of Countries." Economica 68(272):465-88.
Lewis, W. Arthur. 1954. "Economic Development with Unlimited Supplies of Labour." Manchester School 22(2):139-91.
Lockheed, Marleen E., Dean T. Jamison, and Lawrence J. Lau. 1987. "Farmer Education and Farm Efficiency: A Survey." Economic Development and Cultural Change 29(1):37-76.
Lucas, R. E. 1978. "On the Size Distribution of Business Firms." Bell Journal of Economics 9(2):508-23.
McKernan, Signe-Mary. 2002. "The Impact of Microcredit Programs on Self-Employment Profits: Do Noncredit Program Aspects Matter." Review of Economics and Statistics 84(1):93-115.
Mead, Donald C. 1994. "The Contribution of Small Enterprises to Employment Growth in Southern and Eastern Africa." World Development 22(12):1881-94.
Mead, Donald C., and Carl Liedholm. 1998. "The Dynamics of Micro and Small Enterprises in Developing Countries." World Development 26(1):61-74.
Mead, Donald C., and Catherine Morrison. 1996. "The Informal Sector Elephant." World Development 24(10):1611-19.
Nijkamp, Peter, and Jacques Poot. 2002. "Meta-Analysis of the Impact of Fiscal Policies on Long-Run Growth." Tinbergen Institute Discussion Paper 2813. Amsterdam.
Parker, Simon C., and C. Mirjam van Praag. 2003. "Explaining Entrepreneurial Performance: The Effects of Education and Financial Capital Constraints." University of Amsterdam, Amsterdam.
Peattie, Lisa. 1987. "An Idea in Good Currency and How It Grew: The Informal Sector." World Development 15(7):850-60.
Phillips, J. M. 1994. "Farmer Education and Farm Efficiency." Economic Development and Cultural Change 43(1):149-65.
Psacharopoulos, George. 1994. "Returns to Investment in Education: A Global Update." World Development 22(9):1325-43.
Ranis, Gustav, and John C. H. Fei. 1964. "Development of the Labor Surplus Economy: Theory and Policy." Homewood, Ill.: R. D. Irwin.
Raturi, Mayank, and Anand V. Swamy. 1999. "Explaining Ethnic Differentials in Credit Market Outcomes in Zimbabwe." Economic Development and Cultural Change 47(3):585-604.
Reardon, Thomas, J. Edward Taylor, Kostas Stamoulis, Peter Lanjouw, and Arsenio Balisacan. 2000. "Effects of Non-Farm Employment on Rural Income Inequality in Developing Countries: An Investment Perspective." Journal of Agricultural Economics 51(2):266-88.
Riley, J. G. 2002. "Weak and Strong Signals." Scandinavian Journal of Economics 104(2):213-36.
Roy, A. D. 1951. "Some Thoughts on the Distribution of Earnings." Oxford Economic Papers 3(June):135-46.
Schultz, T. W. 1980. "Investment in Entrepreneurial Activity." Scandinavian Journal of Economics 82(4):437-48.
Spence, A. M. 1973. Market Signaling: Information Transfer in Hiring and Related Processes. Cambridge, Mass.: Harvard University Press.
van der Sluis, J., C. M. van Praag, and W. Vijverberg. 2003. "Entrepreneurship Selection and Performance: A Meta-Analysis of the Impact of Education in Industrialized Countries." Tinbergen Institute Discussion Paper 2003-046/3. Amsterdam.
van Praag, C. M., and J. S. Cramer. 2001. "The Roots of Entrepreneurship and Labour Demand: Individual Ability and Low Risk Aversion." Economica 68(269):45-62.
Vijverberg, Wim P. M. 1992. "Measuring Income from Family Enterprises with Household Surveys." Small Business Economics 4(4):287-305.
Wolpin, K. 1977. "Education and Screening." American Economic Review 67(5):949-58.
World Bank. 2003. World Development Indicators 2003. Available online at www.worldbank.org/data/dataquery.html.

Studies Included in the Database

Alvarez, Roberto, and Gustavo Crespi. 2003. "Determinants of Technical Efficiency in Small Firms." Small Business Economics 20(3):233-44.
Appleton, Simon, Arne Bigsten, and Damiano Kulundu Manda. 1999. "Educational Expansion and Economic Decline: Returns to Education in Kenya, 1978-1995." Working Paper 99-6. University of Oxford, Centre for the Study of African Economies, Oxford.
Barr, Abigail M. 1995. "The Missing Factor: Entrepreneurial Networks, Enterprises and Economic Growth in Ghana." Working Paper 95-11. Oxford University, Centre for the Study of African Economies, Oxford.
Barr, Abigail M., and Truman Packard. 2002. "Revealed Preference and Self Insurance: Can We Learn from the Self-Employed in Chile?" Discussion Paper 3365. World Bank, Washington, D.C.
Blau, David. 1985. "Self-Employment and Self-Selection in Developing Country Labor Markets." Southern Economic Journal 52(2):351-63.
---. 1986. "Self-Employment, Earnings, and Mobility in Peninsular Malaysia." World Development 14(7):839-54.
Burki, Abid A. 1998. "Measuring Production Efficiency of Small Firms in Pakistan." World Development 26(1):155-69.
Chaves, Rodrigo A., Susana Sanchez, Saul Schor, and Emil Tesliuc. 2001. "Financial Markets, Credit Constraints, and Investment in Rural Romania." Technical Paper 499. World Bank, Washington, D.C.
Christofides, Louis N., and Panos Pashardes. 2002. "Self/Paid-Employment, Public/Private Sector Selection, and Wage Differentials." Labour Economics 9(6):737-62.
Co, Catherine Y., Ira N. Gang, and Myeong-Su Yun. 2002. "Self-Employment and Wage Earning in Hungary." Discussion Paper. Rutgers University, Camden, N.J.
Cohen, Barney, and William House. 1996. "Labor Market Choices, Earnings, and Informal Networks in Khartoum, Sudan." Economic Development and Cultural Change 44(3):589-618.
Copestake, James, Sonia Bhalotra, and Susan Johnson. 2001. "Assessing the Impact of Microcredit: A Zambian Case Study." Journal of Development Studies 37(4):81-100.
Corral, Leonardo. 2001. "Rural Nonfarm Incomes in Nicaragua." World Development 29(3):427-42.
Cortes, Mariluz, Albert Berry, and Ashfaq Ishaq. 1987. Success in Small and Medium-Scale Enterprises. New York: Oxford University Press.
Coulombe, Harold, and Andrew McKay. 1996. "Modeling Determinants of Poverty in Mauritania." World Development 24(6):1015-31.
Cunningham, Wendy V., and William F. Maloney. 2001. "Heterogeneity among Mexico's Micro-Enterprises: An Application of Factor and Cluster Analysis." Economic Development and Cultural Change 50(1):131-56.
Daniels, Lisa, and Donald C. Mead. 1998. "The Contribution of Small Enterprises to Household and National Income in Kenya." Economic Development and Cultural Change 47(1):45-71.
Dasgupta, Sukti. 2003. "Structural and Behavioural Characteristics of Informal Service Employment: Evidence from a Survey in New Delhi." Journal of Development Studies 39(3):351-80.
De Janvry, Alain, and Elisabeth Sadoulet. 2000. "Rural Poverty in Latin America: Determinants and Exit Paths." Food Policy 25(4):389-409.
Earle, John S., and Zuzana Sakova. 1999. "Entrepreneurship from Scratch: Lessons on the Entry Decision into Self-Employment from Transition Economies." IZA Discussion Paper 79. Institute for the Study of Labor, Bonn, Germany.
---. 2000. "Business Start-Ups or Disguised Unemployment? Evidence on the Character of Self-Employment from Transition Economies." Labour Economics 7(5):575-601.
Elbers, Chris, and Peter Lanjouw. 2001. "Intersectoral Transfer, Growth, and Inequality in Rural Ecuador." World Development 29(3):481-96.
Escobal, Javier. 2001. "The Determinants of Nonfarm Income Diversification in Rural Peru." World Development 29(3):497-508.
Funkhouser, Edward. 1992. "Migration from Nicaragua: Some Recent Evidence." World Development 20(8):1209-18.
---. 1997. "Labor Market Adjustment to Political Conflict: Changes in the Labor Market in El Salvador during the 1980s." Journal of Development Economics 52(1):31-64.
Giugale, M., S. El-Diwany, and S. Everhart. 2000. "Informality, Size, and Regulation: Theory and an Application to Egypt." Small Business Economics 14(2):95-106.
Glick, Peter, and David E. Sahn. 1997. "Gender and Education Impacts on Employment and Earnings in West Africa: Evidence from Guinea." Economic Development and Cultural Change 45(4):793-823.
---. 1998. "Health and Productivity in a Heterogeneous Urban Labour Market." Applied Economics 30(2):203-16.
Goedhuys, Micheline, and Leo Sleuwaegen. 2000. "Entrepreneurship and Growth of Entrepreneurial Firms in Côte d'Ivoire." Journal of Development Studies 36(3):122-45.
Gokcekus, Omer, Kwabena Anyane-Ntow, and T. T. Richmond. 2001. "Human Capital and Efficiency: The Role of Education and Experience in Micro-Enterprises of Ghana's Wood-Products Industry." Journal of Economic Development 26(1):103-13.
Henderson, James. 1983. "Earnings Functions for the Self-Employed: Comment." Journal of Development Economics 13:97-102.
Honig, Benson. 1996. "Education and Self-Employment in Jamaica." Comparative Education Review 40(2):177-93.
---. 1998. "What Determines Success? Examining the Human, Financial, and Social Capital of Jamaican Microentrepreneurs." Journal of Business Venturing 13:371-94.
---. 2001. "Human Capital and Structural Upheaval: A Study of Manufacturing Firms in the West Bank." Journal of Business Venturing 16(6):575-94.
Khandker, Shahidur. 1987. "Labor Market Participation of Married Women in Bangladesh." Review of Economics and Statistics 69(3):536-41.
Lanjouw, Peter. 1999. "Rural Nonagricultural Employment and Poverty in Ecuador." Economic Development and Cultural Change 48(1):91-122.
---. 2001. "Nonfarm Employment and Poverty in Rural El Salvador." World Development 29(3):529-47.
Lanjouw, Peter, Jaime Quizon, and Robert Sparrow. 2001. "Non-Agricultural Earnings in Peri-Urban Areas of Tanzania: Evidence from Household Survey Data." Food Policy 26(4):385-403.
Lanot, Gauthier, and Christophe Muller. 1997. "Dualistic Sector Choice and Female Labour Supply: Evidence from Formal and Informal Sectors in Cameroon." Working Paper 97-9. University of Oxford, Centre for the Study of African Economies, Oxford.
Lerner, Miri, and Sigal Haber. 2001. "Performance Factors of Small Tourism Ventures: The Interface of Tourism Entrepreneurship and the Environment." Journal of Business Venturing 16(1):77-100.
Lerner, Miri, Candida Brush, and Robert Hisrich. 1997. "Israeli Women Entrepreneurs: An Examination of Factors Affecting Performance." Journal of Business Venturing 12(4):315-39.
Little, Ian M. D., Dipak Mazumdar, and John M. Page Jr. 1987. Small Manufacturing Enterprises: A Comparative Analysis of India and Other Economies. New York: Oxford University Press.
Magnac, T. 1991. "Segmented or Competitive Labor Markets." Econometrica 59(1):165-87.
Maloney, William F. 1999. "Does Informality Imply Segmentation in Urban Labor Markets? Evidence from Sectoral Transitions in Mexico." World Bank Economic Review 13(2):275-302.
Martin, John P., and John M. Page Jr. 1983. "The Impact of Subsidies on X-Efficiency in LDC Industry: Theory and an Empirical Test." Review of Economics and Statistics 65(4):608-17.
McPherson, Michael A. 1996. "Growth of Micro and Small Enterprises in Southern Africa." Journal of Development Economics 48(2):253-77.
Melmed-Sanjak, Jolyne, and Carlos E. Santiago. 1996. "The Household and Employment in Small-Scale Nonfarm Enterprises." World Development 24(4):749-64.
Mengistae, Taye. 2001. "Indigenous Ethnicity and Entrepreneurial Success in Africa: Some Evidence from Ethiopia." Policy Research Working Paper 2534. World Bank, Washington, D.C.
Mesnard, Alice, and Martin Ravallion. 2001. "Is Inequality Bad for Business? A Nonlinear Microeconomic Model of Wealth Effects on Self-Employment." Policy Research Working Paper 2527. World Bank, Washington, D.C.
Mills, Bradford, and David E. Sahn. 1997. "Labor Market Segmentation and the Implications for Public Sector Retrenchment Programs." Journal of Comparative Economics 25(3):385-402.
Moock, Peter, Philip Musgrove, and Morton Stelcner. 1989. "Education and Earnings in Peru's Informal Nonfarm Family Enterprises." Working Paper 236. World Bank, Washington, D.C.
Nafziger, Wayne, and Dek Terrell. 1996. "Entrepreneurial Human Capital and the Long-Run Survival of Firms in India." World Development 24(4):689-96.
Nielsen, Helena Skyt, and Niels Westergard-Nielsen. 2001. "Returns to Schooling in Less Developed Countries: New Evidence from Zambia." Economic Development and Cultural Change 49(2):365-94.
Nziramasanga, Mudziviri, and Minsoo Lee. 2001. "Duration of Self-Employment in Developing Countries: Evidence from Small Enterprises in Zimbabwe." Small Business Economics 17(4):239-53.
---. 2002. "On the Duration of Self-Employment: The Impact of Macroeconomic Conditions." Journal of Development Studies 39(1):46-73.
Page, John M. Jr. 1984. "Firm Size and Technical Efficiency: Applications of Production Frontiers to Indian Survey Data." Journal of Development Economics 16(1/2):129-52.
Paulson, Anna, and Robert Townsend. 2001. "Entrepreneurship and Financial Constraints in Thailand." Working Paper. University of Chicago, Chicago, Ill.
Pradhan, Menno, and Arthur van Soest. 1997. "Household Labor Supply in Urban Areas of Bolivia." Review of Economics and Statistics 79(2):300-10.
Ramachandran, Vijaya, and Manju Kedia Shah. 1999. "Minority Entrepreneurs and Firm Performance in Sub-Saharan Africa." Journal of Development Studies 36(2):71-87.
Robert, Peter, and Erzsebet Bukodi. 2000. "Who Are the Entrepreneurs and Where Do They Come From? Transition to Self-Employment before, under, and after Communism in Hungary." International Review of Sociology 10(1):147-71.
Rona-Tas, Akos. 1994. "The First Shall Be Last? Entrepreneurship and Communist Cadres in the Transition from Socialism." American Journal of Sociology 100(1):40-69.
Ruben, Ruerd, and Marrit van den Berg. 2001. "Nonfarm Employment and Poverty Alleviation of Rural Farm Households in Honduras." World Development 29(3):549-60.
Saavedra, Jaime, and Alberto Chong. 1999. "Structural Reform, Institutions and Earnings: Evidence from the Formal and Informal Sectors in Urban Peru." Journal of Development Studies 35(4):95-116.
Shavit, Yossi, and Ephraim Yuchtman-Yaar. 2001. "Ethnicity, Education, and Other Determinants of Self-Employment in Israel." International Journal of Sociology 31(1):59-91.
Singh, Surendra, Ruthie Reynolds, and Safdar Muhammad. 2001. "A Gender-Based Performance Analysis of Micro and Small Enterprises in Java, Indonesia." Journal of Small Business Management 39(2):174-82.
Smith, Paula, and Michael R. Metzger. 1998. "The Return to Education: Street Vendors in Mexico." World Development 26(2):289-96.
Strassmann, W. Paul. 1987. "Home-Based Enterprises in Cities in Developing Countries." Economic Development and Cultural Change 36(1):121-44.
Teilhet-Waldorf, Saral, and William H. Waldorf. 1983. "Earnings of Self-Employed in an Informal Sector: A Case Study of Bangkok." Economic Development and Cultural Change 31(3):587-607.
Telles, Edward E. 1993. "Urban Labor Market Segmentation and Income in Brazil." Economic Development and Cultural Change 41(2):231-47.
Tiefenthaler, Jill. 1994. "A Multisector Model of Female Labor Force Participation: Empirical Evidence from Cebu Island, Philippines." Economic Development and Cultural Change 42(4):719-42.
Tran, Quoc Trung. 2000. Determinants of Income from Rural Non-Farm Business Activities in Vietnam. Master's thesis. National Economics University, Hanoi, and Institute for Social Studies, The Hague.
Verme, Paolo. 2000. "The Choice of the Working Sector in Transition: Income and Non-Income Determinants of Sector Participation in Kazakhstan." Economics of Transition 8(3):691-731.
Vijverberg, Wim. 1986. "Consistent Estimates of the Wage Equation when Individuals Choose among Income-Earning Activities." Southern Economic Journal 52(4):1028-42.
---. 1991. "Profits from Self-Employment: The Case of Côte d'Ivoire." World Development 19(6):683-96.
---. 1993. "Educational Investments and Returns for Women and Men in Côte d'Ivoire." Journal of Human Resources 28(4):933-74.
---. 1995. "Returns to Schooling in Non-Farm Self-Employment: An Econometric Case Study of Ghana." World Development 23(7):1215-27.
---. 1998. "Nonfarm Household Enterprises in Vietnam." In David Dollar, P. Glewwe, and J. Litvack, eds., Household Welfare and Vietnam's Transition. Washington, D.C.: World Bank.
---. 1999. "The Impact of Schooling and Cognitive Skills on Income from Non-Farm Self-Employment." In P. Glewwe, ed., The Economics of School Quality Investments in Developing Countries. London: Macmillan.
Vijverberg, Wim, and Jonathan Haughton. 2002. "Household Enterprises in Vietnam: Survival, Growth, and Living Standards." Policy Research Working Paper 2773. World Bank, Washington, D.C.
Winters, Paul, Benjamin Davis, and Leonardo Corral. 2002. "Assets, Activities, and Income Generation in Rural Mexico: Factoring in Social and Public Capital." Agricultural Economics 27:139-56.
Woldenhanna, T., and A. Oskam. 2001. "Income Diversification and Entry Barriers: Evidence from the Tigray Region of Northern Ethiopia." Food Policy 26(4):351-65.
Wu, Xiaogang. 2002. "Embracing the Market: Entry into Self-Employment in Transitional China, 1978-1996." William Davidson Working Paper 512. University of Michigan, Ann Arbor, Mich.
Yang, Dennis Tao, and Mark Yuying An. 2002. "Human Capital, Entrepreneurship, and Farm Household Earnings." Journal of Development Economics 68(1):65-88.
Yunez-Naude, Antonio, and Edward Taylor. 2001. "The Determinants of Non-Farm Activities and Incomes of Rural Households in Mexico, with Emphasis on Education." World Development 29(3):561-72.
Zhao, Yaohui. 1999. "Labor Migration and Earnings Differences: The Case of Rural China." Economic Development and Cultural Change 47(4):767-82.

Microfinance and Poverty: Evidence Using Panel Data from Bangladesh

Shahidur R. Khandker

Microfinance supports mainly informal activities that often have a low return and low market demand. It may therefore be hypothesized that the aggregate poverty impact of microfinance is modest or even nonexistent. If true, the poverty impact of microfinance observed at the participant level represents either income redistribution or short-run income generation from the microfinance intervention. This article examines the effects of microfinance on poverty reduction at both the participant and the aggregate levels using panel data from Bangladesh. The results suggest that access to microfinance contributes to poverty reduction, especially for female participants, and to overall poverty reduction at the village level. Microfinance thus helps not only poor participants but also the local economy.

Bangladesh has been a pioneer in the microfinance movement since its inception in the early 1980s and today is home to the most extensive microfinance operations in the world. In Bangladesh and elsewhere around the world, microfinance operations support mainly the poor and women engaged in informal activities.
Microfinance involves small-scale transactions in credit and savings designed to meet the needs of small- and medium-scale producers and businesses. Microfinance programs also offer skill-based training to augment productivity and organizational support and consciousness-raising training to empower the poor. But even though microfinance has been the focus of development and poverty reduction activities for decades, development practitioners still know relatively little about the extent of poverty reduction possible through microfinance activities. This article seeks to shed some light on the question by estimating the impact of microfinance on consumption poverty in Bangladesh using panel data from household surveys for 1991/92 and 1998/99.

Shahidur R. Khandker is lead economist in the World Bank Institute's Poverty Reduction and Economic Management Division and the Development Research Group at the World Bank; his email address is skhandker@worldbank.org. This article is a revised version of a World Bank Discussion Paper of the same title (Khandker 2003). The article benefited from discussions with Gershon Feder, M. A. Latif, Mark Pitt, Martin Ravallion, and Binayak Sen and insightful comments from two anonymous referees and from Jaime de Melo and Alan Winters. The author is grateful to Hussain Samad for excellent research assistance and to the Bangladesh Institute of Development Studies research staff for help in collecting and processing data.

THE WORLD BANK ECONOMIC REVIEW, VOL. 19, NO. 2, pp. 263-286, doi:10.1093/wber/lhi008. Advance Access publication September 8, 2005. © The Author 2005. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oupjournals.org.

Microfinance responds to the derived demand for borrowing to support self-employment and small business.
Thus unlike other transfer schemes, it requires both entrepreneurial skill and a favorable local market. Without them, the returns to the investments financed by microfinance are likely to be small, and so, too, are any reductions in overall poverty. Even if the induced marginal gains from microborrowing are large for participants, the effect of the accrued total benefits on aggregate poverty is likely to be small, for several reasons. For one, microfinance transactions are too small to exert a large impact on aggregate poverty. For another, in an economy with low economic growth, borrowing may only redistribute income rather than boost growth. Determining whether the benefits to program participants are sustainable and large enough to make a dent in the poverty of participants and society at large is important for guiding policy. Answering such questions requires an analysis of the dynamic consumption effects of microfinance interventions as well as village-level spillover effects on poverty reduction.

Khandker and Pitt (2003) examined the impacts of microfinance on a number of outcomes using panel household survey data from Bangladesh. More specifically, they considered such issues as whether the effects of microfinance are saturated or crowded out over time, whether programs generate externalities, and whether the estimated impacts of microfinance found earlier with cross-section data analysis can be corroborated using an alternative method. They found a declining long-term effect of microfinance as well as the possibility of village saturation from microfinance loans.

This article uses the same household panel data to address related issues, focusing on the poverty reduction impact of microfinance. First, it examines whether household and individual factors such as land and education influence a household's demand for a loan in microfinance schemes where the decisions to borrow and how much to borrow are made by a group.
Second, the article assesses whether microfinance reduces poverty and, if so, what the limits of poverty reduction through microfinance are. Third, it examines the spillover effects of microfinance, to determine whether the program benefits households beyond those that participate. Finally, it examines the aggregate poverty effect of microfinance to determine whether there is a society-level impact of microfinance on poverty reduction.

The article reviews the literature on microfinance and then discusses the econometric framework and the data. It estimates the demand for credit from a microfinance program based on both cross-sectional and panel data to determine whether individual and household factors matter in a household's demand for credit from a group-based microfinance program and reports the estimated effects of microfinance on the consumption of borrowers. It also discusses the spillover effects of microfinance programs and the effects on poverty reduction for both borrowers and society as a whole.

Microfinance organizations in Bangladesh, unlike their formal counterparts in the financial sector, have made great strides in delivering financial services (both savings and credit) to the poor, especially women, at very low loan default rates. Strategies such as collateral-free group-based lending and mobilization of savings, even in small amounts, have helped microfinance programs mitigate the problems that beset their formal counterparts, such as weak outreach and high loan default costs. However, the transaction costs are high for maintaining credit discipline among borrowers through group pressure and monitoring of borrowers' behavior, and programs have relied on donors to sustain their operations (Khandker 1998; Khalily and others 2000; Morduch 1999; Yaron 1994). In Bangladesh such support is provided by the government and donors in the expectation that society benefits from such investments.
In 1996 the World Bank provided a loan of $115 million to Palli Karma Shahayak Foundation (PKSF), an intermediary for wholesaling microfinance. The PKSF supports on-lending through small nongovernment organizations (NGOs) and a few large ones. This project was followed by a second project of $160 million in 2000. In 2000/2001 microfinance programs assisted some 10 million households (of a total of 30 million), with loans outstanding of about $1 billion. The organized NGO sector and Grameen Bank accounted for more than 86 percent of microfinance lending and commercial banks for just 14 percent.1

Because these government and donor resources have alternative uses from which the poor could also benefit, such as building community infrastructure, schools, and health facilities, it is important to know the extent of the socioeconomic impacts of microfinance. Are the impacts great enough to justify supporting microfinance over alternate uses? In particular, because women are disproportionately disadvantaged in countries such as Bangladesh and constitute the overwhelming majority of microfinance beneficiaries, an important policy question is whether women benefit from microfinance and, if so, by how much. At the aggregate level the policy question is whether microfinance programs benefit nonparticipants or simply redistribute income within society.

Despite differences in methodology, impact assessments show that microfinance in general helps the poor, although all participants may not benefit equally. An early study of Grameen Bank noted its support for the poor, especially women, through employment and income generation and improvements in social indicators (Hossain 1988). Some recent studies also find beneficial aspects of microfinance operations in Bangladesh (for example, Hashemi and others 1996).

1. There is a common misconception that Grameen Bank is an NGO. It is a specialized bank with its own charter approved by the government of Bangladesh.
The most comprehensive impact studies of microfinance, a joint research project of the Bangladesh Institute of Development Studies (BIDS) and the World Bank, find strong evidence that the programs help the poor through consumption smoothing and asset building (Khandker 1998; Pitt and Khandker 1998). The findings support the claim that microfinance programs promote investment in human capital (such as schooling) and raise awareness of reproductive health issues (such as use of contraceptives) among poor families. The studies also shed light on the role of gender-based targeting and its impact on household and individual welfare, finding that microfinance helps women acquire assets of their own and exercise power in household decisionmaking.2 The research project estimates that the marginal impact of microfinance on consumption was 18 percent for women and 11 percent for men (Pitt and Khandker 1998). The study finds that some 5 percent of borrowers may lift themselves out of poverty each year by borrowing from a microfinance program, if the estimated impacts on consumption continue over time (Khandker 1998). But even if this does happen, microfinance could lift less than 1 percent of the population out of poverty because it reaches only a quarter of the population.3

The robustness of these results still remains an issue, however, because impact studies are sensitive to the method applied. This article thus examines whether these findings can be substantiated by another method, such as panel data analysis. It looks at the long-run impacts of microfinance to see whether program impacts found in 1991/92 are sustainable over time. And if microfinance leads to poverty reduction at the borrower level, what is its impact on aggregate poverty? Income or consumption poverty can be reduced through interventions such as microfinance that help the poor become self-employed and generate income.
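As a rough check on the coverage arithmetic behind the aggregate figure quoted above, the calculation can be written out explicitly. This is only a sketch: it assumes that the 5 percent annual exit rate applies uniformly to all borrowers, and the symbols s, e, and a are labels introduced here for illustration, not the author's notation.

```latex
% s: share of households that borrow (assumed values, for illustration)
% e: share of borrowers leaving poverty each year (0.05, from Khandker 1998 as cited in the text)
% a: implied share of all households leaving poverty each year
\[
  a = s \times e, \qquad
  s = 0.25:\; a = 0.25 \times 0.05 = 0.0125 \quad (1.25\ \text{percent per year})
\]
\[
  s = 0.20:\; a = 0.20 \times 0.05 = 0.0100 \quad (1\ \text{percent per year})
\]
```

Under these assumptions the aggregate effect is on the order of 1 percent of households a year, and it reaches 1 percent or less only when coverage is 20 percent or lower.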
But efforts to assess the impact of microfinance programs can be biased by nonran- dom program placement and participation. Antipoverty programs such as the 2. Morduch (1998)found either small or nonexistent program effects using the same 1991192 BIDS- World Bank survey data. However, this study applied the difference-in-difference technique, which is suitable only for a randomized experimental study, whereas the BIDS-World Bank survey is of the quasi- experimental type and hence endogeneity of program participation is a serious issue. Morduch also pointed out that because of mistargeting (about 25 percent), the impacts shown in Pitt and Khandker (1998) were upper bound. However, a reexamination by Pitt (1999) showed that mistargeting was a nonissue in the estimated impacts and reconfirmed the earlier findings in Pitt and Khandker (1998),even after relaxing the targeting criteria and excluding mistargeted households from analysis. 3. If borrowers make up less than 25 percent of rural households, and if 5 percent of borrowers can move out of poverty, that means that 1 percent of households moved out of poverty in rural areas in each year due to microfinance. This may be high, given the fact that aggregate national leveI poverty estimates over time show that Bangladesh has managed to reduce poverty by 1 percent every year over the last decade (World Bank 2003). Khandker 267 Grameen Bank are often placed in areas where the incidence of poverty is high. Thus simply comparing the incidence of poverty in program and nonprogram areas may lead to the mistaken conclusion that microfinance programs have increased poverty. Similarly, those who participate may self-select into a pro- gram based on unobserved traits such as entrepreneurial ability. 
In that case, simply comparing such outcomes as per capita consumption or the incidence of poverty between program participants and nonparticipants may lead to the mistaken conclusion that the programs have a high impact on poverty reduction, when the effects are due to the unobserved abilities of participants. Thus the estimated effects may be under- or overestimated depending on the type of analysis.

The BIDS-World Bank study shows that the endogeneity of microfinance program placement and participation must be taken into account in estimations (Pitt and Khandker 1996, 1998). The study also shows that because the impacts vary by gender, that too needs to be taken into account. Looking only at the impact of borrowing by households is thus misleading.

The method used in the study by Pitt and Khandker (1998) was based on cross-section data, with a quasi-experimental survey design to resolve problems of endogeneity associated with nonrandom program placement and self-selected participation. In the quasi-experimental survey design, households were sampled in villages with and without a program, both eligible and ineligible households were sampled in both types of villages, and both program participants and nonparticipants were sampled among the eligible households in villages with microfinance programs. The two central underlying conditions for identifying program impact were the program's eligibility restriction and its gender-based program design. Any household with a landholding of less than half of an acre is eligible to participate in all microfinance programs in Bangladesh.
One can identify the program impact on participants by distinguishing who participates and who does not from among those who are eligible to participate in a microfinance program.4 Because men can join only men-only groups and women can join only women-only groups, the gender-based restriction is easily enforceable and thus observable, whereas the land-based identification restriction, for various reasons, may not be. Thus, if the land-based restriction is not observable, using the gender-based program design to identify the program effect by gender of participation is far more efficient.

4. The landholding requirement is not strictly enforced, and so nontarget households are included as program participants. In that case the identification restriction in cross-sectional analysis based on landholding may lose efficiency, although that hypothesis is subject to testing. However, the identification restriction does not apply to fixed-effects analysis using panel data, because the household fixed-effect method resolves any time-invariant participation-related endogeneity.

Program effects are conditioned by how certain villages are selected for a program by drawing randomly both program villages and nonprogram villages. The villages are further classified by women-only and men-only groups, which in turn helps identify program impacts by gender. But the issue remains why certain villages have women-only groups and others have men-only groups. Although the village-level fixed-effect method used with data from both program and nonprogram villages and from both male-only and female-only groups can resolve the endogeneity of program placement, another exogenous eligibility requirement is needed at the household level to determine why certain households and not others participate. However, there were no conditions that met the requirements of household-level exogenous eligibility conditions.
The Pitt and Khandker (1998) study uses instruments such as the interaction of the land-based eligibility rule with household and village-level characteristics to identify the program effect. However, if the land-based condition is not strictly enforced, the interaction variables based on changing land holding may not be as efficient instruments as they would be if the condition were strictly enforced. Given the sensitivity to the instruments used, there are compelling reasons to use alternative methods to demonstrate whether microfinance matters.

The quasi-experimental survey design used by Pitt and Khandker (1998) is one of many methods evaluators use to assess program effects (for a review, see Moffitt 1991 and Ravallion 2001). One alternative is the household-level fixed-effect method using panel data. There are strong reasons for using a panel survey over a cross-sectional survey in impact analysis. Cross-section results may not be robust, with some studies showing that measurement of program impacts depends significantly on the method used to treat program endogeneity (see, for example, Lalonde 1986). It is important to assess the robustness of the results using a method that, unlike the quasi-experimental method used in the Pitt and Khandker (1998) study, is less reliant on the landholding eligibility rule, which is not strictly enforceable but was nonetheless used in the cross-sectional analysis. Household-level panel data are thus used here to analyze the impact of borrowing on consumption and hence on poverty.

To show how panel data can be used to estimate program effects, assume the following reduced-form borrowing ($S_{ijt}$) by the $i$th household living in the $j$th village in period $t$:5

\[ S_{ijt} = X_{ijt}\beta_s + \eta^s_{ij} + \mu^s_j + \varepsilon^s_{ijt} \tag{1} \]

where $X$ is a vector of household-, village-, and group-level characteristics (such as age and education of household head), $\beta_s$ is a vector of unknown parameters to be estimated, $\eta$ is an unmeasured determinant of credit demand that is time-invariant and fixed within a household (it also includes unobserved group characteristics), $\mu$ is an unmeasured determinant of credit demand that is time-invariant and fixed within a village, $\varepsilon$ is nonsystematic error, and the superscript $s$ refers to unobserved error terms specific to the credit-demand equation.6

5. References to borrowing or credit are to the cumulative stock of borrowing since the start of program participation. Separate borrowing equations are not shown for men and women. This is subject to testing the restrictions on equality of parameters of the credit demand equations of men and women. A priori differential credit demand is expected for men and women. In a sex-segregated society, the credit constraints faced by women are expected to vary substantially from those faced by men. There is also a possibility that the demand for credit varies by source of finance such as Grameen Bank, BRAC, RD-12, or other NGOs. This is also tested.

Current outcome (say, consumption) is assumed to depend on both current and past characteristics, including borrowing. So the conditional demand for consumption ($C_{ijt}$) in period $t$ is given as7

\[ C_{ijt} = X_{ijt}\beta_c + \delta S_{ijt} + \gamma S_{ij(t-1)} + \eta^c_{ij} + \mu^c_j + \varepsilon^c_{ijt} \tag{2} \]

where $\delta$ and $\gamma$ measure the effects of current and past credit (stock), and superscript $c$ refers to the error terms specific to the consumption equation. According to equation 2, the return to consumption in any given period (say, 1998/99) is the sum of returns from past credit (say, 1991/92) and current credit (1998/99). So this model assumes that even if current credit ($S_{ijt}$) is zero (that is, a household stopped borrowing after period 1), past credit ($S_{ij(t-1)}$) may continue to benefit the borrower ($\gamma > 0$). Therefore, allowance is made for differential impacts of borrowing over time.8

The impact of credit on household consumption can be measured by estimating equation 2. However, household demand for credit as given in equation 1 needs to be estimated jointly with equation 2.
Using cross-section data ($t = 1$) raises the problem of endogeneity in equation 2 with respect to equation 1, as a result of possible correlation of errors in the borrowing and consumption equations (Pitt and Khandker 1998). Although the credit impact on consumption can be identified if consumption equation 2 includes variables that are not included in equation 1, or vice versa, this cannot happen under normal circumstances. Thus estimation of equation 2 cannot be separated from that of equation 1.

6. Very few households change groups in the village over the period of borrowing, so that group effects are picked up by the unmeasured village-level effects.
7. Including borrowing in the dynamic consumption equation 2 can be justified by modifying the Ramsey consumption growth model and allowing the marginal product of capital to depend on the level of borrowing in the presence of constraints on capital mobility, making households credit constrained (a similar argument on geographical capital immobility is mentioned as a factor in consumption growth in China in Jalan and Ravallion 2002). Under the assumption that households are credit constrained, the marginal product of capital depends on borrowing, given by the rate of return on borrowing (B). An optimization of consumption over time subject to production constraints can lead to an optimal rate of consumption growth C(t) as a function of the rate of return to capital (which is constrained by borrowing), rate of depreciation, and subjective rate of time preference. The error terms are assumed to include the subjective rate of time preference and rate of depreciation, which may be household specific or area specific.
8. The credit demand function (equation 1) is a reduced-form equation. However, it could have been allowed to depend on past characteristics (lagged model) in addition to current characteristics. But that change in first-stage specification does not change the consistency of the second-stage consumption equation. Moreover, if the household fixed-effect method is used for the consumption equation in panel analysis, the first-stage credit equation becomes a nonissue.

Unlike the Pitt and Khandker (1998) study, which uses a village-level fixed-effect method with an instrumental variable to resolve program placement endogeneity and household-level endogeneity using the 1991/92 cross-sectional data, this study uses panel data for households with more than one observation ($t > 1$) to estimate program effects without using an instrumental variable method. This is done by estimating a household-level fixed-effects model, which resolves both household- and village-level endogeneity, based on the assumption that the error terms of the credit demand equation and consumption equation are uncorrelated, that is, $\mathrm{Corr}(\varepsilon^s_{ijt}, \varepsilon^c_{ijt}) = 0$.

A basic assumption of the household fixed-effect method is that the unobserved factors at the household and village level remain fixed over time. But these unobserved factors may change for various reasons. For example, unobserved household income, which may condition credit demand, may increase temporarily so that with a larger cushion against risk, households may be willing to assume more loans. Similarly, the unobserved local market conditions that influence a household's demand for credit may change over time, exerting a more favorable impact on credit demand.
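The logic of the household fixed-effect method can be illustrated with a small simulation. The sketch below is not the article's estimation; it assumes hypothetical data in which an unobserved, time-invariant household trait (called `ability` here) drives both credit and the outcome, and all names and parameter values are illustrative. With only two periods, first-differencing is equivalent to the within (fixed-effect) estimator, so the fixed trait drops out.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000             # hypothetical number of households
true_effect = 0.10   # assumed effect of credit on the outcome

# Time-invariant household trait, unobserved by the analyst
ability = rng.normal(size=n)

# Two periods of credit and outcome; both depend on the unobserved trait,
# which mimics self-selection into borrowing
credit = ability[:, None] + rng.normal(size=(n, 2))
outcome = true_effect * credit + 2.0 * ability[:, None] + rng.normal(size=(n, 2))


def slope(x, y):
    """OLS slope of y on x (with an intercept)."""
    x_c = x - x.mean()
    return float(x_c @ (y - y.mean()) / (x_c @ x_c))


# Pooled comparison ignores the trait and is biased upward here
pooled = slope(credit.ravel(), outcome.ravel())

# Differencing the two periods removes anything fixed within a household
fixed_effect = slope(credit[:, 1] - credit[:, 0], outcome[:, 1] - outcome[:, 0])

print(f"pooled OLS: {pooled:.3f}, household fixed effect: {fixed_effect:.3f}")
```

In this setup the pooled estimate absorbs the effect of the unobserved trait, while the differenced estimate stays close to the assumed effect. The result depends on the trait being fixed over time, which is exactly the assumption the text goes on to question.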
Equations 1 and 2 can be rewritten to incorporate variation in $\eta$ and $\mu$ over time:

\[ S_{ijt} = X_{ijt}\beta_s + \eta^s_{ijt} + \mu^s_{jt} + \varepsilon^s_{ijt} \]
\[ C_{ijt} = X_{ijt}\beta_c + \delta S_{ijt} + \gamma S_{ij(t-1)} + \eta^c_{ijt} + \mu^c_{jt} + \varepsilon^c_{ijt} \]

Because the household-level fixed-effect method also resolves any village-level endogeneity, the credit and consumption equations can be simplified by omitting village-level unmeasured determinants ($\mu$):

\[ S_{ijt} = X_{ijt}\beta_s + \eta^s_{ijt} + \varepsilon^s_{ijt} \tag{3} \]
\[ C_{ijt} = X_{ijt}\beta_c + \delta S_{ijt} + \gamma S_{ij(t-1)} + \eta^c_{ijt} + \varepsilon^c_{ijt} \tag{4} \]

But the household fixed-effect method that controls for fixed unobserved attributes of households participating in microfinance programs may still not yield consistent estimates of the credit effects with panel data for two reasons. First, the unmeasured determinants of credit at both household and village levels may vary over time. Second, if credit is measured with errors (which is likely), the error gets amplified when differencing over time, especially with only two time periods. This measurement error will impart attenuation bias to the credit impact coefficients, biasing the impact estimates toward zero. A standard correction for both types of bias (one due to measurement error and one to time-varying heterogeneity in credit demand) is the reintroduction of instrumental variable estimation. The instruments can be similar to those applied in the cross-section analysis of Pitt and Khandker (1998).9 This requires testing whether the household-level fixed-effect or the household-level fixed-effect instrumental variable method is appropriate in estimating household consumption behavior.

The BIDS-World Bank 1991/92 survey covered 1,798 households drawn from 87 villages in 29 thanas. Eight program thanas were drawn randomly from each of the project areas of BRAC, Grameen Bank, and the Bangladesh Rural Development Board's (BRDB) Rural Development 12 (RD-12) program; five nonprogram thanas were also drawn randomly. Three villages were drawn randomly from each thana in which the programs had been in operation for at least three years.
The survey was conducted three times during 1991/92, during the three cropping seasons: round 1 during the Aman rice season (November-February), round 2 during the Boro rice season (March-June), and round 3 during the Aus rice season (July-October). However, because of attrition, only 1,769 households of the original 1,798 were available in the third round. A follow-up survey conducted in 1998/99 included the same households but also added new households from the original villages, new villages in the original thanas, and three new thanas, raising the number of sample households to 2,599.

Because this study relies on panel data to assess the impact of program participation, the study sample was restricted to the 1,638 households that were interviewed in both periods. Of the original group of 1,769 households,10 237 households had split into 546 households in 1998/99, resulting in 1,947 households. To maintain a one-to-one correspondence among matching households, the split households were treated as a single household in the resurvey data. Tests conducted to determine whether this merger was appropriate found no statistical difference in results between samples with merged and separated households.11

Of the 1,638 panel households used in the analysis, 25.8 percent of those in the 1991/92 survey were program participants, 38.0 percent were eligible nonparticipants, and 36.2 percent were nontarget households.12 In the 1998/99 resurvey 52.7 percent of households were program participants, 20.1 percent were eligible nonparticipants, and 27.3 percent were nontarget households.13 More than 95 percent of the 1991/92 participants were still with the microfinance programs in 1998/99. A major shift occurred among the eligible nonparticipants, with 47 percent joining a microfinance program by 1998/99. Among the nontarget households observed in 1991/92, some 28 percent joined a microfinance program by 1998/99.

Examination of program participation rates by landholding shows participation to be higher among target households (with 50 decimals of land or less) than among nontarget households in both periods (table 1). But program participation among nontarget households indicates potential mistargeting. The extent of mistargeting has increased, rising from about 25 percent of program borrowers in 1991/92 to 31 percent in 1998/99. Nonetheless, the extreme poor (households with 20 decimals of land or less) constituted a majority of participants (about 60 percent in 1991/92 and 54 percent in 1998/99).14 Overall, the participation rate more than doubled between the surveys. The increase is significant even if program attrition is considered (the annual dropout rate went from 5.5 percent in 1991/92 to 3.5 percent in 1998/99).15

TABLE 1. Program Participation Rate for Household Landholding Groups, 1991/92 and 1998/99 (%)
Columns: Landholding (decimals); 1991/92; 1998/99
Rows: 0; 1-20; 21-50; 51-100; 101-250; 251+; All households; Number of observations
Note: Numbers in parentheses are the share of total program participants from each landholding group, except in the last row where they are the number of program participating households.
Source: Author's computations based on 1991/92 and 1998/99 household surveys in Bangladesh.

9. The purpose for using these instruments was different in the Pitt and Khandker (1998) study, which used them to correct for the endogeneity of program participation by a household member. In contrast, these instruments are used in the panel data analysis to correct for time-varying heterogeneity and measurement errors associated with credit variables.
10. The issue of sample household dropout (attrition) between survey periods is discussed in Khandker and Pitt (2003). The attrition rate was 7.4 percent (from 1,769 households to 1,638), which is quite low. Several studies (Alderman and others 2000; Fitzgerald and others 1998; Thomas and others 1999; Ziliak and Kniesner 1998) have shown that attrition bias is not a big issue as long as it is random. Khandker and Pitt (2003), however, formally tested for attrition bias and found that it can be ignored in a majority of outcomes.
11. See Khandker and Pitt (2003) for the details of the test.
12. The sample households were drawn using a proportionate random distribution. Hence, the analysis uses weights both in the descriptive statistics and the econometric analysis.
13. Program participants in 1998/99 included members of such programs as Association for Social Advancement (ASA), PROSHIKA, Youth Development Program, Gano Shahajyo Sangstha (GSS), and some small local NGOs in addition to Grameen Bank, BRAC, and BRDB RD-12 members. They also included members of multiple programs.
14. Landholding is considered a proxy of household wealth and poverty in rural Bangladesh. Poverty can also be measured by the level of consumption.

Summary statistics for individual and household-level borrowing and consumption outcomes show that while average male borrowing for participating households declined from 3,472 taka (Tk) to Tk 2,483 in real terms, or by 28.5 percent over the seven-year period, average female borrowing for participating households increased by 94 percent in real terms (table 2). This suggests that microfinance programs provided loans mainly through female members of poor households, with female borrowing on average accounting for 82 percent of microfinance borrowing in 1998/99, up from 63 percent in 1991/92.
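Two of the percentages reported above can be checked directly from the counts and amounts given in the text. The short sketch below is only an arithmetic check; the helper name `pct_change` is illustrative.

```python
def pct_change(old, new):
    """Percentage change from old to new."""
    return 100.0 * (new - old) / old


# Sample attrition: 1,769 households in 1991/92, 1,638 reinterviewed in 1998/99
attrition = -pct_change(1769, 1638)

# Average male borrowing of participating households, in real taka
male_borrowing = pct_change(3472, 2483)

print(f"attrition: {attrition:.1f} percent")
print(f"change in male borrowing: {male_borrowing:.1f} percent")
```

Both results round to the figures quoted in the text (an attrition rate of 7.4 percent and a decline of 28.5 percent).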
The average loan size was higher in real terms for mistargeted borrowers (Tk 10,075) than for targeted borrowers (Tk 8,634) in 1991/92 (not shown in table 2), but slightly lower for mistargeted borrowers (Tk 12,606) than for targeted borrowers (Tk 12,728) in 1998/99. Mistargeted households constituted some 23 percent of borrowers and received some 25 percent of total credit supplied by microfinance programs. Total household annual per capita expenditure grew by 30.5 percent over the seven-year period for all households compared with 34.6 percent for program participants. This is equivalent to a real increase of some 5 percent annually for program participants. How much of the change in consumption was due to borrowing from microfinance programs, and what were the impacts on poverty reduction? This is discussed in section VII.

TABLE 2. Summary Statistics of Consumption and Credit Variables (1991/92 taka)
Columns: Variable; Participants; Target nonparticipants; Nontarget nonparticipants; All households
Rows (reported for 1991/92 and again for 1998/99): Household-level male borrowing; Household-level female borrowing; Village-level average male borrowing; Village-level average female borrowing; Household annual per capita total expenditure; Household annual per capita food expenditure; Household annual per capita nonfood expenditure; Number of observations
Note: Numbers in parentheses are standard deviations.
Source: Author's computations based on 1991/92 and 1998/99 household surveys in Bangladesh.

15. The dropout rate is defined as the proportion of past members that are no longer members of any microfinance program.

IV. WHAT DETERMINES DEMAND FOR LOANS FROM A GROUP-BASED MICROFINANCE PROGRAM?

The demand for credit is determined, as specified in equation 1, by a host of factors at the household, village, and group level, including physical endowments (such as land) and human capital (such as education), given the availability of the program in a village and the nature of group decisionmaking involved in individual lending. One testable hypothesis is whether the demand for microfinance differs by gender. Because credit markets are imperfect and labor markets are different for men and women, demand for credit is expected to differ by gender. An F-test rejects the hypothesis of equal credit demand by gender, so different demand equations are fitted for male and female borrowers.

A village-level fixed-effect method is applied to equation 1 based on the cross-section data for 1991/92 and 1998/99 and assuming no unmeasured household-level determinants of credit. If unobserved group characteristics are part of village-level unobserved factors, such a method would help eliminate the influence of the unmeasured village-level demand for credit by men and women who form groups for the group-based microfinance programs. However, if unobserved household-level factors also contribute to group formation and group characteristics, then the village-level fixed-effect estimates of the credit demand function will be biased. In that case, a household fixed-effect method using household-level panel data may be more efficient.

The estimated results of demand equations for men and women show that the errors in the credit demand equations for men and women are not correlated for cross-section data for either 1991/92 or 1998/99, whereas they are correlated for panel data (table 3). Therefore, the panel data analysis of the demand functions uses the household fixed-effect method with the correction for the nonzero covariance of the errors of men's and women's credit demand equations.
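The test of whether the errors of the two demand equations are correlated can be sketched as follows. This is a hedged illustration, assuming the reported statistic takes the standard Breusch-Pagan Lagrange multiplier form for two equations, n times the squared correlation of the residuals, referred to a chi-square distribution with one degree of freedom. The function names are hypothetical; the statistic 3.082 and the sample size 1,638 are the values reported in table 3.

```python
import math


def chi2_1_pvalue(stat):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def breusch_pagan_lm(resid_a, resid_b):
    """LM statistic n * r^2 for independence of two equations' residuals."""
    n = len(resid_a)
    mean_a = sum(resid_a) / n
    mean_b = sum(resid_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(resid_a, resid_b))
    var_a = sum((a - mean_a) ** 2 for a in resid_a)
    var_b = sum((b - mean_b) ** 2 for b in resid_b)
    r_squared = cov ** 2 / (var_a * var_b)
    return n * r_squared


# p-value implied by the statistic reported in table 3
p_value = chi2_1_pvalue(3.082)

# Residual correlation implied by that statistic under the n * r^2 form
implied_corr = math.sqrt(3.082 / 1638)

print(f"p-value: {p_value:.4f}, implied residual correlation: {implied_corr:.4f}")
```

Under these assumptions the p-value agrees with the reported 0.0792, so independence is rejected only at the 10 percent level, and the implied residual correlation is small (about 0.04).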
TABLE 3. Determinants of Microfinance Demand by Men and Women
(dependent variables: log of men's loans and log of women's loans)

                                                 Village fixed effects                     Household fixed effects
                                                 1991/92             1998/99               Panel data
Explanatory variable                             Men's     Women's   Men's     Women's     Men's     Women's
Maximum education of household male (years)      0.069     -0.046    0.021     0.020       0.016     -0.009
Maximum education of household female (years)    -0.056    -0.076    0.007     -0.108*     0.002     -0.049*
Log of household land (decimals)                 -0.202**  -0.308**  -0.225**  -0.533**    -0.058    -0.338*
F-statistics                                     11.196    14.768    12.876    25.457      2.916     4.511
Number of observations                           1,638     1,638     1,638     1,638       1,638     1,638
F-statistics (H0: parameters of men's and
  women's borrowing equations are jointly
  equal to 0)                                    7.75                11.02                 4.10
Prob > F                                         0.0000
Breusch-Pagan test of independence of male
  and female borrowing: chi-square (1)           3.082
Prob > chi-square                                0.0792

*t-statistic is significant at the 10 percent level or better.
**t-statistic is significant at the 5 percent level or better.
Note: Regression also includes the following variables: sex; age and education of household head; whether parents, brothers, and sisters of household's head or head's spouse own land; year; and village level infrastructure and price variables to reflect the impact of time-varying changes in local economic conditions.
Source: Author's computations based on 1991/92 and 1998/99 household surveys in Bangladesh.

The results confirm that households that are resource poor, especially in land, demand more loans from microfinance programs than households that are resource rich. This means that landless households are likely to receive more loans from microfinance programs than landed households are. The results for the panel data analysis show that a 10 percent increase in landholding from an average of 137 decimals of land reduces the total amount of borrowing by 3.4 percent for women's borrowing and has no effect on men's borrowing. Findings are similar with cross-sectional demand analysis.
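Because both loans and landholding enter in logs, the land coefficient can be read as an elasticity. The sketch below reproduces the 3.4 percent figure from the panel coefficient for women's loans (-0.338 in table 3) using the usual linear approximation, and also shows the exact log-log change for comparison. Variable names are illustrative.

```python
# Panel-data coefficient on log of household land in the women's loan equation
elasticity = -0.338
land_increase = 0.10  # a 10 percent increase in landholding

# Linear approximation: percent change in borrowing = elasticity * percent change in land
approx_change = elasticity * land_increase * 100

# Exact change implied by a log-log relationship
exact_change = ((1 + land_increase) ** elasticity - 1) * 100

print(f"approximate change: {approx_change:.2f} percent")
print(f"exact change: {exact_change:.2f} percent")
```

The approximation gives the roughly 3.4 percent reduction quoted in the text; the exact log-log change is slightly smaller in absolute value, which is typical for changes of this size.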
Moreover, even if landholding determines group formation and consequently an individual's demand for credit, the education of household members also affects demand for credit. More specifically, female education has a negative effect on the amount of borrowing from microfinance programs. One additional year of female education reduces the amount of female borrowing by less than 1 percent in the panel data analysis and by more than 1 percent in 1998/99 in the cross-sectional demand analysis. Although having the same sign, the coefficients are smaller in the panel demand analysis than in the cross-sectional analysis. The results thus suggest that even in a group-based system in which demand for microfinance is largely derived from landholding eligibility conditions, human capital (education) matters in deciding how much a particular household borrows from a group-based microfinance program.

Poverty reduction is an overarching objective of targeted microfinance programs. Because poor people with little education and land are more likely to participate in microfinance programs than are other groups, impact assessments of microfinance and poverty reduction should assess whether program participation increases household consumption over its level before program participation. Because program participation varies by such observed and unobserved attributes as landholding and education, the level of consumption also varies by these same attributes, so that demand for credit and consumption are jointly determined.

The panel household survey data provide a way of controlling for the joint determination of consumption and credit demand and provide a framework for measuring the impact of credit on consumption using the household fixed-effect method. However, as mentioned, a test is needed to determine whether a simple fixed-effect method or a fixed-effect method with instrumental variables is more appropriate.
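One common way to carry out such a test is an augmented regression: regress the suspect regressor on the instruments, add the fitted values to the outcome equation, and test whether their coefficient is zero. The sketch below illustrates that logic on simulated cross-section data. It is a simplification, not the article's household fixed-effect panel setup, and every name and parameter value in it is hypothetical.

```python
import numpy as np


def ols(y, X):
    """Return OLS coefficients and conventional standard errors."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta, np.sqrt(np.diag(cov))


def wu_hausman_t(y, x, z):
    """t-statistic on the fitted regressor in the augmented regression."""
    const = np.ones(len(y))
    Z = np.column_stack([const, z])
    gamma, _ = ols(x, Z)            # first stage: regressor on instrument
    x_hat = Z @ gamma               # fitted values of the regressor
    X_aug = np.column_stack([const, x, x_hat])
    beta, se = ols(y, X_aug)        # second stage with original and fitted regressor
    return beta[2] / se[2]


rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=n)              # instrument
u = rng.normal(size=n)              # unobserved shock

# Case 1: the regressor is correlated with the unobserved shock (endogenous)
x_endog = z + u + rng.normal(size=n)
y_endog = 0.5 * x_endog + 2.0 * u + rng.normal(size=n)
t_endog = wu_hausman_t(y_endog, x_endog, z)

# Case 2: the regressor is unrelated to the outcome error (exogenous)
x_exog = z + rng.normal(size=n)
y_exog = 0.5 * x_exog + rng.normal(size=n)
t_exog = wu_hausman_t(y_exog, x_exog, z)

print(f"t-statistic, endogenous case: {t_endog:.2f}")
print(f"t-statistic, exogenous case: {t_exog:.2f}")
```

A large t-statistic on the fitted values, as in the endogenous case, signals that the simple estimator is inconsistent and the instrumental variable version is needed; a small one is consistent with keeping the simpler, more efficient estimator.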
For the instrumental variable method within the household fixed-effect structure, the first-stage equation for the stock of credit (suppressing subscripts for male, female, household, and village) is, similar to equation 3, specified with an additional set of regressors Z, where Z is a set of household and village characteristics distinct from household characteristics (X) so that they affect S but not household per capita consumption conditional on S.

Selecting appropriate Z variables is a crucial part of this exercise. A household-level choice variable is defined that determines whether a household has a choice of participating in a program. A household's choice depends on two factors: whether a microfinance program operates in the village where the household lives and whether the household qualifies to participate in the program based on the landholding criteria. The choice variable is considered for both 1991/92 and 1998/99 to take care of the differential impacts of the two periods and is then interacted with household-level exogenous variables and village fixed-effects to get the instruments.16

Before consumption equation 4 is estimated, tests are carried out on the equality of credit sources (the null hypothesis is that men's loans from different sources are equal and women's loans from different sources are equal) and on the equality of the gender of borrowers (the null hypothesis is that men's and women's loans are equal). The results (not shown here) indicate that at the 5 percent significance level, the hypotheses of the equality of sources cannot be rejected in five of six cases (it can be rejected only for per capita food expenditure in 1991/92, where F(16, 1,572) = 2.44, Prob > F = 0.024). Thus credit from all sources can be lumped together.
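The construction of the choice variable and its interactions, described above, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the 50-decimal eligibility threshold is taken from the text, while the function names, the example households, and the two exogenous variables (age and education of the household head) are hypothetical. Interactions with village fixed effects and the separate treatment of the two survey years are omitted for brevity.

```python
import numpy as np


def choice_indicator(land_decimals, village_has_program, threshold=50.0):
    """1.0 if the household can choose to join: program in village and land at or below threshold."""
    land = np.asarray(land_decimals, dtype=float)
    has_program = np.asarray(village_has_program, dtype=bool)
    return (has_program & (land <= threshold)).astype(float)


def build_instruments(choice, exog):
    """Interact the choice indicator with each exogenous household variable."""
    exog = np.asarray(exog, dtype=float)
    return choice[:, None] * exog


# Hypothetical households: landholding in decimals and program availability
land = [0, 20, 50, 120, 30]
program = [True, True, True, True, False]

# Hypothetical exogenous variables: age and years of education of the household head
exog = np.array([[35, 2], [42, 0], [28, 5], [50, 8], [31, 3]])

choice = choice_indicator(land, program)
instruments = build_instruments(choice, exog)

print(choice)
print(instruments)
```

Households that either exceed the land threshold or live in a village without a program get a zero choice indicator, so all of their interaction terms are zero as well.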
The second set of tests indicates that the equality of the gender of borrowers can be rejected for per capita total expenditure [F(2, 1,582) = 6.07, Prob > F = 0.002] and food expenditures [F(2, 1,582) = 9.85, Prob > F = 0.001], but not for nonfood expenditures [F(2, 1,582) = 2.17, Prob > F = 0.115]. This suggests that borrowing by men and women cannot be combined. So the effects of credit pooled from various sources are estimated separately for men and women.

A specification test (Wu-Hausman test) is performed to determine which is more appropriate for estimating the consumption effects of borrowing from microfinance programs: the household-level fixed-effect method or the household-level fixed-effect with instrumental variable method, which depends on an alternative specification that suggests that the time-varying errors that affect credit demand have separate effects on that demand.17 The test result (not shown here) for per capita consumption suggests that the credit volume as used in the fixed-effect method is not endogenously determined by factors such as the time-varying heterogeneity or the measurement errors associated with credit variables.18

16. Unlike the case in 1991/92, there was no village without a program in 1998/99. Hence, there was no control village and no village-specific choice. Yet households have a choice based on the landholding eligibility condition.
17. Here the null hypothesis is that both fixed-effect and fixed-effect with instrumental variable estimates are consistent, and the alternate hypothesis is that only the fixed-effect with instrumental variable estimate is consistent. If the null hypothesis is true the fixed-effect model should be used because it is more efficient. Otherwise, the fixed-effect with instrumental variable model would be used. For the Wu-Hausman test credit variables are regressed on only instruments using the fixed-effect model, predicted credit variables are saved as fitted, and the second-stage equations are estimated using the fixed-effect model including both original and fitted credit variables as regressors. Then the test is run to determine if the coefficient on the fitted variables is zero (null hypothesis).

The household-level fixed-effect results suggest that male borrowing has no significant effect, while female borrowing has a significant positive effect on per capita consumption outcomes (table 4).19 Based on the fixed-effect estimation, a 10 percent increase in the current stock of female borrowing increases household total expenditure by 0.09 percent and the same increase in the past stock of female borrowing increases per capita consumption by 0.10 percent.20 Similar positive and significant impacts of stocks of female borrowing are also evident for household food and nonfood expenditure.

These fixed-effect estimates are used to calculate the marginal returns to borrowing for men and women (table 5). The marginal return estimates for women are used as the basis for calculating the impact of borrowing on poverty reduction presented later. At the mean an additional Tk 100 of cumulative borrowing by women during 1991/92 adds almost Tk 15 to total annual household expenditure: Tk 7 to food expenditure and Tk 8 to nonfood expenditure.

TABLE 4. Household Fixed-Effects Estimates of the Impact of Microfinance Loan

                                Log of household    Log of household    Log of household
                                per capita yearly   per capita yearly   per capita yearly
Credit variables                expenditure         food expenditure    nonfood expenditure
Log of men's current loans      -0.002              -0.005              0.008
Log of women's current loans    0.009**             0.006**             0.018**
Log of men's past loans         -0.004              -0.003              -0.005
Log of women's past loans       0.010**             0.008**             0.014**
Number of observations          1,638               1,638               1,638
F-statistics (56, 1582)         9.92                9.45                9.27

**t-statistic is significant at the 5 percent level or better.
Note: Regression also includes the following variables: sex; age and education of household head; whether parents, brothers, and sisters of household's head or head's spouse own land; year; and village-level infrastructure and price variables to reflect the impact of time-varying changes in local economic conditions.
Source: Author's computations based on 1991/92 and 1998/99 household surveys in Bangladesh.

18. The response elasticity is somewhat higher with the fixed-effect with instrumental variable model than with fixed-effect model estimates, although the significance levels are similar for both estimates, especially for female borrowing.

19. The estimated model is a logarithmic function; hence, the coefficients of credit variables measure the response elasticity. Regressions are estimated using log-log models because the data may be highly skewed and heteroskedastic. Using a log-log model reduces the importance of high-value outliers and makes errors more homoskedastic.

20. During 1991/92 current borrowing is the stock at that time and past borrowing is zero (because 1991/92 is the first data point in the panel). During 1998/99 current borrowing is the stock in 1998/99 and past borrowing is the stock from 1991/92.

TABLE 5. Marginal Returns to Microfinance Loan Based on Household Fixed-Effects Estimates (taka per 100 taka in borrowing)

                          Household yearly     Household yearly    Household yearly
Gender and period         total expenditure    food expenditure    nonfood expenditure
Women's borrowing
  Returns in 1991/92      14.7**
  Returns in 1998/99      20.5**
Men's borrowing
  Returns in 1991/92      -6.5
  Returns in 1998/99      -16.6

**t-statistic is significant at the 5 percent level or better.
Note: Because the estimation equations are in log-log (elasticity) form, marginal returns are calculated using the formula dY/dX = β(Y/X), where Y and X are sample means of Y (household expenditure) and X (women's credit, for example). Household expenditure figures are obtained by multiplying household per capita expenditure by household size (5.8 for the sample). The return in 1991/92 includes that from current credit only (because in 1991/92 past credit is zero), and the return in 1998/99 includes that from both current (4.2 percent in 1998/99) and past (16.3 percent in 1991/92) credit.
Source: Author's computations based on 1991/92 and 1998/99 household surveys in Bangladesh.

An additional Tk 100 in women's stock of credit during 1998/99 increases the household's total annual expenditure by almost Tk 21: food expenditure by Tk 11.3 and nonfood expenditure by Tk 9.2. By assumption there was no borrowing before 1991/92 because the data start in 1991/92. For 1991/92, then, the contribution to marginal returns to current consumption comes only from current borrowing, which is 14.7 percent. For 1998/99, however, the contribution is the sum of impacts from both current borrowing (4.2 percent) and past borrowing (16.3 percent).21 Thus the panel data analysis shows a lower return for 1991/92 and a higher return for 1998/99 than the earlier Pitt and Khandker (1998) results for 1991/92 based on cross-sectional analysis, which show an 18 percent impact of women's borrowing on total consumption. The marginal returns to men's borrowing were also calculated. In most cases they are small and insignificant; effectively, the returns to male borrowing are zero. This may be because these programs have always emphasized women, and as a result men have lagged behind in both membership and loan volume, with the discrepancy increasing over time.

One concern is that program participants include noneligible households that own more than 50 decimals of land, which may bias the estimated coefficients of credit (Morduch 1998). To see how robust the estimated effects are, the impacts were reestimated by excluding mistargeted households (following Pitt 1999). The results (not shown here) suggest that the estimated coefficients are quite robust: they are not sensitive to whether mistargeted households are included as program participants.

21. This shows that the return to cumulative borrowing in a particular year diminished over the years. For example, returns to current borrowing on current consumption dropped from 14.7 percent in 1991/92 to 4.2 percent in 1998/99. Diminishing returns are also obvious from the fact that in 1998/99 the return to current consumption is 16.3 percent from past borrowing (1991/92) and 4.2 percent from current borrowing (1998/99).

VI. SPILLOVER EFFECTS OF MICROFINANCE

The results have shown that microfinance has a large impact on the welfare of borrowing households by raising consumption among program participants. Are there spillover effects that are felt beyond the program participants? Loans outstanding for microfinance organizations in Bangladesh totaled about $600 million in 1998/99. This large inflow of microfinance to rural areas is expected to have an aggregate impact on the local economy.

Panel data are needed to estimate any spillover effects. When there are spillover effects, unobserved village heterogeneity can be correlated with program placement, with causation going from program placement to unobserved village effects, not from village effects to program placement. This measurement problem implies that the placement of a microfinance program may cause a village effect additional to any preexisting (time-invariant) village effects. Omitting unmeasured village effects, as before, a consumption equation that captures village spillover can be written as equation 6, where the village-level terms represent the external village effects of a program (with a value of zero if no program is located in the village), and η_ij is the unobserved household-level fixed effect.
The program-effect parameters, δ and γ, capture all program effects only if the village-level external-effect terms are zero (none of the village-specific heterogeneity is caused by the program). If village externalities exist (these terms are nonzero), the spillover effect is not separately identified from the time-invariant village effect. If the village-level terms are measured by the average value of all microfinance borrowing in a village, then the spillover effect is measured by the change in behavior of nonparticipants due to a change in village-level average microfinance borrowing, captured by π and ρ.

Equation 6 is estimated by the fixed-effect method that eliminates program placement bias. Both the consumption and the credit variables are in logarithmic form, and thus the credit coefficients measure response elasticities. The benefits for nonparticipants depend on the amount of credit obtained by all program borrowers living in a village, as measured by the average value of credit obtained by all households living in a village.

The village averages of women's current and past borrowing have significant positive impacts on per capita expenditure of an average household of a village. A 10 percent increase in the village average of women's current borrowing from microfinance programs increases household per capita total expenditure by 0.68 percent, food expenditure by 0.50 percent, and nonfood expenditure by 0.97 percent (table 6). Similarly, a 10 percent increase in the village average of women's past borrowing from microfinance programs increases household per capita total expenditure by 0.69 percent, food expenditure by 0.45 percent, and nonfood expenditure by 1.19 percent. A 10 percent increase in women's individual current borrowing increases the borrowing household's per capita expenditure by 0.06 percent, and the same increase in women's past borrowing increases per capita expenditure by 0.05 percent. Men's borrowing has no such effects.
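As a rough illustration of how these log-log coefficients translate into percent changes, the sketch below applies the elasticities quoted in the text to a 10 percent increase in women's current borrowing and then combines own and spillover effects for an average household. The function and variable names are hypothetical; the 20-household village and the 52/48 split between participants and nonparticipants follow the figures in the article's note on aggregation. This is a worked arithmetic sketch under those assumptions, not the author's estimation code.

```python
def pct_effect(elasticity, pct_increase):
    """Percent change in expenditure implied by a log-log elasticity."""
    return elasticity * pct_increase

SHOCK = 10.0  # a 10 percent increase in women's current borrowing

# Elasticities quoted in the text for per capita total expenditure.
village_effect = pct_effect(0.068, SHOCK)  # village average of women's current loans
own_effect = pct_effect(0.006, SHOCK)      # women's individual current loans

# Assumption taken from the article's note: a 20-household village.
HOUSEHOLDS_PER_VILLAGE = 20
spillover = village_effect / HOUSEHOLDS_PER_VILLAGE

# Nonparticipants are roughly 48 percent of the village population.
PARTICIPANT_SHARE = 0.52
participant_gain = own_effect + spillover  # own effect plus spillover
nonparticipant_gain = spillover            # spillover only
average_gain = (PARTICIPANT_SHARE * participant_gain
                + (1 - PARTICIPANT_SHARE) * nonparticipant_gain)

print(round(spillover, 3), round(participant_gain, 3), round(average_gain, 3))
```

Under these inputs the weighted average comes to about 0.065 percent, matching the figure reported for an average household. Changing the village size or the participant share shows how sensitive the aggregate effect is to those two assumptions.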
The positive spillover effects suggest that microfinance programs have influenced the welfare not only of poor participants but also of nonparticipants. The total effect of a program is then a sum of the effects for participants and nonparticipants.22

TABLE 6. Household Fixed-Effects Estimates of Village Spillover Effects of Microfinance Loans

                                                  Log of household    Log of household    Log of household
                                                  per capita yearly   per capita yearly   per capita yearly
Credit variables                                  expenditure         food expenditure    nonfood expenditure
Log of men's current loans
Log of women's current loans
Log of men's past loans
Log of women's past loans
Log of village average of men's current loans
Log of village average of women's current loans
Log of village average of men's past loans
Log of village average of women's past loans
Number of observations
F-statistics (60, 1578)

*t-statistic is significant at the 10 percent level or better.
**t-statistic is significant at the 5 percent level or better.
Note: Regression also includes the following variables: sex; age and education of household head; whether parents, brothers, and sisters of household's head or head's spouse own land; year; and village-level infrastructure and price variables to reflect the impact of time-varying changes in local economic conditions.
Source: Author's computations based on 1991/92 and 1998/99 household surveys in Bangladesh.

22. This is, however, not a simple algebraic aggregation. According to this estimate, participants benefit from both own effects and spillover, whereas nonparticipants benefit only from spillover. A 10 percent increase in women's current loans increases participants' per capita expenditure directly by 0.06 percent and through spillover by 0.034 percent (the coefficient of the village average of women's borrowing is 0.068, which, for a 20-household village, translates into a 0.034 percent increase at the household level). Nonparticipants' per capita expenditure increases by 0.034 percent because of the spillover. As nonparticipants constitute roughly 48 percent of the village population during the 1998/99 survey period, a 10 percent increase in women's current borrowing increases per capita expenditure of an average household by 0.065 percent (= 0.52 × 0.094 + 0.48 × 0.034).

VII. POVERTY EFFECTS OF MICROFINANCE

Data on consumption and the consumption poverty line show that moderate poverty in the sample villages declined overall by 17 percentage points between 1991/92 and 1998/99 and extreme poverty by 13 percentage points (table 7). The follow-up survey provides a means of gauging the extent of mistargeting of microfinance programs based on consumption poverty, when mistargeting is defined as program participation by households that are not poor based on their consumption. For households in nonprogram areas in 1991/92 that joined microfinance programs after the 1991/92 survey, the average incidence of poverty was 90.8 percent before they joined the microfinance programs. Thus, only 9 percent of households were mistargeted based on consumption poverty.23

TABLE 7. Poverty Status (Headcount) by Program Participation Status and Survey Period

                                          Moderate poverty        Extreme poverty
Program participation status(a)           1991/92    1998/99      1991/92    1998/99
Program villages
  Program participants (targeted)
  Program participants (mistargeted)
  All program participants
  Target nonparticipants
  Nontarget, nonparticipants
  Total
Nonprogram villages
  Program participants (targeted)(b)
  Program participants (mistargeted)(b)
  All program participants(b)
  Target nonparticipants
  Nontarget, nonparticipants
  Total
All villages
  Program participants (targeted)
  Program participants (mistargeted)
  All program participants
  Target nonparticipants
  Nontarget, nonparticipants
  Total

*Change in poverty from 1991/92 to 1998/99 is significant at the 10 percent level or better.
**Change in poverty from 1991/92 to 1998/99 is significant at the 5 percent level or better.
a. Program participation status is based on program placement in 1991/92. By 1998/99 all sample villages that had not had programs in 1991/92 had programs.
b. There were no program participants in nonprogram villages in 1991/92. Figures in parentheses show headcount in 1991/92 for households in those villages that became program participants in 1998/99 (whose headcount in 1998/99 is shown in the next cell to the right).
Source: Author's computations based on 1991/92 and 1998/99 household surveys in Bangladesh.

Poverty reduction is substantial among both program participants and nonparticipants and in both program and nonprogram areas. The net reduction in moderate poverty is about 18 percentage points in program areas, 13 percentage points in nonprogram areas, and 17 percentage points overall between 1991/92 and 1998/99. Did microfinance play a role? The substantial reduction in poverty (19 percentage points) after 1991/92 among participants in previously nonprogram areas suggests that microfinance programs were successful in reducing village-level poverty, even with the possibility of spillover effects. Extreme poverty also dropped over this period, but the reduction in nonprogram areas is not statistically significant in most cases. Poverty rates declined by more than 20 percentage points over the seven years among the households that were program participants in 1991/92, about 3 percentage points a year. How much of this reduction is due to microfinance?

To quantify that contribution, consumption estimates for program participants and the village as a whole are used. The marginal returns to household consumption on women's loans (see table 5) are used to calculate the change in per capita consumption due to microfinance borrowing.24 This change in per capita consumption is subtracted from the current consumption of participants to get a preborrowing level of consumption,25 which is then used to derive a preborrowing level of poverty.
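The simulation described here can be sketched in a few lines. The example assumes the log-log marginal-return formula dY/dX = β(Y/X) given in the note to table 5 and a simple headcount measure (share of households below the poverty line). All function names, the poverty line, the sample means, and the household figures are hypothetical and are not the survey values; it is a minimal sketch of the logic, not the author's procedure.

```python
def marginal_return(beta, mean_y, mean_x):
    """Marginal return dY/dX = beta * (Y / X) for a log-log model at sample means."""
    return beta * mean_y / mean_x

def headcount(consumption_pc, poverty_line):
    """Share of households whose per capita consumption is below the poverty line."""
    return sum(c < poverty_line for c in consumption_pc) / len(consumption_pc)

def preborrowing_consumption(consumption_pc, borrowing, household_size, return_per_taka):
    """Subtract the per capita consumption gain attributed to borrowing."""
    return [c - return_per_taka * b / n
            for c, b, n in zip(consumption_pc, borrowing, household_size)]

# Hypothetical sample means (taka), chosen only to illustrate the formula.
example_return = marginal_return(beta=0.009, mean_y=24500.0, mean_x=1500.0)

# Hypothetical households (taka per year); not taken from the surveys.
consumption_pc = [4200.0, 5100.0, 3900.0, 6000.0]  # observed per capita consumption
borrowing = [10000.0, 8000.0, 0.0, 12000.0]        # women's cumulative borrowing
household_size = [5, 6, 4, 6]
poverty_line = 5000.0
return_per_taka = 0.15  # assumed: Tk 15 of household expenditure per Tk 100 borrowed

before = preborrowing_consumption(consumption_pc, borrowing,
                                  household_size, return_per_taka)
rate_before = headcount(before, poverty_line)         # simulated preborrowing poverty
rate_after = headcount(consumption_pc, poverty_line)  # observed poverty
print(example_return, rate_before, rate_after, rate_before - rate_after)
```

Dividing the simulated reduction by average program duration would give an annual figure comparable to the per-year reductions discussed in the text.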
Poverty reduction for nonparticipants can be calculated similarly from table 6, first by calculating the village-level marginal impact and then by dividing that by the number of households in the village to get household-level marginal impacts. Applying marginal impacts to the per capita expenditure of nonparticipants yields their poverty rate before microfinance intervention. Together, these two estimates give the aggregate poverty estimates at the village level (table 8).

The reported poverty reduction for participants amounts to a 1.6 percentage point annual reduction in moderate poverty and a 2.2 percentage point annual reduction in extreme poverty (using average program duration for female participants of 5.2 years).26 Thus more than half of the 3 percentage point annual decline in moderate poverty among program participants (see table 7) can be attributed to microfinance programs alone. Microfinance reduces both moderate and extreme poverty among nonparticipants as well (see table 8). At an aggregate level microfinance reduces moderate poverty by about 1.0 percentage point and extreme poverty by 1.3 percentage points a year.27 Thus microfinance can account for some 40 percent of the overall reduction in moderate poverty in rural Bangladesh (1 percentage point out of the 2.5 percentage point reduction each year).28 The impact of microfinance is slightly higher for extreme poverty than for moderate poverty, at both the individual and the village level. The microfinance impacts are much stronger for female borrowing than for male borrowing.

TABLE 8. Predicted Impacts on Poverty of Women's Microfinance Loans

Participation status                  Poverty             Poverty
and poverty level                     before borrowing    after borrowing
Participants
  Moderate poverty headcount          0.793**             0.711**
  Extreme poverty headcount           0.467**             0.352**
Nonparticipants
  Moderate poverty headcount
  Extreme poverty headcount
All households
  Moderate poverty headcount
  Extreme poverty headcount

**Significant at the 5 percent level or better.
Source: Author's computations based on 1991/92 and 1998/99 household surveys in Bangladesh.

23. Contrast this with the land-based mistargeting of 31 percent in 1998/99 (table 1).

24. Ideally, average return (not marginal return) should be used to calculate the consumption change. Since the marginal return is usually smaller than the average return, the return to consumption based on marginal impacts underestimates the impact of microfinance on poverty.

25. The change in per capita consumption among the participants is about 14 percent of their preborrowing (simulated) per capita expenditure, which, after dividing by the program duration of 5.2 years, amounts to 2.7 percent per year or 54 percent of the actual change observed in the descriptive statistics (table 2).

26. Although membership duration is as high as 15 years for some women, there are significant numbers of new entrants to the programs with short duration, bringing the average program duration to 5.2 years.

27. Annual poverty reduction of nonparticipants is obtained by dividing total poverty reduction (1.6 percent from table 8) by average village-level program duration (9.3 years). Aggregate poverty reduction per year is the average of the per year reductions of participants and nonparticipants (weighted for the distribution of program participants).

28. If the spillover estimate from table 7 is used to simulate poverty reduction among participants and nonparticipants, a 30 percent overall poverty reduction at the village level can be attributed to microfinance.

VIII. CONCLUSIONS

Program impact evaluation compares outcomes for treatment groups with those for control groups. However, finding control groups in a nonexperimental setting is difficult. Alternatively, program effects can be identified by resorting to instruments with the availability of cross-section data. However, finding good instruments is also difficult. Pitt and Khandker (1998) used a quasi-experimental method relying on exogenous eligibility conditions as a way of identifying program effects. When conditions are not adequately restrictive they may not be reliable, as with the weak enforcement of the landholding criterion for program participation. Results may also be sensitive to the methods used in impact assessment.

This study carried out an impact assessment using the 1998/99 follow-up survey to the 1991/92 survey to assess the sensitivity of the earlier findings on the poverty effects of microfinance in rural Bangladesh. The panel data analysis helps estimate the effects on poverty using an alternative estimation technique and also helps estimate the impacts of past and current borrowing, assuming that gains from borrowing, such as consumption gains, vary over time.

An earlier study using 1991/92 cross-section data found returns of about 18 percent to women's borrowing (Pitt and Khandker 1998). If this rate were sustained given the level of poverty among program participants in 1991/92, this could have led to an estimated 5 percentage point reduction in poverty for participants and a 1 percentage point reduction for the village as a whole over the program period. Are these projected gains robust and sustainable over time? This article sought to answer these questions using panel data analysis and a dynamic model to estimate the time-varying borrowing effects on consumption for participants and nonparticipants as well as for average villagers, through spillover effects.

The results are resounding. Microfinance continues to reduce poverty among poor borrowers and within the local economy, albeit at a lower rate. It raises per capita household consumption for both participants and nonparticipants.
The average returns to cumulative borrowing for female members of microfinance programs are as much as 21 percent in 1998/99, up from 18 percent in 1991/92. Despite higher returns to cumulative borrowing, the impact on poverty reduction among program participants was lower in 1998/99 (2 percentage points) than in 1991/92 (5 percentage points). This is due to diminishing returns to additional borrowing, so that despite the increase in the stock of borrowing by female members, the resulting increases in consumption were not large enough to reduce poverty as expected. Moreover, because of better economic conditions, the gains in real consumption were much higher for both program participants and nonparticipants over the study period. The consumption level of borrowers, for example, was only 8 percent below the poverty line in 1998/99 compared with 31 percent in 1991/92. And despite the diminishing returns to microfinance and the overall better economic conditions, the results indicate that microfinance accounts for more than half of the observed 3 percentage point annual reduction in poverty among program participants.

The panel data analysis also estimated the aggregate impacts of microfinance on consumption and poverty. Not only does the increase in consumption resulting from borrowing raise the probability that program participants will escape poverty, but the microfinance intervention also benefits nonparticipants through growth in local income. In particular, microfinance reduces the average village poverty level by 1 percentage point each year in program areas, some 40 percent of the observed village-level poverty reduction. Microfinance has a slightly higher impact on extreme poverty than on moderate poverty for everybody.

REFERENCES

Alderman, Harold, Jere R. Behrman, Hans-Peter Kohler, John A. Maluccio, and Susan C. Watkins. 2000. "Attrition in Longitudinal Household Survey Data: Some Tests for Three Developing-Country Samples." Policy Research Working Paper 2447. World Bank, Washington, D.C.

Fitzgerald, John, Peter Gottschalk, and Robert Moffitt. 1998. "An Analysis of Sample Attrition in Panel Data." Journal of Human Resources 33(2):251-99.

Hashemi, Syed M., Sidney R. Schuler, and Ann P. Riley. 1996. "Rural Credit Programs and Women's Empowerment in Bangladesh." World Development 24(4):635-53.

Hossain, Mahabub. 1988. "Credit for Alleviation of Rural Poverty: The Grameen Bank in Bangladesh." IFPRI Research Report 65. International Food Policy Research Institute, Washington, D.C.

Jalan, Jyotsna, and Martin Ravallion. 2002. "Geographic Poverty Traps? A Micro Model of Consumption Growth in Rural China." World Bank, Washington, D.C.

Khalily, M. B., M. O. Imam, and S. A. Khan. 2000. "Efficiency and Sustainability of Formal and Quasi-formal Microfinance Programs: An Analysis of Grameen Bank and ASA." In Rushidan I. Rahman and Shahidur R. Khandker, eds., The Bangladesh Development Studies, A Special Issue on Microfinance and Development: Emerging Issues 26(June/September):103-46.

Khandker, Shahidur R. 1998. Fighting Poverty with Microcredit: Experience in Bangladesh. New York: Oxford University Press.

Khandker, Shahidur R. 2003. "Micro-Finance and Poverty: Evidence Using Panel Data from Bangladesh." Policy Research Working Paper 2945. World Bank, Washington, D.C.

Khandker, Shahidur R., and Mark M. Pitt. 2003. "The Impact of Group-Based Credit on Poor Households: An Analysis of Panel Data from Bangladesh." World Bank, Washington, D.C.

Lalonde, R. 1986. "Evaluating the Econometric Evaluations of Training Programs with Experimental Data." American Economic Review 76(4):604-20.

Moffitt, Robert. 1991. "Program Evaluation with Nonexperimental Data." Evaluation Review 15(June):291-314.

Morduch, Jonathan. 1998. "Does Microfinance Really Help the Poor? New Evidence from Flagship Programs in Bangladesh." New York University, New York City.

Morduch, Jonathan. 1999. "The Role of Subsidies in Microfinance: Evidence from the Grameen Bank." Journal of Development Economics 60(October):229-48.

Pitt, Mark M. 1999. "Reply to Jonathan Morduch's 'Does Microfinance Really Help the Poor? New Evidence from Flagship Programs in Bangladesh.'" Brown University, Department of Economics. Available online at www.pstc.brown.edu/-mp.

Pitt, Mark M., and Shahidur R. Khandker. 1996. "Household and Intrahousehold Impact of the Grameen Bank and Similar Targeted Credit Programs in Bangladesh." Discussion Paper 320. World Bank, Washington, D.C.

Pitt, Mark M., and Shahidur R. Khandker. 1998. "The Impact of Group-Based Credit Programs on Poor Households in Bangladesh: Does the Gender of Participants Matter?" Journal of Political Economy 106(5):958-96.

Ravallion, Martin. 2001. "The Mystery of Vanishing Benefits: An Introduction to Impact Evaluation." World Bank Economic Review 15(1):115-40.

Thomas, Duncan, Elizabeth Frankenberg, and James P. Smith. 1999. "Lost but Not Forgotten: Attrition and Follow-up in the Indonesian Family Life Survey." RAND Labor and Population Program Working Paper 99-01. Santa Monica, Calif.: RAND.

Yaron, Jacob. 1994. "What Makes Rural Finance Institutions Successful?" World Bank Research Observer 9(1):49-70.

World Bank. 2003. Poverty in Bangladesh: Building on Progress. Washington, D.C.

Ziliak, James P., and Thomas J. Kniesner. 1998. "The Importance of Sample Attrition in Life Cycle Labor Supply Estimation." Journal of Human Resources 33(2):507-30.

Participation in WTO Dispute Settlement: Complainants, Interested Parties, and Free Riders

Chad P. Bown

What affects a country's decision of whether to formally engage in a trade dispute directly related to its exporting interests? This article empirically examines determinants of affected country participation decisions in formal trade litigation arising under the World Trade Organization (WTO) between 1995 and 2000.
It investigates determinants of nonparticipation and examines whether the incentives generated by the system's rules and procedures discourage active engagement in dispute settlement by developing country members in particular. Though the size of exports at stake is found to be an important economic determinant affecting the decision to participate in challenges to a WTO-inconsistent policy, the evidence also shows that measures of a country's retaliatory and legal capacity as well as its international political economy relationships matter. These results are consistent with the hypothesis of an implicit "institutional bias" generated by the system's rules and incentives that particularly affects developing economy participation in dispute settlement.

Chad P. Bown is associate professor in the Department of Economics and International Business School at Brandeis University; his email address is cbown@brandeis.edu. The author gratefully acknowledges financial support from a Brandeis University Mazer Award, the World Bank, and the Okun-Model Fellowship at the Brookings Institution, where he was visiting while the article was revised. The author also thanks Rachel McCulloch, Bernard Hoekman, Caglar Ozden, Hiau Looi Kee, Thomas Osang, Håkan Nordström, three anonymous referees, and seminar participants at the World Bank, Brandeis University, and Southern Methodist University's Conference on Institutions in the Global Economy for helpful comments on an earlier draft, as well as Petros Mavroidis and Rhian Wood for assistance in some data acquisition.

THE WORLD BANK ECONOMIC REVIEW, VOL. 19, NO. 2, pp. 287-310, doi:10.1093/wber/lhi009. Advance Access publication August 24, 2005. © The Author 2005. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oupjournals.org.

The basic rules and procedures of dispute settlement under the World Trade Organization (WTO) are the same for all member countries. Nevertheless, there is substantial concern that the trading interests of certain types of members, such as small or developing economies, may be underrepresented in dispute settlement activity. A bias in participation activity may stem from the current system of self-representation requiring that countries have sufficient resources both to monitor and recognize relevant WTO violations and to fund legal proceedings in cases in which their rights have been violated. Furthermore, the self-enforcing nature of the system requires that complainant countries have the retaliatory capacity to threaten to impose economic costs on respondents that fail to comply with WTO panel rulings. Dispute settlement activity may also be skewed against confrontation of trading partners with which a country has a special political relationship, either through reliance on a foreign government for development assistance or through membership in a common preferential trade agreement. If these and other incentives affect litigation behavior, poor or powerless countries may not participate in dispute settlement activities critically important to their trading interests.1

Thus, although all WTO members have equal access to the system in principle, use of the dispute settlement provisions may reflect an institutional bias, that is, poor or powerless members may not participate because of the incentives generated by WTO rules and procedures. This article empirically investigates whether such determinants affect participation in the formal WTO dispute settlement process.

Economic research by Horn and others (2005) has begun to empirically investigate this question by assessing the biases associated with the initiation of disputes under the WTO.
Their analysis compares the actual number and composition of complaints initiated in 1995-98 with a probabilistic model's predictions of the number and composition of complaints. They find that economic measures, such as the value of trade and the diversity of a country's trading partners, explain the pattern of actual dispute initiation fairly well. For example, they find that even though Canada, the European Union, Japan, and the United States initiated more than 60 percent of all complaints over the period, these two factors naturally led them to initiate more formal trade disputes than did other WTO members. Their preliminary conclusion is thus that "power" measures do not seem to matter, and they do not find evidence of institutional bias associated with dispute initiation. Although countries that engage in more trade with a wider array of partners are expected to be more involved in formal dispute settlement activity, their approach also assumes that WTO-inconsistent activity is randomly and uniformly distributed across markets, products, and trading partners. This last assumption in particular may be called into question given the subsequent results of Blonigen and Bown (2003) and Bown (2004b), which suggest that bilaterally powerless countries that do not have the capacity to retaliate are more likely than other countries to be the target of certain types of activity that are inconsistent with General Agreement on Tariffs and Trade (GATT) or WTO rules.2

1. Hoekman and Mavroidis (2000) provide a thorough discussion of these and other informational issues that are likely to increase the likelihood of nonparticipation by developing economies in particular.

2. Blonigen and Bown (2003) empirically investigated U.S. antidumping petitions between 1980 and 1998 and found that petitions are more likely to result in duties against countries that lack retaliatory capacity. This is consistent with the hypothesis that bilaterally powerless countries are more likely to be targeted by GATT/WTO-inconsistent antidumping measures unless powerless exporters were, on average, more likely to face GATT/WTO-consistent than GATT/WTO-inconsistent antidumping measures, which is unlikely. Second, Bown (2004b) examined a sample of GATT trade dispute data for 1973-94 and found that countries tend to implement GATT-inconsistent import protection leading to a trade dispute, as opposed to GATT-consistent safeguards protection, when the trading partner affected by the protection is bilaterally powerless.

Any attempt to estimate the bias associated with the initiation of disputes is subject to data constraints. There is no obvious source for comprehensive information on government policies that are WTO-inconsistent and yet have not been formally confronted through the initiation of a trade dispute.3 To get around this data problem, the focus is on the pattern of participation in disputes that have already been initiated instead of attempting to examine whether there is a bias in the initiation of disputes. Previously unexploited information on the participation and nonparticipation of potential litigants adversely affected by member-implemented, WTO-inconsistent policies is used. The data are derived from initiated disputes and the observation that in many disputes the respondent's WTO-inconsistent policy may have been imposed on a quasi-most favored nation (MFN) basis that negatively affected the exports of multiple member countries, any number of which could have formally participated in the dispute. In addition to the initiating complainant, many negatively affected exporting countries also participate in disputes, either as a co-complainant or as an interested third party, which is permitted by Article 10 of the Dispute Settlement Understanding.
Nevertheless, dozens of affected exporting countries do not formally participate, even though they have a right to do so and an economic interest in the dispute's outcome.

More formally, the investigation covers a set of WTO trade disputes from 1995-2000 that involve respondents' WTO-inconsistent policies being implemented on a quasi-MFN basis. Such policies negatively affect the exports of multiple WTO members, thus establishing a set of potential litigants. An expected cost-benefit framework is then developed to guide an empirical examination of the determinants of potential litigants' decisions to formally participate in the disputes. Presuming that formal participation occurs when the expected benefits are greater than the expected costs, the investigation looks at whether the expected benefits include increased market access in the disputed sector and the increased probability of an economically successful dispute outcome that may be tied to credible retaliatory threats. Also examined is whether the expected costs of formal participation relate to either a country's capacity to afford the substantial legal costs associated with WTO dispute settlement litigation or the political costs associated with a potential deterioration of international relations when confronting important trading partners. Finally, it is acknowledged that an economically successful resolution to the disputes under investigation involves a respondent country removing a WTO-inconsistent policy on an MFN basis, so that any formal litigants' efforts generate positive externalities.

3. The WTO does not provide exhaustive data on the extent to which member countries violate their obligations, although periodic peer country reviews of trade policies are published under the WTO's "Trade Policy Review Mechanism."
Though a potential future source of data, the peer reviews are also nonrandom and sporadic: Canada, the European Union, Japan, and the United States are reviewed every two years, the next 16 largest traders are reviewed every four years, and the remaining members are reviewed only every six years (WTO 2003b).

Such externalities generate an incentive to free ride on the litigation of others, providing a potential explanation for nonparticipation if the investigation finds that exporters do not engage in the dispute settlement process for reasons related to the expected cost-benefit determinants.

To clarify this approach, it is useful to consider a typical dispute under examination in the sample, such as United States-Safeguard on Circular Welded Pipe from Korea (DS202), which concerns a respondent having implemented a relatively nondiscriminatory (called quasi-MFN here) but WTO-inconsistent safeguard policy.4 Because the U.S. safeguard was applied on a quasi-MFN basis adversely affecting the exports from multiple WTO member countries, a completely successful economic resolution to this dispute would involve the United States eliminating the trade barrier, liberalizing imports of pipe from the Korean complainant, and extending that liberalization to exporters of pipe from other source countries on an MFN basis. In this instance, other exporting countries that were adversely affected by the U.S. safeguard did formally participate in the dispute. The European Union and Japan, for example, exercised their rights to intervene as interested third parties in the dispute.5 But other adversely affected exporting countries, such as South Africa, Turkey, and Venezuela, did not formally participate in the dispute. Perhaps they hoped to free ride and enjoy the market access benefits generated by the formal litigants' efforts to liberalize the safeguard-protected market on an MFN basis, as WTO rules require.
But it is possible that other elements of the dispute resolution process generate incentives that also affected the nonparticipation decision (for example, lack of sufficient retaliatory or legal capacity, political relationships). The purpose of this article is to investigate econometrically whether such political economy determinants can be used to explain why some trading partners (for example, the European Union, Japan, and the Republic of Korea) formally participate in such disputes, whereas other adversely affected potential litigants (for example, South Africa, Turkey, and Venezuela) do not.

Why is understanding determinants of dispute settlement nonparticipation important? Although lessons from the United States-Safeguard on Circular Welded Pipe from Korea dispute are anecdotal and dispute outcomes are not under investigation here, the dispute's resolution nevertheless raises some relevant concerns about the implications of the current process. In this particular instance any nonparticipant's hopes of free riding on the complainant's litigation efforts went unmet. Despite almost exhausting the WTO's formal dispute resolution process, the dispute was not resolved by the United States lifting the safeguard.

4. One critical element for the WTO inconsistency of this particular safeguard was the U.S. government's failure to attribute injury to imports (Irwin 2003). Nevertheless, the United States did exempt members of the North American Free Trade Agreement (Canada and Mexico) from the safeguard, so the Republic of Korea included discrimination allegations on its list of WTO violations. As discussed in the data section, such exempted countries were eliminated from the set of negatively affected exporters identified in safeguard disputes.

5. The European Union initiated its own dispute over the U.S. pipe safeguard as part of another dispute contesting a U.S. safeguard on steel wire rod (DS214), but it did not follow through as a complainant.
Instead, the negotiated settlement yielded a discriminatory increase in market access benefits to the Republic of Korea alone.6 A policy concern raised by this experience is whether the lack of active participation by the other exporting interests contributed at least implicitly to a negotiated settlement that failed to generate positive trade liberalization benefits for the other exporters (spillovers) and instead led to a simple restructuring of the WTO-inconsistent policy into something that was likely even more discriminatory than the initial safeguard.7

As a preview to the empirical results, evidence is presented that countries with a substantial economic stake in the litigation (that is, lost market access) are more likely to participate in WTO dispute settlement. However, even after controlling for market access interests, several other political economy factors affect the decision not to litigate. These other factors are of potential concern from the perspective of an open and accessible dispute settlement system. Other things being equal, adversely affected exporters are less likely to participate when they are involved in a preferential trade agreement with the respondent, when they lack the capacity to retaliate against the respondent by withdrawing trade concessions, when they are poor or small, and when they are particularly reliant on the respondent for bilateral assistance. Because these last characteristics are typically associated with developing economies in the WTO membership, these results suggest evidence of an institutional bias affecting active engagement by such countries in the current system.

In addition to complementing the work of Horn and others (2005), this article is part of the growing empirical literature on dispute settlement activity under the WTO and its predecessor, the GATT.
Bown (2004a) empirically assesses determinants of successful economic outcomes in GATT/WTO trade disputes, finding substantial evidence that retaliation threats affect the likelihood and size of trade liberalization undertaken by the respondent and weak evidence that

6. Specifically, the quantitative restriction element of the tariff rate quota facing the Republic of Korea under the safeguard was expanded, so that the safeguard tariff applied only to Korean imports of line pipe exceeding 17,500 tons per quarter (USTR 2002). The United States did not increase market access under the tariff rate quota to any of the other adversely affected exporters.

7. The formal third-party participants (the European Union and Japan) also did not enjoy additional benefits from the settlement negotiated between the Republic of Korea and the United States. This is consistent with the empirical results of Bown (2004c), which examined the outcomes of an earlier sample of GATT/WTO disputes and found no evidence that participating as an interested third party made the trade liberalization gains extended by a respondent more multilateral. Though Bown (2004c) did not narrow his focus to examine third-party participation in nondiscrimination violation cases alone, an alternative interpretation to the free riding hypothesis is simply that nonparticipants make a rational choice. Perhaps because they do not have the capacity to threaten retaliation and prevent the discriminatory settlement, the nonparticipants rationally choose not to pay the litigation and political economy costs of participation, under the expectation that they would not have additional benefits extended to them through MFN anyway.
panel rulings of guilty also induce economic compliance.8 In the political science literature a series of papers by Busch and Reinhardt (Busch 2000; Busch and Reinhardt 2000; Reinhardt 2001) examined determinants of GATT/WTO litigation decisions related to those under investigation here, such as why some disputes settle early as opposed to being resolved by the third party adjudication available through the panel process (see also Guzman and Simmons 2002). None of these publications focuses on the question of dispute participation, the determinants of such participation, or any potential institutional bias. Furthermore, with the exception of Bown (2004a, 2004c), none of these earlier WTO studies takes advantage of the disaggregated trade data on the actual products under dispute.

Section I discusses the WTO dispute settlement process and the data collection efforts that establish the set of adversely affected exporters whose dispute settlement participation decisions (as potential litigants) are investigated econometrically. Section II provides the empirical investigation, and section III discusses the results. Section IV draws out policy implications and suggests areas of potential future research.

The increased legalization of the GATT's dispute settlement procedure, culminating in the 1995 establishment of the WTO's Dispute Settlement Understanding, was one of the major achievements of the negotiations under the Uruguay Round (Jackson 1997; Petersmann 1997). The GATT regime's dispute settlement process had several problems. For example, any contracting party, including potential respondents, could veto the initiation of a dispute, the establishment of a panel, or the adoption of a panel report. Furthermore, the dispute settlement process often failed to induce respondents to bring GATT-inconsistent policies into compliance with actual rulings.
The reforms embodied in the Dispute Settlement Understanding addressed many of these shortcomings: it eliminated countries' ability to unilaterally veto the establishment of a dispute settlement panel (Article 6.1), it delineated an explicit time frame for the panel decisions (Article 20), and it established more transparent rules, limits, and access to permissible retaliation (Article 22). At its inception, many scholars argued that the more rules-based dispute settlement system would benefit developing economy members in particular because they lacked the leverage to operate effectively under the old power-based system of the GATT.

Even with the increased legalization of the process, however, power relationships are still an important element of rules enforcement in the WTO system. Affected trading partners may be authorized to retaliate against a member country that fails to live up to its obligations by withdrawing concessions "equivalent to the level of the nullification or impairment" suffered by the complainant (Article 22.4). However, many small countries find such authorization to be useless. Their inability to affect world prices implies that any trade retaliation imposes substantial welfare costs on themselves through the standard inefficiencies associated with the imposition of tariff protection.9 Furthermore, although some subsidized legal assistance can be accessed by developing economies through the independent Geneva-based Advisory Centre on WTO Law (ACWL),10 the assistance is limited.

8. Bown (2002) presents a theoretical approach and Bown (2004b) an empirical one to address the related question of why a respondent country may have implemented a trade policy that was inconsistent with its international obligations and that thus put it in the position of being a respondent in a trade dispute.
Finally, there are no independent prosecutors under the WTO, so that firms in developing economies must be able to recognize that their rights have been violated before they can turn to their governments to pursue their case and even request access to subsidized legal assistance.

The Dispute Resolution Process under the WTO

How does the WTO dispute settlement process operate in practice? If a WTO member discovers its market access rights have been violated by another WTO member, it can initiate a dispute by requesting bilateral consultations under Article 4 of the Dispute Settlement Understanding. If those preliminary discussions fail to resolve the matter, the member can request the establishment of a formal dispute settlement panel under Article 6. Furthermore, any WTO member that has also been negatively affected by the respondent's policy or that has a substantial trading interest in the matter can formally participate in the dispute settlement proceedings as either a co-complainant or as an interested third party. With respect to multiple complainants Article 9.1 states that "where more than one Member requests the establishment of a panel related to the same matter, a single panel may be established to examine these complaints." Furthermore, with respect to third party interests, Article 10.2 states, "Any [WTO] member having a substantial interest in a matter before a panel and having notified its interest to the DSB [Dispute Settlement Body] (referred to in this Understanding as a "third party") shall have an opportunity to be heard by the panel and to make written submissions to the panel. These submissions shall also be given to the parties to the dispute and shall be reflected in the panel report."

A final question for the dispute settlement process that could affect the data collection approach here is whether there are restrictions on who is eligible to initiate a dispute as a potential complainant.
For example, are complainants limited to large suppliers of the products over which trade restrictions have been imposed? Article 3.7 suggests that a member should initiate a dispute only after exercising "its judgment as to whether action under these procedures would be fruitful," which could be interpreted as limiting eligibility to large exporters where the economic gains of increased market access would be "fruitful." By contrast, Petersmann (1997) details many explicit provisions in the Dispute Settlement Understanding that appear designed to encourage developing economies to become more involved in the initiation of disputes to protect their market access rights. Given these provisions mandating special treatment for developing area interests in the dispute settlement process, any WTO member that exports the disputed product to the respondent is treated as eligible to participate in formal disputes in which its exports have been adversely affected.

9. Bown (2002) presents a theoretical model showing how a complainant's ability to affect the terms of trade influences the outcome it receives in dispute settlement negotiations, even in disputes that do not end in retaliation.

10. The mission of the Advisory Centre on WTO Law is to "provide legal counseling on WTO law matters to developing-country and economy-in-transition members of the Centre and all least developed countries free of charge up to a maximum of hours to be determined by the Management Board" (see the ACWL Web site, www.acwl.ch).

Building the Data Set of Potential Litigants

Because this is both a new approach and a new data set under investigation, this section briefly describes the effort to construct a database of potential litigants, that is, the exporters that are negatively affected by member-implemented, WTO-inconsistent, import-restricting policies.
The approach essentially has three steps: determining the sample of WTO disputes to analyze, determining the set of exporters that are adversely affected by the disputed policy and that share the common goal of the WTO-inconsistent policy's removal, and matching the resulting set of potential litigants with data on the actual formal dispute settlement participants.

There were 85 formal WTO trade disputes initiated between 1995 and 2000 that involved legitimate allegations of the respondent providing excessive protection to a domestic, import-competing industry, which affected a well-defined set of imported products.11 Only 54 of them are unique, under the definition of a unique dispute relating to a singular WTO-inconsistent policy, respondent, and set of disputed products.

The Harmonized System (HS) code of the imported products involved in the disputed policies is matched with the most disaggregated, multilateral trade data systematically available from an independent source, the HS six-digit import data provided by the United Nations Conference on Trade and Development's (UNCTAD) Trade Analysis and Information System (TRAINS).12 If t is

11. This eliminates from the sample of data several disputes involving excessive export promotion (typically WTO-inconsistent subsidies), as well as disputes that failed to relate to a well-defined set of products, for example, the dispute over the U.S. Byrd Amendment (United States-Continued Dumping and Subsidy Offset Act of 2000, DS217 and DS234), because no specific products were identified in the dispute. The focus here on legitimate allegations minimizes the effect of omitted variable bias that might be introduced by failing to formally control for variation in the level of difficulty of the legal issues across cases.

12. In most disputes the HS codes of the affected products are listed in the formal WTO dispute correspondence that is published on the WTO's Web site.
In a handful of cases the codes were obtained from other sources, such as national government Web sites (for example, the Federal Register in some cases involving U.S. antidumping measures). Furthermore, the products frequently at issue in the dispute may be at the more disaggregated 8- or 10-digit level. To the extent that there is substantial variation in other 8- or 10-digit exports not under dispute in a 6-digit HS category, the results may be imprecisely estimated.

the year of dispute initiation, an exporter is defined as being affected by the disputed policy if it was a WTO member with nonzero exports of the disputed six-digit HS product to the respondent in any of the years t - 2, t - 1, t, or t + 1.

Only potential litigants that were adversely affected by the WTO-inconsistent policy and thus seek its removal are included.13 Identifying this set of countries required more detail on the discriminatory nature of the respondent violation in the WTO dispute. Only disputes that involved WTO-inconsistent policies applied on a quasi-MFN basis (that is, that negatively affected the exports from multiple countries) could be used. Some 35 of the 54 WTO-inconsistent policies fit the definition of being applied on virtually an MFN basis, so as to negatively affect the trade of all exporters of the disputed product (table 1).14 The other 19 WTO-inconsistent policies in the sample were applied on at least a quasi-MFN basis, in that even though an MFN violation was a key element of the dispute, non-WTO sources allowed the other exporting countries to be identified in addition to the complainant.
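The trade-based screen described above (a WTO member with nonzero exports of the disputed six-digit product to the respondent in any year from t - 2 to t + 1, less any countries exempted from a safeguard) can be sketched in a few lines. This is an illustration only, not the author's code: the record layout, the function name `affected_exporters`, the placeholder country labels, and the product code `123456` are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImportRecord:
    respondent: str  # importing country whose policy is disputed
    exporter: str    # source country of the imports
    hs6: str         # six-digit HS product code
    year: int
    value: float     # reported import value


def affected_exporters(records, respondent, disputed_hs6, t, wto_members,
                       exempted=frozenset()):
    """Return exporters with nonzero disputed-product exports in t-2..t+1."""
    window = range(t - 2, t + 2)  # years t-2, t-1, t, t+1
    return {
        r.exporter
        for r in records
        if r.respondent == respondent
        and r.hs6 in disputed_hs6
        and r.year in window
        and r.value > 0
        and r.exporter in wto_members
        and r.exporter not in exempted
    }


# Toy data with placeholder names and codes (not taken from the article).
records = [
    ImportRecord("R", "A", "123456", 1999, 12.0),  # inside the window
    ImportRecord("R", "B", "123456", 1996, 5.0),   # too early (t-4)
    ImportRecord("R", "C", "123456", 2001, 0.0),   # zero trade
    ImportRecord("R", "D", "123456", 2001, 3.0),   # exempted below
    ImportRecord("R", "E", "999999", 2000, 9.0),   # product not disputed
]
potential = affected_exporters(
    records, respondent="R", disputed_hs6={"123456"}, t=2000,
    wto_members={"A", "B", "C", "D", "E"}, exempted={"D"},
)
print(sorted(potential))
```

The sketch covers only the trade-data screen. Dropping exporters that benefit from a discriminatory policy, as the text requires, depends on dispute-specific information that import data alone do not supply.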
A good example of this second type of dispute is European Union-Banana Regime, where many negatively affected exporters could have participated in the dispute (because they were injured by the discrimination) and many positively affected exporters, such as the Lomé Agreement countries that received preferential access, would not be potential litigants under the definition used here, because they would not have sought to have the WTO-inconsistent policy removed.

The final step is to take the adversely affected exporters in these 54 disputes and to match them with information on the exporters that formally participated in each dispute. First, there may be multiple complainant countries involved in a dispute against the same respondent and disputed policy. There were 89 complainants involved in these 54 unique disputes (see table 1).

13. This is not to say that exporters that implicitly benefit from a WTO-inconsistent policy (say, through preferential access generated by an MFN violation) are not interested in the dispute's outcome and thus do not have an incentive to participate as an interested third party. However, because they are not adversely affected by the WTO-inconsistent policy, they have no economic incentive to act as a complainant in the dispute (in fact, they have a disincentive to complain), and therefore it would be inappropriate to include them in the three-choice model estimated later.
Nevertheless, there are several questions regarding third-country participation alone that are quite interesting but that cannot be addressed given the approach here, for example, the more general question of why countries participate as third parties in trade disputes at all and whether they are more likely to do so to defend economic benefits implicitly received through a discriminatory policy, fight for economic benefits promised but not given because of a WTO-inconsistent policy, or fight for legal interests that are more systemic in nature and that might not relate to any particular economic benefit at all.

14. As noted in the introduction, any exempted countries that the respondent announced as being excluded from the safeguard were excluded from the negatively affected exporting countries in safeguard disputes. In many cases these were countries in a common preferential trading agreement or small, developing economy suppliers that do not meet a de minimis criterion of at least 3 percent of the import market. For a discussion of the use of country exemptions in WTO safeguards protection, see Bown and McCulloch (2004).

TABLE 1.
Nondiscrimination and Discrimination (MFN) Violations in the 1995-2000 WTO Trade Dispute Data Used in the Estimation

Disputes in the data set (85)

Nondiscrimination Violations Negatively Affecting All Exporters:
DS1, DS7 (DS12, DS14), DS8 (DS10, DS11), DS9 (DS13, DS17, DS25), DS18 (DS21), DS20, DS26 (DS48), DS31, DS43, DS56 (DS77), DS62, DS74, DS75 (DS84), DS76, DS78a, DS85, DS87 (DS109), DS90 (DS91, DS92, DS93, DS94, DS96), DS98a, DS103 (DS113), DS111, DS121a (DS123, DS164), DS146 (DS175), DS147, DS149, DS151, DS161 (DS169), DS166a, DS177a (DS178), DS183, DS193, DS195, DS202a, DS207 (DS220), DS214a

Discrimination (MFN) Violations, also Adversely Affecting Some Exporters in Addition to Complainant:
DS2 (DS4), DS24, DS27 (DS105), DS29, DS32, DS33, DS54 (DS55, DS59, DS64), DS58 (DS61), DS72, DS119, DS122, DS135, DS139 (DS142), DS140, DS141, DS179, DS184, DS190, DS206

                                  Nondiscrimination   Discrimination (MFN)
                                  Violations          Violations
Unique disputes                   35                  19
Adversely affected exporters      805                 60
  As complainants                 57                  32
  As interested third parties     58                  7
  As nonparticipants              690                 21

Note: Classification determined by the author as described in the text. A dispute in parentheses is combined with the immediately preceding dispute (not in parentheses) because it relates to a common respondent and set of disputed products.
a. The exception is safeguard violations in which the safeguard-imposing country exempted imports from either members of a preferential trading arrangement or small developing countries under Article 9.1 of the Agreement on Safeguards.
Source: Author's compilation of HS codes based on publicly available WTO dispute settlement documents and national government publications. The exporters are derived from the HS import data in the UNCTAD TRAINS database.

There were also 65 adversely affected exporters that formally notified the WTO of their interest as a third party.15 The remaining 711 adversely affected exporters were nonparticipants.

15. Two sources were used to identify these countries.
First, for all disputes that resulted in a panel report, the information in the report was used to determine which countries made third-party submissions or reserved their third-party rights to make legal arguments during the panel process. Second, in disputes that did not reach the panel stage countries could signal their interest by making a formal request to the respondent and complainant to join the consultations (under Article 4.11 of the Dispute Settlement Understanding), based on a substantial trading interest in the products under dispute. Such notifications are published on the WTO's Web site along with other information pertaining to the dispute.

Table 2 lists the frequency with which each exporting country was a nonparticipant, interested third party, and complainant in the data set of trade disputes under investigation. Developing economies constitute a substantial share of the sample, despite the fact that a country's inclusion was contingent on it already being an exporter of the HS six-digit product under dispute in the respondent's market. That is, the estimation excludes from the sample potential (developing and developed economy) exporters of the disputed product that have been shut out of a particular market entirely, perhaps due to the WTO-inconsistent policy.

What factors determine whether an adversely affected exporting country formally participates in a trade dispute? The hypothesis here is that such exporters participate when the expected benefits to participation are greater than the expected costs. It is assumed that expected benefits depend on the size of the gains the exporter would receive from a successfully resolved case and on the probability that the case is resolved successfully. The expected costs of formal participation in a dispute can be said to have two distinct components: the expected litigation costs and the expected political economy costs of confronting another nation in a formal dispute.
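The participation hypothesis can be stated compactly. The notation below is introduced here for illustration and does not appear in the article: for exporter i in dispute j, p is the probability of a successful resolution, B the gain from a successfully resolved case, and C^L and C^P the expected litigation and political economy costs. Writing the expected benefit as the product of p and B is a simplifying assumption. The text says only that expected benefits depend on both.

```latex
\[
  \text{exporter } i \text{ formally participates in dispute } j
  \quad\Longleftrightarrow\quad
  p_{ij}\, B_{ij} \;>\; C^{L}_{ij} + C^{P}_{ij}
\]
```

Under this reading, the market access variables are proxies for B, the retaliation variables for p, and the capacity and political relationship variables for the two cost terms.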
As will be described in more detail, the hypothesis allows for economic interests to affect decisions, but proxies are included for some of the institutional biases that WTO scholars have been concerned might also influence participation decisions, given the rules and procedures of dispute settlement described in the previous section. The failure to find evidence of a relationship between the political economy determinants and participation decisions would be consistent with an alternative hypothesis that only the exporter's interest in its trade to the disputed sector matters. The next two sections describe the variables and data used to represent these expected benefits and costs.

Expected Benefits of Formal Participation

What are the expected benefits of participating in a dispute, and when would they be large? This investigation focuses on the direct short-term economic benefits of participating in the dispute, that is, the improved terms of market access or trade liberalization offered by the respondent country.16 The hypothesis here is that an exporter's decision to participate formally in the dispute increases the marginal benefit of all countries that export the disputed product to the respondent either by increasing the likelihood that the respondent will

16. Alternatively, countries might have an incentive to participate to ensure the long-term viability of the institutional arrangement or to make arguments that might apply to their rights and obligations being litigated in other concurrent (or future) cases. These perspectives are not considered here.

TABLE 2. Affected WTO Member Exporters as Nonparticipants, Interested Third Parties, and Complainants in the 1995-2000 Trade Dispute Data Used in the Estimation

Adversely Affected Exporter | Nonparticipant | Interested Third Party | Complainant | Total

Korea, Rep.;
Indonesia; New Zealand; South Africa; Japan; Singapore; Turkey; Australia; Canada; Brazil; Hong Kong, China; Mexico; Switzerland; Argentina; Thailand; Czech Republic; Romania; Pakistan; Poland; Colombia; Peru; Malaysia; Uruguay; Morocco; Egypt; Hungary; Israel; Norway; Chile; Philippines; Sri Lanka; European Union; Ecuador; Venezuela; Bangladesh; Tunisia; India; Costa Rica; Kenya; United Arab Emirates; Zimbabwe; Bulgaria; Mauritius; St. Lucia; Honduras; Paraguay; Guatemala

comply with its obligations and undertake a given level of trade liberalization or by increasing the depth of any such liberalization.

First, an exporter would be more likely to participate in the proceedings when the respondent's disputed market is important. One measure of importance is the size of the market access commitment in question (that is, the value of trade lost to the disputed policy), for which the log of the real dollar value of exports to the respondent's disputed market in t - 1, the year before the initiation of the dispute, is used as a proxy. The disputed sector data for the respondent country is again the HS six-digit data derived from TRAINS.17

The exporter's share of the respondent's disputed import market in year t - 1 is used as a second measure. This variable addresses the idea that an exporter with a sizable market share may be expected to take on a leadership role in challenging a WTO-inconsistent measure. This might occur even when imports in t - 1 were small because it is a dispute in which the respondent refused to implement negotiated WTO obligations, as opposed to a dispute in which the respondent has applied a new, WTO-inconsistent policy in t after a market had been liberalized.

Next, even when the value of trade at stake is not large or when the exporter is not necessarily a leader in that particular market, exporters may be more likely to participate in disputes in which their sales are disproportionately concentrated in a particular destination market.
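The two market access measures just described (the log of the exporter's t - 1 export value and its t - 1 share of the respondent's disputed import market) can be computed as in the sketch below. This is a minimal illustration under stated assumptions: the function name, the dictionary layout, and the choice to return `None` when the t - 1 value or total is zero are not from the article, which does not say how zero t - 1 trade is handled.

```python
import math


def market_access_measures(imports_t_minus_1, exporter):
    """Return (log export value, import market share) for one exporter.

    imports_t_minus_1 maps each exporter to the real dollar value of the
    respondent's imports of the disputed HS six-digit products in t-1.
    """
    value = imports_t_minus_1.get(exporter, 0.0)
    total = sum(imports_t_minus_1.values())
    # Assumption: a zero value has no defined log, so None is returned.
    log_value = math.log(value) if value > 0 else None
    share = value / total if total > 0 else None
    return log_value, share


# Toy values with placeholder exporter labels (not data from the article).
imports = {"A": 100.0, "B": 300.0}
log_value, share = market_access_measures(imports, "A")
print(log_value, share)
```

Taking logs compresses very large trade values, and the share is scale free, so an exporter can rank high on one measure and low on the other.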
Thus a measure of the exporter's diversification, defined as the disputed HS six-digit exports to the respondent as a share of the exporter's same six-digit exports to the world in t - 1, is also included as an explanatory variable.18 A positive relationship is expected between this variable and the participation decision because exporters that are more reliant on the respondent's market (that is, that are less diversified) are more likely to participate in a formal WTO challenge because they are concerned with the ability to deflect lost trade to alternative third markets due to a market-specific, fixed cost of exporting.19

The Likelihood of Success in a Dispute

The expected benefits of a dispute are also expected to be affected by the probability of its successful economic resolution. Due to the self-enforcing nature of the WTO's dispute settlement system, exporting countries can enforce their rights only through actual or implicit threats of retaliation against offending trading partners.20 Therefore, the hypothesis here is that an exporter is more

17. The log of the level of t - 1 imports is used to avoid giving too much weight to particularly large values of this variable in certain observations in the data set.

18. Because the HS six-digit data are available only for importing countries reporting data in the TRAINS data set, a consistent time series exists for 23 of the 30 largest importing countries here.

19. For evidence on exporting countries' ability to deflect trade to third markets when confronted with newly imposed trade restrictions, see Bown and Crowley (2004).

20. Using a sample of GATT/WTO disputes initiated and completed over 1973-98, Bown (2004a) has shown that the more powerful the complainant exporter with respect to its capacity to engage in tariff retaliation against the respondent, the greater the trade liberalization gains that the respondent yields to the complainant at the conclusion of the dispute.
likely to participate in a dispute in which it is bilaterally powerful (withrespect to the respondent) because this positively affects the probability of a successful economic outcome. A respondent country is more likely to bring a mo-incon- sistent policy into conformity with its obligations when it has a credible reta- liatory cost for failing to do so. The capacity of an exporter to credibly threaten tariff retaliation is measured as the share of the respondent's total exports to the exporting country, using the bilateral export data provided in Feenstra (2000). An alternative retaliation threat variable is the respondent's reliance on the exporter for bilateral aid. Specifically, the more reliant the respondent is on the exporting country for development assistance, the more aid the exporting country could threaten to withdraw, and thus the more likely that the respon- dent would implement market access commitments. The hypothesis here is that the more reliant the respondent is on the exporter for bilateral aid, the more likely the exporter is to formally join the dispute. By contrast, the respondent's reliance on the exporter for bilateral assistance could also signal a special political relationship between the two countries that might decrease the like- lihood that the exporter would confront the country with a formal international dispute. These potential relationships are investigated using bilateral aid data derived from Organisation for Economic Co-operation and Development ( ~ E C D / DAC 2001). The variable is formally defined as the aid the respondent receives from the exporter, relative to the size of the respondent's gross domestic product (GDP),to normalize for level differences across c~untries.~' The Capacity to Absorb Litigation Costs When would the expected costs to an exporting country of formally participat- ing in a dispute be high? 
The resource costs of filing the paperwork to merely initiate or participate in a case as either a complainant or an interested third party (or reserving third-party rights) are not large, and this legal signal is all that is necessary for a country to be considered a formal participant in the data set used here. Nevertheless, the exporting country's GDP, with data derived from World Bank (2001), is used as a proxy for an exporter's capacity to incur significant legal costs. The theory is that although legal services may be internationally traded, richer countries have more access to the resources necessary to hire counsel both to monitor trading interests and to stand up for those interests through litigation. Data on the number of delegates that each WTO member has sent to the WTO offices in Geneva are used as a proxy for a country's legal capacity (Horn and others 2005).

21. The aid data are official development assistance and aid, and they do not include, for example, military aid or aid from nongovernmental organizations. An alternate measure of interest, not considered here due to lack of data, is trade preferences between the exporter and respondent countries, such as participation in the Generalized System of Preferences. Part of this concern is addressed by a variable capturing membership in formal preferential trade agreements sanctioned under the GATT's Article XXIV, discussed later.

Political Economy Costs

A second potentially important expected cost to developing economy exporters relates to the political economy costs of publicizing a grievance through a formal international confrontation with a particularly 'important' respondent country. One type of important country is a trading partner on which the exporter is particularly reliant for bilateral assistance.
The expected result here is that the more aid the exporter receives from the respondent (relative to GDP), the less likely that the exporter will formally participate in a case against the respondent as either a complainant or an interested third party, for fear of losing this aid. Again, the bilateral aid data are derived from OECD/DAC (2001).

Another example of an important country from the exporter's perspective is a trading partner with which the exporter is involved in a preferential trade agreement. The hypothesis here is that a country is less likely to formally participate in a dispute against another preferential trade agreement member because it would worsen relations or because the agreement contains its own dispute settlement provisions. The dummy variable thus takes on a value of 1 if the exporter and respondent country are members of a common free trade agreement or customs union that has been notified to the WTO under the GATT's Article XXIV (WTO 2003a).

The summary statistics for each of the variables used in the estimation are provided in table 3.

Econometric Model

To address the determinants of a negatively affected exporter's decision to participate in a trade dispute, a WTO member is assumed to make one of three choices: i ∈ {0, 1, 2}, where 0 = not participate, 1 = interested third party, and 2 = complainant. It is assumed that the formal participation decision is an ordered choice; that is, complainants are more involved in the case than are interested third parties, and so on. The determinants of this choice are econometrically estimated using the standard ordered probit model.22

Table 4 shows the maximum likelihood estimates of the marginal effects of the ordered probit model. The 865 observations are the negatively affected countries revealed by the trade data as exporting the HS six-digit disputed product to the respondent in one of the 54 quasi-MFN disputes described in table 1.
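The ordered choice structure just described can be illustrated with a minimal sketch. It assumes a single latent index and two cutpoints, which is the textbook ordered probit setup rather than the article's estimated specification. The cutpoints are backed out from the predicted probabilities that the article reports at the sample means (91.6, 5.7, and 2.7 percent); the coefficient `beta` is purely hypothetical, and all function names are illustrative.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def choice_probabilities(index, cut1, cut2):
    """Return (P[not participate], P[third party], P[complainant])."""
    p_none = STD_NORMAL.cdf(cut1 - index)
    p_third = STD_NORMAL.cdf(cut2 - index) - p_none
    p_complainant = 1.0 - STD_NORMAL.cdf(cut2 - index)
    return p_none, p_third, p_complainant


def marginal_effects(index, cut1, cut2, beta):
    """Derivatives of the three probabilities with respect to one regressor."""
    d1 = STD_NORMAL.pdf(cut1 - index)
    d2 = STD_NORMAL.pdf(cut2 - index)
    return -beta * d1, beta * (d1 - d2), beta * d2


# Assumption: normalize the index to 0 at the sample means and choose the
# cutpoints so that the probabilities there match those reported in the text.
cut1 = STD_NORMAL.inv_cdf(0.916)
cut2 = STD_NORMAL.inv_cdf(0.916 + 0.057)

probs = choice_probabilities(0.0, cut1, cut2)
effects = marginal_effects(0.0, cut1, cut2, beta=0.15)  # beta is hypothetical

print("probabilities:", [round(p, 3) for p in probs])
print("marginal effects:", [round(e, 4) for e in effects])
```

Because the three probabilities always sum to one, the three marginal effects sum to zero, so reporting any two of them pins down the third.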
The model is also estimated with respondent country fixed effects, whose estimates are suppressed. Table 4 presents estimates of the marginal effects of the determinants of the exporter's choice of becoming a complainant, of becoming an interested third party, and of the choice not to participate.23 In considering the size of the marginal effects estimates discussed next, note that when evaluated using the means of the underlying data, the predicted probability of an exporter choosing to be a complainant is 2.7 percent and to be an interested third party is 5.7 percent, whereas the average exporter has a 91.6 percent chance of choosing not to participate.

22. For a formal discussion of the ordered probit model, see Greene (2000). Alternatively, one could also assume that these choices are unordered and thus use the multinomial logit model, which is discussed in more detail later. For a formal discussion of the multinomial logit model, see Greene (2000).

TABLE 3. Summary Statistics for the Variables Used in the Negatively Affected Exporter's Choice Model
Variable | Predicted Sign | Mean | SD | Minimum | Maximum
Dependent variable
  0 = nonparticipant, 1 = interested third party, 2 = complainant | (for = 1 or 2) | 0.2797 | 0.6387 | 0 | 2
Explanatory variables
Size of potential liberalization benefits
  Market access: log of exporter's real value of exports to respondent's disputed market in t - 1
  Leadership: exporter's share of respondent's disputed market in t - 1
  Market diversification: exporter's disputed sector exports to respondent as a share of exporter's total disputed sector exports in t - 1
Probability of realizing benefits
  Trade retaliation capacity: respondent's exports sent to the exporter as a share of its total exports in t - 1 | + | 0.0289 | 0.0688 | 0.0000 | 0.8052
  Aid retaliation capacity or special relationship: respondent's bilateral aid that is received from the exporter relative to respondent GDP in t - 1 (a) | Unknown | 0.0042 | 0.0448 | 0 | 1.1028
Capacity to absorb expected litigation costs
  Income: log of exporter's GDP in t - 1
  Legal capacity: log of exporter delegates at the WTO Secretariat
Political economy costs
  Preferential trade agreement: respondent and exporter in a common free trade area or customs union
  Fear of losing aid: exporter's bilateral aid that is received from the respondent relative to exporter GDP in t - 1 (a) | - | 0.2358 | 0.8897 | 0 | 8.6872
Source: Author's calculations based on data sources described in the text.
a. Ratio scaled up by 100,000.

Expected Benefits to an Economically Successful Resolution

The expected benefits of participating and the variables controlling for the size and importance of the benefits to the exporter if the dispute concludes successfully (that is, with the respondent liberalizing trade in the disputed sector) are considered first. The market access variable is defined as the log of the value of the exporter's exports to the respondent's HS six-digit disputed market in t - 1, and the estimated marginal effect is found to be 0.009.
Although the implied size of the estimate for this variable is difficult to interpret (recall the import variable is defined in logs), it is economically significant. A 1 point increase in the underlying explanatory variable from the mean of 6.4027 ($603,472 of HS six-digit exports) to 7.4027 ($1,640,408 of HS six-digit exports) increases the likelihood that an exporter will become a complainant by roughly 0.9 percentage points (from 2.7 percent to 3.6 percent). Next, the 0.098 estimate of the marginal effect for the leadership variable indicates that a 10 percentage point increase in the exporter's share of the respondent's disputed market in t - 1 leads to a 0.98 percentage point increase in the likelihood that the exporter will become a complainant.

The one variable from the size of the expected benefits analysis that is not of the theoretically predicted sign is the diversification variable, defined as the exporter's HS six-digit exports to the respondent in t - 1 relative to its exports to the world of the disputed HS six-digit product. The estimate indicates that the more reliant (less diversified) the exporter is on the respondent's market, the more likely the exporter is to simply not participate (0.073). However, the estimate is not statistically different from zero.

23. Estimates of the three choices are included for convenience, though estimates for two of the choices would be sufficient. For example, the estimates for the nonparticipant choice can be derived by simply multiplying the sum of the values of the marginal effect estimates for the complainant and for the interested third party by -1.

TABLE 4. Marginal Effects Estimates of Ordered Probit Model of Complainant, Interested Third Party, and Nonparticipant Choice
Dependent Variable: Exporter's Choice of Becoming A
Explanatory Variable | Complainant | Interested Third Party | Nonparticipant
Size of potential liberalization benefits
  Market access: log of exporter's real value of exports to respondent's disputed market in t - 1
  Leadership: exporter's share of respondent's disputed market in t - 1
  Market diversification: exporter's disputed sector exports to respondent as a share of exporter's total disputed sector exports in t - 1
Probability of realizing benefits
  Trade retaliation capacity: respondent's exports sent to the exporter as a share of its total exports in t - 1
  Aid retaliation capacity or special relationship: respondent's bilateral aid that is received from the exporter relative to respondent GDP in t - 1
Capacity to absorb expected litigation costs
  Income: log of exporter's GDP in t - 1
  Legal capacity: log of exporter delegates at the WTO Secretariat
Political economy costs
  Preferential trade agreement: respondent and exporter in a common free trade area or customs union
  Fear of losing aid: exporter's bilateral aid that is received from the respondent relative to exporter GDP in t - 1
Number of observations | 865
Number of unique disputes | 54
Pseudo-R² | 0.32
Log likelihood | -344.47
*Statistically different from 0 at the 10 percent level.
**Statistically different from 0 at the 5 percent level.
***Statistically different from 0 at the 1 percent level.
Note: White's heteroskedasticity-consistent standard errors corrected for clustering on the underlying dispute are in parentheses. Time t is the year of the start of the dispute. Specification also estimated with a constant term and with respondent country-fixed effects whose estimates are suppressed.

The second set of explanatory variables is concerned with the probability that the benefits to the exporter will be realized through a successful resolution of the dispute. The more capacity that an exporter has to retaliate by withdrawing trade concessions, as measured by the respondent's reliance on the exporter's market for the respondent's trade, the more likely the exporter is to become a complainant (0.266). A 10 percentage point increase in the respondent's reliance on the exporter's markets for its own exports thus leads to a 2.66 percentage point (roughly double) increase in the likelihood that the exporter will formally participate in the dispute as a complainant. By contrast, the estimate for the retaliation threat through withdrawing bilateral assistance, which was also expected to influence the likelihood of a successful outcome, is negative, though statistically insignificant. Although inconsistent with the threatened withdrawal of aid hypothesis, a viable explanation is that this aid relationship is instead capturing a special political relationship that makes the exporter less likely to participate in a formal international dispute confronting the respondent.

Expected Costs of Participating in a Dispute

The next set of variables covers the expected costs to an exporter of participating in a dispute. First, the exporter's GDP, a proxy for its capacity to pay for traded legal services, is positively associated with the decision to become a complainant or interested third party. Larger and richer countries are thus more likely to formally participate in WTO litigation. But the estimates for variables capturing the number of the exporter's delegates at the WTO are neither of the correct sign nor statistically significant.
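The magnitudes quoted for the benefit and retaliation variables can be retraced with a few lines of arithmetic. This is a hedged sketch only: it assumes the logged export value is measured in thousands of dollars (which reproduces the dollar figures in the text), and it treats each marginal effect as a linear approximation around the sample means. The helper name `effect_in_points` is illustrative.

```python
import math

# Market access: the regressor is a log, so a 1 point increase multiplies
# the underlying export value by e (about 2.72).
mean_log_exports = 6.4027
base_value = math.exp(mean_log_exports)          # assumed unit: thousands of dollars
higher_value = math.exp(mean_log_exports + 1.0)  # after a 1 point increase


def effect_in_points(marginal_effect, change):
    """Approximate percentage point change in a probability."""
    return 100.0 * marginal_effect * change


market_access_pp = effect_in_points(0.009, 1.0)  # 1 log point
leadership_pp = effect_in_points(0.098, 0.10)    # 10 percentage point share gain
retaliation_pp = effect_in_points(0.266, 0.10)   # 10 percentage point share gain

# Complainant probability at the means, before and after the log-point increase.
complainant_before = 2.7
complainant_after = complainant_before + market_access_pp

print(round(base_value, 1), round(higher_value, 1))
print(round(market_access_pp, 2), round(leadership_pp, 2), round(retaliation_pp, 2))
print(round(complainant_after, 1))
```

Under these assumptions the trade retaliation effect (2.66 points) is about the same size as the 2.7 percent baseline probability of being a complainant, which is the sense in which the text calls the increase "roughly double".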
Finally, there is also strong evidence that potential political economy costs of international relations make it less likely that an exporter will participate in a trade dispute when the respondent is politically important to the exporter. Exporters are less likely to participate in disputes against trading partners in a common preferential trade agreement as either complainants (-0.030) or interested third parties (-0.059). The estimated marginal effects of these variables are large: other things being equal, an exporter that is in a preferential trade agreement with the respondent faces a roughly 3 percentage point decrease in the probability of becoming a complainant and a 6 percentage point decrease in the probability of becoming a third party, relative to an exporter that is not in a preferential trade agreement with the respondent. Furthermore, the larger the exporter's reliance on the respondent for bilateral aid, the less likely it is to intervene as a complainant (-0.021) or an interested third party (-0.032). The size of the effect is also substantial, as a one standard deviation increase in the variable above its mean halves the likelihood of the exporter participating as a complainant (from 2.7 percent to 1.3 percent), and also substantially reduces the likelihood of it participating as an interested third party (from 5.7 percent to 3.3 percent).

Sensitivity Analysis

In addition to the baseline specification of the ordered probit model illustrated in table 4, several robustness checks are performed to assess the sensitivity of the results to basic changes of model specification. One potential source of concern for the approach used here relates to the choice of the ordered probit model itself. Alternatively, one might use the multinomial logit model, which does not require an assumption on the ordering of outcomes. But a concern with estimating the multinomial logit model is the independence of irrelevant alternatives assumption.
In the estimation of the multinomial logit model (not reported here) the qualitative pattern of results was quite similar to those reported for the ordered probit model, and yet Hausman tests of the independence of irrelevant alternatives assumption suggested that in some specifications of the model it could be invalid.

As other robustness checks, the ordered probit model has been estimated using various subsets of data, including only the exporters from the 35 nondiscrimination violation disputes listed in table 1 and the full set of 54 disputes truncated to include only countries that were above a minimum dollar threshold (for example, $500,000, $1 million, and $2 million) of disputed sector exports to the respondent to ensure that the results were not driven simply by the smallest exporting countries. In both instances, the qualitative nature of the results was largely unchanged from those reported in table 4.

IV. CONCLUSIONS AND POLICY IMPLICATIONS

This article is the first to use detailed trade data to identify the potential litigants in WTO dispute settlement activity and to investigate the determinants of those countries' participation decisions in formal trade disputes. Even after controlling for the economic importance of disputed sector market access, variables that serve as proxies for the institutional bias generated by the current rules of the system also affect the nonparticipation choice. The formal evidence indicates that despite market access interests in a dispute, an exporting country is less likely to participate in WTO litigation if it has inadequate power for trade retaliation, if it is poor and does not have the capacity to absorb substantial legal costs, if it is particularly reliant on the respondent country for bilateral assistance, or if it is engaged with the respondent in a preferential trade agreement. These are characteristics typically associated with developing economies in the WTO membership.
This investigation is also subject to caveats. Foremost is that although the reasons why exporters do not participate in disputes that have already been initiated are examined, the lack of data and knowledge concerning noninitiated cases means that the more compelling question of whether the determinants of nonparticipation analogously lead to an underinitiation of trade disputes relative to a social optimum cannot be addressed. At most the evidence provided here on the importance of limited retaliatory and legal capacity, as well as special political economy relationships, suggests that these factors may also adversely affect the initiation of disputes more generally. Obviously, the question of dispute initiation is still open and should be the focus of additional research.24

Although this is only a first attempt to characterize and analyze the data, these results may nevertheless contribute to the policy debate on proposals of reform to the WTO dispute settlement system. In particular, suppose that one policy goal was to promote systemic reforms designed to encourage a country's participation in dispute settlement activities that were important to its trading interests, so as to induce a sharing of the litigation burden and a commitment to working within the system. The results here suggest that any such attempt must recognize that it is not only the exporter's trading interest (and level of income) that affects the decision to participate but also its capacity to retaliate through trade, to be retaliated against through the withdrawal of bilateral aid, and the nature of special political or trading relationships that it has with respondents.

One proposal has been to expand the WTO's power to impose more discipline on negotiated settlements, so that the outcomes of disputes are truly transparent.
In light of the concerns raised in this article, such an approach on transparency could be beneficial if it reduced the incidence of discriminatory settlements where market access benefits are not extended on an MFN basis. For example, increased transparency could lead private sector interests (such as the adversely affected exporting firms) to increase the pressure they place on their own governments to better monitor and actively participate in the process on their behalf. Active engagement and representation of exporting interests in developing economies especially could help balance the political influence that dominant, import-competing interests typically wield over their governments. Of course, a reform that increases transparency is also likely to affect the incentive for potential litigants to initiate disputes and thus the set of WTO-inconsistent policies that get challenged at all. Therefore, such a proposal should be the subject of additional research and scrutiny.

24. Bown (2005) is one attempt. It examines which U.S.-imposed antidumping measures are subject to WTO disputes. The evidence from this sample of data suggests that many of the same factors that influence the choice to participate investigated here also affect the decision of which disputes to initiate against the United States.

Second, the finding that political concerns affect nonparticipation decisions illustrates the difficulties confronting the desire to facilitate coordination of litigation efforts across countries. Another proposal that could minimize the influence of such political concerns would be to authorize a WTO-sponsored independent prosecutor or ombudsman to represent the joint interests of the group of adversely affected, potential litigants. This would focus attention on the WTO-inconsistent policy, as opposed to any particular complainant country.
Although this approach would certainly also introduce additional concerns that should be studied, it could help overcome the unwillingness of dependent countries to challenge trade restrictions due to fear of retribution by the respondent in other areas.

Finally, with respect to the issue of a lack of retaliatory capacity, Bagwell and others (2004) present an approach that investigates potential schemes to address bilateral power imbalances, in particular the possibility that powerless complainant countries might auction off their rights to retaliate against noncompliant respondents. The results presented here, along with those of Bown (2004a), suggest that if the WTO seeks incentives for affected exporters to participate in dispute settlement, it might be most effective at targeting for participation the relatively (bilaterally) powerful country complainants and third parties, even if the powerful potential litigant would normally not participate because of only a small trading interest in the disputed sector. Each of these proposals raises interesting additional questions that should be the focus of additional theoretical and empirical economics research.

REFERENCES

Bagwell, Kyle, Petros C. Mavroidis, and Robert W. Staiger. 2004. "The Case for Tradable Remedies in WTO Dispute Settlement." Working Paper 3314. World Bank, Washington, D.C.
Blonigen, Bruce A., and Chad P. Bown. 2003. "Antidumping and Retaliation Threats." Journal of International Economics 60(2):249-73.
Bown, Chad P. 2002. "The Economics of Trade Disputes, the GATT's Article XXIII, and the WTO's Dispute Settlement Understanding." Economics and Politics 14(3):283-323.
---. 2004a. "On the Economic Success of GATT/WTO Dispute Settlement." Review of Economics and Statistics 86(3):811-23.
---. 2004b. "Trade Disputes and the Implementation of Protection under the GATT: An Empirical Assessment." Journal of International Economics 62(2):263-94.
---. 2004c. "Trade Policy under the GATT/WTO: Empirical Evidence of the Equal Treatment Rule." Canadian Journal of Economics 37(3):678-720.
---. 2005. "Trade Remedies and World Trade Organization Dispute Settlement: Why Are So Few Challenged?" Journal of Legal Studies 34(2):515-55.
Bown, Chad P., and Meredith A. Crowley. 2004. "Trade Deflection and Trade Depression." Working Paper 2003-26. Federal Reserve Bank of Chicago, Ill.
Bown, Chad P., and Rachel McCulloch. 2004. "The WTO Agreement on Safeguards: An Empirical Analysis of Discriminatory Impact." In Michael G. Plummer, ed., Empirical Methods in International Economics: Essays in Honor of Mordechai Kreinin. Cheltenham, UK: Edward Elgar.
Busch, Marc L. 2000. "Democracy, Consultation and the Paneling of Disputes under GATT." Journal of Conflict Resolution 44(4):425-46.
Busch, Marc L., and Eric Reinhardt. 2000. "Bargaining in the Shadow of the Law: Early Settlement in GATT/WTO Disputes." Fordham International Law Journal 24(1):158-72.
Feenstra, Robert. 2000. "World Trade Flows, 1980-1997, with Production and Tariff Data." Working Paper, University of California, Davis, Department of Economics.
Greene, William H. 2000. Econometric Analysis, 4th ed. Upper Saddle River, N.J.: Prentice Hall.
Guzman, Andrew, and Beth A. Simmons. 2002. "To Settle or Empanel? An Empirical Analysis of Litigation and Settlement at the World Trade Organization." Journal of Legal Studies 31(1):S205-35.
Hoekman, Bernard, and Petros C. Mavroidis. 2000. "WTO Dispute Settlement, Transparency and Surveillance." World Economy 23(4):527-42.
Horn, Henrik, Petros C. Mavroidis, and Håkan Nordström. 2005. "Is the Use of the WTO Dispute Settlement System Biased?" In Petros C. Mavroidis and Alan Sykes, eds., The WTO and International Trade Law/Dispute Settlement. Cheltenham, UK: Edward Elgar.
Irwin, Douglas A. 2003. "Causing Problems? The WTO Review of Causation and Injury Attribution in US Section 201 Cases." World Trade Review 2(3):297-325.
Jackson, John H. 1997. The World Trading System: Law and Policy of International Economic Relations, 2nd ed. Cambridge, Mass.: MIT Press.
OECD/DAC (Organisation for Economic Co-operation and Development/Development Assistance Committee). 2001. International Development Statistics. Paris.
Petersmann, Ernst-Ulrich. 1997. The GATT/WTO Dispute Settlement System: International Law, International Organizations and Dispute Settlement. London: Kluwer Law.
Reinhardt, Eric. 2001. "Adjudicating without Enforcement in GATT Disputes." Journal of Conflict Resolution 45(2):174-95.
USTR (United States Trade Representative). 2002. "United States and Korea Resolve WTO Dispute on Line Pipe." Press release, July 29. Available online at www.ustr.gov.
World Bank. 2001. "Global Development Network Growth Database." Washington, D.C. Online document available at www.worldbank.org/research/growth/gdbdata.htm.
WTO (World Trade Organization). 2003a. "Regional Trade Agreements." Online document available at www.wto.org/english/tratop_e/region_e/region_e.htm.
---. 2003b. "Trade Policy Reviews." Online document available at www.wto.org/english/tratop_e/tpr_e/tpr_e.htm.

Has Rural Infrastructure Rehabilitation in Georgia Helped the Poor?

Michael Lokshin and Ruslan Yemtsov

This article proposes a research strategy to deal with the scarcity of data on beneficiaries for conducting impact assessments of community-level projects. Community-level panel data from a regular household survey augmented with a special community module are used to measure the impact of projects. Propensity score-matched difference-in-difference comparisons are used to control for time-invariant unobservable factors. This methodology takes into consideration the purposeful placement of projects and their interactions at the community level. This empirical approach is applied to infrastructure rehabilitation projects (for schools, roads, and water supply systems) in rural Georgia between 1998 and 2001.
The analysis produces plausible results regarding the size of welfare gains from a particular project at the village level and allows for differentiation of benefits between the poor and the nonpoor. The findings of this study can contribute to evaluations of the impact of infrastructure interventions on poverty by bringing new empirical evidence to bear on the welfare and equity implications.

Michael Lokshin is senior economist in the Development Economics Research Group at the World Bank; his email address is mlokshin@worldbank.org. Ruslan Yemtsov is senior economist in the Europe and Central Asia Poverty Reduction and Economic Management unit at the World Bank; his email address is ryemtsov@worldbank.org. The research for this study was conducted as a part of the analytical work for the Georgia Poverty Update. Support from the Poverty Reduction and Economic Management Poverty Reduction Group trust fund is gratefully acknowledged. We thank Louise Cord, Martin Ravallion, and Dominique van de Walle for useful comments. We are also grateful for the support and contributions of Nodar Kapanadze, Alexander Kolev, and Zurab Sajaia. Special thanks to the Georgia Social Investment Fund team for making a wealth of information available.

THE WORLD BANK ECONOMIC REVIEW, VOL. 19, NO. 2, pp. 311-333 doi:10.1093/wber/lhi007
Advance Access publication August 31, 2005
© The Author 2005. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oupjournals.org.

A frequent problem in evaluating the impact of projects in developing economies is the lack of data. A further complication is the increasing number of projects that target communities rather than individuals and rely on demand-driven placement, requiring special evaluation techniques and good-quality data to obtain robust results. Despite these difficulties, researchers are often called on to provide ex post assessments of a project's impact. This article develops a strategy for meeting such requests with minimum data.

Interest in evaluating the effectiveness of community-based infrastructure projects has grown in response to the increasing popularity of such programs. Jalan and Ravallion (2003), Lee and others (1997), and Brockerhoff and Derose (1996) analyze improvements in water and sanitation facilities. Glewwe (1999), Hanushek (1995), and Kremer (1995) evaluate the impacts of school infrastructure rehabilitation projects. Jacoby (2002) and van de Walle and Cratty (2002) analyze the effect of improvements in access to roads. For the most part researchers have had to apply matching techniques combining samples of beneficiaries with samples from regular household surveys, and they have used panel data or instrumental variables to deal with biases arising from the nonrandom placement of a project under evaluation.

Following this practice, the empirical approach proposed here applies propensity score-matched difference-in-difference comparison between project beneficiaries and a control group to purge biases arising from time-invariant unobservable community characteristics that might affect project outcomes. The approach also employs several innovations. First, a carefully designed community survey with sufficiently long recalls is used to compensate for the lack of baseline data. Second, repeated cross-sections from a household survey are aggregated to the community level to obtain longitudinal observations at the community level, with the community treated as a unit of observation. Third, the analysis explicitly considers all infrastructure micro-projects in every community. Fourth, the methodology is applied to assessment of infrastructure rehabilitation projects rather than to the construction of new facilities, the focus of most studies so far.
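The core comparison named above, a propensity score-matched difference-in-difference, can be sketched in a few lines. This is a minimal illustration under strong assumptions, not the authors' procedure: the propensity scores are taken as given rather than estimated, matching is one-to-one nearest neighbor with replacement, and the village records are invented for the example.

```python
from statistics import mean

# Hypothetical village records: (propensity score, outcome before, outcome after).
treated = [(0.80, 100.0, 130.0), (0.60, 90.0, 110.0)]
controls = [(0.75, 95.0, 105.0), (0.55, 80.0, 85.0), (0.20, 70.0, 100.0)]


def nearest_control(score, pool):
    """Pick the control village whose propensity score is closest."""
    return min(pool, key=lambda village: abs(village[0] - score))


def matched_did(treated_villages, control_villages):
    """Average over treated villages of (own change) minus (matched control's change)."""
    gaps = []
    for score, before, after in treated_villages:
        _, c_before, c_after = nearest_control(score, control_villages)
        gaps.append((after - before) - (c_after - c_before))
    return mean(gaps)


estimate = matched_did(treated, controls)
print(estimate)
```

Differencing each village against its own earlier level removes any community characteristic that does not change over time, while matching on the score restricts the comparison to control villages that were similarly likely to receive a project.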
This approach is applied to data from Georgia, one of the poorest countries of the former Soviet Union. The analysis is conducted for all community-level infrastructure rehabilitation projects (schools, roads, and water supply systems) in rural areas between 1998 and 2000. The proposed evaluation strategy produces defendable results on the size of welfare gains from a particular project at the village level and can differentiate gains to the poor and gains to the nonpoor.

The limitations of the approach are not trivial, however. It produces reliable results only for projects affecting a significant fraction of the population, that is, projects large enough for a regular household survey to register their impacts. Consequently, the proposed strategy is probably best suited for large-scale projects deployed with little regard for the need for subsequent evaluation, such as disaster response operations or decentralized public investment programs in developing areas.

Economic and political turmoil have led to a dramatic fall in living standards in Georgia, once one of the richest republics of the former Soviet Union. Independent Georgia inherited developed infrastructure facilities, but these have been deteriorating rapidly. Rural areas have been hit particularly hard, suffering from increasing economic marginalization and impoverishment (World Bank 2003b). The decay of infrastructure providing public services has resulted in deteriorating nonincome indicators, particularly those related to child welfare: school enrollment rates have fallen, maternal mortality ratios have risen, and infant mortality rates have remained high (World Bank 2003a).

Georgia has few resources available for rehabilitating its severely decayed infrastructure. Donor funds have been required to finance basic maintenance of roads, repair of water and sanitation systems, and urgent rehabilitation of school facilities.
In the early 2000s as many as 218 donor organizations were actively involved in such projects (UNOCHA 2003).

Because of the backlog in deferred and neglected maintenance, the government and donors will continue to face difficult tradeoffs between capital investments and spending on critical current needs. To make such choices, it is important to know which rehabilitation and maintenance programs address the most critical needs. Little is known, however, about the impact of these activities on households. Although every donor-financed operation includes an evaluation module, these evaluations often focus only on project-specific outputs. In the rare cases when beneficiaries have been the center of attention, lack of data and ad hoc choice of the control groups have limited the usefulness of evaluation. Evaluation attempts have focused exclusively on projects sponsored by the agency requesting the evaluation, overlooking the many micro-projects implemented in parallel (and often with little coordination) by different donors in the same community.

To help channel scarce public resources to their best uses, this article investigates the welfare impact of various types of rural infrastructure rehabilitation projects and evaluates their targeting (or placement). It also provides evidence on whether such activities benefit the poor, a useful input to the implementation of a poverty reduction strategy in Georgia.

The data used for the analysis come from a household survey and a community survey. The household-level information is provided by the ongoing multitopic Survey of Georgian Households (SGHH). The survey, begun in July 1996 and conducted quarterly by the State Department of Statistics, collects information on the demographic characteristics of household members, their labor market activities, and their access to social services. One section of the questionnaire gathers information on income and consumption expenditures and ownership of assets.
Modules collecting information on health and education outcomes were introduced in the first quarter of 2000.

The household survey uses a two-stage stratified rotating sample of 2,800 households, representative at the national, urban, rural, and regional levels. At the first stage 282 primary sampling units are randomly selected from the stratified list of 12,000 census units with probability proportional to size. In rural areas the primary sampling units roughly correspond to villages. At the second stage, 7-20 households are randomly selected from each primary sampling unit. These households stay in the survey for four consecutive quarters and are then replaced by different households from the same sampling unit. This process continues until the list of households in the sampling unit is exhausted. At that point, another primary sampling unit is chosen from the same stratum. Each sampling unit tends to remain in the survey for years, making it possible to construct village-level panels spanning multiple periods. In total, the survey covers 174 rural population sites.

Community-level data were collected through the Rural Community Infrastructure Survey (RCIS) conducted in May-June 2002 by the Georgian Opinion Research Business International with the support of the State Department of Statistics and the World Bank. The survey covers all 174 rural population sites from the household surveys. In addition, to expand the sample of beneficiaries, 75 villages not covered by the household surveys were selected from a list of 360 villages supplied by a major donor involved in the community projects, giving a total sample of 249 villages.¹ A typical village in the sample benefited from multiple infrastructure projects: 57 percent of survey sites reported having two or more projects carried out between 1998 and 2002 (the maximum was 15 projects). Forty-nine villages (20 percent of the sample) had no projects.
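The two sampling stages of the household survey (selection of primary sampling units with probability proportional to size, then a random draw of households within each unit) can be sketched in Python. Systematic selection is only one common way to draw a probability-proportional-to-size sample and is an assumption here; the text does not say which procedure is used, and the unit names and sizes are invented.

```python
import random
from itertools import accumulate


def pps_systematic(sizes, n, rng):
    """Draw n units with probability proportional to size (systematic PPS).

    sizes: dict mapping unit id to its size measure (e.g. household count).
    Returns a list of n unit ids; a unit larger than the sampling interval
    can be drawn more than once.
    """
    units = list(sizes)
    cumulative = list(accumulate(sizes[u] for u in units))
    step = cumulative[-1] / n          # sampling interval
    start = rng.random() * step        # random start within first interval
    picks, idx = [], 0
    for k in range(n):
        point = start + k * step
        # Advance to the unit whose cumulative range covers this point.
        while idx < len(units) - 1 and cumulative[idx] <= point:
            idx += 1
        picks.append(units[idx])
    return picks


def second_stage(household_ids, k, rng):
    """Simple random sample of up to k households within a selected unit."""
    return rng.sample(household_ids, min(k, len(household_ids)))


# Hypothetical census units with their household counts.
rng = random.Random(0)
census_units = {"u1": 120, "u2": 40, "u3": 0, "u4": 300, "u5": 90}
selected = pps_systematic(census_units, n=3, rng=rng)
households = second_stage(list(range(30)), 10, rng)
```

Larger units are more likely to be selected at the first stage, while an empty unit such as `u3` can never be drawn.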
One of the main purposes of the rural infrastructure survey is to collect retrospective information on infrastructure projects. The survey questionnaire includes sections on the state of transport infrastructure, water supply systems, schools, kindergartens, and healthcare facilities. It also covers sources of livelihood for the local population based on detailed modules on agricultural and nonfarm activities. One section of the questionnaire contains detailed questions on all infrastructure rehabilitation projects carried out since 1996: dates of initiation and completion of each project, source of funds, and sector. Information was collected from key informants in rural sites, such as local authorities, informal leaders, nongovernmental organizations, and social assistance workers. Because the survey covers all villages covered by the household surveys, there is complete overlap between these two sources of data. This permits the use of both community and household information for analysis.²

1. The data are from the Georgia Social Investment Fund. Detailed information on the Rural Community Infrastructure Survey sample frame, methodology, and questionnaire design can be found in GORBI and Georgia State Department for Statistics (2003).
2. Although this design feature of the community infrastructure survey compensates for the absence of proper baseline data for some indicators used in the analysis, the long recall period for the village-level outcome measures could introduce bias. The reliability of recall data is often questioned on the basis of framing bias (see, for example, Kahnemann 2003). To minimize such bias, group interviews were conducted to reduce individual heterogeneity in the responses of local informants.

Evaluation Sample

The community infrastructure survey collected information on 549 rehabilitation projects funded by local and international agencies covering schools (28 percent of projects), road infrastructure (27 percent), water supply systems (11 percent), medical facilities (6 percent), kindergartens (3 percent), and other infrastructure rehabilitation projects (25 percent). The three largest groups of interventions were evaluated: schools, road infrastructure, and water system rehabilitation projects. To fit the recall period of the rural infrastructure survey, the analysis uses projects that began on or after 1998 (the baseline) and that were completed by January 2001. That yielded a total of 144 projects in 106 villages.

Impact Indicators

Two sets of impact indicators were identified for each project, one drawn from the community infrastructure survey and one from the household survey. Community-level indicators based on the community infrastructure survey measure changes between 1998 and 2002.³ Village-level averages, which are based on data from the household survey, compare outcomes in 2000 and 2001. This arrangement is dictated by data availability and creates some disconnect between the two timeframes. It is not a big problem, however, because the majority of projects in the treatment group were completed in 2000.⁴

The indicators are listed in appendix table A1, and their values are shown in table 1, averaged across villages in the sample and calculated at the beginning and at the end of the timeframe chosen for the analysis.⁵ To deal with the problem of several influential outliers, changes in some continuous variables were recoded into simple categorical variables, reporting the balance between positive and negative changes.

Some outcome indicators reflect alarming trends in access to education, quality of road infrastructure, and availability of piped water. Only 68 percent of villages had all school-age children in school in 1998; by 2002 only 59 percent did. Household-level data suggest that close to 8 percent of children in an average village missed more than 30 days of classes in 2000.
This indicator improved for all villages in the sample by the end of 2001. For 1998 as many as 91 percent of villages reported that the quality of their main roads was inadequate. This indicator improved considerably by 2002, but respondents in 71 percent of villages still complained about road quality. In more than half the villages, it took more than four hours for an ambulance to respond to a call. In 1998 and 2002 only about 56 percent of rural households were connected to a piped water supply. Piped water was available an average of eight hours a day. The high and increasing incidence of waterborne diseases among children is of particular concern. By 2001 as many as 2 percent of children below age seven reported illnesses related to poor water quality in the month preceding the survey.

3. The treatment group includes villages with projects completed before the beginning of 2001, allowing at least one year to pass before assessing project benefits.
4. Omitting these cases would reduce the number of usable observations, which is a critical constraint in the study. We chose to retain all the villages but to exercise care in interpreting results.
5. Note the difference in the definition of "before" and "after" in community infrastructure survey and household survey indicators. Also, the number of observations reported for the household survey in table 1 differs across indicators. For example, data on school enrollment rates are available for all 102 villages with household survey data, but information on ambulance arrival time is available for only 68 villages. This attrition is clearly related to the low frequency with which rural residents use emergency care.

TABLE 1. Summary Statistics for Main Outcome Indicators

                                                              Beforeᵃ             Afterᵃ              Change
Outcome indicator                             Source  Number  Mean      SD        Mean      SD        Mean      SD
All children are enrolled in school           RCIS            0.683     0.466     0.590
Number of pupils                              RCIS            293.781   278.019   283.129
Number of graduates                           RCIS            23.450    21.955    21.635
Access to educationᶜ                          RCIS
School enrollment rate                        SGHH                                0.978
Share of pupils missing more than 30 days     SGHH                                0.063
Unsatisfactory schooling conditions           SGHH                                0.054
Expenditures on schooling                     SGHH                                48.470
Incidence of respiratory diseases (child)     SGHH                                0.079
Time to district capitalᶜ                     RCIS
Subjective assessment of road (bad)           RCIS                                0.711
Barter trade                                  RCIS                                0.498
Small enterprises                             RCIS                                0.486
Time for ambulance to arrive                  SGHH                                0.528
Sales of agricultural products                SGHH                                114.760
Female off-farm employment                    SGHH                                0.129
Nonagricultural employment                    SGHH                                0.154
Household transport expenditures per capita   SGHH                                1.642
Incidence of trauma                           SGHH                                0.001
New water sourcesᶜ                            RCIS    249                                             0.092     0.290
Number of livestockᶜ                          RCIS    249                                             0.663     0.474
Piped water in the household                  SGHH    103     0.561     0.444     0.565     0.429     0.006     0.227
Hours of piped water supply                   SGHH    103     8.380     9.456     8.788     9.533     0.767     4.852
Incidence of waterborne diseases (total)      SGHH    102     0.004     0.008     0.008     0.028     0.000     0.008
Incidence of waterborne diseases (child)      SGHH    91      0.008     0.028     0.022     0.097     0.007     0.032
Expenditures on bottled water                 SGHH    89      1.608     0.872     1.583     0.895     -0.019    0.421

Note: Values are averaged across villages in the sample calculated at the beginning and the end of the timeframe chosen for analysis.
ᵃFor the RCIS, "before" is 1998 and "after" is 2002; for the SGHH, "before" is 2000 and "after" is 2001.
ᵇRecoded change indicator: the share of changes in a positive direction minus the share of changes in a negative direction. Some villages have missing values for an indicator before or after project completion; thus the indicator of change may differ from a simple ratio of before and after project indices.
ᶜBased on a direct change question in the survey.
ᵈChange in the log of per capita values.

The criteria for project placement vary among agencies operating in Georgia, but in most cases placement criteria take into account the extent of poverty or its correlates, the state of infrastructure in a village, or regional characteristics. Many projects rely on demand-driven targeting mechanisms. Whether a particular village gets a project can depend on the village's ability to seek support from implementing agencies. Villages are chosen by project managers based on characteristics that could be correlated with the expected outcomes of a project. Because of such nonrandom placement, a simple comparison of outcomes between villages with projects and villages without projects would be invalid.

If selection of a village for a project is based purely on observable characteristics, a propensity-score matching method can be used to correct for selection bias (Rosenbaum and Rubin 1983; Rubin 1973). The propensity score measures the probability that a project is implemented in a village as a function of that village's observed preintervention characteristics. Villages with projects (the treatment group) are matched with villages without projects (the control group) on the basis of the propensity score.
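The matching step can be sketched in Python, together with the comparison of outcome changes that the article's matched difference-in-difference approach builds on it. Nearest-neighbor matching with replacement and a caliper is an assumption made for illustration (the text does not state which matching algorithm is used), the propensity scores are taken as given rather than estimated, and all names and numbers are hypothetical.

```python
from statistics import mean


def match_nearest(treated, controls, caliper=0.1):
    """Match each treated village to the control with the closest score.

    treated, controls: dicts {village_id: propensity_score}.
    Matching is with replacement; treated villages with no control within
    the caliper are left unmatched and drop out of the comparison.
    """
    pairs = {}
    for t_id, t_score in treated.items():
        c_id = min(controls, key=lambda c: abs(controls[c] - t_score))
        if abs(controls[c_id] - t_score) <= caliper:
            pairs[t_id] = c_id
    return pairs


def matched_did(pairs, before, after):
    """Mean over matched pairs of (treated change - control change)."""
    gaps = [
        (after[t] - before[t]) - (after[c] - before[c])
        for t, c in pairs.items()
    ]
    return mean(gaps)


# Hypothetical propensity scores and village-level outcomes.
treated = {"T1": 0.62, "T2": 0.35, "T3": 0.95}
controls = {"C1": 0.60, "C2": 0.30, "C3": 0.10}
before = {"T1": 0.50, "T2": 0.40, "T3": 0.70,
          "C1": 0.55, "C2": 0.45, "C3": 0.20}
after = {"T1": 0.60, "T2": 0.55, "T3": 0.90,
         "C1": 0.57, "C2": 0.45, "C3": 0.25}

pairs = match_nearest(treated, controls)
impact = matched_did(pairs, before, after)
```

In this example `T3` is left unmatched because no control village has a similar score, which is the kind of comparison the matching step is meant to exclude.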
Following Chen and Ravallion (2003), the outcome measure I_it for a project in the ith village at date t is defined as in equation 1, where I*_it is the outcome for a village if the project is not implemented, and G_it is the gain to village i from an outcome attributable to a project. Then the estimate of the average impact of the project on a treatment village (dummy variable D_i = 1) can be decomposed as in equation 2. From equation 2 the estimation bias is given by equation 3. There is no bias in a simple comparison of the means between treatment and control villages if the terms of equation 3 are equal. The cross-sectional propensity score method assumes that this equality holds conditional on a set of observed characteristics X (equation 4). Thus, the cross-sectional propensity score method produces an unbiased estimate of the project effect if project placement is based purely on a village's observed characteristics.

However, some unobserved characteristics of the village that are correlated with project outcomes might also be correlated with project placement. This correlation can introduce bias in the estimation of project impact. For example, an active parent group might lobby the village authorities to pursue a school rehabilitation project. This same group of active parents might then become involved in the education process and positively affect school outcomes for their children. If the evaluation does not take into account the differences in parental activity between treatment and control villages, the effectiveness of the school project will be overestimated.

If the preintervention differences between treatment and control villages are assumed to be the result of time-invariant unobserved factors, the difference-in-difference method can be used to correct for possible bias. The preproject difference in outcomes may be subtracted from the postproject differences for the same villages.
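The decomposition described above can be written out under a standard potential-outcomes setup. The LaTeX below is a sketch consistent with the definitions in the text (outcome I, no-project outcome I*, gain G, treatment dummy D, observed characteristics X), with t = 1 denoting the postproject period and t = 0 the preproject period. The exact form and numbering of the expressions are assumptions, as is the condition that no gain accrues before the project (G_{i0} = 0).

```latex
\begin{align}
I_{it} &= I^{*}_{it} + D_i\,G_{it} \tag{1}\\
E(I_{i1}\mid D_i=1) - E(I_{i1}\mid D_i=0)
  &= E(G_{i1}\mid D_i=1)
   + \bigl[E(I^{*}_{i1}\mid D_i=1) - E(I^{*}_{i1}\mid D_i=0)\bigr] \tag{2}\\
B &= E(I^{*}_{i1}\mid D_i=1) - E(I^{*}_{i1}\mid D_i=0) \tag{3}\\
E(I^{*}_{i1}\mid X, D_i=1) &= E(I^{*}_{i1}\mid X, D_i=0) \tag{4}
\end{align}

% Subtracting the preproject difference from the postproject difference,
% with G_{i0} = 0 so that I_{i0} = I^{*}_{i0}:
\begin{align}
\bigl[E(I_{i1}\mid D_i=1) - E(I_{i1}\mid D_i=0)\bigr]
 - \bigl[E(I_{i0}\mid D_i=1) - E(I_{i0}\mid D_i=0)\bigr]
 &= E(G_{i1}\mid D_i=1) \notag\\
 &\quad + \bigl[E(I^{*}_{i1}-I^{*}_{i0}\mid D_i=1)
              - E(I^{*}_{i1}-I^{*}_{i0}\mid D_i=0)\bigr] \notag
\end{align}
```

Written this way, the remaining bias term depends only on whether the no-project trend differs between treatment and control villages, not on differences in their levels.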
The underlying assumption in this method is that the time trend in the control group is an adequate proxy for the time trend that would have occurred in the treatment group in the absence of an intervention (equation 5). The mean difference in difference for the outcome is estimated by taking the expectation of equation 1 over all N sample villages using equation 5, which gives equation 6. If the outcomes in period 0 are not correlated with project placement, equation 6 estimates the mean changes in outcomes for the treatment villages.

This study uses the matched difference-in-difference method, which combines propensity score-matching and difference-in-difference methods. Recent studies by Heckman and others (1997, 1998) have argued that combining these methods can substantially reduce the bias found in other nonexperimental evaluations. First, villages from the control and treatment groups are matched using propensity score matching. This matching removes the selection bias due to the observed differences between treatment and control villages. Then the difference-in-difference method is applied to correct for possible bias due to the differences in time-invariant unobserved characteristics between the two groups. To evaluate the impact of the project, the changes in outcome measures are compared between matched villages from the treatment and control groups.

There is another form of bias that these methods cannot remove, which arises from time-variant unobservable characteristics correlated with both project placement and the outcomes of the intervention.⁶ In particular, project placement could be based on unobserved community characteristics that are correlated with changes in the expected project outcomes. However, there are reasons to believe that this bias may not arise in the context of micro-projects in rural Georgia. The project placement procedures used by the implementing agencies are based on formal criteria that capture exclusively the current state of affairs.
Thus, placement can reduce (but not completely eliminate) possible bias from time-variant unobservables.

6. This problem is thought to be severe for infrastructure programs in poor areas if the deficient state of infrastructure in the initial period not only attracts the rehabilitation project, but also reduces future growth (Jalan and Ravallion 1998).

IV. RESULTS

The project placement mechanism for each type of intervention is modeled first as a function of a large set of variables from the community infrastructure survey that include village-level aggregates on geographic, demographic, and socioeconomic conditions (table 2). The model also controls for the presence of other projects in the same village. For example, in the specification that models the probability of a village participating in a school rehabilitation project, two dummy variables are included to reflect the presence of road and water projects.

The probit estimates for three types of interventions are shown in table 2. The adjusted pseudo-R² of these estimations ranges from 0.156 for the school projects to 0.393 for the water projects. These are acceptable levels of explanatory power. A high R² could indicate the existence of fundamental differences between the characteristics of project and nonproject villages, which would make the formation of a proper control group very problematic. Only a few coefficients in the table are significant (the indicator of natural disasters, for example). This should not be taken as a sign of problems in forming a control group because the empirical specifications include many correlated variables and the purpose of the estimation is to calculate the propensity score and not to model an underlying selection mechanism.

TABLE 2. Probit Estimates of the Probability of a Village Participating in a School, Road, or Water Project

Variable    School Project (Coefficient, SE)    Road Project (Coefficient, SE)    Water Project (Coefficient, SE)    Summary Statistics (Mean, SE)
School project (binary)
Road project (binary)
Water project (binary)
Total population (1.082)
If internally displaced person in the village (binary)
Agriculture only (binary)
Experienced disaster (2.087)
Experienced flood (binary)
Mountain area (binary)
Alpine area (binary)
Distance to district center (0.808)
Distance to market (49.308)
Railroad (binary)
Interstate highway (binary)
Asphalt road (binary)
Number of schools (0.802)
Number of large enterprises (3.097)
Small enterprise (binary)
Police station (binary)
Post office (binary)
Restaurant (binary)
Proportion of households with a phone (0.284)
Proportion of households with a toilet (0.206)
Unreliable electric power supply (binary)
Proportion of households with piped water (0.421)
Proportion of buildings with wooden walls (0.472)
Proportion of buildings with dirt floors (0.231)
Trade by the roadside (binary)
Regional dummy variables
  Kaheti (omitted category) (binary)
  Inner (Shida) Qartli (binary)
  Lower (Kvemo) Qartli (binary)
  Samskhe-Djavakheti (binary)
  Achara (binary)
  Guria (binary)
  Samegrelo (binary)
  Imereti (binary)
Constant
Sample size
Adjusted pseudo-R²

Source: Authors' computations based on RCIS 2002.

School Rehabilitation Projects

Typically, school projects in Georgia focus on improving school buildings: repairing roofs, windows, and floors; replacing pipes; installing sanitary and heating equipment; and repainting walls. These projects may yield several types of benefits to the community. School rehabilitation may improve both enrollment and attendance rates.
Better heating and repaired windows could be particularly important in Georgia, where some rural schools close for several weeks in winter because of frigid classrooms (Orivel 1998). Changes in household expenditures on schooling can be used as an indicator of the private response to investment in school rehabilitation. The subjective assessments of schooling conditions provide a useful check on results based on objective measures.

School rehabilitation projects were completed in 61 villages (about a quarter of all villages) in the community infrastructure survey sample by 2001. Thirty-seven of these villages were also covered by the household survey. The initial (unmatched) control group was constructed from villages without school projects and villages with incomplete school projects at the end of 2001.

For the community infrastructure survey data three outcome indicators are reported at the community level based on difference-in-difference estimation of the impact of school rehabilitation projects for the unmatched control group and propensity score match-constructed control group (table 3). No significant differences are detected between the treatment group and the two control groups for the share of villages reporting that all children are enrolled in school, which declined for all groups between 1998 and 2002. In approximately a third of project villages the number of pupils and the number of graduates increased while decreasing in the control group (this difference is statistically significant). The village-level subjective assessment indicator shows a significant change in the perception that access to education improved between 1998 and 2002 for the treatment villages but not for the control villages.

TABLE 3. Difference-in-Difference Estimates of the Average Impact of School Rehabilitation Projects

                                                Treatment  Unmatched Sample         Matched Sample
Outcome indicators                              Group      Control Group  p-Value   Control Group  p-Value
RCIS
All children are enrolled in school             -0.066     -0.101         0.251     -0.066         0.500
If number of pupils increased                   0.328      0.216          0.056     0.190          0.051
If number of graduates increased                0.373      0.327          0.268     0.237          0.059
Access to education has improved                0.361      0.266          0.089     0.213          0.036
SGHH
School enrollment                               0.059      -0.004         0.102     0.000          0.117
Share of pupils missing more than 30 days       -0.057     -0.001         0.063     0.020          0.019
Unsatisfactory schooling conditions             -0.020     -0.014         0.584     -0.013         0.611
Expenditures on schooling                       1.249      1.094          0.365     1.544          0.772
Incidence of respiratory diseases in children   -0.120     -0.056         0.083     -0.056         0.160

Source: Authors' computations based on the RCIS and the SGHH.

A more detailed set of outcome indicators is estimated for the household survey data. In treatment villages, primary and secondary school enrollment rates increased by 6 percentage points between 2000 and 2001; there was no change in matched control villages. However, this difference is only marginally significant. A more responsive indicator of school attendance shows clearer benefits from school improvements. The share of pupils missing classes dropped by more than 5 percentage points in treatment villages, and it increased by 2 percentage points in the matched control group. The health impact of school rehabilitation is substantial.
The incidence of respiratory diseases among school-age children declined by 12 percentage points in villages with a project compared with a decline of slightly more than 5 percentage points in villages without a project. No significant changes in parents' assessments of schooling conditions were detected.

Overall, the estimation results fit the prior expectations. In Georgia, where primary education is compulsory, the most sensitive gauge of project impacts is changes in attendance, the outcome indicator for which the results are the most significant. It can also be speculated that if school rehabilitation projects induce a positive response in one indicator in treatment villages, this could lead to improvements in other indicators that intuitively are less sensitive to this type of intervention. Improvement in the health status of school-age children is one example.

Improvements in Road Infrastructure

Road and bridge rehabilitation often means repaving existing roads, restoring road structures damaged or destroyed by flooding and earthquakes, and widening road intersections and bridges. Such rehabilitation can reduce commuting time and improve access to markets. Investments in roads and bridges are likely to generate new income opportunities for agricultural households, with impacts far beyond the project site.⁷ Several labor market studies have identified off-farm employment, an activity highly dependent on transportation, as the driving force behind welfare change in Georgia (Bernabè 2002; Yemtsov 2001). Poor access to product markets appears to constrain growth and to perpetuate barter trade (Cord and others 2003).

7. According to studies of other countries in Eastern and Central Europe, poor road quality can add 28-44 percent to transportation costs for local producers and to commuting costs for rural dwellers. Lowering transportation costs will have a dramatic effect on the poor, because poor households generally tend to be located in very remote areas (World Bank 2003c).

By the end of 2000 road improvement projects were completed in 41 villages, or 19 percent of the community infrastructure survey sample; 36 of these villages were also covered by the household survey. The initial control group was constructed from all villages without road or bridge projects completed between 1998 and 2001. The most immediate outcome indicator of a road rehabilitation project, time spent commuting to the district center, shows a 36-minute reduction in project villages, but these gains are not statistically different from changes for the control group (table 4).

Indicators linked to the economic impact of projects show more pronounced trends. The share of villages with active nonagricultural small and medium-size enterprises increased in project villages, a statistically significant change compared with the propensity score-matched control group. The share of villages reporting barter exchange among the main channels for marketing agricultural products dropped significantly as a result of the road projects while increasing in control villages. Subjective assessments reflect no reaction to road rehabilitation interventions.

Off-farm employment and female wage employment rates increased in villages affected by road rehabilitation but declined in the control villages. Indicators reflecting changes in the per capita market sales of agricultural products, however, showed no improvement in the treatment villages. Time for an ambulance to arrive improved in 24 percent of the treatment villages. This compares favorably with the worsening of this indicator in the propensity score-matched control group. The difference between the control and treatment groups in the rate of road accidents is not statistically significant.

TABLE 4. Difference-in-Difference Estimates of the Average Impact of Road and Bridge Rehabilitation Projects

                                                Treatment  Unmatched Sample         Matched Sample
Outcome indicators                              Group      Control Group  p-Value   Control Group  p-Value
RCIS
Travel time to district center
Subjective assessment of road (bad)
Barter trade
Small enterprises
SGHH
Time for ambulance to arrive
Sales of agricultural products
Female off-farm employment
Nonagricultural employment
Household transport expenditures per capita
Incidence of trauma

Source: Authors' computations based on the RCIS and the SGHH.

Some of the effects from road rehabilitation projects could be difficult to capture because of their data requirements and long-run nature. For example, indicators of improved road safety impose high demands on data coverage because accidents occur rarely.

Water System Rehabilitation Projects

Water projects include a wide range of works: installing new or repairing existing communal water tanks, installing water treatment equipment, fitting new pumps, repairing or installing pipes, and rehabilitating wastewater management networks. Benefits could include a reduction in the incidence of waterborne disease (Jalan and Ravallion 2003), less reliance on more expensive alternatives to piped water, and more time for child schooling and for productive activities among adults, particularly women.

Coverage was less extensive for water rehabilitation projects than for school or road projects. In the community infrastructure survey, 17 villages (7 percent of the sample) had a water system rehabilitation project completed by the end of 2001. Only nine villages in this group were also covered by the household survey. The small number of cases makes this an important test of the limits of the proposed evaluation strategy.
The impact evaluation estimations show that the range of drinking water supply options expanded in 24 percent of project villages (table 5). In the control group, only 8 percent of villages in the unmatched and 6 percent in the matched sample reported a new water supply option available between 1998 and 2002. Coverage of piped water supply increased 11 percentage points in the treatment villages compared with no change or even slight deterioration of coverage in the control groups. The number of hours that piped water is available increased substantially in the project villages while declining considerably in the matched control group. Comparison of changes in the incidence of waterborne diseases shows a marginally significant effect. Other impact indicators show changes in the expected direction (with the exception of changes in the female employment rate), but the differences between the treatment and control group averages are insignificant.

TABLE 5. Difference-in-Difference Estimates of the Average Impact of Water System Rehabilitation Projects

                                                Treatment  Unmatched Sample         Matched Sample
Outcome indicators                              Group      Control Group  p-Value   Control Group  p-Value
RCIS
New channels of water supply                    0.235      0.082          0.086     0.059          0.041
Increase in livestock                           0.647      0.664          0.447     0.529          0.272
SGHH
Piped water in the household                    0.110      0.002          0.216     -0.063         0.243
Hours of piped water supply                     0.980      0.779          0.565     -0.785         0.665
Female wage employment                          -0.055     0.004          0.181     -0.020         0.382
Incidence of waterborne diseases (total)        -0.006     0.001          0.196     -0.001         0.123
Incidence of waterborne diseases in children    0.000      0.007          0.037     0.000
Expenditure on bottled water                    -0.018     -0.018         0.500     -0.027         0.514

Source: Authors' computations based on the RCIS and the SGHH.

Difficulties in observing significant effects of water rehabilitation projects could be linked to three factors.
First, water projects were the least "popular" in rural Georgia according to the community infrastructure survey, resulting in too small a sample to capture the effect. Second, it is difficult to extract specific indicators reflecting improved access to water from a regular multitopic survey. Third, a distinct feature of water projects is partial coverage of the population. In many villages only certain clusters of houses are connected to pipes and therefore are direct beneficiaries of this kind of intervention. As a result, the effect observed at the village level may not fully reflect the heterogeneity in impact among project beneficiaries. This issue is addressed in the next section.

Distributional Impact of Infrastructure Rehabilitation Projects

Households within the same village may benefit differently from a particular project. Jalan and Ravallion (2003) find that piped water projects, for example, have different impacts for poor and nonpoor households in India. To assess whether infrastructure rehabilitation projects had different impacts on the living standards of poor and nonpoor households in Georgia, the main outcome indicators were reconstructed using subsamples of poor and nonpoor households from each village covered by the household survey. Community-level impact indicators from the community infrastructure survey were omitted because these cannot be differentiated for the poor and the nonpoor.

Poor households seem to have benefited more than nonpoor households from school rehabilitation projects (table 6). The most sensitive indicator, improvements in school attendance, shows that school rehabilitation has a significant effect on the poor. The share of children from poor households missing classes declined by 11 percentage points as compared with about 2 percentage points for nonpoor households. Similarly, health outcomes improved more among children from poor households than from nonpoor households.
Changes in school enrollment rates, however, demonstrate a better response for children from nonpoor households, whereas differences in changes in private educational expenditures are ambiguous.

TABLE 6. Poor versus Nonpoor: Difference-in-Difference Estimates of the Average Impact of the Project for Three Types of Interventions

Columns: Poor (Treatment Group, Control Group, p-Value); Nonpoor (Treatment Group, Control Group, p-Value); Difference(a) (p-Value)

School rehabilitation
  School enrollment
  Share of pupils missing more than 30 days
  Unsatisfactory schooling conditions
  Expenditures on schooling
  Incidence of respiratory diseases in children
Road and bridge rehabilitation
  Time for ambulance to arrive
  Sales of agricultural products
  Female off-farm employment
  Nonagricultural employment
  Household transport expenditures per capita
  Incidence of trauma
Water system rehabilitation
  Piped water in the household
  Hours of piped water supply
  Female wage employment
  Incidence of waterborne diseases, total
  Incidence of waterborne diseases in children
  Expenditure on bottled water

Source: Authors' computations based on the RCIS and the SGHH.
a. Difference in means between poor and nonpoor households from treatment group villages.

The distributional impact of road rehabilitation projects varies across outcome indicators. The nonpoor clearly benefited more in improved access to emergency medical assistance and in opportunities for nonagricultural employment. Female off-farm employment rates, on the other hand, show greater positive change among the poor. Interpreting results for the agricultural product sales indicator is more complex (World Bank 2003c). In recent years the sales of agricultural products plummeted for the whole country, and the decline was particularly strong for rich households, which had been better integrated into markets. This is what the impact analysis results show here, suggesting that road quality is not the main driver in this process.
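The subsample comparison used in this section can be illustrated with a minimal sketch, given here only as an aid to reading the estimates. It computes a difference-in-difference estimate (the mean change in treatment villages minus the mean change in control villages) separately for poor and nonpoor households. The function name `did_estimate` and all data values are hypothetical and are not taken from the surveys; the sketch also omits the propensity score matching and the significance tests applied in the article.

```python
from statistics import mean


def did_estimate(treated, control):
    """Difference-in-difference estimate.

    Each argument is a list of (before, after) outcome pairs, one pair
    per village. Returns the mean change among treated villages minus
    the mean change among control villages.
    """
    treated_change = mean(after - before for before, after in treated)
    control_change = mean(after - before for before, after in control)
    return treated_change - control_change


# Hypothetical village-level shares of pupils missing more than 30 days,
# as (before, after) pairs. Illustrative numbers only, not survey data.
poor_treated = [(0.30, 0.18), (0.26, 0.16)]
poor_control = [(0.28, 0.27), (0.24, 0.23)]
nonpoor_treated = [(0.12, 0.10), (0.10, 0.08)]
nonpoor_control = [(0.11, 0.11), (0.09, 0.09)]

did_poor = did_estimate(poor_treated, poor_control)
did_nonpoor = did_estimate(nonpoor_treated, nonpoor_control)

print(f"Poor households:    {did_poor:+.2f}")
print(f"Nonpoor households: {did_nonpoor:+.2f}")
```

In this made-up example the estimate is more negative for poor households, which is the pattern one would read as a larger reduction in absenteeism among the poor. Whether such a gap is statistically meaningful would still need a test of the difference in means.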
The key benefits from water projects are related to improvements in health status, which were found mainly among nonpoor households. Changes in the incidence of waterborne diseases among poor households were not statistically different between treatment and control groups.

This analysis of the impact of community-level investments in infrastructure rehabilitation in rural Georgia on household well-being combined household- and community-level survey data, controlling for time-invariant unobservable characteristics at the community level by applying propensity score-matching difference-in-difference comparisons.

The results indicate that improvements in school infrastructure produced nontrivial gains in school enrollment rates, raised school attendance, and reduced health risks for school-age children. Road and bridge rehabilitation projects generated clear economic benefits at the community level. The number of small and medium-size enterprises increased, and the importance of barter trade fell. Access to emergency medical assistance improved unambiguously. For water system rehabilitation interventions, the most unambiguous effect is the reduction of the incidence of waterborne diseases. The impact of water projects measured by other indicators is less clear-cut. To a large degree, the ambiguity is related to the small number of project villages in the sample.

Community-level interventions had different distributional impacts. School rehabilitation improved school attendance and children's health status among the poor more than it did among the better off. Road projects benefited the poor and nonpoor in different ways. The nonpoor gained more from improved accessibility to emergency medical assistance. Expansion of nonagricultural job opportunities favored women from poor households.
That better-off households fully accounted for the observed decline in the incidence of waterborne diseases suggests that the benefits of water rehabilitation projects accrue mostly to the nonpoor.

It is encouraging to see such richness in the results considering that the analysis relied on modest additional data collection. The methodology demonstrates that evaluation of project impact is possible even in the absence of proper baseline survey data. Carefully designed community surveys (collecting retrospective information) in combination with ongoing nationally representative household surveys could provide a feasible and low-cost alternative to standard before-and-after techniques, perhaps stimulating wider use of robust impact assessment methodologies for community-level projects in developing countries.

Nevertheless, it is important to emphasize that proper baseline data are crucial for a credible evaluation. Using retrospective data to substitute for baseline data, as is done here, risks introducing recall bias, which can influence the precision of the results. Retrospective data cannot fully substitute for baseline data, but they can serve as a defensible fallback when first-best options are not feasible.

Also important, the proposed strategy fails to produce robust results for projects affecting a small fraction of the population. For example, most of the results for water system rehabilitation projects are only marginally significant or insignificant. Data limitations required dropping health clinic and kindergarten rehabilitation projects from the analysis.

Thus, the proposed methodology is probably best suited for large-scale community-driven micro-projects deployed with little regard for subsequent evaluation, such as emergency or disaster response operations. Government-run decentralized public investment programs in developing economies are another good candidate. Projects of this type play an important role in developing economies, and assessing their effectiveness can help in making informed choices about their focus, scope, and delivery mechanisms.

APPENDIX

TABLE A.1. Definition of Indicators

Indicator: Definition
All children are enrolled in school: Dummy, whether all children in the village 7-15 years old are enrolled in school
Number of pupils: Total number of pupils enrolled in all primary and secondary schools located in the village
Number of grads: Total number of pupils graduating in a given year from all schools located in the village
Access to education: Assessment by key informant of the change (1998-2002) in the access to education by village population
School enrollment rate: Share of village school-age children (7-15 years old) currently enrolled in school
Share of pupils missing more than 30 days: Share of village children enrolled in school who missed more than 30 days in the last school year
Unsatisfactory schooling conditions: Share of parents assessing conditions as unsatisfactory in the school in which their child is currently enrolled
Expenditures on schooling: Average lari per child attending school in past 12 months for transport, textbooks, fees, and other school expenses
Incidence of respiratory diseases (child): Share of children enrolled in schools who suffered from respiratory disease over the past 30 days
Time to district capital: Reduction in time between 1998 and 2002, in minutes, to reach the district capital by the most usual means of transportation
Subjective assessment of road (bad): Share of villages where key informants assess the main road quality as "bad" or "very bad"
Barter trade: Share of villages where barter is listed among the three most important ways to sell agriculture products
Small enterprises: Share of villages which have operating small or medium-size manufacturing or construction firms (fewer than 10 employees)
Time for ambulance to arrive: Average time for emergency assistance to arrive at households that called for an ambulance in the past 30 days
Sales of agricultural products: Average sales of crops and animal products over the last quarter, in lari per household
Female off-farm employment: Share of women in the working age population who were employed for wages for at least one hour during the last week
Nonagricultural employment: Share of working age population employed for pay or self-employed outside agriculture in the last three months
Household transport expenditures per capita: Spending on transportation services and means of transport over the past quarter in lari per capita
Incidence of trauma: Share of population reporting suffering from trauma and burns in the past 30 days
New water sources: Share of villages where new water supply options became available between 1998 and 2002
Number of livestock: Share of villages where key informants report considerable or some increase in livestock owned between 1998 and 2002
Piped water in the household: Share of households that have a piped water supply in or near their dwelling
Hours of piped water supply: Hours water was available per day on average over the last three months for households with piped water
Incidence of waterborne diseases (total): Share of population suffering diarrhea, gastrointestinal, or parasitic infections in the past 30 days
Incidence of waterborne diseases (child): Share of children under age five suffering from diarrhea, gastrointestinal, or parasitic infection in the past 30 days
Expenditures on bottled water: Purchases of bottled water over the last three months, in average lari per household
If internally displaced person in the village: Dummy variable that takes a value of 1 if the village has at least one internally displaced person from Abkhazia in residence
Agriculture only: Dummy variable that takes a value of 1 if the key informants report that there is no economic activity in the village except agriculture
Experienced disaster: Total number of floods, earthquakes, droughts, hails, fires, landslides, and livestock epidemics between 1997 and 1999
Experienced flood: Village experienced at least one flood between 1997 and 1999
Railroad: Whether the village had an operating railroad station in 1998
Interstate highway: Whether a highway of regional importance passed through the village in 1998
Asphalt road: Whether the main road in the village was paved in 1998
Number of schools: Number of primary and secondary schools operating in the village in 1998
Number of large enterprises: Number of operating enterprises within 20 km of the village with more than 10 employees in 1998
Small enterprise: Whether the village had a small or medium-size enterprise operating in 1998
Police station (post, restaurant): Whether the village had a police station (post office, restaurant, or roadside café) in 1998
Proportion of household with a phone: Share of households in the village that were connected to a fixed telephone line in 1998
Proportion of household with a toilet: Share of households in the village that were using latrines in 1998
Unreliable electric power supply: Key informant in the village reported electric power supply to the village of less than 24 hours a day in 1998
Proportion of households with piped water: Share of households in the village that have a piped water supply near their dwelling in 1998
Built with wooden (dirt) walls (floors): Key informant estimate of share of buildings in the village with wooden walls (dirt floors) in 1998
Trade by the roadside: Dummy variable taking a value of 1 if one of three main trade channels for the village was selling products by roadsides in 1998

Source: SGHH and RCIS.
Note: Currency values are in nominal terms, with 1 lari approximately equal to $0.50. The inflation rate in Georgia in 1998-2002 was 4 percent a year, and food prices were essentially unchanged.

REFERENCES

Bernabè, S. 2002. "A Profile of the Labour Market in Georgia." International Labour Organization and UN Development Programme, Tbilisi.
Brockerhoff, M., and L. Derose. 1996. "Child Survival in East Africa: The Impact of Preventive Health Care." World Development 24(12):1841-57.
Chen, S., and M. Ravallion. 2003. "Hidden Impact? Ex-Post Evaluation of an Anti-Poverty Program." Journal of Public Economics 86(1):123-53.
Cord, L., R. Lopez, M. Huppi, and O. Melo. 2003. "Growth and Rural Poverty in the CIS7: Case Studies of Georgia, the Kyrgyz Republic, and Moldova." Paper prepared for the Lucerne Conference of the CIS-7 Initiative, January 20-22, Lucerne, Switzerland.
Glewwe, P. 1999. The Economics of School Quality Investments in Developing Countries: An Empirical Study of Ghana. London: Macmillan.
GORBI (Georgian Opinion Research Business International) and Georgia State Department for Statistics. 2003. "Fiscal Community Survey Report." Tbilisi.
Hanushek, E. 1995. "Interpreting Recent Research on Schooling in Developing Countries." World Bank Research Observer 10(2):227-46.
Heckman, J., H. Ichimura, and P. Todd. 1997. "Matching as an Econometric Evaluation Estimator: Evidence from Evaluating a Job Training Program." Review of Economic Studies 64(4):605-54.
Heckman, J., H. Ichimura, J. Smith, and P. Todd. 1998. "Characterizing Selection Bias Using Experimental Data." Econometrica 66(5):1017-99.
Jacoby, H. 2002. "Access to Markets and the Benefits of Rural Roads." Economic Journal 110(465):713-37.
Jalan, J., and M. Ravallion. 1998. "Are There Dynamic Gains from a Poor-Area Development Program?" Journal of Public Economics 67(1):65-85.
Jalan, J., and M. Ravallion. 2003. "Does Piped Water Reduce Diarrhea for Children in Rural India?" Journal of Econometrics 112(1):153-73.
Kahneman, D. 2003. "Maps of Bounded Rationality: Psychology for Behavioral Economics." American Economic Review 93(5):1449-75.
Kremer, M. 1995. "Research on Schooling: What We Know and What We Don't: A Comment." World Bank Research Observer 10(2):247-54.
Lee, L., M. Rosenzweig, and M. Pitt. 1997. "The Effects of Improved Nutrition, Sanitation, and Water Quality on Child Health in High-Mortality Populations." Journal of Econometrics 77(1):209-35.
Orivel, F. 1998. "Cost and Finance of Education in Georgia." World Bank, Washington, D.C.
Rawlings, L., and N. Schady. 2002. "Impact Evaluation of Social Funds." World Bank Economic Review 16(2):213-17.
Rosenbaum, P., and D. Rubin. 1983. "The Central Role of the Propensity Score in Observational Studies for Causal Effects." Biometrika 70(1):41-55.
Rubin, D. 1973. "The Use of Matched Sampling and Regression Adjustment to Remove Bias in Observational Studies." Biometrics 29(1):185-203.
UNOCHA (United Nations Office for the Coordination of Humanitarian Affairs). 2003. "Directory of Humanitarian and Developments Sectoral Programs." Tbilisi.
van de Walle, D., and D. Cratty. 2002. "Impact Evaluation of a Rural Road Rehabilitation Project." World Bank, Washington, D.C.
World Bank. 2003a. "Achieving the Human Development MDGs in ECA." Washington, D.C.
World Bank. 2003b. "Georgia Country Assistance Strategy." Washington, D.C.
World Bank. 2003c. "Georgia: Trade Diagnostic Study." Washington, D.C.
Yemtsov, R. 2001. "Labor Markets, Inequality and Poverty in Georgia." IZA Discussion Paper 251. Institute for the Study of Labor, Bonn.