Research Observer

EDITOR
Shantayanan Devarajan, World Bank

CO-EDITOR
Gershon Feder, World Bank

EDITORIAL BOARD
Susan Collins, Georgetown University
Angus Deaton, Princeton University
Barry Eichengreen, University of California-Berkeley
Emmanuel Jimenez, World Bank
Benno Ndulu, World Bank
Howard Pack, University of Pennsylvania
Luis Serven, World Bank
Sudhir Shetty, World Bank
Michael Walton, World Bank

The World Bank Research Observer is intended for anyone who has a professional interest in development. Observer articles are written to be accessible to nonspecialist readers; contributors examine key issues in development economics, survey the literature and the latest World Bank research, and debate issues of development policy. Articles are reviewed by an editorial board drawn from across the Bank and the international community of economists. Inconsistency with Bank policy is not grounds for rejection.

The journal welcomes editorial comments and responses, which will be considered for publication to the extent that space permits. On occasion the Observer considers unsolicited contributions. Any reader interested in preparing such an article is invited to submit a proposal of not more than two pages to the Editor. Please direct all editorial correspondence to the Editor, The World Bank Research Observer, 1818 H Street, NW, Washington, DC 20433, USA.

The views and interpretations expressed in this journal are those of the authors and do not necessarily represent the views and policies of the World Bank or of its Executive Directors or the countries they represent. The World Bank does not guarantee the accuracy of data included in this publication and accepts no responsibility whatsoever for any consequences of their use. When maps are used, the boundaries, denominations, and other information do not imply on the part of the World Bank Group any judgment on the legal status of any territory or the endorsement or acceptance of such boundaries.
For more information, please visit the Web sites of the Research Observer at www.wbro.oxfordjournals.org, the World Bank at www.worldbank.org, and Oxford University Press at www.oxfordjournals.org.

Research Observer
Volume 23, Number 1, Spring 2008

Governance Indicators: Where Are We, Where Should We Be Going?
Daniel Kaufmann and Aart Kraay

Two Comments on "Governance Indicators: Where Are We, Where Should We Be Going?" by Daniel Kaufmann and Aart Kraay
Shantayanan Devarajan and Simon Johnson

Walking up the Down Escalator: Public Investment and Fiscal Stability
William Easterly, Timothy Irwin, and Luis Serven

What Can Countries in Other Regions Learn from Social Security Reform in Latin America?
Indermit S. Gill, Ceren Ozer, and Radu Tatucu

Why OECD Countries Should Reform Rules of Origin
Olivier Cadot and Jaime de Melo
The World Bank Research Observer is published twice a year, in February and August, by Oxford University Press for the International Bank for Reconstruction and Development/THE WORLD BANK.
Governance Indicators: Where Are We, Where Should We Be Going?

Daniel Kaufmann and Aart Kraay

Progress in measuring governance is assessed using a simple framework that distinguishes between indicators that measure formal rules and indicators that measure the practical application or outcomes of these rules.
The analysis calls attention to the strengths and weaknesses of both types of indicators as well as the complementarities between them. It distinguishes between the views of experts and the results of surveys and assesses the merits of aggregate as opposed to individual governance indicators. Some simple principles are identified to guide the use and refinement of existing governance indicators and the development of future indicators. These include transparently disclosing and accounting for the margins of error in all indicators, drawing from a diversity of indicators and exploiting complementarities among them, submitting all indicators to rigorous public and academic scrutiny, and being realistic in expectations of future indicators. JEL codes: H1, O17

© The Author 2008. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org. doi:10.1093/wbro/lkm012. Advance Access publication January 31, 2008. 23:1-30

Not everything that can be counted counts, and not everything that counts can be counted.
-Albert Einstein

Most scholars, policymakers, aid donors, and aid recipients recognize that good governance is a fundamental ingredient of sustained economic development. This growing understanding, initially informed by a very limited set of empirical measures of governance, has spurred intense interest in developing more refined, nuanced, and policy-relevant indicators of governance. This article reviews progress in measuring governance, emphasizing empirical measures explicitly designed to be comparable across countries and in most cases over time. The goal is to provide a structure for thinking about the strengths and weaknesses of different types of governance indicators that can inform both the use of existing indicators and ongoing efforts to improve them and develop new ones.1

The first section of this article reviews definitions of governance. Although there are many broad definitions of governance, the degree of definitional disagreement can easily be overstated. Most definitions appropriately emphasize the importance of a capable state that is accountable to citizens and operating under the rule of law. Broad principles of governance along these lines are naturally not amenable to direct observation and thus to direct measurement. As Albert Einstein noted, "Not everything that counts can be counted." Many different types of data provide information on the extent to which these principles of governance are observed across countries. An important corollary is that any particular indicator of governance can usefully be interpreted as an imperfect proxy for some unobserved broad dimension of governance. This interpretation underlies a recurrent theme of this review: there is measurement error in all governance indicators, and it should be explicitly considered when using these kinds of data to draw conclusions about cross-country differences or trends in governance over time.

The second section addresses what is measured. The discussion highlights the distinction between indicators that measure specific rules "on the books" and indicators that measure particular governance outcomes "on the ground." Rules on the books codify details of the constitutional, legal, or regulatory environment; the existence or absence of specific agencies, such as anticorruption commissions or independent auditors; and so forth. These components are intended to provide the key de jure foundations of governance. On-the-ground measures assess de facto governance outcomes that result from the application of these rules (Do firms find the regulatory environment cumbersome? Do households believe the police are corrupt?).
An important message in this section concerns the shared limitations of indicators of both rules and outcomes: Outcome-based indicators of governance can be difficult to link back to specific policy interventions, and the links from easy-to-measure de jure indicators of rules to governance outcomes of interest are not yet well understood and in some cases appear tenuous at best. These limitations remind us of the need to respect Einstein's dictum that "not everything that can be counted counts."

The third section examines whose views should be relied on. Indicators based on the views of various types of experts are distinguished from survey-based indicators that capture the views of large samples of firms and individuals. A category of aggregate indicators that combine, organize, provide structure, and summarize information from these different types of respondents is examined. The fourth section examines the rationale for such aggregate indicators, and their strengths and weaknesses.

The set of indicators discussed in this survey is intended to provide leading examples of major governance indicators rather than an exhaustive stocktaking of existing indicators in this taxonomy.2 A feature of efforts to measure governance is the preponderance of indicators focused on measuring de facto governance outcomes and the paucity of measures of de jure rules. Almost by necessity, de jure rules-based indicators of governance reflect the views or judgments of experts. In contrast, the much larger body of de facto indicators captures the views of both experts and survey respondents.

The article concludes with a discussion of the way forward in measuring governance in a manner that can be useful to policymakers. The emphasis is on the importance of consumers and producers of governance indicators clearly recognizing and disclosing the pervasive measurement error in any type of governance indicators.
This section also notes the importance of moving away from oft-heard false dichotomies, such as "subjective" or "objective" indicators or aggregate or disaggregated ones. For good reason, virtually all measures of governance involve a degree of subjective judgment, and different levels of aggregation are appropriate for different types of analysis. In any case, the choice is not either one or the other, as most aggregate indicators can readily be unbundled into their constituent components.

What Does Governance Mean?

The concept of governance is not a new one. Early discussions go back to at least 400 BCE, to the Arthashastra, a treatise on governance attributed to Kautilya, thought to be the chief minister to the king of India. Kautilya presents key pillars of the "art of governance," emphasizing justice, ethics, and anti-autocratic tendencies. He identifies the duty of the king to protect the wealth of the state and its subjects and to enhance, maintain, and safeguard this wealth as well as the interests of the kingdom's subjects.

Despite the long provenance of the concept, no strong consensus has formed around a single definition of governance or institutional quality. For this reason, throughout this article the terms governance, institutions, and institutional quality are used interchangeably, if somewhat imprecisely. Researchers and organizations have produced a wide array of definitions. Some definitions are so broad that they cover almost anything (such as the definition "rules, enforcement mechanisms, and organizations" offered in the World Bank's World Development Report 2002: Building Institutions for Markets). Others, like the definition suggested by North (2000), are not only broad but risk making the links from good governance to development almost tautological: "How do we account for poverty in the midst of plenty? . . .
We must create incentives for people to invest in more efficient technology, increase their skills, and organize efficient markets. . . . Such incentives are embodied in institutions."

Some of the governance indicators surveyed capture a wide range of development outcomes. While it is difficult to draw a line between governance and the ultimate development outcomes of interest, it is useful at both the definitional and measurement stages to emphasize concepts of governance that are at least somewhat removed from development outcomes themselves. An early and narrower definition of public sector governance proposed by the World Bank is that "governance is the manner in which power is exercised in the management of a country's economic and social resources for development" (World Bank 1992, p. 1). This definition remains almost unchanged in the Bank's 2007 governance and anticorruption strategy, with governance defined as "the manner in which public officials and institutions acquire and exercise the authority to shape public policy and provide public goods and services" (World Bank 2007, p. 1). Kaufmann, Kraay, and Zoido-Lobaton (1999a, p. 1) define governance as "the traditions and institutions by which authority in a country is exercised. This includes the process by which governments are selected, monitored and replaced; the capacity of the government to effectively formulate and implement sound policies; and the respect of citizens and the state for the institutions that govern economic and social interactions among them."

Although the number of definitions of governance is large, there is some consensus. Most definitions agree on the importance of a capable state operating under the rule of law. Interestingly, comparing the last three definitions cited above, the one substantive difference has to do with the explicit degree of emphasis on the role of democratic accountability of governments to their citizens.
Even these narrower definitions remain sufficiently broad that there is scope for a wide diversity of empirical measures of various dimensions of good governance.

The gravity of the issues dealt with in these definitions of governance suggests that measurement is important. In recent years there has been debate over whether such broad notions of governance can be usefully measured. Many indicators can shed light on various dimensions of governance. However, given the breadth of the concepts, and in many cases their inherent unobservability, no single indicator or combination of indicators can provide a completely reliable measure of any of these dimensions of governance. Rather, it is useful to think of the various specific indicators discussed below as all providing imperfect signals of fundamentally unobservable concepts of governance. This interpretation emphasizes the importance of taking into account as explicitly as possible the inevitable resulting measurement error in all indicators of governance when analyzing and interpreting any such measure. As shown below, however, the fact that such margins of error are finite and still allow for meaningful country comparisons across space and time suggests that measuring governance is both feasible and informative.

Governance Rules or Governance Outcomes?

This section examines both the rules-based and outcome-based indicators of governance. A rules-based indicator of corruption might measure whether countries have legislation prohibiting corruption or have an anticorruption agency. An outcome-based measure could assess whether the laws are enforced or the anticorruption agency is undermined by political interference. The views of firms, individuals, nongovernmental organizations (NGOs), or commercial risk-rating agencies could also be solicited regarding the prevalence of corruption in the public sector.
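The idea that each indicator is an imperfect signal of unobserved governance, with a finite margin of error that still permits some country comparisons, can be illustrated with a minimal sketch. It assumes that an indicator reports a point estimate together with a standard error and that a 90 percent interval under a normal approximation is an acceptable summary; the country names, scores, and standard errors are hypothetical, as are the names `Estimate` and `intervals_overlap`. It is not the method of any particular indicator discussed in this article.

```python
from dataclasses import dataclass

Z_90 = 1.645  # two-sided 90 percent critical value of the standard normal


@dataclass(frozen=True)
class Estimate:
    """A governance score reported with its standard error (hypothetical data)."""
    country: str
    score: float       # point estimate of the indicator
    std_error: float   # standard error summarizing measurement error

    def interval(self, z: float = Z_90) -> tuple[float, float]:
        """Return the (lower, upper) bounds of the confidence interval."""
        margin = z * self.std_error
        return (self.score - margin, self.score + margin)


def intervals_overlap(a: Estimate, b: Estimate, z: float = Z_90) -> bool:
    """True when the two confidence intervals share at least one point."""
    lo_a, hi_a = a.interval(z)
    lo_b, hi_b = b.interval(z)
    return lo_a <= hi_b and lo_b <= hi_a


# Hypothetical scores on a common scale; not taken from any real dataset.
a = Estimate("Country A", 0.90, 0.20)
b = Estimate("Country B", -0.60, 0.25)
c = Estimate("Country C", 0.70, 0.22)

print(intervals_overlap(a, b))  # False: the gap exceeds the margins of error
print(intervals_overlap(a, c))  # True: the difference is within the noise
```

Checking for overlapping intervals is a conservative rule of thumb rather than a formal test of the difference between two scores, but it captures the point in the text: large gaps between countries remain meaningful despite measurement error, while small gaps should not be over-interpreted.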
To measure public sector accountability, one could observe the rules regarding the presence of formal elections, financial disclosure requirements for public servants, and the like. One could also assess the extent to which these rules operate in practice by surveying respondents regarding the functioning of the institutions of democratic accountability.

Because a clear line does not always distinguish the two types of indicators, it is more useful to think of ordering different indicators along a continuum, with one end corresponding to rules and the other to ultimate governance outcomes of interest. Because both types of indicators have their own strengths and weaknesses, all indicators should be thought of as imperfect but complementary proxies for the aspects of governance they purport to measure.

Rules-Based Indicators of Governance

Several rules-based indicators are used to assess governance (tables 1 and 2). They include the Doing Business project of the World Bank, which reports detailed information on the legal and regulatory environment in a large set of countries; the Database of Political Institutions, constructed by World Bank researchers, and the POLITY-IV database of the University of Maryland, both of which report detailed factual information on features of countries' political systems; and the Global Integrity Index (GII), which provides detailed information on the legal framework governing public sector accountability and transparency in a sample of 41 countries, most of them developing economies.

At first glance, one of the main virtues of indicators of rules is their clarity. It is straightforward to ascertain whether a country has a presidential or a parliamentary system of government or whether a country has a legally independent anticorruption commission. In principle, it is also straightforward to document details of the legal and regulatory environment, such as how many legal steps are required to register a business or fire a worker.
This clarity also implies that it is straightforward to measure progress on such indicators. Has an anticorruption commission been established? Have business entry regulations been streamlined? Has a legal requirement for disclosure of budget documents been passed? This clarity has made such indicators very appealing to aid donors interested in linking aid with performance indicators and in monitoring progress on such indicators.

Set against these advantages are three main drawbacks.

They are less "objective" than they appear.

It is easy to overstate the clarity and objectivity of rules-based measures of governance. In practice, a good deal of subjective judgment is involved in codifying all but the most basic and obvious features of a country's constitutional, legal, and regulatory environments. (It is no accident that the views of lawyers, on which many of these indicators are based, are commonly referred to as opinions.) In Kenya in 2007, for example, a constitutional right to access to information faced being undermined or offset entirely by an official secrecy act and by pending approval and implementation of the Freedom of Information Act. In this case, codifying even the legal right to access to information requires careful judgment as to the net effect of potentially conflicting laws. Of course, this drawback of ambiguity is not unique to rules-based measures of governance: interpreting outcome-based indicators of governance can also involve ambiguity, as discussed below. There has been less recognition, however, of the extent to which rules-based indicators also reflect subjective judgment.

Table 1. Sources and Types of Information Used in Governance Indicators

Type of indicator: Rules-based (Broad, Specific); Outcomes-based (Broad, Specific)
Source of information:

Experts
  Lawyers: DB
  Commercial risk-rating agencies: DRI, EIU, PRS
  Nongovernmental organizations: GII HER, RSF CIR, FRH GII, OBI
  Governments and multilaterals: CPIA PEFA
  Academics: DPI, PIV DPI, PIV
Survey respondents
  Firms: ICA, GCS, WCY
  Individuals: AFR, LBO, GWP
Aggregate indicators combining experts and survey respondents: TI, WGI, MOI

Note: AFR is Afrobarometer, CIR is Cingranelli-Richards Human Rights Dataset, CPIA is Country Policy and Institutional Assessment, DB is Doing Business, DPI is Database of Political Institutions, DRI is Global Insight DRI, EIU is Economist Intelligence Unit, FRH is Freedom House, GCS is Global Competitiveness Survey, GII is Global Integrity Index, GWP is Gallup World Poll, HER is Heritage Foundation, ICA is Investment Climate Assessment, LBO is Latinobarometro, MOI is Ibrahim Index of African Governance, OBI is Open Budget Index, PEFA is Public Expenditure and Financial Accountability, PIV is Polity IV, PRS is Political Risk Services, RSF is Reporters Without Borders, TI is Transparency International, WCY is World Competitiveness Yearbook, and WGI is Worldwide Governance Indicators.
Source: Authors' compilation based on data from sources listed in table 2.

The links between indicators and outcomes are complex, possibly subject to long lags, and often not well understood.

These problems complicate the interpretation of rules-based indicators. In the case of rules-based measures, some of the most basic features of countries' constitutional arrangements have little normative content on their own; such indicators are for the most part descriptive. It makes little sense, for example, to presuppose that presidential (as opposed to parliamentary) systems or majoritarian (as opposed to proportional) representation in voting arrangements are intrinsically good or bad. Interest in such variables as indicators of governance rests on the case that they may matter for outcomes, often in complex ways. In their influential book, Persson and Tabellini (2005) document how these features of constitutional rules influence the political process and ultimately outcomes such as the level, composition, and cyclicality of public spending (Acemoglu (2006) challenges the robustness of these findings). In such cases, the usefulness of rules-based indicators as measures of governance depends crucially on how strong the empirical links are between such rules and the ultimate outcomes of interest.

Table 2. Country Coverage and Frequency of Governance Surveys

Name                                             Number of countries covered  Frequency of survey  Web site
Afrobarometer                                    18                           Triennial            www.afrobarometer.org
Cingranelli-Richards Human Rights Dataset        192                          Annual               www.humanrightsdata.com
Country Policy and Institutional Assessment      136                          Annual               www.worldbank.org
Doing Business                                                                Annual
Database of Political Institutions                                            Annual
Global Insight DRI                                                            Quarterly
Economist Intelligence Unit                                                   Quarterly
Freedom House                                                                 Annual
Global Competitiveness Survey                                                 Annual
Global Integrity Index                                                        Triennial
Gallup World Poll                                                             Annual
Heritage Foundation                                                           Annual
Investment Climate Assessment                                                 Irregular
Latinobarometro                                                               Annual
Ibrahim Index of African Governance                                           Triennial
Open Budget Index                                                             Annual
Polity IV                                                                     Annual
Political Risk Services                                                       Monthly
Public Expenditure and Financial Accountability                               Irregular
Reporters without Borders                        165                          Annual               www.rsf.org
World Competitiveness Yearbook                   47                           Annual               www.imd.ch

Source: Authors' compilation.
Perhaps more common is the less extreme case in which rules-based indicators of governance have normative content on their own, but the relative importance of different rules for outcomes of interest is unclear. The GII, for example, provides information on the existence of dozens of rules, ranging from the legal right to freedom of speech to the existence of an independent ombudsman to the presence of legislation prohibiting the offering or acceptance of bribes. The Open Budget Index (OBI) provides detailed information on the budget processes, including the types of information provided in budget documents, public access to budget documents, and the interaction between executive and legislative branches in the budget process. Many of these indicators arguably have normative value on their own: having public access to budget documents is desirable and having streamlined business registration procedures is better than not having them.

This profusion of detail in rules-based indicators leads to two related difficulties in using them to design and monitor governance reforms. The first is that, because good information on the links between changes in specific rules or procedures and outcomes of interest is lacking, it is difficult to know which rules should be reformed and in what order. Will establishing an anticorruption commission or passing legislation outlawing bribery have any impact on reducing corruption? If so, which is more important? Should, instead, more efforts be put into ensuring that existing laws and regulations are implemented or that there is greater transparency, access to information, or media freedom? How soon should one expect to see the impacts of these interventions? Given that governments typically operate with limited political capital to implement reforms, these trade-offs and lags are important.
The second difficulty in designing or monitoring reforms arises when aid donors or governments set performance indicators for governance reforms. Performance indicators based on changing specific rules, such as the passage of a particular piece of legislation or a reform of a specific budget procedure, can be very attractive because of their clarity: it is straightforward to verify whether the specified policy action has been taken.3 Yet "actionable" indicators are not necessarily also "action worthy," in the sense of having a significant impact on the outcomes of interest. Moreover, excessive emphasis on registering improvements on rules-based indicators of governance leads to risks of "teaching to the test" or, worse, "reform illusion," in which specific rules or procedures are changed in isolation with the sole purpose of showing progress on the specific indicators used by aid donors.

Major gaps exist between statutory rules on the books and their implementation on the ground.

To take an extreme example, in all 41 countries covered by the 2006 GII, accepting a bribe is codified as illegal, and all but three countries (Brazil, Lebanon, and Liberia) have anticorruption commissions or similar agencies. Yet there is enormous variation in perceptions-based measures of corruption across these countries. The 41 countries covered by the GII include the Democratic Republic of Congo, which ranks 200 out of 207 countries on the 2006 Worldwide Governance Indicators (WGI) control of corruption indicator, and the United States, which ranks 23.

Another example of the gap between rules and implementation (documented in more detail in Kaufmann, Kraay, and Mastruzzi 2005) compares the statutory ease of establishing a business with a survey-based measure of firms' perceptions of the ease of starting a business across a large sample of countries.
In industrial countries, where de jure rules are often implemented as intended, the two measures correspond quite closely. In contrast, in developing economies, where there are often gaps between de jure rules and their de facto implementation, the correlation between the two is very weak; the de jure codification of the rules and regulations required to start a business is not a good predictor of the actual constraints reported by firms. Unsurprisingly, much of the difference between the de jure and de facto measures could be statistically explained by de facto measures of corruption, which subverts the fair application of rules on the books.

The three drawbacks (the inevitable role of judgment even in "objective" indicators, the complexity and lack of knowledge regarding the links from rules to outcomes of interest, and the gap between rules on the books and their implementation on the ground) suggest that although rules-based governance indicators provide valuable information, they are insufficient on their own for measuring governance. Rules-based measures need to be complemented by and used in conjunction with outcome-based indicators of governance.

Outcome-Based Governance Indicators

Most indicators of governance are outcome based, and several rules-based indicators of governance also provide complementary outcome-based measures. The GII, for example, pairs indicators of the existence of various rules and procedures with indicators of their effectiveness in practice. The Database of Political Institutions measures not only such constitutional rules as the presence of a parliamentary system, but also outcomes of the electoral process, such as the extent to which one party controls different branches of government and the fraction of votes received by the president. The Polity-IV database records a number of outcomes, including the effective constraints on the power of the executive.
The remaining outcome-based indicators range from the highly specific to the quite general. The OBI reports data on more than 100 indicators of the budget process, ranging from whether budget documentation contains details of assumptions underlying macroeconomic forecasts to documentation of budget outcomes relative to budget plans. Other less specific sources include the Public Expenditure and Financial Accountability indicators, constructed by aid donors with inputs of recipient countries, and several large cross-country surveys of firms (including the Investment Climate Assessments of the World Bank, the Executive Opinion Survey of the World Economic Forum, and the World Competitiveness Yearbook of the Institute for Management Development) that ask firms detailed questions about their interactions with the state. Examples of more general assessments of broad areas of governance include ratings provided by several commercial sources, including Political Risk Services, the Economist Intelligence Unit, and Global Insight-DRI. Political Risk Services rates 10 areas that can be identified with governance, such as "democratic accountability," "government stability," "law and order," and "corruption." Large cross-country surveys of individuals such as the Afrobarometer and Latinobarometro surveys and the Gallup World Poll ask general questions, such as "Is corruption widespread throughout the government in this country?"

The main advantage of outcome-based indicators is that they capture the views of relevant stakeholders, who take actions based on these views. Governments, analysts, researchers, and decisionmakers should, and often do, care about public views on the prevalence of corruption, the fairness of elections, the quality of service delivery, and many other governance outcomes. Outcome-based governance indicators provide direct information on the de facto outcome of how de jure rules are implemented.
Outcome-based measures also have some significant limitations. Such measures, particularly where they are general, can be difficult to link back to specific policy interventions that might influence governance outcomes. This is the mirror image of the problem discussed above: rules-based indicators of governance can also be difficult to relate to outcomes of interest. A related difficulty is that outcome-based governance indicators may be too close to ultimate development outcomes of interest. To take an extreme example, the Ibrahim Index of African Governance includes a number of ultimate development outcomes, such as per capita GDP (gross domestic product), growth of GDP, inflation, infant mortality, and inequality. While such development outcomes are surely worth monitoring, including them in an index of governance risks making the links from governance to development tautological.

Another difficulty has to do with interpreting the units in which outcomes are measured. Rules-based indicators have the virtue of clarity: either a particular rule exists or it does not. Outcome-based indicators by contrast are often measured on somewhat arbitrary scales. For example, a survey question might ask respondents to rate the quality of public services on a five-point scale, with the distinction between different scores left unclear and up to the respondent.⁴ The usefulness of outcome-based indicators is greatly enhanced when the criteria for differing scores are clearly documented. The World Bank's Country Policy and Institutional Assessment (CPIA) and the Freedom House indicators are good examples of outcome-based indicators based on expert assessments that provide documentation of the criteria used to assign specific scores on the indicators they compile.
In the case of surveys, questions can be designed to ensure that responses are easier to interpret: rather than asking respondents whether they think "corruption is widespread," respondents can be asked whether they have been solicited for a bribe in the past month.

An example illustrates some of the main advantages and disadvantages of the two types of measures. Figure 1 compares alternative indicators of democratic accountability, a key dimension of governance. The horizontal axis measures a very broad outcome-based indicator, taken from the 2005 Voice of the People survey, a large cross-country household survey (www.voice-of-the-people.net). It asks households to indicate whether they think elections in their country are free and fair. The vertical axis reports two indicators of the quality of electoral institutions, taken from Global Integrity. The points labeled "de jure" are based on a factual assessment of the existence of a number of specific institutions related to elections, such as the existence of a legal right to universal suffrage and the existence of an election monitoring agency. The points labeled "de facto" capture the assessment of Global Integrity's experts as to the effectiveness of these institutions.⁵

Figure 1. De facto and de jure Indicators of Elections
[Scatter plot. Horizontal axis: Voice of the People household survey, "Are elections free and fair?" (0 to 1). Vertical axis: Global Integrity elections indicators (0 to 100), with separate points for de jure and de facto scores. Fitted line reported in the figure: y = 23.09x + 55.30, R2 = 0.19.]
Note: ARG is Argentina, ARM is Armenia, AZE is Azerbaijan, BEN is Benin, BRA is Brazil, BGR is Bulgaria, ZAR is Democratic Republic of Congo, EGY is Egypt, ETH is Ethiopia, GEO is Georgia, GHA is Ghana, GTM is Guatemala, IND is India, IDN is Indonesia, ISR is Israel, KEN is Kenya, KGZ is Kyrgyz Republic, LBN is Lebanon, LBR is Liberia, MEX is Mexico, MNP is Montenegro, MOZ is Mozambique, NPL is Nepal, NIC is Nicaragua, NGA is Nigeria, PAK is Pakistan, PHL is Philippines, ROM is Romania, RUS is Russia, SEN is Senegal, YUG is Serbia, SLE is Sierra Leone, ZAF is South Africa, SDN is Sudan, TJK is Tajikistan, TZA is Tanzania, UGA is Uganda, USA is United States, VNM is Vietnam, YEM is Yemen, and ZWE is Zimbabwe.
Source: Authors' analysis based on data described in the text.

Several messages emerge from this figure. First, in some cases rules-based measures of governance show remarkably little variation across countries, with all countries receiving scores close to 100, indicating perfect scores on the de jure basis of this important aspect of governance. As of 2005, for example, every country surveyed by Global Integrity promised the legal right to vote, and a statutorily independent election-monitoring agency existed in all but three countries (Lebanon, Montenegro, and Mozambique). Second, the link between a specific objective indicator of rules and the broad outcome of interest (citizens' satisfaction with elections) is at best very weak, with a correlation between the two measures that is in fact slightly negative. Third, outcome-based indicators explicitly focusing on the de facto implementation of rules can be useful. A noteworthy feature of Global Integrity is its pairing of indicators of specific rules with assessments of their functioning in practice. The correlation of the de facto measure with the broad outcome measure of interest taken from the Voice of the People survey is much stronger (0.46) than the correlation with the de jure measure. The correlation is far from perfect, however, indicating the importance of relying on a variety of indicators when assessing governance in a country.

Whose Views Should We Rely On?

A variety of governance assessments are produced by experts on behalf of commercial risk-rating agencies and NGOs.
The GII and the OBI, for example, rely on locally recruited experts in each country to complete their detailed questionnaires about governance, subject to peer review. Commercial organizations such as the Economist Intelligence Unit rely on a network of local correspondents in a large set of countries to provide information underlying the ratings they produce. Other advocacy organizations, such as Amnesty International, Freedom House, and Reporters without Borders, also rely on networks of respondents for the information underlying their assessments.

Governments and multilateral organizations are also major producers of expert assessments. Some of the most notable include the Country Policy and Institutional Assessments, produced by the World Bank, the African Development Bank, and the Asian Development Bank. Each of these assessments is based on the responses of each institution's country economists to a detailed questionnaire, responses that are then reviewed for consistency and comparability across countries. The Public Expenditure and Financial Accountability indicators mentioned earlier are also based on experts' views.

Several large cross-country surveys of firms and individuals contain questions on governance. These include the Investment Climate Assessment and the Business Environment and Enterprise Performance Surveys conducted by the World Bank; the Executive Opinion Survey of the World Economic Forum; the World Competitiveness Yearbook; Voice of the People; and the Gallup World Poll.

Expert Assessments

Expert assessments have several major advantages, which account for their preponderance among various types of governance indicators.
One is cost: it is much less expensive to ask a selection of country economists at the World Bank to provide responses to a questionnaire on governance as part of the CPIA process than to carry out representative surveys of firms or households in a hundred or more countries. The second advantage is that expert assessments can more readily be tailored for cross-country comparability: many of the organizations listed in table 2 have elaborate benchmarking systems to ensure that scores are comparable across countries. Finally, for certain aspects of governance experts are the natural respondents for the type of information being sought. (Consider, for example, the OBI's detailed questionnaire on national budget processes, the particulars of which are not the sort of common knowledge that survey data can easily collect.)

Expert assessments nevertheless have several important limitations. A basic one is that, like survey respondents, different experts may have different views about similar aspects of governance. While this is not surprising, it suggests that users of governance indicators should be cautious about relying too heavily on any one set of expert assessments. These differences are evident in comparing the CPIA ratings of the World Bank and the African Development Bank, which in recent years harmonized their procedures for constructing CPIA ratings. An identical questionnaire covering 16 dimensions of policy and institutional performance is completed by two very similar sets of expert respondents: country economists with in-depth experience working on behalf of these two organizations in the countries they are assessing. Despite the homogeneity of the respondents and the very similar rating criteria, there are nontrivial differences between the two organizations' assessments on the 16 components of the CPIA (table 3).
For example, the 0.67 correlation between the two assessments on the question on transparency, accountability, and corruption in the public sector is far from perfect, suggesting that it is prudent to base assessments of governance for policy purposes on the views of a variety of expert assessments.⁶

Table 3. Correlation Among Alternative Indicators of Corruption
Columns (1)-(4) are expert assessments; columns (5)-(6) are surveys.

Indicator                                           (1)    (2)    (3)    (4)    (5)    (6)
(1) World Bank CPIA                                 1.00   0.67   0.30   0.56   0.25   0.13
(2) African Development Bank CPIA                          1.00   0.49   0.51   0.45   0.24
(3) Global Integrity                                              1.00   0.34   0.29   0.11
(4) World Markets Online                                                 1.00   0.88   0.59
(5) World Economic Forum Executive Opinion Survey                               1.00   0.70
(6) Gallup World Poll                                                                  1.00

Source: Authors' analysis based on data described in the text.

The second criticism, that the country ratings assigned by different groups of experts are too highly correlated, is just the opposite. Suppose that one set of experts comes up with an assessment of governance for a set of countries based on its own independent research and the second set of experts simply reproduces the assessments of the first. In this case, the high correlation of two expert assessments cannot be interpreted as evidence of their accuracy. Rather, it would reflect the fact that the two sources make correlated errors in measuring governance.⁷ Nevertheless, even if the errors made by two data sources are highly, but not perfectly, correlated, there will be benefits to relying on both data sources. The important empirical question is whether this hypothetical correlation of errors across sources is large or not.

Empirically identifying correlations in errors across sources is difficult.
Simply observing whether the assessments provided in the two data sources are highly correlated is not enough, as the high correlation can reflect the fact that both sources are either measuring governance accurately or making correlated measurement errors. To make progress, one needs to make identifying assumptions. Kaufmann, Kraay, and Mastruzzi (2006) detail two sets of assumptions that allow potential sources of correlation in the errors to be disentangled. One is that surveys of firms or individuals are less likely to make errors that are correlated with other data sources than, for example, assessments by commercial risk-rating agencies. If this is the case, however, one would expect that the assessments of commercial risk-rating agencies would be very highly correlated with one another, but less so with surveys. This turns out not to be the case. The average correlation of the five major commercial risk-rating agencies for corruption in 2002-05 was 0.80. The correlation of each of these assessments with a large cross-country survey of firms was slightly higher (0.81), in contrast with what one would expect if the rating agencies had correlated errors. Conducting this exercise across all six aggregate governance indicators reveals at most modest evidence of error correlation. While this is unlikely to be the final word on this important question, it is a useful step forward to propose and implement tests of error correlation based on explicit identifying assumptions.

The third criticism is that expert assessments are subject to various biases. Some researchers claim that many of these sources are biased toward the views of the business community, which may have very different views of what constitutes good governance than do other types of respondents.
In short, goes the critique, businesspeople like low taxes and less regulation, while the public good demands reasonable taxation and appropriate regulation. This critique does not seem particularly compelling. If it were true, the responses of commercial risk-rating agencies, which serve mostly business clients, or the views of firms themselves to questions about governance, should not be highly correlated with ratings provided by respondents who are more likely to sympathize with the common good, such as individuals, NGOs, or public sector organizations. Yet, in most cases, these correlations are strong (Kaufmann, Kraay, and Mastruzzi 2007b). Cross-country surveys of firms and of individuals, such as the World Economic Forum's Executive Opinion Survey and the Gallup World Poll, yield similar corruption rankings, with the two surveys correlated at 0.7 (table 3).

Another potential source of bias in expert assessments, particularly those produced by NGOs, is that they are colored by the ideological orientation of the ratings organization. Kaufmann, Kraay, and Mastruzzi (2004) find that the assessments of think tanks and firm surveys are not systematically correlated with the political orientation of a country's government, casting doubt on this possible source of bias.

A potentially greater problem of bias is at the country respondent level. For example, the views of pro-government and antigovernment "experts" might be very different, affecting both levels and trends over time. This risk is perhaps greatest for the sources that rely on local experts, such as the GII. This risk is also much more difficult to test for systematically, as the biases may affect individual country scores without introducing systematic biases into the source as a whole. Nevertheless, careful comparisons of many different data sources can often turn up anomalies in a single source that require more careful scrutiny.
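
The identification problem raised in this section, that a high correlation between two sources may reflect either accuracy or correlated errors, can be illustrated with a small simulation. What follows is a minimal sketch and not the authors' method: it assumes two hypothetical data sources that each report true governance plus a normally distributed error, and the function name simulate, the parameters noise_sd and error_corr, and every numerical value are illustrative assumptions rather than estimates from any of the indicators discussed here.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(noise_sd, error_corr, n=5000, seed=0):
    """Simulate two hypothetical sources measuring the same true governance.

    Each source reports truth + error. Errors have standard deviation
    noise_sd and pairwise correlation error_corr (both are assumptions).
    Returns (correlation between the two sources,
             correlation of source 1 with the truth).
    """
    rng = random.Random(seed)
    truth, source1, source2 = [], [], []
    for _ in range(n):
        g = rng.gauss(0, 1)        # true (unobserved) governance
        common = rng.gauss(0, 1)   # error component shared by both sources
        e1 = noise_sd * (math.sqrt(error_corr) * common
                         + math.sqrt(1 - error_corr) * rng.gauss(0, 1))
        e2 = noise_sd * (math.sqrt(error_corr) * common
                         + math.sqrt(1 - error_corr) * rng.gauss(0, 1))
        truth.append(g)
        source1.append(g + e1)
        source2.append(g + e2)
    return pearson(source1, source2), pearson(source1, truth)


if __name__ == "__main__":
    # Population values under this model:
    #   corr(source1, source2) = (1 + error_corr * noise_sd**2) / (1 + noise_sd**2)
    #   corr(source1, truth)   = 1 / sqrt(1 + noise_sd**2)
    # Case A: small, independent errors.
    print("A:", simulate(noise_sd=0.5, error_corr=0.0))
    # Case B: large, strongly correlated errors, tuned so that the
    # between-source correlation has the same population value (0.8) as in A.
    print("B:", simulate(noise_sd=1.5, error_corr=1.6 / 2.25))
```

Under these assumptions both cases yield a between-source correlation of roughly 0.8, yet each source tracks true governance far less closely in case B than in case A. The observed correlation alone cannot tell the two apart, which is why explicit identifying assumptions of the kind described above are needed.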
Surveys of Firms and Individuals

Governance indicators derived from surveys of firms and individuals have the fundamental advantage that they elicit the views of the ultimate beneficiaries of good governance, citizens and firms in a country. The views of these stakeholders matter because they are likely to act on those views. If firms or individuals believe that the courts and the police are corrupt, they are unlikely to try to use their services (Hellman and Kaufmann 2004). Individuals are less likely to vote or to hold their elected leaders accountable if they think that elections are not free and fair.

Another advantage of governance indicators based on surveys of domestic firms and individuals is greater domestic political credibility. Governments often dismiss external expert assessments of governance as uninformed pontification by outsiders. It is much harder for them to dismiss the views of their own citizens or of firms operating in their country. Survey-based data on governance can therefore be particularly useful in galvanizing the politics of governance reforms. The experience of many countries implementing their own in-depth Governance and Anti-Corruption diagnostics (assisted by the World Bank Institute and other agencies and implemented with institutions in the requesting country), based on in-country surveys of enterprises, users of services, and public officials, supports this point: the views expressed by thousands of domestic stakeholders [...]

[...] margins of error is rationalized by suggesting that they would be missed by most readers. Experience with the WGI suggests that this is not the case, with many users recognizing and benefiting from this additional degree of transparency about data limitations.

Exploit the wealth of available indicators, recognizing that progress in developing new indicators is likely to be incremental.
Much more work needs to be done to exploit the large body of disaggregated measures of governance already in existence. Linking disaggregated indicators to disaggregated outcomes, both across countries and over time, is likely to be an important area of research over the next several years, with important implications for policymakers. There is also scope for developing new and better indicators of governance. Work to improve such indicators will be important, as indicators are increasingly used to monitor the success and failure of governance reform efforts. But given the many challenges of measuring governance, it is important to recognize that progress in this area over the next several years is likely to be incremental rather than fundamental. Alongside efforts to develop new indicators, there is also a case to improve existing indicators, particularly in increasing the periodicity of heretofore one-off efforts and in broadening their country coverage (covering industrial and developing economies), as well as covering issues for which data are still scarce, such as money laundering.

Notes

Daniel Kaufmann is a director of global programs at the World Bank Institute; his email address is dkaufmann@worldbank.org. Aart Kraay is a lead economist in the Development Research Group at the World Bank; his email address is akraay@worldbank.org. The authors would like to thank Shanta Devarajan for encouraging them to write this survey, Simeon Djankov and three anonymous referees for their helpful comments, and Massimo Mastruzzi for assistance.

1. For surveys of and user guides to governance indicators, see UNDP (2005), Arndt and Oman (2006), and Knack (2006). Because of space constraints, no attempt is made here to review the important body of work focused on in-depth within-country diagnostic measures of governance that are not designed for cross-country replicability and comparisons.
2. A fuller compilation of governance datasets is available at www.worldbank.org/wbi/governance/data.
3. Indeed, this is reflected in the terminology of "actionable" governance indicators emphasized in the World Bank's Global Monitoring Report (World Bank 2006).
4. See King and Wand (2007) for a description of how this problem can be mitigated by the use of "anchoring vignettes" that provide a common frame of reference to respondents in interpreting the response scale. The basic idea is to provide an understandable anecdote or vignette describing the situation faced by a hypothetical respondent to the survey. For example, "Miguel frequently finds that his applications to renew a business license are rejected or delayed unless they are accompanied by an additional payment of 1,000 pesos beyond the stated license fee." Respondents are then asked to assess how great an obstacle corruption is for Miguel's business, using a 10-point scale. Since all respondents use the scale to assess the same situation, this rating can be used to "anchor" their responses to questions referring to their own situation.
5. These two indicators are measured as the average of the 14 "in law" components and the 10 "in practice" components of the elections indicator of Global Integrity.
6. Starting with the 2005 data, both the African Development Bank and the World Bank have made their CPIA scores public. The African Development Bank does so for all borrowing countries; the World Bank does so only for countries eligible for its most concessional lending.
7. Kaufmann, Kraay, and Zoido-Lobatón (1999a) show how the estimated margins of error of their aggregate governance indicators would increase if they assume that the error terms made by individual data sources were correlated. Recently Svensson (2005), Arndt and Oman (2006), and Knack (2006) have raised this criticism again, largely without the benefit of systematic evidence. Kaufmann, Kraay, and Mastruzzi (2007b) provide a detailed response.
8. This is not to say that all of the surveys used to measure governance are necessarily representative in any strict sense of the term. In fact, one general critique is that several large cross-country surveys of firms that provide data on governance are not very clear about their sample frame and sampling methodology. The Executive Opinion Survey of the World Economic Forum, for example, states that it seeks to ensure that the sample of respondents is representative of the sectoral and size distribution of firms (World Economic Forum 2006). But it reports that it "carefully select[s] companies whose size and scope of activities guarantee that their executives benefit from international exposure" (p. 133). It is not clear from their documentation how these two conflicting objectives are reconciled.
9. A simple example is that respondents are asked whether they have ever offered a bribe. But before answering, the respondent is instructed to privately toss a coin and to answer "yes" if either they have in fact offered a bribe, or the coin comes up heads. See Azfar and Murrell (2006) for an assessment of the extent to which randomized response methods correct for respondent reticence and an innovative approach to using this methodology to weed out less than candid respondents.
10. The assumption of a common error variance is necessary in this simple example with two indicators in order to achieve identification. In this example, just one sample correlation in the data can be used to infer the variance of measurement error; just one measurement error variance can thus be identified. In more general applications of the unobserved components model, such as the WGI, this restriction is not required because there are three or more data sources.
11. For details on this calculation, see Kaufmann, Kraay, and Mastruzzi (2004, 2006).
Gelb, Ngo, and Ye (2004) perform a similar calculation comparing the African Development Bank and World Bank CPIA scores. Their conclusion that the CPIA ratings have little measurement error is driven largely by the fact that the authors focus on the aggregate CPIA scores, which are very highly correlated between the two institutions. The focus here is on one of 16 specific questions; at this level of disaggregation, the correlation between the two sets of ratings is considerably lower.
12. For example, virtually all of the individual indicators underlying the aggregate WGI are available at www.govindicators.org.
13. One of the best-known and best-executed recent studies of this type is a study of corruption in a local road-building project by Olken (2007).

References

Acemoglu, Daron. 2006. "Constitutions, Politics, and Economics: A Review Essay on Persson and Tabellini's The Economic Effects of Constitutions." Journal of Economic Literature 63(4):1025-45.
Arndt, Christiane, and Charles Oman. 2006. "Uses and Abuses of Governance Indicators." OECD Development Center Study, Organisation for Economic Co-operation and Development, Paris.
Azfar, Omar, and Peter Murrell. 2006. "Identifying Reticent Respondents: Assessing the Quality of Survey Data on Corruption and Values." University of Maryland, Department of Economics, College Park, Maryland.
Gelb, Alan, Brian Ngo, and Xiao Ye. 2004. "Implementing Performance-Based Aid in Africa: The Country Policy and Institutional Assessment." World Bank Africa Region Working Paper 77, Washington, D.C.
Hellman, Joel, and Daniel Kaufmann. 2004. "The Inequality of Influence." In J. Kornai and S. Rose-Ackerman, eds., Building a Trustworthy State in Post-Socialist Transition. New York: Palgrave Macmillan.
Kaufmann, Daniel, Aart Kraay, and Pablo Zoido-Lobatón. 1999a. "Aggregating Governance Indicators." Policy Research Working Paper 2195. World Bank, Washington, D.C.
---. 1999b. "Governance Matters." Policy Research Working Paper 2196. World Bank, Washington, D.C.
Kaufmann, Daniel, Aart Kraay, and Massimo Mastruzzi. 2004. "Governance Matters III: Governance Indicators for 1996, 1998, 2000 and 2002." World Bank Economic Review 18(2):253-87.
---. 2005. "Governance Matters IV: Governance Indicators for 1996-2004." Policy Research Working Paper 3630. World Bank, Washington, D.C.
---. 2006. "Governance Matters V: Governance Indicators for 1996-2005." Policy Research Working Paper 4012. World Bank, Washington, D.C.
---. 2007a. "Governance Matters VI: Aggregate and Individual Governance Indicators for 1996-2006." Policy Research Working Paper 4280. World Bank, Washington, D.C.
---. 2007b. "The Worldwide Governance Indicators Project: Answering the Critics." Policy Research Working Paper 4149. World Bank, Washington, D.C.
Kautilya. 1992. [400 B.C.E.] The Arthashastra. New Delhi, India: Penguin Classic Edition.
King, Gary, and Jonathan Wand. 2007. "Comparing Incomparable Survey Responses: Evaluating and Selecting Anchoring Vignettes." Political Analysis 15(1):46-66.
Knack, Steven. 2006. "Measuring Corruption in Eastern Europe and Central Asia: A Critique of the Cross-Country Indicators." Policy Research Department Working Paper 3968. World Bank, Washington, D.C.
North, Douglass. 2000. "Poverty in the Midst of Plenty." Hoover Institution Daily Report. October 2. (www.hoover.org.)
Olken, Benjamin A. 2007. "Monitoring Corruption: Evidence from a Field Experiment in Indonesia." Journal of Political Economy 115(2):200-49.
Persson, Torsten, and Guido Tabellini. 2005. The Economic Effects of Constitutions. Cambridge, Mass.: MIT Press.
Razafindrakoto, Mireille, and François Roubaud. 2006. "Are International Databases on Corruption Reliable? A Comparison of Expert Opinion Surveys and Household Surveys in Sub-Saharan Africa." Development Research Institute, Development Institutions and Long-Term Analysis (IRD/DIAL), Paris.
Svensson, Jakob. 2005. "Eight Questions about Corruption." Journal of Economic Perspectives 19(3):19-42.
UNDP (United Nations Development Programme). 2005. Governance Indicators: A User's Guide. New York: UNDP.
World Bank. 1992. Governance and Development. Washington, D.C.
---. 2002. Building Institutions for Markets. New York: Oxford University.
---. 2006. Global Monitoring Report. Washington, D.C.
---. 2007. "Strengthening World Bank Group Engagement on Governance and Anticorruption." Joint Ministerial Committee of the Boards of Governors of the Bank and the Fund on the Transfer of Real Resources to Developing Countries, Washington, D.C. [www.worldbank.org/html/extdr/comments/governancefeedback/gacpaper.pdf].
World Economic Forum. 2006. The Global Competitiveness Report 2006-2007. New York: Palgrave Macmillan.

Two Comments on "Governance Indicators: Where Are We, Where Should We Be Going?" by Daniel Kaufmann and Aart Kraay

The following comments by Shantayanan Devarajan and Simon Johnson provide two perspectives on indicators in general and the World Governance Indicators in particular.

Shantayanan Devarajan

The World Bank Research Observer publishes balanced surveys of the literature. When the authors of a survey are also the proponents of one of the major indicators being surveyed, it invites comments to ensure that balance is maintained.

Kaufmann and Kraay provide a useful taxonomy of governance indicators, distinguishing between those measuring "rules on the books" and "rules on the ground" and those reflecting the views of experts and the results of surveys.
While providing a balanced overview of the pros and cons of different methods, they make a strong case in favor of measuring rules on the ground based on an aggregated mix of expert- and survey-based indicators, along the lines of their World Governance Indicators (WGI).

© The Author 2008. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org. doi:10.1093/wbro/lkn001. Advance Access publication February 12, 2008. 23:31-36

Any assessment of governance indicators, or of any indicator for that matter, must be based on the purposes to which the indicators will be put, as Kaufmann and Kraay note. This comment examines how well the WGI and other indicators perform in two specific instances.

The first is the allocation of resources, such as the concessional aid provided by the World Bank's International Development Association (IDA) or the US Millennium Challenge Corporation (MCC), across low-income economies. These resources are allocated according to a formula that includes, among other factors, the productivity of aid in reducing poverty in a particular country, a factor that is partly a function of the quality of governance. An indicator is thus needed that measures the quality of governance across countries. By focusing on rules on the ground, using a mix of expert- and survey-based information, aggregating across indicators within a country, and providing standard errors around the means of these aggregate indicators, the WGI provide a defensible method of making cross-country comparisons. The IDA uses the judgment of World Bank experts in assessing a country's policies and institutions, but that judgment is informed by the WGI. The MCC uses five of the WGI, along with 11 other indicators, to determine a country's eligibility for its programs.
The WGI have produced some seemingly anomalous results (India, for instance, ranks in the bottom quartile worldwide on "political stability and lack of violence"; see note 1). But aggregating across surveys in each country is needed if results are not to reflect surveys conducted across only a limited set of countries or some other bias-inducing method.

A second possible use of the WGI is to help identify the nature of the "governance problem" in a country. Bangladesh has ranked in the bottom quartile worldwide (and considerably below the low-income country average) in all but one or two of the aggregate WGI for the past decade and has been at or very near the bottom of Transparency International's corruption indicators. At the same time, the growth rate of per capita income has risen 0.1 percent every decade since the 1970s (the growth rate is now close to 5 percent), and poverty fell 0.8 percent in five years (twice the rate in India). The country has already achieved universal primary enrollment and gender equity in secondary school, and it is on track to reduce child mortality by two-thirds (relative to 1990 levels) by 2015.

In what sense, then, does Bangladesh have a "governance problem"? Is it possible that the data on development indicators have been mismeasured? Although some development indicators may be weaker than they appear (there is some evidence, for example, that enrollment rates among the lowest quintile of Bangladesh's population remain extremely low; World Bank 2007), the country has clearly made tremendous gains in development over the past two decades. It is possible that it would have performed even better absent its governance problems (World Bank 2006). But this does not answer the question of how Bangladesh's development outcomes are so much better than those of other countries with "better" governance.
One explanation is that governance indicators in Bangladesh fail to capture the fact that the country has a vibrant and active civil society that not only delivers services, but provides some accountability to government. The WGI also seem to overlook the increasingly mature media, including vernacular newspapers, which play something of a watchdog role. But these measurement errors may be second-order considerations compared with the fundamental facts that the governance indicators are capturing. Bangladesh does score in the 30th percentile on the WGI voice and accountability indicator; it comes close to the 25th percentile on government effectiveness. But it ranks in the 5th percentile on control of corruption. These rankings are very low for a country that is performing so well.

Another explanation is that the relation between governance and development in Bangladesh is unique, because the Bangladeshi people have worked around the country's governance problems to spur development. When the country was born, out of a civil war, there was hardly a government. International and national nongovernmental organizations (NGOs) filled the vacuum by delivering basic services, such as health, family planning, and education, and by creating microcredit schemes. As these efforts proved effective, the government made space for these NGOs and the private sector, in some cases contributing to their financing. The government funds secondary education, for example, although 95 percent of it is provided by the nonstate sector. Similarly, Bangladesh's garment export sector grew rapidly, thanks to duty drawback systems and bonded warehouses that enabled textiles to come into the country duty-free, circumventing a highly opaque customs system.

Of course, this explanation does not account for why Bangladesh was able to work around its governance problems when so many other countries were not.
Bangladesh is a densely populated, homogeneous society, in which innovations spread like wildfire. Soon after one village discovers something that works, neighboring villages find out about it and adopt it. As a result, family planning, microfinance, and other programs took off in Bangladesh more easily than they might have elsewhere. By the time the government grew strong enough to control the NGOs and others, it was too late: microfinance, family planning, and private schooling had already become commonplace. (To its credit, the government recognized this and proceeded to support the providers of essential services with finance.) The result is a country with weak governance indicators, but impressive development.

The Bangladesh case illustrates the fact that governance indicators such as the WGI do not capture the multifaceted ways in which governance affects development in a particular country. It would be dangerous to use indicators to jump to simple conclusions without understanding the specific relation between governance and development in a particular country; the indicators should certainly not be used by themselves to design policy responses to problems of weak governance.

This is the downside of having an indicator that permits intercountry comparisons of governance: the richness of country-specific detail is lost. Kaufmann and Kraay recognize this tradeoff in their concluding discussion about more disaggregated indicators. One should not go too far down this path, however, lest the main benefit of the WGI and other such indicators, their comparability across countries, be diluted in the quest for indicators that are more country specific. In the Tinbergen tradition of not having more objectives than instruments, one should not expect governance indicators to serve too many purposes.

Notes

Shantayanan Devarajan is the chief economist of the South Asia Region at the World Bank; his email address is sdevarajan@worldbank.org.

1.
This result may not be as anomalous as it appears at first blush, however: according to the Indian prime minister, 170 of 602 districts have a significant Naxalite (Maoist) presence (Singh 2005).

References

Singh, Manmohan. 2005. "PM's Reply in Rajya Sabha to the Debate on Motion of Thanks to the President's Address." March 11. New Delhi.
World Bank. 2006. Can South Asia End Poverty in a Generation? South Asia Region. Washington, D.C.
-. 2007. To the MDGs and Beyond: Accountability and Institutional Innovation in Bangladesh. Bangladesh Development Series No. 14. Dhaka.

Simon Johnson

Kaufmann and Kraay took a major step forward in thinking about governance when (together with Pablo Zoido-Lobaton) they first published their World Governance Indicators (WGI), in 1999. These indicators provided a new way to combine comparable indices along various dimensions of governance. The WGI included more countries and more pieces of data than had previously been available, complementing various other measures.

From the beginning, the WGI have clearly indicated the underlying sources of the data and how the data are constructed, explicitly reporting error bands for all estimates. This was a major innovation and remains of paramount importance. The authors (together with Massimo Mastruzzi) have also done a great service by continuing to update the data and refine their methods.

The WGI are now well established as one of the standard sets of measures that any researcher or policy analyst must consult. No measure is perfect, of course, but anyone trying to establish that a particular set of results is or is not robust is well advised to work carefully through the various vintages of the WGI.

The appearance of the WGI coincided with a major backlash against carefully measuring and thinking hard about governance.
While this backlash initially seemed rather academic and sometimes arcane, it had considerable immediate impact on both policy discussions and the impetus to research in this area. Rather than building on the WGI, serious endeavor in this area lost momentum, as three main arguments against governance indicators gained currency.

The first is that because the WGI explicitly report "errors," they cannot be relied on. This argument is completely mistaken: similar error bands should be reported for all the data used in economics, particularly for countries with weak statistical systems.

The second argument is that governance measures are not useful because many countries are growing, despite weak governance. What are the uses of these measures if they cannot predict who will and will not grow? While countries with weak institutions can, and indeed often do, grow for prolonged periods, over the longer run, it is hard to escape weak governance. Far from the exception, growth spurts are actually a standard feature almost everywhere (Berg, Ostry, and Zettelmeyer 2006), as is the inability of many countries with weak governance to sustain the gains made during the good years.

Some countries, albeit only relatively few recently, have managed to sustain rapid growth, despite weak initial governance. Too little is known about these cases, but one common element appears to be a strong focus on exports, particularly of manufacturing goods (see Johnson, Ostry, and Subramanian 2007 for specific country examples). These exceptions notwithstanding, if a country's governance indicators are weak, as measured by the WGI, the presumption should be that sustained growth will be difficult. Strengthening governance should help increase the odds of sustaining growth.

The third and the most interesting argument is that aggregate indicators miss a great deal of rich detail, failing to pick up the de facto arrangements that effectively take the place of more formal governance rules.
Close observation of how society is organized always turns up functional forms of improvisation, that is, ways to organize transactions even where contract enforcement is weak. Some informal mechanisms may be quite efficient, in the sense that the transaction costs are not much higher than they would have been had formal contracting worked well. Some of this organization can occur in and around the manufactured export sector. Initial improvisation could lead to more durable rules over time, both in the manufacturing sector and, through institutional spillovers, in the country as a whole.

One needs to be very careful, however, about drawing policy implications from apparent conditions that cannot be measured; it is hard to make much progress on the basis of anecdotal evidence. Informal governance arrangements may shift and be fragile; they may also prevent people from entering new activities. The limited work on this phenomenon suggests that informal governance is often a weak substitute for well-functioning governance.

What is needed is a WGI-type approach to measure governance at a more local level, for both sectors and cities. There have already been ambitious steps in this direction (for example, Kaufmann, Leautier, and Mastruzzi 2005), and there has been some indication that the World Bank, which is uniquely positioned in this regard, would move in this direction. So far, however, progress has been limited.

There are many good reasons to move carefully in this area, and no doubt we need to think carefully about how to make best use of resources, something that Kaufmann and Kraay reflect in this article, with their balanced assessment of alternative sources within and outside the WGI. But it would be unfortunate if the legitimate debate over interpretation of macrolevel governance measures were to undermine confidence in this area of enquiry and prevent further work from being undertaken. The field is still at an early stage.
Better measures are needed to gauge what works and what does not. Such measures should stand firmly on the shoulders of the WGI.

Note

Simon Johnson is an economic counselor and director of the Research Department at the International Monetary Fund; his email address is sjohnson@imf.org.

References

Berg, Andrew, Jonathan Ostry, and Jeromin Zettelmeyer. 2006. "What Makes Growth Sustained?" IMF Working Paper. International Monetary Fund, Washington, D.C.
Johnson, Simon, Jonathan Ostry, and Arvind Subramanian. 2007. The Prospects for Sustained Growth in Africa: Benchmarking the Constraints. NBER Working Paper 13120. Cambridge, MA: National Bureau of Economic Research.
Kaufmann, Daniel, Frannie Leautier, and Massimo Mastruzzi. 2005. "Governance and the City: An Empirical Exploration into Global Determinants of Urban Performance." Policy Research Working Paper 3712. World Bank, Washington, D.C.

Walking up the Down Escalator: Public Investment and Fiscal Stability

William Easterly, Timothy Irwin, and Luis Serven

When growth-promoting spending is cut so much that the present value of future government revenues falls by more than the immediate improvement in the cash deficit, fiscal adjustment becomes like walking up the down escalator. Although short-term cash flows matter, too tight a focus on them encourages governments to invest too little. Cash-flow targets also encourage governments to shift investment spending off budget by seeking private investment in public projects, irrespective of its real fiscal or economic benefits. To deal with this problem, some observers have suggested excluding certain investments (such as those undertaken by public enterprises deemed commercial or financed by multilaterals) from cash-flow targets. These stopgap remedies may help protect some investments, but they do not provide a satisfactory solution to the underlying problem.
Governments can more effectively reduce the biases created by the focus on short-term cash flows by developing indicators of the long-term fiscal effects of their decisions, including accounting and economic measures of net worth, and, where appropriate, including such measures in fiscal targets or even fiscal rules. JEL codes: O23, E62, H60, H54

© The Author 2008. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org. doi:10.1093/wbro/lkm014. Advance Access publication January 28, 2008. 23:37-56

A popular phrase during the era of macroeconomic stabilization of the 1990s was "adjustment with growth." The focus of this article is on the surprising possibility that some types of fiscal austerity not only fail to bring growth, but they may not even bring "adjustment" in the long run.

Consider the following anecdote from the World Bank's own budgeting experience. In 1993 the World Bank Research Department unexpectedly produced a bestseller entitled The East Asian Miracle. The Research Department soon exhausted its administrative budget allocation for reprinting the book. The World Bank's centralized budget department denied a request for extra budgetary resources for printing more copies of the book on the grounds that the Research Department had already exceeded its printing budget, even though producing more copies of the book would have more than paid for itself!

This kind of unreason is not confined to the world of bureaucratic budgetary management; it also extends to fiscal policy practice. The primary concern of most fiscal programs is to ensure public sector solvency, commonly viewed as an essential ingredient of macroeconomic stability.
Solvency is by definition an intertemporal concept, relating to the present value of revenues and expenditures and encompassing both assets and liabilities. A cut in public investment that lowers growth will lower the present value of revenues; it is conceivable that the government's intertemporal position deteriorates at the same time as the cash deficit improves. In practice, however, it is customary to assess the strength of public finances almost exclusively on the basis of the cash deficit (or "overall balance"), that is, the rate of acquisition of debt by the public sector.

Latin America offers a good illustration of this practice. There is rising concern across the region that the fiscal adjustment that many countries have had to undertake since the early 1990s may have come with an excessive fall in public investment (figure 1). To the extent that the response of private investment has been insufficient to offset the decline of public investment in key sectors, such as infrastructure, current levels of public investment are perceived by many as too low to support long-term growth rates consistent with rapid poverty reduction.

Political economy considerations make adjustment difficult at the best of times. The recent backlash against free-market reforms in Latin America and the long-standing sensitivity to conditions perceived as imposed by outsiders make it more important than ever that adjustment programs be well conceived.

Figure 1. Primary Deficit and Public Infrastructure Investment in Latin America, 1980-2001
Series shown: primary deficit; public investment in infrastructure.
Note: Figure is based on data from Argentina, Bolivia, Brazil, Chile, Colombia, Costa Rica, Mexico, and Peru.
Source: Calderon and Serven (2004).

The international evidence suggests that Latin America's experience is the rule rather than the exception.
Declines in infrastructure spending often account for the lion's share of fiscal deficit reduction, as Hicks (1991) shows for developing economies in a cross-regional context, Easterly and Serven (2003) show for Latin America, and Estache (2004) shows for Sub-Saharan Africa. For industrial countries, Roubini and Sachs (1989) and De Haan et al. (1996) find that capital expenditure falls disproportionately at times of fiscal stringency. Balassone and Franco (2000) show that fulfillment of the Maastricht deficit targets sped the decline of public investment in the European Union (figure 2): of the nine countries that exceeded the deficit target in 1992, eight met it in 1997. In all eight, public investment had fallen relative to GDP; in seven of them, it had also fallen relative to total primary expenditure. In contrast, three of the six countries that met the target in 1992 raised their public investment in the subsequent years.

The tendency toward compression of public investment at times of fiscal austerity underlies the fact that investment is the most volatile of all public spending items, as Talvi and Vegh (2000) document using data from developing economies and Lane (2003) documents using data from industrial countries.

Of course, declining public investment would be of little consequence if it reflected improved spending efficiency or were fully matched by increased investment by the private sector. In most countries in Latin America, the only developing region for which adequate data are available, this may have been the case in the telecommunications sector. But the evidence suggests that in most infrastructure sectors in most countries, private investment did not offset public sector retrenchment (Calderon and Serven 2004).

Declining investment is a cause for concern when it results in decreased accumulation of public capital and public capital is productive. This is not always

Figure 2.
Primary Deficit and Public Investment in the European Union, 1980-2002
Series shown: primary deficit; public investment.
Note: Figure is based on data from Austria, Belgium, Finland, France, Germany, Spain, Sweden, and the United Kingdom.
Source: OECD Economic Outlook database.

the case; many projects labeled as public investment can be white elephants, which bring no future output benefits. The link between public investment spending and capital accumulation can be fragile if investment involves significant waste, when projects are poorly selected and public procurement is inefficient or beset by corruption, for example (Pritchett 2000). With weak governance, public investment may become a vehicle for dispensing political favors rather than acquiring productive assets (Keefer and Knack 2007).

The empirical literature is far from unanimous on the contribution of public capital to aggregate output or growth; this lack of agreement is hardly surprising in the context of growth empirics. Nevertheless, most studies, especially the more recent ones, do find a positive impact. The conclusions appear to depend in part on the approach followed: studies using measures of physical infrastructure assets find significantly positive output contributions in the vast majority of instances, whereas those that measure public capital using cumulative investment flows tend to be less conclusive, likely for the reasons outlined in the preceding paragraph. In some cases, however, both approaches yield similar results; using both financial and physical measures of public capital, for example, Ferreira and Araujo (2005) find significantly positive output effects in Brazil.

Moreover, even if wasteful public investment spending weakens the link between spending and outcomes, an across-the-board reduction in public investment will still result in cuts in productive infrastructure projects.
Sacrificing such projects weakens the economy's growth potential; the right response is instead to protect high-return projects from spending cuts. If government does otherwise, it is trying to walk up the down escalator.

This article offers a selective overview of these issues. It draws from the Latin American experience, because it has been relatively well documented, but its concerns are much more general. Indeed, they cut across developing as well as industrial regions, although they are more pressing in developing countries, which still have a long way to go in building up their infrastructure capital stocks. The article draws policy implications for the design and monitoring of fiscal targets consistent with both solvency and the efficient utilization of fiscal resources.

The article is organized as follows. The next section reviews the shortcomings of the current approach to fiscal discipline. The second and third sections deal with two types of remedies: granting exceptions to existing fiscal targets and introducing new targets. The last section offers some concluding comments.

Shortcomings of the Standard Approach to Fiscal Discipline

Fiscal adjustment programs typically focus on the short-term time path of the government's cash deficit, whose measurement is usually the center of attention of fiscal accounting. Short-term cash deficits and debt are the key fiscal concern of official creditors and form the basis of loan conditions in the fiscal and macroeconomic dimensions. They are also closely scrutinized by multilateral institutions, private creditors and investors, and economic analysts.

There are good reasons why these fiscal aggregates should be closely watched. The cash deficit approximates the government's financing needs, which are a primary concern for the fiscal authorities as well as financial market participants.
It can also give an indication of the public sector's contribution to overall aggregate demand and thus its stance from the viewpoint of short-term stabilization, although the primary deficit (which excludes interest payments) may be preferable for this purpose.

Debt and the cash deficit can be misleading as solvency measures, however, because they do not take into account the assets and future income the government may acquire by incurring debt today. This, of course, is hardly surprising: liquidity and solvency are fundamentally different concepts; as in corporate finance, different indicators are needed to gauge them. A corporation does not seek to maximize just this year's cash flow; it seeks to maximize the present discounted value of all future cash flows. Telling the public sector to improve the cash balance no matter what would be like telling Apple to forgo investing in a new iPod factory in order to improve this year's cash flow.

Solvency assessments based on debt and the cash deficit implicitly treat all public expenditures in the same way, because they all pose the same claim on today's fiscal resources. This blurs the distinction between public investment and public consumption and, more precisely, between expenditures that yield future fiscal benefits and those that do not, even though they may have radically different implications for tomorrow's public revenues and therefore for solvency itself.

Such practice distorts the tradeoffs faced by fiscal policy, both across time and among different kinds of public expenditures. Across time, binding debt and cash deficit targets today tend to encourage postponement of expenditures and advancement of tomorrow's revenues, even if their present value, which is the relevant concern for solvency, remains unchanged (or declines as a result of delaying urgently needed expenditures, for example).
Across expenditure types, liquidity targets pose a one-for-one tradeoff at the margin, regardless of the type of expenditures involved, whereas solvency targets do not. Faced with these tradeoffs, governments having to strengthen public finances frequently choose adjustment paths that, by altering the time profile, the composition of expenditures, or both, attain the prescribed liquidity targets without any significant improvement in solvency. They resort to deferring payments to the first day of the next year, accumulating arrears to government workers or suppliers, advancing the collection of taxes, awarding higher pensions instead of increased wages, or granting guarantees instead of subsidies. Easterly (1999) and Easterly and Serven (2003) provide a variety of examples of this kind of illusory fiscal adjustment.

Thus, other things being equal, governments facing binding liquidity targets today may devote too few resources to expenditures that yield returns tomorrow. This effect of liquidity targets on public spending composition is additional to the biases introduced by other political economy factors. These factors (which include governments' short time horizons and political clientelism) can distort spending choices by discouraging public expenditures whose benefits accrue in the future in favor of those with immediate fiscal or political payoffs. Far from correcting these distortions, the conventional approach to fiscal discipline magnifies them.

If fiscal adjustment disproportionately cuts infrastructure spending that enhances growth, it can lead to a vicious circle in which low growth generates unsustainable debt dynamics, which force fiscal adjustment implemented through investment cuts, which lower growth further and prompt additional fiscal retrenchment and investment cuts. In other words, if debt stabilization is pursued primarily by cutting productive spending, destabilization can ensue.
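The vicious circle can be illustrated with a small numerical sketch. It uses the textbook debt-ratio recursion, in which next year's debt-to-GDP ratio equals this year's ratio times (1 + r)/(1 + g), plus the primary deficit; this recursion is standard debt arithmetic, not a formula taken from the article. Every number is hypothetical and chosen only for illustration. In particular, the assumption that an investment cut worth 2 percent of GDP lowers growth by 3 percentage points is deliberately stark and is not an estimate from the studies cited here.

```python
def debt_path(initial_debt, interest_rate, growth_rate, primary_deficit, years):
    """Debt-to-GDP ratio over time: d[t+1] = d[t] * (1 + r) / (1 + g) + primary deficit.

    All arguments are ratios (0.60 means 60 percent of GDP, 0.06 means 6 percent).
    """
    path = [initial_debt]
    for _ in range(years):
        path.append(path[-1] * (1 + interest_rate) / (1 + growth_rate) + primary_deficit)
    return path


# Hypothetical baseline: debt of 60% of GDP, 6% interest, 4% growth,
# and a primary deficit of 1% of GDP.
baseline = debt_path(0.60, 0.06, 0.04, 0.01, years=20)

# Hypothetical adjustment: public investment is cut by 2% of GDP, turning the
# primary deficit into a 1% surplus, but growth is ASSUMED to fall from 4% to 1%.
adjusted = debt_path(0.60, 0.06, 0.01, -0.01, years=20)

for year in (1, 5, 10, 20):
    print(f"year {year:2d}: baseline {baseline[year]:.3f}  adjusted {adjusted[year]:.3f}")
```

Under these assumptions the debt ratio is lower after the cut in the first few years but ends up above the baseline later on, which is the escalator effect in miniature. The result depends entirely on the assumed growth loss: the smaller it is, the later the two paths cross, if they cross at all.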
This phenomenon has been documented in both industrial and developing countries. In industrial countries, Alesina and Perotti (1997) find that fiscal corrections based primarily on public investment contraction are typically unsuccessful: they have an adverse effect on growth, and their stabilizing effect on public finances is eventually reversed. Calderon and Serven (2003) review the impact of public investment cuts in selected Latin American countries over the past decade. Their calculations suggest that the ensuing slump in growth and tax collection may have greatly weakened the intended solvency-enhancing effects of the capital expenditure decline.

These issues concern all kinds of public expenditures that generate future fiscal benefits. Public infrastructure investment is the leading example, to the extent that public capital yields financial returns that the government can capture. Conceptually, infrastructure investments can be divided into three groups:

- Investment that generates direct financial returns through user fees, such as ports, airports, railways, and toll roads.
- Investment that does not generate user fees but increases growth and future tax collection.
- Investment that generates no future fiscal or growth benefits, whether or not the project has a positive social return (as in the case of environmental projects).

The first two types of projects may pay for themselves, that is, generate a stream of financial returns whose present value exceeds the cost of the projects. On solvency grounds, deficit financing of those projects, termed "self-liquidating" (Mintz and Smart 2007), is most likely to be justified, because such projects increase government net worth, even if they raise public debt in the short run. In practice, for the projects to increase government net worth, the government must be able to capture the returns.
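The "pays for itself" test can be made concrete with a minimal sketch. It compares the present value of the returns the government actually captures (user fees or taxes) with the project's cost, and it also computes the break-even marginal product for a tax-financed project as user cost divided by tax rate. The function names and the two projects are hypothetical; the 0.2 tax rate and 10 percent user cost are the illustrative values the article itself uses, and depreciation is ignored in the cash-flow part for simplicity.

```python
def present_value(cash_flows, discount_rate):
    """Present value of cash flows received at the end of years 1, 2, 3, ..."""
    return sum(cf / (1 + discount_rate) ** year
               for year, cf in enumerate(cash_flows, start=1))


def is_self_liquidating(cost, captured_returns, discount_rate):
    """True if the returns the government captures are worth at least the upfront cost."""
    return present_value(captured_returns, discount_rate) >= cost


def break_even_marginal_product(user_cost, tax_rate):
    """Annual output gain per unit of capital at which tax revenue just covers the user cost."""
    return user_cost / tax_rate


# Hypothetical toll road: costs 100, yields user fees of 12 a year for 15 years.
toll_road = is_self_liquidating(100, [12] * 15, discount_rate=0.05)

# Hypothetical untolled road: costs 100, raises output by 30 a year for 15 years,
# of which the government captures 20 percent in taxes.
untolled_road = is_self_liquidating(100, [0.2 * 30] * 15, discount_rate=0.05)

print(toll_road, untolled_road)                 # True False
print(break_even_marginal_product(0.10, 0.2))   # 0.5
```

In this sketch the untolled road is worth far more to the economy than it costs, yet it does not pay for itself in fiscal terms at a 20 percent tax rate, which is the gap between economywide and public sector returns that the text goes on to discuss.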
For the first type of projects, user fees must be sufficient to cover project costs; for the second type of projects, taxes must be high enough to translate the additional growth into sufficient additional revenues.

In the absence of user fees, many growth-enhancing projects may fail to generate sufficient tax revenues to cover their cost. With the low (marginal) tax collection rates of many developing economies, the growth impacts have to be considerable to yield the required tax revenue increases. For example, with a tax rate of 0.2, the output contribution would have to be five times as high as the project's user cost for the government to break even. So if the user cost is about 10 percent (say, a 5 percent real interest rate and a 5 percent rate of depreciation), the project's marginal productivity must be at least 50 percent for the given tax rate to yield sufficient revenues (see Serven 2007 for the analytics of this and similar calculations).

Such high productivity is more likely to arise in situations in which the initial endowment of public capital is low (relative to that of other productive assets), specifically, when public capital services are substantially underprovided so that the marginal product of capital exceeds its user cost by a wide margin. Empirically, the international evidence appears to be consistent with the view that the marginal productivity of infrastructure capital is higher in developing economies, especially poorer ones, than in industrial countries (Calderon and Serven 2007).

Probably as a reflection of these country-specific ingredients, empirical results are mixed regarding whether public investment may be self-financing through its growth and tax-collection effects. Perotti (2004) examines this issue in five countries in the Organisation for Economic Co-operation and Development (OECD), using a vector autoregression approach.
He finds that in Canada and the United Kingdom, the extra public capital makes a negative growth contribution; in Australia and the United States, the growth and tax collection effects finance only 20-30 percent of the investment cost. Only in Germany do these effects finance more than 100 percent of the cost. Using similar techniques but a broader sample of industrial countries, Pereira and Pinho (2006) find that public investment is roughly self-financing in France, Greece, and Ireland and more than self-financing in Germany and Italy. The growth effects are large in the majority of countries considered.

Because developing economies possess smaller infrastructure capital endowments than industrial countries do, infrastructure capital might be expected to have a higher marginal productivity and so to come closer to being self-financing. Ferreira and Araujo (2005) find that public infrastructure investment is self-financing in Brazil, although it takes 10 years or more for the government to collect sufficient tax revenues to recoup the investment cost.

Even if public sector projects fall in the intermediate area of having high returns for the economy as a whole but insufficient returns for public finances to improve public sector solvency, it may still be suboptimal to cut such projects during periods of fiscal austerity. An ideal marginal revenue collection scheme would allow the public sector to capture the returns, thereby eliminating the wedge between economywide and public sector returns. After all, the business of the public sector is precisely to provide public goods that yield a high return for the economy as a whole.

In addition, public investment projects are more likely to exhibit higher marginal productivity ex post if the government's ex ante project evaluation capabilities are strong enough for it to select high-return projects and reject low-return ones. This, however, is far from assured in practice.
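The break-even arithmetic used earlier in this section (a tax rate of 0.2 and a user cost of about 10 percent) can be sketched in a few lines. This is only an illustrative sketch: the function names are hypothetical, the 25 percent marginal product in the usage lines is an assumed figure rather than an estimate from the literature, and the calculation ignores timing, risk, and user fees.

```python
def break_even_marginal_product(tax_rate: float, real_interest: float,
                                depreciation: float) -> float:
    """Marginal product of public capital at which the extra tax revenue
    just covers the user cost (real interest plus depreciation)."""
    if not 0.0 < tax_rate <= 1.0:
        raise ValueError("tax_rate must lie in (0, 1]")
    user_cost = real_interest + depreciation
    return user_cost / tax_rate


def self_financed_share(tax_rate: float, marginal_product: float,
                        real_interest: float, depreciation: float) -> float:
    """Fraction of the user cost recovered through extra tax revenue."""
    user_cost = real_interest + depreciation
    if user_cost <= 0.0:
        raise ValueError("user cost must be positive")
    return tax_rate * marginal_product / user_cost


if __name__ == "__main__":
    # Figures from the text: tax rate 0.2, 5 percent interest, 5 percent depreciation.
    required = break_even_marginal_product(0.2, 0.05, 0.05)
    print(f"break-even marginal product: {required:.0%}")

    # Hypothetical project with a 25 percent marginal product (assumed value).
    share = self_financed_share(0.2, 0.25, 0.05, 0.05)
    print(f"share of user cost financed by extra taxes: {share:.0%}")
```

A share below one corresponds to the intermediate case discussed above: a project with a high economywide return that does not pay for itself through the budget.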
Many developing economies lack ex ante and even ex post project evaluation capabilities. (One exception is Chile, which has thorough procedures for evaluating projects; see Fontaine 1997.) Unconditional endorsement of public infrastructure spending would lead to wasteful investments, as Tanzi and Davoodi (2002) convincingly argue. Roads would be built to nowhere, and useful roads would not be maintained. Power plants would lie idle after being built too far ahead of demand. Water supply networks would be fully used but still burden the budget, because the tariff increases on which their financial viability was predicated were not allowed. Some investments would be well motivated but poorly informed; others would be motivated by bribes, patronage, or photo opportunities (Keefer and Knack 2007).

Infrastructure investment is the expenditure item that has attracted most attention in the ongoing debate about the design of fiscal policy. But the link between spending composition and solvency arises in a broader context. On the one hand, not all public investment projects yield future income to the government. On the other hand, some current expenditures do yield future fiscal returns. Infrastructure operation and maintenance expenditure is a case in point. Operation and maintenance determines the useful life of capital and hence has a "capital-creating" effect similar to that of investment. If public capital yields financial returns to the government, so does operation and maintenance. In fact, the financial, as well as social, return on operation and maintenance expenditure may well exceed that of new capital when the assets are not being properly maintained (Rioja 2003a, b; Kalaitzidakis and Kalyvitis 2004; Serven 2007).

This does not imply that developing countries should rush to raise public investment or that, as a rule, public investment increases should be financed with debt (or in any other particular way).
The decision to invest should be guided by the return on the investment. That return is determined primarily by the marginal productivity of public capital, itself dependent on the government's ability to select good projects and the (relative) scarcity of services rendered by public capital (for example, the availability of infrastructure services). Both return and cost calculations should embody risk adjustments to take account of uncertainty. Indeed, assessments of the effect on net worth of public investment projects should err on the side of caution, particularly when the government's initial indebtedness is high, because in such cases even small changes in interest rates may have very large adverse effects on public finances.

Excluding Certain Public Investments from Fiscal Targets

Broadly speaking, there are two possible ways to address the bias against productive public spending implicit in existing fiscal targets. One is to retain the targets but exempt from their action certain public investments deemed more likely to enhance growth and solvency. The other is to adopt new fiscal targets. This section reviews the first alternative; the next section discusses the second.

Privately Financing Public Investment Projects

One way to place investment projects beyond the reach of short-term deficit and debt targets is to have private firms finance them. Indeed, across the developing world, many governments have turned to the private sector to finance new investments. In Latin America, for example, many governments have privatized their telecommunications firms and parts of their power and water industries. In the transport sector, private firms are often engaged in public-private partnerships in which the government retains an important financial role but the private sector finances investment.
Chile and Colombia have had roads privately financed under arrangements in which the government provides revenue or foreign exchange guarantees. Other countries have begun to use a different form of public-private partnership in which a private firm finances an asset (such as a school, hospital, or prison), but the government purchases the service under a long-term contract.

These arrangements may improve the returns to investment and thus enhance government solvency. In many cases, however, concerns about efficiency and solvency have played a minor role, and the resort to private financing has been guided primarily by the desire to evade the pressure of liquidity targets on public investment. Projects conceived with such a purpose in mind may not be well designed from the point of view of efficiency or solvency.

The difference between liquidity and solvency effects is particularly apparent in privately financed projects in which the government purchases the service under a long-term contract. In such projects, explicit debt is replaced by similar commitments that are typically off balance sheet, without any major change in the magnitude of the government's financial obligations. It is also apparent when the government provides guarantees to private investors (such as guarantees of the private firm's debt or revenue) that leave the public sector bearing much of the investment risk (Hemming and International Monetary Fund 2006; Irwin 2007b). Even when such guarantees are not formally offered ex ante, they may be provided ex post through renegotiation of concession agreements. The bailout of the Mexican toll road program in 1997, for example, cost 1.0-1.7 percent of GDP (World Bank 2005; see also Guasch 2004).

On the whole, private financing has not come to play the dominant role in the provision of infrastructure services in Latin America or elsewhere that some observers expected.
Although private financing now dominates telecommunications and some other infrastructure industries in some countries, it still plays a small role in roads and water and sanitation, something that is unlikely to change in the near term. Moreover, it would be undesirable for decisions about the ownership of infrastructure firms to be driven by short-term fiscal constraints. When private ownership works better than public ownership in terms of efficiency, equity, or both, a state-owned firm should be privatized, even if ownership of the firm requires little investment. Likewise, when public ownership works better, the firm should remain public, even if major investment is required. Private financing may thus lessen the problem caused by liquidity targets in certain cases, but it is not an appropriate response to the general problem.

Excluding Specific Public Projects

Another option is to exclude from fiscal targets certain investments undertaken by the public sector. A recent proposal would exclude projects financed by multilateral institutions on the grounds that such projects are more likely than others to be carefully screened and designed. This idea has not garnered much support, partly because the fungibility of money means that the marginal financing from multilateral institutions would not necessarily support the intended projects. Furthermore, in many developing economies, total multilateral flows are too small to make a big difference.

A second proposal, developed and refined by the International Monetary Fund (IMF 2004), is to exclude from fiscal targets investments by public enterprises that are deemed to be commercially run. This proposal (which is not new; see Afonso 2005) is, in principle, potentially important for countries in which public enterprises are included in the public sector aggregates monitored under fiscal programs. This is the case in Latin America but not in most other regions.

In practice, this approach poses several problems.
First, appropriate criteria to identify commercially oriented public enterprises are difficult to establish. Second, the fact that enterprises that meet likely criteria are the exception rather than the rule in many countries (IMF 2005) detracts from the practical relevance of this approach for public investment.

Another difficulty is that excluding commercially run public enterprises from targets may further restrict investment elsewhere in the public sector if those enterprises make a positive net contribution to the aggregate budget surplus. Exclusion of those enterprises would then make fiscal targets more, rather than less, stringent for productive public expenditure. One way around this difficulty is to exclude the investments, but not the savings, of these firms from fiscal targets. Doing so, however, adds complexity, which detracts from the transparency of the approach. An alternative would be to relax the targets by the net saving of the excluded public enterprises.

The fundamental problem with this proposal is that investments by enterprises that are not commercially run may still have offsetting fiscal and economic benefits, as is the case of many investments in roads, for example. In other words, the investments of commercially run public enterprises may not be the ones with highest priority from a fiscal or social perspective. Removing restrictions only on investment by commercially run public enterprises may still leave an overall public investment program far removed from the socially desirable one. For example, it is unlikely that Brazil's infrastructure needs would be significantly alleviated by allowing more investment by the commercially run public oil company PETROBRAS. Such outcomes are a general problem with any proposal based on the special treatment of specific investments.
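The way excluding commercially run enterprises can tighten, rather than loosen, the constraint on the rest of the public sector can be illustrated with simple arithmetic. The sketch below is not taken from the IMF proposal. All figures are hypothetical (in percent of GDP), the class and function names are invented for illustration, and the target is assumed to be a ceiling on the overall deficit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FiscalPosition:
    """Hypothetical public sector aggregates, in percent of GDP."""
    deficit_ceiling: float        # maximum deficit allowed by the target
    rest_current_balance: float   # current surplus of the rest of the public sector
    enterprise_saving: float      # saving of commercially run enterprises
    enterprise_investment: float  # investment of those enterprises


def investment_room(p: FiscalPosition) -> dict:
    """Maximum investment by the rest of the public sector under each
    treatment of commercially run enterprises."""
    net_saving = p.enterprise_saving - p.enterprise_investment
    base = p.deficit_ceiling + p.rest_current_balance
    return {
        # Enterprises stay inside the target: their net saving creates room.
        "enterprises_included": base + net_saving,
        # Enterprises fully excluded, target unchanged: that room is lost.
        "enterprises_excluded": base,
        # Only enterprise investment excluded; their saving still counts.
        "only_enterprise_investment_excluded": base + p.enterprise_saving,
        # Enterprises excluded, ceiling relaxed by their net saving.
        "excluded_with_target_relaxed": base + net_saving,
    }


if __name__ == "__main__":
    example = FiscalPosition(deficit_ceiling=3.0, rest_current_balance=0.5,
                             enterprise_saving=2.0, enterprise_investment=1.0)
    for treatment, room in investment_room(example).items():
        print(f"{treatment}: {room:.1f}")
```

With these assumed numbers the enterprises are net savers, so full exclusion leaves less room for other public investment than the original target did. The last two treatments restore or widen that room, at the cost of the added complexity noted above.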
Developing New Fiscal Targets Incorporating Measures of Net Worth

The limitations of "selective" approaches that exempt certain investments suggest that it is important to consider more fundamental changes, in particular, whether governments can develop measures of net worth that are sufficiently accurate and objective to be used as a basis for fiscal targets. Measuring net worth (the difference between the value of assets and the value of liabilities) requires forecasts. The value of an asset is the present value of the net revenues it will generate, and such revenues are usually uncertain. Likewise, the value of a liability is the present value of the payments it will cause the government to make, and such payments are often uncertain.

Because of these uncertainties, indicators of net worth are inherently approximate. Governments seeking to beautify their reported fiscal positions can take advantage of uncertainty to overestimate future revenues and underestimate future costs. Governments seeking to protect some category of public spending on political grounds can exaggerate the value of the "assets" it creates, even if the spending is actually wasteful. If an indicator of net worth is too vulnerable to such manipulation, it has too little credibility to be useful.

Two indicators are used to measure net worth, one generated by modern accrual accounting, the other by long-term fiscal projections. This section examines the accuracy and reliability of each.

Modern Accrual Accounting

Like traditional cash accounts, modern accrual accounts include information on short-term cash flows. Unlike traditional cash accounts, accrual accounts also include a balance sheet showing assets and liabilities. As a result, accrual accounts generate a measure of net worth. They also include a measure of the surplus or deficit that is not based on current cash flows.
That measure includes revenues that have been earned but are not yet collected and bills that are payable but not yet paid. Crucially, investment itself is not counted as an expense in the period of investment; only the depreciation of the investment is included. The difference between accrual revenues and accrual expenses gives an income-statement surplus that is roughly equal to the increase in the government's net worth.

To see the implications of accrual accounting, consider a government investing $200 million in a power plant, financed entirely by borrowing (this example is taken from Irwin 2007a). Assume that, in the first year, no revenue is received, no operating costs are incurred, and no depreciation occurs. Under cash accounting, the government's accounts show $200 million in extra expenditure, which increases the cash deficit and debt by the same amount (table 1). Under modern accrual accounting (table 2), these consequences of the investment on the government's cash flows and debt are revealed, but so, too, are the consequences for the government's assets. The accounts report that the investment has no net effect on the government's net worth or income-statement surplus.

Table 1. Debt-Financed Investment in Cash Accounting

Item            Amount ($ million)
Revenues        0
Expenditure     200
Surplus         -200
Debt            200

Note: Cash surplus is the sum of cash disbursed to operations and cash disbursed to investment.
Source: Authors.

Table 2. Debt-Financed Investment in Modern Accrual Accounting

Item                              Amount ($ million)
Income statement
  Revenue                         0
  Expenses                        0
  Income-statement surplus        0
Balance sheet
  Assets                          200
  Liabilities                     200
  Net worth                       0
Cash-flow statement
  Cash disbursed to investment    200
  Cash surplus                    -200
  Cash from financing             200

Note: Cash surplus is the sum of cash disbursed to operations and cash disbursed to investment.
Source: Authors.

Accrual-based accounting standards for financial reporting have the advantage of being designed to limit the self-serving bias that uncertainty makes possible. Most obviously, financial reports, whether based on cash or accrual accounting, must be audited by an independent auditor. In addition, accrual accounting standards tackle bias by preferring measures that are objectively verifiable even at the cost of some relevance. Some standards, for example, require an asset to be valued by recording its acquisition cost and then depreciating the cost according to a simple formula. The resulting value can only approximate the asset's true value, but the measure is less vulnerable to bias than alternative measures. When standards require the reporting of market value, they sometimes require the valuation to be performed by an independent expert (other than the auditor). Accounting scandals show that these safeguards can fail to prevent biased reporting, but they are surely better than nothing. Moreover, although the risk of misleading accrual information is real, it does not provide a strong argument against the adoption of accrual accounting, because cash-based reporting is at least as vulnerable to manipulation, as argued earlier.

Reporting according to modern accrual accounting standards is, however, more costly than reporting according to cash standards, and it can take years for a government to move from cash to accrual accounting. These costs are more likely to be justified in middle- and high-income countries than in the poorest developing economies.
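The power plant example in tables 1 and 2 can be reproduced with a small sketch that records the same transaction under both conventions. This is a simplified illustration, not an accounting standard: the class and function names are hypothetical, the opening balance sheet is assumed to be empty, and all flows are assumed to settle in cash within the year.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FiscalYear:
    """One year of flows, in $ million."""
    investment: float            # cash spent on the new asset
    borrowing: float             # new debt issued
    revenue: float = 0.0
    operating_cost: float = 0.0
    depreciation: float = 0.0


def cash_view(y: FiscalYear) -> dict:
    """Cash accounting: investment is treated as ordinary expenditure."""
    expenditure = y.operating_cost + y.investment
    return {
        "revenues": y.revenue,
        "expenditure": expenditure,
        "surplus": y.revenue - expenditure,
        "debt": y.borrowing,
    }


def accrual_view(y: FiscalYear) -> dict:
    """Accrual accounting: only depreciation is expensed, and the asset
    appears on the balance sheet."""
    income_surplus = y.revenue - y.operating_cost - y.depreciation
    change_in_cash = y.revenue - y.operating_cost - y.investment + y.borrowing
    assets = (y.investment - y.depreciation) + change_in_cash
    liabilities = y.borrowing
    return {
        "income_statement_surplus": income_surplus,
        "assets": assets,
        "liabilities": liabilities,
        "net_worth": assets - liabilities,
    }


if __name__ == "__main__":
    plant = FiscalYear(investment=200.0, borrowing=200.0)
    print("cash:", cash_view(plant))
    print("accrual:", accrual_view(plant))
```

Under these assumptions, the change in net worth always equals the income-statement surplus, which is the approximate relationship described in the text.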
Many high-income countries (including Australia, Canada, New Zealand, the United Kingdom, and the United States) have already made the transition, and many middle-income countries (including Chile, Indonesia, the Philippines, and South Africa) are adopting accrual accounting.

Although accrual accounting generates valuable information missed by traditional cash accounting, it is not sufficient for the assessment of net worth, even in middle- and high-income countries. For one thing, accounting values can diverge too much from true values. After many years of inflation, the depreciated acquisition cost of an asset may greatly underestimate the present value of the cash flows it will generate. In addition, accounting values of assets that generate user fees and higher tax payments at best capture only the value of the user fees, because the present value of future tax revenues does not count as an asset from the conventional accounting perspective. In contrast, durables are generally treated as assets and valued at their depreciated acquisition or replacement cost, even if they generate no future cash flows from either user fees or taxes. Expenditure on a bridge to nowhere can create an accounting asset even if it generates no tolls and does nothing to increase economic output.

Long-Term Fiscal Projections

Long-term fiscal projections have the potential to remedy some of the shortcomings of accrual accounting. Such projections, prepared in various ways by countries such as Australia, New Zealand, the United Kingdom, and the United States, can include estimates of the government's operating and investing cash flows over the next 50-75 years, which can then be discounted back to the present to arrive at an estimate of the government's net worth. Crucially, all expected cash flows under current policies can be projected, including taxes and welfare expenditure.
The projections can include public investment, expenditure on operations and maintenance, and payments to privately financed firms in public-private partnerships. Revenues from user fees can be included. If evidence suggests that some investments increase tax revenue by generating more taxable economic activity, this extra revenue can also be included.

Long-term fiscal projections can be designed to take account of the uncertainty of future cash flows. On the one hand, future revenues and expenditure can be adjusted for risk: risky tax revenues, for example, can be discounted at a higher rate than more predictable pension spending. On the other hand, the projections can show how net worth changes with critical assumptions about life expectancy, health care costs, and output. If output is modeled as a function of the stock of public capital, the projections can show how sensitive the government's net worth is to this assumption.

The biggest disadvantage of long-term projections is a corollary of their usefulness: generating the relevant information requires estimates that are subject to enormous uncertainty. What will the future rate of growth of GDP be? How will it be affected, if at all, by public investment? The large extent of reasonable disagreement about such estimates implies a large range of reasonable estimates of the government's net worth. This makes room for self-serving projections.

Table 3. Benefits and Drawbacks of Alternative Sources of Fiscal Indicators

Feature                                                      Cash accounting     Modern accrual accounting    Long-term projections
Provides information on short-term cash flows                Yes                 Yes                          Yes
Provides information on net worth, given current policies    No                  Partially                    Yes
Incorporates uncertainty                                     Avoids the issue    Partially                    Yes
Limits self-serving forecast bias                            Yes                 Yes                          Not easily

Source: Authors.
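The mechanics of a long-term projection, and its sensitivity to the growth assumption, can be sketched as follows. This is a deliberately minimal illustration, not any government's projection model. The function names, the starting values, the growth paths, and the two discount rates are all assumptions chosen for the example. Risk adjustment is represented only by discounting revenues at a higher rate than spending, as the text suggests.

```python
def present_value(flows, rate):
    """Discount a list of end-of-year cash flows (years 1, 2, ...) to today."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(flows, start=1))


def projected_net_worth(revenue0, spending0, output_growth, spending_growth,
                        years, revenue_rate, spending_rate, net_debt=0.0):
    """Present value of projected revenues minus projected spending,
    less initial net debt. Revenues are assumed to grow with output."""
    horizon = range(1, years + 1)
    revenues = [revenue0 * (1.0 + output_growth) ** t for t in horizon]
    spending = [spending0 * (1.0 + spending_growth) ** t for t in horizon]
    return (present_value(revenues, revenue_rate)
            - present_value(spending, spending_rate)
            - net_debt)


if __name__ == "__main__":
    # Hypothetical economy: revenue 30 and spending 29 (percent of current GDP),
    # a 50-year horizon, and initial net debt of 40.
    for growth in (0.02, 0.03, 0.04):
        nw = projected_net_worth(revenue0=30.0, spending0=29.0,
                                 output_growth=growth, spending_growth=0.03,
                                 years=50, revenue_rate=0.06,
                                 spending_rate=0.04, net_debt=40.0)
        print(f"output growth {growth:.0%}: net worth {nw:,.0f}")
```

Running the loop over several growth rates shows how widely the estimate can move with a single assumption, which is the source of both the usefulness and the vulnerability to bias discussed above.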
Though uncertainty in the estimates is unavoidable, a government can make its projections more credible. It can allow others to see how the results are generated by making its projection model publicly available (in a spreadsheet on the Internet, for example). For critical parameters or key variables, it can use the estimates of a panel of independent experts, as Chile does in estimating its structural surplus. It can prepare and publish standards that it will follow in making the projections. And it can legislate that an independent auditor must opine on whether the projections follow those standards and reflect the stated assumptions. International organizations and financial institutions could help develop the standards and expertise necessary for some of these steps.

Each approach has benefits and drawbacks (table 3). Cash accounting offers the necessary information on liquidity and, because of its short-term focus, limits opportunities for some sorts of bias, but it provides no information on the crucial issue of government net worth. Modern accrual accounting fills this gap using methods designed to limit bias, but partly because of the concern to limit bias, it includes some poor estimates of values and provides no information on crucial elements such as future tax revenues. Long-term projections can overcome these problems, but only at the cost of requiring more estimates, creating more leeway for bias. Given the advantages and disadvantages of each approach, the best strategy, at least for middle- and high-income countries, would seem to be for governments to develop both accrual accounting and long-term projections while continuing to monitor short-term cash flows.

Fiscal Targets and Fiscal Rules

Indicators of net worth can also form the basis for new fiscal targets and rules that promote solvency without sacrificing public investment. One alternative is the so-called golden rule, according to which governments can borrow only to invest in the creation of new assets. This idea of separating the current and capital budgets is hardly new, but it has recently been revived (a classic reference is Musgrave 1939; see Bassetto and Sargent 2005 for historical background). The British government has adopted a version of the golden rule that "over the economic cycle" allows it to borrow only to invest and only as long as public sector net debt remains below 40 percent of GDP (H.M. Treasury 2004).

Accrual accounting suits the golden rule, because another way of stating the rule is to say that governments must not run an income-statement deficit (that is, they must not reduce their net worth). Because the income-statement surplus excludes expenditure on investment, the golden rule encourages investment. Because the income-statement surplus includes depreciation, however, adopting the golden rule is not the same as simply exempting investment from the fiscal targets; it amounts to exempting net investment from fiscal targets. Observance of the golden rule over a long period of time would eventually result in a public debt stock no larger than the public capital stock so that to a first approximation the outstanding debt would be fully backed by public assets.

The limitations of typical accounting mean that the golden rule does not ensure solvency or even expected solvency. The reason is that the assets may not yield an expected return high enough to cover the interest on the debt that financed their acquisition. Furthermore, by treating current and capital spending differently, the golden rule offers an incentive for opportunistic misclassification of expenditures.

An alternative that, in principle, avoids these problems is the permanent-balance rule. Roughly stated, it requires governments to set tax rates at a constant fraction of output that over the long run pays for the government's present and future expenditure (Buiter and Grafe 2004).
Named by analogy with Milton Friedman's permanent-income hypothesis, the rule allows governments to borrow when revenue is temporarily low or when current investment opportunities are greater than future investment opportunities. Long-term fiscal projections suit the permanent-balance rule, because such projections are required to determine the required minimum tax rate. Implementing the permanent-balance rule successfully would thus require addressing the problems discussed earlier about the reliability of long-term fiscal projections, which is no easy matter.

A compromise solution that avoids some of the problems with both the golden rule and the permanent-balance rule is the modified golden rule recently proposed by Mintz and Smart (2007). They suggest that governments should be able to borrow to invest in self-liquidating assets, such as power plants that generate future revenues for the government, but not in assets that generate no revenues, such as typical public schools. Furthermore, just as firms do not finance all their assets with debt, governments should be limited to borrowing only some fraction of the value of the revenue-generating assets. These features would result in a rule that is more conducive to solvency than the golden rule, without relying as much on potentially unreliable long-term forecasts as does the permanent-balance rule.

Conclusions

In many industrial and developing economies, governments have cut back on public investment as they have brought their budgets closer to balance. Although budget cuts were probably necessary, the cuts in public investment may have been counterproductive, because much theory and evidence suggest that public investment has the potential to increase future output.
In the worst case, investment cuts trigger a vicious circle, in which the subsequent deterioration of future revenue forces further investment cuts, leading to yet further deterioration, further investment cuts, and so on. What is supposed to be fiscal adjustment in this case actually has the same consequences as fiscal profligacy. Cutting investment to promote solvency becomes the fiscal equivalent of walking up the down escalator: riders step up only to end up below where they started.

The cuts in public investment should have come as no surprise when most countries measure their fiscal position not in terms of net worth but in terms of short-term cash flows and gross debt, and cutting investment can reduce debt and short-term cash flows, even as it reduces net worth. The problem afflicts both industrial and developing economies, but it is more pressing in developing economies, which have not yet built up their public capital stocks.

The decline in public investment suggests the need to rethink fiscal strategies. In some cases, it may be best to increase public investment and accept a higher short-term cash deficit in exchange for higher tax and user-fee revenues later. This strategy is unlikely to be right for all countries, however. Those with good infrastructure and bad fiscal positions may indeed do well to cut public investment. Countries with high taxes and debt may do best to increase public investment but finance it by cutting current expenditure. Still others, with high debt and little room for cuts in current expenditure, may have no choice but to raise taxes or forgo improvements in their infrastructure. Each case must be analyzed on its merits and, given the tendency to be optimistic in forecasting growth and the performance of investments, with a degree of skepticism. One general lesson is that appropriate spending composition has to be an essential part of fiscal adjustment and consolidation strategies, because it affects growth outcomes.
In other words, spending targets and growth forecasts cannot be set without regard to the composition of expenditure, as they currently are.

All governments are likely to benefit from better fiscal information. The idea is not to abandon measures of debt and short-term cash flows, which are clearly important, but rather to supplement them with measures of assets, yielding a measure of net worth and its change over time. What is needed is information that allows governments to quickly determine when improvements in short-term cash flows are coming at the expense of declining net worth. Two means of generating such information are constructing long-term projections of fiscal cash flows and adopting modern accrual accounting.

Better fiscal information is helpful irrespective of whether the government follows quantitative fiscal rules or targets. But the question also arises whether governments should set themselves fiscal rules or targets incorporating measures of investment or net worth. Because debt and short-term cash flows matter, rules or targets based exclusively on net worth may not be helpful. But combining net worth with conventional fiscal measures may have merit. The United Kingdom, which adopted a version of the golden rule combined with a debt target, offers one example. An even better option might be a modified golden rule that allows borrowing to finance a portion of the cost of cash-generating assets but also requires that some proportion be financed by current taxes.

There is no obvious "best" solution, but whatever the specific solution chosen, it is clearly time to change the exclusive focus on public sector liabilities and bring public sector assets into the picture when designing fiscal adjustment. It is much easier to walk up the up escalator.
Notes

William Easterly is professor of economics at New York University, where he is also a faculty affiliate of Africa House and co-director of the Development Research Institute; his email address is william.easterly@nyu.edu. Timothy Irwin is a senior economist in the Finance, Economics, and Urban Department at the World Bank; his email address is tirwin@worldbank.org. Luis Serven (corresponding author) is research manager for macroeconomics and growth in the Development Research Group at the World Bank; his email address is lserven@worldbank.org. An earlier version of this article was prepared for the World Bank's Latin American Regional Studies Program. The authors are grateful to Penelope Brook, Antonio Estache, Jose Luis Irigoyen, Guillermo Perry, Sergio Rebelo, Augusto de la Torre, and three anonymous referees for helpful comments.

The World Bank Research Observer, vol. 23, no. 1 (Spring 2008)

What Can Countries in Other Regions Learn from Social Security Reform in Latin America?

Indermit S. Gill, Ceren Ozer, and Radu Tatucu

About a dozen countries in Latin America have enacted reforms that include elements being contemplated elsewhere, including the partial privatization of social security. It is not easy to draw universal lessons for social security reform from the experience of countries such as Argentina, Chile, and Mexico, however, where sizeable public pension systems went bankrupt before the populations aged, mainly because of mismanagement. Most developing economies have much smaller social security systems. Relatively well-managed systems in industrial countries face problems that are long term in nature and have been brought about by an aging population. The experiences of Latin America nevertheless offer some general lessons for countries in other parts of the world. These lessons relate to changes in labor market incentives accompanying reforms and how workers react to them, government actions that have met with success in managing the transition to funded pensions, and the expectations of individuals from social security systems.
Latin America's reforms suggest that the most effective approach is to keep payroll taxes low, governments solvent, and social security systems focused on providing reasonable insurance against poverty in old age. JEL codes: G23, H31, H53, H55, J26.

Latin America has long experience with social security reforms. Since the early 1980s, and especially during the 1990s, about a dozen countries in the region have attempted to radically reform their social security systems. Many observers have studied social security reforms and outcomes in Latin America to draw lessons. Among the more widely discussed social security reform experiences have been those of Chile, Argentina, and Mexico. Gill, Packard, and Yermo (2005), among others, have analyzed these experiences, with the objective of informing pension policy in Latin America. This article attempts to draw lessons for other countries from the Latin American experience.

© The Author 2008. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org. doi:10.1093/wbro/lkm011. Advance Access publication January 15, 2008. 23:57-76

Table 1. Actual and Projected Total Fertility and Life Expectancy at Birth, by Region

                                   Total fertility (children per woman)       Life expectancy at birth
                                   Actual             Projected
Region                             1970-75  2000-05   Low    Medium  High     2000-05  2045-50
World
Africa
Asia                               5.08     2.47      1.42   1.91    2.41     67.3     77.2
Europe                             2.16     1.40      1.33   1.83    2.33     73.7     80.6
Latin America and the Caribbean    5.05     2.55      1.36   1.86    2.36     71.5     79.5
Northern America                   2.01     1.99      1.35   1.85    2.35     77.6     82.7
Oceania                            3.23     2.32      1.42   1.92    2.42     74.0     81.2

Source: Jousten 2007, based on data from United Nations Secretariat 2005.

As in Latin America, many other developing regions are experiencing significant demographic change. Birth rates have fallen, and life expectancy is on the rise (table 1).
Latin America offers the most varied experience with structural pension reform. Thus it provides insights into how radical reform of pension systems can help meet the pressures created by rising longevity and persistent old-age poverty.

The fiscal deficits and increasing contingent liabilities (obligations to pay sums dependent on future events) of generous public pension systems, often combined with system mismanagement, created an immediate impetus for governments to institute structural pension reforms in Latin America. Rising pension costs raise questions of fiscal sustainability across the globe, but so far only a few countries have engaged in major structural reform. Many focused instead on parametric changes, adjusting the size and scope of their single-pillar social security systems (the mandatory, pay-as-you-go, publicly provided part) by changing the rates of contributions, the benefit calculations, and the retirement age. This article assesses whether the demographic and social pressures these countries face require structural reform of social security, along the lines of the reforms adopted in Latin America.

This article does not address the special challenges of reforming pension systems for government workers; it focuses on the social security system for the employees of enterprises. Civil service pensions put significant fiscal pressures on more than half of the world's countries, including some of the largest developing economies, such as Brazil, China, and India, which have separate pension schemes for civil servants. Civil service pension reform is a contentious issue. Not surprisingly, in most Latin American countries the reforms did not affect private and public sector employees equally. For political reasons reforming governments often avoided structural changes to the public pension systems benefiting the military and civil servants.

The article is organized as follows.
The next section summarizes social security problems and reforms in various regions. The following section assesses the performance of reform in Latin America and draws lessons from that experience for other countries. The last section draws implications for policymakers considering reforms.

Social Security Problems and Reforms

The underlying conceptual framework is based on the "comprehensive insurance" concept of Ehrlich and Becker (1972, 2000). In the face of a possible loss, a "comprehensive insurance" approach suggests that individuals can insure against the loss, take steps to lower the likelihood that the loss will occur, or do nothing. The purchase of insurance transfers income from "good" to "bad" times in order to reduce the magnitude of losses in bad times. Individuals can insure themselves in two ways: through mechanisms that pool the risk of the loss occurring among those who are exposed to this risk or by consumption smoothing through individual savings ("self-insurance"). In a perfect world, pensions could be left to private insurance and to individuals' voluntary saving decisions. In the presence of imperfect information, missing markets, and other distortions, government involvement becomes necessary.

In addition to insurance and consumption-smoothing objectives, the two other primary objectives of pensions are poverty relief and redistribution. Redistribution complements the role of progressive taxation, for example, by subsidizing the consumption smoothing of individuals who earned little during their working years (Barr and Diamond 2006). Pension policy may also have secondary goals, such as improving the operation of labor and capital markets and encouraging individuals to save more. Some might argue that promoting economic growth is an additional objective of pensions. One of the key debates on pensions centers on the relative weights of these different objectives.
Another debate concerns whether pensions should be pay-as-you-go or funded. Most state-run pension schemes remain pay-as-you-go. Private schemes are generally funded, with pensions paid from a fund built up over time by members' contributions. Many countries in Latin America added mandatory contributions to private funded pensions to their existing pay-as-you-go schemes. There is general agreement on the desirability of regulated voluntary pensions; disagreements remain over whether private schemes should be mandatory. Supporters of mandatory funded private schemes emphasize that individual incentives to work, save, and incur risks are least distorted if the mandated saving flows into privately managed accounts.

Individuals' pension-related decisions involve long-run choices and require a good understanding of pension products, which are often complex. Both factors create information problems, which reduce, often considerably, people's ability to make choices that maximize their long-term well-being (Barr and Diamond 2006). Arenas de Mesa and Mesa-Lago (2006) find, for example, that many workers in Chile lack the data and skills to make an informed selection of the best pension provider. Barr and Diamond (2006) argue that imperfect information in this context cannot be addressed simply by offering more information, because the issue at hand is an information-processing problem.

The most common component of a pension system is a national defined-benefit scheme in which pension benefits depend on a worker's wages and age (table 2). Some countries also require that workers contribute to individual retirement accounts, and pensions are paid from the accumulated funds. Many countries also encourage workers to contribute to individual accounts managed by financial institutions. The last two kinds of pensions are called defined-contribution plans (for a more detailed explanation of the economics of pensions, see Barr and Diamond 2006).
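The distinction between the two designs can be made concrete with a minimal Python sketch. It is a stylized illustration, not the formula of any actual scheme: the accrual-rate benefit rule, the constant wage, the end-of-year contribution timing, and all function and parameter names are assumptions introduced here.

```python
def defined_benefit_pension(final_wage: float, years_of_service: int,
                            accrual_rate: float) -> float:
    """Annual pension under a stylized defined-benefit rule.

    Assumption: the benefit equals the final wage times an accrual rate
    for each year of service. Real schemes use a variety of formulas.
    """
    return final_wage * years_of_service * accrual_rate


def defined_contribution_balance(wage: float, contribution_rate: float,
                                 annual_return: float, years: int) -> float:
    """Account balance at retirement under a stylized defined-contribution plan.

    Assumption: the wage is constant and each contribution is paid in at the
    end of the year, so it earns no return in the year it is made.
    """
    balance = 0.0
    for _ in range(years):
        # Last year's balance earns the return, then this year's contribution is added.
        balance = balance * (1.0 + annual_return) + wage * contribution_rate
    return balance


# Hypothetical worker: wage of 10,000, 30 years of work.
db_pension = defined_benefit_pension(10_000, 30, accrual_rate=0.015)
dc_balance = defined_contribution_balance(10_000, contribution_rate=0.10,
                                          annual_return=0.04, years=30)
```

In the first function the pension is fixed by the formula, so the scheme's sponsor bears the risk of funding it. In the second the pension depends on what the account has accumulated, so the worker bears the investment risk.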
Table 2. Instruments for Old-Age Income Security

                         First pillar               Second pillar              Third pillar
Mainstay                 Pooling                    Saving                     Saving
Nature of instrument     Mandatory                  Mandatory                  Voluntary
Main function            Insure against poverty     Smooth consumption         Smooth consumption
                         in old age, reduce         over life cycle            over life cycle
                         income inequality
Main role of government  Define benefits            Define contributions       Define incentives
Principal risk bearer    Government                 Worker                     Worker
Financial instrument     Unfunded pay-as-you-go     Funded individual          Funded tax-preferred
                                                    accounts                   individual accounts

Source: Authors' compilation.

A recent review of pension systems in 53 countries, comprising all 30 Organization for Economic Co-operation and Development (OECD) members plus 23 countries in Eastern Europe and Central Asia, Latin America and the Caribbean, and the Middle East and North Africa, shows that the primary differences across systems lie in three main areas (Whitehouse 2007).

First, the level of social security benefits (the target replacement rates) varies considerably. There is a negative link between the target replacement rate for mandatory pensions and the relative importance of voluntary private pensions. In countries such as Canada, the United Kingdom, and the United States, where voluntary private pension provisions are widespread, mandatory pensions are relatively small.

Second, the relative emphasis on pooling and saving differs significantly. Countries such as Australia, Canada, New Zealand, and the United Kingdom emphasize the pooling function; Nordic countries and most Latin American countries also have progressive systems. In continental Europe (outside the Nordic countries) and the Middle East and North Africa pensions are more strongly correlated with earnings, signaling a greater emphasis on the savings function.

Third, the relative reliance on public and private sectors varies greatly.
In addition to the larger role for voluntary and private provision in countries with low target replacement rates, many countries also involve the private sector in running the mandatory pension system. The best-known cases of pension reforms that increase the role of the private sector are in Latin America and the Caribbean, followed by Eastern Europe and Central Asia. The private sector plays an important role in mandatory pension provision in about one-third of the high-income OECD countries. At the other end of the spectrum, countries in the Middle East and North Africa do not yet involve the private sector in mandatory pensions, and voluntary provisions barely exist.

The experiences of Brazil, China, India, the Russian Federation, and South Africa illustrate the range of problems faced by developing economies and transition economies. Reforms in these and other countries are described below.

Brazil

Conditions in Brazil today resemble those in some Latin American countries before they undertook the type of structural reforms assessed below. Pension reform was motivated mainly by fiscal pressures. Despite having a young population, Brazil's level of public expenditure on pensions is large. Subsidies to cover Brazil's public pension regimes' deficits rose from 4.6 percent of GDP in 1998 to 5.6 percent in 2004 (Giambiagi and de Mello 2006). So far pension reform has involved streamlining the system's mandatory first pillar and developing a third pillar of voluntary, complementary, personal saving schemes, without creating a second pillar of mandatory individual saving plans. Since the late 1990s, reform of the regime for private sector workers has been aimed at tightening eligibility conditions, reducing replacement rates, and increasing the share of population covered by social security.

East Asia

East Asia has a diverse set of pension systems.
Malaysia and Singapore have large provident fund systems operating under public administration at the national level on the basis of defined contributions. The Republic of Korea, the Philippines, and Thailand have OECD-style defined-benefit pension schemes (though with lower coverage rates than industrial countries), with more emphasis on redistribution.

Like other communist countries, China formerly had defined-benefit pay-as-you-go pension systems covering urban public sector employees, and contributions were largely the responsibility of state-owned enterprises. Rising pension expenditures (caused partly by the use of early retirement as a mechanism to deal with excess workers at state enterprises) and declining contributions (the result of poor performance by state enterprises, rising unemployment, a growing informal sector, and weak enforcement) motivated China to seek the best reform alternatives (Asher and others 2005).

South Asia

India, like other countries in South Asia, is still at the beginning of its demographic transition. It has a low ratio of pensioners to workers, who continue to contribute to social security schemes. Although only 13 percent of India's labor force is covered by pensions, pension debt is becoming a serious issue. Implicit pension debt is estimated at 25 percent of GDP nationally, and in some states the extent of the problem is much greater.

India is in the process of passing into law a new pension system that would shift all new central government employees to a defined-contribution plan from the current defined-benefit scheme, shifting the risk of retirement financing from the government to individuals (Shah 2006). Participants in the new scheme will have access to a range of investment products from selected private sector companies. The new pension system will be offered on a voluntary basis to private sector workers.
Aiming to fulfill the social protection dimension of pensions, India's noncontributory pensions target the elderly poor: the means-tested schemes administered by states and supplemented with federal funds reach 1 of every 10 elderly Indians.

Eastern Europe and Central Asia

Regionally, Europe and Central Asia is second only to Latin America in terms of pension reform activity. In the early transition period, countries in the region faced serious challenges to their social security systems as output fell, contributions declined, and the number of beneficiaries grew. Transition proved to be challenging politically. Some countries chose parametric reforms, such as raising the retirement age, with or without changing the benefit formulas; others, especially the European Union accession countries, undertook structural reform. Ten countries in Europe and Central Asia created second pillars of mandatory, funded individual accounts.

The Russian Federation began structural reforms of its pension system in 2001. The new system comprises three pillars. The first pillar, the major component of the system, is a publicly managed pay-as-you-go, defined-benefit scheme that consists of a flat basic benefit and a notional defined-contribution scheme. The second pillar is a mandatory defined-contribution scheme with mixed public-private management. The third pillar is the voluntary privately managed component (OECD 2006).

Middle East and North Africa

Countries in the Middle East and North Africa have already put in place defined-benefit pension systems financed on a pay-as-you-go basis. Egypt, Iran, and Libya have also developed noncontributory pension schemes. Even with young populations, many pension systems in the region are not financially sustainable without reform. Despite this, reforms have been limited, with Lebanon and Morocco among the few countries considering systemic pension reform.
Sub-Saharan Africa

Noncontributory schemes in Sub-Saharan Africa exist in only a few countries, such as South Africa, where they are financed by general revenues. As in South Asia, the pension reform agenda in Sub-Saharan Africa is driven by the fiscal pressures arising from civil service pensions. Average coverage in the region is less than one-fifth of the labor force, with the rest of the population relying on its own resources and informal old-age support.

Pension systems around the world are thus diverse, with every country facing a unique set of problems. But there are some common features. Many countries in the developing world, including major economies such as China and India, have to deal with expanding social security systems that cover only small parts of their populations, and they have to do so within tight fiscal constraints. Industrial countries face problems that are long term in nature and have been brought about by an aging population. While the immediate concern behind the reform process has often been fiscal sustainability, getting the incentives right is equally important.

Performance in Latin America and Lessons of Experience

The experiences of Latin America offer some general lessons for countries in other parts of the world. Chile first adopted structural reform in 1981. Argentina, Bolivia, Colombia, Mexico, Peru, and Uruguay followed in the 1990s, and Costa Rica, El Salvador, the Dominican Republic, Ecuador, and Nicaragua followed after 2000. The notable exceptions to reform are Brazil and Republica Bolivariana de Venezuela.

The details of the reforms vary across countries. What is common is that a publicly mandated and administered pay-as-you-go component operated on a defined-benefit basis was retained and a publicly mandated but privately administered system of defined-contribution personal accounts was added.
Governments also made some attempts to increase voluntary saving through defined incentives, such as Individual Retirement Accounts in the United States, which encourage individual retirement savings through tax benefits. But the headline item has been the new system of personal accounts.

Mandatory personal accounts constitute the second pillar. Second-pillar reforms can be classified into three categories. The first is the "Chilean model," which made private accounts mandatory for all new workers; Bolivia, El Salvador, and Mexico also adopted this model. The second is what might be called the "Peruvian model," which Colombia also adopted. Under this model, new workers are given a choice between a downsized pay-as-you-go pension and a private account. Under the third approach, which can be termed the "Argentine model," new workers have a pay-as-you-go tier combined with a private account tier; Costa Rica and Uruguay also adopted this model.

Until the early 2000s, Latin American countries were inclined to adopt some variant of these three approaches, with a tendency for later reformers to select the Chilean model, which gave new workers no choice but the personal accounts. Since then, pension reforms have been as likely to eschew privatization entirely. Brazil, for example, has chosen to reform the parameters of its pay-as-you-go pensions, and Ecuador and Nicaragua have decided to postpone structural reforms.

Assessment of the fiscal, financial, and labor market effects of the reforms reveals that the results have been mixed. Performance in a variety of areas is assessed below.

Effect of Reform on System Balances

Fiscal imbalances were the primary motivation behind reforms, just as they appear to be the main concern in India, the Russian Federation, and the United States. The reforms seem to have had some success. Simulations by Gill, Packard, and Yermo (2005) for eight Latin American reformers indicate that the rate of
accumulation of pension debt fell sharply in most countries as a result of the reform (figure 1). In Bolivia, for example, pension-related debt would have been almost 160 percent of GDP in 2030 without reforms but is less than 50 percent of GDP with reforms. In Uruguay pension-related debt ratios for 2030 would be about 150 percent without reform and 70 percent with reform.

Figure 1. Pension-related Long-term Deficits after Reforms
Source: Based on Gill, Packard, and Yermo (2005).

These are the long-term effects. In the immediate aftermath of reforms, however, these countries had to deal with the transition costs, as contributions were diverted from paying benefits to the elderly to investing in the private accounts of workers. The promised benefits to current pensioners and older workers under the old system had to be paid, while part of the payroll tax flowing in had to be diverted to fund individual accounts. With contributions diverted into funded pension accounts, governments had to find ways of financing existing pay-as-you-go liabilities. For a variety of reasons, in some countries these transition costs proved to be higher than expected at the time of the reform. In Bolivia, for example, the pension-related deficit has been rising rather than falling as projected.

Figure 2. Importance of Government Bonds in Private Account Portfolios
Categories shown: government securities; financial institutions; corporate bonds and equities.
Source: Based on Gill, Packard, and Yermo (2005).

More important is the fact that many countries had to finance the transition through increased government debt, some of it held by the new pension funds. With the investment regulations favoring government debt and with the thin capital markets in much of the region, Latin American workers essentially swapped pay-as-you-go debt for government bonds. Fully two-thirds of the average investment portfolio consisted of government securities (figure 2).
While a case can be made that this debt is more secure, the case of Argentina, where the government wrote down its debt by more than two-thirds, suggests that this greater security is a matter of degree. Moreover, with less than 20 percent of these funds going to corporate bonds and equities, the growth effects of the reform were also likely weak.

The reforms aimed at improving labor market efficiency by strengthening the links between contributions and benefits and by reducing the regressive transfers that characterized the previous social security systems. They appear to have been successful in reducing regressive transfers (figure 3).

Figure 3. Effect of Reforms on Within-system Equity
Bars shown for men and women, with and without reform: Chile, Peru, Colombia, Argentina, Uruguay, Mexico, Bolivia, El Salvador.
Note: Figure shows percentage-point difference in internal rates of return earned from national retirement security system by wealthiest and poorest workers.
Source: Based on Gill, Packard, and Yermo (2005).

Although reforms have addressed within-system equity concerns, the most inequitable aspect of unreformed social security systems in Latin America was that they excluded large shares of the population from even a semblance of income security. Reforms have been less effective in addressing this problem. While the closer links between benefits and contributions may have increased participation, the effect was small. Participation rates in most countries have essentially flat-lined at levels ranging from about 10 percent to about 67 percent of the active labor force (figure 4). This lackluster performance, despite closer links between contributions and benefits, can be attributed to high (and rising) payroll taxes, as discussed below. But some of it also reflects the high management and insurance fees private pension providers charge.
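One way to see why such fees weigh on participation is to express them as a share of what the worker pays in. The Python sketch below is a hypothetical illustration: it assumes the fee and the amount credited to the account are both set as fractions of the wage, and the function name and the example rates are invented here rather than drawn from any country's rules.

```python
def fee_share_of_contribution(savings_rate: float, fee_rate: float) -> float:
    """Fees as a share of the worker's total contribution.

    Assumption: `savings_rate` is the fraction of the wage credited to the
    individual account and `fee_rate` is the fraction of the wage charged for
    management and insurance, so the total contribution is their sum.
    """
    total = savings_rate + fee_rate
    if total <= 0:
        raise ValueError("total contribution must be positive")
    return fee_rate / total


# Hypothetical rates: 10 percent of the wage is saved, 2.5 percent goes to fees.
share = fee_share_of_contribution(savings_rate=0.10, fee_rate=0.025)
```

With these made-up rates, one-fifth of every payment goes to fees before any investment return is earned, which gives a sense of how a fee that looks small relative to the wage can be large relative to the contribution.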
Figure 4. Pension System Participation Rates, before and after Reform, 1980-2000
Countries shown: Argentina, Bolivia, Chile, Colombia, El Salvador, Mexico, Uruguay, Costa Rica, Ecuador, Nicaragua, Brazil.
Note: Reform dates are as follows: Argentina, 1994; Bolivia, 1997; Chile, 1981; Colombia, 1994; El Salvador, 1998; Mexico, 1997; Uruguay, 1996.
Source: Based on Gill, Packard, and Yermo (2005).

Effect of Reform on Financial Markets

Reforms stimulated financial markets, but often at a high price for contributors. In Latin America as a whole the administrative costs of private schemes have been considerably higher than those of public schemes (Arenas de Mesa and Mesa-Lago 2006). In theory, private systems can reduce administrative costs through competition. In practice, multiple private providers lose the advantage of economies of scale, and considerable resources are spent on advertising and sales commissions. Administrators in private systems charge a commission (as a percentage of wages) for managing the old-age program plus a premium, transferred to an insurance company, to cover disability and survivor risks (Mesa-Lago 2006).

The evidence indicates that ensuring a captive clientele for these pension funds has led to the growth of a new industry in the reforming countries; assets held by pension funds more than doubled as a share of GDP between 1998 and 2004. But while financial market development has been hastened, contributors have generally paid a high price. This was especially the case in the early days of reform in countries such as Chile; for workers who had contributed for a decade or longer, more than a quarter of contributions may have gone toward management and insurance fees.
Even today, 15-20 percent of contributions go to management fees, and workers have to pay insurance fees as well (figure 5). Administrative costs may be even higher for low-wage workers in some countries. Low-wage workers often did not participate in the system before reform because of the weak connection between contributions and benefits; they may now find the new systems unattractive because of higher payroll tax rates and onerous administrative fees.

Administrative costs have generally come down as the pension fund industries have matured, but in some countries, such as Peru, the fees charged to contributors have not fallen commensurately. This has meant high profits for the fund managers. Between 1998 and 2002 the share of workers' contributions going to fees remained steady in Peru, while fund expenses fell. As a result, profit rates skyrocketed (figure 6).

Chile's recent experience appears to be similar. Competition among private pension providers in Chile has been limited because of the small number of administrators and high and increasing concentration among the largest funds (Arenas de Mesa and Mesa-Lago 2006). Even capable Latin American governments appear to find it difficult to effectively regulate these oligopolies.

Figure 5. Administrative Fees Paid by Workers, as Percentage of Total Contribution, 2002
Countries shown: Argentina, Bolivia, Chile, Colombia, El Salvador, Mexico, Peru, Uruguay, and the average.
Source: Based on Gill, Packard, and Yermo (2005).

Lessons for Other Countries

Not all of these findings are relevant for other countries, because the countries and the social security issues being debated are different. However, some general lessons do emerge.

Figure 6. Pension Fund Costs, Fees, and Profits in Peru, 1998-2002
[Chart not reproduced. Series: fees/net contributions and operational expenses/net fees (percent).]
Source: Based on Gill, Packard, and Yermo (2005).

Effect of payroll taxes on labor market incentives. While the move to private accounts had a small but positive effect on participation, the negative effect of higher payroll taxes may have been greater. Given near universal coverage of social security in developed countries and many transition economies, the discussion here is focused on the incentives to work rather than to contribute to social security. In particular, the concern is about the distorting effect of social security on the age of retirement.

The Latin American experience appears to support those who argue against raising the payroll tax rate. Two findings are noteworthy. The first is that there is little evidence from Latin America that, in the presence of high transaction costs, individual accounts led to stronger labor market incentives, as evidenced in participation rates. Payroll taxes went up in all countries except Chile and Uruguay (figure 7).

The second finding is that diverting payroll contributions from pay-as-you-go systems to individual accounts appears to have adverse fiscal implications that are far more potent. Most Latin American countries had to create space for the second pillar, which required that they downsize and redesign the first pillar. Chile, Bolivia, and Mexico provided a minimum pension guarantee to low-income workers whose personal accumulations fell below a specified amount. Many structural reformers in Latin America had to deal with financing the transition. One option was to increase taxation (through payroll or broader taxes, such as income tax or a general consumption tax) or borrowing (by issuing conventional public sector debt). Another was to reduce public spending (on pensions or in general) or create new revenue (through privatization, for example).

Figure 7. Payroll Tax Rates before and after Reform
[Bar chart not reproduced. Countries: Chile, Peru, Colombia, Argentina, Uruguay, Mexico, Bolivia, El Salvador, Costa Rica, Nicaragua, Dominican Republic.]
Source: Based on Gill, Packard, and Yermo (2005).

When labor is mobile across sectors and the informal economy is large, structuring the premiums for social insurance programs as payroll taxes may be ineffective. Greater reliance on a broad tax base, such as an income or consumption tax instead of a payroll tax, may be more efficient. Using a broader tax is also more consistent with the poverty prevention and redistributive functions of the remaining public pooling pillar after introduction of the multipillar model, because such a tax reduces the wedge between the formal and informal parts of the labor force. The public pooling pillar enables individuals and households to manage shocks to their income should they need to, enabling them to be more enterprising.

Effect of privatization. The pension debate in Latin America has centered on two costs associated with the move to private accounts. One is the administrative fees charged by the special pension funds set up to manage these accounts and the costs of annuitizing the accumulated funds. The other is the fiscal costs associated with the transition to private accounts. Latin America's experience with fiscal costs may be relevant for other countries.

The cases of Chile and Argentina provide contrasting experiences of the interaction of pension reform and fiscal effort. In Chile a strong fiscal effort characterized the lead-in to the pension reform: fiscal surpluses averaged more than 5 percent of GDP in the years before the 1981 reform, so that Chile's fiscal deficits after the reform were mild and short-lived.
In contrast, Argentina did not substantially bolster its fiscal situation in the years leading up to its 1994 reform. Though it ran small fiscal surpluses in the two years before the reform, there is reason to believe that its fiscal stance after the reform was worse than indicated by published figures. Payroll tax deductions reduced revenues and increased pension system deficits. About half of the deterioration of the consolidated public sector fiscal deficit between 1994 and the 2001 crisis was caused by the worsening social security balance.

The degree of protection against policy risk offered by privatizing a portion of mandated pensions may also be exaggerated. The experience in countries such as Argentina illustrates how any government-organized social security system, whether directly administered or simply mandated, can fall prey to politicians. Since the start of the system about half of all privately managed assets have been invested in government bonds. During the 2001 crisis, when the government forced the pension funds to swap dollar-denominated government bonds for peso debt, the share of government bonds in the private funds' portfolio rose above three-quarters. Argentina is not unique in this regard. In Mexico, for example, several years after the reform the share of government bonds in pension fund portfolios is as high as in Argentina.

Pension system deficits contributed significantly to the deterioration of the fiscal balance in Argentina and other Latin American reformers. Many observers would underscore the importance of having a relatively strong fiscal position before undertaking structural reforms and of reducing the implicit debt of unfunded pay-as-you-go systems before making the debt explicit by shifting to a funded second pillar.
But there are also concerns that this replacement could actually worsen the fiscal balances, because reneging on explicit debt may be more costly than eroding real pension benefits (that is, reneging on implicit debt). The Latin American experience supports the views of advocates of general fiscal discipline rather than social security privatization as a prerequisite for ensuring a stable domestic financial sector and a friendly environment for private saving.

Worker objectives in buying old-age insurance. The experience in some Latin American countries raises questions about what individuals expect from their governments. Fiscal stability appears to be necessary for governments to fulfill these expectations; other conditions may also be required.

Latin America's experience can offer useful insights on how to curb rising pension costs and prevent pensioner poverty at the same time. Indeed, many developing economies already face rising pension spending, often combined with significant pensioner poverty (Barr 2006). Bourguignon et al. (2004) calculate the incidence of poverty among the elderly in 19 Latin American countries. Using household surveys to simulate the fiscal cost and impact on poverty rates of various uniform pension schemes, they show that a universal minimum pension would substantially reduce poverty among the elderly in all countries except Argentina, Brazil, Chile, and Uruguay, where minimum pension systems already exist.

Evidence from Chile and Peru, where the results of surveys (Gill, Packard, and Yermo 2005) designed to determine how households manage economic risk are available, reveals something about what workers expect from governments (see note 1). At the time of the survey in Peru the government had not instituted a minimum pension guarantee; the survey revealed that private financial institutions were trusted more than all three branches of government.
It also revealed that more risk-averse workers chose private funds over the reformed but still risky government pay-as-you-go option.

In Chile the survey results were more revealing. Two decades after reform, workers seem to be using a system intended to act primarily as a vehicle for savings (with a small pooling component) mainly as a risk-pooling mechanism. Each cohort of workers that completes the minimum contribution requirements appears to be content with qualifying simply for the government's guarantee of minimum pension, a modest means-tested amount of about 80-90 percent of the minimum wage.

Some researchers attribute this outcome to the moral hazard associated with low-income workers realizing that any contributions beyond this are a pure tax. In fact, this behavior occurs less among the working poor and more among middle- and higher income groups; it is more consistent with a desire to purchase some insurance against old-age poverty. The switch to other savings instruments after they have qualified for this insurance also indicates that workers see the mandated private accounts as relatively expensive or risky compared with other investments. There is evidence that other retirement investments in Chile (housing, household enterprise, even the education of children) are perceived as less risky than saving in the reformed pension system (Packard 2002). There is also evidence that households prefer to gain eligibility for the low, government-guaranteed annuity and continue to save outside the system, despite the variable but high real returns they could earn in the system. This evidence suggests that they may place greater value on security than on real rates of return (Gill, Packard, and Yermo 2005).
In countries where government is generally viewed as reliable, one could make the case that workers view the social security system more as a mechanism for insurance against poverty and less as a vehicle for saving to smooth consumption. The implication may be that the social security benefit structure should be made more progressive, or the system made even more progressive than it currently is, in developed countries.

Main Policy Implications

It is difficult to draw universal lessons on how to reform social security systems from the experience in Latin America. Developed countries already have universal coverage and well-developed financial markets; many developing economies outside Latin America do not have well-developed contributory social security systems; and transition economies in Europe and Central Asia face entirely different challenges than emerging markets in Latin America.

These differences notwithstanding, the Latin American experience provides some useful information about the behavior of (rational) workers, the responses of (profit-seeking) firms, and the responsibility of (fiscally constrained) governments. Put another way, the experience provides insights into how workers and firms react to changes in the structure of social security systems, what workers expect from their governments, and how governments can meet these expectations. The main policy pointers appear to be the following:

• Keep payroll taxes low. Strengthening the links between contributions and benefits can improve labor market incentives somewhat, but higher payroll tax rates will offset these benefits.
• Keep benefits frugal. Public pension benefits should be small and secure, in order not to unduly discourage saving for old age while providing insurance against poverty in old age.
• Keep governments solvent. Fiscal prudence is the most important rule for governments that wish to provide both a safe environment for private saving and reliable insurance against old-age poverty.
While these lessons emerge from the experience in Latin America, they are also consistent with fundamental principles of the economics of insurance. Ehrlich and Becker (2000) and others propose that optimal insurance implies that rarer and more idiosyncratic losses are better pooled, while frequent and more systemic losses should be saved for. This principle can be applied to the losses associated with old age. The blessing of rising longevity implies that losing the capacity to earn is an increasingly frequent loss for individuals. The blessing of falling poverty rates implies that being poor in old age is becoming an increasingly rare loss. Rising longevity necessitates a shift to self-insurance or saving as the way to smooth consumption over one's lifetime, while falling poverty implies a shift to market insurance or pooling.

The role of governments is to facilitate these actions by individuals to insure, self-insure, and self-protect. Since there are relatively few serious impediments to the ability of individuals to save for old age, the role for governments in encouraging saving for old age should be secondary and should diminish. In contrast, in the case of poverty, because of the "social" nature of the loss being insured against and well-known problems with insurance markets, the role for governments is primary.

Much of the discussion in any country should center on the role of governments in helping individuals save and smooth consumption over their lifetimes and the need to help individuals insure against the losses associated with becoming destitute in old age. While it is clear that the mainstay for consumption smoothing should be individual saving, it is less clear what role the government should play in getting individuals to save. In the case of destitution in old age, however, the role for governments is clearer: they need to provide an instrument for insurance against the increasingly rare loss associated with falling into poverty.
For various reasons social security systems have historically bundled these two functions. The implication is that as the role of government in saving is scaled down, the insurance function becomes more, not less, important. As Lindbeck and Persson (2003, p. 60) note in their cross-regional survey of social security reforms, "Reforms do not diminish the need for basic, or guaranteed, pensions. Quite the contrary; growing reliance on quasi-actuarial and actuarially fair systems which in themselves do not encompass any systematic intra-generational redistributive elements, makes it even more imperative to maintain a safety net to prevent poverty in old age."

Notes

Indermit S. Gill is the director of World Development Report 2009 at the World Bank; his email address is igill@worldbank.org. Ceren Ozer is an economist in the South Asia Poverty Reduction and Economic Management Unit at the World Bank; her email address is cozer@worldbank.org. Radu Tatucu is a junior professional in the East Asia Poverty Reduction and Economic Management Unit at the World Bank; his email address is rtatucu@worldbank.org. The authors would like to thank Jeffrey Brown, Peter Diamond, Andras Simonovits, and participants at a workshop at the 2006 American Economic Association meetings for useful comments, as well as three anonymous referees for many suggestions that improved this article.

1. Data were collected in specially designed surveys on risk, savings, and social insurance (Encuestas de Prevision de Riesgos Sociales) conducted in Santiago, Chile, in January 2000, and in Lima, Peru, in May 2002.

References

Arenas de Mesa, Alberto, and Carmelo Mesa-Lago. 2006. "The Structural Pension Reform in Chile: Effects, Comparisons with Other Latin American Reforms, and Lessons." Oxford Review of Economic Policy 22(1):149-67.
Asher, Mukul, Nicholas Barr, Peter Diamond, Edward Lim, and James Mirrlees. 2005. "Social Security Reforms in China: Issues and Options." Policy Study of the China Economic Research and Advisory Program. London School of Economics and Political Science, London. (http://econ.lse.ac.uk/staff/nb/index_own.html).
Barr, Nicholas. 2006. "Pensions: Overview of the Issues." Oxford Review of Economic Policy 22(1):1-14.
Barr, Nicholas, and Peter Diamond. 2006. "The Economics of Pensions." Oxford Review of Economic Policy 22(1):15-39.
Bourguignon, François, Martin Cicowiez, Jean-Jacques Dethier, Leonardo Gasparini, and Pierre Pestieau. 2004. "What Impact Would a Minimum Pension Have on Old-Age Poverty? Evidence from Latin America." Paper presented at the conference "Keeping the Promise of Old Age Security," Bogota, June 22-23.
Ehrlich, Isaac, and Gary Becker. 1972. "Market Insurance, Self-Insurance and Self-Protection." Journal of Political Economy 80(4):623-48.
Ehrlich, Isaac, and Gary Becker. 2000. "Market Insurance, Self-Insurance, and Self-Protection." In G. Dionne and S. Harrington, eds., Foundations of Insurance Economics: Readings in Economics and Finance. Boston: Kluwer Academic Publishers.
Giambiagi, Fabio, and Luiz de Mello. 2006. "Social Security Reforms in Brazil: Achievements and Remaining Challenges." OECD Economics Department Working Paper 534. Organisation for Economic Co-operation and Development, Paris.
Gill, Indermit, Truman Packard, and Juan Yermo. 2005. Keeping the Promise of Social Security in Latin America. Stanford, CA: Stanford University Press for the World Bank.
Jousten, Alain. 2007. "Public Pension Reform: A Primer." IMF Working Paper 07/28. International Monetary Fund, Washington, D.C.
Lindbeck, Assar, and Mats Persson. 2003. "The Gains from Pension Reform." Journal of Economic Literature 41(1):74-112.
Mesa-Lago, Carmelo. 2006. "Private and Public Pension Systems Compared: An Evaluation of the Latin American Experience." Review of Political Economy 18(3):317-34.
OECD (Organisation for Economic Co-operation and Development). 2006.
Reform and Challenges for Private Pensions in Russia. Private Pension Series 7. Paris. (http://www.oecd.org/document/40/0,2340,en_2649_34853_36734824_1_1_1_1,00.html).
Packard, Truman G. 2002. "Pooling, Savings and Prevention: Mitigating the Risk of Old Age Poverty in Chile." World Bank Working Paper No. 2849. Washington, D.C. (http://econ.worldbank.org).
Shah, Ajay. 2006. "Indian Pension Reform: A Sustainable and Scalable Approach." In David A. Kelly, Ramkishen S. Rajan, and Gillian H.L. Goh, eds., Managing Globalization: Lessons from China and India. Singapore: World Scientific Publishing Company. (http://www.mayin.org/ajayshah/PDFDOCS/Shah2005-sustainable-pension-reform.pdf).
United Nations Secretariat. 2005. World Population Prospects: The 2004 Revision Highlights. New York: United Nations.
Whitehouse, Edward. 2007. Pensions Panorama: Retirement-Income Systems in 53 Countries. Washington, D.C.: World Bank.

The World Bank Research Observer, vol. 23, no. 1 (Spring 2008)

Why OECD Countries Should Reform Rules of Origin

Olivier Cadot and Jaime de Melo

With preferential trade agreements on the rise worldwide, rules of origin (which are necessary to prevent trade deflection) are attracting increasing attention. At the same time, preference erosion for Generalized System of Preferences (GSP) recipients is increasing resistance to further multilateral negotiations. Drawing on different approaches, this article shows that the current system of rules of origin that is used by the European Union and the United States in preferential trade agreements (including the GSP) and that is similar to systems used by other Organisation for Economic Co-operation and Development countries should be drastically simplified if developed economies really want to help developing economies integrate into the world trading system. In addition to diverting resources for administrative tasks, current rules of origin carry significant compliance costs. More
fundamentally, it is becoming increasingly clear that they are often designed to force developing economies to buy inefficient intermediate products from developed economies to "pay for" preferential access for the final product. The evidence also suggests that a significant share of the rents associated with market access (net of rules of origin compliance costs) is captured by developed economies. Finally, the restrictiveness of rules of origin is found to be beyond the levels that would be justified to prevent trade deflection, suggesting a capture by special interest groups. The article outlines some alternative paths to reforms. JEL codes: F13, F15

Rules of origin are an integral part of proliferating free trade agreements, of which countries belong to an average of six according to a recent tally by the World Bank (2005, table 2.1), and of nonreciprocal preferential trade agreements such as the Generalized System of Preferences (GSP) (see note 1). Given the lack of progress on harmonization at the World Trade Organization (WTO) and given that regionalism is here to stay, rules of origin are likely to be increasingly important in the world trading system.

© The Author 2007. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org
doi:10.1093/wbro/lkm010 Advance Access publication October 4, 2007 23:77-105

The primary justification for rules of origin in preferential trade agreements is to prevent "trade deflection," or taking advantage of low external tariffs or weak customs-monitoring capacities to bring in imports destined for more protected markets in a trading bloc (possibly after superficial conditioning or assembly).
In effect, rules of origin are needed to prevent trade deflection for all preferential trade agreements short of customs unions, where trade deflection is not an issue because members have a common external tariff. Beyond the largely unimportant issue of tariff revenue, what is at stake is the unwanted extension of preferences to out-of-bloc producers, which would erode the value of those preferences to eligible producers. In preferential trade agreements between developed and developing economies, rules of origin are also sometimes justified on "developmental" grounds because they can help foster integrated manufacturing activities in developing economy partners.

However, this article provides evidence that, by their complexity, rules of origin impose substantial compliance costs on preferred producers. For instance, in addition to regime-wide rules, the European Union has more than 500 product-specific rules of origin (Cadot, de Melo, and Pondard 2006). As a result, these rules are increasingly difficult to observe. In the least developed economies the rules divert scarce customs resources from other tasks such as trade facilitation (see note 2).

In preferential trade agreements between developed and developing economies, forcing developing economy producers to source relatively inefficient intermediate goods locally or from developed economy partners rather than from the most price-competitive sources (as in, say, Asia) increases inefficiency and raises costs. The result is reduced value of preferences (compounding preference erosion in particular for least developed economies) and rent creation for developed country producers. This potential for rules of origin to become a form of "export protection" was first observed by Krueger (1998) during negotiations for the North American Free Trade Agreement (NAFTA).
It applies to all preferential trade agreements (including nonreciprocal preferential schemes) granted by Organisation for Economic Co-operation and Development (OECD) countries to developing economies. Moreover, there is overwhelming evidence that this protectionist effect of rules of origin is not incidental but by design. Because rules of origin, unlike more traditional forms of trade protection such as voluntary export restraints or antidumping provisions, have so far largely escaped WTO disciplines, they are potentially a choice instrument for creeping protectionism.

New evidence reported in this article shows that the burden imposed by the rules of origin applied by the two main protagonists in preferential trade agreements, the European Union and the United States, is substantial whenever preferential margins are anything more than negligible. All told, the detailed evidence gathered here suggests that the current system of rules of origin applied by developed economies is out of hand and defeats both the spirit of reforms aimed at bringing greater transparency to the multilateral trading system and the development-friendly intent of preference schemes.

In a recent communication, the European Union decided to consider simplifying its rules of origin (see note 3). However, other OECD countries have so far refrained from reforming their rules and have opposed any discussion of reform of preferential rules of origin at the WTO. This article is a contribution to an overdue debate on how to design benign, transparent, and WTO-compatible rules of origin.

This article is organized as follows. The first section briefly recounts how product-specific rules of origin are defined in EU and U.S. preferential schemes and proposes an ordinal restrictiveness index summarizing their complexity. This index is shown to be correlated with EU and U.S.
most favored nation tariffs (and thus, with the depth of trade preferences). The second section presents a simple framework for quantifying the costs associated with rules of origin: distortionary, administrative, and rent-transfer. The third section provides direct evidence of the effect of rules of origin on preference use and rent sharing using preference utilization rates and unit values. The fourth section qualifies the direct evidence by considering the Asian exception and the natural experiment provided by comparing the EU Everything But Arms initiative and the U.S. African Growth and Opportunity Act (AGOA), which have similar tariff-preference margins but different rules of origin. The fifth section provides further indirect evidence. The sixth section draws policy implications from the article's findings and makes recommendations for simplifying existing rules of origin.

Rules of Origin: Definition and Measurement

Rules of origin in preferential trade agreements have two components: a small set of regime-wide rules and a large set of product-specific rules, typically defined at the Harmonized System six-digit level of disaggregation (HS-6). Together, the two sets of rules are meant to ensure sufficient transformation. Because the European Union and the United States are the main users of preferential trade schemes among OECD countries, this article follows the approach of Cadot, de Melo, and Portugal-Perez (2005), describing briefly the rules for NAFTA, which have been in place for a long time and correspond closely to those applied by the United States in other preferential trade agreements, and those for the European Union's "Pan-European system (PANEURO)," also called the "single-list" because it covers the common set of product-specific rules of origin that the European Union applies in all its preferential trade agreements (regime-wide rules differ across the European Union's preference schemes such as the GSP or Cotonou Agreement).
The analysis starts with regime-wide rules then turns to product-specific rules of origin.

Regime-wide Rules

Regime-wide rules usually include five components (these and other terms are defined in the glossary at the end of the article):

• A de minimis (or tolerance) criterion that stipulates the maximum percentage of nonoriginating materials that can be used without affecting the origin of the final product.
• A cumulation rule.
• A provision on whether "roll-up" applies.
• The status of duty drawbacks.
• The applicable certification method.

Table 1 describes how these regime-wide rules differ between the European Union and the United States.

Table 1. EU and U.S. Examples of Regime-wide Rules of Origin
Preferential trade agreement | De minimis or tolerance rule | Absorption (roll up) | Cumulation | Drawback allowed | Certification method
NAFTA | 7 percent (except agricultural and industrial products), 7 percent of weight for goods in chapters 50-63 | Yes (except autos) | Bilateral | Not after 7 years | Self-certification
United States-Chile | 10 percent (except agricultural and processed agricultural products) | Yes | Bilateral | Not mentioned | Self-certification
U.S. GSP | 10 percent, 10 percent of weight for goods in chapters 50-63 | Not mentioned | Bilateral, limited diagonal | Not mentioned | Self-certification
Cotonou Agreement | 15 percent | Yes | Full | Not mentioned | Two-step private and public and limited self-certification
EU GSP | 10 percent (except goods in chapters 50-63) (a) | Yes | Bilateral, limited diagonal | Not mentioned | Two-step private and public and limited self-certification
Note: Classification is carried out at the six-digit Harmonized System tariff line level. Each cell is the percentage of tariff lines that have the rules of origin in the corresponding row and in the corresponding column.
a. Goods in chapters 50-63 (textiles and apparel) do not benefit from a de minimis provision.
Source: Cadot, de Melo, and Portugal-Perez 2005, table 1.
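To illustrate how a de minimis (tolerance) rule operates, the sketch below checks whether the share of nonoriginating materials in a product stays within the headline thresholds listed in table 1. It is a simplified illustration under stated assumptions, not a statement of how any customs authority applies the rule: the function name is hypothetical, the share is computed on a simple value basis, and the sectoral exceptions and weight-based variants shown in the table are ignored.

```python
# Headline tolerance thresholds taken from table 1 (fraction of product value).
# Assumption: exceptions (agriculture, chapters 50-63, weight-based rules) are ignored.
DE_MINIMIS = {
    "NAFTA": 0.07,
    "United States-Chile": 0.10,
    "U.S. GSP": 0.10,
    "Cotonou Agreement": 0.15,
    "EU GSP": 0.10,
}


def within_de_minimis(agreement, nonoriginating_value, product_value):
    """Return True if nonoriginating content is within the agreement's tolerance."""
    if product_value <= 0:
        raise ValueError("product_value must be positive")
    if nonoriginating_value < 0:
        raise ValueError("nonoriginating_value must be non-negative")
    share = nonoriginating_value / product_value
    return share <= DE_MINIMIS[agreement]


# Hypothetical product worth 100 that contains 12 of nonoriginating materials.
for name in DE_MINIMIS:
    print(name, within_de_minimis(name, 12.0, 100.0))
```

With these hypothetical values the same product falls within the tolerance under the Cotonou Agreement but exceeds it under the other schemes, which is one concrete way in which regime-wide rules differ across agreements.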
Even for regime-wide rules, table 1 gives the impression of "made-to-measure" rules. It also shows that regime-wide rules differ across preferential trade agreements for the same developed economy partner, confirming the hub-and-spoke characteristic of preferential trade agreements between developed and developing economies. Certification methods also differ between EU and U.S. preferential trade agreements; certification is easier to carry out in U.S. agreements, at least in principle, than in EU ones.

Product-Specific Rules of Origin

Devising methods for determining sufficient processing (or substantial transformation) has turned out to be very complex in all existing preferential trade agreements because the Harmonized System was not designed to define the origin of goods. Three criteria are used by the European Union and the United States to determine whether sufficient transformation has taken place in activities requiring processing (that is, anything but crude products):

• A change of tariff classification (at various levels of the Harmonized System), meaning that the final product and its imported components should not belong to the same tariff classification (in other words, that the local processing should be substantial enough to induce a change of tariff classification).
• A critical threshold for value added (in short, a value content rule).
• A specific manufacturing process (a so-called "technical requirement").

For crude products the typical rule is "wholly obtained," which permits no foreign content whatsoever, although other rules apply in special cases, such as fish products.

Both NAFTA (whose rules are also used in other U.S. preferential trade agreements) and PANEURO have a long list of criteria, including such technical requirements as the "triple transformation" requirement in textiles and apparel, which requires apparel to be woven from originating fabric and yarn.
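The change-of-tariff-classification criterion can be sketched in a few lines. The example below assumes the standard structure of Harmonized System codes, in which the first two digits identify the chapter, the first four the heading, and the first six the subheading. The function name and the codes in the example are illustrative only, and real rules typically combine this test with exceptions, allowances, or value content requirements that are not modeled here.

```python
# Number of leading digits of an HS code that identify each classification level.
LEVEL_DIGITS = {"chapter": 2, "heading": 4, "subheading": 6}


def changes_classification(final_code, input_codes, level):
    """Return True if no imported input shares the final product's classification
    at the given level ("chapter", "heading", or "subheading").

    Codes are digit strings of at least six characters, without dots.
    An empty list of inputs trivially satisfies the test.
    """
    digits = LEVEL_DIGITS[level]
    for code in (final_code, *input_codes):
        if not (code.isdigit() and len(code) >= 6):
            raise ValueError(f"not a valid HS-6 code: {code!r}")
    final_prefix = final_code[:digits]
    return all(code[:digits] != final_prefix for code in input_codes)


# Illustrative codes: a final good in chapter 62 made from an input in chapter 52,
# and the same good made from an input in a neighboring heading of chapter 62.
print(changes_classification("620342", ["520812"], "chapter"))
print(changes_classification("620342", ["620411"], "chapter"))
print(changes_classification("620342", ["620411"], "heading"))
```

The second and third calls show why the level matters: an input from the same chapter fails a change-of-chapter rule but can still satisfy the less demanding change-of-heading rule.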
Criteria also include exceptions (making them more stringent) and allowances (making them less stringent). NAFTA relies more heavily on changes of tariff classification, though often in combination with other criteria. PANEURO relies mostly on value content and wholly obtained criteria, with wholly obtained criteria prevalent for GSP and African, Caribbean, and Pacific (ACP) exports of primary products with little processing.

As Krishna (2006) points out, when analyzing rules of origin, the devil is in the details because the complexity of rules of origin is what provides an opportunity for special interests to influence their design and administration. While many facets of rules of origin have been explored, rigorous empirical study of their effects has been hampered by two difficulties, one relating to data on utilization rates, the other to measurement of the rules' restrictiveness.

First, data on preference utilization have been made freely available to the public only recently for the United States but not yet for the European Union (for example, Brenton and Manchin 2003 and the studies collected in Cadot, Estevadeordal et al. 2006).

Second, because rules of origin are a set of complex, heterogeneous legal rules, it has proved difficult to develop a reliable measure of their restrictiveness to serve as a synthetic indicator (much as effective rates of protection are a synthetic indicator of the restrictiveness of a country's trade regime). Estevadeordal (2000) has proposed an ordinal index of product-specific rules of origin restrictiveness (or R-index), taking values between one and seven, with higher values corresponding to more restrictive rules of origin. The index, constructed from a simple observation rule at the HS-6 level, where rules of origin are defined, is described below.

The observation rule is as follows (Cadot, de Melo, and Portugal-Perez 2005).
Let CC stand for a change of chapter, CH for a change of heading, CS for a change of subheading, and CI for a change of item. A change of classification at the item level can be taken as less stringent than one at the subheading level, and so forth. So the criterion for classifying changes of tariff classification criteria is

$$CI < CS < CH < CC \qquad (1)$$

But a change of tariff classification is often accompanied by one or two (in a few cases even three) additional requirements, such as value content rules, technical requirements, exceptions, or allowances. The observation rule assigns higher index values to changes of tariff classification when these requirements are added and lower ones in the case of allowances. For instance, a change of heading is given an index value of four, which rises to five when accompanied by a technical requirement or exception but shrinks to three when accompanied by an allowance.

Though not amenable to quantification as effective rates of protection, the R-index plays the same analytical role; it is intended as an overall indicator of how trade-inhibiting are the requirements that must be met by a product to obtain originating status.

There is preliminary evidence that preferences have hidden compliance costs and that those compliance costs may be related to rules of origin. Table 2 shows evidence for the textile and apparel sector under NAFTA, the EU GSP, and the Cotonou Agreement (which grants tariff-free access for most ACP products to the EU market).4 Although NAFTA's and Cotonou's preference margins are equal, at 10.4 percentage points, their utilization rates vary widely: 50 percent for Cotonou compared with 79.9 for NAFTA. Cotonou's low rate of uptake despite deep preferences suggests hidden barriers. ACP countries benefit from full rather than diagonal cumulation (that is, intermediate purchases from all partners qualify as originating) and a 15 percent tolerance rule compared with only 10 for the GSP, which also excludes the textile and apparel sector (chapters 50-63) from the 10 percent tolerance rule.

Table 2. Preferences and Utilization Rates for Textiles and Apparel

Preferential trade agreement   Number of observations   Utilization rate (percent)   Preference margin (percentage points)
NAFTA (2001)                   618                      79.9                         10.4
EU GSP (2004)                  16,555 (HS-8)            52.2                         1.8
                               12,920 (HS-6)
Cotonou Agreement (2004)       1,370 (HS-8)             50.0                         10.4

Note: Averages are unweighted. HS-6 is the six-digit Harmonized System level; HS-8 is the eight-digit Harmonized System level.
Source: Cadot, de Melo, and Portugal-Perez (forthcoming), table 3b.

Table 3 shows that the evidence of hidden costs goes beyond the textile and apparel sector, where differences in uptake at similar margins may reflect composition effects. Define the preferential margin $\tau_i$ by the normalized difference between most favored nation and preferential tariffs:

$$\tau_i = \frac{t_i^{MFN} - t_i^{PREF}}{1 + t_i^{PREF}} \qquad (2)$$

Table 3 shows that, contrary to expectations, when the preferential margin rises, utilization rates fall for NAFTA. This suggests that an omitted variable is positively correlated with tariffs but negatively correlated with preference utilization. Rules of origin are an obvious culprit.

Table 3. Preferences and Utilization Rates, All Goods

                                           Preference margin
Preferential trade agreement               τ ≥ 4 percent^a   τ ≥ 8 percent^a   τ ≥ 12 percent^a
North American Free Trade Agreement^b      87 (1,239)        86.0 (558)        82.8 (287)
GSP^c                                      50.2 (1,297)      52.5 (91)         66.2 (44)
Cotonou Agreement^c                        92.5 (1,627)      94.3 (892)        96.4 (566)

Note: Averages are unweighted. Numbers in parentheses are the number of tariff lines.
a. $\tau_i = (t_i^{MFN} - t_i^{PREF})/(1 + t_i^{PREF})$ is the preference margin.
b. Computed at the six-digit Harmonized System tariff-line level with 2001 data.
c. Computed at the eight-digit Harmonized System tariff-line level with 2004 data for 92 countries (GSP) and 37 countries (Cotonou Agreement) qualifying for preferential market access.
Source: Cadot, de Melo, and Portugal-Perez (forthcoming), table 2.

Table 4 shows that lines with tariff peaks (that is, with tariffs more than three times the average), where preferential margins are highest, do have higher R-index values than those with low tariffs. This relationship holds for both NAFTA and PANEURO.

Table 4. Tariff Peaks and the R-index

                               Restrictiveness-index value
                               North American Free Trade Agreement   PANEURO
Tariff peaks^a                 6.2 (257)                             5.2 (780)
Low tariffs^b                  4.8 (1,432)                           3.9 (3,241)
Total number of tariff lines   3,555                                 4,961

Note: Numbers in parentheses are numbers of tariff lines. Restrictiveness indexes are unweighted.
a. Tariff lines whose tariffs exceed three times the GSP average.
b. Tariff lines whose tariffs are less than one-third of the GSP average.
Source: Cadot, de Melo, and Portugal-Perez 2005, table 3.

Figure 1 confirms the patterns in tables 2-4; utilization rates do not really increase with tariff-preference margins. For NAFTA, they actually decrease, due largely to the influence of the textile and apparel sector, where tariff preferences are deep and rules of origin stringent.

Quantifying the Effects of Rules of Origin

Although product-specific rules of origin, as already noted, take a variety of legal forms (changes of tariff classification, value content rules, technical requirements, and the like), they can all be represented conceptually as floors on domestic value added. Suppose that a producer in Madagascar wishes to sell a shirt under preferential access in the European Union. This shirt is made with both originating intermediate goods (that is, intermediate goods that are either local, EU-made, or imported from other qualifying countries, according to cumulation rules) and nonoriginating intermediate goods, say from Bangladesh, China, or India.
Now assume that to satisfy origin requirements (whether change of tariff classification, value content rule, or technical requirement), the Malagasy producer uses a higher proportion of originating inputs than would be the case in the absence of rules of origin (which is precisely the rule's purpose). Let superscript R denote a choice restricted by rules of origin. Unrestricted value added is $va_i$, and restricted value added is $va_i^R$, so the content of rules of origin reduces to $va_i^R > va_i$, whether or not it explicitly takes the form of a value content rule. Thus, conceptually a value content rule can be thought of as a generic rule that can play the role of all others by quantifying the objective common to all.

Figure 1. Average Utilization Rates for Different Preferential Margin Thresholds
[Line chart: average utilization rate (percent) by preferential margin threshold for the Cotonou Agreement, GSP, and NAFTA.]
Note: Cotonou Agreement includes 37 countries, computed at the eight-digit Harmonized System level; GSP includes 92 countries, computed at the eight-digit Harmonized System level; and NAFTA includes 3 countries, computed at the six-digit Harmonized System level. Data are unweighted averages computed at the most disaggregated tariff-line level (table 2). Averages are based on more than 100 observations except for GSP (minimum of 27 observations for preference margins, τ, equal to or greater than 10 percent).
Source: Cadot, de Melo, and Pondard 2006.

This principle is important because it underlies an approach to rules of origin reform, discussed later, that substitutes a value content rule (possibly, although not necessarily, at differentiated rates across products) for the current array of instruments. It also highlights how information on rules of origin restrictiveness can be aggregated across instruments and subsumed into a single restrictiveness index, which itself can then be aggregated across product lines by averaging.
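The observation rule described earlier and the averaging step just mentioned can be illustrated with a minimal sketch. It is not the authors' implementation. Only the value for a change of heading (four, rising to five with a technical requirement or exception and falling to three with an allowance) and the one-to-seven range come from the text; the base values for the other classification levels, the treatment of combined requirements, and all function names are assumptions made for illustration.

```python
from statistics import mean

# Base index values by level of change of tariff classification.
# Only CH = 4 is stated in the text; the other three values are
# illustrative assumptions that respect the ordering CI < CS < CH < CC.
BASE_VALUE = {"CI": 1, "CS": 2, "CH": 4, "CC": 6}

# Additional requirements that tighten a rule, and provisions that loosen it.
TIGHTENING = {"technical_requirement", "exception", "value_content"}
LOOSENING = {"allowance"}


def r_index(ctc_level, extras=()):
    """Ordinal restrictiveness value (1 to 7) for one product-specific rule."""
    value = BASE_VALUE[ctc_level]
    if any(e in TIGHTENING for e in extras):
        value += 1  # added requirement: one step more restrictive
    if any(e in LOOSENING for e in extras):
        value -= 1  # allowance: one step less restrictive
    return max(1, min(7, value))  # keep the index inside its 1-7 range


def average_r_index(rules):
    """Unweighted average of the index across tariff lines."""
    return mean(r_index(level, extras) for level, extras in rules)


if __name__ == "__main__":
    # Three hypothetical tariff lines, all using a change of heading.
    lines = [("CH", ()), ("CH", ("allowance",)), ("CH", ("exception",))]
    print(average_r_index(lines))
```

Averaging an ordinal index treats the distance between adjacent values as equal, which is a simplification that the unweighted averages reported in table 4 also rely on.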
Five results emerge from the quantitative analysis of the relationship between rules of origin restrictiveness and preference uptake:

• For a given preference margin a higher restrictiveness index translates into a lower utilization rate, all other things being equal.
• For a given restrictiveness index a higher tariff-preference margin translates into a higher utilization rate, all other things being equal.
• The compliance decisions of individual firms are binary; how the decisions aggregate into industry-wide utilization rates depends on the unobserved distribution of compliance costs.
• A lower pass-through of tariff preferences for the least developed economies (due to low bargaining power) implies lower uptake of preferences, all other things being equal.
• Improvements in the uptake of preferences can be obtained either from reductions in the restrictiveness of rules of origin or from cost-reducing administrative simplifications (such as transparent and uniform criteria).

The third result implies that the statistical relationship between R-index values, preference margins, and utilization rates can only be "noisy" (that is, affected by a large unexplained component) at the aggregate (product-line) level. But notwithstanding the noise introduced by unobserved firm characteristics (which could be investigated only with firm-level data that are not currently available), figure 1 suggests an unambiguous relationship between preference margins, rules of origin restrictiveness, compliance costs, and utilization rates. It also suggests that, without a proxy for rules of origin restrictiveness such as the R-index, attempting to evaluate the effect of tariff-preference margins on the uptake of those preferences may lead to omitted-variable bias.
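The way binary firm decisions aggregate into a utilization rate can be sketched as follows. This is a hedged illustration rather than the model used in the studies cited: the decision rule (comply when the share of the margin received exceeds the firm's compliance cost), the uniform cost distribution, the equal weighting of firms, and all names and numbers are assumptions.

```python
import random


def utilization_rate(margin, pass_through, costs):
    """Share of firms that request preferential treatment.

    A firm complies when the part of the tariff-preference margin it
    actually receives (margin * pass_through) exceeds its own compliance
    cost. All quantities are ad valorem shares, and firms are assumed to
    be of equal size, so the share of firms stands in for the share of
    exports.
    """
    benefit = margin * pass_through
    users = sum(1 for c in costs if benefit > c)
    return users / len(costs)


def draw_costs(n_firms, mean_cost, spread, seed=0):
    """Hypothetical firm-level compliance costs, uniform around a mean."""
    rng = random.Random(seed)
    return [rng.uniform(mean_cost - spread, mean_cost + spread)
            for _ in range(n_firms)]


if __name__ == "__main__":
    lenient = draw_costs(1000, mean_cost=0.03, spread=0.02)
    strict = [c + 0.03 for c in lenient]  # stiffer rules raise every firm's cost
    for label, costs in (("lenient", lenient), ("strict", strict)):
        for margin in (0.04, 0.08):
            for pass_through in (1.0, 0.5):
                rate = utilization_rate(margin, pass_through, costs)
                print(label, margin, pass_through, round(rate, 3))
```

Under these assumptions the simulated rate falls with stiffer rules, rises with the margin, and falls with lower pass-through, in line with the first, second, and fourth results; the exact figures depend entirely on the assumed cost distribution, which is the point of the third result.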
Keeping in mind that this framework captures only some of the effects associated with rules of origin, several observations are in order.5 First, administrative costs act as a technical barrier to trade; they result in resource waste, and in the welfare calculus of the effects of rules of origin they are more costly than the usual deadweight losses. Second, if costs are associated with certification, requests for preferential status would not be observed when preference margins are low. Third, compliance costs are particularly high for differentiated products, for which there can be quality as well as price differences between eligible (local) and noneligible intermediate goods. Because part of those costs is passed on to consumers in the countries that determine the rules of origin, high utilization rates do not necessarily imply that rules of origin have small effects.

Stiff rules of origin may inhibit or deflect trade altogether, not just the uptake of preferences. This was shown in the case of the Europe Agreements, free-trade agreements signed in 1991 between the European Union and the Central and Eastern European countries. Tumurchudur (2007a) showed that a large share of the exports from Central and Eastern Europe was deflected from EU markets by rules of origin, resulting in heavy losses. Evidence of trade-inhibiting effects is also apparent in the evolution of textile and apparel exports under AGOA and the Everything But Arms initiative, which is discussed in the exception and quasi-natural experiment section below.

Direct Evidence

In the absence of firm-level data Carrère and de Melo (2006) assume that the preference utilization rate for product line i (the percent of exports sent under the preferential regime rather than the most favored nation one), referred to as $U_i$, rises with the tariff-preference margin, $\tau_i$ (which may be just equal to the most favored nation tariff when preferential access means tariff-free access) and shrinks with rules of origin compliance costs, $c_i$. That is, $U_i = f(\tau_i - c_i)$, where $f(\cdot)$ is an increasing function, and $c_i = g(RoO_i)$, where $g(\cdot)$ is an increasing function (true compliance costs are firm-specific and are thus unobserved; all that is observed is the presence of $RoO_i$). These assumptions lead to an estimable relation of the form

$$U_i = \alpha + \beta \tau_i + \sum_k \gamma_k RoO_{ik} + \varepsilon_i \qquad (3)$$

where $RoO_{ik}$ is a set of dummy variables indicating the presence of product-specific rules of origin (change of tariff classifications, exceptions, and so on).

Results from estimating equation (3) on NAFTA data confirm that utilization rates rise with preferential margins and shrink in the presence of rules of origin (see Cadot, de Melo, and Portugal-Perez 2005 for results using data for the European Union). Carrère and de Melo (2006) combined their estimates with R-index values to compute an estimated ad valorem equivalent of total rules of origin compliance costs (administrative costs and costs due to higher input costs). Their estimates range from 3.5 percent for a change of chapter to more than 15 percent for combinations of rules of origin involving technical requirements. The strongly inhibiting effect of technical requirements appears to be an empirical regularity.

Even if the estimates are robust to a range of specifications, it is difficult to infer a sense of robustness from estimates derived from a relation like equation (3) because so much heterogeneity and so many "unobservables" influence preference uptake.
Estimates have proved fairly sensitive to the inclusion of control variables, in particular when using EU GSP data. An alternative is to restrict the analysis to products for which the sole criterion used to determine origin is a value content rule. Drawing on the variation in EU value content criteria across product lines with value content the sole criterion, Cadot, Carrère, and Strauss-Kahn (2007) estimate an equation similar to equation (3), in which, however, the dummy variables for rules of origin are replaced with the continuous value content rule value.6 Using dummy variables for Harmonized System sections to control for heterogeneity across sectors and restricting the sample to tariff lines with substantial tariff-preference margins (above 2 or 5 percent), they find that utilization rates rise, all other things being equal, with the maximum foreign content allowed by the value content rule.

Table 5. Estimated Effects on Preference Utilization and Rent Transfer of Relaxing a Value Content Requirement

τ_i ≥ 2%  τ_i ≥ 5%  τ_i ≥ 2%  τ_i ≥ 5%
ACP + GSP  ACP + GSP  GSP  GSP
Number of observations  19,261  5,958
Mean preferential margin (τ_i) (percent)  3.74  5.14
Mean utilization rate (percent)  0.12  0.17
Mean value content (percent of unit price)  58.8  58.2
Mean value of imports (euros)  1,475,182  2,376,301
Simulation: local content requirement reduced by 10 percentage points
Change in preference utilization rate (percentage points)  2.0  5.2
Total rent transfer from increased utilization (millions of euros)^a  21.7  37.4

a. Evaluated at the mean value of imports.
Source: Authors' computations based on Cadot, de Melo, and Portugal-Perez (forthcoming), table 6.

Since a single value content criterion is a serious candidate for reform, at least in the case of the European Union (Stevens et al. 2006 and Cadot, de Melo, and Pondard 2006), table 5 reports two illustrative simulations based on these estimates. The mean local content requirement is 58 percent and the preference margin 3-5 percent, depending on the sample; mean utilization rates are rather low, between 12 and 22 percent. The bottom of the table shows the first-round effects (no supply response) of reducing the local-content requirement by 10 percentage points. Utilization rates rise by 2-5 percentage points (row 6), raising the rent transfer by 21-37 million, for a mean value of imports of 1.5-3.0 billion.

To fully grasp the welfare effects of rules of origin, the rent distribution between the exporting and importing country must be factored in. This implies estimating the pass-through effect of tariffs on consumer prices (that is, the extent to which preferences translate into a higher producer price for exporters). Estimates for AGOA preferences (Olarreaga and Özden 2005) and for the Caribbean Community (Özden and Sharma 2006) are that between one-third and one-half of tariff reductions are passed on to producers.

However, part of the border-price increase could reflect the compliance costs discussed above. Using a monopolistic-competition model with differentiated products in which Mexican exporters can export product j either to the rest of the world (under most favored nation status, at price $p_j^{MFN}$) or to the United States (under NAFTA, at price $p_j^{NAFTA}$), Cadot et al. (2005) estimate the following relationship

$$\text{NAFTA markup}_j = \alpha_0 + \alpha_1 \tau_j + \alpha_2 CC_j + \alpha_3 TECH_j + \varepsilon_j \qquad (4)$$

where "NAFTA markup" is the percentage by which Mexico's NAFTA shipment prices are raised over comparable most favored nation shipment prices, $CC_j$ is a dummy variable marking a change of tariff classification at the chapter level, and $TECH_j$ is a dummy variable marking a technical requirement. When estimated at the HS-8 level, equation (4) is the best tool to compare prices in different markets.

Table 6. Exports, Unit Costs, and Prices under Preferential Market Access and a Binding Minimum Local Content Requirement

                                                                                  Simulations
                                                                                  (1)     (2)     (3)^a   (4)     (5)
Preference margin (percent)                                                       10      10      10      10      10
Administrative unit costs (percent of unit price)                                 0       0       0       2.5     1.0
Unconstrained, minimum local content requirement (percent of unit price)          40      40      40      40      36
Constrained, minimum local content requirement (percent of unit price)                    50      50      50      40
Preferential exports (percent change from scenario with no preferential access)   15.9    11.1    -0.15   7.1     10.7
Unit costs (percent change from scenario with no preferential access)             0       1.9     6.7     1.9     0.4
Unit net price (percent change from scenario with no preferential access)         2.9     3.0     3.4     2.2     2.1

Note: Unit net price set equal to 1, initial output to 100, and value-added to 20. All output is exported (40 percent to preference-receiving destination). For columns 1-4 nonoriginating inputs are set to 75 percent of intermediate good input purchases. This implies that initial (unconstrained) local content is 20 + 0.25(80) = 40. Setting the minimum local content requirement at 50 percent implies reducing nonoriginating intermediate goods to 62.5 percent of intermediate good purchases. For column 5 nonoriginating inputs are set at 80 percent and reduced to 75 percent through the minimum local content rule.
a. Same as column 2 but with low value for the elasticity of substitution between originating and nonoriginating materials (0.5 instead of 2).
Source: Authors' computations adapted from model in Cadot et al. 2005.

With complete pass-through, the estimated coefficient for $\alpha_1$ in equation (4) would be close to one, but Cadot et al. (2005) find it substantially below one. They also obtain positive and significant estimates for $(\alpha_2, \alpha_3)$, indicating that rules of origin costs are at least to some extent passed on to consumers. Once rules of origin are taken into account, the backward pass-through of preferences to producer prices falls from 80 percent of the margins to only 50 percent. They also show, using input-output links, that U.S. producers of intermediate goods are able to retain a substantial share of the rents generated by rules of origin downstream. That is, stiff rules of origin on, say, Mexican shirts exported to the United States significantly raise the price of fabric exported by the United States to Mexico for use in those shirts. This reflects the fact that rules of origin create a captive market for U.S. intermediate goods.

An Exception and a Quasi-Natural Experiment

The covariation of utilization rates and margins does not account for all the effects of rules of origin. Case studies such as those reported in Cadot, de Melo, and Pondard (2006) and Stevens et al. (2006) provide useful complementary evidence, although they conclude that each case is different, thereby explaining if not justifying the current maze. An exception and a quasi-natural experiment are drawn on here, with both suggesting that rules of origin are, as they stand, unnecessarily restrictive.
Asian Exception

In a world where rules of origin are as cumbersome and complicated as they are (see Estevadeordal and Suominen 2006 for a detailed description), the Association of Southeast Asian Nations (ASEAN) Free Trade Area (AFTA) and the ASEAN-China Free Trade Area (ACFTA) stand out as exceptions. To obtain originating status (that is, to fulfill the criterion of sufficient processing), either the wholly obtained criterion (for a few agricultural products) or a single value content rule requiring 40 percent local content (for most products) is used. This rule has been relaxed by allowing a choice between criteria for countries that found it too constraining. For instance, under ACFTA a change of tariff classification can be chosen as an alternative to the 40 percent local content rule for obtaining origin for leather goods, and some specific process criteria are also accepted for some textile products.7

So why are rules of origin under AFTA less stringent than elsewhere? First, until recently Asian regionalism was more about cooperation than about preferential trade. Under the aegis of the United States, Asia-Pacific Economic Cooperation was set up specifically to avoid preferential trade and the formation of an Asian trade bloc. Much of the region's integration in the world economy has been driven by unilateral tariff reductions. Second, regional trade has made possible the rise of the Asian manufacturing matrix in which labor-intensive stages of production initially carried out in Japan, and later in the Republic of Korea, were outsourced to the region's lower wage countries. The resulting regional production networks have contributed to the price-competitiveness of Asia's exports, which has benefited the whole region. Stiff rules of origin would have jeopardized this successful model.
This "Asian exception" has been conducive to the successful development of Asian countries that have fully participated in "verticalizing" trade (the development of cross-border supply chains generating trade in intermediate products). In this unusual setup (relative to other global trading patterns), intraregional trade in politically sensitive final products where protection is highest was insignificant. Thus, the political-economy forces that would usually lead to the complex rules of origin observed elsewhere have not been at work so far. As a result, low-income countries such as Cambodia and Lao PDR have been able to participate in the fragmentation of production according to comparative advantage.8 Arguably, Asia's simple and uniform rules of origin requirement is an example of the kind of rules of origin that would really be development-friendly.

AGOA and Everything But Arms: A Natural Experiment

In the textile and apparel sector, the choice area for obscure and trade-inhibiting rules of origin, the one notable exception is the U.S. preferences granted to 22 Sub-Saharan African least developed economies under AGOA. Thus, comparing African apparel exports to the European Union and the United States provides a quasi-experimental situation in which the effects of rules of origin on the uptake of trade preferences can be analyzed. This quasi-experimental situation, first studied by Brenton and Özden (2005), comes from the combination of different rules of origin with very similar preference margins: textiles and apparel receive approximately the same protection in the EU and U.S. markets. In 2001 the EU-15's most favored nation tariff was 10.1 percent compared with 11.7 percent for the United States, and duty-free access applied to both Everything But Arms-eligible and the 34 AGOA-eligible African countries.

To qualify for preferential access to the U.S.
market, an exporter must prove that the garments are produced, cut, and sewn in the area benefiting from preferential access (here, AGOA). Cotton products must be made from originating fabric, yarn, and thread, with diagonal cumulation somewhat relaxing the requirement, since fabric originating in other member countries qualifies. However, this rule, known as "the triple transformation" rule, was relaxed for 22 least developed economies under AGOA's "special regime," which permits the use of third-country fabric.9 That is, the special regime reduces the transformation requirement to a single transformation (from fabric to garment).

Fifteen of AGOA's special regime beneficiaries are also eligible for the European Union's Everything But Arms initiative. But no such relaxation applies to exports to the European Union under either the Cotonou Agreement or Everything But Arms preferences. EU rules of origin for apparel require production from originating yarn, which implies a "double transformation" from yarn to fabric and from fabric to clothing. The European Union's "double-transformation" rule obviously makes compliance difficult for countries that have no textile industry. Small or poor countries that cannot profitably produce fabric (weaving is a capital-intensive activity involving expensive machinery, particularly for woven products) should not, from an economic-efficiency viewpoint, set up the vertically integrated local value chains that would satisfy the double-transformation rule.

In apparel, preference utilization rates are very high under both AGOA (97.36 percent in 2004) and Everything But Arms/Cotonou (94.9 percent). Cotonou has rules similar to those that Everything But Arms has for apparel. However, export volumes evolved quite differently for the 15 least developed economies that benefit from both schemes.
Figure 2 shows a substantial increase in the value of apparel exports with AGOA's entry into force in 2000 (in particular for Lesotho and Madagascar). By contrast, the value of exports from this same group of countries did not rise following the adoption of Everything But Arms; in fact, it fell slightly. Of course, that exports remained flat for those countries should come as no surprise since they already benefited from Cotonou preferences, which give almost as much access as Everything But Arms (with slightly more lenient rules on cumulation). In effect, nothing changed for them on this front, and along with other ACP countries they largely continued to request access under Cotonou, with which they were familiar, rather than Everything But Arms. But AGOA's special regime did not merely trigger a catch-up of U.S.-bound exports toward already high levels of EU-bound exports: it dwarfed them. Thus, unlike AGOA's special regime, neither Cotonou nor Everything But Arms appears to have offered a preference mix (tariff preferences and rules of origin) conducive to export growth.

Figure 2. Apparel Exports of 22 Countries Benefiting from the AGOA Special Regime, 2004
[Line chart with four series: U.S. imports from 22 countries^a; U.S. imports from 7 top exporters^b; EU imports from 22 countries^a; EU imports from 7 top exporters^b.]
Note: a. Benin, Botswana, Cameroon, Cape Verde, Ethiopia, Ghana, Kenya, Lesotho, Madagascar, Malawi, Mali, Mozambique, Namibia, Niger, Nigeria, Rwanda, Senegal, Sierra Leone, Swaziland, Tanzania, Uganda, and Zambia. b. Botswana, Cameroon, Ghana, Kenya, Lesotho, Madagascar, Namibia, Nigeria, and Swaziland.
Source: Portugal-Perez (2007) based on the WTO Integrated Data Base.

Because the data in figure 2 are computed at the HS-6 product level, it is safe to assume that heterogeneity in export composition is largely controlled for. This is confirmed by formal econometric evidence.
In a model that controls for differences in preference margins and for demand shifters in the EU and U.S. markets, Portugal-Perez (2007) finds that relaxing rules of origin for apparel (captured by a dummy variable corresponding to the introduction of AGOA's special regime) raised apparel exports significantly for beneficiary countries. Because the special regime was not introduced in the same year for all countries, its effects are well identified statistically, and Portugal-Perez's results strongly suggest that the difference in performance apparent in figure 2 is indeed attributable to differences in rules of origin regimes.

AGOA's special regime seems to have encouraged growth not only at the "intensive margin" (higher volumes) but also at the "extensive margin" (diversification by addition of new products). Although new products were exported to both destinations (an active extensive margin), the rate of increase in new products was several orders of magnitude higher for U.S.-bound goods than for EU-bound ones, which is an important achievement. Product diversification is one measure of industrialization, particularly at early stages of the economic development process (Cadot, Carrère, and Strauss-Kahn 2007 and references therein). Controlling for other factors, countries that have a more diversified industrial base enjoy less volatile growth and are better poised to absorb shocks. Only three countries in Sub-Saharan Africa (Lesotho, Madagascar, and Senegal) export more than 50 products to either the European Union or the United States. Thus, if the development objective of rules of origin is to be taken seriously, encouraging export growth at the extensive margin is important, and in this regard the performance of Everything But Arms and Cotonou is again disappointing compared with that of AGOA's special regime.
Taken together, the brief discussion here on the Asian exception and the comparison of AGOA with the Everything But Arms initiative suggest two results:

• Limited differences between preferential regimes can have drastic effects on their performance; AGOA's relaxation of the triple transformation rule gave a significant boost to Sub-Saharan African apparel exports.
• Utilization rates are an incomplete measure of the performance of preferential regimes, as the inhibiting effect of stiff rules of origin can be felt on trade volumes as well.

Indirect Evidence

Taking inspiration from the early work by Herin (1986) for EFTA, Cadot, de Melo, and Portugal-Perez (forthcoming) applied revealed-preference arguments to estimate upper and lower bounds of compliance costs. Arguably, this nonparametric approach could be more robust than the parametric evidence reported above.

By revealed preference, for products with 100 percent utilization rates the net benefit of preferences is positive for all firms. Since everyone uses the preferences, the ad valorem equivalent of compliance costs cannot be larger than the tariff-preference margin. Conversely, for products with zero percent utilization rates, since no one uses the preferences, the compliance cost cannot be smaller than the preference margin.

For remaining sectors (those with utilization rates between 0 and 100 percent) the story is more complicated because of firm heterogeneity, so assumptions must be made. Cadot, de Melo, and Portugal-Perez (forthcoming) argue that, firm heterogeneity notwithstanding, the average exporter (in terms of compliance costs) is not too far from indifference between the preferential and the most favored nation regimes, which means that the compliance cost is about equal to the tariff-preference margin.
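The three cases of the revealed-preference argument can be written down as a small function. This is only a sketch of the logic as described in the text, not the procedure used by Cadot, de Melo, and Portugal-Perez; the function name, the return convention, and the assumption that compliance costs are never negative are introduced here for illustration.

```python
def compliance_cost_bounds(margin, utilization):
    """Revealed-preference bounds on the ad valorem compliance cost.

    margin: tariff-preference margin (for example, 0.10 for 10 percent).
    utilization: share of exports entering under the preference, 0 to 1.
    Returns (lower, upper, point); None marks a value that the argument
    does not pin down.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must lie between 0 and 1")
    if utilization == 1.0:
        # Everyone uses the preference: the cost cannot exceed the margin.
        # The lower bound of zero assumes costs are never negative.
        return (0.0, margin, None)
    if utilization == 0.0:
        # No one uses the preference: the cost cannot be below the margin.
        return (margin, None, None)
    # Interior case: the average exporter is assumed to be close to
    # indifference, so the cost is taken to be about equal to the margin.
    return (None, None, margin)


if __name__ == "__main__":
    for u in (1.0, 0.0, 0.6):
        print(u, compliance_cost_bounds(0.10, u))
```

The interior case is the only one that yields a point estimate, and it rests entirely on the indifference assumption; the two corner cases give one-sided bounds only.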
Applying this reasoning gives trade-weighted ad valorem estimates of 4.7-8.2 percent depending on sectors for PANEURO and 1.8-1.9 percent for NAFTA, values in line with the econometric estimates of Carrère and de Melo (2006) reported earlier.

How then should requests for preferential status be interpreted when tariff preferences are nil? Beyond (likely) errors in data transcription, the logical possibility would be that administrative costs are negligible, but this contradicts the evidence (the nonparametric approach described in the previous paragraph gave estimates of pure administrative costs slightly above 3 percent in ad valorem form). Francois, Hoekman, and Manchin (2006) elegantly addressed this problem by modeling the determinants of utilization rates for EU trade with ACP countries in a switching-regression framework where the relationship between the variable of interest (utilization rates) and explanatory variables varies between two regimes: one for low-margin sectors and the other for high-margin ones. The dividing point between the two regimes is determined by the data using an algorithm developed by Hansen (2000).10 They found that exporters start requesting preferences when preferential margins are in the 4.0-4.5 percent range, a result that is also broadly consistent with the nonparametric estimates of compliance costs reported above.

Other studies using aggregate bilateral trade data also suggest costs associated with the presence of rules of origin. Using a gravity model of bilateral trade, Anson et al. (2005) find that after controlling for the other determinants of the volume of bilateral trade, including the presence of free trade agreements, the intensity of bilateral trade is inversely related to the values taken by the R-index.
Using a similar framework, Augier, Gasiorek, and Tang (2005) find that the volume of bilateral trade is lower when cumulation is on a bilateral rather than a full basis, leading them to suggest that rules of origin should be relaxed to allow for full cumulation.

The evidence reported so far in this article is overwhelming: rules of origin are burdensome and foster economic inefficiency. But this article also argues that they have a role in combating trade deflection, so calling them trade barriers is not enough. To make progress in designing "clean" rules of origin, a key part of the argument is to tell apart, in their current characteristics (and in particular their restrictiveness), how much is attributable to their antideflection role and how much is simply capture by special interests.

Portugal-Perez (2006) tries to address this issue by decomposing variations in the R-index into a component attributable to trade deflection and one associated with lobbying or political-economy motives. He estimates this decomposition for Mexican textile and apparel exports to the United States under NAFTA using a regression equation, equation (5), in which the dependent variable, ROO_i, is the R-index value at the HS-6 level. The regressors fall into two groups. The trade-deflection vector includes a proxy for the extent of product differentiation (the more homogeneous the product, the more there is to gain from arbitraging even small differences in external tariffs) and differences in external tariffs (the larger these differences, the more there is to arbitrage). The political-economy variables include the level of the United States' most favored nation tariff (a proxy for lobbying power), revealed comparative-advantage indexes, and the value of Mexican exports to the rest of the world (a proxy for potential penetration of the U.S. market). Portugal-Perez finds strong and quite robust correlations, suggesting that both sets of factors are at work in explaining cross-sectoral variations in rules of origin restrictiveness.
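As a sketch only, a linear specification consistent with this description could be written as follows. The linear form and the symbols \alpha, \boldsymbol{\beta}, \mathbf{TD}_i, \mathbf{PE}_i, and \varepsilon_i are assumptions introduced here for illustration; the exact specification in Portugal-Perez (2006) may differ.

```latex
\mathrm{ROO}_i = \alpha + \boldsymbol{\beta}'\,\mathbf{TD}_i + \boldsymbol{\gamma}'\,\mathbf{PE}_i + \varepsilon_i
```

Here \mathrm{ROO}_i is the R-index value for product i, \mathbf{TD}_i collects the trade-deflection variables, and \mathbf{PE}_i collects the political-economy variables. Under this form, a counterfactual without political-economy influences would drop the \boldsymbol{\gamma} term from the fitted values:

```latex
\widehat{\mathrm{ROO}}_i^{\,cf} = \hat{\alpha} + \hat{\boldsymbol{\beta}}'\,\mathbf{TD}_i
```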
Using estimated parameter values, he constructs a counterfactual distribution of R-index values across goods in the absence of political-economy correlates (that is, by setting γ = 0 in equation (5)). The two distributions (actual and counterfactual) are reported in figure 3. They show that political-economy concerns (which shift the actual distribution to the right of the counterfactual) contribute to the overall restrictiveness of rules of origin. Drawing on the estimates by Carrere and de Melo (2006) discussed earlier, he concludes that capture by special interests may have raised the costs of rules of origin by an average of 3.5-11 percent of good value, a very steep increase in the face of the shallow preferences that are generally granted.

Figure 3. Counterfactual Distribution for R-Index
Source: Portugal-Perez 2006, figure 3.

Simulation methods provide another way of obtaining orders of magnitude of rules of origin effects on trade. Francois, Hoekman, and Manchin (2006) use their estimate of compliance costs to simulate the effects of trade liberalization by developed economies on low-income countries in a multiregional trade model. Despite preference erosion, low-income countries gain instead of losing from trade liberalization by the European Union because the "rectangle" deadweight losses associated with compliance costs are eliminated.

Table 6 provides alternative estimates from a partial-equilibrium perspective, taking as an example a GSP country benefiting from a 10 percent preferential margin in the EU (or U.S.) market (row 1) but forced to raise its minimum local content from the value in row 3 (40 percent, except in column 5) to the value in row 4 (50 percent, except in column 5). When present, administrative costs, also expressed as a percentage of the unit price, are given in row 2. The table's bottom three rows show the effect of rules of origin on equilibrium exports and prices.
Column 1 shows the benefits that accrue to the GSP producer from receiving a 10 percent preference margin with no constraint on the sourcing of inputs. For this constellation of elasticities (all are on the high side to reflect the likelihood that products from different origins are close substitutes, whether at the intermediate- or final-good level), the pass-through is 2.9 percent (row 7) out of a preference margin of 10 percent, in line with econometric estimates mentioned in the section on direct evidence. Exports increase by 16 percent, but costs do not increase because inputs are bought at constant world prices.

Column 2 shows what happens when the producer must reduce the use of nonoriginating materials to meet a value content rule of 50 percent (a 25 percent increase from column 1). For the example, where value added is 20 percent and unconstrained purchases of nonoriginating intermediate goods equal 75 percent of the value of total intermediate good purchases, raising the minimum local content from 40 to 50 percent implies that purchases of nonoriginating intermediate goods must be reduced to 62.5 percent. The result of forcing producers to shift away from preferred intermediate goods is a higher unit production cost resulting in lower export volume, with the 1.9 percent increase in unit cost passed on to EU and U.S. consumers. Matters get worse if substitution possibilities for materials from different origins are low (column 3), which might be representative of industries with a lot of transformation and many production stages.11

Column 4 mirrors column 2 but adds administrative compliance costs of 2.5 percent. This further penalizes the GSP producer, even though part of this cost increase can again be passed on to consumers in the importing country. Of course, if GSP producers were competing with close substitutes, they would be unable to pass on the price increase.
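The arithmetic behind the move from 75 to 62.5 percent in column 2 can be checked with a short sketch. It assumes, for illustration, that the unit price is split between value added and intermediate purchases only, and that all shares are expressed as fractions; the function names are hypothetical.

```python
def local_content(value_added, nonoriginating_share):
    """Originating share of the unit price.

    value_added: value added as a share of the unit price.
    nonoriginating_share: nonoriginating intermediates as a share of total
    intermediate purchases.
    """
    intermediates = 1.0 - value_added
    return value_added + intermediates * (1.0 - nonoriginating_share)


def max_nonoriginating_share(value_added, min_local_content):
    """Largest nonoriginating share of intermediates meeting the rule."""
    if value_added >= min_local_content:
        return 1.0  # value added alone already satisfies the requirement
    return (1.0 - min_local_content) / (1.0 - value_added)
```

With value added of 0.20 and a nonoriginating share of 0.75, `local_content` gives 0.40, the initial minimum local content. Raising the requirement to 0.50 gives a maximum nonoriginating share of 0.625. The same identity reproduces the originating values of 64 and 46 percent in the AFTA example in note 8.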
Finally, column 5 considers a simulation that might be fairly representative of an industry with enough originating intermediate good purchases that the shift to a 40 percent minimum local content would not affect producers much. In this case, the net price to producers might go up by about one-third of the preference margin, resulting in a modest supply response of about 10 percent.

Implications for Reform

If rules of origin are a legitimate way to prevent trade deflection by mandating that sufficient processing take place in the preferential zone, the accumulated evidence reported in this article indicates that they have gone vastly beyond that role, becoming akin to technical barriers to trade. Various estimates suggest that the compliance costs associated with meeting origin requirements in preferential trade agreements range between 3 and 5 percent of final product prices, a very stiff price tag for preference margins that are often thin, given that most favored nation tariffs are low in most sectors except textiles and apparel. Controlling for preferential margins, utilization rates are lower in product lines with more restrictive rules of origin and when producers are limited in the sourcing of their intermediate good purchases.

Because of their trade-inhibiting effects, rules of origin hinder the integration of preference-receiving least developed economies in the world economy and thus work at cross-purposes with the development-policy goals of EU and U.S. preferences. For Sub-Saharan African countries supplying apparel products to the European Union, even high utilization rates hide obstacles to export growth caused by the double-transformation requirement imposed on those products.

This article also shows that in the case of the European Union and the United States, the two largest users of preferential trade agreements, rules of origin are stricter for products with tariff peaks where preferences could be most valuable.
The correlation between the presence of tariff peaks and that of highly restrictive rules of origin suggests capture by protectionist interests, a hypothesis largely confirmed by political-economy theory and evidence. Moreover, because rules of origin have so far escaped WTO disciplines (whereas other, more traditional trade-policy instruments are brought under increasingly stringent ones), they stand as a choice candidate for creeping protectionism.

Despite the prevalence of capture by special interests, two quasi-natural experiments point to broad directions for reform. First, the relaxation of the U.S. triple-transformation requirement in textile and apparel for Sub-Saharan African producers under AGOA has proved to strongly encourage export diversification and growth compared with exports destined to the European Union, which are subject to stricter rules under the Everything But Arms initiative (which otherwise features similar preference margins). Second, low-income Asian countries operating under simple and benign rules of origin have been able to rapidly integrate themselves into cross-border supply chains and have, as a result, tremendously benefited from the verticalization of world trade.

These observations suggest that a multilateral agenda for preferential rules of origin reform, a key step in bringing preferential trade agreements under WTO disciplines, would have to move along three dimensions: harmonization, simplification, and relaxation. Harmonization between trading blocs, although unlikely to be attained anytime soon, is desirable in view of the "spaghetti bowl" of preferential trade agreements and is a prerequisite for simple and mutually consistent cumulation rules. The European Union has set an example in this regard with the PANEURO system, designed precisely to facilitate cumulation across preferential zones.
For simplification, arguments in favor of a single across-the-board rule are much like those in favor of uniform tariffs: simplification fosters transparency and mitigates capture. Clearly, technical requirements should be targeted for elimination first because they are the most opaque, difficult to harmonize, and capture-prone instruments. Leaving aside agricultural products that could still operate under the wholly obtained criterion and keeping in mind that any uniform rule will affect industries and countries differently, two avenues could be considered: a simple change of tariff classification, say at the subheading (HS-6) level so that it is not too restrictive, or a uniform value-content rule. Some information can be gleaned in this regard from the European Union's recent review.

The change of tariff classification has the advantage of simplicity, transparency, and low administrative costs. But the Harmonized System tariff nomenclature was designed to collect trade statistics, not to separate products and confer origin, so defining the change of tariff classification at a uniform level would produce erratic results across sectors. This would call for exceptions to uniformity, opening up a Pandora's box of special deals. Moreover, a change of tariff classification would not easily lend itself to differential treatment for least developed economies, which should be an objective (see below).

Notwithstanding conceptual clarity, a value content rule may be less than straightforward to apply in practice.12 It may increase producer risk due to the sensitivity of costs to exchange-rate, wage, and commodity-price fluctuations and is also burdensome to apply for customs officials. However, it is simple to specify and transparent, and it allows for differential treatment of least developed economies.
All told, if properly specified, it is the best candidate for an across-the-board criterion, ideally in combination, at the exporter's choice, with a change of tariff classification. In this spirit Tumurchudur (2007b) estimated for each good the maximum foreign content that would make a value content rule equivalent to the current array of NAFTA's rules of origin. Her method consisted of three steps. First, she estimated the statistical relationship between utilization rates and rules of origin, including value content rules. Second, she inverted that relationship to find the rate of a value content rule that would give a utilization rate equal to the current one. Third, she calculated the trade-weighted average of that maximum content. This neutral average turns out to be a very low 21 percent of the good's value in maximum foreign content, confirming the diagnosis that NAFTA's rules of origin are very restrictive. More important, this rate provides a transparent and fully comparable benchmark on which to base discussions of reform and harmonization.

If the slow pace of harmonization talks at the WTO is any indication, the reform agenda described above may be overambitious by several orders of magnitude. Yet if the European Commission manages to complete the agenda, competition between systems may trigger similar rounds of simplification elsewhere, including in free trade agreements between developing economies in Africa and Latin America, whose rules of origin are often directly inspired by NAFTA and PANEURO. However, the outcome of the EU reform process is highly uncertain at this stage; moreover, even if the plan to adopt an across-the-board value content criterion survives, it is not clear that the rate of this value content rule would be uniform. Nor is it certain (perhaps even less) that it would relax the restrictiveness of the current system.

More immediate, win-win steps may be a better way to proceed.
A simple first step would consist of eliminating rules of origin requirements for tariff lines with preferential margins below 3 or perhaps even 5 percent (the rate could be agreed upon in the context of multilateral negotiations at the WTO). This would be an all-around winning proposition since resources would be freed for other purposes, especially in developing economies, but also for consumers in developed economies, who would no longer bear part of the increased costs associated with compliance. A second step would be to allow for differential treatment not across sectors, but across beneficiaries, with low value content requirements for least developed economies reflecting the empirical observation that the "slices" of value added in least developed economies through cross-border production networks are generally thin. In this regard, the experience with the U.S. special regime granted in textile and apparel to African producers under AGOA is most encouraging.

Appendix. Glossary of terms

Harmonized System. A system of classification for traded goods in which all countries belonging to the World Customs Organization participate. It classifies traded goods into (by increasing order of disaggregation) 21 sections (one digit), 99 chapters (two digits), 1,417 items (four digits), and 4,998 subitems (six digits). Beyond that (eight- and ten-digit), classification systems are no longer harmonized across countries and are subject to frequent classification changes.

Preference Margin. The difference between most favored nation and preferential tariffs.

Preference Pass-Through. The percentage of a tariff-preference margin that is "appropriated" by exporters in the form of an increase in the export price. It is inversely related to the bargaining power of importers.

Preferential Status. Whether a good is eligible for the preferential tariff rate.

Technical Requirement.
Rule of origin that imposes a certain type of production process or the use of certain specified technology or standard.

Trade Deflection. Use by importers in a free trade agreement of the member country with the lowest external tariff (which reduces tariff revenue for the others). This notion is distinct from Vinerian "trade diversion."

Utilization Rate. Share of exports shipped under the preferential (as opposed to most favored nation) regime.

Regime-wide Rules of Origin

Absorption or Roll-up. Principle that allows nonoriginating materials that have acquired origin by meeting specific processing requirements to maintain this origin when used as input in a subsequent transformation. In other words, the nonoriginating materials are no longer taken into account in calculating value added. The roll-up or absorption principle is used in most preferential trade agreements (in particular, the EU GSP and the Cotonou Agreement), although a few have exceptions for the automotive sector.

Cumulation. Principle that allows producers from one member country in a preferential trade agreement to import nonoriginating materials from another member country without affecting the final product's originating status. There are three types of cumulation rules: bilateral, diagonal, and full.

Bilateral Cumulation. The most common type, it applies to trade between two partners in a preferential trade agreement. It stipulates that producers in country A can use inputs from country B without affecting the final good's originating status as long as the inputs satisfy the area's rules of origin.

Diagonal Cumulation. Under diagonal cumulation (the basic principle of the EU's PANEURO system), countries in a preferential trade agreement can use materials that originate in any member country as if the materials originated in the country where the processing is undertaken.
Full Cumulation. Under full cumulation, all stages of processing or transformation of a product within countries in a preferential trade agreement can be counted as qualifying content regardless of whether the processing is sufficient to confer originating status to the materials themselves. Full cumulation allows for greater fragmentation of the production process than bilateral and diagonal cumulation.

Duty Drawbacks. Refunds to exporters of tariffs paid on imported intermediate good inputs. Many preferential trade agreements, especially in the Americas, mandate the elimination of duty-drawback schemes for exports to partner countries on the grounds that a duty drawback claimed by a producer in country A to export to country B would put that producer at a competitive advantage compared with domestic producers in country B, given that the producer in country A already benefits from the elimination of intrabloc tariffs. Eliminating duty drawbacks as part of a preferential trade agreement can harm the profitability of final-good assembly for export to partner countries in the area, although tariff escalation, when present, already provides some protection for final-assembly operations (because it implies lower tariffs on intermediate goods than on final ones).

Product-Specific Rules of Origin

Allowance. An amendment to a mandated change of tariff classification that excludes some categories from noneligibility (that is, a final good belonging to, say, chapter 11 can embody imported inputs belonging to any other chapter or from chapter 11 itself but between headings X and Y).

Change of Tariff Classification. Rule of origin requiring that a final good made with imported inputs belong to a Harmonized System category that differs from that of its imported inputs (as proof of transformation). The mandated change of tariff classification can be specified at the chapter (two digits), heading (four digits), subheading (six digits), or item (eight digits) level.

Exception.
An amendment to a mandated change of tariff classification that excludes some categories from eligibility (that is, a final good belonging to, say, chapter 11 can embody imported inputs belonging to any other chapter except headings X to Y).

Value Content. Rule of origin requiring a minimum percentage of local value (materials or value added) or a maximum percentage of foreign value.

Olivier Cadot is Professor of Economics at the University of Lausanne, associated scholar at Centre d'Etudes et de Recherches sur le Developpement International (CERDI), and fellow at the Centre for Economic Policy Research (CEPR); his email address is olivier.cadot@unil.ch. Jaime de Melo (corresponding author) is Professor of Economics at the University of Geneva, associated scholar at CERDI, and fellow at CEPR; his email address is demelo@ecopo.unige.ch. The authors thank Paul Brenton and Marcelo Olarreaga for many useful suggestions and their colleagues and co-authors Celine Carrere, Antoni Estevadeordal, Alberto Portugal-Perez, Akiko Suwa-Eisenmann, and Bolormaa Tumurchudur for permission to draw on joint work. They also thank three referees for comments on a previous draft.

1. According to this same tally, 45 developing economies have signed bilateral trade agreements with a developed country, and 90 of the 109 preferential trade agreements between developed and developing economies have been created since 1990.

2. According to a survey administered by the World Customs Organization to customs officials in developing economies (as reported by Brenton and Imagawa 2004), 67 percent of respondents in Sub-Saharan Africa agree that dealing with rules of origin under overlapping trade agreements causes problems, and a majority also agrees that rules of origin are more labor-intensive. Administering rules of origin detracts from other objectives of tax collection and trade facilitation.

3.
Because meeting the requirements is difficult and appears unnecessarily complex, in view of the European Commission's objective to grant some preferential access to its market for GSP-eligible countries, on 16 March 2005 the commission adopted Communication COM (2005) on "The Rules of Origin in Preferential Trade Arrangements." The communication explores alternative rules of origin that would be simpler and more development friendly. A key proposal under consideration is to replace the current product-specific rules of origin with a single rule based on a minimum of originating value added.

4. By comparison, the average preferential margin (computed over tariff lines with positive tariffs) was 4.5 percent for NAFTA (almost all tariffs had been eliminated on NAFTA trade by 2001), 2.4 percent for GSP-eligible countries, and 4.6 percent for ACP countries (not eligible for Everything But Arms status). Data for the European Union are for 2004, when 62 percent of trade for GSP-eligible countries and over 80 percent of trade for ACP countries took place at zero tariffs (some ACP countries also benefited from Everything But Arms status at zero tariffs in the EU market).

5. Krishna (2006) discusses other effects that are more difficult to quantify, such as rules of origin-jumping investment and effects on intermediate prices. Thoenig and Verdier (2006) also consider the implications of rules of origin for multinationals confronted with outward-processing decisions.

6. The United States rarely uses a value content criterion as the sole requirement for origin, and when it does it tends to rely on a single 40 percent foreign content requirement. The European Union has value content criteria ranging from 50 to 15 percent of domestic value added.

7.
Cumulation is, in principle, only diagonal (see the glossary in the appendix), but the domestic content can be calculated as an aggregate of value added in any ASEAN member state; so in effect AFTA provides for full cumulation, although, as noted by Brenton (2006), the rules stipulate that the final stage of manufacture must be carried out in the exporting member state (what constitutes "the final stage" is not defined). Because vertical links and outsourcing are very important in Asia, full cumulation considerably relaxes the requirements of satisfying origin.

8. To drive home the importance of trade in intermediate goods, consider the following example. On the basis of the input-output data in Baldwin (2006, table 1) for Indonesia, Malaysia, Philippines, and Thailand (middle-income Asian countries), an average of 35-40 percent of intermediate goods are sourced outside AFTA. For example, take an activity with 10 percent value added and 40 percent nonoriginating intermediate goods; that is, 36 percent of the final unit product price is nonoriginating. Originating value for this activity would be 64 percent. Then take the plausible example of an activity with the same value added but with 60 percent of materials nonoriginating; originating value falls to 46 percent, barely above the 40 percent minimum currently stipulated in AFTA.

9. The special regime was recently extended until 2015. Figure 2 lists the 22 beneficiary countries.

10. The algorithm is in essence a grid search over cutoffs whose criterion is the minimization of the concentrated sum of squared errors of the ordinary least squares regressions in the two regimes.

11. The decline in exports to the preferential-giving destination suggests that producers would choose to export under most favored nation status.
In the illustrative simulations reported here, with constant elasticity throughout and smooth substitution possibilities across the origin for intermediate good purchases and export destination sales, producers pass on cost increases to consumers.

12. The authors of this article are aware of concerns voiced by the private sector in the course of the EU review about the practical difficulty of a value content criterion for small firms and, if based on costs, its potential to force unwanted disclosure of strategic information to powerful EU buyers that would enhance their ability to squeeze rents from developing country producers.

References

Anson, J., O. Cadot, C. Carrere, A. Estevadeordal, J. de Melo, and B. Tumurchudur. 2005. "Rules of Origin in North-South Preferential Trading Arrangements with an Application to NAFTA." Review of International Economics 13(3):501-17.
Augier, P., M. Gasiorek, and C. Tang. 2005. "The Impact of Rules of Origin on Trade Flows." Economic Policy 20(43):567-624.
Baldwin, R. 2006. "Managing the Noodle Bowl: The Fragility of East Asian Regionalism." CEPR Discussion Paper 5561. Centre for Economic Policy Research, London.
Brenton, P. 2006. "Notes on Rules of Origin with Implications for Regional Integration in South East Asia." World Bank, Washington, D.C.
Brenton, P., and H. Imagawa. 2004. "Rules of Origin, Trade and Customs." In L. de Wulf and J. Sokol, eds., Customs Modernization Handbook. Washington, D.C.: World Bank.
Brenton, P., and M. Manchin. 2003. "Making EU Trade Agreements Work: The Role of Rules of Origin." The World Economy 26(5):755-69.
Brenton, P., and C. Özden. 2005. "Trade Preferences for Apparel and the Role of Rules of Origin: The Case of Africa." World Bank, Washington, D.C.
Cadot, O., C. Carrere, and V. Strauss-Kahn. 2007. "Export Diversification: What's Behind the Hump?" University of Lausanne, Switzerland.
Cadot, O., J. de Melo, and A. Portugal-Perez. 2005. "Market Access and Welfare under Free Trade Agreements: The Case of Textiles under NAFTA." World Bank Economic Review 19(3):379-405.
Cadot, O., J. de Melo, and A. Portugal-Perez. Forthcoming. "Rules of Origin for Preferential Trading Arrangements: Implications for the ASEAN FTA of EU and US Experience." Journal of Economic Integration.
Cadot, O., J. de Melo, and E. Pondard. 2006. "Evaluating the Consequences of a Shift to a Value-added Method for Determining Origin in the European Union's GSP Preferential Scheme." Report prepared for the European Commission, Brussels.
Cadot, O., A. Estevadeordal, and A. Suwa-Eisenmann. 2006. "Rules of Origin as Export Subsidies." In O. Cadot and others, eds., The Origin of Goods: Rules of Origin in Regional Trade Agreements. London: Oxford University Press.
Cadot, O., C. Carrere, J. de Melo, and B. Tumurchudur. 2006. "Product Specific Rules of Origin in EU and US Preferential Trading Arrangements: An Assessment." World Trade Review 5(2):199-224.
Cadot, O., A. Estevadeordal, A. Suwa-Eisenmann, and T. Verdier, eds. 2006. The Origin of Goods: Rules of Origin in Regional Trade Agreements. London: Oxford University Press.
Carrere, C., and J. de Melo. 2006. "Are Different Rules of Origin Equally Costly? Estimates from NAFTA." In O. Cadot and others, eds., The Origin of Goods: Rules of Origin in Regional Trade Agreements. London: Oxford University Press.
Estevadeordal, A. 2000. "Negotiating Preferential Market Access: The Case of NAFTA." Journal of World Trade 34(1):141-66.
Estevadeordal, A., and K. Suominen. 2006. "Mapping Rules of Origin Around the World." In O. Cadot and others, eds., The Origin of Goods: Rules of Origin in Regional Trade Agreements. London: Oxford University Press.
Francois, J., B. Hoekman, and M. Manchin. 2006. "Preference Erosion and Multilateral Trade Liberalization." World Bank Economic Review 20(2):197-216.
Hansen, B. 2000. "Sample Splitting and Threshold Estimation." Econometrica 68(3):575-603.
Herin, J. 1986. "Rules of Origin and Differences between Tariff Levels in EFTA and in the EC." EFTA Occasional Paper 13. Geneva: European Free Trade Association.
Krishna, K. 2006. "Understanding Rules of Origin." In O. Cadot and others, eds., The Origin of Goods: Rules of Origin in Regional Trade Agreements. London: Oxford University Press.
Krueger, A.O. 1998. "Free Trade Areas versus Customs Unions." Journal of Development Economics 54(1):169-87.
Olarreaga, M., and C. Özden. 2005. "AGOA and Apparel: Who Captures the Tariff Rent in the Presence of Preferential Market Access?" World Economy 28(1):63-77.
Özden, C., and G. Sharma. 2006. "The Price Effects of Preferential Market Access: Caribbean Basin Initiative and the Apparel Sector." World Bank Economic Review 20(2):241-60.
Portugal-Perez, A. 2006. "Disentangling the Determinants of Rules of Origin in North-South Preferential Trade Agreements: Evidence for NAFTA." University of Geneva.
Portugal-Perez, A. 2007. "The Costs of ROO in Apparel: African Apparel Exports to the US and EU." University of Geneva.
Stevens, C., M. Gasiorek, J. Chweijczak, and J. Kennan. 2006. "Creating Development Friendly Rules of Origin in the EU." Overseas Development Institute, International Economic Development Group, London. (www.odi.org.uk/iedg/publications/Rules_of_Origin_FinalReport.pdf).
Thoenig, M., and T. Verdier. 2006. "The Impact of ROO on Strategic Outsourcing: An IO Perspective." In O. Cadot and others, eds., The Origin of Goods: Rules of Origin in Regional Trade Agreements. London: Oxford University Press.
Tumurchudur, B. 2007a. "Rules of Origin and Market Access in the Europe Agreements." University of Lausanne, Switzerland.
Tumurchudur, B. 2007b. "Reforming NAFTA's Rules of Origin." University of Lausanne, Switzerland.
World Bank. 2005. Global Economic Prospects 2005: Trade, Regionalism and Development. Washington, D.C.
WTO (World Trade Organization). 2002. "Rules of Origin in Regional Trade Agreements." WT/REG/W/45. Geneva, Switzerland.
Wulf, L. de, and J. Sokol, eds. 2004. Customs Modernization Handbook. Washington, D.C.: World Bank.