POLICY RESEARCH WORKING PAPER 2470

Are Larger Countries Really More Corrupt?

Or has evidence that government corruption is less severe in small countries been an artifact of sample selection?

Stephen Knack
Omar Azfar

The World Bank
Development Research Group
Regulation and Competition Policy
November 2000

Summary findings

Several authors claim to provide evidence that government corruption is less severe in small than in large countries. Knack and Azfar demonstrate that this relationship is an artifact of sample selection.

Most corruption indicators provide ratings only for the countries in which multinational investors have the greatest interest. These tend to include almost all large nations but, among small nations, only those that are well governed.

Knack and Azfar find that the relationship between corruption and country size disappears when one uses either a new corruption indicator with substantially increased country coverage or an alternative corruption indicator that covers all World Bank borrowers without regard to country size.

They also show that the relationship between corruption and trade intensity - a variable strongly related to population - disappears when samples less subject to selection bias are used.

This paper - a product of Regulation and Competition Policy, Development Research Group - is part of a larger effort in the group to identify the determinants of good governance and institutions conducive to long-run economic development. Copies of the paper are available free from the World Bank, 1818 H Street NW, Washington, DC 20433. Please contact Paulina Sintim-Aboagye, room MC3-422, telephone 202-473-8526, fax 202-522-1155, email address psintimaboagye@worldbank.org. Policy Research Working Papers are also posted on the Web at www.worldbank.org/research/workingpapers. The authors may be contacted at sknack@worldbank.org or omar@iris.econ.umd.edu. November 2000.
(28 pages)

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the view of the World Bank, its Executive Directors, or the countries they represent.

Produced by the Policy Research Dissemination Center

Are Larger Countries Really More Corrupt?

Stephen Knack
The World Bank

and

Omar Azfar
IRIS, University of Maryland

Sknack@worldbank.org, omar@iris.econ.umd.edu

We thank Anand Swamy for helpful comments and suggestions, Paul Schorosch for able research assistance, and Ray Fisman, Roberta Gatti, Aart Kraay, and Shang-jin Wei for kindly providing data.

1. Introduction

A growing literature finds that corruption, and the quality of governance more generally, has important implications for growth of per capita incomes (e.g. Mauro, 1995; Knack and Keefer, 1995; Hall and Jones, 1999). Motivated in part by the breakup of the Soviet Union and Yugoslavia, another literature addresses the implications of country size for growth and efficient provision of public goods (Alesina, Spolaore and Wacziarg, 2000; Alesina and Wacziarg, 1999; Alesina and Spolaore, 1997). Several recent papers, combining elements from these separate literatures, examine the impact of country size on the quality of governance (Fisman and Gatti, 2000; Root, 1999; Treisman, 1999). These studies all conclude that larger countries tend to have governments that are more corrupt than governments in smaller nations. We demonstrate that this relationship is entirely an artifact of sample selection.
Most available corruption indicators provide ratings only for those countries in which multinational investors have the greatest interest: these tend to include almost all large nations, but among small nations only those that are well governed. The relationship between corruption and country size disappears when we use either a new corruption indicator with substantially increased country coverage, or an alternative corruption indicator that covers all World Bank borrowers without regard to country size. This finding also applies to the apparently favorable effects of trade openness on corruption (Ades and Di Tella, 1999; Treisman, 2000; Wei, 2000). We show that the relationship between openness and corruption is not robust to the use of the newer corruption indicators that are less subject to sample selection bias.

2. The Costs and Benefits of Being Small

The post-cold war era has seen a dramatic increase in the number of new nations and of new and increasingly plausible independence movements. Twenty new nations were created between 1990 and 1994 (Alesina, Spolaore and Wacziarg, 2000), mostly due to the fragmentation of the Soviet Union and Yugoslavia, and the divisions of Ethiopia and Czechoslovakia. In Indonesia, Aceh or Irian Jaya may follow East Timor in demanding autonomy or even independence. Within Western Europe, popular movements in regions such as Catalonia, Lombardy and Scotland have demanded more extensive self-governance. In Africa, civil wars in Congo-Zaire and other nations have the potential to break up nations. These breakups, coupled with the EU monetary union and the spread of "globalization," have led to an increased interest in the issue of the optimal size of nation-states.
A Wall Street Journal feature extolled the economic performance of small nations, arguing that globalization - with freer trade and increased mobility of labor and capital - has reduced the costs of being small.2 A theoretical paper by Alesina and Spolaore (1998) on "The Number and Size of Nations" was the subject of a full-page article in The Economist.3

1 Emerson (2000) describes potential independence movements in Indonesia. Newhouse (1997) discusses European regions, including Catalonia, the Rhone-Alps, Lombardy and the Veneto, which are demanding and sometimes obtaining increased autonomy. Rising regionalism and state break-ups in Europe and elsewhere have even prompted authors as influential as Vaclav Havel to suggest "the end of the nation state" (Havel 1999, see also Matthews 1997).

2 "An Era for Mice to Roar: From Iceland to Botswana, Small Nations Prosper," February 25, 1999.

3 See "A Wealth of Nations" (The Economist, April 29 1995).

Arguments for the benefits of small manageable countries date at least to Plato and Aristotle. Plato declares with characteristic boldness in "The Laws" that the optimal size of the state is 5040 citizens, and in fact prescribes population control to keep it at this precise level.4 Aristotle (1932) states in "The Politics" that "experience has also shown that it is difficult, if not impossible, for a populous state to be run by good laws" and prescribes that a state should be large enough to be self-sufficient, but small enough to be manageable and easily surveyed. Unlike Plato, he refrains from providing a precise number as the optimal state size. More recently, Jalan (1982: 85-86) suggests that smaller nations benefit from greater social cohesion and fewer vested interests, making it easier to adapt policies efficiently to new challenges and opportunities.

Other authors have noted various disadvantages associated with smallness.
Some theories of growth also imply increasing returns to scale in economic activity overall (e.g. Romer, 1986; Murphy, Shleifer and Vishny, 1989), which may put small nations at a disadvantage. However, cross-country growth regressions do not find evidence for such scale effects (e.g. Easterly and Kraay, 2000). Small economies are more vulnerable to terms of trade shocks, because they are more trade-dependent and because they tend to be less diversified across activities. Although this vulnerability is not associated with slower growth on average, it is associated with greater year-to-year volatility in growth rates (Easterly and Kraay, 2000). Larger nations may benefit more than smaller states from economies of scale in infrastructure, including communications, power generation, education, and health facilities. There are also plausibly substantial economies of scale in establishing political and administrative structures (Srinivasan, 1986).5 As Aristotle noted, the survivability of small states in a hostile environment is also problematic (Sardar, 1995; Harden, 1985).

4 Plato (1988). A state of 5040 citizens implies a population closer to 50,000 by modern standards, as women, slaves and many other adult permanent residents were not included as citizens.

5 However, some functions are often delegated by small states to supranational bodies to exploit economies of scale. The Eastern Caribbean Central Bank is one example.

In a series of recent papers, Alesina and co-authors have analyzed several issues relating to the optimal size and number of nations. Public goods provision has implications for the optimal size of nations, as the benefits of internalizing spillovers must be balanced against the costs of imposing a common set of policies on heterogeneous groups (Alesina and Wacziarg, 1999; Alesina and Spolaore, 1998). Alesina, Spolaore and Wacziarg (2000) have formally modeled the relationship between openness and the equilibrium number and size of nations.
In cross-country tests, they find that the impact of country size on income growth depends on trade openness. If trade levels are sufficiently high, larger size carries smaller growth benefits. They also present descriptive historical evidence suggesting that decolonization, the creation of nation-states, and secessions are influenced by trade openness.

3. Country Size and Corruption: Previous Evidence

Several authors have recently examined the implications of country size for the quality of governance, and found that smaller nations tend to have less corrupt governments than larger nations. This relationship is potentially important because there is strong evidence linking corruption, and the quality of governance more generally, to economic performance (e.g. Mauro, 1995; Knack and Keefer, 1995; Hall and Jones, 1999). If small size helps in controlling corruption, international support for autonomy or independence movements in highly corrupt nations could be grounded not only on principles of self-determination and concern for human rights, but also on "good governance" considerations.

Based on Transparency International's Corruption Perceptions Index for 1998, Root (1999), in a cross-country regression using a sample of 60 countries, finds that higher population is significantly associated with lower ratings (i.e. more corruption), controlling for several other variables. Root attributes this pattern to economies of scale in governance: in large nations rulers can extract significant resources from the country and pay off the constituencies necessary for them to maintain power. In small countries, economies of scale paradoxically imply that the state must be run well to be financially viable. While the argument is ingenious, it seems just as plausible to make the simpler argument that small nations do not have the fiscal resources to afford capable and honest civil servants, and may therefore suffer from more corruption and incompetence.
Fisman and Gatti (2000) conjecture that in large countries, which may have fewer government officials per citizen (due to economies of scale similar to those invoked by Root), citizens may be tempted to bribe officials to jump the queue. But if these economies of scale really existed, the time spent in queues might not be very long. Indeed, a recent World Bank/Commonwealth report (discussed below) argues that time spent in queues may be longer in small countries precisely because of these economies of scale.

Wei (2000) argues that smaller countries must be more open because fewer goods are domestically produced in small countries. The market discipline imposed by being an open economy in turn imposes good governance. Note that this argument also appears paradoxical: the quality of the government is being improved by restricting its choice set. In fact, smallness may increase the per-capita rents that can be extracted by customs officials for precisely the same reasons that small economies are more open. There are fewer domestically produced goods, increasing the demand for imports, so corrupt customs officials can demand larger bribes.6

6 Anderson et al (1999) find that the customs service is among the more corrupt government agencies in their empirical analysis of anti-corruption surveys from the former Soviet republic of Georgia.

Treisman (1999) finds that the log of population is associated with significantly worse corruption using Business International's corruption indexes for 1980-83, and TI's indexes for 1996-1998. He speculates that this result may be attributable to diminishing returns to scale in combating corruption (as law enforcement agencies become less effective as they get larger), or to lower exposure to imports (leading to more monopoly power in product markets).

Other arguments could be made about the impact of country size on corruption levels.
There may be diseconomies of scale in fighting corruption, if anti-corruption agencies must remain small to avoid infection by corrupt officers. Eliot Ness's early attempts to combat Al Capone failed because of such an infection; eventually he had to rely on a small band of "Untouchables" to snare Capone. Two of the three successful experiences of anti-corruption reform described by Klitgaard (1988) were in the small city states of Hong Kong and Singapore. On the other hand, Klitgaard's later book "Tropical Gangsters" (1991) described rampant corruption in tiny Equatorial Guinea. That nation's experience suggests that small size might facilitate corrupt activity, by making it easier for the government to suppress the media and the opposition.7

7 A recent article in the New York Times (Onishi 2000) states that there is not a single newsstand in the capital, Malabo, and that the president had bought off every opposition politician in the recent election.

A recent report on challenges facing small states8 suggests additional reasons to expect corruption to be more severe in small nations. Inability to take advantage of scale economies in the public sector can result in inadequate compensation levels for civil servants, increasing their temptation to solicit or accept bribes. Queues for public services could be longer, encouraging citizens to offer bribes - as Fisman and Gatti conjectured for larger countries.

8 See Commonwealth Secretariat/World Bank Joint Task Force on Small States (2000). Small States: Meeting Challenges in the Global Economy. Report of the Joint Task Force on Small States. According to the report, the median public sector wage bill is 31% of GDP in developing nations with fewer than 1.5 million people, compared to 21% for larger developing countries.
Particularly for nontradeables, or where markets are less open to foreign trade, there are likely to be more monopolies and oligopolies in smaller nations, with associated rents that public officials may be tempted to extract. The regulators and the regulated are more likely to have family or other personal ties in smaller nations.

Because arguments can be made either way, the relationship between corruption and country size is ultimately an empirical issue. Depending on the data set chosen, it is easy to find - as did Root (1999), Treisman (1999) and Fisman and Gatti (2000) - a strong pattern indicating that smaller countries are less corrupt than larger ones. However, we demonstrate below that this relationship is driven by the absence of smaller, more corrupt countries from the data sets used. In fact, there appears to be no such relationship.

4. New Evidence on Country Size and Corruption

We show in this section that the econometric relationship between country size and corruption identified in Root (1999), Treisman (1999) and Fisman and Gatti (2000) is entirely a statistical artifact created by sample selection in the availability of corruption data.

The corruption data used in these studies are obtained from firms that specialize in providing assessments of "political risk" to overseas investors. Generally, these risk assessment firms provide assessments for only a limited number of countries. The selection of countries will obviously reflect the interest of overseas investors - the clients of the risk assessment firms. Countries that constitute large markets - such as Brazil, India, Indonesia, Nigeria, Russia and the USA - will be of interest whether the country is well governed or not. In contrast, among smaller countries only those that are well governed (or rich) are likely to be of much interest to overseas investors. Iceland shows up in many more of the standard data sources on governance than does Equatorial Guinea.
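The selection rule just described can be illustrated with a small simulation. The sketch below is not part of the paper's analysis: it assumes, purely for illustration, that (log) population and a governance rating are drawn independently from standard normal distributions, and that a country is "rated" if it is large or, when small, only if its rating exceeds a hypothetical cutoff. The function names, the cutoff of 0.5, and the number of simulated countries (set high to keep sampling noise small) are all assumptions.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def selection_correlations(n_countries=2000, rating_cutoff=0.5, seed=0):
    """Return (full-sample r, rated-sample r, number of rated countries)."""
    rng = random.Random(seed)
    # Size and governance are independent by construction: no true relationship.
    log_pop = [rng.gauss(0.0, 1.0) for _ in range(n_countries)]
    rating = [rng.gauss(0.0, 1.0) for _ in range(n_countries)]  # higher = less corrupt
    # Hypothetical investor-interest rule: every large country is rated,
    # small countries are rated only if they are well governed.
    rated = [(p, r) for p, r in zip(log_pop, rating)
             if p > 0.0 or r > rating_cutoff]
    full_corr = pearson(log_pop, rating)
    rated_corr = pearson([p for p, _ in rated], [r for _, r in rated])
    return full_corr, rated_corr, len(rated)


full_corr, rated_corr, n_rated = selection_correlations()
print(f"all simulated countries: r = {full_corr:+.2f}")
print(f"rated countries only:    r = {rated_corr:+.2f} (N = {n_rated})")
```

Under these assumptions the full-sample correlation is close to zero, while the rated subsample shows a clearly negative correlation between size and the rating. Lowering `rating_cutoff` admits more small, poorly governed countries and moves the subsample correlation back toward the full-sample value.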
Clearly, selecting countries in this way can potentially create a spurious relationship between country size and corruption. The sample will tend to include all large nations, whether corrupt or not, but only the less corrupt nations among the smaller ones.

The most widely known corruption indicator is Transparency International's (TI) Corruption Perceptions Index. This corruption indicator is used by Root (1999), Treisman (1999, 2000) and Wei (2000). It is constructed by standardizing and equally weighting values from numerous other indicators, including expert assessments - such as the International Country Risk Guide's (ICRG) "corruption in government" rating - and surveys of investors and citizens.9 Values range from 0 (most corrupt) to 10 (least corrupt). Countries are rated by TI only if data are available from at least three underlying sources. For example, if the ICRG is the only source from which TI can find data on a given country, that country is not included in TI's index.

9 See Lambsdorff (1999) for detailed methodology. For the 1999 TI index, 17 sets of ratings from 10 separate sources were used.

As interest in corruption has increased in recent years, more data from surveys and other sources have become available to TI. Accordingly, the number of countries included in TI's index grew from 41 in 1995 to 99 in 1999 (with 54 in 1996, 52 in 1997, and 85 in 1998). As the number of countries increases, representation of smaller and more corrupt nations will tend to increase, if larger and less corrupt nations were already well represented in the data. Table 1 shows the sample sizes, means, and standard deviations of the TI indexes. Table 2 shows the simple correlations between country size and the TI corruption index for each of these years. The correlation with the log of population is -.64 for 1995.
If this relationship is driven by sample selection, then we would expect that this correlation would decline as the number of countries with TI ratings rises over time. This is exactly the pattern found in Table 2.10 As the TI sample more than doubles to 99 nations in 1999, the correlation with population drops by more than half to -.25.

This relationship weakens even further using either of two newer data sets that are far more inclusive of smaller nations than are the TI indexes. Recently, Kaufmann, Kraay and Zoido-Lobaton (1999) created a "Graft" index using data from 11 sources (mostly the same as used by TI), and a methodology which weights more heavily those indicators that tend to be most highly correlated with the others. In practice, the differences in data and methodology matter little, as the Graft index is correlated at .98 with the TI 1999 index. The major difference between the indexes is country coverage, as Kaufmann et al. provide ratings even where there are only one or two underlying data sources.11 The Graft index provides ratings for 155 countries, compared to 99 for the 1999 TI index. Values are standardized, so that a Graft index value of 1.5 indicates that a nation is 1.5 standard deviations above the mean value for all nations. The lowest value is -1.567 (Niger) and the highest is 2.085 (Finland and Sweden).

10 A similar pattern is found using log of gross national product as an alternative measure of country size. The correlation with the TI index goes from -.27 for 1995 to a statistically significant and positive .28 in 1999. We focus on log of population because it is the standard size measure used in the relevant literature.

11 Kaufmann et al. also provide "standard errors" associated with each country value; these standard errors increase with the level of disagreement among the underlying sources, and decrease with the number of sources from which data are available.
As Table 2 shows, the relationship between population and corruption weakens even further using this new index, which covers many more countries than TI. This result is consistent with the hypothesis that this relationship is dependent on the use of samples that tend to exclude small, corrupt nations. Table 2 also shows how the median population in the TI sample declines markedly with time. The biggest decline, from 22.5 million to 11.5 million, occurred between 1997 and 1998 - the year in which TI's country coverage expanded the most, from 52 to 85.

Using the Graft index as a benchmark, the composition of the TI samples also appears to be tilted toward less-corrupt nations. The median Graft rating for the TI 1995 sample was 0.83, nearly a full standard deviation above the mean of 0 for the full Graft sample. Nearly three-quarters of the TI 1995 sample have Graft scores above the mean value of 0. As the sample gradually increased through 1999, the representation of more-corrupt nations increased. In 1999, however, the TI sample retained a modest bias toward less-corrupt nations, relative to the Graft sample.12

12 Note the median value of the Graft index is -.24, below its mean of 0.

Of course, even country coverage for the Graft sample is influenced in part by investor interest, and may be subject to some bias in analyzing the relation between corruption and population. The median-population nation (with 9.2 million) in the 155-country Graft sample is larger than the median among all 207 nations (5.3 million) for which the World Bank had population data in 1998. The 52 nations for which population data are available, but which are missing data on the Graft index, may also be more corrupt on average than those for which Graft data are available.

A second new corruption indicator, constructed for internal use by the World Bank, rates every member country which is an active borrower (in practice, most members that are not high-income nations).
As part of the Bank's annual "Country Policy and Institutional Assessment" (CPIA), it rates 20 aspects of policies and governance on a 1-6 scale. One of these items measures "transparency, accountability, and corruption in the public sector." Unlike TI and even the Graft index, the sample composition is largely unrelated to population. The median country size among the 136 CPIA nations in 1999 was 6.4 million, not much larger than the median of 5.3 million among all nations with data. Of the 136 nations covered, 29 (mostly small) nations are not represented in the Graft sample. Among the CPIA nations with Graft ratings, the median value is -.40, and only 22.4% have Graft ratings above the mean of 0 (unsurprisingly, as high-income nations are not included in the CPIA ratings). Although the CPIA sample likely overrepresents the more corrupt nations, small nations are only slightly underrepresented, so the estimated relation between population and corruption should be little affected by selection bias. As shown in Table 2, the correlation of the CPIA corruption rating with population is a very modest and insignificant -.11.

By estimating missing Graft index data from CPIA ratings, or vice versa, corruption ratings for 1999 can be generated for 184 nations.13 Correlations of (log) population with these augmented Graft and CPIA indexes are only -.05 and -.04 respectively.

13 Missing Graft values are imputed from the regression Graft = -1.637 + .436(CPIA). Missing CPIA values are imputed from the regression CPIA = 3.325 + 1.155(Graft). In each regression, N = 107, t = 10.33, and R² = .50. These augmented corruption measures cover all 184 countries for which both population and per capita income data are available for 1998, with the sole exception of Antigua.

Evidence from multivariate tests confirms evidence from these simple correlations that the population-corruption relationship is driven by sample selection bias. Regressions in Table 3 control for other variables shown elsewhere to be associated with corruption levels, including per capita income, the Freedom House measure of political freedoms, and a dummy for former British colonies.14 As the TI sample increases substantially leading up to 1999, the coefficient on population declines dramatically (equations 1-5).15 For the Graft index and the CPIA index (equations 6-7), the relationship disappears entirely. As shown in the bottom row of Table 3, standardized coefficients for population in these regressions decline steadily as the sample increases. By estimating missing data on the Graft index from CPIA index values (or vice versa), the sample size in these regressions can be increased to 150. In these tests (not reported in the table), the effect of population is even weaker: for the augmented Graft index, the standardized estimate for population drops to -.006, and for the augmented CPIA index to -.017. The country size and corruption relationship thus appears to be entirely due to the use of samples which systematically exclude smaller and more corrupt nations.16

14 Independent variables are all lagged a year behind the dependent variable. These control variables are taken from Treisman (2000) and Swamy et al. (2000), who find corruption is less severe in ex-British colonies; that result also shows up in Table 3. Political freedom ranges from 1 (least free) to 7 (most free), and is significant only in the Graft index regression in Table 3 - the one with the largest sample, and with the greatest variation in freedoms. The coefficient of variation in freedoms in the TI 1995 sample (N = 38) in Table 3 is only .29, rising to .45 in the Graft sample (N = 129). We avoided using other regressors that plausibly are related to corruption but which would reduce the sample size substantially.
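The imputation step in footnote 13 amounts to applying two fitted lines. The sketch below simply evaluates the reported equations; the function names and the use of `None` for a missing rating are illustrative assumptions, not the authors' code. It also runs a consistency check that relies on two standard properties of bivariate least squares: the product of the two slopes equals R², and the slope's t statistic equals r·sqrt(N - 2)/sqrt(1 - r²).

```python
import math

# Coefficients as reported in footnote 13 (intercept, slope).
GRAFT_FROM_CPIA = (-1.637, 0.436)   # Graft = -1.637 + .436 * CPIA
CPIA_FROM_GRAFT = (3.325, 1.155)    # CPIA  =  3.325 + 1.155 * Graft
N_OBS = 107                         # countries with both ratings


def impute_graft(cpia):
    """Predicted Graft value for a country with only a CPIA rating."""
    intercept, slope = GRAFT_FROM_CPIA
    return intercept + slope * cpia


def impute_cpia(graft):
    """Predicted CPIA rating for a country with only a Graft value."""
    intercept, slope = CPIA_FROM_GRAFT
    return intercept + slope * graft


def augment(graft, cpia):
    """Fill in whichever rating is missing (None); hypothetical helper."""
    if graft is None and cpia is None:
        raise ValueError("at least one rating is required")
    if graft is None:
        graft = impute_graft(cpia)
    elif cpia is None:
        cpia = impute_cpia(graft)
    return graft, cpia


# Consistency check on the reported statistics.
implied_r2 = GRAFT_FROM_CPIA[1] * CPIA_FROM_GRAFT[1]
implied_t = (math.sqrt(implied_r2) * math.sqrt(N_OBS - 2)
             / math.sqrt(1.0 - implied_r2))
print(f"implied R^2 = {implied_r2:.3f}, implied t = {implied_t:.2f}")
```

The implied values agree with the reported R² = .50 and t = 10.33 up to rounding of the published coefficients.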
15 Although he does not take note of it, Treisman's (1999) coefficients (and standard errors) for the log of population exhibit a similar pattern: .75 (.19) using the 1996 TI ratings, with only 52 nations in the regression, but .50 (.29) using the 1998 ratings, with 74 countries in the regression.

16 We do not report estimates from Heckman sample selection models in tables, because an extended search failed to identify any variables that strongly affect selection but not corruption. However, the availability of the Graft and (especially) the CPIA indexes obviates the need for using Heckman selection methods to generate corrected estimates of the effects of country size.

5. Determinants of Inclusion in Corruption Data Sets

In this section, we present more direct evidence indicating that country size and corruption levels are each significant determinants of inclusion in the corruption data sets. Even for the 1999 TI ratings, which cover 99 countries, the only nations with populations of less than one million that are included (among more than 30 such independent nations in the world) are Iceland (274,000) and Luxembourg (426,000). In contrast, the most populous 8 nations are all included, as are 25 of the largest 30. Of the 15 most populous nations not included in the 1999 TI index, all have below-average ratings on the Graft index. The only two small nations included in the 1999 TI index, Iceland and Luxembourg, both score far above average on the TI index and on the Graft index. Deleting these two observations reduces the correlation of (log) population with TI's 1998 and 1999 indexes substantially (Table 2).

Table 4 summarizes how the likelihood of inclusion in the various corruption indexes is related to size and the quality of governance.
Countries are divided into four categories, those with (1) below-median population and below-median Graft ratings (augmented with estimated values based on CPIA ratings), (2) below-median populations and above-median corruption ratings, (3) above-median populations and below-median corruption ratings, and (4) above-median values of both population and corruption ratings. Of 42 countries in the first category, not a single one was included in the TI indexes for 1995, 1996, and 1997. Until 1998, the bulk of countries in categories 2 and 3 were also missing from the TI indexes. Most category 4 nations were included even in 1995 (27 of 40), with coverage rising to about 93% in 1998 and 1999. By contrast, the Graft index covers the majority of countries in category 1. It has roughly double the country coverage of TI 1999 for categories 2 and 3. The CPIA index actually over-represents countries in category 1 (41 of 42). However, most large countries with below-median corruption ratings are also represented (43 of 50). Countries with above-median corruption ratings are equally likely to be included whether they are small (29 of 50) or large (23 of 41).

The greater tendency for well governed nations to be included in the TI indexes is not attributable merely to their higher income levels. Table 5 reports logit regressions in which the dependent variable is a dummy, indicating whether each country is included in the relevant corruption data set. Independent variables include population, per capita income, and corruption levels, as measured by the Graft index (including values imputed from CPIA ratings, where Graft data were missing). Because of missing data on per capita income, about 20 fewer countries are represented in Table 5 than in Table 4. The coefficient on population is positive and significantly associated with the likelihood of inclusion in each of the TI indexes and the Graft index, but not for the CPIA.
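The four-way classification behind Table 4 is a simple median split followed by a coverage count. The sketch below shows one way to compute it; the field names (`log_pop`, `rating`, `in_index`), the treatment of ties (a country exactly at the median counts as below it), and the toy data are all assumptions made for illustration and are not taken from the paper's data.

```python
from statistics import median


def coverage_by_category(countries):
    """Map category 1-4 to (countries covered by the index, countries in category).

    Categories follow the text: 1 = small/low rating, 2 = small/high rating,
    3 = large/low rating, 4 = large/high rating (higher rating = less corrupt).
    """
    pop_median = median(c["log_pop"] for c in countries)
    rating_median = median(c["rating"] for c in countries)
    counts = {category: [0, 0] for category in (1, 2, 3, 4)}
    for c in countries:
        large = c["log_pop"] > pop_median
        high_rating = c["rating"] > rating_median
        category = 1 + 2 * int(large) + int(high_rating)
        counts[category][0] += int(c["in_index"])
        counts[category][1] += 1
    return {category: tuple(pair) for category, pair in counts.items()}


# Hypothetical toy data: eight made-up countries, two per category.
toy = [
    {"log_pop": 1.0, "rating": -1.0, "in_index": False},
    {"log_pop": 2.0, "rating": -0.5, "in_index": False},
    {"log_pop": 1.5, "rating": 1.0, "in_index": True},
    {"log_pop": 2.5, "rating": 0.5, "in_index": False},
    {"log_pop": 5.0, "rating": -1.2, "in_index": True},
    {"log_pop": 6.0, "rating": -0.3, "in_index": False},
    {"log_pop": 5.5, "rating": 0.8, "in_index": True},
    {"log_pop": 7.0, "rating": 1.5, "in_index": True},
]
for category, (covered, total) in coverage_by_category(toy).items():
    print(f"category {category}: {covered} of {total} covered")
```

Running the same function once per index (with `in_index` set from that index's country list) would yield one row of coverage counts per data set, in the spirit of Table 4.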
These results are all as expected, as TI and Graft are constructed by aggregating ratings provided by firms assessing risks to overseas investors, while the CPIA covers all World Bank borrowers whether large or small. Per capita income is positively and significantly (except in the case of TI 1996) associated with inclusion in the TI and the Graft indexes.17 Income is negatively and significantly associated with CPIA coverage, because high-income members of the World Bank tend not to be borrowers. Higher corruption ratings (measured by the Graft index, augmented by imputing values from CPIA ratings where data were missing) are associated with a significantly greater probability that nations are included in the TI sample.18 This result suggests that, even after taking into account income differences, risk ratings firms and other sources of corruption data often choose not to devote resources to providing regular assessments of nations which are not sufficiently well governed to generate interest among clients (mostly overseas investors and lenders).

17 This pattern has the potential to create a downward bias in estimates of the relationship between income and corruption. Well-off countries are likely to be included in the corruption data sets, whether they are particularly well governed or not; among the poorer countries, those that are well governed are more likely to be included (controlling for population). As the TI sample increases over time, the sample selection problem should diminish, suggesting that the coefficient on per capita income should increase. As samples expand, they are also likely to include more countries for which experts have relatively little information, and for which they might rely on income as a rough signal of the severity of corruption. This would also tend to increase the coefficient on per capita income. Table 3 provides no evidence, however, that this coefficient increases as country coverage expands. Standardized coefficients on per capita income for the six regressions in Table 3 respectively are .75, .83, .77, .83, .77, and .62.

Figures 1 and 2 plot the relationship between (log of) population and the Graft index, with countries represented in the TI 1996 (Figure 1) and 1997 (Figure 2) data sets marked by black diamonds, and countries without TI data marked by white diamonds. The figures illustrate the sample selection problem in the TI data very nicely. Overall, there appears to be no strong relationship between population and the Graft ratings in the figures. However, among those countries for which TI values are available (those marked by black diamonds), there is a clear and strong positive relationship between population and corruption (that is, Graft ratings fall as population rises). The figures provide obvious visual evidence that data availability on TI is highly dependent on population and on corruption levels: only the well-governed countries among small nations are represented in the TI index, and only the large nations among the poorly-governed ones are represented.

6. Implications

A recent study (Sambanis, 2000) concluded that territorial partition was ineffective in reducing ethnic violence. By demonstrating that the commonly-found relation between country size and corruption is an artifact of sample selection, the analysis above indicates that partition or secession would also be ineffective in improving the quality of governance.

Our findings also have implications for researchers. Until data on the quality of governance are available for all countries, care must be taken in making inferences regarding independent variables, such as population, per capita income, and the quality of governance, that influence which countries are included in the governance data sets.
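The caution above can be illustrated with a small simulation. The sketch below is not based on the paper's data: it draws hypothetical values of log population and a governance rating that are independent by construction, then keeps only "countries" that are either large or well governed, mimicking the coverage pattern described for the TI indexes. The selection rule (`log_pop + rating > 0`), the sample size, and all variable names are assumptions made for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def simulate(n=5000, seed=0):
    """Return (full-sample corr, covered-sample corr, number covered)."""
    rng = random.Random(seed)
    # Hypothetical standardized log population and governance rating
    # (higher rating = better governed), independent by construction.
    log_pop = [rng.gauss(0, 1) for _ in range(n)]
    rating = [rng.gauss(0, 1) for _ in range(n)]
    full = pearson(log_pop, rating)
    # Assumed coverage rule: a country is rated only if it is large
    # enough or well governed enough to interest investors.
    covered = [(p, r) for p, r in zip(log_pop, rating) if p + r > 0]
    sel = pearson([p for p, _ in covered], [r for _, r in covered])
    return full, sel, len(covered)


full_corr, covered_corr, n_covered = simulate()
print(f"all countries:     corr = {full_corr:+.2f}")
print(f"covered countries: corr = {covered_corr:+.2f} (n = {n_covered})")
```

In the full simulated sample the correlation is close to zero, while in the covered subsample population and the governance rating are clearly negatively correlated, even though no causal link was built in. This is the same qualitative pattern as in Figures 1 and 2, produced here purely by the selection rule.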
These cautions also hold for analyses of the impact on corruption of variables that are highly correlated with population, such as decentralization (Fisman and Gatti, 2000; Treisman, 1999) and trade openness (Ades and Di Tella, 1999; Treisman 2000; Wei, 2000). Treisman's (1999) result that decentralization increases corruption may be influenced by the fact that large countries, which tend to be federal, are included in the sample whether or not they are corrupt, and small countries, which tend not to be federal, are only included in the data if they are well governed. Although he minimizes the importance of the result, Treisman (1999) in fact finds that the negative relationship between corruption and federalism is not robust to the inclusion of population in his regressions.

Fisman and Gatti (2000) obtain the opposite result from Treisman (1999): greater decentralization, measured as the state and local share of government spending, is associated with lower corruption, using the ICRG corruption index (a 0 to 6 scale, with higher ratings indicating less corruption). The ICRG index is available for about 130 countries over the period they analyzed, so it may be less subject to sample selection bias than the TI and Business International (BI) indexes used by Root (1999), Treisman (1999) and Wei (2000). However, ICRG coverage also turns out to be positively related to country size and corruption ratings, as indicated in the bottom rows of Tables 4 and 5. Moreover, only 57 nations are included in the Fisman-Gatti regressions, due to limited availability of data on state and local spending from GFS. The median population in that 57-nation sample for 1994 was 10.0 million, far above the 5.4 million median among all 126 countries for which ICRG and population data were available. The mean ICRG corruption rating for 1995 for their sample was 4.1, compared to only 3.5 in the 126-nation sample. The simple correlation between log(population) and the ICRG corruption rating in their sample is -.21, but falls to -.04 in the larger sample. By omitting the decentralization variable, the sample in their corruption regressions can be increased to 108. Controlling for per capita income and a democracy indicator, population is unrelated to corruption (t-statistic = 0.4).

18 A similar result is found using the CPIA index as augmented by values estimated from Graft ratings. The augmented CPIA and augmented Graft ratings are correlated at .90.

The implication is not that Fisman and Gatti should omit population from their regressions. The major purpose of their paper is to estimate the impact of decentralization on corruption; if they did not control for population, their estimate of the impact on corruption of decentralization (which is correlated with the log of population at .30) would reflect greater selection bias than if population were included. The implication, rather, is that the population coefficient should not be given a substantive interpretation.

Imports as a share of GDP is sometimes included as a determinant of corruption levels (Ades and Di Tella, 1999; Treisman, 2000), in the belief that higher imports signify more competition in product markets, lowering rents and thereby bribe-taking. Ades and Di Tella instrument for imports/GDP with the log of population and the log of land area, and find that higher imports are associated with lower corruption, as reflected in the BI ratings and in ratings from the World Competitiveness Report (WCR). The import share of GDP is strongly related to population, with a correlation of -.61 for a sample of 160 countries in 1997. Because Ades and Di Tella (and Treisman, 2000) do not control for population, the coefficient on imports/GDP in their tests is likely to reflect selection bias.
Their 31-nation WCR sample is particularly instructive on how investor interest drives selection; it is composed of 24 OECD members (including new members Korea and Mexico), 2 small and 2 medium-sized fast-growing East Asian nations (Hong Kong, Singapore, Malaysia and Thailand), and the 3 largest non-Communist developing nations (India, Indonesia and Brazil). Treisman (2000) also finds that imports/GDP is associated with better ratings on the BI index and the 1996 and 1997 TI indexes. However, the relationship disappears for the 1998 TI index; Treisman does not link this latter result to the larger sample provided by the 1998 TI index. Adding the import share of GDP to our corruption regressions based on the Graft and CPIA indexes, we confirm that imports/GDP is unrelated to corruption in samples less subject to selection bias.19

19 These results are not reported in tables but are available from the authors on request.

Wei's result that "natural openness" leads to better governance also turns out to be driven by sample selection. "Natural openness" is constructed by taking the predicted values from a regression of trade intensity on (log) population and several other variables. Population is easily the most powerful predictor of trade in these regressions. "Natural openness" averaged over 1994-96 turns out to be correlated at -.91 with (log) population for 1995. Wei's corruption regressions using the BI corruption index include 66 or fewer countries, and those using the TI 1998 index include 82 or fewer countries. It is telling that Wei finds that "residual openness", the part of trade intensity unrelated to country size, has no impact on corruption. Using a range of corruption indicators in Table 6, we find that natural openness is also unrelated to corruption.
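The split of trade intensity into "natural" and "residual" openness described above can be sketched as follows. This is a simplified illustration, not Wei's specification: it uses log population as the only regressor (Wei's regressions include several other variables), and the seven population and trade-share values are made-up numbers chosen only so the code runs.

```python
import math


def ols_simple(x, y):
    """Intercept and slope of a one-regressor least-squares fit."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


# Hypothetical data: population in millions, trade as % of GDP.
population = [0.3, 1.2, 4.5, 10.0, 38.0, 150.0, 900.0]
trade_share = [140.0, 95.0, 110.0, 70.0, 55.0, 60.0, 25.0]
log_pop = [math.log(p) for p in population]

intercept, slope = ols_simple(log_pop, trade_share)
# "Natural openness": the part of trade intensity predicted by size.
natural = [intercept + slope * x for x in log_pop]
# "Residual openness": the part of trade intensity not predicted by size.
residual = [t - f for t, f in zip(trade_share, natural)]

print(f"slope on log(population): {slope:.2f}")
print(f"corr(natural, log pop):   {pearson(log_pop, natural):.2f}")
print(f"corr(residual, log pop):  {pearson(log_pop, residual):.2f}")
```

With a single regressor, the predicted component is an exact linear function of log population and the residual is uncorrelated with it by construction. With additional geographic regressors the correlation with population would be weaker than in this sketch, though the paper reports it remains at -.91. Any selection bias attached to population therefore carries over to a natural openness measure built this way.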
We replicated Wei's regressions of corruption on natural openness, residual openness, log of per capita income, and the Freedom House democracy indicator. As shown in Table 6, the coefficient on natural openness is cut in half, and becomes only marginally significant, simply by substituting the TI 1999 index (with 14 additional countries) for TI 1998. Using the Graft index, and particularly the CPIA index, the relationship weakens further and is not significant.20 While trade openness may increase growth rates (Frankel and Romer, 2000), particularly for small countries (Alesina et al., 2000), there is no convincing evidence that it reduces corruption.

20 We obtain very similar results using the Frankel and Romer (2000) predicted trade shares, also constructed from regressions of trade intensity on population and geographic variables. Inclusion in the BI sample, as for TI, is significantly related to population, per capita income, and corruption levels as measured by the Graft index (augmented by estimates from CPIA). Our Table 6 regressions are based on Wei's Table 5, equation 4. Our TI 1998 regression replicates his result closely, but not exactly.

How should researchers respond to potential problems with sample selection in studying the determinants of good governance? In the short run, it is preferable, other things equal, to choose, among existing data sets, those with greater cross-country coverage. By aggregating information from several sources, the TI indexes likely contain less measurement error than the ICRG index. Because the latter currently covers more than 140 countries, it may nonetheless produce more accurate estimates in the face of sample selection problems than the TI index, which currently covers only 99 countries. However, the Graft index combines the advantages of TI (aggregation) and ICRG (country coverage), and is preferable to either one.
Unfortunately, the CPIA index is not yet available to researchers outside the World Bank.21 In the longer run it is important to systematically collect data on all small states, if social scientists are to more rigorously test hypotheses concerning the impact of country size on governance and other outcomes.

21 Although country coverage of the CPIA index is independent of country size, it has other potential disadvantages. Unlike the case with ICRG and other commercial firms that produce many of the ratings used to construct the TI and Graft indexes, there is no financial incentive for accuracy in constructing the CPIA ratings. The CPIA also may contain more measurement error than the TI and Graft indexes, which aggregate information from numerous sources.

References

Ades, Alberto and Rafael Di Tella (1999). "Rents, Competition, and Corruption." American Economic Review, 89(4), 982-93.

Alesina, Alberto; Enrico Spolaore and Roman Wacziarg (2000). "Economic Integration and Political Disintegration." American Economic Review (forthcoming).

Alesina, Alberto and Enrico Spolaore (1997). "On the Number and Size of Nations." Quarterly Journal of Economics, 112, 1027-56.

Alesina, Alberto and Roman Wacziarg (1999). "Is Europe Going Too Far?" Journal of Monetary Economics (supplement), 1-42.

Anderson, James; Omar Azfar, Daniel Kaufmann, Young Lee, Amitabha Mukherjee, and Randi Ryterman (1999). "Corruption in Georgia: Survey Evidence." World Bank, unpublished manuscript.

Aristotle (1932). The Politics. Harvard University Press.

Commonwealth Secretariat/World Bank Joint Task Force on Small States (2000). Small States: Meeting Challenges in the Global Economy. Report of the Joint Task Force on Small States.

Easterly, William and Aart Kraay (2000). "Small States, Small Problems? Income, Growth, and Volatility in Small States." World Development (forthcoming).

Fisman, Ray and Roberta Gatti (2000). "Decentralization and Corruption: Evidence Across Countries." World Bank Policy Research Working Paper 2290.

Frankel, Jeffrey A. and David Romer (2000). "Does Trade Cause Growth?" American Economic Review, 89(3), 379-99.

Hall, Robert and Charles Jones (1999). "Why Do Some Countries Produce So Much More Output Per Worker Than Others?" Quarterly Journal of Economics, 114(1), 83-116.

Harden, Sheila (1985). Small is Dangerous: Micro States in a Macro World. London: Frances Pinter.

Havel, Vaclav (1999). "Kosovo and the End of the Nation State." New York Review of Books, June 10, 4-5, from an address delivered to the Canadian Parliament.

Jalan, Bimal (1982). Problems and Policies in Small Economies. New York: St. Martin's Press.

Kaufmann, Dani; Aart Kraay and Pablo Zoido-Lobaton (1999). "Aggregating Governance Indicators." World Bank Policy Research Working Paper 2195.

Klitgaard, Robert (1991). Tropical Gangsters: One Man's Experience With Development and Decadence in Deepest Africa. New York: Basic Books.

Klitgaard, Robert (1988). Controlling Corruption. Berkeley: University of California Press.

Knack, Stephen and Philip Keefer (1995). "Institutions and Economic Performance: Cross-Country Tests Using Alternative Institutional Measures." Economics and Politics, 7, 207-27.

Lambsdorff, Johan Graf (1999). "The Transparency International Corruption Perceptions Index 1999: Framework Document." http://www.transparency.de/documents/cpi/cpi_framework.html

Matthews, Jessica (1997). "Power Shift." Foreign Affairs, Jan/Feb, 67-84.

Mauro, Paolo (1995). "Corruption and Growth." Quarterly Journal of Economics, 110, 681-712.

Murphy, Kevin; Andrei Shleifer and Robert Vishny (1989). "Income Distribution, Market Size, and Industrialization." Quarterly Journal of Economics, 104, 537-64.

Newhouse, John (1997). "Europe's Rising Regionalism." Foreign Affairs, Jan/Feb, 50-66.

Onishi, Norimitsu (2000). "Oil Riches, and Risks, in a Tiny African Nation." New York Times (July 23).

Plato (1988). The Laws. Chicago University Press.

Romer, Paul M. (1986). "Increasing Returns and Long-Run Growth." Journal of Political Economy, 94, 1002-37.

Root, Hilton (1999). "The Importance of Being Small." Unpublished manuscript.

Sambanis, Nicholas (2000). "Partition as a Solution to Ethnic War: An Empirical Critique of the Theoretical Literature." World Politics, 52, 437-83.

Sardar, Ziauddin (1995). "Can Small Countries Survive the Future?" Futures, 27(8), 883-89.

Srinivasan, T. N. (1986). "The Costs and Benefits of Being a Small, Remote, Island, Landlocked, or Mini-state Economy." World Bank Research Observer, 1(2), 205-218.

Swamy, Anand; Stephen Knack, Young Lee and Omar Azfar (2000). "Gender and Corruption." Journal of Development Economics, forthcoming.

Treisman, Daniel (1999). "Decentralization and Corruption: Why Are Federal States Perceived to be More Corrupt?" UCLA Department of Political Science, unpublished manuscript.

Treisman, Daniel (2000). "The Causes of Corruption: A Cross-National Study." Journal of Public Economics, forthcoming.

Wei, Shang-jin (2000). "Natural Openness and Good Government." World Bank Policy Research Working Paper 2411 and NBER Working Paper 7765.

Table 1: Descriptive Statistics for Corruption Indexes

                      N      Mean    Std. Dev.
TI 1995               41     5.93    2.55
TI 1996               54     5.35    2.60
TI 1997               52     5.67    2.53
TI 1998               85     4.89    2.40
TI 1999               99     4.60    2.36
Graft index (1999)    155    0       1
CPIA (1999)           136    2.89    0.86
ICRG (1995)           129    3.53    1.28

Table 2: Corruption-Population Simple Correlations

Corruption     Correlations                       Median population   Median Graft       % of sample
indicator      Full sample    Pop. > 1 million    in sample           index in sample    with Graft > 0
                                                  (millions)
TI 1995        -.64** (40)    -.64** (40)         31.7                .83                73.2
TI 1996        -.56** (53)    -.56** (53)         27.2                .62                63.0
TI 1997        -.57** (52)    -.56** (51)         22.5                .65                69.2
TI 1998        -.34** (85)    -.26* (83)          11.5                .06                52.9
TI 1999        -.25* (99)     -.17 (97)           10.5                -.14               47.5
Graft index    -.17* (154)    -.08 (142)          9.2                 -.24               38.7
CPIA           -.11 (136)     -.09 (112)          6.4                 -.40               22.4

Population is lagged one year relative to the respective corruption indicator. A * (**) indicates significance at .05 (.01) level for 2-tailed tests.

Table 3: Corruption Regressions

Equation                  1          2          3          4          5          6          7
Dependent var.            TI 1995    TI 1996    TI 1997    TI 1998    TI 1999    Graft      CPIA
Log (population)          -0.387     -0.493**   -0.358**   -0.313**   -0.234*    -0.033     -0.043
                          (0.185)    (0.116)    (0.115)    (0.088)    (0.081)    (0.029)    (0.038)
Log (per capita income)   2.464**    2.098**    2.185**    1.856**    1.686**    0.494**    0.344**
                          (0.388)    (0.183)    (0.240)    (0.187)    (0.190)    (0.053)    (0.089)
Ex-British colony         1.462**    1.168**    1.271**    1.415**    1.324**    0.320**    0.319*
                          (0.360)    (0.271)    (0.324)    (0.258)    (0.263)    (0.096)    (0.134)
Political freedoms        -0.115     -0.022     -0.051     -0.010     0.118      0.128**    0.073
                          (0.132)    (0.098)    (0.129)    (0.088)    (0.085)    (0.026)    (0.039)
Intercept                 -15.436    -12.193    -13.414    -10.913    -10.337    -4.753     -0.095
                          (3.800)    (1.473)    (2.094)    (1.449)    (1.414)    (0.399)    (0.736)
N                         38         51         49         78         90         129        119
Adj. R2                   .70        .79        .73        .71        .71        .66        .21
Mean, dep. variable       5.8        5.2        5.6        4.8        4.6        .01        2.9
Standardized coeff.       -.22       -.26       -.22       -.20       -.15       -.06       -.10
  on log(population)

Heteroskedastic-consistent standard errors in parentheses. Population, per capita income and political freedoms are lagged one year relative to the respective corruption indicator. A * (**) indicates significance at .05 (.01) level for 2-tailed tests.

Table 4: Representation in Corruption Samples (in %)

Category              1              2              3              4
No. of countries      42             50             50             41
Corruption sample     Pop: Low       Pop: Low       Pop: High      Pop: High
                      Graft: Low     Graft: High    Graft: Low     Graft: High
TI 1995               0              14             12             67.5
TI 1996               0              18             26             77.5
TI 1997               0              22             22             73.2
TI 1998               9.5            40             46             92.7
TI 1999               26.2           46             54             92.7
Graft index (1999)    54.8           86             94             100
CPIA (1999)           97.6           58             86             56.1
ICRG (1995)           28.6           68             80             97.5

Cells indicate percentages of countries in each category that are represented in corruption indexes. The four categories are defined in terms of above-median and below-median population, and above-median and below-median ratings on the Graft index (augmented by CPIA ratings). Population is lagged by one year relative to the respective corruption indicator.

Table 5: Logit Regressions
Dependent variable = dummy for corruption data availability

Eq.    Dependent variable    Intercept         Log population    Log per capita    Graft index       N      Pseudo R2
       = dummy for:                                              income            (augmented)
3.1    TI 1995               -27.90 (8.42)     2.47 (0.61)       2.27 (0.84)       2.41 (0.84)       163    .78
3.2    TI 1996               -10.71 (3.64)     1.92 (0.34)       0.65 (0.42)       2.26 (0.64)       163    .67
3.3    TI 1997               -19.79 (5.51)     2.00 (0.39)       1.66 (0.59)       2.45 (0.78)       163    .74
3.4    TI 1998               -7.56 (2.47)      1.16 (0.21)       0.70 (0.29)       1.62 (0.49)       163    .46
3.5    TI 1999               -8.36 (2.41)      0.99 (0.18)       0.89 (0.28)       0.89 (0.43)       162    .39
3.6    Graft index           -14.35 (4.55)     1.89 (0.38)       1.90 (0.57)       1.09 (0.81)       162    .61
3.7    CPIA                  45.30 (11.43)     -0.14 (0.20)      -4.80 (1.24)      0.09 (0.68)       162    .73
3.8    ICRG (1995)           -2.47 (2.23)      0.79 (0.15)       0.29 (0.27)       1.09 (0.41)       163    .29

Cells contain logit coefficients and standard errors. Population and per capita income are lagged one year relative to the respective corruption indicator. All coefficients are significant at .05 for 2-tailed tests except those shown in bold.

Table 6: Natural Openness and Corruption

Equation                  1          2          3          4          5          6          7
Dependent var.            TI 1995    TI 1996    TI 1997    TI 1998    TI 1999    Graft      CPIA
Natural openness          1.922*     2.312**    1.979**    1.694**    0.821      0.217      0.214
                          (0.749)    (0.459)    (0.640)    (0.440)    (0.438)    (0.142)    (0.216)
Residual openness         0.081      0.013      0.203      -0.141     0.119      -0.010     -0.215
                          (0.462)    (0.387)    (0.577)    (0.392)    (0.316)    (0.147)    (0.212)
Log (per capita income)   2.245**    1.855**    2.015**    1.617**    1.526**    0.467**    0.359**
                          (0.410)    (0.224)    (0.292)    (0.242)    (0.221)    (0.054)    (0.092)
Political freedoms        -0.121     -0.044     -0.006     -0.035     0.141      0.122**    0.076
                          (0.185)    (0.107)    (0.194)    (0.117)    (0.102)    (0.025)    (0.042)
Intercept                 -22.038    -20.529    -21.032    -16.189    -12.838    -5.405     -1.136
                          (3.283)    (2.078)    (2.687)    (2.257)    (2.262)    (0.636)    (0.959)
N                         38         51         47         78         92         128        109
Adj. R2                   .64        .75        .70        .63        .63        .62        .23
Standardized coeff.       .27        .30        .27        .24        .12        .08        .09
  on natural openness

Heteroskedastic-consistent standard errors in parentheses. Population, per capita income and political freedoms are lagged one year relative to the respective corruption indicator. A * (**) indicates significance at .05 (.01) level for 2-tailed tests.

Figure 1: Population and Graft by TI 1996 Availability (scatter plot; horizontal axis: Log (population)). Fitted line: y = -0.358Ln(x) + 1.7645, R2 = 0.2455.

Figure 2: Population and Graft by TI 1997 Availability (scatter plot; horizontal axis: Log (population)). Fitted line: y = -0.3378Ln(x) + 1.7481, R2 = 0.2842.