A NEW DEVELOPMENT DATA BASE

The following article is the first in an occasional series introducing new data bases. The series intends to make new development data bases more widely available and to contribute to discussion and further research on economic development issues. The data bases included in the series are selected for their potential usefulness for research and policy analysis on critical issues in developing and transition economies. Some are drawn from micro-level firm or household surveys; others contain country-level data. The authors describe the data contents, criteria for inclusion or exclusion of values, sources, strengths and weaknesses, and any plans for maintenance or updating. Each data base is available from the author, at the address provided in the article.

THE WORLD BANK ECONOMIC REVIEW, VOL. 10, NO. 3: 565-91

A New Data Set Measuring Income Inequality

Klaus Deininger and Lyn Squire

This article presents a new data set on inequality in the distribution of income. The authors explain the criteria they applied in selecting data on Gini coefficients and on individual quintile groups' income shares. Comparison of the new data set with existing compilations reveals that the data assembled here represent an improvement in quality and a significant expansion in coverage, although differences in the definition of the underlying data might still affect intertemporal and international comparability. Based on this new data set, the authors do not find a systematic link between growth and changes in aggregate inequality. They do find a strong positive relationship between growth and reduction of poverty.

Following a long-standing recognition of potentially important relationships between economic growth and inequality, the profession has recently rediscovered the topic, emphasizing, in particular, the potential endogeneity of growth and interactions between the economic and political systems.
Earlier discussions, such as the famous Kuznets Hypothesis, were framed mainly in terms of an exogenous growth process and its implications for inequality. In contrast, the recent literature has focused on the potential effects of inequality on growth in a wide variety of circumstances. Although attention has focused on both political and economic explanations for such a relationship, the underlying processes are still imperfectly understood. Indeed, theoretical models arrive at widely different conclusions, depending on the underlying assumptions. Which of these assumptions is more accurate is an empirical question that can only be decided by confronting the hypotheses emerging from such models with actual data.

Empirical work using cross-country data to draw inferences regarding the relationship between growth and inequality has a long tradition and has led to a number of fruitful (or controversial) hypotheses, including Kuznets's conjecture that inequality would increase with rising incomes at early stages of development and decrease at higher levels of per capita income. The lack of time series that are sufficiently long has prevented appropriate testing of these hypotheses. Furthermore, problems in the quality of data and the fact that existing measures are often based on different definitions hamper comparability between countries (and often even within the same country over time), thus affecting empirical results in unpredictable ways. These concerns become more important as the complexity of theories about inequality and growth increases beyond the often simplistic mechanisms that characterized early models.

Klaus Deininger and Lyn Squire are with the Policy Research Department at the World Bank. The authors are grateful to Roland Benabou, Shaohua Chen, Gaurav Datt, Hamid Davoodi, Bill Easterly, Gary Fields, Emmanuel Jimenez, Peter Lanjouw, Branko Milanovic, Lant Pritchett, and Yvonne Ying for their advice and/or data, and to participants in seminars at the World Bank, Cornell University, the Harvard Growth Conference, and the Institute of Developing Economies (Tokyo) for their comments. The authors thank Hongyi Li and Tao Zhang for very able research assistance.

© 1996 The International Bank for Reconstruction and Development / THE WORLD BANK

The main purpose of this article is to present a new data set on inequality and to discuss the procedures followed in putting it together as well as the remaining limitations. In section I we discuss our choice of the Gini coefficient, supplemented by income shares by quintiles, as the relevant distributional measure and set forth the criteria we applied in selecting data. In section II we describe the new data set and compare its coverage to existing compilations of data related to inequality. Compared with earlier data sets, our data represent a significant expansion in coverage and a substantial improvement in quality. That said, variation in the definition of the variables used to measure inequality (gross income or net income, income or expenditure, data per capita or data per household) can seriously affect the magnitude of the indicators of inequality and undermine the international and intertemporal comparability of the data. We therefore discuss how to deal with the problem of comparability in order to ensure the robustness of empirical analyses.

Section III turns from a description of how the data set was put together to an illustrative analysis of what it can tell us. Using both the Gini coefficient and share data, we describe regional and intertemporal differences in inequality, highlighting the familiar fact that inequality in Latin America is considerably higher than in the rest of the world. We also look at the contemporaneous relationship between growth, inequality, and poverty.
For the ninety-five growth spells for which we have information on income shares, we find no systematic link between growth and inequality, but we do find a strong positive relationship between growth and poverty reduction. In particular, growth benefits the poor in the vast majority (87.5 percent) of cases, whereas economic decline quite often hurts the poor disproportionately (in five out of seven cases). These findings illustrate the value of combining aggregate measures of inequality and information on income shares.

I. METHODOLOGICAL ISSUES

In assembling a data set on inequality, the distributional measure has to be chosen by weighing advantages and disadvantages. In addition, criteria have to be established to ensure that the data used do indeed measure the variable of interest with minimal error. Also, it is necessary to identify sources of residual variation remaining in the data (in this case differences in the definition of the variable being measured) and to assess the likely implications of such variation.

Measures of Inequality

This section does not attempt to substitute for a detailed discussion of different measures of inequality. Jenkins (1991) provides an overview and a more detailed discussion of these measures as well as a review of the literature. Our main purpose here is to justify our choice of variable (the Gini coefficient complemented by income shares of population quintiles wherever possible) as a way to combine maximum coverage of countries and time periods with an acceptable level of quality.

A popular representation of income inequality, the Gini coefficient is based on the Lorenz curve, which plots the cumulative share of population against the cumulative share of income received.
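The link between the Lorenz curve and the Gini coefficient can be made concrete with a minimal sketch. It assumes income shares for equal-sized population groups (for example quintiles), ordered from poorest to richest, and approximates the area under the Lorenz curve with the trapezoid rule. The function names and the share vectors are hypothetical illustrations, not values from the data set. Because inequality within each group is ignored, the result is a lower bound on the Gini index computed from unit-record data.

```python
def lorenz_points(shares):
    """Cumulative income shares (0 to 1) for equal-sized groups, poorest first."""
    total = float(sum(shares))
    points = [0.0]
    for share in shares:
        points.append(points[-1] + share / total)
    return points


def gini_from_shares(shares):
    """Gini index on a 0-100 scale, using the trapezoid rule under the Lorenz curve."""
    points = lorenz_points(shares)
    width = 1.0 / len(shares)
    area = sum((points[k] + points[k + 1]) / 2.0 * width
               for k in range(len(shares)))
    return 100.0 * (1.0 - 2.0 * area)


def top_to_bottom_ratio(shares):
    """Ratio of the richest group's income share to the poorest group's share."""
    return shares[-1] / shares[0]


# Hypothetical quintile shares in percent (not taken from the data set).
quintiles = [5.0, 10.0, 15.0, 25.0, 45.0]
print(round(gini_from_shares(quintiles), 2))
print(top_to_bottom_ratio(quintiles))

# Two different redistributions that leave the same aggregate index:
top_to_middle = [5.0, 10.0, 17.0, 25.0, 43.0]     # 2 points from top to middle
middle_to_bottom = [7.0, 10.0, 13.0, 25.0, 45.0]  # 2 points from middle to bottom
print(round(gini_from_shares(top_to_middle), 2),
      round(gini_from_shares(middle_to_bottom), 2))
```

The last two vectors yield the same index although the bottom quintile's share differs, which is why quintile shares are a useful complement to an aggregate measure.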
We chose the Gini index as the indicator of inequality because it is widely reported in official sources that are based on primary data and because studies that included several measures, such as Anand and Kanbur (1993), found aggregate results to be similar for different measures of inequality.

One disadvantage of any aggregate measure of inequality such as the Gini index is that there is no unique mapping between changes in the index and the underlying income distribution; redistribution from the top to the middle class may be associated with the same change in the aggregate indicator as an increase in the share of income received by the bottom quintile at the expense of the middle class. To overcome this shortcoming, and to uncover possible movements in the income received by individual groups in society that could be obscured by the use of an aggregate measure such as the Gini index, we report information on income shares by quintile wherever possible.

When our sources contained information on income shares that did not directly correspond to quintiles or when income shares but no Gini coefficients were reported, we used POVCAL, a statistical routine developed by Chen, Datt, and Ravallion (1995) to compute quintile shares or Gini coefficients, or both, based on the estimation of a parametric Lorenz curve. The POVCAL procedure fits a parametric Lorenz curve (general quadratic or beta) through the available distributional data. Where the estimated curve is valid, we use it to approximate the income shares obtained by different quintiles. To avoid making spurious inferences, we decided not to include cases in which the Lorenz curve thus estimated would have to be based on information for fewer than five income groups or cases in which there were obvious gaps in coverage.
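The idea of fitting a parametric Lorenz curve to grouped data can be sketched as follows. This is not POVCAL's code. It is a minimal illustration that assumes the general quadratic specification commonly attributed to Villaseñor and Arnold, L(1 - L) = a(p^2 - L) + bL(p - 1) + c(p - L), where p is the cumulative population share and L the cumulative income share. The symbols a, b, c, e, m, and n, the function names, and the sample points are assumptions introduced here. The parameters are estimated by least squares without an intercept, and the fitted curve is then read off at the quintile cutoffs.

```python
import math


def solve3(A, y):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, y)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, 3):
            factor = M[r][col] / M[col][col]
            for c in range(col, 4):
                M[r][c] -= factor * M[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (M[r][3] - sum(M[r][c] * x[c] for c in range(r + 1, 3))) / M[r][r]
    return x


def fit_gq(points):
    """Least-squares estimates (a, b, c) from (p, L) pairs, no intercept."""
    rows = [(p * p - L, L * (p - 1.0), p - L, L * (1.0 - L)) for p, L in points]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * r[3] for r in rows) for i in range(3)]
    return tuple(solve3(xtx, xty))


def gq_lorenz(p, a, b, c):
    """Lorenz ordinate L(p) implied by the quadratic form (root with L(0) = 0)."""
    e = -(a + b + c + 1.0)
    m = b * b - 4.0 * a
    n = 2.0 * b * e - 4.0 * c
    return -0.5 * (b * p + e + math.sqrt(m * p * p + n * p + e * e))


def quintile_shares(a, b, c):
    """Income shares (percent) of the five quintiles from the fitted curve."""
    cum = [gq_lorenz(k / 5.0, a, b, c) for k in range(6)]
    return [100.0 * (cum[k + 1] - cum[k]) for k in range(5)]


# Hypothetical grouped data at uneven cutoffs, generated from assumed
# parameters so that the fit can be checked against them.
true_params = (0.9, -1.5, 0.2)
grouped = [(p, gq_lorenz(p, *true_params))
           for p in (0.1, 0.25, 0.4, 0.6, 0.75, 0.9)]
a, b, c = fit_gq(grouped)
print([round(s, 1) for s in quintile_shares(a, b, c)])
```

In practice a fitted curve should be used only if it is a valid Lorenz curve (for example, L(0) = 0, L(1) = 1, and increasing slope), which mirrors the text's condition that the estimated curve be valid before shares are derived from it.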
This procedure can be justified by noting that for a number of cases for which primary data were available, POVCAL produced estimates that were very close to the real distribution, even if based only on partial information.

Standards for Quality

Although a large number of earlier studies on inequality have amassed substantial data on inequality, the information included is often of dubious quality. Establishing a data set that allows cross-country comparison requires that minimum standards for quality be adopted. Slightly increasing the standards adopted in the earlier literature (Fields 1989a), we require that observations be based on household surveys, on comprehensive coverage of the population, and on comprehensive coverage of income sources. In the following subsections we briefly discuss each of these requirements and their potential implications for measuring inequality. We also discuss the consequences of excluding specific observations reported in the literature. Different applications may call for different selection criteria, a concern that we hope to satisfy by making available all of the data reviewed.

HOUSEHOLD OR INDIVIDUAL AS UNIT OF OBSERVATION. We require that data on inequality be based on actual observation of individual units drawn from household surveys; we do not use data based on information from national accounts and some assumption regarding a general functional form according to which different types of income are distributed. The latter approach to measuring inequality relies on strong hypotheses about patterns of inequality across countries or over time that cannot be tested if such information is included in the data set.
It is difficult to assess the error associated with such procedures because these procedures are normally used only when household surveys are unavailable, which means that a household-based control that would indicate the true value of inequality does not exist. Given that the reliability of these measures cannot be established, we exclude them from our data set.

Applying the criterion that the unit of observation be either the household or the individual, we exclude a number of studies such as Adelman and Morris (1973) and Van Ginneken and Park (1984). These studies have generated synthetic estimates of inequality from national accounts and assumptions on the functional form of the distribution of income taken from other countries "at the same level of development," from a social accounting matrix (SAM), or from extrapolation of the distribution of income observed in small surveys originating within the same country (Cromwell 1977 for Guatemala and Altimir 1986 for Argentina).

COMPREHENSIVE COVERAGE OF THE POPULATION. Use of a nonrepresentative subset of the population can easily result in biased estimates. Therefore, we require that data on inequality, even if drawn from household surveys, be based on a representative sample covering all of the population. Empirically, the most frequent deviations from this principle are surveys that cover only economically active individuals, wage earners, or taxpayers, or that cover only rural or urban dwellers.

Differences between Gini coefficients based on a subset of the population and those based on a nationally representative sample can be substantial. In Peru the expenditure-based Gini coefficient for metropolitan Lima in 1985 was 32, which was 10 points lower than the index obtained from a nationally representative sample (Government of Peru 1991). In South Africa, extrapolation from detailed information only on whites resulted in an aggregate Gini index for 1987 of 48, 14 points below the one measured in a nationally representative household survey in 1993 (Lachman and Bercuson 1992; World Bank 1995).

Some analysts justify the use of observations from surveys that covered only a subset of the population by noting that it would be straightforward to determine the sign of the bias and, implicitly, that such a bias would be constant over time. For example, inequality among wage earners or the economically active population is generally higher than inequality among households that may contain more than one wage-earning member. Similarly, the observation that the distribution of income is more egalitarian in rural than in urban areas is the stylized fact at the heart of the Kuznets Hypothesis (Kuznets 1955; Anand and Kanbur 1993). Using observations from our data set, we can show that these generalizations are often violated. For example, in several countries, such as Côte d'Ivoire (Kozel 1990), Jordan (Haddad 1990), Tanzania (Ferreira 1994), Poland (Milanovic 1995), Sierra Leone (Kansal 1982), and, most strikingly, China (Chai and Chai 1994), contrary to conventional wisdom, rural incomes are distributed more unequally than urban ones.

There is not much theoretical or empirical justification to conclude that the difference between measures of inequality for various subgroups of the population will remain the same even within any given country (let alone across countries) because the underlying structural parameters change over time. The relationship between urban and rural inequality within the same country is far from static, as shown, for example, for India (Datt 1995) and Indonesia (Government of Indonesia, various issues). Therefore, it is not valid to draw inferences about national inequality from information on inequality within a subgroup of the population.
To avoid such errors, we discarded a large number of observations from Latin American countries (Argentina, Bolivia, Colombia, Ecuador, El Salvador, Paraguay, and Uruguay) where many household surveys have been limited to metropolitan or urban areas (Psacharopoulos and others 1992; Melgar 1989; Fishlow, Fiszbein, and Ramos 1993). Other countries for which we made a significant reduction in the number of included observations are Japan (Mizoguchi 1985), Israel (official surveys exclude the rural population), and Malawi and Madagascar (Pryor 1990).

COMPREHENSIVE MEASUREMENT OF INCOME OR EXPENDITURE. We require that measures of inequality be based on comprehensive coverage of different income sources as well as of population groups. We have two main concerns about noncomprehensive coverage.

First, the exclusion of nonmonetary income can impart serious biases to estimates of inequality, especially in developing countries. For example, nonmonetary items in Greece in 1974 accounted for more than 70 percent of the expenditure of the lowest decile, leading to considerable differences between a measure of income inequality based on full, compared with only monetary, expenditure (Government of Greece, various issues). We are aware that measuring nonmonetary income appropriately is difficult and that inflated figures concerning this component of income (in particular the imputed value of owned housing) can conjure up an image of a more egalitarian distribution of income than is actually the case. Given constraints on our resources, we were not able to pursue this issue further.

Second, measures of inequality reported in the literature are often based solely on wage income, thereby excluding nonwage earnings such as pensions and income from self-employment.
The reason is that the information underlying these studies has often been drawn from tax records, the population coverage of which differs widely (depending primarily on tax laws) and is generally far from comprehensive. Measuring inequality solely on the basis of wage income would have a quantitatively significant effect on measured levels of inequality, especially if individuals with no wage earnings are included. Calculation of inequality measures from household-level data in the Luxembourg Income Study (see Atkinson, Rainwater, and Smeeding 1995) indicates that Gini coefficients based on wage earnings (including households with no wage earnings) are 10 to 15 points higher than coefficients based on gross income. This general order of magnitude is confirmed by observations from the secondary literature, both for industrial and for developing countries. For example, using wage income to assess inequality in Sweden in 1976 resulted in a Gini index of 43.6, compared with one of 28.1 based on nationally representative data (see our data set).

Although restricting attention to certain subsets of the population will undoubtedly have a dramatic effect on measured levels of inequality, its impact on changes cannot be neglected either. An exogenous shock that leads to layoffs of workers would, for example, affect overall inequality between households in the population but could leave inequality among wage earners unaffected, in which case use of the latter would give a very distorted picture.

The principle of comprehensive coverage obliged us to exclude relatively long time series on inequality in Greece (Lianos and Prodomidis 1974), Morocco (Bourguignon and Morrisson 1989), New Zealand (Easton 1983), and Sweden (Spant 1980). Similarly, observations for Nigeria that include only cash income (Owosekun and Otigba 1976) were excluded.
Because so few observations are available to study distributional issues, each individual data point generally acquires considerable importance. For each of the three issues discussed earlier (the unit of observation, comprehensive coverage of the population, and comprehensive measurement of income or expenditure) we can find examples that illustrate that the conclusions of earlier studies may have been affected by data of inferior quality. The combination of data based on national account estimates for early periods with information based on household surveys for later periods can lead to the appearance of large decreases in inequality, as in Kenya (Bigsten 1986). Comparison of inequality estimates from rural areas for 1953-64 with nationally representative data for later years gives rise to the appearance of a segment of initially increasing inequality along the Kuznets curve in the Republic of Korea (Kwack 1990). Using data that cover only a truncated subset of the population, such as wage earners (or, as we shall see later, data that are not based on a consistent definition more generally), could lead to virtually any type of growth-inequality pattern, such as a strong Kuznets curve for Malaysia (Meesook 1975).

II. THE NEW DATA SET

In this section we present the data set assembled using the above principles and compare it with existing compilations of data on inequality. We also highlight problems arising from variation in the definition of the variable used to measure inequality that may affect the intertemporal and international comparability of the inequality estimates contained in our data set. We briefly discuss how to deal with the problem of comparability.

Sources

We assembled the largest possible set of Gini coefficients and other income distribution measures that were reported in the literature and that seemed to have national coverage.
Doing so yielded more than 2,600 observations, characterized by great heterogeneity, with Gini coefficient estimates ranging from 12.1 (China 1982) to 79 (Zambia 1970). These data suffer from two problems. First, the documentation in secondary sources is often very weak or totally absent, thus forcing the reader to make guesses concerning coverage, definitions of income, or units of measurement. Second, a good proportion of Gini coefficients of very doubtful quality continue to be passed down from generation to generation (with each author quoting only the immediate predecessor) without satisfying minimum criteria for quality.

In view of these problems, it was necessary to go back to primary sources wherever possible to be able even to decide on the quality of an observation. In many cases the principles outlined in section I were useful as heuristic tools that allowed us to uncover and explain certain biases and irregularities in the Gini coefficients reported in the literature. Once we had identified a reputable source with the necessary information, we applied the three criteria outlined earlier to decide whether to include the observation in the high-quality data set (that is, the one that meets the three criteria). When more than one observation for the same year and country satisfied the minimum criteria, we used consistency of definition and source as well as origin in an official publication as criteria for inclusion. Given the large variation in data quality and reporting formats, consistency of sources as well as levels of aggregation are important, even if primary sources are used.[1]

The procedures we followed resulted in a data set of 682 observations (for 108 countries), of which about 65 percent are based on primary sources such as national statistical agencies (50 percent) or compilations of such results by reputable international agencies (15 percent).
The remaining 35 percent of the data are based on primary sources that have been quoted by a reliable secondary source. Table 1 provides summary statistics by region and economy: number of observations; mean, minimum, and maximum Gini coefficient; standard deviation; period covered; and the ratio of the top quintile's to the bottom quintile's share of income in cases for which share data are available. Both the high-quality data set with the respective definitions and the original, large data set with the reasons for rejecting certain observations are available from the Bank's Web server.[2]

Decisions concerning the inclusion or exclusion of certain observations always involve some judgment and arbitrariness. Although we have attempted to be as objective as possible, we have undoubtedly either missed or misinterpreted a piece of available information in some cases. We hope that making available all the original data reviewed will allow interested readers to correct those lapses or to adapt the data to suit their more specific needs.

Coverage and Comparison with Existing Data Sets

We highlight some of the features of our data set by comparing it with the compilations by Jain (1975), Paukert (1973), and Fields (1989b), which have in various combinations been used by the existing literature on inequality and growth.[3] Such a comparison demonstrates three points. First, when the three criteria for quality are applied consistently to all the data sets, it is apparent that our data set contains a substantially larger number of high-quality observations than any of the others (see table 2, rows 1 and 2). With 682 high-quality observations, the new data set has nine times as many observations as the largest of the other data sets. Second, the new data set has a much greater coverage of economies, with three times as many as the next largest data set (see table 2, row 3).
Third, and perhaps most important for the study of the relationship between inequality and growth through time, our data set provides a more reliable basis for time-series analysis (see table 2, rows 4 and 5). Compared with an average of about two high-quality observations for each country in Fields and Jain, our data set contains an average of more than six high-quality observations for each country. It contains fifty-eight countries with four or more high-quality observations compared with only ten such countries in the next largest data set. Although we had to discard a number of observations from early periods because of their quality, expanded coverage in more recent periods has more than compensated for them.

1. The empirically most relevant case is that countries often report the share of households in different income groups. If income is in nominal terms, if the class boundaries stay constant over time, and if no average income (or expenditure) for individual groups is reported, the simple fact that there are fewer and fewer households in the lower brackets would give rise to the illusion of a decrease in inequality over time. We encountered this phenomenon for the Philippines (for which we switched to decile shares as a consequence), for Tunisia (for which we decided not to report shares but only the Gini coefficient provided in the government's statistical yearbook), and for Sweden. In the last case, adding the mean for the respective income groups (which is available in the government's statistical yearbook) changed the Gini coefficient by up to 4-5 points. We have tried to avoid using such data wherever possible.

2. The address is http://www.worldbank.org/html/prdmg/grwthweb/growth_t.htm

3. The data set by Paukert forms the basis for the data set in Lecaillon and others (1984).
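The illusion described in footnote 1 can be reproduced with a small sketch. Everything in it is hypothetical: the ten household incomes, the fixed nominal class boundaries, and the representative value assigned to each class (an assumption that stands in for the unreported class means). All incomes are tripled between the two years, so true inequality is unchanged, yet the index estimated from class counts alone falls because households migrate out of the lower classes.

```python
def gini(values):
    """Gini index (0-100) from the mean absolute difference of individual values."""
    n = len(values)
    mean = sum(values) / n
    diff_sum = sum(abs(x - y) for x in values for y in values)
    return 100.0 * diff_sum / (2.0 * n * n * mean)


def bracket_counts(incomes, lower_bounds):
    """Households per class; class k starts at lower_bounds[k], the last is open-ended."""
    counts = [0] * len(lower_bounds)
    for income in incomes:
        k = max(i for i, lo in enumerate(lower_bounds) if income >= lo)
        counts[k] += 1
    return counts


def gini_from_brackets(counts, representative_values):
    """Gini estimated as if every household earned its class's representative value."""
    expanded = []
    for count, value in zip(counts, representative_values):
        expanded.extend([value] * count)
    return gini(expanded)


# Hypothetical nominal class boundaries that stay constant over time, and
# assumed representative incomes per class (no class means are reported).
LOWER_BOUNDS = [0, 50, 100, 200]
REPRESENTATIVE = [25, 75, 150, 300]

year1 = [10, 20, 30, 40, 50, 60, 80, 100, 150, 300]
year2 = [3 * x for x in year1]  # uniform nominal growth: true inequality unchanged

true1, true2 = gini(year1), gini(year2)
est1 = gini_from_brackets(bracket_counts(year1, LOWER_BOUNDS), REPRESENTATIVE)
est2 = gini_from_brackets(bracket_counts(year2, LOWER_BOUNDS), REPRESENTATIVE)
print(round(true1, 1), round(true2, 1))  # identical true values
print(round(est1, 1), round(est2, 1))    # estimated index falls
```

Adding the class means, as was done for Sweden, removes most of this artifact because the representative values then move with nominal incomes.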
Examination of the quality of the data suggests that a large number of data points used to substantiate the negative relationship between initial income inequality and subsequent growth in the literature may be of doubtful quality (Persson and Tabellini 1995; Alesina and Rodrik 1994). For example, the Persson and Tabellini data set, based on Paukert (1973), includes several countries (Burma, Chad, Cyprus, Benin, Iraq, and Lebanon) for which we were unable to locate data of acceptable quality. In addition, one-third of Persson and Tabellini's Gini coefficients differ by 5 or more points from the closest acceptable observation, and only eighteen of their fifty-five observations satisfy the criteria for quality indicated above. Although the data used by Alesina and Rodrik (at least the part based on Fields 1989b) are of much higher quality, their data set still contains fourteen observations that differ by more than 5 points from the closest comparable value available in our data set. The negative relationship between income inequality and growth evaporates if, for example, we attempt to rerun the regressions by Persson and Tabellini using only the eighteen (out of fifty-five) high-quality observations contained in their sample.

The large number of observations in our data set enables us to better account for the time-series dimension of the data. This is important because making inferences on longitudinal relationships such as the Kuznets Hypothesis from cross-sectional data is questionable (Fields and Jakubson 1994; Ravallion and Chen 1995). Indeed, our data provide little support for an inverted-U relationship between levels of income and inequality when tested on a country-by-country basis, with no support for the existence of a Kuznets curve in about 90 percent of the countries investigated.

Despite the improvement over earlier data sets, coverage still varies widely across regions and decades.
In particular, a comparison of the number of economies included in the data set by region reveals that Asia, Eastern Europe, and industrial and high-income economies are very well represented, whereas countries in the Middle East and North Africa, and especially Sub-Saharan Africa, are underrepresented (see table 3). And within economies, our coverage of Sub-Saharan Africa and the Middle East and North Africa is also thin, with fewer than two observations for each economy on average, compared with more than ten in Asia and the industrial economies. Table 3 reveals a significant improvement in the number of observations over time: there are twice as many observations for the 1980s as for the 1960s.

Table 1. Descriptive Statistics and Coverage of the Data Set on Income Inequality, Selected Economies

| Region and economy | Number of observations | Average Gini | Minimum Gini | Maximum Gini | Standard deviation | First year | Last year | Ratio of top quintile's share of income to bottom quintile's share (a) |
|---|---|---|---|---|---|---|---|---|
| Sub-Saharan Africa | 40 | 44.71 | 28.90 | 63.18 | 9.18 | 1968 | 1993 | 11.61 |
| Botswana | 1 | 54.21 | 54.21 | 54.21 | n.a. | 1986 | 1986 | 16.36 |
| Cameroon | 1 | 49.00 | 49.00 | 49.00 | n.a. | 1983 | 1983 | n.a. |
| Central African Republic | 1 | 55.00 | 55.00 | 55.00 | n.a. | 1992 | 1992 | n.a. |
| Côte d'Ivoire | 4 | 39.18 | 36.89 | 41.21 | 1.86 | 1985 | 1988 | 7.17 |
| Gabon | 2 | 61.23 | 59.27 | 63.18 | 2.76 | 1975 | 1977 | 19.79 |
| Ghana | 4 | 35.13 | 33.91 | 36.74 | 1.42 | 1988 | 1992 | 5.97 |
| Guinea-Bissau | 1 | 56.12 | 56.12 | 56.12 | n.a. | 1991 | 1991 | 28.57 |
| Kenya | 1 | 54.39 | 54.39 | 54.39 | n.a. | 1992 | 1992 | 18.24 |
| Lesotho | 1 | 56.02 | 56.02 | 56.02 | n.a. | 1987 | 1987 | 20.90 |
| Madagascar | 1 | 43.44 | 43.44 | 43.44 | n.a. | 1990 | 1990 | 8.52 |
| Mauritania | 1 | 42.53 | 42.53 | 42.53 | n.a. | 1988 | 1988 | 13.12 |
| Mauritius | 3 | 40.67 | 36.69 | 45.70 | 4.59 | 1980 | 1991 | 6.62 |
| Niger | 1 | 36.10 | 36.10 | 36.10 | n.a. | 1992 | 1992 | 5.90 |
| Nigeria | 3 | 38.55 | 37.02 | 41.15 | 2.27 | 1986 | 1992 | 8.67 |
| Rwanda | 1 | 28.90 | 28.90 | 28.90 | n.a. | 1983 | 1983 | 4.01 |
| Senegal | 1 | 54.12 | 54.12 | 54.12 | n.a. | 1991 | 1991 | 16.75 |
| Seychelles | 2 | 46.50 | 46.00 | 47.00 | 0.71 | 1978 | 1984 | n.a. |
| Sierra Leone | 1 | 60.79 | 60.79 | 60.79 | n.a. | 1968 | 1968 | 22.45 |
| South Africa | 1 | 62.30 | 62.30 | 62.30 | n.a. | 1992 | 1992 | 32.11 |
| Sudan | 1 | 38.72 | 38.72 | 38.72 | n.a. | 1971 | 1971 | 5.58 |
| Tanzania | 3 | 40.37 | 38.10 | 44.00 | 3.18 | 1969 | 1993 | 6.63 |
| Uganda | 2 | 36.89 | 33.00 | 40.78 | 5.50 | 1989 | 1992 | 6.01 |
| Zambia | 2 | 47.26 | 43.51 | 51.00 | 5.30 | 1976 | 1991 | 12.11 |
| Zimbabwe | 1 | 56.83 | 56.83 | 56.83 | n.a. | 1990 | 1990 | 15.66 |
| East Asia and the Pacific | 123 | 36.18 | 25.70 | 53.00 | 6.55 | 1953 | 1993 | 7.15 |
| China | 12 | 32.68 | 25.70 | 37.80 | 3.78 | 1980 | 1992 | 5.17 |
| Fiji | 1 | 42.50 | 42.50 | 42.50 | n.a. | 1977 | 1977 | n.a. |
| Hong Kong | 7 | 41.58 | 37.30 | 45.18 | 2.81 | 1971 | 1991 | 9.46 |
| Indonesia | 11 | 33.49 | 30.70 | 38.59 | 2.17 | 1964 | 1993 | 5.22 |
| Japan | 23 | 34.82 | 32.50 | 37.60 | 1.35 | 1962 | 1990 | 7.06 |
| Korea, Rep. of | 14 | 34.19 | 29.82 | 39.10 | 2.63 | 1953 | 1988 | 6.29 |
| Lao PDR | 1 | 30.40 | 30.40 | 30.40 | n.a. | 1992 | 1992 | 4.21 |
| Malaysia | 6 | 50.36 | 48.00 | 53.00 | 1.96 | 1970 | 1989 | 14.18 |
| Taiwan (China) | 26 | 29.62 | 27.70 | 33.60 | 1.53 | 1964 | 1993 | 4.67 |
| Philippines | 7 | 47.62 | 45.00 | 51.32 | 2.46 | 1957 | 1991 | 12.00 |
| Singapore | 6 | 40.12 | 37.00 | 42.00 | 1.81 | 1973 | 1989 | 6.71 |
| Thailand | 8 | 45.48 | 41.28 | 51.50 | 3.78 | 1962 | 1992 | 11.65 |
| Vietnam | 1 | 35.71 | 35.71 | 35.71 | n.a. | 1992 | 1992 | 5.51 |
| South Asia | 60 | 34.06 | 28.27 | 47.80 | 4.54 | 1951 | 1992 | 5.50 |
| Bangladesh | 10 | 34.51 | 28.27 | 39.00 | 3.52 | 1963 | 1992 | 5.72 |
| India | 31 | 32.55 | 29.17 | 37.05 | 2.06 | 1951 | 1992 | 4.98 |
| Nepal | 1 | 30.06 | 30.06 | 30.06 | n.a. | 1984 | 1984 | 4.34 |
| Pakistan | 9 | 31.50 | 29.91 | 32.44 | 0.86 | 1969 | 1991 | 4.68 |
| Sri Lanka | 9 | 41.71 | 30.10 | 47.80 | 6.10 | 1953 | 1990 | 7.98 |
| Eastern Europe | 101 | 26.01 | 17.83 | 39.39 | 4.71 | 1958 | 1995 | 4.05 |
| Armenia | 1 | 39.39 | 39.39 | 39.39 | n.a. | 1989 | 1989 | 23.88 |
| Belarus | 1 | 28.53 | 28.53 | 28.53 | n.a. | 1995 | 1995 | 4.30 |
| Bulgaria | 28 | 23.30 | 17.83 | 34.42 | 3.40 | 1963 | 1993 | 3.24 |
| Czechoslovakia | 12 | 22.25 | 19.37 | 27.19 | 2.40 | 1958 | 1992 | 3.08 |
| Czech Republic | 2 | 27.43 | 26.60 | 28.26 | 1.17 | 1993 | 1994 | 3.75 |
| Estonia | 3 | 34.66 | 31.52 | 36.63 | 2.75 | 1992 | 1995 | 6.62 |
| Hungary | 9 | 24.65 | 20.97 | 32.24 | 3.57 | 1962 | 1993 | 3.61 |
| Kazakstan | 1 | 32.67 | 32.67 | 32.67 | n.a. | 1993 | 1993 | 5.39 |
| Kyrgyz Republic | 1 | 35.32 | 35.32 | 35.32 | n.a. | 1993 | 1993 | 6.31 |
| Latvia | 1 | 26.98 | 26.98 | 26.98 | n.a. | 1993 | 1993 | 3.83 |
| Lithuania | 1 | 33.64 | 33.64 | 33.64 | n.a. | 1993 | 1993 | 5.20 |
| Moldova | 1 | 34.43 | 34.43 | 34.43 | n.a. | 1992 | 1992 | 6.06 |
| Poland | 17 | 25.69 | 20.88 | 33.06 | 2.52 | 1976 | 1993 | 3.75 |
| Romania | 3 | 25.83 | 23.38 | 28.66 | 2.66 | 1989 | 1994 | 3.79 |
| Slovak Republic | 2 | 20.50 | 19.49 | 21.50 | 1.42 | 1992 | 1993 | 2.76 |
| Slovenia | 2 | 27.08 | 25.95 | 28.20 | 1.59 | 1992 | 1993 | 3.77 |
| U.S.S.R. | 5 | 26.94 | 24.56 | 30.53 | 2.32 | 1980 | 1993 | 4.06 |
| Ukraine | 1 | 25.71 | 25.71 | 25.71 | n.a. | 1992 | 1992 | 3.71 |
| Yugoslavia | 10 | 32.62 | 31.18 | 34.73 | 1.00 | 1963 | 1990 | 5.63 |
| Middle East and North Africa | 20 | 40.77 | 32.00 | 45.45 | 3.07 | 1959 | 1991 | 7.14 |
| Algeria | 1 | 38.73 | 38.73 | 38.73 | n.a. | 1988 | 1988 | 6.85 |
| Egypt, Arab Rep. of | 4 | 38.00 | 32.00 | 42.00 | 4.32 | 1959 | 1991 | 4.72 |
| Iran, Islamic Rep. of | 5 | 43.23 | 41.88 | 45.45 | 1.41 | 1969 | 1984 | n.a. |
| Jordan | 3 | 39.19 | 36.10 | 40.80 | 2.67 | 1980 | 1991 | 7.39 |
| Morocco | 2 | 39.20 | 39.19 | 39.20 | 0.01 | 1984 | 1991 | 7.03 |
| Tunisia | 5 | 42.51 | 40.24 | 44.00 | 1.41 | 1965 | 1990 | 8.25 |
| Latin America and the Caribbean | 100 | 50.15 | 37.92 | 61.88 | 6.05 | 1950 | 1994 | 16.02 |
| Barbados | 2 | 47.18 | 45.49 | 48.86 | 2.38 | 1951 | 1979 | 17.56 |
| Bolivia | 1 | 42.04 | 42.04 | 42.04 | n.a. | 1990 | 1990 | 8.58 |
| Brazil | 15 | 57.32 | 53.00 | 61.76 | 2.72 | 1960 | 1989 | 23.07 |
| Chile | 5 | 51.84 | 45.64 | 57.88 | 5.76 | 1968 | 1994 | 14.48 |
| Colombia | 7 | 51.51 | 46.00 | 54.50 | 2.68 | 1970 | 1991 | 13.94 |
| Costa Rica | 9 | 46.00 | 42.00 | 50.00 | 2.97 | 1961 | 1989 | 13.13 |
| Dominican Republic | 4 | 46.94 | 43.29 | 50.46 | 3.35 | 1976 | 1992 | 11.06 |
| Ecuador | 1 | 43.00 | 43.00 | 43.00 | n.a. | 1992 | 1992 | 9.82 |
| El Salvador | 1 | 48.40 | 48.40 | 48.40 | n.a. | 1977 | 1977 | 10.64 |
| Guatemala | 3 | 55.68 | 49.72 | 59.06 | 5.18 | 1979 | 1989 | 20.82 |
| Guyana | 2 | 48.19 | 40.22 | 56.16 | 11.27 | 1990 | 1990 | 9.15 |
| Honduras | 7 | 54.49 | 50.00 | 61.88 | 3.63 | 1968 | 1993 | 27.74 |
| Jamaica | 9 | 42.90 | 37.92 | 54.31 | 4.81 | 1958 | 1993 | 8.75 |
| Mexico | 9 | 53.85 | 50.00 | 57.90 | 3.09 | 1950 | 1992 | 17.12 |
| Nicaragua | 1 | 50.32 | 50.32 | 50.32 | n.a. | 1990 | 1990 | 13.12 |
| Panama | 4 | 52.43 | 47.47 | 57.00 | 5.01 | 1970 | 1989 | 22.64 |
| Peru | 4 | 47.99 | 42.76 | 55.00 | 5.42 | 1971 | 1994 | 9.21 |
| Puerto Rico | 3 | 51.11 | 50.15 | 52.32 | 1.11 | 1969 | 1989 | 22.20 |
| Trinidad | 4 | 46.21 | 41.72 | 51.00 | 3.79 | 1958 | 1981 | 18.31 |
| Venezuela | 9 | 44.42 | 39.42 | 53.84 | 4.27 | 1971 | 1990 | 10.93 |
| Industrial countries and high-income developing countries | 238 | 33.19 | 22.90 | 56.00 | 5.76 | 1947 | 1993 | 6.63 |
| Australia | 9 | 37.88 | 32.02 | 41.72 | 3.08 | 1969 | 1990 | 8.32 |
| Bahamas | 11 | 45.77 | 40.64 | 54.09 | 4.10 | 1970 | 1993 | 14.14 |
| Belgium | 4 | 27.01 | 26.22 | 28.25 | 0.88 | 1979 | 1992 | 4.26 |
| Canada | 23 | 31.27 | 27.41 | 32.97 | 1.67 | 1951 | 1991 | 5.54 |
| Denmark | 4 | 32.09 | 30.99 | 33.20 | 1.26 | 1976 | 1992 | 6.29 |
| Finland | 12 | 29.93 | 26.11 | 32.04 | 2.17 | 1966 | 1991 | 5.35 |
| France | 7 | 43.11 | 34.85 | 49.00 | 6.07 | 1956 | 1984 | 6.31 |
| Germany | 7 | 31.22 | 28.13 | 33.57 | 1.71 | 1963 | 1984 | 5.35 |
| Greece | 3 | 34.53 | 33.29 | 35.19 | 1.07 | 1974 | 1988 | 6.37 |
| Ireland | 3 | 36.31 | 34.60 | 38.69 | 2.12 | 1973 | 1987 | 8.91 |
| Italy | 15 | 34.93 | 32.02 | 41.00 | 2.61 | 1974 | 1991 | 4.94 |
| Luxembourg | 1 | 27.13 | 27.13 | 27.13 | n.a. | 1985 | 1985 | 4.11 |
| Netherlands | 12 | 28.59 | 26.66 | 29.68 | 0.95 | 1975 | 1991 | 4.43 |
| New Zealand | 12 | 34.36 | 30.04 | 40.21 | 2.90 | 1973 | 1990 | 6.78 |
| Norway | 9 | 34.21 | 30.57 | 37.52 | 2.90 | 1962 | 1991 | 7.39 |
| Portugal | 4 | 37.44 | 35.63 | 40.58 | 2.16 | 1973 | 1990 | 7.44 |
| Spain | 8 | 27.90 | 24.42 | 37.11 | 4.38 | 1965 | 1989 | 4.34 |
| Sweden | 15 | 31.63 | 27.31 | 33.41 | 1.49 | 1967 | 1992 | 5.64 |
| Turkey | 3 | 50.36 | 44.09 | 56.00 | 5.98 | 1968 | 1987 | 15.22 |
| United Kingdom | 31 | 25.98 | 22.90 | 32.40 | 2.61 | 1961 | 1991 | 4.03 |
| United States | 45 | 35.28 | 33.50 | 38.16 | 1.29 | 1947 | 1991 | 8.46 |
| Total | 682 | 36.12 | 17.83 | 63.18 | 9.33 | 1947 | 1995 | 7.80 |

n.a. Not available.
a. This ratio is the average for all observations included in the data set.
Source: Authors' calculations based on various sources as described in the text.

Table 2. Characteristics of Various Data Sets on Income Inequality

| Characteristic | Our data | Fields (1989b) | Jain (1975) | Paukert (1973) |
|---|---|---|---|---|
| Original number of observations | 2,621 | 105 | 405 | 55 |
| High-quality observations | 682 | 73 | 61 | 18 |
| Number of economies | 108 | 36 | 30 | 18 |
| Average number of high-quality observations per economy | 6.31 | 2.03 | 2.03 | 1.00 |
| Economies with four or more high-quality observations | 58 | 10 | 8 | 0 |

Note: High-quality observations meet the three criteria described in the text.
Variation in the Definition of Variables

Even if indexes of inequality satisfy the criteria for quality outlined above, the indexes may not be fully comparable over time or across countries because of differences in how variables are defined. Here we explore the quantitative importance of differences in definition and discuss the pros and cons of options to increase intertemporal and interspatial comparability of data that are based on different definitions. The three main differences in definition we deal with are choice of the recipient unit, use of gross income or net income, and use of expenditure or income.

How a variable is defined is important for two reasons. First, if inequality changes only slowly over time and if different measurement concepts are in some cases associated with comparatively large jumps in Gini coefficients, the variation caused by changes related to definition could well account for most of the variation that is subsequently "explained" by conventional regression analysis. Basing measures of inequality within countries on identical measurement concepts would be crucial to reducing the potential for such error. Second, definitional issues might affect international comparisons of inequality, especially if the method of measurement varies systematically between different types of countries. One obvious source of such bias would be that more recent data on inequality, particularly for developing countries, are often measured in terms of expenditure rather than income, which would decrease measured Gini coefficients, other things being equal.4 To avoid spurious correlations in both respects, researchers must seek ways to increase the comparability of inequality measures across time or countries.

4. The countries for which the majority of Gini indexes in our data set are based on expenditure information are Algeria, Bolivia, Botswana, Cameroon, the Central African Republic, Côte d'Ivoire, Ecuador, the Arab Republic of Egypt, Estonia, Ghana, Greece, Guinea-Bissau, India, Indonesia, the Islamic Republic of Iran, Jamaica, Jordan, Kenya, Lao PDR, Lesotho, Madagascar, Mauritania, Mauritius, Morocco, Nicaragua, Niger, Nigeria, Pakistan, Peru, Portugal, Rwanda, Senegal, Spain, Sri Lanka, Tanzania, Tunisia, Uganda, Vietnam, and Zimbabwe.

Table 3. Number of Economies and Observations Included in the Data Set, by Decade and Region

| Region | Total: economies | Total: observations | 1960s: economies | 1960s: observations | 1970s: economies | 1970s: observations | 1980s: economies | 1980s: observations | 1990s: economies | 1990s: observations |
|---|---|---|---|---|---|---|---|---|---|---|
| Sub-Saharan Africa | 24 | 40 | 2 | 2 | 5 | 6 | 11 | 16 | 14 | 16 |
| East Asia and the Pacific | 13 | 123 | 6 | 24 | 10 | 37 | 10 | 46 | 9 | 16 |
| South Asia | 5 | 60 | 4 | 24 | 4 | 13 | 5 | 17 | 4 | 6 |
| Eastern Europe | 19 | 101 | 4 | 9 | 5 | 21 | 8 | 39 | 18 | 32 |
| Middle East and North Africa | 6 | 20 | 3 | 4 | 3 | 5 | 5 | 7 | 4 | 4 |
| Latin America and the Caribbean | 20 | 100 | 9 | 12 | 15 | 34 | 14 | 35 | 12 | 19 |
| Industrial countries and high-income developing countries | 21 | 238 | 11 | 50 | 20 | 68 | 21 | 99 | 14 | 21 |
| Total | 108 | 682 | 39 | 125 | 62 | 184 | 74 | 259 | 75 | 114 |

Source: Authors' calculations based on various sources as described in the text.

RECIPIENT UNIT: THE HOUSEHOLD OR THE INDIVIDUAL? The distinction between households and individuals is important if there are systematic differences in size between rich and poor households. If, for example, the number of individuals per household is much higher in poor than in rich families, use of household-based data would result in a lower measure of inequality than measurement on a per capita basis. If, however, the difference is primarily caused by the number of children, simply dividing total household income by the number of people may in turn result in an overestimation of inequality. An adjustment based on adult-equivalents, rather than the number of individuals in the household, would be appropriate.
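The effect of the recipient unit can be illustrated with a small numerical sketch. The code below is not the authors' procedure: the three households, the child weight of 0.5 in the adult-equivalence scale, and the function name `gini` are all assumptions chosen for illustration. It computes a Gini coefficient (on the 0 to 100 scale used in this article) from the area under the Lorenz curve, first treating each household as one unit, then weighting per capita income by household size, then using adult-equivalent income.

```python
def gini(values, weights=None):
    """Gini coefficient on a 0-100 scale for non-negative values.

    `weights` gives the number of recipients represented by each value
    (defaults to one recipient per value).
    """
    if weights is None:
        weights = [1.0] * len(values)
    pairs = sorted(zip(values, weights))          # poorest recipients first
    total_w = sum(w for _, w in pairs)
    total_y = sum(v * w for v, w in pairs)
    cum_y = 0.0
    area = 0.0                                    # area under the Lorenz curve
    for v, w in pairs:
        prev_y = cum_y
        cum_y += v * w / total_y
        area += (w / total_w) * (prev_y + cum_y) / 2.0   # trapezoid rule
    return 100.0 * (1.0 - 2.0 * area)


# Hypothetical households: (total income, adults, children). Not from the data set.
households = [(100.0, 2, 4), (200.0, 2, 2), (400.0, 2, 0)]
CHILD_WEIGHT = 0.5  # assumed adult-equivalence weight for a child

sizes = [a + c for _, a, c in households]
household_gini = gini([y for y, _, _ in households])
per_capita_gini = gini([y / (a + c) for y, a, c in households], sizes)
adult_equiv_gini = gini(
    [y / (a + CHILD_WEIGHT * c) for y, a, c in households], sizes
)

print(round(household_gini, 1), round(per_capita_gini, 1), round(adult_equiv_gini, 1))
```

In this constructed example poorer households are larger and the size difference comes entirely from children, so the household-based index is the lowest and the per capita index the highest, with the adult-equivalent index in between. Real survey data need not behave so neatly.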
Because multiperson households usually have greater possibilities for making intertemporal or interpersonal adjustments in labor supply and spending patterns than individuals do, systematic differences in household sizes may affect measured inequality in other ways as well.

Our data confirm that using the distribution of income across households rather than persons as the basis for the Gini index results in a slightly lower value of the index. In sixty-seven cases (included in the original data set) in which information on both households and individuals is available from reasonably reputable sources, the mean difference between person-based and household-based Gini coefficients is 1.69, with the household-based Gini indeed being lower in fifty of the sixty-seven cases. Given that the difference is not too large, we conclude that there is no reason to expect a large systematic bias in empirical work as a result of using both household-based and individual-based Gini coefficients.5 We have therefore accepted measures of inequality based on either definition.

INCOME: GROSS OR NET OF TAXES? If, as in most industrial countries, taxation redistributes resources from the rich to the poor, use of gross income should yield higher measured inequality than use of net income. For Sweden (1981), for example, use of gross income yields a Gini coefficient that is about 5 points higher (39 compared with 34) than the Gini coefficient based on net income (the government's statistical yearbook). In a sample of nineteen pairs of Gini coefficients computed using Luxembourg Income Study (LIS) data, those based on net income were on average 3 points lower than those based on gross income, a difference that varied between 1.87 and 5.66. However, the LIS sample includes only one developing country (Mexico).
Thus, although the distinction between gross income and net income may affect the level of measured inequality in a cross-country sample, the quantitative importance of this effect will depend on the progressivity and effectiveness of the tax system and might therefore be of less relevance for developing countries to the degree that the role of redistributive taxation is smaller in these countries.

5. This conclusion is supported by Coulter, Cowell, and Jenkins (1992) and Jenkins and Cowell (1994), who demonstrate that, for parametric variation of the equivalence scale between households and individuals, the Gini moves in a U-shaped fashion, with the difference between household- and per capita-based data not being too large.

VARIABLE MEASURED: INCOME OR EXPENDITURE? It is usually much easier for individuals to accumulate assets and savings to smooth consumption, that is, expenditure, than to buy insurance against sickness or unemployment to smooth income. The greater variability of income resulting from this fact may be augmented by fluctuations in income (but not expenditure) associated with voluntary unemployment or by underreporting. Therefore, in a cross-section of individuals, using income as the measure would generally be expected to result in a higher degree of measured inequality than using expenditure. The tendency toward this result might be reinforced by the fact that expenditure is, by definition, based on net income, which, because of the progressivity of the tax system, tends to be more equally distributed than gross income.

Indeed, our data suggest a significant and systematic difference between income-based and expenditure-based coefficients.
For the forty-seven observations of acceptable quality included in our data, the mean difference between expenditure-based Gini coefficients and those based on gross income is 6.6, ranging from -3 (for Bangladesh in 1973, the only negative value in the sample) to 20 (for Tanzania in 1969). It seems that some of the large intertemporal changes in our high-quality Gini indexes might be caused by shifts in definition rather than real changes in inequality. For example, in Peru, a 13-point drop in the Gini index between 1971 and 1986 can at least partly be explained by the change from income to expenditure as the relevant measurement concept. A similar situation arises with Jamaica, which had a 10-point drop in the Gini index between 1958 and 1971.

Ensuring Intertemporal and International Comparability

Given the important quantitative effects of definitional differences in the variables on which measures of inequality are based, it is important to account for such differences in any empirical application. Within any given country this is not too difficult, implying that one of the major advantages of our data set is that it permits a consistent assessment of intertemporal changes in inequality within countries. Because most countries change the methodology of their household surveys very infrequently, it is possible to obtain a consistently defined series within countries by eliminating only 10 observations from the original high-quality data set of 682 observations.6 Therefore, we can look at changes in the Gini coefficient as well as in the shares and real income received by different quintiles within countries.

Methodologically, the most justifiable way to ensure cross-country comparability of inequality measures is to use only measures that are defined consistently. The quantitatively most important distinction arises from the difference between Gini coefficients that are based on information on income and those based on information on expenditures.
Unfortunately, accounting for this difference would result in a considerably reduced sample: only 69 out of the original 108 countries or 546 income-based, compared with 682 total, observations (see table 4). Finding an appropriate way of adjusting observations to a common denominator would greatly increase the potential for making cross-country inferences.

6. These observations are Bangladesh for 1989 and 1992, Brazil for 1974, Guyana for 1956, Jamaica for 1958, Mexico for 1992, Peru for 1971 and 1981, Seychelles for 1978, and Sri Lanka for 1990.

We have seen above that, for identical surveys, the mean difference between income-based and expenditure-based Gini coefficients is about 6.6. This difference suggests that conclusions from a cross-country sample of Gini coefficients that includes both types of data may be misleading. One way of avoiding the exclusion of the thirty-nine countries for which Gini coefficients are based on expenditures would be to add the difference of 6.6 between expenditure-based and income-based coefficients to the 136 expenditure-based Gini coefficients in the sample. Such an adjustment would be supported by the fact that the difference between income- and expenditure-based Gini coefficients for the forty-seven available observations does not seem to follow any distinguishable pattern except for narrowing over time. Thus, the difference is not significantly correlated at the 5 percent level with levels of per capita income, continent dummies, or the average level of the Gini in the country, but it is correlated negatively (with a correlation of 0.47) with time. Given the importance of definitional differences, researchers should then explore the robustness of results that rely on cross-national comparisons to changes in definitions.
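The two routes discussed above, restricting the sample to income-based coefficients or shifting expenditure-based coefficients by the mean gap of 6.6, can be sketched in a few lines. This is a minimal illustration, not the authors' code: the record layout, the economy labels, and the Gini values are invented, and only the 6.6-point constant comes from the text.

```python
EXPENDITURE_GAP = 6.6  # mean income-minus-expenditure Gini gap reported in the text

# Hypothetical records: (economy, year, gini, basis). Values are illustrative only.
observations = [
    ("Economy A", 1985, 35.0, "income"),
    ("Economy B", 1988, 40.0, "expenditure"),
    ("Economy C", 1990, 50.0, "income"),
]


def income_only(obs):
    """Keep only coefficients that are defined on an income basis."""
    return [(e, y, g) for e, y, g, basis in obs if basis == "income"]


def adjusted(obs, gap=EXPENDITURE_GAP):
    """Add the mean gap to expenditure-based coefficients; leave the rest unchanged."""
    return [
        (e, y, g + gap if basis == "expenditure" else g)
        for e, y, g, basis in obs
    ]


print(income_only(observations))
print(adjusted(observations))
```

A uniform shift keeps every economy in the sample at the cost of assuming the same gap everywhere, whereas the income-only subset makes no such assumption but drops observations. Running an analysis on both versions is one way to see how sensitive a result is to that choice.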
In particular, it would be prudent to examine whether such results hold for (a) the raw data, (b) data that have been adjusted for differences between expenditure-based and income-based coefficients, and (c) data consistently based on a common definition.

A similar problem of cross-country comparability emerges for income shares and, in principle, should be handled in the same way. Thus, the robustness of any empirical results using the raw data should be checked against a sample of observations based on consistently defined income shares. Unfortunately, the number of observations in our sample for which share data on both income and expenditure are available is limited, making it difficult to arrive at any reasonable adjustment. Here we simply report the differences for the fifteen observations for which such information is available. We find that expenditure-based share data are on average 1.2 percentage points higher than income-based share data for the bottom two quintiles, close to 0 percentage points higher for the middle class (third and fourth quintiles), and 1.3 percentage points lower for the top quintile. The difference between expenditure-based and income-based share data ranges from -1.1 to 2.3 and -3.4 to 4.8 for the two bottom quintiles, -4.1 to 2.7 for the third and fourth quintiles, and -5.9 to 7 for the top quintile.

Table 4. Distribution of Observations by Inequality Measure

| Unit of observation | Income | Expenditure | Total |
|---|---|---|---|
| Household | 345 | 25 | 370 |
| Individual | 201 | 111 | 312 |
| Total | 546 | 136 | 682 |

Note: Values are the number of observations in the data set that are Gini coefficients based on information on income and those based on information on expenditures. The sample includes 108 economies.
Source: Authors' calculations based on various sources as described in the text.

III. SOME DESCRIPTIVE EVIDENCE

Our data set can be used to revisit many of the relationships among growth, inequality, and poverty that have been studied in the literature. We undertake such an analysis in a separate paper (Deininger and Squire 1996). Here we use our data set to illustrate intertemporal and interregional differences in inequality and to provide an exploratory descriptive assessment of the relationship between growth, inequality, and poverty defined on the basis of income received by the bottom quintile. We highlight, among other points, how share data can usefully complement the one-dimensional Gini index of income inequality. We also explore the relationship between aggregate growth and changes in real income received by different quintile groups in the population. A similar exercise has been undertaken by Ravallion and Chen (1995), who focus on poverty defined as percentage of the population receiving less than a certain percentage of the mean. Ravallion and Chen concentrate on growth spells observed during the 1980s for forty-two developing countries. Given the large number of observations from Eastern European countries included, together with the relatively atypical performance of this group during the period concerned, the results of the study depend heavily on sample composition but, in general, do not contradict the findings reported here.

Regional Differences in Inequality

Decadal averages of inequality indexes across regions are presented in table 5. The regional averages are unweighted means of country averages during the period under concern. We have used raw data (that is, unadjusted data) and note that the composition of each regional sample can change over the four decades. The measures are relatively stable through time, but they differ substantially across regions, a result that emerges for individual countries as well (Li, Squire, and Zou 1996).
The average standard deviation within countries (in a sample of countries for which at least four observations are available) is 2.79, compared with a standard deviation for the country-specific means of 9.15. We distinguish between three groups of regions, with considerable variation of Gini coefficients within regions:

• Latin America and the Caribbean and Sub-Saharan Africa. Inequality is highest in Latin America and Sub-Saharan Africa, where the simple average of country-level Gini coefficients is almost 50, ranging from 57 in Brazil to 42 in Bolivia. None of the Latin American countries has an average Gini coefficient below 40, in contrast to Sub-Saharan Africa, where the range is from 28.9 in Rwanda to 62.3 in South Africa. Gini coefficients for the countries in the Middle East and North Africa region are in the 40s, although the fact that most of the coefficients are based on expenditure rather than income may imply that they somewhat understate actual income inequality.

• East Asia and South Asia. East Asia and South Asia are characterized by average Gini coefficients in the middle to upper 30s that range from a high of about 50 in Malaysia and the Philippines to less than 30 in Taiwan (China). Gini coefficients are based on income for all economies except India.

• Industrial and high-income developing countries. Gini coefficients in the low 30s characterize the industrial and high-income developing economies. Although inequality in several industrial countries (including the United Kingdom and the United States) increased during the 1990s, this increase was compensated for by a decrease in inequality in countries such as Canada and Finland and by a relatively constant distribution of income in the Netherlands and Sweden.
The historically low levels of Eastern Europe, a region that, with Gini coefficients in the mid-20s, is much more egalitarian than the rest of the world, show a considerable increase in the 1990s. For many of these countries (including the Russian Federation), Gini coefficients now stand in the lower 30s, comparable to those of some of the industrial countries.

Shares of total income received by different quintiles, possibly a more tangible indicator of inequality, are given in table 6. Although the aggregate picture is similar to the one conveyed by Gini coefficients, the share of income received by specific quintiles is not always completely congruent with the Gini coefficient, even at the regional level. For example, despite similar Gini coefficients in both regions, the top and bottom quintiles receive a higher share of total income in South Asia than in industrial countries. Despite a lower Gini coefficient than in Eastern Europe, the middle class in industrial countries receives a greater share and the top quintile a lower share than in Eastern Europe.

Table 5. Decadal Averages of Inequality Indexes, by Region (Gini coefficients)

| Region | Overall average | 1960s | 1970s | 1980s | 1990s |
|---|---|---|---|---|---|
| Latin America and the Caribbean | 49.78 | 53.24 | 49.06 | 49.75 | 49.31 |
| Sub-Saharan Africa | 46.05 | 49.90 | 48.19 | 43.46 | 46.95 |
| Middle East and North Africa | 40.49 | 41.39 | 41.93 | 40.45 | 38.03 |
| East Asia and the Pacific | 38.75 | 37.43 | 39.88 | 38.70 | 38.09 |
| South Asia | 35.08 | 36.23 | 33.95 | 35.01 | 31.88 |
| Industrial countries and high-income developing countries | 34.31 | 35.03 | 34.76 | 33.23 | 33.75 |
| Eastern Europe | 26.57 | 25.09 | 24.63 | 25.01 | 28.94 |

Note: Figures reported are unweighted averages of Gini coefficients of economies in each region. The sample includes 108 economies. Changes within regions may be caused by the fact that not all economies have observations for all decades.
Source: Authors' calculations based on various sources as described in the text.

Table 6. Income Shares of Different Quintiles, by Decade and Region

| Quintile and region | Overall average | 1960s | 1970s | 1980s | 1990s |
|---|---|---|---|---|---|
| Lowest quintile | | | | | |
| Sub-Saharan Africa | 5.26 | 2.76 | 5.10 | 5.70 | 5.15 |
| East Asia and the Pacific | 6.34 | 6.44 | 6.00 | 6.27 | 6.84 |
| South Asia | 7.74 | 7.39 | 7.84 | 7.91 | 8.76 |
| Eastern Europe | 9.34 | 9.67 | 9.76 | 9.81 | 8.83 |
| Middle East and North Africa | 6.66 | 5.70 | n.a. | 6.64 | 6.90 |
| Latin America and the Caribbean | 3.86 | 3.42 | 3.69 | 3.67 | 4.52 |
| Industrial countries and high-income developing countries | 6.42 | 6.42 | 6.31 | 6.68 | 6.26 |
| Middle class (third and fourth quintiles) | | | | | |
| Sub-Saharan Africa | 34.06 | 32.72 | 32.15 | 35.40 | 33.54 |
| East Asia and the Pacific | 37.02 | 36.29 | 36.88 | 37.18 | 37.53 |
| South Asia | 37.25 | 37.05 | 37.89 | 37.17 | 38.42 |
| Eastern Europe | 40.65 | 39.69 | 41.59 | 41.25 | 40.01 |
| Middle East and North Africa | 36.28 | 35.30 | n.a. | 35.88 | 36.84 |
| Latin America and the Caribbean | 33.21 | 28.13 | 34.59 | 33.58 | 33.84 |
| Industrial countries and high-income developing countries | 40.99 | 39.89 | 40.61 | 41.21 | 41.80 |
| Top quintile | | | | | |
| Sub-Saharan Africa | 51.79 | 61.97 | 55.82 | 48.86 | 52.37 |
| East Asia and the Pacific | 45.73 | 45.90 | 46.50 | 45.51 | 44.33 |
| South Asia | 43.01 | 44.05 | 42.19 | 42.57 | 39.91 |
| Eastern Europe | 36.11 | 36.30 | 34.51 | 34.64 | 37.80 |
| Middle East and North Africa | 46.32 | 49.00 | n.a. | 46.72 | 45.35 |
| Latin America and the Caribbean | 55.12 | 61.62 | 54.18 | 54.86 | 52.94 |
| Industrial countries and high-income developing countries | 40.42 | 41.22 | 41.11 | 39.89 | 39.79 |

n.a. Not available.
Source: Authors' calculations based on various sources as described in the text.

It has long been known that, in the presence of intersecting Lorenz curves, movements of the Gini coefficient may not accurately indicate changes in the welfare of individual groups in a population. Our data suggest that intersecting Lorenz curves are indeed observed in most cases (55 percent of the countries). This observation would imply that, within countries, there may be considerable changes in the income shares received by individual quintile groups of the population, despite the apparent stability of the Gini coefficient.
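A short sketch may help make the point about intersecting Lorenz curves concrete. The code below is illustrative only: the two sets of quintile shares are invented, the function names are hypothetical, and the Gini values come from a trapezoid approximation on five points, which ignores inequality within quintiles and therefore understates the coefficient computed from unit-level data.

```python
from itertools import accumulate


def lorenz_points(quintile_shares):
    """Cumulative income shares at 20, 40, 60, and 80 percent of the population.

    `quintile_shares` lists the five quintile shares, poorest first.
    """
    total = sum(quintile_shares)
    cumulative = list(accumulate(s / total for s in quintile_shares))
    return cumulative[:-1]  # the last point is always 1.0


def lorenz_curves_cross(shares_a, shares_b, tol=1e-9):
    """True if neither distribution's Lorenz curve lies everywhere above the other."""
    diffs = [a - b for a, b in zip(lorenz_points(shares_a), lorenz_points(shares_b))]
    return any(d > tol for d in diffs) and any(d < -tol for d in diffs)


def gini_from_quintiles(quintile_shares):
    """Approximate Gini (0-100 scale) from quintile shares via the trapezoid rule."""
    total = sum(quintile_shares)
    cum = [0.0] + list(accumulate(s / total for s in quintile_shares))
    area = sum((cum[i] + cum[i + 1]) / 2.0 * 0.2 for i in range(5))
    return 100.0 * (1.0 - 2.0 * area)


# Hypothetical quintile shares in percent (poorest to richest); not from the data set.
shares_a = [6.0, 10.0, 15.0, 22.0, 47.0]
shares_b = [4.0, 11.0, 17.0, 24.0, 44.0]

print(lorenz_curves_cross(shares_a, shares_b))
print(round(gini_from_quintiles(shares_a), 1), round(gini_from_quintiles(shares_b), 1))
```

In this constructed pair the approximate Gini coefficients are close to each other, yet the bottom quintile's share differs by two percentage points and the curves cross, which is the situation in which the Gini index alone says little about any particular group.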
By contrast, large differences in the Gini coefficient across countries need not necessarily be accompanied by an equally large variation in the shares of individual income groups.

Within countries, we do indeed find that changes in the aggregate Gini index and changes in the income shares of individual income groups are not very highly correlated, especially for the subsample of countries with intersecting Lorenz curves. Simple correlation coefficients for this subsample range from -0.3 for changes in the share of the bottom 20 and 40 percent to 0.2 for the top 20 percent. The correlation is insignificant for changes in the shares of the third and fourth quintiles. The corresponding correlation coefficients for the complete sample are -0.53 between the change in the Gini coefficient and income growth for the bottom 20 and 40 percent, -0.26 between changes in the Gini and changes in the shares of the third and fourth quintiles, and 0.48 between changes in the Gini and changes in the share of the top quintile of income receivers.

Changes of similar magnitude in the income share of any given quintile could be associated with quite significantly different changes in the aggregate Gini coefficient. To illustrate, we compare two cases in which the share of the bottom quintile declined by about 4 percentage points. In Indonesia the decline occurred between 1978 and 1980 and was accompanied by a significant increase in the shares of the second to fourth quintiles and a decrease in the share of the top quintile, resulting in a net decrease of the Gini coefficient by about 3 points. In Hong Kong, a similar decline occurred between 1986 and 1991, but in this case the shares of both the third and the fourth quintiles increased, resulting in an increase in the Gini coefficient of 1.4 points.

Across countries, the intersection of Lorenz curves in pairwise comparisons is a frequent occurrence.
As a consequence, large differences in Gini coefficients can be associated with income shares for individual population groups that are remarkably similar. Countries in which Gini coefficients differ by as much as 10 or more points may have almost identical shares of income for the bottom quintile. For example, the Gini coefficient in Korea in 1985 was 35.5, compared with 50 in Colombia in 1970. The bottom quintile received almost 7 percent of total income in both cases.

We conclude that, because Lorenz curves are observed to cross frequently, Gini coefficients and income shares can usefully complement each other in many types of analysis. To account for this we have included the ratio of incomes received by the top and bottom quintiles in table 1 and refer the reader to the data diskette for more details.

Growth, Inequality, and Poverty

The question of whether, or under what conditions, growth is associated with changes in inequality has intrigued economists for a long time. For all but a few countries for which long-enough time series have been available, for example, India, a satisfactory treatment of this issue has been precluded by a lack of sufficient country-level data and the fact that cross-sectional studies might pick up unobservable country-specific effects. Our data can be used to eliminate time-invariant country effects and to investigate the relationship between growth rates of aggregate income and inequality as measured by the Gini index. In addition, we can use the information on changes in individual quintiles' shares of total income together with information on aggregate growth to investigate changes in the real income received by different quintile groups and in particular the bottom 20 percent in the population. Real income is obtained by multiplying the share of each quintile with real national per capita income (purchasing-power parity estimates, obtained from the Summers-Heston 1991 data set). Here we provide a descriptive analysis of these relationships.

Table 7. Growth, Inequality, and Poverty

| Indicator | Periods of growth (88): improved | Periods of growth (88): worsened | Periods of decline (7): improved | Periods of decline (7): worsened |
|---|---|---|---|---|
| Inequality | 45 | 43 | 2 | 5 |
| Income of the poor (a) | 77 | 11 | 2 | 5 |

Note: "Improved" in the income distribution implies a decrease of the Gini coefficient; "worsened" implies an increase. The sample includes ninety-five economies.
a. The income of the lowest quintile.
Source: Authors' calculations based on various sources as described in the text.

We focus on the relation between changes in overall income and inequality during decadal growth episodes that are defined by the availability of distributional data that span at least one decade. The results illustrate two points (see table 7). First, there appears to be little systematic relationship between growth and changes in aggregate inequality. Periods of aggregate growth were associated with an increase in inequality almost as often (forty-three cases) as with a decrease in inequality (forty-five cases). Similarly, periods of economic decline were associated with increased inequality in five cases and with a more equitable distribution of income in two cases. The simple correlation between contemporaneous as well as lagged income growth and the change in the Gini coefficient is insignificant for the whole sample as well as for subsamples defined in terms of country characteristics (rich or poor, equal or unequal, fast-growing or slow-growing economies), suggesting no strong relationship between growth and changes in aggregate inequality.

The main reason for the lack of relationship appears to be that, whether average incomes are increasing or declining, changes in the Gini coefficient of inequality tend to be small (see Li, Squire, and Zou 1996).
Thus, the average annual percentage change in the Gini coefficients in our sample was only 0.28 points, compared with an average growth rate in per capita income of 2.16 percent. Some examples illustrate the quantitative significance of this point. In Taiwan (China), real income per capita increased fivefold, from US$1,540 in 1964 to US$8,063 in 1990, whereas the Gini index barely changed, declining from 32.2 to 30.1. Similar outcomes can be observed in other economies: In the United States, real income increased from US$8,772 in 1950 to US$17,594 in 1991, yet the Gini index changed hardly at all, moving from 36.0 to 37.9. Brazil saw real income increase from US$1,784 in 1960 to US$4,271 in 1989 while the Gini index moved from 53.0 to 59.6. Even where inequality changed considerably, as in Thailand, where the Gini index moved from 41.3 in 1962 to 51.5 in 1991, the change in the index seems small compared with the fourfold increase in real income. This lack of change suggests that efforts to find systematic links between inequality and aggregate growth may have to be rethought (see Deininger and Squire 1996).

The second point is that changes in the absolute income received by different quintiles reveal additional information that is not captured in our aggregate measure of inequality. In particular, although we do not find significant correlations between aggregate growth and changes in inequality, there is a strong correlation between aggregate growth and changes in the income of all quintiles except the top one. Changes in absolute income enable us to investigate to what degree growth would be impoverishing, that is, to what degree increases in mean income would be associated with a fall in the income of the poor.
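The calculation behind the "income of the poor" comparison can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the function names are hypothetical, the income levels and quintile shares in the example are invented, and dividing by 0.2 to express the result as average income per person in the quintile is a scaling choice that does not affect the direction of change.

```python
def quintile_income(share_pct, income_per_capita):
    """Average real income per person in a quintile that receives `share_pct`
    percent of total income, given real national income per capita."""
    return (share_pct / 100.0) * income_per_capita / 0.2


def classify_spell(income_start, income_end, share_start, share_end):
    """Report whether mean income grew and whether the quintile's real income rose."""
    poor_start = quintile_income(share_start, income_start)
    poor_end = quintile_income(share_end, income_end)
    return {
        "aggregate_growth": income_end > income_start,
        "poor_income_start": poor_start,
        "poor_income_end": poor_end,
        "poor_improved": poor_end > poor_start,
    }


# Hypothetical spell: mean income rises 20 percent while the bottom quintile's
# share slips from 6.0 to 5.5 percent of total income.
spell = classify_spell(1000.0, 1200.0, 6.0, 5.5)
print(spell)
```

In the invented spell the bottom quintile's share falls, so measured inequality at the bottom worsens, yet its real income still rises because the growth in mean income more than offsets the smaller share. Growth is impoverishing in the sense used above only when the share falls proportionally more than mean income rises.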
We find that for most of the growth episodes in our sample, growth of average income, even if accompanied by increases in inequality, led to an increase in incomes for the members of the lowest quintile (see table 7). Aggregate growth was associated with an increase in the incomes of the poorest quintile in more than 85 percent of the ninety-one cases. Nonconforming growth episodes are ones in which either the economy grew and the income of the poor decreased or the economy declined and the poor benefited.

A case-by-case review of the thirteen nonconforming growth episodes confirms the strong association between aggregate growth and improvements in income for all groups of the population. In nine of the thirteen cases, the nonconforming pattern can be shown to be caused by the use of ten-year growth spells; it disappears when longer periods are considered. In three of the remaining four cases, aggregate growth was low, below 2 percentage points. This leaves only one case, Colombia from 1970 to 1980, where a growth rate of slightly more than 2 percent was associated with a slight decrease (0.9 percent) in the income of the poor. Thus, there is not a very strong basis on which to question the generally positive association between growth and the welfare of the bottom quintile.

To sum up, our data suggest no systematic relationship between growth of aggregate income and changes in inequality as measured by the Gini coefficient. The data do, however, suggest that a mere focus on distribution that neglects the large cross-country differences in overall growth may lead to flawed conclusions. Especially because changes in inequality tend to be relatively modest, we find a strong link between overall growth and a reduction in poverty. This link supports the hypothesis that economic growth benefits the poor in the large majority of cases, whereas economic decline generally hurts the poor.

IV. CONCLUSION

This article originated in an attempt to provide a data set on inequality that could narrow the gap between the far-reaching implications of the theoretical literature on inequality and the much more limited empirical evidence available to actually support and test such theories. To that end, we have expanded the available information on inequality. In our view, we have been more successful in improving the within-country, time-series dimension of the data, a significant improvement given that the evolution of inequality is inherently an intertemporal issue. At the same time, we have identified a number of factors that are likely to affect cross-country research. We therefore caution researchers who use these data to interpret results carefully in light of the issues discussed here, to subject them to sensitivity analysis and tests for robustness, and to complement analysis based on summary statistics (such as the Gini coefficient) with data on income shares.

REFERENCES

The word "processed" describes informally reproduced works that may not be commonly available through library systems.

Adelman, Irma, and Cynthia Taft Morris. 1973. Economic Growth and Social Equity in Developing Countries. Stanford, Calif.: Stanford University Press.
Alesina, Alberto, and Dani Rodrik. 1994. "Distributive Politics and Economic Growth." Quarterly Journal of Economics 109(2, May):465-90.
Altimir, Oscar. 1986. "Estimaciones de la distribución del ingreso en la Argentina, 1953-80." Desarrollo Economico 25(100):521-66.
Anand, Sudhir, and R. S. M. Kanbur. 1993. "Inequality and Development: A Critique." Journal of Development Economics 41(1):19-43.
Atkinson, A. B., Lee Rainwater, and Timothy M. Smeeding. 1995. "Income Distribution in OECD Countries: Evidence from Luxembourg Income Study." OECD Social Policy Studies no. 18.
Bigsten, Arne. 1986. "Welfare and Economic Growth in Kenya, 1974-76." World Development 14(9):1151-60.
Bourguignon, Francois, and Christian Morrisson. 1989. "External Trade and Income Distribution." Organisation for Economic Co-operation and Development (OECD), Development Center, Paris.
Chai, Joseph C. H., and Karin B. Chai. 1994. "Economic Reforms and Inequality in China." Rivista Internazionale di Scienze Economiche e Commerciali 41(8, August):675-96.
Chen, Shao-hua, Gaurav Datt, and Martin Ravallion. 1995. "Is Poverty Increasing in the Developing World?" Data Appendix, updated version. World Bank, Policy Research Department, Washington, D.C. Processed.
Coulter, Fiona A. E., Frank A. Cowell, and Stephen P. Jenkins. 1992. "Equivalence Scale Relativities and the Extent of Inequality and Poverty." Economic Journal 102(September):1067-82.
Cromwell, Jerry. 1977. "The Size Distribution of Income: An International Comparison." Review of Income and Wealth 23(3, September):291-308.
Datt, Gaurav. 1995. "Poverty in India, 1951-91." World Bank, Policy Research Department, Washington, D.C. Processed.
Deininger, Klaus, and Lyn Squire. 1996. "New Ways of Looking at Old Issues: Inequality and Growth." World Bank, Policy Research Department, Washington, D.C. Processed.
Easton, Brian. 1983. Income Distribution in New Zealand. Wellington: New Zealand Institute of Economic Research.
Ferreira, Luisa. 1994. "Poverty and Inequality during Structural Adjustment in Rural Tanzania." Transition Economics Research Paper Series 8 (July). World Bank, Eastern Africa Department, Washington, D.C. Processed.
Fields, Gary S. 1989a. "Changes in Poverty and Inequality in Developing Countries." The World Bank Research Observer 4(2):167-85.
Fields, Gary S. 1989b. "A Compendium of Data on Inequality and Poverty for the Developing World." Cornell University, Department of Economics, Ithaca, N.Y. Processed.
Fields, Gary S., and George H. Jakubson. 1994. "New Evidence on the Kuznets Curve." Cornell University, Department of Economics, Ithaca, N.Y. Processed.
Fishlow, Albert, Ariel Fiszbein, and Lauro Ramos. 1993. "Distribuição de renda no Brasil e no Argentina: Uma analise comparativa." Pesquisa e Planejamento Economico
Government of Greece. Various issues. Statistical Yearbook of Greece. Athens.
Government of Indonesia. Various issues. Statistical Yearbook of Indonesia. Jakarta.
Government of Peru. 1991. Statistical Yearbook of Peru. Lima.
Government of Sweden. 1980. Statistical Yearbook of Sweden. Stockholm.
Haddad, Adeeb. 1990. "Jordan's Income Distribution in Retrospect." In Kamel Abu Jaber, Matthes Buhbe, and Mohammad Smadi, eds., Income Distribution in Jordan. Boulder, Colo.: Westview Press.
Jain, Shail. 1975. Size Distribution of Income: A Compilation of Data. Washington, D.C.: World Bank.
Jenkins, Stephen P. 1991. "The Measurement of Income Inequality." In Lars Osberg, ed., Economic Inequality and Poverty: International Perspectives. Armonk, N.Y.: Sharpe, pp. 3-38.
Jenkins, Stephen P., and Frank A. Cowell. 1994. "Parametric Equivalence Scales and Scale Relativities." Economic Journal 104(July):891-900.
Kansal, Satish. 1982. "Data on Income Distribution in Thailand." Division Working Paper No. 1982-1. World Bank, Economic Analysis and Projections Department, Washington, D.C. Processed.
Kozel, Valerie. 1990. The Composition and Distribution of Income in Côte d'Ivoire. Living Standards Measurement Study Working Paper 68. Washington, D.C.: World Bank.
Kuznets, Simon. 1955. "Economic Growth and Income Inequality." American Economic Review 45:1-28.
Kwack, Sung Yeung. 1990. "The Economic Development of the Republic of Korea, 1965-81." In Lawrence J. Lau, ed., Models of Development: A Comparative Study of Economic Growth in South Korea and Taiwan. Revised and expanded edition. San Francisco: International Center for Economic Growth.
Lachman, Desmond, and Kenneth Bercuson, eds. 1992. Economic Policies for a New South Africa. IMF Occasional Paper 91. Washington, D.C.: International Monetary Fund.
Lecaillon, Jacques, Felix Paukert, Christian Morrisson, and Dimitri Germidis. 1984. Income Distribution and Economic Development: An Analytical Survey. Geneva: International Labour Office.
Li, Hongyi, Lyn Squire, and Heng-fu Zou. 1996. "Explaining International and Intertemporal Income Inequality." World Bank, Policy Research Department, Washington, D.C. Processed.
Lianos, Theodoros P., and K. P. Prodomidis. 1974. Aspects of Income Distribution in Greece. Lecture Series 28. Athens: Center of Planning and Economic Research.
Meesook, Oey A. 1975. "Review of Income Distribution Data: Thailand, Malaysia, and Indonesia." Discussion Paper 56. Princeton University, Woodrow Wilson School, Research Program in Economic Development, Princeton, N.J.
Melgar, Alicia. 1989. "La Distribucion del ingreso en la decada de los anos ochenta en Uruguay." Economia de America Latina 18-19:113-26.
Milanovic, Branko. 1995. Personal communication. World Bank, Policy Research Department, Washington, D.C.
Mizoguchi, Toshiyuki. 1985. "Economic Development Policy and Income Distribution: The Experience in East and Southeast Asia." The Developing Economies 23(4, December):307-24.
Owosekun, A., and M. Otigba. 1976. "The Nigerian Enterprises Promotion Decree: Impact on Indigenous Ownership." In J. F. Rweyemamu, ed., Industrialization and Income Distribution in Africa. Dakar, Senegal: Codesria, B. Pb.
Paukert, Felix. 1973. "Income Distribution at Different Levels of Development: A Survey of Evidence." International Labour Review 108(2, August-September):97-125.
Persson, Torsten, and Guido Tabellini. 1995. "Is Inequality Harmful for Growth?" American Economic Review 84(3, June):600-21.
Pryor, Frederic L. 1990. The Political Economy of Poverty, Equity, and Growth: Malawi and Madagascar. New York: Oxford University Press.
Psacharopoulos, George, Samuel Morley, Ariel Fiszbein, Haeduck Lee, and Bill Wood. 1992. "Poverty and Income Distribution in Latin America: The Story of the 1980s." World Bank, Latin America and the Caribbean Technical Department, Washington, D.C. Processed.
Ravallion, Martin, and Shaohua Chen. 1995. "What Can New Survey Data Tell Us about Recent Changes in Living Standards in Developing and Transitional Economies?" World Bank, Policy Research Department, Washington, D.C. Processed.
Spant, R. 1980. "The Distribution of Income in Sweden, 1920-1976." In N. A. Klevmarken and J. A. Lybeck, eds., The Statics and Dynamics of Income. Avon, U.K.: Tieto.
Summers, Robert, and Alan Heston. 1991. "The Penn World Table (Mark 5): An Expanded Set of International Comparisons, 1950-1988." Quarterly Journal of Economics 106(2, May):327-68.
Van Ginneken, Wouter, and Jong-goo Park. 1984. Generating Internationally Comparable Income Distribution Estimates. Geneva: International Labour Office.
World Bank. 1995. World Development Report 1995: Workers in an Integrating World. New York: Oxford University Press.