WPS6763
Policy Research Working Paper 6763
Individual Diversity and the Gini Decomposition
Lidia Ceriani
Paolo Verme
The World Bank
Middle East and North Africa Region
Poverty Reduction and Economic Management Department
January 2014
Abstract
The paper defines the Gini index as the sum of individual contributions where individual
contributions are interpreted as the degree of diversity of each individual from all other
members of society. Among various possible forms of individual contributions to the Gini
found in the literature, the paper shows that only one form satisfies a set of desirable
properties. This form can be used for decomposing the Gini into population subgroups. An
empirical illustration shows the use of this approach.
This paper is a product of the Poverty Reduction and Economic Management Department, Middle East and North Africa
Region. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to
development policy discussions around the world. Policy Research Working Papers are also posted on the Web at
http://econ.worldbank.org. The authors may be contacted at lceriani@worldbank.org or pverme@worldbank.org.
The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development
issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the
names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those
of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and
its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.
Produced by the Research Support Team
Individual Diversity and the Gini Decomposition
Lidia Ceriani1 and Paolo Verme2
JEL: D31, D63
Keywords: Gini, Inequality, Decomposition
Sector Board: Poverty (POV)
1 EconPubblica, Bocconi University, Milano
2 World Bank. The authors are grateful to Ernesto Savaglio, Casilda Lasso de la Vega, Chiara Gigliarano
and participants in a seminar held at the Université d'Aix-Marseille in 2011 for useful exchanges during
the preparation of the paper.
1 Introduction
Over 100 years ago, Corrado Gini (1912) published his seminal book Variabilità e
Mutabilità, where he presented for the first time the distributional index that today is
associated with his name. The Gini index remains one of the most popular statistical
indices of all time and continues to be the subject of studies across the social sciences. A
search on JSTOR finds 265 articles with the name ‘Gini’ in the title, 366 with the word
gini in the abstract and 16,594 with the word gini in the text. A search on econpapers finds
423 journal articles with the word gini in the title or keywords. The majority of articles
from these two sources concern inference, decompositions, the relation with the Lorenz
curve, various extensions or formulations of the Gini, and relations with other measures
such as deprivation.
It is known that the Gini index can be expressed in many diﬀerent forms. A recent paper
on the origins of the Gini index by Ceriani and Verme (2012) reports 13 diﬀerent forms
that emerged in the literature since its ﬁrst formulation by Corrado Gini in 1912. With
some basic manipulation of these indexes, it is also possible to see that eight of these forms
can be expressed as sums of individual values across a given population. This is a simple
but interesting observation because it implies that these eight formulations of the Gini can
potentially be used to decompose the Gini into population subgroups. For example, one
could estimate the individual contribution to the Gini for all members of a given population,
sum these contributions by males and females and then sum results for the two groups to
obtain the overall Gini. This simple procedure would overcome the century-long issue of
the Gini decomposition into population subgroups.
The problem with this approach is that the individual contribution to the Gini per se has
no meaning. We cannot really talk of individual inequality. By deﬁnition, inequality is
measured across a set of individuals. However, if we were able to provide a meaning and a
set of desirable properties to the individual contribution to the Gini, then these individual
values would become meaningful in their own right, could be added across groups and used
for the Gini decomposition into population subgroups.
The paper follows this approach. We deﬁne the individual contribution to the Gini as a
measure of individual diversity, we identify a set of desirable properties that this measure
of individual diversity should have and we seek, among the various formulations of the indi-
vidual contribution to the Gini, those formulations that satisfy these desirable properties.
We will ﬁnd that only one formulation satisﬁes these properties and we will show how this
formulation can be used for the Gini decomposition into population subgroups.
The concept of individual diversity we propose is similar to the concept of individual com-
plaint developed by Temkin (1986) and characterized by Cowell and Ebert (2004) and the
concept of individual relative deprivation developed by Yitzhaki (1979) and characterized
by Ebert and Moyes (2000). As shown by these authors, societal measures of complaint
or deprivation can be seen as the sum of individual values where individual values have a
meaning in themselves and are measured as the sum of income distances between one's own
income and the income of richer or poorer individuals. As also shown by these authors,
the societal measures of complaint or deprivation have a direct algebraic link with the Gini
index.
This paper builds on this literature in four respects. First, we deﬁne individual values as
individual diversity. By deﬁnition, this concept does not attribute any positive or negative
connotation to the individual values and, we believe, is closer to the concept of inequality
that Corrado Gini had in mind. One cannot talk of individual inequality but one can talk of
individual diversity and it can be reasonably argued that the sum of individual diversities
in a given population is a measure of societal inequality. Second and as a consequence
of the ﬁrst point, we consider the full set of income distances between an individual and
all other members of society. In other words, we capture deprivation and satisfaction
in one measure of individual diversity or inequality. Third, instead of defining a new
societal index, we review the existing forms of the Gini and ask the question of whether
any of these forms satisﬁes our quest for a measure of individual diversity. Fourth, we
exploit these features to provide an exact decomposition of the Gini index into population
subgroups. This is a rather diﬀerent approach from the traditional decompositions of the
Gini index in between and within components as originally proposed by Bhattacharaya
and Mahalanobis (1967) and Pyatt (1976).
In the next section we illustrate the diversity of Gini indices oﬀered by the literature and
the diﬀerent distributions of unit values that these indices imply. Section three outlines
some of the desirable properties that a measure of individual diversity should have and
tests which Gini satisﬁes these properties. Section four provides an example of the Gini
decomposition by population subgroups using UK data and section ﬁve concludes.
2 Diﬀerent Ginis
The Gini index can be written in many diﬀerent forms (see Xu, 2003 and Ceriani and
Verme, 2012 for reviews). In his 1912 book, Corrado Gini ﬁrst proposed a measure which
he called the average difference between n quantities.3 Gini was particularly keen on showing
how any distribution could be seen as a symmetric distribution where each observation had
a symmetric counterpart, an aspect that became evident when he published an article that
proposed a modiﬁed version of his index as a concentration ratio (Gini, 1914).4 The ratio
proposed was equal to his original average distance between n quantities divided by twice
the arithmetic mean of X . This is what today is commonly referred to as the Gini index
or one-half of the Gini relative mean diﬀerence.5
As the starting point of our analysis, we will then use the two indices introduced by Gini
in 1912 but expressed in the more popular form published in 1914. Let us consider a
population of N individuals, $i = 1, 2, \ldots, n$, $n \in \mathbb{N}$, $n \ge 3$, having an income distribution
$X = (x_1, x_2, \ldots, x_i, \ldots, x_n)$, where $X \in \mathbb{R}^n_{++}$ and $x_1 \le x_2 \le \ldots \le x_n$. Then the Gini
index can be expressed as:

\[ G^{I}(X) = \sum_{i=1}^{n} \frac{(n + 1 - 2i)(x_{n-i+1} - x_i)}{2n^2 \mu_X} \tag{1} \]

or

\[ G^{II}(X) = \sum_{i=1}^{n} \frac{2(i - M)(x_i - x_M)}{n^2 \mu_X} \tag{2} \]

where $\mu_X$ and $x_M$ are, respectively, the arithmetic mean and the median of distribution
X and M is the rank of the individual with the median income.

3 Differenza media tra le n quantità, Gini (1912), p. 22, eq. 5.
4 This is the article where Gini shows the relation of his index with the Lorenz curve.
5 Gini's average distance between n quantities was later referred to by Dalton (1920) as the absolute mean
difference to distinguish it from the relative mean difference defined as the Gini absolute mean difference
divided by the mean.

Since its introduction, the Gini index attracted a great amount of attention and has been
reformulated in at least 13 diﬀerent forms as noted in Ceriani and Verme (2012). By
reviewing these forms, we found a total of eight forms that can be expressed as sums of
individual contributions and that exhibit diﬀerent individual functions. These are reported
in Table 1 in chronological order and expressed as sums of individual observations across
the population. Table 1 also reports the alleged original proponents of each index and a
tentative synthetic description of the form of index.6
[Table 1]
One ﬁrst remarkable aspect is the variety of individual functions underlying the diﬀerent
Ginis. For example, form II is expressed in terms of distances from the median while
form VIII is expressed in terms of distances from the mean. Form IV ignores individual
incomes altogether and the index is expressed only in terms of rank whereas forms III and
V ignore rank and use only individual incomes. To illustrate further diﬀerences across
the eight Gini indices considered, we took a small arbitrary sample of eleven observations,
estimated the individual values for each type of Gini and plotted the distributions of these
values. Table 2 reports the individual values and Figure 1 plots these values. Note that,
by construction, two values of the income distribution reported in Table 2 correspond to
the mean and median values respectively. This is to better appreciate what happens to
the individual contributions to the Gini in correspondence of these two central moments
of the distribution.
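As a minimal sketch (not the authors' code), the individual values of some of the forms in Table 1 can be recomputed for the eleven incomes of Table 2. The function and variable names are illustrative assumptions, and only forms I, III, V, VII and VIII are coded, for brevity. Each list of individual values, summed and divided by n times the mean, should return the same Gini of about 0.333.

```python
def individual_contributions(x):
    """Per-person values g_i for forms I, III, V, VII and VIII of Table 1.

    x must be sorted in ascending order; ranks i run from 1 to n.
    """
    n = len(x)
    mu = sum(x) / n
    forms = {"I": [], "III": [], "V": [], "VII": [], "VIII": []}
    for i, xi in enumerate(x, start=1):
        mirror = x[n - i]  # x_{n-i+1} in the paper's 1-based notation
        forms["I"].append((n + 1 - 2 * i) * (mirror - xi) / (2 * n))
        forms["III"].append(sum(abs(xi - xj) for xj in x) / (2 * n))
        forms["V"].append(sum(xj - xi for xj in x[i:]) / n)  # ranks j > i
        forms["VII"].append((2 * i - n - 1) * xi / n)
        forms["VIII"].append(2 * i * (xi - mu) / n)
    return forms, mu


def gini_from(values, n, mu):
    """Aggregate individual values into the Gini: (1 / (n * mu)) * sum(g_i)."""
    return sum(values) / (n * mu)


sample = [3, 6, 10, 12, 19, 20, 21, 25, 37, 37, 41]  # incomes from Table 2
forms, mu = individual_contributions(sample)
ginis = {name: gini_from(v, len(sample), mu) for name, v in forms.items()}

for name, values in forms.items():
    print(name, [round(v, 2) for v in values], round(ginis[name], 3))
```

The individual values differ widely across forms even though every form aggregates to the same index, which is the point the comparison in Table 2 makes.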
Several differences across the distributions of individual values are evident.7 First, some
Gini forms result in only non-negative values (I, II, III, V) and others in negative and positive
values (IV, VI, VII, VIII). Second, some forms attribute the same unit values to the same
values of x (III, V) while others attribute different values (I, II, IV, VI, VII, VIII).8 Third,
there is no common order in rank across the series. One is ordered in descending order of
x (V), five are U-shaped (I, II, III, VII, VIII), one is ordered in ascending order of x (VI)
and one has no regular shape (IV). Fourth, some of the U-shaped distributions invert the
trend in correspondence of the median (I, II, III) while others in correspondence of lower
values (VII, VIII). Fifth, some individual scores take the value of zero in correspondence
of the median (I, II, VII) and one in correspondence of the mean (VIII). Sixth, in some
series the greatest absolute contribution to the Gini is given by the largest values of x (II,
III, VI, VII, VIII), others by the lowest values (IV, V) and one is perfectly symmetric (I).
Can any of the functional forms be suitable to describe an individual measure of diversity?
This is the question we address in the next section.
6 It is unclear where formulation III first appeared. It is not in Gini's 1912 book or 1914 article and can
be found in Kendall and Stuart (1958) and Xu (2003). We attributed it to Kendall and Stuart (1958) but
it could well have appeared before in the literature.
7 Note that all forms of index have been estimated with x ranked in ascending order.
[Table 2 and Figure 1]
3 Properties
As before, let us consider a population of N individuals, $i = 1, 2, \ldots, n$, $n \in \mathcal{N}$, where
$\mathcal{N}$ is the class of all possible finite subsets of $\mathbb{N}$ with at least three elements. Each i-th
individual in population N is endowed with a non-negative income $x_i \in X$, where X is a
generic vector of length n, and $\mathcal{X}^n$ the class of all vectors X.
Notation $(x_i, x_{-i})$ denotes a relative income distribution X where individual i-th has a
relative income of $x_i$ and all other $j \neq i$ individuals have a relative income distribution
of $x_{-i}$. In the same way, $(x_i, x_j, x_{-ij})$ denotes a relative income distribution X where
individual i-th has a relative income $x_i$, individual j-th has a relative income $x_j$, and all
other $k \neq i, j$ individuals have a relative income distribution described by $x_{-ij}$. Also,
G(X) is the inequality level related to income distribution X, as measured by the relative
Gini index and $\mu_X$ is the mean income in distribution X.
8 See observations 9 and 10 in Table 2, which share the same income.
Definition 3.1. $g_i(x_i, x_{-i}) : \bigcup_{n \in \mathcal{N}} \mathcal{X}^n \to \mathbb{R}_+$ is a measure of individual i diversity, such
that the average of all individual diversities normalized by the mean income in the population
returns the Gini coefficient: $G(X) = \frac{1}{n \mu_X} \sum_{i=1}^{n} g_i(X)$.
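As an illustration of how the definition is read, take the individual function implied by form III in Table 1 (the superscript III is only a label used here). Averaging it and normalizing by the mean gives back the double-sum expression of the Gini:

```latex
\[
g_i^{III}(X) = \frac{1}{2n}\sum_{j=1}^{n} |x_i - x_j|
\quad\Longrightarrow\quad
G(X) = \frac{1}{n\mu_X}\sum_{i=1}^{n} g_i^{III}(X)
     = \frac{1}{2n^{2}\mu_X}\sum_{i=1}^{n}\sum_{j=1}^{n} |x_i - x_j| .
\]
```

With the sample of Table 2 (n = 11), the first individual has $x_1 = 3$ and $\sum_j |3 - x_j| = 198$, so $g_1 = 198/22 = 9.00$, which matches the first entry of column III.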
The aim of this section is to deﬁne a set of desirable properties of this individual index of
diversity.
Property 3.1 (Continuity). $g_i(x_i, x_{-i})$ is continuous over $\bigcup_{n \in \mathcal{N}} \mathcal{X}^n$.
The ﬁrst property is that the individual contribution to inequality is weakly sensitive
to small changes in income values. Given that empirical data are typically aﬀected by
measurement errors, this property ensures that the individual contribution to the Gini is
not very sensitive to such errors.
Property 3.2 (Additivity). Let $\mathcal{N}_{-i}$ be the class of all possible non-trivial subsets of $N - \{i\}$,
and let $N^1_{-i}$ and $N^2_{-i}$ be two generic elements of $\mathcal{N}_{-i}$, such that $N^1_{-i} \cup N^2_{-i} = N - \{i\}$. Let
$x^1_{-i}$ and $x^2_{-i}$ be the income distributions of subgroups $N^1_{-i}$ and $N^2_{-i}$ respectively, such that
$X = (x_i, x^1_{-i}, x^2_{-i})$. Then, $g_i(x_i, x_{-i}) = g_i(x_i, x^1_{-i}) + g_i(x_i, x^2_{-i})$.
The individual i-th diversity is the sum of individual i-th diversity in diﬀerent subgroups
of the population (where a subgroup can be constituted by a single individual).
Property 3.3 (Linear Homogeneity). gi (λxi , λx−i ) = λgi (xi , x−i ), for any λ ∈ R+ .
If all incomes in the population are scaled by a factor λ, individual i-th diversity is scaled
by the same factor.
Property 3.4 (Translation Invariance). $g_i(x_i, x_{-i}) = g_i(x_i + \lambda, x_{-i} + \mathbf{1}\lambda)$, for any $\lambda \in \mathbb{R}$,
and for $\mathbf{1} = (1, 1, \ldots, 1) \in \mathbb{R}^{n-1}_{+}$.
Individual diversity does not change whenever all incomes are changed by the same amount
λ.
Property 3.5 (Population Invariance). $g_i(X) = g_i(X_\alpha)$ where $X_\alpha$ is an α-replication of
X, $X_\alpha = (\underbrace{x_1, x_2, \ldots, x_n}_{1}, \underbrace{x_1, x_2, \ldots, x_n}_{2}, \ldots, \underbrace{x_1, x_2, \ldots, x_n}_{\alpha})$ and $\alpha \in \mathbb{R}_{++}$.
If each individual in the population is replicated α times, individual i-th diversity is unchanged.
Property 3.6 (Symmetry). For any $i, j \in N$, for any $x_i, x_j, x'_j \in X$ and for any $\varepsilon \in \mathbb{R}_+$,
such that $x_j = x_i + \varepsilon$ and $x'_j = x_i - \varepsilon$: $g_i(x_i, x_j, x_{-ij}) = g_i(x_i, x'_j, x_{-ij})$.
Individual i-th diversity is unchanged if, everything else being equal, she faces another
individual who is richer or poorer by ε. Her diversity is influenced only by the difference
between her income and other incomes, regardless of satisfaction or deprivation.
Property 3.7 (Anonymity). Given any permutation π of N, such that $X^\pi = (x_{\pi(i)}, x_{-\pi(i)}) =
(x_{\pi(1)}, x_{\pi(2)}, \ldots, x_{\pi(n)})$, $g_i(x_i, x_{-i}) = g_{\pi(i)}(x_{\pi(i)}, x_{-\pi(i)})$.
By imposing Anonymity only relative income levels deﬁne the individual contribution to
inequality. This implies that equal relative incomes correspond to equal individual contri-
butions to inequality.
Table 3 reports whether the various forms of individual contribution to the Gini satisfy
the listed properties. All formulations satisfy Continuity, Additivity and Linear Homogeneity.
Gini formulations IV, VI and VII fail to obey Translation Invariance. Population Invariance
and Anonymity exclude the formulations of the Gini based on rank (all except III and V).
By imposing Symmetry, we also rule out the Gini based on a relative-deprivation
concept (form V). Only form III satisfies all properties.
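A numerical spot-check can make the contrast between forms III and V concrete. The sketch below is illustrative only: the function names and the four-income example are assumptions, form V is rewritten on incomes alone (gaps to richer incomes), and a check on one example is not a proof of any property.

```python
import math


def g_form3(i, x):
    """Form III: sum of absolute income gaps from x[i], divided by 2n."""
    return sum(abs(x[i] - xj) for xj in x) / (2 * len(x))


def g_form5(i, x):
    """Form V on incomes only: sum of gaps to richer incomes, divided by n."""
    return sum(max(xj - x[i], 0) for xj in x) / len(x)


x = [20.0, 24.0, 5.0, 50.0]         # person 0 faces someone richer by 4
x_mirror = [20.0, 16.0, 5.0, 50.0]  # same society, but that person is poorer by 4

checks = {
    # scaling all incomes by 3 scales individual diversity by 3
    "linear_homogeneity": math.isclose(g_form3(0, [3 * v for v in x]), 3 * g_form3(0, x)),
    # adding the same amount to all incomes leaves diversity unchanged
    "translation_invariance": math.isclose(g_form3(0, [v + 7 for v in x]), g_form3(0, x)),
    # replicating the whole population twice leaves diversity unchanged
    "population_invariance": math.isclose(g_form3(0, x + x), g_form3(0, x)),
    # richer-by-4 and poorer-by-4 give the same diversity under form III ...
    "symmetry_form3": math.isclose(g_form3(0, x), g_form3(0, x_mirror)),
    # ... but not under the deprivation-based form V
    "symmetry_form5": math.isclose(g_form5(0, x), g_form5(0, x_mirror)),
}
print(checks)
```

In this example every check on form III passes, while the Symmetry check on form V fails, in line with the corresponding entries of Table 3.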
4 Decomposition by population subgroups
Historically, the decomposition of the Gini index has focused on two main areas, decompo-
sition by income source and decomposition by within and between groups. The literature
on inequality decomposition by within and between groups is rather rich. It was ﬁrst pro-
posed by Bhattacharaya and Mahalanobis (1967) and Pyatt (1976) in diﬀerent contexts
but both methodologies led to decompositions into within and between groups inequali-
ties, a line of research followed by many others (see for example Blackorby, Donaldson,
and Auersperg, 1981 and Cowell, 1980). Bourguignon (1979) stated that “a decomposable
inequality measure is deﬁned as a measure such that the total inequality of a population
can be broken down into a weighted average of the inequality existing within subgroups of
the population and the inequality existing between them.” And Shorrocks (1980) deﬁned an
additively decomposable inequality measure as “one which can be expressed as a weighted
sum of the inequality values calculated for population subgroups plus the contribution arising
from differences between subgroups means.”9 Initially, the Gini index was not thought
to be suitable for decompositions but there are now some methodologies that can be used
for an exact decomposition of the Gini such as the Shapley value method (Shorrocks, 1999).
All these contributions continued to focus on the within and between groups decomposition
of the Gini.
The deﬁnition of the individual contribution to inequality presented in the previous sec-
tion opens the possibility for a diﬀerent form of additive decomposition by population
subgroups. The individual contributions to the Gini are considered as a measure of the
individual degree of diversity. When we aggregate these individual degrees of diversity
across groups such as males and females, we can simply add up the individual values by
group and obtain the Gini. If we take the share of Gini by group, we obtain an exact
decomposition by subgroup.
As an example, let i = 1, 2, ..., m be the set M of male individuals in the population
and j = m + 1, ..., n the set W of female individuals in the population, where M ∪ W =
N. Then, the Gini index can be written as the sum of male and female individual
diversities:

\[ G(X) = \frac{1}{n \mu_X} \left( \sum_{i=1}^{m} g_i + \sum_{j=m+1}^{n} g_j \right) \tag{3} \]
This decomposition is determined by both group size and within-group inequality. Groups’
size being equal, the more unequal group accounts for more inequality. Groups’ inequality
being equal, the larger group accounts for more inequality. We consider population size
and within-group inequality as equally legitimate contributors to total inequality.
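The decomposition can be sketched in a few lines, assuming form III is used for the individual values. The group labels below are invented purely for illustration and are attached to the Table 2 incomes; the function name and return structure are also assumptions rather than the authors' implementation.

```python
from collections import defaultdict


def gini_by_group(incomes, labels):
    """Split the Gini into group contributions using form III individual values."""
    n = len(incomes)
    mu = sum(incomes) / n
    contrib = defaultdict(float)
    for xi, label in zip(incomes, labels):
        # individual diversity: absolute gaps to every income, divided by 2n
        g_i = sum(abs(xi - xj) for xj in incomes) / (2 * n)
        # each person adds g_i / (n * mu) to the contribution of their group
        contrib[label] += g_i / (n * mu)
    total = sum(contrib.values())  # equals the overall Gini
    shares = {label: value / total for label, value in contrib.items()}
    return dict(contrib), shares, total


incomes = [3, 6, 10, 12, 19, 20, 21, 25, 37, 37, 41]              # Table 2 sample
labels = ["M", "W", "M", "W", "M", "W", "M", "W", "M", "W", "M"]  # made-up labels
contrib, shares, total = gini_by_group(incomes, labels)
print(contrib, shares, round(total, 3))
```

Because the group contributions are plain sums of individual values, they add up to the overall Gini exactly, with no residual or overlap term.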
To illustrate the decomposition proposed, we took a reduced sample from the 2000 British
Household Panel Survey (BHPS), restricting the population to employees aged 41-50 and
using income net of taxes as the measure of welfare. Table 4 reports group means and
population size as well as the decomposition by groups of total inequality. We can see
that men account for almost 56% of total inequality. This is due to both
population size, where men represent 52.7% of the total population, and within-group
inequality, which is higher for men.
9 Both definitions can be found in the abstracts of the respective papers.
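As a quick arithmetic check, the shares can be recovered from the other figures reported in Table 4 (rounded to three decimals):

```latex
\[
\frac{0.181}{0.324} \approx 0.559, \qquad
\frac{0.143}{0.324} \approx 0.441, \qquad
0.181 + 0.143 = 0.324, \qquad
\frac{463}{879} \approx 0.527 .
\]
```

The male share of the Gini (55.9%) therefore exceeds the male population share (52.7%), which is consistent with the higher within-group inequality reported for men.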
5 Conclusion
The Gini index can be formulated in many diﬀerent forms. When expressed as sums across
the population, several of these forms provide diﬀerent values at the individual level or, in
other words, have diﬀerent individual functions. We asked the question of what form an
individual function should take so as to represent the portion of inequality explained by
each individual. Based on a set of desirable properties, we showed that only one form of
function satisfies all of them. This is the original formulation
of the Gini index as found in Kendall and Stuart (1958). We then illustrated the use of the
individual contributions to the Gini for an exact decomposition of the Gini by population
subgroups.
References
Bhattacharaya, N., and B. Mahalanobis (1967): “Regional Disparities in Household
Consumption in India,” Journal of the American Statistical Association, 22(317), 143–
161.
Blackorby, C., D. Donaldson, and M. Auersperg (1981): “A New Procedure for
the Measurement of Inequality Within and Among Population Subgroups,” Canadian
Journal of Economics, 14, 665–685.
Bourguignon, F. (1979): “Decomposable Income Inequality Measures,” Econometrica,
47, 901–920.
Ceriani, L., and P. Verme (2012): “The origins of the Gini index: extracts from Variabilità
e Mutabilità (1912) by Corrado Gini,” The Journal of Economic Inequality, 10(3),
421–443.
Cowell, F. (1980): “On the Structure of Additive Inequality Measures,” Review of Eco-
nomic Studies, 47, 521–531.
Cowell, F., and U. Ebert (2004): “Complaints and Inequality,” Social Choice and Welfare,
23, 71–89.
Dalton, H. (1920): “The measurement of the inequality of incomes,” Economic Journal,
30, 348–361.
Ebert, U., and P. Moyes (2000): “An axiomatic characterization of Yitzhaki’s index of
individual deprivation,” Economics Letters, 68, 263–270.
Gini, C. (1912): Variabilità e Mutabilità. Contributo allo studio delle distribuzioni e delle
relazioni statistiche, Studi economico-giuridici Anno III, Parte II. Facoltà di giurisprudenza
della Regia Università di Cagliari, Cuppini, Bologna.
Gini, C. (1914): “Sulla Misura della Concentrazione e della Variabilità dei Caratteri,” Atti
del Reale Istituto Veneto di Scienze, Lettere ed Arti, LXXIII(II), 1203–1248.
Kendall, M. G., and A. Stuart (1958): The Advanced Theory of Statistics, vol. 1.
Hafner Publishing Company, New York, 1st edn.
Pyatt, G. (1976): “On the Interpretation and Disaggregation of Gini Coeﬃcients,” Eco-
nomic Journal, 86, 243–255.
Shorrocks, A. F. (1980): “The Class of Additively Decomposable Inequality Measures,”
Econometrica, 48, 613–625.
Shorrocks, A. F. (1999): “Decomposition Procedures for distributional analysis: a unified framework
based on the Shapley value,” manuscript, University of Essex.
Temkin, L. S. (1986): “Inequality,” Philosophy and Public Aﬀairs, 15, 99–121.
Xu, K. (2003): “How Has the Literature on the Gini’s Index Evolved in the Past 80
Years?,” Department of Economics at Dalhousie University Working Papers Archive, Dalhousie,
Department of Economics.
Yitzhaki, S. (1979): “Relative Deprivation and the Gini Coeﬃcient,” Quarterly Journal
of Economics, XCIII, 321–324.
Table 1: Alternative Gini Formulations
Index   Formula   Source   Form
$G^{I}$   $\frac{1}{n\mu_X}\sum_{i=1}^{n}\frac{(n+1-2i)(x_{n-i+1}-x_i)}{2n}$   Gini, 1912, 1914   Original
$G^{II}$   $\frac{1}{n\mu_X}\sum_{i=1}^{n}\frac{2(i-M)(x_i-x_M)}{n}$   Gini, 1912   Median
$G^{III}$   $\frac{1}{n\mu_X}\sum_{i=1}^{n}\sum_{j=1}^{n}\frac{|x_i-x_j|}{2n}$   Kendall and Stuart, 1958   Adjusted Gini
$G^{IV}$   $\frac{1}{n\mu_X}\sum_{i=1}^{n}\frac{(n+1)\mu_X-2(n+1-i)x_i}{n}$   Sen, 1973   Geometric
$G^{V}$   $\frac{1}{n\mu_X}\sum_{i=1}^{n}\sum_{j>i}\frac{(x_j-x_i)}{n}$   Yitzhaki, 1979   Deprivation
$G^{VI}$   $\frac{1}{n\mu_X}\sum_{i=1}^{n}\frac{2ix_i-(n+1)\mu_X}{n}$   Anand, 1983   Covariance
$G^{VII}$   $\frac{1}{n\mu_X}\sum_{i=1}^{n}\frac{(2i-n-1)x_i}{n}$   Silber, 1989   Matrix
$G^{VIII}$   $\frac{1}{n\mu_X}\sum_{i=1}^{n}\frac{2i(x_i-\mu_X)}{n}$   Shorrocks, 1999   Mean
Table 2: Gini Individual Values
i xi I II III IV V VI VII VIII
1 3 17.27 15.00 9.00 16.91 18.00 -22.36 -2.73 -3.27
2 6 11.27 9.82 7.77 12.00 15.27 -20.73 -4.36 -5.45
3 10 7.36 5.18 6.50 6.55 12.00 -17.45 -5.45 -6.00
4 12 2.36 2.73 6.05 5.45 10.55 -14.18 -4.36 -6.55
5 19 0.18 0.09 5.09 -1.27 6.09 -5.64 -3.45 -1.82
6 20 0.00 0.00 5.05 1.09 5.55 -1.09 0.00 -1.09
7 21 0.18 0.27 5.09 3.82 5.09 3.82 3.82 0.00
8 25 2.36 2.00 5.64 4.73 3.64 13.45 9.09 5.82
9 37 7.36 9.55 8.36 2.73 0.36 37.64 20.18 26.18
10 37 11.27 12.73 8.36 9.45 0.36 44.36 26.91 29.09
11 41 17.27 19.55 10.00 15.45 0.00 59.09 37.27 40.00
G(X) = 0.333; µX = 21; xM = 20.
Figure 1: Distributions of Gini Individual Values
[Figure not reproduced: a 3 × 3 grid of panels titled y, I, II, III, IV, V, VI, VII and VIII, each with horizontal axis i and ticks at 0, 5 and 10.]
Table 3: Ginis and axioms
No. Axiom I II III IV V VI VII VIII
1 Continuity yes yes yes yes yes yes yes yes
2 Additivity yes yes yes yes yes yes yes yes
3 Linear homogeneity yes yes yes yes yes yes yes yes
4 Translation Invariance yes yes yes no yes no no yes
5 Population Invariance no no yes no yes no no no
6 Symmetry no no yes no no no no no
7 Anonymity no no yes no yes no no no
Table 4: Decomposition by Gender
Gender   Gini    Population   Pop. Share (%)   Gini Contrib.   Gini Share (%)
Male     0.238   463          52.7             0.181           55.9
Female   0.134   416          47.3             0.143           44.1
Total    0.324   879          100              0.324           100