Policy Research Working Paper 5375 (WPS5375)

On Measuring Scientific Influence

Martin Ravallion
Adam Wagstaff

The World Bank
Development Research Group
Director's office
July 2010

Abstract

Bibliometric measures based on citations are widely used in assessing the scientific publication records of authors, institutions and journals. Yet currently favored measures lack a clear conceptual foundation and are known to have counter-intuitive properties. The authors propose a new approach that is grounded on a theoretical "influence function," representing explicit prior beliefs about how citations reflect influence. They provide conditions for robust qualitative comparisons of influence--conditions that can be implemented using readily-available data. An example is provided using the economics publication records of selected universities and the World Bank.

This paper--a product of the Director's office, Development Research Group--is part of a larger effort in the department to assess the impact of World Bank research. Policy Research Working Papers are also posted on the Web at http://econ.worldbank.org. The author may be contacted at mravallion@worldbank.org.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.
Produced by the Research Support Team

On Measuring Scientific Influence

Martin Ravallion and Adam Wagstaff [1]

Development Research Group, World Bank, 1818 H Street NW, Washington DC, 20433, USA

[1] Our thanks to Qinghua Zhao for his programming work to help retrieve Google Scholar citation data, and to Imran Hafiz for help retrieving the bibliographic metadata from SCOPUS. The views expressed here are those of the authors and need not reflect those of the World Bank, its Executive Directors, or the countries they represent. Correspondence: mravallion@worldbank.org and awagstaff@worldbank.org.

With the huge expansion in bibliographic data and easier (electronic) access to those data, most researchers have become very aware of how much others have cited their work. Bibliometric measures, such as Hirsch's (2005) famous h-index, are now routinely calculated from these data and used to assess the performance of individual researchers, universities and journals. Recruiting, promotion, tenure, employer choice and funding decisions are increasingly influenced by these measures.

As economists coming fresh into bibliometrics, we were surprised how little attention has been given to the logically prior conceptual issue of what the theoretically ideal measure would look like. Citations are clearly only of interest as an observable indicator for a latent, but more important, concept of "scholarly influence" (more often called "scientific impact" in the literature, but we prefer our more modest term). Yet it is rare to find any explicit discussion in this literature of how one expects citations to reflect scholarly influence. And without being explicit about that relationship it is hard to understand what exactly the h-index (or the variations on it since Hirsch's paper) is measuring.

This paper proposes a new class of measures of the impact of a publication record.
Our point of departure from past work is that we build the bibliometric measure on a theoretical "influence function" that embodies our priors about how citations reflect influence. The assumed properties of this function then determine the bibliometric measure, thus making the measure's conceptual foundation transparent.[2] While the properties we postulate for the influence function appear to be quite natural, we find that widely-used bibliometric measures have one or more properties that are inconsistent with those assumptions. We argue that these are counter-intuitive properties for a measure of the overall impact of a publication record.

We also recognize explicitly that reasonable people need not agree on the properties of the influence function. Then the question arises as to whether one can derive robust orderings of two or more publication records without knowing the precise way in which citations reflect influence. Adapting ideas from the economic theory of stochastic dominance,[3] we provide criteria for robust qualitative comparisons of publication records, based on our assumptions about the theoretical properties of the influence function.

[2] The idea that measures of performance should be built on a precise formulation of the objective function is not, of course, new. The most prominent example we know of has been in a strand of the economics literature on the measurement of income inequality, in which the measure is defined as the loss of aggregate social welfare due to inequality, where social welfare is the sum of individual "utilities," each of which is a stable function of own income; the utility function is taken to be unobserved, and so a matter for prior judgment. The properties of the inequality measure are thus derived from the prior (ethical) assumptions made about social welfare. Dalton (1920) first suggested this approach. An influential formalization and development was provided by Atkinson (1970).
The arguments for and against the h-index, as reviewed in the next section, provide a motivation for our approach. We then present our approach in an intuitive (largely non-mathematical) way, before providing a more formal exposition. This is followed by examples.

For and against the h-index

We define a publication record as a list of the publications by a given author (or other entity being compared). Each element of this record has its own citation count. The list of the citations received, ranked in descending order of citations, can be called the citation profile. These are the data provided by standard sources such as SCOPUS, the Social Science Citation Index (SSCI), and Harzing's Publish or Perish (POP), which uses Google Scholar.

There are a number of ways one can use these data to assess the scientific impact of a publication record. One way is to calculate the average number of citations per publication. This has the advantage that it measures quality independently of scale (not penalizing small university departments, for example). But this reflects a disadvantage too: a researcher or institution that published just one well-cited paper could hardly be considered very productive.

A consensus appears to have emerged in the biological, physical and social sciences that the Hirsch (2005) index (the h-index hereafter) is a useful comprehensive measure of the scientific impact of a publication record, based on its citation profile. An h-index of x means that x is the highest rank in the citation profile such that the first x items each received at least x citations. Hirsch argues that this index is a robust and relevant measure of "the importance, significance and broad impact of a scientist's cumulative research contributions." Hirsch's paper has been widely cited (1200 times in Google Scholar), and the h-index has its own (substantial) Wikipedia entry.
The index has also been popularized by citation software such as POP, Scopus and Web of Science (which implemented the h-index within two years of the publication of Hirsch's paper). Alonso et al. (2009) provide a useful review of the literature on the h-index, covering some 90 papers that appeared in just the four years since Hirsch (2005).

[3] The theory of stochastic dominance has been mainly used for comparing risky portfolios and in comparing income distributions in terms of social welfare or poverty. An important early contribution in the context of portfolio choice was Hadar and Russell (1971). Applications to social welfare and the measurement of poverty and inequality include Atkinson (1970, 1987), Dasgupta et al. (1973) and Shorrocks (1983). A difference from past applications of dominance theory (that we know of) is that in the present case one cares about the number of objects being compared (the number of publications) as well as their "results" (citations in our case, returns to an investment or incomes in other applications).

While the h-index has clearly been a major contribution, and appears to have gained wide acceptance, the literature has pointed to some concerns. It is known that the h-index can be deceptive in comparing different scientific disciplines, with different referencing cultures. (Hirsch warned against using his index for such comparisons.) One needs to normalize for these differences, and there have been some proposals, as reviewed in Alonso et al. (2009).

The (often-heard) claims about the "robustness" of the h-index appear to rest on its insensitivity to lowly-cited papers (see, for example, Vanclay, 2007). However, robustness to other differences in citation profiles is more problematic. Woeginger (2008a) shows that adding a publication with above-average cites need not increase the h-index for that record. He gives the example of two records: person A has six papers with citations 1, 2, 3, 4, 5, 6 and B has five papers, with 1, 2, 3, 4, 5.
Nobody could doubt that A has the better record, yet the h-index is 3 for both.

Though we have not seen it done in practice, this type of example can be easily avoided by re-defining the index in a continuous (rather than integer) form. For this purpose, we can define a citation curve as the continuous representation of the citation profile; the citation curve can be obtained by interpolating between the discrete points on the citation profile. On a graph, one simply ranks publications in descending order of citations, plots citations on the vertical axis and the (cumulative) number of publications on the horizontal axis, and then joins the successive points with (say) straight lines. The continuous version of the h-index is then found at the point where the citation curve cuts a 45-degree line from the origin, as illustrated in Figure 1. In the case of Woeginger's example, the continuous h-index is 3.5 for A and 3 for B.

However, even the continuous h-index can violate the seemingly plausible requirement that extra citations for a given set of papers should yield a higher measure of success for a publication record. Consider the shift in the citation curve indicated by the dashed line in Figure 1; despite the higher citations, the h-index is unchanged.

Such observations prompted Egghe (2006) to propose an alternative measure, the g-index, which gives higher weight to more highly cited papers. When a publication record has a g-index of x it means that x is the highest rank such that the top x papers together have at least $x^2$ citations. Egghe argues that his measure better reflects the "visibility" of scientists. (We will return to the g-index.)

This is an instance of a broader set of concerns about how the h-index does, or does not, reflect differences in citation profiles. Two identical profiles will naturally have the same h-index. And if citation curves never touch each other then the higher curve always has a higher continuous h-index.
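The three indices discussed so far can be illustrated with a minimal Python sketch applied to Woeginger's example. The function names are illustrative and do not come from the paper. The continuous h-index assumes straight-line interpolation between the points (rank, citations) with ranks starting at 1, and the return values for curves that never cross the 45-degree line within the record are assumptions of this sketch.

```python
from itertools import accumulate
from typing import List


def h_index(citations: List[int]) -> int:
    """Highest rank x such that the top x papers each have at least x citations."""
    ranked = sorted(citations, reverse=True)
    # (citations - rank) is strictly decreasing, so the qualifying ranks form a prefix.
    return sum(1 for rank, cites in enumerate(ranked, start=1) if cites >= rank)


def continuous_h_index(citations: List[int]) -> float:
    """Point where the piecewise-linear citation curve meets the 45-degree line."""
    ranked = sorted(citations, reverse=True)
    gaps = [cites - rank for rank, cites in enumerate(ranked, start=1)]
    if not gaps or gaps[0] < 0:
        return 0.0  # assumption: the curve starts below the 45-degree line
    for k in range(1, len(gaps)):
        above, below = gaps[k - 1], gaps[k]  # gaps at ranks k and k + 1
        if below < 0:
            # Linear interpolation between ranks k and k + 1.
            return k + above / (above - below)
    return float(len(gaps))  # assumption: no crossing within the record


def g_index(citations: List[int]) -> int:
    """Highest rank x such that the top x papers together have at least x^2 citations."""
    ranked = sorted(citations, reverse=True)
    totals = accumulate(ranked)
    return sum(1 for rank, total in enumerate(totals, start=1) if total >= rank * rank)


record_a = [1, 2, 3, 4, 5, 6]  # Woeginger's person A
record_b = [1, 2, 3, 4, 5]     # Woeginger's person B
print(h_index(record_a), h_index(record_b))                        # 3 3
print(continuous_h_index(record_a), continuous_h_index(record_b))  # 3.5 3.0
print(g_index(record_a), g_index(record_b))                        # 4 3
```

The integer h-index cannot separate the two records, while the continuous version and the g-index both rank A above B.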
However, intersecting citation curves may well be common, given that the way citation counts vary across papers can differ greatly between authors. In particular, the density of citations at any point (including at the value taken by the h-index), and hence the slope of the citation curve at that point, will vary. One author might have a very steep curve, with a high concentration of citations amongst a small number of publications, while the curve is relatively flat for another author, with a more even profile of the citations received. There are infinitely many citation curves consistent with a given h-index.

An alternative approach

We postulate the existence of an influence function, which gives the degree of scholarly influence implied by any given level of citations. So the influence function can be thought of as a valuation function for citations. Each publication has its own influence, as reflected in its own citations. The properties of the influence function determine the bibliometric measure.

We assume that the aggregate influence of a publication record is the sum of the influences of its constituent publications, each of which depends solely on its own citations. This additivity property is not beyond question. For example, it might be conjectured that the marginal influence of extra citations for a given paper may be lower for an author with many other well-cited papers than for an author with very few other cited papers. There is scope for relaxing additivity in the following analysis, to allow for a more general measure of aggregate influence.[4] However, this complicates the analysis considerably, without much obvious gain. We will stick to the simple additive idea of aggregate influence.

While the influence function is a theoretical concept (in that it cannot be directly observed) we can postulate certain properties that appear plausible on a priori grounds.
We shall assume that the influence function is stable across the units being compared. In other words, the same level of citations implies the same level of influence in each of the publication records. This assumption is implicit in past citation comparisons, such as those using the h-index. Here we will only consider comparisons within a given discipline. However, stability can in principle be relaxed by introducing an explicit scaling factor for inter-disciplinary differences in the influence function.

[4] The analysis for a non-additive but increasing and quasi-concave (or, more generally, Schur-concave) aggregate influence function would have a number of formal similarities with the analysis in Dasgupta et al. (1973), in the context of measuring income inequality.

A second assumption about the influence function is monotonicity, meaning that the function is strictly increasing, in that a higher citation count for a publication implies greater influence. This is a natural assumption (though, as we have noted, the h-index need not order records consistently with that assumption).

We will say that there is citation-curve dominance when two citation curves do not cross: one curve is somewhere above the other and nowhere below. When combined with stability, monotonicity implies that a robust comparison of aggregate influence can be deduced from citation-curve dominance. The publication record with the higher citation curve will have had greater aggregate influence, as assessed by any stable increasing influence function.

However, this does not get us far given that citation curves will often intersect. We can still draw conclusions about the aggregate influence of the publication record, but only by making stronger assumptions about the influence function. A powerful, but potentially contentious, additional assumption is diminishing marginal influence, whereby the first citation to a given publication has the highest impact, followed by the second, and so on.
This is equivalent to assuming that the influence function is concave.

A simple example of a measure of the overall influence of a publication record satisfying these three assumptions--stability, monotonicity, and concavity--is the quadratic-influence index (qi-index), which we define more precisely in the next section. This is a special case of the general class of indices satisfying our assumptions; the key characteristic of this special case is that the marginal influence of extra citations is linear in citations.

We show in the next section that a concave influence function (along with other technical assumptions) implies robust orderings in terms of aggregate influence if the area under the inverse citation curve of one record is everywhere greater than that of the other.[5] Our empirical example later will illustrate how powerful this assumption can be in resolving otherwise ambiguous rankings.

[5] The inverse citation curve is obtained by flipping the ordinary citation curve, that is, by swapping the axes.

The assumption of diminishing marginal influence does not hold for the aforementioned g-index, proposed by Egghe (2006) as an alternative to the h-index. Woeginger (2008b) provides an axiomatic derivation of the g-index, in which one of the axioms is essentially the opposite assumption, namely rising marginal influence. As Woeginger notes, the g-index implies a preference for inequality, in that it rates higher a publication record with a more unequal citation profile (in the sense that citations are transferred from publications with low cites to those with higher ones at the same mean). However, we do not think this is a plausible property. The gain in influence indicated by the first citation received by a publication that was previously ignored must surely be larger than the gain in influence that is implied by an extra citation received by the most cited, and hence most influential, paper in a discipline.
To put the point another way, imagine two publication records, each containing two publications. In A's record, both papers received 50 citations, while in B's, one paper received 100 citations and the other paper received none. Only one of B's papers is known to have had any influence, while both of A's have demonstrated influence. The concave influence function we postulate here implies that A's record has been more influential, while a convex influence function implies that B's record is the more influential.

Egghe's motivation for the g-index is to reward highly-cited papers, which may carry little or no weight in the h-index, as in Figure 1. However, the approach we take here allows us to address Egghe's concern without building in the (seemingly counter-intuitive) preference for inequality of citations. We can allow that higher citations always imply higher influence (arguably the main feature Egghe wants) but that they do so at a declining rate. In short, we interpret the core motivation for Egghe as avoiding the potential violations of monotonicity that can arise using the h-index, not a desire for a convex influence function per se.

So far we have focused on the aggregate influence of a publication record. One might well argue that average influence is also relevant to assessing a record, as an indicator of the average quality of the papers. In comparisons of journal quality, for example, it is not obvious that one solely wants a measure of aggregate influence, given the often large variation between journals in the number of articles they publish per year and in the number of years they have been publishing. We can assess average quality by normalizing the citation curve by the total number of publications (so that the horizontal axis gives percentiles of total publications).
We can define a normalized h-index, whereby the highest x% of publications in terms of citations received at least x citations.[6] We can also show that dominance in the normalized curve implies that the higher curve has higher influence per publication, for any influence function satisfying our core assumptions. Similarly, if the normalized curves intersect, one may still be able to make a robust comparison for the subset of influence functions exhibiting diminishing marginal influence.

This overview has asserted a number of claims that require a more precise mathematical demonstration. Next we outline our assumptions and define the qi-index, after which we provide the dominance criteria for robust orderings. We then turn to the empirical examples.

[6] This should not be confused with the h-index per publication, which is also called the "normalized h-index" in some of the literature (Alonso et al., 2009).

A more formal exposition

Let the influence of c citations for a given publication be $I(c)$. (For analytic convenience we treat citations, c, as a continuous variable.) We make the following assumptions throughout:

Core assumptions: The influence function, $I(c)$ for $c$ in $[0, c^{\max}]$, is: (i) stable across the units being compared;[7] (ii) continuous and differentiable in $[0, c^{\max}]$; (iii) monotonically increasing, i.e., $I'(c) > 0$ for all $c$ in $[0, c^{\max}]$;[8] and (iv) normalized such that $I(0) = 0$ and $I(c^{\max}) = 1$.

The core assumptions describe a general class of influence functions. When citation curves intersect, we are also interested in seeing whether robust comparisons are possible for a subset of influence functions satisfying the following additional assumptions:

Extra assumptions: The influence function is also: (i) strictly concave, i.e., $I''(c) < 0$ for all $c$ in $[0, c^{\max}]$; and (ii) with marginal influence going to zero in the limit as $c^{\max}$ (also written $c^*$) is reached, i.e., $I'(c^{\max}) = 0$.[9]
[7] The following results can be modified to allow for a multiplicative scaling factor, $\lambda_i$, so that the influence function for records of type i is $\lambda_i I(c)$.

[8] This can be weakened to allow $I'(c) = 0$ for some c.

[9] Condition (ii) is not essential for our main results, though it simplifies things, and by allowing a sufficiently large $c^{\max}$ it does not seem unduly restrictive to set $I'(c^{\max}) = 0$.

The overall influence of a record is taken to be the sum of the influences across all publications in that record. The aggregate influence of the i'th record is then given by:

\[ I_i = N_i \int_0^{c^{\max}} I(c)\, f_i(c)\, dc \tag{1} \]

Here we let $N_i$ denote the number of publications in i's record (treated as continuous) while $f_i(c)$ denotes the (continuous) density function of citations. Note that the marginal influence function, $I'(c)$, can be interpreted as the weight attached to c citations. Setting $I(0) = 0$ and $I(c^{\max}) = 1$ (condition (iv) in the core assumptions) is equivalent to normalizing the weights to sum to unity:

\[ \int_0^{c^{\max}} I'(c)\, dc = 1 \tag{2} \]

To help interpret (1) it is useful to re-write it in the following form:

\[ I_i = (1 - D_i)\, N_i \tag{3} \]

Here $D_i$ can be thought of as a discount factor, and the aggregate influence is essentially a discounted measure of the number of publications.[10] After integrating by parts and re-arranging, we can write the discount factor as:

\[ D_i = \int_0^{c^{\max}} I'(c)\, F_i(c)\, dc \tag{4} \]

in which $F_i(c)$ is the cumulative distribution function for citations (the proportion of publications receiving no more than c citations). Thus the discount factor is a weighted mean of the points on the cumulative distribution function of citations. A weaker publication record, in the sense of having a larger proportion of lowly cited papers, will be discounted more heavily. At one extreme, none of the publications get any citations, and so we have $F_i(c) = 1$ for all $c$ in $[0, c^{\max}]$. Then $D_i = 1$ and aggregate impact is zero.
At the other extreme, the aggregate influence of a record in which every paper attained $c^{\max}$ citations would simply be the number of publications, which is the maximum value of $I_i$ across all possible records of that size.

[10] D has some similarity to a measure of inequality, but note that (unlike an inequality measure) D does not go to zero when all publications have the same citations unless that is at the level $c^{\max}$.

The qi-index: An example of an influence function satisfying both our core and extra assumptions is the quadratic influence function: $I(c) = (2 - c/c^{\max})\, c/c^{\max}$. (The marginal influence function is: $I'(c) = (1 - c/c^{\max})\, 2/c^{\max}$.) Inserting this into (1) we obtain our qi-index, for which the discount factor takes the form:

\[ D_i = \frac{2}{c^{\max}} \int_0^{c^{\max}} \left(1 - \frac{c}{c^{\max}}\right) F_i(c)\, dc \tag{5} \]

In other words, the weights are the proportionate deficits relative to maximum citations. On integrating by parts a second time, we obtain a very simple formula for the qi-index:

\[ I_i = \frac{2 N_i}{(c^{\max})^2} \int_0^{c^{\max}} G_i(c)\, dc \tag{6} \]

where:

\[ G_i(c) = \int_0^{c} [1 - F_i(x)]\, dx \tag{7} \]

To further interpret (6), notice that $N_i [1 - F_i(c)]$ is the number of papers that received more than c citations, and that this is the inverse of the citation curve, $c_i = c_i(x)$. (The continuous h-index is the solution to $h_i = c_i(h_i)$.) $N_i G_i(c)$ is the cumulative of the inverse citation curve. Thus the qi-index is twice the total area under this cumulative curve, normalized by $(c^{\max})^2$.

Similarly, if we normalize the citation curve by the total number of publications (so that the horizontal axis gives the percentile of publications rather than the number), we can readily calculate the cumulative values of the inverse of this normalized citation curve. For the quadratic influence function, average influence is directly proportional to the total area under this curve.
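As a rough illustration, the qi-index can be computed from a list of citation counts in two ways: by summing the quadratic influence of each paper, and by the area formula (6). The Python sketch below is a minimal version under stated assumptions. The function names are illustrative, citations are treated as discrete counts rather than a continuous density, and the value c_max = 100 in the usage example is hypothetical. For discrete counts, $N_i G_i(c)$ equals the sum over papers of min(c_j, c), so the integral in (6) has the closed form used in the second function.

```python
from typing import List


def quadratic_influence(c: float, c_max: float) -> float:
    """Quadratic influence function I(c) = (2 - c/c_max) * c/c_max."""
    share = c / c_max
    return (2.0 - share) * share


def qi_index(citations: List[float], c_max: float) -> float:
    """Aggregate influence: the sum of I(c) over all papers in the record."""
    if any(c < 0 or c > c_max for c in citations):
        raise ValueError("citations must lie in [0, c_max]")
    return sum(quadratic_influence(c, c_max) for c in citations)


def qi_index_from_area(citations: List[float], c_max: float) -> float:
    """The same index via the area formula (6).

    For discrete counts, N * G(c) = sum_j min(c_j, c), and
    integral_0^c_max min(c_j, c) dc = c_j * c_max - c_j**2 / 2.
    """
    area = sum(c * c_max - 0.5 * c * c for c in citations)
    return 2.0 * area / (c_max ** 2)


# The two-paper records from the text; c_max = 100 is a hypothetical choice.
record_a = [50, 50]
record_b = [100, 0]
print(qi_index(record_a, 100), qi_index(record_b, 100))  # 1.5 1.0
print(qi_index_from_area(record_a, 100))                 # 1.5
```

Under the quadratic (concave) influence function, the evenly cited record A scores higher than the concentrated record B, in line with the discussion of diminishing marginal influence.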
The qi-index can be encompassed within a larger class of parametric influence functions:

\[ I(c) = (2 - \theta\, c/c^{\max})\, c/c^{\max} \qquad (0 \le \theta \le 1) \tag{8} \]

This relaxes one of our assumptions, by allowing the possibility that $I'(c^{\max}) = 2(1 - \theta)/c^{\max} > 0$ (for $\theta < 1$). Also, by setting $\theta < 1$ one can adjust the curvature of the influence function (making the function "more linear" as $\theta$ falls). Thus $\theta$ parameterizes how rapidly marginal influence declines as the level of citations rises. At the other extreme to the qi-index ($\theta = 1$), the linear version ($\theta = 0$) is equivalent to measuring aggregate influence by simply counting total citations. Average influence is then measured by average citations per article.

Dominance tests: Given that any precise functional form for the influence function involves an arbitrary choice, it is of interest to ask how far we can get in ranking publication records using only our theoretical assumptions. Invoking the core assumptions, we have:

First-order citation dominance: Comparing the records of A and B, if $N_A [1 - F_A(c)] \ge N_B [1 - F_B(c)]$ for all $c$ in $[0, c^{\max}]$, with strict inequality for some c, then A has higher aggregate influence than B for any influence function satisfying our core assumptions.

This follows from the fact that (on integrating (1) by parts):

\[ I_i = N_i \int_0^{c^{\max}} I'(c)\, [1 - F_i(c)]\, dc \tag{9} \]

Note that the normalized citation curve can be used to rank records in terms of average influence per publication for any influence function satisfying our assumptions. (This follows from (9).) On also noting that the normalized citation curve--c plotted against $1 - F_i(c)$--is the mirror image of the inverse of the distribution function, $F_i(c)$, it is plain that if the normalized citation curve for publication record A is everywhere above that for B then influence per publication is higher for A for any influence function satisfying our core assumptions.
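The first-order condition can be checked directly from citation counts. The sketch below is a minimal Python version, assuming discrete counts; the function names are illustrative rather than taken from the paper. Because the number of papers with more than c citations is a step function that changes only at observed citation values, it is enough to compare the two records at zero and at each observed value.

```python
from typing import List


def papers_above(citations: List[int], c: float) -> int:
    """N * [1 - F(c)]: the number of papers with more than c citations."""
    return sum(1 for x in citations if x > c)


def first_order_dominates(a: List[int], b: List[int]) -> bool:
    """True if record a first-order dominates record b.

    Requires papers_above(a, c) >= papers_above(b, c) at every threshold c,
    with strict inequality at some threshold.
    """
    thresholds = sorted({0, *a, *b})
    diffs = [papers_above(a, c) - papers_above(b, c) for c in thresholds]
    return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)


# Woeginger's records: A dominates B although both have an h-index of 3.
print(first_order_dominates([1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5]))  # True

# The two-paper records from the text: the curves cross, so neither dominates.
print(first_order_dominates([50, 50], [100, 0]))  # False
print(first_order_dominates([100, 0], [50, 50]))  # False
```

When the function returns False in both directions, the ranking depends on the form of the influence function, which is the case the extra assumptions are meant to address.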
If the citation curves intersect then a robust comparison in terms of aggregate influence might still be possible by invoking our extra assumptions, using the following result:

Second-order dominance: Under the core and extra assumptions, if $N_A G_A(c) \ge N_B G_B(c)$ for all $c$ in $[0, c^{\max}]$, with a strict inequality somewhere, then we can still conclude that record A has higher aggregate influence than B.

To verify this claim, we simply integrate (9) by parts, giving:[11]

\[ I_i = -N_i \int_0^{c^{\max}} I''(c)\, G_i(c)\, dc \tag{10} \]

[11] Note that $G_i(0) = 0$ and that $I'(c^{\max}) = 0$ (under the extra assumptions). However, our second-order dominance condition also holds for $I'(c^{\max}) > 0$.

Note that, if the normalized citation curves intersect we may still be able to establish a robust ordering in terms of average influence by imposing declining marginal influence. By a similar argument, the corresponding second-order dominance condition for this case is that $G_A(c) \ge G_B(c)$ for all c, with strict inequality for some c. If this holds then one can conclude that the publication record of A has greater average influence per publication for any influence function satisfying both our core and extra assumptions.

Empirical examples

We illustrate these ideas using comparisons of the economics publication records of selected universities and the World Bank. We rely on Google Scholar for citation counts. Compared to other bibliographic databases, Google Scholar casts a broad net, including citations by books, working papers, conference proceedings, open-access journals, and new and less well-established journals. Google Scholar is also more "global" in its reach, as it includes research outputs from everywhere in the world and in all languages. Google Scholar is also timelier than the bibliographic databases. No time limit was set for the data on citations, since all are old institutions; in practice the records we use date back to 1964, but only five percent of them are before 1980.
We searched in SCOPUS for articles where any author was affiliated with one of the institutions we consider; multiple records are therefore possible in the dataset where collaboration has taken place across two or more of the institutions. Only articles in the discipline "Economics, Econometrics and Finance" were included in our analysis. Google Scholar citation data were then obtained for each article in our database.

Table 1 gives the summary statistics on citations for Berkeley, Harvard, London School of Economics (LSE), Oxford, Princeton and Yale, plus the World Bank. The h-index ranking is Harvard (first), Berkeley, World Bank, Princeton, Yale, LSE, and Oxford. This is exactly the same ranking as our qi-index and the two series have a correlation coefficient of 0.98. The ranking in terms of mean citations per article is Harvard, Berkeley, Princeton, World Bank, Yale, LSE, Oxford, which is similar to the ranking in terms of our normalized h-index.

However, not all these rankings are robust to allowing all possible influence functions satisfying our core assumptions. Figure 2 shows the citation curves; panel (b) enlarges the area around the origin to show more clearly the relative positions of the curves. There are a number of intersections, although some robust comparisons are still possible. For any increasing influence function satisfying our core assumptions we can robustly claim that both Berkeley and Harvard have had more influence in the field of economics than Princeton, Oxford or LSE. The rankings of Harvard, Berkeley, Yale and the World Bank are not so robust, as they depend on the precise form of the influence function. A function that gives a sufficiently high weight to high citations would favor Berkeley, Yale and the World Bank over Harvard, while this reverses for measures that put lower weight on high citations.
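When citation curves intersect, the second-order condition from the formal exposition can be checked from the cumulative area under the inverse citation curve. The following Python sketch is a minimal illustration on hypothetical records, not the code used for the figures. It assumes discrete citation counts and non-empty records, and the function names are illustrative. For discrete counts, $N_i G_i(c)$ is the sum over papers of min(c_j, c), which is piecewise linear in c with kinks only at observed citation values, so comparing the two records at those values (and at zero) is sufficient.

```python
from typing import List


def cumulative_area(citations: List[int], c: float, per_paper: bool = False) -> float:
    """N * G(c) = sum_j min(c_j, c); divided by N when per_paper is True."""
    total = float(sum(min(x, c) for x in citations))
    return total / len(citations) if per_paper else total


def second_order_dominates(a: List[int], b: List[int], per_paper: bool = False) -> bool:
    """True if record a second-order dominates record b.

    Compares cumulative areas at every kink point; per_paper=True gives the
    test for average influence per publication instead of aggregate influence.
    """
    kinks = sorted({0, *a, *b})
    diffs = [
        cumulative_area(a, c, per_paper) - cumulative_area(b, c, per_paper)
        for c in kinks
    ]
    return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)


# Hypothetical two-paper records whose citation curves cross.
even_record = [50, 50]
concentrated_record = [100, 0]
print(second_order_dominates(even_record, concentrated_record))  # True
print(second_order_dominates(concentrated_record, even_record))  # False
```

The evenly cited record dominates at second order even though neither record dominates at first order, which is the kind of ambiguity that the assumption of declining marginal influence resolves.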
We can go a long way toward resolving the rankings if we confine attention to influence functions that exhibit declining marginal influence. Figure 3 gives the cumulative areas under the (inverse) citation curves, to test for second-order dominance. The curves are indistinguishable at low levels but fan out markedly above that. The overall ranking that emerges is exactly the same as that implied by the h-index and the qi-index (Table 1).

When we calculated the normalized citation curves we also found that robust comparisons of average influence per publication are elusive over all possible influence functions satisfying our core assumptions. However, the ranking is clearer when one tests for second-order dominance, which is given in Figure 4. The ranking is Harvard, Princeton, Berkeley, World Bank, Yale, LSE, Oxford. This is very close to the ranking in terms of mean citations in Table 1 (noting that Berkeley is very close to Princeton in this dimension).

Conclusions

We have proposed an approach to bibliometric measurement that has a clearer conceptual foundation than past, essentially ad hoc, approaches. The core of the idea is an "influence function," representing how citations reflect scholarly influence. Only by building bibliometric measurement on an explicit influence function can we be sure that the measure is consistent with our priors about how citations reflect influence. However, the function is taken to be a theoretical concept, which (unlike citations) cannot be observed.

We have outlined what we believe to be plausible assumptions about the influence function. Rankings of publication records using standard bibliometric measures, such as average citations, the h-index and the g-index, need not conform to the rankings implied by our theoretically ideal measure. We have proposed a simple bibliometric measure satisfying all our assumptions, namely the qi-index.
We have also shown that, depending on the assumptions made about the influence function, robust qualitative comparisons can be derived by exploiting ideas from the theory of stochastic dominance. All the proposed tests use readily available data on the citation profile of publication records.

We have illustrated these ideas using data on the citation records in economics of six prominent universities and the World Bank. We find that the h-index rankings of aggregate influence are not robust to allowing any stable influence function that rewards higher citations. However, more robust rankings emerge if one also assumes that the marginal influence of extra citations declines with the level of citations. This turns out to be a powerful assumption in assessing publication records. Rankings in terms of average influence per publication based on mean citations are similarly sensitive to the underlying weights, though robust rankings emerge if one imposes declining marginal influence.

While the h-index has questionable theoretical properties (as the literature has noted), it is striking that it ranks institutions in our empirical example identically to our theoretically ideal measure. And, across institutions, the cardinal values of the h-index turn out to be very highly correlated with our qi-index. For this empirical application at least, the rankings based on the h-index are reliable. Our theoretically ideal rankings of average influence also turn out to be very close to those based on mean citations per paper. It remains to be seen whether these two standard bibliometric measures perform as well in other applications.
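As an illustration of the second-order dominance tests summarized above, the following is a minimal sketch in Python. It relies on the observation that the area under the inverse citation curve up to a threshold of c citations equals the sum, over papers, of the smaller of each paper's citation count and c. The function names and the two citation records are hypothetical; the sketch assumes non-empty records with integer citation counts and is not the code used for Figures 3 and 4.

```python
from __future__ import annotations

from typing import Sequence


def cumulative_area(citations: Sequence[int], threshold: int,
                    normalize: bool = False) -> float:
    """Area under the inverse citation curve from 0 up to `threshold`.

    Each paper contributes min(its citations, threshold). With
    normalize=True the area is divided by the number of papers, which
    corresponds to the test for average influence per publication.
    """
    area = sum(min(c, threshold) for c in citations)
    return area / len(citations) if normalize else float(area)


def second_order_dominates(a: Sequence[int], b: Sequence[int],
                           normalize: bool = False) -> bool:
    """True if the cumulative area for `a` is never below that for `b`
    at any citation threshold, and is strictly larger at some threshold.

    Integer thresholds suffice because both area functions are
    piecewise linear with kinks only at integer citation counts.
    """
    top = max(max(a), max(b))
    diffs = [
        cumulative_area(a, t, normalize) - cumulative_area(b, t, normalize)
        for t in range(1, top + 1)
    ]
    return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)


# Hypothetical citation records whose citation curves cross.
record_a = [50, 30, 12, 8, 3]
record_c = [90, 5, 4, 1]

print(second_order_dominates(record_a, record_c))                  # aggregate
print(second_order_dominates(record_a, record_c, normalize=True))  # average
print(second_order_dominates(record_c, record_a, normalize=True))  # average
```

In this toy example the two records cannot be ranked without restricting the influence function, yet record_a dominates record_c in aggregate once declining marginal influence is imposed. The per-publication comparison remains ambiguous in both directions, which illustrates why aggregate and average influence need to be tested separately.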
Table 1: Summary data on citations to economics papers for selected universities and World Bank

Institution                    N      Median      Mean        h-index   Normalized   qi-index
                                      citations   citations             h-index
Harvard University             2191   27          96.16       240       41.0         66.6
London School of Economics     1634   13          43.59       124       30.2         22.8
Oxford University              1117   11          33.45       94        28.2         12.1
Princeton University           1299   26          78.09       165       40.5         32.6
Univ. California, Berkeley     2612   22          78.13       206       37.0         62.5
World Bank                     2448   21          67.50       189       37.7         52.2
Yale University                1470   17          65.33       140       34.4         29.8

Notes: Only articles indexed in SCOPUS and in "Economics, Econometrics and Finance" were included. Multi-disciplinary articles were excluded. Bibliographic metadata were collected from SCOPUS in May-June 2010, and Google Scholar citation data were collected in June 2010. The qi-index uses c*=6,063, the highest citation count for any paper.

Figure 1: Citation curves and the continuous h-index
[Figure not reproduced. Vertical axis: No. citations (c). Horizontal axis: No. papers ranked by decreasing citations (x). Labels in the figure: c = x, h.]

Figure 2: Citation curves for various universities and the World Bank (WB)
[Figure not reproduced. Panel (a): Full curves. Panel (b): Blow up of lower segment. Vertical axis: citations. Horizontal axis: cumul. number of articles (most-cited first). Series: Harvard, Berkeley, LSE, Oxford, Princeton, WB, Yale.]

Figure 3: Second-order dominance test for aggregate influence
[Figure not reproduced. Vertical axis: cumul. area under inverted citation curve. Horizontal axis: citations. Series: Harvard, Berkeley, LSE, Oxford, Princeton, WB, Yale.]

Figure 4: Second-order dominance test for average influence per publication
[Figure not reproduced. Vertical axis: cumul. area under normalized inverted citation curve. Horizontal axis: citations. Series: Harvard, Berkeley, LSE, Oxford, Princeton, WB, Yale.]

References

Alonso, S., F.J. Cabrerizo, E. Herrera-Viedma, F. Herrera. 2009. "h-Index: A Review Focused in its Variants, Computation and Standardization for Different Scientific Fields," Journal of Informetrics 3: 273-289.

Atkinson, Anthony B. 1970. "On the Measurement of Inequality," Journal of Economic Theory 2: 244-263.

Atkinson, Anthony B. 1987. "On the Measurement of Poverty," Econometrica 55: 749-64.

Dalton, Hugh. 1920. "The Measurement of the Inequality of Incomes," Economic Journal 30: 348-361.

Dasgupta, Partha, Amartya Sen and David Starrett. 1973. "Notes on the Measurement of Inequality," Journal of Economic Theory 6: 180-187.

Egghe, Leo. 2006. "Theory and Practice of the g-index," Scientometrics 69(1): 131-152.

Hadar, Josef and William R. Russell. 1971. "Stochastic Dominance and Diversification," Journal of Economic Theory 3: 288-305.

Hirsch, Jorge E. 2005. "An Index to Quantify an Individual's Scientific Research Output," Proceedings of the National Academy of Sciences 102(46): 16569-16572.

Shorrocks, Anthony F. 1983. "Ranking Income Distributions," Economica 50: 3-17.

Vanclay, Jerome. 2007. "On the Robustness of the h-index," Journal of the American Society for Information Science and Technology 58(10): 1547-1550.

Woeginger, Gerhard J. 2008a. "An Axiomatic Characterization of the Hirsch-Index," Mathematical Social Sciences 56: 224-232.

Woeginger, Gerhard J. 2008b. "An Axiomatic Analysis of Egghe's g-index," Journal of Informetrics 2: 364-368.