90723
International Comparison Program [01.03]
PRELIMINARY AND INCOMPLETE

Calibrating measurement uncertainty in purchasing power parity exchange rates

Angus Deaton
Research Program in Development Studies
Princeton University
Draft version, April 2012

7th Technical Advisory Group Meeting
September 17-18, 2012
Washington DC

1. Introduction

As is the case for all price indexes, the raw material of PPP exchange rates is a set of prices and expenditure shares. Relative prices are not the same in all countries, so some averaging is required; even if we know just what sort of average we want, its estimation is subject to uncertainty that increases with the variability of relative prices. There is also a choice over aggregation procedures, and there are many different price index formulas that can be used to turn the price and share information into a set of PPPs, one for each country. The multiplicity of indexes is a familiar issue in price indexes (the choice of Paasche, Laspeyres, Fisher, Törnqvist, or some other), but in the PPP case there is also the choice of the multilateral aggregation formula, such as Gini EKS, Geary-Khamis, weighted CPD, or Iklé.

Typically, a particular set of PPPs, whether from the ICP tables, the Penn World Table, or Eurostat, selects an aggregation formula and presents a set of estimates based on that formula. However, the choice of aggregation formula has some degree of arbitrariness, and when expenditure patterns and/or relative prices differ from one country to another, the choice may matter. While there are certainly arguments for and against each aggregation formula, these are not usually decisive, so that any specific choice resolves the inherent uncertainty in a more or less arbitrary way.
For example, if we need to calculate a PPP for Canada relative to the US, or a set of PPPs for the countries of the EEC, cases where expenditure patterns and relative prices are quite similar, the choice of aggregation formula matters much less to the results than would be the case if we were calculating PPPs for the whole world, or even for Africa, or for the countries of the OECD, which include places as diverse as Mexico and Japan.

As will be shown, the choice of aggregation formula among those considered here is not as important in generating uncertainty as the variation that comes from the sampling of goods. Within each country, prices of different commodities vary more or less around the PPP average. In comparisons of countries with similar structure and similar relative prices, such as the US and Canada, or France and Belgium, the price relatives of individual goods are relatively narrowly distributed around the PPP. In such cases, the sampling of goods makes little difference, and the average price relative, weighted appropriately to yield the PPP, is relatively precisely determined. But for countries across which relative prices are very different, the sampling of goods matters, and the PPP itself is much more uncertain. This source of variation induces substantial uncertainty into the price indexes.

In this paper, I propose a method for calculating standard errors for PPPs that takes into account the uncertainty that comes from the fact that relative prices and expenditure patterns differ from country to country, and I do so for a number of PPP formulas. The basic ideas were first laid out in a working paper by the author and Olivier Dupriez, and in Deaton and Dupriez (2011) they were applied to the calculation of poverty-weighted purchasing power parity exchange rates. The ideas are closely related to, but different from and differently motivated than, the procedures in Hill and Timmer (2006).

The paper is laid out as follows.
I start, in Section 2, with a simple example that is intended to establish notation and clarify the basic ideas. Section 3 lays out the formulas for bilateral index numbers where each country is compared with the US as base. Section 4 extends the analysis to multilateral indexes. All sections are illustrated using data from ICP 2005.

2. Preliminaries

Consider a “star” world in which we are making direct comparisons between the US and other countries, one at a time. For countries that are “similar” to the US, like Canada or Ireland, there is less uncertainty about relative prices than for countries that are “different,” like Tajikistan and Chad. For example, over the 128 basic headings in the 2005 ICP, the standard deviation of the log of the price relatives for Canada relative to the US is 0.25. For China, the standard deviation is 0.77, for India it is 0.81, and for Tajikistan, which is typically the most extreme case, it is 1.35. Relative prices are much more dissimilar between Tajikistan and the US, or China or India and the US, than between Canada and the US.

In the two-country context, differences in price relatives will typically show up in differences between the Paasche and Laspeyres price indexes and, indeed, the absolute value of the log of the ratio of the Laspeyres to the Paasche (the “Paasche-Laspeyres spread”) has often been used precisely as a measure of dissimilarity in relative prices; see, for example, Hill (1999).
For reasons that will become apparent, I shall work with the logarithmic form of the ratio so that, for the record and to establish notation, I write for country $c$ and base country 1, here the US,

$$\ln \rho_{1c} = \ln\left(\sum_{i=1}^{n} s_{i1}\,\frac{p_{ic}}{p_{i1}}\right) + \ln\left(\sum_{i=1}^{n} s_{ic}\,\frac{p_{i1}}{p_{ic}}\right) \tag{1}$$

where the first subscript is always the base country for any comparison, $\rho$ is the Paasche-Laspeyres spread, $s_{ic}$ is the share of good $i$ in total expenditure of country $c$, $p_{ic}$ is the price of good $i$ in country $c$, and there are $n$ goods. For total domestic absorption in ICP 2005, the ratio $\rho$ runs from 1.048 for Ireland and 1.057 for Canada through 1.61 and 1.66 for India and China to 5.11 and 9.62 for Tajikistan.

Another way of approaching price differences is via what is known as the country-product-dummy (CPD) specification. According to this, first introduced by Summers (1973) as an approximation useful for imputing missing values, the logarithm of each price is written as the sum of a commodity term, a country term, and a residual. Hence, we have

$$\ln p_{ic} = \alpha_c + \beta_i + \varepsilon_{ic} \tag{2}$$

where $\alpha_c$ is a country effect, $\beta_i$ is a commodity effect, and $\varepsilon_{ic}$ is a residual. I take (2) to be exact, and to define the parameters $\alpha$ and $\beta$ as projections of the log prices on a set of country and commodity dummies.

If there were no trade costs and all goods were tradable and freely traded between countries, so that the law of one price held, the residual terms in (2) would be zero. The quantities $\alpha_c$ would be the logarithm of the exchange rate for $c$ relative to the base country 1 (which is the omitted country in the projection regression) and the quantities $\beta_i$ the logarithm of the price of good $i$ relative to a numeraire good. These relative prices are identical in all countries so that, for example, the price of cows’ milk cheese is the same in Paris and Cambodia once we have deflated by the Cambodian to French exchange rate.
When this is true, all reasonable price index formulas will give the same result, and all reasonable PPP formulas will deliver the market exchange rates as PPPs.

When the law of one price fails, the residual terms $\varepsilon_{ic}$ measure deviations from the law of one price, or the deviations of actual prices from the prices that would obtain under perfect arbitrage with exchange rates given by PPPs. Given this interpretation, one way of assessing the uncertainty associated with various PPPs is to examine the variation in calculated PPPs that is driven by random draws from the distribution of these deviations. The randomness here is perhaps best thought of as driven by the choice of different goods by the ICP. In what follows, I make these calculations conditional on the expenditure patterns $s_{ic}$. An obvious alternative would be to think about a model of expenditure determination in which demand patterns are thought of as responding to the deviations in prices, though that would require a commitment to substantive modeling beyond the statistical approach of this paper.

To see how this stochastic approach works in a simple example, consider equation (2) for the US and Canada, and subtract the US equation from the Canadian equation to get

$$\ln\left(\frac{p_{id}}{p_{ic}}\right) = \ln p_{id} - \ln p_{ic} = (\alpha_d - \alpha_c) + (\varepsilon_{id} - \varepsilon_{ic}) \tag{3}$$

where I have used $d$ for Canada and $c$ for the US to emphasize that this works for any pair of countries. The quantity $(\alpha_d - \alpha_c)$ is the logarithm of the Canadian to US price index, and we can estimate it by an OLS regression or, more simply, by taking the average of the log price ratios over all $n$ goods. This gives us the log of the geomean PPP exchange rate

$$\ln P^{G}_{cd} = \frac{1}{n}\sum_{i=1}^{n} \ln\frac{p_{id}}{p_{ic}} \tag{4}$$

The standard deviation of the log price ratios, by (3), is

$$\mathrm{s.d.}\left[\ln\left(\frac{p_{id}}{p_{ic}}\right)\right] = \sqrt{\sigma_d^2 + \sigma_c^2} \tag{5}$$

where I have assumed for simplicity that there are no covariances across goods or across countries and that the variances are the same for all goods.
Equations (4) and (5) imply that the standard error of the log of the geomean PPP rate for the two countries is

$$\mathrm{s.e.}\left(\ln P^{G}_{cd}\right) = \sqrt{\frac{\sigma_d^2 + \sigma_c^2}{n}} \tag{6}$$

Equations (4) and (5) provide a direct measure of relative price variability that can be regarded as an alternative to the Paasche-Laspeyres spread as a measure of price dissimilarity. In fact, the two are quite closely related, as we shall see below. Figure 1 shows a plot of the log Laspeyres to Paasche ratio on the vertical axis against the variance of the log price ratios. In each case, the base country is the US, and the comparisons are with the other 145 countries in ICP 2005. As we can see, the two measures are close to one another, so that in this simple two-country case, the new measure gives similar results to the standard one. Put more simply, the variance of the log price ratios is approximately equal to the log of the ratio of the Laspeyres and Paasche indexes. This also means that the standard error of the log geomean can be approximated by the square root of the log Laspeyres-Paasche ratio divided by the number of goods.

Why should this be so? At an intuitive level, the Paasche-Laspeyres spread is one measure of the range of possibilities in the comparison of prices between two countries. The failure of arbitrage standard error measures the degree of uncertainty associated with deviations from the law of one price. If the law of one price were to hold, the Laspeyres and Paasche indexes would be identical, and there would be no spread.

[Figure 1: Variance of log price ratios and Laspeyres-Paasche ratio: US base. Vertical axis: log Laspeyres to Paasche ratio. Horizontal axis: variance of log price ratios. Labeled points: Tajikistan, Gambia, Kyrgyzstan, Chad, Zimbabwe, Bolivia, Djibouti, Qatar, Ghana, Sudan, Tanzania.]
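As a rough illustration of equations (1), (4), and (6), the following Python sketch computes the log geomean PPP, its standard error, and the log Laspeyres-Paasche spread for one pair of countries. The function names, the simulated prices, the equal expenditure shares, and the use of the sample variance of the log price ratios as an estimate of $\sigma_d^2 + \sigma_c^2$ are all assumptions made for illustration; none of them comes from the paper or from the ICP data.

```python
import numpy as np

def log_geomean_ppp(p_c, p_d):
    """Equation (4): unweighted mean of log price ratios, country d over base c."""
    return np.mean(np.log(p_d) - np.log(p_c))

def se_log_geomean(p_c, p_d):
    """Equation (6), with sigma_d^2 + sigma_c^2 replaced by the sample
    variance of the log price ratios (an assumption of this sketch)."""
    ratios = np.log(p_d) - np.log(p_c)
    return np.sqrt(np.var(ratios, ddof=1) / ratios.size)

def log_lp_spread(p_c, p_d, s_c, s_d):
    """Equation (1): log Laspeyres minus log Paasche, with c as base."""
    rel = p_d / p_c
    return np.log(np.sum(s_c * rel)) + np.log(np.sum(s_d / rel))

# Hypothetical data: prices generated from the CPD form (2), equal shares.
rng = np.random.default_rng(0)
n = 128
beta = rng.normal(0.0, 1.0, n)                      # commodity effects
p_c = np.exp(beta + rng.normal(0.0, 0.3, n))        # base country, alpha_c = 0
p_d = np.exp(0.5 + beta + rng.normal(0.0, 0.3, n))  # comparison country, alpha_d = 0.5
shares = np.full(n, 1.0 / n)

print("log geomean PPP:", log_geomean_ppp(p_c, p_d))
print("standard error:", se_log_geomean(p_c, p_d))
print("variance of log ratios:", np.var(np.log(p_d / p_c), ddof=1))
print("log L/P spread:", log_lp_spread(p_c, p_d, shares, shares))
```

With equal shares and many goods, the last two printed numbers should be of similar size, which is the relationship that Figure 1 displays for the actual data.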
More formally, we can rewrite the Laspeyres-Paasche ratio (1) using the price formula (2) to give

$$\ln \rho_{cd} = \ln\left(\sum_{i=1}^{n} s_{ic}\exp(\varepsilon_{id} - \varepsilon_{ic})\right) + \ln\left(\sum_{i=1}^{n} s_{id}\exp(\varepsilon_{ic} - \varepsilon_{id})\right) \tag{7}$$

Equation (7) is exact, but is more helpful if we take a second-order expansion in the $\varepsilon_{ic}$. This yields

$$\ln \rho_{cd} \approx \sum_{i=1}^{n}(s_{ic} - s_{id})(\varepsilon_{id} - \varepsilon_{ic}) + \frac{1}{2}\sum_{i=1}^{n}(s_{ic} + s_{id})(\varepsilon_{id} - \varepsilon_{ic})^2 \tag{8}$$

Taking expectations, and once again assuming that the $\varepsilon_{ic}$ are orthogonal both over commodities and countries, and that the variances are identical over commodities, we have

$$E(\ln \rho_{cd}) = \sigma_c^2 + \sigma_d^2 \tag{9}$$

To this degree of approximation, then, the variance of the log price ratios is equal to the log Laspeyres to Paasche ratio. Note that the absence of the shares in (9) depends on the assumption that the variances are the same for all goods; if the variances are different, the expectation of the log ratio will differ from the variance of the log price ratios. Even so, Figure 1 shows that the approximation is close in the ICP 2005 data.

We can also use Figure 1 to read off some of the relevant magnitudes. For the two “closest” countries to the US, Ireland and Canada, the standard error of the geomean PPP is 2.5 percent. For Gambia it is 8.7 percent, for Kyrgyzstan 10.7 percent, and for Tajikistan 11.9 percent. For two important countries in the middle, India and China, the standard errors are 7.1 percent and 6.8 percent respectively.

I have so far established that there is a close link between the Laspeyres to Paasche ratio and the variance of log price relatives in the two-country case. This result also implies that there is a similar relationship between the Laspeyres to Paasche ratio and the “failure of arbitrage” standard error of the log of the geomean PPP between two countries. In the next section, I extend this analysis to more relevant price indexes.

3. Törnqvist and Fisher indexes in a bilateral star system

The geomean index is not a practical contender for a PPP if only because it ignores the expenditure weights, the very fact that makes it useful at the finest level of detail where no weights exist. In this section, I consider more realistic formulas, and derive the failure of arbitrage standard errors. The two-country Törnqvist index is written

$$\ln P^{T}_{cd} = \frac{1}{2}\sum_{i=1}^{n}(s_{ic} + s_{id})\ln\frac{p_{id}}{p_{ic}} \tag{10}$$

which fits with (2) to give, without approximation,

$$\ln P^{T}_{cd} = (\alpha_d - \alpha_c) + \frac{1}{2}\sum_{i=1}^{n}(s_{ic} + s_{id})(\varepsilon_{id} - \varepsilon_{ic}) \tag{11}$$

The log of the Fisher index is half the sum of the log Laspeyres and log Paasche, so that, following (7), we have

$$\ln P^{F}_{cd} = \alpha_d - \alpha_c + \frac{1}{2}\ln\left(\sum_{i=1}^{n} s_{ic}\exp(\varepsilon_{id} - \varepsilon_{ic})\right) - \frac{1}{2}\ln\left(\sum_{i=1}^{n} s_{id}\exp(\varepsilon_{ic} - \varepsilon_{id})\right) \tag{12}$$

If we make a first-order approximation to (12), we get back to the Törnqvist (11), so that to this order, the Fisher and Törnqvist indexes are identical. Equation (11) gives us the variance of the two-country Törnqvist as well as an approximation to the variance of the two-country Fisher:

$$V(\ln P^{T}_{cd}) = \bar{s}^{cd\prime}\, V^{cd}_{\varepsilon\varepsilon}\, \bar{s}^{cd} \tag{13}$$

where the overbar denotes the mean of the two countries' expenditure shares and $V^{cd}_{\varepsilon\varepsilon}$ is the variance-covariance matrix of $\varepsilon_d - \varepsilon_c$. In the examples in the previous section, $V^{cd}_{\varepsilon\varepsilon}$ was taken to be a diagonal matrix with each element equal to $\sigma_c^2 + \sigma_d^2$. In this case, and if, in addition, the expenditure shares are identical for all goods, (13) reduces to the square of (6), so that the variance of the log Törnqvist (and the approximate variance of the log Fisher) is the same as the variance of the log geomean; indeed, in this case, the Törnqvist is identical to the geomean. More generally, with unequal budget shares, the variance of the log Törnqvist will be at least as large as the variance of the log geomean.
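The bilateral formulas (10), (12), and (13) can be sketched in a few lines of Python. This is a minimal illustration rather than the paper's own computation: the function names are hypothetical, the Fisher index is computed directly from prices and shares instead of from the residuals, and the variance function assumes a diagonal variance-covariance matrix, passed either as a scalar or as a vector of good-specific variances of $\varepsilon_{id} - \varepsilon_{ic}$.

```python
import numpy as np

def log_tornqvist(p_c, p_d, s_c, s_d):
    """Equation (10): share-weighted mean of log price ratios, base c."""
    return np.sum(0.5 * (s_c + s_d) * np.log(p_d / p_c))

def log_fisher(p_c, p_d, s_c, s_d):
    """Half the sum of the log Laspeyres and log Paasche indexes, base c."""
    rel = p_d / p_c
    log_laspeyres = np.log(np.sum(s_c * rel))
    log_paasche = -np.log(np.sum(s_d / rel))
    return 0.5 * (log_laspeyres + log_paasche)

def var_log_tornqvist(s_c, s_d, var_diff):
    """Equation (13) for a diagonal variance matrix: the sum over goods of
    the squared mean share times the variance of eps_id - eps_ic."""
    s_bar = 0.5 * (s_c + s_d)
    return np.sum(s_bar ** 2 * var_diff)

# Hypothetical example with four goods and a common variance of 0.18.
equal = np.full(4, 0.25)
unequal = np.array([0.7, 0.1, 0.1, 0.1])
print("equal shares:  ", var_log_tornqvist(equal, equal, 0.18))
print("unequal shares:", var_log_tornqvist(unequal, unequal, 0.18))
```

The comparison at the end illustrates the closing remark of the paragraph above: for a given common variance, concentrating the budget on a few goods raises the variance of the log Törnqvist relative to the equal-share (geomean) case.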
We have already seen the close relationship between the log of the Laspeyres-Paasche ratio and the variance of the log geomean, and this carries through to the variance of the log Törnqvist. From (8) and (11), in the case where the covariances are zero, and writing $\bar{s}_i$ for the mean of the two countries' shares of good $i$,

$$E(\ln \rho_{cd}) = \sum_{i=1}^{n} \bar{s}_i\,(\sigma_{di}^2 + \sigma_{ci}^2) \tag{14}$$

$$V(\ln P^{T}_{cd}) = \sum_{i=1}^{n} \bar{s}_i^{\,2}\,(\sigma_{di}^2 + \sigma_{ci}^2) \tag{15}$$

When the budget shares are identical, $V(\ln P^{T}_{cd}) = E(\ln \rho_{cd})/n$. Figure 2 shows a plot of the square root of the log of the bilateral Laspeyres to bilateral Paasche ratio, divided by the square root of $n$, the number of goods, against the standard error of the log Törnqvist from (15). As before, the US is always the base country and these are pairwise indexes, so that I am working with a star system with the US as the star. The two colored points in the figure are India and China. The cross-country correlation in the figure is 0.89. In these 2005 data, the log Fishers and the log Törnqvists are very close to the log geomean index, so that Figures 1 and 2 are closely related.

[Figure 2: Paasche-Laspeyres spread and standard error of bilateral indexes]

A number of points are worth noting here before going on to the multilateral indexes. First, the standard errors of the logarithms of the bilateral indexes are large, with many countries clustering around values of 0.10 to 0.15. For India and China, for example, if we were to take two standard errors on either side of the mean as a confidence interval, the ratio of Chinese or Indian to American prices could be 30 percent larger or smaller than measured. Given that the ICP takes GDP in local currencies as given, this carries through to the same margin of uncertainty for GDP (or at least domestic absorption) in international prices.
Second, for the group of countries at the bottom left of Figure 2, including all of western Europe, Canada, Australia, and New Zealand (but not Mexico or Japan, though both are close by), the standard errors are much smaller, between 5 and 8 percent. For these countries, whose economic structure is similar to that of the US, we can make much more precise comparisons of prices and GDP, though not without substantial uncertainty.

4. Multilateral weighted CPD and GEKS indexes

Multilateral indexes based on Törnqvist or Fisher indexes are known as GEKS indexes. They take the pairwise country indexes for all pairs as a starting point and then adjust them in order to impose transitivity over countries and to generate a single set of PPPs, one for each country. If we denote the (arbitrarily chosen) base country by 1, the Törnqvist-based GEKS index for country $c$ is written

$$\ln PPP^{T}_{c} = \frac{1}{M}\sum_{d=1}^{M}\left(\ln P^{T}_{1d} + \ln P^{T}_{dc}\right) \tag{14}$$

In order to calculate the variance of (14), we substitute in (11) to get

$$\ln PPP^{T}_{c} = \alpha_c + \frac{1}{2M}\sum_{j=1}^{M}\left[(s_1 + s_j)\cdot(\varepsilon_j - \varepsilon_1) + (s_j + s_c)\cdot(\varepsilon_c - \varepsilon_j)\right] \tag{15}$$

where the “$\cdot$” indicates an inner product, and $s$ and $\varepsilon$ with single (country) suffixes are vectors with one element for each good. The variance of the Törnqvist PPP is then computed by squaring the second term on the right hand side of (15) and taking expectations. Again, a simple version comes from assuming that the $\varepsilon$ are independent over goods and countries. More generally, for country $i$, we have

$$\begin{aligned}
4M^2\,V(\ln PPP^{T}_{i}) ={}& (1 - 2\delta_{1i})\sum_{j=1}^{M}\sum_{k=1}^{M}(s_i + s_j)'\,\Omega_i\,(s_i + s_k) + \sum_{j=1}^{M}\sum_{k=1}^{M}(s_1 + s_j)'\,\Omega_1\,(s_1 + s_k) \\
&+ \sum_{j=1}^{M}(s_1 - s_i)'\,\Omega_j\,(s_1 - s_i) + 2\sum_{j=1}^{M}(s_i + s_j)'\,\Omega_i\,(s_1 - s_i) - 2\sum_{j=1}^{M}(s_1 - s_i)'\,\Omega_1\,(s_1 + s_j)
\end{aligned} \tag{16}$$

where $\delta_{1i}$ equals one when $i$ is the base country and zero otherwise, so that the variance for the base country is zero. The variance-covariance matrices $\Omega_j$ are diagonal matrices with the commodity-specific variances for country $j$ on the diagonal. In practice, I evaluate them using the squared residuals from the original CPD regression (2).
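The GEKS construction in (14) and the variance that follows from (15) can be sketched as below. This is an illustration under stated assumptions, not the paper's code: the function names are hypothetical, countries are indexed from 0 so that `base=0` plays the role of country 1, and the variance is obtained by collecting the coefficient on each $\varepsilon_j$ in (15) and assuming independence over goods and countries, with `sigma2[j]` holding the diagonal of country $j$'s variance matrix.

```python
import numpy as np

def log_tornqvist_matrix(logp, shares):
    """T[c, d] = ln P^T_{cd} from equation (10); logp and shares are (M, n)."""
    M = logp.shape[0]
    T = np.zeros((M, M))
    for c in range(M):
        for d in range(M):
            T[c, d] = np.sum(0.5 * (shares[c] + shares[d]) * (logp[d] - logp[c]))
    return T

def log_geks(T, base=0):
    """Equation (14): ln PPP_c = (1/M) * sum_d (T[base, d] + T[d, c])."""
    return (T[base][:, None] + T).mean(axis=0)

def geks_eps_weights(shares, c, base=0):
    """Coefficient on each eps_j in 2M times the stochastic part of (15)."""
    M = shares.shape[0]
    w = np.tile(shares[base] - shares[c], (M, 1))   # (s_1 - s_c) on every eps_j
    w[base] -= (shares[base] + shares).sum(axis=0)  # minus sum_j (s_1 + s_j) on eps_1
    w[c] += (shares + shares[c]).sum(axis=0)        # plus sum_j (s_j + s_c) on eps_c
    return w

def var_log_geks(shares, sigma2, c, base=0):
    """Variance of ln PPP_c when the eps are independent over goods and countries."""
    M = shares.shape[0]
    w = geks_eps_weights(shares, c, base)
    return np.sum(w ** 2 * sigma2) / (4 * M ** 2)

# Hypothetical example: 4 countries, 6 goods, unit variances everywhere.
rng = np.random.default_rng(1)
shares = rng.random((4, 6))
shares /= shares.sum(axis=1, keepdims=True)
logp = rng.normal(size=(4, 6))
ppp = log_geks(log_tornqvist_matrix(logp, shares))
se = [np.sqrt(var_log_geks(shares, np.ones((4, 6)), c)) for c in range(4)]
print("log GEKS PPPs:", ppp)
print("standard errors:", se)
```

Working with the coefficient vectors avoids expanding the double sums by hand; the standard error for the base country is zero by construction, as in the text.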
The formulas (14), (15), and (16) are also valid for the Fisher version of the GEKS PPPs with only the substitution of the Fisher for the Törnqvist index in (14).

The second multilateral index that I consider here is the weighted country-product-dummy (CPD) index. This method works directly from equation (2), and estimates the country effects $\alpha_c$ by regressing the log prices on a set of country and commodity dummies. The method recognizes the importance of the expenditure shares by using them as weights in a weighted regression, rather than a simple OLS regression. We define $X$ as an $MN$ by $N + M - 1$ matrix of 1’s and 0’s, where there is one row for each country-commodity combination, and the columns are a constant, $N - 1$ commodity columns with 1’s marking the position of each commodity, and $M - 1$ columns with 1’s marking each country. If we write $y$ for the $MN$ vector of the log prices, and $S$ for an $MN$ by $MN$ diagonal matrix containing the expenditure shares, the weighted CPD estimator comes from selecting the coefficients on the country dummies from the estimate

$$b = (X'SX)^{-1}X'Sy \tag{17}$$

Note that this is not a GLS estimator in the usual sense because $S$ is there to control the projection, and is not related to the variance-covariance matrix of the residuals in (2). Rather, we must use the so-called “outer product” variance-covariance matrix for (17), which takes the form

$$V(b) = (X'SX)^{-1}X'S\,\Sigma\,SX\,(X'SX)^{-1} \tag{18}$$

where $\Sigma$ is the $MN$ by $MN$ variance-covariance matrix of the $\varepsilon$, the deviations of the log prices from their country-product predictions. If we follow the assumptions so far, this will be a diagonal matrix with the $\sigma_c^2$ on the diagonal.

Using the data from ICP 2005, the three multilateral indexes, the GEKS Fisher, the GEKS Törnqvist, and the weighted CPD, are close to one another, with correlations of the logarithms over countries in excess of 0.999.
Even deflated by the average foreign exchange rate for 2005, so that we are correlating the price levels of domestic absorption, the correlations are all above 0.978. Nor are the two sets of standard errors (one for the Fisher and Törnqvist and one for the weighted CPD) very different; the cross-country correlation is 0.928 (excluding the US, whose standard error is zero by construction). This negative result is important because it makes it clear that the uncertainty in the PPPs comes not from the choice of index number formula, at least for the index number formulas considered here, but from the dispersion of relative prices, which has similar effects across those formulas.

The more interesting results are those akin to Figure 2, on the size of the standard errors and on the differences in precision across groups of countries. The new standard errors, those of the multilateral GEKS, are compared with the old ones, the standard errors of the bilateral Törnqvist or Fisher indexes, in Figure 3, which allows assessment of both size and precision by country. The new multilateral standard errors are plotted on the vertical axis against the original bilateral ones on the horizontal axis. On average, the multilateral standard errors are larger, 0.15 as opposed to 0.12, but their dispersion is lower, with the standard deviation of the standard errors reduced from 0.036 to 0.022.

[Figure 3: Standard errors of bilateral and multilateral price indexes]

There are a number of factors at work here. The multilateral indexes introduce a great deal more information, particularly the comparisons between countries other than the US, all of which are ignored by the bilateral indexes.
More information, by itself, should contribute to precision, but it depends on the nature of that information, and the multilateral indexes use all country comparisons, some of which will involve huge variability in relative prices; think Japan versus Oman, or Fiji versus Finland. The enforcement of transitivity reduces the variance of averages, but the raw material going into the averages is more variable to start with.

These offsetting forces work differently for different groups of countries. On the left of Figure 3, inside the dotted ellipse, are the same Western European and European offshoot countries that were grouped in Figure 2. For them, the standard errors have been increased by a large amount, typically by a factor of 3. For these structurally similar economies, the introduction of relative prices from all pairs of countries, together with the enforcement of transitivity, introduces much uncertainty that is avoided by the bilateral comparison. Transitivity requires, for example, that the price level of Canada relative to the US be the same whether it is calculated directly or through any possible chain of countries (from the US to Oman to Chad to Tajikistan to Canada). The lesson from this is that, if we want a bilateral comparison, we should make a bilateral comparison; nothing but avoidable noise comes from introducing countries that are irrelevant for the problem at hand. More broadly, if we only need comparisons between Western Europe, the US, Canada, Australia, and New Zealand (and possibly Mexico and Japan), we should use multilateral indexes for them that ignore the rest of the world. Of course, the ICP does exactly this by enforcing the principle of fixity, for which the current analysis provides considerable support.

There are 13 countries whose multilateral index with the US has lower variance than their bilateral index.
Beyond that, there is a large group of countries, in the bottom middle of Figure 3, where the standard errors are similar, and the costs of the additional information are offset by the benefits of enforcing transitivity. For these non-OECD countries, it is not clear that the comparison in Figure 3 is the relevant one, though it is presumably still the case that the regional comparisons that are incorporated into the ICP through fixity are likely to be more accurate than the full international multilateral comparison. Even so, we should not expect for them the tight bounds that we get for the OECD countries. Several of the regional groupings contain countries that are much more different from one another than any of the countries of the OECD.

Finally, I note that, as was the case for the bilateral indexes, the standard errors of the multilateral indexes are large. There is very substantial uncertainty attached to multilateral price indexes, so that we know much less about international comparisons of national income than is commonly recognized. The large revisions associated with successive rounds of the ICP may reflect this uncertainty, at least in part. In any case, we would do better to recognize the margins of uncertainty in making international comparisons in general, especially comparisons between countries whose structures are dissimilar.

References

Deaton, Angus, and Olivier Dupriez, 2011, “Purchasing power parity exchange rates for the global poor,” American Economic Journal Applied, 3:1, 137-66.

Hill, Robert J., 1999, “Comparing price levels across countries using minimum spanning trees,” Review of Economics and Statistics, 81:1, 135-42.

Hill, Robert J., and Marcel P. Timmer, 2006, “Standard errors as weights in multilateral price indexes,” Journal of Business and Economic Statistics, 24:3, 366-77.

Summers, Robert, 1973, “International comparisons with incomplete data,” Review of Income and Wealth, 29:1, 1-16.