[Figure 2.4: two panels plotting the dependent variable against the independent variable x (horizontal axes from 0 to 100), with the OLS regression and Tobit lines marked. Note: See text for model definition and estimation procedures. Source: Author's calculations.]

88 THE ANALYSIS OF HOUSEHOLD SURVEYS

dependent in these models with limited dependent variables. As a result of the dependence, misspecification of scale will cause the $\beta$'s that maximize (2.42) to be inconsistent for the true parameters, a result first noted by Hurd (1979), Nelson (1981), and Arabmazar and Schmidt (1981).

The right-hand side of Figure 2.4 illustrates the kind of problem that can occur with heteroskedasticity. Instead of being homoskedastic as in the left-hand panel, the $u_i$ are drawn from independent normal distributions with zero means and standard deviations $\sigma_i$ given by

(2.43) $\sigma_i = 20\left(1 + 0.2\sqrt{\max(0, x_i - 40)}\right)$.

According to this specification there is homoskedasticity to the left of the cutoff point (40), but heteroskedasticity to its right, and the conditional variance grows with the mean of the dependent variable beyond the cutoff. Although (2.43) does not pretend to be based on any actual data, it mimics reasonable models of behavior. Richer households have more scope for idiosyncrasies of behavior than do the poor, and as we see in the right-hand panel, we now get zero observations among the rich as well as the poor, something that cannot occur in the homoskedastic model. This is what happens in practice; if we look at the demand for tobacco, alcohol, fee-paying schools or clinics, there are more nonconsumers among the poor, but there are also many better-off households who choose not to purchase. Not purchasing is partly a matter of income, and partly a matter of taste. The figure shows three lines.
The dots-and-dashes line to the left is the OLS regression, which is still biased downward; although the heteroskedasticity has generated more very high y's at high levels of x, the censoring at low values of x keeps the OLS slope down. In the replications the OLS slope averaged 0.699 with a standard deviation of 0.100; there is more variability than before, but the bias is much the same. The second, middle (solid) line is the kinked line max(0, x - 40), which is (2.41) when all the $u_i$ are zero. (Note that this line is not the regression function, which is defined as the expectation of y conditional on x.) The third line, on the right of the picture, comes from maximizing the likelihood (2.42) under the (false) assumption that the u's are homoskedastic. Because the Tobit procedure can account for censoring at low values of x, but has no explanation for censoring at high values of x, the line is biased upward in order to pass through the center of the distribution on the right of the picture. The average MLE (Tobit) estimate of the slope in the replications was 1.345 with a standard error of 0.175, so that in the face of the heteroskedasticity, the Tobit procedure yields estimates that are as biased up as OLS is biased down. It is certainly possible to construct examples where the Tobit estimators are better than least squares, even in the presence of heteroskedasticity. But there is nothing odd about the current example; heteroskedasticity will usually be present in practical applications, and there is no general guarantee that the attempt to deal with censoring by replacing OLS with the Tobit MLE will give estimates that reduce the bias. This is not a defense of OLS, but a warning against the supposition that Tobit guarantees any improvement.

ECONOMETRIC ISSUES FOR SURVEY DATA 89

In practice, the situation is worse than in this example.
Even when there is no heteroskedasticity, the consistency of the Tobit estimates requires that the distribution of errors be normal, and biases can occur when it is not (see Goldberger 1983 and Arabmazar and Schmidt 1982). And since the distribution of the u's is almost always unknown, it is unclear how one might respecify the likelihood function in order to do better. Even so, censored data occur frequently in practice, and we need some method for estimating sensible models. There are two very different approaches; the first is to look for estimation strategies that are robust against heteroskedasticity of the u's in (2.41) and that require only weak assumptions about their distribution, while the second is more radical, and essentially abandons the Tobit approach altogether. I begin with the former.

*Robust estimation of censored regression models

There are a number of different estimators that implement the first approach, yielding nonparametric Tobit estimators, where "nonparametric" refers to the distribution of the u's, not to the functional form of the latent variable, which remains linear. None of these has yet passed into standard usage, and I review only one, Powell's (1984) censored LAD estimator. It is relatively easily implemented and appears to work in practice. (An alternative is Powell's (1986) symmetrically trimmed least squares estimator.)

One of the most useful properties of quantiles is that they are preserved under monotone transformations; for example, if we have a set of positive observations, and we take logarithms, the median of the logarithms will be the logarithm of the median of the untransformed data. Since max(0, z) is monotone nondecreasing in z, we can take medians of (2.41) conditional on $x_i$ to get

(2.44) $q_{50}(y_i \mid x_i) = \max[0, q_{50}(x_i'\beta + u_i \mid x_i)] = \max(0, x_i'\beta)$

where $q_{50}(\cdot \mid x)$ denotes the median of the distribution conditional on x and the median of $u_i$ is assumed to be 0.
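The quantile-preservation property behind (2.44) is easy to check numerically. The sketch below is illustrative only: the normal draws, their mean and spread, and the helper name `censor` are assumptions made for the example and do not come from the text. It compares the median and the mean of a censored sample with the censored median and the censored mean of the underlying latent draws.

```python
import random
import statistics


def censor(z):
    """Censoring at zero, max(0, z): monotone nondecreasing in z."""
    return max(0.0, z)


random.seed(0)
# Hypothetical latent values; an odd sample size makes the median an actual draw.
latent = [random.gauss(5.0, 20.0) for _ in range(1001)]
observed = [censor(z) for z in latent]

median_of_censored = statistics.median(observed)
censored_median = censor(statistics.median(latent))
mean_of_censored = statistics.mean(observed)
censored_mean = censor(statistics.mean(latent))

# Medians commute with the censoring transformation; means do not.
print("medians agree:", median_of_censored == censored_median)
print("mean of censored exceeds censored mean:", mean_of_censored > censored_mean)
```

The contrast with the mean is the reason the argument is run in terms of medians: the expectation of max(0, z) depends on the whole distribution of z, while its median depends only on the median of z.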
But as we have already seen, LAD regression estimates the conditional median regression, so that $\beta$ can be consistently estimated by the parameter vector that minimizes

(2.45) $\sum_{i=1}^{n} \left| y_i - \max(0, x_i'\beta) \right|$

which is what Powell suggests. The consistency of this estimator does not require knowledge of the distribution of the u's, nor is it assumed that the distribution is homoskedastic, only that it has median 0.

Although Powell's estimator is not available in standard software, it can be calculated from repeated application of median regression following a suggestion of Buchinsky (1994, p. 412). The first regression is run on all the observations, and the predicted values $x_i'\hat\beta$ calculated; these are used to discard sample observations where the predicted values are negative. The median regression is then repeated on the truncated sample, the parameter estimates used to recalculate $x_i'\hat\beta$ for the whole sample, the negative values discarded, and so on until convergence. In (occasional) cases where the procedure does not converge, but cycles through a finite set of parameters, the parameters with the highest value of the criterion function should be chosen. Standard errors can be taken from the final iteration though, as before, bootstrapped estimates should be used.

Such a procedure is easily coded in STATA, and was applied to the heteroskedastic example given above and shown in Figure 2.4 (see Example 2.1 in the Code Appendix). To simplify the coding, the procedure was terminated after 10 median regressions, so that to the extent that convergence had not been attained, the results will be biased against Powell's estimator. On average, the method does well, and the mean of the censored LAD estimator over the 100 replications was 0.946. However, there is a price to be paid in variance, and the standard deviation of 0.305 is three times that of the OLS estimator and more than one and a half times larger than that of the Tobit.
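The iterative procedure just described can be sketched in a few lines. The code below is a minimal illustration and not the STATA code of the Code Appendix: the function names `lad_fit` and `censored_lad` are hypothetical, the model is restricted to a constant and one regressor, and the median regression is done by brute-force search over lines through pairs of data points (a least-absolute-deviations fit can always be found among such lines), which stands in for a proper median regression routine. The data are a noise-free version of the example, y = max(0, x - 40), chosen as an assumption so that the right answer (intercept -40, slope 1) is known.

```python
def lad_fit(xs, ys):
    """Median (LAD) regression of y on a constant and x by exhaustive search.

    Tries every line through two data points and keeps the one with the
    smallest sum of absolute residuals. Simple but O(n^3); illustration only.
    """
    best = None
    n = len(xs)
    for i in range(n):
        for j in range(i + 1, n):
            if xs[i] == xs[j]:
                continue  # a vertical line is not a regression line
            slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
            intercept = ys[i] - slope * xs[i]
            loss = sum(abs(y - intercept - slope * x) for x, y in zip(xs, ys))
            if best is None or loss < best[0]:
                best = (loss, intercept, slope)
    return best[1], best[2]


def censored_lad(xs, ys, max_iter=10):
    """Repeated median regression, discarding observations with negative fits."""
    keep = list(range(len(xs)))
    for _ in range(max_iter):
        intercept, slope = lad_fit([xs[i] for i in keep], [ys[i] for i in keep])
        # Recompute fitted values for the WHOLE sample, then re-select.
        new_keep = [i for i in range(len(xs)) if intercept + slope * xs[i] >= 0]
        if new_keep == keep or len(new_keep) < 2:
            break  # converged, or too few observations left to fit a line
        keep = new_keep
    return intercept, slope


# Hypothetical noise-free censored data: x = 1, ..., 100 and y = max(0, x - 40).
xs = [float(x) for x in range(1, 101)]
ys = [max(0.0, x - 40.0) for x in xs]

plain_intercept, plain_slope = lad_fit(xs, ys)        # ignores the censoring
clad_intercept, clad_slope = censored_lad(xs, ys)     # iterated version
print("plain median regression:", plain_intercept, plain_slope)
print("censored LAD:           ", clad_intercept, clad_slope)
```

The cap of 10 passes mirrors the termination rule mentioned in the text. With noisy data the iterations need not settle so cleanly, and a production version would use a linear-programming median regression rather than the exhaustive search used here.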
As a result, and although both Tobit and OLS are inconsistent, in only 55 out of 100 of the replications is the censored LAD closer to the truth than both OLS and Tobit. Of course, the bias-variance trade-off turns in favor of Powell's estimator as the sample size becomes larger. With 1,000 observations instead of 100, and with the new x values again equally spaced but 10 times closer, the censored LAD estimator is closer to the truth than either OLS or Tobit in 96 percent of the cases. Since most household surveys will have sample sizes at least this large, Powell's estimator is worth serious consideration. At the very least, comparing it with the Tobit estimates will provide a useful guide to failure of homoskedasticity or normality (see Newey 1987 for an exercise of this kind).

Even so, the censored LAD estimator is designed for the censored regression model, and does not apply to other cases, such as truncation, where the observations that would have been negative do not appear in the sample instead of being replaced by zeros, nor to more general models of sample selectivity. In these, the censoring or truncation of one variable is determined by the behavior of a second latent variable regression, so that

(2.46) $y_{1i}^* = x_i'\beta + u_{1i}, \qquad y_{2i}^* = z_i'\gamma + u_{2i}$

where $u_1$ and $u_2$ are typically allowed to be correlated, $y_{2i}$ is observed as a dichotomous variable indicating whether or not $y_{2i}^*$ is positive, and $y_{1i}$ is observed as $y_{1i}^*$ when $y_{2i}$ is 1, and is zero otherwise. Equations (2.46) are a generalization of Tobit, whereby the censoring is controlled by variables that are different from the variables that control the magnitude of the variable of interest. If the two sets of u's are assumed to be jointly normally distributed, (2.46) can be estimated by maximum likelihood, or by Heckman's (1976) two-step estimator, the "Heckit" procedure (see the next section for further discussion).
As with Tobit, which is a special case, these methods do not yield consistent estimates in the presence of heteroskedasticity or nonnormality, and as with Tobit, the provision of nonparametric estimators is a lively topic of current research in econometrics. I shall return to these topics in more detail in the next section.

Radical approaches to censored regressions

Serious attention must also be given to a second, more radical, approach that questions the usefulness of these models in general. There are conceptual issues as well as practical ones. In the first place, these models are typically presented as elaborations of linear regression, in which a standard regression equation is extended to deal with censoring, truncation, selectivity, or whatever is the issue at hand. However, in so doing they make a major break from the standard situation presented in the introduction where the regression function, the expectation of the dependent variable conditional on the covariates, coincides with the deterministic part of the regression model. In the Tobit and its generalizations, the regression functions are no longer simple linear functions of the x's, but are more complex expressions that involve the distribution of the u's. For example, in the censored regression model (2.41), the regression function is given by

(2.47) $E(y_i \mid x_i) = x_i'\beta + E(u_i \mid x_i'\beta + u_i \ge 0) = x_i'\beta + [1 - F(-x_i'\beta)]^{-1} \int_{-x_i'\beta}^{\infty} u \, dF(u)$

where F(u) is the CDF of the u's. Absent knowledge of F, this regression function does not even identify the $\beta$'s (see Powell 1989), but more fundamentally, we should ask how it has come about that we have to deal with such an awkward, difficult, and nonrobust object. Regressions are routinely assumed to be linear, not because linearity is thought to be exactly true, but because it is convenient. A linear model is often a sensible first approximation, and linear regressions are easy to estimate, to replicate, and to interpret.
But once we move into models with censoring or selection, it is much less convenient to start with linearity, since it buys us no simplification. It is therefore worth considering alternative possibilities, such as starting by specifying a suitable functional form for the regression function itself, rather than for the part of the model that would have been the regression function had we been dealing with a linear model. Linearity will often not be appropriate for the regression function, but there are many other possibilities, and as we shall see in the next chapter, it is often possible to finesse the functional form issue altogether. Such an approach goes beyond partially nonparametric treatments that allow arbitrary distributional assumptions for the disturbances while maintaining linearity for the functional form of the model itself, and recognizes that the functional form is as much an unknown as is the error distribution. It also explicitly abandons the attempt to estimate the structure of selectivity or censoring, and focuses on features of the data, such as regression functions, that are clearly and uncontroversially observable. There will sometimes be a cost to abandoning the structure, but there are many policy problems for which the structure is irrelevant, and which can be addressed through the regression function.

A good example is the effect of a change in tax rates on tax revenue. A government is considering a reduction in the subsidy on wheat (say), and needs to know the extent to which demand will be reduced at the higher price. The quantity of interest is the effect of price on average demand. Suppose that we have survey data on wheat purchases, together with regional or temporal price variation as well as other relevant explanatory variables.
Some households buy positive quantities of wheat, and some buy none, a situation that would seem to call for a Tobit. Estimation of the model yields an estimate of the response of quantity to price for those who buy wheat. But the policymaker is interested not only in this effect, but also in the loss of demand from those who previously purchased, but who will drop out of the market at the higher price. These effects will have to be modeled separately and added into the calculation. But this is an artificial and unnecessarily elaborate approach to the problem. The policy question is about the effect of price on average demand, averaged over consumers and nonconsumers alike. But this is exactly what we would estimate if we simply regressed quantity on price, with zeros and nonzeros included in the regression. In this case, not only is the regression function more convenient to deal with from an econometric perspective, it is also what we need to know for policy.

2.4 Structure and regression in nonexperimental data

The regression model is the standard workhorse for the analysis of survey data, and the parameters estimated by regression analysis frequently provide useful summaries of the data. Even so, they do not always give us what we want. This is particularly so when the survey data are a poor substitute for unobtainable experimental data. For example, if we want to know the effect of constructing health clinics, or of expanding schools, or what will happen if a minimum wage or health coverage is mandated, we should ideally like to conduct an experiment, in which some randomly chosen group is given the "treatment," and the results compared with a randomly selected group of controls from whom the treatment is withheld. The randomization guarantees that there are no differences, observable or unobservable, between the two groups. In consequence, if there is a significant difference in outcomes, it can only be the effect of the treatment.
Although the role of policy experiments has been greatly expanded in recent years (see Grossman 1994 and Newman, Rawlings, and Gertler 1994), there are many cases where experiments are difficult or even impossible, sometimes because of the cost, and sometimes because of the moral and political implications. Instead, we have to use nonexperimental survey data to look at the differences in behavior between different people, and to try to relate the degree of exposure to the treatment to variation in the outcomes in which we are interested. Only under ideal conditions will regression analysis give the right answers. In this section, I explore the various difficulties; in the next two sections, I look at two of the most important of the econometric solutions, panel data and the technique of instrumental variables.

The starting point for a nonexperimental study is often a regression model, in which the outcome variable y is related to a set of explanatory variables x. At least one of the x-variables is the treatment variable, while others are "control" variables, included so as to allow for differences in outcomes that are not caused by the treatment and to allow the treatment effect to be isolated. These variables play the same role as the control group in an experiment. The error term in the regression captures omitted controls, as well as measurement error in the outcome y, and is assumed to satisfy (2.4), that its expectation conditional on the x's is 0. In this setup, the expectation of y conditional on x is $\beta'x$, and the effects of the treatment and controls can be recovered by estimating $\beta$. The most common problem with this procedure is the failure, or at least the implausibility, of the assumption that the conditional mean of the error term is zero.
If a relevant variable is omitted, perhaps because it is unobservable or because data are unavailable, and if that variable is correlated with any of the included x's, the error will not be orthogonal to the x's, and the conditional expectation of y will not be $\beta'x$. The regression function no longer coincides with the structure that we are trying to recover, and estimation of the regression function will not yield the parameters of interest. The failure of the structure and the regression function to coincide happens for many different reasons, some more obvious than others. In this section, I consider a number of cases that are important in the analysis of household survey data from developing countries.

Simultaneity, feedback, and unobserved heterogeneity

Simultaneity is a classic argument for a correlation between error terms and explanatory variables. If we supplement the regression model (2.3) with another equation or equations by which some of the explanatory variables are determined by factors that include y, then the error term in (2.3) will be correlated with one or more of the x's and OLS estimates will be biased and inconsistent. The classic textbook examples of simultaneity, the interdependence of supply or demand and price, and the feedbacks through national income from expenditures to income, are usually thought not to be important for microeconomic data, where the purchases of individuals are too small to affect price or to influence their own incomes through macroeconomic feedbacks. As we shall see, this is not necessarily the case, especially when there are local village markets. Other forms of simultaneity are also important in micro data, although the underlying causes often have more to do with omitted or unobservable variables than with feedbacks through time. Four examples illustrate.

Example 1. Price and quantities in local markets

In the analysis of demand using microeconomic data, it is usually assumed that individual purchases are too small to affect price, so that the simultaneity between price and aggregate demand can be ignored in the analysis of the microeconomic data. Examples where this is not the case have been provided by Kennan (1989), and local markets in developing countries provide a related case. Suppose that the demand function for each individual in each village contains an unobservable village-level component, and that, because of poor transportation and lack of an integrated market, supply and demand are equilibrated at the village level. Although the village-level component in individual demands may contribute little to the total variance of demand, the other components will cancel out over the village as a whole, so that the variation in price across villages is correlated with village-level taste for the good. Villages that have a relatively high taste for wheat will tend to have a relatively high price for wheat, and the correlation can be important even when regressions are run using data from individuals or households.

To illustrate, write the demand function in the form

(2.48) $y_{ic} = \alpha_0 + \beta x_{ic} - \gamma p_c + u_{ic} = \alpha_0 + \beta x_{ic} - \gamma p_c + \alpha_c + \epsilon_{ic}$

where $y_{ic}$ is demand by household i in cluster c, $x_{ic}$ is income or some other individual variable, $p_c$ is the common village price, and $u_{ic}$ is the error term. As in previous modeling of clusters, I assume that $u_{ic}$ is the sum of a village term $\alpha_c$ and an idiosyncratic term $\epsilon_{ic}$, both of which are mean-zero random variables. Suppose that aggregate supply for the village is $z_c$ per household, which comes from a weather-affected harvest but is unresponsive to price (or to income). Market clearing implies that

(2.49) $z_c = \bar{y}_c = \alpha_0 + \beta \bar{x}_c - \gamma p_c + \alpha_c$

which determines price in terms of the village taste effect, supply, and average village income.
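A small simulation makes the mechanism concrete. It is a sketch under assumptions that are not in the text: income is dropped from the demand function ($\beta = 0$), the parameter values and the names `ols_slope`, `sd_taste`, `sd_harvest`, and `sd_idio` are invented for illustration, and the idiosyncratic terms are demeaned within each village so that they cancel exactly in the village average. Price is obtained by solving the market-clearing condition for $p_c$, and the OLS price response from household data is compared with the true $\gamma$ and with the limit $\gamma\sigma_z^2/(\sigma_z^2+\sigma_\alpha^2)$ implied by this simplified model.

```python
import random


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on a constant and x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


random.seed(1)
alpha0, gamma = 100.0, 2.0                        # hypothetical demand parameters
sd_taste, sd_harvest, sd_idio = 3.0, 4.0, 5.0     # hypothetical standard deviations
n_villages, n_households = 2000, 10

prices, demands = [], []
for _ in range(n_villages):
    taste = random.gauss(0.0, sd_taste)               # village taste effect alpha_c
    harvest = 50.0 + random.gauss(0.0, sd_harvest)    # supply per household z_c
    # Market clearing with beta = 0: z_c = alpha0 - gamma * p_c + alpha_c.
    price = (alpha0 + taste - harvest) / gamma
    eps = [random.gauss(0.0, sd_idio) for _ in range(n_households)]
    eps_bar = sum(eps) / n_households
    for e in eps:
        prices.append(price)
        # Demeaned idiosyncratic terms cancel exactly over the village.
        demands.append(alpha0 - gamma * price + taste + (e - eps_bar))

gamma_hat = -ols_slope(prices, demands)   # estimated price response
gamma_limit = gamma * sd_harvest**2 / (sd_harvest**2 + sd_taste**2)
print("true gamma:", gamma)
print("OLS estimate:", round(gamma_hat, 3))
print("implied limit:", round(gamma_limit, 3))
```

Raising `sd_taste` relative to `sd_harvest` in this sketch pushes the estimate further below the true response, even though each household is a price taker.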
Because markets have to clear at the village level, the price is higher in villages with a higher taste for the commodity. In consequence, the price on the right-hand side of (2.48) is correlated with the $\alpha_c$ component of the error term, and OLS estimates will be inconsistent. The inconsistency arises even if the village contains many households, each of which has a negligible effect on price.

The bias can be large in this case. To make things simple, assume that $\beta = 0$, so that income does not appear in (2.48) nor average income in (2.49). According to the latter, price in village c is

(2.50) $p_c = \gamma^{-1}(\alpha_0 + \alpha_c - z_c)$.

Write $\hat\gamma$ for the OLS estimate of $\gamma$ obtained by regressing individual household demands on the price in the village in which the household lives. Provided that tastes are uncorrelated with harvests, it is straightforward to show that

(2.51) $\operatorname{plim} \hat\gamma = \gamma \, \dfrac{\sigma_z^2}{\sigma_z^2 + \sigma_\alpha^2}$

where $\sigma_z^2$ and $\sigma_\alpha^2$ denote the variances of the harvest $z_c$ and of the village taste effect $\alpha_c$. The price response is biased downwards; in addition to the negative effect of price on demand, there is a positive effect from demand to price that comes from the effect on both of village-level tastes. The bias will only vanish when the village taste effects $\alpha_c$ are absent, and will be large if the variance of tastes is large relative to the variance of the harvest.

Example 2. Farm size and farm productivity

Consider a model of the determinants of agricultural productivity, and in particular the old question of whether larger or smaller farms are more productive; the observation of an inverse relationship between farm size and productivity goes back to Chayanov (1925), and has acquired the status of a stylized fact; see Sen (1962) for India and Berry and Cline (1979) for reviews. To examine the proposition, we might use survey data to regress output per hectare on farm size and on other variables not shown, viz.
(2.52) $\ln(Q_i/A_i) = \alpha + \beta \ln A_i + u_i$

where $Q_i$ is farm output, $A_i$ is farm size, and the common finding is that $\beta < 0$, so that small farms are "more productive" than large farms. This might be interpreted to mean that, compared with hired labor, family labor is of better quality, more safely entrusted with valuable animals or machinery, and needs less monitoring (see Feder 1985; Otsuka, Chuma, and Hayami 1992; and Johnson and Ruttan 1994), or as an optimal response by small farmers to uncertainty (see Srinivasan 1972). It has also sometimes been interpreted as a sign of inefficiency, and of dualistic labor markets, because in the absence of smoothly operating labor markets farmers may be forced to work too much on their own farms, pushing their marginal productivity below the market wage (see particularly Sen 1966, 1975).

However, if a relationship like (2.52) is estimated on a cross section of farms, and even if the amount of land is outside the control of the farmer, (2.52) is likely to suffer from what are effectively simultaneity problems. Such issues have the distinction of being among the very first topics studied in the early days of econometrics (see Marschak and Andrews 1944). Although it may be reasonable to suppose that the farmer treats his farm size as fixed when deciding what to plant and how hard to work, this does not mean that $A_i$ is uncorrelated with $u_i$ in (2.52). Farm size may not be controlled by the farmer, but farms do not get to be the size they are at random. The mechanism determining farm size will differ from place to place and time to time, but it is unlikely to be independent of the quality of the land. "Desert" farms that are used for low-intensity animal grazing are typically larger than "garden" farms, where the land is rich and output per hectare is high.
Such a correlation will be present whether farms are allocated by the market (low-quality land is cheaper per hectare so that it is easier for an owner-occupier to buy a large farm) or by state-mandated land schemes (each farmer is given a plot large enough to make a living). In consequence, the right-hand side of (2.52) is at least partly determined by the left-hand side, and regression estimates of $\beta$ will be biased downward. We can also give this simultaneity an omitted variable interpretation where land quality is the missing variable; if quality could be included in the regression instead of in the residual, the new residual could more plausibly be treated as orthogonal to farm size. At the same time, the coefficient $\beta$ would more nearly measure the effect of land size, and not as in (2.52) the effect of land size contaminated by the (negative) projection of land quality on farm size. Indeed, when data are available on land quality (Bhalla and Roy 1988) or when quality is controlled by IV methods (Benjamin 1993), there is little or no evidence of a negative relationship between farm size and productivity.

The effect of an omitted variable is worth recording explicitly, since the formula is one of the most useful in the econometrician's toolbox, and is routinely used to assess results and to calculate the direction of bias caused by the omission. Suppose that the correct model is

(2.53) $y_i = \alpha + \beta x_i + \gamma z_i + u_i$

and that we have data on y and x, but not on z. In the current example, y is yield, and z is land quality. If we run the regression of y on x, the probability limit of the OLS estimate of $\beta$ is

(2.54) $\operatorname{plim} \hat\beta = \beta + \gamma \, \dfrac{\operatorname{cov}(x, z)}{\operatorname{var} x}$.

In the example, it might be the case that $\beta = 0$, so that farm size has no effect on yields conditional on land quality. But $\gamma > 0$, because better land has higher yields, and the probability limit of $\hat\beta$ will be negative because farm size and land quality are negatively correlated.
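The omitted-variable formula (2.54) can be checked with a short simulation. The sketch below is illustrative: the sample size, the coefficients, the strength of the negative link between farm size and land quality, and the helper name `covariance` are all assumptions chosen for the example. It sets $\beta = 0$ and $\gamma > 0$, as in the farm-size discussion, regresses yield on size alone, and compares the resulting slope with the value that (2.54) predicts from the sample moments.

```python
import random


def covariance(a, b):
    """Sample covariance (divisor n) of two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / len(a)


random.seed(2)
n = 5000
alpha, beta, gamma = 1.0, 0.0, 1.0     # hypothetical: size has no true effect

size = [random.gauss(0.0, 1.0) for _ in range(n)]               # x, farm size
quality = [-0.5 * x + random.gauss(0.0, 1.0) for x in size]     # z, omitted
yields = [alpha + beta * x + gamma * z + random.gauss(0.0, 1.0)
          for x, z in zip(size, quality)]                       # y

# Regression of y on x alone, with z left in the residual.
short_slope = covariance(size, yields) / covariance(size, size)
# Right-hand side of (2.54), evaluated at the sample moments.
predicted = beta + gamma * covariance(size, quality) / covariance(size, size)

print("slope with quality omitted:", round(short_slope, 3))
print("value implied by (2.54):   ", round(predicted, 3))
```

Both numbers are negative and close to one another even though the true effect of size is zero, which is the pattern the text attributes to unobserved land quality.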
The land quality problem arises in a similar form if we attempt to use equations like (2.52) to measure the effects on output of extension services or "modern" inputs such as chemical fertilizer. Several studies, Bevan, Collier and Gunning (1989) for Kenya and Tanzania, and Deaton and Benjamin (1988) for Côte d'Ivoire, find that a regression of output on fertilizer input shows extremely high returns, estimates that, if correct, imply considerable inefficiency and scope for government intervention. Deaton and Benjamin use the 1985 Living Standards Survey of Côte d'Ivoire to estimate the following regression between cocoa output, the structure of the orchard, and the use of fertilizer and insecticide,

(2.55) $\ln(Q/LM) = \underset{(68.5)}{5.621} + \underset{(4.3)}{0.526}\,(LO/LM) + \underset{(2.5)}{0.054}\,\mathit{Insect} + \underset{(2.8)}{0.158}\,\mathit{Fert}$

where Q is kilos of cocoa produced on the farm, LM and LO are the numbers of hectares of "mature" and "old" trees, respectively, and Insect and Fert are expenditures in thousands of Central African francs per cocoa hectare on insecticide and fertilizer, respectively. According to (2.55), an additional 1,000 francs spent on fertilizer will increase the logarithm of output per hectare by 0.158, which at a sample mean log yield of 5.64 implies an additional 48 kilos of cocoa at 400 francs per kilo, or an additional 19,200 francs. However, only slightly more than a half of the cocoa stands are fully mature, and the farmers pay the métayeurs who harvest the crop between a half and a third of the total. But even after these adjustments, the farmer will be left with a return of 5,400 for an outlay of 1,000 francs. Insecticide is estimated to be somewhat less profitable, and the same calculation gives a return of only 1,800 for each 1,000 francs outlay. Yet only 1 in 14 farmers uses fertilizer, and 1 in 5 uses insecticide.

On the surface, these results seem to indicate very large inefficiencies.
However, there are other interpretations. It is likely that highly productive farms are more likely to adopt fertilizer, particularly if the use of fertilizer is an indicator of farmer quality and the general willingness to adopt modern methods, high-yielding varieties, and so on. Credit for fertilizer purchases may only be available to better, or to better-off farmers. Suppose also that some farmers cannot use fertilizer because of local climatic or soil conditions or because of the type of trees in their stand, while others have the conditions to make good use of it. When we compare these different farms, we shall find what we have found, that farmers that use fertilizer are more productive, but there is no implication that more fertilizer should be used. Expenditure on fertilizer in (2.55) may do no more than indicate that the orchard contains new hybrid varieties of cocoa trees, something on which the survey did not collect data.

Example 3. The evaluation of projects

Analysis of the effectiveness of government programs and projects has always been a central topic in development economics. Regression analysis seems like a helpful tool in this endeavor, because it enables us to link outcomes (incomes, consumption, employment, health, fertility) to the presence or extent of programs designed to influence them. The econometric problems of such analyses are similar to those we encountered when linking farm outputs to farm inputs. In particular, it is usually impossible to maintain that the explanatory variables, in this case the programs, are uncorrelated with the regression residuals. Government programs are not typically run as experiments, in which some randomly selected groups are treated and others are left alone.

A regression analysis may show that health outcomes are better in areas where the government has put clinics, but such an analysis takes no account of the process whereby sites are chosen.
Clinics may be put where health outcomes were previously very poor, so that the cross-section regression will tend to underestimate their effects, or they may be allocated to relatively wealthy districts that are politically powerful, in which case regression analysis will tend to overstate their true impact. Rosenzweig and Wolpin (1986) found evidence of underestimation in the Philippines, where the positive effect of clinics on children's health did not show up in a cross section of children because clinics were allocated first to the areas where they were most needed. The clinics were being allocated in a desirable way, and that fact caused regression analysis to fail to detect the benefits. In the next section, I shall follow Rosenzweig and Wolpin and show how panel data can sometimes be used to circumvent these difficulties. I shall return to the issue of project evaluation later in this section when I come to discuss selection bias, and again in Section 2.6 on IV estimation.

Example 4. Simultaneity and lags: nutrition and productivity

It is important to realize that in cross-section data, simultaneity cannot usually be avoided by using lags to ensure that the right-hand side variables are prior in time to the left-hand side variables. If x precedes y, then it is reasonable to suppose that y cannot affect x directly. However, there is often a third variable that affects y today as well as x yesterday, and if this variable is omitted from the regression, today's y will contain information that is correlated with yesterday's x. The land quality issue in the previous example can be thought about this way; although farm size is determined before the farmer's input and effort decisions, and before they and the weather determine farm output, both output and inputs are affected by land quality, so that there remains a correlation between output and the predetermined variables.
As a final example, consider one of the more intractable cases of simultaneity, between nourishment and productivity. If poor people cannot work because they are malnourished, and they cannot eat because they do not earn enough, poor people are excluded from the labor market and there is persistent unemployment and destitution. The theory of this interaction was developed by Mirrlees (1975) and Stiglitz (1976), and it has been argued that such a mechanism helps account for destitution in India (Dasgupta 1993) and for the slow pace of premodern development in Europe (Fogel 1994).

People who eat better may be more productive, because they have more energy and work more efficiently, but people who work more efficiently also earn more, out of which they will spend more on food. Disentangling the effect of nutrition on wages from the Engel curve for food is difficult, and as emphasized by Bliss and Stern (1981), it is far from clear that the two effects can ever be disentangled. One possibility, given suitable data, is to suppose that productivity depends on nutrition with a lag (sustained nutrition is needed for work), while consumption depends on current income. Hence, if $y_{it}$ is the productivity of individual i at time t, and $c_{it}$ is consumption of calories, we might write

(2.56)  $y_{it} = \alpha_1 + \beta_1 c_{i,t-1} + \gamma_1 z_{1it} + u_{1it}$
        $c_{it} = \alpha_2 + \beta_2 y_{it} + \gamma_2 z_{2it} + u_{2it}$

where $z_1$ and $z_2$ are other variables needed to identify the system. Provided equation (2.56) is correct and the two error terms are serially independent, both equations can consistently be estimated by least squares in a cross section with information on lagged consumption. However, any form of serial dependence in the residuals $u_{1it}$ will make OLS estimates of the first equation inconsistent.
But there is a good reason to suppose that these residuals will be serially correlated, since permanent productivity differences across people that are not attributable to nutrition or the other variables will add a constant "individual" component to the error. Individuals who are more productive in one period are likely to be more productive in the next, even when we have controlled for their nutrition and other observable covariates. More productive individuals will have higher incomes and higher levels of nutrition, not only today but also yesterday, so that the lag in the equation no longer removes the correlation between the error term and the right-hand-side variable. In a cross section, predetermined variables can rarely be legitimately treated as exogenous.

Measurement error

Measurement error in survey data is a fact of life, and while it is not always possible to counter its effects, it is always important to realize what those effects are likely to be, and to beware of inferences that are possibly attributable to, or contaminated by, measurement error.

The textbook case is the univariate regression model where both the explanatory and dependent variables are subject to mean-zero errors of measurement. Hence, for the correctly measured variables y and x, we have the linear relationship

(2.57)  $y_i = \alpha + \beta x_i + u_i$

together with the measurement equations

(2.58)  $\tilde{x}_i = x_i + \epsilon_{1i}, \qquad \tilde{y}_i = y_i + \epsilon_{2i}$

where the measurement error is assumed to be orthogonal to the true variables. Faute de mieux, $\tilde{y}$ is regressed on $\tilde{x}$, and the OLS parameter estimate of $\beta$ has the probability limit

(2.59)  $\operatorname{plim} \hat{\beta} = \beta \, \dfrac{m_{xx}}{m_{xx} + \sigma_1^2} = \beta \lambda_x$

where $m_{xx}$ is the variance of the unobservable, correctly measured x, and $\sigma_1^2$ is the variance of the measurement error in x. Equation (2.59) is the "iron law of econometrics," that the OLS estimate of $\beta$ is biased towards zero, or "attenuated."
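The attenuation in (2.59) is easy to check numerically. The following is a minimal sketch, not part of the original analysis: the parameter values (a true slope of 1, equal signal and noise variances, an intercept of 2, and error standard deviations of 0.5) and the function names are illustrative assumptions, and only the Python standard library is used.

```python
import random


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx


def reliability_ratio(m_xx, sigma1_sq):
    """lambda_x in (2.59): signal variance over signal plus noise variance."""
    return m_xx / (m_xx + sigma1_sq)


def simulate_attenuation(beta=1.0, m_xx=1.0, sigma1_sq=1.0, n=20000, seed=0):
    """Regress mismeasured y on mismeasured x; all values are hypothetical."""
    rng = random.Random(seed)
    x_true = [rng.gauss(0.0, m_xx ** 0.5) for _ in range(n)]
    y_true = [2.0 + beta * x + rng.gauss(0.0, 0.5) for x in x_true]
    # Measurement equations (2.58): mean-zero noise added to both variables.
    x_obs = [x + rng.gauss(0.0, sigma1_sq ** 0.5) for x in x_true]
    y_obs = [y + rng.gauss(0.0, 0.5) for y in y_true]
    return ols_slope(x_obs, y_obs)


slope = simulate_attenuation()
predicted = 1.0 * reliability_ratio(1.0, 1.0)  # beta * lambda_x = 0.5
print(f"simulated slope {slope:.3f}, plim from (2.59) {predicted:.3f}")
```

With equal signal and noise variances the estimated slope should settle near half the true value, even though the noise in the dependent variable is also present.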
The degree of attenuation is the ratio of signal to combined signal and noise, $\lambda_x$, the reliability ratio. The presence of measurement error in the dependent variable does not bias the regression coefficients, because it simply adds to the variance of the equation as a whole. Of course, this measurement error, like the measurement error in x, will decrease the precision with which the parameters are estimated.

Attenuation bias is amplified by the addition of correctly measured explanatory variables to the bivariate regression (2.57). Suppose we add a vector z to the right-hand side of (2.57), and assume that z is uncorrelated with the measurement error in x and with the original residuals. Then the probability limit of the OLS estimate of $\beta$, the coefficient of x, is now $\lambda_1 \beta$, where the new reliability ratio $\lambda_1$ is

(2.60)  $\lambda_1 = \dfrac{\lambda_x - R_x^2}{1 - R_x^2}$

and $R_x^2$ is the $R^2$ from the regression of $\tilde{x}$ on z. The new explanatory variables z "soak up" some of the signal from the noisy regressor $\tilde{x}$, so that the reliability ratio for $\beta$ is reduced, and the "iron law" more severely enforced.

More generally, consider a multivariate regression where all regressors may be noisy and where the measurement error in the independent variables may be correlated with the measurement error in the dependent variable. Suppose that the correctly measured variables satisfy

(2.61)  $y = X\beta + u$.

Then the OLS parameter estimates have probability limits given by

(2.62)  $\operatorname{plim} \hat{\beta} = (M + \Omega)^{-1} M \beta + (M + \Omega)^{-1} \gamma$

where M is the moment matrix of the true x's, $\Omega$ is the variance-covariance matrix of the measurement error in the x's, and $\gamma$ is the vector of covariances between the measurement errors in the x's and the measurement error in y.
The first term in (2.62) is the matrix generalization of the attenuation effect in the univariate regression (the vector of parameters is subject to a matrix rather than scalar shrinkage factor), while the second term captures any additional bias from a correlation between the measurement errors in dependent and independent variables. The latter effects can be important; for example, if consumption is being regressed on income, and if there is a common and noisily measured imputation term in both (home-produced food, or the imputed value of owner-occupied housing), then there will be an additional source of bias beyond attenuation effects. Even in the absence of this second term on the right-hand side of (2.62) and, in spite of the obvious generalization from scalar to matrix attenuation, the formula does not yield any simple result on the direction of bias in any one coefficient (unless, of course, $\Omega$ is diagonal).

One useful general lesson is to be specific about the structure of measurement error, and to use a richer and more appropriate specification than the standard one of mean-zero, independent noise. The analysis is rarely complex, is frequently worthwhile, and will not always lead to the standard attenuation result. One specific example is worth a brief discussion. It arises frequently and is simple, but is nevertheless sometimes misunderstood. Consider the model

(2.63)  $y_{ic} = \alpha + \beta x_{ic} + \gamma z_c + u_{ic}$

where i is an individual who lives in village c, $y_{ic}$ is an outcome variable, and $x_{ic}$ and $z_c$ are individual and village-level explanatory variables. In a typical example, y might be a measure of educational attainment, x a set of family background variables, and z a measure of educational provision or school quality in the village. The effect of health provision on health status might be another example.
What often happens in practice is that the z-variables are obtained from administrative, not survey data, so that we do not have village-level data on z, but only broader measures, perhaps at a district or provincial level. These measures are error-ridden proxies for the ideal measures, and it might seem that the iron law would apply. But this is not so. To see why, write $z_p$ for the broad measure (p is for province) so that

(2.64)  $z_p = n_p^{-1} \sum_{c \in p} z_c$

where $n_p$ is the number of villages in the province. Hence, instead of the measurement equation (2.58) where the observable is the unobservable plus an unrelated measurement error, we have

(2.65)  $z_c = z_p + \epsilon_c$

and it is now the observable $z_p$ that is orthogonal to the measurement error. Because the measurement error in (2.65) is the deviation of the village-level z from its provincial mean, it is orthogonal to the observed $z_p$ by construction. As a result, when we run the regression (2.63) with provincial data replacing village data, there is no correlation between the explanatory variables and the error term, and the OLS estimates are unbiased and consistent. Of course, the loss of the village-level information is not without cost. By (2.65), the averages are less variable than the individuals, so that the precision of the estimates will be reduced. And we must always be careful in these cases to correct standard errors for group effects as discussed in Section 2.2 above. But there is no errors-in-variables attenuation bias.

In Section 2.6 below, I review how, in favorable circumstances, IV techniques can be used to obtain consistent estimates of the parameters even in the presence of measurement error. Note, however, that if it is possible to obtain estimates of measurement error variances and covariances, $\sigma_1^2$ in (2.59) or $\Omega$ and $\gamma$ in (2.62), then the biases can be corrected and consistent estimates obtained by substituting the OLS estimate on the left-hand side of (2.62), replacing $\Omega$, $\gamma$, and M on the right-hand side by their estimates, and solving for $\beta$. For (2.62), this leads to the estimator

(2.66)  $\tilde{\beta} = (n^{-1} \tilde{X}'\tilde{X} - \Omega)^{-1} (n^{-1} \tilde{X}'\tilde{y} - \gamma)$

where n is the sample size, and the tildes denote variables measured with error. The estimator (2.66) is consistent if $\Omega$ and $\gamma$ are known or are replaced by consistent estimates. This option will not always be available, but is sometimes possible, for example, when there are several mismeasured estimates of the same quantity, and we shall see practical examples in Sections 5.3 and 5.4 below.

Selectivity issues

In Chapter 1 and the first sections of this chapter, I discussed the construction of samples, and the fact that the sample design frequently needs to be taken into account when estimating characteristics of the underlying population. This is particularly important when the selection of the sample is related to the quantity under study; average travel time in a sample of travelers is likely to be quite unrepresentative of average travel time among the population as a whole; if wages influence the decision to work, average wages among workers, which are often the only wages observed, will be an upward-biased estimator of actual and potential wages. Sample selection also affects behavioral relationships. In one of the first and most famous examples, Gronau (1973) found that women's wages were higher when they had small children, a result whose inherent implausibility prompted the search for an alternative explanation, and which led to the selection story. Women with children have higher reservation wages, fewer of them work, and the wages of those who do are higher.
As with the other cases in this section, the econometric problem is the induced correlation between the error terms and the regressors. In the Gronau example, the more valuable is a woman's time at home, the larger will have to be the unobserved component in her wages in order to induce her to work, so that among working women, there is a positive correlation between the number of children and the error term in the wage equation.

A useful and quite general model of selectivity is given in Heckman (1990); according to this there are two different regressions or regimes, and the model switches between them according to a dichotomous "switch" that is itself explained. The model is written:

(2.67)  $y_{0i} = x_{0i}'\beta_0 + u_{0i}, \qquad y_{1i} = x_{1i}'\beta_1 + u_{1i}$

together with the {1,0} variable $d_i$, which satisfies

(2.68)  $d_i = 1(z_i'\gamma + u_{2i} > 0)$

where the indicator function 1(.) takes the value 1 when the statement it contains is true, and is zero otherwise. The observed variable $y_i$ is determined according to

(2.69)  $y_i = d_i y_{0i} + (1 - d_i) y_{1i}$.

The model is sometimes used in almost exactly this form; for example, the two equations in (2.67) could be wage equations in the formal and informal sectors respectively, while (2.68) models the decision about which sector to join (see, for example, van der Gaag, Stelcner and Vijverberg 1989 for a model of this sort applied to LSMS data from Peru and Côte d'Ivoire). However, it also covers several special cases, many of them useful in their own right.

If the right-hand side of the second equation in (2.67) were zero, as it would be if $\beta_1 = 0$ and the variance of $u_1$ were zero, we would have the censored regression model or generalized Tobit. This further specializes to the Tobit model if the argument of (2.68) and the right-hand side of the first equation coincide, so that the switching behavior and the size of the response are controlled by the same factors.
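The selection mechanism in (2.67) to (2.69) can be sketched in a few lines. This is only an illustration under strong assumptions that are not in the text: there are no regressors, the index $z_i'\gamma$ is set to zero, both errors are standard normal with a hypothetical correlation of 0.6, and the population mean outcome is set to 1. The function name is invented for the example.

```python
import random


def simulate_selection(rho=0.6, n=50000, seed=1):
    """Compare the mean outcome for everyone with the mean for those selected.

    Hypothetical setup: y_0i = 1 + u_0i, d_i = 1(u_2i > 0), and
    corr(u_0i, u_2i) = rho, with both errors standard normal.
    """
    rng = random.Random(seed)
    all_outcomes = []
    selected_outcomes = []
    for _ in range(n):
        u2 = rng.gauss(0.0, 1.0)
        # Build u0 so that it has unit variance and correlation rho with u2.
        u0 = rho * u2 + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        y0 = 1.0 + u0
        all_outcomes.append(y0)
        if u2 > 0:  # the switch in (2.68) with a zero index
            selected_outcomes.append(y0)
    population_mean = sum(all_outcomes) / len(all_outcomes)
    selected_mean = sum(selected_outcomes) / len(selected_outcomes)
    return population_mean, selected_mean


population_mean, selected_mean = simulate_selection()
print(f"population mean {population_mean:.3f}, selected mean {selected_mean:.3f}")
```

Because the error in the outcome equation is positively correlated with the error in the switch, the average outcome among the selected observations lies above the population average, which is the upward bias in observed wages described earlier.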
However, the generalized Tobit model is also useful; for example, it is often argued that the factors that determine whether or not people smoke tobacco are different from the factors that determine how much smokers smoke. In this case, (2.69) implies that for those values of y that are positive, the regression function is

(2.70)  $E(y_i \mid x_i, z_i, y_i > 0) = x_i'\beta + \lambda(z_i'\gamma)$

where, since there is only one x and one $\beta$, I have dropped the zero suffix, and where the last term is defined by

(2.71)  $\lambda(z_i'\gamma) = E(u_{0i} \mid u_{2i} \geq -z_i'\gamma)$.

(Compare this with the Tobit in (2.47) above.) This version of the model can also be used to think about the case where the data are truncated, rather than censored as in the Tobit and generalized Tobit. Censoring refers to the case where observations that fall outside limits, in this case below zero, are replaced by the limit points, hence the term "censoring." With truncation, observations beyond the limit are discarded and do not appear in our data. Censoring is easier to deal with because, although we do not observe the underlying latent variable, individual observations are either censored or not censored, and for both we observe the covariates x and z, so that it is possible to estimate the switching equation (2.68) as well as (2.70). With truncation, we know nothing about the truncated observations, so that we cannot estimate the switching process, and we are restricted to (2.70). The missing information in the truncated regression makes it difficult to handle convincingly, and it should be avoided when possible.

A second important special case of the general model is the "treatment" or "policy evaluation" case.
In the standard version, the right-hand sides of the two switching regressions in (2.67) are taken to be identical apart from their constant terms, so that (2.69) takes the special form

(2.72)  $y_i = \alpha + \theta d_i + x_i'\beta + u_i$

so that the parameter $\theta$ is the effect on the outcome variable of whether or not the "treatment" is applied. If this were a controlled and randomized experiment, the randomization would guarantee that $d_i$ would be orthogonal to $u_i$. However, since $u_2$ in (2.68) is correlated with the error terms in the regressions in (2.67), least squares will not yield consistent estimates of (2.72) because $d_i$ is correlated with $u_i$. This model is the standard one for examining union wage differentials, for example, but it also applies to many important applications in development where $d_i$ indicates the presence of some policy or project. The siting of health clinics and schools are perhaps the most obvious examples. As we have already seen above, this version of the model can also be thought of in terms of simultaneity bias.

There are various methods of estimating the general model and its variants. One possibility is to specify some distribution for the three sets of disturbances in (2.67) and (2.68), typically joint normality, and then to estimate by maximum likelihood. Given normality, the $\gamma$-parameters in (2.68) can be estimated (up to scale) by probit, and again given normality, the $\lambda$-function in (2.71) has a specific form, the (inverse) Mills' ratio, and as Heckman (1976) showed in a famous paper, the results from the probit can be substituted into (2.70) in such a way that the remaining unknown parameters can be estimated by least squares. Since I shall refer to this again, it is worth briefly reviewing the mechanics.
When $u_0$ and $u_2$ are jointly normally distributed, the expectation of each conditional on the other is linear, so that we can write

(2.73)  $u_{0i} = \sigma_0 \rho \, (u_{2i} / \sigma_2) + \epsilon_i$

where $\epsilon_i$ is orthogonal to $u_{2i}$, $\sigma_0$ and $\sigma_2$ are the two standard deviations, and $\rho$ is the correlation coefficient. (Note that $\rho \sigma_0 / \sigma_2 = \sigma_{02} / \sigma_2^2$ is the large-sample regression coefficient of $u_0$ on $u_2$, the ratio of the covariance to variance.) Given (2.73), we can rewrite (2.71) as

(2.74)  $\lambda(z_i'\gamma) = \rho \sigma_0 \, E\!\left( \dfrac{u_{2i}}{\sigma_2} \,\Big|\, \dfrac{u_{2i}}{\sigma_2} \geq -\dfrac{z_i'\gamma}{\sigma_2} \right) = \rho \sigma_0 \, \dfrac{\phi(z_i'\gamma / \sigma_2)}{\Phi(z_i'\gamma / \sigma_2)}$

where $\phi(.)$ and $\Phi(.)$ are the density and distribution functions of the standard normal distribution, and where the final formula relies on the special properties of the normal distribution. The regression function (2.70) can then be written as

(2.75)  $y_i = x_i'\beta + \rho \sigma_0 \, \dfrac{\phi(z_i'\gamma / \sigma_2)}{\Phi(z_i'\gamma / \sigma_2)}$.

The vector of ratios $\gamma / \sigma_2$ can be estimated by running a probit on the dichotomous $d_i$ from (2.68), the estimates used to compute the inverse Mills' ratio on the right-hand side of (2.75), and consistent estimates of $\beta$ and $\rho \sigma_0$ obtained by OLS regression.

This "Heckit" (Heckman's probit) procedure is widely used in the empirical development literature, to the extent that it is almost routinely applied as a method of dealing with selectivity bias. In recent years, however, it has been increasingly realized that the normality assumptions in these and similar procedures are far from incidental, and that the results, and even the identification of the models, may be compromised if we are not prepared to maintain normality. Even when normality holds, there will be the difficulties with heteroskedasticity that we have already seen. Recent work has been concerned with the logically prior question as to whether and under what conditions the parameters of these models are identified without further parametric distributional assumptions, and with how identified models can be estimated in a way that is consistent and at least reasonably efficient under the sort of assumptions that make sense in practice.
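For readers who want to see the extra regressor in (2.75) concretely, the following is a minimal sketch of the inverse Mills' ratio computed from a first-stage index. It assumes that probit estimates of $\gamma / \sigma_2$ are already in hand; the coefficient values in the example and the function names are hypothetical, and the probit itself and the second-stage regression are not shown.

```python
import math


def std_normal_pdf(v):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(v):
    """Distribution function of the standard normal, via the error function."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))


def inverse_mills(index):
    """phi(index) / Phi(index), the ratio multiplying rho * sigma_0 in (2.75)."""
    return std_normal_pdf(index) / std_normal_cdf(index)


def heckit_regressor(z, gamma_over_sigma2):
    """Second-stage regressor for one observation.

    z and gamma_over_sigma2 are equal-length sequences; the second would
    come from a first-stage probit, and is simply assumed here.
    """
    index = sum(zk * gk for zk, gk in zip(z, gamma_over_sigma2))
    return inverse_mills(index)


# Hypothetical first-stage coefficients and one observation's z-variables.
example = heckit_regressor([1.0, 2.0], [0.5, -0.25])
print(f"inverse Mills' ratio at a zero index: {example:.4f}")
```

The ratio falls as the index rises, so observations that are very likely to be selected receive only a small correction. For large negative indices the denominator becomes very small, and a more careful numerical treatment than this sketch would be needed.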
The identification of the general model turns out to be a delicate matter, and is discussed in Chamberlain (1986), Manski (1988), and Heckman (1990). Given data on which observations are in which regime, the switching equation (2.68) is identified without further distributional assumptions; at least if we make the (essentially normalizing) assumption that the variance of $u_2$ is unity. The identification of the other equations requires that there be at least one variable in the switching equation that does not appear in the substantive equations, and even then there can be difficulties; for example, identification requires that the variables unique to the switching equation be continuous. In many practical applications, these conditions will not be met, or at best be controversial. In particular, it is often difficult to exclude any of the selection variables from the substantive equations. Gronau's example, in which children clearly do not belong in the wage equation, seems to be the exception rather than the rule, and unless it is clear how the selection mechanism is working, there seems little point in pursuing these sorts of models, as opposed to a standard investigation of appropriate conditioning variables and how they enter the regression function.

The robust estimation of the parameters of selection models is a live research topic, although the methods are still experimental, and there is far from general agreement on which are best. In the censoring model (2.70), there exist distribution-free methods that generalize Heckman's two-stage procedure (see, for example, Newey, Powell, and Walker 1990, who make use of the kernel estimation methods that are discussed in Chapters 3 and 4 below).
One possible move in this direction is to retain a probit, or even a linear probability model regressing $d_i$ on $z_i$, for the first-stage estimation of (2.68), and to use the estimates to form the index $z_i'\gamma$, which is entered in the second-stage regression (2.70), not through the Mills' ratio as in (2.75), but in polynomial form, with the polynomial regarded as an approximation to whatever the true $\lambda$-function should be. This is perhaps an unusual mixture of parametric and nonparametric techniques, but the probit model or linear probability model (if the probabilities are typically far from either zero or one) are typically acceptable as functional forms, and it makes most sense to focus on removing the normality assumptions.

The "policy evaluation" or "treatment" model (2.72) is most obviously estimated using IV techniques as described in Section 2.6 below. Note that the classic experimental case corresponds to the case where treatment is randomly assigned, or is randomly assigned to certain groups, so that in either case the $u_{2i}$ in (2.68) is uncorrelated with the errors in the outcome equations (2.67). In most economic applications, the "treatment" has at least some element of self-selection, so that $d_i$ in (2.72) will be correlated with the errors, and instrumentation is required. The obvious instruments are the z-variables, although in practice there will often be difficulties in finding instruments that can be plausibly excluded from the substantive equation. Good instruments in this case can sometimes be provided by "natural experiments," where some feature of the policy design allows the construction of "treatments" and "controls" that are not self-selected. I shall discuss these in more detail below.

2.5 Panel data

When our data contain repeated observations on each individual, the resulting panel data open up a number of possibilities that are not available in the single cross section.
In particular, the opportunity to compare the same individual under different circumstances permits the possibility of using that individual as his or her own control, so that we can come closer to the ideal experimental situation. In the farm example of the previous section, the quality of the farm, or indeed of the farmer, can be controlled for, and indeed, the first use of panel data in econometrics was by Mundlak (1961) (see also Hoch 1955), who estimated farm production functions controlling for the quality of farm management. Similarly, we have seen that the use of regression for project evaluation is often invalidated by the purposeful allocation of projects to regions or villages, so that the explanatory variable, the presence or absence of the project, is correlated with unobserved characteristics of the village. Rosenzweig and Wolpin (1986) and Pitt, Rosenzweig, and Gibbons (1993) have made good use of panel data to test for such effects in educational, health, and family planning programs in the Philippines and Indonesia.

Several different kinds of panel data are sometimes available in developing countries (see also the discussion in Section 1.1 above). A very few surveys, most notably the ICRISAT survey in India, have followed the same households over a substantial period of time. In some of the LSMS surveys, households were visited twice, a year apart, and there are several cases of opportunistic surveys returning to households for repeat interviews, often with a gap of several years. Since many important changes take time to occur, and projects and policies take time to have their effect, the longer gap often produces more useful data. It is also possible to "create" panel data from cross-sectional data, usually by aggregation. For example, while it is not usually possible to match individuals from one census to another, it is frequently possible to match locations, so as to create a panel at the location level.
A good example is Pitt, Rosenzweig, and Gibbons (1993), who use several different cross-sectional surveys to construct data on facilities for 1980 and 1985 for 3,302 kecamatan (subdistricts) in Indonesia. In Section 2.7 below, I discuss another important example in some detail, the use of repeated but independent cross sections to construct panel data on birth cohorts of individuals. For all of these kinds of data, there are opportunities that are not available with a single cross-sectional survey.

Dealing with heterogeneity: difference- and within-estimation

To see the main advantage of panel data, start from the linear regression model

(2.76)  $y_{it} = \beta' x_{it} + \theta_i + \mu_t + u_{it}$

where the index i runs from 1 to n, the sample size, and t from 1 to T, where T is usually small, often just two. The quantity $\mu_t$ is a time (or macro) effect, that applies to all individuals in the sample at time t. The parameter $\theta_i$ is a fixed effect for observation i; in the farm size example above it would be unobservable land quality, in the nutritional wage example, it would be the unobservable personal productivity characteristic of the individual, and in the project evaluation case, it would be some unmeasured characteristic of the individual (or of the individual's region) that affects program allocation. These fixed effects are designed to capture the heterogeneity that causes the inconsistency in the OLS cross-sectional regression, and are set up in such a way as to allow their control using panel data. Note that there is nothing to prevent us from thinking of the $\theta$'s as randomly distributed over the population, so that in this sense the term "fixed effects" is an unfortunate one, but we are not prepared to assume that they are uncorrelated with the observed x's in the regression. Indeed, it is precisely this correlation that is the source of the difficulty in the farm, project evaluation, and nutrition examples.
The fact that we have more than one observation on each of the sample points allows us to remove the $\theta$'s by taking differences, or when there are more than two observations, by subtracting (or "sweeping out") the individual means. Suppose that T = 2, so that from (2.76), we can write

(2.77)  $y_{i2} - y_{i1} = (\mu_2 - \mu_1) + \beta'(x_{i2} - x_{i1}) + u_{i2} - u_{i1}$

an equation that can be consistently and efficiently estimated by OLS. When T is greater than two, use (2.76) to give

(2.78)  $y_{it} - \bar{y}_{i\cdot} = (\mu_t - \bar{\mu}) + \beta'(x_{it} - \bar{x}_{i\cdot}) + u_{it} - \bar{u}_{i\cdot}$

where the notation $\bar{y}_{i\cdot}$ denotes the time mean for individual i. Equation (2.78) can be estimated as a pooled regression by OLS, although it should be noted that there are n(T - 1) independent observations, not nT. Neither (2.77) nor (2.78) contains the individual fixed effects $\theta_i$, so that these regressions are free of any correlation between the explanatory variables and the unobserved fixed effects, and the parameters can be estimated consistently by OLS. Of course, the fixed effect must indeed be fixed over time, which there is often little reason to suppose, and it must enter the equation additively and linearly. But given these assumptions, OLS estimation of the suitably transformed regression will yield consistent estimates in the presence of unobserved heterogeneity, or omitted variables, even when that heterogeneity is correlated with one or more of the included right-hand side variables.

In the example from the Philippines studied by Rosenzweig and Wolpin (1986), there are data on 274 children from 85 households in 20 barrios. The cross-section regression of child nutritional status (age-standardized height) on exposure to rural health units and family planning programs gives negative (and insignificant) coefficients on both. Because the children were observed in two years, 1975 and 1979, it is also possible to run (2.77), where changes in height are regressed on changes in exposure, in which regression both coefficients become positive.
Such a result is plausible if the programs were indeed effective, but were allocated first to those who needed them the most.

The benefit of eliminating unobserved heterogeneity does not come without cost, and a number of points should be noted. Note first that the regression (2.77) has exactly half as many observations as the regression (2.76), so that, in order to remove the inconsistency, precision has been sacrificed. More generally, with T periods, one is sacrificed to control for the fixed effects, so that the proportional loss of efficiency is greatest when there are only two observations. Of course, it can be argued that there are limited attractions to the precise estimation of something that we do not wish to know, but a consistent but imprecise estimate can be further from the truth than an inconsistent estimator. The tradeoff between bias and efficiency has to be made on a case-by-case basis. We must also beware of misinterpreting a decrease in efficiency as a change in parameter estimates between the differenced and undifferenced equations. If the cross-section estimate shows that $\beta$ is positive and significant, and if the differenced data yield an estimate that is insignificantly different from both zero and the cross-section estimate, it is not persuasive to claim that the cross-section result is an artifact of not "treating" the heterogeneity. Second, the differencing will not only sweep out the fixed effects, it will sweep out all fixed effects, including any regressor that does not change over the period of observation. In some cases, this removes the attraction of the procedure, and will limit it in short panels.
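Returning to the differencing in (2.77), a small simulation can reproduce the kind of sign reversal reported for the Philippines. It is a sketch under invented assumptions, not a reconstruction of the Rosenzweig and Wolpin data: a single program dummy with a true effect of 0.5, a fixed effect that is standard normal, programs allocated to those with low values of a noisy "need" index, and coverage that expands between the two periods. All names and numbers are hypothetical.

```python
import random


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx


def simulate_panel(beta=0.5, n=20000, seed=2):
    """Return (cross-section slope in period 2, first-difference slope)."""
    rng = random.Random(seed)
    x1, x2, y1, y2 = [], [], [], []
    for _ in range(n):
        theta = rng.gauss(0.0, 1.0)          # unobserved fixed effect
        need = theta + rng.gauss(0.0, 1.0)   # low values mean greater need
        prog1 = 1.0 if need < -0.5 else 0.0  # period 1: neediest are served
        prog2 = 1.0 if need < 0.5 else 0.0   # period 2: coverage expands
        x1.append(prog1)
        x2.append(prog2)
        # Model (2.76) with time effects mu_1 = 0 and mu_2 = 1.
        y1.append(beta * prog1 + theta + 0.0 + rng.gauss(0.0, 0.5))
        y2.append(beta * prog2 + theta + 1.0 + rng.gauss(0.0, 0.5))
    cross_section = ols_slope(x2, y2)
    dx = [b - a for a, b in zip(x1, x2)]
    dy = [b - a for a, b in zip(y1, y2)]
    first_difference = ols_slope(dx, dy)
    return cross_section, first_difference


cross_section, first_difference = simulate_panel()
print(f"cross-section slope {cross_section:.3f}")
print(f"first-difference slope {first_difference:.3f}")
```

In this setup the cross-section slope is negative because the program dummy is picking up the low fixed effects of those who receive it, while the differenced regression, from which the fixed effect has been removed, recovers a slope close to the assumed effect.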
In the Ivorian cocoa farming example in the previous section, most of the farmers who used fertilizer reported the same amount in both periods, so that, although the panel data allows us to control for farm fixed effects, it still does not allow us to estimate how much additional production comes from the application of additional fertilizer.

Panel data and measurement error

Perhaps the greatest difficulties for difference- and within-estimators occur in the presence of measurement error. Indeed, when regressors are measured with error, within- or difference-estimators will no longer be consistent in the presence of unobserved individual fixed effects, nor need their biases be less than that of the uncorrected OLS estimator.

Consider the univariate versions of the regressions (2.76) and (2.77), and compare the probability limits of the OLS estimators in the two cases when, in addition to the fixed effects, there is white noise measurement error in x. Again, for simplicity, I compare the results from estimation on a single cross section with those from a two-period panel. The probability limit of the OLS estimator in the cross section (2.76) is given by

(2.79)  $\operatorname{plim} \hat{\beta} = \dfrac{\beta m_{xx} + c_{x\theta}}{m_{xx} + \sigma_1^2}$

where $c_{x\theta}$ is the covariance of the fixed effect and the true x, $\sigma_1^2$ is the variance of the measurement error, and I have assumed that the measurement errors and fixed effects are uncorrelated. The formula (2.79) is a combination of omitted variable bias, (2.54), and measurement error bias, (2.59). The probability limit of the difference-estimator in (2.77) is

(2.80)  $\operatorname{plim} \hat{\beta}_\Delta = \dfrac{\beta m_\Delta}{m_\Delta + \sigma_\Delta^2}$

where $m_\Delta$ is the variance of the difference of the true x, and $\sigma_\Delta^2$ is the variance of the difference of measurement error in x.

That the estimate in the levels suffers from two biases (attenuation bias and omitted variable bias) while the difference-estimate suffers from only attenuation bias is clearly no basis for preferring the latter!
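A little arithmetic with (2.79) and (2.80) shows how this can happen. The sketch below simply evaluates the two probability limits; the numerical values are hypothetical and chosen only for illustration, with a true slope of 1, a modest covariance between x and the fixed effect, little true change in x between periods, and measurement errors assumed uncorrelated over time so that the variance of their difference is twice the variance in levels.

```python
def plim_cross_section(beta, m_xx, c_xtheta, sigma1_sq):
    """Probability limit (2.79): omitted-variable plus attenuation bias."""
    return (beta * m_xx + c_xtheta) / (m_xx + sigma1_sq)


def plim_difference(beta, m_delta, sigma_delta_sq):
    """Probability limit (2.80): attenuation bias only."""
    return beta * m_delta / (m_delta + sigma_delta_sq)


beta = 1.0           # hypothetical true slope
m_xx = 1.0           # cross-sectional variance of the true x
c_xtheta = 0.3       # covariance of x with the fixed effect
sigma1_sq = 0.2      # variance of measurement error in levels
m_delta = 0.1        # variance of the change in the true x
sigma_delta_sq = 2.0 * sigma1_sq  # assumes errors uncorrelated over time

levels = plim_cross_section(beta, m_xx, c_xtheta, sigma1_sq)
differences = plim_difference(beta, m_delta, sigma_delta_sq)
print(f"levels: {levels:.3f}, bias {levels - beta:+.3f}")
print(f"differences: {differences:.3f}, bias {differences - beta:+.3f}")
```

With these values the levels estimator is off by less than a tenth of the true slope, while the difference estimator recovers only about a fifth of it, so the single bias is far larger than the two combined.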
The relevant question is not the number of biases but whether the differencing reduces the variance in the signal relative to the variance of the noise so that the attenuation bias in the difference-estimator is more severe than the combined attenuation and omitted variable biases in the cross-section regression. We have seen one extreme case already; when the true x does not change between the two periods, the estimator will be dominated by the measurement error and will converge to zero. Although the extreme case would often be apparent in advance, there are many cases where the cross-section variance is much larger than the variance in the changes over time, especially when the panel observations are not very far apart in time. Although measurement error may also be serially correlated, with the same individual misreporting in the same way at different times, there will be other cases where errors are uncorrelated over time, in which case the error difference will have twice the variance of the errors in levels.

Consider again the two examples of farm productivity and nutritional wages, where individual fixed effects are arguably important. In the first case, $m_{xx}$ is the cross-sectional variance of farm size, while $m_\Delta$ is the cross-sectional variance of the change in farm size from one period to another, something that will usually be small or even zero. In the nutritional wage example, there is probably much greater variation in eating habits between people than there is for the same person over time, so that once again, the potential for measurement error to do harm is much enhanced. One rather different case is worth recording since it is a rare example of direct evidence on measurement error. Bound and Krueger (1991) matched earnings data from the U.S. Current Population Survey with Social Security records, and were thus able to calculate the measurement error in the former.
They found that measurement error was serially correlated and negatively related to actual earnings. The reliability ratios (the ratios of signal variance to total variance), which are also the multipliers of $\beta$ in (2.79) and (2.80), fall from 0.82 in levels to 0.65 in differences for men, and from 0.92 to 0.81 for women.

Since measurement error is omnipresent, and because of the relative inefficiency of difference- and within-estimators, we must be careful never to assume that the use of panel data will automatically improve our inference, or to treat the estimate from panel data as a gold standard for judging other estimates. Nevertheless, it is clear that there is more information in a panel than in a single cross section, and that this information can be used to improve inference. Much can be learned from comparing different estimates. If the difference-estimate has a different sign from the cross-sectional estimate, inspection of (2.79) and (2.80) shows that the covariance between x and the heterogeneity must be nonzero; measurement error alone cannot change the signs. When there are several periods of panel data, the difference-estimator (2.77) and the within-estimator (2.78) are mathematically distinct, and in the presence of measurement error will have different probability limits. Griliches and Hausman (1986) show how the comparison of these two estimators can identify the variance of the measurement error when the errors are independent over time, so that consistent estimators can be constructed using (2.66). When errors are correlated over time, as will be the case if households persistently make errors in the same direction, information on measurement error can be obtained by comparing parameters from regressions computed using alternative differences, one period apart, two periods apart, and so on.
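The comparison between (2.79) and (2.80) can be made concrete with a small numerical sketch. The parameter values below are purely illustrative assumptions, not estimates from any survey; they are chosen so that the cross-sectional variance of the true x is large relative to the variance of its change, and so that measurement errors are uncorrelated over time, which doubles the error variance in differences.

```python
def plim_levels(beta, m_xx, c_xtheta, sigma2_1):
    """Probability limit of cross-section OLS, in the form of (2.79)."""
    return (beta * m_xx + c_xtheta) / (m_xx + sigma2_1)


def plim_difference(beta, m_delta, sigma2_delta):
    """Probability limit of the difference-estimator, in the form of (2.80)."""
    return beta * m_delta / (m_delta + sigma2_delta)


# Illustrative (assumed) values.
beta = 1.0                      # true coefficient
m_xx = 4.0                      # cross-sectional variance of the true x
c_xtheta = 0.5                  # covariance of the fixed effect with the true x
sigma2_1 = 1.0                  # variance of measurement error in levels
m_delta = 0.5                   # variance of the change in the true x (small)
sigma2_delta = 2.0 * sigma2_1   # errors uncorrelated over time: variance doubles

levels = plim_levels(beta, m_xx, c_xtheta, sigma2_1)
differences = plim_difference(beta, m_delta, sigma2_delta)

print(f"plim in levels:      {levels:.3f}")       # 0.900
print(f"plim in differences: {differences:.3f}")  # 0.200
```

With these assumed values the levels estimator, despite carrying two biases, converges to 0.9, while the difference-estimator converges to 0.2, because differencing removes most of the signal and doubles the noise.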
Lagged dependent variables and exogeneity in panel data

Although it will not be of great concern for this book, I should also note that there are a number of specific difficulties that arise when panel data are used to estimate regressions containing lagged dependent variables. In ordinary linear regressions, serial correlation in the residuals makes OLS inconsistent in the presence of a lagged dependent variable. In panel data, the presence of unobserved individual heterogeneity will have the same effect; if farm output is affected by unobserved farm quality, so must be last period's output on the same farm, so that this period's residual will be correlated with the lagged dependent variable. Nor can the heterogeneity be dealt with by using the standard within- or difference-estimators. When there is a lagged dependent variable together with unobserved fixed effects, and we difference, the right-hand side of the equation will have the lagged difference $y_{i,t-1} - y_{i,t-2}$, and although the fixed effects have been removed by the differencing, there is a differenced error term $u_{it} - u_{i,t-1}$, which is correlated with the lagged difference because $u_{i,t-1}$ is correlated with $y_{i,t-1}$. Similarly, the within-estimator is inconsistent because the deviation of lagged $y_{i,t-1}$ from its mean over time is correlated with the deviation of $u_{it}$ from its mean, not because $u_{it}$ is correlated with $y_{i,t-1}$, but because the two means are correlated. These inconsistencies vanish as the number of time periods in the panel increases but, in practice, most panels are short.

Nor are the problems confined to lagged dependent variables. Even if all the right-hand-side variables are uncorrelated with the contemporaneous regression error $u_{it}$, the deviations from their means can be correlated with the average over time, $\bar{u}_i$.
For this not to be the case, we require that explanatory variables be uncorrelated with the errors at all lags and leads, a requirement that is much more stringent than the usual assumption in time-series work that a variable is predetermined. It is also a requirement that is unlikely to be met in several of the examples I have been discussing. For example, farm yields may depend on farm size, on the weather, on farm inputs such as fertilizer and insecticide, and on (unobserved) quality. The inputs are chosen before the farmer knows output, but a good output in one year may make the farmer more willing, or more able, to use more inputs in a subsequent year. In such circumstances, the within-regression will eliminate the unobservable quality variable, but it will induce a correlation between inputs and the error term, so that the within-estimator will be inconsistent.

These problems are extremely difficult to deal with in a convincing and robust way, although there exist a number of techniques (see in particular Nickell 1981; Chamberlain 1984; Holtz-Eakin, Newey, and Rosen 1988; and particularly the series of papers, Arellano and Bond 1991, Arellano and Bover 1993, and Alonso-Borrego and Arellano 1996). But too much should not be expected from these methods; attempts to disentangle heterogeneity, on the one hand, and dynamics, on the other, have a long and difficult history in various branches of statistics and econometrics.

2.6 Instrumental variables

In all of the cases discussed in Section 2.4, the regression function differs from the structural model because of correlation between the error terms and the explanatory variables. The reasons differ from case to case, but it is the correlation that produces the inconsistency in OLS estimation. The technique of IV is the standard prescription for correcting such cases, and for recovering the structural parameters.
Provided it is possible to find instrumental variables that are correlated with the explanatory variables but uncorrelated with the error terms, then IV regression will yield consistent estimates.

For reference, it is useful to record the formulas. If X is the $n \times k$ matrix of explanatory variables, and if W is an $n \times k$ matrix of instruments, then the IV estimator of $\beta$ is given by

(2.81) $$\hat{\beta}_{IV} = (W'X)^{-1} W'y.$$

Since $y = X\beta + u$ and W is orthogonal to u by assumption, (2.81) yields consistent estimators if the premultiplying matrix $W'X$ is of full rank. If there are fewer instruments than explanatory variables (and some explanatory variables will often be suitable to serve as their own instruments), the IV estimate does not exist, and the model is underidentified. When there are exactly as many instruments as explanatory variables, the model is said to be exactly identified. In practice, it is desirable to have more instruments than strictly needed, because the additional instruments can be used either to increase precision or to construct tests. In this overidentified case, suppose that Z is an $n \times k'$ matrix of potential instruments, with $k' > k$. Then all the instruments are used in the construction of the set W by using two-stage least squares, so that at the first stage, each X is regressed on all the instruments Z, with the predicted values used to construct W. If we define the "projection" matrix $P_Z = Z(Z'Z)^{-1}Z'$, the IV estimator is written

(2.82) $$\hat{\beta}_{IV} = \left(X'Z(Z'Z)^{-1}Z'X\right)^{-1} X'Z(Z'Z)^{-1}Z'y = (X'P_Z X)^{-1} X'P_Z y.$$

Under standard assumptions, $\hat{\beta}_{IV}$ is asymptotically normally distributed with mean $\beta$ and a variance-covariance matrix that can be estimated by

(2.83) $$V = (X'P_Z X)^{-1} (X'P_Z D P_Z X) (X'P_Z X)^{-1}.$$
The choice of D depends on the treatment of the variance-covariance matrix of the residuals, and is handled as with OLS, replaced by $\sigma^2 I$ under homoskedasticity, or by a diagonal matrix of squared residuals if heteroskedasticity is suspected, or by the appropriate matrix of cluster residuals if the survey is clustered (see (2.30) above). (Note that the residuals must be calculated as $y - X\hat{\beta}_{IV}$, which is not the vector of residuals from the second stage of two-stage least squares. However, this is hardly ever an issue in practice, since econometric packages make the correction automatically.)

When the model is overidentified, and $k' > k$, the (partial) validity of the instruments is usually assessed by computing an overidentification (OID) test statistic. The simplest, and most intuitive, way to calculate the statistic is to regress the IV residuals $y - X\hat{\beta}_{IV}$ on the matrix of instruments Z and to multiply the resulting (uncentered) $R^2$ statistic by the sample size n (see Davidson and MacKinnon 1993, pp. 232-37). (The uncentered $R^2$ is 1 minus the ratio of the sum of squared residuals to the sum of squared dependent variables.) Under the null hypothesis that the instruments are valid, this test statistic is distributed as a $\chi^2$ statistic with $k' - k$ degrees of freedom. This procedure tests whether, contrary to the hypothesis, the instruments play a direct role in determining y, not just an indirect role, through predicting the x's. If the test fails, one or more of the instruments are invalid, and ought to be included in the explanation of y. Put differently, the OID test tells us whether we would get (significantly) different answers if we used different instruments or different combinations of instruments in the regression. This interpretation also clarifies the limitations of the test. It is a test of overidentification, not of all the instruments.
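The estimator (2.82), a heteroskedasticity-robust version of (2.83), and the OID statistic described above can be sketched in a few lines of NumPy. This is a minimal illustration on simulated data, not a substitute for an econometric package; the data-generating process, the variable names, and the helper `two_stage_least_squares` are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Simulated data (all values are illustrative assumptions).
z = rng.normal(size=(n, 3))                 # three excluded instruments
v = rng.normal(size=n)
u = 0.8 * v + 0.6 * rng.normal(size=n)      # structural error, correlated with x via v
x = z @ np.array([1.0, 0.5, 0.5]) + v       # endogenous regressor
y = 2.0 + 1.0 * x + u                       # true slope is 1.0

X = np.column_stack([np.ones(n), x])        # n x k with k = 2
Z = np.column_stack([np.ones(n), z])        # n x k' with k' = 4; constant is its own instrument


def two_stage_least_squares(X, Z, y):
    """Return the 2SLS coefficients of (2.82) and the first-stage fitted values P_Z X."""
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    beta = np.linalg.solve(X_hat.T @ X, X_hat.T @ y)
    return beta, X_hat


beta_iv, X_hat = two_stage_least_squares(X, Z, y)
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]

# Residuals use the original X, not the first-stage fitted values.
resid = y - X @ beta_iv

# Variance estimate in the form of (2.83), with D = diag(resid^2).
bread = np.linalg.inv(X_hat.T @ X_hat)      # (X' P_Z X)^{-1}, since P_Z is idempotent
meat = (X_hat * resid[:, None] ** 2).T @ X_hat
V = bread @ meat @ bread
std_err = np.sqrt(np.diag(V))

# OID statistic: n times the uncentered R^2 from regressing resid on Z.
fitted = Z @ np.linalg.lstsq(Z, resid, rcond=None)[0]
r2_uncentered = 1.0 - np.sum((resid - fitted) ** 2) / np.sum(resid ** 2)
oid_stat = n * r2_uncentered
oid_df = Z.shape[1] - X.shape[1]            # k' - k = 2

print("OLS slope:", beta_ols[1])
print("IV slope:", beta_iv[1], "s.e.:", std_err[1])
print("OID statistic:", oid_stat, "with", oid_df, "degrees of freedom")
```

In this simulated setting the instruments are valid by construction, so the OID statistic should behave like a draw from a $\chi^2$ distribution with two degrees of freedom, and the OLS slope should lie above the IV slope because the error is positively correlated with x.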
If we have only k instruments and k regressors, the model is exactly identified, the residuals of the IV regression are orthogonal to the instruments by construction, so that the OID test is mechanically equal to zero, there is only one way of using the instruments, and no alternative estimates to compare. So the OID test, useful though it is, is only informative when there are more instruments than strictly necessary.

Although estimation by IV is one of the most useful and most used tools of modern econometrics, it does not offer a routine solution for the problems diagnosed in Section 2.4. Just as it is almost always possible to find reasons (measurement error, omitted heterogeneity, selection, or omitted variables) why the structural variables are correlated with the error terms, so is it almost always difficult to find instruments that do not have these problems, while at the same time being related to the structural variables. It is easy to generate estimates that are different from the OLS estimates. What is much harder is to make the case that these estimates are necessarily to be preferred. Credible identification and estimation of structural equations almost always requires real creativity, and creativity cannot be produced to a formula.

Policy evaluation and natural experiments

One promising approach to the selection of instruments, especially for the treatment model, is to look for "natural experiments," cases where different sets of individuals are treated differently in a way that, if not random by design, was effectively so in practice.

One of the best, and certainly earliest, examples is Snow's (1855) analysis of deaths in the London cholera epidemic of 1853-54, work that is cited by Freedman (1991) as a leading example of convincing statistical work in the social sciences. The following is based on Freedman's account.
Snow's hypothesis, which was not widely accepted at the time, was that cholera was waterborne. He discovered that households were supplied with water by two different water companies, the Lambeth water company, which in 1849 had moved its water intake to a point in the Thames above the main sewage discharge, and the Southwark and Vauxhall company, whose intake remained below the discharge. There was no sharp separation between houses supplied by the two companies; instead "the mixing of the supply is of the most intimate kind. The pipes of each Company go down all the streets, and into nearly all the courts and alleys.... The experiment, too, is on the grandest scale. No fewer than three hundred thousand people of both sexes, of every age and occupation, and of every rank and station, from gentlefolks down to the very poor, were divided into two groups without their choice, and in most cases, without their knowledge; one group supplied with water containing the sewage of London, and amongst it, whatever might have come from the cholera patients, the other group having water quite free from such impurity." Snow collected data on the addresses of cholera victims, and found that there were 8.5 times as many deaths per thousand among households supplied by the Southwark and Vauxhall company.

Snow's analysis can be thought of in terms of instrumental variables. Cholera is not directly caused by the position of a water intake, but by contamination of drinking water. Had it then been possible to do so, an alternative analysis might have linked the probability of contracting cholera to a measure of water purity. But even if such an analysis had shown significant results, it would not have been very convincing. The people who drank impure water were also more likely to be poor, and to live in an environment contaminated in many ways, not least by the "poison miasmas" that were then thought to be the cause of cholera.
In terms of the discussion of Section 2.4, the explanatory variable, water purity, is correlated with omitted variables or with omitted individual heterogeneity. The identity of the water supplier is an ideal IV for this analysis. It is correlated with the explanatory variable (water purity) for well-understood reasons, and it is uncorrelated with other explanatory variables because of the "intimate" mixing of supplies and the fact that most people did not even know the identity of their supplier.

There are a number of good examples of natural experiments in the economics literature. Card (1989) shows that the Mariel boatlift, where political events in Cuba led to the arrival of 125,000 Cubans in Miami between May and September 1980, had little apparent effect on wages in Miami, for either Cubans or non-Cubans. Card and Krueger (1994) study fast-food outlets on either side of the border between New Jersey and Pennsylvania around the time of an increase in New Jersey's minimum wage, and find that employment rose in New Jersey relative to Pennsylvania. Another example comes from the studies by Angrist (1990) and Angrist and Krueger (1994) into earnings differences of American males by veteran status. The "treatment" variable is spending time in the military, and the outcome is the effect on wages. The data present somewhat of a puzzle because veterans of World War II appear to enjoy a substantial wage premium over other workers, while veterans of the Vietnam War are typically paid less than other similar workers. The suspicion is that selectivity is important, the argument being that the majority of those who served in Vietnam had relatively low unobservable labor market skills, while in World War II, where the majority served, only those with relatively low skills were excluded from service.
Angrist and Krueger (1994) point out that in the late years of World War II, the selection mechanism acted in such a way that those born early in the year had a (very slightly) higher chance of being selected than those born later in the year. They can then use birth dates as instruments, effectively averaging over all individuals born in the same quarter, so that to preserve variation in the averages, Angrist and Krueger require a very large sample, in this case 300,000 individuals from the 1980 census. (Large sample sizes will often be required by "natural experiments" since instruments that are convincingly uncorrelated with the residuals will often be only weakly correlated with the selection process.) In the IV estimates, the World War II premium is reversed, and earnings are lower for those cohorts who had a larger fraction of veterans. By contrast, Angrist (1990) finds that instrumenting earnings equations for Vietnam veterans using the draft lottery makes little difference to the negative earnings premium experienced by these workers, so that the two studies together suggest that time spent in the military lowers earnings compared with the earnings of those who did not serve.

Impressive as these studies are, natural experiments are not always available when we need them, and some cases yield better instruments than others. Because "natural" experiments are not genuine, randomized experiments, the fact that the experiment is effectively (or quasi-) randomized has to be argued on a case-by-case basis, and the argument is not always as persuasive as in Snow's case. For example, government policies only rarely generate convincing experiments (see Besley and Case 1994).
Although two otherwise similar countries (towns, or provinces) may experience different policies, comparison of outcomes is always bedeviled by the concern that the differences are not random, but linked to some characteristic of the country (town or province) that caused the government to draw the distinction in the first place.

However, it may be possible to follow Angrist and Krueger's lead in looking, not at programs themselves, but at the details of their administration. The argument is that in any program with limited resources or limited reach, where some units are treated and some not, the administration of the project is likely to lead, at some level, to choices that are close to random. In the World War II example, it is not the draft that is random, but the fact that local draft boards had to fill quotas, and that the bureaucrats who selected draftees did so partially by order of birth. In other cases, one could imagine people being selected because they are higher in the alphabet than others, or because an administrator used a list constructed for other purposes. While the broad design of the program is likely to be politically and economically motivated, and so cannot be treated as an experiment, natural or otherwise, the details are handled by bureaucrats who are simply trying to get the job done, and who make selections that are effectively random. This is a recipe for project evaluation that calls for intimate knowledge and examination of detail, but it is one that has some prospect of yielding convincing results.

One feature of good natural experiments is their simplicity. Snow's study is a model in this regard. The argument is straightforward, and is easily explained to nonstatisticians or noneconometricians, to whom the concept of instrumental variables could not be readily communicated.
Simplicity not only aids communication, but greatly adds to the persuasiveness of the results and increases the likelihood that the results will affect the policy debate. A case in point is the recent political firestorm in the United States over Card and Krueger's (1994) findings on the minimum wage.

Econometric issues for instrumental variables

IV estimators are invaluable tools for handling nonexperimental data. Even so, there are a number of difficulties of which it is necessary to be aware. As with other techniques for controlling for nonexperimental inconsistencies, there is a cost in terms of precision. The variance-covariance matrix (2.83) exceeds the corresponding OLS matrix by a positive definite matrix, so that, even when there is no inconsistency, the IV estimators, and all linear combinations of the IV estimates, will have larger standard errors than their OLS counterparts. Even when OLS is inconsistent, there is no guarantee that in individual cases, the IV estimates will be closer to the truth, and the larger the variance, the less likely it is that they will be so.

It must also be emphasized that the distributional theory for IV estimates is asymptotic, and that asymptotic approximations may be a poor guide to finite sample performance. Formulas exist for the finite sample distributions of IV estimators (see, for example, Anderson and Sawa 1979) but these are typically not sufficiently transparent to provide practical guidance. Nevertheless, a certain amount is known, and this knowledge provides some warnings for practice.

Finite sample distributions of IV estimators will typically be more dispersed with more mass in the tails than either OLS estimators or their own asymptotic distributions.
Indeed, IV estimates possess moments only up to the degree of overidentification, so that when there is one instrument for one suspect structural variable, the IV estimate will be so dispersed that its mean does not exist (see Davidson and MacKinnon 1993, pp. 220-24, for further discussion and references). As a result, there will always be the possibility of obtaining extreme estimates, whose presence is not taken into account in the calculation of the asymptotic standard errors. Given sufficient overidentification so that the requisite moments exist (and note that this rules out some of the most difficult cases), Nagar (1959) and Buse (1992) show that in finite samples, IV estimates are biased towards the OLS estimators. This gives support to many students' intuition when first confronted with IV estimation, that it is a clever trick designed to reproduce the OLS estimate as closely as possible while guaranteeing consistency in a (conveniently hypothetical) large sample. In the extreme case, where there are as many instruments as observations so that the first stage of two-stage least squares fits the data perfectly, the IV and OLS estimates are identical. More generally, there is a tradeoff between having too many instruments, overfitting at the first stage, and being biased towards OLS, or having too few instruments, and risking dispersion and extreme estimates. Either way, the asymptotic standard errors on which we routinely rely will not properly indicate the degree of bias or the dispersion.

Nelson and Startz (1990a, 1990b) and Maddala and Jeong (1992) have analyzed the case of a univariate regression where the options are OLS or IV estimation with a single instrument. Their results show that the central tendency of the finite-sample distribution of the IV estimator is biased away from the true value and towards the OLS value.
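The pull of two-stage least squares towards OLS when there are many weak instruments can be seen in a small Monte Carlo sketch. Everything here is an assumption made for illustration: the sample size, the number of instruments, the first-stage coefficient `pi`, and the error correlation `rho` are arbitrary choices, and the function `simulate` is hypothetical rather than taken from any of the studies cited.

```python
import numpy as np


def simulate(n=100, n_instruments=20, pi=0.1, beta=1.0, rho=0.8, reps=500, seed=0):
    """Return the median OLS and 2SLS slope estimates over `reps` simulated samples."""
    rng = np.random.default_rng(seed)
    ols, iv = [], []
    for _ in range(reps):
        Z = rng.normal(size=(n, n_instruments))
        v = rng.normal(size=n)
        # The structural error u is correlated with the first-stage error v.
        u = rho * v + np.sqrt(1.0 - rho ** 2) * rng.normal(size=n)
        x = Z @ np.full(n_instruments, pi) + v
        y = beta * x + u
        # First stage: fitted values of x from a regression on all instruments.
        x_hat = Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
        ols.append((x @ y) / (x @ x))
        iv.append((x_hat @ y) / (x_hat @ x))
    return np.median(ols), np.median(iv)


weak_ols, weak_iv = simulate(pi=0.1)      # many weak instruments
strong_ols, strong_iv = simulate(pi=1.0)  # same instruments, strong first stage

print("true slope: 1.0")
print("weak instruments:   OLS", round(weak_ols, 3), " 2SLS", round(weak_iv, 3))
print("strong instruments: OLS", round(strong_ols, 3), " 2SLS", round(strong_iv, 3))
```

Under these assumptions the median 2SLS estimate with weak instruments should sit between the true value and the OLS estimate, while with a strong first stage it should be close to the true value. Medians are reported rather than means because the extreme draws discussed above can dominate an average.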
Perhaps most seriously, the asymptotic distribution is a very poor approximation to the finite-sample distribution when the instrument is a poor one, in the sense that it is close to orthogonal to the explanatory variable.

Additional evidence of poor performance comes from Bound, Jaeger, and Baker (1993), who show that the empirical results in Angrist and Krueger (1991), who used up to 180 instruments with 30,000 observations, can be closely reproduced with randomly generated instruments. Both sets of results show that poor instruments do not necessarily reveal themselves as large standard errors for the IV estimates. Instead it is easy to produce situations in which y is unrelated to x, and where z is a poor instrument for x, but where the IV estimate of the regression of y on x with z as instrument generates a parameter estimate whose "asymptotic t-value" shows an apparently significant effect. As a result, if IV results are to be credible, it is important to establish first that the instruments do indeed have predictive power for the contaminated right-hand-side variables. This means displaying the first-stage regressions, a practice that is far from routine, or at the least examining and presenting evidence on the explanatory power of the instruments. (Note that when calculating two-stage least squares, the exogenous x variables are also included on the right-hand side with the instruments, and that it is the predictive power of the latter that must be established, for example, by using an F-test for those variables rather than the $R^2$ for the regression as a whole.)

In recent work, Staiger and Stock (1993) have proposed a new asymptotic theory for IV when the instruments are only weakly correlated with the regressors, and have produced evidence that their asymptotics provides a good approximation to the finite-sample distribution of IV estimators, even in difficult cases such as those examined by Nelson and Startz.
These results may provide a better basis for IV inference in future work.

2.7 Using a time series of cross sections

Although long-running panels are rare in both developed and developing countries, independent cross-sectional household surveys are frequently conducted on a regular basis, sometimes annually, and sometimes less frequently. In Chapter 1, I have already referred to and illustrated from the Surveys of Personal Income Distribution in Taiwan (China), which have been running annually since 1976, and I shall use these data further in this section. Although such surveys select different households in each survey, so that there is no possibility of following individuals over time, it is still possible to follow groups of people from one survey to another. Obvious examples are the group of the whole population, where we use the surveys to track aggregate data over time, or regional, sectoral, or occupational groups, where we might track the differing fortunes over time of farmers versus government servants, or where we might ask whether poverty is diminishing more rapidly in one region than in another.

Perhaps somewhat less obvious is the use of survey data to follow cohorts of individuals over time, where cohorts are defined by date of birth. Provided the population is not much affected by immigration and emigration, and provided the cohort is not so old that its members are dying in significant numbers, we can use successive surveys to follow each cohort over time by looking at the members of the cohort who are randomly selected into each survey. For example, we can look at the average consumption of 30-year-olds in the 1976 survey, of 31-year-olds in the 1977 survey, and so on. These averages, because they relate to the same group of people, have many of the properties of panel data.
Cohorts are frequently interesting in their own right, and questions about the gainers and losers from economic development are often conveniently addressed by following such groups over time. Because there are many cohorts alive at one time, cohort data are more diverse and richer than are aggregate data, but their semiaggregated structure provides a link between the microeconomic household-level data and the macroeconomic data from national accounts. The most important measures of living standards, income and consumption, have strong life-cycle age-related components, but the profiles themselves will move upward over time with economic growth as each generation becomes better-off than its predecessors. Tracking different cohorts through successive surveys allows us to disentangle the generational from life-cycle components in income and consumption profiles.

Cohort data: an example

The left-hand top panel of Figure 2.5 shows the averages of real earnings for various cohorts in Taiwan (China) observed from 1976 through to 1990. The data were constructed according to the principles outlined above. For example, for the cohort born in 1941, who were 35 years old in 1976, I used the 1976 survey to calculate the average earnings of all those aged 35, and the result is plotted as the first point in the third line from the left in the figure. The average earnings of 36-year-olds in the 1977 survey is calculated and forms the second point on the same segment. The rest of the line comes from the other surveys, tracking the cohort born in 1941 through the 15 surveys until they are last observed at age 49 in 1990. Table 2.2 shows that there were 699 members of the cohort in the 1976 survey, 624 in the 1977 survey, 879 in the 1978 survey (in which the sample size was increased), and so on until 691 in 1990.
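The construction just described, in which each respondent is assigned to a birth cohort and the variable of interest is averaged within each cohort and survey year, can be sketched as follows. The records are invented for illustration and the function name `cohort_means` is hypothetical; a real application would read the individual-level survey files and would typically apply sampling weights, which are omitted here.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical individual records: (survey_year, age, earnings).
records = [
    (1976, 35, 100.0), (1976, 35, 120.0), (1976, 30, 90.0),
    (1977, 36, 115.0), (1977, 36, 125.0), (1977, 31, 99.0),
]


def cohort_means(records):
    """Average earnings and cell size for each (birth_year, survey_year) cell."""
    cells = defaultdict(list)
    for year, age, value in records:
        birth_year = year - age          # cohort is defined by date of birth
        cells[(birth_year, year)].append(value)
    return {cell: (mean(values), len(values)) for cell, values in sorted(cells.items())}


table = cohort_means(records)
for (birth_year, year), (average, count) in table.items():
    print(birth_year, year, average, count)
```

Each line of a cohort plot then joins the averages that share a birth year, and the cell counts play the role of the entries in a table of cohort sizes by survey year.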
The figure illustrates the same process for seven cohorts, born in 1951, 1946, and so on backward at five-year intervals until the oldest, which was born in 1921, and the members of which were 69 years old when last seen in 1990.

Figure 2.5. Earnings by cohort and their decomposition, Taiwan (China), 1976-90
[Figure not reproduced. Recoverable panel titles: Earnings by cohort; Cohort effects; Year effects. Recoverable axis labels: Age; Cohort: age in 1976; Year.]
Note: Author's calculations based on Surveys of Personal Income Distribution.

Table 2.2. Number of persons in selected cohorts by survey year, Taiwan (China), 1976-90

                      Cohort: age in 1976
Year       25      30      35      40      45      50      55
1976      863     521     699     609     552     461     333
1977      902     604     624     535     585     427     308
1978    1,389     854     879     738     714     629     477
1979    1,351     796     846     708     714     574     462
1980    1,402     834     845     723     746     625     460
1981    1,460     794     807     720     750     624     426
1982    1,461     771     838     695     689     655     496
1983    1,426     737     846     718     702     597     463
1984    1,477     825     820     711     695     541     454
1985    1,396     766     775     651     617     596     442
1986    1,381     725     713     659     664     549     428
1987    1,309     634     775     632     675     513       0
1988    1,275     674     700     617     595     548       0
1989    1,225     672     652     600     609     519       0
1990    1,121     601     691     575     564     508       0

Note: The year is the year of the survey, and the numbers are the numbers of individuals in each cohort sampled in each survey year. 65 is used as an age cutoff, so the oldest cohort is not observed after 1986.
Source: Author's calculations from the Surveys of Personal Income Distribution.

Although it is possible to make graphs for all birth years, I have shown only every fifth cohort so as to keep the diagram clear. Note that only members of the same cohort are joined up by connecting lines, and this construction makes it clear when we are following different groups of people or jumping from one cohort to another.
(See also Figures 6.3 and 6.4 below for the corresponding graphs for consumption and for a comparison of cross-sectional and cohort plots.)

The top left-hand panel of the figure shows clear age and cohort effects in earnings; it is also possible to detect common macroeconomic patterns for all cohorts. With a very few exceptions at older ages, the lines for the younger cohorts are always above the lines for the older cohorts, even when they are observed at the same age. This is because rapid economic growth in Taiwan (China) is making younger generations better-off, so that, for example, those born in 1951 (the youngest, left-most cohort in the figure) have average earnings at age 38 that are approximately twice as much as the earnings at age 38 of the cohort born 10 years earlier (the third cohort in the figure). There is also a pronounced life-cycle profile to earnings, and although the age profile is "broken up" by the cohort effects, it is clear that earnings tend to grow much more rapidly in the early years of the working life than they do after age 50. As a result, not only are the younger cohorts of workers in Taiwan (China) better-off than their predecessors, but they have also experienced much more rapid growth in earnings. The macroeconomic effects in the first panel of Figure 2.5 are perhaps the hardest to see, but note that each connected line segment corresponds to the same contemporaneous span of 15 years in "real" time, 1976-90. Each segment shows the impact of the slowdown in Taiwanese economic growth after the 1979 oil shock. Each cohort has very rapid growth from the second to third year observed, which is 1977-78, somewhat slower growth for the next two years, 1978-80, and then two years of slow or negative growth after the shock.
This decomposition into cohort, age, and year effects can be formalized in a way that will work even when the data are not annual and not necessarily evenly spaced, a topic to which I return in the final subsection below. Before that, however, it is useful to use this example to highlight the advantages and disadvantages of cohort data more generally.

Cohort data versus panel data

A useful comparison is between the semiaggregated cohort data and genuine panel data in which individual households are tracked over time. In both cases, we have a time series of observations on a number of units, with units defined as either cohorts or individuals. The cohort data cannot tell us anything about dynamics within the cohorts; each survey tells us about the distribution of the characteristic in the cohort in each period, but two adjacent surveys tell us nothing about the joint distribution of the characteristic in the two periods. In the earnings example, the time series of cross sections can tell us about average earnings for the cohort over time, and it can tell us about inequality of earnings within the cohort and how it is changing over time, but it cannot tell us how long individuals are poor, or whether the people who are rich now were rich or poor at some earlier date. But apart from dynamics, the cohort data can do most of what would be expected of panel data. In particular, and as we shall see in the next subsection, the cohort data can be used to control for unobservable fixed effects just as with panel data, a feature that is often thought to be the main econometric attraction of the latter.

Cohort data also have a number of advantages over most panels. As we have seen in Chapter 1, many panels suffer from attrition, especially in the early years, and so run the risk of becoming increasingly unrepresentative over time. Because the cohort data are constructed from fresh samples every year, there is no attrition.
There will be (related) problems with the cohort data if the sampling design changes over time, or if the probabilities of selection into the sample depend on age as, for example, for young men undergoing military training. The way in which the cohort data are used will often be less susceptible to measurement error than is the case with panels. The quantity that is being tracked over time is typically an average (or some other statistic such as the median or other percentile) and the averaging will nearly always reduce the effects of measurement error and enhance the signal-to-noise ratio. In this sense, cohort methods can be regarded as IV methods, where the instruments are grouping variables, whose application averages away the measurement error. Working with aggregated data at a level that is intermediate between micro and macro also brings out the relationship between household behavior and the national aggregates and helps bridge the gap between them; in Figure 2.5, for example, the behavior of the aggregate economy is clearly apparent in the averages of the household data. It should be emphasized that cohort data can be constructed for any characteristic of the distribution of interest; we are not confined to means. As we shall see in Chapter 6, it can be interesting and useful to study how inequality changes within cohorts over time, and since we have the micro data for each cohort in each year, it is as straightforward to work with measures of dispersion as it is to work with measures of central tendency. Medians can be used instead of means (a technique that is often useful in the presence of outliers) and if the theory suggests working with some transform of the data, the transform can be made prior to averaging. When working with aggregate data, theoretical considerations often suggest working with the mean of a logarithm, for example, rather than with the logarithm of the mean.
The former is not available from aggregate data, but can be routinely computed from the micro data when calculating the semiaggregated cohort averages. A final advantage of cohort methods is that they allow the combination of data from different surveys on different households. The means of cohort consumption from an expenditure survey can be combined with the means of cohort income from a labor force survey, and the hybrid data set used to study saving. It is not necessary that all variables are collected from the same households in one survey. Against the use of cohort data, it should be noted that there are sometimes problems with the assumption that the cohort population is constant, an assumption that is needed if the successive surveys are to generate random samples from the same underlying population. I have already noted potential problems with military service, migration, aging, and death. But the more serious difficulties come when we are forced to work, not with individuals, but with households, and to define cohorts of households by the age of the head. If households once formed were indissoluble, there would be no difficulty, but divorce and remarriage reorganize households, as does the process whereby older people go to live with their children, so that previously "old" households become "young" households in subsequent years. It is usually clear when these problems are serious, and they affect some segments of the population more than others, so that we know which data to trust and which to suspect.

Panel data from successive cross sections

It is useful to consider briefly the issues that arise when using cohort data as if they were repeated observations on individual units. I show first how fixed effects at the individual level carry through to the cohort data, and what steps have to be taken if they are to be eliminated.
Consider the simplest univariate model with fixed effects, so that at the level of the individual household, we have (2.76) with a single variable

(2.84)  $y_{it} = \alpha + \beta x_{it} + \mu_t + \theta_i + u_{it}$

where the $\mu_t$ are year dummies and $\theta_i$ is an individual-specific fixed effect. If there were no fixed effects, it would be possible to average (2.84) over all the households in each cohort in each year to give a corresponding equation for the cohort averages. When there are fixed effects, (2.84) still holds for the cohort population means, with cohort fixed effects replacing the individual fixed effects. However, if we average (2.84) over the members of the cohorts who appear in the survey, and who will be different from year to year, the "fixed effect" will not be fixed, because it is the average of the fixed effects of different households in each year. Because of this sampling effect, we cannot remove the cohort fixed effects by differencing or using within-estimators. Consider an alternative approach based on the unobservable population means for each cohort. Start from the cohort version of (2.84), and denote population means in cohorts by the subscripts c, so that, simply changing the subscript i to c, we have

(2.85)  $y_{ct} = \alpha + \beta x_{ct} + \mu_t + \theta_c + u_{ct}$

and take first differences (the comparable analysis for the within-estimator is left as an exercise) to eliminate the fixed effects so that

(2.86)  $\Delta y_{ct} = \Delta\mu_t + \beta\,\Delta x_{ct} + \Delta u_{ct}$

where the first term is a constant in any given year. This procedure has eliminated the fixed effects, but we are left with the unobservable changes in the population cohort means in place of the sample cohort means, which is what we observe. If we replace $\Delta y$ and $\Delta x$ in (2.86) by the observed changes in the sample means, we generate an error-in-variables problem, and the estimates will be attenuated. There are at least two ways of dealing with this problem.
The first is to note that, just as the sample was used to provide an estimate of the cohort mean, it can also be used to provide an estimate of the standard error of that estimate, the square of which is, in this context, the variance of the measurement error. For the example (2.86), we can use overbars to denote sample means and write

(2.87)  $\Delta\bar{y}_{ct} = \Delta y_{ct} + \epsilon_{1ct} - \epsilon_{1ct-1}$
        $\Delta\bar{x}_{ct} = \Delta x_{ct} + \epsilon_{2ct} - \epsilon_{2ct-1}$

where $\epsilon_{1ct}$ and $\epsilon_{2ct}$ are sampling errors in the cohort means. Because they come from different surveys with independently selected samples, they are independent over time, and their variances and covariance, $\sigma_{1t}^2$, $\sigma_{2t}^2$, and $\sigma_{12t}$, are calculated in the usual way, from the variances and covariance in the sample divided by the cohort size (with correction for cluster effects as necessary). From (2.87), we see that the variances and covariances of the sample cohort means are inflated by the variances and covariances of the sampling errors, but that, if these are subtracted out, we can obtain consistent estimates of $\beta$ in (2.86) from (cf. also (2.62) above)

(2.88)  $\hat\beta = \dfrac{\mathrm{cov}(\Delta\bar{x}_{c}, \Delta\bar{y}_{c}) - \sigma_{12t} - \sigma_{12t-1}}{\mathrm{var}(\Delta\bar{x}_{c}) - \sigma_{2t}^2 - \sigma_{2t-1}^2}$

where, for illustrative purposes, I have assumed that there are only two time periods t and t - 1. The standard error for (2.88) can be calculated using the bootstrap or the delta method (discussed in the next section), which can also take into account the fact that the variance and covariances of the sampling errors are estimated (see Deaton 1985, who also discusses the multivariate case, and Fuller 1987, who gives a general treatment for a range of similar models). Another possible estimation strategy is to use IV, with changes from earlier years used as instruments. Since the successive samples are independently drawn, changes in cohort means from t - 2 to t - 1 are measured independently of the change from t to t + 1.
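The errors-in-variables correction described above can be illustrated with a small simulation. What follows is only a sketch under stated assumptions, not the procedure used for any result in the text: the function name `eiv_corrected_slope`, the number of cohort cells, the true slope, and the sampling-error covariance matrix are all hypothetical choices made for illustration, the sampling-error moments are treated as known rather than estimated, and the equation error is set to zero for simplicity.

```python
import numpy as np

def eiv_corrected_slope(dx_bar, dy_bar, s12_t, s12_tm1, s22_t, s22_tm1):
    """Slope with sampling-error moments subtracted out, in the spirit of (2.88).

    dx_bar, dy_bar : observed changes in sample cohort means
    s12_*          : covariance of the sampling errors in y-bar and x-bar
    s22_*          : variance of the sampling error in x-bar
    """
    cov_xy = np.cov(dx_bar, dy_bar)[0, 1]
    var_x = np.var(dx_bar, ddof=1)
    return (cov_xy - s12_t - s12_tm1) / (var_x - s22_t - s22_tm1)

rng = np.random.default_rng(0)
n_cells = 20000          # hypothetical number of cohort observations
beta = 0.8               # hypothetical true slope

# Unobservable changes in population cohort means (equation error omitted).
dx = rng.normal(0.0, 1.0, n_cells)
dy = 0.1 + beta * dx

# Sampling errors in the means, independent across the two surveys.
# Assumed covariance matrix, ordered as (error in y-bar, error in x-bar).
err_cov = np.array([[0.25, 0.10],
                    [0.10, 0.25]])
e_t = rng.multivariate_normal([0.0, 0.0], err_cov, size=n_cells)
e_tm1 = rng.multivariate_normal([0.0, 0.0], err_cov, size=n_cells)

# Observed changes in sample means, as in (2.87).
dy_bar = dy + e_t[:, 0] - e_tm1[:, 0]
dx_bar = dx + e_t[:, 1] - e_tm1[:, 1]

naive = np.cov(dx_bar, dy_bar)[0, 1] / np.var(dx_bar, ddof=1)
corrected = eiv_corrected_slope(dx_bar, dy_bar, 0.10, 0.10, 0.25, 0.25)
print("naive slope:", naive)
print("corrected slope:", corrected)
```

In this design the naive slope is pulled below the true value, while the corrected slope should lie close to it. With real data the sampling variances differ from cohort to cohort and have to be estimated from the micro data, so the correction would be applied with cell-specific (or averaged) estimates rather than the constants assumed here.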
In some cases, the cohort samples may be large enough and the means precisely enough estimated so that these corrections are small enough to ignore. In any case, it is a good idea to check the standard errors of the cohort means, to make sure that regression results are not being dominated by sampling effects, and if so, to increase the cohort sizes, for example, by working with five-year age bands instead of single years. In some applications, this might be desirable on other grounds; in some countries, people do not know their dates of birth well enough to be able to report age accurately, and reported ages "heap" at numbers ending in 5 and 0.

Decompositions by age, cohort, and year

A number of the quantities most closely associated with welfare, including family size, earnings, income, and consumption, have distinct and characteristic life-cycle profiles. Wage rates, earnings, and saving usually have hump-shaped age profiles, rising to their maximum in the middle years of life, and declining somewhat thereafter. The natural process of bearing and raising children induces a similar profile in average family size. Moreover, all of these quantities are subject to secular variation; consumption, earnings, and incomes rise over time with economic development, and family size decreases as countries pass through the demographic transition. In consequence, even if the shape of the age profiles remains the same for successive generations, their position will shift from one to the next. The age profile from a single cross section confounds the age profile with the generational or cohort effects. For example, a cross-sectional earnings profile will tend to exaggerate the downturn in earnings at the highest age because, as we look at older and older individuals, we are not just moving along a given age-earnings profile, but we are also moving to ever lower lifetime profiles.
The cohort data described in this section allow us to track the same cohort over several years and thus to avoid the difficulty; indeed, the Taiwanese earnings example in Figure 2.5 provides a clear example of the differences between the age profiles of different cohorts. In many cases, diagrams like Figure 2.5 will tell us all that we need to know. However, since each cohort is only observed for a limited period of time, it is useful to have a technique for linking together the age profiles from different cohorts to generate a single complete life-cycle age profile. This is particularly true when there is only a limited number of surveys, and the intervals between them are more than one year. In such cases, diagrams like Figure 2.5 are harder to draw, and a good deal less informative. In this subsection, I discuss how the cohort data can be decomposed into age effects, cohort effects, and year effects, the first to give the typical age profile, the second the secular trends that lead to differences in the positions of age profiles for different cohorts, and the third the aggregate effects that synchronously but temporarily move all cohorts off their profiles. These decompositions are based on models and are certainly not free of structural assumptions; they assume away interaction effects between age, cohort, and years, so that, for example, the shape of age profiles is unaffected by changes in their position, and the appropriateness and usefulness of the assumption has to be judged on a case-by-case basis. To make the analysis concrete, consider the case of the lifetime consumption profile.
If the growth in living standards acts so as to move up the consumption-age profiles proportionately, it makes sense to work in logarithms, and to write the logarithm of consumption as

(2.89)  $\ln c_{ct} = \beta + \alpha_a + \gamma_c + \psi_t + u_{ct}$

where the subscripts c and t (as usual) refer to cohort and time (year), and a refers to age, defined here as the age of cohort c in year t. In this particular case, (2.89) can be given a theoretical interpretation, since according to life-cycle theory under certainty, consumption is the product of lifetime wealth, the cohort aggregate of which is constant over time, and an age effect, which is determined by preferences (see Section 6.1 below). In other contexts where there is no such theory, the decomposition is often a useful descriptive device, as for earnings in Taiwan (China), where it is hard to look at the top left-hand panel of Figure 2.5 without thinking about an age and cohort decomposition. In order to implement a model like (2.89), we need to decide how to label cohorts. A convenient way to do so is to choose c as the age in year t = 0. By this, c is just a number like a and t. We can then choose to restrict the age, cohort, and year effects in (2.89) in various different ways. In particular, we can choose polynomials or dummies. For the year effects, where there is no obvious pattern a priori, dummy variables would seem to be necessary, but age effects could reasonably be modeled as a cubic, quartic, or quintic polynomial in age, and cohort effects, which are likely to be trend-like, might even be adequately handled as linear in c. Given the way in which we have defined cohorts, with bigger values of c corresponding to older cohorts, we would expect $\gamma_c$ to be declining with c. When data are plentiful, as in the Taiwanese case, there is no reason not to use dummy variables for all three sets of effects, and thus to allow the data to choose any pattern.
Suppose that A is a matrix of age dummies, C a matrix of cohort dummies, and Y a matrix of year dummies. The cohort data are arranged as cohort-year pairs, with each "observation" corresponding to a single cohort in a specific year. If there are m such cohort-year pairs, the three matrices will each have m rows; the number of columns will be the number of ages (or age groups), the number of cohorts, and the number of years, respectively. The model (2.89) can then be written in the form

(2.90)  $y = \beta + A\alpha + C\gamma + Y\psi + u$

where y is the stacked vector of cohort-year observations on the cohort means of the logarithm of consumption; each row corresponds to a single observation on a cohort. As usual, we must drop one column from each of the three matrices, since for the full matrices, the sum of the columns is a column of ones, which is already included as the constant term. However, even having dropped these columns, it is still impossible to estimate (2.90) because there is an additional linear relationship across the three matrices. The problem lies in the fact that if we know the date, and we know when a cohort was born, then we can infer the cohort's age. Indeed, since c is the age of the cohort in year 0, we have

(2.91)  $a_{ct} = c + t$

which implies that the matrices of dummies satisfy

(2.92)  $A s_a = C s_c + Y s_y$

where the s vectors are arithmetic sequences $\{0, 1, 2, 3, \ldots\}$ of the length given by the number of columns of the matrix that premultiplies them. Equation (2.92) is a single identity, so that to estimate the model it is necessary to drop one more column from any one of the three matrices.
The normalization of age, cohort, and year effects has been discussed in different contexts by a number of authors, particularly Hall (1971), who provides an admirably clear account in the context of embodied and disembodied technical progress for different vintages of pickup trucks, and by Weiss and Lillard (1978), who are concerned with age, vintage, and time effects in the earnings of scientists. The treatment here is similar to Hall's, but is based on that given in Deaton and Paxson (1994a). Note first that in (2.90), we can replace the parameter vectors $\alpha$, $\gamma$, and $\psi$ by

(2.93)  $\tilde\alpha = \alpha + \kappa s_a, \quad \tilde\gamma = \gamma - \kappa s_c, \quad \tilde\psi = \psi - \kappa s_y$

for any scalar constant $\kappa$, and by (2.92) there will be no change in the predicted value of y in (2.90). According to (2.93), a time-trend can be added to the age dummies, and the effects offset by subtracting time-trends from the cohort dummies and the year dummies. Since these transformations are a little hard to visualize, and a good deal more complicated than more familiar dummy-variable normalizations, it is worth considering examples. Suppose first that consumption is constant over cohorts, ages, and years, so that the curves in Figure 2.5 degenerate to a single straight line with slope 0. Then we could "decompose" this into a positive age effect, with consumption growing at (say) five percent for each year of age, and offset this by a negative year effect of five percent a year. According to this, each cohort would get a five percent age bonus each year, but would lose it to a macroeconomic effect whereby everyone gets five percent less than in the previous year. If this were all, younger cohorts would get less than older cohorts at the same age, because they come along later in time. To offset this, we need to give each cohort five percent more than the cohort born the year previously which, since the older cohorts have higher cohort numbers, means a negative trend in the cohort effects.
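The identification problem can be checked numerically on a toy design. The sketch below is illustrative only: the grid of four cohorts observed in three years, the helper `dummies`, and the value of the scalar `kappa` are assumptions made for the example, and cohorts are labeled from zero so that the arithmetic sequences line up exactly with the age, cohort, and year indices.

```python
import numpy as np

n_cohorts, n_years = 4, 3
c, t = np.meshgrid(np.arange(n_cohorts), np.arange(n_years), indexing="ij")
c, t = c.ravel(), t.ravel()      # one row per cohort-year pair
a = c + t                        # age of cohort c in year t, as in (2.91)
n_ages = n_cohorts + n_years - 1

def dummies(index, n_levels):
    """Full set of 0/1 dummy columns for an integer index (hypothetical helper)."""
    return np.eye(n_levels)[index]

A, C, Y = dummies(a, n_ages), dummies(c, n_cohorts), dummies(t, n_years)
s_a, s_c, s_y = np.arange(n_ages), np.arange(n_cohorts), np.arange(n_years)

# The linear dependence across the three sets of dummies, as in (2.92).
identity_holds = np.allclose(A @ s_a, C @ s_c + Y @ s_y)

# Dropping one column from each set still leaves the design short of full rank.
X = np.column_stack([np.ones(len(a)), A[:, 1:], C[:, 1:], Y[:, 1:]])
rank, n_cols = np.linalg.matrix_rank(X), X.shape[1]

# Offsetting trends as in (2.93) leave the fitted values unchanged.
rng = np.random.default_rng(1)
alpha, gamma, psi = rng.normal(size=n_ages), rng.normal(size=n_cohorts), rng.normal(size=n_years)
kappa = 0.05
fit = A @ alpha + C @ gamma + Y @ psi
fit_shifted = (A @ (alpha + kappa * s_a)
               + C @ (gamma - kappa * s_c)
               + Y @ (psi - kappa * s_y))
invariant = np.allclose(fit, fit_shifted)

print(identity_holds, rank, n_cols, invariant)
```

The design matrix is short of full column rank by exactly one, which is the single extra restriction that has to be imposed, and any choice of `kappa` reproduces the same fitted values, which is why the data alone cannot choose among the normalizations.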
More realistically, suppose that when we draw Figure 2.5, we find that the consumption of each cohort is growing at three percent a year, and that each successive cohort's profile is three percent higher than that of its predecessor. Everyone gets three percent more a year as they age, and starting consumption rises by three percent a year. This situation can be represented (exactly) by age effects that rise linearly with age added to cohort effects that fall linearly with age by the same amount each year; note that cohorts are labeled by age at a fixed date, so that older cohorts (larger c) are poorer, not richer. But the same data can be represented by a time-trend of three percent a year in the age effects, without either cohort or year effects. In practice, we choose a normalization that is most suitable for the problem at hand, attributing time-trends to year effects, or to matching age and cohort effects. In the example here, where consumption or earnings is the variable to be decomposed, a simple method of presentation is to attribute growth to age and cohort effects, and to use the year effects to capture cyclical fluctuations or business-cycle effects that average to zero over the long run. A normalization that accomplishes this makes the year effects orthogonal to a time-trend, so that, using the same notation as above,

(2.94)  $s_y'\psi = 0$.

The simplest way to estimate (2.90) subject to the normalization (2.94) is to regress y on (a) dummies for each cohort excluding (say) the first, (b) dummies for each age excluding the first, and (c) a set of T - 2 year dummies defined as follows, from t = 3, ..., T

(2.95)  $d_t^* = d_t - [(t-1)d_2 - (t-2)d_1]$

where $d_t$ is the usual year dummy, equal to 1 if the year is t and 0 otherwise. This procedure enforces the restriction (2.94) as well as the restriction that the year dummies add to zero.
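The transformed year dummies in (2.95) are straightforward to construct. The following is a minimal sketch, assuming years are coded 1, ..., T; the function names `normalized_year_dummies` and `recover_year_effects` and the example coefficients are hypothetical. It checks that any set of coefficients on the transformed dummies implies year effects that add to zero and are orthogonal to a linear trend.

```python
import numpy as np

def normalized_year_dummies(year, T):
    """Return the T-2 columns d*_t, t = 3..T, defined in (2.95).

    year : integer array with values in 1..T, one entry per cohort-year row.
    """
    year = np.asarray(year)
    d = (year[:, None] == np.arange(1, T + 1)[None, :]).astype(float)  # d_1..d_T
    t = np.arange(3, T + 1)
    return d[:, 2:] - ((t - 1) * d[:, [1]] - (t - 2) * d[:, [0]])

def recover_year_effects(psi_star):
    """Full vector of T year effects implied by coefficients on d*_3..d*_T."""
    T = len(psi_star) + 2
    t = np.arange(3, T + 1)
    psi_1 = np.sum((t - 2) * psi_star)    # implied coefficient on d_1
    psi_2 = -np.sum((t - 1) * psi_star)   # implied coefficient on d_2
    return np.concatenate([[psi_1, psi_2], psi_star])

T = 6
year = np.repeat(np.arange(1, T + 1), 3)           # toy data: three rows per year
D_star = normalized_year_dummies(year, T)
D_full = (year[:, None] == np.arange(1, T + 1)[None, :]).astype(float)

psi_star = np.array([0.02, -0.01, 0.03, -0.02])    # arbitrary illustrative values
psi = recover_year_effects(psi_star)

sum_of_effects = psi.sum()
trend_product = psi @ np.arange(T)                 # s_y'psi with s_y = (0, 1, ..., T-1)
same_fit = np.allclose(D_star @ psi_star, D_full @ psi)
print(sum_of_effects, trend_product, same_fit)
```

In a regression, `D_star` would simply take the place of the ordinary year dummies alongside the age and cohort dummies; the first two year effects are then recovered from the restrictions rather than estimated directly.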
The coefficients of the $d_t^*$ give the third through final year coefficients; the first and second can be recovered from the fact that all year effects add to zero and satisfy (2.94). This procedure is dangerous when there are few surveys, where it is difficult to separate trends from transitory shocks. In the extreme case where there are only two years, the method would attribute any increase in consumption between the first and second years to an increasing age profile combined with growth from older to younger cohorts. Only when there are sufficient years for trend and cycle to be separated can we make the decomposition with any confidence. The three remaining panels of Figure 2.5 show the decomposition of the earnings averages into age, cohort, and year dummies. The cohort effects in the top right-hand panel are declining with age; the earlier you are born, the older you are in 1976, and age in 1976 is the cohort measure. Although the picture is one that is close to steady growth from cohort to cohort, there has been a perceptible acceleration in the rate of growth for the younger cohorts. The bottom left-hand panel shows the estimated age effects; according to this, wages are a concave function of age, and although there is little wage increase after age 50, there is no clear turning down of the profile. Although the top left panel creates an impression of a hump-shaped age profile of earnings, much of the impression comes from the cohort effects, not the age effects, and although the oldest cohort shown has declining wages from ages 58 through 65, other cohorts observed at the same ages do not display the same pattern. (Note that only every fifth cohort is included in the top left panel, but all cohorts are included in the regressions, subject only to age lying between 25 and 65 inclusive.)
The final panel shows the year effects, which are estimated to be much smaller in magnitude than either the cohort or age effects; nevertheless they show a distinctive pattern, with the economy growing much faster than trend at the beginning and end of the period, and much more slowly in the middle after the 1979 oil shock. Age and cohort profiles such as those in Figure 2.5 provide the material for examining the structural consequences of changes in the rates of growth of population and real income. For example, if the age profiles of consumption and income are determined by tastes and technology, and are invariant to changes in the rate of economic growth, we can change the cohort effects holding the age effects constant and thus derive the effects of growth on aggregates of consumption, saving, and income. Changes in population growth rates redistribute the population over the various ages, so that, once again, we can use the age profiles as the basis for aggregating over different age distributions of the population. Much previous work has been forced to rely on single cross sections to estimate age profiles, and while this is sometimes the best that can be done, cross-sectional age profiles confuse the cohort and age effects, and will typically give much less reliable estimates than the methods discussed in this section. I return to these techniques in the final chapter when I come to examine household saving behavior.

2.8 Two issues in statistical inference

This final section deals briefly with two topics that will be required at various points in the rest of the book, but which do not fit easily into the rest of this chapter. The first deals with a situation that often arises in practice, when the parameters of interest are not the parameters that are estimated, but functions of them.
I briefly explain the "delta" method which allows us to transform the variance-covariance matrix of the estimated parameters into the variance-covariance matrix of the parameters of interest, so that we can construct hypothesis tests for the latter. Even when we want to use the bootstrap to generate confidence intervals, asymptotic approximations to variances are useful starting points that can be improved using the bootstrap (see Section 1.4). The second topic is concerned with sample size, and its effects on statistical inference. Applied econometricians often express the view that rejecting a hypothesis using 100 observations does not have the same meaning as rejecting a hypothesis using 10,000 observations, and that null hypotheses are more often rejected the larger is the sample size. Household surveys vary in size from a few hundred to tens or even hundreds of thousands of observations, so that if inference is indeed the hostage of sample size, it is important to be aware of exactly what is going on, and how to deal with it in practice.

*Parameter transformations: the delta method

Suppose that we have estimates of a parameter vector $\beta$, but that the parameters of interest are not $\beta$, but some possibly nonlinear transformation $\alpha$, where

(2.96)  $\alpha = h(\beta)$

for some known vector of differentiable functions h. In general, this function will also depend on the data, or on some characteristics of the data such as sample means. It will also usually be the case that $\alpha$ and $\beta$ will have different numbers of elements, k for $\beta$ and q for $\alpha$, with $q \le k$. Our estimation method has yielded an estimate $\hat\beta$ for $\beta$ and an associated variance-covariance matrix $V_\beta$, for which an estimate is also available. The delta method is a means of transforming $V_\beta$ into $V_\alpha$; a good formal account is contained in Fuller (1987, pp. 85-88). Here I confine myself to a simple intuitive outline.
Start by substituting the estimate of $\beta$ to obtain the obvious estimate of $\alpha$, $\hat\alpha = h(\hat\beta)$. If we then take a Taylor series approximation of $\hat\alpha = h(\hat\beta)$ around the true value of $\beta$, we have for $i = 1, \ldots, q$,

(2.97)  $\hat\alpha_i \approx \alpha_i + \sum_{j=1}^{k} \dfrac{\partial h_i}{\partial \beta_j}(\hat\beta_j - \beta_j)$

or in an obvious matrix notation

(2.98)  $\hat\alpha - \alpha \approx H(\hat\beta - \beta)$.

The matrix H is the $q \times k$ Jacobian matrix of the transformation. If we then postmultiply (2.98) by its transpose and take expectations, we have

(2.99)  $V_\alpha \approx H V_\beta H'$.

In practice (2.99) is evaluated by replacing the three terms on the right-hand side by their estimates calculated from the estimated parameters. The estimate of the matrix H can either be programmed directly once the differentiation has been done analytically, or the computer can be left to do it, either using the analytical differentiation software that is increasingly incorporated into some econometric packages, or by numerical differentiation around the estimates of $\beta$. Variance-covariance matrices from the delta method are often employed to calculate Wald test statistics for hypotheses that place nonlinear restrictions on the parameters. The procedure follows immediately from the analysis above by writing the null hypothesis in the form:

(2.100)  $H_0: \alpha = h(\beta) = 0$

for which we can compute the Wald statistic

(2.101)  $W = \hat\alpha' V_\alpha^{-1} \hat\alpha$.

Under the null hypothesis, W is asymptotically distributed as $\chi^2$ with q degrees of freedom. For this to work, the matrix $V_\alpha$ has to be nonsingular, for which a necessary condition is that q be no larger than k; clearly we must not try to test the same restriction more than once. As usual, some warnings are in order. These results are valid only as large-sample approximations, and may be seriously misleading in finite samples. For example, the ratio of two independent zero-mean normally distributed variables has a Cauchy distribution which does not possess any moments, yet the delta method will routinely provide a "variance" for this case.
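The steps in (2.97) through (2.101) translate directly into a few lines of code. This is a sketch under stated assumptions rather than a recommended implementation: the helper names, the central-difference step size, and the numbers in `beta_hat` and `V_beta` are all hypothetical, and the example deliberately uses a ratio of coefficients, the case for which the warning above applies, so the resulting "variance" should be read with that caution in mind.

```python
import numpy as np

def numerical_jacobian(h, beta, eps=1e-6):
    """Central-difference estimate of the q x k Jacobian H of h at beta."""
    beta = np.asarray(beta, dtype=float)
    q = np.atleast_1d(h(beta)).size
    H = np.empty((q, beta.size))
    for j in range(beta.size):
        step = np.zeros_like(beta)
        step[j] = eps
        H[:, j] = (np.atleast_1d(h(beta + step))
                   - np.atleast_1d(h(beta - step))) / (2 * eps)
    return H

def delta_method(h, beta_hat, V_beta):
    """Return alpha_hat = h(beta_hat) and V_alpha = H V_beta H', as in (2.99)."""
    H = numerical_jacobian(h, beta_hat)
    return np.atleast_1d(h(beta_hat)), H @ V_beta @ H.T

def wald_statistic(alpha_hat, V_alpha):
    """W = alpha_hat' V_alpha^{-1} alpha_hat, as in (2.101)."""
    return float(alpha_hat @ np.linalg.solve(V_alpha, alpha_hat))

# Illustrative (made-up) estimates and variance-covariance matrix.
beta_hat = np.array([2.0, 4.0])
V_beta = np.array([[0.04, 0.01],
                   [0.01, 0.09]])

# Null hypothesis written as h(beta) = beta_1 / beta_2 - 0.4 = 0.
h = lambda b: np.array([b[0] / b[1] - 0.4])

alpha_hat, V_alpha = delta_method(h, beta_hat, V_beta)
W = wald_statistic(alpha_hat, V_alpha)
print(alpha_hat, V_alpha, W)
```

The statistic `W` would be compared with a chi-squared critical value with q = 1 degree of freedom. Writing the same null as beta_1 - 0.4 beta_2 = 0 would give a different Jacobian and, in finite samples, a different value of W.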
In the context of the Wald tests of nonlinear restrictions, there are typically many different ways of writing the restrictions, and unless the sample size is large and the hypothesis correct, these will all lead to different values of the Wald test (see Gregory and Veall 1985 and Davidson and MacKinnon 1993, pp. 463-71, for further discussion).

Sample size and hypothesis tests

Consider the frequently encountered situation where we wish to test a simple null hypothesis against a compound alternative, that $\beta = \beta_0$ for some known $\beta_0$ against the alternative $\beta \neq \beta_0$. A typical method for conducting such a test would be to calculate some statistic from the data and to see how far it is from the value that it would assume under the null, with the size of the discrepancy acting as evidence against the null hypothesis. Most obviously, we might estimate $\beta$ itself without imposing the restriction, and compare its value with $\beta_0$. Likelihood-ratio tests (or other measures based on fit) compare how well the model fits the data at unrestricted and restricted estimates of $\beta$. Score (or Lagrange multiplier) tests calculate the derivative of the criterion function at $\beta_0$, on the grounds that nonzero values indicate that there are better-fitting alternatives nearby, so casting doubt on the null. All of these supply a measure of the failure of the null, and our acceptance and rejection of the hypothesis can be based on how big is the measure. The real differences between different methods of hypothesis testing come, not in the selection of the measure, but in the setting of a critical value, above which we reject the hypothesis on the grounds that there is too much evidence against it, and below which we accept it, on the grounds that the evidence is not strong enough to reject.
Classical statistical procedures (which dominate econometric practice) set the critical value in such a way that the probability of rejecting the null when it is correct, the probability of Type I error, or the size of the test, is fixed at some preassigned level, for example, five or one percent. In the ideal situation, it is possible under the null hypothesis to derive the sampling distribution of the quantity that is being used as evidence against the null, so that critical values can be calculated that will lead to exactly five (one) percent of rejections when the null is true. Even when this cannot be done, the asymptotic distribution of the test statistic can usually be derived, and if this is used to select critical values, the null will be rejected five percent of the time when the sample size is sufficiently large. These procedures take no explicit account of the power of the test, the probability that the null hypothesis will be rejected when it is false, or its complement, the Type II error, the probability of not rejecting the null when it is false. Indeed, it is hard to see how these errors can be controlled because the power depends on the unknown true values of the parameter, and tests will typically be more powerful the further is the truth from the null. That classical procedures can generate uncomfortable results as the sample size increases is something that is often expressed informally by practitioners, and the phenomenon has been given an excellent treatment by Leamer (1978, pp. 100-120), and it is on his discussion that the following is based. The effect most noted by empirical researchers is that the null hypothesis seems to be more frequently rejected in large samples than in small. Since it is hard to believe that the truth depends on the sample size, something else must be going on.
If the critical values are exact, and if the null hypothesis is exactly true, then by construction the null hypothesis will be rejected the same fraction of times in all sample sizes; there is nothing wrong with the logic of the classical tests. But consider what happens when the null is not exactly true, or alternatively, that what we mean when we say that the null is true is that the parameters are "close" to the null, "close" referring to some economic or substantive meaning that is not formally incorporated into the statistical procedure. As the sample size increases, and provided we are using a consistent estimation procedure, our estimates will be closer and closer to the truth, and less dispersed around it, so that discrepancies that were undetectable with small sample sizes will lead to rejections in large samples. Larger sample sizes are like greater resolving power on a telescope; features that are not visible from a distance become more and more sharply delineated as the magnification is turned up. Over-rejection in large samples can also be thought about in terms of Type I and Type II errors. When we hold Type I error fixed and increase the sample size, all the benefits of increased precision are implicitly devoted to the reduction of Type II error. If there are equal probabilities of rejecting the null when it is true and not rejecting it when it is false at a sample size of 100, say, then at 10,000, we will have essentially no chance of accepting it when it is false, even though we are still rejecting it five percent of the time when it is true. For economists, who are used to making tradeoffs and allocating resources efficiently, this is a very strange thing to do. As Leamer points out, the standard defense of the fixed size for classical tests is to protect the null, controlling the probability of rejecting it when it is true.
But such a defense is clearly inconsistent with a procedure that devotes none of the benefit of increased sample size to lowering the probability that it will be so rejected. Repairing these difficulties requires that the critical values of test statistics be raised with the sample size, so that the benefits of increased precision are more equally allocated between reduction in Type I and Type II errors. That said, it is a good deal more difficult to decide exactly how to do so, and to derive the rule from basic principles. Since classical procedures cannot provide such a basis, Bayesian alternatives are the obvious place to look. Bayesian hypothesis testing is based on the comparison of posterior probabilities, and so does not suffer from the fundamental asymmetry between null and alternative that is the source of the difficulty in classical tests. Nevertheless, there are difficulties with the Bayesian methods too, perhaps most seriously the fact that the ratio of posterior probabilities of two hypotheses is affected by their prior probabilities, no matter what the sample size. Nevertheless, the Bayesian approach has produced a number of procedures that seem attractive in practice, several of which are reviewed by Leamer. It is beyond the scope of this section to discuss the Bayesian testing procedures in any detail. However, one of Leamer's suggestions, independently proposed by Schwarz (1978) in a slightly different form, and whose derivation is also insightfully discussed by Chow (1983, pp. 300-2), is to adjust the critical values for F and $\chi^2$ tests. Instead of using the standard tabulated values, the null is rejected when the calculated F-value exceeds the logarithm of the sample size, $\ln n$, or when a $\chi^2$ statistic for q restrictions exceeds $q \ln n$.
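The rule is simple enough to compute by hand, but a small helper makes the dependence on sample size explicit. This is a minimal sketch; the function name `leamer_schwarz_critical_values` is hypothetical, and the code does nothing more than evaluate the two thresholds stated above.

```python
import math

def leamer_schwarz_critical_values(n, q=1):
    """Sample-size-dependent thresholds: ln(n) for F, q*ln(n) for chi-squared."""
    log_n = math.log(n)
    return {"F": log_n, "chi2": q * log_n}

for n in (100, 1000, 10000, 100000):
    cv = leamer_schwarz_critical_values(n, q=3)
    print(n, round(cv["F"], 2), round(cv["chi2"], 2))
```

Because the threshold grows only with the logarithm of the sample size, a hundredfold increase in n doubles the critical value rather than multiplying it a hundredfold.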
To illustrate, when the sample size is 100, the null hypothesis would be rejected only if calculated $F$-statistics are larger than 4.6, a value that would be doubled to 9.2 when working with sample sizes of 10,000. In my own work, some of which is discussed in the subsequent chapters of this book, I have often found these Leamer-Schwarz critical values to be useful. This is especially true in those cases where the theory applies most closely, when we are trying to choose between a restricted and unrestricted model, and when we have no particular predisposition either way except perhaps simplicity, and we want to know whether it is safe to work with the simpler restricted model. If the test statistic exceeds the Leamer-Schwarz critical value, experience suggests that such simplifications are indeed dangerous, something that is not true for classical tests, where large-sample rejections can often be ignored with impunity.

2.9 Guide to further reading

The aim of this chapter has been to extract from the recent theoretical and applied econometric literature material that is useful for the analysis of household-level data. The source of the material was referenced as it was introduced, and in most cases, there is little to consult apart from these original papers. I have assumed that the reader has a good working knowledge of econometrics at the level of an advanced undergraduate, master's, or first-year graduate course in econometrics covering material such as that presented in Pindyck and Rubinfeld (1991). At the same level, the text by Johnston and DiNardo (1996) is also an excellent starting point and, on many topics, adopts an approach that is sympathetic to that taken here. A more advanced text that covers a good deal of the modern theoretical material is Davidson and MacKinnon (1993), but like other texts it is not written from an applied perspective.
Cramer (1969), although now dated, is one of the few genuine applied econometrics texts, and contains a great deal that is still worth reading, much of it concerned with the analysis of survey data. Some of the material on clustering is discussed in Chapter 2 of Skinner, Holt, and Smith (1989). Groves (1989, ch. 6) contains an excellent discussion of weighting in the context of modeling versus description. The STATA manuals, Stata Corporation (1993), are in many cases well ahead of the textbooks, and provide brief discussions and references on each of the topics with which they deal.

3 Welfare, poverty, and distribution

One of the main reasons for collecting survey data on household consumption and income is to provide information on living standards, on their evolution over time, and on their distribution over households. Living standards of the poorest parts of the population are of particular concern, and survey data provide the principal means for estimating the extent and severity of poverty. Consumption data on specific commodities tell us who consumes how much of what, and can be used to examine the distributional consequences of price changes, whether induced by deliberate policy decisions or as a result of weather, world prices, or other exogenous forces. In this chapter, I provide a brief overview of the theory and practice of welfare measurement, including summary measures of living standards, of poverty, and of inequality, with illustrations from the Living Standards Surveys of Côte d'Ivoire from 1985 through 1988 and of South Africa in 1993. I also discuss the use of survey data to examine the welfare effects of pricing and of transfer policies using as examples pricing policy for rice in Thailand and pensions in South Africa.

The use of survey data to investigate living standards is often straightforward, requiring little statistical technique beyond the calculation of measures of central tendency and dispersion.
Although there are deep and still-controversial conceptual issues in deciding how to measure welfare, poverty, and inequality, the measurement itself is direct in that there is no need to estimate behavioral responses nor to construct the econometric models required to do so. Instead, the focus is on the data themselves, and on the best way to present reliable and robust measures of welfare. Graphical techniques are particularly useful and can be used to describe the whole distribution of living standards, rather than focussing on a few summary statistics. For example, the Lorenz curve is a standard tool for charting inequality, and in recent work, good use has been made of the cumulative distribution function to explore the robustness of poverty measures. For other questions it is useful to be able to display (univariate and bivariate) density functions, for example when looking at two measures of living standards such as expenditures and nutritional status, or when investigating the incidence of price changes in relation to the distribution of real incomes. While cross-tabulations and histograms are the traditional tools for charting densities, it is often more informative to calculate nonparametric estimates of densities using one of the smoothing methods that have recently been developed in the statistical literature. One of the purposes of this chapter is to explain these methods in simple terms, and to illustrate their usefulness for the measurement of welfare and the evaluation of policy.

The chapter consists of three sections. Section 3.1 is concerned with welfare measurement, and Section 3.3 with the distributional effects of price changes and cash transfers. Each section begins with a brief theoretical overview and continues with empirical examples.
The techniques of nonparametric density estimation are introduced in the context of living standards in Section 3.2 and are used extensively in Section 3.3. This last section shows how regression functions (conditional expectations) can often provide direct answers to questions about distributional effects of policy changes, and I discuss the use of nonparametric regression as a simple tool for calculating and presenting these regression functions.

3.1 Living standards, inequality, and poverty

Perhaps the most straightforward way to think about measuring living standards and their distribution is a purely statistical one, with the mean, median, or mode representing the central tendency and various measures of dispersion, such as the variance or interquartile range, used to measure inequality. However, greater conceptual clarity comes from a more theoretical approach, and specifically from the use of social welfare functions as pioneered by Atkinson (1970). This is the approach that I follow here, beginning with social welfare functions, and then using them to interpret measures of inequality and poverty.

Social welfare

Suppose that we have decided on a suitable measure of living standards, denoted by $x$; this is typically a measure of per capita household real income or consumption, but there are other possibilities, and the choices are discussed below. We denote the value of social welfare by $W$ and write it as a nondecreasing function of all the $x$'s in the population, so that

$$W = V(x_1, x_2, \ldots, x_N) \tag{3.1}$$

where $N$ is the population size. Although our data often come at the level of the household, it is hard to give meaning to household or family welfare without starting from the welfare of its members. In consequence, the $x$'s in (3.1) should be thought of as relating to individuals, and $N$ to the number of persons in the population. The issue of how to move from household data to individual welfare is an important and difficult one, and I shall return to it.
It is important not to misinterpret a social welfare function in this context. In particular, it should definitely not be thought of as the objective function of a government or policymaking agency. There are few if any countries for which the maximization of (3.1) subject to constraints would provide an adequate description of the political economy of decisionmaking. Instead, (3.1) should be seen as a statistical "aggregator" that turns a distribution into a single number that provides an overall judgment on that distribution and that forces us to think coherently about welfare and its distribution. Whatever our view of the policymaking process, it is always useful to think about policy in terms of its effects on efficiency and on equity, and (3.1) should be thought of as a tool for organizing our thoughts in a coherent way.

What is the nature of the function $V$, and how is it related to the usual concepts? When $V$ is increasing in each of its arguments, social welfare is greater whenever any one individual is better-off and no one is worse-off, so that Pareto improvements are always improvements in social welfare. For judging the effects of any policy, we shall almost always want this Pareto condition to be satisfied. However, as we shall see, it is often useful to think about poverty measurement in terms of social welfare, and this typically requires a social welfare function that is unresponsive to increases in welfare among the nonpoor. This case can be accommodated by weakening the Pareto condition to the requirement that $V$ be nondecreasing in each of its arguments.

Social welfare functions are nearly always assumed to have a symmetry or anonymity property, whereby social welfare depends only on the list of welfare levels in society, and not on who has which welfare level. This makes sense only if the welfare levels are appropriately defined.
Money income does not translate into the same level of living at different price levels, and a large household can hardly be as well-off as a smaller one unless it has more money to spend. I shall return to this issue below, when I discuss the definition of $x$, and in Chapter 4, when I discuss the effects of household composition on welfare.

Finally, and perhaps most importantly, social welfare functions are usually assumed to prefer more equal distributions to less equal ones. If we believe that inequality is undesirable, or equivalently that a gift to an individual should increase social welfare in (3.1) by more when the recipient is poorer, then for any given total of $x$, and ignoring any constraints on feasible allocations, social welfare will be maximized when all $x$'s are equal. (Note that policies that seek to promote equality will often have incentive effects, so that a preference for equality is not the same as a claim that equality is desirable once the practical constraints are taken into account.) Equity preference will be guaranteed if the function $V$ has the same properties as a standard utility function, with diminishing marginal utility to each $x$, or more formally, when it is quasi-concave, so that when we draw social indifference curves over the different $x$'s, they are convex to the origin. Quasi-concavity of $V$ means that if $x^1$ and $x^2$ are two lists of $x$'s, with one element for each person, and if $V(x^1) = V(x^2)$ so that the two allocations are equally socially valuable, then any weighted average, $\lambda x^1 + (1 - \lambda) x^2$ for $\lambda$ between 0 and 1, will have as high or higher social welfare. A weighted average of any two equally good allocations is at least as good as either. In particular, quasi-concavity implies that social welfare will be increased by any transfer of $x$ from a richer to a poorer person, provided only that the transfer is not sufficiently large to reverse their relative positions. This is the "principle of transfers," originally proposed by Dalton (1920). It should be noted that the principle of transfers does not require quasi-concavity, but a weaker condition called "s-concavity" (see Atkinson 1992 for a survey and more detailed discussion).

Inequality and social welfare

For the purposes of passing from social welfare to measures of inequality, it is convenient that social welfare be measured in the same units as individual welfare, so that proportional changes in all $x$'s have the same proportional effect on the aggregate. This will happen if the function $V$ is homogeneous of degree one, or has been transformed by a monotone increasing transform to make it so. Provided the transform has been made, we can rewrite (3.1) as

$$W = \mu V\left(\frac{x_1}{\mu}, \ldots, \frac{x_N}{\mu}\right) \tag{3.2}$$

where $\mu$ is the mean of the $x$'s. Equation (3.2) gives a separation between the mean value of $x$ and its distribution, and will allow us to decompose changes in social welfare into changes in the mean and changes in a suitably defined measure of inequality. Finally, we choose units so that $V(1, 1, \ldots, 1) = 1$, so that when there is perfect equality, and everyone has the mean level of welfare, social welfare is also equal to that value.

Since social welfare is equal to $\mu$ when the distribution of $x$'s is perfectly equal, then, by the principle of transfers, social welfare for any unequal allocation cannot be greater than the mean of the distribution $\mu$. Hence we can write (3.2) as

$$W = \mu (1 - I) \tag{3.3}$$

where $I$ is defined by the comparison of (3.2) and (3.3), and represents the cost of inequality, or the amount by which social welfare falls short of the maximum that would be attained under perfect equality. $I$ is a measure of inequality, taking the value zero when the $x$'s are equally distributed, and increasing with disequalizing transfers.
Since the inequality measure is a scaled version of the function $V$ with a sign change, it satisfies the principle of transfers in reverse, so that any change in distribution that involves a transfer from rich to poor will decrease $I$ as defined by (3.2) and (3.3).

Figure 3.1 illustrates social welfare and inequality measures for the case of a two-person economy. The axes show the amount of $x$ for each of the two consumers, and the point $S$ marks the actual allocation or status quo. Since the social welfare function is symmetric, the point $S'$, which is the reflection of $S$ in the 45-degree line, must lie on the same social welfare contour, which is shown as the line $SBS'$. Allocations along the straight line $SCS'$ (which will not generally be feasible) correspond to the same total $x$, and those between $S$ and $S'$ have higher values of social welfare. The point $B$ is the point on the 45-degree line that has the same social welfare as does $S$; although there is less $x$ per capita at $B$ than at $S$, the equality of distribution makes up for the loss in total. The amount of $x$ at $B$ is denoted $x^e$, and is referred to by Atkinson as "equally distributed equivalent $x$." Equality is measured by the ratio $OB/OC$, or by $x^e/\mu$, a quantity that will be unity if everyone has the same, or if the social welfare contours are straight lines perpendicular to the 45-degree line. This is the case where "a dollar is a dollar" whoever receives it so that there is no perceived inequality. Atkinson's measure of inequality, defined by (3.3), is shown in the diagram as the ratio $BC/OC$.

[Figure 3.1. Measuring inequality from social welfare. Axes: $x_1$ and $x_2$.]

One of the advantages of the social welfare approach to inequality measurement, as embodied in (3.3), is that it precludes us from making the error of interpreting measures of inequality by themselves as measures of welfare.
It will sometimes be the case that inequality will increase at the same time that social welfare is increasing. For example, if everyone gets better-off, but the rich get more than the poor, inequality will rise, but there has been a Pareto improvement, and most observers would see the new situation as an improvement on the original one. When inequality is seen as a component of social welfare, together with mean levels of living, we also defuse those critics who point out that a focus on inequality misdirects attention away from the living standards of the poorest (see in particular Streeten et al. 1981). Atkinson's formulation is entirely consistent with an approach that pays attention only to the needs of the poor or of the poorest groups, provided of course that we measure welfare through (3.3), and not through (negative) $I$ alone. Just to reinforce the point, we might define a "basic-needs" social welfare function to be the average consumption of the poorest five percent of society, $\mu^p$ say. This measure can be rewritten as $\mu (1 - I)$, where $I$ is the inequality measure $1 - \mu^p/\mu$.

Measures of inequality

Given this basic framework, we can generate measures of inequality by specifying a social welfare function and solving for the inequality measure, or we can start from a standard statistical measure of inequality, and enquire into its consistency with the principle of transfers and with a social welfare function. The first approach is exemplified by Atkinson's own inequality measure. This starts from the additive social welfare function

$$W = \frac{1}{N} \sum_{i=1}^{N} \frac{x_i^{1-\epsilon}}{1-\epsilon}, \quad \epsilon \neq 1 \tag{3.4a}$$

$$\ln W = \frac{1}{N} \sum_{i=1}^{N} \ln x_i, \quad \epsilon = 1. \tag{3.4b}$$

The parameter $\epsilon \geq 0$ controls the degree of "inequality aversion" or the degree to which social welfare trades off mean living standards on the one hand for equality of the distribution on the other.
In Figure 3.1, social welfare indifference curves are flatter when $\epsilon$ is small, so that, for the same initial distribution $S$, the point $B$ moves closer to the origin as $\epsilon$ increases. Atkinson's social welfare function, which will also prove useful in the tax reform analysis of Chapter 5, has the property that the ratio of marginal social utilities of two individuals is given by the reciprocal of the ratio of their $x$'s raised to the power of $\epsilon$:

$$\frac{\partial W/\partial x_i}{\partial W/\partial x_j} = \left(\frac{x_i}{x_j}\right)^{-\epsilon}. \tag{3.5}$$

Hence, if $\epsilon$ is zero so that there is no aversion to inequality, marginal utility is the same for everyone, and social welfare is simply $\mu$, the mean of the $x$'s. If $\epsilon$ is 2, for example, and $i$ is twice as well-off as $j$, then the marginal social utility of additional $x$ to $i$ is one-fourth the marginal social utility of additional $x$ to $j$. As $\epsilon$ tends to infinity, the marginal social utility of the poorest dominates over all other marginal utilities, and policy is concerned only with the poorest. When social welfare is the welfare of the poorest, which is what (3.4) becomes as $\epsilon$ tends to infinity, social preferences are sometimes said to be maximin (the object of policy is to maximize the minimum level of welfare) or Rawlsian, after Rawls (1972). Thinking about relative marginal utilities according to (3.5) is sometimes a convenient way of operationalizing the extent to which one would want poor people to be favored by policies or projects.

The inequality measures associated with (3.4) are, when $\epsilon \neq 1$,

$$I = 1 - \left[\frac{1}{N} \sum_{i=1}^{N} \left(\frac{x_i}{\mu}\right)^{1-\epsilon}\right]^{1/(1-\epsilon)} \tag{3.6a}$$

and, when $\epsilon = 1$, the multiplicative form

$$I = 1 - \prod_{i=1}^{N} \left(\frac{x_i}{\mu}\right)^{1/N}. \tag{3.6b}$$

These expressions are obtained by raising social welfare to the power of $1/(1 - \epsilon)$, which makes the function homogeneous of the first degree, and then following through the procedures of the previous subsection.
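The Atkinson measures (3.6a) and (3.6b) can be computed in a few lines. The following is a minimal sketch, assuming unweighted individual-level data with strictly positive $x$'s; the function name and the example figures are hypothetical and serve only as illustration.

```python
import math


def atkinson_index(x, epsilon):
    """Atkinson inequality measure for a list of positive living standards.

    Uses (3.6a) when epsilon differs from 1 and the multiplicative form
    (3.6b) when epsilon equals 1. Sample weights are ignored in this sketch.
    """
    if epsilon < 0:
        raise ValueError("epsilon must be nonnegative")
    if not x or any(value <= 0 for value in x):
        raise ValueError("x must be nonempty and strictly positive")
    n = len(x)
    mu = sum(x) / n
    if math.isclose(epsilon, 1.0):
        # (3.6b): one minus the geometric mean of x relative to the mean,
        # computed through logarithms for numerical stability.
        mean_log_ratio = sum(math.log(value / mu) for value in x) / n
        return 1.0 - math.exp(mean_log_ratio)
    power = 1.0 - epsilon
    mean_power = sum((value / mu) ** power for value in x) / n
    return 1.0 - mean_power ** (1.0 / power)


# Hypothetical five-person distribution, evaluated over a range of
# inequality aversion parameters.
illustrative_x = [1.0, 2.0, 3.0, 4.0, 10.0]
for eps in (0.0, 0.5, 1.0, 2.0):
    print(eps, atkinson_index(illustrative_x, eps))
```

For any fixed unequal distribution the measure is zero at $\epsilon = 0$ and rises as $\epsilon$ increases, which is one reason to report results for several values rather than a single one.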
In line with the interpretation of $\epsilon$ as an aversion or perception parameter, there is no (perceived) inequality when $\epsilon$ is zero, no matter what the distribution of the $x$'s. Conversely, if $\epsilon > 0$ and one person has all but a small amount $a$, say, with $a$ spread equally over the others, then $I$ tends to one as the number of people becomes large. Values of $\epsilon$ above 0 but below 2 appear to be useful, although in applications, it is often wise to look at results for a range of different values.

We may also choose to start from the standard measures of inequality. Provided these satisfy the principle of transfers, they will be consistent with Atkinson's approach, and will each have an associated social welfare function that can be recovered by applying (3.3). Some statistical measures of inequality do not satisfy the principle of transfers. The interquartile ratio (the 75th percentile less the 25th percentile divided by the median) is one such. Transferring $x$ from a richer to a poorer person in the same quartile group will have no effect on inequality, and a transfer from someone at the bottom quartile to someone poorer will lower the bottom quartile and so will actually increase inequality. Less obviously, it is also possible to construct cases where a transfer from a better-off to a poorer person will increase the variance of logarithms. However, this can only happen when both people are far above the mean, which may not be relevant in some applications, and the other conveniences of the log variance may render it a competitive inequality measure in spite of this deficiency.

Other standard measures that do satisfy the principle of transfers are the Gini coefficient, the coefficient of variation, and Theil's "entropy" measure of inequality. The Gini coefficient is often defined from the Lorenz curve (see below), but can also be defined directly.
One definition is the ratio to the mean of half the average over all pairs of the absolute deviations between people; there are $N(N-1)/2$ distinct pairs in all, so that the Gini is

$$\gamma = \frac{1}{\mu N (N-1)} \sum_{i=1}^{N} \sum_{j<i} |x_i - x_j|. \tag{3.7a}$$

Note that when everyone has the same, $\mu$, the Gini coefficient is zero, while if one person has $N\mu$, and everyone else zero, there are $N - 1$ distinct nonzero absolute differences, each of which is $N\mu$, so that the Gini is 1. The double sum in (3.7a) can be expensive to calculate if $N$ is large, and an equivalent but computationally more convenient form is

$$\gamma = \frac{N+1}{N-1} - \frac{2}{N(N-1)\mu} \sum_{i=1}^{N} \rho_i x_i \tag{3.7b}$$

where $\rho_i$ is the rank of individual $i$ in the $x$-distribution, counting from the top so that the richest has rank 1. Using (3.7b), the Gini can straightforwardly and rapidly be calculated from microeconomic data after sorting the observations. I shall give examples below, together with discussion of how to incorporate sample weights, and how to calculate the individual-level Gini from household-level data.

Not surprisingly in view of (3.7b), the social welfare function associated with the Gini coefficient is one in which the $x$'s are weighted by the ranks of each individual in the distribution, with the weights larger for the poor. Since the Gini lies between zero and one, the value of social welfare in an economy with mean $\mu$ and Gini coefficient $\gamma$ is $\mu (1 - \gamma)$, a measure advocated by Sen (1976a) who used it to rank Indian states. The same measure has been generalized by Graaff (1977) to $\mu (1 - \gamma)^{\sigma}$, for $\sigma$ between 0 and 1; Graaff suggests that equity and efficiency are separate components of welfare, and that by varying $\sigma$ we can give different weights to each (see also Atkinson 1992 for examples).

The coefficient of variation is the standard deviation divided by the mean, while Theil's entropy measure is given by

$$T = \frac{1}{N} \sum_{i=1}^{N} \frac{x_i}{\mu} \ln\left(\frac{x_i}{\mu}\right). \tag{3.8}$$

$T$ lies between 0, when all $x$'s are identical, and $\ln N$, when one person has everything.
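The two Gini formulas and the Theil measure can be checked against each other numerically. The sketch below assumes unweighted individual-level data with a positive mean; the function names are hypothetical, and sample weights, which matter for survey data, are not handled here.

```python
import math


def _mean_and_size(x):
    """Return (mean, N) after basic validity checks."""
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    mu = sum(x) / n
    if mu <= 0:
        raise ValueError("mean must be positive")
    return mu, n


def gini_pairwise(x):
    """Gini from (3.7a): sum of absolute differences over distinct pairs."""
    mu, n = _mean_and_size(x)
    total = sum(abs(x[i] - x[j]) for i in range(n) for j in range(i))
    return total / (mu * n * (n - 1))


def gini_ranked(x):
    """Gini from (3.7b): ranks count from the top, so the richest is 1."""
    mu, n = _mean_and_size(x)
    ordered = sorted(x, reverse=True)
    rank_weighted_sum = sum(rank * value
                            for rank, value in enumerate(ordered, start=1))
    return (n + 1) / (n - 1) - 2.0 * rank_weighted_sum / (n * (n - 1) * mu)


def theil_index(x):
    """Theil entropy measure (3.8), using the convention 0 * ln(0) = 0."""
    mu, n = _mean_and_size(x)
    return sum((value / mu) * math.log(value / mu)
               for value in x if value > 0) / n


# One person has everything: both Gini forms give 1 and Theil gives ln(N).
extreme = [0.0, 0.0, 0.0, 10.0]
print(gini_pairwise(extreme), gini_ranked(extreme), theil_index(extreme))
```

The pairwise form needs on the order of $N^2$ operations, while the ranked form needs only a sort, which is why (3.7b) is the practical choice for large surveys.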
This and other measures are discussed at much greater length in a number of texts, for example, Cowell (1995) or Kakwani (1980).

The choice between the various inequality measures is sometimes made on grounds of practical convenience, and sometimes on grounds of theoretical preference. On the former, it is frequently useful to be able to decompose inequality into "between" and "within" components, for example, between and within regions, sectors, or occupational groups. Variances can be so decomposed, as can Theil's entropy measure, while the Gini coefficient is not decomposable, or at least not without hard-to-interpret residual terms (see, for example, Pyatt 1976). It is also sometimes necessary to compute inequality measures for magnitudes, such as incomes or wealth, that can be negative, which is possible with the Gini or the coefficient of variation, but not with the Theil measure, the variance of logarithms, or the Atkinson measure. Further theoretical refinements can also be used to narrow down the choice. For example, we might require that inequality be more sensitive to differences between the poor than among the rich (see Cowell 1995), or that inequality aversion be stronger the further we are away from an equal allocation (see Blackorby and Donaldson 1978). All of these restrictions have appeal, but none has acquired the universal assent that is accorded to the principle of transfers.

Poverty and social welfare

In developing countries, attention is often focussed less on social welfare and inequality than on poverty. Indeed, poverty is frequently seen as the defining characteristic of underdevelopment, and its elimination as the main purpose of economic development. In such a context, it is natural for welfare economics to have a poverty focus.
Even so, and although the poverty measurement literature has developed in a somewhat different direction, the social welfare function approach of this section is quite general, and as we have already seen, can readily accommodate a preference and measurement structure that focusses attention exclusively on the poor.

The social welfare function (3.1) transforms the distribution of $x$'s into a single number that can be interpreted as a summary welfare measure that takes into account both the mean of the distribution and its dispersion. However, we are free to choose a function that gives little or no weight to the welfare of people who are well-off, so that social welfare becomes a measure of the welfare of the poor, in other words, a (negative) measure of poverty. In this sense, poverty measures are special cases of social welfare measures. However, in practical work, they serve rather different purposes. Poverty measures are designed to count the poor and to diagnose the extent and distribution of poverty, while social welfare functions are guides to policy. Just as the measurement of social welfare can be an inadequate guide to poverty, so are poverty measures likely to be an inadequate guide to policy.

As far as measurement is concerned, what separates the social welfare from the poverty literatures is that, in the latter, there is a poverty line, below which people are defined as poor, and above which they are not poor. In the language of social welfare, this effectively assigns zero social welfare to marginal benefits that accrue to the nonpoor, whereas the inequality literature, while typically assigning greater weight to benefits that reach lower in the distribution, rarely goes as far as assigning zero weight to the nonpoor. While the simplicity of a poverty line concept has much to recommend it, and is perhaps necessary to focus attention on poverty, it is a crude device.
Many writers have expressed grave doubts about the idea that there is some discontinuity in the distribution of welfare, with poverty on one side and lack of it on the other, and certainly there is no empirical indicator (income, consumption, calories, or the consumption of individual commodities) where there is any perceptible break in the distribution or in behavior that would provide an empirical basis for the construction of a poverty line.

Even when there exists an acceptable, readily comprehensible, and uncontroversial line, so that we know what we mean when we say that $\alpha$ percent of the population is poor, we should never make the minimization of this measure an object of policy. The poverty count is an obviously useful statistic, it is widely understood, and it is hard to imagine discussions of poverty without it. However, there are policies that reduce the number of people in poverty, but which just as clearly decrease social welfare, such as taxes on very poor people that are used to lift the just-poor out of poverty. Similarly, a Pareto-improving project is surely socially desirable even when it fails to reduce poverty, and it makes no sense to ignore policies that would improve the lot of those who are poor by many definitions, but whose incomes place them just above some arbitrary poverty line.

The construction of poverty lines

Without an empirical basis such as a discontinuity in some measure, the construction of poverty lines always involves arbitrariness. In developed countries where most people do not consider themselves to be poor, a poverty line must be below the median, but different people will have different views about exactly how much money is needed to keep them out of poverty. Almost any figure that is reasonably central within the distribution of these views will make an acceptable poverty line.
The official poverty line in the United States evolved from work in the early 1960s by Orshansky (1963, 1965) who took the cost of the U.S. Department of Agriculture's "low-cost food plan" and multiplied it by three, which was the reciprocal of the average food share in the Agriculture Department's 1955 household survey of food consumption. While such a procedure might seem to be empirically well-grounded (and the perception that it is so has been important in the wide and continuing acceptance of the line), it is arbitrary to a considerable extent. The food plan itself was only one of several that were adapted by nutritional "experts" from the food consumption patterns of those in the lowest third of the income range in the 1955 survey, while the factor of three was based on food shares at the mean, not at the median, or at the 40th or the 25th percentile, for all of which a case could be mounted. In fact, Orshansky's line of a little over $3,000 for a nonfarm family of four was adopted, not because of its scientific foundations, but because her procedure yielded an answer that was acceptably close to another arbitrary figure that was already in informal use within the federal government (see Fisher 1992 for more on the history and development of the U.S. poverty line).

In India, poverty lines and poverty counts have an even more venerable history stretching back to 1938 and the National Planning Committee of the Indian National Congress. The more recent history is detailed in Government of India (1993), from which the following account is drawn. In 1962, "a Working Group of eminent economists and social thinkers" recommended that people be counted as poor if they lived in a household whose per capita monthly expenditure was less than 20 rupees at 1960-61 prices in rural areas, or 25 rupees in urban areas.
These "bare minimum" amounts excluded expenditure on health and education, both of which were "expected to be provided by the State according to the Constitution and in the light of its other commitments." The precise economic and statistical basis for these calculations is not known, although the cost of obtaining minimally adequate nutrition was clearly taken into account, and the difference between urban and rural lines made an allowance for higher prices in the former.

Dandekar and Rath (1971a, 1971b) refined these poverty lines using a method that is still in widespread use. They started from an explicit calorie norm, 2,250 calories per person per day in both urban and rural areas. Using detailed food data from the National Sample Surveys (NSS), they calculated calorie consumption per capita as a function of total household expenditure per capita (the calorie Engel curve) and found that the norms were reached on average at 14.20 rupees per capita per month in rural areas, and 22.60 rupees per capita in urban areas, again at 1960-61 prices. These estimates were further refined by a "Task Force" of the Planning Commission in 1979, who revised the calorie norms to 2,400 in rural areas, and 2,100 in urban areas; the difference comes from the lower rates of physical activity in urban areas. The 28th round (1973-74) of the NSS was then used to estimate regression functions of calories on expenditure, and to convert these numbers to 49.09 rupees (rural) and 56.64 rupees (urban) at 1973-74 prices. These lines, updated for all-India price inflation, have been the basis for Indian poverty counts since 1979, although the "Expert Group" that reported in 1993 has recommended that allowance be made for interstate variation in price levels.
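The calorie-norm procedure described above amounts to fitting a calorie Engel curve and reading off the expenditure level at which the norm is reached. The sketch below is only an illustration under stated assumptions: the semilogarithmic functional form is an assumption (the source does not say which form was used), the function names are hypothetical, and the data are made-up noiseless numbers, not NSS figures.

```python
import math


def fit_calorie_engel_curve(pce, calories):
    """Least squares fit of calories = a + b * ln(PCE); returns (a, b).

    The semilog form is assumed purely for illustration.
    """
    log_pce = [math.log(value) for value in pce]
    n = len(log_pce)
    mean_x = sum(log_pce) / n
    mean_y = sum(calories) / n
    sxx = sum((lx - mean_x) ** 2 for lx in log_pce)
    sxy = sum((lx - mean_x) * (cal - mean_y)
              for lx, cal in zip(log_pce, calories))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def poverty_line_from_norm(intercept, slope, calorie_norm):
    """Expenditure at which the fitted curve reaches the calorie norm."""
    if slope <= 0:
        raise ValueError("calorie Engel curve must slope upward")
    return math.exp((calorie_norm - intercept) / slope)


# Made-up data: monthly PCE and daily calories per capita generated from
# calories = 400 + 500 * ln(PCE), so the curve is recovered exactly.
synthetic_pce = [20.0, 40.0, 80.0, 160.0]
synthetic_calories = [400.0 + 500.0 * math.log(v) for v in synthetic_pce]
a, b = fit_calorie_engel_curve(synthetic_pce, synthetic_calories)
print(poverty_line_from_norm(a, b, calorie_norm=2400.0))
```

Because the line is obtained by inverting a fitted behavioral relationship, anything that shifts the curve, such as relative food prices or activity levels, shifts the resulting poverty line as well.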
In poor countries such as India, where food makes up a large share of the budget, and where the concern with poverty is closely associated with concerns about undernutrition, it makes more sense to use food and nutritional requirements to derive poverty lines than it does in the United States. The "low-cost food plan" in the United States can be replaced by something closer to the minimum adequate diet for the country and type of occupation, and because food is closer to three-quarters than a third of the budget, the "multiplier" needed to allow for nonfood consumption is smaller, less important, and so inherently less controversial.

Even so, the calorie-based procedure of setting a poverty line is subject to a number of serious difficulties. First, the minimum adequate calorie levels are themselves subject to uncertainty and controversy, and some would argue that resolving the arbitrariness about the poverty line with a calorie requirement simply replaces one arbitrary decision with another. Second, the concept of a behavioral Engel curve does not sit well with the notion that there is a subsistence level of calories. Suppose, for example, that a household is poor in that its expected calorie intake conditional on its income is inadequate, but has more than enough to buy the subsistence number of calories if it spent more of its budget on food. It seems that the subsistence number of calories is not really "required" in any absolute sense, or at least that the household is prepared to make tradeoffs between food and other goods, tradeoffs that are not taken into account in setting the line. Third, it is always dangerous to measure welfare using only a part of consumption, even when the part of consumption is as important as is food.
When food is relatively cheap, people will consume more, even if only marginally so, and poverty lines will be higher where the relative price of food is higher, even though consumers may be compensated by lower prices elsewhere in the budget. Bidani and Ravallion (1994) have examined this phenomenon in Indonesia. They show that higher food prices in the cities, together with the lower caloric requirements of more sedentary urban jobs, imply that the urban calorie Engel curve is lower than the rural calorie Engel curve. At the same level of PCE, urban consumers consume fewer calories than do rural consumers. In consequence, a common nutritional standard requires a higher level of PCE in the cities. In the Indonesian case, this results in a poverty line so much higher in urban than rural areas that there appears to be more poverty in the former, even though real incomes and real levels of consumption are much higher in the cities.

Once poverty lines are established they often remain fixed in real terms. In the United States, the current poverty line is simply Orshansky's 1961 poverty line updated for increases in the cost of living. In India, as detailed above, there have been revisions to methodology, but the lines have changed very little in real terms, and a number of studies, such as Bardhan (1973) and Ahluwalia (1978, 1985), have used poverty lines close to those proposed by Dandekar and Rath in 1971. This constancy reflects a view of poverty as an absolute; poverty is defined by the ability to purchase a given bundle of goods so that the poverty line should remain fixed in real terms. However, not everyone accepts this position, and it can be argued that poverty lines should move with the general standard of living, although perhaps not at the same rate.
Some would argue that poverty is a purely relative phenomenon, defined by current social customs, and that the poor are simply those in the bottom percentiles of the distribution of welfare. An intermediate view comes from Sen's (1985, 1992) view of welfare in terms of the capability to function in society. If economic growth means that food is sold with an increased amount of packaging and service built in, if city center stores relocate to suburban areas that cannot be reached on foot, and if urban growth increases the cost and time to travel to work, then a fixed absolute poverty line makes no sense.

There is also some relevant empirical evidence that comes from asking people whether they are poor and what the poverty line ought to be (see Mangahas 1979, 1982, 1985, who makes good use of such surveys to assess poverty in the Philippines). In the United States, Gallup polls have regularly asked respondents how much money they would need "to get along," and more occasionally what they think would be an adequate poverty line. In the 1960s, the mean responses about the latter were close to the official (Orshansky) line, but have since increased in real terms, although not always as fast as has average real disposable income (see Rainwater 1974 and Vaughan 1992). Ravallion (1993) has also examined the cross-country relationship between real gross domestic product (GDP) and poverty lines, and found that the elasticity is close to unity. While many people (including this author) are uncomfortable with an entirely relative concept of poverty, it is surely right that there should be some movement of the line in response to changes in mean levels of living.

The conceptual and practical difficulties over the choice of a poverty line mean that all measures of poverty should be treated with skepticism. For policy evaluation, the social welfare function is all that is required to measure welfare, including an appropriate treatment of poverty.
While it is possible (and in my view desirable) to give greater weight to the needs of the poorest, I see few advantages in trying to set a sharp line, below which people count and above which they do not. Poverty lines and poverty counts make good headlines, and are an inevitable part of the policy debate, but they should not be used in policy evaluation. Perhaps the best poverty line is an infinite one; everyone is poor, but some a good deal more so than others, and the poorer they are the greater weight they should get in measuring welfare and in policy evaluation.

The concept of a poverty line is deeply embedded in the poverty literature, and measures of poverty are typically based on it. Even so, a good deal of the recent literature on poverty has followed Atkinson (1987) in recognizing that the poverty line is unlikely to be very precisely measured, and trying to explore situations in which poverty measures are robust to this uncertainty. I shall return to this approach below once I have introduced some of the standard measures.

Measures of poverty

There are a number of good reviews of alternative poverty measures and their properties, see in particular Foster (1984) and Ravallion (1993), so that I can confine myself here to a brief discussion of the most important measures. The obvious starting point, and the measure most often quoted, is the headcount ratio, defined as the fraction of the population below the poverty line. If the line is denoted by z, and the welfare measure is x, then the headcount ratio is

\[ P_0 = \frac{1}{N}\sum_{i=1}^{N} 1(x_i \le z) \tag{3.9} \]

where 1(.) is an indicator function that is 1 if its argument is true and 0 otherwise. The sum of the indicators on the right-hand side of (3.9) is the number of people in poverty, so that \(P_0\) is simply the fraction of people in poverty.

It is worth noting that with a change of sign, (3.9) could conceivably be regarded as a social welfare function.
It is the average value of a rather strange valuation function in which x counts as -1 when it is below the poverty line z, and as 0 when it is above z. This function is illustrated as the heavy line labeled \(P_0\) in Figure 3.2; it is nondecreasing in x, so it has some of the characteristics of a utility function, but its discontinuity at the poverty line means that it is not concave. It is this lack of concavity that violates the principle of transfers, and makes it possible to increase social welfare by taking money from the very poor to lift some better-off poor out of poverty.

[Figure 3.2. Alternative poverty measures and social welfare. Vertical axis: contribution to social welfare. Horizontal axis: individual welfare x, with the poverty line z marked.]

Even if the poverty line were correctly set, and even if it were acceptable to view poverty as a discrete state, the headcount ratio would be at best a limited measure of poverty. In particular, it takes no account of the degree of poverty, and would, for example, be unaffected by a policy that made the poor even poorer. The headcount ratio gives the same measure of poverty whether all the poor are just below a generous poverty line, or whether they are just above an ungenerous level of subsistence. One way of doing better is to use the poverty gap measure

\[ P_1 = \frac{1}{N}\sum_{i=1}^{N}\left(1 - \frac{x_i}{z}\right) 1(x_i \le z). \tag{3.10} \]

According to (3.10), the contribution of individual i to aggregate poverty is larger the poorer is i. \(P_1\) can also be interpreted as a per capita measure of the total shortfall of individual welfare levels below the poverty line; it is the sum of all the shortfalls divided by the population and expressed as a ratio of the poverty line itself. Hence if, for example, \(P_1\) were 0.25, the total amount that the poor are below the poverty line is equal to the population multiplied by a quarter of the poverty line.
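As a small numerical illustration of (3.9) and (3.10), the following sketch computes both measures for an invented welfare vector. The function names and the data are hypothetical, and counting someone exactly at the line as poor is a boundary convention assumed here; it makes no difference to the poverty gap, since the shortfall at the line is zero.

```python
def headcount_ratio(x, z):
    """P0: share of individuals whose welfare is at or below the line z."""
    return sum(1 for xi in x if xi <= z) / len(x)

def poverty_gap(x, z):
    """P1: average proportional shortfall below z, with the nonpoor counted as zero."""
    return sum(1 - xi / z for xi in x if xi <= z) / len(x)

welfare = [20.0, 50.0, 80.0, 120.0, 200.0]  # invented data
z = 100.0

p0 = headcount_ratio(welfare, z)   # three of five people are poor
p1 = poverty_gap(welfare, z)       # shortfalls 0.8, 0.5, 0.2 averaged over five people

# Making the poorest person poorer leaves P0 unchanged but raises P1.
poorer = [10.0, 50.0, 80.0, 120.0, 200.0]
p0_poorer = headcount_ratio(poorer, z)
p1_poorer = poverty_gap(poorer, z)

# Total shortfall equals population times P1 times the poverty line.
total_shortfall = len(welfare) * p1 * z
```

With these numbers the total shortfall is 80 + 50 + 20 = 150, which is the population (5) times \(P_1\) (0.3) times the line (100), matching the interpretation given in the text.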
It is tempting to think of \(P_1\) (or at least \(zP_1\)) as a measure of the per capita "cost" of eliminating poverty, but this is far from being so except in the impractical case where lump-sum taxes and subsidies are possible. Even when tax and subsidy administration is efficient and is not corrupt, redistributive taxes have incentive effects that may render the elimination of poverty neither possible nor desirable given the actual range of feasible policies. This is clearly the case in an economy where everyone is poor, but applies much more widely. Once again, the appropriate way to think about tax systems for poverty alleviation is to go back to the social welfare function (3.1), to make sure that it incorporates the appropriate degree of weighting towards the poor, and to apply the general theory of tax design (see Newbery and Stern 1987 for a general discussion of such problems in the contexts of developing countries, and Chapter 5 below for some of the empirical issues).

The poverty gap measure (3.10) has a number of advantages over the headcount ratio (3.9). In particular, the summand is now a continuous function of x, so that there is no longer a discontinuity in the contribution of an individual to the poverty measure as that individual's x passes through the poverty line. When x is just below z, the contribution to poverty is very small, it is zero when x equals z, and remains at zero above z. Furthermore, the function \((1 - x/z)\,1(x \le z)\) is convex in x, although not strictly so, so that the principle of transfers holds, at least in a weak form. As a result, the social welfare interpretation of the poverty gap measure also makes more sense than that of the headcount ratio. The behavior of each individual's contribution to \(-P_1\) is illustrated in Figure 3.2 by the piecewise linear function rising from -1 to 0, a value which it retains above z.
This function is increasing in x, and is (just) concave, so that while social welfare is not altered by transfers among the poor or among the nonpoor, it is no longer possible to increase social welfare by acting as an anti-Robin Hood, taking resources from the poor to give to the rich.

The poverty gap measure will be increased by transfers from poor to nonpoor, or from poor to less poor who thereby become nonpoor. But transfers among the poor have no effect on the measure of poverty, and on this account we may wish to consider other poverty measures. Sen's (1976b) measure of poverty remedies the defect by incorporating the inequality among the poor. The definition is

\[ P_S = P_0\left(1 - (1 - \gamma^p)\frac{\mu^p}{z}\right) \tag{3.11} \]

where \(\mu^p\) is the mean of x among the poor, and \(\gamma^p\) is the Gini coefficient of inequality among the poor, calculated by treating the poor as the whole population. Note that when there is no inequality among the poor, \(P_S\) reduces to the poverty gap measure \(P_1\). Conversely, when all but one of the poor has nothing, \(P_S = P_0\) and the Sen measure coincides with the headcount ratio. More generally, the Sen measure is the average of the headcount and poverty gap measures weighted by the Gini coefficient of the poor,

\[ P_S = P_0\,\gamma^p + P_1\,(1 - \gamma^p). \tag{3.12} \]

Because Sen's measure depends on the Gini coefficient, it shares two of its inconveniences. First, the Gini, and thus the Sen index, is not differentiable. Although there is no economic reason to require differentiability, the inability to differentiate is sometimes a nuisance. More seriously, Sen's measure cannot be used to decompose poverty into contributions from different subgroups, something that is often informative when monitoring changes in poverty.
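The two limiting cases of the Sen measure can be checked numerically. The sketch below is a minimal illustration, not a definitive implementation: the data are invented, the function names are hypothetical, and the Gini coefficient is computed with a pairwise formula normalized by n(n - 1), an assumption chosen so that the coefficient equals one when a single person holds everything.

```python
def gini(values):
    """Gini coefficient: sum of |xi - xj| over pairs, divided by mean * n * (n - 1).

    The n(n - 1) normalization is an assumption of this sketch.
    """
    n = len(values)
    mean = sum(values) / n
    if n < 2 or mean == 0:
        return 0.0
    pair_sum = sum(abs(a - b) for i, a in enumerate(values) for b in values[:i])
    return pair_sum / (mean * n * (n - 1))

def sen_measure(x, z):
    """P_S = P0 * (1 - (1 - gini_poor) * mean_poor / z), as in (3.11)."""
    poor = [xi for xi in x if xi <= z]
    if not poor:
        return 0.0
    p0 = len(poor) / len(x)
    mean_poor = sum(poor) / len(poor)
    return p0 * (1 - (1 - gini(poor)) * mean_poor / z)

z = 100.0

# No inequality among the poor: the Sen measure equals the poverty gap P1.
equal_poor = [50.0, 50.0, 50.0, 120.0, 200.0]
p1_equal = sum(1 - xi / z for xi in equal_poor if xi <= z) / len(equal_poor)
ps_equal = sen_measure(equal_poor, z)

# All but one of the poor have nothing: the Sen measure equals the headcount P0.
extreme_poor = [0.0, 0.0, 60.0, 120.0, 200.0]
ps_extreme = sen_measure(extreme_poor, z)
```

In the first case the Gini among the poor is zero and the measure collapses to 0.6 x (1 - 0.5) = 0.3; in the second the Gini is one and the measure is the headcount ratio, 0.6.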
If the aggregate poverty measure can be written as a weighted average of the poverty measures for the rural and urban sectors, or for households by age, or by occupation of the head, then changes over time can be similarly decomposed, thus helping to identify groups that are particularly at risk, as well as sometimes pointing to the underlying mechanisms. While decomposability is hardly as fundamental a property as (say) the principle of transfers, it is extremely useful.

Our final poverty measure, or set of measures, comes from Foster, Greer, and Thorbecke (1984). Their measures are direct generalizations of the poverty gap (3.10) and are written, for some positive parameter \(\alpha\),

\[ P_\alpha = \frac{1}{N}\sum_{i=1}^{N}\left(1 - \frac{x_i}{z}\right)^{\alpha} 1(x_i \le z) \tag{3.13} \]

[Figure 3.3. Lorenz curves for distributions A, B, and C. Horizontal axis: cumulative percentage of people.]

[...] everyone's allocation by a positive number, and so can tell us nothing about the mean of the distribution. Apart from this, all the information in the distribution is contained in the Lorenz curve so that, provided we know the mean, it is possible, for example, to recover the density or distribution function from the Lorenz curve.

As first shown by Atkinson (1970), Lorenz curves play a very important role in characterizing the robustness of inequality measures. If two different Lorenz curves do not cross, as is the case for distributions B and either A or C in Figure 3.3, the lower curve can always be transformed into the upper curve by a series of equalizing transfers, by each of which welfare is transferred from a richer to a poorer individual. In consequence, when two Lorenz curves do not cross, the upper one represents an unambiguously more egalitarian distribution, one that will show a lower level of inequality using any measure of inequality that respects the principle of transfers.
In Figure 3.3, the two distributions A and C cross one another (twice as shown), so that there is no unambiguous ranking of inequality without committing to more specific inequality measures. Distribution C is more equal among both the poorest and the richest, but is more unequal in the middle of the distribution than is distribution A. When one Lorenz curve is everywhere above another, we say that the distribution corresponding to the upper curve Lorenz dominates the distribution represented by the lower curve. Lorenz domination does not give a complete ordering of distributions; when Lorenz curves cross, neither distribution dominates the other.

Because the Lorenz curves are unaffected by the mean of the distribution, they cannot be used to rank distributions in terms of social welfare, only in terms of inequality. This deficiency is easily repaired by looking at "generalized" Lorenz curves, a concept introduced by Shorrocks (1983). The horizontal axis for the generalized Lorenz curve is the same as that for the Lorenz curve, the cumulative fraction of the population, but the vertical axis, instead of showing the cumulative share of income, wealth, or consumption, shows the cumulative share multiplied by the mean, so that a Lorenz curve can be converted into a generalized Lorenz curve by multiplying by mean welfare. Clearly, for any single Lorenz curve, this is only a change of scale, and has no effect on its shape; generalized Lorenz curves are used for comparing different distributions with different means and thus with different aggregates. If the generalized Lorenz curve in one period lies above the generalized Lorenz curve in another period, it implies that for all p from 0 to 100, the poorest p percent of the population have more resources in total in the first period's distribution, which will therefore be preferred by any equity-respecting social welfare function.
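The difference between Lorenz and generalized Lorenz dominance can be seen in a short numerical sketch. The distributions below are invented, the function names are hypothetical, and the dominance check assumes that the two samples have the same number of people so that the curves can be compared point by point; it is meant only as an illustration of the definitions above.

```python
def lorenz_curve(values):
    """Cumulative welfare shares of the poorest k people, for k = 0, ..., n."""
    xs = sorted(values)
    total = sum(xs)
    shares, running = [0.0], 0.0
    for v in xs:
        running += v
        shares.append(running / total)
    return shares

def generalized_lorenz(values):
    """Lorenz ordinates multiplied by the mean."""
    mean = sum(values) / len(values)
    return [mean * s for s in lorenz_curve(values)]

def dominates(upper, lower, tol=1e-12):
    """True if `upper` is nowhere below `lower` and somewhere strictly above it.

    Assumes both curves are evaluated at the same population fractions.
    """
    pairs = list(zip(upper, lower))
    return all(u >= l - tol for u, l in pairs) and any(u > l + tol for u, l in pairs)

dist_a = [1.0, 2.0, 3.0, 4.0]   # invented data
dist_b = [2.0, 2.0, 3.0, 3.0]   # same mean as dist_a, more equal
dist_c = [2.0, 4.0, 6.0, 8.0]   # dist_a scaled by two: same shares, higher mean

b_lorenz_dominates_a = dominates(lorenz_curve(dist_b), lorenz_curve(dist_a))
c_lorenz_dominates_a = dominates(lorenz_curve(dist_c), lorenz_curve(dist_a))
c_glorenz_dominates_b = dominates(generalized_lorenz(dist_c), generalized_lorenz(dist_b))
```

Here `dist_c` has exactly the same Lorenz curve as `dist_a`, so it is less equal than `dist_b`, yet its generalized Lorenz curve is nowhere below that of `dist_b` because its mean is twice as large.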
The poorest person has more, there is more in aggregate, and more generally, each quantile of the distribution is higher. Hence an equity-respecting social welfare function will always prefer a distribution whose generalized Lorenz curve lies above another.

The generalized Lorenz curves corresponding to the three distributions in Figure 3.3 are shown in Figure 3.4, where I have assumed that A and B have the same mean, but that mean x in distribution C is higher. As drawn, the effect is to "lift" the distribution C clear of distribution A, so that C now dominates A by the generalized Lorenz criterion, although not by the Lorenz criterion. As a result, distribution C will be preferred to A by any equality-preferring social welfare function. The generalized Lorenz curve of C now crosses that of B, so that the social welfare ranking of the two will depend on the precise social welfare function used, on the tradeoff between more equality in B and the higher mean in C.

[Figure 3.4. Generalized Lorenz curves for Lorenz curves in Figure 3.3. Curves: Distribution A, Distribution B, Distribution C. Horizontal axis: cumulative percentage of people.]

These examples should make it clear once again that inequality by itself is not a measure of welfare. If the mean of distribution C were further increased so that the generalized Lorenz curve for C were everywhere above that of B, we would have a situation where one distribution is preferred to another by all equity-respecting social welfare functions, even though it is more unequal according to all measures of inequality that satisfy the transfer principle.

Lorenz curves and inequality in South Africa and Côte d'Ivoire

Figure 3.5 shows three Lorenz curves for the individual PCE distributions in South Africa in 1993: for the whole population (the outer curve), for Blacks (the broken line), and for Whites (the innermost line).
These curves show, for example, that the poorest 20 (50) percent of South Africans receive only 3 (13) percent of all of PCE, that the poorest 20 (50) percent of Blacks receive 5 (20) percent of all PCE received by Blacks, while the poorest 20 (50) percent of Whites receive 7.5 (28) percent of all PCE received by Whites. Also important to note is that the Lorenz curve for Blacks lies everywhere outside the Lorenz curve for Whites. As a result, the unanimous ranking in Table 3.4, where the distribution of PCE among Whites is shown as more equal than that among Blacks by all the measures, is not a special feature of those particular measures but will be repeated for any other inequality measure that satisfies the principle of transfers. Put differently, provided we respect the principle of transfers, there can be no dispute that there is more inequality among Blacks than Whites.

[Figure 3.5. Lorenz curves for individual PCE by race, South Africa, 1993. Horizontal axis: cumulative percentage of people. Source: Author's calculations based on South African Living Standards Survey, 1993.]

A second example comes from Côte d'Ivoire. Figure 3.6 shows Lorenz curves for the four years of CILSS data, but in a slightly different way from usual. It is frequently the case with empirical Lorenz curves, as opposed to the theoretical curves in Figure 3.3, that different curves are very close to one another and are not easily told apart by inspection. That this was not the case for South Africa is because of that country's extreme differences in inequality between the races; Whites and Blacks are not only far apart in average living standards, but also in the dispersion of their living standards. Changes in inequality over time are likely to be less marked, and in Côte d'Ivoire from 1985-88, the Lorenz curves do not move much.
Differences are more easily seen if we plot, not the Lorenz curve itself, but the distance of the Lorenz curve from the 45-degree line, and these curves are plotted in Figure 3.6. Because signs are changed, the higher curves are now those with the greatest inequality.

[Figure 3.6. Transformed Lorenz curves, Côte d'Ivoire, 1985-88. Horizontal axis: cumulative percentage of people. Note: The curves show the vertical distance of the Lorenz curve from the 45-degree line. Source: Author's calculations using CILSS.]

The figure shows why the results in Table 3.2 come out as they do and tells us what would happen if we were to work with alternative measures of inequality. The curve for 1988 lies entirely below the curve for 1986, and both lie below the curves for 1985 and 1987; these last cannot be ranked relative to one another, but cross at around the 70th percentile of the population. Below the 70th percentile, the 1985 curve is higher because the poorest part of the distribution was poorer in 1985 than in 1987. Above the 70th percentile, which is a part of the distribution about which we may be less concerned, the curve for 1985 is lower because the share of PCE in the hands of the best-off people is less. As we saw in Table 3.2, all the measures agreed that 1988 was more equal than 1986, and that both were more equal than either 1985 or 1987, while disagreeing on the relative ranking of those two years. This is exactly what must happen given the shapes of the Lorenz curves.

Table 3.1 shows that the average level of PCE is falling over time, so that a graph of generalized Lorenz curves over time will be one in which the standard Lorenz curves are progressively rotated clockwise from the origin. Because the changes in the mean are large relative to the changes in inequality, the generalized Lorenz curves move downward over time, with 1986 below 1985, 1987 below 1986, and 1988 below 1987.
A small exception is that the generalized Lorenz curve for 1986 is slightly higher than that for 1985 at the very bottom of the distribution; the poorest people had absolutely more in 1986 than a year earlier, not just a larger share. This apart, social welfare fell in Côte d'Ivoire from 1985 to 1988, and the conclusion does not depend on how we measure inequality nor on our degree of aversion to it.

*Stochastic dominance

The mechanics of ranking welfare distributions are clarified by reference to the concept of stochastic dominance. While it is possible to read discussions of inequality and of poverty without knowing anything about stochastic dominance, a deeper understanding can be obtained with the aid of the few straightforward definitions and results given in this subsection. For those who wish to skip this material, I shall label the stochastic dominance results when I come to them, so that it is possible to move directly to the next subsection without loss of continuity.

Stochastic dominance is about ranking distributions, and the Lorenz dominance discussed above is only one of a set of definitions and results. All work by treating consumers as a continuum, so that instead of dealing with concepts such as the fraction of people whose consumption (say) is less than x, we think of x as being continuously distributed in the population with CDF F(x). We shall typically be concerned with the comparison of two such (welfare) distributions, whose CDFs I write as \(F_1(x)\) and \(F_2(x)\), and we want to know whether we can say that one is "better" than the other, in some sense to be defined.

The first definition is first-order stochastic dominance. We say that distribution with CDF \(F_1(x)\) first-order stochastically dominates distribution \(F_2(x)\) if and only if, for all monotone nondecreasing functions \(\alpha(x)\),

\[ \int \alpha(x)\,dF_1(x) \ge \int \alpha(x)\,dF_2(x) \tag{3.20} \]

where the integral is taken over the whole range of x.
The way to appreciate this definition is to think of \(\alpha(x)\) as a valuation function, and monotonicity as meaning that more is better (or at least no worse). According to (3.20), the average value of \(\alpha\) is at least as large in distribution 1 as in distribution 2 no matter how we value x, so long as more is better. In consequence, distribution 1 is "better," in the sense that it has more of x, and it stochastically dominates distribution 2.

There is a useful result that provides an alternative characterization of first-order stochastic dominance. The condition (3.20) is equivalent to the condition that, for all x,

\[ F_2(x) \ge F_1(x) \tag{3.21} \]

so that the CDF of distribution 2 is always at least as large as that of distribution 1. (The proof of the equivalence of (3.20) and (3.21) is straightforward, and is left as an exercise. But we can go from (3.20) to (3.21) by choosing \(\alpha(x)\) to be the function that is zero for \(x \le a\) and 1 thereafter, and we can go from (3.21) to (3.20) by first integrating the latter by parts.)

Note that distribution 1, which is the dominating distribution, is on the left-hand side of (3.20), but on the right-hand side of (3.21). Intuitively, (3.21) says that distribution 2 always has more mass in the lower part of the distribution, which is why any monotone increasing function ranks distribution 1 ahead of distribution 2.

The second definition is of second-order stochastic dominance, a concept that is weaker than first-order stochastic dominance in that first-order dominance implies second-order dominance, but not vice versa. We say that distribution \(F_1(x)\) second-order stochastically dominates distribution \(F_2(x)\) if and only if, for all monotone nondecreasing and concave functions \(\alpha(x)\), the inequality (3.20) holds. Since monotone nondecreasing concave functions are members of the class of monotone nondecreasing functions, first-order stochastic dominance implies second-order stochastic dominance.
In first-order stochastic dominance, the function \(\alpha(x)\) has a positive first derivative; in second-order stochastic dominance, it has a positive first derivative and a negative second derivative.

When \(\alpha(x)\) is concave, we can interpret the integrals in (3.20) as additive social welfare functions with \(\alpha(x)\) the social valuation (utility) function for individual x. Given this interpretation, second-order stochastic dominance is equivalent to social welfare dominance for any concave utility function. As we have already seen, social welfare dominance is equivalent to generalized Lorenz dominance, so we also have the result that generalized Lorenz dominance and second-order stochastic dominance are equivalent. For distributions whose means are the same, second-order stochastic dominance, welfare dominance, and (standard) Lorenz dominance are the same.

Second-order stochastic dominance, like first-order stochastic dominance, can be expressed in more than one way. In particular, second-order stochastic dominance of \(F_1(x)\) over \(F_2(x)\) implies, and is implied by, the statement that, for all x,

\[ D_2(x) = \int_0^x F_2(t)\,dt \ge \int_0^x F_1(t)\,dt = D_1(x) \tag{3.22} \]

so that second-order stochastic dominance is checked, not by comparing the CDFs themselves, but by comparing the integrals beneath them. We shall see examples of both comparisons in the next subsection. As we might expect, the fact that first-order stochastic dominance implies second-order stochastic dominance is also apparent from the alternative characterizations (3.21) and (3.22). Clearly, if (3.21) holds for all x, (3.22) must also hold for all x. However, when discussing poverty, we will sometimes want a restricted form of stochastic dominance in which (3.21) is true, not for all x, but over some limited range \(z_0 \le x \le z_1\). But when (3.21) holds in this restricted form it no longer implies (3.22) except in the special case when \(z_0\) is the lowest possible value of x (see (3.27) and (3.28) below).
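For finite samples, the characterizations (3.21) and (3.22) suggest simple checks based on the empirical CDF and its integral. The following is a minimal sketch under stated assumptions: the samples are invented, the function names are hypothetical, dominance is taken in the weak sense (so identical samples dominate one another), and no allowance is made for sampling error. It uses the fact that the integral of an empirical CDF up to x is the sample mean of max(x - v, 0), and that for step functions it is enough to compare at the pooled sample points.

```python
def ecdf(sample, x):
    """Empirical CDF: share of observations at or below x."""
    return sum(1 for v in sample if v <= x) / len(sample)

def cdf_integral(sample, x):
    """Integral of the empirical CDF up to x, equal to the mean of max(x - v, 0)."""
    return sum(max(x - v, 0.0) for v in sample) / len(sample)

def first_order_dominates(s1, s2):
    """Check (3.21): the CDF of s2 is nowhere below the CDF of s1."""
    grid = sorted(set(s1) | set(s2))
    return all(ecdf(s2, x) >= ecdf(s1, x) for x in grid)

def second_order_dominates(s1, s2, tol=1e-12):
    """Check (3.22): the integrated CDF of s2 is nowhere below that of s1."""
    grid = sorted(set(s1) | set(s2))
    return all(cdf_integral(s2, x) >= cdf_integral(s1, x) - tol for x in grid)

# Everyone in `higher` has one unit more than the matching person in `lower`.
higher = [2.0, 3.0, 4.0, 5.0]
lower = [1.0, 2.0, 3.0, 4.0]

# Same mean, but `spread` is more dispersed than `equal`.
equal = [3.0, 3.0, 3.0, 3.0]
spread = [1.0, 3.0, 3.0, 5.0]

fsd_shift = first_order_dominates(higher, lower)
fsd_equal = first_order_dominates(equal, spread)
ssd_equal = second_order_dominates(equal, spread)
```

The second pair shows why second-order dominance is the weaker requirement: the CDFs of `equal` and `spread` cross, so neither first-order dominates the other, but the integrated CDF of `spread` is never below that of `equal`.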
Further orders of stochastic dominance can be defined by continuing the sequence. For first-order dominance, distributions are ranked according to the inequality (3.20) where \(\alpha(x)\) has a nonnegative first derivative. For second-order dominance, the \(\alpha(x)\) function has nonnegative first derivative and nonpositive second derivative. Distribution \(F_1(x)\) third-order stochastically dominates distribution \(F_2(x)\) when (3.20) holds for all functions \(\alpha(x)\) with nonnegative first derivative, nonpositive second derivative, and nonnegative third derivative. And so on in the sequence. Just as second-order dominance can be tested from (3.22) by comparing \(D_1(x)\) and \(D_2(x)\), themselves the integrals of \(F_1(x)\) and \(F_2(x)\), whose relative rankings tell us about first-order dominance, so can third-order dominance be tested using the integrals of \(D_1(x)\) and \(D_2(x)\).

Exploring the welfare distribution: poverty

If robustness analysis is desirable for social welfare and inequality comparisons, it is even more so for the measurement of poverty if we are not to be hostage to an ill-defined and arbitrarily selected poverty line. At the least, we need to explore the sensitivity of the various poverty measures to the choice of z, although when we do so, it should not be a surprise that we are thereby led back to something closely akin to the social welfare approach. If we have literally no idea where the poverty line is, and it is even possible that everyone is poor, poverty measures have to give at least some weight to everyone, although as always, poorer individuals will get more weight than richer ones. But this is exactly what a standard social welfare function does, so that the social welfare approach will only differ from the poverty approach when we can set some limits on permissible poverty lines.

Start from the headcount ratio, and consider what happens as we vary the poverty line z.
Since \(P_0\) is the fraction of the population whose welfare level is below z, we have

\[ P_0(z;F) = F(z) \tag{3.23} \]

where the notation on the left-hand side emphasizes not only that the headcount is a function of the poverty line z, but also that it is a function (technically a functional) of the distribution F. If we have two different distributions \(F_1\) and \(F_2\), relating to different years, sectors, or countries, and we want to know which shows more poverty and the extent to which the comparison depends on the choice of poverty line z, then (3.23) tells us that if, for all poverty lines z,

\[ F_1(z) > F_2(z) \tag{3.24} \]

the headcount will always be higher for the first distribution than the second. Hence, all we have to do to test the robustness of the headcount ratio is to plot the CDFs of the two distributions that we are interested in comparing, and if one lies above the other over the range of relevant poverty lines, then the choice of poverty line within that range will make no difference to the outcome.

In the language of the previous subsection, the poverty ranking of two distributions according to the headcount ratio is robust to all possible choices of poverty line if, and only if, one distribution first-order stochastically dominates the other. In practice, we usually have some idea of the poverty line, or are at least able to rule out some as implausible, so that the more useful requirement is that (3.24) hold over some relevant range, which is a restricted form of stochastic dominance.

Figures 3.7 and 3.8 show part of the cumulative distributions for individual PCE in South Africa and in Côte d'Ivoire for 1985 through 1988. Since poverty lines at the very top of the distribution are usually implausible, even for a poor country, it is not necessary to show the complete range of PCE levels. In the South African case in Figure 3.7, the cutoff of 2,000 rand per capita per month excludes about 20 percent of Whites, but no one else.
As was the case for the Lorenz curves, the extraordinary inequalities in South Africa produce an unusually clear picture. The four distribution functions are quite separate so that, no matter what poverty line we choose, there will be a higher fraction of people in poverty among Blacks than among Coloreds, a higher fraction among Coloreds than among Indians, and a higher fraction among Indians than among Whites.

Figure 3.7. Distribution functions of individual PCE by race, South Africa, 1993 (horizontal axis: PCE, rand per capita per month). Source: Author's calculations based on South African Living Standards Survey, 1993.

The situation for Côte d'Ivoire is less clear because several of the distribution functions cross. Here I have excluded people living in households with per capita monthly expenditure of more than 300,000 CFAF, which is two and a half times the poverty line used in constructing Table 3.3. Given the declines in PCE over time, it is no surprise that, over most of the range, the curves are higher in the later years so that, for most poverty lines, the fraction of poor people will be increasing from 1985 through to 1988. However, around the poverty line of 128,600 CFAF used in Table 3.3, the distribution functions for the first three years are very close and, at the lowest values of PCE, the curve for 1985 lies above that for 1986 and 1987. As we have already seen in making inequality comparisons, the poorest did better in 1986 than in 1985, even though average PCE fell.
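The CDF comparison described above can be sketched in a few lines. This is a minimal illustration, not the procedure used to draw Figures 3.7 and 3.8: the function names, the grid of poverty lines, and the simulated lognormal data are all assumptions made for the example, and sampling weights are ignored for simplicity.

```python
import numpy as np


def empirical_cdf(x, grid):
    """Empirical distribution function: share of the sample with x_i <= each grid value."""
    x_sorted = np.sort(np.asarray(x, dtype=float))
    grid = np.asarray(grid, dtype=float)
    # side="right" counts the observations less than or equal to each grid value
    return np.searchsorted(x_sorted, grid, side="right") / x_sorted.size


def headcount_ranking_is_robust(x1, x2, z_low, z_high, n_grid=200):
    """True when F1(z) >= F2(z) at every poverty line on a grid over [z_low, z_high]."""
    lines = np.linspace(z_low, z_high, n_grid)
    return bool(np.all(empirical_cdf(x1, lines) >= empirical_cdf(x2, lines)))


# Illustrative simulated data (hypothetical, not survey data):
# the second sample is the first scaled up by 50 percent.
rng = np.random.default_rng(0)
pce_a = rng.lognormal(mean=5.0, sigma=0.6, size=2000)
pce_b = 1.5 * pce_a
robust = headcount_ranking_is_robust(pce_a, pce_b, 50.0, 300.0)
```

Checking the inequality only between `z_low` and `z_high` corresponds to the restricted form of dominance; setting the limits to the full range of the data would check unrestricted first-order dominance on the grid.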
To examine the robustness of the other poverty measures, consider the "poverty deficit curve," defined as the area under the CDF up to some poverty line $z$

(3.25) $D(z;F) = \int_0^z F(x)\,dx$

Why this measure is useful is revealed by integrating the right-hand side of (3.25) by parts to give

(3.26) $D(z;F) = zF(z) - \int_0^z f(x)\,x\,dx = zF(z)\left(1 - \frac{\mu^p}{z}\right) = zP_1(z;F)$

where, as before, $\mu^p$ is the mean welfare among the poor and $P_1(z;F)$ is the poverty-gap measure of poverty. Equation (3.26) establishes that we can use the poverty deficit curve to examine the robustness of the poverty-gap measure to different choices of the poverty line in exactly the same way that we used the CDF to examine the robustness of the headcount ratio. If the poverty deficit curve for one distribution lies above the poverty deficit curve of another, the first distribution will always have more poverty according to the poverty-gap measure.

Figure 3.8. Distribution functions of individual PCE, Côte d'Ivoire, 1985-88 (horizontal axis: PCE, thousands of CFAF per capita per month). Source: Author's calculations based on CILSS.

Figure 3.9 shows the lower segments of the poverty deficit curves for the Ivorian data. These curves, which are marked in the same way as Figure 3.8, show that the poverty-gap ratio is higher in 1988 than in 1987 for a range of poverty lines, results that establish some robustness for the estimates in Table 3.3. The poverty deficit curve for 1987 is above that for 1986 and, except for low poverty lines, above that for 1985. Given previous results, the crossing of the 1985 and 1986 curves is to be expected; 1986 was better than 1985 at the bottom of the distribution, but worse on average. It is possible to continue this type of robustness analysis beyond the headcount and poverty-gap ratios to the other poverty measures.
However, it is better at this point to look at the pattern that is emerging, and once again to link the analysis of poverty back to the social welfare function. Note first that if, as happens in South Africa, or in Côte d'Ivoire for 1988 and 1987, one of the distribution functions had been higher than another from 0 up to some plausible upper limit for the poverty line $z^+$, say, then the same would automatically have been true for the poverty deficit curves. Formally, if for two distributions $F_1$ and $F_2$

(3.27) $F_1(x) \ge F_2(x), \quad 0 \le x \le z^+$

then

(3.28) $D(z;F_1) = \int_0^z F_1(x)\,dx \ge \int_0^z F_2(x)\,dx = D(z;F_2), \quad 0 \le z \le z^+$

Hence if the distributions do not cross before the maximum possible poverty line, then not only are the headcount ratios robust to the choice of line, but so are the poverty-gap ratios. Indeed, if we were to push the analysis a stage further, and look at the area under the deficit curve, the resulting curves would not cross if the poverty deficit curves did not cross, so that measures like $P_2$ would also be robust to choice of the line.

Figure 3.9. Poverty deficit curves, Côte d'Ivoire, 1985-88 (horizontal axis: PCE, thousands of CFAF per capita per month). Source: Author's calculations based on CILSS.

Of course, these results only work one way; it is possible for the distribution functions to cross and for the poverty deficit curves not to do so, as indeed is the case for 1987 versus 1986 in Côte d'Ivoire. But if we find that the distribution functions do not cross, we need look no further because all the poverty measures will be robust. If they do cross but the poverty deficit curves do not, then any measure that is sensitive to the depth of poverty will be robust, and so on. While these results that take us from one type of robustness to another are useful, they are not always exactly what we need.
As emphasized by Atkinson (1987), we may have a lower as well as an upper limit for the poverty line, and it may turn out that the distribution functions do not cross between the limits, that is, (3.27) holds for $z^- \le x \le z^+$, say, so that the headcount ratio is robust to the choice of poverty line within the range of possibilities. Since the distribution functions may still cross below $z^-$, we no longer have the implication that (3.28) holds even over the restricted range.

It is also worth considering again what happens when the poverty line can be anything, from zero to infinity. This is the case where the robustness of the headcount ratio is equivalent to the distribution functions never crossing, which is (unrestricted) first-order stochastic dominance. If the poverty deficit curves never cross, we have second-order stochastic dominance; by definition, $F_1$ second-order stochastically dominates $F_2$ if the expectation of all monotone increasing concave functions is larger under $F_1$ than under $F_2$. If the social welfare function is additive over individual welfare levels, concavity is equivalent to diminishing marginal social utility, and thus to a preference for transfers from richer to poorer people. Hence, noncrossing of the poverty deficit curves means that an additive equity-respecting social welfare function would prefer the distribution with the lower curve. Not surprisingly, the poverty deficit curve of one distribution is everywhere above that of another if and only if the generalized Lorenz curve of the former is everywhere below that of the latter; generalized Lorenz curves and social welfare functions will rank distributions in the same way as one another and in the same way as poverty deficit curves when nothing is known about the poverty line.
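The poverty deficit curve of (3.25) is easy to compute from a sample, because the integral of the empirical distribution function from 0 to $z$ equals the sample mean of $\max(z - x_i, 0)$ when welfare levels are nonnegative. The sketch below is illustrative only: the function names and the grid are assumptions, weights are ignored, and the last function simply checks on a grid whether two deficit curves cross.

```python
import numpy as np


def poverty_deficit(x, z):
    """D(z; F) for the empirical distribution: the mean of max(z - x_i, 0)."""
    x = np.asarray(x, dtype=float)
    z = np.atleast_1d(np.asarray(z, dtype=float))
    # rows index poverty lines, columns index individuals
    return np.maximum(z[:, None] - x[None, :], 0.0).mean(axis=1)


def poverty_gap(x, z):
    """P1(z; F): mean proportional shortfall, counting the nonpoor as zero."""
    x = np.asarray(x, dtype=float)
    return float(np.mean(np.where(x <= z, 1.0 - x / z, 0.0)))


def deficit_curves_cross(x1, x2, z_high, n_grid=200):
    """True if neither deficit curve lies weakly above the other on (0, z_high]."""
    lines = np.linspace(z_high / n_grid, z_high, n_grid)
    d1 = poverty_deficit(x1, lines)
    d2 = poverty_deficit(x2, lines)
    return not (np.all(d1 >= d2) or np.all(d1 <= d2))
```

For any sample and poverty line, `poverty_deficit(x, z)` should agree with `z * poverty_gap(x, z)` up to rounding, which is the content of (3.26). The intermediate array has one row per poverty line and one column per observation, so for large surveys a coarser grid or a loop over lines may be preferable.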
If the poverty deficit curves cross or, equivalently, if the generalized Lorenz curves cross, we might want to consider one more order of integration or of dominance. If we draw the curves formed by integrating under the poverty deficit curves, and if these do not cross, then one distribution dominates the other at the third order. This means that it would be preferred by any additive social welfare function where social marginal utility is positive, diminishing, and diminishing at a decreasing rate. This last condition, sometimes referred to as the principle of diminishing transfers (see Kolm 1976), means that, not only does the social welfare function increase when transfers are made from rich to poor, but that a transfer from someone earning 300 rand to someone earning 200 rand is to be preferred to a transfer from someone with 500 rand to someone with 400 rand. These third-order connections were first recognized by Atkinson, who reviews them in more detail in Atkinson (1992).

Finally, it is worth emphasizing again the potential role of measurement error in household expenditures. The addition of random noise to a distribution spreads it out, and the contaminated distribution will be second-order stochastically dominated by the true distribution. If the contamination is similar across years, intertemporal comparisons may be unaffected, but there is no reason to suppose that this is the case. Surveys often have start-up problems in their first year, and accuracy can be expected to improve over time in a well-run survey, or to deteriorate if enumerators are not carefully supervised.

3.2 Nonparametric methods for estimating density functions

All of the techniques discussed in the previous section relate to the distribution functions of welfare, to transformations like Lorenz curves or generalized Lorenz curves, or to the integrals of areas beneath them.
However, for many purposes, we are also interested in the density functions of income, consumption, or welfare. Standard measures of central tendency and dispersion are often most easily visualized in terms of densities, and so it is useful to have techniques that provide estimates of densities. These techniques are the topic of this section, which takes the form of a largely methodological digression between two more substantive discussions. It presents an introduction to the tools of nonparametric density estimation to be used in the next section, as well as in several subsequent chapters. These tools also help in understanding the nonparametric regressions that will be used in the next section. The distribution of welfare is only one of many examples where it is useful to calculate and display a density function. In the next section, where I consider the effects of pricing policies, I shall show how the joint density of the consumption or production of a commodity and levels of living can be used to describe the differential effects of price changes on the well-being of rich and poor.

Estimating univariate densities: histograms

A good place to start is with the distributions of PCE by race in South Africa that were discussed in the previous section. For each of the four groups, Figure 3.10 shows the standard histograms that are used to approximate densities. The histograms are drawn for the logarithm of real PCE at the individual level, not for the level; the distribution of the latter is so positively skewed as to preclude the drawing of informative histograms. The logarithmic transformation yields a distribution that is more symmetric and much closer to normal. Indeed, the curves drawn on each histogram show the normal densities with mean and variance equal to the means and variances of the underlying distributions of the logarithm of PCE; these appear to provide a reasonably good approximation to the empirical densities, at least as represented by the histograms.

Figure 3.10. Histograms of log(PCE) by race, South Africa, 1993 (panels: Blacks, Coloreds, Indians, Whites; horizontal axis: logarithm of PCE). Source: Author's calculations based on South African Living Standards Survey, 1993.

The histograms and normal distributions of Figure 3.10 are all that we need for many purposes. They provide a visual impression of the position and spread of the data, and allow comparison with some convenient theoretical distribution, in this case the (log)normal. However, there are also a number of disadvantages of histograms. There is a degree of arbitrariness that comes from the choice of the number of "bins" and of their widths. Depending on the distribution of points, the choice of a bin boundary on one side or another of a cluster of points can change the shape of the histogram, so that the empirical distribution appears to be different. Perhaps more fundamental are the problems that arise from using histograms, which are tools for representing discrete distributions, to represent the continuously differentiable densities of variables that are inherently continuous. Such representations can obscure the genuine shape of the empirical distribution, and are inherently unsuited to providing information about the derivatives of density functions, quantities that are sometimes of interest in their own right, as we shall see in Chapters 4 and 5.

It is worth asking why these difficulties did not arise when graphing CDFs, or the areas beneath them as in Figures 3.7, 3.8, and 3.9 above. The answer is that they did, but because the data are cumulated and there are a large number of data points, the discontinuities are less apparent.
The empirical distribution functions in Figures 3.7 and 3.8 are calculated from the formula

(3.29) $\hat F(x) = n^{-1}\sum_{i=1}^{n} 1(x_i \le x)$

which is simply the fraction of the sample whose x's are less than or equal to the value $x$. $\hat F(x)$ is a step function that jumps whenever $x$ is equal to a data point, and that is flat between data points. Because there are many data points, the steps are all very small relative to the scale of the figure, and there are many steps, so that the eye does not perceive the jagged shape of the graph, at least away from the upper tail of the distribution where the points thin out. These considerations apply even more strongly to poverty deficit curves, which are the integrals of distribution functions; they are no longer discontinuous at the data points, but only have slopes that are discontinuous step functions. But changes in slopes are even harder to see, and the pictures give the impression of smooth and ever accelerating slopes. When we move from a CDF to a density, we are moving in the opposite direction, differentiating rather than integrating, so that the discontinuities in the empirical distribution function present serious difficulties in estimating densities, difficulties that are magnified even further if we try to estimate the derivatives of densities.

*Estimating univariate densities: kernel estimators

The problems with histograms have prompted statisticians to consider alternative ways of estimating density functions, and techniques for doing so are the main topic of this subsection. One familiar method is to fit a parametric density to the data; the two-parameter lognormal is the simplest, but there are many other possibilities with more parameters to permit a better fit (see, for example, Cramer 1969 or Kakwani 1980). Here I look at nonparametric techniques which, like the histogram, allow a more direct inspection of the data, but which do not share the histogram's deficiencies.
Readers interested in following the topic further are encouraged to consult the splendid (and splendidly accessible) book by Silverman (1986), on whose treatment the following account is based.

Perhaps the simplest way to get away from the "bins" of the histogram is to try to estimate the density at every point along the x-axis. With a finite sample, there will only be empirical mass at a finite number of points, but we can get round this problem by using mass at nearby points as well as at the point itself. The essential idea is to estimate the density $f(x)$ from the fraction of the sample that is "near" to $x$. One way of doing this is to choose some interval or "band," and to count the number of points in the band around each $x$. Think of this as sliding the band (or window) along the x-axis, calculating the fraction of the sample per unit interval within it, and plotting the result as an estimate of the density at the mid-point of the band. If the "bandwidth" is $h$, say, the so-called naive estimator is

(3.30) $\hat f(x) = \frac{1}{nh}\sum_{i=1}^{n} 1\left(-\frac{1}{2} \le \frac{x - x_i}{h} \le \frac{1}{2}\right)$

At each point $x$, we pass through the sample, giving a score of 1 to each point within $h/2$ of $x$, and zero otherwise; the density is estimated as the total score as a fraction of the sample size and divided by $h$ to put it on a per unit basis.

The choice of the bandwidth $h$ is something to which I shall return. For the moment, the point to note is that the bandwidth ought to be smaller the larger is the sample size. If we only have a few points, we need large bands in order to get any points in each, even if the wide bandwidth means that we risk biasing the estimate by bringing into the count data that come from a different part of the distribution. However, as the sample size grows, we can shrink the bandwidth so that, in the limit when we have an infinite amount of data, the bandwidth will be zero, and we will know the true density at each point. In fact, we have to do a little more than this.
What has to happen in order to get a consistent estimate of the density at each point is that the bandwidth becomes smaller at a rate that is slower than the rate at which the sample size is increasing. As a result, not only does the shrinking bandwidth guarantee that bias will eventually be eliminated by concentrating only on the mass at the point of interest, but it also ensures that variance will go to zero as the number of points within each band increases and the average within the band becomes more precise. In this way, the increase in the sample size is shared between more points per band, to increase the precision, and smaller bandwidths, so as to ultimately eliminate bias. Of course, because some of the benefits of larger sample sizes have to be "diverted" from putting more observations into each band and devoted to shrinking the bandwidth, the rate of convergence to the density at each $x$ is bound to be slower than the normal root-N convergence that is standard in sampling or regression analysis.

While the naive estimator captures the essential idea of nonparametric density estimation using the "kernel" method (the kernel is the band or indicator function in (3.30)), it does not solve one of the problems with which we began. In particular, there will be steps in $\hat f(x)$ every time a data point enters or exits the band. But this can be dealt with by a simple modification. Instead of giving all the points inside the band equal weight, we give more weight to those near to $x$ and less to those far away, so that points have a weight of zero both just outside and just inside the band. We can do this quite generally by replacing the indicator function in (3.30) by a "kernel" function $K(\cdot)$, so that

(3.31) $\hat f(x) = \frac{1}{nh}\sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right)$

which is the "kernel estimate" of the density $f(x)$.

There are many possible choices for the kernel function.
Because it is a weighting function, it should be positive and integrate to unity over the band, it should be symmetric around zero, so that points below $x$ get the same weight as those an equal distance above, and it should be decreasing in the absolute value of its argument. The "rectangular" kernel in (3.30), so called because all observations in the band get equal weight, satisfies all these criteria except the last. A better choice is a kernel function that uses quadratic weights. This is the Epanechnikov kernel

(3.32) $K(z) = 0.75\,(1 - z^2)$ for $-1 \le z \le 1$, and $K(z) = 0$ for $|z| > 1$

whose weights have an inverted U-shape that declines to zero at the band's edges. Another obvious source of kernels is the class of symmetric density functions, the most popular of which is the Gaussian kernel

(3.33) $K(z) = (2\pi)^{-0.5}\exp(-z^2/2)$.

The Gaussian kernel does not use a discrete band, within which observations have weight and outside of which they do not, but instead gives all observations some weight at each point in the estimated density. Of course, the normal density is very small beyond a few standard deviations from the mean, so that the Gaussian kernel will assign very little weight in the estimate of the density at $x$ to observations that are further than (say) $3h$ from $x$. A third useful kernel is the quartic or "biweight"

(3.34) $K(z) = \frac{15}{16}(1 - z^2)^2$ for $-1 \le z \le 1$, and $K(z) = 0$ for $|z| > 1$.

The quartic kernel behaves similarly to the Epanechnikov kernel, declining to zero at the band's edges, but has the additional property that its derivative is continuous at the edge of the band, a property that is useful in a number of circumstances that we shall see as we proceed.

Although the choice of kernel function will influence the shape of the estimated density, especially when there are few points and the bandwidth is large, the literature suggests that this choice is not a critical one, at least among sensible alternatives such as those listed in the previous paragraph.
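The kernels in (3.30) and (3.32) through (3.34) and the estimator (3.31) can be sketched as follows. This is a minimal unweighted illustration with hypothetical function names; it evaluates the estimate on a user-supplied grid and makes no attempt at computational efficiency.

```python
import numpy as np


def rectangular(z):
    """Indicator kernel of the naive estimator (3.30): weight 1 within half a bandwidth."""
    return np.where(np.abs(z) <= 0.5, 1.0, 0.0)


def epanechnikov(z):
    """Quadratic weights (3.32), zero outside [-1, 1]."""
    return np.where(np.abs(z) <= 1.0, 0.75 * (1.0 - z**2), 0.0)


def gaussian(z):
    """Standard normal density (3.33); every observation gets some weight."""
    return np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)


def quartic(z):
    """Quartic or biweight kernel (3.34), with derivative continuous at the band's edge."""
    return np.where(np.abs(z) <= 1.0, (15.0 / 16.0) * (1.0 - z**2) ** 2, 0.0)


def kernel_density(x, grid, h, kernel=epanechnikov):
    """Kernel estimate (3.31) of the density at each point of grid, with bandwidth h."""
    x = np.asarray(x, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # rows index evaluation points, columns index observations
    z = (grid[:, None] - x[None, :]) / h
    return kernel(z).sum(axis=1) / (x.size * h)
```

Calling `kernel_density(x, grid, h, kernel=rectangular)` gives the naive estimator; swapping in the other kernels at the same bandwidth is a quick way to see how little the choice of kernel changes the picture relative to the choice of $h$.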
As a result, a kernel can be chosen on other grounds, such as computational convenience or the requirement that it be continuously differentiable at the boundary.

More important is the choice of the bandwidth, and this is a practical problem that has to be faced in every application. As we have already seen, the bandwidth controls the trade-off between bias and variance; a large bandwidth will provide a smooth and not very variable estimate but risks bias by bringing in observations from other parts of the density, while a small bandwidth, although helping us pick up genuine features of the underlying density, risks producing an unnecessarily variable plot. Estimating densities by kernel methods is an exercise in "smoothing" the raw observations into an estimated density, and the bandwidth controls how much smoothing is done. Oversmoothed estimates are biased, and undersmoothed estimates too variable.

A formal theory of the trade-off between bias and variance provides helpful insights and is a useful guide to bandwidth selection. In standard parametric inference, optimal estimation is frequently based on minimizing mean-squared error between the estimated and true parameters. In the nonparametric case, we are attempting to estimate, not a parameter, but a function, and there will be a mean-squared error at each point on the estimated density. One natural procedure is to attempt to minimize the mean integrated squared error, defined as the expectation of the integral of the squared error over the whole density. Silverman (1986, pp. 38-40) shows how to approximate the mean integrated squared error for a kernel estimate of a density, and shows that the (approximate) optimal bandwidth is

(3.35) $h^* = \left[\int z^2 K(z)\,dz\right]^{-2/5}\left[\int K(z)^2\,dz\right]^{1/5}\left[\int f''(x)^2\,dx\right]^{-1/5} N^{-1/5}$.

Since the evaluation of (3.35) requires knowledge of the very density that we are trying to estimate, it is not directly useful, but is nevertheless informative.
It confirms that the bandwidth should shrink as the sample size increases, but that it should do so only very slowly, in (inverse) proportion to the fifth root of $N$. Note too the importance of the absolute size of the second derivative of the density. If there is a large amount of curvature, then estimates based on averaging in a band will be biased, so that the bandwidth ought to be small, and conversely on segments of the density that are approximately linear.

For most of the applications considered in this book, an adequate procedure is to consider a number of different bandwidths, to plot the associated density estimates, and to judge by eye whether the plots are undersmoothed or oversmoothed. This can perhaps be regarded as an informal version of an iterative procedure that computes a "pilot" estimate, calculates the optimal bandwidth from (3.35) assuming that the pilot is the truth, and then repeats. Applying the informal procedure usually leads to an easy separation of features in the density that are driven by random sampling from those that appear to be genuine characteristics of the underlying law from which the sample is drawn. There should also be some preference for undersmoothing when using graphical methods; the eye can readily ignore variability that it judges to be spurious, but it cannot discern features that have been covered up by oversmoothing.

It is very useful in these calculations to have a good bandwidth with which to start. If it is assumed that the density has some specific form, normal being the obvious possibility, then (3.35) yields an optimal bandwidth once a kernel is chosen. If the estimated density has the right shape, we have a good choice (indeed the best choice), and otherwise we can make informal adjustments by hand.
If both the kernel and the density are Gaussian, (3.35) gives an optimal bandwidth of $1.06\,\sigma N^{-1/5}$, where $\sigma$ is the standard deviation of the density, which can be calculated prior to estimation. Silverman suggests that better results will typically be obtained by replacing $\sigma$ by a robust measure of spread, in which case the optimal bandwidth is

(3.36) $h^* = 1.06\,\min(\sigma,\ 0.75\,\mathrm{IQR})\,N^{-1/5}$

where IQR is the interquartile range, the difference between the 75th and 25th percentiles. Similar expressions can easily be obtained from (3.35) using other kernels; for the Epanechnikov, the multiplying factor 1.06 should be replaced by 2.34, for the quartic by 2.78.

Estimating univariate densities: examples

The techniques of the previous subsection can be illustrated using the distributions of log(PCE) in Côte d'Ivoire and South Africa, as well as similar distributions from Thailand that I shall refer to again in the next section. Figure 3.11 shows the estimated density function for PCE in South Africa, with the data pooled by race. The calculations were done using the "kdensity" command in STATA, weighted using the sampling weights multiplied by household size, and with STATA's default choice of kernel, the Epanechnikov. The weighting is done using the obvious modification of (3.31)

(3.37) $\hat f_w(x) = h^{-1}\sum_{i=1}^{N} v_i K\left(\frac{x - x_i}{h}\right)$

where $v_i$ are the normalized weights, the sampling weights normalized by their sum (see equation 1.25). The top left panel uses STATA's default bandwidth, which comes from a formula similar to (3.36), while the other panels show what happens if the bandwidth is scaled up or down.

The South African expenditure distribution is a good illustration because the distributions for Africans and Whites are so far apart that the combined density has a distinct "bump," perhaps even a second mode, around the modal log(PCE) for Whites. With the default bandwidth, this can be clearly seen in the estimated density.
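The rule-of-thumb bandwidth (3.36) and the weighted estimator (3.37) can be sketched as follows. This is not the STATA "kdensity" implementation; the function names are hypothetical, and as a simplifying assumption the spread measures in the bandwidth rule are computed without weights.

```python
import numpy as np


def epanechnikov(z):
    """Epanechnikov kernel (3.32)."""
    return np.where(np.abs(z) <= 1.0, 0.75 * (1.0 - z**2), 0.0)


def rule_of_thumb_bandwidth(x, factor=1.06):
    """Bandwidth from (3.36); factor=1.06 suits a Gaussian kernel, 2.34 the Epanechnikov."""
    x = np.asarray(x, dtype=float)
    sd = x.std(ddof=1)
    q25, q75 = np.percentile(x, [25, 75])
    # unweighted standard deviation and interquartile range (an assumption of this sketch)
    return factor * min(sd, 0.75 * (q75 - q25)) * x.size ** (-0.2)


def weighted_kernel_density(x, weights, grid, h, kernel=epanechnikov):
    """Weighted kernel estimate (3.37) at each point of grid."""
    x = np.asarray(x, dtype=float)
    v = np.asarray(weights, dtype=float)
    v = v / v.sum()  # normalized weights sum to one
    grid = np.asarray(grid, dtype=float)
    z = (grid[:, None] - x[None, :]) / h
    return (kernel(z) * v[None, :]).sum(axis=1) / h
```

A typical use would compute `h0 = rule_of_thumb_bandwidth(log_pce, factor=2.34)` and then plot `weighted_kernel_density(log_pce, w, grid, m * h0)` for multipliers `m` such as 0.5, 1, 2, and 4, where `log_pce`, `w`, and `grid` stand for the user's own data, weights, and evaluation points.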
It is a good deal less clear when the bandwidth is doubled, and is altogether smoothed away when the bandwidth is doubled again. Choosing too large a bandwidth runs the risk of smoothing out genuine, and genuinely important, features of the data. The figure in the bottom right panel uses a bandwidth that would usually be regarded as too small. Certainly, the graph is not very smooth, and the roughness is almost certainly a feature of the estimation, and not of the underlying true density. Nevertheless, the mode for Whites is clear enough, and the eye is usually capable of smoothing away irrelevant roughness. In this sort of visual exercise, variance is less of a risk than bias, and some undersmoothing is often useful.

Figure 3.11. Panels: Optimal bandwidth; Twice optimal; Four times optimal; Half optimal. Horizontal axis: logarithm of PCE. Source: Author's calculations based on South African Living Standards Survey, 1993.

Figure 3.12 shows the results of estimating kernel densities for the logarithm of PCE for the four years of data from Côte d'Ivoire, and provides yet another representation of data that should by now be familiar. We see once again, although in a different form, the features displayed by the summary statistics in Table 3.1, and in the Lorenz, distribution, and deficit curves in Figures 3.6, 3.8, and 3.9. The densities generally move to the left over time, although 1986 moves to the right at the bottom, decreasing both poverty and inequality.

*Extensions and alternatives

Although kernel estimators are the only nonparametric estimators of densities that I shall use in this book, there are several alternative techniques. Even within the kernel class, there are different ways of selecting the bandwidth, and there are several useful estimators based on other techniques.
When the estimates are used for more than graphical presentation, for example as the input into further calculations, it is essential to have a method of bandwidth selection that is objective and replicable across investigators. Apart from the use of rules based on specific distributions, such as (3.36), the most commonly used technique is cross-validation (see again Silverman 1986, pp. 48-6, or Stoker 1991, pp. 123-24). The essential idea here is to compute alternative estimates of the density based on different bandwidths, to use the result to get some idea of the relationship between the bandwidth and mean-squared error, and thus to select a bandwidth that minimizes error. Such calculations are computer-intensive, and are unnecessary for the types of applications considered in this book: when our main concern is the graphical inspection of data, it is usually sufficient to try several bandwidths, and to select a suitable one by inspection of the results.

Figure 3.12. Estimated density functions for log(PCE), Côte d'Ivoire, 1985-88 (horizontal axis: logarithm of PCE). Source: Author's calculations based on CILSS.

Kernel methods can also be generalized to allow different bandwidths at different points of the distribution. One nonkernel approach that does so automatically is the nearest neighbor method. Here the estimate of the density is obtained by calculating the distance from each point $x$ to its kth nearest neighbor; $k$ is a number chosen by the investigator, equal perhaps to a fifth of the sample size, and which plays the same role in nearest neighbor estimation as does the bandwidth in kernel estimation. If $d_k(x)$ is the distance from $x$ to the kth nearest of its neighbors, whether to the left or the right, the estimate of the density at $x$ is given by

(3.38) $\hat f(x) = \frac{k - 1}{2\,d_k(x)\,N}$.

The logic is straightforward.
Since the kth nearest neighbor is at distance $d_k(x)$, there are $k - 1$ points, or a fraction $(k - 1)/N$ of the sample, to the left or right of $x$, but in either case within an interval of width $2d_k(x)$ around it. The fraction of the sample per unit of interval is therefore given by (3.38). The larger is $k$, the smoother the estimate, and the larger the risk of bias, while small $k$ avoids bias but risks unnecessary variability; hence the analogy between $k$ and the bandwidth. However, the nearest neighbor method, unlike the kernel method with fixed bandwidth, keeps the number of points in the band fixed, so that precision is more equally spread through the density.

One of the attractions of kernel methods is that they can be routinely extended to higher dimensional cases, at least in principle. Once again, the naive or rectangular kernel estimator is the most straightforward case to consider. Suppose that we have two variables $x_1$ and $x_2$, and that we have drawn a standard scatter diagram of one against the other. To construct the density at the point $(x_1, x_2)$ we count the fraction of the sample in, not a band, but a box around the point. If the area around the point is a square with side $h$, which is the immediate generalization of an interval of width $h$ in the unidimensional case, then the rectangular kernel estimator is

(3.39) $\hat f(x_1, x_2) = \frac{1}{Nh^2}\sum_{i=1}^{N} 1\left(-\frac{1}{2} \le \frac{x_1 - x_{1i}}{h} \le \frac{1}{2}\right) 1\left(-\frac{1}{2} \le \frac{x_2 - x_{2i}}{h} \le \frac{1}{2}\right)$

This is the immediate extension of (3.30) to the bivariate case, with the product of the two kernels indicating whether the point is or is not in the square around $(x_1, x_2)$, and the division by $h^2$ instead of $h$ required because we are now counting fraction of sample per unit area rather than interval, and the area of each square is $h^2$. Of course the rectangular kernel is no better for bivariate than for univariate problems, and the indicator functions in (3.39) need to be replaced by a bivariate kernel that gives greater weights to observations close to $(x_1, x_2)$.
There is also no need to use the square-contoured kernels that result from multiplying the univariate kernels, and an obvious alternative is to count (and weight) points in a circle around each point where we want to estimate the density. However, there is a new issue that did not arise in the univariate case, and that is whether it makes sense to use the same bandwidth in both dimensions. For example, if the variance of $x_1$ is much larger than that of $x_2$ it makes more sense to use rectangles instead of squares, or ellipses instead of circles, and to make the axes larger in the direction of $x_1$. Similar considerations apply when the two variables are correlated, where we would want to align the ellipses in the direction of the correlation. In practice, these issues are dealt with by transforming the data prior to the calculations so that the transformed variables have equal variance and are orthogonal to one another. The transformation done, it is then appropriate to apply bivariate kernels that treat the two variables symmetrically, to estimate a density for the transformed observations, and then to transform back at a final stage.

These operations can all be done in one stage. I illustrate for the bivariate version of the Epanechnikov kernel; while it would be possible to use the product of two univariate Epanechnikov kernels in the bivariate context, we can also use the "circular" form

(3.40) $K(z_1, z_2) = (2/\pi)\,(1 - z_1^2 - z_2^2)\; 1(z_1^2 + z_2^2 \le 1)$.

If the scatter of the data were approximately circular, which requires that the two variables be uncorrelated and have the same variance, we could apply (3.40) directly. More generally, we transform the data using its variance-covariance matrix before applying the kernel smoothing.
This can be done in one step by writing V for the 2 x 2 variance-covariance matrix of the sample, defining

(3.41) $t_i^2 = (x_i - x)'\, V^{-1} (x_i - x)$

and calculating the density estimate using the Epanechnikov kernel as

(3.42) $\hat f(x_1, x_2) = \dfrac{2\,(\det V)^{-1/2}}{\pi N h^2} \sum_{i=1}^{N} \left(1 - \dfrac{t_i^2}{h^2}\right) 1\!\left(\dfrac{t_i}{h} \le 1\right)$.

Corresponding formulas can be readily derived for the Gaussian and quartic kernels; for the former, (3.40) is replaced by the standardized bivariate normal, and for the latter, the $2/\pi$ is replaced by $3/\pi$ and the deviation of the squares from unity is itself squared.

The bivariate density estimates display the empirical structure underlying any statistical analysis of the relationship between two variables. Just as univariate densities are substitutes for histograms, bivariate densities can be used in place of cross-tabulation. In the context of welfare measurement, bivariate densities can illustrate the relationship between two different measures of welfare, calories and income, or income and expenditure, or between welfare measures in two periods. They can also be used to display the allocation of public services in relation to levels of living, and in the next section, I shall show how they can be put to good purpose to illuminate the distributional effects of pricing policies.

There is no difficulty in principle in extending these methods to the estimation of three- or higher-dimensional densities. Even so, there are a number of practical problems. One is computational cost: if there are k dimensions and N observations, the evaluation of the density on a k-dimensional grid with G points in each dimension requires $NG^k$ evaluations, a number that quickly becomes prohibitively large as k increases. A second difficulty is the "curse of dimensionality," the very large sample sizes required to give adequate estimates of multidimensional densities. An analogy with cross-tabulation is useful.
For univariate and bivariate cross-tabulations, sample sizes of a few hundred can provide useful cell means, and so it is with density estimation provided we choose an appropriate bandwidth. In high-dimensional cross-tabulations, cells are frequently empty even when the underlying density is nonzero, and we need very large sample sizes to get an adequate match between the sample and the population.

The third difficulty is one of presentation. Univariate densities are straightforwardly shown as plots of f(x) against x, and with two-dimensional densities, we can use the three-dimensional surface and contour plots that will be illustrated in Figures 3.13 through 3.16 below. Four dimensions can sometimes be dealt with by coloring three-dimensional plots, or if one of the variables is discrete, by reporting different three-dimensional plots for different discrete values. Perhaps the only current application of high-dimensional density estimation is where the estimated density is an intermediate input into another estimation problem, as in the average derivative estimates of Stoker (1991) and Härdle and Stoker (1989).

Estimating bivariate densities: examples

Figure 3.13 shows superimposed contour and scatter plots of the estimated joint density of the logarithm of PCE in the two years 1985 and 1986 on a household basis for the first set of panel households from the CILSS. The switch from an individual to a household basis is dictated by the fact that the panel structure is a household not an individual one, and since the people within households change from one year to the next, we cannot always track individuals over time. The plots make no use of weights; as in the univariate case, it is straightforward to incorporate the weights into the kernel estimation, but I have not done so in order to allow a clear comparison with the scatter plot.
The figure shows that PCE in 1986 is strongly positively correlated with PCE in 1985; as usual, we have no way of separating the differences that do exist into genuine changes in living standards on the one hand and measurement error on the other.

[Figure 3.13. Contour plot and scatter diagram of the joint density of ln(PCE) in 1985 and 1986 for panel households, Côte d'Ivoire. Horizontal axis: logarithm of PCE in 1985. Source: Author's calculations based on CILSS.]

[Figure 3.14. Netmaps of the joint density of log(PCE) in 1985 and 1986 for panel households, Côte d'Ivoire. Panels: (i) View from the origin; (ii) Side view. Note: The same object is displayed in both figures; the letters A and B mark the same sides of the base. Source: Author's calculations based on CILSS.]

The superimposed contour map is constructed using the bivariate Epanechnikov kernel, equations (3.41) and (3.42). Exactly as in the univariate case, I take a grid of equally spaced points from the minimum to maximum value, but now in two dimensions, so that with 99 points on each grid, the density is calculated according to (3.42) for each cell of a 99 by 99 matrix. (The choice of 99 rather than 100 reflects the fact that the surface drawing algorithm used here requires an odd number of points. A smaller grid, say 49 by 49, does not give such clear results; a fine grid is required to give smooth contours.) The calculations are again slow, and the time for the bivariate calculations is approximately the square of that for the univariate calculations; even so, they are perfectly feasible on small personal computers. Unfortunately, while STATA can handle the univariate calculations, it does not currently have facilities for contour or surface plotting, so that it is necessary to turn to GAUSS or some other graphics program that is capable of plotting contours and surfaces (see the Code Appendix for the GAUSS code used for the figure).
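For readers without GAUSS, the grid calculation just described can be sketched in Python with NumPy. This is an illustration only, not the code from the Code Appendix: the function names `epanechnikov_density_2d` and `make_grid` are hypothetical, the sample variance-covariance matrix is computed with the usual N - 1 divisor (an assumption), and no survey weights are used. The data are standardized by a Cholesky factor of $V^{-1}$, so that the squared distance between standardized points is the quantity $t_i^2$ in (3.41).

```python
import numpy as np


def epanechnikov_density_2d(data, grid_points, h=1.0):
    """Evaluate the bivariate Epanechnikov estimate (3.42).

    data: (N, 2) observations; grid_points: (G, 2) evaluation points;
    h: bandwidth on the standardized scale. Returns G density values.
    """
    data = np.asarray(data, dtype=float)
    grid_points = np.asarray(grid_points, dtype=float)
    n = data.shape[0]
    v = np.cov(data, rowvar=False)            # 2 x 2 matrix V (N - 1 divisor)
    chol = np.linalg.cholesky(np.linalg.inv(v))  # V^{-1} = chol @ chol.T
    z_data = data @ chol                      # standardized observations
    z_grid = grid_points @ chol               # standardized grid points
    # t2[g, i] = (x_i - x_g)' V^{-1} (x_i - x_g), equation (3.41).
    t2 = ((z_grid**2).sum(axis=1)[:, None]
          + (z_data**2).sum(axis=1)[None, :]
          - 2.0 * z_grid @ z_data.T)
    u2 = np.clip(t2, 0.0, None) / h**2        # clip guards against rounding
    kernel = np.where(u2 <= 1.0, 1.0 - u2, 0.0)
    scale = 2.0 / (np.pi * n * h**2 * np.sqrt(np.linalg.det(v)))
    return scale * kernel.sum(axis=1)


def make_grid(data, points_per_axis=99):
    """Equally spaced grid from the minimum to the maximum of each variable."""
    data = np.asarray(data, dtype=float)
    g1 = np.linspace(data[:, 0].min(), data[:, 0].max(), points_per_axis)
    g2 = np.linspace(data[:, 1].min(), data[:, 1].max(), points_per_axis)
    m1, m2 = np.meshgrid(g1, g2, indexing="ij")
    return g1, g2, np.column_stack([m1.ravel(), m2.ravel()])
```

Calling `make_grid` and then reshaping the output of `epanechnikov_density_2d` to 99 by 99 gives a matrix that any contour or surface plotting routine can display. The work grows with the number of observations times the number of grid points, which is why the bivariate case is so much slower than the univariate one.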
A bandwidth of 1 was used for the contour map; note that the scaling in (3.41) means that the bandwidth refers to the standardized distribution rather than the original one, and so does not need to be scaled by a measure of dispersion.

Figure 3.14 shows the same estimated density in the form of a surface drawing or netmap, which is a projection of the three-dimensional object. While the contour maps are analogous to the usual topographical maps, the netmaps are more like aerial photographs. The two projections, both of which are of the same object plotted in the contour map in Figure 3.13, are shown from different perspectives. In the netmap on the left-hand side of the figure, the eye is (beyond) the bottom left corner of the contour in Figure 3.13, while in the netmap on the right-hand side, the eye position is beyond the lower right corner (see the matching letters A and B in the figure). The netmaps give a clearer overall visual impression than the contours, they do not overemphasize the tails of the distribution, and they can be calculated more cheaply since they do not require so fine a grid of points. However, netmaps also obscure information, and contour maps are recommended whenever we need an accurate and detailed picture of the structure of the density.

3.3 Analyzing the distributional effects of policy

Finding out who benefits and who loses from a policy change is a task to which household survey data are often well-suited. In this section, I look at two examples. The first, which comes from Deaton (1989a, 1991), considers policy on rice prices in Thailand, and uses the Socioeconomic Survey of 1981-82 to look at the effects of changes in prices on the distribution of income. Farm-households who are net producers of rice gain from higher prices, while net consumers will lose.
Changes in rice prices therefore affect the distribution of real income between urban and rural sectors, as well as the distribution within sectors, depending on the relationship between levels of living and the net consumption and production of rice. I focus on this relationship, and discuss various techniques for estimating and displaying it.

My second example comes from South Africa and is drawn from Case and Deaton (1996). In recent years, the government in South Africa has paid a "social pension" to men aged 65 or over and women aged 60 or over. The monthly payments are conditioned on a means test, which excludes nearly all Whites, but the vast majority of age-qualified Africans receive payments. It is not immediately clear whether such a policy is effective in reaching poor people. Not everyone who is poor lives with a pensioner, and there are many unemployed adults who are unlikely to be reached by the scheme. However, we can use the South African Living Standards Survey to see who it is that lives in pensioner households, and to assess the distributional incidence of the scheme by calculating the average amount of pension income that is delivered to people at different levels of living.

In both the Thai and South African examples, I use the nonparametric methods discussed in the previous section. They provide simple and appealing graphics that tell us much of what we need to know, and do so in a way that is readily assimilable. Along the way, I also introduce techniques of nonparametric regression. Kernel regressions are a natural extension of the kernel density estimation methods of Section 3.2, and I use them in the analysis of rice prices in Thailand. I also discuss how to estimate nonparametric regressions using a locally weighted version of least squares. Recent work has shown that this method often works better than kernel regression, and this is the technique applied to the South African pension analysis.
Rice prices and distribution in Thailand

Although the Thai economy has become much more diversified in recent years, it was traditionally heavily dependent on exports of rice, and it remains (along with the United States) one of the world's two largest exporters. Since almost all rice is shipped down the Chao Phraya river and through the port of Bangkok, rice exports have always been a ready target for taxation, and until recently an export tax (the rice premium) has been a major contributor to government revenue in Thailand (see Siamwalla and Setboongsarng 1991 and Trairatvorakul 1984 for further discussion). If the world price can be taken as given (a matter of some dispute but a natural starting point), export taxes lower the domestic price of rice, favoring domestic consumers (rice is the basic staple in Thailand) and harming domestic producers. Export taxes thus favor urban at the expense of rural interests, but there are also distributional effects within the rural sector between rural consumers and producers; if net producers are typically much better-off than rural consumers, the tax may have favorable distributional effects within the rural sector.

The operation of these effects can be detailed using household survey data on consumption and production, and this section shows that nonparametric density estimation is a natural tool for the work. Nonparametric regressions are also useful, and I develop these methods as straightforward extensions of the tools introduced in Section 3.2. For other analyses of pricing that use the same techniques, see Budd (1993) and Benjamin and Deaton (1993); both studies use data from the CILSS to look at the pricing of food and cash crops, particularly cocoa and coffee.
The distributional effects of price changes: theory

Many rural households in Thailand are both producers and consumers of rice, so that when we work through the analysis of welfare, it is important to use a model that recognizes this dual role of households. The natural vehicle is the "farm-household" model (see, for example, Singh, Squire, and Strauss 1986). For my current purposes, I need only a utility function that can be used to examine the effect of price changes. As usual, the effects of price changes can most easily be seen by using an indirect utility function, in which the household's utility is written as a function of its income and prices. If we ignore saving (and the Thai surveys record little or no saving among rural households), the farm-household's utility can be written in the form

(3.43) $u_h = \psi_h(x_h, p) = \psi_h(m_h + \pi_h, p)$

where $\psi_h$ is the indirect utility function, p is a vector of prices of consumption goods, including the price of rice, $\pi_h$ is profits (net income) from farm activity, and $m_h$ is income from nonfarm activities, such as wage-labor or transfers. Somewhat schizophrenically in view of the analysis to come in Chapter 5, I am assuming that prices are the same for all farmers, thus ignoring regional variation in prices. However, the survey data do not report household-specific prices, and provided all prices move with the export price, as seems to be the case, individual price variation will not affect the analysis.

Although the utility function in (3.43) is a standard one (given that goods' prices are p, it is the maximum utility that can be obtained from the efficient spending of all sources of income, whether from the farm or elsewhere), it rests on a number of important and by no means obviously correct assumptions. Most crucial is the "separation" property, that the only effect of the household having a farm is that it has farm income (see again Singh, Squire, and Strauss 1986 for further discussion).
The validity of separation rests on the existence of efficient rural labor markets (a plausible assumption for Thailand) but also, and more dubiously, on the supposition that household and hired labor are perfect substitutes on the family farm. When these assumptions hold, the shadow price of family labor is its market price, which is the wage rate, and working on the household's farm is no different from any other job. As a result, workers in the household can be modeled as if they were wage earners at a fixed wage, and the ownership of the farm brings a rental income, as would the ownership of any other real or financial asset. There are good reasons to doubt the perfect substitutability of household and hired-in labor (differential monitoring costs or transport costs are two obvious examples), but the validity of the separation property as an adequate approximation is ultimately an empirical question. I am unaware of evidence from Thailand, but tests for Indonesia in Pitt and Rosenzweig (1986) and Benjamin (1992) have yielded reasonably favorable results.

The validity of (3.43) also requires a different sort of separability, that goods and leisure are separable in preferences, so that the direct utility function takes the form $u(l, \upsilon(q))$ for leisure $l$ and vector of goods q. Given this, the indirect utility function in (3.43) corresponds to the utility of the goods that enter the direct sub-utility function $\upsilon(q)$. It is straightforward to relax this assumption in the present context, but only at the price of complicating the algebra. The same can be said for my ignoring intertemporal aspects and assuming that saving is zero.

Suppose now that there is a change in the ith price, in this case the price of rice. If we confine ourselves to small changes, the effects can be analyzed through the derivatives of the indirect utility function (3.43).
In particular, using the chain rule

(3.44) $\dfrac{\partial u_h}{\partial p_i} = \dfrac{\partial \psi_h}{\partial x_h}\,\dfrac{\partial \pi_h}{\partial p_i} + \dfrac{\partial \psi_h}{\partial p_i}$

since, for households that are both producers and consumers, the price of rice affects both farm profits and the cost of living. For households, such as urban households, that produce nothing, the first term on the right-hand side of (3.44) will be zero, since there are no farm profits, and similarly for farmers who do not produce rice, whose profits are unaffected by the price of rice.

The two terms in (3.44) can be further elucidated using general results. The effects of prices on profits, often referred to as Hotelling's Lemma after Hotelling (1932), are given by

(3.45) $\dfrac{\partial \pi_h}{\partial p_i} = y_{hi}$

where $y_{hi}$ is the production of good i by h. The effect on utility of an increase in price is given by Roy's (1942) theorem,

(3.46) $\dfrac{\partial \psi_h}{\partial p_i} = -\dfrac{\partial \psi_h}{\partial x_h}\, q_{hi}$

where $q_{hi}$ is the amount of good i consumed by household h, and the quantity $\partial \psi_h / \partial x_h$ is the marginal utility of money to household h, a nonoperational concept with which I shall dispense shortly. Producers benefit from a price change in proportion to the amount of their production and consumers lose in proportion to the amount of their consumption. For households that are both producers and consumers (farm-households) the gain or loss is proportional to the difference between production and consumption, which is $y_{hi} - q_{hi}$.

It is often more convenient to work with proportional changes in prices, and if we substitute (3.45) and (3.46) into (3.44) and multiply by $p_i$ we obtain

(3.47) $\dfrac{\partial u_h}{\partial \ln p_i} = \dfrac{\partial \psi_h}{\partial \ln x_h}\,\dfrac{p_i\,(y_{hi} - q_{hi})}{x_h}$.

I shall refer to the last term on the right-hand side of (3.47) as the "net benefit ratio." It measures the elasticity with respect to price of money-equivalent utility, or consumers' surplus for those who prefer that much-abused concept. For policy work, we are concerned with the second of these two terms, but not the first.
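For readers who want the intermediate step between (3.44) and (3.47) spelled out, the substitution can be written as follows. This is only a restatement of the algebra described in the text, using the identities $p_i\,\partial u_h/\partial p_i = \partial u_h/\partial \ln p_i$ and $x_h\,\partial \psi_h/\partial x_h = \partial \psi_h/\partial \ln x_h$.

```latex
\begin{align*}
\frac{\partial u_h}{\partial \ln p_i}
  = p_i \frac{\partial u_h}{\partial p_i}
  &= p_i \left( \frac{\partial \psi_h}{\partial x_h}\, y_{hi}
       - \frac{\partial \psi_h}{\partial x_h}\, q_{hi} \right) \\
  &= x_h \frac{\partial \psi_h}{\partial x_h}
       \cdot \frac{p_i\,(y_{hi} - q_{hi})}{x_h}
   = \frac{\partial \psi_h}{\partial \ln x_h}
       \cdot \frac{p_i\,(y_{hi} - q_{hi})}{x_h}.
\end{align*}
```

The first factor is the unobservable marginal utility term; the second is the net benefit ratio, which depends only on the value of net sales and total expenditure.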
When considering a policy-induced price change, we need to know the money-equivalent losses and gains for different individuals, so that we can calculate the distributional effects as well as anticipate the likely political repercussions. The marginal utility of money to each individual, quite apart from being unobservable, will be subsumed in policy discussions by decisionmakers' views about the value of giving money to each individual, whether based on levels of living, region, caste, or political preference. One simple way to summarize these effects is to write

(3.48) $\dfrac{\partial W}{\partial \ln p_i} = \sum_h \zeta_h(x_h, z_h)\, p_i\,(y_{hi} - q_{hi}) / x_h$

where W is to be thought of as a social welfare function, and $\zeta_h$ captures, not the private marginal utility of money, but the social marginal utility of money. This is not something observable, but summarizes the attitudes of the policymaker towards giving resources to individual h, depending on that household's consumption level $x_h$ as well as on other relevant characteristics $z_h$, such as perhaps region or ethnic characteristics. (For the remainder of this book, z will denote household characteristics, and not a poverty line.)

We still need to allow for the fact that the rice that is produced (paddy) is not the same commodity as the rice that is consumed (milled rice). Suppose that one kilogram of paddy generates $\lambda < 1$ kilograms of milled rice, and that $p_i$ is the price of the latter, so that the price of paddy is $\lambda p_i$. If we then rework the preceding analysis, the net benefit ratio is as before, but with $\lambda y_{hi}$ in place of $y_{hi}$. But this is still the ratio of the value of net sales of rice (or paddy) to total expenditure, so that provided everything is measured in money terms, the benefit ratio is correctly computed by subtracting the value of consumption from the value of sales.

There are a number of caveats that ought to be entered before going on to the empirical results.
Note first that the use of the differential calculus, although convenient, limits the analysis to the effects of "small" price changes and may not give the correct answer for actual (finite) price changes. Of course, the result that producers gain and consumers lose from a price increase is a general one, not a local approximation. However, the magnitude of the welfare effects of finite price changes will depend, not just on the amounts produced and consumed, but also on the second-order effects, which involve the amounts by which consumption and production respond to price changes. Since my main interest here is in the distributional effects of price changes, and in locating the benefits and costs of price changes in the distribution of living standards, these effects will change the conclusions only to the extent that the elasticities of supply and demand differ systematically between poor and rich. Although there is no reason to rule out such effects a priori, there is no reliable evidence on the topic; as I shall argue in Chapter 5, the measurement of the slopes of demand functions is difficult enough without trying to determine how the slopes vary with the level of living.

A more serious deficiency in the analysis is its neglect of repercussions in the labor market. Changes in the price of the basic staple will affect both supply and demand for labor, and these effects can cause first-order modifications to the results. Once again, although it is possible to write down models for how these effects might operate (see, for example, Sah and Stiglitz (1992), who work through several cases), there is little point in doing so here without more hard information about the structure and functioning of the rural labor market in Thailand.

Table 3.5. Household PCE, rice production, and rice consumption in Thailand, 1981-82

| | Whole Kingdom | Upper North | Lower North | Upper North East | Lower North East | Center | South | Bangkok |
|---|---|---|---|---|---|---|---|---|
| Municipal areas (urban) | | | | | | | | |
| PCE | 1,516 | 1,349 | 1,362 | 1,171 | 1,172 | 1,497 | 1,361 | 1,680 |
| Value of rice sold | 20 | 26 | 134 | 4 | 12 | 33 | 18 | 4 |
| Value of rice bought | 243 | 326 | 261 | 298 | 276 | 247 | 234 | 216 |
| Budget share of rice | 5.59 | 8.80 | 7.03 | 8.18 | 7.68 | 5.72 | 6.06 | 4.05 |
| Villages (rural) | | | | | | | | |
| PCE | 675 | 560 | 647 | 472 | 441 | 862 | 712 | 1,021 |
| Value of rice sold | 909 | 426 | 1,471 | 580 | 711 | 1,183 | 366 | 1,515 |
| Value of rice bought | 388 | 377 | 377 | 469 | 468 | 374 | 324 | 285 |
| Budget share of rice | 18.70 | 21.77 | 19.44 | 24.64 | 26.78 | 14.21 | 14.38 | 8.59 |

Note: All figures are averages over all households in the sample. Amounts in the first three rows are baht per month. Production values are one-twelfth of the annual value of crops and are averaged over all households whether or not they produce rice. The budget shares of rice are the percentages of the nondurable expenditures devoted to rice derived for each household and averaged over households. Glutinous rice and nonglutinous rice are taken together.
Source: Deaton (1989a).

Implementing the formulas: the production and consumption of rice

The empirical analysis starts from (3.48), and uses the 1981-82 Socioeconomic Survey of the Whole Kingdom of Thailand to provide the information. The general approach is to calculate net benefit ratios for each household, and to examine the distribution of these ratios in relation to living standards and regional variation. While it is evident without survey data that lower rice prices redistribute real resources from rural to urban areas, what is less obvious is how price changes redistribute real income between rich and poor people within the rural sector.
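The household-level calculation is simple enough to sketch. The Python fragment below is an illustration under stated assumptions, not the code used for the results in this section: the function names `net_benefit_ratio` and `welfare_effect` are hypothetical, the inputs are assumed to be money values in the same units (for example baht per month), and the social weights are whatever values the analyst chooses to stand in for the unobservable social marginal utilities in (3.48).

```python
import numpy as np


def net_benefit_ratio(rice_sales_value, rice_purchases_value, total_expenditure):
    """Net benefit ratio: value of net sales of rice over total expenditure,
    the last term of (3.47), computed household by household."""
    sales = np.asarray(rice_sales_value, dtype=float)
    purchases = np.asarray(rice_purchases_value, dtype=float)
    expenditure = np.asarray(total_expenditure, dtype=float)
    return (sales - purchases) / expenditure


def welfare_effect(social_weights, ratios):
    """Weighted sum of net benefit ratios, as in (3.48); the weights are
    chosen by the analyst and are not observable quantities."""
    weights = np.asarray(social_weights, dtype=float)
    return float(np.sum(weights * np.asarray(ratios, dtype=float)))
```

A positive ratio marks a net producer, who gains from a higher price of rice; a negative ratio marks a net consumer. Because both sales and purchases enter as money values, no separate adjustment for the conversion of paddy to milled rice is needed.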
The Thai socioeconomic surveys collect data on households by three levels of urbanization, municipal areas (4,159 households in the 1981-82 survey), sanitary districts (1,898 households), and villages (5,836 households). The first corresponds to the urban sector, while the last is the rural area with which I am primarily concerned. (Sanitary districts are semiurban conglomerates of villages, and I shall typically ignore them.) Table 3.5 shows averages of household per capita expenditure as well as information on rice for the country as a whole and for seven major regions. Households in Bangkok are nearly half of urban households and their per capita expenditure is a good deal higher than elsewhere. The rural households that live on the fringes of Bangkok, an excellent rice growing region, are much better-off than other rural households, although substantially poorer than urban Bangkok households. Over the other regions, the average levels of per capita expenditure in the rural areas are a half to a third of the corresponding urban estimates. Once again, I do not have satisfactory price indices for rural versus urban, but there is little doubt that the urban households have a substantially higher standard of living. In consequence, there is no equity argument in favor of the redistribution from rural to urban that is brought about by export taxes on rice.

Not surprisingly, urban households produce little rice compared with rural households. Of the latter, those on the fringes of Bangkok, in the Center, and in the Lower North are the most productive; these are the regions in the alluvial river basin of central Thailand. In the North Eastern and Upper Northern regions, most production (and consumption) of rice is of the glutinous variety. Although there is little substitution in consumption between glutinous and nonglutinous rice, with distinct groups of people eating each, there is substitutability in production, and for the purpose of the current analysis I assume that the prices of the two kinds of rice move in parallel. Rice consumption is also higher in rural areas in spite of the higher levels of living among urban households. However, urban Thais spend substantial amounts on precooked meals (often good value for single-person households) and although the survey collects data on these expenditures, it cannot decompose these meals into their constituent commodities, so that urban consumption of rice is understated by reported purchases of rice. For this reason, and because rural households are poorer, their budget shares for rice are very high, around a quarter of all expenditures in the North East, which is the poorest region.

[Figure 3.15. Living standards and rice consumption in Thailand, 1981-82. Panels: univariate densities, all regions; joint density contours, villages; joint density netmap, municipal areas; joint density contours, municipal areas. Horizontal axes: log(PCE), log(baht). Source: Author's calculations based on Socioeconomic Survey of Thailand, 1981-82.]

The four panels of Figure 3.15 provide a detailed graphical presentation of the data whose averages appear in Table 3.5.
The top left-hand panel shows the estimated densities for the logarithm of PCE for the three regions, with the richer urban areas clearly separated from the rural areas with the sanitary districts in between. Note also that, unlike the traditional presumption that cities are more unequal, the Thai urban distribution shows the least dispersion of the three. Relative to a normal distribution, the densities have thick tails, particularly in the rural areas where there are a number of households with very high consumption levels. The other three panels show the relationship between the share of rice in household expenditures and the logarithm of per capita household expenditure. In the top right panel is the contour map of the bivariate distribution in the villages, while the bottom two panels show the netmap and contour map for the urban areas. All three diagrams show that the share of rice in the budget is lower for better-off households, but also that there is considerable dispersion at all levels of living. In line with the averages in the table, budget shares of rice are much lower in the urban than rural areas, and the netmap shows clearly that the urban density has finite mass at zero, presumably corresponding to those who get their rice by buying prepared meals.

While the relationship between rice purchases and living levels is of interest in its own right, and I shall return to these general questions in the next chapter, our main interest here is in net purchases of rice, or at least in the benefit ratio, that is, net sales of rice as a ratio of total household expenditure. The bivariate density for the net benefit ratios and the logarithm of PCE is shown in Figure 3.16. The horizontal line corresponds to a net benefit of zero so that households who are self-sufficient in rice will be along that line at a position determined by their level of living.
Households above the line (somewhat less than a half of all rural households) are net producers of rice (or at least were so in the survey year) while those below the line consume more than they grow. This majority of households includes nonfarmers as well as those who are farmers but grow no rice.

The figure shows that those who benefit from higher rice prices, and among them those who benefit the most, tend to be located in the middle of the distribution of the logarithm of PCE. Among the best-off households, there are relatively few with positive benefit ratios, and none with large benefit ratios. The best-off rural households in Thailand are either nonfarmers (government employees or teachers, for example) or specialize in crops other than rice, which is usually a small-scale family enterprise unsuited for large-scale commercial agriculture. At the other end of the distribution, among the poorest families, there are many more rice growers, and a larger fraction of households who obtain a substantial fraction of their consumption by sales of rice. But on average, the poorest households, although they would be net beneficiaries from high rice prices, do not have benefit ratios as large as those for the households in the middle of the distribution. In consequence, higher prices for rice would benefit average rural households at all levels of living, but the greatest proportional benefits would be to the rural middle class, not to either rich or poor. Although higher rice prices would not be a very efficient way of helping the poor in rural Thailand (and bear in mind that I am ignoring any effects that work through the labor market), it is clearly not the case, as with plantation crops in some countries, that higher prices benefit only the richest and largest farmers.
Although Figure 3.16 contains essentially all the information that we need, the discussion in the previous paragraph rested, not so much on the detail of the density, but on an averaging over net benefit levels for different PCE groups. This conditional averaging is the essence of nonparametric (and indeed of parametric) regression, to which the next subsection provides an introduction.

[Figure 3.16. Joint distribution of net benefit ratios and log(PCE), rural Thailand, 1981-82. Horizontal axis: log(PCE), log(baht). Source: Author's calculations based on Socioeconomic Survey of Thailand, 1981-82.]

Nonparametric regression analysis

Economists often think of regression as linear regression, with one variable fitted by ordinary least squares on one or more explanatory variables. But recall from Chapter 2 that the definition of a regression is as a conditional expectation. For a variable y and a vector of covariates x, the expectation of y conditional on x is

(3.49) $E(y \mid x) = m(x)$

where the function m(x), which in general will not be linear, is the regression function of y on x. Since y differs from its expectation by a residual that is, by construction, uncorrelated with the latter, the statistical definition coincides with the standard linear regression model when the regression function is linear. However, it should also be noted that, given a joint distribution for any set of variables, we can always calculate the regression function for any one variable conditional on the others, so that there can be no automatic association between regression and causality, or anything else that depends on an asymmetric treatment of the variables.

It is instructive to rehearse the standard links between a conditional expectation and the underlying distributions.
In particular, we can write

(3.50)  $E(y \mid x) = \int y\, f_C(y \mid x)\,dy = \int y\, f_J(y, x)\,dy \,\big/\, f_M(x)$

where the C, M, and J subscripts denote conditional, marginal, and joint distributions, respectively. Alternatively, the regression function can be written entirely in terms of the joint distribution,

(3.51)  $m(x) = E(y \mid x) = \int y\, f_J(y, x)\,dy \,\big/ \int f_J(y, x)\,dy.$

This equation suggests a nonparametric method for estimating the regression function; estimate the joint density using kernel (or other) methods, and use the results to calculate (3.51). In practice, there is a somewhat more direct approach through equation (3.50).

Perhaps the obvious way to calculate a regression function is to use the sample information to calculate the average of all y-values corresponding to each x or vector of x's. With an infinite sample, or with discrete explanatory variables, such an approach would be feasible. With finite samples and continuous x, we face the same problem as in density estimation, that there are no sample points exactly at each x, and we adopt the same solution, which is to average over the points near x, nearness being defined with reference to a bandwidth that will shrink to zero as the sample size increases. As with density estimation, weighting is desirable so as to avoid discontinuities in the regression function as individual observations move into and out of the bands, and this can be dealt with by calculating kernel regressions that are closely analogous to kernel estimates of densities. Indeed, the concept is perhaps already more familiar with regressions; the common practice of smoothing time series by calculating a moving average over a number of adjacent points is effectively a (rectangular) kernel regression.

The kernel regression estimator can be written as follows:

(3.52)  $\hat{m}(x) = \sum_{i=1}^{n} y_i K\!\left(\frac{x - x_i}{h}\right) \Big/ \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right)$

which, using $\hat{f}(x)$, the kernel estimate of the density at x from (3.31) above, can be written as

(3.53)  $\hat{m}(x) = (nh)^{-1} \sum_{i=1}^{n} y_i K\!\left(\frac{x - x_i}{h}\right) \Big/ \hat{f}(x).$
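As a minimal sketch of the estimator in (3.52), the following Python computes the kernel-weighted average at a single point. The function names, the toy data, and the choice of the quartic (biweight) kernel are assumptions made for illustration; any of the kernels used for density estimation could be substituted.

```python
def quartic_kernel(z):
    """Quartic (biweight) kernel: (15/16)(1 - z^2)^2 on [-1, 1], zero outside."""
    return 15.0 / 16.0 * (1.0 - z * z) ** 2 if abs(z) <= 1.0 else 0.0


def kernel_regression(x0, xs, ys, h, kernel=quartic_kernel):
    """Kernel regression estimate at x0 as the ratio of sums in (3.52).

    Returns None when no observation lies within the bandwidth, since the
    denominator is then zero and the estimate is undefined.
    """
    weights = [kernel((x0 - xi) / h) for xi in xs]
    total = sum(weights)
    if total == 0.0:
        return None
    return sum(w * yi for w, yi in zip(weights, ys)) / total


# Hypothetical toy data: five equally spaced x-values.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 2.0, 5.0, 4.0]

# With h = 1.5 only the points at 1, 2, and 3 receive positive weight.
est = kernel_regression(2.0, xs, ys, h=1.5)
```

Because the estimate is a weighted average with nonnegative weights, it always lies between the smallest and largest y-values inside the band.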
Equation (3.53) can be thought of as an implementation of (3.50) with the kernels acting to smooth out the discrete sample points. Using (3.52), the estimate of the regression function can also be written as a weighted average of the y's,

(3.54)  $\hat{m}(x) = \sum_{i=1}^{n} w_i(x)\, y_i$

where the weights are given from (3.52). According to (3.54), which makes clear the moving average analogy, the estimated regression function is a weighted average of all the $y_i$ in the sample with the weights depending on how far away each corresponding $x_i$ is from the point at which we are calculating the function.

As when estimating densities, bandwidths can be chosen by trying different values and by inspecting the plots of the resulting estimates, or by some more automatic and computationally intensive technique. The same kernels are available for regression analysis as for density estimation (see the formulas on p. 173). These matters, as well as other topics relating to nonparametric regression, are discussed in Härdle (1991) who also (p. 101) gives formulas for asymptotic confidence bands around the regression (3.54): if $c_\alpha$ is the $(1-\alpha)$ quantile of the t-distribution, the upper and lower $(1-\alpha)$ confidence bands for the regression are given by

(3.55)  $b_\alpha(x) = \hat{m}(x) \pm c_\alpha \left[\int K(z)^2\,dz\right]^{1/2} \hat{\sigma}(x) \Big/ \left[N h \hat{f}(x)\right]^{1/2}$

where N is the sample size (written n above), and $\hat{\sigma}(x)$ is the estimate of the local regression standard error, calculated from

(3.56)  $\hat{\sigma}^2(x) = N^{-1} \sum_{i=1}^{N} w_i(x)\,[y_i - \hat{m}(x_i)]^2.$

Although these bands are useful for assessing the relative precision of different points along the estimated regression function, they should not be treated very seriously. The asymptotic results are obtained by ignoring a number of terms, such as the bias of the regression, which tend to zero only very slowly. A better procedure is to use the bootstrap, and I shall give some substantive examples in Chapter 4.
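To make the bootstrap suggestion concrete, here is a hedged sketch of a pointwise percentile interval for a kernel regression at one point. It is not the procedure used for the Chapter 4 examples: it assumes simple resampling of (x, y) pairs with replacement, ignores survey design features such as clustering and stratification, and all function names, the simulated data, and the number of replications are hypothetical.

```python
import random


def quartic_kernel(z):
    """Quartic (biweight) kernel, zero outside [-1, 1]."""
    return 15.0 / 16.0 * (1.0 - z * z) ** 2 if abs(z) <= 1.0 else 0.0


def kernel_regression(x0, xs, ys, h):
    """Kernel-weighted average of ys at x0; None if the band is empty."""
    w = [quartic_kernel((x0 - xi) / h) for xi in xs]
    total = sum(w)
    if total == 0.0:
        return None
    return sum(wi * yi for wi, yi in zip(w, ys)) / total


def bootstrap_band(x0, xs, ys, h, reps=200, alpha=0.10, seed=12345):
    """Percentile bootstrap interval for the kernel regression at x0.

    Resamples (x, y) pairs with replacement. Replications in which no
    resampled point falls inside the band are skipped; the function
    assumes at least one replication succeeds.
    """
    rng = random.Random(seed)
    n = len(xs)
    draws = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]
        est = kernel_regression(x0, [xs[i] for i in idx], [ys[i] for i in idx], h)
        if est is not None:
            draws.append(est)
    draws.sort()
    lower = draws[int((alpha / 2.0) * len(draws))]
    upper = draws[int((1.0 - alpha / 2.0) * len(draws)) - 1]
    return lower, upper


# Simulated data, for illustration only.
data_rng = random.Random(0)
xs = [data_rng.uniform(0.0, 10.0) for _ in range(200)]
ys = [2.0 + 0.5 * x + data_rng.gauss(0.0, 1.0) for x in xs]

lower, upper = bootstrap_band(5.0, xs, ys, h=1.0)
```

Repeating the call over a grid of x-values would trace out a band around the whole estimated regression.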
Even so, the routine presentation of confidence bands together with regressions is not always advisable, since it tends to clutter the diagrams and obscure more important features.

Nonparametric estimation of regression functions is a harder task than nonparametric estimation of densities. In particular, it is not possible to calculate a conditional expectation for values of x where the density is zero; if x cannot occur, it makes no sense to condition y on its occurrence, and the attempt to calculate the regression will involve dividing by zero; see (3.53). In practice, there will be difficulties whenever the estimated density is small or zero; while only the latter calls for division by zero, the former will make the regression function imprecise; see (3.55). Unlike linear regression, or regression with an assumed functional form, it is impossible to use nonparametric regression to calculate predictions for out-of-sample behavior. It is also necessary when calculating (3.52) to take care to calculate the regression only for values of x where the calculated density is reasonably large. This is not a problem for the density, since the estimated density is simply zero in places where there are no observations. In consequence, the regression can be calculated by evaluating the numerator and denominator of (3.53) for each value of x for which an estimate is desired, then calculating and presenting the ratio for only those values of x where $\hat{f}(x)$ is above some critical value, set for example so as to exclude 5 percent of the sample observations.

The main strength of nonparametric over parametric regression is the fact that it assumes no functional form for the relationship, allowing the data to choose, not only the parameter estimates, but the shape of the curve itself.
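The trimming rule described above for (3.53) can be sketched as follows. The interpretation of the critical value is an assumption: here it is taken to be the 5th percentile of the estimated density evaluated at the sample points, so that roughly 5 percent of observations lie where the density falls below it. Function names and data are hypothetical.

```python
def quartic_kernel(z):
    """Quartic (biweight) kernel, zero outside [-1, 1]."""
    return 15.0 / 16.0 * (1.0 - z * z) ** 2 if abs(z) <= 1.0 else 0.0


def trimmed_kernel_regression(grid, xs, ys, h, trim_share=0.05):
    """Evaluate numerator and denominator of (3.53) on a grid of x-values,
    reporting the ratio only where the density estimate reaches the cutoff."""
    n = len(xs)

    def numerator(x0):
        return sum(yi * quartic_kernel((x0 - xi) / h)
                   for xi, yi in zip(xs, ys)) / (n * h)

    def density(x0):
        return sum(quartic_kernel((x0 - xi) / h) for xi in xs) / (n * h)

    # Assumed cutoff rule: the trim_share quantile of the density estimates
    # at the sample points themselves (always positive, since K(0) > 0).
    sample_densities = sorted(density(xi) for xi in xs)
    cutoff = sample_densities[int(trim_share * n)]

    results = []
    for x0 in grid:
        f_hat = density(x0)
        results.append(numerator(x0) / f_hat if f_hat >= cutoff else None)
    return results


# Hypothetical data: a dense run of points on [0, 4] and one isolated point at 8.
xs = [i / 10 for i in range(41)] + [8.0]
ys = list(xs)

# 2.0 is well supported, 6.0 has no neighbors, 8.0 has only itself.
fitted = trimmed_kernel_regression([2.0, 6.0, 8.0], xs, ys, h=0.5)
```

In this toy case the estimate is reported at 2.0 but suppressed at 6.0, where the density estimate is zero, and at 8.0, where it is positive but below the cutoff.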
The price of the flexibility is the much greater data requirements to implement the nonparametric methods, the difficulties of handling high-dimensional problems, and to a lesser extent, computational costs. When data are scarce, the best that can be done is to focus on a few key parameters, and to make inferences conditional on plausible functional forms. But in many problems using household survey data, as in the Thai rice pricing example, there is enough information for it to make sense to ask the data to determine functional form, something that will be particularly attractive when functional form is an important issue, and when the dimension of the problem is low.

There are also many other regression techniques that bridge the gap between linear regression on the one hand and nonparametric kernel regression on the other. Polynomial regressions are a familiar tool, and are capable of modeling a wide range of functional forms provided the degree of the approximating polynomial is increased as the sample size increases. Fourier series offer an alternative way of approximating functional forms, an alternative that has been explored by Gallant and his colleagues, for example Gallant (1981). To regress y on x nonparametrically, run an OLS regression of y on a constant, x, $x^2$, and a series of terms of the form sin(jx) and cos(jx), for j running from 1 to J. For larger sample sizes, larger values of J are used, and increases in J correspond to reductions in bandwidth in kernel methods. In practice and with the usual sample sizes of a few thousand, setting J to 2 or 3 seems to work well. Cleveland (1979) has proposed a local regression method, LOWESS, that can be thought of as a series of linear regressions at different points appropriately stitched together; I shall return to these locally weighted regressions below.
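The Fourier series recipe described above can be sketched in a few lines. This is an illustrative implementation only: it solves the OLS normal equations by Gaussian elimination to stay within the Python standard library, and it assumes that x has already been rescaled to lie within [0, 2π], a convention the text does not spell out. All names and the toy data are hypothetical.

```python
import math


def fourier_row(x, J):
    """Regressors: constant, x, x^2, then sin(jx) and cos(jx) for j = 1..J."""
    row = [1.0, x, x * x]
    for j in range(1, J + 1):
        row.extend([math.sin(j * x), math.cos(j * x)])
    return row


def solve_linear_system(A, b):
    """Solve A beta = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    M = [list(A[i]) + [b[i]] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, k):
            factor = M[r][col] / M[col][col]
            for c in range(col, k + 1):
                M[r][c] -= factor * M[col][c]
    beta = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(M[i][c] * beta[c] for c in range(i + 1, k))
        beta[i] = (M[i][k] - tail) / M[i][i]
    return beta


def fit_fourier_regression(xs, ys, J=2):
    """OLS of y on the Fourier regressors; returns a prediction function."""
    X = [fourier_row(x, J) for x in xs]
    k = len(X[0])
    XtX = [[sum(r[a] * r[b] for r in X) for b in range(k)] for a in range(k)]
    Xty = [sum(r[a] * y for r, y in zip(X, ys)) for a in range(k)]
    beta = solve_linear_system(XtX, Xty)
    return lambda x: sum(b * v for b, v in zip(beta, fourier_row(x, J)))


# Toy data on [0, 2*pi]: a trend plus one sine wave, without noise.
xs = [2.0 * math.pi * i / 59 for i in range(60)]
ys = [1.0 + 0.5 * x + math.sin(x) for x in xs]

predict = fit_fourier_regression(xs, ys, J=2)
value_at_one = predict(1.0)
```

Because the toy function lies exactly in the span of the regressors, the fit reproduces it up to rounding; with real data, J governs the amount of smoothing, as the text notes.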
There is also substantial experience in using spline functions for regression analysis (see, for example, Härdle 1991, pp. 56-65, and Engle et al. 1986). There are also methods that allow some covariates to be treated nonparametrically, while others appear in the standard linear form (Robinson 1988 and Estes and Honoré 1996). Indeed, there is a rapidly growing literature on semiparametric estimation, well reviewed by Stoker (1991). All of these techniques have their strengths and weaknesses, and some are more appropriate for some problems than others. For example, polynomials tend not to work very well with household survey data, because their shapes can be very sensitive to the position of a few outliers. Some methods, such as kernels, are more readily generalizable to higher dimensions than others, such as splines. There are also important differences in computational costs, as well as in data requirements, topics that are still being actively researched, and on which it is currently difficult to give adequate practical guidance.

One important distinction between nonparametric and parametric econometrics is that the former lacks the menu of options that is available in the latter, for example for dealing with simultaneity, measurement error, selectivity, and so forth. In some cases, such as the selectivity models discussed in Chapter 2 where the models are identified by functional form assumptions, nonparametric alternatives are by definition impossible, which is only a dubious disadvantage. In other cases, such as simultaneity, it is because the techniques do not yet exist, although there will undoubtedly be further developments in this area. However, one problem for linear regression that is dealt with automatically by kernel estimation is when the dependent variable is discrete, as for example in the Thai case when we are interested in whether a given household is or is not a rice farmer.
In the binary case where the dependent variable $y_i$ takes the value 1 or 0, with the probability of the former given by some unknown function $\pi(x_i)$, the expectation of $y_i$ conditional on $x_i$ is simply the probability $\pi(x_i)$, which is therefore also the regression function. Hence, if we simply treat the 1's and 0's as we would any other values of a dependent variable, and mechanically apply the regression formulas (3.52) or (3.53), the results will converge to the probability function $\pi(x)$.

Nonparametric regressions for rice in Thailand

Nonparametric regression techniques can be used to complete the previous analysis, calculating the average benefit ratios at each level of living, and also to look behind the averages to the structures of rice farming that underlie them. Figure 3.17 starts with the latter and shows nonparametric regressions where the independent variable is, as usual, the logarithm of PCE, and the dependent variable is a dichotomous variable taking the value one if the household produces rice, and zero if not. Each panel shows this regression together with a similar regression for whether or not the household sells rice; the latter always lies below the former, since a household must be a producer to be a seller. The top left-hand panel shows the results for the whole rural sector, and indicates that the fraction of households growing rice diminishes with the level of living. Three-quarters of the poorest households are rice farmers, but less than one-quarter of the best-off households are so. However, conditional on being a rice farmer, the probability of being a seller increases with log(PCE); the lower line draws closer to the upper line as we move from left to right. While most poor households in rural Thailand grow rice, many do so to meet their own food needs and have no surplus to sell on the market. And while better-off rice farmers are much more likely to be selling rice, the richer the household the less likely it is to farm rice at all.

Figure 3.17. Proportions of households producing and selling rice, all rural and selected regions, Thailand, 1981-82
[Figure not reproduced. Panels: All rural households; Upper North households; Lower North households; Bangkok Fringe households. Horizontal axis: Log(PCE): log(baht).]
Source: Author's calculations based on Socioeconomic Survey of Thailand, 1981-82.

The other three panels in the figure remind us that the top left panel is an average, and that there is considerable regional variation. The Upper North is a relatively poor area; nearly all its poor households grow rice, but only a fifth sell any. The Lower North in the next panel is a good deal better-off, and a higher fraction of rice farmers sell rice at all levels of living. In the extreme case of the Bangkok Fringe in the bottom right panel, only between 30 and 40 percent of households grow rice, but they do so extremely productively and almost all participate in the market.

These facts about rice production and rice sales help to interpret Figure 3.18, which shows the nonparametric regression of the net benefit ratio on the logarithm of household per capita expenditure. This regression is the conditional expectation corresponding to the joint density in Figure 3.16, and contains no new information. However, the regression provides the answer to the question of by how much the people at each level of living would benefit from an increase in the price of rice. Since the net benefit ratio expresses the benefit as a fraction of total household consumption, a flat line would show that all rural households benefit proportionately, so that the change would be neither regressive nor progressive. A positive slope would indicate that the benefits are proportionately larger for those who are better-off, and vice versa for a negative slope. In fact, the graph shows none of those patterns; instead, and as we have already seen from Figure 3.16, it is the households in the middle of the distribution that benefit the most. The poor gain from a price increase, but not very much; although they grow rice, they sell relatively little on the market and many of them have to buy rice in addition to their own production, so that as a group they benefit only modestly from the price increase. Wealthy households also benefit modestly, but for precisely the opposite reasons. Wealthy rice farmers sell most of their crop on the market, but few wealthy households are rice farmers at all. In consequence, the regression function for the net benefit ratio has the inverted U-shape shown in Figure 3.18.

Figure 3.18. Net benefit ratios averaged by log(PCE), rural households, Thailand, 1981-82
[Figure not reproduced. Horizontal axis: Log(PCE): log(baht).]
Source: Author's calculations based on Socioeconomic Survey of Thailand, 1981-82.

If we put this analysis together with the data on the sectoral expenditure distributions in Figure 3.15, we can conclude that the survey data do not support the argument that an export tax on rice is desirable on distributional grounds. The tax redistributes real resources from relatively poor rural growers to better-off urban consumers, while within the rural sector, all income groups lose, though the largest losses are borne by those in the middle of the welfare distribution. Of course, these conclusions do not tell us everything that we would need to know in deciding on tax policy in Thailand. I have said nothing about the distortions caused by the tax, nor about the desirability or otherwise of alternative instruments for raising government revenue.
Even so, the techniques of this section give information on who gets (or produces) what, and provide in a readily accessible form part of what is required for making sensible decisions. Since the actual distributional consequences of policies are frequently misrepresented for political purposes, with middle-class benefits disguised as benefits for the poor, such analyses can play a valuable role in policymaking.

Bias in kernel regression: locally weighted regression

Two potential sources of bias in kernel regression are illustrated in Figure 3.19. The graph shows two highly simplified examples. On the x-axis, there are three equally spaced data points, $x_1$, $x_2$, and $x_3$. Ignore the points $x_A$ and $x_B$ for the moment. There are two regression functions shown; $m_1$, which is a straight line, and $m_2$, which is curved. For each of these I have marked the y-points corresponding to the x's; $y_1$, $y_2$, and $y_3$ on $m_1$, and $y_1^*$ ($= y_1$), $y_2^*$, and $y_3^*$ on $m_2$. Since nothing depends on there being a scatter of points, I am going to suppose that the y-values lie exactly on the respective regression lines. Consider what happens when we try to estimate the two regression functions using kernel regression, and focus on the point $x_2$, so that, for regression $m_1$, the right answer is $y_2$, while for regression $m_2$, it is $y_2^*$.

Start with the concave regression function $m_2$. If the bandwidth is as shown, only the points $x_1$, $x_2$, and $x_3$ will contribute, so that the estimate of the regression at $x_2$ will be a weighted average of $y_1$, $y_2^*$, and $y_3^*$ in which $y_2^*$ gets the most weight, $y_1$ and $y_3^*$ get equal weight, and the weights add up to 1. Because the regression function is concave, such an average must be less than $y_2^*$, so that the estimate is biased downward. If the regression function had been convex, the bias would have been upward, and it is only in the linear case that there will be no bias.

Figure 3.19. Sources of bias in kernel regressions
[Figure not reproduced. The diagram marks the bandwidth and the points $x_1$, $x_2$, $x_A$, $x_B$, and $x_3$ on the x-axis.]

The bias will be gradually eliminated as the sample size gets larger because the bandwidth will get smaller, so that, in the limit, only the points at $x_2$ contribute to the estimate of the conditional mean. In practice, there is little comfort in the fact that there is no bias in some hypothetical situation, and we must always be careful when the bandwidth is large and there is curvature. Preliminary data transformations that induce approximate linearity are likely to be useful in reducing bias in kernel regression.

Even when the regression function is linear, there is another potential source of bias. Turn now to the linear regression function $m_1$, and consider once again estimating the regression function at $x_2$. When we use the points $x_1$, $x_2$, and $x_3$, everything works as it should, and the weighted average of $y_1$, $y_2$, and $y_3$ is $y_2$. But now introduce the two additional points $x_A$ and $x_B$, so that there are now five points within the bandwidth, and they are no longer equally spaced. The kernel estimate of the regression at $x_2$ is now the weighted average of the five corresponding y-values. It is still the case that $y_2$ gets the most weight, and that $y_1$ and $y_3$ get equal weights, but the y-values corresponding to $x_A$ and $x_B$ also get positive weight, so that the estimate is biased upward. More generally, there will be bias of this kind when, at the point of estimation, both the regression function and the density of x have nonzero derivatives. In practice, this bias is likely to be most serious at the "ends" of the estimated regression. For example, suppose that $x_1$ is the smallest value of x in the sample. When we try to estimate a kernel regression at $x_1$, the average of nearby points can include only those to the right so that, if the regression function is positively (negatively) sloped, there will be an upward (downward) bias.
There will be corresponding biases in the opposite direction for the largest x-values. These biases will diminish as we move from the extremes towards the center of the distribution but, if the bandwidth is large, they can seriously distort the estimated regression.

As shown by Fan (1992), the biases associated with unequally spaced x's can be eliminated by moving away from kernel regression. The idea, like its implementation, is straightforward. Note first that, in the example of the linear regression function in Figure 3.19, an OLS regression would not encounter the same difficulties as the kernel regression. Indeed, when the regression function is linear, OLS will be unbiased and consistent. The problem with OLS is that it cannot adapt to the shape of the regression function so that, no matter how large the sample, it can never estimate a nonlinear function such as $m_2$. But this can be dealt with by estimating a series of local regressions. Instead of averaging the y's around $x_2$, as in kernel regression, and instead of running a regression using all the data points as in OLS, we adopt the best of both procedures and run a regression using only the points "close" to $x_2$. As with kernel regression, we use a kernel to define "close," but instead of averaging, we run a weighted or GLS regression at $x_2$, where the weights are nonzero only within the band, and are larger the closer are the observations to $x_2$. We repeat this procedure for each point at which we want to estimate the regression function.

Fan's locally weighted regression smoother can be summarized as follows. As in the Thai example, suppose that we want to plot the estimated regression function for a range of values of x. As before, we divide up the range of x into a grid of points, usually 50 or 100 depending on how much detail we need to show.
For each point x on the grid, calculate a series of weights for each data point using a kernel with a suitably chosen bandwidth. This can be done exactly as in kernel regression (3.52), so that we first define

(3.57)  $\theta_i(x) = \frac{1}{h} K\!\left(\frac{x - x_i}{h}\right).$

We then estimate the locally weighted regression parameter estimates

(3.58)  $\hat{\beta}(x) = [X'\Theta(x)X]^{-1} X'\Theta(x)\,y$

where $\Theta(x)$ is an n × n diagonal matrix with $\theta_i(x)$ in the ith position, and the n × 2 matrix X has ones in its first column and the vector of x-values in its second. The predicted value of this regression at x is then the estimated value of the regression function at the grid point, i.e.,

(3.59)  $\hat{m}(x) = \hat{\beta}_1(x) + \hat{\beta}_2(x)\,x.$

The predicted value in (3.59) is calculated for each point on the grid, and the results plotted as in Figures 3.17 and 3.18. As Fan notes, the value of $\hat{\beta}_2(x)$ from (3.58) is a natural estimator of the slope of the regression function at x and, as we shall see in the next chapter, plots of this function can also be informative.

The local regression in (3.58) can be extended to incorporate quadratic as well as linear terms in x, in which case the X matrix would have three columns, with $x^2$ in the third. The nonlinearity will generally help alleviate the first problem discussed above and illustrated in Figure 3.19, of forcing a linear structure on the data, even locally. The estimated regression will also provide a local estimate of the second derivative of the regression function. Presumably, higher-order polynomials could also be considered, and will yield estimates of higher-order derivatives.

The role of the bandwidth in (3.57) is the same for these locally weighted regressions as it is in kernel regressions. It controls the tradeoff between bias and variance and, for consistency of the estimates, it must tend to zero as the sample size tends to infinity and must do so slowly enough that the number of observations in each local regression also tends to infinity.
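The steps in (3.57) to (3.59) can be sketched for a single grid point as below, alongside the plain kernel average for comparison. This is a minimal illustration, not the STATA code referred to in the text: the function names, the quartic kernel, the bandwidth, and the five unequally spaced points (chosen to mimic the layout of $x_1$, $x_2$, $x_A$, $x_B$, $x_3$ in Figure 3.19) are all assumptions. With only two regressors, the weighted least squares solution of (3.58) is written out in closed form.

```python
def quartic_kernel(z):
    """Quartic (biweight) kernel, zero outside [-1, 1]."""
    return 15.0 / 16.0 * (1.0 - z * z) ** 2 if abs(z) <= 1.0 else 0.0


def kernel_average(x0, xs, ys, h):
    """Plain kernel regression at x0, for comparison."""
    w = [quartic_kernel((x0 - xi) / h) for xi in xs]
    return sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)


def local_linear(x0, xs, ys, h):
    """Locally weighted regression of y on (1, x) at x0.

    Weights follow (3.57); the two coefficients solve (3.58) in closed form.
    Returns (fitted value at x0, local slope), or None when fewer than two
    distinct x-values receive positive weight.
    """
    w = [quartic_kernel((x0 - xi) / h) / h for xi in xs]
    if len({xi for wi, xi in zip(w, xs) if wi > 0.0}) < 2:
        return None
    s0 = sum(w)
    s1 = sum(wi * xi for wi, xi in zip(w, xs))
    s2 = sum(wi * xi * xi for wi, xi in zip(w, xs))
    t0 = sum(wi * yi for wi, yi in zip(w, ys))
    t1 = sum(wi * xi * yi for wi, xi, yi in zip(w, xs, ys))
    det = s0 * s2 - s1 * s1
    slope = (s0 * t1 - s1 * t0) / det
    intercept = (s2 * t0 - s1 * t1) / det
    return intercept + slope * x0, slope


# Five unequally spaced points lying exactly on the line y = 1 + 2x.
xs = [1.0, 2.0, 2.4, 2.7, 3.0]
ys = [1.0 + 2.0 * x for x in xs]

kernel_fit = kernel_average(2.0, xs, ys, h=1.2)
local_fit, local_slope = local_linear(2.0, xs, ys, h=1.2)
```

The true value at x = 2 is 5. The kernel average comes out above 5 because the two extra points to the right pull it up, whereas the local linear fit recovers the line, and its slope, up to rounding error.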
As with kernel regressions, standard errors are best calculated using the bootstrap, and I shall give examples in the next chapter. The "ksm" command in STATA implements Cleveland's (1979) LOWESS estimation, which is closely related to Fan's locally weighted smoother; the difference is that LOWESS uses a nearest-neighbor definition of closeness in place of the kernel, but this should make little difference to the operating characteristics of the procedure. However, as implemented in STATA, "ksm" estimates the regression function at every observation, which is likely to be prohibitively expensive for large data sets. STATA code for a direct implementation of Fan's smoother is provided in the Code Appendix for the South African example in the next subsection.

The distributional effects of the social pension in South Africa

The following account is based on Case and Deaton (1996), which should be consulted for fuller details and documentation. The social security system in South Africa is unlike any other in the world. At the time of the Living Standards Survey of 1993, and well before the elections in the spring of 1994, the government paid a monthly "social" pension to age-qualified men and women whose (individual) income fell below a cutoff. In late 1993, the monthly payment was 370 rand, a little more than $100 at what was then the exchange rate. For comparison, 370 rand is around half of average household income in the survey, and is more than twice the median per capita household income of Blacks. That such comparatively enormous sums should be paid as pensions is a historical accident. In the apartheid era, most White workers were covered by privately funded occupational pension schemes, and the social pension was designed as a safety measure for those few Whites who reached retirement age without adequate coverage.
During the transitional period, the social pension was gradually extended to non-Whites, first at lower rates, but ultimately uniformly subject to the means test. But because of the enormous disparity between the incomes of Whites and Blacks, a pension that is very small by White standards can be very large relative to the typical earnings of the majority population. We also have a situation where the means test rules out the vast majority of Whites while, at the same time, more than three-quarters of Black women over 59 and men over 64 are receiving payments.

In 1993 (and it remains true at the time of writing) the social pension accounted for most of social welfare expenditure in South Africa. Not only is there concern about the cost of the scheme (around seven billion rand), but there are also questions about whether a pension is the best way of spending the very limited social budget at the expense, for example, of payments to children or the unemployed. Nor is it clear that transferring cash to the elderly is an effective antipoverty strategy. In the United States, the elderly are somewhat less subject to poverty than the population as a whole, and because they usually live alone, or with other elderly, there is no automatic presumption that nonelderly benefit from pension payments. Much the same is true of White pensioners in South Africa but, by and large, they are not recipients of the social pension. In Black households, the elderly do not live alone, they live in households that are larger than average, and households with a pensioner actually have more children than households without a pensioner. Indeed, because of South African patterns of migrant labor, there are many "hollow" households, with elderly and children, but without working-age adults. In such circumstances, transfers in the form of a pension may be well-targeted towards poverty reduction, and the distributional effects are certainly worth serious investigation.
Figure 3.20 looks at the distributional incidence of pension income for White households and for Black households separately. The data from the Living Standards Survey are used to give, for each household, actual pension receipts, which are reported as a component of individual income, and "potential" pension receipts, defined as the number of age-qualified people (women aged 60 and over and men aged 65 and over) multiplied by the maximum monthly pension of 370 rand. Households would receive this potential amount if there were no means test (or if everyone qualified) and if everyone who was entitled to the pension actually received it in full. Locally weighted regressions are then used to calculate the conditional expectation of these actual and potential receipts as a function of household per capita income, excluding the pension.

Figure 3.20. Regressions of actual and potential pension receipts, by race, South Africa, 1993
[Figure not reproduced. Panels: Black households; White households. Horizontal axis: Log of per capita household income, excluding pensions. Each panel shows lines for potential receipts and reported receipts, and marks $1 (U.S.) a day per capita.]
Note: Locally weighted regressions with quartic (biweight) kernels and bandwidths of 1.0 for African households and 1.5 for White households. The solid line is the regression function of household pension receipts conditional on the logarithm of household income (excluding pensions) per capita. The broken line is the corresponding regression for potential pension receipts, where potential receipts are 370 rand times the number of age-qualified people in the household.
Source: Case and Deaton (1996).

In the left-hand panel, for Black households, the two lines are very close together, showing that the means test has little effect.
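The definition of potential receipts is simple arithmetic and can be written out directly. In this small sketch the household roster layout (a list of sex and age pairs) and the function names are hypothetical; the age thresholds and the 370 rand maximum come from the text.

```python
MAX_MONTHLY_PENSION = 370  # rand per month in late 1993, as stated in the text


def is_age_qualified(sex, age):
    """Women qualify at age 60 and over, men at age 65 and over."""
    return age >= 60 if sex == "F" else age >= 65


def potential_pension(members):
    """Potential monthly receipts for one household.

    members is a list of (sex, age) pairs, a hypothetical roster layout.
    """
    qualified = sum(1 for sex, age in members if is_age_qualified(sex, age))
    return MAX_MONTHLY_PENSION * qualified


# A "hollow" household: one qualifying woman, a man just under the cutoff,
# and two children.
household = [("F", 62), ("M", 64), ("F", 8), ("M", 11)]
potential = potential_pension(household)  # 370: only the woman aged 62 qualifies
```

Dividing household income net of pensions by household size, taking logs, and regressing these potential (and reported) receipts on the result with a locally weighted smoother gives curves of the kind shown in Figure 3.20.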
At all levels of Black pre-pension income, most of those who are age-qualified receive the maximum payment. More importantly, both lines slope down from left to right, showing that the pension payments are progressive; households at the bottom of the per capita income distribution receive between 300 and 400 rand per month, while those at the top are entitled to, and receive, very little. Note that this is automatic targeting that owes nothing to the operation of the means test (potential receipts differ little from actual receipts) but works because the Black elderly live in households with low per capita income. The situation is very different for the White households shown in the right-hand panel. Although there are a few Whites with nonpension per capita household income below the dollar-a-day poverty line (though as we have seen there are no Whites with PCE below this level) and these people receive social pensions, very few of better-off White households receive pensions. At typical levels of per capita income among Whites, the means test rules out receipt of the social pension so that potential and actual receipts are far apart. That there is no similar discrepancy among Black households at the same level of per capita income suggests that the means test is not very consistently applied among Blacks (for which there is some independent if anecdotal evidence) but it should also be noted that there are very few Black households with incomes in this range.

As with the case of rice prices in Thailand, the nonparametric regressions provide a straightforward way of calculating and presenting the immediate distributional incidence of policy. But that this is only a starting point should be obvious. In the Thai case, I emphasized that such calculations take no account of behavioral consequences of price changes, particularly in the labor market, and there are similar caveats in the South African case.
In particular, I have taken no account of changes in private transfers in response to the social transfers, nor of possible effects on the migration patterns that make them possible (see Jensen 1996 for evidence).

3.4 Guide to further reading

There are several good reviews of the material on inequality and distribution in Section 3.1. Atkinson (1970) remains the cornerstone of the inequality literature, and repays careful reading. Sen (1973, updated with a review of the subsequent literature in Foster and Sen 1997) takes a somewhat different approach, starting from the statistical measures, and enquiring into their suitability for assessing economic inequality. This monograph should be read in conjunction with Sen (1992) which takes a much broader view of the meaning of inequality and its place in social arrangements more generally. Cowell (1995) is a useful review of alternative measures of inequality and of the practical problems of implementing them. Wolfson (1994) shows that the polarization of incomes (as in the vanishing middle) is quite different from expanding inequality. Kakwani (1980) is recommended. His book focusses on inequality and poverty in developing countries, and covers a great deal of useful material not discussed here, such as families of distributions for fitting to income distributions, and how to estimate inequality measures from the grouped data that are published in survey reports. He also provides a thorough treatment of inequality measures and of the Lorenz curve. The book also contains a useful discussion of poverty methods, but was written too early to cover the more recent developments. Sundrum (1990) is also a good discussion of inequality in developing countries. Shorrocks (1983) should be consulted for generalized Lorenz curves; the paper also contains empirical calculations for several countries.
The literature on poverty measurement has grown rapidly in the last decade and has come into much closer alignment with the social welfare approach to welfare and inequality. Foster (1984) is a survey of the different measures. Atkinson (1987, 1992) ties together the strands in the literature and his analysis and presentation strongly influenced the layout and ideas in this chapter. Ravallion's (1993) monograph is an excellent review that focusses on poverty in developing countries and together with Atkinson (1992) is strongly recommended as supplementary reading for this chapter. Anderson (1996) discusses possible methods for constructing statistical tests that can be used in conjunction with stochastic dominance comparisons, and provides references to the earlier literature.

Nonparametric density and regression estimation is becoming more widely used in economics, but much of the econometric literature remains (unnecessarily) obscure and impenetrable, so that this useful material has found fewer applications than it merits. A bright spot in this literature is the book by Silverman (1986), which is a delight to read; it is focussed towards applications, with the level of technique determined by what is strictly necessary to explain the ideas, and no more. Although Silverman is concerned only with density estimation, and not with regression analysis, the basic ideas are very similar, and this book remains the first stop for anyone interested in pursuing these topics. The book by Härdle (1991) is about nonparametric regression, and covers all of the important techniques, but it is not as transparent as Silverman, nor is it as immediately useful for someone who wants to use the techniques. Härdle and Linton (1994) is a useful summary review, and although not all topics are covered, it is a better introduction than Härdle's book.
An excellent, up-to-date review of locally weighted regression is Hastie and Loader (1993). Semi-parametric techniques are a live research area in econometrics. In many applications, especially those in high dimensions, a fully nonparametric approach requires too much data, so that there is a role for techniques that blend parametric and nonparametric methods. Stoker (1991) is a good entry point into this literature. For applications of nonparametric methods that are related to those given in this chapter, see Bierens and Pott-Buter (1990) and Budd (1993).

4 Nutrition, children, and intrahousehold allocation

One of the traditional uses of household survey data has been the analysis of family budgets, starting with the descriptive work of the 18th and 19th centuries, and becoming more analytic and econometric in the 20th, with Prais and Houthakker's (1955) classic monograph perhaps still the best known. That literature has investigated the distribution of the budget over goods, how that allocation changes with levels of living - Engel curve analysis - and the relationship between the demographic structure of the household - the sexes and ages of its members - and the way in which it spends its resources. These studies have had a wide audience in the policy community. The early work came from social activists who hoped that documenting the living standards of the poor would generate a political demand to improve them. Engel curve analysis has been an important ingredient in understanding how the structure of economies changes in the process of economic growth (see in particular the classic work of Kuznets 1962, 1966). Much of the work on demographic structure has been motivated by attempts to derive "equivalence scales," numbers that tell us how much a child costs relative to an adult, and that might allow us to correct crude measures of living standards such as income or total expenditure for differences in household composition.
Such corrections have a major impact on measures of poverty and inequality, on the identification of who is poor, and on the design of benefit programs. This chapter is concerned with these traditional topics in the context of surveys from developing countries.

In the eyes of many people, including many development economists, poverty is closely related to whether or not people get enough to eat, so that documenting the living standards of the poor becomes a question of counting calories and a major task of household surveys is to assess nutritional adequacy. Household survey data can also be used to examine how levels of nutrition change with the amount of money people have to spend. The topic is important in the debate over development strategy, between growth versus basic needs, and between less and more interventionist positions. If the elasticity of calories with respect to income is high, general economic development will eliminate hunger, while if the elasticity is low, we are faced with a choice, between a strategy for economic growth with hunger remaining a problem for a long time, perhaps indefinitely, or a more interventionist strategy that targets the nutrition of the poor while letting general economic development look after itself. This topic is addressed in Section 4.1 using survey data from India and Pakistan. I look at some of the theoretical as well as empirical issues, and argue that some of the questions in the debate can be approached using the nonparametric techniques discussed in Chapter 3.

Section 4.2 is about the demographic composition of the household, its effects on demand patterns, and about the use of such information to make inferences about the allocation within the household. Household surveys nearly always collect data on household consumption (or purchases), not on individual consumption, and so cannot give us direct information about who gets what.
In the development literature, much attention has focussed on gender issues, particularly although not exclusively among children, and on the question of whether girls are treated as well as boys. I review some of this work, as well as recent theoretical developments on how to use household data to make inferences about intrahousehold allocation. I implement one specific methodology that tries to detect whether girls get more or less than boys, and look at evidence from India and Pakistan as well as from a number of other countries.

Section 4.3 turns from the relatively firm empirical territory of Sections 4.1 and 4.2 to the more controversial ground of equivalence scales. Although the construction of scales is of great importance for any enterprise that uses household survey data to draw conclusions about welfare, the state of knowledge and agreement in the area is not such as to allow incontrovertible conclusions or recommendations. Even so, it is important to understand clearly what the difficulties are in passing from the empirical evidence to the construction of scales, and to see the assumptions that underlie the methodologies that are used in practice. Clarification of assumptions is the main issue here; equivalence scales are not identified from empirical evidence alone although, once an identifying assumption has been made, the empirical evidence is relevant and useful. Nor is there any lack of practical empirically based scales once identifying assumptions have been made. The problem with much of the literature, both in the construction and use of equivalence scales, is that identifying assumptions are often implicit, and that the effects of the assumptions on the results can be hard to see.
As a result, it is difficult to know whether different investigators are actually measuring the same thing, and those who use the scales run the risk of implicitly incorporating an assumption that they would have no hesitation in rejecting were it made explicit.

In most of this chapter, I adopt the standard convention of household budget analysis, that prices are the same for all households in the survey. The assumption of uniform prices is what has traditionally separated the fields of family budget analysis on the one hand from demand analysis on the other. The former investigates the nature of Engel curves and the effects of household composition, while the latter is mostly concerned with the measurement of price effects. Chapter 5 is about the effects of prices on demand in the context of tax and price reform, and while prices are typically not central to the questions of this chapter, there is no satisfactory justification for the uniform price assumption. Because transportation and distribution networks tend to develop along with economic growth, there is much greater scope for spatial price variation in less developed than more developed countries. In consequence, the uniform price assumption, while possibly defensible in the context of the United States or Great Britain, is certainly false in the countries analyzed in this book. I shall indicate places where I think that it is potentially hazardous to ignore price variation, but this is a poor substitute for the research that builds price variation into the analysis. That work is not straightforward. Price data are not always available, and when they are, they frequently come in a form that requires the special treatment that is one of the main issues in the next chapter.
4.1 The demand for food and nutrition

One attractive definition of poverty is that a person is poor when he or she does not have enough to eat, or in more explicitly economic terms, when they do not have enough money to buy the food that is required for basic subsistence. For the United States or other developed economies, where few people spend more than a third of their incomes on food, such a definition is clearly inadequate on its own, and must be supplemented by reference to commodities other than food. However, in countries such as India and Pakistan, where a substantial fraction of the population spend three-quarters or more of their budgets on food, a hunger-based definition of poverty makes sense. This section explores the relationship between measures of nutritional status, typically the number of calories consumed, and the standard economic measures of living standards, such as income or total expenditure. As usual, the analysis will be largely empirical, using data from India and Pakistan, but there are a number of theoretical issues that have to be given prior consideration.

Welfare measures: economic or nutritional?

If everyone spent all their income on food, and did so in the ways that are recommended by nutritionists, there would be no conflict between economic and nutritional views of living standards. However, people choose to buy goods other than food, some of which are obvious necessities like housing, shelter, and medical care, but others less obviously so, like entertainment or tobacco, and they buy such goods even when food intake is below the best estimates of subsistence. Furthermore, food purchases themselves are rarely organized according to purely nutritional considerations. As has been known (before and) since the first applications of linear programming - see Stigler (1945) and Dorfman, Samuelson, and Solow (1958, pp.
9-28) for an account - minimum nutritional requirements can usually be met for very small amounts of money, even by the standards of the very poorest. But minimum-cost diets are tedious and uninteresting, and they often bear no relation to what is actually eaten by poor people who presumably have interests beyond nutritional content. As a result, measures of welfare based on nutritional status will differ from the standard economic measures based on expenditures, income, or assets. There is, of course, no reason why we cannot have multidimensional measures of welfare - someone can be wealthy but hungry, or well-fed but poor - but we can run into difficulties if we do not keep the differences clear, especially when the two views have different implications for the design of policy.

The conflict between nutritional status and economic welfare is sharpest when we look at price changes, where it is possible for something that is desirable from the point of view of nutrition to be undesirable according to the standard economic criteria. In particular, economists tend to think that individuals with high substitution elasticities are in a good position to deal with price fluctuations, since they are well equipped both to avoid the consequences of price increases and to take advantage of price decreases. By contrast, nutritionists see high substitution elasticities as a cause for concern, at least among the poor, since nutritional status is thereby threatened by price increases.

To clarify these issues, we need a simple formulation of welfare under the two alternative approaches. For the economist, welfare is defined with reference to a preference ordering or utility function, which for these purposes we can write as $v(q_f, q_n)$, where the two components are food and nonfood respectively. We can think of $q_f$ and $q_n$ as vectors, but nothing is lost here if we consider only two goods, one food and one nonfood.
Corresponding to this utility function, there is an indirect utility function, written $\psi(x, p_f, p_n)$, whose value is the maximum utility that can be reached by someone who has $x$ to spend and when the prices of food and nonfood are $p_f$ and $p_n$, respectively. In practice, indirect utility would usually be approximated by real total expenditure, which is $x$ deflated by a price index formed from $p_f$ and $p_n$.

To consider the effect of price changes on welfare, it is convenient to follow the usual route of consumers' surplus and convert price changes into their money equivalents. For this, we use the cost or expenditure function $c(u, p_f, p_n)$, which is defined as the minimum expenditure needed to reach the welfare level $u$ at prices $p_f$ and $p_n$; see Deaton and Muellbauer (1980a, ch. 2) for a full discussion of the cost function and its properties. The partial derivatives of the cost function with respect to prices are the quantities consumed, while the matrix of second derivatives is the matrix of compensated price effects, the Slutsky matrix. The cost function is concave in the prices; holding utility constant, the response of cost to price is linear (and proportional to consumption) if consumption is held constant in face of the price increase, but will typically increase less rapidly because it is possible to substitute away from the more expensive good. In particular, if prices change by an amount $\Delta p$, the associated change in costs $\Delta c$ satisfies the inequality

(4.1)    $\Delta c \le q \cdot \Delta p = q_f \Delta p_f + q_n \Delta p_n$.

Equation (4.1) is illustrated for a single price change in Figure 4.1. The straight line through the origin is the case of no substitution, where the same quantity is bought irrespective of price, and costs are proportional to price with slope given by the amount consumed. The other two cases show two different degrees of substitution; because there is substitution away from the good as it becomes more expensive, as well as toward it when it becomes cheaper, consumers with more ability to substitute are hurt less by price increases and benefit more by price decreases.

[Figure 4.1. Substitution and the costs of price change. Vertical axis: cost change $\Delta c$; horizontal axis: price change $\Delta p$. Three lines through the origin: no substitution (straight line, slope $q$), limited substitution, and high substitution.]

Suppose now that only the price of food changes. A second-order Taylor approximation to the change in the cost of living can be obtained from the cost function using the fact that its first derivative is the quantity consumed and the second the substitution (Slutsky) term:

(4.2)    $\Delta c \approx q_f \Delta p_f + 0.5\, s_{ff}\, \Delta p_f^2$

where $s_{ff}$ is the compensated derivative of demand with respect to price. Because the substitution effects of price must be nonpositive, so that $s_{ff} \le 0$, the second term in (4.2) is always zero or negative; the larger the opportunities for substitution, the more is the consumer able to offset the costs of the price increase. In Figure 4.1, $s_{ff}$ is the curvature of each line at the origin, and thus (locally) determines how much consumers benefit from substitution. Clearly, substitution is a good thing; the higher the substitution, the less vulnerable is the consumer to increases in price, and the more he or she benefits from decreases. Further, any policy or other change that increases substitution possibilities (while preserving consumption levels) will make people better-off and is thus to be encouraged.

For the nutritionist, the costs of a price increase are measured in lost calories or other nutrients. If $\kappa$ is the calorie content per unit of food, and $k$ is total calorie intake ($k$ for kilocalories), the change in calories induced by the price change can be approximated by

(4.3)    $\Delta k \approx \kappa \dfrac{\partial q_f}{\partial p_f} \Delta p_f + 0.5\, \kappa \dfrac{\partial^2 q_f}{\partial p_f^2} \Delta p_f^2$

where the derivatives are uncompensated.
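The contrast between the two approximations can be made concrete with a small numerical sketch. The functions below simply evaluate the right-hand sides of (4.2) and (4.3), with `q_f` the quantity of food, `s_ff` the compensated price derivative, `kappa` the calories per unit of food, and `dp_f` the price change. Every number is hypothetical, and, as a simplifying assumption that is not part of the text, the same price slope is used for the compensated and the uncompensated response.

```python
# Numerical sketch of the approximations (4.2) and (4.3).
# All magnitudes below are hypothetical and chosen only for illustration.


def cost_change(q_f, s_ff, dp_f):
    """Second-order approximation to the change in the cost of living, as in (4.2)."""
    return q_f * dp_f + 0.5 * s_ff * dp_f ** 2


def calorie_change(kappa, dq_dp, dp_f, d2q_dp2=0.0):
    """Approximate change in calorie intake, as in (4.3).

    The second derivative defaults to zero, i.e. the second term is
    treated as negligible.
    """
    return kappa * dq_dp * dp_f + 0.5 * kappa * d2q_dp2 * dp_f ** 2


q_f = 10.0     # hypothetical quantity of food bought
dp_f = 1.0     # hypothetical increase in the price of food
kappa = 100.0  # hypothetical calories per unit of food

# Simplifying assumption: one slope serves as both the compensated
# response in (4.2) and the uncompensated response in (4.3).
cases = {
    "no substitution": 0.0,
    "limited substitution": -1.0,
    "high substitution": -4.0,
}

results = {}
for label, slope in cases.items():
    dc = cost_change(q_f, slope, dp_f)
    dk = calorie_change(kappa, slope, dp_f)
    results[label] = (dc, dk)
    print(f"{label}: cost change = {dc:.2f}, calorie change = {dk:.1f}")
```

Under these assumed numbers, a larger (absolute) price response lowers the money cost of the price increase while raising the loss of calories, which is the tension between the economic and the nutritional readings of the same elasticity.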
If we compare (4.2) and (4.3), both depend on the responsiveness of food consumption to prices, the latter through the uncompensated first and second derivatives, and the former through the compensated price derivative. We can perhaps be permitted to assume that the second term in (4.3) is negligible - it is the price derivative of the price derivative and certainly there is little or no empirical evidence about such a quantity - in which case the distinction between the two equations becomes quite sharp. In (4.2), the cost of a price increase is smaller the larger (absolutely) is the (compensated) price response, while in (4.3) the nutritional cost of the price increase is larger the larger (absolutely) is the (uncompensated) price response. The difference between the compensated and uncompensated price effects will be large when the good whose price is changing is a large share of the budget, which is true for food as a whole, but in practice we are usually concerned with the price of a particular food where the difference will be small.

The crux of the conflict between the two approaches can be seen by considering an example. Suppose that milk is being subsidized, that the poor receive a good deal of their nutrients from milk, and that the government is considering reducing the subsidy for budgetary reasons. Suppose also, for the purpose of the argument, that, according to the best empirical evidence, there is a large price elasticity among the poor, so that if the price is increased, the poor will reduce their consumption of milk and its associated nutrients. The nutritional advice is therefore to maintain the subsidy, while the economic advice is likely to be the reverse. By their high price elasticity, the poor have revealed that there are good substitutes for milk, in their own eyes if not in those of the nutritionists, so that the withdrawal of the subsidy is unlikely to hurt them much.
For those who, like most economists, base welfare on people's purchasing power and leave the choice of individual goods up to people themselves, the large price elasticity is welcome since it is evidence that people are not vulnerable to increases in the price of milk.

This conflict between commodity-specific and money approaches to welfare occurs in many other situations; health care, education, and even telephones, are cases where policymakers and their constituents sometimes believe that consumption of a specific good is valuable independently of the general living standards of the consumer, and independently of whether people appear to value the good themselves. At the heart of the matter is whether or not we accept people's own judgments of what is good for them. While it is easy to find examples of people making poor choices, it is also difficult to find convincing cases where policymakers or other "experts" have done better on their behalf. One such case that is relevant for the current discussion is the history of food rationing in Britain during World War II; in spite of widespread shortages there was a marked improvement in general nutritional standards brought about by policies that simultaneously limited consumption (by rationing) while redirecting it towards commodities such as fresh milk that had not been widely consumed prior to the war and whose supply was guaranteed during it (see Hammond 1951). The policy seems also to have narrowed long-standing health and mortality inequalities, at least temporarily (see Wilkinson 1986). In any case, my quarrel is not with those who wish to change tastes, or who wish to eliminate hunger even at the expense of more general economic well-being. Although many economists would disagree with such prescriptions, there is no logical flaw in the arguments.
What is both incorrect and logically flawed is to try to follow both the commodity-specific and economic approaches simultaneously. High substitution effects are either a good thing or a bad thing; they cannot be both. It is entirely legitimate to worry about the effects of food prices on nutrition or of user charges on consumption of hospital services or education by the poor - and such has been a major research topic in the World Bank in recent years (see Jimenez 1987 or Gertler and van der Gaag 1990) - but it is necessary to take a view about what high price elasticities mean. If our goal is to provide these services to the poor even when their behavior suggests that they do not value them, then that fact should be explicitly recognized and its implications - for education, for example - taken into account. Alternatively, if we accept that the decisions of the poor have a reasonable basis - perhaps because the services are of poor quality and worth very little - then it is necessary to think hard about the justification for the subsidy, and whether the funds could be better employed elsewhere, perhaps in improving quality and delivery. Spending scarce resources to subsidize facilities that are not valued by the poor will not do much to reduce poverty, and the true beneficiaries of such policies are often to be found elsewhere.

Nutrition and productivity

An important issue is the direction of causation, whether the link runs, not only from income to nutrition, but also from nutrition to income. One possibility is that those who do heavy manual labor require more calories than those who do not - see for example Pitt, Rosenzweig, and Hassan (1990), who also consider the implications for intrahousehold allocation. This possibility can be dealt with by controlling for the appropriate occupational variables in the nutrition demand function.
However, it is also possible that not getting enough to eat impairs productivity to the point where poverty and malnutrition become a trap; it is impossible to work because of malnourishment, and impossible to escape from malnourishment without working.

Following Leibenstein (1957), the theory of nutritional wages has been worked out by Mirrlees (1975) and Stiglitz (1976) in some of the finest theoretical work in economic development. These models can account for the existence of unemployment; attempts by workers without jobs to underbid those with jobs will only succeed in reducing their productivity, so that the employer gains nothing by hiring them. By the same token, the theory can explain the existence of high wages in modern sector jobs. It is also consistent with unequal allocations within the family because equal shares may leave no one with enough energy to work the farm or to be productive enough to get a job. This model has been used to account for destitution in India by Dasgupta (1993, pt. IV; see also Dasgupta and Ray 1986, 1987) as well as by Fogel (1994), who sees the mechanism as the major impediment to historical growth in Europe.

While there has been some empirical work that looks for such effects - most notably by Strauss (1986) for Sierra Leone - there are formidable difficulties in the way of constructing a convincing test. If we use data on self-employed workers, and find a relationship between income and nutrition, we need some means of knowing whether what we see is the common-or-garden consumption function, by which higher income generates more spending and more nutrition, or is instead what we are looking for, the hypothesized effect of nutrition on productivity, output, and income.
In principle, such identification problems can be solved by the application of instrumental variables, but it is doubtful whether there are any variables that can convincingly be included in one relationship and excluded from the other. These points were argued by Bliss and Stern (1981), who also identified the corresponding difficulties in looking for nutritional effects among employed workers. Since employers will only hire well-nourished workers - which is the source of the model's predictions about unemployment - nutritional effects will only be found among the employees of those employers who are unaware of the effects of nutrition on productivity! The Mirrlees-Stiglitz theory hinges on nonlinearities in the effects of nutrition on productivity, and requires that productivity be a convex function of nutrition at low levels, becoming concave as nutritional status improves. These nonlinearities would have to provide the basis for identification in the econometric analysis, which is likely to be both controversial and unconvincing.

There is one final point on which the empirical evidence is directly relevant. For the nutritional wage theory to be a serious contender to explain the phenomena for which it was invented, the calories required to maintain productivity must be costly. If it is possible to obtain enough calories to do a day's work for a very small fraction of the daily wage, then low productivity rooted in malnutrition is an implausible explanation for unemployment. In the empirical analysis below, we shall use Indian data to make these calculations.

The expenditure elasticity of nutrition: background

The relationship between nutritional intake and total expenditure (or income) in poor countries is the link between economic development and the elimination of hunger.
In the simplest of worlds, one might imagine that food is the "first necessity," and that people whose incomes do not permit them to buy enough food would spend almost everything on food. Even admitting the existence of other necessities, such as clothing and shelter, it is still the case that the poorest households in India (for example) spend more than three-quarters of their budget on food, and that this share does not fall by much until the top quartile of the PCE distribution. For such people, the demand for food is elastic with respect to PCE, so that it might be supposed that the elasticity of nutrient intake would also be high, perhaps even close to unity as in the simplest case.

However, it has been recognized for many years that this view is too simple. Even if the expenditure elasticity of food were unity, the elasticity of calories need not be, since the composition of foods will change as income rises. This can happen through substitution between broad groups of goods, as when meats are substituted for cereals, or when "fine" cereals such as rice and wheat are substituted for roots (e.g., cassava) or coarse cereals such as sorghum or millet, or it can happen within food groups, as people substitute quality, variety, and taste for quantity and calories. As a result, the nutrient elasticity will be lower than the food elasticity, perhaps by as much as a half. Indeed, Reutlinger and Selowsky (1976), in one of the most cited studies of malnutrition and development, assumed that calorie intake and total expenditure were linked by a semilogarithmic relationship - with the implication that the elasticity falls as calorie consumption rises - but that even among households just meeting their caloric need, the elasticity was between 0.15 and 0.30.
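The reason a semilogarithmic relationship implies a falling elasticity can be shown in one line. The derivation below is a sketch in assumed notation: $k$ is calorie intake and $x$ is total expenditure as in the text, while $\alpha$ and $\beta > 0$ are illustrative parameters that do not appear in the source.

```latex
% Assumed notation: k = calorie intake, x = total expenditure;
% \alpha and \beta > 0 are illustrative parameters, not from the source.
\begin{align*}
k &= \alpha + \beta \ln x, \\
\varepsilon \equiv \frac{\partial \ln k}{\partial \ln x}
  &= \frac{1}{k}\,\frac{\partial k}{\partial \ln x}
   = \frac{\beta}{k}.
\end{align*}
```

With $\beta$ fixed, the elasticity $\varepsilon$ is inversely proportional to $k$. As a purely hypothetical example, an elasticity of 0.30 at 2,000 calories corresponds to $\beta = 600$, which gives an elasticity of 0.20 at 3,000 calories.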
The empirical evidence has produced a wide range of estimates of the elasticity, from close to unity to close to zero (see Bouis and Haddad 1992 and Strauss and Thomas 1995 for reviews of the literature) but, at least until recently, there would have been assent for the range suggested by Reutlinger and Selowsky, as well as for their assumption that the elasticity is higher among the poor than the rich. To paraphrase Alfred Marshall's dictum on the demand for food, the size of the elasticity is ultimately limited by the capacity of the human stomach, elastic though that may be. However, a number of recent studies have claimed that the true elasticity is very low, perhaps even zero, and that the earlier presumptions are based on studies that are flawed by inadequate data as well as by econometric and theoretical problems. Bouis and Haddad (1992), using data from the Philippines, investigate a wide range of reporting and econometric biases, and conclude that the true estimate, although significantly positive, is in the range of 0.08 to 0.14. Behrman and Deolalikar (1987) use data from the ICRISAT villages in south India, and although their preferred point estimate is 0.37, it has an estimated standard error that is also 0.37, leading Behrman and Deolalikar to conclude that "for communities like the one under study, increases in income will not result in substantial improvements in nutrient intakes." (p. 505, italics in the original).

If we were to accept this extreme revisionist opinion, the conflict between basic needs and economic development is stark indeed. Even if structural adjustment or other policy reform were to succeed in accelerating (or in some countries just starting) economic growth, the poor will still not have enough to eat, and we will be left with the problems of endemic hunger and malnutrition, albeit in a richer world.
Much of the applied economics of development policy would have to be rethought; the standard prescriptions of project evaluation, price reform, tax policy, and trade policy are all derived under the assumption that what we want to promote is people's general economic well-being, not their intake of nutrients, or if we are concerned with the latter, it is assumed that it will follow from the former. One can ultimately imagine a pricing policy that is designed, not to target the poor, to avoid distortion, nor to raise revenue, but to induce people to eat the recommended foods.

Evidence from India and Pakistan

The main results reported in this subsection are taken from Subramanian and Deaton (1996), which should be consulted for the details of the calculations. We use Indian NSS data for the state of Maharashtra in 1983 (38th round) to construct household calorie availabilities from the consumption data. Household surveys (including the NSS) typically collect data on consumption levels, not on nutrition, so that data on calorie intake have to be calculated ex post from the data on consumption of food. Because of substitution between foods as incomes rise, the multiplication of total food expenditures by an average "calorie per rupee" factor will systematically understate the calories of the poor and overstate those of the rich, so that the elasticity of nutrition with respect to income will be overstated. Since substitution can take place both between broad groups and within broad groups, it is good practice to evaluate calories using as much detail as possible. In the Maharashtran data, 5,630 rural households report consumption during the last 30 days on each of more than 300 items, of which 149 are food items for which there are data on both expenditures and quantities. The latter are converted to calories using the tables on the nutritive values of Indian foods in Gopalan et al. (1971).
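The bias from using a single "calorie per rupee" factor can be illustrated with a minimal sketch. Everything in it is hypothetical: the three foods, their calorie contents and prices, and the two households' quantities are invented for illustration and are not taken from the NSS data or from Gopalan et al. The elasticity is computed between the two households with respect to food expenditure, as a stand-in for the income elasticity discussed in the text.

```python
import math

# Hypothetical conversion factors (calories per kg) and prices (rupees per kg).
# These are invented for illustration and are not taken from any source.
CALORIES_PER_KG = {"jowar": 3500.0, "rice": 3450.0, "milk": 650.0}
PRICE_PER_KG = {"jowar": 1.75, "rice": 3.30, "milk": 2.40}

# Hypothetical monthly quantities (kg per person) for two households.
households = {
    "poor": {"jowar": 12.0, "rice": 1.0, "milk": 0.5},
    "rich": {"jowar": 4.0, "rice": 12.0, "milk": 10.0},
}


def item_level_calories(quantities):
    """Convert each item's quantity to calories and add up."""
    return sum(CALORIES_PER_KG[item] * qty for item, qty in quantities.items())


def food_expenditure(quantities):
    """Total spending on food at the assumed prices."""
    return sum(PRICE_PER_KG[item] * qty for item, qty in quantities.items())


calories = {h: item_level_calories(q) for h, q in households.items()}
spending = {h: food_expenditure(q) for h, q in households.items()}

# One average "calorie per rupee" factor applied to every household.
factor = sum(calories.values()) / sum(spending.values())
approx = {h: factor * spending[h] for h in households}


def arc_elasticity(cal):
    """Log change in calories over log change in food spending, poor to rich."""
    return math.log(cal["rich"] / cal["poor"]) / math.log(
        spending["rich"] / spending["poor"]
    )


for h in households:
    print(f"{h}: item-level = {calories[h]:.0f}, average-factor = {approx[h]:.0f}")
print(f"elasticity, item-level:     {arc_elasticity(calories):.2f}")
print(f"elasticity, average factor: {arc_elasticity(approx):.2f}")
```

Because the average factor makes calories proportional to spending, it forces the implied elasticity to one, whereas the item-level conversion reflects the move toward more expensive calories as spending rises.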
In Indian households, a substantial amount of food is not consumed by household members, but is given to employees and guests, and conversely, many members of poorer households receive a fraction of their calories in the form of meals provided by employers. Although the data do not provide direct estimates of the calories involved in these transactions, the 38th round of the NSS asked respondents to report the number of meals taken by family members, provided to employees and by employers, and served to guests at both ceremonial and other occasions. These figures can be used to estimate the average calorie content of each type of meal by regressing the total calorie content of the food bought by the household on the numbers of meals provided by the household in each of the categories. Doing so suggests that each person-meal at home provides 727 calories, each meal to a servant 608 calories, while meals to guests have 1,550 calories at ceremonies and 1,379 at less formal occasions. These numbers can be used to correct the total calories, to exclude food not consumed by family members, and to include food received by family members as employees.

Table 4.1 shows how rural Maharashtran households obtained their calories in 1983, what foods they consumed and how much they spent on each. The top half of the table shows the allocations over broad groups of foods, while the bottom half looks at cereals alone. On the left side, I show the allocation of expenditure between the goods, in the middle the allocation of calories, and on the right, the prices that people paid for 1,000 calories obtained by purchases of each of the various foods or groups of foods. In each case, the table shows the mean for all rural households in the sample together with means for the bottom and top deciles of the PCE distribution.
(Tabulation of means by decile is a special case of the method of fractile analysis introduced by Mahalanobis (1960), and can be thought of as a crude but often useful form of nonparametric regression, just as histograms are crude but useful nonparametric estimators of densities.)

Table 4.1. Expenditure patterns, calorie consumption, and prices per calorie, rural Maharashtra, 1983

                        Expenditure shares      Calorie shares          Price per 1,000
                        (percent)               (percent)               calories (rupees)
Food                      Mean  Bottom     Top    Mean  Bottom     Top    Mean  Bottom     Top
                                   10%     10%             10%     10%             10%     10%

Food groups
Cereals                   40.7    46.0    31.0    70.8    77.3    57.3    0.64    0.51    0.79
Pulses                     8.9    10.2     7.8     6.6     6.2     7.2    1.51    1.44    1.60
Dairy                      8.1     4.9    11.8     2.8     1.3     4.9    3.69    3.59    3.92
Oils & fats                9.0     9.2     9.2     5.9     4.8     7.6    1.74    1.67    1.81
Meat                       5.1     3.4     6.4     0.7     0.4     1.0    11.7    11.0    12.2
Fruit & vegetables        10.5     8.5    12.0     3.5     2.3     5.4    3.90    3.83    3.85
Sugar                      6.5     7.4     5.9     7.2     7.0     8.0    1.01    0.94    1.09
Other food                11.3    10.4    16.1     2.5     0.8     8.6    17.4    16.8    15.9

Cereals
Rice                      11.6     9.0    10.9    15.2    10.1    16.5    0.95    0.89    1.02
Wheat                      5.6     3.8     7.9     8.5     4.7    14.4    0.79    0.73    0.82
Jowar                     18.2    27.4     9.3    37.8    52.9    21.6    0.50    0.43    0.55
Bajra                      3.0     2.7     1.3     6.6     4.9     3.2    0.48    0.48    0.50
Other coarse cereals       1.2     2.8     0.3     2.2     4.5     0.6    0.66    0.58    0.99
Cereal substitutes         1.1     0.5     1.3     0.6     0.2     0.8    2.23    2.22    2.22

Total food                67.4    73.4    54.1   2,120   1,385   3,382    1.14    0.88    1.50
Calories adjusted                                2,098   1,429   3,167

Note: Mean refers to the mean of the whole sample and bottom (top) 10 percent to the mean of households in the bottom (top) decile of PCE. Shares of calories and of expenditures are calculated on an individual-household basis and are averaged over all appropriate households. Calorie prices are averages over consuming households.
Source: Subramanian and Deaton (1996, table 1).
On average, rural Maharashtran households spent more than two-thirds (67.4 percent) of their budget on food; the corresponding figure is near three-quarters for the bottom decile (73.4 percent) and is still above a half (54.1 percent) among households in the top decile. From this they obtained 2,120 calories on average. More than 40 percent of expenditures were on cereals, with the poor buying mostly coarse cereals, mainly jowar (sorghum vulgare), while better-off households spent more on wheat and rice. Cereals provide calories more cheaply than do other foods. On average each 1,000 calories from cereals cost only 64 paise in 1983, and a large share of total calories (more than three-quarters in the bottom decile) came from this source. The coarse cereals provided calories even more cheaply, jowar and bajra (millet) at 50 paise per 1,000 calories, and the focus of poor households' consumption on these goods means that poor households spent 88 paise per 1,000 calories compared with 1.50 rupees for households in the top decile and 1.14 rupees on average. For comparison, the wage rate in rural Maharashtra in 1983 was around 15 rupees per day, so it is hard to believe that inadequate nutrition is a bar to employment or productive labor for the individuals in this sample. Given that a substantial fraction of calories are required to maintain the body even at rest, and that an additional 2,000 calories can be purchased for one rupee, malnutrition alone is surely an insufficient force to keep people in destitution.

There are two other important points in the table. First, the increase in the cost per calorie from poor to rich is a consequence of the substitution between the broad groups of foods, at least provided that the different types of cereals are distinguished from one another.
There is not much evidence of higher prices being paid within categories, so that, for example, the best-off households pay about the same per kilo for their dairy products or their wheat as do households in the bottom decile. Secondly, the correction for meals given and taken, which is shown in the last row of the table, works as expected, increasing the calories of the poorest households, decreasing the calories of the better-off, and leaving the mean relatively unchanged. If the correction is not made, we run the risk of overestimating the responsiveness of calories to income.

Table 4.1 also gives a rough idea of the various elasticities, and how the increase of food expenditure with income is divided between additional calories on the one hand and additional expenditure per calorie on the other. Since the budget share of a good is identically equal to the price multiplied by quantity (per capita) divided by total expenditure (per capita), the logarithm of the budget share is the logarithm of price plus the logarithm of quantity less the logarithm of total expenditure. The average of the logarithm of PCE among households in the top decile is 5.6, and that among those in the bottom decile is 3.8. If this is compared with the change in the logarithm of the budget share, from (the logarithm of) 73.4 to 54.1, we get a total elasticity of food expenditure per head of about 0.8, matching the slow decline of the food share with PCE. The corresponding rise in expenditure per calorie, from 0.88 to 1.50 rupees per 1,000, translates into an elasticity of around 0.3, so that the residual, which is the elasticity of calories with respect to PCE, is around 0.5. As we shall see, this estimate is on the high side, but the general picture is not; the food expenditure elasticity is large with substantial fractions accounted for by each of its two components, the elasticity of calories and the elasticity of the price of calories.
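The back-of-the-envelope decomposition in the preceding paragraph can be reproduced in a few lines. The Python sketch below uses only the decile means quoted in the text and in Table 4.1; the variable names are mine, and the calculation is an arc elasticity between the two extreme deciles rather than a regression estimate.

```python
import math

# Decile means quoted in the text and in Table 4.1 (top and bottom PCE deciles).
log_pce_top, log_pce_bottom = 5.6, 3.8
food_share_top, food_share_bottom = 54.1, 73.4   # percent of the budget
price_top, price_bottom = 1.50, 0.88             # rupees per 1,000 calories

d_log_pce = log_pce_top - log_pce_bottom

# log(share) = log(food expenditure) - log(PCE), so the elasticity of food
# expenditure is one plus the change in log share per unit change in log PCE.
food_expenditure_elasticity = 1 + (
    math.log(food_share_top) - math.log(food_share_bottom)
) / d_log_pce

# Elasticity of the price paid per calorie with respect to PCE.
calorie_price_elasticity = (math.log(price_top) - math.log(price_bottom)) / d_log_pce

# Food expenditure is price per calorie times calories, so the calorie
# elasticity is the residual.
calorie_elasticity = food_expenditure_elasticity - calorie_price_elasticity

print(f"food expenditure elasticity: {food_expenditure_elasticity:.2f}")
print(f"calorie price elasticity:    {calorie_price_elasticity:.2f}")
print(f"calorie elasticity:          {calorie_elasticity:.2f}")
```

Rounded to one decimal place, the three numbers are the 0.8, 0.3, and 0.5 given in the text.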
The general relationship between calories and PCE can also be illustrated using the bivariate density estimation techniques of Section 3.2. Figure 4.2 shows the estimated contours of the joint density of the logarithm of calories per head per day and the logarithm of monthly PCE. For comparative purposes, and using the same vertical scale (for calorie intake), Figure 4.3 shows the same plot for the Pakistani province of Sindh in 1984-85 based on data from the National Income and Expenditure Survey. I will not use the Pakistani data further in this section, but the general similarity between the two graphs is worthy of note. In both cases, the contours are approximately elliptical, as would happen if the joint density were in fact normal. The implicit regression functions are close to being both linear and homoskedastic, and the slope in each case is approximately one-half. Although both surveys come from the subcontinent, the dietary habits of the two regions are different, with rice the staple in the Sindh Province of Pakistan, while in Maharashtra the hierarchy of basic foods starts with jowar and progresses to rice, and eventually to wheat. Other similarities and differences between these two surveys will be further explored in Chapter 5 when I consider the response of food demand to prices.

Regression functions and regression slopes for Maharashtra

Given the results in Figure 4.2, it is hardly surprising that the associated regression function is close to linear. Figure 4.4 shows the estimated regression using a locally weighted smoother with a quartic kernel and a bandwidth of 0.50. The graphs also show confidence bands for the regression line. These were obtained by bootstrapping the locally weighted regressions 50 times, so that, at each point along the 100-point grid used for the regressions, there are 50 bootstrapped replications.
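For readers who want to experiment, here is a minimal Python sketch of a locally weighted (local linear) regression with a quartic kernel, together with bootstrapped standard errors computed with and without resampling by cluster. It is not the code behind Figure 4.4: the data are synthetic, the village structure and true elasticity of 0.5 are assumptions, the function names are mine, and a 20-point grid is used instead of 100 points to keep it quick.

```python
import numpy as np


def quartic_kernel(u):
    """Quartic (biweight) kernel: (15/16)(1 - u^2)^2 for |u| <= 1, zero outside."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, (15.0 / 16.0) * (1.0 - u**2) ** 2, 0.0)


def local_linear(x, y, grid, bandwidth):
    """Kernel-weighted least squares of y on (1, x - x0) at each grid point x0.

    Returns the fitted level and the local slope at every grid point.
    """
    levels = np.empty(len(grid))
    slopes = np.empty(len(grid))
    for k, x0 in enumerate(grid):
        root_w = np.sqrt(quartic_kernel((x - x0) / bandwidth))
        design = np.column_stack([np.ones_like(x), x - x0]) * root_w[:, None]
        coef, *_ = np.linalg.lstsq(design, y * root_w, rcond=None)
        levels[k], slopes[k] = coef
    return levels, slopes


def bootstrap_se(x, y, grid, bandwidth, n_boot, rng, clusters=None):
    """Standard error of the fitted level at each grid point from n_boot resamples.

    With clusters=None households are resampled one at a time; otherwise whole
    clusters are resampled, which respects the clustered survey design.
    """
    n = len(x)
    if clusters is not None:
        ids = np.unique(clusters)
        members = {c: np.flatnonzero(clusters == c) for c in ids}
    reps = np.empty((n_boot, len(grid)))
    for b in range(n_boot):
        if clusters is None:
            idx = rng.integers(0, n, size=n)
        else:
            drawn = rng.choice(ids, size=len(ids), replace=True)
            idx = np.concatenate([members[c] for c in drawn])
        reps[b], _ = local_linear(x[idx], y[idx], grid, bandwidth)
    return reps.std(axis=0, ddof=1)


# Synthetic data (assumed for illustration): 200 villages of 10 households,
# a true elasticity of 0.5, and a small common village effect.
rng = np.random.default_rng(0)
village = np.repeat(np.arange(200), 10)
log_pce = rng.normal(4.6, 0.5, size=village.size)
village_effect = rng.normal(0.0, 0.1, size=200)[village]
log_cal = 5.4 + 0.5 * log_pce + village_effect + rng.normal(0.0, 0.25, size=village.size)

grid = np.linspace(3.8, 5.4, 20)
levels, slopes = local_linear(log_pce, log_cal, grid, bandwidth=0.5)
se_simple = bootstrap_se(log_pce, log_cal, grid, 0.5, 50, rng)
se_cluster = bootstrap_se(log_pce, log_cal, grid, 0.5, 50, rng, clusters=village)

# Bands two standard errors either side of the estimated regression function.
lower, upper = levels - 2 * se_cluster, levels + 2 * se_cluster
print(np.round(slopes, 2))
```

Because each local regression is centered at the grid point, its intercept is the estimated regression function there and its slope coefficient is the local elasticity, which is how a slope plot such as Figure 4.5 comes as a by-product.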
These replications are used to calculate a standard error at each point, and then to plot lines that are two standard errors above and below the estimated regression function. The graph shows two such pairs of lines; the inner pair ignores the clustered structure of the sample and the outer pair takes it into account in the bootstrapping. In this case, the cluster structure makes little difference, and the confidence bands are tight around the regression line.

The graph suggests that the slope falls with PCE, but that the regression function is close to linear with a slope of about 0.5. Given the economic plausibility of a slope that is higher among the poor, it is worth calculating and plotting a nonparametric estimate of the slope itself, allowing it to be different at different levels of PCE. Such estimates are a natural by-product of the locally weighted regressions, since each local regression delivers a slope parameter, and are shown in Figure 4.5, once again with two sets of estimated confidence bands. The graph shows an elasticity that declines gently with the level of PCE, from about 0.7 at the bottom to 0.3 at the top. These data do not show any evidence of a sharp change in the elasticity at some critical level of calorie intake as found by Strauss and Thomas (1995) for Brazil.

Figure 4.2. Calories per head and log(PCE), rural Maharashtra, 1983
[Contour plot. Horizontal axis: log(PCE), log(rupees per capita per month).]
Source: Subramanian and Deaton (1996).

Figure 4.3. Calories per head and log(PCE), rural Sindh, 1984-85
[Contour plot. Horizontal axis: log(PCE), log(rupees per month).]
Source: Author's calculations based on National Income and Expenditure Survey, 1984-85.

Figure 4.4. Estimated regression function for log of calories per head and log(PCE), rural Maharashtra, 1983
[Plot. Horizontal axis: log of PCE.]
Source: Subramanian and Deaton (1996).

Figure 4.5. Elasticity of calories per head to PCE, Maharashtra, 1983
[Plot. Horizontal axis: log(PCE).]
Source: Subramanian and Deaton (1996).

Allowing for household structure

One serious problem with the regressions and regression slopes in Figures 4.4 and 4.5 is that they portray only the bivariate relationship between calories per head and total household expenditure per head. While the latter is certainly the most important variable determining calorie consumption, it is by no means the only one, and even if we are less interested in other covariates than in the total budget, the calorie expenditure relationship will not be consistently estimated if the omitted variables are correlated with PCE. The most important omitted variables are those representing the size and composition of the household. While a relationship between calories per head and total expenditure per head is a natural starting point, and allows for household size in a rough way, it makes no concession to the fact that children consume fewer calories than do their parents, so that, controlling for PCE, households with more children will typically consume fewer calories per head. Even among all-adult households, we might expect the presence of economies of scale to imply that larger households with the same PCE are better-off than smaller households (but see Section 4.3 below) so that food consumption per capita will be a function of both PCE and household size. The same will be true if there are economies of scale in food consumption itself, for example if wastage is reduced when more people share the same kitchen and take meals together.

As always, nonparametric techniques are much better at handling bivariate than multivariate relationships, but it is possible to look at the calorie expenditure relationship nonparametrically for a range of household sizes.
Figure 4.6 illustrates this, reproducing the nonparametric regression functions between the logarithm of calories per head and the logarithm of PCE, but now separately for households of different sizes. There are 10 curves, for household sizes 1 through 10, and each curve is based on a different number of households. The curve for five-person households uses 1,100 observations and is the most precisely estimated; the number of households in each category rises from 283 single-person households to 1,100 with 5 persons, and falls to only 96 households with 10 persons.

There are several important points to note. First, the curves are lower for larger households, so that, conditional on PCE, calorie consumption per capita falls with household size, an effect that is at least partly due to the fact that larger households have a larger proportion of children. Second, the elasticity of calories with respect to expenditure is not much affected by the size of the household. The different regression lines are approximately parallel to one another. Third, because per capita calorie consumption is negatively related to household size at constant PCE, and because in rural Maharashtra PCE is negatively related to household size, as it is in most household surveys, the omission of household size from the regression of per capita calorie consumption on PCE will have the effect of biasing upward the estimate of the calorie expenditure elasticity. As PCE rises, household size falls, which will have an independent positive effect on calories per head, and this effect will be attributed to (or projected on) PCE if household size is not included in the regression. A careful examination of Figures 4.6 and 4.4 shows that the slopes in the former are indeed lower. In Figure 4.4, a difference in the logarithm of PCE from 3.8 to 5.8 is associated with a change from 7.2 to 8.0 in the logarithm of per capita calories, a difference of 0.8, compared with a corresponding difference of about 0.7 in Figure 4.6. Dividing each difference by the change of 2.0 in the logarithm of PCE, controlling for household size will therefore tend to reduce the calorie expenditure elasticity, from about 0.40 to about 0.35, figures that are much more like the conventional than the revisionist wisdom.

Figure 4.6. Calorie-outlay regression functions by household size
[Plot. Labeled curves include one-person, two-person, three-person, and eight-person households. Horizontal axis: log(PCE).]
Source: Subramanian and Deaton (1996).

Since the nonparametric regressions suggest that the relationship between the logarithm of calories per head and the logarithm of PCE is close to linear and that household size has an effect that is additive without interactions with log(PCE), the nonparametric results can be closely approximated by a linear regression of the logarithm of calories per head on the logarithms of PCE and of household size. This regression can then be used as a basis for examining the role of other covariates, since the completely nonparametric approach cannot handle any further increase in dimension. In fact, apart from household size, the inclusion of other variables (even including dummy variables for each of the 563 villages in the survey) has little effect on the estimates of the elasticity, which remains close to 0.35. Several of the other variables are important (see Subramanian and Deaton 1996, Table 2, for details) but unlike household size, they are sufficiently orthogonal to the logarithm of PCE to have little effect on the key coefficient.
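The direction of the omitted variable bias is easy to check by simulation. The Python sketch below generates artificial data in which log household size lowers calories per head and is negatively correlated with log(PCE), then compares the regression with and without household size. Every parameter value is an assumption, chosen only so that the two slopes land near 0.35 and 0.40; nothing here is estimated from the Maharashtran data.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Hypothetical data-generating process; all parameter values are assumptions.
log_size = rng.normal(1.5, 0.5, size=n)                                  # log household size
log_pce = 5.0 - 0.4 * (log_size - 1.5) + rng.normal(0.0, 0.45, size=n)   # PCE falls with size
log_cal = 6.2 + 0.35 * log_pce - 0.12 * log_size + rng.normal(0.0, 0.2, size=n)


def ols(y, *regressors):
    """Least-squares coefficients of y on a constant and the given regressors."""
    design = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


short = ols(log_cal, log_pce)             # household size omitted
long = ols(log_cal, log_pce, log_size)    # household size included

print(f"elasticity without household size: {short[1]:.3f}")
print(f"elasticity with household size:    {long[1]:.3f}")
print(f"coefficient on log household size: {long[2]:.3f}")
```

The short regression attributes part of the household size effect to PCE, so its slope is the larger of the two, which is the pattern described in the text.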
One objection to the use of household size and household composition variables to "explain" calorie consumption is that these variables are themselves endogenous, so that their coefficients, as well as that on the total expenditure variable, are not readily interpreted. This is partly an argument about the endogeneity of fertility (families do not have children by random and unanticipated delivery, perhaps by storks!) but more generally recognizes that family size is adjusted in many ways, by migration, by younger adults splitting off to form their own households, by marriage, and in some parts of the world, by large-scale fostering. All of these processes are (arguably) influenced by the economic position of the household. As usual, the fundamental problem is a possible correlation between the regression errors and the explanatory variables, so that the calorie expenditure elasticity will be biased if households who consume more calories than predicted are also those that are more fertile, more likely to absorb members than to shed them, and so forth, possibilities that can hardly be excluded on theoretical grounds.

In principle, the calorie expenditure relationship could be reestimated using instrumental variables, although in practice, our understanding of fertility or the adaptation of household size to economic change is far from sufficiently precise to provide instruments that are well-correlated with household size, let alone that are credibly excluded from the calorie expenditure relationship itself.
Even so, it is useful to consider interpreting the relationships with and without household size as short-run versus long-run relationships, so that the reduced-form relationship, between calories and income (or total outlay), is taken as revealing the long-run influence of higher incomes on calorie consumption, taking into account those effects that operate indirectly through the adjustment of household size through fertility or other mechanisms. For some purposes, this is what we want to know; the effect of increasing per capita income on levels of nutrition once all the offsetting and reinforcing second-order effects have worked themselves out. If so, the appropriate elasticity would be at the upper end of the range, closer to 0.40 than 0.30.

The effect of measurement errors

The estimates of the calorie expenditure elasticity for Maharashtra (and the results are very similar for the 1984-85 Pakistani survey) are close to what I have characterized as the conventional wisdom, and are very much higher than the revisionist estimates of Behrman and Deolalikar (1987) and Bouis and Haddad (1992). Bouis and Haddad (see also Bouis 1994) note that their low estimates, as well as those of Behrman and Deolalikar, are obtained using data obtained by the direct observation by nutritionists of calorie consumption, and not, as in the Indian and Pakistani data and in most other studies, by using consumption data that has been converted to calories using tables of conversion ratios. Bouis and Haddad provide evidence from both the Philippines and from Kenya, where the data allow implementation of both methodologies, that the estimates from the direct nutritional data are lower. (Behrman and Deolalikar's estimates, which also come from Maharashtran villages, are not lower, but are insignificantly different from zero; their point estimate of 0.37 is essentially the same as that discussed above.)
There can be no automatic presumption that the direct nutrition surveys provide the superior data, since the increased accuracy from observing what people actually eat could well be offset by the artificial and intrusive nature of the survey.

However, as Bouis and Haddad argue, there is one very good reason for suspecting that standard methods lead to upwardly biased estimates; because calories are counted by multiplying consumption quantities by positive factors (calorie conversion factors) and adding, and because total expenditure is obtained by multiplying those same quantities by other positive factors (prices) and adding, measurement error in the quantities will be transmitted to both totals, thus leading to a spurious positive correlation between total expenditure and total calories. Although the mismeasurement of total expenditure will typically bias downward the estimate of the elasticity (the standard attenuation effect of measurement error in an explanatory variable) there is here the additional effect of the common measurement error which will bias the elasticity upwards, and as shown in the paper by Bouis and Haddad, this effect will typically dominate. By contrast, the direct observation of calories, even if subject to measurement error, will yield estimates whose errors are independent of the measurement of total expenditure, so that the calorie elasticity, even if measured imprecisely because of the measurement errors, will not be biased in either direction.

The standard procedure for dealing with measurement error is to reestimate using an instrument for total expenditure, the obvious choice being the sum of the nonimputed components of income. This quantity is measured independently of expenditure, is well-correlated with expenditure, and is plausibly excluded from the calorie equation, conditional on expenditure. Unfortunately, the NSS consumption surveys do not collect income data.
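The role of the common measurement error can be illustrated with artificial data. In the Python sketch below, a single reporting error in food quantities enters both imputed calories and food expenditure, while nonfood expenditure carries its own independent error. The true elasticity of 0.35, the constant food share, and the error variances are all assumptions made for the illustration. Since no income variable is available in this setting, the sketch also computes the estimate obtained when measured nonfood expenditure is used as an instrument for total expenditure.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20000
beta = 0.35         # hypothetical true calorie elasticity
food_share = 0.65   # hypothetical constant food share of the true budget

# True (unobserved) quantities.
log_x_true = rng.normal(5.0, 0.5, size=n)
food_true = food_share * np.exp(log_x_true)
nonfood_true = (1.0 - food_share) * np.exp(log_x_true)
log_cal_true = 6.0 + beta * log_x_true + rng.normal(0.0, 0.2, size=n)

# Reporting errors: m in food quantities, r (independent of m) in nonfood spending.
m = rng.normal(0.0, 0.3, size=n)
r = rng.normal(0.0, 0.4, size=n)

# Calories are imputed from the mismeasured quantities, so m enters both
# measured calories and measured food expenditure.
log_cal_obs = log_cal_true + m
food_obs = food_true * np.exp(m)
nonfood_obs = nonfood_true * np.exp(r)
log_x_obs = np.log(food_obs + nonfood_obs)
log_nonfood_obs = np.log(nonfood_obs)


def cov(a, b):
    """Sample covariance between two series."""
    return np.cov(a, b)[0, 1]


# OLS slope of measured log calories on measured log total expenditure.
ols = cov(log_cal_obs, log_x_obs) / cov(log_x_obs, log_x_obs)

# Instrumental variable estimate using measured log nonfood expenditure.
iv = cov(log_cal_obs, log_nonfood_obs) / cov(log_x_obs, log_nonfood_obs)

print(f"true elasticity: {beta:.2f}, OLS: {ols:.3f}, IV (nonfood): {iv:.3f}")
```

In this design the shared error pushes the OLS estimate above the true value, while the nonfood instrument, whose own error is part of measured total expenditure, gives an estimate below it, so that the two estimates bracket the truth.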
However, Subramanian and Deaton show that, just as the OLS estimator is biased up, the (inconsistent) estimator obtained by using nonfood expenditure as an instrument is biased down. But this estimate of the elasticity is still much larger than zero (0.3, as opposed to the OLS estimate of 0.4), and is quite precisely estimated. This result is certainly consistent with the result that the calorie to PCE elasticity is biased upward by the standard method of imputing calories by converting quantities purchased, but it is not consistent with a true elasticity that is much less than 0.3, and certainly not with one that is 0.10 or even zero. If the difference between direct calorie measurement and imputed calorie measurement is correlated measurement error, we cannot sustain an argument that direct calorie estimation in Maharashtra would lead to very low estimates of the expenditure elasticity of calories.

It remains possible that the calorie data from the NSS are compromised in other ways. Bouis and Haddad observe in their Filipino data that meals given to servants by rich households tend to be understated as are meals received by servants and employees. If this is true in the Indian data, even our lowest estimates will be biased upward, although it is worth noting that the nonparametric estimates of regression functions and slopes at any given point depend only on data within the bandwidth around that point, so that estimates of the slope in the middle of the range of PCE, for example, are unaffected by measurement errors that affect only the richest and poorest.

4.2 Intrahousehold allocation and gender bias

Welfare, living standards, and poverty are all characteristics of individuals, not households, and although households are often the primary income recipients and are nearly always the units for which we observe consumption, we cannot be indifferent to the allocations to individual members of the household.
If women systematically get less than men, or if children and old people are systematically worse-off than other members of the household, social welfare will be overstated when we use measures that assume that everyone in the household is equally treated. When people are treated differently but we assume the opposite, the true distribution of welfare among family members can be obtained from the supposed equal distribution by a set of mean-preserving but disequalizing transfers. In consequence, the supposed distribution always understates true inequality and overstates true social welfare. Measures of poverty are also usually evaluated by attributing household per capita income or consumption to each member of the household, a practice that is used even when we are counting the number of women, children, or old people in poverty. Without some idea of how resources are allocated within households, these measures are little more than guesses.

Even in cases where welfare is the same for all household members, per capita consumption measures will generally not provide a correct ranking of the living standards of different households or of the members within them. Children will often require less than adults to obtain the same level of living, while old people may need more of some things (health services or warmth in cold countries) and less of others (food or work-related consumption). Members of large families may therefore be better-off than members of smaller families with the same level of resources per capita, although we have no means of making the correction unless we understand enough about intrahousehold allocation to know how much money is needed to attain equal welfare levels for different types of household members.

There are two separate issues. The first is the positive question of whether and to what extent allocations within the household differ according to the gender and age of the recipient.
The second is whether we can find a theoretical framework that will allow us to use the empirical evidence on the effects of household composition on household demand patterns to say enough about individual welfare to improve our measures of social welfare and poverty.

The issues also differ depending on whether we are dealing with gender issues, with children versus adults, or with the elderly. For issues of men versus women, or boys versus girls, we are most often concerned with whether differences exist, with their magnitude, and under what circumstances allocations depend on gender. For children versus adults, there are at least two questions. The first is the welfare of children as separate entities from their parents. Since the allocation of resources within the household is controlled by adults, and since children are not voluntary members of the families to which they belong, there can be no general presupposition that their interests are fully taken into account. The second question concerning children is about "equivalence scales," which is how much it costs to support children, whether and under what assumptions such calculations make sense, and how they can be used, if at all, to make comparisons of welfare levels between members of households of different compositions.

For the elderly, there are other issues that are only now coming into focus for developing countries. In much of Asia, the demographic transition that is either underway or recently completed is "aging" the population; reductions in mortality are leading to more elderly people, while reductions in mortality combined with reductions in fertility are increasing the ratio of the old to the young.
There is much concern, although so far little hard knowledge, about the economic consequences of this change, about changes in living arrangements of the elderly, about support mechanisms both inside and outside the extended family, about the economic burdens for working-age people, and about the effects on saving and investment (see Cowgill 1986, Deaton and Paxson 1992, and Chapter 6 for further discussion and some empirical evidence).

In this section, I shall be concerned with the positive, empirical issues of how to use household survey data to look for patterns of discrimination between individuals by gender, and particularly between boys and girls. In the next, I turn to the issue of equivalence scales.

Gender bias in intrahousehold allocation

There is a large literature documenting that, at least in some areas of the world, the allocation of household resources favors males over females. Much of this work, which is reviewed for example by Dreze and Sen (1989), Harriss (1990), Dasgupta (1993), and Strauss and Thomas (1995), is not concerned with the allocation of consumption within the family, but with the direct comparison by gender of nutrition, health, or education. Perhaps best known are the findings on excess infant mortality among girls, particularly (but not exclusively) in China, Northern India, Pakistan, and Bangladesh (see among others Government of India 1988, Sen 1989, Dreze and Sen 1989, pp. 50-54, Rosenzweig and Schultz 1982, and Coale 1991). The mechanisms underlying these differences are not fully understood, but it is widely believed that female children are given less medical attention when they are sick, and it is certainly possible that they are provided with fewer resources in general.
Other research has found more complex gender-related differences in nutrition; for example, Thomas (1994) used surveys from the United States, Ghana, and Brazil to document a positive association between a mother's education and her daughters' heights and between father's education and his sons' heights, an association that is attenuated or absent between mothers and sons and between fathers and daughters.

At first sight, household survey data on consumption do not appear to be useful for sorting out who gets what; data on consumption are for the household as a whole, not for its individual members. Even so, the surveys tell us about household composition, the numbers of household members, and the age and sex of each, so that it is certainly possible to study the relationship between household composition and household consumption patterns. Indeed, if the consumption of (say) food relative to nonfood depends on the ratio of males to females in the household, then we have established that allocations depend on gender, although we are very far from having understood what determines the differences, and in particular, whether they are the result of tastes or of discrimination. It is sometimes the case that we directly observe the consumption of individual members of the household, at least for some goods, or that there are goods that, by their very nature, are only consumed by a subset of family members. An example of the former would be surveys where each member of the household is asked to record each expenditure, some of which are directly assignable to that person's consumption. More common is the second case, where there are goods (adult clothing, women's clothing, alcohol, tobacco, or baby formula) that are exclusively used by some members or some groups and not others, such as men versus women or adults versus children.
Given a suitable theoretical structure, these features of the data can be used as a lever to pry open the black box and to observe the internal workings of the household.

A theoretical digression

There are many different theories of how resources are allocated within the household. The simplest, which has dominated empirical research until recently, is one in which households are treated as monolithic entities, endowed with preferences as if the household were an individual. This can be thought of as a dictatorial model, in which (presumably) the paterfamilias decides on behalf of everyone, so that the consumption behavior of the household will look very much like the behavior of the individual consumer of the textbook. At the other extreme, the household can be thought of as a group of individuals who bargain with each other over resources. Such models provide a richer structure to behavior than does the dictatorial model. For example, the allocation between a husband and wife will depend on what each would get should the marriage be dissolved, so that such models predict differences in household consumption patterns according to the relative earnings of each partner, unlike the dictatorial model in which all resources are pooled. It also makes sense to suppose that people in the family care for each other, and get pleasure from each other's consumption as well as from their own, or get pleasure from each other's pleasure. The consequences of these different assumptions have been explored in the literature, most notably by Manser and Brown (1980), McElroy and Horney (1981), Becker (1981), McElroy (1990), and Lundberg and Pollak (1993).

In a series of papers, Chiappori (1988, 1992), Bourguignon and Chiappori (1992), and Browning et al. (1994) have developed a methodology that is consistent with a number of these models and that permits empirical testing of their different predictions.
Under appropriate, although not always uncontroversial, assumptions, these methods allow recovery of the rules for resource allocation within the household. Their procedures are also useful for thinking about the empirical results later in this chapter and for discussing equivalence scales, so that it is useful to provide a brief introduction here.

The simplest case is where the household consists of two members A and B whose private consumption vectors are denoted $q^A$ and $q^B$, respectively. Suppose also that there is a vector of public goods $z$ that are available to each, so that the two utility functions can be written $u^A(q^A, z)$ and $u^B(q^B, z)$. Chiappori and his coauthors remain agnostic about exactly how the allocative conflict between the two people is resolved, but instead focus on the assumption that the allocation is efficient, so that, given whatever each member gets, each individual's utility function is maximized subject to the effective budget of each. Given efficiency, the optimal choice for A can be written as the solution to the problem

$$\max\; u^A(q^A, z) \quad \text{s.t.} \quad p^A \cdot q^A = \theta^A(p, p^z, y) \tag{4.4}$$

where $z$ is the optimal choice of the public good, $p$ is the price vector for all goods, $p^A$ is the price of goods consumed by A, $p^z$ is the price vector of public goods, and $\theta^A(p, p^z, y)$ is the sharing rule, the function that determines the total amount that A gets conditional on the prices of goods, including the prices of public goods, and total household resources $y$. The solution to (4.4) will be a set of demand functions

$$q_i^A = g_i^A[\theta^A(p, p^z, y),\, p^A,\, z]. \tag{4.5}$$

If we hold fixed the allocation to A, $\theta^A(p, p^z, y)$, then because these demand functions are the result of the standard maximization problem (4.4), they have all the usual properties associated with well-behaved demand functions from the textbook.
There is a precisely analogous maximization problem and set of demand functions for B, and the overall budget constraint implies that B's sharing rule satisfies

$$\theta^B(p, p^z, y) = y - p^z \cdot z - p^A \cdot q^A. \tag{4.6}$$

Although these results require the formal justification that is provided by Chiappori and his coauthors, the underlying intuition is quite straightforward. The efficiency criterion means that it is impossible to make one family member better-off without making the other worse-off, so that each person's utility must be maximized subject to whatever is spent on the goods and services entering that utility function, which is precisely the content of equation (4.4). The result clearly holds for the standard dictatorial or pooling model where the paterfamilias allocates resources to maximize a "family" social welfare function which has the individual utility functions, including his own, as arguments. But it holds much more widely, for example when the allocation is set by bargaining, or even when there is altruism, providing it is of the "caring" type, so that A and B care about each other's living standards, but not about the specific items of each partner's consumption. Different kinds of behavior will result in different sharing rules, but once the sharing rule is set, the individual demands will be characterized by equations (4.5) and (4.6).

What can this framework be used for, and how does it help us discover the allocation mechanisms within the household? Even without any further structure or assumptions, the results deliver testable predictions about behavior, so that it is possible to test whether efficiency holds, and to examine specific forms of efficiency, such as dictatorial behavior. Beyond this, and more importantly from the point of view of welfare analysis, we can ask what additional assumptions are required to identify the sharing rules. Consider first the testing problem.
Suppose that the household has two people, and three sources of income, one that accrues to A, one that goes to B, and one that is received collectively. We can think of these as individual earnings versus capital income from jointly owned assets, and they are denoted $y^A$, $y^B$, and $y^0$, with total household income $y$ given by

$$y = y^A + y^B + y^0. \tag{4.7}$$

The sharing rule will depend on all three types of income separately, so that, suppressing the public goods and the prices to simplify the exposition, we can write the (observable) household demand for good $i$ as

$$q_i = g_i^A[\theta^A(y^A, y^B, y)] + g_i^B[y - \theta^A(y^A, y^B, y)] \tag{4.8}$$

where we have chosen to eliminate $y^0$ so that we can consider the effects of varying the individual incomes $y^A$ and $y^B$ while holding total income constant, with collective income implicitly adjusted. If (4.8) is differentiated with respect to $y^A$ and $y^B$ in turn, and one derivative divided by the other, we obtain

$$\frac{\partial q_i/\partial y^A}{\partial q_i/\partial y^B} = \frac{\partial \theta^A/\partial y^A}{\partial \theta^A/\partial y^B}. \tag{4.9}$$

The right-hand side of (4.9) is independent of $i$ so that we can test for Pareto efficiency within the household by calculating the left-hand side for as many goods as we have data, computing the ratios, and testing that they are the same up to random variation. One attractive feature of (4.9) is that it is a generalization (i.e., a weaker version) of the condition that holds under dictatorial preferences. When household income is pooled, the ownership of income makes no difference, and the ratios in (4.9) should all be unity. We therefore have two nested tests, first of efficiency, and then of the more restrictive hypothesis of income pooling.

The evidence cited earlier is consistent with a failure of the dictatorial or pooling model. Moreover, using French data, Bourguignon et al. (1992) reject pooling but cannot reject the weaker restriction in (4.9).
It would be a productive exercise to replicate these tests using data from developing countries, since they provide an obvious first step in an enquiry into the structure of household allocation. It should also be noted that the predictions underlying these tests, unlike the symmetry or homogeneity predictions of the standard demand analysis literature, relate to income (or total expenditure) derivatives, and not to price effects. As a result, they require only a single cross section of survey data with no price variation. Even with time-series or panel data, where price effects can be identified, variation in relative prices is typically much less than variation in real incomes, so that tests based on the latter are both more powerful and easier to implement. On the negative side, a good deal of work remains to be done in interpreting findings that different types of income are spent in different ways. Some income sources are more regular than others, and some income flows are better measured than others. In either case, the different types of income will have different effects, whether real or apparent, and these differences would occur even if household resources are allocated according to the standard model of a unitary household.

Equation (4.9) is suggestive in another way. Since the ratios on the left-hand side are observable from the data, the right-hand side is econometrically identified, so that although this equation does not identify the sharing rule itself, it tells us something about it that we want to know, which is how the allocation to A is differentially affected by increments to the earnings of A and B, respectively. Is it possible to do even better than this, and to develop procedures that identify the rule itself? Without further assumptions, the answer is no.
The demand equation (4.8) contains not only the sharing rule, but also the two sets of demand functions, and observation of household demand on the left-hand side is not sufficient to allow recovery of all three. However, for some goods we do indeed have additional information. In particular, and adopting Bourguignon and Chiappori's terminology, goods may be assignable, in that the consumption of each person may be separately observed, or exclusive, in that consumption is known to be confined to one person. When one particular good, say good $i$, is assignable, we observe $q_i^A$ and $q_i^B$ separately, so that we can also observe (or at least estimate) the derivatives of each with respect to the three kinds of income. These derivatives are given by differentiating each of the two terms on the right-hand side of (4.8) with respect to $y^A$ and $y$ in turn and computing ratios, so that

$$\frac{\partial q_i^A/\partial y^A}{\partial q_i^A/\partial y} = \frac{\partial \theta^A/\partial y^A}{\partial \theta^A/\partial y}, \qquad \frac{\partial q_i^B/\partial y^A}{\partial q_i^B/\partial y} = -\frac{\partial \theta^A/\partial y^A}{1 - \partial \theta^A/\partial y}. \tag{4.10}$$

Since the left-hand sides of both equations are observable, we can solve the two equations for the two unknowns, which are the derivatives of the sharing rule with respect to $y$ and $y^A$. If the A and B superscripts in (4.10) are reversed, we can derive two more relationships that determine the derivatives of $\theta^A$ with respect to $y$ (again) and $y^B$, so that all the derivatives of the rule are identified or overidentified, and the rule itself is identified up to a constant. The role of assignable goods in this calculation can be taken by exclusive goods, as long as there are at least two of them, so that once again we have four ratios of derivatives corresponding to (4.10) that can be used to identify the rule.
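The logic of (4.9) and (4.10) can be checked numerically. The following is a minimal sketch, not an estimation procedure: the linear sharing rule, the three individual demand functions, and the evaluation point are all hypothetical choices made for illustration, and derivatives are taken by finite differences instead of being estimated from survey data. It shows that the ratio in (4.9) is the same for every good, and that the two ratios in (4.10) for one assignable good can be solved for the derivatives of the sharing rule.

```python
import math

# Hypothetical sharing-rule derivatives (assumptions for illustration only).
THETA_YA, THETA_YB, THETA_Y = 0.25, -0.10, 0.30


def theta_a(y_a, y_b, y):
    """A's allocation as a function of A's income, B's income, and total income."""
    return THETA_Y * y + THETA_YA * y_a + THETA_YB * y_b


def g_a(i, x):
    """Hypothetical demands of A for goods 0, 1, 2 given A's allocation x."""
    return (0.5 * x, 2.0 * math.sqrt(x), 3.0 * math.log(x))[i]


def g_b(i, x):
    """Hypothetical demands of B for goods 0, 1, 2 given B's allocation x."""
    return (0.2 * x, 1.0 * math.sqrt(x), 8.0 * math.log(x))[i]


def q_a(i, y_a, y_b, y):
    return g_a(i, theta_a(y_a, y_b, y))


def q_b(i, y_a, y_b, y):
    return g_b(i, y - theta_a(y_a, y_b, y))


def q_household(i, y_a, y_b, y):
    """Observable household demand, as in (4.8)."""
    return q_a(i, y_a, y_b, y) + q_b(i, y_a, y_b, y)


def partial(f, i, point, k, h=1e-4):
    """Central finite-difference derivative of f(i, *point) in argument k."""
    up = list(point)
    up[k] += h
    down = list(point)
    down[k] -= h
    return (f(i, *up) - f(i, *down)) / (2.0 * h)


point = (40.0, 30.0, 100.0)  # (y_a, y_b, y), with total income held fixed

# (4.9): the ratio of income derivatives should not depend on the good.
ratios_49 = [
    partial(q_household, i, point, 0) / partial(q_household, i, point, 1)
    for i in range(3)
]

# (4.10): treat good 1 as assignable and solve for the sharing-rule derivatives.
r1 = partial(q_a, 1, point, 0) / partial(q_a, 1, point, 2)
r2 = partial(q_b, 1, point, 0) / partial(q_b, 1, point, 2)
theta_y_hat = r2 / (r2 - r1)      # derivative of theta_a with respect to y
theta_ya_hat = r1 * theta_y_hat   # derivative of theta_a with respect to y_a
```

With real data the derivatives would be regression estimates with sampling error, so the equality in (4.9) would be tested statistically, not checked exactly.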
These remarkable results, although they are not (quite) sufficient to identify the allocation of resources to each family member, allow us to map out how the allocation changes in response to the distribution of earnings between household members, so that, for example, we can compare the change in the allocation to the husband of a change in the earnings of the wife with the change in the wife's allocation in response to a change in her husband's earnings. And as we shall see in the next subsection, the same techniques can be used to investigate the allocative effects of changes in variables other than the earnings of individual family members.

It is often difficult in practice to find commodities that are either assignable or exclusive, although in the longer run, survey practice can be adapted to collect better information. However, the likely candidates for exclusivity are items like men's clothing and women's clothing, and the most obvious of the assignable goods is leisure, since surveys routinely collect data on hours worked by various household members. But such commodities can only be thought of as exclusive or assignable in the narrowest sense. Most people are not indifferent to how their partner dresses or to the amount and disposition of their partner's leisure, so that, in terms of the formal model, $q^B$ appears in A's utility function and vice versa. Browning et al. (1994) provide estimates of Canadian sharing rules based on the assumption that clothing is assignable, or equivalently, that men's and women's clothing is exclusive. As they note, this "implies that wives care only about their husband's clothing insomuch as it contributes to the welfare of their husband (and vice versa). Many readers will be thoroughly skeptical of this implication." Whether it is possible to do better than this remains to be seen.
Adults, children, and gender

In the analysis so far, we have looked at the effects of different income sources on household allocations. However, a parallel analysis can be carried out for any other variables that exert income effects on demand. One such is household composition, where the needs that come with additional household members act so as to reduce the income available to each. Thinking about household composition in this way builds a bridge between the sharing rule approach and earlier treatments of the effects of household composition.

Instead of two individuals A and B, imagine the household divided into two groups, adults and children. In this context, the decisionmaking surely rests with the adults, but we can nevertheless think of a procedure that shares total resources between the children's needs, the adults' needs, and public goods that are available to both groups. This model has only one type of income, which accrues to the adults, but the role of the different types of income is now taken by the characteristics of children, their numbers and genders, which affect the consumption of the adults through the share of total income that they receive. Hence, if A stands for adult consumption, and C for the consumption of children, we might write the demand for adult goods

$$q_i^A = g_i^A(x^A, p, z^C, z^A) \tag{4.11}$$

where $z^C$ and $z^A$ are the characteristics of the children and adults respectively. The argument $x^A$ is the total expenditure that is allocated to the adults by the sharing rule

$$x^A = \theta(y, p, z^C, z^A). \tag{4.12}$$

There is a corresponding demand function for the children, and the observable household demand for each good is the sum of the two components. Note that the child characteristics affect adult demand in two separate ways in (4.11): through the amount that adults get through the sharing rule (income effects), as well as directly through the demand functions (substitution effects).
If we consider one particular change in child characteristics, namely the addition of a child to the household, the income effect is the reduction in adult consumption that is required to feed and clothe the child, while the substitution effects are the rearrangements in adult consumption that are desired (or required) in the new situation. These substitution effects are appropriately described as such because they would exist even if the adults were compensated for the income effects of the child by holding the value of the sharing function unchanged.

Paralleling the previous analysis, the identification of the sharing rule requires additional assumptions. Again, a good deal of progress can be made by working with goods that are exclusive, in this case goods or services that are consumed by adults and not by children. The standard choice for such commodities is adult clothing, alcohol, and tobacco, although in any given survey there may be other candidates. Because of the exclusivity, household consumption of these items is adult consumption, which helps identify the sharing rule in (4.12). Suppose also that there are no substitution effects for at least some of these goods, so that the only effect of children on the consumption of adult goods is through income effects. As a result, the effect of an additional child on the expenditure on each adult good ought to be proportional to the effect of income on the expenditure on that adult good. The formal result is obtained by differentiating (4.11) and (4.12) when $z^C$ is absent from the demand function (4.11), but not from the sharing rule (4.12), so that, in a manner precisely analogous to (4.9), the ratio

$$\frac{\partial q_i^A/\partial z^C}{\partial q_i^A/\partial y} = \frac{\partial \theta/\partial z^C}{\partial \theta/\partial y} \tag{4.13}$$

is the same for all such goods, and identifies part of the sharing rule. In particular, it allows us to measure in income units the effects of children (or child characteristics) on the adults' share of income.
For example, if the characteristic is the number of children, and the ratio (4.13) is 100 rupees per child (say), then the effect of an additional child on the consumption of adult goods is equivalent to a reduction of 100 rupees in the budget. Although we must have at least one good that is known to be exclusively adult, when there is more than one, the prediction that (4.13) is the same for all such goods can be used to construct a test.

This methodology is essentially that proposed by Rothbarth (1943) for measuring the costs of children, and applied in many subsequent studies, for example Henderson (1949-50a, 1949-50b), Nicholson (1949), Cramer (1969), Espenshade (1973), and Lazear and Michael (1988). Rothbarth chose a very broad group of adult goods (including saving), calculated how much the total was reduced by the presence of an additional child, and calculated the cost of the children to the adults by the amount that income would have to rise to restore the original expenditures. I shall return to the topic of child costs in the next section, but for the moment I wish to note the link between Rothbarth's technique and Chiappori's sharing rule analysis.

In this section, I use these methods to investigate another question, which is whether all children are treated equally, or whether there are differences in the effects on adult consumption according to whether the child is a boy or a girl. That there might be such differences is at least possible given the empirical evidence on discrimination against girls in education and health, and if the sharing rule approach works at all, we should expect to find a greater negative effect on adult consumption of additional boys than of additional girls. If parents treat boys better than girls, one way is to make greater reductions in their own expenditures for boys than they would for girls.
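Rothbarth's calculation, as described above, can be sketched in a few lines. The sketch below is an illustration under loudly stated assumptions: the sharing rule `adult_allocation`, the adult-goods Engel curve `adult_goods_spending`, and all numbers are hypothetical, and there are no substitution effects, so that children affect adult goods only through the adults' allocation. The cost of an additional child is found as the rise in income that restores the original expenditure on adult goods.

```python
def adult_allocation(income, n_children):
    """Hypothetical sharing rule: each child takes 12 percent of income."""
    return income * (1.0 - 0.12 * n_children)


def adult_goods_spending(income, n_children):
    """Hypothetical Engel curve for adult goods, a function of the
    adults' allocation only (no substitution effects of children)."""
    return 0.08 * adult_allocation(income, n_children) ** 1.1


def rothbarth_cost(income, n_children, tol=1e-9):
    """Income compensation that restores adult-goods spending after one
    more child is added, found by bisection on the compensation."""
    target = adult_goods_spending(income, n_children)
    lo, hi = 0.0, 10.0 * income
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if adult_goods_spending(income + mid, n_children + 1) < target:
            lo = mid   # not yet compensated: raise income further
        else:
            hi = mid   # overcompensated: lower the compensation
    return 0.5 * (lo + hi)


# Cost of a second child for a one-child household with income 1,000.
cost_of_second_child = rothbarth_cost(1000.0, 1)
```

In applied work the Engel curve would be estimated from the survey, and the answer depends on which goods are accepted as exclusively adult.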
Empirical evidence from India

The relationship between gender and the pattern of household demand has been studied by Subramanian and Deaton (1991) using the Maharashtran NSS 38th round data used for the calorie work in Section 4.1. This subsection reports a selection of their results. Maharashtra is perhaps not the best state of India in which to look for gender bias, because there is little evidence of excess infant mortality among girls in Maharashtra or in southern India in general (see Government of India 1988). However, there is no need to prejudge the issue; the Maharashtran data provide an excellent example of how the analysis works, and I shall discuss other evidence below.

The analysis is parametric and begins from the specification of a standard Engel curve linking expenditures on individual goods to total expenditure and to the socioeconomic and demographic characteristics of the household. There are many possible functional forms for such a relationship, and the one used here is based on that introduced by Working (1943), who postulated a linear relationship between the share of the budget on each good and the logarithm of total expenditure. Such a relationship has the theoretical advantage of being consistent with a utility function (see for example Deaton and Muellbauer 1980a, p. 75) and its shape conforms well to the data in a wide range of circumstances. The transformation of expenditures to budget shares and of total outlay to its logarithm induces an approximate normality in the joint density of the transformed variables, so that the regression function is approximately linear. Working's Engel curve can be extended to include household demographic composition by writing, for good $i$, $i = 1, \ldots, m$,

$$w_i = \alpha_i + \beta_i \ln(x/n) + \eta_i \ln n + \sum_{k=1}^{K-1} \gamma_{ik}\,(n_k/n) + \tau_i \cdot z + u_i \tag{4.14}$$

where $w_i$ is the share of the budget devoted to good $i$, $p_i q_i / x$, $x$ is (as usual) total expenditure, $n$ is household size, $n_j$ is the number of people in age-sex class $j$, where there are $K$ such age-sex classes in total, $z$ is a vector of other socioeconomic variables such as religion, caste, or occupation, and $u_i$ is the error term for the $i$th good. Note that $i$ denotes the good, not the observation number, and that when (4.14) is estimated as a regression, the observations are the individual households.

Table 4.2. Engel curves showing effects of household composition on selected food demands, rural Maharashtra, 1983 (t-values in brackets)

                      Rice           Wheat          Coarse cereals  Pulses         Sugar          Fruit & vegetables
ln(x/n)               -0.70 (1.8)     0.54 (4.1)    -9.82 (34.2)    -2.08 (22.0)   -1.51 (23.8)   -0.23 (2.2)
ln n                  -0.02 (0.1)     0.79 (6.3)    -0.79 (2.9)     -0.84 (9.3)    -0.50 (8.3)    -0.31 (3.2)
Males aged
  0-4                 -1.46 (0.7)     0.72 (1.0)    -4.35 (2.9)      0.10 (0.2)    -1.15 (3.4)    -0.19 (0.3)
  5-9                 -1.63 (0.8)     0.64 (1.0)    -1.07 (0.7)     -0.79 (1.7)    -1.01 (3.2)    -1.69 (3.2)
  10-14               -4.78 (2.6)    -0.04 (0.1)     2.44 (1.8)     -0.54 (1.2)    -1.07 (3.6)    -1.54 (3.1)
  15-54               -4.63 (3.3)     0.74 (1.5)     0.36 (0.3)     -0.33 (1.0)    -0.71 (3.0)    -2.02 (5.3)
  55+                 -5.11 (2.6)    -0.11 (0.2)     0.24 (0.2)     -0.66 (1.4)     0.15 (0.5)    -2.37 (4.4)
Females aged
  0-4                 -2.44 (1.2)    -0.42 (0.6)    -6.18 (4.0)     -0.70 (1.4)    -1.79 (5.3)    -0.99 (1.8)
  5-9                 -0.46 (0.2)    -0.21 (0.3)     0.27 (0.2)     -1.04 (2.1)    -1.11 (3.5)    -2.02 (3.7)
  10-14               -0.38 (0.2)     0.00 (0.0)     1.06 (0.7)     -0.54 (1.1)    -0.80 (2.4)    -1.85 (3.4)
  15-54                1.25 (0.9)     0.92 (2.0)     0.89 (0.9)      0.22 (0.7)    -0.39 (1.8)     0.51 (1.4)
F-tests of equality by sex
  0-4                  0.20           2.28           1.26            2.22           3.14           1.82
  5-9                  0.31           1.36           0.73            0.23           0.16           0.33
  10-14                4.80           0.00           0.86            0.00           0.65           0.31
  15-54               24.00           0.19           0.35            3.63           2.61          61.52
  55+                  6.62           0.03           0.03            1.83           0.22          19.54
  Children             1.77           1.19           0.97            0.81           1.31           0.81
  All                  6.48           0.75           0.67            1.44           1.41          14.91

Source: Subramanian and Deaton (1991).

In Working's formulation, the sign of the $\beta$-coefficients determines whether goods are necessities or luxuries.
When $\beta_i > 0$, the share of the budget increases with total outlay, so that its total expenditure elasticity is greater than unity, and vice versa when $\beta_i < 0$.

The way in which the demographics are entered in (4.14) is pragmatic but convenient. To the extent that household size and total expenditure are sufficient to explain demand, the terms in the summation will not be required and the $\gamma$ coefficients will be zero. Indeed, in some cases per capita outlay may be all that is required to explain the budget share, in which case both the $\gamma$'s and $\eta$'s will be zero. In general, however, household composition will matter, and the $\gamma$ coefficients tell us what are the effects of changing composition while holding household size constant, for example by replacing a man by a woman, or a young child by an older child. The $K$ demographic categories come from a disaggregation of the household by age and by sex; in the example here there are 10 categories, males and females aged 0-4 years, 5-9 years, 10-14 years, 15-54 years, and older than 54. Note that only $K - 1$ of the ratios can be entered in the regression since the sum of all $K$ is identically equal to unity. In the empirical work, the omitted category is females aged 55 and over.

As was the case for demand curves for calories in Section 4.1, the variables for household size and structure cannot be assumed to be exogenous. But as usual, the important point is to consider the consequences of endogeneity, not simply its existence. Estimates of (4.14) allow us to examine the expectation of adult goods expenditures conditional on total outlay and the structure of the household. If it is the case that the coefficients on boys and girls are different, we have found that the regression function differs by gender, so that among households at the same level of outlay, and the same household size, and the same number of adults, expenditure on adult goods depends on the sex composition of the children.
If the fact is established, its interpretation needs to be argued. Discrimination against girls is the most obvious story, and the one that I am interested in here. Other interpretations require a mechanism whereby expenditure on adult goods exerts an effect on the ratio of boys to girls. Perhaps parents with a strong taste for adult goods are also those who discriminate against girls, for example, if drunken fathers maltreat their daughters.

Table 4.2 shows the results of estimating equation (4.14) by OLS on the 5,630 rural households for cereals, pulses, and fruits and vegetables. These foods are clearly consumed exclusively neither by adults nor by children, so that the regressions are designed, not to test for gender bias, but to explore how the pattern of household demand is influenced by household composition in general and gender in particular. The first two rows of the table show the effects of (the logarithm of) PCE and of (the logarithm of) household size, and these are followed by the coefficients on the various demographic ratios. Not shown in the table, but included in the regressions, are four occupational dummies, two religion dummies, and a dummy indicating whether the household belongs to a scheduled caste or tribe.

For the five foods shown in the table, only wheat is a luxury good; as we saw in Section 4.1, coarse cereals and pulses are the basic foodstuffs, followed by rice, which here has a total expenditure elasticity insignificantly different from unity, and finally wheat. Conditional on PCE, increases in household size decrease the shares of coarse cereals, pulses, and fruits and vegetables, have little or no effect on the budget share of rice, and increase the share of wheat. I shall return to these and similar findings when I discuss economies of scale in the next section.

Although the detailed demographic effects are important in the regressions, the effects of gender are modest.
The bottom panel of the table reports a series of F-tests, the first set testing the equality of the male and female coefficients for each of five age ranges, the sixth testing the joint hypothesis that there are no gender differences among children, i.e., among the first three groups, and the seventh that there are no differences among adults, the last two demographic categories. For these five goods, there are no gender differences among children that show up as significant F-ratios for the three child groups together. The only case where there is an individually significant F-test is for rice in the 10- to 14-year-old category, where boys get less than girls, but note that this is the only significant result of 15 such tests, and can reasonably be attributed to chance. Among the adults, there are no apparent differences for wheat, coarse cereals, or pulses, but altering household composition by replacing women by men is estimated to decrease the budget shares of both rice and fruit and vegetables. The compensating changes elsewhere in the budget (the male-intensive goods) are in coarse cereals (although the coefficient is not significant in the regression) and in goods not shown here, notably in processed food, entertainment, alcohol, and tobacco.

These results by themselves tell us nothing about the cause of the gender differences, and certainly do not allow us to separate the effects of tastes from a possible gender bias in household sharing rules. Men may prefer to smoke and drink more, while women prefer to consume more high-quality food, although the difference between coarse cereals and wheat may also come from the fact that men do more heavy agricultural work, and that coarse cereals provide calories more cheaply than does wheat (see Table 4.1 above). It is also possible that men control household budgets, and do not permit women to have the same access to the "luxuries" of intoxicants, tobacco, and entertainment.
Boys versus girls in rural Maharashtra: methodology

Once we have identified adult goods, we can make inferences about the effects of gender on the allocation, at least for children. Perhaps the most direct way of doing so is to run regressions corresponding to (4.14) for adult goods, to compare the coefficients on the ratios for boys and girls, and to test the difference using an F- or t-test. If the coefficient for boys is significantly more negative than the coefficient for girls, adults are making bigger sacrifices for boys than girls, and we have evidence of discrimination.

I shall show the results of such calculations below, but it is also useful to measure the effects in a different way, and explicitly to calibrate the effects of an additional child in terms of the effects of changes in the size of the budget. In particular, I define "outlay-equivalent ratios" as the derivatives of expenditure on each adult good with respect to an additional child divided by the corresponding derivatives with respect to total expenditure, with the results expressed as a fraction of PCE. The outlay-equivalent ratio for a male child on tobacco, say, is the fraction by which PCE would have to be reduced to induce the same reduction in tobacco expenditure as would an additional male child. More formally, if $\pi_{ij}$ is the outlay-equivalent ratio for good $i$ and demographic category $j$, (4.14) implies that

$$\pi_{ij} = \frac{\partial q_i/\partial n_j}{\partial q_i/\partial x} \Big/ \frac{x}{n} = \frac{(\eta_i - \beta_i) + \gamma_{ij} - \sum_k \gamma_{ik}\,(n_k/n)}{\beta_i + w_i} \tag{4.15}$$

where, by convention, the $\gamma_{iK}$ for the last demographic category $K$ is zero. Given estimates of the parameters, the outlay-equivalent ratios can be calculated. By (4.15), they will vary according to the values of the budget share and the demographic composition of the household. Rather than track this from household to household, I follow the usual procedure and calculate the ratios at the mean values of the data.
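The formula in (4.15) follows from differentiating expenditure $x w_i$ implied by (4.14), and the algebra can be verified numerically. The following is a minimal sketch: the parameter values, household composition, and outlay are hypothetical numbers chosen for illustration (they are not estimates from the Maharashtran data), the $\tau_i \cdot z$ term is omitted, and household counts are treated as continuous so that finite differences can be taken.

```python
import math


def budget_share(x, counts, alpha, beta, eta, gamma):
    """Working-style share equation (4.14) without the z variables.
    gamma has one entry per demographic class, the last fixed at zero."""
    n = sum(counts)
    composition = sum(g * c / n for g, c in zip(gamma, counts))
    return alpha + beta * math.log(x / n) + eta * math.log(n) + composition


def outlay_equivalent_ratio(j, x, counts, alpha, beta, eta, gamma):
    """Closed form (4.15) for demographic class j."""
    n = sum(counts)
    w = budget_share(x, counts, alpha, beta, eta, gamma)
    mean_gamma = sum(g * c / n for g, c in zip(gamma, counts))
    return ((eta - beta) + gamma[j] - mean_gamma) / (beta + w)


def numerical_ratio(j, x, counts, alpha, beta, eta, gamma, h=1e-5):
    """Same ratio from finite differences of expenditure x * w."""
    def spend(x_val, c):
        return x_val * budget_share(x_val, c, alpha, beta, eta, gamma)

    up = list(counts)
    up[j] += h
    down = list(counts)
    down[j] -= h
    d_nj = (spend(x, up) - spend(x, down)) / (2.0 * h)
    d_x = (spend(x + h, counts) - spend(x - h, counts)) / (2.0 * h)
    return (d_nj / d_x) * sum(counts) / x   # divide by PCE = x / n


# Hypothetical parameters for one adult good and four demographic classes.
params = dict(alpha=0.10, beta=-0.01, eta=0.002,
              gamma=[-0.03, -0.01, 0.02, 0.0])
counts = [1.0, 1.0, 2.0, 1.0]   # people in each class
x = 500.0                        # total household outlay

pi_analytic = outlay_equivalent_ratio(0, x, counts, **params)
pi_numeric = numerical_ratio(0, x, counts, **params)
```

A negative value of `pi_analytic` for a child category says that an extra child in that category acts on the good like a cut in PCE of that proportion.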
The convenience of this procedure lies in the relationship between the outlay-equivalent ratios and (4.13); if good $i$ is indeed an adult good, and if there are no substitution effects of children on its consumption, then the outlay-equivalent ratio measures the ratio of derivatives of the sharing rule, and should be identical for all such adult goods. We can therefore calculate the outlay-equivalent ratios for a number of possible goods, and use the estimates to guide our choice of adult goods, or given a selection, test the predictions that the ratios are equal using (4.15) to calculate the ratios and (4.18) for their variances and covariances (see Deaton, Ruiz-Castillo, and Thomas 1989 for further details).

*Technical note: standard errors for outlay-equivalent ratios

Estimates of the variances and covariances of the different $\pi$-ratios can be obtained by bootstrapping, or from the variance-covariance matrix of the regression coefficients using the delta method discussed in Section 2.8.

Suppose that the regression equation for the $i$th good has the vector of parameters $b_i$, whose elements are $\alpha_i$, $\beta_i$, $\eta_i$, and $\gamma_{ij}$, $j = 1, \ldots, K-1$, plus the parameters for any other included variables. Since all the right-hand side variables are the same in the regressions for all goods, we have the classical multivariate regression model as described, for example, in Goldberger (1964, pp. 201-12). For this model, for which equation-by-equation OLS coincides with full information maximum likelihood, the variances and covariances of the OLS estimates of the $b$'s both within and across equations are given by the formula

$$E(\hat{b}_i - b_i)(\hat{b}_j - b_j)' = \omega_{ij}\,(X'X)^{-1} \tag{4.16}$$

where $X$ is the matrix of explanatory variables and $\omega_{ij}$ is the covariance between the residuals in the $i$th and $j$th equations. This residual variance-covariance matrix is estimated from the OLS residuals from each equation, so that

$$\hat{\omega}_{ij} = n^{-1} e_i' e_j \tag{4.17}$$

where $e_i$ and $e_j$ are the vectors of estimated residuals from the $i$th and $j$th equations, respectively. The covariance of any two outlay-equivalent ratios can then be obtained from (4.16) and (4.17) by the delta method, viz.

$$E(\hat{\pi}_{ik} - \pi_{ik})(\hat{\pi}_{jl} - \pi_{jl}) = \sum_r \sum_s \frac{\partial \pi_{ik}}{\partial b_{ir}}\, \omega_{ij}\, \left[(X'X)^{-1}\right]_{rs} \frac{\partial \pi_{jl}}{\partial b_{js}} \tag{4.18}$$

where the derivatives are calculated from (4.15).

Boys versus girls in rural Maharashtra: results

Table 4.3. Outlay-equivalent ratios for adult goods and possible adult goods, Maharashtra, 1983

                        Males: age group                       Females: age group
                        0-4    5-9    10-14  15-54  55+        0-4    5-9    10-14  15-54  55+
Adult goods
  Tobacco and pan      -0.42  -0.12  -0.13   0.57   0.87      -0.04  -0.01  -0.17  -0.01   0.03
  Alcohol               0.02   0.11  -0.89   0.37   0.10      -0.31  -0.02  -0.76  -0.30  -0.24
Possible adult goods
  Men's clothing       -0.39  -0.48  -0.14  -0.14   0.07      -0.56  -0.23  -0.45  -0.57  -0.52
  Women's clothing     -0.21  -0.37  -0.39  -0.54  -0.53      -0.31  -0.27  -0.40  -0.22  -0.20
  Leather footwear     -0.60  -0.77  -0.09   0.22  -0.03      -0.69  -0.12  -0.59  -0.64  -0.40
  Amusements           -0.25  -0.22  -0.46   0.97  -0.32      -0.46  -0.33  -0.35  -0.46  -0.44
  Personal care         0.00   0.02  -0.08   0.12  -0.42       0.19  -0.14   0.16   0.26  -0.13

Source: Subramanian and Deaton (1991, Table 5).

Table 4.3 shows the calculated outlay-equivalent ratios for a number of adult goods or potential adult goods from the 38th round of the NSS for rural Maharashtra. As is the case in many other surveys, there are relatively few adult goods to choose from. The two best candidates are tobacco and pan, and alcohol, and the usefulness of the latter is even compromised by the fact that only 12 percent of rural households record any purchases of alcohol.
Even so, Subramanian and Deaton treat these as "safe" adult goods, and their outlay-equivalent ratios are shown in the first two rows of the table. The NSS survey does not separate clothing and footwear into adult and child categories, but rather into male and female categories, with children's clothing included in the categories. Nevertheless, the table treats these as "possible" adult goods, along with three others, leather footwear, amusements, and personal care and toiletries.

As is perhaps to be expected from the data problems, only the results from tobacco and pan are really satisfactory. In all cases, the outlay-equivalent ratios for children are negative, indicating that additional children act like income reductions for this category, while the ratios for adults are positive, at least for the two male adult categories. There is little association between adult females and the consumption of tobacco and pan. For alcohol, there are significant negative π-ratios only for children in the 10- to 14-year-old group, and once again the positive effects of adults are confined to males. For the "possibles," while it is indeed the case that most of the child ratios are negative, so are several of those for the adults. A case could perhaps be made for amusements as an adult good, although as with tobacco and pan and alcohol, the adult effects are primarily associated with males.

If we ignore these other categories, and also alcohol on grounds of infrequency of purchase, we are left with expenditure on tobacco and pan to indicate possible gender effects in allocations to children. For the two youngest age groups of children, the effects are indeed larger negative for boys than for girls, although the difference is only significant for the youngest group, those aged less than 5 years.
The F-test on the underlying regression coefficients (of the hypothesis that the appropriate two γ coefficients are the same in (4.14)) is 4.3, which is significant at conventional levels, and which could be taken as evidence of discrimination against girls relative to boys. The fact that there are no such differences among the other age groups could perhaps be regarded as evidence in the other direction, but it should also be noted that the evidence on excess mortality of girls is of excess infant mortality, precisely the age category in which the current tests detect a difference. If the analysis is repeated using the urban Maharashtran sample, there is no perceptible difference in treatment in any age group, including the youngest, but again it can be argued that it is in the rural areas where discrimination is most likely to be found.

Subramanian and Deaton conclude that the evidence, while suggestive of discrimination against young girls in rural areas, is by no means conclusive. It is unlikely to persuade someone whose priors are strongly against the existence of such discrimination. Nevertheless, the issue is of great importance, and the results here are sufficiently positive to make it seem worthwhile repeating this exercise for other states for which the NSS data are available.

Subramanian (1994) has repeated the analysis using the 43rd round of the NSS (1987-88) on rural households from Andhra Pradesh, Haryana, Maharashtra, Punjab, and Rajasthan. The three Northwestern states of Haryana, Punjab, and Rajasthan are places where there is strong evidence of discrimination against girls; excess mortality among girls is high, and female literacy is low. For example, the 1991 census showed that among women aged seven and over, only 12 percent were literate in Rajasthan, 24 percent in Andhra, 33 percent in Haryana, 41 percent in Maharashtra, and 44 percent in Punjab.
The results on pan and tobacco from the 1983 survey reappear for the 5- to 9-year age group in the later data from Maharashtra, but not in any of the other states. Perhaps Maharashtra is different in some way, or perhaps the Maharashtran results should be attributed to chance.

Côte d'Ivoire, Thailand, Bangladesh, Pakistan, and Taiwan (China)

Additional evidence on expenditure patterns and gender comes from other countries. In the first application of the method in Deaton (1989b), I looked at the effects of boys and girls on adult expenditures in Côte d'Ivoire and Thailand using the 1985 Living Standards and 1980-81 Socioeconomic Surveys, respectively. In Côte d'Ivoire, it is impossible to reject the hypothesis that the outlay-equivalent ratios are equal for a set of seven adult goods consisting of adult clothing, adult clothing fabric, adult shoes, alcohol, tobacco, meals taken away from home, and entertainment. For these goods taken together, the outlay-equivalent ratios are -0.12 for boys aged 0-4, and -0.49 for boys aged 5-14, while the corresponding estimates for girls are -0.22 and -0.48. The two pairs of numbers are not significantly different; if anything, the numerical estimates are in favor of young girls. There is certainly no evidence of favoritism toward boys. (As in the Indian data, there are gender effects among adults, with expenditure on alcohol and tobacco again associated with men.)

For the Thai data, the results are similar. For a three-good collection of adult goods consisting of tobacco, alcohol, and meals eaten away from home the outlay-equivalent ratios are -0.47, -0.52, and -0.34 for boys aged 0-2, 3-4, and 5-14, whereas the corresponding ratios for girls are -0.30, -0.36, and -0.22, all slightly smaller than the boys' ratios, but only the last is significantly so, a result that comes from apparent gender differences in expenditures on meals taken away from home.
However, the tests for the equality across goods of the outlay-equivalent ratios give strong rejections, so that it is not legitimate to combine tobacco, alcohol, and outside meals into a single adult group. Even if we ignore the problem, the differences between the boys' and girls' ratios are hardly impressive.

Of course, Thailand and Côte d'Ivoire, like the state of Maharashtra in India, are not the places where we would expect to find evidence of discrimination against girls in the allocation of household expenditure. In West Africa, as in Thailand, women are economically productive, and girls are not seen as a burden on their parents. Much better laboratories for these techniques are provided by Bangladesh and Pakistan, two countries where there is a great deal of other evidence on differential treatment by gender. Ahmad and Morduch (1993) have applied the same methodology to data from the 1988 Household Expenditure Survey from Bangladesh. Like Subramanian (1994), they find no evidence of bias in favor of boys even though the survey itself shows that the number of boys exceeds the number of girls by 11 percent overall and 13 percent in rural areas, and although calculations using the 1988-89 Child Nutrition Survey show that the nutritional status of boys responds more to increases in household income than does the nutritional status of girls. These results suggest that the consumption-based methodology fails to detect gender bias, even when it is well documented by other measures.

Ahmad and Morduch's results are confirmed for Pakistan by my own calculations using the 1984 Household Income and Expenditure Survey. Table 4.4 shows the coefficients for boys and girls in the regressions (4.14) for tobacco and pan, and for men's and women's footwear, both defined to exclude children's footwear.
As in Maharashtra, the results suggest that tobacco and pan are consumed by adult males, but unlike the Maharashtran case, there is no clear pattern of decreases in consumption in response to the presence of children. Boys aged 3-4 and girls aged 5-14 are the only child categories with negative coefficients, but neither the coefficients themselves nor their differences by gender are significantly different from zero. The evidence for the two footwear categories is much clearer, but like Ahmad and Morduch's results for Bangladesh, it appears to indicate identical treatment for boys and girls. Expenditure on both men's and women's footwear decreases in response to small children in the household, and for the two youngest child groups, those aged 4 and under, the coefficients are essentially the same for boys and girls. For the older children, aged 5-14, the results show that boys use men's footwear and girls use ladies' footwear, so that the F-tests show significant differences by gender, but this is no evidence of discrimination.

Table 4.4. Tests for gender effects in adult goods, Pakistan, 1984

                  Tobacco and pan    Men's footwear    Women's footwear
ln(x/n)             0.20   (2.9)     -0.33  (21.5)     -0.17  (17.4)
ln n                0.04   (0.6)      0.12   (7.7)      0.06   (5.9)
Males
  0-2               1.08   (2.4)     -0.35   (3.5)     -0.78  (12.3)
  3-4              -0.58   (1.2)     -0.15   (1.4)     -0.80  (11.4)
  5-14              0.03   (0.1)      0.11   (1.5)     -0.82  (17.8)
  15-54             2.07   (6.1)      1.41  (18.8)     -0.71  (14.9)
  55+               0.50   (1.2)      1.20  (12.6)     -0.69  (11.3)
Females
  0-2               0.22   (0.5)     -0.33   (3.2)     -0.75  (11.8)
  3-4               0.11   (0.2)     -0.30   (2.7)     -0.77  (10.9)
  5-14             -0.21   (0.6)     -0.31   (4.2)     -0.62  (13.1)
  15-54            -0.52   (1.8)     -0.00   (0.0)      0.15   (3.7)
F-tests
  0-2               3.24              0.05              0.25
  3-4               1.67              1.48              0.16
  5-14              0.74             47.50             27.17
  All children      1.85             16.31              9.14
  15-54            78.61            469.86            436.38
  All adults        1.34            157.89            236.69

Note: All coefficients multiplied by 100, t-values in parentheses.
Source: Author's calculations based on 1984 Household Income and Expenditure Survey.
Similar results were obtained by Rudd (1993), who analyzed Taiwanese data from 1990. While the evidence on gender bias is less clear in Taiwan (China) than in Bangladesh and Pakistan, work by Greenhalgh (1985) and by Parrish and Willis (1992) is consistent with the traditional view that Chinese families prefer boys to girls. But as in the other studies, Rudd finds no evidence of discrimination against girls in the expenditure data.

What should be concluded from this now quite extensive evidence? It is a puzzle that expenditure patterns so consistently fail to show strong gender effects even when measures of outcomes show differences between boys and girls. One possibility is that the method is flawed, perhaps by pervasive substitution effects associated with children, so that the outlay-equivalent ratios are not revealing the income effects of children, but are contaminated by the rearrangement of the budget that follows a change in family composition. While such substitution effects are likely enough, they are hardly an adequate explanation for those cases where the estimated coefficients are so similar for boys and girls. If boys are treated better than girls, and the results do not show it because of substitution, then the substitution effects associated with girls should be different from those for boys, and of exactly the right size to offset the discriminatory income effects. Such an explanation is far-fetched compared with the simple message that comes from the data, that parents (and where adult goods are largely alcohol and tobacco, parents means fathers) make the same amount of room in their budgets for girls as they do for boys. Perhaps discrimination requires action by mothers, who do not have access to adult goods, which are the prerogative of men.
If so, the discrimination against girls in (northern) India, Pakistan, Bangladesh, and possibly Taiwan (China) must take forms that are not detectable in the expenditure data. Ahmad and Morduch put forward three possible explanations: that girls have greater needs than boys (at least at some point); that certain critical interventions, such as access to a doctor when sick, are made for boys but not for girls; and that the discrimination is too subtle to be detected in even the sizes of samples available in household survey data. They cite Das Gupta's (1987) work showing that discrimination against girls is confined to those with older sisters as an example of the latter, and it is certainly true that neither this effect nor the occasional medical expenditure on boys would likely be detected using the adult-good method. It may also be that where medical expenses require families to go into debt, parents may be willing to incur debt to preserve an asset (a boy) but not to preserve a liability (a girl). It is also possible that discrimination works through allocations of time, not money, with mothers taking less time away from farm or market work after the birth of a girl. Indeed, Ahmad and Morduch find in Bangladesh that total household expenditure, which is treated as a conditioning variable in regressions like (4.14), responds negatively to the presence of small children in the household, and does so by more when the small child is a boy. However, the much larger Pakistani and Indian data sets used here show no such effects.

There is clearly a good deal more work to be done in reconciling the evidence from different sources before the expenditure-based methods can be used as reliable tools for investigating the nature and extent of gender discrimination.
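The delta-method calculation behind the standard errors of the outlay-equivalent ratios, (4.16) to (4.18), can be illustrated with a small sketch. It is not taken from any of the studies discussed in this section. The function names, the toy ratio b[2] / b[1], and every number are assumptions made for illustration only; for the actual π-ratios the gradient would have to be derived from (4.15).

```python
def delta_cov(omega_ij, xtx_inv, grad_i, grad_j):
    """Covariance of two ratios in the form of (4.18):
    omega_ij * sum_r sum_s grad_i[r] * (X'X)^{-1}[r][s] * grad_j[s]."""
    return omega_ij * sum(
        grad_i[r] * xtx_inv[r][s] * grad_j[s]
        for r in range(len(grad_i))
        for s in range(len(grad_j))
    )


def toy_ratio_gradient(b):
    """Gradient of the illustrative ratio b[2] / b[1] with respect to b.
    This ratio is a stand-in, not the outlay-equivalent ratio of (4.15)."""
    return [0.0, -b[2] / b[1] ** 2, 1.0 / b[1]]


# Hypothetical inputs: three estimated coefficients for one good, the
# matrix (X'X)^{-1}, and the residual variance omega_ii for that good.
b_hat = [0.1, 0.5, -0.2]
xtx_inv = [
    [0.01, 0.0, 0.0],
    [0.0, 0.02, 0.005],
    [0.0, 0.005, 0.04],
]
omega_ii = 0.5

grad = toy_ratio_gradient(b_hat)
variance = delta_cov(omega_ii, xtx_inv, grad, grad)  # own variance: i = j
std_error = variance ** 0.5
print(variance, std_error)
```

Passing the gradients of two different ratios, together with the cross-equation residual covariance, gives the covariance needed to test whether the ratios are equal across goods.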
4.3 Equivalence scales: theory and practice

In Chapter 3, where I discussed the measurement of welfare and poverty, I made the conversion from a household to an individual basis by dividing total household expenditure by the number of people in the household, and then used total household expenditure per capita as the measure of welfare for each member of the household. Not only does this procedure assume that everyone in the household receives an equal allocation (an assumption that can perhaps be defended as the best that can be done given current knowledge), but it also fails to recognize the fact that not everyone in the household is the same and has the same needs. While it is true that children consume special goods, they surely require less of most things than do adults. It is also possible that there are economies of scale to living together, perhaps because family members benefit from each other's consumption, or because there are public goods that can be used by all family members at no additional cost. Since the variation of needs within the household and the existence of economies of scale are commonplace observations that can hardly be contested, it would seem that we ought to be able to do better than measure welfare by total household expenditure per capita.

The obvious solution is a system of weights, whereby children count as some fraction of an adult, with the fraction dependent on age, so that effective household size is the sum of these fractions, and is measured not in numbers of persons, but in numbers of adult equivalents. Economies of scale can be allowed for by transforming the number of adult equivalents into "effective" adult equivalents, so that if two cannot live as cheaply as one, four adult equivalents can perhaps live as cheaply as three single adults.
How these adult equivalents should be calculated, and whether it even makes sense to try, have been (occasional) topics of discussion in the economics literature for more than a century, although they have never really attracted the attention that is merited by their practical importance. Perhaps as a result, there is no consensus on the matter, so that the views presented in this section are necessarily my own, although I shall attempt to present and discuss the difficulties and the alternative interpretations.

Before discussing equivalence scales, I note one important approach that allows us to finesse the issue, at least in some cases. In the discussion of poverty and social welfare in Section 3.1, we saw that under some circumstances it is possible to tell whether poverty is greater or less without specifying the poverty line, even though all poverty measures require a poverty line for their calculation. For example, when the distribution function for one distribution lies entirely above that for another, the former shows more poverty whatever the line (see pp. 164-69 above). This is one case of stochastic dominance, and Chapter 3 showed how an examination of the various kinds of stochastic dominance allows us to rank at least some distributions in terms of poverty, inequality, and social welfare.

In a series of papers, Atkinson and Bourguignon (1982, 1987, 1989) have laid out an approach based on bivariate stochastic dominance which allows some distributions to be ranked without specifying equivalence scales for different family types. Atkinson and Bourguignon require that family types be ranked according to need, something that is much less demanding than constructing equivalence scales. Although we might have some difficulty telling whether (for example) a couple or a single person with two children has greater needs, it is easy to agree (for example) that two adults and two children need more resources than two adults and one child.
Having ranked the family types, and on the assumption that the two distributions contain the same fractions of the different types, the two distributions are compared in the following way. Starting with the neediest type only, we check whether there is first-order (poverty line), second-order (social welfare, poverty deficit), or Lorenz (inequality) dominance, whichever is the concept of interest. If the dominance test is passed, the test is then applied to the neediest and second-neediest types pooled. If dominance holds all the way up to the last test, when all family types are combined, then one distribution is preferred to the other for any set of equivalence scales consistent with the original ranking by need. As usual, dominance tests rank only some pairs of distributions, and there will be cases where the ranking is ambiguous, just as when Lorenz curves or distribution functions cross. Nor is the dominance approach designed to help in making comparisons between families with different compositions in a single distribution; it will not tell us whether, given their respective incomes, larger families are better-off than smaller families.

Equivalence scales, welfare, and poverty

Given the difficulties of definition and of measurement, it would be convenient if the calculation of equivalence scales were unimportant for the measurement of welfare and poverty. Unfortunately such is not the case. As emphasized by Kuznets (1979), and in all household survey data of which I am aware, total household expenditure rises with household size, but not as rapidly, so that PCE decreases with household size.
For example, in the data from rural Pakistan, the regression coefficient of the logarithm of total expenditure on the logarithm of total household size is 0.61, so that the regression coefficient of the logarithm of PCE on the logarithm of household size is -0.39; the corresponding figures for rural Maharashtra are 0.70 and -0.30, and for the Ivorian Living Standards Surveys are 0.58 and -0.42 in 1985, 0.63 and -0.37 in 1986, and 0.65 and -0.35 in 1987 and 1988.

Given these correlations, the relationship between household size and individual welfare will depend on just how the equivalences are calculated. At one extreme, it might be argued that economies of scale and altruism are so strong that total household expenditure is the appropriate measure of individual welfare. At the other extreme, we might refuse to admit any variation of needs, arguing that all persons are equal whatever their age and sex, so that individual welfare should be measured by total household expenditure per capita. These two procedures would certainly seem to set bounds on the ideal correction. But given the correlations in the data, the use of total household expenditure is effectively an assumption that welfare rises with household size and that small households are overrepresented among the poor. Conversely, the procedures in Chapter 3, which use PCE as the welfare measure, assume the contrary, that large household sizes are automatically associated with lower welfare and with greater poverty. Either of these relationships between household size and welfare could be correct, but they need to be demonstrated, not assumed.

Equivalence scales are required to compare poverty across different groups. In most societies, the elderly live in households that are relatively small and contain few children while children, who never live by themselves, live in households with children.
In consequence, if we use equivalence scales that attribute low needs to children, or that incorporate large economies of scale, we will find that there are relatively few children in poverty, but a relatively large number of the elderly. For example, the "fact" that there is less poverty among the elderly in the United States depends on the assumption in the official counts that the elderly need less than other adults (see Deaton and Paxson 1997). International comparisons of poverty and inequality are also sensitive to the choice of equivalence scale (see Buhmann et al. 1988), something that is often given inadequate advertisement when the performance of different countries is presented in popular discussion.

The relevance of household expenditure data

For many economists and demographers since Engel more than a century ago, it has seemed that survey data on household expenditures can tell us something about equivalence scales. As we have already seen in the previous section, such data can be used to study how demand patterns vary with the demographic composition of the household. Several authors, starting with Dublin and Lotka (1930) and perhaps most recently Lindert (1978, 1980), have used the data to associate particular expenditures with children, and thus to calculate the "cost" of a child on an annual basis at different ages, or as in Dublin and Lotka, to estimate the cost of bringing a child to maturity, an amount that they treat as an input cost in their calculation of the "money value of a man." Such costs can readily be translated into an equivalence scale by dividing by the total budget; in a household of two adults and a child where child costs are one-fifth of the budget, the adults get two-fifths each, so a child is "equivalent to" one-half of an adult.

At first sight, such procedures seem unobjectionable.
There are certain child goods, like education, whose costs can be attributed to the children (although even here it is unclear how we are to deal with the fact that richer parents spend more on the children's education than poorer parents, and whether this expenditure is to be credited entirely to the children), and the other nonexclusive expenditures can be prorated in some essentially arbitrary but sensible way. As we have seen above in the discussion of sharing rules, such prorating is hardly a trivial exercise, and what may seem like obvious procedures are typically incorrect. For example, it is tempting to identify child costs by estimating Engel curves in which expenditures are regressed on income (or outlay) and the numbers of adults and children, with the coefficients on the latter estimating how much is spent on each good for each child at the margin. However, if such regressions are applied to all the goods in the budget, and if total expenditure is included as an explanatory variable, the sum of the child coefficients must be zero. Additional children do not enlarge the budget, they cause it to be rearranged. In consequence, the regression method will only yield a positive estimate of costs if the regressions are restricted to a subset of goods. The inclusion of goods into or exclusion of goods from this subset determines the value of the scale and thus needs clear justification. Of course, exactly the same criticisms apply to Dublin and Lotka's cruder methods; while their procedure has the advantage of transparency, it also rests on counting the positive effects of children on expenditures as child costs, while ignoring the negative effects.
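The adding-up argument can be checked numerically. The sketch below is only an illustration under stated assumptions: the three goods, the way budget shares respond to children, the sample size, and the small `ols` helper are all invented for the purpose, and nothing here uses actual survey data. Because the expenditures on the three goods exhaust the budget, regressing each on a constant, total outlay, and the number of children gives child coefficients that sum to zero and outlay coefficients that sum to one.

```python
import random


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


rng = random.Random(0)
X = []
spend = [[], [], []]  # hypothetical goods: food, child goods, adult goods
for _ in range(200):
    outlay = rng.uniform(1.0, 10.0)
    kids = rng.randint(0, 4)
    # Assumed behavior: children shift the budget toward food and child
    # goods and away from adult goods; shares are rescaled to sum to one.
    raw = [
        0.50 + 0.03 * kids + rng.gauss(0, 0.02),
        0.10 + 0.05 * kids + rng.gauss(0, 0.02),
        0.40 - 0.08 * kids + rng.gauss(0, 0.02),
    ]
    total = sum(raw)
    X.append([1.0, outlay, float(kids)])
    for g in range(3):
        spend[g].append(outlay * raw[g] / total)

coefs = [ols(X, y) for y in spend]       # one regression per good
child_sum = sum(b[2] for b in coefs)     # sum of child coefficients
outlay_sum = sum(b[1] for b in coefs)    # sum of outlay coefficients
print([round(b[2], 3) for b in coefs], child_sum, outlay_sum)
```

Dropping the adult-goods regression from the sum leaves a positive total, which is the sense in which the measured "cost" depends entirely on which goods are included.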
While it is important to understand why the estimation of the cost of a child is not simply a matter of counting up child expenditures, it must be emphasized that the criticism is directed at one specific method, and does not demonstrate that it is impossible to estimate child costs from household budget data in general. Indeed, as we have already seen in the previous section, Bourguignon and Chiappori's work establishes that, given appropriate assumptions, it is possible to use household data to identify at least some aspects of the rules whereby resources are allocated to different groups of people within the household. There also exist a number of specific methods for calculating child costs from survey data, some of considerable antiquity, and each resting on its own set of more or less palatable assumptions. One of my purposes in this section is to clarify and discuss these assumptions, to rule out some clearly unacceptable methodologies, and to attach the appropriate "health warnings" to the others. But before I can undertake that task, it is necessary to review some of the general theory that underlies all of these measures.

The theory of equivalence scales and the cost of children is based on an analogy with the theory of cost-of-living indices and consumers' surplus. Since that theory is useful in its own right in development applications, and since it generalizes the results used in Section 3.3 on the welfare effects of price changes, I begin with a brief survey.

Cost-of-living indices, consumer's surplus, and utility theory

In standard consumer choice theory, a consumer maximizes a utility function $u(q)$ by choosing a bundle of goods to satisfy the budget constraint $p \cdot q = x$.
If the utility-maximizing choice results in a utility value of $u$, expenditure $x$ must be the minimum cost of attaining $u$ at prices $p$, so that if this minimum cost is represented by the cost function (or expenditure function) $c(u,p)$, we can write

(4.19) $c(u,p) = x$.

The maximum attainable utility can also be written as a function of total expenditure $x$ and prices $p$ using the indirect utility function

(4.20) $u = \psi(x,p)$

which is (4.19) rewritten; both equations link total expenditure, utility, and prices and each can be obtained from the other by rearrangement.

The theory of cost-of-living numbers was developed by the Russian economist Konüs in the 1920s, and rests on a comparison of the costs of obtaining the same utility at two different sets of prices. Formally, we can write

(4.21) $P(p^1, p^0; u^R) = c(u^R, p^1) / c(u^R, p^0)$

where $u^R$ is a "reference" or base utility level, typically either utility in period 0 or in period 1, so that the scalar value $P$ is the relative cost of maintaining the standard of living $u^R$ at the two sets of prices $p^1$ and $p^0$. Instead of a ratio of costs, we might also compute the difference of costs, which is simply

(4.22) $D(p^1, p^0; u^R) = c(u^R, p^1) - c(u^R, p^0)$.

The measure $D$ is known as the compensating variation when reference utility is base-period utility, and as the equivalent variation when period 1's utility is the reference. These consumer surplus measures are simply different ways of expressing the same information as the Konüs cost-of-living index numbers.

It is important to realize that, given knowledge of the demand functions, the index numbers (4.21) and cost differences (4.22) are known or at least can be calculated. Although utility values are not observable (any monotone increasing transform of a utility function will have the same behavioral consequences as the original), so that it might seem that (4.21) is unobservable, this is not in fact the case.
The essential point is that the derivatives of the cost function are the quantities consumed, which are directly observable, so that given the value of the cost function at one set of prices, its value at another can be obtained by integration. Since the literature does not always adequately recognize the observability of cost-of-living indices and equivalent and compensating variations, I include a brief technical note detailing the calculations. It can be skipped without losing the thread of the main argument.

*Technical note: calculating the welfare effects of price

The quantity demanded of good i can be written

(4.23) $q_i = \partial c(u,p) / \partial p_i = h_i(u,p)$

where $h_i(u,p)$ is the Hicksian (or compensated) demand function for good i. However, we can use the indirect utility function (4.20) to rewrite (4.23) as

(4.24) $q_i = h_i(u,p) = h_i(\psi(x,p), p) = g_i(x,p)$

where $g_i(x,p)$ is the familiar Marshallian (or uncompensated) demand function that gives demands as a function of total outlay and prices. These Marshallian demands are what we can hope to estimate from the data, and we need a procedure for evaluating the cost-of-living numbers from knowledge of these functions alone.

To illustrate, suppose that we wish to use base-period utility $u^0$ as the reference. Since $c(u^0, p^0) = x^0$, outlay in the base period, the only quantity that needs to be calculated for (4.21) and (4.22) is the counterfactual $c(u^0, p^1)$, the minimum cost of obtaining base utility at the new prices. As a first stage, write this as

(4.25) $c(u^0, p^1) = c(u^0, p^0) + \int_{p^0}^{p^1} \sum_k [\partial c(u^0, p) / \partial p_k] \, dp_k$

which by (4.23) and (4.24) is

(4.26) $c(u^0, p^1) = c(u^0, p^0) + \int_{p^0}^{p^1} \sum_k g_k(c(u^0, p), p) \, dp_k$.

The principle of the calculation can now be seen. We start from the initial price vector $p^0$.
For this value, the terms in the integral are known from the demand function, so we can use (4.26) to calculate the new value of $c(u^0, p)$, not for $p^1$, but for some other price close to $p^0$ in the direction of $p^1$. Given this new value, we can move to a new price, so that by dividing up the path from $p^0$ to $p^1$ into many small steps, we can calculate (4.26) as accurately as we want, with more steps giving greater accuracy. The details of how to do this in practice need not concern us here; elegant algorithms have been proposed by Vartia (1983) and Hausman (1981) for the case where the parametric form of the Marshallian demands is known, while Hausman and Newey (1995) show how to perform the calculations when demands are estimated by the nonparametric kernel techniques discussed in Chapter 3. For the purposes of this chapter, what is required is the demonstration of the principle, that because the derivatives of the cost function are known or can be estimated, the cost-of-living indexes and consumers' surplus measures are known or can be estimated.

Equivalence scales, the cost of children, and utility theory

Consider now the extension of cost-of-living theory to the theory of the cost of children. At first sight the analogy seems close to perfect, with children, or more generally household characteristics, playing the role of prices. In particular, it is supposed that the cost function depends on demographic characteristics $z$, so that, for a utility-maximizing household, total expenditure is the minimum cost of reaching their utility level, and (4.19) is replaced by

(4.27) $c(u,p,z) = x$.

This equation is not without its problems, and in particular it does not make clear whose utility level is referred to. We are no longer dealing with the single agent of the standard cost-of-living theory, and there is more than one possible interpretation of (4.27).
For example, we might be dealing with parents' welfare, with the cost function calculating the minimum expenditure required to maintain parents' utility in the face of social conventions about what needs to be spent on their offspring. Alternatively, each person in the household might have a separate utility function, and the sharing rule between them is set so that all obtain the same welfare level, in which case (4.27) is the minimum cost of bringing each member to the utility level $u$. In the next section, I shall sometimes have to be more explicit, but for the moment I note only that (4.27) is consistent with a number of sensible models of intrahousehold allocation.

The equivalence scale compares two households with compositions $z^0$ and $z^1$, in exactly the same way that the cost-of-living index number compares two price levels. Hence, if $u^R$ and $p^R$ are the reference utility level and price vector, respectively, the equivalence scale is written, by analogy with (4.21),

(4.28) $m(z^1, z^0; u^R, p^R) = c(u^R, p^R, z^1) / c(u^R, p^R, z^0)$.

In the simplest case, $z^0$ would represent a family of two adults and $z^1$ a family with two adults and a newborn child, so that using the prechild utility level and current prices as base, the excess of (4.28) over unity would be the cost of the newborn as a ratio of total household expenditure. Alternatively, we can follow the consumers' surplus analogy, and measure not the equivalence scale, which is an index number, but the cost of children, which corresponds to a consumers' surplus measure. This last is the cost of the difference in household characteristics, cf. (4.22),

(4.29) $D(z^1, z^0; u^R, p^R) = c(u^R, p^R, z^1) - c(u^R, p^R, z^0)$

which, in the simple example above, would be the amount of money necessary to restore the original welfare level.
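Both sides of the analogy can be put into numbers with a short sketch. Everything specific in it is an assumption made for illustration: Cobb-Douglas preferences with the budget shares in SHARES, the price vectors, a base outlay of 100, and a cost function in which each child scales household cost by half an adult's share. The first part follows the stepwise integration of (4.26), recovering $c(u^0, p^1)$ from the Marshallian demands alone and comparing it with the closed form available for this special case. The second part evaluates (4.28) and (4.29) for a cost function with children that is simply assumed, not estimated.

```python
import math

SHARES = [0.5, 0.3, 0.2]  # assumed Cobb-Douglas budget shares


def marshallian(x, p):
    """Marshallian demands g_k(x, p) = a_k * x / p_k under Cobb-Douglas."""
    return [a * x / pk for a, pk in zip(SHARES, p)]


def cost_by_integration(x0, p0, p1, steps=2000):
    """Approximate c(u0, p1) as in (4.26), using only the demand functions.

    The path from p0 to p1 is split into small steps; at each step the
    cost is raised by quantities demanded times the price changes.
    """
    c = x0
    step = [(b - a) / steps for a, b in zip(p0, p1)]
    for s in range(steps):
        p = [a + d * s for a, d in zip(p0, step)]
        c += sum(q * d for q, d in zip(marshallian(c, p), step))
    return c


def exact_cost(x0, p0, p1):
    """Closed form for this special case: x0 * prod((p1_k / p0_k) ** a_k)."""
    return x0 * math.prod((b / a) ** s for s, a, b in zip(SHARES, p0, p1))


def cost_with_children(x_couple, n_children, child_weight=0.5):
    """Assumed c(u, p, z): cost for a couple scaled by adult equivalents."""
    return x_couple * (2.0 + child_weight * n_children) / 2.0


p0, p1, x0 = [1.0, 1.0, 1.0], [1.5, 1.0, 2.0], 100.0

c1 = cost_by_integration(x0, p0, p1)
konus_index = c1 / x0              # (4.21) with base-period utility
compensating_variation = c1 - x0   # (4.22) with base-period utility

scale = cost_with_children(x0, 1) / cost_with_children(x0, 0)       # (4.28)
child_cost = cost_with_children(x0, 1) - cost_with_children(x0, 0)  # (4.29)

print(konus_index, compensating_variation, scale, child_cost)
```

Raising `steps` brings the integrated cost closer to the closed form, as the technical note describes. The second half has no such check: the scale it reports is only as good as the assumed cost function.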

The underidentification of equivalence scales

The analogy between the cost of living and the cost of children runs into difficulty when we consider how the theoretical concepts are to be implemented. As before, I assume that we know the demand functions, although this now requires more than before, since we must know not only how demands respond to incomes and to prices, but also how they respond to changes in household composition. Of course, these demand functions have to be estimated, but the survey data contain the information to do so. But that done, and in contrast to the cost-of-living case, knowledge of the demand functions is not sufficient to identify the equivalence scale (4.28) nor the cost measure (4.29), a result due to Pollak and Wales (1979). I first sketch their argument, then interpret the result, and finally consider possible remedies.

Suppose that we start from a consumer whose preferences are represented by a cost function augmented by compositional variables as in (4.27) and that we track through the relationship between the cost function and the observable demands as in (4.23) and (4.24). With the addition of the $z$-vector, (4.24) becomes

(4.30)    $q_i = h_i(u,p,z) = h_i[\psi(x,p,z),p,z] = g_i(x,p,z)$

where $\psi(x,p,z)$ is the indirect utility function and the Hicksian demand functions $h_i$ are the partial derivatives of the cost function as in (4.23). Now consider a new (and different) cost function that is constructed from the original one by means of

(4.31)    $\tilde{c}(u,p,z) = c[\xi(u,z),p,z]$

where $\xi(u,z)$ is a function that is increasing in its first argument, utility, so that in the new cost function, as in the old, it costs more to be better-off. Since $\xi$ can be any increasing function, the new cost function and the old can be quite different, and are capable of embodying quite different attitudes of parents to their children, and quite different sharing rules between them.
The new Hicksian demand functions are given by differentiating the new cost function, so that by (4.31) they are related to the old Hicksian demands by

(4.32)    $q_i = \tilde{h}_i(\tilde{u},p,z) = h_i[\xi(\tilde{u},z),p,z]$

where $\tilde{u}$ is the utility level associated with the new cost function and the original total outlay $x$. However, since (4.31) is equal to $x$, it must be true that

(4.33)    $\xi(\tilde{u},z) = \psi(x,p,z).$

By the definition of the new cost function, the expression $\xi(\tilde{u},z)$ plays the same role as did $u$ in the original cost function, and in particular both are equal to the value of the original indirect utility function. As a result, when we convert from the Hicksian to the Marshallian demand functions by substituting for utility, we obtain, from (4.32) and (4.33),

(4.34)    $q_i = h_i[\psi(x,p,z),p,z] = g_i(x,p,z)$

so that the new demand functions are identical to the old.

The two cost functions, the new and the old, although representing different preferences and different costs of children (or other compositional variables), result in the same observable behavior. As a result, if all we observe is the behavior, it is impossible to discover which cost function generated the data. But the costs of children depend on which is correct, so neither the scale nor the cost measure can be calculated from knowledge of the demand functions alone. This is the essence of Pollak and Wales' argument.

Like all situations where there is a failure of identification, the result does not mean that measurement is impossible, only that additional information is required, and in the next subsections, where I discuss specific proposals for estimating equivalence scales, I shall be explicit about the source of that additional information. However, before going on to that analysis, there are several general issues of interpretation with which we have to deal.

It is sometimes argued that it makes no sense even to try to measure the cost of children.
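The Pollak and Wales argument in (4.31) to (4.34) can be checked numerically. The sketch below is only an illustration under assumed forms: the Cobb-Douglas-type cost function, the parameters `ALPHA0`, `DELTA`, `THETA`, and the transformation that multiplies utility by exp(kappa * z) are all hypothetical choices, not taken from the text. It computes the demand for one good from the old and the new cost function, and the equivalence scale (4.28) implied by each.

```python
import math

# Hypothetical parameters for an illustrative cost function (not from the text).
ALPHA0, DELTA, THETA = 0.5, 0.1, 0.3

def cost(u, p, z):
    """Original cost function c(u, p, z): Cobb-Douglas in prices, shifted by z."""
    a = ALPHA0 + DELTA * z              # budget share of good 1 depends on z
    return u * (1.0 + THETA * z) * p[0] ** a * p[1] ** (1.0 - a)

def make_new_cost(kappa):
    """New cost function as in (4.31), with utility rescaled by exp(kappa * z)."""
    def cost_new(u, p, z):
        return cost(u * math.exp(kappa * z), p, z)
    return cost_new

def marshallian_good1(cost_fn, x, p, z, eps=1e-6):
    """Demand for good 1 at outlay x: solve cost = x for u, then differentiate in p1."""
    u = x / cost_fn(1.0, p, z)          # valid here because cost is linear in u
    up = cost_fn(u, (p[0] + eps, p[1]), z)
    down = cost_fn(u, (p[0] - eps, p[1]), z)
    return (up - down) / (2.0 * eps)    # Hicksian demand by central difference

def scale(cost_fn, u_ref, p_ref, z1, z0):
    """Equivalence scale (4.28) at reference utility and prices."""
    return cost_fn(u_ref, p_ref, z1) / cost_fn(u_ref, p_ref, z0)

cost_new = make_new_cost(kappa=0.2)
p, x = (1.0, 2.0), 100.0
demand_old = marshallian_good1(cost, x, p, z=1)
demand_new = marshallian_good1(cost_new, x, p, z=1)
scale_old = scale(cost, 1.0, (1.0, 1.0), z1=1, z0=0)
scale_new = scale(cost_new, 1.0, (1.0, 1.0), z1=1, z0=0)
print("demands:", demand_old, demand_new)   # the two demands coincide
print("scales: ", scale_old, scale_new)     # the two scales differ
```

In this example the observable demand is the same under both cost functions, while the implied scale changes with the arbitrary parameter `kappa`, which is the sense in which the demand data cannot identify the scale.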
While fertility control is not perfect, parents generally have children as a result of conscious choice, and are presumably made better-off by the birth of a child. By this argument, the costs of a child (the benefits of losing a child) are negative, perhaps infinitely so, so that to consider the additional expenditures associated with the child as a welfare cost to the parents makes no more sense than to claim that the cost of a car is a welfare loss to its purchaser. While this argument admits the reality of the costs, it would deny their relevance for welfare calculations, and views the identification problem as a symptom of the absurdity of trying to measure the happiness that a child brings to its parents by counting the cost of clothing and feeding them. If we insist on trying to define equivalence scales simply by making cost functions depend on household characteristics, the argument is entirely valid. If welfare in these equations is taken to be that of the parents, then the choice of the function $\xi$ affects the welfare that parents get from their children without affecting the household demand functions, so that it is perfectly legitimate to interpret Pollak and Wales' underidentification argument in terms of a story of endogenous fertility. Whatever additional information is used to make the identification, it must clarify exactly what costs and benefits are being considered, and somehow rule out the direct benefits that parents get from the existence of their children.

Even if we were to suppose that having children is involuntary, and that all births are unanticipated events over which parents have no control, then the equivalence scales would still not be identifiable from demand functions alone. The most obvious case is if children were simply smaller versions of their parents, with identical tastes, but needing less of everything.
Suppose that for every unit of each good that the parent receives, the child gets an amount a< 1, which in this simple model is the equivalence scale. It is obvious that no amount of household expenditure data can identify the parameter a; reallocations within the household have no effect on what the household buys, only on how it is shared when the purchases are taken home. Fortunately children are not homunculi; they have different needs from their parents, and there are goods that are consumed only by adults and goods that are consumed only by children. Even so, this example makes clear the potential dangers of attaching welfare significance to the relation- ship between demographic composition and household expenditure patterns; in this case, the distribution of welfare can be changed arbitrarily without observable consequences for household demand. What happens if the identification problem is ignored, if a cost function like (4.27) is specified, the demand functions derived and estimated, and the equiva- lence scales calculated? Indeed, this is the precise counterpart of the standard practice in demand analysis without compositional variables. Because the cost function is so closely linked to the demand functions, it is convenient to specify utility-consistent demand functions, not by writing down a utility function, but by writing down a cost function, typically one that allows a general pattern of re- sponses to price and income-a flexible functional form. Differentiation and substitution from the indirect utility function as in (4.23) and (4.24) yield the demand functions whose parameters (which are also the parameters of the cost function) are then estimated from the data. 
It is also possible to write down flexible functional forms for the cost function augmented by compositional variables, for example by augmenting one of the standard flexible forms such as the translog (Christensen, Jorgenson, and Lau 1975) or the "almost ideal demand system" (Deaton and Muellbauer 1980b). The translog indirect utility function is a quadratic form in the logarithms of the ratios of price to total expenditure, so that a translog with demographics can be generated simply by extending the vector of price ratios to include the demographics (see Jorgenson and Slesnick 1984 and Jorgenson 1990). This methodology will give results in spite of Pollak and Wales' underidentification theorem, but because of that theorem, we know that the results on the equivalence scales can be altered by respecifying the cost function or indirect utility function in a way that will have no effect on the estimated demand equations, and which is therefore uncheckable on the data. Identification has been achieved by an arbitrary selection of one from an infinity of observationally equivalent utility functions, each of which will generate a different equivalence scale. But since this selection is not justified, the identifying assumptions remain implicit, and there is no straightforward way of judging whether or not they make sense. The lesson of the underidentification result is not that scales cannot be estimated, but that scales that are not supported by explicit assumptions are scales that cannot be treated seriously.

Engel's method

The oldest of all the methods for constructing equivalence scales dates back to Engel (1857). Although many other procedures have been invented since, Engel's is one of the most straightforward, and is still widely used in practice.
It is based on the identifying assumption that the share of the budget devoted to food expenditure correctly indicates welfare between households of differing demographic composition. A large household and a small household are equally well-off if, and only if, they devote the same fraction of their budget to food. I shall discuss the theoretical basis of this assumption and its plausibility at the end of this subsection, but I begin with an explanation of how the assumption allows calculation of the scale, and with some illustrative calculations.

Figure 4.7 shows the standard diagram for the Engel method. On the vertical axis is plotted the food share and on the horizontal axis total household expenditure. For any given household composition, Engel's Law implies that there is a negative relationship between the food share and total expenditure, and the figure illustrates two curves, one for a small family, AB, and one for a large family, A'B'. At the same level of total outlay, the large family spends a larger share of its budget on food, so that its curve lies above and to the right of that for the small family. If we start from some reference point on the curve for the small family, say the combination $x^0$ and $w^0$ as illustrated, then we can use the identifying assumption to calculate the amount of total expenditure that the large household would require to be as well-off as the smaller family with $x^0$. In the diagram, the large family has budget share $w^0$ at $x^1$ so that, by the assumption, it requires $x^1$ to compensate it for its larger family size. If, for example, the larger family is two adults plus a child, and the smaller family is two adults, the cost of the child is $x^1 - x^0$, and the equivalence scale (the cost of a child relative to an adult couple) is $(x^1 - x^0)/x^0$.

[Figure 4.7. Engel's method for measuring the cost of children. Vertical axis: food share; horizontal axis: outlay; curves shown for the small household and the large household.]

In practice, the scale would be calculated using an estimated food Engel curve. Equation (4.14) above is one possible functional form and will serve to illustrate and provide concrete results. For a reference family of two adults, the food share is given by

(4.35)    $w_f^0 = \alpha + \beta \ln x^0 + (\eta - \beta)\ln 2 + \gamma_a$

where $\gamma_a$ is the $\gamma$ coefficient for adults, where I have suppressed any difference between males and females, and where the other $\gamma$ coefficients are absent because the household is all adult, with ratio of number of adults to household size of unity. For a household with two adults and a child, the corresponding equation is

(4.36)    $w_f = \alpha + \beta \ln x + (\eta - \beta)\ln 3 + \gamma_a(2/3) + \gamma_c(1/3)$

where $\gamma_c$ is the coefficient for the ratio of the appropriate child category. The compensating level of expenditure $x^1$ is obtained by setting (4.36) equal to (4.35) and solving for $x$. Hence,

(4.37)    $\ln\left(\dfrac{x^1}{x^0}\right) = \left(1 - \dfrac{\eta}{\beta}\right)\ln\dfrac{3}{2} + \dfrac{\gamma_a - \gamma_c}{3\beta}.$

If $\eta = 0$, so that the food share is independent of family size holding PCE constant, and if $\gamma_a = \gamma_c$, so that switching adults for children has no effect on food consumption, the ratio of $x^1$ to $x^0$ is simply the ratio of family sizes, here 3 to 2. But even if $\eta$ is zero, this can be expected to overstate the compensation required because $\gamma_a > \gamma_c$ (adults eat more than children) and $\beta$ is negative (Engel's Law) so that the last term in (4.37) will be negative.

Table 4.5 shows the coefficients from the food share regressions from the Indian and Pakistani data. The $\beta$-coefficients are both negative, as they must be for Engel's Law to hold, and so are the coefficients on the logarithm of household size, so that in both data sets, the food share decreases with household size when PCE is held constant. This result, which is a good deal stronger in the Indian than in the Pakistani data, means that, if we were to accept Engel's contention that the food share indicates welfare, larger households behave as if they are better-off than smaller households with the same PCE.
(Whether this is a sensible interpretation will be discussed below.) The coefficients on the demographic ratios show the expected pattern, increasing with the age of the group; small children consume less food than older children, who consume less than adults.

Table 4.5. Regression coefficients of food share in India and Pakistan

Coefficient of              Maharashtra, India    Pakistan
ln(x/n)                          -0.1270          -0.1016
ln n                             -0.0355          -0.0133
Ratio of adults 15-54             0.0070           0.0128
Ratio of children 0-4            -0.0231          -0.0205
Ratio of children 5-9            -0.0116          -0.0020
Ratio of children 10-14          -0.0049           0.0037

Source: India, Subramanian and Deaton (1991); Pakistan, author's calculations based on Household Income and Expenditure Survey, 1984-85.

The calculations corresponding to (4.37) are shown in Table 4.6. The numbers show the estimated costs of a family of two adults plus one additional person of various ages, all calculated relative to the costs of a childless couple. For the Indian data, a child aged between 0 and 4 years is equivalent to 0.24 of a couple, or 0.48 of an adult, a ratio that rises to 56 percent and 60 percent for children aged 5 to 9 and 10 to 14, respectively. The age groups for Pakistan are a little older, and the estimates a little higher; 56 percent of an adult for the youngest category, and 72 percent and 76 percent for the two older groups. However, these estimates must be compared with the last row of the table, which shows the equivalence scale when an additional adult is added to the household. The third adult is only 68 percent of each of the original pair in India and 84 percent in Pakistan, so that even in the latter, the method yields large economies of scale.
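The arithmetic behind these scales can be sketched in a few lines. The example below takes the coefficients from Table 4.5 and applies the formula obtained by equating (4.36) to (4.35), that is, ln(x1/x0) = (1 - eta/beta) ln(3/2) + (gamma_a - gamma_c)/(3 beta). The dictionary layout and the function name `engel_scale` are my own illustrative choices; because the published coefficients are rounded, the results should agree with the tabulated scales only to about two decimal places.

```python
import math

# Coefficients from Table 4.5: beta on ln(x/n), eta on ln n,
# and the gamma coefficients on the demographic ratios.
COEF = {
    "Maharashtra, India": {"beta": -0.1270, "eta": -0.0355, "15-54": 0.0070,
                           "0-4": -0.0231, "5-9": -0.0116, "10-14": -0.0049},
    "Pakistan": {"beta": -0.1016, "eta": -0.0133, "15-54": 0.0128,
                 "0-4": -0.0205, "5-9": -0.0020, "10-14": 0.0037},
}

def engel_scale(coef, group):
    """Ratio x1/x0 for a couple plus one person in `group`, relative to a couple."""
    beta, eta = coef["beta"], coef["eta"]
    gamma_a, gamma_new = coef["15-54"], coef[group]
    log_ratio = (1.0 - eta / beta) * math.log(1.5) \
        + (gamma_a - gamma_new) / (3.0 * beta)
    return math.exp(log_ratio)

scales = {
    region: {g: engel_scale(c, g) for g in ("0-4", "5-9", "10-14", "15-54")}
    for region, c in COEF.items()
}
for region, row in scales.items():
    print(region, {g: round(v, 2) for g, v in row.items()})
```

Setting the added person's group to "15-54" makes the last term vanish, so the adult row isolates the economies of scale that come from the coefficient on the logarithm of household size.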
But these economies of scale also operate for the child equivalence scales in the first three rows, so that when we say that a child is 48 percent of an adult, the effect is as much the consequence of the apparent economies of scale that operate for adults and children alike, as it is a consequence of children costing less than adults. One way of purging the child scales of the effects of the economies of scale is to look at the first three rows as ratios of the fourth row so as to measure the cost of an additional child relative to the cost of an additional adult. By this measure, children are very expensive indeed, with even the youngest child costing more than three-quarters of an adult.

Table 4.6. Equivalence scales using Engel's method, India and Pakistan

Age        Maharashtra, India    Pakistan
0-4              1.24              1.28
5-9              1.28              1.36
10-14            1.30              1.38
15-54            1.34              1.42

Note: The numbers shown are the estimated ratio of the costs of a couple with the child as shown to a couple without children.
Source: Table 4.5.

Such results appear to be typical of the Engel method in developing countries; for example, Deaton and Muellbauer (1986) report even higher estimates from Sri Lanka and Indonesia. In their work as in the results reported here, the estimates are large because the coefficients of the demographic ratios do not differ very much between adults and children, so that the replacement of an adult by a child does not shrink the food share by much relative to the amount that the food share would be reduced by an increase in total outlay. By the Engel method and its identifying assumption, this finding translates into children being almost as expensive as adults.

Although we have no standard by which to judge these estimates other than crude observation and intuition, by these criteria they seem very high. However, we must be careful about the source of such intuition and in particular rule out other estimates obtained from different models.
When another methodology is used, the identifying assumption is different from that in the Engel method, so that even though the results are called "child equivalence scales" or "child costs," we are in effect measuring different things. A comparison between scales from different models is not the same thing as a comparison of, say, the expenditure elasticity of food from different models. Models of child costs, unlike models of demand, not only provide estimates, they also provide the definition of what is being estimated. As a result, the validity of the Engel estimates can only be tested by considering the basic assumptions, and trying to decide whether or not they make sense. It is to that task I now turn. Underlying the Engel methodology are two empirical regularities, and one assertion. The first regularity is Engel's Law itself, that the share of food in the budget declines as income or total outlay increases. The second regularity is that, with resources held constant, the food share increases with the number of chil- dren. The assertion, which was made by Engel himself, is that the food share is a good indicator of welfare. More precisely, if we rank households (inversely) according to their food shares, we have correctly ranked them in terms of well- being, and the procedure can be applied across households of different demo- graphic compositions. It is important to note that this is indeed an assertion, and not an implication of the two empirical regularities. The truth of Engel's Law certainly implies that among households with the same demographic composition, those with higher food shares are generally those with the lower levels of income, and other things constant, with lower levels of welfare, but this is no more than a restatement of Engel's Law itself. 
Because the presence of additional children tends to increase the household's food share, it is true that additional children affect the budget in the same direction as does a reduction in income, but that is very different from a demonstration that an increase in income sufficient to restore the food share is the precise amount needed to compensate for the additional expenditures associated with the children.

Because food is so important in the budgets of poor households, the assertion that the food share indicates welfare has a superficial plausibility; food is the first necessity and, at least among the very poor, we can probably do a good job of assessing welfare by checking whether people have enough to eat. But the claim needs to be argued, and the primacy of food is not by itself enough to establish Engel's assertion. Even if our main concern is with food, and if we believe that food consumption is a rough but useful measure of welfare, why focus on the share of food in the budget in preference to more direct measures such as food consumption or nutrient intake?

That the share of the budget on food does not correctly indicate welfare over households of different compositions has been convincingly argued by Nicholson (1976). His argument runs as follows. Consider the standard case of a child born to a previously childless couple, and suppose that we know the true compensation, defined as the amount of money needed to provide for the child without cutting into the parents' consumption. If this compensation is paid, the parents are exactly as well-off as before, and will presumably consume much the same pattern of goods as before. However, the child's consumption patterns are different from those of its parents; in particular, they will be relatively biased towards food, which is one of the few goods consumed by small children.
As a result, even when the correct compensation has been paid, the consumption pattern of the household will be tipped towards food in comparison with the pattern prior to the birth of the child. But according to Engel, the food share is an inverse indicator of welfare, so that the household is worse-off, and requires further compensation to reduce the food share to its original level. As a result, Engel compensation is overcompensation, and the estimates of child costs using the Engel methodology are too high. Nicholson's argument has its weaknesses, for example in not being based on an explicit model of allocation, and in not allowing for the substitution in adult consumption that is likely to come about from the presence of the child. But it is nevertheless convincing. All that is required for its general validity is that the compensated household's food share is increased by the presence of a child, something that is hard to dispute. Note again that this is not a question that can be settled with respect to the empirical evidence; we are discussing the plausibility of an identifying assumption, whether or not we would find compelling a claim for compensation that cited as evidence an increase in the food share after the addi- tion of a child to the household. The point of Nicholson's argument is to show that the "food share identifies welfare" assumption is unsupported, that it does not follow from the importance of food in the budget nor from the validity of Engel's Law, and that it is likely to lead to an overestimation of child costs and child equivalence scales. Because the argument is persuasive, it must be concluded that the identifying assumption of the Engel methodology is not an acceptable one. The method is unsound and should not be used. Rothbarth's method In Section 4.2 on sex bias, I used expenditure on adult goods to detect differential treatment of children by gender. 
This methodology is a simple extension of that suggested by Rothbarth in 1943 for measuring the costs of children. Rothbarth's idea was that expenditures on adult goods could be used to indicate the welfare of the adults, so that, if additional children reduce such expenditures, it is because of the resources that have been redirected towards the children. By calculating how much of a reduction in income would have produced the same drop in expenditures on adult goods, Rothbarth calculated the decrease in income that was equivalent to the additional children and used this as his measure of child costs. As I argued in the previous section, this method is effectively the same as the identification of a sharing rule on the assumption that there exist goods that are exclusively consumed by one group in the household, in this case adults. In his paper, Rothbarth used a very broad selection of adult goods, including virtually all luxuries and saving, but the subsequent literature has used much narrower definitions of adult goods, often confined to alcohol, tobacco, and adult clothing.

Although we are now using a different indicator of welfare (expenditure on adult goods rather than the food share), the procedure for calculating the Rothbarth measure is similar to that for calculating the Engel scale. Figure 4.8 corresponds to Figure 4.7, but instead of the food share we plot expenditure on adult goods against total outlay; the graph has a positive slope on the supposition that adult goods are normal goods. The larger household spends less on adult goods at the same level of total outlay, so that, if we pick the original $x^0$ as the reference outlay for the small household, the cost of children is once again $x^1 - x^0$.

[Figure 4.8. Rothbarth's method for measuring the cost of children. Vertical axis: expenditure on adult goods; horizontal axis: outlay; curves shown for the small household and the large household.]

Engel curves of the form (4.14) are not quite so convenient for the Rothbarth calculations as for the Engel method, although it is possible to follow through the calculations that are parallel to (4.35) through (4.37). The equation for the share of adult goods in the budget is used to derive the expenditure on adult goods for both the reference family and the larger family, and equating the two defines the compensating value $x^1$. For this version of Working's Engel curve, the equation has no explicit analytical solution but can be readily solved iteratively. However, we need not take the trouble here since most of the calculations have already been done in the previous section.

Table 4.3 lists the outlay-equivalent ratios for boys and girls corresponding to various possible adult goods in the Maharashtran data. Up to a linear approximation, these ratios (with the sign changed) tell us how much total outlay would have to be increased in order to restore the adult expenditure to its level prior to addition of one more child. For tobacco and pan, which is the most reliable of the adult goods, the six child-gender combinations generate outlay-equivalent ratios of between 0.42 and 0.01, so that if we take 0.20 (say) as representative, the results suggest that an additional child costs rather less than half of an adult. For the Pakistani results in Table 4.4, the outlay-equivalent ratios are higher, a little less than a half for girls using men's footwear, and between a third and a half for boys using women's footwear, so that an additional child costs two-thirds or more of an adult. Deaton and Muellbauer (1986) used data from Sri Lanka and Indonesia together with the extreme assumption that all nonfoods are adult goods; they found that children cost between 20 and 40 percent of an adult.
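The iterative solution mentioned above for the Rothbarth compensating outlay can be sketched as follows. This is a minimal illustration, not the procedure used for the tables: the Working-type share equation for adult goods and every parameter value (`A`, `B`, `ETA`, `GAMMA_CHILD`) are hypothetical, and bisection is simply one convenient root-finder that works here because adult expenditure rises with outlay.

```python
import math

# Hypothetical coefficients for an adult-goods share equation of Working's form.
A, B, ETA, GAMMA_CHILD = 0.02, 0.01, 0.0, -0.01

def adult_share(x, n, child_ratio):
    """Budget share of adult goods for outlay x, household size n."""
    return A + B * math.log(x / n) + ETA * math.log(n) + GAMMA_CHILD * child_ratio

def adult_expenditure(x, n, child_ratio):
    """Expenditure on adult goods: share times total outlay."""
    return x * adult_share(x, n, child_ratio)

def rothbarth_outlay(x0, lo, hi, iters=200):
    """Outlay x1 at which a couple plus child spends as much on adult goods
    as the childless couple does at x0; found by bisection on [lo, hi]."""
    target = adult_expenditure(x0, 2, 0.0)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if adult_expenditure(mid, 3, 1.0 / 3.0) < target:
            lo = mid                     # too little adult spending: raise outlay
        else:
            hi = mid
    return 0.5 * (lo + hi)

x0 = 1000.0
x1 = rothbarth_outlay(x0, lo=x0, hi=2.0 * x0)
print("compensating outlay:", x1)
print("cost of child relative to the couple:", (x1 - x0) / x0)
```

With real data, the share equation would be the estimated Engel curve for the chosen adult goods, and the bracket `[lo, hi]` would need to be checked to contain the solution.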

The specific estimates are of less interest than the general point that the Rothbarth scales are a good deal smaller than the Engel scales, in these examples about half as much. There is no general result here; the relationship between the Rothbarth and Engel scales will depend on which goods are selected as adult goods, and on the empirical responses of food and adult goods to outlay and to demographic composition. However, in the special case where children consume only food, so that all nonfoods can be treated as adult goods, then Engel's Law implies that the Engel scale must be larger than the Rothbarth scale. The argument is from Deaton and Muellbauer (1986) and runs as follows. Suppose that, when a child is born, the parents are paid the Rothbarth compensation, so that expenditure on adult goods, here nonfood, is restored to its original level. But since positive compensation has been paid, total expenditure has risen with nonfood expenditure unchanged, so that the share of food in the budget has risen. According to Engel, this is a decrease in welfare, and insufficient compensation has been paid. The Engel scale is therefore larger than the Rothbarth scale.

What then are the problems with the Rothbarth methodology, and is its identifying assumption, that expenditure on adult goods is an indicator of adult welfare, any more reasonable than Engel's assertion about the food share? There are certainly a number of practical problems. As we saw in the previous section, it is not always easy to find convincing examples of adult goods, either because the survey does not collect such data (men's and women's clothing rather than adult or child clothing) or because adult goods are consumed by very few households (alcohol in Muslim societies).
And while it is hard enough to find goods that are consumed only by adults, it can be even harder to find goods that are consumed only by adults and where it is plausible that children do not affect their consumption through substitution effects. Babies may not go to the movies, nor eat meals in restaurants, but their presence may alter their parents' consumption of movies and restaurant meals, even when they have been fully compensated for the costs of the children.

When we have a group of possible adult goods, so that it is possible to test the restriction that all give the same estimate of child costs, the restrictions are sometimes rejected, leaving the choice of adult goods controversial and arbitrary (see the discussion of the evidence from Thailand in Deaton 1989b or from Spain in Deaton, Ruiz-Castillo, and Thomas 1989). Another difficulty arises if adult goods as a group have an expenditure elasticity of zero or close to it, in which case it becomes impossible to calculate the compensating changes in total expenditure. If this happens, the theory is simply not working, since the children do not affect adult expenditures in the same way as do changes in income. If additional children cause changes in expenditure on alcohol and tobacco, and if income has no such effects, then children must be exerting substitution effects and the goods cannot be used as adult goods. However, this difficulty seems to occur only in developed countries when alcohol and tobacco are the adult goods (see Cramer 1969); in work on developing countries that I have seen, adult expenditures, including expenditures on alcohol and tobacco, respond strongly to changes in total outlay.

From a theoretical perspective, there are certainly problems with the Rothbarth identifying assumption, but they are less severe than those associated with that required for the Engel method.
Nicholson's argument destroys the foundation of the Engel approach, but the arguments against the Rothbarth procedure are more about details, and with its failure to include some factors that might be important. And while we know that the Engel procedure biases upwards the estimates of the scales, it is much harder to sign any bias associated with Rothbarth's method. The most serious arguments are concerned with the possible substitution ef- fects of children, with the rearrangements in the budget that the children cause even when compensation has been paid. One model of substitution is that children exert price-like effects that cause substitution away from the goods in which they are most intensive. The idea here, first formalized by Barten (1964), is that goods that are consumed by both adults and children become more expensive to the adult than goods that are only consumed by adults. For example, on a visit to a restaurant, the father who prefers a soft drink and who would order it were he alone, finds that in the company of a child his soft drink is twice as expensive but that a beer costs the same, and so is encouraged to substitute towards the latter. If so, perfectly compensated adults will consume more adult goods in the presence of children than they would without, so that the Rothbarth compensation would be too small. Barten's analogy between children and price effects is both elegant and insightful, but it is hard to believe that this is the only way that substitution effects operate. Much of the budget reallocation in the presence of children is concerned with the need to allocate time differently, and it is just as likely that compensated adults will cut adult expenditures as increase them. 

We must also question whether it is appropriate to measure adult welfare as indicated by their consumption of adult goods, and beyond that, whether the welfare of adults as indicated by expenditures on such goods can tell us anything useful about the welfare of other members of the household. Of course, the Rothbarth methodology does not claim that adults get no welfare from other goods, only that their welfare is a monotonic increasing function of adult goods' expenditures. But the demand function for adult goods will certainly also depend on the relative price of adult goods, and so we must be careful not to apply the method to compare welfare over situations where such price changes have taken place. More generally, and even in developing countries, the price of adult time is likely to be sensitive to the presence of children, and may affect purchases of adult goods, especially time-intensive adult goods. As far as the welfare of nonadults is concerned, it is certainly possible to imagine situations in which the welfare of the adults is monotonically related to the welfare of the children, as for example when the parents allocate resources so as to equalize welfare over all members of the family. If such an assumption makes sense in context, there is a justification for using Rothbarth scales to correct total household expenditure, and to attribute the results to each member of the household.

In the next subsection, I briefly discuss a number of other methods for calculating equivalence scales, but none of these is as readily implemented as are either the Engel or Rothbarth methods, and in none is the identifying assumption so transparent. As a result, the choice for practical policy applications is either to do nothing (or at least to work with either total household expenditure or PCE) or to use one of the Engel or Rothbarth methods.
Since the Engel method is indefensible, and since the attribution of total household expenditure to each household member requires quite implausible economies of scale, the choice is between calculating equivalents according to Rothbarth, or using PCE as in Chapter 3. More realistically, since it would be cumbersome to calculate a set of equivalents on a case by case basis, a modified Rothbarth alternative would be to choose a set of scales in line with the results that are generally obtained by the method, for example a weight of 0.40 for young children aged from 0 to 4, and 0.50 for children aged from 5 to 14. These numbers are obviously arbitrary to a degree, but there is no good evidence that can be marshaled against them, and they are broadly consistent with results from a procedure that has much to commend it.

In Chapter 3, I followed standard practice, using PCE for welfare and poverty calculations, rather than these equivalence scales. However, the decision was more for convenience of presentation than from intellectual conviction. Indeed, welfare measures based on PCE certainly overstate the costs of children, and understate the welfare levels of members of large families relative to those of small families. Whether or not the procedure of counting children at about half of adults is considered to be an improvement depends on the weights we give to the various arguments and compromises that underlie these estimates. The measurement of welfare and poverty would rest on a much firmer footing if there existed a solid empirical and theoretical basis for the construction of equivalence scales. But given current knowledge, numbers like a half are as good as we have, and in my view, it is better to use such estimates to construct welfare measures than to assume that everyone is equal as we do when we work with per capita measures.
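The modified Rothbarth weights suggested above are simple to apply to survey records. The following is a minimal sketch, not a procedure taken from the text: the function names, the example household, and the convention that everyone aged 15 or over counts as one adult are assumptions made for illustration (the text gives weights only for children aged 0 to 4 and 5 to 14).

```python
def adult_equivalents(ages, w_young=0.40, w_older=0.50, adult_age=15):
    """Count household members in adult equivalents.

    Children aged 0-4 count as w_young and children aged 5-14 as w_older.
    Counting everyone aged adult_age or more as 1 is an assumption; the
    text specifies weights only for children.
    """
    total = 0.0
    for age in ages:
        if age < 5:
            total += w_young
        elif age < adult_age:
            total += w_older
        else:
            total += 1.0
    return total


def expenditure_per_equivalent(total_expenditure, ages):
    """Total household expenditure divided by the number of adult equivalents."""
    return total_expenditure / adult_equivalents(ages)


# Hypothetical household: two adults, a 10-year-old, and a 3-year-old.
ages = [34, 31, 10, 3]
x = 2900.0
pce = x / len(ages)                              # per capita expenditure
per_equiv = expenditure_per_equivalent(x, ages)  # per adult equivalent
print(pce, per_equiv)
```

Because every child weight is below one, expenditure per adult equivalent exceeds PCE for any household with children, which is the sense in which PCE overstates the costs of children relative to these scales.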
Other models of equivalence scales

There are several other methods for estimating equivalence scales, and at least one of these, the method proposed by Prais and Houthakker (1955), has been used in the development literature. The basic idea behind this method, which traces back to earlier work by Sydenstricker and King (1921), is to write the household demand functions in the form

(4.38)  $p_i q_i / m_i = f_i(x / m_0)$

where $m_i$ and $m_0$ are commodity-specific and "general" scales that are functions of household composition, and are to be thought of as measuring the need for each good and for total outlay of different types of household. A household with children would have large commodity-specific scales for child foods, children's clothing and education, and the overall scale would reflect each of these specific needs. Indeed, because the budget constraint must hold, the general scale in (4.38) can be defined as the solution to

(4.39)  $\sum_i m_i f_i(x / m_0) = x.$

This model is estimated by specifying functional forms for each $m_i$ in terms of the observable demographic characteristics, substituting into (4.38), and estimating (4.38) and (4.39) as a set of nonlinear demand equations. This apparently straightforward procedure is incomplete, at least conceptually if not numerically, because the model is not identified, a result first shown by Muellbauer (1980) (see also Deaton and Muellbauer 1980a, pp. 202-05 for further discussion). To make things worse, researchers sometimes select functional forms for the demand functions (such as double logarithmic demands) that do not permit the budget constraint (4.39) to be satisfied exactly. This failure of the model can result in the empirical version of the model being econometrically identified, essentially because of a failure of approximation to the theoretically underidentified true model, and there are examples in the literature where authors have thus succeeded in estimating a model that is theoretically unidentified.
Parameters obtained in such a way are clearly of no practical value. In order to obtain genuine identification, it is necessary to have prior information about at least one of the commodity scales, a result that should come as no surprise given the earlier discussion of identification in general. Note too the close relationship with Bourguignon and Chiappori's approach, where the identification of the sharing rule is obtained by finding goods that are exclusive or assignable, and with Rothbarth's, where we specify exclusively adult goods. In the Prais-Houthakker method, an obvious way to proceed would be to identify the model by assuming that the commodity-specific scale for adult goods is unity. Having identified the model in this or some other way, estimation can proceed and scales be obtained. However, it is not clear what is gained over the much simpler Rothbarth methodology. The calculation of the Prais-Houthakker scales requires estimation of a potentially large system of nonlinear equations, a task that is a good deal easier than it once was, but the effort requires a justification that is not apparent, at least to this author.

I have already mentioned a second alternative, the model originally proposed by Barten (1964), and extended by Gorman (1976). Barten's procedure is even less suitable than Prais and Houthakker's for applied work, but it contains an important insight that I shall use again in the next subsection, so that it is worth a brief statement. Barten built the idea of commodity-specific scales into the framework of utility theory, writing the household utility function (best thought of as the utility of each member of the household) in the form

(4.40)  $u = u(q_1 / m_1, \ldots, q_n / m_n)$

where the $m$'s have essentially the same commodity-specific scale interpretation as in Prais and Houthakker's model.
The maximization of (4.40) subject to the usual budget constraint $p \cdot q = x$ can be conveniently rewritten by defining the scaled quantities $q_i^* = q_i / m_i$ and scaled prices $p_i^* = p_i m_i$, so that the consumer's problem can be rewritten as the maximization of $u(q^*)$ subject to the budget constraint that $p^* \cdot q^* = x$. This problem immediately generates standard-form demand functions in which $q_i^*$ is a function of outlay and the scaled prices $p^*$, so that

(4.41)  $q_i / m_i = g_i(x, m_1 p_1, \ldots, m_n p_n).$

Note first that if all the $m_i$ were identical and equal to household size, (4.41) would give the sensible result that consumption per head is equal to the standard Marshallian demand function of outlay per head and prices. (Recall that demand functions are homogeneous of degree zero.) More generally, when the scaling factors differ from one commodity to another, the income reduction effects will still be present, working by scaling up each price by a greater or lesser amount. However, and this is the central insight of the Barten model, there are also substitution effects. Goods that are child-intensive are relatively expensive in a household with many children, and family decisionmakers will substitute against them.

While the recognition of the possibility of such substitution is an important contribution of the Barten model, the model is clearly incomplete as it stands. The fundamental identification of equivalence scales comes from the assumption that household composition acts entirely analogously to prices, so that, not surprisingly, the model cannot be estimated without data with price variation (see again Muellbauer 1980 and Deaton and Muellbauer 1980a, pp. 202-5). But the analogy with price, while conceivably a part of the story, is surely not all of it. For example, as pointed out by Deaton and Muellbauer (1980a, p. 200) and forcefully argued by Nelson (1993), there is no reason to suppose that a child-intensive good, such as milk, could not be highly price elastic in families without children; for adults, beer or soda may be good substitutes for milk, though one might hesitate to make the same substitutions for children. Within the Barten model, if the substitution effect is large enough, this situation can lead to a reduction in milk demand when a child is added to the family, a result that compromises the empirical usefulness of the model as well as any claim that the model might have to represent the welfare of children as well as adults. It is possible to extend the model to improve the situation, for example by following Gorman's (1976) suggestion to make allowance for fixed costs of children, but the empirical implementation of such models is cumbersome out of all proportion to their advantages over the simpler Rothbarth methodology.

Economies of scale within the household

It is an old and plausible idea that two can live more cheaply together than two can live separately, yet there has been relatively little serious empirical analysis of the phenomenon, and our understanding of the topic is even less complete than that of child costs. One good approach to the issue is through the recognition that there are public goods within the household, goods that can be shared by several people without affecting the enjoyment of each. In this last subsection, I discuss some of the literature, which is again dominated by Engel's method, and then sketch out an approach to economies of scale based on the existence of public goods that can be shared within the household. The public good approach reveals serious flaws in the Engel methodology, and suggests an alternative approach. Attempts to implement this alternative founder on an empirical paradox that is currently unresolved.
I start by dismissing children, and thinking only about households containing $n$ identical adults. The conceptual issues with which I am concerned are not affected by the presence of children, and if children are to be included, we can suppose that they have already been converted to adults using a child equivalence scale. Consider the direct utility function $u(q_1, q_2, \ldots, q_m)$, which we think of as the utility of a single individual who consumes $q_1$ of good 1, $q_2$ of good 2, up to $q_m$ of good $m$. For a household of $n$ individuals who share consumption equally, the utility of each is given by the utility function applied to an $n$th of the household's consumption, so that total household utility is written

(4.42)  $u_h = n\,u(q_1 / n, \ldots, q_m / n).$

But this equation assumes that there are no economies of scale, and that a household of $n$ people generates no more welfare than $n$ households of one person each. Suppose instead that, by some process that is left implicit, the needs for each good do not expand with the number of people, but less rapidly, for example in proportion to $n^\theta$ for some quantity $0 < \theta \le 1$. (This isoelastic form is easily generalized, but little is gained by doing so.) If $\theta = 1$, there are no economies of scale, and each person gets an $n$th of the total; for $\theta < 1$, there are economies of scale, and each person receives more than his or her share of the total. The quantity $1 - \theta$ is therefore a measure of the extent of economies of scale. With this specification, household utility (4.42) is modified to

(4.43)  $u_h = n\,u(q_1 / n^\theta, \ldots, q_m / n^\theta).$
It is a straightforward exercise to show that the maximization of (4.43) subject to the budget constraint that the total cost of purchases be equal to $x$, gives demand functions

(4.44)  $p_i q_i / x = f_i(x / n^\theta, p_1, \ldots, p_m)$

so that the budget share of good $i$, for all goods $i = 1, \ldots, m$, is a function of prices and of total expenditure deflated by household size to the power of $\theta$. The indirect utility function corresponding to (4.43) and (4.44) is

(4.45)  $u_h = n\,\psi(x / n^\theta, p_1, \ldots, p_m)$

where $\psi(x, p_1, \ldots, p_m)$ is the indirect utility of a single individual with outlay $x$. Because both the budget shares and indirect utility depend on family size only through the term $x / n^\theta$, welfare is correctly indicated by the budget share of any good, and two households of different size are equally well-off when their patterns of budget shares are the same. There is an empirical restriction (that we get the same answer whichever budget share we use) but I ignore it for the moment. Instead consider one particular budget share, that of food. As before, I use the subscript $f$ for food, and rewrite (4.44) as

(4.46)  $w_f = f_f(x / n^\theta, p_1, \ldots, p_m)$

an equation that provides a formal justification of using Engel's method (see Figure 4.7) not for measuring child costs, but for measuring economies of scale. In particular, if we adopt the same functional form as before, (4.14), the general form (4.46) becomes

(4.47)  $w_f = \alpha_f + \beta_f \ln(x / n) + \beta_f (1 - \theta) \ln n$

so that the economies of scale parameter $\theta$ can be obtained by regressing the food share on the logarithm of PCE and the logarithm of household size, computing the ratio of coefficients, and subtracting the result from unity. If we want to pursue the utility interpretation, we could then estimate $\theta$ from other goods and compare, or we might simply wish to accept Engel's assertion that the food share indicates welfare, and not concern ourselves with its possible theoretical foundations. Table 4.5 above provides estimates of (4.47) for the Indian and Pakistani data.
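The calculation described for (4.47) can be sketched in a few lines: regress the food share on a constant, the logarithm of PCE, and the logarithm of household size, then take one minus the ratio of the size coefficient to the PCE coefficient. The sketch below uses only the Python standard library and simulated data; the function names, the parameter values, and the data-generating process are assumptions for illustration and have nothing to do with the Indian or Pakistani surveys.

```python
import math
import random


def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def engel_theta(food_share, total_exp, hh_size):
    """Estimate theta in (4.47): w_f = a + b*ln(x/n) + b*(1 - theta)*ln(n)."""
    X = [[1.0, math.log(x / n), math.log(n)]
         for x, n in zip(total_exp, hh_size)]
    _, b_pce, b_size = ols(X, food_share)
    return 1.0 - b_size / b_pce


# Simulated (hypothetical) households generated from (4.46) with theta = 0.72.
rng = random.Random(0)
TRUE_THETA, ALPHA, BETA = 0.72, 0.8, -0.1
sizes = [rng.randint(1, 10) for _ in range(5000)]
outlays = [math.exp(rng.gauss(6.0, 0.5)) * n ** 0.8 for n in sizes]
shares = [ALPHA + BETA * math.log(x / n ** TRUE_THETA) + rng.gauss(0.0, 0.01)
          for x, n in zip(outlays, sizes)]

theta_hat = engel_theta(shares, outlays, sizes)
print(round(theta_hat, 2))
```

The sketch recovers the parameter only because the simulated data are generated by the model itself; it says nothing about whether $\theta$ so estimated deserves to be interpreted as a measure of economies of scale.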
As predicted by (4.47), the coefficients on the logarithm of household size are negative; with PCE held constant, larger households are better-off and reveal the fact by spending a smaller fraction of their budget on food. The $\theta$ parameter is estimated to be 0.72 in the Indian data, and 0.87 in the Pakistani data, so that if we double household size and double household resources, the Maharashtran (Pakistani) households have effectively had a 28 (13) percent increase in per capita resources. Using the 1991 Living Standards Survey from Pakistan, Lanjouw and Ravallion (1995) find even larger economies of scale, with $\theta$ estimated to be 0.6, so that doubling size and resources would increase effective resources by 40 percent.

This Engel method is currently the only simple method for measuring economies of scale. There is no equivalent to the Rothbarth method, which focusses explicitly on children. This is unfortunate, since Engel's method is just as unsatisfactory for measuring economies of scale as it is for measuring child costs. There are two problems. The first is the lack of an answer to Pollak and Wales' identification result. Even if (4.47) is an accurate representation of the data, by what assumption are we allowed to interpret $\theta$, or better $1 - \theta$, as a measure of economies of scale? In the technical note at the end of this subsection, I show that the identification problem is real by providing an example. In particular, if the utility function (4.43) is modified to make the economies of scale parameter differ for different levels of living, the demand functions (4.47) are unaffected. As a result, the true economies of scale are not captured by our estimates. The second problem is the lack of an explicit justification for the mysterious process whereby sharing prevents the effective per capita consumption of all goods falling in proportion to the number of sharers.
The remainder of this section shows that attempts to model this process lead to a discrediting of Engel's assertion. If we replace Engel, or (4.43), by a more appropriate model in which economies of scale are attributed to public goods, we are led to a different scheme for measurement that is much closer to Rothbarth's child cost procedure than it is to Engel's. Unfortunately, this promising and sensible strategy is confounded when we turn to the data, which show behavior that is difficult to interpret in any coherent manner. The following is based on Deaton and Paxson (1996); the Rothbarth-like procedure for measuring economies of scale was suggested to the author by Jean Dreze.

In the simplest case, suppose that there are two goods, one private, and one public. For the private good, if the household buys 10 kilos, and there are ten people, each gets one kilo; no one benefits from anyone else's consumption, and if one uses the good, another cannot. For the public good, by contrast, everyone consumes the total that is purchased; no one person's consumption precludes the consumption of anyone else. Works of art are a good example, although housing, cooking, and toilet facilities may be approximately so described up to the point of congestion. If $q$ is household consumption of the private good, and $z$ household consumption of the public good, each member of the household obtains utility

(4.48)  $u = u(q / n, z)$

and the household maximizes this (or equivalently $n$ times this) subject to the budget constraint. Note that this problem is precisely analogous to Barten's formulation of compositional effects, or more precisely, it is a special case of Barten's formulation (see also Nelson 1988, 1992 for discussions of economies of scale along these lines). The demand functions from (4.48) are, for the private good

(4.49)  $q / n = g(x, np, p_z) = g(x / n, p, p_z / n)$

where $p$ and $p_z$ are the prices of the private and public good, and for the public good

(4.50)  $z = h(x / n, p, p_z / n).$

Suppose first that all goods are private. Then (4.49) gives the obvious and obviously sensible result that in a household with $n$ identical individuals, per capita demand is the same as individual demand, provided we replace total outlay by per capita outlay. In this situation, the household is simply a collection of identical individuals, and their aggregate behavior is the replication of what would be the behavior of each individual in isolation. The presence of the public good affects this purely private allocation because additional people reduce the price of the public good. Compared with the private good, the public good is $n$-times blessed, and its price is reduced in proportion to the number of people in the household. As usual, this price reduction will have income effects, increasing the demand for all normal goods compared with the purely private solution, and substitution effects, which tip consumption towards the public goods.

This account of public and private goods can be used as a basis for measuring economies of scale. Suppose that we can identify in advance a private good that is not easily substituted for other goods. (Note the analogy with the Rothbarth method, which requires the identification of an adult good that is not affected by substitution effects associated with children, and with the sharing-rule approach, which requires exclusive or assignable goods.) When we increase the number of household members holding PCE constant, resources are progressively released from the public good (the income effect of the price decrease) and consumption of the private good will increase. If there are no substitution effects on consumption of the private good, its additional consumption can be used to measure economies of scale.
In particular, we can calculate the reduction in PCE that would restore the per capita consumption of the private good to its level prior to the increase in size.

Figure 4.9 illustrates. On the vertical axis is consumption per head of the private good which, for the purpose of the argument, I shall refer to as "food." In poor countries, with people close to subsistence, food has few substitutes, and food eaten by one household member cannot be eaten by another. The horizontal axis is PCE, and the diagram illustrates an increase in household size, from $n_1$ to $n_2$, with PCE held constant. Given that total resources have increased in line with household size, the household could keep its demand pattern unchanged if it wished. But because there are public goods that do not need to be fully replicated, members of the household can typically do better than this. Provided the price elasticity of food is low, which is equivalent to low substitution out of food in favor of the now effectively cheaper public goods, food consumption per capita will increase as shown. Although the result depends on the assumption of limited substitution, it is clearly robust, at least for poor households. The (over)compensated increase in household size makes the household better-off, and in poor, near-subsistence economies, better-off households spend more on food.

[Figure 4.9. Dreze's method for measuring economies of scale. Vertical axis: food consumption per head, with curves labeled $q_1/n_1$ and $q_2/n_2$; horizontal axis: total expenditure per capita.]

The diagram also shows $\sigma$, the amount by which PCE of the larger household could be decreased to reduce per capita food consumption to its original level. If the compensated own-price elasticity of food is exactly zero, food expenditure per head is an exact indicator of welfare, and $\sigma$ is the per capita money value of the economies of scale.
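The mechanics of the figure can be illustrated numerically. The sketch below assumes a CES demand system for the two-good model (4.48) through (4.50); the CES form, the share parameter `a`, the substitution elasticity `s`, and the function names are all assumptions introduced for illustration, not part of the argument in the text. With limited substitution (`s` below one), food per head rises with household size at constant PCE, and the second function computes the cut in PCE that restores the original food consumption per head.

```python
def food_per_head(pce, n, p=1.0, p_z=1.0, a=0.5, s=0.5):
    """Per capita demand for the private good ("food") under CES demand.

    By (4.49), per capita demand equals individual demand evaluated at
    outlay x/n, food price p, and public good price p_z/n. The CES form
    and the values of a and s are illustrative assumptions.
    """
    pz_eff = p_z / n  # the public good is effectively n times cheaper
    denom = a ** s * p ** (1 - s) + (1 - a) ** s * pz_eff ** (1 - s)
    return a ** s * p ** (-s) * pce / denom


def pce_cut_restoring_food(pce, n1, n2, **kw):
    """Reduction in PCE that returns the larger household's food per head
    to the smaller household's level. The closed form relies on CES
    demand being proportional to PCE."""
    q1 = food_per_head(pce, n1, **kw)
    q2 = food_per_head(pce, n2, **kw)
    return pce * (1 - q1 / q2)


small = food_per_head(100.0, 2)
large = food_per_head(100.0, 4)
cut = pce_cut_restoring_food(100.0, 2, 4)
print(small, large, cut)
```

Setting `s=1.0` (the Cobb-Douglas case) makes food per head independent of household size, and values of `s` above one reverse the sign, which is one way to see why the method depends on food having few substitutes.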
Even when substitution effects are nonzero, food is likely to be a good indicator of welfare, a case that has been well argued by Anand and Harris (1990, 1994). In any case, $\sigma$ sets a lower bound on the value of the scale economies. After a reduction in PCE of $\sigma$, the large household has the same per capita food consumption as the small household. But it faces a higher relative price for food and a lower relative price for the public good so that, given some possibility of substitution, it could be as well-off with rather less food and rather more public good. In consequence, a precisely compensated larger household will have less food consumption per head than the smaller household, which implies that $\sigma$ is an understatement of the value of the scale economies.

All that remains is to implement the method on the data and to fill in the ideas of Figure 4.9 with real numbers. But as is perhaps already apparent, there is a serious problem. The results in Table 4.5 are not consistent with the figure. Because the coefficients on the logarithm of household size are negative, the curve for the larger household lies below that for the smaller household. In these Indian and Pakistani examples, the budget share of food falls with household size with PCE held constant. Because the budget share is the ratio of per capita food expenditure to per capita total expenditure, the budget share can only decline with household size at constant PCE if food expenditure per head also declines. That the same result holds more widely is documented by Deaton and Paxson (1996), not only for households in Pakistan in 1991, in South Africa in 1993, and in Thailand in 1992, but also in the much richer economies of Taiwan (China), Britain, France, and the United States. Their evidence shows that, in all these economies, among households with the same level of PCE, larger households spend less per capita on food.
Furthermore, the declines are largest where they are hardest to explain, among households in the poorest economies, Pakistan, Thailand, and South Africa.

That food consumption per capita declines with household size at constant PCE is a prediction of the Engel model, so that the empirical evidence might be interpreted as evidence in favor of that model, and against the story of public goods and economies of scale. But to argue in favor of the Engel assertion is not credible. It makes apparent (but only apparent) sense to argue that a larger household with the same PCE as a smaller household is better-off, and so should have a lower food share because better-off households generally have lower food shares. But once it is recognized that we are holding PCE constant, so that a lower food share means lower consumption per head, the implausibility of the argument is revealed. In poor countries, making people better-off causes them to buy more food, not less. Recall that Lanjouw and Ravallion (1995) estimate that $\theta$ is 0.6 for Pakistan in 1991. According to this, two adults can attain the same welfare level as two single adults for only $2^{0.6} = 1.52$ times as much money (see equation 4.45). For a society where food accounts for more than half of the budget, this estimate seems low. But what is truly incredible is the implication that the couple can each reduce their food consumption by a quarter and be as well-off as they were as individuals.

When it comes to the measurement of economies of scale, we are therefore in a most unsatisfactory predicament. We have a model that works, and fits the data, but makes no sense, and a model that makes sense, but whose most elementary implications are contradicted by the data. It is hard to see how we can make much progress until we can understand the paradoxical relationship between food consumption and household size. On this, there has been little or no progress so far.
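The arithmetic behind the Lanjouw and Ravallion figures quoted above can be laid out step by step, using (4.45) and (4.46) with $\theta = 0.6$. The symbols $x_1$ (the outlay of a single adult) and $x_2$ (the outlay of the couple) are introduced here only for illustration. By (4.45), the couple is as well-off as two single adults when $x_2 / 2^{\theta} = x_1$, and by (4.46) the food share $w_f$ is then the same in both cases:

```latex
\begin{align*}
\frac{x_2}{2^{\theta}} = x_1
  \;&\Longrightarrow\; x_2 = 2^{0.6}\,x_1 \approx 1.52\,x_1, \\
\text{outlay per head of the couple:}\quad
  \frac{x_2}{2} &\approx 0.76\,x_1, \\
\text{food per head of the couple:}\quad
  \frac{w_f\,x_2}{2} &\approx 0.76\,w_f\,x_1 .
\end{align*}
```

Food spending per head is therefore about 24 percent lower for the couple than for a single adult at the same welfare level, which is the reduction of roughly a quarter referred to in the text.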
Deaton and Paxson list a number of possible explanations, none of which is very convincing. Only one is worth noting here, because the issue will arise again in the next chapter. Not everyone pays the same price for their food (or other goods), and it is easy to imagine that larger households obtain discounts by buying in bulk, that they buy lower-quality food, or that they buy more of the cheaper foods. If so, it is possible that, although larger households spend less per head on food at the same level of PCE, they consume larger quantities. The evidence on this question is limited by the number of surveys that collect data on both expenditures and quantities. However, what evidence there is (some of which appears in the next chapter) shows the opposite of what would be required; at a constant level of PCE, larger households spend more per unit on food, not less. Indeed, exactly this result was treated as evidence of economies of scale and used to measure their extent by Prais and Houthakker (1972, ch. 10). In consequence, the paradox is only deepened; people who must surely be better-off are switching to higher-quality foods, which makes perfect sense, while at the same time cutting consumption, which makes none.

*Utility theory and the identification of economies of scale

Suppose that we are prepared to assert that economies of scale operate by multiplying the cost of living by a concave function of household size. Provided the utility function is not homothetic, so that the pattern of demand is a function of real income, the scaling function can be identified from the data, so that it is possible to test the implications of the model and to use the parameters to estimate the extent of economies of scale. Under the assumption that the scaling function is isoelastic, $n^\theta$, this is exactly what is done by the Engel method as outlined in equations (4.43) through (4.47) above.
But what then of the underidentification of demographic effects in the utility function? By working through a simple example, I show that the problem is still present. Indeed, the scaling model, although identified on the data once specified, has the same empirical implications as other models in which the economies of scale are clearly different. As a result, whatever it is that the Engel method measures, there are no grounds for claiming that it is economies of scale.

In order to examine the effects of changing household size, we need to start from the preferences of a household with only one member. Consider the case where the cost of living is related to utility by

(4.51)  $c(u, p) = a(p)\,u^{\beta(p)}$

where $a(p)$ is linearly homogeneous and $\beta(p)$ zero-degree homogeneous in prices. One convenient specification for cross-sectional work is

(4.52)  $\ln a(p) = \sum_k \alpha_k \ln p_k; \quad \ln \beta(p) = \sum_k \beta_k \ln p_k.$

The budget shares are the elasticities of the cost function with respect to prices, so that using these two equations and the fact that cost equals total expenditure $x$, the system of demand patterns takes Working's form

(4.53)  $w_i = \left(\alpha_i - \beta_i \sum_k \alpha_k \ln p_k\right) + \beta_i \ln x = \alpha_i^* + \beta_i \ln x.$

We can now introduce household size and economies of scale by multiplying the cost function by an isoelastic function of $n$, so that (4.51) becomes

(4.54)  $c(u, p, n) = n^\theta a(p)\,u^{\beta(p)}$

where $\theta$ is, as before, the parameter controlling economies of scale. If we derive the budget share equations as before, we get, in place of (4.53), cf. (4.47) above,

(4.55)  $w_i = \alpha_i^* + \beta_i \ln(x / n^\theta) = \alpha_i^* + \beta_i \ln x - \beta_i \theta \ln n = \alpha_i^* + \beta_i \ln(x / n) + \beta_i (1 - \theta) \ln n$

which is the equation that I have been using to interpret the data.

To see the identification problem, suppose that, instead of (4.54), we make the economies of scale parameter not a constant $\theta$, but a function of the utility level $u$, in the form $\theta + \theta_1 \ln u$.
This is plausible enough and certainly cannot be ruled out in advance; additional people may not affect costs proportionately, but have larger or smaller effects the better-off is the household. Substitution gives our last cost function

(4.56)  $c(u, p, n) = n^{\theta + \theta_1 \ln u}\, a(p)\, u^{\beta(p)} = n^\theta a(p)\, u^{\beta(p) + \theta_1 \ln n}.$

If we follow through the calculations for a third time, we simply come back to (4.55); the incorporation of the household size function into the elasticity has had no effect on behavior, and so we cannot tell (4.54) and (4.56) apart on the data. But the two cost functions have different implications for household welfare. The indirect utility function for (4.56) is

(4.57)  $\ln u = [\beta(p) + \theta_1 \ln n]^{-1}\,[\ln x - \theta \ln n - \ln a(p)]$

which contains the term $\theta_1 \ln n$ in the denominator. Welfare levels, unlike the demand functions, are affected by the presence of this term so that, once again, we cannot infer welfare from behavior.

4.4 Guide to further reading

There is an enormous background literature on the topics covered in this chapter, and it is probably best approached through surveys. The recent book by Dasgupta (1993) is much concerned with gender discrimination (Chapter 11) and nutrition (Chapters 14 through 17). Dasgupta argues the case for the nutritional wage hypothesis as the root cause of destitution, but one does not have to accept all his arguments to benefit from his extensive reviews of the literatures in economics and in nutrition. Dreze and Sen (1989) is also concerned with undernutrition, its causes, and the design of public policy for its alleviation; it also provides extensive references to the literature. At the narrower and more technical level, both Bouis and Haddad (1992) and Strauss and Thomas (1995) provide reviews of earlier literature as well as their own contributions to measurement. The reports of the International Food Policy Research Institute in Washington are frequently concerned with nutritional issues in a range of countries.
On equivalence scales, the easiest place to begin is with Chapter 8 of Deaton and Muellbauer (1980a), and to update with Deaton and Muellbauer (1986). Buhmann et al. (1988) provide a review of some of the equivalence scales that have been used in the academic literature as well as in government programs, and show how different scales affect poverty and inequality measurement for the (developed) countries in the Luxembourg Income Study. There is also a literature on "subjective" approaches to the construction of equivalence scales that parallels attempts to measure poverty lines by direct questioning (see, for example, Rainwater 1974; Danziger et al. 1984; and van Praag, Hagenaars, and van Weerden 1982). The issue of economies of scale that is sketched in the penultimate subsection is dealt with much more fully in Deaton and Paxson (1996).

5 Looking at price and tax reform

In a world of laissez-faire, prices set in free competitive markets guarantee economic efficiency. As has long been recognized, such efficiency is of a limited nature; its achievement requires strong assumptions and takes no account of the distribution of income, even to the extent of being consistent with some individuals being unable to survive. Laissez-faire pricing also leaves little room for the government to collect revenue, whether to influence the distribution of resources or to finance its own activities. The lump sum taxes that would (by definition) leave behavior unaffected generate little revenue if they exist at all, and the limited range of tax instruments on which governments in poor countries must rely are often quite distortionary. As a result, the best that can be hoped for is a price system that is efficient in the sense of minimizing distortion for any given amount of revenue.
The design of such systems is the topic of optimal tax theory, starting from Mirrlees (1971) and Diamond and Mirrlees (1971), a brief account of which is given in the first section below.

In recent years, public economics has been greatly influenced by what is now referred to as "political economy," a branch of economics and political science that uses the tools of economic analysis, particularly the typical economic assumptions of rationality and greed, to help explain the behavior of political actors. These models often seem relevant to poor countries, where the empirical reality is often in stark contrast to the assumptions of a previous generation of models in which the government was seen as a benevolent dictator acting in the public interest. As a result, there has been a good deal of suspicion, and in some cases rejection, of optimal tax models in which prices are set to maximize a social welfare function subject to budgetary and behavioral constraints. These criticisms are justified in that it is absurd to regard the optimal models as descriptions of the social and political equilibria that actually exist in developing countries. Even when the motivation of a given government is clear, existing taxes are likely to be a web of only partially consistent levies and subsidies, the fossilized traces of long-dead administrations of differing purposes and philosophies.

Even so, it is important to think rationally about what system of prices would be desirable, at least to the extent of meeting the legitimate needs of government for revenue and redistribution while minimizing the costs of collection and distortion. Indeed, international organizations such as the World Bank and the International Monetary Fund are required to make proposals for reform, sometimes in opposition to what governments would choose left to themselves.
Although optimal tax theory is unlikely to predict the tax structure in any given situation, it provides a systematic and indispensable framework for thinking about the standard issues of distribution and economic efficiency that are the basis for any intelligent discussion of price and tax reform.

My main concern in this chapter is not with the theory itself, but with its implementation, and with the use of household survey data to make the calculations. In Chapter 3, I showed how the surveys could be used to see who gets what and who produces what, so that we could calculate who benefits and who loses from price changes, and thus assess the distributional consequences of a change in pricing policy. That analysis is an important part of the analysis of price reform, but to go further, we must also know something about efficiency, which means finding out how behavior responds to the incentives provided by price changes. Traditionally, such evidence has been obtained by examining the historical experience, linking demand and production decisions to historical variations in price. What I shall mostly be concerned with here is a different approach that uses variation in prices across space to identify behavioral responses. This line of work has been developed in a series of papers over recent years (Deaton 1987a, 1988, 1990a and Deaton and Grimard 1992), and the account in this chapter is intended to be more complete and comprehensible than is possible within the confines of a journal article. It also aims to show how the results of the calculations can be used to look at price reform in practice, with examples from India and Pakistan using the data sets introduced in previous chapters.

Section 5.1 presents a brief review of the theory that is required for practical implementation in poor countries. It also provides a first discussion of how the formulas might be used, and reviews the various procedures that have been adopted in the literature.
I argue that, without using the data from the surveys, it is almost impossible to derive policy recommendations that are firmly based on the empirical evidence. While it is certainly possible to make policy based on prior reasoning, on the results of computable general equilibrium (CGE) models, or on econometric evidence from historical data, such recommendations, sensible though they may be, are largely grounded in prior assumption, and are not much affected by empirical evidence. Section 5.2 is a first look at the evidence on spatial price variation from the Pakistani and Maharashtran data, with a particular focus on whether the unit value data that come from the surveys can be trusted for an econometric analysis of behavioral responses to price. This section also considers some of the difficulties that lie in the way of a straightforward use of spatial price variation to estimate demand responses. Section 5.3 presents a model of consumer behavior that is designed to accommodate the behavior we see in the surveys, particularly the fact that better-off households pay more for each unit of even quite narrowly defined commodities. I show how to derive a simple econometric model that is based on this theory. Section 5.4 works through the stages of the estimation using the Indian and Pakistani data. Section 5.5 uses the results to examine price reform, and highlight the differences that come from using the data-based methodology. In particular, I emphasize interaction effects; because there are important substitution effects between goods, it is important not to consider price reform one good at a time. The final Section 5.6 considers a number of directions for future research, and in particular the possible use of nonparametric estimation.
5.1 The theory of price and tax reform for developing countries

The theory of taxation for developing countries is extensively treated in the monograph by Newbery and Stern (1987), which is not only a collection of applications from several countries, but which also contains introductory chapters that review the underlying theory. Interested readers are strongly advised to consult that source for a fuller account than given here; my purpose is to provide only those results that are required for practical analysis of reform, and to identify those quantities for which empirical evidence is required. The analysis will be concerned almost exclusively with indirect taxes (and subsidies). Although income taxes provide a large share of government revenue in developed countries, they are less important in poorer countries, for obvious administrative reasons. Most poor countries do indeed have income taxes, but it is difficult to collect revenue from the self-employed, particularly in agriculture, or from those employed in small businesses or informal enterprises. As a result, most income taxes come from the relatively small formal sectors, or from the government paying income taxes to itself on the incomes of its own employees. A simple linear income tax can be effectively replicated (except in its treatment of saving) by an indirect tax system that taxes all goods at the same rate and that uses a poll tax or subsidy that is identical for all individuals. However, the practical usefulness of this identity is limited by the impossibility of taxing all consumer goods in most countries; in particular, it is not possible to drive a tax wedge between producer and consumer when the producer and consumer are one and the same, as is the case for farm households. And although many countries provide food subsidies for urban residents, arbitrage places limits on differences in prices between urban and rural areas.
For these (and other) reasons, the tax instruments that are available to governments in developing countries are typically much more limited than is the case in richer countries with their more elaborate distribution networks, and this fact must always be kept in mind when discussing reforms.

Tax reform

The theory of tax reform is concerned with small departures from an existing tax structure. This is in contrast to the theory of optimal taxes, which is concerned with characterizing the (constrained) optimum in which taxes and subsidies are set at their socially desirable level. The differences are sometimes blurred in both theory and practice; the formulas for tax reform often have obvious implications for the optimum, and any practical reform will involve a discrete change in price. However, the empirical burdens of calculating tax reforms are much lighter than those of calculating optima. The former asks us to evaluate the current position, and to calculate the desirability of various directions of reform, for which we need information about supply and demand and behavioral responses calculated at the current position of the economy. The calculation of an optimal tax system, by contrast, will require us to know the same quantities at the hypothetical welfare optimum, a position that is most likely far from the current one, or from anything in the historical record.

We begin by specifying the accounting apparatus for judging prices, as well as the rules that govern the behavioral responses. This is done in exactly the same way as in Chapter 3, by specifying a social welfare function and its determinants. As in Chapter 3, the social welfare function should be regarded as an accounting device that keeps track of who gets what, and that is used to consider alternative distributions of real resources among individuals or households.
In Chapter 3, equation (3.1), social welfare was defined over the "living standards" x of each member of the population, where x was later equated with real expenditure levels. Here, because we are interested in the effects of changes in prices, we need to be more explicit about the link between prices and welfare. To do so, rewrite social welfare as a function of the individual welfare levels u, which are in turn given by the values of the indirect utility functions that give the maximum attainable welfare in terms of prices and outlays. Hence,

(5.1)  $W = V(u_1, u_2, \ldots, u_N)$

where N is the number of people in the economy and the welfare levels are given by

(5.2)  $u_h = \psi(x_h, p)$

for household h's indirect utility function $\psi(x_h, p)$, where $x_h$ is total outlay, and p the vector of prices. The indirect utility function defines not only the way that welfare is affected by prices, but also encapsulates the behavioral responses of individuals; given the indirect utility function $\psi(x_h, p)$, demand functions can be derived by Roy's identity, (3.46). I am using the subscript h to identify individuals or households, and I shall deal with the difference between them later. I also assume that everyone faces the same price, an assumption that will be relaxed below, but which simplifies the initial exposition. Note finally that I am avoiding dealing either with labor supply, or with intertemporal choice and saving. Because there is no income tax in this model, there is no need to model earnings, not because labor supply is unaffected by commodity taxes and subsidies, but because with an untaxed wage, the induced changes in labor supply have no effect on tax collections. A similar argument applies to saving, and I interpret (5.2) as the single period utility associated with an additively intertemporal utility function of the kind to be discussed in Chapter 6.
Even so, the current approach fails to recognize or account for differences between the distribution of welfare over lifetimes and over single periods.

We can also begin with a simplistic model of price determination whereby the consumer price is the sum of a fixed (for example, world) price and the tax or subsidy, so that

(5.3)  $p_i = p_i^0 + t_i$

where $t_i$ is the tax (if positive) or subsidy (if negative) on good i. The simplest (and best) justification of (5.3) is in the case where the good is imported, there is a domestic tax (or import tariff), and the country is small enough so that changes in domestic demand have no effect on the world price. Another possible justification runs in terms of a production technology displaying constant returns to scale, with a single nonproduced factor, and no joint products, in which case the nonsubstitution theorem holds, and producer prices are unaffected by the pattern of demand (see, for example, the text by Mas-Colell, Whinston, and Green 1995, pp. 157-60). The assumptions required for this are overly strong, even for a modern industrialized economy, and make less sense in a poor one with a large agricultural sector. We shall see how to dispense with them in the next subsection.

Government revenue is the last character in the cast. It is simply the sum over all goods and all households of tax payments less subsidy costs:

(5.4)  $R = \sum_{i=1}^{M} \sum_{h=1}^{H} t_i q_{ih}$

where $q_{ih}$ is the amount of good i purchased by household h and there are M goods in all. We can think of this revenue as being spent in a number of ways. One is administration and other public goods that generate either no utility, or generate utilities that are separable in preferences, while the other is to generate lump sum subsidies (poll subsidies) that are paid to everyone equally, for example through a limited inframarginal food ration to which everyone is entitled.
Both of these will affect the distribution of welfare over individuals, the latter explicitly through the x's in (5.2). I need not make them explicit here because I shall not be considering varying them. However, and I shall come back to the point, the structure of optimal tax systems is quite sensitive to the amount and design of these lump sum subsidies (see Deaton 1979 and Deaton and Stern 1986).

We are now in a position to consider the effects of a small change in a single tax. Given the simplistic price assumption, the price change will be the same as the tax change and will have two effects, one on government revenue (5.4), and one on individual welfare levels through (5.2) and thus on social welfare (5.1). The derivative of revenue with respect to the tax change is

(5.5)  $\partial R / \partial t_i = \sum_{h=1}^{H} q_{ih} + \sum_{h=1}^{H} \sum_{j=1}^{M} t_j\, \partial q_{jh} / \partial p_i.$

The effect on social welfare is obtained from (5.1) and (5.2) using the chain rule

(5.6)  $\partial W / \partial t_i = \sum_{h=1}^{H} \frac{\partial V}{\partial u_h}\, \frac{\partial \psi(x_h, p)}{\partial p_i}$

an expression that is conveniently written using Roy's identity as

(5.7)  $\partial W / \partial t_i = -\sum_{h=1}^{H} \eta_h q_{ih}$

where $\eta_h$, the social marginal utility of money in the hands of h, is

(5.8)  $\eta_h = \frac{\partial V}{\partial u_h}\, \frac{\partial \psi(x_h, p)}{\partial x_h} = \frac{\partial W}{\partial x_h}.$

For future reference, it is important to note that these formulas do not require that the purchase levels $q_{ih}$ be positive. If individual h does not purchase good i, he or she is not harmed by a price increase, as in (5.7), and the effects of price changes on government revenue in (5.5) depend on the derivatives of aggregate demand, and are unaffected by whether or not individual consumers make purchases.

The quantities $\eta_h$ in (5.7) and (5.8) are the only route through which the social welfare function has any influence on the calculations, and it is here that the distribution of real income comes into the analysis.
If part of the object of the price reform is distributional, which could happen because we want the price change to help the poor more than the rich or hurt them less, or because other instruments (including preexisting taxes and subsidies) have not eliminated the distributional concern, the differential effects of price changes across real income groups must be taken into account. It is sometimes argued that price systems (or projects) are not the most appropriate way to handle distributional issues, with the implication that everyone should be treated equally in these calculations. But this only makes sense if alternative instruments are in place to take care of the distribution of income, something that is often not the case in poor countries, where subsidies on basic foods may be one of the few methods for helping the poor and are likely to be an important element in any social safety net.

The two equations (5.5) and (5.7) represent the social benefits and (minus) the social costs of a tax increase. The benefit is the additional government revenue, the social value of which comes from the uses to which that government revenue is put. The costs are borne by any individual who purchases the good. The money-equivalent cost to each individual of a unit price change is the quantity that he or she purchases, and these costs are aggregated into a social cost by weighting each cost by the social weight for the individual. The social weights can be derived explicitly from a social welfare function, for example the Atkinson social welfare function (3.4), although we will also sometimes wish to recognize regional or sectoral priorities. The ratio of cost (the negative of the marginal social benefit in (5.7)) to benefit is usually denoted by $\lambda_i$, and is defined by

(5.9)  $\lambda_i = \dfrac{\sum_{h=1}^{H} \eta_h q_{ih}}{\sum_{h=1}^{H} q_{ih} + \sum_{h=1}^{H} \sum_{j=1}^{M} t_j\, \partial q_{jh} / \partial p_i}$

so that $\lambda_i$ is the social cost of raising one unit of government revenue by increasing the tax (reducing the subsidy) on good i.
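As a rough illustration of how the ratio in (5.9) might be evaluated once its ingredients are in hand, the following Python sketch computes the numerator (socially weighted purchases) and the denominator (the marginal revenue in (5.5)) for every good at once. The function name, the array layout, and all of the numbers are hypothetical and chosen only for illustration; in particular, the matrix of aggregate price derivatives is simply assumed here, whereas in practice it is the quantity that has to be estimated.

```python
import numpy as np

def cost_benefit_ratios(q, eta, t, dq_dp):
    """Evaluate lambda_i as in (5.9) for every good i.

    q      : (M, H) array, q[i, h] = purchases of good i by household h
    eta    : (H,)   array, social marginal utility of money for household h
    t      : (M,)   array, tax (positive) or subsidy (negative) on each good
    dq_dp  : (M, M) array, dq_dp[j, i] = derivative of AGGREGATE demand for
             good j with respect to the price of good i (summed over h)
    """
    numerator = q @ eta                    # sum_h eta_h * q_ih
    # Marginal revenue (5.5): sum_h q_ih + sum_j t_j * dQ_j/dp_i
    marginal_revenue = q.sum(axis=1) + t @ dq_dp
    return numerator / marginal_revenue

# Illustrative (made-up) data: two goods, three households ordered poor to rich.
q = np.array([[4.0, 3.0, 2.0],     # good 0: consumed mostly by the poor
              [1.0, 3.0, 8.0]])    # good 1: consumed mostly by the rich
eta = np.array([1.0, 0.5, 0.25])   # weights that favor the poor
t = np.array([0.1, 0.1])           # current taxes
dq_dp = np.array([[-5.0, 1.0],     # assumed aggregate price responses
                  [1.0, -8.0]])

lam = cost_benefit_ratios(q, eta, t, dq_dp)
print(lam)
```

With these made-up numbers the first good, which weighs more heavily in the budgets of the poor, has the larger ratio, so that raising revenue through its price is socially more costly than raising it through the second good.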
If the ratio is large, social welfare would be improved by decreasing the price of the ith good, either because the good is hurting those whose real incomes are especially socially valuable, or because the taxation of the good is distortionary, or both. Goods with low $\lambda_i$ ratios are those that are candidates for a tax increase or subsidy reduction. When all the ratios are the same, taxes are optimally set and there is no scope for beneficial reform.

Equation (5.9) is consistent with and generalizes standard notions about distortion and about the avoidance of taxes that are particularly distortionary. Suppose, for the sake of argument, that the only good with a nonzero tax is good i, so that the second term in the denominator involves only the own-price response $\partial q_{ih} / \partial p_i$. If this is large and negative, which is the case where a tax is causing distortion, and if the tax is positive, the last term will be negative, so that other things being equal, the larger the price responses, the larger the $\lambda_i$ ratios and the less attractive it would be to try to raise further revenue by this means. If the good is subsidized, large price responses will make the denominator large and positive, yielding a small $\lambda_i$ ratio, with the implication that the tax ought to be raised. In this special case, we get the familiar result that highly price-elastic goods are poor candidates for taxation or subsidy because of the resulting distortion. Of course, price elasticity is not the whole story in (5.9). The numerator will be large for necessities and small for luxuries if the social weights favor the poor. If it is also the case that necessities have low price elasticities (necessities are hard to do without and have few substitutes) then the equity and efficiency effects of commodity taxes will offset one another, favoring the taxation of necessities on efficiency grounds, and of luxuries on equity grounds.
Note also the potential importance of cross-price effects; changing the price of one good affects the demand for others through income and substitution effects, and if these goods bear taxes or subsidies, there will be "second-round" efficiency effects that must be taken into account in the calculations.

Generalizations using shadow prices

The cost-benefit ratios can be rewritten in a form that allows a substantial generalization of the analysis, and that permits us to relax the assumption that producer prices are fixed. Start from the budget constraint of household h, which using (5.3) can be written in the form

(5.10)  $x_h = \sum_{k=1}^{M} p_k q_{kh} = \sum_{k=1}^{M} (p_k^0 + t_k)\, q_{kh}.$

Since the total expenditure of each household is unaffected by the tax increase, we can differentiate (5.10) with respect to $t_i$ holding $p^0$ constant to give

(5.11)  $q_{ih} + \sum_{k=1}^{M} t_k\, \partial q_{kh} / \partial p_i = -\sum_{k=1}^{M} p_k^0\, \partial q_{kh} / \partial p_i.$

If (5.11) is substituted into the revenue equation (5.5) and thence into the formula for the cost-benefit ratios, we can rewrite the latter as

(5.12)  $\lambda_i = \dfrac{\sum_{h=1}^{H} \eta_h q_{ih}}{-\sum_{h=1}^{H} \sum_{j=1}^{M} p_j^0\, \partial q_{jh} / \partial p_i}.$

So far, (5.12) is simply a rewritten form of (5.9). That it can be more than this has been shown by Dreze and Stern (1987) and Stern (1987, pp. 63-6) who prove that under appropriate but standard assumptions, the prices $p_j^0$ can be reinterpreted as shadow prices, defined (as usual) as the marginal social resource costs of each good. Under this interpretation, (5.12) is valid whether or not producer prices are fixed, and thus so is (5.9), provided that the $t_j$ are interpreted as "shadow taxes," defined as the difference between shadow and domestic prices. In some cases, such as when the $p_i^0$ are world prices and therefore shadow prices, (5.12) and (5.9) coincide. But the shadow price interpretation of (5.12) makes intuitive sense.
The denominator is simply minus the resource costs of a change in price, with resources valued (as they should be) at shadow prices, so that (5.12) is the social cost per unit of the socially valued resources that are released by an increase in a price. In practical applications in developing countries, (5.12) is useful in exactly the same way that shadow prices are useful in project evaluation. Instead of having to account for all the distortions in the economy by tracing through and accounting for all the general equilibrium effects of a price change, we can use shadow prices as a shortcut to give us a relatively quick, if necessarily approximate, evaluation of the effects of a price change.

Evaluation of nonbehavioral terms

The cost-benefit ratios in (5.9) and (5.12) require knowledge of quantities, of social weights, and of the responses of quantities to price. Of these it is only the behavioral responses that pose any difficulty, and most of this chapter is concerned with their estimation. The first term in the denominator, the aggregate quantity consumed, can be estimated either from the survey data, or from administrative records. If the commodity is being taxed or subsidized, the amounts involved are typically known by the taxing authority, and these figures provide a useful, if occasionally humbling, check on the survey results. The evaluation of the numerator will require survey data if the social weights vary from household to household. The weights themselves depend on the degree of inequality aversion that is to be built into the calculation, as well as on possible sectoral or regional preference.
To illustrate how this works in practice, suppose that we use the Atkinson social welfare function (3.4), extended to recognize that there are $n_h$ individuals in household h,

(5.13)  $W = \dfrac{1}{1 - \epsilon} \sum_{h=1}^{H} n_h \left( \dfrac{x_h}{n_h} \right)^{1 - \epsilon}.$

According to (5.13), each member of the household receives household per capita consumption, and the resulting social welfare contributions are multiplied by the number of people in the household. Given this, the numerator of (5.12) is

(5.14)  $\sum_{h=1}^{H} \eta_h q_{ih} = \sum_{h=1}^{H} \left( \dfrac{x_h}{n_h} \right)^{-\epsilon} q_{ih}$

which can be thought of as a weighted average of the demands for good i, with weights depending (inversely) on PCE. Given a value for the inequality aversion parameter $\epsilon$, this expression can be evaluated from the survey, using the sample data and the appropriate inflation factors. The usual practice, which will be followed in this chapter, is to evaluate (5.14) and the cost-benefit ratios (5.12) for a range of values of $\epsilon$. When $\epsilon$ is small (in the limit zero) (5.14) will be close to average consumption of good i, and as $\epsilon$ increases will be increasingly weighted towards the consumption of the poor.

Because the right-hand side of (5.14) can be evaluated directly from the data, its evaluation does not require parametric modeling. Although it would be possible to specify and estimate an Engel curve for each good and to use the results to calculate (5.14), there are no obvious advantages compared with a direct appeal to the data. The actual relationship between the consumption of individual goods and total expenditure may be quite complicated in the data, and there is no need to simplify (and possibly to oversimplify) by enforcing a specific functional form.

Alternative approaches to measuring behavioral responses

Almost all of the difficulty in evaluating the cost-benefit ratios comes from the final term in the denominator, which summarizes the behavioral responses to price changes.
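The numerator, by contrast, is mechanical. As a minimal sketch of the direct evaluation of (5.14) described above, the Python fragment below computes the socially weighted sum of purchases for a range of values of the inequality aversion parameter, using weights of the form (PCE) raised to the power minus epsilon and grossing up each sample household by its inflation factor. The function name, the variable names, and the four-household "survey" are invented for illustration and are not taken from any of the data sets used in this book.

```python
import numpy as np

def numerator_514(q_i, x, n, w, eps):
    """Socially weighted purchases of one good, as in (5.14).

    q_i : (H,) purchases of good i by each sample household
    x   : (H,) total household outlay
    n   : (H,) household size
    w   : (H,) inflation factors (population households per sample household)
    eps : inequality aversion parameter
    """
    pce = x / n                      # per capita expenditure
    eta = pce ** (-eps)              # weights fall as PCE rises
    return np.sum(w * eta * q_i)

# Made-up sample of four households, ordered from poor to rich.
x = np.array([100.0, 200.0, 400.0, 800.0])
n = np.array([5.0, 4.0, 4.0, 2.0])
w = np.full(4, 1000.0)
q_cereal = np.array([50.0, 60.0, 60.0, 40.0])   # spread evenly across households
q_meat = np.array([1.0, 4.0, 10.0, 20.0])       # concentrated among the rich

for eps in (0.0, 0.5, 1.0, 2.0):
    cereal = numerator_514(q_cereal, x, n, w, eps)
    meat = numerator_514(q_meat, x, n, w, eps)
    print(f"eps={eps:3.1f}  cereal={cereal:12.2f}  meat={meat:12.2f}  ratio={cereal / meat:6.2f}")
```

At epsilon equal to zero the calculation returns the grossed-up total purchases of each good; as epsilon rises, the weighted cereal total grows relative to the weighted meat total, which is the sense in which the numerator becomes increasingly weighted towards the consumption of the poor.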
The remaining sections of this chapter document my own approach to the estimation of this term, based on the use of spatial variation in prices, but it is useful to review alternative strategies, some of which have been implemented in the previous literature.

The standard data source for estimates of price responses is the historical record. Variations over time in relative prices permit the econometric analysis of aggregate demand functions in which average quantities are related to average outlays and prices. The aggregate nature of these exercises is not a problem in the current context since, as is easily checked from (5.9) and (5.12), the cost-benefit formulas depend only on the response of aggregate demand to prices. The difficulty lies rather in the typical paucity of the relevant data and the consequent effects on the precision of estimation as well as on the number of explanatory variables that can reasonably be modeled. The latter limitation is particularly damaging when it comes to the estimation of cross-price effects. In practical policy exercises, where we need to distinguish goods that are taxed, or that might be taxed, at different rates, a detailed disaggregation is often required, and we require estimates of all the own- and cross-price responses for each good in the disaggregation. With an annual time series that is nearly always shorter than 40 years, say, such calculations are close to impossible. Even for developed countries, where in a few cases there is nearly a 100-year span of data, it has not proved possible to estimate demand responses with any degree of conviction for disaggregations of more than a very few commodities (see Barten 1969 for a classic attempt using Dutch data, and Deaton 1974a for a much smaller, although not much more successful, attempt for Britain).
If we have to rely on historical data, and if we are not prepared to restrict the problem in any way, it is effectively impossible to obtain reliable estimates of own- and cross-price elasticities for disaggregated systems of demand equations.

It is possible to do a good deal better by using prior restrictions on behavior, especially restrictions that limit the interactions between goods. Perhaps the crudest type of restrictions are those that simply ignore cross-price effects. But crude remedies are often effective, and if (5.9) is rewritten under the assumption that the derivatives $\partial q_{ih} / \partial p_j$ are zero when i is not equal to j, at least on average over all households, we reach the much simpler expression

(5.15)  $\lambda_i = \dfrac{\sum_{h=1}^{H} \eta_h q_{ih}}{Q_i\, (1 + \tau_i e_{ii})}$

where $Q_i$ is aggregate consumption of good i, $\tau_i$ is the tax rate $t_i / p_i$, and $e_{ii}$ is the own-price elasticity of aggregate demand. This expression contains only one magnitude that does not come directly from the data, the own-price elasticity, which can be estimated using a simple model, guessed, or "calibrated" from other studies in the literature.

The disadvantages of such a procedure are as obvious as its advantages. There is no reason to suppose that all the cross-price effects are small, especially when close substitutes are taxed at different rates. Even when own-price effects can be expected to dominate over individual cross-price effects, that is far from the same as assuming that the total effect of all cross-price effects on tax revenue is small enough to ignore. Finally, economic theory teaches us that there are both income and substitution effects of price changes, so that even when substitution effects are small or zero, income effects will still be present.

Instead of restricting price effects by arbitrary zero restrictions, it is possible to use standard restrictions from economic theory.
One particularly convenient assumption is that preferences are additive, meaning that the utility function is (a monotone increasing transformation of) a sum of M functions, each with a single good as its argument. By far the most popular case of additive preferences is the linear expenditure system (sometimes referred to as the Stone-Geary demand system) which has the utility function

(5.16)  $u(q) = \sum_{i=1}^{M} \beta_i \ln (q_i - \gamma_i)$

and associated demand functions

(5.17)  $q_i = \gamma_i + p_i^{-1} \beta_i\, (x - p \cdot \gamma)$

where the 2M - 1 parameters (the $\beta$'s must add to 1) are to be estimated. The system (5.17) is readily estimated on short time-series data; indeed the $\beta$'s are identified from the Engel curves in a single cross section, as are the $\gamma$'s up to a constant. Once estimates have been obtained, (5.17) can be used to calculate the complete set of own- and cross-price responses, so that (5.9) or (5.12) can be evaluated without need for further assumptions.

The convenience of the linear expenditure system has led to its extensive use in the applied policy literature, including work on tax reform in developing countries (see Ahmad and Stern 1991, or Sadoulet and de Janvry 1995). The model is certainly superior to that based on the crude assumption that cross-price effects are zero; it is based on an explicit utility function and thus satisfies all theoretical requirements. However, the linear expenditure system still offers a relatively limited role for measurement, relying instead on the assumptions built into its functional form. Of course, that is the point. Without much data, there is little alternative but to use prior restrictions to aid estimation, and there is much to be said for using restrictions that are consistent with economic theory. But having done so, it is important to investigate which of our results come from the measurement, and which from the assumptions.
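To make concrete how a fitted linear expenditure system delivers every own- and cross-price response, the Python sketch below codes the demands (5.17) and their analytical price derivatives, and then evaluates the cost-benefit ratios three ways: from (5.9), from the producer-price form (5.12), and from the shortcut (5.15) that sets cross-price effects to zero. All parameter values, prices, taxes, outlays, and social weights are invented for illustration, every household is assumed to share the same parameters, and the function names are hypothetical.

```python
import numpy as np

def les_demand(p, x, beta, gamma):
    """LES demands (5.17); returns an (M, H) array for outlays x of shape (H,)."""
    supernumerary = x - p @ gamma                    # x_h - p.gamma, shape (H,)
    return gamma[:, None] + (beta / p)[:, None] * supernumerary[None, :]

def les_aggregate_derivs(p, x, beta, gamma):
    """D[j, i] = sum over households of d q_jh / d p_i implied by (5.17)."""
    supernumerary = x - p @ gamma
    cross = -x.size * np.outer(beta / p, gamma)      # -(beta_j / p_j) * gamma_i per household
    own = -np.diag(beta / p**2) * supernumerary.sum()
    return cross + own

# Made-up parameters for three goods; the betas add to one.
beta = np.array([0.2, 0.3, 0.5])
gamma = np.array([2.0, 1.0, 0.5])
p0 = np.array([1.0, 1.0, 1.0])                       # fixed producer (world) prices
t = np.array([0.0, 0.1, 0.4])                        # non-uniform taxes
p = p0 + t                                           # consumer prices, as in (5.3)
x = np.array([5.0, 10.0, 20.0, 40.0])                # household outlays
eta = (x / x.mean()) ** (-1.0)                       # illustrative social weights

q = les_demand(p, x, beta, gamma)
D = les_aggregate_derivs(p, x, beta, gamma)
Q = q.sum(axis=1)                                    # aggregate demands
numerator = q @ eta

lam_59 = numerator / (Q + t @ D)                     # (5.9)
lam_512 = numerator / (-(p0 @ D))                    # (5.12)
lam_515 = numerator / (Q + t * np.diag(D))           # (5.15): Q_i * (1 + tau_i * e_ii)

print(lam_59, lam_512, lam_515, sep="\n")
```

Because the demands satisfy the budget constraint, the first two calculations give the same ratios, which is the content of the substitution of (5.11) into (5.5). The third differs whenever taxes are not uniform and the committed quantities are nonzero, which gives some sense of what is lost by ignoring cross-price effects even in so restrictive a model.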
In the case of the linear expenditure system, much is known about the effects of its supporting assumptions. As shown in Deaton (1974b), additive preferences imply that own-price elasticities are approximately proportional to total expenditure elasticities, so that the efficiency effects of the taxes, which work through the price elasticities, tend to be offset by the equity effects, which work through the total expenditure elasticities. Indeed, Atkinson (1977) showed that if all preferences are given by the Stone-Geary form (5.16), and if the government has available to it a lump sum poll tax or subsidy that is the same for everyone in the economy and that is optimally set, then the optimal solution is to tax all commodities at the same rate. This result can be generalized somewhat; Deaton (1979) shows that the result holds provided only that the Engel curves are linear and there is separability between goods and leisure. Furthermore, if to these requirements is added the assumption that preferences are additive, as in the linear expenditure system, then not only are uniform taxes optimal, but any move toward uniform taxes is welfare improving (see Deaton 1987b). It is even possible to allow for variation in tastes across individuals, provided that the optimal poll taxes or subsidies are set to take such variation into account (see Deaton and Stern 1986). In all these cases, the tax reform exercise does not require any econometric estimation at all! The answer is known before we start, and the analysis of the data adds nothing to the policy recommendations.

If the linear expenditure system is plausible, then these results are very useful, since they tell us what sort of price reforms to recommend. Unfortunately, the model is much too restrictive to be credible, especially in the context of price reform in poor countries.
Demand functions from additive preferences do not permit inferior goods, and the substitution effects between commodities are of a simple and restrictive kind that do not permit complementarities between goods, nor any specific kinds of substitutability (see Deaton and Muellbauer 1980a, ch. 5, for a more precise discussion). When we are thinking about tax and subsidy policy in poor countries, many of the commodities involved are foods, some of which may be close substitutes, such as different types of cereals, and some of which are complements, such as cooking oil and the foods with which it is used. In several countries, there are also important inferior goods, such as cassava in Indonesia, the consumption of which is largely confined to members of the poorest households. Neither the linear expenditure system nor any other model of additive preferences can handle this sort of consumption pattern.

A somewhat different approach to tax and price reform in developing countries is to use computable general equilibrium (CGE) models. These (often quite elaborate) representations of the economy usually conform to the standard textbook representations of a general equilibrium system, with specifications for consumer preferences, for the technology of production, and for the process by which supply and demand are brought into equilibrium, usually by setting prices at their market clearing levels. The parameters of these models are not estimated using standard statistical methods; instead, the models are "calibrated" by choosing the parameters of preferences and technology so as to match the historical record (if only roughly) as well as to bring key responses into line with estimates in the literature.
Many of these models use the linear expenditure system, or one of its variants; the parameters of (5.16) could be calibrated from (5.17) by looking at a set of expenditure Engel curves, which would yield the $\beta$'s and the intercepts $p_i\gamma_i - \beta_i\,\gamma\cdot p$; the individual $\gamma$ parameters can be solved out given one more piece of information, for example the own-price elasticity of food, for which there exist many estimates in the literature. Alternatively, parameters can be taken from the many estimates of the linear expenditure system for developing countries, such as those assembled in the World Bank study by Lluch, Powell, and Williams (1977).

While such models can be useful as spreadsheets for exploring alternative policies and futures, the incorporation of the linear expenditure system has exactly the same consequences as it does in the more explicitly econometric framework. The results of tax reform exercises in CGE models are entirely determined by the assumptions that go into them. When a CGE model incorporates the linear expenditure system (or the even more restrictive Cobb-Douglas or constant elasticity of substitution preferences), a prescription for uniform taxes or for a move towards uniform taxes is a prescription that comes directly from the assumptions about functional form, and is unaffected by the values of the parameters attached to those forms. The finding cannot therefore be shown to be robust by calculating alternative outcomes with different parameter values; it is guaranteed to be robust provided the functional forms are not changed. It is also guaranteed not to be robust with respect to changes in those functional forms. Atkinson and Stiglitz (1976) and Deaton (1981) provide examples of cases where major changes in optimal tax structure, for example between regressive and nonregressive taxation, can be brought about by minor variations in separability assumptions, variations that would be hard to detect in an econometric analysis.
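The calibration step mentioned earlier in this passage can be sketched in a few lines of Python. This is only one possible way of doing it, under stated assumptions: the Engel curves are written in expenditure form, $p_iq_i = a_i + \beta_i x$, and the own-price elasticity of food is taken to be $-1 + (1-\beta_f)\,p_f\gamma_f/(p_fq_f)$, which follows from differentiating (5.17). The function name and all numbers are hypothetical.

```python
def calibrate_committed_spending(beta, intercepts, food, food_spending,
                                 food_own_elasticity):
    """Recover committed expenditures p_i * gamma_i for every good.

    beta                : Engel-curve slopes (the beta's)
    intercepts          : Engel-curve intercepts a_i = p_i*gamma_i - beta_i*(p.gamma)
    food                : index of the good whose own-price elasticity is known
    food_spending       : observed spending p_f * q_f at the calibration point
    food_own_elasticity : outside estimate of the own-price elasticity of food
    """
    b_f = beta[food]
    # Invert e_ff = -1 + (1 - beta_f) * p_f*gamma_f / (p_f*q_f) for p_f*gamma_f.
    committed_food = (1.0 + food_own_elasticity) * food_spending / (1.0 - b_f)
    # Use the food intercept to back out total committed spending p.gamma.
    total_committed = (committed_food - intercepts[food]) / b_f
    # The remaining goods then follow from their own intercepts.
    return [a + b * total_committed for a, b in zip(intercepts, beta)]


# Hypothetical inputs; the intercepts sum to zero, as the system requires.
beta = [0.2, 0.3, 0.5]
intercepts = [5.2, 2.8, -8.0]
committed = calibrate_committed_spending(
    beta, intercepts, food=0, food_spending=25.2,
    food_own_elasticity=-1.0 + 0.8 * 10.0 / 25.2)
print("committed spending p_i*gamma_i:", [round(c, 6) for c in committed])
```

Because the intercepts sum to zero, they carry only M - 1 independent pieces of information, which is why a single outside elasticity is needed to pin down the level of the $\gamma$'s.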
If we are to make recommendations about tax reform that recognize the actual structure of demand in developing countries, there is no alternative to the estimation of a model of demand that is sufficiently general to allow for that structure.

5.2 The analysis of spatial price variation

In the literature on demand analysis in developed countries, almost all of the identifying price variation has come from price changes over time, with little attention paid to variation in prices over space (although see Lluch 1971 for an exception). The reason for this is obvious enough; in countries like the United States, where transport and distribution systems are highly developed, and where transport costs are relatively low, there is little price variation between localities at any given time. In developing countries, transport is often more difficult, markets are not always well integrated, and even the presence of potential arbitrage cannot equalize prices between different geographical locations. In consequence, when we are trying to identify behavioral responses in demand, we have in developing countries a source of price variation that is not usually available in developed economies.

Regional price data

Data on regional price differences are often available from the statistical offices responsible for constructing consumer price indexes; even when national estimates are the main focus of interest, standardized bundles of goods are priced in a number of (usually urban) locations around the country. These price data can be merged with the household survey data by associating with each household the prices in the nearest collection center at the time closest to the reporting period for that household. The combined data can then be used to estimate a demand system with individual households as the units of analysis. Alderman (1988) uses such a procedure to estimate a demand system for Pakistan.
However, there are a number of difficulties. One is that with only a few sites where prices are collected, the data may give inaccurate estimates of the prices faced by at least some households. In particular, urban prices may be a poor guide to rural prices, especially when there are price controls that are more effective in urban areas. The paucity of collection sites may also leave us in much the same situation as with the historical record, with too many responses to estimate from too few data points. The situation is made worse by the fact that it is often desirable to allow for the effects of regional and seasonal taste variation in the pattern of demand by entering regional and seasonal dummies into the regression, so that the price effects on demand are only identified to the degree that there are multiple observations within regions or that regional prices do not move in parallel across seasons.

Household price data

The individual household responses often provide a useful source of price data. In many surveys, although not in all and in particular not in the original LSMS surveys, households are asked to report, not only their expenditures on each good, but also the physical amount that they bought, so that we get one observation in kilos and one in currency. The ratio of these two observations is a measurement of price, or more accurately, of unit value. The attraction of such measures is the amount of data so provided. If the current Indian NSS were available on a nationwide basis, we would have upwards of a quarter of a million observations on prices, a data set that would certainly permit the estimation of a much richer pattern of responses than could ever be obtained from a few dozen years of the historical record. Other surveys are not so large, but a typical comparison would still be between many thousands of observations from a survey and less than 50 from time series.
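The construction of a unit value from the two survey responses is simple enough to sketch directly. The records and field names below (`rice_exp`, `rice_kg`) are hypothetical and stand in for whatever the survey questionnaire provides; households with no recorded purchase yield no unit value.

```python
import math


def log_unit_value(expenditure, quantity):
    """Return ln(expenditure / quantity), or None when no unit value exists."""
    if expenditure is None or quantity is None:
        return None
    if expenditure <= 0 or quantity <= 0:
        return None  # no purchase reported, so no unit value
    return math.log(expenditure / quantity)


# Hypothetical household records: spending in currency, quantity in kilos.
households = [
    {"id": 1, "rice_exp": 30.0, "rice_kg": 10.0},
    {"id": 2, "rice_exp": 45.0, "rice_kg": 12.5},
    {"id": 3, "rice_exp": 0.0, "rice_kg": 0.0},   # did not buy rice
]

for h in households:
    h["ln_uv_rice"] = log_unit_value(h["rice_exp"], h["rice_kg"])
    print(h["id"], h["ln_uv_rice"])
```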
As always, there is a price to be paid. Unit values are not the same thing as prices, and are affected by the choice of quality as well as by the actual prices that the consumer faces in the market. Not every household in each survey reports expenditures on every good, and no unit values can be obtained from those that do not. And when there are measurement errors in the data (which means always) there are obvious risks in dividing expenditure by quantity and using the result to "explain" quantity. These problems and others have to be dealt with if the analysis is to be convincing. Even so, it is worth beginning by looking at the unit values, and trying to see whether they give us useful information about prices. I do this using (once again) the Pakistani data from the 1984-85 Household Income and Expenditure Survey and the Maharashtran state sample from the 38th round of the Indian NSS. The results summarized here are reported in full in Deaton and Grimard (1992) for Pakistan, and in Deaton, Parikh, and Subramanian (1994) for Maharashtra.

Table 5.1 for Pakistan and Figures 5.1 and 5.2 for Maharashtra provide different ways of looking at the spatial variation in the unit values. Table 5.1 reports two different sets of calculations for the Pakistani data. In the first, reported in the top panel, the logarithms of the unit values are regressed against a set of dummies for each of the four provinces in the survey, namely Punjab, Sindh, North West Frontier Province (NWFP), and Baluchistan (the omitted category), and for each of the four quarters (omitting the fourth) of the calendar year during which the survey was conducted. These regressions are intended to capture broad regional and seasonal patterns. The bottom half of the table looks at variation from village to village within each province, and uses analysis of variance to decompose price variation into its within-village and between-village components.
The unit values for all goods but the last come directly from the survey, although the wheat category includes bread and flour converted to wheat content. The unit value for the final category, other food, is computed from a set of weights (here the average budget share for each component over all households in the survey) that are used to construct a weighted geometric index from the unit values for each of the individual foods in the other food category. Although the same weights are applied to all households, they are rescaled in each case to account for goods not purchased to ensure that the weights add to one for the calculated unit value for each household. If a unit value were calculated for such a heterogeneous category by adding up expenditures and dividing by the sum of the physical weights, the result would be as much determined by the composition of demand between items in the category as by any variation in prices. The rescaling by fixed weights has some of the same effect since households that buy only expensive items will have higher unit values, but there is no way of completely avoiding the difficulty, and the effect will have to be taken into account in the analysis.

Table 5.1. Variation in log unit values of foods by provinces and villages, rural Pakistan, 1984-85

Regression on province and quarter dummies

                  Wheat     Rice    Dairy     Meat   Oils &    Sugar    Other
                                 products              fats             foods
Punjab             -7.2     -8.0    -68.7    -22.1     -2.0     -6.6     -9.7
                  (6.0)    (4.5)   (15.8)   (16.4)    (8.5)   (21.9)   (11.8)
Sindh               7.3    -43.1    -78.3    -13.8     -2.6     -2.9    -11.8
                  (5.2)   (22.3)   (15.5)    (9.5)   (10.4)    (9.0)   (13.3)
NWFP                6.3     -2.8    -26.0    -31.1     -1.3     -3.7     -6.3
                  (4.7)    (1.4)    (5.4)   (20.8)    (5.0)   (10.9)    (6.9)
Jul-Sep 1984       -2.1      0.4      2.0     -1.4     -0.2      0.3      0.9
                  (2.6)    (0.4)    (0.7)    (1.6)    (1.4)    (1.7)    (1.7)
Oct-Dec 1984       -2.4     -1.0     -1.5     -3.4     -0.5     -0.2     -4.3
                  (3.0)    (0.8)    (0.5)    (4.1)    (2.9)    (1.1)    (8.4)
Jan-Mar 1985        0.0     -2.0     -3.0     -1.9     -0.3      0.3     -6.9
                  (0.0)    (1.6)    (1.1)    (2.2)    (1.8)    (1.5)   (13.3)

Analysis of variance (F and R2 for village dummies, by province)

                      Punjab           Sindh            NWFP         Baluchistan
                     F      R2       F      R2        F      R2       F      R2
Wheat             4.02    0.33    5.42  [illegible] 6.06  [illegible] 9.14  [illegible]
Rice              9.14    0.59    4.40    0.44     7.12    0.57    8.52    0.61
Dairy products    3.07    0.39    9.44    0.78    10.63    0.67   17.70    0.80
Meat              7.85    0.45    8.61    0.48     4.46    0.30   10.97    0.61
Oils & fats       13.7    0.61    10.4    0.51     13.5    0.54    7.74    0.50
Sugar              7.2    0.42    19.9    0.66     13.9    0.59    18.0    0.70
Other foods       12.8    0.53    14.1    0.57     10.8    0.48    17.9    0.69

Note: The top panel shows coefficients from a regression of the logarithms of unit values on province and quarter dummies. The figures in brackets are absolute t-values. The bottom panel shows F- and R2-statistics for cluster (village) dummies within each of the four provinces. Regressions and analyses of variance use data from only those households that report expenditures on the good in question.
Source: Deaton and Grimard (1992, Table 3).

The top panel of Table 5.1 shows that interprovincial price differences are much larger than are seasonal differences, though the latter are statistically significant for most of the categories.
Apart from wheat, which is more expensive in Sindh and in NWFP than in Baluchistan, all province means are lower than those in Baluchistan. For oils and fats and for sugar, there is a good deal of state price control, and there is relatively little variation in price across the provinces, and the same holds to a lesser extent for wheat. Rice is very much cheaper in Sindh than elsewhere.

The bottom panel reports, not parameter estimates, but the F-tests and $R^2$-statistics from what can be thought of as a regression of the logarithm of unit value on dummies for each village, of which there are 440 in Punjab, 152 in Sindh, 110 in NWFP, and 55 in Baluchistan, although not all villages appear in each regression since the village is excluded if no household reports that type of expenditure. It is impossible (and unnecessary) to include seasonal dummies in these regressions because all the members of a village are interviewed within a few days of one another in the same calendar quarter. What we are looking for here is evidence that unit values are informative about prices, and since prices should not vary by much within villages over a short period of time, there should be significant F-statistics for the village effects, or put differently, the village dummies should explain a large share of the total variance in the logarithms of the unit values. While it is not clear how much is enough, all the F-statistics are significant at conventional values, and the village effects explain around a half of the total variance. Given that unit values are contaminated by measurement error, and given that there is variation within villages as richer households buy higher-quality goods, the results in the table seem sufficient to justify the suppositions both that there is spatial price variation and that variation in unit values provides a (noisy) guide to it.
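The F- and $R^2$-statistics for village dummies amount to a one-way analysis of variance of log unit values by village. A minimal Python sketch, using invented data and a hypothetical function name, might look as follows; it is not the code used to produce Table 5.1.

```python
def village_anova(log_unit_values, villages):
    """One-way ANOVA of log unit values on village membership.

    Returns (F, R2), where R2 is the share of total variance explained by
    village dummies and F tests the joint significance of the village effects.
    """
    groups = {}
    for value, village in zip(log_unit_values, villages):
        groups.setdefault(village, []).append(value)
    n = len(log_unit_values)
    n_villages = len(groups)
    grand_mean = sum(log_unit_values) / n
    between = 0.0  # between-village sum of squares
    within = 0.0   # within-village sum of squares
    for members in groups.values():
        mean = sum(members) / len(members)
        between += len(members) * (mean - grand_mean) ** 2
        within += sum((v - mean) ** 2 for v in members)
    r2 = between / (between + within)
    f_stat = (between / (n_villages - 1)) / (within / (n - n_villages))
    return f_stat, r2


# Hypothetical data: three villages with two reporting households each.
ln_uv = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
village = ["a", "a", "b", "b", "c", "c"]
f_stat, r2 = village_anova(ln_uv, village)
print("F =", round(f_stat, 2), " R2 =", round(r2, 4))
```

Only households reporting a purchase enter the calculation, so the set of villages can differ from good to good, as noted for the table.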
Figures 5.1 and 5.2 show the behavior of the unit values for rice and for jowar in the Maharashtran data; similar graphs can be drawn for other foods, but these illustrate the point and the commodities are important in their own right. Instead of looking at the unit values at the village level, these diagrams display averages over households by district and quarter, so that each panel shows the seasonal pattern for that district in 1983, while a comparison of different panels reveals the differences in prices from one district to another. As for Pakistan, there is variation in unit values, both across space and over the seasons of the year. Both diagrams show pronounced seasonal patterns that are correlated (although far from perfectly so) between the districts. For rice, the logarithm of the unit value rose on average by 15 percent from the first to the third quarter, and by a further 2 percent between the third and fourth quarters. There is also substantial variation by district, both in levels and in the nature of the seasonal pattern. For example, in Dhule and Jalgaon, the average unit value of rice peaked in the second quarter, at which point the price of rice was nearly a third higher than in Chandrapur, for example. Jowar prices also rise through the year, by 13 percent from the first to the third quarter, falling by 2 percent in the final quarter. Interdistrict variation in jowar prices is much larger than for rice. In the inland eastern districts, Buldana, Akola, Amravati, Yevatmal, Wardha, and Nagpur, which are also the main producing districts, prices are lower throughout the year, by a half or 50 paise per kilogram, than in the coastal districts where no jowar is grown. Of course, these figures come from only the 1983 survey, but the spatial and seasonal patterns appear to be typical of other years.

Figure 5.1. Log unit price of rice by district and subround, Maharashtra, 1983
[Figure: one panel per district (Thane, Kulaba, Ratnagiri, Nasik, Dhule, Jalgaon, Ahmednagar, Pune, Satara, Sangli, Solapur, Kohlapur, Aurangabad, Parbhani, Beed, Nanded, Usmanabad, Buldana, Akola, Amravati, Yevatmal, Wardha, Nagpur, Bhandara, Chandrapur, Jalna, Sindhudurg); horizontal axis: subround 1 to 4; vertical axis: log unit value, 1.0 to 1.4.]
Source: Deaton, Parikh, and Subramanian (1995, Fig. 1).

Figure 5.2. Log unit price of jowar by district and subround, Maharashtra, 1983
[Figure: one panel per district, as in Figure 5.1; horizontal axis: subround 1 to 4; vertical axis: log unit value, 0.1 to 0.9.]
Source: Deaton, Parikh, and Subramanian (1995, Table 2).

For Maharashtra, it is also possible to compare the unit values from the survey with other price information. Government of India (1987) publishes monthly data on market prices in selected locations for a number of foodstuffs and crops. In cases where these prices can be matched by district, the two sets of numbers correspond very closely (see Deaton, Parikh, and Subramanian for details). Although the market price data are not sufficiently complete to serve as an alternative for estimating price responses, they are nevertheless useful as a check on the unit values that are reported by individual households in the NSS survey data.
While the examination of the unit value data cannot establish that they can be treated as if they were prices, the results here indicate that unit values are at the very least useful as an indicator of seasonal and spatial price variation.

Unit values and the choice of quality

One reason why unit values are not the same thing as prices is that unit values are affected by the choice of quality. A kilo of the best steak costs a great deal more than does a kilo of stewing beef, Roquefort costs more than cheddar, and even for relatively homogeneous commodities such as rice, there are many different grades and types. Unit values are computed by dividing expenditures by physical quantities, and high-quality items, or mixtures that have a relatively large share of high-quality items, will have higher unit prices. As a consequence, and in contrast to a market price, over which the consumer has no control, a unit value is chosen, at least to some extent. In particular, since better-off households tend to buy higher-quality goods, unit values will be positively related to incomes or total outlays. Because unit values are choice variables, there is a risk of simultaneity bias in any attempt to use them to "explain" the patterns of demand. Hence, instead of explaining the demand for rice by the price of rice, we are in effect regressing two aspects of the demand for rice on one another, with results that are unlikely to give what we want, which is the response of demand to price.

The size of quality effects can be assessed by running regressions of the logarithm of unit value on the logarithm of total expenditure and the usual list of household demographics and other characteristics.
Such a model is written

(5.18) $\ln v = \alpha^1 + \beta^1 \ln x + \gamma^1 \ln n + \sum_{j=1}^{K-1} \zeta_j\,(n_j/n) + u^1$

where v is the unit value, x is total household expenditure, and the demographic terms are as in equation (4.14), the logarithm of household size and household structure as represented by the ratios to household size of persons in various age and sex categories. Although we might expect income (or total expenditure) to be the most important variable in explaining the choice of quality and thus the unit value, there is no reason to rule out other household characteristics; some groups within the household may choose particular items within a food category, and religious dietary preferences will also affect quality.

Regressions like (5.18) were first examined by Prais and Houthakker (1955), who used a 1937-39 survey of British households to estimate values of $\beta^1$, a parameter that they referred to as the "quality elasticity," or more precisely as the expenditure elasticity of quality. For working-class households, they estimated quality elasticities ranging from 0.02 for onions, cocoa, and lard, to as high as 0.28 for sausages, 0.29 for fresh fish, and 0.33 for cake mixes (Prais and Houthakker 1972, Table 27, p. 124). They also emphasized local variations in the range of varieties; for example, the quality elasticity of cheese in Holland is 0.20 as compared with only 0.05 in Britain (British working-class households in the 1930s ate only cheddar), and the quality elasticity of tea in Bombay was 0.25 as against 0.06 in Britain.

Because unit values will vary not only with the choice of quality, but also with actual market prices, (5.18) should ideally be estimated with price data included on the right-hand side.
This is impossible from the data that we have, but it is nevertheless possible to estimate the nonprice parameters of (5.18) consistently, provided that we are prepared to make the assumption that market prices do not vary within each village over the relevant reporting period. In rural areas in most developing countries, this is a reasonable assumption. Most villages have only a single market, and prices are likely to be much the same for a group of households that are in the same location and surveyed at the same time. Hence, equation (5.18) can be extended to include prices simply by adding dummy variables for each village. With large surveys, there may be several thousand villages, so the regression must usually be run, not by asking the computer to calculate it directly, but by first calculating village means for all variables, and then running a regression using as left- and right-hand-side variables the deviations from village means. By the Frisch-Waugh (1933) theorem, the regression of deviations from village means gives identical parameter estimates to those that would have been obtained from the regression containing the village dummies.

Two practical points are worth noting. First, the removal of village means must be done on exactly the same data that are to be used in the regressions. This seems obvious enough, but in the case of (5.18), where different subsets of households purchase different goods so that the samples with usable unit values are different for each good, the village means for variables such as $\ln x$, which are available for all households, will have to be different for each good. One simple way of checking that the demeaning has been done correctly is to check that the estimated intercept in the demeaned regression is zero, at least up to the accuracy of the machine. Second, the standard errors given by the usual OLS formula will not be correct, although the error will typically be small, and is in any case easily corrected.
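The equivalence between the dummy-variable regression and the regression on deviations from village means, together with the zero-intercept check, can be illustrated with a small self-contained Python sketch. The data are invented, there is a single regressor ($\ln x$) rather than the full list in (5.18), and the helper functions are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def demean_by_group(values, groups):
    """Subtract from each value the mean of its own group (village)."""
    totals, counts = {}, {}
    for v, g in zip(values, groups):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


# Hypothetical data: nine households in three villages.
village = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]
ln_x = [5.0, 5.5, 6.1, 5.2, 5.9, 6.4, 4.8, 5.3, 6.0]
ln_v = [1.00, 1.06, 1.10, 1.31, 1.35, 1.42, 0.80, 0.86, 0.91]

# (1) Regression with a full set of village dummies and no intercept.
names = sorted(set(village))
X_dummies = [[x] + [1.0 if g == name else 0.0 for name in names]
             for x, g in zip(ln_x, village)]
slope_dummies = ols(X_dummies, ln_v)[0]

# (2) Regression of demeaned ln v on an intercept and demeaned ln x.
x_dm = demean_by_group(ln_x, village)
v_dm = demean_by_group(ln_v, village)
intercept_within, slope_within = ols([[1.0, x] for x in x_dm], v_dm)

print("slope with dummies:", slope_dummies)
print("slope from demeaned data:", slope_within)
print("intercept in demeaned regression:", intercept_within)
```

The demeaning here is applied to exactly the households that enter the regression, which is the first practical point made above.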
If there are n households in the survey, grouped into C clusters, the hypothetical regression with all the dummies would have n-k-C-1 degrees of freedom when there are k other regressors (such as those listed in (5.18) above). But the regression using the demeaned data includes all n observations, and has k regressors, so that when the equation standard error is estimated, the computer will divide the sum of squares of the residuals not by n-k-C-1, as it should, but by n-k. The correct standard errors can be calculated by scaling the calculated standard errors by the square root of the ratio of n-k to n-k-C-1, which with (say) 10 households per cluster, amounts to inflating the standard errors by the square root of 1/0.9, or by 5.4 percent. (All of these issues can be avoided by using the "areg" command in STATA, which when used with the option "absorb(cluster)" will estimate the regression with implicit dummies for each cluster.)

Tables 5.2 and 5.3 present parameter estimates of (5.18) for the two data sets using the methodology of the previous paragraph. Only the coefficients on the logarithms of total expenditure and household size are shown, although the regressions contain a full range of compositional and socioeconomic variables in addition to the village dummies. Apart from the case of sugar in urban Pakistan, where the coefficient is zero, all the expenditure elasticities of quality are positive, and most are statistically significant. The heterogeneous meat commodity has the highest quality elasticities, and the elasticity of 0.242 in urban Pakistan is the highest in either table. The important cereal categories (rice, wheat, and coarse cereals) have modest quality elasticities, from 3 to 11 percent. Price control for sugar and oil in Pakistan is reflected in very low quality elasticities, although the estimate of 0.006 for sugar in the rural sector is significantly different from zero.
Table 5.2. Within-village regressions for unit values, Pakistan, 1984-85

                         Rural                 Urban
Food                 ln x      ln n        ln x      ln n
Wheat               0.095    -0.033       0.027    -0.028
                   (14.7)     (4.3)       (7.6)     (6.7)
Rice                0.107    -0.062       0.111    -0.075
                   (12.4)     (6.3)      (16.5)     (9.7)
Dairy products      0.139    -0.031       0.010    -0.023
                    (6.4)     (1.3)       (0.9)     (1.9)
Meat                0.154    -0.099       0.242    -0.156
                   (23.7)    (13.5)      (34.0)    (19.0)
Oils & fats         0.000     0.000       0.003    -0.002
                    (0.4)     (0.0)       (2.6)     (2.1)
Sugar               0.006    -0.004      -0.000    -0.000
                    (4.6)     (2.8)       (0.3)     (0.1)
Other foods         0.076    -0.033       0.129    -0.076
                   (21.2)     (7.9)      (25.9)    (13.1)

Note: The dependent variable is the logarithm of unit value. Included, but not shown, are demographic ratios by age and sex. Absolute values of t-values are shown in brackets.
Source: Deaton and Grimard (1992, Table 4).

Table 5.3. Within-village regressions for unit values, rural Maharashtra, 1983

Food                 ln x    abs t       ln n    abs t
Rice                0.058    (8.8)     -0.041    (5.7)
Wheat               0.047    (6.3)     -0.026    (3.2)
Coarse cereals      0.042    (7.1)     -0.042    (6.5)
Other cereals       0.123    (5.6)     -0.011    (0.5)
Pulses & gram       0.022    (3.9)     -0.026    (4.2)
Dairy products      0.109    (7.2)     -0.034    (2.1)
Oils & fats         0.034    (7.6)     -0.022    (4.4)
Meat                0.179   (11.4)     -0.085    (5.2)
Vegetables          0.037    (7.6)     -0.015    (2.9)
Fruit & nuts        0.097    (3.6)     -0.046    (1.6)
Sugar & gur         0.052   (10.6)     -0.046    (8.7)

Note: Regressions also include demographic ratios by sex and age, as well as dummies for scheduled caste, religion, and worker type.
Source: Deaton, Parikh, and Subramanian (1994, Table 5).

The response of quality to household size is negative, except for the intriguing case of fruit in Maharashtra, where both rural and urban estimates are positive. As was the case for the Engel curves in Chapter 4, the elasticity with respect to household size tends to be large and negative when the elasticity on total expenditure is large and positive, suggesting that increases in household size act like reductions in income.
The estimated coefficients on household size are mostly smaller in absolute size than the coefficients on total expenditure, a result that once again suggests the presence of economies of scale (see the discussion in the final subsection of Section 4.3).

The demographic composition of the household exerts little or no influence on the reported unit values. However, some of the other variables are occasionally important. In Maharashtra, households headed by agricultural laborers buy lower-quality cereals than do others, presumably because of their greater need for calories. Hindu, Jain, Parsi, and Buddhist households spend more per unit of meat than do Muslim or Christian households, not because they get charged more for meat, but because they rarely consume meat, but when they do so, will purchase only the relatively expensive chicken and lamb.

These results, and those from similar regressions in Deaton (1988, 1990a) for Côte d'Ivoire and Indonesia, respectively, show that the quality effects in unit values are real, that they work as expected with better-off households paying more per unit, but that the size of the effects is modest. That they should be so makes good intuitive sense. Although it is true that it is possible to find very expensive and very cheap varieties within any commodity group, most people will buy a range of qualities, so that it would be a large difference if rich households were to spend twice as much (say) per kilo as poor households, even for a heterogeneous commodity like meat. It is a rough but useful rule of thumb that rich households spend about six times as much as poor households, so that we have an upper limit on the quality elasticity of 15 percent, close to what is observed in practice.

Even though the quality effects on unit values are modest, it is wise to be cautious about treating unit values as if they were prices.
Any positive effect of incomes on unit values will cause price responses to be (absolutely) overstated. The argument is straightforward and depends on higher prices causing consumers to shade down quality. Suppose that when there is a price increase, consumers adapt, not only by buying less, but also by buying lower-quality items, thus spreading the consequences of the price increase over more than one dimension. We want to measure the price elasticity of quantity, which in an idealized environment would be calculated as the percentage reduction in quantity divided by the percentage increase in price. However, we observe not the price but the unit value. With quality shading, the unit value increases along with the price, but not by the full amount. As a result, if we divide the percentage decrease in quantity by the percentage increase in unit value, we are dividing by something that is too small, and the result of the calculation will be too large. If the response of quality to income is close to zero, it is plausible that quality shading in response to price increases will also be negligible, but such a result needs to be established more formally, which is one of the objects of the model in the next section.

Measurement error in unit values

Quality effects in unit values are not the only reason why they cannot be treated as prices, and may not even be the most important. Because unit values are derived from reported expenditures and quantities, measurement error in quantity will be transmitted to measurement error in the unit value, inducing a spurious negative correlation. To see how this works, consider the simplest possible model in which the logarithm of demand depends on the logarithm of price, and where there are no quality effects, so that the ratio of expenditure to quantity is price, but expenditure and quantity are each measured with error.
The substantive equation can therefore be written as

(5.19) $\ln q_h = \mu + \epsilon_p \ln p_h + u_h$

where the subscript h denotes a household, and where $\epsilon_p$ is the price elasticity that we wish to estimate. The magnitudes in (5.19) are measured with error, and are not observed; instead, households report

(5.20) $\ln \tilde{p}_h = \ln p_h + v_{1h}$

(5.21) $\ln \tilde{q}_h = \ln q_h + v_{2h}$

for measurement errors $v_{1h}$ and $v_{2h}$, with variances $\sigma_1^2$ and $\sigma_2^2$, and covariance $\sigma_{12}$.

Note that (5.20) and (5.21) do not imply that households either recall or report price and quantity; the two equations are entirely consistent with perhaps the most obvious recall strategy, where households report expenditures and quantities, and the analyst calculates the price. What is important is that the covariance $\sigma_{12}$ be allowed to take on any value. If households recall expenditures correctly, but have difficulty in recalling quantities, then the covariance will be negative, as it will be in the more general case where there are measurement errors in expenditures that are independent of the errors in the quantities. But we cannot be sure how households do their reporting. For example, even if they are asked to report expenditures and quantities, they may calculate the latter from the former by recalling a price. This and other possibilities are consistent with (5.20) and (5.21) provided that the covariance is not restricted.

If the measurement equations (5.20) and (5.21) are used to substitute for the unobservable quantities in (5.19), and the resulting equation is estimated by OLS, then straightforward algebra shows that the probability limit of the estimate of the price elasticity is given by a special case of (2.62)

(5.22) $\operatorname{plim} \hat{\epsilon}_p = \epsilon_p + m^{-1}(\sigma_{12} - \epsilon_p \sigma_1^2)$

where m is the large sample variance of the mismeasured price.
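A small simulation offers one way of checking (5.22) numerically. The sketch below is only illustrative: it assumes normally distributed true log prices and errors, and the parameter values, function names, and seed are all hypothetical choices rather than anything taken from the surveys.

```python
import random


def simulate_ols_slope(eps_p, sd_price, sd_v1, sd_v2, corr, n, seed):
    """OLS slope of mismeasured ln q on mismeasured ln p, as in (5.19)-(5.21)."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        ln_p = rng.gauss(0.0, sd_price)                     # true log price
        ln_q = 1.0 + eps_p * ln_p + rng.gauss(0.0, 0.3)     # (5.19)
        v1 = rng.gauss(0.0, sd_v1)                          # error in log price
        # Error in log quantity, built to have correlation `corr` with v1.
        v2 = sd_v2 * (corr * v1 / sd_v1
                      + (1.0 - corr ** 2) ** 0.5 * rng.gauss(0.0, 1.0))
        xs.append(ln_p + v1)                                # (5.20)
        ys.append(ln_q + v2)                                # (5.21)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def plim_slope(eps_p, sd_price, sd_v1, sd_v2, corr):
    """Probability limit from (5.22)."""
    sigma_12 = corr * sd_v1 * sd_v2
    m = sd_price ** 2 + sd_v1 ** 2   # variance of the mismeasured log price
    return eps_p + (sigma_12 - eps_p * sd_v1 ** 2) / m


# Hypothetical parameter values.
params = dict(eps_p=-0.5, sd_price=0.3, sd_v1=0.2, sd_v2=0.2, corr=-0.8)
estimate = simulate_ols_slope(n=20000, seed=12345, **params)
print("simulated OLS slope:", round(estimate, 3))
print("formula (5.22):     ", round(plim_slope(**params), 3))
```

Varying `corr` between -1 and 0 in this sketch moves the probability limit on either side of the true elasticity.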
The important thing to note about (5.22) is that, even in the "standard" case where the covariance $\sigma_{12}$ is negative, it is not necessarily the case that the spurious negative correlation between mismeasured price and mismeasured quantity will lead to a downward bias. Instead, (5.22) is a generalization of the standard attenuation result. Indeed, if the covariance is zero, (5.22) is exactly the attenuation formula, and the mismeasurement of price has the usual effect of biasing the elasticity towards zero, which, since the elasticity is negative, means an upward bias. Adding in the effect of the covariance will give a downward bias if the covariance is negative, but taking the two effects together, we cannot sign the net effect. Even so, in the special but leading case where the measurement errors in the logarithm of expenditure and the logarithm of unit value are independent, the covariance $\sigma_{12}$ is equal to minus the variance $\sigma_1^2$. If, in addition, $\epsilon_p > -1$, so that demand is price inelastic, the estimate of the price elasticity will be biased downward, with the spurious negative correlation dominating the attenuation bias. More generally, but still on the assumption that $\sigma_{12} = -\sigma_1^2$, the estimated price elasticity will be biased toward $-1$.

In the next section, I propose a methodology that permits consistent estimation of the price elasticities in the presence of both measurement errors and quality contamination of unit values.

5.3 Modeling the choice of quality and quantity

The purpose of this section is to construct an empirical model that can be used with the survey data, including the unit value information, to provide consistent estimates of price elasticities of demand. The basic ideas that underlie the model are straightforward, but the application involves a certain amount of detail, and the final model can appear to be complex if encountered all at once.
To avoid unnecessary complexity, I proceed in stages, introducing a very simple version of the model first, and using it to explain the basic principles. I then introduce the various complications one at a time. For the reader only interested in the empirical results, only the first few subsections need be read, leaving the more technical material to be consulted only by those who wish to use the methods.

A stripped-down model of demand and unit values

In the simplest possible model of demand, the quantities consumed are related to total outlay (or income) and to price. To that basic model, we must add the special features of the current problem, notably that we observe not price but unit value, and that prices do not vary within villages (clusters) but only between them. One version of such a model has a demand function and a unit value function

(5.23)  $\ln q_{hc} = \alpha^0 + \epsilon_x \ln x_{hc} + \epsilon_p \ln \pi_c + f_c + u^0_{hc}$

(5.24)  $\ln v_{hc} = \alpha^1 + \beta^1 \ln x_{hc} + \psi \ln \pi_c + u^1_{hc}$

In these two equations, $h$ is a household (or observation) and $c$ is the cluster or village where it lives. The reported quantity demanded is $q_{hc}$, and recorded unit value is $v_{hc}$. Total outlay is $x_{hc}$ and is the only household-specific variable introduced at this stage; I shall bring in demographics and other household characteristics later. The logarithms of both demand and unit value depend on the logarithm of the price of the good, written here as $\ln \pi_c$; I shall introduce cross-price effects later but, for the moment, $\ln \pi_c$ is a scalar. Price is not observed (so that (5.23) and (5.24) are not standard regressions) and does not vary within clusters; hence the $c$ subscript and the absence of the $h$ subscript. The total expenditure elasticity $\epsilon_x$ and price elasticity $\epsilon_p$ are treated as constants by (5.23), and the latter is the main parameter of interest. The quantity equation has two error terms.
The first, $f_c$, is a village-level effect that is the same for all households in the village, and that should be thought of either as a "random effect" or as a "fixed effect." The choice is largely a matter of taste, but the important point is that the $f_c$, while possibly correlated with household outlays (and with other sociodemographic characteristics), must be uncorrelated with the prices. Since both fixed effects and prices are unobserved, we cannot hope to measure the influence of the latter if they are allowed to be arbitrarily correlated. The reason that the $f_c$ are required at all is clear if we think about average demand over villages, where even with identical village incomes, prices, and other variables that we can measure, we should still expect random variation from one village to another. Individuals within villages will often be more like one another than people in different villages, even when we have controlled for all the obvious determinants of demand.

The second error term in the demand equation, $u^0_{hc}$, is more standard, but it incorporates not only the usual unobservables, but also any measurement error in the logarithms of recorded quantities. It therefore includes the measurement error in (5.21), and will be correlated with the error term in the unit value equation, $u^1_{hc}$, which includes measurement error in unit values as in (5.20). If the number of households in each village is sufficiently large, the village means of $u^0_{hc}$ and $u^1_{hc}$ will be approximately zero, although there is no reason to expect either measurement error or behavioral errors to average to zero for each village in the usual samples where there are only a few households in each.

The unit value equation is essentially the same as (5.18) above although, for expository purposes, I have removed the demographics and have explicitly included the unobservable prices.
The coefficient $\psi$ would be unity if unit values were equal to prices up to measurement error (in which case $\alpha^1$ and $\beta^1$ would be zero), but quality shading in response to prices will make $\psi$ less than unity. Unlike the quantity equation, the unit value equation contains no village fixed effects. Prices certainly vary from village to village, but, conditional on price, unit value depends only on the quality effects and measurement error. The introduction of an additional fixed effect would break the link between prices and unit values, would prevent the latter giving any useful information about the former, and would thus remove any possibility of identification.

In the next two subsections, I present the economic and econometric background for identifying the parameters in (5.23) and (5.24), after which I shall discuss complications. These are concerned with the introduction of the sociodemographics, with allowing for cross-price as well as own-price elasticities, and with the adoption of a more appropriate functional form than is provided by the logarithmic formulation. However, the simple structure avoids notational clutter, and the logarithmic specification is one in which the subsequent analysis is the most transparent.

It is clear from inspection of (5.23) and (5.24) that, without further information, and in the presence of the unobservable price terms, not all of the parameters can be estimated. That any parameters can be identified may seem surprising at first. However, in the previous section we saw how the parameters of total expenditure and the sociodemographics in the unit value equation could be consistently estimated by including cluster dummies on the assumption that prices are constant within villages. This result clearly also extends to the quantity as well as the unit value equation, and will allow estimation of the effect of any variable that is not constant within villages.
The immediate issue is how to identify the coefficients on the price terms. As written in (5.23) and (5.24), the situation looks hopeless. Since we know nothing about prices, there is no way of pinning down either $\epsilon_p$ or $\psi$. However, it is possible to say something about their ratio. In particular, if we use (5.24) to write $\ln \pi_c$ in terms of the logarithm of unit value, total outlay, and the error term, and substitute the result into (5.23), we obtain a linear relationship between the logarithm of quantity, the logarithm of outlay, and the logarithm of unit value, plus various error terms. The coefficient on the logarithm of unit value in this equation is $\psi^{-1}\epsilon_p$, which can be estimated, not by OLS, since the measurement errors will make the unit values correlated with the compound error term, but by (say) instrumental variables provided a suitable instrument can be found. If such is the case, the relationship between the logarithm of quantity and the logarithm of unit value identifies not the price elasticity $\epsilon_p$, but the hybrid parameter $\psi^{-1}\epsilon_p$, an amount that will be larger in absolute value than $\epsilon_p$ if there is quality shading and $\psi$ is less than unity. This is simply the formalization in terms of (5.23) and (5.24) of the general argument given at the end of the previous section, that the existence of quality effects in unit values will tend to bias downward estimates of price elasticities that are obtained by comparing quantities with unit values on the false supposition that the latter can be treated as if they were prices.

To disentangle $\epsilon_p$ from estimates of $\psi^{-1}\epsilon_p$ requires more information, which can be obtained from a model of quality shading. I do this in the next subsection, where I define quality more precisely, and then model it in a way that is specific enough to circumvent the identification problem.
The basic idea is that quality shading in response to price increases is not likely to be very large if the elasticity of quality with respect to income is not very large. More precisely, if the parameters $\beta^1$ in (5.24) are close to zero, as we have already seen they are in practice, then the parameter $\psi$ should be close to unity, so that $\psi^{-1}\epsilon_p$ is close to $\epsilon_p$. If so, we can more or less ignore the quality issue and treat the unit values as prices, except for measurement error. The next subsection can therefore be skipped by readers prepared to take the result on trust and to accept that the quality corrections in the next section are more for completeness than for substance.

Modeling quality

In textbook consumer theory, there are quantities and prices, and their product is expenditure. When quality is introduced into the picture, there are three players: quantity, quality, and price. Expenditure is the product of quantity and unit value, and unit value is part price and part quality. Indeed, we can define quality in such a way that unit value is price multiplied by quality, so that the textbook identity is extended to become: expenditure is equal to the product of quantity, quality, and price.

There are several different ways of thinking about quality, but the one that is most suitable for my purposes is to define quality as a property of commodity aggregates. At the finest possible level of disaggregation, there exist perfectly homogeneous goods: rice defined in terms of purity, variety, and number of broken grains, or a car defined by make, year, and fittings. The actual categories used in the survey are collections or bundles of such goods, rice or cars.
However, two rice bundles with the same physical weight may be made up of different compositions of the varieties, and we can think of the higher-quality bundle as that which contains the higher proportion of relatively expensive varieties, where expensive and cheap are defined relative to some reference price vector.

To make these notions precise, think of a group of goods (rice, or meat) denoted by the subscript $G$. If $q_G$ denotes the household's vector of consumption levels of each item within the group, I define a group quantity index $Q_G$ by

(5.25)  $Q_G = k_G \cdot q_G$

where $k_G$ is a vector used to add together the possibly incommensurate items in the group. If physical weight is a sensible measure, and if the quantities are reported as weights, then each element of $k_G$ will be unity. More generally, the vector can be used, for example, to add together flour, wheat, and bread, in wheat equivalent units, or to measure quantities in terms of calories, in which case $k_G$ would be a vector of calorie conversion coefficients. Corresponding to the quantity vector is a vector of prices, $p_G$, and since each commodity that makes up the vector is perfectly homogeneous by definition, these prices contain no quality effects. They are genuine prices, not unit values. Write this price vector in the form

(5.26)  $p_G = \pi_G p_G^0$

where $\pi_G$ is a scalar measure of the level of prices in the group, and $p_G^0$ is a reference price vector. Equation (5.26) is introduced in order to give us a simple way of changing the level of prices while keeping fixed the structure of prices within the group. When I come to implement the model using data from households in different villages, I shall treat the $\pi$'s as varying from one village to the next, and assume that relative prices are at least approximately constant across villages.
If expenditure on the group $p_G \cdot q_G$ is denoted $x_G$, we can use the two equations (5.25) and (5.26) to give the identity

(5.27)  $x_G = p_G \cdot q_G = Q_G \left( \dfrac{p_G \cdot q_G}{k_G \cdot q_G} \right) = Q_G \pi_G \left( \dfrac{p_G^0 \cdot q_G}{k_G \cdot q_G} \right)$

so that, if we define quality $\xi_G$ by

(5.28)  $\xi_G = \dfrac{p_G^0 \cdot q_G}{k_G \cdot q_G}$

(5.27) implies that expenditure is the product of quantity, price, and quality. In logarithms

(5.29)  $\ln x_G = \ln Q_G + \ln \pi_G + \ln \xi_G$.

The definition of quality in (5.28) is clearly the only one that will allow (5.29) to hold given the definition of group price and group quantity. But it also corresponds to the ideas sketched in the first paragraph of this subsection. Quality is defined as the value of a bundle of goods at fixed reference prices relative to its physical volume. It is a function of the consumption bundle $q_G$, and any change in this bundle towards relatively expensive goods and away from relatively cheap goods will increase the quality of the bundle as a whole.

Having settled definitional matters, we can now address the issue of quality shading in response to price, and whether we can say anything about what happens to quality $\xi_G$ when there is a change in the price $\pi_G$. The definitions allow us to tie the price elasticity of quality to the usual price and quantity elasticities, but without further restriction, there is no straightforward link between the income and price elasticities of quality. However, such a link can be constructed if the goods in $G$ form a separable group in household preferences. Provided the marginal rates of substitution between different goods in the group are independent of quantities consumed outside of the group, there will always exist subgroup demand functions for the goods inside the group in which the only arguments are total group expenditure and within-group prices (see for example Deaton and Muellbauer 1980a, pp. 120-26).
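The definitions in (5.25) to (5.29) can be checked numerically. The sketch below is only an illustration: the three "varieties," their reference prices, the two bundles, and the price level are all hypothetical numbers, and the function name `decompose` is an assumption introduced here. It shows two bundles of equal weight but different quality, and confirms that expenditure equals quantity times price times quality.

```python
import numpy as np

# Hypothetical group of three varieties, quantities measured in kilograms.
k_G = np.array([1.0, 1.0, 1.0])     # aggregation weights: each element is unity
p0_G = np.array([1.0, 1.5, 2.5])    # reference price vector (assumed values)


def decompose(q_G, pi_G):
    """Return group quantity Q_G, quality xi_G, and expenditure x_G."""
    Q_G = k_G @ q_G                  # (5.25): group quantity index
    xi_G = (p0_G @ q_G) / Q_G        # (5.28): value at reference prices per unit
    x_G = (pi_G * p0_G) @ q_G        # expenditure at prices p_G = pi_G * p0_G, (5.26)
    return Q_G, xi_G, x_G


pi_G = 2.0                                   # hypothetical price level
cheap_mix = np.array([6.0, 3.0, 1.0])        # mostly the cheapest variety
dear_mix = np.array([2.0, 3.0, 5.0])         # mostly the dearest variety

Q_a, xi_a, x_a = decompose(cheap_mix, pi_G)
Q_b, xi_b, x_b = decompose(dear_mix, pi_G)

# Both bundles weigh 10 kg; quality is 1.3 and 1.9; unit values are 2.6 and 3.8.
print(Q_a, round(xi_a, 3), round(x_a / Q_a, 3))
print(Q_b, round(xi_b, 3), round(x_b / Q_b, 3))

# Identity (5.27): expenditure = quantity * price * quality.
identity_holds = bool(np.isclose(x_a, Q_a * pi_G * xi_a)
                      and np.isclose(x_b, Q_b * pi_G * xi_b))
```

The unit value of each bundle is the price level multiplied by quality, which is why unit values differ across the two households even though they face the same prices.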
When preferences are separable over a group, overall utility depends on the consumption of goods within the group through a group subutility function, so that maximization of overall utility requires that subutility be maximized subject to whatever is spent on the group. We therefore have subgroup demand functions

(5.30)  $q_G = f_G(x_G, p_G) = f_G(x_G/\pi_G, p_G^0)$

where the last step comes from using (5.26) and the fact that demand functions are homogeneous of degree zero. Equation (5.30) shows that separability delivers the restriction that we need. According to (5.28), and given the reference price vector, quality depends only on the composition of demand within the group. But according to (5.30), the group demands depend only on the ratio of group expenditure to group price, so that changes in group price must work in very much the same way as do changes in total expenditure. This links the price elasticity of quality (shading) to the income elasticity of quality.

By (5.28), $\xi_G$ depends on $p_G^0$, $k_G$, and $q_G$, the first two of which are fixed constants. Thus the effect on quality of changes in price $\pi_G$ works only through $q_G$, which by (5.30), depends only on the ratio of $x_G$ to $\pi_G$. As a result, $\xi_G$ depends only on $\ln x_G - \ln \pi_G$ and not on each separately, so that

(5.31)  $\dfrac{\partial \ln \xi_G}{\partial \ln \pi_G} = \dfrac{\partial \ln \xi_G}{\partial \ln x_G} \left( \dfrac{\partial \ln x_G}{\partial \ln \pi_G} - 1 \right)$

The term in brackets on the right-hand side is the price elasticity of $Q_G$ with respect to $\pi_G$, or $\epsilon_p$, say. The first term is closely related to the Prais-Houthakker (income) elasticity of quality which is $\beta^1$ in (5.24). In particular, by the chain rule,

(5.32)  $\beta^1 = \dfrac{\partial \ln \xi_G}{\partial \ln x} = \dfrac{\partial \ln \xi_G}{\partial \ln x_G} \dfrac{\partial \ln x_G}{\partial \ln x}$

The last term on the right-hand side of (5.32) is the total expenditure elasticity of the group, $\epsilon_x$, say, so that, combining (5.31) and (5.32), we have

(5.33)  $\dfrac{\partial \ln \xi_G}{\partial \ln \pi_G} = \beta^1 \epsilon_p / \epsilon_x$.

The elasticity of unit value with respect to price, or the parameter $\psi$ in (5.24), is one plus the elasticity of quality to price, so that we have finally that

(5.34)  $\psi = 1 + \beta^1 \epsilon_p / \epsilon_x$.

According to (5.33), separability implies that quality shading in response to price is determined by the price, income, and quality elasticities of the commodity group. When prices rise holding relative prices constant, there is a reduction in demand for the group as a whole, and the size of the reduction is controlled by the price elasticity $\epsilon_p$. When less is bought, there is a quality effect whose magnitude depends on the elasticity of quality with respect to expenditure on the group, an elasticity that is the quality elasticity with respect to total expenditure divided by the expenditure elasticity of the group. As a result, there will be no quality shading if either the price elasticity or the quality elasticity is zero. The parameter $\psi$, the elasticity of unit value with respect to price in the unit value equation (5.24), is one if there is no quality shading, and will be less than one by an amount that is larger the larger are the price and quality elasticities. The separability assumption thus provides the basis for quantifying (and correcting) the bias that would arise if quality effects are ignored, and unit values treated as prices.

Because (5.34) provides an additional relationship between the quantity and unit value elasticities $\epsilon_p$ and $\psi$, it can be used to identify the two elasticities separately from estimates of the ratio $\psi^{-1}\epsilon_p$, which as we have seen is all that can be identified from the quantity and unit value data. If we write $\phi$ for the ratio of $\epsilon_p$ to $\psi$, equation (5.34) implies that

(5.35)  $\phi = \dfrac{\epsilon_p}{\psi} = \dfrac{\epsilon_p}{1 + \beta^1 \epsilon_p / \epsilon_x}$

so that, upon rearrangement, we have

(5.36)  $\epsilon_p = \dfrac{\phi}{1 - \phi \beta^1 / \epsilon_x}$

As a result, if we know the ratio $\phi$ and the quantity and quality elasticities $\epsilon_x$ and $\beta^1$, we can use (5.36) to calculate the price elasticity $\epsilon_p$.
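The relationship between (5.34), (5.35), and (5.36) can be verified with a short round trip. The sketch below is illustrative only: the function names and the elasticity values are hypothetical, and the functions do nothing more than transcribe the three formulas. Starting from assumed values of the price, expenditure, and quality elasticities, it computes the unit value elasticity and the identified ratio, then recovers the price elasticity from the ratio.

```python
def unit_value_elasticity(eps_p, eps_x, beta1):
    """psi from (5.34): one plus the price elasticity of quality."""
    return 1.0 + beta1 * eps_p / eps_x


def identified_ratio(eps_p, eps_x, beta1):
    """phi from (5.35): the ratio eps_p / psi."""
    return eps_p / unit_value_elasticity(eps_p, eps_x, beta1)


def quality_corrected_elasticity(phi, eps_x, beta1):
    """eps_p from (5.36), given the ratio phi and the first-stage elasticities."""
    return phi / (1.0 - phi * beta1 / eps_x)


# Hypothetical elasticities, chosen only for illustration.
eps_p, eps_x, beta1 = -0.8, 1.0, 0.1

psi = unit_value_elasticity(eps_p, eps_x, beta1)           # 0.92: some quality shading
phi = identified_ratio(eps_p, eps_x, beta1)                # about -0.870
recovered = quality_corrected_elasticity(phi, eps_x, beta1)

print(round(psi, 3), round(phi, 3), round(recovered, 3))   # prints 0.92 -0.87 -0.8
```

Setting `beta1` to zero makes `psi` equal to one and `phi` equal to `eps_p`, which is the case in which unit values can be treated as prices apart from measurement error.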
Since all of the magnitudes on the right-hand side of (5.36) can be estimated from the data, the separability-based model of quality shading provides the basis for identifying the price elasticity. Equation (5.36) can also be thought of as a correction formula, where the uncorrected (for the effects of quality) price elasticity is $\phi$, and where the correction will be small if $\beta^1$, the total expenditure elasticity of quality, is small. In the next subsection, I show how the data can be used to give consistent estimates of all of the identified parameters up to and including the ratio $\phi$. Once this is done, a final correction is made by applying equation (5.36) to the estimated parameters to obtain an estimate of the price elasticity.

Estimating the stripped-down model

We are now in a position to estimate all of the parameters of the two-equation quantity and unit value model (5.23) and (5.24). There are two stages. The first, which we have already seen, uses within-village information to estimate the total expenditure elasticities, and the second, discussed here, uses between-village information to estimate the price elasticities.

Suppose that the first-stage estimates of $\epsilon_x$ and $\beta^1$ have been obtained. Use these estimates to construct the two variables

(5.37)  $\hat{y}^0_{hc} = \ln q_{hc} - \hat{\epsilon}_x \ln x_{hc}$

(5.38)  $\hat{y}^1_{hc} = \ln v_{hc} - \hat{\beta}^1 \ln x_{hc}$.

Note that these are not the residuals from the first-stage regressions; these regressions contain village dummies, and in (5.37) and (5.38), it is only the effects of total expenditure that are being netted out, and the price information contained in the village dummies must be left in. The residuals from the first-stage regressions play a different role, which is to estimate the variances and covariances of the residuals in the quantity and unit value equations.
Let $e^0$ and $e^1$ be the residuals from the quantity and unit value regressions, respectively, and write $n - k - C - 1$ for the number of degrees of freedom in these regressions; recall $n$ is the total number of observations, $C$ the number of clusters, and $k$ (which is one here) is the number of other right-hand-side variables. Then we can estimate the variance of $u^1_{hc}$ and the covariance of $u^0_{hc}$ and $u^1_{hc}$ using

(5.39)  $\hat{\sigma}^{11} = e^{1\prime} e^1 / (n - k - C - 1), \qquad \hat{\sigma}^{01} = e^{0\prime} e^1 / (n - k - C - 1)$.

We shall shortly see how these estimates are to be used.

Since the second stage uses the between-village information, it begins from (5.37) and (5.38), the quantities and unit values purged of the effects of total expenditure, averaging them by cluster. Denote these averages $\hat{y}^0_c$ and $\hat{y}^1_c$, respectively, the absence of the household subscript $h$ denoting village averages. Corresponding to these estimates are the underlying "true" values $y^0_c$ and $y^1_c$ which we would have obtained had the first-stage parameters been known rather than estimated. From (5.23) and (5.24), these can be written as

(5.40)  $y^0_c = \alpha^0 + \epsilon_p \ln \pi_c + f_c + u^0_c$

(5.41)  $y^1_c = \alpha^1 + \psi \ln \pi_c + u^1_c$

where, once again, the $u$'s have been averaged over villages. For a large enough sample of villages, we would therefore have

(5.42)  $\operatorname{cov}(y^0_c, y^1_c) = \epsilon_p \psi m + \sigma^{01} / n_c$

where $m$ is the large-sample intervillage variance of log prices, and $n_c$ is the number of households per village. The last term comes from noting that $u^0_c$ and $u^1_c$ in (5.40) and (5.41) are averages over the $n_c$ households in each cluster. (When I come to deal with the complications, I shall return to these formulas and to what happens when there are different numbers of households in different clusters, as well as when there are different numbers of observations on quantities and unit values.) Corresponding to (5.42), we can also write the variance of the corrected unit values as

(5.43)  $\operatorname{var}(y^1_c) = \psi^2 m + \sigma^{11} / n_c$.
The comparison of (5.42) and (5.43) yields

(5.44)  $\phi = \dfrac{\epsilon_p}{\psi} = \dfrac{\operatorname{cov}(y^0_c, y^1_c) - \sigma^{01}/n_c}{\operatorname{var}(y^1_c) - \sigma^{11}/n_c}$

so that, by replacing theoretical magnitudes by their first-stage estimates, we can obtain a consistent estimate of the ratio $\psi^{-1}\epsilon_p$. For reference, I record the formula:

(5.45)  $\hat{\phi} = \dfrac{\operatorname{cov}(\hat{y}^0_c, \hat{y}^1_c) - \hat{\sigma}^{01}/n_c}{\operatorname{var}(\hat{y}^1_c) - \hat{\sigma}^{11}/n_c}$

where the terms on the right-hand side come from (5.37), (5.38), and (5.39).

Equations (5.44) and (5.45) look somewhat unfamiliar at first sight, but they are readily related to standard estimators, especially errors-in-variables estimators. Note first that the bivariate OLS estimator of the regression of $y$ on $x$ is the ratio of the covariance of $x$ and $y$ to the variance of $x$. In the absence of $\hat{\sigma}^{01}$ and $\hat{\sigma}^{11}$, the error variances and covariances from the first stage, (5.45) is simply the OLS regression of average village demand on the village average unit value, where both right-hand and left-hand variables have been purged of the effects of household total expenditure. The $\hat{\sigma}^{01}$ and $\hat{\sigma}^{11}$ terms, which would vanish if each village were large enough, are designed to correct for the part of the between-village variances and covariances that comes from measurement and econometric error in the underlying first-stage equations. Indeed, (5.45) is a standard errors-in-variables estimator in which the $X'X$ matrix and $X'y$ vector are purged of the contributions of measurement error (see 2.66). That the estimator is relatively unfamiliar in practice is because it is not feasible in the absence of information about the moments of the measurement error, information that is rarely available, at least in economics. In the current case, because of the two-stage structure of the problem, the information can be obtained from the residuals of the first-stage within-village regressions.

That (5.45) reduces to OLS when the number of households in each village is infinite gives a useful insight into the nature of the estimation procedure.
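The two stages can be sketched on simulated data. Everything in the block below is an assumption made for illustration: the parameter values, the error covariance matrix, the number of villages and households, and the helper names `village_mean` and `within`. The intercepts in (5.23) and (5.24) are set to zero. The first stage is a within-village regression on demeaned data, which is numerically equivalent to including village dummies; the second stage applies (5.45) to the village averages of (5.37) and (5.38).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" parameters and design (not taken from the text).
C, n_c = 3000, 3                         # villages, households per village
eps_x, eps_p, beta1, psi = 1.0, -0.5, 0.1, 0.9
sigma = np.array([[0.5, -0.3],           # var(u0),      cov(u0, u1)
                  [-0.3, 0.4]])          # cov(u0, u1),  var(u1)

n = C * n_c
cluster = np.repeat(np.arange(C), n_c)
ln_pi = rng.normal(0.0, 0.5, C)          # unobserved village log price
f = rng.normal(0.0, 0.2, C)              # village effect, independent of price
ln_x = rng.normal(0.0, 1.0, n) + f[cluster]      # outlay may be correlated with f
u = rng.multivariate_normal([0.0, 0.0], sigma, size=n)

ln_q = eps_x * ln_x + eps_p * ln_pi[cluster] + f[cluster] + u[:, 0]   # (5.23)
ln_v = beta1 * ln_x + psi * ln_pi[cluster] + u[:, 1]                  # (5.24)


def village_mean(a):
    """Average a household-level array within each village."""
    return np.bincount(cluster, weights=a) / n_c


def within(a):
    """Deviation of each household from its village mean."""
    return a - village_mean(a)[cluster]


# First stage: within-village regressions on log outlay.
xw, qw, vw = within(ln_x), within(ln_q), within(ln_v)
eps_x_hat = (xw @ qw) / (xw @ xw)
beta1_hat = (xw @ vw) / (xw @ xw)
e0 = qw - eps_x_hat * xw
e1 = vw - beta1_hat * xw
dof = n - 1 - C - 1                      # n - k - C - 1 with k = 1, as in the text
s11 = (e1 @ e1) / dof                    # (5.39)
s01 = (e0 @ e1) / dof

# Second stage: village averages of (5.37) and (5.38), then (5.45).
y0_c = village_mean(ln_q - eps_x_hat * ln_x)
y1_c = village_mean(ln_v - beta1_hat * ln_x)
m01 = np.cov(y0_c, y1_c)[0, 1]
m11 = np.var(y1_c, ddof=1)
phi_ols = m01 / m11                                   # uncorrected ratio
phi_hat = (m01 - s01 / n_c) / (m11 - s11 / n_c)       # (5.45)
phi_true = eps_p / psi

print(f"true {phi_true:.3f}  uncorrected {phi_ols:.3f}  corrected {phi_hat:.3f}")
```

Under these assumed values the true ratio is about -0.556, while the population counterpart of the uncorrected between-village ratio is roughly -0.63, so the correction matters when there are only three households per village. The printed estimates will differ somewhat from these population values because of sampling variation.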
The estimator is designed to give a consistent estimate of the ratio $\phi$ as the number of villages tends to infinity, holding fixed the number of households in each village. Although the first stage of estimation uses only the within-village information, it pools such data over many villages so that, provided the fixed effects are the only form of village heterogeneity, the first-stage parameters will converge to their true values as the number of villages increases, even if there are only a few households per village. But even if the first-stage parameters were known, so that the "hats" in (5.45) could be removed, the correction for the first-stage errors would still have to be made if there are not enough households in each village to average them away. Apart from this correction, (5.45) is simply a least squares estimate of the price elasticity using the variation in quantities and prices across the villages.

In the case where the $y$'s and $\sigma$'s are known, the asymptotic variance of $\hat{\phi}$ is given by the delta method, or from Fuller (1987, p. 108)

(5.46)  $v(\hat{\phi}) = C^{-1} \left( m^{11} - \sigma^{11}/n_c \right)^{-2} \left[ \mu^{0} m^{11} + \left( m^{01} - \phi m^{11} \right)^2 \right]$

where

(5.47)  $\mu^{0} = m^{00} - 2 m^{01} \phi + m^{11} \phi^2$

and $m^{00}$, $m^{11}$, and $m^{01}$ are the variance of $y^0$, the variance of $y^1$, and the covariance of $y^0$ and $y^1$, respectively. This formula will generally understate the variance because it ignores the estimation uncertainty from the first-stage estimates. The formula given by Fuller also covers the case where the $\sigma$'s are estimated, but it does not allow for the fact that the $y$'s are also estimated here, nor for the resulting lack of independence between the estimates of the $y$'s and the $\sigma$'s. But because the first-stage estimates are based on many more data points than the second-stage estimates (there are more households than there are villages), the contribution to the variance of the estimate from the first stage is small relative to that captured by (5.46) and (5.47).
As a result, these formulas will often be adequate without further modification. Variance estimates can also be obtained by bootstrapping, which I shall use for the final estimates below.

An example from Côte d'Ivoire

Although the methodology outlined here requires modification before use, it is useful to work through a simple example of this version of the model to see how the numbers fit into the formulas. The data come from the 1979 Enquête Budget Consommation from Côte d'Ivoire, and the full set of calculations is reported in Deaton (1987a). I illustrate with the example of beef using the numbers in Table 5.4.

The top panel, of "crude" elasticities, shows the results of yielding to the temptation of regressing the logarithm of quantity on the logarithm of unit value and other variables. With no allowance for taste variation across households, the regression generates an estimate of the "price elasticity" of -0.560, which is robust to the inclusion of broad regional and seasonal dummies, but increases in absolute value to -0.796 when cluster-level and seasonal dummies are included. In these data, clusters were visited on multiple occasions, so that it is possible to include both seasonal and village effects, or, as in the last row of the top panel, to interact the village and quarterly dummies so as to calculate the within-estimator, which has the value of -0.940. If it is true that there is no genuine price variation within villages at a given visit, and if unit values were truly prices and were measured without error, the within-estimator would not be identified. As it is, the model interprets this estimate as the covariance-to-variance ratio of the error terms in the quantity and unit value equations, a quantity that has nothing to do with the price elasticity.

The second panel, labeled "improved elasticities," follows the calculations of this subsection.
The intervillage covariance of the village averages of corrected log unit value and corrected log quantity is -0.083, so that compared to the intervillage variance of the former of 0.138, we would get an intervillage OLS estimate of the price elasticity of -0.600. According to the model, this estimate is contaminated by the effects of the measurement error because the cluster size is insufficiently large to average it away. Indeed, the average cluster size in this case is only 1.97 households (see the next subsection and equation (5.55) below for an explanation of how this is measured), and the estimates of the error covariance and variance are -0.049 and 0.052, respectively. The sign of the covariance confirms what might be expected, that measurement errors in quantity and unit value are negatively correlated. Correction for measurement errors using (5.45) leads to an estimate of the price elasticity of -0.512 with an asymptotic t-value of 2.5. The estimated expenditure elasticity of quality for beef is only 0.025, so that the final quality correction (5.36) exerts little further change, leaving us with a final estimate of -0.504 with an asymptotic t-value of 2.6. In this example, the final estimate is close to the initial estimate, which is an accident of these particular data; in general, the "crude" elasticity estimates will not be consistent.

Table 5.4. Estimates of the price elasticity of beef, Côte d'Ivoire, 1979

                                                   Estimate    t-value
"Crude" elasticities
  No taste variation                                 -0.560      (5.0)
  Regional and seasonal taste variation              -0.547      (4.4)
  Village and seasonal taste variation               -0.796      (4.6)
  Within-estimator $\sigma^{01}/\sigma^{11}$         -0.940
"Improved" elasticities
  Intervillage covariance                            -0.083
  Intervillage price variance                         0.138
  Estimate of $\sigma^{01}$                          -0.049
  Estimate of $\sigma^{11}$                           0.052
  "Average" village size                              1.97
  Covariance to variance ratio                       -0.600
  With measurement error correction                  -0.512      (2.5)
  With quality correction                            -0.504      (2.6)

Source: Deaton (1987a).
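The arithmetic in the "improved" panel of Table 5.4 can be retraced from the rounded entries in the table. The sketch below simply plugs those published numbers into the covariance-to-variance ratio and into (5.45); the variable names are chosen here for illustration. Because the inputs are rounded to two or three figures, and because the original calculation may differ in detail (for example, in how the "average" village size enters), the results should be expected to match the table only approximately. The final quality correction (5.36) is not reproduced, since it needs the expenditure elasticity of beef, which is not reported in this passage.

```python
# Rounded entries from Table 5.4 (beef, Cote d'Ivoire, 1979).
cov_between = -0.083     # intervillage covariance
var_between = 0.138      # intervillage variance of corrected log unit value
sigma01_hat = -0.049     # estimated error covariance
sigma11_hat = 0.052      # estimated error variance of the unit value equation
n_c = 1.97               # "average" village size

# Ratio of the error covariance to the error variance (within-estimator row).
within_ratio = sigma01_hat / sigma11_hat

# Uncorrected between-village covariance-to-variance ratio.
phi_ols = cov_between / var_between

# Measurement-error correction, equation (5.45).
phi_corrected = (cov_between - sigma01_hat / n_c) / (var_between - sigma11_hat / n_c)

print(f"{within_ratio:.3f} {phi_ols:.3f} {phi_corrected:.3f}")  # prints -0.942 -0.601 -0.521
```

The first two figures agree with the table's -0.940 and -0.600 up to rounding. The corrected ratio computed from the rounded inputs is about -0.52, close to but not identical with the reported -0.512.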
*Functional form

The logarithmic model (5.23) and (5.24) is analytically tractable and has the advantage that its parameters are elasticities which provide a simple, convenient, and dimensionless way to measure price responses. However, not all households consume all the goods, and because we cannot take the logarithm of zero, the logarithmic model can only be used to describe the behavior of those households who purchase positive amounts. For narrowly defined commodities, such a restriction can eliminate a large fraction of the sample, and when we come to extend the model to many goods, the restriction that all households purchase all the goods can result in an unacceptable loss of households from the sample. The fundamental issue here is whether we wish to model demand conditional on making purchases, or whether we want an unconditional formulation, covering nonconsumers as well as consumers. For some purposes, the former would be correct, for example if the factors leading to purchase are different from the factors that determine the amount once the decision to purchase has been taken. However, it is equally clear that for purposes of tax and price reform, we need to include all households, whether they purchase or not. We are ultimately trying to calculate the change in government revenue associated with a change in price, see (5.5), and the derivative in the denominator of (5.9) is a sum over all households, whether they make purchases or not. The model must therefore be one that includes all households, which is not possible for the logarithmic formulation.

A simple alternative is provided by modifying (5.23) to make the dependent variable the budget share rather than the logarithm of quantity consumed. This is Working's model that I used as an Engel curve in Section 4.2, although now with the addition of price terms.
If we make the change to (5.23) and at the same time add a vector of demographic and other variables $z_{hc}$, we can write

(5.48)  $w_{hc} = \alpha^0 + \beta^0 \ln x_{hc} + \gamma^0 \cdot z_{hc} + \theta \ln \pi_c + f_c + u^0_{hc}$

where $w_{hc}$ is the share of the good in the budget of household $h$ in cluster $c$. This equation is paired with the unit value equation (5.24), which need only be supplemented with demographic terms to read

(5.49)  $\ln v_{hc} = \alpha^1 + \beta^1 \ln x_{hc} + \gamma^1 \cdot z_{hc} + \psi \ln \pi_c + u^1_{hc}$.

Although the same right-hand-side variables appear in (5.48) as in the logarithmic model (5.23), their coefficients are no longer elasticities, so that the model needs reinterpretation and the quality correction formulas must also be reworked. I shall turn to this below, but first it is worth thinking briefly about what kind of demand function is represented by (5.48) and what, if any, is its relationship to standard demand theory.

At a superficial level, (5.48) is closely related to standard demand models like the "almost ideal demand system" in which the budget share is a linear function of the logarithms of total expenditure and prices. However, the analogy is at best incomplete. First, this is a model where utility depends, not only on quantity, but also on quality, so that, at the very least, it has to be shown that (5.48) is consistent with the appropriately augmented utility theory. I shall examine this question below, but even when the model is utility-consistent, the fact may be largely irrelevant in practice. Standard empirical demand systems, such as the almost ideal system, are derived under the assumption that all goods are purchased, so that the budget shares are strictly positive. While choice theory certainly permits "corner solutions" in which nothing is bought of some goods, the formal incorporation of corners into tractable empirical demand functions remains a largely unsolved problem.
For a one-good model, the presence of corners generates a Tobit model which, as we have seen in Chapter 2, is problematic enough in practical situations. However, when several goods are considered at once, as will be the case in the real applications below, there are a large number of possible regimes depending on which combination of goods is being bought. As a result, equation (5.48) can at best be thought of as a linear approximation to the regression function of the budget share conditional on the right-hand-side variables, averaging over zero and nonzero purchases alike. While this may be more or less satisfactory for the purpose at hand, the modeling of average behavior, the averaging will remove any direct link between (5.48) and utility theory. In this respect, the model is on much the same footing as an aggregate demand system where the averaging over agents almost never permits an interpretation in terms of a representative agent.

Consider now what (5.48) and (5.49) imply for the elasticities and how the estimation of the previous section must be modified to deal with the new functional form. In (5.48) the elasticity of the budget share with respect to total expenditure is $\beta^0/w$, but since the budget share is unit value times quantity divided by total expenditure, we have

(5.50)  $\varepsilon_x + \beta^1 = (\beta^0/w) + 1$

where $\beta^1$ is, as before, the total expenditure elasticity of quality from the (unchanged) unit value equation (5.24) or (5.49). Similarly, for the price elasticity $\varepsilon_p$ we have

(5.51)  $\varepsilon_p + \psi = (\theta/w)$.

Because both total expenditure and price elasticities depend on the budget share, neither will be constant but will vary from household to household.

The two-stage estimation technique works in the same general way as before. At the first stage, the budget shares and the logarithms of the unit values are each regressed on the household demographics and the logarithms of total expenditure.
The "corrected" $\hat y^0$ and $\hat y^1$ are formed as before, although the former is now a corrected budget share rather than a log quantity. As before, the residuals from the first-stage regressions are used to estimate the covariance and variance $\sigma^{01}$ and $\sigma^{11}$, and the results used to calculate $\tilde\phi$ in (5.45), although given the budget share equation (5.48), what is being estimated is now the ratio $\psi^{-1}\theta$. Again as before, we use the simple quality theory to allow the separate recovery of both parameters, although the formulas are now different. The theoretical relationship between the quality, price and expenditure elasticities in (5.34) gives, on substitution from (5.50) and (5.51) and rearrangement,

(5.52)  $\psi = 1 - \dfrac{\beta^1 (w - \theta)}{\beta^0 + w}$.

Given $\phi = \psi^{-1}\theta$ and (5.52), $\theta$ can therefore be recovered from

(5.53)  $\theta = \dfrac{\phi}{1 + (w - \phi)\zeta}$

where

(5.54)  $\zeta = \dfrac{\beta^1}{\beta^0 + w(1 - \beta^1)}$.

If $\beta^1$ is small, $\zeta$ will be small, and so will be the correction to $\theta$ in (5.53); when the income elasticity of quality is small, there will be little shading of unit value in response to price.

With the parameters recovered, the elasticities can be calculated from (5.50) and (5.51). Equations (5.46) and (5.47) remain correct for the variance of $\tilde\phi$, and the variances and covariances of the other parameters can be obtained using the delta method (see Deaton 1988 for the formulas).

One other practical detail is worth noting. Since not every household reports a purchase, and since unit values are only reported for those who do, there will be unequal numbers of observations for the budget share equation (5.48) and the unit value equation (5.49). Suppose that in cluster $c$ there are $n_c$ households of which $n_c^+$ record purchases. At the first stage of the estimation, the imbalance in numbers makes no difference.
At the second stage, however, when the corrected budget shares and unit values are averaged by cluster, there are different numbers in the budget share average than in the unit value average, and the numbers will differ from village to village. The difference is important when we compute the variances and covariance of the averaged residuals $\bar u^0_c$ and $\bar u^1_c$. Since the former is averaged over $n_c$ observations, its variance is $\sigma^{00}/n_c$, while the variance for the latter is clearly $\sigma^{11}/n_c^+$. Straightforward algebra shows that the covariance is $\sigma^{01}/n_c$. As a result, the $n_c$ in equations (5.42) through (5.46) must be replaced by $\tau$ where the covariance is being corrected and by $\tau^+$ where the unit value variance is being corrected, where

(5.55)  $\tau^{-1} = \lim_{C \to \infty} C^{-1} \sum_c n_c^{-1}; \qquad (\tau^+)^{-1} = \lim_{C \to \infty} C^{-1} \sum_c (n_c^+)^{-1}$.

In practice, these magnitudes are unknown, but estimates of $\tau^{-1}$ and $(\tau^+)^{-1}$ can be calculated from the averages in (5.55) ignoring the limits.

Rewriting the model in terms of budget shares, (5.48) and (5.49), takes us close to what we require for implementation. What remains is an exploration of the welfare foundations of the model, of how quality and quantity interact in utility. This analysis will also show how to embed the demand for each good in a system of demand equations. Although the general model is the one that should be used in practice, the simpler case in this subsection is useful for investigating practical econometric issues. In Deaton (1990a), I report a number of Monte Carlo experiments using (5.48) and (5.49). The experiments yield estimates that are correctly centered, in contrast to estimates that do not allow for village fixed effects, or that ignore measurement error. As usual, the price of consistency is an increase in variance. However, the experiments show that the estimator performs adequately even when there are as few as two observations in each village. Increasing the number of villages or clusters is much more important than increasing the number of observations in each.
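Two steps of the budget-share procedure described above are easy to get wrong by hand: the effective cluster sizes in (5.55) and the recovery of $\theta$ from the estimated ratio $\phi$ in (5.52) to (5.54). The sketch below is illustrative only. The function names and all numerical inputs are hypothetical, and it assumes that every cluster contains at least one purchaser so that $1/n_c^+$ is defined.

```python
def effective_sizes(n_all, n_buy):
    """Sample analogue of (5.55): tau and tau_plus are harmonic means of the cluster counts.

    n_all[c] is the number of households in cluster c, n_buy[c] the number recording
    purchases (assumed to be at least one in every cluster).
    """
    C = len(n_all)
    inv_tau = sum(1.0 / n for n in n_all) / C
    inv_tau_plus = sum(1.0 / n for n in n_buy) / C
    return 1.0 / inv_tau, 1.0 / inv_tau_plus


def recover_theta(phi, beta0, beta1, w):
    """Apply (5.54), then (5.53), then (5.52) to split phi = theta / psi into its parts."""
    zeta = beta1 / (beta0 + w * (1.0 - beta1))          # (5.54)
    theta = phi / (1.0 + (w - phi) * zeta)              # (5.53)
    psi = 1.0 - beta1 * (w - theta) / (beta0 + w)       # (5.52)
    return theta, psi, zeta


def elasticities(theta, psi, beta0, beta1, w):
    """Total expenditure and price elasticities from (5.50) and (5.51)."""
    e_x = beta0 / w + 1.0 - beta1
    e_p = theta / w - psi
    return e_x, e_p


# Hypothetical inputs, chosen only to exercise the formulas.
tau, tau_plus = effective_sizes(n_all=[4, 2, 5, 3], n_buy=[2, 1, 3, 1])
theta, psi, zeta = recover_theta(phi=-0.05, beta0=-0.02, beta1=0.10, w=0.10)
e_x, e_p = elasticities(theta, psi, beta0=-0.02, beta1=0.10, w=0.10)
```

A useful check on any implementation is that the recovered pair satisfies `theta / psi == phi` up to rounding, since (5.53) is just (5.52) solved for $\theta$.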
The asymptotic variance formulas worked well and accurately predicted the variance across experiments.

*Quality, quantity, and welfare: cross-price effects

In this subsection, I show how quality and quantity can be interpreted within the standard model of utility maximization. At the same time, we shall see how to incorporate cross-price effects. For price and tax reform, where subsidies and taxes are frequently levied on closely related goods, some knowledge of cross-price effects is essential, and as I argued in Section 5.1, it is one of the main advantages of household survey data that they provide some hope of estimating these effects. The extension of the model from a single equation to a system of demand equations is conceptually straightforward, and consists of little more than adding additional price terms to the budget share equation (5.48). Even so, there are some theoretical issues that must also be taken care of, particularly the extension of the model of quality under separability to the multigood case, so that we can handle the effects of changes in the price of one good on the quality choice of another. These theoretical issues are the topic of this subsection. In the econometric analysis, which is discussed in the next subsection, there is a good deal of what is essentially book-keeping, and the resulting algebra is a good deal more complex than that required so far. Readers who are willing to take this material on trust can safely omit the details and move on to Section 5.4 where the empirical results are presented.

The theoretical basis for a multigood model with quality effects is a utility function that is separable in each of $M$ commodity groups. Write this as

(5.56)  $u = V[\upsilon_1(q_1), \ldots, \upsilon_G(q_G), \ldots, \upsilon_M(q_M)]$

where each $q_G$ is a vector of goods, each element of which is perfectly homogeneous.
The group utility functions, which are sometimes referred to as subutility or felicity functions, have all the properties of the usual utility functions, and I shall denote the value of each by $u_G$. Since overall utility is increasing in each of the subutilities, and since consumption in each group, $q_G$, affects overall utility only in so far as it affects group subutility, the consumer will maximize each $\upsilon_G(q_G)$ subject to the amount spent on the group, an amount that I denote by $x_G = p_G \cdot q_G$. For this subutility maximization problem, define the cost function $c_G(u_G, p_G)$, the minimum amount of money needed to attain $u_G$ at prices $p_G$. For a utility-maximizing consumer, money is spent efficiently, so that

(5.57)  $x_G = c_G(u_G, p_G)$.

I can now set up the definitions of quality and quantity. Physical quantity is defined as before, equation (5.25) repeated here for convenience,

(5.58)  $Q_G = k_G \cdot q_G$.

The utility from the consumption of group $G$ is conveniently measured using "money-metric utility," the minimum cost needed to reach $u_G$ at a reference price vector $p_G^0$. Define quality $\xi_G$ implicitly by the identity

(5.59)  $c_G(u_G, p_G^0) = \xi_G Q_G$.

According to (5.59), the utility from group $G$ consumption can be expressed as the product of quality and quantity; in the special case where there is only one way of attaining utility $u_G$, which is to buy the quantity $q_G$, (5.59) reduces to (5.28). We also define the price index $\pi_G$ as the ratio of costs at actual prices to costs at the reference prices,

(5.60)  $\pi_G = \dfrac{c_G(u_G, p_G)}{c_G(u_G, p_G^0)}$

so that, as before, we have

(5.61)  $x_G = \pi_G \xi_G Q_G$.

Expenditure is the product of quantity, quality, and price. In general, the price indexes, $\pi_G$, are functions of the utility levels $u_G$; this will not be so when preferences are homothetic within the group, in which case the pattern of demand in the group is independent of the level of living, and there are no quality effects.
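The decomposition in (5.58) to (5.61) can be illustrated numerically in the simplest case, where prices within the group are proportional to the reference prices. In that case the bundle bought at actual prices is also the cost-minimizing bundle at reference prices, so the cost at reference prices is just $p_G^0 \cdot q_G$. The sketch below rests on that assumption; the two varieties, their prices, and the quantities are hypothetical.

```python
import numpy as np

p0 = np.array([10.0, 25.0])   # hypothetical reference prices for two varieties in group G
q = np.array([3.0, 1.0])      # hypothetical quantities bought, in kilos
k = np.ones_like(q)           # aggregation weights in (5.58): here, simply add up kilos
pi_c = 1.2                    # hypothetical price level, so that p = pi_c * p0

p = pi_c * p0                 # actual prices, proportional to reference prices
x = p @ q                     # group expenditure x_G
Q = k @ q                     # physical quantity Q_G, as in (5.58)
xi = (p0 @ q) / Q             # quality: cost at reference prices per unit of quantity, (5.59)
v = x / Q                     # unit value: expenditure divided by physical quantity

print(f"expenditure {x:.2f} = price {pi_c:.2f} * quality {xi:.2f} * quantity {Q:.2f}")
print(f"unit value  {v:.2f} = price {pi_c:.2f} * quality {xi:.2f}")
```

The point of the example is that the unit value moves one for one with the price level only if quality is held fixed; a household that responds to a higher price level by shifting toward the cheaper variety lowers $\xi_G$, so that the unit value rises by less than the price.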
In this formulation, group utility $u_G$ is a monotone increasing function of the product of quality and quantity, (5.59), so that we can write overall utility as

(5.62)  $u = V^*(\xi_1 Q_1, \ldots, \xi_G Q_G, \ldots, \xi_M Q_M)$

which, given (5.61), is maximized subject to the budget constraint

(5.63)  $\sum_{G=1}^{M} \pi_G \xi_G Q_G = x$.

This is a standard-form utility maximization problem, and will have standard demand functions as solutions. I write these

(5.64)  $\xi_G Q_G = g_G(x, \pi_1, \ldots, \pi_G, \ldots, \pi_M)$

which will satisfy all of the standard properties of demand functions: homogeneity, Slutsky symmetry, and negative definiteness of the Slutsky matrix. Provided we treat the product of quality and quantity as the object of preference, the standard apparatus of demand analysis applies as usual.

The demand functions (5.64) are somewhat more complicated than they look. The prices of the composite goods, $\pi_G$, defined by (5.60), are each a function of $u_G$, or equivalently, of $\xi_G Q_G$, so that (5.64) is not an explicit set of demand functions, but a set of equations that has to hold when choices are made optimally. The dependence of the $\pi$'s on the $u$'s vanishes under the conditions of the Hicks composite commodity theorem which, in the current context, means that the relative prices of the goods within each group are the same in each cluster. If the constant of proportionality in cluster $c$ is $\pi_c^*$, so that the price vector in the cluster is $p_{Gc} = \pi_c^* p_G^0$, (5.60) gives $\pi_{Gc} = \pi_c^*$, which is independent of $u_G$. In practice, we will not usually want to assume the constancy of relative prices (in areas by water, fish will be cheap, while in arid areas, fish will be dear) and we need only the higher-level assumption that the dependence of $\pi_G$ on $u_G$ is negligible, for which constant relative prices is a sufficient, but not necessary, condition.

For the empirical work, we draw from the literature and adopt a standard flexible functional form for the demand system (5.64).
One suitable form is the almost ideal demand system of Deaton and Muellbauer (1980b), whereby (5.64) takes the form

(5.65)  $w_G = \alpha_G^0 + \beta_G^0 \ln(x/\pi) + \sum_{H=1}^{M} \theta^*_{GH} \ln \pi_H$

where (the unsubscripted) $\pi$ is a linearly homogeneous price index formed from all the prices (see Deaton and Muellbauer for the theoretical definition). Here I follow their practical suggestion of using a Stone index for $\pi$ whereby

(5.66)  $\ln \pi = \sum_{H=1}^{M} \bar w_H \ln \pi_H$

and $\bar w_H$ is the sample average of the budget share of $H$. If (5.66) is substituted into (5.65), we have

(5.67)  $w_G = \alpha_G^0 + \beta_G^0 \ln x + \sum_{H=1}^{M} \theta_{GH} \ln \pi_H = \alpha_G^0 + \beta_G^0 \ln x + \sum_{H=1}^{M} (\theta^*_{GH} - \beta_G^0 \bar w_H) \ln \pi_H$.

The unit values $v_G = \xi_G \pi_G$ are themselves functions of total expenditure and the prices, and for lack of any theoretical guidance on functional form, I use the generalization of (5.49),

(5.68)  $\ln v_G = \alpha_G^1 + \beta_G^1 \ln x + \sum_{H=1}^{M} \psi_{GH} \ln \pi_H$

where the original scalar quality shading elasticity $\psi$ is replaced by a matrix of elasticities $\Psi$ with typical element $\psi_{GH}$. In the absence of quality shading, unit value would be equal to price, the $\Psi$ matrix would be the identity matrix, and $\alpha^1$ and $\beta^1$ would be zero.

Because the budget shares add to unity, the vector $\alpha^0$ in (5.65) must sum to unity, while the vector $\beta^0$, and each column of the $\Theta^*$-matrix, must add to zero. The system should also be homogeneous of degree zero in total expenditure and prices; a doubling of both should double unit values and leave the budget shares unchanged. This will occur if, and only if, the rows of $\Theta^*$ add to zero, or equivalently that, for all $G$,

(5.69)  $\sum_H \theta_{GH} + \beta_G^0 = 0$.

Similarly, linear homogeneity of the unit value equation (5.68) requires that, for all groups $G$,

(5.70)  $\sum_H \psi_{GH} + \beta_G^1 = 1$.

The adding-up and homogeneity restrictions enable us to "complete" the demand system by adding another commodity defined as "all other goods," and to infer its own- and cross-price elasticities from the adding-up and homogeneity restrictions.
When using unit values to indicate prices, which is the methodological basis for this work, we will never obtain prices for a complete enumeration of consumers' expenditures, since there are many goods that do not have obvious quantity units, and for which only expenditure data are collected. For these goods, we can do no better than an "other goods" category.

The Slutsky (or substitution) matrix of the demand system will be symmetric if, and only if, $\theta^*_{GH} = \theta^*_{HG}$, or equivalently,

(5.71)  $\theta_{GH} + \beta_G^0 \bar w_H = \theta_{HG} + \beta_H^0 \bar w_G$.

It is common in demand analysis to use the symmetry restrictions to add to the precision of the parameter estimates. In the context of price reform it might be thought that such considerations would carry less weight, partly because there are so many observations in the survey data, and partly because the existence of corner solutions (zero purchases) for many households precludes a clean relationship between the theory and its empirical application. Unless we build different models for the different "regimes" comprising different combinations of purchased goods, and then link these regimes within a general statistical model of selection, we cannot follow the parameters of the theory through into the empirical specification. As a result, even though the utility restrictions hold for a single agent who buys all of the goods, they will not hold in the survey data.

However, there are also some arguments for using the restrictions. First, although there are many observations, there is not always a great deal of price variation. In the Pakistani data, there are at least partially successful controls on the prices of sugar and of edible oil, so that even with nearly 10,000 observations, it will be difficult to estimate price elasticities involving these goods.
Indeed, the interest in price reform is often at its most intense in countries with a history of price control, and we are frequently faced with using survey data with little price variation to try to predict what will happen when markets are liberalized. Second, not all of the restrictions from the theory have the same status, and we might be much more willing to impose absence of money illusion on the empirical demand functions than to impose Slutsky symmetry. Unfortunately, it is the latter that is likely to be the most helpful when we are trying to infer the price elasticities involving a controlled good. In such a situation, we need restrictions from somewhere, and we might use the utility restrictions for want of better ones, just as symmetry and homogeneity are frequently imposed on aggregate data without rigorous justification.

One task remains, to generalize the quality model so as to link the matrices $\Theta$ and $\Psi$, so that they can be separately identified. The multigood treatment of this section permits a direct generalization of equation (5.34). From the definitions of quality and quantity, equations (5.58) and (5.59), we have

(5.72)  $\ln \xi_G = \ln c_G(u_G, p_G^0) - \ln(k_G \cdot q_G)$.

As before, note that $q_G = f_G(x_G/\pi_G, p_G^0)$, so that (5.31) generalizes to

(5.73)  $\dfrac{\partial \ln \xi_G}{\partial \ln \pi_H} = \dfrac{\partial \ln \xi_G}{\partial \ln x_G} \left( \dfrac{\partial \ln x_G}{\partial \ln \pi_H} - \delta_{GH} \right)$.

Equation (5.32) remains unchanged, so that

(5.74)  $\dfrac{\partial \ln \xi_G}{\partial \ln \pi_H} = \dfrac{\beta_G^1 \varepsilon_{GH}}{\varepsilon_G}$.

In consequence, we have

(5.75)  $\psi_{GH} = \dfrac{\partial \ln v_G}{\partial \ln \pi_H} = \delta_{GH} + \dfrac{\beta_G^1 \varepsilon_{GH}}{\varepsilon_G}$

where $\delta_{GH}$ is the Kronecker delta, equal to unity when $G = H$ and zero otherwise. When $G = H$, (5.75) and (5.34) are identical. Otherwise, when the price of one commodity increases, the effects on quality of another are controlled by how much the price increase affects quantity, i.e. by the cross-price elasticity.

*Cross-price effects: estimation

A good place to begin is with a final statement of the two equations for the budget share and for the logarithm of unit value.
With demographics, fixed effects, and errors included, (5.67) and (5.68) become, for good $G$, household $h$, in cluster $c$,

(5.76)  $w_{Ghc} = \alpha_G^0 + \beta_G^0 \ln x_{hc} + \gamma_G^0 \cdot z_{hc} + \sum_{H=1}^{M} \theta_{GH} \ln \pi_{Hc} + (f_{Gc} + u^0_{Ghc})$

(5.77)  $\ln v_{Ghc} = \alpha_G^1 + \beta_G^1 \ln x_{hc} + \gamma_G^1 \cdot z_{hc} + \sum_{H=1}^{M} \psi_{GH} \ln \pi_{Hc} + u^1_{Ghc}$

and our task is to estimate the parameters. The first stage is exactly as before; the budget shares, the logarithms of the unit values, the $z$'s, and the logarithms of the $x$'s are demeaned by their cluster means, and the two regressions run using the demeaned variables. Because all of the prices are constant within clusters, the demeaning removes all the prices as well as the fixed effects and so allows consistent estimation of the $\alpha$'s, $\beta$'s, and $\gamma$'s. The first-stage estimates are used to compute the village averages

(5.78)  $\hat y^0_{Gc} = n_c^{-1} \sum_{i \in c} (w_{Gic} - \tilde\beta_G^0 \ln x_{ic} - \tilde\gamma_G^0 \cdot z_{ic})$

(5.79)  $\hat y^1_{Gc} = (n^+_{Gc})^{-1} \sum_{i \in c} (\ln v_{Gic} - \tilde\beta_G^1 \ln x_{ic} - \tilde\gamma_G^1 \cdot z_{ic})$

where $n_c$ is the number of households in cluster $c$, $n^+_{Gc}$ is the number of households in the cluster who purchase good $G$ (and thus report unit values), and superimposed tildes indicate estimates from the first, within-cluster stage.

As before, the first-stage regressions are also used to estimate the variances and covariances of the $u^0$ and $u^1$ in (5.76) and (5.77), although in place of the original two variances and a covariance there are now $2M$ variances and $M(2M - 1)$ covariances. Suppose that $\Sigma$, typical element $\sigma_{GH}$, is the variance-covariance matrix of the $u^0$'s, that $\Omega$, typical element $\omega_{GH}$, is the variance-covariance matrix of the $u^1$'s, and that $\mathrm{X}$, typical element $\chi_{GH}$, is the covariance matrix of $u^1$ (on the rows) and $u^0$ (on the columns). Estimators of these matrices are constructed from

(5.80)  $\tilde\sigma_{GH} = (n - C - k)^{-1} \sum_c \sum_{h \in c} e^0_{Ghc} e^0_{Hhc}$

(5.81)  $\tilde\omega_{GH} = (n - C - k)^{-1} \sum_c \sum_{h \in c} e^1_{Ghc} e^1_{Hhc}$

(5.82)  $\tilde\chi_{GH} = (n - C - k)^{-1} \sum_c \sum_{h \in c} e^1_{Ghc} e^0_{Hhc}$

where $e^1$ and $e^0$ are the residuals from the first-stage, within-village unit value and budget share regressions, respectively.
In practice, it will often be too much to expect the data to deliver good estimates of all elements of these three matrices, in which case only the diagonal elements need be calculated, and the off-diagonals set to zero. The empirical results in the next section are constructed in this way.

The between-village variance-covariance matrix of the (theoretical, not estimated) $y^0_G$ is written $Q$, that of the $y^1_G$ is $S$, their covariance is $R$, and the elements of these matrices are estimated from (5.78) and (5.79),

(5.83)  $\tilde q_{GH} = \mathrm{cov}(\hat y^0_{Gc}, \hat y^0_{Hc}), \quad \tilde s_{GH} = \mathrm{cov}(\hat y^1_{Gc}, \hat y^1_{Hc}), \quad \tilde r_{GH} = \mathrm{cov}(\hat y^1_{Gc}, \hat y^0_{Hc})$.

If we were to use the $\hat y^0$ and $\hat y^1$ to run a between-village multivariate ordinary least squares regression, we would obtain

(5.84)  $\tilde B_{OLS} = \tilde S^{-1} \tilde R$

where the $G$th column of $\tilde B_{OLS}$ is the vector of OLS coefficients of $\hat y^0_G$ regressed on all the $\hat y^1$'s as explanatory variables. However, as in the univariate case, this estimator takes no account of the influence of $u^0$ and $u^1$ when the cluster size is finite. The corrected estimator is written

(5.85)  $\tilde B = (\tilde S - \tilde\Omega \tilde N_+^{-1})^{-1} (\tilde R - \tilde{\mathrm{X}} \tilde N^{-1})$

where $\tilde N_+^{-1} = C^{-1} \sum_c D(n_c^+)^{-1}$, $D(n_c^+)$ is a diagonal matrix formed from the elements of $n^+_{Gc}$, and the matrix $\tilde N$ is the corresponding quantity formed from the $n_c$'s. The estimator (5.85) is the multivariate generalization of (5.45) in the univariate case, so that the matrix $B$ now plays the role previously played by the scalar $\phi$. As in that case, its probability limit in the presence of the quality effects is not $\Theta$, but the matrix generalization of the ratio of $\theta$ to $\psi$,

(5.86)  $\mathrm{plim}\, \tilde B = B = (\Psi')^{-1} \Theta'$.

The transposition is because $B$ has the estimates for each equation down the columns, whereas in $\Theta$ they are arranged along the rows.

As before, the $\Theta$ matrix is not identified without further information, here supplied by the separability theory of quality. The quality theory delivered equation (5.75) above, which in matrix notation is

(5.87)  $\Psi = I + D(\beta^1) D(e)^{-1} E$.
$E$ is the matrix of price elasticities, $e$ is the vector of total expenditure elasticities, and the diagonalization operator $D(\cdot)$ converts its vector argument into a diagonal matrix. The matrix of price elasticities and the vector of total expenditure elasticities are linked to the model parameters by

(5.88)  $E = -\Psi + D(\bar w)^{-1} \Theta$

(5.89)  $e = \iota - \beta^1 + D(\bar w)^{-1} \beta^0$

where $\iota$ is a vector of ones. Equations (5.87) through (5.89) can be combined with (5.86) so as to retrieve $\Theta$ and thus $E$ from $B$,

(5.90)  $\Theta = B'\Psi = B'[I - D(\zeta)B' + D(\zeta)D(\bar w)]^{-1}$

(5.91)  $E = [D(\bar w)^{-1}B' - I]\Psi = [D(\bar w)^{-1}B' - I][I - D(\zeta)B' + D(\zeta)D(\bar w)]^{-1}$

where the elements of $\zeta$ are defined by

(5.92)  $\zeta_G = \dfrac{\beta_G^1}{(1 - \beta_G^1)\bar w_G + \beta_G^0}$.

This completes the estimation stage; the first-stage parameters and residuals are used to make the covariance matrices in (5.80) through (5.83), the results are used to calculate the matrix $\tilde B$ in (5.85), an estimate that is corrected using the first-stage estimates to give the $\Theta$ parameters or elasticity matrix using (5.90) or (5.91).

Variance-covariance matrices for the estimated parameters and elasticities can be obtained by bootstrapping, which is the approach used for the results reported here, or analytical formulas can be derived that are valid in large samples. The algebra for the latter requires the introduction of a good deal of new notation, and is not given here. Intrepid readers can use the results in Deaton and Grimard (1992) and Deaton (1990a) as templates for the slightly different expressions here. The main value of such an exercise would be to provide asymptotic approximations that could be used to improve the bootstrap calculations, as described in Section 1.4. However, I suspect that a straightforward application of the bootstrap is adequate, and it certainly delivers standard errors that are close to those from the asymptotic formulas. In practice, I have adopted a further shortcut, which is to bootstrap only the second stage of the estimation.
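The correction step in (5.90) to (5.92), which turns the matrix $B$ and the first-stage estimates into $\Psi$, $\Theta$, and the elasticities, can be sketched in a few lines of NumPy. This is an illustration of the algebra rather than the code used for the results in the text: the function name is hypothetical, and the two-good inputs are invented numbers chosen only so that the matrices are well conditioned.

```python
import numpy as np


def recover_system(B, beta0, beta1, w_bar):
    """Recover Psi, Theta, E, and e from B using (5.89) through (5.92).

    B holds the coefficients for each budget share equation down its columns;
    beta0, beta1, and w_bar are vectors with one element per good.
    """
    M = len(w_bar)
    zeta = beta1 / ((1.0 - beta1) * w_bar + beta0)                        # (5.92)
    Dz, Dw = np.diag(zeta), np.diag(w_bar)
    Psi = np.linalg.inv(np.eye(M) - Dz @ B.T + Dz @ Dw)                   # bracket in (5.90)
    Theta = B.T @ Psi                                                     # (5.90)
    E = (np.diag(1.0 / w_bar) @ B.T - np.eye(M)) @ Psi                    # (5.91)
    e = 1.0 - beta1 + beta0 / w_bar                                       # (5.89)
    return Psi, Theta, E, e


# Hypothetical two-good inputs.
B = np.array([[-0.05, 0.01],
              [0.02, -0.03]])
beta0 = np.array([-0.02, 0.01])
beta1 = np.array([0.10, 0.05])
w_bar = np.array([0.10, 0.05])

Psi, Theta, E, e = recover_system(B, beta0, beta1, w_bar)

# Consistency checks against the structural relationships (5.87) and (5.88).
gap_587 = Psi - (np.eye(2) + np.diag(beta1) @ np.diag(1.0 / e) @ E)
gap_588 = E - (-Psi + np.diag(1.0 / w_bar) @ Theta)
print(np.abs(gap_587).max(), np.abs(gap_588).max())
```

Both printed gaps should be zero up to floating-point error, which confirms that (5.90) and (5.91) are the solution of (5.86) through (5.89) rather than an approximation to it.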
The first-stage, within-cluster, estimates are calculated, with standard errors routinely provided by STATA, and (5.78) and (5.79) are used to calculate estimates of the cluster averages of the "purged" unit values and budget shares. This cluster-level data set is then treated as the base data from which bootstrap samples are drawn. This shortcut saves a large amount of computer time because the first-stage estimation is much more time-consuming than the second stage. It is justified by inspection of the asymptotic variance formulas, which show that it is the contribution from the second stage that dominates, and from practical experience which shows that the first-stage contribution to the variance is negligible.

*Technical note: completing the system

This note, which is intended only for those who wish to pursue their own calculations, and need to understand the code, lays out the calculations needed to complete the system by adding a single composite commodity, "nonfood." I also explain how to impose symmetry, at least approximately. The calculations start from the $M \times M$ matrix $\Theta$ defined in (5.67) and calculated from (5.90). Write the corresponding $(M+1) \times (M+1)$ matrix for the complete system $\Theta^x$, which differs from $\Theta$ by having an additional row and column. Using the homogeneity restriction (5.69), the final column of $\Theta^x$ is given by

(5.93)  $\theta^x_{G,M+1} = -\beta_G^0 - \sum_{H=1}^{M} \theta_{GH}$.

The final row of $\Theta^x$ is computed from the adding-up restriction, so that

(5.94)  $\theta^x_{M+1,G} = -\sum_{H=1}^{M} \theta^x_{HG}$.

In the same way, the adding-up restrictions are used to extend the vectors $\alpha^0$, $\beta^0$, and $\bar w$ to $\alpha^{0x}$, $\beta^{0x}$, and $\bar w^x$. The vector of quality elasticities $\beta^1$ cannot be extended in the same way; instead it is necessary to assume some plausible quality elasticity for the nonfood category. That done, (5.92) is used to calculate the extended vector $\zeta^x$.
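The bordering of $\Theta$ described in (5.93) and (5.94) can be sketched as below. The function name and the two-good inputs are hypothetical; the only ingredients taken from the text are the homogeneity restriction for the extra column and the adding-up restrictions for the extra row and for the extended $\beta^0$ and $\bar w$ vectors.

```python
import numpy as np


def complete_system(Theta, beta0, w_bar):
    """Border Theta with a column from (5.93) and a row from (5.94); extend beta0 and w_bar."""
    M = Theta.shape[0]
    Theta_x = np.zeros((M + 1, M + 1))
    Theta_x[:M, :M] = Theta
    Theta_x[:M, M] = -beta0 - Theta.sum(axis=1)     # (5.93): homogeneity, row by row
    Theta_x[M, :] = -Theta_x[:M, :].sum(axis=0)     # (5.94): adding-up, column by column
    beta0_x = np.append(beta0, -beta0.sum())        # beta0 must sum to zero over all goods
    w_bar_x = np.append(w_bar, 1.0 - w_bar.sum())   # budget shares must sum to one
    return Theta_x, beta0_x, w_bar_x


# Hypothetical two-good inputs.
Theta = np.array([[-0.04, 0.01],
                  [0.02, -0.03]])
beta0 = np.array([-0.02, 0.01])
w_bar = np.array([0.10, 0.05])

Theta_x, beta0_x, w_bar_x = complete_system(Theta, beta0, w_bar)
```

Note that the corner element is filled by the adding-up step, and that the homogeneity restriction then holds automatically for the added "nonfood" row as well, which is a convenient check on the bookkeeping.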
From (5.90) applied to the complete system, we have

(5.95)  $\Theta^x = B^{x\prime} \Psi^x$

(5.96)  $(\Psi^x)^{-1} = I + D(\zeta^x)D(\bar w^x) - D(\zeta^x)B^{x\prime}$

so that, eliminating $B^x$,

(5.97)  $\Psi^x = [I + D(\zeta^x)D(\bar w^x)]^{-1}[I + D(\zeta^x)\Theta^x]$.

The full-system matrix of price elasticities and of total expenditure elasticities are then calculated from (5.88) and (5.89).

Satisfying the symmetry condition (5.71) requires the imposition of a nonlinear restriction on the matrix $B$. Rather than attempt this, I have fallen back on an approximation that relies on the validity of the empirical finding that the quality elasticities are small. In this case, $\Psi$ is approximately equal to the identity matrix, so that $\Theta$ and $B'$ are approximately equal. In view of this, the symmetry condition used in the code is that the matrix

(5.98)  $B + \bar w \beta^{0\prime}$

be symmetric. (An alternative would be simply to require that $B$ be symmetric.)

To see how the restricted estimate is obtained, we need some notation; a good introduction to the concepts used here is provided by Magnus and Neudecker (1988). The "vec" operator converts a matrix into a vector by stacking its columns vertically. The Kronecker product of two matrices, as in $A \otimes B$, is the matrix formed by multiplying each element of $A$ by the whole matrix $B$, and arranging the results into a matrix patterned after $A$, so that the top left-hand submatrix is $a_{11}B$, with $a_{12}B$ to its right, and $a_{21}B$ underneath, and so on. The "commutation" matrix $K$ is defined by what it does, which is to rearrange the vec of a matrix $B$ so that it becomes the vec of the transpose of $B$,

(5.99)  $K \,\mathrm{vec}\, B = \mathrm{vec}\, B'$.

Finally, we require the matrix $L$, a "selection" matrix which picks out from the vec of a square matrix the elements that lie below the diagonal in the original matrix. Given these definitions, the restriction (5.98) can be rewritten as

(5.100)  $L(I - K)(\mathrm{vec}\, B + \beta^0 \otimes \bar w) = 0$

where I have used the fact that $\mathrm{vec}(ab') = b \otimes a$.
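The commutation and selection matrices just defined are easy to build explicitly, which makes the restriction (5.100) concrete. The sketch below is one possible construction, with hypothetical function names and invented values for $\beta^0$ and $\bar w$; it also compares the rank of $I - K$ with and without the selection matrix $L$.

```python
import numpy as np


def vec(A):
    """Stack the columns of A into one vector (column-major order)."""
    return A.reshape(-1, order="F")


def commutation(M):
    """K such that K @ vec(B) equals vec(B.T) for any M x M matrix B, as in (5.99)."""
    K = np.zeros((M * M, M * M))
    for i in range(M):
        for j in range(M):
            # B[i, j] sits at position j*M + i of vec(B) and at i*M + j of vec(B.T)
            K[i * M + j, j * M + i] = 1.0
    return K


def selection_below_diagonal(M):
    """L picks the M(M-1)/2 elements of vec(.) that lie strictly below the diagonal."""
    positions = [j * M + i for j in range(M) for i in range(j + 1, M)]
    L = np.zeros((len(positions), M * M))
    L[np.arange(len(positions)), positions] = 1.0
    return L


M = 3
beta0 = np.array([-0.03, 0.01, 0.02])   # hypothetical values
w_bar = np.array([0.20, 0.10, 0.05])    # hypothetical values
K, L = commutation(M), selection_below_diagonal(M)

R = L @ (np.eye(M * M) - K)             # restriction matrix implied by (5.100)
r = -R @ np.kron(beta0, w_bar)          # the beta0, w_bar term moved to the right-hand side

rank_without_L = np.linalg.matrix_rank(np.eye(M * M) - K)   # M*M rows, but rank M(M-1)/2
rank_with_L = np.linalg.matrix_rank(R)                      # M(M-1)/2 rows, full row rank
```

For $M = 3$ the unselected matrix $I - K$ has nine rows but rank three, whereas $L(I - K)$ has three rows and rank three, so each symmetry restriction appears exactly once.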
The selection matrix $L$ is required here in order to prevent each symmetry restriction being applied twice. Without it, we would be trying to impose that element {1,2} should equal element {2,1} as well as that element {2,1} should equal element {1,2}, a redundancy that will lead to singularities in the matrices to be inverted. If we write $b$ for $\mathrm{vec}\, B$, (5.100) can be written in the familiar linear form

(5.101)  $Rb = r$

with $R$ and $r$ defined from (5.100). We already have an unrestricted estimate from (5.85). Write $A$ for the matrix $\tilde S - \tilde\Omega \tilde N_+^{-1}$ in (5.85), and define the restricted estimate of $B$, $\tilde B_R$, by

(5.102)  $\mathrm{vec}\, \tilde B_R = \mathrm{vec}\, \tilde B + (I \otimes A)^{-1} R' [R (I \otimes A)^{-1} R']^{-1} (r - R \,\mathrm{vec}\, \tilde B)$.

5.4 Empirical results for India and Pakistan

In this section, I take the technical apparatus for granted and return to my main purpose of deriving demand parameters for the analysis of price reform. I begin with a preliminary subsection on data issues and the selection of goods, and then turn to the first- and second-stage results.

Preparatory analysis

The first choice in any demand analysis is the choice of goods to be included. Because we are interested in price and tax reform, the analysis must distinguish goods with different (actual or potential) tax rates. In some cases, there are goods of particular concern, for example those whose nutrient content is important, or goods that are substitutes or complements for the goods whose prices might be altered. There are also data limitations. We cannot estimate the demand for a good whose price is constant across the sample, and it will be difficult to obtain reliable estimates of goods that are purchased by only a few households. Larger demand systems are also harder to deal with than small ones; the more goods, the greater the computational problem, and the harder it is to report the results.

It is of the greatest importance to subject the data to close scrutiny before undertaking the formal analysis.
In addition to outliers or incorrectly coded data, there are special problems associated with the unit values which must be carefully checked for plausibility. We have already seen one way of doing so, by checking the reported unit values against other price data, by looking at seasonal and spatial patterns, and by an analysis of variance. Univariate analysis of the unit values is also important in order to expose difficulties that can arise if spatial variation in tastes causes spatial differences in unit values. The problem arises in its most acute form when the commodity group consists of a single good that comes in different forms. Consider, for example, the category "fish." The main components of the group will usually be "dried fish" and "fresh fish," with different unit values per kilo; suppose that, allowing for transport and processing margins (the cost of drying the fish), a dried fish costs a little more than the same fish fresh, but weighs less because it contains less water. Suppose also that the country has a coastline but no lakes or rivers, and that the proportion of dried fish rises with the distance from the sea. Even if everyone consumes exactly the same number of fish, some fresh and some dried but with added water, there will be a negative spatial correlation between the weight of fish purchased and their unit values, a correlation that we are in danger of interpreting as the price elasticity of demand.

The first-stage estimates

The first-stage estimates of the unit value equations are presented in Tables 5.2 and 5.3, and have already been discussed in Section 5.2 above. Tables 5.5 and 5.6 present a selection of coefficients from the corresponding within-village regressions for the budget shares. The last column of each table lists the sample averages of the budget shares for each good. Wheat, including bread and flour, is the basic staple in most of Pakistan, and accounts for 12.8 percent of the budget of rural households.
In rural Maharashtra, wheat accounts for only 3.7 percent of the budget, and the most important single category is jowar, which accounts for 12.3 percent of the total, and is followed in importance by rice, with 8.2 percent. All of these basic foodstuffs attract negative $\beta^0$ coefficients, and thus have total expenditure elasticities that are less than unity. The total expenditure elasticity of wheat in Pakistan is 0.37, that of jowar in Maharashtra is 0.29. The elasticity of rice, which is the secondary cereal in both surveys, is 0.68 in Pakistan and 0.67 in Maharashtra. Meat and dairy products have greater than unity elasticities in both countries, and edible oil and sugar are necessities in both. These results suggest two similar populations, but with different cereals playing the primary and secondary roles in each.

Table 5.5. Within-village regressions for budget shares, rural Pakistan, 1984-85

Food              ln x             ln n             e_x     w
Wheat             -0.068 (53.4)     0.070 (47.0)    0.37    0.128
Rice              -0.006 (7.3)      0.007 (8.4)     0.68    0.027
Dairy products     0.024 (12.0)    -0.007 (3.1)     1.05    0.127
Meat               0.009 (13.1)    -0.006 (7.6)     1.10    0.037
Oils and fats     -0.024 (39.9)     0.013 (17.7)    0.42    0.041
Sugar             -0.004 (7.9)      0.002 (3.9)     0.86    0.029
Other foods       -0.040 (38.3)     0.020 (16.1)    0.59    0.122

Note: Absolute values of t-values are shown in brackets.
Source: Deaton and Grimard (1992).

The relationship between the coefficients on total expenditure and household size conforms to the by now familiar pattern where the estimate for ln n has the opposite sign to that for ln x, but is smaller in absolute magnitude. For the basic cereals, wheat in Pakistan and coarse cereals and rice in Maharashtra, the coefficients on household size are close to (minus) the coefficients on total expenditure, so that the per capita consumption is approximately determined by PCE independently of household size.
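The step from "equal and opposite coefficients" to "determined by PCE" can be written out. As a sketch only, assume the budget-share regression has the form below, where $\alpha_i$, $\beta_i$, and $\gamma_i$ are illustrative labels for the intercept and the coefficients on ln x and ln n, and all other covariates are suppressed:

```latex
\begin{align*}
w_i &= \alpha_i + \beta_i \ln x + \gamma_i \ln n, \\
\gamma_i = -\beta_i \;\Longrightarrow\; w_i &= \alpha_i + \beta_i (\ln x - \ln n)
      = \alpha_i + \beta_i \ln (x/n), \\
\text{expenditure on good } i \text{ per head} &= w_i \, (x/n).
\end{align*}
```

When the restriction holds, both the budget share and per capita expenditure on the good depend on x and n only through x/n, that is, through PCE. For wheat in Pakistan the two coefficients are -0.068 and 0.070, so the restriction is very nearly satisfied.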
Table 5.6. Within-village regressions for budget shares, rural Maharashtra, 1983

Food                   ln x             ln n             e_x     w
Rice                   -0.023 (11.5)     0.020 (9.6)     0.67    0.082
Wheat                   0.006 (4.2)      0.003 (2.1)     1.12    0.037
Jowar                  -0.082 (33.2)     0.084 (7.5)     0.29    0.123
Other cereals          -0.015 (9.3)      0.017 (9.6)     0.46    0.035
Pulses and gram        -0.018 (20.8)     0.011 (11.8)    0.67    0.060
Dairy products          0.012 (7.6)     -0.006 (3.4)     1.11    0.053
Oils and fats          -0.011 (12.7)     0.008 (8.3)     0.79    0.060
Meat, eggs, and fish    0.008 (6.3)      0.000 (0.1)     1.05    0.034
Vegetables             -0.018 (26.5)     0.007 (10.3)    0.59    0.047
Fruit and nuts          0.002 (3.0)     -0.001 (0.9)     0.98    0.023
Sugar and gur          -0.013 (20.2)     0.008 (11.9)    0.65    0.043

Note: Absolute values of t-values are shown in brackets.
Source: Deaton, Parikh, and Subramanian (1994, Tables 2, 5, and 6).

Price responses: the second-stage estimates for Pakistan

The top panel of Table 5.7 presents the first set of own- and cross-price elasticities for rural Pakistan together with bootstrapped estimates of "standard errors." The numbers are arrayed so that the elasticity in row i and column j is the response of consumption of good i to the price of good j. The bootstrapped "standard errors" are calculated by making 1,000 draws from the cluster (second-stage) data, recalculating the estimates for each, and then finding (half) the length of the interval that is symmetric around the bootstrapped mean, and that contains 68.3 percent of the bootstrapped estimates. If the distribution of estimates were normal, this interval would be one standard deviation on either side of the mean, which is why I refer to the numbers as "standard errors" but, in general, there is no reason to suppose that the distributions have finite moments. In the top panel of the table, I have completed the system, but have not imposed symmetry.
Province and quarter effects in demands are allowed for by regressing the corrected cluster averages of budget shares and unit values on quarterly and province dummies and then using the residuals in the second-stage calculations. The rationale for this will be discussed in the next subsection.

The estimates are relatively well determined, at least in comparison to similar estimates from time-series data. Apart from sugar (on which more below), all of the diagonal terms, which are the own-price elasticities, are negative, and that for rice is less than -1. Only a minority of the cross-price effects is significantly different from zero; several of these involve the two goods rice and meat. According to these estimates, increases in the price of rice increase the demand for all other goods except other foods, and there are statistically significant effects for oils and fats and for sugar. Similarly, an increase in the price of meat has significantly positive effects on the demand for rice and for oils and fats. The estimated elasticities in the sugar and oils and fats columns are poorly determined; this is because there is little variance of sugar and oil prices in the sample. Although the estimated own-price elasticity of sugar is positive, it has a large standard error, and is not significantly different from zero. Indeed, the standard errors in these two columns are two to three times larger than those elsewhere in the table.

Table 5.7. Unconstrained and symmetry-constrained estimates of price elasticities, rural Pakistan, 1984-85

Food           Wheat         Rice          Dairy         Meat          Oils & fats   Sugar         Other foods   Nonfoods

Unconstrained estimates
Wheat          -0.61 (0.10)   0.16 (0.07)   0.02 (0.04)  -0.06 (0.07)   0.30 (0.31)  -0.14 (0.27)  -0.09 (0.10)  -0.03 (0.39)
Rice           -0.06 (0.45)  -2.16 (0.25)  -0.25 (0.12)   0.66 (0.32)  -1.20 (1.00)  -0.59 (1.10)  -0.14 (0.37)   2.95 (1.58)
Dairy           0.13 (0.14)   0.09 (0.07)  -0.89 (0.04)  -0.13 (0.09)   0.10 (0.39)   0.72 (0.34)  -0.24 (0.14)  -0.84 (0.46)
Meat           -0.58 (0.22)   0.18 (0.14)  -0.05 (0.08)  -0.57 (0.18)  -1.57 (0.70)  -0.75 (0.60)   0.82 (0.20)   1.29 (0.81)
Oils and fats   0.09 (0.05)   0.12 (0.03)   0.01 (0.02)   0.09 (0.04)  -0.80 (0.18)  -0.12 (0.14)  -0.01 (0.05)   0.46 (0.23)
Sugar           0.02 (0.17)   0.34 (0.09)   0.07 (0.05)   0.15 (0.10)  -0.16 (0.51)   0.11 (0.53)   0.05 (0.15)  -1.36 (0.75)
Other foods     0.22 (0.09)  -0.12 (0.05)   0.02 (0.03)   0.08 (0.06)   0.24 (0.41)   0.09 (0.25)  -0.51 (0.10)  -0.62 (0.40)
Nonfoods        0.09 (0.03)   0.02 (0.01)   0.00 (0.01)   0.03 (0.02)  -0.06 (0.07)   0.08 (0.06)   0.03 (0.02)  -0.60 (0.09)

Symmetry-constrained estimates
Wheat          -0.63 (0.10)   0.11 (0.05)   0.03 (0.04)  -0.08 (0.05)   0.08 (0.05)  -0.01 (0.05)   0.10 (0.07)  -0.04 (0.15)
Rice            0.49 (0.27)  -2.04 (0.22)  -0.14 (0.11)   0.47 (0.17)   0.44 (0.13)   0.36 (0.12)  -0.58 (0.18)   0.20 (0.38)
Dairy          -0.05 (0.03)  -0.04 (0.02)  -0.90 (0.04)  -0.03 (0.02)  -0.02 (0.01)   0.02 (0.01)  -0.05 (0.03)   0.00 (0.05)
Meat           -0.40 (0.20)   0.33 (0.13)  -0.10 (0.08)  -0.54 (0.18)   0.14 (0.10)   0.04 (0.09)   0.40 (0.16)  -1.11 (0.32)
Oils and fats   0.10 (0.06)   0.11 (0.03)   0.01 (0.02)   0.06 (0.03)  -0.81 (0.18)  -0.10 (0.10)   0.02 (0.07)   0.44 (0.23)
Sugar          -0.08 (0.21)   0.29 (0.10)   0.11 (0.06)   0.06 (0.11)  -0.34 (0.34)   0.09 (0.53)   0.12 (0.18)  -1.04 (0.72)
Other foods     0.07 (0.07)  -0.11 (0.10)   0.01 (0.03)   0.13 (0.04)   0.01 (0.06)   0.03 (0.04)  -0.50 (0.09)  -0.24 (0.12)
Nonfoods        0.05 (0.02)   0.00 (0.01)   0.00 (0.01)   0.04 (0.01)  -0.03 (0.03)   0.04 (0.03)   0.07 (0.02)  -0.58 (0.06)

Note: The rows show the commodity being affected and the columns the commodity whose price is changing. The figures in brackets are obtained from 1,000 replications of the bootstrap using the cluster-level data and are defined as half the length of the interval around the bootstrap mean that contains 0.683 (the fraction of a normal random variable within one standard deviation of the mean) of the bootstrap replications.
Source: Author's calculations using Household Income and Expenditure Survey, 1984-85.

Given the lack of information in the survey on sugar and oil prices, the symmetry-constrained estimates deserve attention. Although the theoretical restriction cannot be rigorously defended, it offers us the choice between some answers and no answers, and will guarantee that the estimates satisfy unique substitution-complementarity patterns, ruling out the possibility that good a is a substitute for good b when good b is a complement for good a. The restricted estimates for the full system are given in the bottom panel of the table. In those cases where the previous estimates were well determined, which are the goods whose unit values show substantial variation in the data, there is little change in the estimates, although the restrictions bring some increase in estimated precision. However, there are large reductions in the standard errors in the sugar and oil columns, because symmetry allows us to use the effect of (say) rice on sugar, something that is well determined because of the variability in rice prices, to "fill in" the estimate of the effect of sugar on rice. Of course, symmetry cannot help with the own-price elasticities, nor can adding up and homogeneity, which are being used to complete the system, not to impose cross-equation restrictions.
As a result, the own-price elasticities for both sugar and oil still have large standard errors, and the estimate for the former remains (insignificantly) positive.

The symmetry restriction also helps pin down some of the other estimates. The substitutability between rice and wheat, an effect that is of considerable importance for price reform in an economy where both are subsidized, is similar in the top and bottom panels, but is more precise in the latter. The pattern of elasticities shows that meat, oils, and sugar are also substitutes for rice. The final row and column relate to the "ghost" category of nonfood, where we have no price data, but where the elasticities are determined by the theoretical restrictions. Appropriately enough, the standard errors in the last column are large.

The configuration of these elasticities is very different from what would be the case if we had used a demand model based on additive preferences, such as the linear expenditure system. Such models not only restrict the cross-price elasticities to be small, as is mostly the case here, but also enforce an approximate proportionality between price and total expenditure elasticities that has dramatic effects for tax policy. The estimates in Tables 5.5 and 5.7 show no such proportionality. Table 5.5 estimates that meat and dairy products are luxuries and rice a necessity, and yet Table 5.7 estimates that rice is more price elastic than is either dairy products or meat. Taxing rice is therefore doubly unattractive compared with taxing either of the other two goods. I shall turn to these issues in more detail in the next and final section.

Price estimates and taste variation: Maharashtra

Table 5.8 provides a first look at the price elasticities for Maharashtra, but is primarily designed to explore another issue, which is the extent to which we should make separate allowance for regional and temporal taste variation that is unrelated to price, and whether we wish to require that demand respond to seasonal price differences in the same way as it responds to spatial price differences. Rice may usually be 10 percent cheaper at the beginning of the year than in the summer, and it may generally be 10 percent cheaper in one district than another. But seasonal and regional differences in tastes may result in differences in demands not being the same in the two situations, even once we have controlled for incomes, demographics, and other observable covariates. Of course, if we imposed no structure on tastes, and allowed demands to vary arbitrarily from village to village, it would be impossible to estimate any parameters. My preferred procedure is to allow quarterly and regional dummies, and the (unconstrained) own-price elasticities from this specification are reported in the first column of Table 5.8, together with aggregate elasticities for cereals and the sum of the foods, calculated on the assumption that prices change proportionally for all components of the aggregate. Columns two through four show the effects on these own-price elasticities of excluding both sets of dummies, of including only quarterly dummies, and of including only regional dummies.

Table 5.8. The effects of alternative specification on own-price elasticities, rural Maharashtra, 1983

                  Quarters &        Neither quarters
Food              regions           nor regions       Quarters only     Regions only
Rice              -1.19             -2.38             -2.31             -1.10
Wheat             -1.29             -1.17             -1.25             -1.22
Jowar             -0.52             -0.80             -0.81             -0.55
Other cereals     -3.31             -3.39             -3.65             -3.17
Pulses            -0.53             -0.76             -0.77             -0.53
Dairy products    -0.15             -0.61             -0.53             -0.22
Edible oils       -0.40             -0.48             -0.42             -0.49
Meat              -1.08             -1.84             -1.85             -1.05
Vegetables        -0.73             -0.74             -0.82             -0.71
Fruit             -1.09             -1.04             -1.02             -1.12
Sugar and gur     -0.20             -0.29             -0.33             -0.18
All cereals       -0.28             -0.38             -0.38             -0.22
All foods         -0.27             -0.32             -0.30             -0.26

Source: Deaton, Parikh, and Subramanian (1994, Table 7).
The important issue turns out to be whether or not regional dummies are included. The estimates in the first and last columns, where regions are included, are close to one another, as are the estimates in the second and third columns, where regions are excluded. In most but not all cases, the estimates in the central columns are absolutely larger than those in the outer columns. One interpretation runs in terms of the long-run effects of prices. Some interregional price differences are of very long standing, and it is plausible on general theoretical grounds that long-run price elasticities are absolutely larger than short-term elasticities. In consequence, regional dummies may capture some of the long-run price effects, and estimated price elasticities will be lower when regional dummies are included. When we consider price reform proposals, we are probably not greatly interested in effects that take many years, perhaps even centuries, to be established, so that the relevant estimates are the generally lower figures that come from inclusion of the dummies.

Table 5.9. Symmetry-constrained estimates of price elasticities, rural Maharashtra, 1983

                                                 Other                 Dairy      Edible                                                   Other
Food            Rice       Wheat      Jowar      cereals    Pulses     products   oils       Meat       Vegetables Fruit      Sugar      goods
Rice            -1.05       0.28       0.37      -0.46      -0.71      -0.33      -0.19      -0.30      -0.02      -0.14       0.24       1.13
Wheat            0.63      -1.32      -0.18       0.42       0.40       0.28      -0.05      -0.09      -0.03       0.02      -0.27      -0.94
Jowar            0.27      -0.02      -0.45       0.54       0.04       0.12       0.08       0.22       0.06       0.03      -0.26      -0.94
Other cereals   -1.15       0.47       2.04      -3.29      -0.12       0.16       0.21      -0.33       0.14       0.07       0.21       1.02
Pulses          -0.95       0.25       0.04      -0.07      -0.57       0.04       0.10      -0.03       0.22       0.11       0.29      -0.04
Dairy products  -0.57       0.19       0.18       0.08       0.02      -0.13       0.17      -0.01      -0.09      -0.05       0.21      -1.18
Edible oils     -0.27      -0.01       0.12       0.11       0.09       0.16      -0.28       0.06       0.03       0.04       0.32      -1.11
Meat            -0.81      -0.10       0.73      -0.37      -0.10      -0.01       0.09      -1.12       0.16      -0.02       0.15       0.18
Vegetables      -0.03      -0.00       0.12       0.10       0.31      -0.08       0.05       0.13      -0.66       0.05      -0.15      -0.44
Fruit           -0.57       0.04       0.11       0.09       0.29      -0.12       0.09      -0.03       0.09      -1.09      -0.02       0.04
Sugar and gur    0.48      -0.22      -0.82       0.16       0.44       0.28       0.47       0.13      -0.17      -0.00      -0.33      -1.12
Other goods     -0.18       0.06       0.26      -0.04       0.03       0.10       0.12      -0.01       0.05       0.00       0.09      -0.85

Note: Estimates shown in boldface are greater in absolute value than twice their bootstrapped standard errors. For the description of bootstrapped standard errors, see the note to Table 5.7.
Source: Deaton, Parikh, and Subramanian (1994, Table 8).

Table 5.9 gives the final set of results, obtained by completing the system and by imposing the symmetry restriction; the patterns in this table are not very different from the unconstrained estimates, which are therefore not shown here. With 12 goods, the inclusion of standard errors makes it very hard to read the table, so I have highlighted those estimates that are more than twice their (bootstrapped) standard errors. As in the Pakistani case, the credibility and usefulness of symmetry is enhanced by the fact that the important and well-determined elasticities in the table, particularly the own-price elasticities, are changed very little by the imposition of the restriction.

The results show a number of important patterns, particularly of substitutability between the various foods. Rice, wheat, and jowar are substitutes, as are jowar and other coarse cereals. Sensibly enough, pulses and dairy products are complements with rice, but pulses are substitutes for vegetables, fruit and nuts, and sugar and gur. Sugar is a substitute for other cereals, edible oils, and dairy products, as well as for pulses, but is complementary with jowar. The substitutability between edible oils and sugar is important because the former is taxed (implicitly, through protection of domestic processing) and the latter subsidized, so that we need to look at both together when thinking about reform.
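The bootstrapped "standard errors" used to flag the well-determined estimates in Table 5.9 are defined in the note to Table 5.7: half the length of the interval, symmetric around the bootstrap mean, that contains 68.3 percent of the replications. The following is a minimal sketch of that definition, not the code behind the tables. The function names, the toy cluster data, and the use of a simple mean as a stand-in for the second-stage estimator are all assumptions made for illustration.

```python
import math
import random
import statistics


def bootstrap_half_width(estimates, coverage=0.683):
    """Half the length of the interval, symmetric around the bootstrap mean,
    that contains (at least) a fraction `coverage` of the estimates."""
    center = statistics.fmean(estimates)
    # Absolute distance of every replication from the bootstrap mean.
    distances = sorted(abs(e - center) for e in estimates)
    # Smallest distance d such that [center - d, center + d] holds
    # at least `coverage` of the replications.
    k = max(1, math.ceil(coverage * len(distances)))
    return distances[k - 1]


def cluster_bootstrap(clusters, estimator, n_draws=1000, seed=0):
    """Re-estimate on `n_draws` samples of clusters drawn with replacement."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        resample = rng.choices(clusters, k=len(clusters))
        draws.append(estimator(resample))
    return draws


# Hypothetical cluster-level data and a stand-in estimator (the mean).
toy_rng = random.Random(42)
toy_clusters = [toy_rng.gauss(-0.6, 0.5) for _ in range(200)]
replications = cluster_bootstrap(toy_clusters, statistics.fmean)
print("bootstrap 'standard error':", round(bootstrap_half_width(replications), 3))
```

Because the measure is an order statistic of the distances rather than a second moment, it is defined even when the distribution of the estimates has no finite variance; it coincides with the standard deviation only when the replications are approximately normal.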
It is instructive to compare these estimates with those for Pakistan. In Pakistan, the staple food is wheat, rather than jowar, and as incomes rise, the movement is towards rice, as opposed to towards rice and then wheat in Maharashtra. As was the case for the total expenditure elasticities, and with allowance for the different roles of the different foods, the elasticities are similar. The own-price elasticity for wheat in rural Pakistan is estimated to be -0.63, while that for the "superior" rice is -2.04. In Table 5.9, jowar has a price elasticity of -0.45, while the estimates for rice and wheat are -1.05 and -1.32, respectively. In both cases rice and wheat are substitutes; in Pakistan the elasticity of rice with respect to the wheat price is 0.49, compared with 0.63 for the estimated elasticity of wheat with respect to the rice price in Maharashtra.

5.5 Looking at price and tax reform

I am now in a position to return to the main purpose of the chapter, the analysis of price and tax reform. Spatial price variation has been investigated, a model proposed, and parameters estimated. It is now time to use the results to investigate the consequences of various kinds of price reform in India and Pakistan. A good deal of the work in such exercises is concerned with the description of the environment, with how actual taxes and subsidies work, and with the derivation of shadow prices. Given the length of this chapter, and the fact that I have nothing new to add to these topics, I content myself with a brief description of the background in each country, and with a description of how the results of Section 5.4 can be combined with the theoretical formulas in Section 5.1. I start with the descriptions of each country, then discuss how to adapt the formulas of Section 5.1 to accommodate the empirical results, and conclude with the policy evaluation.
Shadow taxes and subsidies in Pakistan

The implementation of the tax reform formula (5.9) or (5.12) requires information on the consumption of the various commodities, which we have from the survey; the price derivatives, which have been estimated in the previous section; social income weights, which are discussed below; and shadow prices. Ahmad, Coady, and Stern (1988) have estimated a complete set of shadow prices for Pakistan based on an 86 × 86 input-output table for 1976, and Ahmad and Stern (1990, 1991) use these shadow prices in their tax reform experiments. However, these prices embody a large number of assumptions, about shadow prices for land, labor, and assets as well as, most crucially, judgments about which goods are tradeable (further divided into importables and exportables) and which goods are nontraded. These assumptions all have a role in determining the shadow prices, and while the procedure itself is unimpeachable, the use of their prices for the current exercise makes it very difficult to isolate the role played in the reform proposals by each of the different elements in the story. Instead, I shall work here with an illustrative set of shadow prices for the eight goods in the demand analysis. These prices are illustrative only in the sense that they do not claim to take into account all the ingredients that should ideally be included in shadow prices. However, they are based on the actual situation in Pakistan, and they are simple and transparent enough so that we can see how features of that situation enter into the analysis of price reform.

The crucial commodities are wheat, rice, and sugar. Rice and wheat sell at domestic prices that are substantially lower than their world prices at the official exchange rate. According to the Pakistan Economic Survey (Government of Pakistan, annual), the border price for both rice and wheat was approximately 40 percent above the domestic price in the mid-80s.
Both are tradeable, and for both I work with accounting ratios (shadow prices divided by consumer prices) of 1.4. In the case of sugar, there is a heavily protected domestic refining industry, and the border price of refined sugar is some 60 percent of the domestic price. Oils and fats are also protected by a system of taxes and subsidies with the result that the border price is perhaps 95 percent of the domestic price. For the other four goods in the system, I work with accounting ratios of unity. Although there are tariffs and taxes on many nonfood items, which would call for accounting ratios less than unity, many other items are nontraded, so that their shadow prices depend on the shadow price or conversion factor for labor, which, given the pricing of wheat and rice, is likely greater than unity. The experiments are therefore conducted under the supposition that the ratios of shadow prices to consumer prices for the eight goods (wheat, rice, dairy, meat, oils and fats, sugar, other food, nonfood) are given by the vector (1.4, 1.4, 1.0, 1.0, 0.95, 0.6, 1.0, 1.0).

It is important to be precise about exactly which policy experiments are being considered, and which are not. The instruments that are being considered here are instruments that increase or decrease the consumer price of the good, as would be the case, for example, if there were a value-added tax, or if the good had a fixed world price and were imported over a tariff. Such instruments may be available in Pakistan for sugar or for oils. However, to the extent that the rice and wheat subsidies are maintained by export taxes, the experiments described here do not correspond to what would happen if the export tax were changed.
An increase in an export tax increases consumer welfare at the same time as it increases government revenue, so that if only these two effects were considered, as is the case in (5.9) and (5.12), it would be desirable to increase the tax indefinitely. Of course, export taxes are limited by the supply responses of farmers, as well as by international demand when the country has some monopoly power, as is the case for Pakistan with basmati. An export tax is an instrument that alters not only consumer prices but also producer prices, and it cannot be fully analyzed without looking at the supply responses. The calculations here, which look only at the demand side, correspond to a different hypothetical experiment, in which producer and consumer prices are separated. This would suppose that the government procures the total output of rice and wheat at one set of prices, and that it sells the commodities either to foreigners at world prices, or to domestic consumers at a third set of prices. It is the consequences of changing these domestic prices that can be examined using the apparatus described above. Of course, these are artificial experiments which are not feasible in practice. Farmers who grow wheat or rice cannot be charged one price for consumption and paid another for procurement. However, more realistic policies will have the same effects as those described here, plus additional effects that work through the supply side. What I have to say here is only a part of the story, but it is a part that has to be understood if well-informed policy decisions are to be made.

Shadow taxes and subsidies in India

The Indian reform exercise is rendered somewhat artificial by the fact that the survey data come from Maharashtra, rather than all India.
In consequence, I focus only on the major distortions in Indian agricultural policy, and once again make no attempt to derive a full set of shadow prices; for an example of the latter for 1979/80, based on the data used in the Technical Note to the Sixth Indian Plan, see Ahmad, Coady, and Stern (1986). Indian domestic prices of rice and of wheat are held below their world prices by the Public Distribution System (PDS), which procures and stockpiles cereals, and which sells them to consumers through "fair-price" shops. Purchases in fair-price shops are rationed, so that above the ration, the marginal price is the free-market price (which is presumably higher than it would be in the absence of PDS); nevertheless, we simplify by treating the PDS as if it straightforwardly subsidized rice and wheat. The other important distortion that we incorporate is the effective taxation of edible oils. The government protects the domestic groundnut processing industry, and in consequence the prices of edible oils are above the world prices. Gulati, Hansen, and Pursell (1990) calculate that, on average from 1980 to 1987, the domestic price of rice was 67 percent of its world price, that of wheat was 80 percent of the world price, and that of groundnuts was 150 percent of the world price. These translate into tax factors, the (shadow) tax share in the domestic price $\tau_i/(1+\tau_i)$, of -0.50, -0.25, and 0.33 for rice, wheat, and groundnut oil, so that the accounting ratios for the 12 goods (rice, wheat, jowar, other cereals, pulses, dairy products, edible oils, meat, vegetables, fruit, sugar, other goods) are (1.5, 1.25, 1.00, 1.00, 1.00, 1.00, 0.67, 1.00, 1.00, 1.00, 1.00, 1.00). While these are stylized figures, and ignore other price distortions in food prices, like the Pakistani figures, they are based on the reality of the 1980s, and will serve to illustrate the general points of the analysis.
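The arithmetic that takes the price ratios to accounting ratios and tax factors is short enough to spell out. The sketch below assumes, as the Pakistani figures above imply, that the consumer price equals the shadow (world) price multiplied by one plus the shadow tax rate, so that the tax factor is one minus the accounting ratio. The function names are hypothetical, and because 67 percent is itself a rounded figure, the results agree with the text only after rounding.

```python
def accounting_ratio(domestic_to_world):
    """Shadow (world) price divided by the domestic consumer price."""
    return 1.0 / domestic_to_world


def tax_factor(domestic_to_world):
    """Shadow tax share in the domestic price, tau / (1 + tau).

    Assumes consumer price = shadow price * (1 + tau), which gives
    tau / (1 + tau) = 1 - shadow price / consumer price.
    """
    return 1.0 - accounting_ratio(domestic_to_world)


# Domestic price as a fraction of the world price, 1980-87 averages
# quoted in the text from Gulati, Hansen, and Pursell (1990).
india = {"rice": 0.67, "wheat": 0.80, "groundnut oil": 1.50}

for good, ratio in india.items():
    print(good, round(accounting_ratio(ratio), 2), round(tax_factor(ratio), 2))
```

The same two functions reproduce the Pakistani tax factors used later: an accounting ratio of 1.4 corresponds to a tax factor of -0.40, and one of 0.6 to a tax factor of 0.40.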
*Adapting the price reform formulas

Before using the formulas of Section 5.1, I need to match them to the empirical model, and more substantively, I need to adapt the theory to a world in which prices are not the same for everyone and where both quality and quantity adapt to price. To do the latter requires an assumption about how tax changes affect the different prices that people pay, and the simplest such assumption is that changes in taxes affect all prices proportionately within a commodity group. This makes most sense when taxes are ad valorem, and may fail in some cases, for example if a good is taxed through an import tariff, if spatial price variation reflects transport costs, and if transport costs depend on volume rather than value.

Suppose that the (shadow) tax rate on good i is $\tau_i$, so that the price paid, $\pi_{ic}$, is $\tilde{\pi}_{ic}(1+\tau_i)$, where $\tilde{\pi}_{ic}$ is taken to be fixed as the tax rate changes. When utility depends on quality as well as quantity as in equation (5.62), the derivative of the cost of living with respect to price changes is the product of quality and quantity. A change in the tax rate $\Delta\tau_i$ induces a price change $\tilde{\pi}_{ic}\Delta\tau_i$, so that if h buys $\xi_{ih}Q_{ih}$, the compensation required is $\tilde{\pi}_{ic}\xi_{ih}Q_{ih}\Delta\tau_i = x_h w_{ih}\Delta\tau_i(1+\tau_i)^{-1}$ where, as before, $x_h$ is total expenditure and $w_{ih}$ is the budget share of good i. I adopt the standard Atkinson social welfare function (5.13), so that

(5.103) $\dfrac{\partial W}{\partial x_h} = \left(\dfrac{x_h}{n_h}\right)^{-\epsilon}$.

The numerator of the $\lambda_i$ ratio in (5.9) is therefore replaced by

(5.104) $\sum_h \dfrac{\partial W}{\partial x_h}\,\tilde{\pi}_{ic}\xi_{ih}Q_{ih} = (1+\tau_i)^{-1}\sum_h \left(\dfrac{x_h}{n_h}\right)^{-\epsilon} x_h w_{ih}$.

A value of $\epsilon$ of unity implies that additional income is twice as valuable to someone with half the income, with higher values implying a greater focus on the poor. Note that, by (5.103), social welfare accounting is done for individuals, not households, but under the assumptions that each person in the household receives the same, and that PCE is an adequate measure of the welfare of each.
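The role of the inequality aversion parameter can be checked numerically. The sketch below takes the marginal social value of income under the Atkinson form to be PCE raised to the power minus epsilon, which is the reading of (5.103) consistent with the statement that a value of unity makes income twice as valuable to someone with half the income. The function name and the choice of PCE levels are illustrative assumptions.

```python
def marginal_social_weight(pce, epsilon):
    """Marginal social value of income: (x_h / n_h) ** (-epsilon)."""
    return pce ** (-epsilon)


# Weight on a person with half the PCE, relative to a person with PCE of 1,
# for the values of epsilon used in the policy experiments.
for eps in (0.0, 0.5, 1.0, 2.0):
    relative = marginal_social_weight(0.5, eps) / marginal_social_weight(1.0, eps)
    print(f"epsilon = {eps}: relative weight = {relative:.2f}")
```

With no inequality aversion the two people count equally, and the relative weight on the poorer person rises with epsilon, doubling at unity and quadrupling at 2.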
Tax revenue collected from household h is

(5.105) $R_h = \sum_{k=1}^{M} \tau_k \tilde{\pi}_{kc}\xi_{kh}Q_{kh} = \sum_{k=1}^{M} \dfrac{\tau_k}{1+\tau_k}\, x_h w_{kh}$.

The derivative of revenue with respect to a change in the rate can be related to the empirical model by looking first at the derivative of the budget shares with respect to the tax rate,

(5.106) $\dfrac{\partial w_{kh}}{\partial \tau_i} = \dfrac{\partial w_{kh}}{\partial \ln \pi_i}\,\dfrac{\partial \ln \pi_i}{\partial \tau_i} = \dfrac{\theta_{ki}}{1+\tau_i}$.

Substituting into (5.105) and averaging over all households, we have

(5.107) $(1+\tau_i)\,\dfrac{\partial \bar{R}}{\partial \tau_i} = \bar{x}\,\tilde{w}_i \left[ 1 - \dfrac{\tau_i}{1+\tau_i} + \sum_{k=1}^{M} \dfrac{\tau_k}{1+\tau_k}\,\dfrac{\theta_{ki}}{\tilde{w}_i} \right]$

where $\bar{x}$ is the mean of $x_h$ and $\tilde{w}_i$ is the "plutocratic" average budget share,

(5.108) $\tilde{w}_i = \sum_{h=1}^{H} x_h w_{ih} \Big/ \sum_{h=1}^{H} x_h$.

Finally, if we define the "socially representative budget share" $w_i^{\epsilon}$ by

(5.109) $w_i^{\epsilon} = \sum_{h=1}^{H} \left(\dfrac{x_h}{n_h}\right)^{-\epsilon} x_h w_{ih} \Big/ \sum_{h=1}^{H} x_h$,

the marginal cost-benefit ratio of a tax increase $\lambda_i$ can be written

(5.110) $\lambda_i = \dfrac{w_i^{\epsilon}/\tilde{w}_i}{1 + \dfrac{\tau_i}{1+\tau_i}\left(\dfrac{\theta_{ii}}{\tilde{w}_i} - 1\right) + \sum_{k \neq i} \dfrac{\tau_k}{1+\tau_k}\,\dfrac{\theta_{ki}}{\tilde{w}_i}}$.

The numerator of (5.110) is a pure distributional measure for good i; it can be interpreted as the relative shares of the market-representative individual (the representative agent) and the socially representative individual, whose income is lower the higher is the inequality aversion parameter $\epsilon$. This measure is modified by the action of the terms in the denominator. The first of these (apart from 1) is the (shadow) tax factor multiplied by the elasticity of expenditure on good i with respect to its price, quality and quantity effects taken together. This term measures the own-price distortionary effect of the tax. If it is large and negative, as would be the case for a heavily taxed and price-elastic good, the term will contribute to a large $\lambda_i$-ratio and would indicate the costliness of raising further revenue by that route. The last term is the sum of the tax factors multiplied by the cross-price elasticities, and captures the effects on other goods of the change in the tax on good i, again with quantity and quality effects included.
From a theoretical point of view, this decomposition is trivial, but when we look at the results, it is useful to separate the own- and cross-price effects, because the former are likely to be more reliably measured than the latter, and because the latter are often omitted or handled by assumption in more restrictive systems. (In the calculations below, the term $w^{\epsilon}$ is arbitrarily scaled to sum to unity across goods. Changes in social welfare are not measured in money units, and for any given value of $\epsilon$, we are concerned only with the relative values of the $\lambda$'s across different goods.)

Equity and efficiency in price reform in Pakistan

Table 5.10 shows the efficiency effects of raising taxes on each of the goods, distinguishing between the terms in the denominator of (5.110). The first column shows the tax factors $\tau_i/(1+\tau_i)$ calculated from the accounting ratios discussed above; these are the shadow taxes calculated from comparison of the world and domestic prices. Wheat and rice carry shadow subsidies, oils and sugar, shadow taxes. The second column shows the own-price elasticities of quantity times quality; these are the own-good contributions to the tax distortion. Because the quality elasticities are small, the combined elasticities are approximately equal to the own-price elasticities (see equations (5.90) and (5.91) above), but they are conceptually distinct. Taxes are ad valorem, so that quality shading in response to tax increases depresses revenue just as do decreases in quantity. The third column shows the own-price distortions and corresponds to the middle term in the denominator of (5.110). Goods that bear no shadow taxes do not appear in this column since there are no distortionary effects of small tax increases at zero tax rates. The large effects are on wheat and rice, with the latter much larger because its own-price elasticity is much larger.
By themselves, these own-price terms will generate small cost-benefit ratios, particularly for rice. Subsidies are being paid, they are distortionary, particularly for rice, and it is desirable to reduce them, at least from this limited perspective. The cross terms in the next column are typically smaller, the exception being oils and fats, where the negative term acts so as to decrease the attractiveness of raising those prices. By checking back through the matrix of price elasticities in the lower half of Table 5.7, we can see that the cross-price elasticity responsible is that between the price of edible oil and the demand for rice. An increase in the price of edible oil causes the demand for rice to rise (the elasticity is 0.44), which is distortionary because rice is subsidized, an effect which is enhanced by a smaller but still positive cross-price elasticity with wheat, which is also subsidized. The final column in the table presents the sum of both effects, plus 1 according to (5.110). According to these, and bearing in mind that nothing has yet been said about distributional issues, rice is the commodity whose price should be raised, and oils and fats the commodity whose price should be lowered. The wheat subsidy too is distortionary, and there is an efficiency case for raising the price.

Table 5.10. Efficiency aspects of price reform in Pakistan

Food              τ_i/(1+τ_i)   θ_ii/w̃_i - 1   Own effect   Cross effects   Total
Wheat             -0.40         -0.64            0.25        -0.05           1.20
Rice              -0.40         -2.08            0.83        -0.05           1.78
Dairy products     0.00         -1.01            0.00         0.01           1.01
Meat               0.00         -0.57            0.00         0.01           1.01
Oils and fats      0.05         -2.33           -0.12        -0.38           0.51
Sugar              0.40          0.13            0.05        -0.14           0.91
Other foods        0.00         -0.53            0.00         0.02           1.02
Nonfoods           0.00         -1.11            0.00        -0.02           0.98

Note: The columns correspond to the elements in the denominator of (5.110). Column 3 is the product of columns 1 and 2; column 4 is the last term in the denominator; and column 5 is 1 plus the sum of columns 3 and 4.
Source: Author's calculations using Household Income and Expenditure Survey, 1984-85.

Equity effects are incorporated in Table 5.11 for a range of values of the distributional parameter $\epsilon$. The first panel corresponds to $\epsilon = 0$ where there are no distributional concerns, and the cost-benefit ratios are simply the reciprocals of the last column in Table 5.10. Rice is a good candidate for a price increase, and edible oils for a price decrease. As we move through the table to the right, the distributional effects modify this picture to some extent. The first column in each case, which shows the relative budget shares of an increasingly poor individual relative to the market-representative individual, moves away from luxuries and towards necessities as the distributional parameter increases. Wheat, the basic staple, attracts a larger weight as we move down the income distribution, as do oils and fats. Other food, which contains pulses, also has a weight that increases with $\epsilon$. Wheat is a target for price increases on efficiency grounds, although not as much as is rice, but the equity effects tend to make it less attractive, and its $\lambda$-value becomes relatively large as we move to the right in the table. However, for oils and fats, which were the main candidates for price decreases, the equity effects only strengthen the conclusions. Oils and fats figure relatively heavily in the budgets of low-income consumers, and reducing their price is desirable for both equity and efficiency reasons. Rice remains the best candidate for consumer tax increases; it is not consumed by the very poor, and its subsidy is the most distortionary.

Table 5.11.
Equity effects and cost-benefit ratios for price increases, Pakistan

                 $\varepsilon=0$      $\varepsilon=0.5$    $\varepsilon=1.0$    $\varepsilon=2.0$
Food             $w_i^{\varepsilon}/\tilde{w}_i$  $\lambda$   (same pair of columns for each value of $\varepsilon$)
Wheat            1.00   0.83      1.09   0.90      1.16   0.97      1.28   1.06
Rice             1.00   0.56      1.04   0.58      1.06   0.59      1.06   0.59
Dairy prod.      1.00   0.99      1.01   1.00      1.01   1.00      0.97   0.96
Meat             1.00   0.99      0.98   0.97      0.95   0.94      0.88   0.87
Oils and fats    1.00   1.97      1.07   2.12      1.13   2.24      1.22   2.41
Sugar            1.00   1.10      1.01   1.10      1.00   1.10      0.97   1.06
Other foods      1.00   0.98      1.04   1.01      1.06   1.04      1.10   1.08
Nonfoods         1.00   1.02      0.96   0.99      0.94   0.96      0.92   0.94

Note: The two columns for each value of $\varepsilon$ are the numerator of (5.110) and (5.110) itself.
Source: Author's calculations using Household Income and Expenditure Survey, 1984-85.

In conclusion, I emphasize once again that these policy implications do not apply directly to the tax instruments currently in use in Pakistan. The findings suggest that raising revenue from an increase in the price of rice to consumers would be desirable on grounds of efficiency and equity. But the main instrument for controlling the price of rice is the export tax, a decrease in which would certainly increase the consumer price of rice, but it would also decrease rather than increase government revenue, and it would change supplies of rice, effects which we are in no position to consider here. Nevertheless, the fact that rice and wheat are substitutes in consumption, and that there are further substitution effects between rice, wheat, and edible oils, would have to be taken into account in a reform of any export tax, just as they are taken into account here. Indeed, perhaps the most useful lesson of this analysis is how important it is to measure the way in which price changes affect the pattern of demand. The Pakistani substitution patterns between rice, wheat, and edible oils are not consistent with additive preferences, and so cannot be accommodated within a model like the linear expenditure system. In countries where some prices are far from opportunity cost, cross-price effects of tax changes will often be as important as the effects on the good itself, and these effects must be measured in a flexible way. Nor could additive preferences accommodate the pattern of total expenditure and own-price elasticities that characterize demand patterns in Pakistan. Oils and fats are a necessity, but have a high own-price elasticity, so that the ratio of the own-price to income elasticity is many times greater than for rice, which has high own-price and income elasticities. Additive preferences require that this ratio be (approximately) the same for all goods. Yet it is this ratio that is the principal determinant of how the balance between equity and efficiency ought to be struck.

Equity and efficiency in price reform in India

Table 5.12 shows the calculated efficiency effects for the Indian case, and corresponds to Table 5.10 for Pakistan. As before, the first column shows the factors $\tau_i/(1+\tau_i)$ calculated from the accounting ratios, while the second column shows the own-price elasticities of quality and quantity together. The product of the first and second columns, which is shown as the third column, gives the contribution of the own-price effects to the measure of the distortion that would be caused by a marginal increase in price. These are nonzero only for the goods that bear taxes or subsidies: rice, wheat, and oils. Wheat is more price elastic than rice, but its subsidy is less, and the own-price distortion effect is also less. However, as the next column shows, the distortion caused by the wheat subsidy is alleviated by the cross-price effects, largely because a lower wheat price draws demand away from rice, which is even more heavily subsidized. There are also important cross-price effects for other cereals, pulses, dairy products, meat, and fruit.
Increases in the price of any of these goods decrease the demand for rice, and help reduce the costs of the rice subsidy. On efficiency grounds alone, the prices of rice, other cereals, pulses, dairy products, and meat should be increased, and those of jowar and nonfoods decreased.

Table 5.12. Efficiency aspects of price reform in India

Food             Tax factor            Own-price    Own      Cross     Total
                 $\tau_i/(1+\tau_i)$   elasticity   effect   effects
Rice               -0.50                 -1.13       0.57     -0.15     1.42
Wheat              -0.25                 -1.33       0.33     -0.32     1.01
Jowar               0.00                 -0.39       0.00     -0.12     0.88
Other cereals       0.00                 -3.51       0.00      0.58     1.58
Pulses              0.00                 -0.60       0.00      0.55     1.55
Dairy products      0.00                 -0.21       0.00      0.26     1.26
Edible oils         0.33                 -0.27      -0.09      0.16     1.07
Meat                0.00                 -1.13       0.00      0.43     1.43
Vegetables          0.00                 -0.66       0.00      0.04     1.04
Fruit               0.00                 -1.08       0.00      0.27     1.27
Sugar and gur       0.00                 -0.28       0.00     -0.03     0.97
All other           0.00                 -1.43       0.00     -0.20     0.80

Note: The columns correspond to the elements of the denominator of (5.110). Column 3 is the product of columns 1 and 2; column 4 is the last term in the denominator; and column 5 is 1 plus the sum of columns 3 and 4.
Source: Deaton, Parikh, and Subramanian (1994, Table 9).

Table 5.13 brings in the equity effects, and computes the cost-benefit ratios that trade off both equity and efficiency. In the first pair of columns, the Atkinson inequality aversion parameter is zero, so that all individuals are treated alike, no matter how much their household spends. In this case, the $\lambda$-ratios are simply the reciprocals of the last column in Table 5.12, and we get the same ranking of relative tax costs. As we move to the right, and the $\varepsilon$ parameter increases, the equity column gives larger values to the goods most heavily consumed by the poor, and relatively smaller values to those that are most heavily consumed by those who live in households that are better-off. For $\varepsilon = 0.5$, jowar receives the highest weight, with other cereals, pulses, vegetables, sugar, rice, and edible oils all showing equity effects greater than unity. These are the goods most heavily consumed by the poor. As $\varepsilon$ increases further, the importance of these goods for distribution increases further. Of the two subsidized cereals, the equity case is stronger for rice than wheat (which is indeed why rice carries the larger subsidy), and the difference between them increases with the degree of inequality aversion. Neither cereal is consumed as heavily by the poor as are the coarse cereals, although there are limitations on the extent to which the government could intervene in that market.

Table 5.13. Equity effects and cost-benefit ratios for price increases, India

                 $\varepsilon=0$      $\varepsilon=0.5$    $\varepsilon=1.0$    $\varepsilon=2.0$
Food             $w_i^{\varepsilon}/\tilde{w}_i$  $\lambda$   (same pair of columns for each value of $\varepsilon$)
Rice             1.00   0.71      1.02   0.72      1.03   0.73      1.02   0.72
Wheat            1.00   0.99      0.98   0.97      0.96   0.95      0.91   0.90
Jowar            1.00   1.14      1.11   1.27      1.21   1.38      1.39   1.58
Other cereals    1.00   0.63      1.09   0.69      1.16   0.73      1.24   0.78
Pulses           1.00   0.65      1.05   0.68      1.09   0.70      1.15   0.75
Dairy prod.      1.00   0.79      0.97   0.77      0.94   0.74      0.87   0.69
Edible oils      1.00   0.94      1.02   0.96      1.04   0.98      1.07   1.01
Meat             1.00   0.70      1.00   0.70      0.98   0.69      0.94   0.66
Vegetables       1.00   0.96      1.04   1.00      1.06   1.02      1.09   1.05
Fruit            1.00   0.79      0.96   0.76      0.93   0.73      0.84   0.66
Sugar & gur      1.00   1.03      1.03   1.07      1.07   1.10      1.12   1.16
All other        1.00   1.25      0.96   1.20      0.92   1.15      0.88   1.10

Note: The two columns for each value of $\varepsilon$ are the numerator of (5.110) and (5.110) itself.
Source: Deaton, Parikh, and Subramanian (1994, Table 10).

The cost-benefit ratios $\lambda$ bring together the equity and efficiency effects.
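The arithmetic that links Tables 5.12 and 5.13 can be checked directly. The following is only an illustrative sketch in Python, with function and variable names that are assumptions of the example rather than anything in the text: it rebuilds the total column of Table 5.12 from columns 1, 2, and 4 as described in the table note, and then divides the equity numerators of Table 5.13 for $\varepsilon = 1.0$ by those totals. Because the published entries are rounded to two decimals, the recomputed figures can differ from the printed ones in the last digit.

```python
# Table 5.12 (India) as printed: tax factor, own-price elasticity of
# quantity times quality, and cross effects (columns 1, 2, and 4).
TABLE_5_12 = {
    "Rice": (-0.50, -1.13, -0.15),
    "Wheat": (-0.25, -1.33, -0.32),
    "Jowar": (0.00, -0.39, -0.12),
    "Other cereals": (0.00, -3.51, 0.58),
    "Pulses": (0.00, -0.60, 0.55),
    "Dairy products": (0.00, -0.21, 0.26),
    "Edible oils": (0.33, -0.27, 0.16),
    "Meat": (0.00, -1.13, 0.43),
    "Vegetables": (0.00, -0.66, 0.04),
    "Fruit": (0.00, -1.08, 0.27),
    "Sugar and gur": (0.00, -0.28, -0.03),
    "All other": (0.00, -1.43, -0.20),
}

# Equity numerators from Table 5.13 for epsilon = 1.0, as printed.
EQUITY_EPS_1 = {
    "Rice": 1.03, "Wheat": 0.96, "Jowar": 1.21, "Other cereals": 1.16,
    "Pulses": 1.09, "Dairy products": 0.94, "Edible oils": 1.04,
    "Meat": 0.98, "Vegetables": 1.06, "Fruit": 0.93,
    "Sugar and gur": 1.07, "All other": 0.92,
}


def efficiency_total(tax_factor, own_elasticity, cross_effect):
    """Column 5 of Table 5.12: 1 plus the own effect plus the cross effects."""
    own_effect = tax_factor * own_elasticity  # column 3 = column 1 x column 2
    return 1.0 + own_effect + cross_effect


def cost_benefit_ratio(equity_numerator, total):
    """Equity numerator divided by the efficiency total."""
    return equity_numerator / total


for good, (tax_factor, elasticity, cross) in TABLE_5_12.items():
    total = efficiency_total(tax_factor, elasticity, cross)
    ratio = cost_benefit_ratio(EQUITY_EPS_1[good], total)
    print(f"{good:15s} total = {total:5.2f}   lambda = {ratio:5.2f}")
```

Setting every equity numerator to 1.00 gives the $\varepsilon = 0$ case, in which the ratios are simply the reciprocals of the totals.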
Because of the subsidy, and because it is less heavily consumed by the poor than coarse cereals, rice is always a candidate for a price increase. But when concern for the poor is relatively low, increases in the prices of pulses and of other cereals are less socially costly, because of the large cross-price elasticities with rice. Given any positive degree of equity preference, the most attractive candidate for a price decrease is jowar. When equity preference is large, the efficiency case for raising prices of pulses and other cereals is outweighed by equity criteria, and meat and fruit become the best candidates for tax increases.

5.6 Price reform: parametric and nonparametric analysis

This has been a long chapter, and although the topic has been the same throughout, we have followed it from the theory through the data problems to the econometric implementation and policy evaluation. Although the methodology is inevitably complex in parts, it has been successfully applied in a number of countries, and there is standardized software, which is included in the Code Appendix. The real difficulties lie, as always, not in the econometric sophistication, but in setting up the problem in a sensible way, picking the right goods for the analysis, and in the crucial preliminary analysis of the data: looking for outliers, interpreting and examining the unit value data, and making whatever adaptations are necessary for local conditions or institutions. It is rarely possible to do satisfactory policy analysis with "off-the-shelf" econometrics. Even so, the procedures of this chapter should be helpful as a basis for analysis of tax reform and pricing issues. In this brief final section, I want to draw attention to areas where the methodology is weakest, and where there is likely to be the greatest payoff to further research.

A major contrast between the analysis of this chapter and that in Chapters 3 and 4 is the absence of nonparametric methods.
Instead, I have used a parametric model of demand to deliver the price responses, and this is in sharp contrast to exercises like that in Chapter 3, where I looked at the distributional effects of rice pricing in Thailand without having to resort to a specific functional form. The reason for the difference is the difficulty in estimating price responses; even with thousands of observations, there is often limited price variation, so that there is little hope of direct measurement without imposing prior structure. Although the model in this chapter also used a parametric form to specify the effects of total expenditure and sociodemographics, this is only necessary to allow isolation of the price responses, and no use of the results is made in the policy exercises in Section 5.5; the survey data themselves are used directly to calculate the average and socially representative budget shares in the cost-benefit formulas.

An important topic for research is whether it is possible to do better, and bring a nonparametric element to the estimation of the price responses. The hope comes from the fact that the cost-benefit formula (5.9) does not require an estimate of the price derivative at every point in the sample, or even for different values of total expenditure or the sociodemographics. Instead, all that appears in (5.9) is the aggregate of each household's response, or equivalently, since we can divide top and bottom of (5.9) by the number of households, the average price response for all households. Among recent developments in econometrics have been methods for estimating such average derivatives without requiring the specification of a functional form (see Härdle and Stoker 1989; Stoker 1991; and Powell, Stock, and Stoker 1989, for the basic techniques, and Deaton 1995, for a discussion in the context of pricing policy in developing countries).
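To give a concrete sense of what an average derivative estimator computes, here is a minimal sketch of one variant, the density-weighted average derivative associated with Powell, Stock, and Stoker (1989), for a single regressor. It is an illustration under stated assumptions and not the procedure applied to the survey data: the Gaussian kernel, the hand-picked bandwidth, the function name, and the simulated data are all choices made for this example. The estimator averages $-2\,y_i\,\hat{f}'(x_i)$, where $\hat{f}$ is a leave-one-out kernel estimate of the density of $x$; this targets the density-weighted average of the derivative of the regression function.

```python
import numpy as np


def density_weighted_average_derivative(x, y, bandwidth):
    """Estimate -2 * E[y * f'(x)] for scalar x with a Gaussian kernel.

    f is the density of x, estimated leaving out each point's own
    observation. Under regularity conditions the target equals
    E[f(x) * g'(x)], where g(x) = E[y | x].
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    u = (x[:, None] - x[None, :]) / bandwidth            # scaled pairwise gaps
    kernel_slope = -u * np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    np.fill_diagonal(kernel_slope, 0.0)                  # leave-one-out
    density_slope = kernel_slope.sum(axis=1) / ((n - 1) * bandwidth**2)
    return -2.0 * np.mean(y * density_slope)


# Simulated data (hypothetical): demand is linear in log price, slope -0.8.
rng = np.random.default_rng(0)
log_price = rng.normal(size=1500)
demand = -0.8 * log_price + 0.5 * rng.normal(size=1500)

delta_y = density_weighted_average_derivative(log_price, demand, 0.3)
delta_x = density_weighted_average_derivative(log_price, log_price, 0.3)

# With a linear regression function, the ratio of the two density-weighted
# derivatives recovers the slope without specifying a functional form.
slope = delta_y / delta_x
print(f"density-weighted derivative: {delta_y:.3f}, implied slope: {slope:.3f}")
```

The pairwise kernel matrix makes the cost grow with the square of the sample size, which is one simple way to see why such calculations become slow in samples of realistic size.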
These "semiparametric" methods deliver only averages, not estimates for each household, but in large enough samples they do so almost as efficiently as does parametric regression.

A preliminary investigation of average derivative techniques using the Pakistani data is provided by Deaton and Ng (1996). The method has both advantages and disadvantages. Among the former is the removal of any need to write down a theoretically consistent model of demand. Even more importantly, the absence of a functional form provides a clean solution to the problems associated with some households buying the good, and some not. The average derivative methods estimate the derivative of the regression function, the expectation of demand conditional on prices, total expenditure, and sociodemographics, but unconditional on whether or not the household buys the good. This is exactly what the cost-benefit formulas require, and so the method avoids the conventional detour through the complexity of utility theory and switching regression models.

There are equally serious disadvantages. First, there is no apparent way of allowing for either quality effects or measurement error. The former is perhaps not too serious, but there are many examples where the neglect of the latter could produce serious biases. Second, although we can do without a functional form, it remains necessary to specify which are the conditioning variables, and the omission of relevant variables will lead to inconsistent estimates of price derivatives when the omitted variables are correlated with prices, in exactly the same way that OLS is biased by omitted variables. Third, although the average derivative method finesses the problem of zero observations in the unit values, it provides no assistance with the fact that unit value observations are missing for such households.
Because the missing households are unlikely to be randomly selected (more households will buy nothing where prices are high than where they are low), there is a risk of selectivity bias, and that risk is present whether we use parametric or semiparametric methods. In the methods of this chapter, the effects of selectivity are moderated by the averaging over clusters, so that a village is included as long as one of its households purchases the good. Such averaging is much less attractive once the parametric framework is abandoned, since although the average derivative estimation is asymptotically as efficient as ordinary least squares, it requires a good many more observations in practice.

The final disadvantage, at least for the present, is the time required to compute the estimates. The formulas are not complex, and the estimators can be written in closed form, so that there is no need for iterative solution, but with realistic sample sizes, the calculations are slow even with very fast machines. Although this problem will inevitably diminish over time, the methods are difficult to recommend now, partly because many researchers in developing countries gain access to fast machines only with a considerable lag, and partly because the computational costs have limited the experience with the methods in real applications, so that we cannot be sure that there are not hidden difficulties that will only become apparent through use.

5.7 Guide to further reading

The World Bank volume edited by Newbery and Stern (1987) contains the theoretical background for this chapter, as well as a number of applied essays that discuss applications to specific countries. The introductory essays by Stern, Chapters 2 and 3, as well as Chapters 7 and 13 on taxation and development, and on agricultural taxation by Newbery are worth special attention.
Newbery's Chapter 18 on agricultural price reform in the Republic of Korea is a model for how such studies should be done. The book by Ahmad and Stern (1991) also reviews the theory and, like this chapter, applies it to tax reform in Pakistan. Sah and Stiglitz (1992) develop the theory of pricing in the context of a conflict between agricultural producers and urban consumers. The theory of price reform is closely related to the theory of cost-benefit analysis, the former dealing with the effect on social welfare of small changes in prices, the latter with the effects of small changes in quantities, that is, projects. Survey papers that emphasize this link are Drèze and Stern (1987, 1990) as well as the splendid review by Squire (1989). Quite different insights into pricing policy are obtained from the perspective of political economy; this literature is more concerned with why prices and taxes come to be what they are, and less with prescriptions based on equity and efficiency (see the World Bank volumes by Krueger, Schiff, and Valdés 1991-92, the early political analysis by Bates 1981, Grossman and Helpman 1994, and the immensely readable monograph by Dixit 1996).

6 Saving and consumption smoothing

There are many reasons to be interested in the saving behavior of households in developing countries. One is to understand the link between saving and growth, between saving and economic development itself. Although economists are still far from a generally accepted understanding of the process of development, saving plays an important role in all the theoretical accounts and the data show a strong positive correlation between saving and growth, both over time in individual countries, and in international comparisons across countries.
If we believe that saving generates growth, and since there is a close link between household and national saving rates, both over time and over countries (see Figure 6.1 for evidence from a number of developing countries), then it is to the determinants of household saving that we must look if we are to understand economic growth. If instead we suppose that saving rates are a by-product of economic growth generated in some other way (for example, by government-directed investment programs, or by the dynamic gains from trade), our understanding is not complete unless we can explain why it is that economic growth causes households to save a larger fraction of their incomes.

The second reason for examining household saving is to understand how people deal with fluctuations in their incomes. The majority of people in poor countries are engaged in agriculture, and their livelihoods are often subject to great uncertainty, from weather and natural calamities, from sickness, and from fluctuations in the prices of their crops. Individuals close to subsistence need to free consumption from income, so that they are not driven to extremities simply because their incomes are temporarily low. These "smoothing" or "insurance" mechanisms can take a number of forms, of which saving is one. By laying aside some income in good times, people can accumulate balances for use in bad times, a procedure that would be available even to Robinson Crusoe, provided that he had some commodity that could be stored. In a society with many people, risks can be pooled, either through formal financial intermediaries or through more informal networks of personal credit or social insurance, at the local or national levels.

[Figure 6.1. National and household saving, selected countries (percent). Panels: Botswana; Colombia; Jamaica; Korea, Republic of; Peru; Philippines; Sri Lanka; Thailand. Series: national saving and household saving, 1980 to 1990. Source: UN Statistical Yearbook, various years.]

One useful way of classifying saving behavior is by the length of the time period over which households can detach consumption from income. We would normally expect even the poorest households to be able to consume on days on which they received no income, but it is more of an issue whether or not consumption is at risk from seasonal fluctuations in income, and there is a substantial literature in development economics on the risks associated with seasonal shortages and fluctuations in prices. Another literature, often couched in terms of the permanent income hypothesis, looks at whether farmers set something aside in good years and so detach their annual consumption levels from their annual incomes. Beyond this, people may smooth their consumption over their lifetimes, saving when they are young to make provision for their old age. Whether and how well individuals undertake these various kinds of saving has implications not only for welfare, but also for the behavior of the macroeconomy and for the relationship between saving and growth. In particular, the life-cycle hypothesis, that households successfully undertake lifetime consumption smoothing, has the famous implication that changes in the rate of economic growth will cause changes in saving rates in the same direction (Modigliani 1966, 1970, 1986).

Paralleling the taxonomy of consumption smoothing over different time periods is a taxonomy of consumption smoothing over different groups of people. At one extreme, every individual is an island, and must deal alone with income fluctuations and other threats to consumption by saving and dissaving.
At the other extreme, we can imagine social insurance schemes, spanning large groups of agents, in which individual risk is diversified away by being spread over everyone, leaving consumption subject only to economy-wide risk. Intermediate cases are of the greatest practical interest. We usually suppose that consumption risk is pooled within families, and the anthropological literature gives many examples of risk pooling in agricultural villages. Social security exists in one form or another, but rarely provides more than partial protection.

My concern in this book is with the analysis of household survey data, and the main task of this chapter is to present evidence from surveys that helps us understand the determinants of household saving. This is a difficult and speculative undertaking, and one that is a good deal more provisional than anything else in the book. First, for the reasons discussed in Chapter 1, the measurement of saving in household surveys is subject to large margins of error, so that especially where household saving rates are low, it may be almost impossible to obtain any useful measure of household saving. Second, there are several different theoretical explanations of household saving, and there is no general agreement on which (if any) is correct. We are not yet at the stage where the policy prescriptions are clear, and where the only remaining task for empirical analysis is to provide the magnitudes necessary to set the controls. Instead, new empirical evidence and theoretical insights are still coming in, and we remain at the stage where there coexist different models and different interpretations with quite different policy implications.

In the main sections of this chapter, I present some of the most important models of household consumption, and discuss methods and results for using them to interpret the survey data.
As always, my aim is not to provide a comprehensive treatment of the topic, nor to survey previous research in the area, but to provide template examples of empirical analyses that cast light on the topic in question. I consider in turn the life-cycle hypothesis (Section 6.1) and the shorter-term smoothing over years and seasons that is predicted by various versions of the permanent income hypothesis (Section 6.2). Section 6.3 develops the theory of intertemporal choice and uses the results to consider extensions beyond the permanent income and life-cycle models, extensions where households have a precautionary motive for saving, or where households have no access to credit markets, but use assets as a buffer to help smooth their consumption. Section 6.4 reviews recent work on insurance mechanisms, on whether households are able to stabilize their consumption by pooling risks across individuals instead of by saving and dissaving over time. Section 6.5 is about the relationship between saving, age, and inequality. Section 6.6 tries to bring all of the material together and to summarize its implications for a number of policy questions. In particular, I draw some tentative conclusions about the relationship between saving and growth.

6.1 Life-cycle interpretations of saving

The life-cycle model, originally proposed by Modigliani and Brumberg (1954, 1979), sees individuals (or couples) smoothing their consumption over their lifetimes. In the simplest "stripped-down" model, people receive labor income (earnings) only until the retirement age, but hold their consumption constant over their lives. As a result, they are net savers during their working years, and dissavers during retirement.
The assets that are sold by the elderly to finance their consumption are accumulated by the young, and provided there is neither population nor income growth, the provision for old age will support a constant ratio of wealth to earnings in the economy as a whole, and there will be no net saving. With either population growth or per capita earnings growth, the total amount of saving by the young will be magnified relative to the total amount of dissaving by the elderly, who are fewer in number and have less lifetime wealth, so that increases in either type of growth will increase the saving rate. Simplified versions of the model predict that the saving ratio should increase by about 2 percentage points for each percentage point increase in the growth rate, a prediction that is consistent both with the cross-country relationship between saving and growth as well as with the declines in saving rates that have accompanied the productivity slowdown in the industrialized countries (Modigliani 1993).

The positive causality from growth to saving is qualitatively robust to changes in assumptions about the shapes of lifetime profiles of consumption, earnings, and saving, provided only that the average age at which saving takes place is lower than the average age of dissaving. This may not be the case if the rate of economic growth is expected to be very high, because the young may wish to dissave in anticipation of higher earnings in the future. Even so, it is plausible that these groups will be unable to borrow the resources needed to finance such spending, so that a presumption remains that growth will increase saving rates for the economy as a whole. A fuller account of this summary can be found in many sources, for example in Modigliani's (1986) Nobel address; a review by the author is Deaton (1992c, Chapter 2).
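The mechanism that links growth to saving in the stripped-down model can be made concrete with a small calculation. The sketch below is an illustration only, and every parameter in it is an assumption of the example rather than a value taken from the text: adults work for 40 years, are retired for 10, hold consumption constant over life, face a zero interest rate, and each new cohort has lifetime resources that are larger than those of its predecessor by the growth rate. The slope of the saving rate with respect to growth depends on these choices, so the output should not be expected to reproduce the figure of about 2 percentage points quoted above.

```python
import math


def aggregate_saving_rate(growth, work_years=40, retired_years=10):
    """Steady-state saving rate in a stripped-down life-cycle economy.

    Each cohort earns a constant amount while working and consumes a
    constant amount over its whole adult life, at a zero interest rate.
    Total lifetime earnings of successive cohorts grow at rate `growth`,
    which can be read as population growth, per capita earnings growth,
    or the sum of the two.
    """
    life_years = work_years + retired_years
    # Resources of the cohort that is `age` years into adult life,
    # relative to the cohort that has just entered.
    weight = [math.exp(-growth * age) for age in range(life_years)]
    income = sum(weight[:work_years])                       # only workers earn
    consumption = (work_years / life_years) * sum(weight)   # everyone consumes
    return 1.0 - consumption / income


for growth in (0.0, 0.01, 0.02, 0.03):
    rate = aggregate_saving_rate(growth)
    print(f"growth = {growth:.2f}   saving rate = {rate:.3f}")
```

With zero growth, the saving of workers is exactly offset by the dissaving of the retired; with positive growth, the working cohorts carry more weight than the retired ones, and the aggregate saving rate is positive and rising in the growth rate.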
The validity of the life-cycle description of saving has been the subject of a great deal of debate and research, mostly in the context of developed economies. Although there is still room for differences of opinion, my own belief, argued more fully in Deaton (1992c), is that the life-cycle model overstates the degree to which consumption is in fact detached from income over the life cycle, and that aside from institutionalized employer or national pension schemes, relatively few households undertake the long-term saving and dissaving that is predicted by the model. There are also questions about applying the model to developing countries. In the poorest economies, where family size is large, and life expectancy relatively low, the fraction of old people is small and very few of them live alone. In the LSMS for Côte d'Ivoire in 1986, less than 1 (10) percent of females (males) aged 70 or over live alone or with their spouse. Even in Thailand, where the demographic transition is much further advanced, the corresponding figures for men and women in rural areas in the 1986 Socioeconomic Survey were 15 and 25 percent (see Deaton and Paxson 1992, Table 6.4). When the elderly live with their children, they can be provided for directly and personally; there is no need for the accumulation and decumulation of marketable assets that is done in developed economies through anonymous and impersonal financial markets.

Nevertheless, there is a wide range of income levels, demographic structures, and customary living arrangements among developing countries. For those countries where economic development and the demographic transition have proceeded the furthest, living arrangements are already changing, and provision for consumption by the elderly is moving into the forefront of public policy.
These changes are at their most rapid in the fast-growing economies of East and South East Asia, countries which also have national and household saving rates that are very high by historical and international standards. The mechanisms that produce this conjunction are far from clear. Is the high saving a consequence of the rapid growth, or its cause? Are the high saving rates generated by demographic change and the fact that many current savers will have to fend for themselves in their old age? Does the political pressure for social security spring from the same concerns, and will its introduction cut saving rates and threaten economic growth? Indeed, will the aging of the population that is already underway mean lower saving rates as the fraction of dissavers increases? One key to many of these questions is whether or not the life-cycle hypothesis provides an adequate description of the behavior of saving.

In this section, I look at life-cycle saving behavior in the data from Côte d'Ivoire, Thailand, and Taiwan (China), starting with simple age profiles of consumption, and moving on to a more sophisticated analysis based on the cohort methods of Section 2.6 to interpret the data in terms of the life-cycle model.

Age profiles of consumption

Perhaps the most obvious way to examine the life-cycle behavior of consumption from survey data is to plot consumption and saving against age. But as soon as we try to do so, we encounter the difficulty of measuring income and thus saving, as well as the fact that much income data and almost all consumption data relate to households and not individuals. It is far from clear that the large households that exist in many rural areas around the world, and that often contain subunits of several families, can be adequately described in terms of the simple life cycle of the nuclear family.
Even so, we have little choice but to classify households by the age of the head, to measure income as best we can, and to see whether the results make sense.

Figure 6.2, taken from Deaton (1992c, p. 55), shows age profiles of consumption and income for 1,600 or so households in the Côte d'Ivoire Living Standards Surveys for 1985 and 1986, as well as profiles for 3,589 urban and 5,012 rural households from the 1986 Thai Socioeconomic Survey. In order to keep the profiles relatively smooth, I have used the averages for each age to calculate the five-year moving averages shown in the figures; five-year smoothing also eliminates problems of "age-heaping" when some people report their age rounded to the nearest five years.

[Figure 6.2. Age profiles of income and consumption, Côte d'Ivoire, 1985-86, and Thailand, 1986. Panels: Ivorian households, 1985; Ivorian households, 1986; Thai urban households, 1986; Thai rural households, 1986. Series: income and consumption. Horizontal axis: age of household head. Source: Author's calculations using CILSS 1985 and 1986, and Socioeconomic Survey of Thailand, 1986.]

The graphs do not look much like the stylized life-cycle profiles of the textbook; there is little evidence of either "hump saving" among the young or of dissaving among the elderly. Instead, and with the exception of urban Thailand, there is very little saving (or dissaving) at any age, with the consumption profile very close to the income profile. Even in urban Thailand, where there is positive saving, the consumption profile is similar to the income profile, and most saving takes place after age 40. There is also no sign of the predicted dissaving among older households.
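The smoothing used to build age profiles like those in Figure 6.2 can be sketched in a few lines. This is a minimal illustration, not the code behind the figure: it assumes that the moving average is centered on each age, that ages with no observations are skipped, and that the window is simply truncated at the youngest and oldest ages. The function name and the toy data are hypothetical.

```python
from collections import defaultdict


def age_profile(ages, values, window=5):
    """Average `values` by age of head, then apply a centered moving average."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for age, value in zip(ages, values):
        totals[age] += value
        counts[age] += 1
    by_age = {age: totals[age] / counts[age] for age in totals}

    half = window // 2
    smoothed = {}
    for age in sorted(by_age):
        # Use whichever ages in the window are actually observed.
        neighbours = [by_age[a] for a in range(age - half, age + half + 1)
                      if a in by_age]
        smoothed[age] = sum(neighbours) / len(neighbours)
    return smoothed


# Toy data: age of household head and household consumption.
head_ages = [30, 30, 31, 32, 33, 34]
consumption = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]

profile = age_profile(head_ages, consumption)
for age, value in profile.items():
    print(age, round(value, 2))
```

Averaging across five adjacent ages spreads any spike at a reported age ending in 0 or 5 over its neighbours, which is why the same window also deals with age-heaping.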
This close coordination between consumption and income has been encountered in several other cross-sectional data sets. For example, Carroll and Summers (1991) find a close correlation in the United States between consumption and income profiles for different educational and occupational groups, and conclude that consumption "tracks" income more closely than would be expected from life-cycle models whose essence is that consumption and income need bear no relationship to one another. Similarly, Attanasio and Davis (1993) use cohort data from the United States to find a close relationship between five-year consumption and income changes, although the relationship is much weaker between one-year changes. More comprehensively, Paxson (1996) uses cohort data from the United States, Britain, Thailand, and Taiwan (China) to document consumption and income tracking. While it is possible that the age profiles of consumption are simply those that people like, and just happen to match the shape of income profiles, it stretches belief that the coincidence should happen for every educational and occupational group as well as for a wide range of countries, the preferences of each just happening to look like its income path, even though the income profiles differ from case to case.

Even so, the evidence in Figure 6.2 has to be treated with a great deal of caution and is subject to several interpretations. In spite of positive saving among urban Thai households, the household saving rate from the 1986 survey is much lower than that reported in the National Income and Product Accounts (NIPA). Although NIPA measures of saving should never be automatically credited over measures from household surveys (NIPA measures of household saving are frequently derived as residuals), the discrepancies serve to remind us of the general difficulty of measuring saving, whether by surveys or other means.
In this particular case, Paxson (1992) has pointed out that in the presence of inflation, the practice of comparing money income reported for the last year with twelve times the money consumption reported for the last month will depress estimates of saving. However, her corrections for 1986 are not large, and she reports negative total savings on average for all nonurban households; among the latter, total saving is less than ten percent of their total income. These figures are not consistent with the national accounts data shown in Figure 6.1.

Even in the absence of measurement error, there are a number of difficulties with age profiles of consumption such as those in Figure 6.2. First, these profiles are simply the cross sections for the households in the surveys, and there is no reason to suppose that the profiles represent the typical or expected experience for any individual household or group of households. We are not looking across ages for the same household or same cohort of households, but at the experience at different ages of different groups of households, whose members were born at different dates and have had quite different lifetime expe