WPS6351

Policy Research Working Paper 6351

How Much International Variation in Child Height Can Sanitation Explain?

Dean Spears

The World Bank
Sustainable Development Network
Water and Sanitation Program
February 2013

Abstract

Physical height is an important economic variable reflecting health and human capital. Puzzlingly, however, differences in average height across developing countries are not well explained by differences in wealth. In particular, children in India are shorter, on average, than children in Africa who are poorer, on average, a paradox called “the Asian enigma” which has received much attention from economists. This paper provides the first documentation of a quantitatively important gradient between child height and sanitation that can statistically explain a large fraction of international height differences. This association between sanitation and human capital is robustly stable, even after accounting for other heterogeneity, such as in GDP. The author applies three complementary empirical strategies to identify the association between sanitation and child height: country-level regressions across 140 country-years in 65 developing countries; within-country analysis of differences over time within Indian districts; and econometric decomposition of the India-Africa height differences in child-level data. Open defecation, which is exceptionally widespread in India, can account for much or all of the excess stunting in India.

This paper is a product of the Water and Sanitation Program, Sustainable Development Network. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://econ.worldbank.org. The author may be contacted at dspears@princeton.edu.
The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team

How much international variation in child height can sanitation explain?

Dean Spears∗

First circulated: 10 December 2012
This version: 17 January 2013

Abstract

Physical height is an important economic variable reflecting health and human capital. Puzzlingly, however, differences in average height across developing countries are not well explained by differences in wealth. In particular, children in India are shorter, on average, than children in Africa who are poorer, on average, a paradox called “the Asian enigma” which has received much attention from economists. This paper provides the first documentation of a quantitatively important gradient between child height and sanitation that can statistically explain a large fraction of international height differences. This association between sanitation and human capital is robustly stable, even after accounting for other heterogeneity, such as in GDP. I apply three complementary empirical strategies to identify the association between sanitation and child height: country-level regressions across 140 country-years in 65 developing countries; within-country analysis of differences over time within Indian districts; and econometric decomposition of the India-Africa height difference in child-level data.
Open defecation, which is exceptionally widespread in India, can account for much or all of the excess stunting in India.

∗ Princeton University. Wallace Hall. Princeton, NJ 08540. dspears@princeton.edu. I am grateful for helpful suggestions from Luis Andres, Robert Chambers, Juan Costain, Jean Drèze, Marianne Fay, Ariel Fiszbein, John Newman, Doug Miller; from seminar participants at LBSNAA, Princeton, and the World Bank and WSP; and especially from Anne Case, Diane Coffey, Angus Deaton, Michael Geruso, and Jeff Hammer. Errors and interpretations are my own. Portions of this paper have previously circulated as “Sanitation and open defecation explain international variation in children’s height: Evidence from 140 nationally representative household surveys” and “What does the NFHS-3 (DHS) tell us about rural sanitation externalities?,” both of which this working paper supersedes.

1 Introduction

Physical height is a topic of expanding interest to economists (Steckel, 2009), in large part because it is an important correlate of human capital and health and is a predictor of economic productivity (Currie, 2009). Despite this attention, an important puzzle persists: international differences in height across present day developing countries are not well explained by differences in economic well-being (Deaton, 2007). In particular, people in India are shorter, on average, than people in Africa, despite the fact that Indians are also richer, on average, a fact that has been labeled the “Asian enigma” (Ramalingaswami et al., 1996).

One candidate explanation which has received relatively little attention in economists’ recent investigations of the puzzle of Indian stunting (e.g. Deaton, 2007; Tarozzi, 2008; Jayachandran and Pande, 2012; Panagariya, 2012) is sanitation. Medical research documents that chronic childhood environmental exposure to fecal germs can be an important cause of stunting (Humphrey, 2009).
Sanitation coverage is exceptionally poor in India, where over half of households defecate openly without using a toilet or latrine, a much larger fraction than in other countries with similar income. According to joint UNICEF and WHO (2012) estimates for 2010, 15 percent of people in the world, and 19 percent of people in developing countries, openly defecate without using any toilet or latrine.

The primary contribution of this paper is to document that much of the variation in child height among developing countries can be explained by differences in rates of open defecation. Sanitation robustly explains variation in stunting, even after accounting for GDP and other dimensions of heterogeneous economic development. Other recent papers, concentrating on internal validity, have demonstrated the existence of a causal effect of sanitation on child height (Spears, 2012a; Hammer and Spears, 2012); in contrast, this paper assesses the statistical global importance of sanitation quantitatively, using descriptive regressions and econometric decomposition techniques. In particular, differences in open defecation are sufficient to statistically explain much or all of the difference in average height between Indian and African children. These results suggest that open defecation is a policy priority of first-order importance.

This paper makes several contributions to the literature. First, to my knowledge, it offers the first documentation of a quantitatively important cross-country gradient between sanitation and child human capital. Although the association between income and health has been widely studied within and across developing countries, the importance of sanitation has received much less attention. Moreover, I show that sanitation predicts child height even conditional on income.
Controlling for GDP, the difference between Nigeria’s 26 percent open defecation rate and India’s 55 percent is associated with an increase in child height approximately equivalent to quadrupling GDP per capita. Second, this paper documents an interaction between sanitation and population density, consistent with a mechanism in which open defecation harms human capital through exposure to environmental germs. The number of people defecating openly per square kilometer linearly explains 65 percent of international variation in child height. This finding clarifies the policy case for sanitation as a public good. Third, it contributes to a resolution of the puzzle of the “Asian enigma” of Indian stunting, which has received much recent attention from economists. Finally, the conclusions offer a reminder that height, often referred to as an indicator of “malnutrition,” broadly reflects early-life net nutrition, including losses due to disease.

Three sections of the paper contribute complementary analyses of the relationship between height and open defecation, each focusing on a different dimension of heterogeneity. Section 2 studies country-year average sanitation and child heights; here, each observation is a collapsed DHS survey. Open defecation is particularly harmful to children’s health where population density is high, creating a special risk of stunting in India. Section 3 compares children within one country, introducing district fixed effects to repeated cross-section data constructed out of two rounds of India’s National Family Health Survey, in order to study differences within districts over time. Section 4 considers whether the India-Africa height gap can be explained by heterogeneity in village-level open defecation rates, using individual-level data on child heights and decomposition analysis in the spirit of Oaxaca-Blinder.
Reweighting Indian data to match the sanitation of an African sample counterfactually increases the height of Indian children by more than the India-Africa gap. All three approaches find a similar and quantitatively important association between height and sanitation. Finally, a concluding section 5 considers whether estimates of the association between height and sanitation in this paper and from the literature are sufficient to account for the India-Africa gap.

1.1 Open defecation causes stunting

A growing literature in economics documents that physical height has its origins in early life health (e.g. Case and Paxson, 2008), especially in poor countries where environmental threats to health are more important than they are in rich countries, relative to genetics (Martorell et al., 1977; Spears, 2012b). Two existing literatures indicate that early-life exposure to fecal germs in the environment reduces children’s subsequent height. First, medical and epidemiological literatures have documented the mechanisms linking open defecation to poor health and early life human capital accumulation.¹ Humphrey (2009) documents that chronic but subclinical “environmental enteropathy” – a disorder caused by repeated fecal contamination which increases the small intestine’s permeability to pathogens while reducing nutrient absorption – could cause malnutrition, stunting, and cognitive deficits, even without necessarily manifesting as diarrhea (see also Petri et al., 2008; Mondal et al., 2011). Relatedly, Checkley et al. (2008) use detailed longitudinal data to study an association between childhood diarrhea and subsequent height.

¹ Perhaps the recent paper most complementary to this one is Fink et al.’s (2011) regression of an indicator for child stunting on variables including sanitation in 172 pooled DHS surveys. However, one key difference is that they focus on within-country height-sanitation correlations: all regressions include DHS survey fixed effects for country-years.

Second, recent econometric studies find an effect of a government sanitation program in rural India. From 1999 until its replacement with a new program in 2012, the Indian central government operated a “flagship” rural sanitation program called the Total Sanitation Campaign (TSC). Averaging over implementation heterogeneity throughout rural India, Spears (2012a) finds that the TSC reduced infant mortality and increased children’s height, on average. In a follow-up study, Spears and Lamba (2012) find that early life exposure to improved rural sanitation due to the TSC additionally caused an increase in cognitive achievement at age six. Similarly, Hammer and Spears (2012) report a randomized field experiment in Maharashtra, in which children living in villages randomly assigned to a treatment group that received sanitation motivation and subsidized latrine construction grew taller than children in control villages. Section 5.1 considers the estimates of these causally well-identified studies in the context of this paper’s results.

1.2 Open defecation is common in India

Of the 1.1 billion people who defecate openly, nearly 60 percent live in India, which means they make up more than half of the population of India. These large numbers are roughly corroborated by the Indian government’s 2011 census, which found that 53.1 percent of all Indian households – and 69.3 percent of rural households – “usually” do not use any kind of toilet or latrine. In the 2005-6 National Family Health Survey, India’s version of the DHS, 55.3 percent of all Indian households reported defecating openly, a number which rose to 74 percent among rural households.

These statistics are striking for several reasons. First, open defecation is much more common in India than it is in many countries in Africa where, on average, poorer people live.
UNICEF and the WHO estimate that in 2010, 25 percent of people in sub-Saharan Africa openly defecated. In the largest three sub-Saharan countries – Nigeria, Ethiopia, and the Democratic Republic of the Congo – in their most recent DHS surveys, 31.1, 38.3, and 12.1 percent of households report defecating openly.

Second, despite accelerated GDP growth in India, open defecation has not rapidly declined in India over the past two decades, not even during the rapid growth period since the early 1990s. In the DHS, where 55.3 percent of Indian households defecated openly in 2005-06, 63.7 percent did in the earlier 1998 survey round, and 69.7 percent did in 1992. This is particularly true for poor people: the joint UNICEF and WHO report concludes that “the poorest 40 percent of the population in Southern Asia have barely benefited from improvements in sanitation.” In 2010, 86 percent of the poorest quintile of South Asians defecated openly.

Therefore, it is already well-known that open defecation is bad for children’s health; that early-life disease leads to lasting stunting; and that open defecation is exceptionally widespread in India. The contribution of this paper is to quantitatively assess the importance of the link among these facts. The results indicate that sanitation is a statistically important predictor of differences in the height of children in developing countries and can explain differences of interest to economists and of significance to human development.

2 Evidence from country means: 140 DHS surveys

Across countries, observed in different years, how much of the variation in child height is explained by variation in open defecation? This section uses 140 DHS surveys, each collapsed into a single observation, to show that sanitation alone explains more than half of the variation across country-years. The analysis proceeds in several steps.
First, section 2.2 documents that, across country means, height is associated with open defecation, with little change after controlling for GDP. Next, section 2.2.1 uses country fixed effects and replication on sub-samples of world regions to show that no geographic or genetic differences are responsible for the result. Then, section 2.2.2 verifies that other dimensions of infrastructure or well-being do not similarly predict child height. Section 2.2.3 observes that children would be more exposed to fecal pathogens where population is more dense, and finds that open defecation interacts with population density. Section 2.2.4 documents that the association between height and open defecation is steeper among older children, consistent with an unfolding effect of accumulating exposure. Finally, section 2.3 considers the average height difference between children in South Asian and Sub-Saharan African countries, and shows that much of this gap is accounted for by sanitation.

2.1 Data

All data used in this paper are publicly available free of charge on the internet. Demographic and Health Surveys (DHS) are large, nationally representative surveys conducted in poor and middle-income countries. DHS surveys are collected and organized to be internationally comparable. In some countries, several rounds of DHS data have been collected; in others only one or two. I use every DHS survey which recorded household sanitation and measured child height.² This creates a maximum sample of 140 country-years and 65 countries, ranging in frequency from 26 countries that appear in the survey once to 10 that appear in four separate DHS surveys. The earliest survey in the dataset was collected in Pakistan in 1990; the most recent are from 2010.

I match data from other sources to the collapsed DHS surveys. GDP per capita and population are taken from the Penn World Tables. “Polity” and “Democracy” scores of democratization are taken from the Polity IV database.
A measure of calorie availability produced by the World Food Program is used in some specifications. All other variables are from DHS surveys.

² I use published summary statistics available online at www.measuredhs.com. DHS surveys do not include rich countries, such as the U.S. One important omission is China, where there has not been a DHS survey.

Using these data, the basic regression I estimate is

$$\mathit{height}_{cy} = \beta\,\mathit{open\ defecation}_{cy} + \alpha_c + \gamma_y + X_{cy}\theta + \varepsilon_{cy}, \tag{1}$$

where observations are collapsed DHS surveys, $c$ indexes countries, and $y$ indexes years. Open defecation is a fraction from 0 to 1 of the population reporting open defecation without using a toilet or latrine.³ Height is the average height of children under 5 or children under 3, used in separate regressions. As robustness checks, results are replicated with country fixed effects $\alpha_c$, year fixed effects $\gamma_y$, and time-varying controls $X_{cy}$, including the log of GDP per capita, all of which are added separately in stages. Standard errors are clustered by 65 countries.

2.2 Regression results

Figure 1 depicts the main result of this section: a negative association between open defecation and child height is visible across country-years both for children under 3 and children under 5. Regression lines are plotted with and without weighting by country population. The three largest circles are India’s National Family Health Surveys (only one large circle appears in panel (b) because only the 2005 survey measured height of children up to age 5). Average height in India is, indeed, low. However, the fact that the Indian observations are on the regression line – and not special outliers – is an initial suggestion that sanitation might help resolve the “Asian enigma” of Indian height.

Table 1 reports estimates of regression 1 and will be referenced throughout this section of the paper.
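The two regression lines in figure 1 differ only in how much each country-year counts. A minimal sketch of that comparison follows, assuming a bivariate version of equation 1 with no fixed effects or controls; the function name `weighted_line` and all numbers are hypothetical illustrations, not the paper's data or code.

```python
def weighted_line(x, y, weights=None):
    """Return (intercept, slope) from least squares of y on x.

    With weights=None every observation counts equally (ordinary least
    squares); otherwise each squared residual is multiplied by its weight.
    """
    if weights is None:
        weights = [1.0] * len(x)
    total = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total
    s_xy = sum(w * (xi - x_bar) * (yi - y_bar)
               for w, xi, yi in zip(weights, x, y))
    s_xx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# Hypothetical country-years (NOT the paper's data).
open_defecation = [0.05, 0.20, 0.35, 0.55, 0.70]   # fraction of population
height_z = [-0.9, -1.2, -1.3, -1.9, -1.8]          # mean height-for-age z-score
population_millions = [10.0, 30.0, 150.0, 1100.0, 80.0]

unweighted = weighted_line(open_defecation, height_z)
weighted = weighted_line(open_defecation, height_z, population_millions)
print("unweighted intercept, slope:", unweighted)
print("population-weighted intercept, slope:", weighted)
```

With population weights, a very large country pulls the fitted line toward its own observation, which is why reporting both lines is a useful check on whether one country drives the slope.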
The main estimate of a linear decrease in height of 1.24 standard deviations associated with changing the fraction openly defecating from 0 to 1 is qualitatively similar to Spears’s (2012a) estimates of 1.15 to 1.59, where effects of sanitation are identified using heterogeneity in the implementation of an Indian government program. In column 1, sanitation alone linearly explains 54 percent of the country-year variation of children’s height in DHS surveys. Because sanitation and height are both improving over time, column 2 adds year fixed effects; the point estimates slightly increase and standard errors decrease, suggesting that the result is not an artifact of time trends.

Does the significance of open defecation merely reflect general economic development? Column 3 adds a control for GDP per capita; the coefficient on sanitation remains similar, which is consistent with Deaton’s (2007) observation that income does not explain cross-country height differences.⁴ This is reflected in panel (a) of figure 2, which plots the residuals after regressing height of children under 3 years old on the log of GDP against the residuals after regressing open defecation on the log of GDP. The association remains, and the $R^2$ is similar: sanitation linearly explains 54.2 percent of variation in child height, and the sanitation residual explains 53.9 percent of the variation in the height residual.

³ For example, in India’s NFHS-3, the survey asks “What kind of toilet facility do members of your household usually use,” with the relevant answer “No facility/uses open space or field.” This importantly distinguishes latrine use from latrine ownership.

Panel (b) of figure 2 adds average height and exposure to open defecation in wealth subsets of India’s 2005 DHS to the basic plot of country mean height and sanitation. Included in published DHS data is a classification of households into wealth quintiles based on asset ownership.
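The residual-on-residual plot in panel (a) of figure 2 is an instance of partialling out a control. A minimal sketch follows, assuming a single control (log GDP) and using hypothetical numbers rather than the paper's data; the helper names are illustrative. It checks that the slope from regressing the height residual on the open defecation residual equals the open defecation coefficient from a regression that includes log GDP directly.

```python
def simple_ols(x, y):
    """Return (intercept, slope) of a least-squares regression of y on x."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


def residuals(x, y):
    """Residuals of y after a least-squares regression on x."""
    intercept, slope = simple_ols(x, y)
    return [yi - intercept - slope * xi for xi, yi in zip(x, y)]


def coefficient_with_control(y, x1, x2):
    """Coefficient on x1 in a regression of y on a constant, x1, and x2."""
    def cross(a, b):
        a_bar = sum(a) / len(a)
        b_bar = sum(b) / len(b)
        return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
    s1y, s2y = cross(x1, y), cross(x2, y)
    return (s22 * s1y - s12 * s2y) / (s11 * s22 - s12 ** 2)


# Hypothetical country-years (NOT the paper's data).
log_gdp = [6.5, 7.0, 7.4, 7.9, 8.3, 8.8]
open_defecation = [0.60, 0.55, 0.30, 0.35, 0.10, 0.05]
height_z = [-2.0, -1.9, -1.4, -1.6, -1.0, -0.8]

height_resid = residuals(log_gdp, height_z)
sanitation_resid = residuals(log_gdp, open_defecation)
_, residual_slope = simple_ols(sanitation_resid, height_resid)
direct_slope = coefficient_with_control(height_z, open_defecation, log_gdp)
print(residual_slope, direct_slope)
```

The two slopes agree up to floating-point error, so a scatter of residuals shows exactly the variation that identifies the coefficient once GDP is held fixed.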
Average height of children within these groups is plotted against the rate of open defecation among all households in the primary sampling unit where they live, that is, the local open defecation to which they are exposed. Additionally, I follow Tarozzi (2008) in identifying an elite top 2.5 percent of the Indian population: children who live in urban homes with flush toilets that they do not share with other households; whose mothers are literate and have been to secondary school; and whose families have electricity, a radio, a refrigerator, and a motorcycle or car. Even these relatively rich children are shorter than healthy norms; this is expected, because 7 percent of the households living near even these rich children defecate openly. Indeed, the graph shows that their stunted height is approximately what would be predicted given the open defecation in their environment. More broadly, the association between height and sanitation among these wealth groups is close to the international trend computed from country means. Exposure to nearby open defecation linearly explains 99.5 percent of the variation in child height across the five asset quintiles.

⁴ GDP per capita statistically significantly interacts with open defecation to predict height:

$$\mathit{height}_{cy} = -1.42 - 1.18\,\mathit{open\ defecation}_{cy} + 0.14\,\ln(\mathit{GDP})_{cy} - 0.59\,\mathit{open\ defecation}_{cy} \times \ln(\mathit{GDP})_{cy},$$

where open defecation and GDP are demeaned, the coefficients on open defecation and the interaction are statistically significant at the 0.01 level, and GDP is statistically significant at the 0.05 level. Thus, the slope on ln(GDP) would be 0.36 with no open defecation, but only 0.038 at India’s 2005 level of open defecation, consistent with the low apparent effect of recent Indian economic growth on stunting. Although it is difficult to interpret this result causally, one possibility is that private health inputs such as food do less to promote child height in a very threatening disease environment; I thank Angus Deaton for this suggestion.

2.2.1 A geographic or genetic artifact?

Perhaps people who live in certain countries or regions tend to be tall or short, and this is coincidentally correlated with open defecation. Is the result driven by certain countries or regions, or fixed differences such as genetics?

Figure 3 presents initial evidence against this possibility. The sample is restricted to countries with more than one DHS observation, and country means across collapsed DHS surveys are subtracted from the height and sanitation survey averages. The figure plots the difference in a country-year’s height from that country’s mean across DHS surveys against the difference in sanitation. The slope is similar to the undifferenced plot. Moreover, panel (b) continues to demonstrate an association despite not including any data from India.

Returning to table 1, column 4 adds country fixed effects. A control is also added for the average height of mothers of measured children; this is in anticipation of a possibility observed by Deaton and Drèze (2009) and considered in more depth in section 4.3 that Indian stunting is not caused by current nutritional deprivation or sanitary conditions, but is instead an effect of historical conditions that stunted the growth of women who are now mothers, restricting children’s uterine growth. DHS surveys are categorized into six global regions;⁵ column 5 adds six region-specific linear time trends $\sum_r \delta_r\,\mathit{year}_y$, to rule out that the effect is driven by spurious changes in specific parts of the world. Neither of these additions importantly changes the estimate of the coefficient, although adding so many controls increases the standard errors.

Table 2 further confirms that no one region is responsible for the results.
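The differencing behind figure 3 and the country fixed effects in column 4 can be sketched as subtracting each country's own mean before fitting a line. The data below are hypothetical and deliberately constructed so that a fixed country difference masks a within-country slope of exactly -1; this is an illustration of why the check matters, not the paper's finding (where the differenced and undifferenced slopes are similar). All function names are illustrative.

```python
from collections import defaultdict


def slope(x, y):
    """Least-squares slope of y on x (with an intercept)."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


def demean_by_group(groups, values):
    """Subtract each group's mean from its members' values."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for g, v in zip(groups, values):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for g, v in zip(groups, values)]


# Hypothetical surveys (NOT the paper's data): within each country, height
# falls one-for-one with open defecation, but country B is shorter overall.
country = ["A", "A", "A", "B", "B"]
open_defecation = [0.6, 0.5, 0.4, 0.2, 0.1]
height_z = [-1.1, -1.0, -0.9, -1.7, -1.6]

pooled_slope = slope(open_defecation, height_z)
within_slope = slope(demean_by_group(country, open_defecation),
                     demean_by_group(country, height_z))
print("pooled:", pooled_slope, "within-country:", within_slope)
```

In this constructed case the pooled slope is positive while the within-country slope is negative, so any fixed country-level trait correlated with sanitation would show up as a gap between the two.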
The association between height and sanitation is replicated in regressions that omit each of the six world regions in turn. The coefficient near 1 notably remains when South Asian observations are omitted, again suggesting that the result is not merely reflecting India.

⁵ The regions are sub-Saharan Africa, South Asia, the Middle East & North Africa, Central Asia, East & Southeast Asia, and Latin America.

2.2.2 A statistical coincidence?: Omitted and placebo variables

Across rich and poor places, good conditions are often found together, and problems are often found in places with other problems. Would any measure of infrastructure, governance, or welfare be as correlated with height as is sanitation?

Column 6 of table 1 adds time-varying controls. These include female literacy, which is an important predictor of child welfare, and accessibility of water supply, all also from the DHS.⁶ Development outcomes are often attributed to good “institutions;” the set of controls includes the polity and autocracy scores from the Polity IV database. With these controls added, the association between height and sanitation is essentially unchanged.⁷

Table 3 isolates each of these alternative independent variables in turn. None of these “placebo” predictors matter for child height, conditional on sanitation and GDP.⁸ In particular, conditional on sanitation and GDP, child height is not associated with other types of infrastructure (electrification, water), governance (a democratic polity, autocracy), female literacy, or nutritional measures such as food availability, the breastfeeding rate,⁹ or the fraction of infants who are fed “other liquids” beyond breastmilk in the last 24 hours.

⁶ Despite the frequency of undifferentiated references to “water and sanitation,” improving water supply and reducing open defecation have very different effects on child health and other outcomes and should not be conflated (Black and Fawcett, 2008).
⁷ If controls for the fraction of infants ever breastfed, the fraction of infants breastfed within the first day, and the fraction of infants fed “other liquids” in the past 24 hours are further added as measures of the quality of infant nutrition, the coefficient on open defecation in panel A with all controls becomes larger in absolute value, -1.59, with a standard error of 0.82. This is not statistically significantly larger than the other estimates.

⁸ Some of them do predict child height in a less thoroughly controlled specification. For example, in the first column, if GDP is removed the coefficient on female literacy almost doubles and becomes statistically significant at the 0.1 level.

⁹ Breastfeeding is an especially important variable because India has high levels of open defecation, short children, and poor breastfeeding. In a regression with open defecation, country fixed effects, log of GDP, the fraction of children ever breastfed, and the fraction of children breastfed on the first day, neither breastfeeding variable is statistically significant (t of 0.23 and 0.6 respectively), but open defecation has a similar coefficient of -0.849, statistically significant at the two-sided 0.1 level (n = 139). If open defecators per square kilometer is used in place of the open defecation fraction, again it is statistically significant (t = −6) and the breastfeeding variables are not.

2.2.3 Mechanism: Interaction with population density

If open defecation is, indeed, stunting children’s growth by causing chronic enteric infection, then height outcomes should be consistent with this mechanism. In particular, children who are more likely to be exposed to other people’s fecal pathogens due to higher population density should suffer from larger effects of open defecation. For example, Ali et al.
(2002) show that higher population density is associated with greater cholera risk in a rural area of Bangladesh, and Spears (2012a) finds a greater effect of India’s Total Sanitation Campaign in districts with higher population density.

To test this conjecture, I construct a crude measure of “open defecators per square kilometer”: the product of population density per square kilometer times the fraction of people reporting open defecation. Figure 4 reveals that this measure of exposure to fecal pathogens (in logs, due to wide variation in population density) visibly predicts average child height. The regression in panel (a) of the figure explains 65 percent of variation in child height. Notably, India occupies the bottom-right corner of the graph, with high rates of open defecation and very high population density.

Does population density add predictive power beyond open defecation alone? The final column of table 1 adds an interaction between open defecation and population density. The interaction term is statistically significant, and the interaction and population density are jointly significant, with an $F_{2,63}$ statistic of 4.5 (p = 0.0149) in column (a) and an $F_{2,57}$ statistic of 4.5 (p = 0.0155) in column (b).

A further implication of this mechanism is that open defecation will have a steeper association with child height in urban places than in rural places (Bateman and Smith, 1991; Bateman et al., 1993). Table 4 investigates this using two additional collapsed datasets, one containing only the urban observations in each DHS survey and one containing only the rural observations. Although GDP per capita is not available for urban and rural parts of countries, urban and rural women’s height controls can similarly be computed from the DHS. In all cases, the urban coefficient on open defecation is greater than the whole-country coefficient and the rural coefficient is smaller.
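The “open defecators per square kilometer” measure defined in this subsection is a simple product taken in logs. A minimal sketch follows; the function name, the use of the natural log, and the rejection of a zero open defecation fraction (whose log is undefined) are assumptions of this illustration, and the inputs are hypothetical rather than drawn from the paper's data.

```python
import math


def log_open_defecators_per_km2(population, area_km2, open_defecation_fraction):
    """Log of (population density) x (fraction defecating openly).

    Assumes the natural log; a fraction of exactly zero is rejected because
    its log is undefined.
    """
    if population <= 0 or area_km2 <= 0:
        raise ValueError("population and area must be positive")
    if not 0.0 < open_defecation_fraction <= 1.0:
        raise ValueError("open defecation fraction must be in (0, 1]")
    density = population / area_km2
    return math.log(density * open_defecation_fraction)


# Two hypothetical places with the same open defecation rate but different
# population density (NOT the paper's data).
sparse = log_open_defecators_per_km2(5_000_000, 500_000, 0.5)    # 10 people/km2
dense = log_open_defecators_per_km2(200_000_000, 500_000, 0.5)   # 400 people/km2
print("sparse:", sparse, "dense:", dense)
```

Holding the open defecation rate fixed, the denser place has the larger value, which is the sense in which the measure proxies for exposure to other people's fecal pathogens.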
Hausman tests (reported under the open defecation coefficients in columns 5 through 7) verify that urban coefficients are larger than rural coefficients from the corresponding specifications.

2.2.4 Mechanism: A gradient that steepens with age

Height-for-age z-scores are computed by age-in-months so that, in principle, the heights of children of different ages can be pooled and compared. If international reference charts were genetically or otherwise inappropriate for some countries, we might expect a consistent gap across children of different ages, analogous to a country fixed effect. However, stunting in India and elsewhere develops over time: children’s mean z-scores fall relative to the norm until about 24 months of age, where they flatten out. This is consistent with early-life health deprivation causing a steepening “gradient” between health and economic status, more steeply negatively sloped as children age (Case et al., 2002). If the association between height and sanitation were indeed the unfolding result of accumulating exposure to fecal pathogens, then it is plausible that the association would become steeper over the first two years of life, at a rate that flattens out.

Figure 5 plots the coefficients from estimating the basic equation 1 separately for collapsed means of children in four age groups: 0-5 months, 6-11 months, 12-23 months, and 24-35 months. Thus, as in the rest of this section of the paper, each coefficient is computed in a regression with 140 country means, but now these height means only include children in subsets of the age range. The independent variable – country-wide open defecation – is the same in each regression.

Two conclusions are visible in the figure. First, the gradient indeed steepens with age, at a rate that flattens. Second, the mean height of Indian children in the 2005 NFHS-3 is plotted for reference. The curve has a similar shape to the age pattern of the coefficients.
This suggests that a fixed exposure to open defecation could be scaled into a similar shape as the Indian height deficit by an increasing association between sanitation and height.

2.3 The gap between South Asia and sub-Saharan Africa

Although people in South Asia are, on average, richer than people in sub-Saharan Africa, children in South Asia are shorter, on average, and open defecation is much more common there. How much of the South Asia-Africa gap can sanitation statistically explain, at the level of country averages?

Table 5 estimates regressions in the form of equation 1, with the sample restricted to countries in South Asia and sub-Saharan Africa and with an indicator variable added for data from South Asia. Of the 140 DHS surveys in figure 1, 11 are from South Asia and 78 are from sub-Saharan Africa. In these data, children in South Asia are, on average, about one-third of a height-for-age standard deviation shorter.¹⁰

How do further controls change the estimate of this South Asia indicator? Merely linearly controlling for open defecation reduces the gap by 30 percent, from 0.360 to 0.253. Controlling, instead, for the number of people openly defecating per square kilometer (the product of population density and the open defecation rate, column 4) reduces the coefficient by 83 percent, to 0.061. Column 5 verifies that this result is not merely a misleading effect of population density, controlling for which increases the gap.

Pairs of columns 6-7 and 8-9 demonstrate the statistical robustness of the explanatory power of the density of open defecation. After controlling for the log of GDP per capita, adding a further control for open defecators per square kilometer explains 73 percent of the (larger) remaining gap.
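The percentage reductions quoted above for the South Asia indicator follow directly from the coefficients reported in the text. The short Python sketch below reproduces that arithmetic; the function name is an illustrative assumption, and the three coefficients are the ones quoted in this section (0.360, 0.253, and 0.061).

```python
def share_of_gap_explained(baseline_gap: float, controlled_gap: float) -> float:
    """Fraction of the baseline gap that disappears once a control is added."""
    return (baseline_gap - controlled_gap) / baseline_gap


baseline = 0.360              # South Asia indicator, no sanitation control
with_open_defecation = 0.253  # after a linear control for open defecation
with_density_of_od = 0.061    # after a control for open defecators per km^2

print(round(100 * share_of_gap_explained(baseline, with_open_defecation)))  # 30
print(round(100 * share_of_gap_explained(baseline, with_density_of_od)))    # 83
```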
The density of open defecation reduces the height gap by 92 percent after controlling for both log of GDP and year fixed effects. Sanitation initially appears to explain much of the Africa-South Asia gap in child height. Section 4.4 considers the decomposition of this difference in more detail, using child-level height data.

¹⁰ Jayachandran and Pande (2012), using individual-level DHS data from Africa and South Asia, suggest that first-born South Asian children are taller than first-born African children. The country-level data studied here, however, show no similar reversal. If country means are computed using only first-born children, I find that South Asian children are 0.22 standard deviations shorter (s.e. = 0.05), a reduction but not an elimination of the 0.36 gap in table 5. In this sample of country means of first-borns, the gap falls to 0.15 with a control for open defecation and to 0.08 with a control for open defecation per square kilometer. In the full sample of country-level means of first-borns, analogously to column 1 of table 1, moving from an open defecation rate of 0 to 1 is linearly associated with a decline in height for age of 1.11 standard deviations.

3 Evidence from differences within Indian districts

How much of the change over time in Indian children’s height is accounted for by the increase over time in sanitation coverage? One challenge to answering this question well is that, unfortunately, improvements in sanitation in India have been slow. As an illustration, in its 2005-6 DHS, 55.3 percent of Indian households reported open defecation, and the mean child was 1.9 standard deviations below the reference mean; this combination is almost identical to neighboring Pakistan’s in its 1990-1 DHS 15 years earlier, when 53.1 percent of households did not use a toilet or latrine and the mean height for age was 2 standard deviations below the mean.
This section studies change over time within India by constructing a panel of districts out of India’s 1992-3 and 1998-9 DHS surveys.

3.1 Data and empirical strategy

The National Family Health Surveys (NFHS) are India’s implementation of DHS surveys. This section analyzes a district-level panel constructed out of the NFHS-1 and NFHS-2.¹¹ Districts are political subdivisions of states. Some districts merged or split between survey rounds, so households in the survey are matched to a constructed “district” that may be a coarser partition than actual district boundaries. In particular, a primary sampling unit (PSU) is assigned to a constructed district such that all splits and merges are assigned to the coarser partition, creating the finest partition such that each PSU is in the same constructed district as all PSUs which would have shared a district with it in either period. Thus if there were two districts A and B in the first round, which split before the second round into A′, B′, and C (a new district containing part of A and of B), then all of A, B, A′, B′, and C would be a single constructed district, although splits this complicated are rare.¹²

¹¹ The third and most recent NFHS does not include district identifiers.

The empirical strategy of this section is to compare the heights of rural children under 3 years old in the NFHS rounds 1 and 2, using district fixed effects.¹³ In particular, I regress child height on the fraction of households reporting open defecation at two levels of aggregation: districts and villages (or more precisely, rural primary sampling units). Because open defecation has negative externalities on other households, it is necessary to test for effects of community-wide sanitation coverage, rather than simply comparing households that do and do not have latrines; section 4.1 considers the econometric implications of these negative externalities in more detail.
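The constructed-district rule described above (the finest partition in which districts linked by any split or merge end up together) is equivalent to finding connected components. The following Python sketch illustrates the idea with a small union-find; it is not the author's code, and the function name and the input format (pairs of round-1 and round-2 district names that share territory) are assumptions made for illustration.

```python
def constructed_districts(links):
    """Group districts into the finest partition closed under splits and merges.

    `links` is an iterable of (round1_district, round2_district) pairs, one
    pair for each overlap: territory that belonged to the first district in
    round 1 and to the second district in round 2 (hypothetical input format).
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Tag names with their survey round so that "A" in round 1 and a
    # same-named district in round 2 are treated as distinct nodes.
    for old, new in links:
        union(("round1", old), ("round2", new))

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return list(groups.values())


# The example from the text: A and B split into A', B', and C, where C
# contains part of A and part of B. District D is unchanged between rounds.
example = [("A", "A'"), ("A", "C"), ("B", "B'"), ("B", "C"), ("D", "D")]
for group in constructed_districts(example):
    print(sorted(group))
```

In this example A, B, A′, B′, and C fall into one constructed district because C links A to B, while D forms its own group.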
Therefore, the regression specification is:

$$\text{height}_{idvt} = \beta_1\,\text{open defecation}^{d}_{dt} + \beta_2\,\text{open defecation}^{v}_{dvt} + \alpha_d + \gamma_t + X_{idvt}\theta + A_{idvt}\vartheta + \varepsilon_{idvt}, \tag{2}$$

where $i$ indexes individual children, $d$ are districts, $v$ are villages (rural PSUs), and $t$ are survey rounds 1 and 2. The dependent variable, $\text{height}_{idvt}$, is the height of child $i$ in standard deviations, scaled according to the WHO 2006 reference chart. As recommended by the WHO, outliers with z-scores less than -6 or greater than 6 are excluded. The independent variables $\text{open defecation}^{d}_{dt}$ and $\text{open defecation}^{v}_{dvt}$ are computed fractions, from 0 to 1, of households reporting open defecation in the child’s district and village, respectively. Fixed effects $\alpha_d$ and $\gamma_t$ are included for districts and survey rounds. The vector $A_{idvt}$ is a set of 72 indicators for age-in-month by sex, one for each month of age for boys and for girls.¹⁴ Controls $X_{idvt}$ are at the household or child level: electrification, water supply, household size, indicators for being Hindu or Muslim, a full set of birth order indicators interacted with the relationship of the child’s mother to the head of the household, twinship indicators, and month-of-birth indicators.¹⁵ Results are presented with and without controls and fixed effects to verify robustness. The mean PSU studied here contains 10 children under 3 used in these regressions.

¹² The NFHS was not constructed to reach all districts, so households are only included in the sample if they are members of districts that appear in both survey rounds, to permit district fixed effects.

¹³ Although district fixed effects are used, the NFHS did not survey the same villages in the two survey rounds; thus there remains an important cross-sectional component to the heterogeneity studied.

¹⁴ Panagariya (2012) has recently argued that height-for-age z-score reference charts are inappropriate for Indian children; because age-in-months-by-sex is the level of disaggregation used to create height-for-age scores, these controls fully and flexibly account for any deviation between the mean height of Indian children and the reference charts.

3.2 Regression results

Table 6 presents results from estimations of equation 2, with simple OLS in Panel A and district fixed effects in Panel B. Districts which saw greater differences in sanitation also present greater differences in child height.

District-level open defecation rates do not statistically significantly predict child height once village-level open defecation is included, which plausibly suggests that villages come much nearer than districts (which are much larger) to capturing the geographic extent of sanitation externalities. Village-level open defecation predicts child height with or without district fixed effects and with or without individual controls.¹⁶ The coefficient on village open defecation is smallest in absolute value with fixed effects and controls,¹⁷ although it is not statistically significantly different from the other estimates; this could reflect the well-known attenuating bias of fixed effects, if much of the important variation in sanitation that is causing variation in height has been captured by other controls, leaving noise remaining.¹⁸

¹⁵ If the survey rounds were conducted in different places in different times of year, different children would be under 36 months old. Month of birth is correlated with early-life human capital inputs (cf. Doblhammer and Vaupel, 2001, about developed countries).

¹⁶ If, instead of omitting observations with height-for-age z-score beyond ±6, a cut-off of ±10 is used, then results are very similar.
For example, the coefficient in column 1 of panel A becomes -0.768 (0.222); the smallest coefficient in absolute value, column 3 in Panel B, becomes -0.292 (0.138). If the log of height in centimeters is used as the dependent variable instead of the z-score, moving from 0 percent to 100 percent open defecation is associated with an approximately 2 percent decrease in height (t ≈ 4, analogously to column 2 of panel B).

¹⁷ Would any village-level (instead of household-level) asset or indicator of well-being have the same effect as sanitation? Adding village electrification and water averages to the most controlled regression, column 3 of panel B, changes the point estimate on open defecation only slightly, from -0.35 to -0.33 (s.e. = 0.12); these two village-level variables have t-statistics of 1.15 and -0.59, respectively, with a joint F-statistic of 0.73.

¹⁸ For readers concerned about this possibility, regressing height on village-level open defecation with no district or time fixed effects produces an estimate of -0.700 (t ≈ 4.5) and of -0.0501 (t ≈ 4.8) with all the non-fixed-effect controls.

Consistency of fixed effects estimates, which subtract level differences across groups, depends on a properly linear specification. Column 4 demonstrates that a quadratic term for village-level open defecation is not statistically significant, and indeed changes signs with and without district fixed effects. Potential non-linear relationships between village sanitation coverage and child height will be considered in more detail in section 4.1.2.

4 Evidence from pooled Indian and African surveys

Do differences in village-level sanitation coverage explain the difference in height between rural children in India and in sub-Saharan Africa? If so, is this just a spurious reflection of other correlated variables?
This section addresses these questions using pooled child-level data from the rural parts of nine DHS surveys: India’s 2005-6 NFHS-3 and eight surveys from Africa in the 2000s.¹⁹ In particular, the DHS surveys nearest 2005 (and balanced before and after) were selected from the five largest African countries available;²⁰ the included countries account for 46 percent of the 2012 population of sub-Saharan Africa.

The argument of this section proceeds in several stages, building to a statistical decomposition of the India-Africa height difference, in the sense of Oaxaca-Blinder. First, section 4.1 verifies an association between village-level sanitation and height within the two regions. In particular, this section assesses the linearity of the relationship (assumed by some decomposition techniques), and notes that a village-level effect implies the presence of negative externalities. Then, section 4.2 considers a paradox implied by Deaton’s (2007) finding that height is not strongly associated with GDP: the within-region association between open defecation and well-being has a different slope from the across-region association. Next, section 4.3 considers an alternative account of stunting in India: could it be a mere effect of short mothers causing restricted fetal growth, reflecting historical conditions and not a healthy environment that might exist today? Finally, section 4.4 proceeds with the decomposition, applying linear and non-parametric approaches to explain the India-Africa gap.

¹⁹ Here I again follow the WHO recommendation of dropping observations with height-for-age z-scores more than 6 standard deviations from the mean.

²⁰ This excludes South Africa, where height has not been measured in a DHS survey. Beyond this data availability constraint, this exclusion may be appropriate due to South Africa’s unique history and demography; its exceptionally high sanitation coverage (11.6 percent open defecation in 1998) would make it a positive outlier even in the African sample. The eight African DHS surveys used are: DRC 2007, Ethiopia 2000 and 2005, Kenya 2005 and 2008, Nigeria 2003 and 2008, and Tanzania 2004. DHS sampling weights are used throughout.

4.1 Effects of village sanitation: A negative externality

As a first step towards explaining the height difference between Indian and African children, this section verifies that village-level open defecation predicts children’s height within each region. As in section 3, I use household-level DHS data to find the fraction of households in a PSU reporting open defecation, which I treat as an estimate of village-level open defecation. Thus, separately for each region $r$, Africa and India, I estimate:

$$\text{height}_{ivcr} = \beta_1\,\text{open defecation}^{v}_{vcr} + \beta_2\left(\text{open defecation}^{v}_{vcr}\right)^2 + \beta_3\,\text{open defecation}^{i}_{ivcr} + \alpha_c + X_{ivcr}\theta + A_{ivcr}\vartheta + \varepsilon_{ivcr}, \tag{3}$$

where $i$ indexes individual children under 5, $v$ are villages (rural PSUs), $c$ are country-years (DHS surveys) in Africa and states in India, and $r$ are regions (India or Africa). The dependent variable, $\text{height}_{ivcr}$, is the height of child $i$ in standard deviations, scaled according to the WHO 2006 reference chart. The independent variable $\text{open defecation}^{v}_{vcr}$ is the computed fraction, from 0 to 1, of households reporting open defecation in the child’s village (again, implemented as rural primary sampling unit), with a quadratic term included in some specifications. Household-level open defecation, $\text{open defecation}^{i}_{ivcr}$, is an indicator, 0 or 1, for the child’s household. Including both household and village-level open defecation tests whether one household’s open defecation involves negative externalities for other households.²¹ In other words, is it only a household’s own sanitation that matters, or does other households’ sanitation matter, even controlling for one’s own?
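As an illustration of how the village-level regressor in equation 3 can be built from household records, the sketch below computes the PSU-level share of households reporting open defecation and attaches it, its square, and the household's own indicator to each household. This is a minimal sketch, not the author's code: the function names and the tuple-based input are hypothetical, and the share is left unweighted, which is an assumption.

```python
from collections import defaultdict


def village_open_defecation(households):
    """Map each PSU id to the share of its surveyed households that defecate openly.

    `households` is a list of (psu_id, open_defecation) pairs, where
    open_defecation is 1 if the household reports open defecation, else 0.
    The share is unweighted (an assumption of this sketch).
    """
    totals = defaultdict(lambda: [0, 0])  # psu_id -> [open defecators, households]
    for psu_id, od in households:
        totals[psu_id][0] += od
        totals[psu_id][1] += 1
    return {psu: n_od / n for psu, (n_od, n) in totals.items()}


def regressors(households):
    """Return (village share, village share squared, own indicator) per household."""
    shares = village_open_defecation(households)
    return [(shares[psu], shares[psu] ** 2, od) for psu, od in households]


# Made-up records: four households in village v1, two in village v2.
records = [("v1", 1), ("v1", 1), ("v1", 0), ("v1", 1), ("v2", 0), ("v2", 0)]
print(village_open_defecation(records))  # {'v1': 0.75, 'v2': 0.0}
print(regressors(records)[0])            # (0.75, 0.5625, 1)
```

Keeping the household's own indicator alongside the village share is what lets the regression separate a household's private sanitation from the sanitation of its neighbors.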
²¹ Günther and Fink’s (2010) working paper version of Fink et al. (2011) conducts a similar analysis, regressing diarrhea and child mortality on household and cluster-mean water and sanitation variables.

Fixed effects $\alpha_c$ are included for some specifications. As before, the vector $A_{ivcr}$ is a set of 72 indicators for age-in-month by sex, one for each month of age for boys and for girls. Finally, $X_{ivcr}$ is a vector of child or household level controls: indicators for household dirt floor; access to piped water; electrification; and ownership of a TV, bicycle, motorcycle, and clean cooking fuel; and the child’s mother’s literacy, knowledge of oral rehydration, age at first birth, count of children ever born, and relationship to the head of the household. These controls help ensure that any correlation between height and open defecation is unlikely to reflect mere wealth differences.

4.1.1 Regression results

Figure 6 plots, separately for the African and Indian samples, the local polynomial regressions of child height on village open defecation, separating households that do and do not defecate openly. The figures make clear the distinct private and social benefits of sanitation. The private benefit is the vertical distance between the two lines; thus, in an average Indian village, children in households that do not openly defecate are about half of a standard deviation taller than children in households that do. The social benefit – a negative externality on other households – is visible in the downward slope of the regression lines: children living in villages with less open defecation overall are taller, on average. Of course, some fraction of these correlations also reflects omitted variable bias.

The dashed vertical lines show that open defecation is much more common in the Indian than in the African data. Moreover, children in households that do not practice open defecation are shorter in Africa than in India at all levels of village open defecation.
Table 7 verifies the statistical significance of these results and estimates regression equation 3. In both samples, there is a clear association between child height and village-level sanitation. Especially in the Indian sample, the estimate changes little when controls are added. The coefficient on household-level sanitation is less robust: in the Indian sample it becomes much smaller when household and child controls are added, and in the African sample it loses statistical significance.

4.1.2 Linear effects on height?

So far, this paper has largely studied linear regression. A non-linear relationship between sanitation and height could be important for two reasons: first, fixed effects regression could be inconsistent; and second, a linear Blinder-Oaxaca decomposition could be inappropriate. Returning to figure 6, the relationship between sanitation and height appears approximately linear in the Indian data, but may not be among openly-defecating households in the African data.

Non-linearity can be tested by adding a quadratic term. Already, in table 6, a quadratic term was not statistically significant in the first two Indian DHS surveys. Panel A finds again that, in the Indian NFHS-3, there is no evidence for a quadratic term. In contrast, Panel B does find a quadratic term in the African sample. In light of the evidence for the importance of open defecation per square kilometer presented in section 2.2.3, one possible explanation for this negative quadratic term is that population density is relatively low in these African countries, so open defecation is not as important for health until there is more of it; unfortunately, geographic data such as population density are not generally available at the DHS PSU level.

Many sanitation policy-makers claim that there is a discontinuous increase in health when open defecation is fully eliminated.
This belief was importantly popularized by Sanan and Moulik (2007), who assert that “public health outcomes can be achieved only when the entire community adopts improved sanitation behavior, the area is 100 percent open defecation free, and excreta safely and hygienically confined” (4).²² Whether the relationship between sanitation and health is linear or another shape, it is clear from figure 6 that there is no discontinuity at zero open defecation. Moreover, the statistically significant quadratic terms in Panel B of table 6 are negative, which suggests that, if anything, effects of village-level sanitation on height are smaller near perfect coverage in that sample.

²² This conclusion was based on a study by an organization called Knowledge Links in three villages of Himachal Pradesh, which found a “prevalence of diarrhea” of 26 percent in a village where 95 percent of households used toilets, and of 7 percent in a village where 100 percent used toilets. Also see Shuval et al. (1981).

4.2 A paradox: International differences in well-being

Deaton (2007) finds that international differences in height are not well explained by differences in GDP or infant mortality. How could this be, given that poor sanitation increases infant mortality (Spears, 2012a), and richer people are more likely to have toilets or latrines?

Figure 7 suggests that this puzzle is an example of Simpson’s Paradox: within separate subsets of a larger sample, the relationship between two variables can be very different from the relationship between the two variables in the larger, pooled sample.²³ In particular, the relationship in the pooled data also depends upon the relationship among group means.

Consider a large dataset partitioned into subsets indexed $s \in S$. Let $\hat\beta$ be the OLS coefficient of $y$ on $x$ in the whole, pooled dataset, and $\hat\beta_s$ the OLS coefficient found when the data are restricted to subset $s$. Further, let $\hat\beta_b$ be the “between” regression coefficient found by regressing subsample means $\bar y_s$ on $\bar x_s$, weighted by the number of observations in each subsample. Then

$$\hat\beta = \sum_{s \in S} \lambda_s \hat\beta_s + \lambda_b \hat\beta_b, \tag{4}$$

where the weights $\lambda_s$ are the fractions of the total sum of squares in each subsample $s$ and $\lambda_b$ is the fraction of the sum of squares from the subsample means. Therefore, if the between coefficient is very different from the within coefficients, the pooled coefficient computed from the entire dataset could also be quite different from the within-subsample slopes.

Figure 7 plots within-region, between, and pooled slopes to clarify this paradoxical case. Within both the Indian and African subsamples, more village-level open defecation is, indeed, associated with more infant mortality and less wealth. However, India has more open defecation, lower infant mortality, and more wealth, represented by the plotted circles. Therefore, the pooled regressions are essentially flat, potentially misleadingly showing no association between open defecation and infant mortality or a count of household assets included in the DHS – consistent with Deaton’s original result.

²³ The difficulties involved in inferring relationships about individuals from group average data are also sometimes referred to as the problem of “ecological inference.”

4.3 Do short mothers cause short children?

One candidate explanation for widespread stunting among Indian children that Deaton and Drèze (2009) highlight is that Indian mothers are small. What if there were a direct effect – independent of genetics and of the rest of the environment – of a too-small mother, causing restricted intrauterine growth, leading to lasting stunting? If so, then in principle Indian children could be short, on average, despite healthy modern environments and adequate nutrition, merely because their mothers were stunted by deprivations of the past.
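Returning briefly to equation 4 in section 4.2, the within/between identity can be checked numerically. The Python sketch below does so on made-up data for two hypothetical groups; it takes the sums of squares in the weights to be sums of squares of $x$, which is what makes the identity hold exactly. The function names and the numbers are illustrative assumptions, not data from the paper.

```python
def mean(values):
    return sum(values) / len(values)


def cross(xs, ys, mx, my):
    """Sum of cross-products of deviations from the given means."""
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))


def decompose(groups):
    """Return pooled slope, within slopes, within weights, between slope, between weight.

    `groups` maps a group name to a pair (xs, ys) of equal-length lists.
    """
    all_x = [x for xs, _ in groups.values() for x in xs]
    all_y = [y for _, ys in groups.values() for y in ys]
    mx, my = mean(all_x), mean(all_y)
    total_ss = cross(all_x, all_x, mx, mx)
    pooled = cross(all_x, all_y, mx, my) / total_ss

    within, lam = {}, {}
    between_xx = between_xy = 0.0
    for name, (xs, ys) in groups.items():
        gx, gy = mean(xs), mean(ys)
        ss = cross(xs, xs, gx, gx)
        within[name] = cross(xs, ys, gx, gy) / ss   # slope inside the group
        lam[name] = ss / total_ss                   # lambda_s
        between_xx += len(xs) * (gx - mx) ** 2
        between_xy += len(xs) * (gx - mx) * (gy - my)
    between = between_xy / between_xx               # slope across group means
    lam_b = between_xx / total_ss                   # lambda_b
    return pooled, within, lam, between, lam_b


# Hypothetical data: y rises with x inside each group, but the group with
# higher x has lower y on average.
data = {
    "group_1": ([0.1, 0.2, 0.3], [5.0, 6.0, 7.0]),
    "group_2": ([0.6, 0.7, 0.8], [3.0, 4.0, 5.0]),
}
pooled, within, lam, between, lam_b = decompose(data)
recombined = sum(lam[k] * within[k] for k in data) + lam_b * between
print(pooled, recombined)  # the two numbers agree up to rounding error
print(within, between)     # positive within slopes, negative between slope
```

In this toy case both within-group slopes are positive while the pooled slope is negative, because most of the variation in $x$ lies between the groups and the between slope dominates the weighted sum.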
Both the possible existence of this mechanism and any plausible magnitude are debated in the medical and epidemiological literature. Discussion dates to Ounsted et al.’s (1986) classic paper, which matched data on the birth weight of 1,092 children born in Oxford, UK with self-reported birth weights of family members to show that very low or very high birth weight babies are likely to have parents who were themselves of low or high birth weight, respectively, with a greater correlation with mothers than fathers. From this they hypothesize that mothers’ own in utero deprivations can restrict fetal growth of the children they have as adults.

Although some later scholarship has criticized the conclusion that the constraint dates back to a mother’s fetal development,²⁴ it is much more widely accepted that a mother’s size and net nutrition before and during pregnancy influence her child’s birth weight and subsequent growth (Martorell and Zongrone, 2012; Chiolero, 2010). For example, Ceesay et al. (1997) find that daily dietary supplementation for pregnant women in rural Gambia increased birth weight, and Kanade et al. (2008) document in observational data that Indian mothers who ate more fat and micronutrients during pregnancy had larger babies.

²⁴ In 2008 the paper was republished for commentary in the International Journal of Epidemiology; discussion noted that clean causal identification is rare in this debate. Leon (2008) summarizes that “[Ounsted et al.’s (1986)] ideas about the prepotency of maternal constraint are of less value and the hypothesis that in humans the degree of constraint a mother exerts on her offspring’s fetal growth is set by her own in utero experience is not supported by [their] own data or that published subsequently” (258); Magnus (2008) and Cnattingius (2008) essentially agree, although also see Hanson and Godfrey (2008) and Horta et al. (2009).
Is it possible that historical conditions that restricted Indian mothers’ size, but have now improved, are importantly restricting the fetal growth of their children? This question is difficult to answer in part because there are at least five reasons mothers’ and children’s height would be correlated: (1) mother’s genetics, (2) assortative mating and father’s genetics, (3) correlation of the child’s early-life environment with the mother’s early life environment, (4) endogenous effects of mothers’ early life environments on their adult ability to care for their children (including in utero and through marriage markets), and finally (5) intrauterine growth restriction directly caused by the historically determined aspects of a mother’s size. Although all five would be reflected in a simple regression of child size on mother’s size, only the last mechanism would allow mothers’ stunting to itself cause present-day children’s stunting.

Arguing that “variability in stature among young children is often ascribed to health and nutrition differences in malnourished populations and to genetic differences in well-nourished populations,” Martorell et al. (1977) hypothesized that parent-child height correlation would be greater in the U.S. than in Guatemala. This is consistent with a model in which height $h$ is the sum of two uncorrelated factors, genetics $g$ and environment $e$, so $h = g + e$. Then, assuming that child genetics are a weighted sum of mother’s and father’s genetics with weight $\omega$, so $g_c = \omega g_m + (1 - \omega) g_f$, and that environmental shocks are uncorrelated across generations, the mother-child correlation of height is

$$\frac{\omega\sigma_g^2}{\sigma_g^2 + \sigma_e^2},$$

where $\sigma^2$ is variance. In rich countries, where there is little relevant variance in environmental conditions, height correlations will be high.

Now consider a direct effect, of magnitude $\rho > 0$, of mother’s height on child’s height, so $h_c = g_c + \rho h_m + e_c$.
To avoid unilluminating complications of infinite regress, ignore this mechanism in mothers’ generation, so $h_m = g_m + e_m$.²⁵ Now mother-child height correlation is

$$\rho + \frac{\omega\sigma_g^2}{\sigma_g^2 + \sigma_e^2},$$

which will be greater than in the case without $\rho$ because of the additional link between generations. Therefore, if conditions are such that $\rho$ is quantitatively important in a poor country – perhaps due to rapid change across generations in environmental conditions – but not in a richer country, the intergenerational height correlation could be greater in the poorer country.

²⁵ Perhaps environmental conditions were so consistently bad in past generations that intrauterine restriction on fetal growth due to large intergenerational differences in environmental conditions was not a binding constraint.

Figure 8 plots mother-child height correlations from the Indian and African DHS samples used in this section, and for the U.S. from the National Longitudinal Survey of Youth 1979. NLSY data used span several calendar years; each child’s height is used in the year he or she is five years old. The correlation computed from the NLSY, 0.41, is comparable to other correlations computed from developed country data (Livson et al., 1962). It is also consistent with the formula above, if $\omega = 0.5$ and $\sigma_e^2$ is small.

Two conclusions emerge from the figure. First, the correlation computed for Indian children – although greater than that in the African data – is much below the correlation for U.S. children, suggesting that any $\rho$, a direct causal effect of mother’s height on children’s height, may not be large. Second, the correlation between mothers’ and children’s height is lower, within both India and Africa, in villages with more open defecation, which is consistent with an important role for $\sigma_e^2$ and with Martorell et al.’s (1977) conjecture that environmental variation will be more important relative to genetics in poorer places.
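The stylized model above can be explored with a small simulation. The sketch below draws mothers' and fathers' genetic factors independently (an added assumption: no assortative mating), builds mother and child heights as in the text, and compares the simulated OLS slope of child height on mother height with $\rho + \omega\sigma_g^2 / (\sigma_g^2 + \sigma_e^2)$. Note that the sketch checks the expression as a regression slope; it coincides with the correlation only if height has the same variance in both generations, which is a further assumption here. Function names and parameter values are illustrative.

```python
import random


def predicted_slope(omega, rho, sigma_g, sigma_e):
    """The expression from the text: rho + omega * var_g / (var_g + var_e)."""
    return rho + omega * sigma_g ** 2 / (sigma_g ** 2 + sigma_e ** 2)


def simulated_slope(omega, rho, sigma_g, sigma_e, n=100_000, seed=0):
    """OLS slope of child height on mother height in simulated data."""
    rng = random.Random(seed)
    mothers, children = [], []
    for _ in range(n):
        g_m = rng.gauss(0.0, sigma_g)            # mother's genetic factor
        g_f = rng.gauss(0.0, sigma_g)            # father's, drawn independently
        h_m = g_m + rng.gauss(0.0, sigma_e)      # h_m = g_m + e_m
        g_c = omega * g_m + (1.0 - omega) * g_f  # child's genetic factor
        h_c = g_c + rho * h_m + rng.gauss(0.0, sigma_e)
        mothers.append(h_m)
        children.append(h_c)
    mean_m = sum(mothers) / n
    mean_c = sum(children) / n
    cov = sum((m - mean_m) * (c - mean_c) for m, c in zip(mothers, children))
    var = sum((m - mean_m) ** 2 for m in mothers)
    return cov / var


# Illustrative parameters only: omega = 0.5, no direct effect (rho = 0).
print(predicted_slope(0.5, 0.0, 1.0, 0.5))  # 0.4
print(simulated_slope(0.5, 0.0, 1.0, 0.5))  # close to the predicted value
```

Raising `sigma_e` in this sketch lowers the slope, while a positive `rho` raises it by the same amount at any level of environmental variance.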
4.4 Decomposing the gap

Decomposition methods estimate the fraction of a difference between two groups that is statistically explained by differences in other variables (Fortin et al., 2011). Decomposition techniques are commonly applied to wage inequality in labor economics and to demographic rates. Like any econometric analysis of observational data, whether decomposition results have a causal interpretation depends on the context and the sources of variation in independent variables. This section estimates the fraction of the India-Africa height gap statistically “explained” by differences in village-level sanitation, a main result of the paper.

4.4.1 Methods of decomposition

Three methods of decomposition are used. The first is a straightforward application of regression, as in table 5. I regress

$$\text{height}_{ivs} = \alpha\,\text{India}_s + \beta\,\text{open defecation}^{v}_{vs} + X_{ivs}\theta + \varepsilon_{is}, \tag{5}$$

where $\text{height}_{ivs}$ is the height-for-age z-score of child $i$ in village $v$ in sample $s$, either India or Africa. The coefficient of interest is $\alpha$, on an indicator variable that the child lives in India. The econometric question is by how much adding a control for village-level open defecation shifts $\hat\alpha$ in the positive direction. This is essentially identical to the pooled Blinder-Oaxaca decomposition with an indicator for group membership recommended by Jann (2008). The statistical significance of the change in $\hat\alpha$ is evaluated with a Hausman $\chi^2$ test. Various sets of control variables $X_{ivs}$ are added in turn, which will, in general, change both $\hat\alpha$ and $\hat\beta$.

The second method is a weighted two-way Blinder (1973)-Oaxaca (1973) decomposition, using a Stata implementation by Jann (2008).
In particular, having seen in section 4.2 that open defecation has different correlations within and across the Indian and African samples, I implement Reimers’s (1983) recommendation to first estimate

$$\text{height}_{ivs} = \beta_s\,\text{open defecation}_{vs} + X_{ivs}\theta_s + \varepsilon_{is}, \tag{6}$$

separately for each sample $s$, and then compute the difference in height “explained” by open defecation as

$$\left(0.5\,\hat\beta_{\text{India}} + 0.5\,\hat\beta_{\text{Africa}}\right)\left(\text{open defecation}_{v,\text{Africa}} - \text{open defecation}_{v,\text{India}}\right), \tag{7}$$

creating a counterfactual “effect” of sanitation by weighting equally the within-sample slopes.

The third method is a non-linear decomposition, which computes a new mean for the Indian sample after reweighting to match the African sample’s distribution of a set of observable independent variables.²⁶ In particular, the approach is to construct a counterfactual mean height of Indian children. First, partition both samples into groups $g \in G(X)$, which share values or ranges of values of a set of covariates $X$ (which could include measures of open defecation). Next, for each group, compute $f(g \mid s)$, the empirical density of sample $s \in \{\text{India}, \text{Africa}\}$ in group $g$, using sampling weights. Finally, compute the counterfactual mean

$$\tilde h^{\text{India}} = \sum_{g \in G(X)} \frac{f(g \mid \text{Africa})}{f(g \mid \text{India})} \sum_{i \in g} w_i h_i, \tag{8}$$

where $w_i$ is the sampling weight of observation $i$ and $h_i$ is the height-for-age z-score of child $i$ in the Indian sample. The unexplained gap is then $\tilde h^{\text{India}} - \bar h^{\text{Africa}}$. The basic set of reweighting variables used is village and household open defecation, split into 20 groups corresponding to 10 levels of village open defecation for households that do and do not openly defecate.

²⁶ Geruso (2012) recently applied this approach to compute the fraction of the U.S. black-white life expectancy gap that can be explained by a group of socioeconomic variables.

4.4.2 Decomposition results

Table 8 presents the decomposition results.
Panel A reports the change in the OLS coefficient on a dummy variable for India when a linear control for village-level open defecation is included, as in table 5. Panel B reports the change in the unexplained difference when open defecation is added to a weighted Blinder-Oaxaca decomposition. Panel C presents counterfactual differences from non-parametrically reweighting the Indian sample to match the distribution of village and household open defecation in the African sample.

Columns 1 and 2 present the basic result: the simple difference in sample means with no controls, without and with adjustment for sanitation. Village-level open defecation linearly explains 99 percent of the India-Africa gap.²⁷ In the Blinder-Oaxaca and non-parametric decompositions, sanitation explains more than 100 percent of the gap; this “overshooting” is plausible because, in addition to having worse sanitation, Indian households are richer, on average, than African households.

The next four pairs of columns similarly find that open defecation explains much of the India-Africa gap after controls are added. Specifically, columns 3 and 4 control for demographic variables before decomposing the remaining gap: sex-specific birth order indicators and an indicator for single or multiple birth.²⁸ Columns 5 and 6 first control for a vector of socioeconomic controls, the same controls used earlier in equation 3. Columns 7 and 8 control for a village-level estimate of infant mortality, computed from mothers’ reported birth history. Finally, columns 9 and 10 control for mothers’ height. Because Indian mothers, like Indian children, are short, Indian children are taller than African children, conditional on their mothers’ height. However, section 4.3 has discussed evidence that this correlation is unlikely to reflect a quantitatively important causal effect, and the counterfactual increase in Indian height from matching African sanitation continues to exceed the original, simple gap to be explained, even after adjustment for mothers’ height.

²⁷ Household-level open defecation, used instead of village-level, explains 68 percent of the gap, a reminder of the importance of disease externalities.

²⁸ Jayachandran and Pande (2012) note that birth order is a predictor of child height in India (see footnote 10). For example, in the sample of rural children under 5 used here, I find that first children are 0.063 (p = 0.044) standard deviations taller than second children. However, this gap falls to 0.034 (p = 0.275) if controls for village and household open defecation are included (jointly significant with $F_2 \approx 95$), suggesting that children born into larger households may be more likely to be exposed to environmental fecal pathogens, although various forms of intra-household discrimination surely exist, as well (cf. Jeffery et al., 1989). Jayachandran and Pande (2012) also find, in their pooled DHS sample, that Indian first-borns are taller than African first-borns. In the individual-level sample studied here, Indian first-borns are 0.019 standard deviations shorter (a much smaller gap than in the full sample, but still negative); controlling for open defecation, they are 0.133 standard deviations taller, a 0.15 increase similar to those when sanitation is controlled for in table 8. If, for example, IMR is controlled for, Indian first-borns are 0.132 standard deviations shorter than African first-borns, which increases by 0.16 to Indians being 0.025 taller.

The level of the India-Africa height gap depends on the particular set of controls added before sanitation is accounted for. However, the counterfactual change in height upon accounting for open defecation is strikingly similar across specifications. In particular, the flexible non-parametric decomposition in Panel C might best accommodate any shape of the height-sanitation association.
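The reweighting behind Panel C (equation 8) can be sketched in a few lines of Python. This is a minimal illustration on invented data, not the paper's code: the function names and the two-group example are hypothetical, and the sketch assumes that sampling weights are normalized to sum to one within the Indian sample and that every Indian group also appears in the definition of the African shares (a group absent from Africa simply receives zero weight).

```python
from collections import defaultdict

def group_shares(groups, weights):
    """f(g|s): the weighted share of one sample that falls in each group g."""
    totals = defaultdict(float)
    for g, w in zip(groups, weights):
        totals[g] += w
    total_weight = sum(weights)
    return {g: t / total_weight for g, t in totals.items()}

def counterfactual_mean(groups_india, heights_india, weights_india,
                        groups_africa, weights_africa):
    """Mean Indian height after reweighting Indian children so that their
    distribution over groups matches the African distribution (equation 8).
    Assumption: Indian sampling weights are normalized to sum to one."""
    f_india = group_shares(groups_india, weights_india)
    f_africa = group_shares(groups_africa, weights_africa)
    total_weight = sum(weights_india)
    result = 0.0
    for g, h, w in zip(groups_india, heights_india, weights_india):
        ratio = f_africa.get(g, 0.0) / f_india[g]
        result += ratio * (w / total_weight) * h
    return result

# Hypothetical example with two open defecation groups instead of the paper's 20.
groups_india = ["high", "high", "high", "low"]
heights_india = [-2.0, -2.0, -2.0, -1.0]
weights_india = [1.0, 1.0, 1.0, 1.0]
groups_africa = ["high", "low", "low", "low"]
weights_africa = [1.0, 1.0, 1.0, 1.0]

actual = sum(h * w for h, w in zip(heights_india, weights_india)) / sum(weights_india)
counterfactual = counterfactual_mean(groups_india, heights_india, weights_india,
                                     groups_africa, weights_africa)
print("actual Indian mean:", actual)
print("counterfactual Indian mean:", round(counterfactual, 3))
```

In this toy example most Indian children are in the high open defecation group while most African children are in the low group, so reweighting raises the Indian mean; the paper's version uses 20 groups defined by village and household open defecation.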
In all cases, non-parametrically matching the African distribution of open defecation increases the counterfactual Indian mean height by more than the 0.142 standard deviation simple difference in means.

5 Discussion: How much does sanitation explain?

Several dimensions of variation in open defecation predict variation in child height to a quantitatively similar degree: heterogeneity in aggregated country means, changes within Indian districts, and variation across village-level averages. Moreover, Spears (2012a) and Hammer and Spears (2012) document causally well-identified estimates of effects of sanitation on height. Finally, in the sense of econometric decomposition, exceptionally widespread open defecation can explain much of exceptional Indian stunting. So, how much taller would Indian children be if they enjoyed better sanitation and less exposure to fecal pathogens?

5.1 A linear thought experiment

To answer the question above would require knowing the true average causal effect of open defecation on Indian children. However, one can envision possible answers by comparing a range of estimates of the association between height and open defecation, each reflecting its own particular context and combination of internal and external validity.

Children in rural India are, on average, 0.142 standard deviations shorter than children in the rural African sample in section 4; open defecation is 31.6 percentage points more common in India. Therefore, imagining a linear causal effect of sanitation β, the fraction of the rural India-Africa height gap that open defecation rates would explain would be whatever fraction β is of a 0.45 (= 0.142 ÷ 0.316) standard deviation increase in height resulting from moving from 100 percent to 0 percent open defecation. Children in India, where 55 percent of households openly defecated in the DHS, are about 2 standard deviations shorter than the reference mean. Whatever fraction β is of 3.6 (≈ 2 ÷ 0.55) would be the fraction of the India-U.S.
gap explained.

Table 9 collects estimates of the linear association between height and open defecation from this paper and others. As in section 4.4.2, “explaining” over 100 percent of the gap is plausible because wealth differences predict that Indian children should be taller than African children. Unsurprisingly, the instrumental variable treatment-on-the-treated estimate in Hammer and Spears’s (2012) small experimental sample has a large confidence interval. Both this estimate and Spears’s (2012a) may overstate the direct effect of latrines per se because the programs studied also promoted use of existing latrines.

Like many regression estimates of the effects of inputs on human capital, some of these may be biased upwards. Collectively, however, they suggest that the best linear approximation to the true average causal effect of village-level sanitation coverage on Indian children’s height is likely to be a large fraction of 0.45 (ignoring the additional explanatory power of population density). If so, then sanitation could explain much or all of the difference in heights between Indian and African children.

5.2 Conclusion: Stunting, “malnutrition,” and externalities

Section 4.1 presented evidence that one household’s open defecation imposes negative externalities on its neighbors. Village-level externalities are important for at least two reasons. First, negative externalities are a classic rationale in public economics for government intervention: if households do not consider the effect of their own open defecation on other people, they will be too reluctant to switch to using a latrine. Second, statistical approaches that only study private resources will be unable to fully explain heterogeneity in height. For example, Tarozzi (2008) finds that even Indian children in the richest households in the NFHS (that is, with the most assets) are still shorter than international reference norms.
Panagariya (2012) interprets this result to suggest that international norms are incorrect for Indian children, because even children with “elite or privileged” household health inputs are stunted.²⁹ Yet, this interpretation ignores externalities: many of the asset-rich households in the NFHS are exposed to a disease environment created by the open defecation of other households. Bhandari et al. (2002) study the height of Indian children living in Green Park, a single affluent neighborhood in South Delhi; these children grow to international reference heights.

Although a child’s low height-for-age is often called “malnutrition,” Waterlow (2011) has advocated instead using “the term ‘stunted,’ which is purely descriptive and does not prejudge the question of whether or not the growth deficit is really the result of malnutrition,” often narrowly interpreted as food, especially in policy debates. Early-life disease – and especially chronic disease due to fecal pathogens in the environment – appears to be another important determinant of height. If so, determining whether open defecation is an important binding constraint on Indian children’s height may be a step towards a policy response able to resolve this Asian enigma.

²⁹ Tarozzi does recognize that his approach does not capture important effects of “the epidemiological environment, with its impact on infections” (463).

References

Ali, Mohammad, Michael Emch, J.P. Donnay, Mohammad Yunus, and R.B. Sack. 2002. “The spatial epidemiology of cholera in an endemic area of Bangladesh.” Social Science and Medicine, 55: 1015–1024.

Bateman, O. Massee and Shelley Smith. 1991. “A comparison of the health effects of water supply and sanitation in urban and rural Guatemala.” WASH Field Report 352, USAID.

Bateman, O. Massee, Shelley Smith, and Philip Roark. 1993. “A comparison of the health effects of water supply and sanitation in urban and rural areas of five African countries.” WASH Field Report 398, USAID.

Bhandari, Nita, Rajiv Bahl, Sunita Taneja, Mercedes de Onis, and Maharaj K. Bhan. 2002. “Growth performance of affluent Indian children is similar to that in developed countries.” Bulletin of the World Health Organization, 80(3): 189–195.

Black, Maggie and Ben Fawcett. 2008. The Last Taboo: Opening the Door on the Global Sanitation Crisis, London: Earthscan.

Blinder, Alan S. 1973. “Wage Discrimination: Reduced Form and Structural Estimates.” Journal of Human Resources, 8: 436–455.

Case, Anne and Christina Paxson. 2008. “Stature and Status: Height, Ability, and Labor Market Outcomes.” Journal of Political Economy, 116(3): 499–532.

Case, Anne, Darren Lubotsky, and Christina Paxson. 2002. “Economic Status and Health in Childhood: The Origins of the Gradient.” American Economic Review, 92(5): 1308–1334.

Ceesay, Sana M, Andrew M Prentice, Timothy J Cole, Frances Foord, Lawrence T Weaver, Elizabeth M E Poskitt, and Roger G Whitehead. 1997. “Effects on birth weight and perinatal mortality of maternal dietary supplements in rural Gambia: 5 year randomized controlled trial.” British Medical Journal, 315: 786–790.

Checkley, William, Gillian Buckley, Robert H Gilman, Ana MO Assis, Richard L Guerrant, Saul S Morris, Kåre Mølbak, Palle Valentiner-Branth, Claudio F Lanata, Robert E Black, and The Childhood Malnutrition and Infection Network. 2008. “Multi-country analysis of the effects of diarrhoea on childhood stunting.” International Journal of Epidemiology, 37: 816–830.

Chiolero, Arnaud. 2010. “Adult maternal body size matters.” International Journal of Epidemiology, 39(6): 1681.

Cnattingius, Sven. 2008. “Commentary: On ‘Transmission through the female line of a mechanism constraining human fetal growth’ – does it exist?” International Journal of Epidemiology, 37: 252–254.

Currie, Janet. 2009. “Healthy, Wealthy, and Wise: Socioeconomic Status, Poor Health in Childhood, and Human Capital Development.” Journal of Economic Literature, 47(1): 87–122.

Deaton, Angus. 2007. “Height, health and development.” Proceedings of the National Academy of Sciences, 104(33): 13232–13237.

Deaton, Angus and Jean Drèze. 2009. “Food and Nutrition in India: Facts and Interpretations.” Economic and Political Weekly, 44(7): 42–65.

Doblhammer, Gabriele and James W. Vaupel. 2001. “Lifespan depends on month of birth.” PNAS, 98(5): 2934–2939.

Fink, Günther, Isabel Günther, and Kenneth Hill. 2011. “The effect of water and sanitation on child health: evidence from the demographic and health surveys 1986–2007.” International Journal of Epidemiology, 40(5): 1196–1204.

Fortin, Nicole, Thomas Lemieux, and Sergio Firpo. 2011. “Decomposition Methods in Economics.” Handbook of Labor Economics, 4a: 1–102.

Geruso, Michael. 2012. “Black-White Disparities in Life Expectancy: How Much Can the Standard SES Variables Explain?” Demography, 49(2): 553–574.

Günther, Isabel and Günther Fink. 2010. “Water, Sanitation and Children’s Health: Evidence from 172 DHS Surveys.” Policy Research Working Paper 5275, World Bank.

Hammer, Jeffrey and Dean Spears. 2012. “Effects of a village sanitation intervention on children’s human capital: Evidence from a randomized experiment by the Maharashtra government.” working paper, Princeton University.

Hanson, Mark A. and Keith M. Godfrey. 2008. “Commentary: Maternal constraint is a pre-eminent regulator of fetal growth.” International Journal of Epidemiology, 37: 254–255.

Horta, Bernardo L, Denise P Gigante, Clive Osmond, Fernando C Barros, and Cesar G Victora. 2009. “Intergenerational effect of weight gain in childhood on offspring birthweight.” International Journal of Epidemiology, 38: 724–732.

Humphrey, Jean H. 2009. “Child undernutrition, tropical enteropathy, toilets, and handwashing.” The Lancet, 374: 1032–35.

Jann, Ben. 2008. “A Stata implementation of the Blinder-Oaxaca decomposition.” Stata Journal, 8(4): 453–479.

Jayachandran, Seema and Rohini Pande. 2012. “The Puzzle of High Child Malnutrition in South Asia.” presentation slides, International Growth Centre.

Jeffery, Patricia, Roger Jeffery, and Andrew Lyon. 1989. Labour pains and labour power: Women and childbearing in India: Zed.

Joint Monitoring Programme for Water Supply and Sanitation. 2012. Progress on Drinking Water and Sanitation: 2012 Update: WHO and UNICEF.

Kanade, A.N., S. Rao, R.S. Kelkar, and S. Gupte. 2008. “Maternal nutrition and birth size among urban affluent and rural women in India.” Journal of the American College of Nutrition, 27(1): 137–145.

Leon, David A. 2008. “Commentary: The development of the Ounsteds’ theory of maternal constraint – a critical perspective.” International Journal of Epidemiology, 37: 255–259.

Livson, Norman, David McNeill, and Karla Thomas. 1962. “Pooled Estimates of Parent-Child Correlations in Stature from Birth to Maturity.” Science, 138(3542): 818–820.

Magnus, Per. 2008. “Commentary: A need for unconstrained thinking of foetal growth.” International Journal of Epidemiology, 37: 254–255.

Martorell, Reynaldo and Amanda Zongrone. 2012. “Intergenerational Influences on Child Growth and Undernutrition.” Paediatric and Perinatal Epidemiology, 26(Suppl. 1): 302–314.

Martorell, Reynaldo, Charles Yarbrough, Aaron Lechtig, Hernan Delgado, and Robert E. Klein. 1977. “Genetic-environmental interactions in physical growth.” Acta Pædiatr Scand, 66: 579–584.

Mondal, Dinesh, Juliana Minak, Masud Alam, Yue Liu, Jing Dai, Poonum Korpe, Lei Liu, Rashidul Haque, and William A. Petri, Jr. 2011. “Contribution of Enteric Infection, Altered Intestinal Barrier Function, and Maternal Malnutrition to Infant Malnutrition in Bangladesh.” Clinical Infectious Diseases.

Oaxaca, Ronald. 1973. “Male-Female Wage Differentials in Urban Labor Markets.” International Economic Review, 14: 693–709.

Ounsted, M., S. Scott, and C. Ounsted. 1986. “Transmission through the female line of a mechanism constraining human fetal growth.” Annals of Human Biology, 13(2): 143–151.

Panagariya, Arvind. 2012. “The Myth of Child Malnutrition in India.” conference paper 8, Columbia University Program on Indian Economic Policies.

Petri, William A., Jr, Mark Miller, Henry J. Binder, Myron M. Levine, Rebecca Dillingham, and Richard L. Guerrant. 2008. “Enteric infections, diarrhea, and their impact on function and development.” Journal of Clinical Investigation, 118(4): 1266–1290.

Ramalingaswami, Vulimiri, Urban Jonsson, and Jon Rohde. 1996. “Commentary: The Asian Enigma.” The Progress of Nations.

Reimers, Cordelia W. 1983. “Labor Market Discrimination Against Hispanic and Black Men.” Review of Economics and Statistics, 65(4): 570–579.

Sanan, Deepak and Soma Ghosh Moulik. 2007. “Community-Led Total Sanitation in Rural Areas: An Approach that Works.” Field Note, World Bank Water and Sanitation Program.

Shuval, Hillel I., Robert L. Tilden, Barbara H. Perry, and Robert N. Grosse. 1981. “Effect of investments in water supply and sanitation on health status: a threshold-saturation theory.” Bulletin of the WHO, 59(2): 243–248.

Spears, Dean. 2012a. “Effects of Rural Sanitation on Infant Mortality and Human Capital: Evidence from a Local Governance Incentive in India.” working paper, Princeton.

Spears, Dean. 2012b. “Height and Cognitive Achievement among Indian Children.” Economics and Human Biology, 10: 210–219.

Spears, Dean and Sneha Lamba. 2012. “Effects of Early-Life Exposure to Rural Sanitation on Childhood Cognitive Skills: Evidence from India’s Total Sanitation Campaign.” working paper, Princeton.

Steckel, Richard. 2009. “Heights and human welfare: Recent developments and new directions.” Explorations in Economic History, 46: 1–23.

Tarozzi, Alessandro. 2008. “Growth reference charts and the nutritional status of Indian children.” Economics & Human Biology, 6(3): 455–468.

Waterlow, J.C. 2011. “Reflections on Stunting.” in Charles Pasternak ed. Access Not Excess: Smith-Gordon, Chap. 1: 1–8.

Figure 1: Open defecation predicts child height, across DHS survey round country-years
[Panels: (a) children born in the last 3 years; (b) children born in the last 5 years. Vertical axis: average height-for-age z-score. Horizontal axis: percent of households without toilet or latrine.]
Solid OLS regression lines weight by country population; dashed lines are unweighted.

Figure 2: Wealth does not account for the association between child height and sanitation
[Panel (a) residuals after ln(GDP): vertical axis, height-for-age, residuals after ln(GDP); horizontal axis, open defecation, residuals after ln(GDP); lines: OLS, residuals after GDP; OLS, simple regression. Panel (b) country means and subsets of India (2005) by wealth: vertical axis, average height-for-age z-score; horizontal axis, exposure to open defecation (country mean or local area average); markers: Indian asset quintiles; Indian top 2.5%.]
OLS regression lines weight by country population; data from children under 3 years old. The “simple regression” line in panel (a) plots the slope of the uncontrolled regression of height on sanitation. Mean height of children in Indian wealth groups in panel (b) is plotted against the average rate of open defecation in the primary sampling units where they live.
Figure 3: Deviation from country mean sanitation explains deviation from mean height
[Panels: (a) children born in the last 3 years; (b) children born in the last 5 years. Vertical axis: height-for-age, difference from country mean. Horizontal axis: percent of households without toilet or latrine, difference from country mean.]
Solid OLS regression lines weight by country population; dashed lines are unweighted.

Figure 4: Open defecation interacts with population density to predict child height
[Panels: (a) children born in the last 3 years; (b) children born in the last 5 years. Vertical axis: average height-for-age z-score. Horizontal axis: log of people who defecate openly per square kilometer.]
Solid OLS regression lines weight by country population; dashed lines are unweighted.

Figure 5: Open defecation is more steeply associated with child height at older ages
[Vertical axis: height-for-age z-score. Horizontal axis: age in months. Series: Indian children; sanitation coefficient; sanitation coefficient, GDP control; each with a 95% CI.]
Confidence intervals are estimates of the coefficient from a regression of average country-level child height-for-age on open defecation, restricting the sample to four age categories. The curve plots the average height-for-age z-score of Indian children by age, for reference.
Figure 6: Negative externalities: Village-level open defecation predicts child height
[Panels: (a) India (NFHS-3); (b) five African countries (8 pooled DHS surveys). Vertical axis: height-for-age z-score. Horizontal axis: village-level open defecation. Series: household openly defecates; does not openly defecate; 95% CI.]
Vertical dotted lines mark the overall mean open defecation fraction in these two rural samples.

Figure 7: Simpson’s paradox: open defecation and well-being across and within regions
[Panels: (a) infant mortality, vertical axis infant mortality rate (per 1,000); (b) asset ownership, vertical axis asset count. Horizontal axis: village-level open defecation. Series: regional means; pooled: India & Africa; India; five African countries.]

Figure 8: Open defecation is associated with reduced mother-child height correlation
[Vertical axis: mother-child height correlation. Horizontal axis: village-level open defecation. Series: U.S. (NLSY-79); India local polynomial; India OLS; Africa local polynomial; Africa OLS; 95% CI.]

Table 1: Open defecation predicts child height across DHS surveys
Columns: (1) (2) (3) (4) (5) (6) (7). Entries are coefficient (standard error), in the order extracted.
[Column positions of rows with fewer than seven entries, and the marks in the fixed-effects and controls rows, are not recoverable from the extracted text.]

Panel A: Average height-for-age z-score of children born in last 3 years
open defecation: -1.239*** (0.226); -1.326*** (0.158); -1.002*** (0.156); -0.962* (0.434); -1.028† (0.583); -1.111* (0.505); -0.663*** (0.181)
open defecation × density: -1.499* (0.631)
ln(GDP): 0.202** (0.0733); 0.512*** (0.146); 0.472** (0.174); 0.757† (0.423); 0.280*** (0.0525)
women’s height: 0.0130 (0.0476); -0.0564 (0.0904); -0.0369 (0.106); 0.0425** (0.0143)
population density: 0.0418 (0.190)
indicator rows: year FEs; country FEs; region time trends; controls
n (DHS surveys): 140; 140; 140; 130; 130; 102; 130
R²: 0.542; 0.679; 0.744; 0.988; 0.990; 0.991; 0.862

Panel B: Average height-for-age z-score of children born in last 5 years
open defecation: -1.211*** (0.290); -1.443*** (0.203); -0.910*** (0.208); -1.335** (0.474); -1.175 (0.806); -1.354† (0.704); -0.689*** (0.154)
open defecation × density: -1.446* (0.549)
ln(GDP): 0.276** (0.0859); 0.270 (0.208); 0.208 (0.308); 0.178 (0.226); 0.341*** (0.0454)
women’s height: 0.0674 (0.0774); -0.0180 (0.145); 0.0352 (0.107); 0.0544*** (0.0147)
population density: -0.0181 (0.137)
indicator rows: year FEs; country FEs; region time trends; controls
n (DHS surveys): 117; 117; 117; 108; 108; 104; 108
R²: 0.369; 0.555; 0.689; 0.990; 0.991; 0.991; 0.893

Standard errors clustered by country in parentheses (65 countries in panel A, 59 in panel B). p-values: † p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001. Open defecation is a fraction 0 to 1. Population density is demeaned to preserve the interpretation of open defecation. Controls are calorie deficit, female literacy, water within 15 minutes, knowledge of oral rehydration, polity score, and autocracy score; see the text for more complete variable definitions.
Table 2: Open defecation predicts child height, omitting each world region in turn
omitted region: (1) S.S. Africa; (2) S. Asia; (3) M.E.&N.Af.; (4) C. Asia; (5) S.E. Asia; (6) L. Amer.
Entries are coefficient (standard error), in column order.

Panel A: year fixed effects, no controls
open defecation: -1.711*** (0.240); -0.988*** (0.140); -1.081*** (0.216); -1.316*** (0.165); -1.352*** (0.161); -1.186*** (0.186)
n (DHS surveys): 62; 129; 121; 135; 135; 118
R²: 0.808; 0.517; 0.620; 0.680; 0.683; 0.721

Panel B: year fixed effects, control for ln(GDP)
open defecation: -0.733*** (0.195); -0.387* (0.151); -0.866*** (0.178); -1.001*** (0.162); -1.028*** (0.157); -1.065*** (0.169)
n (DHS surveys): 62; 129; 121; 135; 135; 118
R²: 0.925; 0.703; 0.693; 0.741; 0.747; 0.736

Panel C: year fixed effects, control for average height of women
open defecation: -0.679† (0.364); -0.994*** (0.135); -0.864*** (0.202); -1.064*** (0.157); -1.096*** (0.150); -0.911*** (0.186)
n (DHS surveys): 57; 121; 112; 125; 125; 110
R²: 0.880; 0.522; 0.639; 0.698; 0.701; 0.768

Dependent variable is mean height-for-age of children under 3. Standard errors clustered by country in parentheses. p-values: † p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001. Open defecation is a fraction 0 to 1.

Table 3: Alternative “placebo” independent variables do not predict height, conditional on sanitation
Entries are coefficient (standard error), in column order.
[The marks in the ln(GDP) and year FEs rows are not recoverable from the extracted text.]

variable: (1) female literacy; (2) nearby water; (3) calorie deficit; (4) electrification
“placebo” variable: 0.00348 (0.00327); 0.00382 (0.00268); 0.0000230 (0.000964); -0.00271 (0.00226)
open defecation: -0.880*** (0.205); -1.077*** (0.158); -1.002*** (0.157); -0.918*** (0.141)
indicator rows: ln(GDP); year FEs
n (DHS surveys): 138; 124; 140; 135
R²: 0.752; 0.792; 0.744; 0.726

variable: (5) autocracy; (6) polity score; (7) breastfeeding; (8) fed other liquids
“placebo” variable: 0.0248 (0.0175); -0.0119 (0.00787); -0.054 (0.036); 0.003 (0.002)
open defecation: -0.867*** (0.207); -0.843*** (0.212); -1.123*** (0.187); -1.101*** (0.186)
indicator rows: ln(GDP); year FEs
n (DHS surveys): 136; 138; 139; 132
R²: 0.753; 0.753; 0.689; 0.702

Dependent variable is mean height-for-age of children under 3. Standard errors clustered by country in parentheses. p-values: † p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001. Open defecation is a fraction.

Table 4: Open defecation is more steeply associated with child height in urban than in rural areas
subsample: (1) total; (2) urban; (3) urban; (4) urban; (5) rural; (6) rural; (7) rural
Entries are coefficient (standard error), in the order extracted.
[Column positions of rows with fewer than seven entries, and the marks in the year FEs row, are not recoverable from the extracted text.]

Panel A: Height-for-age z-score of children under 3
open defecation: -1.239*** (0.226); -2.577*** (0.688); -1.577** (0.504); -1.603*** (0.447); -0.853*** (0.173); -0.572*** (0.157); -0.689*** (0.143)
rural = urban: χ² = 9.50, p = 0.002; χ² = 6.07, p = 0.014; χ² = 6.69, p = 0.009
women’s height: 0.0512*** (0.0130); 0.0506*** (0.00924); 0.0407** (0.0144); 0.0409*** (0.0107)
rural = urban: χ² = 2.20, p = 0.138; χ² = 1.79, p = 0.181
indicator rows: year FEs
n (DHS surveys): 140; 140; 130; 130; 140; 130; 130
R²: 0.542; 0.403; 0.518; 0.638; 0.467; 0.506; 0.624

Panel B: Height-for-age z-score of children under 5
open defecation: -1.211*** (0.290); -2.219** (0.825); -1.620** (0.586); -1.875*** (0.487); -0.755** (0.233); -0.527** (0.198); -0.743*** (0.161)
rural = urban: χ² = 5.16, p = 0.023; χ² = 7.01, p = 0.008; χ² = 10.74, p = 0.000
women’s height: 0.0620*** (0.0141); 0.0606*** (0.0138); 0.0489** (0.0172); 0.0523** (0.0152)
rural = urban: χ² = 4.90, p = 0.027; χ² = 2.02, p = 0.155
indicator rows: year FEs
n (DHS surveys): 117; 117; 108; 108; 117; 108; 108
R²: 0.369; 0.201; 0.540; 0.643; 0.270; 0.423; 0.548

Standard errors clustered by country in parentheses (65 countries in panel A, 59 in panel B). p-values: † p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001. Open defecation is a fraction 0 to 1. χ² tests in columns 5, 6, and 7 test for equality with the coefficients in columns 2, 3, and 4, respectively.
Table 5: Open defecation linearly explains much of the South Asia-Africa height gap
sample: (1) full; (2) to (9) restricted
Entries are coefficient (standard error), in the order extracted.
[Column positions of rows with fewer than nine entries, and the marks in the year FEs row, are not recoverable from the extracted text.]

South Asia: -0.349*** (0.0469); -0.360*** (0.0521); -0.253** (0.0734); -0.061 (0.162); -0.424*** (0.050); -0.488*** (0.109); -0.129 (0.181); -0.364*** (0.084); 0.029 (0.184)
percent explained: 30%; 83%; 73%; 92%
open defecation: -0.456* (0.173)
open defecators per square km: -1.758* (0.865); -2.184* (0.961); -1.998† (1.094)
population density: 0.185** (0.062)
ln(GDP): 0.205 (0.122); 0.227† (0.123); 0.077 (0.084); 0.068 (0.082)
indicator rows: year FEs
n (DHS surveys): 100; 89; 89; 89; 89; 89; 89; 89; 89
R²: 0.390; 0.391; 0.497; 0.433; 0.416; 0.495; 0.727; 0.786; 0.800

Dependent variable is mean height-for-age of children under 3. Standard errors clustered by country in parentheses. p-values: † p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001. Open defecation is a fraction 0 to 1. South Asia is a dummy indicator. The “restricted” sample is the set of observations with information on both height and open defecation. The “full” sample includes all observations from South Asia or sub-Saharan Africa.

Table 6: Change over time in a panel of Indian districts, NFHS-1 1992-3 to NFHS-2 1998-9
Columns: (1) (2) (3) (4). Entries are coefficient (standard error), in the order extracted.
[Column positions of rows with fewer than four entries, and the marks in the fixed-effect and control rows, are not recoverable from the extracted text.]

Panel A: Repeated cross section (OLS)
district open defecation: -0.779** (0.155); -0.279 (0.193)
village open defecation: -0.537** (0.128); -0.523** (0.104); -0.774** (0.170)
village open defecation²: 0.134 (0.367)
indicator rows: survey round fixed effect; age-in-months × sex; control variables

Panel B: District fixed effects
district open defecation: -0.525† (0.307); 0.106 (0.321)
village open defecation: -0.553** (0.125); -0.353** (0.117); -0.710** (0.179)
village open defecation²: -0.548 (0.337)
indicator rows: survey round fixed effect; age-in-months × sex; control variables
n (children under 3): 23,588; 23,588; 23,588; 23,588

Dependent variable is height-for-age z-score of children under 3.
Standard errors clustered by district (across survey rounds) in parentheses. p-values: † p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001. Open defecation is a fraction 0 to 1. Controls are at the household or child level: electrification, water supply, household size, indicators for being Hindu or Muslim, a full set of birth order indicators interacted with the relationship of the child’s mother to the head of the household, twinship indicators, and month-of-birth indicators. “Linear predicted height change” multiplies the statistically significant coefficient on open defecation by 0.063, the change in rural open defecation between the NFHS-1 and NFHS-2, to make a linear prediction based only on sanitation of the change in height, which was about 0.022. Only rural subsamples are used.

Table 7: Village open defecation predicts child height, India and African DHS data
Columns: (1) (2) (3) (4) (5). Entries are coefficient (standard error), in the order extracted.
[Column positions of rows with fewer than five entries, and the marks in the controls and fixed-effects rows, are not recoverable from the extracted text.]

Panel A: India (2005 NFHS-3, rural sub-sample)
village open defecation: -0.305** (0.0644); -0.289** (0.0636); -0.257** (0.0579); -0.217* (0.0886); -0.216* (0.0879)
village open defecation²: -0.208 (0.210); 0.0746 (0.191); -0.111 (0.224); -0.174 (0.220)
household open defecation: -0.413** (0.0356); -0.413** (0.0356); -0.0999** (0.0380); -0.419** (0.0353); -0.181** (0.0376)
indicator rows: controls; state fixed effects
n (children under 5): 26,832; 26,832; 26,832; 26,832; 26,832

Panel B: five African countries (8 DHS surveys, rural sub-samples)
village open defecation: -0.294** (0.0575); -0.361** (0.0598); -0.179** (0.0599); -0.0551 (0.0712); 0.00776 (0.0696)
village open defecation²: -0.726** (0.193); -0.572** (0.187); -0.569** (0.196); -0.523** (0.191)
household open defecation: -0.0783† (0.0404); -0.0767† (0.0403); 0.0161 (0.0400); -0.0804* (0.0404); -0.000619 (0.0402)
indicator rows: controls; DHS survey (country-year) FEs
n (children under 5): 44,216; 44,216; 44,216; 44,216; 44,216

Dependent variable is height-for-age z-score of children under 5. Standard errors clustered by village (survey PSU).
p-values: † p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001. Village open defecation is a fraction 0 to 1; household open defecation is an indicator 0 or 1. Controls are at the household or child level: 120 age-in-month by sex indicators; indicators for household dirt floor, access to piped water, electricity, TV, bicycle, motorcycle, and clean cooking fuel; and mother’s literacy, knowledge of oral rehydration, age at first birth, count of children ever born, and relationship to the head of the household.

Table 8: Decomposition of rural India-Africa height difference: Fraction due to open defecation
covariates: (1)-(2) none; (3)-(4) birth demography; (5)-(6) socio-economic; (7)-(8) IMR (village-level); (9)-(10) mother’s height
sanitation: added in the even-numbered column of each pair
n (children): 71,048; 71,048; 71,048; 71,048; 71,048; 71,048; 71,048; 71,048; 70,596; 70,596

Panel A: Pooled OLS regression, with an indicator for the Indian sub-sample
India indicator: -0.142 (0.026); -0.001 (0.026); -0.191 (0.026); -0.049 (0.027); -0.301 (0.028); -0.233 (0.030); -0.243 (0.026); -0.099 (0.028); 0.134 (0.027); 0.262 (0.027)
village open defecation (even-numbered columns): -0.480 (0.035); -0.432 (0.035); -0.177 (0.036); -0.418 (0.035); -0.420 (0.034)
R²: 0.0015; 0.0090; 0.0073; 0.0142; 0.0308; 0.0318; 0.0059; 0.0124; 0.0271; 0.0337
explained (one entry per column pair): 0.994; 0.754; 0.225; 0.593; not applicable
indicator equal (one entry per column pair): χ²₁ = 136; χ²₁ = 130; χ²₁ = 24; χ²₁ = 120; χ²₁ = 136

Panel B: Oaxaca-Blinder decomposition adding linear village-level sanitation, unexplained difference
difference: -0.142 (0.026); 0.024 (0.026); -0.203 (0.025); -0.045 (0.027); -0.316 (0.028); -0.237 (0.030); -0.258 (0.027); -0.082 (0.029); 0.149 (0.027); 0.297 (0.027)
explained (one entry per column pair): 1.169; 0.781; 0.250; 0.682; not applicable

Panel C: Non-parametric reweighting decomposition, counterfactual difference in means
difference: -0.142; 0.061; -0.242; -0.022; -0.367; -0.186; -0.154; -0.005; 0.215; 0.396
change (one entry per column pair): 0.203; 0.220; 0.181; 0.149; 0.181
explained (one entry per column pair): 1.43; 0.908; 0.493; 0.967; not applicable

Table 9: What fraction of international height differences can open defecation explain?
[The error-bar plot of the slopes, on an axis from 0 to 5, is not reproduced.]

                                              fraction of Indian height gap “explained”
estimation strategy     source        slope   Africa (8 DHS)   U.S.
India’s TSC, high       S (2012)†     1.592   3.542            0.440
India’s TSC, low        S (2012)†     1.153   2.565            0.319
experiment, OLS         H&S (2012)†   0.786   1.748            0.217
experiment, TOT IV      H&S (2012)†   2.928   6.514            0.810
DHS means, FEs          Table 1       0.962   2.140            0.266
DHS means, controls     Table 1       1.111   2.471            0.307
DHS means, SA&SSA       Table 5*      0.721   1.604            0.199
India, district FEs     Table 6       0.533   1.185            0.148
India and Africa        Table 8       0.420   0.934            0.116

† These are probably overestimates of the effects of sanitation coverage because the programs they study also motivate use of existing latrines. *The “Table 5” estimate is not, in fact, from table 5, but is found by estimating the regression in column 3 without the South Asia indicator variable. Error bars in the fourth column plot 95 percent confidence intervals for the estimate of linear difference in height-for-age standard deviations associated with a 0 to 1 difference in the fraction openly defecating. The fourth column divides the point estimate by 0.45.
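The arithmetic behind the “fraction explained” columns of Table 9 follows the linear thought experiment of section 5.1, and can be sketched in Python as below. The four input figures and the two slopes are taken from the text and from Table 9; the function and variable names are hypothetical, and the benchmarks here are built from the rounded figures quoted in the text, so results may differ from the table in the third decimal place.

```python
# Figures quoted in section 5.1 of the text.
RURAL_GAP_VS_AFRICA_SD = 0.142          # India-Africa height gap, in standard deviations
OPEN_DEFECATION_GAP = 0.316             # India minus Africa, as a fraction of households
INDIA_DEFICIT_VS_REFERENCE_SD = 2.0     # approximate Indian deficit from the reference mean
INDIA_OPEN_DEFECATION = 0.55            # fraction of Indian households openly defecating

def benchmark_slope(height_gap_sd, open_defecation_gap):
    """Slope that would account for the whole gap under a linear effect."""
    return height_gap_sd / open_defecation_gap

def fraction_explained(slope, benchmark):
    """Share of the gap that an estimated slope (in absolute value) would explain."""
    return slope / benchmark

africa_benchmark = benchmark_slope(RURAL_GAP_VS_AFRICA_SD, OPEN_DEFECATION_GAP)
us_benchmark = benchmark_slope(INDIA_DEFICIT_VS_REFERENCE_SD, INDIA_OPEN_DEFECATION)

# Two absolute slopes copied from Table 9, for illustration.
table9_slopes = {
    "India and Africa (Table 8)": 0.420,
    "DHS means, FEs (Table 1)": 0.962,
}

for label, slope in table9_slopes.items():
    print(label,
          round(fraction_explained(slope, africa_benchmark), 3),
          round(fraction_explained(slope, us_benchmark), 3))
```

A value above 1 in the Africa column corresponds to the “overshooting” discussed in sections 4.4.2 and 5.1.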