WPS7579

Policy Research Working Paper 7579

The Fiscal Cost of Weak Governance
Evidence from Teacher Absence in India

Karthik Muralidharan
Jishnu Das
Alaka Holla
Aakash Mohpal

Development Research Group
Human Development and Public Services Team
February 2016

Abstract

The relative return to input-augmentation versus inefficiency-reduction strategies for improving education system performance is a key open question for education policy in low-income countries. Using a new nationally-representative panel dataset of schools across 1297 villages in India, this paper shows that the large investments over the past decade have led to substantial improvements in input-based measures of school quality, but only a modest reduction in inefficiency as measured by teacher absence. In the data, 23.6 percent of teachers were absent during unannounced visits with an associated fiscal cost of $1.5 billion/year. There are two robust correlations in the nationally-representative panel data that corroborate findings from smaller-scale experiments. First, reductions in student-teacher ratios are correlated with increased teacher absence. Second, increases in the frequency of school monitoring are strongly correlated with lower teacher absence. Simulations using these results suggest that investing in better governance by increasing the frequency of monitoring could be over ten times more cost effective at increasing teacher-student contact time (net of teacher absence) than hiring more teachers. Thus, at current margins, policies that decrease the inefficiency of public spending in India are likely to yield substantially higher returns than those that augment inputs.

This paper is a product of the Human Development and Public Services Team, Development Research Group. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world.
Policy Research Working Papers are also posted on the Web at http://econ.worldbank.org. The authors may be contacted at jdas1@worldbank.org.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team

The Fiscal Cost of Weak Governance: Evidence from Teacher Absence in India

Karthik Muralidharan, Jishnu Das, Alaka Holla, and Aakash Mohpal∗

Keywords: teacher absence, teacher absenteeism, India, governance, state capacity, monitoring

JEL Codes: H52, I21, M54, O15

∗ Muralidharan: University of California, San Diego, CA, kamurali@ucsd.edu. Das: The World Bank Group, Washington, DC, jdas1@worldbank.org. Holla: The World Bank Group, Washington, DC, aholla@worldbank.org. Mohpal: University of Michigan, Ann Arbor, MI, amohpal@umich.edu. Acknowledgments: We thank Julie Cullen, Gordon Dahl, Roger Gordon, Gordon Hanson, Michael Kremer, Paul Niehaus, and Adam Wagstaff for useful comments. We thank the Bill and Melinda Gates Foundation for financial support for the data collection and analysis. Additional funds for data collection were made available by the Governance Partnership Facility grant provided through the Human Development Network of the World Bank. We are grateful to Pratap Bhanu Mehta and the Center for Policy Research, New Delhi for hosting the project and providing logistical support and infrastructure.
We thank Sreela Dasgupta, Anvesha Khandelwal, and L. Ravi for project management support, and Monisha Ashok, Jack Liebersohn, Prerna Mukharya, Suzanne Plant, and Anand Shukla for outstanding research assistance. The project would not have been possible without the efforts of Charu Sheela, Trilok Sisodiya, AV Surya, K. Venugopal, and other staff of the Social and Rural Research Institute (SRI) in New Delhi who oversaw the field work and primary data collection. The findings, interpretations, and conclusions expressed in this paper are those of the authors and do not necessarily represent the views of any of the organizations that the authors are affiliated with, or the view of the World Bank, its Executive Directors or the countries they represent.

1 Introduction

Determining the optimal level and composition of public education spending is a key policy question in many low-income countries. Many education advocates believe that low-income countries will fall short of achieving enrollment and learning goals without substantial increases in public spending on education (UNESCO, 2014); others argue that public sector inefficiencies leave considerable room for improvement within existing education budgets, and that fiscal constraints make it imperative to improve the efficiency of public expenditure (World Bank, 2010). However, the data to assess the relative importance of these contentions remain sparse, in part due to the difficulty in detecting and measuring inefficiencies in public spending. In this paper, we study one striking measure of public sector inefficiency - teacher absences - with panel data collected 7 years apart in India at a time of sharp increases in education spending. A large portion of this increase was accounted for by the salary cost of hiring teachers to reduce the student-teacher ratio in public schools.
As a policy alternative to hiring more teachers, we show that reducing teacher absences by increasing school monitoring would be ten times more cost effective at reducing the effective student-teacher ratio (net of teacher absence). We also find that hiring more teachers increases the teacher absence rate, which further increases the costs of teacher hiring once behavioral responses are adequately accounted for. Thus, while the default approach to improving education in low-income countries is input-augmentation, our results suggest that investing in reducing inefficiencies may yield much greater returns.

India presents a particularly salient setting for our analysis. It has the largest primary education system in the world, catering to over 200 million children. Further, over the past decade, the Government of India has invested heavily in primary education under the Sarva Shiksha Abhiyan (SSA) or “Universal Education Campaign.” Partly financed by a special education tax, this national program sought to correct historical inattention to primary education and led to a substantial increase in annual spending on primary education across several major categories of inputs including school infrastructure, teacher quality, student-teacher ratios, and school feeding programs.1

1 Total spending on education more than doubled between 2004 and 2009. In the year 2004-2005, India’s education budget was Rs.1,528 billion ($25 billion) and it increased to Rs.3,783 billion ($60 billion) in 2009-2010 (Pratham, 2010).

However, the public education system in India also faces substantial governance challenges that may limit the extent to which this additional spending translates into improved education outcomes. Our indicator of systemic inefficiency - teacher absence - presents a particularly striking indicator of weak governance. A nationally-representative study of over 3,000 public primary schools across 19 major Indian states found that over 25 percent of teachers were absent from work on a typical working day in 2003 (Kremer et al., 2005). Although administrative data from the government’s official records2 suggest that SSA has led to an improvement in various observed measures of school quality, there is little evidence on whether these investments have translated into improvements in education system performance, both with respect to intermediate metrics such as teacher absence and final outcomes such as test scores.

Our study of this nationwide campaign to improve school quality in India uses a new nationally-representative panel dataset of education inputs and outcomes that we collected in 2010. We constructed these data by revisiting a randomly-sampled subset of the villages originally surveyed in 2003 (see Kremer et al. (2005)) and collecting detailed data on school facilities, teachers, community participation, monitoring visits by officials, and teacher absence rates. Thus, in addition to reporting updated estimates of teacher absence, and independently-measured summary statistics on input-based measures of school quality, we are able to correlate changes in input-based measures of school quality with changes in teacher absence. The panel data help mitigate concerns arising from fixed unobserved heterogeneity at the village-level, and our results provide the best available estimates using nationwide data of how sharp increases in public education spending over the last decade have affected school quality.

We find significant improvements in almost all input-based measures of school quality between 2003 and 2010. The fraction of schools with toilets and electricity more than doubled, and the fraction serving mid-day meals nearly quadrupled. There were significant increases in the fraction of schools with drinking water, libraries, and a paved road nearby.
The fraction of teachers with college degrees increased by 41 percent, and student-teacher ratios (STR) fell by 16 percent. The fraction of teachers not paid on time fell from 51 to 22 percent, and the fraction of teachers aware of recognition programs increased from 50 to 81 percent. Finally, the frequency of school inspections and parent-teacher association (PTA) meetings increased significantly.

However, reductions in teacher absence rates were more modest. The all-India weighted average teacher absence in rural areas fell from 26.3 to 23.6 percent.3 While increased teacher hiring brought the STR down from 47 to below 40, the effective STR (after accounting for teacher absence) was still over 50 (having fallen from 64 in 2003 to 52 in 2010). The variation in teacher absence across states remains high. At one end, several top performing states have teacher absence rates below 15 percent, while at the other, the poorest performing state, Jharkhand, has a teacher absence rate of 46 percent. We estimate legitimate absence rates to be in the range of 8-10 percent. Thus, the variation among states in unauthorized teacher absence rates is even higher and ranges from 5 to 35 percent, suggesting substantial variation in the quality of education governance across Indian states.

2 These come from the “District Information System for Education” and are commonly referred to as the DISE data.
3 The all-India weighted average teacher absence estimated in 2003 was 25.2 percent; the corresponding figure for the rural sample was 26.3 percent. The panel survey only covered the rural sample.
Our panel-data analysis, where we correlate changes in village-level teacher absence with changes in teacher and school characteristics and in administrative and community-level monitoring, yields two robust correlations.4 First, reductions in the school-level student-teacher ratio (STR) are correlated with an increase in teacher absence, suggesting that the potential benefits from investing in more teachers and smaller class sizes may be partly offset by an increase in teacher absence. Second, better top-down administrative monitoring is strongly correlated with lower teacher absence. Villages with regular public school inspections had teacher absence rates that were 6.5 percentage points lower than villages without inspections (a 25% reduction in overall absence, and a 40% reduction in unauthorized absence). We also find that changes in monitoring frequency over time are not correlated with reductions in either authorized leave or official duty but are mainly correlated with reductions in unauthorized absence rates.

We combine our estimates of unauthorized teacher absence with data on the number of teachers employed and their salaries to calculate a fiscal cost of teacher absence of over $1.5 billion per year. This represents 60 percent of the entire revenue collected from the special education tax used to fund SSA in 2010.5 Teacher salaries typically account for over 80% of non-capital education spending (Dongre, Kapur and Tewary, 2014) and the most expensive component of the recently passed Right to Education (RtE) Act in India is a commitment to reduce student-teacher ratios from 40:1 to 30:1, at a cost of an additional $5 billion/year.
Using the most conservative panel-data estimates of the correlations between increased monitoring and reduced teacher absence, we estimate that improving school governance (by hiring more supervisory staff) would be over ten times more cost effective at reducing the effective student-teacher ratio (net of teacher absence) than hiring more teachers. Our results highlight the large fiscal costs of weak governance in Indian primary education and suggest that the marginal returns to investing in better monitoring and governance are likely to be quite high.

This paper makes several contributions to the literature on public economics in low-income countries. First, teacher absence is now widely used as a governance indicator in education in middle- and low-income countries.6 We update the estimates of teacher absence in rural India from 2003 and show that in spite of substantial increases in spending on education inputs over the last decade, improvements on this key measure of governance have been more modest. While corruption in education spending has been shown to hurt learning outcomes (Ferraz, Finan and Moreira, 2012), our estimates of the large fiscal cost of teacher absence highlight the importance of also focusing on governance issues that lead to significant amounts of ‘passive’ waste and inefficiency on an ongoing annual basis, but may not obtain as much media attention as one-off corruption scandals (Bandiera, Prat and Valletti, 2009; World Bank, 2010).

4 As we discuss further in the results section, we consider ‘robust’ correlations to be those where the point estimates are significant and similar in both bivariate and multiple regressions, and in specifications with no fixed effects, with state fixed effects, and with district fixed effects.
5 See http://indiabudget.nic.in/budget2012-2013/ub2012-13/rec/tr.pdf

Second, our results showing that decreases in STRs are correlated with increased teacher absence underscore the importance of distinguishing between average and marginal rates of corruption and waste in public spending. Niehaus and Sukhtankar (2013) propose this terminology in the context of wages paid to beneficiaries in a public-works program in India and find that marginal rates of leakage are much higher than average rates. We find the same result in the context of teachers and show that the effective absence rate of the marginal teacher hired is considerably higher than the average absence (because of the increased absence among existing teachers). This result, from a large all-India sample, mirrors smaller-sample experimental findings in multiple settings: Duflo, Dupas and Kremer (2012), and Muralidharan and Sundararaman (2013) present experimental evidence (from Kenya and India) showing that provision of an extra teacher to schools led to an increase in the absence rate of existing teachers in both settings. In other words, additional spending on school inputs (of which teacher salaries are the largest component) was correlated with increased inefficiency of spending.

Third, improvements in top-down administrative monitoring (inspections) are more strongly correlated with reduced teacher absence than improvements in bottom-up community monitoring (PTA meetings), consistent with experimental evidence on the relative effectiveness of administrative and community audits on reducing corruption in road construction in Indonesia (Olken, 2007). More broadly, a growing body of experimental evidence points to the effectiveness of audits and monitoring (accompanied by rewards or sanctions) in improving the performance of public-sector workers and service providers (including Olken (2007) in Indonesia; Duflo, Hanna and Ryan (2012) in India; and Zamboni and Litschig (2013) in Brazil).
Our panel-data estimates using data from an “as is” nationwide increase in monitoring of schools provide complementary evidence to smaller-scale experiments and suggest that investing in better governance and monitoring of service providers may be an important component of improving state capacity for service delivery in low-income countries (Besley and Persson, 2009; Muralidharan, Niehaus and Sukhtankar, 2015).

6 The World Bank’s World Development Report 2004 provided estimates of provider absence in both health and education for a sample of low-income countries (World Bank, 2003; Chaudhury et al., 2006). These numbers have been widely cited in policy discussions, and reduction in provider absence rates is often included as an objective in aid agreements between donors and aid recipients.

Finally, recent research has pointed to ‘misallocation’ of capital and labor in low-income countries as an important contributor to lower total factor productivity (TFP) in these settings (Hsieh and Klenow, 2009), and has also documented that a plausible reason for this misallocation is that ‘management quality’ is poorer in low-income countries, and that public-sector firms are managed especially poorly (Bloom and Van Reenen, 2010). Our results provide a striking example of weak management and misallocation in publicly-produced primary education in India (a sector that accounts for over 3% of GDP in spending). In particular, our estimates suggest that reallocating a portion of the $5 billion/year increase in education spending budgeted for hiring more teachers towards measures focused on reducing teacher absence (for instance, by hiring more supervisory staff) may be a much more cost effective way of increasing effective teacher-student contact time.
Thus, misallocation is likely to be a first-order issue in this setting, and reallocating education spending towards better governance may substantially increase TFP in publicly-produced education.7

7 Such misallocation in education spending is also seen in other low-income countries. An even more striking example is provided by de Ree et al. (2015) who experimentally study the intensive-margin impacts of an Indonesian policy reform that doubled teacher pay across the board (at a similar cost of $5 billion/year) and find that the teacher pay increase had no impact on student learning.

The rest of this paper is organized as follows: Section 2 discusses our empirical methods and analytical framework. Section 3 reports summary statistics on school inputs and teacher absence. Section 4 presents the cross-sectional and panel regression results. Section 5 discusses the fiscal costs of weak governance and compares the returns to investing in better monitoring with that from hiring more teachers. Section 6 discusses policy implications, and Section 7 concludes.

2 Data and Analytic Framework

The nationally-representative sample used for the 2003 surveys, which our current study uses as a base, covered both urban and rural areas across the 19 most populous states of India, except Delhi. This represented over 95 percent of the country’s population. The 2010 sample covered only rural India. The sampling strategy in 2010 aimed to maintain representativeness of the current landscape of schools in rural India and to maximize the size of the panel. We met these twin objectives (representativeness and panel) by retaining the villages in the original sample to the extent possible, while re-sampling schools from the full universe of schools in these villages in 2010, and by conducting the panel analysis at the village level.8 Enumerators first conducted school censuses in each village, from which we sampled up to three schools per village for the absence surveys.
During fieldwork, enumerators made three separate visits to each sampled school over a period of 10 months from January - October 2010.9 Data on school infrastructure and accessibility, finances (income and expenditure), and teacher demographics were collected once for each school (typically during the first visit, but completed in later visits if necessary), while data on time-varying metrics such as teacher and student attendance and dates of the most recent inspections and PTA meetings were collected in each of the three visits. We also assessed student learning with a test administered to a representative sample of fourth grade students in sampled schools. See Appendix A and Appendix Tables 1-3 for further details on sampling and construction of the village-level panel data set.

Teacher absence was measured by direct physical verification of teacher presence within the first fifteen minutes of a survey visit. Data collected during the school census were used to pre-populate teacher rosters for the sampled schools, so that enumerators could look for teachers and record their attendance and activity immediately after their arrival at the school.10 Once teacher attendance was recorded, all other data were collected using interviews of head teachers and individual teachers.11 We record teachers as absent on a given visit if they were not found anywhere in the school in the first fifteen minutes after enumerators reached a school. We consider all the teachers in the school to be absent if the school was closed during regular working hours on a school day, and respondents near the school did not know why the school was closed or mentioned that the school was closed because no teacher had arrived or they had all left early.12 To be conservative in our measure of absence, we exclude all school closures due to bad weather, school construction/repairs, school functions and alternative uses of school premises (for instance, elections). We also exclude all part-time teachers, teachers who were transferred or deputed elsewhere, or teachers reportedly on a different shift.

8 This is also why the 2010 wave did not include urban areas. Since school-level identifiers from the 2003 survey were not preserved (for confidentiality reasons), the panel needed to be constructed at the town/village level. However, since the fraction of urban schools covered in 2003 (relative to the total number of schools in the sampled towns) was very small, it was not possible to construct a credible panel-data estimate of school quality in towns. In rural areas, this was not a concern because we typically covered all the public schools in a village (in 84.2 percent of the cases) and had a mean coverage rate of 82.7 percent of public schools in the sampled villages.
9 While the exact timing of the school year is not identical across states, the typical school year runs from mid-June to mid-April. The three visits therefore spanned two academic years, with the first visit being made during January - March 2010, the second visit being made during June - August, and the third visit during August - October 2010.
10 This was important given the widespread possession of cell phones among teachers, which would allow them to call up absent colleagues as soon as they saw external visitors in the school who were measuring teacher absence.
11 Not all interviews could be completed. Most non-responses were at the teacher as opposed to the school level (since absent teachers could not be interviewed, whereas school data could be obtained from either the head teacher or any other senior teacher). These non-responses are unlikely to affect the analysis in this paper because the panel-data analysis will focus on aggregated data at the village level as opposed to the individual data at the teacher level.

We construct a school infrastructure index by summing binary indicators for the presence of four school facilities - drinking water, toilets, electricity and a library. We construct a remoteness index by taking the average of nine normalized indicators of distance to various amenities including a paved road, bus station, train station, public health facility, private health clinic, university, bank, post-office and Ministry of Education office. A lower score on the remoteness index represents a better connected school.

During each survey visit, we record the date of the most recent school inspection. We measure the extent of monitoring and supervision as the mean probability of being inspected in the past three months across all three visits. We used a similar procedure for constructing the mean probability of a parent-teacher association (PTA) meeting. Average parental education of children in a school is computed from the basic demographic data collected for the sample of fourth-grade students chosen for assessments of learning outcomes.

For most of the analysis in this paper, we use the village as our unit of analysis and examine mean village-level indicators of both inputs and outcomes because a large number of new schools had been constructed between 2003 and 2010, including in villages that already had schools. This school construction resulted from a policy designed to improve school access by ensuring that every habitation with over 30 school-age children had a school within a distance of one kilometer. Thus, to ensure that our sample was representative in 2010, and at the same time amenable to panel data analysis relative to 2003, we constructed the panel at the village level, with a new representative sample of schools drawn in the sampled villages.13 All the results reported in this paper are population weighted and are thus representative of the relevant geographic unit (state or all-India).
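
The index definitions above can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the authors' implementation: the field names are hypothetical, and "normalized" is read here as z-scoring each distance measure across schools, which the text does not specify.

```python
from statistics import mean, pstdev

# Hypothetical field names for the four facilities named in the text.
FACILITIES = ("drinking_water", "toilets", "electricity", "library")


def infrastructure_index(school):
    """Sum of four binary facility indicators, so the index runs from 0 to 4."""
    return sum(int(bool(school[name])) for name in FACILITIES)


def zscores(values):
    """Standardize a list of values (assumption: 'normalized' means z-scored)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma > 0 else 0.0 for v in values]


def remoteness_index(distances_by_amenity):
    """Average normalized distance per school.

    distances_by_amenity maps an amenity name to a list of distances,
    one entry per school, in the same school order for every amenity.
    A lower value means a better connected school.
    """
    normalized = [zscores(list(col)) for col in distances_by_amenity.values()]
    return [mean(list(row)) for row in zip(*normalized)]


def inspection_probability(inspected_in_past_three_months):
    """Mean over survey visits of a 0/1 indicator for a recent inspection."""
    return mean([int(flag) for flag in inspected_in_past_three_months])
```

For example, a school with drinking water, toilets and a library but no electricity scores 3 on the infrastructure index, and a school found to have been inspected in the prior three months on two of its three visits has an inspection probability of two thirds. The same averaging would apply to the PTA meeting indicator.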

3 Summary Statistics

3.1 Changes in Inputs

The data show considerable improvements in school inputs between 2003 and 2010 along three broad categories - teacher qualifications and working conditions, school facilities, and monitoring (Table 1). The fraction of teachers with a college degree increased by over 40 percent (from 41 to 58 percent), the fraction reporting getting paid regularly rose by around 60 percent (from 49 to 78 percent), and the fraction reporting the existence of teacher recognition schemes rose by over 60 percent (50 to 81 percent). While the fraction of teachers who report a formal teaching credential fell by 12 percent (77 to 68 percent), the main contributor to this decline was the large increase in the hiring of contract teachers in several large states, which led to an increase in the fraction of contract teachers from 6 to 30 percent. The student-teacher ratio also fell by around 16 percent (from 47.2 to 39.8).

12 Field teams obtained lists of state and national school holidays in advance of creating the field plans and ensured that no visits were conducted on these days.
13 Even in the absence of school construction, the survey firm did not retain school and teacher level identifiers from the 2003 survey (complying with data protection norms), which would have made it difficult to construct a school-level panel (especially for villages with multiple schools).

School facilities and infrastructure improved on almost every measure. The fraction of schools with toilets and electricity more than doubled (from 40 percent to 84 percent for toilets and 22 percent to 45 percent for electricity); the fraction of schools with functioning midday meal programs nearly quadrupled (from 22 percent to 79 percent); the fraction of schools with a library increased by over 35 percent (from 51 percent to 69 percent), and almost all schools now have access to drinking water (96 percent).
Initiatives outside the education ministry to increase road construction have also led to increased proximity of schools to paved roads, increasing the accessibility of schools for teachers who choose to live farther away. Relative to the distribution observed in 2003, a summary index of school infrastructure improved by 0.9 standard deviations.

Table 1 also documents improvements in both ‘top-down’ administrative and ‘bottom-up’ community monitoring of schools over this period. The fraction of schools inspected in the three months prior to a survey visit increased by over 40 percent (from 38 percent to 56 percent). This increase in inspection probability is even more pronounced over shorter time windows, increasing by over 60 percent for the previous two months and over 70 percent for the previous month. Finally, the extent of community oversight of schools, measured by the frequency of PTA meetings, also increased: the probability that a PTA meeting took place during the three months prior to a survey visit increased by 50 percent (from 30 percent to 45 percent). Overall, Table 1 confirms that the Government of India’s increased focus on primary education in the past decade did lead to significant improvements in input-based measures of school quality.

3.2 Changes in Teacher Absence

We now turn to changes in teacher absence. Table 2 (Column 2) shows teacher absence rates by state as well as the weighted average national absence rate for rural India. It also shows the corresponding figures for 2003 to facilitate comparison (Column 1). The population-weighted national average teacher absence rate for rural India fell from 26.3 percent to 23.6 percent, a reduction of 10 percent or 2.64 percentage points.

Considerable variation remains in teacher absence rates across states, with estimates ranging from 12.9 percent in Tamil Nadu to a high of 45.8 percent in Jharkhand. Teacher absence rates declined in 14 out of 19 states, with significant reductions in 12 states.
Five states (Tamil Nadu, Punjab, Maharashtra, Orissa, and Chhattisgarh) now report teacher absence rates below 15 percent. On a population-weighted basis, the largest contributor to reductions in all-India teacher absence was Bihar, which is consistent with the widely-reported improvement in governance in the state during this period (Chakrabarti, 2013). In contrast, the significant increase in teacher absence in India’s most populous state, Uttar Pradesh (population over 200 million), partly offset the reduction in other states.

Since Chaudhury et al. (2006) find a strong negative correlation between GDP/capita and teacher absence rates (both across countries and within Indian states), one way to interpret the magnitude of these changes is to compare them with the expected reduction in teacher absence that may be attributed simply to the economic growth that has taken place in this period. Using a growth accounting (as opposed to causal) framework, we can decompose the change in teacher absence into a component explained by changes in GDP/capita (as a proxy for ‘inputs’) and one explained by a change in governance (a proxy for TFP). Cross-sectional estimates from the 2003 data suggest that a 10 percent increase in GDP/capita is associated with a 0.6 percentage point reduction in teacher absence.14 In the period between 2002 and 2010, real GDP/capita in India grew by 38 percent. Thus, growth in GDP/capita over this period should have by itself contributed to a reduction in teacher absence of 2.4 percentage points. Our estimate of the change in teacher absence rate is exactly in this range, and suggests that the reduction of teacher absence we document is consistent with a proportional increase in ‘inputs’ into education, but a limited improvement in TFP in this period. We discuss the policy implications of this in the conclusion.
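
The growth-accounting comparison above is simple arithmetic, sketched below. Two things are assumptions of this sketch rather than statements from the paper: that the predicted reduction scales linearly with GDP/capita growth, and that the state fixed-effects estimate of 0.63 reported in the accompanying footnote can be read as percentage points of absence per 10 percent of GDP/capita growth.

```python
def predicted_absence_reduction(gdp_growth_pct, pp_per_10pct_growth):
    """Percentage-point fall in teacher absence implied by GDP/capita growth.

    Assumes simple linear scaling of the cross-sectional association.
    """
    return pp_per_10pct_growth * gdp_growth_pct / 10.0


# Figures quoted in the text; the unit reading of 0.63 is an assumption.
predicted = predicted_absence_reduction(gdp_growth_pct=38.0, pp_per_10pct_growth=0.63)
observed = 2.64  # percentage-point fall in rural teacher absence, 2003 to 2010

# Part of the observed change not accounted for by growth alone.
residual = observed - predicted
```

Under these assumptions the predicted reduction is roughly 2.4 percentage points against an observed 2.64, leaving a residual of about a quarter of a percentage point, in line with the text's reading of limited improvement beyond what growth would predict.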
Finally, to interpret the cost of teacher absence to students, we note that the effective attention a student receives from a teacher can be increased both by reducing teacher absence as well as by hiring more teachers. To account for the reduced attention that students receive when teachers are absent, we define the “effective student teacher ratio (ESTR)” as:

$$ \mathrm{ESTR} = \frac{\mathrm{STR}}{1 - \text{Teacher Absence Rate}} $$

We use official DISE data on total enrollment and total number of teachers, combined with the absence rates from our survey, to calculate both the STR and ESTR by state in 2003 and 2010 (Table 2 - columns 4-9). We see that though all-India STR had been reduced to below 40 in this period, the effective STR after accounting for teacher absence was over 52. The effective STR in 2010 in three of India’s most educationally backward states (Bihar, Jharkhand, and Uttar Pradesh) was as high as 97, 79, and 69, respectively. These figures illustrate that teacher absence can sharply increase the effective STR experienced by students relative to the STR calculated using state-level figures on enrollment and number of teachers.

14 The cross-sectional relationship is estimated by regressing village-level teacher absence on the log of district-level per-capita consumption (from the National Sample Survey) in the 2003 survey. Estimates without state fixed effects are larger (and equal to -1.17) whereas estimates with state fixed effects are smaller but still significant (and equal to -0.63). Our default estimate is based on using state fixed effects since cross-state variation in per-capita income is much more likely to be correlated with unmeasured governance quality. Tables are available on request.
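As an illustration of the ESTR definition, the sketch below applies it to a round STR of 40 and the 2010 national absence rate of 23.6 percent. The STR of 40 is an illustrative input (the text reports only that the all-India STR was "below 40"), and the function name is hypothetical.

```python
def effective_str(student_teacher_ratio, absence_rate):
    """Effective student-teacher ratio: STR divided by the share of teachers present."""
    if not 0.0 <= absence_rate < 1.0:
        raise ValueError("absence_rate must be a fraction in [0, 1)")
    return student_teacher_ratio / (1.0 - absence_rate)


# Illustrative inputs: STR of 40 (assumed round number), absence rate of 23.6 percent.
estr_national = effective_str(40.0, 0.236)
print(f"ESTR = {estr_national:.1f}")
```

With these inputs the effective ratio comes out a little above 52, in line with the text's statement that an STR just under 40 corresponds to an effective STR of over 52.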
3.3 Official Records, Teaching Activity, and Stated Reasons for Absence

Enumerators recorded whether a teacher had been marked as present in the log-books on the day of the visit and also on the previous day, and we see in Table 3 - Panel A that going by these records would suggest a much lower teacher absence rate of 16 percent using the same day’s records, or as low as 10.2 percent using the previous day’s records (this was not collected in 2003). These data suggest that official records can be easily manipulated, and highlight the importance of measuring teacher absence by direct physical verification as opposed to official records in log books.

Enumerators also recorded the activity that teachers were engaged in at the point of observation, and we see that 53 percent of teachers on the payroll were found to be actively teaching, and another 4 percent were coded as passively teaching (defined as minding the class while students do their own work). Just over 19 percent of teachers were in school but were either not in the classroom or not engaged in teaching activity while in the classroom. Thus a total of 42 percent of teachers on the payroll were either absent or not teaching at the time of direct observations (Table 3 - Panel B).15

In cases where a teacher was not found in the school, enumerators asked the head teacher (or senior-most teacher who was present) for the reason for absence. These stated reasons are summarized in Table 3 (Panel C). Two categories of clearly unauthorized absence (school closure during working hours and no valid reason for absence) account for just under half the cases of teacher absence (48 percent), which provides a lower bound on the extent of unauthorized absences of 11.3 percentage points. The two other categories of stated absence (authorized leave and official duties) that account for 52 percent of the observed absence are plausibly legitimate but cannot be verified.
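The lower bound on unauthorized absence follows from multiplying the overall absence rate by the share of absence cases with a clearly unauthorized stated reason. A minimal sketch of that arithmetic, using the rounded figures quoted in the text (the function name is illustrative):

```python
def absence_component(total_absence_pct, share_of_cases):
    """Percentage points of absence attributable to a category of stated reasons."""
    return total_absence_pct * share_of_cases


# 23.6 percent overall absence; 48 percent of cases are clearly unauthorized.
unauthorized_lower_bound = absence_component(23.6, 0.48)
print(f"unauthorized absence (lower bound): {unauthorized_lower_bound:.1f} pp")
```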
While head teachers may overstate the extent of official duties to shield absent colleagues,16 they should have no reason to understate it. We can, therefore, reasonably treat the stated reasons for absence as an upper bound for duty-induced absence. This yields the important finding that one commonly cited reason for teacher absence - namely, that teachers are often asked to perform non-teaching related duties such as conducting censuses and monitoring elections - is a very small contributor to the high rates of observed teacher absence. Table 3 - Panel C shows that official non-teaching duties account for less than 1 percent of observations and under 4 percent of the cases of teacher absence (these results are unchanged from 2003).

15 This is almost surely an underestimate (and hence a lower bound) because in many cases it is easy for a teacher who may not have been teaching to pick up a book and look like he or she is actively teaching when it is known that someone is visiting the school (see Muralidharan and Sundararaman (2010) for evidence documenting this).

4 Cross-section and Panel Regression Results

4.1 Correlates of Teacher Absence in 2010

Table 4 presents cross-sectional correlations between indicators of school quality and teacher absence in 2010. As discussed in section 2.4, this analysis is done at the village level since our panel analysis must be done at the village level. Column 1 shows the mean level of each covariate in the sample, columns 2-4 present the coefficients on each indicator in bivariate regressions with the dependent variable being teacher absence, while columns 5-7 do so in multiple regressions that include all the variables shown in Table 1 as regressors. We first show the correlations with no fixed effects, then with state fixed effects, and finally with district fixed effects. The comparison of results with and without state fixed effects is important for interpretation.
Many indicators of school quality vary considerably across states in a manner that is likely to be correlated with other measures of governance and development as well as the history of education investments in these states. On a similar note, while primary education policy is typically made at the state level, there is often important variation across districts within a state based on historical as well as geographical factors (Banerjee and Iyer, 2005; Iyer, 2010). Thus, specifications with district fixed effects that are identified using only within-district variation are least likely to be confounded by omitted variables correlated with historical or geographical factors. However, there may still be important fixed omitted variables across villages (such as the level of interest in education in the community) that are correlated with both measured quality of schools and teachers as well as teacher absence. We therefore present the correlations in Table 4 for completeness and focus our discussion on the village-level panel regressions presented in Table 5.

16 We see this most clearly in Table 3 - Panel A, where over 7.5 percent of teachers who were not found in the school during the direct observation were marked present in the official log books, suggesting collusion among teachers in the reporting of absence in official records.

In bivariate regressions without fixed effects, teachers who have formal training, who are paid regularly, who are eligible for recognition schemes, and who are in schools with better infrastructure are less likely to be absent (Table 4 - Column 2).
However, none of these correlations are significant in multiple regressions or in specifications with state or district fixed effects, suggesting that these measures are correlated with each other, and that states that have a longer history of investing in education may have better indicators of school and teacher quality and lower teacher absence, but that these metrics of teacher quality do not predict teacher absence within states or districts. Overall, there are few robust correlations across all specifications except that schools that have been inspected recently have significantly lower rates of absence. One important result in the correlations is that there appears to be no significant relationship between teacher salary and the probability of teacher absence. Since salary data were not collected in the 2003 survey, this variable is not included in the panel analysis below.

4.2 Correlates of Changes in Teacher Absence between 2003 and 2010

The main identification challenge in the cross-sectional correlations presented in Table 4 (and in Kremer et al. (2005)) is that we cannot rule out the possibility that the results are confounded with village-level omitted variables. The use of panel data helps mitigate these concerns since our correlations are now identified using changes in village-level measures of school inputs. Table 5 (columns 4-6) presents results from the following regression:

$$ \Delta Abs_i = \beta_0 + \beta_1 \cdot \Delta T_i + \beta_2 \cdot \Delta S_i + \beta_3 \cdot \Delta M_i + \beta_{Z_i} \cdot Z_i + \epsilon_i \qquad (1) $$

where $\Delta Abs_i$ is the change in the mean teacher absence rate in government schools in village $i$ between 2003 and 2010, $\Delta T_i$ is the change in village-level means of measures of teacher attributes, $\Delta S_i$ is the change in village-level means of measures of school facilities, and $\Delta M_i$ is the change in village-level means of measures of school monitoring and supervision. $Z_i$ represents different levels of fixed effects (state or district) and $\epsilon_i$ is the error term.
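A minimal sketch of how a first-difference specification like equation (1) can be estimated with district fixed effects in the bivariate case: demean the village-level changes within each district, then regress the demeaned change in absence on the demeaned change in the covariate. The data are made up for illustration (six hypothetical villages in two districts, built without noise so the true slope is known), and `within_slope` is an illustrative helper, not the authors' code.

```python
from collections import defaultdict


def within_slope(d_y, d_x, groups):
    """Bivariate OLS slope of d_y on d_x after removing group (e.g. district) means."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # per group: sum of y, sum of x, count
    for y, x, g in zip(d_y, d_x, groups):
        s = sums[g]
        s[0] += y
        s[1] += x
        s[2] += 1
    num = den = 0.0
    for y, x, g in zip(d_y, d_x, groups):
        sum_y, sum_x, n = sums[g]
        y_dm = y - sum_y / n  # demeaned change in absence
        x_dm = x - sum_x / n  # demeaned change in the covariate
        num += x_dm * y_dm
        den += x_dm * x_dm
    return num / den


# Hypothetical data: change in inspection probability and change in absence (pp).
districts = ["A", "A", "A", "B", "B", "B"]
d_inspect = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
district_shift = {"A": 1.0, "B": -3.0}   # made-up district-wide changes in absence
true_slope = -6.4                        # made-up slope used to build the data
d_absence = [district_shift[g] + true_slope * x for g, x in zip(districts, d_inspect)]

with_district_fe = within_slope(d_absence, d_inspect, districts)
no_fe = within_slope(d_absence, d_inspect, ["all"] * len(districts))
print(with_district_fe, no_fe)
```

In this constructed example the district-wide shifts are correlated with the covariate, so the pooled slope overstates the relationship while the within-district slope recovers the value used to build the data. This mirrors the reason for comparing specifications with and without fixed effects.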
Since changes in the measures of school quality included above may be correlated, we report both bivariate regressions with only one covariate at a time (columns 1-3) as well as multiple regressions that include all of these covariates (columns 4-6). Since equation 1 differences away fixed unobserved heterogeneity at the village level (and therefore at the state and district level as well), the inclusion of state and district fixed effects in the specification controls for average state- and district-specific changes over time in both the left-hand and right-hand side variables. Thus our panel results with state and district fixed effects are least likely to be confounded with time-invariant and time-variant omitted variables.17 However, it is also worth noting that such a specification biases us against detecting small effects. First, first-differencing leaves us with less variation in the explanatory variables, which will increase standard errors. Second, to the extent there is measurement error in the explanatory variables, first-differencing would also increase the attenuation bias. This is why we focus our discussion and interpretation of the results on the ones that are robustly significant and do not treat absence of evidence of significant effects as strong evidence of absence of effects.

Nevertheless, the results in Table 5 suggest that several plausible narratives for the reasons for teacher absence seen in the cross-sectional data reported in Kremer et al. (2005) are not supported in the panel data regressions. In particular, we find no correlation between changes in school infrastructure or proximity to a paved road and teacher absence. We also find no correlation between changes in teacher professional qualifications or professional conditions (such as regularity of pay) and changes in teacher absence.
We find two robust relationships in the panel regressions, where we define ‘robust’ as correlations that are significant in both bivariate and multiple regressions; significant in all three main specifications (no fixed effects, state fixed effects, and district fixed effects); and consistent across all specifications (we cannot reject that the estimates are the same across specifications).

First, villages that saw a reduction in student-teacher ratio (STR)18 have significantly higher rates of teacher absence. This is a potentially counterintuitive result because a common narrative for teacher absence is that their working conditions are poor, with high STRs cited as a prominent example of burdensome working conditions. However, the most common outcome for students when their teacher is absent is that they are combined with other classes (typically from other grades) whose teachers are present.19 Our results therefore suggest that having more teachers may make it easier for teachers to be absent (since other teachers can handle their class), and that the impact of hiring additional teachers may be partially offset through increased teacher absence. The estimates suggest that a 10 percent reduction in STR is correlated with a 0.5 percent increase in average teacher absence. The estimates remain stable when we include state and district fixed effects and are unchanged even when we introduce a full set of controls (also measured in changes). These results support the hypothesis that the relationship is causal. Some identification challenges in a cross-section are less salient in the panel. Decisions on teacher placement are largely based on administrative criteria of whether schools are above or below the STR norms and are unlikely to be correlated with contemporaneous changes in teacher absence.20 Indeed, a causal relationship between increased teacher hiring and increased absence of existing teachers has been established experimentally in other low-income countries, such as Kenya (Duflo, Dupas and Kremer, 2012) and India (Muralidharan and Sundararaman, 2013). Our estimates provide complementary evidence and greater external validity to these experimental results and suggest that they generalize to nationally scaled-up programs of reducing class sizes by hiring more teachers (in contexts with weak teacher accountability).

17 Another way of interpreting the specifications is that the one with no fixed effects is using all the variation in the nationwide changes over time in left and right-hand side variables, and the ones with state and district fixed effects are estimated using within-state and within-district variation in the changes respectively.

18 We focus on the school-level student-teacher ratio (STR) because the policy goals for teacher hiring are stated in terms of STR. But in practical terms, lowering STR is equivalent to lowering class-sizes.

19 Doing so does not deviate from the norm in the context of rural Indian government-run primary schools because our data show that close to 80 percent of schools practice multi-grade teaching (where one teacher simultaneously teaches students across multiple grades in the same classroom) in any case.

The second robust result in the panel data estimates is the strong negative correlation between improved school monitoring and teacher absence.
In each of the three visits to a school, enumerators recorded the date of the most recent inspection, and we average across the three visits across all the sampled schools in the village to construct the variable “Probability of being inspected in last 3 months”, which ranges from zero (none of the schools in the village were inspected in the prior three months in any of the three visits) to one (all the schools in the village were inspected in the prior three months in all of the three visits).21 The results suggest that villages where the probability of inspection in the past three months increased from zero to one had a reduction in average teacher absence of between 8.2 percentage points (with no fixed effects and no controls) and 6.4 percentage points (district fixed effects and a full set of controls). The estimates are remarkably consistent across all six specifications, and even the most conservative estimate suggests that teacher absence rates in schools that are regularly inspected are over 25 percent lower than in schools that are not.

To further check for patterns in the data that could support a causal interpretation of this result, Table 6 breaks down the dependent variable (teacher absence) by the various categories of stated reasons for absence (official duty, authorized leave, and unauthorized absence), and shows the coefficient on the “probability of inspection” variable on each of these categories. Panel A shows the cross-section estimates (corresponding to Table 4) while Panel B shows the panel ones (corresponding to Table 5). We see that increases in inspection probability are correlated with reductions in unauthorized teacher absence, but not with reductions in teacher absence due to either official duty or authorized leave.22 These results are consistent with the interpretation that improved ‘top-down’ administrative monitoring can have a significant and substantial impact on reducing unauthorized teacher absence.

20 The most likely omitted variable concern would in fact go the other way. If communities that cared more about education were more likely to be able to get additional teachers, they would also be more likely to ensure better teacher attendance, suggesting that our results may be a lower bound on the magnitude of this effect.

21 We also consider two alternative constructions: “Probability of being inspected in last 2 months” and “Probability of being inspected in last 1 month.” Results are similar and available upon request.

22 The point estimates on these categories suggest that increasing inspection probability may reduce the fraction of teacher absence that is recorded as “official duty” and increase the fraction that is recorded as “authorized leave”. These results are consistent with head teachers and teachers colluding to some extent to record absences as being due to official duty that do not count against a teacher’s quota of authorized leave. Increased inspections may make it difficult to sustain this collusion (since the inspector will be able to verify if the teacher is away on official duty) and require more of the absences to be counted against a teacher’s official leave quota.

In contrast, there is less evidence that increases in ‘bottom-up’ monitoring by the community (measured by whether the PTA had met in the past 3 months) are correlated with reductions in teacher absence (Table 5). This is consistent with the experimental results reported in Olken (2007) on the impacts of monitoring on corruption in Indonesia. These results should not be interpreted as suggesting that bottom-up monitoring cannot be effective, since it is also likely that they reflect differences in the effective authority over teachers possessed by administrative superiors (high) versus parents (low). PTAs in India typically do not have authority to appoint or retain regular civil-service teachers, and they cannot sanction teachers for absence or non-performance. Inspectors and administrative superiors, on the other hand, do possess authority over teachers, including the ability to demand explanations for absence, to make adverse entries in teachers’ performance record, and in extreme cases to initiate disciplinary proceedings. These actions do not take place very often, but the administrative rules provide inspectors with the powers to take these actions, whereas PTAs do not have any such powers.23

As in the case of the changes in STR, the stability of the estimates to the introduction of state and district fixed effects as well as a full set of controls helps mitigate concerns about omitted variables bias. To further address identification concerns, we examine the extent to which changes in inspection frequency can be explained by other observables, and find that there are no correlations between changes in inspection frequency and changes in other observable measures of school quality that are significant across our three standard specifications (Table 7). Finally, these results are also consistent with experimental evidence from India that finds significant reduction in teacher absence in response to improved monitoring and professional consequences that are linked to better attendance (Duflo, Hanna and Ryan, 2012). Since the experimental study was carried out in a small sample of informal schools in one state in India, our estimates using a nationally-representative panel dataset of rural public schools provide complementary evidence on the likely role of improved monitoring in reducing teacher absence.

23 In addition to the possibility of formal disciplinary action against absent teachers, an additional channel for the deterrence effect of increased inspections on teacher absence may stem from the possibility that inspectors can extract side payments from absent teachers in return for not making a formal adverse entry on their service record (World Bank, 2003). Social norms would make it difficult to ‘extort’ such payments from teachers who are actually present, but it would be much easier to demand a payment from an absent teacher. Thus, even if the costs of initiating formal disciplinary action are high (and the incidence of such action is low), there may be other informal channels through which more frequent inspections serve as a disincentive for teacher absence.

In interpreting the result on school inspections, it is useful to consider why there might be variation in the frequency of inspections across villages and what this would imply for a causal interpretation. One obvious explanation is that inspectors are more likely to visit more accessible villages, but the data do not support this hypothesis since there is no correlation between changes in the remoteness index and changes in inspection rates (Table 7).

District-level interviews on school governance in India suggest two important reasons for the variation in inspection frequency (Centre for Policy Research, 2012).24 The first is staffing. Districts are broken down further into administrative blocks, and schools within blocks are organized into clusters. School supervision is typically conducted by “block education officers” and “cluster resource coordinators”. We find that a significant fraction of these posts are often unfilled.
For instance, in 19 percent of the cases (where we have data) even the position of the “District Education Officer (DEO)”, the senior-most education official in a district, was vacant. Further, there is high turnover in education administration (the average DEO had a tenure in office of just one year), creating periods when the positions are vacant during transitions. The lack of supervisory staff at the block level is even more acute, as 32% of these positions were estimated to be vacant in 2010 (the year of our survey) even by an official government report (13th JRM Monitoring Report, 2011). Our interviews suggest that these staffing gaps at the block and cluster level are the most important source of variation in inspection frequency within districts, since blocks and clusters without supervisory staff are much less likely to get inspected.

The second source of variation in inspections is the diligence of the concerned supervisory officer. Even if all the positions of supervisory staff were filled, there would be variation in the zealousness with which these officers visited villages/schools, which might lead to some areas being inspected more often than others based on whether they were in the coverage area of a more diligent officer or not. However, since supervisors are typically assigned a coverage area of clusters or blocks that comprise many villages, variation in monitoring frequency that is driven by supervisor-level unobservable characteristics is unlikely to be correlated with other village-level characteristics that are also correlated with absence. Of course, this source of variation has implications for thinking about the likely effectiveness of hiring new supervisory staff (some of whom may be less diligent). We discuss these in section 5.3.

24 This module was designed to complement the school surveys by allowing us to create quantitative measures of district-level education governance. Unfortunately, the non-completion rate for these interviews was very high (over 40 percent) due to non-availability and non-response of district-level administrators. Since this non-response is clearly not random, we do not use the quantitative measures in regressions. Nevertheless, important qualitative insights can be obtained from these interview transcripts. These results are summarized in a companion policy report (Centre for Policy Research, 2012).

4.3 Correlates of Changes in Student Outcomes

While the focus of our analysis has been on teacher absence, we also briefly consider the correlations between improvements in school quality (seen in Table 1) and learning outcomes. Table 8 presents panel regressions of the form in equation (1), where changes in normalized mean math test scores at the village level are regressed on village-level changes in covariates, including teacher absence, school facilities, and monitoring. Changes in household-level inputs such as growth in mean parental education and increases in the fraction of students taking private tuition are positively correlated with test score gains (significant in most specifications). School-level variables are not consistently significant, though higher STRs and higher teacher absence are both correlated with lower test scores (and significant in some specifications). However, there is no positive correlation between the improvements in most of the standard measures of school quality and student learning outcomes - including school infrastructure, mid-day meals, and teacher qualifications and training.

We only treat these results as suggestive because the data (with a seven-year gap in mean village-level test scores) are not ideal for testing the impact of school characteristics on test scores. The ideal specifications would use annual panel data on student test scores matched to these characteristics and estimate value-added models of student learning.
Nevertheless, the main findings here are consistent with those in studies set in low-income countries using data sets that are better suited to study student learning outcomes, which find that teacher absences significantly hurt student learning, and that school infrastructure, teacher qualifications, and teacher training are typically not correlated with improved student learning.25

25 Duflo, Hanna and Ryan (2012) show experimentally that lower teacher absence raises test scores, while Muralidharan (2012) shows this in value-added estimates with five years of annual panel data on test scores in the state of Andhra Pradesh matched with the absence rate of the teacher of each student that year. Das et al. (2007) show that high teacher absences in Zambia (mainly due to teachers falling sick) lead to significantly lower student test score gains. Muralidharan (2012) also shows that school infrastructure and teacher qualifications are not correlated with improvements in learning outcomes in the control schools (which represent the ‘business as usual’ scenario). See Muralidharan (2013) for a review of this evidence with a focus on India.

5 The Fiscal Cost of Weak Governance

5.1 The Fiscal Cost of Teacher Absence

High levels of teacher absence translate into considerable waste of public funds since teacher salaries are the largest component of education spending in most countries, including India.26 Calculating these fiscal costs requires us to estimate and exclude the extent of legitimate absence from our calculations. As part of the institutional background work for this project, we obtained teacher policy documents from several states across India. Analysis of these documents indicates that the annual allowance for personal and sick leave is 5 percent on average across states. This is close to the survey estimate of 5.9 percent (Table 3), but we use the official data since the stated reasons may be over-reported.
26 This waste is especially costly in low-income countries because they typically have low tax/GDP ratios and hence face greater fiscal constraints on mobilizing resources for public investment.

Estimating the extent of legitimate absence due to ‘official duty’ (outside the school) is more difficult because there are no standard figures for the ‘expected’ level of teacher absence for official duties. Policy norms prescribe minimal disruption to teachers during the school day and stipulate that meetings and trainings be carried out on non-school days or outside school hours. Since we are not able to verify the claim that teachers were on official duty, and there is evidence that head teachers try to cover up for teacher absences by claiming that these are due to ‘official duties’, our default estimate treats half of these cases as legitimate. This gives us a base case of legitimate absence of 8 percent (5 percent authorized leave, and 3 percent official duty). We also consider a more conservative case where the legitimate rate of absence is 10 percent. This 8-10 percent range of legitimate absence also makes sense because the fraction of teacher observations that are classified as either ‘authorized leave’ or ‘official duty’ is in this range for the five states with the lowest overall absence rates - even treating the stated reasons for absence as being fully true (tables available on request).

To estimate the cost of teacher absence, we use teacher salary data from our surveys and use administrative (DISE) data on the number of primary school teachers by state (Table 9 - columns 1 and 2).27 We provide three estimates of the fiscal cost of teacher absence in columns 3 to 5 of Table 9 (based on assuming the rate of legitimate teacher absence to be 8, 9, and 10 percent respectively), and these calculations suggest that the annual fiscal cost of teacher absence is around Rs.81-93 billion (US$1.4-1.6 billion/year at 2010 exchange rates).
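The structure of this calculation can be sketched as the teacher wage bill multiplied by the absence rate in excess of the legitimate rate. The paper computes this state by state from Table 9; the sketch below instead uses a single national wage bill of Rs.600 billion, which is a hypothetical round number chosen only so that the output lands in the same range as the reported estimates. The function name is illustrative.

```python
def fiscal_cost_of_absence(wage_bill, absence_rate, legitimate_rate):
    """Salary spending attributable to absence above the legitimate rate."""
    excess = max(absence_rate - legitimate_rate, 0.0)
    return wage_bill * excess


WAGE_BILL_RS = 600e9   # hypothetical national teacher wage bill (Rs. per year)
ABSENCE_RATE = 0.236   # national absence rate reported in the text

# Three scenarios for the legitimate absence rate, as in the text.
costs = {
    legit: fiscal_cost_of_absence(WAGE_BILL_RS, ABSENCE_RATE, legit)
    for legit in (0.08, 0.09, 0.10)
}
for legit, cost in costs.items():
    print(f"legitimate absence {legit:.0%}: Rs.{cost / 1e9:.1f} billion per year")
```

Each additional percentage point of absence treated as legitimate lowers the estimated cost by one percent of the wage bill, which is why the three scenarios are evenly spaced.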
27 The salary figures in our surveys do not include the fiscal cost of the benefits provided to civil service teachers. Imputing the value of these benefits is difficult for the majority of teachers who are on a defined benefits pension program. However, newer cohorts of government employees are covered by a (less generous) defined contribution retirement program where the government contributes 10% of pay to a retirement account. We use this conservative estimate and add 10% to the average salary figures. No adjustment is made for medical benefits.

5.2 Calculating the Returns to Better Governance in Education

The results in Tables 5-7 suggest that of all the investments made in improving school quality in the period from 2003 to 2010, the only one that had a significant impact on reducing teacher absence was increased administrative monitoring and supervision. In this section, we calculate the returns to a marginal increase in the probability of a school being inspected. We make the following assumptions: (a) enough supervisory staff are hired to increase the probability of a school being inspected in the past 3 months by 10 percentage points (relative to a current probability of 56 percent); (b) increasing inspection probability by 10 percentage points would reduce mean teacher absence across the schools in a village by 0.64 percentage points (the most conservative estimate of the correlation between increased inspection probability and reduced teacher absence from Table 5); (c) the full cost (salary and travel) of a supervisor is 2.8 times that of a teacher; (d) a supervisor works 200 days per year and can cover 2 schools per day.28 The results of this estimation are presented in Table 10 and we see in column 3 that the cost of hiring enough supervisors to increase the probability of a school being inspected by 10 percentage points is Rs.448 million/year.
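One way to see how assumptions (a), (c), and (d) translate into a cost figure is sketched below. It assumes that raising the 3-month inspection probability by 10 percentage points requires inspecting an additional 10 percent of schools in each of four quarterly windows; this reading, the number of schools, and the annual teacher cost are all hypothetical inputs chosen for illustration, not figures from the paper.

```python
def supervisors_needed(n_schools, delta_prob, days_per_year=200,
                       schools_per_day=2, windows_per_year=4):
    """Supervisors required to add delta_prob inspections per school per window."""
    extra_visits_per_year = n_schools * delta_prob * windows_per_year
    visits_per_supervisor = days_per_year * schools_per_day
    return extra_visits_per_year / visits_per_supervisor


def supervisor_cost(n_supervisors, teacher_annual_cost, cost_multiple=2.8):
    """Annual cost when each supervisor costs cost_multiple times a teacher."""
    return n_supervisors * teacher_annual_cost * cost_multiple


N_SCHOOLS = 1_000_000          # hypothetical number of schools
TEACHER_ANNUAL_COST = 150_000  # hypothetical annual cost of a teacher (Rs.)

n_sup = supervisors_needed(N_SCHOOLS, 0.10)
cost = supervisor_cost(n_sup, TEACHER_ANNUAL_COST)
print(f"supervisors: {n_sup:.0f}, cost: Rs.{cost / 1e6:.0f} million per year")
```

Because each supervisor can make 400 visits a year under assumption (d), the required number of supervisors is small relative to the number of schools, which is what keeps the cost side of the calculation low.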
However, the reduction in wasted salary from this investment in terms of reduced teacher absence amounts to Rs.4.5 billion/year, suggesting that the returns to investing in better governance are about ten times the cost. Thus, improving school governance by hiring enough staff to increase the frequency of monitoring could be a highly cost-effective investment (on the current margin).

28 We use DISE data on the number of schools in each state to calculate the number of supervisors who will be required to increase the probability of inspections in a 3-month interval by 10 percentage points. The cost estimates are conservative and assume that the salary costs are double that of a teacher and that the travel costs are equal to 80 percent of a full month’s salary (which is higher than the typical travel and daily allowance provided to education department employees to travel to/from a village to district headquarters).

5.3 Hiring More Supervisors vs. More Teachers

Column 5 of Table 10 shows the extent to which the effective STR (ESTR) can be reduced by hiring enough supervisors to increase the probability that a school was inspected in the past three months by 10 percentage points (the same magnitude used in the calculations in the previous section). To compare the relative cost effectiveness of hiring more supervisors versus more teachers, we calculate the salary cost of hiring more teachers to achieve the same reduction in ESTR and report these numbers in column 6. Comparing columns 3 and 6, we see that hiring more supervisors would be 12.8 times more cost effective at reducing ESTR than doing so by hiring more teachers. The difference between the estimates in columns 4 and 6 stems from accounting for the fact that hiring more teachers will increase the absence rates of the existing teachers, which is the other robust result in the panel regressions presented in Table 5 (again, we use the most conservative estimate).
Thus, the estimates in column 6 account for the fact that the marginal rate of absence from hiring an extra teacher is higher than the average absence rate, and that increased spending on hiring teachers is correlated with an increase in inefficiency.

The difference in the relative cost effectiveness of the two policy options is large enough that the policy recommendation of hiring more supervisors rather than teachers (on the current margin) would be unchanged even if the supervisors were to work less efficiently than assumed in these calculations. For instance, if supervisors were absent at the same rate as teachers (say 25 percent), allocating marginal funds to hire an additional supervisor would still be nearly ten times more cost effective at reducing ESTR than using those funds to hire an additional teacher.

6 Policy Implications

The main caveat to using our results to recommend a universal policy of hiring more supervisors to scale up the frequency of school inspections is that our estimates are based on correlations and may not be convincing enough to warrant a universal scale-up. Nevertheless, it is worth noting that both our key results - the correlation between increased monitoring and reduced teacher absence, and the correlation between lower STR and increased teacher absence - are consistent with experimental evidence from smaller-scale experiments, which increases our confidence in their validity. Further, our estimates are based on an expansion of the existing system of inspections, use nationwide panel data (which mitigates omitted-variable concerns) representing close to a billion people, and complement results from smaller-scale randomized experiments, which gives them greater external validity for several reasons.
First, while our results support results from smaller randomized experiments, there is evidence that experimentally-estimated positive results of interventions that are implemented by NGOs may not be replicated when the programs are implemented by governments (Banerjee, Duflo and Glennerster, 2008). Second, there is also evidence of site-selection bias, where implementing partners are more likely to be willing to rigorously evaluate programs in locations where they are more likely to be successful (Allcott, 2015). Finally, even in the absence of such a bias, most experiments are conducted in very few sites, and may yield imprecise treatment effects (for inference over a larger population) in a setting where unobserved site-specific covariates may interact with the treatment (Pritchett and Sandefur, 2013).[29] Thus, even if small-scale experiments are unbiased within sample, they may be biased and also imprecise for population-level inference.

In other words, there is likely to be a trade-off between the potential omitted variable bias in our panel-data estimates on one hand, and the advantages of greater precision, “as is” implementation, and unbiased site selection on the other. We do not attempt to quantify this trade-off in this paper (since we have no objective basis for doing so). However, one way of reconciling this trade-off is to conduct a substantial nationwide expansion of school inspections by hiring more staff in the context of a large experimental evaluation. From a decision-theoretic perspective, our results are strong enough to support such a policy even if there is only a 1% chance that our estimates are causal. In Appendix B, we formally show that, barring extreme priors, a policy-maker interested in lowering the effective student-teacher ratio will find it cost-effective to invest in or scale up monitoring of teachers.

[29] The largest education experiments to date that we know of have been conducted over five districts in one state of India (Muralidharan and Sundararaman, 2010, 2011, 2013). While these experiments feature random assignment in representative samples of schools (in a state with over 80 million people), they still come from just one state, compared to the estimates in this paper that use panel data from 190 districts across 19 states.

7 Conclusions

The central and state governments in India have considerably increased spending on primary education over the past decade. We contribute towards understanding the impact of these substantial nationwide investments in primary education in India by constructing a unique nationally-representative panel data set on education quality in rural India. We find that there has been a substantial improvement in several measures of school quality including infrastructure, student-teacher ratios, and monitoring. However, teacher absence rates continue to be high, with 23.6 percent of teachers in public schools across rural India being absent during unannounced visits to schools.

Using village-level panel data, we find two robust correlations that provide external validity in nationally-representative data to results established in smaller-scale experiments. First, reductions in student-teacher ratios are strongly correlated with increased teacher absence, suggesting that increased spending on hiring additional teachers was accompanied by increased inefficiency, which may limit the extent to which additional spending may improve outcomes. Second, increases in the frequency of inspections are strongly correlated with lower teacher absence, suggesting that of all the investments in improving school quality, the one that was most effective in reducing teacher absence was improved administrative monitoring of schools and teachers. We calculate that the fiscal cost of teacher absence is over $1.5 billion per year, and estimate that investing in improved governance by increasing the frequency of monitoring would be over ten times more cost effective at increasing student-teacher contact time than doing so by hiring additional teachers.

In interpreting our results, it may be useful to think of the performance of the education system (measured by the level of teacher absence) as comprising two components - ‘inputs’ into the production of education that expand with income growth (such as school infrastructure, class size, and teacher salaries), and the efficiency of the use of these inputs (which would correspond to the TFP of education production). Our results suggest that the Indian education system has made significant progress on the former, but made less progress on the latter.

Using this growth accounting perspective, we see that the reduction in teacher absence observed between 2003 and 2010 is exactly in line with what we would expect from the growth in per-capita income that has taken place during this period. This is consistent with the growth in income enabling an expansion of a broad range of inputs into education that was for the most part an increase along existing spending patterns. On the other hand, a strategic reallocation of resources to governance and monitoring (as indicated by our results) may achieve a greater reduction in the effective student-teacher ratio for a given level of GDP/capita. Such a reduction of misallocation of spending may significantly improve the TFP of public spending on education.

Our results suggest that one promising way of improving school governance and achieving such a reallocation of resources would be to expand the existing system of administrative monitoring of teachers and schools by hiring more supervisory staff. Our calculations indicate that such a marginal expansion could (on the current margin) have a significant impact on reducing teacher absence, and that this would be highly cost effective in terms of reducing the fiscal cost of weak governance.
More broadly, our results suggest that the returns to investing in state capacity to better monitor the implementation of social programs in low-income countries may be quite high, and that at the very least there is a strong case for expanding such programs in the context of large experimental evaluations of “as is” implementation to obtain more precise estimates of their benefits.[30]

[30] Muralidharan, Niehaus and Sukhtankar (2015) is an example of just such an experimental evaluation, in the context of an ambitious initiative by the Government of Andhra Pradesh (AP) to improve governance in public welfare programs through biometric payments infrastructure. Working with the government of AP, they randomize the rollout of the new biometric payments infrastructure over a potential universe of 20 million beneficiaries, and estimate that the program reduced ‘leakage’ in the rural employment guarantee scheme by an amount that was nine times the cost of the program. Interestingly, this effect is of a similar magnitude to the returns that we estimate to investing in better monitoring of teachers in this paper.

References

13th JRM Monitoring Report. 2011. “13th Joint Review Mission (JRM) Report of Sarva Shiksha Abhiyan.”
Allcott, Hunt. 2015. “Site Selection Bias in Program Evaluation.” Quarterly Journal of Economics, 130(3): 1117–1165.
Bandiera, Oriana, Andrea Prat, and Tommaso Valletti. 2009. “Active and Passive Waste in Government Spending: Evidence from a Policy Experiment.” American Economic Review, 99(4): 1278–1308.
Banerjee, Abhijit, and Lakshmi Iyer. 2005. “History, Institutions, and Economic Performance: The Legacy of Colonial Land Tenure in India.” American Economic Review, 95(4): 1190–1213.
Banerjee, Abhijit, Esther Duflo, and Rachel Glennerster. 2008. “Putting a Band-Aid on a Corpse: Incentives for Nurses in the Indian Public Health Care System.” Journal of the European Economic Association, 6(2-3): 487–500.
Besley, Timothy, and Torsten Persson. 2009. “The Origins of State Capacity: Property Rights, Taxation, and Politics.” American Economic Review, 99(4): 1218–1244.
Bloom, Nicholas, and John Van Reenen. 2010. “Why Do Management Practices Differ across Firms and Countries?” Journal of Economic Perspectives, 24(1): 203–224.
Centre for Policy Research. 2012. “Quality of Education in Rural India (QuERI): Governance Report.” Centre for Policy Research, New Delhi.
Chakrabarti, R. 2013. Bihar Breakthrough: The Turnaround of a Beleaguered State. New Delhi: Rupa Publications.
Chaudhury, Nazmul, Jeffrey Hammer, Michael Kremer, Karthik Muralidharan, and F. Halsey Rogers. 2006. “Missing in Action: Teacher and Health Worker Absence in Developing Countries.” Journal of Economic Perspectives, 20(1): 90–116.
Das, Jishnu, Stefan Dercon, James Habyarimana, and Pramila Krishnan. 2007. “Teacher Shocks and Student Learning: Evidence from Zambia.” Journal of Human Resources, 42(4): 820–862.
de Ree, Joppe, Karthik Muralidharan, Menno Pradhan, and F. Halsey Rogers. 2015. “Double for Nothing: Experimental Evidence on the Impact of an Unconditional Teacher Salary Increase on Student Performance in Indonesia.” UC San Diego.
Dongre, Ambrish A, Avani Kapur, and Vibhu Tewary. 2014. How much does India Spend per Student on Elementary Education? New Delhi: Accountability Initiative, Centre for Policy Research.
Duflo, Esther, Pascaline Dupas, and Michael Kremer. 2012. “School Governance, Teacher Incentives, and Pupil-Teacher Ratios: Experimental Evidence from Kenyan Primary Schools.” National Bureau of Economic Research, Working Paper 17939.
Duflo, Esther, Rema Hanna, and Stephen Ryan. 2012. “Incentives Work: Getting Teachers to Come to School.” American Economic Review, 102(4): 1241–1278.
Ferraz, Claudio, Frederico Finan, and Diana B Moreira. 2012. “Corrupting learning: Evidence from missing federal education funds in Brazil.” Journal of Public Economics, 96(9-10): 712–726.
Foster, Andrew D, and Mark R Rosenzweig. 1996. “Technical Change and Human-Capital Returns and Investments: Evidence from the Green Revolution.” American Economic Review, 86(4): 931–953.
Hsieh, Chang-Tai, and Peter J. Klenow. 2009. “Misallocation and Manufacturing TFP in China and India.” Quarterly Journal of Economics, 124(4): 1403–1448.
Iyer, Lakshmi. 2010. “Direct versus Indirect Colonial Rule in India: Long-term Consequences.” Review of Economics and Statistics, 92(4): 693–713.
Kremer, Michael, Karthik Muralidharan, Nazmul Chaudhury, F. Halsey Rogers, and Jeffrey Hammer. 2005. “Teacher Absence in India: A Snapshot.” Journal of the European Economic Association, 3(2-3): 658–667.
Muralidharan, Karthik. 2012. “Long Term Effects of Teacher Performance Pay: Experimental Evidence from India.” UC San Diego.
Muralidharan, Karthik. 2013. “Priorities for Primary Education Policy in India’s 12th Five-year Plan.” India Policy Forum, 9: 1–46.
Muralidharan, Karthik, and Venkatesh Sundararaman. 2010. “The Impact of Diagnostic Feedback to Teachers on Student Learning: Experimental Evidence from India.” Economic Journal, 120(546): F187–F203.
Muralidharan, Karthik, and Venkatesh Sundararaman. 2011. “Teacher Performance Pay: Experimental Evidence from India.” Journal of Political Economy, 119(1): 39–77.
Muralidharan, Karthik, and Venkatesh Sundararaman. 2013. “Contract Teachers: Experimental Evidence from India.” National Bureau of Economic Research, Working Paper 19440.
Muralidharan, Karthik, Paul Niehaus, and Sandip Sukhtankar. 2015. “Building State Capacity: Evidence from Biometric Smartcards in India.” UC San Diego.
Niehaus, Paul, and Sandip Sukhtankar. 2013. “The Marginal Rate of Corruption in Public Programs: Evidence from India.” Journal of Public Economics, 104: 52–64.
Olken, Benjamin. 2007. “Monitoring Corruption: Evidence from a Field Experiment in Indonesia.” Journal of Political Economy, 115(2): 200–249.
Pratham. 2010. “Annual Status of Education Report.”
Pritchett, Lant, and Justin Sandefur. 2013. “Context Matters for Size: Why External Validity Claims and Development Practice Don’t Mix.” Center for Global Development, Working Paper 336.
UNESCO. 2014. Teaching and Learning: Achieving Quality for All. Paris, France: UNESCO.
World Bank. 2003. World Development Report 2004: Making Services Work for Poor People. Washington DC: Oxford University Press for the World Bank.
World Bank. 2010. Silent and lethal: How quiet corruption undermines Africa’s development efforts. Washington DC: World Bank.
Zamboni, Yves, and Stephan Litschig. 2013. “Audit Risk and Rent Extraction: Evidence from a Randomized Evaluation in Brazil.” Universitat Pompeu Fabra.

Tables

Table 1. Changes in Key Variables Between 2003 and 2010, Village-Level Data

| Variable | (1) Summary statistics: Year 2003 | (2) Summary statistics: Year 2010 | (3) Difference (H0: no diff) |
|---|---|---|---|
| TEACHER VARIABLES | | | |
| Have bachelors degree | 0.41 | 0.58 | 0.174*** |
| Have teacher training | 0.77 | 0.68 | -0.085*** |
| Are contract teachers | 0.06 | 0.30 | 0.233*** |
| Are paid regularly | 0.49 | 0.78 | 0.285*** |
| Recognition scheme exists | 0.50 | 0.81 | 0.309*** |
| SCHOOL VARIABLES | | | |
| Student-teacher ratio (STR) | 47.19 | 39.80 | -7.388*** |
| Mid-day meals | 0.22 | 0.79 | 0.576*** |
| Infrastructure index (0-4) | 2.14 | 3.35 | 1.205*** |
| Has drinking water | 0.80 | 0.96 | 0.160*** |
| Has toilets | 0.40 | 0.84 | 0.440*** |
| Has electricity | 0.22 | 0.45 | 0.236*** |
| Has library | 0.51 | 0.69 | 0.183*** |
| MONITORING & COMMUNITY VARIABLES | | | |
| Road is within 1km | 0.69 | 0.78 | 0.092*** |
| Probability of inspection in last 3 months | 0.38 | 0.56 | 0.176*** |
| Probability of inspection in last 2 months | 0.31 | 0.50 | 0.189*** |
| Probability of inspection in last 1 month | 0.22 | 0.38 | 0.155*** |
| Probability of PTA meeting in last 3 months | 0.30 | 0.45 | 0.153*** |
| Mean parental education (1-7 scale) | 2.03 | 2.43 | 0.394*** |
| State per-capita GDP (thousands of Rs.) | 14.74 | 30.21 | 15.473*** |

Source: Authors' calculations; Central Statistical Organization, India.
Notes: Summary statistics (except Student-teacher ratio) are weighted by rural population of Socio-Cultural Regions (SCRs) in Census 2001.
Student-teacher ratio is weighted by SCR school enrolment. Data for number of days since inspection are truncated at the 99th percentile. State per-capita GDP figures are in 2004-2005 prices. *** significant at 1%, ** significant at 5%, * significant at 10%

Table 2. Absence Rate of Teachers & Student-Teacher Ratios in Rural Public Schools by State by Year

| State | (1) Absence rate (%): 2003 | (2) Absence rate (%): 2010 | (3) Absence rate (%): change | (4) Student-teacher ratio: 2003 | (5) Student-teacher ratio: 2010 | (6) Student-teacher ratio: change | (7) Effective student-teacher ratio: 2003 | (8) Effective student-teacher ratio: 2010 | (9) Effective student-teacher ratio: change |
|---|---|---|---|---|---|---|---|---|---|
| Andhra Pradesh | 23.38 | 21.48 | -1.90 | 27.51 | 25.79 | -1.71 | 35.90 | 32.85 | -3.05 |
| Assam | 36.15 | 26.26 | -9.89*** | 28.21 | 36.07 | 7.86*** | 44.18 | 48.92 | 4.74 |
| Bihar | 39.42 | 28.69 | -10.73*** | 72.44 | 69.01 | -3.43 | 119.57 | 96.78 | -22.79 |
| Chattisgarh | 30.47 | 14.20 | -16.28*** | 42.12 | 33.05 | -9.07*** | 60.59 | 38.52 | -22.07 |
| Gujarat | 17.92 | 16.14 | -1.77* | 40.42 | 31.94 | -8.48*** | 49.24 | 38.09 | -11.15 |
| Haryana | 21.07 | 17.75 | -3.31** | 34.40 | 36.34 | 1.94 | 43.58 | 44.18 | 0.60 |
| Himachal Pradesh | 22.67 | 30.74 | 8.07*** | 18.04 | 21.73 | 3.69** | 23.33 | 31.38 | 8.04 |
| Jharkhand | 43.50 | 45.84 | 2.34 | 52.30 | 42.84 | -9.47*** | 92.57 | 79.09 | -13.48 |
| Karnataka | 22.60 | 23.93 | 1.33 | 29.07 | 23.62 | -5.45*** | 37.56 | 31.05 | -6.51 |
| Kerala | 19.60 | 15.79 | -3.81*** | 24.84 | 24.49 | -0.36 | 30.90 | 29.08 | -1.82 |
| Madhya Pradesh | 18.19 | 26.34 | 8.16*** | 37.19 | 46.57 | 9.39*** | 45.45 | 63.23 | 17.78 |
| Maharastra | 15.43 | 14.12 | -1.31 | 34.54 | 28.66 | -5.88*** | 40.84 | 33.38 | -7.47 |
| Orissa | 21.69 | 14.24 | -7.46*** | 47.01 | 36.63 | -10.38*** | 60.04 | 42.72 | -17.32 |
| Punjab | 36.66 | 13.54 | -23.13*** | 30.80 | 31.43 | 0.63 | 48.63 | 36.36 | -12.28 |
| Rajasthan | 25.13 | 22.72 | -2.42* | 38.91 | 32.05 | -6.86*** | 51.97 | 41.47 | -10.50 |
| Tamilnadu | 20.43 | 12.92 | -7.51*** | 29.56 | 25.85 | -3.71** | 37.15 | 29.69 | -7.47 |
| Uttar Pradesh | 26.72 | 31.21 | 4.49*** | 69.37 | 47.40 | -21.97*** | 94.66 | 68.90 | -25.76 |
| Uttaranchal | 32.29 | 21.02 | -11.27*** | 24.49 | 31.02 | 6.54** | 36.17 | 39.28 | 3.12 |
| West Bengal | 26.41 | 20.97 | -5.44*** | 58.23 | 41.61 | -16.62*** | 79.12 | 52.65 | -26.47 |
| India | 26.29 | 23.64 | -2.64*** | 47.19 | 39.80 | -7.39*** | 64.02 | 52.13 | -11.89 |

Source: Authors' calculations; DISE.
Notes: All figures are weighted by SCR's
rural population. Absence figures for 2003 differ slightly from the figures in the Kremer et al. (2005) paper. This is because the urban schools are removed from the sample. We do not conduct inference on the changes in "Effective Student-Teacher Ratio" because the data on total number of teachers are obtained from administrative (DISE) data. *** significant at 1%, ** significant at 5%, * significant at 10%

Table 3. Teacher Activity and Reasons for Absence

PANEL A: PHYSICAL VERIFICATION & LOGBOOK RECORDS OF ABSENCE

| Year | Physical verification (%) | Logbook records (today) | Logbook records (last working day) |
|---|---|---|---|
| Year 2003 | 26.29 | 19.07 | - |
| Year 2010 | 23.64 | 15.94 | 10.24 |

PANEL B: PHYSICAL VERIFICATION & TEACHER ACTIVITY

| Year | Found in classroom: actively teaching (%) | Found in classroom: passively teaching (%) | Found in classroom: not teaching (%) | Found outside classroom (%) | Teacher absent (%) |
|---|---|---|---|---|---|
| Year 2003 | 42.93 | 5.56 | 15.88 | 9.35 | 26.29 |
| Year 2010 | 53.08 | 4.16 | 8.96 | 10.15 | 23.64 |

PANEL C: STATED REASONS FOR ABSENCE

| Year | School closed (%) | On official duty: total (%) | On official duty: teaching related (trainings, meetings, etc.) | On official duty: non-teaching related (elections, health campaigns, etc.) | On official duty: other (panchayat meetings, political meetings, etc.) | Authorized leave (%) | No reason (%) |
|---|---|---|---|---|---|---|---|
| Year 2003 | 6.08 | 7.19 | 5.93 | 0.95 | 0.31 | 7.62 | 5.40 |
| Year 2010 | 6.60 | 6.43 | 5.21 | 0.93 | 0.29 | 5.91 | 4.70 |

Source: Authors' calculations.
Notes: All figures are weighted by SCR's rural population. In 2003, log-book records of last working day were not recorded in the survey. In 0.37 percent of cases, respondents said that a log-book was not maintained in the school; 0.23 percent refused to show the log-book. The full list of activities classified as not teaching is: doing administrative/paper work, talking to/accompanying the surveyor, chatting/talking (with teachers, others), reading magazines/newspapers, sleeping, watching TV/listening to radio, doing other personal work, idle.
Reasons for school closed are: opening hours but no one has arrived yet, opening hours but everyone left, and no reason.

Table 4. Cross-section OLS Regressions Results, Village Level, 2010 Data (Dependent Variable: Teacher Absence Rate (%))

Column (1) reports summary statistics for 2010; columns (2)-(4) report bivariate regressions; columns (5)-(7) report multiple regressions.

| Variable | (1) Summary statistics, Year 2010 | (2) Bivariate, no fixed effects | (3) Bivariate, w/ state fixed effects | (4) Bivariate, w/ district fixed effects | (5) Multiple, no fixed effects | (6) Multiple, w/ state fixed effects | (7) Multiple, w/ district fixed effects |
|---|---|---|---|---|---|---|---|
| TEACHER VARIABLES | | | | | | | |
| Have bachelors degree | 0.58 | -1.03 | -6.20*** | -7.51*** | -1.96 | -5.78** | -6.84*** |
| | (0.32) | (1.94) | (2.39) | (2.57) | (1.76) | (2.45) | (2.59) |
| Have teacher training | 0.68 | -11.95*** | -3.48 | -2.92 | -2.39 | -2.43 | -2.09 |
| | (0.31) | (2.38) | (2.39) | (2.73) | (2.81) | (2.69) | (2.87) |
| Are contract teachers | 0.30 | 10.97*** | 0.46 | -1.12 | -2.25 | -0.27 | -2.32 |
| | (0.30) | (2.37) | (2.48) | (2.97) | (2.83) | (2.71) | (3.21) |
| Are paid regularly | 0.78 | -7.72*** | -1.51 | -1.24 | -2.53 | -1.10 | -0.60 |
| | (0.39) | (1.95) | (1.92) | (2.20) | (2.00) | (1.95) | (2.17) |
| Recognition scheme exists | 0.81 | -6.53*** | -1.43 | -1.72 | -2.25 | -0.19 | -0.94 |
| | (0.37) | (2.12) | (1.86) | (2.07) | (2.08) | (1.81) | (2.01) |
| Log of salary | 9.25 | -3.70*** | -0.58 | -0.30 | 0.43 | -0.18 | -0.15 |
| | (0.62) | (1.08) | (0.88) | (0.96) | (1.01) | (0.94) | (0.99) |
| SCHOOL VARIABLES | | | | | | | |
| Log student-teacher ratio | 3.50 | 1.88 | -2.31** | -4.07*** | -2.42** | -1.65* | -3.29*** |
| | (0.59) | (1.26) | (1.15) | (1.40) | (1.10) | (0.99) | (1.24) |
| Mid-day meals | 0.79 | 0.77 | 0.57 | 2.62 | 0.49 | 0.47 | 2.01 |
| | (0.38) | (1.74) | (1.80) | (2.07) | (1.70) | (1.77) | (2.03) |
| Infrastructure index (0-4) | 3.35 | -3.44*** | -0.23 | -0.31 | -0.89 | 0.07 | 0.07 |
| | (1.30) | (0.56) | (0.70) | (0.80) | (0.68) | (0.69) | (0.77) |
| Remoteness index (normalized) | 0.04 | 0.26 | 0.58 | 0.76 | 0.19 | 0.17 | 0.14 |
| | (0.95) | (0.68) | (0.59) | (0.64) | (0.64) | (0.61) | (0.65) |
| MONITORING & COMMUNITY VARIABLES | | | | | | | |
| Probability of inspection in last 3 months | 0.56 | -10.47*** | -7.87*** | -7.63*** | -6.64*** | -6.32*** | -6.20*** |
| | (0.29) | (2.07) | (2.08) | (2.39) | (1.90) | (2.04) | (2.37) |
| Probability of PTA meeting in last 3 months | 0.45 | -6.72*** | -2.80** | -3.22** | -2.59* | -1.77 | -2.13 |
| | (0.48) | (1.51) | (1.17) | (1.32) | (1.33) | (1.13) | (1.32) |
| Mean parental education (1-7 scale) | 2.43 | -3.16*** | 0.37 | -0.46 | -0.90 | 0.64 | -0.82 |
| | (0.74) | (1.00) | (0.97) | (1.08) | (1.00) | (0.95) | (1.07) |
| Log state per-capita GDP | 3.29 | -11.01*** | | | -9.27*** | | |
| | (0.49) | (1.51) | | | (2.50) | | |
| REGRESSION STATISTICS | | | | | | | |
| Constant | | | | | 74.58*** | | |
| | | | | | (11.76) | | |
| R-squared | | | | | 0.139 | 0.231 | 0.394 |
| Adjusted R-squared | | | | | 0.126 | 0.211 | 0.273 |
| F-statistic (Inspected = PTA met) | | | | | 3.186* | 3.450* | 2.024 |
| Number of villages | | | | | 1,555 | 1,555 | 1,555 |

Source: Authors' calculations.
Notes: In summary statistics, standard deviations are in parentheses; in bivariate and multiple regressions, robust standard errors clustered at the district-level are in parentheses. In bivariate regressions, each cell is a separate regression of the row variables with the dependent variable being the teacher absence rate in percentage at the village-level. Infrastructure index variable uses availability of four items (as in Table 1) with higher values representing better infrastructure; similarly remoteness index uses distances to nine sets of facilities, with higher values representing more remote villages. Summary statistics and regressions are weighted by SCR's population. *** significant at 1%, ** significant at 5%, * significant at 10%

Table 5.
Panel OLS Regression Results, Village-Level (Dependent Variable: Percentage Points Change in Teacher Absence)

Columns (1)-(3) report bivariate regressions; columns (4)-(6) report multiple regressions.

| Variable | (1) Bivariate, no fixed effects | (2) Bivariate, w/ state fixed effects | (3) Bivariate, w/ district fixed effects | (4) Multiple, no fixed effects | (5) Multiple, w/ state fixed effects | (6) Multiple, w/ district fixed effects |
|---|---|---|---|---|---|---|
| CHANGES IN TEACHER VARIABLES | | | | | | |
| Have bachelors degree | -0.42 | -1.69 | -3.69 | -1.68 | -2.31 | -4.71 |
| | (2.55) | (2.52) | (2.91) | (2.51) | (2.57) | (3.04) |
| Have teacher training | 1.10 | 1.12 | 0.52 | 1.08 | 0.79 | 1.53 |
| | (2.51) | (2.76) | (3.12) | (2.81) | (2.85) | (3.19) |
| Are contract teachers | -4.89 | -3.39 | -0.86 | -5.26 | -3.84 | -0.83 |
| | (3.20) | (3.41) | (3.52) | (3.37) | (3.60) | (4.03) |
| Are paid regularly | -0.18 | -0.83 | -1.47 | -0.28 | -0.97 | -0.56 |
| | (1.70) | (1.81) | (2.11) | (1.67) | (1.77) | (2.24) |
| Recognition scheme exists | -3.87** | -3.34* | -3.69** | -3.06* | -2.03 | -3.34 |
| | (1.76) | (1.75) | (1.87) | (1.71) | (1.69) | (2.23) |
| CHANGES IN SCHOOL VARIABLES | | | | | | |
| Log student-teacher ratio | -5.33*** | -4.89*** | -4.48** | -5.56*** | -4.95*** | -4.69*** |
| | (1.83) | (1.68) | (1.91) | (1.81) | (1.57) | (1.78) |
| Mid-day meals | 1.31 | 1.81 | 4.19 | 1.62 | 0.95 | 2.14 |
| | (1.73) | (2.09) | (2.59) | (1.73) | (2.08) | (2.85) |
| Infrastructure index (0-4) | -1.10* | -0.97 | -1.01 | -0.97 | -0.68 | -0.96 |
| | (0.66) | (0.69) | (0.76) | (0.66) | (0.66) | (0.78) |
| Remoteness index (normalized) | -1.16 | -0.93 | -0.55 | -1.25 | -1.04 | -0.81 |
| | (1.05) | (1.06) | (1.08) | (1.00) | (0.95) | (1.13) |
| CHANGES IN MONITORING & COMMUNITY VARIABLES | | | | | | |
| Probability of inspection in last 3 months | -8.23*** | -7.31*** | -6.60*** | -7.35*** | -6.56*** | -6.41*** |
| | (1.94) | (1.98) | (1.91) | (1.83) | (1.83) | (2.01) |
| Probability of PTA meeting in last 3 months | -1.65 | -3.18* | -3.80** | -1.71 | -2.08 | -2.96 |
| | (1.74) | (1.63) | (1.72) | (1.67) | (1.64) | (2.02) |
| Mean parental education (1-7 scale) | -1.29 | -0.09 | 0.48 | -1.13 | -0.46 | 0.51 |
| | (1.40) | (1.38) | (1.44) | (1.29) | (1.32) | (1.46) |
| Log state per-capita GDP | -4.69 | | | -6.18 | | |
| | (7.39) | | | (7.18) | | |
| REGRESSION STATISTICS | | | | | | |
| Constant | | | | 3.43 | | |
| | | | | (5.50) | | |
| R-squared | | | | 0.071 | 0.143 | 0.346 |
| Adjusted R-squared | | | | 0.054 | 0.115 | 0.188 |
| F-statistic (Inspected = PTA met) | | | | 4.419** | 2.921* | 1.268 |
| Number of villages | | | | 1,297 | 1,297 | 1,297 |

Source: Authors' calculations.
Notes: Robust standard errors clustered at the district-level are in parentheses. Infrastructure and remoteness index variables are as defined in Table 4. Regressions are weighted by SCR's population. *** significant at 1%, ** significant at 5%, * significant at 10%

Table 6. Correlation between Inspection Frequency and Teacher Absence by Reason

Columns (1)-(3) report bivariate regressions; columns (4)-(6) report multiple regressions.

PANEL A: CROSS-SECTION ANALYSIS
Dependent Variable: Village-Level Teacher Absence by Reason in 2010 (%) (Coefficient on Probability of Inspection Reported)

| Reason for absence | (1) Bivariate, no fixed effects | (2) Bivariate, w/ state fixed effects | (3) Bivariate, w/ district fixed effects | (4) Multiple, no fixed effects | (5) Multiple, w/ state fixed effects | (6) Multiple, w/ district fixed effects |
|---|---|---|---|---|---|---|
| On Official Duty | -2.16** | -2.51** | -2.48* | -1.55 | -2.16** | -2.06 |
| | (1.01) | (1.02) | (1.29) | (1.03) | (1.04) | (1.30) |
| Authorized Leave | 2.35*** | 1.65** | 1.26 | 2.28*** | 1.44* | 1.17 |
| | (0.78) | (0.84) | (1.05) | (0.82) | (0.85) | (1.08) |
| Unauthorized Absence | -10.67*** | -7.02*** | -6.41*** | -7.39*** | -5.59*** | -5.30** |
| | (1.96) | (1.84) | (2.10) | (1.77) | (1.81) | (2.12) |

PANEL B: PANEL ANALYSIS
Dependent Variable: Change in Village-Level Teacher Absence by Reason between 2003 and 2010 (percentage points) (Coefficient on Change in Probability of Inspection Reported)

| Reason for absence | (1) Bivariate, no fixed effects | (2) Bivariate, w/ state fixed effects | (3) Bivariate, w/ district fixed effects | (4) Multiple, no fixed effects | (5) Multiple, w/ state fixed effects | (6) Multiple, w/ district fixed effects |
|---|---|---|---|---|---|---|
| On Official Duty | -1.77* | -1.05 | -1.45 | -1.43 | -1.00 | -1.49 |
| | (0.92) | (0.85) | (0.97) | (0.91) | (0.83) | (0.96) |
| Authorized Leave | 0.77 | 0.42 | 0.59 | 0.59 | 0.33 | 0.50 |
| | (0.83) | (0.84) | (0.91) | (0.85) | (0.84) | (0.91) |
| Unauthorized Absence | -7.22*** | -6.68*** | -5.74*** | -6.51*** | -6.07*** | -5.41*** |
| | (1.69) | (1.86) | (1.78) | (1.66) | (1.79) | (1.75) |

Source: Authors' calculations.
Notes: Robust standard errors clustered at the district-level are in parentheses. Regressions are weighted by SCR's population. *** significant at 1%, ** significant at 5%, * significant at 10%

Table 7.
Panel OLS Regression Results, Village-Level (Dependent Variable: Change in Probability of Inspection in Past 3 Months)

Columns (1)-(3) report bivariate regressions; columns (4)-(6) report multiple regressions.

| Variable | (1) Bivariate, no fixed effects | (2) Bivariate, w/ state fixed effects | (3) Bivariate, w/ district fixed effects | (4) Multiple, no fixed effects | (5) Multiple, w/ state fixed effects | (6) Multiple, w/ district fixed effects |
|---|---|---|---|---|---|---|
| CHANGES IN TEACHER VARIABLES | | | | | | |
| Have bachelors degree | -0.003 | 0.042 | 0.039 | 0.006 | 0.037 | 0.030 |
| | (0.046) | (0.053) | (0.050) | (0.046) | (0.051) | (0.055) |
| Have teacher training | 0.041 | 0.054 | 0.085 | 0.029 | 0.046 | 0.064 |
| | (0.056) | (0.057) | (0.054) | (0.053) | (0.055) | (0.061) |
| Are contract teachers | 0.055 | 0.063 | -0.040 | 0.108* | 0.088 | -0.009 |
| | (0.053) | (0.073) | (0.069) | (0.059) | (0.070) | (0.082) |
| Are paid regularly | -0.036 | -0.010 | -0.010 | -0.037 | -0.005 | -0.004 |
| | (0.030) | (0.035) | (0.035) | (0.031) | (0.035) | (0.041) |
| Recognition scheme exists | 0.069** | 0.062** | 0.020 | 0.067** | 0.060* | 0.023 |
| | (0.028) | (0.031) | (0.032) | (0.028) | (0.031) | (0.037) |
| CHANGES IN SCHOOL VARIABLES | | | | | | |
| Log student-teacher ratio | 0.055* | 0.032 | 0.029 | 0.049 | 0.024 | 0.012 |
| | (0.031) | (0.032) | (0.034) | (0.030) | (0.031) | (0.037) |
| Mid-day meals | 0.007 | -0.008 | -0.024 | 0.018 | -0.008 | -0.017 |
| | (0.032) | (0.041) | (0.046) | (0.034) | (0.042) | (0.050) |
| Infrastructure index (0-4) | 0.010 | 0.011 | 0.005 | 0.006 | 0.011 | 0.004 |
| | (0.012) | (0.013) | (0.015) | (0.013) | (0.013) | (0.015) |
| Remoteness index (normalized) | -0.023 | -0.026 | -0.032 | -0.024 | -0.024 | -0.028 |
| | (0.022) | (0.022) | (0.020) | (0.021) | (0.021) | (0.024) |
| CHANGES IN MONITORING & COMMUNITY VARIABLES | | | | | | |
| Probability of PTA meeting in last 3 months | 0.018 | 0.052** | 0.068** | 0.033 | 0.053** | 0.070** |
| | (0.023) | (0.024) | (0.029) | (0.023) | (0.024) | (0.027) |
| Mean parental education (1-7 scale) | -0.03 | -0.04 | -0.04** | -0.04 | -0.04* | -0.05** |
| | (0.026) | (0.026) | (0.022) | (0.023) | (0.024) | (0.025) |
| Log state per-capita GDP | -4.69 | | | 0.40** | | |
| | (7.392) | | | (0.167) | | |
| REGRESSION STATISTICS | | | | | | |
| Constant | | | | -0.13 | | |
| | | | | (0.138) | | |
| R-squared | | | | 0.051 | 0.093 | 0.315 |
| Adjusted R-squared | | | | 0.034 | 0.065 | 0.152 |
| Number of villages | | | | 1,300 | 1,300 | 1,300 |

Source: Authors' calculations.
Notes: Robust standard errors clustered at the district-level are in parentheses.
Infrastructure and remoteness index variables are as defined in Table 4. Regressions are weighted by SCR's population. *** significant at 1%, ** significant at 5%, * significant at 10%

Table 8. Panel OLS Regression Results, Village-Level (Dependent Variable: Change in Normalized Math Score)

Columns (1)-(3) report bivariate regressions; columns (4)-(6) report multiple regressions.

| Variable | (1) Bivariate, no fixed effects | (2) Bivariate, w/ state fixed effects | (3) Bivariate, w/ district fixed effects | (4) Multiple, no fixed effects | (5) Multiple, w/ state fixed effects | (6) Multiple, w/ district fixed effects |
|---|---|---|---|---|---|---|
| CHANGES IN STUDENT VARIABLES | | | | | | |
| Average age | 0.09** | 0.05 | 0.07 | 0.10** | 0.07 | 0.09 |
| | (0.05) | (0.04) | (0.05) | (0.05) | (0.04) | (0.05) |
| Proportion male | 0.14 | 0.13 | 0.14 | 0.07 | 0.10 | 0.10 |
| | (0.15) | (0.15) | (0.17) | (0.15) | (0.15) | (0.17) |
| Private tuition | 0.29** | 0.27** | 0.19 | 0.26** | 0.30** | 0.21 |
| | (0.13) | (0.13) | (0.18) | (0.12) | (0.13) | (0.18) |
| CHANGES IN TEACHER VARIABLES | | | | | | |
| Have bachelors degree | -0.22** | -0.13 | 0.03 | -0.28** | -0.17 | -0.02 |
| | (0.12) | (0.12) | (0.12) | (0.11) | (0.12) | (0.12) |
| Have teacher training | -0.05 | -0.02 | 0.03 | -0.06 | 0.06 | 0.07 |
| | (0.12) | (0.11) | (0.13) | (0.12) | (0.11) | (0.13) |
| Are contract teachers | -0.13 | 0.10 | -0.07 | 0.07 | 0.12 | -0.05 |
| | (0.15) | (0.15) | (0.18) | (0.16) | (0.15) | (0.19) |
| Are paid regularly | -0.04 | 0.11 | 0.12 | -0.02 | 0.10 | 0.11 |
| | (0.08) | (0.07) | (0.08) | (0.08) | (0.07) | (0.08) |
| Recognition scheme exists | -0.05 | 0.04 | 0.03 | -0.03 | 0.02 | 0.01 |
| | (0.08) | (0.08) | (0.09) | (0.08) | (0.08) | (0.09) |
| CHANGES IN SCHOOL VARIABLES | | | | | | |
| Absence rate of teachers | -0.005*** | -0.004*** | -0.002 | -0.006*** | -0.005*** | -0.002 |
| | (0.002) | (0.002) | (0.002) | (0.002) | (0.002) | (0.002) |
| Log student-teacher ratio | -0.07 | -0.11* | -0.13* | -0.09 | -0.13* | -0.15* |
| | (0.08) | (0.07) | (0.08) | (0.08) | (0.07) | (0.08) |
| Mid-day meals | -0.23* | -0.06 | -0.05 | -0.16* | -0.05 | -0.04 |
| | (0.08) | (0.09) | (0.12) | (0.08) | (0.09) | (0.11) |
| Infrastructure index (0-4) | 0.00 | -0.03 | 0.01 | -0.01 | -0.02 | 0.00 |
| | (0.03) | (0.03) | (0.04) | (0.03) | (0.03) | (0.04) |
| Remoteness index (normalized) | 0.05 | 0.04 | 0.03 | 0.03 | 0.04 | 0.03 |
| | (0.05) | (0.04) | (0.05) | (0.05) | (0.04) | (0.05) |
| Probability of inspection in last 3 months | -0.01 | -0.06 | -0.10 | -0.02 | -0.07 | -0.12 |
| | (0.08) | (0.08) | (0.09) | (0.09) | (0.08) | (0.09) |
| CHANGES IN MONITORING & COMMUNITY VARIABLES | | | | | | |
| Probability of PTA meeting in last 3 months | -0.06 | 0.01 | 0.07 | -0.03 | 0.01 | 0.06 |
| | (0.07) | (0.07) | (0.08) | (0.08) | (0.07) | (0.08) |
| Mean parental education (1-7 scale) | 0.17*** | 0.15*** | 0.16*** | 0.17*** | 0.15*** | 0.17*** |
| | (0.05) | (0.04) | (0.06) | (0.05) | (0.05) | (0.06) |
| Log state per-capita GDP | -5.91 | | | 0.71 | | |
| | (7.59) | | | (0.44) | | |
| REGRESSION STATISTICS | | | | | | |
| Constant | | | | -0.86** | | |
| | | | | (0.41) | | |
| R-squared | | | | 0.066 | 0.180 | 0.434 |
| Adjusted R-squared | | | | 0.042 | 0.146 | 0.277 |
| Number of villages | | | | 1,155 | 1,155 | 1,155 |

Source: Authors' calculations.
Notes: Robust standard errors clustered at the district-level are in parentheses; infrastructure and remoteness index variables are as defined in Table 4; regressions are weighted by SCR's population; *** significant at 1%, ** significant at 5%, * significant at 10%

Table 9. The Fiscal Cost of Absence (in 2010 Prices and Salaries)

| State | (1) Average monthly teacher salary (Rs.) | (2) Number of teachers | (3) Total loss due to absence (millions of Rs.), allowed absence 8% | (4) Total loss due to absence (millions of Rs.), allowed absence 9% | (5) Total loss due to absence (millions of Rs.), allowed absence 10% |
|---|---|---|---|---|---|
| Andhra Pradesh | 10,299 | 347,875 | 6,374 | 5,901 | 5,428 |
| Assam | 9,567 | 167,161 | 3,855 | 3,644 | 3,433 |
| Bihar | 8,645 | 336,359 | 7,942 | 7,559 | 7,175 |
| Chattisgarh | 8,290 | 155,573 | 1,055 | 885 | 715 |
| Gujarat | 15,804 | 198,584 | 3,374 | 2,960 | 2,546 |
| Haryana | 16,236 | 77,980 | 1,630 | 1,463 | 1,296 |
| Himachal Pradesh | 12,199 | 48,507 | 1,776 | 1,698 | 1,620 |
| Jharkhand | 9,734 | 135,690 | 6,598 | 6,423 | 6,249 |
| Karnataka | 10,897 | 195,929 | 4,489 | 4,207 | 3,925 |
| Kerala | 10,751 | 54,976 | 608 | 529 | 451 |
| Madhya Pradesh | 9,294 | 267,846 | 6,027 | 5,698 | 5,370 |
| Maharastra | 17,246 | 288,914 | 4,025 | 3,367 | 2,710 |
| Orissa | 9,382 | 192,119 | 1,484 | 1,246 | 1,008 |
| Punjab | 12,654 | 105,930 | 980 | 803 | 626 |
| Rajasthan | 14,165 | 271,205 | 7,463 | 6,956 | 6,448 |
| Tamilnadu | 18,489 | 150,820 | 1,811 | 1,443 | 1,075 |
| Uttar Pradesh | 10,370 | 491,455 | 15,615 | 14,942 | 14,269 |
| Uttaranchal | 17,155 | 45,782 | 1,350 | 1,246 | 1,143 |
| West Bengal | 10,555 | 416,633 | 7,527 | 6,946 | 6,366 |
| India | 11,368 | 3,949,338 | 92,699 | 86,773 | 80,847 |

Source: Authors' calculations; DISE.
Notes: 2010 teacher salaries are from Teacher Long and School Census Data.
Data on total number of teachers is obtained from DISE State Report Cards. All figures are in 2010 prices.

Table 10. Marginal Returns to Investing in Governance (in 2010 Prices and Salaries)

Columns (1)-(2), Student-teacher Ratio (2009-2010): (1) Student-teacher Ratio; (2) Effective Student-teacher Ratio.
Columns (3)-(4), Effect of Increasing Probability of Inspection in Past 3 months by 10 percentage points: (3) Annual Cost (Rs. millions); (4) Annual Savings From Reduced Teacher Absence (Rs. millions).
Columns (5)-(6), Cost to Produce Equal Effect Through Teacher Hiring: (5) Expected Effective Student-teacher Ratio; (6) Annual Cost (Rs. millions).

State                   (1)      (2)      (3)      (4)      (5)      (6)
Andhra Pradesh         17.8     22.7     31.0    350.8     22.5    433.5
Assam                  24.5     33.2     15.9    154.5     33.0    204.2
Bihar                  58.2     81.6     21.2    273.6     80.8    374.9
Chattisgarh            24.5     28.5     13.9    120.1     28.3    135.0
Gujarat                29.8     35.5     19.1    291.8     35.3    336.2
Haryana                26.8     32.5      8.8    118.9     32.3    139.8
Himachal Pradesh       15.4     22.2      6.8     56.0     22.0     79.2
Jharkhand              41.3     76.2     14.8    127.9     75.3    236.3
Karnataka              23.6     31.0     18.5    201.6     30.8    257.7
Kerala                 19.6     23.2      2.0     56.3     23.1     64.5
Madhya Pradesh         39.8     54.0     40.6    250.9     53.5    332.1
Maharastra             25.7     29.9     45.0    486.8     29.7    546.8
Orissa                 29.4     34.3     20.5    177.5     34.1    199.7
Punjab                 20.5     23.7     10.2    137.4     23.5    153.2
Rajasthan              26.2     33.9     40.0    361.6     33.6    454.5
Tamilnadu              28.3     32.5     24.6    264.9     32.3    293.2
Uttar Pradesh          40.1     58.2     58.4    489.4     57.7    697.1
Uttaranchal            20.6     26.0     10.7     73.3     25.8     90.0
West Bengal            32.3     40.8     30.1    409.4     40.5    502.5
India                  31.7     41.5    448.0  4,509.6     41.1  5,742.0

Source: Authors' calculations; DISE.
Notes: Number of schools, number of teachers, and enrollment figures are from administrative (DISE) data. Simulation assumes that one inspection every 3 months reduces absence linearly by 6.4 percentage points. Inspector costs are assumed to be two times teacher salaries, travel costs are assumed to be 80 percent of monthly salary, and an inspector is assumed to work 200 days a year and inspect two schools every day.
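As a rough check on how the student-teacher ratio columns of Table 10 relate to one another, the sketch below assumes that the effective ratio equals the nominal ratio divided by the share of teachers present. That formula is an assumption rather than something stated in the table notes, but it is consistent with the India row when combined with the 23.6 percent absence rate reported in the abstract and the 6.4 percentage point inspection effect from the notes. The function and variable names are illustrative only.

```python
def effective_ratio(nominal_ratio, absence_rate):
    """Students per teacher actually present (assumed formula, not from the paper)."""
    return nominal_ratio / (1.0 - absence_rate)


# India row of Table 10 and the absence rate reported in the abstract.
NOMINAL_RATIO = 31.7
ABSENCE_RATE = 0.236

# Table 10 notes: one inspection every 3 months lowers absence by 6.4 points,
# so raising the inspection probability by 10 points lowers it by 0.64 points.
ABSENCE_DROP = 0.064 * 0.10

baseline = effective_ratio(NOMINAL_RATIO, ABSENCE_RATE)
with_monitoring = effective_ratio(NOMINAL_RATIO, ABSENCE_RATE - ABSENCE_DROP)

# Annual costs (Rs. millions) of reaching the same effective ratio, India row.
COST_MONITORING = 448.0   # column (3)
COST_HIRING = 5742.0      # column (6)
cost_ratio = COST_HIRING / COST_MONITORING

print(round(baseline, 1), round(with_monitoring, 1), round(cost_ratio, 1))
```

Under these assumptions the two ratios round to the India entries in columns (2) and (5), and the cost ratio of roughly 12.8 corresponds to the statement that monitoring is over ten times more cost effective than hiring.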
The Fiscal Cost of Weak Governance: Evidence from Teacher Absence in India
Karthik Muralidharan, Jishnu Das, Alaka Holla, and Aakash Mohpal

Appendices

A Sampling and Construction of Village-Level Panel Dataset

The original survey in 2003 covered the 19 largest states of India by population (except Delhi). Within each state, 10 districts were sampled using Probability Proportional to Size (PPS) and, within each district, 10 primary sampling units (PSUs, which could be villages or towns) were sampled by PPS, thereby yielding a nationally representative sample of 1,900 PSUs across 190 districts (including towns and villages). The exceptions are Uttar Pradesh, where 11 districts were sampled, and Uttaranchal, where 9 districts were sampled (since Uttaranchal had only 9 districts, and Uttar Pradesh is the largest state in India). Additionally, to account for the considerable geographic diversity within Indian states, the sample was stratified by geographic socio-cultural region (SCR), and the 10 districts in each state were allocated to SCRs in proportion to the population of the SCRs. Similarly, the 10 PSUs within each district were allocated to villages/towns in proportion to the rural/urban population split in the district. All sampling was done on the basis of the 1991 Census, since that was the latest Census data available at the time of the study.

The 2003 sample was augmented to include 241 villages from the REDS survey (Foster and Rosenzweig, 1996). Since the REDS villages are drawn as a representative sample within districts, including these villages does not change the representativeness of the sample. If a REDS district was in our main sample, the REDS villages were included (typically 2 to 4 per REDS district) and additional villages were sampled randomly to make up the total desired sample size. If a REDS district was not in our sample, those villages were covered in addition to our core sample.
Including these villages provides more precise estimates of outcomes in the SCRs where they are located. All analysis is weighted by SCR populations, so the final estimates continue to be nationally-representative on a population-weighted basis. The final sample in 2003 comprised 2,141 rural and urban PSUs across 19 states of India.

In 2010, since the survey only covered rural areas, the sample size was reduced from 10 to 8 villages per district. All districts in the 2003 sample were retained in the 2010 study, with three exceptions where fully urban districts sampled in 2003 were replaced with a new PPS-sampled district from the same SCR. The three replaced districts are Hyderabad in Andhra Pradesh, Ahmedabad in Gujarat, and Greater Bombay in Maharashtra, which are highly urban districts containing their respective state capitals.

As we highlight in the paper, to meet our objectives of maintaining representativeness of the current landscape of schools in rural India and maximizing the size of the panel, we retain villages from the 2003 study to the extent possible. In Column 1 of Table A1, we provide state-wise counts of rural PSUs that were sampled in the 2003 study. After removing PSUs in the three replaced districts altogether and all other urban PSUs from the 2003 study, the maximum panel size we could draw, including the REDS villages, was 1,668. We sampled a 2003 village by default as long as the village had a population between 250 and 10,000 as per the 1991 Census, and we could locate the village in the 2001 Census.[31] In districts where we had more than 8 rural PSUs in 2003, we sampled 8 PSUs randomly. The lower cutoff on population was based on the Government of India’s mandate that all rural habitations exceeding 250 people should have a school within 1 km.
Since villages and hamlets can be absorbed into expanding cities over time, we match the originally sampled 1991 village to the villages in the 2001 Census to make sure that the sampled village still exists.

From the 2003 list of 1,668 villages, we had to remove 249 from the 2010 sampling frame for reasons we discuss below (see Columns 5 through 9 of Table A1 for the distribution of these villages across states). A total of 69 villages were dropped because they fall in districts that had more than 8 villages in the 2003 round. A further 129 villages were removed because their population was either below 250 or far above 10,000 in the 2001 Census (20,000 for Kerala). A total of 36 villages could not be located in the 2001 Census (suggesting that they had either been depopulated or absorbed into nearby towns). Finally, 15 villages were replaced due to safety, logistical and accessibility reasons. Thus, our sample consists of 1,419 villages from 2003 (Table A1 - column 3).

In districts where we had fewer than 8 villages in the 2003 sample (recall that the rural/urban sampling within districts was done on the basis of population ratios, and thus districts where over 25% of the population in 1991 was urban would have fewer than 8 villages), we sample more villages as required to reach a minimum sample size of 8 villages per district for the 2010 survey. The new villages were sampled PPS from the universe of eligible villages in the 2001 Census that were not already sampled. The cross-section sample (including REDS villages) thus consists of 1,650 villages (Table A1 - column 2).

Of the 1,650 villages that comprise our 2010 sample, data from 1,555 villages were included in the analysis presented in this paper (Table A2 - column 2). First, we found that 29 of the 1,650 villages have no schools in the village.
A large proportion of these villages (12 out of 29) are in Himachal Pradesh, which is a sparsely populated mountainous state with many small habitations. Another 39 villages did not have a public school within the village, but did have a private school. Since this paper focuses on changes in public schools, these villages are not included in the analysis. In Kerala, we lose another 12 villages because all schools in the village refused to be surveyed.[32] Finally, we drop 15 more villages from our analysis because in these villages, schools were either not functional or closed in all three visits, which means we were unable to complete surveys. A state-level breakdown of these 95 villages is provided in Columns 4-7 of Table A2.

[31] The exception to this is Kerala, which has a much higher population density, where the upper cut-off was 20,000.

The decline in the cross-section sample size, for the reasons discussed above, also reduces the number of villages for which we have panel data. After accounting for the above 95 villages and 53 villages in 2003 for which we have no data (for similar reasons as outlined for the 2010 survey round), our final panel size is 1,297 villages. These 1,297 villages form the core of our analysis.

To ensure a representative sample of schools, enumerators first conducted a full mapping of all public and private schools in each sampled village. Enumerators conducted “Participatory Resource Assessments” with households at multiple locations (at least three) within each village to obtain a list of all primary schools within the boundary of the village. All enumerated schools were administered a short survey that included questions on school administration such as management (public or private), enrollment, infrastructure, etc. Enumerators also collected a list of all teachers in the school and their demographic characteristics. This school listing in each sampled village provided the frame for school sampling.
We sampled up to three schools per village. If the village had three or fewer schools, all schools were sampled. If the village had more than three schools, we stratified the schools by management type and randomly sampled two public schools and one private school to the extent possible. In the event that there was only one public school and two or more private schools, one public and two private schools were sampled. Table A3 provides the state-level breakdown of the number of schools and teachers in the final (public school) sample used in this paper (both cross section and panel).

[32] Permission to survey was refused in spite of the survey team possessing the required permission documents. Kerala has a history of strong unions and it was not possible for the field teams to overcome this opposition.

B A Decision-Theoretic Case for Scale-Ups of Monitoring with an RCT

Formally, consider a simple binary policy regarding the number of supervisors to be hired that can take the values {0, 1}, where the current policy is {0} and {1} represents a ‘new’ policy of hiring enough supervisors to ensure that all schools are inspected once in three months. The costs of the new policy are the additional salary and operational costs of hiring supervisors, and the benefits are the reduced fiscal cost of teacher absence. Denote these by C{1} and B{1} respectively, and assume that it is optimal to implement the policy if B{1} > C{1}. However, while C{1} is known, there is uncertainty around B{1}, and a randomized controlled trial (RCT) in the context of a policy movement towards {1} would reduce the uncertainty around B{1}. Suppose that after the trial, the likelihood that the optimal policy switches from {0} to {1} is p and that the expected per-period net benefit of such a switch is q. Let the cost of data collection and analysis of a trial be C{data} and the discount rate be r.
Let the period of the trial be one year and the fraction of the population participating in the trial be N. Half of those in the trial are allocated to a treatment group and the other half to a control group. Since data collection will be based on a representative sample of trial sites, we assume that C{data} does not vary with the size of the trial. The one-period cost of the trial is then C{data} + (N/2) * C{1}. The benefits of the trial are the expected one-period benefit of the new policy (during the trial) and the discounted benefits of switching to a new policy (in perpetuity), weighted by the probability that the trial will lead to a switch in the policy. Thus, the trial should be conducted as long as:

\[ C\{\text{data}\} + (N/2) \cdot C\{1\} < (N/2) \cdot B\{1\} + \frac{pq}{r} \cdot \frac{1}{1+r} \]

To focus on the benefits of learning if the optimal policy should be {1} instead of {0}, we abstract away from the benefit of the policy during the trial period and the one-period delay in implementing the new policy (if found to be optimal), in which case the trial should be conducted as long as:

\[ C\{\text{data}\} + (N/2) \cdot C\{1\} < \frac{pq}{r} \]

Using our results to calibrate these quantities, it is straightforward to see that the expected benefits of a trial are very large even under extremely conservative assumptions. The estimates in Table 10 suggest that the marginal cost of {1} would be $33 million and that the marginal benefit would be $331 million (using our panel data estimates).[33] Thus, if our estimates are true, q would be around $300 million/year, and using a discount rate of 10%, the net present value of moving to {1} would be $3 billion. Now suppose there is only a 1% chance that the causal impacts of inspections on teacher absence are as great as the panel data estimates presented here and that there is a 99% chance that the causal impacts of inspection are not significantly different from zero (i.e. p = 0.01). Even then, we see that pq/r is $30 million.

[33] The estimates in Table 10 are based on hiring enough supervisors to increase the probability of a school being inspected in the previous 3 months by 10 percentage points. Since the current probability of a school being inspected in the previous 3 months is 56 percent (Table 1), we scale up the estimates in Table 10 by a factor of 4.4 since moving to {1} would imply that the other 44 percent of schools should also be inspected. We use an exchange rate of US$1 = 60 Indian Rupees.

On the cost side, we conservatively estimate (using data from our own field costs) that a highly-powered trial would have C{data} in the range of $1 million. A trial with an N of 0.06 would be a very large trial and could cover a nationally-representative sample across all major Indian states, but would only cost $1 million/year.[34] Thus, even including all costs of data collection, the upper bound of the costs of such a trial would be $2 million compared to a likely lower-bound expected benefit of $30 million.[35] An expansion of school inspections in the context of an experimental evaluation would therefore make sense even if there was only a 1% chance of the true effects being the same as our panel-data estimates.

If we use a medical ethics perspective in this setting, we also need to consider the costs of not providing a treatment that is known (or highly likely) to be effective. In this case, that would be the foregone one-period benefit of scaling up the treatment immediately (which we estimate to be around $300 million). Thus, depending on their prior beliefs, and the extent to which our panel data estimates shift these priors, some policy makers may choose to switch the policy regime from {0} to {1} immediately.
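The calibration above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it takes the India row of Table 10, the scaling factor and exchange rate from footnote 33, and the values of p, r, N and C{data} used in the text. The variable names are hypothetical, and q is computed from the unrounded figures rather than the rounded $300 million, so the results match the text only after rounding.

```python
# Inputs from Table 10 (India row) and footnote 33.
COST_10PP_RS_MN = 448.0        # annual cost of a 10-point rise in inspection probability
SAVINGS_10PP_RS_MN = 4509.6    # annual savings from reduced absence for the same rise
SCALE_TO_FULL_COVERAGE = 4.4   # remaining 44 percent of schools, in 10-point steps
RS_PER_USD = 60.0


def to_usd_mn(rs_mn):
    """Convert Rs. millions to USD millions at the paper's exchange rate."""
    return rs_mn / RS_PER_USD


cost_full = to_usd_mn(COST_10PP_RS_MN * SCALE_TO_FULL_COVERAGE)         # C{1}
benefit_full = to_usd_mn(SAVINGS_10PP_RS_MN * SCALE_TO_FULL_COVERAGE)   # B{1}

q = benefit_full - cost_full   # expected per-period net benefit of switching
r = 0.10                       # discount rate
p = 0.01                       # probability the panel estimates are causal
expected_benefit = p * q / r   # right-hand side of the simplified condition

C_DATA = 1.0                   # data collection cost, USD millions
N = 0.06                       # fraction of the population in the trial
trial_cost = C_DATA + (N / 2) * cost_full   # left-hand side of the condition

run_trial = trial_cost < expected_benefit
print(round(cost_full), round(benefit_full), round(expected_benefit), round(trial_cost), run_trial)
```

With these inputs the trial cost of about $2 million falls well below the expected benefit of about $30 million, so the simplified condition holds.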
However, the point of our exercise above is to show that policy makers, depending on their beliefs, should either implement {1} immediately or undertake a large expansion in the context of an RCT as described above; only under an extreme set of beliefs (that there is less than a 1% chance of our panel-data estimates being truly causal) would a policy maker do nothing based on our results.

[34] India has around 600,000 villages, 44% of which would be 264,000 villages. An N of 0.06 with half the sample getting the treatment would imply that an additional 7,900 villages would be treated (3% of 264,000), which would be a very large trial by the standards of most experiments. Since covering all the remaining 264,000 villages is estimated to cost $33 million, the cost of covering 3% of the villages would be $1 million.

[35] Note that we use extremely conservative estimates for p, assigning only a 1% probability to true estimates as large as our panel-data based estimates and assigning the remaining 99% probability to finding a zero effect. If we were to assign a uniform distribution of likely point estimates between zero and our panel-data estimates (this is also conservative because we would not assign any probability to the true estimate being larger than the panel-data estimate), the expected benefit would be even larger.

Appendix Tables

Table A1.
Description of Sample: Panel Construction

Columns (1)-(3), Number of Villages: (1) Year 2003 Sample; (2) Year 2010 Sample; (3) Panel Sample.
Column (4): Reduction in Panel Size.
Columns (5)-(9), Reasons for Reduction in Panel Size: (5) More than 8 panel villages in district; (6) Village population less than 250; (7) Village population more than 10,000; (8) Village not found in Census 2001; (9) Other Reasons.

State                 (1)    (2)    (3)    (4)    (5)    (6)    (7)    (8)    (9)
Andhra Pradesh         81     87     73      8      3      0      4      1      0
Assam                  98     87     77     21      5      3      0     10      3
Bihar                  94     84     84     10     10      0      0      0      0
Chattisgarh            85     80     76      9      1      0      1      2      5
Gujarat                82     88     74      8      2      2      2      0      2
Haryana                81     81     75      6      3      1      1      1      0
Himachal Pradesh       89     80     60     29      2     22      0      4      1
Jharkhand              87     84     73     14      7      4      0      1      2
Karnataka              91     89     84      7      2      3      2      0      0
Kerala                 83     83     43     40      0      0     40      0      0
Madhya Pradesh         88     90     81      7      3      1      2      1      0
Maharastra             85     91     80      5      2      0      3      0      0
Orissa                 92     87     79     13      4      5      1      3      0
Punjab                 78     82     75      3      0      0      1      2      0
Rajasthan              91     98     85      6      1      1      0      4      0
Tamilnadu              84     87     69     15      5      0      6      4      0
Uttar Pradesh         114    113    104     10      9      1      0      0      0
Uttaranchal            80     72     57     23      6     14      1      2      0
West Bengal            85     87     70     15      4      3      5      1      2
India               1,668  1,650  1,419    249     69     60     69     36     15

Source: Authors' calculations.
Notes: The upper population cutoff for all states was 10,000 as per the 1991 census, except Kerala where the cutoff was 20,000. The category "Other Reasons" includes: replaced because of high Naxalite activity (6 villages), replaced because duplicate in 2003 sample (2 villages), replaced because district was replaced (2 villages), replaced because village too remote (1 village), replaced because name missing in 2003 list (1 village), replaced because of floods in village (2 villages), replaced because village could not be located (1 village).

Table A2.
Description of Sample: Data and Attrition

Columns (1)-(3), Year 2010 Sample: (1) Sampled; (2) Included in Analysis; (3) Attrition.
Columns (4)-(7), Reasons for Attrition (Year 2010): (4) No school in village; (5) No public school in village; (6) School(s) refused to survey; (7) Other reasons.
Columns (8)-(10), Panel Sample: (8) Sampled; (9) Included in Analysis; (10) Attrition.
Columns (11)-(12), Reasons for Attrition (Panel): (11) No data for year 2010; (12) No data for year 2003.

State                 (1)    (2)    (3)    (4)    (5)    (6)    (7)    (8)    (9)   (10)   (11)   (12)
Andhra Pradesh         87     86      1      0      0      0      1     73     70      3      1      2
Assam                  87     83      4      1      3      0      0     77     72      5      3      2
Bihar                  84     81      3      1      1      0      1     84     77      7      3      4
Chattisgarh            80     75      5      2      1      0      2     76     69      7      4      3
Gujarat                88     85      3      0      3      0      0     74     71      3      3      0
Haryana                81     80      1      0      1      0      0     75     63     12      0     12
Himachal Pradesh       80     59     21     16      5      0      0     60     43     17     16      1
Jharkhand              84     81      3      2      1      0      0     73     58     15      3     12
Karnataka              89     88      1      0      1      0      0     84     82      2      1      1
Kerala                 83     65     18      0      5     12      1     43     31     12      8      4
Madhya Pradesh         90     88      2      0      1      0      1     81     78      3      2      1
Maharastra             91     83      8      1      3      0      4     80     73      7      7      0
Orissa                 87     83      4      2      1      0      1     79     73      6      3      3
Punjab                 82     80      2      1      1      0      0     75     71      4      2      2
Rajasthan              98     94      4      1      2      0      1     85     83      2      2      0
Tamilnadu              87     79      8      1      5      0      2     69     62      7      5      2
Uttar Pradesh         113    111      2      0      2      0      0    104    100      4      2      2
Uttaranchal            72     67      5      1      3      0      1     57     52      5      4      1
West Bengal            87     87      0      0      0      0      0     70     69      1      0      1
India               1,650  1,555     95     29     39     12     15  1,419  1,297    122     69     53

Source: Authors' calculations.
Notes: The category "Other reasons" includes: high Naxalite activity, village not reachable, schools not functional, and schools closed in all three visits. In 2003, if a village did not have any schools, surveyors went to the neighboring village. In 2010, the village was simply recorded as having no school.

Table A3.
Description of Sample: Final Sample

Columns (1)-(3), Year 2010 Sample: (1) Number of villages; (2) Number of schools; (3) Number of teachers.
Columns (4)-(8), Panel: (4) Number of villages; (5) Number of schools in 2003; (6) Number of schools in 2010; (7) Number of teachers in 2003; (8) Number of teachers in 2010.

State                  (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)
Andhra Pradesh          86     130     509      70     107     107     372     405
Assam                   83     150     525      72     122     134     437     473
Bihar                   81     124     757      77     112     119     341     731
Chattisgarh             75     100     450      69      94      92     259     412
Gujarat                 85     119     944      71     101      98     419     798
Haryana                 80     105     520      63      85      83     386     395
Himachal Pradesh        59      70     270      43      44      51     172     205
Jharkhand               81     132     493      58      76      94     244     374
Karnataka               88     120     572      82     117     112     598     530
Kerala                  65     105     608      31      57      50     353     307
Madhya Pradesh          88     146     476      78     116     133     367     427
Maharastra              83      98     495      73      96      88     441     451
Orissa                  83     114     483      73      88     101     295     439
Punjab                  80      88     469      71      75      76     355     417
Rajasthan               94     141     671      83     132     121     497     565
Tamilnadu               79      96     445      62     124      75     455     363
Uttar Pradesh          111     135     616     100     131     119     442     542
Uttaranchal             67      73     207      52      61      57     177     151
West Bengal             87     151     668      69     108     121     331     531
India                1,555   2,197  10,178   1,297   1,846   1,831   6,941   8,516

Source: Authors' calculations.
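As a consistency check on the sample construction described in Appendix A, the sketch below tallies the India totals reported in Tables A1 and A2. It only restates the arithmetic implied by those tables; the dictionary keys and variable names are illustrative and are not taken from the survey files.

```python
# India totals from Table A1: rural PSUs sampled in 2003 and reasons for removal.
rural_psus_2003 = 1668
removed_from_frame = {
    "more_than_8_panel_villages_in_district": 69,
    "population_below_250": 60,
    "population_above_upper_cutoff": 69,
    "not_found_in_census_2001": 36,
    "other_reasons": 15,
}
panel_sampled = rural_psus_2003 - sum(removed_from_frame.values())

# India totals from Table A2: 2010 cross-section sample and reasons for attrition.
sample_2010 = 1650
attrition_2010 = {
    "no_school_in_village": 29,
    "no_public_school_in_village": 39,
    "schools_refused_survey": 12,
    "other_reasons": 15,
}
analysis_2010 = sample_2010 - sum(attrition_2010.values())

# India totals from Table A2: panel attrition by survey round.
panel_attrition = {"no_data_for_2010": 69, "no_data_for_2003": 53}
final_panel = panel_sampled - sum(panel_attrition.values())

print(panel_sampled, analysis_2010, final_panel)
```

The three totals correspond to the 1,419 panel villages sampled, the 1,555 villages in the 2010 analysis, and the final panel of 1,297 villages.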