70575

Bulgaria: The impact of school closures on dropout rates
Main results and lessons for the future [1]

This note sums up results and lessons learned from an impact assessment of school closures on dropout rates in Bulgaria. The assessment was undertaken jointly by the Government of Bulgaria’s Impact Evaluation Task Force and the World Bank. The assessment was undertaken to guide policy, but also to provide on-the-job training to the Task Force members in the methods and techniques of impact evaluations. This note was compiled to share the results with a larger audience, and therefore focuses more on results and lessons learned from the process and less on the methods and technical aspects.

January 2010
Unpublished working group document

[1] On the World Bank side, the impact assessment was a joint effort by Norbert Schady, Lars Sondergaard, task team leader Christian Bodewig, and consultant Thomas Pave Sohnesen. This note was prepared by Thomas Pave Sohnesen.

Executive Summary

Too many students of mandatory school age in Bulgaria drop out of school. This note is primarily about the impact of school closures on dropout rates, but it should also be seen in the larger context whereby Bulgaria has an important challenge in keeping children of mandatory school age in schools. Irrespective of school closures, 3.8 percent of students of mandatory school age dropped out in 2007.

The main challenge in impact evaluation is finding an appropriate counterfactual. The best estimate of impact of school closure is the difference between dropout rates in closed schools and at comparable schools (counterfactuals). The difference between the dropout rate at closed schools and the average dropout rate, for instance, is not a good estimate of impact, since the average school is unlikely to be an appropriate comparison group. Hence, identifying good counterfactuals is a critical element in conducting an impact assessment.

School closures resulted in dropout rates that were around two times higher in 2007 and 2008. Different methods, different specifications and different counterfactuals all give similar results, showing that dropout rates in closed schools are around two times higher, or more, because of the closure.

Although school closures lead to higher dropout rates, most of the students dropping out in 2007 and 2008 did so for other reasons. In fact, the total number of students dropping out from schools that were closed down accounts for only 2 percent of total dropouts in 2007 and 7 percent in 2008. The somewhat higher number in 2008 can be attributed to the fact that the schools which closed in 2008 were slightly larger than those that closed in 2007.

This study is an example of policy relevant research that can be undertaken with the existing rich data. The impact assessment looked at a negative consequence of school closure. However, existing data could also be used to look at potential positive aspects of the school closures. For instance, a future study could use data on learning outcomes of 4th, 5th and 6th graders to examine whether school closures resulted in better learning outcomes for students that moved to larger schools.

Better data access and standards could facilitate more policy relevant research at little cost. Despite good intentions, the process of collecting and joining data from several sources was not without challenges. It took the team five months from initiation before actual analysis could begin. The Government should consider widening access to the very rich data that is collected, so that researchers inside and outside Bulgaria can learn as much as possible from it. Developing standards so that ministerial data can be used with ease in conjunction with data from the National Institute of Statistics should also be a priority, as combining data from several sources would offer much greater value than stand-alone data.

To better understand the reasons behind the increase in dropout, more information about students will be needed, including information regarding their socio-economic background and the language spoken at home. Interviews with regional inspectorates in Sofia and Lovech suggest that many of the students who dropped out were Roma. However, existing databases do not allow this dimension to be explored further. One way to rectify this would be to collect data on the language spoken at home and/or proxies for the socio-economic background of students (e.g. parents’ education).

1. Background

This note summarizes results and lessons learned from an assessment of the impact of school closures on dropout rates in Bulgaria undertaken by the World Bank and the Government of Bulgaria’s Impact Evaluation Task Force. The assessment was undertaken to guide policy, but also to provide on-the-job training to the Task Force members in the methods and techniques of impact evaluations. This note focuses mainly on results and lessons learned from the process and less on methods and technical aspects.

Knowing the impact of school closures on dropout rates is relevant as many schools have already closed down and more are expected to do so. The number of schools in Bulgaria has been declining steadily during the last decade, and more schools have been allowed to close in recent years. In 2007 around 100 schools providing general education were shut, while nearly 300 (around 15 percent of all schools) were closed over the summer of 2008. The consolidation process is expected to continue, with more school closures over the coming years. Determining whether the school closures have any negative impact in terms of students leaving school is therefore of key interest to policy makers.

The focus is on the impact of school closures on dropout rates among students from grade 1 through 8 in “normal” public schools.
For this assessment the team was provided with an electronic database (extracted from the Ministry of Education and Science’s Education Management Information System) with information at school and class level for all students in Bulgaria. To focus on what we considered the most important issue, students that were not full time in “normal” schools were excluded from the analysis. This means that adult students doing part time studies and young people studying at “special” schools (for instance, prisons, hospitals, and private schools) were excluded from the analysis. Further, for this note we generally only present results regarding primary schools (grade 1 to 4), basic schools (grade 1 to 8), and lower secondary schools (grade 5 to 8). We focus on these schools because they cover the years of mandatory schooling (grade 1 to 8), and they represent the bulk of school closures that have taken place. Comprehensive schools (grade 1 to 12) also cover the mandatory school age, but only one school of this type closed in 2007 and two in 2008.

2. The challenge of estimating the impact of school closures on dropout rates

The main challenge in estimating the impact of any intervention is finding a good comparison (counterfactual): what would the outcome have been without the intervention? The estimated impact of any policy intervention is the difference between the outcome with the program and without it. However, we can never observe anyone participating and not participating in a program at the same time. Hence, we look for other outcomes that we think are reasonable comparisons; this is called the counterfactual.

Dropout rates in schools that closed were more than two times higher than in schools that remained open in 2007 and 2008. In the summer of 2007 (end of the 2006/2007 school year) around 100 schools providing general education were closed down.
By counting the number of students that were enrolled in another school the following year we can estimate the dropout rate for the closed schools [2]. In 2007 the average dropout rate was 14.9 percent in schools that closed, compared to 6.2 percent observed among schools that did not close in 2007 or 2008 [3]. In the summer of 2008 around 300 schools were closed down. Among these schools 11.3 percent dropped out on average, compared to 4.9 percent in schools that did not close (Table 1). It should be noted that these dropout rates are the average across schools and hence give relatively higher weight to the larger number of small schools. The national average dropout rates can be seen in table a1 in the appendix.

Table 1 Average school dropout rates in schools that closed and remained open, 2006-2008

| Year | Schools not closed | School closed in 2007 | School closed in 2008 |
|------|--------------------|-----------------------|-----------------------|
| 2007 | 6.2% | 14.9% | 8.3% |
| 2008 | 4.9% | | 11.3% |

Source: Ministry of Education and authors’ calculations.
Note: The dropout rate is for daytime students in grade 1 to 7 in public schools.

However, closed schools were different from the “average” schools in important ways. Therefore, finding an appropriate group of schools with which to compare the closed schools (i.e., a “counterfactual”) is critical. Schools that closed were not typical based on several observed characteristics. Compared to those that remained open, the closed schools were characterized by the following: fewer students, lower student-teacher ratios, more likely to be located in rural areas, and more likely to be located in municipalities with higher poverty rates and a lower population density (table 2 in appendix). These characteristics could explain some or all of the higher dropout rate that we observe among closed schools. Hence, comparing the dropout rates for closed schools with the national average dropout rate would give a misleading estimate of the impact of school closure.
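
To make the weighting point about Table 1 concrete, the following is a minimal Python sketch of how a per-school dropout rate could be computed from enrollment counts, and how the unweighted average across schools differs from the pooled (student-weighted) rate. The three schools, their names, and their counts are hypothetical illustrations, not figures from the Ministry data, and the field names are assumptions rather than the actual database layout.

```python
from dataclasses import dataclass


@dataclass
class School:
    name: str        # hypothetical label
    enrolled: int    # students enrolled in year t (graduating classes excluded)
    reenrolled: int  # of those, students found enrolled in some school in year t+1


def dropout_rate(school: School) -> float:
    """Share of a school's students not found enrolled the following year."""
    return (school.enrolled - school.reenrolled) / school.enrolled


# Hypothetical data: two small schools with high dropout, one large school with low dropout.
schools = [
    School("small_a", 20, 16),
    School("small_b", 25, 20),
    School("large_c", 450, 441),
]

# Average across schools: every school counts equally, so small schools weigh heavily.
school_average = sum(dropout_rate(s) for s in schools) / len(schools)

# Pooled rate: every student counts equally.
total_enrolled = sum(s.enrolled for s in schools)
total_dropouts = sum(s.enrolled - s.reenrolled for s in schools)
pooled_rate = total_dropouts / total_enrolled

print(f"average across schools: {school_average:.1%}")
print(f"pooled student rate:    {pooled_rate:.1%}")
```

In this made-up example the average across schools is far above the pooled rate, because the two small schools with high dropout rates count as much as the large one. This is the same mechanism that makes the school averages in Table 1 differ from the national rates reported in the appendix.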

Based on observed characteristics we can compare closed schools to similar ones that stayed open. Several statistical methods and techniques exist that can help isolate the true impact of school closure. One method is “matching”, which first identifies other schools that have similar observed characteristics and then compares the dropout rates in these schools to those that closed. Another method is regression-based analysis where observed characteristics are used to control for the impact of each characteristic on dropout rates. This method thereby isolates the impact that can be attributed to the school closure as opposed to any other factor. A caveat to all methods is that we can only isolate the true impact of school closure to the extent that we have similar schools to compare with and that the only main difference is whether they closed down or not. Further, we also need to be able to identify these schools based on observed characteristics. For instance, if the leadership skills of the principal or the physical condition of the school are important factors for closing one school over another, we will not be able to match or control for these factors as they are not observed characteristics in our data set.

For 2007 we estimate the impact of school closure based on two groups of counterfactuals: 1) open schools that have the same characteristics as schools that closed down, and 2) schools that were open in 2007 but closed the following year, in 2008. This latter comparison is based on the idea that schools that close have something in common (both observed and unobserved factors), and that the year of closure would be partially random. Table a2 in the appendix shows that schools that closed in 2008 were largely located in similar areas to those that closed in 2007, but they were generally larger than those that closed in 2007.

[2] Schools receive subsidies per student and therefore have an incentive to be very exact in their reporting of enrolled students to the Ministry of Education. The risk that these numbers are significantly biased by students moving between schools is therefore considered minimal.

[3] The estimate excludes students that were attending graduating classes, and therefore could not be expected to be enrolled the following year.

The availability of two consecutive years allows for a comparison of dropout rates before and after school closure for all schools that closed in 2008. This mitigates the challenge of finding appropriate schools for comparison. In the available data we have dropout rates for 2007 and 2008. Hence for those that closed in 2008 we can observe the dropout rate before and after they closed. By looking at the difference between dropout rates in 2007 and 2008 we also have an estimate of the impact of school closure. The advantage of this comparison is that we implicitly control for unobserved characteristics that have not changed over time. Factors such as the skills of the principals and teachers, physical quality of the school, and background of students and parents are unlikely to have changed dramatically during the two years in the same schools. In practical terms we look at the change in dropout rates between the two years and then apply the same methods described above to estimate the impact of school closure. This method is called “difference-in-difference”.

3. Results on dropout rates and school closures

Irrespective of school closures, 3.8 percent of students of mandatory school age dropped out in 2007. Hence keeping students of mandatory school age in schools is an important challenge in Bulgaria. This note is primarily about the impact of school closures on dropout rates; however, it should also be seen in the larger context in which Bulgaria has an important challenge in keeping children of mandatory school age in schools.
The Ministry of Education’s databases indicate that in schools that did not close, 3.8 percent of children that were enrolled in grades 1 to 7 in primary, basic, lower secondary, and comprehensive general schools in the 2006/07 school year did not enroll the following year.

Lower dropout rates in 2008 compared to 2007 could be due to stricter control of schools rather than a real reduction in the number of students out of school. Dropout rates, both for schools that closed and for those that remained open, are generally higher in 2007 than in 2008 (see table 1 above). This could be due to many factors. One factor that likely explains some of this difference is the stricter control of enrolled students that the Ministry of Education implemented between the two years. Since schools receive payment based on their number of students, schools have a strong incentive to enroll as many students as possible. Unfortunately this resulted in schools “enrolling” students that were generally not present. Stricter control likely had the effect that these ghost students were no longer enrolled and therefore appear as dropouts in 2007. It is noteworthy that even though they are not included as dropouts in 2008, these children are still out of school.

Compared to similar schools, closing a school resulted in dropout rates around two times higher than in schools that did not close. The higher dropout rate observed among closed schools could be because they were different from schools that remained open and not due to the fact that they were closed. However, comparing dropout rates in schools that closed to schools that have similar characteristics but did not close reveals that dropout rates were around two times higher in closed schools in both years. Tables 3 and 4 in the appendix show the estimated impact based on different methods and comparisons.
Fortunately the results are not sensitive to the choice of method, and their robustness across different specifications suggests that bias from unobservables might not be a major issue.

Comparing schools before and after school closure also shows dropout rates around two times higher due to the school closure. Using the difference-in-difference method, in which we implicitly control for all the unobserved school characteristics that have not changed over time by looking at the difference in dropout rate from one year to another, we again estimate that the dropout rate was around two times higher due to school closure (see table a4 in appendix). It is reassuring that despite different levels of dropout rates and the use of different methods, we obtain a similar result in terms of estimated impact.

Although school closures lead to higher dropout rates, most of the students dropping out in 2007 and 2008 did so for other reasons. In fact, the total number of students dropping out from schools that were closed down accounts for only 2 percent of total dropouts in 2007 and 7 percent in 2008. The somewhat higher number in 2008 can be attributed to the fact that the schools which closed in 2008 were slightly larger than those that closed in 2007. In both years, the schools that were closed were generally small schools. Therefore, total enrollments in these schools make up a relatively small share of the absolute number of students.

Dropout rates vary substantially across schools. Not all schools have high dropout rates. For instance, 17 and 14 percent of schools had no dropouts in 2007 and 2008 respectively, while at the same time 14 and 10 percent had a dropout rate of over 30 percent in those same years. Based on regressions it also appears that rural and small schools have higher dropout rates. Higher grade levels also seem to have higher dropout rates. Figure 1 illustrates dropout rates and impact of school closures across grades in 2008.
While dropout rates are higher in higher grades, the impact appears roughly proportional to the dropout rate among students in schools that did not close. This corresponds well with the repeated result that dropout rates are around two times higher.

Figure 1: Impact of school closure across grades, 2008
[Chart: rates from 0.0% to 16.0% for Grade 1 through Grade 7, shown separately for open schools and closed schools, with the impact of school closure marked for second graders and for fifth graders.]
Note: The estimated impact is based on difference-in-difference estimates for 2008.

However, a better understanding of which students drop out is key, with the Roma population being a clear example. Knowing which types of schools have a high dropout rate is informative, but only takes us part of the way towards designing good policy interventions. For instance, even among small and rural schools there is large variation across schools. Hence other characteristics that are not available for this study could be important. Interviews with Regional Inspectors in Sofia and in Lovech gave a clear indication to the study team that Roma students are particularly at risk of dropping out. Education attainment data (figure 1 in appendix) confirms that many Roma children do not complete general education. However, is dropping out of school only a problem among Roma? Do school closures impact Roma children differently than other students? These and other policy-relevant questions can only be answered with more in-depth analysis at the student level.

4. Lessons for the future

This impact analysis of school closures illustrates how policy could be guided by utilizing already existing data, while more analysis of potential benefit to policy makers could also be done in this area. The impact assessment shows that closing schools is not without consequence for the share of students dropping out.
This analysis could be broadened and used to look further into dropouts at the student and school level to provide more detailed guidance. Other aspects such as learning outcomes could also be assessed. Did students that moved from closed schools improve their learning outcomes because they now study at larger and better-resourced schools? Do students that stayed in protected small schools have worse educational outcomes than students that moved? There are many options for further study here.

Ongoing school closures in 2009 could be used to better understand the impact of the policy. What interventions might prevent students from dropping out as a consequence of the closures? To answer this question well, a study should ideally be initiated before decisions are made on which schools will close (or before the school is informed about the decision), or at least before the physical closure of the school. This is because being able to observe students both before and after school closure is key to understanding the main factors that lead to school dropouts.

Improved data access is needed to improve the timeliness and volume of policy relevant research. Since this study was undertaken as a joint study between the World Bank and GoB, we were given unprecedented access to valuable student level data. However, despite good intentions the process of collecting and merging data from several sources was not without challenges. It took the team five months from initiation before actual analysis could begin [4]. To accelerate this phase in future studies, the GoB may wish to consider widening data access so that researchers inside and outside Bulgaria can learn as much as possible from the existing data. Developing standards so that data from Ministries and the National Institute of Statistics, for instance, can be merged with ease should also be a priority.
Combining data from several sources can multiply the value of stand-alone data, but without some standardization this process can be unnecessarily complicated.

To better understand the reasons behind the increase in school dropouts, more information about students will be needed, including information regarding their socio-economic background and the language spoken at home. Interviews with regional inspectorates in Sofia and Lovech suggest that many of the students who dropped out were Roma. However, existing databases do not allow this dimension to be explored further. One way to rectify this would be to collect data on the language spoken at home and/or proxies for the socio-economic background of students (e.g. parents’ education).

[4] Even now certain data quality issues remain unresolved. However, we do not believe that these issues could challenge the significant impact we observe of school closures.

5. Appendix

Table 1 National student dropout rates in schools that closed and remained open, 2006-2008

| Year | Group | Schools not closed | School closed in 2007 | School closed in 2008 |
|------|-------|--------------------|-----------------------|-----------------------|
| 2007 | All schools | 5.0% | 15.6% | 7.4% |
| 2007 | Rural | 7.3% | 16.5% | 8.8% |
| 2007 | Urban | 4.0% | 12.7% | 3.6% |
| 2007 | Primary | 4.1% | 11.4% | 5.3% |
| 2007 | Basic and Lower secondary | 5.1% | 16.6% | 7.8% |
| 2008 | All schools | 4.1% | | 10.5% |
| 2008 | Rural Schools | 5.8% | | 12.3% |
| 2008 | Urban Schools | 3.2% | | 6.1% |
| 2008 | Primary Schools | 3.0% | | 6.2% |
| 2008 | Basic and Lower secondary Schools | 4.2% | | 11.3% |

Table 2 Mean differences, characteristics of schools and municipalities

| | Number of students | Teacher-student ratio | Share of schools in urban area | Poverty rate | Percentage Roma | Population density of municipality |
|---|---|---|---|---|---|---|
| (1) Schools closed down in 2007 | 41 | 11.5 | 11.6% | 18.8% | 6.0% | 90 |
| (2) Schools closed down in 2008 | 62 | 11.2 | 10.5% | 17.1% | 6.7% | 81 |
| (3) Schools never closed down | 211 | 12.7 | 36.9% | 16.1% | 6.1% | 223 |
| Statistical test of significant difference | | | | | | |
| Difference: (1)-(2) | (3.44)** | (0.48) | (2.46)* | (0.31) | (1.27) | (0.48) |
| Difference: (1)-(3) | (8.15)** | (3.10)** | (5.04)** | (3.68)** | (0.22) | (2.50)* |
| Difference: (2)-(3) | (12.27)** | (6.33)** | (8.95)** | (2.21)* | (1.78) | (4.61)** |

Note: Differences show the t-value for a test of equal mean. * means significant at 5 percent level, ** significant at 1 percent level.

Table 3 Impact of school closure on dropout rates in 2007

| Comparison group | Method | Column | Specification | Impact of school closure on dropout rate | (SE) | N | R-squared |
|---|---|---|---|---|---|---|---|
| Schools closed down in 2008 | OLS | (1) | No controls | 0.066 | (0.012)** | 382 | 0.07 |
| Schools closed down in 2008 | OLS | (2) | School controls | 0.073 | (0.013)** | 370 | 0.10 |
| Schools closed down in 2008 | OLS | (3) | School controls + location & region dummies | 0.078 | (0.014)** | 364 | 0.22 |
| Schools closed down in 2008 | Matching | (4) | Abadie-Imbens | 0.077 | (0.021)** | 364 | |
| Schools not closed down | OLS | (1) | No controls | 0.088 | (0.007)** | 1640 | 0.08 |
| Schools not closed down | OLS | (2) | School controls | 0.088 | (0.008)** | 1613 | 0.15 |
| Schools not closed down | OLS | (3) | School controls + location & region dummies | 0.092 | (0.008)** | 1565 | 0.24 |
| Schools not closed down | Matching | (4) | Abadie-Imbens | 0.124 | (0.022)** | 1565 | |

Notes. The table shows the estimated impact of school closure on dropout rates. Column 1 is a regression of dropout rate on school closure. Column 2 is the same as column 1 plus the following school variables: school size, average class size, dummy for urban schools, and type of school. Column 3 is the same as column 2 plus the following municipal variables: poverty rate, population density, share of population being Turkish, share of population being Roma, dummies for all regions. Column 4 is nearest neighbor matching using the same variables as in column 3. Estimates with * are significant at 5%; ** significant at 1%.

Table 4 Impact of school closure on dropout rates in 2008

| Comparison | Method | Column | Specification | Impact of school closure on dropout rate | (SE) | N | R-squared |
|---|---|---|---|---|---|---|---|
| Schools not closed down | OLS | (1) | No controls | 0.064 | (0.004)** | 1832 | 0.11 |
| Schools not closed down | OLS | (2) | School controls | 0.062 | (0.005)** | 1809 | 0.15 |
| Schools not closed down | OLS | (3) | School controls + location & region dummies | 0.062 | (0.005)** | 1757 | 0.23 |
| Schools not closed down | Matching | (4) | Abadie-Imbens | 0.064 | (0.010)** | 1757 | |
| Difference in difference | OLS | (1) | No controls | 0.043 | (0.004)** | 1832 | 0.06 |
| Difference in difference | OLS | (2) | School controls | 0.047 | (0.006)** | 1809 | 0.05 |
| Difference in difference | OLS | (3) | School controls + location & region dummies | 0.048 | (0.005)** | 1757 | 0.09 |

Notes. The table shows the estimated impact of school closure on dropout rates. Column 1 is a regression of dropout rate on school closure. Column 2 is the same as column 1 plus the following school variables: school size, average class size, dummy for urban schools, and type of school. Column 3 is the same as column 2 plus the following municipal variables: poverty rate, population density, share of population being Turkish, share of population being Roma, dummies for all regions. Column 4 is nearest neighbor matching using the same variables as in column 3. Estimates with * are significant at 5%; ** significant at 1%.

Figure 1 Educational attainment: Years of education completed by ethnicity (20-28 yr olds)
[Chart: share (%) by years of completed schooling, 1 to 14, for Bulgarian, Roma, and Turkish.]

Distribution for 2007 of dropout rates for closed and open schools
[Density plot: dropout rate (%), 0 to 40, for open schools and closed schools.]

Distribution for 2008 of dropout rates for closed and open schools
[Density plot: dropout rate (%), 0 to 40, for open schools and closed schools.]
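
As a rough cross-check on Tables 3 and 4, the "no controls" estimates can be approximated by simple arithmetic on the average school dropout rates reported in Table 1 of the main text. The Python sketch below is illustrative only: it is not the estimation code used for the assessment, the dictionary keys and function names are assumptions made for this example, and small discrepancies from the regression coefficients are to be expected because the published rates are rounded.

```python
# Average school dropout rates from Table 1 of the main text, as fractions.
# Keys are (group, year); the group labels are hypothetical names for this sketch.
rates = {
    ("not_closed", 2007): 0.062,
    ("closed_2007", 2007): 0.149,
    ("closed_2008", 2007): 0.083,
    ("not_closed", 2008): 0.049,
    ("closed_2008", 2008): 0.113,
}


def gap(closed_group: str, comparison_group: str, year: int) -> float:
    """Simple difference in average dropout rates within one year."""
    return round(rates[(closed_group, year)] - rates[(comparison_group, year)], 3)


def diff_in_diff(closed_group: str, comparison_group: str, before: int, after: int) -> float:
    """Change over time in the closed group minus the change in the comparison group."""
    change_closed = rates[(closed_group, after)] - rates[(closed_group, before)]
    change_comparison = rates[(comparison_group, after)] - rates[(comparison_group, before)]
    return round(change_closed - change_comparison, 3)


# 2007 closures compared with schools that closed a year later (Table 3, first column 1).
gap_2007_vs_closed_2008 = gap("closed_2007", "closed_2008", 2007)
# 2007 closures compared with schools that stayed open (Table 3, second column 1).
gap_2007_vs_open = gap("closed_2007", "not_closed", 2007)
# 2008 closures compared with schools that stayed open (Table 4, first column 1).
gap_2008_vs_open = gap("closed_2008", "not_closed", 2008)
# 2008 closures, before/after change relative to open schools (Table 4, difference in difference).
did_2008 = diff_in_diff("closed_2008", "not_closed", 2007, 2008)

print(gap_2007_vs_closed_2008, gap_2007_vs_open, gap_2008_vs_open, did_2008)
```

With the rounded Table 1 figures, the gaps come out at 0.066, 0.087, and 0.064, and the difference-in-difference at 0.043. These are equal or close to the corresponding no-controls coefficients in Tables 3 and 4 (0.066, 0.088, 0.064, and 0.043), which is what one would expect when a dropout rate is regressed on a single closure indicator. The columns with controls and the matching estimates cannot be reproduced this way, since they require the school-level data.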