104246

Student performance and attendance in Moldova from a socio-economic perspective

This technical note was prepared at the request of the Ministry of Education of Moldova by Anna Olefir (TTL, Operations Officer, GEDDR), Tobias Stohr, Tom Coupe and Anatol Gremalschi (Consultants). The note also benefited from comments and suggestions by Andrea Guedes (Senior Operations Officer), Igor Kheyfets (Economist) and Lucas Gortazar (Consultant). The funding for this study was provided by the Poverty and Social Impact Analysis (PSIA) Multi-Donor Trust Fund (MDTF).

December 2015

Table of Contents

Executive Summary
Introduction
Descriptive Analysis of Student Performance and Attendance by Socio-Economic Background and School Environment
Determinants of Student and School Performance
Risks Associated with the School Network Consolidation Reform
Conclusions and Policy Recommendations
Literature
Annex 1: Description of data
Annex 2: Methodological approach
Annex 3: Overview of important variables
Annex 4: Additional tables

Executive Summary

Despite recent progress in reducing poverty and promoting shared prosperity, Moldova remains one of the poorest countries in Europe. Based on the Europe and Central Asia (ECA) standardized poverty line of US$5/day, 39 percent of the population was poor in 2013. Poverty in Moldova remains a rural phenomenon with large spatial disparities: although 57 percent of the population lives in rural areas, 84 percent of the poor are concentrated there. The groups most at risk of poverty remain those with low education levels, households with three or more children, rural residents, families relying on self-employment, the elderly, and Roma.

In Moldova, the bottom 40 percent are particularly affected by weaknesses in the quality and efficiency of education services, which limit their income-generating opportunities. The Life in Transition Survey (2010) indicates that satisfaction with public primary and secondary education was lower among the bottom 40 percent than among the top 60 percent. Moldova’s results in PISA 2009 Plus showed that students from rural schools (where 57 percent of Moldova's children study and where most of the socio-economically disadvantaged bottom 40 percent are found) lag behind their peers in the largest cities, such as the capital Chisinau, by more than three years of schooling. Education is an important factor influencing the level of poverty.
Not surprisingly, improving the quality, relevance, and efficiency of the education system is one of the main priorities of the Government of Moldova. However, the country's demographic and fiscal realities have made it difficult for the government to fulfill its mandate in education. Over the years, Moldova’s education sector has suffered from uneven quality and a lack of efficiency. From 1991 to 2010, Moldova’s school-age population decreased by over 50 percent while the school network and staffing barely changed, and spending on education rose to 9.4 percent of GDP (twice the regional average) without visible improvements in student learning outcomes. The PISA 2009 Plus results showed that the performance of the country's 15-year-olds in reading, math, and science is among the lowest in Europe, equivalent to a loss of about two years of schooling. Around 60 percent of Moldova's 15-year-olds lack the basic levels of proficiency in reading and math literacy needed to participate effectively and productively in society.

In response, the Government of Moldova, through its ambitious education reform program approved in 2011, has reemphasized the need for quality, efficiency, and evidence-based policy making. Over the last few years, the Government has taken a number of critical steps: it launched the school network consolidation reform to right-size the school network; adopted per-student financing nationwide; gave schools autonomy in managing financial and human resources; modernized the Baccalaureate exam system and strengthened its integrity as a merit-based way of determining university admission; developed the student-level Education Management Information System (EMIS); and took proactive steps to open education data in order to strengthen citizens’ oversight of reforms, among many others.
The reforms have started to produce positive results, including a more efficient school network and significant savings that could and should be invested in quality-enhancing initiatives. School rationalization has led to an adjustment in the school network, including halting the inefficient fragmentation of resources across schools that are too small to function efficiently and with acceptable quality. The largest efficiency gains, however, can be and are being made through the reorganization and closure of schools in rural areas. This is also where most socio-economically disadvantaged students live and study. The reform offers an opportunity to provide these children with a better education in larger and better-resourced schools, but the transition to a new school environment also creates risks for both student attendance and performance, in particular for poor and disadvantaged students.

An analysis of the performance and attendance of socio-economically disadvantaged students in Moldova vis-à-vis their better-off peers, of the factors that influence performance and attendance, and of the risks associated with school network consolidation is thus of paramount importance to inform the ongoing education reforms and mitigate arising risks (ensuring that students directly affected by the reform are adequately accommodated in the ‘receiving’ schools). This report analyzes the following key questions: (i) How does the performance (as well as attendance/absenteeism) of socially disadvantaged Moldovan secondary students compare to that of their better-off peers? (ii) What are the determinants of student performance and attendance? (iii) What is the impact of school closures on dropouts and absenteeism?

To explore the effect of various factors, hierarchical regression models were used to estimate how much of the variation in performance is predicted by student characteristics (e.g.
individual socioeconomic status, gender, ethnicity, urban/rural location or vulnerability status) and by school attributes (such as class size, student-teacher ratio, proportion of qualified teachers, and socioeconomic school composition). An education production function, which models the relationship between student test scores¹ and their predictors in the form of individual or school-level characteristics, was used because it allows the impact of school characteristics to be distinguished from that of student characteristics.

While the student-level EMIS and PISA 2009 Plus datasets contain valuable data on various aspects of schooling in Moldova, the EMIS is not without shortcomings, in particular regarding the completeness of the data on student vulnerabilities, which limits the possible analysis. Data from Moldova’s participation in the 2015 PISA round (supported by the World Bank Moldova Education Reform Project) should be released at the end of 2016 and will considerably improve the data available to education policy makers for future analysis of the role of socio-economic background in education.

As to the risks associated with school network consolidation, while the EMIS records the reasons for dropouts, schools are unlikely to report that students drop out because of the reform. Indirect evidence is therefore used to gauge the realities on the ground, namely the dropout rates of students from reformed and non-reformed schools. Regression analyses of dropouts were not performed because the very small sample size would make them uninformative. We also explored student absence in a multivariate framework, using the number of hours of absence per year as the outcome variable and, as independent variables, being from a closed school, being from a class that was closed, and individual characteristics such as gender, risk factors, etc.
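As an illustration of the idea behind the hierarchical models mentioned above, the sketch below splits the variance of test scores into a between-school and a within-school part using a simple one-way ANOVA estimator. It is not the specification used in this note: the data are synthetic, the design is assumed to be balanced (the same number of students in every school), no student or school covariates are included, and the function names `between_school_share` and `simulate_scores` are hypothetical.

```python
import random
from statistics import fmean


def between_school_share(scores_by_school):
    """One-way ANOVA estimate of the between-school share of score variance.

    Assumes a balanced design: every school has the same number of students.
    """
    groups = list(scores_by_school.values())
    k = len(groups)      # number of schools
    n = len(groups[0])   # students per school
    if k < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need at least 2 schools of equal size (>= 2 students)")

    school_means = [fmean(g) for g in groups]
    grand_mean = fmean(school_means)  # equals the overall mean when balanced

    # Mean squares between and within schools.
    ms_between = n * sum((m - grand_mean) ** 2 for m in school_means) / (k - 1)
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, school_means) for x in g
    ) / (k * (n - 1))

    # Method-of-moments variance components; a negative estimate is set to 0.
    var_between = max((ms_between - ms_within) / n, 0.0)
    total = var_between + ms_within
    return var_between / total if total > 0 else 0.0


def simulate_scores(n_schools=400, n_students=25, share=0.3, sd=100.0, seed=1):
    """Synthetic scores in which `share` of the variance lies between schools."""
    rng = random.Random(seed)
    data = {}
    for school in range(n_schools):
        school_effect = rng.gauss(0.0, sd * share ** 0.5)
        data[school] = [
            500.0 + school_effect + rng.gauss(0.0, sd * (1 - share) ** 0.5)
            for _ in range(n_students)
        ]
    return data


estimated = between_school_share(simulate_scores())
print(f"estimated between-school share of variance: {estimated:.2f}")
```

With the simulated share set to 0.3, the estimate should come out close to that value. A full hierarchical regression would add student-level and school-level predictors to this decomposition and report how much of each variance component they explain.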
The analysis of how learning outcomes are distributed throughout the school system, and of the risks associated with school network consolidation (namely, dropouts and absenteeism), provides a number of valuable policy insights that could guide efforts to ensure the delivery of quality teaching and learning across the entire education system.

¹ 4th and 9th grade test results, Baccalaureate exam scores in math and Romanian after grade 12, and PISA 2009 Plus tests for 15-year-olds in math, reading and science.

Key findings and policy implications:

• Students’ average performance in the PISA test depends strongly on their socio-economic status, and the most disadvantaged students do not attain even the baseline proficiency levels in reading, math and science. While Moldovan students from the top 60 percent are on average as well-off as their international PISA peers [index of economic, social and cultural status (ESCS): 0.07], the bottom 40 percent are significantly more disadvantaged by international standards (ESCS: -1.51). The ESCS index indicates that 93.4 percent of students who participated in PISA internationally are better off than the average Moldovan student from the bottom 40 percent. As to performance, the bottom 40 percent of students according to the ESCS have an average reading score 45 points below that of their less disadvantaged classmates, and in math and science the gaps are 46 and 43 points respectively (equivalent to slightly more than a year of schooling). The most disadvantaged Moldovan students are particularly weak in the “reflect and evaluate” and “non-continuous text” tasks and do not attain the PISA baseline proficiency in reading, math and science. These students risk completing their studies without acquiring the skills and competencies needed to fully participate in society and continue learning throughout their lives.
EMIS grades are similarly substantially lower for vulnerable students, especially in the 4th and 9th grades and to a lesser extent in the 12th grade (as only the better-off students tend to continue to upper secondary education).

• Descriptive analysis reveals substantial performance gaps between schools, between rural and urban areas, and between genders; understanding these differences is important for designing relevant interventions. While schools with few disadvantaged students can be expected to score close to the international average of 500 points in the PISA test, schools with a large majority of disadvantaged students can be expected to have an average score of less than 400 points, corresponding to a 2.5-year gap in educational development. Girls and boys score similarly on average in math, but boys do worse than girls in science (13 points on average, or a third of a year) and especially in reading (47 points on average, or slightly more than a year of schooling). Gender overall explains about 5 percent of the variation in reading grades, suggesting the need for monitoring and gender-sensitive interventions in the relevant subjects (such as curriculum improvements through a gender lens and/or catch-up classes for boys, in particular those from disadvantaged backgrounds). Rural students lag behind their urban peers by an average of 60 points in reading, 55 in math and 48 in science, corresponding to a 1.2 to 1.5 year gap in educational development. However, the urban-rural divide is not a significant factor once other factors are controlled for in a multivariate framework, which suggests that targeted interventions should focus on low-performing schools with a high share of disadvantaged students rather than on their location. Determinants of students’ achievement interact and add up, causing very significant differences between the educational outcomes of the best and worst performing students.
For example, urban children who do not belong to the bottom 40 percent of the ESCS do particularly well, while rural students from a disadvantaged socio-economic background do particularly poorly. The differences between these groups in reading are striking: 75 points for girls (almost 2 years of schooling) and 82 points for boys (more than 2 years of schooling).

• Performance differences in Moldova from the socioeconomic perspective are largely between schools, suggesting that targeted interventions should focus on disadvantaged schools rather than on individual vulnerable students. About a third of the variation in PISA scores is linked to school-level differences. Much of the explained variation comes from school socioeconomic composition, with one standard deviation accounting for 39-42 additional PISA points. Variation in students’ socio-economic background can explain between 5 and 10 percent of the variation in grades, with one standard deviation in the ESCS index accounting for 14-15 additional PISA points. Because performance differences are largely between schools rather than within schools, targeting schools with a high share of disadvantaged students might be more advisable than targeting individual students. This might also be more practical from a financial standpoint.

• Despite large achievement gaps, Moldova is average internationally in terms of the impact of socio-economic background on PISA scores, suggesting some room for improvement in education equality.
Similarly (or even more) inequitable outcomes can be found in rich countries such as New Zealand or France, which perform well in PISA. In some other countries, however, socio-economic background has less of an impact on educational performance, which suggests some potential for reducing education inequality. Such a reduction, if considered in terms of learning opportunities, 'has the potential to produce quick gains in economic and social welfare' (Porta et al., 2011; OECD, 2012b). International evidence also suggests that substantial progress in student performance can be achieved in diverse social settings with moderate economic resources. This is of particular importance for Moldova, in light of its already high spending on education, its efforts to make the school system more efficient, and existing social disparities.

• Student absenteeism is prevalent among children at risk, suggesting that close monitoring of absenteeism for this group is important. The student risk factors provide a telling picture of the determinants of absence. Vulnerable students miss more hours, both with and without excuse. Among vulnerable students, the impact of additional risk factors depends on the grade level. As expected, absence without excuse peaks in grade 9 for both genders. Students who have to travel less than 3 km to school miss about 13 percent fewer hours, which underlines the importance of continued monitoring of attendance and dropouts in relation to the school network consolidation process using EMIS. Girls miss about 40 percent fewer hours without excuse than boys, and urban students miss fewer hours without excuse than their rural peers.

• As to the risks of school network consolidation, dropouts have so far not been an issue, but the attendance of students from reformed schools is a concern and should be carefully monitored.
After controlling for a number of individual characteristics and post-reform school differences, students from closed schools tend to miss more lessons without excuse (with possible subsequent implications for student performance). It is therefore important to continue careful monitoring of student absenteeism in schools receiving students from closed or reorganized schools in order to mitigate the relevant risks.

Main policy recommendations:

Moldova should aim to improve the quality of education and learning opportunities for all students, equitably across the entire education system. There is room for reducing education inequality in the country, and international evidence (Porta et al., 2011) suggests that substantial improvements in education quality, together with reductions in the share of poorly performing students and in the variability of results by student socioeconomic background, are achievable and lead to significant economic and social gains. Delivering equal learning opportunities to all pays off: countries with very different economic conditions and social settings have demonstrated the ability to raise the quality of educational outcomes substantially and equitably despite existing social disparities (and with moderate economic resources).
The challenges ahead are considerable in light of (i) the overall level of education system performance (around 60 percent of Moldova's 15-year-olds lack the basic levels of proficiency in reading and math, with large performance gaps between students of different socioeconomic backgrounds, across genders, schools, and urban and rural areas); (ii) the need to equip as many students as possible, Moldova’s future workforce, with at least the baseline competencies and skills that enable them to fully participate in social and economic life and continue learning throughout their lives; (iii) the complexity of the task itself, namely the need to prepare students for the new realities of a globalized world and for more rapid change than ever before (‘for jobs that have not yet been created, for new technologies and challenges that will appear’, OECD, 2010b); and (iv) the complex nature of the various education policies that need to be aligned, integrated and maintained over a sustained period of time.

In light of the existing gender gaps in student performance, relevant interventions are needed, including a gender-sensitive review and improvement of the reading curriculum and/or catch-up classes for boys, in particular those from disadvantaged backgrounds.

It is important to target struggling disadvantaged schools and hub schools (those receiving students from consolidated schools), where investment of educational resources can potentially make the greatest difference. The analysis shows that targeted interventions should focus on low-performing schools with a high share of disadvantaged students in both urban and rural areas, including hub schools receiving students from closed or reorganized schools. Turning these schools around can give impetus to quality enhancement across the whole system.
A number of interventions could be considered to help disadvantaged schools improve, including: (i) strengthening and supporting school leadership, including through mentoring from experienced head teachers, training and other support; (ii) attracting and motivating high-quality teachers to work in hard-to-staff disadvantaged schools (teacher shortages are currently an issue, in particular in science and math, as evidenced by this and the Moldova SABER-Teachers reports); (iii) offering struggling schools extra funding responsive to the needs of the most disadvantaged students and schools; and (iv) strengthening accountability mechanisms, among many others. Turning around low-performing schools with a high share of disadvantaged students requires strong leadership. Preparing and developing effective school leaders is therefore the starting point of the transformation process.

Better quality EMIS data and its use for evidence-based education policy making are important. Education reforms should drive change on the basis of good evidence. Moldova has made substantial investments in creating valuable educational datasets, including EMIS and PISA. At the same time, it is important to strengthen the quality and reliability of the data collected through EMIS by improving data collection procedures and processes, quality controls and information flows, and by developing and implementing data validation mechanisms. Strengthening the statistical and analytical capacity of the Ministry of Education to analyze relevant data is also of paramount importance for evidence-based education policy making and timely adjustment of the ongoing education reforms. The EMIS is also important for continuous monitoring of the attendance and absenteeism of children at risk and of students from closed or reorganized schools, as mentioned above.

The country’s school system requires stronger accountability mechanisms. Moldova has given schools wider autonomy in managing financial and human resources.
The school boards and leadership now have much more control over the way resources are used, people are deployed, and work is organized and carried out. PISA results internationally show that increased school autonomy tends to be closely linked to school performance only if there are effective accountability mechanisms at the school level. Therefore, equal access to learning opportunities must be accompanied by proper accountability mechanisms. Good data and capacity building for its analysis, mentioned above, are also important for implementing open data initiatives that encourage citizens’ engagement and oversight of the reforms, promote an enabling environment for social accountability, and support the efforts of the Government to build a modern, cost-effective and high-quality education sector.

It is also important to note that this study was complemented by a qualitative assessment (focus groups and semi-structured interviews) aimed at understanding what information stakeholders have and need on performance and other parameters of schooling in their institution in order to be able to demand better education for students. Its results showed that gaps and weaknesses in information delivery and communication about ongoing reforms are much greater in rural areas, where most of the poor live (see the report ‘Gaps and weaknesses in information delivery and communication at the school level’). The funding for both studies was made possible thanks to the Poverty and Social Impact Analysis (PSIA) Multi-Donor Trust Fund (MDTF).

The findings of the technical note and the report have been presented to the staff of the Ministry, the Quality Assurance Agency of the Ministry (responsible for national and international assessments including PISA), and education NGOs.
In addition, the findings of the report ‘Gaps and Weaknesses in information delivery and communication at the school level’ have been discussed with the previous and current leadership of the Ministry of Education in the context of the implementation of the Moldova Education Reform Project (MERP), i.e. the development and dissemination of school report cards and how the process can be strengthened, since for now the information is used extensively by school principals but does not fully reach parents and students, particularly in rural areas, where most of the poor live. A training session was also conducted for the staff of the Ministry, the Quality Assurance Agency of the Ministry, and education NGOs in order to strengthen their capacity to analyze PISA and EMIS data and make full use of the collected data.

Introduction

A feasibility study by the Government of Moldova in 2011 concluded that up to half of Moldova’s rural schools may need to be reorganized in the next five years in response to the sharp population decline that has taken place over the last 20 years. The average school in Moldova operated at 54 percent of the capacity for which it was designed, leading to wasteful expenditures on heating and public utilities. As a result, a high proportion of the education budget went toward personnel expenditures and the maintenance costs of school buildings (including heating and utilities), crowding out much-needed quality-enhancing investment in capital and educational materials.

With analytical and financial support from the World Bank (including under the ongoing Moldova Education Reform Project), the Government has undertaken politically and socially difficult reform initiatives to right-size the school network (the largest subsector in Moldova’s education budget, representing around half of education spending) and to strengthen the quality of education provided.
Over the last three years, 115 schools have been closed and almost 200 schools have been reorganized, together representing around 21 percent of the school network. The Government also introduced per-student financing nationwide and gave schools autonomy in managing financial and human resources. The Ministry of Education modernized the Baccalaureate exam system and strengthened its integrity as a merit-based way of determining university admission; took proactive steps to open education data in order to strengthen citizens’ oversight of reforms; and developed the student-level EMIS and, on its basis, prepared school report cards for public dissemination (one of the powerful mechanisms for improving the accountability of education service providers). In parallel with closing non-viable schools and transferring their students, the Government, with support from the IDA-financed Moldova Education Reform Project, is working on strengthening “receiving” schools to ensure that they offer an adequate learning environment (including physical space and the availability of educational inputs).

While the largest efficiency gains can be and are being made in rural areas, this is also where most socio-economically disadvantaged students live and study. While the reform offers an opportunity to provide children with a better education in larger and better-resourced schools, children’s transition to a new school environment also creates risks for both attendance and performance, in particular for poor and marginalized children. Closing schools represents an opportunity to provide children with a better education by reallocating them to larger, better-resourced schools with better-qualified teachers, better facilities, stronger peers and more specialization. The resulting increase in cost-effectiveness can also help save money.
Moving children to a new environment, however, also presents a risk: children may start missing classes (which can affect performance) or even drop out because they feel discouraged or alienated in the new school environment, because the new school is too far away, or because classes are too large. For academically strong children with resourceful parents, such a move is unlikely to increase the risk of lower academic performance and attendance. But for socio-economically disadvantaged children, for ethnic minorities, and for children struggling academically, such a move could increase the likelihood of these negative effects.

This report aims to explore the following: (i) how the performance (as well as attendance/absenteeism) of socially disadvantaged Moldovan secondary students compares to that of better-off students; (ii) what the determinants of performance and attendance are, in particular those that are under the control of the education system and can be influenced in the course of education reforms so as to improve the quality of education for students of both genders and from all socio-economic backgrounds; and (iii) what risks are associated with school network consolidation, in order to inform ongoing education reforms (and ensure that students directly affected by the reform are adequately accommodated in the “receiving” schools).

To answer these questions, we analyzed the student-level EMIS and PISA 2009 Plus datasets, which together contain more than 500 variables on various aspects of schooling in Moldova, including socio-economic status and vulnerabilities such as being an orphan or single-parent child, having a parent or parents abroad or unemployed, having disabilities, ethnic minority status, etc.
The EMIS data provide a number of important variables for the whole population of students but have a number of shortcomings, in particular related to the completeness and quality of data for vulnerable students, which can and should be addressed by the Ministry in the future.

Descriptive Analysis of Student Performance and Attendance by Socio-Economic Background and School Environment

There is a large literature on the determinants of student achievement, summarized for example by Hanushek and Woessmann (2011), who focus on international differences in student achievement. They find that a link between the socio-economic background of students and their academic achievement exists in almost all countries, but that the strength of this association differs.

The body of evidence on determinants of student achievement in transition economies such as the Eastern European countries is small but growing. Amermueller et al. (2003) use TIMSS data to estimate educational production functions for seven Eastern European countries, estimating the contribution of school inputs and student background variables. Tkhoryk (2011) updates the Amermueller et al. (2003) study, including a larger set of Eastern European countries. Coupe et al. (2011) list the studies that estimate the effect of school size, class size and the student-teacher ratio in Eastern European countries. Like studies for other regions, studies focusing on Eastern European countries typically find relatively large effects of student background variables and relatively small effects of quantitative school input variables. But they also reveal considerable variation across countries, suggesting that studies of individual countries are important to inform country policies.

Walker (2011) presents an overall study of the 10 countries, including Moldova, that participated in PISA 2009+.
He finds that, relative to the OECD average, the part of performance in Moldova that can be explained by the school a student attends is relatively small (between-school variance accounts for about 30 percent of the total variance in reading performance, compared to an OECD average of about 40 percent). He further documents a relation between students’ socio-economic background and reading performance, and finds that differences in reading performance between schools are mainly due to student and school socio-economic factors rather than to differences in school governance factors.

As far as we are aware, only one previous study focuses exclusively on student achievement in Moldova. Capita (2012) uses school-level information to estimate the effect of school size, class size, student-teacher ratio and other school and community characteristics on the average test scores of students at the end of secondary education. She finds that bigger schools tend to do better, while class size does not seem to be related to performance. She further finds a negative effect of the student-teacher ratio (for rural but not for urban schools) and of the share of large families, but no effect of expenditures.

In this report we study in detail the role of the individual characteristics of students in Moldova, including their socio-economic background. To this end we use student-level data from the PISA 2009+ test and the Moldovan Education Management Information System (EMIS, see Appendix 2). The two data sources provide important information on student educational outcomes. The PISA test provides an internationally comparable estimate of a 9th grade student’s ability based on performance in a standardized test in different subjects. EMIS, on the other hand, records the grades of students in Moldova on national standardized tests in grades 4, 9, and 12.
In grade 4 these are graded by local teachers following centralized instructions, but not by the students’ own teachers (“internal examinations”). In grade 9 they are administered by local teachers other than the ones teaching the respective subject and graded at the district centers by teachers who do not come from the same schools as the students (“part-external examinations”). The standardized tests in grade 12 (“external examinations”) are carried out in specifically created centers, administered by teachers from schools other than the students’ own, and graded by other teachers at the National Evaluation Center [2]. Both PISA and EMIS include student-level and school-level variables. While the PISA test covers reading, math and science, the last of these is not among the compulsory subjects covered in the final tests included in EMIS. These are Romanian, Romanian as a non-native language, math and foreign languages.

The Moldovan EMIS has several advantages. First, it covers the whole population of students (unlike PISA, which tests a sample of students). Second, it covers different age groups, while PISA focuses on 9th graders. Third, EMIS provides the most recent data (the latest available data are for 2014, while PISA data are available only for 2010). Finally, EMIS covers some other educational outcomes such as the absence of students. In particular, absence without excuse can be an important indicator of current or future educational problems. PISA, on the other hand, has the advantage that more variables are available for all students in the sample, while for EMIS less information is available and for some variables the information is missing (or has not been coded for all students).

The PISA data come with an internationally comparable index of economic, social and cultural status (ESCS). It combines the international socio-economic index of occupational status (ISEI) of parents, their maximum years of schooling, and indices of family wealth, educational and classical culture resources at students’ homes.
Normalized and centered at zero using all PISA participants around the world, the ESCS shows that Moldovan students are on average 0.6 standard deviations less economically, socially and culturally well-off than their peers in other countries. In this report we will often compare the top 60 and bottom 40 percent of students according to the ESCS. Students who belong to the top 60 percent in Moldova are on average almost identical (ESCS: 0.07) to the average international PISA participant, whereas students from the bottom 40 percent are significantly disadvantaged by international standards (ESCS: -1.51). The ESCS index indicates that 93.4 percent of students in the international PISA test are better off than the average Moldovan student from the bottom 40 percent. These students are thus clearly deprived by international standards.

In the EMIS records no equivalent to the ESCS exists. However, EMIS contains several individual characteristics of students that can help analyze grades and absenteeism from an equity perspective. Among these are seven indicators that are typically associated with worse educational outcomes in other studies. First, students can be flagged by teachers as ‘vulnerable’ ("SituatieDeRisc"). If a student is deemed ‘vulnerable’, additional information is added: whether the student is “at risk” ("InStareDeRisc"), an orphan, disabled, has unemployed parents, is from a family with low income, or is from a single parent family. Until 2013 being at risk had no formal definition. Since then there have been specific guidelines as to what constitutes risk. The variable “at risk” indicates that the child is registered in the “Register of Children at Risk” [3] (Moldovan Name), which is kept by the social worker in the community.

[2] As Hanushek and Woessmann (2011) discuss, the availability of external exams itself is positive for student achievement internationally.
[3] According to the Law on child protection at risk, these criteria are: a) children are subjected to violence; b) children are neglected; c) children are affected by homelessness, begging, or prostitution; d) children are deprived of parental care and supervision due to their parents' absence from home for reasons unknown; e) children's parents have died; f) children are living in the streets, have fled or were expelled from home; g) children's parents refuse to exercise their parental duties regarding the upbringing and care of the child; h) children were abandoned by their parents; i) parents of children were declared incapable by a court decision.

Table 1 reports which risk factors are the most common in each grade and in the whole sample. According to the information in EMIS, 92.8 percent of students are not deemed vulnerable by the class teacher. Sub-indices are only recorded if a student is deemed to be vulnerable. About a third of the 7.2 percent of students who are recorded as vulnerable are also registered with the community social workers. 0.2 percent of students are recorded as orphans. Only 0.7 percent of children have a parent who is registered as unemployed. This small share is partly explained by the very low official unemployment rate in Moldova. About 2.4 percent of students are recorded in EMIS as living in families with low income. Finally, 2.2 percent of children are recorded as living in a single parent family.

For ‘vulnerable’ students, EMIS furthermore contains additional information at the individual level that does not constitute a clear risk factor. These are living in a family with more than two minors, attending a school whose language of instruction does not match a student’s native language, and having migrant parents. Such factors can exacerbate existing risks but are not monotonically related to worse school outcomes. For example, parents may choose to send a child to a school with a language of instruction other than the mother tongue because they consider its quality superior.
We will add these additional factors later in the analysis to provide evidence of their correlation with school outcomes.

Table 1: Distribution of EMIS risk factors by grade

| | Grade 4 | Grade 9 | Grade 12 | All |
|---|---|---|---|---|
| Risk factors | | | | |
| Recorded as vulnerable | 7.5% | 7.9% | 5.1% | 7.2% |
| “At risk” | 2.2% | 2.5% | 1.5% | 2.2% |
| Orphan | 0.1% | 0.3% | 0.1% | 0.2% |
| Special educational need | 0.3% | 0.2% | 0.1% | 0.2% |
| At least one parent unemployed | 0.8% | 0.8% | 0.4% | 0.7% |
| Family with low income | 2.9% | 2.7% | 0.6% | 2.4% |
| Single parent family | 1.8% | 2.6% | 1.9% | 2.2% |
| At least one risk factor | 7.5% | 7.9% | 5.1% | 7.2% |
| No risk factor | 92.5% | 92.1% | 94.9% | 92.8% |
| Other factors | | | | |
| >2 minors in family | 2.6% | 1.9% | 0.6% | 1.9% |
| Language of instruction no match | 0.4% | 0.4% | 0.2% | 0.4% |
| At least one parent migrant | 2.6% | 3.0% | 2.5% | 2.8% |
| Number of students | 26,523 | 30,943 | 15,278 | 72,744 |

Notes: Based on EMIS data for the school year 2013-14. Sample averages within each grade reported. Vulnerability is based on the assessment of the class teacher. “At risk” indicates that the student is registered as such with the community social worker.

Overall, Table 1 shows that the information in EMIS regarding risk factors and the additional factors does not cover all students and is only recorded for students who have been deemed vulnerable by their class teacher. If this person does not trigger the entry of these data in the risk factor form, information such as parental migration is not recorded. A simple average of the migration variable in EMIS would suggest that 2.8 percent of children have a migrant parent. From other data (MDHS, 2006; Lücke and Stöhr, 2012), however, we know that the share is likely to be above 30 percent. Thus only a small part of the children from migrant families have been recorded as such.

That EMIS only records information when a student is deemed vulnerable by the teacher is important for the interpretation of any results based on these data for two reasons.
First, when reading any comparison of groups along the factors from the risk record in EMIS, one has to remember that a part of the group which has a value of zero for a specific indicator will in reality belong to the group which has a positive value on that indicator. Orphans, for example, will be coded as zero, and hence seemingly be non-orphans, unless they were also considered vulnerable. The relevant comparison is thus always one between students considered vulnerable and with a particular characteristic and all students who are not considered to be vulnerable, some of whom will share the same characteristic. Second, it means that in contrast to the analysis of PISA data, we cannot distinguish the top 60 percent and bottom 40 percent in terms of disadvantage. Rather, EMIS offers variation among approximately the top 93 percent and the bottom 7 percent. Analyses of the differences within the group of 93 percent of students without risk factors however would have to rely on information at a more aggregate level than the student, for example at the school or locality level, since no individual level indicators for these students are available.

Being considered vulnerable in EMIS is based on the often subjective decision of a class teacher. Therefore, some teachers may have higher and others lower (unobserved) thresholds at which they label students as vulnerable. If vulnerability clusters in schools, for example because teachers discuss their students’ situation with colleagues, this further increases the importance of modelling the school level. The majority of cohorts (66.7 percent) in schools [4] do not have a single student who is considered to be vulnerable. In fact this means that for such cohorts not a single vulnerability record exists in EMIS. Such a high percentage of cohorts without vulnerable students is likely to be caused by insufficient reporting by at least some schools, while others genuinely do not consider any of their students as vulnerable.
The pattern means that analyses which rely on the information in EMIS will to a considerable extent only use variation from those schools that have created records of their students, and that the resulting measurement error biases estimated effects towards zero.

We will now briefly analyze school-level differences in school outcomes before providing a detailed overview of student-level outcomes. The next chapter then combines both individual and school level data in a multi-level analysis.

[4] Defined here as all students in a particular grade in a school in one year.

School-level differences in PISA test results and EMIS

Figure 1 provides evidence of the relationship between the share of disadvantaged students in schools and the average PISA scores or EMIS grades, respectively. Panel A shows that there is a strong negative correlation between the school-level averages in reading, math, and science and the share of disadvantaged students. Typically, the schools with the fewest disadvantaged students come close to the international average of 500 points in the PISA test whereas some schools with the most disadvantaged students have average scores of just 300 points. This means these students on average are far from achieving even the second of six levels of proficiency that the PISA test distinguishes and fall within the lowest performing 5 percent internationally (see e.g. OECD 2011, p. 8). However, the variation of often 100 points across schools with comparable shares of disadvantaged students shows that there is need for more detailed analysis of the drivers of heterogeneity within schools, for which student-level analysis is necessary. Panels B-D plot grades and risk factors from EMIS. Contrary to the strong correlation in Panel A, the weak negative correlation between grades and the share of students vulnerable according to EMIS is barely visible to the naked eye. Of course, one needs to remember that the definition of vulnerability is much more comprehensive in the PISA dataset than in the EMIS dataset, which might explain the difference in correlation we find between these two datasets.

While the average PISA test results at schools with more disadvantaged students are significantly worse, these schools did not report more often that student absenteeism is a problem. Nor did they report differently regarding the severity of students skipping classes. These schools however reported significantly more problems with student disruptions as well as more problems with students being bullied by others. Other reported hindrances to learning such as low expectations of teachers or teacher absenteeism do not differ significantly with the share of disadvantaged students [5].

Other potentially important determinants of students’ achievements are school size, school staff and school infrastructure. The average PISA participating school attended by a disadvantaged student has 356 students whereas enrollment is 544 students in the schools of less disadvantaged students on average. The student-teacher ratios are statistically indistinguishable at 12.3 and 12.4. The schools of more disadvantaged PISA participants however have statistically significantly fewer qualified teachers (84 percent compared to 89 percent, self-reported by schools). On other characteristics of schools, differences exist but are not very pronounced. For example, the access to computers per student does not differ. The availability of computers however does not proxy the general quality of infrastructure well, which schools report to be more of a hindrance to teaching on average in schools attended by more disadvantaged students. Furthermore, more disadvantaged students are less likely to have access to some extracurricular activities such as music bands or choirs, but in general the differences are small (e.g.
99 percent and 98 percent of bottom 40 percent and top 60 percent students attend schools with sports teams). Such differences are far more pronounced when comparing schools in rural and urban areas rather than basing this comparison on the background of students.

[5] This and all other statistics that are only mentioned in the text are available on request.

Figure 1: School-level PISA results, EMIS grades and the share of economically, socially, and culturally disadvantaged students across important subjects (Panel A: PISA test scores; Panel B: Grades in class 4; Panel C: Grades in class 9; Panel D: Grades in class 12)

Student-level differences in PISA tests

While differences between schools are instructive, any analysis of education outcomes must focus on the student level. Students’ average performance in the PISA test depends strongly on their socio-economic status, as was pointed out by earlier work on education in Moldova (Walker, 2011). Table 2 demonstrates this. The bottom 40 percent of students according to the ESCS have an average reading score 45 points below that of their less disadvantaged classmates. On math and science the gaps are 46 and 43 points respectively. Such differences in PISA scores can be interpreted as lags in educational attainment. Usually, a 40 point difference is interpreted as a gap of one year of schooling (OECD, 2010a). This means the disadvantaged students in Moldova are more than three years behind the international average (500 points) on reading and math (gaps of 138 and 129 points) and only slightly less on science (112 points).

Table 3 reports the average grades in Romanian, Romanian as a non-native language, math, and foreign languages in 4th, 9th and 12th grade. In 4th grade, the grades of students without a single positive risk factor are on average about half a point higher than those of their peers who are recorded as vulnerable. The differences are similar in 9th grade and smaller in 12th grade.
The similarity of the influence of socio-economic status in grades 4 and 9, which are compulsory, and the dissimilarity in grade 12 are likely to reflect the self-selection into upper secondary education after compulsory schooling ends. As a consequence, better students are more likely to continue their education and the poorest performing vulnerable students will typically not. This is also reflected by the lower share of vulnerable students among 12th graders, which is only 5.1 percent compared to more than 7 percent among the younger cohorts.

Table 2: PISA performance of bottom 40% vis-a-vis top 60% according to ESCS

| Location | Category | % of students | Reading | Std. Error | Math | Std. Error | Science | Std. Error |
|---|---|---|---|---|---|---|---|---|
| Total | Top 60 | 60.1 | 407 | 3.3 | 417 | 3.4 | 431 | 3.2 |
| Total | Bottom 40 | 40.9 | 362 | 3.2 | 371 | 3.8 | 388 | 3.8 |
| Total | Total | 100 | 388 | 2.8 | 397 | 3.1 | 413 | 3.0 |

Note: Based on PISA 2009+ data. Top 60% and Bottom 40% indicate whether the student belongs to the top 60% or bottom 40% of Moldovan PISA participants according to the ESCS index. Deviation from 40% and 60% due to students with the same ESCS value. Standard errors are standard errors for the estimate of the mean.

Table 3: Grades of students with risk factors vis-a-vis students without

| Grade | Category | % of students | Romanian | Romanian (non-native) | Math | Foreign Language |
|---|---|---|---|---|---|---|
| 4 | Not vulnerable | 92.5% | 7.92 | 7.91 | 7.83 | - |
| 4 | Vulnerable | 7.5% | 7.36 | 7.48 | 7.29 | - |
| 9 | Not vulnerable | 92.1% | 7.38 | 7.37 | 6.94 | 7.24 |
| 9 | Vulnerable | 7.9% | 6.76 | 7.07 | 6.43 | 6.65 |
| 12 | Not vulnerable | 94.9% | 7.65 | 7.44 | 7.48 | 7.65 |
| 12 | Vulnerable | 5.1% | 7.28 | 7.34 | 7.20 | 7.43 |

Note: Based on EMIS data for the school year 2013-14. Vulnerable and not vulnerable indicate whether at least one risk factor is positive in EMIS. These are vulnerability, "at risk", being an orphan, being from a single parent family, being disabled, being from a family with three or more minors, being from a family with low income and attending a school that does not teach in a student's native language.
Share of students for each grade calculated based on the sum of students with grades in Romanian or Romanian as a non-native language. Statistically significant values.

While the “vulnerable” label is a fuzzy proxy, children who are subject to multiple risk factors clearly do worse than their peers from less adverse conditions. An EMIS based substitute for the socio-economic gradient is a risk factor gradient, as shown in Figure 2 for average grades in class 4, 9, and 12. This is calculated as the linear relationship between the number of positive risk factors and grades. The continuous and dashed lines indicate that there is a strong negative correlation between grades and risk factors in 4th and 9th grade. The dotted line shows that the average gradient in 12th grade is significantly flatter. Indeed, those coming to upper secondary schools are less likely to be vulnerable students and perform better in general. The most disaffected students however simply do not take the baccalaureate. This is reflected by the lack of students in 12th grade who have positive values on all risk factors at the same time, which is why the dotted line ends at the value 5.

Figure 2: Predicted linear relationship between grades and risk factors using 2013-14 EMIS

Such simple group averages between more and less disadvantaged students mask the heterogeneity within these groups. Even larger gaps than those reported in Table 2 can be observed when comparing rural to urban areas. Table 4 reports estimates that the 56.8 percent of rural students lag their urban peers by an average of 60 points in reading, 55 in math and 48 in science. At 40 points per year of schooling, this translates into a gap of roughly 1.2 to 1.5 years, depending on the PISA sub-score. Table 5 confirms that the average performance of urban students in grade 9 is better than that of their rural peers. In 12th grade however, the difference vanishes almost completely, suggesting that it is mostly the best students from rural areas who continue their school education.
Two thirds of students in 12th grade attend schools in urban areas whereas approximately 60 percent of students in grade 4 and grade 9 study in rural areas.

Table 4: PISA performance of rural and urban students

| Grade | Location | % of students | Reading | Std. Error | Math | Std. Error | Science | Std. Error |
|---|---|---|---|---|---|---|---|---|
| 9 | Rural | 56.8 | 362 | 4.0 | 374 | 4.6 | 392 | 4.4 |
| 9 | Urban | 43.2 | 422 | 4.6 | 429 | 4.6 | 440 | 4.1 |
| 9 | Total | 100 | 388 | 2.8 | 397 | 3.1 | 413 | 3.0 |

Note: Based on PISA 2009+ data. Rural/urban difference based on reported school community in item SC04.

Table 5: Grades of rural versus urban students

| Grade | Location | % of students | Romanian | Romanian (non-native) | Math | Foreign Language |
|---|---|---|---|---|---|---|
| 4 | Rural | 59.5% | 7.68 | 7.67 | 7.58 | - |
| 4 | Urban | 40.5% | 8.22 | 8.01 | 8.11 | - |
| 9 | Rural | 62.8% | 7.21 | 7.18 | 6.78 | 7.02 |
| 9 | Urban | 37.2% | 7.58 | 7.47 | 7.09 | 7.49 |
| 12 | Rural | 34.6% | 7.66 | 7.43 | 7.41 | 7.57 |
| 12 | Urban | 65.4% | 7.62 | 7.43 | 7.49 | 7.67 |

Note: Based on EMIS data for the school year 2013-14. Rural/urban status based on NBS definition. Share of students for each grade calculated based on the sum of students with grades in Romanian or Romanian as a non-native language. Statistically significant values.

Another very important factor is gender. Table 6 shows that girls are especially far ahead of boys in reading. This relationship holds both for students from the top 60 percent and the bottom 40 percent of the socio-economic distribution. The gap between boys and girls is large both in rural and in urban areas. Table 6 furthermore disaggregates the average PISA results by socio-economic status, gender, and rural/urban backgrounds to provide evidence of the best and worst performing groups of students. The disaggregation in Panel A shows that urban children who do not belong to the bottom 40 percent of the ESCS do particularly well. Rural students from a disadvantaged socio-economic background do particularly poorly. The differences between these groups are striking: 75 points for girls and 82 points for boys in reading, for example.
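To make the size of these gaps more tangible, the short sketch below recomputes them from the mean reading scores in Table 6 and converts them into school years using the rule of thumb quoted earlier in this note (about 40 PISA points per year of schooling; OECD, 2010a). The function and variable names are illustrative only, and the linear conversion is a rough approximation rather than an exact equivalence.

```python
# Rule of thumb quoted in this note (OECD, 2010a):
# a gap of about 40 PISA points corresponds to one year of schooling.
POINTS_PER_SCHOOL_YEAR = 40.0


def points_to_years(points: float) -> float:
    """Convert a PISA score gap into approximate years of schooling."""
    return points / POINTS_PER_SCHOOL_YEAR


# Mean reading scores taken from Table 6 (PISA 2009+).
reading = {
    ("urban", "top60", "female"): 450,
    ("urban", "top60", "male"): 413,
    ("rural", "bottom40", "female"): 375,
    ("rural", "bottom40", "male"): 331,
}

# Gap between the best and worst performing group, separately by gender.
gaps = {}
for gender in ("female", "male"):
    gap = reading[("urban", "top60", gender)] - reading[("rural", "bottom40", gender)]
    gaps[gender] = gap
    print(f"{gender}: {gap} points, roughly {points_to_years(gap):.1f} school years")
```

Under this rule of thumb, both gaps correspond to roughly two years of schooling.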
Moldova thus has large achievement gaps between genders and between rural and urban areas, and a strong dependence of test scores on socio-economic background. Determinants of students’ achievement interact and add up, causing very significant differences between the educational outcomes of the best and worst performing students. Such (and even worse) inequitable outcomes can also be found in rich countries such as New Zealand or France, which have a good PISA performance (521 and 496 points in reading, respectively), as OECD (2010b, p. 102) shows (comparable numbers for Moldova can be found in Walker 2011, p. 79). While socio-economic background has less of an influence on educational performance in some other countries (like Indonesia or Tunisia), overall Moldova scores fairly average in terms of the impact of socio-economic background on PISA scores.

The economic consequences of such large differences within one country can be better understood by using the estimates of Woessmann and Hanushek (2007) on the relationship between a country’s growth and its average test scores. Using a sample of the countries participating in the PISA 2000 test, they find that 47 additional PISA score points on the mathematics scale were correlated with a 1 percent higher growth rate over the past 40 years. Thus, bringing the average Moldovan student, who currently has an average math score of 397 points, up to the level that is already achieved by the average urban student from a less disadvantaged background (439 points), a gain of 42 points, would amount to an extra 0.9 percent of annual growth (42/47 ≈ 0.9) if the same relationship held in the future.

Table 6: Performance of bottom 40% by gender, urban/rural divide

Panel A: Performance of the top 60% students

| Location | Category | % of students | Reading | Std. Error | Math | Std. Error | Science | Std. Error |
|---|---|---|---|---|---|---|---|---|
| Total | Female | 47.9 | 432 | 3.7 | 416 | 3.7 | 438 | 3.5 |
| Total | Male | 52.9 | 385 | 3.5 | 418 | 3.7 | 425 | 3.5 |
| Rural | Total | 43.5 | 376 | 4.7 | 388 | 5.0 | 407 | 4.9 |
| Urban | Total | 56.6 | 432 | 4.6 | 439 | 4.7 | 450 | 4.0 |
| Rural | Female | 19.8 | 405 | 5.2 | 389 | 5.1 | 418 | 5.7 |
| Rural | Male | 23.6 | 351 | 5.2 | 388 | 5.9 | 397 | 5.5 |
| Urban | Female | 28.0 | 450 | 5.1 | 435 | 4.9 | 452 | 4.6 |
| Urban | Male | 28.5 | 413 | 5.2 | 443 | 5.4 | 448 | 4.5 |

Panel B: Performance of the bottom 40%

| Location | Category | % of students | Reading | Std. Error | Math | Std. Error | Science | Std. Error |
|---|---|---|---|---|---|---|---|---|
| Total | Female | 49.8 | 384 | 3.4 | 370 | 4.1 | 396 | 4.5 |
| Total | Male | 50.2 | 340 | 3.9 | 372 | 4.5 | 380 | 4.1 |
| Rural | Total | 75.5 | 353 | 4.0 | 363 | 4.8 | 381 | 4.7 |
| Urban | Total | 24.5 | 390 | 5.5 | 396 | 4.9 | 408 | 5.5 |
| Rural | Female | 37.2 | 375 | 4.2 | 361 | 5.0 | 389 | 5.4 |
| Rural | Male | 38.3 | 331 | 4.7 | 364 | 5.6 | 374 | 5.1 |
| Urban | Female | 12.6 | 411 | 6.0 | 395 | 5.7 | 415 | 6.0 |
| Urban | Male | 11.9 | 368 | 7.1 | 398 | 6.0 | 401 | 6.6 |

Note: Based on PISA 2009+ data. Top 60% and bottom 40% of Moldovan PISA participants according to the ESCS index. Disaggregation based on student characteristic ST04Q01 "Gender" and school characteristic SC04Q01 "School Community".

Due to the 2009 PISA test’s focus on reading, several components of reading ability were tested separately. Differences across sub-scores (e.g. interpretation of text) are strikingly large. While the average ability of the less disadvantaged students is relatively similar across sub-indices, it varies considerably among the more disadvantaged students. Compared to the international 500 point threshold, disadvantaged Moldovan students are particularly weak in the “reflect and evaluate” and “non-continuous text” tasks. On average the more disadvantaged students not only do worse on reading tasks, they are also significantly less likely to read for pleasure. While there are no substantial differences in voluntarily reading newspapers or comic books, the more disadvantaged students are far less likely to regularly use the internet and to apply their reading skills online.
This is of course a result of their poorer background, but it is problematic because young people in many countries nowadays do most of their reading and writing on the internet. The 2015 PISA test will focus on science and thus allow similar disaggregation, which can be helpful to target improvements in learning outcomes.

The poor performance of students could lead to substantial incidence of grade retention (see Table 7). This is however used less commonly in Moldova than in other countries. Only 3.6 percent of students in the PISA data ever repeated a grade. Among these there is a large group which shows worrying signs of serious problems. The majority of students who have ever repeated a grade had been retained at least twice. Students from more disadvantaged backgrounds are more than twice as likely to be subject to grade retention as their more affluent classmates. However, controlling for the average PISA scores in reading, math, and science, the students from the bottom 40 percent are not significantly more likely to be retained than their peers. Thus grade retention is mostly explained by academic achievement rather than directly by the socio-economic background of a student.

Table 7: Class retention depending on disadvantaged status

| Repetition at any ISCED level | Top 60% | Bottom 40% | Total |
|---|---|---|---|
| Never | 97.5% | 94.5% | 96.4% |
| Once | 0.8% | 2.0% | 1.3% |
| Twice or more | 1.6% | 3.4% | 2.3% |

Notes: Based on PISA 2009+ data. Top 60% and Bottom 40% indicate whether the student belongs to the top 60% or bottom 40% of Moldovan PISA participants according to the ESCS index. The rightmost column reports the average for Moldova.

Despite their poor performance by international standards, Moldovan students who participated in the 2009+ PISA test respond quite positively when asked about the usefulness of their schooling in general. Across ESCS backgrounds students report the subjective feeling that school is a good preparation for adult life.
The likelihood of students reporting that school had done little to prepare them for adult life is a mere 1.8 percentage points higher for students from the bottom 40 percent of the ESCS, and 5 percent compared to 3.9 percent report that school had been a waste of time in general. These differences are only significant at the 10 percent significance level. By international comparison Moldovan PISA participants thus report a rather positive view. Although their average PISA scores are significantly worse, Moldovan students’ view of school is on average more positive than, for example, that of English students (cf. Bradshaw et al. 2010).

While in this chapter we have discussed descriptive statistics, the following chapter will feature econometric analyses, which allow a more detailed analysis of characteristics such as an individual’s risk factors by simultaneously controlling for the role of many factors. The descriptive comparison in this chapter has generally shown that for studying the impact of socio-economic background on students’ performance the PISA test scores are superior to EMIS based information because of wider data availability and the PISA test’s more detailed measurement of performance. Moldova’s participation in the 2015 PISA round, which was supported by the World Bank Moldova Education Reform Project (data should be released at the end of 2016), will thus considerably improve the available data for analyses of the role of socio-economic background in education.

Determinants of Student and School Performance

In this section we provide evidence of the determinants of PISA scores and grades. We use multivariate multi-level methods, which can take into account individual characteristics, school characteristics and the correlation of individual outcomes at the cohort-school level at the same time. More detail on the methods used in this section can be found in Annex 2.
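Multi-level models of this kind split the variation in scores into a school-level part and a student-level part. As a rough illustration of what this decomposition means, the sketch below computes the share of variance located at the school level (the intraclass correlation) from the standard deviations of the random school intercept and of the residual reported for the empty reading and math models in Table 8. The function name and the dictionary layout are illustrative assumptions, not part of the estimation code used for this note.

```python
def intraclass_correlation(sd_school: float, sd_residual: float) -> float:
    """Share of the total variance that originates at the school level."""
    var_school = sd_school ** 2
    var_residual = sd_residual ** 2
    return var_school / (var_school + var_residual)


# Standard deviations (school intercept, residual) of the empty models
# in Table 8, columns (1) and (3).
empty_models = {
    "reading": (48.8, 74.2),
    "math": (50.6, 68.2),
}

icc = {
    subject: intraclass_correlation(sd_school, sd_residual)
    for subject, (sd_school, sd_residual) in empty_models.items()
}

for subject, value in icc.items():
    print(f"{subject}: share of variance at the school level = {value:.2f}")
```

A value near 0.3 means that close to a third of the differences in scores lies between schools rather than between students within the same school, which is why the school level cannot be ignored in the models that follow.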
Modelling PISA scores and grades

As a starting point we use a so-called empty model on PISA data in Table 8, columns (1), (3), and (5), which consists only of an intercept that captures the average of all students and a random effect at the school level [6]. The intraclass correlation coefficient (icc) [7] is 0.30 in column (1), which indicates that 30 percent of the variance in the reading score originates at the school level. For math and science the iccs are 0.36 and 0.32 respectively. This shows that omitting school-level differences, whether captured by the general composition of students that differs across schools or by specific school characteristics, is a serious oversimplification of the determinants of educational outcomes.

In columns (2), (4) and (6) we add several covariates at the student level and the school level to the model. Students from more well-off backgrounds are better in reading, math and science. As one point of the ESCS index corresponds to one standard deviation (0.98 pts in Moldova), having a one standard deviation higher ESCS index yields 14-15 additional points across the various test dimensions, which translates to about four months of schooling. Girls’ average reading scores are 43 points higher than those of boys, whereas their math scores are slightly lower (by about 4 points) and their science scores are 11 points higher. The difference in reading ability can be interpreted as the average girl being about one school year ahead of her male classmates. A similarly large difference of almost a year is found between Romanian speaking and other students: those speaking Romanian at home do substantially worse than others. One possible explanation is that the test language was determined at the level of the school, so that some Romanian speakers took tests in Russian (and vice versa). Students who either immigrated themselves or whose parents have immigrated do not show a significantly different performance when controlling for the other factors.
In other countries, by contrast, the children of immigrants often lag their native classmates by a significant margin (OECD, 2012a).

Parents are an important determinant of children’s educational achievements. Our estimates show that it is mainly the absence of both parents which is problematic. Children who live without both their parents (many of these are orphans) are more than half a year behind their classmates. The same cannot be found for children who only live with a single parent or children whose parents are migrants. Other differences at the individual level are statistically insignificant.

[6] This random school effect is assumed to be normally distributed and suggests that school specific effects vary with a standard deviation of 48.8 (column 1) around the overall mean of 385.6 points. The residual, i.e. all additional variation, for example originating at the individual level, can be modelled by another normally distributed error with a standard deviation of 74.2.

[7] The icc is defined as the variance of the school-level error term over the total variance. From the table it can be calculated as 48.8²/(48.8²+74.2²). To see how to calculate it in different statistical packages, see Albright and Marinova (2010).

The school-level factors show that the strikingly large rural/urban differences that were found in the descriptive tables above are explained by other factors in the model such as the socio-economic status of students. Furthermore, the average socio-economic status of other students is a very strong indicator. Compared to the differences above, a one-standard deviation increase [8] in the average ESCS at a school is associated with students being half a year ahead of their peers. This can be due to one of three explanations. First, peer effects, i.e. the (positive or negative) effect the (better or worse) performance of classmates can have on students’ outcomes. Second, contextual effects, i.e.
students from the same school being subject to the same unobserved school-level factors, or third, due to correlated effects, i.e. students from the same school being more similar to each other in their unobserved characteristics than to random other students and thus showing more similar behavior to their classmates than to students from other schools. Mixed models allow adding interactions between student-level and school-level characteristics. The possible hypotheses that can be tested even in basic models as those of Table 8 multiply with every control variable that is added. Below we describe some of the findings of such tests (without providing the actual tables). Such additional tests suggest that the individual ESCS of a student does not interact significantly with the average ESCS in a student’s school. Disadvantaged students thus do not do significantly worse when attending a school with many other disadvantaged students. Furthermore, the effect of the ESCS does not differ significantly between boys and girls for reading and math scores. Boys’ science scores however increase more strongly in the ESCS index than girls. Disadvantaged students face disadvantages along many lines. A popular hypothesis is that low- quality schools are often found in poor neighborhoods, which worsens the outcomes of disadvantaged students. Put more generally, schools may differ in their ability to overcome differences in socio-economic background. Hence, the effect of the individual ESCS index on performance might differ by school. Extending the model in the even-numbered columns of Table 8 to allow for such an interaction between individual ESCS and school intercepts suggests that no such relationship exists in Moldova. This is the case with and without allowing for the effect of the average ESCS at a school. The explanatory power of the model can be summarized graphically. 
As Figure 3 shows, much of the explained variation comes from school-level differences in the socio-economic and cultural background of students (school socio-economic composition). The individual ESCS index explains between 5 and 10 percent of the variation. The average ESCS of all children who took the PISA test at a student’s school, however, explains between 10 and 15 percent of the variation. Being female explains about 5 percent of the variation in reading scores. The language of the student contributes another 3 to 5 percent to explained variance, depending on the test dimension. Other included characteristics of students or schools add only little to the model.

[8] This is 0.54 and thus about half as large as the standard deviation for individual ESCS.

Table 8: Basic hierarchical model of PISA scores

| | (1) Reading | (2) Reading | (3) Math | (4) Math | (5) Science | (6) Science |
|---|---|---|---|---|---|---|
| Fixed Effects | | | | | | |
| Intercept | 385.6*** (3.81) | 444.17*** (29.15) | 395.8*** (3.90) | 481.63*** (34.48) | 411.1*** (3.76) | 470.03*** (34.37) |
| ESCS | | 14.55*** (1.39) | | 14.83*** (1.66) | | 14.39*** (1.57) |
| Female | | 42.69*** (2.21) | | -4.40** (2.17) | | 11.27*** (2.51) |
| Romanian speaking | | -42.46*** (5.82) | | -34.96*** (6.99) | | -34.56*** (6.67) |
| Born abroad | | 10.22 (8.61) | | 4.27 (9.81) | | 3.48 (8.89) |
| Migrant Background | | 0.11 (4.67) | | 1.85 (4.7) | | -0.26 (4.44) |
| Without Parents | | -29.63*** (7.85) | | -25.61*** (8.75) | | -26.44*** (9.24) |
| Single Parent | | -2.89 (3.41) | | 0.22 (3.26) | | 1.89 (3.5) |
| Unemployed Parents | | -0.81 (2.73) | | -3.78 (2.45) | | -4.40 (2.79) |
| Both Parents Potentially Migrants | | 14.18 (9.36) | | 3.73 (10.34) | | 11.55 (11.18) |
| Urban | | 1.81 (6.91) | | 1.35 (8.3) | | -12.37 (8.07) |
| Mean ESCS | | 40.21*** (6.46) | | 39.56*** (7.73) | | 42.41*** (7.38) |
| Student-teacher-ratio | | -4.22 (3.41) | | -1.70 (4.06) | | -4.36 (3.99) |
| Student-teacher-ratio² | | 0.15 (0.14) | | 0.02 (0.16) | | 0.15 (0.16) |
| Proportion of qualified teachers | | 23.00 (20.61) | | 1.37 (24.48) | | 42.65* (23.83) |
| Random Effects | | | | | | |
| Intercept | 48.8*** (5.70) | 27.32*** (6.01) | 50.6*** (5.86) | 34.3*** (2.34) | 48.3*** (5.61) | 32.91*** (2.27) |
| Residual | 74.2*** (1.76) | 69.20*** (0.82) | 68.2*** (1.83) | 66.20*** (0.87) | 71.2*** (1.92) | 69.4*** (0.98) |
| Number of observations | 5,158 | 5,158 | 5,158 | 5,158 | 5,158 | 5,158 |

Note: Based on PISA 2009+ data. *, **, and *** denote statistical significance at the 0.1, 0.05, and 0.01 level, respectively. Standard errors in parentheses.

[Figure 3: Contributions to explained variance in PISA scores. Stacked bar chart titled "Contribution of characteristics to explained variation in PISA scores", with one bar each for Reading, Math and Science. Legend: ESCS; School's average ESCS; Female (0/1); Romanian speaking (0/1); City (0/1); Female (0/1) * ESCS; Born abroad (0/1); Migrant background (0/1); Single parent at home (0/1); Without parents at home (0/1); Parents unemployed (0/1); Potential migrant household (0/1); Proportion of qualified teachers; Student Teacher Ratio; Student Teacher Ratio²; Town (0/1).]

[Figure 4: Contributions to explained variance in EMIS math grades. Stacked bar chart titled "Contribution of characteristics to explained variation in grades in math (EMIS)", with one bar each for Grade 4, Grade 9 and Grade 12. Legend: Female; Recorded as vulnerable; "At risk"; Orphan; Special educational needs; At least one parent unemployed; Family with low income; Single parent; >2 minors in family; Mother tongue and school language match; At least one parent migrant; Both parents migrants; Less than 3 km to school; Student-teacher ratio; School's share of students with special needs; School's share of vulnerable students; School's share of teachers with masters degree or more; Urban; SADI (multiple deprivation index).]

Table 9: Basic hierarchical model of grades by subject

| | (1) Romanian | (2) Romanian | (3) Romanian (non-native) | (4) Romanian (non-native) | (5) Math | (6) Math | (7) Foreign Languages | (8) Foreign Languages |
|---|---|---|---|---|---|---|---|---|
| Fixed Effects | | | | | | | | |
| Intercept | 7.54*** (0.01) | 6.89*** (0.12) | 7.55*** (0.03) | 7.55*** (0.03) | 7.28*** (0.01) | 6.78*** (0.26) | 7.26*** (0.02) | 7.28*** (0.01) |
| Female | | 1.03*** (0.01) | | 0.96*** (0.03) | | 0.68*** (0.01) | | 1.08*** (0.02) |
| Grade 4 | | 0.53*** (0.03) | | 0.51*** (0.06) | | 0.84*** (0.03) | | |
| Grade 12 | | 0.19*** (0.04) | | 0.05 (0.08) | | 0.46*** (0.04) | | 0.36*** (0.04) |
| Risk factors | | | | | | | | |
| Recorded vulnerable | | -0.30*** (0.06) | | -0.01 (0.13) | | -0.25*** (0.06) | | -0.31*** (0.08) |
| "At risk" | | -0.09 (0.06) | | -0.27** (0.13) | | -0.12** (0.06) | | -0.19*** (0.07) |
| Orphan | | -0.21 (0.15) | | 0.14 (0.31) | | -0.21 (0.15) | | -0.25 (0.17) |
| Special educational needs | | -0.61*** (0.14) | | -0.13 (0.29) | | -0.62*** (0.14) | | -0.46*** (0.17) |
| Unemployed parents | | -0.30*** (0.08) | | -0.19 (0.15) | | -0.27*** (0.08) | | -0.14 (0.10) |
| Family with low income | | -0.53*** (0.06) | | -0.42*** (0.12) | | -0.51*** (0.06) | | -0.57*** (0.07) |
| Other factors | | | | | | | | |
| Single parent family | | -0.07 (0.06) | | -0.06 (0.13) | | -0.09 (0.06) | | 0.02 (0.07) |
| >2 minors in family | | -0.03 (0.06) | | -0.22* (0.13) | | -0.06 (0.06) | | -0.10 (0.07) |
| Language of instruction matches | | -0.06 (0.20) | | -0.00 (0.13) | | 0.13 (0.11) | | 0.09 (0.14) |
| Parents migrants | | 0.16*** (0.06) | | -0.19 (0.13) | | 0.10* (0.06) | | 0.08 (0.07) |
| Both parents migrants | | -0.02 (0.08) | | 0.32 (0.21) | | 0.03 (0.08) | | 0.02 (0.10) |
| Distance less than 3 km | | 0.07*** (0.02) | | 0.07** (0.04) | | 0.09*** (0.02) | | 0.10*** (0.02) |
| Urban | | 0.25*** (0.03) | | 0.15** (0.07) | | 0.29*** (0.03) | | 0.19*** (0.04) |
| School-level variables | | | | | | | | |
| Student-teacher-ratio | | -0.04* (0.02) | | -0.01 (0.06) | | -0.01 (0.02) | | -0.02 (0.03) |
| Student-teacher-ratio² | | 0.00 (0.00) | | 0.00 (0.00) | | 0.00 (0.00) | | 0.00 (0.00) |
| Share of students with special educational needs | | -0.44* (0.23) | | -0.74 (0.61) | | -0.78*** (0.23) | | -0.98*** (0.32) |
| Share of vulnerable students | | 0.03 (0.16) | | 0.84*** (0.29) | | 0.05 (0.14) | | 0.02 (0.20) |
| Share of teachers with masters degree or more | | 0.61* (0.33) | | -1.71* (0.95) | | 0.51 (0.32) | | 0.77* (0.43) |
| Random Effects | | | | | | | | |
| Intercept | 0.25*** (0.01) | 0.17*** (0.01) | 0.25*** (0.02) | 0.20*** (0.02) | 0.36*** (0.01) | 0.17*** (0.01) | 0.31*** (0.01) | 0.22*** (0.01) |
| Residual | 1.87*** (0.01) | 1.60*** (0.01) | 1.78*** (0.02) | 1.54*** (0.02) | 1.95*** (0.01) | 1.80*** (0.01) | 2.01*** (0.01) | 1.72*** (0.02) |
| Number of observations | 53,524 | 53,524 | 12,587 | 12,587 | 65,162 | 65,162 | 45,660 | 45,660 |
| Number of groups | 3,225 | 3,225 | 570 | 570 | 2,520 | 2,520 | 1,536 | 1,536 |

Note: Based on EMIS data for the school year 2013-14. *, **, and *** denote statistical significance at the 0.1, 0.05, and 0.01 level, respectively. Standard errors in parentheses.

In Table 9 we estimate a model similar to that of Table 8 using EMIS data. The intercepts of the empty models in the odd-numbered columns show that average grades are between 7.2 and 7.6, depending on the subject. The estimates furthermore show that there is significant correlation at the school-cohort level, defined as a specific grade level at a specific school. National assessment results, however, are considerably more similar across schools than PISA scores. The intra-class correlation indicates that between 11 and 16 percent of the variation originates at the level of the school-cohort, considerably less than the 30 to 36 percent originating at the school level among PISA participants in grade 9 [9].

The comparison of odd- and even-numbered columns in Table 9 shows that a disadvantaged background as recorded in EMIS is a strong predictor of a student’s grade. Almost all risk factors recorded in EMIS (recall that they are only recorded if a student is considered to be vulnerable) correlate negatively with grades in at least some subjects. In line with their higher PISA reading and science scores, girls receive significantly higher grades: about two thirds of a grade in math and a whole grade in Romanian and foreign languages. Furthermore, students receive the worst grades in grade 9, the baseline category, whereas grades are typically significantly higher in grade 4 and grade 12. Other factors covered by EMIS, such as coming from a single-parent or migrant family, are not associated with lower grades among vulnerable students, though this might be due to the shortcomings of the data on the relevant variables.
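The intraclass correlations quoted for PISA and EMIS can be checked directly against the random-effects rows of Tables 8 and 9. The sketch below is only an illustration: it treats the Table 8 entries as standard deviations (as footnote 7 does) and, as an assumption not stated in the note, treats the Table 9 entries as variances, because that reading is the one that reproduces the reported range of 11 to 16 percent.

```python
def icc_from_sd(sd_school, sd_residual):
    """ICC when the table reports standard deviations (Table 8, per footnote 7)."""
    return sd_school**2 / (sd_school**2 + sd_residual**2)


def icc_from_variance(var_school, var_residual):
    """ICC when the table reports variances (assumed here for Table 9)."""
    return var_school / (var_school + var_residual)


# Empty-model random effects (odd-numbered columns): (school level, residual)
pisa_sd = {"Reading": (48.8, 74.2), "Math": (50.6, 68.2), "Science": (48.3, 71.2)}
emis_var = {
    "Romanian": (0.25, 1.87),
    "Romanian (non-native)": (0.25, 1.78),
    "Math": (0.36, 1.95),
    "Foreign languages": (0.31, 2.01),
}

pisa_icc = {k: icc_from_sd(*v) for k, v in pisa_sd.items()}
emis_icc = {k: icc_from_variance(*v) for k, v in emis_var.items()}

for name, value in {**pisa_icc, **{"EMIS " + k: v for k, v in emis_icc.items()}}.items():
    print(f"{name}: {value:.3f}")
```

Reading the Table 9 entries as standard deviations instead would give intraclass correlations of only 2 to 3 percent, which does not match the text.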
Generally, students from urban areas receive better grades, and students who have to travel more than three kilometers to school (an arbitrary but evidently meaningful cutoff recorded in EMIS) do worse than their classmates on average. At the school level, students in schools with many students with special educational needs do worse.

A graph similar to Figure 3 can be produced for EMIS grades. Because grades have a lower resolution than PISA scores and data quality is lower (most variables are available only for a small group of vulnerable students), the explanatory power of the control variables is far lower for grades than for PISA scores, as Figure 4 shows. This is not overcome by enriching EMIS data with school- or locality-level variation such as local or rayon-level deprivation indices (SADI). In a regression that includes all covariates reported in the graph, the only substantial contributions to explained variance come from the gender of the student and, for younger students, from whether they attend school in an urban or a rural area. It should be noted that both of these variables are available for all students in EMIS, while most EMIS variables are available only for a small group of vulnerable students, which makes it hard for these other variables to have a meaningful impact on the overall variation in grades.

The insignificance of school-level variables such as the share of highly skilled teachers does not, however, preclude that students in general, and disadvantaged students in particular, could benefit from specific characteristics of schools. John Hattie (2009) summarizes the findings of more than 800 meta-analyses of student achievement.
He concludes that interventions like changing class size or introducing web-based learning have a relatively limited impact, that interventions like changing school size or introducing cooperative learning have an average impact, and that interventions like providing feedback or improving teacher clarity have a relatively big impact.

This can be analyzed in more detail by assessing which school characteristics are associated with a lower dependence of PISA scores on a student’s ESCS index. This is tested quantitatively by estimating regressions which include the ESCS, each of the aggregate school characteristics that come with the PISA dataset, and their interaction [10]. In the interest of brevity we show the results for math only. In Table 10 we estimate a linear regression (panel) model with school fixed effects. All observed and unobserved school characteristics are thus accounted for and not explicitly estimated.

[9] There is no perfect way of achieving comparability, because the PISA test was administered to a subset of 9th graders at each school, whereas EMIS data cover all students from each of the grades.

The results in Table 10 can be understood in the following way. As before, a positive coefficient on ESCS means that students from more favorable socio-economic backgrounds have higher PISA math scores. At the bottom, we report results from a separate regression which assesses the role of school characteristics. These are not directly included in the main regressions of the table, because we eliminate any observed and unobserved school-level factors in order to obtain more reliable estimates. A positive coefficient on the respective school characteristic at the bottom of the table means that this characteristic is correlated with better average performance of all students at schools that have this characteristic, while simultaneously controlling for all individual characteristics that can be found in the table.
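The mechanics of the school fixed-effects specification can be illustrated on synthetic data. The following is a minimal sketch, not the estimation code behind the table: all sizes, names and parameter values are hypothetical, PISA plausible values and survey weights are ignored, and the fixed effects are removed by demeaning every variable within schools. It also shows why the main effect of a school characteristic cannot be estimated in such a model, while its interaction with ESCS can.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data; sizes and parameters are illustrative assumptions, not PISA values.
n_schools, n_per_school = 400, 50
school = np.repeat(np.arange(n_schools), n_per_school)


def shift_to_positive(x):
    """Shift a zero-centred index so that its minimum becomes 0."""
    return x - x.min()


z = shift_to_positive(rng.normal(size=n_schools))        # hypothetical school characteristic
escs = shift_to_positive(rng.normal(size=school.size))   # student-level ESCS index

true_escs, true_interaction = 15.0, -3.0
school_effect = rng.normal(scale=40.0, size=n_schools)   # unobserved school heterogeneity
score = (400.0 + school_effect[school]
         + true_escs * escs
         + true_interaction * escs * z[school]
         + rng.normal(scale=30.0, size=school.size))


def demean_by_school(x, groups):
    """Within transformation: subtract each school's mean from its students' values."""
    means = np.bincount(groups, weights=x) / np.bincount(groups)
    return x - means[groups]


X = np.column_stack([escs, escs * z[school]])
X_within = np.column_stack([demean_by_school(col, school) for col in X.T])
y_within = demean_by_school(score, school)
coef, *_ = np.linalg.lstsq(X_within, y_within, rcond=None)

# The school characteristic itself has no within-school variation left,
# so its main effect is absorbed by the school fixed effects.
z_within = demean_by_school(z[school], school)

print("ESCS:", round(coef[0], 2), "ESCS x characteristic:", round(coef[1], 2))
print("largest within-school deviation of z:", np.abs(z_within).max())
```

Because the school characteristic drops out after demeaning, its association with average performance has to come from a separate regression, as reported at the bottom of the table.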
Focusing again on the top of the table, a positive interaction between the school characteristic and a student’s ESCS indicates that students from advantaged socio-economic backgrounds do even better in schools that have this characteristic compared to their more disadvantaged peers. Such a characteristic is thus correlated with a stronger dependence of educational outcomes on socio-economic status and hence with more inequitable outcomes. An effective equity-enhancing characteristic would have a large, positive and significant coefficient at the bottom of the table (i.e. lifting all students regardless of their socio-economic status) and a negative interaction coefficient, lowering the inequity in the system. The interaction coefficient should not be too large, to ensure that both disadvantaged and well-off students see a net benefit from the respective school characteristic.

Generally, these results should be treated cautiously, because when testing a large number of indicators it is likely that some will have statistically significant coefficients just by chance. Several of the 18 school characteristics in Table 10, for example the proportion of qualified or certified teachers and extra-curricular activity, are significantly correlated with average student performance. However, there is no school characteristic which is correlated both with better average outcomes and with a lower impact of ESCS on PISA math scores. Stronger school leadership and higher responsibility for the curriculum are associated with smaller differences between disadvantaged and well-off students, but these characteristics are not associated with higher test scores in general. The decrease in unequal outcomes between students due to stronger leadership does not, however, come at the expense of better-off students. There are furthermore clearly negative characteristics such as teacher shortages (TCSHORT) in column 17, which harm all students and are associated with more unequal outcomes.
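The caution about testing many indicators can be made concrete with a short calculation. The sketch below assumes that the 18 tests are independent, which is unlikely to hold exactly for correlated school indices, so the numbers are a rough guide only. The count of three interaction terms starred at the 0.1 level (LDRSHP, RESPCURR and TCSHORT) is read off Table 10.

```python
n_tests = 18            # school characteristics screened in Table 10
observed_at_10pct = 3   # interactions starred at the 0.1 level: LDRSHP, RESPCURR, TCSHORT


def expected_false_positives(n, alpha):
    """Expected number of 'significant' results if every null hypothesis were true."""
    return n * alpha


def prob_at_least_one(n, alpha):
    """Chance of at least one false positive among n independent tests."""
    return 1 - (1 - alpha) ** n


for alpha in (0.10, 0.05, 0.01):
    print(f"alpha={alpha:.2f}: expected false positives "
          f"{expected_false_positives(n_tests, alpha):.2f}, "
          f"P(at least one) {prob_at_least_one(n_tests, alpha):.2f}")

# Bonferroni-adjusted threshold that keeps the family-wise error rate at 10 percent
bonferroni_threshold = 0.10 / n_tests
print(f"Bonferroni threshold: {bonferroni_threshold:.4f}")
```

With about 1.8 false positives expected at the 0.1 level, three weakly significant interactions out of 18 are not far from what chance alone could produce.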
The results suggest that, in line with the international literature, school-level quality measures are associated with considerable differences in school outcomes (cf. Hanushek and Woessmann, 2011). Most of these factors, however, do not affect the role of socio-economic background in achievement; rather, they foster or hinder students’ achievements across the board.

[10] In order to interpret the interaction it is important to adjust both the ESCS and the school characteristics to be fully in the positive domain. The PISA indices are however standardized around zero. Therefore, we add the absolute value of each variable’s minimum to it such that the minimum becomes 0.

Table 10: School characteristics correlated with lower or higher dependence of PISA math scores on socio-economic status, full covariates

Part 1

| Specification (school characteristic) | (1) ABGROUP | (2) COMPWEB | (3) IRATCOMP | (4) PCGIRLS | (5) PROPCERT | (6) PROPQUAL | (7) SCHSIZE | (8) SELSCH | (9) STRATIO |
|---|---|---|---|---|---|---|---|---|---|
| ESCS (shifted to positive domain) | 16.65*** (4.878) | 15.40*** (2.488) | 12.68*** (3.322) | 1.664 (19.92) | 18.73*** (6.948) | 5.903 (12.02) | 16.90*** (2.362) | 17.57*** (4.780) | 14.48** (6.751) |
| ESCS * School Characteristic (both shifted to positive domain) | -0.615 (1.860) | -0.234 (3.759) | 6.170 (7.979) | 0.165 (0.248) | -4.424 (6.987) | 8.343 (11.19) | -0.00387 (0.00401) | -0.893 (1.595) | 0.0297 (0.462) |
| Female (0/1) | -4.617** (2.125) | -3.888* (2.174) | -4.443** (2.087) | -4.441** (2.087) | -4.487** (2.200) | -4.487** (2.172) | -4.495** (2.089) | -4.508** (2.110) | -4.566** (2.123) |
| Romanian Speaking (0/1) | -42.73*** (14.91) | -37.13** (14.93) | -41.76*** (15.64) | -42.65*** (14.83) | -42.66*** (14.86) | -42.52*** (14.80) | -42.61*** (14.84) | -42.86*** (14.89) | -42.66*** (14.82) |
| Born abroad (0/1) | 2.830 (9.979) | 2.268 (10.50) | 3.562 (10.12) | 4.022 (10.01) | 4.418 (9.941) | 4.299 (9.867) | 3.970 (9.993) | 3.391 (10.03) | 3.938 (10.04) |
| Migrant background (0/1) | 2.829 (4.809) | 1.574 (5.062) | 2.220 (4.987) | 1.853 (4.812) | 1.627 (4.847) | 1.607 (4.798) | 1.991 (4.812) | 2.239 (4.874) | 1.818 (4.904) |
| Lives without parents (0/1) | -22.97*** (8.392) | -26.57*** (8.203) | -25.07*** (8.446) | -25.34*** (8.212) | -28.23*** (8.460) | -25.96*** (8.684) | -25.17*** (8.221) | -24.81*** (8.159) | -27.81*** (8.516) |
| Lives with single parent (0/1) | 0.322 (3.190) | 1.248 (3.326) | 0.231 (3.233) | 0.343 (3.237) | 0.878 (3.319) | 0.664 (3.251) | 0.270 (3.239) | 0.280 (3.200) | 0.317 (3.293) |
| Unemployed parents (0/1) | -4.680* (2.440) | -4.777* (2.484) | -4.968** (2.392) | -4.994** (2.402) | -3.873 (2.448) | -3.954 (2.457) | -5.001** (2.404) | -5.176** (2.381) | -4.397* (2.425) |
| Potential migrant household (0/1) | 2.011 (9.566) | 5.723 (9.984) | 4.746 (10.04) | 4.542 (9.731) | 7.226 (10.00) | 4.301 (10.28) | 4.414 (9.742) | 3.809 (9.644) | 6.730 (10.07) |
| Constant | 457.5*** (27.91) | 444.6*** (13.91) | 437.8*** (16.95) | 386.2*** (96.87) | 467.9*** (29.64) | 406.4*** (56.35) | 458.7*** (14.40) | 462.1*** (27.58) | 447.1*** (32.29) |
| Observations | 4,549 | 4,448 | 4,690 | 4,718 | 4,567 | 4,575 | 4,718 | 4,676 | 4,666 |
| R-squared | 0.047 | 0.047 | 0.047 | 0.047 | 0.046 | 0.047 | 0.047 | 0.047 | 0.047 |
| Coefficient of school characteristic in separate linear regression§ | 6.307*** (1.632) | 1.831 (3.481) | -2.020 (5.800) | 0.595*** (0.229) | 37.50*** (5.359) | 57.58*** (9.836) | 0.0353*** (0.00361) | 3.078** (1.462) | -1.119** (0.435) |

Notes: Based on PISA 2009+ data. Standard errors (in parentheses) account for the sampling structure of the PISA survey. For details on the different school-level characteristics, please refer to OECD (2009). The asterisks ***, **, and * denote statistical significance at the 0.01, 0.05, and 0.1 levels, respectively. The dependent variable is the set of PISA plausible values for math. §: This is a separate linear regression that indicates which school characteristics correlate positively with PISA scores in the presence of all other individual covariates. Annex 3 gives the meaning of the abbreviations used for the school characteristics.

Part 2

| Specification (school characteristic) | (10) EXCURACT | (11) LDRSHP | (12) RESPCURR | (13) RESPRES | (14) SCMATEDU | (15) STUDBEHA | (16) TCHPARTI | (17) TCSHORT | (18) TEACBEHA |
|---|---|---|---|---|---|---|---|---|---|
| Dependent variable | PISA average | PISA average | PISA average | PISA average | PISA average | PISA average | PISA average | PISA average | PISA average |
| ESCS (shifted to positive domain) | 15.86*** (3.556) | 19.95*** (2.945) | 17.14*** (2.105) | 13.61*** (2.091) | 17.00*** (3.792) | 17.41*** (5.780) | 14.50*** (2.496) | 11.93*** (1.876) | 16.15** (6.608) |
| ESCS * School Characteristic (both shifted to positive domain) | -0.495 (1.628) | -2.723* (1.446) | -3.637* (2.159) | 3.854 (3.888) | -1.318 (2.174) | -0.766 (1.655) | 0.294 (1.303) | 3.243* (1.676) | -0.316 (1.594) |
| Female (0/1) | -4.444** (2.087) | -4.402** (2.087) | -4.427** (2.087) | -4.438** (2.087) | -4.456** (2.086) | -4.381** (2.090) | -4.438** (2.088) | -4.466** (2.105) | -4.426** (2.085) |
| Romanian Speaking (0/1) | -42.78*** (14.77) | -43.42*** (14.81) | -42.89*** (14.84) | -42.73*** (14.83) | -42.67*** (14.83) | -42.68*** (14.82) | -42.68*** (14.83) | -43.55*** (14.79) | -42.73*** (14.73) |
| Born abroad (0/1) | 3.847 (9.985) | 3.860 (9.993) | 3.902 (9.989) | 3.792 (9.997) | 3.909 (10.02) | 3.909 (10.02) | 3.833 (9.992) | 3.849 (10.03) | 3.823 (10.01) |
| Migrant background (0/1) | 1.884 (4.814) | 1.902 (4.812) | 1.842 (4.811) | 1.833 (4.811) | 1.814 (4.827) | 1.823 (4.828) | 1.863 (4.813) | 2.096 (4.866) | 1.838 (4.834) |
| Lives without parents (0/1) | -25.23*** (8.226) | -24.91*** (8.220) | -25.08*** (8.212) | -25.03*** (8.234) | -25.21*** (8.244) | -25.32*** (8.205) | -25.27*** (8.225) | -25.92*** (8.228) | -25.30*** (8.192) |
| Lives with single parent (0/1) | 0.297 (3.226) | 0.287 (3.237) | 0.280 (3.236) | 0.276 (3.242) | 0.307 (3.238) | 0.296 (3.227) | 0.350 (3.246) | 0.354 (3.213) | 0.327 (3.234) |
| Unemployed parents (0/1) | -4.982** (2.402) | -5.059** (2.404) | -4.820** (2.403) | -4.978** (2.401) | -4.966** (2.401) | -4.984** (2.401) | -4.985** (2.402) | -4.977** (2.388) | -4.969** (2.404) |
| Potential migrant household (0/1) | 4.458 (9.744) | 4.112 (9.733) | 4.002 (9.710) | 4.286 (9.725) | 4.503 (9.731) | 4.637 (9.646) | 4.451 (9.747) | 5.063 (9.701) | 4.528 (9.649) |
| Constant | 453.5*** (17.67) | 473.0*** (17.05) | 459.0*** (13.98) | 443.0*** (13.10) | 458.5*** (17.10) | 460.3*** (22.98) | 446.8*** (15.47) | 435.7*** (14.24) | 454.5*** (27.20) |
| Observations | 4,718 | 4,718 | 4,718 | 4,718 | 4,718 | 4,718 | 4,718 | 4,685 | 4,718 |
| R-squared (within) | 0.047 | 0.048 | 0.048 | 0.047 | 0.047 | 0.047 | 0.047 | 0.048 | 0.047 |
| Coefficient of school characteristic in separate linear regression§ | 8.666*** (1.696) | 1.107 (1.442) | -12.37*** (2.018) | -13.95*** (3.809) | 7.276*** (1.715) | 7.528*** (1.190) | -3.039** (1.266) | -7.040*** (1.684) | 4.196*** (1.155) |

Modelling students’ absence

Another measure of learning outcomes which can help to draw a more complete picture is a student’s absence from school. EMIS records the number of hours missed both with and without excuse. Absence is a relevant indicator for education because it can often serve as an early warning sign of students’ problems even before they have an impact on grades. Furthermore, able students with, for example, personal problems may receive grades similar to those of their less able peers, but from a societal perspective their talent is being wasted. EMIS records of absence appear more complete than those of risk factors. For 77 percent of school-cohorts at least one student is recorded as having missed a positive number of hours, either with or without excuse.
The average 4th grader misses 4.2 hours without excuse, the average 9th grader misses 23.3 hours and the average 12th grader misses 19.9 hours. Boys miss between 50 and 60 percent more hours than girls in each grade.

In Table 11 we re-estimate the model with the variables underlying Figure 4 to study absence, accounting for the count data structure of the dependent variables. In columns 1 and 2, a student’s hours of absence with and without excuse are used as dependent variables. Absence without excuse is of course a much more suitable indicator of educational problems. In columns 3 to 5 we study whether its correlation with students’ and schools’ characteristics differs between grades.

The estimates show that boys and girls differ only slightly in absence with excuse, but girls miss about 36 percent fewer hours without excuse than boys [11]. This difference is largest in grade 9, i.e. for children in puberty. The additional intercept terms for grade 4 and grade 12 show that absence without excuse peaks in grade 9 for both genders. At this age children miss about 5.5 times as many hours as their 4th grade counterparts.

The risk factors recorded in EMIS provide a telling picture of the determinants of absence. Students registered as vulnerable miss significantly more hours with excuse (+15%) and without excuse (+27%) than non-vulnerable students. Among vulnerable students, additional risk characteristics further increase the count of missed hours: being an orphan, being ‘at risk’, having unemployed parents or coming from a low-income family all contribute to a higher total number of hours missed. Columns 3 to 5 suggest, however, that both the importance and the sign of these factors differ across grades. For example, the penalty for being ‘at risk’ is sizeable and positive for 9th graders, sizeable and negative for 4th graders, and not significant for 12th graders.
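The percentage differences discussed here follow from exponentiating the Poisson coefficients in Table 11. As a brief illustration (the coefficient values are copied from column 2 of the table, absence without excuse, and the helper names are made up for this sketch):

```python
import math

# Coefficients from Table 11, column (2): absence without excuse, all grades
coefficients = {
    "Female": -0.449,
    "Grade 4 (relative to grade 9)": -1.700,
    "Recorded vulnerable": 0.238,
    "Distance less than 3 km": -0.130,
}


def rate_ratio(coef):
    """Incidence rate ratio implied by a Poisson regression coefficient."""
    return math.exp(coef)


def percent_change(coef):
    """Percentage change in expected hours missed for a one-unit change."""
    return 100.0 * (math.exp(coef) - 1.0)


for name, coef in coefficients.items():
    print(f"{name}: ratio {rate_ratio(coef):.3f}, change {percent_change(coef):+.1f}%")

# Grade 9 relative to grade 4 is the inverse of the grade 4 ratio
grade9_vs_grade4 = 1.0 / rate_ratio(coefficients["Grade 4 (relative to grade 9)"])
print(f"Grade 9 vs grade 4: {grade9_vs_grade4:.2f} times as many hours")
```

For the female coefficient this gives a ratio of about 0.638, i.e. roughly 36 percent fewer hours, and for grade 9 relative to grade 4 a factor of about 5.5.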
Students who have to travel less than 3 km to school miss about 12 percent fewer hours without excuse, indicating that distance affects both grades and absences. Among the school characteristics at the bottom of Table 11, we find a significant, hump-shaped relationship between absence and the student-teacher ratio. The shares of students with special needs or vulnerability, as well as teachers’ qualifications, are not generally significantly correlated with absence without excuse. Finally, urban students tend to be absent less often.

[11] The reported coefficients can be transformed into incidence rate ratios by plugging the respective coefficient into an exponential function, for example, exp(-0.449)=0.638.

Table 11: Random effects Poisson estimates of hours of absence

| | (1) Absence with excuse | (2) Absence without excuse | (3) Absence without excuse | (4) Absence without excuse | (5) Absence without excuse |
|---|---|---|---|---|---|
| Grade | All | All | 4 | 9 | 12 |
| Intercept | 2.205*** (0.408) | 0.906*** (0.434) | 1.454*** (0.772) | 2.068*** (0.563) | 1.306 (1.290) |
| Female | 0.0277*** (0.00189) | -0.449*** (0.00234) | -0.363*** (0.00754) | -0.468*** (0.00299) | -0.404*** (0.00447) |
| Grade 4 | -0.770*** (0.00259) | -1.700*** (0.00403) | | | |
| Grade 12 | 0.0206*** (0.00245) | -0.0253*** (0.00312) | | | |
| Risk factors | | | | | |
| Recorded vulnerable | 0.135*** (0.00808) | 0.238*** (0.00935) | 0.403*** (0.0330) | 0.293*** (0.0110) | 0.0871*** (0.0275) |
| "At risk" | 0.0250*** (0.00741) | 0.344*** (0.00845) | -0.438*** (0.0296) | 0.423*** (0.0102) | -0.0195 (0.0225) |
| Orphan | -0.0673*** (0.0182) | 0.137*** (0.0196) | -1.771*** (0.0976) | 0.156*** (0.0212) | -0.897*** (0.0862) |
| Special educational needs | 0.0384** (0.0177) | 0.00786 (0.0194) | 0.573*** (0.0412) | -0.172*** (0.0239) | -2.473*** (0.317) |
| Unemployed parents | 0.103*** (0.00997) | 0.191*** (0.0111) | 0.776*** (0.0297) | 0.00440 (0.0135) | 0.228*** (0.0350) |
| Family with low income | 0.0360*** (0.00697) | 0.667*** (0.00764) | 0.897*** (0.0285) | 0.702*** (0.00902) | 0.198*** (0.0280) |
| Other factors | | | | | |
| Single parent family | 0.0971*** (0.00700) | 0.0561*** (0.00745) | 0.149*** (0.0254) | 0.00105 (0.00860) | 0.0670*** (0.0218) |
| >2 minors in family | -0.0109 (0.00794) | 0.0751*** (0.00802) | -0.186*** (0.0243) | 0.0715*** (0.00921) | 0.371*** (0.0280) |
| Language of instruction matches | -0.0322** (0.0143) | 0.0749*** (0.0183) | 1.228*** (0.0431) | -0.0106 (0.0247) | -0.119** (0.0519) |
| Parents migrants | -0.0238*** (0.00725) | -0.112*** (0.00816) | -0.0378 (0.0286) | -0.137*** (0.00966) | 0.0568*** (0.0214) |
| Both parents migrants | 0.0471*** (0.0106) | 0.149*** (0.0115) | -0.426*** (0.0449) | 0.0483*** (0.0142) | 0.443*** (0.0233) |
| Distance less than 3 km | -0.0303*** (0.00253) | -0.130*** (0.00333) | -0.278*** (0.0123) | -0.100*** (0.00469) | -0.124*** (0.00562) |
| Urban | 0.655*** (0.126) | -0.211 (0.143) | -0.553*** (0.197) | -0.265* (0.152) | -0.0891 (0.224) |
| School-level variables | | | | | |
| Student-teacher-ratio | 0.230*** (0.0836) | 0.191** (0.0895) | 0.101 (0.166) | 0.344*** (0.126) | 0.410* (0.220) |
| Student-teacher-ratio² | -0.0135*** (0.00422) | -0.00902** (0.00457) | -0.00603 (0.00872) | -0.0179*** (0.00679) | -0.0184* (0.00952) |
| Share of students with special educational need | -0.325 (0.776) | -0.965 (1.003) | 1.451 (1.368) | -2.449* (1.463) | -3.476* (1.848) |
| Share of vulnerable students | 0.536 (0.549) | -0.0885 (0.635) | 0.565 (0.941) | -0.239 (0.663) | -1.010 (1.363) |
| Share of teachers with master's degree or more | 1.251 (1.294) | 0.720 (1.518) | -0.452 (2.117) | -0.278 (1.601) | 2.681 (2.545) |
| Observations | 44,442 | 44,442 | 16,486 | 18,716 | 9,240 |
| Number of schools | 766 | 766 | 661 | 694 | 255 |

Note: Based on EMIS data for the school year 2013-14. *, **, and *** denote statistical significance at the 0.1, 0.05, and 0.01 level, respectively. Standard errors in parentheses. Reported coefficients can be transformed into incidence rate ratios by calculating exp(coefficient).

Risks Associated with the School Network Consolidation Reform

Improving the quality, relevance, and efficiency of the education system is one of the main priorities of the Government of Moldova.
However, the demographic and fiscal realities of the country have not made it easy for the government to fulfill its mandate in education. Over the years, Moldova’s education sector has suffered from uneven education quality and a lack of efficiency. By 2010, Moldova’s school-age population had decreased by over 50 percent since 1991. The school network and staffing levels, however, barely changed. By 2011 the average school operated at 54 percent of the capacity for which it was designed, with much lower capacity utilization in rural areas than in urban ones, leading to wasteful expenditure on heating and public utilities. Spending on education reached 9.4 percent of GDP (twice the regional average) and was not accompanied by an improvement in learning outcomes [12]. The feasibility study conducted by the Ministry of Education in 2010 with World Bank support concluded that up to half of Moldova’s rural schools may need to be reorganized within a five-year timeframe in order to counteract the sharp population decline that has taken place over the last 20 years [13].

In 2011 the Government launched the Structural Reform in the education sector with a view to improving the quality of education through the consolidation of schools and better utilization of human and physical resources. The reform has already produced some positive results. School rationalization has led to an adjustment in the school network, including halting the inefficient dispersion of resources across schools that are too small to function efficiently and with acceptable quality. In 2011-14, central and local authorities closed over 100 educational institutions, most of them (55) in 2012. Keeping the closed schools in operation would have required an additional MDL 357 million from the government in 2013 (an additional 5 percent of total education spending).
Apart from school closures and class consolidation, authorities reorganized a number of lyceums (upper secondary schools) into gymnasiums (lower secondary education institutions) and gymnasiums into primary schools, and created hub schools. Overall, the average class size in the country grew from 19.0 in the 2010 school year to 20.2 in 2013. The student-teacher ratio showed an upward trend (an increase of about 8 percent).

The new per-student financing mechanism also led to a number of positive developments at the school level. According to the results of an assessment among school directors, the transparency of budget allocations improved. The new system provided wider autonomy to schools and incentives to spend funds more efficiently (by consolidating classes and reducing redundant staff).

At the same time, the largest efficiency gains from the ongoing education reforms are made through the reorganization and closure of schools in rural areas, where most socio-economically disadvantaged students live and study. The reform brings the opportunity to provide these children with a better education in larger and better-resourced schools, but it also creates risks for the attendance and performance of students, in particular poor and marginalized ones, as a result of their transition to a new school environment. The analysis of risks associated with the school network consolidation reform (dropouts and absenteeism) is thus of paramount importance.

[12] For example, in TIMSS or PISA (see Moldova Education Strategy 2020 for details).
[13] The rural educational institutions were divided into 4 groups: Group 1: clear-cut cases of schools to close down without impairing access to education; Group 2: clear-cut cases of schools not to close down because they have a sufficient number of pupils or the demographic situation in the respective community is favorable to further operation of such institutions; Group 3: clear-cut cases of schools not to close down because access to education would be impaired; and Group 4: not clear-cut cases, i.e. schools with an uncertain future. According to the situation in 2010, Group 1 included 283 educational institutions: 43 primary schools, 212 gymnasiums and 28 lyceums.

The analysis does not produce causal estimates. It is limited by the non-availability of pre-reform student records and by endogenous program placement, i.e. the fact that schools’ and students’ observed and unobserved characteristics were considered when deciding which schools or classes to close. This is likely to introduce omitted variable bias, because it is unlikely that students from closed schools are exactly comparable to their peers in unaffected schools.

The consolidation of the school network can affect students in several ways in the short run (summarized in Böhme, 2012). Students might, for example, have to travel longer distances to school, which could make them more likely to drop out or to perform poorly. They may attend better-run or better-funded schools. However, their presence could have negative spill-overs on their new classmates, for example if their new schools became overcrowded and could not cope with the influx of new students. Furthermore, the new arrivals may not be fully integrated at their new schools, or may be stigmatized as coming from worse-performing schools and thus mistaken for inherently worse students.

In the absence of pre-reform student records, one has to rely on cross-sectional EMIS information. The EMIS records contain the reason why a particular student has joined a particular school. Using these records, we distinguish three groups: students who joined the school because their previous school was closed, students who joined a school because their class was closed, and all other students.
Overall, according to the EMIS records, 2.0 percent of students in grades 4, 9, and 12 were affected by school closures and a further 0.74 percent were affected by closures of their class, as Table 10 shows. Note that with these variables only the short-term effect of the school reform can be analyzed. The share of students affected by school closures is about 3 percent for 4th and 9th graders but only 1.1 percent among 12th graders. The share of students affected by the closure of their particular class is significantly higher for older students. For 8.4 percent of students the information is not available, and they are excluded from the analysis from here on. Below we first study dropouts (for the working definition see Annex 1) and then briefly analyze how the reform has affected absenteeism.

Table 10: Reason for attending particular school

Motive | Grade 4 | Grade 9 | Grade 12 | Total
Standard scenarios
Enrolled in this school in 1st grade | 21,310 | 20,892 | 5,944 | 48,146
Came in 5th grade from a primary school (where only 4 classes) | 17 | 1,814 | 967 | 2,798
Came in 10th grade of a secondary school (where only 9 classes) | 6 | 22 | 3,255 | 3,283
Was transferred from another school, where there is such a class | 1,562 | 3,469 | 2,173 | 7,204
Affected by consolidation reform
Came from closed school | 629 | 705 | 140 | 1,474
Came from school where class was closed | 60 | 278 | 200 | 538
Other reasons
Came from abroad | 171 | 232 | 129 | 532
Came to school after long illness | 5 | 6 | 0 | 11
Came to school because the family raised money | 2 | 3 | 5 | 10
Came to school because the family received social welfare | 3 | 6 | 0 | 9
Other | 496 | 970 | 1,250 | 2,716
Incomplete records
Missing information | 2,286 | 2,575 | 1,226 | 6,087
Total | 26,547 | 30,972 | 15,289 | 72,808

Notes: Based on EMIS data for the school year 2013-14.

Dropouts

The most important negative consequence the school reform could have had is an increase in dropouts. However, the decisions on school closures took into account the availability of nearby schools to which affected students could go.
Any effect could therefore be expected to be small. The reason for dropping out is recorded in EMIS, but schools are unlikely to report that students are dropping out because of a reform. Hence, indirect evidence has to be used. Using EMIS we can analyze dropouts during the school year. EMIS does not cover students who failed to register at any school after the summer break, in particular if their school had been closed. In total, there were 366 recorded dropouts in grades 1 to 9 during the school year 2013-14, making up slightly above 0.1 percent of students. For 295 of these the motive for attending the particular school, which is our way of distinguishing who was affected directly by the school reform, is recorded in EMIS. For the remaining 19.4 percent this record is missing, making it impossible to infer whether these students were affected by the school reform. Excluding these students from the analysis may thus understate the effect: the true school reform effect is higher than reflected by our analysis if a more than proportional share of students whose records are incomplete had been subject to the school reform. Among the students with complete records, 15 out of 295 came from schools that were either closed (11) or in which the respective classes were closed (4). Among the students in grade 4 and grade 9, the share of dropouts coming from reform-affected schools is more than proportional. The records indicate that 5 out of 86 dropouts, or 5.8 percent, were students who came from a closed or reorganized school. The comparable share for other students is 3.2 percent. Such a difference of two students can easily be caused by differences in the average socio-economic background of students from reformed and non-reformed schools. Given the very small sample and the data quality, it makes little sense to study this in a multivariate framework.
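To illustrate why a difference of two students is hard to distinguish from chance, the following Python sketch computes an exact binomial tail probability. It assumes, as a simplification that is not part of the original analysis, that each of the 86 dropouts in grades 4 and 9 independently comes from a closed or reorganized school with probability 0.032 (the comparable share quoted above). The function name `binomial_upper_tail` is hypothetical.

```python
from math import comb


def binomial_upper_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))


# Figures quoted in the text: 5 of 86 dropouts came from closed or
# reorganized schools; the comparable share for other students is 3.2%.
n_dropouts = 86
n_from_reformed = 5
baseline_share = 0.032  # assumption: used here as the null probability

expected = n_dropouts * baseline_share
p_value = binomial_upper_tail(n_dropouts, n_from_reformed, baseline_share)

print(f"expected under baseline: {expected:.2f} students")
print(f"P(at least {n_from_reformed} of {n_dropouts}): {p_value:.3f}")
```

Under these assumptions the expected count is just under three students and the one-sided tail probability stays above 0.1, which is consistent with the cautious reading of the dropout figures.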
The most conservative conclusion from the available data is that the school reform has a negligible effect on dropouts. For the next steps of the analysis we work with the (crucial) assumption that the effect on dropouts among observed 4th and 9th graders was basically zero.

Absenteeism

Table 11 reports fixed effects estimates of the correlates of school absence. Such regressions estimate differences based only on the differences between students within a specific school (which is most appropriate when analyzing the effect of distance to school and school closures). Systematic differences in the outcome and explanatory variables between schools are thus evened out and the estimates come from the variation of students around the school-cohort-specific mean. As the outcome variable we use the number of hours of absence per year, $A_{ij}$, of student $i$ in school $j$. The model includes being from a closed school, being from a class that was closed, individual characteristics such as gender and risk factors denoted by $X_{ij}$, the school fixed effect $\mu_j$ to capture the (conditional) average absence at a school (to account, for example, for systematic data quality issues at particular schools), and the error term $\varepsilon_{ij}$:

$$A_{ij} = \mu_j + \beta_1 (\text{closed school})_{ij} + \beta_2 (\text{closed class})_{ij} + \gamma X_{ij} + \varepsilon_{ij}$$

The results show which characteristics can explain the differences between students attending the same cohort in one school. Column 1 reports the number of hours of absence with excuse and column 2 that without excuse for the whole sample. Columns 3-5 report absence without excuse for grades 4, 9, and 12 separately. The results suggest that being from a closed school or being from a closed class does not significantly affect the number of hours of absence with excuse. Students from closed schools, however, miss on average 28.4 percent more hours without excuse. This is driven by 4th and 9th graders, who are absent about 46 and 32 percent more hours, respectively, than their classmates whose school has not been closed.
12th graders from closed schools are not significantly more absent, again showing a clear sign of positive self-selection. As we cannot control for the characteristics of students who are not recorded as vulnerable by their teachers, these results should be treated cautiously. Only if one can show (or is willing to assume) that the absenteeism of students from closed schools was very similar to that of their peers before the reform can these estimates be interpreted as a proxy for the effect of the school reform. Additional tests in which we interact a student’s status as coming from a closed school with each of the risk factors show no clear pattern. This is at least partly due to the fact that few students are both from closed schools and labelled vulnerable. Hence, one can infer very little about which risk factors exacerbate potential effects of school closure on absence. Also, because of the lack of students’ individual pre-reform grades, it is not possible to conclusively explore the effect of the reform on students’ grades. Thus, relying on a number of assumptions about data quality, the school reform has a negligible effect on dropouts. At the same time, after controlling for a number of individual characteristics and post-reform school differences, it appears that students from closed schools miss more lessons without excuse, with possible subsequent implications for their performance. Therefore, it is important to continue close monitoring of student absenteeism in schools receiving students from closed or reorganized schools.
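The percentage figures above read the coefficients in Table 11 directly as proportional differences. Assuming the log-link (Poisson) specification mentioned in the note to Table 11, a coefficient b corresponds exactly to a proportional change of exp(b) - 1, which is close to b only for small values. The Python sketch below applies that conversion to the "From closed school" coefficients for absence without excuse; the helper name `percent_change` is hypothetical.

```python
from math import exp


def percent_change(coef: float) -> float:
    """Exact proportional change, in percent, implied by a log-link coefficient."""
    return 100.0 * (exp(coef) - 1.0)


# "From closed school" coefficients for absence without excuse (Table 11).
closed_school_coefs = {"all grades": 0.284, "grade 4": 0.462, "grade 9": 0.327}

for label, coef in closed_school_coefs.items():
    approx = 100.0 * coef         # direct reading of the coefficient
    exact = percent_change(coef)  # exp(b) - 1 conversion
    print(f"{label}: direct reading {approx:.1f}%, exact conversion {exact:.1f}%")
```

For positive coefficients the exact conversion gives somewhat larger differences than the direct reading, and the gap widens with the size of the coefficient, so the direct reading is best treated as an approximation.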
Table 11: Linear fixed effects regression of absence accounting for school fixed effects

Variable | (1) Absence with excuse | (2) Absence without excuse | (3) Absence without excuse | (4) Absence without excuse | (5) Absence without excuse
Grade | All | All | 4 | 9 | 12
Female | 0.0306** (0.0134) | -0.418*** (0.0216) | -0.409*** (0.0484) | -0.426*** (0.0305) | -0.400*** (0.0338)
Reform indicators
From closed school | 0.0575 (0.0614) | 0.284*** (0.0791) | 0.462** (0.183) | 0.327*** (0.0959) | 0.0311 (0.181)
From reformed school | 0.0773 (0.0798) | 0.0102 (0.151) | -0.427 (0.428) | 0.116 (0.208) | -0.176 (0.175)
Risk factors
Recorded vulnerable | 0.115* (0.0615) | 0.325*** (0.0985) | 0.510** (0.204) | 0.387*** (0.117) | -0.0222 (0.177)
“At risk” | 0.0647 (0.0603) | 0.339*** (0.0991) | -0.367 (0.239) | 0.422*** (0.118) | 0.0400 (0.116)
Orphan | -0.0491 (0.112) | -0.188 (0.184) | -1.853*** (0.599) | -0.116 (0.206) | -0.210 (0.471)
Special educational needs | 0.0481 (0.139) | 0.270 (0.200) | 0.752*** (0.259) | 0.130 (0.262) | -1.057 (0.822)
Unemployed parents | 0.0420 (0.0828) | 0.189* (0.110) | 0.741*** (0.223) | 0.0703 (0.130) | 0.194 (0.202)
Family with low income | 0.0450 (0.0463) | 0.624*** (0.0763) | 0.705*** (0.215) | 0.613*** (0.0883) | 0.106 (0.257)
Single parent family | 0.0675 (0.0553) | -0.0164 (0.0788) | 0.277 (0.211) | -0.0552 (0.0900) | 0.0517 (0.172)
Other factors
>2 minors in family | 0.000212 (0.0524) | 0.0569 (0.0865) | -0.0166 (0.190) | 0.0527 (0.0997) | 0.276 (0.174)
Language of instruction matches | 0.00147 (0.122) | 0.0619 (0.202) | 1.128** (0.543) | -0.138 (0.184) | 0.166 (0.228)
Parents migrants | 0.00283 (0.0528) | -0.121 (0.0827) | -0.144 (0.201) | -0.132 (0.0984) | 0.143 (0.160)
Both parents migrants | 0.0298 (0.0737) | 0.0765 (0.101) | -0.446* (0.234) | -0.0154 (0.110) | 0.419* (0.214)
Distance less than 3 km | -0.0273 (0.0188) | -0.114*** (0.0357) | -0.269*** (0.0946) | -0.0580 (0.0591) | -0.151*** (0.0431)
Observations | 52,660 | 47,777 | 14,712 | 22,257 | 10,808
Number of schools | 1,896 | 1,713 | 590 | 807 | 316

Note: Based on EMIS data for the school year 2013-14.
Estimates are from a Poisson panel data model with fixed effects at the cohort level. Estimates are based only on within variation. Schools which did not report absence within a cohort are dropped. Also, no grade-specific effects can be estimated. The “Urban” dummy is automatically dropped because it has too little within-variation to be included. *, **, and *** denote statistical significance at the 0.1, 0.05, and 0.01 level, respectively. Standard errors clustered at the school level are in parentheses.

Figure 5: Distribution of math grades in EMIS records (rounded down)

Figure 6: Average grades of students from closed schools and other students by subject and grade (Panel A: Romanian; Panel B: Romanian (non-native); Panel C: Math; Panel D: Foreign languages)

Conclusions and Policy Recommendations

Analysis of how learning outcomes are distributed throughout a school system, and of the risks associated with the school network consolidation reform (dropouts and attendance in particular), provides a number of valuable policy insights that could guide efforts to ensure the delivery of quality teaching and learning across the entire education system.

Key findings and policy implications:

• Students’ average performance in the PISA test depends strongly on their socio-economic status, and the most disadvantaged students do not attain even the baseline proficiency levels in reading, math and science. While Moldovan students from the top 60 percent are on average as well-off as their international PISA peers [index of economic, social and cultural status (ESCS): 0.07], the bottom 40 percent are significantly more disadvantaged by international standards (ESCS: -1.51). The ESCS index indicates that 93.4 percent of students who participated in PISA internationally are better off than the average Moldovan student from the bottom 40 percent.
As to their performance, the bottom 40 percent of students according to the ESCS index have an average reading score 45 points below that of their less disadvantaged classmates, and in math and science the gaps are 46 and 43 points respectively (equivalent to slightly more than a year of schooling). The most disadvantaged Moldovan students are particularly weak in the “reflect and evaluate” and “non-continuous text” tasks and do not attain the PISA baseline proficiency in reading, math and science. These students risk completing their studies without acquiring the skills and competencies needed to fully participate in society and continue learning throughout their lives. EMIS grades are similarly substantially lower for vulnerable students, especially in the 4th and 9th grades and to a lesser extent in the 12th grade (as better-off students are more likely to continue to upper secondary education).
• Descriptive analysis reveals substantial performance gaps between schools, between rural and urban areas, and between genders; understanding these differences is important for designing relevant interventions. While schools with few disadvantaged students can be expected to score close to the international average of 500 points in the PISA test, schools with a large majority of disadvantaged students can be expected to have an average score of less than 400 points, corresponding to a 2.5 year gap in educational development. Girls and boys score similarly on average in math, but boys do worse than girls in science (13 points difference on average, representing a third of a year) and especially in reading (47 points difference on average, or slightly more than a year of schooling). Gender overall explains about 5 percent of the variation in reading grades, suggesting the need for monitoring and gender-sensitive interventions in the relevant subjects (such as improvements in the curriculum through a gender lens and/or catch-up classes for boys, in particular those from disadvantaged backgrounds).
Rural students lag behind their urban peers by an average of 60 points in reading, 55 in math and 48 in science, corresponding to a 1.2 to 1.5 year gap in educational development. However, the urban-rural divide is not significant once other factors are controlled for in a multivariate framework, suggesting that targeted interventions should focus on low-performing schools with a high share of disadvantaged students rather than on their location. Determinants of students’ achievement interact and add up, causing very significant differences between the educational outcomes of the best and worst performing students. For example, urban children who do not belong to the bottom 40 percent of the ESCS do particularly well, while rural students from a disadvantaged socio-economic background do particularly poorly. The differences between these groups are striking: in reading, 75 points for girls (almost 2 years of schooling) and 82 points for boys (more than 2 years of schooling).
• Performance differences in Moldova from the socio-economic perspective are largely between schools, suggesting that targeted interventions should focus on disadvantaged schools rather than on individual vulnerable students. About a third of the variation in PISA scores is linked to school-level differences. Much of the explained variation comes from school socio-economic composition, with one standard deviation accounting for 39-42 additional PISA points. Variation in students’ socio-economic background can explain between 5 and 10 percent of the variation in grades, with one standard deviation in the ESCS index accounting for 14-15 additional PISA points. Performance differences in Moldova are thus largely between schools rather than within schools, which suggests that in targeted operations focusing on schools with a high share of disadvantaged students might be more advisable than targeting individual students. This might also be expedient for financial reasons.
• Despite large achievement gaps, Moldova scores average internationally in terms of the impact of socio-economic background on PISA scores, suggesting some room for improvement in education equality. Similarly (and even more) inequitable outcomes can be found in rich countries such as New Zealand or France, which have a good PISA performance. In some other countries, however, socio-economic background has less of an impact on educational performance, which suggests some potential for reducing education inequality; such a reduction on its own, if considered in terms of learning opportunities, 'has the potential to produce quick gains in economic and social welfare' (Porta et al, 2011; OECD, 2012b). International evidence also suggests that substantial progress in student performance can be achieved in diverse social settings with moderate economic resources. This is of particular importance for Moldova, in light of its already high spending on education, efforts to make the school system more efficient, and existing social disparities.
• Student absenteeism is prevalent among children at risk, suggesting that close monitoring of absenteeism for this group is important. The student risk factors provide a telling picture of the determinants of absence. Vulnerable students miss more hours, with and without excuse. Among vulnerable students, the impact of additional risk factors depends on the grade level. Absence without excuse peaks, as expected, in grade 9 for both genders. Students who have to travel less than 3 km to school miss about 13 percent fewer hours, which underlines the importance of continued monitoring of attendance and dropouts in relation to the school network consolidation process using EMIS. Also, girls miss about 40 percent fewer hours without excuse than boys. Urban students miss fewer hours without excuse than their rural peers.
• As to the risks of the school network consolidation, the reform has so far had a negligible effect on dropouts, but the attendance of students from reformed schools is a concern and should be carefully monitored. After controlling for a number of individual characteristics and post-reform school differences, students from closed schools tend to miss more lessons without excuse (with possible subsequent implications for student performance). Therefore, it is important to continue careful monitoring of student absenteeism in schools receiving students from closed or reorganized schools to mitigate the relevant risks.

Main policy recommendations:

Moldova should aim to improve the quality of education and learning opportunities for all students, equitably across the entire education system. There is room for reducing education inequality in the country, and international evidence suggests that substantial improvements in education quality, together with a reduction in the share of poorly performing students and in the variability of results by student socio-economic background, are possible and lead to significant economic and social gains. Delivering equal learning opportunities to all pays off: countries with very different economic conditions and social settings have demonstrated the ability to raise the quality of educational outcomes substantially and equitably despite existing social disparities (and with moderate economic resources).
The challenges ahead are considerable in light of (i) the overall level of education system performance (as evidenced by the fact that around 60 percent of Moldova's 15-year-olds lack the basic levels of proficiency in reading and math, and by the fact that large performance gaps exist between students with different socio-economic backgrounds, across genders, schools, and urban and rural areas); (ii) the need to equip as many students as possible, Moldova's future workforce, with at least the baseline competencies and skills that enable them to fully participate in social and economic life and continue learning throughout their lives; (iii) the complexity of the task itself, that is, the need to prepare students for the new realities of a globalized world and for dealing with more rapid change than ever before (‘for jobs that have not yet been created, for new technologies and challenges that will appear’, OECD: 2010b); and (iv) the complex nature of the various education policies that need to be aligned, integrated and maintained over a sustained period of time. In light of the existing gender gaps in student performance, there is a need for relevant interventions, including a gender-sensitive review of and improvements to the reading curriculum and/or catch-up classes for boys, in particular those from disadvantaged backgrounds. It is important to target the struggling disadvantaged schools and hub schools where investment of educational resources can potentially make the greatest difference. The analysis shows that targeted interventions should focus on low-performing schools with a high share of disadvantaged students in both urban and rural areas, including hub schools receiving students from closed or reorganized schools. Turning around these schools can give impetus to quality enhancement across the whole system.
A number of interventions could be considered to help disadvantaged schools improve, including: (i) strengthening and supporting school leadership, including through mentoring from experienced head teachers, training and other support; (ii) attracting and motivating high-quality teachers to work in hard-to-staff disadvantaged schools (teacher shortages are currently an issue, in particular in science and math, as evidenced by this report and the Moldova SABER-Teachers report); (iii) offering struggling schools extra funding responsive to the needs of the most disadvantaged students and schools; and (iv) strengthening accountability mechanisms, among many others. Turning around low-performing schools with a high share of disadvantaged students requires strong leadership. Therefore, preparing and developing effective school leaders is the starting point of the transformation process.

Better quality of EMIS data and its use for evidence-based education policy making are important. Education reforms should drive change on the basis of good evidence. Moldova has made substantial investments in creating valuable educational datasets, including EMIS and PISA. At the same time, it is important to strengthen the quality and reliability of data collected through EMIS by improving data collection procedures and processes, quality controls and information flows, and by developing and implementing data validation mechanisms. Strengthening the statistical and analytical capacity of the Ministry of Education to analyze relevant data is also of paramount importance for evidence-based education policy making and timely adjustments of the ongoing education reforms. EMIS is also important for continuous monitoring of attendance and absenteeism of children at risk and students from closed or reorganized schools, as mentioned above.

The country’s school system requires stronger accountability mechanisms. Moldova has provided wider autonomy to schools in managing financial and human resources.
The school boards and leadership now have much more control over the way resources are used, people are deployed, work is organized and work gets done. PISA results internationally show that increased school autonomy tends to be closely linked to school performance only if there are effective accountability mechanisms at the school level. Therefore, equal access to learning opportunities must be accompanied by proper accountability mechanisms. Good data and capacity building for its analysis, mentioned above, are also important for implementing open data initiatives that encourage citizens’ engagement and oversight of the reforms, for promoting an enabling environment for social accountability, and for supporting the efforts of the Government to build a modern, cost-effective and high-quality education sector.

Literature

Albright, Jeremy J. and Dani M. Marinova (2010). Estimating Multilevel Models using SPSS, Stata, SAS, and R, mimeo.
Ammermüller, Andreas, Hans Heijke and Ludger Wößmann (2003). Schooling Quality in Eastern Europe: Educational Production During Transition, Research Centre for Education and the Labour Market, ROA-RM-2003/2E, Maastricht, March 2003.
Böhme, Marcus (2012). Results framework for the evaluation of an education sector reform in Moldova, mimeo.
Bradshaw, J., Ager, R., Burge, B. and Wheater, R. (2010). PISA 2009: Achievement of 15-Year-Olds in England. Slough: NFER.
Capita, Irina (2012). The Impact Of School Size On Educational Outcome: The Case Of Moldova, Kyiv School of Economics Thesis.
Coupe, Tom, Anna Olefir and Juan Diego Alonso (2011). Is Optimization an Opportunity? An Assessment of the Impact of Class Size and School Size on the Performance of Ukrainian Secondary Schools, World Bank Policy Research Working Paper No. 5879.
Hanushek, Eric, John Kain, Jacob Markman and Steven Rivkin (2003). Does Peer Ability Affect Student Achievement? Journal of Applied Econometrics 18: 527–544.
Hanushek, Eric A.
and Ludger Wößmann (2007). Education Quality and Economic Growth, World Bank Publication, 27 pages.
Hanushek, Eric A. and Ludger Woessmann (2011). The Economics of International Differences in Educational Achievement, in the Handbook of the Economics of Education, Elsevier.
Lücke, Matthias and T. Stöhr (2012). The Effects of Migration in Moldova and Georgia on Children and Elderly Left Behind - Country Report: Moldova.
MDHS (2006). Moldova Demographic and Health Survey 2005. Calverton, Maryland: National Scientific and Applied Center for Preventive Medicine of the Ministry of Health and Social Protection and ORC Macro.
OECD (2009). PISA Data Analysis Manual: SPSS and SAS, Second Edition, ISBN: 9789264056244.
OECD (2010a). PISA 2009 Results: What Students Know and Can Do – Student Performance in Reading, Mathematics and Science (Volume I), http://dx.doi.org/10.1787/9789264091450-en
OECD (2010b). PISA 2009 Results: Overcoming Social Background – Equity in Learning Opportunities and Outcomes (Volume II), http://dx.doi.org/10.1787/9789264091504-en
OECD (2011). PISA 2009 at a Glance, OECD Publishing, Paris. DOI: http://dx.doi.org/10.1787/9789264095298-en
OECD (2012a). Untapped Skills: Realizing the Potential of Immigrant Students, OECD Publishing, http://dx.doi.org/10.1787/9789264172470-en
OECD (2012b). Equity and Quality in Education: Supporting Disadvantaged Students and Schools, OECD Publishing.
Porta E. et al. (2011). Assessing Sector Performance and Inequality in Education. The World Bank.
Tkhoryk, Oleg (2011). School Size As A Determinant Of Educational Performance In Transition Countries, Kyiv School of Economics Thesis.
Walker, Maurice (2011). PISA 2009 Plus Results: Performance of 15-year-olds in reading, mathematics and science for 10 additional participants. Melbourne: ACER Press.

Annex 1: Description of data

In this report we rely on two data sources.
The first is the 2009 OECD Program for International Student Assessment (PISA) and the second is the Moldovan Education Management Information System (EMIS). PISA is a test of the performance of 15-year-old students in reading, math and science. The test is supplemented by detailed student-level and school-level questionnaires, which are filled out by students and school principals respectively. Moldova participated in the so-called PISA 2009+ round, which took place after the main 2009 test round, which was administered in 65 countries (Walker, 2011). The PISA assessment is based on a nationally representative subsample of Moldovan students. In total, the data cover 5,194 students from 186 schools. All of these students were born in 1994. Internationally, PISA scores vary around an average of about 500 points, with a standard deviation of 100 points. Moldovan students achieved averages of around 400 points, depending on the subject. This indicates that many children have severe shortfalls compared to students in richer OECD countries. The vast majority of Moldovan students reach only the lower three of six levels of competency, and more than half of 15-year-old students are considered functionally illiterate. This result places Moldova approximately at the level that could be expected for a similarly economically developed participant country. Our second data source, EMIS, is a data collection tool used by the Ministry of Education to record information at the beginning and the end of each school year. Following the regulatory and legal framework changes regarding protection of personal data and the wish of the Ministry of Education, EMIS was not applied uniformly in each school year. Student-level data is entered by the head teacher of each class based on the information in the student’s personal files. As the Ministry of Education does not fully control the situation in schools, this can lead to missing information or incorrect entries.
Comparing the data collected by the National Bureau of Statistics and that collected by EMIS reveals a divergence in the main indicators which characterize the general education system (number of students in each school, number of students in each grade level, number of boys and girls, etc.), indicating that data entry errors do not exceed 3 percent. EMIS covers students in both compulsory (typically grades 1-9) and non-compulsory (grades 10-12) education. Students who have reached the age of 16 are not considered dropouts even if they leave during the school year. The EMIS records of dropouts we work with in this report thus cover all students who in 2013-14 were included in the school registers as enrolled in one of the grades 1-9, who left school during the school year without returning and who had not reached the age of 16 at the end of the school year. For each school subject, EMIS allows the collection of annual marks in all grades, the results of the national test at the end of grade 4, the results of the national exam at the end of grade 9 and the results in the baccalaureate exams at the end of grade 12. The tests for the 4th, 9th and 12th end-of-grade assessments are developed by the Ministry of Education and are identical for all the schools in the country.
However, the test administration differs: the 4th and 9th grade tests are administered by the schools themselves, while the 12th grade tests are administered by special examination centers created by the Ministry of Education. To ensure comparability between schools, we rely only on national exams. Due to data availability, we work only with data from the school year 2013-14. We thus work with the universe of Moldovan students in grades 4, 9, and 12 who took the exams and are recorded in EMIS. The oldest of these students thus were typically born in 1998. They hence do not correspond directly to the students who took part in PISA 2009+. EMIS also contains basic individual characteristics of students such as gender and mother tongue. Furthermore, it records several risk factors, covering orphan status, having unemployed parents, having parents working abroad, being from a low-income family, etc., that we work with in detail in this report. Until 2013 the term “vulnerability” did not have an official definition; students were included in this category based on the assessments of the class head teachers. Since 2013, the term “vulnerability” has been replaced with “pupils at risk” and “pupils with special educational needs”. The precise definition of these terms is based on government regulation.

Annex 2: Methodological approach

The most important determinants of students’ educational outcomes lie at the individual and the school level. Relative to other students in a classroom, individual observed and unobserved ability and motivation are key determinants of educational outcomes. Other student-level determinants are the socio-economic background and support by parents, which strictly speaking occur at the family level. This, however, can be ignored, because we typically do not analyze siblings in this report and, for reasons of privacy, could not identify them anyway.
Learning in a classroom with other students, children are however affected by the characteristics and the behavior of their classmates (peer effects). If, for example, students attend classes with many low-ability students, their own performance in school may be negatively affected. There are furthermore important school-level determinants that affect everyone in a classroom (contextual effects). If students have a particularly good teacher, all of them will have the chance to improve their outcomes. Such school- or classroom-level effects are extremely difficult to separate and doing so is beyond the scope of this report. We will analyze them jointly and, where necessary, eliminate them econometrically.

Hence, it is important to model both the individual and the school level. We do this by relying on mixed models. Mixed models allow formulating different levels at which effects on educational outcomes occur. At each level observed characteristics can be included and unobserved differences can be modelled and accounted for. Furthermore, these models are sufficiently general to allow researchers to test whether observed or unobserved model components interact at different levels.

As a starting point into mixed model analyses in this paper and to assess the importance of school-level variation, we use a model without covariates, the so-called “empty model”. Denoting the student level by $i$ and the school level by $j$, this can be written $y_{ij} = \gamma_{00} + u_{0j} + e_{ij}$, where $\gamma_{00}$ is a general intercept, $u_{0j}$ is a random school-level effect and $e_{ij}$ the error term. To this, covariates can be added at both the student level ($X_{ij}$) and the school level ($Z_j$). Combining the two levels $y_{ij} = \beta_{0j} + \beta X_{ij} + e_{ij}$ and $\beta_{0j} = \gamma_{00} + u_{0j} + \gamma Z_j$, we estimate the model $y_{ij} = \gamma_{00} + u_{0j} + \gamma Z_j + \beta X_{ij} + e_{ij}$. From this relatively simple two-level specification one can depart in order to test a multitude of hypotheses. We do this selectively depending on the context. One important way is to allow the influence of unobserved school characteristics to vary by socio-economic background.
For a model with only one covariate at the individual level this is done by specifying the random slope $\beta_{1j} = \gamma_{10} + u_{1j}$. Then, $y_{ij} = \gamma_{00} + u_{0j} + \gamma Z_j + (\gamma_{10} + u_{1j}) x_{1ij} + e_{ij}$. To make the text more accessible, we do not specify each tested model in the main body of the text. Instead we describe them verbally. Using the technical background outlined here, the reader should thus be able to infer the technical details of results that are mentioned only briefly.

PISA

The PISA 2009+ tests were conducted using a nationally representative, two-stage stratified sample. Schools were sampled at the first stage and students within these schools at the second. Analyses of the PISA data should take into account this sampling structure. As advised by the OECD (2009) we use Fay’s variant of the balanced repeated replication method. The PISA methodology furthermore uses plausible values which help achieve comparability of students’ ability in the face of several complications. For example, not every student works on exactly the same questions in the test, but a measure of ability should allow comparison of students across different versions of the test. The construct is meant to reflect plausible values of ability of a student in each of the three subjects, thus the name. In the analysis we account for both the sampling structure and the use of plausible values. We conduct our analyses in Stata, for which a user-written package called “pv” that accounts for both is available.

EMIS

EMIS data cover the universe of Moldovan students in the respective grades. Therefore no adjustments for sampling, such as weights, are necessary. Data quality is however an issue. There are clear signs that not all schools properly report students’ data. This is clearest with respect to risk factors. By design, students who are not at risk and students for whom there are no records both have empty cells in the EMIS database. If there were full reporting, the absence of risk records for students would indicate that students are not vulnerable.
The fact that many schools however do not label a single student vulnerable indicates that some class teachers in fact misreport students with vulnerabilities as not vulnerable. Unfortunately, there are probably also class teachers who genuinely do not perceive any students as vulnerable and who therefore deliberately did not enter data. Simply excluding all classes without any vulnerable students would thus be wrong. Hence, we treat these variables econometrically as subject to considerable measurement error. Due to the scarcity of other variables at the student level, which is understandable given privacy concerns, there is no way to impute student-level data or assess the true extent of measurement error.

As measurement error due to misreporting is likely to occur mostly at the class or school level, we take into account differences at the school or cohort (school-class) level. This is mostly done by making comparisons within schools rather than between students across schools. If a school generally fails to report risk factors, using identification based on variation in risk factors around the school mean is a more robust strategy for inference than working in the pooled cross-section of all students. In the chapter on school reform we therefore use fixed effects panel regressions where, contrary to the standard use of panel data models, we do not follow individuals over time but rather analyze each school’s individual students within one year. This means that estimated effects are not based on schools that do not report data on a particular variable at all. If the respective students from these schools vary in the outcome variable, they thus only contribute to a higher share of unexplained variance. On a more general level, though, if risk factors are not sufficiently reported at other levels, for example by only some teachers in each school, estimated effects will be biased towards zero. The true effect will thus be larger in absolute value than the estimated effect.
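The within-school comparison described above can be illustrated with a minimal Python sketch. It is not the estimation code used for this report (the analyses were run in Stata); the function names and the small data set are hypothetical and chosen only to show the mechanics. The slope is computed from deviations around each school's mean, so a school that reports no risk factor at all (school "C" below) contributes nothing to the estimate, while the pooled slope also picks up differences between schools.

```python
from collections import defaultdict


def pooled_slope(records):
    """OLS slope of y on x, ignoring which school a student attends."""
    xs = [x for _, x, _ in records]
    ys = [y for _, _, y in records]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def within_school_slope(records):
    """Fixed-effects (within) slope: x and y are demeaned by school first."""
    by_school = defaultdict(list)
    for school, x, y in records:
        by_school[school].append((x, y))
    sxy = 0.0
    sxx = 0.0
    for rows in by_school.values():
        mean_x = sum(x for x, _ in rows) / len(rows)
        mean_y = sum(y for _, y in rows) / len(rows)
        for x, y in rows:
            sxy += (x - mean_x) * (y - mean_y)
            sxx += (x - mean_x) ** 2
    if sxx == 0:
        raise ValueError("no within-school variation in x")
    return sxy / sxx


# Hypothetical data: (school, risk-factor dummy, exam grade).
# In schools A and B the grade is one point lower for students at risk.
# School C reports no student at risk, although grades vary.
records = [
    ("A", 0, 8.0), ("A", 0, 8.0), ("A", 0, 8.0), ("A", 1, 7.0),
    ("B", 0, 6.0), ("B", 1, 5.0), ("B", 1, 5.0), ("B", 1, 5.0),
    ("C", 0, 7.5), ("C", 0, 6.5), ("C", 0, 7.0),
]

print("pooled slope:", pooled_slope(records))
print("within-school slope:", within_school_slope(records))
```

In this constructed example the pooled slope is more negative than the within-school slope because the school with more students at risk also has a lower overall grade level; the within-school slope is unaffected by that difference and by the non-reporting school.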
In some analyses we are interested in the role of factors which do not vary at the student level but rather at the class or school level. In this case, fixed effects panel regressions are not useful. Instead we use random effects panel models or standard mixed models. These rely on rather strong assumptions regarding the role of student-level heterogeneity. Most crucial is the assumption that the unobserved heterogeneity at the level of the panel variable (here: school or cohorts within schools) is uncorrelated with the observed characteristics at the same level. A violation would lead to inconsistent results. The size of the bias introduced by this violation depends on the strength of the correlation between unit effects and covariates. The direction of the bias depends, among other things, on the sign of this correlation. This should be kept in mind when reading the results based on this assumption.

Depending on the dependent variable, we assume an approximately normal distribution (grades) or count data characteristics (absence). Accordingly, we switch between linear models and Poisson-type models.

Annex 3: Overview of important variables

PISA

Student-level:

PISA scores: Based on five “plausible values” for each core subject of the test. Main outcome variables. When working with plausible values, the statistical methodology has to be adjusted accordingly.
Repetition: Covers whether the student repeated grades at ISCED levels 1, 2, or 3. Coding: 1 "No, never" 2 "Yes, once" 3 "Yes, twice or more". The variable was constructed from the ISCED-level specific data.
ESCS: PISA Index of Economic, Social, and Cultural Status. Approximately normally distributed around -0.56. The minimum and maximum are -4.7 and 2.4. Positive values are considered a socio-economically advantaged background and negative ones are considered economically disadvantaged.
Bottom40: Dummy, 1 if the student is among the bottom 40 percent according to ESCS.
Female: Dummy, 1 if the student reports being female.
Romanian speaking: Dummy, 1 if the student reports speaking Romanian at home. Based on the imputed language variable that comes with the dataset.
Born abroad: Dummy, 1 if the student reported not being born in Moldova.
Migrant background: Dummy, 1 if the student was born abroad and/or at least one parent was born abroad.

In the PISA database, there is no way to clearly define who is an orphan, because this was not covered by the questionnaire. However, there are several alternatives with potential explanatory power.

Without parents: Dummy, 1 if both mother and father do not usually live at home with the student.
Potential migrant household: Dummy, 1 if both mother and father do not usually live at home but are looking for work or working elsewhere. Thus, these are potentially migrant households with both parents abroad, although it is not clear where exactly the parents are.
Single parents: Dummy, 1 if one of the parents usually lives with the student and the other parent usually does not. This group could also comprise migrant families with just one migrant parent or, for example, divorced parents.

School-level:

STR: Student-teacher ratio at the school level.
SC_(…): Variables that hold the school-level means of the respective variable.
ABGROUP: Index of ability grouping between classes, covering whether students are grouped by ability in no, in any, or in all subjects.
COMPWEB: Proportion of computers connected to the Internet.
IRATCOMP: Ratio of computers to school size.
PCGIRLS: Proportion of girls in the school.
PROPCERT: Proportion of certified teachers.
PROPQUAL: Proportion of qualified teachers.
SCHSIZE: Total school enrolment.
SELSCH: Index of academic school selectivity, covering whether schools admit students based on academic performance and recommendation from the students’ former schools.
STRATIO: Student-teacher ratio.
Part-time teachers are weighted 0.5, full-time teachers 1.
EXCURACT: Index based on how many of 13 extra-curricular activities are offered by the school.
LDRSHP: Index of school leadership, proxying how frequently principals are involved in a set of 14 school affairs such as monitoring and ensuring quality of education.
RESPCURR: Index of school responsibility for curriculum and assessment. Based on how much control the principal and teachers have over establishing student assessment policies, textbook choice, course content, and the menu of courses on offer.
RESPRES: Index of school responsibility for resource allocation. Based on how much control schools have over six different school management decisions such as hiring teachers and setting their starting salaries.
SCMATEDU: Index of the quality of the school’s educational resources, based on seven factors that may hinder education at the school, for example the lack of (adequate) lab equipment, computers, library material etc.
STUDBEHA: Index of six proxies of student behavior, based on whether learning is hindered by students’ absenteeism or skipping classes, disruption of class, lack of respect for teachers, use of drugs, and intimidation of other students.
TCHPARTI: Index of teacher participation across 12 kinds of responsibilities.
TCSHORT: Based on the principal’s assessment of whether teacher shortage hinders learning, specifically a lack of qualified teachers in science, mathematics, Romanian and Russian, and any other subjects.
TEACBEHA: Index of seven teacher behavior proxies, based on whether learning is hindered by teachers’ lack of expectations and encouragement of students, absenteeism, resistance to change, strictness, not meeting students’ needs and poor student-teacher relations.

EMIS

Student-level:

Vulnerability: Dummy based on the value of the variable "SituatieDeRisc". Vulnerability indicators are only recorded when the class teacher indicated that the respective child is at risk.
Until 2013 this had no formal definition. Since then there have been specific guidelines as to what constitutes risk.
At risk: Dummy based on the value of the variable "InStareDeRisc". It indicates that the child is registered in the “Register of Children at Risk”, which is kept by the social worker in the community.
Gender, being an orphan, having special educational needs, having unemployed parents, being from a family with low income, being from a single parent family, coming from a family with more than two minors, having to travel more than three kilometers to school, and attending a school with matching language of instruction are taken directly from EMIS.
Unemployed parents: Dummy variable, 1 if at least one parent is unemployed.
Parents migrants/both parents migrants: Two separate dummy variables created from the number of parents who are abroad. Parents migrants takes the value 1 if at least one parent is a migrant.

School-level:

Student-teacher ratio is calculated using the number of didactic staff (CadruDidactic) who are covered in EMIS per school and the number of students (taken from the “Raport Statistic (anual)” of the NBS, which covers all students). The number of students with special educational needs is also taken from NBS data. All other school-level variables are based on EMIS.

Annex 4: Additional tables

Performance of bottom 40% by several categories

| ESCS | Category | % of students | Reading | Std. Error | Math | Std. Error | Science | Std. Error |
|---|---|---|---|---|---|---|---|---|
| Top 60% | Female | 28.3 | 432 | 3.4 | 416 | 3.5 | 438 | 3.4 |
| Top 60% | Male | 30.8 | 385 | 3.4 | 418 | 3.5 | 425 | 3.2 |
| Bottom 40% | Female | 20.4 | 384 | 3.2 | 370 | 3.7 | 396 | 4.0 |
| Bottom 40% | Male | 20.5 | 340 | 3.7 | 372 | 4.3 | 380 | 3.8 |
| Top 60% | Rural | 25.7 | 376 | 4.5 | 388 | 4.8 | 407 | 4.7 |
| Top 60% | Urban | 33.4 | 432 | 4.5 | 439 | 4.6 | 450 | 3.8 |
| Bottom 40% | Rural | 30.9 | 353 | 3.8 | 363 | 4.6 | 381 | 4.5 |
| Bottom 40% | Urban | 10.0 | 390 | 5.3 | 396 | 4.6 | 408 | 5.0 |
| Top 60% | Lower School Size | 25.1 | 386 | 5.0 | 401 | 5.2 | 415 | 5.0 |
| Top 60% | Higher School Size | 34.0 | 423 | 4.7 | 429 | 4.9 | 443 | 4.2 |
| Bottom 40% | Lower School Size | 27.5 | 353 | 4.0 | 365 | 4.9 | 381 | 4.9 |
| Bottom 40% | Higher School Size | 13.5 | 381 | 6.1 | 383 | 6.4 | 403 | 6.6 |
| Top 60% | Lower Student Teacher Ratio | 30.2 | 412 | 5.1 | 423 | 5.0 | 433 | 4.8 |
| Top 60% | Higher Student Teacher Ratio | 29.2 | 403 | 5.7 | 412 | 5.6 | 429 | 5.4 |
| Bottom 40% | Lower Student Teacher Ratio | 22.0 | 364 | 4.7 | 376 | 5.5 | 388 | 5.8 |
| Bottom 40% | Higher Student Teacher Ratio | 18.7 | 360 | 4.4 | 367 | 4.8 | 389 | 4.7 |
| Top 60% | Less School Leadership | 30.3 | 403 | 5.3 | 417 | 5.3 | 427 | 4.8 |
| Top 60% | More School Leadership | 28.8 | 412 | 4.9 | 418 | 5.0 | 436 | 4.6 |
| Bottom 40% | Less School Leadership | 22.5 | 354 | 4.2 | 367 | 5.3 | 380 | 5.5 |
| Bottom 40% | More School Leadership | 18.4 | 372 | 4.9 | 376 | 5.5 | 398 | 5.0 |
| Top 60% | Worse Disciplinary Climate | 30.0 | 403 | 3.2 | 412 | 3.5 | 426 | 3.3 |
| Top 60% | Better Disciplinary Climate | 29.6 | 415 | 3.7 | 425 | 3.8 | 439 | 3.4 |
| Bottom 40% | Worse Disciplinary Climate | 20.5 | 357 | 3.7 | 363 | 4.2 | 382 | 4.1 |
| Bottom 40% | Better Disciplinary Climate | 20.0 | 374 | 3.8 | 383 | 4.6 | 402 | 4.8 |
| Top 60% | Worse Student Teacher Relations | 40.7 | 407 | 3.4 | 418 | 3.6 | 430 | 3.2 |
| Top 60% | Better Student Teacher Relations | 18.8 | 413 | 3.5 | 420 | 3.7 | 437 | 3.4 |
| Bottom 40% | Worse Student Teacher Relations | 28.3 | 361 | 3.1 | 370 | 3.7 | 387 | 3.6 |
| Bottom 40% | Better Student Teacher Relations | 12.2 | 376 | 4.3 | 380 | 4.8 | 401 | 4.3 |
| Top 60% | Higher Preparation for Adult Life | 51.3 | 416 | 3.0 | 423 | 3.2 | 438 | 2.9 |
| Top 60% | Lower Preparation for Adult Life | 8.1 | 370 | 7.3 | 390 | 6.7 | 403 | 6.1 |
| Bottom 40% | Higher Preparation for Adult Life | 34.3 | 371 | 3.0 | 377 | 3.6 | 395 | 3.6 |
| Bottom 40% | Lower Preparation for Adult Life | 6.2 | 334 | 5.5 | 352 | 5.3 | 368 | 5.0 |
| Top 60% | Lower Quality Resources | 29.9 | 408 | 6.0 | 418 | 5.5 | 433 | 5.1 |
| Top 60% | Higher Quality Resources | 29.2 | 407 | 5.5 | 416 | 5.6 | 429 | 5.3 |
| Bottom 40% | Lower Quality Resources | 19.9 | 362 | 4.6 | 371 | 4.8 | 387 | 4.9 |
| Bottom 40% | Higher Quality Resources | 21.1 | 363 | 5.0 | 371 | 6.0 | 389 | 5.6 |
| Top 60% | Less Assessment | 40.6 | 409 | 3.8 | 417 | 4.2 | 432 | 3.7 |
| Top 60% | More Assessment | 18.5 | 404 | 7.1 | 417 | 6.1 | 429 | 6.6 |
| Bottom 40% | Less Assessment | 28.2 | 363 | 4.0 | 372 | 4.2 | 389 | 4.5 |
| Bottom 40% | More Assessment | 12.7 | 360 | 5.1 | 369 | 5.9 | 387 | 5.9 |
| Top 60% | Fewer Extracurricular Activities | 29.0 | 395 | 5.2 | 407 | 5.1 | 424 | 5.0 |
| Top 60% | More Extracurricular Activities | 30.1 | 420 | 5.5 | 427 | 5.7 | 438 | 4.9 |
| Bottom 40% | Fewer Extracurricular Activities | 28.2 | 360 | 4.1 | 370 | 4.8 | 387 | 4.9 |
| Bottom 40% | More Extracurricular Activities | 12.8 | 368 | 5.8 | 373 | 6.0 | 390 | 5.3 |
| Top 60% | Less Reading for Joy | 30.2 | 396 | 3.4 | 414 | 3.5 | 424 | 3.2 |
| Top 60% | More Reading for Joy | 28.9 | 420 | 3.4 | 421 | 3.7 | 439 | 3.4 |
| Bottom 40% | Less Reading for Joy | 20.9 | 350 | 3.7 | 368 | 4.1 | 381 | 4.0 |
| Bottom 40% | More Reading for Joy | 20.0 | 376 | 3.4 | 375 | 4.0 | 396 | 4.0 |
| Top 60% | Less Online Reading | 22.4 | 378 | 3.3 | 378 | 3.3 | 408 | 3.5 |
| Top 60% | More Online Reading | 37.0 | 427 | 3.4 | 427 | 3.4 | 446 | 3.0 |
| Bottom 40% | Less Online Reading | 31.8 | 360 | 3.2 | 360 | 3.2 | 386 | 3.6 |
| Bottom 40% | More Online Reading | 8.8 | 378 | 3.9 | 378 | 3.9 | 401 | 4.4 |

Note: Top 60% and Bottom 40% indicate whether the student belongs to the top 60% or bottom 40% of Moldovan PISA participants according to the ESCS index. Disaggregation based on student characteristic ST04Q01 "Gender" and school characteristics SC04Q01 "School Community", SCHSIZE "School Size", STRATIO "Student Teacher Ratio", LDRSHP_School "Leadership", DISCLIMAD "Disciplinary Climate", STUDREL "Teacher Student Relations", the percentage of measures adopted in questions SC16a-h, EXCURACT "Extracurricular activities", and student characteristics JOYRED "Joy Like Reading" and ONLNREAD "Online Reading". Distinctions "Better/Worse", "More/Less" or "Higher/Lower" indicate that the variable was split at its median.
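The median split mentioned in the note can be sketched in Python as follows. This is only an illustration: the function name, the labels and the example values are hypothetical, and the rule that values equal to the median fall into the lower group is an assumption, since the note does not say how ties were handled. The example also shows that the two groups need not be of equal size when many observations share the median value.

```python
from statistics import median


def median_split(values, low_label="Lower", high_label="Higher"):
    """Label each value relative to the median of all values.

    Assumption (not stated in the note): values equal to the
    median are assigned to the lower group.
    """
    cut = median(values)
    return [low_label if v <= cut else high_label for v in values]


# Hypothetical school sizes, for illustration only.
school_sizes = [120, 340, 340, 340, 980]
labels = median_split(school_sizes)
print(labels)
```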