Report No. 19066-PE

Peru
Education at a Crossroads: Challenges and Opportunities for the 21st Century
(In Two Volumes) Volume II: Background Notes and Appendices

December 30, 1999

Human Development Department
Bolivia, Paraguay and Peru Country Management Unit
Latin America and the Caribbean Region

Document of the World Bank

ACRONYMS AND ABBREVIATIONS

ADE        Áreas de Desarrollo Educativo (Education Development Areas)
AE         Área de Ejecución (Area of Execution)
AFP        Administración de Fondo de Pensiones
APP        Authorized Pensionable Position
CIAS       Comité Interministerial de Asuntos Sociales (Inter-ministerial Committee of Social Affairs)
CORDELICA  Corporación de Desarrollo de Lima y Callao
CTAR       Consejo Transitorio de Administración Regional (Transitional Council of Regional Administration)
DRE        Director Regional de Educación
FONAVI     Fondo Nacional de Vivienda (National Housing Fund)
FONCODES   Fondo Nacional de Compensación y Desarrollo Social (Social Fund)
GRADE      Grupo de Análisis para el Desarrollo
INEI       Instituto Nacional de Estadística e Informática
INFES      Infraestructura Nacional para Educación y Salud (National Infrastructure for Education and Health)
IPSS       Instituto Peruano de Seguro Social (Peruvian Institute of Social Security)
IST        Institutos Superiores Técnicos (Higher Technical Institutes)
ISP        Institutos Superiores Pedagógicos (Higher Institutes of Pedagogy)
MECEP      Proyecto para Mejoramiento de la Calidad de la Educación Primaria
MED        Ministerio de Educación (Ministry of Education)
MEF        Ministerio de Economía y Finanzas (Ministry of Economy and Finance)
MINSA      Ministerio de Salud (Ministry of Health)
OECD       Organization for Economic Cooperation and Development
ONP        Oficina de Normalización Previsional (Pension Office)
PLANMED    Planning Unit in MED
PROMUDEH   Ministerio de Promoción de la Mujer y del Desarrollo Humano (Ministry for the Promotion of Women and Human Development)
PRES       Ministerio de la Presidencia (Ministry of the Presidency)
USE        Unidades de Servicios Educativos (Educational Service Units)
UNESCO     United Nations Educational, Scientific and Cultural Organization

Exchange Rates (1997): Soles 2.66 = US$1
Fiscal Year: January 1 to December 31
School Year: April 1 to December 31 (180 days/year)

Vice Presidents: Shahid Javed Burki (through June 30, 1999); David de Ferranti (from July 1, 1999)
Country Director: Isabel Guerrero
Lead Economist: Ernesto May
Sector Director: Xavier Coll
Education Sector Manager: Jamil Salmi
Lead Specialist in Human Development: Donald Winkler
Country Sector Leader: Evangeline Javier
Task Team Leader: Kin Bing Wu

TABLE OF CONTENTS: VOLUME II

BACKGROUND NOTES

1. The Structure of Education
2. Income Elasticity of Demand for Education and Engel's Curve
    Table 1: Determinants of Household Budget Shares
    Table 2: Elasticity Estimates from Engel's Curves
3. Private and Social Returns to Public Education in Urban Peru
    Table 1: Earnings Functions Coefficients
    Table 2: Linear Hypothesis on Regression Coefficients
    Table 3: Estimated Educational Premiums
    Figure 1: Earnings by Age and Educational Level, Females
    Figure 2: Earnings by Age and Educational Level, Males
4. Determinants of Achievement
    Table 1: Descriptive Statistics of Student-Level Variables Used in the HLM Model
    Table 2: Descriptive Statistics of School-Level Variables Used in the HLM Model
    Table 3: Descriptive Statistics of Department-Level Variables Used in the HLM Model
    Table 4: Effects of Student Characteristics on Student Outcomes
    Table 5: Effects of School Characteristics on School Mean
    Table 6: Cross-Level Effects of School Characteristics on Mathematics Achievement Slopes
    Table 7: Extent to which Variation in Math Achievement Is Accounted for by Student-Level Characteristics and the Variation in True School Mean Mathematics Achievement Is Accounted for by School-Level Factors
    Table 8: Correlation Matrix
    Table 9: Effects of Departmental Characteristics on the Grand Mean of Math Test Scores
    Table 10: Final Three-level Model for Average Math Achievement with Interaction
    Table 11: The Extent to which Mathematics Achievement is Accounted for by Student, School, and Department Level Characteristics
5. Teacher Education and Professional Development
    Table 1: A Comparison of Old and New Pilot Curriculum in Teacher Education

APPENDICES

1. Student Enrollment Statistics
    1.1. Enrollment in Formal and Nonformal Education (Disaggregated by Minors and Adults) in Public Institutions by Level, 1990-1997
    1.2. Enrollment in Formal and Nonformal Education (Broadly Grouped) in Public Institutions by Level as Percentage of Total, 1990-1997
    1.3. Enrollment in Formal and Nonformal Education (Disaggregated by Minors and Adults) in Private Institutions by Level, 1990-1997
    1.4. Enrollment in Formal and Nonformal Education (Broadly Grouped) in Private Institutions by Level as Percentage of Total, 1990-1997
    1.5. Total Enrollment in Formal and Nonformal Education (Disaggregated by Minors and Adults) in Public and Private Institutions, 1990-1997
    1.6. Public Enrollment by Level and by Department, 1997
2. Teacher Statistics
    2.1. Teachers in Formal and Nonformal Education (Disaggregated by Minors and Adults) in Public Institutions by Level, 1990-1997
    2.2. Teachers in Formal and Nonformal Education (Broadly Grouped) in Public Institutions by Level as Percentage of Total, 1990-1997
    2.3. Teachers in Formal and Nonformal Education (Disaggregated by Minors and Adults) in Private Institutions by Level, 1990-1997
    2.4. Teachers in Formal and Nonformal Education (Broadly Grouped) in Private Institutions by Level as Percentage of Total, 1990-1997
    2.5. Total Teachers in Formal and Nonformal Education (Disaggregated by Minors and Adults) in Public and Private Institutions by Level, 1990-1997
    2.6. Teacher-to-Student Ratio in Formal and Nonformal Education (Disaggregated by Minors and Adults) in Public Institutions by Level, 1990-1997
    2.7. Teacher-to-Student Ratio in Formal and Nonformal Education (Disaggregated by Minors and Adults) in Private Institutions by Level, 1990-1997
    2.8. Teachers by Level and by Department, 1997
    2.9. Teacher to Student Ratio by Level and by Department, 1997
    2.10. Student Enrollment and Teachers in Public Pedagogical Institutes by Region, 1997
    2.11. Student Enrollment and Teachers in Private Pedagogical Institutes by Region, 1997
    2.12. Changes in Student Enrollment and Student to Teacher Ratios in Public Pedagogical Institutes by Region, 1997
    2.13. Change in Student Enrollment and Student to Teacher Ratios in Private Pedagogical Institutes by Region, 1997
3. School Statistics
    3.1. Public Schools for Formal and Nonformal Education (Disaggregated by Minors and Adults) by Level, 1990-1997
    3.2. Public Schools for Formal and Nonformal Education (Broadly Grouped) by Level as Percentage of Total, 1990-1997
    3.3. Private Schools for Formal and Nonformal Education (Disaggregated by Minors and Adults) by Level, 1990-1997
    3.4. Private Schools for Formal and Nonformal Education (Broadly Grouped) by Level as Percentage of Total, 1990-1997
    3.5. Total Public and Private Schools for Formal and Nonformal Education (Disaggregated by Minors and Adults) by Level, 1990-1997
    3.6. Total Public and Private Schools for Formal and Nonformal Education (Broadly Grouped) by Level as Percentage of Total, 1990-1997
4. Indicators of Equity and Efficiency
    4.1a. Rural Gross Enrollment Ratio by Gender, Age, and Consumption Quintile, 1997
    4.1b. Urban Gross Enrollment Ratio by Gender, Age, and Consumption Quintile, 1997
    4.1c. Rural Net Enrollment Ratio by Gender, Age, and Consumption Quintile, 1997
    4.1d. Urban Net Enrollment Ratio by Gender, Age, and Consumption Quintile, 1997
    4.2a. Rural Public Gross Enrollment Ratio by Gender, Age, and Consumption Quintile, 1997
    4.2b. Urban Public Gross Enrollment Ratio by Gender, Age, and Consumption Quintile, 1997
    4.2c. Rural Public Net Enrollment Ratio by Gender, Age, and Consumption Quintile, 1997
    4.2d. Urban Public Net Enrollment Ratio by Gender, Age, and Consumption Quintile, 1997
    4.3a. Rural Private Gross Enrollment Ratio by Gender, Age, and Consumption Quintile, 1997
    4.3b. Urban Private Gross Enrollment Ratio by Gender, Age, and Consumption Quintile, 1997
    4.3c. Rural Private Net Enrollment Ratio by Gender, Age, and Consumption Quintile, 1997
    4.3d. Urban Private Net Enrollment Ratio by Gender, Age, and Consumption Quintile, 1997
    4.4a. Simulation 1 - Distribution of Public Expenditure by Consumption Quintile, 1997
    4.4b. Simulation 2 - Distribution of Public Expenditure by Consumption Quintile, 1997
    4.4c. Simulation 3 - Distribution of Public Expenditure by Consumption Quintile, 1997
    4.4d. Simulation 4 - Distribution of Public Expenditure by Consumption Quintile, 1997
    4.4e. Simulation 5 - Distribution of Public Expenditure by Consumption Quintile, 1997
    4.5. Water and Sanitation in Public and Private Schools by Age and Income Group, 1994
    4.6. Typology of Urban and Rural Schools, Based on School Characteristics, Infrastructure, Equipment and Other Resources, Principals' and Teachers' Characteristics and Perceptions, 1994
    4.7. Internal Efficiency of Public Education (Primary and Secondary) in Peru (Average 1994 to 1996)
5. International Comparison of Between-School Variation in Achievement
    5.1. International Comparison: Between-School Variation in Achievement in Selected Countries
    5.2. International Comparison: Between-School Variance in IEA International Study on Reading, 1990
6. Public Expenditure on Education
    6.1. Gross Domestic Product, Total Government Expenditure, and Total Public Expenditure on Education, 1970-1997 (Million Soles in Current Prices)
    6.2. Gross Domestic Product, Total Government Expenditure, Total Public Expenditure on Education, and Tax Revenue of Central Government, 1970-1997 (Million Soles in Constant 1997 Prices)
    6.3. Gross Domestic Product, Total Government Expenditure, and Total Public Expenditure on Education, 1970-1997 (Million US dollars at the 1997 Exchange Rate)
    6.4. Recurrent and Capital Expenditure on Education, 1990-1997 (Constant 1997 Soles)
    6.5. Public Expenditure on Education by Budgetary Entities, 1990-1997
    6.6. Functional Composition of Public Expenditure on Education According to Pre-1997 Classification, 1990-1996
    6.7. Reclassified Functional Composition of Public Expenditure on Education According to the 1997 Classification, 1990-1997
    6.8. Functional Composition of Public Expenditure on Education by Budgetary Entities, 1995-1997
    6.9. Public Expenditure on Education by Level, 1990-1997
    6.10. Per Student Recurrent Public Expenditure by Level, 1990-1997
    6.11. Recurrent Public Expenditure by Level, by Function and by Department from Central Government Allocation, 1997
    6.12. Recurrent Public Expenditure by Level, by Function and by Department from Own Resources, 1997
    6.13. Recurrent Public Expenditure by Level, and by Department from Other Sources, 1997
    6.14. Total Public Expenditure on Education by Level, and by Department from All Sources of Funding, 1997
    6.15. Departmental Revenues from Central Government Allocation as a Percentage of Total, 1997
    6.16. Department's Own Resources as a Percentage of Total, 1997
    6.17. Other Resources as a Percentage of Total, 1997
    6.18. Per Student Recurrent Expenditure by Level and by Department, 1997 (Soles)
    6.19. Teachers Salary Scale (July 1990-August 1997) (Soles in Current Prices)
7. Household Expenditure on Education
    7.1. Average Household Expenditures on Pre-Primary Education by School Type, 1997 (Soles Per Child)
    7.2. Average Household Expenditures on Primary Education by School Type, 1997 (Soles Per Student)
    7.3. Average Household Expenditures on Secondary Education by School Type, 1997 (Soles Per Student)
    7.4. Average Household Expenditures on Tertiary Nonuniversity Education by School Type, 1997 (Soles Per Student)
    7.5. Average Household Expenditures on University Education by School Type, 1997 (Soles Per Student)
    7.6. Average Household Expenditures on Education by Education Level, 1997 (Soles Per Student)
    7.7. Average Household Expenditures on Education by School Type, 1997 (Soles Per Student)
    7.8. Total Household Expenditures on Pre-Primary Education by School Type, 1997 (Soles)
    7.9. Total Household Expenditures on Primary Education by School Type, 1997 (Soles)
    7.10. Total Household Expenditures on Secondary Education by School Type, 1997 (Soles)
    7.11. Total Household Expenditures on Tertiary Nonuniversity Education by School Type, 1997 (Soles)
    7.12. Total Household Expenditures on University Education by School Type, 1997 (Soles)
    7.13. Total Household Expenditures on Education by Education Level, 1997 (Soles)
    7.14. Total Household Expenditures on Education by School Type, 1997 (Soles)
8. Population Projection
    8.1. Assumptions of Population Projection
    8.2. Population by Single Years of Age for Selected Age Ranges and Years, 1995-2020
    8.3. Projected School-Age Population, 1995-2020
9. External Support for Education Since 1990
10. Selected Indicators for International Comparison
    10.1. Educational Expenditure as a Percentage of GDP for all Levels of Education Combined, by Source of Funds (1995)
    10.2. Educational Expenditure as a Percentage of GDP for Primary and Secondary Education, by Source of Funds (1995)
    10.3. Educational Expenditure as a Percentage of GDP for Tertiary Education, by Source of Funds (1995)
    10.4. Educational Expenditure from Public and Private Sources for Educational Institutions as a Percentage of GDP by Level of Education (1995)
    10.5. Index of the Change between 1990 and 1995 in Public and Private Expenditure on Education, by Level of Education (1990=100)
    10.6. Educational Expenditure on Primary and Secondary Education by Resource Category for Public and Private Institutions (1995)
    10.7. Educational Expenditure on Tertiary Education by Resource Category for Public and Private Institutions (1995)

Background Notes

Background Note 1. The Structure of Education

The existing structure of education comprises four levels: initial, primary, secondary, and tertiary. (See Appendices 1.1-1.2 for enrollment in public institutions, Appendices 1.3-1.4 for enrollment in private institutions, and Appendix 1.5 for total enrollment.) Initial education is offered in daycare for those under the age of 3 and in kindergartens for those between the ages of 3 and 5. There is also a nonformal system of initial education. It is estimated that about 20 percent of those under 5 receive some form of initial education (INEI, 1997). In 1997, 522,000 children enrolled in initial education in the public system, and 147,000 in private organizations. In a recent proposal for restructuring education, one year of initial education is to be made compulsory and form part of basic education.

Primary education comprises six grades, intended for the age group between 6 and 11, but also available to adults who have not received it. In 1997, about 3.7 million persons enrolled in public formal and nonformal programs, and 491,000 in private programs. The majority of primary schools are coeducational, and the program of study comprises 25 hours per week during 36 weeks per year (900 hours per annum).

Secondary education is offered to the age group between 12 and 16, as well as to adults who did not receive it. In 1997, about 1.6 million enrolled in public secondary schools, and 318,000 in private schools. Secondary education is organized in two cycles: the first has a common curriculum for all students in Grades 7 to 8, and the second has a diversified curriculum of three years, divided into science and humanities streams.
Secondary education is offered at 36 hours per week for 38 weeks in a year (1,368 hours per annum).

Tertiary education includes nonuniversity and university education. Nonuniversity institutions include teacher training institutions (institutos superiores pedagógicos, or ISPs for short), technical education institutions (institutos superiores técnicos, or ISTs), and schools for the arts. In 1997, 211,000 students attended public universities, and 129,000 private universities. Another 165,000 students enrolled in public nonuniversity tertiary institutions, and 139,000 in private institutions.

In 1997, MED proposed major changes in the structure of the system, with the aims of improving the articulation between levels, meeting the needs of a changing labor market, and improving system efficiency and organizational flexibility. It pledged to universalize one year of initial education, improve the quality of primary education, reduce secondary education from five to four years, and introduce two years of preparatory course work (bachillerato), which will provide the transition to tertiary education or to the world of work. In other words, basic education will comprise 11 years of instruction: one year of pre-school, six years of primary, and four years of secondary education.

What is new is not only the structural change but also the introduction of certifications of study at three levels: at the end of basic education, bachillerato, and tertiary nonuniversity education, respectively. Accreditable capacities of basic education will include: (a) comprehension of reading, editing, communication, and expression; (b) development of logic and mathematics; (c) management of the basics of technology and informatics; (d) facility for continuous learning and holistic reasoning; (e) creativity and imagination; (f) understanding of environment; (g) local, national, and universal culture; (h) basic work and organizational abilities; and (i) basic knowledge of an international language.
Accreditable capacities of the bachillerato will include: (a) productive use of resources (time, space, skills, and technology); (b) abilities to search and select information; (c) facility for analysis, synthesis, abstraction, and systematization; (d) proficiency in an international language; (e) intermediate professional competency; and (f) tools for management and self-employment. Students will be certified after having had no less than 2,500 hours of studies in tertiary non-university education.

The proposed bachillerato is divided into two streams: (a) scientific and technological, and (b) scientific and humanistic. The former will prepare for studies in engineering, medicine, mathematics, the sciences and accounting in universities, and technical courses in tertiary non-university education. The latter will prepare for studies in law, education, the social sciences and humanities in universities, and tourism, graphic arts, translation, catering, and public relations in other tertiary education. In each stream, there will be a core curriculum and other subjects that prepare for the world of work. The core curriculum is shared by both streams and includes science and technology, earth science, oral and written communication, economics and management, informatics, history of Peru, natural philosophy, and international language.

Bachillerato can be offered by (a) secondary colleges as add-ons to their four years of secondary education, (b) universities before the beginning of undergraduate studies, (c) postsecondary institutes before the beginning of two years of tertiary education, and (d) academic institutes specialized just in offering bachillerato. This ambitious plan requires investment in infrastructure, curriculum development, and teacher training.

The implementation of bachillerato is sequenced as follows. In 1997, the proposed structural change was made public; the modernization of the secondary curriculum began; the transitional fifth year of secondary schooling was elaborated; and the bachillerato curriculum was proposed. Subsequently, a law was promulgated to give the structural change legal force; the new curriculum and the training of principals and teachers were piloted; the development and distribution of education materials in secondary education were initiated; a new administrative system was set up; and infrastructure was planned. Thereafter, a second application of the transition curriculum in the fifth year of secondary was implemented, teachers were trained, and institutions were equipped for the implementation of bachillerato, with follow-up and monitoring. Full-scale implementation is expected to begin in 2000, affecting 200,000 young people each year. The following year will see the first batch of graduates from bachillerato. The effort to revamp the education system is expected to come to fruition in 2007.

Background Note 2. Income Elasticity of Demand for Education and Engel's Curve [1]

The share of household expenditures for education was analyzed using an Engel equation framework. The explanatory variables include the logarithm of income (here proxied by total expenditure), the logarithm of the size of the household, and a set of variables intended to capture the gender and age composition of the household (with age brackets set up to correspond to the various levels of education in Peru). The explanatory variables also include a dummy for residence in the Lima metropolitan area, and three variables indicating, respectively, the education level of the household head, whether or not the household head is male, and whether or not the household head belongs to an indigenous group. The focus of analysis in this section is the expenditure variable, but the other variables are included as "control" variables, so that the coefficient estimates reported in this section are not biased.
In other words, it is important to be sure that what we call the effect of income is indeed the effect of income, and not, say, mainly the effect of the education of the household head.

In addition to estimating the Engel function for education expenditure, we also provide estimates for expenditures on food, on health, and on other items. The object of the analysis is to compute income elasticities for each of the budget shares. It is an empirically established fact that the income elasticity of the food share is negative, because poor households need to spend larger shares on food, but an a priori judgment cannot be made about the income elasticities of the other budget shares. In particular, it is important to compare the income elasticity for education with that for health. Concretely, the analysis estimates β, the slope on income in the budget share regression, and the corresponding income elasticity, which gives the percentage by which the budget share goes up for a given percentage increase in income.

The estimates from the Engel function analysis are presented in Table 1. It can be seen from the table that the average budget share for education is 0.0467 and the coefficient on log total expenditure is 0.0128. The respective values for health-related expenditures are 0.0411 for the average budget share and 0.0151 for the coefficient on log total expenditure. The income effect for food has the expected negative sign, and the coefficient on log total expenditure for the food share is of the same order of magnitude as reported for other countries. It is of interest to note that the dummies for Lima, rural location, female head, and indigenous head of household are economically and statistically insignificant in the education share regression. Some of these dummy variables are important in the food share regression, such as the 0.1315 effect of a rural location.
The lack of significance of the dummy variables for the education share, in contrast with their significance for the food share, tells an important story about the stability of preferences for education across households that differ in these measured characteristics. The estimates of elasticity, derived from the regression coefficients, are reported in Table 2.

[1] This analysis was undertaken by Suhas Parandeker.

Table 1: Determinants of Household Budget Shares
(OLS coefficients, with t-values for H0: coefficient = 0 in parentheses. The last column gives the mean of each explanatory variable, with its standard deviation in parentheses.)

Explanatory variable                          Education       Food     Health      Other       Mean
Intercept                                       -0.1424     0.8066    -0.0970     0.4328
                                               (-10.02)   (24.078)   (-6.455)    (13.52)
Logarithm of total household expenditure         0.0128    -0.0377     0.0151     0.0098     9.2304
                                                (8.113)  (-10.139)    (9.072)    (2.750)   (0.7202)
Logarithm of total household size                0.0187      0.001     0.0046    -0.0242     1.5189
                                                (8.755)    (0.199)    (2.035)   (-5.052)   (0.4920)
Proportion of boys aged 0-5 years               -0.0179     0.1294     0.0026    -0.1142     0.0706
                                               (-2.209)    (6.769)    (0.307)   (-6.249)   (0.1174)
Proportion of boys aged 6-11 years               0.0746     0.1185    -0.0095    -0.1837     0.0660
                                                (9.109)    (6.133)   (-1.091)   (-9.951)   (0.1130)
Proportion of boys aged 12-16 years              0.1120     0.0489    -0.0131    -0.1477     0.0483
                                                (12.43)    (2.300)   (-1.377)   (-7.279)   (0.1000)
Proportion of boys aged 17-22 years              0.0730     0.0322    -0.0119    -0.0932     0.0505
                                                (8.875)    (1.661)   (-1.367)   (-5.037)   (0.1064)
Proportion of girls aged 0-5 years              -0.0321     0.1612     0.0118    -0.1409     0.0671
                                               (-3.970)    (8.443)    (1.383)   (-7.724)   (0.1163)
Proportion of girls aged 6-11 years              0.0717     0.0728    -0.0020    -0.1425     0.0607
                                                (8.503)    (3.661)   (-0.234)   (-7.498)   (0.1085)
Proportion of girls aged 12-16 years             0.0805     0.0899    -0.0169    -0.1535     0.0470
                                                (8.711)    (4.125)   (-1.728)   (-7.374)   (0.0944)
Proportion of girls aged 17-22 years             0.0718     0.0427    -0.0081    -0.1063     0.0560
                                                (8.419)    (2.125)   (-0.906)   (-5.537)   (0.1069)
Proportion of girls aged > 22 years              0.0209    -0.0008     0.0078    -0.0279     0.2815
                                                (3.198)   (-0.054)    (1.122)   (-1.891)   (0.1848)
Dummy for residence in metropolitan Lima        -0.0032    -0.0130    -0.0071     0.0234     0.2893
                                               (-1.646)   (-2.812)   (-3.430)    (5.286)   (0.4357)
Dummy for residence in rural area               -0.0010     0.1315     0.0075    -0.1380     0.3481
                                               (-0.515)    (27.00)    (3.436)   (-29.64)   (0.4852)
Female head of household                        -0.0020    -0.0148    -0.0007     0.0175     0.1563
                                               (-0.776)   (-2.434)   (-0.248)    (3.008)   (0.3602)
Indigenous head of household                     0.0022     0.0022    -0.0035    -0.0009     0.2335
                                                (1.182)    (0.493)   (-1.749)   (-0.219)   (0.4111)
Education in years of the head of household      0.0019    -0.0049    -0.0011     0.0040     7.7645
                                                (10.15)   (-10.81)   (-5.208)    (9.257)   (4.8485)
Mean value of budget share                       0.0467     0.5050     0.0411     0.4072
R2                                                 0.28       0.49       0.04       0.48
F value                                            91.8      233.5       8.94      220.8
Sample size: N = 3,820 households

Table 2: Elasticity Estimates from Engel's Curves

Elasticity                                                Education       Food     Health      Other
Budget share with respect to total expenditure                0.274    -0.0747     0.3674     0.0241
Specific expenditure with respect to total expenditure        1.274     0.9253      1.367      1.024
Budget share with respect to household size                 -0.6133    -0.0969    -0.0613     0.0769
Specific expenditure with respect to household size         -0.6133    -0.0969    -0.0613     0.0769

The findings from the Engel's curve analysis are a mixed blessing. On the one hand, the income elasticity of the education budget share is a low 0.27, which means that education expenditures are considered to be a necessity by Peruvian households.[2] This is a positive finding, as it indicates that there is a strong underlying demand for education in Peru. A high income elasticity would indicate that the item of expenditure is a luxury: households spend money on luxuries when they have the money, but simply do without them when they do not. The relative magnitudes are small, but the evidence also suggests that education expenditures are less responsive to changes in income than expenditures on health. On the other hand, from the point of view of educational policy, the implication is that we cannot rely on general increases in income to bring about greater expenditures on education.
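The total-expenditure rows of Table 2 can be reproduced from Table 1 alone: the budget-share elasticity is the coefficient on log total expenditure divided by the mean budget share, and the expenditure elasticity is one plus that ratio. The Python sketch below only illustrates that arithmetic; the function and variable names are assumptions introduced here and are not part of the original analysis.

```python
# Coefficients on log total household expenditure, copied from Table 1.
BETA = {"education": 0.0128, "food": -0.0377, "health": 0.0151, "other": 0.0098}
# Mean budget shares, copied from the "Mean value of budget share" row of Table 1.
MEAN_SHARE = {"education": 0.0467, "food": 0.5050, "health": 0.0411, "other": 0.4072}


def share_elasticity(group):
    """Elasticity of the budget share with respect to total expenditure: beta / w."""
    return BETA[group] / MEAN_SHARE[group]


def expenditure_elasticity(group):
    """Elasticity of spending on the group with respect to total expenditure: 1 + beta / w."""
    return 1.0 + share_elasticity(group)


for group in BETA:
    print(f"{group:9s} share: {share_elasticity(group):8.4f}   "
          f"spending: {expenditure_elasticity(group):7.4f}")
```

Evaluated at the sample means, the education figures come out at roughly 0.27 and 1.27, in line with Table 2.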
For every doubling of household income, the budget share spent on education would go up by only about a quarter. Add this finding to the fact that levels of household expenditure on education vary vastly by income level. (This fact can be seen from the Lorenz curve analysis reported in the main body of this report: the total amount spent on education by the richest quintile in Peru was more than 13 times the total amount spent on education by the poorest quintile.) The findings show the need for specific policy instruments that will address the inability of poorer households to incur additional expenditures.[3]

[2] Mwabu's work on Kenya indicated a much higher income elasticity of education expenditures of 0.73.
[3] To make sure that the conclusion was not based just on one pooled set of regressions, the regressions (not reported here) were run separately for subsamples by indigenous and nonindigenous, rural and urban, and poor and rich. Consistently, the pattern is that the income elasticities are lower for the more disadvantaged groups.

Engel's Curves: Formulae for Elasticity Estimates

The Engel curve estimates are based on the following equation, presented as Equation (3) in the Working Paper by Germano Mwabu.[4]

\[ w_i = \alpha_i + \beta_i \log(x) + \eta_i \log(N) + \sum_j \gamma_{ij} (n_j / N) + \delta_i z + \epsilon_i \]

where

$w_i$ = the share of total expenditure going to the i-th group of household expenditure items, where i indexes the four groups, viz., education, health, food, and other expenses. (The share is conceptually equal to $p_i$ times $q_i$, the price times the quantity, divided by the total expenditure $x$, but $p_i$ and $q_i$ are not empirically observed as separate entities in the actual estimation of the Engel curve.)
$x$ = total household expenditure.
$n_j$ = the number of family members in the age-by-gender group j. These groups range in the reported estimation from (boys aged 0 to 5 years) to (girls aged older than 22 years).
$N$ = the total family size; thus $(n_j / N)$ represents the relative size of group j in the family.
$z$ = a set of control variables. These include (a) dummy for residence in metropolitan Lima, (b) dummy for residence in rural area, (c) female head of household, (d) indigenous head of household, and (e) education in years of the head of household.
$\epsilon_i$ = the error term in the regression equation, assumed to be i.i.d. normal.
$\alpha_i, \beta_i, \eta_i, \gamma_{ij}, \delta_i$ = the parameters to be estimated.

The equations for elasticities follow from the above equation. Let $S_i$ represent the elasticity of the budget share for expenditure group i, and $E_i$ the elasticity of specific expenditure on group i, each taken with respect to total expenditure, $x$, and household size, $N$. The elasticities are:

a) $S_{ix} = \beta_i / w_i$
b) $S_{iN} = (1 / w_i) \left( \eta_i - \sum_j \gamma_{ij} (n_j / N) \right)$
c) $E_{ix} = 1 + (\beta_i / w_i)$
d) $E_{iN} = (1 / w_i) \left( \eta_i - \sum_j \gamma_{ij} (n_j / N) \right)$

[4] Household Composition and Expenditures on Human Capital Inputs in Kenya, by Germano Mwabu, Department of Economics, Yale University, 1994.

Background Note 3. Private and Social Returns to Public Education in Urban Peru [5]

To estimate the private and social rates of return to public education, the rate of discount (r) was calculated. This discount rate equalizes the stream of discounted benefits to the stream of costs related to a given level of education at a given point in time. Thus, r can be determined by solving the following equation:

\[ \sum_{t=1}^{T} \frac{(W_n - W_{n-1})_t}{(1 + r)^t} = \sum_{t=1}^{K} (W_{n-1} + C_n)_t (1 + r)^t \qquad (1) \]

where

$n$ = Level of education
$T$ = Number of periods in the labor market of an individual with "n" education
$W_n$ = Yearly labor income of individual with "n" education
$K$ = Number of periods taken to achieve "n" education
$C_n$ = Direct costs of studying for level "n" education.

The left-hand side of the equation represents the benefits of achieving the additional level of education, expressed as the present value of the differential between the earnings with "n" education and "n-1" education.
The cost of studying "n" education is expressed in the right hand side of the equation, and its two elements represent the foregone earnings (assuming that no one works while studying) and the direct costs of having achieved "n" education (basically, tuition).

The data for this calculation were obtained from Instituto Cuanto's household survey of 1997. The analysis was restricted to urban areas.6 The sample was constrained to those individuals that had always studied in the public system, so the estimated rates of return would only capture the effect of public education. Instead of calculating streams of average income by age and level of education, we decided to estimate an earnings function to calculate the yearly income associated with the educational level and age of the individual. Hence, the following equation was estimated separately for males and females in the sample:

$$\ln Y = \beta_0 + \sum_{n=1}^{4} \beta_{1n}\,EL_n + \beta_2\,AGE + \beta_3\,AGE^2 + \sum_{n=1}^{4} \beta_{4n}\,(AGE \times EL_n) + \sum_{n=1}^{4} \beta_{5n}\,(AGE^2 \times EL_n) + \beta_6\,HY \qquad (2)$$

where
Y = Yearly labor income
$EL_n$ = Dummy variable for educational level "n" (1 = Primary education, 2 = Secondary education, 3 = Non-university higher education, 4 = University higher education)
HY = Hours worked per year.

5 This analysis was undertaken by Jaime Saavedra, with assistance of Eduardo Maruyama.

6 In Peruvian rural areas, household survey measurement of labor income is highly inaccurate due to high participation of self-employment, high seasonality, self-consumption, etc. Usually, expenditure data is recommended instead of income data for these areas.

This specification allowed finding different life-cycle earnings patterns for all educational levels (including no education), i.e. to find the streams of $W_n$ required in equation (1). For the private rate of return, the basic assumption was that public education had no direct costs,7 so the only cost of a given level of education was the foregone earnings.
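Equation (1) generally has no closed-form solution for r, so it has to be solved numerically. The following is a minimal sketch, assuming the earnings differentials and the yearly costs are already available as plain lists; the function name `rate_of_return`, the bisection bounds, and all numbers are illustrative assumptions, not the authors' procedure or data.

```python
def rate_of_return(earnings_gain, costs, lo=0.0, hi=1.0, tol=1e-10):
    """Solve equation (1) for r by bisection (illustrative sketch).

    earnings_gain[t-1] is W_n - W_(n-1) in labor-market year t, t = 1..T.
    costs[t-1] is W_(n-1) + C_n in study year t, t = 1..K.
    Assumes non-negative gains and costs, so the gap falls as r rises.
    """
    def gap(r):
        # Left-hand side of equation (1): discounted earnings differentials.
        benefits = sum(g / (1 + r) ** t
                       for t, g in enumerate(earnings_gain, start=1))
        # Right-hand side: foregone earnings plus direct costs, compounded.
        outlays = sum(c * (1 + r) ** t
                      for t, c in enumerate(costs, start=1))
        return benefits - outlays

    if gap(lo) < 0 or gap(hi) > 0:
        raise ValueError("no rate of return between lo and hi")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gap(mid) > 0:
            lo = mid      # benefits still exceed costs: r must be higher
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical streams: 6 years of study, 40 years in the labor market.
gain = [300.0] * 40                 # W_n - W_(n-1), made-up values
private_costs = [1000.0] * 6        # foregone earnings only (C_n = 0)
social_costs = [1000.0 + 150.0] * 6 # adds a made-up public cost per student

private_r = rate_of_return(gain, private_costs)
social_r = rate_of_return(gain, social_costs)
print(f"private: {private_r:.4f}, social: {social_r:.4f}")
```

With the same earnings streams, adding a direct cost can only lower the solved rate, which is why the social rate in this sketch falls below the private one.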
To calculate social rates of return we used 1997 nationwide public expenditure data by level of education and student as the direct cost of education. Table 1 shows the regression coefficients obtained from equation (2). Table 2 in Chapter 3 shows the results of solving equation (1), for males and females. Given the low level of significance of many variables in the regression shown in Table 2 of Chapter 3, we tested the linear hypothesis that all the coefficients of a given level of education were equal to zero. For example, for primary education we tested the following hypothesis:

$$H_0:\ \beta_{PRIMARY} = 0,\quad \beta_{PRIMARY \times AGE} = 0,\quad \beta_{PRIMARY \times AGE^2} = 0$$

Results of these tests are shown in Table 2.

7 It must be noted that even though tuition is free in the public system, families' expenditure in education might be important if we consider the amount spent in school uniforms, books, etc.

Table 1. Earnings Functions Coefficients
(Dependent variable is the natural log of yearly earnings)

Variable                      Female       Male
Constant                      6.0424     4.0796
Primary                      -0.8048     1.8678
Secondary                     0.2877     1.7883
NU-Higher                    -0.9007     2.4524
U-Higher                     -2.2527     1.2498
Age                           0.0089     0.1389
Age2*100                     -0.0020    -0.1504
Primary*Age                   0.0606    -0.0657
Secondary*Age                 0.0274    -0.0477
NU-Higher*Age                 0.1238    -0.0698
U-Higher*Age                  0.1927    -0.0094
Primary*Age2*100             -0.0695     0.0725
Secondary*Age2*100           -0.0390     0.0538
NU-Higher*Age2*100           -0.1689     0.0812
U-Higher*Age2*100            -0.2266     0.0311
Hours per year                0.0004     0.0002
Number of observations          1435       2535
R2                              0.27       0.26

Table 2. Linear Hypothesis on Regression Coefficients

Hypothesis                                        F      Prob>F        F      Prob>F
b(PRI)=0, b(PRI x AGE)=0, b(PRI x AGE2)=0        2.45    0.0619      10.45    0.0000
b(SEC)=0, b(SEC x AGE)=0, b(SEC x AGE2)=0        6.12    0.0004      19.47    0.0000
b(NUH)=0, b(NUH x AGE)=0, b(NUH x AGE2)=0       12.96    0.0000      20.25    0.0000
b(UH)=0, b(UH x AGE)=0, b(UH x AGE2)=0          27.11    0.0000      37.11    0.0000

Figures 1 and 2 show the earnings streams by educational level for females and males, respectively, calculated from the regression.

Figure 1. Earnings by Age and Educational Level, Females
[Four panels plotting earnings against age: no education and primary; primary and secondary; secondary and non-university higher; secondary and university higher.]

Figure 2. Earnings by Age and Educational Level, Males
[Four panels plotting earnings against age: no education and primary; primary and secondary; secondary and non-university higher; secondary and university higher.]

Table 3 shows the regression coefficients used to construct the index of education premium in Chapter 3. The regressions from which these coefficients were estimated were Mincerian earnings equations that included cumulative educational dummies, experience and its square, tenure and its square, marital status, gender, residence in Lima, and occupational training.

Table 3. Estimated Educational Premiums

                                     1985     1991     1994     1997
Primary/No education                0.418    0.230    0.275    0.427
Secondary/Primary                   0.449    0.205    0.274    0.360
Non-university higher/Secondary     0.528    0.237    0.328    0.415
University higher/Secondary         0.581    0.502    0.698    0.864

Background Note 4. Determinants of Achievement8

Analysis of determinants of achievement is an important tool to inform policy choice. This Background Note describes the analytical approach of hierarchical linear modeling, which is used to identify the factors affecting math achievement at the levels of students, schools, and departments.
The findings cannot be used to judge the impact of education policy of the 1990s because of the usually long time lag between intervention and effects on teaching and learning in the classroom.

1. The dataset

The dataset was drawn from the first national standardized test of mathematics in Grade 4 in 1996 and the accompanying questionnaires. Results of the 1996 assessment have yet to be publicly released. Thus, this Background Note reports no statistical tabulations of test scores directly, nor does it attribute average scores to schools or departments. It is nonetheless still possible to undertake limited analyses of broad determinants of outcomes without reporting scores. This Note undertakes only such analyses, although more extensive results could be reported when the assessment enters the public domain.

The sample comprised 50,479 students who were selected from a population of 618,719 Fourth Graders in 1,275 schools in 25 departments. Thirty students in each sample school were given the test, which lasted for an hour. The sample included private and public schools but undersampled rural schools. Single-teacher schools in remote areas were excluded; these accounted for 29 percent of all schools in the country and enrolled about 6 percent of the population of Fourth Graders. As a result of this sample frame, the measured achievement gap between urban and rural areas is relatively narrow.

The dependent variable (also known as the outcome variable) relates to performance on the mathematics test. For simplicity, this will be referred to as outcomes or achievement in this Note. The assessment instruments included multiple choice items in sets, natural numbers, fractions, decimals, geometry, and international units and money. Because the answers required were not dependent on interpretation, this outcome measure can be considered a reasonable measure of performance in mathematics. This analysis applied reliability tests9 and found the instrument reliable.
The scores are informative about the relative performance of students compared among themselves.

8 The analysis of data was undertaken by Pete Goldschmidt.

9 The reliability of a test is defined as the consistency of the information, or scores, obtained. Any test occasion will produce some errors of measurement, which are assumed to be random. That is, students taking the same test on different occasions will score slightly differently due to chance errors (e.g. accidentally marking the answer as B, when they mean C). If the analysis entails using the total test score, then what is of concern is whether any individual (or set of) item(s) scores are not related to the overall test score. There are several methods to

The independent variables (also known as the predictor or explanatory variables) were mostly drawn from, but not limited to, information collected by three questionnaires which accompanied the math test: one for the principal of the sample school, one for the teacher of the subject, and one for the parents of the 30 students who took the test. The independent variables selected for this analysis are as follows:

* At the student level (also known as level 1), the predictor variables were grouped into four categories: (a) ascriptive characteristics (gender, mother tongue, and student age), (b) availability and usage of text materials, (c) student attendance and study habits, and (d) parental roles and expectations.

* At the school level (also known as level 2), the predictor variables were divided into seven groups: (a) geographic (such as urban and rural, and the coast, mountain, and jungle), (b) public or private school type, (c) text usage, (d) teacher characteristics, (e) teacher roles, (f) principal characteristics, and (g) parent roles.

* At the department level (also known as level 3), the predictor variables were drawn from four data sources: (a) variables which were aggregated from the student and school levels in the 1996 test dataset (such as departmental percentage of private school students, over-aged students, female students, Quechua speaking students, 4th Grade teachers with a Master's degree, and with a title from Institutos Superiores Pedagogicos); (b) government expenditure data on public spending on basic education per student by department in 1994; (c) household survey data on household expenditure on basic education per capita by department in 1994; and (d) FONCODES's 1993 Poverty Map, which provided information on departmental characteristics (such as poverty index, percentage of population in chronic malnutrition, mortality rates, illiteracy rates, and school non-attendance rates).

This dataset has certain limitations. First, the assessment was undertaken at a single point in time, so it is not possible to control for prior learning. Second, the questionnaires did not contain questions on parental education, number of siblings at home, family socioeconomic status (SES) or resources (e.g. family income or expenditure, type of dwelling, availability of water and electricity, etc.),10 or school resources (e.g. public spending per student, parental contribution per student, availability of water, electricity, library and laboratory, etc.). In other words, some key predictors were not available, so their effects could not be controlled for. The only variables that may proxy public and private finance of education were public and private expenditure at the departmental level.

This would yield a reliability coefficient. More commonly, and the method used for this test, is to generate Cronbach's Alpha, which is to correlate all possible scores with n-1 test items (i.e. remove item 1 from the score and correlate to the test with item 2 removed, etc).
The dependent variable in this study has passed these tests.

10 Although the questionnaire contains a question on parental occupation, the inclusion of the housewife category in the list confounded the effects because a large number of mothers checked this category. Therefore, it is not possible even to use occupation as a proxy for SES.

2. Descriptive statistics

In Volume 1 of this report, Table 3 presents the index of relative outcomes in mathematics. The national average is set to 1 and the averages of other subgroups can be compared against it. The outcome differentials were substantial, particularly between private and public schools, between Spanish-speakers and Quechua-speakers, and between the jungle and other regions. The coefficients of variability show large disparity within each subgroup.

Table 1 below presents the mean, standard deviation, minimum and maximum value of variables at the student level. Most of the data were collected from a few questions on students attached to the test and from the questionnaire for parents. Although the original sample had 50,479 students, only 40,766 returned the test, of whom only 33,233 respondents had all the observations. The most common missing values were gender and type of school attended. Nonetheless, the mean did not change.

In Table 1, the column which shows mean or percentage indicates either the average value of the variable or the percentage share of each categorical variable (for example, girls accounted for 50 percent of the students in the sample). The percentage share of omitted variables, such as boys (which are used for comparison with predictors in the same categories), can be deduced from the percentage share of girls, and its standard deviation can be derived from the formula in the footnote.11
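As a small worked check of how the share and spread of an omitted category can be deduced, the sketch below recomputes the standard deviation of a 0/1 variable from its share. It rests on one assumption worth stating plainly: the standard deviations reported for the categorical variables in Table 1 appear to correspond to the square root of p(1-p), whereas the footnote's formula, which divides by n, gives the standard error of the share. The function names are illustrative.

```python
import math


def binary_sd(p):
    """Standard deviation of a 0/1 variable whose share of 1's is p."""
    return math.sqrt(p * (1 - p))


def proportion_se(p, n):
    """Standard error of an estimated share p in a sample of size n."""
    return math.sqrt(p * (1 - p) / n)


girls = 0.493        # share of girls, as quoted in the footnote
boys = 1 - girls     # share of the omitted category

print(round(binary_sd(boys), 2))           # spread of the omitted category
print(round(binary_sd(0.23), 2))           # over-aged students (share 0.23)
print(round(proportion_se(boys, 33233), 4))  # standard error at n = 33,233
```

The second value can be compared with the standard deviation reported for over-aged students in Table 1.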
Table 1: Descriptive Statistics of Student-Level Variables Used in the HLM Model

                                                        Mean or      Standard    Minimum   Maximum
                                                        Percentage   Deviation
Ascriptive characteristics
  Girls (boys omitted)                                    0.50         0.50         0
  Aymara                                                  0.03         0.35         0
  Quechua (Spanish speakers omitted)                      0.15         0.16         0
  Student over the age of 10 for Grade 4                  0.23         0.42         0
Materials (text books)
  No textbooks                                            0.15         0.36         0
  School provided textbooks                               0.06         0.24         0
  Sibling's textbooks (dictated by teacher omitted)       0.21         0.41         0
Student attendance & study habits
  Daily attendance (sporadic attendance omitted)          0.07         0.26         0
  No studying                                             0.01         0.08         0
  Studies regularly                                       0.27         0.44         0
  Studies for exams                                       0.16         0.37         0
  Studies because expected                                0.20         0.40         0
  (Studies because of self-motivation omitted)
Parental expectation, roles & home environment
 Goal of schooling
  Develop literacy                                        0.19         0.39         0         1
  Develop nothing                                         0.06         0.24         0         1
  Develop comprehensively                                 0.23         0.42         0         1
  Develop math (Learning well in general omitted)         0.13         0.34         0         1
 Home Academic Support
  Environment for studying (through homework omitted)     0.09         0.28         0         1
  None                                                    0.22         0.42         0         1
  Special education programs                              0.01         0.11         0         1
  Additional reading                                      0.19         0.39         0         1
  Father assistance (Mother assistance omitted)           0.20         0.40         0         1
  No assistance                                           0.25         0.43         0         1
  Other family assistance                                 0.23         0.42         0         1
Sample size                                             33,233

11 The equation for standard deviation is ((p(1-p))/n)^.5, where p is the proportion of 1's; so if the left-out category is boys, for example, p for them would be 1-.493 (or, in percentage terms, 100-49.3). ^.5 denotes the square root. Where shares are presented as percentages rather than decimals, the equation would need to be adjusted to use 100-p.

Table 2 presents descriptive statistics of school-level predictors. Some of the variables were aggregated from the student level (such as the school means of students' accessibility to text), while others were collected from surveys of teachers and principals.
Table 2: Descriptive Statistics of School-Level Variables Used in the HLM Model

                                                          Mean or      Standard    Minimum   Maximum
                                                          Percentage   Deviation
Geographic
  Rural (urban omitted)                                     0.19         0.39         0         1
  Selva                                                     0.21         0.41         0         1
  Sierra (costa omitted)                                    0.37         0.48         0         1
School Type
  Private (public omitted)                                  0.14         0.35         0         1
Text Usage
  No text                                                  15.41        14.93         0       100
  School provided text                                      6.21         8.57         0        77
  Sibling's text                                           20.95        13.36         0       100
  (Text dictated by teacher omitted)
Teacher characteristics
  Number of years of service                               12.17         7.59         1        57
  Number of training courses (1990-96)                      6.83         2.96         0        11
  Teacher language: Aymara (Spanish speaker omitted)        0.01         0.10         0         1
  Teacher language: Quechua                                 0.08         0.27         0         1
  Teachers graduated from universities                      0.15         0.36         0         1
  Teachers graduated from ISP                               0.51         0.50         0         1
  Teachers graduated from IST                               0.01         0.11         0         1
  Teachers graduated from professional courses              0.17         0.37         0         1
  Professional titles in other specialties                  0.01         0.11         0         1
  University graduates                                      0.06         0.23         0         1
  University leavers (finished courses without degree)      0.03         0.16         0         1
  Secondary school graduates                                0.00         0.06         0         1
  (Secondary leavers with teacher training omitted)
  Teacher appointed by manager (Appointed official          0.03         0.18         0         1
    omitted)
  Contract teacher                                          0.16         0.36         0         1
Teacher roles
  Explain materials                                         0.11         0.32         0         1
  Invite guests                                             0.01         0.08         0         1
  Student participation vs assess performance               0.79         0.41         0         1
  (Assess performance omitted)
Principal characteristics (Spanish speaker omitted)
  Principal's language: Aymara                              0.01         0.11         0         1
  Principal's language: Quechua                             0.10         0.30         0         1
  Principal's language: Other                               0.01         0.09         0         1
Parent roles (according to teachers)
  Check attendance                                          0.08         0.26         0         1
  Check homework                                            0.21         0.41         0         1
  Prepare children for exams                                0.05         0.21         0         1
  Provide nutrition                                         0.05         0.21         0         1
  Stimulate learning (no participation omitted)             0.26         0.44         0         1
Sample size                                                1,275

Table 3 presents the descriptive statistics of departmental-level
variables. Some of the variables were aggregated from the student level (such as percentage of students who are females, or in private schools), while others were collected from surveys of teachers and principals (such as percentage of teachers with various qualifications).

Table 3: Descriptive Statistics of Departmental-Level Variables in the HLM Model

                                                              Mean/        Standard    Minimum   Maximum
                                                              Percentage   Deviation
Public expenditure on basic education per student (US$)        141.3         31.4        71.0     223.0
Household expenditure on basic education per capita (US$)       74.5         42.0        17.9     144.0
Poverty Index                                                     3.0          1.1         1.0       4.5
Female students                                                  49.3          2.6        44.5      56.5
Over-aged students                                               23.5         10.1        10.1      42.3
Quechua students                                                 14.8         21.3         0.0      67.5
Private school students                                          14.8         11.9         0.0      50.0
Teachers with MA degree                                          12.5         12.6         0.0      44.8
Teachers graduated from ISP                                      52.1         14.0        27.6      76.5
Teacher years of service                                         11.9         1.75         8.4      15.7
# of training courses attended                                    6.8          1.0         4.6       8.3
Sample size                                                        25

3. The analytical approach of hierarchical linear modeling

The appropriate approach to analyze this dataset is hierarchical linear modeling (HLM). This is because the structure of the data was hierarchical: student-level variables were nested within schools, and in turn, school-level variables were nested within departments. For example, students' accessibility to text materials is an indicator of students' home resources; but when it is aggregated to the school level, it becomes an indicator of school resources and the normative environment (Bryk and Raudenbush, 1992). Mixing individual and aggregated explanatory variables can lead to both statistical and substantive errors in interpretation of the effects of the group, such as the school or the department (Aitkin and Longford, 1986; Burstein, 1980).
Group effects are truly important because students with the same characteristics might have different learning outcomes if they attend schools with different organization, quality, policies and practices or if they live in different departments (Akin and Garfinkel, 1977). For this reason, Ordinary Least Squares (OLS) regression analysis cannot be applied to this dataset, because it does not take into account the hierarchical structure of the data. If the variance in test scores attributable to differences between schools is large, OLS regression analysis will severely understate standard errors and thus overstate the significance of the coefficients, leading to false rejection of the null hypothesis. By contrast, hierarchical linear modeling (HLM) allows personal and contextual (such as school and department) effects on an individual's score to be analyzed (Bryk and Raudenbush, 1992).

Unconditional models. The first step in HLM was to estimate the fully unconditional models, which can be at two levels (students and schools) or three levels (students, schools, and departments). The unconditional models for three-level analysis in this study are as follows:

$$Y_{ijk} = \pi_{0jk} + e_{ijk}, \qquad e_{ijk} \sim N(0, \sigma^2) \qquad \text{(Equation 1) (Level 1)}$$
$$\pi_{0jk} = \beta_{00k} + r_{0jk}, \qquad r_{0jk} \sim N(0, \tau_{0k}) \qquad \text{(Equation 2) (Level 2)}$$
$$\beta_{00k} = \gamma_{000} + u_{00k}, \qquad u_{00k} \sim N(0, \tau_{00}) \qquad \text{(Equation 3) (Level 3)}$$

where $Y_{ijk}$ was the math test score for student i in school j in department k; $\pi_{0jk}$ was the mean test score at school j in department k; $\beta_{00k}$ was the departmental mean of the test score in department k; and $\gamma_{000}$ was the grand mean of the math test score. The $e_{ijk}$ was the student-level random component in school j in department k; the $r_{0jk}$ was the school-level random component in school j in department k; and $u_{00k}$ was the departmental-level random component in department k.
The $\sigma^2$ was the residual variance in test scores between students; the $\tau_{0k}$ was the residual variance in test scores between schools; and the $\tau_{00}$ was the residual variance in test scores between departments.

This unconditional model allowed for the calculation of the intra-class correlation. This provided estimates of the share of total variance in test scores that lay (a) between students (within schools), (b) between schools (within departments), and (c) between departments:

$$\rho = \sigma^2 / (\tau_{0k} + \tau_{00} + \sigma^2) \qquad \text{(Equation 4) (Level 1, between students)}$$
$$\rho = \tau_{0k} / (\tau_{0k} + \tau_{00} + \sigma^2) \qquad \text{(Equation 5) (Level 2, between schools)}$$
$$\rho = \tau_{00} / (\tau_{0k} + \tau_{00} + \sigma^2) \qquad \text{(Equation 6) (Level 3, between departments)}$$

where $\rho$ was the intra-class correlation, and the variance components on the right side have been described in Equations 1, 2, and 3.

Subsequently, the unconditional estimates of the variance components in Equations 1, 2, and 3 provided the basis for computing the proportion of variance in test scores explained by additional variables at each of the three levels. It should be noted that HLM does not generate R-squared statistics. The explanatory power of a model is indicated by the proportion of variance in the outcome that it can explain.

Conditional models. The next step was to specify a conditional model with random effects analysis of covariance (ANCOVA) for each of the three levels. At level 1, the model used student-level variables, and allowed the intercept and slopes to vary across schools and departments. The model was as follows:

$$Y_{ijk} = \pi_{0jk} + \pi_{1jk}\,(X_{ijk} - \bar{X}_{\cdot jk}) + e_{ijk}, \qquad e_{ijk} \sim N(0, \sigma^2) \qquad \text{(Equation 7) (Level 1)}$$

where X's were background characteristics of student i (such as girls, over-aged, and Quechua speakers) in school j and department k; and $e_{ijk}$ was the student-level random effect.
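The variance shares in Equations 4 to 6, and the proportion of variance explained that replaces R-squared in HLM, reduce to simple arithmetic once the variance components have been estimated. The following is a minimal sketch; the function names and all variance values are hypothetical, since the estimated components are not reported in this Note.

```python
def variance_shares(sigma2, tau_school, tau_dept):
    """Intra-class correlations: share of total variance at each level.

    sigma2     : variance between students within schools (level 1)
    tau_school : variance between schools within departments (level 2)
    tau_dept   : variance between departments (level 3)
    """
    total = sigma2 + tau_school + tau_dept
    return {
        "students": sigma2 / total,
        "schools": tau_school / total,
        "departments": tau_dept / total,
    }


def proportion_explained(unconditional, conditional):
    """Share of a variance component accounted for by added predictors."""
    return (unconditional - conditional) / unconditional


# Hypothetical variance components from an unconditional model.
shares = variance_shares(sigma2=60.0, tau_school=30.0, tau_dept=10.0)
for level, share in shares.items():
    print(f"{level}: {share:.2f}")

# Hypothetical: school-level variance falls from 30 to 21 after adding
# school predictors, so the predictors account for 30 percent of it.
print(f"{proportion_explained(30.0, 21.0):.2f}")
```

By construction the three shares sum to one, so a large school or department share signals that ignoring the grouping would be costly.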
The intercept term of the conditional model was similar to that in the unconditional model, except that the mean was now adjusted for the covariates (student-level variables). In this case, X's were centered on the school mean (the average value of a given variable of school j).12 Centering allowed $\pi_{0jk}$ to be interpreted as the mean of school j in department k for the test score of student i in the same school, adjusted for differences among schools in student characteristics. In this manner, differences in student characteristics could be taken into account.

12 For example, if there is a continuous variable for the number of hours of studying per week, this could be centered around the mean hours of studying per week at school j, thereby adjusting for the time students actually spend studying. One advantage of school-mean centering all the variables is that the marginal effect of any single predictor can be easily identified, after controlling for the effects of other covariates. This allows addressing the question: if a student is average in all respects at school j, what is the marginal effect of hours studying? For categorical variables, group-mean centering works in the same manner. At levels 2 and 3, grand-mean centering refers to the same procedure and effect.

Unlike OLS regression coefficients, the intercept and slope parameters were subscripted by j and k, indicating that each school could have a different intercept and slope(s). The student-level coefficients, $\pi_{jk}$, could be specified as being either fixed, non-randomly varying, or randomly varying (Bryk and Raudenbush, 1992). A model with several student-level predictors could have any combination of the three specifications.

If there is significant variation in intercepts and slopes between schools, then this can be modeled by including predictors at the school level, and student-level predictors aggregated to the mean of school j.
Thus the student-level intercepts and slopes became outcomes, and the school-level ANCOVA model was as follows:

$$\pi_{0jk} = \beta_{00k} + \beta_{01k}\,(W_{jk} - \bar{W}) + r_{0jk}, \qquad r_{0jk} \sim N(0, \tau_{0k})$$
$$\pi_{1jk} = \beta_{10k} + \beta_{11k}\,(W_{jk} - \bar{W}) + r_{1jk}, \qquad r_{1jk} \sim N(0, \tau_{11}) \qquad \text{(Equation 8) (Level 2)}$$

where W's were school characteristics (for example, the average years of service of teachers in a school); $r_{0jk}$ was the school-level random effect; and the $\beta$ coefficient was the pooled within-school regression coefficient. The W's were centered on the grand mean (see the same footnote on mean-centering). The intercept and slope were modeled to vary randomly and to be affected by a characteristic, W, of school j. The interpretation of $\pi_{0jk}$ would be how the adjusted school means of the outcome, Y, were affected by the school characteristics W's, given student characteristics, X's. Similarly, the slope coefficient could be described as being affected by W's, given X's.

If there was significant variation in intercepts and slopes between departments, then this could be modeled by including department-level predictors, as well as school- and student-level predictors aggregated to the mean of department k. Thus the school- and student-level intercepts and slopes became outcomes. The department-level ANCOVA model was as follows:

$$\beta_{00k} = \gamma_{000} + \gamma_{001}\,(Z_k - \bar{Z}) + u_{00k}, \qquad u_{00k} \sim N(0, \tau_{00})$$
$$\beta_{10k} = \gamma_{100} + \gamma_{101}\,(Z_k - \bar{Z}) + u_{10k}, \qquad u_{10k} \sim N(0, \tau_{10}) \qquad \text{(Equation 9) (Level 3)}$$

where Z's were department characteristics (for example, the poverty index). The Z's were centered on the grand mean (see the same footnote on centering). The intercept and slope were modeled to vary randomly and be affected by a characteristic, Z, of department k. The interpretation of $\beta_{00k}$ would be how the adjusted departmental mean of the test score was affected by the departmental characteristics Z's, given both student characteristics, X's, and school characteristics, W's. Similarly, the slope coefficient can be described as being affected by Z's, given X's and W's.
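The two kinds of centering used in the conditional models can be illustrated with a few lines of code: student-level X's are centered on the mean of the student's own school, while school-level W's (and department-level Z's) are centered on the grand mean. This is a minimal sketch; the record layout, the function names, and the data values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def school_mean_center(records, var):
    """Subtract each school's own mean of `var` from its students' values."""
    by_school = defaultdict(list)
    for rec in records:
        by_school[rec["school"]].append(rec[var])
    school_means = {school: mean(vals) for school, vals in by_school.items()}
    return [rec[var] - school_means[rec["school"]] for rec in records]


def grand_mean_center(values):
    """Subtract the overall mean from every value."""
    overall = mean(values)
    return [v - overall for v in values]


# Hypothetical students: hours of studying per week, in two schools.
students = [
    {"school": "A", "hours": 2.0},
    {"school": "A", "hours": 4.0},
    {"school": "B", "hours": 6.0},
    {"school": "B", "hours": 10.0},
]
centered_hours = school_mean_center(students, "hours")

# Hypothetical school-level W: average years of service of teachers.
centered_service = grand_mean_center([10.0, 14.0])

print(centered_hours)
print(centered_service)
```

After school-mean centering, a value of zero describes a student who is average within his or her own school, which is what allows the intercept to be read as an adjusted school mean.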
Where student-level effects varied substantially between schools and departments, the next step was to analyze whether school and department variables had effects on student-level variables. This was known as the cross-level model.

At level 2, using information from the unconditional and the conditional models, the proportion of the variation in the $\pi$'s that is explained by the school-level variables can be computed. For example, the proportion of the variation of $\pi_1$ explained would be computed as follows:

$$[\tau_{11}(\text{unconditional}) - \tau_{11}(\text{conditional})] \,/\, \tau_{11}(\text{unconditional}) \qquad \text{(Equation 10)}$$

Additionally, a $\chi^2$ test could be used to test whether the residual variation $\tau_\pi$ was significant, in which case additional variation in $\pi$ was left to be explained. This indicated that the relationship between the outcome and the student-level predictor varied significantly from school to school, even when controlling for the school-level variables modeling that particular coefficient.

Similarly, at level 3, the proportion of variation in the $\beta$'s that is explained by department-level variables can be determined by Equation 10, using the residual variances at level 3. As with the level 2 analysis, the relationship between the outcome and department variables could be examined by using a $\chi^2$ test to determine whether the school-level predictor continued to vary from department to department after controlling for department-level variables.

4. Two-level analysis (student and school)

The analysis began with the student and school levels in order to explore in depth the effects of variables at these two levels on mathematics outcomes. The approach was guided by four questions:

(a) What were the marginal effects of various student characteristics on average student performance, after controlling for other covariates in the student-level model?

(b) What were the marginal effects of school characteristics on average school outcomes, after controlling for other covariates in the school-level model?

(c) What were the cross-level effects? In other words, what were the effects on a student who attended a particular school, after controlling for individual characteristics?

(d) What proportion of variance in outcomes was attributable to differences between students (within schools) and between schools?

(a) Effects of student characteristics on average student outcomes (Level 1 model)

Table 4 shows the marginal effects of each of the above described student characteristics, controlling for other covariates in the model (see Equation 7 for the model).13 When other concomitant variables were held constant, girls tended to do worse than boys. Students over the age of 10 performed significantly worse than younger children. This comes as no surprise because over-aged students tend to be repeaters. To a lesser extent than gender and age, the mother tongue also had an effect, but it was confined to Quechua speakers, who did less well than Spanish speakers. There was no statistically significant difference in the outcomes of Aymara speakers and Spanish speakers. For policy research, it is important to identify the variables that enable Aymara speakers to perform so much better than other indigenous groups.

13 This model did not control for school-level variables.
Table 4: Effects of Student Characteristics on Student Outcomes

                                                                  Coefficient      Standard Error
Intercept                                                            45.1               0.45
Ascriptive Characteristics
  Girls (compared with boys)                                         -3.58 *            0.21
  Mother tongue Aymara (compared with Spanish speakers)              -0.65              0.71
  Mother tongue Quechua (compared w/ Spanish speakers)               -0.70 *            0.35
  Student over the age of 10 for grade 4                             -1.84 *            0.22
Text availability (compared with text dictated by teachers)
  No text                                                            -0.69              0.26
  School provided text                                               -0.38              0.36
  Siblings have text                                                 -0.06              0.22
Student attendance & study habits
  Daily attendance (compared with sporadic attendance)                1.62 *            0.33
  No studying (compared with study because of self-motivation)       -2.71 *            0.98
  Study regularly (compared with self-motivation)                    -1.69 *            0.21
  Study for exams (compared with self-motivation)                    -2.80 *            0.25
  Study because expected (compared with self-motivation)             -3.87 *            0.24
Parental expectations of school (compared with general learning)
  Develop literacy                                                   -1.13 *            0.23
  Develop nothing                                                    -1.98 *            0.37
  Develop comprehensively                                             0.42              0.23
  Develop mathematics                                                 1.09 *            0.26
Home academic support
  Provide environment for studying (compared with provide             0.46              0.31
    support through homework)
  Provide no support                                                 -0.22              0.22
  Special education programs                                         -0.74              0.75
  Provide additional reading                                          1.06 *            0.23
  Father assistance (compared with mother's assistance)               0.31              0.24
  No assistance                                                       0.86 *            0.23
  Other family assistance                                            -0.70 *            0.23
* p <= .05

Student attendance and study habits mattered. Students who attended school daily did better than those who attended sporadically. Motivation was important. Students who undertook their study because they were motivated had higher scores than students who studied for other reasons.

Parental roles and expectations also affected achievement. Parents who expected school to develop mathematics skills saw their children performing better in math, compared to parents who expected schools to develop literacy, to promote learning in general, or to develop nothing.
Interestingly, home academic support mattered only when parents provided additional reading material, not simply through providing a general environment for studying, or through help with homework or other special programs. The assistance of mothers and other family members turned out not to be helpful in this sample. One might speculate as to whether this is due to the lower educational level of mothers and other family members.

(b) Effects of school characteristics on school mean (Level 2 model)

Table 5 presents the marginal effects of each of the above described school characteristics, controlling for other covariates in the model14 (see Equation 8 for the model). Holding other concomitant variables constant, rural and urban areas had no statistically different effects on achievement, but geographic region had big effects. Schools in the mountain region performed less well than those on the coast, whereas the jungle region did much worse than the coast. Students in private schools were associated with much higher achievement.

The non-availability of textbooks was negatively associated with learning outcomes. Schools with 50 percent or more of students who had no textbook, or who used their siblings' textbooks, did worse than those whose text was based on dictation by teachers.

Teachers who had more years of service had a positive impact on student achievement, but in-service training did not. This, however, changed in the three-level analysis. In this two-level analysis, there was also no statistically significant difference between teachers of various academic qualifications, conditions of service, and in-service training (but this is not true in the three-level analysis). This may be because there is insufficient variance between schools in these variables to show the difference, but once aggregated to the departmental level, the difference has statistical significance.
A more disturbing finding is that teachers whose mother tongue was Quechua were associated negatively with student math achievement, in comparison with teachers whose mother tongue was Spanish, but this was not true for teachers whose mother tongue was Aymara. Principals' characteristics also mirrored those of teachers. Even after controlling for students' mother tongue, Quechua-speaking teachers were associated negatively with math performance. This may be due to Quechua-speaking teachers being less prepared, and calls for special attention to the training of Quechua-speaking teachers.

That Aymara-speaking teachers were indistinguishable from Spanish-speaking teachers in terms of their impact on achievement disproves the notion that indigenous teachers are not effective. It also poses a very important research question as to why Aymara students and teachers were doing so much better than other indigenous groups. If the variables that enable them to overcome their disadvantage can be identified, they might also be used to help other indigenous peoples.

Teacher perception of their role made a difference. If teachers perceived that their role was to assess and improve performance, they had large positive effects on achievement, in contrast to those who considered their role simply to explain materials, invite guests, and encourage student participation. This seemed to indicate that focusing on outcomes produced the desired results.

[14] This model did not control for student-level variables.
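Note (1) to Table 5 states that the text-availability coefficients are expressed per percentage point of students, so the effect at a given share is the coefficient multiplied by that share. The sketch below works through that arithmetic for the three text-availability coefficients reported in Table 5. The function name `effect_at_share` and the dictionary layout are illustrative assumptions, not part of the original analysis.

```python
# Text-availability coefficients from Table 5 (points on the school mean per
# percentage point of students, relative to text dictated by the teacher).
TEXT_COEFFICIENTS = {
    "No text": -0.18,
    "School provided text": -0.07,
    "Sibling text": -0.13,
}


def effect_at_share(coefficient: float, percent_of_students: float) -> float:
    """Scale a per-percentage-point coefficient to a given share of students.

    Hypothetical helper: it only applies the rule in Note (1) to Table 5,
    i.e. effect = coefficient * percent.
    """
    return coefficient * percent_of_students


if __name__ == "__main__":
    share = 50  # percent of students in the school, as in the note's example
    for label, coef in TEXT_COEFFICIENTS.items():
        print(f"{label}: {effect_at_share(coef, share):.1f} points at {share}%")
```

Under this reading, a school where half the students have no text would be expected to score about 9 points lower on the school mean than one relying on teacher-dictated text, other covariates held constant.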
Table 5: Effects of School Characteristics on School Mean

                                                                  Coefficient   Standard Error
Intercept                                                            45.10          0.37
Geographic
  Rural (compared with urban)                                        -1.84          1.04
  Selva (compared with costa)                                        -5.65 *        1.07
  Sierra (compared with costa)                                       -2.77 *        0.92
School type
  Private (compared with public)                                     12.71 *        1.24
Text availability (compared with text dictated by teacher)
Difference between % of students at school with (1):
  No text                                                            -0.18 *        0.03
  School provided text                                               -0.07          0.05
  Sibling text                                                       -0.13 *        0.03
Teacher characteristics
  Number of years of service                                          0.14 *        0.06
  Number of training courses taken (between 1990-96)                  0.14          0.13
  Mother tongue: Aymara (compared with Spanish)                      -0.85          3.83
  Mother tongue: Quechua                                             -5.17 *        1.51
  University graduates with teacher's title (compared with
    secondary school leavers with teacher training only)              3.39          1.90
  ISP graduates with teacher's title                                  1.69          1.67
  IST graduates with teacher's title                                  2.43          3.80
  Graduated from professional courses                                 0.62          1.82
  Professional titles in other specialties                           -0.94          3.83
  University graduates without teacher's title                        2.61          2.22
  University leavers who finished courses but had no degree          -1.11          2.79
  Appointed by manager (compared with officially appointed)          -1.84          2.12
  Contract                                                           -0.36          1.14
Teacher roles
  Explain materials (compared with focusing on learning
    outcomes by assessing performance)                               -4.36 *        1.69
  Invite guests (2)                                                  -9.50 *        4.71
  Encourage student participation                                    -3.48 *        1.31
Principal characteristics
  Mother tongue is Aymara (compared with Spanish)                     6.78          3.62
  Mother tongue is Quechua                                           -4.44 *        1.38
  Mother tongue is other languages                                   -2.22          4.20
Parent roles (according to teachers)
  Check attendance (compared with no participation)                   3.97 *        1.51
  Check homework                                                      3.05 *        1.06
  Prepare children for exams                                          7.58 *        1.90
  Provide nutrition                                                   3.24          1.86
  Stimulate learning                                                  5.30 *        1.01
Notes: (1) Percent in 00.0% (i.e., to calculate the effect at 50%, the coefficient is multiplied by 50).
(2) The meaning of teacher's role being to invite guests is unclear from the questionnaire.
* p <= .05

Parental role as perceived by teachers was also important. Parents who checked attendance and homework, prepared children for exams, and stimulated learning had children who performed significantly better than those of parents who did not participate in their children's education. This might be an indicator that pro-active teachers who tried to get parents more involved and to communicate more have positive effects on children. At the school level, this variable might be a proxy for community support.

(c) Cross-level effects of school characteristics on achievement slopes

This analysis examines whether or not the effects of student characteristics varied across schools. In other words, were there school-level factors that mitigated the student-level effects? Ascriptive characteristics of students, accessibility of text, study habits, parental role and expectations, and home academic support were crossed with geographic variables of school location, availability of text, and other school characteristics such as private schools, teacher in-service training, and years of service. Only the coefficients and standard errors of those groups of variables in which at least some effects were statistically significant are presented in Table 6. Those which had no significance at all were not recorded, leaving blank spaces in the table to make it easier to read.

Table 6 shows that although girls in general performed less well than boys, those in the jungle and mountain regions did better relative to girls on the coast. Girls also did slightly better when schools provided the text. There was no significant difference in math achievement between boys and girls in the rural and urban areas, or in private and public schools.

Overaged students performed worse in general and far worse in private schools, relative to the achievement of overaged students in public schools.
This might be attributable to a more competitive environment in private schools that did not help overaged students to catch up.

With respect to the mother tongue of students, there was no significant difference in math achievement between Aymara and Spanish speakers, whether they were in private or public schools. Quechua speakers, however, not only performed less well than Spanish speakers in general, they performed significantly worse in private schools, relative to their performance in public schools.

With respect to study habits, students did better when they were self-motivated to study than if they studied because they were expected to. This had a greater effect than the replies on whether they studied only for exams, studied regularly, or did not study at all. However, students in the rural areas performed better if they studied because they were expected to. In private schools, students who did not study performed significantly worse. In fact, the biggest negative effect was found among private school students who did not study, relative to the performance of students who did not study in public schools. It might be because private schools have much higher expectations for studying hard and those who did not study fell behind.

Home academic support had positive effects on achievement only when the home provided additional reading material. The effect of additional student reading was strengthened when teachers had in-service training.

If parental expectation was to develop math skills, versus general learning, there was a positive effect on math achievement. If it focused instead on other goals, such as developing literacy, comprehensive development, or lacked any definite goal, it did not produce higher math scores.
However, in rural areas, even if the expectation was to develop literacy, there was a positive effect on math scores; but if the goal was to develop nothing in rural schools, the negative effect was washed out, possibly because it did not matter what the expectations were.

Having no text was worse than having teacher-dictated notes. Teachers' years of service, however, reduced the effect of school-provided texts. The reason was unclear.

In summary, the analysis of cross-level effects confirmed some common sense notions. For example, in private schools, students who did not study, students who were not high achievers in general (over-aged students and Quechua speakers), and students who were not self-motivated did significantly worse than their counterparts in public schools. At the same time, the analysis also revealed many puzzles that require further investigation. For example, why were girls in the sierra and selva doing better, relative to boys, than girls in the costa? Why did Quechua speakers perform worse in private schools than in public schools? The greatest puzzle of all is perhaps why experienced teachers were associated with higher math scores, but the scores decreased when they could not use their own text and had to use school-provided text. Could the school-provided text proxy a new curriculum which experienced teachers are less prepared to teach? Answers to these puzzles might help policymakers design more effective interventions.

(d) Within-school and between-school variance in outcomes

Applying Equations 1, 2, 4 and 5 to the unconditional models, it was found that some 54 percent of the variance in math achievement was attributable to between-school differences, while 46 percent was attributable to within-school differences (between students) (Table 7). The higher the between-school variance, the more inequality among schools there is. Normally, a 30 percent difference in variance is the cutoff point for identifying serious equity problems (see Appendices 5.1 and 5.2).
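The between-school share reported above is the ratio of between-school variance to total variance in an unconditional two-level model. The sketch below shows that calculation under stated assumptions: the names `TAU_00` (between-school variance) and `SIGMA_2` (within-school variance) follow common multilevel-model notation rather than the report's own equations, and the two variance values are hypothetical placeholders chosen only to reproduce the 54/46 split, not figures from Table 7. Treating the 30 percent cutoff as a threshold on the between-school share is likewise an interpretation.

```python
def between_school_share(between_var: float, within_var: float) -> float:
    """Share of total outcome variance that lies between schools.

    This is the usual intraclass correlation of a random-intercept model:
    between_var / (between_var + within_var).
    """
    total = between_var + within_var
    if between_var < 0 or within_var < 0 or total == 0:
        raise ValueError("variance components must be non-negative and not both zero")
    return between_var / total


def exceeds_cutoff(share: float, cutoff: float = 0.30) -> bool:
    """Flag a between-school share above the cutoff (assumed reading of the 30 percent rule)."""
    return share > cutoff


# Hypothetical variance components, chosen only to match the reported split.
TAU_00 = 54.0   # between-school variance (illustrative value)
SIGMA_2 = 46.0  # within-school, between-student variance (illustrative value)

if __name__ == "__main__":
    share = between_school_share(TAU_00, SIGMA_2)
    print(f"between-school share: {share:.0%}")
    print(f"within-school share:  {1 - share:.0%}")
    print("above 30 percent cutoff:", exceeds_cutoff(share))
```

Only the ratio of the two components matters here, so any pair of variances in the proportion 54:46 would give the same share.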
Student-level variables explained only 4.7 percent of the within-school variance in outcomes: 2.9 percent of the variance was explained by ascriptive characteristics, 0.1 percent by the availability and usage of texts, 1.2 percent by student attendance and study habits, and 0.5 percent by parental roles. Between-school variables cumulatively explained 34.2 percent of the variance in outcomes: 9.5 percent by geographic factors, 9.5 percent by text usage and homework assignments, 11.6 percent by teachers' characteristics, 0.7 percent by teachers' roles, 1 percent by principals' characteristics, and 1.9 percent by parental roles.

Table 6: Cross Level Effects of School Characteristics on Mathematics Achievement Slopes

School-level factors are grouped as Geographic (Rural, Selva, Sierra), Text provided by (Sibling, School, None), and School characteristics (Private school, In-service training, Years of service).

   Difference between                         School-level factor        Coef.      S.E.
A  Girls and boys                             Mean effect                -3.61 *    0.21
                                              Rural                       0.69      0.53
                                              Selva                       1.26 *    0.55
                                              Sierra                      1.14 *    0.48
                                              Text provided by sibling    0.02      0.02
                                              Text provided by school     0.05 *    0.02
                                              No text                     0.03 *    0.01
                                              Private school             -0.53      0.68
B  Student over age for grade (1)             Mean effect                -2.15 *    0.26
                                              Private school             -2.72 *    1.26
C  Aymara and Spanish                         Mean effect                 0.95      0.73
                                              Private school              3.94      2.60
D  Quechua and Spanish                        Mean effect                -1.03 *    0.36
                                              Private school              4.28 *    1.37
E  No studying vs. self-motivated             Mean effect                -3.14 *    1.04
                                              Rural                      -0.76      2.22
                                              Private school             -8.99 *    4.01
F  Studies regularly vs. self-motivated       Mean effect                -1.71 *    0.21
                                              Rural                      -0.22      0.59
                                              Private school             -2.26 *    0.58
G  Studies for exams vs. self-motivation      Mean effect                -2.78 *    0.25
                                              Rural                       1.27      0.68
                                              Private school             -2.04 *    0.74
H  Studies because expected vs.               Mean effect                 4.02 *    0.24
   self-motivated                             Rural                       2.08 *    0.60
                                              Private school             -3.44 *    0.78
I  Environment for study vs.                  Mean effect                 0.47      0.31
   through homework                           In-service training         0.18      0.10
J  None vs. through homework                  Mean effect                -0.20      0.22
                                              In-service training         0.06      0.07
K  Special education programs vs.             Mean effect                -0.71      0.75
   through homework                           In-service training         0.11      0.26
L  Additional reading vs.                     Mean effect                 1.02 *    0.23
   through homework                           In-service training         0.16 *    0.08
M  Develop literacy vs.                       Mean effect                -1.10 *    0.23
   learning well, in general                  Rural                       1.87 *    0.60
                                              In-service training        -0.13      0.08
                                              Years of service            0.11 *    0.03
N  Develop nothing vs.                        Mean effect                -2.01 *    0.38
   learning well, in general                  Rural                       2.08 *    0.94
                                              In-service training        -0.06      0.13
                                              Years of service            0.00      0.05
O  Develop comprehensively vs.                Mean effect                 0.43      0.23
   learning well, in general                  Rural                       0.98      0.64
                                              In-service training         0.04      0.08
                                              Years of service            0.01      0.03
P  Develop mathematics vs.                    Mean effect                 1.06 *    0.26
   learning well, in general                  Rural                      -0.42      0.72
                                              In-service training         0.18 *    0.09
                                              Years of service            0.01      0.03
Q  No text versus teacher dictated text       Mean effect                -0.69 *    0.26
                                              Years of service            0.02      0.04
R  School provided text versus                Mean effect                -0.45      0.36
   teacher dictated text                      Years of service           -0.13 *    0.05
S  Sibling text versus teacher                Mean effect                -0.08      0.22
   dictated text                              Years of service           -0.05      0.03
*P