WPS6385

Policy Research Working Paper 6385

Setting Reasonable Performance Targets for Public Service Delivery

John L. Newman
João Pedro Azevedo

The World Bank
Latin America and the Caribbean Region
Poverty Reduction and Economic Management Department
March 2013

Abstract

Reaching agreement on a reasonable performance target is a challenge, with costs associated with getting it wrong. Attention in the literature has focused on the potential negative effects of gaming or of creaming. However, even if there is no gaming or creaming taking place, there can still be costs associated with setting a level of the performance target that is either too low or too high. On the one hand, if the negotiated performance target is too low, there is a strong risk that the target would be met without any change in behavior or performance from what would have been realized without a performance management system. In that case, there would be no benefit, only the cost of covering the administrative costs associated with developing the monitoring and management systems. On the other hand, if the negotiated performance target is too high, there could also be significant costs. The exact nature of the costs depends on which one of two unattractive options the principal chooses to follow once it becomes apparent that the performance targets were set unrealistically high. If the principal chooses simply to waive any possible repercussions for the agents for not meeting the performance targets, this can undermine the credibility of the system. If the principal insists on holding agents to meeting the performance targets, no matter how unrealistic they were, this can breed resentment and adversely affect future productivity. This paper considers some approaches to target setting that have been used in the literature and proposes an approach based on the use of quantile regressions to construct a Characteristic Adjusted Performance distribution of performance to guide the selection of targets. The paper then presents two concrete examples of applications of this approach related to the setting of targets on School Test Scores and Improvement in Homicide rates in Police Districts in the State of Minas Gerais, Brazil.

This paper is a product of the Poverty Reduction and Economic Management Department, Latin America and the Caribbean Region. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://econ.worldbank.org. The authors may be contacted at jnewman@worldbank.org and jazevedo@worldbank.org.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team

Setting Reasonable Performance Targets for Public Service Delivery 1

John L. Newman* João Pedro Azevedo+

Keywords: Government Performance; Behavior of Economic Agents; Target Setting; Latin America; Brazil; Minas Gerais
JEL Codes: H11, H30; H83
Sector Board: Poverty Reduction (POV)

The findings, interpretations, judgments and conclusions expressed in this publication are those of the authors and should not be attributed to the World Bank, to its affiliated organizations, or to members of the Board of Executive Directors or the governments they represent.

1 The authors would like to thank Hernan Jorge Winkler for excellent research assistance, the World Bank’s Public Sector Performance Global Expert Team for providing references on country experience on public sector improvement initiatives and Tadeu Barreto, Iran Almeida Pordeus, Eder Campos, Eber Golcalves, Juliana de Lucena Ruas Riani, Tarsila Velloso, Barbara Bruns, David Evans, Roland Clarke, Laura Zoratto, Deborah Wetzel and Jose Guilherme Reis for very helpful conversations on the issues in this paper. (*) Poverty Reduction and Economic Management team of South Asia; and, (+) Poverty Reduction and Economic Management team of Europe and Central Asia. Correspondence should be directed to jnewman@worldbank.org and jazevedo@worldbank.org

1. Introduction

It is increasingly common for negotiations over performance targets to take place as part of a process to improve performance in the public sector. 2 These negotiations are carried out between a principal (the entity that is demanding the improved performance, usually the Ministry of Finance, Planning or the Head of a Sectoral Ministry) and an agent (the entity that is responsible for improving performance, often a sectoral ministry, a subnational unit of a sectoral ministry or an entity delivering services such as a school or a health center).
There are quite a few issues that must be successfully resolved for the introduction of performance targets to bring about an improvement in public sector performance (see the recent book edited by Heckman, Heinrich and Smith, 2011). 3 This paper deals with only one of the issues - albeit an important one - that of arriving at the performance target in the first place. 4 It considers two cases in Minas Gerais, Brazil - education and crime and violence - and compares the relative merits of two approaches to incorporating information on past performance of different agents to arrive at reasonable targets. One approach has been used in the setting of targets in the Job Training Partnership Act (JTPA) program. An alternative approach, and the recommended one, is based on estimating quantile regressions and is being used in estimating growth percentiles of student performance in the education sector in a few states in the U.S. There are strong parallels between the problem of assessing what constitutes adequate performance in student achievement and the problem of guiding the selection of targets in a performance management system that operates across different levels of government. However, this approach used in education has not yet made its way into the arena of general public sector performance management. It could and should be considered for the more general task of arriving at reasonable targets in many sectors.

2. The problem - How to set a reasonable performance target

Reaching agreement on a reasonable performance target is a challenge, with costs associated with getting it wrong. Attention in the literature has focused on the potential negative effects of gaming or of creaming. However, even if there is no gaming or creaming taking place, there can still be costs associated with setting a level of the performance target that is either too low or too high.
If the negotiated performance target is too low, there is a strong risk that the target would be met without any change in behavior or performance from what would have been realized without a performance management system. In that case, there would be no benefit - only the cost of covering the administrative costs associated with developing the monitoring and management systems. On the other hand, if the negotiated performance target is too high, there could also be significant costs. The exact nature of the costs depends upon which one of two unattractive options the principal chooses to follow once it becomes apparent that the performance targets were set unrealistically high. If the principal chooses simply to waive any possible repercussions for the agents for not meeting the performance targets, this can undermine the credibility of the system. If the principal insists on holding agents to meeting the performance targets - no matter how unrealistic they were - this can breed resentment and adversely affect future productivity.

2 Countries which have introduced performance targets include Australia (see http://www.federalfinancialrelations.gov.au/content/performance_reporting.aspx); Chile (see http://www.gob.cl/cumplimiento/gestion_de_cumplimiento.html); Colombia (see http://www.sigob.gov.co/pnd/inst.aspx); France (see http://www.performance-publique.budget.gouv.fr/la-performance-de-laction-publique.html); India (see http://www.performance.gov.in/); Indonesia (see bappenas.go.id/get-file-server/node/9374/); Malaysia (see http://www.pemandu.gov.my/); the UK (see http://www.instituteforgovernment.org.uk/performance/); and the US (see http://www.performance.gov/). The state of Minas Gerais in Brazil has also placed a strong emphasis on performance management (see http://www.agendademelhorias.org.br/).
3 That book points out that while there is general interest in performance management in many sectors, there is a long history of managing such systems in job training and exploits that long history to address general issues related to performance management.
4 Courty, Heinrich and Marschke (2011) also discuss the setting of targets (or standards). However, their analysis does not consider the use of quantile regressions considered in this article.

The task is further complicated by the fact that there is typically a cascading down of performance targets being set at a first level administrative unit (perhaps at a national or state level) down to a second and/or third level administrative unit (perhaps a district or a county). There is often considerable variation in the initial conditions and the potential for performance across the lower level administrative units, making it difficult or unreasonable to set identical performance targets across administrative units.

3. Possible approaches

Performance targets are usually set as an absolute target for an agent to reach, which is known to the agent prior to the period where performance would be reviewed. 5 The absolute target may be specified for a level of the performance indicator or a change in the indicator. 6 It is not uncommon for an absolute target to be set simply at a value which is a given percentage above the previous value, or some average of the previous value of the chosen performance indicator. For example, the principal could request that the agents reach a performance level that is 5 percent higher than the previous value. The problem with this approach is that it does not take into account the different initial conditions that the agents face that could make it easier or harder to achieve the targets. Ideally, one would want the probability of achieving the target to be due solely to effort. There should be no systematic tendency for an agent with a given set of characteristics to succeed or fail in reaching the target.
However, if there are different initial conditions that do affect the probability of achieving the target, then not taking them into account in the setting of the target will generate a systematic pattern in ex post observations of whether the target is achieved or not. This notion - that setting a reasonable target would be consistent with there being no systematic relation between achieving the target and a circumstance outside the control of the agent - is very similar to the underlying principle that has been used to create a metric to measure the equality of opportunity (Barros et al, 2009). In section 4, we show how that metric can be used to assess the relative quality of different approaches to target-setting.

5 Barlevy and Neal (2011) propose a Pay for Percentile incentive plan that is based on relative performance and not on attaining a performance level or change relative to a target. A nice brief discussion of the plan is presented by Barlevy and Neal in VoxEU (see http://www.voxeu.org/index.php?q=node/7073). This is discussed in more detail in section 6.
6 In the education sector, many school districts and researchers stress the importance of looking at both achievement (the level) and growth. A nice interactive demonstration of both performance and growth information on schools for the Colorado State Department of Education can be found at http://www.schoolview.org/ColoradoGrowthModel.asp

There are several approaches that can and have been used to adjust the targets to observable characteristics that are related to the difficulty of achieving the target. One possible approach to adjusting the target for differences in initial conditions would be to generate a predicted mean value of performance for the agent, given the agent’s characteristics, and use that predicted mean value to help guide negotiations on possible performance standards.
The predicted mean value could be obtained by estimating an ordinary least squares (OLS) regression and multiplying the estimated coefficients by the value of the characteristics of the agent. However, this adjustment is not entirely satisfactory as the idea of working with performance targets is to try to induce behavior that would encourage agents to avoid poor performance and/or achieve better than average performance. There is interest in identifying performance that is distinct from average performance. The incentives that are built into performance management systems typically kick in when performance is either exceptionally poor (which might invoke some requirement that the agent receive additional supervision or technical help) or exceptionally good (which might generate a reward). It would be possible to use the predicted mean value as an anchor and negotiate standards that would be 10% or 20% higher than the predicted mean. However, in so doing the negotiations would begin to be drawn further and further away from a sound empirical grounding. One does not really know how demanding it would be to reach a performance level that is 20% higher than the mean. Has that been achieved before? How often and how easily? An alternative approach, used by the U.S. Labor Department in setting the performance targets for training centers in the Job Training Partnership Act (JTPA) program, is to set targets based on a specified percentile distribution of the performance indicator – adjusted upwards or downwards according to the characteristics of particular agents (see Courty, Heinrich & Marschke, 2011). The JTPA was concerned with avoiding poor performance in the individual centers and, therefore, selected the 25th percentile as the base performance target. The JTPA managers then factored in an adjustment that was determined from an OLS regression of the performance indicator on the characteristics of the center (expressed as deviations from the mean). 
The adjustment factor was obtained by multiplying the estimated coefficients from the OLS regression by the value of the characteristics of the center. This is better than the first approach, in that it takes as its starting point a percentile of the performance distribution that is judged to be relevant. However, the adjustment factor does have a limitation in that it adjusts the target by the same amount, regardless of whether the relevant percentile is chosen to be the 25th percentile or the 67th percentile. There is no reason to believe that the influence of the characteristics of the agents would be the same around the 25th percentile as they would be around the 67th percentile. A third approach - and our recommended approach - is to use quantile regressions of the performance indicator on a set of observable initial conditions to generate a predicted percentile for each agent. The target would then be set as the value that pertains to a common percentile, for example the 67th percentile. Different agents would face a common percentile target, but this would correspond to different absolute values of the target because their characteristics differ. This would be true whether the target is specified in terms of a specific level of the performance indicator or a change in the performance indicator. Quantile regressions can control for whatever observable characteristics the interested observer may want to take into account in putting past performance and future goals in perspective. In contrast to an OLS regression that estimates the relation at the mean of the distribution, quantile regression can estimate the relation at the median, the 25th percentile or, indeed, at any given percentile (Koenker and Bassett, 1978; Koenker, 2005). 
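As a minimal sketch of what it means to estimate a relation at a chosen percentile rather than at the mean, the Python fragment below evaluates the check (pinball) loss that quantile regression minimizes (Koenker and Bassett, 1978) in the simplest possible case, a model with a constant and no covariates. In that case the minimizer is just a sample quantile. The data values and function names are hypothetical illustrations and are not taken from the PROEB data; a regression with covariates would normally be estimated with a statistical package rather than by the brute-force search used here.

```python
def check_loss(residual, tau):
    """Check (pinball) loss for a single residual at quantile tau."""
    indicator = 1.0 if residual < 0 else 0.0
    return residual * (tau - indicator)


def total_loss(values, candidate, tau):
    """Sum of check losses when every value is predicted by `candidate`."""
    return sum(check_loss(v - candidate, tau) for v in values)


def intercept_only_quantile(values, tau):
    """Constant-only quantile 'regression' by brute-force search.

    With no covariates the objective is piecewise linear and convex,
    so a minimizer is always attained at one of the observed values.
    """
    return min(sorted(values), key=lambda c: total_loss(values, c, tau))


# Hypothetical changes in a performance indicator for nine agents.
changes = [-6.0, -2.0, 0.0, 1.0, 3.0, 4.0, 7.0, 9.0, 12.0]

q25 = intercept_only_quantile(changes, 0.25)
q67 = intercept_only_quantile(changes, 0.67)
print("25th percentile fit:", q25)
print("67th percentile fit:", q67)
```

Adding covariates replaces the single constant with a linear combination of characteristics, so each percentile can have its own set of coefficients.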
Just as the estimated coefficients from an OLS regression might be combined with values of the country characteristics to yield a predicted mean value of the change in the indicator of interest, the estimated coefficients from a quantile regression on the 25th or 67th percentile can be used to generate predicted values at the 25th or 67th percentile. Quantile regressions have been used to generate student growth percentiles (Betebenner, 2009) and conditional growth charts (Wei and He, 2006). The work on student growth percentiles has generally conditioned only on past achievement levels (and not observable characteristics of the students or their teachers), as the aim in introducing the concept into educational management in such places as the states of Colorado and Massachusetts has been to separate the description of student progress (the SGP) from the attribution of responsibility for that progress. 7

Quantile regressions have been used quite extensively in economics (see Koenker and Hallock, 2001 for a review). Three applications that are most closely related to controlling for heterogeneity in looking at the performance in percentiles are: a) Essama-Nssah et al (2010), which used quantile regressions 8 in an extension of the growth incidence curves that have been used to look at pro-poor growth (Ravallion and Chen, 2003); b) the World Bank Poverty Assessment for Guatemala (World Bank, 2009), which used quantile regressions to look at conditional percentile distributions for a set of poverty and social indicators; and c) Azevedo and Pizzolitto (2009), which used quantile regressions to assess the targets of the National Development Plan of the Dominican Republic. The approaches followed with student growth percentiles and conditional growth charts share some similarities with the approach followed with growth incidence curves. They both look at the growth of an indicator at a given percentile in the distribution.
They both can control for background characteristics. However, growth incidence curves are typically calculated from repeated cross-sections. They typically cannot provide information on the distribution of performance of those households or economic actors who began at a given percentile. In contrast, student growth percentiles or conditional growth charts (often of children’s height or weight) typically look at repeated observations of the same individual and at where the performance of that individual falls in the full distribution of performance of a comparable group. For the issue of target setting, one is typically concerned with repeated observations of the same agents, so the approach of the student growth percentiles or conditional growth charts would appear to be the most relevant.

7 See Betebenner et al (2011). See also the primers on student growth percentiles put together by the states of Colorado and Massachusetts (http://www.cde.state.co.us/cdedocs/Research/PDF/Aprimeronstudentgrowthpercentiles.pdf and http://www.doe.mass.edu/mcas/growth/InterpretiveGuide.pdf).
8 They use unconditional quantile regressions developed by Firpo et al (2009), rather than the conditional quantile regressions used by Betebenner, Wei and He.

4. An illustration of target setting using School Education Data from Minas Gerais, Brazil

This section uses data on student achievement scores across different schools in the state of Minas Gerais, Brazil to illustrate some of the implications of using different approaches to set performance targets. The primary source of data for this exercise is the PROEB (Program of Evaluation of Basic Education), which carries out Portuguese and math tests at the pupil level for municipal and state school students in 5th, 8th and 11th grade in the state of Minas Gerais. This analysis focuses only on the State schools (which were under the jurisdiction of the State Government of Minas Gerais) and, for simplicity, focuses only on the results of 5th grade students.

Student achievement tests were introduced by the state government of Minas Gerais in 1992 as part of its evaluation program designed to improve the quality of education, and were part of a broader proposal for education on the part of the government. The achievement tests included complementary socio-economic information through self-administered questionnaires. Initially, the evaluations were planned and carried out in two year cycles, but in 1998 the system was modified to administer yearly tests. 9

In 2007, the state of Minas Gerais embarked on an ambitious and successful reform program (State for Results) to improve performance in a comprehensive set of public sector activities across all sectors that involved setting targets on a wide range of indicators (see annex 3). In the education sector, the Secretary of Education set targets in 2008 for proficiency in test scores in Portuguese and Mathematics for different grades. While some negotiation of the targets took place, for the most part the targets were set at a 4 percent increase over the 2007 test score for the school. 10 Figure 1 illustrates that there was some (slight) difference in the target according to the value of the Portuguese proficiency in 2007. It shows the difference between the target and 1.04 times the 2007 value of Portuguese proficiency for different values of Portuguese proficiency in 2007.

Figure 1. Variation in Difference between 2008 Secretary of Education Target and 1.04 * Actual Portuguese Proficiency in 2007, Schools in Minas Gerais, Brazil

9 Today, the state of Minas Gerais has a comprehensive evaluation system for its educational system - SIMAVE: Quality of Education and School Evaluation System.
This system is made up of three evaluation programs: PROALFA, geared towards the evaluation of literacy levels; PROEB, to verify the efficiency and quality of education based on performance in the final levels of education; and PAAE, to perform progressive learning diagnoses to support pedagogical interventions.
10 The fact that the target was very close to a set percentage for each school can be seen from the results of a regression of the 2008 target on the 2007 value of the test score (with the regression constrained to exclude the constant): Target for 2008 Portuguese Proficiency = 1.04279 * 2007 Portuguese proficiency (4,866); R2 = .9999; N = 2416.

[Figure 1 plot. Vertical axis: Difference; horizontal axis: 2007 Portuguese Proficiency, 5th Grade]

Figure 2 plots the difference between the 2008 actual and Secretary of Education target performance of schools in average Portuguese proficiency of 5th grade students against some observable characteristics of schools (the number of students in 2007, the school participation rate of students in the PROEB exam in 2008 and the average Portuguese proficiency of 5th grade students in 2007). As discussed in section 3, ideally there should be no relation between whether the target is achieved and the observed characteristics. While there seems to be little or no relation between the difference of the 2008 actual and target levels of Portuguese proficiency and either the number of students or the participation rate in the exam, it does appear as if those schools with higher levels of Portuguese proficiency in 2007 were systematically less likely to achieve the target. The same 4% improvement applied to a higher level of achievement calls for a larger absolute improvement. The graphs show there is a strong tendency for schools which had a higher absolute target to be less likely to have achieved their targets.
This is more likely to be due to how the targets were set in the first place than to the ability of any particular school to achieve the target.

Figure 2. Variation in Actual 2008 Performance and 2008 Sec. of Education Target in Portuguese Proficiency, 5th Grade Schools - Minas Gerais, Brazil

[Figure 2 plots, three panels. Vertical axis in each panel: Difference 2008 Actual - Sec. Ed Target. Horizontal axes: Total Number of Students 2007; School Participation Rate in PROEB 2008; School Average Proficiency Score in Portuguese of 5th Grade Students, 2007]

Comparison of Results from Secretary of Education Targets and Targets Based on Quantile Regressions

Figure 3 graphs the same relationship, this time replacing the target set by the Secretary of Education with the target generated from quantile regressions (our recommended approach). The target for each school is generated from the following procedure. First, a quantile regression at the 67th percentile was run that regressed the change in Portuguese proficiency between 2006 and 2007 on a regional dummy for Grande Norte, the total number of students in the school in 2006 and the initial level of Portuguese proficiency in 2006. Then, the estimated coefficients from that estimation were multiplied by the 2007 values of the right hand side variables in the quantile regression (i.e. the constant term and the regional dummy and the 2007 values of the number of students and Portuguese proficiency). For each school, this yields the change in Portuguese proficiency that it would have if it were able to achieve the performance equivalent to that of schools at the 67th percentile among schools with the same characteristics.
Finally, this estimated change at the 67th percentile of performance for the particular characteristics of the school was added to the base 2007 value of Portuguese proficiency for each school to generate a target level for 2008.

Figure 3. Variation in Actual 2008 Performance and 2008 Quantile Regression Target in Portuguese Proficiency, 5th Grade Schools - Minas Gerais, Brazil

[Figure 3 plot. Vertical axis: Difference 2008 Actual - QR Target; horizontal axis: School Average Proficiency Score in Portuguese of 5th Grade Students, 2007]

Figure 4 again graphs the same relationship, this time with the target set in the same fashion as was followed in the Job Training Partnership Act (JTPA). The JTPA approach was to start from the value of a specific percentile of the empirical distribution of the performance standard (in this case, the 67th percentile of the change in Portuguese proficiency in 5th grade) and adjust it upwards or downwards according to the characteristics of the school. The adjustment factor is calculated by multiplying the characteristics of the school (expressed as a deviation from the mean) by the estimated coefficients for that characteristic obtained from an OLS regression of the change in Portuguese proficiency on the characteristics. The value of the base 67th percentile of the change plus the adjustment factor was added to the 2007 Portuguese proficiency to yield a 2008 target for the level of Portuguese proficiency.

Figure 4. Variation in Actual 2008 Performance and 2008 Target in Portuguese Proficiency, 5th Grade Schools - Minas Gerais, Brazil (Target based on JTPA-type adjustment to 67th Percentile)

[Figure 4 plot. Vertical axis: Difference 2008 Actual - JTPA Target; horizontal axis: School Average Proficiency Score in Portuguese of 5th Grade Students, 2007]

The differences revealed in these graphs are also revealed in Table 1, which compares the results of probit estimates of the probability of achieving the target on the Portuguese proficiency in 5th grade in 2007, the number of students in 2007, a dummy variable for whether the school was in the Grande Norte (a poorer part of Minas Gerais) and the participation rate in the PROEB test in 2008. Note that the pseudo R2 is highest for the probit using the Secretary of Education target, while the number of significant coefficients is lowest for the target based on the quantile regressions. Note also that, even though there are differences in the explanatory power across the different targets, there are not large differences in the characteristics of the targets themselves. This actually is a warning sign as it indicates that, in the aggregate, there may be little difference in the targets, but there can still be systematic differences in who is likely to achieve the target.

Table 1.
Probit Estimates Using Alternative Targets (Meeting target=1, 0 otherwise)

Columns: (1) Secretary of Education Target; (2) Target Based on Quantile Regression at 67th Percentile; (3) Target based on JTPA-type adjustment to value at 67th percentile

Variable | (1) | (2) | (3)
Constant | 4.176*** (14.84) | .194 (0.78) | -.178 (-0.72)
2007 Portuguese proficiency | -.021*** (-14.90) | -.001 (-.085) | .001 (0.89)
2007 Number of students | -.0001* (-1.93) | .00001 (0.19) | -.0001** (-2.36)
2008 Participation Rate in PROEB | .0002* (-1.86) | .0003** (2.42) | .0003** (2.45)
Dummy for Grande Norte | -.131** (-2.08) | .096 (1.60) | .161*** (2.68)
Pseudo R2 | .0887 | .0052 | .0090
N | 2204 | 2255 | 2255
Mean of target | 199.9 | 199.8 | 200.0
Standard Deviation of target | 22.2 | 15.4 | 14.8
Range of target | 130.1-300.4 | 148.1-264.5 | 150.2-259.9
Percentage of Schools that met target in 2008 | 52.3% | 53.1% | 52.3%

Looking at patterns of coefficients in probit estimations is one way of judging whether the probability of achieving the target is systematically related to the particular circumstances of the agents. However, there is some value to having a summary measure that could be used to judge whether there is any significant relation to the circumstances. A summary measure is readily available because the problem that is posed in judging the adequacy of the target is an exact parallel to the problem of determining whether opportunities are equally distributed in the population in the literature of equality of opportunity. The metric developed for the equality of opportunity, the human opportunity index, is an inequality-adjusted coverage rate, with the adjustment or penalty determined by the extent to which there is a deviation from randomness in who receives the opportunity and who does not. If there is no systematic relation between who gets the coverage and a set of circumstances, then there will be no difference between the estimated coverage rate and the human opportunity index. The relevant test, then, is a test of whether the penalty is equal to zero.
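The relation between coverage, the penalty and the human opportunity index can be made concrete with a short Python sketch. It assumes the usual definitions from the equality of opportunity literature: the penalty is the coverage rate multiplied by a dissimilarity index D, and the index is the coverage rate minus the penalty. The paper does not spell these formulas out, so they are an assumption here, although the values reported in Table 2 are consistent with them up to rounding. The function names and the predicted probabilities in the last lines are hypothetical.

```python
def dissimilarity_index(predicted_probs):
    """D = mean(|p_i - p_bar|) / (2 * p_bar) for equally weighted agents.

    `predicted_probs` are predicted probabilities of meeting the target,
    for instance fitted values from a probit on the agents' circumstances.
    """
    n = len(predicted_probs)
    p_bar = sum(predicted_probs) / n
    mean_abs_dev = sum(abs(p - p_bar) for p in predicted_probs) / n
    return mean_abs_dev / (2.0 * p_bar)


def hoi_summary(coverage_pct, dissimilarity_pct):
    """Penalty and HOI, in percent, from coverage and D (both in percent)."""
    penalty = coverage_pct * dissimilarity_pct / 100.0
    return {"penalty": penalty, "hoi": coverage_pct - penalty}


# Coverage and dissimilarity index as reported in Table 2 (percent).
reported = {
    "Secretary of Education target": (52.27, 13.03),
    "Quantile regression target": (53.13, 2.50),
    "JTPA-type adjustment target": (52.28, 3.58),
}
summaries = {name: hoi_summary(c, d) for name, (c, d) in reported.items()}
for name, s in summaries.items():
    print(f"{name}: penalty={s['penalty']:.2f}, HOI={s['hoi']:.2f}")

# Hypothetical predicted probabilities: identical chances give D = 0,
# chances that vary with circumstances give D > 0.
even = dissimilarity_index([0.5, 0.5, 0.5, 0.5])
uneven = dissimilarity_index([0.3, 0.4, 0.6, 0.7])
```

Under these definitions a target-setting rule whose success probabilities do not vary with school circumstances has D equal to zero, so its HOI equals its coverage rate.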
Table 2 presents the mean value of the key HOI related indicators as well as the lower and upper bounds of their 95 percent confidence intervals. Looking at the penalties, it is clear that the targets based on either the quantile regression or the JTPA-type adjustment do better than the Secretary of Education target (the 4% improvement). The quantile regression model appears to do marginally better than the JTPA-type adjustment, but in both cases the confidence interval for the penalty lies above zero. It is worth pointing out that setting the targets according to a quantile regression estimation or according to a JTPA-type adjustment was never actually carried out. The results presented here strongly suggest that setting the targets based on the quantile regressions or the JTPA-type adjustments is likely to reduce sharply, if not eliminate, the tendency for the outcomes to be systematically related to some circumstances of the schools. It is entirely possible that, if a direct randomized experiment were carried out to test how different schools respond to targets set under the three approaches, the small but positive penalties observed with the quantile regression and JTPA-type adjustments would be eliminated.

Table 2. HOI Indicators with alternative definitions of targets

                                      Secretary of       Target based on       Target based on JTPA-type
                                      Education target   quantile regression   adjustment to value
                                                         at 67th percentile    at 67th percentile
Coverage (Pct meeting the target)     52.27              53.13                 52.28
                                      (50.31 to 54.23)   (51.07 to 55.18)      (50.23 to 54.34)
Dissimilarity Index                   13.03              2.50                  3.58
                                      (6.64 to 19.42)    (-5.24 to 10.24)      (-4.43 to 11.58)
HOI                                   45.46              51.79                 50.41
                                      (43.29 to 47.62)   (49.51 to 54.08)      (48.12 to 52.71)
Penalty                               6.81               1.33                  1.87
                                      (6.04 to 7.59)     (0.35 to 2.30)        (0.87 to 2.86)

5. Looking at the entire unconditional and conditional distribution of performance

The results presented in Table 2 indicate that it is important to make some adjustment to initial conditions as compared with an across the board target of a 4% improvement, but it appears that there is not that great a difference between the quantile regression approach and the JTPA-type adjustment. However, the fact that in this particular example the differences are not that great does not mean that this will always be the case. It depends on how much the estimated coefficients vary across different quantiles. If the coefficients in the quantile regressions do not vary much, then it is not surprising that the adjustments based on an OLS would be close to those based on a quantile regression. Indeed, the two approaches will lead to equivalent targets for all schools if the coefficients in the quantile regressions are constant over all quantiles. Annex 1 plots the coefficients for 99 different quantile regression estimations and the plots reveal that the coefficients are not always constant. Of course, each case is different. But in general, one would not expect the coefficients in quantile regressions to always be constant, so there will be differences between the approach followed by JTPA (which implies a constant adjustment to each percentile) and the approach based on quantile regressions. Because it is just as easy to do the quantile regressions in Stata as the OLS adjustments and because the quantile regressions are more flexible, the quantile regression approach would appear to dominate the OLS JTPA-type adjustment.

Comparison of Empirical and CAP Distributions

The discussion in the paper has focused on the setting of targets and, for that, one need only choose the percentile that is considered most relevant and then estimate a single quantile regression at that percentile.
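To make the comparison between a single quantile regression target and an OLS-based adjustment concrete, the two targets can be written side by side. The notation is introduced here for illustration and is not taken from the paper: y_i is the outcome for school i, x_i its characteristics, and τ the chosen percentile. The OLS-based expression is a stylized form of a JTPA-type adjustment, stated as an assumption.

```latex
\[
T_i^{QR}(\tau) = \hat\alpha(\tau) + x_i'\hat\beta(\tau),
\qquad
T_i^{OLS}(\tau) = \hat\alpha_{OLS} + x_i'\hat\beta_{OLS} + \hat q_\tau(\hat u),
\]
where $\hat q_\tau(\hat u)$ is the $\tau$-th quantile of the OLS residuals.
If the slope coefficients do not vary with the quantile,
$\beta(\tau) = \beta$ for all $\tau$, then
\[
T_i^{QR}(\tau') - T_i^{QR}(\tau) = \hat\alpha(\tau') - \hat\alpha(\tau)
\quad \text{for every school } i,
\]
so moving between percentiles shifts all targets by the same constant,
which is what the OLS-based adjustment imposes by construction.
```

When the slopes do vary with τ, the difference between targets at two percentiles depends on x_i, and the two approaches assign different targets to schools with different characteristics.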
In addition to target setting, a policy maker might be interested in ascertaining whether an observed change in an indicator reflects performance that is good, bad or indifferent. For that it is useful to have some context against which to compare the outcome. One could compare the observed change with the empirical distribution of all past changes. However, the same considerations come into play as with target setting. The empirical distribution does not take into account the particular circumstances of the agent, and to judge properly whether the performance has been good, bad or indifferent those circumstances need to be taken into account. Just as one estimated a target based on the quantile regression of the 67th and 25th percentiles, one can estimate 99 different quantile regressions and use the estimated coefficients to generate an estimated value if performance was equivalent to that of the 10th, 20th or nth percentile. We call the distribution of these estimated percentiles the CAP, or Characteristic Adjusted Performance, distribution.

Figure 5 plots the empirical distribution for the change in Portuguese proficiency between 2007 and 2006, together with the CAP distributions for the change in Portuguese proficiency for the same period for two selected schools in Minas Gerais. [11] School A was randomly selected from the set of schools that were in the top 25 percent of all schools in terms of the number of students in 2007 and in terms of the level of Portuguese proficiency in the 5th grade. School B was randomly selected from the set of schools that were in the bottom 25 percent of all schools in terms of the number of students in 2007 and in the level of Portuguese proficiency in the 5th grade. The red vertical lines indicate the values at the 25th and 67th percentiles.

[11] Annex 2 provides information on a Stata program that has been developed to automate the process of generating the CAP distribution.

Figure 5. Comparison of Empirical distribution and CAP distributions for Two Schools in Minas Gerais, Brazil
[Figure: the empirical distribution and the CAP distributions for School A and School B, plotted against percentiles from 0 to 100. The 95 percent confidence intervals are given by the shaded blue area.]

While the CAP distribution for the smaller and less proficient school closely approximated the empirical distribution, that was not the case for the larger and more proficient school. Note that the 95 percent confidence intervals widen substantially at the extremes, largely below the 5th percentile and above the 95th percentile. From a practical standpoint, this should not create many serious problems. One would rarely want to set a performance target equivalent to performance at or above the 95th percentile of the entire distribution. Similarly, one would rarely want a target whose only purpose is to avoid the poor performance of the bottom 5 percent of the distribution.

An example from Crime and Violence

Another example of how one can make use of information on the entire CAP distribution is provided by an analysis of the performance of police departments in reducing homicides, again in the state of Minas Gerais. As explained earlier, the State for Results program in Minas Gerais called for setting targets across a variety of sectors and was not focused just on achievement data on schools. Targets for reducing homicides were set, but this was done as an outcome of a negotiation rather than by setting a 4% improvement as was the case in education. Information on the distribution of performance based on all available past changes and based on the information conditioned on characteristics relevant for the different police districts can help inform those negotiations. For example, homicide data may be available for all police districts in a state (as was the case for Minas Gerais).
The target negotiated between the principal and the agent may be for an improvement in homicides at the state level, but if the target is to be met, it will be up to individual police districts to improve the outcomes in each one of their districts. It is often the case that a sector will set targets not only at a high administrative level but also for sub-state administrative units. Setting the same state-level target for each sub-state administrative unit implies greater or lesser demands on performance improvement, given that the police districts are starting from different initial conditions. Comparing the targets to some of the changes that have taken place in the recent past may help provide some check on the reasonableness of the performance improvements which are being sought.

Figure 6 – Unconditional relative performance on homicide reduction from 2000-2007, by Police Regions
[Figure: radial chart on a percentile scale from 0 to 100 showing three series (Trend 2000/2008, Trend 2004/2008 and Target 2008/2009) for the police regions of Barbacena, Vespasiano, Belo Horizonte, Unaí, Contagem, Uberlândia, Curvelo, Uberaba, Divinópolis, Teófilo Otoni, Governador Valadares, Patos de Minas, Ipatinga, Montes Claros, Juiz de Fora and Lavras.]
Source: Author’s calculations using data from Peixoto et al (2010)

Figure 6 represents a simple way to present comparisons for many police districts simultaneously. This type of figure is useful when the number of districts is limited, but would (obviously) not be useful for presenting targets for individual schools, where the number of schools would be in the thousands. Rather than presenting the entire empirical distribution for the indicator of interest (as in Figure 5) and showing where the target falls in that distribution, Figure 6 presents the percentiles for recent changes, changes over a longer period and changes that would be implied by meeting the targets, all together for all the different police districts.
The graph enables one to see quickly that the districts fall into three groups. In some districts, the relative performance being asked of the district with its given target is not very different from both its medium-term and recent tendencies (Juiz de Fora). In others, the targets are quite a bit more demanding than their medium-term tendencies, but not more demanding than their recent tendencies (Vespasiano, Belo Horizonte and Contagem). In a third group, meeting the targets would require a considerable shift from both their medium-term and recent tendencies (Teófilo Otoni and Montes Claros). Identifying where a district may have to up its game considerably should alert stakeholders to examine what improvements in policy or implementation are in place to suggest that there would be such a jump in relative performance.

Figure 6 made use only of the empirical distribution of changes in homicides. That is, the analysis did not try to control for differences in the initial conditions of the different police districts. It may well be that those involved in the target setting process are only interested in comparing themselves against all changes that have taken place, irrespective of the conditions under which the changes took place. If that is the case, then the empirical distribution is all that is needed. However, it is quite likely that stakeholders involved in a target setting process may wish to define some specific characteristics to control for in order to generate a comparison that they consider more relevant than a comparison to all other districts. If there were many more observations than are normally available, one might form a comparator group by selecting some units to be included and dropping others. However, constructing a comparison group by dropping observations is not efficient and would greatly reduce the sample size.
Instead, it is possible to retain all observations and carry out some multivariate statistical analysis to take into account differences in characteristics. When working with the empirical distribution, the outcomes for a police district were judged to be good or poor simply by noting where the particular police district's observed change fell in the empirical distribution of all observed annual changes. However, in an approach exactly analogous to the creation of the CAP distribution for education, one can estimate 99 different quantile regressions, multiply the 99 sets of coefficients by the characteristics of each district and create a CAP distribution for each police district. One can then compare where the actual performance fell relative to the unconditional distribution and the CAP distribution. Thus, when controlling for characteristics of the police districts, the observed change can be compared with a counterfactual distribution of predicted percentiles that is specific to each police district. [12]

The final consideration is the choice of the characteristics to control for. The idea is not to explain the change in homicides, but to generate a standard for comparing performance that could be more relevant for the police district in question. Each government or interested party could pick the specific characteristics that it would like to control for in the comparison. The only requirement is that the characteristic be observable and that data exist for the units of analysis of interest.

[12] A Taylor series expansion is used to calculate the 95% confidence interval of the predicted values. This technique is sometimes called the delta method (for more details see Newman, Azevedo, et al. (2008)).

Figure 7 presents comparisons across different police districts relative to their own CAP distributions.
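The construction of a CAP distribution from already estimated quantile regressions can be sketched as follows. This is a minimal illustration and not the Stata program referred to in the paper: the coefficient values, the characteristic names (`baseline`, `north`) and the district are all hypothetical, and the quantile regressions themselves are assumed to have been estimated elsewhere.

```python
from bisect import bisect_right


def cap_distribution(coefs_by_quantile, characteristics):
    """Predicted outcome at each quantile q for one unit: const(q) + x'beta(q)."""
    cap = {}
    for q, beta in coefs_by_quantile.items():
        cap[q] = beta["const"] + sum(
            beta[name] * value for name, value in characteristics.items()
        )
    return cap


def cap_percentile(cap, observed):
    """Count how many of the predicted quantile values lie at or below `observed`."""
    # Sorting guards against estimated quantiles that cross each other.
    values = sorted(cap.values())
    return bisect_right(values, observed)


# HYPOTHETICAL coefficients for quantiles 1..99 (not estimates from the paper).
coefs = {
    q: {"const": -20.0 + 0.4 * q, "baseline": -0.05 - 0.001 * q, "north": -2.0}
    for q in range(1, 100)
}

# HYPOTHETICAL characteristics of one unit (for example, one police district).
district = {"baseline": 30.0, "north": 1.0}

cap = cap_distribution(coefs, district)
target_67 = cap[67]                      # value implied by the 67th percentile
rank = cap_percentile(cap, observed=0.0)  # position of an observed change of 0.0

print(f"Value at the 67th percentile: {target_67:.2f}")
print(f"An observed change of 0.0 lies above {rank} of the 99 predicted values")
```

Reading a target off the unit's own CAP distribution (here `cap[67]`) and locating an observed change within it are the two uses described in the text. The confidence intervals around the predicted values are not computed in this sketch.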
One can note by comparing Figures 6 and 7 that taking account of the characteristics does make a slight difference to the comparisons of the targets to recent tendencies. In general, the targets appear somewhat less demanding relative to past performance, once one takes into account the particular characteristics of the police districts today.

Figure 7. Conditional relative performance on homicide reduction from 2000-2008, by Police Regions
[Figure: radial chart on a percentile scale from 0 to 100 showing three series (Trend 2000/2008, Trend 2004/2008 and Target 2008/2009) for the police regions of Barbacena, Vespasiano, Belo Horizonte, Unaí, Contagem, Uberlândia, Curvelo, Uberaba, Divinópolis, Teófilo Otoni, Governador Valadares, Patos de Minas, Ipatinga, Montes Claros, Juiz de Fora and Lavras.]
Source: Author’s calculations using data from Peixoto et al (2010)

6. Final considerations – Did the introduction of targets improve performance and how could this information on growth percentiles and CAP distributions be used in an incentive program?

As stated in the introduction, this paper is focused on a narrow question, namely how to go about setting performance targets, and it has explored ways in which past information can be used to help set a target. Assessing whether the introduction of the State for Results management shock, with its emphasis on setting targets, led to better performance lies beyond the scope of this paper. [13] However, the information on the percentiles does suggest that performance may have improved. For example, the percent of schools meeting the target in 2008 based on the 67th percentile of the distribution of changes between 2007 and 2006 was 53.1 percent. If performance had been stagnant, one would have expected 33 percent of the schools to have a performance greater than the 67th percentile. Just as one can set a performance target at the 67th percentile, one can set a performance target at any percentile and determine what percentage of schools met or exceeded that target.
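The comparison between the observed share of schools meeting a percentile-based target and the share expected under stagnant performance can be sketched in a few lines. The function names are illustrative, and the observed shares are the figures reported in this section for 2008; the benchmark simply assumes that, with unchanged performance, 100 minus p percent of schools would exceed a target set at the p-th percentile of past performance.

```python
def share_meeting_target(outcomes, targets):
    """Percentage of units whose outcome is at or above their own target."""
    met = sum(1 for outcome, target in zip(outcomes, targets) if outcome >= target)
    return 100.0 * met / len(outcomes)


def expected_share_if_stagnant(percentile):
    """Share expected to exceed the p-th percentile if nothing had changed."""
    return 100.0 - percentile


# Observed shares (percent of schools) reported in the text for 2008,
# keyed by the percentile of past performance used to set the target.
observed_2008 = {25: 86.2, 67: 53.1, 90: 22.0}

excess = {
    p: share - expected_share_if_stagnant(p) for p, share in observed_2008.items()
}

for p in sorted(excess):
    print(
        f"{p}th percentile: observed {observed_2008[p]:.1f}%, "
        f"expected {expected_share_if_stagnant(p):.1f}%, "
        f"difference {excess[p]:+.1f} points"
    )
```

With school-level data, `share_meeting_target` would be applied to the observed outcomes and the school-specific targets; only the aggregate shares are available in the text, so the example works from those.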
At the low end of the distribution, 86.2 percent of the schools in 2008 had a level of performance better than that of the 25th percentile. If performance had been stagnant, one would have expected 75 percent of the schools to have performance above the level calculated on the basis of the 25th percentile of past performance. At the highest part of the distribution, 22 percent of the schools in 2008 had a level of performance higher than the 90th percentile (as opposed to 10 percent if performance had remained the same).

The discussion on target setting contained in this paper has been focused on setting an absolute level of a target, where the actors are informed of the value of the target prior to the period where they take decisions on effort. Recently, Barlevy and Neal (2011) have put forth an interesting proposal that attempts to generate incentives for higher performance not by setting an absolute target for agents to reach, but by setting up a competition across agents and offering a payment that consists of a common base and a bonus that is proportional to a percentile performance index that measures relative performance across agents facing comparable conditions. [14] Because it is structured as a competition among peers, there is no absolute target set ex ante. Their “Pay for Percentile” incentive is based on relative performance that is revealed ex post. The authors argue that by structuring the incentives in this fashion, one can implement effective incentive schemes for teachers without ever forming aggregate statistical measures of what individual teachers have contributed to learning of all students in their classrooms. This does not provide information on changes in achievement over time (the performance measure is purely ordinal), but the authors suggest that education authorities can benefit from treating the provision of incentives and the documenting of student progress as separate tasks.

[13] The distinctive pattern of the targets that were set by the Secretary of Education (as revealed in figure 1) might allow for the estimation of the impact of the introduction of the target setting program on performance through the use of a regression discontinuity approach. But that is the subject of a different paper.

[14] Barlevy and Neal (2011) describe the calculation of the percentile performance index as follows: “For each student in a school system, first form a comparison set of students against which the student will be compared (generally speaking, will be those who begin at the same baseline achievement)… At the end of the year, give a cumulative assessment to all students. Then, assign each student a percentile score based on his end of year rank among the students in his comparison set. For each teacher, sum these within-peer percentiles scores over all students she teaches and denote this sum as a percentile performance index”.

It is not immediately obvious that agents will respond in the same way to a relative target, where they know they must do better than a comparator group but do not have the anchor of a given target to shoot for. This might be too abstract a goal, and greater effort might be obtained by having a very clear goal. The approach described in this paper is also based on a relative notion, aiming for the 67th percentile of the quantile regression, but that relative performance is then converted to an absolute value that agents try to achieve. For that reason, the estimated target must be calculated from data that are available at the beginning of the period. The “Pay for Percentile” incentive would not have to use information at the beginning of the period. Since the incentive plan is a competition, only the rules must be fully specified at the beginning of the period.
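The calculation of the percentile performance index quoted from Barlevy and Neal can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' implementation: the comparison set is taken to be all other students in the same baseline band, a student's percentile score is the share of that comparison set with a strictly lower end-of-year score, and the student records are hypothetical.

```python
from collections import defaultdict


def percentile_performance_index(students):
    """Sum, for each teacher, the within-peer percentile scores of her students.

    Each student is a dict with keys "teacher", "baseline" and "final".
    Assumption: peers are the other students sharing the same baseline band.
    """
    peers_by_baseline = defaultdict(list)
    for student in students:
        peers_by_baseline[student["baseline"]].append(student)

    index = defaultdict(float)
    for peers in peers_by_baseline.values():
        for student in peers:
            others = [p for p in peers if p is not student]
            if not others:
                continue  # no comparison set, so no score is assigned
            below = sum(1 for p in others if p["final"] < student["final"])
            index[student["teacher"]] += 100.0 * below / len(others)
    return dict(index)


# HYPOTHETICAL end-of-year scores for six students taught by three teachers.
students = [
    {"teacher": "A", "baseline": "low", "final": 60},
    {"teacher": "A", "baseline": "high", "final": 80},
    {"teacher": "B", "baseline": "low", "final": 50},
    {"teacher": "B", "baseline": "high", "final": 90},
    {"teacher": "C", "baseline": "low", "final": 55},
    {"teacher": "C", "baseline": "high", "final": 70},
]

ppi = percentile_performance_index(students)
print(ppi)
```

Because every input to the index is an end-of-period rank, nothing in the calculation requires data from the beginning of the period other than the baseline band used to form the comparison sets.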
The percentile performance index itself can be calculated at the end of the period, once the data become available. While the Pay for Percentile proposal was made for education, it could be useful to explore parallels with incentive plans for decentralized decision making. The actors in education are school districts, schools and individual students. The actors in decentralized decision-making plans are national governments and state and district level governments, and there are obvious parallels with arrangements within the education sector.

7. Conclusions

This paper has considered the problem of selecting a reasonable performance standard and has illustrated how quantile regressions could be used to help generate a target that reflects the different characteristics of individual agents. This approach is a bit more flexible than the approach that was followed with the JTPA program in that it does not assume that the adjustment for the agents’ characteristics would necessarily be the same at different percentiles of the distribution of the performance standard. It is no harder to implement.

It is worth noting that the differences between the empirical and the CAP distributions were not all that great. It would, perhaps, be disturbing if they were. One might be inclined to conclude that this ability to take note of an agent’s initial conditions does not add much. However, being able to calculate the CAP distributions has both tactical and technical value. The ability to take note of initial conditions makes it harder for an agent (a school, district or state, depending on the unit of analysis) to dismiss information on the distribution of performance as being completely irrelevant for them because of some factor which was not taken into account. Whatever observable factor an agent thinks should be taken into account could be taken into account in coming up with the CAP distribution.
The evidence suggests that school performance in Portuguese proficiency improved in 2008 following the introduction of targets into public sector management. However, a thorough analysis of the response to the introduction of the new system lies beyond the scope of this paper, which is focused on the issue of how to make use of past information in setting a target. A more thorough analysis of the system would have to address whether the particular design of the incentive systems in Minas Gerais managed to generate positive improvements, while avoiding unintended outcomes and the possibility of welfare losses. It would also be important to analyze the dynamics of the system over time: how the agents reacted to the introduction of the targets and what subsequent modifications, if any, were made to the implementation of the public sector management approach. While a full assessment awaits this type of analysis, for the narrower task of selecting targets (which is an intrinsic element of performance management systems), we conclude that constructing CAP distributions and using targets based on the performance achieved across common percentiles for different agents could help ground the negotiations over performance standards.

References

Azevedo, Joao Pedro and Pizzolitto, Georgina V. (2009) Benchmarking: Análisis para la Republica Dominicana. Washington, D.C.: World Bank, LCSPP. http://www.camaradediputados.gov.do/masterlex/mlx/docs/2F/1B0/1B1/1C5/1C9.pdf

Azevedo, Joao Pedro (2011) "WBOPENDATA: Stata module to access World Bank databases," Statistical Software Components S457234, Boston College Department of Economics, revised 31 Jan 2013.

Barlevy, G. and D. Neal (2012) “Pay for Percentile”, American Economic Review, 102(5), August 2012, pages 1805-1831.

Barros, R., F. Ferreira, J. Molinas Vega and J. Saavedra (2009). Measuring Inequality of Opportunities in Latin America and the Caribbean.
The International Bank for Reconstruction and Development/The World Bank.

Betebenner, D., R.J. Wenning and D. Briggs, “Student Growth Percentiles and Shoe Leather”, http://www.ednewscolorado.org/2011/09/13/24400-student-growth-percentiles-and-shoe-leather.

Courty, P., C. Heinrich and G. Marschke (2011) “Setting the Standards: Performance Targets and Benchmarks”, in J.J. Heckman, C.J. Heinrich, P. Courty, G. Marschke and J. Smith (eds.), The Performance of Performance Standards, W.E. Upjohn Institute for Employment Research, Kalamazoo, Michigan, 2011.

Guimarães, Tadeu Barreto and Eder Campos (2010) “Monitoring and Evaluation System in the Minas Gerais State Government: Aspects of Management” in Lopez-Acevedo et al (ed.) Challenges in Monitoring and Evaluation: An Opportunity to Institutionalize M&E Systems. World Bank: Washington, DC.

Essama-Nssah, B., L. Bassole and S. Paul (2010) “Accounting for Heterogeneity in Growth Incidence in Cameroon”, World Bank Policy Research Working Paper No. 5464, November, 2010.

Heckman, J.J., C.J. Heinrich, P. Courty, G. Marschke and J. Smith (eds.), The Performance of Performance Standards, W.E. Upjohn Institute for Employment Research, Kalamazoo, Michigan, 2011.

Klasen, S., M. Grosse and K. Harttgen (2008), “Measuring Pro-Poor Growth using Non-Income Indicators”, World Development 36(6): pages 1021-1047.

Koenker, R. (2005), Quantile Regression, Cambridge Books, Cambridge University Press, October, 2005.

Koenker, R. and G. Bassett (1978), “Regression Quantiles”, Econometrica, Vol. 46, No. 1 (Jan., 1978), pages 33-50.

Koenker, R. and K. Hallock, “Quantile Regression”, Journal of Economic Perspectives, Vol. 15, No. 4, Fall 2001, pages 143-156.

Newman, J., J.P. Azevedo, J. Saavedra and E. Molina (2009) “The Real Bottom Line: Benchmarking Performance in Poverty Reduction in Latin America and the Caribbean”, The World Bank mimeo, June, 2009.

Peixoto, Betânia, Marcus Vinícius Gonçalves da Cruz and João Pedro Azevedo (2010).
"Gestão Por Resultados Em Minas Gerais: Uma Avaliação Das Metas De Redução Da Criminalidade," Proceedings of the 14th Seminar on the Economy of Minas Gerais. Cedeplar, Universidade Federal de Minas Gerais.

Wei, Y., & He, X. (2006). Conditional growth charts. The Annals of Statistics, 34(5), 2069–2097. (Available at http://projecteuclid.org/DPubS/Repository/1.0/Disseminate?view=body&id=pdfview_1&handle=euclid.aos/1169571786)

World Bank (2009), Guatemala Poverty Assessment: Good Performance at Low Levels, Report No. 43920-GT, March 18, 2009, Central America Department, Poverty Reduction and Economic Management Unit, Latin America and the Caribbean Region. (Available at http://www-wds.worldbank.org/external/default/WDSContentServer/WDSP/IB/2009/07/08/000333038_20090708235221/Rendered/PDF/439200ESW0GT0P1IC0Disclosed07171091.pdf)

Annex 1

This annex plots the values of the estimated coefficients for the right hand side variables included in the 99 quantile regressions on the change in Portuguese proficiency between 2007 and 2006 in 5th grade in schools in Minas Gerais, Brazil. These estimations were used to generate the CAP distributions discussed in this note. Ninety-five percent confidence intervals are provided in the blue shaded area. The plot on the left hand side provides the full range of estimated coefficients, while the plot on the right hand side is truncated to cover the range from the 6th through the 94th percentiles (dropping the bottom and top 5 percentiles). This was done because the standard errors increase significantly in the tails of the distribution and presenting the figures for the entire range makes it more difficult to note how the values change over the different quantile regressions.

Figure A1.1 Plot of Coefficients of Constant Term in Different Quantile Regressions
[Figure: two panels showing the estimated coefficients on the constant term across quantiles 1 to 99 and across quantiles 6 to 94.]

Figure A1.2 Plot of Coefficients of 2006 Portuguese Proficiency in Different Quantile Regressions
[Figure: two panels showing the estimated coefficients on 5th grade Portuguese proficiency in 2006 across quantiles 1 to 99 and across quantiles 6 to 94.]

Figure A1.3 Plot of Coefficients of Regional Dummy of Grande Norte in Different Quantile Regressions
[Figure: two panels showing the estimated coefficients of the regional dummy for Grande Norte across quantiles 1 to 99 and across quantiles 6 to 94.]

Figure A1.4 Plot of Coefficients of Total Number of Students in the School in 2006 in Different Quantile Regressions
[Figure: two panels showing the estimated coefficients on the number of students in the school in 2006 across quantiles 1 to 99 and across quantiles 6 to 94.]

Annex 2: Accessing World Bank Data on Indicators

wbopendata (Azevedo, 2011) allows Stata users to download over 8,000 series of indicators from the World Bank databases, including: Africa Development Indicators; Doing Business; Education Statistics; Enterprise Surveys; Global Development Finance; Gender Statistics; Health Nutrition and Population Statistics; International Development Association - Results Measurement System; Millennium Development Goals; World Development Indicators; Worldwide Governance Indicators. These indicators include information from over 256 countries and regions, since 1960.
wbopendata can be installed from the Statistical Software Components repository by typing, from within the Stata command line interface:

    ssc install wbopendata

After installation, wbopendata works just like any other Stata command. Users can type either

    help wbopendata

or

    db wbopendata

(the latter activates the visual interface).

Users can choose one of the three languages supported by the database (and Stata), namely English, Spanish, or French. Three download options are currently supported:

Country - all indicators for all years for a single country.
Topic - all indicators within a specific topic, for all years and all countries.
Indicator - all years for all countries for a single indicator.

Users can also choose to have the data displayed in either the wide or long format (wide is the default option). Note that the reshape is done locally, so it will require the appropriate amount of RAM to work properly.

wbopendata draws from the main World Bank collections of development indicators, compiled from officially-recognized international sources. It presents the most current and accurate global development data available, and includes national, regional and global estimates. The access to these databases is made possible by the World Bank's Open Data Initiative (http://data.worldbank.org), which provides open full access to World Bank databases.

Annex 3: Integrated Minas Gerais Development Plan (PMDI) and Targeting system [15]

In September 2007, the Minas Gerais State Government launched its integrated Minas Gerais Development Plan (PMDI), sanctioned by State Law 17001/2007 for the period from 2007 to 2023. This plan is the principal document guiding the sustainable long-term development of the state of Minas Gerais, describing the various challenges faced by the state and the major choices the state needs to make for it to achieve full development status.
For more information on the plan and its monitoring and evaluation systems please see Guimarães and Campos (2010). The plan includes a diagnosis of the present situation of the state, as well as an assessment of the future prospects for the socio-economic outlook for Brazil as a whole. Based on this diagnosis, targets were established for a number of different development-oriented areas. These targets involve actions covering a variety of concerns, from public safety to problems associated with the environment. The plan has five basic strategic action lines aimed at ensuring the following:

1. Educated, qualified and healthy people;
2. Encouragement for young leaders with the aim of expanding job opportunities, entrepreneurship and the social inclusion of this segment of society;
3. Dynamic and innovative companies with programs to boost economic development, infrastructure, science and technology and the introduction of a state ‘pact’ or agreement targeted at boosting the rate of investment and competitiveness of the Minas Gerais economy;
4. Safe and orderly cities, with programs to improve the environment, public safety, housing, sanitation and overall quality of life for citizens; and
5. Equality among different areas and people, through programs targeted at the places with the lowest HDI which contain the most vulnerable sectors of the population, through efforts to reduce poverty, generate employment and income and to provide sustainable food and nutrition security.

The State Secretaries were responsible for meeting the targets for the so-called "results areas" of their respective Secretariats. Monitoring of the initiatives in these areas will be undertaken by the State Government.
The "results areas" and their main objectives defined by the PMDI were:

I. Good quality education: to boost the quality of elementary and secondary education and to reduce regional disparities in this sector.
II. Youth leaders: to increase the number of young people finishing secondary education and to provide more opportunities for them to become productive citizens.
III. Investment and added value of the manufacturing sector: to increase the annual volume of private, public and partnership-related investment in the manufacturing sector and to undertake training programs to ensure a flow of qualified workers, in partnership with the private sector.
IV. Innovation, technology and quality: to establish an "innovation agenda" with a view to improving what the state already has and to develop new areas of endeavor to be defined jointly with relevant stakeholders, including those in the manufacturing sector, universities and research institutions.
V. Development of Norte de Minas, Jequitinhonha, Mucuri and Rio Doce: to boost the volume of private investment in these areas by attracting productive capital and by improving infrastructure, education, jobs training and health and public sanitation conditions.
VI. Integration and development of the transport system: to expand, repair and ensure ongoing maintenance of the state highway network, ensuring good value for money for the work involved, to conclude the ProAcesso and to seek, together with the Federal Government and other states, solutions for the Federal Highway system as a whole.
VII. Poverty reduction and productive inclusion: to reduce chronic poverty through education and to promote productive activities among the adult population. To reduce illiteracy, to boost sustainable food and nutritional security, to strengthen family agriculture and eradicate child labor.
VIII. City and services network: to increase the number of municipalities with an adequate Social Responsibility Index (Índice Mineiro de Responsabilidade Social – IMRS) by aiming for an interconnected and hierarchized network of public and private good quality services covering the different areas (see Diagram 7 of the Annex).
IX. Healthy living (Vida Saudável): to ensure primary healthcare for the entire population, to reduce mother and child mortality, to increase longevity and improve health services for all adults in the population suffering from cardiovascular diseases and diabetes and to substantially enhance access to basic sanitation.
X. Law and order: to reduce crime in the state, including by integrating the police forces and by focusing on intelligence-gathering, preventive anti-crime measures and modernization of the prison system.

[15] Integrated Development Plan for Minas Gerais (Plano de Desenvolvimento Mineiro Integrado (PMDI)) available at: http://www.planejamento.mg.gov.br/governo/publicacoes/arquivos/Plano_Mineiro_Desenvolvimento_Integrado_Final.pdf

Targets have been established in each of the above key "results areas". As for specific targets directed towards improving the quality of education in the state of Minas Gerais, the three main strategic objectives over the next 23 years are: (i) to increase the average educational level of the state's population; (ii) to reduce regional educational disparities throughout the state; and (iii) to bring the quality of education in Minas Gerais up to international standards. In addition to the long-term target date of 2023, an intermediate target has also been established to be met by 2011. Table 7 below highlights the various targets to be achieved in the area of elementary education in Minas Gerais.

The PMDI outlined the long-term strategy for the state of Minas Gerais, and the GERAES was basically the mechanism to be used by the state to achieve its targets.
In this context, the GERAES served as a managing agency for executing the State's portfolio of priority projects in an effort to achieve its goals. This initiative has five specific goals: (i) to transform the state's vision of the future into concrete results and to bring about the desired changes; (ii) to produce a multiplying effect capable of generating further private or public initiatives; (iii) to mobilize and assemble public, private and public/private resources; (iv) to ensure that the population in general understands that the future of the state depends on its undertaking concrete actions; and (v) to organize a well-focused project with measurable objectives, actions, targets, deadlines, costs, expected results, etc., under strict management control. Each "priority project" will be associated with at least one of the 10 core objectives of the State government. The four priority projects in the education sector, for the period of this analysis, were the following:
1. Full-time schooling: to increase the length of the school day for students, particularly in schools located in the most vulnerable areas;
2. Teacher qualifications and performance: to determine the qualifications required of future teachers in the public education network and to upgrade current teaching skills generally; to establish a system of professional certification; to provide incentives for teachers to finish university degrees and to undertake continuing education; and to ensure paid leave to enable teachers to take postgraduate courses;
3. New standards of management in the elementary education sector: to improve the performance of schools by defining and implementing basic standards for school management and school premises, and by using teaching methods geared to heightened efficiency and improved student performance; and
4. Systems for evaluating schools and education quality: systematic evaluation and monitoring of schools and students, including a transparent process for evaluating and disseminating results.
In order to implement this strategy, the State of Minas Gerais adopted a few principles and accompanying instruments to improve public sector management in the state. These included: (1) institutional alignment instruments (such as results agreements, strategic management, and results committees); (2) benchmarks and strategic project goals and their final results (output and outcome indicators); and (3) the strengthening of feedback loops (to allow for continuous learning and fine-tuning of public policy). The two primary instruments adopted to generate an adequate set of incentives to implement the strategy were results agreements and results committees. Results agreements are management contracts signed between the governor and the state secretariats (first-tier agreements), and between state secretariats and their units of delivery (second-tier agreements). These agreements are periodically revisited by the results committees, which meet at least quarterly, and once a year they are fully revised. The outcome of this annual revision (first and second tier) determines who receives the "results award" (up to one month of additional wage). The prize is linked to the proper implementation of works and the attainment of results by the state secretariats and their units of delivery. In January 2011, the Government of Minas Gerais launched a revised version of its Integrated Minas Gerais Development Plan 16; however, the overall approach of relying on indicators and targets remained.

16 PLANO MINEIRO DE DESENVOLVIMENTO INTEGRADO. PMDI 2011 – 2030 - GESTÃO PARA A CIDADANIA.