33364
World Bank Social Safety Nets Primer Notes, 2003, No. 14

Impact Evaluation of Social Programs: A Policy Perspective

Governments and donor organizations increasingly recognize that rigorous evaluations of public interventions should be part of the social policy decision-making process. Yet there is frequently a gap between the desire for information on the effectiveness of programs and an understanding of the potential and the limitations of evaluation tools. This note reviews the basic elements of good impact evaluations, identifies some of the political economy aspects that influence whether they are conducted, and explores ways to encourage use of evaluation.

Elements of an Impact Evaluation

Program-impact evaluations assess the impacts on participants that can be attributed to direct participation in a program or intervention. They should not be confused with other types of policy assessments or with program monitoring. Good evaluations do not just provide quantitative estimates of program impacts; they also seek to explain why the impacts occurred (or did not) and what the resulting policy implications may be. Full program evaluations have evolved to include several typical components: a) a process study, which examines the particular program operations and processes; b) an impact study, the technical heart of the evaluation; and c) a cost-benefit assessment that compares the costs of offering the program with the benefits to participants.

Conducting a successful impact evaluation requires advance planning. There are five major considerations.

Selecting a specialized evaluator, external to the government or implementing agency, is generally preferred to ensure objectivity and independence. In addition, quantitative impact evaluation requires specialized skills and expertise.

Quantitative methods for estimating impacts fall into two strategies: experimental and non-experimental. Experimental estimates compare the outcomes of the participants with those of a randomly assigned control group that is otherwise eligible for the program and similar to the participants, but did not receive program benefits. Experimental design is preferred on methodological grounds because it minimizes the effects of pre-existing differences between the participants and the comparison group that can be confounded with the effects of program participation (selection bias).

If random assignment is ruled out, however, it may still be possible to estimate impacts reliably using non-experimental methods. Multivariate-regression models, matched-comparison methods, double-difference methods, and instrumental-variables methods can attempt to control statistically for sources of selection bias. Most recent impact evaluations in developing countries have relied upon non-experimental methods due to cost and data-availability considerations.

Integrating qualitative methods can add rich detail and permit a grounded analysis of the underlying causes of outcomes. Qualitative methods permit a deeper understanding of program processes, external conditions, and individual behaviors. The methods are open-ended, relying on semi-structured interviews in an individual or group setting and on interviewer observations.

Data availability and quality are the most important factors affecting the quality of analysis. Often, new surveys are required to get sufficient information on program participants, including baseline and follow-up surveys. However, existing household data can sometimes be used for non-experimental methods. For example, more than 30 countries have a version of the Living Standards Measurement Survey (LSMS), and many have surveys for multiple years.

There is considerable variation in the cost of impact evaluations of social safety net and other social programs, ranging anywhere from US$ 200,000 to over US$ 1,000,000, with an average for a rigorous evaluation probably between US$ 300,000 and US$ 400,000. Features affecting total costs include the number and type of policy questions to be addressed, the methodology, the extent of new data collection, the size and scope of the program being evaluated, and local capacity.

Key Design Features of a Good Impact Evaluation

To provide the highest value, an impact evaluation should include:

Clear objectives. Evaluation questions should be determined early and should be simple and measurable.

Credible evaluator. The evaluator should be independent of the agency or institution whose program is being evaluated.

Rigorous methodology. Experimental estimates are the ideal, but a well-chosen matched comparison group may suffice.

Adequate sample size. The sample should be large enough to detect program effects of plausible size. In addition, the size should permit assessment of program impacts on key subgroups of the target population, as appropriate to the program. Minimum detectable effects should be determined prior to the implementation of the evaluation.

Baseline data. Baseline data are needed to establish the appropriate comparison group and to control for observable program selection criteria.

Sufficient follow-up. Follow-up data should be collected after enough time has passed to plausibly detect an impact, and should measure the relevant outcome variables.

Multiple evaluation components. The impact evaluation should do more than detect program effects; it should also examine program process, reasons for observed outcomes, and cost effectiveness.

Source: Blomquist (2003), Ezemenari et al. (1999).

Political Economy of Evaluation

Politics and political economy play an important role in the decision to conduct a program evaluation. The issues stem from principal-agent problems, where stakeholders (variously including the government or funding agency, the implementing agency, and beneficiaries) do not have consistent incentives to support an evaluation.

One reason for reluctance comes from a misunderstanding of what an evaluation can deliver. Specifically, results are not always available in time for short-term decisions, and they can appear ambiguous and difficult to translate into policy actions. The fact that different evaluations of the same program can produce conflicting results depending on data sources and methodology seems at first glance to limit the policy value of the exercise. For example, two separate evaluations of Peru's Social Fund, completed a year apart and using different methodologies and data, arrived at opposite conclusions on key impacts. However, this and other possible examples only reinforce the point that evaluation design is critical. The best evaluations use an experimental design and/or a variety of methods to estimate impacts, thereby providing an indication of the robustness of the findings.

A second factor hindering the use of formal evaluations is political concern over their conduct and the possible repercussions from the results. Evaluation is assumed to be very costly, particularly in relation to the scarce resources available for social programs. Negative findings have the potential to hinder social agendas and damage political careers. It may seem easier and safer for policymakers not to present their detractors with a club with which to beat them.

However, political need can also work in favor of evaluation. One of the main reasons formal impact evaluations are undertaken is to gain political support for a program. This is particularly true for programs that are seen as strategically important for national policy, or for programs that are introducing innovative approaches. The Mexican government paid for the evaluation of Progresa in part because the conditional cash-transfer model was relatively new and viewed as a potential replacement for certain subsidy programs.

Evaluation can provide improved general knowledge about a type of program, yet, as with public goods, that often is not an adequate incentive for the individual policymaker or government to undertake a specific evaluation. Outside donors and agencies might consider providing more resources for country-program evaluations or creating incentives for program evaluation and assessment.

Developing an Evaluation Culture

International experience suggests that more could be done to foster an evaluation culture. There are several possibilities.

"Policy entrepreneurs" might assist in education and outreach, convincing stakeholders by the demonstration effect of actually doing an evaluation. Funding agencies can make evaluation mandatory. Some governments have followed this strategy, imposing the evaluation requirement from a level of authority that cannot be challenged by the program proponents, as in Chile and the U.S.

Including stakeholders in the evaluation process and disseminating findings are very important to ensuring transparency and communication. This will help gain support and ensure results are actually used for policy.

Many of the countries where evaluation has taken hold are those that have put a priority on developing institutional capacity, as in Chile during the early 1990s. Donors can make support for evaluations part of the underlying project to help build local capability.

John Blomquist prepared this note based on his forthcoming Social Safety Nets Primer Paper, "Impact Evaluation of Social Programs: A Policy Perspective".

The World Bank Social Safety Nets Primer series is intended to provide a practical resource for those engaged in the design and implementation of safety net programs around the world. Readers will find information on good practices for a variety of types of interventions, country contexts, themes and target groups, as well as current thinking on the role of social safety nets in the broader development agenda.

World Bank, Human Development Network
Social Protection, Social Safety Nets
http://www.worldbank.org/safetynets
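
As an illustration of the double-difference method named among the non-experimental approaches in this note, the sketch below compares the change in an outcome for participants between baseline and follow-up with the change for a comparison group over the same period. It is a minimal sketch and not the procedure of any particular evaluation: the function name, the variable names, and all outcome values are hypothetical, and a real evaluation would also need standard errors and controls for observable selection criteria.

```python
from statistics import mean


def double_difference(part_before, part_after, comp_before, comp_after):
    """Return a simple double-difference impact estimate.

    The estimate is the change in the mean outcome for participants
    minus the change in the mean outcome for the comparison group.
    """
    change_participants = mean(part_after) - mean(part_before)
    change_comparison = mean(comp_after) - mean(comp_before)
    return change_participants - change_comparison


# Hypothetical outcome data (for example, monthly household consumption).
participants_baseline = [100.0, 110.0, 90.0, 100.0]
participants_followup = [130.0, 140.0, 120.0, 130.0]
comparison_baseline = [120.0, 130.0, 110.0, 120.0]
comparison_followup = [130.0, 140.0, 120.0, 130.0]

impact = double_difference(
    participants_baseline,
    participants_followup,
    comparison_baseline,
    comparison_followup,
)
print(impact)
```

In this made-up example the participants start from a lower baseline than the comparison group, which is the kind of pre-existing difference that a simple follow-up comparison would confuse with program impact. Differencing each group against its own baseline removes that fixed gap, provided the two groups would otherwise have followed similar trends.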