Impact Assessment Framework: SME Finance
October 2012

Prepared by the World Bank on behalf of the G20 Global Partnership for Financial Inclusion (GPFI) SME Finance Sub-Group. Available online at: www.gpfi.org, www.worldbank.org/financialinclusion

Acknowledgments

This report was commissioned by the GPFI SME Finance Sub-Group, and prepared by Claudia Ruiz (lead author) and Inessa Love, under the guidance of Douglas Pearce. The report benefited from input and review by: Giorgio Albareto (Banca d’Italia), Aidan Coville (World Bank), Elizabeth Davidson (World Bank), Susanne Dorasil (BMZ), Felipe Alexander Dunsch (World Bank), Aurora Ferrari (World Bank), Matthew Gamser (IFC), Randall Kempner (Aspen Network of Development Entrepreneurs), Leora Klapper (World Bank), Miriam Koreen (OECD), David McKenzie (World Bank), and Riccardo Settimo (Banca d’Italia). The report was completed under the leadership of the co-chairs of the G20 SME Finance Sub-Group: Susanne Dorasil (Federal Ministry for Economic Cooperation and Development, Germany), Aysen Kulakoglu (Treasury of Turkey), and Jonathan Rose (U.S. Treasury).

© 2012 International Bank for Reconstruction and Development / The World Bank
1818 H Street NW, Washington DC 20433
Telephone: 202-473-1000; Internet: www.worldbank.org

This work is a product of the staff of The World Bank with external contributions. The findings, interpretations, and conclusions expressed in this work do not necessarily reflect the views of The World Bank, its Board of Executive Directors, or the governments they represent. The World Bank does not guarantee the accuracy of the data included in this work. The boundaries, colors, denominations, and other information shown on any map in this work do not imply any judgment on the part of The World Bank concerning the legal status of any territory or the endorsement or acceptance of such boundaries.
Rights and Permissions

The material in this work is subject to copyright. Because The World Bank encourages dissemination of its knowledge, this work may be reproduced, in whole or in part, for noncommercial purposes as long as full attribution to this work is given. Any queries on rights and licenses, including subsidiary rights, should be addressed to the Office of the Publisher, The World Bank, 1818 H Street NW, Washington, DC 20433, USA; fax: 202-522-2422; e-mail: pubrights@worldbank.org.

Table of Contents

Executive Summary .... 4
I. Introduction and Overview .... 7
II. Why Are Impact Evaluations Relevant for SME Policies? .... 9
III. Menu of SME Finance Policies .... 10
IV. Implementing an Impact Evaluation .... 11
  Operational Aspects of an Impact Evaluation .... 13
  Budget Considerations .... 13
  Time Considerations .... 15
  Selecting an Impact Evaluation Method for an SME Finance Policy .... 15
  Steps in the Impact Evaluation Process .... 18
V. Impact Evaluation Methods—The Experimental Approach .... 20
VI. Impact Evaluation Methods—Non-experimental Approaches .... 23
  Difference-in-Difference .... 23
  Instrumental Variables .... 25
  Regression Discontinuity .... 27
  Propensity Score Matching .... 29
VII. Minimal Standard Monitoring .... 32
VIII. Conclusions .... 33
References .... 34
Appendix 1. General Concerns .... 36
Appendix 2. Size and Power of RCT .... 39
Appendix 3. Examples of Impact Evaluations .... 41
Appendix 4. Assumptions, Strengths, and Limitations of Different Approaches .... 44

Figure 1. Evaluation Approaches .... 12
Figure 2. Suggested Designs for Evaluations Planned Ahead .... 16
Figure 3. Suggested Designs for Evaluations Not Planned Ahead .... 17
Figure 4. The DD Effect .... 23
Figure 5. Regression Discontinuity .... 27

Table 1. Examples of SME Finance Policies .... 10
Table 2. Steps in the Impact Evaluation Process .... 18

Box 1. Types of Random Assignment .... 12
Box 2. Public Sector Intervention Evaluation: Business Training in Bosnia and Herzegovina .... 20
Box 3. Encouragement Design .... 21
Box 4. Regulatory Reform Evaluation: Business Registration in Mexico .... 24
Box 5. Public Intervention Evaluation: Thailand Microfinance Fund .... 26
Box 6. Financial Infrastructure Evaluation—Role of Angel Funds in U.S. Start-up Firms .... 28
Box 7. Public Intervention Evaluation: Chile’s Supplier Development Program .... 31
Box 8. Changes in Behavior in Response to Program Assignment Experiment .... 36
Box 9. Scaling Up Small Interventions .... 38

Executive Summary

Small and medium enterprises (SMEs) are a policy priority for many countries, given their significance in terms of employment and economic activity. Many new policies, legal reforms, programs, and funds from both the public sector and donors focus on access to financial services and investment for SMEs. It is therefore important to assess and understand the impacts of these interventions to support SME finance so that they can be designed and implemented to most effectively meet their goals in a particular market or country.

Impact evaluation is an empirical assessment of whether a program or policy has achieved its desired objectives. Impact evaluations help policy makers to quantify the effects of different policies, design the most effective interventions (that is, programs, policies, and regulations), improve targeting, refine policies to better fit objectives, optimize the use of scarce resources, and understand the underlying mechanisms. Tracking the impact of a policy, regulation, or program during its implementation (real-time impact evaluation) allows modifications to be made that can ensure the intended results are achieved.

Surprisingly, the cost of more rigorous impact evaluations is not much higher than the cost of minimal-standard monitoring. The most expensive part of both monitoring and assessing impact is collecting new data. If data are already available, the difference in cost between the two methods is not substantial. For instance, in cases where administrative data can be used, the budget to design and implement an impact evaluation is significantly reduced.

This Impact Assessment Framework discusses the importance of rigorous impact evaluation and provides an overview of the relevance, application, strengths, and limitations of impact evaluation techniques. Relevant operational information regarding budget and timing issues is also presented in the Framework. The Framework covers experimental and non-experimental approaches that can be used to evaluate a broad set of SME policies and programs and provides examples of actual impact evaluations for each of the components of the SME Finance Policy Guide (GPFI 2011).

The experimental approaches discussed in this Framework include basic randomized control trials (RCTs), oversubscription, randomized phase-in, and encouragement design. All of these approaches rely on a randomization device that allows the evaluator to isolate the impact of a policy:

• Basic RCTs refer to classic random assignments that take a baseline survey and randomly select some SMEs to receive the intervention. This approach can prove useful for interventions that are not implemented at the national level, such as local/regional interventions.

• In the oversubscription design, a subset of firms is randomly assigned to receive a program from the set of eligible firms that apply to it. This approach is useful for evaluating interventions in which a lack of funds necessitates limiting the number of firms that can participate.

• The randomized phase-in approach randomizes the timing or sequence in which a project is rolled out. As its name suggests, this methodology is well suited to evaluating policies that are implemented in stages.

• In the encouragement design, certain firms are randomly promoted (for instance, through financial incentives or marketing campaigns) to participate in the program, although the program is available to the rest of the population. This approach can be used to evaluate policies that are implemented at the national level and were not rolled out differentially.

The experimental approach allows for credible identification of the intervention’s impact and can be used to plan impact evaluations of different types in advance. However, experimental evaluations need to be set up before the policy or program is put in place, and their findings might not hold in different contexts (an issue commonly referred to as external validity).

The non-experimental methodologies covered in this Framework include difference-in-differences, instrumental variables, regression discontinuity, and propensity score matching. Unlike experiments, non-experimental evaluations do not include an exogenous device planned in advance to isolate the impact of a policy. Thus, these methods rely on identifying a control group and then using statistical techniques to ensure the impact estimate is properly measured. These approaches are commonly used to evaluate policies when an evaluation was not planned in advance.

• The difference-in-difference approach uses a comparable group of firms that was not exposed to the policy of interest as its control group. The approach then compares the outcomes over time of SMEs exposed to the policy relative to the firms in the control group. As long as a control group can be identified, this approach can be used to evaluate a variety of policies, including national-level interventions targeting SMEs and interventions at the regional level, among others.

• The instrumental variables approach relies on instruments to isolate the impact of a policy. Instruments are strong predictors of participation in the intervention but should not be associated with the outcome variable for reasons other than participation in the intervention. For instance, if a lending project took place in municipalities governed by a particular political party, the presence of this political party would strongly predict SME exposure to the lending project, but any change in SME outcomes should be due to the project itself and not to other channels associated with the party in charge.

• Regression discontinuity is used to evaluate interventions in which a defined cutoff determines eligibility (such as policies available only to SMEs with fewer than a specific number of workers in the year before the intervention). By comparing the outcomes of firms that just passed the cutoff with firms that just missed it, evaluators can measure the intervention’s effect.

• The propensity score matching (PSM) methodology can be used to evaluate an SME intervention in which the institutional arrangements that defined selection into the project are observed and known, but a control group was not maintained. A control group can be made up of firms not participating in the program, and the impact of the intervention determined by comparing the evolution of outcomes over time between the two sets of firms.

While the lack of a randomization device makes it more challenging for non-experimental methodologies to isolate the impact of an intervention, when done properly these approaches provide robust estimates of the effects of interventions.

The Framework also discusses minimal standard monitoring, which consists of monitoring outcomes over time for the subjects receiving the intervention. The main difference from other impact evaluation methods is that minimal standard monitoring does not follow a control group to identify the effect of a policy, which makes the results less rigorous and credible.

This Framework provides insights and criteria on the basis of which a suitable approach can be selected to evaluate an SME finance policy, regulation, or program, including:

• Basic RCTs are well suited to evaluating SME interventions that have a clear distinction between those who participate in the program and those who do not (for example, public programs providing financial training to SMEs).

• Approaches that randomize the rollout of the implementation through randomized phase-in or encouragement design can be more suitable for evaluating interventions where the distinction of who participates is not clear, such as broad SME finance policies or regulatory reforms.

• To evaluate policies such as bank lending to SMEs, where institutions follow certain criteria to select eligible firms, both oversubscription and regression discontinuity might be suitable approaches. Oversubscription is particularly relevant when resources or implementation capacity are limited and demand for a program or service exceeds supply.

• Where the evaluation takes place after the policy has already been implemented, the evaluation approach is mainly determined by the characteristics of the intervention and how it was implemented. For instance, the difference-in-difference approach might be well suited to evaluating policies aimed at improving opportunities for female-led SMEs (since the evaluator can compare the evolution of female-led relative to male-led SMEs) or SME interventions that were rolled out sequentially across regions for political or logistical reasons, such as financial infrastructure projects.

• Alternatively, policies with a cutoff that determined who was eligible for the intervention are well suited to the regression discontinuity approach, such as a factoring project for SMEs employing fewer than 50 workers at the time of registration.

The Framework offers the following overall guidance:

• To isolate a policy’s effect, it is important to conduct a rigorous impact evaluation instead of relying on before–after comparisons, which tend to generate flawed results.

• Impact evaluations planned ahead of the intervention offer more evaluation method options than evaluations conducted after the program or policy has been rolled out. Thus, it pays to plan evaluations before the intervention has started.

• There is no “one size fits all” approach to impact evaluation, and the most appropriate approach to evaluate an intervention will depend on the operational characteristics of the policy being evaluated.

• Rigorous impact evaluations can be complemented by qualitative assessments to provide a better understanding of the functioning, limitations, and strengths of the evaluated policy.

• Data collection is typically the most costly component of an evaluation. Evaluations that rely on existing data and ongoing or already-planned surveys can save on this cost component.

• Real-time impact evaluation during implementation allows modifications to be made to help ensure that the intended impacts are achieved. Rigorous impact evaluation can improve the design, implementation, and impact of policies, regulations, and programs to support SME finance.

I. Introduction and Overview

SMEs play a key role in economic development and make an important contribution to employment. Financial access is critical for SME growth and development, and the availability of external finance is positively associated with productivity and growth. However, access to financial services remains a key constraint to SME growth and development, especially in emerging economies (GPFI 2011).

This Framework was prepared as a resource for regulators and policy makers to provide an overview of methodologies used to evaluate the impact of various SME finance policies, interventions, and regulations. The Framework provides a comprehensive set of impact evaluation techniques; their key assumptions, strengths, and limitations; and
examples of their implementation in SME finance policy contexts.1 The techniques described in this Framework can be applied to real-time impact assessment that feeds back into policy implementation. Operational aspects of impact evaluation, such as budget and timing issues, are also discussed in the Framework. As detailed in the Framework, impact evaluation approaches can then be selected to suit different policy contexts and priorities.

Policy makers and regulators have a wide menu of tools at their disposal to support increased access to financial services, as demonstrated in the comprehensive GPFI SME Finance Policy Guide (2011). Financial access for SMEs can be expanded by promoting a favorable legal and regulatory environment, complemented by a sound financial infrastructure and targeted public interventions. It is important to assess the impacts of various policies in order to prioritize, tailor, and sequence reforms to be most effective in addressing constraints to financial access in a particular market or country.

Impact evaluations assess whether a program or policy has achieved the desired objectives. These evaluations are usually systematic empirical studies, most often using actual data and statistical methods to measure outcomes and quantify the impact of the program or policy. Impact evaluations are a key ingredient for policy analysis and for understanding what works—that is, what are the most effective policies to achieve desired objectives, such as alleviating poverty, increasing access to finance, or enhancing growth and development in certain contexts. Thus, it is important to include impact evaluation in the design of policy and legal reforms and interventions.

The first part of the Framework introduces the various impact evaluation approaches, discusses budget and time considerations for planning an evaluation, presents an outline of all necessary steps in the impact evaluation process, and maps evaluation approaches to different types of SME finance policies. The role of qualitative assessments as a complement to impact evaluation is also discussed. The second part of the Framework addresses the different methods in more detail. Section V describes the experimental approach. Section VI covers non-experimental methodologies, which range from difference-in-difference and instrumental variables to regression discontinuity and propensity score matching. Section VII describes minimal standard monitoring, discusses its advantages and disadvantages, and contrasts this method with more rigorous impact evaluation techniques. Appendices 1 and 2 present technical considerations regarding estimation approaches. Appendix 3 compiles some examples of impact evaluations of SME finance interventions. Finally, Appendix 4 summarizes the key assumptions, strengths, and limitations of each evaluation approach examined in the Framework.

1 The intention of the Framework is to provide an overview of impact evaluation methods and how they can be applied, rather than to present an exhaustive survey of all existing or ongoing evaluations.

Several recent surveys on the topic of impact evaluation are relevant for this paper. McKenzie (2010) offers a survey of impact evaluations in the broader area of finance and private sector development, and makes a strong case for impact evaluations in this area. This paper complements his work, as it offers a systematic review of various evaluation methods relevant to SME finance, with the pros and cons of each method and examples of their application to SME finance policies. Gertler et al. (2011) offer a comprehensive impact evaluation guideline with detailed information on operational and technical issues. Bauchet et al. (2011) provide an excellent survey of randomized evaluations of microfinance. Winters, Salazar, and Maffioli (2010) provide a thorough survey of impact evaluations of agricultural projects. While the objective of this Framework is also to review different impact evaluation approaches, our focus is on SME finance policies.

II. Why Are Impact Evaluations Relevant for SME Policies?

SME interventions can benefit from impact evaluations in various ways. These evaluations can:

• Clarify the effect that interventions have on firms’ outcomes and whether that impact achieved the expected objectives;

• Help to improve existing programs by comparing alternative design choices (for instance, comparing the performance of loan contracts with weekly versus monthly payments);

• Improve program targeting by identifying which firms benefit the most, or what barriers prevent others from gaining from interventions;

• Help prioritize resources by identifying the most cost-effective policies; and

• Make it possible to trace the different stages of an intervention so that evaluators are able to distinguish which key step in the program is not working as expected.

Unlike minimal-standard monitoring or simple before-and-after comparisons, impact evaluations isolate the effect of an intervention from all other factors that might alter the outcome of interest.

III.
Menu of SME Finance Policies

While the impact evaluation methods presented in this Framework can serve to evaluate a broad set of SME and financial inclusion interventions, the Framework’s main focus is SME finance policies. More concretely, the SME finance reforms and interventions that the Framework covers are those examined in the GPFI SME Finance Policy Guide, which are classified in three groups: (1) regulatory and supervisory frameworks; (2) financial infrastructure; and (3) public interventions. Table 1 provides examples of these policies by type of intervention.

Table 1. Examples of SME Finance Policies

1. Regulatory and supervisory frameworks
   • Frameworks to promote competition: regulations enabling entry of new banks; regulatory framework for licensing requirements

2. Financial infrastructure
   • Insolvency regime: bankruptcy reforms
   • Credit information systems: introduction of credit bureaus, credit registries
   • Equity investment: reforms encouraging venture capital, angel funds
   • Accounting and auditing standards for SMEs: reforms facilitating business registration procedures

3. Public interventions
   • Public credit guarantee (PCG) schemes: funds for guarantees to SMEs
   • Lending by state-owned financial institutions: micro and SME finance programs
   • Apexes and other wholesale funding facilities: direct lending in the form of grants
   • SME capacity, creditworthiness: business/financial literacy training for entrepreneurs; value-chain organization projects; subsidies to promote technology transfer to SMEs

IV. Implementing an Impact Evaluation

The key challenge in evaluating the impact of any program is to ensure that observed outcomes are a direct result of the program itself and would not have occurred without it. Without credibly addressing this, the impact evaluation may attribute the outcome to the program when in reality it could have occurred without it.

To see the issue at stake clearly, suppose a program affects some SMEs but not others. In essence, two questions must be addressed: (1) How would the SMEs that participated in the program have done without it? and (2) How would those that did not participate have fared if they had? These questions are referred to as counterfactual because neither scenario occurred in reality, and thus both are unobservable.

Observing the same SME over time will not, in most cases, give a reliable estimate of the impact the program had on it, because many other things may have changed at the same time the program was introduced. The solution to this problem is to estimate the average impact of the program rather than the impact on each firm. One way to do that is to compare the average outcome of the group that participated in the program (also known as the “treatment” group) with that of a similar group that did not (the “comparison” or “control” group).

The challenge is to ensure that this comparison group is identical to the treatment group in all ways except participation in the program. For example, to evaluate the impact of access to finance on SME productivity, it is not sufficient to compare firms with a loan to firms without one, because SMEs that obtain a loan may be fundamentally different from those that do not. While controlling for observables (such as size, age, and industry) may reduce these differences, some of the important differences are more difficult to observe—such as the entrepreneurial talent of the owners, their risk preferences, or their social support networks. Observed differences in performance between the two groups may be attributable to these latent (that is, unobservable) differences in characteristics rather than to access to finance.

Impact evaluation techniques deal with these issues by identifying a proper counterfactual group to compare with the group of SMEs affected by the policy and, in this way, estimate as cleanly as possible the effect of the policy of interest. In general, impact evaluation approaches can be classified into two broad groups: experimental and non-experimental (see Figure 1).2 Experimental methodologies randomly assign the intervention between the group that participates in the program or policy (the treatment group) and the group that does not (the control group) to ensure that any difference between these groups can be attributed to the intervention. There are different ways to conduct randomized assignment; the most common randomized approaches discussed in the Framework are described in Box 1. Non-experimental approaches—such as difference-in-difference, instrumental variables, regression discontinuity, and propensity score matching—identify a control group and then use statistical techniques to ensure that the impact estimate is properly measured. Sections V and VI of the Framework describe each of these methods in detail, along with their assumptions, advantages, and disadvantages, providing examples of specific SME finance policies that were evaluated with them. Appendix 4 summarizes the main assumptions and characteristics of the evaluation methods discussed in this paper.

2 While minimal standard monitoring is not considered a rigorous impact evaluation method, it is a widely used approach to monitor the effect of policies on the targeted subjects. Section VII describes this method in more detail.

While discussing qualitative assessments in detail is beyond the focus of this Framework, it is worth mentioning that these types of analysis are an important complement to the findings reached through a rigorous impact evaluation. Qualitative assessments are commonly based on the opinions of program participants and stakeholders about the policy, its success, and its limitations. Through surveys, interviews, focus groups, and/or case studies, qualitative evaluations collect additional information that sheds light on the satisfaction of participants, on the mechanisms responsible for the impact of the intervention, and on general feedback to adjust and improve the operation of the policy or intervention. The OECD framework for the evaluation of SME policies by Storey and Potter (2007) provides an in-depth review of these assessments.

Figure 1. Evaluation Approaches

Box 1. Types of Random Assignment

Basic assignment. The classic model for random assignment is to take a baseline survey and randomly assign some participants to the project. This can be done at the level of individuals, firms, schools, or villages.

Oversubscription design. In this design, all eligible candidates are allowed to apply to the program, and a subset of all applicants is randomly assigned to receive the program (via a lottery system, for example). This design is useful when resources are limited and demand for a program or service exceeds supply. It can also be useful in randomizing among marginal loan applicants, as in Karlan and Zinman (2010).

Randomized phase-in. Because of resource constraints, some units (individuals or geographic areas) subject to the program cannot receive the treatment at the same time. In such cases, randomizing who receives the program first is a fair way to allocate resources and also allows for an impact evaluation of the program’s effectiveness.

Encouragement design. In this design, some individuals or firms are randomly “encouraged” (via financial incentives or marketing materials) to participate in the program, even though the program is available to the rest of the population.
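The four schemes in Box 1 differ only in which margin is randomized: the assignment itself, the lottery among applicants, the rollout order, or the encouragement to participate. The following sketch illustrates how each assignment could be drawn in practice; the firm IDs, sample sizes, and variable names are hypothetical and not taken from the Framework.

```python
import random

random.seed(7)
firms = [f"SME-{i:03d}" for i in range(1, 201)]  # 200 hypothetical firms

# Basic assignment: split all firms into treatment and control at random.
shuffled = random.sample(firms, len(firms))
treatment, control = shuffled[:100], shuffled[100:]

# Oversubscription design: a lottery among applicants; losing applicants
# form a natural control group because they also wanted the program.
applicants = random.sample(firms, 80)          # suppose 80 firms apply
winners = set(random.sample(applicants, 40))   # but only 40 slots exist
losers = [f for f in applicants if f not in winners]

# Randomized phase-in: every firm is treated eventually; only the order
# is random, so later waves serve as controls for earlier ones.
rollout = random.sample(firms, len(firms))
wave = {firm: 1 + i // 50 for i, firm in enumerate(rollout)}  # 4 waves of 50

# Encouragement design: the program is open to everyone; only the
# marketing "nudge" is randomized, and it predicts who takes the program up.
encouraged = set(random.sample(firms, 100))

print(len(treatment), len(winners), len(losers), max(wave.values()), len(encouraged))
```

In each case, comparing average outcomes across the randomized groups (treatment versus control, lottery winners versus losers, early versus late waves, encouraged versus non-encouraged firms) isolates the program's effect, because randomization makes the groups statistically comparable.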
Operational Aspects of an Impact Evaluation

Budget Considerations

The overall cost of implementing an impact evaluation usually represents a small fraction of the total cost of the intervention. While the cost of an impact evaluation varies, it is possible to generate reasonable estimates up-front based on understanding the main cost drivers. These costs can be broadly categorized into "technical assistance" and "data collection," with data collection being the most important cost driver, generally constituting approximately 60 to 80 percent of the cost of an impact evaluation. For instance, while the average World Bank impact evaluation costs $500k to $900k (Gertler et al. 2011) when data collection is required, the cost declines to $50k to $200k when administrative data can be used.

Administrative data consists of information collected for some official purpose, such as reporting to government agencies or maintaining records of program participants. While this data is not designed to perform evaluations, if the available indicators fit with the objectives of the evaluation, administrative data is a valid option to consider.3

Data Collection Costs

It is difficult to determine the costs of data collection precisely, since these will depend on different variables such as the sample size needed, the type of data to be collected (household, individual, administrative), the length of the surveys, the frequency with which the data will be gathered, and the labor costs of each country. Yet the Alliance for Financial Inclusion (AFI) provides estimated survey costs for different types of surveys with different sample sizes. According to AFI, a nationally representative cross-sectional survey could range from $100,000 to $700,000 depending on the sample size (from 1,000 to 7,000 observations) and the country where the survey was conducted. Information from the Living Standards Measurement Study of the World Bank indicates that survey costs range from $150 to $300 per household, with a usual sample size between 3,000 and 5,000 households.

The possibility of using administrative data can thus greatly reduce impact evaluation costs, but assessing the availability of relevant data is critical and needs to consider the following factors:

1. Impact evaluations require data before and after the intervention in both control and treatment groups. Administrative data would need to be available over these time periods for these population groups.

2. The more time points available, the more accurate the results: data available at regular intervals for the indicators of interest improve precision, preferably available before, during, and after the intervention.

3. Access and confidentiality can be challenging: While administrative data may exist, accessing these data may be difficult for security reasons. In addition, the time required to access the data in a workable format needs to be factored into the process.

4. Available indicators dictate the questions that can be asked: The types of outcomes that can be monitored are restricted to the types of indicators collected in the administrative data. Evaluators must make sure that the available indicators allow them to monitor the main outcomes of interest for the evaluation.

5. Data format and quality: Administrative data are usually collected for purposes other than statistical analysis. As such, data may not necessarily be in a format that can be directly analyzed, requiring effort to clean and reframe for analysis purposes. The quality of these data also needs to be scrutinized if the evaluation team has not been involved in its collection.

In many cases, administrative data are not available in exactly the right format needed for the impact evaluation for any of the reasons described above. However, it is often possible to work with the office responsible for collecting the administrative data to adapt the data collection activities (for example, by adding specific questions to the larger survey or by including control group data collection). Ex ante evaluations allow the administrative data to be adapted to suit the needs of the evaluation, which is not possible when relying on historical data for an ex post evaluation.

Impact evaluation methods will only affect costs in as much as they influence the data requirements. For instance, approaches such as propensity score matching or regression discontinuity require information on a large set of subjects. Non-experimental methods such as difference-in-difference also require baseline data to ensure that the control and treatment groups are comparable. Though more limited in its precision, an RCT is the only method that does not specifically require a baseline to be conducted, since, by definition, the control and treatment groups will be comparable. However, it is generally good practice to collect baseline data for any evaluation method used.

Additional costs of impact evaluations not necessarily associated with minimal-standard monitoring include monitoring costs of the evaluation (if planned ahead) and researchers' time, but these are usually a small part of the overall budget. In addition, minimal-standard monitoring does not need to collect data on the control groups. It does, however, need data on the periods before the intervention started and after it was rolled out.

Technical Assistance Costs

In addition to data collection costs, impact evaluation work requires budgeting staff time, travel arrangements, and dissemination costs. Contributions from researchers are not needed throughout the whole project timeframe; they mainly contribute work for the impact evaluation design and sampling, as well as data cleaning and analysis. However, most impact evaluations that include data collection need a constant presence in the field, such as a field coordinator, to monitor data collection efforts. Still, these costs are usually a smaller part of the overall budget compared to data collection.

In summary, the main budget items of an impact evaluation are:

- Data collection. The team should identify all primary and secondary data collection requirements and provide a budget for completion (minimum baseline and follow-up data), including qualitative and/or cost-analysis data collection requirements where applicable.

- Impact evaluation team. The budget should include all staff and consultant time for managing the impact evaluation, including design, implementation, and analysis.

- Travel. All necessary travel costs for required project supervision must be factored in, including airfare, accommodations, and food.

- Specialists. The budget should include any additional consultant time and travel for technical assistance (such as survey instrument development, data quality control, and data entry program development).

3 For example, the Italian tax authority conducts a "Sector Studies" survey to collect information on SMEs' activities, economic outcomes, and other variables with the objective of computing how much SMEs pay in taxes. These administrative data have been used in different evaluation projects. In Chile, the Suppliers Development Program, which seeks to strengthen the commercial linkages between small- and medium-sized local suppliers and their large firm customers, keeps records of all participating firms. These records were used in an evaluation to understand the effect of the program on SME productivity (Arraiz, Henriquez, and Stucchi 2011).
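The cost figures above support quick back-of-the-envelope checks. The sketch below applies the cited per-household LSMS range and the 60-to-80-percent data collection share; the function names and the 70 percent midpoint are our own illustrative assumptions, not figures from the report:

```python
def survey_cost_range(n_households, per_household=(150, 300)):
    """Fieldwork cost band using the cited LSMS per-household cost range."""
    low_cost, high_cost = per_household
    return n_households * low_cost, n_households * high_cost

def implied_total_budget(data_collection_cost, data_share=0.7):
    """If data collection is roughly 60-80 percent of the budget (70 percent
    used here as a midpoint), back out the implied total evaluation cost."""
    return data_collection_cost / data_share

low, high = survey_cost_range(3000)   # $450,000 to $900,000 for 3,000 households
total = implied_total_budget(high)    # roughly $1.29 million implied total
```

Estimates like these are only a starting point for the detailed line-item budget described below.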
- Dissemination plan. Any costs associated with travel or logistics must be taken into account for at least one field-based presentation at baseline and one at follow-up, as well as any costs associated with producing written materials.

- Miscellaneous. The budget should include any additional costs related to the impact evaluation, such as payments for institutional review of the research protocol.

Time Considerations

Ideally, impact evaluations should be planned prior to the rollout of the program. Doing so allows the team to collect meaningful pre-intervention baseline data, permits organizing the project implementation for an eventual RCT (allocation of treatment and control groups), and helps stakeholders reach consensus on the program objectives.

To identify the impact of any intervention, evaluators then need to allow sufficient time for the impacts to manifest. Both short-term and long-term impacts can be considered, depending on the intervention, its objectives, and the theory of change backing the project design. The following factors need to be weighed to determine when to collect follow-up data (Gertler et al. 2011):

- Program cycle (including program duration), time of implementation, and potential delays.

- Expected time needed for the program to affect outcomes, as well as the nature of the outcomes of interest.

- Policymaking cycles.

Performing an evaluation too soon after the intervention may miss important long-term consequences. Also, the evaluation timeline must adapt to the timeline of the project, rather than the evaluation driving the project's timeline. Evaluators therefore need to be flexible regarding timing. A strong monitoring system can help track the progress of the actual implementation.

When sufficient budget is available, it is advisable to conduct multiple surveys (midline and endline), which allow the evaluators to draw short-term and long-term conclusions. In addition, tracking the progress of the intervention with a midline survey may help to realign the program to improve the overall project outcomes. Follow-up surveys that measure long-term impacts after the program implementation often produce the most convincing evidence regarding program effectiveness.

The timing of an evaluation must also account for when certain information is needed to inform decision making, and must synchronize evaluation and data collection activities to key decision-making points. The production of results should be timed to inform budgets, program expansion, or other policy decisions.

Selecting an Impact Evaluation Method for an SME Finance Policy

The operational characteristics of the policy should guide the selection of the impact evaluation method. More concretely, two components of the policy matter when selecting an evaluation approach: i) who is eligible for the program, and ii) how eligible subjects are selected to participate in or receive the program. There is no "one size fits all" impact evaluation approach; the best approach will differ with the situation and the policy's characteristics. An additional factor to consider is whether the evaluation was planned ex ante (before the program has started) or is occurring ex post (during or after program rollout).

SME finance impact evaluations that are planned in advance offer more options for evaluation methods than those conducted after the program or policy has been rolled out. Planning ahead has several advantages. For instance, the evaluator can carry out baseline analysis to establish appropriate comparison groups. Evaluators can also decide whether they need to collect specific data not covered in other sources. Under some circumstances, evaluators can introduce a randomization device to increase comparability of control and treatment groups and thus strengthen the evaluation results.
Figure 2. Suggested Designs for Evaluations Planned Ahead

Planned evaluations can be used even in interventions with no obvious control group. For instance, national interventions that were implemented at the same time everywhere can still be evaluated with a rigorous method. Think of a nationwide intervention in which firms apply to participate in a program. Evaluators might plan ahead for an encouragement device to evaluate the intervention (such as reducing the cost of applying for randomly selected firms). Now think of this same intervention but with the additional constraint of limited fund availability, reducing the number of firms the program can accommodate. Evaluators can use an oversubscription design in which firms from the pool of applicants are randomly assigned to the program while the others are not.

Other very common interventions are those that take place simultaneously and at the national level. Evaluators can still find methods to evaluate the impact of these types of interventions. Think, for instance, of interventions trying to reduce the regulatory costs that SMEs face. We might expect these interventions to have a substantially higher effect on SMEs than on larger firms. If this is the case, then evaluators can plan ahead a difference-in-difference evaluation by comparing the performance of SMEs before and after the intervention with that of larger firms.

Finally, evaluators might be creative and utilize the lack of information among SMEs about new nationwide interventions. Suppose that a regulation easing the requirements to open a business was implemented but not marketed to the public. Evaluators might then plan an encouragement design evaluation in which they randomly provide detailed information on the new regulation only to a subset of firms.

Figure 2 presents a method for selecting the most appropriate evaluation approach when the evaluation is planned ahead of the program implementation. While there is no unique mapping of evaluation approaches to interventions, in general, interventions that clearly distinguish participants from non-participants are good candidates for RCTs. Several public interventions might fall into this category, such as programs providing training or grants to SMEs. In other interventions, such as regulatory reforms, who receives the benefits and who does not might not be as clear. These types of interventions might be more suitable for approaches that randomize the rollout of the implementation sequentially throughout regions or that randomly provide an incentive to some groups to participate in the program.

Figure 3 presents a method to help evaluators select an approach for evaluations that were not planned before the intervention. If, for instance, a credit bureau was established in different regions over time, a difference-in-difference approach can evaluate its impact by comparing the outcomes over time in regions where the credit bureau started (the treatment group) with comparable regions where the credit bureau was not yet implemented (the control group).

Sections V and VI discuss in more detail the main features of each of the impact evaluation methods, providing examples of interventions evaluated using each approach and discussing their main assumptions, advantages, and disadvantages. Appendix 3 discusses several examples of impact evaluations performed for various SME finance policies.
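The branching logic behind these decision aids can be condensed into a toy helper. This is only a sketch of the suggested mapping from a policy's operational characteristics to a candidate design; the function and argument names are ours, and real evaluations require case-by-case judgment:

```python
def suggest_method(planned_ahead, clear_participants=False, oversubscribed=False,
                   staggered_rollout=False, national_simultaneous=False):
    """Toy mapping from policy characteristics to a candidate evaluation design."""
    if planned_ahead:
        if clear_participants:
            # Participants are clearly distinguishable from non-participants.
            return "oversubscription lottery (RCT)" if oversubscribed else "RCT"
        if staggered_rollout:
            return "randomized phase-in"
        if national_simultaneous:
            return "encouragement design, or difference-in-difference against larger firms"
        return "encouragement design"
    # Evaluation planned ex post: randomization is no longer available.
    if staggered_rollout:
        return "difference-in-difference"
    return "propensity score matching or regression discontinuity, depending on the data"

suggest_method(planned_ahead=True, clear_participants=True, oversubscribed=True)
```

For example, a training program with more applicants than slots maps to an oversubscription lottery, while a credit bureau rolled out region by region and evaluated after the fact maps to difference-in-difference.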
Figure 3. Suggested Designs for Evaluations Not Planned Ahead

Steps in the Impact Evaluation Process

This section summarizes the recommended steps that an impact evaluation should follow.4 We classified the main steps into four groups: pre-evaluation assessment, evaluation design, data collection, and analysis of results.

During the pre-evaluation assessment, the team must have a clear understanding of the intervention that will be evaluated. It is important to know its main operational characteristics, such as eligibility criteria for the program and how the eligible SMEs are selected for participation. This information is crucial since these characteristics will be the main factors influencing the selection of the proper impact evaluation method.

At this stage, the team should also identify the objectives for which the policy was designed. Was the policy intended to increase employment of SME workers? Was it planned to raise productivity of rural SMEs? Having clearly defined policy objectives will guide the evaluation team to decide which indicators to monitor throughout the evaluation. For instance, if the policy was intended to increase employment of SME workers, then a natural indicator to evaluate is the number of jobs. Evaluators should identify which indicators they plan to use, keeping in mind the data available to perform the evaluation.

During the evaluation design stage, the team must review whether the indicators to monitor can be retrieved from data already available or whether new data collection is needed. Since collecting data is the most expensive part of an impact evaluation, an effective way to maintain a tight budget is by using preexisting data whenever possible. Based on the intervention's characteristics and the type of data to be used, evaluators must decide on the most suitable impact evaluation approach (mainly, identify which subjects will constitute the treated and control groups). In the next section, we provide some guidelines on how to select the appropriate method.

Data collection is the third step, and it will apply in cases in which evaluators plan to collect new data. This includes the entire process from survey design, to piloting the questionnaires, conducting fieldwork, and validating the data.

In the final stage, evaluators analyze the outcomes in the treatment and the control groups and produce the results. At this stage, the evaluators can determine the impacts of the intervention and present them to the appropriate policy makers. Table 2 outlines the main activities to follow at each step of the impact evaluation process.

Table 2. Steps in the Impact Evaluation Process

I. Pre-evaluation assessment
- Have a clear understanding of the characteristics of the intervention
- Identify objectives of the intervention
- Identify the outcomes/indicators to evaluate

II. Evaluation design
- Review data available to perform evaluation and determine whether new data is needed
- Select an impact evaluation method

III. Data collection (if needed)
- Design survey
- Pilot questionnaires
- Conduct fieldwork
- Process and validate data

IV. Analysis of results
- Produce findings of the evaluation

4 Gertler et al. (2011) provide an in-depth description of a roadmap for impact evaluations.

V. Impact Evaluation Methods—The Experimental Approach

In recent years, randomized experiments, also known as randomized control trials (RCTs), have increasingly become the preferred method of evaluation for many development economists (Duflo and Kremer 2006). The essence of the RCT design lies in randomly assigning some units (individuals or firms) to receive the "treatment" (that is, participation in the program) and others to serve as a control group. Such random assignment allows for a credible attribution of the outcomes observed to the program investigated.

The key reason the RCT methodology has gained so much popularity lies in its ability to address the identification problem—ensuring the outcome of the program or policy would not have occurred in such program or policy's absence.

Box 2 describes an RCT that evaluated a business training program targeted at entrepreneurs in Bosnia and Herzegovina. For a discussion of several prominent examples of RCT evaluations relevant for SME finance policies, see Appendix 3.

Key Assumptions

The key assumption of the RCT evaluation is the random assignment of subjects (such as SMEs) to participate in the program.

Box 2. Public Sector Intervention Evaluation: Business Training in Bosnia and Herzegovina

While access to finance has long been thought of as a constraint on SME growth, another set of constraints has recently emerged—business skills, or "managerial capital," which is thought to be lacking in many entrepreneurs. Thus, business training programs and managerial education have become an important focus for policy makers. Business training programs are a good example of interventions that can be evaluated using RCT because they can be randomly administered to a subset of the SMEs to create a clear control group.

A randomized evaluation of a comprehensive business and financial literacy training program for entrepreneurs ages 18 to 30 was conducted in Bosnia and Herzegovina (Bruhn and Zia 2011). The sample included small businesses with an average of two employees. The course covered basic business concepts and accounting skills, as well as investment and growth strategies, with a particular emphasis on the importance of up-front capital investment. The researchers randomly selected treatment and control groups, and performed baseline surveys in both groups.
Similar to many other RCT studies, this study had a relatively low take-up rate: only 39 percent of those in the treatment group actually attended the business training course; others cited lack of time as the reason for nonattendance. The authors found that the training program led to better business practices, such as separation of business and personal accounts and more favorable loan terms, greater investment, and some improvements in sales and profits (but only among a subsample of entrepreneurs with higher financial literacy). However, the program had no effect on firm survival or business start-up, or on loan default rates. The type of information generated by such studies would enable policy makers to design effective financial literacy training programs and target the subsets of SMEs for which such training programs would be the most effective.

While such assignment is random by design, it must be assumed that SMEs cannot manipulate the program assignment (for example, by moving into or out of the affected areas). In addition, all those assigned to the control group must be credibly excluded from receiving any benefits from the intervention.

Strengths

Clear Comparison Group

The random assignment to participate in the program by design creates a valid comparison group, since individuals or firms are randomly placed in the treatment group or the control group. Hence, placement does not depend on any preexisting characteristics that may influence the outcome of the program. In this case, one can be reasonably well assured that program participation is the only reason different average outcomes are observed in the two groups. In other words, when a randomized evaluation is correctly designed and implemented, it provides an unbiased estimate of the impact of the program in the study sample.

Baseline Data Not Necessary

RCT evaluations can be performed without detailed baseline data, which can save on the costs of data collection. Nevertheless, baseline data are often helpful to verify the assignment and also to study how impacts differ for different subsamples, such as men and women.

Limitations

Not All Policies Are Suitable for RCT

For an RCT to work, there must be a clear distinction between the treatment and control groups. The best candidates for RCTs are programs that are targeted to individuals, firms, or local communities. For example, Ravallion (2009) argues that randomization is not suitable for a large subset of policies important for development economics because most often these policies apply to the whole country, the whole population, or all firms. Investigating such a policy using RCT is unlikely to be feasible because no group can be randomly selected not to receive the "treatment." Examples of such policies within the SME finance framework include most policies affecting legal, regulatory, and supervisory frameworks, as those policies most often are implemented on an economy-wide scale. However, such policies can often be evaluated using encouragement design, a type of RCT (see Box 3), or nonrandomized methods.

Sometimes policies that are intended to affect the whole economy may be designed to allow for randomization or for ex post program evaluation if the rollout happens in stages.

Box 3. Encouragement Design

Encouragement design is likely to be applicable for a wide variety of evaluations of SME-related policies.
This method can be very useful for evaluating policies and interventions that are implemented at the country level, such as most changes in regulatory and supervisory frameworks. Such policies can be evaluated in a semi-randomized fashion. In this method, some units (such as firms or households) selected at random receive incentives to participate in a program that is available to all. Such encouragement can be in the form of information, marketing materials, or financial stimulus. An example of an encouragement design mechanism can consist of reducing the cost of applications for a random subset of SMEs to a guarantee program. If firms receiving the encouragement are more likely to apply to the program, this mechanism will predict program participation. Moreover, as this program is assigned randomly, it will not be correlated with firms’ access to credit, so the incentives can be used to evaluate the impact of the intervention. Global Partnership for Financial Inclusion 22 Impact Assessment Framework: SME Finance enterprise registration reform was rolled out in Power of the Design stages in different municipalities in Mexico (Bruhn One important issue with experimental design is 2008). While the sequence of these events can be the power to detect the program effect. The power credibly seen as exogenous to the outcomes of of the design is the probability that a statistically interest, it was not done randomly. Nevertheless, significant result will be obtained. In other words, Duflo and Kremer (2005) argue that randomly the power is the assurance that the result observed determining the order of phase-in may be a fair way is unlikely due to pure chance. One way to address to introduce a program and also will allow for RCT the issue of power is to ensure a sufficiently large evaluation.5 sample size. Appendix 2 offers more details on the An important limitation of RCT is that it cannot be issues of power and take-up. 
An important limitation of RCTs is that they cannot be used to randomly select the recipients of a loan, as financial institutions need to ensure that their recipients are creditworthy and that the loans will be repaid. The allocation of credit should not alter the risk-assessment process of the bank, because doing so could undermine the viability of the SME finance program. An example of a design that takes this issue into account is Karlan and Zinman (2008). In their study, consumers first applied for loans, and then the pool of marginally rejected candidates was randomly assigned to receive a loan. Such studies may also help banks better refine their credit-scoring methodologies.

Another common issue with evaluating programs using randomized methods is that some individuals or firms must be restricted from access to the program. There may be political opposition to delaying program access to some people or firms, or there may be ethical considerations.

Take-up

Related to the problem of power is the take-up of the program, or the proportion of those affected by a policy or a program—whether individuals, households, or SMEs—that will actually use the program. Any program or intervention's impact will significantly depend on the take-up. For example, not all enterprises will choose to register formally or to obtain a loan even if they are assigned to the "treatment" group that offers a particular intervention. A program that increases the availability of finance may not have the desired impact if SMEs do not actually need more access (but perhaps suffer from high costs of access).

The first challenge with low take-up is that it increases the sample size needed to generate statistically significant differences. The second challenge is one of interpreting program impact (see Appendix 1 for a discussion of technical details): program effects must be carefully interpreted to decide whether the parameter estimated is, in fact, one of policy interest.

Finally, for an RCT evaluation to be feasible, evaluators need to obtain data on a sufficient number of treated versus untreated "units." If the units are individuals or firms, it is most likely that sufficient numbers can be found for a statistically valid comparison. But if, for example, the unit of analysis is financial institutions in a highly concentrated financial sector, then there might not be enough of them to compare one group to the other.

5 However, randomized phase-in may become problematic when the comparison group is affected by the expectation of future treatment. For example, in the case of a phased-in microcredit program, individuals in the comparison group may delay investing in anticipation of cheaper credit once they have access to the program. In this case, the comparison group does not provide a valid counterfactual.

VI. Impact Evaluation Methods—Non-experimental Approaches

Difference-in-Difference

The difference-in-difference (DD) approach is one of the most popular methodologies used in impact evaluation, including assessments of SME finance policies. This methodology compares outcomes before and after an intervention took place, and between the group that received the intervention (the treated group) and a control group. The function of the control group is to take into account changes over time that might also affect the treatment group's outcomes. Thus, by comparing the outcomes of the control group to the outcomes of the treated group, any factors affecting both groups in the same manner are canceled out. As with RCTs, the control group is used to infer what would have happened to the treated group if the intervention had not taken place.

To evaluate an intervention using DD, data on the outcomes of interest for the treatment and control groups are needed from periods before and after the intervention. Figure 4 illustrates the DD effect.6

The DD approach is well suited to evaluate SME interventions in which the implementation of the program took place at different stages (for example, a program that was rolled out across municipalities over time) or in which the implementation was targeted to some groups and not others (for example, a project targeting particular municipalities). The evaluator must understand the reasons for targeting specific groups; if the treatment group was selected to maximize the performance of the intervention, then DD estimates could produce biased results (see Box 4 for a DD impact evaluation example).

Figure 4. The DD Effect (the outcome of interest over time for the treatment and control groups, before and after the intervention)

6 The DD effect is computed through two subtractions. First, changes in the outcomes from periods before and after the policy was implemented are computed separately for both groups. Then, to net out any aggregate trend confounding the impact of the intervention, the change in the control group's outcomes is subtracted from the gains of the treated group.

Box 4. Regulatory Reform Evaluation: Business Registration in Mexico

In 2002, the Mexican Federal Commission for Improving Regulation (COFEMER) implemented a new system that substantially reduced the number of procedures and days required to register a business. The objective of this system was to simplify business registration procedures in Mexico. Due to staff constraints, the system could not be implemented in all municipalities at the same time.
While the system was launched in some municipalities in 2002, others were still in the process of setting it up in 2006. Interestingly, the timing of the implementation across municipalities followed no particular pattern. Bruhn (2011) used this exogenous variation in the timing of implementation across municipalities to evaluate the impact of the business registration reform on economic outcomes. Using a difference-in-difference approach, she classified the municipalities that set up the system early as the treatment group. The control group consisted of municipalities with similar characteristics to those in the treatment group but where the system had not yet been implemented. This approach to examining the impact of the reform is valid as long as changes in economic outcomes over time would have been similar in the absence of the reform. To make sure this was the case, Bruhn examined whether the control municipalities could be used as a proper counterfactual by first establishing that these municipalities were comparable to the treated ones. Using data from periods before the reform, she showed that there were no statistically significant differences in the output data, which diminished concerns about selection bias between control and treatment municipalities. She also verified that both early and late adopters were geographically dispersed throughout Mexico, reducing the contagion issue by which firms from control municipalities could be benefiting from the reform. Her findings suggest that the reform increased the number of registered businesses by 5 percent and employment in these industries by 2.8 percent. By increasing competition, the reform benefited consumers and hurt incumbent businesses: after the reform, the price level fell by 0.6 percent and the income of incumbent registered businesses declined by 3.2 percent.
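The two-subtraction arithmetic described in footnote 6 can be sketched in a few lines. This is a hypothetical illustration: the function name and all group means are made up for exposition, not taken from the Mexican reform.

```python
# Illustrative difference-in-difference computation (hypothetical numbers).
# The DD effect is the change for the treated group minus the change for
# the control group, which nets out any common trend.

def dd_effect(treated_before, treated_after, control_before, control_after):
    """Return the DD estimate from the four group-level average outcomes."""
    treated_change = treated_after - treated_before   # gain of treated group
    control_change = control_after - control_before   # aggregate trend
    return treated_change - control_change            # net program effect

# Hypothetical average outcomes (e.g., firm sales) for each group and period.
effect = dd_effect(treated_before=100.0, treated_after=130.0,
                   control_before=95.0, control_after=105.0)
print(effect)  # 30 - 10 = 20.0, the DD estimate of the intervention's impact
```

Note that the estimate is unbiased only under the equal-trends assumption discussed in the next section; the arithmetic itself cannot detect a violation.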
Key Assumptions

The fundamental assumption of the DD estimator is that the control group trend is identical to the trend that the treated group would have had in the absence of treatment. While this assumption is not testable, its validity should always be carefully examined to ensure that the DD properly estimates the impact of the program. If data are available for several years preceding the treatment, then one straightforward way to assess the validity of this assumption is to analyze whether pretreatment trends were equal between groups. While this does not formally prove the identification assumption (which, as mentioned, is not testable), the equality of pretreatment trends suggests that the treated and control groups are, indeed, comparable and thus reinforces the credibility of the estimates.

However, this assumption might be violated when evaluating interventions in which firms self-select into the program. Take, for instance, an evaluation of a new state bank providing loans to SMEs in which firms have to apply for the loan. Using as a control group those firms that decided not to apply for the loan and as treatment those firms that did apply will very likely produce biased results. Firms that select into SME interventions do so because they expect some gains from their participation, while firms that decide not to participate are likely to expect no substantial gains from it. In this case, the control group is not a good representation of the treatment group. In contrast, if the state bank entered some municipalities and not others for logistical or political reasons, then a more robust comparison would be to use SMEs from municipalities without the state bank as the control group and SMEs from municipalities where the bank entered as the treatment group.

Strengths

DD Controls for Factors that Do Not Vary over Time

One benefit of this approach is that DD estimates control for all differences (observable and not) between control and treated groups that do not change over time, minimizing potential biases in impact estimates.

Limitations

The Key Assumption Is Not Testable

One of the main issues of this methodology is that its underlying assumption (of equal trends that the treatment and control groups would have had without the intervention) is not testable, and if it fails to hold, then the DD impact will be biased.

Targeted Interventions

The estimates could be biased if the intervention targeted groups that are expected to experience higher gains. For instance, if a microcredit intervention was implemented in villages with inherently high demand for credit, then the effect that the program would have in treated villages is potentially different from the effect that it would have in the control group, since the demand for loans in this group is lower. Therefore, it is important to understand the motives behind the intervention's implementation and the choice of the treated group.

Other Changes that Affect One Group and Not the Other

Another issue to consider is that this approach will fail to identify the impact of a policy if any change other than the intervention occurs over time affecting one group and not the other. When using DD, one must be confident that such changes did not occur.

Instrumental Variables

The instrumental variables (IV) approach can be used to evaluate SME interventions in which firms, based on unobserved information, can select whether to participate in the program. Very often entrepreneurs self-select into SME finance projects. For example, an intervention providing public credit guarantees with the objective of increasing firms' access to credit may require that entrepreneurs apply for the guarantee. Firms that expect to benefit from having a public guarantee will apply, while firms that expect little or no benefit from the program will not.

To evaluate interventions of this type, an instrument or set of instruments is required. A valid instrument must be a strong predictor of participation in the intervention and must not be correlated with the outcome variable for reasons other than participation in the intervention (that is, it must be exogenous). In this example, an instrument must predict firms' choice to participate in the public guarantee program but must not influence firms' access to credit for reasons other than participation in the guarantee program.

Once an instrument is identified, the impact of an intervention is computed in two steps. In the first step, the instrument is used to predict program participation. In the second step, the predicted participation (which is independent of the outcome variable) is used to evaluate the intervention's impact.

Box 5 discusses an example of an IV impact evaluation that analyzed the effect of a microcredit program in Thailand.

Key Assumptions

The IV estimates are valid if the instrument:

- Is a strong predictor of participation in the intervention. In the microfinance evaluation example, the evaluators were interested in understanding the impact of credit on economic outcomes of Thai villages. Since the amount of credit injected in all villages through the program was the same, smaller villages ended up receiving a more intense credit injection than larger ones. The evaluators' instrument (interactions between the number of households in a village and the program years) is a good predictor of the intensity of credit received in each village because the number of households determined the intensity of the credit injection.

- Is not correlated with the outcomes evaluated. In the example above, the instrument used for the evaluation (number of households in each village during the program years) must not influence the consumption of Thai households, their investments, and overall asset and income growth except through the effect of the program.7

If these assumptions do not hold, the impact estimates will be biased.

Strengths

IV Controls for Unobserved Information

One benefit of the IV approach is that it controls for unobserved differences between participating and nonparticipating subjects. IV estimates isolate the effect of the intervention from unobserved information that influences self-selection into the program.

Baseline Data Are Not Needed

To estimate the IV impact, baseline data are not needed.

Limitations

Unplanned IV Evaluations Are Rare

Evaluations of an intervention in which an IV design was not planned ex ante are rare, because finding a valid exogenous instrument that predicts participation is extremely challenging.

Box 5. Public Intervention Evaluation: Thailand Microfinance Fund

During 2001 and 2002, a substantial microfinance initiative was implemented in Thailand: Thailand's Million Baht Village Fund Program. This public intervention consisted of injecting funds into all 77,000 Thai villages. The initial funds distributed were significant, corresponding to about 1.5 percent of Thai GDP in 2001. Each transfer was used to form an independent village bank for lending within the village. Importantly, every village, regardless of its characteristics, was eligible to receive the program. This program is among the largest government microfinance initiatives of its kind. Kaboski and Townsend (forthcoming) evaluated the impact that Thailand's Million Baht Village Fund Program had on the economic outcomes of Thai villages using the IV approach.
As each village received the same amount of money regardless of its population, smaller villages received a relatively more intense injection of credit. Due to the nature of the intervention, the expansion of credit in villages by the Thai Fund Program would therefore be correlated with the number of households in a village during the program years. Using these interactions of the number of households and the program years as instruments for the amount of credit received, the authors assessed the impact of the program. Their findings suggest that the Million Baht Village Fund injection of microcredit in villages did increase the overall credit in the economy. Households borrowed more, consumed more, and increased their earnings. A short-term effect of increasing future incomes and making business and market labor more important sources of income was also found. The increased borrowing and short-lived consumption response, despite no decline in interest rates, point to a relaxation of credit constraints. The increased labor income and especially wage rates indicate important spillover effects that may have also affected non-borrowers.

IV Estimates Only Local Effects

A second limitation of the IV approach is that it estimates only local average treatment effects (LATE). This means that the IV estimates measure only the impact that the intervention had on those subjects that were affected by the instrument (Angrist and Kreuger 2001).8 In many cases, these local effects are not necessarily the most important for national policy makers.

Regression Discontinuity

Regression discontinuity (RD) is a non-experimental approach used to evaluate interventions that have a defined cutoff for participation. For instance, a business training project aimed at increasing firms' productivity may be provided only to firms that employed more than 20 workers in the year before the intervention. This exogenous cutoff provides a design that allows the identification of the intervention's impact, since firms at the margin of the threshold would not differ substantially: there would be no reason to believe that a firm with 19 workers is different from a firm with 20 workers.

The assumption of this method is that at the margin of the cutoff, the assignment to the treatment and control groups is close to random. By comparing the outcomes of treated firms (firms with 20 workers) with control firms (firms with 19 workers), evaluators can measure the intervention's effect (see Box 6 for an example).

Graphically, the outcome variable (that is, firms' productivity) should show a discontinuity at the cutoff value (that is, at 20 workers). Figure 5 illustrates this example.

One way to validate the RD estimates is to use pre-intervention data on the treatment and control groups and to analyze whether a discontinuity exists between these two groups at the cutoff (Angrist and Pischke 2009). If no discontinuity is found for pre-intervention periods, then the evidence supports that the discontinuity was generated by the intervention.

Figure 5. Regression Discontinuity (Y = productivity plotted against X = number of workers; the control group lies below the 20-worker cutoff, the treated group above it, with a jump in productivity at the cutoff)

7 The instrument would violate this assumption if, even in the absence of credit, larger Thai villages might have experienced different trends in economic activity or business growth than smaller villages.

8 See also Appendix 1 for a discussion of related issues that arise with RCTs.
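The comparison of firms just above and just below the threshold can be sketched as follows. This is a hypothetical illustration: the function name, firm data, and bandwidth are made up for exposition, and a real RD evaluation would typically fit local regressions on each side of the cutoff rather than compare raw means.

```python
# A minimal regression discontinuity comparison (hypothetical data).
# Firms with 20 or more workers received training; we compare mean outcomes
# of firms just above and just below the cutoff within a narrow bandwidth.

def rd_effect(firms, cutoff=20, bandwidth=2):
    """Difference in mean outcomes between firms just above and just below
    the cutoff; `firms` is a list of (workers, outcome) pairs."""
    above = [y for w, y in firms if cutoff <= w < cutoff + bandwidth]
    below = [y for w, y in firms if cutoff - bandwidth <= w < cutoff]
    return sum(above) / len(above) - sum(below) / len(below)

# Hypothetical firms: (number of workers, productivity). Firms far from the
# cutoff (15 and 25 workers) fall outside the bandwidth and are ignored.
firms = [(18, 50.0), (19, 52.0), (20, 60.0), (21, 62.0), (25, 70.0), (15, 45.0)]
print(rd_effect(firms))  # 61.0 - 51.0 = 10.0, the local effect at the cutoff
```

As the text notes, this estimate is local: it says little about the effect of training on firms far from the 20-worker threshold.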
Box 6. Financial Infrastructure Evaluation—Role of Angel Funds in U.S. Start-up Firms

Most equity funding of SMEs around the world comes from two sources: retained earnings and capital provided by personal savings, friends and family, and other "angel" investors.1 Similar to venture capitalists, angel funds are investors in high-potential start-ups, commonly structured as semiformal networks of high-net-worth individuals who decide to invest in projects of aspiring entrepreneurs based on their own assessments. To evaluate the impact of angel funds on U.S. start-up firms, Kerr, Lerner, and Schoar (2010) obtained information on prospective ventures from a large angel investment group. Using a regression discontinuity approach to evaluate the effect of angel funding on the performance of high-growth start-up firms, the authors compared firms that fell just above and just below the funding criteria of the angel group. The evaluation found a strong, positive effect of angel funding on the survival and growth of ventures.

1 World Bank Enterprise Surveys: http://www.enterprisesurveys.org/.

Key Assumptions

The key assumption behind the RD approach is that the potential outcome (that is, firms' productivity) may be associated with the cutoff variable (that is, number of workers), but in a smooth manner. In other words, in the absence of the intervention, this association should have been smooth at the cutoff. In this way, any discontinuity in the potential outcome at the cutoff is interpreted as a causal effect of the intervention. This is known as the continuity assumption (Van der Klaauw 2008).

Strengths

Baseline Data Are Not Needed

One benefit of using an RD design is that baseline data are not needed to estimate the impact. However, data from pre-intervention periods are strongly recommended to perform robustness checks on the validity of the discontinuity.

RD Estimates Are Comparable to Randomized Estimates

A second advantage of the RD approach is that, from a methodological point of view, a solid RD design is comparable in internal validity to a randomized experiment.

Limitations

Independence of Threshold

The most important issue to consider when implementing an RD evaluation is the validity of the cutoff. If the cutoff was assigned with the objective of maximizing the intervention's impact, then conclusions from the RD will be biased. The cutoff selected must be independent of the expected outcomes of the intervention. Suppose that, in the example of the business training project, firms with at least 20 workers are concentrated in the most developed region of the country. Firms in this region are more likely to have access to finance. Thus, the effects of providing business training are likely to be higher if the cutoff is 20 workers than if it were 15 or 10 workers, since firms above the 20-worker cutoff will also have access to better terms of credit, likely increasing their productivity.

Manipulation of the Assignment

Moreover, RD inferences will be invalid if firms are able to manipulate assignment into the program. For instance, if the cutoff is a specific number of employees, then firms can easily hire one more employee to participate in the intervention, prompting selection issues that contaminate the RD impact. As long as firms are unable to manipulate their eligibility for the program, the RD estimates are valid. In this respect the RD design is more flexible than the IV approach, since the IV methodology requires that the instrument be exogenous to the outcomes, whereas RD requires only that firms be unable to manipulate the assignment (Lee and Lemieux forthcoming).

Sufficient Observations Close to the Cutoff

A second issue with the RD approach is that, in order to measure impact estimates, sufficient observations in close proximity to the cutoff must be available. In the business training example, sufficient firms with 18 to 22 workers (numbers close to the cutoff of 20) would be needed to evaluate the RD effect.

Estimated Parameters Might Not Be the Most Important Ones

As in the case of the IV methodology, RD can only estimate the average treatment effect for observations close to the cutoff (that is, the local treatment effect). This implies that it might be difficult to draw conclusions about the impact of the intervention for firms away from the cutoff of 20 workers.

Propensity Score Matching

Propensity score matching (PSM) is a non-experimental approach that can be used to analyze the impact of an SME intervention in which (1) the institutional arrangements that defined selection into the project are known by the evaluator and (2) a control group was not maintained. Under these circumstances, the PSM approach can identify a control group from the group of firms not participating in the program.

The intuition of this method is to find a control group whose observable characteristics are similar to those of the treated group but that did not participate in the intervention. The impact of the intervention is then measured as the difference in outcomes between the treated group (that is, firms participating in the program) and the control group (comparable firms not participating in the program). The approach matches treated firms to non-treated ones using propensity scores that summarize all observable information used to assign treatment (or eligibility for the program). Thus, PSM can be used to identify a control group that is statistically equivalent to the treatment group. As in all other approaches, the control group is used to infer what would have happened to intervention participants without it.

To compute the propensity score, one must estimate the conditional probability of participating in the intervention as a function of the observed characteristics.9 These characteristics are then aggregated into the score. Once a control group is identified, the impact of an intervention is measured by the difference in outcomes between the treated and control groups (see Box 7 for an example).

Key Assumptions

The assumption underlying the PSM estimates is known as the conditional independence assumption. This assumption implies that, after controlling for observable differences between the treated and control groups, the outcome in the absence of the intervention would be the same in both cases. Thus, conditional on the score, any differences between the treated and control groups are attributed to the effect of the intervention.

In other words, this assumption implies that using observed information from SMEs is enough to identify a statistically equivalent control group. This assumption is unlikely to hold in SME interventions in which firms self-select to participate based on factors that are difficult to observe in the data, such as entrepreneurial attitudes, managers' skills, or risk aversion. If these unobserved factors are driving firms' participation in the program, then the PSM approach will fail to identify a proper control group.

9 The conditional probability can be estimated through a probit or a logit model in which the dependent variable is an indicator variable equal to 1 if the subject participated in the intervention, and 0 otherwise. The independent variables are the observed characteristics that determined participation in the intervention.
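The score-and-match logic described above can be sketched as follows. This is a hypothetical illustration: it assumes the propensity scores have already been estimated (for example, with a logit model as in footnote 9), the function name and all numbers are made up, and it uses simple one-to-one nearest-neighbor matching rather than the more refined matching schemes used in practice.

```python
# A sketch of propensity score matching (hypothetical data). Each treated
# firm is matched to the untreated firm with the nearest propensity score,
# and the impact estimate is the mean outcome difference across matches.

def psm_effect(treated, untreated):
    """treated/untreated: lists of (propensity_score, outcome) pairs."""
    diffs = []
    for score, outcome in treated:
        # nearest-neighbor match on the propensity score
        _, match_outcome = min(untreated, key=lambda u: abs(u[0] - score))
        diffs.append(outcome - match_outcome)
    return sum(diffs) / len(diffs)

# Hypothetical firms: (estimated propensity score, outcome such as sales).
treated = [(0.8, 120.0), (0.6, 100.0)]
untreated = [(0.78, 110.0), (0.61, 95.0), (0.2, 70.0)]
print(psm_effect(treated, untreated))  # ((120-110) + (100-95)) / 2 = 7.5
```

The estimate is only as good as the conditional independence assumption: if unobserved factors drive participation, the matched control group is not a valid counterfactual, which motivates combining PSM with DD as discussed below.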
Strengths

PSM Makes It Possible to Identify a Control Group When the Eligibility Criteria Are Known and Observed

The overall advantage of the PSM approach is that a control group can be identified when the selection process is known and observed. The PSM approach is especially useful when several characteristics influence eligibility for an intervention, since it provides a natural weighting scheme (the score) that yields unbiased estimates of the intervention effect (Dehejia and Wahba 2002).

Limitations

PSM Is Data Intensive

Data on sufficient firms, with detailed information on their characteristics, are needed to identify a control group that is statistically identical to the treated group.

PSM Does Not Control for Unobserved Self-selection

If unobservable characteristics also influence participation in the intervention and outcomes (self-selection issues such as the ones discussed in the example), then PSM by itself is not an appropriate method. This could be the case when participating entrepreneurs or firms self-select into the intervention for reasons that also influence their performance. Evaluations using PSM in these situations tend to at least combine PSM with an alternative approach, such as DD, in order to remove the bias due to time-invariant unobservable characteristics (such as motivation, skills, or risk aversion).

Eligibility Criteria Must Not Be Affected by the Intervention

Another issue to take into consideration when using PSM is that information from the institutional arrangements of the intervention is needed to identify the participant selection characteristics (Caliendo and Kopening 2008). For valid PSM estimates, these variables must not be affected by participation in the intervention.

Box 7.
Public Intervention Evaluation: Chile's Supplier Development Program

In Chile, the Suppliers Development Program encouraged large firms to invest in the training of their SME suppliers, strengthening the linkage between large (potentially exporting) firms and SMEs. Large firms participating in the program were expected to provide professional advice, personnel training, technical assistance, or technology transfer to their SME partners. The program would then subsidize the cost of these activities. Each project participating in the program consisted of one large firm that sponsored the knowledge transfer and at least 20 SMEs in the agriculture and forestry sector, or at least 10 SMEs in other economic activity sectors. An evaluation of the program was done by Arraiz, Henriquez, and Stucchi (2011). Administrative data allowed the evaluators to follow beneficiary and non-beneficiary firms for several years before and after the program was in place. To identify a control group, the evaluators estimated the propensity score of participating in the program using firms' information from 2002, the year before the beneficiaries started participating in the program. The score helped the evaluators determine a control group composed of firms that did not take part in the program but that had similar probabilities of participating. A concern of the evaluators was that unobserved characteristics of firms (such as managers' skills or motivation) could have influenced their participation in the program and their success in it. In such cases, the PSM approach should be combined with other evaluation methods that control for unobserved information that might influence self-selection. The evaluators therefore combined PSM with the DD approach, since DD estimates control for all unobserved differences between the treated and control groups that do not change over time.
After identifying their control group through PSM, the evaluators estimated the DD effect of the program. The evaluation found that both local SME suppliers and large firms benefited from participating in it. Local SMEs that participated in the program increased sales and employment. Large firms increased their sales and their likelihood of becoming exporters.

VII. Minimal Standard Monitoring

Minimal standard monitoring typically refers to before-and-after comparisons that monitor over time the performance of the subjects affected by an intervention. The main distinction between minimal standard monitoring and an impact evaluation approach is that minimal standard monitoring does not follow a control group to learn what would have happened to the treatment group in the absence of the intervention.

Suppose, for instance, that evaluators are interested in analyzing the impact that a public credit program has on the profits of SMEs. To do minimal standard monitoring, the only data needed would be information on the profits of participating SMEs, collected before and after the program. The before-and-after effect is then measured by the difference in the average profits before and after the program.

An advantage of this approach is that evaluators only need information on the subjects of interest before and after the reform took place. Compared to rigorous impact evaluations, this approach demands the least amount of data.

A second advantage regards budget. While the cost of rigorous impact assessments and before-and-after comparisons should not differ substantially if data collection is not needed, impact assessments still need to reserve budget for the monitoring costs of the evaluation and researchers' time, whereas in minimal standard monitoring, if these costs exist, they should be lower.

The drawback of using before-and-after comparisons is that there is no control group that allows us to know what would have happened if firms had not received the intervention. With this method, the odds of falsely attributing an effect are large. This method can only identify how subjects change over time. Part of these changes might be attributed to the intervention, but any other factor changing over time in parallel to the intervention (such as economic growth or changing macroeconomic conditions) will contaminate the evaluation. Therefore, we are not able to confidently measure and isolate the impact of the intervention.

VIII. Conclusions

As stated in the SME Finance Policy Guide (GPFI 2011), further work is needed on impact assessment techniques for SME finance policies and interventions. Only a handful of rigorous studies exist. More studies are needed on a wider range of policies in a number of different institutional settings to learn what works, where, and why. To identify good practice models, it is important to examine whether the results of certain policies can be repeated in other environments.

This Framework is intended as a resource for policy makers and regulators to select adequate approaches to evaluate SME finance policies and interventions. While the focus of the Framework is on SME finance policies, the methods described can be applied to evaluate a broader set of SME interventions. The paper reviews a variety of impact evaluation methods—randomized experiments, difference-in-difference, propensity score matching, and regression discontinuity designs—and provides recommendations on how to map the various techniques to interventions spanning regulatory and supervisory frameworks, financial infrastructure programs, and public interventions.

It is important to understand and consider all possible evaluation options and not focus on any single approach, such as randomization. While randomization has many advantages, it is not necessarily the optimal choice in all situations, and it has its own limitations that need to be addressed in carefully planned and implemented studies. Impact evaluation studies should be driven by important policy questions rather than by methods of evaluation.

McKenzie (2010) argues that the SME sector is one area that is particularly full of unexploited possibilities for impact evaluations: "SME focused policies are typically carried out by governments and international financial institutions (IFIs) rather than NGOs, and are too expensive usually for researchers to fund the program on offer themselves. As a result, there is a real knowledge gap—and an opportunity to be grasped. If governments and operations staff at IFIs can work with researchers in evaluating the many projects being implemented, it should be possible to evaluate rigorously many of the policies being carried out for SMEs and to learn where modifications of existing strategies are needed."

In summary, more work is needed to evaluate the wide variety of SME finance policies, and international organizations are well suited to fill in these knowledge gaps. As Duflo and Kremer (2005, p. 342) state, "The benefits of knowing which programs work and which do not extend far beyond any program or agency, and credible impact evaluations are global public goods in the sense that they can offer reliable guidance to international organizations, governments, donors, and NGOs beyond national borders."
References

Angrist, Joshua D., and Guido Imbens. 1994. "Identification and Estimation of Local Average Treatment Effects." Econometrica 62 (2): 467–75.

Angrist, Joshua D., and Alan B. Kreuger. 2001. "Instrumental Variables and the Search for Identification: From Supply and Demand to Natural Experiments." Journal of Economic Perspectives 15 (Fall): 69–85.

Angrist, J. D., and J. S. Pischke. 2009. Mostly Harmless Econometrics: An Empiricist's Companion. Princeton, NJ: Princeton University Press.

Arraiz, I., F. Henriquez, and R. Stucchi. 2011. "Impact of the Chilean Supplier Development Program on the Performance of SME and Their Large Firm Customers." Working Paper, Inter-American Development Bank, Washington, DC.

Ashraf, Nava, Dean Karlan, and Wesley Yin. 2006. "Household Decision Making and Savings Impacts: Further Evidence from a Commitment Savings Product in the Philippines." Working Paper 939, Economic Growth Center, Yale University, New Haven.

Bauchet, Jonathan, C. Marshall, L. Starita, J. Thomas, and A. Yalouris. 2011. "Latest Findings from Randomized Evaluations of Microfinance." Consultative Group to Assist the Poor Report No. 2, Washington, DC, December. http://www.cgap.org/gm/document-1.9.55766/FORUM2.pdf.

Bruhn, Miriam. 2011. "License to Sell: The Effect of Business Registration Reform on Entrepreneurial Activity in Mexico." Review of Economics and Statistics 93 (1): 382–86.

———. 2008. "License to Sell: The Effect of Business Registration Reform on Entrepreneurial Activity in Mexico." Policy Research Working Paper 4538, World Bank, Washington, DC.

Bruhn, Miriam, and I. Love. 2009. "The Economic Impact of Banking the Unbanked: Evidence from Mexico." Policy Research Working Paper 4981, World Bank, Washington, DC.

Bruhn, Miriam, and Bilal Zia. 2011. "Stimulating Managerial Capital in Emerging Markets—The Impact of Business and Financial Literacy for Young Entrepreneurs." Policy Research Working Paper 5642, World Bank, Washington, DC.

Burgess, R., and R. Pande. 2005. "Can Rural Banks Reduce Poverty? Evidence from the Indian Social Banking Experiment." American Economic Review 95 (3): 780–95.

Caliendo, M., and S. Kopening. 2008. "Some Practical Guidance for the Implementation of Propensity Score Matching." Journal of Economic Surveys 22: 31–72.

Cole, Shawn, T. Sampson, and B. Zia. 2011. "Prices or Knowledge? What Drives Demand for Financial Services in Emerging Markets?" Journal of Finance 66 (6): 1933–67.

———. 2009. "Financial Literacy, Financial Decisions, and the Demand for Financial Services: Evidence from India and Indonesia." Working Paper 09-117, Harvard Business School, Cambridge, MA.

Dehejia, R., and S. Wahba. 2002. "Propensity Score Matching Methods for Non-Experimental Causal Studies." Review of Economics and Statistics 84 (1): 151–61.

De Janvry, Alain, C. McIntosh, and E. Sadoulet. 2008. "The Supply- and Demand-Side Impacts of Credit Market Information." University of California–San Diego, San Diego, unpublished.

De Mel, Suresh, D. McKenzie, and C. Woodruff. 2008a. "Are Women More Credit Constrained? Experimental Evidence on Gender and Microenterprise Returns." American Economic Journal: Applied Economics 1 (3): 1–32.

———. 2008b. "Returns to Capital: Results from a Randomized Experiment." Quarterly Journal of Economics 123 (4): 1329–72.

Duflo, Esther, and Michael Kremer. 2005. "Use of Randomization in the Evaluation of Development Effectiveness." In Evaluating Development Effectiveness, ed. Osvaldo Feinstein, Gregory K. Ingram, and George K. Pitman, 205–32. New Brunswick, NJ: Transaction Publishers.

Duflo, Esther, and Emmanuel Saez. 2003. "The Role of Information and Social Interactions in Retirement Plan Decisions: Evidence from a Randomized Experiment." Quarterly Journal of Economics 118 (3): 815–42.

Gertler, Paul J., Sebastian Martinez, Patrick Premand, Laura B. Rawlings, and Christel M. J. Vermeersch. 2011. "Impact Evaluation in Practice." World Bank, Washington, DC.

Global Partnership for Financial Inclusion (GPFI). 2011. "SME Finance Policy Guide." Paper on behalf of the Global Partnership for Financial Inclusion. IFC, Washington, DC.

Imbens, Guido W., and Jeffrey M. Wooldridge. 2009. "Recent Developments in the Econometrics of Program Evaluation." Journal of Economic Literature 47 (1): 5–86.

Kaboski, Joseph P., and Robert M. Townsend. Forthcoming. "A Structural Evaluation of a Large-Scale Quasi-Experimental Microfinance Initiative." Econometrica.

Karlan, Dean, and Jonathan Zinman. 2010. "Expanding Credit Access: Using Randomized Supply Decisions to Estimate the Impacts." Review of Financial Studies 23 (1): 433–64.

———. 2009. "Observing Unobservables: Identifying Information Asymmetries With a Consumer Credit Field Experiment." Econometrica 77 (6): 1993–2008.

———. 2008. "Credit Elasticities in Less-Developed Economies: Implications for Microfinance." American Economic Review 98 (3): 1040–68.

Kerr, W. R., J. Lerner, and A. Schoar. 2010. "The Consequences of Entrepreneurial Finance: A Regression Discontinuity Analysis." Working Paper 10-086, Harvard Business School, Cambridge, MA.

Lee, D. S., and T. Lemieux. Forthcoming. "Regression Discontinuity Designs in Economics." Journal of Economic Literature.

McKenzie, David. 2010. "Impact Assessments in Finance and Private Sector Development: What Have We Learned and What Should We Learn?" World Bank Research Observer 25 (2): 209–33.

McKenzie, David, and Christopher Woodruff. 2008. "Experimental Evidence on Returns to Capital and Access to Finance in Mexico." World Bank Economic Review 22 (3): 457–82.

Ndovie. 2010. "Malawi Business Environment Strengthening Technical Assistance Project (BESTAP) Impact Evaluation." Presentation, Dakar.

Ravallion, Martin. 2009. "Should the Randomistas Rule?" The Economists' Voice (February). www.bepress.com/ev.

Storey, D. J., and J. Potter. 2007. "OECD Framework for the Evaluation of SME and Entrepreneurship Policies and Programme." Organisation for Economic Co-operation and Development (OECD), Paris.

Todd, Petra E., and Kenneth I. Wolpin. Forthcoming. "Structural Estimation and Policy Evaluation in Developing Countries." Annual Review of Economics.

Van der Klaauw, W. 2008. "Regression Discontinuity Analysis: A Survey of Recent Developments in Economics." Labour 22 (2): 219–45.

Winters, P., L. Salazar, and A. Maffioli. 2010. "Designing Impact Evaluations for Agricultural Projects." Impact Evaluation Guidelines, Strategy Development Division, Technical Notes IDB-TN-198, Inter-American Development Bank, Washington, DC.

World Bank. 2012. "Impact Evaluation Toolkit." World Bank, Washington, DC.

Appendix 1. General Concerns

As discussed throughout the Framework, each method has its limitations; however, a number of concerns apply to all impact evaluation methods. In this section we review such general concerns.

Biases—Selection, Attrition, and Spillovers

All impact evaluations face selection bias and need to have a credible way of addressing it. RCTs are best for addressing selection bias because they randomly assign units to be treated. However, several other sources of bias may still crop up in an RCT and may also be an issue in other types of impact evaluation.

One common problem is that the mere fact of being assigned to participate in a program (whether or not such assignment is done randomly) may cause

Box 8. Changes in Behavior in Response to Program Assignment

One common concern with impact evaluations is that they can change the behavior of treatment and control groups.
For example, if the treatment group receives a loan or a training program while the control group does not, then the treatment group may see this as a positive boost to entrepreneurs' morale, which may have an effect on their effort. This would contaminate the pure impact of the loan because the impact may be due to a short-term boost in morale and increased effort, and not to the additional finance or training content.

On the other hand, individuals or firms in the control group may change their behavior in response to not being assigned into the program. For example, if some areas are affected and others are not, then individuals may move across the border into (or out of) the affected areas. In a delayed phase-in situation, when one area receives an intervention while another expects to receive it in the future, the prospect that the intervention is coming is likely to affect behavior in the control group. Another example would be a program that involves collecting accounting data on firms as part of the baseline analysis. Here, the firms that are not in the treatment group may still change their behavior because their accounting data are collected and observed by the evaluators. Thus, even if randomized methods have been employed and the intended allocation of the program was random, the differences in behavior may contaminate this random assignment and produce biased results. Other approaches may also be subject to such sources of bias.

One advantage of experiments is that they can explicitly address any possible changes in behavior. For example, in Ashraf, Karlan, and Yin's 2006 study of a commitment savings account, the change in behavior for those who received information about the new account could come simply because of the reminder about the importance of savings.
To deal with this possibility, the researchers introduced another treatment group that received marketing on the existing savings product, which also served as a reminder about the usefulness of savings. Thus, the possibility that the outcome for the new type of account was simply due to the change in savings behavior could be eliminated by adding this third group.10

10 However, adding another group affects the issues of power, discussed above. This may explain why Ashraf, Karlan, and Yin (2006) find insignificant estimates for the coefficients on the third group.

the treatment or comparison group to change its behavior, which may contaminate the results of the experiment (see Box 8).

In addition, there may be spillover effects from those participating in a program in comparison to those that do not. For example, a program designed to enhance the financial literacy of entrepreneurs may have spillover effects on those not receiving the program, so that their literacy increases as well. This can easily happen if both treated and non-treated entrepreneurs belong to the same business association or have other social connections. Spillovers may also come from redistribution of resources by the government. For example, if some villages are positively affected by the experiment but others are not, then the local government may find other ways to channel resources to unaffected villages (Ravallion 2009).

If the spillover effects on non-treated individuals are generally positive, then the impact estimates will be smaller than they would have been without spillovers. This problem affects both randomized and nonrandomized evaluations. In some cases the experiments can be designed to directly measure the spillovers. For example, in their study of information and 401(k) participation, Duflo and Saez (2003) randomized the offer of an incentive to attend an information session at two levels. First, a set of university departments was randomly chosen for treatment, and then a random set of individuals within treatment departments was offered the prize. This allowed the authors to explore both the direct effect on attendance and plan enrollment of being offered an incentive and the spillover effect of being in a department in which others had been offered incentives.

Finally, there could be differences in attrition rates (that is, dropout) between treatment and control groups, which may also affect the results.11

Scaling Up and Systematic Effects

Many program evaluations, especially RCTs, are often of small-scale interventions and might have a different impact if implemented on a large scale.12 For example, capital grants or directed loan programs for SMEs offered by governmental financial institutions may crowd out private sector loans. In the long run, capital grants may skew the incentives of microentrepreneurs, who will be waiting for grants rather than efficiently running their businesses. Such effects may be particularly important for assessing the welfare implications of scaling up a program. Scaling up programs raises several other issues (see Box 9).

Another example would be a small-scale training program that improves participants' chances to obtain a job. However, scaling up such a program may not necessarily raise aggregate employment because in a world with a fixed number of jobs, a training program could only redistribute the jobs (see Imbens and Wooldridge 2009).

11 Attrition refers to a situation in which individuals or firms leave the sample observed by researchers. This could be due to closures for firms, a move for individuals or firms, or simply refusing to participate in subsequent surveys.
If there are systematic differences in the attrition rates in the two groups, then the results may be biased in either direction. For example, if improving access to finance allows the weakest firms to survive, then the differences in attrition will make the group with access look weaker because it has a higher proportion of the weakest firms.

12 In technical terms, RCTs estimate what are known as partial equilibrium treatment effects, which may differ from general equilibrium treatment effects (Duflo and Kremer 2005).

Box 9. Scaling Up Small Interventions

Scaling up a small program raises several additional issues.

Incentives. Most of the RCTs have been implemented by nongovernmental organizations (NGOs) or researchers, who are highly motivated to achieve the best possible outcome of the experiment. In addition, researchers often select the best NGOs to work with and test some of the products highly relevant to NGOs' work and image. Thus, experiments are often done under a set of ideal conditions, which may not be possible to replicate or scale up. The outcomes might be significantly different when the same program is implemented by government officials with a very different set of incentives (Ravallion 2009).

Allocation of resources. It is plausible that significantly more resources are allocated to the program during an experimental phase than would have been under a more realistic situation or in a less favorable context. Alternatively, such bias could go the other way if the first phase of an experiment does not produce significant results because of ineffective implementation. However, the knowledge generated from the first phase would make subsequent phases more effective. Thus, it is important to understand the institutional and implementation factors that may make the same program successful in one place but not another.

Different outcomes.
In an experimental setting, some firms with potentially low impact are mixed in with firms with potentially high impact from the same program because of the random assignment. If the program is scaled up, then the most likely takers will be firms with potentially high impact. Thus, the outcomes of a national program can be fundamentally different from those of an experiment because of the different types of individuals or firms participating (Ravallion 2009).

External Validity

In impact evaluation discussions, it is common to see references to the internal and external validity of the evaluation. Internal validity refers to ensuring that the measured impact is indeed caused by the intervention being tested, while external validity refers to the confidence that the impact measured in a specific study would carry over to other samples or populations.

RCTs in general have a good track record for ensuring internal validity (aside from the issues discussed above, which often can be addressed). However, RCTs are often criticized on the basis of their external validity (that is, the transferability of the results to other situations, such as different samples of firms, or variations in policies or countries). For example, a specific program that was found effective for one type of firm in one country may not be effective for different types of firms in the same country or for the same type of firm in other countries. Alternatively, a program with some minor variation from the one being tested may or may not be effective in the exact same situation as the one tested. While issues of external validity arise with other evaluation techniques, they more often appear in the context of RCTs.

One way to address external validity concerns is to replicate the evaluations in various settings. It is important to test how robust different programs are in different settings to produce valuable implementation knowledge. However, extensive replication is expensive and time-consuming.13

Another way to alleviate external validity concerns is to couple experiments with the theory of why the program is expected to work (see Duflo and Kremer 2005).

13 In addition, researchers are unlikely to be interested in running the same program in different settings because the lack of novelty will greatly reduce the chances of publication.

Appendix 2. Size and Power of RCT

Sample sizes, as well as other design choices, will affect the power of an experiment. For example, if there are too few units in the treatment or control groups, then the comparison of averages may not produce statistically significant results simply due to the small sample. This can lead to erroneous conclusions: the program may be deemed to have a significant effect when it actually does not, or the program may be deemed ineffective when it actually is effective.

The issue of power in RCTs can be addressed by ensuring a sufficient number of observations in each group and optimally dividing the proportion of individuals in treated and control groups based on the relative costs of treatment versus data collection. The larger the expected difference between treatment and control groups (that is, the effect size), the smaller the sample size needed for equal power.

Larger sample sizes are needed when there are several treatment groups and the researcher is interested in detecting the differences between various treatments in addition to detecting differences between treatment and control groups. Moreover, if researchers are interested in the effect of the program on a subgroup—for example, the impact on female entrepreneurs relative to males—then the experiment must have enough power for this subgroup. This is nontrivial, especially in samples where female entrepreneurship is significantly less likely, which is not uncommon. Stratification methods can be used to ensure a sufficient number of female entrepreneurs in the sample.

In some situations, the evaluation design concerns groups of individuals or firms (for example, by randomly selecting villages and treating all individuals in a village), as the errors are likely to be correlated within the group. The larger the groups that are randomized, the larger the total sample size needed to achieve a given power.

Low take-up exacerbates the issues of power because it reduces the number of units on which to base the statistical analysis. For example, consider a program such as a new loan product or a business training that aims to raise the profits of the microenterprises undertaking the program by 25 percent. A randomized experiment that offered the program to half the firms and used a single follow-up survey to estimate its impact would require a sample size of 670 firms if take-up was 100 percent, but would need a sample size of 2,700 with 50 percent take-up and 67,000 with 10 percent take-up.14

Thus, one solution to the problem of low take-up is to employ a very large sample so that the resulting sample will still contain enough firms or households to enable the researchers to detect a program impact of a given size. An example of a randomized experiment with sample sizes of this magnitude is seen in Karlan and Zinman (2009), where 58,000 direct-mail offers were randomly sent by a South African lender, with 8.7 percent of those contacted applying for a loan. However, the downside is that this solution can be very expensive and therefore not feasible in many situations.
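The way these sample-size requirements scale with take-up can be sketched with a standard two-sample power formula. The calculation below is illustrative only: it assumes a standardized outcome, equally sized arms, and an intention-to-treat effect diluted in proportion to take-up. The 670/2,700/67,000 figures in the text reflect the Framework's own variance and design assumptions, but the quadratic penalty for low take-up is the same.

```python
from statistics import NormalDist

def total_sample_size(effect_size, take_up=1.0, alpha=0.05, power=0.80):
    """Total N for a two-arm trial with an equal split, comparing means of a
    standardized outcome. With partial take-up, the intention-to-treat effect
    is diluted to take_up * effect_size, so the required N grows as
    1 / take_up**2."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value, two-sided test
    z_beta = z.inv_cdf(power)           # quantile for the desired power
    diluted = effect_size * take_up     # effect actually visible in the data
    n_per_arm = 2 * (z_alpha + z_beta) ** 2 / diluted ** 2
    return 2 * n_per_arm

full = total_sample_size(0.25)               # effect of 0.25 SD, full take-up
half = total_sample_size(0.25, take_up=0.5)  # 4 times as many units needed
ten = total_sample_size(0.25, take_up=0.1)   # 100 times as many units needed
```

Stratification and clustering change the formula (clustered designs multiply N by a design effect), but the take-up penalty is always quadratic, which is why restricting a study to units likely to take up the program is so effective at restoring power.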
The second solution to the low-power problem is to restrict the study to a group of units for which take-up would be much higher. For example, a business training program could be advertised to all eligible firms, and then the number of slots available in the program could be randomly allocated among the group of interested firms. Presumably, the take-up would be higher if the firms have already expressed interest in the training. An example of such a design is seen in Karlan and Zinman (2008), in which consumers first apply for loans and then the pool of marginally rejected candidates (all of whom wanted a loan) is randomly assigned to receive a loan.

The advantage of the second approach is that it requires much smaller samples to detect a treatment impact. The downside is that the program impact estimated will apply only to the self-selected group of individuals or firms that expressed interest in the program, not to the general population. For example, policy makers might be interested in the effect of the loan program on all firms or on firms interested in taking up credit. But an evaluation such as Karlan and Zinman (2008), based on the marginal applicants, only informs researchers of the impact on those firms that fall within a narrow band in terms of their creditworthiness according to the specific credit-scoring model used by the bank. Such firms may be different in important ways from the general population of firms. Thus, this experiment cannot be used to evaluate the impact of credit on all firms that desire credit or on the poorest segments of the population.

Appendix 3.
Examples of Impact Evaluations

A discussion of several evaluation approaches by type of intervention is provided below.

Regulatory and Supervisory Frameworks

Entry of a New Bank in Mexico (DD Evaluation)

Bruhn and Love (2009) evaluate the impact on economic activity of the opening of a major bank in Mexico. In 2002, Banco Azteca opened more than 800 branches across the country. Branches were opened on the same day inside all of the preexisting stores of its parent company, Grupo Elektra.

Since Azteca entered only in municipalities with a preexisting Elektra store, these municipalities were used as the treatment group, and municipalities with similar characteristics but no Elektra store were used as the control group. Employing a difference-in-difference approach, the authors analyze the effect that Azteca had by comparing outcomes before and after it opened in both treatment and control municipalities. The gains from the opening of Banco Azteca are then the difference between the changes over time in treated municipalities and control municipalities.

The authors find that this bank had a significant impact on the economic activity of individuals belonging to the informal sector. Its opening increased the proportion of informal business owners by 7.6 percent and led to a higher proportion of women working as wage earners. Additionally, Azteca's opening increased income by about 9 percent for women and by about 5 percent for men.
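The difference-in-difference arithmetic behind an evaluation like this is simple enough to write out directly: the estimate is the change over time in the treated group minus the change over time in the control group. The sketch below uses invented numbers, not data from Bruhn and Love (2009).

```python
def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Each argument is a list of municipality-level outcome averages.
    Returns (treated change over time) - (control change over time)."""
    mean = lambda xs: sum(xs) / len(xs)
    return (mean(treated_post) - mean(treated_pre)) - (
        mean(control_post) - mean(control_pre)
    )

# Hypothetical shares of informal business owners, before and after entry:
effect = diff_in_diff(
    treated_pre=[0.20, 0.22, 0.18],
    treated_post=[0.27, 0.30, 0.24],
    control_pre=[0.21, 0.19, 0.20],
    control_post=[0.23, 0.21, 0.22],
)
# Both groups improved over time, but the treated group improved by more;
# the DD estimate attributes the extra improvement to the intervention.
```

In practice the same estimate is usually obtained from a regression of the outcome on a treatment dummy, a post-period dummy, and their interaction, which also yields standard errors and allows additional controls.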
Bank Branching Regulation in India (IV Evaluation)

Between 1977 and 1990, the Reserve Bank of India mandated that in order to open a branch in a location that already had bank branches, Indian banks had to open four branches in locations without banks. This policy expanded the presence of banks in the rural areas of Indian states.

Burgess and Pande (2005) used an instrumental variables approach to evaluate the impact of this policy on poverty outcomes. The instruments were the policy-induced trend reversals in the relationship between a state's initial financial development and its rural branch growth. In other words, states that were less financially developed in 1961 were less likely to receive bank branches in the periods outside the reform and substantially more likely to receive them during the years of the reform. As these trend reversals were significant in the years of the reform and had no direct impact on poverty outcomes, these instruments proved to be valid.

The evaluation concluded that rural branch expansion in India significantly reduced rural poverty. The reductions in rural poverty were linked to increased savings and credit provision in rural areas. By promoting the expansion of financial services into rural areas, this intervention allowed rural households to rely on more efficient mechanisms to accumulate capital and to obtain loans for longer-term productive investments.

Financial Infrastructure

Credit Information in Guatemala (RCT Evaluation)

The availability of information to evaluate SME creditworthiness is among the key institutional constraints limiting the expansion of SME finance. Credit registries and bureaus can be an effective way to generate such information, as they contain historical information on repayment rates and current information on obligations. The establishment of credit bureaus is one of the policies that is likely to have economy-wide impact and is thus difficult to evaluate using an RCT.

De Janvry et al. (2008) used an encouragement design to examine the impact of the introduction of a credit bureau in Guatemala (see also Boxes 3 and 5). They found that awareness of the existence of a credit bureau was very low in surveys conducted soon after its implementation. They therefore randomly informed a subset of 5,000 microfinance borrowers about the existence of the bureau and how it worked. They found that awareness of the bureau led to a modest and temporary increase in repayment rates and to microfinance groups ejecting their worst-performing members.

Public Sector Interventions

Financial Support to Microenterprises in Sri Lanka and Mexico (RCT Evaluations)

Financing support for SMEs—whether through lines of credit, directed credit, cofinancing, equity financing, or other forms of direct financial assistance—is a popular form of intervention. Such interventions are based on the premise that a lack of finance hampers entrepreneurs, that market failures prevent them from obtaining necessary capital, and that an injection of finance can therefore put them on a path of increasing returns. However, credibly evaluating such programs requires distinguishing those that received the financial injection from those that did not, which is difficult because of self-selection issues (that is, enterprises that end up receiving a loan or a grant differ on many parameters, often unobservable, from those that do not receive such assistance).

Two recent studies use RCTs to evaluate the effectiveness of grants to enterprises. De Mel, McKenzie, and Woodruff (2008b) study microenterprises in Sri Lanka, and McKenzie and Woodruff (2008) replicate the same experiment in Mexico. Grants between US$100 and US$200 were given to a randomly selected subset of microenterprises in each country. The authors find that the grants substantially raise incomes for the average firm receiving a grant and estimate real returns to capital of 5.7 percent per month in Sri Lanka and 20 percent per month in Mexico, much higher than market interest rates in both countries. In addition, the returns are highest for high-ability, credit-constrained firm owners, which is consistent with the view that credit market failures prevent talented owners from getting their firms to an optimal size. Interestingly, these studies find that the impact was similar whether the grants were given in cash or in the form of equipment or raw materials. On the flip side, the studies found that while one-time grants succeed in raising the incomes of poor business owners, they do not lead to significant job creation. Another surprising result of these studies is that grants did not raise the incomes of self-employed women; subsequent research has attempted to understand the reason for this result (De Mel, McKenzie, and Woodruff 2008a).

Studies like these can help policy makers design more effective interventions; however, more evidence may be needed before recommending that policy makers implement grant programs on a wide scale. Specifically, replicating similar experiments in other countries and with a variety of populations would show whether such policies would prove beneficial in other environments. In addition, while a small-scale intervention may be very helpful to those receiving the grants, the general equilibrium effects of implementing such policies on a wider scale need to be properly understood and investigated.

Financial Literacy Programs in Indonesia and the Dominican Republic (RCT Evaluations)

Financial literacy has come to play an increasingly prominent role in financial reform in both developed and developing countries, and is portrayed in global policy circles as a solution for many recent crisis-related financial problems. Many countries have set up financial literacy panels that are charged with developing financial literacy programs.

A recent study in Indonesia was designed to evaluate the causal relationship between financial literacy and demand for financial services (Cole, Sampson, and Zia 2011). The authors offered seminars to randomly selected groups and educated participants on the benefits of and the procedure for opening savings accounts. The authors found a negligible average effect of such programs on the opening of new accounts; however, among uneducated and financially illiterate households, there was a significant increase in the opening of new accounts. Moreover, they found small incentive payments to have a much larger effect on getting individuals to open bank accounts and to be three times as cost-effective as financial education. This study suggests a need for more research on the most effective ways to encourage households and microenterprises to save.

Drexler, Fischer, and Schoar (2011) report on two randomized trials testing the impact of financial training on firm-level and individual outcomes for microentrepreneurs in the Dominican Republic. They found no significant effect from a standard, fundamentals-based accounting training; however, a simplified, rule-of-thumb training produced significant and economically meaningful improvements in business practices and outcomes.

Partial Credit Guarantees in Italy (DD Approach)

In 1996, to promote lending to small firms, the Italian government established the Fund for Guarantee to SME, or SGS, with the generic mandate of providing direct guarantees to lending banks, co-guarantees together with other guarantor institutions, and guarantees of last resort to mutual guarantee institutions. To apply for a guarantee, an SME did not need to assess its degree of financial need. Instead, the SME needed to comply with a number of eligibility criteria, such as belonging to a specific sector and having sound economic and financial conditions. These criteria were then summarized in a scoring system that the SGS used to order applications according to their guarantee merit. Importantly, the eligibility criteria limited the percentage of applications that were rejected on merit grounds.

This paper used a difference-in-difference approach to test the fund's role in widening credit access for SMEs and lessening their borrowing costs. Using data from the fund's books, the authors compared outcomes of guaranteed SMEs with those of nonguaranteed SMEs before and after the SGS was launched. Specifically, the authors examined whether borrowing costs and access to credit, measured as the value of bank debt, were substantially different for SMEs that participated in the program than for those that did not. The difference-in-difference effect can be interpreted as a causal impact as long as the average outcomes for the participating SMEs and the other firms would have followed parallel paths over time in the absence of the program. While this assumption is impossible to test, an exercise was performed to compare how different these two groups were before the program. The results from this exercise found no significant differences between the control and treatment groups, validating the control group as a proper counterfactual. The difference-in-difference results from the paper suggest that Italy's scheme reduced participating SMEs' borrowing costs by 16 to 20 percent. Moreover, SMEs' bank debt increased by 12.41 percent once the scheme was available.

Appendix 4.
Assumptions, Strengths, and Limitations of Different Approaches

Comparison of Impact Evaluation Approaches

Randomized control trials (RCTs)
- Key assumptions: Subjects cannot manipulate assignment into the program. Subjects in the control group must be credibly excluded from receiving benefits from the intervention or program.
- Strengths: Clear comparison group, which allows for credible identification of the impact.
- Limitations: Not all policies are suitable for RCTs. Local effects measured by RCTs might be different from systematic effects when a program is scaled up. External validity.

Difference-in-difference (DD)
- Key assumptions: The trend of the treated group must be identical to the trend of the control group in the absence of the intervention.
- Strengths: DD controls for factors (observed and unobserved) that do not vary over time. Cost-effective impact evaluation method.
- Limitations: DD estimates are invalid if changes over time occurred to one group but not the other, or if the two groups had different trends before the intervention.

Instrumental variables (IV)
- Key assumptions: The instrument must be strongly associated with participation in the policy and must not be associated with the outcomes evaluated.
- Strengths: IV estimates control for unobserved information that may influence self-selection into the program.
- Limitations: If not planned ahead, IV evaluations are difficult to do. IV results estimate only local effects.

Regression discontinuity (RD)
- Key assumptions: In the absence of the intervention, the cutoff variable should be associated with the outcome variable in a continuous manner.
- Strengths: Baseline data not needed. Solid RD estimates are comparable to RCT estimates.
- Limitations: RD effects will be biased if the cutoff was assigned to maximize the impact of the intervention, or if firms are able to manipulate their eligibility for the program. Sufficient observations surrounding the cutoff are needed. RD results estimate only local effects.

Propensity score matching (PSM)
- Key assumptions: After controlling for observed differences, outcomes of the treated group are identical to outcomes of the control group in the absence of the intervention.
- Strengths: PSM allows identification of a control group when the eligibility criteria depend on multiple variables.
- Limitations: PSM is data intensive. The PSM estimator is not robust against bias caused by unobserved information associated with participation in the program (in this case, PSM can be combined with other approaches). PSM should be used only in cases where the evaluator has a clear and detailed understanding of the eligibility criteria of an intervention.

Minimal standard monitoring
- Key assumptions: Outcomes of treated subjects are not being affected by any factor other than the policy of interest.
- Strengths: Relatively easy method to implement—does not require significant technical capacity from the evaluating team or the principal investigator.
- Limitations: Results should be treated with caution since factors other than the policy may be contaminating the results.
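To make the propensity score matching entry above concrete, here is a minimal nearest-neighbor matching sketch. The selection model and all numbers are invented for illustration; a real evaluation would estimate the score from observed covariates (for example, with a logit), enforce common support, and check covariate balance before and after matching.

```python
import math

def propensity(x):
    # Assumed, purely illustrative selection model: larger firms are
    # more likely to participate in the program.
    return 1 / (1 + math.exp(-(x - 5)))

def att_nearest_neighbor(treated, controls):
    """treated, controls: lists of (covariate, outcome) pairs. Match each
    treated unit to the control with the closest propensity score (with
    replacement) and average the outcome gaps: an estimate of the average
    treatment effect on the treated (ATT)."""
    gaps = []
    for x_t, y_t in treated:
        p_t = propensity(x_t)
        _, y_c = min(controls, key=lambda c: abs(propensity(c[0]) - p_t))
        gaps.append(y_t - y_c)
    return sum(gaps) / len(gaps)

treated = [(6, 12.0), (7, 14.5), (8, 16.0)]     # (firm size, outcome)
controls = [(6, 10.0), (7, 12.5), (8, 14.0), (3, 5.0)]
effect = att_nearest_neighbor(treated, controls)
```

Note how the poorly matched control (the small firm) never gets used: matching on the score discards controls that are unlike any treated unit, which is exactly the table's point that PSM identifies a comparable control group when eligibility depends on multiple variables.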