WPS6296 Policy Research Working Paper 6296

Learning from the Experiments That Never Happened: Lessons from Trying to Conduct Randomized Evaluations of Matching Grant Programs in Africa

Francisco Campos, Aidan Coville, Ana M. Fernandes, Markus Goldstein, David McKenzie

The World Bank, Development Research Group, Finance and Private Sector Development Team, December 2012

Abstract

Matching grants are one of the most common policy instruments used by developing country governments to try to foster technological upgrading, innovation, exports, use of business development services, and other activities leading to firm growth. However, since they involve subsidizing firms, the risk is that they could crowd out private investment, subsidizing activities that firms were planning to undertake anyway, or lead to pure private gains, rather than generating the public gains that justify government intervention. As a result, rigorous evaluation of the effects of such programs is important. The authors attempted to implement randomized experiments to evaluate the impact of seven matching grant programs offered in six African countries, but in each case were unable to complete an experimental evaluation. One critique of randomized experiments is publication bias, whereby only those experiments with "interesting" results get published. The hope is to mitigate this bias by learning from the experiments that never happened. This paper describes the three main proximate reasons for lack of implementation: continued project delays, politicians not willing to allow random assignment, and low program take-up; and then delves into the underlying causes of these occurring. Political economy, overly stringent eligibility criteria that do not take account of where value-added may be highest, a lack of attention to detail in "last mile" issues, incentives facing project implementation staff, and the way impact evaluations are funded all help explain the failure of randomization. Lessons are drawn from these experiences for both the implementation and the possible evaluation of future projects.

This paper is a product of the Finance and Private Sector Development Team, Development Research Group. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://econ.worldbank.org. The authors may be contacted at fcampos@worldbank.org, acoville@worldbank.org, afernandes@worldbank.org, mgoldstein@worldbank.org, and dmckenzie@worldbank.org. The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.
Learning from the experiments that never happened: Lessons from trying to conduct randomized evaluations of matching grant programs in Africa#

Francisco Campos, World Bank; Aidan Coville, World Bank; Ana M. Fernandes, World Bank; Markus Goldstein, World Bank; David McKenzie, World Bank, BREAD, CEPR and IZA

Keywords: Matching grants; Impact Evaluation; Randomization bias; Learning from failure. JEL codes: O14, H25, D22, C93

# We are grateful to DFID and the World Bank's Knowledge for Change Trust Fund for supporting our efforts to start impact evaluations in this area, as well as to the World Bank operational staff and government counterparts who helped and supported our attempts. We thank Alvaro González, Smita Kuriakose and Tom Wagstaff for comments. All views expressed in this paper should be considered those of the authors alone, and do not necessarily represent those of the World Bank.

1. Introduction

A typical matching grant consists of a partial subsidy - most commonly covering 50 percent of the cost, but ranging as high as 90 percent - provided by a government program to a private sector firm to help finance the costs of activities to promote exports, innovation, technological upgrading, the use of business development services, and, more broadly, firm growth. Matching grant programs are one of the most common policy tools used by developing country governments to actively facilitate micro, small, and medium enterprise competitiveness, and have been included in more than 60 World Bank projects totaling over US$1.2 billion, funding over 100,000 micro, small and medium enterprises. 1 Add in funding provided by other development agencies and national governments, and it seems likely that at least two billion dollars has been spent on these projects over the last twenty years. Yet despite all the resources spent on these projects, there is currently very little credible evidence as to whether or not these grants spur firms to undertake innovative activities that they otherwise would not have done, or merely subsidize firms for actions they would take anyway. From a social return perspective, the rationale for developing matching grant programs is usually also based on the assumption that there are positive externalities to workers, other firms, and to the country as a whole from having firms undertake these activities – workers will receive jobs and can use their upgraded skills in other parts of the economy, additional firms will learn from firms participating in the program, the market for business development services will be developed, the government will receive additional tax revenues, and society will benefit from broader economic growth. There is even less evidence to support these assumptions. Several case study and non-experimental evaluations have attempted to provide some evidence on the impacts of matching grant programs (e.g. Biggs, 1999; Phillips, 2002; Castillo et al., 2011; Crespi et al., 2011; Gourdon et al., 2011; Lopez-Acevedo and Tan, 2011). However, these programs typically cater only to a tiny fraction of the firms in a country, and the firms that self-select or are selected for these programs are likely to differ in a host of both observable and unobservable ways from firms that do not receive the funding.
This is likely to lead to an upward bias in non-experimental evaluations if more entrepreneurial firms with positive productivity shocks are the ones seeking out the program, and a negative bias if it is better politically connected but less productive firms that receive the funding. 1 Data from a World Bank Latin American and the Caribbean overview available at http://go.worldbank.org/OVDGTHSWY0. Randomized experiments do not suffer from selection bias and offer the potential to provide more credible estimates of the impacts of these programs. Moreover, matching grants would appear ex ante to be one of the types of private sector development activities most amenable to experimental evaluation. We therefore set out to design randomized experiments to prospectively evaluate seven matching grant programs in six African countries. Five were to be supported through World Bank loans and technical support, while two stemmed from a direct engagement with the government focused on increasing the use of impact evaluation (IE) to evaluate national programs. Africa is the region where matching grants have been used the most, accounting for slightly more than half of all such projects supported by the World Bank. Thus, conducting evaluations of projects in this region offered the potential both to understand the impacts of the existing projects and to inform future work in the region. However, for a variety of different reasons, none of these experiments ended up being completed. 2 The continued debate about the role of randomized experiments in development has included the discussion of there being selection into which projects are able to be implemented as experiments (e.g., Ravallion, 2009) and with which partners (e.g., Allcott and Mullainathan, 2011), as well as the need for a trial registry in ensuring that all studies undertaken end up being reported on (e.g., Rasmussen et al., 2011). Most of the attempted studies here would not have even made it to the trial registry stage, but we think there are still important lessons to be learned from discussing the attempted evaluations, their reasons for failure, and the insights for future work that can be drawn from this. This is particularly the case given that almost all existing experiments with firms are with microenterprises (McKenzie, 2010 provides a review), and that experiments working with small and medium enterprises (SMEs) will typically have to work with government agencies or ministries rather than being researcher- or NGO-led. As a result, the lessons from matching grant programs are likely to be informative for many attempts to evaluate other government SME development programs such as export promotion activities, and training and support-service programs. 2 Concurrently, a randomized evaluation of a matching grant program was successfully undertaken by Bruhn et al. (2012) in Mexico, and we will draw some lessons from their work in this paper as well. The proximate causes for our inability to complete randomized experiments are quite simple – in some cases governments were unwilling to randomly select recipients of the grants 3, in others the application rates to the programs were too low to enable the planned selection of a random sample of eligible applicants, and in others continued implementation delays prevented us from starting.
We therefore investigate what the underlying causes were, and discuss the roles of (a) political economy; (b) program eligibility criteria; (c) "last mile" delivery of the program; (d) incentives facing project implementation staff; and (e) difficulties matching impact evaluation funding cycles to the realities of these projects. These findings have insights both for the design of future matching grant projects and for their possible evaluation. Designers of projects should reconsider the eligibility criteria and selection processes, work more on delivery, and better incentivize project staff. Future attempts to carry out experiments in this area need to moderate their expectations about timelines, better align funding cycles with them, use evaluation techniques designed to get more out of small samples, and also consider what we call "little IE" - experiments that can be done within the context of the broader project that address questions on design and efficiency, using IE as an operational tool, rather than asking what the overall impact of the project is. This paper is not a critique of matching grants per se, but rather a synthesis of challenges in designing impact evaluations to measure the causal impact of such programs, and an attempt to draw out the insights that the attempted evaluations have provided on ways in which the implementation of matching grants may be improved in the future. We acknowledge that a number of matching grant programs have been implemented without impact evaluations, and may have been successful in achieving their targets – although, unless these targets are simple targets like funding a target number of firms, it would be difficult to tell if they have worked without an impact evaluation. The remainder of the paper is structured as follows: Section 2 provides a brief introduction to the theory and practice of matching grant programs, Section 3 discusses our planned prospective evaluations of seven projects, and Section 4 examines the reasons why these experiments failed. Section 5 concludes by drawing out the lessons from this experience for both policymakers and researchers. 3 This concern would generally be raised at the beginning of the engagement on an IE design, but in some cases it became pressing only after the program had been launched.

2. The Theory and Practice of Matching Grants

Matching grant programs have been seen as a market-based approach to encouraging firms to purchase specialized services such as training for employees, implementation of standards and quality certification, product development, trade promotion, marketing support, and support for technology upgrading. Instead of the government directly providing these services or subsidizing suppliers of these services, the hope has been that providing subsidies to the buyers of these services will be more efficient, since they can then choose the specific services that their business needs, and the increased demand may stimulate a competitive response from existing and new independent providers of business services (Biggs, 1999). One of the earliest such funds was set up in 1961 by the Irish Export Board as a marketing development fund, while the earliest World Bank support for such a scheme came through projects in India (India Engineering Export Development Project) and Indonesia (Export Development Project) in 1986, both of which required 50 percent matching funds from export-oriented firms (Phillips, 2002).
Over the past twenty years matching grants have been a mainstay of World Bank projects to enhance private sector competitiveness, especially in Africa (33 projects) and Latin America and the Caribbean (16 projects). Their popularity continues, with a review of 36 recent World Bank projects in Financial and Private Sector Development identifying 40% as including a matching grant scheme as a component of the project. While initially many of the programs were directed explicitly at exporters, over time the focus has evolved to more general private sector development and competitiveness. Grants have ranged in size from as small as $200 in some of the African projects to as high as $500,000 in some of the export- or bio-technology-oriented projects, with a typical project offering grants in the $5,000-$10,000 range. The most common match proportion is 50 percent 4, meaning that a $5,000 grant would go towards purchases of $10,000 in services. In most cases the funding is restricted to the purchase of soft services, and excludes capital equipment, wages, or other recurrent business expenses. 5 Sometimes higher matching proportions (e.g., the government paying 70 or 75 percent) are given for smaller firms, and lower matching proportions (e.g., the government covering only 25 percent) are given when the project does allow some capital expenditure. 4 The rationale for setting any particular matching proportion is to maximize the private investment and public gains induced from each dollar of public spending. However, there is no empirical evidence to support any given proportion over another, making the 50% match somewhat ad hoc. 5 Of the seven projects we consider, only one, in South Africa, supports capital equipment purchases, with a current cost-sharing of 50% (it began in 2010 with a cost-sharing ratio of 35%). This is an independent government intervention, which is not part of a World Bank lending program.

2.1 The Economic Justification for Government Funding

The assumption underlying the use of a matching grant program is that firms are not investing enough in business development services currently, and that by lowering the effective price paid for such services, firms will purchase more of them. The question which then arises is "why are profit-maximizing firms not purchasing these services already if it is profitable for them to do so?". Then, even if not profitable, are there enough positive spillovers associated with these investments to warrant incentivizing this investment? Finally, one could also ask whether matching grants are the best way to incentivize private investment to induce the public gains, or could other forms of funding or credit be more effective? Answering these questions is necessary to provide an economic rationale for government intervention, and for understanding the underlying problem that the matching grants are intended to fix. This in turn will guide the questions that need to be answered by any evaluation. A first set of reasons why firms may not be undertaking profitable investments relates to market failures. First, firms might be credit-constrained, and so unable to undertake lumpy profitable investments in business services. While the first best solution would then be to fix the credit market, there has been limited success in encouraging banks to lend to SMEs through partial credit guarantees and other such schemes, and so matching grants might be a second-best solution to this credit market failure.
While banks may finance equipment since it can be collateralized, they are less likely to finance consulting, training, or high-risk intangible activities such as those associated with startups and innovation, for which matching grants are often used. A matching grant may also improve the signal of quality of the business investment proposals (since they are, ostensibly, reviewed by the government for quality, collecting information through a process that would be too costly for a bank to implement), reducing risks for banks and increasing the likelihood of successful loan applications when firms need access to credit in order to fund their matched proportion. Second, owners of small and medium firms might be risk-averse, and avoid making investments in business development services that have high expected return, but which involve risk, because of their inability to insure this risk. Here the first best solution would be an equity market or venture capital market which would enable firms to share risks with investors. In the absence of such markets, the matching grant effectively increases the expected return on the investment in services by lowering its price, therefore inducing firms to take on more risky profitable projects. In fact, some variations of matching grant programs are designed similarly to an equity investment, where the government buys a stake in the business as a way of providing a match to their investment, with the expectation that this will be repaid if the firm generates a profit – effectively making it a loan in the case of success and a grant in the case of failure. Third, the missing market may be on the supply side, with a country having a rather limited supply of business development service providers that can offer the service and credibly signal their quality. The matching grant program may therefore help by increasing the demand for such services enough to generate a market for new service providers to enter. A second set of reasons concerns information and decision-making constraints. Firm owners may just not have sufficient information about the range of possibilities for making use of specialized business development services in their firm, or undervalue them (e.g., Bloom et al., 2012), and therefore be unaware that profitable service investment opportunities exist. The costs of gathering the information or the complexity of the information may be too great, in which case improved salience of the information or support in interpreting it could stimulate investment. The matching grant in this case may work by drawing a firm owner's attention to the range of potential services that their business could use, and getting the firm owner to think about which may be profitable (with technical support in some cases). Alternatively, firm owners may intend to make the investments, but keep putting this off for another day because of present-bias in their preferences. A matching grant scheme with an application deadline may then be the nudge that firm owners need to bear the initial mental, time, and monetary costs of making the business development services investment, whose returns will not be felt until the future. A third set of reasons is that the above factors may combine with regulatory barriers and small markets to result in a lack of competition. In the absence of competition, firms will not feel the same pressure to innovate and increase productivity, and there will be less reallocation from less productive firms not using these services to more productive firms.
The matching grant may increase competition by enabling productive firms to overcome credit constraints or other market failures and begin competing with less productive firms. Alternatively, it may be that firms are already profit-maximizing, and that investing in specialized business development services is not a profitable investment in terms of the private returns to the firm. Nevertheless, if there are positive externalities (not captured by the firm) from purchasing specialized services, the social returns may exceed the private returns, leading to a justification for the government to subsidize the price. For example, if firms train workers in information technology, but then many of these workers leave to work for other companies or to start their own firms, the training may not be worth it from the firm's point of view, but the increase in human capital may still have positive returns from the point of view of society. Likewise, increases in employment may have positive externalities for other citizens if unemployment is an issue, and other firms may learn from the innovations undertaken by firms who find new products, processes, or markets as a result of using these specialized services. World Bank-financed matching grant programs typically have enterprise growth as their main objective. They often justify the need for the program by the lack of supply of service providers and by firm-level information constraints (as in the Mozambique Competitiveness and Private Sector Development Project, for example), but commonly refer to externalities as a critical economic justification for the program. For example, the Mauritius Manufacturing and Services Development and Competitiveness Project Appraisal Document (PAD) notes that while intellectual property right protections help firms that develop new products to gain private benefits from these products, there are no such protections in place when a firm first introduces an existing product into the country. As a consequence, firms may under-invest in searching for technologies in which a country may have an untested comparative advantage, since the entire risk of failure is borne by the firm, whereas if successful, the benefits will be shared with imitators. Second, the PAD argues that since Mauritius has a reasonably flexible labor market, firms may under-invest in worker training and skills upgrading for fear of their workers leaving to work for other firms or to start their own firms. What if none of the above cases hold? That is, what if firms are currently optimally investing in specialized business development services in an unconstrained way given the current market price of these services, and there are no externalities from such services? In this case, the matching grant does not correct any distortions but rather may create distortions. Lowering the price of the specialized services below their market price will have a price effect and an income effect on the firm. The price effect will tend to cause firms to over-consume specialized services, some of which (the non-effective ones) may help deplete firms' investment capacity, while the income effect will give them more resources which they may use to buy other inputs or take as returns to the business owner.
If the specialized services are lumpy purchases (e.g., acquiring a quality certification), then it is also possible that the price effect might be zero – that firms who would have purchased these services anyway continue to do so (and just get a transfer from the government to do what they would have otherwise done), while firms who would not have purchased these services continue not to purchase them when the grant is available. Phillips (2002) notes a worst-case scenario in which subsidies may allow inefficient and unprofitable firms to stay afloat. Note that these factors make the optimal selection of firms to participate in a matching grant program quite different from the decision process facing a venture capitalist or outside investor (McKenzie, 2011a). A venture capitalist is interested in identifying which projects have the largest private returns, whether or not those returns are caused by their funding. In contrast, the government should not be trying to pick the "gazelles" (enterprises which will grow fastest), but rather the projects which have the greatest additional impact from receiving government funding (referred to as "additionality" in the matching grant literature), and which have the greatest positive spillovers on other firms. Both factors mean that impact evaluation relative to a counterfactual is required, rather than just observing what the before-after change is in participating firms.

2.2 Examples of Matching Grants in Practice

We conducted qualitative interviews and case studies of some previous African matching grant recipients and examined a sample of approved applications for the projects considered for the evaluations in order to provide concrete examples of the activities and business development services that firms seek these grants for in practice. To preserve anonymity, we withhold both the names of the firms and their countries in describing these cases. Some examples of approved applications include: a clothing manufacturer who wanted to hire a designer to develop new designs and patterns; a firm making medical and surgical devices who wanted to hire a consultant to implement lean manufacturing techniques in their factory; a firm doing interior design who wanted to conduct a market study to decide whether or not there was a market for selling new window shades; a legal firm seeking technical assistance to help it set up an outsourcing service; a small beach-side hotel that was looking for support in developing its website; an agri-business that wanted to conduct a study-tour in China in search of new equipment for their business; a producer of mineral water that wanted to obtain an ISO quality certificate; an IT company that wanted to participate in an industry-specific international fair; a carpentry workshop that was seeking support for training eight of its employees in using a new machine. Table 1 summarizes the approved activities over a period of two years for one of the programs we wanted to evaluate. Among firms which had previously received grants, one of the most successful cases was a wind-generated electricity producer. It used a matching grant of $8,000 to develop the feasibility studies that allowed it to enter into a joint-venture with a European firm and obtain an investment of US$1 million. It took four years after funding for the project to become commercially operational, and within the first year of operation the firm was already supplying 14 percent of all the electricity consumed in the two districts it services.
This project serves as a possible example of multiple levels of impact, with the firm receiving greater profits, consumers receiving better quality electricity service (brown-outs and black-outs were reduced), and the environment benefiting through clean energy provision. Even in this case it is not possible to ascertain absolutely whether or not this would have happened without the matching grant, but it is certainly the type of case where multiple market failures and externalities seem apparent, and public benefit from the grant seems likely. Two other examples of former grant recipients concern investment in worker and firm training. A civil engineering firm used the grant to help fund quality certification training for its workers, while a communications firm used the grant to develop an employee training scheme. Both firms said that they had been unsure about the effectiveness of such training, but after seeing the results from the grant funding, they had subsequently invested more of their own funding in continuing this activity. For other approved matching grants the additionality needed to justify the government support is not as clear, particularly taking into consideration the information available to firms on the benefits of these investments and their lack of constraints in accessing them. In fact, according to a survey commissioned by the government in one of the countries where we started an evaluation, approximately 25 percent of the firms receiving the matching grant stated that they would have pursued the activity fully anyway, and an additional 49 percent confirmed that they would have pursued the activity in the absence of the matching grant, albeit using a less expensive consultant (31 percent of total), at a later stage (10 percent), or at a smaller scale (8 percent). Examples of less successful uses of the matching grant, or at least of apparently limited public benefits, include providing business services to firms that have the skills, knowledge and capacity to conduct the activities and are expected to invest in those areas. These comprise, for instance, an IT company receiving a matching grant to develop a website or a travel agency wanting continuous support to participate in various international tourism fairs. Applications like these are attractive both to the business (little risk and they continue doing the work they would do anyway) and to the government (the application is likely to be well-polished with clear links to the business's operations), but they reduce the incentive for the government to extend its reach to identify cases where the most impact can be generated rather than cases that are most likely to be successful. As these examples illustrate, there is substantial heterogeneity in the types of business development services and activities funded through matching grants. Many of the services and activities funded seem likely to have mostly private benefits (plus potentially fiscal and employment benefits), while the extent to which the services and activities funded actually will lead to significant externalities and spillovers is less obvious.

3. The Planned Prospective Evaluations of Seven Matching Grant Programs

In order to attempt to prospectively build impact evaluations into a range of private sector activities being financed through the World Bank, we worked in early 2010 with the World Bank's Director of Financial and Private Sector Development (FPD) in the Africa Region and her team to identify a set of projects to evaluate.
Matching grants were one of the most common elements of private sector development projects being planned. In February 2010, the DIME-FPD initiative of the World Bank organized a 4-day workshop on impact evaluations in Dakar, Senegal, which brought together researchers, World Bank operational teams, and key government counterparts in order to explain what is meant by impact evaluation, and to begin the process of building impact evaluations around key components of these projects. Through the course of these activities, we identified six projects - five of which were World Bank-funded - in which matching grants would be used. The seventh project comes from an engagement that started in 2008 with the Department of Trade and Industry (DTI) in South Africa to identify critical projects that should undergo an impact evaluation. In this process, the South African government had shown interest in evaluating a matching grant program focused on micro and small firms, notably because of the public debate around its effectiveness and its significant use of public resources. Table 2 details the names of the seven projects, the amount of resources allocated, the average grant size expected, and the targeted number of recipient firms. Typically the projects were designed to last for three to five years, with funding either being allocated on a rolling basis or through multiple funding windows. For the World Bank-funded projects (5 out of 7 of these projects), three dates are provided to give some sense of the length of the time from project design until official implementation began: the World Bank Regional Operations Committee (ROC) decision meeting date, which is the formal corporate review deciding whether or not to proceed with a proposed operation; the Board approval date, when the project is officially approved by the World Bank; and the project effectiveness date, when the country has met all the necessary conditions needed to begin receiving the funds from the World Bank. The actual length of time from project conception to the first matching grants being given out is substantially longer, involving both preparation time before the ROC decision meeting, and time taken to launch the program and receive applications after project effectiveness. Given that the ideal time to begin an impact evaluation is early in the preparation process, and certainly before Board approval, it can easily take two years from the time when work begins on impact evaluation design to the time when the first grants are given out to businesses.

3.1 Why is Impact Evaluation Important for These Projects?

In discussing these projects with World Bank operational staff and with governments, there was in some cases a sense that these were tried and tested policies that were politically popular, so there was no need for rigorous evaluation.6 However, in other cases there was genuine skepticism as to whether these grants really did provide benefits beyond the firms that received them, and concern over the costs of implementation. Phillips (2002) provides figures on the implementation costs of earlier matching grant projects in Africa, noting that they ranged from 19 percent of the amounts given out in Mauritius, to 40 percent in Kenya and 54 percent in Uganda. Moreover, these figures excluded some key costs, such as the costs of setting up project committees to evaluate the applications to the matching grant program. Therefore the full costs are relatively high compared to the amount of matching grant funds given out.
Given that the resources originate in a World Bank loan that eventually needs to be repaid, citizens of these countries could therefore be paying to subsidize relatively well-off business owners with little public benefit in return. We also discussed the point that the purpose of the impact evaluation is not just to learn whether the program works or not, but to help improve the future operation and targeting of such programs if we can learn what types of businesses the program works for. As Biggs (1999) notes, ideally matching grant programs should select projects with large economic and social returns to the country that would not otherwise find private funding – but in his direct questions to recipients of an earlier matching grant program in Mauritius, 80 percent of recipients said they would have made the investments in technology transfer even without the matching grants. 6 Note that all World Bank projects are required to include project indicators and some form of evaluation, but at most these typically measure the number of firms receiving grants and sometimes before-after changes for the recipients captured through surveys of beneficiaries. In many countries the mindset was still very much one of trying to select the "best" firms or the ones most likely to succeed, not necessarily those for whom the grant funding could make a difference between success and failure. We hoped that the process of engagement of World Bank operational staff and governments in the impact evaluation would also help improve project implementation in this respect. Finally, given that these programs are pervasive throughout Africa, we emphasized that there is a global public good element to the knowledge produced by any impact evaluations – the results would not only influence and inform policy in the country being studied, but also help inform efforts in other countries. For these reasons we were able to raise funding from different sources to help finance the impact evaluations, so that, when approaching these countries to try to implement evaluations, we were coming with some funding in hand to supplement their data collection costs.

3.2 Why Use Randomized Experiments as an Evaluation Tool?

There are a number of private sector development policies in which a randomized evaluation is difficult or near impossible (e.g., changing a national regulation, building new infrastructure). Matching grants, on the other hand, satisfy a number of conditions that make randomization a possibility: i) they involve selection of individual firms; ii) the numbers of firms involved can be large enough to potentially generate enough statistical power for measuring impacts (provided that appropriate methods are used, see McKenzie, 2011b, 2012); and iii) data on key outcomes may be measured reasonably well through firm surveys. Furthermore, matching grants are a type of program for which it is hard for non-experimental methods to convincingly deal with selection issues – firms self-select into whether or not to apply for the program, and then government implementation units select which firms receive the grants. An explicit element of both selection stages is likely to be forward-looking behavior on the part of both the government and the firm. Firms that receive support are more likely to be perceived (by the firm and by government) as having faster potential growth in the future than firms that do not apply or do not get selected, even when they are observably similar in terms of current size and recent growth history.
As such, matching on observable characteristics and differencing out of firm fixed effects is still unlikely to control for the differences between the firms which participate in the program and those which do not. In some circumstances regression-discontinuity designs may offer an appropriate (local) estimate of the program's effects provided that explicit scoring thresholds are used, but the number of firms close enough to the scoring threshold to enable such an approach is likely to be small in most cases. Randomization is also likely to be a more cost-effective option than other quasi-experimental methods. Propensity score matching, for instance, would require first developing a sampling frame of businesses (which in itself is likely to be a challenge in many African countries) and then collecting enough information from a large enough sample to be able to select the most appropriate matching group of businesses to serve as a counterfactual – increasing the costs, time and logistics required to prepare a baseline survey. In contrast, a randomized experiment would not require a sampling frame (since all participating businesses in the study would come from the actual applicant pool) and baseline data for all of these businesses could potentially be collected cheaply through an application form that captures key outcome variables and indicators on which the randomization is likely to stratify.

3.3 What are the Key Initial Issues to Consider for an Impact Evaluation of these Projects?

As outlined in McKenzie (2010), there are a number of typical challenges in conducting an impact evaluation of policies operating at the firm level. First, the number of SMEs in many African countries is relatively small. In our group of matching grant impact evaluations, the DTI's BBSDP project – a matching grant for black-owned SMEs 7 in South Africa - is a clear example of this challenge. The government was interested in supporting firms of a certain size because it already had a number of incentives for micro enterprises and wanted to shift focus from survivalists to established firms with potential for growth. In that vein, when preparing the project, the government had selected a minimum annual turnover of Rand 1 million (approximately USD 125,000) for firms to be eligible to apply. However, based on a set of representative firm-level surveys in South Africa, it seemed that less than 3 percent of the population of firms would qualify to participate in the program. This led to adjustments in the program, which culminated in a revised version of the project with a minimum annual turnover of Rand 250,000, increasing the chances of finding eligible enterprises. Second, there is considerable heterogeneity in firm characteristics and performance, which affects the statistical power to detect the impacts of the programs and thus the sample size requirements. This is an important consideration, especially when one of the objectives of the study is to measure impact heterogeneity to understand for which type of firms the matching grant is most effective. 7 The eligibility criteria imposed, among other aspects, the need for 51% of the firm to be black-owned (African, Indian or Colored under the South African definition).
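To make the power concern concrete, the following is a minimal, purely illustrative sketch in Python using standard two-sample power formulas from statsmodels. The effect size, the coefficient of variation of profits, the autocorrelation, and the female-ownership share are hypothetical assumptions for illustration only, not figures from any of these programs; the approximate ANCOVA scaling follows the logic discussed in McKenzie (2012).

# Minimal sketch: how noisy firm outcomes and subgroup analysis drive
# sample size requirements for a matching grant experiment.
# All numbers below are hypothetical illustrations, not program data.
from statsmodels.stats.power import TTestIndPower

power = TTestIndPower()

# Suppose the evaluation hopes to detect a 20% increase in profits, and
# firm profits have a coefficient of variation of around 1.5 (high, as is
# common in SME surveys). The standardized effect size is then small.
effect_sd_units = 0.20 / 1.5  # roughly 0.13 standard deviations

n_per_arm = power.solve_power(effect_size=effect_sd_units,
                              alpha=0.05, power=0.80, ratio=1.0)
print(f"Firms per arm, post-only comparison: {n_per_arm:,.0f}")

# Adjusting for a baseline measure (ANCOVA) roughly scales the required
# sample by (1 - rho^2), where rho is the autocorrelation of the outcome;
# with rho = 0.5 the gain is modest for noisy outcomes.
rho = 0.5
print(f"Approx. firms per arm with one baseline (ANCOVA): {n_per_arm * (1 - rho**2):,.0f}")

# Estimating a separate effect for a subgroup (e.g., female-owned firms
# making up 30% of applicants) with the same precision requires the
# subgroup itself to reach this size, inflating the total applicant pool.
female_share = 0.30
print(f"Total applicants needed for a female-owned subgroup estimate: "
      f"{n_per_arm * 2 / female_share:,.0f}")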
For example, considering the limited effects of traditional management training programs (McKenzie and Woodruff, 2012) and unrestricted grants (Fafchamps et al., 2011) for female-owned enterprises, it is important to estimate the gender-disaggregated added-value of increasing access to a range of business development services and (in the case of South Africa) new equipment. While traditionally governments and donors think of a matching grant as a mechanism to support existing firms that already use a minimum set of business tools - and this often leads to targeting specific sectors such as manufacturing - it is not clear that the highest impact is achieved by investing in these better-equipped enterprises, which already have access to information, financial mechanisms, and technical skills. Testing whether matching grants are more effective for firms with more limited access to networks and business tools (such as female-owned enterprises concentrated in a limited number of low-productivity sectors) is hence very relevant from a policy perspective. The downside of this goal, though, is that it implies the need to work with significantly larger sample sizes, adding another layer to an already demanding initial setup. We were aware of these two issues, as well as the typical concerns about attrition and integrity of the data collected, in advance of developing these studies. What we will explore next are additional barriers to evaluating matching grant programs that go beyond these concerns.

3.4 Implementing the Randomization

Phillips (2001) notes that all grant programs face a rationing problem since services are effectively supplied at below the free-market price, and so by definition there is excess demand for them. Given that the government is effectively giving away free money to firms, one might expect significant demand for this funding, resulting in the need for projects to select which firms receive it. Since we believe there is substantial uncertainty over which firms would best benefit from receiving these funds, our suggestion was for randomized evaluation based on an oversubscription design. The idea here would be to make the matching grant programs open for all firms meeting certain basic eligibility criteria, and then randomly select which firms would be awarded the grants. In the event of more demand for the grants than the project could fund, this would provide a fair and equitable way of ensuring that all eligible firms received an equal chance of benefitting from these public funds, and might reduce concerns about political connectedness determining who receives the grants. We planned to implement the oversubscription design as follows. First, the program would set an initial limit on the number of grants that could be given out in the first year of the program. Each year there would be one or two funding rounds. This limit would be based on natural manpower constraints (there is a limit to how many grants can get processed, screened, verified, and site-visited), funding constraints (typically the program has funding allocations by year), and how many firms apply in the first year. Next, the program would issue an initial call for proposals coupled with an advertising/marketing campaign to launch the program.
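As an illustration of the selection step described in this subsection, the following is a minimal sketch, in Python, of how the random selection of grantees within a funding round could be implemented on top of application-form data. The column names, eligibility thresholds, sectors, and stratification variables are hypothetical placeholders rather than the criteria of any specific program.

# Minimal sketch of the oversubscription design for one funding round:
# screen applicants against basic eligibility criteria, then randomly
# select the round's target number of grantees, stratifying on
# application-form variables. Column names and criteria are hypothetical.
import numpy as np
import pandas as pd

def select_grantees(applications: pd.DataFrame, n_grants: int, seed: int = 20120601):
    rng = np.random.default_rng(seed)  # fixed seed so the draw is reproducible and auditable

    # 1. Eligibility screen on basic criteria captured in the application form.
    eligible = applications[
        (applications["annual_turnover_usd"] >= 30_000)
        & (applications["sector"].isin(["manufacturing", "agribusiness", "tourism", "services"]))
        & (applications["proposed_use"] == "business_development_services")
    ].copy()

    # 2. Stratify on variables likely to matter for impacts (e.g., region,
    #    female ownership), allocating grants roughly proportionally to
    #    the share of eligible applicants in each stratum.
    eligible["stratum"] = eligible["region"] + "_" + eligible["female_owned"].astype(str)
    shares = eligible["stratum"].value_counts(normalize=True)

    winners = []
    for stratum, share in shares.items():
        pool = eligible[eligible["stratum"] == stratum]
        k = min(len(pool), int(round(share * n_grants)))
        winners.append(pool.sample(n=k, random_state=int(rng.integers(1_000_000))))
    selected = pd.concat(winners)

    # 3. The remaining eligible applicants form the comparison group for
    #    this round and stay free to re-apply in later rounds.
    control = eligible.drop(selected.index)
    return selected, control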
In certain instances, we would provide support to government teams in preparing outreach logistical plans, which could include establishing partnerships with provincial government offices, microfinance institutions, banks, and sector associations, with the objective of promoting the program, organizing workshops, helping firms prepare applications, and even collecting applications. Firms would then be aware that the program has its first round window open with a defined deadline - with potential behavioral effects on the decision to apply on time - and all applications received by a given date would be considered for the first set of funding, with projects not selected eligible to re-apply in future rounds. 8 The applications received would then go through an initial screening to ensure that they meet the basic eligibility criteria in terms of firm size, sector, and planned use of the funds, and, in most cases, would be visited by one of the project specialists. 9 A random set of firms whose applications meet these criteria would then be selected for funding. These intensive marketing campaigns stood to benefit both the impact evaluation design and the project implementation. From the point of view of the evaluation, they could boost the number of applicants, improving the likelihood that an oversubscription randomization method could be justified. On the project side, these campaigns would increase the reach of the program, ensuring that businesses with less access to information through geographic or network marginalization could become aware of the program and submit applications if interested. Since lack of information is likely to be associated with other constraints that the matching grants have been designed to overcome, it is plausible that these firms could have the most to gain from the program. The time-bound windows for applications also have the advantage of allowing the project implementation team to dedicate time initially to marketing and use the rest of the year working on due-diligence and processing of the grants, reducing the risk of over-committing to multiple activities. Moreover, these funding rounds would also provide information on what the demand is for these matching grants, and enable the identification of certain groups of firms that the program would like to reach but which are not applying (e.g., female-owned firms, or firms located outside the capital city). 8 In the first window of applications for the Mozambique project, applications were concentrated in the last few days. 9 In certain cases, the activities to be conducted would only be finally agreed at this stage, because often the firm had an idea of what it wanted but, through a needs assessment conducted by the project, would be able to refine it.
Under this approach, the program is open for all firms to apply. A baseline survey of firms would be conducted to learn about the potential demand among firms for the program, and to provide data for monitoring purposes on firms that get the grants and firms that do not. During the process of conducting the baseline survey, firms would be randomly chosen to receive additional "encouragement" to apply for the matching grants. This encouragement would be in the form of different marketing and informational approaches to increase awareness about the program. Since resources are limited and it is costly to visit firms individually, this approach would clearly not be feasible to follow for all firms. Therefore, the program fairly gives all firms the same ex-ante chance of receiving this additional encouragement, and uses this to learn about the needs of firms for the program, the effectiveness of different marketing strategies, and the impact on firms of getting the matching grants. The danger with this approach is that the encouragement may fail to encourage many firms into applying, so larger samples are typically needed than with a randomized experiment of applicants. In our first attempt to evaluate a matching grant project, we tried unsuccessfully to use this type of encouragement design to measure the impact of the Small Enterprise Development Agency (SEDA) in South Africa. The main reason for trying this approach was that SEDA had been operating on a first-come, first-served basis since 2004. Additionally, the government was reluctant to prevent eligible applicants from receiving the grant, making randomization difficult to implement in practice. Given the lack of success in this first attempt, we considered this encouragement option as a less desirable alternative to randomization in subsequent evaluations (Goldstein, 2011).

4. Why Did the Randomized Experiments Not Happen?

Out of the seven projects for which we discussed impact evaluations, five initially agreed to implement the projects with an oversubscription-based randomized experiment included, one (SEDA, South Africa) agreed to implement the encouragement design approach, and another (Mauritius) had recently obtained World Bank board approval but was not yet at the project effectiveness stage. For this project it had already been agreed with the government that the program would be run on a first-come, first-served continuous basis for applications, and since this had been approved by both the Ministry of Finance of the country and the World Bank board, it was not possible to renegotiate these terms. An encouragement design coupled with a potential matched difference-in-differences back-up plan for evaluation was then designed for this country, giving seven potential randomized experiments. In what follows we discuss what happened to the seven cases where we had an initial agreement to at least start an impact evaluation. In order to mitigate potential political sensitivities, we will not identify for which projects particular issues were a problem, but rather we highlight the set of issues that affected one project or another. In none of the cases was a randomized experiment implemented in the end, although non-experimental evaluations are ongoing in a couple of cases.

4.1 Proximate Causes of Randomization Not Being Implemented

We can group the proximate causes for not being able to implement the randomization into three interrelated reasons. Table 3 indicates which reasons apply to each project.
The first reason was widespread implementation delays in the projects. This made it impossible to start the evaluations until the projects themselves started, and created conflicts with the funding deadlines for the evaluations. In one case, the government decided to soft-launch the program while many details of the intervention were still under discussion. The details of the program were modified along the way during the soft launch period, which lasted over one year. This obviously had adverse consequences for the marketing strategy and the team's focus on conducting an impact evaluation. Second, during these implementation delays, there was often turnover of government staff, leading to the government changing its mind about participating in the randomized experiment, or even about the project itself. In one case where the World Bank project was funding a matching grant with a 50 percent match, the government launched a second matching grant program, which was royalty-based, whereby the government provided 90 percent of the funds and the firm only 10 percent, with firms then repaying through royalties on incremental sales if the activities funded succeeded in increasing sales. The World Bank project was ultimately cancelled following low demand for the original matching grant. Third, the most common reason for the inability to randomize was that program take-up was low: our randomization strategy was based on selecting among an excess supply of eligible applicants. Instead, all projects struggled to obtain enough eligible applicants. In one country where the project was half over when we started discussing the evaluation, the disbursement rate was only 23% and new efforts were under way to increase spending. World Bank operations have an incentive to ensure that the loans "disburse", that is, that the money promised in the loan is actually spent, and similarly project implementation units and governments want to show that their project can spend the money allocated to it, to help support future requests for such money. Therefore, when the few applications were actually received, grants were given to all firms that passed the eligibility criteria, making randomization impossible. These problems are not unique to the matching grant programs that our studies were built around. Goldberg and Ortiz del Salto (2011) undertook a review of World Bank matching grant programs and found that of the 42 projects with an original closing date before 2012, 79 percent were extended at least one year, reflecting delays both in initial implementation and in disbursing money. In a review of 15 completed projects, they report that on average only 70 percent of the allocated money was disbursed, although this average masks considerable heterogeneity across projects, ranging from only 6 percent being disbursed in one project in Zimbabwe, to 99 percent in a project in South Africa and 116 percent in a project in Mozambique.
Again, these are to some degree interrelated, but we discuss each issue in turn.

a) Political economy and capture reasons: Matching grant programs consist of handing out subsidies to firms. As a result of this free money, there is a risk of capture, with those in charge of the funding at various levels wanting to use it to advance their own interests. In one case, the local Chambers of Commerce were used as implementing partners for the program. This had the obvious advantage of using existing organizational links with firms and ensuring that the business community had some buy-in for the program. However, our view is that the major Chamber viewed these grants as something it could use to buy goodwill among its members. As a result, the Chamber lobbied to make the eligibility criteria for the program exclude many small firms (which were not Chamber members) and appeared reluctant to engage in serious outreach to firms beyond its member base. Consequently, far fewer applications were received at this major Chamber, which served the capital city and other neighboring areas home to the majority of firms, than at a second Chamber of Commerce serving other areas, which seemed more willing to reach out to smaller firms. In another country, the national government launched a credit line administered by local governments for (existing or planned) businesses at the same time as the matching grant program. From the entrepreneurs' perspective, this option would provide cash-in-hand under a loan, which would, in theory, have to be paid back with interest, but in practice had low enforcement. These two initiatives led to a number of government officials at the national and local levels conflating the projects (often asking in workshops about the interest rate on the matching grant), and politicizing the programs, promoting the credit line to the detriment of the matching grant because it provided more effective political gains (cash in hand versus "just" a business development service). A second manifestation of political economy concerned election politics. In order for projects to become effective and for matching grant programs to be launched, the government in each country had to undertake a number of steps. In multiple countries these steps were delayed due to election cycles, and in the case of a change in government the interest in pursuing the original project could be reduced. In one case, funding rounds and randomization had been agreed with the project team and the highest ranks of the Ministry, but popular demonstrations in the streets led to a cabinet reshuffling and the replacement of the Minister. The new Minister decided to push for a revised industrial policy strategy with clear sectors of focus. With that goal in mind, the Minister decided to use this program as a mechanism for driving the government's industrial policy, and in that process, decided to stop any randomization. As with all project evaluations, another major political economy issue is how much desire and pressure there is from top levels of government to know the impacts of policy. The successful completion of a randomized experiment in Puebla, Mexico, by Bruhn et al. (2012) offers a sharp contrast to the seven attempted experiments in African contexts. In Puebla, the director of the program wanted to prove that it worked, and had heard of the MIT Poverty Action Lab.
Impetus for the randomized evaluation therefore came directly from the head of the program himself, who directed program staff to follow all the suggestions made by the evaluation team in order to ensure that the program could be rigorously evaluated. While there appear to have been some incentives for him to do this (proving that his program worked would help in budget discussions), it also appears in part to reflect the research interests and educational background of this director, making the experience harder to replicate elsewhere.

Ownership of the evaluation at the level of the implementation team is helpful for a successful impact evaluation. In a couple of our matching grant studies, our initial discussions and workshops were with the management team, top government officials, and Monitoring & Evaluation (M&E) specialists, in order to garner buy-in for the impact evaluation. Conversations with the day-to-day operations team that would implement the project happened only at a second stage; in one of these cases in particular, the project was based in a specific region of the country, which we visited only after the initial interactions. The top officials were interested in learning about their projects and in using the evaluations to prove the programs' effectiveness and obtain further funding. Nonetheless, when we started working with the implementation teams on the ground, we faced significant blocks in implementing the evaluations. The political economy of headquarters versus regional offices came into play, and the implementation teams feared that we would be auditing their work and assessing their competencies, despite numerous discussions about the objectives of the study. For these two projects, the implementation teams saw the evaluation as risky (these were two existing projects with problems), and although top-level officials considered the evaluation to be important, it was not important enough to justify potentially damaging already difficult relations with the operational teams.

b) Overly strict eligibility criteria: The eligibility criteria were in many cases set in a way that excluded the vast majority of firms in most African countries. They tended to be based on a notion of picking the firms that would grow the fastest, not necessarily the ones that would benefit most from the grant. Although most African firms have fewer than 10 workers and are not fully formal (McKenzie, 2011b), most programs required firms to be fully formal. This could be fine if the existence of the program served as an incentive to bring firms into the formal registration and financial systems, but in some cases the programs required firms to have several years of formal registration, including audited accounts and current tax records. In one case, we sat with the project implementation team to assess the reasons for low take-up. In addition to the traditional issues of firms lacking the working capital to fund the acquisition of business development services upfront, and other red tape, we noticed that one of the first questions on the application form was whether or not the firm was registered, even though in theory this was not a criterion for selection into the program. Firms were also asked to attach their business registration certificate to the application.
This could naturally put off a number of firms. After discussions with the project implementation team in another case, it was clear that the targeting was reaching fewer firms than initially anticipated, while a critical group of businesses in a lower turnover range was not being served because of the minimum turnover eligibility criterion. This led to a change in the criteria (reducing the turnover requirement from USD 60 thousand per year to USD 30 thousand and increasing the government's matching proportion) and a dramatic increase in applications (Figure 1).

c) "Last mile" issues and red tape: Much of the emphasis in matching grant programs concerns the design of the eligibility criteria and the bureaucratic procedures involved in setting up a project implementation unit to run the program. Far less attention is given to what Mullainathan (2009) terms the "last mile" problem: how do you actually get firms to take up the program on offer? Marketing is part of what is needed, but a number of issues related to psychology and behavioral economics are also important. Marketing efforts varied across our seven projects. In order to ensure a sufficient number of applications and make firms aware of the program, in one country impact evaluation funding was used to supplement the project's marketing budget, and radio and television advertisements were produced and aired to get the word out to firms. In other countries the outreach was less intensive and slower: in one country there were only two branch offices for the program in the whole country, neither particularly noticeable, and the marketing materials produced were not visually appealing. Developing strong awareness campaigns can be a challenge. Despite one of the projects investing considerably in a large launch and communications campaign, a survey of 209 businesses conducted one year after the launch found that none of the sampled businesses had heard of the program. Of those contacted, 39% were interested, and ultimately only 6% were both interested and eligible to participate, highlighting the need for targeted awareness campaigns when eligibility criteria are restrictive (Figure 2).

Many projects create a number of roadblocks that prevent or inhibit even interested and potentially eligible firm owners from applying. For example, in designing one project, the default was to have each firm separately obtain a letter from the tax authority stating that it was not delinquent on any tax payments. This required firms to spend additional time and effort proving to a government program that they were current with another part of the government! This requirement was changed after our impact evaluation team pointed out the problem, but other bureaucratic steps that increased the cost of applying remained. In qualitative interviews, one firm remarked that "even to apply to these matching grants (intended to help firms get business development services) firms already need to use a consultancy to put together their candidacy folder". While the use of online forms can in principle reduce some of this bureaucracy, in some African countries where business usage of the internet is still low, the requirement that applications be submitted online and correspondence be conducted via email potentially served as another barrier for small firms to access the program. Other elements of how the program was to be implemented may have deterred some firms from applying in the first place. One big issue is how the funding is actually paid out.
Governments want to ensure that the funds are used for the purposes firms said they would be used for, and that firms actually put in their matching share. They also want to ensure that firms are not colluding with business service providers to pay over-inflated prices and then splitting the subsidy without any work actually getting done. The solution in some cases has been for the government to require firms to undertake a procurement process involving multiple written bids, to pay 100 percent of the cost upfront, and then to be reimbursed for 50 percent of the amount upon presentation of receipts and proof that the service was actually provided. But if part of the reason firms need the matching grant in the first place is difficulty obtaining financing, this may prevent some firms from being able to use the program. Moreover, when there are already issues with the extent to which firms trust the government, and with the speed at which the government processes reimbursements, some firms may be unwilling to take the perceived or actual risk of never being reimbursed. A better alternative, which the World Bank has used in Lesotho, has been for the matching grant unit to pay the approved service provider the government's share of the cost directly, so that firms do not need to prepay and then be reimbursed.

An obstacle for evaluations that get to this stage (none of ours did) is that, even conditional on being awarded a grant, a non-trivial number of firms may choose not to take it up. In the successfully implemented randomized experiment in Puebla, Mexico, Bruhn et al. (2012) report that only 80 out of 150 successful applicants actually proceeded to take up the grants, even though these firms had all signed letters of interest and applied for the program, and even though in their case firms were only expected to pay between 10 and 30 percent of the costs of consulting services. The result of this low take-up is lower statistical power, making it harder to detect the impact of such programs. Although we did not manage to implement the randomized experiments, the projects that we covered also suffered from low take-up. In one of our cases, shown in Figure 3, out of the 165 firms that had activities approved (under two windows in mid-2011 and early 2012), only 51 firms had completed the proposed activities as of September 2012. An effective take-up rate of 31% has large implications for the power of a study. Identifying a 20% increase in firm turnover, assuming an average turnover of USD 30 thousand with a coefficient of variation of one, would require a sample of 786 firms assuming full compliance (using power of 0.8 and a two-sided significance level of 5 percent). A 31% take-up rate, however, would increase the sample size required for an intention-to-treat estimate to 8,168 firms – clearly an infeasible task even if the true impact is large.
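To make the arithmetic behind these numbers transparent, the following minimal sketch reproduces the calculation with a standard two-sample sample-size formula; it is an illustration in Python rather than the authors' actual tool, and the intention-to-treat adjustment is simply the usual one of scaling the detectable effect by the compliance rate.

```python
# Minimal sketch of the sample-size calculation described in the text
# (power = 0.8, two-sided alpha = 0.05, coefficient of variation = 1).
# This is an illustrative reconstruction, not code from the paper.
import math
from scipy.stats import norm

def total_sample(effect_sd, power=0.8, alpha=0.05):
    """Total sample (both arms combined) needed to detect a difference of
    effect_sd standard deviations between treatment and control means."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    n_per_arm = 2 * (z / effect_sd) ** 2
    return 2 * math.ceil(n_per_arm)

mean_turnover = 30_000
sd_turnover = 1.0 * mean_turnover            # coefficient of variation of one
effect = 0.20 * mean_turnover / sd_turnover  # a 20% turnover increase, in SD units

print(total_sample(effect))           # 786 firms with full compliance
print(total_sample(effect * 0.31))    # 8168 firms when only 31% take up the grant (ITT)
```

Because the intention-to-treat effect is diluted by the 31 percent take-up rate, the required sample grows by roughly 1/0.31², or about tenfold.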
In another country, the government provided an incentive to accredited individual consultants to help firms with their application process. This created a perverse incentive to artificially inflate submissions, with a number of applications covering activities that the firm never intended to implement. The government has since revised the incentive scheme to pay a proportion only upon completion of the activity.

d) Incentives facing project staff: Project implementation units are typically staffed by individuals on fixed-wage contracts who do not have strong monetary or career incentives to generate an excess of applications or to strengthen screening mechanisms that would improve the targeting and quality of submissions. More applications simply mean more work for staff without any clear reward. Given the high turnover in these types of "private-sector-like" public jobs (members of the project implementation units are paid out of the project – hence with government money – but usually do not sit in the Ministry, having a separate structure and higher-paid jobs than traditional government officials), staff do not necessarily value the potential scaling-up of the project if the impact evaluation were to show strong positive results. Effort is therefore often kept relatively low, even in cases where going the extra mile to generate an excess of applications would only have required better-organized promotion, involving the right partners in the awareness campaigns, and more regular one-to-one meetings with firms to assess needs and explain the available services. In one case, the government was interested in reaching out to firms outside the capital city, mostly for economic and political reasons rather than for the impact evaluation, but the implementation unit blocked the setting of individual performance targets (these were eventually set once the Minister blamed the project for non-performance in certain regions of the country). In three other cases, it was clear – and communicated to us at various moments in time – that there was conflict between different team members over the appetite for reaching out to new applicants. This lack of incentives within project implementation units is sometimes also shared by World Bank operations teams, who in theory would like to see an impact evaluation of their program, but have few formal incentives to take on, or to request from the government team, the additional work required to implement a successful evaluation.

e) Funding cycles for impact evaluation: Funding for the impact evaluations came from monitoring and evaluation budgets included in the projects themselves, from World Bank trust funds the researchers applied for, and from external donors. The non-project funds were typically given for periods of two years, with any funding not spent during this period having to be returned. This created a severe mismatch with the cycle of the project being evaluated, especially once implementation delays occurred. Funders were in some cases willing to give a single extension of up to one year, but were not willing to extend funding longer than this. Moreover, many funders were inflexible in terms of which countries the research could cover, with a strict focus on Africa for example, so that it was not possible to take funding raised for one matching grant impact evaluation and transfer it to learn from a matching grant project taking place in another part of the world. As a result, in some cases we had to decide to cancel evaluations – despite the effort put into the design, the researchers' involvement, and field coordinators having already been hired – rather than wait and see whether applications would pick up in future rounds of the project.
5. Lessons for Future Evaluations and Future Matching Grant Projects

Many of these risks were not unforeseen, with one grant application, for example, noting that we anticipated three potential risks: delays in the timing of the project launch; the government and/or World Bank operational staff changing their minds and deciding not to proceed with randomization; and insufficient demand for the matching grant program. We took actions to mitigate these risks by working closely with operational task leaders and government units, building some slack for delays into our evaluation timelines, and actively trying (including by helping to fund advertising campaigns) to ensure enough applications. Even with these actions we thought it unlikely that all the evaluations would be able to proceed as randomized evaluations, but hoped that by taking a portfolio approach we would at least enable a couple of cases to succeed. As noted, this was too optimistic, and so in this section we attempt to draw lessons for future attempts.

One difficult question facing researchers deciding to work on these evaluations is the extent to which they should actively try to change the design, and especially the implementation, of the project they are evaluating. On one hand, this can be part of the value of a prospective impact evaluation, with the questions raised by researchers improving project design and performance. This is the approach often used in evaluations led by academics working with NGOs, with full-time field coordinators tasked with micro-managing the roll-out of the program being evaluated. However, this raises concerns about scalability and generalizability, and may result in a different set of firms participating in the program than would be the case without researcher involvement (just promoting the program to firms that would not typically apply can have implications for external validity: if the researchers' efforts create new demand from a different type of firm, it would ideally be important to identify upfront which firms these are, so as to disentangle the impacts by type of firm). If the program has heterogeneous impacts on firms, the average treatment effect for the set of firms participating in a project with substantial researcher involvement in implementation may be quite different from what the average treatment effect would be in a standard program without researcher involvement. However, our view is that many of the recommendations for designing matching grant programs so as to make them more amenable to randomized experiments are also likely to make the projects better for firms, so that in practice this conflict may not always arise. Given this, we have the following recommendations:

1. Have more realistic expectations about the time it takes to implement these projects: this applies to project staff designing the timelines for projects, to researchers considering evaluating them, and to funders. In most cases it seems likely that it will take at least three years from the start of researcher involvement in these evaluations to see results, unless the government can be more efficient than usual. Grant agencies should consider funding portfolios of evaluations over longer time horizons, with the expectation that funding will shift across projects and may come to encompass new projects within that general topic area. McKenzie (2012a) sets out some additional implications for funders.

2. Change the mindset from picking winners to picking positive treatment effects: The mindset of policymakers and operational staff is often that these projects should be trying to identify the firms that are gazelles (i.e., potentially high-growth firms).
This raises the risk of little additionality, as the project simply ends up subsidizing firms that would grow anyway, and it makes it more difficult to set eligibility criteria that will enable more firms to participate. In contrast, once the emphasis is shifted to funding the firms that would benefit most from the grant, it should become more immediately apparent to government staff that we have relatively little idea of what identifies these firms, and so there is a role for expanding the program to serve a broad range of firms and then scientifically measuring impacts to see for whom the program works best.

3. Focus more on eligibility criteria and on making it easier for firms to apply: the most innovative companies may not be those that are already formal, large, and long-established. Market failures may apply more to younger and smaller companies, so ensuring that eligibility criteria do not rule these firms out seems a good idea. But just as important as the initial criteria is making it easy for firms that meet them to find out about the funds, apply without undue burden, and receive the funding promptly. Incentives for both government and World Bank operational staff are often focused more on getting projects launched and ensuring money is spent than on the efficient delivery of services once projects have been launched.

4. Use evaluation techniques that can obtain more power out of relatively small samples: despite all these efforts, in many African countries there are simply not that many firms with more than one or two workers, and the projects will likely end up funding fewer than 100 firms a year. McKenzie (2011b, 2012b) discusses how one can obtain more power from such sample sizes by restricting attention to a more homogeneous set of firms and by taking multiple measurements on them. An extreme case is illustrated by Bloom et al. (2012), who conducted an experiment with only 20 textile firms in India; but because these firms were all in the same sector and weekly production data were collected, it was still possible to measure the impacts of management consulting. In practice, matching grant programs are unlikely to have the same homogeneity of firm production technologies or the same richness of data, but steps in this direction can still be made (a stylized sketch of the power gains from repeated measurement is shown below).
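As a rough illustration of this recommendation, and not something taken from the paper, the sketch below applies the standard repeated-measures variance formulas that underlie the "more T" argument: averaging m follow-up survey rounds scales the outcome variance by (1 + (m - 1)ρ)/m, and an ANCOVA specification with one baseline scales it by approximately (1 - ρ²), where ρ is an assumed autocorrelation of the outcome across rounds. All parameter values here are illustrative assumptions.

```python
# Illustrative sketch (assumed parameter values) of how repeated measurement
# reduces the sample needed to detect a 0.2 standard-deviation effect.
from scipy.stats import norm

def n_per_arm(effect_sd, var_factor=1.0, power=0.8, alpha=0.05):
    """Sample size per arm for a two-arm mean comparison, where var_factor
    rescales the outcome variance according to the survey design used."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return 2 * var_factor * (z / effect_sd) ** 2

effect_sd = 0.2   # e.g. a 20% turnover increase when the coefficient of variation is one
rho = 0.5         # assumed autocorrelation of firm outcomes across survey rounds
m = 3             # number of follow-up survey rounds

print(n_per_arm(effect_sd))                           # ~392 per arm: one follow-up, no baseline
print(n_per_arm(effect_sd, (1 + (m - 1) * rho) / m))  # ~262 per arm: average of 3 follow-ups
print(n_per_arm(effect_sd, 1 - rho ** 2))             # ~294 per arm: ANCOVA with one baseline
```

When a program will fund fewer than 100 firms a year, design choices of this kind can make the difference between a feasible and an infeasible evaluation.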
5. Practice "little IE" as well as "big IE": our goal in these failed experiments was impact evaluation (IE) of the overall impact of these grants; we refer to this as "big IE". While important, we have seen the difficulties in doing it. Given the large number of design issues that matching grant programs grapple with, and for which there is no strong evidence base for decision-making, it seems possible to also conduct more "little IE". By this we mean experiments embedded in the broader project that can help us learn not whether matching grants have an impact overall, but what the impacts of changing different design features of these grants could be. This shifts the question from "what is the impact of the matching grant scheme?" to "what is the most effective way to maximize the impact of the matching grant scheme?". For example, experiments could be done to test different ways of implementing reimbursements, to compare different incentive schemes for project implementation staff and see which generates more and better applications, to examine demand for the program when different match proportions are offered, to test alternative information campaigns, or to overlay complementary interventions such as credit guarantees that address other identified constraints and allow tests for interactions. This offers value both to researchers and to operations teams. On the one hand, using impact evaluations to tweak and test features of the program can turn the impact evaluation into an operational tool for improving the speed and efficiency of project disbursements (for instance, by identifying mechanisms that increase the quantity and quality of business applications and claims). On the other hand, it offers the opportunity to test cross-cutting mechanisms that may be applicable beyond matching grants (e.g., how do deadlines influence firm behavior?).

6. How should we conduct impact evaluation if most outcomes are failures? Hall and Woodward (2010) document that among venture-capital-backed entrepreneurs almost three-quarters receive nothing at exit while a few receive over a billion dollars. Karlan et al. (2012) likewise make the point that in some cases smaller firms experimenting with training and capital investments may have negative expected returns from such experimentation in the short run, but with some outliers succeeding. It seems plausible that the impacts on society from matching grant projects may follow a similar model – cases like the wind-powered electricity production deal are likely to be much rarer than cases of firms undertaking actions that have mostly private payoffs or no realized benefit at all. Since much of the focus of randomized experiments has been on identifying mean impacts, more work needs to be done to consider how best to apply these tools when most of the expected action is in the tails.

Many of these lessons are likely to be applicable beyond the evaluation of matching grants. Other activist government programs to help SMEs are likely to run into many of the same issues of implementation delays, limited take-up, political constraints, a relatively small number of firms, and difficulties matching funding cycles to project cycles. It is hoped that sharing the lessons of our experiences with matching grants may help improve the likelihood of success of impact evaluations in those areas as well.

References

Allcott, Hunt and Sendhil Mullainathan (2011) "External validity and partner selection bias", Mimeo, Harvard University.

Biggs, Tyler (1999) "A microeconomic evaluation of the Mauritius Technology Diffusion Scheme (TDS)", Regional Program on Enterprise Development Discussion Paper no. 108.

Bloom, Nicholas, Benn Eifert, Aprajit Mahajan, David McKenzie, and John Roberts (2012) "Does management matter? Evidence from India", Quarterly Journal of Economics, forthcoming.

Bruhn, Miriam, Dean Karlan and Antoinette Schoar (2012) "The impact of consulting services on small and medium enterprises: Evidence from a randomized trial in Mexico", Yale Economics Department Working Paper no. 100.

Castillo, Victoria, Alessandro Maffioli, Sofía Rojo and Rodolfo Stucchi (2011) "Innovation policy and employment: Evidence from an impact evaluation in Argentina", IDB Working Paper no. IDB-TN-341.
Crespi, Gustavo, Alessandro Maffioli, and Marcela Melendez (2011) "Public support to innovation: The Colciencias experience", IDB Working Paper no. IDB-TN-264.

Fafchamps, Marcel, David McKenzie, Simon Quinn and Christopher Woodruff (2011) "When is capital enough to get female microenterprises growing? Evidence from a randomized experiment in Ghana", CSAE Working Paper WPS/2011-11.

Goldberg, Michael and Daniel Ortiz del Salto (2011) "Matching grants: A review of matching grants in projects promoting private sector development", World Bank internal PowerPoint presentation, available at http://intresources.worldbank.org/INTLAC/Resources/257803-1226691316407/Matching_Grants_Promoting_Private_Sector_Development.pdf

Goldstein, Markus (2011) "A disappointment with encouragement", Development Impact blog post, April 5, http://blogs.worldbank.org/impactevaluations/node/524

Gourdon, Julien, Jean Michel Marchat, Siddharth Sharma and Tara Vishwanath (2011) "Can matching grants promote exports? Evidence from Tunisia's FAMEX II program", pp. 81-106 in Olivier Cadot, Ana Fernandes, Julien Gourdon and Aaditya Mattoo (eds.) Where to spend the next million? Applying impact evaluation to trade assistance. World Bank: Washington, D.C.

Hall, Robert and Susan Woodward (2010) "The burden of the nondiversifiable risk of entrepreneurship", American Economic Review, 100(3): 1163-94.

Karlan, Dean, Ryan Knight and Christopher Udry (2012) "Hoping to win, expected to lose: Theory and lessons on microenterprise development", Mimeo, Yale University.

Lopez-Acevedo, Gladys and Hong Tan (2011) Impact evaluation of small and medium enterprise programs in Latin America and the Caribbean. The World Bank.

McKenzie, David (2012a) "Improving funding of impact evaluations: end the fiscal year and other rules that have outlived their usefulness", Development Impact blog post, June 24, http://blogs.worldbank.org/impactevaluations/node/829

McKenzie, David (2012b) "Beyond baseline and follow-up: the case for more T in experiments", Journal of Development Economics, 99(2): 210-21.

McKenzie, David (2011a) "Should development organizations be hunting gazelles", All About Finance blog post, December 8, http://blogs.worldbank.org/allaboutfinance/should-development-organizations-be-hunting-gazelles

McKenzie, David (2011b) "How can we learn whether firm policies are working in Africa? Challenges (and solutions?) for experiments and structural models", Journal of African Economies, 20(4): 600-25.

McKenzie, David (2010) "Impact assessments in Finance and Private Sector Development: What have we learned and what should we learn?", World Bank Research Observer, 25(2): 209-33.

McKenzie, David and Christopher Woodruff (2012) "What are we learning from business training and entrepreneurship evaluations around the developing world", World Bank Policy Research Working Paper.

Mullainathan, Sendhil (2009) "Solving social problems with a nudge", TED India Talk, http://www.ted.com/talks/sendhil_mullainathan.html

Phillips, David (2001) "Implementing the market-based approach to enterprise support: An evaluation of ten matching grant schemes", World Bank Policy Research Working Paper no. 2589.

Phillips, David (2002) "The market-based approach to enterprise assistance—an evaluation of the World Bank's market development grant funds", Small Enterprise Development, 13(1): 26-37.
Rasmussen, Ole Dahl, Nikolaj Malchow-Møller, and Thomas Barnebeck Andersen (2011) "Walking the talk: the need for a trial registry for development interventions", Journal of Development Effectiveness, 3(4): 502-19.

Ravallion, Martin (2009) "Should the randomistas rule?", The Economists' Voice, www.bepress.com/ev, February 2009.

Table 1: Example of Matching Grant Activities in One of the WB-Funded Projects

Firms' activities by category | Number of activities | Proportion of total approved grant amount (USD)
Employee training | 116 | 30%
Websites and e-commerce | 58 | 15%
Quality certification | 15 | 13%
Trade fair participation | 33 | 10%
IT systems | 30 | 8%
Design of promotional materials | 52 | 6%
Improvement of production efficiency | 9 | 5%
Business plan | 12 | 5%
Market research | 6 | 3%
Short-term management contracts | 7 | 3%
Product development research | 2 | 1%
M&A, partnerships, investors' search | 7 | 1%
Distribution systems | 1 | 1%
Packaging design | 3 | 0%
Total | 351 |

Table 2: Planned Evaluations of Matching Grant Programs in Africa

Country | Project | Match (%) | Total grant amount | Expected grant size | Anticipated number of grants | Decision meeting | Board approval | Effectiveness
Cape Verde | SME Support & Economic Governance Project | 50-75% | $860,000 | $2,000-10,000 | 280 | Oct-09 | Apr-10 | Nov-10
Ethiopia | Sustainable Tourism Development Project | 50% | $3 million | $50,000 | 60-100 | Mar-09 | Jun-09 | Feb-10
Malawi | Business Environment Strengthening Technical Assistance | 50% | $2 million | $5,000-10,000 | 200-400 | Mar-07 | May-07 | Oct-07
Mauritius | Manufacturing & Services Development & Competitiveness Project | 50% | $8 million | $12,000-20,000 | 500-670 | Oct-08 | Jan-10 | Mar-10
Mozambique | Competitiveness and Private Sector Development Project | 50-70% | $4.16 million | $7,500 | 500 | Oct-08 | Feb-09 | Oct-09
South Africa | Black Business Supplier Development Programme (BBSDP) | 50-80% | $17 million | $14,000 | 1,200 | NA | NA | Sep-10
South Africa | Small Enterprise Development Agency (SEDA)* | 70-90% | $1.6 million* | $1,250 | 1,300* | ongoing since 2004
Notes: * denotes per year. The South African projects were funded by the Department of Trade and Industry, while the other country projects were all funded through World Bank loans.

Table 3: Summary of Causes Behind Inability to Conduct Randomized Experiments

[The table marks, for each of projects A through G, which proximate causes applied (project implementation delay; government decided not to randomize; low take-up) and which underlying problems applied (political economy or capture; funding cycle inflexibility; eligibility criteria too strict; project implementation unit incentives; "last mile" delivery), as well as whether the project was ultimately cancelled. Only Project A was ultimately cancelled; each project was affected by several of the listed causes.]
Notes: Projects A through G indicate the seven different proposed randomized trials and are listed in random order. "Government decided not to randomize" includes cases where there was an initial agreement to randomize, followed by a later change.

Figure 1: A Relaxation of Eligibility Criteria Increased Applications

[Bar chart of the number of applications received in each of 12 months, showing a sharp increase after the change in eligibility criteria.]
Notes: the figure shows the numbers of applications to the matching grant program. The change in eligibility criteria was from a minimum turnover of 60 thousand USD to a minimum turnover of 30 thousand USD.

Figure 2: Interested and Eligible Businesses Based on Business Awareness Study

[Of the businesses surveyed, 61% were not interested and 39% were interested; of the full sample, 33% were interested but ineligible and only 6% were both interested and eligible.]
Notes: the figure is based on a survey of 209 businesses in one of the seven countries conducted one year after launching the matching grant program.
Figure 3: Number of Firms Applying, Getting Approvals, and Completing Activities

Applications: 564 | Due-diligence completed: 417 | Approved or under review: 252 | Approved: 165 | Completed: 51
Notes: the figure shows the number of firms at each stage for a matching grant program in one of the seven countries. On average, each firm applied for circa 2 activities.