Policy Research Working Paper 8877

No Household Left Behind
Afghanistan Targeting the Ultra Poor Impact Evaluation

Guadalupe Bedoya
Aidan Coville
Johannes Haushofer
Mohammad Isaqzadeh
Jeremy Shapiro

Development Economics
Development Impact Evaluation Group
June 2019

Abstract

The share of people living in extreme poverty fell from 36 percent in 1990 to 10 percent in 2015 but has continued to increase in many fragile and conflict-affected areas where half of the extreme poor are expected to reside by 2030. These areas are also where the least evidence exists on how to tackle poverty. This paper investigates whether the Targeting the Ultra Poor program can lift households out of poverty in a fragile context: Afghanistan. In 80 villages in Balkh province, 1,219 of the poorest households were randomly assigned to a treatment or control group. Women in treatment households received a one-off “big-push” package, including a transfer of livestock assets, cash consumption stipend, skills training, and coaching. One year after the program ended (two years after assets were transferred), significant and large impacts are found across all the primary pre-specified outcomes: consumption, assets, psychological well-being, total time spent working, financial inclusion, and women’s empowerment. Per capita consumption increases by 30 percent (USD 24 purchasing power parity, USD 7 nominal per month) with respect to the control group, and the share of households below the national poverty line decreases from 82 percent in the control group to 62 percent in the treatment group. Using modest assumptions about consumption impacts, the intervention has an estimated internal rate of return of 26 percent, excluding non-monetized improvements in psychological well-being, women’s empowerment, and children’s health and education. These findings suggest that “big-push” interventions can dramatically reduce poverty in fragile and conflict-affected regions.
This paper is a product of the Development Impact Evaluation Group, Development Economics. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The authors may be contacted at gbedoya@worldbank.org.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team

No Household Left Behind: Afghanistan Targeting the Ultra Poor Impact Evaluation

Guadalupe Bedoya*, Aidan Coville*, Johannes Haushofer, Mohammad Isaqzadeh, Jeremy Shapiro⊥

JEL Codes: J21, J22, O12, D13.

Keywords: Poverty, Big push, Labor Supply, Women’s empowerment, Fragility, Conflict and Violence.

*Corresponding authors: Bedoya, e-mail: gbedoya@worldbank.org and Coville, e-mail: acoville@worldbank.org; Development Economics Impact Evaluation, DIME, World Bank, Washington, DC. Princeton University. ⊥The Busara Center for Behavioral Economics.

The authors gratefully acknowledge the following people and organizations that supported the study. Aminata Ndiaye, Ahmed Rostom, Naila Ahmed and Guillemette Jaffrin led the World Bank-funded Access to Finance project which delivered the intervention.
MISFA staff, especially Bahram Barzin and Khalil Baheer and their team including Matin Ezidyar, Shafkat Shahriyar Bin Reza and Hashmat Mohmand implemented the program. Maria Camila Ayala, Thomas Escande, Gëzime Christian, Garima Sharma, Seungmin Lee, Shivang Mehta, Catalina Salas and Rebecca de Guttry provided excellent research assistance throughout the project. Nabila Assaf, Shubha Chakravarty, Simeon Djankov, Arianna Legovini, Katharine McKee, Ana Goicoechea, Nathanael Goldberg and Aminata Ndiaye provided valuable comments. Funding was provided by the DIME Impact Evaluation to Development Impact (i2i) fund, Knowledge for Change, and UK-DFID protracted forced displacement trust funds, the World Bank Afghanistan Country Management Unit and Finance, Competitiveness and Innovation Global Practice. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the view of the World Bank, its executive directors, or the countries they represent. The authors declare that they have no relevant or material financial interests that relate to the research described in this paper.

1. INTRODUCTION

One in ten people worldwide lives in extreme poverty. Despite significant achievements in economic growth, the benefits have been distributed unevenly across countries, with poverty becoming more rooted in countries affected by conflict, violence, and weak institutions. The share of the global poor living in fragile and conflict-affected countries increased from 14% in 2008 to 23% in 2015 and is expected to increase to 50% by 2030 (World Bank, 2018). In Afghanistan, the share of people living below the national poverty line increased from 38% in 2011 to 55% in 2016 (Afghanistan Living Conditions Survey, ALCS, 2016).
Identifying policies that reduce the growing gap in fragile and conflict-affected areas is therefore critical for reducing poverty.1

The poor face multiple constraints that reinforce their socioeconomic status. Low levels of human capital endowments and limited access to productive inputs constrain their self-employment and wage labor opportunities. They are frequently exposed to uninsured risks, both man-made and natural, that are particularly acute in fragile and conflict settings (Dercon, 2008). Persistent poverty and conflict also place a cognitive load on individuals that impairs decision-making and may expose households to further economic duress (Mani et al., 2013; Haushofer & Fehr, 2014; Mullainathan and Shafir, 2013).

The ultra-poor in Afghanistan face many of these constraints simultaneously. In the target group for this study,2 five in six households have an illiterate household head. Four in five households live below the Afghan National Poverty Line of USD 30 (nominal) per person per month (USD 112 PPP).3 Just 1.5% of households save anything, while two-thirds of households are in debt. Indicators for women paint an even starker picture: less than 4% of primary women in the household can read and write, two in three of these women are depressed, and just over half of eligible girls attend school.

These multiple constraints may give rise to poverty traps, i.e., stable equilibria from which it is difficult to escape unless multiple constraints are relieved simultaneously. For example, human capabilities (e.g., skills) and nonhuman capabilities (e.g., capital) could be complements, and if both are below what is needed for a non-poor equilibrium, cash or other forms of nonhuman capital alone may not reduce poverty.
In such cases, multi-dimensional interventions would be needed to reduce persistent poverty (Rosenstein-Rodan, 1943; Murphy et al., 1989; Azariadis and Stachurski, 2005; Barrett et al., 2019; Buera, Kaboski and Shin, 2019). Although the theoretical conditions that generate poverty traps or trap-like outcomes are well studied, rigorous tests of the predictions of poverty trap models have been constrained by the lack of appropriate data and exogenous variation.

1 World Bank Fragility and Conflict and Violence Overview. Accessed on April 29, 2019 from http://www.worldbank.org/en/topic/fragilityconflictviolence/overview.
2 Statistics are derived from the 2016 baseline survey.
3 Throughout the document, monetary amounts are reported in nominal and purchasing power parity (PPP)-adjusted USD terms. The latter is set at 2018 prices using the Afghanistan CPI and PPP conversion factor from the IMF, unless otherwise stated. Figures in current USD are converted at the IMF exchange rate for the year the data were collected: 2016 (for baseline data), 2017 (for implementation), and 2018 (for follow-up data). The exchange rates used are: 1 USD = 67.87 AFN (2016), 1 USD = 68.03 AFN (2017), and 1 USD = 72.08 AFN (2018). All tables report PPP-adjusted amounts only.

This study contributes to a growing body of evidence aiming to understand how multi-faceted interventions can help reduce persistent poverty. We test whether a “big-push” intervention called the “Targeting the Ultra Poor” (TUP) program can reduce poverty in one of the most difficult settings in the world, Afghanistan, when most recipients are women. By providing a time-limited package that combines a large investment in a productive asset, access to savings accounts, temporary cash support, skills training, coaching, and other complementary services related to education and health, the TUP aims to lift ultra-poor households out of poverty.
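As an illustration of the currency conventions in footnote 3, the following minimal sketch converts an AFN amount to nominal USD using the three exchange rates quoted there. The function names and the 10,000 AFN example amount are hypothetical and used only for illustration. The sketch also computes the ratio between the reported PPP and nominal poverty-line figures; that ratio is merely implied by the two rounded figures (it combines the CPI and PPP adjustments) and is not the IMF conversion factor itself.

```python
# Exchange rates quoted in footnote 3 (AFN per 1 USD), keyed by data-collection year.
AFN_PER_USD = {2016: 67.87, 2017: 68.03, 2018: 72.08}


def afn_to_nominal_usd(amount_afn: float, year: int) -> float:
    """Convert an AFN amount to nominal USD at that year's exchange rate."""
    return amount_afn / AFN_PER_USD[year]


def implied_ppp_ratio(usd_ppp: float, usd_nominal: float) -> float:
    """Ratio of a reported PPP figure to its reported nominal counterpart.

    This is only what the two rounded figures imply; it is NOT the IMF
    conversion factor used by the authors.
    """
    return usd_ppp / usd_nominal


# National poverty line as reported: USD 112 PPP vs. USD 30 nominal per person per month.
poverty_line_ratio = implied_ppp_ratio(112, 30)

# Hypothetical amount, for illustration only: 10,000 AFN recorded at follow-up (2018).
example_usd = afn_to_nominal_usd(10_000, 2018)

print(f"Implied PPP/nominal ratio at the poverty line: {poverty_line_ratio:.2f}")
print(f"10,000 AFN in 2018 is about USD {example_usd:.2f} nominal")
```

Keying the rates by year mirrors the footnote's rule that each figure is converted at the rate for the year in which the underlying data were collected.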
We assess the impact of the TUP program implemented in Balkh province in Afghanistan. In our experiment, 1,219 of the poorest households across 80 villages were randomly assigned through a public lottery to either a treatment or a control group. Women in treatment households received the one-off package, including a transfer of livestock (typically cows, and occasionally sheep and goats) worth approximately USD 1,312 PPP (USD 357 nominal), a consumption stipend of USD 54 PPP (USD 15 nominal) delivered in 12 monthly installments, skills training, access to savings accounts and savings encouragement, facilitation of access to health care services, and coaching through biweekly visits for one year. Control households did not receive any of the program components. We study the impact of the program on consumption, food security, assets, finance, time spent working, income and revenues, mental health, women’s empowerment, child health, and education. We measure these outcomes one year after the end of all program activities, and two years after the asset transfer.

We find that the TUP program causes significant and meaningful improvements in the well-being of ultra-poor households in the study villages across multiple dimensions. One year after the end of the program, labor choices of ultra-poor women have expanded, and the well-being of recipient households and their members has improved. Figure 1 presents a visual summary of the results in standard deviations (SDs), which allow for comparisons across different outcomes. Per capita monthly consumption increases by 30% or USD 24 PPP (USD 7 nominal) with respect to the control group, resulting in significant improvements in an index of food security (0.49 SD). Psychological well-being improves for both the primary man and primary woman (0.26 SD and 0.58 SD, respectively). The share of households below the national poverty line decreases by 20 percentage points from 82% in the control group.
4 This figure is estimated with a methodology consistent with the national poverty line estimate.

Household savings increase by 2,195% (USD 106 PPP; USD 31 nominal), and indebtedness decreases by 53% (USD 733 PPP; USD 211 nominal).

Impacts are driven by an increase in income from livestock, due to the asset transfer, and a concomitant increase in women’s labor participation by 22 percentage points. The intervention increases time in income-generating activities for both the primary woman and man: The total time spent working of the primary woman increases by 55%, driven predominantly by more time on livestock-related self-employment activities. Time devoted to productive activities by the primary man increases by 14%, driven exclusively by livestock activities. Women’s empowerment improves, with an index of 6 indicator groups increasing by 0.38 SD. The program also improves child health and education outcomes: the under-five diarrhea rate decreases by 8 percentage points from 51%, and school enrollment increases by 6 percentage points. We estimate that the benefits of the program are likely to exceed the cost: A calibration exercise suggests a benefit-cost ratio of 2.3 and an internal rate of return of 26%, which is large compared to existing TUP studies.

Figure 1. Summary TUP Impacts Across Main Outcome Groups
Notes. The figure summarizes all primary and secondary treatment effects. Effect sizes are presented in standard deviations. Details on the outcomes included in the indices are reported in Table A1 of the Online Appendix.

Our study makes three main contributions. First, we find positive and significant impacts of the TUP program in a fragile and conflict setting. Existing evidence on similar TUP programs comes largely from stable contexts.5 In addition, the impacts reported in the Afghanistan TUP are the largest of any of the TUP pilot programs evaluated so far.
For example, the increase in consumption of 30% is larger than the 18%-21% increase in Ethiopia, India and Bangladesh three to four years after the asset transfer (Banerjee et al., 2015; Bandiera et al., 2017). These results point to the potential of this multi-faceted intervention to reduce poverty even in extremely challenging settings.

Second, the intervention was successful in improving women’s labor participation in a context where gender gaps in access to assets and inputs, and discrimination in paid employment, are the norm. The results suggest that the program gives previously under-employed women economic opportunities in a context with important constraints to women’s labor participation. In addition, our study demonstrates that transfers to women can have large effects, in contrast to some evidence suggesting limited impacts on women’s business outcomes from cash transfers (de Mel et al., 2008). These results, together with the improvements in women’s empowerment, point to the TUP as a program that can reduce gender gaps as well as achieve its overall objective of reducing extreme poverty, particularly in fragile and conflict-affected areas.

Third, these results add to a small but growing literature showing that addressing multiple constraints simultaneously can catalyze productive investments and reduce persistent poverty in a cost-effective way. The impacts on income and consumption observed here support the potential of multi-faceted interventions to generate long-term reductions in poverty (Banerjee et al., 2015; Bandiera et al., 2017; Blattman et al., 2014; Blattman et al., 2016) and, to a more limited extent, improvements in women’s empowerment (Bandiera et al., forthcoming). However, evidence from extremely fragile settings such as Afghanistan remains scarce.
Our results are consistent with the limited evidence on interventions aiming to generate impacts in poor and fragile states, which suggests that injections of capital can stimulate self-employment and raise long-term earning potential, often when implemented together with complementary interventions (Blattman and Ralston, 2015). In contrast, it is unclear that interventions targeting one mechanism can produce similar impacts on poverty. For TUP programs, Banerjee et al. (2018) study whether providing only a transfer of a productive asset or access to savings are each sufficient on their own to replicate the generated impact of the multi-faceted TUP program in Ghana. None of the interventions were able to replicate these results.

5 The results from a TUP program in the Republic of Yemen are forthcoming and will add to the evidence of the program in fragile and conflict-affected settings.

In more stable contexts, cash transfer programs have been successful in reducing poverty and increasing investments in education and health (Fiszbein and Schady, 2009; Macours and Vakis, 2019; Araujo, Bosch, and Schady, 2019; Baird et al., 2011, 2013; Haushofer & Shapiro, 2016). However, existing evidence suggests they may only have a modest ability to increase incomes of the recipients and their standard of living after the cash transfer stops (Ikegami, Carter, Barrett, and Janzen, 2019; Araujo, Bosch, and Schady, 2019). Similarly, conditional cash transfer programs can be an effective tool to increase investments in education and health while reducing current poverty, but evidence is mixed on whether their impacts persist (Fiszbein and Schady, 2009; Macours and Vakis, 2019; Araujo, Bosch, and Schady, 2019). They seem unlikely to move recipients from one equilibrium to another one (Kraay and McKenzie, 2014).
Relatedly, microfinance interventions have not produced permanent increases in consumption or income that can support long-term reductions in poverty (Buera, Kaboski, and Shin, 2019; Banerjee et al., 2014). Taken as a whole, the existing evidence is consistent with our findings that a big-push approach may be appropriate to generate meaningful changes for ultra-poor households in conflict settings.

This paper proceeds as follows. Section 2 describes the economic constraints facing ultra-poor households in our sample. Section 3 describes the TUP program. Section 4 lays out the design and methods. Section 5 presents the results. Section 6 presents a cost-benefit analysis of the program, and Section 7 describes the study limitations. Section 8 concludes.

2. SOCIOECONOMIC CONDITIONS OF THE ULTRA POOR

We study ultra-poor households in four districts of Afghanistan’s Balkh province. To identify these households, the poorest villages in the province were initially chosen through a qualitative assessment by the program implementer, a government-owned entity. After this selection, a Participatory Rural Appraisal (PRA) was conducted, including a community poverty wealth ranking and physical verification of the program’s eligibility criteria, resulting in 1,219 ultra-poor households in 80 study villages (details of this process are reported in Section 4).

In addition to the main TUP sample, we selected approximately 20 households from each study village, randomly drawn from the PRA population census list (excluding TUP-eligible households), to provide a representative benchmark for the TUP sample. Baseline and follow-up data were collected from 1,680 households using this approach, which we refer to as the non-ultra-poor (non-UP) sample. For simplicity, we use the abbreviation UP to refer to the ultra-poor for the remainder of the paper.
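The construction of the non-UP benchmark sample described above (a random draw of roughly 20 households per village from the PRA census list, excluding TUP-eligible households) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used in the field: the function name, the household identifiers, the fixed seed, and the example village of 300 households with 15 eligible households are all hypothetical.

```python
import random


def draw_benchmark_sample(census_ids, tup_eligible_ids, k=20, seed=0):
    """Randomly draw up to k non-eligible households from one village's census list.

    Hypothetical helper: the paper does not describe the software or the
    random-number procedure that was actually used.
    """
    eligible = set(tup_eligible_ids)
    # Exclude TUP-eligible households before sampling.
    pool = [hh for hh in census_ids if hh not in eligible]
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    # Sample without replacement; small villages may have fewer than k candidates.
    return rng.sample(pool, min(k, len(pool)))


# Hypothetical village: 300 households on the census list, 15 of them TUP-eligible.
census = [f"hh{i:03d}" for i in range(300)]
eligible = census[:15]
sample = draw_benchmark_sample(census, eligible)

print(len(sample), "non-UP households drawn")
```

Sampling without replacement from the non-eligible pool keeps the benchmark representative of the village population outside the TUP-eligible group.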
Poverty and Socioeconomic Conditions of UP and non-UP Households at Baseline

Overall, the PRA was successful in identifying the poorest households: 80% of the households identified as UP are below the 2016 national poverty line of USD 112 PPP per person per month (USD 30 nominal), compared to 57% in the non-UP sample, and 55% at the national level for the same year (ALCS, 2016). As Panel A in Table 1 shows, 20% of UP households are women-headed, compared to 5% in the non-UP sample and 0.3% in Afghanistan.6

UP households are worse-off than non-UP households across all dimensions analyzed: Illiteracy is exceptionally high for both groups but much higher for UP primary women (96%) than non-UP (90%), while the national average is 80% for adult women. Illiteracy rates for primary men are also high at 84% and 73% for UP and non-UP households, respectively, much higher than the national average of 51% for adult men.7 School enrollment is 53% for girls and 59% for boys aged 6-19 in UP households, compared to 54% and 64%, respectively, in non-UP households. Supply constraints may contribute to this: As Table 2 shows, around half of the villages have a primary school (56%) or a secondary school (48%).

UP households own significantly fewer assets than non-UP households, and low levels of financial inclusion and high indebtedness are the norm: Having any savings in both groups is almost non-existent (2%), while UP households are more likely to be indebted (68%) than non-UP households (52%). Most of the debt for UP households is for consumption smoothing and health shocks (89%), rather than investment (4%), and comes from informal sources such as family and friends or grocery stores (88%). Supply-side constraints may partially explain the low coverage of formal institutions, with only 5% of villages reporting the presence of a microfinance institution, and none reporting the presence of a bank (Table 2).
UP households also report consistently lower levels of psychological well-being than non-UP households. UP primary women report lower life satisfaction than women in non-UP households (5.0 vs. 6.7 points on a 1-10 scale, where 1 indicates very unsatisfied and 10 indicates very satisfied). Using standard cutoffs for depression on the Center for Epidemiological Studies Depression (CES-D) scale, 69% of women in UP households and 52% of women in non-UP households report suffering major depression. For reference, UP women would rank last on life satisfaction in the list of 60 countries for which data are available in the 2016 World Values Survey. Non-UP women would rank 39th in the same list.8 UP primary men also report high levels of depression, but much lower than women in the same households: 58% of primary men, 11 percentage points fewer than UP women, would be classified as depressed (not shown).

These results confirm that UP levels of consumption, human capital, asset ownership and psychological well-being are significantly lower than those of other households in their community and among the lowest in the world, indicating that the UP experience multidimensional poverty in both a relative and an absolute sense.

6 In this section, all figures for Afghanistan come from the ALCS 2016 and global indicators come from the World Bank Open Data Indicators, unless otherwise stated.
7 For reference, 17.3% of adult women and 10.2% of adult men worldwide are illiterate. Afghanistan and world figures are estimated for adults 15 years and over for 2016.
8 Comparison is done with the life satisfaction rating for all women in the sample of 60 countries.

Labor Markets for UP and non-UP Households

Afghanistan has one of the world’s lowest employment-to-population ratios at 41, and 21% of the working population are considered underemployed (working less time than they are willing to). Women’s labor force participation is low at 27%, and women’s unemployment extremely high at 41%.
Therefore, the context is one with limited labor opportunities, particularly for women.

Using baseline data on labor participation and labor activities directly from the primary woman in UP and non-UP households and primary man in UP households, we find that engagement in income-generating activities is similar to the national average in our study villages:9 31% of UP primary women in the control group (UP women) and 25% of non-UP primary women engage in income-generating activities (Panel B in Table 1). These activities include self-employment such as livestock rearing, work in own agricultural and non-agricultural businesses, as well as paid jobs in agriculture, maid services, formal employment, and other activities. These figures hide high levels of underemployment, with women working few full-time days in a given month, and also significant differences in activities across socioeconomic groups. Figure 2 describes the main labor activities in these villages, by showing the share of hours UP women and men devote to them. This figure reveals that UP women devote a higher proportion of their time to “other paid jobs” than non-UP women and substantially less of their working time to own livestock rearing and other household businesses (23%) than non-UP primary women (58%). The relative time allocation for UP men resembles that of UP women; however, UP primary men are more than twice as likely to be engaged in income-generating activities compared to UP women (69% vs. 31%), and conditional on working, primary men work almost five times more hours than primary women, accounting for 14 vs. 3 full-time-day equivalents (not shown).10

9 For non-UP households these data were only collected at follow-up and reported by the primary woman.
We use data on control UP households (excluding treated UP households) and non-UP at follow-up to describe the labor market characteristics and analyze differences across socioeconomic conditions in absence of the intervention (UP control vs. non-UP) as well as across gender (UP men vs. UP women in the control group). Across each productive activity inside and outside the household, we asked for the number of days worked, hours per day worked, and corresponding earnings.
10 A full-time-day equivalent is defined as 8 hours of work.

Figure 2. Share of Hours per Activity, by Socioeconomic Status and Gender

Activity                             Non-UP Women   UP Women   UP Men
Own Livestock Rearing                36%            17%        9%
Own Agriculture or Other Business    22%            6%         7%
Paid Labor: Agriculture              8%             19%        31%
Formal/Salaried Employment           7%             2%         7%
Paid Labor: Other                    26%            57%        46%

Notes. The figure compares the share of time spent on different income-earning activities by gender (primary UP women vs. primary UP men) and across socioeconomic status (primary UP women vs. primary non-UP women). Results are derived using non-UP and UP control group survey data at follow-up in 2018. Other paid labor includes maid services, non-agricultural activities outside of the household, and other paid work.

For women, the highest earnings per hour are in formal/salaried work at USD 2.39 PPP per hour (USD 0.65 nominal), followed by own livestock rearing (USD 1.14 PPP; USD 0.31 nominal), agricultural paid labor (USD 0.98 PPP; USD 0.26 nominal), and other labor (USD 0.78 PPP; USD 0.21 nominal) (Figure 3).11 Since UP women spend the largest share of time working in the lowest paid activity (other paid jobs), this implies a much lower return on their time spent in income-generating activities than non-UP women. In addition, formal employment is accessible to only a few women due to low education levels. Finally, Figure 3 also reveals important differential returns by gender across activities within UP households.
Primary men earn more per hour in all activities outside the household: For other paid labor (the activity with the highest share of time spent working), the gender gap is the largest, where women receive USD 0.38 for every USD 1 men receive for returns to their work in this activity. For agricultural activities outside the household, women earn USD 0.53 for every USD 1 men receive, and USD 0.83 for every USD 1 men receive for formal/salaried employment.

Why do UP households, and women in general, not allocate their time (or more time) to the activities with the highest earnings? One reason may be limited access to inputs and assets (e.g., high-return livestock) and lower human capital endowments (needed, e.g., for salaried/formal employment) among UP women. For instance, more non-UP than UP households own cows (28% vs. 9%) and fewer own chickens (26% vs. 40%). Non-UP households also hold a larger number of animals per type (not shown). The TUP program aims to relax these constraints.

11 Results reflect hourly earnings for the main activity groups over all individuals with non-missing earnings and positive hours in these activities. For livestock rearing we compute hourly earnings dividing household total earnings (revenue minus input costs) by the hours worked for all adult members in the household for the last four weeks. We did not collect as detailed data for agricultural businesses, with more seasonal revenues and costs.

Figure 3. Earnings per Hour for Main Income-generating Activities, by Gender (USD PPP per hour)

Activity                        Men     Women
Paid Labor: Other               2.03    0.78
Formal/Salaried Employment      2.89    2.39
Paid Labor: Agriculture         1.85    0.98
Own Livestock Rearing (HH)      1.14 (household level)

Notes. This figure is derived using non-UP and UP control group survey data at follow-up in 2018. It presents earnings per hour in different activities by gender.
Other paid labor includes maid services, non-agricultural activities outside of the household, and other paid work. All monetary amounts are PPP-adjusted USD terms, set at 2018 prices using the Afghanistan CPI and PPP conversion factor from the IMF, unless otherwise stated.

3. THE TUP PROGRAM

The TUP program combines the transfer of a productive asset with structured training, mentoring, a basic cash stipend, and other complementary services for a defined period. The original TUP program was designed and implemented by BRAC in Bangladesh. The program targets UP women who can manage an enterprise but have no productive assets in the household and are not connected to a microfinance institution. The aim is to help them move out of extreme poverty and ultimately be able to engage in formal financial opportunities.

The Afghanistan TUP program we study here was implemented under the World Bank-supported “Access to Finance” program. The program covered six provinces in Afghanistan between 2015 and 2018, supporting 7,500 households. The impact evaluation focuses on Balkh province, where 1,500 households were supported under the program. The intervention was implemented by the “Microfinance Investment Support Facility for Afghanistan” (MISFA), an independent apex organization, owned by the government, which supports a number of partners in implementing social development activities, including microfinance and TUP programs.12 In the study villages, the implementation was conducted by “Coordination for Humanitarian Assistance” (CHA), a local NGO, following a standard process.

To identify UP households, the program included village- and household-level selection processes. Program staff first qualitatively identified the poorest villages in the province, subject to the availability of veterinary services, financial institutions and social services, and to being secure and accessible.
Once the villages were selected, a Participatory Rural Appraisal (PRA) was conducted to identify poor households. First, households were gathered in a community meeting place. A representative from the implementing NGO, together with a member of the Community Development Council (village leadership committee), led a community wealth ranking exercise that assigned all households in the village to the categories “well-off”, “better-off”, “poor”, and “ultra-poor”. The exercise was rescheduled if fewer than 70% of households were present. Disputes were facilitated during the meeting. The final “ultra-poor” list was verified by the NGO through a short survey, and this was followed by a final verification by MISFA of the eligible households submitted by the NGO. The final selection of TUP recipients was based on meeting at least three of the following six criteria, checked during the verification:

1. Household is financially dependent on women’s domestic work or begging;
2. Household owns less than 20 decimals (800 square meters) of land or is living in a cave;
3. Targeted woman is younger than 50 years of age;
4. There are no active adult men income earners in the household;
5. Children of school age are working for pay; and
6. Household does not own any productive assets, based on a pre-defined list used by MISFA.

Once identified, the UP households received the following program components:

1. Transfer of a productive asset in the form of livestock (e.g., cows, goats);
2. A monthly cash transfer/stipend (USD 54 PPP or USD 15 nominal per month for 12 months);
3. Basic training on livestock rearing and entrepreneurship;
4. A “health subsidy” which includes the provision of a basic hygiene kit and reimbursement of up to USD 81 PPP or USD 22 nominal for medical expenses or latrine improvements;
5. Fortnightly “mentoring visits” by social organizers, and veterinary services to:
   a. Evaluate the asset and related outputs and recommend follow-up actions.
Depending on the evaluation of the asset, additional support in the form of food supplements or an asset replacement was an option for the program participants; and b. Promote activities encouraging improved behavior across a range of dimensions (health, education, women’s empowerment, financial inclusion, and social cohesion/community support) by providing advice and linking households directly to education, health, and financial institutions where appropriate. This included helping households to apply for national ID (Tazkira) cards if they did not yet have one.

12 MISFA is established as a limited liability non-profit company whose sole shareholder is the Ministry of Finance of the Islamic Republic of Afghanistan.

The program applied a sequenced approach. The recipient received support for their livelihood selection, which included an intensive and repeated consultation between MISFA, the partner field staff, and the UP women, so that participants could make an informed choice among different enterprise options. The livestock asset was worth USD 1,136 – 1,488 PPP (USD 309 – 405 nominal) at time of delivery and was replaced if it became sick or died. The consumption stipend aimed to support the household with basic food needs, initially to replace the potential forgone income or productive time that the TUP recipient spent learning about and initiating their business rather than attending to their usual duties. Once the household began work on their enterprise, they received follow-up visits on a biweekly basis from program staff to provide guidance on both business and social issues. After 12 months of continued support, households were assessed on a set of performance indicators and formally “graduated,” after which no further support was provided through the program.
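The selection rule described earlier in this section (a household qualifies if it meets at least three of the six verification criteria) can be illustrated with a minimal sketch. The field names and the `Household` record are hypothetical and chosen only for illustration; they are not taken from the MISFA or CHA verification instruments, and the pre-defined list of productive assets is reduced here to a single yes/no flag.

```python
from dataclasses import dataclass


@dataclass
class Household:
    # All field names are hypothetical illustrations of the six criteria.
    depends_on_womens_domestic_work_or_begging: bool
    land_decimals: float
    lives_in_cave: bool
    targeted_woman_age: int
    has_active_adult_male_earner: bool
    school_age_children_work_for_pay: bool
    owns_productive_assets: bool


def criteria_met(h: Household) -> list:
    """Return one boolean per criterion, in the order listed in the text."""
    return [
        h.depends_on_womens_domestic_work_or_begging,
        h.land_decimals < 20 or h.lives_in_cave,
        h.targeted_woman_age < 50,
        not h.has_active_adult_male_earner,
        h.school_age_children_work_for_pay,
        not h.owns_productive_assets,
    ]


def is_eligible(h: Household, minimum: int = 3) -> bool:
    """A household qualifies when at least `minimum` criteria hold."""
    return sum(criteria_met(h)) >= minimum


example = Household(
    depends_on_womens_domestic_work_or_begging=False,
    land_decimals=5.0,
    lives_in_cave=False,
    targeted_woman_age=35,
    has_active_adult_male_earner=True,
    school_age_children_work_for_pay=False,
    owns_productive_assets=False,
)
print(is_eligible(example))
```

In this example the household meets the land, age, and productive-asset criteria, which is exactly the threshold of three.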
While the program is similar to the standard TUP program model, a few important differences exist (a cross-country comparison of intervention components is reported in Table A2 of the Online Appendix): 1. Coaching support lasts for 12 months instead of 18 – 24 months. 2. A health subsidy is included that does not exist in other programs. 3. The focus of asset provision is on cows rather than smaller livestock assets like goats, pigs and chickens. 4. Asset values transferred are larger than in other programs where data exist. The program variations were decided by MISFA based on earlier pilots conducted in Bamyan Province, where the program attributes were fine-tuned to address local constraints.

4. DESIGN AND METHODS

Experimental Design and Sample

We use a household-level randomized experimental design to estimate the causal effect of the intervention on socioeconomic outcomes of UP households. The evaluation sample comes from 80 villages in four districts of Balkh province (Dehdadi, Dawlatabad, Nahr-e Shahi and Khulm). The participatory wealth ranking was conducted in all study villages of Balkh province to identify the eligible population. This exercise started with a wealth ranking in 100 villages performed by CHA in partnership with village leaders, yielding a population of 26,957 households that were split into four community-defined categories: well-off (6%), better-off (16%), poor (34%), and ultra-poor (44%). This was followed by a verification survey, administered by CHA, to measure the six qualifying criteria for being eligible for the TUP program listed above. This removed 85% of the households identified as UP from the wealth ranking exercise. MISFA then completed a final verification exercise, removing 28% of this group whose initial eligibility was overturned due to reporting inconsistencies. This resulted in an eligible group of 1,235 households, or slightly under 5% of the population in our study area.
A baseline survey was conducted among all eligible households from February to April 2016, before a public lottery to identify TUP participants took place in May 2016. The randomization was stratified by PRA group by holding separate lotteries for each. Most villages (51) had one PRA group, while larger villages were split into multiple PRA groups, which were typically defined by the feeder area of a masjid (mosque). PRA groups with one TUP-eligible household were removed from the study due to the infeasibility of a lottery in these cases, effectively dropping 16 study households from 16 PRA groups. This resulted in 133 PRA groups and subsequent lotteries across our study sample. Then 1,219 UP households in 80 villages were randomly assigned into one treatment group (491 households) and one control group (728 households). Starting in May 2016, the treatment group received the full TUP package as described above, and the control group did not receive any of the components. The program lasted for 12 months from the time of asset transfer. A follow-up survey was conducted from July to October 2018, approximately 1 year after program completion and 2 years after the asset transfer. In this study, we present estimates of the impact of the program on UP households, comparing the treatment and control groups at the first follow-up. Figure 4 provides a graphical representation of how the final study sample was generated.13

13 To avoid the potential for data mining generating spurious relationships, we registered the trial in the American Economic Association Randomized Controlled Trial registry and prepared a pre-analysis plan, pre-specifying all primary and secondary outcomes and measurement approaches: https://www.socialscienceregistry.org/trials/2665. All de-identified data will be made available on the World Bank’s Microdata Catalog and coding files used to generate the results in this report will be available on GitHub.
The study also has Institutional Review Board (IRB) clearance from Princeton University and the Ministry of Public Health in Afghanistan.

Figure 4. Breakdown of the Study Sample
[Flow diagram: 4 out of 15 districts selected in Balkh Province; 100 villages and 200 PRAs selected in study districts; 26,957 households included in wealth ranking exercise (ultra-poor: 11,859 (44%); poor: 9,121 (34%); better-off: 4,394 (16%); well-off: 1,583 (6%)); eligible after PMT: 1,953; eligible after MISFA verification: 1,235; included in lottery: 1,219 households in 80 villages and 133 PRAs; not included in lottery: 16*; control: 728, of which 689 (95%) surveyed in follow-up; treatment: 491, of which 458 (93%) surveyed in follow-up.]
Notes. The figure depicts the selection process that led to the final sample sizes and response rates. 16 households were not included in the lottery and assigned to treatment (*) because there was only 1 household eligible for the program in the PRA.

UP household survey: A comprehensive assessment at baseline (spring 2016) and a follow-up survey (summer 2018) were conducted to measure primary and secondary outcomes of interest for participating households in the treatment and control groups. This instrument was divided into two parts: the Lady of the Household survey (LHH, or primary woman; approximately 2 hours) and the Man Head of the Household survey (MHHH; approximately 45 minutes). The woman in the household with the most knowledge and decision power completed the LHH survey. When she was not also the head of the household, the MHHH survey was administered to the man head. Where no lady of the household existed, the men household heads completed the LHH survey modules. They are referred to in the analysis as the primary woman and the primary man, respectively.

Saliva collection/Cortisol tests: Saliva samples were collected to measure cortisol levels, a neurobiological marker of stress, at follow-up, using SaliCap tubes.
A single saliva sample was collected at the end of each survey from both the primary woman and the primary man (if there was one). All samples were kept at 4°C for no longer than necessary before freezing them at or below -20°C. The samples were then shipped to Dresden Lab Services (Germany), where they were assayed for cortisol using a standard chemiluminescent immunoassay.

Village and market surveys: Additionally, two shorter instruments were used: one for village leaders to record basic village-level data such as infrastructure, violence, and other village shocks, and the other at the district market level to collect data on food prices for calculating consumption expenditures from quantities. Field supervisors visited the largest market in each of the four districts and collected sales prices for all food items found in the household consumption survey. These instruments were administered in parallel with the household surveys.

Primary Outcome Measures

We generate primary outcome measures as follows: 1. Consumption: Total value of food for the last 7 days, and non-food expenditures in the past month, where non-food expenditures include personal and household items, education and medical expenses, household repairs, social expenses (e.g. weddings, funerals and other ceremonies), and temptation goods. Non-food item values are estimated by household respondents. Food values use the district market price from the market survey where it exists, or the median market price from the surveys conducted in the remaining districts otherwise. 2. Assets: An index generated using principal component analysis of number and type of productive and household assets (excluding land/property), a proxy for wealth following Filmer & Pritchett (2001). 3.
Financial inclusion index: A standardized index combining the following variables: (i) primary woman knows different places to save; (ii) anyone in the household has a formal savings account; (iii) household members can access formal credit if needed; (iv) household has saved in last 4 weeks; (v) household savings in last 4 weeks; (vi) household total savings. 4. Psychological well-being index: A psychological well-being index is computed separately for women and men as the standardized weighted average of scores on the Center for Epidemiologic Studies Depression (CES-D) 7-point scale (Radloff 1977), the World Values Survey (WVS) questions on happiness and life satisfaction, Cohen’s 4-item stress scale (Cohen, Kamarck, and Mermelstein 1983), and the log of cortisol levels obtained through saliva samples adjusted for confounders.14 We adapted some questions in these indicators to the social, cultural and religious norms in the country. For this we used a combination of extensive piloting, and topic and local expertise from research members with expertise in the area and in Afghanistan. 5. Women’s empowerment index: We consider two indices. The first is a standardized index combining indicator variables for whether the women’s decision was taken into consideration and/or followed on household finances (credit and savings) and expenditure decisions (food, household repairs, clothing, land, property and other high-value expenditures). This set is consistent with other studies and was pre-specified in our pre-analysis plan (PAP). The second is a standardized index with additional indicators on voice and agency. This broader index was not pre-specified, but was constructed to follow internationally agreed standards from the United Nations and Klugman et al. (2014).
This second set includes the original indicator plus five other dimensions: (i) women’s participation in decisions on children’s investments; (ii) women’s participation in decisions related to their own fertility, time use, and mobility to work outside home and open a business, as well as effective access to inputs and resources including ownership of a mobile phone and having savings, loans or financial assets in their name and separate from others; (iii) women’s participation in income-generating activities (participation in paid income-generating activities and being the owner or manager of a self-employment enterprise); (iv) aspirations for daughters (educational attainment and marriage age, and school enrollment for school-age girls); and (v) political involvement and social capital, including whether the primary woman has a Tazkira (ID) card; is a member of a political party; attended village meetings; and approached village leaders about a village issue. This last indicator set is usually included as a separate index in other studies and was originally a separate outcome in our PAP; however, based on the framework and internationally agreed indicator on empowerment we adopted, we include it as part of the broader women’s empowerment index. 6. Time spent working: Total time spent working in the past 7 days in all occupations (agriculture, livestock, household chores and other paid and unpaid work), separately for the primary man and woman of the household.

14 We take the residuals of an OLS regression of the log-transformed cortisol levels on dummies for having ingested food, tea, nicotine (through smoking a cigarette or tobacco, chewing tobacco, or using naswar), or medications in the two hours preceding the interview, for having performed vigorous physical activity on the day of the interview, and for the time elapsed since waking, rounded to the next full hour.
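The standardized indices among the primary outcome measures can be illustrated with a short sketch. It rests on assumptions that the text does not state: each component is oriented so that higher values are better, components are standardized against the control-group mean and standard deviation, and they are combined with equal weights before the average is standardized again. The function names are hypothetical, and the weighting used in the study may differ.

```python
from statistics import mean, stdev


def zscores(values, control_flags):
    """Standardize values using the control-group mean and standard deviation."""
    control_values = [v for v, is_ctrl in zip(values, control_flags) if is_ctrl]
    mu, sd = mean(control_values), stdev(control_values)
    return [(v - mu) / sd for v in values]


def standardized_index(components, control_flags):
    """Equal-weight index (an assumption, not the authors' stated weighting).

    components: list of equal-length lists, one list per component variable.
    control_flags: list of booleans, True for control-group households.
    """
    z_by_component = [zscores(col, control_flags) for col in components]
    # Average the component z-scores household by household.
    raw_index = [mean(row) for row in zip(*z_by_component)]
    # Re-standardize so the control group has mean 0 and SD 1.
    return zscores(raw_index, control_flags)


# Toy data: two components, three control and three treatment households.
components = [[1, 2, 3, 4, 5, 6], [2, 1, 4, 3, 6, 8]]
control_flags = [True, True, True, False, False, False]
index = standardized_index(components, control_flags)
print([round(v, 2) for v in index])
```

With this construction a treatment effect on the index is read directly in control-group standard deviations, which is how the effects in SD units are usually interpreted.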
Design Integrity

Baseline Balance

We test for baseline balance using the same empirical specification used for the outcome analysis, which compares baseline treatment and control households, controlling for randomization stratification (133 PRA dummy variables) to increase precision. Table A3 of the Online Appendix presents these comparisons. We find no statistically significant or economically meaningful differences across groups for any of the main outcome indicators at baseline. Across 23 variables that could influence our primary outcomes of interest, we observe statistically significant differences for land ownership and engagement in livestock activities at the 5% level. Treatment households are 2 percentage points less likely to engage in livestock production (control: 5%) and are significantly more likely to own land and/or dwellings based on a principal component score aggregating these assets. While the differential livestock activity at baseline may result in an underestimate of treatment effects, the significantly higher land ownership could overestimate effect sizes if this is an important capital input for generating value from the program. To assess whether these imbalances influence any of the main results, we include these variables as covariates in a robustness regression for main outcomes, presented in the Online Appendix. Results with and without these covariates do not differ in sign or significance.

Compliance

We track monitoring data collected by CHA/MISFA on all TUP recipients throughout implementation and include an extensive compliance module in the follow-up survey of all treatment and control households. Table A4 of the Online Appendix summarizes the treatment compliance results, as reported in the survey. We find that the program was successful in delivering its components according to randomized assignment. At follow-up, 97% of treatment households report being aware of the TUP program, compared to 51% in the control group.
Among the aware households, nearly all of those in the treatment group (99.5%) report having received at least one of the program inputs, compared to only 3% in the control group. Respondents reported receiving livestock assets in 96% of treatment and 1.5% of control households. All other intervention components had similar levels of compliance. Independent monitoring data from CHA/MISFA report similar levels of compliance, with 98% of treatment households represented on their official record of households receiving support, compared to 0.1% of control households. The data thus suggest high compliance with randomization assignment and successful delivery of the program components. This is notable given the challenging implementation context.

Attrition

The lottery assigned 491 households to treatment and 728 households to control. The follow-up survey was successfully completed among 458 treatment households (93%) and 689 control households (95%). The difference in attrition rates across treatment and control groups is not statistically significant.15

Data analysis

Pre-analysis Plan

To present a comprehensive overview of impacts on a large set of variables while balancing this with the threat of false positives, we registered the trial and published a PAP before analysis began.16 The results presented in this report follow the variable construction and econometric specifications defined in the PAP, with a few exceptions. First, we include an index generated through principal component analysis of number and type of assets, rather than value of assets, since we did not collect asset values at follow-up due to a questionnaire design oversight. Second, we present both the pre-specified measure of women’s empowerment (aligned with other studies) and a measure that incorporates a broader set of empowerment dimensions to highlight the differences in results.
Third, conditional average treatment effects compare bottom, middle, and top thirds of the distribution for each respective outcome of interest, rather than median splits, to better understand the distribution. Beyond these deviations, all other analysis follows the PAP.

Econometric Specification

Since treatment compliance with randomization was not perfect, we use the intention-to-treat (ITT) estimator: simply, the difference between average outcomes across treatment and control groups at follow-up. The basic specification is:

\[ Y_i = \alpha + \beta T_i + \sum_{j=1}^{J} \gamma_j C_{ij} + \varepsilon_i \qquad (1) \]

Here, $Y_i$ is the outcome of interest for household $i$ at follow-up, and $T_i$ is a dummy variable equal to 1 if household $i$ is assigned to receive treatment and 0 otherwise. $\beta$ is the estimate of the average effect of the TUP intervention at follow-up. Since randomization is stratified by community, we follow Bruhn & McKenzie (2008) and include $C_{ij}$, which is a dummy variable equal to one if household $i$ comes from community/PRA group $j$ (of a total of $J = 133$ PRA groups). For household-level variables, standard errors are not clustered. Where data on multiple individuals within the same household are collected (e.g., school enrollment), standard errors are clustered at the household level.

15 We define response rates based on households with a complete LHH survey. We experienced an 88% response rate for the MHHH survey; however, all main outcomes can still be estimated without this survey.

16 https://www.socialscienceregistry.org/trials/2665

Quantile and Heterogeneous Treatment Effects

We estimate quantile treatment effects using the following specification:

\[ \Delta(\tau) = Q_{Y_1}(\tau \mid T = 1) - Q_{Y_0}(\tau \mid T = 0) \qquad (2) \]

where $Q_{Y_1}(\tau \mid T = 1)$ is the $\tau$-th quantile of potential outcomes under treatment. This specification assumes full compliance with the random assignment to estimate the treatment-on-the-treated effects. Given the nearly universal program compliance measured for the TUP, this approximation is justifiable to simplify analysis without risk of bias.
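Specifications (1) and (2) can be illustrated with a minimal sketch on toy data. It is not the authors' code: it assumes that every stratum contains both treated and control households, obtains the coefficient on the treatment dummy by demeaning the outcome and the treatment indicator within each stratum (numerically equivalent to including stratum dummies in OLS), and computes the quantile effect as a plain difference of empirical quantiles with linear interpolation. Function names and data are hypothetical, and standard errors are omitted.

```python
from collections import defaultdict
from statistics import mean


def itt_with_strata(y, t, strata):
    """Coefficient on treatment from OLS with stratum fixed effects."""
    members = defaultdict(list)
    for i, s in enumerate(strata):
        members[s].append(i)
    numerator = denominator = 0.0
    for idx in members.values():
        y_bar = mean(y[i] for i in idx)
        t_bar = mean(t[i] for i in idx)
        for i in idx:
            numerator += (y[i] - y_bar) * (t[i] - t_bar)
            denominator += (t[i] - t_bar) ** 2
    return numerator / denominator


def quantile(values, q):
    """Empirical quantile with linear interpolation, for q in [0, 1]."""
    xs = sorted(values)
    pos = q * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


def quantile_effect(y, t, q):
    """Difference between treated and control outcome quantiles."""
    treated = [yi for yi, ti in zip(y, t) if ti == 1]
    control = [yi for yi, ti in zip(y, t) if ti == 0]
    return quantile(treated, q) - quantile(control, q)


# Toy data: two strata ("a", "b") with two treated and two control units each.
y = [10, 12, 11, 15, 20, 23, 21, 26]
t = [0, 1, 0, 1, 0, 1, 0, 1]
strata = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(itt_with_strata(y, t, strata))
print(quantile_effect(y, t, 0.5))
```

The within-stratum demeaning makes explicit why stratum dummies matter: the estimate compares treated and control households only inside the same lottery.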
To measure differential impacts by baseline characteristics, we estimate Conditional Average Treatment Effects (CATE) with parametric specifications, due to the sample size and common support requirements of non-parametric specifications for subgroup analyses. At the risk of misspecification, the parametric assumptions increase power and allow for identification of effects without full common support. The specification we use for these estimates is:

\[ Y_i = \alpha + \beta T_i + \sum_{k=1}^{2} \delta_k X_{ki} + \sum_{k=1}^{2} \theta_k \,(T_i \times X_{ki}) + \sum_{j=1}^{J} \gamma_j C_{ij} + \varepsilon_i \qquad (3) \]

Here, $X_{ki}$ is a binary indicator for each of the baseline variables listed below, defined based on their empirical distribution: $X_{1i}$ ($X_{2i}$) identifies the middle (top) third of the baseline distribution in our analysis. The parameters of interest are then $\beta$ for the CATE on the bottom third, $\beta + \theta_1$ for the CATE on the middle third, and $\beta + \theta_2$ for the CATE on the top third of the baseline distribution. We measure CATEs using consumption, assets PCA, the psychological well-being index, and psychological traits associated with an “entrepreneurial personality”.17

Multiple Hypothesis Testing

The analysis covers multiple outcomes, which increases the likelihood of generating false positives. In addition to using a PAP to guide analysis and computing index variables as described above, all confidence intervals for our primary outcomes (including quantile and conditional average treatment effects) also control for the family-wise error rate at the 95-percent level using the step-down bootstrap algorithm of Romano and Wolf (2010). We present the adjusted p-values {in braces}. For all other outcomes, we report naïve p-values [in brackets].

17 We use six main psychological traits collected: impulsiveness, tenacity, polychronicity (multitasking), locus of control, achievement, and power motivation, following Mel et al. (2009).

5. RESULTS

We find large and statistically significant effects across all primary pre-specified outcomes.
The primary goal of the TUP program, increasing consumption, is achieved one year after the conclusion of the program: per-capita consumption in the treatment group increases by 30% relative to the control group, an index of food security increases by 0.49 SD, and indices for livestock assets and household assets increase by 1.06 SD and 0.36 SD, respectively. These impacts are achieved through a 55% increase in total time spent working of the primary woman, driven predominantly by more time on livestock-related self-employment activities. Livestock-related household revenues and savings increase, and household indebtedness decreases. We also observe improvements across multiple well-being indicators for individual members of the recipient households, including psychological well-being, child health and education, and women’s empowerment. An index of psychological well-being increases by 0.58 SD and 0.26 SD for the primary woman and man in the household, respectively. Child health, measured by diarrhea in the oldest child under five, improves: the rate of diarrhea in the past two weeks falls by 16%. School enrollment of school-age boys and girls increases by 4.6 and 7.2 percentage points, respectively. An index of women’s empowerment including indicators for agency, economic opportunities, aspirations for daughters and political involvement increases by 0.38 SD. Figure 1 presents a visual summary of these impacts in standard deviations. These results are consistent with the underlying theory, and international evidence, that the TUP program helps UP households overcome multiple constraints simultaneously and provides a “big push” to improve their well-being and possibly put them on a path out of extreme poverty. It is worth emphasizing that although these impacts are large in relative terms, they are modest in absolute terms because of the low baseline outcomes of the UP households in Afghanistan.
For instance, the increase in consumption is equivalent to USD 24 PPP (USD 7 nominal) per capita per month, and the increase in time spent working for the primary woman is 2.3 additional full-time equivalent days a month, compared to the 4.2 full-time days a month that women in the control group spend working. In the next sections we discuss the variable-by-variable results for each outcome group, illustrating which variables drive the results within each group and thus suggesting potential mechanisms. We elaborate further on what these results mean in economic terms.

Results by Outcome Group

Consumption

Table 3 presents the impact on monthly per capita consumption and poverty. Monthly per capita consumption increases by USD 24 PPP (USD 7 nominal, FWER-corrected p-value < 0.0005) or 30% of the control group mean of USD 81 PPP (USD 22 nominal). This increase is explained almost entirely by an increase in food consumption. Total monthly per capita food expenditure increases by USD 21 PPP (USD 6 nominal, p-value < 0.0005), which represents an increase of 40% relative to USD 53 PPP (USD 14 nominal) in the control group (not shown). Non-food expenditure increases are small at USD 3 PPP (USD 1 nominal, p-value 0.274) and not statistically significant (not shown). The impact on consumption leads to a reduction in the share of households below the national poverty line by 20 percentage points from 82% in the control group (p-value < 0.0005).

Food Security

In line with increases in consumption, we find large impacts on a food security index, with an increase of 0.49 standard deviations (p-value < 0.0005). As Table 3 shows, we find significant improvements across all measures considered. The likelihood that all household members are regularly eating at least two meals a day increases by 11 percentage points (15%, p-value < 0.0005). The share of households where no adult skips or cuts the size of their meals increases by 23 percentage points (53%, p-value < 0.0005).
Similarly, we find 20 percentage points fewer households where children skip or cut the size of their meals (33%, p-value < 0.0005). Although we do not directly measure food diversity and calorie-based measures of food security in this study, specific food consumption items indicate that higher-nutrient foods like dairy, nuts, vegetables and meat increase relatively more than staple foods, suggesting that both the quantity and quality of food intake may be increasing (not shown).

Finance

Financial inclusion improves relatively more than any other outcome measured, with an index of financial inclusion (savings behavior, account holdings, and access to credit if needed) increasing by 2.38 SD (FWER-corrected p-value < 0.0005) compared to the control group. This is partly explained by a low level of financial inclusion among control households, and is consistent with the structural focus of the program on supporting financial engagement, from helping open bank accounts to promoting savings behavior. Table 5 presents the impacts across all financial inclusion measures. Savings-related behavior shows the largest relative change. While almost no control households have access to a formal bank account (1%), this increases by 28 percentage points (p-value < 0.0005) among TUP recipients. Similarly, the likelihood that a household saved anything in the last four weeks increases by 26 percentage points (p-value < 0.0005) from a control group level of 2%, and savings over this time increased by USD 70 PPP (USD 19 nominal, p-value < 0.0005) in the treatment group from USD 4 PPP (USD 1 nominal) among control households. Overall savings increase by USD 106 PPP (USD 31 nominal, p-value < 0.0005) from USD 5 PPP (USD 1 nominal). Treatment households are 12 percentage points (p-value < 0.0005) more likely to say that they can access formal credit if needed, compared to 2% of control households.
Despite this, the likelihood that a household has an outstanding loan is 14 percentage points lower in the treatment group (-24%, p-value < 0.0005) (Table A5 of the Online Appendix), and the total amount of outstanding loans almost halves (-USD 733 PPP or -USD 199 nominal, p-value 0.001). The reduction in borrowing is driven by a reduction in consumption-based loans, which decrease by 15 percentage points (-29%, p-value < 0.0005) (Table A5). To better understand the source and use of loans, we explore conditional outcomes among the set of households that have an existing loan, although given the differential selection across treatment and control, these findings should be interpreted cautiously (not shown). Among control households that have an existing loan, loans are predominantly used for health/emergencies (58%) or food and essential items (64%). Treatment household loans are 21 percentage points (p-value < 0.0005) less likely to be for health or other emergencies. There is limited evidence of a change in loan source across groups; however, the various indicators measured are consistent with a slight increase in the reliance on formal versus informal loan sources among treatment households. Nonetheless, informal sources remain by far the most common source of lending across both groups.

Assets

We study the program’s impact on the accumulation of durable and productive assets, focusing on livestock. For durable assets, we use an index estimated through principal component analysis (PCA) to generate a proxy wealth index (Filmer & Pritchett, 2001). As Table 3 shows, we find an increase in the index of durable assets by 0.36 SD (FWER-corrected p-value < 0.0005). The value of livestock assets increases by USD 839 PPP (USD 227 nominal, FWER-corrected p-value < 0.0005), which represents an increase of 315% with respect to USD 267 PPP (USD 72 nominal) in the control group.
A higher proportion of UP households in the treatment group hold any livestock compared to UP households in the control group (88% vs. 57%; Table A6 in the Online Appendix). In addition, more treatment households hold higher-return assets, starting with cows (58% vs. 9%), followed by sheep (32% vs. 15%) and goats (17% vs. 8%), while chicken ownership is similar across groups (38% vs. 40%) (not shown). While large and significant, the impact on livestock assets value is lower than the original value of the livestock transfer, suggesting that households may be reducing their asset base over time. This is consistent with findings by Banerjee et al. (2015), who show a similar phenomenon across six countries initially, although with no further decline three years after the asset transfer. This result may also be explained by measurement errors in estimating the value of households’ asset holdings, and could also mask heterogeneous impacts, where a portion of the households consume part of the assets while others accumulate over time.

Time Use and Labor Supply

One of the main channels through which the intervention is intended to work is an increase in labor supply through self-employment activities. As Table 4 shows, the total time spent working by the primary woman increases by 2.3 full-time-day equivalents per month (FWER-corrected p-value < 0.0005). This reflects a 55% increase relative to the control mean of 4.2 full-time-day equivalents per month. This increase is driven by increases in time spent working of 2.7 days (p-value < 0.0005) in livestock self-employment, 0.4 days (p-value 0.046) in maid services, and 0.3 days (p-value 0.025) in other non-agricultural self-employment businesses. In contrast, time spent working in other paid work decreases by 1 day (p-value 0.002).
This also implies that livestock rearing has become an important part of the time spent in income-generating activities for UP women, replacing other time spent working in lower-return activities, increasing their total time in productive activities inside and outside the household. Overall, women’s labor participation – measured as market work, self-employment, or job searching in the previous two weeks – increases by 22 percentage points (p-value < 0.0005) from 35% in the control group (Table 4). These results suggest that the program gives previously under-employed women economic opportunities in a context where women’s labor participation is extremely low. This increase in labor supply does not seem to come at the expense of an excessive workload overall, as the time spent in all productive activities, including household chores, accounts for 12 full-time-day equivalents in a month in the control group. Therefore, the additional 2.3 days devoted to livestock self-employment activities leave primary women in the household with 14.3 days spent in market and non-market activities inside and outside the household. The right panel in Table 4 shows that total time spent working by primary men increases by 1.6 full-time-day equivalents per month (FWER-corrected p-value 0.119), or 14% from 11.6 days in the control group, with time spent working in livestock self-employment activities increasing by 1.8 days (p-value < 0.0005). This is an important result, given that primary men seem to be under-employed (in terms of time use). Their total labor participation does not increase, but their time devoted to productive activities does. Overall, these results suggest that the program improved income-generating activities where both the primary woman and the primary man contribute.
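The relative effects quoted in this section follow from dividing each estimated effect by the corresponding control-group mean. A small sketch of that arithmetic, using only figures reported in the text (the helper name is hypothetical):

```python
def relative_effect(effect, control_mean):
    """Treatment effect expressed as a percentage of the control-group mean."""
    return 100.0 * effect / control_mean


# (effect, control mean) pairs in full-time-day equivalents per month, from the text.
reported = {
    "primary woman, total time spent working": (2.3, 4.2),
    "primary man, total time spent working": (1.6, 11.6),
}

for label, (effect, control_mean) in reported.items():
    print(f"{label}: {relative_effect(effect, control_mean):.0f}%")
```

Rounded to whole percentages, these ratios match the 55% and 14% increases reported for the primary woman and the primary man.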
Income and Revenues from Productive Activities

Total monthly household income and revenues from productive activities increase by USD 69 PPP (USD 19 nominal, p-value 0.007) from USD 307 PPP in the control group (Table 5). This increase is mostly driven by an increase of 281% in revenues from livestock activities (p-value < 0.0005). Other changes are smaller and not statistically significant: revenues from agriculture decrease by 35% (p-value 0.346), and revenues from non-agricultural businesses and paid labor income for all adults increase by 5% (p-value 0.684) and 8% (p-value 0.389), respectively. Increases in revenues from livestock come mostly from an increase in livestock sales of USD 32 PPP (USD 9 nominal, p-value 0.002) compared to the control group, followed by increases in the value of milk produced of USD 23 PPP (USD 6 nominal, p-value < 0.0005), and of yogurt of USD 7 PPP (USD 2 nominal, p-value < 0.0005) (Table A6 in the Online Appendix). Since these figures reflect the revenues during the four weeks preceding the follow-up survey, our measures of income and revenues may miss long-term cycles and important smoothing dynamics of consumption. However, these results indicate that livestock sales are an important source of income. The extent to which these sales reflect reproduction of the livestock rather than liquidation of the assets for consumption is an important question for assessing the sustainability of these impacts in the longer term.

Psychological Well-Being

Table 6 shows the impact of the program on the psychological well-being of the primary woman and man. The treatment effects on both indices are statistically significant and large, both in absolute terms and compared to other programs. The index increases by 0.58 SD (FWER-corrected p-value < 0.0005) for the primary woman and 0.26 SD (FWER-corrected p-value 0.010) for the primary man.
An important difference with other studies stems from the use of a more comprehensive set of tools to measure psychological well-being. We use an index including six measures: five subjective indicators and one objective measure. The subjective measures include self-reported life satisfaction (WVS), self-esteem, depression (CES-D), self-reported happiness (WVS), and stress (Cohen). The objective measure is salivary cortisol. Other studies of TUP programs typically measure well-being through a smaller set of indicators. Table 6 shows the breakdown of the individual indicators; the results are consistent with the overall index. For primary women, all indicators of psychological well-being improved: all subjective measures improved and were statistically significant, while the objective measure of salivary cortisol showed a decrease (an improvement), but the change was not statistically significant. For primary men, all indicators improved, although the changes in cortisol and self-esteem were not statistically significant.

Women’s Empowerment

We present two sets of impacts on women’s empowerment, as described in Section 4. Our first index of women’s empowerment, focused on household finances and expenditures, increases by 0.09 SD (p-value 0.140; Table 7), which is not statistically significant. This is in line with results in other countries, which have largely failed to find treatment effects in this dimension.18 One possible reason for this lack of measured impact is that this measure may be limited in scope.

18 Banerjee et al. (2015) find impacts on women’s empowerment in their 2-year endline, but this is mostly driven by one country (Pakistan), and in the 3-year endline, estimates are not statistically significant for each country.
Our second index of women’s empowerment, including additional dimensions, increases significantly by 0.38 SD (FWER-corrected p-value < 0.0005) compared to the control group, indicating large impacts when other dimensions of women’s empowerment are considered. These results are driven by increases in three dimensions: (i) an increase of 0.34 SD (p-value < 0.0005) in the index measuring women’s participation in decisions about their own body and time, including fertility and mobility (time use, finding a job outside, opening a business), and effective access to inputs and resources, including ownership of a mobile phone and having savings or loans; (ii) an increase of 0.29 SD (p-value < 0.0005) in the index on participation in income-generating activities, including paid income-generating activities and being the owner or manager of a self-employment enterprise; and (iii) a 0.33 SD increase (p-value < 0.0005) in the index on political involvement and social capital, which includes having an ID, attending community meetings, and reaching out to a community leader. The two remaining indices show small, statistically insignificant changes: the index on children’s investment decisions (education, health, marriage) increases by 0.08 SD (p-value 0.157), and the index on social norms, measured through aspirations for daughters in education and enrollment, decreases by 0.01 SD (p-value 0.906).

Two things are apparent from these results. First, they are mainly driven by changes in indicators related to economic opportunities and access to inputs, which is expected given the focus of the intervention.
However, while participating in income-generating activities and having a financial account in their own name are impacted positively and are closely linked to components of the TUP intervention, outcomes that were not targeted are also affected, such as the primary woman’s ownership of a mobile phone, having a paid job, and owning or managing a self-employment entrepreneurial activity. These latter indicators are directly linked to increased women’s empowerment in the literature (World Bank, 2011; World Bank, 2018). Second, the dimensions less affected are those on which the control group is already doing well. In particular, women in the control group report high participation in household finances and expenditures and in children’s investment decisions (70-80%), and have high aspirations for daughters: they report wanting high levels of education for them (14 years), largely think that education will help them get a better job or make them wiser (95%), and want them to marry when they are adults (76%). Therefore, these indicators, as reported by primary women, seem to have less room for improvement (we did not ask these questions of the primary man). Conversely, the variables with larger impacts are those in which women report lower levels of participation or access. For instance, only 30% of primary women in the control group report having a mobile phone, 15% report having a financial account or assets in their name, 31% report participating in any paid job, and 6% report owning or managing an entrepreneurial activity (Tables A7-A9 of the Online Appendix).

Child Health and Education

The program has the potential to affect child health and education both directly and indirectly. First, the program helped link households to healthcare facilities and schools, and provided a health stipend and hygiene kit with the objective of improving health and education outcomes directly.
Second, improved labor opportunities and income for adult household members may reduce the opportunity cost of children’s school attendance, and thus increase educational outcomes indirectly.

Table 8 presents health and education results. Control group levels of child health and education are extremely low. The oldest child under five is reported to experience diarrhea 51% of the time over the last two weeks. School enrollment is 53% for girls and 56% for boys. Relative to these low control group values, we find a reduction of 8 percentage points (16%, p-value 0.044) in caregiver-reported diarrhea rates for the oldest child under five. Overall school enrollment increases by 6 percentage points (p-value 0.007), and absenteeism decreases by 5 percentage points (p-value 0.002) (not shown). When separated by gender, we observe a 7-percentage point increase (13%, p-value 0.013) in school enrollment for boys and a 5-percentage point increase (9%, p-value 0.093) for girls. Of those who are enrolled, TUP-household boys have 6 percentage points (38%, p-value 0.001) fewer days missed at school, whereas the impact on girls’ absenteeism is small and not statistically significant. These results are in line with existing evidence in South Asia illustrating that, because of the future potential earnings for boys, their opportunity cost for not attending school is higher than for girls, making it likely that boys are given the opportunity to access schools before girls (Drèze & Kingdon, 2001).

Quantile and Heterogeneous Treatment Effects

The TUP impacts are likely to be heterogeneous for multiple reasons: (i) theory predicts that impacts will vary depending on how close households are to the unstable equilibrium of a poverty trap (if such a trap exists); (ii) individual variation in participant characteristics which affect their choices, including intertemporal rates of substitution leading to differences in short-term consumption vs.
investment decisions; and (iii) variation in ability to raise livestock or entrepreneurial ability, or other unobservable characteristics.

Figure 5 presents quantile impact estimates for consumption, value of livestock, and indices of asset ownership and psychological well-being. We observe larger treatment effects at the top end of the distribution for most outcomes, except for psychological well-being. However, the confidence intervals overlap for most quantiles analyzed. Only for the value of livestock do we find statistically significant differences between the first and top quantile.

Figures A2 to A4 of the Online Appendix present heterogeneous treatment effects measured by the impacts by quartiles of baseline consumption, asset ownership, psychological well-being and entrepreneurial personality traits.19 We find little evidence of differential impact along these dimensions on consumption, value of livestock holdings, and asset ownership. These findings are in line with existing evidence suggesting little evidence of heterogeneity by baseline characteristics (Banerjee et al., 2015).

Figure 5. Quantile Treatment Effects on Consumption, Livestock Value, Assets and Psychological Well-being
Notes: Quantile treatment effect estimates of the outcomes at follow-up. Bootstrapped 95% confidence intervals (2000 replications). The confidence intervals control for the family-wise error rates (probability of at least one false rejection across tests), following Romano and Wolf (2010), using codes from Bedoya et al. (2017).

Robustness Checks

We perform three types of robustness checks. First, we winsorize continuous variables to test whether results are driven by outliers. Second, we remove all regression covariates (133 PRA stratification dummy variables) to assess whether possible unanticipated stratification imbalances influence results. Third, we control for baseline imbalances to see if they explain differences between treatment and control at follow-up.
19 We use questions developed by industrial psychologists to measure different facets of the entrepreneurial personality (de Mel et al. 2008, de Mel et al. 2009). We identify six main psychological traits that have enough variation in our sample (impulsiveness, tenacity, polychronicity (multitasking), locus of control, achievement, and power motivation) and test for heterogeneity on these traits. The specific traits were not pre-specified. Rather, we proposed conducting heterogeneous analysis by “entrepreneurial spirit” in the PAP, which is a combination of these traits.

We winsorize the top 5% of the distribution for consumption, livestock revenue, and value of outstanding loans, and find that effect sizes decrease slightly in magnitude, but retain the same sign and significance, and continue to be large. Results are presented in Table A11 of the Online Appendix. Per capita monthly consumption impacts decrease from USD 24 PPP (30% increase, p-value < 0.0005) to USD 20 PPP (26% increase, p-value < 0.0005), and the impact on monthly livestock revenue decreases from USD 63 PPP (p-value < 0.0005) to USD 56 PPP (p-value < 0.0005). The gap between treatment and control indebtedness decreases from USD 733 PPP to USD 276 PPP (p-value 0.002), with average control household indebtedness decreasing from USD 1,382 PPP to USD 874 PPP.

Table A12 of the Online Appendix presents the impacts on all primary outcomes when removing stratification dummies, and Table A13 presents the same measures when including baseline imbalance covariates. The changes in the point estimates for primary outcomes are small, and standard errors increase slightly when removing stratification dummies. Since the p-values associated with the main specification already indicated highly significant results, the reduction in precision does not change any of the results. Adding baseline covariates does not change the main results either.
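To illustrate the first robustness check, the sketch below top-codes a variable at its 95th percentile, which is one common way to winsorize the top 5% of a distribution. It is a minimal illustration under our own assumptions: the helper names, the linear-interpolation percentile rule, and the consumption values are hypothetical and are not taken from the study data or code.

```python
def percentile(sorted_vals, pct):
    """Percentile of an already sorted list, using linear interpolation."""
    pos = (len(sorted_vals) - 1) * pct / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] + frac * (sorted_vals[hi] - sorted_vals[lo])


def winsorize_top(values, upper_pct=95.0):
    """Cap every value above the chosen percentile at that percentile."""
    cap = percentile(sorted(values), upper_pct)
    return [min(v, cap) for v in values]


# Hypothetical monthly consumption values with one extreme outlier.
consumption = [40.0, 55.0, 60.0, 72.0, 80.0, 85.0, 90.0, 110.0, 130.0, 900.0]
capped = winsorize_top(consumption)

raw_mean = sum(consumption) / len(consumption)
capped_mean = sum(capped) / len(capped)
print(raw_mean, capped_mean)  # the capped mean is pulled less by the outlier
```

Only the extreme observation is altered; all other values are left unchanged, which is why winsorized estimates typically shrink in magnitude without changing sign.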
The results from robustness checks suggest that findings from the primary specification are highly stable, supporting the validity of the results.

6. COST-BENEFIT ANALYSIS

While the program shows large economic and social impacts, the per capita cost of the program is substantial, both when compared to other interventions aimed at improving the lives of the poor, and when directly comparing against other TUP programs around the world. Therefore, a critical question is whether the benefits exceed the costs.

Table 9 presents the program costs, economic benefits, and estimated returns from the program using a similar framework and assumptions as in Banerjee et al. (2015). To estimate the cost-benefit ratio we use a 5% social discount rate, in line with World Bank/IMF guidance (IMF, 2010). Reported total program costs are shown in line 4: AFN 114,541 (USD 6,198 PPP or USD 1,688 nominal) per recipient. Since poverty reduction is the primary goal of the program, we use consumption gains as the primary welfare measure.20,21 We assume that the (unmeasured) consumption impacts one year after the program start are equal to those observed in year two (lines 6 and 7). Line 8 shows the present value of all future consumption benefits assuming they continue for 10 years after the asset transfer. This information allows us to calculate three measures: (i) the cost-benefit ratio under the above assumptions; (ii) the number of years the program impacts need to be sustained for the intervention to generate a return higher than the 5% social discount rate; and (iii) the internal rate of return, calculated as the discount rate needed to make the present value of costs equal the present value of benefits.

20 Banerjee et al. (2015) also include assets in their welfare measure. We do not follow this approach because consumption already includes revenue and rents from productive assets, and because a consumption measure that excludes flow utility from non-productive assets is more conservative.

21 We use the 95% winsorized consumption values to provide a conservative estimate.

The benefit-to-cost ratio (line 10) is calculated by dividing the net present value of future returns (line 9) by total program costs (line 5) and is estimated to be 2.3, i.e. every dollar spent on the program generates 2.3 dollars in household consumption benefits. Equivalently, we estimate the internal rate of return under the same assumptions to be 26%, i.e. the program generates a return on investment of 26% per annum in the form of consumption increases. At a minimum, impacts on consumption would need to continue for four years from the start of the program to break even. Seven-year evaluations of similar programs in other settings document consistent or increasing consumption impacts over time, suggesting that these figures may be realistic. However, given the sell-down of the productive asset observed as a source of current consumption, whether these assumptions hold for this setting remains an empirical question.

We test the sensitivity of the results by estimating the internal rate of return if we used winsorized consumption measures (21%), and if the consumption impacts continue for only 5 years after transfer (14%) or in perpetuity (29%). In all cases the return is well above the social discount rate of 5%. These numbers exclude any non-monetary social returns generated through, for instance, improved psychological well-being, increased human capital from higher school enrollment, and increased women’s empowerment. They also do not take into account the opportunity cost in welfare of the increase in hours worked.
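The three measures described above can be sketched in a few lines of code. The example below is a simplified illustration, not a reproduction of Table 9: it assumes a one-off up-front cost and a constant annual benefit received at the end of each year, and the inputs (a cost of 1,000 and an annual benefit of 250 over 10 years) are hypothetical placeholders. All function names are ours.

```python
def present_value(annual_benefit, rate, years):
    """Present value of a constant annual benefit received at the end of each year."""
    return sum(annual_benefit / (1.0 + rate) ** t for t in range(1, years + 1))


def benefit_cost_ratio(annual_benefit, cost, rate, years):
    """Discounted benefits divided by the up-front cost."""
    return present_value(annual_benefit, rate, years) / cost


def internal_rate_of_return(annual_benefit, cost, years, lo=0.0, hi=10.0, tol=1e-10):
    """Discount rate at which discounted benefits equal the cost (bisection).

    Assumes benefits exceed the cost at rate `lo` and fall short at rate `hi`.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if present_value(annual_benefit, mid, years) > cost:
            lo = mid  # benefits still exceed cost, so the IRR is higher
        else:
            hi = mid
    return (lo + hi) / 2.0


def break_even_years(annual_benefit, cost, rate, max_years=100):
    """Smallest number of years for which discounted benefits cover the cost."""
    for years in range(1, max_years + 1):
        if present_value(annual_benefit, rate, years) >= cost:
            return years
    return None


# Hypothetical inputs, chosen only to exercise the functions.
COST, BENEFIT, YEARS, DISCOUNT = 1000.0, 250.0, 10, 0.05

bcr = benefit_cost_ratio(BENEFIT, COST, DISCOUNT, YEARS)
irr = internal_rate_of_return(BENEFIT, COST, YEARS)
years_needed = break_even_years(BENEFIT, COST, DISCOUNT)
print(bcr, irr, years_needed)
```

Shortening the benefit horizon lowers both the benefit-cost ratio and the internal rate of return, which is the direction of the sensitivity results reported above.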
In summary, the average benefits potentially exceed the costs of the program, given that the cost is time-limited and the benefits are likely to continue to accrue into the future. While this assessment should be interpreted with caution given the multiple assumptions required to generate these estimates, the results suggest that the program may generate significant social and economic returns.

7. STUDY LIMITATIONS

Limitations of the study include potential spillovers to other households and equilibrium impacts, as well as implementation issues that may limit scalability and external validity. We discuss these issues in turn.

Because randomization occurred within villages, both equilibrium effects created by the intervention and within-village spillovers of treatment on UP non-recipient households must be small for the results reported in the previous sections to provide an unbiased estimate of the treatment effects. The study design, limited by financial and logistical constraints, does not allow us to separate these effects from the main treatment effect to empirically validate this assumption. There is, however, some evidence indicating that this assumption may be reasonable. First, Banerjee et al. (2015) and Bandiera et al. (2017) find that neither spillovers nor general equilibrium effects within villages substantially affect their outcomes, suggesting that such effects may not be an important issue at this scale. Second and more importantly, the share of village households directly impacted by the treatment is small (2%), even compared to the previously cited studies. Even if impacts are large in relative terms for recipient households, they are small in absolute monetary terms at USD 7 (nominal) or USD 24 PPP per capita consumption per month. The overall change in daily consumption or in the number of productive assets at the village level is consequently likely to be too small to create spillovers or general equilibrium effects.
Second, our study cannot answer the question of whether the quality of the program could be maintained in a scale-up, and whether a scale-up of equal quality would achieve similar results. The quality of the implementation of the program, as shown by high take-up rates and the consistently positive assessment given by the monitoring teams, might be hard to replicate and sustain at scale, particularly in a context like Afghanistan.

8. CONCLUSION

We report the results from an impact evaluation of the TUP program in Balkh province in Afghanistan. The study contributes to a growing body of evidence suggesting that TUP programs improve the lives of UP households, but it differs from previous TUP evaluations in that it takes place in a fragile and conflict-affected area, which is likely to be one of the most difficult settings in which to implement the program. In addition, the targeted recipients are women, who are among the most vulnerable populations worldwide and in this setting in particular.

The main goal of the program, increasing consumption and reducing poverty, is achieved. We find that the program significantly increases consumption, revenues, and assets two years after the asset transfer and one year after the program concluded, demonstrating the potential of TUP programs to improve the well-being of UP households in this setting. The results of this study also point to the TUP as a potentially gender-smart development policy, which achieves its overall objective of reducing extreme poverty, while contributing to reducing relevant gender gaps in the process.

The impacts reported in this study are the largest of any TUP program evaluated to date and compare favorably to the best-performing programs in Ethiopia, India, and Bangladesh (Banerjee et al. 2015, Bandiera et al. 2017). We present conservative estimates that suggest that the program is cost-effective, assuming that, as other programs have suggested, the gains persist over time.
With a benefit-cost ratio of 2.3, the internal rate of return of the program is 26%, above the internal rate of return reported for India (24%) and Bangladesh (22%), and well above the social discount rate of 5%.

Given the poor track record of interventions in addressing poverty in fragile settings and the relative importance of this target group for reducing global poverty, it is important to reflect on what may explain the project’s success. The multi-faceted nature of the program, designed to simultaneously address multiple constraints faced by UP households in Afghanistan, and the relative size of the transfer and success of the program implementation, provide possible explanations for the large effects found.

Households targeted by the TUP program, especially those in conflict settings like Afghanistan with strong gender norms and depressed levels of women’s human and physical capital, are particularly likely to face multiple co-dependent constraints. TUP programs combining asset transfers with skills training and “coaching” could therefore reduce persistent poverty (Barrett et al., 2019; Ghatak, 2015). Our results are consistent with the predictions of an integrative model of poverty traps. However, we cannot disentangle the extent and precise mechanisms through which, for instance, the training, coaching, and other elements of the intervention are building capabilities and psychological assets, and their complementarity to nonhuman capital.
The large impacts on psychological well-being and women’s empowerment suggest that psychological constraints may play a role in our setting, consistent with recent claims that poverty may perpetuate itself through mechanisms such as depression (de Quidt and Haushofer, 2019), cognitive function (Dean, Schilbach, and Schofield, 2019), aspirations (Macours and Vakis, 2019; Bernard et al., 2014), and more generally scarcity-driven behavioral mechanisms (Banerjee and Mullainathan, 2010; Mullainathan et al., 2016; Mullainathan and Shafir, 2013; Mani et al., 2013; Shah et al., 2012; Ghatak, 2015; Haushofer & Fehr, 2014).

The Afghanistan TUP program was successful in delivering virtually all the components of the TUP package to the target households. While it is unclear whether this feat could be replicated at a larger scale, it provides evidence of what is possible in a conflict setting, and helps explain the relatively large effects. The transfer-to-consumption ratio in Afghanistan was also the largest among similar programs. The results are consistent with the existing evidence that country programs with larger transfer-to-consumption ratios have typically generated larger effects.

Our results thus provide useful evidence on the short-term impact of TUP programs in fragile and conflict-affected settings and support the potential of “big-push” time-limited large investments that combine capital and human capital interventions to contribute to reducing poverty in a cost-effective way.

REFERENCES

Araujo, Maria Caridad, Mariano Bosch, and Norbert Schady. "Can Cash Transfers Help Households Escape an Inter-Generational Poverty Trap?" The Economics of Poverty Traps, by Barrett, Carter, and Chavas (2019). 357-382.

Azariadis, Costas, and John Stachurski. “Poverty Traps.” Handbook of Economic Growth, Vol. I, chap. 05 (2005). https://EconPapers.repec.org/RePEc:eee:grochp:1-05

Baird, Sarah, Craig McIntosh, and Berk Özler. “Cash or condition? Evidence from a cash transfer experiment.” The Quarterly Journal of Economics 126, no. 4 (2011). 1709-1753.

Bandiera, Oriana, Robin Burgess, Narayan Das, Selim Gulesci, Imran Rasul, and Munshi Sulaiman. “Labor Markets and Poverty in Village Economies.” The Quarterly Journal of Economics 132, no. 2 (2017). 811-870. doi:10.1093/qje/qjx003.

Bandiera, Oriana, Niklas Buehren, Robin Burgess, Markus Goldstein, Selim Gulesci, Imran Rasul, and Munshi Sulaiman. “Women’s Empowerment in Action: Evidence from a Randomized Control Trial in Africa.” American Economic Journal. Forthcoming.

Banerjee, Abhijit, Esther Duflo, Rachel Glennerster, and Cynthia Kinnan. “The Miracle of Microfinance? Evidence from a Randomized Evaluation.” American Economic Journal: Applied Economics 7, no. 1 (2014). 22-53.

Banerjee, Abhijit, Esther Duflo, Nathanael Goldberg, Dean Karlan, Robert Osei, William Parienté, Jeremy Shapiro, Bram Thuysbaert, and Christopher Udry. "A Multifaceted Program Causes Lasting Progress for the Very Poor: Evidence from Six Countries." Science 348, no. 6236 (2015). 1260799.

Banerjee, Abhijit, Dean Karlan, Robert Osei, Christopher Udry, and Hannah Trachtman. "Unpacking a Multi-Faceted Program to Build Sustainable Income for the Very Poor." Working Paper (2018).

Banerjee, Abhijit, and Sendhil Mullainathan. "The Shape of Temptation: Implications for the Economic Lives of the Poor." NBER Working Paper, No. 15973 (2010).

Barrett, Christopher B., Michael R. Carter, and Jean-Paul Chavas. “The Economics of Poverty Traps.” National Bureau of Economic Research (2019).

Barrett, Christopher B., Michael R. Carter, Munenobu Ikegami, and Sarah A. Janzen. "Poverty Traps and the Social Protection Paradox." The Economics of Poverty Traps, ch. 6 (2019). 223-256.

Bedoya, Guadalupe, Luca Bittarello, Jonathan Davis, and Nikolas Mittag. “Distributional impact analysis: toolkit and illustrations of impacts beyond the average treatment effect.” Policy Research Working Paper; no. WPS 8139 (2017). Washington, D.C.: World Bank Group.

Bernard, Tanguy, Stefan Dercon, Kate Orkin, and Alemayehu Taffesse. “The future in mind: Aspirations and forward-looking behaviour in rural Ethiopia.” Centre for Economic Policy Research (2014). CEPR Discussion Paper No. DP10224.

Blattman, Chris, Sebastian Martinez, and Nathan Fiala. "Generating Skilled Self-employment in Developing Countries: Experimental Evidence from Uganda." Quarterly Journal of Economics 129, no. 2 (2014). 697-752. doi:10.1093/qje/qjt057

Blattman, Christopher, and Laura Ralston. "Generating employment in poor and fragile states: Evidence from labor market and entrepreneurship programs." (2015). Available at SSRN: http://dx.doi.org/10.2139/ssrn.2622220.

Blattman, Christopher, Eric P. Green, Julian Jamison, M. Christian Lehmann, and Jeannie Annan. "The Returns to Microenterprise Support among the Ultrapoor: A Field Experiment in Postwar Uganda." American Economic Journal: Applied Economics 8, no. 2 (2016). 35-64. doi:10.1257/app.20150023.

Buera, Francisco J., Joseph P. Kaboski, and Yongseok Shin. "Taking Stock of the Evidence on Micro-Financial Interventions." The Economics of Poverty Traps, by Barrett, Carter, and Chavas, ch. 5 (2019). 189-221.

Bruhn, Miriam, and David McKenzie. "In Pursuit of Balance: Randomization in Practice in Development Field Experiments." American Economic Journal: Applied Economics 1, no. 4 (2008). 200-232.

Cohen, Sheldon, Tom Kamarck, and Robin Mermelstein. "A global measure of perceived stress." Journal of Health and Social Behavior (1983). 385-396.

de Mel, Suresh, David McKenzie, and Christopher Woodruff. “Who Are the Microenterprise Owners? Evidence from Sri Lanka on Tokman v. de Soto.” IZA Discussion Paper Series, IZA DP No. 3511 (May 2008).

de Mel, Suresh, David McKenzie, and Christopher Woodruff. “Are Women More Credit Constrained? Experimental Evidence on Gender and Microenterprise Returns.” American Economic Journal: Applied Economics 1, no. 3 (July 2009). 1-32.

de Quidt, Jonathan, and Johannes Haushofer. "Depression for Economists." The Economics of Poverty Traps, by Barrett, Carter, and Chavas, ch. 3 (2019). 127-152.

Dean, Emma Boswell, Frank Schilbach, and Heather Schofield. "Poverty and Cognitive Function." The Economics of Poverty Traps, ch. 2 (2019). 57-118.

Dercon, Stefan. "Fate and fear: Risk and its consequences in Africa." Journal of African Economies 17, no. suppl_2 (2008). ii97-ii127.

Drèze, Jean, and Geeta Gandhi Kingdon. "School participation in rural India." Review of Development Economics 5, no. 1 (2001). 1-24.

Filmer, Deon, and Lant H. Pritchett. "Estimating wealth effects without expenditure data—or tears: an application to educational enrollments in states of India." Demography 38, no. 1 (2001). 115-132. doi:10.2307/3088292

Fiszbein, Ariel, and Norbert Schady. “Conditional Cash Transfers: Reducing Present and Future Poverty.” World Bank Policy Research Report, World Bank (2009).

Ghatak, Maitreesh. "Theories of Poverty Traps and Anti-Poverty Policies." The World Bank Economic Review (2015). 1-29.

Haushofer, Johannes, and Ernst Fehr. "On the psychology of poverty." Science 344, no. 6186 (2014). 862-867.

Haushofer, Johannes, and Jeremy Shapiro. "The Short-Term Impact of Unconditional Cash Transfers to the Poor: Experimental Evidence from Kenya." The Quarterly Journal of Economics 131, no. 4 (2016). 1973-2042. doi:10.1093/qje/qjw025

IMF, IDA. "Staff Guidance Note on the Application of the Joint Bank-Fund Debt Sustainability Framework for Low-Income Countries." Washington DC: IMF (2010).

International Financial Statistics (IFS). "Country Tables, Afghanistan, Islamic Republic of, Exchange Rates, Official Rates." International Monetary Fund, Washington DC (2019). Accessed April 22, 2019. http://data.imf.org/regular.aspx?key=61545850.
Klugman, Jeni, Lucia Hanmer, Sarah Twigg, Jennifer McCleary-Sills, Tazeen Hasan, and Julieth Andrea Santamaria Bonilla. “Voice and agency: empowering women and girls for shared prosperity: Main report (English).” World Bank Group (2014). Washington, DC.

Kraay, Aart, and David McKenzie. "Do Poverty Traps Exist? Assessing the Evidence." Journal of Economic Perspectives 28, no. 3 (2014). 127-148. doi:10.1257/jep.28.3.127.

Macours, Karen, and Renos Vakis. "Sustaining Impacts When Transfers End: Women Leaders, Aspirations, and Investments in Children." The Economics of Poverty Traps, by Barrett, Carter, and Chavas, ch. 9 (2019). 325-355.

Mani, Anandi, Sendhil Mullainathan, Eldar Shafir, and Jiaying Zhao. "Poverty impedes cognitive function." Science 341, no. 6149 (2013). 976-980.

Mullainathan, Sendhil, and Eldar Shafir. “Scarcity: Why Having Too Little Means so Much.” (2013). New York: Times Books, Henry Holt and Company.

Mullainathan, Sendhil, Frank Schilbach, and Heather Schofield. “The Psychological Lives of the Poor.” American Economic Review: Papers & Proceedings 106, no. 5 (2016). 435-440.

Murphy, Kevin M., Andrei Shleifer, and Robert W. Vishny. "Industrialization and the big push." Journal of Political Economy 97, no. 5 (1989). 1003-1026.

Central Statistics Organization. “Afghanistan Living Conditions Survey 2016-17.” ISBN: 978-9936-8050-7-1. Kabul: Central Statistics Organization (2018).

Radloff, Lenore Sawyer. "The CES-D Scale: A Self-report Depression Scale for Research in the General Population." Applied Psychological Measurement 1, no. 3 (1977). 385-401. doi:10.1177/014662167700100306

Romano, Joseph P., and Michael Wolf. "Balanced Control of Generalized Error Rates." The Annals of Statistics 38, no. 1 (2010). 598-633.

Rosenstein-Rodan, P. N. "Problems of Industrialisation of Eastern and South-Eastern Europe." The Economic Journal 53, no. 210/211 (1943). 202-211. doi:10.2307/2226317.
Shah, Anuj K., Sendhil Mullainathan, and Eldar Shafir. “Some consequences of having too little.” Science 338, no. 6107 (2012). 682-685. doi:10.1126/science.1222426

World Bank. “World Development Report 2012: gender equality and development: Main report (English).” World Development Report (2011). Washington DC: World Bank.

World Bank Group. “Women, Business and the Law 2018.” Washington, DC: World Bank (2018). © World Bank. https://openknowledge.worldbank.org/handle/10986/29498 License: CC BY 3.0 IGO.

World Bank Group. "Piecing Together the Poverty Puzzle." Poverty and Shared Prosperity 2018. Washington, DC: World Bank (2018).

The World Bank. “World Development Indicators.” Washington, D.C.: The World Bank (2019). https://data.worldbank.org/country/afghanistan.

World Bank. "Open Data Indicators." World Bank website (2019). https://data.worldbank.org.

Table 1: Household Socioeconomic Conditions

| | Ultra-poor Mean (1) | Ultra-poor SD (2) | Non-UP Mean (3) | Non-UP SD (4) |
|---|---|---|---|---|
| Panel A: Household Characteristics, Assets and Psychological Well-being at Baseline | | | | |
| Household Characteristics | | | | |
| Primary Woman Is Household Head | 0.204 | 0.403 | 0.048 | 0.213 |
| Primary Woman Is Illiterate | 0.963 | 0.188 | 0.897 | 0.304 |
| Primary Man Is Illiterate | 0.844 | 0.363 | 0.730 | 0.444 |
| School-age Girls Are Enrolled at School | 0.530 | 0.436 | 0.537 | 0.431 |
| School-age Boys Are Enrolled at School | 0.587 | 0.426 | 0.638 | 0.406 |
| Consumption and Assets | | | | |
| Consumption Per Capita (USD), Month | 87.7 | 78.2 | 133.9 | 134.0 |
| Consumption Per Capita Poverty-Line Consistent (USD), Month | 89.7 | 71.5 | 124.4 | 84.3 |
| Household Is Below the Poverty Line | 0.801 | 0.399 | 0.566 | 0.496 |
| Household Saves | 0.015 | 0.123 | 0.015 | 0.120 |
| Household Has Any Outstanding Loans | 0.678 | 0.467 | 0.519 | 0.500 |
| Household Owns Land | 0.626 | 0.484 | 0.779 | 0.415 |
| Household Owns a Mobile Phone | 0.719 | 0.450 | 0.852 | 0.355 |
| Psychological Well-being | | | | |
| Primary Woman Life Satisfaction Rating (1-10) | 5.012 | 2.993 | 6.684 | 2.725 |
| Primary Woman Is Depressed (7-CESD ≥ 8) | 0.693 | 0.462 | 0.520 | 0.500 |
| Number of Sampled Households | 1,173 | | 1,680 | |
| Panel B: Economic Activities for UP Controls and Non-UP at Follow-up | | | | |
| Primary Woman Participates in Income-generating Activities | 0.306 | 0.461 | 0.248 | 0.432 |
| Household Owns Any Livestock | 0.573 | 0.495 | 0.567 | 0.496 |
| Household Owns Cows | 0.094 | 0.292 | 0.282 | 0.450 |
| Household Owns Goats | 0.079 | 0.270 | 0.075 | 0.264 |
| Household Owns Chickens | 0.395 | 0.489 | 0.258 | 0.438 |
| Household Has an Agricultural Business | 0.177 | 0.382 | 0.272 | 0.445 |
| Household Has Other Non-agricultural Business | 0.322 | 0.468 | 0.428 | 0.495 |
| Number of Sampled Households | 689 | | 1,348 | |

Notes. Panel A is constructed using baseline data and Panel B using follow-up data for control ultra-poor households and non-UP households. School-age is 6 to 19 years old. Sampling weights based on the total population for each village are used to estimate the non-UP statistics. Our consumption estimate includes food (purchased, produced and received as a gift), personal and household items, education, health, household repairs, social expenses (weddings, funerals, religious expenses, and other ceremonies), and temptation goods and legal expenses. The Afghanistan national poverty line estimate excludes legal, health and social expenses, and household construction and repairs, and, unlike our estimate, includes expenses on consumer durables and housing. We report both for comparability with national poverty estimates. The poverty line threshold is AFN 2,064 (USD 30 nominal and USD 112 PPP) per capita per month from the Afghanistan Living Conditions Survey (ALCS). The psychological well-being measures include the Center for Epidemiologic Studies Depression (CES-D) 7-point scale (Radloff 1977) and the World Values Survey (WVS) questions on life satisfaction. All monetary amounts are in PPP-adjusted USD terms, set at 2018 prices using the Afghanistan CPI and PPP conversion factor from the IMF.
Table 2: Baseline Village Characteristics

| | Mean (1) | SD (2) | N (3) |
|---|---|---|---|
| Services Available in the Village | | | |
| Primary School | 0.562 | 0.499 | 80 |
| Secondary School | 0.475 | 0.503 | 80 |
| Health Facility | 0.163 | 0.371 | 80 |
| Veterinary | 0.113 | 0.318 | 80 |
| Commercial Bank | 0.000 | 0.000 | 80 |
| Microfinance Institution | 0.050 | 0.219 | 80 |
| Distance to Nearest Institution or Service Provider If None in Village (Minutes) | | | |
| Primary School | 25.276 | 20.197 | 38 |
| Secondary School | 26.386 | 19.421 | 44 |
| Health Facility | 23.910 | 20.356 | 67 |
| Veterinary | 29.464 | 16.382 | 69 |
| Commercial Bank | 49.304 | 34.088 | 79 |
| Microfinance Institution | 33.446 | 24.689 | 74 |
| Access to Electricity | | | |
| Days With Electricity from the Grid, Last Week | 4.300 | 2.944 | 80 |
| Conflict/Violence, Last 12 Months | | | |
| Experienced Conflict, Instability, or Violence Event | 0.062 | 0.244 | 80 |

Notes. Information is reported by the local leader and/or CDC officials in the baseline village survey. Distance in minutes is estimated as the time needed to reach the destination using the typical means of transport. In more than 95% of cases, the typical means of transport chosen is walking.

Table 3: Impact on Consumption, Food Security and Assets

| Outcome | Control Mean (SD) (1) | Treatment Effect, Level (SE) (2) | p-value | Treatment Effect, % Control Mean (3) | Treatment Effect, SD Control (4) | N (5) |
|---|---|---|---|---|---|---|
| Consumption | | | | | | |
| Consumption per Capita (USD), Month | 80.835 (58.378) | 24.027*** (4.224) | {0.000}*** | 30% | 0.412 | 1147 |
| Household Is Below National Poverty Line | 0.819 (0.386) | −0.199*** (0.027) | [0.000] | −24% | −0.517 | 1147 |
| Food Security | | | | | | |
| Food Security Index | 0.000 (1.000) | 0.491*** (0.053) | [0.000] | | 0.491 | 1147 |
| Everyone in the Household Regularly Eats at Least Two Meals a Day | 0.760 (0.427) | 0.114*** (0.022) | [0.000] | 15% | 0.268 | 1145 |
| No Adult Skips or Cuts the Size of Meals | 0.441 (0.497) | 0.234*** (0.029) | [0.000] | 53% | 0.470 | 1134 |
| No Child Skips or Cuts the Size of Meals | 0.594 (0.492) | 0.197*** (0.026) | [0.000] | 33% | 0.401 | 1137 |
| Assets | | | | | | |
| Value of Livestock (USD) | 266.655 (789.237) | 839.076*** (58.650) | {0.000}*** | 315% | 1.063 | 1111 |
| HH Asset Ownership Index (PCA) | −0.000 (1.000) | 0.357*** (0.061) | {0.000}*** | | 0.357 | 1147 |

Notes. This table reports OLS estimates of treatment effects. All regressions include 133 randomization PRA controls. Outcomes are listed on the left, and described in detail in Table A1 of the Online Appendix. For each outcome variable we report the coefficients of interest and their robust standard errors in parentheses. For the primary outcomes, we report the FWER-corrected p-values in braces. For all other outcomes, we report the "naïve" p-value in brackets. (***) (**) (*) denotes significance at the (1%) (5%) (10%) level. Column (1) reports the mean (standard deviation) of each outcome for the control group at follow-up. Columns (2), (3) and (4) report the treatment estimates in levels (2), as percentages of the control mean (3), and as standard deviations of the control group (4) for non-indexed variables. Column (5) reports the total sample for each treatment estimate. Our consumption estimate includes food (purchased, produced and received as a gift), personal and household items, education, health, household repairs, social expenses (weddings, funerals, religious expenses, and other ceremonies), temptation goods, and legal expenses. The poverty line threshold is AFN 2,064 (USD 30.41 and USD 111.67 PPP) per capita per month from the Afghanistan Living Conditions Survey (ALCS). The asset ownership index is constructed using Principal Component Analysis (PCA) on the number of assets owned. All monetary amounts are in PPP-adjusted USD terms, set at 2018 prices using the Afghanistan CPI and PPP conversion factor from the IMF.
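The relationship between columns (2), (3) and (4) can be checked by hand. The following is a minimal sketch, assuming column (3) is the level effect divided by the control mean and column (4) is the level effect divided by the control standard deviation, as the notes describe. The function name `effect_sizes` is ours, not the authors'; the inputs are the per capita consumption row of Table 3.

```python
def effect_sizes(control_mean, control_sd, effect):
    """Express a level effect relative to the control mean and in control-group SDs."""
    share_of_control_mean = effect / control_mean
    effect_in_control_sds = effect / control_sd
    return share_of_control_mean, effect_in_control_sds


# Per capita consumption row of Table 3: control mean, control SD, treatment effect.
share, sd_units = effect_sizes(control_mean=80.835, control_sd=58.378, effect=24.027)
print(f"{share:.0%} of the control mean, {sd_units:.3f} control SDs")
```

With these inputs the two ratios round to the 30% and 0.412 reported in the table. For other rows, small discrepancies can arise because the published inputs are themselves rounded.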
Table 4: Impact on Labor Supply and Time Use, Primary Woman and Man, Last 4 Weeks

Primary Woman

| Outcome | Control Mean (SD) (1) | Treatment Effect, Level (SE) (2) | p-value | Treatment Effect, % Control Mean (3) | Treatment Effect, SD Control (4) | N (5) |
|---|---|---|---|---|---|---|
| Labor Participation | 0.345 (0.476) | 0.221*** (0.029) | [0.000] | 64% | 0.465 | 1140 |
| Time Spent Working, Full-time Days | | | | | | |
| Total | 4.180 (8.535) | 2.294*** (0.538) | {0.000}*** | 55% | 0.269 | 1137 |
| HH Livestock | 0.695 (2.390) | 2.703*** (0.261) | [0.000] | 389% | 1.131 | 1137 |
| HH Agriculture | 0.080 (1.137) | 0.165 (0.102) | [0.106] | 205% | 0.145 | 1140 |
| Own Non-Agriculture Business | 0.124 (1.250) | 0.305** (0.136) | [0.025] | 245% | 0.244 | 1140 |
| Agriculture Outside the HH | 0.837 (4.163) | −0.297 (0.214) | [0.166] | −35% | −0.071 | 1140 |
| Maid Services | 0.386 (2.265) | 0.429** (0.215) | [0.046] | 111% | 0.189 | 1140 |
| Non-Agriculture Outside the HH | 0.053 (0.760) | 0.072 (0.099) | [0.468] | 135% | 0.095 | 1140 |
| Salaried or Formal Employment | 0.086 (1.413) | −0.055 (0.063) | [0.381] | −64% | −0.039 | 1140 |
| Other Paid Work | 1.925 (6.593) | −0.979*** (0.322) | [0.002] | −51% | −0.148 | 1140 |

Primary Man

| Outcome | Control Mean (SD) (6) | Treatment Effect, Level (SE) (7) | p-value | Treatment Effect, % Control Mean (8) | Treatment Effect, SD Control (9) | N (10) |
|---|---|---|---|---|---|---|
| Labor Participation | 0.853 (0.355) | 0.030 (0.025) | [0.232] | 3% | 0.084 | 776 |
| Time Spent Working, Full-time Days | | | | | | |
| Total | 11.620 (13.066) | 1.622 (1.008) | {0.119} | 14% | 0.124 | 770 |
| HH Livestock | 1.015 (4.365) | 1.800*** (0.420) | [0.000] | 177% | 0.412 | 774 |
| HH Agriculture | 0.598 (3.552) | 0.216 (0.295) | [0.464] | 36% | 0.061 | 776 |
| Own Non-Agriculture Business | 0.248 (2.239) | 0.325 (0.243) | [0.181] | 131% | 0.145 | 776 |
| Agriculture Outside the HH | 3.472 (8.267) | −0.114 (0.618) | [0.854] | −3% | −0.014 | 774 |
| Maid Services | 0.349 (2.370) | 0.324 (0.272) | [0.234] | 93% | 0.137 | 776 |
| Non-Agriculture Outside the HH | 1.844 (5.977) | −0.276 (0.457) | [0.546] | −15% | −0.046 | 776 |
| Salaried or Formal Employment | 0.839 (4.544) | −0.177 (0.352) | [0.616] | −21% | −0.039 | 776 |
| Other Paid Work | 3.180 (8.207) | −0.514 (0.617) | [0.406] | −16% | −0.063 | 774 |

Notes. This table reports OLS estimates of treatment effects. All regressions include 133 randomization PRA controls. Outcomes are listed on the left, and described in detail in Table A1 of the Online Appendix. For each outcome variable we report the coefficients of interest and their robust standard errors in parentheses. For the primary outcomes, we report the FWER-corrected p-values in braces. For all other outcomes, we report the "naïve" p-value in brackets. (***) (**) (*) denotes significance at the (1%) (5%) (10%) level. Column (1) (resp. (6)) reports the mean (standard deviation) of each outcome for the primary woman (resp. primary man) of the control group at follow-up. Columns (2), (3) and (4) (resp. (7), (8) and (9)) report the treatment estimates on the primary woman (resp. primary man) in levels (2) (resp. (7)), as percentages of the control mean (3) (resp. (8)), and as standard deviations of the control group (4) (resp. (9)). Column (5) (resp. (10)) reports the total primary woman (resp. primary man) sample for each treatment estimate. Total is the sum of the 8 activities below it. All time use estimates represent full-time-day equivalents, defined as 8 hours of work. Labor participation is a dummy defined as whether the adult household member (aged 15 and above) has been performing one of the previous activities or has been searching for work outside the home in the past two weeks.
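The full-time-day convention in the notes is a simple unit conversion. As a minimal sketch, assuming reported hours per activity are divided by 8 and then summed into the total, the calculation might look as follows. The respondent and the hours below are hypothetical and are not drawn from the survey data.

```python
HOURS_PER_FULL_TIME_DAY = 8  # definition given in the Table 4 notes


def full_time_days(hours_worked):
    """Convert hours worked into full-time-day equivalents."""
    return hours_worked / HOURS_PER_FULL_TIME_DAY


# Hypothetical hours over the last 4 weeks for one respondent (illustrative only).
hours_by_activity = {
    "hh_livestock": 20.0,
    "maid_services": 6.0,
    "other_paid_work": 10.0,
}

days_by_activity = {name: full_time_days(h) for name, h in hours_by_activity.items()}
total_days = sum(days_by_activity.values())
print(days_by_activity, total_days)
```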
Table 5: Impact on Financial Inclusion and Income and Revenues from Productive Activities

| Outcome | Control Mean (SD) (1) | Treatment Effect, Level (SE) (2) | p-value | Treatment Effect, % Control Mean (3) | Treatment Effect, SD Control (4) | N (5) |
|---|---|---|---|---|---|---|
| Finance | | | | | | |
| Financial Inclusion Index | −0.000 (1.000) | 2.378*** (0.183) | {0.000}*** | | 2.378 | 1147 |
| Primary Woman Knows Different Places to Save | 0.161 (0.393) | 0.292*** (0.029) | [0.000] | 181% | 0.742 | 1053 |
| Anyone in the HH Has Formal Savings Account | 0.013 (0.114) | 0.280*** (0.021) | [0.000] | 2137% | 2.462 | 1143 |
| HH Members Can Access Formal Credit if Needed | 0.021 (0.143) | 0.122*** (0.017) | [0.000] | 588% | 0.854 | 1130 |
| HH Has Saved in the Last 4 Weeks | 0.023 (0.151) | 0.262*** (0.022) | [0.000] | 1121% | 1.732 | 1142 |
| HH Savings (USD), Last 4 Weeks | 4.253 (49.820) | 70.161*** (12.822) | [0.000] | 1650% | 1.408 | 1138 |
| HH Total Savings (USD) | 4.832 (50.725) | 106.066*** (10.488) | [0.000] | 2195% | 2.091 | 1140 |
| HH Total Outstanding Cash Loans (USD) | 1,381.798 (5,302.327) | −733.100*** (217.144) | [0.001] | −53% | −0.138 | 1118 |
| Income and Revenues from Productive Activities, Last 4 Weeks | | | | | | |
| Total Household Income and Revenues (USD) | 307.282 (389.758) | 68.801*** (25.426) | [0.007] | 22% | 0.177 | 1145 |
| Livestock Revenue (USD) | 22.528 (106.317) | 63.253*** (11.341) | [0.000] | 281% | 0.595 | 1093 |
| Agriculture Revenue (USD) | 21.953 (178.035) | −7.641 (8.113) | [0.346] | −35% | −0.043 | 1113 |
| Non-agricultural Business Revenue (USD) | 90.011 (176.402) | 4.695 (11.531) | [0.684] | 5% | 0.027 | 1113 |
| Paid Labor Income, All Adults (USD) | 183.847 (253.571) | 14.207 (16.494) | [0.389] | 8% | 0.056 | 1102 |

Notes. This table reports OLS estimates of treatment effects. All regressions include 133 randomization PRA controls. Outcomes are listed on the left, and described in detail in Table A1 of the Online Appendix. For each outcome variable we report the coefficients of interest and their robust standard errors in parentheses. For the primary outcomes, we report the FWER-corrected p-values in braces. For all other outcomes, we report the "naïve" p-value in brackets. (***) (**) (*) denotes significance at the (1%) (5%) (10%) level. Column (1) reports the mean (standard deviation) of each outcome for the control group at follow-up. Columns (2), (3) and (4) report the treatment estimates in levels (2), as percentages of the control mean (3), and as standard deviations of the control group (4) for non-indexed variables. Column (5) reports the total sample for each treatment estimate. Total household income is the sum of the income coming from livestock, agriculture, non-agricultural businesses and paid labor. Revenue from livestock includes the sale of livestock and the production of animal products. Revenue from agriculture refers to revenue from crop production. Paid labor income is for all adult members in the household. All monetary amounts are in PPP-adjusted USD terms, set at 2018 prices using the Afghanistan CPI and PPP conversion factor from the IMF.

Table 6: Impact on Psychological Well-being

| Outcome | Primary Woman: Control Mean (SD) (1) | Primary Woman: Treatment Effect, Level (SE) (2) | Primary Woman: p-value | Primary Woman: N (3) | Primary Man: Control Mean (SD) (4) | Primary Man: Treatment Effect, Level (SE) (5) | Primary Man: p-value | Primary Man: N (6) |
|---|---|---|---|---|---|---|---|---|
| Psychological Well-being | | | | | | | | |
| Psychological Well-being Index | −0.000 (1.000) | 0.575*** (0.059) | {0.000}*** | 1147 | −0.000 (1.000) | 0.255*** (0.082) | {0.010}*** | 694 |
| Log Cortisol (with Controls) | 0.019 (0.967) | −0.034 (0.063) | [0.582] | 1002 | −0.018 (0.885) | 0.076 (0.084) | [0.367] | 581 |
| Life Satisfaction (WVS) Index | −0.000 (1.000) | 0.435*** (0.057) | [0.000] | 1147 | −0.000 (1.000) | 0.176** (0.078) | [0.024] | 694 |
| Happiness (WVS) Index | −0.000 (1.000) | 0.606*** (0.055) | [0.000] | 1147 | 0.000 (1.000) | 0.249*** (0.074) | [0.001] | 694 |
| Depression (CESD) Index (Negatively Coded) | 0.000 (1.000) | 0.504*** (0.059) | [0.000] | 1140 | 0.000 (1.000) | 0.198** (0.078) | [0.012] | 685 |
| Self-esteem (Rosenberg) Index | 0.000 (1.000) | 0.234*** (0.057) | [0.000] | 1146 | 0.000 (1.000) | −0.032 (0.081) | [0.691] | 693 |
| Stress (Cohen) Index (Negatively Coded) | 0.000 (1.000) | 0.554*** (0.061) | [0.000] | 1138 | −0.000 (1.000) | 0.225*** (0.084) | [0.008] | 681 |
| Life Orientation and Trust | | | | | | | | |
| Trust (WVS) Index | 0.000 (1.000) | 0.057 (0.061) | [0.350] | 1147 | 0.000 (1.000) | 0.093 (0.076) | [0.221] | 694 |
| Life Orientation Test (LOT-R) Index | 0.000 (1.000) | 0.508*** (0.058) | [0.000] | 1144 | 0.000 (1.000) | 0.171** (0.080) | [0.034] | 687 |

Notes. This table reports OLS estimates of treatment effects. All regressions include 133 randomization PRA controls. Outcomes are listed on the left, and described in detail in Table A1 of the Online Appendix. For each outcome variable we report the coefficients of interest and their robust standard errors in parentheses. For the primary outcomes, we report the FWER-corrected p-values in braces. For all other outcomes, we report the "naïve" p-value in brackets. (***) (**) (*) denotes significance at the (1%) (5%) (10%) level. Column (1) (resp. (4)) reports the mean (standard deviation) of each outcome for the primary woman (resp. primary man) of the control group at follow-up. Column (2) (resp. (5)) reports the treatment estimates on the primary woman (resp. primary man) in levels. Column (3) (resp. (6)) reports the total primary woman (resp. primary man) sample for each treatment estimate. The psychological well-being measure includes the Center for Epidemiologic Studies Depression (CES-D) 7-point scale (Radloff 1977), questions on happiness and life satisfaction from the World Values Survey (WVS), Cohen's 4-item stress scale, and log cortisol levels obtained through saliva samples adjusted for confounders.
Table 7: Impact on Women's Empowerment

| Outcome | Control Mean (SD) (1) | Treatment Effect, Level (SE) (2) | p-value | Treatment Effect, SD Control (3) | N (4) |
|---|---|---|---|---|---|
| Women's Empowerment Index (6 Dimensions) | −0.000 (1.000) | 0.384*** (0.062) | {0.000}*** | 0.384 | 1147 |
| HH Expenditures Decisions Index (1 Dimension) | −0.000 (1.000) | 0.086 (0.058) | [0.140] | 0.086 | 1147 |
| Children's Investments Decisions Index | −0.000 (1.000) | 0.080 (0.056) | [0.157] | 0.080 | 1147 |
| Fertility and Mobility Decisions & Access to Inputs Index | 0.000 (1.000) | 0.344*** (0.062) | [0.000] | 0.344 | 1147 |
| Participation in Income-generating Activities Index | −0.000 (1.000) | 0.292*** (0.062) | [0.000] | 0.292 | 1147 |
| Aspirations for Daughters Index | −0.000 (1.000) | −0.008 (0.064) | [0.906] | −0.008 | 900 |
| Political Involvement and Social Capital Index | 0.000 (1.000) | 0.330*** (0.062) | [0.000] | 0.330 | 1147 |

Notes. This table reports OLS estimates of treatment effects. All regressions include 133 randomization PRA controls. Outcomes are listed on the left, and described in detail in Table A1 of the Online Appendix. For each outcome variable we report the coefficients of interest and their robust standard errors in parentheses. For the primary outcomes, we report the FWER-corrected p-values in braces. For all other outcomes, we report the "naïve" p-value in brackets. (***) (**) (*) denotes significance at the (1%) (5%) (10%) level. Column (1) reports the mean (standard deviation) of each outcome for the control group at follow-up. Columns (2) and (3) report the treatment estimates in levels (2) and as standard deviations of the control group (3). Column (4) reports the total sample for each treatment estimate. The women's empowerment index includes the following indices: household finances and expenditures decisions, children's investment decisions, fertility and mobility decisions/access to inputs, participation in income-generating activities, aspirations for daughters, and political involvement and social capital. The sample is restricted to households with school-age girls for the aspirations for daughters index.
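In Table 7 the level effect and the effect in control-group standard deviations coincide in every row, which is what one expects when each index has a control mean of 0 and a control SD of 1. The sketch below illustrates that property under the assumption that an index is z-scored against the control group; the exact index construction is described in Table A1 of the Online Appendix and is not reproduced here. The function name and the scores are hypothetical.

```python
from statistics import mean, stdev


def standardize_to_control(values, is_control):
    """Z-score every observation using the control group's mean and SD."""
    control = [v for v, c in zip(values, is_control) if c]
    mu, sigma = mean(control), stdev(control)
    return [(v - mu) / sigma for v in values]


# Hypothetical raw index scores: first four are control, last four are treated.
raw_scores = [2.0, 3.0, 4.0, 5.0, 3.5, 4.5, 5.5, 6.5]
is_control = [True, True, True, True, False, False, False, False]

z = standardize_to_control(raw_scores, is_control)
z_control = [v for v, c in zip(z, is_control) if c]
z_treated = [v for v, c in zip(z, is_control) if not c]

# After standardization the control SD is 1, so the difference in means
# is already expressed in control-group standard deviations.
effect_in_levels = mean(z_treated) - mean(z_control)
effect_in_control_sds = effect_in_levels / stdev(z_control)
print(round(effect_in_levels, 3), round(effect_in_control_sds, 3))
```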
Table 8: Impact on Education and Health

| Outcome | Control Mean (SD) (1) | Treatment Effect, Level (SE) (2) | p-value | Treatment Effect, % Control Mean (3) | Treatment Effect, SD Control (4) | N (5) |
|---|---|---|---|---|---|---|
| School Enrollment | | | | | | |
| Girls | 0.529 (0.499) | 0.046* (0.028) | [0.093] | 9% | 0.093 | 1600 |
| Boys | 0.561 (0.496) | 0.072** (0.029) | [0.013] | 13% | 0.144 | 1555 |
| School Absenteeism: % Missed Days, Last 4 Weeks | | | | | | |
| Girls | 0.135 (0.223) | −0.021 (0.017) | [0.227] | −16% | −0.094 | 873 |
| Boys | 0.160 (0.246) | −0.061*** (0.018) | [0.001] | −38% | −0.247 | 917 |
| Health | | | | | | |
| Diarrhea Rate in Oldest Under-5 Child, Last 2 Weeks | 0.507 (0.501) | −0.080** (0.040) | [0.044] | −16% | −0.161 | 706 |
| Under-5 Child Has Vaccination Card, At Least One (Physically Checked) | 0.607 (0.489) | 0.039 (0.037) | [0.293] | 6% | 0.080 | 699 |

Notes. This table reports OLS estimates of treatment effects. All regressions include 133 randomization PRA controls. Outcomes are listed on the left, and described in detail in Table A1 of the Online Appendix. For each outcome variable we report the coefficients of interest and their robust standard errors in parentheses. For the primary outcomes, we report the FWER-corrected p-values in braces. For all other outcomes, we report the "naïve" p-value in brackets. (***) (**) (*) denotes significance at the (1%) (5%) (10%) level. Column (1) reports the mean (standard deviation) of each outcome for the control group at follow-up. Columns (2), (3) and (4) report the treatment estimates in levels (2), as percentages of the control mean (3), and as standard deviations of the control group (4) for non-indexed variables. Column (5) reports the total sample for each treatment estimate. Schooling outcomes include school-age children (i.e. children from 6 to 18 years old). The sample is restricted to households with school-age children for the school enrollment and absenteeism outcomes and to households with under-5 children for the health outcomes. The unit of observation is the child for the schooling outcomes.
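To make the "OLS estimate with robust standard error" in these tables concrete, the sketch below works through the simplest possible case: a regression of the outcome on a treatment dummy and a constant, with no other covariates. In that case the OLS coefficient equals the difference in group means, and the heteroskedasticity-robust (HC0) variance reduces to the sum of each group's squared residuals divided by the squared group size. This is a simplification and not the authors' specification, which also includes 133 randomization controls; the function name and the enrollment data are hypothetical.

```python
from math import sqrt
from statistics import mean


def treatment_effect_robust(y, treated):
    """Difference in means with an HC0 robust standard error (no covariates)."""
    y1 = [v for v, t in zip(y, treated) if t]
    y0 = [v for v, t in zip(y, treated) if not t]
    m1, m0 = mean(y1), mean(y0)
    effect = m1 - m0  # OLS coefficient on the treatment dummy
    # With only a dummy and a constant, residuals are deviations from group means.
    ss1 = sum((v - m1) ** 2 for v in y1)
    ss0 = sum((v - m0) ** 2 for v in y0)
    robust_se = sqrt(ss1 / len(y1) ** 2 + ss0 / len(y0) ** 2)
    return effect, robust_se


# Hypothetical enrollment dummies: eight treated children, then eight control children.
y = [1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 1]
treated = [True] * 8 + [False] * 8

effect, robust_se = treatment_effect_robust(y, treated)
print(round(effect, 3), round(robust_se, 3))
```

Adding covariates such as randomization controls would require a full regression routine; the closed form above holds only for the dummy-only case.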
Table 9: Cost Benefit Analysis

| | AFN | Current USD | USD PPP |
|---|---|---|---|
| Costs | | | |
| (1) Direct Transfer Costs | 56,820 | 837 | 3,075 |
| Asset Cost | 41,596 | 613 | 2,251 |
| Food Stipend | 12,363 | 182 | 669 |
| Health Voucher | 2,861 | 42 | 155 |
| (2) Total Supervision Costs | 47,004 | 693 | 2,544 |
| Salaries of Implementing Organization Staff | 27,509 | 405 | 1,489 |
| Materials | 3,742 | 55 | 203 |
| Training | 2,998 | 44 | 162 |
| Travel Costs | 470 | 7 | 25 |
| Other Supervision Expenses | 12,285 | 181 | 665 |
| (3) Total Direct Costs | 103,824 | 1,530 | 5,618 |
| Start-up Expenses | 294 | 4 | 16 |
| Indirect Costs | 10,424 | 154 | 564 |
| (4) Total Costs, calculated as if all incurred immediately at beginning of Year 0 | 114,541 | 1,688 | 6,198 |
| (5) Total Costs, inflated to Year 2 at 5% annual discount rate | 126,282 | 1,861 | 6,469 |
| Benefits | | | |
| (6) Year 1 Nondurable Annual Consumption | 34,556 | 509 | 1,770 |
| (7) Year 2 Nondurable Annual Consumption | 34,556 | 509 | 1,770 |
| (8) Year 3-10 Nondurable Consumption, discounted to Year 2 at 5% annual discount rate | 223,342 | 3,291 | 11,442 |
| (9) Total Benefits (6) + (7) + (8) | 292,454 | 4,309 | 14,982 |

| Returns to Investment | Value |
|---|---|
| (10) Total Benefits / Total Costs Ratio: (9)/(5) | 2.32 |
| (11) Break Even Point | 4 years after program start |
| (12) Internal Rate of Return | 26% |
| Internal Rate of Return Sensitivity | |
| Using 95% Winsorized Consumption Impacts | 21% |
| Assuming Impacts Persist for 5 Years After Transfer | 14% |
| Assuming Impacts Persist in Perpetuity After Transfer | 29% |

Notes. 95% Winsorized consumption estimates used in benefits calculation. Social discount rate of 5% applied as per World Bank guidance note (World Bank, 2013). Cost estimates provided by MISFA. The exchange rate used is the IMF rate of 1 USD = 67.87 AFN (2016). All monetary amounts in PPP-adjusted USD terms are set at 2018 prices using the Afghanistan CPI and PPP conversion factor from the IMF.
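Rows (5), (8), (9) and (10) of Table 9 follow from rows (4), (6) and (7) by standard compounding and discounting. The sketch below reproduces them in the AFN column to within rounding. It assumes, as the row labels suggest, that costs are compounded forward two years, that Year 3-10 benefits are eight equal annual flows discounted back to Year 2, and that Year 1 and Year 2 benefits enter undiscounted. Variable names are ours. The internal rate of return in row (12) is not attempted, because the exact timing of cash flows behind it is not stated in the table.

```python
DISCOUNT_RATE = 0.05          # social discount rate from the table notes

total_costs_year0 = 114_541   # row (4), AFN
annual_benefit = 34_556       # rows (6) and (7), AFN per year

# Row (5): costs carried forward from Year 0 to Year 2.
costs_year2 = total_costs_year0 * (1 + DISCOUNT_RATE) ** 2

# Row (8): Years 3-10 are 1 to 8 years after Year 2, each discounted back to Year 2.
benefits_years_3_to_10 = sum(
    annual_benefit / (1 + DISCOUNT_RATE) ** t for t in range(1, 9)
)

# Row (9): Year 1 and Year 2 benefits are added as reported, plus row (8).
total_benefits = 2 * annual_benefit + benefits_years_3_to_10

# Row (10): benefit/cost ratio.
benefit_cost_ratio = total_benefits / costs_year2

print(round(costs_year2), round(benefits_years_3_to_10), round(benefit_cost_ratio, 2))
```

The computed values differ from the published 126,282 and 223,342 by less than one afghani, which is consistent with rounding of the inputs.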