Policy Research Working Paper 9092

Factor Market Failures and the Adoption of Irrigation in Rwanda

Maria Ruth Jones, Florence Kondylis, John Ashton Loeser, Jeremy Magruder

Development Economics
Development Impact Evaluation Group
December 2019

Abstract

This paper examines constraints to adoption of new technologies in the context of hillside irrigation schemes in Rwanda. It leverages a plot-level spatial regression discontinuity design to produce 3 key results. First, irrigation enables dry season horticultural production, which boosts on-farm cash profits by 53–71 percent. Second, adoption is constrained: access to irrigation causes farmers to substitute labor and inputs away from their other plots. Eliminating this substitution would increase adoption by at least 30 percent. Third, this substitution is largest for smaller households and wealthier households. This result can be explained by labor market failures in a standard agricultural household model.

This paper is a product of the Development Impact Evaluation Group, Development Economics. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The authors may be contacted at fkondylis@worldbank.org.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors.
They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team

Factor Market Failures and the Adoption of Irrigation in Rwanda∗

Maria Jones†, Florence Kondylis†, John Loeser†, Jeremy Magruder‡

Updated January 2021

JEL Classification Codes: O1, O12, O13, Q12, Q15
Keywords: Technology adoption, Irrigation, Factor markets

∗ This draft benefited from comments from Abhijit Banerjee, Chris Barrett, Paul Christian, Alain de Janvry, Simeon Djankov, Esther Duflo, Andrew Foster, Doug Gollin, Saahil Karpe, Elisabeth Sadoulet, John Strauss, Duncan Thomas, Chris Udry, and seminar audiences at Cornell University, Georgetown University, Georgia State University, Harvard University/MIT, Michigan State University, North Carolina State University, Northwestern University, University of California, Berkeley, University of Pittsburgh, University of Southern California, and the World Bank. We thank the European Union, the Global Agriculture and Food Security Program (GAFSP), the World Bank Rwanda Country Management Unit, the World Bank i2i fund, 3ie, and IGC for generous research funding. Emanuele Brancati, Anna Kasimatis, Roshni Khincha, Christophe Ndahimana, and Shardul Oza provided excellent research assistance. Finally, we thank the technical staff at MINAGRI, the staff of the LWH project implementation unit, and the World Bank management and operational teams in Rwanda for being outstanding research partners. We are particularly indebted to Esdras Byiringiro, Jolly Dusabe, Hon. Dr. Gerardine Mukeshimana, and Innocent Musabyimana for sharing their deep knowledge of Rwandan agriculture with us. Magruder acknowledges support from NIFA.
This study was approved by UC Berkeley IRB (2016-06-8861), the Rwanda National Ethics Committee, and IPA IRB (10951), and is registered in the AEA RCT Registry (AEARCTR-0001323). The views expressed in this manuscript do not reflect the views of the World Bank. All errors are our own.

† Development Impact Evaluation, World Bank
‡ UC Berkeley, NBER

1 Introduction

Limited adoption of productive technologies is a prominent explanation of low agricultural productivity in sub-Saharan Africa (World Bank, 2007). Existing productive technologies may be underutilized due to inefficiencies in the markets faced by farmer households (Udry, 1997). A recent literature has provided robust evidence that these market failures distort technology adoption, most commonly through experimental manipulation of markets for risk, credit, and information (De Janvry et al., 2017).

Evidence is thinner on the role of constraints to adoption generated by failures in factor markets for land and labor. Land and labor markets are characterized by substantial frictions in developing countries (Fafchamps, 1993; Udry, 1997; LaFave & Thomas, 2016), even where these markets are particularly active (Kaur, 2014; Breza et al., 2018). Economic theory suggests land and labor market failures reduce agricultural productivity by generating inefficient allocations of labor and land across farms (Fei & Ranis, 1961; Benjamin, 1992). More recent empirical work has found that these inefficiencies are quantitatively important (Udry, 1997; Adamopoulos & Restuccia, 2014; Adamopoulos et al., 2017; Foster & Rosenzweig, 2017; Adamopoulos & Restuccia, 2018).

In this paper, we demonstrate that incomplete land and labor markets contribute to the productivity gap by distorting technology adoption.1 We do so in the context of a potentially transformative technology: irrigation.
Irrigation increases agricultural productivity in several ways: it adds additional agricultural seasons, enables cultivation of water-intensive crops, and reduces production uncertainty. However, irrigation is also costly: it requires large construction and maintenance costs, and is associated with increased usage of complementary inputs, such as labor, fertilizer, and improved seeds. Market failures, including in factor markets, therefore have the potential to cause inefficient irrigation adoption as they induce a wedge between shadow prices and market prices of these inputs.

We proceed in 3 steps. First, we establish that irrigation is a productive technology, but adoption is partial. Second, we demonstrate that this partial adoption is inefficient. Third, we show that labor market failures generate constraints to adoption of irrigation.

1 A related question is explored in papers which evaluate the effects of land titling and other formalized property rights on farm investment (Besley, 1995; Goldstein & Udry, 2008; Deininger & Feder, 2009; Besley & Ghatak, 2010; Ali et al., 2014; Goldstein et al., 2018). In our context, farmers have been assigned formal titles to our plots and so we identify the influence of factor market frictions on technology adoption in the presence of formalized rights. Our emphasis on the role of labor market frictions is also distinct.

We begin by estimating the returns to irrigation in Rwanda. We identify these returns using a plot-level spatial discontinuity design in newly constructed hillside irrigation schemes. We sample plots within 50 meters of gravity fed canals, which originate from a distant water source and must maintain a consistent gradient along the hillside. We survey 969 cultivators on 1,753 plots for 4 years.2 We then compare plots just inside the command area, which have access to water for irrigation, to plots just outside the command area, which do not.
Treatment on the treated estimates reveal that irrigation enables the transition to dry season cultivation of horticulture. While we find no effects on rainy season yields, labor, or inputs, dry season estimates correspond to 53% - 71% growth in annual cash profits. To our knowledge, this is the first study to use a natural experiment to estimate the returns to irrigation in sub-Saharan Africa; our estimate is almost identical to an estimate from Duflo & Pande (2007) in India.3

2 These numbers are only for the sample of households whose sampled plot is within 50 meters of the associated discontinuity; in full we survey 1,695 cultivators on 3,332 plots.

3 Existing work that estimates the returns to irrigation using natural experiments is predominantly from groundwater irrigation in South Asia, leveraging variation in slope characteristics of river basins (Duflo & Pande, 2007), aquifer characteristics (Sekhri, 2014), or well-failures (Jacoby, 2017) for identification. Estimates of the return to irrigation in Africa include Dillon (2011), who estimates the returns to irrigation using propensity score matching in Mali. More broadly, Dillon & Fishman (2019) review the literature on the impacts of surface water irrigation infrastructure.

Despite the large effects we estimate, adoption is low: only 30% of plots are irrigated 4 years after canals became operational. At this level of adoption, the sustainability of hillside irrigation systems is in doubt: even the large gains in cash profits to adopters are unable to generate enough surplus to pay for routine maintenance costs.4

4 This is distinct from the collective action failures discussed in Ostrom (1990). Low adoption of irrigation as a threat to sustainability has also been documented by Attwood (2005), who argues that cost recovery was a challenge for canal irrigation systems in nineteenth and early twentieth century India until the introduction of sugarcane.

We investigate the effect of irrigation on inputs to shed light on what might determine farmers’ decisions to adopt irrigation. In this context, the dominant input associated with irrigation is households’ own labor. The shadow wage that prices household labor is notoriously difficult to value, but if this labor were valued at the market wage, estimated effects on household labor would be 6 times as large as estimated effects on expenditures on hired labor and other inputs, and estimated effects on profits would fall from 53% - 71% to -12% - 38%. Valuing household labor at the market wage may not be appropriate: rural market wages are likely to be inefficiently high in developing countries (Kaur, 2014; Breza et al., 2018), and labor market failures in rural areas may generate heterogeneity in the shadow wage (Singh et al., 1986; Benjamin, 1992; LaFave & Thomas, 2016). Heterogeneity in the shadow wage would then cause inefficient adoption of irrigation across households.5 Alternatively, these results could also be consistent with unconstrained profit maximization if farmers have heterogeneous returns to or costs of adopting irrigation (Suri, 2011) and optimize at market wages.

We derive a test for inefficient adoption of irrigation caused by market failures. To produce this test, we build on seminal agricultural household models (Singh et al., 1986; Benjamin, 1992) and model households’ production decisions, incorporating uncertainty, plot-level heterogeneity, and failures in insurance, credit, and labor markets. Consistent with our reduced form results, we model access to irrigation as a labor- and input-complementing increase in plot-level productivity.

Our test is as follows. With complete markets, farmers maximize profits on each plot and access to irrigation on one plot does not affect production decisions on other plots.
In contrast, when there are failures in land and other markets, access to irrigation on one plot causes substitution of labor and inputs away from other plots.6 This test is joint for the null of frictionless land markets: if land markets are frictionless, then markets should reallocate land to farmers who can cultivate most profitably.

5 This heterogeneity could only exist if there were frictions in at least one other market in addition to labor markets.

6 The mechanism is straightforward: access to irrigation on one plot increases input use on that plot. That increase does not affect input demand on the farmer’s other plots; however, if the farmer faces binding constraints in input, risk, or labor markets, that increase in input use must be associated with a decrease in input use on other plots.

We implement our test for inefficient adoption caused by market failures, exploiting the plot-level discontinuity in access to irrigation. We test whether farmers who have a plot just inside the command area reduce their input use on their other plots compared to farmers who have a plot just outside the command area. We find large substitution effects, strongly rejecting complete markets: an additional irrigated plot caused by access to irrigation is associated with a 54 - 60 percentage point decrease in the probability of irrigating a second command area plot. We find similarly large effects for adoption of horticulture, household labor, and inputs. These results confirm a simple descriptive analysis, which shows that few households are able to irrigate more than one command area plot. Applying these results, a simple back-of-the-envelope calculation implies that, absent this substitution, adoption of irrigation would be at least 30% higher.
Moreover, the presence of this substitution implies current adoption of irrigation is inefficient: different households make different adoption decisions on technologically identical plots because of their access to irrigation on their other plots.7

The previous test shows that inefficient adoption of irrigation is caused by failures of land markets, and at least one other market; however, it does not establish which other market fails. We produce two tests that suggest that labor market constraints, as opposed to financial constraints, bind in our context. First, we extend the model and propose a test for whether labor market frictions contribute to inefficient adoption in this context. To produce this test, we consider the effects of household size and wealth on input substitution across plots, in the presence of insurance, credit, and labor market failures. We demonstrate that, while many patterns of differential substitution are possible, only labor market failures can explain irrigation access on one plot leading to greater input substitution across plots for richer households, and decreased input substitution across plots for larger households.

We then estimate differential substitution with respect to household size and wealth to test for labor market failures. We find exactly this pattern: households with two additional members substitute 50% - 94% less than average size households, while one standard deviation wealthier households substitute 41% - 97% more than average wealth households. As these patterns of differential substitution can only be explained by labor market failures, and not credit or insurance market failures, these results imply that labor market failures cause substitution and contribute to inefficient adoption of irrigation.

We then complement this result with experimental evidence. We conduct three randomized controlled trials with the farmers who have access to irrigation.
Two of these trials focus on characteristics peculiar to irrigation systems: usage fees and failures of operations and maintenance; we find neither plausibly affects farmers’ adoption decisions in our context. In the third experiment, we distribute minikits which contain all necessary inputs for horticulture cultivation to randomly selected farmers. Previous work has shown providing free minikits targets credit, risk, and information constraints: it reduces costs of growing horticulture under irrigation, basis risk, and costs of experimentation, respectively (Emerick et al., 2016; Jones et al., 2018). We find no effects of receiving minikits on adoption of horticulture in our context, in contrast to existing work. A closer analysis indicates that the farmers who take up the minikits are the same farmers who would have been likely to cultivate horticulture absent the intervention. Combining this evidence with the model-based test above, we conclude that financial and informational constraints are unlikely to be a primary explanation for low and inefficient adoption of irrigation.

7 With sufficient time, these sites could reach an equilibrium in which this misallocation would have slowly been corrected by markets (Gollin & Udry, 2019). However, we note that our results are 4 years after initial access to irrigation, and we do not observe dynamics after 2 years. This is sufficient for our results to have meaningful implications for the long run sustainability of these schemes. Our results also complement evidence from the United States which suggests that initial allocations can persist for many decades even with seemingly well functioning land markets (Bleakley & Ferrie, 2014; Smith, 2019).

This paper demonstrates that frictions in land and labor markets cause inefficient adoption of hillside irrigation in Rwanda. This result integrates key findings from three large literatures in development economics.
First, our result provides some ground-level evidence for the mechanisms underlying misallocation (Adamopoulos & Restuccia, 2014; Adamopoulos et al., 2017; Foster & Rosenzweig, 2017; Adamopoulos & Restuccia, 2018). We document that land misallocation hinders technology adoption, and that frictions in labor markets are one reason why land market failures generate production inefficiencies. The intuition for our test expands on a deep literature on separation failures which empirically demonstrates that factor market failures affect the allocation of land and labor across households (Singh et al., 1986; Benjamin, 1992; LaFave & Thomas, 2016; Dillon & Barrett, 2017; Dillon et al., 2019).8 Our context allows us to innovate by demonstrating that separation failures induce differential adoption of irrigation on technologically identical plots. In doing so, we also contribute to a literature leveraging production function estimates to document misallocation of labor and inputs by inferring their marginal products from their allocations across plots or households (Jacoby, 1993; Skoufias, 1994; Udry, 1996; Shenoy, 2017; Restuccia & Santaeulalia-Llopis, 2017).9 Our test for inefficient technology adoption caused by land and labor market failures therefore complements this literature, by both imposing less structure and leveraging our plot-level discontinuity in access to irrigation as an exogenous labor- and input-complementing productivity shock.

8 The existing literature does so by testing whether households with different characteristics use different levels of inputs; however, this type of test stops short of showing that these allocations are inefficient (Udry, 1997). In particular, it can only conclude that one market has failed; because it cannot conclude that at least two markets have failed, by Walras’ Law it is insufficient to demonstrate an inefficiency.

9 Although demonstrating heterogeneity in the marginal product of labor is sufficient to show that labor market failures generate inefficiencies, the methods employed by this literature are typically not robust to the presence of unobserved heterogeneity across plots or measurement error (Gollin & Udry, 2019).

This paper is organized as follows. Section 2 describes the context we study and our sources of data. Section 3 presents our estimates of the impacts of irrigation in Rwanda. Section 4 presents our model of adoption of irrigation in the presence of market failures. We implement tests of constraints to adoption and labor market failures suggested by the model in Section 5, and experimental tests in Section 6. Section 7 concludes.

2 Data and context

2.1 Irrigation in Rwanda

We study 3 hillside irrigation schemes, located in Karongi and Nyanza districts of Rwanda, that were constructed by the government in 2014; a timeline of construction and our surveys is presented in Figure 1. Rainfed agriculture in and around these sites is seasonal, with three potential seasons per year. During the main rainy season (“Rainy 1”; September - January), rainfall is sufficient for production in most years. In the second rainy season (“Rainy 2”; February - May), rainfall is sufficient in an average year but insufficient in dry years. In the dry season (“Dry”; June - August), rainfall is insufficient for agricultural production for seasonal crops. Absent irrigation, agricultural production in these sites consists of a mix of staples (primarily maize and beans) which are cultivated seasonally and primarily consumed by the cultivator, as well as perennial bananas which are sold commercially;10 most farmers adopt either a rotation of staples, fallowing land in the dry season, or cultivate bananas.

10 Staple rotations also include smaller amounts of sorghum and tubers, while there is also some cultivation of the perennial cassava, along with other minor crops. In our data, maize, beans, or bananas are the main crop for 85% of observations excluding horticulture.

Irrigation in these schemes is expected to increase yields by reducing risk in the second rainy season and enabling cultivation in the short dry season. As the dry season is relatively short, cultivating the primary staple crops is not possible for households that cultivate during the two rainy seasons, even with irrigation. Instead, cultivating shorter cycle horticulture during the dry season becomes a possibility with irrigation. Horticulture production (most commonly eggplant, cabbage, carrots, tomatoes, and onions) can be sold at local markets where it is both consumed locally and traded for consumption in Kigali.11 As horticultural production is relatively uncommon during the dry season in Rwanda due to limited availability of irrigation, finding buyers for these crops is relatively easy during this time. At baseline 3.2% of plots outside of the command area are planted with at least some horticulture, primarily during the rainy seasons.

Figure 1: Timeline
Notes: Black lines indicate when (or the period during which) events took place, while pink lines are used to indicate survey recall periods.

The three schemes we study were constructed by the government from 2009 - 2014, with water beginning to flow to some parts of the schemes in 2014 Dry and becoming fully operational by 2015 Rainy 1 (August 2014 - January 2015). A representative picture from one of the schemes is presented in Figure 2. In each site, land was terraced in preparation for the irrigation works (as hillside irrigation would be infeasible on non-terraced land).
Construction and rehabilitation of terraces in these sites began in 2009 - 2010. The schemes are all gravity fed, and use surface water as the source. From these water sources, a main canal (visible in Figure 2) was constructed along a contour of the hillside; engineering specifications required the canal to be sufficiently steep so as to allow water to flow, but sufficiently gradual to control the speed of the flow, preventing manipulation of the path of the canal. Underground secondary pipes run down the terraces from the canal every 200 meters, with valves on the main canal controlling the flow of water into these secondary pipes. Farmers draw water from tertiary valves on these pipes located on every third terrace, from which flexible hoses and dug furrows enable irrigation on all plots below the canal. The “command area” for these schemes, the land that receives access to irrigation, is made up of all the plots which are below the canal and located within 100 meters of one of these tertiary valves.12 In all sites, sufficient water is available to enable irrigation year-round. To the extent that there is heterogeneity in plot-level water pressure, the plots nearest to the canal face the lowest pressure, as the pressure available at an individual valve is determined by the volume of water in the pipe above that valve.

11 Kigali is less than a 3 hour drive from these markets, facilitating trade.

The primary cost to farmers of irrigating a plot in this context is the labor associated with the actual irrigation, including maintaining the dug furrows and using the hoses to apply water from the valves to their plots. At the time of the study, there were no fees associated with the use of irrigation water.13 As we document in Section 6, there were no significant challenges in operations and maintenance during the sample period, perhaps due to governmental oversight and the youth of these schemes.
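As a rough illustration of the command-area definition above (plots below the canal and within 100 meters of a tertiary valve), the membership rule can be sketched in Python. The `Plot` and `Valve` types, the planar coordinates, the use of elevation to decide whether a plot is below the canal, and straight-line distance to a valve are all assumptions made for this sketch; they are not taken from the engineering specifications of the schemes.

```python
from dataclasses import dataclass
from math import hypot
from typing import Sequence

# From the text: plots must lie within 100 meters of a tertiary valve.
VALVE_RADIUS_M = 100.0


@dataclass(frozen=True)
class Valve:
    """Hypothetical tertiary valve at planar coordinates (meters)."""
    x: float
    y: float


@dataclass(frozen=True)
class Plot:
    """Hypothetical plot, summarized by a single representative point."""
    x: float
    y: float
    elevation: float        # plot elevation in meters (assumed to be known)
    canal_elevation: float  # elevation of the canal above this plot (assumed)


def in_command_area(plot: Plot, valves: Sequence[Valve]) -> bool:
    """True if the plot is below the canal and within 100 m of some valve."""
    below_canal = plot.elevation < plot.canal_elevation
    near_valve = any(
        hypot(plot.x - v.x, plot.y - v.y) <= VALVE_RADIUS_M for v in valves
    )
    return below_canal and near_valve
```

Under these assumptions, a plot just above the canal is never in the command area however close it is to a valve, which is the contrast the spatial discontinuity design relies on.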
We exploit a spatial discontinuity in irrigation coverage to estimate the impacts of irrigation. Because the main canals must conform to prescribed slopes relative to a distant and originally inaccessible water source, the geologic accident of altitude relative to this source determines which plots will and will not receive access to irrigation water. Hence, before construction, plots just above the canal should be similar to plots just below the canal, and importantly, should be managed by similar farmers. Following construction, however, the plots just below the canal fall inside the command area and have access to irrigation, while the plots just above the canal fall outside the command area and do not have access to irrigation.14

12 We define clusters of plots that share the same secondary pipe as water user groups. In addition, water user groups can be grouped into zones. The secondary pipes in each zone are located along a single segment of the canal, and the flow of water into each of these segments is regulated by a single large valve located on the canal.

13 In 2017, an attempt was made to collect taxes unconditional on the use of irrigation, to avoid influencing cultivation decisions. The taxes are small in magnitude compared to potential farmer yields, as they are meant to fund only ongoing operations and maintenance costs, and compliance was very low (4%). The research team conducted a randomized controlled trial to pay these taxes, described in Appendix I, and found no evidence that farmers responded to these taxes.

14 One might be concerned that during construction, the command area could be manipulated in order to cover particularly influential or productive farmers. In one of the sites, we have the georeferenced engineering layout plan that specifies the command area, in addition to the maps of the actually constructed command area. In results available upon request, our estimates are qualitatively similar when we use either the engineering layout plan or the actual map. This is consistent with the engineering layout plan closely aligning with the actual map, which in turn is consistent with the high costs of construction of these schemes.

Figure 2: Karongi 12 hillside irrigation scheme

2.2 Data

2.2.1 Spatial sampling

To take advantage of the spatial discontinuity in access generated by the command area boundary, we randomly sampled plots in close proximity to this discontinuity. In practice, we constructed this sample of plots by dropping a uniform grid of points across the site at 2-meter resolution, and then randomly sampling points within the grid within 50m of the command area boundary.15,16 After each point was sampled, we excluded all points within 10m of that point (to avoid selecting multiple points too close together). Enumerators visited each of these points, and identified when a point fell on cultivable land.17 With the help of a key informant (often the village leader), they then recorded the name of the cultivator of the plot, their contact information, as well as a sufficiently detailed description of the plot. These listed cultivators form our main household survey sample. For each cultivator, one of their identified plots was randomly selected, which we refer to as the sample plot.

15 This procedure will produce a sample of plots that is more representative of land than of households. In Supplementary Appendix A, we reproduce our analysis in Section 3 but weighting households inversely proportionally to their number of plots, and find that our results are qualitatively similar with this alternative weighting.

16 In all three irrigation sites, we additionally sampled some points further from the canal inside the command area. We use these points primarily to examine experimental treatments described below in Section 6. Additionally, only two of the three sites have a viable boundary of cultivable land both just inside and just outside the command area; we use only these sites for our analysis of the impacts of access to irrigation in Section 3 and Section 5.

17 This was to discard forest, swamps, thick bushes, bodies of water, or other terrain which would make cultivation impossible.

2.2.2 Survey

Our baseline survey covered 1,695 spatially sampled cultivators in August - October 2015. The survey includes detailed agricultural production data (season-by-season) for seasons 2014 Dry through 2015 Rainy 2 (June 2014 - May 2015). The dates of this survey and follow up surveys, along with the agricultural seasons they cover, are presented in Figure 1. Details of the construction of key variables we use for the analysis are presented in Appendix A. As mentioned above, this is not a “true” baseline as some farmers had already gained access to irrigation in 2014 Dry. However, relatively small parts of the site had access to irrigation at this point; in Section 3.2.1 we highlight that 2014 Dry adoption of irrigation is less than 25% of adoption in subsequent dry seasons, and in Section 3.1.1 we show balance across the command area boundary in household and plot characteristics. A panel of plot-level production and input data is maintained for two plots, which were mapped using GPS devices for precise location and area measurement. The two plots on which panel data is collected represent the primary data for analysis; they include the sample plot (described above) and the farmer’s next most important plot (defined at baseline; we refer to this as the “most important plot” or “MIP”). We also collected data on household characteristics, labor force behavior, and a short consumption and food security module. In analysis, we will focus on the sample plots to learn about the effects of the irrigation itself, and the most important plot to learn about how the presence of irrigation on the sample plot impacts households’ productive decisions on their other plots.

Three follow up household surveys were conducted in May - June 2017, November - December 2017, and November 2018 - February 2019. In each survey, we asked for up to a year of recall data on agricultural production; based on the timing of our surveys we therefore have production for all agricultural seasons from June 2014 through August 2018, with the exception of 2015 Dry (June - August 2015) and 2016 Rainy 1 (September 2015 - February 2016).

The household sample for the follow up surveys consists of all the baseline respondents, while the plot sample for the follow up surveys consists of the sample plots and most important plots. To maintain a panel of plots, we ran a “tracking survey”. This survey was triggered whenever a household’s sample plot or most important plot was sold or rented out to another household, or a household stopped renting in that plot if it was not the owner (“transacted”). Specifically, we tracked and interviewed the new household responsible for cultivation decisions on that plot to record information about cultivation and production, along with household characteristics when the new household was not already in our baseline sample. Data from this tracking survey is incorporated in all our plot level analysis, limiting plot attrition.

Attrition in our survey is low, and details on attrition are presented in Table A8. Only 6.0% (6.4%) of plot-by-season observations for sample plots outside the command area in our primary analysis sample (defined in Section 3.1) are missing during the dry season (rainy season). There are three sources of attrition: household attrition, plots transacted to other farmers that we were not successful in tracking, and plots rented out to commercial farmers who were based in the capital or internationally (from whom we were unable to collect agricultural production data).
We do not find evidence of differential attrition of sample plots due to household attrition or plots transacted to other farmers that we did not track; however, we do find that access to irrigation causes an additional 6.4 - 10.2pp of plots to be rented out to a commercial farmer. We interpret the lack of data on these plots as biasing our primary estimates of the impacts of irrigation downwards, as these plots are cultivated with productive export crops, and we discuss attrition further in Appendix F.

2.3 Stylized facts

To motivate our analysis of the impacts of hillside irrigation, we first introduce some stylized facts about irrigation in this context. Table 1 presents summary statistics for agricultural production from our four years of data, pooled across seasons.

Table 1: Summary statistics on agricultural production

                                                                    Horticulture
                             Staples   Maize   Beans  Bananas     All   Rainy     Dry
                                 (1)     (2)     (3)      (4)     (5)     (6)     (7)
Yield                            302     318     285      273     575     588     566
Hired labor (days)                37      37      37        9      61      66      57
Hired labor expenditures          28      28      28        7      45      49      42
HH labor (days)                  266     248     260      101     417     414     420
Inputs                            19      35      16        3      50      50      50
Profits
  Shadow wage = 0 RwF/day        256     255     241      263     481     489     475
  Shadow wage = 480 RwF/day      128     136     117      214     280     290     273
  Shadow wage = 800 RwF/day       43      56      34      182     147     158     139
Sales share                     0.19    0.30    0.14     0.46    0.62    0.60    0.63
Irrigated                       0.02    0.02    0.02     0.02    0.65    0.25    0.93
Rainy                           0.99    1.00    1.00     0.50    0.42    1.00    0.00
log area                       -2.44   -2.26   -2.47    -2.10   -2.71   -2.83   -2.62
Share of obs.                   0.65    0.13    0.42     0.19    0.12    0.05    0.07

Notes: Sample averages of outcomes by crop per agricultural season are presented in this table. Yield, inputs, hired labor expenditures, and profits are reported in units of ’000 RwF/ha, labor variables are reported in units of person-days/ha, and log area is in units of log hectares. All other variables are shares or indicators. For reference, the median wage in our data is 800 RwF/person-day.

Stylized Fact 1.
Irrigation in Rwanda is primarily used to cultivate horticulture in the dry season.

Farmers in our data rarely irrigate their plots in the rainy seasons, and almost never use irrigation when cultivating staples or bananas (only 2% of plots cultivated with staples or bananas use irrigation in our data). In contrast, 93% of plots cultivated with horticulture in the dry season use irrigation. This stylized fact makes agronomic sense as the rainfall in rainy seasons in this part of Rwanda is usually sufficient for either staple or horticultural production (and in wet years may be harmfully excessive for horticulture). Additionally, as staples do not have a sufficiently short cycle to permit cultivation during the relatively short dry season (while horticulture does), it is not agronomically feasible to use irrigation to cultivate staples during the dry season.

Stylized Fact 2.
Horticultural production is more input intensive than staple cultivation, which in turn is (much) more input intensive than banana cultivation.

The mean horticultural plot uses about 420 days/ha of household labor, 60 days/ha of hired labor, and 50,000 RwF/ha of inputs, regardless of the season in which it is planted.18 This contrasts to staple plots (260 days/ha of household labor, 40 days/ha of hired labor, 20,000 - 40,000 RwF/ha of inputs), and bananas (100 days/ha of household labor, 10 days/ha of hired labor, 3,000 RwF/ha of inputs).

Stylized Fact 3.
Horticultural production produces much higher cash profits than other forms of agriculture.

Horticultural production produces much higher cash profits (defined as yields net of expenditures on inputs and hired labor) than other forms of agricultural production in and around these sites. Plots planted to horticulture yield about 500,000 RwF/ha in cash profits, in both rainy and dry seasons. This contrasts with about 250,000 RwF/ha of cash profits producing either staples or bananas.

Stylized Fact 4.
Household labor is the primary input to production of any crop, and the economic profitability of horticulture depends critically on the shadow wage.

A large existing literature examines separation failures in labor markets faced by agricultural households (e.g., Singh et al. (1986); Benjamin (1992); LaFave & Thomas (2016)). If households are constrained in the quantity of labor they are able to sell on the labor market, they may work within the household at a marginal product of labor well below the market wage. Here, we see that if we value household labor allocated to horticulture at market wages, then cultivating horticulture appears less profitable than cultivating bananas. As a result, ultimately the economic profitability of horticulture relative to bananas will depend critically on the constraints on household labor supply decisions. It is worth noting that both horticulture and bananas appear more profitable than cultivating staples, which would be unprofitable if labor were valued at market wages; the ubiquity of staple cultivation in and around these sites is a first piece of evidence that farmers face a shadow wage below the market wage.19

18 For reference, in the study period, the exchange rate was approximately 800 RwF = 1 USD.

19 Both horticulture and bananas are also primarily commercial crops, unlike staples. Farmers may place higher value on staples if consumer prices are higher than producer prices (Key et al., 2000), or if there is price risk in production and consumption, both of which may contribute to cultivation decisions as well.

3 Impacts of irrigation

3.1 Empirical strategy

We start our analysis with a regression discontinuity design. We restrict this and subsequent analysis to sample plots within 50 meters of the discontinuity, consistent with our sampling strategy.
We regress

$$y_{1ist} = \beta_1 \mathrm{CA}_{1is} + \beta_2 \mathrm{Dist}_{1is} + \beta_3 \mathrm{Dist}_{1is} \ast \mathrm{CA}_{1is} + \alpha_{st} + \gamma X_{1is} + \epsilon_{1ist} \qquad (1)$$

where $y_{kist}$ is outcome $y$ for plot $k$ of household $i$ located in site $s$ in season $t$, $\mathrm{CA}_{kis}$ is an indicator for that plot being in the command area, and $\alpha_{st}$ are site-by-season fixed effects meant to control for any differences across sites (including market access or prices). We use $k = 1$ to indicate the household’s sample plot, as opposed to the household’s most important plot. $\mathrm{Dist}_{1is}$ is the distance of plot 1 from the command area boundary (positive for plots within the command area, negative for plots outside the command area) and $X_{1is}$ is the log plot area.20 Our coefficient of interest is $\beta_1$, the effect of the command area on outcome $y$.

We include controls for distance to the boundary and log plot area to address two primary potential sources of omitted variable bias. First, the canal sits at a particular contour of the hillside. Plots that are positioned relatively higher on the hillside may have different agronomic characteristics; accordingly, farmers may differentially sort into these plots. We therefore follow convention by controlling for the running variable ($\mathrm{Dist}_{1is}$) and its interaction with treatment ($\mathrm{CA}_{1is}$). Second, as the construction of the canal slices through plots on the hillside, this may differentially change the area of plots that are positioned above or below the canal. For example, roads are more often located higher on the hillside, leaving less room for plots to extend above the canal relative to below the canal. We anticipate this will cause plots to be relatively larger just inside the command area. As plots exhibit strong evidence of diminishing returns to scale in this context, this effect would likely bias $\beta_1$ downwards absent control.21

20 We calculate distance using the distance of the plot boundary to the command area boundary.
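As an illustration of how specification (1) maps onto a design matrix, the following minimal sketch builds the regressors and estimates the coefficients by ordinary least squares. It is not the authors' code: the data are synthetic and noise-free, a single intercept stands in for the site-by-season fixed effects (as if there were one site and one season), and the helper names `ols` and `rdd_design_row` are hypothetical.

```python
import random


def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        # Partial pivoting: bring the largest remaining entry in column c up.
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [v / piv for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


def rdd_design_row(dist, log_area):
    """Regressors of specification (1) for one plot.

    dist > 0 means the plot lies inside the command area, so CA = 1.
    Order: intercept (stand-in for alpha_st), CA, Dist, Dist*CA, log area.
    """
    ca = 1.0 if dist > 0 else 0.0
    return [1.0, ca, dist, dist * ca, log_area]


# Synthetic, noise-free data (assumed values, for illustration only).
rng = random.Random(0)
true_beta = [1.0, 2.0, 0.01, 0.02, -0.5]
X, y = [], []
for _ in range(200):
    dist = rng.uniform(-50, 50)       # within 50 m of the boundary
    log_area = rng.uniform(-4, -1)
    row = rdd_design_row(dist, log_area)
    X.append(row)
    y.append(sum(b * x for b, x in zip(true_beta, row)))

beta_hat = ols(X, y)
print("estimated effect of the command area (beta_1):", round(beta_hat[1], 6))
```

Because the synthetic outcome contains no noise, the estimate recovers the assumed coefficients up to floating-point error; with real data one would add the full set of site-by-season indicators and cluster the standard errors as the text describes.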
21 In Appendix C, we estimate specification (A1) that omits controls for distance to the command area boundary, its interaction with the command area indicator, and log plot area. All our results are qualitatively similar in this alternative specification, suggesting these potential sources of bias do not meaningfully affect our results. In Supplementary Appendix A, we estimate a specification that also includes controls for distance to the command area boundary squared, and its interaction with the command area indicator. All our results are qualitatively similar in this alternative specification, suggesting choice of functional form for the distance control does not meaningfully affect our results.

Next, we consider additional concerns related to selection into our sample caused by access to irrigation. This may arise for two reasons. First, during the construction of the hillside irrigation schemes, forest was deliberately preserved or planted just outside of the command area in order to protect the new investment from erosion. As these forested plots are not agricultural, they are not included in our sampling strategy.22 Second, marginal plots which would have been too unproductive to cultivate absent irrigation, and would thus have been left permanently fallow, may now be sufficiently productive to be worth cultivating with access to irrigation. While our sampling strategy selected both cultivated and uncultivated plots, it did not select plots which had been left overgrown with thick bushes, as it would have been difficult to identify the household responsible for those plots. In practice, the latter is likely uncommon, as typical household landholdings are small in the hillside irrigation schemes we study (around 0.3 ha), and agricultural land is highly valued: median annual rental prices in our data are 150,000 RwF/ha, approximately 25% of annual yields.

22 Typically, forests were planted or preserved in areas of low productivity, where the slope of the hillside was relatively high and erosion was relatively common. Therefore, this amounts to selection out of our sample of low productivity plots outside the command area, which would bias $\beta_1$ downwards.

We account for this potential source of bias using spatial fixed effects (SFE; see Goldstein & Udry (2008); Conley & Udry (2010); Magruder (2012, 2013)), which use a spatial demeaning procedure to eliminate spatially correlated unobservables, such as unobserved heterogeneity in productivity caused by soil characteristics. This spatial demeaning ensures that comparisons are made only over proximate plots. For example, if some areas of low productivity are left forested outside of the command area, but not inside, then plots inside the command area will be systematically (unobservably) less productive than plots outside the command area. However, because SFE estimators only compare neighboring plots, the low productivity plots inside the command area that are near forested low productivity areas will not have nearby comparison plots outside the command area, and therefore will not contribute to the estimation of the effect of the command area.23

23 Formally, SFE estimators leverage the identification assumption $\lim_{\|k - k'\| \to 0} E[\epsilon_{kist} \mid X_{kist}] = E[\epsilon_{k'i'st} \mid X_{k'i'st}]$, where $\|k - k'\|$ represents the distance between plot $k$ and plot $k'$.

In practice, we define a set $N_{kist}$ to be the group of five closest plots to plot $k$ observed in season $t$, including the plot itself. Then, for any variable $z_{kist}$, define $\bar{z}_{kist} = (1/|N_{kist}|) \sum_{k' \in N_{kist}} z_{k'i'st}$.
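The spatial demeaning step just defined can be sketched in a few lines. This is an illustrative sketch rather than the authors' implementation: the helper names `nearest_neighbors` and `spatial_demean` are hypothetical, distances are taken between assumed point coordinates rather than plot boundaries, and the toy example uses three neighbors instead of five so that it fits on six plots.

```python
import math


def nearest_neighbors(coords, k=5):
    """For each plot, return the indices of its k closest plots (itself included)."""
    neighbors = []
    for xi, yi in coords:
        order = sorted(
            range(len(coords)),
            key=lambda j: math.hypot(coords[j][0] - xi, coords[j][1] - yi),
        )
        neighbors.append(order[:k])
    return neighbors


def spatial_demean(values, neighbors):
    """Subtract from each plot's value the mean over its neighbor set."""
    return [
        v - sum(values[j] for j in nb) / len(nb)
        for v, nb in zip(values, neighbors)
    ]


# Toy example: six plots on a line, 10 m apart, with a smooth spatial trend.
coords = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0), (30.0, 0.0), (40.0, 0.0), (50.0, 0.0)]
z = [x for x, _ in coords]            # variable equal to the x coordinate
nb = nearest_neighbors(coords, k=3)   # k=3 only because the toy sample is tiny
z_demeaned = spatial_demean(z, nb)
print(z_demeaned)
```

In the toy example the smooth trend is removed for every interior plot, and adding any constant to all values leaves the demeaned series unchanged, which is the sense in which spatially correlated unobservables are differenced out. Applying the same transformation to the outcome, the command area indicator, and the controls gives the variables used in the SFE regression.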
The SFE specification then estimates

$$y_{1ist} - \bar{y}_{1ist} = \beta_1 (\mathrm{CA}_{1is} - \overline{\mathrm{CA}}_{1is}) + (V_{1is} - \bar{V}_{1is}) \gamma + (\epsilon_{1ist} - \bar{\epsilon}_{1ist}) \qquad (2)$$

where $V_{kis}$ includes all controls from Equation (1), except the subsumed site-by-season fixed effects.

Our sampling strategy yields the following plot proximity: restricting to the sample plots in our main sample for regression discontinuity analysis, 49% of plots have their nearest 3 plots (self inclusive) within 50 meters, and 87% have 3 plots within 100m; 60% of plots have all their nearest 5 plots (self inclusive) within 100m, while 83% have all 5 plots within 150m. As reference, Conley & Udry (2010) use 500m as the bandwidth for their estimator, while Goldstein & Udry (2008) use 250m as the bandwidth; we therefore anticipate that underlying land characteristics are likely to be quite similar between each plot and its comparison plots.

When estimating Specification (1), we cluster standard errors at the level of the nearest water user group, the group of plots that can source water from the same secondary pipe.24 When estimating Specification (2), the spatial fixed effects generate correlation between the errors of close observations. To allow for this, we calculate Conley (1999) standard errors.25

24 As described in Section 2.1, water user groups are grouped into zones. In Supplementary Appendix A, we report estimates from specification (1) with standard errors clustered at the zone level, instead of the water user group level. The patterns of statistical significance we describe in Section 3 are unaffected by clustering at the zone level.

25 Specifically, we allow plot $\ell$ managed by household $j$ and plot $\ell'$ managed by household $j'$ to have correlated errors if there exists a plot $k$ such that $\ell \in N_{kist}$ or $k \in N_{\ell jst}$, and $\ell' \in N_{kist}$ or $k \in N_{\ell' j' st}$.

3.1.1 Balance

We now use specifications (1) and (2) to examine whether the plots in our sample and the households who cultivate them are comparable at baseline. For each of these specifications, we show balance both with key controls omitted (Columns 3 and 4), and our preferred specifications which we use in our analysis with key controls included (Columns 5 and 6).

First, our sample plots are balanced in terms of ownership and rentals. Additionally, 89% of sample plot owners on both sides of the canal owned the land over 5 years, or prior to the start of the irrigation construction. There is, however, some imbalance on plot size; as discussed in Section 3.1, log area (measured in hectares) is larger inside the command area than outside the command area. This imbalance is weaker in the SFE specification than in the RDD specification, such that the omnibus test fails to reject the null of balance for the SFE specification (although we reject for the RDD specification). However, we note that this imbalance would bias us against finding the effects we see in Section 3.2 on horticulture, input use, labor use, and yields, as all of these variables are larger in smaller plots in both the command area and outside the command area.

Table 2: Balance: Sample plot characteristics

                        Full sample   RD sample
                        Coef.         Dep. var.  Coef. (SE) [p]
                        (SE) [p]
                            (1)        (2)        (3)        (4)        (5)        (6)
log area                  0.045     -2.515      0.425      0.200
                        (0.077)    (1.179)    (0.121)    (0.128)
                        [0.554]        969    [0.000]    [0.118]
Own plot                 -0.012      0.894      0.004     -0.004     -0.001     -0.006
                        (0.020)    (0.309)    (0.032)    (0.038)    (0.032)    (0.038)
                        [0.535]        969    [0.907]    [0.921]    [0.972]    [0.877]
Owned plot >5 years       0.045      0.880      0.019      0.012      0.007      0.010
                        (0.019)    (0.326)    (0.037)    (0.035)    (0.036)    (0.034)
                        [0.020]        686    [0.613]    [0.723]    [0.834]    [0.767]
Rented out, farmer        0.027      0.032     -0.003      0.009     -0.009      0.007
                        (0.012)    (0.177)    (0.023)    (0.027)    (0.023)    (0.027)
                        [0.022]        969    [0.884]    [0.726]    [0.699]    [0.796]
Slope                    -0.006      0.290     -0.007      0.007     -0.011      0.007
                        (0.013)    (0.183)    (0.019)    (0.024)    (0.019)    (0.024)
                        [0.655]        969    [0.695]    [0.769]    [0.559]    [0.781]
Omnibus F-stat [p]          2.1                   2.6        0.5        0.1        0.1
                        [0.066]               [0.025]    [0.756]    [0.971]    [0.993]
Site FE                                            X                     X
Distance to boundary                               X          X          X          X
log area                                                                 X          X
Spatial FE                                                    X                     X

Notes: Column 2 presents the mean of the dependent variable and the standard deviation of the dependent variable in parentheses, for sample plots in the main discontinuity sample that are outside the command area, and the total number of observations. Columns 1 and 3 through 6 present regression coefficients on a command area indicator, with standard errors in parentheses, and p-values in brackets. Column 1 uses the full sample, while Columns 2 through 6 use the discontinuity sample. Columns 5 and 6 use specifications (1) and (2), respectively.

Following the ownership results, Table 3 examines the characteristics of households whose sample plots are just inside or just outside the command area. First, note that Column 1, which does not restrict to the discontinuity sample, performs poorly here; we find significant imbalance on half of our variables, and the omnibus test rejects the null of balance.
However, balance improves substantially in our two preferred specifications (Columns 5 and 6, Table 3) which restrict to the discontinuity sample; households with sample plots just inside the command area appear similar to households with sample plots just outside the command area. For Specification (1) in Column (5), we reject the null of balance at the 10% significance level, as there is a significant difference in whether the household head has completed primary schooling or not and an almost significant difference in the number of household members (15-64). In Supplementary Appendix A, we estimate specifications (1) and (2) with controls included for household head completed primary schooling and number of household members (15-64), and our results are unaffected by the inclusion of these controls. We also note that 1 out of 10 variables significant at the 10% level is what one would expect due to chance.26

Lastly, in Section 5.1.1, we consider the characteristics of households’ most important plots; we show that these appear similarly balanced.

26 That the coefficient on household head primary completion in our test for balance is positive might be particularly concerning if education strongly predicted decisions to adopt irrigation, as this would be suggestive of endogenous selection rather than sampling noise. In contexts where information is a first order constraint to adoption, more educated farmers have been shown to be significantly more likely to adopt new technologies (Foster & Rosenzweig, 1996). Throughout this paper, we argue that information is unlikely to be a first order constraint to adoption of irrigation in this context; consistent with this, in results available upon request we find that household head primary completion is weakly negatively correlated with adoption of irrigation. To the extent that more educated household heads have a higher shadow wage, this is also consistent with our argument that labor market failures drive heterogeneous decisions to adopt irrigation in this context.

Table 3: Balance: Household characteristics

                          Full sample   RD sample
                          Coef.         Dep. var.  Coef. (SE) [p]
                          (SE) [p]
                              (1)        (2)        (3)        (4)        (5)        (6)
HHH female                  0.041      0.221      0.045      0.044      0.043      0.041
                          (0.025)    (0.416)    (0.046)    (0.050)    (0.046)    (0.050)
                          [0.094]        969    [0.326]    [0.378]    [0.345]    [0.412]
HHH age                       0.5       47.5        2.1        0.7        1.4        0.3
                            (0.8)     (14.5)      (1.4)      (1.8)      (1.4)      (1.9)
                          [0.497]        967    [0.127]    [0.694]    [0.298]    [0.863]
HHH completed primary       0.069      0.287      0.128      0.102      0.119      0.099
                          (0.025)    (0.453)    (0.047)    (0.062)    (0.047)    (0.062)
                          [0.005]        966    [0.006]    [0.097]    [0.012]    [0.111]
HHH worked off farm         0.023      0.410     -0.039     -0.019     -0.024     -0.011
                          (0.027)    (0.493)    (0.051)    (0.064)    (0.050)    (0.064)
                          [0.392]        969    [0.441]    [0.763]    [0.631]    [0.868]
# of plots                   0.61       5.19       0.20       0.35       0.36       0.40
                           (0.18)     (3.38)     (0.36)     (0.46)     (0.36)     (0.46)
                          [0.001]        969    [0.582]    [0.448]    [0.319]    [0.382]
# of HH members              0.17       4.89      -0.00      -0.03      -0.01      -0.03
                           (0.11)     (2.16)     (0.21)     (0.25)     (0.22)     (0.25)
                          [0.104]        969    [0.985]    [0.917]    [0.971]    [0.908]
# who worked off farm        0.10       0.77       0.01       0.03       0.01       0.04
                           (0.05)     (0.85)     (0.08)     (0.10)     (0.08)     (0.10)
                          [0.039]        969    [0.909]    [0.799]    [0.906]    [0.722]
# of HH members (15-64)      0.20       2.60       0.26       0.17       0.24       0.15
                           (0.08)     (1.45)     (0.15)     (0.16)     (0.16)     (0.16)
                          [0.007]        969    [0.090]    [0.300]    [0.128]    [0.349]
Housing expenditures         -2.3       49.2       -5.6      -16.7       -6.5      -18.6
                            (6.9)    (127.4)     (14.9)     (19.0)     (14.7)     (19.1)
                          [0.739]        962    [0.707]    [0.380]    [0.658]    [0.328]
Asset index                  0.11      -0.12       0.15       0.06       0.13       0.04
                           (0.05)     (0.99)     (0.12)     (0.13)     (0.12)     (0.13)
                          [0.034]        967    [0.215]    [0.647]    [0.291]    [0.738]
Omnibus F-stat [p]            3.9                   2.1        0.9        1.7        0.9
                          [0.000]               [0.028]    [0.536]    [0.073]    [0.514]
Site FE                                              X                     X
Distance to boundary                                 X          X          X          X
log area                                                                   X          X
Spatial FE                                                      X                     X

Notes: Column 2 presents the mean of the dependent variable and the standard deviation of the dependent variable in parentheses, for sample plots in the main discontinuity sample that are outside the command area, and the total number of observations. Columns 1 and 3 through 6 present regression coefficients on a command area indicator, with standard errors in parentheses, and p-values in brackets. Column 1 uses the full sample, while Columns 2 through 6 use the discontinuity sample. Columns 5 and 6 use specifications (1) and (2), respectively.

3.2 Estimating the effects of irrigation

3.2.1 Adoption Dynamics

Figure 3 presents the share of plots irrigated by season for sample plots just inside the command area and sample plots outside the command area. First, as the irrigation sites were already partially online at the time of our baseline, we already observe some increased adoption of irrigation in the command area in 2014 Dry: sample plots in the command area are approximately 5pp more likely to be irrigated than sample plots outside the command area. We present results from 2014 Dry and 2015 Rainy 1 and 2 in Appendix B; consistent with this low adoption, we do not find significant impacts of access to irrigation on inputs or output in these seasons. Second, starting with 2015, adoption of irrigation does not appear to trend, but exhibits meaningful seasonality. Differences remain around 3pp - 6pp in the rainy seasons, and 19pp - 26pp in the dry seasons.

Figure 3: Adoption dynamics

Notes: Average adoption of irrigation by season on sample plots in the main discontinuity sample, inside and outside the command area, is presented in this figure. Averages outside the command area are in black, while averages inside the command area and 95% confidence intervals for the difference are in pink. Robust standard errors are clustered at the nearest water user group level.

Given the limited changes in adoption dynamics after 2014 and the stark differences in adoption across dry and rainy seasons, for the remainder of our analysis we estimate (1) and (2) pooling across our three years of follow up surveys, splitting our results across dry and rainy seasons.

3.2.2 Impacts of irrigation

We now present our results on the impact of access to irrigation on crop choices, on input use, and on production. First, we present graphical evidence of the regression discontinuity in Figure 4; for parsimony, we do so only for the dry seasons (2016 Dry, 2017 Dry, and 2018 Dry).27 In each of the regression discontinuity figures, distance to the canal in meters is represented on the horizontal axis, with a positive sign indicating that the plot is on the command area side of the boundary. Second, we present regression evidence in Table 4.

27 Rainy season differences are always smaller and generally not visually noteworthy; we focus most of our discussion on the dry season results.
In the discussion below, we focus on results from the tables, but we note that these results are consistent with visual intuition from Figure 4.28

Figure 4: Regression discontinuity estimates of impacts of irrigation

28 We also note that the RDD graphs in Figure 4 typically demonstrate a relatively strong spatial slope within the command area, where plots further away from the canal are more likely to irrigate, cultivate horticulture, and have higher yields. There are several plausible explanations for this trend: further away from the canal, more of the land is marshland, where growing horticulture is more traditional, plots may be selected differently (for example, smaller); and the system also has greater water pressure. The first two of these explanations emphasize the value of the regression discontinuity estimator while the last means that our local estimates at the canal may be conservative.

Table 4: Access to irrigation enables transition to dry season horticulture from perennial bananas, causes large increases in dry season labor and input use, yields, and sales; profitability depends on household’s shadow wage

Columns: (1) Cultivated; (2) Irrigated; (3) Horticulture; (4) Banana; (5) HH labor/ha; (6) Input exp./ha; (7) Hired labor exp./ha; (8) Yield; (9) Sales/ha; (10) Profits/ha, shadow wage = 0; (11) Profits/ha, shadow wage = 800.

(a) Dry season

                        (1)      (2)      (3)      (4)      (5)      (6)      (7)      (8)      (9)     (10)     (11)
RDD (Site-by-season FE, Specification 1)
CA                    0.005    0.162    0.137   -0.133     70.8      6.3      3.7     73.1     55.5     63.9      9.3
                    (0.041)  (0.024)  (0.024)  (0.037)   (17.5)    (1.5)    (2.1)   (23.2)   (14.5)   (21.0)   (16.5)
                    [0.909]  [0.000]  [0.000]  [0.000]  [0.000]  [0.000]  [0.082]  [0.002]  [0.000]  [0.002]  [0.573]
SFE (Spatial FE, Specification 2)
CA                    0.022    0.171    0.156   -0.142     76.9      4.3      3.2     55.0     49.3     49.1     -3.0
                    (0.044)  (0.030)  (0.029)  (0.035)   (20.7)    (1.8)    (2.6)   (28.5)   (19.2)   (25.7)   (20.8)
                    [0.610]  [0.000]  [0.000]  [0.000]  [0.000]  [0.019]  [0.221]  [0.054]  [0.010]  [0.057]  [0.886]
# of observations     2,537    2,537    2,536    2,536    2,523    2,527    2,527    2,402    2,527    2,402    2,400
# of clusters           196      196      196      196      196      196      196      196      196      196      196
Control mean          0.391    0.058    0.065    0.245     59.5      2.5      3.7     82.3     49.7     76.1     32.8

(b) Rainy seasons

                        (1)      (2)      (3)      (4)      (5)      (6)      (7)      (8)      (9)     (10)     (11)
RDD (Site-by-season FE, Specification 1)
CA                   -0.092    0.035    0.016   -0.158      8.5      1.1      3.7    -22.6    -13.3    -26.4    -31.8
                    (0.025)  (0.009)  (0.018)  (0.038)   (23.1)    (2.9)    (3.4)   (30.8)   (18.5)   (28.5)   (26.4)
                    [0.000]  [0.000]  [0.371]  [0.000]  [0.714]  [0.710]  [0.276]  [0.462]  [0.472]  [0.354]  [0.228]
SFE (Spatial FE, Specification 2)
CA                   -0.053    0.059    0.048   -0.168      9.9      2.1      3.1    -15.4      5.6    -19.4    -27.3
                    (0.027)  (0.012)  (0.025)  (0.034)   (24.7)    (3.1)    (4.5)   (30.8)   (21.6)   (27.5)   (31.9)
                    [0.051]  [0.000]  [0.056]  [0.000]  [0.689]  [0.511]  [0.490]  [0.617]  [0.793]  [0.480]  [0.393]
# of observations     4,236    4,236    4,235    4,235    4,215    4,223    4,223    4,085    4,223    4,085    4,078
# of clusters           196      196      196      196      196      196      196      196      196      196      196
Control mean          0.838    0.016    0.073    0.274    226.7     16.1     15.9    271.5     85.1    239.8     59.5

Notes: Regression coefficients on a command area indicator for the sample plot (“CA”) are presented above. Specifications control for distance to the command area boundary, its interaction with CA, and log area of the sample plot. RDD specification includes site-by-season fixed effects, and SFE specification includes spatial fixed effects. Standard errors are in parentheses, and p-values in brackets.

First, in line with results from Section 3.2.1, command area plots are 16pp - 17pp more likely to be irrigated during the dry season than plots outside the command area, and almost all of this increase is explained by the transition to cultivation of high value horticulture during this dry season. In contrast, adoption of irrigation during the rainy season is much lower, with increases of just 4pp - 6pp.
This transition to dry season horticulture substitutes for cultivation of perennial bananas, a less productive but less input intensive commercial crop; we estimate a decrease of 13pp - 14pp in the command area, and as a consequence we observe no impacts on overall cultivation in the dry season.29

Second, we find large increases in dry season input use, which are dominated by increases in household labor. These results are consistent with the transition from perennial bananas, which require little inputs and labor, into horticulture, which is highly input and labor intensive. To interpret these results, we conduct a treatment on the treated analysis under the assumption that the command area increases input use only through its effect on irrigation. Doing so, we find that adoption of irrigation increases household labor use, input expenditures, and hired labor expenditures by 440 - 450 person-days/ha, 25,000 - 39,000 RwF/ha, and 19,000 - 23,000 RwF/ha, respectively; these numbers are similar to differences in input intensity of dry season horticulture and bananas reported in Table 1. The impacts on household labor are particularly large: valued at a typical wage of 800 RwF/person-day, this labor would be priced at 350,000 - 360,000 RwF/ha, an order of magnitude larger than the effects on input expenditures or hired labor expenditures. Applying this labor to 0.3 ha (median household landholdings) of command area land would require roughly 4 person-months of labor during the 3 month dry season. In contrast to these dry season results, we find no effects on input use during the rainy seasons.

Third, consistent with our estimates of impacts on input use, we find large increases in dry season agricultural production. Treatment on the treated analysis suggests adoption of irrigation increases yields by 320,000 - 450,000 RwF/ha, 51 - 72% of annual agricultural production.
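The treatment on the treated magnitudes quoted above can be approximated with a simple Wald ratio: the command area coefficient on each outcome divided by the command area coefficient on irrigation, both taken from the dry season panel of Table 4. The sketch below is a back-of-the-envelope check under assumptions, not the authors' procedure (which may, for example, use two-stage least squares); it assumes Table 4 reports expenditures and yields in '000 RwF/ha and labor in person-days/ha, as the notes to Table 1 do, and the names `wald_tot`, `itt`, and `first_stage` are hypothetical.

```python
def wald_tot(itt_effect, first_stage_effect):
    """Wald ratio: effect of access on the outcome over effect of access on adoption."""
    return itt_effect / first_stage_effect


# Dry season coefficients on the command area indicator, Table 4 panel (a).
first_stage = {"RDD": 0.162, "SFE": 0.171}   # column (2), Irrigated
itt = {
    "hh_labor_days_per_ha": {"RDD": 70.8, "SFE": 76.9},       # column (5)
    "input_exp_000rwf_per_ha": {"RDD": 6.3, "SFE": 4.3},      # column (6)
    "hired_labor_000rwf_per_ha": {"RDD": 3.7, "SFE": 3.2},    # column (7)
    "yield_000rwf_per_ha": {"RDD": 73.1, "SFE": 55.0},        # column (8)
}

tot = {
    outcome: {spec: wald_tot(effect, first_stage[spec]) for spec, effect in by_spec.items()}
    for outcome, by_spec in itt.items()
}

# Value the implied household labor at 800 RwF/person-day, in '000 RwF/ha.
labor_value = {spec: days * 800 / 1000 for spec, days in tot["hh_labor_days_per_ha"].items()}

for outcome, by_spec in tot.items():
    print(outcome, {spec: round(v, 1) for spec, v in by_spec.items()})
print("hh labor valued at 800 RwF/day:", {s: round(v, 1) for s, v in labor_value.items()})
```

Under these assumptions the ratios come out at roughly 437 to 450 person-days/ha of household labor, 25 to 39 thousand RwF/ha of inputs, 19 to 23 thousand RwF/ha of hired labor, and 322 to 451 thousand RwF/ha of yields, in line with the ranges reported in the text.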
As horticulture is primarily commercial, each 1 RwF/ha increase in yields is associated with a 0.76 - 0.90 RwF/ha increase in sales. Once again, these results on outputs are consistent with differences between bananas and horticulture production reported in Table 1. Additionally, these impacts on yields are much larger than our estimates of impacts on input and hired labor expenditures; our results suggest irrigation increases yields net of expenditures by 290,000 - 390,000 RwF/ha, a 53 - 71% increase in annual yields net of expenditures. However, we should not interpret this as impacts on profits, as it implicitly places no value on the large increases in household labor. If we instead value household labor at 800 RwF/person-day, the median wage we observe, these impacts vanish completely. Therefore, the profitability of the transition to dry season horticulture enabled by irrigation depends crucially on the shadow wage at which household labor is valued.30

29 As bananas are perennials, plots cultivated with bananas typically have harvests in each season. In contrast, the rotations of staples and horticulture (or simply horticulture) that replace bananas may only involve two plantings and harvests, and we therefore see a modest decrease in cultivation during the rainy seasons of 5pp - 9pp on a baseline of 84%.

30 In Appendix D, we estimate impacts of access to irrigation on household welfare. Although these estimates are imprecise, all point estimates are positive and some are statistically significant. These results are consistent with positive impacts of access to irrigation on profits, although smaller impacts than implied by estimates that do not value household labor.

We anticipate three categories of spillovers in our context: across household spillovers, within plot (across season) spillovers, and within household (across plot) spillovers. First, the across household spillovers we anticipate are general equilibrium effects, as the increased demand for labor and increased production of horticulture caused by the sample plot shock drive up wages and down horticultural prices. We test for these effects in Appendix E and find no evidence that wages or staple prices changed over time, though prices of horticultural crops did decrease in one of the sites, reducing our estimated effects on revenues. Second, the within plot (across season) spillovers we anticipate are driven by the shift out of perennial bananas, which causes a change in patterns of cultivation during the rainy season, while adoption of irrigation is primarily during the dry season. However, we failed to find strong evidence of impacts on rainy season labor, inputs, yields, or profits. Third, the within household (across plot) spillovers we anticipate are driven by the increase in demand for labor and inputs we observe on the sample plot, which may lead households to substitute labor and inputs away from their other plots. To address this spillover, in Section 4 we model a household’s agricultural production decisions and how they can generate substitution across plots, and in Section 5 we estimate these spillovers and quantify their implications for our estimates and for efficiency.

Taken together, these results suggest that irrigation leads to a large change in production practices for a minority of farmers. Those farmers cultivate horticulture in the dry season and a mix of horticulture, staples, and fallowing in the rainy seasons; they have substantially higher earnings in the dry season but similar earnings in the other seasons, and they invest more in inputs and much more in household labor in the dry seasons. Our estimates suggest that irrigation has the potential to be transformative in Africa, in light of the 53 - 71% increases in yields net of expenditures that we document from just three months of cultivation. At the same time, we observe a minority (30%) of farmers ultimately making use of the irrigation system.
These results suggest that the shadow wage, and therefore labor market frictions, are likely to be important for the decision to cultivate horticulture.31 Building on this result, we next adapt the classical agricultural household model (Singh et al., 1986; Benjamin, 1992) to develop formal tests for the role of market failures in adoption of irrigation.

4 Testing for binding constraint

In this section, we describe a model of a farmer household that chooses input and household labor allocations across multiple plots, subject to constraints in markets for labor, inputs, and risk. We use this model to generate testable predictions of the farmer household’s responses to the sample plot shock for the presence of constraints and the nature of the constraints the farmer household faces. The formal setup of the model, motivating assumptions, and additional details are in Appendix G.

The farmer household has two plots: consistent with our data, we call these their sample plot and their most important plot. Production on each plot is a function of input and household labor allocations on that plot, of the plot’s productivity, and of a common productivity shock. While we abstract from the choice of crops or decision to irrigate in this framework, we interpret this production function as the envelope of production functions from cultivating different fractions of bananas and irrigated horticulture during the dry season, with cultivating horticulture as optimizing at a high labor and input intensity.32 The household maximizes expected utility over consumption and leisure, choosing its off-farm labor supply and its allocations of inputs and household labor on each plot.

31 Testing the role of the shadow wage, by estimating the impacts of access to irrigation on profits under different assumed shadow wages, has a number of limitations. First, all results above could be fully explained by heterogeneity in the returns to adoption of irrigation (Suri, 2011). Second, person days may be a clumsy measure for comparing hired labor and household labor, as household labor may be less intense, require fewer hours per day, or be preferred by households to working for others.

32 We formalize this interpretation by extending the model to feature crop choice in Supplementary Appendix B.

We build on Benjamin (1992) and allow farmers to face three crucial constraints that cause deviations from expected profit maximization. First, access to insurance may be limited, so farmers may reduce labor and input use to avoid basis risk. Second, credit or access constraints may limit input use. Third, farmers’ off farm labor allocations may be constrained from above, resulting in overutilization of labor on the household farm.33,34

We model access to irrigation on the sample plot (the “sample plot shock”) as a labor- and input-complementing increase in the productivity of the sample plot. This offers predictions consistent with our results in Section 3: the sample plot shock increases production, labor allocations, and input allocations on the sample plot. Next, we consider the impacts of the sample plot shock on production decisions on the most important plot in this framework.

Proposition 1. If no constraint binds, separation holds and input and labor use on the most important plot does not respond to the sample plot shock.

In the absence of the three constraints listed above, households maximize expected profits. As access to irrigation on the sample plot does not impact marginal products or prices on the most important plot, labor and input allocations on the most important plot do not respond to the sample plot shock.

Proposition 2. If input, labor, or insurance constraints bind, then input and labor use are reduced on the most important plot in response to the sample plot shock.

The logic case-by-case is as follows.
First, if input constraints bind, then the increase in inputs on the sample plot caused by access to irrigation must be associated with a reduction in inputs on the most important plot. Second, if labor constraints bind, then the increase in labor on the sample plot caused by access to irrigation must be associated with a reduction in the sum of labor on the most important plot and leisure. Third, absent insurance, the increase in agricultural production caused by access to irrigation reduces the marginal utility from agricultural production relative to the marginal utility from consumption. In turn, this causes labor and input allocations on the most important plot to fall.

33 These constraints correspond with those most commonly cited by farmers in focus groups as driving crop choice. In particular, farmers frequently cite imbaraga, or strength, of the household head (corresponding to labor market constraints), igishoro, or access to capital (corresponding to credit or input market constraints), and isoko, or access to markets (corresponding to price risk, or insurance market constraints).

34 We model labor constraints as a constraint on off farm labor allocations from above because this generates a shadow wage that is below the market wage, and we note this generalizes a model without an off farm labor market. Agness et al. (2020) provide evidence in Kenya that households’ shadow wages are almost always below the market wage. In addition, as we discuss in Section 3.2.2, our results on profits are consistent with a shadow wage that is below the market wage.

An implicit assumption we make that generates this result is the absence of functioning land markets. With perfectly functioning land markets, shocks to the household’s land endowment, such as the sample plot shock, should not affect productive decisions on the household’s most important plot.
Instead, both the sample plot and the most important plot would flow to the household with the highest willingness-to-pay for them. In practice, land transactions do occur; as discussed in Section 2.2.2, our survey tracks plots across transactions in land markets, so we are able to directly test the prediction that the sample plot shock does not affect the productive decisions on the most important plot.

Rejecting separation with the test suggested by Proposition 2 implies that the levels of irrigation adoption are inefficient and that land market failures contribute to this inefficiency. At the same time, this test does not allow us to test for which of the three constraints above interacts with land market frictions to generate separation failures.

To shed light on which other constraints generate separation failures, we consider how households with different characteristics should differentially respond to the sample plot shock. We focus on two important household characteristics: household size, which determines the availability of household labor, and wealth, which determines the ability to purchase inputs.35

Proposition 3. If input or insurance constraints bind, then the input and labor allocations on the most important plot of larger households (wealthier households) should be less (less) responsive to the sample plot shock.

The intuition for this result is that both insurance and input constraints are ultimately financial constraints, which causes household size and wealth to enter the problem symmetrically. Under insurance constraints, both household size (by increasing the amount of labor income) and wealth increase household consumption.
If we additionally assume that risk aversion is decreasing sufficiently quickly in consumption, then the allocations of wealthier and larger households will be closer to those maximizing expected profits, and therefore allocations on the most important plot will be less responsive to the sample plot shock. Second, under input constraints, wealthier households are less likely to see the constraint bind. As the allocations on the most important plot of unconstrained households do not respond to the sample plot shock, wealthier households should be less responsive. Similarly, if larger households could finance input purchases from labor income, larger households would be less likely to see the constraint bind. Therefore, their allocations on the most important plot would be less responsive to the sample plot shock.

35 In Section 5.1, we further discuss the choice of household size and wealth as shifters of availability of household labor and ability to purchase inputs, respectively.

Proposition 4. If labor constraints bind, then the relative responsiveness of input and labor allocations on the most important plot of larger households (wealthier households) to the sample plot shock cannot be signed without further assumptions. If larger households and poorer households have more elastic on farm labor supply schedules, and if on farm labor supply exhibits sufficient curvature, then the input and labor allocations on the most important plot of larger households (wealthier households) should be less (more) responsive to the sample plot shock.

When labor constraints bind, the household responds to the sample plot shock by allocating additional labor to the sample plot, but they may withdraw labor from either the most important plot or from leisure. In general, the differential responses of wealthier and larger households cannot be signed.
However, we focus on the case where larger households have more elastic on farm labor supply, while wealthier households have less elastic labor supply; this relationship has been posited as far back as Lewis (1954), and is discussed in depth in Sen (1966). These differences in on farm labor supply generate the prediction that larger households should be less responsive to the sample plot shock, as they draw labor to the sample plot primarily from leisure, while wealthier households should be more responsive to the sample plot shock, as they draw labor primarily from the most important plot.

These four propositions, summarized in Table 5, generate two sets of tests. First, Propositions 1 and 2 imply that substitution away from the most important plot in response to the sample plot shock allows us to reject the absence of constraints. Second, Propositions 3 and 4 produce a test of the absence of labor constraints. If the input and labor allocations of larger households are less responsive to the sample plot shock, while those of wealthier households are more responsive to the sample plot shock, then we would reject the absence of labor constraints.

We note that while it is unambiguous that constraints on selling labor would lead to inefficient adoption of irrigation, it is not clear whether they would result in adoption which is inefficiently low or inefficiently high. That result will depend on how wages would respond to the removal of those constraints. We return to this point in the conclusions.

Table 5: Model predictions

                    $\frac{dL_2}{dA_1}$    $\frac{d}{dL}\frac{dL_2}{dA_1}$    $\frac{d}{dM}\frac{dL_2}{dA_1}$
No constraints      0                      0                                  0
Constraints
  Insurance         −                      +                                  +
  Inputs            −                      0/+                                +
  Labor             −                      +*                                 −*

Notes: Predicted signs from the model for key comparative statics of interest are presented in this table. $\frac{dL_2}{dA_1}$ is the effect of the sample plot shock on labor allocations on the most important plot, and $\frac{d}{dL}\frac{dL_2}{dA_1}$ and $\frac{d}{dM}\frac{dL_2}{dA_1}$ are the impact of increased household size and wealth, respectively, on this effect. Predictions in the no constraints case correspond to Proposition 1. Predictions on $\frac{dL_2}{dA_1}$ correspond to Proposition 2. Predictions on $\frac{d}{dL}\frac{dL_2}{dA_1}$ and $\frac{d}{dM}\frac{dL_2}{dA_1}$ when insurance or input constraints bind correspond to Proposition 3, and when labor constraints bind correspond to Proposition 4. * is used to indicate predictions that hold when additional assumptions are made.

5 Separation failures and adoption of irrigation

5.1 Empirical strategy

Our first specification to test for separation failures mirrors Equation 1, which we use to estimate the impacts of irrigation. We still make use of the discontinuity across the command area boundary, but outcomes are now on the household’s most important plot (plot 2) instead of the sample plot (plot 1).

\[
y_{2ist} = \beta_1 \text{CA}_{1is} + \beta_2 \text{Dist}_{1is} + \beta_3 \text{CA}_{1is} * \text{Dist}_{1is} + \beta_4 \text{CA}_{2is} + \gamma_1 X_{1is} + \gamma_2 X_{2is} + \alpha_{st} + \epsilon_{2ist} \tag{3}
\]

Equation 3 also includes controls $\text{CA}_{2is}$, an indicator for whether the most important plot is in the command area, and $X_{1is}$ and $X_{2is}$, the log area of the sample plot and the most important plot, respectively. We report $\beta_1$, the effect of the sample plot shock on outcomes on the most important plot. In other specifications, we also consider heterogeneity with respect to the location of the most important plot, and include $\text{CA}_{1is} * \text{CA}_{2is}$ to test for this. In these specifications, we also report this difference in differences coefficient.
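As an illustration only, the following sketch estimates a regression with the structure of Equation 3 on simulated data. Everything in it is an assumption made for the example: the data generating process, the coefficient values, the sample size, and the function names `estimate_spec3` and `simulate` are hypothetical; site-by-season fixed effects are absorbed with dummy variables; and no standard errors, clustered or otherwise, are computed. It is not the estimation code used for the results reported here.

```python
import numpy as np


def estimate_spec3(y2, ca1, dist1, ca2, x1, x2, cell):
    """OLS for a regression shaped like Equation 3.

    Returns the coefficient on CA1 (beta_1). `cell` holds integer
    site-by-season identifiers, absorbed here with dummy variables.
    """
    cells = np.unique(cell)
    dummies = (cell[:, None] == cells[None, :]).astype(float)
    design = np.column_stack([ca1, dist1, ca1 * dist1, ca2, x1, x2, dummies])
    coef, *_ = np.linalg.lstsq(design, y2, rcond=None)
    return coef[0]


def simulate(n=4000, beta1=-0.05, seed=0):
    """Hypothetical data: positive distance means inside the command area."""
    rng = np.random.default_rng(seed)
    dist1 = rng.uniform(-50.0, 50.0, n)
    ca1 = (dist1 > 0).astype(float)
    ca2 = rng.integers(0, 2, n).astype(float)
    x1 = rng.normal(-2.0, 0.5, n)   # log area, sample plot
    x2 = rng.normal(-2.0, 0.5, n)   # log area, most important plot
    cell = rng.integers(0, 6, n)    # site-by-season cell
    alpha = rng.normal(0.0, 0.1, 6)
    noise = rng.normal(0.0, 0.05, n)
    y2 = (beta1 * ca1 + 0.001 * dist1 - 0.0005 * ca1 * dist1
          + 0.1 * ca2 + 0.02 * x1 + 0.03 * x2 + alpha[cell] + noise)
    return y2, ca1, dist1, ca2, x1, x2, cell


# The simulated discontinuity at the boundary is -0.05 by construction.
beta1_hat = estimate_spec3(*simulate())
```

A negative estimate of the coefficient on the sample plot command area indicator in such a regression is what the text interprets as evidence of substitution away from the most important plot.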
For both this coefficient and $\beta_1$, in line with the model predictions in Table 5, we interpret negative coefficients on labor, inputs, irrigation use, and horticulture, as evidence of separation failures.36

As in Section 3, we include a specification with spatial fixed effects.37 Specifically, we estimate

\[
y_{2ist} - \bar{y}_{2ist} = \beta_1 (\text{CA}_{1is} - \overline{\text{CA}}_{1is}) + (V_{1is} - \bar{V}_{1is}) \gamma_1 + (V_{2is} - \bar{V}_{2is}) \gamma_2 + (\epsilon_{2ist} - \bar{\epsilon}_{2ist}) \tag{4}
\]

Our benchmark specification to test for which constraints drive the separation failures is similar, but also includes the interaction of household characteristics with the sample plot shock.38 We estimate

\[
y_{2ist} = \beta_1 \text{CA}_{1is} + W_i \beta_2 + \text{CA}_{1is} * W_i \beta_3 + \beta_4 \text{Dist}_{1is} + \beta_5 \text{CA}_{1is} * \text{Dist}_{1is} + \beta_6 \text{CA}_{2is} + \gamma_1 X_{1is} + \gamma_2 X_{2is} + \alpha_{st} + \epsilon_{2ist} \tag{5}
\]

where $W_i$ is a vector of household characteristics, which includes household size and an asset index in our primary specifications.39 We focus on $\beta_3$: the heterogeneity, with respect to household characteristics, of the impacts of the sample plot shock on outcomes on the most important plot. The signs on $\beta_3$ produce our main test of which market failures cause separation failures; Table 5 presents which signs map to which market failures.

We present results only for the dry seasons (2016 Dry, 2017 Dry, and 2018 Dry), because these are the primary seasons for irrigation use, during which we anticipate substitution effects. Additionally, we present results only on cultivation decisions and input use, because we expect these substitution effects to be smaller than the direct effects and therefore we do not anticipate being able to detect effects on output.40

36 Although we do not explicitly model the location of the most important plot in Section 4, in Supplementary Appendix B.1 we provide an extension of the model featuring crop choice and demonstrate these predictions hold for heterogeneity of substitution with respect to the location of the most important plot.

37 Note that all differencing in this specification is done using the location of sample plot. In other words, most important plots whose associated sample plots are near each other are compared, as opposed to most important plots which are near each other.

38 For parsimony, we only present the specification of the interaction for the specification without spatial fixed effects; all tables also present results with interactions included with spatial fixed effects, similar to Equation 4.

39 Through the lens of our model, household size and the asset index act as shifters of the household’s availability of labor and the household’s ability to purchase inputs, respectively. First, this requires that we have meaningful variation in household size and wealth in the cross-section. In our sample, household size has a standard deviation of 2.2 members (see Table 3), and a one standard deviation increase in the asset index corresponds to an additional 30,000 RwF of liquid assets (goats and chickens). For comparison, our estimates in Section 3.2 suggest that 1.3 members worth of labor and 10,000 RwF of inputs are necessary to cultivate median household landholdings with horticulture during the dry season. Second, it requires that household size and the asset index are not correlated with other household or plot characteristics that might affect patterns of substitution. In Appendix H, we justify our choice of shifters by showing that household size and asset index explain agricultural production decisions in the expected manner, and that this correlation is unaffected by the inclusion of key household covariates. In addition, in results available upon request, we have included other household and plot characteristics in the interaction that might affect these patterns – number of plots and number of command area plots (if households with more land are more price risk averse), and plot area (as a proxy for quality, if smaller plots are typically higher quality). None of these added controls affect the significance of our results.

40 Results for the rainy seasons and with output as an outcome are available upon request.

5.1.1 Balance

We now use specifications (3) and (4) to examine whether the most important plots in our sample are comparable for households whose sample plot is just inside or just outside the command area. As in Section 3.1.1, for each of these specifications, we show balance both with key controls omitted (Columns 3 and 4), and our preferred specifications which we use in our analysis with key controls included (Columns 5 and 6).

Balance tests for most important plots are reported in Table 6. First, note that specifications that do not restrict to the discontinuity sample perform particularly poorly here. Most notably, most important plots are more likely to be located in the command area when sample plots are also located in the command area, as households’ plots tend to be located near each other. In contrast, our preferred specifications (Columns 5 and 6) which restrict to the discontinuity sample correct for this imbalance. For both specifications, the omnibus test fails to reject the null of balance.

As an additional check, in Appendix B, we estimate for 2014 Dry specifications (3) and (4), and specifications with heterogeneity following Equation 5. As the command area, as of the baseline, had not yet caused a large increase in demand for labor or inputs, or caused large increases in agricultural production, we would not anticipate any effects on MIPs. In line with this prediction, we fail to find any consistent significant effects on MIPs, either in our main specifications or for heterogeneity.

Table 6: Balance: Most important plot characteristics

                        Full sample      RD sample
                        Coef. (SE) [p]   Dep. var.   Coef. (SE) [p]
                        (1)              (2)         (3)        (4)        (5)        (6)
log area                -0.108           -2.381      0.094      0.074
                        (0.068)          (1.041)     (0.128)    (0.136)
                        [0.114]          784         [0.460]    [0.588]
Own plot                0.025            0.875       0.040      0.033      0.039      0.029
                        (0.019)          (0.331)     (0.033)    (0.039)    (0.032)    (0.037)
                        [0.174]          784         [0.226]    [0.392]    [0.232]    [0.436]
Owned plot >5 years     0.005            0.960       0.012      0.033      0.011      0.030
                        (0.014)          (0.197)     (0.024)    (0.024)    (0.023)    (0.025)
                        [0.728]          585         [0.617]    [0.175]    [0.617]    [0.233]
Rented out, farmer      0.013            0.033       -0.026     -0.040     -0.029     -0.041
                        (0.010)          (0.179)     (0.022)    (0.025)    (0.023)    (0.026)
                        [0.224]          784         [0.249]    [0.114]    [0.222]    [0.116]
Slope                   0.024            0.268       0.011      -0.004     0.011      -0.006
                        (0.009)          (0.144)     (0.016)    (0.018)    (0.015)    (0.018)
                        [0.012]          784         [0.497]    [0.806]    [0.496]    [0.730]
Command area            0.187            0.399       -0.053     -0.079
                        (0.032)          (0.491)     (0.058)    (0.059)
                        [0.000]          784         [0.360]    [0.183]
Omnibus F-stat          7.1                          0.7        1.4        0.9        1.3
[p]                     [0.000]                      [0.627]    [0.220]    [0.458]    [0.279]
Site FE                                              X                     X
Distance to boundary                                 X          X          X          X
log area                                                                   X          X
MIP log area                                                               X          X
MIP CA                                                                     X          X
Spatial FE                                                      X                     X

Notes: Column 2 presents the mean of the dependent variable and the standard deviation of the dependent variable in parentheses, for sample plots in the main discontinuity sample that are outside the command area, and the total number of observations. Columns 1 and 3 through 6 present regression coefficients on a command area indicator, with standard errors in parentheses, and p-values in brackets. Column 1 uses the full sample, while Columns 2 through 6 use the discontinuity sample. Columns 5 and 6 use specifications (3) and (4), respectively.

5.2 Results

5.2.1 A test for separation failures

We now present results on separation failures, demonstrating that the sample plot shock causes farmers to substitute away from their most important plot. First, we present graphical evidence of this substitution in Figure 5.
As in earlier figures, distance of the sample plot to the canal in meters is represented on the horizontal axis, with a positive sign indicating that the plot is on the command area side of the boundary. However, we now plot outcomes on both the sample plot and the most important plot. In this figure, substitution will manifest as decreases in input and labor use on the most important plot when the sample plot is in the command area, while input and labor use increase on the sample plot. Second, we present regression evidence in Tables 7 and 8. In the discussion below, we focus on results from the tables, but we note that these results are consistent with visual intuition from Figure 5.

Table 7: Sample plot shock causes households to substitute labor and input intensive irrigated horticulture away from most important plot

                      Cultivated    Irrigated     Horticulture  Banana        HH labor/ha   Input exp./ha  Hired labor exp./ha
                      (1)           (2)           (3)           (4)           (5)           (6)            (7)
RDD (Site-by-season FE, Specification 3)
CA                    0.038         -0.044        -0.038        0.092         -32.2         -6.0           -1.8
                      (0.040)       (0.026)       (0.024)       (0.032)       (20.0)        (2.7)          (2.1)
                      [0.344]       [0.087]       [0.110]       [0.004]       [0.107]       [0.028]        [0.404]
Sample plot effect    0.005         0.162         0.137         -0.133        70.8          6.3            3.7
SFE (Spatial FE, Specification 4)
CA                    0.004         -0.036        -0.037        0.065         -33.2         -6.7           -0.5
                      (0.049)       (0.033)       (0.029)       (0.036)       (23.8)        (2.8)          (2.3)
                      [0.930]       [0.270]       [0.206]       [0.072]       [0.162]       [0.017]        [0.825]
Sample plot effect    0.022         0.171         0.156         -0.142        76.9          4.3            3.2
# of observations     2,179         2,179         2,179         2,179         2,166         2,169          2,169
# of clusters         182           182           182           182           182           182            182
Control mean          0.368         0.114         0.109         0.199         66.8          5.6            3.9

Notes: Regression coefficients on a command area indicator for the sample plot (“CA”) are presented above. Specifications control for distance to the command area boundary, its interaction with CA, log area of the sample plot and the most important plot, and a command area indicator for the most important plot. RDD specification includes site-by-season fixed effects, and SFE specification includes spatial fixed effects. Standard errors are in parentheses, and p-values in brackets. “Sample plot effect” estimates are from Table 4.

First, consistent with the presence of separation failures, we find households substitute labor and inputs away from their most important plot. Households decrease allocations of household labor (32 - 33 person-days/ha) and inputs (6,000 - 6,700 RwF/ha) on their most important plot in response to the sample plot shock. Additionally, they substitute away from labor and input intensive technologies, consistent with our interpretation of the production function as the envelope of production functions across crop choices. Households decrease use of irrigation (3.6 - 4.4pp) and cultivation of horticulture (3.7 - 3.8pp), while increasing cultivation of bananas (6.5 - 9.2pp).41

Figure 5: Regression discontinuity estimates of most important plot responses to sample plot shock

41 While these results are not consistently statistically significant, the specifications used lose power by including most important plots outside the command area, which are almost never irrigated and have small allocations of labor and inputs during the dry season. As discussed in the next paragraph, specifications which include the interaction of the sample plot command area indicator with a most important plot command area indicator are more precise for irrigation use, horticulture cultivation, and labor and input use.

Table 8: Sample plot shock causes households to substitute labor and input intensive irrigated horticulture away from most important plot

                      Cultivated    Irrigated     Horticulture  Banana        HH labor/ha   Input exp./ha  Hired labor exp./ha
                      (1)           (2)           (3)           (4)           (5)           (6)            (7)
RDD (Site-by-season FE, Specification 3)
CA                    0.076         -0.004        -0.004        0.096         -13.6         -3.3           0.2
                      (0.043)       (0.020)       (0.018)       (0.041)       (14.1)        (1.8)          (2.1)
                      [0.079]       [0.836]       [0.813]       [0.019]       [0.338]       [0.070]        [0.922]
CA * MIP CA           -0.089        -0.094        -0.080        -0.009        -44.1         -6.3           -4.7
                      (0.052)       (0.035)       (0.035)       (0.042)       (23.5)        (3.2)          (2.5)
                      [0.089]       [0.007]       [0.021]       [0.824]       [0.060]       [0.044]        [0.066]
Joint F-stat          2.1           3.6           2.7           4.5           1.8           2.6            1.8
[p]                   [0.122]       [0.028]       [0.070]       [0.013]       [0.164]       [0.078]        [0.175]
Sample plot effect    0.005         0.162         0.137         -0.133        70.8          6.3            3.7
Average effect        0.038         -0.044        -0.038        0.092         -32.2         -6.0           -1.8
SFE (Spatial FE, Specification 4)
CA                    0.059         0.010         -0.007        0.087         -15.4         -3.8           1.5
                      (0.048)       (0.026)       (0.023)       (0.044)       (19.2)        (2.1)          (2.6)
                      [0.215]       [0.686]       [0.771]       [0.047]       [0.422]       [0.076]        [0.546]
CA * MIP CA           -0.121        -0.103        -0.066        -0.048        -39.7         -6.5           -4.5
                      (0.056)       (0.045)       (0.044)       (0.044)       (31.2)        (3.7)          (3.1)
                      [0.030]       [0.021]       [0.133]       [0.275]       [0.204]       [0.079]        [0.146]
Joint F-stat          2.7           2.7           1.3           2.0           1.1           3.0            1.1
[p]                   [0.070]       [0.069]       [0.286]       [0.139]       [0.324]       [0.050]        [0.345]
Sample plot effect    0.022         0.171         0.156         -0.142        76.9          4.3            3.2
Average effect        0.004         -0.036        -0.037        0.065         -33.2         -6.7           -0.5
# of observations     2,179         2,179         2,179         2,179         2,166         2,169          2,169
# of clusters         182           182           182           182           182           182            182
Control mean          0.368         0.114         0.109         0.199         66.8          5.6            3.9

Notes: Regression coefficients on a command area indicator for the sample plot (“CA”) and its interaction with a command area indicator for the most important plot (“CA * MIP CA”) are presented above. Specifications control for distance to the command area boundary, its interaction with CA, log area of the sample plot and the most important plot, and a command area indicator for the most important plot (“MIP CA”). RDD specification includes site-by-season fixed effects, and SFE specification includes spatial fixed effects. Standard errors are in parentheses, and p-values in brackets. “Sample plot effect” estimates are from Table 4, and “Average effect” is the coefficient on CA from a specification that omits CA * MIP CA.

Next, we expect the results above to be driven primarily by most important plots located in the command area for most outcomes.
This is because there is limited irrigation, and therefore input use or horticulture during the dry season, on plots that cannot be irrigated. Consistent with this, we find our results on irrigation, horticulture, and inputs are all driven by plots located in the command area. When the most important plot is located in the command area, the 16 - 17pp increase in irrigation use on sample plots in the command area coincides with a 9 - 10pp decrease in irrigation use on the most important plot; these relative magnitudes suggest that separation failures cause few households to be able to use irrigation on more than one plot in the command area. As discussed in Section 3, the direct effects of the command area appear driven by enabling the transition to dry season horticultural cultivation and substitution away from lower value banana cultivation. However, the model in Section 4 is agnostic about whether decreases in labor and input allocations on the most important plot are driven by extensive margin responses (i.e., decreases in horticulture) or intensive margin responses (i.e., decreases in labor and input allocations conditional on crop choice). To test this, in Table A6, we present results of the sample plot shock on sample plots and most important plots, controlling for cultivation and crop choice.42 Table A6 confirms that the effects we document in Section 3 are driven by the shift to dry season horticulture, as effects on sample plots all but disappear controlling for crop choice. Shifting to most important plots, Table A6 suggests that much of the effect of the sample plot shock on labor and input use on most important plots is driven by intensive margin responses, as coefficients on household labor and inputs fall by only 23% - 36%. Combined with our results on irrigation use and horticulture, this suggests that households respond to the sample plot shock on both the intensive and extensive margins on their most important plot. 
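The 9 - 10pp figure can be recovered from the irrigation column of Table 8 by adding the coefficient on CA to the coefficient on CA * MIP CA. The worked arithmetic below uses only those reported point estimates, with RDD and SFE denoting the two panels of that table; it says nothing about the precision of the sums.

```latex
\begin{align*}
\text{RDD:}\quad & -0.004 + (-0.094) = -0.098 \\
\text{SFE:}\quad & \phantom{-}0.010 + (-0.103) = -0.093
\end{align*}
```

Both sums correspond to a decline of roughly 9 - 10pp in irrigation use on most important plots in the command area, against a direct effect of 16 - 17pp on the sample plot (0.162 and 0.171 in the "Sample plot effect" rows).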
These results on separation failures imply the existence of a within-household negative spillover, as they show that having one additional plot in the command area causes a household to substitute away from their other plots, reducing their use of irrigation, labor, and inputs on those plots. In principle, this means that our estimates of the impacts of irrigation are the impacts of irrigating one of a farmer’s plots, gross of any input reallocations made by the farmer across plots in response to that irrigation. We would be particularly concerned about the bias generated by these reallocations if inputs were being shifted out of production on non-irrigated plots: in that case, our estimated impacts of access to irrigation would include reduced farming intensity on non-irrigated plots. However, the substitution of inputs we estimate from most important plots outside the command area is generally not significantly different from zero, and the largest point estimate implies it is 37% as large as substitution away from command area plots.43 We therefore conclude that the dominant within-household spillover is a reduced intensity of cultivation on irrigated plots. This suggests any bias in our estimates caused by spillovers onto plots outside the command area is likely to be small, and that spillovers onto plots inside the command area decrease our estimates in Section 3.

42 As crop fixed effects are a “bad control” (Angrist & Pischke, 2008), which introduces selection bias, we interpret these results as suggestive. However, we anticipate that selection conditional on crop choice should bias us towards finding no intensive margin effect on most important plots, as the particularly constrained households switching out of horticulture in response to the sample plot shock are likely to be the households who used less labor and inputs.
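As a check on the 37% figure, the ratio of substitution on most important plots outside the command area to substitution on those inside it can be recomputed from the input expenditure column of Table 8. The sketch below is illustrative: the function name `outside_to_inside_ratio` is hypothetical, and the inputs are the reported point estimates only, so sampling uncertainty is ignored.

```python
def outside_to_inside_ratio(beta_ca, beta_interaction):
    """Substitution on most important plots outside the command area
    relative to substitution on those inside it:
    beta_1 / (beta_1 + beta_{3,CA})."""
    return beta_ca / (beta_ca + beta_interaction)


# Input expenditure column of Table 8: coefficients on CA and CA * MIP CA.
rdd_ratio = outside_to_inside_ratio(-3.3, -6.3)
sfe_ratio = outside_to_inside_ratio(-3.8, -6.5)
```

With these inputs the RDD panel gives a ratio of roughly 0.34 and the SFE panel roughly 0.37, the latter matching the largest point estimate cited in the text.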
5.2.2 Impacts of separation failures on adoption of irrigation

We now quantify the impact of separation failures on adoption of irrigation. We ask what would happen to adoption of irrigation if all households with two or more plots in the command area only had one plot in the command area. This counterfactual follows naturally from our estimates of the effect of the sample plot shock on adoption of irrigation on the most important plot, which we can interpret as the effect of a household’s second plot (the sample plot) being moved to the command area on adoption of irrigation on its first plot in the command area (the most important plot). Specifically, we calculate

\[
\frac{(\#\text{ of HH with 2 CA plots}) \cdot 2 \cdot (\beta_1 + \beta_{3,CA})}{(\#\text{ of HH with 2 CA plots}) \cdot 2 + (\#\text{ of HH with 1 CA plot})} \tag{6}
\]

First, $(\beta_1 + \beta_{3,CA})$ (from Equation 5) is the total effect of the sample plot shock on adoption of irrigation on most important plots in the command area. Second, in the denominator, we count the total number of command area plots among households’ sample plots and most important plots.44 Third, in the numerator, we apply the estimated substitution caused by the sample plot shock to both the sample plot and the most important plot, as households are also substituting away from their sample plot when the most important plot is in the command area.

43 We calculate this using $\beta_1 / (\beta_1 + \beta_{3,CA})$ from Equation 5, the impact of the sample plot shock on input allocations on most important plots outside the command area divided by the impact on most important plots inside the command area. Estimates come from Table 8.

44 We implicitly ignore households’ other plots; we do so because our research design has little to say about the impacts of additional command area plots, or on households’ behavior on these plots, so we interpret this exercise as estimating a lower bound on the impact of reallocation on adoption of irrigation.

We find adoption of irrigation would be 4.8pp higher under this counterfactual, which represents a 30% increase.45 This counterfactual relates to land market frictions: absent these frictions, we would expect that the increased adoption of irrigation caused by this reallocation would be achieved by land markets. Intuitively, under perfect land markets, characteristics of the household that manages a particular command area plot at baseline, including the number of other command area plots that household managed at baseline, should not affect equilibrium adoption of irrigation on that plot. Relatedly, as shown in the model, this would also be true if all markets (except potentially land markets) were frictionless.

45 The p-value on this estimate is 0.077, which we calculate using block bootstrapped standard errors at the nearest water user group level to account for uncertainty in both the numerator and denominator of Equation 6.

5.2.3 Separating constraints

We now provide evidence on the source of the separation failure by estimating heterogeneous impacts of the sample plot shock on outcomes on the most important plot. Recall that for this analysis, the model makes two key predictions. First, if only insurance or input constraints bind, wealthier households and larger households should be less responsive. Second, if only labor constraints bind, the differential responsiveness of wealthier households and larger households cannot in general be signed. However, under additional assumptions, households with more elastic on-farm labor supply (likely poorer households and larger households) should be less responsive. Note that this test does not allow us to reject a null that a particular constraint exists; any pattern of differential responses is consistent with all constraints binding. However, if we observe that wealthier households are more responsive, we can reject the null of no labor constraints. Additionally, we would interpret observing wealthier households to be more responsive and larger households to be less responsive as the strongest evidence of the presence of labor constraints from this test.

We present the results of this test in Table 9. First, larger households are less responsive to the sample plot shock across every outcome. A household with 2 additional members, approximately one standard deviation of household size, is less responsive to the sample plot shock on its most important plot by 50% - 94% for irrigation use, 74% - 103% for horticulture, 63% - 75% for household labor, and 20% - 21% for inputs, with all but the input coefficient statistically significant. In contrast, wealthier households are more responsive to the sample plot shock across these same outcomes. A household with a one standard deviation higher asset index is more responsive to the sample plot shock on its most important plot by 41% - 97% for irrigation use, 39% - 81% for horticulture, 39% - 72% for household labor, and 42% - 58% for input use; however, these results are less precise. In effect, these results suggest that our estimates of separation failures are driven by the behavior of small, rich households, while large, poor households do not change their allocations on their most important plot in response to the sample plot shock. As discussed in Section 4, these results are very difficult to reconcile with a model that does not feature labor market failures. In sum, these results provide strong evidence for the existence of labor market failures that generate separation failures, which in turn cause inefficient adoption of irrigation.
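As a reading aid, the percentage ranges quoted for household size and the asset index can be recomputed from the point estimates in Table 9. The sketch below assumes each percentage is the interaction coefficient (times 2 members for household size, or times 1 standard deviation for the asset index) divided by the absolute value of the corresponding “Average effect”. This formula is inferred from the reported numbers rather than stated in the text, and the dictionary keys and function name are illustrative, not the authors’ own.

```python
# Point estimates copied from Table 9 (irrigated, horticulture, HH labor/ha,
# input exp./ha). Keys are illustrative labels, not names used by the authors.
TABLE9 = {
    "RDD": {
        "members": {"irrigated": 0.011, "horticulture": 0.014, "hh_labor": 10.1, "inputs": 0.6},
        "assets": {"irrigated": -0.018, "horticulture": -0.015, "hh_labor": -12.4, "inputs": -2.5},
        "average": {"irrigated": -0.044, "horticulture": -0.038, "hh_labor": -32.2, "inputs": -6.0},
    },
    "SFE": {
        "members": {"irrigated": 0.017, "horticulture": 0.019, "hh_labor": 12.5, "inputs": 0.7},
        "assets": {"irrigated": -0.035, "horticulture": -0.030, "hh_labor": -24.0, "inputs": -3.9},
        "average": {"irrigated": -0.036, "horticulture": -0.037, "hh_labor": -33.2, "inputs": -6.7},
    },
}


def relative_change_pct(spec, term, outcome, units=1.0):
    """Percent change in the size of the (negative) average effect.

    Assumed formula: units * interaction coefficient / |average effect|.
    Positive values mean a smaller response; negative values a larger one.
    """
    block = TABLE9[spec]
    return 100.0 * units * block[term][outcome] / abs(block["average"][outcome])


if __name__ == "__main__":
    for outcome in ("irrigated", "horticulture", "hh_labor", "inputs"):
        for spec in ("RDD", "SFE"):
            size = relative_change_pct(spec, "members", outcome, units=2)  # 2 extra members
            wealth = relative_change_pct(spec, "assets", outcome)  # +1 SD asset index
            print(f"{outcome:12s} {spec}: +2 members {size:7.1f}%, +1 SD assets {wealth:7.1f}%")
```

Under this assumed formula, the lower end of each quoted range comes from the RDD specification and the upper end from the SFE specification. The calculation uses point estimates only and says nothing about the precision of the ratios.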
Table 9: Larger and poorer households do not substitute away from most important plot in response to sample plot shock

                        Cultivated    Irrigated     Horticulture  Banana        HH labor/ha   Input exp./ha Hired labor exp./ha
                        (1)           (2)           (3)           (4)           (5)           (6)           (7)
RDD (Site-by-season FE, Specification 5)
CA                      -0.069        -0.097        -0.107        0.052         -82.7         -9.1          -4.5
                        (0.086)       (0.048)       (0.046)       (0.065)       (34.8)        (4.2)         (3.6)
                        [0.424]       [0.046]       [0.020]       [0.418]       [0.018]       [0.031]       [0.216]
CA * # of HH members    0.021         0.011         0.014         0.008         10.1          0.6           0.5
                        (0.013)       (0.007)       (0.007)       (0.011)       (4.5)         (0.5)         (0.5)
                        [0.112]       [0.155]       [0.061]       [0.485]       [0.025]       [0.237]       [0.273]
CA * Asset index        -0.013        -0.018        -0.015        -0.003        -12.4         -2.5          0.3
                        (0.027)       (0.016)       (0.017)       (0.023)       (10.2)        (1.6)         (1.4)
                        [0.620]       [0.277]       [0.384]       [0.900]       [0.226]       [0.117]       [0.856]
Joint F-stat [p]        1.5           1.5           1.8           3.1           2.0           1.9           0.6
                        [0.213]       [0.214]       [0.147]       [0.026]       [0.122]       [0.128]       [0.592]
Average effect          0.038         -0.044        -0.038        0.092         -32.2         -6.0          -1.8
SFE (Spatial FE, Specification 5)
CA                      -0.188        -0.121        -0.129        -0.052        -94.6         -10.3         -2.1
                        (0.098)       (0.051)       (0.047)       (0.083)       (38.9)        (4.1)         (3.5)
                        [0.056]       [0.017]       [0.006]       [0.532]       [0.015]       [0.013]       [0.551]
CA * # of HH members    0.039         0.017         0.019         0.023         12.5          0.7           0.3
                        (0.014)       (0.008)       (0.007)       (0.015)       (4.5)         (0.5)         (0.5)
                        [0.007]       [0.030]       [0.012]       [0.110]       [0.006]       [0.185]       [0.539]
CA * Asset index        -0.043        -0.035        -0.030        -0.012        -24.0         -3.9          -0.3
                        (0.032)       (0.020)       (0.021)       (0.026)       (12.7)        (1.7)         (1.4)
                        [0.181]       [0.077]       [0.156]       [0.661]       [0.060]       [0.025]       [0.813]
Joint F-stat [p]        3.5           2.3           2.6           2.1           2.9           2.6           0.1
                        [0.015]       [0.079]       [0.050]       [0.094]       [0.033]       [0.051]       [0.937]
Average effect          0.004         -0.036        -0.037        0.065         -33.2         -6.7          -0.5
# of observations       2,176         2,176         2,176         2,176         2,163         2,166         2,166
# of clusters           182           182           182           182           182           182           182
Control mean            0.368         0.114         0.109         0.199         66.8          5.6           3.9

Notes: Regression coefficients on a command area indicator for the sample plot (“CA”) and its interaction with W (“CA * W”) are presented above. Specifications control for distance to the command area boundary, its interaction with CA, log area of the sample plot and the most important plot, a command area indicator for the most important plot (“MIP CA”), and all W. RDD specification includes site-by-season fixed effects, and SFE specification includes spatial fixed effects. Standard errors are in parentheses, and p-values in brackets. “Average effect” estimates are from Table 8.

6 Experimental evidence

Our results leveraging the discontinuity suggest that land and labor market frictions combine to constrain the adoption of hillside irrigation in Rwanda. In this section, we provide evidence from randomized controlled trials on the presence of other competing constraints to adoption of irrigation: management challenges of irrigation schemes, and financial and informational constraints. Additional details on the motivation, treatment assignment protocols, and logistics of implementation of each of these experiments are presented in Appendix I.

First, we test whether failures of scheme management limit farmers’ adoption of irrigation. If farmers faced limited access to water due to problems in the centralized operations and maintenance (O&M) system, this could constrain adoption of irrigation. We sought to alleviate this potential constraint by randomizing empowerment of local monitors to assist system operators and report maintenance needs. We find no evidence this experiment changed cultivation practices. This result is likely because very few farmers report any challenges related to operations and maintenance over the four years of survey data collection. Second, the government planned to charge farmers in the command area land taxes, which were unconditional on cultivation decisions, to fund O&M in the schemes. To test whether these taxes limit farmers’ adoption of irrigation, we randomized tax subsidies for farmers. We find no evidence this experiment changed cultivation practices. This result is likely because compliance with the fees was extremely low (4%), so collected fees were too low to plausibly constrain farmers. We discuss these experiments further in Appendix I.2, and conclude here that these issues were not relevant in this context.

Third, we test whether financial and informational constraints limit adoption of irrigation. We assigned horticultural minikits to randomly selected farmers from water user group member lists. Each minikit included horticultural seeds, chemical fertilizer, and insecticide, in sufficient quantities to cultivate 0.02 ha. In principle, these minikits should resolve constraints related to input access, including credit constraints. In addition, they should reduce basis risk, which may resolve insurance constraints, and facilitate experimentation if information is a constraint. In other contexts, minikits of similar size relative to median landholdings have been shown to increase adoption of new crop varieties or varieties with low levels of adoption (Emerick et al., 2016; Jones et al., 2018).
To test for spillovers, water user groups were randomly assigned to 20%, 60%, or 100% minikit saturation, with rerandomization for balance on Zone and O&M treatment status. Minikits were offered to assigned individuals prior to 2017 Rainy 1 and 2017 Dry.46

46 Each of these three interventions exists only in the command area. As such, the effects of irrigation estimated throughout this paper are averages across the experimental treatments. Overall, this concern is mitigated by the fact that all three experimental treatments had very limited impacts on cultivation practices. In addition, the first two of these treatments (fee subsidies and monitoring systems) vary characteristics which would be heterogeneous across different irrigation systems; we are therefore comfortable with the interpretation that the estimates above apply to the average of these treatments. Readers may be most concerned about interpretations of treatment effects in the presence of the minikit treatment; in addition to the modest effects on cultivation described below, we have also conducted analysis excluding minikit winners and conclusions are qualitatively unaffected.

6.1 Empirical strategy and results

We estimate the impact of minikits using the specification

\[
y_{1ist} = \beta_1 \text{Assigned minikit}_i + \beta_2 \text{Minikit saturation}_i + X_{1is}\gamma + \epsilon_{1ist} \tag{7}
\]

Primary outcomes $y$ are whether households used a minikit (in 2017 Rainy 1 or in 2017 Dry) and adoption of horticulture. $\text{Assigned minikit}_i$ is an indicator for whether household $i$ was randomly assigned to receive a minikit, $\text{Minikit saturation}_i$ is the probability of receiving a minikit for households in the water user group of household $i$’s sample plot, and $X_{1is}$ includes the stratification variables (Zone fixed effects and O&M treatment status), as well as indicator variables reflecting the probability that a household would receive a minikit,47 and in some specifications 2016 Dry horticulture adoption. For precision, we restrict to command area plots, and for plot level outcomes we focus on the 2017 and 2018 Dry seasons; these are the plots and seasons in which we expect households receiving minikits to adopt horticulture. As minikit saturation is assigned at the water user group level, robust standard errors are clustered at the water user group level.

We present the results of this analysis in Table 10. First, we find a strong first stage; households assigned to receive a minikit are 40pp more likely to use a minikit than households not assigned to receive a minikit. Almost all non-compliance is driven by households who were assigned to receive a minikit but did not pick it up: 4.8% of households not assigned to receive a minikit used one, while 43.8% of households assigned to receive a minikit used one. Second, we find no effects of minikits on horticulture use, and we have sufficient precision to reject estimates from other contexts of the effect of minikits on technology adoption (Emerick et al., 2016; Jones et al., 2018). Third, we find no effects of minikit saturation, although these estimates are less precise than those of the impacts of assignment to receive a minikit; we note that we also fail to reject that the sum of the coefficients on assigned minikit and minikit saturation (the effect on adoption in a fully treated compared to an untreated water user group) is zero. Fourth, we find strong positive selection into using a minikit: farmers who grew horticulture in 2016 Dry, who are 30.6pp more likely to grow horticulture in 2017 and 2018 Dry, are 13.1pp more likely to use a minikit in response to assignment to receive a minikit.

We interpret these results as corroborating evidence that information and financial constraints are not dominant constraints to adoption of irrigation. Most farmers assigned to receive a minikit do not pick it up and use it, and the farmers who do
pick it up typically would have grown horticulture even if not assigned to receive a minikit.

47 After matching names from the lists of water user group members to our baseline survey, we found that 32% of households either had multiple household members on the lists of water user group members or had a single household member listed multiple times; these households are more likely to be assigned to receive a minikit and may differ from other households.

Table 10: Minikits do not cause increased adoption of horticulture, strong positive selection into minikit takeup

                                            Minikit takeup      Horticulture
                                            (1)       (2)       (3)       (4)
Assigned minikit                            0.398     0.395     0.035     0.052
                                            (0.038)   (0.044)   (0.041)   (0.042)
                                            [0.000]   [0.000]   [0.396]   [0.221]
Minikit saturation                          -0.047    -0.064    -0.078    -0.067
                                            (0.056)   (0.057)   (0.054)   (0.054)
                                            [0.394]   [0.260]   [0.149]   [0.218]
Horticulture (2016 Dry)                               0.046               0.306
                                                      (0.049)             (0.053)
                                                      [0.345]             [0.000]
Assigned minikit * Horticulture (2016 Dry)            0.131               -0.019
                                                      (0.068)             (0.070)
                                                      [0.052]             [0.788]
# of lotteries entered                      X         X         X         X
O&M treatment                               X         X         X         X
Zone FE                                     X         X         X         X
# of observations                           910       762       838       727
# of clusters                               187       170       182       167

Notes: Regression analysis is presented in this table. All columns use outcomes on sample plots. Each row presents coefficients, with robust standard errors clustered at the water user group level in parentheses, and p-values in brackets. “Assigned minikit” is an indicator for whether the household was assigned to receive a minikit, “Minikit saturation” is the probability of minikit assignment that was assigned to the water user group of the household’s sample plot, and “Horticulture (2016 Dry)” is an indicator that the household planted horticulture on their sample plot in 2016 Dry.
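The comparisons discussed around Table 10 are simple linear combinations of the reported coefficients, and the sketch below recomputes them from the point estimates. It is a reading aid under stated assumptions: saturation is treated as running from 0 (an untreated water user group) to 1 (a fully treated one), as the text describes, and the dictionary keys and function names are illustrative rather than taken from the authors’ code.

```python
# Point estimates copied from Table 10; keys are illustrative labels only.
TAKEUP_COL2 = {"assigned": 0.395, "assigned_x_hort2016": 0.131}
HORT_COL3 = {"assigned": 0.035, "saturation": -0.078}
HORT_COL4 = {"assigned": 0.052, "saturation": -0.067, "assigned_x_hort2016": -0.019}


def raw_takeup_gap(share_assigned, share_not_assigned):
    """Unadjusted difference in minikit use between assigned and unassigned households."""
    return share_assigned - share_not_assigned


def fully_treated_vs_untreated(col):
    """Assigned household in a saturation-1 group vs. unassigned household in a saturation-0 group."""
    return col["assigned"] + col["saturation"]


def assignment_effect_prior_growers(col):
    """Effect of assignment for households that grew horticulture in 2016 Dry."""
    return col["assigned"] + col["assigned_x_hort2016"]


if __name__ == "__main__":
    print(f"raw takeup gap (43.8% vs 4.8%): {raw_takeup_gap(0.438, 0.048):.3f}")
    print(f"fully treated vs untreated, col (3): {fully_treated_vs_untreated(HORT_COL3):.3f}")
    print(f"fully treated vs untreated, col (4): {fully_treated_vs_untreated(HORT_COL4):.3f}")
    print(f"takeup effect, 2016 growers, col (2): {assignment_effect_prior_growers(TAKEUP_COL2):.3f}")
    print(f"horticulture effect, 2016 growers, col (4): {assignment_effect_prior_growers(HORT_COL4):.3f}")
```

The raw takeup gap is close to, though not identical to, the regression-adjusted first stage in columns (1) and (2), which also conditions on the listed controls. These sums are point estimates only; testing whether they differ from zero requires the covariance between the coefficients, which the table does not report.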
We similarly find no evidence that saturation of minikits leads to increased adoption, as we might expect if learning were important.48 Our experimental evidence therefore supports the conclusion that, in this context, financial and informational frictions are not the primary explanations for the low and inefficient irrigation use.

48 That information is not a binding constraint is also consistent with the stability in levels of irrigation adoption that we observe over time, in contrast to an S-curve of adoption which would be consistent with learning.

7 Conclusion

This paper provides evidence that irrigation has the potential to be a transformative technology in sub-Saharan Africa. Using data from very proximate plots which receive differential access to irrigation, we show the construction of an irrigation system leads to a 53% - 71% increase in cash profits. These profits are generated by a switch in cropping patterns from perennial bananas towards a more input-intensive rotation of dry-season horticulture and rainy-season staples.

At the same time, we observe only a minority of farmers adopting this technology four years after introduction. We further document that frictions in land and labor markets cause inefficient adoption of irrigation. This result provides novel evidence that separation failures in agricultural household production lead to land misallocation and inefficient adoption of a new technology in Rwanda. This result has stark policy relevance: without greater adoption, these irrigation systems will not be able to generate sufficient revenue to be sustainable.

While our results highlight the presence of constraints on land and labor markets and demonstrate those constraints generate inefficiencies in an important technology adoption context, they cannot provide evidence as to whether those inefficiencies lead to too much or too little adoption of irrigation.
If farmers faced no constraints in the amount of labor they could sell at the market wage, the model would suggest they would irrigate even less; indeed, based on the stylized facts in Section 2.3, the model would predict nearly all of these farmers would either cultivate low-intensity bananas or exit farming altogether. Of course, a labor supply shift of this magnitude is likely to put downward pressure on the wage. If the ability to sell labor without frictions led to a substantial reduction in the wage rate, we may see many more farmers hiring labor and cultivating horticulture on more of their plots: this shift could allow irrigation systems to realize their transformative potential.

These results underscore the need for more evidence on both the role of factor markets in technology adoption, and the identification of particular institutions which contribute to or which can smooth those market failures. In some cases, these market failures may pose a competing constraint which coexists with other, more conventional constraints to production: if frictions in factor markets similarly constrain adoption of new technologies in other environments, then incomplete factor markets may limit the effectiveness of financial and information interventions in improving agricultural productivity. This is a fruitful area for future research.

References

Adamopoulos, T., Brandt, L., Leight, J., & Restuccia, D. (2017). Misallocation, Selection and Productivity: A Quantitative Analysis with Panel Data from China. Working Paper 23039, National Bureau of Economic Research.
Adamopoulos, T. & Restuccia, D. (2014). The size distribution of farms and international productivity differences. American Economic Review, 104(6), 1667–97.
Adamopoulos, T. & Restuccia, D. (2018). Geography and Agricultural Productivity: Cross-Country Evidence from Micro Plot-Level Data. Working Paper 24532, National Bureau of Economic Research.
Agness, D., Baseler, T., Chassang, S., Dupas, P., & Snowberg, E. (2020). Valuing the time of the self-employed.
Ali, D. A., Deininger, K., & Goldstein, M. (2014). Environmental and gender impacts of land tenure regularization in Africa: Pilot evidence from Rwanda. Journal of Development Economics, 110, 262–275.
Anderson, M. (2008). Multiple inference and gender differences in the effects of early intervention: A reevaluation of the Abecedarian, Perry Preschool, and Early Training projects. Journal of the American Statistical Association, 103(484), 1481–1495.
Angrist, J. D. & Pischke, J.-S. (2008). Mostly harmless econometrics: An empiricist’s companion. Princeton University Press.
Attwood, D. W. (2005). Big is ugly? How large-scale institutions prevent famines in western India. World Development, 33(12), 2067–2083.
Barrett, C. B. (1996). On price risk and the inverse farm size-productivity relationship. Journal of Development Economics, 51(2), 193–215.
Benjamin, D. (1992). Household composition, labor markets, and labor demand: testing for separation in agricultural household models. Econometrica: Journal of the Econometric Society, pp. 287–322.
Besley, T. (1995). Property rights and investment incentives: Theory and evidence from Ghana. Journal of Political Economy, 103(5), 903–937.
Besley, T. & Ghatak, M. (2010). Property rights and economic development. In Handbook of Development Economics, volume 5 (pp. 4525–4595). Elsevier.
Bleakley, H. & Ferrie, J. (2014). Land openings on the Georgia frontier and the Coase theorem in the short- and long-run.
Breza, E., Krishnaswamy, N., & Kaur, S. (2018). Scabs: The social suppression of labor supply.
Conley, T. G. (1999). GMM estimation with cross-sectional dependence. Journal of Econometrics, 92(1), 1–45.
Conley, T. G. & Udry, C. R. (2010). Learning about a new technology: Pineapple in Ghana. The American Economic Review, pp. 35–69.
De Janvry, A., Sadoulet, E., & Suri, T. (2017).
Field experiments in developing country agriculture. In Handbook of Economic Field Experiments, volume 2 (pp. 427–466). Elsevier.
Deininger, K. & Feder, G. (2009). Land registration, governance, and development: Evidence and implications for policy. The World Bank Research Observer, 24(2), 233–266.
Dillon, A. (2011). The effect of irrigation on poverty reduction, asset accumulation, and informal insurance: Evidence from northern Mali. World Development, 39(12), 2165–2175.
Dillon, A. & Fishman, R. (2019). Dams: Effects of hydrological infrastructure on development. Annual Review of Resource Economics, 11.
Dillon, B. & Barrett, C. B. (2017). Agricultural factor markets in sub-Saharan Africa: An updated view with formal tests for market failure. Food Policy, 67, 64–77.
Dillon, B., Brummund, P., & Mwabu, G. (2019). Asymmetric non-separation and rural labor markets. Journal of Development Economics, 139, 78–96.
Duflo, E. & Pande, R. (2007). Dams. The Quarterly Journal of Economics, 122(2), 601–646.
Emerick, K., de Janvry, A., Sadoulet, E., & Dar, M. H. (2016). Technological innovations, downside risk, and the modernization of agriculture. American Economic Review, 106(6), 1537–61.
Fafchamps, M. (1993). Sequential labor decisions under uncertainty: An estimable household model of West-African farmers. Econometrica, 61, 1173–1197.
Fei, J. C. & Ranis, G. (1961). A theory of economic development. The American Economic Review, 51(4), 533–565.
Foster, A. & Rosenzweig, M. (2017). Are there too many farms in the world? Labor-market transaction costs, machine capacities, and optimal farm size. NBER Working Paper 23909.
Foster, A. D. & Rosenzweig, M. R. (1996). Technical change and human-capital returns and investments: evidence from the green revolution. The American Economic Review, pp. 931–953.
Goldstein, M., Houngbedji, K., Kondylis, F., O’Sullivan, M., & Selod, H. (2018). Formalization without certification?
Experimental evidence on property rights and investment. Journal of Development Economics, 132, 57–74.
Goldstein, M. & Udry, C. (2008). The profits of power: Land rights and agricultural investment in Ghana. Journal of Political Economy, 116(6), 981–1022.
Gollin, D. & Udry, C. (2019). Heterogeneity, measurement error and misallocation: Evidence from African agriculture. NBER Working Paper 25440.
Heisey, P. W. & Norton, G. W. (2007). Fertilizers and other farm chemicals. Handbook of Agricultural Economics, 3, 2741–2777.
Jacoby, H. G. (1993). Shadow wages and peasant family labour supply: an econometric application to the Peruvian Sierra. The Review of Economic Studies, 60(4), 903–921.
Jacoby, H. G. (2017). “Well-fare” economics of groundwater in South Asia. The World Bank Research Observer, 32(1), 1–20.
Jones, M., Kondylis, F., Mobarak, A. M., & Stein, D. (2018). Evaluating the integrated agriculture productivity project in Bangladesh.
Karlan, D., Osei, R., Osei-Akoto, I., & Udry, C. (2014). Agricultural decisions after relaxing credit and risk constraints. The Quarterly Journal of Economics, 129(2), 597–652.
Kaur, S. (2014). Nominal wage rigidity in village labor markets. NBER Working Paper 20770.
Key, N., Sadoulet, E., & de Janvry, A. (2000). Transactions costs and agricultural household supply response. American Journal of Agricultural Economics, 82(2), 245–259.
LaFave, D. & Thomas, D. (2016). Farms, families, and markets: New evidence on completeness of markets in agricultural settings. Econometrica, 84(5), 1917–1960.
Lewis, W. A. (1954). Economic development with unlimited supplies of labour. The Manchester School, 22(2), 139–191.
Magruder, J. (2012). High unemployment yet few small firms: The role of centralized bargaining in South Africa. American Economic Journal: Applied Economics, 4(3), 138–66.
Magruder, J. (2013). Can minimum wages cause a big push? Evidence from Indonesia. Journal of Development Economics, 100(1), 48–62.
Ostrom, E. (1990).
Governing the commons: The evolution of institutions for collective action. Cambridge University Press.
Restuccia, D. & Santaeulalia-Llopis, R. (2017). Land misallocation and productivity. NBER Working Paper 23128.
Sekhri, S. (2014). Wells, water, and welfare: the impact of access to groundwater on rural poverty and conflict. American Economic Journal: Applied Economics, 6(3), 76–102.
Sen, A. K. (1966). Peasants and dualism with or without surplus labor. Journal of Political Economy, 74(5), 425–450.
Shenoy, A. (2017). Market failures and misallocation. Journal of Development Economics, 128, 65–80.
Singh, I., Squire, L., & Strauss, J. (1986). Agricultural household models: Extensions, applications, and policy. The Johns Hopkins University Press.
Skoufias, E. (1994). Using shadow wages to estimate labor supply of agricultural households. American Journal of Agricultural Economics, 76(2), 215–227.
Smith, C. (2019). Land concentration and long-run development: Evidence from the frontier United States.
Suri, T. (2011). Selection and comparative advantage in technology adoption. Econometrica, 79(1), 159–209.
Udry, C. (1996). Gender, agricultural production, and the theory of the household. Journal of Political Economy, 104(5), 1010–1046.
Udry, C. (1997). Efficiency and market structure: testing for profit maximization in African agriculture.
World Bank (2007). World development report 2008: Agriculture for development.

appendix – for online publication

Appendix A Main variable appendix

Household variables: All household variables are constructed from the baseline.
• HHH female: Indicator that the household head is female.
• HHH age: Age of the household head.
• HHH completed primary: Indicator that the household head completed primary.
• HHH worked off farm: Indicator that the household head worked off farm.
• # of plots: Number of plots reported as managed by the household.
Includes plots rented in, plots owned and cultivated in the past year, and plots rented out.
• # of HH members: Number of members of the household.
• # of HH members (15-64): Number of members of the household between age 15 and 64.
• # of HH members who worked off farm: Number of members of the household who worked off farm.
• Housing expenditures: Expenditures over the past year on housing and furnishing. Winsorized at the 99th percentile.
• Asset index: First principal component of log number of assets-by-category owned and an indicator for positive number of assets-by-category owned, where the categories are cows, goats, pigs, chickens, radios, mobile phones, pieces of furniture, bicycles, and shovels. Standardized to be mean 0 and standard deviation 1, with positive values indicating more assets.
• Food security index: First principal component of log days in the past week of consumption of food item-by-category and an indicator for any consumption of food item-by-category. In baseline, categories are flour, bread, rice, meat and fish, poultry and eggs, dairy products, cooking oil, fruits, beans, vegetables, plantains and cassava and potatoes, juice and soda, sugar and honey, salt and spices, meals prepared outside home, and groundnut and other oilseed flour. In follow up surveys, categories are flour, bread, cakes and chapati and mandazi, rice, small fish, meats and other fish, poultry and eggs, dairy products, peanut oil, palm oil and other cooking oil, avocados, other fruits, beans, tomato, onion, other vegetables, plantains, Irish potatoes, sweet potatoes, sugar, salt, local banana beer at home, groundnut flour. Standardized to be mean 0 and standard deviation 1, with positive values indicating more consumption.
• Overall index: Index constructed following Anderson (2008) using housing expenditures, asset index, and food security index.

Plot variables: All plot variables are constructed from the baseline.
• Command area: Indicator that the plot is located in the command area, equal to 1 if any share of the plot is inside the command area. Calculated from plot map.
• Distance to boundary: Distance from plot boundary to command area boundary, 0 for plots whose plot map intersects the boundary. Positive for plots that are inside the command area, negative for plots that are outside the command area. Calculated from plot map.
• Area: Area in hectares. Calculated from plot map.
• Water user group: Water user group that the plot is located in, calculated from plot map. If the plot intersects multiple water user group boundaries, the water user group in which the largest share of the plot’s area is contained. Missing for plots that are outside the command area.
• Nearest water user group: For plots inside the command area, the water user group. For plots outside the command area, the water user group whose boundary is the shortest distance from the boundary of the plot. Calculated from plot map.
• Terraced: Indicator that the plot was terraced.
• Elevation: Elevation of plot in meters. Calculated from plot map.
• Slope: Maximum plot grade. Calculated from plot map.

Plot-season variables: All plot-season variables are constructed from the baseline when used in balance tables. Variables related to attrition are observed at plot-season level when used as outcomes in regressions testing for differential attrition.
• Own plot: Indicator that the surveyed cultivator owns the plot. 0 when the surveyed cultivator rents in the plot.
• Owned plot >5 years: Indicator that the surveyed cultivator had owned the plot for at least 5 years.
• Rented out to farmer: Indicator that the surveyed cultivator rented out the plot to another farmer.
• Rented out to commercial farmer: Indicator that the plot was rented out to a commercial farmer.
• HH attrition: Plot-season indicator that the household associated with the plot was not reached for the survey.
• Transaction (not tracked): Plot-season indicator that the plot was sold, rented out, or no longer rented in, and the new household responsible for the plot was not successfully followed up with.
• Tracked: Plot-season indicator that the plot was sold, rented out, or no longer rented in, and the new household responsible for the plot was successfully followed up with and asked questions on agricultural production on the plot.
• Missing: Plot-season indicator that agricultural production data is missing for that plot. Sum of variables HH attrition, Rented out to commercial farmer, and Transaction (not tracked).

Agricultural variables
• Cultivated: Plot-season indicator for any cultivation. All other agricultural variables are set to 0 when no cultivation takes place.
• Irrigated: Plot-season indicator for any irrigation use.
• Horticulture: Plot-season indicator for any horticulture cultivated. As horticultural crops are annuals, this will include activities associated with planting, growing, and harvesting.49
• Banana: Plot-season indicator for any bananas cultivated. As bananas are perennials, this refers to any activities associated with planting, growing, or harvesting, and need not include all three.
• HH labor/ha: Plot-season sum of household labor use, divided by plot area. Winsorized at the 99th percentile.
• Input expenditures/ha: Plot-season sum of expenditures on non-labor inputs, divided by plot area. Winsorized at the 99th percentile.
• Hired labor expenditures/ha: Plot-season sum of expenditures on hired labor, divided by plot area. Winsorized at the 99th percentile.
• Hired labor (days)/ha: Plot-season sum of hired labor use, divided by plot area. Winsorized at the 99th percentile.
• Price: Prices are calculated at the District-crop-season level, as the median of plot-crop-season reported sales divided by reported kilograms sold.
Prices are set to missing when there are fewer than 10 observations in that District-crop-season and either more than two District-crop-seasons with at least 10 observations in that District-crop-survey or at least 30 observations in that District-crop-survey; these cut-off points were chosen to maximize inclusion of prices judged subjectively to be reasonable, and maximize exclusion of prices judged subjectively to be not reasonable.
• Yield: Plot-season sum of prices times harvested quantities. Yields are missing when all crops cultivated that plot-season have missing prices or missing harvested quantities. When multiple crops are grown on a plot-season and some have observed prices and harvested quantities, those with missing prices or quantities are treated as 0 production. After this procedure 3.6% of rainy season observations and 5.3% of dry season observations in our discontinuity sample have missing yields. Winsorized at the 99th percentile.
• Sales/ha: Plot-season total reported sales, divided by area. Winsorized at the 99th percentile.
• Sales share: Sales/ha divided by yield, equal to 1 when reported sales/ha is greater than yield.
• Profits/ha (Shadow wage = 0 RwF/day): Yield minus hired labor expenditures/ha minus input expenditures/ha.
• Profits/ha (Shadow wage = 800 RwF/day): Yield minus hired labor expenditures/ha minus input expenditures/ha minus 800 times HH labor/ha.

49 In Table 1, an alternative definition of crop choice is used, where a crop indicator indicates that crop is the primary crop cultivated that plot-season.

Experimental variables: Additional details on these variables are in Appendix I.
• Assigned minikit: Indicator that household was assigned to receive a minikit.
• Minikit saturation: Saturation of minikits assigned for the Water User Group of the plot.
• Minikit takeup: Indicator that the household reported using a minikit.
• Zone: The Zone in which the plot’s Water User Group is located.
The plots in our survey are located in 239 Water User Groups grouped into 33 Zones.
• O&M treatment: O&M treatment status of the Water User Group of the plot.
• # of lotteries entered, minikits: Number of lotteries for minikits the household was entered into.

Appendix B Baseline results

We present results from 2014 Dry, when the hillside irrigation systems were online in only a small part of the sites, and from 2015 Rainy 1 and Rainy 2, when hillside irrigation was just beginning to come online. These surveys were just a few years after terracing occurred, and shortly after the construction of the hillside irrigation schemes was completed. To begin, we estimate specifications (1) and (2) in Tables A1 and A2.

First, in Table A1, we consider two additional impacts of command area construction. First, terracing occurred jointly with hillside irrigation. Although there was also meaningful terracing outside the command area to protect against erosion, there was much more terracing inside the command area, as it is impossible to have hillside irrigation without terracing (as water would run off the sloped hillsides). We therefore note that our effects are the combined effect of terracing and access to irrigation. However, we also note that irrigation is used almost exclusively for dry season horticulture, and our results in Section 3 are fully explained by crop fixed effects, providing suggestive evidence that the transition to dry season horticulture enabled by access to irrigation, as opposed to any direct productivity effects conditional on crop choice caused by terracing, drives our results. Second, rentals out to commercial farmers occurred inside the command area, as these commercial farmers were keen to take advantage of access to irrigation.
These commercial farmers were private businesses exporting vegetables, and they had negotiated land lease rates with the government; as such, they were not willing to share detailed data on their profitability. We discuss the implications of this differential attrition for our results in Appendix F.

In addition, plots inside the command area are discretely lower in elevation than plots outside the command area; this is mechanical, as the command area is below the canal. While controlling for distance to the command area boundary and restricting to plots within 50 meters of the command area boundary partially eliminate these differences, some difference remains. Consistent with this, in Table A1 we find that command area plots are 33 meters lower than plots outside the command area using Specification 1. However, by restricting comparisons only to plots that are very close to one another, using the spatial fixed effects specification we find that command area plots are 9 meters lower than plots outside the command area. As our results from both of these specifications are quite similar in Section 3, this should assuage concerns that elevation is an important omitted variable in our analysis. In Supplementary Appendix A, we also include elevation as a control, and find the patterns in the results that we describe in Section 3 are robust to its inclusion.

Moreover, while our primary agricultural outcomes for analysis are from recall over the past three completed agricultural seasons, our measure of food security comes from the past week of food consumption. Our baseline survey was conducted from August - October 2015, so most irrigating households would have just recently harvested and sold any 2015 Dry horticultural production. Consistent with this, in Table A1 we find significant impacts of the command area on food security at baseline.

Table A1: Terracing, baseline rentals to commercial farmer, elevation, and baseline food security in command area

                      Sample plot
                      Terraced      Rented out,   Elevation     Food security
                                    comm. farmer                index
                      (1)           (2)           (3)           (4)
RDD (Site FE, Specification 1)
CA                    0.407         0.173         -32.7         0.19
                      (0.055)       (0.031)       (5.6)         (0.10)
                      [0.000]       [0.000]       [0.000]       [0.053]
SFE (Spatial FE, Specification 2)
CA                    0.450         0.168         -8.6          0.15
                      (0.053)       (0.044)       (1.2)         (0.10)
                      [0.000]       [0.000]       [0.000]       [0.122]
# of observations     969           969           969           968
# of clusters         197           197           197           197
Control mean          0.484         0.018         1741.0        -0.13

Notes: Regression coefficients on a command area indicator for the sample plot (“CA”) are presented above. Specifications control for distance to the command area boundary, its interaction with CA, and log area of the sample plot. RDD specification includes site fixed effects, and SFE specification includes spatial fixed effects. Standard errors are in parentheses, and p-values in brackets.

Second, in Table A2, we estimate impacts on cultivation, irrigation, and crop choice decisions; consistent with irrigation not having come fully online, we observe limited adoption of irrigation. In contrast to our main results from follow up surveys, at baseline cultivation is lower in the dry season inside the command area. This is driven by a combination of low adoption of irrigation and horticulture (only 2 - 5pp higher in the command area than outside the command area), and lower cultivation of bananas (8 - 10pp lower). These banana effects are partially explained by terracing, during which bananas were torn up to construct the terraces. These banana effects are smaller than in follow up surveys, and the share of plots cultivated with bananas is also lower outside the command area than in follow up surveys.
Together, we interpret these results as farmers beginning to replant bananas following terracing, but with less replanting occurring inside the command area than outside. As irrigation had come online by 2015 Rainy 1 and 2, rainy season results look similar to rainy season results in subsequent seasons: modestly lower cultivation, significant but modest increases in adoption of irrigation and horticulture, and reduced banana cultivation.

Table A2: Access to irrigation in the command area is limited at baseline

(a) Dry season
                    Culti-   Irri-    Horti-   Banana   HH       Input    Hired    Yield    Sales    Profits/ha
                    vated    gated    culture           labor/   exp./ha  labor             /ha      Shadow wage
                                                        ha                exp./ha                    = 0      = 800
                    (1)      (2)      (3)      (4)      (5)      (6)      (7)      (8)      (9)      (10)     (11)
RDD (Site-by-season FE, Specification 1)
CA                  -0.128   0.029    0.019    -0.103   -26.9    1.6      0.7      -30.4    -26.2    -31.7    -16.4
                    (0.046)  (0.016)  (0.019)  (0.036)  (23.6)   (2.1)    (1.4)    (23.5)   (21.7)   (22.1)   (13.9)
                    [0.005]  [0.068]  [0.304]  [0.005]  [0.255]  [0.437]  [0.623]  [0.197]  [0.227]  [0.153]  [0.240]
SFE (Spatial FE, Specification 2)
CA                  -0.120   0.029    0.014    -0.077   -39.5    1.5      -0.1     -31.4    -37.2    -32.6    -7.9
                    (0.051)  (0.016)  (0.018)  (0.041)  (28.2)   (2.0)    (1.6)    (30.4)   (28.7)   (29.2)   (19.2)
                    [0.020]  [0.067]  [0.454]  [0.060]  [0.162]  [0.458]  [0.930]  [0.302]  [0.194]  [0.264]  [0.682]
# of observations   894      894      894      894      890      894      894      868      894      868      864
# of clusters       196      196      196      196      196      196      196      195      196      195      195
Control mean        0.211    0.009    0.012    0.145    41.3     1.9      0.8      46.5     27.1     45.0     13.4

(b) Rainy seasons
                    Culti-   Irri-    Horti-   Banana   HH       Input    Hired    Yield    Sales    Profits/ha
                    vated    gated    culture           labor/   exp./ha  labor             /ha      Shadow wage
                                                        ha                exp./ha                    = 0      = 800
                    (1)      (2)      (3)      (4)      (5)      (6)      (7)      (8)      (9)      (10)     (11)
RDD (Site-by-season FE, Specification 1)
CA                  -0.067   0.043    0.057    -0.104   -5.5     2.3      3.0      5.7      9.5      0.5      2.8
                    (0.038)  (0.011)  (0.022)  (0.037)  (23.5)   (3.4)    (4.2)    (22.8)   (13.8)   (23.2)   (24.0)
                    [0.076]  [0.000]  [0.008]  [0.005]  [0.815]  [0.492]  [0.480]  [0.804]  [0.491]  [0.984]  [0.906]
SFE (Spatial FE, Specification 2)
CA                  -0.048   0.041    0.064    -0.093   -7.3     4.4      3.9      -1.6     24.5     -9.6     -6.5
                    (0.042)  (0.015)  (0.029)  (0.038)  (34.4)   (3.9)    (6.0)    (29.0)   (17.9)   (28.9)   (35.0)
                    [0.261]  [0.006]  [0.029]  [0.015]  [0.831]  [0.265]  [0.518]  [0.957]  [0.170]  [0.739]  [0.853]
# of observations   1,632    1,632    1,632    1,632    1,621    1,632    1,632    1,585    1,632    1,585    1,575
# of clusters       192      192      192      192      192      192      192      192      192      192      192
Control mean        0.756    0.011    0.042    0.162    225.4    12.5     12.8     171.2    45.0     146.2    -30.0

Notes: Regression coefficients on a command area indicator for the sample plot (“CA”) are presented above. Specifications control for distance to the command area boundary, its interaction with CA, and log area of the sample plot. RDD specification includes site-by-season fixed effects, and SFE specification includes spatial fixed effects. Standard errors are in parentheses, and p-values in brackets.

Third, we also estimate impacts on inputs and output in Table A2. Consistent with the small increases in horticulture and modestly larger decreases in low input intensive bananas, we do not find consistent significant effects on input use, yields, sales, or measures of profits in the dry season or rainy season.

Lastly, as the command area, as of the baseline, had not yet caused a large increase in demand for labor or inputs, or caused large increases in agricultural production, we do not anticipate any most important plot effects. As a placebo check, we present most important plot results, estimating specifications (3) and (4), and specification (5) with heterogeneity. We present these results in Tables A3 and A4. In line with our prediction, we fail to find any consistent significant effects on most important plots, either in our main specifications or for heterogeneity.
Appendix C Robustness

To complement the analysis in Section 3, we estimate the impacts of access to irrigation using a specification that omits controls for distance to the boundary, its interaction with a command area indicator, and log plot area.

$y_{1ist} = \beta_1 \mathrm{CA}_{1is} + \alpha_{st} + \epsilon_{1ist}$    (A1)

Specification A1 compares sample plots inside the command area within 50 meters of the command area boundary to sample plots outside the command area within 50 meters of the command area boundary. Results are presented in Table A5 and are qualitatively similar to the results in Table 4.

Table A3: No effects of sample plot shock on most important plots at baseline

                      Culti-   Irri-    Horti-   Banana   HH       Input    Hired
                      vated    gated    culture           labor/   exp./ha  labor
                                                          ha                exp./ha
                      (1)      (2)      (3)      (4)      (5)      (6)      (7)
RDD (Site-by-season FE, Specification 3)
CA                    0.015    0.020    0.021    0.016    -2.4     3.8      -6.3
                      (0.058)  (0.016)  (0.016)  (0.050)  (26.2)   (1.9)    (5.4)
                      [0.800]  [0.196]  [0.195]  [0.752]  [0.927]  [0.039]  [0.240]
CA * MIP CA           0.046    -0.005   -0.009   0.051    -16.9    -0.6     -2.8
                      (0.062)  (0.030)  (0.030)  (0.050)  (27.4)   (3.2)    (4.7)
                      [0.461]  [0.869]  [0.773]  [0.311]  [0.538]  [0.859]  [0.554]
Joint F-stat          0.6      0.8      0.9      1.6      0.4      3.0      2.6
[p]                   [0.541]  [0.430]  [0.429]  [0.214]  [0.663]  [0.053]  [0.079]
Sample plot effect    -0.128   0.029    0.019    -0.103   -26.9    1.6      0.7
Average effect        0.034    0.018    0.017    0.037    -9.5     3.6      -7.5
SFE (Spatial FE, Specification 4)
CA                    0.039    0.009    0.022    0.056    -23.5    1.2      -10.0
                      (0.068)  (0.017)  (0.015)  (0.057)  (31.2)   (1.2)    (6.8)
                      [0.566]  [0.624]  [0.140]  [0.325]  [0.452]  [0.292]  [0.142]
CA * MIP CA           -0.046   -0.011   -0.018   -0.018   -31.0    -2.6     -3.6
                      (0.069)  (0.029)  (0.031)  (0.058)  (28.8)   (3.7)    (5.6)
                      [0.512]  [0.700]  [0.549]  [0.759]  [0.281]  [0.478]  [0.524]
Joint F-stat          0.2      0.2      1.1      0.6      1.7      0.6      2.8
[p]                   [0.779]  [0.854]  [0.337]  [0.572]  [0.177]  [0.573]  [0.059]
Sample plot effect    -0.120   0.029    0.014    -0.077   -39.5    1.5      -0.1
Average effect        0.018    0.004    0.014    0.048    -37.5    0.1      -11.6
# of observations     751      751      751      751      747      751      751
# of clusters         182      182      182      182      182      182      182
Control mean          0.186    0.030    0.027    0.129    40.6     1.4      5.1

Notes: Regression coefficients on a command area indicator for the sample plot (“CA”) and its interaction with a command area indicator for the most important plot (“CA * MIP CA”) are presented above. Specifications control for distance to the command area boundary, its interaction with CA, log area of the sample plot and the most important plot, and a command area indicator for the most important plot (“MIP CA”). RDD specification includes site-by-season fixed effects, and SFE specification includes spatial fixed effects. Standard errors are in parentheses, and p-values in brackets. “Sample plot effect” estimates are from Table A2, and “Average effect” is the coefficient on CA from a specification that omits CA * MIP CA.

Table A4: No heterogeneous effects of sample plot shock on most important plots at baseline

                      Culti-   Irri-    Horti-   Banana   HH       Input    Hired
                      vated    gated    culture           labor/   exp./ha  labor
                                                          ha                exp./ha
                      (1)      (2)      (3)      (4)      (5)      (6)      (7)
RDD (Site-by-season FE, Specification 5)
CA                    0.135    0.045    0.013    0.067    19.3     0.3      -9.8
                      (0.085)  (0.046)  (0.040)  (0.065)  (29.8)   (3.8)    (6.0)
                      [0.113]  [0.333]  [0.741]  [0.300]  [0.518]  [0.935]  [0.099]
CA * # of HH members  -0.021   -0.005   0.001    -0.007   -6.3     0.6      0.3
                      (0.016)  (0.008)  (0.007)  (0.013)  (5.5)    (0.8)    (0.9)
                      [0.185]  [0.498]  [0.940]  [0.611]  [0.255]  [0.426]  [0.750]
CA * Asset index      -0.003   0.008    -0.003   0.002    -10.8    -2.7     -6.4
                      (0.037)  (0.017)  (0.016)  (0.030)  (16.4)   (2.2)    (3.8)
                      [0.940]  [0.656]  [0.857]  [0.959]  [0.507]  [0.222]  [0.093]
Joint F-stat          1.2      0.4      0.3      0.5      1.2      2.5      1.5
[p]                   [0.306]  [0.736]  [0.810]  [0.658]  [0.311]  [0.057]  [0.224]
Average effect        0.034    0.018    0.017    0.037    -9.5     3.6      -7.5
SFE (Spatial FE, Specification 5)
CA                    0.079    0.013    -0.002   0.051    -20.7    -3.0     -12.4
                      (0.104)  (0.045)  (0.037)  (0.082)  (31.6)   (3.5)    (6.1)
                      [0.446]  [0.776]  [0.957]  [0.531]  [0.512]  [0.386]  [0.044]
CA * # of HH members  -0.013   -0.002   0.003    -0.000   -3.8     0.6      0.0
                      (0.019)  (0.008)  (0.008)  (0.015)  (6.1)    (0.6)    (0.9)
                      [0.507]  [0.811]  [0.687]  [0.988]  [0.532]  [0.325]  [0.977]
CA * Asset index      0.033    0.010    0.000    0.043    -11.9    -1.9     -6.8
                      (0.047)  (0.018)  (0.018)  (0.041)  (15.9)   (2.0)    (3.3)
                      [0.482]  [0.587]  [0.992]  [0.284]  [0.454]  [0.343]  [0.039]
Joint F-stat          0.2      0.1      0.3      0.7      0.7      0.7      1.9
[p]                   [0.867]  [0.933]  [0.852]  [0.527]  [0.541]  [0.575]  [0.133]
Average effect        0.018    0.004    0.014    0.048    -37.5    0.1      -11.6
# of observations     750      750      750      750      746      750      750
# of clusters         182      182      182      182      182      182      182
Control mean          0.186    0.030    0.027    0.129    40.6     1.4      5.1

Notes: Regression coefficients on a command area indicator for the sample plot (“CA”) and its interaction with W (“CA * W”) are presented above. Specifications control for distance to the command area boundary, its interaction with CA, log area of the sample plot and the most important plot, a command area indicator for the most important plot (“MIP CA”), and all W. RDD specification includes site-by-season fixed effects, and SFE specification includes spatial fixed effects. Standard errors are in parentheses, and p-values in brackets. “Average effect” estimates are from Table A3.
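Specification (A1) in Appendix C, which regresses a sample plot outcome on a command area indicator and site-by-season fixed effects only, can be sketched as follows. This is a minimal illustration rather than the estimation code behind Table A5: the function name `fe_slope` and the toy data are hypothetical, and the sketch relies on the standard result that, with one regressor and group fixed effects, OLS equals a regression of the group-demeaned outcome on the group-demeaned regressor. The clustered standard errors reported in the tables are not computed here.

```python
from collections import defaultdict


def fe_slope(y, ca, group):
    """OLS coefficient on `ca` in y = beta * ca + group fixed effect + error.

    Implemented by demeaning y and ca within each group (here, a
    site-by-season cell) and regressing one on the other.
    """
    # Accumulate per-group sums of y, sums of ca, and counts.
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for yi, ci, gi in zip(y, ca, group):
        cell = sums[gi]
        cell[0] += yi
        cell[1] += ci
        cell[2] += 1

    num = 0.0
    den = 0.0
    for yi, ci, gi in zip(y, ca, group):
        sum_y, sum_ca, n = sums[gi]
        dy = yi - sum_y / n   # outcome net of its group mean
        dc = ci - sum_ca / n  # indicator net of its group mean
        num += dc * dy
        den += dc * dc
    if den == 0.0:
        raise ValueError("no within-group variation in the command area indicator")
    return num / den


# Hypothetical toy data: two site-by-season cells with different levels
# and a true command area effect of 2.0 (no noise).
group = ["s1-dry", "s1-dry", "s1-dry", "s2-dry", "s2-dry", "s2-dry"]
ca = [1, 0, 0, 1, 1, 0]
cell_level = {"s1-dry": 10.0, "s2-dry": 50.0}
y = [cell_level[g] + 2.0 * c for g, c in zip(group, ca)]

beta_hat = fe_slope(y, ca, group)
print(round(beta_hat, 6))
```

Because the fixed effects absorb the level difference between the two cells, the estimate recovers the within-cell contrast between command area and non-command area plots, which is the comparison Specification A1 makes.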
Table A5: Estimated effects of access to irrigation are robust to omission of key controls

(a) Dry season
                    Culti-   Irri-    Horti-   Banana   HH       Input    Hired    Yield    Sales    Profits/ha
                    vated    gated    culture           labor/   exp./ha  labor             /ha      Shadow wage
                                                        ha                exp./ha                    = 0      = 800
                    (1)      (2)      (3)      (4)      (5)      (6)      (7)      (8)      (9)      (10)     (11)
Base (Specification A1)
CA                  0.033    0.202    0.180    -0.134   69.6     7.4      5.6      61.2     52.3     49.6     -0.3
                    (0.031)  (0.019)  (0.020)  (0.024)  (14.7)   (1.3)    (1.9)    (20.7)   (13.3)   (18.6)   (12.0)
                    [0.289]  [0.000]  [0.000]  [0.000]  [0.000]  [0.000]  [0.003]  [0.003]  [0.000]  [0.008]  [0.978]
# of observations   2,537    2,537    2,536    2,536    2,523    2,527    2,527    2,402    2,527    2,402    2,400
# of clusters       196      196      196      196      196      196      196      196      196      196      196
RDD estimate        0.005    0.162    0.137    -0.133   70.8     6.3      3.7      73.1     55.5     63.9     9.3
SFE estimate        0.022    0.171    0.156    -0.142   76.9     4.3      3.2      55.0     49.3     49.1     -3.0
Control mean        0.391    0.058    0.065    0.245    59.5     2.5      3.7      82.3     49.7     76.1     32.8

(b) Rainy seasons
                    Culti-   Irri-    Horti-   Banana   HH       Input    Hired    Yield    Sales    Profits/ha
                    vated    gated    culture           labor/   exp./ha  labor             /ha      Shadow wage
                                                        ha                exp./ha                    = 0      = 800
                    (1)      (2)      (3)      (4)      (5)      (6)      (7)      (8)      (9)      (10)     (11)
Base (Specification A1)
CA                  -0.054   0.044    0.044    -0.149   -7.7     2.5      7.1      -45.1    -4.8     -53.4    -47.2
                    (0.020)  (0.007)  (0.011)  (0.024)  (18.3)   (2.0)    (2.4)    (22.0)   (10.8)   (20.5)   (16.5)
                    [0.006]  [0.000]  [0.000]  [0.000]  [0.671]  [0.205]  [0.003]  [0.041]  [0.660]  [0.009]  [0.004]
# of observations   4,236    4,236    4,235    4,235    4,215    4,223    4,223    4,085    4,223    4,085    4,078
# of clusters       196      196      196      196      196      196      196      196      196      196      196
RDD estimate        -0.092   0.035    0.016    -0.158   8.5      1.1      3.7      -22.6    -13.3    -26.4    -31.8
SFE estimate        -0.053   0.059    0.048    -0.168   9.9      2.1      3.1      -15.4    5.6      -19.4    -27.3
Control mean        0.838    0.016    0.073    0.274    226.7    16.1     15.9     271.5    85.1     239.8    59.5

Notes: Regression coefficients on a command area indicator for the sample plot (“CA”) are presented above. Specifications control for site-by-season fixed effects. Standard errors are in parentheses, and p-values in brackets. “RDD estimate” and “SFE estimate” are from Table 4.

Next, we separate the impacts of sample plot access to irrigation, on both sample plots and most important plots, into intensive and extensive margin effects. To do so, we compare estimated coefficients in Tables 4 and 8 to the same specifications with the inclusion of controls for cultivation, horticulture, and bananas. These results are presented in Table A6. These controls for crop choice almost fully explain the effects on sample plots, but only partially explain the effects on most important plots.

Table A6: Impacts of access to irrigation are mostly explained by transition to horticulture from bananas, but impacts of sample plot shock on most important plot are on both extensive and intensive margins

                        Sample plot                                                    Most important plot
                        HH       Input    Hired    Yield    Sales    Profits/ha        HH       Input    Hired
                        labor/   exp./ha  labor             /ha      Shadow wage       labor/   exp./ha  labor
                        ha                exp./ha                    = 0      = 800    ha                exp./ha
                        (1)      (2)      (3)      (4)      (5)      (6)      (7)      (8)      (9)      (10)
RDD (Site-by-season FE, Specifications 1 & 3) with Crop FE
CA                      30.0     1.2      0.8      32.0     23.8     31.4     12.6     -25.0    -4.6     -1.4
                        (13.2)   (1.2)    (2.0)    (18.0)   (11.4)   (16.7)   (15.7)   (14.1)   (2.1)    (1.9)
                        [0.023]  [0.330]  [0.681]  [0.075]  [0.037]  [0.060]  [0.422]  [0.077]  [0.032]  [0.482]
Effect without Crop FE  70.8     6.3      3.7      73.1     55.5     63.9     9.3      -32.2    -6.0     -1.8
SFE (Spatial FE, Specifications 2 & 4) with Crop FE
CA                      29.0     -1.5     0.3      13.8     10.9     16.4     1.3      -21.4    -4.9     0.2
                        (14.8)   (1.4)    (2.5)    (23.9)   (14.9)   (22.2)   (20.0)   (16.8)   (2.2)    (2.1)
                        [0.051]  [0.302]  [0.894]  [0.565]  [0.466]  [0.460]  [0.949]  [0.203]  [0.023]  [0.915]
Effect without Crop FE  76.9     4.3      3.2      55.0     49.3     49.1     -3.0     -33.2    -6.7     -0.5
# of observations       2,522    2,526    2,526    2,402    2,526    2,402    2,400    2,166    2,169    2,169
# of clusters           196      196      196      196      196      196      196      182      182      182

Notes: Regression coefficients on a command area indicator for the sample plot (“CA”) are presented above. Specifications with sample plot outcomes control for distance to the command area boundary, its interaction with CA, cultivation, horticulture, bananas, and log area of the sample plot. Specifications with most important plot outcomes also include controls for log area of the most important plot and a command area indicator for the most important plot. RDD specification includes site-by-season fixed effects, and SFE specification includes spatial fixed effects. Standard errors are in parentheses, and p-values in brackets. “Effect without Crop FE” estimates are from Table 4 for sample plot outcomes and Table 8 for most important plot outcomes.

Appendix D Household results

We present results of the impacts of access to irrigation on household welfare outcomes in Table A7. We estimate specifications similar to Equations (1) and (2), but now use annual outcomes at the household level (instead of outcomes on sample plots).

Table A7: Household welfare

                      Housing        Asset     Food        Overall
                      expenditures   index     security    index
                                               index
                      (1)            (2)       (3)         (4)
RDD (Site-by-survey FE, Specification 1)
CA                    12.10          0.13      0.07        0.12
                      (6.73)         (0.11)    (0.08)      (0.07)
                      [0.072]        [0.224]   [0.372]     [0.077]
SFE (Spatial FE, Specification 2)
CA                    13.91          0.05      0.07        0.11
                      (8.25)         (0.12)    (0.10)      (0.08)
                      [0.092]        [0.668]   [0.509]     [0.191]
# of observations     2,771          2,776     2,772       2,764
# of clusters         196            196       196         196
Control mean          28.03          -0.14     -0.12       -0.08

Notes: Regression coefficients on a command area indicator for the sample plot (“CA”) are presented above. Specifications control for distance to the command area boundary, its interaction with CA, and log area of the sample plot. RDD specification includes site-by-survey fixed effects, and SFE specification includes spatial fixed effects. Standard errors are in parentheses, and p-values in brackets.

We find suggestive evidence of positive impacts on household welfare.
All point estimates are positive, and impacts on housing expenditures and an Anderson (2008) index of household welfare are each significantly different from zero in two specifications. The implied treatment on the treated estimates are large. However, as impacts on households are imprecisely estimated, we interpret these results with caution.

Appendix E Prices and wages

We present figures showing the evolution of wages (Figure A1) and sale prices (Figure A2) across the 3 hillside irrigation schemes. In Figure A1, average wages do not appear to change after the hillside irrigation schemes became fully operational.50 In Figure A2, median sale prices appear to display more meaningful trends. In Karongi, there do not appear to be any trends in sale prices of horticultural crops. However, in Nyanza, sale prices of both tomatoes and eggplants appear lower after the hillside irrigation schemes became fully operational than before. We discuss the interpretation of these changes, if one believes they are causal, in Section 3.2.

50 Median wages (not presented here) remain constant within both of the sites used for the regression discontinuity analysis, and are slightly higher in the third site after the hillside irrigation schemes became fully operational.

Figure A1: Wages
Notes: Average wages by season across the three hillside irrigation schemes are presented in this figure. Average wages are calculated across household-by-plot-by-season observations within site-by-season and are weighted by person days of hired labor.
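The weighting described in the notes to Figure A1 can be sketched as follows. This is a minimal illustration under assumed inputs: the function name, the tuple layout, the season label, and the wage and person-day numbers are all hypothetical and are not taken from the survey data.

```python
from collections import defaultdict


def weighted_mean_wage(rows):
    """Average wage within each site-by-season cell, weighted by person days.

    `rows` is an iterable of (site, season, daily_wage, person_days) tuples,
    one per household-by-plot-by-season observation with hired labor.
    """
    totals = defaultdict(lambda: [0.0, 0.0])  # [sum of wage * days, sum of days]
    for site, season, wage, days in rows:
        if days <= 0:
            continue  # observations without hired labor carry no weight
        cell = totals[(site, season)]
        cell[0] += wage * days
        cell[1] += days
    return {key: wage_days / days for key, (wage_days, days) in totals.items()}


# Hypothetical observations (wages in RwF/day; values are illustrative only).
rows = [
    ("Karongi 12", "2016 Dry", 700.0, 10.0),
    ("Karongi 12", "2016 Dry", 1000.0, 30.0),
    ("Nyanza 23", "2016 Dry", 800.0, 5.0),
]
avg = weighted_mean_wage(rows)
print(avg[("Karongi 12", "2016 Dry")])
```

Weighting by person days means that a plot hiring 30 days of labor counts three times as much as a plot hiring 10 days, so the average reflects the wage paid for a typical day of hired work rather than the wage on a typical plot.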
Appendix F Attrition

We present results on attrition for our sample plot regressions for specifications (1) and (2) in Table A8; we do not find significant differential attrition on the most important plot.51 Additionally, we break attrition down into three causes: household attrition (typically caused by the household having moved), transactions to other local farmers where we failed to track the plot across the transaction, and rentals out to commercial farmers.

We find significant differential attrition, but this differential attrition is driven almost entirely by rentals out to commercial farmers in one of the two sites. These were private businesses exporting vegetables, and they had negotiated land lease rates with the government; as such, they were not willing to share detailed data on their profitability. Because they were producing chillies and stevia for export, land rented out to commercial farmers is likely to have much higher production and to be farmed more intensively, and therefore not having it in our data biases our main estimates downwards. Additionally, the commercial farmers preferred to rent land in the most productive areas of the sites, and therefore our estimates are if anything biased downward relative to the effect of access to irrigation on production for local farmers.

51 Results on attrition on the most important plot are available upon request.

Figure A2: Prices
(a) Karongi (b) Nyanza
Notes: Median sale prices by season are presented in this figure. Prices are calculated separately for Karongi district (Karongi 12 and Karongi 13) and for Nyanza district (Nyanza 23). For each district, prices are calculated for the most commonly sold banana crop, the two most commonly sold staple crops, and the two most commonly sold horticultural crops.

Some discussion of the two other sources of attrition is potentially warranted.
First, excluding rentals out to commercial farmers, attrition is low, at 4.8% outside the command area, and is a statistically insignificant 0.9 - 3.5pp higher inside the command area. However, in one specification we do find 3.2pp higher household attrition, statistically significant at the 10% level. Lastly, tracking plots was important to correct for differential attrition: although command area plots were not differentially likely to be transacted to other farmers and not tracked, they were significantly more likely to be transacted to other farmers and tracked during the dry season (1.8 - 3.5pp).

Table A8: Sample plots

                            Dry season                                Rainy seasons
                            Dep. var.  Coef. (SE) [p]                 Dep. var.  Coef. (SE) [p]
                            (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)
Tracked                     0.032      0.018      0.023      0.035      0.047      0.011      0.019      0.036
                            (0.177)    (0.010)    (0.014)    (0.019)    (0.211)    (0.011)    (0.016)    (0.023)
                            2,907      [0.056]    [0.083]    [0.069]    4,845      [0.306]    [0.224]    [0.114]
Missing                     0.060      0.111      0.127      0.103      0.064      0.102      0.121      0.094
                            (0.238)    (0.020)    (0.025)    (0.028)    (0.244)    (0.020)    (0.026)    (0.028)
                            2,907      [0.000]    [0.000]    [0.000]    4,845      [0.000]    [0.000]    [0.001]
Reason data is missing
HH attrition                0.038      0.007      0.032      0.034      0.039      0.007      0.032      0.035
                            (0.192)    (0.014)    (0.019)    (0.022)    (0.194)    (0.014)    (0.019)    (0.022)
                            2,907      [0.590]    [0.096]    [0.129]    4,845      [0.601]    [0.096]    [0.121]
Rented out comm. farmer     0.012      0.102      0.092      0.069      0.011      0.099      0.089      0.064
                            (0.108)    (0.017)    (0.019)    (0.015)    (0.105)    (0.016)    (0.019)    (0.015)
                            2,907      [0.000]    [0.000]    [0.000]    4,845      [0.000]    [0.000]    [0.000]
Transaction (not tracked)   0.010      0.002      0.003      0.001      0.014      -0.004     0.000      -0.005
                            (0.099)    (0.005)    (0.005)    (0.007)    (0.116)    (0.005)    (0.006)    (0.008)
                            2,907      [0.681]    [0.539]    [0.921]    4,845      [0.465]    [0.945]    [0.542]
Site-by-season FE                      X          X                                X          X
Distance to boundary                              X          X                                X          X
log area                                          X          X                                X          X
Spatial FE                                                   X                                           X

Notes: Regression coefficients on a command area indicator for the sample plot (“CA”) are presented above.
Specifications control for distance to the command area boundary, its interaction with CA, and log area of the sample plot. RDD specification includes site-by-season fixed effects, and SFE specification includes spatial fixed effects. Standard errors are in parentheses, and p-values in brackets.

Appendix G Testing for binding constraint

Appendix G.1 Model

Households have 2 plots, indexed by $k$: $k = 1$ indicates the sample plot, while $k = 2$ indicates the most important plot. On each plot $k$, they have access to a simple production technology $\sigma A_k F_k(M_k, L_k)$, where $A_k$ is plot productivity, $M_k$ is the inputs applied to plot $k$, and $L_k$ is the household labor applied to plot $k$. The common production shock $\sigma$ is a random variable such that $\sigma \sim \Psi(\sigma)$, $E[\sigma] = 1$.52 While this specification assumes a single production function on each plot, in Supplementary Appendix B we demonstrate that we can interpret $F_k(M_k, L_k)$ as the envelope of production functions from cultivating different fractions of bananas and horticulture in the dry season; thus we will think of cultivating bananas as optimizing at a low input intensity. Utilizing subscripts to indicate partial derivatives and subsuming arguments, we assume $F_{kM} > 0$, $F_{kL} > 0$, $F_{kML} > 0$, $F_{kMM} < 0$, $F_{kLL} < 0$.53 Households have a budget of $\bar{M}$ which, if not utilized for inputs, can be invested in a risk-free asset which appreciates at rate $r$. In this context, households maximize expected utility over consumption $c$ and leisure $l$, considering their budget constraint and a labor constraint $\bar{L}$ which is allocated to labor on each plot, leisure, and up to $\bar{L}^O$ units of off farm labor $L^O$.54 Finally, we model irrigation access as an increase in $A_1$.

52 While we refer to $\sigma$ as a production shock, this incorporates general uncertainty in the value of production which includes joint price and production risk.
As we consider the role of each different constraint, we develop the necessary assumptions to produce the results from Section 3: that this increase in $A_1$ generates an increase in demand for inputs and labor on plot 1.

Households maximize expected utility

$\max_{M_1, M_2, L_1, L_2, l, L^O} E[u(c, l)]$

subject to the constraints enumerated above

$\sigma A_1 F_1(M_1, L_1) + \sigma A_2 F_2(M_2, L_2) + w L^O + r(\bar{M} - M_1 - M_2) = c$
$M_1 + M_2 \leq \bar{M}$
$L_1 + L_2 + l + L^O = \bar{L}$
$L^O \leq \bar{L}^O$

53 Among these, $F_{kML} > 0$ is the most controversial. Existing evidence on $F_{kML}$ in developing country agriculture is mixed (see Heisey & Norton (2007) for discussion). In our context, we expect $F_{kML} > 0$ primarily because $F_k(\cdot, \cdot)$ encompasses the transition from bananas to horticulture, which should be associated with increased labor and input demands according to Stylized Fact 2.

54 We follow Benjamin (1992) in modeling incomplete labor markets as driven by an off farm labor constraint. As in Benjamin (1992), we do so to match the observation that rural wages appear to be higher than the productivity of on-farm labor. However, for the predictions that follow it is sufficient that the household farm face an upward sloping residual labor supply. This holds if households face a downward sloping labor demand curve (implied by Benjamin (1992); alternatively, Breza et al. (2018) demonstrate the existence of norms driven wage floors), or if households incur convex costs from working off farm (due to distaste from working for others). Alternatively, the market failure may only apply to a particular task, such as managerial labor.

After substituting in the constraints which bind with equality, we derive the following first order conditions55

$(M_k) \quad \left(1 + \frac{\mathrm{cov}(\sigma, u_c)}{E[u_c]}\right) A_k F_{kM} = (1 + \lambda_M) r$    (A2)
$(L_k) \quad \left(1 + \frac{\mathrm{cov}(\sigma, u_c)}{E[u_c]}\right) A_k F_{kL} = (1 - \lambda_L) w$    (A3)
$(l) \quad \frac{E[u_l]}{E[u_c]} = (1 - \lambda_L) w$    (A4)

Intuitively, the first order conditions for inputs and labor include three parts.
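As a sketch of where conditions (A2) to (A4) come from (the full derivation is in Supplementary Appendix B), one can attach multipliers to the two inequality constraints and substitute the budget and time constraints into utility. The multiplier symbols $\mu_M$ and $\mu_L$, the bars marking the endowments and the off farm labor cap, and the normalizations of $\lambda_M$ and $\lambda_L$ below are assumptions made here for illustration, chosen so that the result matches (A2) to (A4).

```latex
\begin{align*}
\mathcal{L} &= E[u(c, l)] + \mu_M (\bar{M} - M_1 - M_2) + \mu_L (\bar{L}^O - L^O), \\
c &= \sigma A_1 F_1(M_1, L_1) + \sigma A_2 F_2(M_2, L_2) + w L^O + r(\bar{M} - M_1 - M_2), \\
l &= \bar{L} - L_1 - L_2 - L^O. \\[4pt]
(M_k):\quad & E[\sigma u_c]\, A_k F_{kM} = r\, E[u_c] + \mu_M, \\
(L_k):\quad & E[\sigma u_c]\, A_k F_{kL} = E[u_l], \\
(L^O):\quad & w\, E[u_c] = E[u_l] + \mu_L. \\[4pt]
\text{Since } E[\sigma] = 1:\quad & E[\sigma u_c] = E[u_c] + \mathrm{cov}(\sigma, u_c), \\
\text{and with}\quad & \lambda_M \equiv \frac{\mu_M}{r\, E[u_c]}, \qquad \lambda_L \equiv \frac{\mu_L}{w\, E[u_c]}, \\
(M_k) \Rightarrow\quad & \left(1 + \frac{\mathrm{cov}(\sigma, u_c)}{E[u_c]}\right) A_k F_{kM} = (1 + \lambda_M)\, r, \\
(L^O) \Rightarrow\quad & \frac{E[u_l]}{E[u_c]} = (1 - \lambda_L)\, w, \\
(L_k),\ (L^O) \Rightarrow\quad & \left(1 + \frac{\mathrm{cov}(\sigma, u_c)}{E[u_c]}\right) A_k F_{kL} = (1 - \lambda_L)\, w.
\end{align*}
```

Under this normalization, $\lambda_M$ and $\lambda_L$ are the shadow values of the input and off farm labor constraints expressed as proportional wedges on $r$ and $w$, and both are zero when the corresponding constraint is slack.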
First, each contains the marginal product of the factor, $A_k F_{kM}$ and $A_k F_{kL}$ respectively, on the left hand side, and the market price of the factor, $r$ and $w$ respectively, on the right hand side. The second piece, $1 + \frac{\mathrm{cov}(\sigma, u_c)}{E[u_c]}$, is the ratio of the marginal utility from agricultural production to the marginal utility from certain consumption. This ratio scales down the marginal product of the factor. It is less than 1 because agricultural production is uncertain, and higher in periods in which marginal utility is lower, so $\mathrm{cov}(\sigma, u_c) < 0$. With perfect insurance, $\mathrm{cov}(\sigma, u_c) = 0$, and this piece disappears. Without it, however, farmers will underinvest in both inputs and labor relative to the perfect insurance optimum.56 Third, there are the Lagrange multipliers associated with the input constraint $\lambda_M$ and with the labor constraint $\lambda_L$, which scale the associated factor prices up and down, respectively. When these constraints do not bind, and with perfect insurance, we have the familiar result that marginal products equal marginal prices. However, if any of these constraints bind, then separation fails: farmer characteristics will cause variation in $\lambda_L$, $\lambda_M$, or $\mathrm{cov}(\sigma, u_c)$ and in turn inefficient input allocations.

55 The derivation is in Supplementary Appendix B.

56 This result does not generically hold in models of agricultural households, as when consumption is separately modeled, households that are net buyers of an agricultural good may overinvest in inputs and labor relative to the perfect insurance optimum (Barrett, 1996). This is unlikely to be first order in our context, as we sampled cultivators and our results are driven by production of commercial crops.

Appendix G.2 A test for separation failures

Proposition 1 Showing this result is straightforward: with perfect markets for inputs, labor, and insurance, $\lambda_M = 0$, $\lambda_L = 0$, and $\frac{\mathrm{cov}(\sigma, u_c)}{E[u_c]} = 0$, respectively. The first order conditions then simplify to

$(M_k) \quad A_k F_{kM} = r$
$(L_k) \quad A_k F_{kL} = w$
$(l) \quad \frac{E[u_l]}{E[u_c]} = w$

The household’s labor and input allocations on plot 2 depend only on plot 2 productivity $A_2$, the price of inputs $r$, and the wage $w$, and not on access to irrigation on plot 1 ($A_1$).

Proposition 2 [footnote 57] The logic case-by-case is as follows. First, if input constraints bind, then the increase in inputs on the sample plot caused by access to irrigation must be associated with a reduction in inputs on the most important plot. As inputs and labor are complements, this causes labor allocations on the most important plot to fall as well. Second, if labor constraints bind, then the increase in labor on the sample plot caused by access to irrigation must be associated with a reduction in the sum of leisure and labor on the most important plot. Under standard restrictions on the household’s on farm labor supply, this must be associated with a reduction in labor on the most important plot.58 As inputs and labor are complements, this causes input allocations on the most important plot to fall as well. Third, absent insurance, the increase in agricultural production caused by access to irrigation reduces the marginal utility from agricultural production relative to the marginal utility from consumption.59 In turn, this causes labor and input allocations to the most important plot to fall.

57 See proof in Supplementary Appendix B.

58 Specifically, we assume that leisure demand is increasing in consumption; this assumption is not necessary but is sufficient.

59 This does not generically hold; however, restrictions on the distribution of $\sigma$ are sufficient to imply that marginal utility from agricultural production relative to the marginal utility from consumption is falling in agricultural production. Details are in Supplementary Appendix B.

Appendix G.3 Separating constraints

To shed light on which other constraints generate separation failures, we leverage the fact that our model offers predictions about how households with different characteristics should differentially respond to the sample plot shock. Roughly speaking, depending on which constraint binds, changes in different household characteristics may slacken or tighten the binding constraint. We focus on two important household characteristics in our model: we use household size to shift $\bar{L}$, the household’s total available labor, and wealth to shift $\bar{M}$, the household’s exogenous income available for input expenditures. We present these predictions below.

Proposition 3 [footnote 60] Under insurance constraints, both wealth and household size enter the model symmetrically by increasing consumption; therefore, in all cases, wealthier and larger households will respond similarly to the sample plot shock. If we additionally assume that risk aversion is decreasing sufficiently quickly in consumption, then the allocations of wealthier and larger households will be closer to those maximizing expected profits, and therefore allocations on the most important plot will be less responsive to the sample plot shock.

Under input constraints, wealthier households are less likely to see the constraint bind. As the allocations on the most important plot of unconstrained households do not respond to the sample plot shock, wealthier households should be less responsive. Now, note that in this model, households cannot use labor income to purchase additional inputs; we could interpret this as consistent either with households receiving labor income after inputs are purchased, or with the constraint as a constraint to availability of inputs.
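The input-constraint case can be illustrated numerically under strong simplifying assumptions that are not in the paper: risk-neutral households (so the covariance term is zero), Cobb-Douglas production $F(M, L) = M^a L^b$ with $a + b < 1$ on both plots, and a frictionless labor market at wage $w$. The function name and all parameter values are hypothetical. Under these assumptions, unconstrained input demand on each plot depends only on that plot's productivity, and when the input budget binds, inputs are split across plots in proportion to $A_k^{1/(1-a-b)}$.

```python
def allocations(A1, A2, M_bar, a=0.3, b=0.4, r=1.0, w=1.0):
    """Return (M1, M2, L1, L2) for a risk-neutral household.

    Assumes Cobb-Douglas output A_k * M^a * L^b on each plot, labor hired
    or supplied freely at wage w, and inputs capped by M1 + M2 <= M_bar.
    """
    e = 1.0 / (1.0 - a - b)

    def unconstrained_inputs(A):
        # Solves a*A*M^(a-1)*L^b = r jointly with b*A*M^a*L^(b-1) = w.
        return (A * (a / r) ** (1.0 - b) * (b / w) ** b) ** e

    M1, M2 = unconstrained_inputs(A1), unconstrained_inputs(A2)
    if M1 + M2 > M_bar:
        # Budget binds: both plots face the same shadow input price,
        # so inputs are shared in proportion to A_k^(1/(1-a-b)).
        s1, s2 = A1 ** e, A2 ** e
        M1 = M_bar * s1 / (s1 + s2)
        M2 = M_bar - M1

    def labor(A, M):
        # Labor first order condition given inputs: b*A*M^a*L^(b-1) = w.
        return (b * A * M ** a / w) ** (1.0 / (1.0 - b))

    return M1, M2, labor(A1, M1), labor(A2, M2)


# Irrigation access is modeled as a rise in A1 from 1.0 to 1.5.
rich_before = allocations(1.0, 1.0, M_bar=1.0)
rich_after = allocations(1.5, 1.0, M_bar=1.0)
poor_before = allocations(1.0, 1.0, M_bar=0.04)
poor_after = allocations(1.5, 1.0, M_bar=0.04)

print("unconstrained M2:", rich_before[1], "->", rich_after[1])
print("constrained   M2:", poor_before[1], "->", poor_after[1])
```

With the larger budget the constraint never binds and plot 2 allocations are unchanged, as in Proposition 1. With the smaller budget, inputs and (through complementarity) labor on plot 2 both fall when $A_1$ rises, which is the pattern the text attributes to input-constrained, less wealthy households.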
In a more general model with borrowing, they may be able to; in that case, both wealthier households and larger households are less likely to see the constraint bind, and therefore will both be less responsive to the sample plot shock on their most important plots.[61]

[60] See proof in Supplementary Appendix B.
[61] If all households are input constrained, then the effect of the sample plot shock on input allocations on the most important plot depends on characteristics of the production function. Note that in this case, larger households will still exhibit a response in the same direction as wealthier households as both effects enter only through the wealth channel.

Proposition 4.[62] When labor constraints bind, the household responds to the sample plot shock by allocating additional labor to the sample plot, but it may withdraw that labor from either the most important plot or from leisure. Whether wealthier or larger households withdraw relatively more labor from the most important plot depends on the higher order derivatives of the utility and production functions; in general, these differential responses cannot be signed.[63] Additionally, one key difference from the insurance case and input case is that household size and wealth no longer enter the model symmetrically. In one sense, household size and wealth instead enter the model as opposing forces: wealthier households allocate less labor to their plots, as they value leisure relatively more than consumption, while larger households allocate more labor to their plots.

[62] See proof in Supplementary Appendix B.

We focus on one particular case that builds on this intuition, presented in Figure A3. When on farm labor supply exhibits sufficient curvature, then changes in responsiveness to the sample plot shock of allocations on the most important plot are dominated by changes in the elasticity of on farm labor supply.
Suppose this to be the case, and further suppose that the elasticity of on farm labor supply is decreasing in the shadow wage. As we can think of household size as shifting out on farm labor supply (by increasing $\bar{L}$), and wealth as shifting in on farm labor supply (by increasing the marginal utility of leisure relative to the marginal utility of consumption), larger households are located on a more elastic portion of their on farm labor supply schedule, while wealthier households are located on a less elastic portion of their on farm labor supply schedule. As a result, larger households will be less responsive to the sample plot shock, as they will primarily draw labor on the sample plot from leisure, while wealthier households will be more responsive to the sample plot shock, as they will primarily draw labor on the sample plot from the most important plot.

[63] Of course, the potential for ambiguous responses is heightened further if other forms of labor constraints, for example on hiring labor, are also considered.

Figure A3: Differential responses to sample plot shock under labor constraints

[Figure: vertical axis, shadow wage; horizontal axis, on-farm labor. Curves shown: $L_2$; two on farm labor demand schedules $L_1 + L_2$; on farm labor supply $\bar{L}^{SM} - \bar{L}^O - l$ and $\bar{L}^{BIG} - \bar{L}^O - l$; $dL_2/dA_1$ is marked twice.]

Notes: Households' labor allocations under a binding off farm labor constraint are presented in this figure. $L_k$ and $l$ are the household's labor allocation on plot $k$ and choice of leisure, respectively, as a function of the shadow wage, with the argument suppressed. $L_1 + L_2$ is total household on farm labor demand; if the household's sample plot ($k = 1$) is in the command area ("sample plot shock"), on farm labor demand shifts out to the outer $L_1 + L_2$ schedule. $\bar{L}^{SM} - \bar{L}^O - l$ is household on farm labor supply; for large households, on farm labor supply is shifted out to $\bar{L}^{BIG} - \bar{L}^O - l$. The shadow wage is determined by the intersection of on farm labor demand and on farm labor supply, and labor allocations on the most important plot are $L_2$ evaluated at this shadow wage. In this figure, larger households are on a more elastic portion of their on farm labor supply schedule; as a result, the sample plot shock causes a smaller increase in the shadow wage, and in turn a smaller decrease in labor allocations on the most important plot (smaller in magnitude $dL_2/dA_1$).

Appendix H Household size, wealth, and agricultural production decisions

In Section 5, we assume that household size and the asset index act as shifters of the household's availability of labor and ability to purchase inputs, respectively. Alternatively phrased, they decrease the household's shadow wage and shadow price of inputs, respectively. We should therefore expect larger households to use more household labor and less hired labor. We should also expect wealthier households to use more inputs and more hired labor.

We test this assumption by correlating household size and the asset index with household labor use, input expenditures, and hired labor expenditures. Specifically, we estimate

$$y_{pist} = \beta_1\,\text{\# of HH members}_i + \beta_2\,\text{Asset index}_i + X_{pis}\gamma + \alpha_{st} + \epsilon_{pist} \tag{A5}$$

where $y$ is either household labor use, input expenditures, or hired labor expenditures, on plot $p$ of household $i$ in site $s$ in season $t$, and $\alpha_{st}$ is a site-by-season fixed effect. We vary the set of controls and fixed effects across specifications to test the robustness of these correlations to omitted household characteristics.

We present estimates of specification A5 in Table A9. Consistent with our predictions, we find that larger households use more household labor and less hired labor, while wealthier households use more inputs and more hired labor.
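As a rough illustration of how a specification like A5 absorbs the site-by-season fixed effect, the sketch below demeans the outcome and the two regressors of interest within each site-by-season cell and then solves the two-regressor normal equations. It is a minimal sketch under stated assumptions, not the estimation code behind Table A9: the function names and the toy data are hypothetical, the controls $X_{pis}$ are omitted, and no standard errors are computed.

```python
from collections import defaultdict


def demean_within(values, groups):
    """Subtract each group's mean (the within transformation for a fixed effect)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def fe_ols_two_regressors(y, x1, x2, groups):
    """OLS of y on x1 and x2 with group fixed effects (hypothetical helper)."""
    yd, x1d, x2d = (demean_within(v, groups) for v in (y, x1, x2))
    s11 = sum(a * a for a in x1d)
    s22 = sum(b * b for b in x2d)
    s12 = sum(a * b for a, b in zip(x1d, x2d))
    s1y = sum(a * c for a, c in zip(x1d, yd))
    s2y = sum(b * c for b, c in zip(x2d, yd))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("regressors are collinear within groups")
    # Cramer's rule on the 2x2 normal equations.
    beta1 = (s1y * s22 - s2y * s12) / det
    beta2 = (s2y * s11 - s1y * s12) / det
    return beta1, beta2


# Hypothetical toy data: the outcome is built with known coefficients (2 and 3)
# plus a site-by-season effect, so the estimator should recover them.
hh_size = [3, 5, 4, 6, 2, 7]
assets = [0.5, -1.0, 1.5, 0.0, 2.0, -0.5]
site_season = ["A1", "A1", "A1", "B1", "B1", "B1"]
cell_effect = {"A1": 10.0, "B1": -4.0}
y = [2.0 * h + 3.0 * a + cell_effect[g]
     for h, a, g in zip(hh_size, assets, site_season)]

b1, b2 = fe_ols_two_regressors(y, hh_size, assets, site_season)
print(round(b1, 6), round(b2, 6))
```

Because household size and the asset index are correlated, dropping either regressor from such a regression changes the coefficient on the other, which is the omitted variable concern that motivates including both.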
Table A9: Household size and wealth shift agricultural production decisions in a manner consistent with them shifting the shadow wage and shadow price of inputs

Outcomes: HH labor/ha, columns (1)-(4); Input exp./ha, columns (5)-(8); Hired labor exp./ha, columns (9)-(12)

# of HH members: 7.2 5.0 4.8 0.4 -0.0 0.0 0.6 -1.1 -0.9
  (1.2) (1.7) (1.5) (0.2) (0.3) (0.2) (0.3) (0.4) (0.4)
  [0.000] [0.003] [0.001] [0.013] [0.899] [0.969] [0.065] [0.010] [0.021]
Asset index: 9.4 4.0 1.6 2.4 1.8 1.4 9.0 9.1 8.6
  (2.6) (3.0) (2.6) (0.4) (0.5) (0.4) (0.7) (0.8) (0.7)
  [0.000] [0.190] [0.530] [0.000] [0.000] [0.001] [0.000] [0.000] [0.000]
log area: -114.0 -113.7 -114.2 -116.3 -5.2 -5.4 -5.3 -5.9 -3.8 -4.8 -4.6 -5.3
  (3.8) (3.8) (3.8) (3.4) (0.5) (0.5) (0.5) (0.4) (0.5) (0.5) (0.5) (0.5)
  [0.000] [0.000] [0.000] [0.000] [0.000] [0.000] [0.000] [0.000] [0.000] [0.000] [0.000] [0.000]
# of HH members (15-64): 2.1 2.2 -0.4 -0.5 -0.5 -0.8
  (2.3) (2.0) (0.4) (0.3) (0.7) (0.6)
  [0.359] [0.280] [0.280] [0.088] [0.470] [0.177]
HHH female: -3.8 1.9 -3.4 -2.3 0.8 1.1
  (6.1) (5.5) (0.9) (0.8) (1.3) (1.3)
  [0.534] [0.723] [0.000] [0.003] [0.556] [0.414]
# of plots: -1.1 -0.0 -0.0 0.2 0.1 0.2
  (0.9) (0.8) (0.2) (0.1) (0.2) (0.2)
  [0.207] [0.971] [0.986] [0.214] [0.627] [0.322]
Site-by-season FE: X X X X X X X X X X X X
Site-by-season-by-crop FE: X X X
# of observations: 28,750 28,717 28,578 28,576 28,823 28,790 28,651 28,649 28,823 28,790 28,651 28,649
# of clusters: 1,637 1,635 1,628 1,628 1,637 1,635 1,628 1,628 1,637 1,635 1,628 1,628

These correlations change somewhat when household size and the asset index are not both included, as household size and the asset index are correlated. This highlights the importance of controlling for the asset index and household size when estimating coefficients on household size and the asset index, respectively, as we do in Section 5.
In addition, these correlations are robust to the inclusion of other important household covariates (number of household members (15-64), gender of the household head, and number of plots), and also to the inclusion of site-by-season-by-crop fixed effects. We interpret these results as consistent with household size shifting the shadow wage, and the asset index shifting the shadow price of inputs. In turn, these are consistent with household size shifting the household's availability of labor, and the asset index shifting the household's ability to purchase inputs.

Appendix I Experimental Appendix

Appendix I.1 Experimental design

We conducted three randomized controlled trials in these hillside irrigation schemes. First, we manipulated operations and maintenance (O&M) in the hillside irrigation schemes, by randomly assigning water user groups to different approaches to monitoring. Qualitative work raised concerns that the water user groups as established would not be sufficient to enforce water usage schedules and that routine maintenance tasks would not be performed adequately, as has been documented by Ostrom (1990). Second, we subsidized water usage fees the government had planned to collect from farmers, which were as high as 77,000 RwF/ha/year. For reference, this is roughly 20% of our dry season treatment on the treated estimates, and roughly 50% of median land rental prices. If farmers believed that they were more likely to be required to pay the fees if they used the irrigation infrastructure, then these fees had the potential to influence farmers' production decisions (even though they are small relative to potential yield gains from irrigation use). Third, we provided agricultural minikits, which included 0.02 ha of seeds, chemical fertilizer, and insecticide, which could be used for horticulture cultivation.
In other contexts, minikits of similar size relative to median landholdings have been shown to increase adoption of new crop varieties or varieties with low levels of adoption (Emerick et al., 2016; Jones et al., 2018). Although horticulture is not unfamiliar in these areas, at baseline only 3.2% of plots outside the command area were planted with at least some horticulture, and primarily during the rainy seasons.

Assignment to experimental arms for O&M, minikits, and subsidies was as follows. First, for the O&M intervention, 251 water user groups across three irrigation sites were randomized into three arms, stratified across the 33 Zones into which these irrigation sites are divided.[64] Second, for the minikit intervention, water user groups were randomly assigned to 20%, 60%, or 100% saturation, with rerandomization for balance on Zone and O&M treatment status. Following this assignment, individuals on the lists of water user group members provided to us by the sites were randomly assigned to receive minikits with probabilities equal to that water user group's saturation. Minikits were offered to assigned individuals prior to 2017 Rainy 1 and 2017 Dry. Third, for the subsidy intervention, our implementing partner was concerned that the assignment rule might be perceived as hidden, so public lotteries for subsidies were conducted at the Zone level.[65]

[64] 40% were assigned to a status quo arm where the irrigator/operators employed by the site were responsible for enforcing water usage schedules and reporting O&M problems to the local Water User Association. 30% were assigned to an arm where the water user group elected a monitor who was tasked with these responsibilities, trained in implementing them, and given worksheets to fill out and return to the Water User Association reporting challenges with enforcement of the water usage schedule and any O&M concerns. In an additional 30%, the elected monitor was required to have a plot near the top of the water user group, where the flow of water is most negatively impacted when too many farmers try to irrigate at once. Monitors were trained just before the 2016 Dry season, with refresher trainings during 2016 Dry and 2017 Rainy 1.

Appendix I.2 O&M and Fee Subsidies

We find no effects of empowering monitors and fee subsidies on agricultural decisions in our context; we offer some qualitative evidence and simple descriptives from our data that explain these null effects.[66]

First, we find no impact of empowering monitors. This is because O&M was highly effective in these irrigation schemes, and empowering monitors therefore had limited scope for changing O&M practices. Farmers reported 14% as many days without enough water during the dry seasons as they reported days using irrigation. Any event where conflict among water user group members caused insufficient water at some point during the dry season was reported for 3% of irrigated plots.[67] This success was far from guaranteed in the early years of the schemes; site engineers have suggested that the combination of lower adoption of irrigation than the schemes are designed for and high compliance with water usage schedules among farmers has been the cause of this. Moreover, during the 2018 Dry season we found evidence that control water user groups adopted the intervention, as some members of control water user groups adopted the roles that were assigned to monitors.

Second, we find no impact of fee subsidies. The reason for this is clear: although we have a strong and large first stage on fees owed by farmers in administrative data, the impacts of subsidies on fees paid by farmers were 10% of the size of the impacts on fees owed, both in administrative data and self reports. Moreover, the fees were implemented as land taxes and not charged based on irrigation use so as not to discourage adoption.
In sum, at the low levels of enforcement observed during the 2017 Rainy seasons, the fees should not have affected farmers' production decisions, consistent with the results we find.

[65] At these public lotteries, 40% of farmers received no subsidy, 20% received a 50% subsidy for one season, 20% received a 100% subsidy for one season, and 20% received a 100% subsidy for two seasons. The lotteries took place at the start of the 2017 Rainy 1, and subsidies were for 2017 Rainy 1 and 2017 Rainy 2; at the time the Water User Associations did not plan to collect fees during the Dry season.
[66] Results are available upon request.
[67] This magnitude is small; for reference, Sekhri (2014) finds that the share of farmers reporting disputes over groundwater in India increases by 29pp when water tables become sufficiently low.

supplementary appendix – not intended for publication

Supplementary Appendix A Additional robustness

To complement the analysis in Section 3, we estimate the impacts of access to irrigation using specifications similar to specifications 1 and 2, but with additional controls or using alternative weights. We present these estimates similarly to Table 4, and compare estimated coefficients to those in Table 4.

First, as described in Section 2.2.1, our sampling strategy oversamples households who managed relatively more plots at baseline, and these households are therefore likely to be overrepresented in our analysis relative to their share of the population. To test the robustness of our main results to changes in sampling, we estimate specifications 1 and 2 reweighting by the inverse number of plots, and we present these estimates in Table S1. The patterns in the results we describe in Section 3 are robust to this alternative weighting.
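The reweighting described above can be illustrated with a small weighted least squares sketch. This is a simplified illustration under stated assumptions, not the specification used for Table S1: it regresses an outcome on the command area indicator alone (no running variable, no fixed effects), and the function name and the four plot records are hypothetical.

```python
def wls_slope(y, x, w):
    """Weighted least squares slope and intercept of y on one regressor x."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return slope, ybar - slope * xbar


# Hypothetical plot records:
# (plots managed by the household at baseline, command area indicator, outcome)
plots = [
    (1, 1, 0.9),
    (1, 0, 0.1),
    (4, 1, 0.3),
    (4, 0, 0.2),
]
y = [p[2] for p in plots]
ca = [p[1] for p in plots]

# Equal weights: households with many plots are oversampled and count as much as others.
unweighted, _ = wls_slope(y, ca, [1.0] * len(plots))

# Inverse-number-of-plots weights: downweight households that manage more plots.
weights = [1.0 / p[0] for p in plots]
reweighted, _ = wls_slope(y, ca, weights)

print(round(unweighted, 4), round(reweighted, 4))
```

With a binary regressor the weighted slope is the weighted difference in mean outcomes between command area and non command area plots, so comparing the two numbers shows how much the estimate depends on households with many plots.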
Table S1: Estimated effects of access to irrigation are robust to reweighting by the inverse of # of plots

(a) Dry season

| | (1) Cultivated | (2) Irrigated | (3) Horticulture | (4) Banana | (5) HH labor/ha | (6) Input exp./ha | (7) Hired labor exp./ha | (8) Yield | (9) Sales/ha | (10) Profits/ha, shadow wage = 0 | (11) Profits/ha, shadow wage = 800 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| RDD (Site-by-season FE, Specification 1) | | | | | | | | | | | |
| CA | 0.004 | 0.165 | 0.141 | -0.125 | 63.8 | 6.1 | 4.0 | 68.4 | 48.4 | 58.2 | 8.0 |
| | (0.046) | (0.026) | (0.024) | (0.045) | (15.6) | (1.7) | (1.9) | (21.0) | (14.6) | (18.4) | (14.4) |
| | [0.923] | [0.000] | [0.000] | [0.006] | [0.000] | [0.000] | [0.038] | [0.001] | [0.001] | [0.002] | [0.581] |
| # of observations | 2,537 | 2,537 | 2,536 | 2,536 | 2,523 | 2,527 | 2,527 | 2,402 | 2,527 | 2,402 | 2,400 |
| # of clusters | 196 | 196 | 196 | 196 | 196 | 196 | 196 | 196 | 196 | 196 | 196 |
| Unweighted estimate | 0.005 | 0.162 | 0.137 | -0.133 | 70.8 | 6.3 | 3.7 | 73.1 | 55.5 | 63.9 | 9.3 |

(b) Rainy seasons

| | (1) Cultivated | (2) Irrigated | (3) Horticulture | (4) Banana | (5) HH labor/ha | (6) Input exp./ha | (7) Hired labor exp./ha | (8) Yield | (9) Sales/ha | (10) Profits/ha, shadow wage = 0 | (11) Profits/ha, shadow wage = 800 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| RDD (Site-by-season FE, Specification 1) | | | | | | | | | | | |
| CA | -0.091 | 0.039 | 0.009 | -0.149 | 8.7 | 3.2 | 5.9 | -5.9 | -3.6 | -14.3 | -20.4 |
| | (0.027) | (0.009) | (0.020) | (0.050) | (23.2) | (2.8) | (4.2) | (28.8) | (17.7) | (25.6) | (22.4) |
| | [0.001] | [0.000] | [0.664] | [0.003] | [0.707] | [0.244] | [0.161] | [0.839] | [0.837] | [0.575] | [0.362] |
| # of observations | 4,236 | 4,236 | 4,235 | 4,235 | 4,215 | 4,223 | 4,223 | 4,085 | 4,223 | 4,085 | 4,078 |
| # of clusters | 196 | 196 | 196 | 196 | 196 | 196 | 196 | 196 | 196 | 196 | 196 |
| Unweighted estimate | -0.092 | 0.035 | 0.016 | -0.158 | 8.5 | 1.1 | 3.7 | -22.6 | -13.3 | -26.4 | -31.8 |

Second, as described in Section 3.1, our primary specifications 1 and 2 control for distance to the command area boundary and its interaction with the command area indicator. As an alternative, we consider the robustness of these specifications to the functional form of the control for distance to the command area boundary.
In Appendix C, we showed that our main results were robust to omitting controls for distance to the command area boundary and its interaction with the command area indicator. We now estimate specifications 1 and 2 with additional controls for distance to the command area boundary squared and its interaction with the command area indicator, and we present these estimates in Table S2. The patterns in the results we describe in Section 3 are robust to the inclusion of these controls.

Table S2: Estimated effects of access to irrigation are robust to controlling for distance to the command area boundary squared and its interaction with the command area indicator

(a) Dry season

| | (1) Cultivated | (2) Irrigated | (3) Horticulture | (4) Banana | (5) HH labor/ha | (6) Input exp./ha | (7) Hired labor exp./ha | (8) Yield | (9) Sales/ha | (10) Profits/ha, shadow wage = 0 | (11) Profits/ha, shadow wage = 800 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| RDD (Site-by-season FE, Specification 1) | | | | | | | | | | | |
| CA | 0.027 | 0.142 | 0.135 | -0.126 | 60.9 | 7.0 | 8.3 | 49.0 | 42.1 | 33.8 | -11.4 |
| | (0.057) | (0.031) | (0.029) | (0.055) | (21.6) | (2.0) | (2.5) | (35.2) | (25.8) | (33.4) | (28.2) |
| | [0.632] | [0.000] | [0.000] | [0.022] | [0.005] | [0.000] | [0.001] | [0.165] | [0.103] | [0.312] | [0.685] |
| Effect w/o quadratic RV | 0.005 | 0.162 | 0.137 | -0.133 | 70.8 | 6.3 | 3.7 | 73.1 | 55.5 | 63.9 | 9.3 |
| SFE (Spatial FE, Specification 2) | | | | | | | | | | | |
| CA | 0.082 | 0.126 | 0.129 | -0.062 | 59.9 | 5.1 | 6.6 | 54.0 | 35.4 | 42.0 | 0.9 |
| | (0.055) | (0.035) | (0.035) | (0.052) | (21.0) | (2.5) | (3.0) | (34.7) | (23.2) | (32.2) | (27.5) |
| | [0.134] | [0.000] | [0.000] | [0.234] | [0.004] | [0.043] | [0.026] | [0.119] | [0.127] | [0.193] | [0.974] |
| Effect w/o quadratic RV | 0.022 | 0.171 | 0.156 | -0.142 | 76.9 | 4.3 | 3.2 | 55.0 | 49.3 | 49.1 | -3.0 |
| # of observations | 2,537 | 2,537 | 2,536 | 2,536 | 2,523 | 2,527 | 2,527 | 2,402 | 2,527 | 2,402 | 2,400 |
| # of clusters | 196 | 196 | 196 | 196 | 196 | 196 | 196 | 196 | 196 | 196 | 196 |

(b) Rainy seasons

| | (1) Cultivated | (2) Irrigated | (3) Horticulture | (4) Banana | (5) HH labor/ha | (6) Input exp./ha | (7) Hired labor exp./ha | (8) Yield | (9) Sales/ha | (10) Profits/ha, shadow wage = 0 | (11) Profits/ha, shadow wage = 800 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| RDD (Site-by-season FE, Specification 1) | | | | | | | | | | | |
| CA | -0.064 | 0.035 | 0.001 | -0.136 | 7.5 | 1.4 | 4.7 | -30.8 | -48.0 | -36.0 | -41.3 |
| | (0.033) | (0.014) | (0.031) | (0.057) | (26.0) | (3.9) | (4.4) | (49.4) | (31.4) | (46.1) | (46.5) |
| | [0.057] | [0.010] | [0.978] | [0.017] | [0.774] | [0.716] | [0.281] | [0.533] | [0.127] | [0.436] | [0.375] |
| Effect w/o quadratic RV | -0.092 | 0.035 | 0.016 | -0.158 | 8.5 | 1.1 | 3.7 | -22.6 | -13.3 | -26.4 | -31.8 |
| SFE (Spatial FE, Specification 2) | | | | | | | | | | | |
| CA | -0.022 | 0.051 | 0.021 | -0.083 | -0.5 | -0.4 | -0.3 | -48.5 | -38.4 | -47.5 | -49.3 |
| | (0.038) | (0.015) | (0.038) | (0.050) | (29.8) | (3.8) | (4.8) | (39.5) | (26.2) | (37.5) | (38.9) |
| | [0.557] | [0.001] | [0.587] | [0.098] | [0.988] | [0.923] | [0.944] | [0.219] | [0.143] | [0.206] | [0.206] |
| Effect w/o quadratic RV | -0.053 | 0.059 | 0.048 | -0.168 | 9.9 | 2.1 | 3.1 | -15.4 | 5.6 | -19.4 | -27.3 |
| # of observations | 4,236 | 4,236 | 4,235 | 4,235 | 4,215 | 4,223 | 4,223 | 4,085 | 4,223 | 4,085 | 4,078 |
| # of clusters | 196 | 196 | 196 | 196 | 196 | 196 | 196 | 196 | 196 | 196 | 196 |

Third, as described in Section 3.1, we estimate specification 1 with standard errors clustered at the water user group level. As an alternative, we estimate this specification with standard errors clustered at the zone level instead of the water user group level, and we present these estimates in Table S3; we describe differences between zones and water user groups in Section 2.1. The patterns of statistical significance we describe in Section 3 are unaffected by instead clustering at the zone level.
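To make the comparison between clustering levels concrete, the sketch below computes a cluster-robust standard error for the slope of a simple regression, once with water user groups as clusters and once with zones. It is a minimal sketch under stated assumptions: the function name and the eight observations are hypothetical, the regression has a single regressor and no fixed effects, and no small-sample correction is applied, so it is not the procedure behind Table S3.

```python
from collections import defaultdict
from math import sqrt


def ols_cluster_se(y, x, clusters):
    """OLS slope of y on x (with intercept) and its cluster-robust standard error.

    Uses the sandwich formula without any small-sample adjustment:
    Var(slope) = sum over clusters of (sum of x_tilde * residual)^2 / (sum of x_tilde^2)^2.
    """
    n = len(y)
    xbar = sum(x) / n
    ybar = sum(y) / n
    x_tilde = [xi - xbar for xi in x]
    sxx = sum(v * v for v in x_tilde)
    slope = sum(v * (yi - ybar) for v, yi in zip(x_tilde, y)) / sxx
    resid = [(yi - ybar) - slope * v for yi, v in zip(y, x_tilde)]
    # Sum the score contributions within each cluster before squaring.
    scores = defaultdict(float)
    for v, u, g in zip(x_tilde, resid, clusters):
        scores[g] += v * u
    meat = sum(s * s for s in scores.values())
    return slope, sqrt(meat) / sxx


# Hypothetical data: 8 plots in 4 water user groups nested in 2 zones.
ca = [1, 0, 1, 0, 1, 0, 1, 0]
outcome = [1.2, 0.4, 0.8, 0.1, 1.5, 0.9, 1.1, 0.6]
wug = ["w1", "w1", "w2", "w2", "w3", "w3", "w4", "w4"]
zone = ["z1", "z1", "z1", "z1", "z2", "z2", "z2", "z2"]

slope_wug, se_wug = ols_cluster_se(outcome, ca, wug)
slope_zone, se_zone = ols_cluster_se(outcome, ca, zone)
print(slope_wug, se_wug, se_zone)
```

The point estimate is identical under both choices; only the standard error changes, because residuals are allowed to be correlated within larger groups when clustering at the zone level.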
Table S3: Patterns of statistical significance of the effects of access to irrigation are robust to alternative clustering

(a) Dry season

| | (1) Cultivated | (2) Irrigated | (3) Horticulture | (4) Banana | (5) HH labor/ha | (6) Input exp./ha | (7) Hired labor exp./ha | (8) Yield | (9) Sales/ha | (10) Profits/ha, shadow wage = 0 | (11) Profits/ha, shadow wage = 800 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| RDD (Site-by-season FE, Specification 1) | | | | | | | | | | | |
| CA | 0.005 | 0.162 | 0.137 | -0.133 | 70.8 | 6.3 | 3.7 | 73.1 | 55.5 | 63.9 | 9.3 |
| | (0.053) | (0.040) | (0.034) | (0.041) | (23.9) | (1.7) | (3.0) | (39.2) | (23.4) | (35.3) | (22.4) |
| | [0.930] | [0.000] | [0.000] | [0.001] | [0.003] | [0.000] | [0.210] | [0.062] | [0.018] | [0.071] | [0.677] |
| # of observations | 2,537 | 2,537 | 2,536 | 2,536 | 2,523 | 2,527 | 2,527 | 2,402 | 2,527 | 2,402 | 2,400 |
| # of clusters | 31 | 31 | 31 | 31 | 31 | 31 | 31 | 31 | 31 | 31 | 31 |
| SE w/ WUG cluster | (0.041) | (0.024) | (0.024) | (0.037) | (17.5) | (1.5) | (2.1) | (23.2) | (14.5) | (21.0) | (16.5) |
| p w/ WUG cluster | [0.909] | [0.000] | [0.000] | [0.000] | [0.000] | [0.000] | [0.082] | [0.002] | [0.000] | [0.002] | [0.573] |

(b) Rainy seasons

| | (1) Cultivated | (2) Irrigated | (3) Horticulture | (4) Banana | (5) HH labor/ha | (6) Input exp./ha | (7) Hired labor exp./ha | (8) Yield | (9) Sales/ha | (10) Profits/ha, shadow wage = 0 | (11) Profits/ha, shadow wage = 800 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| RDD (Site-by-season FE, Specification 1) | | | | | | | | | | | |
| CA | -0.092 | 0.035 | 0.016 | -0.158 | 8.5 | 1.1 | 3.7 | -22.6 | -13.3 | -26.4 | -31.8 |
| | (0.025) | (0.011) | (0.019) | (0.050) | (18.1) | (2.4) | (3.8) | (29.8) | (19.4) | (26.5) | (31.8) |
| | [0.000] | [0.002] | [0.386] | [0.002] | [0.640] | [0.652] | [0.325] | [0.448] | [0.493] | [0.319] | [0.317] |
| # of observations | 4,236 | 4,236 | 4,235 | 4,235 | 4,215 | 4,223 | 4,223 | 4,085 | 4,223 | 4,085 | 4,078 |
| # of clusters | 31 | 31 | 31 | 31 | 31 | 31 | 31 | 31 | 31 | 31 | 31 |
| SE w/ WUG cluster | (0.025) | (0.009) | (0.018) | (0.038) | (23.1) | (2.9) | (3.4) | (30.8) | (18.5) | (28.5) | (26.4) |
| p w/ WUG cluster | [0.000] | [0.000] | [0.371] | [0.000] | [0.714] | [0.710] | [0.276] | [0.462] | [0.472] | [0.354] | [0.228] |

Fourth, as described in Section 3.1.1, the coefficients on household head completed primary and number of household members (15-64) in the balance
test in Table 3 may appear economically significant, although only the coefficient on household head completed primary is statistically significant and only in some specifications. To test the robustness of our main results, we therefore estimate specifications 1 and 2 with household head completed primary and number of household members (15-64) included as controls, and we present these estimates in Table S4. The patterns in the results we describe in Section 3 are robust to the inclusion of these controls.

Table S4: Estimated effects of access to irrigation are robust to controlling for household head completed primary and number of household members (15-64)

(a) Dry season

| | (1) Cultivated | (2) Irrigated | (3) Horticulture | (4) Banana | (5) HH labor/ha | (6) Input exp./ha | (7) Hired labor exp./ha | (8) Yield | (9) Sales/ha | (10) Profits/ha, shadow wage = 0 | (11) Profits/ha, shadow wage = 800 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| RDD (Site-by-season FE, Specification 1) | | | | | | | | | | | |
| CA | 0.002 | 0.157 | 0.134 | -0.131 | 67.9 | 5.9 | 3.4 | 70.1 | 51.9 | 61.2 | 8.2 |
| | (0.041) | (0.024) | (0.024) | (0.036) | (18.0) | (1.5) | (2.3) | (23.9) | (14.8) | (21.6) | (17.0) |
| | [0.962] | [0.000] | [0.000] | [0.000] | [0.000] | [0.000] | [0.139] | [0.003] | [0.000] | [0.005] | [0.631] |
| Effect w/o household controls | 0.005 | 0.162 | 0.137 | -0.133 | 70.8 | 6.3 | 3.7 | 73.1 | 55.5 | 63.9 | 9.3 |
| SFE (Spatial FE, Specification 2) | | | | | | | | | | | |
| CA | 0.019 | 0.167 | 0.153 | -0.140 | 75.6 | 4.1 | 3.1 | 55.3 | 47.8 | 49.2 | -2.4 |
| | (0.044) | (0.031) | (0.029) | (0.034) | (21.7) | (1.9) | (2.7) | (28.7) | (19.2) | (25.9) | (21.0) |
| | [0.657] | [0.000] | [0.000] | [0.000] | [0.000] | [0.031] | [0.248] | [0.054] | [0.013] | [0.057] | [0.908] |
| Effect w/o household controls | 0.022 | 0.171 | 0.156 | -0.142 | 76.9 | 4.3 | 3.2 | 55.0 | 49.3 | 49.1 | -3.0 |
| # of observations | 2,529 | 2,529 | 2,528 | 2,528 | 2,515 | 2,519 | 2,519 | 2,395 | 2,519 | 2,395 | 2,393 |
| # of clusters | 194 | 194 | 194 | 194 | 194 | 194 | 194 | 194 | 194 | 194 | 194 |

(b) Rainy seasons

| | (1) Cultivated | (2) Irrigated | (3) Horticulture | (4) Banana | (5) HH labor/ha | (6) Input exp./ha | (7) Hired labor exp./ha | (8) Yield | (9) Sales/ha | (10) Profits/ha, shadow wage = 0 | (11) Profits/ha, shadow wage = 800 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| RDD (Site-by-season FE, Specification 1) | | | | | | | | | | | |
| CA | -0.091 | 0.033 | 0.011 | -0.155 | 3.8 | 0.7 | 2.9 | -28.8 | -16.3 | -31.7 | -33.1 |
| | (0.025) | (0.009) | (0.018) | (0.038) | (23.6) | (2.9) | (3.4) | (31.3) | (18.4) | (29.0) | (26.6) |
| | [0.000] | [0.000] | [0.517] | [0.000] | [0.872] | [0.821] | [0.390] | [0.359] | [0.376] | [0.274] | [0.213] |
| Effect w/o household controls | -0.092 | 0.035 | 0.016 | -0.158 | 8.5 | 1.1 | 3.7 | -22.6 | -13.3 | -26.4 | -31.8 |
| SFE (Spatial FE, Specification 2) | | | | | | | | | | | |
| CA | -0.056 | 0.056 | 0.042 | -0.165 | 5.6 | 1.5 | 2.4 | -19.4 | 2.1 | -22.5 | -27.1 |
| | (0.027) | (0.012) | (0.025) | (0.033) | (24.5) | (3.1) | (4.5) | (30.7) | (21.3) | (27.7) | (32.0) |
| | [0.035] | [0.000] | [0.092] | [0.000] | [0.818] | [0.618] | [0.597] | [0.528] | [0.920] | [0.417] | [0.397] |
| Effect w/o household controls | -0.053 | 0.059 | 0.048 | -0.168 | 9.9 | 2.1 | 3.1 | -15.4 | 5.6 | -19.4 | -27.3 |
| # of observations | 4,223 | 4,223 | 4,222 | 4,222 | 4,202 | 4,210 | 4,210 | 4,073 | 4,210 | 4,073 | 4,066 |
| # of clusters | 194 | 194 | 194 | 194 | 194 | 194 | 194 | 194 | 194 | 194 | 194 |

Fifth, as described in Appendix B, command area plots are significantly lower in elevation than plots outside the command area, although this difference is much smaller when we use a spatial fixed effects specification. To test the robustness of our main results, we therefore estimate specifications 1 and 2 with elevation included as a control, and we present these estimates in Table S5. The patterns in the results we describe in Section 3 are robust to the inclusion of elevation as a control.
Table S5: Estimated effects of access to irrigation are robust to controlling for elevation

(a) Dry season

| | (1) Cultivated | (2) Irrigated | (3) Horticulture | (4) Banana | (5) HH labor/ha | (6) Input exp./ha | (7) Hired labor exp./ha | (8) Yield | (9) Sales/ha | (10) Profits/ha, shadow wage = 0 | (11) Profits/ha, shadow wage = 800 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| RDD (Site-by-season FE, Specification 1) | | | | | | | | | | | |
| CA | -0.007 | 0.159 | 0.135 | -0.141 | 69.0 | 5.4 | 0.5 | 55.5 | 43.5 | 50.6 | -1.2 |
| | (0.041) | (0.025) | (0.025) | (0.037) | (18.2) | (1.6) | (2.1) | (23.5) | (15.3) | (21.4) | (16.9) |
| | [0.864] | [0.000] | [0.000] | [0.000] | [0.000] | [0.001] | [0.809] | [0.018] | [0.005] | [0.018] | [0.946] |
| Effect w/o elevation control | 0.005 | 0.162 | 0.137 | -0.133 | 70.8 | 6.3 | 3.7 | 73.1 | 55.5 | 63.9 | 9.3 |
| SFE (Spatial FE, Specification 2) | | | | | | | | | | | |
| CA | 0.022 | 0.157 | 0.147 | -0.119 | 71.7 | 2.6 | 1.3 | 48.7 | 49.3 | 46.2 | 0.1 |
| | (0.045) | (0.031) | (0.030) | (0.035) | (23.0) | (1.9) | (2.7) | (29.8) | (21.4) | (27.0) | (22.8) |
| | [0.620] | [0.000] | [0.000] | [0.001] | [0.002] | [0.173] | [0.636] | [0.102] | [0.021] | [0.087] | [0.995] |
| Effect w/o elevation control | 0.022 | 0.171 | 0.156 | -0.142 | 76.9 | 4.3 | 3.2 | 55.0 | 49.3 | 49.1 | -3.0 |
| # of observations | 2,537 | 2,537 | 2,536 | 2,536 | 2,523 | 2,527 | 2,527 | 2,402 | 2,527 | 2,402 | 2,400 |
| # of clusters | 196 | 196 | 196 | 196 | 196 | 196 | 196 | 196 | 196 | 196 | 196 |

(b) Rainy seasons

| | (1) Cultivated | (2) Irrigated | (3) Horticulture | (4) Banana | (5) HH labor/ha | (6) Input exp./ha | (7) Hired labor exp./ha | (8) Yield | (9) Sales/ha | (10) Profits/ha, shadow wage = 0 | (11) Profits/ha, shadow wage = 800 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| RDD (Site-by-season FE, Specification 1) | | | | | | | | | | | |
| CA | -0.079 | 0.037 | 0.024 | -0.167 | 17.6 | 2.7 | 2.5 | -14.8 | -12.0 | -18.6 | -31.3 |
| | (0.026) | (0.010) | (0.018) | (0.038) | (23.2) | (2.8) | (3.7) | (31.6) | (18.6) | (29.4) | (27.8) |
| | [0.002] | [0.000] | [0.188] | [0.000] | [0.448] | [0.339] | [0.499] | [0.640] | [0.520] | [0.528] | [0.260] |
| Effect w/o elevation control | -0.092 | 0.035 | 0.016 | -0.158 | 8.5 | 1.1 | 3.7 | -22.6 | -13.3 | -26.4 | -31.8 |
| SFE (Spatial FE, Specification 2) | | | | | | | | | | | |
| CA | -0.050 | 0.053 | 0.044 | -0.144 | 5.3 | 0.6 | 2.3 | -21.2 | 4.1 | -23.1 | -26.6 |
| | (0.028) | (0.013) | (0.026) | (0.035) | (26.4) | (3.2) | (4.8) | (33.4) | (23.0) | (30.2) | (34.7) |
| | [0.076] | [0.000] | [0.091] | [0.000] | [0.841] | [0.852] | [0.629] | [0.525] | [0.858] | [0.445] | [0.442] |
| Effect w/o elevation control | -0.053 | 0.059 | 0.048 | -0.168 | 9.9 | 2.1 | 3.1 | -15.4 | 5.6 | -19.4 | -27.3 |
| # of observations | 4,236 | 4,236 | 4,235 | 4,235 | 4,215 | 4,223 | 4,223 | 4,085 | 4,223 | 4,085 | 4,078 |
| # of clusters | 196 | 196 | 196 | 196 | 196 | 196 | 196 | 196 | 196 | 196 | 196 |

Supplementary Appendix B Model appendix

Derivation of first order conditions. Substitute for $L^O$ using the household labor constraint, $L_1 + L_2 + \ell + L^O = \bar{L}$, and substitute for $c$ in the household's maximization problem. This leaves two constraints, $M_1 + M_2 \le \bar{M}$, and $\bar{L} - L_1 - L_2 - \ell \le \bar{L}^O$; call the multipliers on these constraints $\lambda_M$ and $\lambda_L$, respectively. Taking first order conditions yields

$$(M_k)\quad E[u_c \sigma] A_k F_{kM} - E[u_c] r = \lambda_M$$
$$(L_k)\quad E[u_c \sigma] A_k F_{kL} - E[u_c] w = -\lambda_L$$
$$(\ell)\quad E[u_\ell] - E[u_c] w = -\lambda_L$$

To ease interpretation, normalize $\lambda_M \equiv \lambda_M / (r E[u_c])$ and $\lambda_L \equiv \lambda_L / (w E[u_c])$, and substitute $\mathrm{cov}(\sigma, u_c) = E[u_c \sigma] - E[u_c]E[\sigma] = E[u_c \sigma] - E[u_c]$. This yields

$$(M_k)\quad \left(1 + \frac{\mathrm{cov}(\sigma, u_c)}{E[u_c]}\right) A_k F_{kM} = (1 + \lambda_M) r$$
$$(L_k)\quad \left(1 + \frac{\mathrm{cov}(\sigma, u_c)}{E[u_c]}\right) A_k F_{kL} = (1 - \lambda_L) w$$
$$(\ell)\quad \frac{E[u_\ell]}{E[u_c]} = (1 - \lambda_L) w$$

No constraints. When no constraints bind, as discussed the first order conditions simplify to

$$(M_k)\quad A_k F_{kM} = r$$
$$(L_k)\quad A_k F_{kL} = w$$
$$(\ell)\quad \frac{u_\ell}{u_c} = w$$

Note that the first order conditions for $M_2$ and $L_2$ are functions only of $(M_2, L_2)$, and exogenous $(A_2, r, w)$. Therefore, $\frac{dM_2}{dA_1} = \frac{dL_2}{dA_1} = 0$.

Insurance market failure. Consider the case when insurance markets fail. To abstract fully from labor supply, we temporarily remove leisure from the model. To further simplify, we drop other inputs from the production function; when the production function is homogeneous in labor and other inputs, this is without loss of generality.
Households solve

$$\max_{L_1, L_2} E[u(c)] \quad \text{subject to} \quad \sigma\left(A_1 F_1(L_1) + A_2 F_2(L_2)\right) - w(L_1 + L_2) + w\bar{L} + r\bar{M} = c$$

To simplify the analysis, this can be rewritten as the two step optimization problem

$$\max_{L} E[u(c)] \quad \text{subject to} \quad \sigma G(L; A_1) - wL + w\bar{L} + r\bar{M} = c$$
$$\max_{L_2}\; a F_1(L - L_2) + A_2 F_2(L_2) = G(L; a)$$

Next, let $\gamma(g, c) = \frac{E[u_c(\sigma g + c)]}{E[\sigma u_c(\sigma g + c)]}$; $\gamma \ge 1$ is the ratio of the marginal utility from consumption to the marginal utility from agricultural production. As above, to represent derivatives of $G$ and $\gamma$ we use subscripts to indicate partial derivatives and suppress arguments. This yields the first order condition

$$(L)\quad G_L - \gamma\left(G(L; A_1),\; w(\bar{L} - L) + r\bar{M}\right) w = 0$$

The central intuition for this case can be captured from just the first order condition: $\bar{L}$ and $\bar{M}$ enter symmetrically into the model, so larger households should respond similarly to richer households. If absolute risk aversion decreases sufficiently quickly (e.g., with CRRA preferences), then for sufficiently high levels of consumption $E[\sigma u_c] = E[\sigma]E[u_c] = E[u_c] \Rightarrow \gamma = 1$. Therefore, sufficiently wealthy or sufficiently large households should not respond to the sample plot shock. Below, we will maintain the assumption that preferences exhibit decreasing absolute risk aversion, and that $\lim_{c \to \infty} \gamma(g, c) = 1$.

Let $\mathrm{FOC}_L$ be the left hand side of the first order condition for the utility maximization problem. Then, an application of the implicit function theorem yields $\frac{dL}{dA_1} = -\frac{d\mathrm{FOC}_L/dA_1}{d\mathrm{FOC}_L/dL}$. Evaluating these derivatives yields

$$\frac{d\mathrm{FOC}_L}{dL} = G_{LL} + \gamma_c w^2 - \gamma_g G_L w$$
$$\frac{d\mathrm{FOC}_L}{dA_1} = G_{La} - \gamma_g G_a$$
$$\frac{dL}{dA_1} = -\frac{G_{La} - \gamma_g G_a}{G_{LL} + \gamma_c w^2 - \gamma_g G_L w}$$

Next, we use the first order condition for constrained production maximization. Some applications of the envelope theorem and taking derivatives yields

$$G_L = A_1 F_{1L}$$
$$G_a = F_1$$
$$G_{La} = F_{1L}\left(1 - dL_2/dL\right)$$
$$G_{LL} = A_1 F_{1LL}\left(1 - dL_2/dL\right)$$

Lastly, note that $\frac{dL_2}{dA_1} = \frac{dL_2}{dL}\frac{dL}{dA_1} + \frac{dL_2}{da}$, as the increase in $A_1$ shifts both arguments to $G$.
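The envelope results for the inner production problem can be checked on a closed-form case. The following is only an illustrative sketch: it assumes the hypothetical functional form $F_1(L) = F_2(L) = \sqrt{L}$, which is not taken from the paper, and uses $L_1^*$ and $L_2^*$ to denote the maximizers of the inner problem.

```latex
\begin{align*}
% Inner problem under the assumed form F_k(L) = \sqrt{L}
G(L; a) &= \max_{L_2}\; a\sqrt{L - L_2} + A_2\sqrt{L_2} \\
% First order condition: a / (2\sqrt{L - L_2}) = A_2 / (2\sqrt{L_2})
L_2^* &= \frac{A_2^2}{a^2 + A_2^2}\,L, \qquad
L_1^* = L - L_2^* = \frac{a^2}{a^2 + A_2^2}\,L \\
% Value function
G(L; a) &= \sqrt{L\,(a^2 + A_2^2)} \\
% Envelope theorem in a: G_a = F_1(L_1^*)
G_a &= \frac{a\sqrt{L}}{\sqrt{a^2 + A_2^2}} = \sqrt{L_1^*} = F_1(L_1^*) \\
% Envelope theorem in L: G_L = a F_{1L}(L_1^*)
G_L &= \frac{\sqrt{a^2 + A_2^2}}{2\sqrt{L}} = \frac{a}{2\sqrt{L_1^*}} = a F_{1L}(L_1^*) \\
% Share of an extra unit of total labor allocated to plot 2
\frac{dL_2^*}{dL} &= \frac{A_2^2}{a^2 + A_2^2} \in (0, 1) \\
% Cross derivative, matching G_{La} = F_{1L}(1 - dL_2/dL)
G_{La} &= \frac{a}{2\sqrt{L}\sqrt{a^2 + A_2^2}}
       = \frac{1}{2\sqrt{L_1^*}}\left(1 - \frac{dL_2^*}{dL}\right)
\end{align*}
```

Evaluated at $a = A_1$, these expressions reproduce $G_L = A_1 F_{1L}$, $G_a = F_1$, and $G_{La} = F_{1L}(1 - dL_2/dL)$ for this particular functional form.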
Let $\mathrm{FOC}_{L_2}$ denote the left hand side of the first order condition for constrained production maximization. Then, applications of the implicit function theorem yield $\frac{dL_2}{dL} = -\frac{d\mathrm{FOC}_{L_2}/dL}{d\mathrm{FOC}_{L_2}/dL_2}$ and $\frac{dL_2}{da} = -\frac{d\mathrm{FOC}_{L_2}/da}{d\mathrm{FOC}_{L_2}/dL_2}$. Additional math yields

$$\mathrm{FOC}_{L_2} = -a F_{1L} + A_2 F_{2L}$$
$$\frac{d\mathrm{FOC}_{L_2}}{da} = -F_{1L}$$
$$\frac{d\mathrm{FOC}_{L_2}}{dL} = -a F_{1LL}$$
$$\frac{d\mathrm{FOC}_{L_2}}{dL_2} = a F_{1LL} + A_2 F_{2LL}$$
$$\frac{dL_2}{dL} = \frac{a F_{1LL}}{a F_{1LL} + A_2 F_{2LL}}$$
$$\frac{dL_2}{da} = \frac{F_{1L}}{a F_{1LL} + A_2 F_{2LL}}$$

Substituting these into our expression for $\frac{dL_2}{dA_1}$, and in turn our expressions for derivatives of $G$ (in the numerator), yields

$$\frac{dL_2}{dA_1} = \frac{-A_1 F_{1LL}\left(G_{La} - \gamma_g G_a\right) + F_{1L}\left(G_{LL} + \gamma_c w^2 - \gamma_g G_L w\right)}{\left(A_1 F_{1LL} + A_2 F_{2LL}\right)\left(G_{LL} + \gamma_c w^2 - \gamma_g G_L w\right)}$$
$$= \frac{\left(F_{1L} w^2\right)\gamma_c - \left(F_{1L} w - F_{1LL} F_1\right) A_1 \gamma_g}{\left(A_1 F_{1LL} + A_2 F_{2LL}\right)\left(G_{LL} + \gamma_c w^2 - \gamma_g G_L w\right)}$$

To sign this expression, note that the denominator is the product of two second order conditions, for utility maximization and for maximization of production subject to $L_1 = L - L_2$; each of these is negative, so the product is positive. Therefore $\mathrm{sign}(dL_2/dA_1) = \mathrm{sign}\left((F_{1L} w^2)\gamma_c - (F_{1L} w - F_{1LL} F_1) A_1 \gamma_g\right)$. Next, note that $F_{1L} w^2 > 0$ and $-(F_{1L} w - F_{1LL} F_1) A_1 < 0$; therefore one sufficient condition for this derivative to be negative is that $\gamma_c < 0$ and $\gamma_g > 0$; in other words, increasing consumption reduces the marginal utility from consumption relative to the marginal utility from agricultural production, and increasing agricultural production increases the marginal utility from consumption relative to the marginal utility from agricultural production. The former generically holds under decreasing absolute risk aversion, while the latter holds under some restrictions; under these restrictions, $\frac{dL_2}{dA_1} < 0$.

For one sufficient restriction, we follow Karlan et al. (2014) and make restrictions on the distribution of $\sigma$.
We assume that, for some $k > 1$, $\sigma = k$ with probability $\frac{1}{k}$ ("the good state") and $\sigma = 0$ with probability $\frac{k-1}{k}$ ("the bad state"); i.e., there is a crop failure with probability $\frac{k-1}{k}$. Under this assumption, define $R = -\frac{E[u_{cc}]}{E[u_c]}$ to be the household's average risk aversion, and $R_k = -E\left[\frac{u_{cc}}{u_c} \,\middle|\, \sigma = k\right]$ to be the household's risk aversion in the good state. Note that by decreasing absolute risk aversion, $R_k < R$. From this, it follows that

$$\gamma_c = \frac{E[u_{cc}]}{E[\sigma u_c]} - \frac{E[\sigma u_{cc}] E[u_c]}{E[\sigma u_c]^2} = \gamma\left(R_k - R\right) < 0$$

$$\gamma_g = \frac{E[\sigma u_{cc}]}{E[\sigma u_c]} - \frac{E[\sigma^2 u_{cc}] E[u_c]}{E[\sigma u_c]^2} = (k - 1)\frac{E[u_c \mid \sigma = 0]}{E[u_c \mid \sigma = k]} R_k = (k\gamma - 1) R_k > 0$$

Finally, consider the limit as household wealth increases, and assume that agricultural production will not grow infinitely with household wealth; this holds when the marginal product of labor on each plot falls sufficiently quickly and is true of typical decreasing returns to scale production functions. Then, $\lim_{M \to \infty} \gamma = 1$ and $\lim_{M \to \infty} \gamma_c = \lim_{M \to \infty} \gamma_g = 0$, and therefore $\lim_{M \to \infty} \frac{dL_2}{dA_1} = 0$. We therefore expect that, heuristically on average, $\frac{d^2 L_2}{dA_1\, dM} > 0$, as $\frac{dL_2}{dA_1} < 0$ and $\frac{dL_2}{dA_1}$ approaches 0 for large $M$. As $\bar{L}$ and $M$ enter symmetrically, the same results hold for $\bar{L}$.

Input constraint. When only the input constraint binds, the first order conditions simplify to

$$(M_k) \quad A_k F_{kM} = (1 + \lambda_M) r$$

$$(L_k) \quad A_k F_{kL} = w$$

$$(\ell) \quad \frac{E[u_\ell]}{E[u_c]} = w$$

Note that the choice of leisure does not enter into the first order conditions for $M_k$ or $L_k$. Substituting $M_2 = \bar{M} - M_1$ yields the following system of equations

$$A_1 F_{1M}(M_1, L_1) - (1 + \lambda_M) r = 0$$

$$A_1 F_{1L}(M_1, L_1) - w = 0$$

$$A_2 F_{2M}(\bar{M} - M_1, L_2) - (1 + \lambda_M) r = 0$$

$$A_2 F_{2L}(\bar{M} - M_1, L_2) - w = 0$$

Stack the left hand sides into the vector $FOC_M$. Define the Jacobian $J_M \equiv D_{(M_1, L_1, \lambda_M, L_2)} FOC_M$. Applying the implicit function theorem yields $D_{(A_1)}(M_1, L_1, \lambda_M, L_2) = -J_M^{-1} D_{(A_1)} FOC_M$.
Some algebra yields

$$J_M = \begin{pmatrix} A_1 F_{1MM} & A_1 F_{1ML} & -r & 0 \\ A_1 F_{1ML} & A_1 F_{1LL} & 0 & 0 \\ -A_2 F_{2MM} & 0 & -r & A_2 F_{2ML} \\ -A_2 F_{2ML} & 0 & 0 & A_2 F_{2LL} \end{pmatrix}$$

$$D_{(A_1)} FOC_M = \left(F_{1M}, F_{1L}, 0, 0\right)$$

$$\frac{dM_2}{dA_1} = k_M A_2 F_{2LL} A_1 \left(F_{1L} F_{1ML} - F_{1M} F_{1LL}\right)$$

$$\frac{dL_2}{dA_1} = -k_M A_2 F_{2ML} A_1 \left(F_{1L} F_{1ML} - F_{1M} F_{1LL}\right)$$

where $k_M$ is positive.$^{68}$ As $F_{2LL} < 0$, $\mathrm{sign}\left(\frac{dM_2}{dA_1}\right) = -\mathrm{sign}\left(F_{1L} F_{1ML} - F_{1M} F_{1LL}\right)$. This is negative whenever productivity growth on plot 1 would cause optimal input allocations, holding fixed the shadow price of inputs, to increase on plot 1. Similarly, $\mathrm{sign}\left(\frac{dL_2}{dA_1}\right) = \mathrm{sign}(F_{2LM})\, \mathrm{sign}\left(\frac{dM_2}{dA_1}\right)$. The labor response and input response on the second plot have the same sign whenever labor and inputs are complements on the second plot.

68 $k_M = -\frac{1}{(A_1 F_{1LL}) A_2^2 \left(F_{2MM} F_{2LL} - F_{2ML}^2\right) + (A_2 F_{2LL}) A_1^2 \left(F_{1MM} F_{1LL} - F_{1ML}^2\right)}$. We make standard assumptions required for unconstrained optimization; second order conditions for unconstrained optimization imply $k_M$ is positive.

Labor constraint. When only the labor constraint binds, the first order conditions simplify to

$$(M_k) \quad A_k F_{kM} = r$$

$$(L_k) \quad A_k F_{kL} = (1 - \lambda_L) w$$

$$(\ell) \quad \frac{u_\ell}{u_c} = (1 - \lambda_L) w$$

Substituting $\ell = \bar{L} - L_O - L_1 - L_2$ and $L_O = \bar{L}_O$, and some rearranging yields

$$A_1 F_{1M}(M_1, L_1) - r = 0$$

$$A_1 F_{1L}(M_1, L_1) - (1 + \lambda_L) w = 0$$

$$A_2 F_{2M}(M_2, L_2) - r = 0$$

$$A_2 F_{2L}(M_2, L_2) - (1 + \lambda_L) w = 0$$

$$u_\ell\Big(\sum_{k \in \{1,2\}} A_k F_k(M_k, L_k) + r(\bar{M} - M_1 - M_2) + w\bar{L}_O,\; \bar{L} - \bar{L}_O - L_1 - L_2\Big) - (1 + \lambda_L) w\, u_c\Big(\sum_{k \in \{1,2\}} A_k F_k(M_k, L_k) + r(\bar{M} - M_1 - M_2) + w\bar{L}_O,\; \bar{L} - \bar{L}_O - L_1 - L_2\Big) = 0$$

Stack the left hand sides into the vector $FOC_L$. Additionally, it will be convenient to define the following derivatives of on farm labor demand on plot $k$, $LD_k$, with respect to the shadow wage $w^*$ and productivity $A_k$, on farm input demand on plot $k$, $MD_k$, with respect to productivity $A_k$, and on farm labor supply, $LS$, with respect to the shadow wage $w^*$ and consumption (through shifts to wealth) $c$.
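Let

$$LD_{kw^*} = \frac{A_k F_{kMM}}{A_k^2\left(F_{kMM} F_{kLL} - F_{kML}^2\right)}$$

$$LD_{kA_k} = \frac{A_k F_{kM} F_{kML} - A_k F_{kL} F_{kMM}}{A_k^2\left(F_{kMM} F_{kLL} - F_{kML}^2\right)}$$

$$MD_{kA_k} = \frac{A_k F_{kL} F_{kML} - A_k F_{kM} F_{kLL}}{A_k^2\left(F_{kMM} F_{kLL} - F_{kML}^2\right)}$$

$$LS_{w^*} = -\frac{u_c}{u_{\ell\ell} - (1 + \lambda_L) w\, u_{c\ell}}$$

$$LS_c = -\frac{u_{c\ell} - (1 + \lambda_L) w\, u_{cc}}{u_{\ell\ell} - (1 + \lambda_L) w\, u_{c\ell}}$$

We make standard assumptions required for unconstrained optimization; these imply $LD_{kw^*}$ is negative (labor demand decreasing in shadow wage), and $LS_{w^*}$ is positive (labor supply increasing in shadow wage). We further assume $LD_{kA_k}$ and $MD_{kA_k}$ are positive (labor demand and input demand are increasing in productivity); an additional sufficient assumption for this is that $F$ is homogeneous. We further assume $LS_c$ is negative (labor supply is decreasing in wealth); an additional sufficient assumption for this is that $u$ is additively separable in $c$ and $\ell$.

Next, define the Jacobian $J_L \equiv D_{(M_1, L_1, M_2, L_2, \lambda_L)} FOC_L$. Some algebra yields

$$J_L = \begin{pmatrix} A_1 F_{1MM} & A_1 F_{1ML} & 0 & 0 & 0 \\ A_1 F_{1ML} & A_1 F_{1LL} & 0 & 0 & -w \\ 0 & 0 & A_2 F_{2MM} & A_2 F_{2ML} & 0 \\ 0 & 0 & A_2 F_{2ML} & A_2 F_{2LL} & -w \\ \frac{dFOC_{L,\ell}}{dM_1} & \frac{dFOC_{L,\ell}}{dL_1} & \frac{dFOC_{L,\ell}}{dM_2} & \frac{dFOC_{L,\ell}}{dL_2} & -w u_c \end{pmatrix}$$

$$\frac{dFOC_{L,\ell}}{dM_1} = A_1 F_{1M}\left(u_{c\ell} - (1 + \lambda_L) w\, u_{cc}\right)$$

$$\frac{dFOC_{L,\ell}}{dL_1} = A_1 F_{1L}\left(u_{c\ell} - (1 + \lambda_L) w\, u_{cc}\right) - \left(u_{\ell\ell} - (1 + \lambda_L) w\, u_{c\ell}\right)$$

$$\frac{dFOC_{L,\ell}}{dM_2} = A_2 F_{2M}\left(u_{c\ell} - (1 + \lambda_L) w\, u_{cc}\right)$$

$$\frac{dFOC_{L,\ell}}{dL_2} = A_2 F_{2L}\left(u_{c\ell} - (1 + \lambda_L) w\, u_{cc}\right) - \left(u_{\ell\ell} - (1 + \lambda_L) w\, u_{c\ell}\right)$$

Applying the implicit function theorem yields $D_{(A_1)}(M_1, L_1, M_2, L_2, \lambda_L) = -J_L^{-1} D_{(A_1)} FOC_L$. Some further algebra, and substitution, yields

$$D_{(A_1)} FOC_L = \left(F_{1M}, F_{1L}, 0, 0, \left(u_{c\ell} - (1 + \lambda_L) w\, u_{cc}\right) F_1\right)$$

$$\frac{dL_2}{dA_1} = LD_{2w^*}\, \frac{LD_{1A_1} - LS_c\left(F_{1M} MD_{1A_1} + F_{1L} LD_{1A_1} + F_1\right)}{LS_{w^*} - \left(LD_{1w^*} + LD_{2w^*}\right) - LS_c\left(LD_{1A_1} + LD_{2A_2}\right)}$$

$$\frac{dL_2}{d\bar{L}} = LD_{2w^*}\, \frac{1}{LS_{w^*} - \left(LD_{1w^*} + LD_{2w^*}\right) - LS_c\left(LD_{1A_1} + LD_{2A_2}\right)}$$

$$\frac{dL_2}{d\bar{M}} = LD_{2w^*}\, \frac{r LS_c}{LS_{w^*} - \left(LD_{1w^*} + LD_{2w^*}\right) - LS_c\left(LD_{1A_1} + LD_{2A_2}\right)}$$

$\frac{dL_2}{dA_1} < 0$; for interpretation, note that this expression is the derivative of labor demand on plot 2 with respect to the shadow wage, times the effect of the shock to $A_1$ on the shadow wage.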
The numerator of the latter is the effect of the shock on negative residual labor supply through direct effects ($LD_{1A_1}$) and wealth effects, including through adjustments of labor and inputs ($-LS_c\left(F_{1M} MD_{1A_1} + F_{1L} LD_{1A_1} + F_1\right)$). The denominator of the latter is the derivative of residual labor supply with respect to the shadow wage, adjusted for wealth effects ($LS_{w^*} - \left(LD_{1w^*} + LD_{2w^*}\right) - LS_c\left(LD_{1A_1} + LD_{2A_2}\right)$).

The signs of $\frac{d^2 L_2}{d\bar{L}\, dA_1}$ and $\frac{d^2 L_2}{d\bar{M}\, dA_1}$ are ambiguous. However, unlike the cases of input market failures or insurance market failures, here these second derivatives may have opposite signs. To see one example of this, consider a case where on farm labor and input demands are approximately linear in the shadow wage and productivity, and on farm labor supply is approximately linear in consumption, but exhibits meaningful curvature with respect to the shadow wage. In this case, $\mathrm{sign}\left(\frac{d^2 L_2}{d\bar{L}\, dA_1}\right) = \mathrm{sign}\left(\frac{d}{d\bar{L}} LS_{w^*}\right)$ and $\mathrm{sign}\left(\frac{d^2 L_2}{d\bar{M}\, dA_1}\right) = \mathrm{sign}\left(\frac{d}{d\bar{M}} LS_{w^*}\right)$. To focus on one case, larger households are less responsive to the $A_1$ shock ($\frac{d^2 L_2}{d\bar{L}\, dA_1} > 0$) if and only if they are on a more elastic portion of their labor supply curve ($\frac{d}{d\bar{L}} LS_{w^*} > 0$). That larger households, with more labor available for agriculture, or poorer households, who likely have fewer productive opportunities outside agriculture, would be on a more elastic portion of their labor supply curve is consistent with proposed models of household labor supply dating back to Lewis (1954). This motivates the prediction we focus on: that larger households should be less responsive to the $A_1$ shock, and richer households should be more responsive to the $A_1$ shock.

Supplementary Appendix B.1 Testing for binding constraint with crop choice

Supplementary Appendix B.1.1 Model featuring crop choice

Households have 2 plots, indexed by $k$: $k = 1$ indicates the sample plot, while $k = 2$ indicates the most important plot.
On each plot $k$, they have access to two production technologies, corresponding to horticulture and bananas. The technology for horticulture production is $\sigma A_k^H F_k^H(M_k, L_k, z_k)$, where $A_k^H$ is plot horticulture productivity, $M_k$ is inputs applied to horticulture on plot $k$, and $L_k$ is household labor applied to plot $k$. The production shock $\sigma$ is a random variable with mean normalized to 1.$^{69}$ Utilizing subscripts to indicate partial derivatives and subsuming arguments, we assume marginal products are strictly positive ($F^H_{kM} > 0$, $F^H_{kL} > 0$, $F^H_{kz} > 0$), marginal products are increasing in the use of other inputs ($F^H_{kML} > 0$, $F^H_{kMz} > 0$, $F^H_{kLz} > 0$), and the production technology is strictly concave ($F^H_{kMM} < 0$, $F^H_{kLL} < 0$, $F^H_{kzz} < 0$, $F^H_{kMM} F^H_{kLL} - (F^H_{kLM})^2 > 0$, …). The technology for banana production is $F_k^B(1 - z_k)$. We make the simplifying assumption that banana production only uses land as an input, consistent with the very low input and labor use associated with banana production that we document. We make the additional simplifying assumption that banana production is riskless, consistent with qualitative work suggesting that horticultural production is particularly risky because of both production risk and marketing risk. In addition, we allow other costs and benefits of allocating land to bananas relative to horticulture, $C_k^B(1 - z_k)$, which includes rainy season production of bananas. We further assume $F^B_{kz} > 0$, $F^B_{kzz} < 0$, $C^B_{kz} > 0$, $C^B_{kzz} > 0$.

Within this framework, we model irrigation access on plot $k$ as an increase in horticultural productivity $A_k^H$ from 0.$^{70}$ Note that this implies that during the dry season, households will not cultivate horticulture, use labor, or use inputs on plots outside the command area, consistent with our results in Section 3.

Households have a budget of $\bar{M}$ which, if not utilized for inputs, can be invested in a risk-free asset which appreciates at rate $r$.
In this context, households maximize expected utility over consumption $c$ and leisure $\ell$, considering their budget constraint and a labor constraint $\bar{L}$ which is allocated to labor on each plot, leisure, and up to $\bar{L}_O$ units of off farm labor $L_O$.$^{71}$ Households maximize expected utility

$$\max_{M_1, M_2, L_1, L_2, z_1, z_2, \ell, L_O} E[u(c, \ell)]$$

subject to the constraints enumerated above

$$\sigma\left(A_1^H F_1^H(M_1, L_1, z_1) + A_2^H F_2^H(M_2, L_2, z_2)\right) + F_1^B(1 - z_1) - C_1^B(1 - z_1) + F_2^B(1 - z_2) - C_2^B(1 - z_2) + w L_O + r(\bar{M} - M_1 - M_2) = c$$

$$M_1 + M_2 \leq \bar{M}$$

$$L_1 + L_2 + \ell + L_O = \bar{L}$$

$$L_O \leq \bar{L}_O$$

69 While we refer to $\sigma$ as a production shock, this incorporates general uncertainty in the value of production which includes joint price and production risk.

70 One microfoundation of this is that production of horticulture is Leontief in water and the composite $F^H(M_k, L_k, z_k)$. Access to irrigation provides free access to water, which is not traditionally available during the dry season.

71 We follow Benjamin (1992) in modeling incomplete labor markets as driven by an off farm labor constraint. As in Benjamin (1992), we do so to match the observation that rural wages appear to be higher than the productivity of on-farm labor. However, for the predictions that follow it is sufficient that the household farm face an upward sloping residual labor supply. This holds if households face a downward sloping labor demand curve (implied by Benjamin (1992); alternatively, Breza et al. (2018) demonstrate the existence of norms driven wage floors), or if households incur convex costs from working off farm (due to distaste from working for others). Alternatively, the market failure may only apply to a particular task, such as managerial labor.

After substituting in the constraints which bind with equality, we derive the following first order conditions

$$(M_k) \quad \left(1 + \frac{\mathrm{cov}(\sigma, u_c)}{E[u_c]}\right) A_k^H F^H_{kM} = (1 + \lambda_M) r \qquad \text{(S1)}$$

$$(L_k) \quad \left(1 + \frac{\mathrm{cov}(\sigma, u_c)}{E[u_c]}\right) A_k^H F^H_{kL} = (1 - \lambda_L) w \qquad \text{(S2)}$$

$$(z_k) \quad \left(1 + \frac{\mathrm{cov}(\sigma, u_c)}{E[u_c]}\right) A_k^H F^H_{kz} = F^B_{kz} - C^B_{kz} \qquad \text{(S3)}$$

$$(\ell) \quad \frac{E[u_\ell]}{E[u_c]} = (1 - \lambda_L) w \qquad \text{(S4)}$$

Supplementary Appendix B.1.2 Model featuring crop choice yields the same predictions

Input or labor constraints. To show that the model featuring crop choice yields the same predictions as the model without crop choice, we proceed in 3 steps. First, we define the plot level production function as the envelope of allocations of land across horticulture and bananas, conditional on input and labor choices. Second, we show that second derivatives of this envelope have the same signs as second derivatives of the plot level production function. Third, we note that our results on input and labor constraints in Supplementary Appendix B did not depend on properties of second derivatives of the plot level production function except their sign. Therefore, establishing that the signs of the second derivatives of the plot level production function are the same with and without crop choice is sufficient for results in Supplementary Appendix B on input and labor constraints to hold in a model featuring crop choice.

First, let $F_k(M_k, L_k; a) \equiv \max_z \; a F_k^H(M_k, L_k, z) + F_k^B(1 - z) - C_k^B(1 - z)$. Applications of the envelope theorem then yield $F_{kM} = A_k^H F^H_{kM}$, $F_{kL} = A_k^H F^H_{kL}$, and $F_{ka} = F_k^H$.

Second, in three steps we work through each of the second derivatives of $F$ that appear in Supplementary Appendix B.

First, we show $F_{kLL} < 0$ and $F_{kMM} < 0$. $F_{kLL} = A_k^H F^H_{kLL} + A_k^H F^H_{kLz} \frac{dz_k}{dL_k}$. An application of the implicit function theorem yields $\frac{dz_k}{dL_k} = -\frac{A_k^H F^H_{kLz}}{A_k^H F^H_{kzz} + F^B_{kzz} - C^B_{kzz}}$. We now make three substitutions. First, we substitute for $\frac{dz_k}{dL_k}$ in our expression for $F_{kLL}$. Second, we substitute $F^H_{kLL} < (F^H_{kLz})^2 / F^H_{kzz}$. Third, we substitute $F^B_{kzz} - C^B_{kzz} < 0$. These substitutions and simplification yield $F_{kLL} < 0$.
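The three substitutions for $F_{kLL}$ can be combined in a single chain of inequalities. The following is only a sketch: the shorthand $D_k \equiv A_k^H F^H_{kzz} + F^B_{kzz} - C^B_{kzz}$ is introduced here for illustration and does not appear in the original, and the chain assumes $A_k^H > 0$ (the plot has irrigation access).

```latex
\begin{align*}
F_{kLL} &= A_k^H F^H_{kLL} + A_k^H F^H_{kLz}\frac{dz_k}{dL_k}
         = A_k^H F^H_{kLL} - \frac{\big(A_k^H F^H_{kLz}\big)^2}{D_k}, \\
D_k &< A_k^H F^H_{kzz} < 0
  \quad\Rightarrow\quad
  -\frac{\big(A_k^H F^H_{kLz}\big)^2}{D_k}
  \le -\frac{A_k^H \big(F^H_{kLz}\big)^2}{F^H_{kzz}}, \\
F_{kLL} &< A_k^H \frac{\big(F^H_{kLz}\big)^2}{F^H_{kzz}}
          - A_k^H \frac{\big(F^H_{kLz}\big)^2}{F^H_{kzz}} = 0.
\end{align*}
```

The second line uses $F^B_{kzz} - C^B_{kzz} < 0$, and the third uses $F^H_{kLL} < (F^H_{kLz})^2 / F^H_{kzz}$, which follows from strict concavity of $F_k^H$ in $(L, z)$ together with $F^H_{kzz} < 0$.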
An identical argument implies $F_{kMM} < 0$.

Second, we show that $F_{kLa} > 0$ and $F_{kMa} > 0$. $F_{kLa} = F^H_{kL} + A_k^H F^H_{kLz} \frac{dz_k}{dA_k^H}$. An application of the implicit function theorem yields $\frac{dz_k}{dA_k^H} = -\frac{F^H_{kz}}{A_k^H F^H_{kzz} + F^B_{kzz} - C^B_{kzz}} > 0$. This yields that $F_{kLa} > 0$. An identical argument implies $F_{kMa} > 0$.

Third, we show that $F_{kLM} > 0$. $F_{kLM} = A_k^H F^H_{kLM} + A_k^H F^H_{kLz} \frac{dz_k}{dM_k}$. An application of the implicit function theorem yields $\frac{dz_k}{dM_k} = -\frac{A_k^H F^H_{kMz}}{A_k^H F^H_{kzz} + F^B_{kzz} - C^B_{kzz}} > 0$. This yields that $F_{kLM} > 0$.

As these are all the second derivatives of the plot level production function that entered into our results in Supplementary Appendix B, as noted above this completes the proof.

Insurance constraints. Absent perfect insurance and incorporating crop choice into the model, households may now respond to changes in productivity on the sample plot by either shifting land allocated to horticulture or shifting labor allocated to horticulture on each plot. This significantly complicates expressions, so we simplify by abstracting from the choice of inputs and labor and assuming $F_k^H(m, l, z) = F_k^H(z)$. We therefore focus only on statics of substitution related to land allocated to horticulture. As land and labor are complements in the production of horticulture, we expect these results to be robust to allowing for labor and inputs to enter the production function for horticulture. Lastly, to simplify notation, we write $\Pi_k^B \equiv F_k^B - C_k^B$.

As in Supplementary Appendix B, we abstract from labor supply and remove leisure from the model. Households therefore solve

$$\max_{z_1, z_2} E[u(c)]$$

$$\sigma\left(A_1^H F_1^H(z_1) + A_2^H F_2^H(z_2)\right) + \Pi_1^B(1 - z_1) + \Pi_2^B(1 - z_2) + wL + rM = c$$

Next, we define $Z_k = -\Pi_k^B(1 - z_k)$. Further, define $F_k^{H*}(Z) = F_k^H\left(1 - (\Pi_k^B)^{-1}(-Z)\right)$. As $\Pi_k^B$ is concave, its inverse is convex, so one minus its inverse is concave. $F_k^H$ is concave and increasing.
Therefore, $F_k^{H*}$ is concave and increasing. We then rewrite the above problem as

$$\max_{Z_1, Z_2} E[u(c)]$$

$$\sigma\left(A_1^H F_1^{H*}(Z_1) + A_2^H F_2^{H*}(Z_2)\right) - (Z_1 + Z_2) + wL + rM = c$$

However, this is identical to the setup for insurance constraints in Supplementary Appendix B. Therefore, all results for insurance constraints still hold under the additional assumptions made in Supplementary Appendix B: $\frac{dZ_2}{dA_1^H} < 0$, and $\lim_{M \to \infty} \frac{dZ_2}{dA_1^H} = 0$.

Lastly, note that $dZ_k = \Pi^B_{kz}\, dz_k$. As $\Pi^B_{kz} > 0$, and $\lim_{M \to \infty} \Pi^B_{kz} > 0$, it holds that $\frac{dz_2}{dA_1^H} < 0$ and $\lim_{M \to \infty} \frac{dz_2}{dA_1^H} = 0$. This completes the proof for allocations to horticulture. Lastly, as noted above, we expect these results to be robust to allowing for labor and inputs to enter the production function for horticulture.

Supplementary Appendix B.1.3 Extensions in model with crop choice

We note three extensions in the model with crop choice. First, as dry season productivity under horticulture is 0 on plots outside the command area, during the dry season households do not cultivate horticulture, apply labor, or apply inputs to plots outside the command area. As a result, we do not expect to see any substitution from most important plots that are outside the command area, a test we implement in Section 5. Second, that our predictions are robust to flexibly modeling the costs associated with adopting bananas implies our model predictions are robust to any within-plot across-season spillovers caused by the fact that bananas are a perennial crop. Third, predictions on across-plot substitution of labor and inputs in Section 4 now also hold for land allocated to horticulture; intuitively, this is because horticulture is strongly complementary to labor and inputs. This provides an additional theoretical justification for our consideration of horticulture and irrigation as outcomes in our analysis of substitution in Section 5.