AUTHOR ACCEPTED MANUSCRIPT

FINAL PUBLICATION INFORMATION

Should African Rural Development Strategies Depend on Smallholder Farms? An Exploration of the Inverse-Productivity Hypothesis

The definitive version of the text was subsequently published in Agricultural Economics, 45(3), 2013-09-03. Published by Wiley and found at http://dx.doi.org/10.1111/agec.12070

This Author Accepted Manuscript is copyrighted by the World Bank and published by Wiley. It is posted here by agreement between them. Changes resulting from the publishing process, such as editing, corrections, structural formatting, and other quality control mechanisms, may not be reflected in this version of the text.

© 2015 The World Bank

Should African rural development strategies depend on smallholder farms?
An exploration of the inverse productivity hypothesis

Donald F. Larson1,2, Keijiro Otsuka3, Tomoya Matsumoto3 and Talip Kilic2

Abstract

In Africa, most development strategies include efforts to improve the productivity of staple crops grown on smallholder farms. An underlying premise is that small farms are productive in the African context and that smallholders do not forgo economies of scale – a premise supported by the often observed phenomenon that staple cereal yields decline as the scale of production increases. This paper explores a research design conundrum that encourages researchers who study the relationship between productivity and scale to use surveys with a narrow geographic reach in order to produce more reliable results, even though results are better suited for policy decisions when they are based on data that are broadly representative. Using a model of endogenous technology choice, we explore the relationship between maize yields and scale using alternative data. Since rich descriptions of the decision environments that farmers face are needed to identify the applied technologies that generate the data, improvements in the location specificity of the data should reduce the likelihood of identification errors and biased estimates. However, our analysis finds that the inverse productivity hypothesis holds up well across a broad platform of data, despite obvious shortcomings with some components. It also finds surprising consistency in the estimated scale elasticities.

Keywords: agriculture, inverse productivity, farm size, smallholders, Sub-Saharan Africa, technology choice

JEL classifications: O13, Q12, Q16, Q18

Acknowledgments

The authors would like to thank Gerald Shively and two anonymous reviewers for their helpful suggestions on early drafts. The authors also gratefully acknowledge support from the World Bank’s Research Supplemental Budget.
The paper draws on data collected under the Living Standards Measurement Study - Integrated Surveys on Agriculture project, an ongoing initiative funded in part by the Bill & Melinda Gates Foundation. The authors would like to thank Amparo Palacios-Lopez for her help with the Malawi and Tanzanian datasets.

1 Corresponding author. Fax: 202-522-1151. E-mail address: DLarson@worldbank.org
2 World Bank - Development Research Group, 1818 H Street, NW, Washington, DC 20433, USA.
3 National Graduate Institute for Policy Studies, 7-22-1 Roppongi, Minato-Ku, Tokyo, Japan 106-8677

Inventing and disseminating technologies that boost yields on smallholder farms is a central pillar of rural development strategies in Africa. Moreover, on-farm yields are seen as a measure of whether government policies have been successful, and yields are used to evaluate new varietal research. To be sure, there are other components of rural development strategies, especially components related to health, education and nutritional safety nets. But agriculture remains the economic engine of rural Africa and the livelihoods of the rural poor depend on it. As a consequence, a dominant view among policy makers is that boosting yields is the best way to enhance rural incomes in the short-to-medium term and the goal of boosting yields on smallholder farms is advanced by the World Bank, the Food and Agriculture Organization of the United Nations, the CGIAR institutions, most African governments, and pan-African institutions. A key basis for the strategy is evidence that small farms are more productive than large farms in Africa when it comes to growing staple crops, a proposition known as the inverse-yield, or inverse-productivity relationship.1

Despite widespread institutional support, the debate over whether this emphasis on smallholder agriculture is wise remains active.
Sometimes, the arguments take on normative tones. For example, Collier (2008) charges that the development community has stressed less-innovative smallholder agriculture over more-productive commercial agriculture because of an overly romantic view of peasant farming. Hazell et al. (2010) counter that promoting smallholder agriculture is a more equitable approach to rural development, as well as a more efficient one. Lipton (2006) argues that emphasizing smallholder development partly compensates for policies in rich and poor countries that are, on balance, urban-biased. Still, beyond issues of motivation and equity, significant scope for positive debate remains because of uncertainty about the design features of the current body of empirical research, a topic which this paper explores. On the face of it, emphasizing smallholder technologies as a means of boosting agricultural productivity seems a narrow strategy. However, proponents of smallholder-based rural development emphasize that the prevalence of small farms is not accidental, but an optimizing outcome in the face of imperfect land and labor markets. 
A central tenet of the reasoning behind this approach is that farmers are optimizing agents – albeit within the constraints of limited resources and weak land and labor markets – since this is needed to explain why farmers would willingly adopt yield-improving technologies, even as the sector as a whole seemingly fails to employ the larger-scale technologies associated with productive commercial farming on other continents. All of this does not preclude the possibility that large farms have a place in land-abundant areas. Still, most land currently farmed in Africa is occupied by smallholder households, and it is hard to envision a path of structural change that would bring about an accelerated transformation of Africa’s agricultural sector.2 Instead, decades of sustained effort are likely needed to build a more capable and better educated labor force as well as the strong institutions needed for secure land markets, both of which are needed for structural change. Consequently, a primary path to a better life for this generation of rural poor lies with technologies that boost productivity on smallholder farms.

1 The following quotes are indicative of the institutional support for smallholder-led strategies: from the President of the World Bank “Eighty-six per cent of staples in poor areas come from local sources, so support for country-led efforts to bolster smallholder agriculture is critical.” (Zoellick, 2011); from the Director General of FAO, “Sustainable intensification of smallholder crop production is one of FAO’s strategic objectives.” (Diouf, 2011); from the Director General of IFPRI, “G20 Ministers of Agriculture must focus on smallholder farmers to achieve food security and prevent food price volatility” (Fan, 2011); from the pan-African Alliance for a Green Revolution in Africa, “AGRA works to achieve a food secure and prosperous Africa through the promotion of rapid, sustainable agricultural growth based on smallholder farmers.” (AGRA, 2011); and from Bill Gates, co-chair of the Bill and Melinda Gates Foundation (Gates, 2012) “If you care about the poorest, you care about agriculture. Investments in agriculture are the best weapons against hunger and poverty, and they have made life better for billions of people.”

2 Past efforts to organize large-scale farming in Africa include notable failures in Sudan (Eicher and Baker 1992) and Tanzania (Lane and Pretty 1991). Land transfers in Africa accelerated between 2004 and 2009; however, Deininger and Byerlee (2012) note that often the transferred lands were not used, or were used in ways inconsistent with agreed-upon investment plans. The authors also point out the perils of policies that facilitate land accumulations when land market institutions are weak.

While the appropriateness of a smallholder-led strategy for Africa is debated, there is less controversy about whether the strategy can be effective in general, largely because of Asia’s Green Revolution experience. Like Africa, and unlike other regions, agriculture in Asia is smallholder based, and the scale of farming in many Asian countries has declined during recent decades. Yet, this occurred while yields increased at a remarkable pace, especially for staple grains like wheat and rice.3 What’s more, when the Green Revolution began in Asia, poverty was as pervasive in rural areas as it is currently in many parts of Africa. As the new varieties of staple grains spread through Asia, associated productivity gains raised farm income and spilled over to spur growth in the non-farm sectors of rural communities; because of the large share of income spent on food by the poor, a resulting productivity-led decline in real food prices reduced urban poverty as well.4 For these many reasons, the Asian experience factors heavily in the design of African rural development strategies (Otsuka and Larson, 2012).

3 von Braun (2005) reports that the average farm size in both Africa and Asia is about 1.6 hectares, while farms average 27 hectares in Western Europe, 67 hectares in Latin America and the Caribbean, and 121 hectares in Canada and the United States. In Asia, farm sizes have been constant to declining in recent decades. For example, the average farm size in China was 0.6 hectares in 1980 and 0.4 hectares in 1999. In India, average farm size declined from 2.3 hectares in 1971 to 1.4 hectares in 1994-96. In Indonesia, the average farm size was 1.1 hectares in 1973 and 0.9 hectares in 1993 (Nagayets 2005). At the same time, data from FAO (2012) show that average rice yields grew by 2.6 percent annually in China from 1961 to 2001 and at 2.1 percent and 2.7 percent annually in India and Indonesia. Annualized growth rates for wheat were 4.6 percent for China and 3.2 percent for India. In contrast, average maize yields in Sub-Saharan Africa grew at less than one percent per year from 1961 to 2010.

4 The consequences for poverty were global. According to the World Bank (2008, p.3), the primary reason why the global $1-a-day poverty rate fell from 28 percent in 1993 to 22 percent in 2002, was an 8 percent decline in rural poverty – mostly due to improved rural living conditions. Moreover, the largest decline was in East Asia, driven in part by Asia’s Green Revolution (Ravallion, Chen and Sangraula, 2007).

And there is much riding on the outcome; without productivity gains, food output in Africa will stagnate as land frontiers close. If global food prices rise as a consequence, the urban poor and the many rural poor who are not food self-sufficient will suffer.
If productivity rises elsewhere and prices continue to decline, then the farm incomes of Africa’s smallholders will fall, pushing them deeper into poverty.

In this paper we revisit the inverse productivity phenomenon in the context of African maize, a staple crop important to African policy makers, and explore a conundrum faced by researchers investigating the phenomenon.5 To be relevant for policy, the inverse relationship between productivity and scale should hold broadly across large and diverse geographic areas, since the strategy depends on national and regional efforts. However, credible tests of the hypothesis require fine and detailed data, which are generally associated with homogeneous settings.

5 See Smale, Byerlee and Jayne (2012) for a discussion of maize and maize policies in Africa.

To explore this conundrum, we first develop a statistical model based on Mundlak’s (1988, 1993) insight that agricultural production data implicitly arise from a mix of applied technologies, which are in turn endogenously determined by the differing circumstances that frame on-farm decisions. In this, local conditions are important, so relevant information is gained as the unit of observation moves from macro to micro, even as the policy relevance of the results weakens. We estimate the model using datasets of varying granularity and geographic scope, allowing us to explore how trade-offs between spatial diversity and reductions in omitted information affect statistical tests of the inverse-productivity hypothesis.

Related literature

An important component of the empirical work on the consequences of smallholder agriculture for productivity has to do with observations that small farms seemed to be more productive than larger farms in Asia and in Africa. Writing in 1946 about differences in rubber yields, P.T. Bauer notes that “Measured by long-period supply price, the smallholders are the more efficient class, though their methods necessarily differ from those of the estates (p. 391).” Using data from Uttar Pradesh, Mazumdar (1965) finds that the shadow price of family labor was lower for smallholders, leading to higher labor inputs and yields. Sen (1966) makes the same point, citing government studies from India. Carter (1984) does as well, based on observations from Haryana, India, even after adjusting for selection bias and for differences in village and soil characteristics. Also using data from India, Bhalla (1979) finds an inverse relationship, even though small farmers in his sample were less likely to adopt modern technologies. Bardhan (1973) notes that land and other market imperfections can work to generate inverse yield relationships; he also points out that differences can arise among crops because of crop-specific differences in labor and management practices. Yotopoulos and Lau (1973) argue that Indian smallholders were more efficient because owner-managers had an advantage in supervision and leadership. Taslim (1989) makes the same point using data from Bangladesh. Binswanger and Rosenzweig (1986) and Binswanger and McIntire (1987) argue that family farms are the predominant organizational structure for agricultural sectors in places where containing transaction costs is more important than economies of scale. In one of the few conceptual treatments of the problem based on first principles, Feder (1985) shows that supervision costs and credit availability can generate systemic relationships between yields and farm size in a conceptual model. Lipton (2006) also notes that high transaction costs related to output markets that encourage self-sufficiency also favor small farms.
In a similar way, Barrett (1996) argues that food security concerns induce smallholder farmers to supply added labor as a risk-mitigation strategy.6

Because the technology, crop and market characteristics driving farmer choices and influencing the organizational structure of agriculture vary, differences in empirical studies might be expected. In an early study that employed cross-sectional data, Deolalikar (1981) found differences in the relationships between farm size and productivity that were also associated with on-farm technology choices. Examining data from land-abundant Western Sudan, Kevane (1996) found that wealth rather than farm size mattered most for productivity and labor intensity. U-shaped relationships were reported by Carter and Wiebe (1990) for Kenya and Heltberg (1998) for Pakistan. Moreover, there is evidence that as the circumstances conditioning farmer decisions change, optimal scale outcomes shift as well. For example, Foster and Rosenzweig (2011) show that increased mechanization in India may be shifting incentives toward larger scale farms.

Another line of reasoning suggests that the inverse yield relationships are statistical artifacts. A compelling argument is that smallholder lands are naturally more productive since households are less likely to sell or rent-out their highest quality land. Consequently, omitting measures of land quality can systematically bias empirical results (Bhalla and Roy, 1988; Benjamin, 1995; Lamb, 2003; Assuncao and Braido, 2007). A related argument is that inverse yield relationships are driven by measurement errors due to systematic tendencies in self-reported area and yields (Lamb, 2003). Empirical evidence is thin, but a study by Barrett et al. (2010) that includes soil measurements in Madagascar finds no evidence of systematic bias for measured yields.
Similarly, a recent study that incorporates self-reported and GPS land measurements in Uganda concludes that measurement errors work against rather than in favor of the inverse yield hypothesis (Carletto et al., 2011).

While the large literature on the inverse-yield phenomenon encompasses diverse explanations, a unifying theme is that local features of markets and geography, and the transaction costs they engender, matter since they influence farmers’ choices about what they produce and how they go about producing it. This creates a practical tension for applied work. On the one hand, a significant spatial variation in collected data is desirable in order to generate exploitable variation in the market conditions. At the same time, precise and comparable data is needed in order to generate robust and convincing empirical results. In general most research addresses the latter need, and consequently most empirical results are based on data with limited geographic reach.7 In this paper we explicitly recognize the trade-off between spatial diversity and measurement precision by using disparate datasets to estimate a common statistical model.

6 See Kimhi (2006) for a concise description of the set of arguments motivating applied studies of the inverse-yield phenomenon.

7 An early exception is Berry and Cline’s (1979) cross-country study.

Conceptual and applied model

The starting point for the conceptual model is Mundlak’s (1988) notion that applied technologies stem from endogenous choices and that this should guide empirical models. The idea is consistent with agricultural sectors comprised of multiple and disparate farming methods in developing countries and also the significant differences in applied technologies apparent across countries for the same crop. It is also conformable with the common-sense notion that natural endowments, such as climate and soils, influence how farmers grow their crops.
In developing countries, the relative scarcity of land and labor is often key. For example, when land is abundant and labor scarce, shifting agriculture can be the most cost-effective way to produce subsistence crops, conserving both labor and non-labor inputs. Conversely, fixed farming systems predominate in areas where populations are denser and land for farming is scarce. Population density often signals better market access as well, which creates the opportunity for farmers to produce beyond subsistence levels.8 Layered on top of these fundamental structural drivers, the weak performance of land and labor markets has its own consequences for production choices.9

In the broadest physical and economic sense, a technology is a mapping between planned inputs and expected outputs, and as such, an applied technology reveals a strategic choice. This has conceptual and practical implications when the decision environment is heterogeneous – that is, when the endowments, prices and constraints farmers face vary. Since the set of optimization problems behind the data vary, relationships measured in the data vary accordingly. For empirical work, this implies that outcomes from multiple applied technologies are implicit in the data. In turn, there must be an adequate accounting of the different underlying relationships, primal and dual, between inputs and outputs before inferences can be made, including inferences about scale effects.

Conceptual model

To see this latter point, let characterize the full set of optimization problems faced by a given population of farmers, conditional on a set of available technologies, where the vector fully describes exogenous (non-choice) state factors, including input and output prices, , that influence farmers’ choices.10 In the simple case, when farms and farmers are identical and face the same set of prices and other state factors, supply and input demand schedules can be recovered in a straightforward way at the solution values of via the envelope theorem. In turn, an empirical model can be constructed, based on a homogeneous depiction of the farm problem as embodied in a uniform production function.

8 See Binswanger (1991), and Pingali, Bigot and Binswanger (1987) for good discussions of how land and labor scarcity affect technology choices.

9 See Deininger and Feder (2001) for a review of land institutions and the performance of land markets. See Larson and Mundlak (1997) for a discussion about the slow pace of adjustment between agricultural labor and labor employed in other sectors.

10 For a more detailed discussion, see Mundlak, Larson and Butzer (1999).

In an applied setting, the conditions farmers face are heterogeneous – that is, the set of feasible technologies, relevant prices, endowments and other state variables that condition choice vary. This means that the state variables of the general function should be subscripted to denote the different states, or decision environments, relevant to a particular choice and related outcomes. One implication is that standard supply and input demand schedules are only available for classes of farmers, where within-class farmers face the same-valued state variables. With this in mind, let , where is the set of all possible decision environments, and where the dimension of the vector equals the minimum number of state variables needed to fully describe the decision problem. Using the Envelope Theorem, the input demand schedules are given by , and the supply function, given at solved values of by , where are i-specific shadow prices for inputs and output. The solution input and output choices can be written as , where = 1, 2, 3…I, represents each of the relevant decision environments. In general, .
Note that the optimization problem can be thought of as a choice about which available technology to apply, that is, , where is an element of , the set of all feasible mappings between inputs and output across all possible states.

Applied model

The derivation of the applied model starts with what has become a common practice of asserting a two-stage decision process in which a farmer first allocates land among activities and then chooses the rates of application for the remaining inputs.11 This convenient device is used to develop an applied model in which yield is modeled in terms of planted area and a function of how intensely remaining factors are used. To derive the model, start by partitioning the vector of inputs so that , where is harvested area and is a vector of the remaining inputs contained in . The production identity can be written in logs as , where are parameters associated with the applied technology that generated the conditional solution values . Dividing through by land gives , where represents planned yields. To accommodate non-zero input values, we use a proportional linear functional to represent the scaled intensity term, that is , where for . Looking ahead to the statistical model, let observed yields differ from planned yields by random error , which also depends on the applied technology choice, so that . Assembling the parts and expressing the model in terms of observables results in the applied statistical yield model, . Rearranging the yield model provides: 1) where and , and where .

11 See Antle (1983) and the related discussion in Kimhi (2006).

Equation 1) can also be reordered to provide a model of the factor intensity for other inputs associated with the same applied technology. For a given , the intensity function can be expressed as , where . To save on notation, let .
After using the expanded form of to derive an expanded version of equation 1), the result can be rearranged to yield: 2)

Identification

Estimating equations 1) and 2) is straightforward when the applied technologies can be fully identified using observable state variables. When this is the case, although a mixture of technologies may have generated observed outcomes, collected data can be sorted and segregated into technology-specific sub-samples and subsequently used to estimate a set of equations that are consistent with the underlying production function associated with each sub-sample.

This method of identification simplifies the estimation problem in two ways. First, it eliminates the potential problem that data associated with multiple parameter sets is mistakenly used to estimate a model with constant parameters. Second, knowing that each observation in the data is associated with the same solution values permits a simplification of the applied model. Although yields are stochastic, the expected values of the solution variables are fully identified by the state variables. As a consequence, components of the statistical model can be represented by a function of the state variables if the associated parameters on the decision variables are not of special interest. Specifically, since the identical decision variables lead to the same choice. The same argument holds for yields. Consequently, with sufficiently accurate measurement and full identification, the applied models reduce to: 3) and 4) where and .

In applied settings, the task of sorting out co-mingled technologies in the data is left to indirect methods. One approach is to use information about the spreads in input solution values as an identification strategy, since they relate to differences in choice (Mundlak, Butzer and Larson, 2012); a second approach is to estimate a flexible form that permits an interaction between observed decision and state variables (Larson and Leon, 2006).
An additional strategy, which can be used in combination with other approaches, is to limit the data used to estimate the statistical model in a way that reduces variation in the state variables, and thereby mitigates the problems associated with falsely asserting that a single production function gave rise to the observations. Taken as a whole, the evolution of inverse productivity studies reviewed earlier reflects this latter strategy.

Production decisions taken by smallholder farmers constitute one choice in a complex, conditioned livelihood strategy and, therefore, the number of state variables needed to adequately describe the smallholder decision environment is large. Data that is most representative of an agricultural sector implicitly contain outcomes based on the largest set of decision environments, not all of which are observed. For this reason, parameters estimated using broad datasets and conclusions drawn from them are vulnerable to omitted-variable problems. A simple way to address this problem is by reducing the unit of observation to eliminate variations in some of the state variables, and one unambiguous way of doing so is by narrowing the geographic focus of the study. For example, variations in the performance of institutions or market access, which might be problematic in regional studies, might be reasonably ignored in data drawn from a single village. What’s more, differences among household wealth or stocks of human capital can be treated as constant in studies based on within-household plot observations. Said more succinctly, the geographic dimensions of the state variables differ and the variation in some state variables disappears as the geographic scope of the sample narrows. Consequently, the challenge of addressing the omitted variable problem is reduced as the geographic range of the observations narrows.
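The omitted-variable argument can be illustrated with a small simulation. The sketch below is not the authors' estimation code and uses no survey data. It assumes a hypothetical data-generating process in which an unobserved location-level state variable (labeled `state`) raises both planted area and yields, and it assumes a true scale elasticity of -0.2 (`TRUE_ELASTICITY`). The direction of the resulting bias is a consequence of these assumptions only. Under them, a pooled regression of log yield on log area confounds the scale effect with the location effect, while demeaning within locations (location fixed effects) removes the unobserved state variable.

```python
import random

TRUE_ELASTICITY = -0.2  # assumed for the simulation; not an estimate from the paper


def simulate(n_locations=50, farms_per_location=40, seed=0):
    """Return (location, log_area, log_yield) rows from a hypothetical process."""
    rng = random.Random(seed)
    rows = []
    for loc in range(n_locations):
        # Unobserved location-level state variable (for example, market access).
        state = rng.gauss(0.0, 1.0)
        for _ in range(farms_per_location):
            # Assumption: farms in better-served locations plant more land.
            log_area = 0.8 * state + rng.gauss(0.0, 0.5)
            log_yield = TRUE_ELASTICITY * log_area + state + rng.gauss(0.0, 0.3)
            rows.append((loc, log_area, log_yield))
    return rows


def ols_slope(xs, ys):
    """Slope from a one-regressor least-squares fit with an intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def pooled_slope(rows):
    """Ignore location, so the state variable is an omitted variable."""
    return ols_slope([r[1] for r in rows], [r[2] for r in rows])


def within_slope(rows):
    """Location fixed effects: demean area and yield within each location."""
    by_loc = {}
    for loc, x, y in rows:
        by_loc.setdefault(loc, []).append((x, y))
    dx, dy = [], []
    for obs in by_loc.values():
        mean_x = sum(x for x, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        dx.extend(x - mean_x for x, _ in obs)
        dy.extend(y - mean_y for _, y in obs)
    return ols_slope(dx, dy)


if __name__ == "__main__":
    rows = simulate()
    print("pooled slope:", round(pooled_slope(rows), 3))
    print("within-location slope:", round(within_slope(rows), 3))
```

In this hypothetical setup the within-location slope stays close to the assumed elasticity, whereas the pooled slope absorbs the location effect. Narrowing the unit of observation plays the same role in the text: the state variable no longer varies within the group, so it no longer needs to be measured.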
Expressed in terms of the applied model, the challenge of finding unbiased estimates of and becomes more manageable as the geographic scope of the data used to estimate the model becomes narrower since the dimension of is reduced.

The research conundrum

As discussed, the notion that land productivity falls as land scale rises is taken as evidence that the small scale of farming prevalent in Asia and Africa is consistent with optimizing farm technology choices. It also motivates two important classes of policy instruments: i) instruments designed to mitigate the pervasive constraints conditioning current choices about technology, and ii) investments in basic and applied research that supplement the set of technologies available to smallholder farmers. Expressed in terms of the applied model, policy makers are interested in values of and that are indicative of the national or regional farming practices since the policy instruments they wield are national or regional in nature. Additionally, policy makers are interested in the parameters associated with states, especially education, health and market access, since these are fundamental drivers of productivity levels and can, in turn, be changed through investments and policy.

Herein lies the conundrum for researchers. The most useful estimates for policy are the most challenging to produce since the number of implicit technologies grows as the dataset becomes more representative of the farming sector as a whole, as does the dimension of . Consequently, the opportunities for omitted variable problems abound with spatially diverse datasets, which increases the risk of biased estimates and unreliable parametric tests. Potentially, a reliable national notion of could be built up from a mosaic of well-done studies based on micro-survey data or via experimental-design studies.
Even so, a separate research approach would be required to provide information about the role of state variables in productivity outcomes, since these methodologies do not simultaneously provide parameter estimates for an extended set of state variables.12

Empirical results

The remainder of the paper is an empirical exploration of what happens to estimates of and as the geospatial scope of datasets used to estimate the parameters narrows. In particular, we use location fixed effects to proxy the decision environments characterized by the state variables , which in turn become more place-specific as the unit of observation narrows. Relying on three groups of data, we first use area and yield observations to obtain estimates of equation 3), where fixed effects are used to estimate . The first of the three groups is a meta-panel built from combining household surveys from ten countries in Sub-Saharan Africa. Taken together, the combined surveys include over 62,000 observations (Table 1). Country coverage for the second dataset is described in the center columns of Table 1. This dataset is also a meta-panel constructed by combining over 8,000 plot-level observations from five countries.13 Four surveys that contain detailed plot-level production data comprise the third group, described in the far right columns of the table. We use these data separately to estimate versions of equation 3), and extend the analysis to look at the relationship between scale and the intensity of other factors, based on equation 4), where location-specific fixed effects are used to estimate .

Regressions using combined surveys

As discussed, the first two regressions utilize the meta-panel datasets. Though the surveys contain limited information about the farming methods employed, the datasets contain outcomes across a wide and divergent geography, with differing agro-climatic conditions, transportation networks and political institutions.
The data also span considerable time, which potentially implies a change in available technologies. For each household, the total area under crop was calculated and used to classify the scale of farming into four quantiles from small to large. Summary statistics about farm size, area planted to maize, and yields by each of the four categories are given in Table 2. Although average yields are higher in the countries included in the plot-level meta-panel, the scale of farming is quite similar, with more than three-quarters of the farms planting less than two hectares to crops.

12 As discussed, micro-studies limit the geographic scope of the analysis as a way of controlling the omitted variable problem, thereby limiting the opportunity to measure the impact of state variables. Often experimental-design studies have a narrow geographic scope as well, but this is not necessarily the case. Still, the types of state variables relevant for policy are omitted from experimental-design studies as well, since the purpose of the design is to make sure that omitted state variables are uncorrelated with the intervention of interest, obviating the need to estimate their effects.

13 The datasets are distinct; that is, plot-level observations were not aggregated and included in the household meta-panel. Nor are the plot-level observations included in the third collection of detailed surveys.

Table 3 reports estimates from the fixed-effects model (equation 3) for each of the two meta-panels. In the case of the household data, yields represent an average based on an aggregation over plots. The fixed effects used to represent the farmers’ decision environment are constructed from country and survey-year identifiers, and therefore do not contain information about in-country differences. In contrast, household identifiers are used to capture differences among household decision environments when the plot-level data are used.
In this case, the model explains the remaining variation in plot yields, and the omitted information has to do with unobserved plot-level differences, including soil fertility. In the empirical literature, this distinction can be lost, since studies typically use one type of data or the other. However, the difference may be relevant for policy, since studies based on household data investigate the average relationships between yields and the scale of production, while the plot-based studies investigate the effects of fragmentation of farms on yields.

Nevertheless, despite a significant difference in the granularity of the data, the regression outcomes are remarkably similar. In both cases, there is a negative correlation between land and yields; the correlation estimates are both statistically significant and are of similar magnitude (-0.161 versus -0.206). Results reported in the lower panel of the table under the heading “Core 90 percent” show that the results are not especially affected by extreme values; tossing out the top and bottom 5 percent of yields in the samples lowers the estimated correlations in absolute terms (to -0.135 and -0.178), but the changes are quantitatively small and tests of significance are unaffected.

Detailed datasets

In this section we turn our attention to the four datasets containing detailed production data at the plot level. The datasets are the result of two survey efforts. The first two datasets are derived from the Malawi Third Integrated Household Survey (IHS3) and the Tanzania National Panel Survey (TZNPS), products of collaboration between the World Bank and the Governments of Malawi and Tanzania as part of the Living Standards Measurement Study – Integrated Surveys on Agriculture initiative. The IHS3 was conducted from March 2010 to March 2011 by the Malawi National Statistical Office, covering 12,271 rural and urban households across Malawi. The TZNPS is a biennial survey conducted by the Tanzania National Bureau of Statistics (NBS).
The first round, implemented between September 2008 and October 2009, is used here. The TZNPS is based on a stratified two-stage cluster sample design. The total sample size is approximately 3,200 households. The sample was designed to be nationally representative, and also representative at the level of the major agro-ecological zones, and for Dar es Salaam, other mainland urban areas, mainland rural areas, and Zanzibar.

The second two datasets are the result of the REPEAT project, an ongoing longitudinal rural household survey in Kenya and Uganda run by the National Graduate Institute for Policy Studies (GRIPS) and the Foundation for Advanced Studies on International Development (FASID) in Japan, in collaboration with the TEGEMEO research institute in Kenya and Makerere University in Uganda. We use the survey data from 2004 and 2007 for Kenya, covering 890 households, and from 2003 and 2005 for Uganda, covering 940 households.14

For each country, surveyed farms are classified into one of four quantiles based on farm size; basic information on yields, farm size, plot size and number of plots is given in Table 4. As the surveys confirm, household farms are quite small in all four countries. In the case of Malawi, only about 22 percent of the surveyed farms exceeded one hectare. The surveyed farms were slightly larger in Kenya, Tanzania and Uganda, but the average size for the largest class of farms was less than five hectares in two of the three countries and only slightly above five hectares in Tanzania. The surveys also report how farmed land is broken into still smaller plots. There is a tendency for plots to be larger on large farms, but multiple plots are common regardless of farm size. Looking at the last column of Table 4, the inverse yield relationship is not always readily apparent in the average yields across classes of farm size, except in Tanzania. Often, a U-shaped relationship is observed instead.
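
The classification step behind Tables 2 and 4 can be illustrated with a short sketch. It is only an illustration under stated assumptions: it assumes equal-count quartiles computed from the sample distribution of farm size (the cut points actually used for the surveys are not stated here), and the function names and the eight farms below are hypothetical.

```python
import numpy as np


def farm_size_quartiles(farm_size):
    """Assign each farm to a size class from 1 (smallest) to 4 (largest)."""
    farm_size = np.asarray(farm_size, dtype=float)
    # Interior cut points at the 25th, 50th and 75th percentiles (assumed rule).
    cuts = np.quantile(farm_size, [0.25, 0.50, 0.75])
    # Farms exactly on a cut point fall into the lower class.
    return np.searchsorted(cuts, farm_size, side="left") + 1


def mean_by_group(values, groups):
    """Average of `values` within each distinct value of `groups`."""
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    return {int(g): float(values[groups == g].mean()) for g in np.unique(groups)}


# Hypothetical data: farm size in hectares and maize yield per hectare.
farm_ha = np.array([0.2, 0.4, 0.6, 0.9, 1.1, 1.6, 2.5, 5.0])
maize_yield = np.array([1800.0, 1600.0, 1400.0, 1300.0,
                        1250.0, 1300.0, 1350.0, 1400.0])

quartile = farm_size_quartiles(farm_ha)
avg_yield = mean_by_group(maize_yield, quartile)
for q in sorted(avg_yield):
    print(q, avg_yield[q])
```

In this made-up sample the class averages first fall and then rise with farm size, which mimics the U-shaped pattern that simple group means can show even when a regression with controls finds an inverse relationship.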
Malawi

The Malawi data contain observations on 8,983 households growing maize on 12,078 plots, a significant subset of the roughly 12,000 households surveyed in IHS3. The survey reports on three types of labor: household-provided labor, hired labor and exchange labor, a reciprocal arrangement where neighbors help each other at crucial times in the growing season. The amount of chemical fertilizers, herbicides and pesticides used per plot is recorded, as well as whether or not organic fertilizer was applied.

Table 5 reports average input use per hectare by farm-size quantile in two ways for Malawi and the other countries analyzed in this section. On the left-hand side of the table, group-wise averages are reported inclusive of zero values. In the case of Malawi, the use of household labor rises and then falls. However, the differences are more muted in the columns to the right, once zero values are excluded from the averages. A clearer pattern of other factor use emerges as well. When hired or exchange labor is used at all, it is used more intensely on smaller farms, and the same is true for chemical fertilizers. The opposite is true for pesticides and herbicides, perhaps because the plots are weeded manually instead.

Table 6 provides information on the types of farming methods employed by scale of operation by country, with Malawi listed first. Surprisingly, the share of plots that are intercropped and the share of plots where hybrid seeds are used do not vary greatly with scale. Few households choose to employ farming technologies that rely on pesticides or herbicides, although farmers choose to use chemical fertilizers on most plots regardless of scale. The use of exchange labor is infrequent, but relatively constant across scale. As might be expected, larger farms are most likely to hire labor, although hired workers also work on the smallest farms and, when employed, are used more intensively (recall Table 5).
Although few plots were rented in total, farmers with the smallest farms were most likely to rent plots.

Table 7 contains regression results based on the applied model. As with the meta-panel data, the log of land planted to maize per plot was regressed on the log of yield, fixed household effects, and two additional indicators of technology: whether or not hybrid seeds were used and whether intercropping techniques were used on the plot. The exercise was repeated using the per-hectare intensity of the remaining inputs, based on the model given in equation 4. As discussed, the variables used in the regression were scaled, but not converted to logs, in order to accommodate zero-valued observations. However, to facilitate comparisons, elasticities are reported in the table based on estimated parameters and mean values.

14 Details about the REPEAT project and surveys are given by Yamano et al. (2004) and Yamano et al. (2005).

Turning to specific results from the regressions for Malawi, the estimated correlation between land and yield, at -0.349, is somewhat larger in absolute terms than the combined plot results in Table 3, and is also statistically significant. The same is true for household labor. In the case of hired and exchange labor, the elasticities are significantly different from zero, although quantitatively small, perhaps reflecting the fact that larger farms are more likely to bring in non-household labor, but use non-household labor less intensively. Once adjustments are made for intercropping and the type of seed used, the estimated chemical fertilizer elasticity is positive and significant, while the estimated elasticity for organic fertilizer is statistically indistinguishable from zero.

As before, the regressions were rerun after clipping the upper and lower 5 percent of the observations based on yield. The yield correlation is unaffected quantitatively or qualitatively. The same is true for the household labor and chemical fertilizer elasticities.
The non-household labor elasticities move closer to zero, and the exchange labor elasticity is no longer distinguishable from zero.15

Tanzania

The survey from Tanzania covers 1,786 maize plots on 1,272 farms, slightly more than half of the farms covered by TZNPS. Labor falls into two groups, household and hired, both measured in working days, and plot-level inputs are recorded for organic and chemical fertilizers and for herbicides and pesticides. The intensity of input use, on average and conditional on use, is reported by type of input and scale of production in Table 5. In contrast to the corresponding measures for Malawi, the intensity of labor use declines noticeably with farm size. The patterns for the remaining inputs are less pronounced.

While around half of the observed maize plots were intercropped in Malawi, about two-thirds of the maize plots from Tanzania were intercropped, a pattern that was consistent across farm size (Table 6). Few farmers in Tanzania used hybrid seeds (about 14 percent of the surveyed plots), and the use of hybrid seed varied little with scale. As in Malawi, farmers with the smallest farms are more likely to rent in plots, although the vast majority of small-scale farmers do not. About 27 percent of the smallest farms hired workers, a slightly smaller share than that of the largest category of farms, 36 percent of which entered the labor market. As a whole, few maize farmers relied heavily on either labor or rental markets, and farmers utilized neither market for about 64 percent of the plots.

15 The estimated intercepts and the estimated discrete impacts of intercropping and the use of hybrid seeds from the underlying regressions are given in an Annex table, which is available upon request.

Estimation results from the Tanzania data are given in Table 7. The estimated correlation between land and yield is statistically significant and, at -0.246, quite close to results based on the Malawi survey.
With the exception of the elasticity for hired labor, the results show that the inputs are land-saving and are in line with the results from Malawi. With the exception of the herbicide-pesticide elasticity, the remaining estimates are statistically distinguishable from zero. The elasticity on hired labor is positive and significant at just over the 5 percent level. This suggests that increasing the per-hectare use of hired labor is associated with larger plots of maize. This is not unreasonable, but it differs from the results in Malawi and from the negative relationships between factor intensification and land use found for the other inputs. As shown in the “Core 90 percent” panel of Table 7, this particular finding, and the results from Tanzania in general, were not changed by dropping the extreme yield values from the sample.

Kenya

The Kenya data contain 3,152 maize plots of 814 maize-producing households from 2 waves of the panel survey in Kenya in 2004 and 2007, covering 4 seasons (2 crop seasons per year for 2 years). Table 4 reports the average farm size, number of maize plots, size of maize plots, and average yield by farm-size quantile based on the cultivated land size of the households in 2004. The intensity of input use, on average and conditional on use, is reported by type of input and scale of production in Table 5. The intensity levels of household labor use, hired labor, and organic fertilizer use decline with farm size, while hybrid seed use and chemical fertilizer use do not show a clear pattern.

In Kenya, compared to the other countries, hybrid seed and chemical fertilizer are used on a higher proportion of plots and at high application levels, which is indicative of more intensive farming methods (Table 6). Roughly 12 percent of plots are rented in, which is the highest share among the four countries. Farmers with the smallest farms were more likely to rent in plots. About 90 percent of plots were intercropped. Larger farms were more likely to hire labor or draft animals.
Taken together, these facts suggest that rural labor and land rental markets are more active in Kenya, relative to the other countries in our study.

Table 7 reports the results of the regressions. As before, the log of land planted to maize per plot was regressed sequentially on the log of yield, chemical fertilizer use per hectare, organic fertilizer use per hectare, total hours of family labor, and the cost of hired labor and draft animals, with other covariates: the household fixed effect, an intercropping dummy, and a hybrid seed use dummy. In addition, since the data form a panel, a season-year dummy is included as well. The elasticity of plot size is negative and statistically significant for both yield and all the inputs. Quantitatively, the elasticity at -0.49 is larger in absolute terms than the estimates for Malawi and Tanzania, and this holds true of the elasticities for the other factor intensities. It falls to -0.363 when the core dataset is used. These results show that yield declines with plot size, which is consistent with the inverse yield relationship. They also suggest that the inverse yield relationship may be due to the inverse relationship in input use.

Uganda

The Uganda data contain 3,200 maize plots of 825 maize-producing households from 2 waves of the panel survey in Uganda in 2003 and 2005, covering 4 seasons (2 crop seasons per year for 2 years). The intensity of input use, on average and conditional on use, is reported by type of input and scale of production in Table 5. The intensity of household labor use declines while hired labor cost increases with farm size. Other inputs do not show clear patterns. The table indicates that the intensity level of maize production is very low in Uganda, unlike Malawi and Kenya. Chemical fertilizer is used on only 2 percent of maize plots and organic fertilizer is used on only 4 percent. On average 70 percent of maize plots are intercropped and the percentage decreases with farm size.
Hybrid seed and hired labor are more likely to be used on larger farms. Again, the smallest farms were most likely to rent plots for maize production, although rental land was used on few farms of any scale (Table 6).

Table 7 reports the results of the regression with household fixed effects. The plot size elasticities for yield and family labor use are negative and statistically significant, while those for other inputs are not significant. At -0.557, the land-yield elasticity is the largest in absolute terms of the country estimates. It remains significant but falls to -0.335 when we use only the core 90 percent sample for the regressions. Yield and family labor use decline with plot size.

Summary results

Taken together, the four datasets describe farming sectors where plots are small and households rely primarily on household labor and land that they own, conditions consistent with a conceptual depiction analyzed by Feder (1985). Only in Kenya are farmers likely to hire draft animals or employ workers. Other indicators of chosen technologies (for example, the use of intercropping, fertilizer or high-yielding seeds) vary across countries, but vary less within country by scale. Differences in fertilizer use are indicative; a majority of maize farmers in Kenya and Malawi apply chemical fertilizers to their plots regardless of scale, while fertilizer is rarely used in Tanzania and Uganda. Keeping in mind that what classifies as a large farm in Malawi, Tanzania, Kenya or Uganda is small by standards in Europe, Central Asia or the Americas, within-country variation comes not so much from differences in applied inputs, but rather from the intensity of their application. This is most visible in the use of household labor in Tanzania, Kenya and Uganda. The difference in labor intensity is less obvious in Malawi, but then there is also little variation in the size of farms in Malawi.
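
The core of the plot-level regressions discussed in this section (log plot area regressed on log yield with household fixed effects, then rerun on the core 90 percent of yields) can be sketched as follows. This is a minimal illustration rather than the estimation code behind Table 7: it omits the intercropping and hybrid-seed indicators, uses simulated data with an assumed elasticity of -0.3, and the function names are hypothetical.

```python
import numpy as np


def within_slope(y, x, group):
    """Slope of y on x after sweeping out group fixed effects."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    _, idx = np.unique(group, return_inverse=True)
    counts = np.bincount(idx)
    # Subtract group means; this removes each group's own intercept.
    y_dm = y - (np.bincount(idx, weights=y) / counts)[idx]
    x_dm = x - (np.bincount(idx, weights=x) / counts)[idx]
    return float(x_dm @ y_dm / (x_dm @ x_dm))


def core_90_mask(yields):
    """True for observations between the 5th and 95th yield percentiles."""
    lo, hi = np.quantile(yields, [0.05, 0.95])
    return (yields >= lo) & (yields <= hi)


# Simulated plots: 200 hypothetical households with 3 maize plots each.
rng = np.random.default_rng(0)
n_households, plots_per_household = 200, 3
household = np.repeat(np.arange(n_households), plots_per_household)
household_effect = rng.normal(0.0, 0.5, n_households)[household]
log_yield = rng.normal(7.0, 0.6, household.size)
true_elasticity = -0.3  # assumed value used to generate the data
log_area = (household_effect + true_elasticity * log_yield
            + rng.normal(0.0, 0.1, household.size))

# Full-sample estimate, then the same regression on the trimmed sample.
beta_full = within_slope(log_area, log_yield, household)
keep = core_90_mask(np.exp(log_yield))
beta_core = within_slope(log_area[keep], log_yield[keep], household[keep])
print(round(beta_full, 3), round(beta_core, 3))
```

Demeaning by household is equivalent to including a dummy for every household, so the slope is identified only from variation across plots within the same household. Households left with a single plot after trimming contribute nothing to the estimate.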
The statistical analysis confirms the inverse yield relationship for maize at the plot level. The calculated elasticities, estimated independently for each sample, are distinguishable from zero with a high degree of statistical confidence in each case. Moreover, the elasticities are quantitatively similar to results obtained from the meta-panel of plot-level data reported in Table 3, once the extreme tails of the yield distribution are excluded.16

The analysis also suggests an inverse relationship between scale and household labor use, an empirical phenomenon that is related to how labor markets work in Sub-Saharan Africa and a potential determinant of why farms are small in Africa. In this case, there is greater variation in the quantitative estimates; however, the distinction falls according to survey groups, with the World Bank LSMS surveys producing lower household labor elasticities than the GRIPS REPEAT surveys. Since measuring household labor input is notoriously difficult, it may be that sampling methods play a role.17

Summary and conclusions

In this paper, we revisit the inverse-yield relationship, the observed phenomenon that cereal yields decline as the scale of production increases in Sub-Saharan Africa. The relationship is important for policy since it is taken as evidence that the small-scale structure of farming in Africa is a natural and efficient outcome, given the pervasive constraints on the sector, particularly those related to land and labor markets. In turn, this has had important implications for the evolution of African agricultural policies; currently, efforts to improve the productivity of smallholder farms are a ubiquitous core feature of the development strategies promoted by African governments and international development agencies.
Early empirical evidence in favor of the inverse-yield relationship was based on observations of the phenomenon in regional data compilations and econometric analysis using regionally representative datasets. This prompted criticism, since potentially important information about farms, farming households, and related markets was often omitted, opening the door to potentially biased results. As a consequence, researchers began to rely on datasets with a limited geographic reach, often examining the variation in plot yields for a given farming household. This shifted the focus of the studies from the consequences of the prevailing scale of farming to the consequences of fragmenting already small farms into smaller plots.

This shift in the literature reflects an underlying methodological conundrum for researchers. For policy, what matters most is evidence as to whether small farms are an appropriate and natural outcome of the interplay of endowments and markets. At the same time, homogenous and narrow empirical settings, though less relevant for policy, are most likely to generate better parameter estimates and the robust statistical tests that facilitate evidence-based policy decisions. In this paper, we use multiple datasets to explore this conundrum and find evidence that the tradeoffs are not as severe as might be expected, at least in the case of maize production in Africa.

16 Throughout our analysis, we employ fixed-effects models on the presumption that omitted state variables lead to bias. This need not be the case if the omitted state variables are uncorrelated with the included regression variables. Although it seems likely that important state variables, such as climate and household composition, are uncorrelated with factor use, we did test to see whether the fixed-effect measures we used to proxy the aggregate effects of the state variables were correlated by comparing our estimates with results from otherwise identical random-effects models, using a test proposed by Hausman (1978). In all yield and household labor regressions, we were able to reject the notion that the state variables were uncorrelated. The results, available upon request, were mixed for other factor regressions.

17 See Beegle et al. (2012) in the context of measuring household consumption in developing countries.

In particular, we adopt Mundlak’s conceptual model of endogenous technology choice, in which farming methods (applied technologies) depend on the decision environment in which farmers operate. One consequence is that heterogeneity in household characteristics, farm endowments, and market performance generates heterogeneity in the decision environment and therefore heterogeneity in the applied technologies underlying the data. State variables can be used to identify the relevant production technologies, but the number of state variables needed for identification increases with the variety and complexity of the decision environments producing the data. Consequently, the risk of misidentification rises with the geographic scope of the data, since the data are drawn from a more diverse set of decision environments.

With this in mind, we develop an applied model to test the inverse relationship between maize yields and production scale in Sub-Saharan Africa, using fixed location effects to proxy the set of relevant state variables. When estimating the model, we progress from coarse data that are geographically and temporally diverse to more granular and place-specific data, with the presumption that the opportunity for omitting relevant information about the state variables is reduced along the way. In all settings, we find empirical evidence that supports the hypothesis that maize yields fall as the scale of production rises.
Moreover, once extreme yield values are expunged from the data, the estimated elasticities measuring the inverse relationship are remarkably similar across the assorted data platforms. We take this as evidence that small-scale farming in Africa is an endogenous outcome of the endowments, markets and constraints that temper farming household decisions about what to produce and how to produce it. Policies meant to improve smallholder productivity and those meant to relieve the constraints that bind households to less productive strategies should be viewed as complementary.

Yields and the relationship between yields and scale are narrow measures of productivity, and the more detailed data provide a richer ground for describing how applied technologies differ by type of input across scale. For the four detailed datasets included in our study, there is also evidence of an inverse relationship between household labor and scale. In general, there is considerable consistency in the type of inputs used by households in a given location. Therefore, the data suggest that variations in scale are associated with how intensively inputs are applied, rather than with variation in the inputs adopted.

Bibliography

AGRA. 2011. What is the Alliance for a Green Revolution in Africa? www.agra-alliance.org, downloaded June 16, 2011.
Antle, John M. 1983. Sequential decision making in production models. American Journal of Agricultural Economics 65 (2), 282–290.
Assuncao, Juliano and Luiz Henrique Braido. 2007. Testing household-specific explanations for the inverse productivity relationship. American Journal of Agricultural Economics 89 (4), 980–990.
Bardhan, Pranab K. 1973. Size, productivity, and returns to scale: an analysis of farm-level data in Indian agriculture. The Journal of Political Economy 81 (6), 1370–1386.
Barrett, Christopher B. 1996. On price risk and the inverse farm size-productivity relationship. Journal of Development Economics 51 (2), 193–215.
Barrett, Christopher B., Marc F. Bellemare, and Janet Y. Hou. 2010. Reconsidering conventional explanations of the inverse productivity–size relationship. World Development 38 (1), 88–97.
Bauer, P. T. 1946. The working of rubber regulation. Economic Journal 56 (223), 391–414.
Beegle, Kathleen, Joachim De Weerdt, Jed Friedman, and John Gibson. 2012. Methods of household consumption measurement through surveys: experimental results from Tanzania. Journal of Development Economics 98 (1), 3–18.
Benjamin, Dwayne. 1995. Can unobserved land quality explain the inverse productivity relationship? Journal of Development Economics 46 (1), 51–84.
Berry, R. Albert, and William Robert Cline. 1979. Agrarian Structure and Productivity in Developing Countries. Baltimore, MD: Johns Hopkins University Press.
Bhalla, Surjit. 1979. Farm size, productivity and technical change in Indian agriculture, appendix. In R. Albert Berry and William Robert Cline (eds.), Agrarian Structure and Productivity in Developing Countries. Baltimore, MD: Johns Hopkins University Press.
Bhalla, Surjit S. and Prannoy Roy. 1988. Mis-specification in farm productivity analysis: the role of land quality. Oxford Economic Papers 40 (1), 55–73.
Binswanger, Hans. 1991. Brazilian policies that encourage deforestation in the Amazon. World Development 19 (7), 821–829.
Binswanger, Hans and John McIntire. 1987. Behavioral and material determinants of production relations in land abundant tropical agriculture. Economic Development and Cultural Change 36 (1), 75–99.
Binswanger, Hans and Mark R. Rosenzweig. 1986. Behavioral and material determinants of production relations in agriculture. Journal of Development Studies 22 (3), 503–539.
Carletto, Calogero, Sara Savastano and Alberto Zezza. 2011. Fact or artefact: the impact of measurement errors on the farm size-productivity relationship. Policy Research Working Paper 5908. Washington: World Bank.
Carter, Michael R. 1984. Identification of the inverse relationship between farm size and productivity: an empirical analysis of peasant agricultural production. Oxford Economic Papers 36 (1), 131–145.
Carter, Michael R. and Keith D. Wiebe. 1990. Access to capital and its impact on agrarian structure and productivity in Kenya. American Journal of Agricultural Economics 72 (5), 1146–1150.
Collier, Paul. 2008. The politics of hunger. Foreign Affairs 87 (6), 67–79.
Deininger, Klaus and Derek Byerlee. 2012. The rise of large farms in land abundant countries: do they have a future? World Development 40 (4), 701–14.
Deininger, Klaus and Gershon Feder. 2001. Chapter 6: Land institutions and land markets. In Bruce L. Gardner and Gordon C. Rausser (eds.), Handbook of Agricultural Economics, Volume 1, Part A. Amsterdam: Elsevier.
Deolalikar, Anil B. 1981. The inverse relationship between productivity and farm size: a test using regional data from India. American Journal of Agricultural Economics 63 (2), 275–279.
Diouf, Jacques. 2011. Foreword to Save and Grow: A Policymaker’s Guide to the Sustainable Intensification of Smallholder Crop Production. Rome: Food and Agriculture Organization of the United Nations.
Eicher, Carl K., and Doyle Curtis Baker. 1982. Research on agricultural development in Sub-Saharan Africa: a critical survey. MSU International Development Paper 1. East Lansing, Michigan: Michigan State University.
Fan, Shenggen. 2011. Press Statement June 15, 2011. Washington: International Food Policy Research Institute.
FAO. 2012. FAOSTAT data base. Rome: Food and Agriculture Organization.
Feder, Gershon. 1985. The relation between farm size and farm productivity: the role of family labor, supervision and credit constraints. Journal of Development Economics 18 (2-3), 297–313.
Foster, Andrew D. and Mark R. Rosenzweig. 2011. Are Indian farms too small? Mechanization, agency costs, and farm efficiency. Providence, RI: Brown University.
Gates, Bill. 2012. Helping Poor Farmers, Changes Needed to Feed 1 Billion Hungry. Bill and Melinda Gates Foundation. Available on the Internet at: http://www.gatesfoundation.org/media-center/press-releases/2012/02/helping-poor-farmers-changes-needed-to-feed-1-billion-hungry.
Hausman, Jerry Allen. 1978. Specification tests in econometrics. Econometrica 46 (6), 1251–71.
Hazell, Peter, Colin Poulton, Steve Wiggins, and Andrew Dorward. 2010. The future of small farms: trajectories and policy priorities. World Development 38 (10), 1349–1361.
Heltberg, Rasmus. 1998. Rural market imperfections and the farm size-productivity relationship: evidence from Pakistan. World Development 26 (10), 1807–182.
Kevane, Michael. 1996. Agrarian structure and agricultural practice: typology and application to Western Sudan. American Journal of Agricultural Economics 78 (1), 236–245.
Kimhi, Ayal. 2006. Plot size and maize productivity in Zambia: is there an inverse relationship? Agricultural Economics 35 (1), 1–9.
Lamb, Russell L. 2003. Inverse productivity: land quality, labor markets, and measurement error. Journal of Development Economics 71 (1), 71–95.
Lane, Charles, and Jules N. Pretty. 1991. Displaced pastoralists and transferred wheat technology in Tanzania. IIED Gatekeeper Series No. 20. London: International Institute for Environment and Development.
Larson, Donald F., and Mauricio León. 2006. How endowments, accumulations, and choice determine the geography of agricultural productivity in Ecuador. World Bank Economic Review 20 (3), 449–71.
Larson, Donald F., and Yair Mundlak. 1997. On the intersectoral migration of agricultural labor. Economic Development and Cultural Change 45 (2), 295–319.
Lipton, Michael. 2006. Can small farmers survive, prosper, or be the key channel to cut mass poverty? The Electronic Journal of Agricultural and Development Economics 3 (1), 58–85.
Matsumoto, Tomoya and Takashi Yamano. 2009. Soil fertility, fertilizer, and the maize Green Revolution in East Africa. Policy Research Working Paper 5158. Washington: World Bank.
Mazumdar, Dipak. 1965. Size of farm and productivity: a problem of Indian peasant agriculture. Economica 32 (126), 161–173.
Mundlak, Yair, Donald F. Larson and Rita Butzer. 1999. Rethinking within and between regressions: the case of agricultural production functions. Annales d'Économie et de Statistique (55/56), Économétrie des Données de Panel, 475–501.
Mundlak, Yair, Rita Butzer and Donald F. Larson. 2012. Heterogeneous technology and panel data: the case of the agricultural production function. Journal of Development Economics 99 (1), 139–149.
Mundlak, Yair. 1988. Endogenous technology and the measurement of productivity. In Susan M. Capalbo and John M. Antle (eds.), Agricultural Productivity: Measurement and Explanation. Washington: Resources for the Future.
Mundlak, Yair. 1993. On the empirical aspects of economic growth theory. American Economic Review 83 (2), 415–420.
Nagayets, Oksana. 2005. Small farms: current status and key trends. In The Future of Small Farms. Proceedings of a Research Workshop. Washington: International Food Policy Research Institute.
Otsuka, Keijiro and Donald F. Larson. 2012. An African Green Revolution: Finding Ways to Boost Productivity on Small Farms. Dordrecht: Springer.
Pingali, Prabhu, Yves Bigot, and Hans P. Binswanger. 1987. Agricultural Mechanization and the Evolution of Farming Systems in Sub-Saharan Africa. Baltimore: Johns Hopkins University Press.
Ravallion, Martin, Shaohua Chen, and Prem Sangraula. 2007. New evidence on the urbanization of global poverty. Population and Development Review 33 (4), 667–701.
Sen, Amartya K. 1966. Peasants and dualism with or without surplus labor. The Journal of Political Economy 74 (5), 425–450.
Smale, Melinda, Derek Byerlee and Thom Jayne. 2012. Maize revolutions in Africa. In Keijiro Otsuka and Donald F. Larson (eds.), An African Green Revolution: Finding Ways to Boost Productivity on Small Farms. Dordrecht: Springer.
Taslim, M.A. 1989. Supervision problems and the size-productivity relation in Bangladesh agriculture. Oxford Bulletin of Economics and Statistics 51 (1), 55–71.
von Braun, Joachim. 2005. Science and technology policies for agricultural productivity and growth in developing countries (PowerPoint Presentation), Agricultural Outlook Forum 2005 32857, United States Department of Agriculture, Agricultural Outlook Forum. Available on the Internet at: http://ageconsearch.umn.edu/handle/32857.
World Bank. 2008. World Development Report 2008: Agriculture for Development. Washington: World Bank.
Yamano, Takashi, Dick Sserunkuuma, Keijiro Otsuka, George Omiat, John Herbert Ainembabazi, and Yasuharu Shimamura. 2004. The 2003 REPEAT Survey in Uganda: Results, available at: http://www3.grips.ac.jp/~globalcoe/j/data/repeat/REPEATinUgandaReport.pdf.
Yamano, Takashi, Keijiro Otsuka, Frank Place, Yoko Kijima, and James Nyoro. 2005. The 2004 REPEAT Survey in Kenya (First Wave): Results, available at: http://www3.grips.ac.jp/~globalcoe/j/data/repeat/ReportKenya2004F.pdf
Yotopoulos, P.A., and L. J. Lau. 1973. A test for relative economic efficiency: some further results. American Economic Review 63 (1), 214–223.
Zoellick, Robert. 2011. Free markets can still feed the world. Financial Times January 5, 2011.

Tables

Table 1: Household and plot datasets
Household meta-panel        Plot meta-panel             Detailed plot data
Country      Observations   Country      Observations   Country      Observations
Ghana                 677   Ethiopia            4,681   Kenya               3,109
Kenya               6,347   Madagascar            388   Malawi             12,078
Madagascar            755   Mali                  176   Tanzania            1,787
Mozambique          1,306   Tanzania            2,003   Uganda              3,195
Malawi             12,099   Uganda              1,033
Nigeria             1,760
Rwanda              3,941
Tanzania           27,786
Uganda              5,170
Zambia              2,195
Total              62,036   Total               8,281   Total              20,169
Note: Included surveys were conducted between 1999 and 2009.

Table 2: Farm and plot characteristics by quantile.
Averages from plot meta-panel
Quantile       Farm size   Plot maize area   Maize yield   Number of plots
1                   0.37              0.23         1,479              1.23
2                   0.92              0.37         1,470              1.38
3                   1.67              0.45         1,427              1.56
4                  10.62              0.78         1,128              1.92
Full sample         3.38              0.46         1,376              1.52

Averages from household meta-panel
Quantile       Farm size   Farm maize area   Maize yield
1                   0.41              0.29         1,070
2                   0.96              0.50         1,006
3                   1.78              0.74         1,012
4                   7.28              1.52         1,037
Full sample         2.45              0.74         1,030
Note: Farm size is the sum of cultivated area for all crops on all plots.

Table 3: Fixed-effects model results from combination datasets
                  Household meta-panel       Plot meta-panel
                  Coefficient    Std. Err.   Coefficient    Std. Err.
Full sample
  Log yield            -0.161*       0.003        -0.206*       0.020
  Constant              0.364*       0.020         0.215        0.135
  Rho                   0.285+                     0.606+
Core 90 percent
  Log yield            -0.135*       0.004        -0.178*       0.030
  Constant              0.188*       0.027         0.016        0.203
  Rho                   0.310+                     0.611+
Note: Rho is the fraction of variance due to the fixed effects. * denotes that the associated t-score is significant at the 1 percent level. + denotes that the associated F-score is significant at the 1 percent level. The highest and lowest 5 percent of observed yields were dropped from the full samples to create the “Core 90” samples.

Table 4: Sample averages for Malawi, Tanzania, Kenya and Uganda
Quantile       Farm size   Number of maize plots   Plot area in maize   Maize yield
Malawi
1                   0.15                    1.35                 0.24         1,774
2                   0.45                    1.59                 0.31         1,355
3                   0.74                    1.72                 0.43         1,282
4                   4.36                    1.98                 2.22         1,344
Full sample         1.42                    1.66                 0.42         1,441
Tanzania
1                   0.47                    1.50                 0.32           889
2                   1.20                    2.05                 0.49           846
3                   2.10                    1.74                 0.73           758
4                   5.35                    1.96                 1.22           788
Full sample         2.24                    1.79                 0.68           821
Kenya
1                   0.40                    1.22                 0.20         1,575
2                   0.94                    1.42                 0.40         1,672
3                   1.67                    1.51                 0.60         1,383
4                   4.19                    1.55                 0.92         1,588
Full sample         1.74                    1.42                 0.52         1,567
Uganda
1                   0.43                    1.21                 0.21         1,199
2                   1.03                    1.23                 0.30         1,119
3                   1.88                    1.42                 0.51         1,181
4                   4.92                    1.51                 0.82         1,123
Full sample         2.01                    1.35                 0.48         1,154
Note: Farm size in Malawi and Tanzania is the area owned by the household. In Kenya and Uganda, farm size is adjusted by net rentals.
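The "Core 90" trimming described in the note to Table 3 can be illustrated with a minimal sketch. It assumes that observations are ranked by yield and that an equal share (5 percent, rounded down) is dropped from each tail; the source does not say how ties or rounding were handled, and the function name, field names, and synthetic data below are hypothetical.

```python
def core_sample(records, trim_percent=5, key=lambda r: r["yield_kg_ha"]):
    """Drop the lowest and highest `trim_percent` percent of records, ranked by yield.

    Assumption: the count dropped from each tail is rounded down.
    """
    ordered = sorted(records, key=key)
    k = len(ordered) * trim_percent // 100  # observations dropped per tail
    return ordered[k:len(ordered) - k]


# Synthetic example: 100 plots with yields 10, 20, ..., 1000 (not survey data).
plots = [{"plot_id": i, "yield_kg_ha": 10.0 * i} for i in range(1, 101)]
core = core_sample(plots)
print(len(core), core[0]["yield_kg_ha"], core[-1]["yield_kg_ha"])
```

With 100 synthetic observations, five are removed from each tail, leaving the middle 90.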
Table 5: Malawi: Plot-level average use per hectare by farm size.
                                       Average use per hectare, including zero values     Average use per hectare when input is used (non-zero use)
Quantile                                     1        2        3        4 Full sample           1        2        3        4 Full sample
Malawi
Household labor (hours/ha)              799.20   902.88   877.45   804.00      845.34      818.79   909.35   883.58   815.67      856.61
Hired labor (hours/ha)                    6.29     4.21     3.85     5.82        5.05       25.30    22.47    19.03    18.30       21.11
Exchange labor (hours/ha)                 2.58     2.33     2.38     1.61        2.23       24.65    22.06    19.81    15.80       20.61
Herbicide and pesticide use (kg/ha)       0.06     0.09     0.14     0.41        0.18        7.29    15.49    30.71    35.56       23.30
Chemical fertilizer use (kg/ha)         123.05   119.20   115.74   118.04      119.03      188.05   174.24   163.84   161.33      171.53
Tanzania
Household labor (days/ha)               239.96   170.22   115.83     90.1      156.64      242.34   171.11   116.34    90.72      157.71
Hired labor (days/ha)                     9.01     7.87     7.41     6.42        7.72          33    22.46    22.54    18.02       23.81
Herbicide and pesticide use (kg/ha)       1.01     0.21     1.85     1.12        1.08        9.66     2.59    15.26     7.91        9.56
Chemical fertilizer use (kg/ha)          16.92    20.59    15.85    16.86       17.43      114.84   141.56    92.07   101.17      110.31
Organic fertilizer use (kg/ha)           22.68     27.4    18.42    23.45       22.81      160.36   234.42   117.51   168.36      163.49
Kenya
Household labor (hours/ha)             1,287.9    921.0    810.8    648.8       930.3     1,294.0    922.9    834.1    679.8       947.4
Hired labor and animal cost (Ksh/ha)     9,530    6,174    5,617    5,754       6,802      17,049    9,828    8,356    7,780      10,534
Chemical fertilizer use (kg/ha)          90.07    91.42    85.97    86.55       88.79      118.70   119.81   118.61   120.32      119.40
Organic fertilizer use (kg/ha)         2,763.1  1,646.8  1,366.8    884.5     1,679.8     5,101.0  3,187.2  2,754.2  2,146.4     3,399.4
Uganda
Household labor (hours/ha)               1,017      909      809      714         847       1,032      912      825      734         862
Hired labor and animal cost (Ush/ha)    18,527   27,032   28,503   38,827      29,084      78,741   92,242   73,677   84,378      81,774
Chemical fertilizer use (kg/ha)           0.68     0.49     0.66     0.90        0.69       36.48    29.16    23.02    47.14       32.43
Organic fertilizer use (kg/ha)            34.5     31.7     15.8      8.5        21.4       554.7    519.2    445.1    317.5       479.3

Table 6: Share of plots by farming techniques
Quantile                         1       2       3       4  Full sample
Malawi
Inter-cropped                 0.45    0.51    0.47    0.41         0.46
Hybrid seed used              0.52    0.45    0.43    0.42         0.46
Labor hired                   0.25    0.19    0.20    0.32         0.24
Exchange labor used           0.11    0.11    0.12    0.10         0.11
Plot rented                   0.21    0.02    0.01    0.01         0.07
Herbicide-pesticide used      0.01    0.01    0.00    0.01         0.01
Chemical fertilizer used      0.65    0.68    0.71    0.73         0.69
Organic fertilizer used       0.11    0.12    0.14    0.15         0.13
Tanzania
Inter-cropped                 0.67    0.65    0.67    0.67         0.67
Hybrid seed used              0.16    0.14    0.12    0.14         0.14
Labor hired                   0.27    0.35    0.33    0.36         0.32
Plot rented                   0.16    0.04    0.02    0.02         0.06
Herbicide-pesticide used      0.10    0.08    0.12    0.14         0.11
Chemical fertilizer used      0.15    0.15    0.17    0.17         0.16
Organic fertilizer used       0.14    0.12    0.16    0.14         0.14
Kenya
Inter-cropped                 0.91    0.91    0.87    0.86         0.89
Hybrid seed used              0.57    0.60    0.58    0.61         0.59
Labor or animal hired         0.56    0.63    0.67    0.74         0.65
Plot rented                   0.14    0.14    0.10    0.09         0.12
Chemical fertilizer used      0.76    0.76    0.72    0.72         0.74
Organic fertilizer used       0.54    0.52    0.50    0.41         0.49
Uganda
Inter-cropped                 0.78    0.75    0.66    0.64         0.70
High-yielding seed used       0.15    0.23    0.22    0.24         0.22
Labor or animal hired         0.24    0.29    0.39    0.46         0.36
Plot rented                   0.20    0.08    0.08    0.05         0.09
Chemical fertilizer used      0.02    0.02    0.03    0.02         0.02
Organic fertilizer used       0.06    0.06    0.04    0.03         0.04
Note: For Uganda, high-yielding seeds include hybrid and open pollinated varieties.

Table 7: Yield and input intensity elasticities from household fixed effects estimates.
                              Full data set          Core 90 data
                              Elasticity   t-score   Elasticity   t-score
Malawi
Yield                             -0.349a   -24.03       -0.392a   -19.39
Household labor                   -0.041b    -2.01       -0.044b    -2.01
Hired labor                        0.036a     6.44        0.003a     4.91
Exchange labor                     0.006b     2.11        0.004      1.32
Herbicide-pesticide               -0.001     -0.58       -0.000     -0.62
Chemical fertilizer                0.091a     5.93        0.100a     6.18
Organic fertilizer*               -0.027     -0.62       -0.020     -0.48
Tanzania
Yield                             -0.246a    -6.06       -0.218a    -3.88
Household labor                   -0.100a    -7.23       -0.092a    -6.21
Hired labor                        0.033c     1.91        0.035b     2.04
Herbicide-pesticide               -0.004     -0.80       -0.001     -0.17
Chemical fertilizer               -0.021b    -2.55       -0.033b    -2.15
Organic fertilizer                -0.013a    -2.75       -0.010c    -1.70
Kenya
Yield                             -0.490a   -16.27       -0.363a   -14.80
Household labor                   -0.666a   -11.18       -0.631a    -9.20
Hired labor and animal cost       -0.681a    -6.15       -0.794a    -6.10
Chemical fertilizer               -0.184a    -5.26       -0.199a    -5.24
Organic fertilizer                -0.415a    -6.03       -0.439a    -5.37
Uganda
Yield                             -0.557a   -15.06       -0.335a   -10.82
Household labor                   -0.601a    -9.27       -0.524b    -6.51
Hired labor and animal cost        0.147      1.27        0.205      1.44
Chemical fertilizer                0.169      0.64        0.166      0.47
Organic fertilizer                -0.107     -0.71       -0.169     -0.89
Note: Elasticities were calculated at mean values. * denotes a discrete variable; the associated parameter is the percentage change in yield from adopting the input. See Table 5 for underlying variable units. Superscripts a, b and c denote significance at the 1, 5 and 10 percent levels. For Kenya, time dummies (year-season dummies) are included in addition to household effects; for Uganda, year-season dummies are included.
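The note to Table 7 states that elasticities were calculated at mean values. A minimal sketch of that calculation follows, assuming (the source does not confirm this) a specification that is linear in levels, so that the elasticity is the estimated slope scaled by the ratio of the sample means. The function name and the illustrative numbers are hypothetical and are not taken from the survey data.

```python
from statistics import mean


def elasticity_at_means(slope, x_values, y_values):
    """Elasticity of y with respect to x, evaluated at the sample means.

    Assumes a linear-in-levels relationship y = a + slope * x, for which
    the point elasticity at the means is slope * mean(x) / mean(y).
    """
    return slope * mean(x_values) / mean(y_values)


# Made-up data satisfying y = 1000 - 200 * x exactly.
farm_size = [0.5, 1.0, 1.5, 3.0]          # scale variable (hypothetical)
intensity = [900.0, 800.0, 700.0, 400.0]  # input use per hectare (hypothetical)
slope = -200.0                            # stands in for an estimated coefficient

e = elasticity_at_means(slope, farm_size, intensity)
print(round(e, 3))
```

A negative value, as in this made-up case, corresponds to input intensity falling as scale increases.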