WPS7812

Policy Research Working Paper 7812

Varietal Identification in Household Surveys
Results from an Experiment Using DNA Fingerprinting of Sweet Potato Leaves in Southern Ethiopia

Frédéric Kosmowski, Abiyot Aragaw, Andrzej Kilian, Alemayehu Ambel, John Ilukor, Biratu Yigezu, James Stevenson

Development Economics
Development Data Group
September 2016

Abstract

Sweet potato (Ipomoea batatas) varieties have important nutritional differences and there is strong interest to identify nutritionally superior varieties for dissemination. In agricultural household surveys, this information is often collected based on the farmer’s self-report. However, recent evidence has demonstrated the inherent difficulties in correctly identifying varieties from self-report information. This study examines the accuracy of self-report information on varietal identification from a data capture experiment on sweet potato varieties in southern Ethiopia. Three household-based methods of identifying varietal adoption are tested against the benchmark of DNA fingerprinting: (A) elicitation from farmers with basic questions for the most widely planted variety; (B) farmer elicitation on five sweet potato phenotypic attributes by showing a visual-aid protocol; and (C) enumerator recording observations on five sweet potato phenotypic attributes using a visual-aid protocol and visiting the field. Twenty percent of farmers identified a variety as improved when in fact it was local and 19 percent identified a variety as local when it was in fact improved. The variety names given by farmers delivered inconsistent and fuzzy varietal identities. The visual-aid protocols employed in methods B and C were better than method A, but still way below the adoption estimates given by the DNA fingerprinting method. The findings suggest that estimating the adoption of improved varieties with methods based on farmer self-reports is questionable, and point toward a wider use of DNA fingerprinting, likely to become the gold standard for crop varietal identification.

This paper is a product of the Development Data Group, Development Economics in collaboration with CGIAR Standing Panel on Impact Assessment. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://econ.worldbank.org. The authors may be contacted at aambel@worldbank.org and Frederic.Kosmowski@fao.org.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.
Produced by the Research Support Team

Frédéric Kosmowski (1,*), Abiyot Aragaw (2), Andrzej Kilian (3), Alemayehu Ambel (4,*), John Ilukor (1), Biratu Yigezu (5) and James Stevenson (1)

Keywords: Sweet potato; agricultural survey; varietal identification; measurement error; DNA fingerprinting; visual-aid in surveys.

JEL Classification: C8, C93, Q16

This study was supported by the OPP1009472 project from the Bill and Melinda Gates Foundation. We are grateful to Fekadu Gurmu (SARI), Tilahun Wendimu (ORARI) and Fiseha Tadesse (ORARI) for providing access to improved germplasm. We thank Jean Hanson and Alemayehu Teressa (ILRI), Berhanu Huilu (CSA) and Steffen Shultz (CIP) for technical support.

Authors’ affiliations:
(1) CGIAR Standing Panel on Impact Assessment, Food and Agriculture Organization of the United Nations, Rome, Italy.
(2) International Potato Center (CIP), Hawassa, Ethiopia.
(3) Diversity Arrays Technology Pty. Ltd., Yarralumla, ACT, Australia.
(4) Development Data Group, World Bank, Washington, DC, USA.
(5) Central Statistical Agency, Addis Ababa, Ethiopia.
(*) Correspondence e-mails: f.kosmowski@cgiar.org, aambel@worldbank.org

Introduction

Developing countries rely on agricultural productivity for growth (World Bank, 2007), and achieving a Green Revolution in Sub-Saharan Africa is a major objective of many development organizations. Indeed, the period of growth in food production witnessed in the mid-1960s in several Asian countries contributed to widespread poverty reduction, averted hunger for millions of people, and avoided the conversion of thousands of hectares of land into agricultural cultivation (Stevenson et al., 2012). One essential activity of agricultural development is the breeding and dissemination of improved varieties.
Crop germplasm improvement is thus a major activity of CGIAR centers, and thousands of new varieties have been developed in different agro-ecological contexts to provide higher yield, better nutritional content, or increased resistance to diseases or droughts. Accurate information on crop varieties is therefore crucial to study the extent of adoption by farmers and evaluate the performance of agricultural development programs.

However, measuring and understanding the diffusion of improved crop varieties remains challenging. The challenge is more pronounced among poor smallholder farmers, where records from official transactions are often missing. Various methodologies, such as sales inquiries, expert opinion estimates and household survey questionnaires, have been employed, each with its own inherent limitations. For example, seed-sales inquiries require specific surveys which may not fit into existing agricultural statistics systems. They are also more susceptible to recall bias. In addition, companies are often unwilling to share this information with the public. In a major effort to quantify the adoption of improved varieties in Sub-Saharan Africa, the DIIVA project (Diffusion and Impact of Improved Varieties in Africa) has shed light on the convergence of expert opinion with household survey estimates (Walker, 2015). Its conclusions indicate that expert opinion estimates are likely to overemphasize the uptake of specific varieties while household surveys are likely to understate their importance. The study concludes that “probably neither surveys nor expert panels can do a good job in delivering accurate estimates of cultivar-specific adoption” (Walker, 2015).

Assessing the extent of measurement errors is, however, impossible in the absence of an objective benchmark. Since 2010, the technology of DNA fingerprinting has become increasingly affordable, and costs per sample are projected to continue to decrease in the coming decade.
The emergence of DNA fingerprinting as a survey instrument provides the opportunity to conduct a survey validation exercise and assess the accuracy of existing methods for collecting crop varietal identification (Maredia and Reyes, 2015; Rabbi et al., 2015). However, the available evidence does not provide information on relevant questions. For example, does this survey validation exercise matter for all crops? Or is it different for different crops? How do different household-survey-based approaches perform against the DNA fingerprinting benchmark?

In this study we present varietal identification for sweet potato. While the Green Revolution was mainly based on the diffusion of crop genetic improvement for the three main staple cereals (maize, rice and wheat), Sub-Saharan Africa exhibits a high diversity of crops, which are of similar importance for food security (Pingali, 2012). Among them, sweet potato has encountered widespread interest since the 1980s. Sweet potato is a co-staple crop in East Africa’s mid-elevation farming areas. In Ethiopia, the number of sweet potato producers has increased recently. The crop is now considered a major food crop, with 1.6 million producers (Central Statistical Agency, 2012). It is mainly used for household consumption (82%), with only a small portion of the crop being sold (12%). The sweet potato seed system is almost entirely informal, with only occasional formal distributions of new varieties by agricultural research centers or NGOs (Namanda et al., 2011). The crop is generally propagated from farmer to farmer by vine cuttings obtained from mature ware crops.

Sweet potato offers several advantages. The crop requires only low levels of inputs, can grow on degraded soils and is easily propagated from vines. Sweet potato is often regarded as a food security crop, having a flexible growing season over a 3–10-month period.
The crop also helps households cope with the slack season because sweet potato can be harvested well before the harvest season for other crops, at times when food shortages are common. Finally, sweet potato is a candidate of choice for biofortification, the breeding of micronutrients into crops to control vitamin A, iron and zinc deficiencies (Bouis et al., 2011). Indeed, different varieties of sweet potatoes have different nutritional value. Orange-fleshed sweet potato varieties have high beta-carotene content and represent a promising and cost-effective way to combat micronutrient deficiencies, which are prevalent throughout the developing world. There is mounting evidence that the introduction of orange-fleshed sweet potato can increase vitamin A intakes among children and women (Hotz et al., 2012) and reduce children’s diarrhea prevalence and duration (Jones & de Brauw, 2015). With the objective of spreading an “orange revolution”, several projects have been implemented to promote and disseminate orange-fleshed varieties (HarvestPlus, 2012; Miethbauer, 2015). Therefore, varietal information is important to accurately measure the health and nutrition implications of sweet potato diffusion.

In this study, we test the effectiveness of three household-based survey methods of identifying varietal adoption against the benchmark of DNA fingerprinting of sweet potato leaf samples. These are:
A) Elicitation from farmers with basic questions for the most widely planted variety;
B) Farmer elicitation on five sweet potato phenotypic attributes using a visual-aid protocol; and
C) Enumerator recording observations on five sweet potato phenotypic attributes from the visual-aid protocol by visiting the field.
Materials and Methods

Sweet potato improved varieties released in Ethiopia

In Ethiopia, the term improved variety is used to designate a variety which has been tested by breeders and evaluated for its superiority over existing (traditional or local) varieties (Ethiopian Ministry of Agriculture, 2013). The list of improved sweet potato varieties released in Ethiopia is provided in Table 1. Since 1990, a total of 25 improved sweet potato varieties have been released. Breeding and germplasm maintenance activities have been concentrated in the Southern Nations, Nationalities and Peoples’ Region (SNNPRS) and Oromia. Five orange-fleshed varieties have been released and promoted for their higher nutritional content: Koka-12, Guntutie, Kero, Kulfo and Tulla.

Table 1. Sweet potato improved varieties released by the national agricultural research system of Ethiopia, 1990-2013.

Variety        Year of release   Breeder
Tola           2012              Bako ARC
Ma’e           2010              Werer ARC
Jari           2008              Sirinka ARC
Birtukanie     2008              Sirinka ARC
Berkume        2007              Haramaya University
Adu            2007              Haramaya University
Balo           2006              Bako ARC
Ordollo        2005              Awassa ARC
Kero           2005              Awassa ARC
Tulla          2005              Awassa ARC
Kulfo          2005              Awassa ARC
Dimitu         2005              Bako ARC
Temesgen       2004              Awassa ARC
Beletech       2004              Awassa ARC
Belela         2002              Awassa ARC
Awassa-83      1997              Awassa ARC
Dubo           1997              Awassa ARC
Falaha         1997              Awassa ARC
Kudadie        1997              Awassa ARC
Damota         1997              Adet ARC
Bareda         1997              Awassa ARC
Guntutie       1997              Awassa ARC
Ogan-Sagan     unknown           Ministry of Agriculture
Koka 12        1987              Awassa ARC
Koka 6         1987              Awassa ARC

Source: Ethiopian Ministry of Agriculture, 2013. ARC = Agricultural Research Center

Crop descriptors follow a standard codification and are regarded as a universally understood language for germplasm data. The International Board for Plant Genetic Resources recommends a list of 26 descriptors related to the plant morphology, storage root and inflorescence (CIP, 1991). However, several descriptors can be difficult for non-specialists to assess.
In particular, many descriptors are recorded as an average value of measurement (for instance, length or size) or an average expression of the character. In contrast with most crops, sweet potato varieties exhibit a diversity of colors on different parts of the plant as well as a heterogeneity of leaf shapes. This makes the crop particularly interesting for testing a visual-aid survey protocol based on distinctive phenotypic attributes. For this purpose, available documents on the descriptors of sweet potato improved varieties were reviewed and interviews with specialists were conducted. Based on discussions with breeders, observation of plots and pre-testing of different protocols, we identified five phenotypic attributes that are relevant for sweet potato varietal identification and are more likely to be perceived by interviewees and enumerators. Indeed, visual-aid protocols offer advantages over the existing methods of data collection. Pictures have the potential to overcome language and translation barriers, which could be a considerable advantage for data quality. Earlier in the project, a visual-aid protocol had been included in the Ethiopian Socioeconomic Survey (ESS), which was implemented in 2015/16 by the Ethiopian Central Statistical Agency and the World Bank on a nationally representative sample of 3,800 rural households. In this study, visual-aid protocols, collecting information on the sweet potato variety’s skin color, flesh color, dominant leaf shape, vein color and vine color, were used as a survey instrument (Appendix A).

Varietal identification by phenotypic attributes

Recursive partitioning methods and classification trees (Breiman et al., 1984) provide a potential way to uniquely identify improved varieties on the basis of their descriptors. In our case, the response variable is a 20-level categorical variable of sweet potato improved varieties while the five phenotypic attributes are used as explanatory variables.
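As an illustration of the recursive splitting at the heart of this approach, the following is a minimal pure-Python sketch. It is not the authors' implementation: the variety labels (`V1` to `V5`), the attribute values and the function names are all hypothetical, the Gini impurity is assumed as the splitting criterion, and the cross-validated pruning step is omitted.

```python
from collections import Counter

# Hypothetical reference table, one row per variety. Labels and attribute
# values are invented for illustration and are not the study's descriptors.
REFERENCE = [
    {"variety": "V1", "flesh": "orange", "skin": "pink", "leaf": "hand", "vine": "green"},
    {"variety": "V2", "flesh": "white", "skin": "pink", "leaf": "heart", "vine": "green"},
    {"variety": "V3", "flesh": "white", "skin": "white", "leaf": "heart", "vine": "purple"},
    {"variety": "V4", "flesh": "white", "skin": "white", "leaf": "heart", "vine": "green"},
    {"variety": "V5", "flesh": "white", "skin": "white", "leaf": "heart", "vine": "green"},
]
ATTRIBUTES = ["flesh", "skin", "leaf", "vine"]


def gini(rows):
    """Gini impurity of the variety labels in rows (0 means a single variety)."""
    counts = Counter(r["variety"] for r in rows)
    n = len(rows)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(rows):
    """Return (attribute, value, weighted impurity) of the best binary split, or None."""
    best = None
    for attr in ATTRIBUTES:
        for value in sorted({r[attr] for r in rows}):
            left = [r for r in rows if r[attr] == value]
            right = [r for r in rows if r[attr] != value]
            if not left or not right:
                continue  # this attribute value does not separate the rows
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[2]:
                best = (attr, value, score)
    return best


def grow(rows):
    """Split recursively until a group is pure or cannot be split; return the leaves."""
    split = best_split(rows) if gini(rows) > 0 else None
    if split is None:
        return [sorted(r["variety"] for r in rows)]
    attr, value, _ = split
    left = [r for r in rows if r[attr] == value]
    right = [r for r in rows if r[attr] != value]
    return grow(left) + grow(right)


leaves = grow(REFERENCE)
for leaf in leaves:
    status = "uniquely identified" if len(leaf) == 1 else "shares all attributes"
    print(leaf, status)
```

In this toy table, `V4` and `V5` have identical attributes and therefore end up in the same leaf, which mirrors the situation in which several varieties share a path in a classification tree.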
The analysis, which generates a set of decision rules and predicts varieties, proceeds as follows. The first step is identifying the single variable which best splits the data into two groups. The data are separated, and then this process is applied separately to each subgroup, and so on, recursively, until the subgroups reach a minimum size or no further improvement can be made. The second step of the procedure consists of using cross-validation to trim back the full classification tree (Therneau et al., 2015). Results of the classification tree analysis are presented in Figure 1. Overall, the algorithm identified 13 different paths for the 20 varieties for which germplasm was collected and included in the reference library used for DNA fingerprinting. Eight sweet potato improved varieties are uniquely identified by the classification tree while the remaining 12 share common phenotypic traits with other varieties.

Figure 1. Varietal identification of sweet potato improved varieties using classification tree analysis.

Data collection

Field data were collected in January 2015 in Wolayita zone, a major sweet potato producing area in Ethiopia. Compared to the national average of 76 qt/ha, sweet potato yields in the Wolayita area stood at 107 qt/ha in the 2011/12 agricultural season (Central Statistical Agency, 2012). The survey was implemented in five different communities (kebelles): Buge, Ade Koyisha, Gacheno, Ofa Sere and Waja Kero (Figure 2) using snowball sampling. Although snowball sampling may introduce a bias in our sample, we are more interested in varietal diversity than in representativeness. Oral consent was obtained from all participants and the data were analyzed anonymously. Tablets equipped with the Open Data Kit application were used. The survey questionnaire included three modules. Module-1 captured information on the most widely grown variety, followed by basic questions on this variety.
The variety name given by the interviewee was repeated in each question throughout the questionnaire. Farmers were asked to report whether the sweet potato variety grown is a local or improved variety (referred to as “yakabababe zer” for a local variety and “mirit zer” for an improved one) and whether the variety was introduced by the government. In Module-2, the interviewees were asked about phenotypic attributes of the main variety they are growing, using the visual-aid protocol. The visual aid was presented to interviewees, who could identify the variety attributes at a distance from the plot. Then, the enumerator was accompanied by the farmer to the plot to answer Module-3, and the same five attributes were recorded by the enumerator. The plots were georeferenced and leaf tissues from 259 fields were collected with a unique ID and conserved in a plastic bag. At the end of the interview, farmers were asked to help in identifying other sweet potato growers around the area.

Figure 2. Research sites locations.

DNA extraction and genotyping by sequencing

All the samples collected from the farmers’ fields and the genotypes included in the reference library were extracted according to the Cetyl Trimethyl Ammonium Bromide (CTAB) method (Borges, 2009). To establish the library, we included all CIP genebank accessions (1004 samples) as well as 19 improved materials collected from the agricultural research centers of Awassa, Adami Tulu and Bako. Six improved materials could not be included in the reference library because they were either no longer maintained on research stations (Ordollo, Dubo, Adu and Balo) or were unlikely to be found in the variety collection area (Ma’e and Jari). For genotyping by sequencing, a combination of DArT complexity reduction methods and next generation sequencing platforms was used (Kilian et al., 2012; Courtois et al., 2013; Raman et al., 2014; Cruz et al., 2013).
Following the PstI-MseI method, sweet potato DNA samples were processed in digestion/ligation reactions principally as per Kilian et al. (2012) but replacing a single PstI-compatible adaptor with two different adaptors corresponding to two different Restriction Enzyme (RE) overhangs. The PstI-compatible adapter was designed to include an Illumina flowcell attachment sequence, a sequencing primer sequence and a “staggered”, varying-length barcode region, similar to the sequence reported by Elshire et al. (2011). The reverse adapter contained a flowcell attachment region and an MseI-compatible overhang sequence. Only “mixed fragments” (PstI-MseI) were effectively amplified in 30 rounds of PCR. After PCR, equimolar amounts of amplification products from each sample of the 96-well microtiter plate were bulked and applied to c-Bot (Illumina) bridge PCR, followed by sequencing on an Illumina Hiseq2000. The sequencing (single read) was run for 77 cycles.

Sequences generated from each lane were processed using proprietary DArT analytical pipelines. In the primary pipeline, the fastq files were first processed to filter away poor-quality sequences, applying more stringent selection criteria to the barcode region than to the rest of the sequence. In that way, the assignments of the sequences to specific samples carried out in the “barcode split” step were very reliable. Approximately 1,400,000 sequences per barcode/sample were identified and used in marker calling. Finally, identical sequences were collapsed into “fastqcoll files”. The fastqcoll files were “groomed” using DArT PL’s proprietary algorithm, which corrects a low-quality base in a singleton tag to the correct base, using collapsed tags with multiple members as a template. The “groomed” fastqcoll files were used in the secondary pipeline for DArT PL’s proprietary SNP and SilicoDArT (presence/absence of restriction fragments in representation) calling algorithms (DArTsoft14).
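The DArT pipelines are proprietary, so their internals cannot be reproduced here. Purely as a generic illustration of what a “barcode split” followed by the collapsing of identical sequences involves, the sketch below assigns reads to samples by their leading barcode and then counts identical tags. The barcodes, reads, sample names and function names are all hypothetical, and quality filtering and grooming are left out.

```python
from collections import Counter

# Hypothetical barcodes of varying length (no barcode is a prefix of another)
# and hypothetical reads; none of these come from the study.
BARCODES = {"ACGT": "sample_1", "TGCAA": "sample_2"}
READS = ["ACGTTTGACC", "ACGTTTGACC", "TGCAAGGATC", "ACGTTTGACA", "NNNNTTGACC"]


def split_by_barcode(reads, barcodes):
    """Assign each read to a sample by its leading barcode and strip the barcode.

    Reads whose start matches no known barcode are discarded.
    """
    by_sample = {sample: [] for sample in barcodes.values()}
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                by_sample[sample].append(read[len(barcode):])
                break
    return by_sample


def collapse(tags):
    """Collapse identical sequences into a tag -> read count mapping."""
    return Counter(tags)


by_sample = split_by_barcode(READS, BARCODES)
collapsed = {sample: collapse(tags) for sample, tags in by_sample.items()}
for sample, counts in collapsed.items():
    print(sample, dict(counts))
```

A tag seen only once (a singleton) next to a near-identical tag seen many times is the kind of case that a grooming step would examine, although how DArT PL's algorithm decides on a correction is not described in the text.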
For SNP calling, all tags from all libraries included in the DArTsoft14 analysis are clustered using DArT PL’s C++ algorithm at the threshold distance of 3, followed by parsing of the clusters into separate SNP loci using a range of technical parameters, especially the balance of read counts for the allelic pairs. In addition, multiple samples were processed from DNA to allelic calls as technical replicates, and scoring consistency was used as the main selection criterion for high-quality/low-error-rate markers. From the 259 leaf tissue samples collected, a total of 231 samples were DNA fingerprinted.

Results

DNA analysis identified 63% of samples from farmers’ fields as improved varieties. Five improved varieties were identified in the surveyed area: Awassa-83, Berkume, Kudadie, Ogan-Sagan and Kulfo/Tulla. The latter varieties, both orange-fleshed, were considered genetically identical. The most common improved types were Kulfo/Tulla (22%), Awassa-83 (20%) and Ogan-Sagan (12%). The remaining improved varieties were identified in only a few samples.

Table 2 summarizes the accuracy of estimates of the adoption of improved varieties by each method of data collection. The accuracy of data derived from the three methods is evaluated against the benchmark of varietal identification established through DNA fingerprinting.

Table 2. Summary results of improved varieties adoption estimates established through DNA fingerprinting and derived from the three methods.

                                                         Method A   Method B   Method C
% True positive (improved when improved)                    43         47         50
% True negative (local when local)                          16          4          6
% False positive (Type I error: improved when local)        20         33         31
% False negative (Type II error: local when improved)       19         16         13

The first result is that all methods appear less accurate than the DNA fingerprinting benchmark.
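The four cells of Table 2 can be combined into summary figures with simple arithmetic. The sketch below is one possible way of doing so and is not part of the original analysis: the percentages are copied from Table 2, while the function name `summarize` and the derived quantities (the adoption rate implied by each method, the adoption rate implied by DNA fingerprinting, overall agreement, and the share of DNA-confirmed improved samples that the method also reports as improved) are choices made here for illustration.

```python
# Percentages copied from Table 2 (shares of all fingerprinted samples).
# The Method A column sums to 98 rather than 100 in the source table.
TABLE_2 = {
    "A": {"tp": 43, "tn": 16, "fp": 20, "fn": 19},
    "B": {"tp": 47, "tn": 4, "fp": 33, "fn": 16},
    "C": {"tp": 50, "tn": 6, "fp": 31, "fn": 13},
}


def summarize(cells):
    """Derive summary figures from one column of Table 2."""
    return {
        # Share of samples the survey method reports as improved.
        "survey_adoption": cells["tp"] + cells["fp"],
        # Share of samples that DNA fingerprinting classifies as improved.
        "dna_adoption": cells["tp"] + cells["fn"],
        # Share of samples on which the method and the DNA benchmark agree.
        "agreement": cells["tp"] + cells["tn"],
        # Percentage of DNA-confirmed improved samples also reported as improved.
        "sensitivity": 100 * cells["tp"] / (cells["tp"] + cells["fn"]),
    }


for method, cells in TABLE_2.items():
    summary = summarize(cells)
    print(method, {key: round(value, 1) for key, value in summary.items()})
```

The `dna_adoption` figure is close to the 63% of samples identified as improved by DNA analysis for all three methods, which serves as a consistency check on the table.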
Method A indicates that farmer elicitation (“Is this variety a traditional or improved variety?”) identifies improved varieties with limited accuracy: twenty percent of farmers identified a variety as improved when in fact it was local, and 19% identified a variety as local when it was in fact improved. As an alternative way of identifying improved varieties, the survey asked: “Has this variety been introduced by the government?”. Results demonstrate that this question was even less accurate, with less than half of improved varieties (46%) identified as such by this alternative question. Based on improved varieties’ phenotypic attributes, methods B and C respectively identified 47% and 50% of improved varieties correctly, thus representing a slight improvement in accuracy over the existing method of collection, method A. These two methods, however, delivered a higher number of false positive results (Table 2).

Method A: Interviewee’s self-report without visual aid

Three-quarters of the farmers grew only one variety. Over 18 different names were given by farmers to describe the varieties of sweet potato they planted. The most frequently cited names were Wolayita, Gadissa, Fisisa and FAO. One-tenth of the interviewees could not name the variety they grew. As shown in Figure 3, the variety names given by farmers mapped inconsistently to improved varieties: only 4% of improved varieties were correctly identified by name. It is notable that an overall signal, where some varieties are largely linked to a single name, can be observed (Awassa-83 and Kulfo/Tulla). We also note that the common responses that respondents provided as a local variety name were “Wolayita” (the name of the area) and “unknown”. However, there is no consistent pattern regarding the extent and direction of the error, that is, whether adoption of a specific variety is over- or underestimated.
These results suggest that in informal seed systems, as in the case of sweet potato in southern Ethiopia, two issues arise when attempting to identify improved varieties by name. First, farmers may refer to a specific improved variety by giving it a different name. The name of the agricultural officer who promoted the variety is often accepted as the variety’s name; this case was encountered for Awassa-83, largely referred to as “Gadissa”, and Kulfo/Tulla, referred to as “Fisisa”. Second, it is difficult to rule out misspelling, which may result in several misclassification cases. For example, names that sound similar, such as “FAO” and “Fino” or “Tula” and “Tulo”, give reason to suspect widespread measurement error.

Figure 3. Sankey diagram capturing the relationship between sweet potato varieties identified through DNA fingerprinting and sweet potato variety names given by farmers. The bars indicate percentage of total varieties while lines describe the relationship.

Using the question “Is this variety a traditional or improved variety?”, only 64% of improved varieties assessed through DNA fingerprinting were identified as such by farmers. Overall, 20% of farmers identified a variety as improved when it was not (false positive) and 30% identified a variety as local when it was in fact improved (false negative). As an alternative way of identifying improved varieties, the survey asked: “Has this variety been introduced by the government?” Results demonstrate that this question was even less accurate, with less than half of improved varieties (46%) identified as such by this alternative question.

Methods B (interviewee with visual-aid) and C (enumerator observation)

Although Method B, the use of visual-aid protocols, represents an improvement in accuracy over farmers’ elicitations (where only 4% of varieties were correctly identified by name), there is still a large amount of measurement error.
Farmers’ answers on phenotypic characteristics rarely match the correct varieties, and the information provided by the visual-aid protocol failed to uniquely identify improved varieties (Table 3). In all cases except Kulfo/Tulla, having an enumerator visit the field provided only a small improvement over asking the interviewee the question directly, and method C also provided results that are well below the DNA fingerprinting benchmark. Another important observation in Table 3 is that varieties with colorful attributes such as Kulfo/Tulla were more easily identified: out of 50 orange-fleshed samples, 38 were correctly identified by both methods B and C.

Table 3. Varietal identification of sweet potato improved varieties established through DNA fingerprinting and derived from the three methods (n=146).

Variety               Outcome           Method A   Method B   Method C
Awassa-83 (n=47)      Correct               1         15         25
                      False positive       46         32         22
                      False negative        1          3          7
Berkume (n=12)        Correct               0          2          5
                      False positive       12         10          7
                      False negative        0          4          6
Kudadie (n=9)         Correct               0          0          0
                      False positive        9          9          9
                      False negative        0          5          8
Kulfo/Tulla (n=50)    Correct               8         38         38
                      False positive       42         12         12
                      False negative        3          5          6
Ogan-Sagan (n=28)     Correct               0          7          9
                      False positive       28         21         19
                      False negative        0          2          9

It is important to understand which phenotypic attributes were accurately identified by both methods, and which were not. Figure 4, which explores this question, delivers three important messages. First, skin, flesh and vein colors were perceived correctly by interviewees as well as enumerators in more than 80% of the cases. Second, of all phenotypic attributes, data collected on leaf types were found to be the most inaccurate. Among the different types, the hand-shaped leaf type, typical for the orange-fleshed varieties Kulfo and Tulla, was the only one that was easily identified by interviewees (84% accuracy) and enumerators (86% accuracy).
Other leaf types were poorly identified: only half of heart-shaped leaf types were correctly identified by both methods of data collection, and leaf type 4, typical of the Kudadie variety, was accurately identified by 56% of interviewees and only one-third of enumerators. However, as the five improved varieties identified through DNA fingerprinting have only three different types of leaves, we were not able to explore all sweet potato leaf types. Finally, with the exception of the vine color, having the enumerator visit the field provided only a slightly better (flesh and vein) or even a slightly lower (skin and leaf type) accuracy than the farmer’s response.

Figure 4. Accuracy of data collected on five sweet potato phenotypic attributes.

Discussion

The objective of this study was to compare different methods of data collection for sweet potato varietal identification. The gold standard represented by DNA fingerprinting validation is compared to other low-cost, easy-to-implement methods. Crop germplasm improvement is a major activity of agricultural research centers throughout the world and varietal identification is central to assessing its contribution and impact. In addition, different varieties of sweet potatoes have different nutritional value and there is a strong interest among development agents in assessing the extent of adoption of biofortified sweet potato varieties. All methods were found to be less accurate than the DNA fingerprinting benchmark. Data quality may suffer since information from these methods proved to be unreliable. Regarding sweet potato varietal identification, a wider use of DNA fingerprinting seems unavoidable.

Implemented throughout Sub-Saharan Africa, household surveys are the most common source of data for modern crop varietal adoption. The surveys typically ask the most knowledgeable person in the sampled farm household.
However, the results presented earlier show that farmers were not able to identify improved varieties. Moreover, farmers’ identification of improved varieties by name only delivered fuzzy varietal identification. The fact that most, if not all, agricultural surveys rely on farmers’ elicitation raises concerns about the accuracy of the data collected by the traditional approaches. These results may highlight the importance of social factors and plant crop exchanges between farmers. It is understandable that a variety adopted by a farmer decades ago would be described as local, while it is in fact an improved variety that has been introduced as a result of a process of publicly funded agricultural research. The informal nature of the sweet potato seed system makes it even harder for farmers to assess sweet potato variety types. Whether household surveys under- or overestimate adoption is context- and crop-specific, and staple crops such as maize, wheat and teff may be more accurately identified by farmers.

This paper contributes to the literature by introducing an innovative and reproducible method to track sweet potato improved varieties. Visual-aid varietal identification protocols are low-cost and have the potential to fit into many existing agricultural surveys. In addition, using pictures overcomes language barriers, an important constraint throughout Sub-Saharan Africa. The result that the visual-aid protocols employed in methods B and C were better than method A, but still far below the adoption estimates given by the DNA fingerprinting method, is striking, and the question whether visual-aid protocols do represent a useful tool for tracking improved varieties deserves to be asked. Our results also indicate that visual-aid protocols based on colors may perform better than those relying on shapes and forms of plant attributes.
The development of visual-aid protocols in other contexts should be encouraged: this method is certainly helpful in identifying varieties that have very distinctive phenotypic attributes, as is the case with orange-fleshed sweet potatoes, and can offer low-cost improvements in data quality over traditional survey questions. Our study is not without limitations. First, it is clear that more evidence is needed in different contexts, and for a variety of crops. It should be noted that methods A, B or C could work for other crops and other seed systems so further experimentation should not be ruled out. Our survey data do not allow us to explore the relationship between self-report errors by farmers and observable characteristics of the farmer. While one could hypothesize that, for example, better educated farmers would be more able to provide accurate answers, arguably the informality in the seed system is the more binding constraint to more accurate survey-based identification. As a viable tool to obtain accurate estimates of modern variety adoption, the use of DNA fingerprinting should be encouraged in future studies. Its implementation in large-scale household surveys in Sub-Saharan Africa represents a substantial challenge but one that is worthy of significant future research efforts. Without the combination of accurate varietal identification and comprehensive socioeconomic and agricultural data for the same farms, assessing the impact of adoption of improved varieties (on productivity, and further to income effects for farmers) remains a formidable challenge. More and more countries are acquiring the technical capacities to extract DNA from field samples and to carry out genotyping. In addition, the costs of DNA fingerprinting are declining and will continue to do so in the coming decade. In the meantime, more evidence is needed to assess whether DNA fingerprinting should be used as a complementary or an essential part of crop varietal identification. 
Appendix A: Sweet potato varietal identification protocol

1. When [VARIETY NAME] is cut after harvest, what is the color of the flesh? [__]
   1 = White
   2 = Orange

2. After harvest, what does the skin color look like? [__]
   1 = White
   2 = Pink/Red

3. What is the dominant shape of the leaves for [VARIETY NAME]? [__]
   Enumerator: let the interviewee think and take time. Then, ask for confirmation.
   1 = Heart-shaped/triangular
   2 = With 3 nodes
   3 = Hand-shaped – 5 fingers (narrow)
   4 = With 5 fingers (1 major and 4 minor)
   5 = Heart-shaped with teeth
   6 = With 5 fingers

4. What is the color of the vine for [VARIETY NAME]? [__]
   1 = Green
   2 = Purple

5. What is the color at the back of the leaves for [VARIETY NAME]? [__]
   1 = Green
   2 = Purple
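For data entry, the five-question protocol in Appendix A can be represented as a small codebook against which recorded answers are checked. The sketch below is one possible encoding: the field names (flesh_color, skin_color, leaf_shape, vine_color, leaf_back_color) and both helper functions are hypothetical and do not come from the survey instrument; only the numeric codes and their labels follow the appendix.

```python
# Codebook for the Appendix A protocol. Field names are illustrative
# assumptions; codes and labels follow the questionnaire.
PROTOCOL = {
    "flesh_color": {1: "White", 2: "Orange"},
    "skin_color": {1: "White", 2: "Pink/Red"},
    "leaf_shape": {
        1: "Heart-shaped/triangular",
        2: "With 3 nodes",
        3: "Hand-shaped, 5 fingers (narrow)",
        4: "With 5 fingers (1 major and 4 minor)",
        5: "Heart-shaped with teeth",
        6: "With 5 fingers",
    },
    "vine_color": {1: "Green", 2: "Purple"},
    "leaf_back_color": {1: "Green", 2: "Purple"},
}


def validate_response(response):
    """Return a list of problems found in one recorded response.

    An empty list means every question has an answer with a valid code.
    """
    problems = []
    for field, codes in PROTOCOL.items():
        if field not in response:
            problems.append(f"{field}: missing")
        elif response[field] not in codes:
            problems.append(f"{field}: invalid code {response[field]!r}")
    return problems


def decode_response(response):
    """Translate a valid response from numeric codes to their labels."""
    problems = validate_response(response)
    if problems:
        raise ValueError("; ".join(problems))
    return {field: PROTOCOL[field][response[field]] for field in PROTOCOL}


if __name__ == "__main__":
    # Hypothetical answer sheet for one variety (illustration only).
    answers = {"flesh_color": 2, "skin_color": 2, "leaf_shape": 1,
               "vine_color": 1, "leaf_back_color": 1}
    print(validate_response(answers))
    print(decode_response(answers))
```

Keeping the codes in a single dictionary means that the same structure can drive validation at data entry and decoding at analysis time, so the two cannot drift apart.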