20731
May 2000

Volume one

Designing Household Survey Questionnaires for Developing Countries
Lessons from 15 years of the Living Standards Measurement Study

Edited by Margaret Grosh and Paul Glewwe

The World Bank

"Household surveys are essential for the analysis of most policy issues. This book has carefully assessed recent experience and developed today's best-practice technique for household surveys. Indeed, much of this technique was developed and pioneered by the authors. This book is clear, systematic, and well structured. It is also wise and scholarly. It will be indispensable to anyone involved in carrying out or analyzing household surveys, and thus it is required reading for all those who wish to take evidence seriously when they think about policy."
-Nicholas Stern, senior vice president, Development Economics and chief economist, the World Bank

"This book is an ambitious undertaking, but it quickly exceeded my expectations. It has many strengths: ... comprehensiveness, ... emphasis on practical application, ... and a sense of balance. For both my domestic and international survey research, this volume will serve as a valued reference tool that I will consult regularly."
-David R. Williams, professor of sociology and senior research scientist, Survey Research Center, University of Michigan

"This is a comprehensive guide to planning household surveys on a range of socioeconomic topics in developing countries. It is authoritative, clear, and balanced. The work is a valuable addition to the library of any survey statistician or data analyst concerned with socioeconomic surveys in the developing world."
-William Seltzer, former head, United Nations Statistical Office

Household survey data are essential for assessing the impact of development policy on the lives of the poor. Yet for many countries household survey data are incomplete, unreliable, or out of date. This handbook is a comprehensive treatise on the design of multitopic household surveys in developing countries. It draws on 15 years of experience from the World Bank's Living Standards Measurement Study surveys and other household surveys conducted in developing countries.

The handbook covers key topics in the design of household surveys, with many suggestions for customizing surveys to local circumstances and improving data quality. Detailed draft questionnaires are provided in written and electronic format to help users customize surveys.

This handbook serves several audiences:
* Survey planners from national statistical and planning agencies, universities, think tanks, consulting firms and international organizations.
* Those working on either multitopic or topic-specific surveys.
* Data users, who will benefit from understanding the challenges, choices, and tradeoffs involved in data collection.

Copyright © 2000 The International Bank for Reconstruction and Development/THE WORLD BANK
1818 H Street, N.W.
Washington, D.C. 20433, U.S.A.

All rights reserved
Manufactured in the United States of America
First printing May 2000

The findings, interpretations, and conclusions expressed in this paper are entirely those of the author(s) and should not be attributed in any manner to the World Bank, to its affiliated organizations, or to members of its Board of Executive Directors or the countries they represent.
The World Bank does not guarantee the accuracy of the data included in this publication and accepts no responsibility for any consequence of their use. The boundaries, colors, denominations, and other information shown on any map in this volume do not imply on the part of the World Bank Group any judgment on the legal status of any territory or the endorsement or acceptance of such boundaries.

The material in this publication is copyrighted. The World Bank encourages dissemination of its work and will normally grant permission promptly.

Permission to photocopy items for internal or personal use, for the internal or personal use of specific clients, or for educational classroom use, is granted by the World Bank, provided that the appropriate fee is paid directly to Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, U.S.A., telephone 978-750-8400, fax 978-750-4470. Please contact the Copyright Clearance Center before photocopying items.

For permission to reprint individual articles or chapters, please fax your request with complete information to the Republication Department, Copyright Clearance Center, fax 978-750-4470.

All other queries on rights and licenses should be addressed to the World Bank at the address above or faxed to 202-522-2422.

ISBN: 0-19-521595-8

Library of Congress Cataloging-in-Publication Data has been applied for.

Contents

Foreword
Acknowledgments
Contributors

Volume 1

Part 1 Survey Design

1. Introduction
    Margaret Grosh and Paul Glewwe
2. Making Decisions on the Overall Design of the Survey
    Margaret Grosh and Paul Glewwe
3. Designing Modules and Assembling Them into Survey Questionnaires
    Margaret Grosh, Paul Glewwe, and Juan Muñoz

Part 2 Core Modules

4. Metadata-Information about Each Interview and Questionnaire
    Margaret Grosh and Juan Muñoz
5. Consumption
    Angus Deaton and Margaret Grosh
6. Household Roster
    Paul Glewwe
7. Education
    Paul Glewwe
8. Health
    Paul J. Gertler, Elaina Rose, and Paul Glewwe
9. Employment
    Julie Anderson Schaffner
10. Anthropometry
    Harold Alderman
11. Transfers and Other Nonlabor Income
    Andrew McKay
12. Housing
    Stephen Malpezzi
13. Community and Price Data
    Elizabeth Frankenberg

Volume 2

Part 3 Additional Modules

14. Environmental Issues
    Dale Whittington
15. Fertility
    Indu Bhushan and Raylynn Oliver
16. Migration
    Robert E. B. Lucas
17. Should the Survey Measure Total Household Income?
    Andrew McKay
18. Household Enterprises
    Wim P. M. Vijverberg and Donald C. Mead
19. Agriculture
    Thomas Reardon and Paul Glewwe
20. Savings
    Anjini Kochar
21. Credit
    Kinnon Scott
22. Time Use
    Andrew S. Harvey and Maria Elena Taylor

Part 4 Special Topics

23. Recommendations for Collecting Panel Data
    Paul Glewwe and Hanan Jacoby
24. Intrahousehold Analysis
    Nobuhiko Fuwa, Shahidur R. Khandker, Andrew D. Mason, and Tara Vishwanath
25. Qualitative Data Collection Techniques
    Kimberly Chung
26. Basic Economic Models and Econometric Tools
    Jere R. Behrman and Raylynn Oliver

Volume 3

Draft Questionnaire Modules

Introduction
Module for Chapter 4: Metadata
    Margaret Grosh and Juan Muñoz
Module for Chapter 5: Consumption
    Angus Deaton and Margaret Grosh
Module for Chapter 6: Household Roster
    Paul Glewwe
Module for Chapter 7: Education
    Paul Glewwe
Module for Chapter 8: Health
    Paul J. Gertler, Elaina Rose, and Paul Glewwe
Module for Chapter 9: Employment
    Julie Anderson Schaffner
Module for Chapter 10: Anthropometry
    Harold Alderman
Module for Chapter 11: Transfers and Other Nonlabor Income
    Andrew McKay
Module for Chapter 12: Housing
    Stephen Malpezzi
Module for Chapter 13: Community Data
    Elizabeth Frankenberg
Module for Chapter 14: Environment
    Dale Whittington
Module for Chapter 15: Fertility
    Indu Bhushan and Raylynn Oliver
Module for Chapter 16: Migration
    Robert E. B.
Lucas
Module for Chapter 18: Household Enterprise
    Wim P. M. Vijverberg and Donald C. Mead
Module for Chapter 19: Agriculture
    Thomas Reardon and Paul Glewwe
Module for Chapter 20: Savings
    Anjini Kochar
Module for Chapter 21: Credit
    Kinnon Scott
Module for Chapter 22: Time Use
    Andrew S. Harvey and Maria Elena Taylor
Module for Chapter 23: Panel Data
    Paul Glewwe and Hanan Jacoby

Foreword

Multitopic household surveys have become an indispensable instrument for understanding development. They are fundamental to serious microeconomic analysis of the incentive and distributional aspects of policy, and therefore to the analysis of most policy issues. Researchers draw on them to test behavioral theories. Policymakers need them to assess public interventions. The development community uses them to locate the poor. Developing countries, without adequate household survey data, are forced to make policy decisions in an environment with many blind spots, where crucial information can be seen only dimly or not at all.

Household surveys are also expensive, both in terms of money and institutional capacity. Ultimately their value depends on their design and execution. Errors in their design or execution are wasteful, and can lead to policies that are harmful to the poor. It is therefore important to design and implement surveys correctly from the outset.

Margaret Grosh and Paul Glewwe have put together one of the most comprehensive and informative documents ever written on the design, implementation, and use of household surveys in developing countries. If you are engaged in any of these tasks, this book is essential reading.

The household surveys treated in this book truly are multitopic surveys, covering such topics as household size and composition, education, health, anthropometry, fertility, income and consumption, employment, agricultural production, household enterprises, transfers and nonlabor income, savings and credit, housing, the environment, migration, and time use. The editors have greatly increased the value of the basic approach by incorporating chapters on community data, panel data, and the allocation of resources within the household.

As the World Bank and other development organizations increase their efforts to reduce poverty and raise living standards in developing countries in the 21st century, the need for comprehensive, reliable and up-to-date information on economic and social conditions in these countries will be greater than ever. The vast store of knowledge in this book will contribute significantly to meeting this need. Failure to use this knowledge will consign policymakers to making their decisions without adequate information for many years to come, while systematic use of this knowledge will do much more for the poor than the innumerable speeches made and summits convened on their behalf.

Lyn Squire
Director, Global Development Network
World Bank

Acknowledgments

A project of this size and scope depends on many people playing many roles. Space limitations preclude us from naming all of the hundreds of people who made contributions along the way, but we would like to acknowledge some of the most important.

The authors of the individual chapters deserve thanks for their gracious willingness to go through many rounds of revisions, spread over a longer time than anyone originally envisioned. Producing a book on the design of multitopic questionnaires requires much more cooperation among authors and several more iterations than does the standard edited volume. We are extremely grateful for the forbearance of the authors in this difficult process. The authors themselves were helped by a large number of peer reviewers. They are recognized in the individual chapters, but we would like to extend our thanks to them here as well.

Much of the work in these volumes was based on past practice in LSMS and other household surveys including, but not limited to, the World Fertility/Demographic and Health Surveys, the RAND Family Life surveys, the Social Dimensions of Adjustment surveys, and several special topic surveys such as household budget surveys, water and sanitation surveys, housing surveys, and time use surveys. While the authors pulled together the lessons from past experience, it is also important to acknowledge the irreplaceable contributions made by the thousands and thousands of household members who served as respondents, the dozens of agencies that implemented the surveys, the many agencies that provided technical assistance and funding, and the academic participants who provided advice and criticism over the years.

The project as a whole was strongly supported from original vision to final printing by our immediate manager for most of that time, Emmanuel Jimenez, who provided us with useful technical input and a great deal of enthusiasm, patience, and bureaucratic support. We also greatly appreciate the support of his directors, Lyn Squire and Paul Collier. The project was primarily financed by a grant from the World Bank Research Committee (679-61), managed by Greg Ingram and administered by Clara Else.

Many people reviewed the book and project as a whole. We greatly appreciate these contributions by Pat Anderson, Jere Behrman, Elisa Lustosa Caillaux, Courtney Harold, John Hoddinott, Anna Ivanova, Alberto Martini, Raylynn Oliver, Prem Sangraula, Salman Zaidi, and three anonymous reviewers. To have input on the project as a whole from these outsiders was very helpful. In addition, participants at three workshops held at the World Bank, plus various training events sponsored jointly by the World Bank and the Inter-American Development Bank, critiqued the project while it was in progress.

In the course of creating the book, Diane Steele answered questions from all authors on the details of LSMS data sets. Fiona Mackintosh edited early drafts and helped to transform the disparate chapters into a single whole. Lyn Tsoflias provided us with valuable research assistance. Word processing and conference logistics were ably handled by Thomas Hastings, Patricia Sader, Jim Schafer, and Daniel O'Connell. Questionnaire layout was mastered by Thomas Hastings, Andrea Ramirez, and Heidi Van Schooten. Contracting support from Liliana Longo, Selina Khan, and Patricia Sader was timely and organized. The final editing, layout, and design were handled dextrously by Meta de Coquereaumont, Wendy Guyette, Kate Hull, Daphne Levitas, Heidi Manley, Laurel Morais, and Derek Thurber, all with Communications Development Inc. Communication with the World Bank's Publications Committee and with the publishers and printers was efficiently handled by Paola Scalabrin and Randi Park.

Finally, effusive and endless thanks to our families and friends who put up with the excessively long hours that we spent on this project, who cheered and calmed us through the frustrating times, and who helped us to bring this long project to a successful conclusion without completely losing track of other important aspects of our personal and professional lives.

Contributors

Harold Alderman, World Bank
Jere R. Behrman, University of Pennsylvania
Indu Bhushan, Asian Development Bank
Kimberly Chung, Michigan State University
Angus Deaton, Princeton University
Elizabeth Frankenberg, RAND
Nobuhiko Fuwa, World Bank
Paul J. Gertler, University of California, Berkeley
Paul Glewwe, World Bank and University of Minnesota
Margaret Grosh, World Bank
Andrew S. Harvey, St. Mary's University, Halifax, N.S.
Canada
Hanan Jacoby, World Bank
Shahidur R. Khandker, World Bank
Anjini Kochar, Stanford University
Robert E. B. Lucas, Boston University
Stephen Malpezzi, University of Wisconsin
Andrew D. Mason, World Bank
Andrew McKay, University of Nottingham, United Kingdom
Donald C. Mead, Michigan State University
Juan Muñoz, Sistemas Integrales
Raylynn Oliver, Consultant, World Bank
Thomas Reardon, Michigan State University
Elaina Rose, University of Washington
Julie Anderson Schaffner, Fletcher School of Law and Diplomacy, Tufts University
Kinnon Scott, World Bank
Maria Elena Taylor, St. Mary's University, Halifax, N.S., Canada
Wim P. M. Vijverberg, University of Texas at Dallas
Tara Vishwanath, World Bank
Dale Whittington, University of North Carolina at Chapel Hill

Volume 1

Part 1 Survey Design

1. Introduction
Margaret Grosh and Paul Glewwe

Accurate, up-to-date, and relevant data from household surveys are essential for governments to make sound economic and social policy decisions. Governments need these data to measure and monitor poverty, employment and unemployment, school enrollment, health and nutritional status, housing conditions, and other dimensions of living standards. They need the data to determine whether schools, health clinics, agriculture extension services, roads, electric power, and other basic services are reaching the poor and other disadvantaged groups. And analysts need household survey data to model economic behavior and thus provide answers to such important policy questions as: How would changes in food subsidies affect the population's nutritional status? Would increasing fees for public schools reduce school enrollment, and how much revenue would be raised by such fee increases? Who would participate in a new labor-intensive public works program, and what would be the net benefit for participants? How would changes in the price of fertilizer affect farmers' production of different crops?

One way to collect the data needed to answer these questions is to conduct separate household surveys on each topic, that is, to conduct a labor force (employment) survey, a health survey, a housing survey, and so forth. Alternatively, data on many different topics can be collected in a single survey. Such a "multi-topic" household survey, which has many advantages, is the type of survey considered in this book.

Household surveys are not a new invention. Stigler (1954) points out that systematic collection of data from households began over 200 years ago. The first known efforts were the collection of family budgets in England by Davies (1795) and Eden (1797). In the 1800s similar data were collected in Saxony, Prussia, Belgium, the United States, and undoubtedly other places as well. The motivation for much of this research was to focus public attention on the plight of the poor. By the mid-1800s, generalizations about household behavior were being drawn from these data. For example, Ducpetiaux's 1855 study of 200 Belgian households was used by Ernst Engel to derive his classic law that the fraction of a household's budget devoted to food falls as income rises.

The statistical theory that supports modern survey methods was developed in the 1920s. This led to the establishment of high-caliber nationwide surveys in many countries, especially after World War II. Developing countries also participated in this phenomenon; for example, India's annual National Sample Survey began in 1950. With the advent of modern computing, and especially the appearance of powerful personal computers, the collection and analysis of household survey data has expanded rapidly in both developed and developing countries. (See Deaton 1997 for a brief review of household surveys in the 20th century.)

Since 1970 several major international programs have been organized to support the collection of household survey data in developing countries. Among the largest such programs have been the United Nations Household Survey Capability Program, the World Fertility Surveys (which later became the Demographic and Health Surveys), and the World Bank's Living Standards Measurement Study (LSMS) survey program. Other organizations, including the International Food Policy Research Institute, the RAND Corporation, and Cornell University, have also carried out household surveys in developing countries. Some U.N. organizations regularly participate in single-topic household surveys in developing countries, such as employment surveys done in collaboration with the International Labour Office. And two regional survey programs have been strongly influenced by, and indeed have grown directly out of, the World Bank's LSMS program. The first of these, the Social Dimensions of Adjustment (SDA) program for Sub-Saharan Africa, was supported by a consortium of agencies and administered by the World Bank. The second and more recent regional survey program, the Improving Surveys of Living Conditions program for Latin America, is sponsored jointly by the Inter-American Development Bank, the World Bank, and the Economic Commission for Latin America. (The Spanish name for this program is Mejoramiento de las Encuestas de Condiciones de Vida; it is often referred to by its Spanish acronym, MECOVI.) The surveys done under the LSMS, SDA, and MECOVI programs are all multi-topic surveys.

Because of these and other efforts, household survey data are now much more widely available than they were 10 or 20 years ago. World Bank statistics on the extent of poverty in 1985 were based on data from only 22 of 86 developing countries. Although these 22 countries accounted for 76 percent of the population of the 86 countries (Ravallion, Datt, and van de Walle 1991), it is significant that at that time no reliable data existed for three-fourths of the developing countries. Similar calculations currently underway are based on data from about 70 of 100 developing and transition countries, covering about 88 percent of the total population of the countries. Data for more than one point in time are now available for 50 countries. Coverage has grown the most in the region where it was lowest, Sub-Saharan Africa. In the 1985 calculation only 6 percent of Sub-Saharan Africa's population was represented, while recent estimates cover 66 percent of this population (Ravallion and Chen 1998). Finally, the time lag between collection and dissemination of the data is getting smaller. In the 1985 World Development Report the average lag was 11 years, so the average survey date was 1974. Now the lag is only five years (Ravallion and Chen 1997).

The surge in the collection of household survey data in developing countries has greatly increased the demand for knowledge on how best to design and implement such surveys. Moreover, the growing number of surveys provides a vast amount of experience from which to draw lessons. Yet until now it has often been difficult for those planning a new survey (especially one in a developing country) to find out about the experiences of previous household surveys: what was tried, the factors that influenced decisionmaking, what worked, and, most importantly, what did not work. The formal literature is scattered across disciplines (statistics, economics, sociology, psychology) and often contained in conference proceedings or government document series that are not widely indexed and are seldom available outside the country where they were written. An additional limitation is that a considerable amount of the formal literature pertains to surveys in industrialized countries. While much can be learned from such literature, it is still unclear how well the literature applies to settings with lower literacy rates, different income levels and employment and consumption patterns, and differing factors that affect the social interaction of the interview.

Much of the experience of surveying in developing countries is poorly documented. Statistical institutes in developing countries have little money or staff to devote to experimentation or research; their mandate is production and their resources are few. Articles published in the formal academic literature that use the data from these surveys typically provide only a brief description of the data used. They may contain some hints about whether the data collection methods worked, but almost by definition, data collection efforts that failed usually do not lead to academic publications. Household survey questionnaires and their associated statistical abstracts (reports) contain both implicit and explicit information, but they are sometimes available only in the country in which they were administered. Moreover, statistical abstracts tend to minimize any problems that may have been associated with a survey because the statistical agencies that produce these abstracts do not want to publicize a survey's shortcomings. In principle, the most useful information for the designers of future surveys would be the internal memoranda and informal notes of the agencies and people involved in designing and implementing past surveys. However, these are rarely filed and seldom preserved after a survey is completed, much less systematically made available to people outside the agency.

The Objective and Audience for this Book

The objective of this book is to provide detailed advice on how to design multi-topic household surveys, based on the experience of past household surveys. This book will help individuals and organizations that are planning a comprehensive, multi-topic survey to define the objectives of their survey, identify the data needed to analyze those objectives, and draft questionnaires that will collect such data. These tasks are not easy, because designing such a survey for a given country (or an area within a country) usually involves a host of tradeoffs among different objectives. This book aims to help survey designers evaluate these tradeoffs, set realistic objectives, and design a survey that best fulfills those objectives.

This book was written with several target audiences in mind. The primary audience consists of the people most likely to carry out household surveys similar to the ones discussed in the book: the staff of the national statistical agencies and planning agencies responsible for their countries' household surveys. A second audience consists of individuals or groups in consulting firms or international aid agencies that advise governments on the design of household surveys. A third audience is composed of researchers or research agencies that plan to field a survey to pursue their own research objectives. A fourth audience consists of individuals or groups working on a survey intended to evaluate or monitor the impact of a development project in a particular country, either a nationwide project or a project limited to a small part of the country. A fifth audience is composed of people working on a single-topic survey, because the book can provide them with guidance on how to collect "background" information from households: information on, for example, a household's composition, basic characteristics, and level of welfare. Finally, this book will assist researchers who use household survey data produced by others, because it will help them understand the challenges, possible options, and tradeoffs involved in data collection. Such an understanding will allow these researchers to interpret household survey data more accurately and use these data more fully.

The recommendations in this book apply to a broad range of multi-topic household surveys, reflecting the authors' expectation that future surveys in developing countries will be increasingly diverse in their purposes and content. The book provides survey designers with a wide range of options from which they can pick and choose according to both the purpose of their survey and the prevailing circumstances in the country studied. Future household surveys will, and should, evolve in ways that are hard to foresee. Thus this book should be regarded as a starting point for planning new surveys rather than as an exhaustive treatise on the way to design all future household surveys.

This book assumes that the survey designer has already decided to implement a multi-topic household survey, as opposed to a census, a qualitative study, or a single-purpose survey. Nevertheless, several chapters in this book compare the advantages and disadvantages of different data sources for studying certain topics. In addition, Chapter 25 provides a thorough discussion of qualitative data collection methods.

The Experience on Which This Book Is Based

Much of this book is based on the experience of the World Bank's Living Standards Measurement Study (LSMS) program (Box 1.1), one of several recent international efforts to expand the pool of data on poverty and living standards in developing countries. The World Bank established the LSMS program in 1980 to explore ways of increasing the accuracy, timeliness, and policy relevance of household survey data collected in developing countries. Because the first LSMS surveys were designed by the World Bank for research purposes, there was little variation in these surveys' design and implementation. However, by the late 1990s LSMS surveys had been carried out in a wide range of low- and middle-income countries, with the involvement of many different national agencies and international organizations.
Over time LSMS surveys have become increasingly customized to fit specific country circumstances, including policy issues, social and economic characteristics, and local household survey traditions. Each survey has also inevitably reflected the interests (and prejudices) of the individuals planning it.

Box 1.1 An Introduction to LSMS Surveys

The overall objective of LSMS surveys is to measure and study the determinants of living standards in developing countries, especially the living standards of the poor. To accomplish this objective, LSMS surveys must collect data on many aspects of living standards, on the choices that households make, and on the economic and social environment in which household members live. Much of the analysis undertaken using LSMS surveys attempts to investigate the determinants of living standards, which requires more sophisticated analytical methods than simple descriptive tables.

LSMS surveys have several characteristics that distinguish them from other surveys. One of the most important is that they use several questionnaires to collect information about many different aspects of household welfare and behavior. These consist of a household questionnaire, a community questionnaire, a price questionnaire, and, in some cases, a facilities questionnaire. (For more details on the questionnaires see Box 1.4.)

Another characteristic of LSMS surveys is that they typically have nationally representative, but relatively small, samples, usually between 2,000 and 5,000 households. This will yield fairly accurate descriptive statistics for the country as a whole and for large subareas (such as rural and urban areas or a few agroclimatic zones), but usually not for political jurisdictions (such as states or provinces). The surveys' sample sizes are generally adequate for the regression methods often used for policy analysis of LSMS survey data.

Because of the complexity of most LSMS surveys, these surveys have rigorous quality control procedures to ensure that the data they gather are of high quality. These procedures, which are generally difficult to implement on larger samples, usually include several key elements. Both the survey's fieldwork and its data entry are decentralized, and the people who carry out these tasks are strictly supervised. Interviewers receive extensive training (usually for about four weeks) prior to the survey. In the field, information is gathered not by asking one person all the questions about the household and its members but through a series of "mini-interviews," with each adult responding for himself or herself. This procedure minimizes any errors caused by respondent fatigue or by the use of proxy respondents. The interviewers make multiple visits to households to find any members who were not home during the interviewer's earlier visits, which also reduces the need to use proxy respondents.

There is one supervisor for every two or three interviewers. The supervisors must revisit a significant percentage (often 25 percent) of the sampled households to check on the accuracy of the interviewer's data. They must directly observe some interviews, and they must review each questionnaire in detail. Supervisors' performance of these procedures is documented, and the supervisors are in turn supervised by staff from the central office of the statistical agency. Data entry and editing are done as soon as each interview is over, either in the local field office or by a data entry operator who travels to households with the team of interviewers. As data are entered into the computer, a data entry program carries out a large number of quality checks to detect responses that are out of range or inconsistent with the other data from the questionnaire. Any problems this program detects can be verified or corrected in a subsequent visit to the household by the interviewer.

The LSMS program has had its share of successes. Most importantly it has shown the feasibility of collecting comprehensive household survey data in developing countries. Since the first LSMS survey in 1985, LSMS surveys have been implemented in about 30 developing countries (Table 1.1). In some of these countries the original LSMS survey prototype was implemented in its entirety. In other countries this prototype was significantly altered to suit local circumstances. In still other countries it was used as a guide to redesign surveys that already existed. LSMS surveys were also the starting point for SDA surveys, which have been implemented in about 20 Sub-Saharan African countries, and for the MECOVI program now in progress in eight Latin American countries.

The increase in the number of LSMS surveys and other household surveys has substantially expanded the stock of data that can be used to study poverty and, more broadly, economic and social development in developing countries. In every country where an LSMS survey has been done, the data have been used to measure and analyze poverty by the government, an international development agency, or both working together. In several countries LSMS data have directly influenced specific government policy decisions (see Box 1.2). Data from LSMS and similar surveys have also been used in hundreds of studies of developing countries, helping to extend what is known about poverty, household decisionmaking, and the impact of economic and social policy changes on household

Table 1.1 LSMS Surveys

Country                           Year of first survey  Has the survey been repeated, or will it be repeated?  Number of households in sample
Albania                           1996                  No                                                     1,500
Algeria                           1995                  No                                                     5,900
Armenia                           1996                  No                                                     4,920
Azerbaijan                        1995                  No                                                     2,016
Bolivia                           1989                  Yes                                                    4,330-9,160
Brazil                            1996                  No                                                     5,000
Bulgaria                          1995                  Yes                                                    2,000
Cambodia                          1997                  Yes                                                    6,010
China (Hebei and Liaoning only)   1995                  No                                                     800
Côte d'Ivoire                     1985                  Yes                                                    1,600
Ecuador                           1994                  Yes                                                    4,500
Ghana                             1987/88               Yes                                                    3,200
Guyana                            1992/93               No                                                     1,800
Jamaica                           1988                  Yes                                                    2,000-4,400
Kazakhstan                        1996                  No                                                     2,000
Kyrgyz Republic                   1994                  Yes                                                    2,100
.................................................................................... 1996................... *........................N...o.............................................................. 3,373 ................... Mauritania 1988 Yes 1,600 Morocco 199i Yes 3,360-4,800 ........................................................................................................................................................................................ .................................. Nepal 1996 No 3,373 Nicaragua 1993 Yes 4,454 Pakistan 1991 Yes 4,800 Panama 1 997 Yes 4,945 ...........a................................................................................. 1997............................................ Yes.............................................................. 4,945 ................... Paraguay 1997/98 Yes 5,000 ..............................................................................................1997"/98.................................... *....... Yes.............................................................. ,000 ................... Peru 1985 Yes 1,500-3,623 ........................ ................................................................................................................................................................................................ Roman is 1994/95 Ycs 31,200 ..........a-n................................................................................ 19. 99............................................N...o.................................................................0-l.................... South Africa 1993 No 8,850 ......... ....... I............... ........... *............. *............................................... *..................................................................................................................... Tajikistan 1999 No 2,000 Tanzania-Kagera 1991 No 800 Tanzania-Human Resource Development Survey 1993 No 5,200 .......... 
"............................................ ........ *.................................................................................................................................................... Tunisia 1995/96 No 3,800 n......... t.a..................................................*.......................... 1997............................................ N... .............................................................. 2,350 ................... Turkmenistan 1997 No 2,350 ............................................................. .................................. 1992/93 ................................*........... Yes................................................... 4.8"" --................... Vietnam 1992/93 Yes 4,800-6,000 Source: LSMS data bank welfare. Many of these studies have been presented at naires, samples that exclude rural areas, and long delays conferences and published in books or academic jour- in processing the data after completing the fieldwork. nals, and have thereby shaped thinking about these Second, improvements are needed in the process issues far beyond the countries in which the data were of adapting the LSMS approach to countries that have collected. not yet implemented LSMS-type surveys. It has been Despite these successes, several challenges remain difficult for people working on a survey in one coun- for LSMS surveys and other multi-topic household try to learn from the experience of other countries surveys. First and most obviously, many developing that have carried out LSMS and other multi-topic sur- countries still have inadequate household survey data. veys. Mid-level staff in government statistical agencies This is true even for some of the countries that have know the details of why particular choices were made recently fielded new surveys, including LSMS surveys. 
and know how well the choices worked, but they Ideally, all governments should collect data on a regu- rarely meet with their counterparts in other countries. lar, ongoing basis in order to monitor poverty trends A small pool of World Bank staff and consultants also over time. However, survey efforts are still sporadic in know many of these details and have been in contact many developing countries today, and many surveys with many of the people developing new surveys in have serious deficiencies such as limited question- particular countries. However, until now, they have 9 MARGARET GROSH AND PAUL GLEWWE Box 1.2 Using LSMS Data to Inform Government Policy Choices LSMS household surveys are designed to collect data that can the countryThe data were quickly put to extensive use both be used to study living standards and how living standards are by the new government and by academic researchers. The affected by government policies.The following examples illus- first product, an extensive statistical abstract, was followed by trate how some governments and donor agencies have used a poverty profile prepared jointly by the World Bank and the LSYIS data to help make policy choices. government's Ministry of Reconstruction and Development, In 1989 the Jamaican government was considering then by other studies and reports. This body of work has whether it should eliminate subsidies for basic food items and helped to shift the national debate about poverty away from use the funds saved to expand its food stamp program.While the nature and extent of poverty toward policy options for the government was making this decision, data from the reducing poverty. For example, young women in rural areas Jamaican LSMS survey became available. 
Analysis of these data were made eligible for public works employment programs showed that most of the benefrts from general price subsidies after the data showed that these women were often needy went to nonpoor households, while most of the benefits of the and that they would be able to participate in such schemes food stamp program went to the poorThis information helped since they had access to childcare. The survey data also the government decide to remove the subsidies on basic food- revealed that the old age pension program was well targeted, stuffs and expand its 1-ood stamp program. The government which convinced the government not to modify that program then commissioned further analysis of the LSMS data to find but instead to consider reforming other programs that out how many families needed help in purchasing a minimum appeared to be less well targeted. food basket, and how much help these families needed. The In 1998 the Government of the Kyrgyz Republic under- government used this information to choose new eligibility took a thorough assessment of the current and projected thresholds and benefit ievels for the food stamp program. impact of its state pension reform.With the help of a World The Jamaican government has used its LSMS data in Bank team, the government analyzed data from the 1993 and making many other decisions, such as whether to change 1996 Kyrgyz LSMS surveys to examine a range of policy alter- kerosene subsidies and whether and how to subsidize medi- natives.The survey data were used to show rates of partici- cines distributed through public health clinics. In addition, the pation in, contributions to, and receipts from pension pro- government has used L SMS data to study the effects of rais- grams by age cohort and by level of welfare.The data were ing user fees for public health care services. 
The Jamaican particularly helpful to the government when it worked on LSMS survey is conducted annually; the incidence of poverty setting a new level for the m nimum pension, based on aver- is measured in each su-vey age earnings in the poorest quintile of the population. In South Africa the 1993 LSMS survey provided the first Forthcoming analytical work will include an assessment of comprehensive, credible data set for the entire territory of household consumption and demand for utility services, and South Africa, including the homelands. The survey was com- the formulation of a strategy to compensate the poorest for pleted just before the first democratic elections were held in increases in utility prices. shared their knowledge of past surveys mostly on an ing instead to understand how goods, services, and informal basis, one person or one country at a time. power are allocated among the different members of a And since the teams that are assigned to work on each given household. In addition, there is growing interest specific survey are usually small, these teams start the in using qualitative and quantitative techniques in survey development process with detailed knowledge complementary ways, or even combining these tech- of some of the topics to be covered by the survey niques.And analysts increasingly use household survey questionnaires but less detailed knowledge of others. data to address environmental issues. Third, the data gathered from some parts of LSMS survey questionnaires have been disappointing. Two How this Book Came to Be particularly difficult problems are how to measure household income from agriculture and nonagricul- In recognition of the continuing challenges for LSMS tural self-employment and how to measure savings and surveys, in the mid-1990s the World Bank initiated the financial assets. multiyear research project that developed this book. 
Fourth, new issues have emerged since the first (See Box 1.3 for a brief description of related initia- LSMS surveys were implemented. The economics tives.) The project was assigned three goals: to extend profession has increasingly discounted the notion of the range of policy issues that can be analyzed with the household as a unified decisionmaking body, try- LSMS data; to increase the reliability and accuracy of I0 CHAPTER I INTRODUCTION the surveys; and to make it easier to implement LSMS Box 1.3 Other LSMS Products surveys, either by simplifying survey design or by pro- A monual for planning and implementing the LSMS survey. viding more and better instructional materials on sur- When work began on this book about questionnaire vey design and implementation.This book contributes design, work also began on a companion volume about to the achievement of all three goals and thus address- planning and implementation: "A Manual for Planning and es the four challenges facing the LSMS that were Implementing the Living Standards Measurement Study described above. Survey" by Margaret Grosh and Juan Munoz.The manual, Past LSMS surveys have typically consisted of a completed in 1996, is intended for all people involved in household questionnaire, a community questionnaire, planning and implementing an LSMS survey, including staff and a price questionnaire; sometimes they have also in planning agencies, statistical agencies, line ministries, aca- included a school or health facility questionnaire. 
The demic institutions, and development agencies.The manual discusses such issues as samp ing, fieldwork, data manage- ment, intial analysis, dissemination, and a host of planning composed of separate inodules, sections of the ques- and budgeting issues-in each case explaining the techni- tionnaire that focus on different topics (Box 1.4).This cal procedures and standards used in LSMS surveys.The book reviews each module that has typically been a manual is available in English, Spanish, and Russian. part of past LSMS surveys, and offers some interesting new additions. The LSMS data bank. Data from LSMS surveys are now The author or authors of each chapter of this much more accessible than they were in the early years of book were chosen according to the following criteria: the LSMS program in the late- I 980s. The LSMS website, extensive research experience on the topic in ques- http:l/www.worldbank.argllsmsllsmshome.html, contains a catalogue of the data sets that are available, the documen- tion; experience in analyzing data on that topic using tation for most surveys, and the data from some of the sur- data from LSMS and non-LSMS surveys (both multi- veys. Data sets and documentation not yet available from topic and single-topic); and experience in collecting the website are available by mail.Three factors have made data in developing countries. In order to ensure that it possible to increase the accessibility of LSMS data. First, experiences and perspectives from both LSMS and a growing number of countries have adopted more open non-LSMS surveys were included, a concerted effort data access policies, and some have even given the World was made to include not only people who have long Bank permission to place their data on the LSMS website, wassmadet win t only peoplew Second, the LSMS team at the World Bank has thoroughly documented most of the surveys, whether working alone, associated with other survey traditions. 
working with managers of survey projects, or commission- The authors of the chapters that focus on specific ing documentation work. Good documentation preserves modules have reviewed the relevant hterature (both ana- institutional memory, lowers the cost to the Bank of dis- lytical literature and hterature on survey experience), seminating data, and reduces startup costs for new users analyzed existing survey data, and, in the case of the con- of LSMS data sets.Third, the LSMS team now has a full-time sumpfion module, experimented with different methods data manager, good technical support, and adequate space of collecting data. Many authors have drawn lessons not to stock an inventory of questionnaires, manuals for field staff, abstracts, and other useful documents from each L r country's survey. other surveys, including the RAND Family Life surveys, the World Bank's Social Dimensions of Adjustment Other tools. The LSMS program periodically produces other (SDA) surveys, and the Demographic and Health tools for survey planners and analysts.The best way to keep Surveys (DHS). The authors of many chapters have abreast of these tools is to look on the "tools for managers reviewed a large number of single-topic surveys, includ- of new surveys" and "tools for using household survey ing ones on housing, agriculture, water and sanitation, data" pages of the LSMS website. Readers of this book may time use and household income and expenditure. be particularly interested in a paper by Deaton and Zaidi W (I 99~onhowto ontrut cnsmpton ggegaes While this book was being written, two workshops ( 1999) on how to construct consumption aggregates, which complements Chapter 5 in this book, and in a recent were held that brought together all of the authors, as book by Deaton (1997) on analyzing household survey well as representatives from the various organizations data, which brings together a large amount of statistical and that constitute the main audiences for this book. 
The econometric material relevant for policy analysis. participants in the first workshop were primarily data users-researchers and policy advisors.They were invit- I I MARGARET GROSH AND PAUL GLEWWE Box 1.4 Components of aTypical LSMS Survey One distinguishing characteristic of LSMS surveys is that they miscellaneous income, and savings and credit. Some of the are both multi-topic and multi-level: they use severai ques- information (consumption, housing quality, agricultural pro- tionnaires to study many different aspects of household wel- duction) is collected only at the household level, but much of fare and behavior The largest LSMS questionnaire is the it (employment, education, health) is collected at the individ- household quest onnaire.The LSMS household questionnaire ual level. always collects detailed information to measure household The community questionnaire gathers information on consumption, which is the best monetary indicator of house- local conditions common to all households living in the same hold welfare (see Chapter 5 for further discussion). The community. Many of these conditions recorded can be direct- household questionna re also collects information on income; ly influenced by government actions.The information covered transfer income and income from wage employment are col- typically includes the basic characteristics (including distance lected in almost every LSMS survey and many LSMS surveys from the community) of nearby schools and health facilities, also collect data on income from agriculture, household the existence and condition of local infrastructure (such as enterprises, and misce laneous sources. roads and public transportation), sources of fuel and water LSMS household questionnaires always record informa- availability of electricity, means of communication, and local tion on a variety of other dimensions of welfare and on the agricultural conditions and practices. 
use of social services; housing and related amenities such as A separate price questionnaire is used to record the pre- water and sanitation; the level of education of adults, grade vailing prices of commonly purchased items in local shops and attainment and current enrollment rate of school-aged chil- markets. In almost all countries prices vary considerably dren; and vaccination histories and anthropometric (height among regions in order to compare the welfare levels of and weight) measurements for children. A typical household households that live in different regions one needs informa- questionnaire collects more information than this, in order to tion on the prices that they face when purchasing goods and expand the range of living standards indicators that can be services. The community and price questionnaires are dis- studied and to allow researchers to model the choices cussed in Chapter 1 3. households make.The traditional list of modules included in Finally, in some LSMS surveys special facility queshon- a prototype LSMS survey includes: household roster educa- naires are used to gather detailed information on schools or tion, health, employment, migration, anthropometry, fertility health facilities. These questionnaires are discussed in consumption, housing, agriculture, household enterprises, Chapters 7 and 8. ed to ensure that the book had correctly identified the ning and statistical agencies-and advisors of these research and information needs of potential data users. agencies-had a genuine and pressing need for this A larger share of the participants in the second work- book. shop were data producers-staff from national statistical Nevertheless it must be recognized that because agencies and representatives of organizations that pro- the draft modules presented in each chapter are based vide technical assistance or funding to national statisti- primarily on lessons from past surveys, few of them cal agencies. 
They were invited to ensure that the book have been rigorously field tested in the exact form addressed their concerns-a requirement for any suc- presented here. Thus extensive field testing must be cessful survey. Prior to the first workshop each chapter done in each country implementing a new survey. was reviewed by an expert in the relevant field. Before Survey designers should consider this testing a vital the second workshop the draft manuscript as a whole part of their job after they have chosen a set of mod- xvas reviewed by several experts in analyzing and pro- ules, modified these modules, and combined the mod- ducing household survey data in developing countries. ules into survey questionnaires. After all of these people's advice had been incorporated Each chapter contains a "cautionary advice" box and a polished draft produced, the book was subject to that specifies how much the draft module has been another round of (anonymous) peer review and revi- changed from its design in previous LSMS surveys, sions. In addition, many of the draft chapters were given how well similar modules have worked in the past, and to people who were in the process of advising govern- which parts of the modules most need to be cus- ments or survey institutions on the design of a multi- tomized to fit specific country circumstances. topic household survey. This served as a limited field test This book represents a major advance on three for the book, and also confirmed that government plan- fronts. First, the book makes it easier for those work- 12 CHAPrER I INTRODUCTION ing on new household surveys to learn from the wide The remaining chapters ofVolume 1 form Part 2 range of LSMS and other survey experience. Second, of the book, and the first nine chapters of Volume 2 the book makes it much easier to customize the design comprise Part 3.The chapters in Parts 2 and 3 discuss, of questionnaires for new multi-topic surveys. 
Third, in great detail, the individual modules that are the the material presented in the book deals with new building blocks of any multi-topic household survey. policy questions, presents new analytical methods to Each chapter reviews the main policy issues pertinent address both new and long-standing policy issues, and to the subject matter of the module, identifies the data provides new ways to reduce or avoid measurement needed to analyze these issues, introduces one or more problems. draft modules (which are presented in Volume 3), and provides annotated notes that explain the reasoning How to Use this Book behind many of the details of each draft module. For most modules, two or three different versions are The process of designing a comprehensive, multi-topic introduced, each of different length.Which module to survey can be divided into five steps. First, survey plan- use depends on the level of interest in the particular ners must define the fundamental objectives of the topic. In addition, many of the chapters in Parts 2 and survey and decide on the overall design of the survey III discuss how to add or delete submodules within in light of these objectives. Second, within this gener- each module in order to provide a better fit with local al framework the survey planners must choose which circumstances and the specific focus of the survey. modules to include in the questionnaires, the objec- Part 2 (Chapters 4-13), in Volume 1, introduces tives of each of these modules, and the approximate "core" modules that must be included in virtually all length of each module. Third, the planners must work LSMS-type surveys: metadata, consumption, roster, out the precise design of each module, question by education, health, employment, anthropometry trans- question, in light of the module's specific objectives fcrs and other nonlabor income, housing, and the and approximate length. 
Fourth, the modules must be community questionnaire.The modules on health and integrated with each other and combined into a com- education come with draft questionnaires that can be plete set of draft questionnaires (household, communi- used for gathering data from local schools and health ty, price, and in some cases, facility). Fifth, the draft care facilities.The collection of community-level data, questionnaire should be translated (if applicable) and including data on local prices, is discussed in Chapter field tested. Ideally, the five steps should be completed 13, the final chapter of Part 2. in chronological order. However, in practice, imple- Part 3 (Chapters 14-22), in Volume 2, introduces menting any given step may reveal information that modules that are optional: environmental issues, fertil- requires survey designers to rethink a previous step. ity, migration, total income, household enterprises, This book consists of three volumes. Volumes 1 agriculture, savings, credit, and time use. and 2 contain all 26 chapters of this book.Volume 3 The last four chapters ofV'olume 2 constitutes Part provides the draft questionnaires introduced by the 4 of the book. These chapters contain material that is chapters in Volumes 1 and 2. Volumes 1 and 2 are more methodological in nature. Chapter 23 discusses organized into four parts. The first three chapters of when and how to collect panel data-in other words, Volume 1 constitute Part 1, which discusses the "big whether to interview the same households when picture." This includes decisions that must be made doing a sequence of surveys, and how best to do so about the overall design of the survey and the modules when this option is chosen. Chapter 24 reviews the to be used, as well as procedures for combining mod- issues involved in analyzing the allocation of resources ules into questionnaires and questionnaires into a sur- and power within households, and summarizes the vey (or sequence of surveys). 
Chapter 2 starts by describing how to choose from among the three "classic" survey designs and how to select the modules to be included in the survey. Chapter 3 describes general procedures for designing each module, combining the modules into a well-integrated set of questionnaires, and translating and field testing the questionnaires.

implications of this analysis for data collection.

Chapter 25 summarizes how qualitative research methods can be used to complement the quantitative methods typically used in the design and analysis of multi-topic household surveys. This chapter stresses that qualitative methods can play a useful role in the design of multi-topic household surveys, especially in formulating questions and developing hypotheses for data analysis. It is unfortunate that these methods have been neglected by most survey designers, who usually have quantitative backgrounds.

Chapter 26 reviews the basic economic and econometric concepts that underpin many of the chapters in this book. Although many survey designers have an economics background, many others do not, and even some economists may benefit from a review of this material. The chapter begins by presenting the basic economic model of the household and goes on to discuss standard econometric techniques that have often been used in policy research on developing countries.

The reader should understand that the questionnaires provided in Volume 3 and discussed in Parts 2 and 3 are not polished or completed, and cannot be used immediately in any developing country. Final versions of the questionnaires for any country must be developed by the survey designers themselves. Survey designers must combine their own experience and expertise with the information in this book to design a country-specific questionnaire that will elicit the information needed to answer the most important policy questions of that country. This book is just a starting point. It provides survey designers with the lessons learned from past experience and with advice from experts who are familiar with both LSMS and other household surveys.

Part 1

All readers of this book should read all of Part 1, which in addition to this chapter includes Chapters 2 and 3.

CHAPTER 2: MAKING DECISIONS ON THE OVERALL DESIGN OF THE SURVEY. This chapter leads survey designers through the factors that need to be considered when determining the basic scope of the survey. It sketches three alternative designs for an LSMS-type survey: a full LSMS survey, a "core" (scaled-down) LSMS survey, and a core and rotating module design. It also suggests rules to help survey designers choose the design most appropriate for the circumstances that they face. Chapter 2 also explicitly defines the "core" components that should be included in any LSMS-type survey.

CHAPTER 3: DESIGNING QUESTIONNAIRE MODULES AND ASSEMBLING THEM INTO SURVEY QUESTIONNAIRES. This chapter moves to a finer level of detail, providing general guidance on how to design individual modules and combine them into an integrated set of survey questionnaires. First each module must be customized to meet the specific objectives set out for it. Then the modules must be compared with each other to check for gaps and overlaps and to harmonize wording, codes, and recall periods. Next, survey designers must decide on the order of the modules and combine the modules into draft questionnaires. Finally, the questionnaires must be translated and field tested. Throughout this process, issues of questionnaire formatting will arise; Chapter 3 explains the principles and conventions used in formatting LSMS questionnaires.

Part 2

After survey designers have read Part 1 of this book and decided on the broad issues concerning the scope and design of the survey, they can begin the painstaking but crucially important task of designing the individual modules. Detailed advice on designing individual modules is given in Parts 2 and 3 of the book. Part 2 includes all modules that should be included in almost any LSMS-type survey, while the modules in Part 3 are optional. Almost all of the chapters in Parts 2 and 3 follow a similar outline. The first section reviews the current policy issues in developing countries for the topic or sector covered by the chapter. The second section explains what kinds of data are needed from household surveys to address these policy questions and also discusses any measurement issues. The third section introduces one or more versions of a draft module (the modules themselves are presented in Volume 3), and the fourth section provides notes that explain the reasoning behind, and important details of, each version of the draft module.

Chapters 4 on metadata and 5 on consumption are the first chapters in Part 2 because they contain a good deal of information on survey methods and issues of validity and measurement, information with broad implications for the subsequent chapters. Like Chapters 4 and 5, Chapters 6-13 cover "core" modules. Each of these topics does not have to be covered in great detail, but it is recommended that at least the essential parts of each of these modules be included. For example, the fullest version of the health module introduced by Chapter 8 is so extensive that it should be included only in a survey specializing in health issues. Yet questions 10-38 of the short health module are an essential part of the core.

CHAPTER 4: METADATA - INFORMATION ABOUT EACH INTERVIEW AND QUESTIONNAIRE. "Metadata" are data about the survey itself, such as dates of interviews, identities of respondents, and time required for each interview. This topic has frequently been neglected in LSMS and other household surveys. Metadata are needed to guide the implementation of a survey, to help analysts interpret survey data, and to allow a quantitative evaluation of different survey procedures. Chapter 4 reviews the different kinds of metadata that should be collected as part of any multi-topic household survey, and provides guidance on how to collect them. If the recommendations made in this chapter had been adopted at the beginning of the LSMS program, the rest of this book and its companion manual (Grosh and Muñoz 1996) would have had a firmer empirical foundation for discussing the tradeoffs in survey design and implementation. For example, information on the time required to complete specific modules of varying lengths would have been very useful for deciding costs of expanding a module in terms of interview time.

CHAPTER 5: CONSUMPTION. This chapter differs from the others by focusing most of its attention on measurement issues, specifically on how to measure household consumption. The chapter draws on the literature on data collection, and also on data collection experiments that were part of the research for this book. One important conclusion of Chapter 5 is that measurement of consumption is highly sensitive to differences in methods, and that consumption measurement techniques within a given country must therefore be standardized over time. Researchers and consumers of household survey data need to be aware of the stringent comparability requirements that consumption data must satisfy before comparisons can be made across different surveys.

CHAPTER 6: HOUSEHOLD ROSTER. One of the fundamental decisions in any household survey is deciding who is and who is not a household member. Chapter 6 provides a basis for identifying all members of the household, thus selecting from which individuals the survey will collect information. The recommendations in this chapter do not differ greatly from procedures used in past LSMS surveys. The chapter offers no new proposals on how to define a "household," which is a difficult issue in almost any household survey. LSMS surveys have not made any significant contribution to this issue. The chapter confirms the usefulness of gathering household roster information on any children and parents of household members who do not live in the household, as well as of linking parents and children to each other when both are household members.

CHAPTER 7: EDUCATION. This chapter recommends only modest changes to the design of the education module that has been used in most previous LSMS surveys, because this design has worked quite well in the past. The education module (presented in Volume 3) collects information about the schooling of all household members, including highest grade attained, degrees obtained, and grades repeated. Individuals currently in school are also asked about the type of school they attend (public or private), their recent attendance, and the amount of money the household spends on their schooling. Chapter 7 also introduces an expanded version of the education module, which is useful for the design of surveys that focus on education issues. This expanded module requires administering relatively short tests of cognitive skills to members of the household as well as collecting information about local schools through school and teacher questionnaires.

CHAPTER 8: HEALTH. The health data collected in previous LSMS surveys have been of limited usefulness for policy analysis. The draft health module introduced in Chapter 8 (and provided in Volume 3) has been dramatically revised from previous surveys. The new module consists of a series of submodules on self-reported health status, health-related behavior, child immunization, insurance coverage, health service utilization and cost, and knowledge of health providers. One key change is that information is collected on all visits made by household members to medical facilities during the reference period, not just the most recent visit. Another improvement is the collection of self-reported data on "activities of daily living," data which cover the ability to climb stairs, carry heavy loads, or walk long distances. The chapter also presents an expanded version of the module that includes questions related to mental health and very detailed questions on the household's health expenditures and utilization of health facilities. The expanded version of the module also collects data on observed activities of daily living and the cognitive functioning of household members that are observed by the interviewer.

CHAPTER 9: EMPLOYMENT. The collection of data on employment and labor force participation in past LSMS surveys has been fairly successful. However, given the large size of this module, there is ample room for many small improvements. Several modifications are suggested to past designs. First, detailed information on household members' work for household businesses or on the household farm is collected in the household business and agriculture modules, not in the employment module. Second, job history information is collected in a way that focuses on each individual's employment five years before the time of the survey. Third, summary employment information is collected in a way that can accommodate individuals who have done many different kinds of work in the past 12 months.

CHAPTER 10: ANTHROPOMETRY. Nutritional status has always been one of the key nonmonetary indicators of welfare in LSMS surveys, especially for children; future surveys should continue to collect such data. Chapter 10 discusses the tradeoffs involved when anthropometric information is collected only for young children, rather than for all household members. The chapter recommends that in general anthropometric data should be collected for all household members, both adults and children. Chapter 10 also discusses the merits of collecting data on mid-upper arm circumference, which until now has been done in only one LSMS survey.

CHAPTER 11: TRANSFERS AND OTHER NONLABOR INCOME. Many households receive income unrelated to any of their members' current work activities. Past LSMS surveys have usually collected data on this income in a module on transfers and other nonlabor income. Chapter 11 introduces an improved version of that module, capturing more information on both public and private transfers. For private transfers, the new module also collects more information about the donor household and about the purpose of the transfers.

CHAPTER 12: HOUSING. Past LSMS surveys collected data on housing to serve as indicators of "basic needs" and to derive the implicit consumption value (imputed rent) associated with owner-occupied housing. Chapter 12 introduces two draft housing modules, a short module and a longer module. The short module collects data similar to those collected by the housing module used in previous LSMS surveys, albeit with several improvements. The longer module collects information that can be used to study a wide range of housing policy issues. Both modules are flexible in that they can gather data on complex water supply systems and on many different rental arrangements. Finally, in the new housing module questions are added that are appropriate to places with cold climates and well developed infrastructure; such questions will be particularly useful for surveys in the transition economies of Eastern Europe and the former Soviet Union.

CHAPTER 13: COMMUNITY AND PRICE DATA. Community questionnaires have been used in many past LSMS surveys to gather information on the economic environment in which households operate. Community characteristics that affect households' economic environment can often be directly changed by government interventions. This chapter provides a much-needed discussion of how to define the "community" for which the information is to be collected and how to gather community information from a group of respondents. Finally, it introduces a longer and much more comprehensive community questionnaire (presented in Volume 3) than has been used in previous LSMS surveys. The design of this draft questionnaire is based on the experience of both LSMS surveys and the RAND Corporation's Family Life Surveys, as well as on suggestions from many of the authors of the other chapters in this book.

Part 3

This part, consisting of Chapters 14-22, is in Volume 2. The chapters in Part 3 follow the same format as those in Part 2. Part 3 covers topics that are likely to be of interest, but one would never include all of them in any one survey. None of the topics in Part 3 are required for an LSMS-type survey.

CHAPTER 14: ENVIRONMENTAL ISSUES. To date, very few LSMS surveys have collected data that can be used to examine environmental issues. The environmental module introduced in Chapter 14 (and presented in Volume 3) offers a series of submodules that can be used, as appropriate, in different settings. There are very brief submodules on environmental priorities in urban and rural areas, on attitudes and perceptions about urban air quality, and on discount rates. All of these submodules could be used in many surveys, even surveys that do not focus on the environment. The environmental module also includes lengthy submodules on water, sanitation, and fuel, to be included in surveys for which the use of these resources is of particular interest. There are also contingent valuation submodules that attempt to measure the extent to which households are willing to pay for improvements in urban air quality, the urban water supply, urban sanitation, or the rural water supply. The design of all of the above submodules is based on extensive experience from single-purpose surveys.

CHAPTER 15: FERTILITY. This chapter follows the same general approach used in many past LSMS surveys. The chapter introduces a short version of a fertility module that collects the data necessary to understand some general aspects of contraceptive use and to compile a maternity history that lists all births. Chapter 15 also introduces a standard version that includes a maternity history, a reproductive health submodule covering the previous three years, a longer section on contraceptive use, and a section on fertility preferences. Both the short and standard versions are presented in Volume 3. Whichever version of the module survey designers choose to use, the module should be administered to all women in the household of childbearing age. This departs from the past LSMS practice of interviewing only one randomly selected woman per household.

CHAPTER 16: MIGRATION. Data on migration have been collected in many past LSMS surveys, but the amount of information collected has been quite small and the data have rarely been analyzed, despite significant interest in migration among researchers. This chapter introduces three different versions of the migration module: a short version, a standard version, and an expanded version. All three versions are presented in Volume 3. The standard and expanded versions are designed to collect much more detailed information than has been collected in the migration modules used in previous LSMS surveys. Including either the standard or the expanded draft migration module in a future survey will yield a rich data set that should prove very useful for comprehensive research on migration.

CHAPTER 17: SHOULD THE SURVEY MEASURE TOTAL HOUSEHOLD INCOME? Chapter 5 (and indeed this entire book) argues that consumption is the best monetary measure of welfare in multi-topic surveys. This implies that the consumption module is essential and must always be included. In contrast, the book takes the view that collecting the data needed to calculate total household income is optional, which implies that the household enterprise, agriculture, and savings modules (which collect much of the income data) can be substantially reduced or even omitted, depending on the level of interest in these topics. Chapter 17 reviews the advantages and disadvantages of collecting the data needed to calculate total household income, and describes the circumstances under which measuring total income should be an objective of a multi-topic survey.

CHAPTER 18: HOUSEHOLD ENTERPRISES. Small businesses owned and operated by households are quite common in developing countries, yet it is difficult to collect accurate data on these income-generating activities. Based on their extensive experience of analyzing data from past household surveys (both LSMS and others), the authors introduce three versions of the household enterprise module (presented in Volume 3). Survey designers should choose the version that best matches policymakers' level of interest in household enterprise issues. In previous LSMS surveys, data on employment in these enterprises were collected in the employment module, but this chapter recommends that such information be collected in the household enterprise module; each of the modules introduced in this chapter collects such data. One consequence of this is that the standard version of this module is now longer than the version typically used in previous LSMS surveys, and the expanded version is even longer than the standard version.

CHAPTER 19: AGRICULTURE. Collecting accurate and comprehensive data on agricultural activities is difficult in any survey, and past LSMS surveys have experienced many problems in collecting such data. This chapter introduces short, standard, and expanded versions of the agricultural module (presented in Volume 3) that are very different from the agriculture modules used in previous LSMS surveys. In the standard and expanded versions, information on land owned and crops produced is gathered on a plot-by-plot basis, rather than at the level of the whole farm as was done in previous surveys. Information on household members' work on their own farms is now obtained in the agriculture module, rather than in the employment module as in previous LSMS surveys. The short module is new, and is limited to collecting information on the households' agricultural assets and on the total amounts of each crop produced by the household.

CHAPTER 20: SAVINGS. It is difficult to collect data on household savings because many households are reluctant to provide savings-related information. Several previous LSMS surveys have collected a modest amount of data on savings, but these data have rarely been used in analysis due to doubts about their accuracy. Chapter 20 provides an extensive review of research on savings in developing countries, emphasizing the difficulties involved in doing such research. The two versions of the draft module introduced by this chapter (and presented in Volume 3) include several modest improvements to the module used in previous LSMS surveys. Neither of these versions is much longer than the savings modules of past surveys.

CHAPTER 21: CREDIT. This chapter emphasizes that to capture all of the sources and uses of credit in a way natural to respondents, questions on household credit use must be inserted in several of the survey's modules. Such questions should be inserted in the housing, consumption, household enterprise, and agriculture modules, as well as in the community questionnaire and in a special credit module. In contrast with most past LSMS surveys, the draft credit module introduced by this chapter (and presented in Volume 3) gathers information at the level of the individual rather than at the level of the household.

CHAPTER 22: TIME USE. LSMS surveys have traditionally included neither comprehensive measures of time use nor modules dedicated to time use. This chapter discusses the experience of special time-use surveys, and uses this experience to formulate a special time-use module (presented in Volume 3). This module will be of particular interest to researchers concerned with intrahousehold issues. However, the draft module is lengthy and could crowd other modules out of the survey (since there is a limit to the amount of time households are willing to be interviewed). Further experience will be needed in implementing such a module as part of a multi-topic survey and in analyzing the data collected before it becomes clear whether to routinely include time use questions in LSMS and similar multi-topic surveys.

Part 4

This part, in Volume 2, presents four chapters that discuss several general survey design issues. These chapters are useful for survey designers to read alongside the chapters in Parts 2 and 3 that interest them.

CHAPTER 23: RECOMMENDATIONS FOR COLLECTING PANEL DATA. This chapter reviews the advantages and disadvantages of collecting panel data in developing countries, along with past experience of collecting panel data. The chapter recommends that panel data be collected in most surveys, provided that in successive rounds the original sample of households is supplemented with a sample of households living in dwellings that have been constructed since the first survey. This is necessary to ensure that the sample remains nationally representative when each survey is implemented. Chapter 23 also recommends that information be collected from households in the first survey that will help interviewers find these households in subsequent surveys, even when it is not certain that later surveys will attempt to collect panel data.

CHAPTER 24: INTRAHOUSEHOLD ANALYSIS. The study of the allocation of resources and responsibilities within households has grown in the economic literature over the last few years, and such issues are increasingly arising in policy discussions. This chapter explains which kinds of data should be collected at the individual level rather than at the household level in order to support intrahousehold analysis; from this perspective the chapter provides a critique of the modules proposed in Parts 2 and 3. For modules deemed inadequate for intrahousehold analysis, the chapter proposes ways to modify them so that they better support such analysis. The chapter accepts that it is not feasible to collect individual-level data on all topics in an LSMS survey. Nevertheless, future LSMS surveys can be designed to support substantial intrahousehold analysis. Much of the data collected in past LSMS surveys (on employment, health, education, anthropometrics, migration, and fertility) have long been collected at the individual level. And the draft agriculture, household enterprise, credit, and miscellaneous income modules presented in Parts 2 and 3 of this book recommend the collection of more individual-level data than were collected in previous LSMS surveys. In addition, the draft time-use module introduced in Chapter 22 is a new tool for gathering data that are crucial for intrahousehold analysis.

CHAPTER 25: QUALITATIVE DATA COLLECTION TECHNIQUES. Previous LSMS surveys have focused almost exclusively on collecting quantitative data, making very little use of qualitative data collection methods. This regrettable tendency probably reflects the quantitative backgrounds of most survey designers. Chapter 25 explains ways in which qualitative research methods can usefully and effectively complement quantitative data collection. The chapter concludes that qualitative methods should not be combined with quantitative methods into a single survey; instead, both methods should be used in separate but complementary data collection exercises. Quantitative surveys can benefit from qualitative methods in several ways. For example, qualitative research can be used to help survey designers formulate the exact wording of particular questions, and qualitative methods are useful for creating hypotheses about household behavior, which can then be tested using quantitative data.

CHAPTER 26: BASIC ECONOMIC MODELS AND ECONOMETRIC TOOLS. This chapter gives non-economists some basic information on economic models of household behavior, and reviews econometric methods commonly used in analyzing the policy questions discussed in this book. Chapter 26 is a useful reference for non-economists as they read other chapters. A glossary at the end of Chapter 26 defines the economic and econometric terms used in many chapters of the book.

Note

The authors would like to express their gratitude to Jere Behrman, Lawrence Haddad, Courtney Harold, John Hoddinott, Alberto Martini, Raylynn Oliver, Kinnon Scott, and Diane Steele for comments on previous drafts of this chapter.

References

Davies, David. 1795. The Case of Labourers in Husbandry. Bath, United Kingdom: Printed by R. Cruttwell for G. G. and J. Robinson.

Deaton, Angus. 1997. The Analysis of Household Surveys: A Microeconometric Approach to Development Policy. Baltimore, Md.: Johns Hopkins University Press.

Deaton, Angus, and Salman Zaidi. 1999. "Guidelines for Constructing Consumption Aggregates for Welfare Analysis." World Bank, Development Research Group, Washington, D.C.

Eden, Frederick. 1797. The State of the Poor. London: Davis.

Grosh, Margaret, and Juan Muñoz. 1996. A Manual for Planning and Implementing the Living Standards Measurement Study Survey. Living Standards Measurement Study Working Paper 126. Washington, D.C.: World Bank.

Ravallion, Martin, and Shaohua Chen. 1997. "What Can New Survey Data Tell Us about Changes in Distribution and Poverty?" World Bank Economic Review 11 (2): 357-82.

Ravallion, Martin, and Shaohua Chen. 1998. Personal conversation updating Ravallion and Chen 1997. November 2 and 3. Washington, D.C.

Ravallion, Martin, Gaurav Datt, and Dominique van de Walle. 1991. "Quantifying Absolute Poverty in the Developing World." Review of Income and Wealth Series 37 (4).

Chapter 2 Making Decisions on the Overall Design of the Survey

Margaret Grosh and Paul Glewwe

Comprehensive, multitopic household surveys such as LSMS surveys usually consist of three separate questionnaires: a household questionnaire, a community questionnaire, and a price questionnaire. Each questionnaire is composed of modules, sections that collect information on a specific topic. Questionnaires and their modules can be combined in a variety of different ways to create a multitopic household survey. There is no single right way to combine modules and questionnaires into a survey; each way has advantages and disadvantages. The key is to choose a design that provides the best fit given the objectives of, and constraints on, the proposed survey.
The starting point for designing the modules and The third step is to work out, question by ques- questionnaires to be used in a multitopic survey is a set tion, draft questionnaires for each module that will be of policy questions.The overall objective of each sur- included in the survey. This can be done by drawing vey is to collect the data needed to answer these ques- on the detailed recommendations in the chapters in tions. There are five steps involved in survey design. Parts 2 and 3 of this book, as well as the draft modules The first step is to define the fundamental objectives included in Volume 3. The fourth step is to compare of the survey and to choose an overall survey design the modules to each other to ensure that they are con- that best fits these objectives. This is usually done not sistent and well integrated, and to combine them into by a single individual but by a team of survey design- draft household, community, and price questionnaires ers who consult extensively with a broad range of (in some cases omitting the community question- individuals and organizations interested in the survey. naire). The fifth and final step is to translate and field In choosing the overall design, the team must take into test the draft questionnaires. Translation may not be account several important factors including the capac- necessary in some countries; field testing is always ity for collecting data within the country, the funding essential and must not be done quickly or superficial- available, and the amount and quality of data available ly. The first two steps are discussed in this chapter. The from other sources. third, fourth, and fifth steps are discussed in Chapter 3. 
The second step involves deciding which modules While these five steps should ideally be done in to include in the survey, specifying the objectives of the order given above, in reality there is likely to be each module, and proposing an approximate length for substantial movement backward and forward among the modules. This step is needed because a survey that the various steps. Some objectives originally set out for attempts to include all possible modules will be too the survey may prove impossible to achieve given the large and complex to implement. constraints. And discussion of the detailed objectives 21 MARGARET GROSH AND PAUL GLEWWE for each module may cause the survey design team to reassess the overall objectives of the survey. In other Box 2.1 The Importance of aWell-Rounded Design words, it may be necessary to take one or two steps Team backward at some points in order to continue to move An effective survey design team must include researchers forward. This is to be expected and even encouraged. and policy analysts, policymakers, and staff from the organi- As more is learnecd about what can and cannot be zation implementing the surveyThe problems that can arise done, survey designers are more likely to produce a when one or more of these groups is not involved in survey design that meets their objectives-which may designing the questionnaires are illustrated by the experi- also have become more realistic. It is better to pare ence of an LSMS survey implemented n Jamaica. The down the number of objectives in order to achieve household questionnaire for the first Jamaican Survey of Living Conditions (implemented in 1988) was designed pri- some of them than to attempt to do too much and, as marily by international experts who had little knowledge of a result, achieve few or none of the original objectives. Jamaican social programs. 
Although the househo d ques- The first four sections of this chapter cover the tionnaire was largely successful in accomplishing its analyti- first step of the survey design process. The first section cal objectives, it had two serious fiaws. First, although food provides an overview of who should be involved in subsidy policies were an important issue at the time, the designing and assembling the questionnaires. The sec- consumption module did not clearly distinguish expendi- ond section discusses the main factors that survey tures on key subsidized food items from expenditures on designers should take into account when choosing similar nonsubsidized items. This made it more difficult to designers shoulveydtakesignoptionsh ao whenchionost study the incidence of food subsidies. Second, the ques- among survey design options. The third section out- tionnaire asked respondents about their receipt of food lines "core" elements that must be included in any stamps during the previous month even though food LSMS or similar multitopic survey and reviews sever- stamps were provided only every two months.This made it al classic survey designs, each of which supplements difficult to identify which househods had received food the core in a different way.The fourth section presents stamps, thus hindering the study of another issue important guidelines for choosing the survey design most appro- at the time. Fortunately, these flaws were identified and cor- priate for each of a range of different circumstances. rected in the following year's househo d questionnaire. The fifth and final section explains the second step of the survey process: choosing the modules to include rily by researchers and policy analysts, and much of the in the survey, setting objectives for each module, and success of past LSMS surveys in supporting policy-rel- setting the approximate length of the modules. 
evant research is due to the fact that the surveys used questionnaires designed by people who would be Organizing a Survey Design Team actively involved in the analysis of the data. Researchers and policy analysts can ensure that the The most important factor ensuring the success of a information collected in multitopic surveys is well multitopic household survey is the involvement of the suited for policy research. right people in the process. Designing the survey The lead role in designing the questionnaires of questionnaires is much too large a task for one person. an ESMS or similar multitopic household survey Instead, a team of experts must be involved, including should be given to a small group of researchers and members of the organization implementing the survey policy analysts who share two characteristics. First, as well as research analysts from other institutions. If they should know what issues are of most concern to the team does not contain a sufficient diversity of the country's policymakers. Second, they should have experts, this can have negative repercussions for the experience in using data from similar surveys to ana- data (Box 2.1). The design team must work together lyze these issues. The group of researchers and policy with policymakers and program managers to define analysts should include members of the national plan- the overall objectives of the survey and to settle on ning agency, representatives from the national statistics many details at each step of the survey design process. agency, local academic researchers, and one or more people who have helped analy7e or design multitopic Researchers and PolicyAnalysts surveys in other countries. 
It is essential to involve researchers and policy analysts in questionnaire design. This book was written prima-

The team must include individuals with extensive experience in implementing and analyzing other household surveys in the country in question. Ideally, local researchers and policy analysts should take primary responsibility for designing the survey, because they have an intimate knowledge of the country's culture, economy, and society, and they are very familiar with existing programs and key policy issues. Local researchers and policy analysts are also likely to know about previous surveys done in the country that have covered some of the topics included in the new survey. And they will know which people and institutions should be consulted during the survey design process.

It may also be desirable to involve international researchers in the design of the questionnaire, especially in countries where local data analysts are not familiar with LSMS and other multitopic household surveys. International researchers can contribute their experience about what has and has not worked in surveys in other countries.1 Judicious use of the advice of both local and foreign experts will significantly improve survey design.

Past LSMS surveys have probably made insufficient use of the knowledge available from local researchers and policy analysts. Too often the involvement of local professionals has been limited to statisticians from the statistical agency (data producers) and thus failed to draw on the expertise of social policy researchers from the government or academia (data users). Statisticians may have only a limited knowledge of sectoral policy issues and programs. While they do have an important role to play, their input must be combined with input from data users to set priorities among the different possible objectives for policy research.

Policymakers

When defining the fundamental objectives of the survey, the team responsible for drafting questionnaires must seek extensive input from policymakers and program managers in the country being surveyed. The team's initial discussions with policymakers should focus in broad terms on the most important issues to be covered, which will determine the relative size of the different modules in the questionnaires. After this round of discussions, further discussions should be held to identify the important issues within each sector. Since drafting the module or modules for each sector requires a substantial amount of knowledge about how specific programs work, technical experts in many program agencies must be consulted. These people should be consulted before the modules are created, and they should also be shown draft modules to elicit comments.

Unfortunately, in many previous LSMS surveys the survey design team did not give enough attention to communicating or consulting with policymakers. Policymakers, who are often unfamiliar with household surveys, may find it difficult to read complicated questionnaires or to imagine what analyses the resulting data could support. One option is to show policymakers and program managers examples of the kinds of tables and analyses that could be produced using data from the questionnaire; these might be either hypothetical tables for the country of the survey or tables made in other countries using data from similar surveys. Another strategy is to show policymakers a report based on the first year's data; this is an excellent way to obtain policymakers' feedback on the design of follow-up surveys. A third strategy is simply to ask policymakers what they need to know to implement effective policies.

Data Producers

It is critical that the survey design team include staff from the organization implementing the survey. This should ensure that the questionnaires designed are workable. Often data collection can be greatly simplified by making minor changes in the layout or flow of a questionnaire, changes that do not diminish the questionnaire's analytical content. Data producers are an excellent source of suggestions for such changes. They are usually also experienced in details of designing a questionnaire, such as questionnaire formatting. For all of the above reasons, the team members from the organization implementing the survey should help design, or comment on, every draft of the questionnaire.

It is also useful for the survey design team to solicit the input of experienced field supervisors, who will notice whether the instructions to the interviewer are clear, whether the skip codes are correct, and whether the format is consistent. There is of course a natural tension between data analysts, who want comprehensive information, and field supervisors, who are likely to see all of the disadvantages but few of the advantages of administering a lengthy, complex questionnaire. Each side must be prepared to make compromises and carefully listen to the other side's point of view.

Factors for Deciding among Various Survey Designs

After the members of the survey design team have been selected, work can begin on designing the survey. The first task for team members is to review the factors that influence the overall design of the survey. This section discusses those factors in detail.

The appropriate design of a household survey or sequence of household surveys differs from country to country. The most important factors for determining the design of a proposed survey are: the kinds of policy issues the survey aims to address; the information available from existing surveys and other data sources; the country's institutional capacity for collecting data; and the financial and other resources available for implementing the survey, including any constraints on how these resources can be used.

Policy Issues

The design of a household survey should reflect the policy issues it is intended to address. One way to classify policy issues is in terms of their subject matter, such as health, education, employment, or housing. Another way to classify policy issues is in terms of the kinds of data used to address them. The four most common kinds of household survey analysis used to address policy issues are: simple descriptive statistics on living standards; monitoring poverty and living standards over time; describing the incidence and coverage of government programs; and measuring the impacts of policies and programs on household behavior and welfare. This subsection reviews these four types of analysis and provides a practical example of how the information needed affects the design of the survey.

SIMPLE DESCRIPTIONS OF LIVING STANDARDS. The most straightforward objective for a household survey is to describe the living standards of the population at one point in time, often with particular emphasis on the living standards of the poor. This can be done by using the data to tabulate means and frequencies of key variables. The results of these tabulations are often disseminated by the national statistical agency in the form of statistical abstracts (reports) that contain a large number of tables and a minimal amount of descriptive text. It is also possible to produce more structured descriptive analyses that supplement household survey data with information from other sources. Structural analysis of descriptive data can sometimes be used to draw conclusions about the likely impact of government policies on living standards. Examples of such analyses are the "poverty profiles" typically provided in the World Bank's poverty reports.

In both types of descriptive analysis, the range of variables used to measure living standards can vary widely; variables may be used from virtually all of the survey modules or from only a small subset of modules. In general, most of the variables included in statistical abstracts and descriptive analyses come straight from the questionnaire (for example, percentage of households that have electricity) or require only a small amount of manipulation (for example, nutritional status as derived from weight and height data). Only one "complex" variable needs to be constructed: total household consumption. Other complex constructed variables, such as total income or net wealth, are used less often in simple descriptive presentations.

MONITORING POVERTY AND LIVING STANDARDS. The descriptive analyses discussed above focus on living standards at one point in time. However, another important role of multitopic household surveys is to monitor how living standards and poverty change over time. When data are used for this purpose they must be comparable over time; for this to be the case, the data must be gathered using the same methods each time the survey is implemented. One aspect of such consistency concerns the design of the sample, which in each case must use the same definitions of basic concepts such as the distinction between urban and rural areas. A second requirement for comparability is that the questions defining variables of interest must remain the same each time the survey is administered. This is necessary because seemingly innocuous changes in the wording of questions can lead to serious comparability problems; changing the recall period for food expenditures can make it impossible to compare estimates of poverty and inequality over time.

Another issue to consider when monitoring poverty and living standards over time is the frequency with which indicators of living standards must be monitored. Indicators that are fairly stable over short periods of time, such as fertility and adult literacy, need not be measured each time the survey is done. However, indicators that can change more quickly, such as consumption expenditure, children's nutritional status, and employment status, should be measured every time the survey is implemented. Surveys that monitor poverty and living standards over time are typically fielded every year, although it is also possible to field them biennially or semiannually.

EXAMINING THE INCIDENCE AND COVERAGE OF GOVERNMENT PROGRAMS. Data from multitopic household surveys can also be used to measure the incidence and coverage of specific government programs. For example, data on the enrollment of household members in public schools are useful for investigating which children benefit from the provision of public schooling. Household survey data can also be used to study participation in government assistance programs such as food stamps, cash assistance, and school meals. Another example is descriptive statistics on purchases of subsidized food items, which can be used to examine whether the benefits of specified food subsidies vary by households' levels of income. The incidence and coverage of these different kinds of programs are easy to calculate and useful for policymakers to know.

A moderate sample size (2,000 to 5,000 households) should be sufficient to evaluate programs that affect a large proportion of the population. Evaluating programs that serve a small proportion of the population usually requires using a much larger sample of households or including a disproportionately large number of target and beneficiary households in the sample.

ESTIMATING THE IMPACT OF POLICIES ON HOUSEHOLD BEHAVIOR AND WELFARE. Policymakers are often faced with questions that can be answered only by analyzing household behavior. Policymakers may want to know how changes in commodity taxes or subsidies would affect agricultural production or the consumption of basic food items. Answering such questions requires calculating price elasticities and thus modeling households' production and consumption decisions. Such modeling requires data that go well beyond measurement of living standards indicators.

In multitopic household surveys that attempt to model household behavior, each module that collects data on a behavior of interest is usually designed to gather information that can be used to estimate the impact of several different policy changes. The chapters in Parts 2 and 3 of this book discuss each module in great detail and provide many examples of issues that require the modeling of household behavior. The following questions give an idea of the range of policy issues that can be addressed: What is the impact of charging user fees at government health clinics on the use of those clinics by adults and by children? How can the government encourage parents to enroll their children in school? What are the impacts of women's employment opportunities on their fertility? How do changes in prices brought about by structural adjustment programs affect the welfare and productivity of agricultural households?

EXAMPLE SCENARIO: THE INCLUSION OR EXCLUSION OF THE ANTHROPOMETRY MODULE. The decision about including an anthropometry module demonstrates how the analytical potential of a multitopic household survey is related to its content. (See Chapter 10 for detailed information on the collection of anthropometric data.) If an anthropometry module is not included, the survey is not useful for studying nutrition issues. However, by collecting limited anthropometric data such as the height and weight of children under five years of age, the survey will allow analysts to describe the extent and patterns of malnutrition in early childhood. If the country studied has adopted large-scale food distribution or subsidy programs, the data can also be used to assess how well these programs are targeted to undernourished children. If the survey is repeated annually or biennially, it becomes possible to monitor changes in both the nutritional status of the population and the targeting of the program. Thus the data collected from a limited anthropometry module can address, at least partially, three of the four types of policy questions outlined above.

A full version of the anthropometry module would collect data on height and weight for all household members, not just children. Such data could be used not only to gauge adult health but also to analyze the impact of government policies on household welfare and behavior. Suppose policymakers want to predict the impact of food programs on children's nutritional status. This requires estimating the determinants of child weight and height. Because heredity is so important, parental height and weight information are needed to estimate these relationships accurately; lack of data on parents' weight and height could lead to estimates that suffer from omitted variable bias and thus do not accurately show the impact of the food programs on children's nutritional status. In general, including the full version of the anthropometry module in the survey, measuring both adults and children, greatly expands the possibilities for examining the impact of government policies on household behavior and living standards.

Defining the objectives of a survey is often a less tidy process than the discussion so far has implied. Institutions, and people within institutions, may have varying objectives. Each sectoral ministry in a country is likely to be primarily interested in its own subject. The government as a whole may want the surveys to measure or monitor only a few indicators of welfare, while academics in the country's universities and other research institutions may want the surveys to yield the detailed data needed to model household behavior. If international agencies are financing the survey, they may have still another set of objectives. For example, they may wish to ensure that the data are comparable with similar data from other countries or that the data can be used to study issues of interest to the development community in general, even if these issues are not a high priority in the country of the survey. Whatever the objectives envisaged when the survey is first designed, it is likely that researchers will later use the data for other analytical purposes.

The multiple (and sometimes competing) objectives of household surveys are to be expected and even encouraged, since each of the groups with a stake in these surveys has its own legitimate priorities. The task of survey designers is to accommodate the different objectives as much as possible without compromising the quality of the survey.

Other Information Available and Its Relation to Survey Objectives

No household survey takes place in a vacuum. In most countries there are several other household surveys that have gathered or will gather information on issues that the new multitopic survey is intended to cover. The extent to which data from these sources influence the design of the new questionnaire depends on the amount and type of data available and on the objectives of the new survey.

If the main objective of the new survey is to describe various aspects of the living standards of the population, it may seem that the topics already covered in other surveys need not be included in the new multitopic survey. For example, if the only goal of the new survey is to describe living standards and recent anthropometric data are already available from another survey, it may seem reasonable to drop anthropometric measurements from the new survey. However, there are two important advantages to collecting anthropometric data in the new survey. First, collecting these data would make it possible to produce descriptive tables that show simple relationships between nutritional status, as revealed by anthropometric measurements, and other variables of interest (for example, household expenditure levels). Second, collecting anthropometric data in the new survey would ensure that the anthropometric data used the same definitions and classification schemes as other survey data, and thus could be used to draw effective comparisons. If the two surveys classified, say, education levels or rural and urban areas differently, this would make it difficult to present analyses from the two surveys side by side in ways that would be simple to interpret. Analysis based on combining results from separate surveys will usually be more difficult, and thus more prone to error, than analysis based on data that have all been collected in a single survey.

The case for collecting anthropometric data is even stronger if the purpose of a new survey is to investigate the impact of nutritional status on other socioeconomic outcomes (such as education, fertility, or labor force productivity). This objective implies that the survey must include an anthropometry module, even if recent information on nutritional status is available from other sources. To conduct these kinds of analyses, the variables of interest must all come from the same household survey.2

Although it is essential that data on key household-level variables come from the same households, it is often useful to supplement household survey information using data from a source other than a multitopic survey. In some cases, price data collected for generating consumer price indices can replace the price questionnaire typically used in LSMS and other multitopic household surveys (see Chapter 13). Other such alternative data sources are time-series data on weather and maps of soil quality and topography, all of which can be used for analyzing agricultural issues. In health and education, further possibilities arise for matching household survey data with data from other sources; some countries collect data from health clinics and schools that may be matched with the communities covered in a household survey. However, survey designers should exercise caution when contemplating this approach. Although matching data from different sources appears simple, it is often very difficult in practice. Many of the chapters in Parts 2 and 3 of this book discuss the potential for matching household survey data to data from other sources.

An important question that often arises when planning a multitopic survey is whether such a survey can replace one or more existing surveys and thereby reduce total costs without any loss of information. A multitopic survey with an anthropometric module could replace a periodic anthropometric survey primarily intended to measure the extent of malnutrition. However, a multitopic survey cannot replace all other household surveys. Labor Force Surveys often require much larger sample sizes and more frequent data collection than would be appropriate for a multitopic survey. And specialized surveys, such as Demographic and Health Surveys and comprehensive farm management surveys, contain much more data on those topics than can usually be collected in LSMS and similar multitopic surveys. Still other surveys, such as farm surveys and small business surveys, have very different sampling frames since they are based on samples of farms or businesses rather than samples of households.

A final issue is whether survey designers should implement an entirely new survey, modify an ongoing survey, or find creative ways to analyze existing data. Two arguments support implementing a new survey. First, past surveys may not have been adequately documented, or access to their data may be restricted. Second, inter-agency rivalry, arguments concerning ownership of survey data, and coordination problems when different surveys are carried out by different agencies may make it easier to begin a new survey instead of using existing data or modifying an existing survey. On the other hand, survey designers should at least consider trying to remedy these problems so that existing surveys can be used (perhaps with some modifications) to meet the designers' data and policy objectives, thus avoiding any unnecessary duplication. Examples are given later in this chapter of countries in which an existing survey was modified to be more like an LSMS survey.

Institutional Capacity

Decisions about what kind of survey to implement also depend in part on the institutional capacity for collecting data in the country undertaking the survey. Because maintaining data quality becomes more difficult when surveys become more complex, this capacity should be carefully assessed when planning LSMS or similar multitopic surveys, which can be very complicated. In countries where the capacity to collect data is weak, it may be better to implement a limited multitopic survey yielding reliable data on a relatively small number of topics than an overly ambitious survey that could yield unreliable data on a wide variety of topics.

A survey containing 10 modules is easier to plan and implement than a survey containing 15 or 20 modules. The fewer the modules, the less time is needed by survey planners to contact different sectoral agencies and thus the less time is needed to build consensus. Also, smaller questionnaires require less time to design, and less time to carry out the fieldwork, enter the data, and manage the database. However, other steps in developing and implementing a household survey, such as planning the sample design, do not vary with the size of the questionnaire. Therefore, a survey with a questionnaire half the size of the questionnaire for a full multitopic household survey will involve substantially more than half the effort required for a full survey.

Despite the complexities of full multitopic surveys, some very successful multitopic surveys have been carried out in countries with very limited data collection capacity. Several steps can be taken to overcome the problems posed by limited capacity. For LSMS surveys, international experts have been brought in to draw the sample, draft the questionnaires and interviewer manuals, and write the data entry program. Such experts initially substitute for government agency staff, but they can also train agency staff to take their place in future surveys. It is highly recommended that countries with limited capacity for collecting survey data use such expert assistance.

In countries with weak institutional capacity, serious consideration should also be given to improving that capacity; capacity building yields long-term benefits that gradually reduce the need to use international experts to help with data collection. Genuine capacity building takes time, money, and political and managerial effort. An international sampling expert may be able to design and draw a sample for a survey in a few days, but it will usually take him or her much longer to teach local staff how to do so.
Training to build capacity requires significant resources beyond those already budgeted for a survey. Whether building a country's data collection capacity is important enough to warrant committing these resources will vary from country to country. Where capacity building is deemed necessary, the survey's work plan and budget must both be significantly enlarged.

If capacity building is a goal, a program of annual (or biennial) surveys will work better than a program for a single survey or for a sequence of surveys that take place every three to five years. An annual survey usually has a permanent allocation of skilled staff, staff time, and equipment. Even when the team works only part of the year on the multitopic survey, staff have a chance every year (or every two years) to use the skills that they have acquired in managing such a survey. And as the staff of the agency develop survey management skills, the need for technical assistance from international experts should diminish. When some staff members working on the survey leave, their replacements can learn their jobs from other staff members who have worked on earlier rounds of the survey. In addition, the continuity provided by an annual survey may make it easier to improve survey quality; if one year a problem arises in data collection or initial analysis, the people who deal with the problem are likely to be involved in planning the next survey and can better address the problem in the next survey.

In contrast, a survey carried out every four or five years may require new skills, staff, and equipment each time it is implemented. In the intervening period, many of the individuals who carried out the first survey may have moved on to other jobs either inside or outside the statistical agency. Those who remain may not have been involved in planning the previous survey, and the skills of those who were involved may have deteriorated over time. Vehicles and computers used in the first survey will have been allocated to other purposes, and some may have ceased to function altogether. Most importantly, much of the institutional memory about problems and potential solutions may have been lost.

A final note of caution is needed regarding institutional capacity. Sometimes, even when a statistical agency has sufficient management and technical capacity to implement a complex multitopic survey, there may not be enough experienced supervisors, interviewers, or data entry operators. Lack of data entry operators is not a serious problem since they can be trained in a matter of weeks, and no previous experience is required. However, it takes longer to transform government staff with no household survey experience into competent interviewers. While interviewers may be trained in a month, it is not so easy to compensate for little or no interviewing experience. Experience is even more important in the case of supervisors. It may take years to overcome shortages of experienced interviewers and supervisors.

Constraints Imposed by Funding Sources

Surveys are always constrained by their funding. Most LSMS and similar multitopic household surveys receive some portion of their financing from sources other than the national budget, at least initially. As a result, they are subject to constraints associated with both the national budget and funds from external sources.3

The first and most obvious constraint imposed by the source of funding is the total amount of funds available. National budgets are often very restricted. Some external funding sources have upper limits for how much may be spent on a single project, and most have administrative procedures that grow in complexity as the size of a project increases. Also, the larger the survey budget, the more difficult it is for survey planners to justify using the money for the proposed survey rather than for some other purpose. Limitations on the size of the budget often constrain the size of the sample used in the survey and in some cases curtail the survey's analytical depth and breadth.

Another potential constraint relates to the time period over which funds may be spent. Funding agencies may stipulate that a survey project be completed in only one or two years, even though a single full-scale survey can easily take three years or more to complete: 6 to 18 months to plan, a year for fieldwork, and 6 to 18 months for data dissemination and analysis. Moreover, chances of obtaining future funding can influence whether a proposed survey is carried out only once or is the first of a series of surveys. And funding limitations can affect such other aspects of the survey as the thoroughness of the survey designers' work during the planning stage, whether the fieldwork is spread over a full year or concentrated into a period of a few weeks, and the amount of analytical work funded from the survey project's budget.

Finally, many funding agencies also have rules on how survey funds can be spent. These rules may impose controls on: the percentage of funds spent in local or international currency; the balance between recurrent and investment costs; the amounts that can be spent on the salaries of local staff, survey equipment, and payment of international experts; the nationalities of such experts; and various aspects of budgeting, accounting, and procurement. Spending rules rarely influence big issues of survey design (such as survey duration, sample size, or questionnaire design), but they can affect many details of the structure and implementation of a project. Rules that prohibit the shifting of expenditures between items or between time periods may limit the ability of survey planners to deal with unanticipated problems. For example, an additional international expert may be needed quickly, but may be difficult or impossible to obtain because hiring this expert was not included in the original budget. The end result can be a delay in the survey or a reduction in quality. Another example is if an accident occurs involving a survey vehicle; fieldwork may be delayed if expenditures to repair or replace the vehicle cannot be made available promptly.

Summary

The analytical objectives of a survey, the availability of information from other sources, local institutional capacity, and constraints imposed by funding are all key factors that typically affect whether to perform a new survey and the form such a survey will take. Many other factors are also critical, including institutional inertia and rivalry and the compromises required to build a coalition to support and conduct a survey. However, it is difficult to provide general advice because these factors usually depend on the setting of the survey; survey planners must deal with these issues as best they can given the particular circumstances they face.

Classic Survey Designs

There are three basic ways to combine modules into questionnaires and combine those questionnaires into a survey or sequence of surveys: the full LSMS-type multitopic survey, the scaled-down LSMS-type survey, and the core and rotating module survey. All of these survey formats must include certain "core" components. This section outlines the core survey components, discusses the strengths and weaknesses of each of the three main survey types, and describes two other survey options.

The Core

Any LSMS-type multitopic survey must collect certain essential information about the household, its members, and the local community, including:
* A roster that lists, and collects basic information about, all household members.
* Detailed information on household consumption expenditures.
* Basic housing data such as type of dwelling, water source, type of toilet, and whether the dwelling has electricity.
* The education of all household members, including who is currently in school.
* The employment status of everyone of working age and, for those who are working, their occupation, the number of hours they worked during the previous seven days, and, if they are employees, their wage earnings.
* The receipt of money or in-kind assistance from key government or NGO programs.
* The use of social services and programs, such as government health facilities, schools, agricultural extension services, and social assistance programs.
* Basic information related to the design of the sample and the outcome of the household interview.
* Local prices of basic food and nonfood goods (unless price data are available from another source, or the country is so small and its markets so well integrated that there is very little regional price variation).

These components are referred to in this book as the essential core. In addition to the essential core, it is highly recommended that the following five types of information be collected in LSMS and similar multitopic surveys:
* Anthropometric measurements (height and weight) of children 0-5 years old (unless malnutrition is known to be negligible in the country).
* The immunization status of children 0-5 years old.
* Information on basic household assets such as durable goods, housing, land, and the capital equipment used for agricultural activities and nonagricultural household enterprises.
* Information on interhousehold transfers.
* Information on rental payments for those households that rent their dwellings.
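As an illustration only, the two lists above can be written down as a small checklist that a design team might run against a draft questionnaire plan. The sketch below is in Python; the short keys (`roster`, `prices`, and so on), the dictionary names, and the `check_core` function are hypothetical labels invented for this example, not names used by the LSMS. The descriptions paraphrase the bullets, and the `prices_available_elsewhere` flag stands in for the exception the text allows for the price questionnaire.

```python
# Hypothetical checklist of core components, paraphrasing the bulleted lists.
ESSENTIAL_CORE = {
    "roster": "Household roster with basic information on all members",
    "consumption": "Detailed household consumption expenditures",
    "housing": "Basic housing data (dwelling, water, toilet, electricity)",
    "education": "Education of all members, including current enrollment",
    "employment": "Employment status, occupation, hours, wage earnings",
    "assistance": "Money or in-kind assistance from key programs",
    "services": "Use of social services and programs",
    "metadata": "Sample design and interview outcome information",
    "prices": "Local prices of basic food and nonfood goods",
}

# The five additional components that the text recommends adding.
RECOMMENDED_EXTRAS = {
    "anthropometry": "Height and weight of children 0-5 years old",
    "immunization": "Immunization status of children 0-5 years old",
    "assets": "Basic household assets",
    "transfers": "Interhousehold transfers",
    "rent": "Rental payments for households that rent",
}


def check_core(planned, prices_available_elsewhere=False):
    """Return (missing essential, missing recommended) component keys."""
    planned = set(planned)
    essential = set(ESSENTIAL_CORE)
    if prices_available_elsewhere:
        # The text lets price data come from another source instead.
        essential.discard("prices")
    missing_essential = sorted(essential - planned)
    missing_recommended = sorted(set(RECOMMENDED_EXTRAS) - planned)
    return missing_essential, missing_recommended


if __name__ == "__main__":
    draft_plan = [
        "roster", "consumption", "housing", "education", "employment",
        "assistance", "services", "metadata", "anthropometry",
    ]
    missing_essential, missing_recommended = check_core(
        draft_plan, prices_available_elsewhere=True
    )
    print("Missing essential:", missing_essential)
    print("Missing recommended:", missing_recommended)
```

A flat checklist like this cannot capture the other exceptions in the text (for example, dropping anthropometry where malnutrition is known to be negligible); those remain judgment calls for the design team.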
In this book the set of modules formed by adding these five components to the essential core is referred to as the recommended core.

The essential core of an LSMS or similar multitopic survey collects the information needed to describe poverty and to monitor it over time. The recommended core adds some very basic child health information, along with information on assets, interhousehold transfers, and rental payments (the use of which will be explained below). Judgments about which data are part of the essential and recommended cores are based on many years of experience that World Bank staff have in using data from LSMS surveys to produce poverty profiles for a wide range of developing countries. Table 2.1 lists the components of both the essential and recommended cores of LSMS-type multitopic surveys. The paragraphs that follow describe each of these components in greater detail.

HOUSEHOLD ROSTER (ESSENTIAL). Virtually every household survey should begin by determining how many people belong to each household and collecting very basic information on each household member, including age (or date of birth), sex, nationality, relationship to the head of household, and marital status. Part A of the household roster introduced in Chapter 6 (and provided in Volume 3) collects such basic information, along with another piece of information that is less essential: questions that link each married individual to his or her spouse.

CONSUMPTION EXPENDITURES (ESSENTIAL). The experience of LSMS surveys and other household surveys strongly suggests that household consumption expenditures are the single most important indicator of household welfare that can be obtained from a household survey. (See Chapters 5 and 17 for further discussion on this point.) Chapter 5 describes how to collect data on consumption expenditures, stressing that there are no costless shortcuts for collecting such data. In some circumstances it might be possible to omit questions on the ownership of durable goods and on transfers given to other households, but the rest of the consumption module is an essential part of the core and should not be reduced further. Data on household expenditures on education, health, and housing are collected in the core elements of those modules (discussed below) and need not be included in the consumption module. Consumption in the form of in-kind payments (such as meals, clothing or transportation) from employers is best collected in the employment module.

Table 2.1 The Essential and Recommended Cores of LSMS-Type Multitopic Surveys

Module: Sections used

The essential core
Household Roster: All of Part A except questions 8 and 9
Consumption: All questions except transfers given to other households (Part D) and ownership of durable goods (Part E)
Housing: Questions A1-A7, B1-B5, C1-C3, and C13-C24 of the short module
Education: All questions in the short module
Employment: Questions A2-A13, B1, B2, B7-B10, D3, D4, and D8-D17
Transfers and Other Nonlabor Income: All of Part B1; see text for further discussion
Health: Questions 10-38 of the short module
Metadata: Household Identification and Control submodule; questions 1-4 in Summary of Visits and Interviews submodule
Prices: 30-40 food items and 10-20 nonfood items
Credit: Questions 9-14 and 21-28 of the short module (on credit obtained from NGOs or government agencies)
Agriculture: All of Part F (use of agricultural extension services), which is the same for all modules

Additional components for the recommended core
Anthropometry: Entire module, for children 0-5 years old
Health: All of Part C (immunization)
Consumption: All of Parts D and E
Housing: Questions C7-C12 of the short module
Household Enterprises: Part G of the short module, questions 1-3
Agriculture: Parts A, B, and E of the short module
Transfers and Other Nonlabor Income: Questions on income from interhousehold transfers

Source: Authors' recommendations.

HOUSING (ESSENTIAL). Information on housing, including the type of dwelling, the construction materials used, the number of rooms, the availability of electricity, the source of drinking water, and the type of toilet, are very basic indicators of living standards. They provide analysts and policymakers with information on a household's standard of living that goes beyond consumption expenditures. Because housing information is simple to collect, it should be included in any LSMS-type survey. The long version of the housing module introduced by Chapter 12 (and presented in Volume 3) collects substantially more housing data than are necessary for the essential core, and even the short version is longer than the essential core. Only the following questions from the short version of the housing module need to be included in the essential core: A1-A7, B1-B5, and B11-B21. Even some of these questions can be omitted in some countries. The questions on heating (B18-B21) can be removed for countries with warm climates, and the questions that distinguish between wet and dry seasons (B1-B3) can be simplified in countries where this distinction is not important.

Another part of the core data set, housing-related consumption expenditures (such as expenditures on electricity, water, and cooking fuel), are most conveniently collected in the housing module rather than in the consumption module. A final useful indicator of living standards is information on the ownership of the household's dwelling. Questions C1-C3 in the short version of the housing module collect expenditure information, and questions C13-C24 collect ownership information.

EDUCATION (ESSENTIAL). Education is both a determinant and a key indicator of living standards. The short version of the education module introduced in Chapter 7 (and presented in Volume 3) comprises all of the essential core questions on education. The only questions that might be omitted are the two questions on grade repetition.

The short education module assesses education from several different angles, including school attainment, current enrollment, and education expenditures. Information on the school attainment of household members ages 5 and older is easy to collect and has many analytical uses (such as classifying households in terms of the education level of their head of the household). School enrollment among children, another key indicator of living standards, is also easy to collect. And it is usually more convenient to collect information on household education expenditures in the education module than in the consumption module.

EMPLOYMENT (ESSENTIAL). Basic employment information on household members of working age (7 and older in many countries) should be collected as part of the essential core of any LSMS-type survey. The most important source of income for poor people in developing countries is their labor; employment data, including information on unemployment, indicate how this labor is being used.

Essential employment information includes each person's occupation and the number of hours that he or she has worked in the previous seven days. While it would also be useful to gather data on the incomes of all employed household members, this is not easily done for the self-employed (see Chapter 17 for further discussion). However, income data should still be collected for employees even when such data cannot be collected for the self-employed, for two reasons. First, these partial data are useful for understanding which occupations pay well and which do not. Second, since data are already needed from employees on in-kind benefits provided by their employers (in order to calculate consumption expenditures), it would seem strange to ask about those benefits without first asking about money income.

The short version of the employment module introduced by Chapter 9 (and presented in Volume 3) collects more information than the core of an LSMS-type survey requires. The following questions from the draft employment module constitute the essential core: A2-A13, B1, B2, B7-B10, D3, D4, and D8-D17. Job-specific information (questions in Parts B or D) should be collected both for the person's main occupation and for any secondary occupation. (The main occupation is the job the respondent spent the most hours doing during the previous seven days.)

GOVERNMENT AND NGO TRANSFERS (ESSENTIAL). Many developing countries have programs that provide money or in-kind assistance to households. Some of these are government programs and some are run by nongovernmental organizations (NGOs). Examples of these programs include cash welfare payments, pensions, unemployment insurance, food stamps, food rations, school feeding programs, community soup kitchens, scholarships, and free or subsidized textbooks. While the range of programs is very wide, there are usually only a few sizable programs in any particular country.

A key policy question that LSMS surveys can address is who benefits from these programs. However, only programs that reach a substantial fraction of the population can be studied with the relatively small sample sizes recommended for LSMS and similar multitopic surveys.

Questions about government and NGO transfer programs should not necessarily all be in the same module (a fact that makes this part of the core difficult to standardize). While questions about cash income fit best in the transfers and other nonlabor income module, questions on school feeding programs should probably be in the education module. However, Part B1 in the transfers and other nonlabor income module is a good place to start collecting this information.

SOCIAL SERVICES (ESSENTIAL). Related to programs that provide cash or in-kind assistance are programs that provide services. The most common examples of social services are public schools, public health services, agricultural extension services, credit programs, public work schemes, electricity supply, public water supply, and sewage systems. LSMS surveys and similar multitopic surveys should always collect some information on the use of social services, at least enough to measure variation in access to and utilization of such services across different socioeconomic groups.

As with direct assistance to households, the types of programs and the amount of detail needed to identify who benefits from them will vary among countries. School enrollment information is already collected in the core, as discussed above, although additional information may need to be collected on any school services that are available to some students and not to others, such as tuition waivers or afterschool programs for disadvantaged students. Information on the use of public health services is also very important; such information is collected in questions 10-38 of the short version of the health module (introduced in Chapter 8). Data on the use of agricultural extension services are collected in Part F of all versions of the agriculture module introduced in Chapter 19. Some countries have subsidized credit programs to assist poor households; information on these programs is collected in questions 9-14 and 21-28 of the short version of the credit module introduced in Chapter 21. It is possible to identify beneficiaries of public works programs by adding one or two questions to the employment module that ask whether an individual's current employment is related to such a program. Finally, information on housing-related physical infrastructure services (such as water, sanitation, and electricity) is collected in the core of the housing module, as discussed above.

METADATA (ESSENTIAL). The last type of information that must be collected in the household questionnaire of any LSMS-type survey consists of basic data on where the household fits in the sample and on the outcome of the interview. This type of information, known as "metadata," is discussed in Chapter 4. For the essential core, it is not necessary to collect all of the information covered in the metadata module. The essential metadata are the date of the interview or interviews, the identification (ID) codes for the household and its primary sampling unit,4 the ID codes of the interviewer and the other team members who collected, checked, or entered the data from that household, information on whether an interview actually took place (and if not, why it did not), and perhaps some data on the ethnic group and religion of the household. This information is collected in the metadata module, in the Household Identification and Control submodule and in questions 1-4 of the Summary of Visits and Interviews submodule.

PRICES (ESSENTIAL). Price information should be collected at the level of the community (the primary sampling unit) since all households in a given community face the same prices. How to collect price information is discussed in Chapter 13. The main task is to select the items for which price data will be collected. While the exact items will vary across countries, prices should be collected for at least 30-40 of the most commonly consumed food items and 10-20 of the most commonly purchased nonfood items. In a few countries other sources of reliable price data may already exist for both urban and rural areas; if these data can be matched to the communities covered in the survey, there is no need to collect new price data. And in some small countries such as Jamaica, prices vary little among regions. In these cases, no price data need to be collected as long as national price data exist that show changes in prices over time.
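
The guidance on prices amounts to two small checks: whether a price questionnaire is needed at all, and whether the item list is long enough. The following is a minimal sketch in Python under stated assumptions: the function names, argument names, and the use of 30 and 10 as lower bounds (the bottom of the 30-40 and 10-20 ranges given above) are illustrative choices, not part of the LSMS materials.

```python
def price_collection_needed(matched_external_prices: bool,
                            negligible_regional_variation: bool,
                            national_price_series: bool) -> bool:
    """Decide whether the survey must collect its own price data.

    All three arguments are hypothetical yes/no judgments made by the
    survey planner, following the two exceptions described in the text.
    """
    # Exception 1: reliable external price data exist and can be matched
    # to the communities covered in the survey.
    if matched_external_prices:
        return False
    # Exception 2: prices vary little among regions AND national price
    # data exist that show changes in prices over time.
    if negligible_regional_variation and national_price_series:
        return False
    return True


def check_item_counts(food_items, nonfood_items, min_food=30, min_nonfood=10):
    """Return a list of problems with a draft price item list (empty if none)."""
    problems = []
    n_food = len(set(food_items))        # ignore accidental duplicates
    n_nonfood = len(set(nonfood_items))
    if n_food < min_food:
        problems.append(f"only {n_food} food items; at least {min_food} suggested")
    if n_nonfood < min_nonfood:
        problems.append(f"only {n_nonfood} nonfood items; at least {min_nonfood} suggested")
    return problems


if __name__ == "__main__":
    print(price_collection_needed(False, True, False))
    print(check_item_counts(["rice", "maize"], ["soap"]))
```

Note that little regional variation alone does not remove the need for price data in this sketch; the text requires that a national price series also exist.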
ANTHROPOMETRIC MEASUREMENTS (RECOMMENDED). Anthropometric data, particularly on height and weight, should be collected for children 0-5 years old in almost every LSMS or similar household survey. Stunting (low height for age) and wasting (low weight for height) are common measures of children's nutritional status; height and weight data are critical in countries where children are at risk of malnutrition. And collecting basic anthropometric data about children is simpler and more reliable than collecting other data on health status. The details of how to collect children's anthropometric data are explained in Chapter 10.

Collecting height and weight data requires some effort. The data are collected using special equipment that is bulky and troublesome to carry around to each household. One consequence of this is that another individual is often added to each survey field team. If collecting children's height and weight data were easier, such anthropometric measurements would have been classified as part of the essential core of any LSMS-type survey.

IMMUNIZATION (RECOMMENDED). Almost all LSMS and similar multitopic surveys should collect immunization records for children ages 0-5. In recent years child immunization programs have dramatically reduced the incidence of several life-threatening childhood diseases in many developing countries, significantly reducing infant and child mortality rates. However, many countries still do not have 100 percent immunization coverage. Therefore, information on the extent of coverage and on where coverage is low is important for almost any analysis of living standards. In addition, since child immunization coverage can change dramatically over a year or two, it serves as a useful indicator of changes in the provision of government services during periods of economic or social instability. Child immunization information is collected in Part C of the health module introduced by Chapter 8.

ASSETS (RECOMMENDED). Household assets include information on any consumer durable goods owned by the household, the value of owner-occupied housing, and the ownership of land and capital assets related to agricultural activities and household enterprises. There are several reasons for collecting these data. First, the possession of household durable goods such as radios, televisions, bicycles, motorcycles, and cars is a simple indicator of living standards. Second, the sum of the value of all these different household assets gives a rough (and admittedly incomplete) indicator of household wealth. Third, data on the ownership of land and on capital assets used in agricultural and nonagricultural enterprises indicate productive assets. Fourth, in some countries, particularly countries of the former Soviet Union, there is evidence that adding the consumption derived from durable goods and housing to total consumption can lead to substantial changes in the relative economic positions of different types of households.

Information on the ownership of consumer durable goods can be collected using Part E of the consumption module. Data on the value of owner-occupied housing are collected in the short version of the housing module, using (at minimum) questions C1 and C11, with C3 and C12 providing alternative valuations. A short set of questions on the assets used in household enterprises is provided in Part G of the short version of the household enterprise module; only questions 1-3 are needed. Parts A, B, and E of the short agriculture module collect a modest amount of information on households' land holdings, livestock, machinery, and other agricultural assets.

PRIVATE INTERHOUSEHOLD TRANSFERS (RECOMMENDED). Private interhousehold transfers, which are pervasive in many countries, are used by many households to cope with poverty and economic vulnerability. Transfers received are covered by the transfers and other nonlabor income module (introduced in Chapter 11) and transfers sent are covered by the consumption module (introduced in Chapter 5). At least the short versions of the private interhousehold transfer submodules should be used in virtually all surveys. Even in a relatively simple survey it may be worthwhile to use the standard version of the submodule on transfers received.

RENTAL PAYMENTS FOR HOUSING (RECOMMENDED). Estimates of the annual rental value of dwellings are needed to estimate the consumption value of housing for households that own these dwellings. In most countries such estimates can be calculated by estimating the relationship between basic housing characteristics, which are already part of the core, and the rental payments made by households that rent their dwellings. The key piece of information needed is the rental payments of households that rent. Questions C7-C12 in the short version of the housing module collect information on rental costs.

Full LSMS-type Survey

In practice, the essential core, and even the recommended core, will tap only a small part of the potential policy uses of an LSMS-type survey. In most LSMS and other multitopic surveys, much more can and should be added to the questionnaires to gather information beyond what is collected in the core. This subsection and the two that follow it discuss different ways to add to the core by expanding modules and combining them to form a survey or sequence of surveys.

A full LSMS-type multitopic household survey can be formed by combining the short or standard versions of most of the modules in the household questionnaire with the corresponding parts of the community and price questionnaires. This produces a household survey similar in design to the original LSMS surveys first used in 1985, except that the modules presented in Volume 3 of this book (and described in Parts 2 and 3) include revisions based on 15 years of experience with LSMS and other household surveys.

Because some of the standard versions of modules presented in Parts 2 and 3 are significantly larger than versions used in the original LSMS surveys, a household questionnaire including all of the standard modules would almost certainly be too large to be practical. Thus the household questionnaire of a full LSMS-type survey needs to be trimmed, either by replacing the standard versions of some modules with their short versions or by dropping some nonessential modules.

A well designed full LSMS-type multitopic survey collects information that measures or otherwise describes:

* Household consumption.
* Household income.
* Key nonmonetary indicators of welfare such as nutritional and health status, education status, and housing conditions.
* Many aspects of household behavior, such as income-generating activities, human capital investments, fertility, and migration.
* The local economic environment (including prices and the availability of services).
* Participation in specific government programs such as food stamps programs, job training programs, and agricultural extension services.

Having all this information for a group of households makes it possible to describe many indicators of living standards, estimate the determinants of different dimensions of living standards and different types of household behavior, and estimate the relationships between dimensions of living standards and household behavior (such as the impact of children's nutritional status on their school performance).

The full LSMS household questionnaire is long and complex. In almost all cases it is too long to be completed in a single visit by an interviewer to a household. Instead, an interviewer typically visits each household twice. All of the individual-specific modules (roster, education, health, employment, and migration) are administered in the first visit, sometimes with the addition of one or two household-level modules such as housing. The interviewer makes an appointment for a second visit, usually about two weeks later, to reinterview household members who are most knowledgeable about the other household-level modules (such as consumption, agriculture, and household enterprises). To ensure that high quality data are collected and to keep the budget within reasonable limits, the samples in full LSMS-type surveys are usually relatively small, between 2,000 and 5,000 households. Samples of this size are still large enough to provide accurate information on the nation as a whole, on rural and urban areas, and on a small number of geographic regions. However, such samples are not large enough to provide accurate statistics for each state, province, department, or district in a country. Even at the national level, they cannot provide precise information on phenomena that do not pertain to most households or individuals, such as post-secondary education or participation in a program used by only a small fraction of the population. See Grosh and Muñoz (1996) for a more thorough discussion of sample size and sampling issues.

In most cases it is not worthwhile to implement a full LSMS-type multitopic survey every year. Much of the analysis for which LSMS surveys are designed does not need to be repeated annually. For example, while it is important to understand the determinants of fertility, it is unlikely that these determinants change greatly from one year to the next. Sizable changes are
likely to occur only over the course of several years, as economic conditions and people's attitudes change. Another reason not to implement a full survey every year is that it is costly to administer such a comprehensive household questionnaire, and requires substantial work at each stage. Therefore, a full LSMS-type survey should be implemented only once every three to five years.

During 1985-99 the following countries implemented full-size LSMS surveys: Algeria, Brazil, Côte d'Ivoire, Ecuador, Ghana, the Kyrgyz Republic, Mauritania, Morocco, Nepal, Pakistan, Panama, Peru (in 1985-86, 1991, and 1994), Turkmenistan, and Vietnam.

Scaled-down LSMS-type Survey

A scaled-down LSMS-type survey can be constructed by omitting some modules from the household questionnaire of a full LSMS-type survey and by abridging other modules. Such a survey will still be a multitopic survey, but will cover fewer topics than a full-size survey would. Substantial reductions in the size of the household questionnaire may mean that the questionnaire can be completed in a single visit by the interviewer to the household, as compared to the two visits needed for a full LSMS-type multitopic survey.

The extent to which various modules should be reduced or eliminated will depend on which policy questions are most important in the country in question. However, there is a limit to how much the questionnaire can be cut. The essential core of an LSMS or similar multitopic survey, as described above, must remain. In addition, the elements that are added to form the recommended core (data on anthropometrics, child immunization coverage, basic household assets, interhousehold transfers, and rental payments of households that rent their dwellings) should almost always be included. The community questionnaire may or may not be included in a scaled-down survey, but the price questionnaire should always be used, except in those rare cases in which fully adequate price data already exist or price variation across regions is negligible. Overall, the analytical objectives of a scaled-down LSMS-type survey are more modest than the objectives of a full-size survey.

One common way to abridge the questionnaire is to decide not to collect the data needed to measure total household income. Not measuring total household income allows survey designers to delete most of the agriculture and household enterprise modules, retaining only questions on agricultural extension services that are part of the essential core and questions on assets that are part of the recommended core. Yet the questions on wages from the employment module and the questions on public and private transfers from the transfers and other nonlabor income module should be retained, as they are part of the essential core of any LSMS-type survey.

Other questions that can be dropped are questions on any aspects of household behavior that are of little interest to policymakers. The savings, credit, fertility, and migration modules have often been deleted from previous LSMS surveys. Because the new time-use module is quite lengthy, it is also a candidate for omission, unless data on time use are of particular interest to policymakers. If analysts aim to measure use of social services but not to estimate the determinants of demand for them, survey designers could choose to use the short, rather than the standard, versions of the health and education modules.

An alternative way to obtain a scaled-down multitopic survey is to "scale up" an existing single-topic household survey, such as a labor force or household expenditure survey. In Romania, Latvia, and Bangladesh, new modules on the use of social services and programs were added to existing household income and expenditure surveys. In Guyana, households that had been interviewed in a previous income and expenditure survey were revisited to collect information on health, education, and anthropometrics; the separate data files were later merged for purposes of analysis. In Jamaica, households from the Labor Force Survey were revisited by interviewers who administered the Survey of Living Conditions; the two data files were later merged. In Paraguay, additional modules were added directly to the Labor Force Survey questionnaire.

Scaling down the household questionnaire of a full LSMS-type survey reduces the analytical potential of data collected, especially in parts of the questionnaire that are dropped or abridged. A reduced questionnaire produces fewer descriptive statistics on many dimensions of household welfare than would be possible using a full-size survey. Data from a scaled-down questionnaire can be used to analyze only a few of the determinants of living standards. And such data reductions substantially reduce the range of analytical methods that can be used.

A scaled-down LSMS-type survey can be implemented fairly often, perhaps annually or every other year. Such frequent implementation is desirable because one of the main uses of data from a scaled-down survey is to monitor changes in poverty and other dimensions of welfare over time. Also, the fact that a scaled-down survey collects less data on the determinants of household welfare and behavior than does a full-size survey means that implementing it frequently wastes less resources than would implementing a full LSMS-type survey every one or two years. Another advantage of a scaled-down survey is that it is easier and less expensive to carry out than a full-size survey. Finally, a scaled-down survey can be carried out using somewhat larger samples than a full LSMS-type survey because it is subject to fewer managerial and budget constraints.

Scaled-down LSMS surveys have been carried out, with World Bank support, in Albania, Azerbaijan, Bolivia, Bulgaria, Pakistan (1995/96 and 1996/97), Peru (1990), and Tanzania.

Core and Rotating Module Design

The "core and rotating module" design for a multitopic household survey is an attempt to combine the advantages of full and scaled-down LSMS-type surveys. In this design, a scaled-down LSMS-type survey forms the "core," while one or two modules are added or greatly expanded each time the survey is carried out. Modules that are added or expanded in any given year revert back to their "core" size the following year, creating a module "rotation" scheme for the modules that go beyond the core. In most cases the survey is fielded annually, although it can also be a semiannual or biannual survey. The core that is repeated each time the survey is implemented must include the essential core described above, and in almost all cases it should include everything in the recommended core. In many cases the core of a core and rotating module design should collect additional information as well, in order to provide a more detailed picture of household welfare each time the survey is implemented.

An example of how to implement this approach would be to use only the core in the first year of the survey, in order to focus on making sure that the core works well. In the second year the health module in the household questionnaire would be expanded to gather more detailed data on individuals' health status and behavior, the kinds of health care sought, and the cost and quality of that health care. In addition, a health facility questionnaire could be added (see Chapter 8 for further details on health facility questionnaires). In the third year the health module would return to its original "core" size and a new subject, such as education or savings, would be given special emphasis. Expansion of any particular module might require making some additions to other modules in the survey to ensure that the analytical potential of the data collected in the expanded module could be fully exploited. Each chapter in Parts 2 and 3 of this book explains what data are needed from other modules to complement the data collected in the module covered by that chapter.

The core and rotating module design is a hybrid of a full LSMS-type multitopic survey and a frequently implemented scaled-down LSMS-type survey. Implementing a core and rotating module survey annually would allow for the same monitoring of poverty and welfare that is possible with data from an annual scaled-down survey. In addition, in each rotation of a particular module, this kind of survey would collect the data analysts need to study the determinants of household behavior for a specific topic (in other words, data comparable to what are collected in a full-size survey). It might even be possible to use data from the scaled-down modules to study topics that are not emphasized by the survey in a particular year.

The cost and sampling implications of the core and rotating module design lie somewhere in between those of a full-size LSMS-type survey and those of a scaled-down survey. Perhaps of greatest concern in the core and rotating module design are the institutional arrangements for developing, implementing, and analyzing the special modules. While for both full-size and scaled-down LSMS-type surveys it is possible to put a lot of effort into the design of the first survey and give less attention to improving its design in subsequent years, implementing the core and rotating module design means that the questionnaire needs to be significantly modified each year, requiring much more attention from survey designers after the first year.

Indonesia's SUSENAS is a long-standing example of a core and rotating module survey design. Jamaica's Survey of Living Conditions, which began in 1988, was the first LSMS survey to adopt this approach. A new LSMS survey in Cambodia is just starting to develop such a system, as is the Bangladesh Household Expenditure Survey. (The Bangladesh Household Expenditure Survey is not usually regarded as an LSMS survey; however, it has adopted much of the LSMS methodology.)

Special Purpose Sample Designs

There are two other possible survey designs, both of which use special purpose samples (that is, samples that are not nationally representative). The first is a survey that samples a special population that is of particular interest for analytical or policy purposes.5 An example of this is a sample of households within a single city that is used to study issues pertaining to that city, such as the housing market, the water supply system, or urban air pollution. Two LSMS surveys of this type have been performed: one in the Kagera region of Tanzania, focusing on areas with high prevalence of AIDS, and one in rural areas of Northeast China, focusing on the agricultural activities of rural households.

A second kind of special purpose survey is one in which the sample is drawn solely for purposes of program evaluation. In this type of survey, a group is observed both before and after the benefits of a particular service or program are made available to this group. Alternatively, the sample may be composed of two groups, one consisting of the households who benefit from the service to be evaluated (the treatment group) and the other consisting of households that are similar to the first in every respect except that they do not benefit from the service (the control group).

These special-purpose samples usually gather detailed data on the topic being studied, whether it is a specific sectoral issue (such as agriculture) or a program to be evaluated. There are so many ways to design such surveys that this book cannot hope to cover all of them.

beyond debate; instead, they should be thought of as a starting point for making survey design decisions. This is the case for several reasons. First, the dividing lines between the three basic survey designs are flexible, as it is possible to develop "hybrid" surveys that merge characteristics from the different survey design options. Second, individual countries may not fit neatly into the categories implicit in the rules. Third, surveys may have multiple analytical objectives. Finally, funding constraints are not explicitly considered here.

Survey planners should consider the following general "rules of thumb" when deciding what kind of survey to implement:

1. Countries with sufficient institutional capacity to implement a complex survey should use either a full LSMS-type multitopic survey every three to five years or a core and rotating module design; both options can serve a broader range of analytical objectives than can a sequence of scaled-down LSMS-type surveys.
2. If annual (or biannual) monitoring of living standards or poverty is the most important analytical objective, either a sequence of frequent scaled-down LSMS-type surveys or a core and rotating module survey should be adopted. In contrast, a full-size multitopic survey is inappropriate because cost and efficiency considerations imply that such surveys should be implemented only every three to five years.
3. No new survey is needed if the main objective is to
However, since special purpose sur- provide periodic descriptive information (say, every veys typically collect data on many general character- three to five years) or to examine the coverage of istics of the sampled households (such as size, compo- government programs in countries where ample sition, living standards, labor force status, and data are already available from other sources. education), designers of this kind of survey can use the 4. If the main objective is to gather periodic modules proposed in this book as a guide for collect- descriptive information or to examine the cover- ing this supplemental information. The experience of age of government programs, a core and rotating past LSMS surveys has been used in designing special module design should not be chosen. Such a purpose surveys to evaluate the impact of educational design would collect data much more frequently reforms in El Salvador. And the Nicaragua LSMS sur- than is necessary. vey included a special sub-sample designed to evaluate 5. If the main objective is to model household behav- the impact of that country's Social Investment Fund. ior, either a full LSMS-type survey or a core and rotating module survey should be chosen. A series Matching Circumstances and Designs of scaled-down surveys would be insufficient for modeling household behavior. This section provides some approximate rules of 6. If the main objective is to model household thumb for choosing among the three common survey behavior and very little other data are available, a design options discussed in the previous section.These full-size multitopic survey is preferable to a core recommendations should not be thought of as rigid or and rotating module survey since the latter cannot 37 MARGARET GROSH AND PAUL GLEWWE supply detailed information on all topics until it Choosing the Modules, Defining Their has been in operation for several years. 
The core Objectives, and Setting Their Size and rotating module design can be adopted after one or two full LSMS-type surveys have been car- Once the basic blueprint of the survey has been select- ried out. ed, survey designers must decide which modules to 7. If the main objective is to model household behav- include in the household and community question- ior and a large amount of other data are available, naires.' Designers must also define specific objectives the core and rotating module survey is preferable to for each module and decide on each module's approx- a series of periodic full LSMS-type surveys because imate length. The procedures for these steps are dis- the core and rotating design allows poverty to be cussed in this section. Because decisions about length monitored more frequently over time. and objectives ultimately depend on many country- 8. If the institutional capacity in the country is lim- specific details, specific recommendations cannot be ited and the survey aims either to monitor pover- provided for each possible scenario. Instead, some gen- ty and living standards annually or to provide eral guidelines and procedures are provided that descriptive information (including coverage of should prove useful for completing this step efficient- government programs) periodically, a scaled- ly and effectively. down LSMS-type survey should be chosen. This Two general points must be made at the outset. survey may be either frequent (for annual moni- First, the tasks of choosing modules, defining their toring) or periodic (for descriptive information objectives, and setting their approximate size are all every three to five years). The other options, full closely related and thus must be done simultaneously multitopic and core and rotating module, are too rather than sequentially.The type of objectives and the complex for countries with limited institutional number of objectives have considerable implications capacity. 
for the size of each module; more objectives, and more Table 2.2 summarizes the implications of these complex objectives, necessitate a larger module. rules, showing which rules lead to which choices. Second, the objectives of each module should be con- Because countries vith little institutional capacity sistent with the overall objectives of the survey, in cannot implement a full LSMS-type multitopic survey terms of both the analytical objectives (describing liv- or a core and rotating module design on their own, ing standards, monitoring poverty and living standards, they will not be able to collect data that are useful for examining the coverage of government programs, esti- analyzing household behavior unless their institution- mating the impact of policies) and the specific topics al capacity is either permanently improved or supple- in which policymakers are interested. The overall mented in the short run by using international objectives of the survey already provide some infor- experts. In addition, significant purchases of new mation on what the objectives of many of the mod- equipment may be required in some countries. ules will be. Table 2.2 Recommended Survey Designs for Different Settings Analytical objective _ _ Describing living standards or Monitoring living Availability of other data examining program coverage standards or poverty Modeling household behavior Countries with sufficient institutional capacity Limited Full LSMS-type survey Core and rotating module Full LSMS-type survey (Rule 5 + Rule 6) (Rule I + Rule 4) (Rule I + Rule 2) Ampie No newv survey needed Core and rotating module Core and rotating module (Rule 5 + Rule (Rule 3) (Rule I + Rule 2) .................................................................................................................................... ...........................*................................................................. 
Countries with limited institutional capacity Limited Periodic scaled-down LSMS- Frequent scaled-down LSMS- Full LSMS-type survey (Rule 5 + Rule 6), type survey (Rule 8) type survey (Rule 8) Ample No new survey needed Frequent scaled-down LSMS- Core and rotating module (Rule 3) type survey (Rule 8) (Rule 5 + Rule 7)a a. International experts must be h red to carry out key tasks. Source: Authors' recommendat ons. 38 CHAPTER 2 MAKING DECISIONS ON THE OVERALL DESIGN OF THE SURVEY Choosing modules to be included in a scaled-down LSMS-type survey A good first step in choosing modules is to set the and would work in a core and rotating module survey upper and lower limits of what can be included in a only if it were chosen as the topic emphasized in a multitopic survey.The lower limit is the essential core particular year. The same circumstances apply to the discussed above; in almost all cases this lower limit environmental module. The full set of these environ- should be expanded to include the additional elements mental submodules is equivalent to a very large that are in the recommended core. The upper limit expanded module, and for this reason it is difficult to will depend on country-specific circumstances such as imagine the full set used in a single survey.The con- the capacity of the statistical agency and the willing- tingent valuation modules should be used only when ness of households to participate in lengthy inter- specific improvements in services (such as urban water views. It is never possible to include all of the modules supply, urban sanitation, urban air quality, or rural in any one survey. water supply) are being contemplated. 
An important question to address relatively early Even a subset of the expanded environmental when making decisions about modules is whether the modules is likely to be equivalent to a large expanded survey will attempt to collect enough data to calculate module, especially if the water, sanitation, and fuel use total income.The advantages and disadvantages of col- modules are included. This being the case, it is feasible lecting these data are discussed at length in Chapter to include a large subset of the environmental modules 17. Clearly, if survey designers decide to collect the in a full LSMS-type multitopic survey, but only a rel- data needed to calculate total income, the agriculture atively small subset can be included in a scaled-down and household enterprise modules need to be includ- survey. In a core and rotating module survey a large ed in the questionnaire.7 If designers decide not to subset of the environmental submodules can be used collect income data and there is little interest in these only if environmental topics are emphasized in that two modules, they can be dropped, except for the particular year; for all other years only a small subset questions on use of agricultural extension services that would be feasible. are part of the essential core and the asset questions At this point, it is useful to give some general rules that are part of the recommended core. about how much room there is for modules in differ- It will probably not be possible to collect total ent kinds of multitopic surveys. For a full LSMS-type income data in a scaled-down survey because it is not survey, the household questionnaire should be rough- feasible in a single visit to a household to collect the ly large enough to include a mixture of about 15 stan- recommended core data plus the data from the agri- dard or short modules. 
The number of modules that culture and household enterprise modules and still can be included in a scaled-down survey is probably have room to examine other topics. This implies that closer to 8 or 10, most of which have to be short ver- it is also difficult to collect total household income in sions. A core and rotating module survey lies some- a core and rotating module survey, except when the where in between but is probably closer to the scaled- module featured is either the agriculture or the house- down survey if only one visit is made to the hold enterprise module: even when one of these two household. The modules chosen must in all cases modules is featured in such a survey, collecting total include the components of the essential core; in almost income data may not be feasible in some countries. all cases the modules should also include the addition- Two other specific decisions to make early in this al components found in the recommended core. step of the survey design process are whether to col- Using these starting points for what is feasible, the lect time-use data and whether to implement a large next task is to consult with policymakers at the high- number of the detailed environment modules (see est level to get a detailed idea of which topics are of Chapters 22 and 14, respectively).The time-use mod- greatest interest to them (if this has not already been ule is very long, and as such should be thought of as done). Policymakers need to specify which topics are an expanded module. If survey designers choose to of overriding concern, which are of moderate interest, include this module, they may have to omit several which are of minor interest, and which are of little or other short or standard modules. While it would be no interest. 
Expanded modules, if they exist, should be feasible to include the time-use module in a full-size used for topics of overriding interest.8 Standard mod- LSMS-type survey, this module is probably too large ules should be used for items of moderate interest. 39 MARGARET GROSH AND PAUL GLEWWE Short modules may be appropriate for items of minor Second, for each module, survey designers should interest. Items of little or no interest need not be cov- match the policy issues raised by policymakers with ered in the survey unless they are part of the essential the data required to analyze them, as laid out in each or recommended core. chapter of Parts 2 and 3. One way to do this is to The core and rotating module survey design is choose the smallest version of each module that can inherently more flexible than other classic designs; if address all of the relevant policy issues, and remove any the core and rotating survey is implemented annually, questions in that module that are not needed to ana- it can cover four or five topics in great detail over the lyze these policy issues. If the module is still too long, same number of years by including the expanded ver- questions needed only to address the least important sion of one of these modules each year. Of course, sur- policy issues are deleted. This shorter module is vey designers still have to set priorities about which checked again to see if it exceeds the provisional size expanded module is included in the first year, which is limit.The general principle is that the most important included in the second year, and so on. policy questions are addressed first and additional The above paragraphs provide survey designers issues are added until the module has reached the with a scheme for generating a draft list of the mod- length that survey designers, in consultation with ules to be included in the survey, their approximate high-level policymakers, have set for it. 
length, and, to an extent, the objectives of each mod- Third, after this has been done for all modules, ule. Needless to say, this draft list needs to be refined. survey designers should prepare a list of issues they This can be done by adding two new "ingredients" to think can be covered by the survey and give this list the process: discussions with policymakers who spe- to the high-level policymakers, who will decide cialize in particular topics or programs, and a careful whether they would like to change the amount of reading of the chapters in Parts 2 and 3 of this book. space allocated to each module. The survey design- The task is to reconcile the specific policy questions ers should tell the policymakers about the tradeoffs raised by these more specialized policymakers with the involved, working with them to ensure that the feasibility of collecting data to analyze them (as dis- issues policymakers deem most important are cussed in detail in Parts 2 and 3 of this book) given the addressed. approximate sizes of each module as specified by high- Ultimately, this process produces a list of modules level policymakers.This process is not simple and con- to be included in the survey, the proposed length of sequently involves a certain amount of iteration. each module to be included, and the specific objec- Unfortunately, policy issues raised by specialist tives for the modules. This completes the second step policymakers often require more questions than can fit of survey design. This step may need to be revisited into a module of the size specified in the first draft of later if results of the field test show that the question- the modules. The choice at this point is between not naires are too long or that there is room to expand the including many of these policy issues in the survey and questionnaire. expanding the module containing these questions at the expense of other modules. 
A third alternative is Notes expanding the relevant module without reducing the size of any other module, but the feasibility of this The authors would like to express their gratitude to Jere Behrman, option is open to question and will not become clear Lawrence Haddad, Courtney Harold, John Hoddinott, Alberto until a draft questionnaire is field tested. Martini, Raylynn Oliver, Kinnon Scott, and Salman Zaidi for com- Given this situation and the uncertainty regarding ments on an earlier draft. what is feasible ancl what is not, survey designers 1. This book is designed to provide a thorough reviewv of inter- should use the following procedure to reconcile the national experience. However, new experience and knowledge will specific objectives of each module with any constraints continue to accumulate after the book has been published.Therefore, on module size. First, designers should ask policymak- until a new book is written, any new international-level information ers who specialize in a given topic to rank the policy is probably most easily obtained from international researchers. issues in order of importance, so that the module can 2. If geographic areas-rather than households-are the unit of collect the data needed to analyze the most important observation, it may be possible to merge data from different sur- policy issues despite the inevitable constraints. veys. However, this high level of aggregation yields less precise 40 CHAPTER 2 MAKING DECISIONS ON THE OVERALL DESIGN OF THE SURVEY results, raises issues of aggregation bias, and generally requires sur- 6. Each module in the household questionnaire should also be veys with very large samnple sizes. included in the community questionnaire. See Chapter 13 for fur- 3. A variety of external sources have been used to fund past LSMS ther discussion of the community questionnaire. surveys.World Bank loans have partially financed several LSMS sur- 7. 
While all versions of the household enterprise module col- veys. Grants from various bilateral development agencies (especially lect income information, only the standard and expanded versions from the United States, Scandinavian countries, andJapan) and mul- of the agriculture module collect sufficient data for use in the tilateral development agencies (particularly the United Nations measurement of total income. Development Programme and the United Nations Children's Fund) 8. A full LSMS-tvpe survey could accommodate two and possi- have wholly or partially financed a large share of LSMS surveys. In a bly three expanded versions of modules; a scaled-down survey could few cases, grants from the World Bank research budget have support- accommodate at most one.Volume 3 presents expanded versions of ed LSMS surveys. Similar surveys, such as the World Bank's SDA sur- the following modules: roster, education, health, employment, veys, RAND's Family Life Surveys, and a few other surveys in Africa, migration, enviromnent, household enterprises, and agriculture. The all receive a large share of their funding from external sources. time-use modules introduced in Chapter 22 should also be treated 4. Most previous LSMS surveys have used two-stage sample as expanded modules, and the same is even more true for the full set designs. If a three-stage sample design is used, ID codes will be of environmental modules introduced in Chapter 14. needed that identify both the primary and secondary sampling units of each household. An analogous comment applies to surveys References that use four or more stages in their sample designs. 5. In large countries with federal systems. surveys can be per- Grosh, Margaret, and Juan Mufioz. 1996. A allanualfor Planning and formed for individual states. Such surveys usually have the same Iinplenlentinig the Living Standard Mlieasurenzenr Study Surucy. 
general purposes as national surveys, and have samples that are rep- Living Standards Measurement Study Working Paper 126. resentative of the whole state. Washington, D.C.: World Bank. 41 3 Designing Modules and Assembling Them into Survey Questionnaires Margaret Grosh, Paul Glewwe, and Juan Munoz Chapter 2 outlined the five-step process that survey designers should follow to design LSMS and similar multitopic surveys. It also provided detailed recommendations on how to undertake the first two steps, which are deciding on the overall design of the survey and deciding which mod- ules to include in the survey questionnaire.This chapter discusses the last three steps of the five- step survey design process.The first section of this chapter describes the third step-drafting each module, question by question, to ensure that it will collect the data necessary to meet the mod- ule's objectives (which were laid out in the second step).The second section guides survey designers through the fourth step-coordinating the different modules and combining them to create a consistent and comprehensive set of questionnaires. The third section explains the proce- dures for the last step -translating the questionnaires into local languages and conducting a field test. The fourth section discusses the formatting of the questionnaires, which is an extremely important but often neglected aspect of designing successful multitopic surveys. Survey designers should refer to the material contained in the fourth section many times during the last three steps of the survey design process. In practice, the survey design process rarely moves the people involved in carrying out the fieldwork (the smoothly and sequentially from one step to the next. data producers) but also policymakers (who will make Instead, survey designers often find themselves moving decisions based on the data), members of the research backward and forward among the various steps. 
For community (who will analyze the data), and the staff example, if designers encounter difficulties when of any agencies financing or providing technical assis- drafting a specific module, they may need to reconsid- tance to the survey. Eventually, what should emerge er and modify their original objectives for that mod- from the process is a well-designed set of question- ule. Developing survey questionnaires is an iterative naires for a multitopic household survey. process, and survey designers should expect to go through at least three or four drafts of each module. It Producing Draft Modules is not unusual for the different versions of the drafts to add up to a stack of paper one foot (30 centimeters) The third step in survey design, producing draft mod- high. Each major redraft of a module or questionnaire ules for the household and community questionnaires, should be reviewed by all interested parties, not only is one of the most time-consuming steps in the 43 MARGARET GROSH, PAUL GLEWWE, AND JUAN MUNOZ process. Detailed guidance on this step is provided in must also review all questions and response codes and, the chapters in Parts 2 and 3, so the discussion here if necessary, modify them to reflect local institutions will be general and relatively brief and terminology. For example, the transfers and other Once the objectives for each module are finalized nonlabor income module discussed in Chapter 11 (at least tentatively), survey designers can begin to must explicitly refer to each public transfer program develop detailed draft modules for the household and by name. The consumption module will need even community questionnaires. Survey designers should more work; in particular, the lists of items selected use the draft "prototype" modules introduced by the must closely reflect items consumed in the country. chapters in Parts 2 and 3 (and presented inVolume 3) The agricultural module will need careful attention, as as their starting point. 
As explained in Chapter 2, sur- this module must reflect the country's landholding and vey designers will already have decided on the policy cropping patterns. and analytical objectives of each module. They should For many of the modules, survey designers may now choose the shortest versions of the modules that find it useful to collect some preliminary data using will allow for analysis of the most important of these qualitative techniques, which may help them deter- objectives; any questions not relevant to these objec- mine how best to design these modules to collect tives should be removed. quantitative data. Chapter 25 provides a detailed dis- If the resulting module is still too long, survey cussion on how to collect qualitative data. Such data designers should remove any questions that are need- can be particularly useful in countries where success- ed only for the analysis of the least important of the ful quantitative surveys have never been done for the policy issues. This process should continue until the topic to be studied. module meets the length constraint. In some cases the A final general issue to consider when drafting module may be shorter than expected, in which case modules is the role played by the fieldwork schedule. a policy issue and its accompanying questions can be A prototypical full LSMS survey spreads fieldwork added. The general principle is that the most impor- evenly over a 12-month period, for two reasons. First, tant policy issues should be addressed first, and addi- this makes it possible to study or average out any sea- tional ones should be included only if space allows. sonality effects. Second, and more importantly, surveys This approach is a good start, but much more remains with this fieldwork schedule require a smaller number to be done. of survey field teams than do surveys that compress the For some modules the information and guidance fieldwork into a shorter period of time. 
given in the relevant chapter in Parts 2 and 3 of the book may be incomplete. For example, the chapter may not address certain policy issues that are important in a given country or setting, in which case the designers of that survey will need to develop new modules or submodules. Even in these cases the information provided in the relevant chapter is usually a good base for developing such modules. However, if the designers intend to implement major innovations in their survey, they should seriously consider adding to the survey team a specialist with the relevant experience in both data collection and data analysis.

Once each draft module has been written out in its entirety, the next task is to verify that the design of each module reflects the economic and institutional structures of the country in question. For example, the designers need to check whether common living arrangements are reflected in the definition of the household used in the household roster and in the housing and interhousehold transfer modules. They

This smaller number of teams reduces costs and allows for improved quality control. All of the interviewers can be trained together and thus to a uniform standard; in addition, the cost of training interviewers (which takes about four weeks) will be proportionately cheaper. Each interviewer will complete more interviews and thereby gain more experience. Finally, fewer computers and vehicles will be required.

Despite these advantages of a year-long survey period, many past LSMS surveys have compressed fieldwork into a period of just two or three months. This has often been done when there was pressure on the survey team to collect data for analysis as quickly as possible. In other cases interviewers may have been available for only a short period of time, or the organization funding the survey may have required that the project be completed in a relatively short amount of time. The fieldwork schedule can also be modified to accommodate analysis of certain topics. For example, analysis of some agricultural issues may require interviewers to make two or more visits to each household at different times during the year.

Variations in the fieldwork plan may require changing the wording of some modules. This means that survey designers should ensure that the design of each module in the questionnaire is consistent with the fieldwork plan.

When a survey is conducted over a relatively short period, such as a few weeks or months, careful attention must be given to the wording of questions concerning events that are seasonal in nature. Will school be out of session for a large portion of the survey period? If so, the education module may need to be changed to reflect this. In particular, questions referring to school activities during the previous week, such as the number of days that a child attended school or the number of hours of homework done by the child, would clearly be inapplicable. Also, questions about water supply during both wet and dry seasons should be reviewed to ensure that they reflect the circumstances of these seasons. The largest seasonal changes may need to be made to the agricultural module. A detailed discussion of the implications of seasonality for that module is provided in Chapter 19.

More substantial changes will be required if the household is to be visited more than once at different times of the year. In such cases it may be desirable to have the interviewer administer modules for which the answers are expected to vary by season (such as the consumption, agriculture, water, or time use modules) each time he or she visits the household. In contrast, the modules that are unlikely to be affected by seasonality, such as housing, education, fertility, or migration, probably need to be administered only once. Any modules that are to be administered more than once usually need to be modified, particularly with respect to their recall periods. For example, if the interviewer makes two household visits six months apart, the consumption module should be administered in both visits and should have a recall period of six months rather than one year. Also, the water module should ask only about the particular season (wet or dry) during which the interview is to be conducted.

The guidelines given in this chapter are general, since very detailed information is provided in Parts 2 and 3 of this book. Other information on adapting LSMS questionnaires to fit local circumstances can be found in Oliver (1997), which focuses on survey design in the countries of the former Soviet Union, as well as in Ainsworth and van der Gaag (1988). A good general reference publication for developing and designing household survey questionnaires is United Nations (1985). More recent general references are in Babbie (1990), Fink (1995), and Fowler (1993); although these books focus more on developed countries, much of the material they contain is also relevant for developing countries. A final point to bear in mind is that a good deal of attention must be given to correct and consistent formatting. This is described in great detail in the fourth section of this chapter; survey designers should read that section very carefully before they begin designing any survey modules.

Integrating and Combining Modules to Create Complete Questionnaires

Once draft versions of each of the individual modules have been written, these drafts must be combined to form complete household and community questionnaires. Merely stapling the various modules together will not produce a well-designed questionnaire; much more work has to be done to ensure that the different modules fit well together. This section describes how to do this important task. It focuses primarily on making the modules of the household questionnaire consistent with each other. Similar, though less difficult, issues arise when integrating the modules of the community questionnaire; in most cases the approach to take for the community questionnaire can be inferred from the discussion of the household questionnaire. This section will also highlight particularly important points to consider when combining the household, community, and price questionnaires to form a comprehensive household survey.

Gaps and Overlaps

Survey designers must scrutinize and compare the different questionnaire modules for gaps and overlaps in the information that the modules collect. Analysts often need to combine data from different modules in the household questionnaire. Perhaps the most important example of this is the calculation of each household's total consumption, which requires information not only from the consumption module but also from the education, health, employment, and housing modules, and from the water, sanitation, or fuel modules (see Chapter 14) if they are included as separate modules in the questionnaire (as opposed to using the housing module to collect information on these topics). Likewise, income data are collected in the employment, agriculture, household enterprise, and miscellaneous income modules. It is important to check that a questionnaire includes the data needed to construct these and other complex variables.

Another example of this general issue is that survey designers often have a choice regarding the module in which to collect some kinds of information. For example, data on expenditures on fuel for cooking and heating can be collected in the consumption module, the housing module or, if it exists, the expanded fuel module. Questions on child immunization can be placed in the fertility module, the health module, or the anthropometry module. An argument can be made for choosing any of these options (see the pertinent chapters in Parts 2 and 3 of this book), but the essential point is to ensure that the information is collected at least once, and is collected twice only if there is a reason to do so.1 Appendix 3.1 provides a list of the most common types of gaps and overlaps to check.

In cases in which information could plausibly be collected in more than one part of the questionnaire, there may be no absolute right or wrong place to collect it. Rather, survey planners must take into account who the respondent is in each module, how well the best recall period for that information matches the recall periods of modules in which it might be collected, at what point in the interview the respondent might discuss the topic most naturally, and whether the topic is a sensitive one that should therefore be addressed near the end of the questionnaire (for reasons discussed further below).

The survey designers should also examine any overlaps among the household, community, and price questionnaires. In general, the community and price questionnaires should collect information on any topic that varies only slightly from household to household within the primary sampling unit.2 While much of the information collected in the community and price questionnaires could be collected in the household questionnaire, it is better to collect it in the community questionnaire in order to shorten the length of each household's interview. Collecting this information in the community questionnaire is also more efficient; why collect it for all households in a primary sampling unit (often 16 or 20 households) when it need be asked only once in the community questionnaire?

Some simple examples illustrate this point. The expanded water module contains questions about the price and quality of water from different potential water sources. If the primary sampling units are geographically compact, all of the households in each primary sampling unit are likely to have the same alternative water sources, implying that the water price and quality questions can be put in the community questionnaire (which should be administered in each primary sampling unit) rather than in the household questionnaire. On the other hand, if the primary sampling unit is not compact so that the households are widely dispersed, it is likely that some households will be nearest to, say, a particular spring or well while other households will be closer to other springs or wells. In such cases these questions about alternative water sources should remain in the household questionnaire.

Another example concerns the distance to schools and health facilities. In a compact primary sampling unit, the distance to the nearest school or health facility probably varies little among the households in the primary sampling unit. This means that information on the distances to schools and health facilities can be collected in the community questionnaire as opposed to the household questionnaire.

Length

The overall length of the household questionnaire must be manageable. In general, it is not feasible to include, say, the standard version of each module presented in Volume 3, even though past full LSMS surveys typically included 15 modules, many of which were similar to the standard versions in this book.

There are several reasons why using all of the standard draft modules in this book is not feasible. First, this book introduces several new modules, including the time use module and several environmental modules. Second, some of the standard draft modules, such as those on health, migration, and household enterprises, are much longer than the modules on those topics that were used in previous LSMS surveys. Finally, in some of the chapters in this book (including Chapter 18 on household enterprises and Chapter 19 on agriculture) it is argued that collecting more detailed data will greatly increase their value for analytical purposes. Thus survey designers should not combine the standard versions of all of the modules presented in the book into a single household questionnaire. Instead, the short versions should be used for some modules, and in almost all cases at least one or two modules should be dropped.

Assessing whether a draft questionnaire is too long is not simply a matter of counting the pages or questions in it, since many questions, and sometimes even entire pages or modules, will apply only to some households. Moreover, in some cases adding questions does not lengthen the interview time because the respondent cannot avoid going through the thought process made explicit in these questions, which implies that a supposedly abbreviated set of questions will not reduce the time required to complete the interview. An example of this is the calculation of income derived from agricultural activities.

There are also several ways to implement long questionnaires that minimize the time required by (and the fatigue induced in) each survey respondent. These include conducting individual "mini-interviews" with each household member to collect all of the information needed from that individual at one time (which allows him or her to leave when questions are being asked of other household members); using the best-informed respondent for each household module; and dividing the interview into multiple visits (for example, going through all the individual-specific modules in one visit and returning on a different day to conduct the consumption module and other household-level modules). LSMS surveys use all of these techniques. Still, there is a limit to the amount of information that can be gathered from a single household.

How can survey designers determine whether a household questionnaire is too long? A rough idea of the effective length of the questionnaire in different circumstances can be obtained by calculating how many households will go through the different paths created by the skip patterns and how many questions will be asked for each possible path. An excellent example of this is provided in Chapter 18 on household enterprises, in Table 18.5.

A more precise estimate of the time required to administer a household questionnaire can be obtained when similar surveys have already been done in the country or region studied. In this case, the designers of the new survey will be able to find out how long the interviews took in the previous survey, provided that the earlier survey collected metadata along the lines suggested in Chapter 4. If such information is not available, survey designers will need to rely on the field test, which is discussed in detail in the fourth section of this chapter. If field test interviews require many hours to complete and exhaust the cooperation and patience of households, this is an indication that the questionnaire is too long. At the same time, survey designers should realize that field test interviews normally require much more time than do similar interviews during an actual survey, because interviewers have little training or experience with the questionnaire at the time of the field test. In addition, the questionnaire used in the field test is not a final draft and thus is likely to contain some problems that will slow down the interviews. A handy rule of thumb is that interviews in the actual survey take only about half of the time that they take in the field test, and sometimes even less than that.

A general goal to aim for in the actual survey is that any given respondent should not be interviewed for more than one hour on a given day. Of course, people's tolerance for being interviewed will vary from country to country, and this general guideline must be adapted to suit local conditions. Experience in LSMS surveys to date suggests that people's tolerance for long interviews is lower in urban areas than in rural areas, lower among wealthy households than among poor ones, and lower in wealthier countries than in poorer ones.

Recall Periods

The recall periods proposed for each module introduced in this book are mostly those that the authors have deemed appropriate for that particular module. This can be a problem when analysts want to combine or compare data from several modules. For example, in many LSMS surveys the employment module uses a one-week recall period. Since most adults work, this yields a large number of observations, and the period of time is short enough to yield accurate answers to such basic questions as the number of hours worked and the payments received during this recall period. In contrast, the health module uses a four-week recall period. This relatively long recall period is used because most people are not ill in any given week. The four-week recall period allows more observations of illness for a given sample size than would be obtained using a one-week recall period. Since illnesses are important events, respondents can be expected to remember many details of their episodes of illness during the past four weeks.

However, if an analyst wants to study the impact of illness on earnings or work effort, these different recall periods will complicate the analysis. The analyst cannot tell whether the illness took place before or during the period for which the earnings and hours data were collected. This could be resolved either by adding questions to the health module to specify the days during the recall period on which the respondent was ill or by making the recall periods coincide, perhaps with a compromise of two weeks for both modules (bearing in mind the disadvantages in sector-specific analyses of using a recall period different from the "ideal" one for that module). Part of the job of integrating the draft modules is to determine and judge the tradeoffs being made, either confirming that they are acceptable or altering them until a more appealing tradeoff is reached.

Nomenclature and Coding Schemes

The questionnaire should be reviewed to check that wherever similar questions are asked, the nomenclature and coding schemes are the same. This should reduce coding errors and simplify data analysis. For example, many different modules allow the respondent to choose the time unit (for example, hour, day, week, or month) that they find most convenient when responding to questions regarding time or payments over time (such as wage rates, the length of time spent gathering firewood, and the length of time covered by a payment for water). The code numbers for these time units should be the same throughout the entire questionnaire; in the draft modules presented in this book "day" is always coded as "3," "week" is always coded as "4," and so on for other units of time.

Another example concerns the migration module, the transfers received page of the transfers and other nonlabor income module, and the transfer payments page of the consumption module. All have questions about where the migrant, donor, or recipient lives. The coding scheme that categorizes this information, whether it is the type of place (capital city, other urban area, rural area, or overseas) or the name of the place, should be uniform. Likewise, several modules include questions about the relationship between two individuals. It is usually a good idea for these questions to use the same codes that are used in the household roster module to indicate the relationship of each household member to the head of the household.

A particularly important task is to coordinate the coding of items in the consumption expenditure module with items in the price questionnaire. As explained in Chapter 2, price data are needed to generate regional and temporal price indices that enable comparison of real expenditures of households interviewed in different places and at different times. This is done by matching the prices collected in the price questionnaire with the consumption expenditure information gathered in the consumption module. If the items are not well matched, this task becomes more difficult, and the resulting price indices will be less accurate. In general, the goal should be a one-to-one correspondence between the items listed in the consumption module and the prices collected in the price questionnaire. For example, if questions are asked on two or three varieties of rice or wheat in the consumption module, a price for each variety should be collected in the price questionnaire.

This should be relatively simple to do for almost all food items. Nonfood items are more difficult. It is usually not possible to obtain prices for durable goods because they often come in many varieties (for example, there are many kinds of bicycles or televisions). However, for nondurable items, prices can be obtained for well-defined examples. For example, there are many kinds of shirts, but if a specific widely purchased type of shirt can be defined, data on that type of shirt can be collected in the price questionnaire and used as an indicator of prices for all kinds of shirts. See Chapter 13 for a detailed discussion of the price questionnaire, including a list of suggested food and nonfood items to include in it.

Choosing the Order of the Modules in the Household Questionnaire

A final and very important question to address is the order of the modules in the household questionnaire.3 It is natural and convenient to arrange the modules in the order that they will be administered, so the key issue here is the order in which the modules will be administered and how this affects the physical design of the questionnaire.

To put this issue in context, consider the traditional fieldwork plan for a full LSMS survey. Each field team works in its assigned primary sampling units (communities) twice. The first time a team arrives in a primary sampling unit, it works there for about one week. The first half of the questionnaire, most of which usually consists of the individual-specific modules, is completed for each household. In addition, a short module is administered that asks which household members are best able to answer questions concerning the specific household-level modules (agriculture, household enterprises, consumption, and savings) that will be filled out when the team returns to the primary sampling unit about two weeks later. Figure 3.1 provides an example of such a module.

The field team works in a different primary sampling unit during the following week, while the data in the half-completed questionnaires from the first primary sampling unit are entered into a computer by a data entry operator (who does not travel with the team) using a data entry program. The data entry program checks the first half of the questionnaires for a wide range of errors and inconsistencies. (This is discussed more fully in Grosh and Muñoz 1996.) The team returns to the first primary sampling unit in the third week, administers the rest of the questionnaire (which mainly consists of household-level modules), and resolves any problems or inconsistencies found by the data entry program when the data from the first half of the questionnaire were entered.

In several recent LSMS surveys, two different procedures have been used in the fieldwork stage. One procedure is that the data entry operator travels with the field team. This option has become feasible with the advent of small laptop computers that can be powered by batteries, vehicle cigarette lighters, or solar panels. This allows the whole questionnaire to be filled out and checked using the data entry program during a single trip to the primary sampling unit. In addition, the second half of the questionnaire can be checked by the data entry program almost immediately, so that interviewers can return to the sampled households to resolve any problems detected by the program.

The other procedure, used when a scaled-down LSMS survey is being implemented, is to complete all of the interviews in a single trip to the primary sampling unit and sometimes even in single visits to each household. This procedure will have a serious disadvantage if the data entry operator does not travel with the team, because none of the data can be checked in time to return to the households to resolve problems detected by the data entry program. If the data entry operator travels with the team, there is little difference between this procedure and the former procedure, except that a full LSMS questionnaire will require more visits to each household during the sole trip to the primary sampling unit.

Given these different possible fieldwork plans, there are several basic principles about how to order the modules in the household questionnaire. The first principle is that any modules on topics that respondents might consider sensitive should be put at the end of the questionnaire. This gives the interviewer time to develop a rapport with the household members, which should increase the probability that they will answer questions on sensitive issues, and do so truthfully. It also means that if the respondent breaks off the interview in response to a sensitive question, only the data from that last module or modules are lost. Finally, by this point in the interview, any interested onlookers, such as family members and neighbors, may have wandered away, making it possible to administer the more sensitive portions of the questionnaire with greater privacy. Education, housing, migration, and in some cases health4 are usually good topics with which to open the interview, because people generally do not mind talking about these topics. In contrast, fertility, savings, credit, and transfers and other nonlabor income are among the most sensitive topics in the household questionnaire.

A second principle concerns bounded recall periods. In past LSMS surveys in which the interviewer made two visits two weeks apart to each household, some parts of the questionnaire used bounded recall periods; in other words, questions were asked such as "How much has your household spent on rice since my last visit?" As explained in Chapter 5, using bounded recall periods can increase the accuracy of the respondents' answers. Obviously, if bounded recall periods are used in certain modules, these modules must be administered in a second visit to the household and thus be included in the second half of the questionnaire. The two modules in Volume 3 that explicitly use bounded recall periods are those on consumption (Chapter 5) and household enterprises (Chapter 18).5

A third consideration is the selection of respondents. As explained above, several modules (including the consumption, agriculture, household enterprises, savings, housing, and environmental modules) collect much or all of their data at the household level, which means that the questions are answered by the household member most knowledgeable about that topic. With the exception of the housing module, these modules are quite lengthy. Thus for each of these modules it is usually best for the interviewer to ask which member would be the most appropriate respondent during the first visit to the household, and then make an appointment to interview that person at a later, more convenient time. In the traditional two-visit fieldwork plan, this implies that these modules, except perhaps housing, should be administered during the second visit and thus should be located in the second half of the questionnaire. However, if the team travels only once to the primary sampling unit, it is still feasible to make appointments for later in the day or for another day, which gives survey designers more flexibility in deciding where to place these modules in the questionnaire.

FIGURE 3.1: MODULE FOR CHOOSING RESPONDENTS TO BE INTERVIEWED IN THE SECOND HALF OF THE QUESTIONNAIRE

RESPONDENTS FOR SECOND HALF OF QUESTIONNAIRE
RESPONDENT: THE PERSON BEST INFORMED OF THE ACTIVITIES OF THE HOUSEHOLD MEMBERS
FULL NAME OF THE RESPONDENT; ID CODE

1. Who shops for the food for your household? (NAME; ID CODE)
2. Who in your household knows most about the non-food expenses of the members of your household? (NAME; ID CODE)
3. Who in your household knows most about the miscellaneous income and transfers received from other households? (NAME; ID CODE)
4. Who in your household knows most about the savings in your household? (NAME; ID CODE)
5. During the past 12 months has any member of your household participated in agricultural production, forestry, or raising livestock? YES ... 1; NO ... 2 (>8)
6. Who is the person who knows most about all the agricultural and livestock activities of the members of your household? (NAME; ID CODE)
7. In addition to this person, who else in your household manages plots of land owned or rented in by the household? Who is responsible for plots that are rented out by the household? (NAME; ID CODE, one row per person)
8. Over the past 12 months, has anyone in your household operated any non-agricultural enterprise that produces goods or services (for example, artisan, metalworking, tailoring, repair work, and processing and selling your outputs from your own crops, if done regularly) or has anyone in your household owned a shop or operated a trading business? YES ... 1; NO ... 2 (>NEXT MODULE)
9. What kind of enterprises does your household operate? PROBE TO DETERMINE INDUSTRIAL SECTOR IN WHICH ENTERPRISE OPERATES. (ENTERPRISE ID; FULL WRITTEN DESCRIPTION; CODE, one row per enterprise)
10. Who is most informed about and/or in charge of day-to-day operations of the enterprise? (NAME; ID CODE, one row per enterprise)

A fourth principle relates to the logistics of data entry. The individual-level modules include many more questions for which strict range and consistency checks can be built into the data entry program than do the modules on consumption, agriculture, and household enterprises.6 If the whole questionnaire is completed using the traditional LSMS fieldwork plan (two visits two weeks apart), all the individual-specific modules except the credit module should be administered during the first interview. (The credit module is probably too sensitive to be administered during the first interview.) This will allow the survey team to enter the data from these modules and to detect any apparent errors or inconsistencies that could then be resolved in the second interview. If the data entry operator travels around with the interview team, the data from the interviews can be checked in a matter of hours; thus where these modules appear in the order of the household questionnaire becomes less important for the purposes of data entry.

Given these principles, and some common sense, more specific advice can be given. Each household questionnaire should have the metadata module at the very front, since much of the information that module collects (such as whether the interviewer successfully located the household, the date of interview, and the language in which the interview was conducted) becomes apparent at the very beginning of the interview. The next module should be the household roster; this must be completed before any other module because it determines who is and who is not a household member, and thus determines the people to whom all the other modules will apply. However, at least one page of the household roster, the one with the names of all the household members, is usually placed further back in the questionnaire so that the names on that page can be seen during the administration of all individual-level modules. Thus the physical placement of this page will not reflect the time during the interview when it is filled in. (For details see the discussion on the fold-out roster page in the fourth section of this chapter.)

After the household roster, it is useful to fill out the form on selecting household respondents shown in Figure 3.1; this form can be administered to the same person who answered the household roster questions (usually the head of household or the person most knowledgeable about other household members). It is useful to collect this information early because it can be used to save household members' time by interviewing them sequentially using "mini-interviews." That is, after the interviewer has administered the form that identifies the relevant respondents for the household-level modules, he or she should administer all of the modules that are clearly individual-specific (except the credit and fertility modules) to each household member, finishing all such modules with one member before interviewing another member. These are the education, health, employment, migration, and time use modules. Some household members will not need to be interviewed further, and thus their mini-interviews will consist of the interviewer administering only these modules. In contrast, other household members will also be the respondents for some of the household-level modules. For example, the respondent for the housing module can also provide answers for the questions in that module as part of his or her mini-interview. Using this method, the interviewer can obtain all of the information needed from each individual in a way that minimizes the use of respondents' time; once a respondent finishes the mini-interview he or she can leave or start some activity without further interruption.

Within this group of individual-level modules, those on education and migration should be administered first since the information they collect is not very sensitive. Some employment information can be sensitive, particularly questions concerning wages, so this should be one of the last of the individual-level modules to be completed, if not the very last. If the short health module is used, it can be put near the front. However, if the standard or expanded version is used, it should be placed toward the end of the individual-specific modules because of the sensitive nature of some of the questions in this version (see endnote 4).

Which modules should go near the end of the questionnaire? Because the three most sensitive modules are those that collect information on savings, credit, and transfers and other nonlabor income, these three modules should probably be put at the end of the questionnaire. Another potentially sensitive topic is fertility. In countries in which fertility is particularly sensitive, it should come immediately before the savings, credit, and transfers and other nonlabor income modules.

Where should the other modules go? If the traditional "two visits two weeks apart" interview system is used, the consumption and household enterprise modules should be in the second half of the questionnaire since these modules often use a bounded recall period, namely the time since the interviewer's previous visit. Modules that are long and also need to be administered to specific respondents (the consumption, agriculture, household enterprise, and environment modules) should also be completed in the second visit. Finally, as discussed above, the housing module can be administered in the first interview because it is unlikely to contain any sensitive questions. If the two-visit system is not used, these modules can be put anywhere between the individual-specific modules and the more sensitive modules.

Finally, the goal of saving respondents' time by conducting mini-interviews with each respondent (who can leave after his or her mini-interview is finished) is complicated by the fact that the household enterprise, agriculture, miscellaneous income, and credit modules consist of a mixture of household-level and individual-level questions. For example, in the standard and expanded versions of the agriculture module, individual household members are asked whether they have worked on specific plots of land. However, these questions cannot be asked until several other questions have been asked about the different plots of land owned and rented in by the household members, and such questions would be awkward to ask in a form as simple as the one shown in Figure 3.1.

The best way to resolve this problem will depend on which modules and which versions of these modules are included in the household questionnaire, so it is difficult to provide general advice. However, one way to reduce the time burden on household members is to identify all of the people who still need to be interviewed after the "mini-interviews" are completed, as this will allow people to leave if they do not need to be interviewed further. Continuing the agriculture example, note that the form in Figure 3.1 identifies all of the household members who either manage or work on a plot of land. Household members who do not fit this description and who are not needed to complete any other household-level module can leave after their "mini-interview" is finished.

This completes discussion of the fourth step of integrating the draft modules and combining them into a complete set of questionnaires. The primary focus has been on the household questionnaire, since the community questionnaire is much smaller. (See Chapter 13 for a detailed discussion of the community questionnaire.) Designers of prospective surveys can consult the questionnaires used in previous LSMS surveys by downloading them from the LSMS website, http://worldbank.org/lsms/lsmshome.html.

Translating and Field Testing the Draft Questionnaires

After the draft modules have been combined into a complete set of household, community, and price questionnaires, they need to be translated and field tested.7 The field test is particularly important because it is the last check on the design of the questionnaires before the survey is implemented.

Translation

It may be necessary to translate the questionnaires for three reasons, each of which has different implications for the design of the survey. The most common and most important reason is that respondents may speak a range of different languages. In many countries more than one language is spoken. In these countries quality control requires that a separate questionnaire be produced for each of the major languages spoken in the country, with every question written out verbatim.

Scott and others (1988) demonstrated how this procedure greatly increases the accuracy of the data collected. They conducted an experiment designed to measure interviewer errors when the interviewer had to translate each question during the interview. For example, the interviewer may have had to use a questionnaire written in English to conduct an interview in Tagalog or Cebuano or a questionnaire written in French to conduct an interview in Baoule or Dioula. The interviewers' error rates were two to four times higher when they translated questions during the interview than when they used questionnaires already written in the languages used by the respondents.

While the final versions of the questionnaires must be translated from the national (official) language to produce verbatim questionnaires in the other languages used in the country, the preliminary drafts of the questionnaire can be developed using only the national language. Ideally, the version of the questionnaire to be used for the field test should be translated into each of the languages that will have a final written version of the questionnaire. In practice, field tests are often done using only oral translations of the national language version of the questionnaire. Thus the wording in the local language interviews during the field test may not correspond exactly to the wording that will be used in the written translations of the final questionnaire. While this is an imperfect way to proceed, it is often a reasonable tradeoff given the high costs, in both time and in money, of field testing the questionnaires in each language.

written down in the interviewers' manual. In the case of the least common languages, local interpreters were used when none of the interviewers spoke the language. In this respect, while previous LSMS surveys have conformed to normal survey practice, they have not reached the cutting edge of quality control as defined by the World Fertility Surveys. The guidelines used in those surveys require that questionnaires be prepared in all languages used by more than 10 percent of the sample and that a minimum of 80 percent of the sample be interviewed using questionnaires written in the respondents' native language.

Future LSMS and similar multitopic surveys should make greater efforts to translate the household questionnaire into local languages. When preparing these translations, the questionnaire should always be worded in the way that the language is commonly spoken, using relatively simple terms and avoiding academic or formal language. The gap between the spoken and written languages and the difficulty of striking a balance between simplicity and precision may be greater in local languages, especially ones that are not
commonly used for reading and writing. The transla- After the final version of the household question- tors should therefore be especially careful to try to find naire has been translated into another language, the an appropriate balance. translation needs to be carefully checked.The best way Two examples illustrate the kind of problems that to do this is to use "back translation."That is, after the can occur. The question "&Estuvo enferma en las ulti- questionnaire is translated from the language in which mas cuatro semanas?" literally asks, in Spanish, whether it was developed into the languages in which it will be the respondent was sick in the past four weeks. administered, someone should translate the versions in However, in spoken Spanish in Chile it could be those languages back into the original language. After understood as a polite euphemism for asking whether this "back translation" has been accomplished, the two a woman has had a menstrual period in the last four versions in the first language should be checked. weeks.An even more difficult problem in wording was Where there is a discrepancy in wording or meaning revealed in the field test in Nepal.Apparently the most between the two versions, the translation should be natural Nepali phrasing for "Have you been ill?" is carefully checked.A person or group of people famil- closer to "Have you been to the doctor?"The change iar with the purpose of the questions should do the in meaning from what was intended appeared in the first translation. The back translation should be done field test several times when respondents answered by someone who was not intimately involved in "No, I couldn't afford to go," clearly an inappropriate designing the questionnaire. Any ambiguities and response to the question "Have you been ill?" 
errors must be noted and corrected in the translated The second reason why the questionnaires may version rather than being "fixed" only in the back need to be translated is that sometimes the internation- translation version. al experts working on the survey design team do not Most previous LSMS questionnaires were printed speak the national language well enough to design the only in the national languages of the countries stud- questionnaires in that language. This happened in the ied, so multilingual interviewers had to be employed case of the Vietnam LSMS questionnaires, which were to conduct interviews in the most commonly used developed jointly in English and Vietnamese. In con- local languages. Occasionally a few key questions or trast, the LSMS questionnaires used in Latin American phrases were translated into the local languages and countries have been drafted only in Spanish by teams of 54 CHAPTER 3 DESIGNING MODULES AND ASSEMBLING THEM INTO SURVEY QUESTIONNAIRES local and international experts who are fluent in that households should not be selected at random. Instead, language. When translation is a necessary part of the different types should intentionally be included so that development of the questionnaires, each draft of the all of the various situations likely to be found during questionnaire must be translated, which may require a the survey are observed during the field test. substantial amount of money and can also increase the Experience with LSMS surveys has shown that time needed for designing the questionnaires. 
field tests should be conducted using at least 100 The third and final reason for translating ques- households.To get enough responses for each module tionnaires is to produce a questionnaire in one or of the questionnaire, it may be necessary to visit addi- more of the major international languages (English, tional households to conduct partial interviews in Spanish, or French) in order to encourage the interna- which only those modules that apply to a relatively tional research community to use these data in their small number of households are administered. For policy analysis. Such translations need not be done example, the original 100 households may not include until after the final questionnaire is developed, and enough pregnant women or people who have been ill back translations are not needed. in the month preceding the interview to determine whether the fertility and health modules, respectively, Field Testing are well designed. In such cases survey designers After draft versions of the household, community, and should find additional households that contain preg- price questionnaires have been assembled and (if nec- nant women or ill people and have interviewers essary) translated, they must be tested in the field. The administer only the fertility or health module to those field test is one of the most critical steps of the survey households.8 A field test usually takes about one design process. The goal is to ensure that the ques- month to complete-about one week for interviewer tionnaires are capable of collecting the information training, two to three weeks of fieldwork (interview- that they are intended to collect. A field test should ing), and one or two weeks to discuss the findings and address the adequacy of the draft questionnaires at finalize the questionnaires. More time is required if the three levels: final questionnaires are to be produced in more than * TDie Questionnaire as a Whole. 
Is the full range of one language, because each version of the question- required information collected? Is the information naire should be field tested. collected in different parts of the questionnaire While the full field test should cover 100 or more consistent? Are any variables unintentionally dou- households, much can also be learned from prelimi- ble-counted? nary smaller tests. A general rule of thumb is that * Individual Modules. Does the module collect the about half of the problems will show up in the first 10 intended information? Have all major activities households interviewed. In one recent field test, inter- been accounted for? Are all major living arrange- national experts wrote six pages of comments about a ments, agricultural activities, and sources of in-kind single module after interviews were completed for and cash income accounted for? Are some ques- only three households. Such small-scale preliminary tions missing? Are some questions redundant or field tests are often particularly appropriate for new or irrelevant? difficult modules. Yet survey designers must under- * Individual Questions. Is the wording clear? Do any stand that these are precursors to a full-size field test of questions allow for ambiguous responses? Are there the whole questionnaire, and not a substitute. multiple interpretations? Have all responses been The personnel involved in a field test should anticipated and coded? include the survey design team, a few experienced It is important for a field test to include house- interviewers or field supervisors, and a few of the peo- holds from all major socioeconomic groups. For ple consulted by the survey design team, including example, a sample should include: rural and urban both policymakers and research analysts. 
It may also be households; individuals employed in the formal sector, helpful to include people with experience working on in the informal sector, and in agriculture; and farmers past LSMS or similar multitopic surveys.All of the par- in each main agroecological region, in each produc- ticipants should divide into a small number of teams, tion scheme (independent farming, renting, sharecrop- each of which includes at least one person with each ping, and cooperative farming), and so forth. The kind of expertise. 55 MARGARET GROSH, PAUL GLEWWE,AND JUAN MUNOZ There should only be a few teams involved in the poses, because in most cases the sample is both non- field test, usually around three or four. Mechanisms random and very small. However, the questionnaires should be set up to enable the teams to contact each from the field test can be used to check the perform- other during the field test so that they can compare ance of the data entry program. notes on the problems they encounter and the solu- The personal participation of all senior staff tions they have tried. A good way to set up such (including analysts) is fundamental for both the field mechanisms is to have all of the teams working test and its evaluation. The following anecdote illus- together for the first few days, perhaps in the capital trates this point. In one country, before the field test a city.This means that the teams will be in contact with manager in the statistics office asserted that collecting each other every evening during the period when the information on family assets would be impossible first and often biggest flaws in the draft questionnaire because respondents would fear that the information are uncovered. In some cases the team members can would be used for tax purposes. The module was agree on modifications to the questionnaire during included in the field test, and no unusual difficulties the field test itself, which allows these modifications to were encountered. 
But the manager who opposed the be field tested. module did not witness the field test, and some of Each interview during the field test should those who did participate in the field test did not par- include, at minimum, the respondent, the interviewer, ticipate in the module's evaluation. Despite the suc- and an analyst or senior survey specialist. During the cessful field experience, the module was removed from field test it is acceptable for the analyst or survey spe- the questionnaire, largely because key decisionmakers cialist to interrupt the interview tactfully in order to did not fully participate in the survey design process. refine the wording of a question or the responses coded Many small changes are generally made to ques- for it. Of course, in the actual survey the interviews tionnaires as a result of field testing, including changes should be conducted in private, and the interviewers in the wording of some questions, in questionnaire should adhere to the wording of the questionnaire. format, and in answer codes. If either the question- The interviewers used in the field test should be naire's structure or the way in which certain variables drawn from the experienced staff of the statistical are measured is changed substantially, all of the parts of agency. They should be good interviewers-familiar the questionnaire that have been so modified must be with basic interviewing practices and able to distin- tested again. This can delay the survey, but one way to guish between problems caused by deficiencies in the reduce the probability of such a delay is to begin the questionnaire and problems caused by their lack of field test with two or more versions of the most diffi- familiarity with the questionnaire. The interviewers' cult, contentious, or important modules in the ques- training should focus on the purpose of the survey and tionnaire. If one version clearly works the best, there is the structure and format of the questionnaire. 
One no need to do another field test because that version week of training is usually sufficient, followed by two has already been field tested. or three weeks of household interviews. Ideally, the household, community, and price Survey planners should set aside 1-2 weeks imme- questionnaires should all be field tested at the same diately after the field test to review the field test results time.This allows the survey design team to evaluate all and debate how to modify the questionnaire in light of of the questionnaires together, taking into account the those results.The group involved in the field test should possibility that changes in one questionnaire may have go through the questionnaires, module by module, and implications for the design of the others. Simultaneous discuss any problems that arose. At this stage, the team testing of the three questionnaires can also reduce should bear in mind that the length of time required travel costs since, like the household questionnaires, for each interview will fall dramatically when the the community and price questionnaires should be interviewers are well trained and have become familiar tested in a variety of locations. with the questionnaire. As mentioned above, the typi- Regrettably, in several past LSMS surveys the sur- cal field test interview will be at least twice as long as vey teams neglected to field test the community and the average interview in the actual survey. price questionnaires, concentrating solely on the house- The data from the field test should not be entered hold questionnaire. The community and price modules in the computer or examined for any analytical pur- were tested late and haphazardly or, in some cases, not 56 CHAPTER 3 DESIGNING MODULES AND ASSEMBLING THEM INTO SURVEY QUESTIONNAIRES tested at all. 
It is probably not coincidental that the users chosen should be the one that is clearest and most of the data from many previous LSMS surveys have likely to minimize the possibility of errors. The draft often had more complaints about the community and questionnaires presented in this book follow the for- price data than about the household data. If there is not matting conventions explained in this section, which enough staff time to test all three questionnaires at once, have been used frequently in past LSMS surveys, with it is important to ensure that separate, rigorous field tests successful results. are done of the community and price questionnaires. Questionnaire format is important because a good The health and education modules discussed in format minimizes potential interviewer and data entry this chapter often include detailed facility question- errors, which improves the accuracy of the data and naires (in other words, school or health clinic ques- reduces the time needed to check the data before tionnaires), which can be very complex (see Chapters making them available to data analysts.The objectives 7 and 8 for details). It is essential to field test these facil- underlying a given survey can occasionally have impli- ity questionnaires. During the field test the survey team cations for formatting, so some aspects of formatting should be sure to visit each type of facility covered by will vary from country to country. Even so, almost all the facility questionnaire. For example, field testing a of what has been learned about questionnaire format health facility questionnaire should involve visits to in previous LSMS surveys will be applicable to new public health posts, public clinics, private doctors' surveys. Thus the formatting guidelines presented in offices, public hospitals, and private hospitals in both this section are recommended for all LSMS and simi- urban and rural areas. 
Similarly, field testing a school lar multitopic surveys, and for other surveys as well. questionnaire should involve visits to public and private schools, primary and secondary schools, and schools in Identifiers urban and rural areas. Since field testing a facility ques- Every person or object for which data are collected in tionnaire is a major undertaking in its own right, it is a survey must be uniquely identified. This usually probably best to conduct such a field test separately requires two or three separate codes. The first code from the field tests of the other questionnaires. identifies the household. The second code identifies the person or object of interest, such as an individual Rules for Formatting Survey Questionnaires household member, a household business, or a plot of land. Sometimes there is a third code, which applies, The formatting of survey questionnaires is not a sepa- for example, to all children ever born to each woman rate step in the overall survey design process. Rather, in the household or to the assets of each business oper- it influences how the third, fourth, and fifth steps are ated by the household. carried out. Good questionnaire formatting can make Whenever possible, the identification codes for a tremendous difference in the quality of the data col- the second or third levels of observation should be lected. This section discusses formatting in detail, mak- preprinted on the questionnaire pages to which they ing very specific recommendations about how ques- pertain. For example, the individual identification tionnaires should be formatted.9 code for each household member should be printed There is, of course, more than one way to format on all pages that collect data on individual household household survey questionnaires. 
Most of the benefits members.This ensures that the codes cannot be omit- of good formatting come from selecting a formatting ted and avoids any errors that would occur if the inter- convention and following that convention consistent- viewer were to write down the wrong codes. An ly, rather than choosing the "best" convention from example of these codes appears in the far left column among several possible options. For example, in LSMS of Figure 3.2, which presents the short version of the questionnaires uppercase and lowercase letters are used education module. to distinguish words spoken aloud during the inter- The importance of adequate identifiers is so obvi- view from instructions to the interviewer, but this ous that it is hard to believe mistakes can be made, but could be done in other ways, such as using different they can. In one health survey the questionnaire con- colors or different fonts. Once a convention is select- sisted of two sheets of paper stapled together. One ed, it is extremely important to use it consistently contained information on the household, while the throughout the whole questionnaire. The convention other contained information on individuals. In order 57 co FIGURE 3.2: ILLUSTRATION OF INDIVIDUAL IDENTIFICATION AND SKIP CODES (EDUCATION MODULE SHORT VERSION) 3 C) 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. Have you Are you What is the What is Were In what What is the Is the How much has your household spent on your education in the last 12 Have you How many O ever currently highest the you grade are highest school you months for: ever times have attended enrolled grade you highest enrolled you diploma are repeated you repeated I school? in have diploma in school currently you have currently a grade of a grade of c D school? completed you have during enrolled in attained so enrolled in school? school? in school? attained? the past school? far? public or r o 12 private? O months? D w)NEXT E PERSON z PUBLIC. . 
1 >C PUT PETT ~~~~PRIVATE z CODES CODES SECU- A. Tuition B. Parent C. Uni- D. Text- E. Other F. Meals, G. Other a FOR PU U FOR LA.... 2 and other Associ- forms books? educational transpor- expenses oZ YS. . .1 DIFFER- CODES YES O . 1 CODES DIFFER- PRIVATE required ation and materials tation (extra YES. .1 N NO-... 2 YES.. f KW I 9 k 7ORFOR k-W RELI- fees? fees? other (exercise and/or classes. NO. ...2 NMER OF (-NEX (-6 GRADS DXL No .. 2DIPLOMAS GRADES GIOUS. .3 clothing? books, odging? optional (-NEXT REPEATED PERSON) NO ... 2 _______ R (.10) B_____ _____ ________I_______ pens, etc.)? ____fees)? PERSON) GRADES 4- 6 100 mA -== = 11 8 5 -ll l 9 5 -l l 11= CHAPTER 3 DESIGNING MODULES AND ASSEMBLING THEM INTO SURVEY QUESTIONNAIRES to facilitate data entry, the two pages of the question- Exceptionally large households sometimes have so naire were separated. Unfortunately, the household many members that there are not enough lines in the identifier was not put on the page for individuals, grids for all household members. In these cases a sec- making it impossible to link the two parts of the sur- ond copy of the household questionnaire will be vey with each other after the data were entered. required, and care must be taken to ensure that the right household and individual numbers are used. As Questionnaire Layout explained in Chapter 4, a coding scheme is needed to The LSMS questionnaires are designed so that only one distinguish between the first and second copies of the copy of the questionnaire is needed for each household. questionnaire filled out for large households. 
For In contrast, some surveys use one household question- example, the individual numbers in the second copy naire and a separate set of individual questionnaires.This should be changed to start with 13 instead of 1 requires that household identification codes be copied (assuming that the first questionnaire has room for 12 perfectly onto all of the individual questionnaires.While household members).This is a reasonable approach for perfection is always sought, it is rarely achieved, and sep- large households, but it also introduces a potential arate questionnaires create the risk of improper match- source of error; survey designers should set the format ing. This is illustrated in the case of the Russian of the grids to accommodate as many individuals as is Longitudinal Monitoring Survey. Although care was practical. Previous LSMS questionnaires have typically taken to ensure accurate coding and matching, many had space for 12-15 individuals. errors were introduced. For the first round of the sur- In cases where the unit of analysis is such that vey, which was held in the summer of 1992, there were there is only one observation per household (for 3 percent fewer individual questionnaires than had been example, one dwelling per household), the questions expected given the number of household members pertaining to that unit can be arranged in a single col- identified in the household questionnaires. By the sum- umn down the page. One problem with a single col- mer of 1993, in the third round of the survey, this dis- umn of questions is that much of the page is left blank. crepancy had grown to about 9.5 percent. To save paper, two or more columns may be put on Putting all of the information into a single house- one page, as long as it is clear that there is no hori- hold questionnaire implies the need for a grid of some zontal relationship among the questions in the differ- kind whenever there are two or more of a particular ent columns. 
An example of this format is provided in unit of analysis in a household. For example, a house- Figure 3.3, which shows the first page of Part C of the hold often includes several people, may have several standard housing module. plots of land, and may grow several different crops.The grid typically used in LSMS surveys has questions Fold-Out Roster Page arranged across the top and units of observation (peo- The household roster page of the household ques- ple, plots, or crops) down the side; in other words, each tionnaire is printed so that it extends to the left of the question is a column and each unit of observation is a pages that pertain to individuals in the household. row. An example of this is shown in Figure 3.2; note Most importantly, the names of each individual mem- that the identification codes for the units of observa- ber of the household on the roster page are visible tion (household members) are printed on the left side when filling out the other individual-specific pages of of the grid page. the household questionnaire. This has been done four Sometimes the interviewer must fill in the code in different ways in LSMS surveys, as illustrated by Figure the first column, as in Question 2 of Figure 3.8 (which 3.4. is discussed below), but this practice should be mini- In the method shown in Format 1, the sheets in mized to reduce the possibility of introducing errors front of the roster are shorter than the cover, the ros- when writing down such codes. In the grids for indi- ter, and the sheets that follow the roster. The most viduals, the lines can be differentiated by alternating common method is shown in Format 2. The roster shaded and unshaded blocks (as in the draft modules sheet is folded out to extend beyond the body of the inVolume 3 of this book) or by using a different color questionnaire and its covers. In Formats 1 and 2 the for each row or block of rows. 
This helps an inter- roster page is placed behind all of the pages that per- viewer record the information on the correct line. tain to individuals, so that the names on the household 59 g% FIGURE 3.3: ILLUSTRATION OF PRECODING (PART C OF HOUSING MODULE) C) >~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 7. Do you have legal title to the dwelling or any document that shows ownership? C 1. Is this dwelling owned by a member of your household? YES ......................hi? 1 _ NO ............................2 YES ..1......... II C NO .2 (>11) 8. What type of title is it? - 2. How did your household obtain this dwelling? FULL LEGAL TITLE, REGISTERED . .1 LEGAL TITLE, UNREGISTERED... 2> PURCHASE RECEIPT. 3 Z PRIVATIZED ............................. 1 OTHER..4 PURCHASED FROM A PRIVATE PERSON ........2 > NEWLY BUILT ...............3 Z COOPERATIVE ARRANGENT................ 9. Which person holds the title or document to this dwelling? SWAPPED ............................. 5 (>6) 0 INHERITED .................... 6 (>>6) WRITE ID CODE OF THIS PERSON FROM THE ROSTER N OTHER ............................. 7 (>>6) 1ST ID CODE: 3. How much did you pay for the unit ? 2ND ID CODE: 4. If you make installment payments for your dwelling, what is the amount of 10. Could you sell this dwelling if you wanted to? the installment? YES ...1.. . ....... I WRITE ZERO IF THE HOUSEHOLD DOES NOT MAKE NO .2 (-13)I INSTALLMENT PAYMENTS l I AMOUNT (UNITS OF CURRENCY) L. I 11. If you sold this dwelling today how much would you receive for it? TIME UNIT T AMOUNT (UNITS OF CURRENCY) 5. In what year do you expect to make your last installment payment? 12. Estimate, please, the amount of money you could receive as rent if you YEAR let this dwelling to another person? 6. Do you have legal title to the land or any document that shows AMOUNT (UNITS OF CURRENCY) ownership? TIME UNIT YES ...........1 l -> QUESTION 28 NO ................ 
13. Do you rent this dwelling for goods, services or cash? YES ..1; NO ..2 (>>26)

TIME UNITS: DAY ..3; WEEK ..4; FORTNIGHT ..5; MONTH ..6; QUARTER ..7; HALF YEAR ..8; YEAR ..9

FIGURE 3.4: ROSTER ARRANGEMENTS
[Diagram of four roster formats (Formats 1 to 4) drawn on legal-size (14" x 8.5"), letter-size (8.5" x 11"), or ISO A4 pages, showing where the Household Roster page sits and how it folds out. Notes printed on the figure: In all formats, choose binding to make the questionnaire open flat. ID codes appear on the roster and on each individual page. Lines on the roster must be aligned with the page in the questionnaire. A further note, only partly legible, states that the household number must appear on the household roster.]

MARGARET GROSH, PAUL GLEWWE, AND JUAN MUNOZ
CHAPTER 3 DESIGNING MODULES AND ASSEMBLING THEM INTO SURVEY QUESTIONNAIRES

roster page are visible whenever individual questions are asked.

An innovation in the Kagera Health and Development Survey in Tanzania was to make the roster page a removable card, as shown in Format 3. This was useful because the survey was designed to be administered four times (every six months for two years) to the same households. The roster card was inserted into a pocket in the back of the questionnaire in the first round of the survey. When the second round started, the roster card was removed from the first questionnaire and placed in the back pocket of the second questionnaire. In this way, individuals retained the same identification codes in each round. A few follow-up questions guaranteed that individuals who moved in or out of the household or were born or died between rounds were counted appropriately. In four rounds of interviews conducted over two years for 800 households, none of the roster cards was lost. However, this success may reflect the intensive supervision carried out by the organizers of that survey, as well as the relatively small sample size. This option should probably not be used in situations with significant quality control problems.

Format 4 was used in the Tunisia questionnaire. In this format each page is oriented as "portrait" (a vertical page) rather than as "landscape" (a horizontal page) and is spiral-bound so that it opens flat. Each questionnaire page then consists of the full 11 x 17 inches of the two-page spread. The roster folds out to the left. In all four cases the line for each individual member of the household on the roster page is aligned with the corresponding lines on the other individual-specific pages of the household questionnaire.

A final point regarding the fold-out roster page is that it may be useful to have more than one such page per questionnaire. A fold-out roster will be useful whenever there are several pages of questions for the same level of analysis and especially when there are many rows on the grid. For example, in the agricultural module one might make rosters for crops grown or for plots of land. A fold-out roster page would be particularly helpful for the household enterprise module.

Precoding

All of the potential responses to almost all of the questions in the questionnaire should be given code numbers so that the interviewer records only code numbers, as opposed to words or phrases, on the questionnaire. In most cases these response codes should be printed directly in the box where the question appears, or next to the question if there is no box around it. Where the list of codes is lengthy and applies to several questions, it should be placed in a special box on the border of each page for which it is needed. Alternatively, if a list is very long it can be printed on the back of the preceding page (making it visible when the interviewer fills out the page in question). An example of a box on a border of a page is the time unit box shown at the bottom of Figure 3.3.

In past LSMS surveys fewer than a dozen questions on the household questionnaire have required the interviewer to write down words or phrases that are given codes, usually by someone else, after the interview. Precoding allows the data to be entered into the computer straight from the completed questionnaire, thus eliminating the time-consuming and error-prone step of transcribing codes onto data entry sheets.

Precoding requires that response codes be clear, simple, and mutually exclusive, that they exhaust all likely answers, that respondents will not all provide the same response, and that none of the codes applies to only a handful of respondents. Designing adequate response codes requires extensive knowledge of the phenomenon being studied as well as careful field testing. A standard technique to ensure that the codes are mutually exclusive is to add a qualifier where more than one answer could apply, asking, for example, "What was the main reason for dropping out of school?" Other standard qualifiers are "What was the first (or last, or principal) reason for ... ?" Alternatively, spaces can be provided for multiple responses, with an instruction to code all responses (up to, say, the two or three most important) that apply.

A standard technique to ensure that codes encompass all possible answers is to add an "other (specify ___)" code to questions for which an explicit enumeration of each possible response is impossible or inconvenient. In past LSMS surveys the detailed answers were almost never coded, so analysts usually put all "other" responses into a single residual category. One way to increase the probability that the information recorded in the "other (specify ___)" answers will be used at a later date is to enter it (as text) into the computer, without assigning any codes to the responses. This allows analysts to code any answers that were not precoded in the data released to the public. It also allows the designers of subsequent surveys in the same country to review the answers that were written in (especially in cases in which a significant percentage of the responses were coded "other") and to modify their coding lists accordingly. In particular, if most of the "other" responses fall into a single, well-defined category, this category should have its own code in any subsequent survey.

There is, of course, a limit to the kind of material that can be covered even by well-designed, precoded questions. But this limit may be less of a disadvantage than it first appears. Because most analyses of LSMS surveys use sophisticated quantitative techniques, it is difficult for these analysts to make use of the exploratory, qualitative information gathered in open-ended questions. So even if such questions were asked, the answers to these questions would not be used much in analysis. If it is clear that some analysts do need extensive information of an exploratory, qualitative nature, the designers of a prospective survey may wish to adopt a different data collection instrument or even a new research technique. See Chapter 25 for a thorough discussion of qualitative data collection alternatives.

Verbatim Questions with Simple Answers

All questions in LSMS surveys are written out in their entirety and are meant to be read out verbatim by the interviewer. This is done to ensure that questions are asked in a uniform way, since different wordings may elicit different responses. For example, the answers that a respondent gives to "Can you read?" and to "Can you read a newspaper or magazine?" will probably be somewhat different. Other changes may subtly alter the time period referred to, as in the change from "Have you worked since you were married?" to "Did you work after you were married?" Scott and others (1988) discuss some rigorous field experiments that compared such verbatim questionnaires with questionnaires in which the topic was given for each question but the exact wording was not. When the questionnaire that did not contain the exact wording was used, 7 to 20 times more errors occurred than when the verbatim questionnaire was used.

When choosing the wording of questions, it is important to use terms that reflect the language as it is commonly spoken. Using language that is too formal or academic will make the interview stilted and unnatural. For example, "Did you spend any time doing housework?" followed, if necessary, by "...such as cooking, mending, doing laundry, or cleaning?" is better than "Did you spend any time engaged in domestic labor, for example, preparing food, repairing clothes, cleaning clothes, or cleaning house?" It is not always easy to find terms that are simple, short, and yet precise, but that should always be the goal.

In most cases the interviewer reads the question aloud and marks the questionnaire with the code for the answer given by the respondent. For example, for the question, "Are you currently enrolled in school?" the interviewer writes down a 1 for "yes" or a 2 for "no." For some questions the response categories are part of the question; for example, "Is the school you are currently enrolled in public or private?" There may also be a few questions for which the wording of respondents' answers may vary even though the meaning is the same. The best thing to do in such cases is to have the interviewer read out all of the response categories. For example, in Question 4 of Figure 3.5, after reading "Compared to your health one year ago, would you say that your health is:" the interviewer should read the responses "much better now," "somewhat better now," "about the same," "somewhat worse," and "much worse." If necessary, the interviewer can explain the differences between the various response categories. However, the reading out of response categories should be used as little as possible, because respondents may not listen to the full list before answering, which can lead to errors.

The answers to the questions must be kept simple. This means that additional filter questions are often needed. Adding enough filter questions to ensure simple answers can make the number of questions and skips seem high. Many survey designers are tempted to shorten the questionnaire or simplify the skip pattern in a way that results in complex questions and answers. This should be avoided since it will confuse some respondents and is unlikely to save time.

Survey designers yielded to this temptation in the agricultural module of the 1987-88 Ghana LSMS survey. In that module the following question was asked: "Do you or the members of your household have the right to sell all or part of their land to someone else if they wish?" The precoded answers (which were not read out to the respondents) were "Yes," "No," "Only after consulting family members who are not household members," and "Only after consulting the chief or the village elders." It is not clear whether the respondents could distinguish between the simple yes answer and the yes answer qualified by the need for consultation. Thus a different formulation might have been better. The question could have been left as is, but with only simple "yes" and "no" codes. Then the interviewer could have put a second question to those who answered "yes," worded as follows: "Do you need to consult with anyone outside the household before selling the land?" The response codes would be "Yes" and "No." Then a third question would be put to those who answered "yes" to the second: "Whom must you consult?" The response codes for this question would be "family member," "village elders," and other appropriate categories. This formulation would have made the questionnaire longer in terms of the number of questions but would probably not have increased the interview time, since some sort of probing probably occurred in the Ghana LSMS when the "yes" answer was given. More importantly, keeping questions and answers simple makes the interpretation of the data much clearer.

FIGURE 3.5: ILLUSTRATION OF CASE CONVENTIONS (HEALTH MODULE STANDARD VERSION)
[Questionnaire grid with ten columns and one line per person. Column contents, with response codes and skip codes as far as they are legible:
1. IS THIS PERSON ANSWERING FOR HIMSELF/HERSELF? YES ..1 (>>3); NO ..2
2. COPY THE ID CODE OF THE RESPONDENT FROM THE HOUSEHOLD ROSTER. (ID CODE)
3. During the last four weeks, how many days of your primary daily activities did you miss due to poor health? (DAYS)
4. Compared with your health one year ago, would you say that your health is: [READ OUT ANSWERS TO RESPONDENT] Much better now ..1; Somewhat better now ..2; About the same ..3; Somewhat worse ..4; Much worse ..5
5. CHECK THE AGE IN YEARS OF THIS PERSON. 0-6 ..1; 7-14 ..2 (>>NEXT SECTION); 15-39 ..3 (skip target illegible); 40 AND OVER ..4 (>>11)
6. Did [NAME] experience diarrhea in the last 7 days? YES ..1; NO ..2 (>>11)
7. Was it mixed with blood? YES ..1; NO ..2
8. Was it mixed with mucus? YES ..1; NO ..2
9. Was it a pale liquid? YES ..1; NO ..2
10. How did you treat it? (>>NEXT SECTION, printed in a separate box) REDUCED FOOD OR LIQUID GIVEN TO CHILD ..1; GAVE SPECIAL FOODS TO CHILD ..2; ORAL REHYDRATION THERAPY ..3; OTHER (SPECIFY) ..4; NO TREATMENT ..5; with spaces for 1ST, 2ND, and 3RD responses.]

Skip Codes

Skip codes are used extensively in LSMS questionnaires. Skip codes tell the interviewer which question to proceed to after finishing the current question. Some skip codes apply only when a particular answer is given. In such cases an arrow and the number of the question to skip to are positioned in parentheses next to or below the individual response to which the code applies. An example is given in Question 2 of Figure 3.2. If the answer to Question 2 is "yes," the interviewer should skip Questions 3, 4, and 5 and proceed to Question 6. If the answer to Question 2 is "no," the interviewer should proceed to Question 3. In Question 1 a similar construction is used, but when the answer is "no" the interviewer is instructed to skip all the remaining questions in the module for this respondent and proceed to interview the next person.

Another kind of skip instruction applies regardless of the response given to the question. When an arrow and a question number or instruction are placed in a box separate from the response codes, the skip instruction contained in the box applies regardless of what answer is given. An example of this is given in Question 10 of Figure 3.5.

There are several advantages to extensive, explicit skip codes. Interviewers do not have to make decisions themselves, nor do they need to remember complicated rules printed in the manual rather than on the questionnaire. This helps ensure that instructions will be followed uniformly. Well-placed skip codes ensure that inapplicable questions are not asked. (Asking inapplicable questions irritates respondents, wastes interview time, and confuses data analysis.) Finally, explicit skip codes mean that a "not applicable" code is almost never used in LSMS questionnaires.

One way to check skip codes is to develop a flow chart of the questions in each module. Flow charts are useful both for checking the logic of the questionnaire and for training interviewers. Figure 3.6 presents a flow chart of a typical health module used in past LSMS surveys (which differs in several notable ways from the health module presented in Volume 3). The proportions of people who answer yes at each branch are recorded based on results from several previous LSMS surveys. The numbers of individuals that would be asked each set of questions are shown on the left, assuming a base of 10,000 individuals in the sample. The flow chart makes it easy to check whether the skip patterns lead people through the module correctly. For example, it is possible to check that the question on health insurance is asked of all household members, not just of those who are ill. Analyzing the whole household in this way gives survey designers a better sense of the likely length of time it will take to complete each interview than does the number of pages or number of questions in the questionnaire, because many questions will be skipped for many individuals. (For further discussion of the length of the questionnaire see the second section of this chapter.)

FIGURE 3.6: FLOW CHART OF HEALTH MODULE USED IN PREVIOUS LSMS SURVEYS
[Flow chart. The number of individuals asked each set of questions, out of a base of 10,000, is given before the questions; the proportion answering yes is given after each branching question. The NO branches bypass the dependent questions.
10,000: 1 Were you ill or injured in the last 4 weeks? YES (10-45%)
1,000-4,500: 2 How many days in the last 4 weeks did you have to stop doing your usual activities? 3 Was anyone consulted? YES (40-80%)
400-3,600: 4 Who was consulted? 5 Where did you go for that consultation? 6 What was the cost of the consultation? 7 What means of travel did you use? 8 How long did it take to get to the place of consultation? 9 How much did you spend on travel costs? 10 How long did you have to wait? 11 Did you have to stay overnight at the clinic or hospital? YES (5-8%)
20-288: 12 How many nights did you stay? 13 How much did you have to pay?
1,000-4,500: 14 Did you buy any medicines for this illness or injury? YES (60-90%)
600-4,050: 15 How much did you spend on medicines?
10,000: 16 Do you have health insurance? Then NEXT PERSON.]

Case Conventions

Everything that the interviewer should read aloud should be written in lowercase letters. Instructions to the interviewer should always be written in uppercase letters.10 Answer codes should also be written in uppercase, unless they are to be read aloud to the respondent. This makes it easy to include instructions on the questionnaire as opposed to relying on the interviewers' memory of the manual or of instructions that they were given during their training. In Figure 3.5 instructions to the interviewer are printed in Questions 1, 2, 4, and 5. These are in uppercase, as are the answer codes in Questions 1 and 5. (The answer codes in Question 4 are in lowercase because they are to be read aloud to the respondent.)

Enumeration of Lists

There are two methods of gathering information about long lists of items. A typical LSMS questionnaire may use either method depending on particular circumstances.

Consider the case in which one expects that a large proportion of the items on the list will apply to most households. For each item on this list a line is put in the grid and the name and code number of the item is printed on the questionnaire. This approach is used in the consumption module, as shown in Figure 3.7. Although several dozen items are included, it is expected that most households will have consumed many of them. The first question is "Has your household consumed [FOOD] during the past 12 months?" The interviewer first goes down the whole list asking this "yes or no" question. Then the interviewer returns to the first item that was consumed and asks all the follow-up questions for that item before proceeding to the next item. The complete enumeration of items consumed is done before asking the follow-up questions so that respondents will not be tempted to say that they have not consumed something in order to shorten the interview by avoiding the follow-up questions. This temptation is prevented because the enumeration is done before the respondent finds out that there will be follow-up questions on each item enumerated.

FIGURE 3.7: ILLUSTRATION OF CLOSE-ENDED LIST (PART B OF CONSUMPTION MODULE)
[Questionnaire grid printed over two pages, with one precoded line per food item. Column groups: PURCHASES SINCE LAST VISIT; PURCHASES, TYPICAL MONTH; HOME PRODUCTION; GIFTS; UNIT CODES. Several columns carry the instruction IF NONE, WRITE ZERO.
1. In the following questions, I want to ask about all purchases made for your household, regardless of which person made them. Has your household consumed [FOOD] during the past 12 months? Please exclude from your answer any [ITEM] purchased for processing or resale in a household enterprise. PUT A CHECK IN THE APPROPRIATE BOX FOR EACH FOOD ITEM. IF THE ANSWER TO Q.1 IS YES, ASK Q.2-13. (NO / YES)
2. Have the members of your household bought any [FOOD] since my last visit, that is since [DAY/DATE]? YES ..1; NO ..2 (>>5)
3. How much did you pay in total? (CURRENCY)
4. How much did you buy? (AMT, UNIT)
5. How many months in the past 12 months did your household purchase [FOOD]? (MONTHS)
6. How much do you usually spend on [FOOD] in one of the months that you purchase [FOOD]? (CURRENCY)
7. How many months in the past 12 months did your household consume [FOOD] that you grew or produced at home? (MONTHS)
8. How much did you consume in a typical month? (AMT, UNIT)
9. What was the value of the [FOOD] you consumed in a typical month from your own production? (CURRENCY)
10. What is the total value of the [FOOD] consumed that you received as a gift over the past 12 months? (CURRENCY)
UNIT CODES (USE CODES WITH STAR WHENEVER POSSIBLE): KILO* ..1; GRAM* ..2; POUND* ..3; OUNCE* ..4; LITER* ..5; CUP* ..6; PINT* ..7; QUART* ..8; GALLON* ..9; BUNCH ..10; PECK ..11; BUSHEL ..12; TIN ..13; PIECES ..14; DOZENS ..15; BOTTLES ..16
FOOD ITEMS AND CODES: Wheat (grain) 1; Wheat (flour or maida) 2; Maize (flour or grain) 3; Jawar/Bajra 4; Fine rice (basmati) 5; Coarse rice 6; Other grains/cereals 7; Gram 8; Dal 9; Groundnuts 10; Liquid vegetable oils (dalda) 11; Ghee, Desi ghee 12; Fresh milk 13; Yogurt and Lassi 14; Milk Powder 15; Baby Formula 16; Sugar (refined) 17]

A second approach is useful when it is expected that only a few of many possible items will pertain to any one household. Consider Figure 3.8. The large grid on the right contains lines for several durable goods owned by the household, but these are not precoded. Rather, the respondent is asked, using the small grid on the left, whether the household owns certain durable goods. In this example 12 durable goods are considered, but in some cases 20-30 goods have been listed. Most households own only a few durable goods. For all durable goods owned by the household, the interviewer lists the name and the code number in the large grid to the right in Figure 3.8, and asks a series of questions about each good. If the household owns two or more of the same durable good, one line is filled out for each good owned.

FIGURE 3.8: ILLUSTRATION OF OPEN-ENDED LIST (PART E OF CONSUMPTION MODULE)
[Questionnaire with a small precoded grid on the left (Question 1) and a large open grid on the right (Questions 2 to 7). Column headings:
1. Does your household own any of the ... [remainder of the question not legible]
2. LIST ALL THE ITEMS OWNED BY THE HOUSEHOLD, THEN PROCEED TO ASK Q.3-7.
3. How many years ago did you acquire this [ITEM]?
4. Did you purchase it or receive it as a gift or payment for services?
5. How much did you pay for it?
6. How much was it worth when you received it?
7. If you wanted to sell this [ITEM] today, how much would you receive?]

Probe Questions

There are some kinds of information that respondents may accidentally not provide. In such cases the questionnaire includes instructions to the interviewer to ask further "probing" questions on the subject. An example of this is Question 9 of Figure 3.1. Suggested probing questions are usually included in the interviewers' manual and occasionally included in the questionnaire itself. Probe questions are often used to ensure that all items in a respondent-determined list have been reported to the interviewer, or to ensure that the respondent's answer is properly classified by the interviewer. Interviewers are also asked to probe for answers to questions that ask "how much ... ?" (This kind of question is commonly found in the consumption, agriculture, and household enterprise modules.) Interviewers should be thoroughly trained to ensure that they fully understand what information to probe for, and how to do so.

Because the interviewer is trained and instructed to probe for information, there should be very few answers of "don't know" and thus very few codes for "don't know" in the questionnaire. In the exceptional case when even a sound interviewing technique does not produce an answer, the interviewer is instructed (in the interviewers' manual and in training) to write "DK" (for "don't know") in the space reserved for an answer code. Such responses are given a special non-numeric code in the data entry program. The end result for analysis is much the same as having a "don't know" code for each question. However, this system has the advantage that it discourages interviewers from accepting "don't know" answers too easily, which they may be tempted to do to speed up the interview. Moreover, the special non-numeric code for such responses is glaringly obvious when the supervisor reviews the questionnaire.

Letting Respondents Choose Units

For many questions that involve payments or quantities, respondents are allowed to give their answers in whatever units they find most convenient. Examples of this are found in Figure 3.3. In Questions 4 and 12 the code of the time unit in which the respondent replies is placed in the box marked "time unit." The codes are provided in a box at the bottom of the page.

Allowing the respondent to select the time unit means that transactions are expressed in the units in which they normally occur, which may differ from household to household or from person to person. This avoids inaccuracies in conversion. For example, a person paid $510 per week will respond precisely if allowed to respond on a per-week basis. If forced to respond in terms of dollars per month, the respondent might round the figure down to $500 for ease of multiplication and calculate each month as being equivalent to four weeks. The annualized figure would thus become $24,000 instead of the $26,520 that would be reported if the respondent were allowed to report on a per-week basis and the data analyst then calculated the respondent's annual rate from that answer.

Of course, data analysis is always slightly more complicated when respondents' answers must be converted in order to arrive at annualized figures, but, since a computer can easily do this, this disadvantage is trivial. However, it is very important to ensure that, where necessary, the questionnaire explicitly asks the respondent how many times per year the payments are made. For example, a worker who reports a daily wage rate may be employed only intermittently. In this case, the questionnaire should ask the respondent how many weeks or months he or she has worked during the preceding 12 months (see Chapter 9 for details).
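The weekly-wage arithmetic above can be checked with a short Python sketch. It uses the time unit codes printed in the time unit box of Figure 3.3 and converts each to a number of periods per year. The conversion factors (a 365-day, 52-week year), the function name `annualize`, and the `periods_received` argument are assumptions made for illustration; they are not taken from any LSMS data entry program.

```python
# Time unit codes as printed in the time unit box of Figure 3.3.
# The periods-per-year factors are assumptions of this sketch.
PERIODS_PER_YEAR = {
    3: 365,  # DAY
    4: 52,   # WEEK
    5: 26,   # FORTNIGHT
    6: 12,   # MONTH
    7: 4,    # QUARTER
    8: 2,    # HALF YEAR
    9: 1,    # YEAR
}


def annualize(amount, time_unit_code, periods_received=None):
    """Return the annual total for a payment of `amount` per time unit.

    `periods_received` (hypothetical) is the number of periods in the past
    12 months in which the payment was actually received. It stands in for
    the follow-up question on weeks or months worked; when omitted, the
    payment is assumed to be received in every period of the year.
    """
    if time_unit_code not in PERIODS_PER_YEAR:
        raise ValueError(f"unknown time unit code: {time_unit_code}")
    full_year = PERIODS_PER_YEAR[time_unit_code]
    periods = full_year if periods_received is None else periods_received
    if not 0 <= periods <= full_year:
        raise ValueError("periods_received is outside the possible range")
    return amount * periods


# Respondent reports the actual weekly figure in his or her own unit.
reported_weekly = annualize(510, 4)

# Respondent is forced to answer per month: rounds to $500 a week and
# treats a month as four weeks, then the analyst multiplies by 12.
forced_monthly = annualize(500 * 4, 6)

print(reported_weekly, forced_monthly, reported_weekly - forced_monthly)
```

Passing `periods_received` covers the intermittently employed worker described above: a daily wage of 10 received on 100 days annualizes to 1,000 rather than 3,650.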
Figure 3.8 (continued). Interviewer instruction printed on the form: DETERMINE WHICH DURABLES THE HOUSEHOLD OWNS BY ASKING Q.1. FOR EACH DURABLE OWNED, WRITE THE DESCRIPTION AND CODE IN THE SPACE PROVIDED UNDER Q.2, AND PROCEED TO ASK Q.3-7 FOR EACH ITEM. Response codes for Question 4: PURCHASE ..1; GIFT OR PAYMENT ..2 (>>6). Answer columns: ITEM, CODE, YES, NO in the small grid; DESCRIPTION, CODE, YEARS, CURRENCY, CURRENCY, CURRENCY in the large grid, which has 16 numbered lines. Item codes in the small grid: Stove 201; Refrigerator 202; Washing Machine 203; Sewing/knitting machine 204; Fan 205; Television 206; Video player 207; Tape player/CD player 208; Camera, video camera 209; Bicycle 210; Motorcycle/scooter 211; Car or truck 212.

A particular place in the questionnaire where it is useful to allow respondents to choose their own units is in the "quantities produced" questions in the agriculture module. In Ghana, for example, respondents were allowed to give answers in 22 different kinds of units (Table 3.1). A serious problem for analysts who want to convert these different quantities to a single standard unit is that only about half of the units used in this example were standardized, and some of the standardized units were local terms (such as minibag and maxibag) that would be unknown to anyone not familiar with farming in Ghana.11 In the case of standardized local units, the survey team should ensure that such terms are defined (in terms of international standardized units) in a basic information document that includes all of the information that data users will need to analyze the data.

Table 3.1 Units of Quantity Used in Ghana, 1987-88

Unit                    Code
Pound                   *1
Kilogram                *2
Ton                     *3
Minibag                 *4
Maxibag                 [code illegible]
Bowl                    [code illegible]
[entries illegible]
American tin            9
Tree                    10
Stick                   11
Bundle                  12
Barrel                  13
Liter                   *14
Gallon                  [code illegible]
Beer bottle             *16
Bunch                   17
Nut                     [code illegible]
Fruit                   19
Log                     20
Box                     21
[entry illegible]       22

Note: It is preferable to use the unit codes marked by (*) whenever possible.
Source: Ghana LSMS survey (1987-88).

Respondent Codes

It is sometimes useful to know who is answering a certain section of the questionnaire. In general, each household member should answer for himself or herself, but this is not always possible. For example, a household member may be away during the entire week when the field team is working in his or her community. To indicate who is providing the information, a question can be inserted that asks the interviewer whether the person is answering for himself or herself. If someone else is providing the information, the interviewer should fill in the identification code of that person. An example of this is shown in Figure 3.5. This information is useful because a proxy respondent may give less accurate information than the person engaged in the activity in question. For example, one household member may not know the exact salary of another. Therefore, some analysts may wish to identify any possible biases introduced by the proxy respondents or to omit their responses altogether.

Cardstock Covers

LSMS questionnaires are usually printed with cardstock covers (covers made of very thin cardboard similar to the cardboard used in file folders). In some past surveys it was decided not to use these covers because of their added cost, but this led to the problem that the front and back pages of the questionnaire occasionally came loose. Since the front page usually carries the key household identifier information and the back page sometimes contains the household roster, any such loss is likely to render the rest of the questionnaire useless. Thus cardstock covers are well worth their cost.

Identifying Sections

The household questionnaire contained in a prototypical full LSMS survey can be very bulky. The Nepal questionnaire, for example, had 70 pages. Therefore, it is useful to devise some ways to make it easy for readers to find their way around in these questionnaires. A few ideas are listed here, and there may well be more. First, it is useful to have page numbers on each page and a table of contents listing the sections (and their page numbers) at the beginning or end of the household questionnaire. Second, some inexpensive graphic techniques can be used to divide the questionnaire into smaller parts. For example, some sections of the questionnaire can be printed on different colored paper or in different colored inks, or sheets of colored paper can be inserted between major portions of the questionnaire. It is also possible to print short, dark bars at the edge of each page, with the placement of these bars on the page being the same within each module but lower down (if on the vertical edge) or further to the right (if on the bottom edge) in each successive module.
Using just one or a few of these techniques will be sufficient. The questionnaire should not become too colorful or complicated.

Legibility and Spacing

There is an art to laying out the grids for a questionnaire. The lettering must be large enough to read, which is sometimes difficult to accomplish in the compact structure of the grid. Legibility is especially important, as interviews often take place under poor lighting conditions, such as outdoors at dusk or after dark in homes dimly lit with lanterns, oil lamps, or candles. The good print quality now available from laser printers helps, but poor legibility is an ongoing complaint among interviewers.

There must also be enough white (empty) space in the layout of the questionnaire. Whenever the answer will be coded later, a generous space should be allowed to write out fully the information required, such as the person's name, the name of the school attended by the respondent, and the respondent's occupation. In other places, judicious use of white space makes the questionnaire easier to read or less confusing than a questionnaire in which every page is crowded with print.

In fact, in this book, the fonts used in Volume 3 are probably too small. This is necessary for Volume 3 to show how typical questionnaire pages should appear. In an actual questionnaire, the size of the pages usually will be somewhat larger than the pages in this book, and the font size should be increased by a similar proportion.

Software for the Questionnaire Layout

Many of the most common word processing and graphics software packages are adequate for producing questionnaire page layouts, and LSMS questionnaires have been produced using several different software packages. The modules in Volume 3 (the electronic versions of which are available to readers in the CD-ROM enclosed in the volume) were produced in Microsoft Excel, for two reasons. First, Excel is widely available. Second, spreadsheet software is better than word processing software at dealing with the long horizontal format of groups of questions on a single topic that are spread across several pages. Regardless of the software used, it is now much simpler and cheaper to make revisions between the various drafts of the modules than it was in the days when graphic artists had to draw each page by hand. The computerized approach also simplifies translations, as the verbal parts can be overwritten in the local language, leaving intact the skip codes, response codes, and general format.

Appendix 3.1 Common Gaps and Overlaps

This appendix provides a list of the modules that should be checked for gaps and overlaps with respect to the information that they collect. This list is not meant to be exhaustive because household questionnaires of different configurations will be subject to different risks of gaps and overlaps and because there are so many possibilities that it is difficult to list them all. However, some of the most common and important issues are mentioned here. Many more are mentioned in the relevant chapters of this book.

Consumption

Consumption information usually comes from several different modules of the household questionnaire. See the discussion in Chapter 5 on the different components of consumption and the modules in which those components are typically collected.

Income

Information on household income is gathered in the following modules: employment, household enterprise, agriculture, and transfers and other nonlabor income. It is sometimes also collected in the housing and savings modules. It is important to review the questionnaire as a whole to make sure that it accounts for all possible sources of income. In particular, questions about income from any rental property could be placed in the transfers and other nonlabor income module, on the assets page of the savings module, or, if the income comes from renting out a portion of the household's primary dwelling, in the housing module.

Wealth

Information on household assets is collected in several modules. The housing module gathers information on the household's principal residence. The household enterprise module gathers information on equipment and land associated with each household enterprise, and on the stocks of inputs and outputs used in each enterprise. The agricultural module gathers information on land, equipment, and livestock. The savings module collects information on other properties and financial assets, and the durable goods submodule of the consumption module collects data on the household's durable goods. Finally, the credit module gathers information on the household's liabilities.

Credit

Credit information is collected in several modules, including the modules for housing, consumption, savings, agriculture, and household businesses. There is also a separate credit module. Chapter 21 introduces the credit module and clarifies gaps and overlaps in credit.

Mortgages

Information on any mortgages that a household might hold can be gathered either in the credit module or in the housing, agriculture, and household enterprise modules.

Employment

Analysts often need to know how many hours each household member works in the household's enterprises and in its agricultural activities as well as hours worked in employment outside the household. In previous LSMS surveys, all of this information was collected in the employment module. As explained in Chapter 9 (and Chapters 18 and 19), this book recommends collecting data on household members' days and hours of work in household enterprises and agricultural activities in the household enterprise and agriculture modules, respectively, while continuing to ask about the number of hours worked in wage employment in the employment module. However, some survey designers may decide not to include the household enterprise and agriculture modules. In such cases information on the number of hours spent working on these activities must be collected in the employment module.

Vaccination

If the survey includes a fertility module, questions about vaccination should usually be placed in the fertility module so that this information can be collected not only for children who currently live in the household but also for children who have died or moved to another household. If there is no fertility module or the fertility module does not include all women of childbearing age, vaccination information on children living in the household can be collected in the anthropometry module. Another alternative is to gather this information in the health module, which includes questions on vaccinations in Part C of the standard health questionnaire.

Domestic Housework

Some previous LSMS surveys have collected information on how much time household members spend doing housework (such as cooking, cleaning, and childcare) in the employment module, usually asking only one question. If a time use module is included in a questionnaire, there is no reason to ask questions about housework in the employment module. However, because the time use module is very long, it is unlikely to be used in most LSMS-type multitopic surveys. If the time use module is not included but survey designers want to gather a small amount of information on, for example, the number of hours spent on housework during the previous seven days, one or two questions can be added to the employment module. (See Chapter 9 for further discussion of this issue.)

Notes

The authors would like to express their gratitude to Jere Behrman, Lawrence Haddad, Courtney Harold, John Hoddinott, Alberto Martini, and Raylynn Oliver for comments on an earlier draft.

1. Survey designers occasionally collect redundant information as a cross-check on other data. For example, most previous LSMS surveys have recorded both the age (in years) and the date of birth of each household member. This is done to verify the accuracy of the age variable.

2. This assumes that a two-stage sample is used. In the case of a three-stage sample, the secondary sampling unit is more pertinent. Generally, the penultimate sampling unit is the appropriate unit for collecting community data.

3. Issues concerning the order of the questions within each module are discussed in the topic-specific chapters in Parts 2 and 3 of this book. For a general discussion of ordering questions in household surveys see United Nations (1985) and Frey and Oishi (1995).

4. The short version of the health module presented in Chapter 9 does not ask for particularly sensitive information, but the standard and extended versions ask detailed questions about health status and health behavior (including drinking and smoking) that can be sensitive. If either the standard or the long module is used, health should not be one of the first modules in the questionnaire.

5. The questions in the household enterprise module that refer to "the past 14 days" can be reworded as "since my last visit" if the second half of the questionnaire is administered two weeks after the interviewer's first visit.

6. For example, the education module asks questions such as "What grade is [..NAME..] enrolled in?" For this question, the range of acceptable values in the data set is precisely defined. Moreover, it is also related to other information such as the degree obtained and the age of the student. (For example, a six-year-old should not be in secondary school.) In the consumption module, however, a wide range of values might be found for a question such as "How much did you spend on rice in the last two weeks?" which implies that fewer consistency checks are possible.

7. This section is a slightly modified version of the discussion on translating and field testing found in Chapter 3 of Grosh and Muñoz (1996).

8. An alternative approach is to stretch the reference periods during the field test. For instance, instead of asking "Have you been ill or injured during the past 30 days?" as in the actual survey, it may be expedient to ask "Have you been ill or injured during the past 12 months?" or "When was the last time you were ill or injured?" This approach will simplify the logistics of finding enough people to try out the module but will not test very precisely whether the respondents find it difficult to recall the information, since the recall period used in the field test will be longer than the period used in the final questionnaire.

9. This section is a slightly modified version of the discussion of questionnaire formatting found in Chapter 3 of Grosh and Muñoz (1996).

10. For languages that do not have uppercase and lowercase, another way should be found to distinguish instructions from questions. It may be possible to use italics, bold, a different font, or a different color. An example of this is the LSMS survey of rural households in northeast China in 1995. Chinese characters do not have uppercase and lowercase, so two different fonts were used.

11. It is not necessary to convert quantities into standard units (for example, to convert bunches into kilos) to calculate farm income, which was the purpose of the agriculture module in the Ghana LSMS. However, as is common with such rich data sets, analysts are using the data for other purposes as well, such as calculating the total quantities of various crops that were produced.

References

Ainsworth, Martha, and Jacques van der Gaag. 1988. Guidelines for Adapting the LSMS Living Standards Questionnaires to Local Conditions. Living Standards Measurement Study Working Paper 26. Washington, D.C.: World Bank.

Babbie, Earl. 1990. Survey Research Methods. Belmont, Cal.: Wadsworth.

Fink, Arlene. 1995. The Survey Handbook. Thousand Oaks, Cal.: Sage Publications.

Fowler, Floyd. 1993. Survey Research Methods. Second ed. Newbury Park, Cal.: Sage Publications.

Frey, James, and Sabine Mertens Oishi. 1995. How to Conduct Interviews by Telephone and in Person. Thousand Oaks, Cal.: Sage Publications.

Grosh, Margaret, and Juan Muñoz. 1996. A Manual for Planning and Implementing the Living Standards Measurement Study Survey. Living Standards Measurement Study Working Paper 126. Washington, D.C.: World Bank.

Oliver, Raylynn. 1997. Model Living Standards Measurement Study Survey Questionnaire for the Countries of the Former Soviet Union. Living Standards Measurement Study Working Paper 130. Washington, D.C.: World Bank.

Scott, Christopher, Martin Vaessen, Sidiki Coulibaly, and Jane Verrall. 1988. "Verbatim Questionnaires Versus Field Translation or Schedules: An Experimental Study." International Statistical Review 56 (3): 259-78.

United Nations. 1985. "Development and Design of Survey Questionnaires." Department of Technical Cooperation for Development, National Household Survey Capability Programme, New York.

Part 2 Core Modules

4 Metadata: Information about Each Interview and Questionnaire

Margaret Grosh and Juan Muñoz

A survey data set should contain not only the answers given to the questions posed in the interviews but also some information about how the survey was conducted.
This information is often called "metadata" or data about data.

Survey planners must make many different methodological decisions while they are designing their survey. Often, especially in the case of surveys carried out in developing countries, these decisions are made on the basis of anecdotal evidence, personal judgments, or "survey folklore" rather than on the basis of quantitative analysis of past experience. However, it is possible to gather evidence on various aspects of survey methodology to provide some guidance to the designers of future surveys.

Until now, there has been no systematic attempt in LSMS survey projects to collect and analyze metadata. It is ironic that LSMS surveys, which are designed to facilitate quantitative analysis of government policies and programs, have not been designed to apply the same types of quantitative analyses to themselves, that is, to facilitate analysis of survey methods. The aim of this chapter is to help to remedy this oversight by specifying what metadata should be collected in future LSMS surveys.

Issues

Metadata can be useful in three different areas: for carrying out substantive analysis, for managing surveys, and for carrying out methodological research.

Substantive Analysis

Some metadata are required for analyzing survey data. Previous LSMS surveys have been reasonably effective in collecting such data.

The most crucial metadata required for analytical purposes is information about the sample. Even simple analysis (such as means and cross-tabulations) requires correct sampling weights, which are calculated from information about the sample that must be collected during the planning and implementation of the survey.

Other metadata are useful for more specific analyses. Analysts often need to know the dates of the interviews in order to calculate important constructed variables. For example, the ages of respondents can be calculated by subtracting the respondents' birthdates from the dates on which they were interviewed. It is also often important for analysts to have precise information on the ages of very young children so that they can use these data in conjunction with height or weight data to calculate the children's nutritional status. The dates of the interviews are also important in adjusting any estimates for price differences that may have occurred due to inflation during the fieldwork period (especially when the fieldwork is spread over a year or, in high-inflation countries, even when the fieldwork is compressed into a few weeks) or between two separate surveys when analysts are comparing data from the two surveys.

Occasionally analysts make more complex use of metadata. For example, they may conduct a sensitivity analysis to determine whether the patterns of answers given by replacement households or proxy respondents differ from the answers that would have been given by the intended respondents. If the answers are significantly different, the analysts may wish to omit the observations given by proxy respondents from their analyses or to adjust their calculations to correct for any biases introduced by the proxy respondents.

Managing the Survey

It is extremely useful for survey managers to receive information on how the survey interviews are going while the survey is in progress. For example, survey managers may want to monitor how many interviews are being conducted in the various languages spoken in the country of the survey. If more interviews than expected are taking place in a particular language, the survey managers can arrange to have more questionnaire forms printed in that language. Similarly, if survey managers have ongoing information on how quickly the field teams are completing their work in a particular sampling unit, they can monitor whether the fieldwork is progressing either slower or faster than planned and can take steps to adjust the timetable or the budget accordingly. Also, if they know the rates of nonresponse or of proxy interviews for each interviewer during each month of fieldwork, they can arrange for the interviewers who are performing poorly to be supervised more closely, to receive remedial training, or, in exceptional cases, to be replaced.

Methodological Research

Some insights gained during one survey may come too late to improve that particular survey but can be a valuable input into the design of subsequent surveys. Thus many of the methodological questions that metadata can help answer are the same as those involved in survey management but over a longer time frame. These questions include:
* How many more observations can be gained by having interviewers make a second, third, or fourth attempt at contacting a household where no one was home at the time of the interviewer's first visit?
* How long is the average interview?
* How much does the length of the interview affect the quality of the data?
* How do the characteristics of the household (such as size or economic activities) affect the number of times the interviewer must visit the household to complete the whole questionnaire?
* How do different characteristics of the interviewers (such as gender, age, or education) affect respondents' performance or responses?
* Into how many languages should the questionnaires be translated?
* How many languages should the interviewers be able to speak?

Data Needs

To gather together a full set of metadata about a given survey, information must be gathered from several different sources. In this section, each of the main sources of metadata is discussed. In keeping with the theme of this book (the design of survey questionnaires), most emphasis is put on the data that are gathered in the questionnaire itself. This chapter concentrates on the issue of what data about the process of carrying out the survey should be collected and recorded rather than on explaining or giving recommendations about the survey process. The experience of and recommendations for LSMS surveys on sampling, fieldwork, data management, and project planning are the subject of a companion volume to this one (Grosh and Muñoz 1996). It should be noted that the recommendations made in this chapter about what types of metadata to gather will apply regardless of whether specific parts of the LSMS implementation logistics are adopted or not. The specific format may, of course, vary according to how the survey is implemented.

Metadata collected on the questionnaires should be typed into the computer and made available to researchers along with the survey data sets. In most previous surveys some metadata have been recorded on the paper questionnaires as an aid to field managers, but have not been entered into the computer. As a result, the metadata could not be used to help to analyze the substantive survey data or to rigorously evaluate the methods used in the survey.

The Household Questionnaire

Some of the most important and useful metadata can be gathered by means of the survey's household questionnaire.

IDENTIFYING INFORMATION FOR THE HOUSEHOLD.
The It is also important to express gcographical infor- first and simplest piece of metadata on each question- mation not only in terms of administrative units but naire is its unique identification code.1 No survey also in terms of geographical descriptors such as lati- should omit this. tude and longitude (and even altitude). This informa- tion should be recorded for two units of observation- SAMPLING INFORMATION FOR THE HOUSEHOLD. For the primary sampling unit and the dwelling. sampling weights to be calculated correctly, informa- Gathering data on the location of the primary tion from three separate sources must be brought sampling unit, even within a few kilometers, will be together. helpful for analysts and is crucial both for managing First, some information, such as the codes and the fieldwork and for merging the survey data, at least names for each stage of the sampling process-the approximately, with other geographical information- stratum, the number of the census block that serves as for example, analyzing information on climate and soil the primary sampling unit (PSU),2 and the dwelling quality in conjunction with data from the agriculture number-can usually be recorded on the question- module. An increasing number of developing coun- nalre by the clerical staff who are preparing the survey tries have data sets that include the location described before the interview is conducted.3 in terms of latitude and longitude and the characteris- Second, as the interviews are conducted, further tics of government services such as clinics, schools, and sampling information is collected for each household post offices. in the original sample. For example, it is occasionally The geographical descriptors of the primary sam- impossible to interview a particular household in the pling unit in the LSMS survey can be drawn from the sample. In these cases the interviewer should note the cartography done for the sampling. 
Alternatively, glob- reason on the questionnaire (for example, the inter- al positioning systems (also known as geopositioning viewer was unable to locate the building, the building systems) can be used during fieldwork to record the was no longer being used as a dwelling, the residents location of the community center and possibly of other were all absent for the duration of the fieldwork in the important points of reference in the community (such area, or the household members refused to participate as clinics, schools, the nearest hospital, the agricultural in the survey). If other households are interviewed as co-op, and the nearest paved road).The second option "replacements" for these nonresponding households, is likely to yield more precise measurements. the interviewer should note the fact that the inter- The locations of the dwellings of the sample viewed household is a replacement. households can also be recorded using global position- Third, some information comes from the records ing systems, accurate to within a few meters. Knowing of how the sample was drawn (see discussion below the exact location of households makes it possible for under "Sampling Records"). analysts to measure accurately the distance between each dwelling and the available services. It also allows GEOGRAPHICAL INFORMATION FOR THE HOUSEHOLD. them to merge the survey information more precisely The geographical information that should be collect- with information containing geographical references ed for each household should specify the state (or from other sources. In addition, knowledge of a department, province, or region) and the county (or household's exact location makes it easier for inter- municipality, district, or prefecture) within which the viewers to find the dwelling again if the survey is household is located. 
At least two, and in some coun- repeated.4 tries three or four, levels of administrative unit may Clearly, recording the location of the sampling exist and, therefore, need to be recorded. This infor- unit is extremely useful and should be done without mation can often be precoded into regions that are fail. However, recording the exact location of the analytically meaningful, such as rural and urban areas dwelling raises a problem of confidentiality.To include or major agroclimatic zones such as mountains, plains, the latitude and longitude of a household in the data and coast. These codes can either be placed in the files that are released to the public would mean that questionnaire or can come from the same administra- data users could, at least theoretically, locate specific tive records from which the sampling weights are households and connect their members with the drawn. information that they gave during the survey. This 79 MARGARET GROSH AND JUAN MU90Z would violate the confidentiality that interviewers the proxy respondent, because analysts of the resulting promise respondents. data set may want to treat responses from proxies dif- There are some potential solutions to this prob- ferently from responses given by preferred respon- lem. One option would be to omit the latitude and dents.Also, survey managers may wish to monitor the longitude of the households from the files released to rate of proxy respondents per interviewer as one way the public.This is parallel to the procedure often used of judging the quality of each interviewer's work. for the names and addresses of the households, which Planners of new surveys may wish to know how much are typically included on questionnaires but omitted difference proxy responses make so they can know from the publicly released data files. 
In this case the whether it is worth incurring the costs of making sev- survey agency would have to calculate the distance eral visits to the household to reduce the number of variables before releasing the files to potential analysts. proxy responses. The implication of this would be that only the survey team would have the power to merge the household INFORMATION ABOUT THE INTERVIEWS. Interviewers survey information with other geographical data.This should record the following pieces of information might limit the number of such merges performed, about the conduct of the individual interviews: since the survey agency would be less likely than a * The length of time taken to interview each person. series of individual researchers working on many dif- Recording this information is essential so survey ferent issues to seek out complementary data sources managers can know how onerous a burden is put on many different topics. An alternative option would on respondents during the survey. Interviews can be to give researchers access to the original geograph- be split into more than one session. Thus the basic ical descriptor data files in order to create their own unit of observation should be the "sitting" or the distance variables or to merge geographical data with length of time a given respondent is interviewed on other data sets, but to vet these individuals carefully to one occasion.The total interview time for that per- ensure that they did not make the information avail- son is the sum of the time spent in all of his or her able to third parties or otherwise misuse it. sittings.The total interview time for the household is the sum of all sittings by all individuals in the RESPONDENTS FOR THE MODULES. Interviewers should household. In some surveys it may also be usefLil to record which household member was the respondent know how long it took in each household to com- for each module and each section. 
All of the draft plete a specific module, especially for modules that questionnaire modules presented in this book specify are new, experimental, sensitive, or very lengthy. a preferred respondent for each section. Broadly, they Ilow many times an interviewer mnust visit a household to specify that every household member over a particular obtain all of the required information. Survey managers age (for example, ten years) should reply for himself or and planners want to know whether it is worth- herself in modules that deal with individual attributes while in terms of time and costs to make third or such as health, education, or labor activities. In the fourth visits to households in order to get more household enterprise and agriculture modules, the modules filled out or to minimize proxy responses. respondent for each enterprise or agricultural plot Therefore, interviewers should record the number should be the household member best informed about of visits they make to each household and the num- that enterprise or plot. This system should yield the ber of person-modules5 completed on each visit. most accurate informnation. This system also has the * The date each module was administered to each house- advantage of spreading the burden of the interview hold. This information must be recorded: to help among several respondents, thus reducing the chance analysts calculate precisely the ages of the respon- that any one respondent will get tired. dents (especially children) based on their birth dates Despite the survey teams' best efforts, there will and interview dates; to help analysts adjust any inevitably be times when it will not be possible to monetary values expressed in the responses for interview every individual in person. In these cases inflation; to analyze any seasonality issues; and to interviewers wvill have to settle for proxy respondents. 
find out how long a period elapsed during the ref- It is important that interviewers record when this hap- erence period for questions that include a time- pens, the reason why it happens, and the identity of frame (for example, the answer to "How much have 80 CHAPTER 4 METADATA-INFORMATION ABOUT EACH INTERVIEW AND QUESTIONNAIRE you spent on [..item x..] since my last visit?" can ciple applies to all modules but is presumably more only be calculated on a 12-month basis if the num- relevant in some than in others. Survey managers ber of days between the two interviews is known). mIIay wish to monitor the extent to which other The language in which the interview is conducted. It is people are present during interviews because if this sometimes prohibitively expensive or complicated occurs frequently in the case of some interviewers, to translate the survey questionnaire into all of the it may be a sign that they lack certain skills or are languages spoken in a country. In these situations, not being sufficiently diligent in following the when the respondent does not know any of the fieldwork guidelines. languages in which the questionnaire is written, the * Notes in thle margins of the questionnaire or in special sec- interviewer has two options. First, he or she can tions providedfor notes. These can be useful during the conduct the interview in a language other than that process of cleaning the data files. For example, on in which the questionnaire is written (for example, one occasion, an interviewer noted that the lady the LSMS questionnaire for Ghana was written being interviewed attributed her substantial income only in English, but some interviews took place in not to the employment or assets she was asked about Akan). Second, if the interviewer and the respon- in the questionnaire, but to her two very generous dent speak no language in common, the interview lovers. On another occasion, an interviewer wrote may be conducted through an interpreter. This is . 
"respondent dead drunk, answers probably not true." often a person who resides in the household or pri- In both these cases the comment provided was very mary sampling unit where the interview is con- useful. In the first case the responses to the question- ducted. If an interpreter is used, the interviewer naire were accurate but would probably have been should record this. Since errors are inevitably intro- discarded during the data cleaning process had it not duced in the process of translation and interpreta- been for the interviewer's comment. In the second tion (see Scott and others 1988), it is important to case the information on the questionnaire was prob- monitor the overall rates of these occurrences and ably inaccurate, but it might have been used had the to let analysts know for which interviews these interviewer not made that note. conditions held. It is usually sufficient to record this information for each individual rather than for each INFORMATION ABOUT THE SuRvEY TEAM. The code person-module, since individual respondents tend number of the field staff (the interviewer, supervisor, to use the same language to respond to all their anthropometrist, and data entry operator) associated modules. For example, if an elderly Bolivian with each household should be noted on the ques- woman responds to the questions in the health tionnaire, as each of these individuals deal with that module in Quechua rather than in Spanish, she will household. This makes it possible for survey managers probably also respond to the employment module to monitor the quality of their survey teams. It also in Quechua rather than Spanish. Her son, however, allows analysts to look out for "interviewer effects" might be interviewed in Spanish. that might complicate or distort analysis. 
Whether other people are present during an interview and what the relationship is between the respondent and the observer. Respondents are more likely to reveal full and accurate details of sensitive information on such topics as income, savings, and sexual behavior if the interview is conducted in private rather than in the presence of others. For example, a wife may not want her husband to know how much money she has earned or saved from her farm plot. A youth may not want his or her parents to know about his or her sexual encounters or contraceptive practices (see, for example, Hoyt and Chaloupka 1994 on the willingness of young people to report their use of drugs in the presence of others). This general prin-

Planners of subsequent surveys and methodologists often wish to analyze how various features of the interview (such as completion rates, replacement rates, nonresponses by item, and proxy respondents) vary by the characteristics of the field staff.6 In order for them to be able to do the fullest possible analysis, it is necessary to prepare a separate file (taking the information from the administrative records of the agency conducting the survey) that lists all of the characteristics of the field staff, including gender, education, years of experience in the job, ethnicity, age, and languages spoken. A format for recording information from the administrative records is introduced in the third section of this chapter (and provided in Volume 3).

The Community (and Facility) Questionnaire(s)

The same list of metadata that needs to be collected on the household questionnaire also needs to be collected on the community and facility questionnaire(s), with only two variations.

The first exception is the geographical location information. While it may be acceptable not to record the latitude and longitude of households, this should be done as a matter of course for communities (and facilities), as discussed above.

The second variation involves the privacy of an interview. In household interviews the interviewer should try to ensure that each respondent can give his or her responses in private. In contrast, when community questionnaires are fielded, interviewers explicitly seek out groups of people to be interviewed together. In many cases this means that answers given will convey some kind of consensus or common view in the community. In other cases group interviews yield a full range of different answers. For example, in answer to a question about what sources of health care are available to the community, a group interview may elicit a longer list of possible places to go for health care than would an interview with a single respondent. The interviewer should record information about the people who contribute to the group interviews, including their names, genders, positions of leadership in the community, approximate ages, and, possibly, ethnic groups.

However, there is a category of metadata that can only be collected within the community questionnaire: the costs and logistics of the survey by primary sampling unit. Many of the items that influence total survey costs are specific to the primary sampling unit and not to the household, so the primary sampling unit/community is the appropriate unit of observation. Having information on such questions as how many nights the interviewers had to stay in the primary sampling unit overnight, whether they stayed in hotels or in households, the number of miles the vehicles traveled between and within primary sampling units, and whether electricity was available for entering the data into computers helps planners of subsequent surveys draw up accurate budgets and work programs.

Sampling Records

Some important sampling information needed to calculate the final sampling weights has to come from the sample frame and from the records that document how the first stage of the sample was drawn. Thus it is important to ensure that the administrative, geographical, and sampling terminology that applies to each primary sampling unit, the probability of that particular unit being selected into the sample, and the measure of the size used in drawing the first-stage sample are recorded. How these should be recorded is discussed in the third and fourth sections of this chapter.

Administrative Records

Many important metadata are contained in administrative records rather than in the survey questionnaire, including data relating to the survey's field staff. Information on the interviewers is especially important since these are the staff who have the most contact with the sample households. However, the same information should be recorded for all team members, including supervisors, data entry operators, and anthropometrists as well as interviewers. An important question in survey work is whether the characteristics of the interviewer affect how the respondents answer the survey questions. Therefore, it is important to collect information on the interviewers' characteristics: at minimum their age, gender, and education. In places where many of the interviewers have more than a secondary education, it is useful to note in which discipline they are trained. The amount of survey experience each interviewer has had should also be noted, as well as whether they are permanent staff of the statistical agency or are short-term contractors. Often, the race or ethnic group to which interviewers belong will also be pertinent. A sample form for collecting this information is introduced in the third section of this chapter (and provided in Volume 3).

Cost information is another important aspect of metadata that can be gleaned from administrative records. Planners of subsequent surveys are likely to want to see basic budget information about the survey, such as how many staff worked at what salary for what period of time,7 how many and what kinds of durable equipment were required (for example, vehicles and computers), and how many and what kinds of consumables were used (for example, paper, pencils, diskettes, printer cartridges, and field kits for interviewers). No form is given for this in Volume 3 because the style of every survey's budget is usually dictated by the requirements of the various agencies that finance the survey. Those who analyze metadata from a survey after it has been completed should use the final budget figures rather than the budget devised when the survey was originally being planned.

Data Entry Program

The data entry program can record various statistics about each data entry session for each household (for example, the date on which the entries were made, the length of time it took to enter the data, the operator's identification code, and the number of entries that showed errors when range or consistency checks were performed). This information has hardly ever been analyzed in the case of previous LSMS surveys, but it could reveal a lot about how the fieldwork was actually carried out. For example, it could help answer questions such as: In how many cases were the data from the first part of the questionnaire really entered before the second set of interviews took place? How long did it take to enter the data from a questionnaire? How often did errors in skip patterns occur?

Special Experiments

To analyze some issues a special experiment may be required. Consider, for example, the issue of whether the codes for "yes" and "no" should be 1 and 2 (as in most previous LSMS surveys), 1 and 3 (as in RAND's Family Life Surveys), or 1 and 6 (as in some other surveys). The idea behind using 1 and 3 or 1 and 6 is that the data entry operator is likely to make fewer mistakes because the keys for the two different numbers are further apart on the keyboard than 1 and 2. However, since the normal administration of surveys includes checks for data consistency and/or for double-blind entry of data and since the entered data are proofread and checked against the original questionnaire pages, most (if not all) of these typographical errors should be corrected before the data set is produced. An experiment that included data entry without all of the normal quality controls but with a special assessment of accuracy would be required in order to definitively answer the question of what are the best codes to use for "yes" and "no." Since these experiments are not commonly done as part of general survey implementation, no more will be said about them here. They have been mentioned to remind the reader that the metadata that can reasonably be gathered as part of implementing a survey, while useful, will not answer all questions of survey methodology.

Draft Module

Most of the metadata from the household questionnaire can be gathered in a special "metadata module" of the questionnaire. A draft metadata module is presented in Volume 3. The metadata module consists of three parts: the household identification page, the record of visits and interviews, and the comments page. Alternatively, some of the information can be collected on the pages of the topic-specific modules using an alternative design of the questionnaire page, examples of which are presented in Volume 3. Forms are also included for collecting important metadata from the community questionnaire and from sampling and personnel records. Detailed notes on how to use these forms are given in the fourth section of this chapter.

Box 4.1 Cautionary Advice

* How much of the draft module is new and unproven? The household identification page is similar to those used in many previous LSMS surveys. The record of visits and interviews changes the unit of observation from the whole questionnaire to the "sitting." This change is untested, but most of the information collected in the grid was previously collected for the household as a whole. Neither the field staff's comments nor the characteristics of the survey staff have been incorporated into the electronic files in previous LSMS surveys, but these have both been done in other surveys with stronger traditions of collecting metadata than the LSMS. The sheet for survey costs and conditions specific to the primary sampling unit is new. The information contained in the sampling spreadsheet has always been collected in one way or another but usually not so tidily packaged until late in the data-cleaning phase.
* How well has the module worked in the past? The household identification page has worked quite well in previous LSMS surveys. The household-level record of visits and interviews may not have been filled out as rigorously as it could be and has rarely been used in analysis, probably because it was evident to both interviewers and analysts that the unit of observation used in the past did not make sense.
* What parts of the module most need to be customized? Survey planners will have to choose what amount of metadata to collect. Thereafter the main customization required will be the nomenclature for the various sampling terms, administrative terms, ethnicities, and languages. Also, if cost data are collected in the community questionnaire, this will need to be customized to fit the fieldwork plan.

The Household Metadata Module

The first part of the module consists of the household identification. This should be filled out for every household in the sample as originally drawn, regardless of whether the household is interviewed. This part contains the sampling information (including whether the household is a replacement and, if the interview did not take place, why not), the geographical information (including the address of the dwelling), minimal information about the head of the household, and the codes for the field staff.

The second part of the module is a grid for recording a summary of visits and interviews (otherwise known as the sittings grid). In this grid, the unit of observation is the "sitting," which is the length of time any one respondent is interviewed at one time. On a single visit to a household, the interviewer may conduct more than one sitting if she or he interviews more than one person. The grid records important information about the sitting: how long it lasted, whether the person interviewed was responding for himself or herself or as a proxy, the language used, whether or not others were present, and the interviewer's impression of the quality of the data. A space is provided on this form for matching specific comments that appear on the next page (the field staff comments page) to the sitting to which they pertain.

An alternative to the summary of visits and interviews page would be for the interviewer to record the respondent's name and ID code number, the language used, and whether others were present during the interview somewhere within the topic modules. This would be straightforward as the interviewer can observe all of these details as a matter of course rather than needing to ask questions about them. In some modules it may even be feasible to add some items that the interviewer would have to take action to discover, such as the start and end time of each module (by looking at his or her watch) or the reason why a proxy respondent had to be interviewed. However, doing this for each and every module would probably disrupt the flow of the interview. If the interviewer checked his or her watch at the beginning and end of each module, the respondent would probably either feel rushed or become impatient to finish the interview.

The third part of the household metadata module is the comments page. Interviewers should be asked either to write all of their comments here during the interview or to transfer the comments that they wrote down elsewhere to this space after the interview. Putting all of the comments together in one place makes it much easier for the data entry operator to enter them in a text file. The system for numbering the comments and for cross-referencing them to the sittings grid, section, and question to which they pertain is designed to make it easy for the analyst to locate appropriate comments.

There are two circumstances in which survey designers may wish to arrange their survey so that the metadata are collected in the topic modules. The first circumstance is when designers do not need to collect a full set of metadata but simply want to collect metadata on modules that they think contain the most sensitive topics or on modules that are experimental or may not be as well designed as the others. The second circumstance is when designers want to collect not only the metadata in the sittings grid but also some extra metadata on a few modules. In this case they should include the sittings grid in the questionnaire and leave space for collecting metadata in the topic modules as well. For example, survey planners may want to collect metadata on the average time it takes to complete a specific module. In this case they should leave spaces for the interviewer to record the time at the beginning and end of the module in the module itself. The information on the sittings grid will not be sufficient because several modules are usually filled out during a single sitting.

The form following the comments page shows how information on the respondent and on the length of the interview can be added to the food page of the consumption module. In reality it is probably of more interest to analysts to know how long it took to complete the whole consumption module than to complete each part of it. Therefore, the blanks for recording the start and stop times might begin on page 1 of the consumption module and end several pages later. The dotted line indicates that questions have been omitted to show the beginning and end of a multipage module on a single sheet. Another form (in this case, a section of the employment module) shows how information on the interview can be added to an individual-level module. Again, the time recorded for
the "length of interview" should probably cover the whole employment module rather than just one part of it.

The reader should note that the forms presented here differ from past LSMS practice in two ways. First, the sittings grid is more detailed. Previous LSMS surveys were organized as follows. Interviewers made a first visit to all sample households in a given primary sampling unit, at which time they administered the first half of the questionnaire (roughly speaking, the individual-specific modules). This was followed by a period when the data from these first interviews were entered in the computer, while the interviewers visited households in a different primary sampling unit. Then, the interviewers returned to the sample households in the first primary sampling unit to conduct a second "interview" to administer the remainder of the questionnaire. Metadata were, therefore, collected for each "interview." However, the drawback to using the "interview" as the defined unit of measurement was that, within each "interview," interviewers were asked to conduct "mini-interviews" with each household member over the age of ten. Accomplishing this often required more than one visit to a household. Thus the format given for collecting information by "interview" did not correspond with actual field practice. This may be one of the reasons why little use has been made of the metadata information collected.

The second way in which the draft module differs from those included in previous LSMS surveys is that it makes it possible to enter interviewers' comments in the electronic data files. In previous surveys interviewers have always written marginal comments on the questionnaires, but these comments have not been included in the electronic data sets. Thus analysts have been unable to consult these comments, even when they may have been necessary to elucidate the data.

Thus, in the forms proposed here, the unit of observation has been changed to reflect the actual practice in the field. However, it is important to bear in mind the fact that the sittings grid and comment system presented here have not been field tested. It will be important to conduct careful field tests of both instruments before using them in a specific country setting, to ensure that both interviewers and their supervisors are properly trained in how to use them.

As with all the other modules, the metadata module will need to be customized to each country context. For example, on the household identification page, the nomenclature and number of codes will have to be changed to fit the circumstances in the country where the survey is to be conducted. Also, in the sittings grid, the list of languages in Question 7 and the list of modules in Question 10 should be changed to match the survey being conducted.

Form for Collecting Information on Survey Costs

The purpose of this form is to collect information on the cost of the survey's fieldwork, thereby helping the planners of similar surveys in the future. Including this form as a page in the community questionnaire would simplify the administration of the survey and make it less likely that the form would be lost or not filled out. However, since the information collected by this form has a purpose so different from the information collected in the rest of the community questionnaire, the form has been discussed and included here instead of in Chapter 13.

Form for Recording Information from Sampling Records

This form is for recording all pertinent sampling information. Some of this information will be generated in the course of drawing up the first stage of the sample, some will come from the results of the listing operation, and some will come from the results of the survey itself. Also, some will come from calculations based on the raw information from the above three sources. Details of this process are given in the fourth section of the chapter.

Form for Recording Information on Personnel Characteristics

This form is for recording all pertinent information on the field staff. Some of this information may be available from the personnel records of the statistical agency, although these records often omit details (such as the languages spoken by the staff member or the staff member's ethnicity) that can be important to analysts studying interviewer effects. One way to gather all the necessary information might be to do so at a training session that brings all staff together in the same place at the same time.

Explanatory Notes

This section of the chapter provides some explanatory notes to the questionnaire forms given in the previous section.

Notes on the Household Identification Page

The household identification page is frequently referred to as the "cover page" and often is, in fact, the cover page of the household questionnaire. This is not desirable, since the covers of questionnaires are subject to a lot of physical strain as they are handled during the course of fieldwork and transportation. They tend to come loose, and if the information contained on this page is separated from the rest of the questionnaire, the rest of the questionnaire will be useless as it will not be possible to assign the proper sampling probabilities to the information collected. Moreover, if household-specific information, no matter how innocuous, can be read on the front of the questionnaire, this may make respondents skeptical of the interviewers' assurance that the information collected in the survey interviews will be confidential.

Thus the cover page should consist of a page of stiffer material emblazoned with the logo of the statistical agency or survey. The only piece of household-specific information that should appear on the cover page should be the household's unique identification code. This code should also be recorded on this form. However, to ensure that there are no double-recording errors, it may be better to record the code on the household identification page but to arrange for it to appear through a cut-out window on the cover page.

The household identification numbers, the name of the head of household, the household's address, and, sometimes, the household's geographical codes can be filled out on the household identification and control information page either in the survey office or by the supervisor before the interviewer takes the questionnaire to the household for the first time.

There is room on this form to note the number of attempts that the interviewer makes to contact the household. These include the interviewer's attempts to meet household members and persuade them to be interviewed, as it may take several visits to the dwelling before the interviewer finds someone home. If a contact is made, either a date for an interview may be made or a sitting may begin immediately. Only once the sittings begin does the interviewer need to begin to fill in the sittings grid. (The total number of trips that an interviewer makes to a household is the sum of the number of contacts from the contact list and the number of different visits to the household derived from the sittings grid.)

The labels for the administrative and sampling codes should be changed in accordance with the local nomenclature.

If the household has a telephone, the interviewer can enter the telephone number of the dwelling beside the address. It may later be possible for the survey supervisor to make phone calls to such households to double-check that the interview really took place and that key pieces of information were collected correctly or to clear up any confusion that may have been detected by the data entry software.

The household's ethnic group and religion are sometimes noted on the identification page. These are, of course, individual-specific traits; different household members may have different ethnic origins or religious affiliations. Thus, conceptually, this information should be on the roster page of the household questionnaire. However, in many countries, it is deemed impolite to ask direct questions about such matters, and often the interviewer can easily observe these characteristics without asking any questions. In such cases the information may be placed on the household identification page. The example page in Volume 3 is formatted this way to remind survey designers of this option. There should always be an "other" category in the list of ethnicities.

Notes on the Summary of Visits and Interviews

Most of the information on the grid is straightforward. Notes are provided here only as necessary.

Q4. It is important to equip interviewers with watches if it is likely that some interviewers will not have them.

Q7. The list of languages should be customized to fit the languages spoken in the country. It should include the languages into which the questionnaire was translated, other languages spoken by significant subgroups of the population, and an "other" category.

Q9. It is possible to include more details here. The ID code of the household members who are present can be recorded. Also, the list of nonmembers can be coded either by demographic characteristics (such as age or gender) or by the type of social relationship that the respondent has with the nonmember (such as kin, neighbor, or village elder).

Q10. Below the main grid is a space for the interviewer to mark whether a module is required for any particular household. If a household has no household enterprise, that box can be marked off on the row below the grid. This will make it easier for the interviewer to tell whether or not they are finished with a household by looking at the sittings grid rather than having to look through the whole questionnaire.

Q11. The interviewer should provide his or her assessment of the reliability of the information given in the interview in column 11. The assessment is qualitative but is based on detailed information that only the interviewer can have. The issue of what constitutes reliable and unreliable information should be discussed when interviewers are trained.

Q12. In column 12 the interviewer should note the number of textual comments he or she may have written on the comments page that pertain to a particular sitting.

Notes on the Characteristics of the Survey in the Primary Sampling Units Form

In cases in which one community questionnaire is filled out for each primary sampling unit, it is best to integrate this form (which gathers information on the costs specific to the primary sampling unit) into the community questionnaire. If several community questionnaires are filled out for each primary sampling unit, this sheet can be treated as a separate form. However, such separate forms are less likely to be filled out thoroughly or to be key-entered into computer files, since they are often not viewed by the survey team as one of the major tools of the survey. Thus, wherever possible, questions on costs specific to the primary sampling unit should be included in the community questionnaire rather than collected on a separate form.

Questions 1-4 may have to be modified to fit the fieldwork plan. According to the wording used here, for example, the teams are assumed to travel from one primary sampling unit directly to another without visiting the headquarters of the survey team in between. If this is not how the fieldwork is organized, these questions should be changed accordingly.

Similarly, the version of this form given in the third section of this chapter assumes that each team has the use of its own vehicle to ferry the interviewers to the various places within the primary sampling unit that they need to visit. If the interviewers have to travel by other means (such as by public transport, boat, motorcycle, bicycle, horseback, or on foot), the questions should be modified to fit these circumstances. This may need to be done carefully as different means may be used in different parts of the country, but the form must be relevant to all of these circumstances.

Notes on the Sampling Spreadsheet

The sampling spreadsheet is a byproduct of the first stage of sampling. (For details on how this works for LSMS surveys see Grosh and Muñoz 1996.) Each line in the spreadsheet corresponds to one primary sampling unit selected from the whole sample frame to serve as part of the sample. Some of the columns in the spreadsheet contain data straight from the sample frame whereas others are computed within the spreadsheet or come from other sources, as explained below.

Column A (PSU No.) is a short identifier for the primary sampling unit for the purposes of a specific household survey. It is convenient to use mnemonic, three-digit identifiers such as these rather than the long chain of "geocodes" that statistical agencies sometimes use to identify the units in the sample frame. In this case the identifier is a simple serial number within each stratum (for example, 101 to 125 for the 25 sampling units selected in stratum 1 and so forth), but it would be equally appropriate to use a straight serial number from 1 onwards to identify the primary sampling units across all strata.

Columns B and C are for the stratum codes and names.

Columns D to L are for the geocodes used by the statistical agency of the country surveyed. The situation depicted in the spreadsheet may be oversimplified because: sometimes the names of the hierarchical subdivisions differ between urban and rural areas; in some countries intermediate categories between "urban" and "rural" are used; and although larger geographic subdivisions usually have names and codes whereas smaller ones have only codes, the specific definitions of "larger" and "smaller" are country-specific. In the spreadsheet presented in Volume 3, space is given for the codes and names of the region and county; however, if smaller geographical subdivisions have names, these should also appear on the spreadsheet.

Columns M, N, and O are for recording the longitude, latitude, and altitude from the approximate center of the primary sampling unit as taken from the sampling cartography. This method is less precise than using global positioning systems in the field, but it still enables some approximate merging of these data with other sources of geographical data. These columns can be deleted if global positioning systems are used in the field.

Columns P to T are for recording the details of the first sampling stage, assuming that it is done with probability proportionate to size, which is almost always the case. In practice, columns P and Q would be headed "Number of households in the stratum" and "Number of households in the PSU" (or "Number of dwellings" or "Population" or whatever actual measure of size was used in the selection with probability proportionate to size). Note that column P (the size of the stratum) is a constant for all primary sampling units within a stratum. Column R (the number of primary sampling units selected in the stratum) is also a constant within the stratum.

Columns S and T are formulae. S = Q(R/P) is the probability of selecting the PSU in the first sampling stage, while T = 1/S is the contribution of the first sampling stage to the raising factor of households in the PSU (in other words, how many households in the total population the sampled households represent).

Columns U to Y record the details of the second sampling stage. U is the number of households actually recorded by the listing operation in the PSU.8 It should be completed by survey managers as the results of the listing operation come in to the survey headquarters. If the measure of size used in the first stage is the number of households, it may be useful to compute an auxiliary column with the ratio U/Q and use it to monitor the quality of the listing operation. If both the listing operation and the census that originated the sample frame are carefully carried out, U/Q should be a little larger than 1.0 almost everywhere (except in areas depopulated by natural disaster, war, or other unusual events). On average, U/Q should approximate the expected population growth rate for the country. If it measures less than 1.0 in a lot of primary sampling units, this may be the result of sloppy work during the listing operation.

The entries in columns V and W should come from the survey's data set. This underscores the importance of leaving room to record refusals and other kinds of nonresponse in the household identification page of the household questionnaire and of actually completing and key-entering the data from the household identification pages of all of the questionnaires, even those from nonresponding households. It may be useful to split column W into the various kinds of nonresponse recorded on the questionnaire. In most previous LSMS surveys the sample was designed so that the same number of households in each primary sampling unit was interviewed, which would thus make column V almost constant. In the example given in the spreadsheet (in Volume 3) the sample was designed so that 12 households per primary sampling unit were selected in the second stage, although this number could be different (for example, 16).

Columns X and Y are formulae. X = V/U is the probability of selecting each particular household in that primary sampling unit in the second sampling stage, while Y = 1/X is the contribution of the second sampling stage to the raising factor of households in the primary sampling unit.

Columns Z and AA are also formulae. Z = S × X gives the final selection probabilities, while AA = T × Y gives the raising factors for households in the primary sampling unit.

Columns AC, AD, and AE are the distribution of the sample across teams and throughout the year. AC records the time period during which the primary sampling unit should have been visited according to the original schedule. AD is the actual date of the first interview in the primary sampling unit, and AE is the date of the last interview in the area. Analysts will want to verify that the pattern of actual interviews does not depart too much from the prescribed pattern in cases in which the fieldwork is meant to be spread evenly over the year and across the country to eliminate seasonality effects. Survey administrators will want to note whether the expected pace of fieldwork was maintained in the field.

Notes on the Staff Information Form

Information about field staff is essential for analyzing how interviewers affect respondents' answers. Therefore, all field staff positions should be included on the staff information sheet. It may also be worth including nonfield staff here if the information is likely to have other administrative uses.

COLUMN A. Each staff member should have a unique staff number, which might be a number used in personnel records. Alternatively, a number may be assigned for the purpose of this survey, starting with 001 for the first staff member and continuing upward as more staff are hired. If a staff member leaves, his or her replacement should be assigned a new staff number rather than being assigned the number of the staff member who has departed; reusing numbers might be confusing for analysts.

COLUMN F. During fieldwork, staff may do jobs that are not their primary function. For example, anthropometrists may pitch in to help fill out community questionnaires or supervisors may conduct an interview or two to keep the team on schedule. By comparing the roles listed here with who actually fills out the various parts of the sittings grid, it is possible to determine how much of this substitution occurs. Evaluating this after the fact may yield conclusions that would be useful inputs for designers of future surveys regarding workloads and training needs.

COLUMN G. The levels of education and the terminology used will need to be tailored to the country of the survey. If interviewers in a given country have commonly had post-secondary training, for example, it may be worth noting the main disciplines from which interviewers are drawn.

COLUMN I. The terminology for contractual status should also be tailored to the country studied. The main distinction to be made is between the permanent staff of the survey agency and workers who are contracted only for the duration of one survey.

COLUMN J. The main ethnic groups prevalent in the country should be used.

3. This information can be recorded on the questionnaire in several ways. Ideally it would be done by an automated procedure in the central office to reduce transcription errors. For example, labels might be printed directly from the sampling files and then peeled and placed on the questionnaire. Of course, this assumes that both electronic files and facilities for printing labels are available, which is not always the case. Thus either the information from the sampling files can be transferred to the questionnaires in the head office before the questionnaires are given to the interview teams or the supervisors can transfer the information from the "assignment roster" onto the questionnaires. The codes for the region (rural/urban or agroclimatic zones) can also be placed on the questionnaire or drawn from administrative files.

4. Moreover, if geopositioning systems are used during the listing operation prior to the interviewing phase, it may become easier to locate the sampled households.

5. A person-module would be, for example, the administration of the employment module to one person. A single sitting is likely to cover the administration of several modules to one person, and a single visit by the interviewer to the household is likely to involve sittings with different individuals.

6. It is also interesting to study how these vary by the characteristics of the population. Moreover, since the assignment of interviewers is rarely totally random (for example, interviewers posted in rural areas will interview more farmers, while interviewers that speak an ethnic language will interview more respondents than average who speak only that language), it is important to control for the characteristics of the respondents when studying interviewer effects. However, no extra data about respondents are likely to be required since the typical control variables (such as region, age, sex, ethnicity, and occupation) are gathered in the course of administering the other modules.

7.
It is assumed here that interviewers are paid fixed monthly salaries (which has usually been the case in previous LSMS sur- Notes veys). If interviewers are paid on an interviexv-by-interview basis or if they receive bonuses for completing interviews fully or before The authors would like to express their gratitude to Martha some target date, that information should also be recorded. Ainsworth, Didier Blazeau, and Teresa Parsley Edwards for the assis- 8. In the listing operation, field staff should conduct an inven- tanlce provided. tory of all of the dwellings withini the boundary of each sampling 1. Occasionally a household has so many members that two unit. The resulting list of dwellings is the basis for selecting questionnaire forms are required to record the necessary informa- dwellings within the sampling unit. tion about all of the household members. in this case the two ques- tionnaire forms should have the same household ID code, but the References identifier codes for the people within them should be different; household members 1-12 would be in the first booklet, while Grosh. Margaret, and Juan Muinoz. 1996. A Afantalfor Planning and members 13-24 would he in the second booklet. Iunplenten ring thie Liieing Standards Aleasuireusenit Stady Survey. 2. If the sample is drawn in three stages rather than two, the Living Standards Measurement Study Working Paper 126. information is required on both primary and secondary sampling Washington, D.C.:World Bank. units.This chapter assumes that the sample is drawn in tevo stages, Hoyt, Cail Mitchell, and Frank J. Chaloupka. 1994. "Effect of but the examples can easily be extended to cover an additional Survey Conditions on Self-reported Substance Use." stage of sampling. Contensporary Economic Policy 12:109-21. 89 MARGARET GROSH AND JUAN MUN~OZ Scott, Christopher, Martin Vaessen, Sidiki Coulibaly, and Jane Translation or Schedules: An Experimental Study." International Verrall. 1988. 
"Verbatim Questionnaires Versus Field Statistical Review 46 (3): 259-78. 90 5 Consumption Angus Deaton and Margaret Grosh The living standards surveys aim both to measure and to understand living standards. Much of the focus is on poverty or deprivation-the lack of adequate living standards. Standard economic measures of deprivation are concerned with the lack of goods or the lack of resources-income or assets-with which to obtain goods. But it is always important to keep in mind that many of the most important aspects of deprivation go beyond purely material deprivation. Deprivation of health, deprivation of education, deprivation of freedom from crime, and deprivation of political liberty are all important-often more important than deprivation of material living standards. The role of development in freeing people from deprivation in a wide sense has been forcefully argued by Amartya Sen. (See Sen 1999 for a recent and comprehensive account.) Data from the living standards surveys, particularly from the modules on health and education, often help us take a broad view of poverty. Other important aspects of living standards, such as life expectancy, infant mortality, and the threat of crime, must be examined using other types of data. Nevertheless, measuring the material basis of living standards will always play an important role in the assessment of levels of living. This chapter explores ways to collect data for a consumption- based measure. The measurement of consumption has been a central establishment of the LSMS, such as RAND's objective of the LSMS program since the program's Malaysian Family Life Survey. However, many earlier inception in 1980 and has remained so throughout surveys, including the Indian National Sample Survey, some 50 surveys, all of which have been used to doc- had long used per capita household expenditure as the ument living standards and poverty. 
Although the pro- measure of living standards and the basis for measur- gram has always acknowledged that living standards ing poverty. have many dimensions and has taken care to measure The basic ideas underlying the measurement of them in its surveys, the narrowly economic aspect of consumption are straightforward. Nevertheless, there living standards in the program title was taken to mean are various practical complexities, many of which are not income, as had been the case in many previous discussed in this chapter. Although income and wealth surveys, but consumption.This consumption focus dif- are what enable people to obtain goods and services, ferentiated the LSMS surveys from some surveys in it is those goods and services themselves that directly developing countries that immediately preceded the generate economic well-being. The consumption 91 ANGUS DEATON AND MARGARET GROSH module of the LSMS survey is designed to measure Much of the literature on the design of surveys in the consumption of these items in some detail and in general is concerned with how best to estimate means the aggregate (with the aggregate being the total value and totals and can be seriously misleading when of consumption at suitable prices). At its simplest, the applied to LSMS-type surveys, which have different module collects data on how much people spend on concerns. This is particularly true of consumer expen- various goods and services. How best to gather such diture surveys, which are most often designed to col- information and in how much detail, how to deal with lect weights for consumer price indexes. goods that are not obtained through the market, and The first section of this chapter briefly reviews the how to obtain accurate data on prices are among the arguments for using consumption rather than income topics discussed in this chapter. both to measure living standards and to measure The LSMS surveys differ from many other house- poverty and inequality. 
The first section goes on to dis- hold surveys in that their primary concern is not the cuss the principal uses to which consumption data estimation of means or totals. The most important have been put, including the documentation of living concern of the LSMS surveys is documenting the dis- standards (still the central aim of LSMS surveys) and tribution of living standards-measuring poverty the illumination of a number of other important pol- (often, but not exclusively, the fraction of the popula- icy issues. Finally, the first section reviews some of the tion in the left tail of the distribution) and, to a lesser experience of more than 10 years of LSMS surveys in extent, inequality.The LSMS data are also used to illu- collecting consumption data. minate a wide range of policy issues from descriptive The second section discusses the data needed to tabulations to econometric modeling. construct a consumption-based measure of living stan- This emphasis on poverty and distribution must dards and reviews the design issues that affect both the constantly be kept in mind because it has implications cost of collecting data and the data's eventual accura- for the design of any LSMS survey. A survey that yields cy. The third section presents a draft consumption accurate estimates of average levels of income or con- module. The fourth section provides explanatory notes sumption may nevertheless do a poor job of docu- regarding the draft module. menting income and consumption among the poor or of estimating the inequality of incomes. For example, Policy Issues if people have difficulty remembering high-frequency purchases (for example, food) after a day or two, ask- In many cases consumption data are better than ing respondents about the purchases they made on the income data for measuring living standards. 
In addi- previous day will yield more accurate data than asking tion, consumption data have a number of important them about the purchases they have made over the analytical uses in their own right. previous week or month. If the main concern is to estimate mean expenditure for the population, it may Why Use Consumption to Measure Living Standards? be sufficient to collect data on the average of the pre- Although the LSMS surveys, like many surveys in devel- vious day's consumption since this figure would oping countries, give primary emphasis to consumption include all purchases both for those who purchased rather than income, a considerable number of other sur- nothing the previous day and for those who purchased veys concerned with well-being do not attempt to col- several days' supply. In contrast, the average of the pre- lect consumption data. Many of these surveys are in vious week or month's purchases will be biased down- industrialized countries, but the income focus is also ward if the longer recall period implies that some pur- standard in most surveys in Latin America. There are chases will be forgotten and thus not reported. both theoretical and practical considerations that affect However, for measuring poverty, the previous day's the choice of income or consumption, and the balance measure will not be sufficient because all those who in favor of one or the other may be different in different did not purchase anything would be counted as poor. circumstances. Thus it is useful to start by rehearsing the This means that to measure poverty it might be better main arguments for and against each measure. to ask people about purchases they made over the pre- vious week or month, despite the resulting downward THEORETICAL ISSUES AND IMPLICATIONS FOR bias. MEASUREMENT. Income and consumption are different 92 CHAPTER 5 CONSUMPTION concepts, not just two different ways of measuring the period; there is a substantial literature on seasonal same concept. 
Some economists prefer income as a poverty (see, for example, Sahn 1989). However, there measure of living standards because they follow a seems to be a general consensus that a year is a sensi- "rights" approach. According to this approach, ble reference period over which to judge people's liv- income, togethcr with asscts, measurcs the potential ing standards, even if this is inevitably a compromise claims on the economy of a person or family. Other that is too long for some purposes and too short for economists prefer to use consumption because they others.There is also a good deal of empirical evidence consider the level of living a measure of economic that even people in poor agricultural societies and input, and consumption data show the level of living people without the ability to borrow much can by measuring what people acquire. Both can be smooth their incomes within a particular year and per- defended as approximations to utility. The "indirect" haps over a series of years, so that consumption will utility function expresses welfare in terms of resources reflect living standards at least throughout one year (positively) and prices (negatively). In practice this and perhaps over a series of years. (For a review see usually means income or resources deflated by a price Bhalla 1979 and 1980, Musgrove 1978 and 1979, index: real consumption or income, not money con- Paxson 1992 and 1993, Wolpin 1982, and Deaton sumption or income. Whether consumption or 1997, chapter 4.) income is measured, measures of prices are needed If a year is chosen as the standard for assessing liv- whenever analysts wish to compare people who face ing standards but the survey in question can only hope different prices, which will be whenever they make to measure flows over a shorter period, consumption comparisons over time or space. data will yicld a more accurate estimate of living stan- Another consideration about whether to use dards than will income data. 
Most people do not income (including income from assets) or consuimip- receive inconie every day, and many do not receive tion is the time period over which living standards are income every season-or at least not an equal amount to be measured. At one extreme is a lifetime living every season. So while consumption over a week, two standard, measured either by average consumption weeks, or a month is likely to be a reasonable indica- over a person's lifetime or by the person's total lifetime tor of living standards over a year or over a few years, resources; apart from any bequests, these two concepts income will not be. If analysts are interested in meas- are the same. The issue here is that some poverty is uring averages, income variation will not matter much only temporary (for example, students are poor in the if the survey itself is spread over a year, since some short term but not over their lifetimes, while the eld- people's zero incomes will balance out others' high erly may be poor but have not been poor throughout seasonal incomes. However, analysts are usually inter- their lives) so short-term measures of inequality can ested not only in means-LSMS surveys are rarely the overstate lifetime inequality. One influential theory of instrument of choice for estimating mean income or consumption and saving is the "life-cycle hypothesis," consumption-but also in inequality and poverty, which asserts that a person's consumption at any age is which are sensitive to the tails of the distribution, proportional to his or her lifetime resources. If this is especially the lower tail. Gathering data on the previ- true, measuring consumption is not only useful in its ous month's income will overestimate inequality in own right but also provides an indication of lifetime annual living standards and, provided the poverty line resources. 
However, the evidence for this hypothesis is is below the mode of the distribution, will overstate controversial to say the least; for many people, the the fraction of people below the line. Although there promise of resources in the future will do little to pay are also randoimi irregularities and seasonal patterns in bills today. Policymakers have to deal with current consumption, they are typically smaller than those in poverty regardless of the long-term prospects of the income, because consumption is less tied to seasonal poor; saying "Don't worry, they will be OK later" and weather-related patterns in agriculture than is about poor children or "Don't worry, they've had their income. Even so, consumption measured over a refer- turn" about the elderly are not acceptable responses. ence period of less than a year is likely to overstate If a lifetime is too long a reference period, a day, a poverty and inequality. In addition, the overstatement week, and a month are all clearly too short. Arguments may not be constant over time if seasonal patterns can be made in favor of using a season as a reference change with time, because one year is different from 93 ANGUS DEATON AND MARGARET GROSH another-or over the long run, because agriculture In the United States, the Consumer Expenditure accounts for a shrinking share of household income as Survey costs about five times as much per household economies become richer. as the Current Population Survey, which is the main These arguments provide a persuasive case that, source for data on income, earnings, and employment. 
given the choice, (perfectly measured) consumption is Even so, the concept of expenditure-giving a more useful and accurate measure of living standards money in exchange for a good or service-is clear than is (perfectly measured) income.These theoretical both to interviewers and interviewees, whereas the advantages of consumption are likely to decrease as the concept of income, especially income from self- period over which it is feasible to gather data gets employment or own-business activity, is not. For own- longer. If it is feasible to visit households on many account workers in agriculture and small businesses, occasions throughout the year this will clearly capture their personal and business accounts are often hope- any seasonality in the household's income. Moreover, lessly entangled. Thus, in agriculture and elsewhere, if the survey has a panel element so that income can the only practical way to estimate income is to gather be averaged over a series of years, it makes little differ- data on all transactions-business as well as personal- ence whether income or consumption is measured, if and to impose an accounting framework on the result- one can be measured as accurately and as cheaply as ing information. This process is extraordinarily time- the other. consuming, and the results are subject to large margins of error. Such difficulties in calculating income are not PRACTICAL IssuEs. The choice between income and specific to developing countries; even in the United consumption is often determined more by practical States and Britain, the various surveys-the Current considerations than by theoretical considerations. 
In Population Survey and the Consumer Expenditure the United States poverty is assessed by income, not by Survey in the United States and the Family consumption; consumption cannot be used because Expenditure Survey in Britain-do relatively poorly the United States does not have a consumption survey in gathering data on income from self-employment. of adequate size and quality to permit the estimation (See Coder 1991 for the Current Population Survey, of the poverty numbers. In general, however, whenev- Branch 1994 for a comparison of the two U.S. surveys, er a new or reformed survey is being planned, the and Atkinson and Micklewright 1983 for the Family designers have to choose whether to collect data on Expenditure Survey.) The difference between devel- household income or consumption. Much hinges on oping and developed countries is that formal sector the relative costs and relative precision of the data col- wages and salaries are much less common in the devel- lection required. oping countries. Neither consumption data nor income data are The income of many households-particularly but easy to collect. For consumption, the need is for data not exclusively agricultural households-varies season- on total household expenditure on goods and servic- ally throughout the year. In these circumnstances, meas- es. As will be discussed in the next section, these usu- uring households' annual income (which is the mini- ally have to be gathered item by item. In some cases a mum amount of data needed to adequately determine substantial fraction of consumption does not come poverty and distribution) would require many visits to through the market, so imputations have to be made. the household or reliance on the ability of household In industrialized countries such as the United States respondents to remember their income from many and Britain, the detail and the associated time and months earlier. 
However, if consumption is smoothed effort of asking dozens or sometimes hundreds of over the seasons-and much of the literature already questions often make it seem relatively more attractive cited suggests that this is done in most households- to collect income data, especially in situations where consumption will vary less by season than income does. income comes from one or two sources (for example, It may also be possible to collect useful data on annual wages and pensions) that are easily recalled or for consumption without making multiple visits. which independent documentation exists. By contrast, It is generally thought that respondents are more consumer expenditure surveys are seen as among the reluctant to share information about their income and most "difficult and expensive surveys" to field in the (to an even greater degree) their assets than about their statistical system (McWhinney and Champion 1974). consumption. Thus they are more likely to lie about 94 CHAPTER 5 CONSUMPTION their income than about their consumption. In many describe other dimensions of living standards. For countries income is taxable, at least in principle, and it example, they collect data on health outcomes and may be hard for the survey interviewers to persuade facilities and on educational attainments and facilities. respondents that the information they give will not be These measures are frequently used not only to docu- passed on to tax authorities. Rich households may ment living standards but also to explore their deter- refuse to grant interviews to the survey team and, if a minants in studies, for example, of the relationship rich family does grant an interview, the respondent, between income, assets, and consumption, between who may be a family member or a servant, will fre- earnings and schooling, or between health status, quently be more knowledgeable about the household's income, and consumption. 
consumption than about its sources and levels of In addition to being used to construct a single income. Income from assets is likely to be particularly summary measure of the economic welfare of house- hard to capture because the ownership of assets is holds, the consumption data that can be collected in highly unequal, and the wealthy-who own the most LSMS surveys have other important uses, some of assets-are typically thought to be the least likely to which are discussed briefly below. For a much longer cooperate. Given that most of the survey interviews in account with applications see Deaton (1997). developing countries must be conducted in a semi- public place, respondents are often reluctant to state EVALUATING THE IMPACT OF PRICE, SUBSIDY, AND their wealth in the presence of relatives and friends. TAxATiON POLJCIES AND THE PROVISION OF PUBLIC These problems of measuring assets and asset income GOODs. Analysts are often concerned with the effects are likely more severe for measuring inequality than of price changes caused by changes in tax or subsidy for measuring poverty, since households below the policies or by fluctuations in world prices. poverty line typically have few assets. Consumption data are invaluable for assessing these effects-in particular, who gets hurt by a price What Analyses are Consumption Data Good For? increase and to what degree. Many developing coun- The consumption data that can be gathered in LSMS try governments collect a large share of their revenue surveys have a number of important analytical uses. through tariffs or through taxes on consumption, while simultaneously subsidizing the provision of MEASURING WELFARE. The policy importance of meas- many goods and services ranging from basic foods uring living standards is indisputable. 
Household budg- (such as bread, wheat, or rice) to transportation, et analysis has been used to document and to publicize health, and education.To a first approximation, a price poverty since the late 18th century.While consumption increase hurts consumers in proportion to the generally cannot measure noneconomic components of amount of the good that they purchase, so in order to living standards-health, access to education, political know the distributional effects of a price change, ana- freedom-it is the best measure of the economic com- lysts need to know who consumes what and where ponent of living standards. Formally, consumption is consumers are in the overall welfare distribution. For valuable as an approximation to utility, or "money- example, do transport subsidies benefit the poor as is metric" utility, according to which an indifference curve often claimed, or do they actually benefit people who is labeled by the amount of money at constant prices are much better off? Improving the quality of clinics that is required to reach it (see Chapter 5 of Deaton and or increasing the number of teachers in schools will Muellbauer 1980).Total household expenditure adjust- not help the poor if the poor do not use these clinics ed by a price index and divided by the number of peo- or attend the schools where these teachers are ple in the household (or by some more sophisticated employed. Even simple cross-tabulations can establish count such as the number of equivalent adults) is a results that, if not necessarily surprising, can resolve measure of the living standard of each member of the major policy controversies. (See Grosh 1997 on household and is the measure recommended in this kerosene pricing in Ghana and health care use in book for analyzing poverty and inequality. (See Deaton Guyana; see Deaton 1988 on rice pricing in and Zaidi 1999 for a more comprehensive discussion.) Thailand.) 
LSMS surveys also collect data on a wide range of More complex modeling of price reform requires other household and community variables that help estimates of how consumers respond to price changes, 95 ANGUS DEATON AND MARGARET GROSH so analysts can calculate dead-weight loss and the deflate these figures by the prices of the commodities tradeoffs made between equity and efficiency. Once in question (as obtained in the community or price again, data on consumption, income, and prices that questionnaires). Standard conversion tables are then are needed to estimate these responses (Newbery and used to convert quantities into a count of the number Stern 1987; Ahmad and Stern 1991; Deaton 1997, of calories contained in the food purchased-a meas- chapter 5). ure known as "caloric availability." Data on caloric availability have been used NUTRITION AND POVERTy LINES. There is a long tradi- together with data on household income or expendi- tion in development economics of counting calories ture to calculate Engel curves that plot the average and of defining poverty in terms of malnutrition, for household calorie consumption at each level of example by counting people whose caloric intake falls income or expenditure. Following work that was done beloxv some recommended standard. This tradition is in India over 25 years ago (see Dandekar and Rath misguided; nutrition is not an accurate measure of 1971a and 1971b and Government of India 1993 for welfare because people consume more items than a review), income or total expenditure poverty lines food and people often make tradeoffs between food are obtained by calculating the income or total expen- and other goods.Thus collecting data on calories con- diture level at which the calorie Engel curve gives the sumed is no substitute for estimating consumption. recommended calorie intake. 
Nevertheless, documenting nutrition is of considerable interest for other reasons.

In some surveys a household's calorie consumption is estimated directly by nutritionists who enter the household and observe what is eaten by each household member, either by weighing and measuring foods as they are consumed or by asking the members questions about their dietary intake during the previous 24 hours. It is possible to imagine a module of this sort that could be added to a multitopic survey, although, because it would be lengthy, it might displace other modules in the questionnaire as a whole. Swindale (forthcoming) provides guidelines on how to collect dietary intake data.

Many writers believe that dietary intake surveys are necessary to obtain accurate estimates of calorie intake (Bouis 1994; Bouis and Haddad 1992). However, dietary surveys also involve a number of difficulties. The survey techniques are invasive and may cause people to alter their behavior. A household's day-to-day consumption may vary enough to make a 24-hour recall period too short to yield accurate data with which to estimate poverty, yet longer periods may be too expensive or too invasive. A more common, albeit probably less accurate, way to count calories is the "indirect" method, which is most often used in expenditure surveys. Figures for the quantity of each good that the household has consumed can be obtained in two ways. The survey interviewers can ask direct questions about both the physical quantity consumed by the household and the household's expenditure on the good, or they can collect data only on expenditures and

If the calorie Engel curve has a relatively high slope, increasing household income will eliminate hunger relatively rapidly. If, as some recent writers have suggested, the elasticity of calorie consumption with respect to income is close to zero, economic growth alone will not eliminate hunger. This means that poverty can only be reduced by direct intervention, an approach which is closer to the basic needs philosophy. (See Behrman and Deolalikar 1987 and Bouis and Haddad 1992, who also argue that estimates are biased when caloric availability is used rather than direct dietary surveys, as well as a contrary position from Subramanian and Deaton 1996. A review is provided by Strauss and Thomas 1995.)

The demand analysis discussed in the previous subsection can also be applied to calories to calculate the effect of price changes (caused by, say, the elimination of subsidies on basic foods) on calorie intake (Laraki 1989).

INTRAHOUSEHOLD ALLOCATION AND GENDER BIAS. Expenditure data are an important tool for researching the allocation of resources within the household and for testing different models of how that allocation might work. In recent years many studies have found different outcomes for males and females, particularly boys and girls, within the same household. In some countries infant mortality is higher among girls than among boys, and in even more countries educational outcomes are worse for girls than for boys. Several scholars have explored the possibility of using data on household expenditures to cast light on these different outcomes for boys and girls, as well as to compare other pairings: adult women and adult men, the elderly and prime-aged adults, or widows and other household members. (For a fuller discussion see Chapter 24 on intrahousehold issues, as well as Deaton 1997, chapter 4.)

It is costly and time-consuming for surveys to collect complete data on every item consumed by every family member. In fact, this may be impossible for the many joint (or household public) goods that are shared by all household members. As a result, most multipurpose surveys, including the LSMS surveys, have collected household-level data on consumption and have made little effort to collect individual data. Nevertheless, there are some cases where consumption at the individual level can be inferred from household data. Such cases include health expenditures that are linked to an identified episode of illness on the part of one member or expenditures on men's clothing when there is only one man in the household. In some surveys data on expenditures have been collected using the diary method, in which each adult family member has been asked to keep a diary about his or her own expenditures, from which we can see who spends what in the household, even if not who consumes what. Even when data are collected by interviewers, it is probably possible to collect more individual data than has typically been collected in the past if the interviewer can find out who consumes how much of such obviously private goods as tobacco, transportation, clothing, or entertainment.

Even when individual-specific data are not collected, it is possible to examine the effect of household characteristics, including household composition, on the way households allocate their budgets. For example, it may be that household expenditures on food and children's clothing are higher when there are relatively more women in a household or when a large share of household resources are earned by, and thus putatively controlled by, women. There is also a developing literature (Bourguignon and Chiappori 1992; Bourguignon and others 1993; Browning and others 1994) that has identified sharing rules within the household. If some goods can be identified that are consumed exclusively by one group within the household or if analysts have data on who consumed how much of each good, it is possible to infer whether or not income is shared equally across the groups. Related to this is the examination of expenditures on "adult goods" (usually alcohol, tobacco, and adult clothing) for signs of gender bias in the treatment of children. Since the total household budget is not increased by the presence of children, parents typically reduce their expenditures on adult goods to make room for the costs of the children. If the parents cut back on their own consumption more for their sons than for their daughters, this is evidence of discrimination against the girls in the household. Surprisingly, analysts have consistently failed to find such differences, even in places where there is other evidence of bias against girls, such as differential infant mortality (Deaton 1997, chapter 4).

FAMILY STRUCTURE, CHILD COSTS, AND ECONOMIES OF SCALE. The most commonly used measure of living standards is household total expenditure per capita: total household expenditure divided by the number of household members. This measure, while convenient, ignores the fact that the needs of one household member differ from those of another household member, particularly between adults and children, and that there are likely to be some economies of scale in household size. Larger households, which usually include many children, are likely to benefit most from economies of scale. Thus measuring living standards by per capita total household expenditure almost certainly overstates the number of large households that are poor and understates the number of small households that are poor. In some countries, most notably the United States, there is a different (official) poverty line for each type of household; these lines embody both economies of scale and the different needs of adults and children.

There is a long history of economics studies that have attempted to use consumption data to derive the cost of living for families of different types by inferring equivalence scales across age groups and estimating the extent of economies of scale. If such calculations were feasible and credible, they would have a key advantage over dividing resources by the number of people in the household because they would take into account country-specific and local differences in the costs faced by different types of families. For example, it is often argued that children are relatively more expensive in rich countries than in poor agricultural societies. Unfortunately, all procedures for estimating equivalence scales are controversial, and many economists would argue that the task is misguided or even impossible. (See chapter 4 of Deaton 1997 for a discussion of both sides of the argument.)

Nevertheless, consumption data have a more limited but less controversial role to play in helping analysts check the implications of various models. Although all methods for measuring economies of scale or estimating equivalence scales must contain untestable identifying assumptions, most have stronger implications that can be tested using the data. The results can reveal a great deal about the plausibility of the models. For example, Deaton and Paxson 1998 used a number of LSMS data sets to show that the relationship between food expenditures and household size contradicts most of the obvious notions about how economies of scale might operate.

Without consumption data it is impossible to make any progress on the extremely important policy issue of how to factor differences in household size or structure into the assessment of household welfare. Until some agreed basis is established to correct for cost of living differences faced by households of different size and composition, there is no way to address such issues as the relationship between poverty and fertility or whether children are more likely to be poor than adults or the elderly.

CREDIT AND SAVING. A traditional use of expenditure data in analysis is to combine these data with income data to derive estimates of saving at the household level. The role that saving plays in economic development has always been an important intellectual issue, and both public and private saving are rarely absent from the policy debate. Unfortunately, the generally poor quality of data on savings collected through household surveys has limited their contribution to this debate, except perhaps in countries like Taiwan, where household saving rates are very high. Microeconomic income data have typically been poorly measured, and even if consumption measures tend to be more accurate, the estimate of saving is the relatively small difference between two large and inaccurately measured numbers, and as such, may be mostly measurement error. It is not clear that having such measures of saving is worth the effort of obtaining them. To the extent that it is the owners of small-scale, household-based activities who are doing the saving, it is even more difficult to measure saving because data on income from these activities are extremely hard to measure accurately.

Using data from an expenditure module to contribute to a measure of credit use may not be such a daunting prospect. Supplier credit constitutes a large share of households' total use of credit. A convenient way of eliciting information about supplier credit is to add questions in the consumption module about purchases on credit (see Chapter 21).

Do LSMS Surveys Collect Accurate Consumption Data?

In most developing countries there are no independent estimates of poverty and inequality against which LSMS data can be checked. However, it is possible to compare estimates of per capita consumption from the surveys with similar estimates from the National Income and Product Accounts (NIPA). Although the main purpose of the LSMS is not to measure means, if the LSMS data are enormously different from NIPA estimates, public confidence in the survey is likely to erode, particularly public confidence in the survey's estimates of consumption. While it is important to make these comparisons, it should not automatically be assumed that the NIPA estimates are correct and that discrepancies are wholly due to errors in the survey data. The quality of NIPA accounts varies widely across the world and, while some items of consumption are well estimated (for example, when consumption is from imports and there is good recordkeeping at the border), the data on other items are often no more than educated guesses (Srinivasan 1994). Even when this is not the case, there are frequently important differences in the definition of consumption between the NIPA and the household survey. If these definitional differences are not corrected for, the comparisons may not be valid. (For evidence from the United States see, among others, Gieseman 1987 and Branch 1994.)

Table 5.1 presents a number of LSMS survey estimates alongside their NIPA equivalents. (As far as the authors of this chapter are aware, LSMS survey data were not used to construct any of these national accounts.) This comparison should not be taken too seriously, for two reasons. First, no detailed investigation of NIPA practices for the countries in the table has been undertaken, so there is no rigorous information about the accuracy of their estimates. Second, the survey numbers were taken from the various survey reports rather than from the original microeconomic data (which would have been prohibitively expensive). As a result, there may be some incomparabilities in their calculation.
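As a rough illustration of the comparison reported in Table 5.1, the sketch below forms the LSMS/NIPA ratio for three rows and takes the median of the ratio column. The figures are copied from Table 5.1 as read from the printed table; the function and variable names (`lsms_to_nipa_ratio`, `published_ratios`) are hypothetical and are introduced only for this example. It is a minimal sketch and makes none of the inflation adjustments described in the note to the table.

```python
from statistics import median

# Three rows as read from Table 5.1:
# (country, LSMS annual mean per capita expenditure, NIPA annual per capita consumption)
rows = [
    ("Ghana", 56_645, 45_568),
    ("Tanzania", 129_708, 37_718),
    ("Venezuela", 69_684, 142_104),
]


def lsms_to_nipa_ratio(lsms_mean: float, nipa_mean: float) -> float:
    """Hypothetical helper: survey mean divided by national accounts mean."""
    return lsms_mean / nipa_mean


# Ratios rounded to two decimals, as in the table's ratio column.
ratios = {
    country: round(lsms_to_nipa_ratio(lsms, nipa), 2)
    for country, lsms, nipa in rows
}

# The ratio column of Table 5.1 for all 26 surveys, in table order.
published_ratios = [
    0.56, 1.29, 1.15, 1.14, 0.91, 0.91, 1.24, 1.70, 0.90, 0.96, 0.97, 0.94, 0.96,
    0.99, 0.91, 2.50, 1.08, 1.76, 1.13, 0.73, 0.64, 0.62, 0.84, 2.15, 3.44, 0.49,
]

# With an even number of surveys, the median is the mean of the two middle values.
median_ratio = median(published_ratios)

print(ratios)
print(median_ratio)
```

A ratio above 1 means the survey mean exceeds the national accounts figure, and a ratio below 1 means the opposite. Recomputing a ratio from the rounded table entries will not always reproduce the printed ratio exactly, since the printed ratios may have been calculated from unrounded or adjusted figures.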
Table 5.1 LSMS and NIPA Estimates of Average Per Capita Consumption, Selected Surveys

| Country | Dates | Currency | LSMS annual mean per capita expenditure | NIPA annual per capita consumption | Ratio of LSMS/NIPA | Sources |
|---|---|---|---|---|---|---|
| Bulgaria | 5/95-7/97 | levas | 50,436 | 90,021 | 0.56 | Authors' calculations from data on LSMS website; staff estimates |
| Côte d'Ivoire | 2/85-1/86 | CFA francs | 237,853 | 184,935 | 1.29 | Grootaert 1993, p. 30; IMF 1995 |
| Côte d'Ivoire | 2/86-1/87 | CFA francs | 223,905 | 194,554 | 1.15 | Grootaert 1993, p. 30; IMF 1995 |
| Côte d'Ivoire | 3/87-2/88 | CFA francs | 216,965 | 190,032 | 1.14 | Grootaert 1993, p. 30; IMF 1995 |
| Côte d'Ivoire | 5/88-4/89 | CFA francs | 173,072 | 190,203 | 0.91 | Grootaert 1993, p. 30; IMF 1995 |
| Ecuador | 6/94-9/94 | sucres | 2,032,560 | 2,230,392 | 0.91 | Lanjouw and Lanjouw 1996, table 3; IMF 1996 |
| Ghana | 9/87-8/88 | cedi | 56,645 | 45,568 | 1.24 | Glewwe and Twum-Baah 1991, p. 17; IMF 1995 |
| Guyana | 1/93-11/93 | Guyanese dollars | 91,602 | 53,750 | 1.70 | World Bank 1994a, p. [illegible]; Baker 1996 |
| Jamaica | 8/88-9/88 | Jamaican dollars | 4,700 | 5,210 | 0.90 | World Bank 1996, p. 28 |
| Jamaica | 11/89-3/90 | Jamaican dollars | 6,304 | 6,568 | 0.96 | World Bank 1996, p. 28 |
| Jamaica | 11/90-4/91 | Jamaican dollars | 7,616 | 7,869 | 0.97 | World Bank 1996, p. 28 |
| Jamaica | 11/91-2/92 | Jamaican dollars | 10,384 | 11,092 | 0.94 | World Bank 1996, p. 28 |
| Jamaica | 8/92-3/93 | Jamaican dollars | 16,998 | 17,718 | 0.96 | World Bank 1996, p. 28 |
| Jamaica | 11/93-3/94 | Jamaican dollars | 23,408 | 23,684 | 0.99 | World Bank 1996, p. 28 |
| Jamaica | 11/94-1/95 | Jamaican dollars | 32,712 | 35,819 | 0.91 | World Bank 1996, p. 28 |
| Kyrgyz | 10/93-11/93 | som | 2,273 | 907 | 2.50 | World Bank 1995a, p. 60 |
| Morocco | 10/90-11/91 | dirham | 6,870 | 6,384 | 1.08 | World Bank 1994b, volume II, annex I, table 2 |
| Nicaragua | 2/93-6/93 | cordobas | 4,079 | 2,312 | 1.76 | World Bank 1995b, volume II, p. 46; IMF 1996 |
| Pakistan | 1/91-12/91 | rupees | 6,835 | 6,037 | 1.13 | Lanjouw and Lanjouw 1996, table 4; IMF 1995 |
| Peru | 7/85-7/86 | intis | 4,616 | 6,359 | 0.73 | Glewwe 1987, p. 9; IMF 1995 |
| Peru | 10/91-11/91 | new soles | 750 | 1,178 | 0.64 | Webb and Baca 1993, p. 266; IMF 1995 |
| Peru | 7/94-8/94 | new soles | 2,190 | 3,539 | 0.62 | Authors' calculations from data on LSMS website, 1996; IMF 1996 |
| Romania | 4/93-12/94 | lei | 1,126,558 | 1,348,055 | 0.84 | World Bank 1987, annex I, table 4; National Commission for Statistics |
| Russia | 10/93-2/94 | rubles | 1,071,312 | 497,512 | 2.15 | Poiey 1996; IMF 1996 |
| Tanzania | 9/92-11/93 | shillings | 129,708 | 37,718 | 3.44 | World Bank 1995c, p. 30 |
| Venezuela | 1/92-12/92 | bolivares | 69,684 | 142,104 | 0.49 | Scott 1994, p. 15; IMF 1995 |

Note: Adjustments were made for inflation as follows. If the country had less than 15 percent inflation and the survey period covered the whole year, no adjustment was made. If the country had less than 15 percent inflation and the survey period covered only part of the year, the survey figures were adjusted to correspond to midyear prices. If the country had greater than 15 percent inflation, the monthly consumer price index was used to adjust the NIPA numbers to the month for which the survey data are priced.
Source: Authors' summary.

The table shows average per capita consumption for 26 LSMS surveys. The ratio of the LSMS to the NIPA estimate has a median of 0.96. Though these summary measures indicate an impressive consistency between the survey and NIPA estimates, there are large discrepancies for some countries and years. Nevertheless, the survey estimates provide no evidence that expenditures are generally understated, defying the common belief among survey experts (backed up by substantial literature) that the understatement of expenditures is the major problem with consumer expenditure surveys in industrialized countries.

Looking at some of the cases with large discrepancies between the household survey data and the NIPA estimates illustrates the difficulties involved in making comparisons between these surveys. Several of the countries surveyed were undergoing major economic changes at the time. For example, the som had been introduced as a national currency just five months before the Kyrgyz survey was carried out in October 1993.
While the ruble had lost its status as 99 ANGUS DEATON AND MARGARET GROSH legal tender in the Kyrgyz Republic, it was still used in Venezuelan survey, and, as a result, consumption was the Republic's neighboring countries, with which it probably underestimated (see the next section). Why had substantial formal and informal trading relation- the Peruvian surveys' consumption estimates were ships. In this rather chaotic situation, inflation rose to substantially lower than those of NIPA is a puzzle, 772 percent per year, and the dollar became a de facto although it may have been that the recall periods used unit of account-the most reliable store of value and were too long given prevailing rates of inflation. Yet sometimes the unit of transaction for large purchases the sample for the 1991 LSMS survey in Peru omitted or purchases of imports. The planners of the Kyrgyz some sizable rural areas, which means that the survey survey had been perplexed about whether to use the estimate should have been higher, not lower, than the som, the ruble, or the dollar as the survey's unit of NIPA estimate. account and about what would constitute reasonable recall periods, especially for nonfood items. The Data Requirements of a Consumption The same changes that complicated surveying also Survey and How to MeetThem complicated national accounting in the Kyrgyz Republic.The national statistical office had just begun This section discusses the data needed to obtain a to calculate the international NIPA numbers rather consumption-based mneasure of livinlg standards as well than the Gross Social Product numbers that used to be as to analyze the other policy and research issues out- calculated in the Soviet Union. 
With such complica- lined in Part 1.To the extent that this section focuses tions in the measurement of both numbers, it is not on the main purpose of this chapter-the measure- surprising, nor even particularly alarming, that the sur- ment of a single aggregate for consumption-the dis- vey data are not very close to the NIPA calculations cussion is driven less by what to measure than by how for the Kyrgyz Republic. Similar (though somewhat to measure it. In this respect, this chapter differs from less dramatic) issues plague the comparisons for many of the other chapters in the book. This section Russia, Romania, and Bulgaria, also contributing to discusses conceptual issues, the measurement of prices, very high discrepancies between LSMS and NIPA and the design issues that have loomed large in analy- estimates. ses of LSMS data and in the previous literature. An anecdote about Guyana further illustrates the Thereafter, the section presents the current best prac- problems of comparing household survey data with tice on each of the issues, based in part on the con- NIPA estimates. When the first tabulations were sumption literature but also on LSMS experience- becoming available from the survey, they yielded a and results from experimental surveys that were mean per capita expenditure about twice the amount specifically conducted to study points discussed here. estimated in the official NIPA figures. This caused This task has been a difficult one, and there are some consternation among the team working on the several important issues on which thcre is little to survey analysis. However, within hours, they discov- report. Even for consumer expenditure surveys in ered that an effort was concurrently being made to industrialized countries, which have been extensively adjust the NIPA for various flaws and biases. 
The new documented and which have featured a good deal of estimate for NIPA xvas within $30 per capita of the experimentation, the literature has not produced a sat- survey data (van der Gaag 1994). isfactory synthesis between theory and practice. The The teams working on the surveys in both literature is difficult to find and is scattered across var- Tanzania and Nicaragua also noticed that the estimates ious disciplines, including economics, marketing, psy- of consumption from their surveys were much higher chology, sociology, and statistics. Much of it is con- than the estimates in the national accounts. However, tained in poorly catalogued government reports and they were not disturbed by this, because national conference proceedings, rather than in academic jour- accounts in those countries had a reputation for being nals. Even so, there has been a good deal of recent inaccurate (Tsoflias 1996; Scott 1997). progress in understanding how to measure consump- As can be seen in Table 5.1, LSMS estimates for tion, particularly from interactions between survey Venezuela were lower than NIPA estimates. In this statisticians and cognitive psychologists.There are now case, it appears to be the survey that was wrong.A very several topics, reasonably well understood by analysts, short list of consumption items was used in the in which conclusions can be extrapolated from previ- 100 CHAPTER 5 CONSUMPTION ous experience with some confidence (see Sudman and typically lower for surveys in much of western and others 1996 and the 1991 volume edited by Europe. As a result, if the difficulty of surveying Biemer and others). The discussion presented here will wealthy households is one factor that causes average draw on this work. consumption to be understated in industrialized Even so, it should be kept in mind that there are countries, there is likely to be less underestimation many design issues where there is evidence of prob- from surveys in poor countries. 
While the high lems but little understanding of their causes nor solid response rates are good in themselves, those who recommendations for solutions. As Sudman and others choose to respond will not necessarily be attentive and (1996) point out, "The theoretical basis of interview- cooperative, say, in terms of tolerating long question- ing is still less rigorously developed than the theoreti- naires or taking enough time to try to remember the cal basis of sampling."Thus the wording and design of information they are asked to provide. questionnaires has largely remained an art governed by The LSMS program of surveys has contributed in-house tradition and personal experience. In his use- little to the methodology of measuring consumption. ful discussion of the sources of measurement error in Systematic experiments have been conducted only expenditure surveys, Neter (1970) gives the following recently and are still largely not written up in places typology of (nonsampling) errors that remains relevant easily available to the community of survey today (with applicability that goes well beyond con- researchers. The emphasis on collecting data rather sumption data): than on furthering methodology may be inevitable d Recall errors associated with the fading of people's given that the countries and country departments memories. (within the World Bank) that fund the surveys are * The "telescoping" of reported events by incorrect more interested in increasing their understanding of dating. development policies than in increasing survey know- * Reporting errors associated with respondents being how. Nonetheless, it is lamentable that the LSMS pro- overwhelmed either by the length of the survey or gram has not done more in this area and desirable that by the number of items covered. it should do more. Good survey practice should * "Prestige" errors-in other words, misreporting include the continual evaluation of methods, just as due to various social pressures. 
good social policymaking includes evaluation of the * Conditioning effects from being in the survey. impact of government programs. The LSMS program * Respondent effects, in which the respondent's has been remiss by not more strongly supporting identity affects the answers he or she gives. investigations into survey methodology. * Interviewer effects. Even while maintaining its principal focus on data * Effects associated with the design of the instrument. production, the LSMS could make useful contribu- Another nonsampling error that could be added to tions to survey methodology in four areas. First, it Neter's list is biases in the data due to nonresponses or would be helpful and not too costly for the LSMS sur- to the use of an inadequate sampling frame. All of vey teams to produce fuller documentation of their these issues have been extensively discussed in the sub- pretests of questionnaires, including the different sequent literature, but only for the first two-fading options that were tested, the process and lessons of the memory and telescoping-are the causes and treat- field test, and the reasons for making the choices that ments reasonably well understood 30 years later. were made about the questionnaire. A great deal is The literature on expenditure surveys in develop- learned in this process and never made systematically ing countries is even thinner.While findings from rich available to others.The record might include taping of countries often carry over to poor countries, there is interviews and debriefing of respondents about the always reason to be cautious. For example, response interviews. rates in LSMS surveys are very high-nearly always Second, by more rigorously collecting and dis- higher than 80 percent, and often closer to 100 per- seminating metadata on the process of interviewing (as cent. 
This is higher than response rates for surveys in described in Chapter 4), analysts will be able to sys- many developed countries; response rates are 85 per- tematically study survey methods, costs, and quality. cent for the U.S. Consumer Expenditure Survey, about Third, more controlled experiments with alternative 70 percent for the British Family Expenditure Survey, modules-such as those that were set up or analyzed 101 ANGUS DEATON AND MARGARET GROSH as part of the background research for this chapter- the estimates must be converted to real terms by should be done. These are particularly appropriate in adjusting them by a price index to account for differ- countries where the new survey will use a different ences in prices among different regions or interview consumption module than has been used in past sur- dates. It is important to note that this accounting veys. In such cases the experiments would not only should be done after the data are collected because it increase knowledge about measurement of consump- is neither necessary nor advisable for the respondent to tion but would also allow the host country to make understand the economic concept of consumption (or adjustments in comparisons between its older surveys of income). The questionnaire should be designed and the new one, so that the differences in results due around items familiar to the respondent-typically, to method could be identified. cash flows or flows of goods-while gathering enough Fourth, although the consumption modules used information to allow total consumption to be calcu- in most surveys are increasingly standard, there is lated. However, consumption questionnaires often col- much to be learned from talking through design issues lect additional data on cash flows that are not part of in "cognitive laboratories" in survey organizations (see the economist's definition of consumption but that are Sudman and others 1996). 
In these sessions, potential of interest in themselves and that the respondent sees respondents are brought into a laboratory, asked sam- as outlays of cash similar to purchases of goods. Such ple questions, and then de-briefed on how they inter- cash flows include taxes, contributions to savings preted the questions and how they went about accounts, or loan repayments. answering them. The results of these sessions are then To measure welfare accurately, the consumption used to modify the questions in the module as neces- concept must be comprehensive. All goods and servic- sary. Sudman, Bradburn, and Schwarz (1996) wrote, es that contribute to people's standard of living need "In questionnaire design, we strongly recommend the to be included in the measure, which can be thought use of think-aloud interviews for determining what of as a practical approximation to an indirect utility respondents think the questions mean and how they function or money-metric measure of welfare. While retrieve information to form a judgment." it is often tempting-and economical-to collect data on only a subset of consumption (or sometimes even What Consumption Data are Needed? a single good or group of goods, such as housing or The measure of living standards that analysts wish to food), the relationship between the part and the whole construct is a real value of total household consump- can vary a great deal from one household to another tion on a per capita or per equivalent basis. Thus they and from one place or time to another, so that rank- need to have available information on three things: ings or living standards obtained using the shortcut consumption, household size (including-for the measure may not be universally valid.A good example equivalence scales-the age and sex of household comes from the spatial differences in relative prices members), and prices. 
Data on household size (and age that cause people to substitute cheaper goods for rela- and sex of household members) can be gathered in the tively more expensive goods. Poor urban dwellers household roster (see Chapter 6); consumption and must often live in poor housing in order to have access prices are discussed here. to income-earning opportunities in the city, but the A measure of total household consumption is standard of their housing will understate their overall built up from several components. First it is necessary standard of living. to add up all reported expenditures on individual As will be seen in the following sections of this goods and services or groups of goods and services (for chapter, there is not a clearly "right" or "wrong" way a fuller discussion see Deaton and Zaidi 1999).Then a to resolve many issues about how to measure con- value for consumption that does not go through the sumption. Rather, there is a range of good practice market-in other words, consumption out of home techniques and not enough empirical evidence about production or in-kind received from employers- which is best. However, it is true that the estimates of must be added in. In countries where households hold consumption are sensitive-sometimes markedly, significant stocks of goods, particularly expensive sometimes only slightly-to which method is used to durable goods, it is necessary to correct for the differ- formulate the estimate. In addition, survey designers ence between consumption and expenditures. Then will often wish to ensure that the new survey data are 102 CHAPTER 5 CONSUMPTION comparable with previous survey data in that country, durable goods (and in some cases for stocks of grain or which is a powerful argument in favor of whatever fuel), consumption should be linked to stocks rather method was used in all (or most) of those previous than purchases; thus the submodule that deals with surveys. 
However, this may sometimes conflict not durable goods needs to compile a list of the house- only with the interests of accuracy and best practice hold's durable goods. Some sort of consumption flow but also with standards that would allow the survey needs to be imputed from this list.To impute the con- data to be compared with equivalent data from other sumption flow sensibly, analysts need to know both countries. Comparability over time within a country is the age of the good and its original (and perhaps cur- useful for monitoring poverty, which often takes rent) value. In the case of housing that has no adequate precedence over other considerations. In addition, rental market, analysts need to know any characteris- international agencies and researchers value interna- tics of this good that can be used to impute its rental tional comparability, and because they often provide value. Of course, such imputation is at best a hazardous the funds or technical assistance for surveys in devel- undertaking in countries where there are few rental oping countries, this consideration is frequently influ- units to judge by, and the quality of the resulting data ential. Even so, the past survey practices in a given may not be worth the effort to collect them. Great country may be far from sensible or standard; when care must also be taken to avoid erroneous interpreta- this is the case, it may be better for a new survey to be tions of the results in cases where such imputations the first in a new series of consistent and potentially have an important effect on the total consumption comparable surveys than for it to replicate a flawed measure or on the welfare rankings of households. If method of measuring consumption. 
At the very least, there is no rental market and possibly only a limited experiments should be conducted to test whether the housing market, an imputed rental value may overstate previous surveys actually collected the data they pur- the value of the housing to its inhabitants. Particularly ported to collect. Comparability of nonsense is no in an emergency, it may be hard if not impossible for great virtue. them to turn this imputed value into urgently needed For measuring welfare, consumption is ultimately cash. It is unwise to let policy decisions rest on often a more useful measure than expenditures (purchases). arbitrary and contentious imputations. For most perishable goods it is safe to assume that a The policy issues discussed in the first section can person's or household's consumption is closely tied to mostly be analyzed with data that are the byproduct of their purchases. A kilo of tortillas or a bunch of the need to construct an estimate of total household bananas must be consumed soon after they are pur- expenditure. There are, however, some exceptions chased; for such goods, expenditure and consumption where more data will need to be collected to allow will approximate each other over a short period. Even analysis of a particular issue. for less perishable commodities, some averaging across First, the decision about the level of disaggrega- goods may occur in a fairly short period of time. A tion at which to collect the data must be guided by the person may buy a pound of coffee one week and con- needs of analysts to have data on specific items of sume it over a month, but the next week he or she expenditure. For example, an analyst might wish to may purchase a pound of sugar that will also last for a have data on goods of particular nutritional signifi- while, the next week he or she may purchase a bag of cance, such as rice or milk. Likewise, when different flour, and so on. 
In the case of major durable goods, expenditures and consumption are not closely related in the short run and household expenditures on durable goods will be a poor guide to the consumption of durable goods. (In some cases, where grains can be stored for substantial periods of time, the same may be true for goods that are not conventionally classified as durable. Rice or dried pasta may be stored for a long period of time, so their consumption and the expenditures made to purchase them may deviate considerably.) For major

goods are taxed or subsidized at different rates and analysts want to use the survey data to investigate tax reform, the different goods must be distinguishable from each other. If flour is subsidized in the country where the survey is to be fielded, then it would be a good idea to include a separate question on the consumption of flour rather than including it in a broader question about "staples" or "flour, rice, and cornmeal." Where analysts are concerned with the relationship between consumption and the environment, it is necessary to distinguish consumption items that were gathered or hunted from consumption items that were grown at home or bought in the market.

Second, analysts may wish to collect data that are disaggregated so as to yield information about intrahousehold allocations, even if such disaggregation is not required to estimate the total of all expenditures.

case there may be marked regional variation in housing costs).

When there is no other adequate price information, data must be collected in the household survey. This may be done at the household or community level.
Items that are exclusively consumed by different groups are an obvious example; for example, rather than having a single item for clothing, data on men's and women's clothing and footwear can be collected separately from data on children's clothing. For at least some goods it is also possible to include questions about who consumed what. Tobacco and alcohol are consumed individually, not jointly, and it is probably possible to obtain reasonable estimates of the individual-level consumption of these products. Other examples of goods that are consumed individually are tickets for entertainment and transportation. One difficulty is that these may be the items for which the best-informed overall respondent for the household gives the least accurate answers. For example, the homemaker may know a lot about the household's food consumption but relatively little about individual members' expenditures on alcohol, tobacco, or entertainment. This problem can be dealt with by conducting individual interviews on consumption of some goods, a feasible but costly solution.

Third, some of the research topics, such as the calculation of calorie availability or the estimation of price elasticities, ideally require data on quantities of individual items at the household level, whereas the welfare measure makes do with expenditures, deflating when necessary by a price index. The issue of quantities will be addressed in the next subsection.

Collecting Price Data

At the household level, the survey may be designed to ask each household how much was paid for each unit of an item purchased, the quantity of the good purchased, and total household expenditures on the good. When households report physical quantities (such as kilos, sacks, or numbers), it is possible to divide their reported expenditure by the reported quantity to yield a price, or more precisely a unit value, for each good; these values can be weighted together to create household-specific price indexes.

In most previous LSMS surveys, designers opted to collect price and quantity data from local markets in a community-level questionnaire, with few LSMS surveys collecting such information from households. However, many other surveys around the world collect quantity information from households, including the Indian National Sample Survey, the Pakistan Household Income and Expenditure Surveys, and the Indonesian National Socio-Economic Survey (SUSENAS). While the LSMS survey in Vietnam did collect price data from local markets, it also collected quantity data at the household level. This was also the case for LSMS surveys in Brazil, Ecuador, the Kyrgyz Republic, Nicaragua, and Russia. The LSMS surveys in Pakistan, Bulgaria, and Ecuador included questions about expenditures per unit. Apart from the Pakistan survey, where the unit cost data had serious problems (probably for local reasons, almost certainly from inadequate interviewer training), so far there has been no systematic evaluation of household-level price and quantity data collection.
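The unit-value calculation described above (reported expenditure divided by reported quantity, with the results weighted together into a household-specific index) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure of any particular LSMS survey: the item names, the reference prices, and the use of the household's own budget shares as weights are all hypothetical choices made for the example.

```python
def unit_values(purchases):
    """Return {item: expenditure / quantity} for items with a positive quantity.

    `purchases` maps an item name to (expenditure, quantity). The layout and
    the item names are hypothetical.
    """
    return {
        item: spend / qty
        for item, (spend, qty) in purchases.items()
        if qty > 0
    }


def household_price_index(purchases, reference_prices):
    """Budget-share-weighted average of unit values relative to reference prices.

    Weighting by the household's own expenditure shares is an assumption;
    the text only says that unit values "can be weighted together".
    """
    uv = unit_values(purchases)
    items = [item for item in uv if item in reference_prices]
    total_spend = sum(purchases[item][0] for item in items)
    if total_spend <= 0:
        raise ValueError("no usable expenditure data for this household")
    return sum(
        (purchases[item][0] / total_spend) * (uv[item] / reference_prices[item])
        for item in items
    )


# Hypothetical household: (expenditure in local currency, quantity in kilos).
household = {"rice": (120.0, 40.0), "sugar": (30.0, 5.0)}
reference = {"rice": 2.5, "sugar": 6.0}  # assumed reference prices per kilo

uv = unit_values(household)
index = household_price_index(household, reference)
```

In this made-up case the household pays 3.0 per kilo of rice against a reference price of 2.5, so its index is above 1. Note that a unit value mixes price and quality, so a high index may reflect better-quality purchases rather than higher prices.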
The conversion of money values to real expenditures requires the construction of a price index; to construct a price index, price information must be available. This price information must not only capture temporal variations in prices but also accurately represent the price level faced by each of the households in the survey sample. While adequate price indices may already be available in some countries, such cases are few and far between because many price surveys exclude rural areas. Urban prices are only useful for nationwide analysis when spatial price variation is limited, for example, where there is a good transportation network and markets are well integrated (although even in this

There are several advantages to collecting price data by asking household respondents about their expenditures and the quantities of their purchases. This procedure yields measures of physical quantities that are useful in their own right for such purposes as computing calorie availability or estimating the elasticity of quantities relative to changes in taxes or subsidies. It also yields the raw material for a price index for each household without requiring the formulation of assumptions about where the household buys its goods. A price index constructed in this way, however, covers only those goods (typically but not exclusively foods) for which quantity data can be well defined in the questionnaire. Such price indexes are automatically tailored to the consumption patterns of the households in the survey, so there is no discrepancy between the price data and the goods that people buy. Having price data for individual goods at the household level is also useful for analyzing demand patterns and policy issues, such as price reform, that depend on the results of demand analysis. One of the authors of this chapter has satisfactorily matched household data on unit costs from the Indian National Sample Survey to the prices that the Government of India regularly collects from local markets around the country, at least in cases where the local markets are located near the survey households. Not only do these data match across districts, but the unit values from the survey reflect the appropriate seasonal patterns of agricultural prices (see Deaton 1997, chapter 5).

There are also disadvantages to collecting price data this way. Unit values are not prices, and they vary even among households that purchase from the same sources, because better-off households typically buy higher qualities even of fairly homogeneous commodities like rice or sorghum, and certainly so for heterogeneous categories such as meat. One way this problem can be dealt with is by averaging the unit values over all the households in a primary sampling unit; the Indian evidence quoted above suggests that averaged unit values are not likely to be misleading as indicators of price. A more serious problem is that, with a few exceptions such as fuels and tobacco, it is not easy to define physical units for goods other than foods. Gathering data on the price of food may be enough in some cases (particularly in very poor economies where food makes up two-thirds or three-quarters of the budget of most households), but this is clearly not true in general. In many past LSMS surveys, even the definition of physical units of some food items was unclear or subject to error. For example, the respondents may not have understood whether to report the price they paid per egg or the price they paid for a dozen eggs.

this and five baskets of that"). Since most of these difficulties were experienced in Africa and since successful experiences cited regarding quantities are largely Asian (though they also include Latin American countries and countries of the former Soviet Union), there may be a "continent effect" here, perhaps reflecting the degree to which the economy is monetized or to which nonstandard prices or units prevail.

The alternative to collecting prices from households is to collect prices at the community level at village and local markets in the primary sampling unit. This option is cheaper because prices are collected only for each primary sampling unit and not for each household. This option also has the advantage that, in principle, the prices in the market are the prices that consumers actually face. The fact that observed prices are the same for everyone in the primary sampling unit is thus an advantage, not a disadvantage. Most countries have some sort of regular method for collecting data on consumer prices and aggregating them into a price index, at least in urban areas; survey designers may be able to make use of or at least adapt these well-established procedures.

However, there are also a number of difficulties with collecting price data at the community level. One problem is that in some circumstances it is difficult for a survey team to replicate the sort of transactions that locals engage in; haggling is often an important factor in defining the prices actually paid by local consumers, which may mean that the prices vendors quote to survey enumerators are different from those paid by long-standing or regular customers.

A second problem is that the price questionnaire can only collect prices on items that are available in the local markets, which may exclude many nonfood items as well as those food items only consumed seasonally, or consumed regionally rather than nationwide. To solve this problem as well as the problem of defining a suitable unit, price collection in the survey may be biased toward manufactured or processed
Moreover, goods are often sold locally in amounts or units that are often not very precise and thus can be hard to interpret at the analytical stage; a "bunch of vegetables" is much less clearly defined than a kilogram of rice.

LSMS surveys have had less difficulty in defining useful units for consumption than for production, where respondents often cannot provide any precise quantities (saying, for example, "I sold three sheets of

items that can easily be defined, such as a can of standard-brand tomato paste, a two-gallon plastic bucket, or a two-pound packet of sugar from the national refinery. Geographic problems remain. In countries where consumption patterns differ radically across regions (for example, between northern and southern areas of India), there may be conceptual difficulties that are as serious as the difficulties involved in comparing prices across countries.

A third problem is that in some (but not all) countries, it can be hard to know what is meant by a "local market." The image of a primary sampling unit as an isolated rural village with a single market is an appealing one, but it is not accurate in all parts of the world. In urban areas, people may buy goods far away from where they live. (The obvious solution to this problem would be to use the urban price indices that are often available and that are sometimes of acceptable quality.)

questionnaire with the average of households in the primary sampling unit. In Vietnam the correlations between the reported unit values from market purchases and the directly observed market price vary from 0.77 (for noodles) and 0.76 (for pork) to -0.07 (for cassava) and -0.34 (for mangoes). For the 16 foods in the comparison, cassava and mangoes are the only two negative correlations, and the median correlation
Even in rural areas primary sampling units are defined by statistical and often ultimately administrative criteria that may not accurately represent actual villages or village markets. So there is no guarantee that the prices in any given local market are the prices actually faced by households in the survey. This problem can be exacerbated when there are several different types of outlet in a given community (such as markets, corner stores, supermarkets, and subsidized government ration shops), all with different prices and clienteles.

Fourth, in at least a few past LSMS surveys, the procedures for entering data from the community questionnaire into the computer were not well enough established, and the data from many questionnaires or (in extreme cases) the whole community survey were lost to analysts. The recommendations in the new LSMS implementation manual by Grosh and Muñoz (1996) should overcome some of the problems, but community questionnaires will continue to be more novel to most survey agencies than household questionnaires, and therefore may be managed less well.

In the LSMS surveys for Pakistan (1991) and Vietnam (1992-93), price data were collected at both the household level (on expenditures and quantities in Vietnam and on quantities and unit values in Pakistan) and the community level, and the results for various goods were compared. The comparison is clouded by the fact that, for the reasons given above, the two data collection procedures measured different things, so that it is unclear exactly how close the unit values from the household questionnaires could be expected to be to the market prices from the community questionnaires.

is 0.34. The correlations are similar, if somewhat lower, for Pakistan, perhaps reflecting the problems with the unit value data.

Since any household survey requires some estimate of price variations, and since households cannot usually provide price (or unit value) data for most nonfoods or even for some foods, the survey always has to include a price (community) questionnaire of some kind. If the household questionnaire does not collect data on quantities, meaning that it will yield no unit values, the community questionnaire must gather data on the prices of food as well as nonfood items. When, as for many foods, both approaches are available, it is unclear from these two examples or from the literature whether the community or household method is preferable for any given survey. Of course, duplicating the collection of some data is an insurance policy; in at least one LSMS survey respondents were mistakenly given the option of reporting either quantities or expenditures, so that without the community questionnaire it would have been difficult to construct the essential consumption expenditure aggregate for each household.

It is useful for analysts to have data on quantities for reasons that go beyond the construction of price indices. Thus most surveys with objectives similar to those of an LSMS survey will want to collect data on quantities if it is feasible and economical to do so. If previous surveys in a country have yielded good data on quantities, or if it proves possible to gather data on quantities in an exploratory field test, questions about quantities should be included in the consumption module of the household questionnaires, at least for quantities of food.
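The comparison between household unit values and community market prices at the level of the primary sampling unit (PSU) can be illustrated as follows. The sketch averages household unit values within each PSU and correlates those averages with the market price recorded for the same PSU. All data, identifiers, and function names are invented for the example; the surveys' actual comparison procedures are not described in enough detail here to reproduce them.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean


def psu_average_unit_values(records):
    """Average household unit values within each primary sampling unit.

    `records` is an iterable of (psu_id, unit_value) pairs for ONE good.
    """
    by_psu = defaultdict(list)
    for psu_id, unit_value in records:
        by_psu[psu_id].append(unit_value)
    return {psu_id: mean(values) for psu_id, values in by_psu.items()}


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up data for a single good observed in three PSUs.
household_records = [
    ("psu1", 2.0), ("psu1", 2.4),
    ("psu2", 3.0), ("psu2", 3.2),
    ("psu3", 2.6), ("psu3", 3.0),
]
market_prices = {"psu1": 2.0, "psu2": 3.0, "psu3": 2.9}  # community questionnaire

psu_uv = psu_average_unit_values(household_records)
psus = sorted(psu_uv)
r = pearson([psu_uv[p] for p in psus], [market_prices[p] for p in psus])
```

Repeating this good by good would give one correlation per item, which is the kind of figure reported for Vietnam and Pakistan. A low or negative value for a good does not by itself show which of the two sources is at fault.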
For both countries the two sets of estimates are similar at a sufficiently high level of aggregation; for example, there is little difference between the unit values and prices at the all-province level in Pakistan. However, the unit values and prices in both countries' data sets differ markedly at the primary sampling unit level when one compares the estimate from the price

Level of Disaggregation

The number of items about which data are collected is one of the central issues in designing the questionnaire for a consumption module. On the one hand, longer consumption modules are more costly and crowd information out of other modules of the questionnaire. On the other hand, asking about more specific items in detail is generally assumed to yield fuller reporting and greater accuracy than would asking about shorter, more general lists of items. There have, however, been suggestions that a survey can try to gather data in too much detail. Respondents may become bored or despondent. Or, believing that they are being uncooperative or showing themselves to be inadequate consumers if they have nothing to report in response to a long list of questions, respondents may invent purchases to be "helpful" or to enhance their prestige.

groups, along with a single question about their total expenditure during the previous month. The consumption totals from the questionnaire with the broad commodity groups were only slightly lower than the totals from the detailed questionnaire, while the single-question method gave consumption estimates that were 25 to 30 percent lower than the estimates from the full NSS list, although still highly correlated (0.98).

There is also some different evidence from the United States Consumer Expenditure Survey, in
However, there has been a validation using crop and trade data of food estimates from the Indian National Sample Survey which uses a very long list of items (Minhas 1988; Minhas and Kansal 1989), and, although there is some slight evidence of overestimation for cereals (about 3 percent), this seems well within reasonable bounds.

Traditional expenditure surveys in developing countries (for example, the National Sample Survey in India) use very long lists of consumption items, naming each food with great specificity. Lists of 200-300 items are not unknown, and the Brazilian budget survey uses a list of 1300 items. LSMS surveys, however, have been less detailed; the 33 foods and 20 nonfood items listed in the Pakistan survey in 1991 and the 45 foods and 46 nonfood items listed in the 1993-94 Vietnam survey are typical. As has already been discussed, some disaggregation is necessary to obtain information on certain specific items of interest; however, the precise level of disaggregation that survey designers choose for any given survey will depend on their views about the tradeoff between costs and accuracy.

There is a good deal of debate about whether short (or at least shorter than standard) consumption questionnaires can save time and money and still deliver accurate estimates of total consumption. The issue does not seem to have been settled in the literature. One set of results suggests that short lists of items will yield reasonably accurate data. A study by Reagan (1954) of farm operators in the United States found that total consumption was only modestly lower (about 10 percent overall) for a condensed list of 15 items than for a list of over 200 items. For developing countries, Bhattacharya (1963) reported on a small-scale experiment on 44 households in two villages in West Bengal that were presented with the usual

which certain households are asked to keep detailed product diaries of their purchases of food, while others (who are in the interview section of the survey for which food is not the main focus) are asked to report their total expenditures on food at home and away from home for each of the previous three months. According to Gieseman (1987), the amount that respondents said they spent on food at home was larger in the interview part of the survey than in the diary part of the survey, and also closer to the NIPA estimate. For the estimates of the amount spent on food away from home, the diary appears to be quite close to NIPA's estimate, whereas this expenditure was substantially underestimated in the interview part of the survey.

Even more positive results were reported by a World Bank (1993) test survey in Indonesia. Both a short and a long questionnaire were administered to 8,000 households. In the short questionnaire the number of food items was reduced from 218 (in the long questionnaire) to 15 and the number of nonfood items was reduced from 102 (in the long questionnaire) to 8. While total measured food expenditures differed little between the questionnaires in terms of means and distribution, the long questionnaire showed about 15 percent more nonfood expenditure.

These results have not been replicated elsewhere. A similar experiment in El Salvador which reduced food items from 72 to 18 and nonfood items from 25 to 6 resulted in ratios (long-to-short) of 1.27 for food and 1.40 overall (Jolliffe and Scott 1995). A 1994 experiment in Jamaica, comparing modules with a total of 119 items to modules with a total of 37 items, produced a long-to-short ratio of 1.26 for both food and nonfood items (Statistical Institute and Planning Institute of Jamaica 1996, appendix 3).
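A long-to-short ratio can be restated as the share of the long-questionnaire total that the short questionnaire fails to capture, which some readers find easier to compare with figures such as "about 10 percent lower". The short Python sketch below does this arithmetic for the ratios quoted above. Treating the long-questionnaire total as the benchmark is an assumption made for illustration; the experiments themselves do not establish which total is closer to the truth.

```python
def shortfall_from_ratio(long_to_short_ratio):
    """Share of the long-questionnaire total missed by the short questionnaire.

    If long / short = r, then short / long = 1 / r and the shortfall is 1 - 1 / r.
    """
    return 1.0 - 1.0 / long_to_short_ratio


# Ratios as quoted in the text; the labels are informal.
reported_ratios = {
    "El Salvador, food": 1.27,
    "El Salvador, overall": 1.40,
    "Jamaica, food and nonfood": 1.26,
}

shortfalls = {
    label: shortfall_from_ratio(ratio)
    for label, ratio in reported_ratios.items()
}
```

On this reading, a ratio of 1.27 corresponds to a short-questionnaire total roughly 21 percent below the long-questionnaire total, and a ratio of 1.40 to roughly 29 percent below.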
detailed National Sample Survey questionnaire as well as with a questionnaire covering broad commodity

In Ecuador in 1993 two versions of the submodule on food items were piloted, one with 122 food items and the other with 72. The ratio of total food expenditures reported in the long module to those reported in the shorter module was 1.67 (see Steele 1998). Shorter questionnaires sometimes dramatically reduce both survey costs and times compared to the longer questionnaires; in West Bengal survey time dropped from 180 minutes to 90 and in Indonesia it fell from 80 minutes to 10. However, it seems that such savings are often gained at the expense of accuracy.

There are alternatives to having either a long questionnaire or a short questionnaire. One alternative, which has never been used in an LSMS survey, would be a hierarchical scheme in which respondents are asked if they have purchased anything within a broad class of goods, and they are only asked the detailed questions about items in that class if they say they have bought a good in that class.

accurately remember how much they spent in individual stores than how much they spent on individual goods or groups of goods. One experiment in Jamaica compared botanical aggregates and "point-of-purchase" aggregates and found little difference between the two sets of means and variances (Statistical Institute and Planning Institute of Jamaica 1996, appendix 3). There is also related evidence from the literature on diaries; Sudman and Ferber (1971) tested diaries that used itemization by purchase, groupings by product type, and groupings by outlets, finding that respondents were more likely to agree to cooperate with the survey and maintained their diaries for longer if a product diary was used. The relevance of this evidence to LSMS interview surveys is a matter for conjecture.

Finally, there are implications for the degree of
For example, if a household responded that it had not bought any dairy items, it would not be asked specific questions about milk, yogurt, butter, and ice cream. Something like this may already happen in practice when interviewers and respondents are faced with lists of several hundred items that are inevitably grouped into broad categories. This kind of approach has obvious advantages when interviewers use computers to administer questionnaires to respondents, in which case the detailed questions about each item would never come up in the interview unless the respondent first indicated that they were relevant. The risk is that the categorical approach will cause consumption to be underestimated because respondents forget purchases that they might remember if they went through the list in detail. There is evidence from diary surveys in industrialized countries that preprinted diaries identifying more categories of consumption cause respondents to

aggregation if the survey is attempting to collect data on physical quantities. Some foods, including many cereals, can be grouped together, and a meaningful total weight can be calculated. However, this is not the case for such items as canned goods, vegetables, or many processed goods; for these items considerable disaggregation is required in order to obtain appropriate units.

In summary, it seems that using drastically shorter questionnaires is likely to be risky and lead to the underestimation of total consumption. Three hundred items are probably too many and 10 are probably too few, but it is difficult to be more precise. The draft module presented in this chapter includes only the approximately 70-100 items that have commonly been used in past LSMS surveys, on the grounds that many more items would increase costs noticeably and that the comparisons to NIPA have not shown huge, systematic biases.
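The hierarchical scheme discussed in this section, in which a screening question about a broad class of goods decides whether the detailed item questions are asked at all, maps naturally onto the skip logic of a computer-administered interview. The following Python sketch shows one possible shape for that logic. The function names, the callback interface, and the second class of goods are hypothetical; only the dairy example comes from the text.

```python
def administer(categories, has_purchased, ask_amount):
    """Run a two-level (screen, then detail) consumption interview.

    categories: {category name: [item names]}
    has_purchased(category) -> bool    the screening question
    ask_amount(item) -> float          the detailed question
    Returns (answers by item, number of questions asked).
    """
    answers = {}
    questions_asked = 0
    for category, items in categories.items():
        questions_asked += 1                 # screening question for the class
        if not has_purchased(category):
            continue                         # skip every item in this class
        for item in items:
            questions_asked += 1
            answers[item] = ask_amount(item)
    return answers, questions_asked


categories = {
    "dairy": ["milk", "yogurt", "butter", "ice cream"],
    "cereals": ["rice", "maize flour"],      # hypothetical second class
}
# Scripted replies standing in for a respondent.
screen_replies = {"dairy": False, "cereals": True}
amount_replies = {"rice": 12.0, "maize flour": 4.5}

answers, questions_asked = administer(
    categories,
    has_purchased=lambda category: screen_replies[category],
    ask_amount=lambda item: amount_replies[item],
)
flat_list_length = sum(len(items) for items in categories.values())
```

Here four questions are asked instead of six. The sketch also makes the risk visible: a wrong "no" at the screening stage silently drops every item in the class, which is the underestimation problem described above.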
report more consumption, presumably due to this prompting effect (Tucker 1992; Tucker and Bennett 1988).

There are many different ways of turning a list of several hundred items into a list of only a few dozen. The traditional procedure, which might be referred to as the "botanical" method, groups together cereals, as well as pulses, root vegetables, or leafy greens. Since botanically similar foods often contain similar amounts of calories per kilogram, this way of aggregating the list of items ensures that analysts can calculate calorie counts when necessary. Other criteria can also be used for aggregating these items, such as where the consumer typically buys the goods. Consumers may more

Using a much smaller number of items would increase the risk of underestimating total consumption and would certainly decrease analysts' ability to calculate rough estimates of caloric content (which are sometimes used to calculate the poverty line).

Recall Period

Each consumption item in the consumption module must be given a recall or reference period. The questionnaire may ask how much rice the household purchased during the previous week, two weeks, or month, or it may ask about the household's expenditure on clothing during the previous two weeks, month, or year. The recall period is sometimes tied to a particular event, most commonly the interviewer's last visit. Another option is to have respondents report how much they "usually spend" over a month or a

content with estimating aggregates or averages over households, for example, for weights for a consumer price index.
year. While it makes obvious sense to use longer reference periods for items that are rarely purchased and to use shorter periods for high-frequency purchases, this guideline leaves a great deal still to be decided.

RECALL PERIOD CHOICE. The choice among recall periods is one of the most important and difficult design issues for the consumption module. It is also an issue that cannot be dealt with in isolation because it interacts with other elements of the module (such as whether expenditures are collected by diary or by interview) and with the survey's design more generally (in particular, whether the design permits multiple visits at least a week apart). The ultimate objective is to obtain a reasonably accurate estimate of the rate of each household's total consumption expenditure over the previous year. There are many different ways to fulfill this aim.

If the only problem with reporting were progressive forgetting (the fact that respondents' memories of their expenditures fade as the time since the purchase grows longer) and if there were no other systematic biases (but there are, as will be discussed below), averages could be obtained accurately with short reporting periods. However, this is not adequate for measuring welfare at the level of the individual household. Longer recall periods are better than shorter ones for measuring the distribution of consumption because averaging consumption over many days eliminates the randomness of some of the household's day-to-day purchases that have nothing to do with its standard of living. However, if people find it harder to remember more distant events, longer recall periods will miss more consumption, and lead to downward bias. If short recall periods are used, and if people report accurately, not everyone will purchase every
One possibility is for the interviewer to make a single visit to the household during which the respondent is asked to recall how much the household spent during the previous 12 months, either in total or on a list of items. This is likely to lead either to an underestimation of household expenditure (because it is difficult for people to remember their expenditure from so long ago) or to educated guesses (in which respondents estimate their expenditure over the whole year from their current rate of expenditure).

Another alternative is for the interviewer to visit the household many times throughout the year and ask the respondent for details of the household's expenditures over shorter periods. However, if people's memories of their expenditures fade quickly, many visits may be required to ensure that accurate data are collected on high-frequency purchases, and such visits can be prohibitively costly. The diary method was

item every day or every week, because many goods can be stored, because many goods that the household consumes regularly do not have to be consumed every day, and because some items of consumption have seasonal patterns. Provided that the survey's fieldwork is spread throughout the year and provided that the respondents' reports are accurate, short reference periods will yield unbiased estimates of the mean for the population. Those households that do not purchase anything during the reference period will be averaged with those who happen to make purchases for several periods, or those interviewed about their consumption during a festival are averaged with those interviewed about non-festival consumption. However, such data (for periods when some households spend nothing while others spend a lot) do not give an adequate picture of the annual consumption of individual households nor of the distribution of consumption across households.
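The point that a short reference period can give an unbiased mean yet a poor picture of the distribution can be seen in a small simulation. The sketch below is a stylized model, not a description of any survey: every household is assumed to consume at a steady daily rate, to shop once every seven days for a week's supply, and to report accurately. The uniform range of consumption rates, the shopping cycle, and the cutoff of 1.5 are all arbitrary assumptions.

```python
import random
from statistics import mean, pstdev


def simulate(recall_days, n_households=20000, cycle_days=7, seed=0):
    """Return (true daily rates, daily rates measured from purchases).

    Each household buys `cycle_days` worth of consumption on every
    `cycle_days`-th day; the interview date falls at a random point in the cycle.
    """
    rng = random.Random(seed)
    true_rates, measured_rates = [], []
    for _ in range(n_households):
        daily = rng.uniform(1.0, 3.0)        # true consumption per day
        phase = rng.randrange(cycle_days)    # position of the shopping day
        trips = sum(
            1 for day in range(recall_days)
            if (day - phase) % cycle_days == 0
        )
        spent = trips * daily * cycle_days   # accurate report of purchases
        true_rates.append(daily)
        measured_rates.append(spent / recall_days)
    return true_rates, measured_rates


def share_below(rates, cutoff):
    """Fraction of households whose rate is below the cutoff."""
    return sum(1 for rate in rates if rate < cutoff) / len(rates)


true_rates, one_day = simulate(recall_days=1)
_, one_week = simulate(recall_days=7)        # same seed, same households

mean_true, mean_one_day = mean(true_rates), mean(one_day)
spread_true, spread_one_day = pstdev(true_rates), pstdev(one_day)
low_true = share_below(true_rates, 1.5)
low_one_day = share_below(one_day, 1.5)
```

With one-day recall the average is close to the true average, but most households report no purchases at all, so the measured spread and the share of households below the cutoff are far larger than the true figures. With a recall period equal to the shopping cycle, each household's rate is recovered exactly in this stylized setting.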
designed to minimize reliance on respondents' memories because the diaries are supposed to be filled out at or near the time when the purchase is made. However, diaries clearly pose special problems when a substantial fraction of the population is illiterate, problems that will be discussed more fully below.

One of the special features of LSMS surveys is the requirement that each survey provide an estimate of annual expenditures at the household level. Most consumption surveys do not make this demand and are

Using a shorter recall period in the consumption module than the period over which living standards are defined in analysis will inevitably cause error in the measurement of living standards. Because adding variance always increases the apparent inequality (effectively a mean-preserving increase in spread), measurement error exaggerates inequality and, if the poverty line is below the mode of the distribution, exaggerates estimates of poverty. In the extreme case, using a single day as the recall period would, while effectively eliminating bias in the mean, classify as poor anyone who did not go shopping the previous day. Since the LSMS surveys are at least as concerned with the dispersion of households across the distribution as with means, single-visit consumption modules with very short recall periods should be avoided, except when purchases are known to be evenly spread (say, as a result of rapid inflation that causes people to make frequent, regular purchases).

first week. Thus expenditures reported seven days later were 87 percent of what they were for a single day. After two weeks reported expenditures were another five percentage points lower. Annual recall based on explicitly normative questions ("How much do you usually spend on xx?") gave a total 91 percent of that of one-day recall, while annual recall based on ostensibly factual questions ("How much did you spend on
This is not to say that short recall periods and multiple visits can never work well. Several African expenditure surveys use a daily recall period for seven daily visits. In Singapore consumer expenditure surveys use this kind of design in conjunction with diaries (see Silberstein and Scott 1991).

The difficulties that can arise with overly short recall periods are not confined to the estimation of poverty and inequality. If the recall periods for reporting consumption are shorter than the period over which living standards are defined in the analysis, the measurement error in each individual expenditure will be transmitted into the total expenditure estimate, which is the sum of all of the individual expenditures. As a result there will be a nonstandard measurement error bias in the estimation of Engel curves, including calorie Engel curves (Cramer 1969; Bouis and Haddad 1992). If this is not corrected for, estimated Engel elasticities will vary with the length of the recall period (Ghose and Bhattacharya 1993, 1995).

THEORY AND EVIDENCE IN THE LITERATURE. People do not forget all events at the same rate, and they can remember some "flashbulb" events (for example, the death of President Kennedy in the United States) in great detail for many years. Yet people generally forget more events the further such events slip into the past. Purchases of consumer goods are no exception to these rules; there is a large body of evidence in industrialized countries on "recall bias" or increasing underestimation as the recall period is increased. (See Neter 1970, Eisenhower and others 1991, Silberstein and Scott 1991, and Sudman and others 1996; for evidence from developing countries see Mahalanobis and Sen 1954 and Ghosh 1953.) Scott and Amenuvegbe (1990), who ran experiments using households from

xx?") gave a total 113 percent of the one-day recall figure.

A second important bias caused by memory that affects expenditure estimates is "telescoping," whereby respondents include in their reports events that happened before the beginning of the recall period. Asked about expenditures during the previous year, respondents may include a car that they bought 13 months ago. (Neter and Waksberg 1964 identified such effects in the U.S. Consumer Expenditure Survey for homeowners' alterations and repairs.) According to the formal models of Rubin and Baddeley (1989) and Bradburn, Huttenlocher, and Hedges (1994), because people do not remember dates very well, they may remember an event (or expenditure) but be unsure about the date this event occurred. If uncertainty about dates increases as the event recedes, such telescoping errors will cause an increasingly upward bias in the resulting data. The further in the past the event is, the greater is the uncertainty about its date; thus the probability of a less recent event being misplaced into a more recent recall period is higher than the probability of a more recent event being misplaced out of such a recall period.

Short recall periods are more affected by telescoping than by recall bias, leading to overstatement of purchases. As the recall period increases, recall bias becomes more prevalent, resulting in a downward bias (Eisenhower, Mathiowetz, and Morganstein 1991). These effects will work differently for different goods; telescoping has a greater effect for purchases that are highly salient to the respondent (such as durable goods purchases, wedding or funeral expenses, or purchases of grain stock for the year), while high-frequency smaller purchases on such items as food and household supplies are more likely to be forgotten altogether.
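The asymmetry behind telescoping can be illustrated with a minimal sketch. It assumes, purely for illustration, that the remembered age of an event is normally distributed around its true age with a standard deviation proportional to that age; this functional form and the 10 percent figure are assumptions made here, not the specification used by Rubin and Baddeley (1989) or Bradburn, Huttenlocher, and Hedges (1994).

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_reported_inside(true_age, window, rel_sd):
    """Probability that an event is remembered as falling inside the window.

    Assumption: remembered age ~ Normal(true_age, (rel_sd * true_age) ** 2),
    so date uncertainty grows as the event recedes into the past.
    """
    sd = rel_sd * true_age
    return normal_cdf((window - true_age) / sd)


WINDOW = 12.0   # recall period in months
REL_SD = 0.10   # hypothetical: date error is 10 percent of the event's age

# An event 13 months old that is wrongly pulled into the 12-month window.
p_in = prob_reported_inside(13.0, WINDOW, REL_SD)

# An event 11 months old that is wrongly pushed out of the window.
p_out = 1.0 - prob_reported_inside(11.0, WINDOW, REL_SD)

print(round(p_in, 3), round(p_out, 3))
```

Under these assumptions the older event is more likely to be moved into the recall period than the more recent event is to be moved out of it, which is the source of the net upward bias described in the text.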
the Ghanaian Living Standards Survey, found that for 13 items frequently purchased by these households, reported expenditures fell at an average of 2.9 percent for every day added to the recall period, within the

Neter and Waksberg (1964) proposed dealing with telescoping using "bounded recall," a method that appears to be reasonably effective. The first step in this method is to conduct a preliminary interview in which respondents are asked about the household's expenditures in, say, the previous month. Although these data are not used because they are subject to telescoping bias, the process of noting them provides a record that prevents their being reported again in the first "real" interview, and thus eliminates telescoping. In the second interview, respondents are asked about their purchases since the first interview. Bounded recall is only possible when the interviewer makes at least two well-separated visits to the household. Recall bias is more difficult to deal with, but if respondents are willing and able to keep an accurate diary, both recall bias and telescoping are eliminated.

Cognitive psychology and questionnaire "think aloud" interviews are yielding insights into how people answer questions. It seems that as long as an interview is well conducted, most respondents do their best to provide accurate and truthful answers. At the same time, most respondents also try to minimize the effort they must make to respond to the interviewer's questions, thus acting as "cognitive misers." As a result, they may switch their tactics for answering expenditure questions as the difficulty of the task increases. Over short recall periods or for rare but important events in their lives, their answers are based on counting; they recollect individual events and then add them up. Over long periods, for unimportant events that happen frequently, or for aggregates containing large numbers of items, they often resort to a more approximate way of estimating their answer: estimating the frequency of the occurrence and multiplying this number by the length of the reference period. The frequency that they choose may or may not be accurate; it may, for example, be overweighted toward their current or recent behavior, with the respondent ignoring or giving inadequate weight to exceptional events. Sudman, Bradburn, and Schwarz (1996) stated that their "tentative finding is that estimation is unbiased but counting methods, although reducing variance, may be biased either up, for short time periods, or down, for long periods." In this sense, respondents' estimation strategies can be thought of as an alternative to diaries as a means of dealing with recall and telescoping biases. Indeed, there is some evidence from consumer expenditure surveys in Canada (McWhinney and Champion 1974) that diaries and interviews with annual recall periods gave closely similar results.

The U.S. Consumer Expenditure Survey, a survey that could be seen by statistical offices in developing countries as a model to emulate, has changed its design to accommodate the findings in the literature (see Jacobs and Shipp 1993). Although the Consumer Expenditure Survey is fielded in an industrialized country and although its primary function is to collect weights for the price index rather than to monitor welfare, it collects detailed and aggregated consumption measures, and its experience is relevant for designers of LSMS surveys. There is good documentation on this survey's experience with telescoping, recall bias, seasonality, and the advantages and disadvantages of diaries compared to interviews.

In Consumer Expenditure Surveys up to 1960-61, expenditures were obtained by interview using questions with an annual recall period. (An interesting feature of this methodology was a "balancing" procedure whereby a household was revisited for cross-checks if the household's reported expenditures and reported income differed by more than a prespecified limit.) However, in the decade before the next survey (in 1972-73) came the work on telescoping by Neter and Waksberg (1964), as well as influential experiments on diaries and interviews by Sudman and Ferber (1971). The annual recall period was abandoned, as was the "balancing" procedure, which was considered arbitrary and unworkable in the absence of an annual recall period. Available literature does not provide any evidence that the annual recall procedures were unsatisfactory or that the balancing procedure failed to work, only that such techniques had been superseded.

In the current design of the Consumer Expenditure Survey, introduced in 1980, one set of households keeps diaries that cover expenditures on food and minor household items (mostly grocery items), and a different set of households is interviewed on five separate occasions. The data from the first interview, which uses the recall period of the previous month, are used only to eliminate telescoping bias. At each of four subsequent quarterly interviews, households are asked to recall their expenditures during the previous three months. The evidence from these surveys is entirely consistent with previous evidence about the significance of telescoping bias. According to Silberstein (1990) the (discarded) rates of expenditure from the first interviews are much higher than those from subsequent interviews; for clothing the first interview totals are 40 percent higher than the average of the subsequent four interviews. However, there is also "internal" telescoping or recall bias in the data from the subsequent interviews, with respondents consistently reporting higher expenditures in the most recent (third) month of each quarter than in the two months preceding it. There are also pronounced seasonal effects in the reported expenditures in the Consumer Expenditure Survey, especially increased consumption associated with the year-end holidays (Silberstein and Scott 1992). The U.S. Bureau of Labor Statistics doubles the size of the sample over the holiday period to deal with these and other (largely nonresponse) effects; the effectiveness of this measure is not clear.

EXPERIENCE FROM PAST LSMS SURVEYS. Past LSMS surveys have used a range of recall periods for consumption items, depending on both the item and the survey. For food purchases the ongoing Jamaican survey uses 7-day and 30-day recall periods. In South Africa in 1993 respondents were asked whether they bought each food item on a weekly or monthly basis, and were then asked to report their purchases during the last such period. The survey in Ecuador (1994) took a similar approach in that the respondent chose the recall period for food items. The recall period was one week in the Kyrgyz Republic (1993), Nicaragua (1993), and Russia (1993-94) surveys and two weeks in Brazil (1996). In China (1994; Hebei and Liaoning provinces only) the period was specified simply as "1994." In many surveys nonfood items have often been separated into two categories: high-frequency or "daily" items and "occasional" items. Daily items have a short recall period (perhaps a week or two), whereas occasional items may have recall periods of one month, three months, six months, or a year. For nonfood items, some surveys have two recall periods; the Jamaica survey uses a month and a year. Other surveys sort different items into different single recall

any of the foodstuffs on the list. If the answer is "yes," a further question is asked about the value of any purchases that the household made since the interviewer's last visit (a time that in the prototypical LSMS fieldwork plan is 14 days). Second, the respondent is asked in how many months of the year the household purchased the food item, how often it purchased the item in each of those months, and how much it usually spent each time. Data on the value of home-produced food are collected in a separate set of questions that ask how often the home-produced food is consumed; the recall period for these questions has varied from country to country in previous surveys, ranging from "each time the home-produced food is consumed" to each day to a typical month.

This design makes it possible to compute two different estimates of the monthly rate of expenditure for each food item. The "last-visit" measure is zero if no purchases were reported; otherwise it is the amount reported since the last visit divided by the number of days since the last visit and multiplied by the number of days in an average month (365 over 12). The "usual-month" measure is zero if nothing was purchased in the previous year; otherwise it is the reported usual monthly expenditure multiplied by the number of months in which purchases were made and divided by 12. Most analysts of the data, whether constructing poverty profiles or conducting research, have used the usual-month figures.

In some surveys the respondent has been offered only a single recall period for some nonfood expenditures (a week, the time since the interviewer's last visit, a month, or a year, depending on the presumed frequency of expenditure). Frequently purchased nonfood items such as newspapers and tobacco are usually collected with a recall period of the previous seven days or the time since the interviewer's last visit.
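The two monthly rates defined above for food items can be written out as a short sketch. The function and argument names are hypothetical, and the usual monthly expenditure is taken as a single reported figure (for instance, purchases per month times the usual amount spent each time); the arithmetic follows the description in the text.

```python
DAYS_PER_MONTH = 365 / 12  # days in an average month, as in the text


def last_visit_monthly(amount_since_last_visit, days_since_last_visit):
    """Monthly rate from purchases reported since the interviewer's last visit."""
    if amount_since_last_visit <= 0:
        return 0.0  # no purchases reported since the last visit
    if days_since_last_visit <= 0:
        raise ValueError("days_since_last_visit must be positive")
    daily_rate = amount_since_last_visit / days_since_last_visit
    return daily_rate * DAYS_PER_MONTH


def usual_month_monthly(usual_monthly_spend, months_purchased):
    """Monthly rate from the 'usual month' questions, averaged over the year."""
    if not 0 <= months_purchased <= 12:
        raise ValueError("months_purchased must be between 0 and 12")
    if months_purchased == 0:
        return 0.0  # nothing purchased in the previous year
    return usual_monthly_spend * months_purchased / 12


# Hypothetical household: 28 spent on rice in the 14 days since the last visit;
# it usually spends 60 a month on rice, in 6 months of the year.
print(last_visit_monthly(28.0, 14))   # 2 per day times 365/12
print(usual_month_monthly(60.0, 6))   # 60 * 6 / 12
```

For a household that buys the item only in some months, the two measures can differ considerably depending on whether the interview falls in a purchasing month, which is one reason the choice between them matters for dispersion.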
periods; expenditures on soap may be reported on a monthly basis, clothing on a quarterly basis, and vacations on an annual basis.

One design is frequently thought of as an LSMS standard, in part because it was used in several of the earliest and most widely analyzed LSMS surveys. In this protocol, respondents are asked whether the household has consumed a particular food item during the past year. Each respondent who answers "yes" is asked a series of follow-up questions. First, the respondent is asked whether the household purchased

However, for a substantial number of items, a dual procedure similar to the procedure used for food has often been followed, with a last-visit measure constructed as before (although the alternative is now expenditures "in the last year" rather than usual monthly expenditure). In this case a monthly estimate can be constructed by dividing the response by 12.

The last-visit measure can be thought of as an explicitly bounded measure that elicits from the respondent an answer based on his or her recall and counting of events, while the "usual-month" question is an attempt to elicit from the respondent an answer based on a rate or frequency. Of course, these are generous interpretations. In the standard protocol, no consumption questions are asked at the first visit. Thus, although the first visit may be fixed in the respondent's memory, it is not clear that this will help to reduce the telescoping errors at the second visit. The respondent may remember the previous visit of the interviewer very well but still be unable to recall whether his or her trip to the market occurred before or after that visit.

There is also a serious question about whether the usual-month recall is likely to be independent of the last-visit recall. If respondents are "cognitive misers" casting about for the easiest way to estimate the "usual" rate, the answer to the "last-visit" question will give them one. Omitting the "last-visit" question might not even solve the problem, since the respondent's answer to the frequency question could also be unduly influenced by his or her recent behavior.

In background work for this chapter, data from the LSMS surveys in Côte d'Ivoire (1986), Ghana (1988), Pakistan (1991), and Vietnam (1992-93) were used to compare the consumption estimates from the different recall periods used in the consumption modules. The aim was to look for evidence that reported mean expenditure rates decline with the length of the recall period and to check whether the same is true for the dispersion of the estimates.

For purchases of food (excluding the value of home production), the estimates are very similar for the last-visit and usual-month methods of calculating monthly food expenditures. Where there are differences, they do not conform to the expected pattern in which both mean and dispersion are lower in the usual-month figure. Indeed, the differences are most marked in Côte d'Ivoire, where the last-visit measures have a lower mean (by 5 percent) and median (by 8 percent) than the usual-month estimates, and the measures of dispersion are very close. For Ghana the two sets of numbers are effectively identical, a finding that extends to the complete distributions so that no poverty or inequality measure would be different if one kind of data were used to calculate it rather than the other. In Vietnam the last-visit measures are slightly lower than the usual-month measures, and the dispersion in the latter is perceptibly lower. A close inspection of the details revealed that there are fewer very low reports in the usual-month data than in the last-visit data. (The Pakistan data cannot be used for this comparison due to the problems discussed above.)

For nonfood items, for which the frequency of purchases is generally lower than for food, the differences between the two measures are more marked. For the means, the usual-month measures are lower than the last-visit measures for Côte d'Ivoire, Pakistan, and Ghana but not for Vietnam. The dispersions are lower for the usual-month measures than for the last-visit measures in all four surveys. Except for Vietnam, the decline in means is consistent with the syndrome of increasing forgetfulness over a longer period, and all the dispersions are consistent with the view that the last-visit (two week) measure of these items is too short to give an accurate measure of households' annual standards of living.

When food and nonfood are put together with other items to give a consumption aggregate, dispersion of total per capita household expenditure from the usual-month data is less than for the last-visit data, but the difference is not very marked, given the fact that the item with the biggest drop in dispersion (nonfoods) is a relatively small share of most households' budgets. Nevertheless, the differences are large enough that the headcount ratio measures of poverty will all be lower when usual-month data are used rather than last-visit data.

These results can be seen as encouraging, since they show only slight sensitivity to the choice between two of the most obvious reporting periods. An optimistic interpretation is that food expenditures are frequent and stable enough that the last-visit and usual-month estimates are similar. While this is not true for infrequent expenditures for which the choice of reporting period affects both bias and variance, the total expenditure on such items is usually too small for their net effect on total consumption to be very large.

There are some caveats to this finding. If respondents forget about their food purchases at the sort of rates suggested by Scott and Amenuvegbe (1990), the last-visit measures may be substantially underestimated. In addition, the consistency between the usual-month and last-visit measures for food may simply reflect households using their responses to the last-visit questions to guide their answers to the usual-month questions, which would not then constitute any sort of independent check on the validity of either measure. And there are some nonfood items in these surveys (mostly frequently purchased items) for which there was only a single recall period. While it might be hoped that these nonfood items would not be sensitive to the choice of recall period, the inclusion of these items in the consumption totals mutes the effects of using different recall periods.

OPTIONS FOR FUTURE SURVEYS. There are no definitive answers about the optimal recall period. In the meantime, however, surveys must be designed, so we provide a brief discussion of the various options together with some suggestions.

If a survey is meant to be comparable to another survey, it makes sense to use the same recall periods that were used in the other survey, provided that the previous survey conformed with best practice standards as outlined in this section of the chapter.

Beyond this consideration, there are two main routes to follow. The first is essentially the "status quo" of the design of most recent LSMS surveys, while the second is a more extensive revision based on the supposition that the current design is unsatisfactory. The preference of the authors of this chapter (which is reflected in the draft questionnaire) is to make only minor modifications to the status quo and to experiment with the components of any revision before putting it into practice.

The "status quo" design is to use two recall periods: one that is the amount of time since the interviewer's last visit and the other that is for "usual" expenditures. As already argued, the first yields estimates that have a minimal amount of telescoping, while the second is a calculated, unbiased estimate. The evidence of the comparisons with NIPA data can also be cited; these do not show the gross and systematic under-estimation of expenditures that might be expected if the recall data missed a very large fraction of expenditures. Finally, an important characteristic of LSMS surveys has been their coverage of many topics, which would be threatened by including a vastly more extensive consumption module. Consumption is already one of the longest and most expensive modules of any LSMS questionnaire, and to extend it further (even if the resources were available to do so) would inevitably crowd out other important topics. Nevertheless, these are not strong arguments. The NIPA comparisons are weak, the consistency checks in the surveys are capable of many interpretations (not all of which support the validity of the data), the last-visit recall period contains no bounding questions, and arguments in terms of expediency are much weaker if the data are of uniformly poor quality. Moreover, the two-visit structure is being used increasingly less frequently; when this structure is not used the last-visit question has to be replaced by a question with an unbounded recall, increasing the chances of telescoping.

The other point of view is that the LSMS standard questions are far from best practice. One of our reviewers, the late Chris Scott, said, "the use of the question, 'How much do you usually spend on mangoes in one of the months that you purchase mangoes?' appears to fall far outside of reasonable best practice." Scott, who had extensive experience carrying out consumption surveys in Africa and elsewhere, believed that only a more in-depth approach would yield adequate data. According to Scott, respondents should be interviewed several times, separated by the most accurate recall period, perhaps as little as a day. The number of interviews should be as many as are needed to cover the reference period (say, a week or two), with a bounding interview at the outset. Some or all of these interviews can be replaced by diary-keeping by the respondents themselves, by proxy recordkeepers in the household, or by interviewers who complete the diaries with help from the respondents, an alternative that blurs the line between diary and interview procedures. There is some evidence (reviewed below) that diaries can capture some expenditures that may be missed in interviews, and with a sufficient number of such diaries, the substantial additional cost of conducting interviews can be translated into high-quality data. Nevertheless, even with a two-week reference period (requiring 10 or more interviews), seasonality is not captured, nor are other fluctuations in consumption over the year. It is conceivable that usual-month responses capture some of this variation, but this is probably little more than a hope.

Every survey should have a budget for experimentation. Questions should occasionally (if not regularly) be subjected to cognitive laboratory techniques and revised and updated in the light of the results. Even more importantly, the extensive interviewing and diary techniques, as advocated by Scott, urgently need to be compared against the standard last-visit and usual-month responses, preferably using randomly selected subsets of households within the same survey.

Finally, it cannot be overemphasized that the wordings of the recall periods must be unambiguous and well understood by interviewers. Wordings such as "since my last visit, two weeks ago" are obviously ambiguous if the visit did not occur exactly 14 days ago. There are also possibilities for confusion in recording the units of a purchase. Where quantity or price data are sought, it must be clear what units the questions refer to. Obtaining the units for quantities purchased may work better than obtaining the units for prices or unit values; this is reflected in the draft consumption module.

Multiple Visits Throughout the Year

In most past LSMS surveys, primary sampling units and the households within these units have been visited on two occasions two weeks apart, with the consumption data collected only during the second visit. Thus there is a single record of consumption over whatever recall period is selected. This has also necessarily been the case in the increasing number of LSMS surveys where only a single visit is made to the primary sampling unit. Even if the surveys were to adopt a more intensive program of multiple interviews, consumption data would still be collected over only a relatively brief period (say, a week or a month). No proposed design of the LSMS consumption module would capture variations in household consumption over a whole year; therefore, the consumption data currently collected may not reflect the annual consumption flows in which analysts are fundamentally interested. To collect better data it would be necessary to revisit households on several occasions throughout the year, collecting consumption data during each visit. Such data could be used to increase the accuracy of the consumption aggregates, allowing for variation in households' consumption over time; the data could also be used for a number of analytical exercises.

One reason for multiple visits throughout the year would be seasonality. The collection of agricultural data and agricultural income typically requires that survey interviewers visit households (farms) in different seasons of the year. Farm incomes are seasonal, and it may not be possible for a respondent to remember all of the transactions that went into a calculation of net income many months afterwards. In consequence, it is widely believed (and suggested in Chapter 19 of this book) that accurate farm surveys require multiple seasonal visits. The concern in the current chapter is with consumption, but it is conceivable that consumption expenditures, like incomes, vary systematically with the seasons, so that if respondents' memories are a problem, accurate estimation of annual consumption flows will require multiple, seasonal visits just as does the accurate estimation of income.

The seasonality of consumption does not necessarily follow from the seasonality of income, and may have a quite different seasonal pattern. In theory, farm households have strong incentives to untie their consumption from the seasonal patterns of their incomes. Harvests may generate income for only a few weeks or months, while consumption has to be maintained throughout the year. It is not necessary to accept the permanent income or life-cycle hypothesis of consumption to believe that households can smooth their consumption over the year. Even without access to credit markets, farmers can store some output or save some income from the harvest to support consumption throughout the rest of the year. Of course, there will still be seasonal patterns in farmers' consumption; festivals such as Tet and Christmas are associated with higher-than-normal expenditures. And because storage is costly, prices will generally be higher just before the harvest than just after it, with some effect on consumption. Nevertheless, seasonal variation in consumption is neither closely tied to, nor less than, seasonal variations in income. Reliable evidence on the relative seasonal smoothness of consumption comes from Paxson (1993). Rice farmers in Thailand who double-crop (with irrigation) have quite different seasonal patterns of income than do farmers with only one crop, yet their consumption patterns are almost identical and exhibit little seasonal variation. The same is true when one compares farm and nonfarm households or farmers in different agroclimatic zones.

For the purposes of this chapter, an attempt has been made to look for seasonal patterns in consumption data from three LSMS surveys. Cross-sectional surveys are not well suited to this task, since what is required is not a large number of households observed throughout a single year but rather a large number of years over which seasonal patterns can be established. Nevertheless, the consumption totals for Ghana, Côte d'Ivoire, and Vietnam were examined for differences across months. In any one survey for a single year, there tended to be significant differences in the consumption total from one month to another, differences probably not driven by seasonal patterns in consumption. (In the Côte d'Ivoire and Ghana surveys, which include data for more than one year, the monthly patterns in the consumption total are quite different across the survey years.) Some of these differences (such as those caused by Tet in Vietnam) are easy to explain, but others are not, even though their effects are sometimes considerable. The progression of the survey teams through the country may generate some variation as they move from poorer to richer villages. Random measurement error is also likely to be an important factor. Nevertheless, this limited analysis provides no grounds for supposing that consumption would be better measured by including multiple, seasonal visits to households in LSMS surveys in the future.

However, there are other reasons why multiple visits might be useful. The average of two consumption totals, each for a two-week recall, will give a better, lower-variance estimate of longer-term consumption. More radically, Scott (1992) and Central Statistical Office of Zambia (1995) advocate using the correlation between consumption across multiple visits to correct measures of inequality to bring them closer to what would have been measured had it been possible to collect consumption data over a full year for each household. The idea is as follows. Suppose

the intrahousehold and interhousehold components of dispersion. Although this is not explicitly allowed for in Scott's formulation, his corrections should still lead to a better estimate of dispersion. With repeated observations of the same households, there are a number of techniques that would make it possible to assess the size of measurement errors (see particularly Griliches and Hausman 1986).

Finally, in some circumstances multiple consumption measures can be useful at the analytical stage. Multiple visits generate a type of longitudinal or panel data that can be useful for studying changes over time and for sorting out cross-sectional variation from time-series variation, as in Scott's work above. However, if the visits are separated by only a few months, the changes that have occurred may not be large enough to be interesting, and the measured changes are likely to be dominated by measurement error.

The recommendations made here are again tempered by the increased costs of extending the consumption module of a typical LSMS survey, in terms of both money and the consequences for the rest of the survey.
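The correction based on the correlation of consumption across visits, discussed in this section, can be illustrated with a minimal sketch. It is not Scott's (1992) formula or the procedure of the Central Statistical Office of Zambia (1995); it assumes a simple model in which each monthly observation equals a permanent household level plus a transitory component that is uncorrelated across visits and with the permanent level. Under that assumption the covariance between two visits estimates the between-household variance, and the remainder of the observed variance is within-household. The data are hypothetical.

```python
from statistics import mean, pvariance


def covariance(xs, ys):
    """Population covariance of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def decompose_variance(visit1, visit2):
    """Split observed monthly variance into between- and within-household parts.

    Assumption: observation = permanent level + transitory noise, with noise
    uncorrelated across visits and with the permanent level.
    """
    total = (pvariance(visit1) + pvariance(visit2)) / 2
    between = covariance(visit1, visit2)
    within = max(total - between, 0.0)
    return between, within


def annual_variance(between, within, months=12):
    """Variance of a 12-month average when monthly noise is uncorrelated."""
    return between + within / months


# Hypothetical monthly consumption of four households at two visits.
visit1 = [12.0, 18.0, 28.0, 42.0]
visit2 = [9.0, 23.0, 27.0, 41.0]

between, within = decompose_variance(visit1, visit2)
corrected = annual_variance(between, within)
print(between, within, corrected)
```

In this example the single-month variance (about 129.5) overstates the dispersion of annualized consumption (about 125.4), because part of what a single month records is within-household variation.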
Because consumption is smoothed within measured at a monthly rate but the only observations the year, measuring it over two weeks or a month may they have are of consumption over the previous yield a sufficiently accurate picture of annual con- month. At one extreme, each household may consume sumption to make it not worth incurring the cost of the same amount in each month so that the monthly adding yet more visits. However, if the detailed agri- totals are correct and can be correctly used to give a cultural module is included in the survey with multi- measure of dispersion over households. At the other ple visits during different seasons, there would be little extreme, suppose that each household's consumption additional cost involved in collecting at least some is uncorrelated from one month to the next. The "last consumption data at each visit. Such cases aside, month" totals are then correct on average, but their including multiple visits throughout the year is proba- dispersion over households is larger than the disper- bly not the highest priority for improving the typical sion in which analysts are interested because it has a LSMS survey. "within-household" component in addition to the "between-household" component with which ana- Imputing Values lysts are concerned. With multiple observations for at In nearly all LSMS surveys, calculating a comprehen- least some households and under some reasonable sive measure of consumption will require at least some assumptions, it is possible to estimate the size of the imputations. Not all consumption is obtained through within-household dispersion and to correct the mcas- market purchases; if analysts want to calculate con- ures of the total. The Central Statistical Office of sumption in monetary units, they must find some way Zambia (1995) uses such a procedure to correct dis- of pricing its unmarketed components. 
In many ofthe persion measures for a Zambian survey; the technique poorest countries, and especially for the poorest peo- could usefully be applied elsewhere. ple, a large share offood comes from home production As always, measurement error will add dispersion or from hunting, fishing, or collecting wild foodstuffs. to measured consumption, so that if measurement These imputations for food are likely to be those that error is random, the dispersion it causes will be in are most important for the totals. However, there are a addition to the genuine dispersion that comes from number of other commodities obtained by the house- 1 16 CHAPTER 5 CONSUMPTION hold's nonmarket labor, such as homemade clothes or consumed, since these are magnitudes that the respon- wood and water fetched by children or women. dent observes, at least in principle. The value or price Household members often receive gifts or in-kind of these quantities can be obtained in several ways. payments that need to be priced before they are added Farmgate prices, defined as what the household could in to the consumption total. And it is necessary to get for its production, set a lower bound on valuation, obtain information on the consumption that comes since it is usually presumed that consumption is evi- from using durable goods. For nondurable goods- dence that the good is valued beyond what it would even those that are partly durable-it is probably safe fetch. Market prices, by contrast, are likely to be too to assume that consumption and purchases are one and high because they include transport and distribution the same. However, for large durable goods that are margins and because the commodity traded is often of expensive and last for many years, such as houses, cars, higher quality than its home-grown counterpart. or bicycles, it is important to try to make some However, once the quantity has been obtained, the adjustments. 
This subsection reviews the data required to make these imputations.

It should first be noted that imputation is an inherently difficult and error-ridden process. Imputation is likely to work best where there is relatively little need for it, when the economy is highly monetized but there is a relatively small amount of own-production (such as vegetable gardens) involving goods that have clear market equivalents. Imputation works badly in economies in which a large share of transactions do not pass through the market. LSMS procedures for estimating welfare stem from a theory of a consumer with well-defined preferences operating in a market where prices are well defined and unaffected by the agent's behavior. Where these markets do not exist, analysts are in effect imposing an accounting framework on the physical data, a framework of dubious relevance to the lives of the people being studied.

Food that is either home-produced or received as gifts or payment in kind has been the most important imputed item in LSMS surveys to date. In principle, the calculations are straightforward. The respondent is asked to report the values of any home-produced food items consumed by the household during the reference period, and the sum of these values is added to the consumption total. Given the seasonality of production, the recall period probably has to be a year, or at least a typical month over the last year. It may be possible to do better than this when there is a multiple-visit agricultural module in the survey. However, the major difficulties are with valuation, since the respondent is being asked a purely hypothetical question about the sale or purchase of an item that is rarely traded or that may have been traded some time ago.

The draft module (presented in Volume 3) recommends collecting data on physical quantities of goods

respondent could be asked to report one or both of these two prices or simply to estimate the value of the commodity directly. Some degree of cross-checking is possible from the quantities and prices of purchases reported in the agricultural module or from the prices gathered in the community questionnaire.

In some circumstances it may be possible (or important) to carry out a similar exercise for nonfood items. Clothes and furniture are often made at home, and household labor is used to collect firewood, dung, or water. These items have usually been omitted from past LSMS surveys, probably because of difficulties in valuation. One danger is that the welfare of poor households might be overstated by using inappropriate prices or wages to value their production or their labor. For someone with no other employment opportunities who ekes out a living gathering firewood or coal, it would be adding insult to injury to impute a high standard of living to them by valuing their time in terms of the market wage in a formal sector to which they have no access. In some circumstances the wage data from the community questionnaire may be a better basis for imputations based on the value of time, but these data do not eliminate the dangers inherent in the procedure.

It is incorrect to compute an extended concept of consumption by adding "expenditure" on leisure (in other words, the value of leisure at the market wage) to total consumption. This "full-income" concept has its uses, but it is a nominal measure and, like other nominal measures, must be converted to real terms before being compared across households or individuals. The problem here is that even if everyone in the comparison faces the same (or similar) prices for goods, they do not face the same price of leisure, because wage rates differ. As a result, before full income can be used as an indicator of welfare, it must be deflated by some price index that includes the price of leisure. Alternatively, if the value of leisure is to be added into the value of total expenditure, the same wage rate should be used to value everyone's leisure.

For durable goods, the consumption flow is best thought of as a rental equivalent or "user cost." This has two components: the opportunity cost of the funds tied up in the good (that could be realized through its resale) and the value of the physical depreciation of the good (through use of the good or passage of time). To estimate these magnitudes, some measures of depreciation and current value are needed. Perhaps the simplest way to obtain this information, at least for goods purchased in the previous five to ten years, is to ask respondents when they purchased the good and how much it cost at that time. These are both factual matters and will often be clearly remembered, at least in the case of large, important items. Provided that the good has been available for some time and purchases have been made relatively evenly over time, an estimate of the average lifetime of the good can be obtained by doubling the average age over all similar goods for all households in the survey. Once this approximate lifetime is known, the depreciated value of the good can be estimated given its age and original value, which is then used to calculate the first component of the user cost. It is also possible to ask respondents for direct reports of the current market value of the used durable, though there is no evidence on the accuracy of such (hypothetical) reports, and field tests are likely to be useful.

For housing, the largest of the durable goods, the imputation approach again starts from the rental equivalent. Unlike the value of most other durable goods, rents can sometimes be observed directly, and these are the correct numbers to add into the consumption aggregate. For households that do not report rents, the standard procedure is to impute a rent based on the characteristics of the house, as reported in the housing module. This is typically done through "hedonic" regressions in which reported rent is regressed on the house's characteristics (such as size, number of rooms, construction material, and location) and the results are used to calculate rents for other properties where rents are not reported. The credibility of these regressions is compromised if only a small fraction of the sample reports rents and, more generally, if those who report rents are unrepresentative of the population as a whole. While it is possible to make mechanical corrections for the selection, these corrections usually require arbitrary and untestable assumptions that further compromise the credibility of the process.

This is a difficult area. In general, survey analysts should make sure that indefensible imputations are not dominating welfare comparisons. The data required for rent imputations are gathered in the housing module (and to some extent in the community questionnaire) and therefore are not discussed further in this chapter.

A number of imputations come from other modules in the survey. The employment module gathers information on in-kind income provided by employers, including (free or subsidized) transport to and from work, food at work, and housing.

Respondents for the Consumption Module

Most LSMS surveys have interviewed a single respondent for the whole of the consumption module or for each part of it. The household is asked to determine the "best informed individual" who will respond to questions. This has the appealing feature of not prejudging the division of labor in a household, either by gender or by age, as would be the case if it were assumed that the wife does the shopping (or if any similar assumption were made). In many (perhaps most) countries, the single-respondent approach works well. In particular, this approach is satisfactory where food is a large share of the budget, where there is a common cooking pot, and where most of the household resources are pooled.

Even within resource-pooling households, it may be useful to have multiple respondents for some expenditures or different household members reporting on different categories of expenditure. While the person who does most of the shopping for food will know about this large share of the household's budget, another large share could be most accurately reported by the person who pays the housing and utility bills (who may or may not be the shopper). And there are other expenditures of which no one single person may have a very accurate picture. Individuals may not know how much "walking around" money other household members have, much less how they spend it, whether on bus fares, meals away from home, newspapers, tobacco, alcohol, or entertainment. Also, there may be larger items, such as clothing, that individuals purchase without any other household member knowing how much was spent. This is particularly true where several adults live in a household, each contributing some amount of their income to the household's joint expenses and reserving the remainder for their own use.

ized on a product basis, on an outlet basis, or on a purchase basis, and the forms can be designed to allow a large degree of prompting (for example, by listing
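The durable-goods calculation described in the imputation discussion above (lifetime as twice the average age, then a depreciated value and a user cost) can be sketched as follows. The chapter does not specify a depreciation schedule or an interest rate, so the straight-line depreciation, the 5 percent real rate, and all function names and figures below are assumptions made for illustration. Price changes since the purchase date are ignored.

```python
def estimated_lifetime(ages_in_years):
    """Approximate average lifetime as twice the mean age of goods in use."""
    if not ages_in_years:
        raise ValueError("need at least one observed age")
    return 2 * sum(ages_in_years) / len(ages_in_years)


def annual_user_cost(purchase_price, age, lifetime, real_interest_rate=0.05):
    """Yearly consumption flow from a durable good.

    Straight-line depreciation and the default interest rate are
    assumptions; the source text does not prescribe either.
    """
    if lifetime <= 0:
        raise ValueError("lifetime must be positive")
    # Share of the original value that remains after `age` years.
    remaining_share = max(1 - age / lifetime, 0.0)
    current_value = purchase_price * remaining_share
    # First component: return forgone on the funds tied up in the good.
    opportunity_cost = real_interest_rate * current_value
    # Second component: value lost to wear and the passage of time this year.
    depreciation = purchase_price / lifetime if age < lifetime else 0.0
    return opportunity_cost + depreciation


# Hypothetical ages (in years) of bicycles owned by surveyed households.
bicycle_ages = [2, 4, 6]
lifetime = estimated_lifetime(bicycle_ages)
cost = annual_user_cost(purchase_price=800, age=2, lifetime=lifetime)
print(f"lifetime={lifetime:.1f} years, annual user cost={cost:.2f}")
```

The doubling rule is only as good as the conditions stated in the text (the good has been available for some time and purchases were spread evenly), so the resulting figure is best read as a rough order of magnitude.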
For example, in a household with a mother and grown sons, the mother may pay for all the household expenses, including food and utilities, using her pension and the regular monthly sums given to her by her sons. However, she may have little or no idea about her sons' incomes or other expenditures, which could account for most of the household's total income and outlay. In this situation, no one person can give an accurate report of the household's income and expenditure, nor is the "household" really the relevant unit for analysis.

The LSMS has little experience in procedures for dealing with these situations. The literature discussed in the next subsection describes cases where each adult member of the household kept a diary of at least some categories of expenditure. It is possible in principle to interview each member of a household about at least some expenditures, such as those paid for by "walking-around" money. A more ambitious prospect would be to try to record incomings and outgoings for each member of the household who spends money. This would probably be prohibitively expensive for a general multitopic survey, although as always there are potential benefits from conducting experimental work either on a few households or within a special survey. Multiperson accounts are likely to provide a fascinating picture of how intrahousehold transfers of resources take place, who gets what, and what makes a group of people function as a household. This is an important research area but not a prime candidate for immediate incorporation into standard LSMS surveys.

Diaries Versus Oral Interviews

The use of consumption diaries, in which households are asked to record their purchases as soon as they make them, is common in full-fledged single-purpose consumption surveys. The ideal diary would yield a record of each purchase immediately after it takes place, thus eliminating the need for respondents to rely on their memories and removing any associated errors, including telescoping. The diaries may be kept by a single respondent or by several or all members of the household, so they can potentially help resolve the question of who is the best respondent for the whole household while simultaneously yielding information on intrahousehold allocations. Diaries may be organ-

many types of products) without the associated tedium of a long interview.

The use of diaries has some practical implications; the most obvious is that a person filling out a diary must be literate. Nevertheless, the diary approach has been used in household expenditure surveys in a large number of countries where literacy is not universal. This has been achieved by having the most literate member of the household (sometimes a child) help the one who does the purchasing fill out the diary, or by having the interviewer visit the household frequently, perhaps even daily, to help the household fill it out (Blaizeau 1998). In such cases, the distinction between a diary and an oral interview becomes blurred. This blurring occurs even in literate households, in which the members of the household may either forget to fill out the diary or get tired of doing so; in the U.S. Consumer Expenditure Survey substantial numbers of diaries are completed by the interviewer at the time of collection based on the respondent's memory. When collecting a diary, the interviewer examines it briefly and, if it appears to be incomplete, tries to prompt the respondent to fill it out more completely, essentially transforming the situation into an interview. To the extent that these "diaries" rely on respondents' memories, telescoping and recall bias again become potential problems.

A second logistical issue is that the diary must be both left with the household and picked up after its completion. If the diary period is relatively short (say, a week or two), this does not necessarily pose a problem in an LSMS survey, since the completion of the whole questionnaire typically involves multiple visits by the interviewer to the household so that each member can be interviewed and the length of each interview can be kept reasonable. However, leaving a diary with a household for a long period of time, such as a month or a quarter, would be more difficult and would not be possible within the current design of most LSMS surveys. Thus diaries are only usable for studying items for which a relatively short recall is appropriate, and not for a large proportion of nonfood expenditure.

Third, the use of a diary alters interviewing. A diary reduces the amount of time that the interviewer has to spend interviewing households that fill out the diary completely. However, using a diary may increase the time that the interviewer must spend traveling, since it requires an extra trip to the household to collect it; considerable time may also be spent helping illiterate households fill out diaries.

Using a diary also shifts the burden of response onto respondents. The effect of this shift is unclear; some survey statisticians speculate that households may enjoy the novelty of filling out diaries, and some think that diaries allow households to participate in generating survey data at a time and place that are convenient for them.

If these two speculations are true, then diaries may reduce the burden of surveys on households, making the households more willing to participate in surveys. Yet evidence from industrialized countries shows that keeping diaries is likely to deter households from participating in surveys and that the burden of keeping a diary causes respondents to drop out of a survey over time. There is also evidence that the rate of reporting declines with time, so that, in two-week diaries, more consumption is recorded in the first week than in the second. This has also been true for the Consumer Expenditure Survey in the United States, and for surveys in seven West African countries, as documented by Blaizeau (1998). In the 1995 and 1996 income and expenditure surveys in Belarus, the expenditures recorded in the second week were about 15 percent lower than those recorded in the first week of diary keeping (Martini and Ivanova 1996). In Armenia the diary was kept for four weeks, and the downward trend continued over this longer span. The second week's expenditures on food were 26 percent lower than those of the first, the third week's were 35 percent lower than those of the first, and the fourth week's were 40 percent lower than those of the first (calculations done for this chapter). This could be caused by respondent fatigue or by the fact that the novelty of diary keeping wears off. Also, the fact that they are keeping a diary may cause people to spend more or to shift their expenditures forward into the diary keeping period. Keeping a diary may cause people to think more about their consumption and perhaps take the opportunity to buy some items that they needed anyway. To the extent that diaries are not filled out every day, there is also scope for telescoping and recall errors within the diary period.

Having several members of the same household each keep a diary is also an attractive option in some circumstances, and there is considerable variation in practice among different kinds of surveys. For example, in the U.S. Consumer Expenditure Survey there is a single household diary, while in the British Family Expenditure Survey all adults in the sample keep individual diaries. There is also evidence on the use of multiple diaries from Hong Kong (Grootaert 1986), Papua New Guinea (Gibson 1998), and surveys in Benin, Burkina Faso, Côte d'Ivoire, Mali, Niger, Senegal, and Togo (Blaizeau 1997). This literature finds that multiple diaries can be useful for obtaining records of expenditures that would otherwise be missed, but that because it is difficult to get all household members to cooperate, attempting to collect multiple diaries can reduce response rates. It is also clear that for some household members (certainly children and perhaps some of the elderly), proxy or household-level reports are more accurate than individual diaries.

To the existing literature can now be added evidence from experiments in three former Soviet republics: Latvia, Armenia, and Ukraine. In the Latvian experiment (analyzed by Scott and Okrasa 1998), a nationally representative set of 300 households was given oral interviews and asked to keep diaries covering a comprehensive list of foodstuffs as well as a selection of nonfood items. In half of the sample, the diaries were administered first; in the other half, the interviews were done first. The order of the instruments had virtually no effect on the results. Overall, food expenditures were about 46 percent higher for the diary than for the interview; the coefficients of variation were quite similar. This pattern also held for 13 of the 15 subgroups of food and generally for the quantities as well as for the expenditures. For nonfood, the results were much more mixed. For one of the four categories, the expenditures reported in the diaries were significantly greater than those reported in the interviews; for two of the categories, the opposite was the case; and in the fourth category, the differences were not significant.

In Armenia, as part of a nationwide survey, one-quarter of the households in each cluster were given diaries for 30 days and interviews with 30-day recall periods were conducted with the remaining three-quarters of the households. The only diary data available to the authors in usable form are from the food sections of the diary, so the comparisons done for this chapter were limited to those items. Total expenditures for food recorded in the diary were about one-third higher than those resulting from the interview, a result similar to that in Latvia. However, this pattern did not hold as strongly for the subgroups of food. Of the 15 groups (which each included between 1 and 19 items), the diary yielded significantly higher expenditures than the interviews in eight cases, significantly lower expenditures in two, and not significantly different expenditures in the other four cases. There is no noticeable pattern to these results in relation to the mean expenditure, the number of items in the subgroup, or the average frequency of purchases. The coefficients of variation for the diary were twice as large as for the oral interviews. The same pattern of results generally held when the whole sample was divided into rural and urban areas, although the differences were slightly larger in rural areas. The diary estimates for total food expenditure were 43 percent higher than the interview estimates in rural areas and 31 percent higher in urban areas.

In Ukraine a diary was administered to about 500 households in selected locations as an experiment in conjunction with a national survey that used interviews. The diary used a recall period of four weeks for all items. The oral interview used a recall period of two weeks for food items and four weeks for nonfood items. The lists of items in the two instruments were not the same. The comparisons made here were limited to those categories that were either identical in diaries and interviews or that could be clearly mapped into each other. For instance, "butter" was an item listed as such on both lists, but the interview list contained a single question about "smoked sausage and other smoked meats" while the diary had separate items for "smoked sausage" and "other smoked meats." Thus, in this experiment, there are two possible effects that will work in opposite directions. On the one hand, the diaries might yield higher expenditures than the interviews, partly because of the greater disaggregation of the items on the diary list. However, the shorter recall period in the interview may have caused the interviews to yield higher expenditure numbers due to the greater proportional influence of any telescoping. Moreover, if diaries are kept less rigorously as time goes on, the shorter recall period of the interview may mean that higher numbers are reported in the interviews than in the diaries. These effects cannot be disentangled because data available to the authors contain only the subtotals for each item recorded in the diary, rather than each individual purchase of it.

In the Ukrainian experiment the expenditures for the subtotal of food items that could be matched between the two instruments are 10 percent lower in the diaries than in the interviews, a difference that is significant only at the 10 percent level of confidence. In only 3 of the 11 categories are expenditures significantly different between the diary and the interview. The category that covers bread and flour accounts for about two-thirds of the total differences found in the food subtotals. For the subtotal of nonfood items that could be matched between the two questionnaires, the diary subtotals were 7 percent lower than those for the interviews, a difference that was not significant.

There are several features of these three experiments that limit the extent to which they can be generalized to other contexts and countries. The populations of Latvia, Armenia, and Ukraine are literate, so these experiments cannot indicate what would happen if diaries were used in largely illiterate survey populations. All three countries have long traditions of household expenditure surveys that use diaries. In Latvia and Armenia the experiments were carried out by the statistical office using regular interviewers, so the experiments used experienced staff who were thoroughly familiar with the procedure. This was not the case in Ukraine, where the experiments were done by a private survey organization; this difference may account for some of the disparity between the two sets of results.

Past LSMS surveys have made less use of diaries than they could have, and perhaps more use should be made of them in the future. Certainly an ambitious statistical office in a developing country, looking to industrialized countries for inspiration, would use diaries. It is less clear whether, even in wealthy countries, there is indisputable evidence of the superiority of diaries to interviews that might justify such a decision. While diaries would produce better results if they were used in ideal circumstances, it is unclear whether their practical superiority over interviews has ever been convincingly demonstrated. And given the significant illiteracy rates in many poorer countries, the argument for switching to diaries is weakened even further. While there is no doubt that diaries can be used in situations where interviewers make many visits to the household to help them remember their purchases and complete the diaries, this is closer to diary-keeping by the interviewer than diary-keeping by the respondent. In countries such as Armenia and Latvia, where diaries have been routinely used in the past, there is every reason to incorporate diaries into LSMS-type surveys. However, at present there is no compelling case for introducing the standard use of diaries in LSMS surveys.

Data from Other Parts of the Questionnaire

While a great deal of the data required to calculate consumption aggregates come from the consumption module of the household questionnaire, other important data are usually collected in other modules of the survey. These are reprised briefly here to serve as a checklist for the overall survey design. If survey designers decide not to collect data on the items mentioned here in the other modules, the designers should ensure that data on these items are collected in the consumption module.

ROSTER. Analysts need to know the number of household members in order to compute per capita expenditure measures. If analysts intend to calculate equivalence scales, they need data on the age and sex of the members as well. Data on age, sex, and number of household members are never omitted from rosters.

HOUSING, WATER, SANITATION, AND FUEL MODULES. The housing module will collect most of the information needed to impute the use value of owner-occupied housing. It is also the usual place to gather information on utilities (such as electricity, piped gas, and telephone service) and on expenses for water, sanitation, and some kinds of fuel, although it is now being suggested that in some surveys these expenses may be moved to separate extensive water, sanitation, and fuel modules. (Such modules are discussed in Chapter 14.)

EDUCATION MODULE. Detailed data on household expenditures on school fees, uniforms, textbooks, supplies, bus fares, and so on are usually collected in the education module for each student, using recall periods that are considered to be appropriate for the category of expenditure. These expenditure data are used to compute consumption aggregates. Collecting this information in the education module makes it easier for the questions to refer to specific individuals, which is necessary for much of education analysis. The fact that detailed information is collected also means that the information is more likely to be complete.

Conceptually, there may be some overlap between some of the items recorded in the education and consumption modules. For example, does the category on children's clothing in the consumption module include or exclude school uniforms? Few LSMS questionnaires have been careful to specify this in the questionnaire itself, although it may be addressed during the interviewer training process.

HEALTH MODULE. Detailed data on health care expenditures such as payments to doctors or other medical professionals, for prescription medicines, and for lab tests are usually collected in the health module for each person who incurred such expenses during the recall period, which is usually the previous four weeks. Data on expenditures on health insurance and over-the-counter medicines are usually collected in the consumption module. Some surveys have also included a general question or two in the consumption module about expenditures on the items covered in more detail in the health module. In the consumption module the coverage is for the whole household and for a reference period of up to a year. The detailed data that are usually collected in the health module will generally give higher means than the aggregate questions in the consumption module, but the shorter recall period in the health module will result in higher variances. Analysts may then choose which method best suits their particular analysis. However, survey designers must ensure that at least one of the two data sets is included. If only one is to be included, the detail provided in the health module is probably preferable because it will yield more accurate means and support many health sector analyses.

EMPLOYMENT MODULE. The employment module is usually the place to gather information on households' consumption of goods provided in kind as a part of wages, as well as on household members' commuting expenses. Different analysts tend to handle commuting and other job-related expenses (such as child care, uniforms, and fees for professional associations) differently. Some even exclude these expenses altogether from the consumption total on the grounds that they do not increase the household's welfare. At any rate, it is useful to gather data on commuting expenses to give analysts the choice.

ENVIRONMENT MODULE. Households sometimes obtain important resources from the environment. Firewood and water for household use are perhaps the most familiar examples, although a wide range of plants, animals, and minerals can be gathered from wild or common property for use as food or fodder or as inputs into the household's enterprises, agriculture, or housing. To gather a full range of data on households' use of such resources would probably require a special purpose module and the modification of the consumption, agriculture, household enterprise, and housing modules to ensure full accounting while avoiding double-counting. This has yet to be done in an LSMS survey, though it has been done in some interesting single-purpose surveys (see Cavendish 1998). However, several past LSMS surveys have gathered information on at least a few of these resources, especially water and firewood. Such questions have typically covered the quantities used by the household, as well as in some cases the time spent collecting or carrying the goods or the distance traveled to find them. Survey designers need to decide how much emphasis to give to this issue in the survey; they also need to review all pertinent modules to ensure that modifications have been made where needed.

COMMUNITY AND PRICE QUESTIONNAIRES. The community and price questionnaires are the places to gather data on prices at the community level. If neither a community nor a price questionnaire is included in the survey, it is critical that either adequate regional price indices are available from some other source or, more likely, data on quantities of food and fuel consumed by households are obtained in the consumption module. It is also vital that the survey designers be willing to base regional price adjustments only on food, not using nonfood in the price index.

SAVINGS MODULE. Data on any regular saving by households can be gathered in either the savings module or the consumption module.

INTERHOUSEHOLD TRANSFERS. Many surveys have included a separate module to elicit data on flows of transfers in and out of the household. In general, it makes sense to put the questions about inflows and outflows of transfers together and to elicit parallel information about them. In this book, the inflows are discussed in the chapter on transfers and other nonlabor income (Chapter 11) and a page on outgoing transfers is included in the draft consumption module introduced in the following section of this chapter (and provided in Volume 3).

Draft Modules

Volume 3 presents one version of a consumption module to be included in an LSMS-type household survey. The draft module finds a middle ground in terms of length. Some questions could conceivably be deleted, and other areas could be explored in much more detail. When and how to make such adjustments is explained in the fourth and final section of the chapter.

Survey designers need not organize the submodules in exactly the order in which they are presented here. There seems to be no rigorous evidence about how different systems of organization affect the answers given by respondents. However, there are a few principles of organization that survey designers should bear in mind, as discussed below.

Organization

There are different ways of grouping items together: by type of item (such as food, fuel, or clothing), by place of purchase, by recall period, or by the kind of follow-up questions asked (about quantities, prices, or home production). The exact layout will depend upon the circumstances in the country of the survey (including what people consume and how they acquire it) and on the objectives of the survey. The survey designers must draft one or more versions of a questionnaire and field-test them rigorously to determine whether the proposed module actually works.

Comprehensiveness

The questionnaire should cover all types of consumption. It is often pragmatic to define some categories that contain groups of items (for example, "canned goods") in order to produce a comprehensive list of expenditures without making the list unduly long.

Specificity

It is important to list certain items individually. As a rule, items should be listed individually if they are important sources of calories or are particularly interesting to analysts in their own right. Examples include subsidized fuels and goods whose consumption is par-

items, the inventory of durable goods should not list "stoves" since it would not be clear whether stoves were included among "major kitchen appliances." In this case the durable goods inventory should either use the same category ("major kitchen appliances") or omit all such items. This rule applies for cross-checks within the same module or across two modules.

Customization

Each questionnaire must be customized to reflect the circumstances in the country of the survey. The lists of consumption items must be customized to reflect local consumption patterns and local terminology. The recall periods for these items should depend on the shopping patterns of the local people.

Box 5.1 Cautionary Advice

* How much of the draft module is new and unproven? The draft consumption module presented here closely follows the approach taken in many previous LSMS surveys.
* How well has the module worked in the past? The consumption modules in most previous LSMS surveys have produced data that have appeared to many analysts to have reasonable magnitudes and to show expected patterns of consumption by categories of items and over types of households. However, field tests of this module in some countries in Africa showed that respondents had difficulty reporting data on quantities of goods purchased, so questions about quantities purchased were deleted in those countries. In some sur-
veys respondents have seemed to be confused about whether to report a total value and a quantity or a Annotations to the Questionnaire value per unit of purchase and a quantity Where this confusion arose, it was necessary to delete some This section consists of a series of notes and annota- observations from the resulting data set or even to dis- tions to each of the draft submodules introduced in card all of the data with a short recall period.To mini- the to ection and presented inVolume in mize this risk the questions should be clearly formulat- the previous section and presented inVolume 3.These ed, the order of the questions should be thoroughly notes are followed by several general guidelines for field tested, and the interviewers should be fully trained. dealing with some special circumstances within which However, the most effective insurance would be to use a survey may operate. The section ends with some two recall periods, at least for food-thus ensuring that advice about how to keep the module as short as pos- there is always a backup measure of consumption in sible, consistent with the need to gather sufficient data the data set. to support the analysis of the most important policy Which parts of the module most need to be customized? Survey designers should base the r decisions about issues i the country. which items to include in the different submodules on local consumption patterns. If a comparable survey was Part A: Daily Expenditures previously fielded in the country and if the consump- The items included in this submodule will vary from tion module in that survey was designed broadly in country to country. The idea is to capture (using a accordance with the best practices described in this short recall period) the small, repetitive miscellaneous chapter, the designers of the new survey should use the transactions that many people engage in almost every same recall periods and the same degree of item dis- day. 
Individuals often purchase the items listed here agg-egation as in the previous survey. with their "walking around money." As such, this is one of the places in the consumption module where ticularly favored or disfavored by policymakers it would be feasible to collect individual-specific data. ("merit" goods and "sin" goods). The list of items used in the draft submodule is a mixture of the lists from various previous LSMS sur- Double-Counting veys. Other items that could be included are: khat, It may sometimes be useful to create a cross-check by flowers, gasoline, firewood, haircuts, shaves, baths, and gathering data on the same items in more than one tips. place in the questionnaire. In these cases, the questions In this submodule, there is a special grid for meals should be worded carefully so that the analyst knows consumed away from home. Experience has shown that how to exclude one or the other to avoid double- the more detail appears in the grid, the higher the num- counting these data in the estimate of total consump- bers reported will be. In Jamaica the 1993 survey used tion. For example, if purchases of "major kitchen questions that were individual-specific, and the estimate appliances" are listed among consumption expenditure of expenditures on food purchased away from home 124 CHAPTER 5 CONSUMPTION accounted for 5 percentage points more total food con- ing a source of confusion in the interview at the small sumption than the 1992 survey, which used a single cost of losing a piece of ambiguous data. For other question on the daily expenditures grid (Table B-6 in items, taking care to keep the subgrouping reasonably Statistical Institute and Planning Institute of Jamaica homogenous may help resolve the issues of establish- 1994; Table B-4 in Statistical Institute and Planning ing quantities and estimating caloric content. "Leafy Institute of Jamaica 1995). 
If the grid is individual- green vegetables" and "potatoes, sweet potatoes, and specific it is generally best for each person to respond other tubers" are more internally homogenous cate- for him or herself rather than to use a single respondent gories than "vegetables." Whether or not they adopt as specified in the draft submodule. Such an individual- these approaches, survey designers need to bear the specific page should be adjacent to other individual- problems of grouping in mind. specific modules. Care may also be needed to avoid It is usual, and probably helps respondents double-reporting subsidized lunches in the factory can- remember the necessary information, to list similar teen or the schoolyard, as these may also be captured in items together. In past surveys this has usually meant the employment or education modules, placing botanically similar items side by side-in other This draft submodule is modeled after the one words, all meats together and all fruits and vegetables used in the Kazakhstan questionnaire because the sub- together-and, within each group, placing first the module from Kazakhstan yielded mostly answers that items that are more commonly consumed. Such place- made intuitive sense. The number of meals reported ment probably reflects people's shopping patterns in was mostly divisible by five, indicating regular patterns most countries; vegetables may be in one section of associated with the work week, and the unit value for the market and meats in another, or people may buy the meals seemed plausible. their vegetables from the greengrocer and their meat from the butcher. If a very long list of items is used, it Part B: Food and Fuel might make the interview more manageable to divide This submodule gathers information on the food and the list into subgroups in a more explicit way than is fuel consumed by the sample households. shown in the draft submodule, with subheadings for each subgroup. 
Then each subgroup could have a fil- LIST OF ITEMS. The list of items that should be includ- ter question such as "did your household purchase any ed in this submodule will vary from country to coun- meat since my last visit?" After that, separate questions try. The full-scale expenditure survey done every five on each kind of meat would follow. Some surveys sup- or ten years in most countries to provide weights for plement this general filter question by showing the the consumer price index has served as the basis for respondent a card that either lists or illustrates the development of the lists of foods in past LSMS (depending on the degree of literacy in the country) surveys. Special care should be taken to itemize goods the various items in the subcategory as a prompt. that contribute substantially to the total number of However, no past LSMS survey has tried this. calories consumed and to expenditures, as well as The list of foods used in the draft submodule as an goods that are (or are most likely to be) subsidized. illustration came from the Pakistan LSMS question- Thus, in Central America, rice, beans, and tortillas naire (with the addition of beer and other alcoholic should each be listed separately rather than in a group beverages, which are important items of consumption of "starches" or "basic grains." in many countries, though not in Pakistan). Thus the To prevent the hst of items from becoming list is specific to Pakistan and includes items (dal, gur, unwieldy (in the order of hundreds), there will have to and ghee) that may be inappropriate to list in other be some grouping of items, using categories such as countries. The list also omits some other items that "canned foods" or "vegetables." These groupings, should be included in other countries, such as pork, while necessary, can pose problems for interviewers in cassava, yams, tomatoes, papayas, and bananas. 
The key eliciting answers about quantities of the item con- point is that the list must be customized to reflect local sumed and for analysts in trying to establish the nutri- patterns of food consumption. ent content of the items. There is no perfect solution to this problem. For some items it may be pragmatic RECALL PERIOD. The recall period used in this sub- to black out the quantities question, thereby eliminat- module is the time since the interviewer's last visit to 125 ANGUS DEATON AND MARGARET GROSH the household. Questions in the food submodule pre- If questions were in the reverse order, a respondent sume that the interviewer's last visit to the household might give the interviewer an answer that referred to was about two weeks before the interview on con- the household's expenditure per unit of quantity rather sumption; this was generally the case in the old proto- than for the total purchase. Instead of"I spent 50 pesos typical fieldwork plan where each primary sampling on meat. I bought two kilos," the informant might unit was visited by the survey teams twice, two weeks answer "I bought two kilos" and "I paid 25 pesos."The apart, and different sections of the questionnaire were respondent would mean 25 pesos for each kilo, not in administered in each visit.The two-visit routine served total, and the true response would be misrecorded. In a number of functions other than providing the recall Pakistan this problem occurred often enough to call period; however, now that data entry operators often into question the accuracy of all of the short period travel with the field teams rather than remaining in a data. However, in the Panama field test, the questions regional office, the logistical reasons for the two sepa- seemed to work better with the quantity question rate visits are being eroded. Moreover, in many sur- placed before the expenditure question. 
veys, especially ones that have shorter questionnaires than the full LSMS survey, only one visit is ever made FUEL. Fuel is placed in the same submodule as food to the primary sampling unit and the entire question- because it is the main category of nonfood that is pur- naire is used in that visit. In single-visit scenarios the chased in convenient, standard units. This follows the recall question (Question 2) should be changed from principle that wherever quantities can be collected "since my last visit" to "in the past two weeks." In easily and accurately, they should be. Hence it is con- countries with high inflation rates or where other venient, though not usual, to put fuel in the food sub- expenditure surveys use a shorter reference period, module so that the follow-on questions on quantities survey designers may shorten the recall period to a will apply. week or to whatever period matches that of the other survey. HoME PRODUCTION. There are several options about how to arrange the questions on home prodtiction BARTER. If barter is common in the country of the (Questions 7-9). The option most commonly used in survey, it can be included in the wording of the pur- previous LSMS surveys has been to place these ques- chase question (Question 5). If barter is very impor- tions either together with food purchases, as is done tant, survey designers may wish to give it a question of here, or in a separate submodule on home production. its own. It will be particularly important to pilot-test Occasionally the consumption of home-produced the wording of the purchase question as this can often food has appeared in the agricultural module as part of be awkward. the means by which a crop is disposed of. Since agri- culture modules are becoming increasingly infrequent QUANTITY. 
The quantity question (Question 4) applies in LSMS surveys and since they may not always to the short recall period question, so that when a urnit include crop disposal questions, the following discus- value is derived from this data, it refers to a specific sion concentrates on the first two options. time period. To have a unit value (quasi price) for Whether the questions on households' consump- some indeterminate month during the year is not very tion of home-produced food should be placed adja- helpful where there is even modest inflation or sea- cent to the questions on their purchases of food or in sonal variation in prices. a separate submodule is an issue that will be affected by both the percentage of households that produce ORDERING OF QUANTITY AND EXPENDITURE. There is their own food and the range of goods produced in some debate about the proper ordering of these two this way. If questions on home-produced goods are questions (Questions 3 and 4). Essentially, the field test placed in a separate submodule, it is simple to add the should be the guide. Expenditure has been placed first filter question, "Has your household consumed any in the draft submodule for two reasons. First, it is the food produced at home?" Then it is possible to avoid more important piece of data. Second, collecting asking a long list of inapplicable questions to house- expenditure first is likely to reduce the risk of ambigul- holds that do not produce any food at home.This will ity or misunderstanding on the part of the respondent. be most appropriate when home production is not 126 CHAPrER 5 CONSUMPTION very common. Ultimately, what works best should be from employers.This is not included in this draft sub- determined by the field test. Do respondents find it module, because it already appears in the various easier to think about all of the sources of a single food- employment submodules. stuff together? 
(For example, I bought some tomatoes, I grew some tomatoes in my garden, or my coworker USE OF ENVIRONMENTALLY PROVIDED GOODS. If survey gave me some tomatoes?) Or is it easier for them to designers wish to try to put an explicit value on the use think about their food budget in terms of the source of all environmentally provided resources, the list of of that food? (For example, when I shop I buy beans, food items should include any food items that are like- tortillas and milk, but in my garden I grow tomatoes, ly to be gathered, fished, or hunted. In addition, the papaya, and bananas.) The decision about whether to wording of the questions may be changed or addition- put these questions in a separate submodule will not al questions may be added to clarifr that environmen- materially affect the length of the interview, although tally provided items are included in the questioning. separating home production from purchased food may take up more paper and make the questionnaire phys- Part C: Nonfood Consumption ically longer. Again, the list of items to be included in this submod- It is not necessary to have a home production ule is country-specific and can usually be derived from question for every food item in the list, since some are the survey used to weight the consumer price index. industrially manufactured items that cannot be made The list is likely to contain more groups of items (for in people's homes. Even some items that are not example, "clothing") than were contained in the food industrially manufactured are not commonly pro- list.The nonfood list should be designed with the fol- duced in people's homes and can likewise be omitted lowing objectives: to cover all aspects of the house- from the list of home-produced items. 
In the draft of hold's budget that are not covered elsewhere in the the submodule inVolume 3, the boxes are blacked out household questionnaire; to cover these in a logical in the home-production columns for items that are manner that respondents find congenial; to gather unlikely to be produced at home. information on specific items that may receive heavy The quantities of food produced at home should subsidies or attract heavy taxes (for example, kerosene, be regarded as the most important data gathered about gasoline, tobacco, or alcohol); and, sometimes, to gath- home food production.The quantity of a good is a fac- er the data in a manner that will enable analysts to tual, observable piece of information. Questions about study intrahousehold issues. value are more hypothetical, since by definition a For most items it will not be practical to collect home-produced item does not pass through the market. data on quantity (or, implicitly, unit values) due to the Theoretically, a farmgate price would be too low and a difficulties of establishing meaningful units, especially market price too high. There seems to be no evidence for groups of goods. For items where meaningful units as to which price the consumer of home-produced can be established (usually fuel), data on the quantity food is likely to know more often or which answer he consumed should be gathered. or she would give in response to a general question on Note that the boundary between consumption the "value of home production." The draft submodule items and durable goods is somewhat fuzzy. Many of includes a general (ill-defined) question on the "value of the items usually listed in the consumption module home production" on the grounds that-for goods that (and included in this draft submodule) may actually have markets-the food purchases questions, communi- last for more than a year. 
Kitchen equipment (includ- ty questionnaire, or both gather information on prices ing cups, forks, plates, and saucepans), furniture or unit values from food purchases and may gather data (including beds, tables, cupboards, chairs, and rugs), on farmgate prices in the agricultural module (if there and linens (including sheets, towels, and blankets) all is one). Thus the "value of home production" con- appeared in the consumption module of the Cote tributes a different set of information, at least on the d'lvoire questionnaire rather than in the durable goods aspect that the respondent feels is most pertinent. module, even though all of these items were likely to last for more than a year. Nonetheless, so many differ- GIFTS. In some past LSMS surveys, the gifts section ent items may be listed that it seems pragmatic to included the value of food received as payment in kind measure their consumption via the flow into the 127 ANGUS DEATON AND MARGARET GROSH household rather than trying to enumerate the whole Part E Durable Goods stock item by item and then compute a use value for The items in the durable goods list should be items it. that last substantially longer than a year and are so large The list of items in the draft submodule is derived in relation to the household's standard of living that from several different questionnaires from previous they can be separately enumerated and respondents LSMS surveys, principally the surveys in Jamaica and can accurately remember information about their pur- Nepal. chase after several years have gone by. A car would The same consideratioins that apply for food apply nieet this definition of a durable good but a shirt to the barter and gift of nonfood goods and to non- would not-even though both may last for several food goods (such as firewood) gathered from the envi- years. ronment. 
Home production is theoretically of interest The durable goods page in this submodule is but has rarely been specifically enumerated in past divided into two blocks so that households can report LSMS surveys. If survey designers decide that home- on the value of two of the same kind of item (for produced nonfood goods are important, they can be example, two bicycles). Most past LSMS questionnaires handled in a manner similar to that for home-pro- have included a list of a dozen or two different kinds of duced food. durable goods, though the lists have usually been longer in the surveys fielded in the countries of the Part D: Expenditures on Private Interhousehold Transfers former Soviet Unionr. The most appropriate items to Respondents may find it most logical to be asked include in the durable goods list will vary from coun- questions about their income from and expenditures try to country.What is considered expensive enough to on interhousehold transfers in the same place in the be a durable good will also vary from country to coun- questionnaire. In this book, however, questions on try. For example, cooking pots were on the durable household expenditures on private interhousehold goods list in the survey in Kagera, Tanzania, whereas transfers are included here and questions on income the list in the Jamaican survey included satellite dishes. from interhousehold transfers are included in the The most common durable goods may also vary by cli- module on transfers and other nonlabor income mate. In tropical countries air conditioners or fans will (introduced by Chapter 11). The two submodules often be on the list, whereas in cold climates various were developed together and can be positioned adja- sorts of heaters will be listed. Culture also plays an cent to one another in the household questionnaire in important role. Jewelry, carpets, and guns were listed in a separate module. 
Pakistan but are unlikely to be listed in the durable Two versions of this submodule are included. goods submodule in many other countries. The short version is designed to obtain basic infor- mation when the consumption module is being Special Circumstances shortened as much as possible and the study of pri- This subsection provides survey designers with advice vate safety nets is not deemed important. The longer on how to deal with various special circumstances that submodule includes questions about the recipients they may face in planning and designing a survey in and their relationship to the donor or head of the their particular country. household, as well as on the amounts, regularity, and purpose of the transfers. The answers to these ques- DEALING WITH INFLATION. In countries with high tions supply analysts with a lot of information for inflation it may be necessary to modify the question- studying transfcrs. naire. (However, if inflation is very high, it is not clear For Question 12 it would be useful to get as much how well these suggested modifications will work.) detail on the destination as can easily be coded-for First, the recall period should be shortened to the example, to the district or county level. For Questions shortest period that is reasonable for the type of item 12 and 13 the codes should coincide with those used in question. The recall period on food might change in the migration module. Questions 6 and 7 use the from the previous two weeks to the previous week. same codes as are used on the roster, which is why The "usual-month" questions on food might be code I is missing. (On the roster module it is for the dropped altogether, since it would be unclear to which head of household.) 
time period they referred, making it impossible to 128 CHAPTER 5 CONSUMPTION deflate the expenditures appropriately.The recall peri- Keeping the Module as Short as Possible od for nonfood items might be shortened from a year It is important to avoid tiring or annoying respondents to three or six months. Shortening the recall periods with an overly long interview. However, there are rel- could increase the variance of the estimates, but in any atively few options for shortening the consumption case it would be impossible to interpret any means for module, aside from using the shorter of the two ver- data collected when prices were very different. sions of the submodule on interhousehold transfers. As Moreover, inflation will cause people to make more explained in the second section of this chapter, reduc- frequent purchases (so that the real value of their ing the number of items or categories for which infor- money does not diminish), so the tradeoff between mation is collected can easily lead to an underestima- biases in the mean and variance may be less than in tion of consumption. Also, getting a comprehensive places where inflation is low. measure of consumption requires inquiring about The second modification that can be made to take purchases, home production, and gifts of all com- inflation into account involves asking respondents modities consumed by the household, so it is not easy about the dates when they made certain large purchas- to reduce the number of questions about each item. es so that analysts can deflate these figures appropriate- There are a few questions that could be cut in the ly. Third, if a currency other than the national curren- draft submodules, though this would result in a loss of cy has become a de facto unit of account, survey important data in each case. Probably the most designers might allow respondents to give their answers expendable question is Question 7 of Part E, the last in either the local or the international currency. 
of the series for durable goods. Removal of this ques- tion would mean that the valuation of durable goods CREDIT. If the use of consumer credit is of particular would rest solely on the assumption that the average analytical and policy interest in the survey country, a life of the good is twice the average age reported short module on households' use of credit (particular- across all households. ly for purchasing food) could be inserted after the It would be possible to drop Question 9 in the food submodule, and questions on purchases on cred- food submodule, and instead to value home produc- it could be added to the durable goods submodule, as tion using data from the expenditure questions as is suggested in Chapter 21 on credit. well as the community questionnaire.There is an ele- ment of risk in this, since households would no SUPPORTING INTRAHOUSEHOLD ANALYSIS. If supporting longer have the option to express their answers in intrahousehold analysis is a special goal of the survey, value terms, which they may have found easier to use the consumption module can be reorganized so that at than quantity terms. Of course, this option is only least some of the information is individual-specific. available if there are questions on quantities in the The main principle is to classify all items according to food expenditure submodule of consumption or if two criteria. The first criterion is whether they are data on prices are gathered in the community ques- consumed individually (like a taxi fare or a shirt) or tionnaire for all items on the home-produced list. jointly (like washing powder or television use). The Alternatively, survey designers could drop the quan- second criterion is whether it is easy to distinguish if tity questions about purchases and home production. the item was consumed by one household member or (Note that the quantity and the value questions on another. 
Food eaten at home is individually con- home production cannot both be dropped simulta- sumed, but, since it is usually purchased and prepared neously.) However, quantity is an important piece of for the household as a whole, it is difficult to distin- data in its own right, so dropping quantity questions guish how much is consumed by each individual. On reduces analytical potential. Most importantly it the other hand, it is easier to assign the consumption means that prices from some other sources are of food consumed in restaurants or food stalls to par- required to derive a calorie-based poverty line from ticular individuals. After the items that are individual- the data set. ly consumed and easily distinguished as being con- A final option would be to drop the dual recall sumed by a given individual have been determined, periods in the food submodule or the nonfood sub- these items can be listed on a separate grid that the module. Doing this would be risky, since if something interviewer fills out for each individual. wvent wrong in the fieldwork analysts would be unable 129 ANGUS DEATON AND MARGARET GROSH to use the data set to calculate what is arguably the Bouis, Howarth E., and Lawrence J. Haddad. 1992. "Are Estimates most imiportant variable in the whole enterprise- of Calorie-Income Elasticities too High?A Recalibration of the total household consumption. Because food is more Plausible Range" journal of Development Economics 39: 333-64. important in the total than nonfood, it is preferable to Bourguignon, Francois, and Pierre-Andre Chiappori. 1992. drop the dual recall periods for nonfood items before "Collective Models of Household Behavior:An Introduction." dropping those for food items. European Economic Review 36 (April): 355-64. Bourguignon, Francois, Martin J. Browning, Pierre-Andre Note Chiappori, and Valerie Lechene. 1993. 
"Intrahousehold Allocation of Consumption: A Model and Some Evidence The authors are grateful for useful conimsents to Richard Blundell, from French Data." Aninales d'Economnie et de Statistique 29 Martin Browning, Paul Glewvwe,John Hoddinott,Alberto Martini, January-March): 137-56. Andrew McKay, Raylynn Oliver, and, especially, Christopher Scott. Bradburn, Norman M., Janellen Huttenlocher, and Larry Hedges. The Statistical Institutes ofArmenia, Ecuador, El Salvador,Jamaica, 1994. "Telescoping and Temporal Memory." In Norbert and Latvia, and the Kiev Institute of Sociology in Ukraine con- Schwartz and Seymour Sudman, eds., Autobiographical Mvemory ducted experiments in the collection of consumption data that are and the Validity of Retrospective Reports New York: Springer- summarized. Eric Edmonds, Dean Jolliffe, Kinnon Scott, Diane Verlag. Steele, Tilahun Temesgen, and Wlodek Okrasa contributed to the Branch, E. Raphael. 1994. "The Consumer Expenditure Survey: A analysis that underlies some of the discussion. Comparative Analysis." Monthly Labor Review 117 (12): 47-55. Browvning, Martin, Francois Bourguignon, Pierre-Andre References Chiappori, and Val&rie Lechene. 1994. "Income and Outcomes: A Structural Model of Intrahousehold Allocation.' Ahmad, S. Ehtisham, and Nicholas H. Stern. 1991. The Theory and Journal of Political Economy 102 (December): 1067-96. Practice of Tax Reform in Developing Countries. Cambridge: Cavendish, William. 1998. "The Complexity of the Commons: Cambridge University Press. Environmental Resource Demands in Rural Zimbabwe." Atkinsor, Anthony B., anid Johll MNickelwright. 1983. "Oni the Reh- University of Oxford, Centre for the Study of African ability of Income Data in the Family, Expenditure Survey Economies, Oxford. 1970-77.:Journal of the Royal Statistical Soaiety SeriesA 146:33461. Coder, John. 1991. "Exploring Non-Sampling Errors in the Wage Baker,Judy 1996. Personal correspondence.Washington, D.C. 
6 Household Roster
Paul Glewwe

Almost every household survey instructs interviewers to make a list of all the members of each household in the survey. This list is often called the household roster. In LSMS and similar multitopic surveys, the household roster serves three distinct functions. It determines who is and who is not a member of the household. It collects some basic information on each member of the household. And it can be used to collect information on close relatives of household members (parents, children, spouses, siblings) who are no longer, or have never been, members of the household. The collection of information on these household "associates" is unusual, and thus distinguishes LSMS-type surveys from most other household surveys.

This chapter provides advice on how to design the household roster of an LSMS-type household survey. The first section discusses the three basic functions of the household roster. The second section introduces the draft household roster module (which is provided in Volume 3 of this book). The third section presents detailed explanatory notes on the draft module.

The Three Main Functions of the Household Roster

The household roster has three basic functions in LSMS and other multitopic household surveys: determining household membership, collecting basic information on household members, and collecting information on nonresident family members.

Determining Household Membership

Perhaps the most important function of the household roster is to determine which individuals are members of the household. This function is important because the vast majority of information collected in the other modules of the household questionnaire pertains only to household members. If a household member is mistakenly classified as not belonging to the household, no information of any kind will be gathered about that person in the rest of the household questionnaire.

The first step in determining household membership is to define what is meant by a household. For the purposes of conducting a household survey, the standard definition of a household is a group of people who live together, pool their money, and eat at least one meal together each day (United Nations 1989).¹ While most people who live together do pool their money and eat together, there are some exceptions. For example, unrelated individuals may share a dwelling to minimize housing costs but may eat separately and not pool their money. In this case each individual should be considered as living in a separate household. A second example is when two distinct families live in the same dwelling but allocate some rooms to one family and the rest to the other. They may or may not eat together, but if they do not pool their income they should be considered two separate households. A third example is household servants. Servants are generally not considered household members because they keep their income separate from the income of the household that employs them, even though they may eat some meals with the family. A final case is individuals who rent a room in a dwelling that belongs to one family. A renter may eat some meals with the family, but as long as the renter keeps his or her income separate from that of other household members, he or she should be considered a separate household.

The definition given above of what constitutes a household is clear for the vast majority of living situations. However, unusual situations arise in many countries. For example, in some polygamous societies a man may have several wives, each of whom has a separate dwelling. In this case each wife (and her children) constitutes a separate household. Which household the man belongs to is difficult to say. In this and other unusual cases flexibility is needed; one course of action is to adopt the definition of a household used in previous surveys in the particular country. An important principle to follow in these situations is that each person in the population should be assigned to one, and only one, household.

After a definition of what constitutes a household is settled, the next step is to assign each person in the population to one, and only one, household. If some people are not assigned to any household, they will not be represented in the survey; if other people are assigned to two or more households, they will be overrepresented in the survey. The main difficulty is that over long periods of time, many people move from one household to another. The longer the period of time, the more common this is. To avoid these potentially troublesome situations, survey designers may wish to define household membership over a relatively short period of time. At one extreme, this might be the previous 24 hours (in effect, those who spent the night at the household surveyed), which would make the number of ambiguous cases quite small. On the other hand, many socioeconomic phenomena of interest to policymakers involve activities and situations that last for a considerable period of time. For example, several modules described in other chapters of this book, including the modules on consumption, income, employment, and agriculture, use a 12-month recall period. To be consistent with these modules, it is recommended that LSMS and similar multitopic surveys define household membership over the previous 12 months.

How long must a person reside in a given household in order to be considered a member of that household? The main criterion for determining whether someone has been a household member during the previous 12 months is the number of months the individual has lived in the household during that time. A person who has lived in the household for all of the past 12 months should certainly be considered a household member. A person who has lived in the household for more than six months should also be considered a household member, since it is not possible for that person to be a member of any other household for more than six months. One potential problem is people who have lived in one household for exactly six months and in another household for six months. If household membership includes people who have lived in a household for 6 of the past 12 months, such persons have, in theory, a double chance of being surveyed. In practice this problem is probably minor because it applies to only a very small percentage of the population.

A more troublesome case is an individual who has lived in three different households during the previous 12 months, spending 4 months in each household. If household membership is defined as living in a household for 6 or more of the past 12 months, such individuals will never be sampled in the survey. However, if the threshold is 4 months or less, these people will be three times as likely to be included in the survey as people who lived in only one household during the previous 12 months. The prudent thing to do here is to err on the side of caution by setting a relatively low number of months as the threshold for considering a person to be a household member. The threshold set in many past LSMS surveys has been three or more months. This is prudent because as long as the number of months each person lived in the household during the previous 12 months is recorded in the household roster, analysts can decide for themselves whether to include as household members people who have spent 6 or fewer months in the household.

A related problem arises when people move between two or more households on a regular basis. For example, the main income-earner of a given household may work away from home during the week but return home on weekends. In general, the rule to apply is to calculate the percentage of time that the person has been in the household during the previous 12 months, express this in terms of a number of months, and then apply the rule described above. One exception to this rule is a person whose time away from the household was clearly spent in an institutional setting, such as a workers' dormitory. Since that person is not in danger of being double-counted (because he or she did not live in any other household during the previous 12 months), he or she can be counted as a household member.

There have been some exceptions to the rule commonly used in past LSMS surveys that any person who has lived in the household for 3 or more of the past 12 months is considered to be a household member. First, a newly born infant is typically considered to be a household member even if he or she is less than three months old because the infant was obviously not a part of any other household during the other months. Second, any household member who has died during the previous 12 months is generally not considered a household member. The reason for this is that this person cannot provide answers for himself or herself, and it is often uncomfortable for other household members to answer extensive questions about someone in the household who has recently died. (An exception is that the fertility module introduced by Chapter 15 does collect information on any children who have died.)

Another possible exception is someone who has recently become a member of the household and clearly has not been a member of another household during the previous 12 months because he or she was living in some kind of institutional setting (for example, student housing, a military barracks, or a prison) or in a foreign country. For example, in the 1992-93 Vietnam LSMS survey, all recently demobilized soldiers were counted as household members even when they had only been members for one or two months. Another exception to the rule is that the person designated as the head of the household is always considered a household member regardless of the amount of time that he or she has spent in the household during the previous 12 months. A final exception that has been used in some previous LSMS surveys is that a new spouse, usually a wife, who has joined a household is a member even if he or she joined only one or two months ago. In Vietnam this rule was used because Vietnamese culture dictates that this person is a household member. Classifying these people as nonmembers may offend some members of the household.

Collecting Basic Information on Household Members

Once all of the household members have been identified, what information should be collected about them? The information should be limited to the most fundamental characteristics of each individual; data that pertain to other modules should be collected in those modules. The standard information to collect in the household roster is:
* The person's name.
* The person's date of birth (if known).
* The person's age (in years for adults, in months for children age 12 and under).
* The person's sex.
* The person's relationship to the head of household.
* The person's marital status.
* The number of months that the person has lived in the household during the previous 12 months.

This information is straightforward and uncontroversial. However, a few comments are required. The main purpose for obtaining the name of the person is to allow the interview to proceed smoothly and naturally. There is no reason to provide the names of household members to analysts who want to use the data.² This would violate the statistical regulations governing almost all official household surveys, which typically guarantee the confidentiality of information provided by respondents.

The reason for collecting data on the respondent's date of birth is that many people in developing countries, especially elderly people, have some difficulty remembering their exact ages. If a person's birth date is obtained, that information, along with the date of the interview, enables analysts to calculate the person's exact age (see Chapter 4 on metadata). Another reason for collecting both age and birth date information is that occasionally a person may unintentionally make an error when answering one of these questions. This information is sufficiently important that the responses to these two questions should be compared to check the accuracy of the ages of all household members.

It is important to obtain a record of the number of months that the person has lived in the household in question during the previous 12 months in order to ascertain who is, and who is not, a household member.

Many households include married couples, and it is useful in analysis to match each household member to his or her spouse. In some households this will be obvious, but in households that contain extended families it may not be clear. Therefore, in general, each person who is married should be matched to the household member who is his or her spouse. If the spouse is not a household member, this can also be indicated in the household roster.

It is sometimes useful to collect other information about household members in the household roster, such as information on ethnicity, religion, and nationality. In countries where intermarriage between members of different ethnic groups is rare, questions about ethnicity need be asked only once, for the household as a whole. In this case the information should not be collected in the household roster but in the household identification page (see Chapter 4). If intermarriage between members of different ethnic groups is common, each person should be asked about ethnicity separately in the household roster, and a "mixed" category may be needed for the children of interethnic marriages. The same is true for religious affiliation; if it is rare to find people of different religions in the same household, this can be asked once for the whole household, but if it is common, each person should be asked individually in the household roster. Finally, in some countries a substantial fraction of the population may have migrated in from another country and thus consist of people who are not citizens. Data about citizenship can be collected in the household roster if noncitizens constitute a sizable fraction of the population.

Before moving to the next topic, it is useful to consider two other types of information that some survey designers may want to collect in the household roster: languages spoken and the recent location of household members. In some countries many people speak a language other than the official language of the country. Information on languages spoken may be particularly relevant for analyses of education and employment. One or two questions could be added to the household roster to record the languages spoken by each household member above a certain age. However, in most cases it is more convenient to collect this information in the education module. In particular, the standard education module in Chapter 7 asks household members to read a short sentence, and the notes to that module suggest that this could be done for more than one language. This is a natural place to ask a couple of questions about the languages spoken by household members.

Another piece of information that may be useful in analysis is whether each respondent is currently living in the household or, more specifically, the number of days he or she has lived in the household during the past week or month. This would be useful for explaining why a particular individual could not answer questions for himself or herself, but it is even more useful for analyses of migration. In most surveys such questions should be included in the migration module.

Collecting Information on Nonresident Family Members

Most LSMS surveys collect information on the parents and children of household members regardless of whether these people are household members. This information is collected because it gives analysts and policymakers a better understanding of parent-child relationships within the household, because it illustrates the links between the household surveyed and other households, and because the data can be useful for applying certain econometric techniques, particularly the instrumental variables method.

Consider first the case of parents. As with spouses, when extended families live in the same household it is not always clear which children are associated with which adults. Thus it is useful to ask each person explicitly whether their mother and father are household members and, if so, which household members they are. If a person's mother or father is not a household member, the roster should gather some basic information on that individual: in particular, whether the individual is still alive, the individual's highest level of education, and the individual's main occupation. Survey designers may also want to ask respondents where their parents live if they are still alive. Such information is useful because parents' education levels are strongly associated with their children's levels of education, because the occupations of (adult) children are often correlated with their parents' occupations, and because many households depend on relatives to help them in times of need. (Thus a household's vulnerability to economic hardship may depend on whether household members' parents are alive or dead, as well as on where they live.)

Another issue is how to treat stepchildren and adopted children. Most stepchildren have one parent in the household and another who is dead or lives elsewhere. This situation will become clear from the information given on the child's parents, and thus needs no further discussion. Whether the parents of adopted children should be treated as their real parents depends on the incidence and nature of adoption in the country surveyed. If adoption is common, survey designers may specifically want to ask if a child is adopted.

Now consider the issue of children who are not household members, whether minors or adult children. In many developing countries it is common for parents to send their children to live with other families, particularly for the purpose of sending them to school. While children in this category will be sampled in the households where they currently live, it is often useful to link them to their parents' households for purposes of analysis. This can be done by asking all adult household members whether they have any children who live away from the household. It is also useful to gather information about any adult children who live away from home. This is useful for studying poverty because adult children may support their elderly parents even when they live in a separate household. As a result, it is useful to know how many potential sources of support each household has in the form of adult children living away from home.³

Finally, along the same lines, it may be useful to collect some information on the absent spouses and siblings of adult household members, because such spouses and siblings can often be counted on to support the household in times of need. This information is relatively easy to collect for absent spouses, since most households will not have any absent spouses and few households will contain more than one person with an absent spouse. In contrast, the number of potential siblings is very large. One option is to ask only about the number of siblings alive, perhaps distinguishing between men and women (as this may affect the probability of receiving support). Only a few surveys have attempted to collect information on each sibling of all household members; one example is a survey done in Cartagena, Colombia in 1982 (see Bamberger, Kaufmann, and Velez 1984). The basic approach is to ask about such people in a survey form similar to the one on nonresident children in Volume 3 (Part C of the Household Roster module). The two main problems are that the list of people could be very large, especially for households with several adult members, and that people could be siblings of more than one household member. However, neither problem is insurmountable.

Draft Module

This section briefly introduces the draft household roster module. (The module itself is provided in Volume 3.) The draft module is composed of four parts. Part A is the list of household members. It consists of two pages. The first page provides instructions to the interviewer, along with some questions to be asked of the head of household. The second page contains the actual roster. Part B collects information about the parents of household members, and a small amount of information on the siblings of adult members. Part C collects data on children of household members who do not reside in the household, and Part D collects more detailed information on the siblings of adult household members.

Like many of the other modules in Volume 3, the household roster module has three versions: short, standard, and expanded. The short version is simply Part A by itself. The standard version consists of Parts A, B, and C. The expanded version consists of all four parts, and also provides additional questions about nonresident spouses that should be added to Part A.

Notes and Comments on the Draft Modules

To help the interviewer correctly record all the information in each of the other individual-level modules in the household questionnaire, the part of the roster that contains each individual's name and sex (Questions 1 and 2 of Part A) should be visible and aligned with each individual-specific grid in those modules. (See Chapter 3 for a discussion of ways to design a fold-out roster page.) It is also useful to have two extra columns immediately to the left of Questions A1 and A2 that can be used to write each person's age (in years) and indicate whether the person is a household member. This is useful because the individual-level sections that follow apply only to household members, and in some cases they apply only to persons of certain ages.

A7. For some studies of migration it is useful to add the year of marriage. Such a question should come immediately after Question A7, and should be asked only of persons currently married.

A8. If the spouse does not live in the household, one may want to ask some questions about his or her whereabouts, since absent spouses can be a source of support in time of need. Such questions should be added for the expanded version of the household roster. They should be asked immediately after Question A8. The questions to ask are the same as the questions for nonresident siblings in Part D of the household module, particularly: age, years of schooling and degree completed, current employment or other activity, and current location. Another possible question is when the person left the household (to be asked of individuals who were previously members of the current household).

ple, a child who lives with her uncle or aunt will have already provided some information on his or her parents (one of whom is the sibling of the uncle) in Part B. Or a person 21 or older who lives with his or her parents may have a sibling whose existence has already been recorded in Part C. In such cases one may not want to repeat the information, so an extra question should be added after Question D4 asking if this person is a parent or child of a household member.
If the answer is "YES," the A1O. If ethnicity varies within a substantial fraction ID code of the person as indicated in Part B or Part of the households in the country, the ethnicity ques- C should be provided here, and then the interview- tion in the Household Identification and Control er should go on to the next person. However, in Information page of the metadata module (see some cases one may want to ask more about the per- Chapter 4) should be removed, and a question about son, in which case all of the questions should be ethnicity should be added immediately after asked of such persons. Question A10. The same point applies to religious affiliation. Dl. It is possible for a nonresident sibling to be the brother or sister of more than one household member. All. It is best to ask about the number of months In such cases that sibling's name should be written spent away from the household because this type of down only one time, and the ID codes of brothers and question is usually easier to answer than a question sisters in the household to whom he or she is related about the number of months during which the mem- should be recorded in Question D5. ber was present. Dio. The codes here should include countries if it is A12 (INSTRUCTIONS PAGE). As explained in the text, common for siblings to live overseas. other excluded categories (beyond the head of house- hold and newly born infants) could also be treated as D12. In addition to some basic occupation codes, the household members. Such categories might include following "activity" codes could be added: retired, recently demobilized soldiers or new spouses of housewife, student, unemployed, unable to work. household members. Notes PART B. Survey designers may want to ask a question regarding where each parent currently lives. 
They may The author would like to thankJere Behrman, Margaret Grosh, and also want to ask the child's age when he or she first Courtney Harold for comments on a previous draft of this chapter. lived away from his or her parents. This should be 1. Not every country follows this standard definition. In some asked separately for each parent. European household surveys the requirement that household members eat at least one meal together each day is dropped; other Bi, B8. The decision about whether to treat adoptive surveys require some kind of kin relationship among the household parents as the real parents will depend on country- members. Thus there is some room for flexibility when defining specific factors. See the text for further discussion. household membership in specific countries. 2. One possible exception to this is if an analyst is trying to con- C13. This question should include codes for foreign struct a panel data set by matching names (and other information) countries if it is common for children to live overseas. between different surveys. A well-planned panel survey should not This was the case in Vietnam. have to resort to this crude and error-prone method. For further information see Chapter 23 on panel data. PART D. In addition to being the brother or sister of 3. Whether any support is ever received from these children is a a household member, this person could also be a separate questionThis te of data is collected in the transfers and parent or child of a household member. For exam- other nonlabor income module (which is introduced in Chapter 11). 140 CHAPTER 6 HOUSEHOLD ROSTER References Bamberger, Michael, Daniel Kaufmann, and Eduardo Velez. 1984. "Research Methodology: Design and Analysis Issues" World Bank, Water Supply and Urban Development Research Department, Washington, D.C. United Nations. 1989. "Household Income and Expenditure Surveys: A Technical Study." National Household Survey Capabihty Programme, NewYork. 
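The bookkeeping described in the notes on Part D and Question D1 above (write each nonresident sibling down only once, list the ID codes of every household member to whom he or she is related, and skip the detailed questions when the person already appears in Part B or Part C) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class and function names are hypothetical, the draft module in Volume 3 does not prescribe any data-entry structure, and matching siblings on a normalized name is only a stand-in for the interviewer's own judgment that two respondents are describing the same person.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class NonresidentSibling:
    """One row of a hypothetical Part D register."""
    name: str
    # ID codes of resident brothers and sisters (the links the D1 note places in Question D5).
    related_member_ids: List[int] = field(default_factory=list)
    # Hypothetical extra item: ID code under which this person already appears in Part B or Part C.
    roster_link_id: Optional[int] = None


def record_sibling(register: Dict[str, NonresidentSibling], name: str,
                   member_id: int, roster_link_id: Optional[int] = None) -> NonresidentSibling:
    """Write the sibling down once; later mentions only add a household member ID."""
    key = name.strip().lower()  # assumption: a normalized name identifies the person
    sibling = register.get(key)
    if sibling is None:
        sibling = NonresidentSibling(name=name.strip(), roster_link_id=roster_link_id)
        register[key] = sibling
    if member_id not in sibling.related_member_ids:
        sibling.related_member_ids.append(member_id)
    if roster_link_id is not None and sibling.roster_link_id is None:
        sibling.roster_link_id = roster_link_id
    return sibling


def needs_full_questions(sibling: NonresidentSibling) -> bool:
    """Ask the detailed Part D questions only if the person is not already in Part B or C."""
    return sibling.roster_link_id is None
```

A survey team that wants more detail on such persons, as the Part D note allows, would simply ignore `needs_full_questions` and ask every question of every sibling.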
Chapter 7. Education

Paul Glewwe

In recent years a consensus has developed among agencies and individuals working in economic development that investments in human capital, particularly investments in education, are crucial for economic growth (World Bank 1990; UNDP 1990; Becker 1995). Yet many developing countries continue to have serious problems with their educational systems, and many observers argue that the provision of education in most developing countries is highly inefficient (Lockheed and Verspoor 1991; Hanushek 1995).

In order to devise policies that improve their educational systems, policymakers in developing countries need accurate information on education. However, in many countries the only data that policymakers have at their disposal is a small amount of information collected from public schools. These data give only a partial picture of how, and how successfully, students are educated. School-based data provide no information on children who do not attend school or on what happens to students after they leave school.

Household surveys can fill this gap by supplying basic descriptive information on which children go to school, the characteristics of the schools they attend, how long they attend, the costs of their schooling, and what happens after they leave school. Such information provides a better foundation for research on how to improve education systems in developing countries. Household surveys can also provide policymakers with important information on the impacts of specific government policies.

However, household surveys can perform these important functions only if they are well designed. The purpose of this chapter is to explain how to design LSMS and other multitopic household surveys to collect data that can be used to investigate a wide variety of education policy issues in developing countries.

The first section of this chapter reviews major policy issues on education. The second section examines how data on education from household surveys can be used to address these issues. The third section introduces draft modules for collecting data on education in multitopic household surveys. (The modules themselves are presented in Volume 3.) The fourth and final section provides explanatory notes on the draft modules introduced in the third section.

Education Policy Issues in Developing Countries

Policymakers who work on education issues need information on the current situation and estimates of how the current situation would change in response to changes in government policy. More specifically, they need accurate information on educational outcomes, such as school attendance and skills learned, and they need to know what impact government education policies have on each educational outcome. In addition, policymakers would like to know the impact of these educational outcomes on other socioeconomic outcomes, including income, migration, and health status.

LSMS and other multitopic household surveys can provide data that meet all of these needs. In such surveys the education module of the household questionnaire is the main source of information on educational outcomes. This module is also an important source of information on many of the determinants of educational outcomes. Information on other socioeconomic outcomes is collected in other modules of LSMS-type surveys.

This section reviews the most important policy issues concerning education in developing countries, focusing primarily on the impact of government actions on educational outcomes. A final subsection discusses whether the education module should collect additional data to investigate the impact of education on other socioeconomic outcomes.

The Basic Educational Outcomes

There are several basic educational outcomes of interest to policymakers. In general, a child or young adult enrolled in school acquires:
* Basic cognitive skills such as literacy and numeracy.
* Complex cognitive skills such as reasoning ability.
* General knowledge on a wide variety of subjects, which may include science, geography, agriculture, and health.
* Specialized technical skills (beginning at the secondary level).
* Diplomas and certificates attesting to the completion of specific levels of schooling.
* Values and behavioral norms, both those that are part of the curriculum (for example, good citizenship) and those that arise through social interaction among students (for example, aspirations to a "high status" occupation).

Measurement of learning outcomes (in other words, measurement of the acquisition of cognitive skills, general knowledge, and specialized technical skills) can be complicated. Measuring values and norms can be even more difficult. One way to avoid difficulties in measuring learning outcomes is to collect data on other variables that are closely related to learning. For example, it is much easier to collect data on years of schooling completed and on certificates or diplomas received, and these data should be highly correlated with skills and knowledge. It is also likely that years of schooling and diplomas received are correlated with the values and norms acquired in school, especially the values that are explicitly part of the curriculum.

The six types of educational outcomes listed above can be thought of as the main "outputs" of the education process. Other schooling variables can be thought of as "inputs," including years of schooling, grade repetition, daily attendance, and household expenditures on school supplies. The number of years of schooling is perhaps the most important input, but daily school attendance clearly matters as well. Repetition indicates whether sufficient learning is occurring and also affects learning directly; thus it can be thought of as both an output and an input. Finally, household expenditures on schooling consist of school fees and the amounts (and prices) of books and other school supplies purchased by the household.

In this chapter, "educational outcomes" will be used to refer to both outputs and inputs. Thus the following schooling variables make up the basic set of educational outcomes of interest to policymakers: basic cognitive skills, complex cognitive skills, general knowledge, specialized technical skills, diplomas and certificates, values and norms, years of schooling completed, daily attendance, grade repetition, household expenditures on schooling, and current enrollment. Current enrollment is technically neither an input nor an output, but it is important because it allows analysts to distinguish children whose formal education is finished from children who are still in school.

Government policies affect educational outcomes by influencing their determinants. The determinants of educational outcomes can be classified into three types: child characteristics, household characteristics, and school characteristics. In virtually every developing country, most schools are run by the government, which means that government policies have a major impact on the characteristics of schools. (Child and household characteristics may also be affected by government policies, as discussed below.) Government policies can be divided into those that affect prices and those that affect other school characteristics. This distinction is important because prices affect learning only indirectly by influencing inputs chosen by the household, such as the number of years a child is enrolled in school. In contrast, most other school characteristics (for example, basic classroom materials such as blackboards and desks) have a direct effect on learning. These other characteristics of schools can also have indirect effects because families may change other inputs, such as years of schooling, in response to changes in school characteristics.

Government Policies Regarding the Price of Schooling

The total cost to a household of enrolling a child in school is the sum of the direct money costs and the opportunity costs. The direct costs include tuition and other required fees (parents' association fees, examination fees, sports fees, special fundraising levies) as well as expenditures on uniforms, textbooks, other learning materials (pencils, exercise books), transportation, meals at school, and in some cases lodging. If a student receives a scholarship or voucher, its value can be subtracted from the household's direct costs. Opportunity costs are the implicit costs of the time that children devote to schooling, including the time they spend in the classroom, traveling to school, and doing schoolwork at home. If a child moves away from home to attend school, the household typically loses the use of all of his or her time. In developing countries children's time is valuable because they often help with household chores, work on household agricultural land, assist in operating nonagricultural household businesses, and even work for wages.

The discussion so far has been in terms of "costs," not "prices." In general, cost equals price times quantity. Several different prices apply to schooling. The first is the mandatory tuition fee per year (or per term) of schooling. Other prices include the prices for the various learning materials that parents are expected or even required to purchase, such as uniforms, pencils, exercise books, and, in some countries, textbooks. Then there are prices for transportation, meals, and lodging. In addition, there is the price of an hour or day of a child's time, which determines the opportunity costs incurred by households that send their children to school. While tuition fees must be paid if a child is to attend school, other schooling costs may be optional.[1] For example, in many developing countries where parents are expected to purchase textbooks for their children, many parents do not do so or purchase only some of the required books, yet still enroll their children in school. Also, although official policy often requires children to wear school uniforms, this is not always enforced in practice.

Households usually cannot change the prices that they face. However, by making decisions regarding optional expenditures on schooling, they can at least partially control the total costs of schooling (henceforth referred to as school expenditures). As will be seen in the second section of this chapter, the distinction between optional and required expenditures is important when discussing what data are needed for analyzing policy issues.

Many education policy issues in developing countries concern school prices. The levels at which tuition and other required fees are set is a matter of constant debate. Some countries have provided free public education at virtually all levels for many years, while other countries set public school fees at a level high enough to reimburse the government for much of the cost it bears in providing education services. Policymakers face two opposing pressures in setting school fees. On the one hand, increases in school fees tend to reduce enrollment and eventual school attainment, especially among poor households. On the other hand, budgetary resources in most developing countries are scarce, which puts pressure on governments to raise fees to fund the operation of public schools. This creates a fundamental dilemma for policymakers; there are no easy answers, particularly when reducing poverty is a key concern.

Many proposals for improving education in developing countries focus on pricing policies. Some education experts have argued that public school fees should be very low at the primary level and perhaps the lower secondary level but should be relatively high for upper secondary and postsecondary schooling (World Bank 1995). This would increase equity by raising primary school enrollment, particularly among poor households, and by reducing subsidies to better-off households, whose children are most likely to enroll in higher education. If the social returns to primary school are higher than the private returns and there is little difference between the social and private returns to higher education, this would also be more efficient.

Other experts recommend increasing public school fees but using the funds to improve the quality of schooling provided, since there is evidence that many households, even poor ones, are willing to pay more for higher-quality education services. A third possibility, often referred to as targeting or price discrimination, is to raise fees at all levels of education but to reduce the fees for poor households in order to encourage them to send their children to school. A fourth suggestion is to set fees at fairly high levels while providing loans to some or all students to ensure that credit constraints, which are presumably more common among the poor, do not prevent parents from sending their children to school. Finally, some have argued that government-run schools are inherently inefficient and that private schools should be promoted, which could be done by providing families with vouchers that they could use to send their children to either type of school (West 1996).

The validity of the arguments in favor of these different policies on school fees depends on how households are likely to react to each policy option. Thus it is important that the education module in multitopic household surveys be designed to gather data that allow analysts to estimate behavioral models of the educational choices that households make.

There are other prices that parents face when sending their children to school, and government policies can affect these prices. The price at which uniforms are available can affect children's educational outcomes, particularly if government schools require uniforms. Similarly, if the school does not provide textbooks, textbook prices will also affect households' schooling decisions. If textbook prices are high, parents may withdraw their children from school or send them to school without a full set of textbooks. The same applies to other learning materials that parents are expected to purchase. Distance can also be viewed as a price; schools that are far from a household's dwelling may discourage parents from enrolling their children in school because of the high opportunity costs of the children's time spent traveling to school and, in some cases, because of direct transportation costs. In many developing countries, schools are located quite far from many rural communities, especially at the secondary level. For example, in rural areas of Vietnam, the distance to the nearest upper secondary school was more than 10 kilometers in 25 percent of the rural communes sampled. Proposals for providing student loans can also be treated as price policies, since they aim to provide more children with access to schooling rather than to change school characteristics.

This discussion of pricing policies suggests that household surveys should be designed to answer the following four questions:
* How would educational outcomes be affected if mandatory school fees were changed?
* How would educational outcomes be affected if the prices of items that parents are expected to purchase, such as uniforms, textbooks, pencils, and exercise books, were changed?
* How would educational outcomes be affected by changes in the distance from households to the nearest school?
* How would educational outcomes be affected by a student loan program?

Several points need to be made regarding these four questions. First, the question about changing mandatory school fees includes policies on vouchers and scholarships, since providing a voucher or scholarship changes the effective mandatory school fee. Second, to address the issue of how increases in school fees accompanied by school quality improvements affect educational outcomes, it is necessary to combine the answer to the first question with estimates of the impact of school quality on educational outcomes.

Third, when the purchase of uniforms, textbooks, and other learning materials is required to enroll a child in school, researchers may be tempted to treat the prices of these commodities as additional mandatory school fees. However, this is often not advisable. For example, uniforms usually last more than one year, so their price is not an annual price. Also, both textbooks and uniforms can be passed from older children to younger siblings, so again the price is not necessarily annual. In general it is best to treat the prices of learning materials as separate variables rather than adding them to tuition to obtain a single price variable.

Fourth, measuring the distance from a household to the nearest school can be complicated by various factors. There may be more than one school from which the household can choose. Some families may decide to move to a new dwelling in order to be nearer to a certain school. Other families may send their children to live away from home in order to attend a particular school. Finally, the availability and cost of public and private transportation can alter the impact of distance on educational outcomes. In each case distance is not a pure price but may be partially under the household's control. The survey design problems presented by these issues of distance will be discussed further in the second section of this chapter.

Government Policies Regarding the Characteristics of Schools and Teachers

There are many important policy issues concerning school and teacher characteristics in developing countries. (For convenience, the term "school characteristics" will be used in the rest of this chapter to refer to nonprice school characteristics.) It is useful to divide these characteristics into two types. The first type is characteristics that concern what actually happens in the classroom. These can be thought of as school quality variables. School quality can be defined as all of the characteristics of classrooms, the teachers in them, and the teaching methods used that directly affect how much children learn and what values they acquire. Tuition fees and other prices are not considered components of school quality because, although prices may be correlated with quality, they do not directly affect learning.[2] Even though this definition of school quality focuses on learning (the acquisition of skills and knowledge) and values, it affects all of the educational outcomes discussed at the beginning of this section. For example, parents are likely to keep their children in school longer if school quality improves (assuming that other factors, such as school fees, do not change) because the benefit of a year of schooling rises while its price remains the same. The many different aspects of school quality can be divided into two categories:
* Material inputs in the classroom, such as blackboards, textbooks, and the physical condition of classrooms.
* Teacher characteristics and the pedagogical practices teachers use.

The second type of school characteristics is those concerned with school management and school policies, which affect student learning indirectly by determining what takes place inside the classroom. These can also be divided into two categories:
* School management variables, which refer to the managerial structure of the school, including the characteristics of the school principal and the overall system of incentives for teachers and other staff.
* Admissions and advancement policies, which determine the schools and classes students may attend.

Material inputs range from the most basic school supplies, such as chalk, blackboards, textbooks, and desks, to much more expensive and sophisticated pedagogical tools, such as personal computers. The poorer developing countries lack even the most basic material inputs (see Glewwe, Kremer, and Moulin 1999 on Kenya and World Bank 1997 on India). By far the most important policy question in these countries in this context is: which material inputs are most cost-effective?[3] In other words, which inputs bring about the largest improvement in educational outcomes per dollar spent? Answering this question requires information on how each material input affects educational outcomes and on the cost of each input. This information is useful not only for basic inputs but also for more sophisticated ones.

Box 7.1 Using a School-Based Sample to Examine the Impact of School and Teacher Characteristics

Much research on the impact of school and teacher characteristics on educational outcomes, particularly on test scores, is based on data collected from a sample of schools rather than a sample of households. Such data typically contain detailed information on schools but only a small amount of information on schoolchildren's households. (This is because the information on each child's household is obtained by asking the child, not by visiting the household.) The lack of data on the children's households can lead to serious estimation problems. If some children do not attend school, as is often the case in developing countries, it is best to use a sample of households rather than a sample of schools to collect education data. Ideally, household surveys should collect data from both schools and households. Such data are essential for the analysis of certain policy questions, such as the impact of school and teacher characteristics on the likelihood that children will be enrolled in school.

There are many policy issues regarding teachers and how they teach. Among the most important are:
* The effect of student-teacher ratios on educational outcomes.
* The impact of teacher training and the general educational level of the teacher (including teacher knowledge as measured by test scores) on educational outcomes.
* The role of teacher motivation and morale.
* The impact of various pedagogical techniques, such as distance education (radio instruction), on learning.[4]
* The role of female teachers in encouraging parents to educate their female children.

Some of these issues involve factors that are relatively easy to measure (such as student-teacher ratios, the extent to which a teacher has been trained, and the sex of a teacher), while others are more difficult to observe (such as teachers' motivation and the pedagogical practices teachers use). A similar observation can be made regarding the costs of teacher characteristics and pedagogical practices. Some costs, such as raising teachers' pay to increase their morale, are relatively easy to measure, while others, such as switching to a new pedagogical technique, are harder to measure. However, in principle, any change involves a cost, and the changes that bring about the greatest improvements in educational outcomes per dollar spent should be given the highest priority.

Key policy issues concerning systems of school management include:
* The impact of the education levels, training, experience, and management styles of school principals on educational outcomes.
* The effect of decentralized decisionmaking (allowing individual schools and teachers to choose curricula and allocate resources) on educational outcomes.
* The relative efficiency of public and private schools.
* Methods by which principals can motivate teachers and increase parental participation in schools.

Finally, the current issues regarding school admissions and advancement policies are: the minimum age of enrollment into first grade; whether area of residence should determine which public school a child can attend; policies on grade repetition; and standards for determining whether a student is allowed to advance to the next grade or level of schooling.

To assess the cost-effectiveness of both school quality initiatives and practices regarding management, admissions, and advancement, policymakers need information on the impact of changes in school characteristics on educational outcomes as well as information on the costs of those changes. In general, household surveys can only provide answers on how changes in school characteristics affect educational outcomes; most information on the costs of changes in school characteristics cannot be collected using a school questionnaire in a multitopic household survey and thus must be collected in a separate data collection exercise. In particular, a comprehensive study of actual resource costs is needed, which can be quite complex. For an example see Ilon (1992).

In summary, household surveys should be designed to provide answers to the following four questions:
* What impact do different material inputs have on learning and other educational outcomes?
* What impact do teacher characteristics and specific teaching practices have on learning and other educational outcomes?
* What impact do different types of school management policies have on learning and other educational outcomes?
* What impact do different admissions and advancement policies have on learning and other educational outcomes?

In general, households cannot provide reliable information on school characteristics, since it is unlikely that they will have detailed knowledge of them. Instead, this information should be collected in the community questionnaire, or by adding a school questionnaire to the survey. Many past LSMS surveys collected community data, but only three (Ghana, Jamaica, and Vietnam) attempted to collect data from local schools. (The Morocco LSMS survey gathered detailed data on schools as part of the general community questionnaire.) Practical guidelines on how to collect school data will be provided later in this chapter. The point to bear in mind at this stage is that collecting data on school characteristics is not simply a matter of modifying the household questionnaire.

Other Government Policies That Affect Educational Outcomes

Several government policies not usually thought of as education policies can have important effects on educational outcomes. Perhaps the most important of these are policies that affect child nutrition. There is ample evidence that children's nutritional status, especially in the first one or two years of their lives, can have a sizeable impact on later school performance (Glewwe and Jacoby 1995; Glewwe, Jacoby, and King forthcoming; Alderman and others 1997). Therefore, all government policies aimed at improving children's health and nutrition, such as immunization campaigns, nutrition education programs for young mothers, sanitation programs, school feeding programs, early childhood nutrition initiatives, and food stamp programs, have potential implications for educational outcomes.

A second set of policies that can affect educational outcomes are those that influence the opportunity cost of children's time, such as policies regarding child labor, rural infrastructure, agricultural extension, and childcare facilities. For example, providing a local source of potable water to a rural community may reduce the opportunity cost of children's time, particularly for girls, who may no longer need to walk long distances to obtain water for their families. Of course, some policies can increase the opportunity costs of schooling. For example, providing access to a new agricultural technology may raise the productivity of child labor on family farms.

Another way in which government policies can affect educational outcomes is by changing the returns to schooling in the labor market. For example, a government that decides to reduce its budget deficit by imposing a freeze on hiring new secondary school or university graduates will decrease the private benefits

cations for the design of the education module of LSMS and other multitopic household surveys:
* Who receives this type of education?
* What are the benefits of this type of education?

The first question implies that the education module should be designed to collect information on vocational training and other "nongeneral" education, such as apprenticeships,[6] and on postsecondary education, including the subject studied.
The second ques- of schooling (particularly if government jobs pay high- tion pertains to the impact of education on socio- er wages to well-educated workers than do compara- economic outcomes, which will be discussed in the ble private sector jobs). Such a policy will tend to next section. reduce school attainment.5 Alternatively, allowing for direct foreign investment may increase job opportuni- Policies Concerning the Effect of Educational Outcomes on ties for educated workers, leading to an increased Other Socioeconomic Outcomes demand for education. The discussion so far has focused on the determinants This means that household surveys should gather of educational outcomes, but an equally important use data that can help answer the following questions: of data from the education module of a multitopic * What is the impact of child nutritional status on household survey is analysis of the impact that educa- educational outcomes? tional outcomes have on other socioeconomic factors. * What impact do child wage rates and child pro- This raises the question: what kind of educational out- ductivity in self-employment and household chores come data should be collected in the education mod- have on educational outcomes? ule to support these kinds of analyses? The indicator * How do labor market conditions affect educational most commonly used in studies of the impact of edu- outcomes? cation on other socioeconomic outcomes is years of schooling, which has been collected in all past LSMS Policies on Vocational Training and Postsecondary Education surveys. 
In addition, almost all past LSMS surveys have The discussion so far has implicitly focused on pri- collected information on certificates or diplomas mary education and general (as opposed to special- obtained by household members, which household ized) secondary education.Yet almost all developing members are currently enrolled in school and, for countries also provide technical training and postsec- those who are enrolled, household expenditures on ondary education. In many developing countries it is education. However, most previous LSMS surveys difficult to use household surveys to collect data on have not collected data on basic cognitive skills, gen- these types of educational services because very few eral knowledge, specialized technical skills, values, or people receive technical training or study at the post- grade repetition. Also, only about half of these surveys secondary level. Thus those who do receive voca- have collected any information on daily attendance tional training or postsecondary education are (and these have usually asked only about attendance unlikely to appear in the sample in sufficient num- during the seven days prior to the interview). bers to make it possible to draw reliable conclusions There is one educational outcome that has usual- from the data. Nevertheless, a brief review of these ly not been collected in past LSMS surveys but could policies is worthwhile. be collected relatively easily in future surveys: grade Many policy issues at the postsecondary level repetition. This information is useful for computing involve pricing, and the discussion above on pricing actual years of school attendance, which in turn can policies still applies. 
Several of the policy issues with have implications for estimating rates of returns to respect to vocational training are discussed in this schooling (Behrman and Deolalikar 1991).Therefore, book in Chapter 9 on employment (and therefore this variable should be routinely collected in surveys in need not be discussed here). However, there are two those countries where grade repetition is common. general questions regarding both postsecondary edu- Information on daily attendance over a short period of cation and vocational training that have specific impli- time (such as the past one or two weeks) would also 149 PAUL GLEWWE be reasonably easy to collect. However, this informa- this question. In general there are only two ways to tion is less important for understanding the impact of collect information on past school quality: asking schooling on socioeconomic outcomes because atten- respondents to recall characteristics of their schools dance is primarily an input into other outcomes (such and finding data on past school quality from some as skills learned or diplomas received), instead of being other source. In either case the chance of obtaining an output in its owIn right. useful data are small. Respondents' recollections on Thus the major question is whether information the characteristics of the schools that they attended should be collected on skills, knowledge, and values. decades ago are likely to be highly unreliable, especial- This will be discussed fully in the second section of ly given the complex nature of school quality. this chapter. However, four comments can be made at Similarly, in the vast majority of developing countries this stage. First, as discussed above, collecting informa- there is unlikely to be much useful information on tion on skills is a major undertaking. 
Collecting infor- school quality in previous decades, and the few who mation on values is probably easier, but is rarely done do have such data probably only have a small amount in nationally representative surveys in developing of inforimiation) for each school. However, one countries. Second, information on the skills, knowl- approach to this issue, pursued by Glewwe (1999), is to edge, and values of adults can be extremely useful for examine the impact of current school on cognitive examining the impact of education on various socioe- skills, and then to examine the impact of those skills conomic outcomes. (See Murnane and others 1995 on labor market outcomes. This was done using the for an example from a developed country and Knight data from the 1988-89 LSMS survey in Ghana. The and Sabot 1990 for an example from a developing only alternative is to begin collecting panel data today country.) and follow individuals for decades, which implies a Third, information on children's skills can be used wait of many years before results can be obtained. to investigate the determinants of other socioeconom- Moreover, it may be difficult to collect panel data in ic outcomes using an indirect approach.7 For example, developing countries; see Chapter 23 for a discussion it would be possible to examine how education poli- of collecting panel data as a part of LSMS surveys. cies affect adult socioeconomic outcomes by combin- Finally, analysts may want surveys to collect data ing estimates of the impact of education policies on on variables that predict educational outcomes, such as children's skills with estimates of the impact of the years of schooling. The reason for this is that estima- skills of adults on the socioeconomic outcomes of tion problems can arise if an educational outcome adults. (See Glewwe 1999 for an example.) 
Fourth, in variable is used instead of its predicted value.8 Some school systems in which skills are highly correlated "predicting" variables are likely to be collected already with years of schooling, there is little reason to collect in an education module that focuses on the impact of skills data. Sclhool systems in whiclh childreii nust pass various governmenit policies on educational out- standardized examinations to continue to the next comes, as will be seen below In addition, one general- grade are particularly likely to produce such a high ly useful variable for predicting the educational out- correlation. However, the school systems in many, if comes of an individual, especially of an adult, is the not most, developing countries do not operate in this education level of the individual's parents. Data on manner, which implies that there is a strong argument parents' education should be collected regardless of for collecting data on skills, both for adults and for whether the parents are members of the individual's children, in such countries. household or even whether they are alive. Such data One line of research of recent years that deserves have often been collected in past LSMS surveys, usu- specific discussion is the impact of school quality on ally in the household roster. See Chapter 6 for a dis- wages and other labor market outcomes. After con- cussion of the household roster module. trolling for years of schooling, do individuals who attend relatively high-quality schools have higher Data Needed to Address the Major Policy wages and other desirable labor market outcomes? The Issues discussion of school quality data in the previous sub- sections focused on current school quality, but This section discusses the data required to address the researchers need data on past school quality to answer policy issues raised in the previous section. 
It also discusses some methodological issues concerning the estimation of relationships that provide answers to policymakers' questions, although this discussion has been kept to a minimum. Before turning to the issue of what data are needed, it is useful to review briefly, in nontechnical terms, what determines educational outcomes. This will provide a framework for comparing the advantages and disadvantages of collecting different kinds of data.

The Determinants of Educational Outcomes

Figure 7.1 provides a visual framework of the determinants and consequences of educational outcomes. Although these could be presented as mathematical equations, most survey designers probably prefer a visual representation. This subsection describes the "model" on which this figure is based.

At the top of Figure 7.1 are three boxes, labeled child and family characteristics, school characteristics, and school prices. These three kinds of causal factors ultimately determine all educational outcomes. School characteristics and school prices were described in detail in the previous section. The characteristics of a child's family that determine his or her educational outcomes are: the education of the child's parents; the household's wealth, income, or both; the value that the child's parents place on schooling (including attitudes towards educating girls); the existence of family farms or businesses (which may affect the opportunity costs of a child's labor); the size of the family; and the family's access to credit. The characteristics of a child that affect his or her educational outcomes are the child's "learning ability" (which includes both genetic and environmental factors), sex, motivation for schooling, position in the birth order, and past and present nutritional status. These three groups of causal factors influence three fundamental educational outcomes that are directly controlled by parents (and sometimes their children): household expenditures on schooling, daily attendance, and years of schooling. These causal relationships are indicated by the arrows in Figure 7.1.

One level down in Figure 7.1 are skills learned and values acquired. All of the three educational outcomes that parents can directly control (school expenditures, daily attendance, and years of schooling) are important determinants of skills and values. Child and family characteristics and school quality also directly determine skills and values. These causal influences are indicated by arrows in Figure 7.1.

Figure 7.1 Determinants and Consequences of Schooling Outcomes
[Diagram not reproduced. Box labels, from top to bottom: child and family characteristics, school characteristics, school prices (causal factors); school expenditures, daily attendance, years of schooling, skills learned and values acquired, grades completed, diplomas (educational outcomes); socioeconomic outcomes (schooling consequences).]
Source: Author's summary.

Two points are particularly notable. First, there are no arrows from school prices (both mandatory school fees and the prices for various learning materials) to skills learned or values acquired because the former have no direct effect on the latter. They only have an indirect effect through years of schooling, school expenditures, and (possibly) daily attendance. Second, the five groups of factors that directly affect skills learned (child and family characteristics, school quality, school expenditures, daily attendance, and years of schooling) can be thought of as a "production function" for skills learned in school (Hanushek 1986). This concept of a production function, discussed in the next subsection, has had a strong influence on studies of education done by economists. In principle, a similar "production function" exists for values acquired, but little research has been done on how schooling affects values in developing countries.

The next level of Figure 7.1 has only one educational outcome: grades completed. If children never repeated grades, there would be no need for this level, because grades completed would equal years of schooling. However, grade repetition is common in many developing countries (Lockheed and Verspoor 1991). Perhaps the most important determinant of grade repetition is skills learned, because many schools do not allow a child to advance to the next grade until a certain degree of proficiency is reached in various skills. Thus the figure for the number of years repeated, which is the difference between years of schooling and grades completed, depends on skills learned. Repetition can also affect skills learned since some classroom time may be redundant when a child repeats a grade. This is indicated in Figure 7.1 by the small arrow going from repetition to skills learned and values acquired.

The last level of educational outcomes in Figure 7.1 shows that schools grant diplomas and certificates to students who successfully complete a certain grade or pass a certain examination (hence the arrow from skills learned to diplomas). In some countries, people who complete a grade but do not pass the associated examination are officially considered as having completed that grade. In these countries it is important for survey questionnaires to distinguish between finishing the grade and passing the examination.

At the bottom of Figure 7.1 are socioeconomic outcomes (which include employment, wages, farm productivity, health status, fertility, migration, and the nutritional status of children). These data come from other modules of a multitopic household survey questionnaire. The dashed lines in Figure 7.1 show the causal relationships that determine these socioeconomic outcomes.

Figure 7.1 assumes that school characteristics and school prices are exogenous, that is, beyond the control of the household. However, in many developing countries, parents have several schools from which to choose. Even when only one local school exists, parents may decide to send their child to a boarding school or to live with relatives located near a "good" school.9 This implies that school characteristics and school prices are not exogenous but are determined by the parents when they select a school for their children. However, the schooling options themselves (in other words, what local schools, and possibly boarding schools, are available) are generally beyond the control of the parents. (In some cases, even the choice of schools may be endogenous. This is discussed in later subsections.) Thus the characteristics of each schooling option do not depend on a choice made by the household. Figure 7.2 demonstrates this situation. Figure 7.2 is the same as Figure 7.1 except that the boxes for school characteristics and school prices are no longer at the highest level of the diagram. Child and family characteristics remain at the highest level and are still causal factors, but the other causal factors are the characteristics of three schools from which the family can choose. (The number 3 is purely illustrative; actual numbers of choices vary by country and by household.) The boxes on the characteristics and prices of the school attended are labeled to show that these variables are determined by households' choices.

Some Basic Econometric Concepts

The rest of this section discusses what data are needed to provide answers to the policy questions in the first section of the chapter. Some reference to research methodologies is inevitable in this discussion. Therefore, this subsection provides a brief, nontechnical discussion of three different econometric relationships that researchers may want to estimate. For a fuller and more technical discussion of basic econometric concepts see Chapter 26.

Household survey data can be used to estimate three types of socioeconomic relationships. The first is a reduced form relationship, in which an educational outcome is determined by exogenous, causal factors.10 These causal factors are a subset of the child, family, and school variables (characteristics) discussed in the previous subsection. Some child and family variables are generally accepted by economists and other social scientists as exogenous. Other such variables are generally accepted as being endogenous. Finally, there are some variables for which no consensus has been reached. School variables fall into the third category; whether they are exogenous or endogenous is a matter of much debate. At the simplest level, there are the two scenarios depicted in Figures 7.1 and 7.2. If there is only one school from which to choose (Figure 7.1), it may be reasonable to assume that the school's characteristics are exogenous to the household. In Figure 7.2 there are several schools from which to choose. In this case, while the characteristics of the school chosen are not exogenous, the characteristics of the three choices are. However, further problems of endogeneity may arise under each scenario, because households may migrate to live near desirable schools, households may take actions to change local school characteristics, and governments may make decisions regarding public schools based on specific local conditions, which in turn often affect households' choices.
To fully address these potential problems, special data and estimation techniques are required. These will be discussed below.

Figure 7.2 Determinants and Consequences of Schooling Outcomes When There Is a Choice of Schools to Attend
[Diagram not reproduced. Box labels, from top to bottom: child and family characteristics, School 1 characteristics, School 2 characteristics, School 3 characteristics (causal factors); school choice, school characteristics, school prices, school expenditures, daily attendance, years of schooling, skills learned and values acquired, repetition (educational outcomes); socioeconomic outcomes (schooling consequences).]
Source: Author's summary.

A second relationship that can be estimated is a production function for educational outcomes. In general, skills learned is the only educational outcome that has been depicted as a production process.11 The two differences between reduced form estimates and production function estimates (also known as structural estimates) of skills learned are: reduced form relationships use only exogenous variables to explain learning while production functions quite often use endogenous variables, and reduced form relationships include both direct and indirect causal relationships while production functions are concerned only with direct causality. Figures 7.1 and 7.2 illustrate the second difference. The direct causes of skills learned are shown by the five arrows that lead directly to skills learned, while, in contrast, indirect causes affect learning by first affecting some other variable, such as school expenditures or years of schooling. Because school fees and other prices affect learning only indirectly, there is no direct effect (that is, no arrow going directly from school prices to skills learned), so the learning production function excludes school prices.

It is much harder to estimate production functions than to estimate reduced form relationships. However, the production function is useful for understanding the mechanisms by which school quality improves educational outcomes. Production function relationships are generally assumed to be fairly stable over long periods of time, while reduced form estimates can change as socioeconomic conditions change. This is because reduced form estimates reflect household behavior, while production function relationships are, by assumption, "technological relationships" not altered by household behavior. For example, the reduced form impact of parental education on learning may diminish as a society becomes wealthier because parental education usually leads to higher incomes, which, in poor societies, implies that parents buy more basic learning materials such as textbooks for their children. As national wealth increases, all parents are able to purchase textbooks, which diminishes this indirect impact of parental education on learning.

The third type of relationship is a conditional demand relationship. It examines the impact of selected exogenous and endogenous variables on educational outcomes. In particular, it concerns the estimation of the determinants of educational outcomes conditional on certain variables of interest that may not be exogenous. Conditional demand relationships can be thought of as an intermediate category between reduced form relationships and production functions. They are not reduced form relationships because they include endogenous variables as causal factors, yet they are not production functions because they include some variables that affect educational outcomes only indirectly.

An example may make this clearer. Suppose a researcher is interested in studying the impact of household income on a particular educational outcome. If household income were exogenous, he or she could simply estimate the reduced form. Yet household income is, to some extent, a household choice, because removing a child from school and putting him or her to work increases household income. Thus one cannot estimate the impact of income as a reduced form relationship. On the other hand, one cannot estimate the impact of income as part of a learning production function, because income does not directly affect learning; its effect is indirect through the goods and services purchased with that income. Conditional demand estimates attempt to measure the impact of an exogenous increase in income. Such an estimate of the impact of income on educational outcomes could be used to assess the impact of economic growth on particular educational outcomes, such as learning or grade attainment. As explained above, this is neither a reduced form relationship nor a production function, but rather a conditional demand relationship.

Conditional demand relationships are, in general, harder to estimate than reduced form relationships but not as hard to estimate as production functions. The difficulty in estimating conditional demand relationships is the need to find variables that predict the endogenous variables but do not directly affect the educational outcome of interest. (Such variables are called instrumental variables.) The advantage relative to estimating production functions is that the number of endogenous variables in the regression is generally smaller, so fewer instrumental variables are required.

Using Cross-Sectional Data to Investigate the Determinants of Educational Outcomes

The data collected in most previous LSMS surveys have been cross-sectional data, that is, data gathered in surveys in which households were visited only once. (A household's interview may comprise two or more visits over a week or two, but the resulting "picture" of the household is for a single point in time.) Thus most analyses done on education issues using data from LSMS surveys have used cross-sectional data. This subsection examines how cross-sectional data can be used to understand the impact of government policies on educational outcomes and, more generally, to understand the determinants of educational outcomes. The subsection makes specific recommendations on the data needed to analyze the likely impacts of the different types of education policies discussed in the first section of this chapter.

Before beginning a detailed discussion of data needs, a general point must be made. Accurate estimates of the determinants of educational outcomes require data on all variables that are believed to be causal factors. This is true for all three types of relationships discussed in the previous subsection (reduced form, production function, and conditional demand). Figures 7.1 and 7.2 show that child characteristics, family characteristics, and school variables jointly determine the three educational outcomes most directly under parents' control: school expenditures, daily attendance, and years of schooling. These three educational outcomes, along with child, parent, and school characteristics (but not school prices), then determine all other educational outcomes. Many of these variables will be correlated with each other, and statistical theory shows that estimates of the impact of the different variables based on regression analysis are likely to be biased if causal variables left out of the regression are correlated with causal variables that are included (an estimation problem known as omitted variable bias). Thus, in order to understand fully the determinants of any educational outcome, the researcher must have information on all child, household, and school variables that affect that educational outcome.

A simple example demonstrates this point.
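The bias can also be seen numerically. The following is a minimal sketch in Python, not a procedure taken from any LSMS survey, that simulates households whose distance to school and distance to a health clinic move together. The function name, all coefficients, and the sample size are hypothetical values chosen only for illustration. It compares a regression of a test score on nutritional status alone with one that also includes distance to school.

```python
import numpy as np


def simulate_omitted_distance(n=20000, seed=0):
    """Compare OLS estimates of a nutrition effect with and without distance."""
    rng = np.random.default_rng(seed)

    # Hypothetical data-generating process (every number is an assumption).
    dist_school = rng.uniform(0, 10, n)               # km to nearest school
    dist_clinic = dist_school + rng.normal(0, 1, n)   # remote households are far from both
    nutrition = -0.3 * dist_clinic + rng.normal(0, 1, n)
    true_effect = 0.5                                 # assumed true effect of nutrition
    score = true_effect * nutrition - 0.4 * dist_school + rng.normal(0, 1, n)

    def ols(y, *regressors):
        # Ordinary least squares with an intercept; returns the coefficient vector.
        X = np.column_stack([np.ones(len(y)), *regressors])
        return np.linalg.lstsq(X, y, rcond=None)[0]

    # Short regression: distance to school is left out.
    short = ols(score, nutrition)[1]
    # Long regression: distance to school is included.
    long_ = ols(score, nutrition, dist_school)[1]
    return true_effect, short, long_


if __name__ == "__main__":
    true_effect, short, long_ = simulate_omitted_distance()
    print(f"assumed true effect: {true_effect:.2f}")
    print(f"estimate omitting distance: {short:.2f}")
    print(f"estimate including distance: {long_:.2f}")
```

Under these assumed values, the regression that omits distance attributes part of the distance effect to nutrition and overstates the nutrition coefficient, while the regression that includes distance returns an estimate close to the assumed value.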
Suppose a researcher is trying to estimate the impact of children's nutritional status on educational outcomes, but he or she lacks data on the distance to the nearest school. Estimates done without distance information can yield biased estimates of this impact because children's nutritional status may be correlated with the distance they have to travel to the schools they attend. Households that live far away from schools may also live far away from health clinics. Living far from a school could lead to unfavorable educational outcomes, while living far from a health clinic could lower nutritional status. This raises the possibility that an apparent negative impact of poor nutritional status on educational outcomes may simply be due, at least in part, to the fact that households that live far from schools also live far from health clinics.

The need to collect data on all child, household, and school variables, particularly variables related to school (and teacher) characteristics, implies that a very large data collection exercise is needed. There are three exceptions to this general need for detailed school data. First, if the main objective of the household survey is to assess the current state of affairs and to examine correlation but not necessarily causation, it is only necessary to collect data on the variables of particular interest to the researcher, most of which can be collected at the household or child level. For example, if what is being studied is simply which socioeconomic groups have the lowest levels of school enrollment, there is no need to collect data from schools to measure this. Second, if school characteristics can be altered as part of a randomized evaluation, it is not necessary to collect data on all variables thought to be causal factors. (Randomized evaluations are discussed in detail at the end of this section.) Third, in nonrandomized settings there is one approach that can be used to estimate the impact of child and family characteristics on educational outcomes in the absence of data on school characteristics and prices. This approach will work for any household survey that uses a two-stage (or three-stage) sample design (which is by far the most common method for drawing a sample of households). Such a sample design makes it possible to use community or school "fixed effects" estimation techniques on a single cross-sectional data set. The idea is that all households in the community face the same school variables. This approach can be used both when a community has only one school and when there are several schools in the community from which parents can choose.

If a linear model can be assumed (with no interaction effects between school characteristics and child or family characteristics), the combined effect of all school characteristics for each community can be captured by a dummy variable for that community. Variation of household and child characteristics within each community can be used to estimate the impact of these variables on educational outcomes. One disadvantage of this technique is that it cannot estimate the impact of school variables on educational outcomes.12

To summarize, in nonrandomized settings there are two ways to estimate the determinants of educational outcomes using a single cross-section of data. First, after collecting detailed information on child, household, and school characteristics, including school fees, for all local schools in each community, the researcher can do a "full" estimation of the determinants of educational outcomes. The second option is to dispense with the collection of school-level data and instead use a fixed effects procedure to estimate the impact of family- and child-specific variables (but not school variables) on educational outcomes. The remainder of this subsection discusses what can be learned about specific policy issues using cross-sectional data.

ISSUES CONCERNING THE PRICE OF SCHOOLING. What kind of cross-sectional data should be collected to examine the four questions on education pricing policies posed in the first section of this chapter? The first question is the impact of school fees on educational outcomes, which raises issues that apply to the other questions. The above discussion suggests that to estimate how mandatory school fees affect educational outcomes, it is necessary to collect data on both the quality and prices of local schools and then to estimate a model in which educational outcomes are determined by child and family characteristics, school quality, and school fees. If households have a choice of schools, it is still possible to estimate a similar model after correcting for selectivity bias due to school choice, as in Glewwe and Jacoby (1994). Reduced form relationships are the least difficult to estimate, but it may also be useful to estimate a production function (if the educational outcome of interest is learning) or a conditional demand relationship to probe deeper into the ways in which school prices determine educational outcomes.

It may be tempting to collect only price data from schools if the researcher is interested only in education pricing policies. However, price data are not sufficient for analyzing these issues. Data on many other school characteristics must also be collected to avoid omitted variable bias. Schools that charge high fees may well be high-quality schools, and looking only at prices could lead to the (mistaken) inference that high prices lead to high school enrollment rates because the (unmeasured) characteristics that make a school good persuade parents to enroll their children in that school. Most previous LSMS surveys, the exceptions being surveys in Ghana, Jamaica, Morocco, and Vietnam, have not collected detailed data on school quality and school prices. Collecting these data is not a trivial task. Even with data on local school prices and school quality, the omission of some aspects of school quality that are difficult to observe and measure, such as teacher motivation, can lead to omitted variable bias.

Is there any way to examine the impact of school fees on educational outcomes without collecting detailed data on schools?

relationship of the impact of these variables on educational outcomes can be estimated. It may also be feasible to estimate some conditional demand functions. The second option is to collect data only on the distance or travel time between each household and the local school or schools, and use differences in travel time or distance to estimate a reduced form relationship. If the assumptions underlying these estimates are accurate, which is debatable, the estimates will measure how educational outcomes change when mandatory fees are changed.

Using distance within a community to understand the impact of prices on school attainment leads to the third pricing policy issue: the impact of school distance on educational outcomes. Taking the first route described in the previous paragraph (full estimation), the explanatory variables already include distance (or travel time), the various prices, and other school variables. It is then possible to estimate the impact (reduced form or otherwise) of all these explanatory variables. If it is reasonable to use variation
Yes, if the researcher is will- in distance or travel time to estimate price effects, the ing to make some assumptions about travel time costs impact of distance (or travel time) and mandatory and the price of schooling. The idea is that each school fees should be the same (after assigning a mon- household in the community lives a different distance etary value to children's time). Thus this method can from the local school, and this variation in distance is be tested. Taking the second route (based on variation equivalent to variation in tuition costs within each in distance within communities), the answer is already community. If this equivalence holds, it is possible to clear, but it is not possible to determine whether the estimate how school fees affect educational out- price and distance effects are two distinct effects or comes.'3 Using data on hourly child wage rates, the two different ways of measuring the same effect.'4 amount of time it takes a child to xvalk to school can The second question posed in the first section be transformiied into a price that varies among children was: what is the impact on educational outcomes of within the community due to variation in distance. changes in prices for specific learning materials not Child wage rates can be obtained from either the provided by the school, such as textbooks, pencils, and community questionnaire or, in some cases, the labor extra classes? To answer this question it is necessary to section of the household questionnaire. Note, howev- regress the educational outcome being studied on er, that the assumptions that must be made to use this child and family characteristics and the full set of method may be false, so there is some risk involved in school quality and school fee variables. In contrast to using it. 
There is little empirical evidence on the the situation concerning school fees and distance, extent to which estimates based on this method can be there is no way to avoid collecting detailed school data misleading. For examples of the application of this to answer this question. The simplest approach is method see Gertler and Glewwe (1990, 1992) and reduced form estimation, but again it may also be use- Selden andWasylenko (1995). ful to try to estimate a conditional demand relation- To summarize, there are two ways to estimate the ship.The school fee variables should include mandato- impact of mandatory school fees on educational out- ry school fees and the prices of the specific inputs.This comes using cross-sectional data. First, detailed infor- kind of estirmation is rarely done because prices on mation can be collected on child and family charac- specific educational inputs are seldom collected in teristics and on both the characteristics and the fee household surveys-an oversight that should be cor- structure of local schools, from which a reduced form rected in future surveys. 156 CHAPTER 7 EDUCATION The fourth question about pricing educational tionnaire. Of course, it is still important to collect services involves the effects of introducing a student expenditures on school fees in the household ques- loan program, including who would receive the loans tionnaire to calculate total household expenditures and what impact receiving a loan would have on the and to check the accuracy of the data collected from educational outcomes of the recipients. Student loan the community or school questionnaire. programs are fairly rare, so it is not possible to evalu- ate them in most countries. Even where such pro- ISSUES CONCERNING SCHOOL AND TEACHER CHAR- grams exist, in most cases they are national programs ACTERISTICS. 
Investigating the impact of school and and thus do not vary across households at a single teacher characteristics on educational outcomes point in time. Therefore it is not possible to evaluate requires detailed information on both school charac- student loan programs using cross-sectional data.'5 teristics and school prices. There is no way to avoid Before turning to the next set of issues, an impor- collecting detailed data on local schools (unless one tant caveat must be recognized with regard to the use undertakes a randomized evaluation, a major activity of cross-sectional data to analyze the impact of pricing that is discussed in detail below). Before turning to the policies on school outcomes. The methods discussed four types of policy questions regarding school and above for estimating the impacts of school prices on teacher characteristics posed in the first section of this educational outcomes assume that none of the prices chapter, it is useful to consider whether to estimate a associated with schools can be changed unless the par- reduced form relationship, a conditional demand func- ents have a number of local schools from which to tion, or a learning production function. Estimating a choose (the scenario in Figure 7.2).Yet this assumption reduced form relationship amounts to regressing the might be challenged using the three potential prob- educational outcome of interest on all exogenous lems discussed in the previous subsection-household determining factors, which consist of school charac- migration for schooling purposes, household actions teristics, school prices, and the child and family char- that change local school characteristics, and govern- acteristics that can reasonably be assumed to be exoge- ment consideration of local conditions when making nous. A full set of school quality variables should be decisions on local school characteristics and prices. In used to minimize omitted variable bias. 
However, general, these problems cannot be resolved using there are potentially serious problems regarding cross-sectional data (the exception being the migra- whether local school variables are endogenous. These tion problem; see the discussion of panel data in problems will be discussed further in the panel data Chapter 23). They will be discussed below in the sub- section below. section on panel data. In some cases the researcher may be particularly A final practical matter should be discussed interested in estimating a learning production func- regarding the collection of school price data as part of tion. This implies regressing skills learned on school a household survey. While it is appropriate to collect attendance, school quality, years of schooling, child and data on schooling expenditures in the household family characteristics, and expenditures on "optional" questionnaire, it is important to collect information on items (such as textbooks, exercise books, and extra mandatory fees and other prices directly from schools classes). School expenditures are clearly endogenous, in either a community questionnaire or a school ques- so instrumental variables-variables that determine tionnaire. There are several reasons for this. First, it is school expenditures but do not directly affect likely that information gathered from households learning-need to be found; obvious candidates are about the prices they paid will contain recall errors, the prices of the specific items. Daily attendance is also and such measurement error can lead to biased esti- clearly endogenous, and an added difficulty is that mates. Second, there is often more than one school attendance data over long periods of time are hard to available in a local community, and it is difficult to obtain. 
Possible instruments for attendance are house- ascertain which prices reported by households apply hold wealth, distance to the school, and the productive to each school.Third, in poor communities it is possi- assets relevant to self-employment. (If data on daily ble that no one in the households surveyed is enrolled attendance are not available, production function esti- in secondary school, so no information on secondary mates will suffer from omitted variable bias, so one school fees can be obtained from the household ques- must turn to either reduced form or conditional 157 PAUL GLEWWE demand estimation.) The number of years of school- into the equation to be estimated, which means that ing is clearly endogenous, but the number of years of textbook prices cannot be used as an instrumental schooling of the children who are currently in school variable for textbooks. Overall, economists do not may not be. In particular, if virtually all children in the agree on whether conditional demand estimation is survey begin their schooling at the "normal" age (usu- likely to be feasible in general, and each case must be ally, six years) and very few have already finished their judged on its own specific circumstances. schooling (if, for example, the estimates are for pri- Returning to the distinction between the four mary school students in a country where children types of school characteristics, it was pointed out in rarely drop out while in primary school), years of the first section that, in theory, school management schooling can be considered exogenous.16 practices and school admissions and advancement In practice, it is difficult to collect information on policies have only indirect effects since their only daily attendance and, in some cases, on school expen- effect is to change what happens in the classroom-by ditures. 
Thus it may be necessary to choose between changing material inputs, teacher characteristics, and reduced form and conditional demand relationships. pedagogical techniques-and to regulate who can For example, it would be very useful to know how enter the classroom. Thus, if data are available for all of textbooks affect learning, but in school systems in the important material input and teacher variables, which parents purchase (or decide not to purchase) only the material input and teacher variables are need- textbooks for their children, this cannot be estimated ed in any regression estimated to assess the impact of as a reduced form relationship because possession of a these school quality variables on educational out- textbook is a decision made by parents. A reasonable comes. This implies that it is not necessary to collect instrumental variable, such as the price of textbooks, data on school management practices or admissions raises the possibility of estimating a conditional and advancement policies if these aspects of schooling demand relationship even in the absence of data on are not being studied. school attendance, and perhaps even in the absence of However, there are three reasons why data on data on expenditures on other school supplies. This school management, admissions, and promotion poli- can be done by replacing ("substituting out") the vari- cies should be collected (and why the school ques- ables for which there is no information (such as school tionnaire introduced in the third section of this chap- attendance) or no particular interest with the exoge- ter does collect them). 
First, it is probably impossible nous variables that determine those variables, and then to collect data on all material input and teacher vari- predicting ("instrumenting") any remaining endoge- ables (indeed, it is hard to imagine how data could be nous variables.This yields an estimate of the impact of collected on a variable such as teacher motivation), textbooks on learning that is conditional on current and school management variables may pick up some household circumstances and behavioral responses to of the effects of these variables-reducing omitted those circumstances. Although this is not a production variable bias.17 Second, many policymakers and function, if the estimates are accurate they can be used researchers are very likely to be interested in these to predict the impact on learning if schools were to issues, either now or in the future. Third, there is very provide textbooks free of charge. little cost to collecting data on school management Unfortunately, there is a serious problem with practices and admissions/advancement policies if the conditional demand estimation. Replacing endoge- survey has already been designed to collect data on nous variables such as school attendance and school material inputs, teacher characteristics, and pedagogi- expenditures with the exogenous variables that deter- cal techniques. All that needs to be done is to add a mine them may use up all the instrumental variables few questions to the school questionnaire. that were available to use as instruments. For example, Finally, it is possible to combine estimates of price if the researcher wants to estimate the impact of text- effects with estimates of school and teacher character- books but there are no data on school expenditures, istic effects.This is useful for assessing the feasibility of the school expenditures variable will have to be "sub- simultaneously raising school fees and using the funds stituted out." 
Total school expenditures probably to pay for improvements in school quality. "Full" depend on the price of textbooks. Thus substituting reduced form estimates of educational outcomes will out that variable will introduce the price of textbooks use both price and school quality variables. This will 158 CHAPTER 7 EDUCATION provide parameter estimates that can be used to pre- Regarding policies that affect the value of children's dict the net effect on those outcomes of changing time, community questionnaires in multitopic house- prices and school quality (more specifically, changes in hold surveys should collect data on child wage rates. material inputs and teacher variables) simultaneously. The conimunity questionnaire introduced in Chapter In many cases, it is useful to interact the school and 13 does collect such data.A problem often encountered price variables with household income in order to see when collecting such data is that some communities whether the effects depend on household income lev- have no wage labor market for children. However, in els. For examples and discussion of this approach see most communities and in almost all rural areas, many Gertler and Glewwe (1990, 1992) and Selden and children work on the family farm or for the family Wasylenko (1995). business. It may be possible to estimate the value of chil- In summary, analyzing the impact of school and dren's time by estimating a profit function for a farm or teacher variables on educational outcomes requires business-for example, using the approach of Jacoby detailed information on school (and teacher) charac- (1993) or Newman and Gertler (1994). Such estimates teristics and school prices. It is impossible to avoid col- are particularly useful because they are based on varia- lecting detailed data from local schools (unless one tion within the community, so they can be estimated implements a randomized evaluation). 
Information even when using a community fixed effects estimation should be collected both on what takes place inside procedure to control for unobserved school and com- the classroom and on school management and admis- munity characteristics. For the data needed to estimate sions/advancement policies. profit functions for farms or nonfarm household enter- prises, see Chapters 18 and 19. OTHER POLICIES THAT AFFECT EDUCATIONAL OUT- Another way in which government policies can CoMES. The first section of this chapter listed three affect the opportunity cost of children's time is by other government policy areas that can affect educa- altering the amount of time children spend doing tional outcomes in developing countries: policies relat- household chores-say, by providing a well (which ed to child health and nutrition, policies that affect the would reduce the time that children, especially girls, opportunity cost of a child's time, and policies that spend fetching water) or changing the prices of affect the returns to schooling in the labor market. It is kerosene and cooking fuel (which would affect the not clear that a single cross-sectional household survey time that children spend fetching firewood). In gener- can shed any light on the last type of policy. This is al, this type of information should be collected in a because a single household survey identifies relation- community or price questionnaire (see Chapter 13). ships based on variations across households and com- The empirical techniques are little different from munities at one point in time. Most government poli- those already discussed above. However, if a commu- cies that affect labor market outcomes (for example, a nity fixed effects estimation procedure is used to con- freeze on hiring new government workers) usually do trol for unobserved community variables (such as not vary much across regions and communities at a sin- school quality), it is not possible to address these kinds gle point in time. 
However, some policies may pertain of questions unless there is variation in the relevant to specific regions or zones. For example, if a country distances (for example, the distance from each house- declares some areas to be export promotion zones and hold to a well) within a community. if employment opportunities in these zones generate a Finally, government policies that affect child health higher payoff for schooling, it may be possible to esti- and nutritional status can also affect educational out- mate the impact of the existence of these zones on the comes. Clearly, data on health and anthropometric sta- demand for schooling and thus to investigate what tus must be collected in the respective modules of the would happen to educational outcomes if these export household questionnaire (see Chapters 8 and 10 for promotion zones were extended into other areas. details). Unlike many other child characteristics (such However, this kind of estimation has few implications as sex, innate ability, and birth order), child health is not for designing the education module or any other part exogenous. Parents and their children make choices of a multitopic household questionnaire. All the that affect the children's health. Thus estimates of the researcher needs to know is the location of the com- impact of child health and nutrition on educational munities sampled in the survey. outcomes must be either conditional demand relation- 159 PAUL GLEWWE ships or learning production functions, and the kind of pricing policies and of classroom and teacher char- instrumental variables needed are those that predict acteristics (including pedagogical practices) on edu- child health but that have no direct impact on educa- cational outcomes. 
tional outcomes."8 Some obvious instruments are the In examining school fees it is important not to availability, quality, and fee structure of local health depend only on the expenditure data on education clinics and the prices of common medicines. However, collected in the household questionnaire; informa- if community fixed effects are used to control for tion on school fees must also be collected from a unobserved school quality, these variables cannot be school or community questionnaire. used as instruments. The exceptions are distance or * In general, it is not possible to analyze the impact travel time, which require data from each household on of student loan programs on educational outcomes the distance to the local health facilities.Alternatively, it using cross-sectional data, even if detailed data are is possible to use variations in parental height. For a collected on local schools. fuller discussion of what predicts child health and * It is very difficult, if not impossible, to estimate a pro- anthropometric status, see Chapters 8 and 10. duction function for learning using cross-sectional A final word of caution is in order regarding child data, and some would argue that the same is true for health and schooling. Recent research has seriously estimating conditional demand relationships. questioned whether cross-sectional data can be used to * As long as a household survey is designed to collect estimate the impact of child health on educational detailed data on what goes on in the classroom outcomes. In particular, it is hard to find instrumental (material inputs, teacher characteristics, and peda- variables that predict child health status but do not also gogical practices), it should also collect information directly affect educational outcomes. 
For example, the on management and admissions/advancement poli- prices of medicines should, like all other prices, direct- cies, even if policymakers at the time are not par- ly affect parental decisions regarding child schooling. ticularly interested in analyzing how school man- Many researchers have tried to overcome these prob- agement and admissions/advancement policies lems by using panel data. This method will be dis- affect educational outcomes. cussed in the next subsection. A single cross-sectional data set cannot shed light on the impact of government labor market policies SuMMARY. The main conclusions of this discussion on educational outcomes unless the policies vary by have been as follows. region; however, a series of cross-sectional data sets * If the researcher is interested only in understanding over a period of time in which those policies the impact of educational outcomes on other change may be able to do so. socioeconomic outcomes, there is no need to col- * Information should be collected on the education lect data from local schools, since such data are use- levels of the parents of all household members, ful only for understanding the impact of school regardless of whether the parents live in the house- policies on educational outcomes. hold or are even alive. * If the researcher is interested only in the impact of child and household factors on educational Using Panel Data to Investigate the Determinants of outcomes-and not in the impact of school or Educational Outcomes community variables on educational outcomes- Additional estimation techniques can be used if panel there is no need to collect detailed data on schools. data have been collected-in other words, if the same * If the researcher is interested only in the impact of households have been interviewed at more than one mandatory school fees or distance to schools on point in time over a period of months or years. 
Panel educational outcomes, this relationship can be esti- data are relatively rare in developing countries, but this mated without collecting detailed school data, by need not be the case in the future. collecting data on the distance from each house- Many problems arise when basic multiple regres- hold to local schools (although this requires making sion techniques, such as ordinary least squares, are used certain assumptions that may be false). on cross-sectional data because some of the variables * It is necessary to collect detailed data on local included in the regression are correlated with variables schools in order to examine the impact of other that are not included. The most obvious way to solve 160 CHAPTER 7 EDUCATION this problem is to collect information on the missing A, = PO + ±lXl + 12 variables, which will reduce this omitted variable bias. A3 = PO + lXl + PA + PA In theory this should be effective, but it is rarely pos- sible to collect all possible variables, either because of where A,, A , and A3 represent achievement (for cost limitations or because some variables (such as par- example, test scores) at the end of grades 1, 2, and 3, ents' tastes, children's innate ability and teachers' moti- respectively.19 The constant term P. represents what, if vation) are inherently difficult to measure. anything, the child has learned before entering first In principle, instrumental variable methods, such grade. The Xi, X,, and X3 variables refer to the values as two-stage least squares, can overcome the problem of all child, school, and household variables in grades of an explanatory variable being correlated with 1, 2, and 3, respectively. unobserved terms. Instrumental variable techniques The relationships shown in equation 7.1 imply can also resolve many problems of measurement error, the following relationships: Unfortunately, finding credible instrumental variables is not easy. 
(See Strauss andThomas 1995 for a detailed (7.2) A2 - Al = PAX discussion.) Another approach is to use fixed effects A3 - A2 = 133X3 procedures to control for differences across commnuni- ties or, in some cases, across families. Yet community Estimation of the relationships in equation 7.2 implies fixed effects procedures cannot eliminate problems much simpler data collection. For example, to under- due to variation in unobserved variables within the stand the effects of the child, household, and school community, and even family fixed etfects cannot deal characteristics that prevail when children are in third with variation within the family (such as variation in grade (in other words, to estimate P3), the X3 variables the innate abilities of siblings). Some of these problems can be regressed on A3 - A2. No information is need- can, under certain assumptions, be circumvented by ed on X, or X2. More generally, if data are available on using panel data. Chapter 23 provides a general test scores from a particular past period in time, there overview of the advantages and problems associated is no need to collect any data for any previous time with the collection and analysis of panel data.The cur- periods. Estimating equation 7.2 requires panel data rent subsection discusses four ways in which panel data because it requires that each child be tested at two dif- can be used to examine education issues in developing ferent points in time.20 countries. Finally, some economists have suggested that using panel data to estimate a value-added production func- VALUE-ADDED SPECIFICATION OF THE DETERMINANTS tion has the added advantage of eliminating all unob- OF LEARNING. Estimating the determinants of learning served fixed effects (Hanushek 1992). 
(whether production functions, reduced forms, or conditional demand relationships) is of great interest to policymakers. Many problems arise when trying to estimate these determinants using cross-sectional data. One problem is that learning is a cumulative process in that what a child has learned up to a particular point in time depends on an entire "history" of school, household, and child variables. Collecting data on past events and conditions is difficult, but omitting these variables may result in serious problems of omitted variable bias. Specifying an equation of the determinants of learning in "value-added" form can alleviate some of these problems. To see this, consider a simple linear model of learning in, say, the first three years of school:

(7.1) A1 = β0 + β1X

This is true under certain functional form assumptions, but it is somewhat risky because it requires the assumption that the unobserved variables affect total learning but do not affect increments to learning, which seems implausible.

IMPACT OF EARLY CHILDHOOD NUTRITION ON SCHOOLING. It is widely thought that poor nutrition in childhood leads to poor school performance (Del Rosso and Marek 1996; Pollitt 1990). This implies that policies improving children's nutritional status may also improve their school performance. However, using cross-sectional data to estimate the impact of childhood nutrition on educational outcomes is difficult or even impossible, for two main reasons. First, the crucial factor governing a child's nutritional status may be the conditions that prevail in the child's household during his or her first two to three years of life rather than the conditions at the time the child is in school. Cross-sectional data can provide only a summary measure of nutritional status over a child's life (say, height for age), and this summary measure is likely to contain substantial measurement error. Second, children's nutritional status is clearly endogenous, which means that instrumental variables would be needed in order to obtain unbiased estimates. Finding credible instrumental variables is not an easy task.

Panel data can overcome both of these problems. By definition, they can provide information on nutritional status in the first years of life and on subsequent school performance for the same child. In addition, panel data can provide a rich source of instrumental variables that are more plausible than those based on cross-sectional data; such instrumental variables include price shocks (Alderman and others 1997) and indicators of health during early childhood (Glewwe, Jacoby, and King Forthcoming). More generally, because a child's nutritional status is determined in part in the first two to three years of life (before he or she starts attending school), the process by which child nutrition affects school performance is inherently dynamic, which leads to the need for panel data (see Chapter 23 for further discussion of this point).

RETENTION OF SKILLS LEARNED. One issue that has received little attention in the past is the retention of skills acquired in school. Some educators claim that a minimal number of years of school attendance, such as four or six, is needed for children to avoid lapsing into complete illiteracy after leaving school. In addition, the jobs young people take after leaving school will result in loss, retention, or increase of the skills learned in school, depending on the type of job. Exactly what happens to skills after students leave school is unclear and may have important policy implications. For example, if a certain number of years of schooling guarantees lifelong literacy for nearly all those who leave school, a reallocation of educational resources may be needed to ensure that all children acquire this minimal level of schooling.

To analyze students' skill retention after leaving school, it is necessary to measure the skills of the same individuals at two or more points in time: at the time they leave school and at one or more later points in time. Unfortunately, few examples of such data exist because measuring cognitive skills after students leave school cannot be done in the classroom. Thus panel data are needed.

ENDOGENOUS SCHOOL CHARACTERISTICS. As was pointed out above, school characteristics can be endogenous either because of selective migration or because households' characteristics or actions determine local school characteristics. In general, as long as some cross-sectional data already exist on migration, selective migration does not necessarily imply that panel data must be collected. In contrast, if local school characteristics are determined in part by unobservable community characteristics, problems may arise that would be difficult to overcome without panel data. Both of these points are explained in the discussion on evaluating social sector programs in Chapter 23.

Only one empirical study on educational outcomes has used panel data to examine the impact of changes in local school characteristics on changes in those outcomes: an analysis of Indonesian data by Pitt, Rosenzweig, and Gibbons (1993). Several comments are in order regarding this study. First, the authors were able to implement their methodology using a district-level panel data set constructed from a series of cross-sectional household surveys; thus they did not need panel data on households. Second, the data sets used contained well over 100,000 households, many more than in even the largest LSMS survey. Third, in principle, such panel data are not needed if retrospective data exist on the availability of schools. A final, more general, comment is that methods to avoid bias brought about by endogenous school characteristics are still quite experimental; more research is needed on how to reduce such bias.

While these four examples show how panel data can be useful in analyzing education issues in developing countries, panel data may also have some disadvantages. First, because panel data tend to magnify the bias brought about by measurement error, careful thought must be given to which instrumental variable techniques to use (which may dictate that the panel should extend over three time periods) when making plans to collect panel data. Second, collecting panel data can lead to serious problems of sample attrition. Third, panel data may be more expensive to collect, relative to a series of cross-sectional surveys. On the other hand, the expense will be relatively low if individuals or households that move are not followed; see Chapter 23 for a detailed discussion of the costs of collecting panel data under different scenarios.

Using Randomized Trials for Program Evaluation

For researchers interested in collecting data that make it possible to assess how government policies affect educational outcomes, the discussion so far may have been discouraging. Many problems can arise regardless of whether cross-sectional or panel data are used, which raises serious questions about whether it is possible to estimate the effects of changes in educational policies with any accuracy. Economists have become more aware of these problems in recent years, and some now argue that randomized trials offer the best hope for overcoming these problems (Newman, Rawlings, and Gertler 1994; Burtless 1995).

The concept behind randomized trials is quite simple and persuasive. If a researcher wants to examine the impact of a particular educational policy, he or she should choose a representative sample of schools or communities and randomly divide the sample into two groups. The education policy of interest will be implemented in one group (the "treatment" group) while the other group will serve as a comparison group (the "control" group). This approach is theoretically sound, but there are some problems with putting it into practice.

First, governments are often reluctant to allow randomized trials because: they may be unwilling to admit that they do not know what works best; they are not patient enough to wait for the results; the results may go against entrenched political interests; they do not want to be seen as making their citizens subjects in "human experiments"; and they do not like the fact that random trials inevitably involve denying assistance, at least initially, to some people who are thought to be particularly needy. A second problem is that it may be difficult to prevent some of the people assigned to control groups from participating in the "treatment" group. For example, in a recent study of primary education in Kenya (Kremer and others 1997), the schools that received assistance experienced large increases in student enrollment, especially in the lower grades, which may have "contaminated" the experiment. (The problem of getting a clean comparison is highlighted by Heckman and Smith 1995.)

A third difficulty is the ethical issues that some people claim are involved in using human beings as participants in experiments. However, many things can be done to overcome such objections; indeed, medical researchers have been conducting randomized trials on human beings for decades and have developed rigorous procedures that meet contemporary ethical standards. Finally, randomized trials can be costly. This is mainly a problem if the trials cannot be carried out as part of the implementation of a policy or project, which forces the program being evaluated to be financed out of scarce research funds. However, in many cases randomized trials can be designed as part of an existing government project (Dow and others 1997), reducing costs substantially. Working with nongovernmental organizations is another way to reduce costs by evaluating existing projects; this was done by Glewwe, Kremer, and Moulin (1999).

A final issue to consider when conducting randomized trials is whether a multitopic household survey is needed. Randomized experiments can be designed to evaluate specific educational policies without conducting a household survey. In particular, data need be collected only on the socioeconomic outcomes that the policy is expected to influence. For example, it may not be necessary to collect information on household consumption expenditures because the policy was not expected to change them. However, there are three reasons why a multitopic survey should be implemented in the context of a randomized evaluation. First, it may be useful to perform separate comparisons for different socioeconomic groups, such as poor and nonpoor households. Second, it may also be useful to compare results from a randomized evaluation with results from cross-sectional or panel surveys, which implies the need to collect additional data as explained in the previous subsections. Third, the policy being examined may affect more socioeconomic outcomes than initially expected.

This being said, it may not be necessary to organize a nationwide multitopic household survey to accompany a randomized evaluation. In practice, randomized evaluations often cover a relatively small geographic region in order to contain costs. If the evaluation is not designed to be nationally representative, there is no need for the accompanying survey to be nationally representative either. On the other hand, randomized evaluations can sometimes be performed for a large geographic region or even a nationwide sample. One example of this is the random phasing-in of a new pricing policy over several years, as was done for health care services in Indonesia (Dow and others 1997). In such cases a household survey that covers several provinces or even an entire country may be the best way to assess the impact of the new policy or program.

Proposed Draft Education Modules for LSMS Surveys

This section introduces three draft education modules that can be used in future multitopic household surveys: a short version, a standard version, and an expanded version. (The modules themselves are provided in Volume 3.) The section also introduces two draft questionnaires for collecting data from local schools. Each draft module or questionnaire should be tailored to meet the overall objectives of the survey and the characteristics of the education system in the country where the survey is being conducted. When designing household questionnaires, survey designers may want to incorporate features from more than one of the three draft modules. For example, they may choose to use the short module as the basis for the survey but add several questions from the standard module. Ultimately, the objectives and constraints will be different in each country, so these modules should be thought of as starting points rather than finished products.

Box 7.2 summarizes the capacity of each of the different modules to answer the policy questions raised in the first section of this chapter. The box does not address the short education module, because this module is designed primarily for investigating the impact of household members' educational outcomes (grade attainment, years of schooling, repetition, and diplomas obtained) on other socioeconomic outcomes. Other chapters in this book examine in detail how to conduct such investigations.

Finally, as was mentioned several times above, certain data from other parts of a typical multitopic household survey are very important for analyzing the determinants of educational outcomes. These data are:
* Child wage rates (from the community questionnaire).
* Schooling and occupation of the parents of all household members (from the household roster).
* Prices of optional educational inputs (textbooks, exercise books, slates, pencils) from the price questionnaire.
* Anthropometric measurements and other indicators of child health.
* Distances and travel times to community sources of water and fuel (to calculate the opportunity cost of children's time).

Comments and Notes on the Draft Education Modules

This section provides detailed comments and notes that explain why the modules introduced in the previous section (and presented in Volume 3) take the form that they now have. The first three subsections discuss each household-level module in detail, including how each can be altered to fit specific needs and constraints. The fourth subsection discusses the collection of school data using school and teacher questionnaires.

Short Education Module

The short module is designed for a survey that focuses on a topic other than education and for which the standard education module is too large. The short module collects data on the most basic educational outcomes: years of schooling completed, degrees or diplomas obtained, current school attendance, repetition, and expenditures on schooling. This module is appropriate if policymakers and researchers have little or no interest in estimating the determinants of educational outcomes, in which case education data are needed primarily to provide basic descriptive statistics such as current school enrollment rates and the distribution of education across the adult population, as well as estimates of the impact on other socioeconomic outcomes of years of schooling, repetition, and certificates and degrees obtained. In such situations there is no reason to collect data on education from either the community questionnaire or a school questionnaire.

The following notes explain several details of the short module.

Q3-4, Q6-7. These questions are almost identical, except that Questions 3 and 4 are asked of people who have finished their schooling while Questions 6 and 7 are asked of people currently in school. This distinction (made in Question 2) is important to ensure that the response from individuals currently in school is their current grade rather than the "highest grade completed," which, precisely interpreted, would be the grade immediately preceding their current grade.

Box 7.2 Policy Issues and How the Draft Modules Can Address Them

Standard Module without School Questionnaire
* Impact of school fees on all educational outcomes except learning (assuming that distance to schools affects educational choices in the same way that money costs of schooling affect those choices)
* Impact of school management on all educational outcomes except learning (assuming differences in school management can be summarized using a simple variable, such as public/private status of schools, that can be collected in the education module)

Standard Module with School Questionnaire
* Impact of school fees on all educational outcomes except learning
* Impact of the prices of optional schooling inputs on all educational outcomes except learning
* Impact of distance on all educational outcomes except learning
* Impact of material inputs and very basic teacher characteristics on all educational outcomes except learning, assuming (unobserved) teaching practices and more detailed teacher characteristics are only weakly correlated with (observed) material inputs and basic teacher characteristics
* Impact of government policies that affect the value of children's time on all educational outcomes except learning

Expanded Module (Including Cognitive Tests, School Questionnaires, and Teacher Questionnaires)
* Impact of school fees on all educational outcomes
* Impact of optional schooling input prices on all educational outcomes
* Impact of distance on all educational outcomes
* Impact of material inputs, teacher characteristics, and teaching practices on all educational outcomes
* Impact of school management on all educational outcomes
* Impact of school admissions and advancement policies on all educational outcomes (assuming that these policies vary across schools)
* Impact of government policies that affect the value of children's time on all educational outcomes

Issues and Methodologies That Require Panel Data
* Impact of student loan program, if country recently introduced a student loan program, and the collection of panel data began before the program was introduced and continued after it was introduced
* Impact of school characteristics on educational outcomes, if endogenous program placement biases conventional estimates
* Impact of school admissions and advancement policies on educational outcomes, if the policies changed recently and panel data were collected both before and after the policy change
* Impact of labor market policies on educational outcomes, if policies have changed in recent years and panel data were collected both before and after the policy change
* Impact of child health and nutrition on educational outcomes
* Estimated value added specification of learning production function
* Estimated retention of cognitive skills after children leave school

Issues That Cannot Be Addressed Using Household Survey Data (Except through a Randomized Evaluation)
* Impact of student loan programs (if country has never had a student loan program)
* Impact of school admissions and advancement policies on educational outcomes if the policies have not changed recently or if they have changed but panel data were not collected both before and after the change
* Impact of labor market policies on educational outcomes if policies have not changed in recent years or if panel data are not available for both before and after a policy change

Source: Author's summary

Q4, Q7. Most diplomas are sequential, so that if a respondent has a higher diploma (for example, a diploma certifying completion of upper secondary school), this implies that he or she has also attained a lower diploma (for example, a diploma certifying completion of lower secondary school). However, there may be some ambiguous cases. If such cases are common, it may be that two or even three responses should be allowed for Questions 4 and 7.

Q8. The main purpose of this question is to see who really benefits from public education. For example, if most children from wealthy households go to private schools, the provision of public schooling may disproportionately benefit poorer households. Some policymakers and researchers may also be interested in seeing whether the impact of education on some other socioeconomic outcome, such as wages, also varies across public and private schools. If this is of interest, a question similar to Question 8 should be inserted after Question 4; this new question would refer to the last school attended by the respondent.

Q9. The main purpose of this question is to obtain accurate data on school expenditures for the purposes of calculating total household expenditures. A simple question in the household consumption module is likely to result in serious underestimation of actual spending. The exact categories of expenditure must be adapted to fit the country in question; these should be carefully checked during the pilot test of the questionnaire.

Q10-11. These questions on repetition can be dropped if repetition is relatively rare. However, when repetition is not rare it is useful to distinguish between years in school and grades attained. Simple estimates of rates of return to a year of education may be seriously overestimated if they do not account for repetition (Behrman and Deolalikar 1991).

Box 7.3 Cautionary Advice

* How Much of the Draft Module Is New and Unproven? The short version of the education module follows a similar approach to past practice in LSMS surveys, and thus is based on proven methods. In the proposed standard education module, a few elements are new or have been used in only two or three past LSMS surveys, but none of these innovations is very complicated. Asking respondents to read a short sentence and perform simple written mathematical exercises has been tried in Morocco, South Africa and Vietnam. Asking separate sets of questions for persons now in school and for persons no longer in school (Question 6 of Part A) and asking separate questions for different levels of school (Question 7 of Part A) have been tried twice before (Jamaica 1990, Vietnam 1997-98). The questions on grade repetition (Questions 41-46 of Part A) have been tried in about four previous LSMS surveys. Part D on distances to local schools has been tried in two previous LSMS surveys (Ghana 1988-89, Vietnam 1997-98).
* How Well Has the Module Worked in the Past? In general, the education module has worked quite well. The only problems in past LSMS surveys have been in expanded modules, particularly ones that have administered achievement tests to household members and ones that have tried to match household members to specific schools. Much was learned from these exercises, and those lessons have been incorporated into the advice given in this chapter.
* Which Parts of the Module Most Need to Be Customized? Several points need to be watched closely. First, a clear distinction between training and apprenticeship needs to be made. This definition may vary across countries. Similarly, a clear distinction needs to be made between general education and technical or professional training. Second, when defining schooling levels and diploma codes, clear instructions are needed on how "old" systems of education, which will be reported by older individuals in the sample, are coded. Third, in many countries there are kinds of school expenditures that are rarely found in other countries. Unusual categories in any given country can be obtained by asking officials from the ministry of education and participating in the pilot test of the questionnaire.

Standard Education Module

The standard education module is designed to gather the information needed to answer most of the policy questions discussed in this chapter. It is therefore designed to allow researchers to investigate not only the impact of education on other socioeconomic outcomes but also the determinants of most educational outcomes. The main omission from this module is that no effort is made to collect detailed information on cognitive skills. Collecting cognitive skills data will be discussed further in the following subsection.

Before discussing this module in detail, several general points need to be made. First, a cursory glance suggests that this module is substantially longer than the standard education modules used in full LSMS surveys in the late 1980s and early 1990s. This appearance is deceptive. The main reason there are more questions is that people who have already finished their schooling answer a set of questions that is different from the set answered by people who are still in school. The former answer Questions 7-24 of Part A while the latter answer Questions 25-40. Having these two sets of people answer different questions should make the interview go easier; in particular, it should avoid any confusion about whether the grade indicated for a person still in school is the current grade or the "highest grade completed," which, technically speaking, is the previous grade. On the other hand, there are a few additional questions on repetition, and Part D is completely new. However, some questions have also been deleted, and Part D has to be filled out only once for the entire household rather than once for each individual (and households that contain no school-age children do not fill it out at all). Overall, there is a small increase in the number of questions asked but this increase is not nearly as great as initial appearances may suggest.21

Another general point is that one may want to add a few qualitative questions asking households why they made the schooling decisions they did. While some might argue that these explanations are rationalizations and thus are not very accurate, others may find such information interesting, if for no other reason than to see what households claim are the reasons for the choices they have made. In particular, a question could be inserted after Question 5 in Part A asking: "Why have you never attended school?" Possible reasons are the same ones given for nonattendance in the education module of the community questionnaire (see Chapter 13). It may only be useful to ask this question for relatively young individuals. A similar question could be added after Question 6 of Part A: "Why are you no longer attending school?" Again, this may only be appropriate for younger individuals. A third place to ask a qualitative question is immediately after Question 30 of Part A; for persons who were absent from school in some of the past seven days one could ask: "Why were you absent from school during the past seven days?"

A third general point is that any survey that uses the standard version of the education module should also collect anthropometric data, since there is strong evidence that early childhood nutrition affects educational outcomes.22 The collection of anthropometric data is discussed in Chapter 10. A final point is that two school questionnaires (the questionnaire for administrators and the questionnaire for teachers) could be used with the standard module. This is not necessary, but it will increase substantially the range of issues that can be analyzed (as seen in Box 7.2).

The following notes on specific questions in the standard module are useful for understanding the module's design and thinking about how to modify the module to fit particular interests and constraints.

A1. In general, it is strongly advisable to interview a subject directly, with the possible exception of children age 10 and under, because asking someone else greatly increases the chance of errors and missing data. By indicating whether a person was interviewed directly, this question provides information on the accuracy of the data.

A3-A4. For Question A3, a simple sentence of six to eight words in one or more languages should be prepared based on the primary-school curriculum. It is best to have three or four variants to prevent some people in a household from doing well because they overheard someone else read the same sentence. If the language used by the household members is not the official national language, A3 should be applied twice, once for the national language and once for the language used by the household. Mathematics problems should be simple addition or subtraction at a level consistent with two to three years of primary education. In some countries (Bolivia is one example) students receive scores on national examinations each year, and this information is commonly kept by the household. In such cases, for children currently in school, Questions A3 and A4 should be replaced with the scores on these examinations.

A8, A26, A49-A50. These questions on preschool can be dropped if preschool education is rare. It is important to clarify the difference between preschool and daycare; most surveys only investigate preschool.

A12, A34. If there are 10 or fewer postsecondary institutions in the country, it is useful to add a question requesting the institution's name and to include in that question explicit codes for each institution.

A13, A17, A21. It is important to distinguish between finishing a grade and passing an examination on the one hand and finishing a grade and not passing the examination on the other. A student who finished a grade but did not pass an examination should still be classified as having finished that grade.

A15, A19, A23. These questions on when schooling is finished are important for two reasons. First, if a person dropped out after only one or two months, it is important not to credit him or her with an entire year of schooling. Second, combining this information with the information on the time when the respondent first entered school (Questions A47 and A48) makes it possible to verify grade repetition using a computerized data entry package. (See Grosh and Muñoz 1996 for a detailed discussion of computerization of data entry.)

A17-A18, A21-A22. If no diploma is associated with lower secondary or primary education, the interviewer can simply ask the second question of each pair.

A27. It is important to know whether a student lives away from home while attending school because this implies a large opportunity cost to the household in terms of the time that the student could have spent working for the household after school or on weekends.

A28. Finding out the name of the school is important for matching data from the school or community questionnaire. Codes should be assigned as soon as possible, preferably before any household interviews have started (so that interviewers will already have a list of codes when filling out the questionnaire) and certainly before the team leaves the community. If official school code numbers are used by the ministry of education, these codes should also be added in a separate column. It is not advisable to depend on national codes alone because new schools and private schools often do not have official code numbers.

A29-A30. Ideally, it would be useful to know about all absences during the entire school year and even in past years. Absences during the previous week are a very rough indicator of the necessary information, but it is unlikely that children or their parents will be able to accurately recall absences over a longer period of time. In some countries schools may keep records, at least for the current year. If so, serious thought should be given to trying to obtain that information. This would involve a significant amount of work and would require pretesting. Finally, combining information on absences during the previous seven days with information on class time as given in the school questionnaire allows researchers to calculate time spent in school in the previous week. However, if it is common for children to attend school for only part of the school day (because they arrive late or leave early), a separate question should be asked regarding the number of hours the student missed in the previous seven days due to arriving late or leaving early.

A41-A46. These questions could be condensed or eliminated for any level of schooling in which repetition is rare (in other words, if fewer than 2-3 percent of students repeat). Condensing these questions simply involves asking whether the individual repeated any grade in primary or secondary school and, if so, which grades were repeated (allowing for up to three responses).

A47-A48. To verify the accuracy of the information, it is important to ask both questions.

B1-B2. These questions include professional and technical training. An alternative is to exclude such activities from the information collected here and to add to Part C a question on expenditures on professional and technical training during the past 12 months.

B2. This question has two purposes: accurately collecting total educational expenditures by a household and collecting child-specific information on how household resources are allocated across siblings and across different types of educational expenses.

B3-B5. This information on assistance received from people who are members of other households should be coordinated with information from the transfers and other nonlabor income module on income transfers received from other households (see Chapter 11). Rules need to be set to avoid double counting income that goes directly to the household. These rules may vary according to country-specific characteristics. Payments made directly to schools, such as tuition payments, should probably be counted only here, and not in the transfers and other nonlabor income module.

B6-B7. If a voucher scheme of some kind exists, separate questions along the lines of Questions B6 and B7 should be added. If there are different kinds of vouchers, another question should be added to determine which kind the child receives.

C1-C3. The definition of an apprenticeship needs to be carefully worded in order to reflect the system of apprenticeship that prevails in the country where the survey is being carried out. If apprenticeships are quite rare in the country (involving less than 2-3 percent of the population), Questions C1, C2, and C3 can be dropped.

C1, C4. It may be easier to obtain accurate responses to these questions by dividing each into two distinct questions. The first question would ask whether the respondent has ever been an apprentice (or had technical training) and the second would ask whether the respondent is currently an apprentice (or currently being trained). The best wording for these questions should be checked during the pilot test of the questionnaire.

C4-C6. It is important to distinguish between professional and technical training at the postsecondary level and equivalent training that does not require the student to have finished upper secondary school. Only the second kind of training is considered here; the first kind is addressed in Part A as general postsecondary education. Particular attention should be given to teacher training. In some countries teacher training always takes the form of postsecondary education, and thus should be dealt with explicitly in Part A. In other countries at least some types of teacher training are not considered postsecondary; these types should be explicitly identified in Part C.

PART D. These questions are intended to obtain a picture of what choices parents have when they send (or do not send) their children to school. When combined with information on the date when a school opened, these questions can be used to get around the problem of nonrandom placement of schools by the government. The ID codes are needed to match this information with the school that is chosen (obtained in Question A28) and with the detailed information obtained in the community questions or the school questionnaire. Since respondents may not be adept at providing distance information, this information can be augmented with data from GPS (global positioning system) measurements of the location of the households and the schools.

Expanded Education Module

The education module in the household questionnaire can be expanded in two general ways. First, relatively simple achievement tests can be administered to household members, both members currently in school and members out of school. This is a major undertaking, but one that when done correctly can yield rich data that will deepen policymakers' understanding of what makes schools perform effectively and how skills learned in school contribute to a wide variety of socioeconomic outcomes. (If this route is taken, Questions A2 and A3 in the standard module can be dropped.) Second, more information can be collected on children currently in school, which amounts to adding questions to Part B of the standard module. In this subsection some practical tips on administering tests are discussed, and some questions are presented that can be added to Part B of the standard education module.

ACHIEVEMENT TESTS. There is not enough space in this chapter to discuss how to design and administer achievement tests as part of a multitopic household survey. The next few paragraphs will discuss major testing issues and the advantages and disadvantages of the different testing choices. It is strongly recommended that a highly qualified local or international expert be hired to develop the tests. (For a good discussion of literacy tests in the context of household surveys see UNNHSCP 1989 and Wagner 1993.) The basic decisions that must be made are: who is tested; which skills are tested; the length of the tests; the relationship between the tests and the national curriculum; and where the tests take place.

There are essentially two types of household members that can be tested: children of school age and adults who have finished their schooling. Children are tested to find out how their characteristics, along with household and school characteristics, affect the skills that they acquire. Adults are tested to study the impact of skills learned on socioeconomic outcomes. It is often useful to test both groups in order to study the impact of determinants of learning among children on their socioeconomic outcomes in adulthood. For example, if a certain schooling improvement costs $20 per student and raises the average student's mathematics achievement 10 points, the value of those 10 points can be estimated in terms of the students' increased incomes in adulthood. This is one way (although not the only way) of determining whether the school improvement is a good investment. If policymakers and researchers are interested in making such evaluations, it is crucial that the same test be given to both school-age children and adults or, alternatively, that when different tests are given they can be "scaled" to each other to enable direct comparisons between scores across different tests.

Which skills should be tested depends upon the specific interests of the policymakers, but in almost all cases they will include basic reading comprehension and mathematics. In addition, if the interest exists and resources are available, it would be useful to test for the following skills: science, health knowledge, innate intelligence, reading comprehension in another language, values and behavioral norms, agricultural knowledge, and basic practical knowledge. Testing intelligence is controversial, but there are several commonly used tests for this purpose, including the Raven's Progressive Matrices test. The skills covered need not be limited to skills taught in the school curriculum; indeed, it may be interesting to see how people acquire skills that are not taught in school.

In general, the tests should be quite short; 10-20 questions should be sufficient for most purposes. In some cases it may be useful to have a basic test and an advanced test, allowing only people who achieve a

adults-to come take the tests. One way to reduce this problem is to compensate subjects for being tested, but even this may not work with many adults. Although some testing specialists may have serious reservations about testing done in respondents' homes, there are several measures that can be taken to minimize the problems involved.
For further practical advice see "passing" score on the basic test to take the advanced UNNHSCP (1989) and Wagner (1993). test. The length of the test or tests should be thor- oughly discussed with the consultant hired to develop ADDITIONS TO THE HOUSEHOLD QUESTIONNAIRE. the test. Many testing consultants are accustomed to Some additional questions to expand Part B of the long tests that are administered in classroom settings standard module are given in the expanded question- with few time constraints, so it may be necessary to naire module page, to which the following comments impress upon the consultant the impracticality of apply. administering long tests in a survey setting. One yardstick for measuring school performance B11-B13. These questions can be used to investigate is how well students acquire the skills that the nation- how important textbooks are under different circum- al curriculum is designed to teach. This implies that stances.These circumstances include whether the child the tests should be designed to fit that curriculum. owns a full set, whether the child can take the text- However, there may be reasons not to design tests this books home (if they are provided by the school), way. First, the curriculum itself may be outdated or whether the books are shared with others, and irrelevant, and thus have little impact on many socioe- whether they are new or used. In principle, similar conomic outcomes. Second, many skills of interest, questions can be asked about other student-specific such as agricultural skills or knowledge about health, inputs such as exercise books, slates, pens, and pen- may not be taught in schools.Third and finally, it is not cils-but it is not clear whether it is feasible or worth- clear that the precise skills that make workers more xvhile to collect detailed information on these items. productive are the skills taught in schools. 
In general, Which items to include in detail may depend on the the national curriculum may be a good place to start, circumstances in a particular country. but serious thought should be given to measuring other skills as well. B14. In principle, homework is one input into a learn- Perhaps the most difficult aspect of administering ing production function, but it is clearly endogenous achievement tests as part of a household survey is find- and it would be hard to find instrumental variables ing a good place to administer the tests. If the tests are that can be excluded from the production function. If administered only to children currently in school and such instrumental variables can be found, it might be the data are collected when schools are in session, the worthwhile to add a question about whether the child tests can be administered in schools.23 However, if was assisted by an adult household member, as well as adults or children not currently enrolled in school are perhaps a question on the number of hours of tutor- to be tested, they must be tested either in the home or ing the child received per week-distinguishing in some community center. between assistance received from family members and Testing in a community center has several advan- assistance received from paid tutors. tages. There are minimal distLrbances and it is easier than in other settings to prevent test takers from assist- B16-B18. These questions gather information on par- ing each other. Test administrators have relatively more ticipation in school feeding programs in order to yield control over factors that may affect the test, such as ade- descriptive information on who benefits from these quate lghting and enough tables to write on.And there programs. More ambitiously, it might be possible to should be relatively few comparability problems with measure the impact of school feeding on educational other tests that are administered in a similar setting. 
outcomes such as attendance and test performance, or The main disadvantage of testing in a community even on nutritional outcomes such as height and center is that it is difficult to get people-especially weight. 170 CHAPTER 7 EDUCATION School and Teacher Questionnaires All. The answers to this question may vary by grade. As explained in the second section of this chapter, If so, separate answers should be allowed for each most methods for estimating the determinants of grade. Also, if two or more languages are used for cer- educational outcomes require information on the tain subjects, codes should be developed for combina- schools that are available for children to attend. LSMS tions involving more than one language. surveys in the 1980s and 1990s have typically relied on information collected in the community Bl-B5. An alternative to these questions, which would questionnaire-information that has been much too involve more resources, is to administer teacher ques- brief. It is necessary to collect a large amount of infor- tionnaires, an example of which is discussed below. In mation on school quality to reduce problems of omit- such cases it would probably be necessary to retain only ted variable bias. Thus in the case of a standard, full- Questions BI and B2 in the school questionnaire. size multitopic household survey it is recommended that a school questionnaire be used for each local pri- B3. In some countries certain teachers may have an mary or secondary school. This may not be feasible in explicit credential that others do not have. In such urban areas, so a reasonable rule of thumb is to take cases this question should ask specifically about that the five mnost commonly attended primary schools, credeintial. the three most commonly attended lower secondary schools, and the three most commonly attended C3. A legible blackboard is any blackboard that can be upper secondary schools. The basic school question- used. 
It is possible to draw a finer distinction by asking naire, introduced in the third section of this chapter, whether a blackboard is sufficiently legible to be seen is provided inVolume 3 of this book. Roughly speak- by all students or only by some students (either ing, it should only take about 30 minutes to complete because some students have poor eyesight or because for one school. they sit at the back of the classroom). SCHOOL QUESTIONNAIRE. The following notes apply D2. In countries in which several languages are spo- to the school questionnaire: ken, information on the number of books in the school library should be collected separately for each A3. In some countries schools may be combinations language. of these categories, such as a combined lower and upper secondary school. Additional codes for such D3. In most countries, the question on science labora- cases should be created as needed. tories need not be asked in primary schools. A6. Information on when the school first opened is El-E4. These questions apply only to school levels useful for analyzing the nonrandom placement of that have national examinations. In some cases the schools, as in Pitt, Rosenzweig, and Gibbons (1993). school principal may be able to provide more detailed information, such as how many students scored at sev- A8-A9. If the number of shifts (Question A8) or the eral different levels. amount of class time per day (Question A9) varies by grade, extra columns should be created, as in Question Fl. Obtaining accurate figures on school fees is in A7, to get the information by grade. practice much more difficult than might be expected, since there are many different fees and the line A9. Class time per day is a critically important input between voluntary and mandatory is often unclear. It into the learning production function. It also meas- is best to list as many fees as possible by their exact ures the opportunity cost of children's time spent in names. 
If there is substantial variation by grade, it is school. best to obtain this information separately for each grade. A10. If many schools are open for fewer than the stan- dard number of weeks per year, it may be useful to add F7-F9. School uniforms can be a major cost to house- a question on the reasons why this is the case. holds, so relaxing the requirement to purchase a uni- 171 PAUL GLEWWE form may greatly reduce the real price of schooling Q39-40. If achievement tests are being administered to for poor households. students on subjects other than mathematics and read- ing, it may be useful to add similar questions for those TEACHER QUESTIONNAIRE. If the main focus of the subjects. survey is on education, it is important to collect infor- mation on individual teachers using a teacher ques- Q43-45. If a new kind of in-service teacher training tionnaire, a prototype of which is provided inVolume program is being used, the questions should specifical- 3. In most countries teachers should be able to fill out ly ask whether this training was of that type. the questionnaires themselves-vastly reducing the amount of time interviewers need to spend adminis- Notes tering these questions.Yet even when teachers can fill these questionnaires out, the interviewer should check The author would like to thank Jere Behrman, Kimberly all of them on the spot to see if any errors have been Cartwright, Margaret Grosh, Eric Hanushek, Xiaoyan Liang, made, and then ask the teachers to fix those errors. Marlaine Lockheed, Andrew Mason, Raylynn Oliver, Maris If a school has more than, say, 10 teachers, it may be O'Rourke, Harry Patrinos, Julie Anderson Schaffiher, and T. Paul useful to choose two or three teachers per grade at ran- Schultz for comments on earlier drafts. dom and administer the questionnaires only to them. 1. Even tuition is not always paid. 
In some countries a child may One way to choose teachers at random is to ask all of attend school temporarily while the teacher or principal xvaits for them their dates of birth and then select the two or three the child's parents to pay If the parents never pay, their children are with the earliest dates of birth within the calendar year. usually forced to withdraw from the school (although sometimes A final possibility to consider when collecting poor children are allowed to stay). information from individual teachers is to administer 2. There is one important exception to this statement. Distance achievement tests to the teachers. In many developing to schools can directly affect learning if long distances lead to more countries, teachers' knowledge of the curriculum has absences or increased tardiness. This will be discussed in the next serious deficiencies, and one would expect that the section of this chapter. students of weaker teachers would do worse in school, 3. For material inputs that are child-specific, such as textbooks, other things being equal. However, this information an additional issue is whether the school should provide them or can be quite difficult to collect. Teachers may resist households should be required to purchase them. For most pur- being tested because they may think that the results poses this can be thought of as a price question, xvhere provision by will be used against them. Assurances that the data are the school implies a price of zero. However, when schools provide being collected purely for research purposes may these inputs, rationing may well take place, complicating efforts to prove unconvincing (though not requiring the teach- understand the impact of such a policy ers within a given school to write their names on the 4. Distance education can be viewed as a pedagogical technique tests might improve cooperation). 
Thus, if everything because teachers turn on the radio, may participate as directed by else is in place for the survey, it is worthwhile to the radio program, and are expected to follow up when the pro- explore the possibility of testing teachers, but if signif- gram is finished. icant resistance is encountered, it may be best to drop 5. In this case, reduced school attainment may not be cause for this in order not to jeopardize the rest of the survey. alarm. It may be that high-paying government jobs were generat- The following comments apply to the teacher ing private rates of return to higher education that exceeded social questionnaire: rates of return. 6. Information about on-the-job training is best collected in the Q12-16, Q17-21. If achievement tests are being employment module of a multitopic household survey. See Chapter administered to students on subjects other than math- 9 for details. ematics and reading, similar sets of five questions can 7. Collecting information on schoolchildren's values may be added for those subjects. also be useful, but to the author's knowledge this has never been done for a representative sample of children in a developing Q34-35. If these tasks are primarily done during class country. time, Questions 34 and 35 should be part of the pre- 8. One estimation issue here is that the education variable may vious set of questions (Questions 24-33). be endogenous. Another issue is measurement error in the educa- 172 CHAPTER 7 EDUCATION tion variable. Both problems lead to biased estimates, and both are example, matching children from the 1990 Jamaican LSMS survey discussed further in the next section of this chapter. to national examination records proved impossible. 9. A local school can be defined as any school within walking 21. Consider two people-one who attended only primary distance (more generally, within daily commuting distance) of a school many years ago and another, currently in upper secondary household. 
school, who repeated one year of lower secondary school. Assume 10. Recall that causal factors are exogenous if they cannot be that neither has been an apprentice or had any technical educa- affected by household behavior (choices). tion or training. Using the 1987-88 Ghana Living Standards 11. In theory, there could also be a production fuinction for val- Survey for comparison, the number of questions asked of the first ues, but to the author's knowledge no one has attempted to esti- person increases from 13 to 15, while the increase for the second mate such a relationship and interpret it as a production function. person is from 17 to 26.The second person's increase is due most- 12. There are tsvo exceptions to this statement. First, distances ly to four questions on grade repetition and tsvo questions on the to local schools can vary across households. Second, Hausman and age and year when the person entered first grade. (This compar- Taylor (1981) provide a general method that, under certain condi- ison ignores the questions in Part D, which are asked only once tions, can be used to estimate school effects, community effects, or for each household and only if the household has children of both. However, to the author's knowledge, this method has never school age.) been used to analyze educational outcomes, and, in practice, the 22. The main exception to this recommendation is that anthro- assumptions required for it to be valid may not be plausible. pometric data need not be collected in a country with very low 13. Two implicit assumptions are made here: that the distance to rates of malnutrition. If the incidence of malnutrition is unclear, the school attended is exogenous and that each child attends school anthropometric data should be collected. nearly every day (which implies a fixed opportunity cost of travel 23. This does not mean that all children in a given school will time for each year of school enrollment). 
Both of these assumptions be tested; if only a few sampled children attend a given school, they can be questioned. can be asked to remain at school after classes are over or permission 14. This raises the issue of howv to collect accurate data on dis- can be obtained to withdraw them during the school day (as was tances. Households' estimates may not be very accurate, which will done in the 1990 Jamaican LSMS survey). lead to bias due to measurement error. One recent development in survey design involves use of global positioning system technology, References which can measure the latitude and longitude of any location with- in about 100 feet. If readings are taken for dwellings and local Alderman, Harold,Jere Behrman,Victor Lavy, and Rehka Menon. schools, precise distances (by air) can be calculated. 1997. "Child Nutrition, Child Health and School Enrollment: 15. It would be possible to evaluate student loans using cross- A Longitudinal Analysis.' Policy Research Working Paper sectional data in the context of a randomized evaluation This gen- 1700.World Bank, Development Research Group,Washington, eral approach is discussed below. D.C. 16. This ignores grade repetition. Ifrepetition is common, years Becker, Gary. 1995, "Human Capital and Poverty Alleviation." of schooling should be used rather than the grade attained. HRO Working Paper 52. World Lassik, Humain Resources 17. Note that assigning an interpretation to the parameter esti- Development and Operations Policy Vice Presidency, mates associated with school management variables is risky, since Washington, D.C. the magnitude of their effect depends on the completeness of the Behrman, Jere, and Anl Deolalikar. 1991. "School Repetition, data on material inputs, teacher characteristics, and pedagogical Dropouts and the Returns to School: The Case of Indonesia." practices. Oxford Bulletitn of Econonmics and Statistics 53 (4): 467-80. 18. Alternatively, a reduced form relationship could be estimat- Burtless, Gary. 
1995. "The Case for Randomized Field Trials in ed by including information on local health clinics and medicine Econoniic and Policy Recearch?"Journal of Fi0Ecot0oic Petspectives prices when estimating the determinants of educational outcomes. 9 (2): 63-84. In this case, it is not even necessary to include data on child health, Del Rosso,Joy, andTonia Marek. 1996. ClassAction: Improving Schzool becaulse it is substituted out of the reduced forna relationship. Petirnance in the Developintg Worid throngls Better HealtFs and 19. Equations for higher grades could be added in equation 7.1, lNutrition. Washington, D.C.:World Bank. but this has not been done in order to keep the exposition simple. DoNv, William, Paul Gertler, Robert Schoeni, John Strauss, and 20. If previous test scores can be obtaincd from other sources, Duncan Thomas. 1997. "Health Care Prices, Health, and Labor panel data are not needed. Unfortunately this is rare, and matching Outcomes: Experimental Evidence." RAND Labor and individuals from the two data sources can be very difficult. For Population Working Paper 97-01. Santa Monica, Cal. 173 PAUL GLEWWE Gertler, Paul, and Paul Glewwe. 1990. "The Willingness to Pay for 1994. "Borrowing Constraints and Progress through Education in Developing Countries: Evidence from Rural School: Evidence from Peru." Review of Economics and Statistics Peru. Journal of Public Economics 42 (3): 251-75. 76 (10): 151-60. 1992. "The Willingness to Pay for Education for Daughters Knight,John, and Richard Sabot. 1992. Educational Performance of the versus Sons: Evidence from Rural Peru." World Bank Economic Poor: Lessons from Rural Northeast Brazil. Nexw York: Oxford Review 6 (1): 171-88. University Press. Glevwve, Paul. 1999. The Economics of Sclhool Quality Investments in Kremer, Michael, Sylvie Moulin, David Myatt, and Robert Developing Countries: An Empirical Study of Glhana. London: Namunyu. 1997. "Textbooks, Class Size and Test Scores: Macmillan. 
Evidence from a Prospective Evaluation in Kenya." GlewAve, Paul, and HananJacoby 1994. "Student Achievement and Massachusetts Institute of Technology, Department of Schooling Choice in Low-income Countries: Evidence from Economics, Cambridge, Mass. Ghana."Journal of Humsan Resources 29 (3): 843-64. Lockheed, Marlaine, and AdriaanVerspoor. 1991. Irmproving Primary - 1995. "An Economic Analysis of Delayed Primary School Education in Developing Countries. NewYork: Oxford University Enrollment in a Low-income Country: The Role of Early Press. Childhood Nutrition." Review of Economics and Statistics 77 (1): Murnane, Richard, John Willett, and Frank Levy. 1995. "The 156-69. Growing Importance of Cognitive Skills in Wage Glewwe, Paul, Hanan Jacoby, and Elizabeth King. Forthcoming. Determination." Review of Economics and Statistics 77 (2): "Early Childhood Nutrition and Academic Achievement: A 251-66. Longitudinal Analysis."Journal of Public Economics. Newman,John, and Paul Gertler. 1994. "Family Productivity, Labor Glew-we, Paul, Michael Kremer, and Sylvie Moulin. 1999. Supply, and Welfare in a Low-Income Country" Journal of "Textbooks and Test Scores: Evidence from a Prospective Human Resources 29 (4): 989-1026. Evaluation in Kenya." World Bank, Development Research Nesvman, John, Laura Ravlings, and Paul Gertler. 1994. "Using Group, Washington, D.C. Randomized Control Designs in Evaluating Social Sector Glewwve, Paul, Margaret Grosh, Hanan Jacoby, and Marlaine Programs in Developing Countries." World Bank Research Lockheed. 1995. "An Eclectic Approach to Estimating the Observer 9 (2): 181-201. Determinants of Achievement in Jamaican Primary Pitt, Mark, Mark Rosenzxveig, and Donna Gibbons. 1993. "The Education:" World Bank Economic Review 9 (2): 231-58. Determinants and Consequences of the Placement of Grosh, Margaret, and Juan Munoz. 1996. A Manualfor Planning and Government Programs in Indonesia." 
World Bank Economnic Implementing the Living Standards M1easurement Study Survey. Review 7 (3): 319-48. Living Standards Measurement Study Working Paper 126. Pollitt, Ernesto. 1990. M1alnutrition and Infection in the Classroom. Washington, D.C.:World Bank. Paris: United Nations Educational, Scientific, and Cultural Hanushek, Eric. 1986. "The Economics of Schooling."Journal of Organization. Economnic Literature 24 (3): 1141-77. Selden, Thomas, and Michael Wasvlenko. 1995. "Measuring the 1992. "The Trade-off between Child Quantity and Distributional Effects of Public Education in Peru." In D. van Quality" Journal of Political Economny 100 (1): 84-117. de Walle and K. Nead, eds., Public Spending and the Poor: Theory 1995. "Interpreting Recent Research on Schooling in and Evidence. Baltimore, M.D.:Johns Hopkins University Press. Developing Countries." World Bank Research Observer 10 (2): Strauss, John, and Duncan Thomas. 1995. "Human Resources." In 227-46. J. Behrman and T.N. Srinivasan, eds., Handbook of Development Hausman, Jerry, and William Taylor. 1981. "Panel Data and Econonmics: Volume 3. Amsterdam: North Holland. Unobservable Individual Effects." Econometrica 49 (6): 1377-98. UNDP (United Nations Development Programme). 1990. Human Heckman, James J., and Jeffrey A. Smith. 1995. "Assessing the Case Development Report. New York: Oxford University Press. for Social Experiments."Journal of Fconomic Perspectives 9 (2): UNNHSCP (United Nations National Household Survey 85-110. Capability Program). 1989. Mleasuring Literacy through Household Ilon, Lvnn. 1992. "School Unit Cost Study: Jamaica." State Surveys. NewYork: United Nations. University of New York, Graduate School of Education, Wagner, Daniel. 1993. Literacy, Culture, and Development: Becoming Buffalo, N.Y Literate in Morocco. NewYork: Cambridge University Press. Jacoby, Hanan. 1993. "Shadow Wages and Peasant Family Labor West, Edwin. 
1996."EducationVouchers in Practice and Principle: Supply: An Econometric Application to the Peruvian Sierra" A World Survey." Human Capital Development Working Paper Review of Econormic Studies 60 (4): 903-21. 64. World Bank, Washington, D.C. 174 CHAPTER 7 EDUCATION World Bank. 1990. World Development Report-Poverty. New York: . 1995. Priorities and Strategiesfor Education. Washington, D.C. Oxford Universitv Press. . 1997. Primary Education in India. Washington, D.C. 175 8 Health Paul J. Gertler, Elaina Rose, and Paul Glewwe Health is a critical factor in the development of any country, for two reasons. First, health status is a key indicator of a population's welfare (Sen 1985). Second, improving the health status of the popu- lation leads to greater economic productivity (Strauss andThomas 1995). In almost every country, governments play an active role in the health sector, and there are sound economic reasons for their doing so. The main reason is that there are often "market failures" in the purchase and provision of health services, including incomplete health insurance markets and imperfect information on the part of the population about the nature of health problems and their treatments. To reduce the adverse effects of these market failures, vided in Volume 3 of this book.) The fourth section policymakers need information on health outcomes, presents annotations to these modules. on household behavior that affects those health out- comes, and on the effects of government policies on Major Health Policy Issues in Developing both outcomes and behavior. In recognition of this Countries policy need, almost all past LSMS surveys have includ- ed a health module, and the data from these modules Every government intervenes in the health sector to have been used to study several important health sec- some extent, but the nature and size of those inter- tor policy issues (for example, the study of willingness ventions varies from country to country. 
to pay for health care by Gertler and van der Gaag 1990). Although data from the LSMS health module have been put to productive use, it is time to re-examine the module to see how well the information it collects meets the current challenges in the health sector.

The purpose of this chapter is to help survey designers develop a health module for LSMS and similar household surveys that will yield the information needed by health sector policymakers. The first section discusses key health policy questions. The second section outlines the data needed to analyze those policy questions. The third section introduces three modules designed to collect these data. (The modules are pro-

Economic theory gives three main reasons for governments to intervene in the health sector. First, health is, in many respects, a public good. This means that an individual's good health benefits not only him or her but also other members of society. For example, when a person with a contagious disease decides to seek treatment, this reduces the probability that others will get that disease. Second, there are many market failures that prevent the effective use of available resources in the health sector. For example, when health insurance markets are incomplete, many families whose members develop a serious illness either cannot afford to pay for treatment or, if they can afford to pay, will incur major financial losses by doing so. Third, equity in health outcomes is an important social goal of many governments, and one way for governments to increase equity is to support health care services. For further discussion of the economic rationale for government intervention in the health sector see de Ferranti (1985), Hammer (1997), and Gertler and Hammer (1997).

These general reasons for government involvement in the health sector raise a large number of more specific questions concerning how governments should be involved. The first step in addressing these questions is to identify specific health policy issues. This section begins with a brief description of how health care is provided in developing countries and then reviews the most important policy issues concerning health in developing countries.

Health Care Provision in Developing Countries

Governments in developing countries provide health care services in many different ways. One common way in which governments are involved in the health sector is through public health programs. These include immunization programs, other programs to prevent the spread of highly infectious diseases, treatment programs for individuals already infected, and public education programs on such subjects as smoking, risky sexual activity, hygiene, nutrition, and preventative health care. The distinguishing characteristic of these programs is that they include activities designed to reach out to people in their communities, homes, and workplaces, as opposed to reaching only those people who visit health care facilities. Policymakers can use information collected in household surveys to make these programs more effective. For example, household survey data can reveal which households participate in or are otherwise affected by public health programs, and thus indicate whether those programs are reaching their intended beneficiaries.

Governments also provide preventative and curative medical care through publicly operated hospitals, clinics, and other health care facilities. The prices for these health care services are often heavily subsidized using general tax revenues. These subsidies are provided to reduce individuals' financial risk, to increase the access that disadvantaged groups such as women, children, the poor, and the elderly have to health care, and to improve the overall health status of the population. In financing and operating health facilities, government officials have to make many decisions about the allocation and mobilization of resources, including the location of different types of facilities (such as hospitals or clinics), which services each type of facility should provide (for example, inpatient care, x-rays, surgery, drugs, family planning, and preventative care), the quality of these services, and the fees to charge for these services. Household surveys can collect much of the data needed to study the consequences of these different kinds of policy decisions. Such data are critical for deciding how to finance and operate public health care facilities. For example, policymakers can use household survey data to assess how health outcomes and utilization patterns would change if user fees were increased or if the quality of health care services were improved at publicly operated facilities.

Most countries' governments allow, and in some cases actively encourage, the private sector to deliver some health care services. In such countries, for the government to develop effective public sector policies, it needs accurate information on the extent to which the private sector complements or substitutes for the public sector in meeting government objectives. For example, it is useful to know the extent to which individuals switch from using the public sector to using the private sector in response to increases in user fees in the public sector, or from the private to the public sector when the quality of public sector care improves. Information gathered in household surveys can be used to analyze these relationships.

Finally, many governments are introducing mandatory social health insurance in the formal wage sector. This social insurance is typically financed through a payroll tax that goes into a fund used to pay for workers' medical care when they are ill. This fund insures workers against the financial cost of their illnesses or injuries and also reduces the government's health budget by requiring workers to contribute to the social insurance fund. However, when people are fully insured, they tend to use more health care than is socially optimal. This phenomenon is known as moral hazard. Knowing the likely extent of moral hazard helps policymakers to set copayments and premiums at a level that ensures the financial viability of the insurance fund. Moral hazard is measured by the price elasticity of demand, which can be estimated using household survey data.

These different ways in which governments intervene in the health sector do not necessarily achieve their intended objectives, and they may even have negative consequences. Indeed, in some developing countries, the overall health care system functions poorly and thus does little to improve the overall health status of the population (World Bank 1993; Peabody and others 1997). The ability of the government to improve its policies depends on the accuracy and timeliness of the information that the government has at its disposal regarding current circumstances in the health sector and the likely impact of different interventions on households' choices and outcomes. The rest of this section reviews in more detail the most important policy issues in the health sector.

Current Health Policy Concerns

Health policy concerns may be grouped by category of disease or by category of people with health problems (for example, children, women, or the elderly). Yet for the purpose of trying to understand the impact of different policies, it is more useful to divide policy issues according to the type of policy. This section divides health policy concerns into eight different categories:
* Assessment of health problems and associated behavior.
* Equity in health status and in access to health services.
* Provision of public health programs and services.
* Pricing policies for health services.
* Maintenance and improvement of the quality of health services.
* Regulation of privately provided health services.
* Health insurance policies.
* The impact of health on other socioeconomic outcomes.
These categories are discussed in detail in the following paragraphs.

ASSESSING HEALTH PROBLEMS AND MEASURING ASSOCIATED BEHAVIOR. The starting point for discussing almost any health policy issue is a good understanding of current health problems and the types of behavior associated with those problems. Thus accurate assessment of the current situation can be thought of as a distinct health policy concern. For the population as a whole and for particular disadvantaged groups (the poor, women, children), policymakers need baseline information on the level and distribution of health problems to identify which groups are most at risk and to provide benchmarks for judging the effectiveness of government interventions. In particular, information is needed on the level, distribution, and causes of both child and adult mortality within the population, on the incidence of specific serious diseases among different demographic and socioeconomic groups, and on the extent to which people are unable to carry out their usual activities because of poor health.

In some countries access to medical care may not be the crucial issue. The key to the population's health status may be household-level factors such as water supply, sanitation, waste disposal and cooking practices, or individual-level behavior such as diet, infant feeding practices, exercise, use of seatbelts, tobacco use, alcohol consumption, and sexual behavior. The relative importance of each of these factors and kinds of behavior varies from country to country. Policymakers in each country need to know the prevalence of different kinds of health-related behavior, the extent to which the population is aware of the impact that these kinds of behavior can have on health, and which groups within the population are engaging in behavior that can adversely affect their health.

EQUITY ISSUES. Many policymakers and governments are concerned not only with the overall incidence of health problems and the provision of health services but also with the distribution of both health problems and the use of health services among the population. In almost every country the poor are more likely to suffer from bad health and less likely to receive health care services. While governments cannot fully compensate poor households for their lack of income, they can implement or adjust health care policies in ways that will help the poor increase their use of such health care services as immunization, prenatal care, and medical treatment for specific illnesses.

The first thing that policymakers need is information on the incidence of specific health problems broken down by income groups in the population. This will give them an idea of the degree of inequality that exists under current policies. The second type of information that policymakers need is who receives medical care under the current system, disaggregated by the type of care provided (for example, preventative, curative, or prenatal), the type of provider (for example, public, private, or traditional), and the level of care received (primary, secondary, or tertiary).1 This information allows policymakers to examine the relationship between differences in the use of health services among subgroups of the population and differences in health outcomes.

Equity involves more than just health status and who gets health care services. It also involves how much individuals and households pay for those services. Thus a third useful piece of information for analyzing equity issues is data on individual and household expenditures on health care. Households make private out-of-pocket expenditures on outpatient visits for preventative purposes (including prenatal care, immunizations, and checkups), on services during childbirth, on outpatient visits for curative purposes, on inpatient stays, on medicine, on travel to and from health care facilities, and on food during their inpatient stays. Policymakers would also like to know whether households increase their expenditures on private health care services in response to a reduction in publicly provided health care services.

A final equity issue is how government subsidies for health care are distributed across different socioeconomic groups. When the fee charged for a given health care service is lower than the cost of providing that service, the government is subsidizing that service. Policymakers need to know who really benefits from these subsidies, and this means they need to have an accurate picture of how public subsidies are currently being allocated. Benefit incidence analysis can be used to calculate the distribution of public subsidies among beneficiaries, giving governments the opportunity to assess whether the current distribution of subsidies is consistent with their policy priorities.

PUBLIC HEALTH PROGRAMS. One action governments can take to improve the health status of the population is to provide public services that directly improve people's health or alter people's behavior in a way that results in an improvement in their health. The best-known example of these services is child immunization campaigns, in which health workers visit homes and schools to immunize children against a variety of serious childhood diseases (such as diphtheria, pertussis, tetanus, measles, and polio). A second example is the opening of a new water treatment facility, which directly improves the health of the local people by increasing the quality of their public water supply. A third example is the provision of public garbage collection services, which results in more sanitary garbage disposal. Finally, there is the example of government policies to reduce air pollution, which should lead to improved health, although there has been little research in this area. For each of these examples, policymakers need to know how changing the availability of the relevant government programs and services affects individuals' health and health-related behavior.

Another way in which the government can improve the health status of the general population is through public education campaigns that provide the public with information on the consequences of individual and household health-related behavior. Publicity campaigns can inform individuals about the hazards of activities such as smoking, alcohol consumption, and unsafe sex practices, as well as the benefits of activities such as breastfeeding, seat belt use, exercise, and good nutrition. Because there may be a long time lag between the introduction of a campaign and any changes in the population's health, it is difficult to find direct evidence that public education campaigns lead to improvements in the population's health. It may be more useful to look at the effects of these campaigns on behaviors (for example, in the case of an anti-smoking campaign, by examining whether there has been a reduction in the number of smokers) or on people's knowledge of the risks associated with different types of behavior.

PRICING POLICIES FOR HEALTH CARE SERVICES. Many governments charge fees for publicly provided health care services to help finance the provision of these services. However, as the fee charged for any service increases, the demand for that service tends to decrease. There may also be switching from one kind of service to another; for example, an increase in the price of publicly provided services may lead people to switch from public services to private services. Changes in the use of health care services can have serious health implications, so policymakers need to know the net effect of a change in user fees on households' use of health care services. They would also like to know the effect of changing user fees on government revenues. This depends on the impact of price changes on the utilization of health care services, that is, the price elasticity of demand for these services.2 More generally, policymakers would like to know the impact of changing the user fee charged for a particular health care service on the utilization of that and other health care services, on households' health care expenditures, on which service providers they use, and on government revenues. Ideally, policymakers would like to have this information for each type of health care service provided by the government.

Another factor that may explain why certain groups have low utilization rates is that they may have to travel farther than other groups to reach the nearest health care facility. Distance can be thought of as an additional "price" that households pay to use health facilities. The location of facilities can affect households' choices of which provider to use, the extent to which households use any health care services, and the costs that households must incur to obtain treatment. Thus policymakers need to know how the location of public health facilities affects the utilization of those facilities, household expenditures on health care, the service providers that households choose, and government revenues from health care fees.

A final way in which governments can use pricing policies to influence health is by affecting health-related behavior. For example, governments can impose taxes on the purchase of goods such as cigarettes and alcohol, which is likely to reduce the consumption of these goods. To ensure that such a policy is having the desired effect, a government needs to know how the imposition of (or change in) such taxes affects the consumption of these goods as well as how it affects household expenditure patterns and government revenues.

QUALITY OF HEALTH CARE SERVICES. Another determinant of the utilization of health care services is the quality of the care provided. If certain groups within the population rarely use health care services, it may be because the quality of these services is low. In general, households will increase their use of services if the quality of the services increases while the price remains unchanged. This implies that household expenditures on health care services (and, consequently, government revenues from user fees) will increase in response to an improvement in the quality of health care facilities. Thus the quality of the care provided by a facility has both a direct effect on health and an indirect effect through changing the utilization of services. Policymakers need to know how changing the quality of care affects utilization patterns, households' choice of provider, household expenditures on health care, government revenues from user fees, and, ultimately, health status.

A particularly important issue regarding the quality of health care services is whether prices and quality can be simultaneously increased in such a way that the increased revenues from the higher prices will pay for the increase in the quality of the services. In some cases the increase in utilization rates due to improved quality may outweigh the reduction in utilization rates due to the increased prices. If this is the case, utilization will increase, or at least will not decrease, thus improving the health status of the population. Whether this fortuitous result can actually be realized has been examined in several recent studies (such as Litvak and Bodart 1993 and Peabody, Gertler, and Leibowitz 1998). Further research is needed to see how likely it is, and under what conditions, that price increases can finance improvements in quality without reducing health care utilization.

POLICIES REGARDING PRIVATELY PROVIDED HEALTH CARE SERVICES. There are several reasons why private providers should be allowed, and perhaps even encouraged, to provide some types of health services. The main reason is that the private sector can often provide goods and services more efficiently than the government can. Yet allowing the private sector to provide certain types of health care services does not necessarily imply that the government has no role to play. For example, the government can play an important role in informing the general public about the efficacy of particular treatments for specific diseases and about the reputations of private providers.

Perhaps the most important role the government can play is to regulate private providers so the public is assured that private services provide a minimum level of competence and quality. This can be done in several ways. First, governments can implement a system to train and license health care providers. Second, governments can monitor the quality of health care facilities on a regular basis, focusing on both the quality of the facilities (such as whether they have the proper equipment and whether the environment in which they operate is hygienic) and the process of care (such as whether proper prenatal care practices are being followed). A third area in which the government can be involved is in the regulation of pharmaceuticals; governments must decide which drugs should be banned, which should be available only by prescription (and who should be allowed to prescribe them), and which can be sold over the counter. Fourth, the government can regulate the prices charged by private providers, although some economists advise against regulation of prices in the private sector. A fifth way in which the government can usefully intervene is in the market for private insurance, where it may want to regulate certain practices that can be detrimental to consumers, such as the refusal to provide coverage to particular individuals or groups.

Another reason why governments need information on private health care facilities is to know the extent to which individuals change from the public to the private sector in response to policy changes at public health facilities. A related issue is the extent to which private providers adjust the price and quality of their services in response to changes in the price and quality of public services. Performing such analysis requires that information be collected from both private and public providers using a health facility questionnaire.

HEALTH INSURANCE AND EMPLOYER-PROVIDED HEALTH BENEFITS. Health insurance is a program or contract in which part or all of the cost to an individual of obtaining medical treatment is paid for by the insurer. This insurance may be provided by the person's employer or by another source such as the government or private providers. Health benefits are provided by employers to their employees and thus are not available to self-employed workers. Such benefits consist of sick pay, sick leave, and maternity pay and leave. Some employers provide health care at the place of employment, but this is best understood as a type of health insurance rather than a health benefit.

Insurance coverage and employer-provided health benefits are less common in developing countries than in industrial countries, but they do exist in many low- and middle-income countries. For example, 14 percent of Indonesians working in the wage sector are covered by health insurance (World Bank 1993). Also, 9 percent of all Jamaicans and about 25 percent of all Brazilians have health insurance (Gertler and Sturm 1997; Lewis and Medici 1995). In developing countries as a whole, formal health insurance is growing and will become increasingly important as these countries complete the demographic transition whereby chronic diseases (which are expensive to treat) become a bigger problem than infectious diseases (which can be prevented and treated at low cost). Similarly, as the economies of low- and middle-income countries develop, there is likely to be an increase in the number of both public and private employers that offer health benefits to their workers. Thus there is an increasing need for governments in developing countries to think about the roles played by insurance and employer-provided health benefits in determining health outcomes.

The increase in insurance coverage will bring about several changes in the health sector in developing countries. First, this increase is likely to raise utilization rates because individuals covered by health insurance pay less for treatment than do individuals who are not covered. While this outcome may seem unambiguously desirable, it is possible that the moral hazard problem associated with insurance will lead to overuse of health care services. Second, insurance coverage will affect household expenditures on health care by increasing the demand for health care. Third, individuals with health insurance may reduce their rate of (precautionary) savings because they no longer have to save to be able to pay for large, unexpected medical costs.

Given the increased availability of insurance, and its likely impact on health behavior and outcomes, policymakers need to know which individuals and households are covered by health insurance, whether through public, private, or employer-provided plans. Detailed data can be used by researchers to analyze the impact of health insurance schemes on the use of health services. In particular, researchers need information on the benefits provided by each insurance plan, the services covered, and the copayments, deductibles, and benefit caps associated with each plan. This will enable them to study how health insurance coverage affects health care utilization, household spending on health care, government revenues, and other household behavior such as consumption expenditures and savings. Over time, the increased availability of insurance may mean that government subsidies for health care can be reduced. However, these subsidies should not be eliminated altogether because there are other reasons to subsidize health care, such as the need to subsidize public goods and the desire to promote more equitable health outcomes.

HEALTH STATUS AND OTHER SOCIOECONOMIC OUTCOMES. As mentioned in the introduction, an individual's health status can affect his or her economic productivity (Deolalikar 1988; Strauss 1986; Strauss and Thomas 1995; Dow and others 1997). It can also affect education outcomes (Behrman and Lavy 1992; Glewwe and Jacoby 1995; Glewwe, Jacoby, and King Forthcoming) and consumption and savings decisions (Kochar 1995, 1997; Gertler and Gruber 1997). When governments decide how many resources to devote to health, they should account not only for the direct benefit of increasing the population's health status but also for the impact of health on other socioeconomic outcomes. Thus governments need to know how the health status of the population affects worker productivity, education outcomes, and consumption and savings. LSMS surveys are well suited for addressing these issues because they collect data on all of these topics, as seen in the other chapters of this book. Moreover, much of the data collected in LSMS surveys, such as information on worker productivity and education outcomes, is collected at the individual level (as opposed to the household level), which allows for a more disaggregated analysis of the relationship between health and other socioeconomic outcomes.

Data Needs for Policy Analysis

This section describes the data that policymakers and researchers need to address the health policy issues presented in the previous section. The first subsection discusses how to assess health status using data from a household questionnaire. A thorough discussion is necessary because health status is complex and difficult to measure. Data on health status can be used not only for assessing current health problems but also for analyzing the impact of health status on other socioeconomic outcomes. The second subsection discusses how to use a household questionnaire to collect data on health-related behavior, the utilization of health facilities, health expenditures, insurance status, and access to services. Household data on health status, the utilization of health facilities, and health expenditures are essential for examining equity issues.

The third subsection turns to the community, price, and facility questionnaires, examining what data are needed to analyze issues concerning pricing policies, quality of health services, public health programs, and regulation of private health care services. The fourth subsection provides a short discussion of how to use the data collected for policy analysis. The fifth subsection briefly examines two important sampling issues. The sixth and final subsection links the policy issues discussed in the first section of this chapter to the short, standard, and expanded versions of the draft health module presented in Volume 3. Specifically, the sixth subsection presents 37 policy questions (drawn from the discussion in the first subsection) and indicates which parts of the health module, as well as which other sources of relevant data, yield the information needed to answer each of these questions.

This section, and the rest of this chapter, assumes that the survey designers have already decided to collect health data as part of an LSMS-type multitopic household survey. An alternative might be to design a survey devoted exclusively to health issues. Box 8.1 discusses the advantages and disadvantages of each option.

Box 8.1 Collecting Health Data in a Multitopic Survey or in a Single-Topic Survey

The first decision that survey designers need to make is whether to collect health data as part of a multitopic household survey such as an LSMS survey or as part of a survey devoted almost exclusively to health issues, such as a Demographic and Health Survey. (For information on these surveys see the Demographic and Health Survey website at http://www.macroint.com/dhs.) The two main advantages of specialized health surveys are the detailed information that they can collect on health status and their large sample sizes, which will yield data that can be used to study a wide range of health problems, including rare events such as maternal mortality. However, health surveys have the serious disadvantage that they rarely collect the basic socioeconomic information needed to describe how health varies according to household income. This is primarily due to two constraints. First, collecting household income information adds to the cost of the survey. Second, collecting such information tends to put an unreasonable burden on households' time. Facility-based health surveys have the additional disadvantage that they cannot obtain information on the health status of the general population because they can collect information only on individuals who visit health facilities, and not on individuals who never visit any health facility.

Multitopic household surveys do not suffer from these disadvantages because they collect data not only on health but also on many other topics, such as employment and earnings, education, and income and expenditures. Yet multitopic surveys have two serious disadvantages. First, the information that they collect on health status is limited, and much of what they do collect is reported by the respondents themselves rather than recorded by a trained observer. This reliance on self-reporting by respondents is cause for concern because such data tend to be less reliable. Second, for reasons explained in Chapter 1, multitopic surveys such as LSMS surveys tend to have samples that are usually no more than 5,000 households, which is too small to yield data for calculating most disease-specific measures of health, such as levels of coronary heart disease and cancer. (An important exception to this point is diarrheal disease in children; this is well defined and highly prevalent in most developing countries, so LSMS-type surveys should collect data on it.) On the other hand, a small multitopic household survey is a good vehicle for collecting summary measures of health status, such as anthropometric measurements of children, body mass index of adults, and physical and cognitive abilities of the elderly. These summary measures provide good benchmarks for measuring changes in overall health status over time. They are also useful for analyzing how the financing and accessibility of health care services affect health outcomes. Finally, they can be used as summary measures of health status in analyses of the impact of health on labor market productivity, educational performance, and other socioeconomic outcomes.

Assessing Health Status Using the Household Questionnaire

Health status can be very difficult to measure in the context of a household survey (McDowell and Newell 1996; Stewart and Ware 1992). The ideal survey would employ doctors, nurses, or other health professionals to give each household member a complete health examination, including laboratory tests. However, the expense and logistical complications of doing this would be very high. Thus only household surveys that focus almost exclusively on health issues, as opposed to multitopic surveys such as LSMS surveys, can devote the resources required to collect such complete health data. Because of the costs and complications involved, most developing countries collect very little data on the health status of the general population; what these countries do collect tends to be limited to mortality data (which may or may not include information on cause of death) and, in some cases, anthropometric data. In most cases multitopic surveys must organize a new data collection exercise to gather the data needed to analyze the health policy questions discussed in the first section of this chapter.

There are two issues to consider when measuring health status. First, health is multidimensional. Nutritional status, morbidity, physical functioning, and mental functioning (mental health and cognitive ability) reflect different aspects of a person's health. These different dimensions of health may respond differently to policy changes and may have different effects on other important outcomes such as an individual's earnings and productivity and even his or her sense of well-being. Second, because people's perceptions of their own health are likely to be related to their education, occupation, and household income, self-reported information on health obtained from household surveys must be interpreted with caution. In most previous LSMS surveys, each individual's health status was assessed by asking whether he or she was sick or injured at any time during the previous four weeks and, if so, whether his or her usual activities were limited by this illness or injury. Some LSMS surveys also asked respondents to self-report symptoms and to diagnose specific illnesses for themselves and their children. In contrast, and as explained more fully below, a better approach is to collect several objective measures of health status to avoid relying entirely on subjective (self-reported) data.

Self-reported measures of health status include general health status, limitations in daily activities, current morbidity, activities of daily living, and mental and emotional health. These will now be examined in turn. This will be followed by a brief discussion of several objective measures of health, including anthropometric status, mortality, directly observed physical functioning, clinical diagnosis of illness, information from medical tests, measures of cognitive functioning, and observed activities of daily living.

SELF-REPORTED GENERAL HEALTH STATUS. Self-reported general health status is an index of overall health based on the respondent's answer to the question, "In general, how is your health at this time?" The possible answers to this question are excellent, very good, good, fair, poor, and very poor. This measure of health status is correlated with future mortality, even after controlling for many other variables (see Idler and Benyamini 1997 and Ferraro and Farmer 1999). Even so, there are potential biases because respondents' answers tend to depend on their subjective standards of what constitutes "healthy" and on the extent to which they have had contact with the health system. For example, Dow and others (1997) reported that serious bias was revealed when this measure was used in models of labor supply.

SELF-REPORTED LIMITATIONS IN DAILY ACTIVITIES. Questions on limitations in daily activities include whether the respondent was able to perform his or her usual activities, the number of days during which his or her normal activities were limited, and whether he or she was confined to bed because of illness. The main problem with these kinds of questions is that the responses to them will depend not only on the person's health but also on what his or her normal activities are and on how easily he or she is able to curtail those activities. The direction of the bias is uncertain. For example, if individuals who earn low wages tend to have more physically strenuous jobs, they are more likely to be constrained by a given illness than higher wage earners. On the other hand, since individuals who earn low wages may be less able to "afford" to rest when they are ill relative to those who earn higher wages, they may be less likely to limit their activities because of illness.

SELF-REPORTED MORBIDITY. Many household surveys, including several LSMS surveys, have asked respondents whether they are currently ill. For respondents who report being ill, some of the surveys have also asked what illness they were suffering from. Unfortunately, such self-reported information on morbidity may be biased by variation in respondents' perceptions and by differences in their knowledge of specific illnesses. It has often been found that the incidence of reported adult sickness increases with income and education. (Two examples using LSMS data are Schultz and Tansel 1997 and Dow 1996.) Because of this problem, self-reported data on symptoms and diagnoses for adults are of limited usefulness in assessing health status.

A similar problem arises with the way in which mothers report their children's illnesses. Using data from a health survey done in Peru, Sindelar and Thomas (1991) found that the probability of a mother reporting that her child has a respiratory illness increases with her education. On the other hand, the authors also found that the reporting of children's diarrhea declines with the mother's education, which suggests that reported diarrheal disease is more accurate than reported respiratory illness. Thus data from mothers on the incidence of diarrhea among their children may be reliable, but the value of asking mothers about their children's other illnesses or symptoms is doubtful.

SELF-REPORTED ACTIVITIES OF DAILY LIVING.

two categories. Intermediate activities of daily living consist of the following abilities: carrying a heavy load for 20 meters; sweeping the floor or yard; walking for 5 kilometers; drawing water from a well; and bending, kneeling, or stooping. Basic activities of daily living are the ability to bathe oneself, feed oneself, put on clothes unaided, stand up from a sitting position in a chair, go to the toilet unaided, and rise from sitting on the floor.

Activities of daily living are less subjective than other self-reported measures of health because they are well-defined, are not expressed in terms of respondents' normal activities, and do not require respondents to provide general opinions about their own health. Initially developed to study levels of disability among the elderly, these measures are used increasingly to study the health status of all adults. These measures of physical functioning have been tested extensively for reliability (consistency across tests and among different interviewers) and validity (consistency among individual assessments of different skills). In the United States, Jamaica, and Southeast Asia, they have been found to be reliable and valid self-assessments with a high degree of internal consistency (Andrews and others 1986; Guralnik and others 1989; Ju and Jones 1989; Strauss and others 1993; Ware, Davies-Avery, and Brook 1980). Moreover, they are the key measures of health status in the new U.S. Health and Retirement Survey (Wallace and Herzog 1995).

Activities of daily living have been used as dependent variables in many analyses of adult health and as explanatory variables in analyses of the consequences of ill-health. They are often used in studies of labor supply in the United States (for example, Bound and others 1991; Bound, Schoenbaum, and Waidman 1995; and Stern 1989) and have recently been used in Indonesia (Dow and others 1997). Gertler and Gruber (1997) used changes in activities of daily living to investigate whether families are able to insure their consumption against major illnesses. If activities of daily living are gathered using carefully worded questions, they will not be subject to the same reporting biases that are common in data on self-reported illness. They also have the advantage of being relatively easy to collect.
Thus self-reported activities of daily living Activities of daily living are derived from a series of are an important measure of adult health status. In fact, questions regarding respondents' physical ability to a number of studies in both developed and developing carry out a number of activities.They are divided into countries have shown that, unlike with self-reported 185 PAUL J. GERTLER, ELAINA ROSE, AND PAUL GLEWWE morbidity, well-educated individuals in high brackets can be used to detect maternal deaths associated with report fewer problems with activities of daily living childbirth (see Graham, Brass, and Snow 1989). than do poorer, less educated individuals (Strauss and In contrast, data on child mortality data can be others 1993; Gertler and Zeitlin 1996; Kington and collected by obtaining fertility histories from mothers, Smith 1997). which has been done in many previous LSMS surveys (see Chapter 15 for several examples). Because infant SELF-REPORTED MEASURES OF MENTAL OR EMOTIONAL and child mortality rates are much higher than adult HEALTH. Self-reported measures of mental or emotion- mortality rates, the sample sizes needed are not as al health can also be obtained from household surveys large. Also, children rarely live alone, so panel data are by asking respondents about how often they have not necessary.The use of these data to study infant and experienced insomnia, fatigue, moodiness, impulsive child mortality is discussed in detail in Chapter 15, and anger, malaise, or sadness. While these measures have thus will not be considered further in this chapter. been validated in the United States and other devel- oped countries, little evaluation has been done of their DIRECTLY OBSERVED ACTIVITIES OF DAILY LIVING. One appropriateness for developing countries. 
More expe- way to improve upon self-reported activities of daily rience is needed on the feasibility and usefulness of col- living is to have a survey interviewer watch household lecting such data in developing countries. members perform several relatively simple activities of daily living, such as standing in various positions, sitting ANTHROPOMETRIC MEASURES. Turn now to objective up from a chair, and wvalking a short distance. Direct measures of health status. For adults, body mass index observation of activities of daily living often yields can be computed from data on weight and height, more accurate data than recording respondents' reports which are relatively easy to collect. Among adults, of activities of daily living. Directly observed activity of body mass index is associated with malnutritioni and daily living imieasures have been used extensively in poor health. For children, nutritional status can be national health surveys in the United States and were measured by standardized weight-for-height and recently used in the Matlab Health and Socioeconomic height-for-age. This implies that weight and height Survey in Bangladesh (Rand Corporation 1998). data should be obtained for all household members in a multitopic household survey.Arm circumference and CLINICAL DIAGNOSIS. Clinical diagnosis can be used to birth weight are additional indicators of health status. measure health status in a way that avoids many of the Chapter 10 discusses anthropometric measures in problems associated with self-reported illness. great detail, so there is no need for further discussion However, measurements obtained from clinical diag- in this chapter. noses involve substantial data collection problems. A sample of people who visit a health facility in a given ADULT AND CMHLD MORTALITY. 
Although mortality is day will be too small to be useful, which means that it invariably a concern of policymakers, data on adult is necessary to rely on health providers' memories of mortality is difficult to collect. In general, at least two events for additional information. surveys are needed to accurately measure adult mor- If the survey budget can bear the substantial costs tality. The problem with relying on a single survey, involved, some relatively simple medical tests can be which would require retrospective questions on adult conducted. For example, individuals' blood pressure household members who died in recent years, is that and temperature can be measured by anyone who has many adults live alone; deaths of adults who lived been given a small amount of training, while lung alone cannot be obtained retrospectively because the capacity, which reflects people's long-term health sta- households in which they lived no longer exist. Also, tus, can be evaluated relatively easily by nonmedical adult mortality is a relatively infrequent event in personnel who have been trained to use peak flow households, so the samples needed to measure this meters. Finally, finger-prick blood tests can detect ane- precisely are much larger than those typically used in mia and micronutrient deficiencies, and HIV can be LSMS surveys. One exception is the use of sisterhood detected from saliva samples; both of these tests can be methods, which involve asking women whether any of done by nonmedical personnel after they have their sisters have died in recent years. These methods received a modest amount of training. 186 CHAPTER 8 HEALTH An important point to consider when contem- respondents can be asked to give retrospective as well plating the collection of such clinical data is the sensi- as current information about their health status. For tive nature of medical tests, particularly HIV tests. 
example, as part of the self-reported activities of daily Clinical data should not be collected until clearance living, respondents could be asked to say how long they has been obtained from government agencies, particu- have been unable to perform the activity in question. larly any agencies that deal with research on human However, retrospective data can be very inaccurate; a subjects. Some organizations require that people who second, more accurate method is to collect data on a test positive for particular diseases be treated. Testing person's health status at two different points in time. for HIV is a particularly sensitive issue; some organi- This implies administering the same survey to the same zations mandate that counseling be given to all people households at two or more points in time-in other who receive HIV test results, even individuals who test words, collecting panel data. See Chapter 23 for a gen- negative for HIV eral discussion of when to collect panel data as a part of an LSMS or similar multitopic survey. COGNITIVE TESTS. Simple tests can be administered that measure the cognitive functioning of adults and SuMMARY. When household surveys are used to meas- the cognitive development of children. (See Chapter 7 ure health status, survey designers should bear in mind on education for a brief discussion of cognitive tests.) the following points. First, health status is difficult to Cognitive functioning measures for older adults can measure in household surveys, especially in multitopic also be measured using tests of memory or tests of the surveys such as LSMS surveys, because it is usually not ability to perform simple calculations or other mental possible to use health professionals to give respondents tasks. These tests can be used to capture cognitive thorough health examinations. Second, self-reported problems associated with aging. assessments of general health status can provide useful information, but they are also subject to bias. 
Third, PRoxY RESPONSE. In some past LSMS surveys (for self-reported assessments of current morbidity are not example, South Africa and Bulgaria), the health mod- reliable, with the exception of reports of diarrhea. ules asked one individual in each household to answer Fourth, self-reported activities of daily living are poten- questions about the health of all household members. tially very useful indicators of adult health status. Fifth, Unfortunately, such data are likely to be very inaccu- anthropometric measurements are reliable indicators of rate. In particular, proxy responses can generate meas- the health status of both children and adults, and have urement error in the data, so it is extremely important been used successfully in many developing countries. that each adult in the household respond to questions Sixth, if adequate resources are available it may be pos- about his or her own health status, utilization of health sible to perform a few simple medical tests; however, services, and health-related behavior. The only excep- this is somewhat experimental and raises ethical issues tion is that mothers should respond to questions about concerning the use of human beings for research pur- the health of any of their children who are too young poses. Finally, survey designers should ensure that each to answer for themselves. The age at which older chil- individual provides the interviewer with information dren can answer for themselves will vary from coun- about his or her own health; if proxy respondents are try to country, but a general range is somewhere used, the results are likely to be unreliable. between 10 and 15 years. Using the Household Questionnaire to Measure Heolth- CHANGES IN HEALTH STATUS. It is sometimes desirable Related Behavior, the Use of Facilities, Health Expenditures, to measure changes in health status rather than the level and Insurance Status of health status at only one point in time. 
A person's As explained in the first section, collecting informa- health status at a given point in time reflects past events tion on health status is only the starting point for ana- that have occurred over an individual's entire lifetime, lyzing health issues in developing countries. Data are while a change in a person's health status between two also needed on health-related behavior, the use of points in time is primarily due to events that occurred health care services, health care expenditures, and between those two points in time. There are two pos- insurance coverage. This subsection discusses how to sible ways to measure changes in health status. First, obtain these kinds of information. 187 PAUL J. GERTLER, ELAINA ROSE, AND PAUL GLEWWE HEALTH-RELATED BEHAVIOR. A person's health status 12 on housing and Chapter 14 on environmental is closely related to his or her health behavior. The issues. types of health behavior of most concern to policy- Other types of health-related behavior include the makers, the health behaviors of the population, and use of preventive medical care, diet (including infant the best ways to mneasure health behaviors all vary feeding practices), exercise, and the use of seatbelts. from country to country.Yet in most countries, two Information on preventive care should be collected important kinds of health-related behavior are smok- together with data on utilization of health care facili- ing and the consumption of alcoholic beverages.A rel- ties. Data on infant feeding should be collected in the atively simple assessment of each can be done by ask- fertility module (discussed in Chapter 15). Diet infor- ing respondents whether they have ever smoked or mation for other household members is very difficult consumed alcoholic beverages, how old they were to measure in household surveys and thus probably when they began, whether they are still doing so should not be collected in most LSMS surveys. 
For today, and how much they typically smoke or drink. further discussion of this see Chapter 5, which dis- Such data provide not only current information but cusses the collection of food consumption data. also retrospective information on changes in these Finally, it is relatively simple to ask questions on the kinds of behavior over time, and on differences in use of seatbelts (and the use of helmets among motor- behavior across population cohorts for a given age cyclists), on work-related physical activity, and on per- range. sonal exercise. An example of this is the LSMS survey Another type of health-related behavior that con- done in Brazil in 1996, which had seven questions on cerns policymakers is sexual practices. Because of the personal exercise. sensitive nature of this topic, collecting data on it is difficult. In particular, it is likely that respondents will UTILIZATION AND EXPENDITURES. Most previous LSMS underreport any risky behavior they engage in, such as surveys have collected incomplete data on utilization failure to practice safe sex.The best way to collect data rates and expenditures. Individuals have typically been on sexual practices will vary across countries and asked whether, in the month preceding the survey, they across cultures. In many countries it may be impossi- have seen a health care provider to be treated for an ill- ble to collect such data as part of an LSMS-type ness and, if so, how much they spent on that visit. household survey. Therefore, the draft health module However, individuals, particularly individuals with seri- does not include questions on sexual practices. For an ous illnesses, may make several visits to receive health example of surveys that do collect such data see the care over one month and may obtain treatment from standard questionnaire used by the Demographic and more than one provider. 
Since most previous LSMS Health Surveys (Macro International 1995a), which surveys have not been designed to capture this com- collects a small amount of data on sexual practices. prehensive information, the data from such surveys Some Demographic and Health Surveys, including a probably underestimate both utilization rates and 1994 survey in Tanzania (Macro International 1995b), expenditures. This is why most studies of the demand have collected more detailed data on sexual practices. for health care that use LSMS data are studies of Demographic and Health Surveys collecting more provider choice rather than studies of expenditures or detailed data on sexual practices have also been done general studies of the demand for health care services. in Brazil, Burkina Faso, the Central African Republic, Data on all health care consultations and expendi- C6te d'Ivoire, Haiti, Kenya, Uganda, and Zimbabwe. tures are required for many kinds of policy analysis. Household-level behavior that affects the epi- Analysts need data on all visits to medical facilities to demiological environment of the household includes see which socioeconomic groups use which facilities, cooking practices (the type of fuel used, the extent of and they need data on all expenditures to obtain an ventilation for the smoke, and whether there is a sep- accurate measurement of the costs to households of arate room for cooking), waste disposal, sanitation obtaining health care. Ideally, data should be collected practices, and the sources of water used by the house- by the level of care (primary, secondary, or tertiary), by hold. LSMS surveys typically ask for this kind of the type of provider (public, private, or traditional), by information in the housing module. For a detailed the purpose of the visit (preventative, curative, or pre- description of how to collect such data, see Chapter natal care), and by the kind of services received. 
To 188 CHAPTER 8 HEALTH avoid problems of recall bias, questions on any outpa- asked whether they are covered by some sort of insur- tient care received should be limited to the previous ance or health benefits. However, in many countries 30 days. Since inpatient care is less common and easi- that have health insurance and employer health bene- er to remember, the recall period for questions about fits, the nature of both varies widely according to the inpatient care can be the previous 12 months. type of insurance policy or the type of employer.Thus Expenditures should include not only fees but also any analysts also need data on the benefit structure to be other expenses incurred by the respondents, such as able to measure how health insurance or employer purchases of medicine and transportation costs. It is health benefits affect an individual's health status, also useful to ask about the amount of time spent in health behavior, and utilization of health care services. obtaining the care (including travel time and any time For example, the fact that an individual is covered by spent waiting to see the provider), since this is anoth- health insurance is unlikely to cause him or her to er cost of using these services. Finally, for households make greater use of outpatient services if the insurance or individuals that have health insurance, it is impor- policy covers only inpatient care. Thus, for health tant to distinguish between charges paid for or reim- insurance, data should be collected on the services bursed by the insurance and charges paid for by the covered by the insurance, including whether the poli- respondent. cy covers private as well as publicly provided services. 
As mentioned in the first section of this chapter, People may also base their health care decisions on one of the most important uses of data on the utiliza- their policy's structure of deductibles, co-insurance tion of health care services is to study the incidence of (that is, required copayments), and benefit ceilings. government health subsidies.This is done in two steps. Information is also needed on which household mem- The first step is to measure the unit subsidy provided bers are covered by an individual's health insurance; by the government for each type of health care serv- some policies cover only the individual while others ice (such as inpatient care, lab tests, primary care, pre- cover all (or some) of the members of the individual's natal care, and immunizations). The unit subsidy is the family. Two studies that examine the impact of insur- incremental cost of providing the service minus the ance are Brook and others (1983) and Gertler and fee charged. Estimates of the incremental cost can be Sturm (1997). Similarly, for employer-provided health obtained from cost function estimates based on facili- benefits, data are needed on the number of sick days ty data, either obtained from the ministry of health or allowed, the wages received for those sick days, and collected in a survey of health care providers.3 A recent analogous information for maternity benefits.4 example of such estimates, done in the Philippines, is There is another aspect of insurance and employ- given byAlba (1998). er health benefits that policymakers should consider. The second step in this benefit incidence analysis Doing a full analysis of benefits requires complete is to calculate the amount of the overall subsidy information on the financial risk associated with bouts received by different population groups. This calcula- of illness. 
This in turn requires data on all of the tion can be done by multiplying the unit subsidy by expenditures made by an individual for treatment of a the utilization rate of that service by each group in the specific episode of illness-in other words, the expens- population. Thus benefit incidence analysis enables es associated with all consultations with health care policymakers to know how public subsidies for specif- providers from the onset of the illness until the illness ic types of health care services are distributed among is cured. This is a complex task because expenditures the general population by geographical location, on health care can vary greatly during the course of socioeconomic status, education, age, and sex. treatment. Also, in some cases the utilization of health care for a specific illness may have started before, or INSURANCE AND EMPLOYEE HEALTH BENEFITS. Analysts may continue after, the 30-day recall period recom- can use data on insurance and employer-provided mended for LSMS-type surveys, especially in the case health benefits to explore how these benefits affect of individuals who have severe illnesses. Unfortunately, health status, health behavior, and the utilization of it is practically impossible to collect this type of infor- health care services. The structure of health insurance mation by asking these respondents to remember and of employer-provided health benefits differs from details of their treatment over a long period of time, so country to country. At minimum, respondents must be it is probably impossible to make a full calculation of 189 PAULJ. GERTLER, ELAINA ROSE,AND PAUL GLEWWE the benefits of insurance using LSMS-type surveys.5 income can be approximated by averaging income For an idea of the difficulty involved in doing such a over several years, and in some cases permanent study, see Gilleskie (1998). income can be proxied by per capita consumption. See Chapter 17 for a discussion of the issues involved in ACCESS. 
To understand the determinants of the use of measuring income and Chapter 5 for a discussion of health care services, analysts need data on the extent to household consumption data, which can be used as an which the population has access to health care facili- indicator of permanent income. Chapter 7 discusses ties. This can be done by asking households about the collection of data on education outcomes. Finally, nearby facilities and perhaps about more distant facil- data on prices are needed to draw causal inferences ities that they may use regularly, and then gathering about how all of these characteristics affect health out- data on those facilities using a facility questionnaire comes, as explained in Appendix 8.1. Price data are that is part of the overall survey or using an existing collected in the price questionnaire, which is discussed source of data unrelated to the survey (such as data in Chapter 13. collected by the ministry of health).The main problem Finally, a word of caution. Using household survey with this method is that it omits any nearby providers data to estimate causal relationships must be done with whose existence is unknown to the sampled house- great care; many pitfalls and complications are holds.This problem leads to the more general issue of involved. In general, simple techniques such as ordi- sampling health care providers (discussed further nary least squares regressions are likely to produce below in a separate subsection). biased results.This is discussed further below. DATA FROM OTHER MODULES. 
It is important to stress Prices, Quality, and Public Health Programs:The Facility, that doing a causal analysis of how health status affects Price, and Community Questionnaires other socioeconomic outcomes (as well as how non- Policy issues related to the price of health services, the health factors affect health status) requires not only the quality of health services, and public health programs health status variables discussed above but also data on can be analyzed by combining data on health status, other household characteristics and behavior. health behavior, utilization of health services, and Consider first the impact of health status on other expenditures on health care (all of which are collected socioeconomic outcomes. Common outcomes of in the household questionnaire) with data on local interest to policymakers include income (especially health facilities and programs. The need for data on labor income), education, labor force participation, health facilities and programs moves the discussion migration, and fertility. Data on these outcomes are from collection of household data to collection of data collected in the questionnaire modules discussed in that characterize the community in which the house- the analogous chapters of this book. For example, data holds live. Although this type of information could be on wages, which indicate worker productivity, are collected at the household level, doing so would be found in all versions of the employment module intro- inefficient (because much of the information does not duced in Chapter 9 (and presented in Volume 3). vary across households living in the same community) Turn now to the effect of other characteristics on and probably inaccurate (because many households health outcomes. 
Possible causal factors are housing may not be familiar with some of the information characteristics (such as source of water, type of toilet, being sought) ,6 A better approach is to collect this type and method of garbage disposal), the education levels of information at the community level. Data on the of adults, labor force outcomes, characteristics of the quality and prices of health services can be obtained local environment, household income, fertility history, from a sample of local health facilities, including both and perhaps access to credit.Which income measure is private and government-operated facilities. Infor- most useful will depend on the particular issue being mation on public health programs can be collected in analyzed. In some cases the best measure xwill be some the community questionnaire. Finally, the prices of any type of permanent income that is less subject to year- medicines that can be purchased from pharmacies or to-year fluctuations than annual income. In other cases other vendors can be collected using the price ques- it may be more desirable to measure only nonlabor tionnaire.This subsection discusses each of these types income. If several years of data are available, permanent of information in turn. 190 CHAPTER 8 HEALTH PRICES. In order for policymakers to understand the An alternative approach, of which Goldman and financial pressures of health care on households, they Grossman (1978) offer an early example, is to use must have information on the full range of costs that hedonic pricing models to control for differences in households face when using health care services. The quality from provider to provider that are reflected in most obvious costs are payments to health care the prices that the providers charge. This approach providers, which are determined by the prices charged involves estimating how a variety of dimensions of by health care facilities. 
This information can be quality affect the prices charged by a sample of obtained from health facility questionnaires, yet sever- providers for a range of different services. The differ- al issues arise when calculating the price of health care ence between the actual price and the predicted price services. First, visits to health care facilities vary wide- from this regression is the quality-adjusted price. If one ly in terms of what takes place during those visits. For assumes that the predicted price from the regression example, a single visit may involve both diagnostic and reflects all differences in quality, then the difference treatment services. between that price and the actual price reflects differ- One common approach to measuring the prices of ences in price that are not due to quality differences. health care services is to calculate the mean or median In general it is possible to estimate the impact of expenditures per visit, but this approach does not prices on health outcomes only if data are collected account for variation in the types of visits. A better from individual health care providers using a facility approach is to collect prices from health care facilities questionnaire, although there are some exceptions (for for specific types of visits. One use of such data is to example, if all health facilities are operated by the gov- construct a price index of health care costs that sum- ernment and there is no variation in price among marizes the average "price" charged by each facility.Two them). The same is also true for analyzing how the types of data are required for these price indices: data on quality of services affects health outcomes; such analy- the prices of the services provided and data on average sis cannot be done unless data are collected from household expenditures on each service. The data on health care facilities. 
health care expenditures can be used to calculate A third issue is that fees paid to health care weights for the health care price index, just as house- providers are not the only costs of obtaining health hold expenditure data on all goods and services can be care, even after including unofficial charges such as tips used to calculate weights for general price indices. and bribes. There are also the financial and time A second issue regarding the prices charged by (opportunity) costs of traveling to and from a facility health care providers is that prices may be higher for and the cost of the time spent obtaining treatment. health facilities that provide higher quality services. To The financial costs are simply a person's expenditures understand the pure effect of prices on household on transportation. The time cost includes the travel choices and health outcomes (in other words, the time to a facility plus the time spent at the facility, effect of a change in price for a given level of quality), multiplied by the value of the time of the person treat- it is necessary to remove any variation in prices that ed (and of anyone accompanying him or her).To com- simply reflects variation in quality. If this is not done, pute these time costs, analysts need data on respon- estimates of the effects of prices on health outcomes dents' travel times and waiting times, as well as their and on demand for health care are likely to be biased. wage rates. However, estimating such quality-adjusted prices is Travel times are best collected from each house- not an easy task. It can be done only at the data analy- hold, although one could collect this information at the sis stage, not at the data collection stage. 
community level if all households in the sampling unit There are several ways to adjust prices for quality live very close to each other.7 Wage rates can be differences at the data analysis stage, all of which use obtained from the employment module of the house- facility data on service quality. (Such data are discussed hold questionnaire (see Chapter 9). It is useful to collect further below.) One way is to specify quality as a data on waiting time in a facility questionnaire as well provider fixed effect, which makes it possible to purge as in the household questionnaire. Collecting data on the impact of quality from the prices of specific serv- waiting time in the facility questionnaire is useful for ices. (See Deaton 1988 for a specific example; see calculating quality-adjusted prices and also avoids inac- Chapter 23 for a general discussion.) curate average waiting times based on only one or two 191 PAUL J. GERTLER, ELAINA ROSE, AND PAUL GLEWWE households who actually used the facihty. Yet asking children whose mothers received that care-after con- about waiting time in the household questionnaire can trolling for variation in households' socioeconomic be useful for estimating simple averages across different characteristics and in risk factors. types of health facilities.The effects of any insurance on Information on the structure of care can be prices paid can be captured by including information obtained by asking about equipment, personnel, and on deductibles and copayments when calculating the availability of different kinds of medicines. prices. This insurance information must be collected in Information on the process of care can be obtained by the health module of the household questionnaire. asking about the protocols followed for different types A final cost of health care is the cost of purchas- of commonly provided health services. 
Several exam- ing medicines from outlets that are not health care ples are in the draft facility questionnaire introduced in providers, such as pharmacies and small vendors. The the third section of this chapter (and presented in best place to collect this information is in the price Volume 3 of this book). For a more detailed discussion questionnaire (see Chapter 13 for a full description), see Peabody, Gertler, and Leibowitz (1998) and which can be used to collect price data for many kinds Peabody and others (forthcoming). of commonly purchased medicines. PROVIDING PUBLIC HEALTH SERVICES TO THE GENERAL QUALITY OF CARE. The data necessary to evaluate how POPULATION. Public health services and programs are the quality of care affects household behavior and generally outreach programs and thus are not neces- health outcomes should be collected in the facility sarily associated with particular health care facilities. questionnaire. The best approach is to ask providers This implies that data on all public health services and what actions they take under various circumstances programs available in a given community, such as and what resources they use in providing health care immunization programs and information campaigns, services. Some households may be able to provide this should be collected in the community questionnaire. information, but others may not or may provide inac- Data should also be collected in the community ques- curate information. tionnaire on the existence (and quality) of public The quality of the care provided by each facility water supply and garbage collection services. In addi- or provider depends on both the structure of care and tion, it is useful to collect information on air and water the process of care.The structure of care is the quan- pollution, although it is difficult to collect this accu- tities and types of inputs (such as equipment, person- rately. 
The community questionnaire introduced in nel, and medicine) used by the provider in providing Chapter 13 collects all of this information in detail, its health care services.The process of care is the way except for data on air and water pollution, which can in which services are provided during a patient's visit, be gathered in the relevant submodules presented in including the way in which professionals diagnose and Chapter 14. Finally, the household questionnaire treat patients with specific health needs. For example, should include explicit questions on who in the in the Jamaican LSMS survey conducted in November household participated in these public health pro- 1989 process measures were collected from both facil- grams-for example, who was immunized, received a ity and household questionnaires to assess the quality pamphlet, or was otherwise affected by an information of prenatal care. campaign. There is evidence that the process of care varies substantially among developing countries. Peabody REGULATION OF PRIVATELY PROVIDED HEALTH CARE (1996) noted that inaccurate diagnoses and inappro- SERVICES. As explained in the first section of this chap- priate treatment are common in rural Vietnam; diar- ter, the main role of the government regarding privately rheal disease is often inappropriately treated with provided health care is the regulation of private health antibiotics rather than with oral rehydration therapy. care facilities. Data from LSMS surveys can be used to Using data from Jamaica, Peabody, Gertler, and measure the extent to which these facilities comply with Leibowitz (1998) found that a better process of care the relevant regulations, especially if a health care facili- (measured by how actual care compared to optimal ty questionnaire is included as part of such a survey. 
diagnosis, treatment, and advice protocols) was associ- Specifically, data from a facility questionnaire can show ated with a 500-gram increase in the birthweights of xvhether the staff at the facility have the required train- 192 CHAPTER 8 HEALTH ing and licenses, whether the facility's equipment meets reduces the extent to which these facilities are used, certain standards (for example, whether each piece is one would like to know how much of this is due to clean and in working order), whether proper procedures people switching to private providers and how much are being followed for specific types of examinations and is due to people receiving no professional health care treatment, and whether regulations regarding drugs and at all. Second, private providers may alter their behav- prices are being followed. ior in response to public health care policies. In theo- Additional information on compliance can be ry, this could be analyzed using data on private gathered from the household questionnaire. Such data providers-ideally using estimates of the supply curve can be used as an independent check of drug and price of different kinds of privately provided health care.Yet regulations and of whether certain procedures are such analysis would be fairly experimental. It is diffi- being followed. (This is particularly useful if health cult to say whether such analysis is feasible using data facilities provide misleading information about their from an LSMS-type household survey. compliance with government regulations.) It may also Fifth and finally, if detailed cost data are collected be possible to check some regulations about insurance from private health care facilities, these data can be using data from the household questionnaire. used to estimate the cost of providing different servic- A second general use of data from the health facil- es. 
Such information might be used to assess the effi- ity questionnaire-and, to a lesser extent, from the ciency of the public sector, as well as to undertake household questionnaire-is to provide a picture of the cost-benefit analysis. However, this is also a new area private health care system. For example, these data may of research, one for which there may be many unfore- reveal deficiencies in the provision of private health seen analytical problems. care, and they may also suggest to policymakers specif- ic regulatory actions to resolve these problems. Such Using Data for Policy Analysis information can be useful to policymakers when they The discussion up to this point has said little about are considering new regulations on private facilities. how to use these data to analyze health policies and A third use of data from private health care programs.A detailed discussion would require a sepa- providers is to examine how regulations affect these rate chapter, if not an entire book.Yet a brief discus- providers' services and even their patients' health out- sion can highlight some fundamental points that sur- comes.At minimum, the extent to which private facil- vey designers should bear in mind when designing the ities comply can be assessed, as discussed above. In health module for an LSMS survey. This subsection addition, if data indicate that some providers comply briefly reviews some methodological issues that arise while others do not, it may be possible to use the vari- in policy analysis in the health sector. ation in compliance to estimate the impact of compli- In general, policy questions can be divided into ance on health outcomes. However, such estimation is two categories: not straightforward and could lead to biased results. * Questions about overall levels of outcomes and the Finally, if panel data are collected and new regulations distribution of those outcomes across subgroups of are put in place during the time spanned by the sur- the population. 
veys, it may be possible to estimate the impact of reg- * Questions about the effects of policies on out- ulations on the phenomena that are being regulated comes, both in the aggregate and within subgroups. and perhaps even on health care outcomes. Yet here Table 8.1 lists 37 policy questions. Of these, questions too the estimation problems are considerable, and ana- 1-14, 18, 25, and 28-33 concern the overall level and lysts should be cautious about drawing causal infer- distributions of outcomes, while questions 15-17, ences. For further discussion of estimation problems 19-24,26, 27, and 34-37 pertain to the effects of poli- see Appendix 8.1 and Chapter 23 on panel data. cies on outcomes. The research methods needed to Investigating the behavior of households and of answer the first set of questions are straightforward. private providers is a fourth way to use data on private The questions can be addressed by describing the cur- health care facilities. First, policymakers need to know rent situation. All that is needed is descriptive infor- how changes in the public provision of health care mation, such as the means of the outcome variables services affect the use of private health care providers. both in the aggregate and by population subgroup. For example, if increasing prices at public facilities Analyzing the second set of policy questions raises dif- 193 PAUL J. GERTLER, ELAINA ROSE, AND PAUL GLEWVVWE Table 8.1 Policy Questions and Data Sources Minimal data needs from Expanded data needs Sources other than Policy question a household survey from a household survey a household survey I .Who is getting the most serious Diarrhea only: short 4 8, Vital registration data diseases? standard A6 A 10 2.Wno is unable to perform their usual Short 1, standard A3 activities because of poor health? 3..What is tne level and distr bution of Standard AlI -A29 Expanded G I -G I health in terms of physical funct oning within the population? ........... 
.................................................................................................................................................................................................................. 4.What is the evel and distribution of Adult expanded H l-H32; health in terms of cognitive functioning child: Education module within the population? 5. What is the ievei and distribution of Expanded A30-A47 health in terms of mental health within the population? .... ..... ~.....................................................*..........................................................*................................................................................................... 6,What .s the level, cause, and distribution Vital registration data of adult mortality in the population? 7. What is the ievel, cause and distribution Fertility module Vitai registrat on data of child mortality in the population? 8.Who is engaging in individual health- Standard B2-B23 Fertility module (breastfeeding related behavior of particular concern? and weaning) What is the degree of participation in such behavior? 9 ouseholds are engaging in Housing module sanitation, waste disposal, and water supply pract ces that adversely affect the health of their members? Il0OWho is obtaining health care by Short 9-35 (only provider type), Expanded E I -E90 (very detailed, purpose, provider type, and level of care? standard E I -E54 (more detail, including by purpose) but no data on purpose) ................................................................................................:.............................................................................................................*.............. I .What providers and services do Community questionnaire Standard FI-F8 households have access to? 12. 
What are private expendiitues on Short 1 1-37 (brieD, Expanded E8-E94 Special interviews or care by purpose, provider type, and standard E3-E58 (more detailed) (very detailed) calendars for severe level of care? illness ....... .................................................................................................................................................................................................................... I3.What is tne ut lization of each Short 9-35 (brief), Expanded EI-E90 subsidized service, in the aggregate standard E I -E54 (more detailed) (very detailed) and by subgroup? 4...... . S..~..... .............................................................................................................................................................................................................. 14.Wnat a tne unit subsidy for Expanded E8-E90, Estimates of providing each service? facility questionnaire 84, B9, B I I incremental cost from cost function 5.* How does a change in public Measures of health: short Additional health measures: education, health promotion actvities, 1,4-7, standard A3,A6-A9,AII I-A29: expanded A30-A47, CI-GIl, and other public health activities affect anthropometric data: Anthropometry HI -H32 health? module; education and activities: Community module I16.* What is the effect of the quality Measures of health: short 1, 4-7, Additional health measures: of the air or water supply on health? standard A3, A6-A9, A I I -A29; expanded A30-A47, G I -G I 1, anthropometric data: Anthropometry H l-H32. module; environment data: Environment modules ............. ..... .........*............................................................................................................................................................................................... 
I7.* W at are the effects of government Health behavior: standard 82-B23; Other information on heath promotion activities on individual government activities: Community module health promotion health-related behavior and on knowledge activities of the health risks associated with certain behavior? 8. What is the price that households Standard E3-E58 (only by provider type), Expanded E8-E94 (fuli detail) pay for care by type of service, level facility questionnaire B4, B9, B 1 of care, and provider type? ............................................................................................................................................................................................................................ 194 CHAPTER 8 HEALTH Table 8.1 Policy Questions and Data Sources (continued) Minimal data needs from Expanded data needs Sources other than Policy question a household survey from a household survey a household survey 19.* What is the effect of a change in user Standard E I -E54, facility questionnaire Expanded E l-E90 (more fees for a service on utilization of that B4, B9, B II detail on utilization and service, utilization of other services, expenditures, can also use to provider choice, household expenditures impute prices) on health care, and government revenues from user fees by service? 20.* How does a change in user fees Measures of health: short I,4--7. Additional health measures: User fees affect health? standard A3, A6-A9, A I1 -A29: expanded A30-A47, G I -G I1, anthropometric data: Anthropometry H I -H32; can also use E I -F91 module; fees: facility questionnaire to impute prices B4,B9,BI I 2 i * What is the effect of program Standard E I-E54; program iocation: Expanded El -E90 (more detail location on utilization of services, Community module; standard F -F8 on utilization and expenditures) provider choice, household expenditures on health care, and government revenues by service? 
22.* How does a change in program Measures of health: short 1, 4-7, standard Additional healtn measures: location affect health? A3, A6-A9, A I I-A29: anthropometric data: expanded A30-A47, G I -GI 1, Anthropometry module; program location: H l-H32 Community module; standard F I -F8 23.* How do changes in the availability Health behavior: standard B2-B23; of government services affect household government services: Community health-related behavior? module: standard F I -F8 24.* What is the effect of a change in Health behavior: standard B2-B 16; prices: alcohol or tobacco taxes on consumption Community module of those goods, on household expenditures on these goods, and on government revenues from the taxes? 25. What is the quality of care available Standard F I-F8, facility questionnaire to households by type of service, level of care and provider type? 26.* What lsathe effect of a change in the Standard E I-E54; service quality: facility Expanded E I-E90 (more detail quality of a service on utilization of that questionnaire (Parts A-F) on utilization and expenditures), service, provider choice, household facility questionnaire (Parts G-K) expenditures on health care, and government revenues by service? 27.* How does a change in the quality Measures of health: short 1, 4-7, Additional health measures: of health care affect health? standard A3, A6-A9, A I I -A29; expanded A30-A47, G I -G I 1, anthropometric data: Anthropometry HI -H32; facility questionnaire module; service quality: facility (Parts G-K) questionnaire (Parts A-F) available from private sector providers? 29.Are private sector providers following Facility questionnaire (Parts A-F); Facility questionnaire (Parts G-K); government regulations regarding service some price data also from standard price data also from prices, the quality of services, and the use E3-E58 Expanded E8-E94 of pharmaceuticals? 
30.Are private providers of insurance Standard DIl-D 17 Special study on following government regulations? insurance providers 3 i What is the impact of government Facility questionnaire Facility questionnaire Information on regulations on the quality of care (Parts A F; panel data) (Parts G-K; panel data) government in the private sector? regulations 32.Who is covered by health insurance? Short 38, 39 (coverage only) Standard DIl-D 17 What are the benefits, services covered, (detail on costs, benefits, and so on) copayments, deductibles, and benefit caps? ...................................................................................................................................................*............................................................................. (Tloble continues on next pcge.) 195 PAUL J. GERTLER, ELAINA ROSE, AND PAUL GLEWWE Table 8.1 Policy Questions and Data Sources (continued) Min mal data needs from Expanded data needs Sources other than Policy question a household survey from a household survey a household survey 33. Who has health benefts at the place Employment module of employment? What are these health benefits? 34-c Howv does the presence of nsurance Standard E I-E58; Expanded E-l-E94 (more detaii coverage affect health care utilization for insurance: short 38, 39 on utilization and expenditures); patterns, expenditures on health care, for nsurance: standard D I-D 17 and government revenues from user fees? (detail on costs, benefts) 35 What is the effect of health on Measure s of health: short 4-7, Additional health measures: worker and farm product vity? standard A3, A6 A9, A I I A29; expanded A30-A47, G I -G I 1, anthropometric data: Anthropometry H .-H32 module: for worker and farm productivity: Employment module, Household enterprise module, Agr culture module 36.* What s the effect of health on Measures of nealtn: short I, 4-7, Addit onal health measures: education and cognitve outcomes? 
standard A3,A6-A9,AI I -A29; Expanded A30-A47, Gl-l anthropometric data: Anthropometry H I -H32 module; educat on outcomes: Education module 37* What is she effect of heath on Measures of health: short , 4-7, Additional health measures: consumption and savings? standardcA3,A6-A9,Al I A29: expanded A30-A47, G -G 1, anthropometric data: Anthropometry H .-H32 module; savings and consumption: Savings module, Consumption module Note. Quest ons marked w th a () qire causai ana ys s.The numbers in the second and third columns ind cate question numbers from the different vers ons of the hea th modu e in the household questionnaire. For tne stardard and expanded modu es the letters refer to the 'Part" ofthe modu e. For example, "standard A6" refers so Question 6 of Part A of the standard healtr module. Source: Authors' summary ficult estimation issues because it requires estimation been controlled for. Another example is user fees. of causal relationships-assessing how specific policies Suppose that user fees are higher in areas where affect complicated behavioral choices. households' incomes are relatively high. In this case, an This subsection briefly discusses some method- analysis of the relationship between user fees and the ological issues that arise in answering the second set of utilization of health care services that does not control questions. A detailed (and more technical) discussion for households'income levels may show that higher of these methodological issues can be found in user fees are associated with higher levels of health Appendix 8.1 . Analysts may want to know how a spe- care utilization, because higher income leads to greater cific health program affects a specific health outcome, utilization. The basic statistical tool for estimating such as the incidence of diarrhea. 
One simple causal relationships is multiple regression analysis, approach is to compare the mean incidence of diar- which controls for the effects of all variables that can rhea among communities that benefit from the pro- potentially affect the health outcome being studied. gram with the mean incidence in communities that do For the analysis of health issues, the two most not benefit from the program. Unfortunately, this important relationships to estimate using multiple approach cannot be used to determine the causal regression analysis are health input demand equations impact of the program. Suppose a simple analysis of and health demand equations.8 Estimates of health means indicates that communities benefiting from the input demand equations measure the determinants of health program have a lower incidence of diarrhea. health care utilization, expenditures on health care, This may not be due to the program, because com- and health-related behavior. Each of these health- munities that benefit from the program may also have related outcomes is determined by a set of causal vari- higher income levels, and higher income levels may be ables, namely prices (both for health services and for responsible for all or part of the lower incidence of goods unrelated to health), prevailing wage rates, diarrhea. Thus, while a causal relationship may exist, it household income, other household characteristics can easily be obscured by other factors that have not (such as the education levels of household members), 196 CHAPTER 8 HEALTH and a variety of community characteristics, including often more useful to estimate changes in health status the characteristics of locally available health services. as determined by prices, wages, household income, In addition, households vary in their "tastes" for health and other household and community characteristics. and in the "innate healthiness" of individual household For example, a child's nutritional status at 12 months members. 
of age can be thought of as being determined by his The data required to use multiple regression or her nutritional status at 6 months of age and the analysis to estimate a health input demand equation health inputs that he or she has received during the are the health input of interest (such as the use of intervening 6 months. health care services, expenditures on health care, or a Perhaps the most important lesson for survey specific health-related behavior) and the entire set of designers from this discussion of methodological issues variables that determine the demand for that input. is that if they want to analyze the causes of health sta- While it is possible (though not necessarily easy) to tus and related activities (health care utilization, health collect data on prices, wages, income, and many other care expenditures, and health-related behavior), it is household and community characteristics, it is much necessary to collect data on all of the characteristics of harder to measure tastes for health and the innate households and communities that determine these healthiness of individuals. This inability to observe health outcomes. Not having all of these data can lead some of the variables that determine households' to omitted variable bias. The magnitude of the bias demand for these health inputs leads to a serious esti- will depend on the correlation of the omitted variable mation problem: omitted variable bias (discussed fur- and the included variables and on the magnitude of ther below). In fact, the two examples above (diarrhea, the omitted variable's true effect on the outcome. user fees) on the problem of using simple comparisons Omitted variable bias can arise for several reasons. of means can be thought of as cases of omitted vari- First, it can arise if one of the observable determinants able bias, in which the omitted variable is household of the health outcome is omitted, as in the above income. 
examples in which household income was omitted Health input demand equations can be used to from studies of diarrhea and user fees. Second, omitted estimate how specific policies or programs, such as variable bias can also arise because some of the deter- public health programs and price changes, affect health mining variables, such as tastes and innate healthiness, care utilization, provider choice, and health-related are almost impossible to measure. A third source of behavior. Of course, the price that really matters is the omitted variable bias arises in the case of a dynamic effective price paid by the consumer. This means that model, as explained in Appendix 8.1. For further dis- in addition to the data on the price charged by the cussion of omitted variable bias and how to avoid or health care facility, data are needed on taxes, travel at least minimize it, see Appendix 8.1 and Chapter 26. expenses, waiting times, wage rates, and insurance copayments.The effects of travel and waiting times can Sampling Issues be accounted for either by including time costs in the Two issues need to be considered with respect to sam- calculation of the price or by including distance in the pling. First, LSMS surveys should continue their prac- equation as an additional determining variable. tice of not subsampling individuals within a household. Estimates of the effects of specific policy changes on This would yield samples that would be too small to government revenues from taxes and user fees can also draw reliable inferences about the utilization of health be calculated from estimates of how prices affect uti- care. For example, in the 1987-88 LSMS survey in lization rates. Ghana, 5,746 individuals (39 percent of the sample) What about the impact of government policies on reported being ill or injured during the previous four health status? 
Clearly, health status is strongly affected by health care utilization, expenditures on health care, and health-related behavior. Thus the same variables that determine these demands for health inputs also determine health. This relation is a health demand equation (also known as a health demand function). Since current health can depend on past health, it is

weeks, and 2,398 (17 percent of the sample) had received medical attention. Disaggregating these visits by type of provider shows that 1,106 were treated by doctors, 292 by nurses, 654 by medical assistants, 84 by midwives, 70 by pharmacists, 109 by healers, and fewer than 100 by other providers. These Ghanaian households had an average of 4.68 members, so randomly sampling two individuals per household would have yielded numbers less than half of those reported above. Such samples are too small to generate reliable estimates of expenditures and utilization rates, particularly by provider. The sample sizes for rare illnesses and their associated treatments would be even more miniscule.

A second issue regarding sampling is the selection of the health care facilities to which to administer the facility questionnaire. In rural areas there may be few facilities to visit, so all nearby facilities (and possibly even some that are far away) should be visited. In contrast, in large urban areas the number of facilities that households could potentially visit is so large that it is not feasible to visit them all or even most of them. In such cases survey designers must choose a sample of facilities. In an LSMS-type survey the facilities of interest are those that are available to the households in the sample. Thus the general principle is to ensure a good deal of overlap between the facilities interviewed and the facilities that respondents know about and use. The best method for choosing such facilities, discussed in more detail in Chapter 13, is to compile a list of facilities based on those mentioned most often in responses given in the household questionnaire. If the number of facilities on the cumulative list is small, or if the survey budget is large, all facilities on the list can be included in the sample. Otherwise, a sample of facilities should be drawn from the list. The sample of facilities can be selected either randomly or according to some other criterion, such as using a probability proportional to the number of times they are mentioned by household respondents. Once selected, the list of facilities and their associated code numbers is recorded in the community questionnaire.

Links Between Policy Issues and Data Needs

This subsection links the policy issues discussed in the first section of the chapter to their specific data requirements, with reference to the draft modules introduced in the third section of the chapter. For convenience, this information is summarized in Table 8.1, which identifies the minimal data needs, expanded data needs, and data sources other than household surveys required to answer the policy questions.

There are two differences between the minimal and expanded data needs. The first is precision. Minimal data needs are based on relatively few questions and thus do not measure the particular phenomena of interest as accurately as do the expanded data needs, which are based on more, and more detailed, questions. The second difference is that expanded data needs measure more dimensions of the phenomena of interest. For example, expanded data needs measure health outcomes that are not measured by the minimal data needs, such as mental health and cognitive functioning.

Another feature of Table 8.1 is that it distinguishes between policy questions that require causal analysis and policy questions that do not. Questions that require causal analysis are marked with an asterisk next to the number pertaining to the question. As pointed out above, the data requirements for causal analysis are quite large. In particular, all the determining variables that belong in the demand equations for health or health inputs, such as prices, wages, household income, and other household and community characteristics, are needed. There are some circumstances in which reliable estimates can be obtained without some of these variables. The issues involved in deciding which variables must be included are complex and beyond the scope of this chapter. However, a brief discussion and some useful references are provided in Appendix 8.1.

ASSESSING HEALTH STATUS AND MEASURING ASSOCIATED BEHAVIOR. Questions 1-9 in Table 8.1 refer to policy issues associated with assessing health status and measuring associated behavior. Because of measurement problems associated with self-reported morbidity data, LSMS surveys are, in general, not very useful for gathering information on the incidence of specific diseases. Yet there are cases in which self-reported symptom data may be helpful, and carefully worded questions can provide useful measures for policy analysis. An important example is diarrhea; several questions about the incidence of diarrhea are contained in all three versions of the draft health module (for the standard and expanded module these questions are in Part A: Self-Reported Health Status). In some countries additional data on symptoms may be useful in assessing morbidity with self-reported symptoms, yet better data on the incidence of most diseases can be obtained from facility data and from official government statistics on deaths (which often include the cause of death).

In addition to data on morbidity, policymakers often want data on both adult and child mortality. Because adult mortality is a relatively infrequent event, household surveys with samples of 2,000-5,000 households (the recommended size for an LSMS-type survey) will not yield sufficiently precise measures of adult mortality for use in policy analysis. Official government statistics are a better source for this information. On the other hand, because child mortality is more frequent, some useful estimates can be obtained from data collected in the fertility module (see Chapter 15).

The only information on health outcomes in the short version of the draft health module is for diarrhea among young children. The standard and expanded versions collect both diarrhea information and self-reported activities of daily living in Part A (Self-Reported Health Status). The expanded version also collects mental health information in Part A. In addition, the expanded version collects data on observed activities of daily living in Part G and on cognitive functioning in Part H. Both the standard and the expanded versions collect information on health behavior (specifically, behavior related to the consumption of alcohol and tobacco) in Part B (Health-Related Behaviors). Because the behavior of concern will tend to vary by country and the wording of the questions will be culturally specific, Part B will probably need to be modified to reflect the health priorities in each country. Other kinds of behavior that might be addressed in Part B include exercise, the use of seat belts, and safe sex practices. Information on infant and child feeding practices can be collected in the fertility module (see Chapter 15). Information on sexual practices is much more difficult and thus is not attempted in this module. Finally, some information on knowledge of sexually transmitted diseases is collected in Part B of the standard and expanded modules. Similar information could be collected on knowledge of the risk of cancer from smoking or the risk of liver disease from consumption of alcohol.

EQUITY ISSUES. Questions 10-14 in Table 8.1 address equity issues. A key equity concern is the utilization of health services and associated expenditures. The short version of the health module collects basic information on utilization and expenditure in the previous four weeks but does not disaggregate expenditures by type (such as clinic fees, purchases of medicine, and transportation). The standard module collects significantly more information on utilization and expenditures in Part E (Health Care Utilization and Expenditures). Part E of the expanded version of the health module collects very detailed information on utilization and expenditures in the previous four weeks.

Equity is also affected by differences in the availability of health care services and the extent to which people know that these services are available. The community questionnaire presented in this book asks a group of community representatives and leaders about what health care providers exist in their community. In addition, in Part F (Health Provider Knowledge) of the standard and expanded versions of the health module, each household is asked to name the closest health care providers of different types of which they are aware.

Finally, studying the distribution of government health care expenditures across different population groups requires not only data on the utilization of each service (which are gathered in Part E of the health module) but also data on the unit cost of providing each service. Rough estimates of the unit costs for some services can be obtained from the facility questionnaire. More precise measures of unit costs require econometric estimates from a cost function for the provision of health care, which entails collecting a substantial amount of additional facility-level data.

PUBLIC HEALTH PROGRAMS. Questions 15-17 in Table 8.1 address three policy issues regarding public health programs. The first two questions concern how public health policies affect health outcomes. At a minimum, answering these questions requires some measure of health and a measure of the policy variable that varies over the sample. The data needed for assessing health status were discussed above. The policy variables are collected primarily in the community questionnaire, and perhaps also in the facility and price questionnaires.

Question 17 in Table 8.1 concerns health behavior and health knowledge. Data on specific types of behavior and knowledge of the health risks are found in Part B of both standard and expanded versions of the health module. Data on government health promotion activities should be collected in the community questionnaire.

PRICING POLICIES FOR HEALTH CARE SERVICES. Questions 18-24 in Table 8.1 address health care pricing policies. In general, the short version of the health module contains only a small amount of data on these issues, while both the standard and expanded modules collect much more information. There are several options for measuring health care prices. Community-level prices for commonly available medicines can be collected in the price questionnaire. Prices for common health services can be obtained from the facility questionnaire. In the standard version of the health module, expenditures on treatment by provider type are gathered in Part E (Health Care Utilization and Expenditures). Part E in the expanded version of the health module gathers more precise measures, measures that can be used to construct prices by provider and by type and level of service, to calculate the costs of service in travel and waiting time, travel costs, and in-kind payments, and to adjust expenditures to account for insurance payments. Finally, information on travel time from Part F, which is included in both the standard and expanded modules, is useful for calculating the time costs involved in obtaining treatment.

Question 19 in Table 8.1 regarding the impact of user fees involves estimating the effect of a change in prices on the use of health care services. A minimal analysis requires data on utilization and provider choice from Part E of the standard module, along with some measure of user fees that varies over the sample. Ideally the effects should be estimated from a fully specified health input demand equation, which requires data on utilization from Part E as well as data on all the variables that belong in the health input demand function. Because the questions essentially involve estimating price elasticities, particular attention should be paid to the construction of price variables.

User fees and the location of government health programs can have an effect on health outcomes (see Questions 20 and 22 in Table 8.1), which implies a need to collect data on health status. The difficulties involved in doing this were discussed above. As always, causal analysis requires estimation of a health demand equation, which in turn requires data on all the variables that belong in that equation.

Analyzing the impact of the location of government programs on the utilization of services, on households' choices of providers, on household expenditures on health care, and on government revenues (Question 21 in Table 8.1) requires, at minimum, data on the location of those programs. Of course, data on utilization rates and household expenditures are also needed to answer this question. Data on the location of programs are collected in the community questionnaire and in Part F of the household questionnaire (standard and expanded versions). As discussed above, data on utilization and expenditures are collected in Part E of the standard and expanded versions of the health module for the household questionnaire.

Similar data are needed to analyze the impact of the location and availability of government services on health-related behavior (Question 23), using the health behavior data collected in Part B of the standard and expanded versions of the health module. Finally, the impact of alcohol and tobacco taxes on the consumption of those goods and on government tax revenues (Question 24) can proceed similarly. In particular, analysts could make use of data on taxes or prices that vary over the sample and data on consumption of the goods from Part B (Health-Related Behavior). Price elasticities can be obtained from estimates of the demand functions for the goods, which again would require data on all of the variables in the demand equation.

QUALITY OF HEALTH CARE SERVICES. Policy issues concerning the quality of health care services are shown in Questions 25-27 in Table 8.1. There is really only one way to measure the quality of care, which is by using data collected in a health facility questionnaire. This information can be linked to the household using the information provided in Part F (Health Provider Knowledge) and the data on local health facilities in the community questionnaire. Sources for data on utilization rates, expenditures, and health status were discussed above.

Box 8.2 Cautionary Advice

* How much of the draft module is new and unproven? The three versions (short, standard, and expanded) of the health module introduced in this chapter (and presented in Volume 3) are much larger than, and in most respects very different from, the health modules used in past LSMS surveys. But not everything is different; past LSMS surveys focused on utilization rates and household expenditures on health care, and since the short module and Part E of the standard and expanded versions of the module have the same focus, they are likely to work well because they are based on the experience of past surveys. However, most of the rest of the module collects data that have rarely been gathered in past LSMS surveys (such as data on self-reported health status, health-related behavior and insurance coverage). These parts are based mostly on the experience of non-LSMS surveys, but are also based to some extent on the 1990 Jamaica LSMS survey and the 1996 Brazil LSMS survey. These parts should work well but will probably require extra attention from survey designers. In particular, collecting data on self-reported health status, especially activities of daily living, is experimental. Health facility questionnaires have been administered in a variety of LSMS and non-LSMS surveys, and thus the design of the facility questionnaire is based on a large amount of experience.
* How well has the module worked in the past? The health module in past LSMS surveys worked well in achieving its main objective, the study of the choice of health care providers. Even so, the design was not well suited for cases in which individuals visited several health care providers during the reference period. The main criticism of past LSMS health modules concerns what they did not attempt to do. They collected almost no data on health status, health behavior or insurance status. The standard and expanded versions of this module directly address these deficiencies.
* Which parts of the module most need to be customized? Several parts of the module need to be customized to reflect the circumstances in the country where the survey will be fielded. First, the types of facilities visited can vary widely across countries, so questions on the utilization of those facilities must be substantially modified. Second, the questions on health-related behavior must be adapted because of the sensitive nature of certain topics, as what is sensitive varies from country to country. Third, the activity list in the self-reported and the observed activities of daily living may need to be tailored to ensure that they are culturally appropriate. Fourth, several aspects of the facility questionnaire need to be changed to fit local conditions, since types of facilities, kinds of personnel, and services offered can vary substantially by country. After both the health facility questionnaire and the health module for the household questionnaire are customized, the draft questionnaires should be discussed in detail with officials from the ministry of health.

POLICIES REGARDING PRIVATELY PROVIDED HEALTH CARE SERVICES. Questions 28-31 in Table 8.1 consider private sector health care providers and government policies toward them. The main issues are the quality of health care given by these providers and whether their services are in compliance with government regulations. Data from the facility questionnaire, in particular data collected from private health care providers, are essential (except for Question 30, which is discussed in the following paragraph). Parts A-F of the facility questionnaire collect data on quality of services as measured by the equipment and services available, while Parts C-K collect information on the process of care (that is, whether certain protocols are followed when examining patients). Data on prices, which are relevant for Question 29, are also collected in the facility questionnaire, and prices can also be
inferred from information on expenditures given in Part E of the standard and expanded versions of the health module for the household questionnaire.

The topic of Question 30 is whether private insurers are abiding by any government regulations that apply to them, such as whether they unlawfully drop individuals who have high medical costs or whether they comply with laws regarding their financial integrity. Panel data on insurance coverage may be able to detect whether household members' insurance coverage is illegally terminated, and to study similar issues regarding compliance with regulations pertaining to the treatment of insured individuals, but in general household survey data cannot provide information on the financial soundness of private insurance providers.

HEALTH INSURANCE. Questions 32-34 in Table 8.1 address policy issues related to health insurance. Information on who is covered by health insurance and what the specific benefits are (Questions 32 and 33) is provided in Part D of the standard and expanded health modules. A very small amount of information on the proportion of the sample population with insurance coverage is collected in the last two questions of the short module. Information on employer-provided health benefits is collected in the employment module, which is discussed in detail in Chapter 9.

Estimating the impact of insurance on utilization rates, expenditures, health behavior, health outcomes, and government revenues again requires estimation of the health demand function. As before, this requires data on all of the variables that belong in that function.

HEALTH STATUS AND OTHER SOCIOECONOMIC OUTCOMES. The last three questions in Table 8.1 (35-37) focus on how health status affects other socioeconomic outcomes. Answering each of these questions requires, at a minimum, data on health status and on the outcome of interest. The sources of data on health status were discussed above. A discussion of methods for estimating the determinants of consumption, education and cognitive outcomes, agricultural productivity, and savings can be found in Chapters 5, 7, 19, and 20, respectively.

The Health Module

This section introduces both a health facility questionnaire and three versions of a health module to be included in the household questionnaire. The three versions of the health module vary by length: short, standard, and expanded. Draft versions of all three health modules, as well as the draft facility questionnaire, are presented in Volume 3.

Table 8.2 summarizes the kinds of data collected in each version of the health module for the household questionnaire, and also shows the relevant data collected in the health facility questionnaire, the community questionnaire, the price questionnaire, and other modules of the household questionnaire.

The Short Module

The short version of the health module is designed for countries in which policymakers have some interest in health issues but health is not one of the most important policy topics in the survey. In this context, the short health module does four things. First, it collects some cursory information on the health status of household members. Second, it gathers information on all outpatient visits to health facilities by household members during the previous four weeks and on all inpatient visits in the previous 12 months. This information on utilization can be used to perform relatively simple analyses of the incidence of public health spending among different population groups.

Third, the short module collects some information on the household's expenditures on visits to the different providers. This provides a more accurate measure of the household's total expenditures on health than would be obtained by asking a single general question about these expenditures in the consumption module. This expenditure information also gives analysts a rough idea of the out-of-pocket cost of obtaining health care by type of service provider. (If more detailed expenditure data are required, the standard module should be used.) Fourth, the short module gathers very brief information on insurance coverage, just enough to indicate who is covered by insurance and who is not.

Even this short version of the health module may seem quite long. It contains 39 questions, about twice as many as were used in the health modules of the first LSMS surveys in the mid to late 1980s. However, the health information gathered in those first LSMS surveys was of very limited use. Moreover, when the short module presented here is administered, most respondents will be asked far fewer than 39 questions, since most probably will not have visited a health care provider in the previous four weeks. The field test should verify that the time required to fill out the short health module in a typical household is 10-15 minutes.

A final point regarding the short module is that there is no need to administer a health facility questionnaire as well. However, it is still useful to collect some summary information about health facilities in the community questionnaire.

The Standard Module

The standard version of the health module should be used when health is one of the main policy issues in the country of the survey, but not necessarily the most important issue. The standard module collects data on self-reported health status (including self-reported activities of daily living), health-related behavior, child immunization, health insurance coverage, health care utilization and expenditures, and knowledge of local health care providers. Analysts can use these data to answer questions about the level and distribution of health policy outcomes, including how changes in policy affect various outcomes in the health sector. Of particular interest are estimates of the price elasticity of the demand for health care, which can be done using the standard or expanded version of the health module.

In most cases data on health service providers should be collected in the health facility questionnaire. Anytime the standard (or expanded) version of the health module is used, anthropometric data should also be collected, both for children and adults. The collection of anthropometric data is explained in detail in Chapter 10.

The average duration of an interview using the standard module should be approximately 5-10 minutes per individual. However, the length of these interviews is likely to vary substantially, since individuals who did not use any health care services during the previous 30 days will be able to skip many of the questions.

Finally, whenever the standard or the expanded health module is used, the survey team supervisor must draw up a master list of codes for all of the health service providers in the community. This list is used to match the different information from different parts of the questionnaire. In particular, these codes are entered in the section of the community questionnaire that gathers information on health facilities, in Part F of the standard module (which asks about the distance from the household to the nearest health facilities), on each health facility questionnaire, and in Part E (detailed utilization and expenditures) in the expanded module. Merging this information is very useful for policy analysis. For example, combining the information in Part F of the standard module with the data from the health facility questionnaire not only provides analysts with the distance between the household and the nearest health facilities but also yields a wide range of information about those facilities from the facility questionnaire.

Table 8.2 Health Data in the Health Module and in Other Modules

Variable | Respondents | Short version | Standard version | Expanded version | Other module
Adult: Self-reported
Self-reported health status | People 15 or older | 1, 2 | A3, A4, A11-A29 | A3, A4, A11-A47 |
Health-related behavior | People 15 or older | | B1-B23 | B1-B23 |
Utilization and expenditures | People 15 or older | 9-37 | E1-E58 | E1-E94 |
Insurance coverage | People 15 or older | 38, 39 | D1-D17 | D1-D17 |
Employee health benefits | Employed individuals | | | | Employment
Adult: Objective measures
Body mass index | People 15 or older | | | | Anthropometry
Directly observed activities of daily living | People 40 or older | | | Part G |
Cognitive functioning | People 40 or older | | | H1-H32 |
Child: Mother/guardian reported
Self-reported health status | People under 15 | 1-8 | A3-A10 | A3-A10 |
Immunization | People under 6 | | C1-C11 | C1-C11 |
Utilization and expenditures | People under 15 | 9-37 | E1-E58 | E1-E94 |
Insurance coverage | People under 15 | 38, 39 | D1-D17 | D1-D17 |
Child: Objective measures
Height/age and weight/height | | | | | Anthropometry
Birth weight | | | | | Fertility
Infant mortality | | | | | Fertility
Cognitive development | | | | | Education
Household
Health-related behavior | | | | | Housing
Knowledge of health care providers | Wife of head | | Part F | Part F |
Other
Community health information | | | | | Community
Facility data | Director of facility | | | | Facility questionnaire
Prices of medicines | Market vendors | | | | Price questionnaire

Source: Authors' summary.

The Expanded Module

The expanded version of the health module takes the standard version as its starting point and adds more questions. It should be used in countries where health is the most important topic of the survey. Facility questionnaires should always be used when the expanded module is used. The expanded version goes beyond the standard version in two ways. First, it collects more data on health outcomes, specifically, observed data on mental health and data on activities of daily living and cognitive functioning. Second, it collects more detailed information on health care utilization and expenditures, which allows for more precise estimates of the demand for health and the demand for health care services. The detailed data gathered on utilization and expenditures includes information about the respondent's most recent visits to health care facilities, including the purpose and cost of their visits, the treatment they received, their expenses on travel and medicines, and the time they spent traveling to the facility and waiting for treatment.

The amount of time required to complete the expanded version of the health module is approximately 10-20 minutes per individual. Again, this will vary widely across individuals, since some household members will have very few health problems and no recent visits to health facilities while others will have more of both.

Hybrid Versions

A final point regarding these three versions of the health module is that survey designers should feel free to put together "hybrid" versions that fit their specific data collection needs. For example, they could put together a module halfway between the short and standard versions or halfway between the standard and expanded versions. The standard and expanded versions are made up of several different submodules that are mostly independent of each other, so survey designers could construct a health module that consisted only of those submodules that covered issues of particular interest to policymakers in that country.

Health Facility Questionnaire

The draft health facility questionnaire is designed to collect information on the resources and practices of local health care facilities, both public and private. The information collected by this questionnaire can be used to construct health care prices and to assess the quality of care. Part A collects some basic information about the facility, such as its ownership, its sources of electricity and water, and its hours of operation. Part B asks about the medical services offered by the facility and the prices charged for them. Part C gathers information about health center employees. Part D collects data on medical equipment, including whether the equipment is in working order. In Part E the interviewer asks for a quick tour of the facility to evaluate the cleanliness and the availability of the equipment in the different rooms (for example, the examination room, the injection room, the vaccine storage area, and the room for doing laboratory tests). Part F asks whether different types of medicines are in stock. The information from Parts A-F can be used to construct measures of health care quality based on the structure of care.

In household surveys in which health issues are a top analytical priority, survey designers should also gather some facility data on the process by which health care is given. Such data can be gathered in Parts C-K. In each of these parts the interviewer describes a hypothetical patient with a particular illness or health care need. The respondent for the facility questionnaire is asked a series of questions to find out how such a patient would be treated by the health facility. The five scenarios presented to the respondent are:
* The treatment of an adult with a cough and a fever.
* The treatment of an infant with vomiting and diarrhea.
* A pregnancy examination.
* The provision of an IUD.
* The provision of oral contraceptives.

The precise scenarios used should vary by country depending on what health problems and issues prevail in each country. Developing new scenarios may be warranted in some countries, but survey designers doing so should consult a public health specialist who has experience with collecting this type of data.

In any country, health care service providers come in many shapes and sizes, from drug peddlers selling over-the-counter medicines to large urban hospitals. There are two types of approaches to designing questionnaires to fit the different kinds of providers. The first approach is to have one questionnaire that can be used for a wide variety of providers. The judicious use of skip codes can facilitate the use of such a questionnaire by instructing the interviewer not to ask questions that are irrelevant. The second approach is to design separate questionnaires for each kind of health care facility: a pharmacy questionnaire, a public clinic questionnaire, a private clinic questionnaire, and a hospital questionnaire. The draft facility questionnaire presented here is of the general type, but it can be modified and split into several different types if the second approach is used.

A final point regarding the facility questionnaire is how to select the facilities, a particularly important issue in large urban areas where there are usually many facilities from which to choose. This is discussed in the second section of this chapter and also in Chapter 13.

Annotations to the Health Module

This section contains specific notes on particular parts of the three versions of the draft health module introduced in the previous section (and presented in Volume 3).

Short Health Module

Q2. This is a general measure of the health status of each individual. It is measured as the change over the past year to minimize bias caused by the influence of socioeconomic characteristics on respondents' assessments of their health status.

Q3-Q8. Diarrhea is most serious for children ages 0-6, so this is not asked of anyone 7 or older. If diarrhea is a serious problem for older people in a particular country, the age range can be changed accordingly. No questions are asked on any other specific diseases because, as explained in the second section of this chapter, subjective information on other diseases is likely to be very inaccurate and may well be biased.

Q9-Q35. These questions assume that there are six kinds of outpatient care providers (public hospitals, public health clinics, private hospitals or clinics, private doctors, private nurses/paramedics/midwives, and traditional health practitioners) and three kinds of inpatient care providers (public hospitals, public health clinics, and private hospitals or clinics). This can vary from country to country, so the number of both types should be adjusted depending on the circumstances in the country surveyed.

Q11, Q14, Q17, Q20, Q23, Q26. These questions exclude transportation costs (since transportation is not a direct form of medical care) but include purchases of medicines elsewhere. For some purposes survey designers may want to include transportation costs. In other cases policymakers may want more disaggregated expenditure information, such as separate questions on payments to the provider, payments for medicine purchased elsewhere, and payments for transportation. This would effectively add 12 more questions, and thus lengthen the interview time. This additional time may be worthwhile, depending on the interests of policymakers. Alternatively, even more detailed information could be gathered, along the lines of Part E in the standard module.

Q21-Q26. These questions assume that services from private individuals other than doctors and traditional health practitioners are always outpatient services. If this is not the case, these questions must specify that information is only required for outpatient services, along the lines of Questions 9-11, which collect analogous information for outpatient visits to public hospitals. In addition, questions on inpatient services from private individuals and traditional health care practitioners should be added, along the lines of Questions 27-35 on inpatient services.

Q38-Q39. If insurance is relatively common, survey designers may want to collect more detailed information. In particular, it may be of interest to find out how much the insurer is reimbursing specific health care providers. (Note that household members are asked to report only costs that they pay, not costs paid by insurers.) This would entail asking additional questions for each of the six kinds of outpatient providers, as well as for the three kinds of inpatient providers. Again, whether this is worth the potential increase in interview time depends on the interests of policymakers. More detailed insurance questions are provided in Part D of the standard and expanded versions of the health module. One potential problem here is that a substantial proportion of respondents may not know how much the insurer pays if the costs are paid directly to the provider by the insurer, as opposed to reimbursing payments made by the individual. This can be checked during the field test of the draft questionnaire.

Standard Health Module

A1-A2. It is useful to know if the information is being provided by the person in question or if someone else is providing the information because the person in question is not available or is too young to answer for himself or herself. As discussed in Chapter 4, data collected directly from the person in question are likely to be much more reliable than data provided on this person's behalf by someone else.

A4. This is a general measure of the health status of each individual. It is measured as the change over the past year to minimize bias caused by the influence of socioeconomic characteristics on respondents' assessments of their health status.

A5. This question directs people to different questions depending on their age. Questions A6-A10 on diarrhea are usually only relevant for children ages 6 and younger. Further comments on Questions A6-A10 are provided in the previous subsection; see the comments on Questions 3-8 in the short module. Questions A11-A23 are only relevant for individuals ages 40 and older, and Questions A24-A29 are only relevant for individuals ages 15 and older.

A11-A29. These questions address the self-reported activities of daily living. The particular self-reported activities of daily living that should be collected, and how they should be interpreted, will vary across countries. The easiest activities of daily living are those in Questions A11-A16 (dressing oneself, standing from a

module. Questions about expenditures are not included in this version because price data are collected in the price questionnaire for tobacco and alcohol; for most purposes analysts could multiply those prices by the quantities provided here to obtain a reasonably accurate measure of expenditures on alcohol and tobacco. The price information on tobacco and alcohol collected in the price questionnaire should be in the same units used here for quantities consumed.

B15-B16. If bottles vary in size, it may be better to ask this question in terms of ounces.

PART C.
These questions ask about the immunization sitting position, using a toilet). As indicated in status of all children younger than 7. As explained in Question A17, anyone who has difficulty doing any of Chapter 3, this information could also be collected in these activities need not answer the remaining ques- the anthropometry module or the fertility module. If tions because he or she will not be able to do these the standard version of the health module is used, it is more strenuous activities. more convenient to collect this information here than in the anthropometry module, since anthropometry PART B. This part contains questions on smoking, alco- measurements are often done by individuals who are hol consumption, and exercise, as well as on knowledge not trained interviewers. If there is a fertility module of the health consequences of sexual behavior.The par- in the survey, it is useful to collect immunization data ticular behavior to be examined and the specific way in in the fertility module because such data will also be which this behavior manifests itself will vary over coun- collected for children who have died. tries; for example, in countries where travel by car or motorcycle is common, survey designers may want to C1-C2. Mothers are asked questions about their chil- ask questions about use of seatbelts and motorcycle hel- dren's vaccination history based on their vaccination mets. Because of this, Part B may need to be substan- cards and are asked to recall whether any information tially modified to reflect the circumstances that prevail does not appear on the card. If the card is unavailable in certain countries. Ideally it would be useful to ask or incomplete, the mothers are asked to recall their questions not only on knowledge of the consequences children's entire vaccination history. of sexual behavior but also on respondents' sexual behavior, particularly risky sexual activities. However, in C4, C6, C8. 
The way these immunizations are given most countries the sensitive nature of these activities will vary by country. Survey designers should consult makes it difficult to ask such questions. See the second knowledgeable officials in the ministry of health to section of this chapter for further discussion. find out how immunizations are given, and change the wording of these questions accordingly. B2-B13. In addition to the kinds of tobacco asked about here, survey designers might choose to ask PART D. This schedule asks about any insurance cov- about the use of pipe tobacco, if pipe smoking is com- erage that the members of the family may have, the mon in the country surveyed. source of the coverage, and the benefits provided. B7, B8, B13, B15, B16. In addition to asking about Dl. Health insurance includes health care provided by quantities consumed, survey designers may want to ask the government or an employer without charge or for about expenditures. Such questions should be asked a reduced charge. Some respondents may not think of immediately after the questions on quantities. This this as insurance. In this case the wording of Question may provide more accurate information on expendi- Dl must clearly indicate that such health care is con- tures than would be obtained in the consumption sidered to be insurance. 206 CHAPTER 8 HEALTH D3. In some countries it is possible for household live in the household or if an adult male is clearly members to have more than one kind of insurance. more knowledgeable than any of the adult women, the For example, insurance in the form of free health carc most knowledgeable adult male should be inter- provided to government workers may be deemed viewed. This information can be used to measure inadequate, and in response some of them may also access to health care and to estimate health care prices. visit private health care providers and purchase private The kinds of health facilities from which to col- health insurance. 
In this case Part L) must be expand- lect this information will vary from country to coun- ed to allow for two kinds of insurance per household try. For the health facilities that are most common, the member, which in general will entail repeating nearest two or three (within some radius of the center Questions D5-D17 for a second type of insurance. of the community) should be covered. The question- naire lists only three facility types-public hospital, PART E. This part collects basic information on self- public clinic, and private hospital or clinic-but in treatment and on inpatient and outpatient treatment most cases more should be included, such as private by type of facility during the previous four weeks, as pharmacies, itinerant drug peddlers, traditional healers, well as on the total costs associated with each type of and family planning centers. The specific types to list treatment. These questions assume that there are six should be discussed with officials from the ministry of kinds of outpatient care providers-public hospitals, health. public health clinics, private hospitals or clinics, private doctors, private nurses/paramedics/midwifes, and tra- F2-F3. The codes associated with the names of the ditional health practitioners-and three kinds of inpa- facilities should be taken from a master list of health tient care providers-public hospitals, public health facilities in the community.This list is drawn up by the clinics, and private hospitals or clinics. This may vary survey team supervisor, as explained in the third sec- from country to country, so the number of both types tion of this chapter. should be adjusted depending on the circumstances of a given country. Additional Questions for the Expanded Health Module E5, Ell, E17, E23, E29, E35, E41, E47, E53. One easy A30-A47. 
The wording of these questions, which per- way to check whether a person is insured is to add tain to mental health, will have to be adapted to reflect another column onto the fold-out sheet with the the prevailing culture in the country of the survey. In names of all household members (see Chapters 3 and some countries the prevailing culture may prevent 6) that indicates whether the person is covered by people from answering these questions accurately, in insurance. which case there is little use in asking them. E25, E26, E31, E32. These questions assume that serv- PART E. This version of Part E is nearly twice as long ices from private individuals and traditional health as the version in the standard module because it col- practitioners are always outpatient services. If this is lects more detailed expenditure and utilization data, not the case, these questions must specify that infor- This level of detail allows for a wide variety of mation is only required for outpatient services along descriptive analyses, and can also be used to construct the lines of Questioins El and E2, which collect anal- prices in the health and health input demand equa- ogous information for visits to public hospitals. In tions.The schedule collects information on the partic- addition, questions will have to be added that ask for ular provider used by the respondent, along with the inpatient visits (hospitalizations) with these kinds of purpose of the respondent's visit and measures of time health care providers. costs, distance, types of service received, and monetary or in-kind payments made for the service.The types of PART F. These questions ask about providers that are provider will vary from country to country. known to the respondent. The respondent should be the adult woman in the household who knows the El-E57. This long set of questions can accommodate most about local health care providers. In most cases up to three visits in the previous four weeks. 
If, say, more this will be the wife of the head. If no adult women than 1 percent of the sample have had more than three 207 PAUL J. GERTLER, ELAINA ROSE, AND PAUL GLEWWE visits, questions could be asked about up to four visits. needed, until the arm is extended behind with the In either case there will inevitably be a few individuals palm down. If the person has to bend his or her arm whose visits exceed the number that the questionnaire to do this, the "partial" response should be coded. can accommodate. For these respondents, extra blank sheets or a blank questionnaire should be used to record PART H. These questions on cognitive functioning all the visits.The data entry program should be designed should be administered only to people aged 40 and to allow for a large number of visits. older. They should be administered by the interview- er. They are based on tests used in the Bangladesh E58-E90. These questions cover inpatient stays (hospi- Family Life Survey (which was supported by the Rand talizations) in the previous 12 months, allowing up to Corporation). Some of the questions are culturally 3 inpatient stays per individual. See the comment specific, so pretesting will be needed. Some respon- above regarding Questions E1-E57 for what to do dents may find the questions so simple that they are when individuals make more than three inpatient annoyed or insulted, yet most such respondents should stays. be satisfied by an explanation of the purpose of the questions. PART G. Interviewers should ask all people 40 and older to perform directly-observed activities of daily H8. For countries in which there is no prime minister living as one measure of their health status. For more or the prime minister is not the best-known political detail on the protocols used see Guralnik and others leader, replace "prime minister" with "president" or a (1989). similar appropriate term. G3-G5. These questions measure difficulty in standing H9. 
In most countries "orange" and "cat" should be from a sitting position. Individuals who can rise from appropriate, but in some countries another common a chair without using their arms answer the first part fruit and animal may need to be used. "House" should of Question G3 and proceed to G4, where they are be appropriate in virtually any country. timed on their ability to do this five times.They then proceed to G6. Individuals who are unable to rise HIO, Hit, H14, H15. The term "dollars" should be from a chair without using their arms are asked to try replaced with the name of the local currency. In doing so using their arms. If they can do so, they pro- Question H-14, "dime" should be replaced with the ceed to Question G5, in which they are asked to do appropriate analogous term. this as many times as they can, up to five times. If they cannot do so they proceed to G6. H11-H13. In countries in which rice is not a common staple, the word "rice" should be changed to a com- G9. In this activity participants are asked to pick up a mon staple grain or tuber. pencil from the floor and return to a standing position. They may bend over or squat or do whatever they like Facility Questionnaire to pick up the pencil. As explained in the third section of this chapter, Parts A-F of the facility questionnaire collect information Glo. The respondent is asked to tap his or her foot 10 on the structure of health care, while Parts CK col- times for each foot. For those who can tap their foot lect information on the process of health care.The first 10 times, the time is recorded (and the number of taps type of information is essential, but the second type is is recorded as 10). For those who cannot do this 10 more experimental and thus may be regarded as times, the number of times they can do it is recorded, optional. If resources are scarce, Parts G-K can be and no time is recorded. 
dropped, but if finanicial constraints are less tight, it will be useful to collect this information. Gll. The respondent is asked to hold his or her arm straight out in front and then slowly raise it up over his COVER PAGE. The code number for the health facil- or her head. He or she should continue the motion, ity can be obtained from the master list of local keeping the arm straight while rotating the shoulder as health facilities, which in most cases will be collect- 208 CHAPTER 8 HEALTH ed in the community questionnaire. See the discus- PART D. The types of equipment listed here must be sion of sampling issues in the second section of this modified to fit the country in which the survey is chapter. being carried out. A4. It is important for analysts to know when a facil- PART E. The text in lowercase consists of the state- ity first opened to understand how its appearance in ments and questions made by the interviewer to the the community may have changed health outcomes, person being interviewed.The text in uppercase is not As discussed in Chapter 23, methods of analysis that to be read aloud. Instead, the interviewer needs to fill use panel data often use this information. in this information based on what he or she observes. As much as possible, the interviewer should not indi- A13-A14. Registration fees are general fees charged to cate to the person being interviewed what he or she is each individual who visits a clinic, regardless of what observing. treatment he or she receives. In most cases extra fees are added for particular services. In countries in which El, E2, E4, E20, E21. During the training of the inter- registration fees do not exist, these two questions can viewers, clear definitions need to be established be dropped. regarding what constitutes "clean" and what consti- tutes "dirtv."This should be included in the interview- PART B. 
The list of the types of services offered must er manuals prepared for the team members who fill be modified to reflect the services in the country of out this form. the survey. For example, in East Asian countries survey designers may need to add acupuncture. PART F. The list of medicines must be adjusted to fit the ones commonly used in the country. Survey PART C. If the questionnaire is applied to a large designers should consult officials at the ministry of facility such as an urban hospital, it will not be prac- health when drawing up this list. tical to ask questions about each individual. In this case a set of questions should be asked about how PARTS GK. The scenarios used must be modified to many employees or different types of employees (for fit the particular characteristics of health and health example, doctors, nurses, or technicians) work there. care in the country. This can be done only by an To get more detailed information on hours worked, expert in public health, preferably one with experi- a random sample of perhaps four doctors, four nurs- ence gathering this type of information. es, and four technicians could be drawn up, and all of the questions in Part C could be administered to Appendix 8.1 Estimating the Effects of Policy them. on Outcomes C2, C5. The types of medical pcrsonncl and the kinds Many of the policy questions discussed in this chapter of degrees that they may have must be modified to fit involve estimating the effect of a change in some the circumstances of the country. Survey designers health policy, such as the price or quality of health should consult officials at the ministry of health when care, or taxes on health related goods, on policy out- drawing up these lists. comes such as utilization of (or expenditures on) health care services, health-related behavior-or on C11-C12. TFhese questions are intended to find out health itself. 
Answering questions of the form "what is the extent to which health workers in a public facil- the effect of a change in some variable on an out- ity also provide services in some kind of private come?" involves undertaking a causal analysis. practice. This is an important but sensitive issue, so Specifying data requirements for this type of analysis is these questions need to be field tested and used with not simple. There are often a variety of options for care. It may not be necessary to ask these questions choosing the set of variables required for the analysis, for private facilities, but in some countries even and evaluating the merits of each of these options workers in private facilities may have separate pri- requires an understanding of the methodological issues vate practices. involved. This appendix is designed to provide detail 209 PAUL J. GERTLER, ELAINA ROSE, AND PAUL GLEWWE that goes beyond the brief methodological discussion Figure A8. I The Health Production Function in the second section of the chapter. Readers interest- ed in collecting data for the purpose of undertaking a Fxed individual/ Variable Endowment causal analysis should also consult Chapter 26 on eco- household (Y) inputs (X) (A) nomic models and econometric methods and Chapter and community (Z) 23 on panel data. characteristics The approach for estimating the effects of a change in a policy on a health outcome involves mul- / tiple regression analysis in which the dependent vari- Health able is some measure of utilization of health care, (AH) demand for health-related goods, health-related behavior, or (change in) health status. An economic model of the demand for health provides guidance for determining which variables are to be included as explanatory variables in the regression equation. 
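As a minimal sketch of the regression approach just described, the Python fragment below fits an ordinary least squares regression of a utilization measure on one policy variable and two household characteristics. The variable names (price, income, schooling, visits) and all numbers are invented for illustration and do not come from any survey; the solver is a plain normal-equations routine written for this example, and it assumes the regressors are not collinear.

```python
def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows: list of regressor tuples (without the constant); y: list of outcomes.
    Returns [intercept, b1, b2, ...]. Assumes X'X is invertible.
    """
    X = [[1.0] + list(r) for r in rows]  # prepend the constant term
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Hypothetical households: (price of a clinic visit, income, years of schooling).
households = [
    (2.0, 10.0, 4.0),
    (1.0, 12.0, 6.0),
    (3.0, 8.0, 2.0),
    (2.5, 15.0, 8.0),
    (1.5, 9.0, 5.0),
    (0.5, 11.0, 3.0),
]
# Outcome generated without noise, so the fit recovers the chosen coefficients:
# visits = 4 - 1.0 * price + 0.2 * income + 0.1 * schooling
visits = [4.0 - 1.0 * p + 0.2 * i + 0.1 * s for p, i, s in households]

coefficients = ols(households, visits)  # [intercept, price, income, schooling]
```

Here `coefficients[1]` is the estimated effect of the price on utilization, holding income and schooling fixed. With real survey data the fit would not be exact, and the estimation problems taken up in Chapters 23 and 26 would apply.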
The foundation of this model is the health production function, which describes the biomedical process through which inputs (for example, medical care, nutrition, and smoking) are transformed into health. The health production function is illustrated in Figure A8.1; examples of inputs for adult, child, and newborn health production functions are listed in Table A8.1. The inputs into the health production function are only those factors that directly affect health, such as nutrition and medical care. Factors that indirectly affect health by altering behavior, such as medical care prices, are not inputs.

Following Grossman (1972), health can be thought of as a form of human capital. An individual's health stock at any point in time is determined by an initial genetic endowment, subsequent behavioral choices (for example, diet, medical care, smoking, and exercise), and factors that are beyond the control of the household.

Over a given period of time the change in an individual's health status is determined by the health production function. The health production function makes explicit the mechanism that transforms inputs consumed during a period of time into changes in health during that period of time. There are several types of factors that directly influence health:
* Health input choices such as food (diet), medical care, and physical activities.
* Individual behaviors that are not motivated by health considerations but nevertheless affect health, such as smoking and engaging in risky sexual activity.
* Household behavior that affects health by affecting the household disease environment, including cooking, sanitation and waste disposal practices, and water supply decisions.
* Fixed individual and household characteristics that affect the productivity of health inputs, such as education, age, gender and family background. For example, better-educated individuals may be better able to follow a recommended medical treatment and thus obtain greater improvements in health from medical services.
* Fixed community characteristics that are exogenous to (beyond the control of) households, such as quality of available medical services and community programs concerning sanitation, water, and vector control.

The first three types of factors are variable health inputs. The fourth type (individual and household characteristics) and the fifth type (community characteristics) include fixed inputs and all other inputs into the production function that the household cannot control. The most important community variables are government policies that directly influence health outcomes (and thus are inputs into the production function) and policies that indirectly affect health outcomes by influencing other input decisions. For example, environmental efforts to reduce pollution have a direct influence on health and belong in the health production function. But these policies also have an indirect influence on health because pollution affects the risk of getting ill and consequently the utilization of preventative care. Similarly, the quality of care affects health outcomes both directly, by changing the marginal product of medical care, and indirectly, by inducing individuals to seek additional care.

Table A8.1 Inputs and Other Variables in Health Production Functions

Input | Adult | Child | Newborn
Variable inputs
Nutrition | Food | Food | Mother's food during pregnancy
Physical activity | Amount and intensity of work | Amount and intensity of work | Mother's amount and intensity of work during pregnancy
Health care inputs | Preventative checkups | Preventative checkups | Prenatal care
 | Immunization | Immunization | Delivery
 | Inpatient curative care | Inpatient curative care |
 | Outpatient curative care, drugs | Outpatient curative care, drugs |
Individual health-related behavior | Smoking | Smoking | Smoking of mother
 | Alcohol consumption | Alcohol consumption | Alcohol consumption of mother
 | Safe sex practices | Safe sex practices | Safe sex practices of mother
Household health-related behavior | Sanitation practices | Sanitation practices | Sanitation practices
 | Cooking practices | Cooking practices | Cooking practices
 | Waste disposal practices | Waste disposal practices | Waste disposal practices
 | Water supply decisions | Water supply decisions | Water supply decisions
Fixed inputs and characteristics
Individual- and household-level | Age | Mother's age | Mother's age
 | Education | Mother's education | Mother's education
 | Gender | Gender |
 | Family background | Family background | Family background
Community-level | Sanitation | Sanitation | Sanitation
 | Water | Water | Water
 | Vector control | Vector control | Vector control
 | Public education | Public education | Public education
 | Quality of care | Quality of care | Quality of care
Source: Authors' summary

The health production function does not include all policies that affect health outcomes. A number of policies have no direct impact on health outcomes but have indirect effects through their influence on the choice of variable inputs.
Most of these purely indirect policies are financial variables that affect the cost of the inputs. For example, the monetary price (including insurance coverage benefits), travel costs, and opportunity cost of time affect how much preventative and curative medical care an individual can obtain. Another indirect policy is information about prices and the availability of care. (Information about the productivity of care may directly enter the production function as it influences the efficiency of the input.) Other policies typically have both direct and indirect effects. The distinction between policies with direct and indirect effects has important implications for the methods used to measure the impact of policy measures on health outcomes.

The health production function can be expressed in terms of an equation. The amount of health produced during a given time period is:

(A-1) $\Delta H = H(X; Y, Z, \mu)$

where $\Delta H$ is the change in health over the time period, $X$ is the set of variable inputs chosen during that period, $Y$ is the set of individual/household characteristics beyond the control of the household, $Z$ is the set of community characteristics that directly affect health, and $\mu$ is the individual's health endowment at the beginning of the period.

In the health production function, the dependent variable, $\Delta H$, is the change in health status over the period. The health production function shows how behavior during the period ($X$), exposure to community environment ($Z$), and factors beyond the control of the household ($Y$) affect the flow of (that is, the changes in) health. If the left-hand side were the stock of health, $H$, all of the input decisions and exposure to environment factors over the individual's lifetime would belong on the right-hand side. This has important implications for the measurement of health status because most of the measures are really stocks of health as opposed to flows.

In principle, it is possible to apply multiple regression analysis to data on health and on all of the inputs and individual and community characteristics to estimate the effects of each of the inputs on health. That is, it is possible to estimate the health production function. While estimates of the health production function can be useful in improving our understanding of health production technology, direct estimation of the production function is not the ideal way to evaluate the effectiveness of policies, for two reasons. First, to obtain reliable estimates of the effects of inputs on outputs it is usually necessary to have data on all of the inputs, which is rarely possible in practice. Second, the production function approach captures only direct effects; it does not allow for estimates of the effects of indirect policies, such as policies that affect prices, on health.

An alternative is to estimate reduced form health demand and health input demand equations. The reduced form approach (illustrated in Figure A8.2) is more useful for policy evaluation purposes because it captures both the direct and indirect effects of policies on outcomes. The reduced form approach models the effect of all fixed factors on both input choices and health outcomes.

Figure A8.2 The Health Input Demand Function
[Diagram: fixed individual/household (Y) and community (Z) characteristics, the budget constraint (P, P_c, W, I), preferences (Y, $\eta$), and the endowment ($\mu$) determine the variable inputs (X), which in turn determine health ($\Delta H$).]
Source: Authors' summary

Consider first the household's choice of variable inputs, which depend on the household's preferences for health relative to other goods and leisure, the household's budget constraint, other fixed inputs and the endowment.⁹ Preferences are related to observable household characteristics, such as education, as well as to unobservable characteristics. The budget constraint is determined by the household's income ($I$) and all of the prices that it faces, including the prices of goods that directly affect the health production function ($P$), the prices of nonhealth consumption goods ($P_c$), and the market wage rate ($W$). Indirect policies and prices affect input choices through the budget constraint. So which variable health inputs to use, and how much of each input to use, depends on all of the exogenous variables potentially observable to researchers (household characteristics, prices, wage rates, income, and fixed household- and community-level inputs) and two factors that cannot be directly observed by researchers: the household's preferences ($\eta$) and the individual's health endowment ($\mu$). The relationship between these observable exogenous variables and the demand for a variable health input is the health input demand equation for that input.¹⁰

The health input demand equation is described by the dashed arrows in Figure A8.2 and, in equation form, is:

(A-2) $X_n = X(P, P_c, W, I, Y, Z, \mu, \eta)$

where $P$ is the set of prices for the health inputs, $P_c$ is the price of consumption goods not related to health, $W$ is the wage rate, $I$ is income, $Y$ is the set of fixed individual and household level characteristics, $Z$ is the set of community characteristics, $\mu$ is the health endowment from the production function, and $\eta$ reflects unmeasured preferences in favor of allocating resources to health. Both direct and indirect policies are included in the determinants of input demands.

It is from the input demand equations that we are able to determine the effects of policies such as public health and education programs and prices (all of which are community characteristics) on health care utilization and health-related behavior. Because prices are defined as the effective price paid by the consumer, they incorporate taxes, subsidies, travel expenses, and insurance copayments as well as the prices charged by health facilities. The effects of travel and waiting times can also be included in the effective price by allowing time costs to enter the calculation of the price. Alternatively, distance can be specified as a distinct type of price variable. Estimates of the effects of policy changes on expenditures are obtained by computing the effects of each change in policy on utilization by type of service, and multiplying the level of utilization of each service by the price of each service.

Using a health demand function (Figure A8.3) it is also possible to make a direct estimate of the effects of the observed exogenous variables, including both direct and indirect policies, on health outcomes.

The health demand function is obtained by substituting the input demand functions in equation A-2 into the health production function:

(A-3) $\Delta H = H^{D}(P, P_c, W, I, Y, Z, \mu, \eta)$.

Figure A8.3 The Health Demand Function
[Diagram: fixed individual/household (Y) and community (Z) characteristics, the budget constraint (P, P_c, W, I), preferences (Y, $\eta$), and the endowment ($\mu$) determine health ($\Delta H$).]
Source: Authors' summary

Both direct and indirect policies are included in the health demand function whereas only the direct policies were included in the health production function. Estimates of the health demand equation are used to estimate the effects of both the direct and indirect policies on health status.

Under ideal circumstances, answers to most questions regarding the effect of policies on outcomes can be obtained through multiple regression analysis, based on equations A-2 and A-3, with cross-section data. However, there are several different estimation problems that complicate the empirical work. These methodological problems can often be addressed using techniques that require panel data (data from repeated surveys that interview the same households) or other additional variables (such as instrumental variables). For further discussion see Chapters 23 and 26.

Dynamic Issues

In the static model presented above, it was assumed that the household's choices are made at one point in time and that the stock of health in the previous period is exogenous and represented by $\mu$. An alternative to this is a dynamic formulation of the health production function in which the change in health over a period of time is determined in part by the health at the beginning of that period. In other words, the efficacy of medical care depends in part on the severity of sickness and general frailty. The dynamic health production function can be expressed as:

(A-4) $H_t - H_{t-1} = H(H_{t-1}, X_t; Y_t, Z_t, \varepsilon_t)$.

Unmeasurable random factors (shocks), varying over time, that affect the change in health are reflected in $\varepsilon_t$. The dynamic health production function is recursive in health since the lagged stock of health is a determinant. That is, all of the previous years' input choices plus the lifetime of exposures to environmental risks and the individual's genetic health endowment are subsumed in $H_{t-1}$. Repeated substitution of the health production function into equation A-4 for $H_{t-1}$, $H_{t-2}$, and so on, until lagged health variables are no longer on the right-hand side, yields an equation in which the change in health during this time period is a function of lifetime input choices, individual/household and community characteristics, the shocks, and the endowment of health:

(A-5) $H_t - H_{t-1} = H(X_t, \ldots, X_0; Y_t, \ldots, Y_0, Z_t, \ldots, Z_0, \mu, \varepsilon_t, \ldots, \varepsilon_0)$.

If the dynamic production function is appropriate, the input demand and health demand equations should also contain lagged health, as well as lagged household wealth, $A_{t-1}$ (which reflects a potential source of income). The input demand equations, then, are:

(A-6) $X_t = X(H_{t-1}, A_{t-1}, P_t, P_{c,t}, W_t, I_t, Y_t, Z_t, \mu, \eta, \varepsilon_t)$

and the health demand equation is:

(A-7) $H_t - H_{t-1} = H^{D}(H_{t-1}, A_{t-1}, P_t, P_{c,t}, W_t, I_t, Y_t, Z_t, \mu, \eta, \varepsilon_t)$.

This approach makes it possible to see how an individual's health status modifies the impact that other determinants of health have on health status. More importantly, by conditioning on lagged health status, the reduced form health and input demand functions depend only on contemporaneous resources and policies, as opposed to past resources and policies, because the rest of the individual's health history is captured by $H_{t-1}$.

If it is not possible to condition on lagged health status, the interpretation of the model is quite different and indeed there may be serious omitted variable bias. To obtain a specification of equations A-6 and A-7 that does not depend on lagged health, all of the determinants of lagged health must be substituted into equations A-6 and A-7. These include the whole history of resources and policies. Therefore, the model should include not only current values of $P$, $P_c$, $W$, $I$, $Y$, and $Z$, but also the whole history of these variables.

This discussion of dynamics highlights the problem of modeling stocks versus modeling flows.

dent fills in utilization data for each day) or conducting a special
An survey for the subsample of respondents who report severe illness individual's stock of health is a function of a lifetime or an illness that started before or ended after the recall period, The of decisions and exposure to environmental factors. issues that arise in developing such surveys are similar to those that Therefore, in the reduced form, health depends on the arise in other studies involving calendars, such as time use and food influences of the entire history of the exogenous vari- consumption studies. Such questionnaires are beyond the scope of ables.The flow or change in health from one period to this chapter. the next depends on current influences and health sta- 6. There are other problems as well. If the number of services or tus at the beginning of the period. Estimation of the number of health care providers is large but the number of house- health and health input demand equations in this holds in the community that are interviewed is small, one may not framework poses the additional requirement that get information for some kinds of services or for some providers lagged health and wealth data be collected.This is only because none of the sampled households are familiar with them. possible with true panel data. This is particularly the case if one uses information only from households that actually used a service. Another problem with col- Notes lecting data only from households that used a service is that the health care facilities covered are not necessarily a random sample of The authors gratefully acknowledge financial support from the US the population, which could lead to biased estimates of the charac- National Institute ofAging and comments from Girindre Beeharry, teristics of health service providers and community health programs. Ricardo Bitran, Eduard Bos, Mariam Claeson, Margaret Grosh, 7. 
Many households may not have very accurate knoxvledge of Gerry Hendershot, Peter Heywood, Prabhjat Jha, Edna Jones,John the distances to local facihties. One alternative is to use global posi- Peabody,Alexander Preker, and Laura Shrestha. Research assistance tioning system devices to pinpoint the latitude and longitude of was provided by Debra Fogarty. each household and each health facility. 1. Primary health care is the provision of basic outpatient serv- 8. A third relationship is a health production function. ices, such as services one xvould find in a simple health clinic. However, this is more difficult to estimate and does not account for Secondary health care provides more advanced outpatient services indirect effects of health policies. Thus, for most purposes, such esti- and also provides several kinds of inpatient services. This level of mates are not recommended. For further discussion see Appendix health care would be offered by large clinics and regional hospitals. 8.1. Tertiary health care pertains to the best hospitals in the country, 9. Formally, the household maximizes utility, which is a func- which may provide specialized services and may serve as teaching tion of health, leisure, and consumption of other goods subject to hospitals (hospitals in which doctors are trained and medical the budget constraint and the health production function. research is conducted). 10. Technically this would be a conditional health input demand 2. The price elasticity is the percentage change in an outcome equation if some of the inputs xvere chosen by the household, which variable (in this case, use of health care services) that results from a may be the case for household income. For the same reason, equa- given percentage change in the price of that variable. tion A-3 may be a conditional health demand equation. 3. 
The specific data needed would be the total cost plus the costs of each of the following: wages (and other staff benefits), References drugs, supplies, maintenance, utilities, and capital costs. The facility questionnaire introduced in the third section of this chapter (and Alba, Michael. 1998. "A Hospital Cost Function for the Philipines." presented inVolume 3) does not collect such data because in many In Alex Herris, ed., Health Sector Reform in Asia. Manila: Asian countries budgeting is done at a higher level, so that heads of facil- Development Bank. ities would not be able to answer these questions. In such cases the Andrews, G., A. Esterman, A. Braunack-Mayer, and C. Rungie. data can usually be collected from the ministry of health at a dis- 1986. Aging in the Western Paciftc. Manila: World Health trict or regional level. Organization. 4. Employer-provided health benefits in the form of subsidized Behrman,Jere, andVictor Lavy. 1992."Child Health and Schooling health care are essentially a form of insurance. Thus the discussion Achievement: Causality of Association." University of of insuraisce applies to those kinds of benefits. Pennsylvania, Department of Economics, Philadelphia, Penn. 5. One could try to collect the information needed to analyze Bound J. 1991. "Self-Reported versus Objective Measures of insurance issues on the small number of people who have had cat- Health in Retirement Models." Journal of Human Resources 26 astrophic illnesses by using calendars (forms on which the respon- (1): 106-38. 214 CHAPTER 8 HEALTH Bound J., M. Schoenibaum, and T. Waidmats. 1995. "Race and Glewwe, Paul, Haiian Jacoby, anid Elizabeth King. Forthcoming. Education Differences in Disability Status and Labor Force "Early Childhood Nutrition and Academic Achievement: A Participation." Journal of Human Resources 30 (supplement): Longitudinal Analysis:'Journal of Public Ecotnomnics. S227-S267 Goldman, Fred, and Michael Grossman. 1978. 
"The Demand for Brook, R.H., and others. 1983. "Does Free Care Improve Adult Pediatric Care: An Hedonic Approach." Journal of Political Health? Results from a Randomized Controlled Trial." Nleu Economy 86 (2): 259-80. EnglandJournal of Medicine 309 (23). Graham, W, W Brass, and R. Snow. 1989. "Estimating Maternal Deaton. Angus. 1988. "Quantity, Quality and the Spatial Variation Mortality:The Sisterhood Method." Studies in Family Planning in Price." Amtterican Economic Review 78 (3): 418-30. 20 (3). de Ferranti, David. 1985. "Paying for Health Services in Developing Grossman, Michael. 1972."On the Concept ofHealth Capital and the Countries: An Overview." Staff Working Paper 721. World Demand for Health.: Journal of Political Economiy 80 (2): 223-55. Bank,Washington D.C. Guralnik, J., L. Branch, S. Cummings, and J. Curb. 1989. "Physical Deolalikar, A. 1988. "Nutrition and Labor Productivity in Performance Measures in Aging Research." Journal of Agriculture: Estimates for Rural South India." Rev'ieu' of Gerontology 44: 141-46. Econotnics and Statistics 702 (3): 899-32. Hanimer, Jeffrey. 1997. "Economic Analysis for Health Projects:' Dow, William. 1996. Unconditional Demand for Health Care in CSte IVorld Batik Researcli Observer 12 (1): 47-71. d Ivoire: Does Selectioni on Healthi Status Matter? Living Standarcds Idler, Ellen, and Yael Benyvamini. 1997. "Self-Rated Health and Measurement Study Working Paper 127. Washington, D.C.: Mortality: A Review of Twenty-Seven Community Studies." World Bank. Journal of Healthi and Social Belhavior 38 (1): 21-37. Dow, William, Paul Gertler, Robert Schoeni, and Duncan Thomas. Ju, A., and G. Jones. 1989. Aging an ASEAN and its Socioeconomtlic 1997. "Health Care Prices, Health, and Labor Outcomes: Consequences. Singapore: Institute of Southeast Asian Studies. Experimental Evidence." RAND Corporation, Santa Monica, Kington R., and J. Smith. 1997. "Socioeconomic Correlates of Cal. Adult Health:" Deniographty 34 (1):159-170. 
Ferraro, Kenneth, and Melissa Farmer. 1999. "Utility of Health Kochar, A. 1995. "Explaining Household Vulnerability to Income Data from Social Surveys: Is There a Gold Standard for Shocks: American Economic Revieu' 85 (2): 159-64. Measuring Morbidity?" American Sociological Revietv 64 (2): . 1997. "Explaining Poverty: An Empirical Analysis of the 303-15. Effects of Ill-Health and Uncertainty on the Savings of Rural Gertler, Paul, and Jonathan Gruber. 1997. "Insuring Consumption Pakistani Households." Stanford University, Department of against Illness." NBER Working Paper 6035. National Bureau Economics, Stanford, Cal. of Economic Research, Cambridge, Mass. Lewis, Maureen, and A. Medici. 1995. "Private Payers of Health Gertler, Paul, and Jeffrey Hammer. 1997. "Strategies for Pricing Care in Brazil: Characteristics, Costs, and Coverage." Healtlh Publicly Delivered Health Care Services. In G. Schrieber, ed., Policy and Planninig 10 (4). Innovations in Health Care Financing. World Bank: Washington Litvack,Jennie, and Claude Bodart. 1993. "User Fees Plus Quality D.C. Equals Improved Access to Health Care: Results from a Field Gertler. Paul, aisd Roland Sturini. 1997. "Private Health Insurance Experiniienit in Cainerooni XSocial Sciencte awtdl Medicine 37 (3). and Public Expenditures in Jamaica." Journal oJfEconomietrics 77 Macro International. 1995a. "Model "A" Questionnaire: With (1): 237-57. Commentary for High Contraceptive Prevalence Countries:" Gertler, Paul, and Jacques van der Gaag. 1990. The Willitngness to Pay Demographic and Health Surveys, Calverton, Md. for Medical Care: Evidence from Tvo Developing Countries. . 1995b. "Tanzania: Knowledge, Attitudes and Practices, Baltimore, Md.: Johns Hopkins University Press. 1994." Demographic and Health Surveys in collaboration with Gertler Paul, and J. Zeitlin. 1996. "The Returns to Childhood the Bureau of Statistics in Tanzania, Calverton, Md. Investments in Terms of Health Later in Life." 
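As a rough numerical illustration of two ideas from the appendix and notes of the health chapter, the sketch below computes an effective price that adds travel expenses and time costs to the facility fee, a price elasticity following the definition in note 2, and total expenditure as utilization times price summed over services. All function names and numbers are hypothetical and chosen only for illustration; valuing time at the hourly wage is an assumption made here, not a method prescribed by the chapter.

```python
# Illustrative sketch only: every name and number below is hypothetical.


def effective_price(fee, travel_cost, hours, wage):
    """Price paid per visit: facility fee plus travel expense plus
    travel and waiting time valued at the hourly wage (an assumption)."""
    return fee + travel_cost + hours * wage


def price_elasticity(q_before, q_after, p_before, p_after):
    """Percentage change in utilization divided by the percentage
    change in price, as in the definition given in note 2."""
    pct_quantity = (q_after - q_before) / q_before
    pct_price = (p_after - p_before) / p_before
    return pct_quantity / pct_price


def total_expenditure(utilization, prices):
    """Sum over services of utilization times the price of that service."""
    return sum(utilization[service] * prices[service] for service in utilization)


# A clinic visit with a fee of 10, travel expense of 2, and 1.5 hours of
# travel and waiting time for someone earning 4 per hour.
p_old = effective_price(fee=10.0, travel_cost=2.0, hours=1.5, wage=4.0)

# A hypothetical fee increase to 13.6 raises the effective price by 20 percent.
p_new = effective_price(fee=13.6, travel_cost=2.0, hours=1.5, wage=4.0)

# Suppose visits per person per year fall from 2.0 to 1.8 in response.
elasticity = price_elasticity(2.0, 1.8, p_old, p_new)

# Expenditure after the change: utilization by service times the fee charged.
spending_after = total_expenditure(
    {"clinic": 1.8, "hospital": 0.2},
    {"clinic": 13.6, "hospital": 50.0},
)

print(p_old, p_new, round(elasticity, 3), round(spending_after, 2))
```

In this made-up case a 20 percent rise in the effective price goes with a 10 percent fall in visits, so the elasticity is about -0.5. Whether expenditure should be computed with the fee or with the full effective price depends on whose outlay is being measured; the fee is used here only to keep the example short.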
RAND McDowell, Ian, and Clair Nexvell. 1996. Mleasurinig Healthl:4 Guide to Corporation, Santa Monica, Cal. Rating Scales and Questionnaires. Oxford: Oxford University Press. Gilleskie, Donna. 1998. "A Dynamic Stochastic Model of Medical Peabody, John. 1996. Quality of Care: Key Policy Issues. Vietnam Care Use and Work Absence:' Econometrica 66 (1): 1-45. National Health Survey Background Paper. RAND Glewwe, Paul, and Hanan Jacoby 1995. "An Economic Analysis of Corporation, Santa Monica, Cal. Delayed Primary School Enrollment in a Low Income Peabody, John, Paul Gertler, and Arleen Leibowitz. 1998. "The Country: The Role of Early Childhood Nutrition." Revieu' of Effects of Structure and Process of Medical Care on Birth Economics and Statistics 77 (1): 156-59. Outcomes in Jamaica." Healthi Policy 43 (1): 1-13. 215 PAUL J. GERTLER, ELAINA ROSE, AND PAUL GLEWWE Peabody,john,JeffLuck, Peter Glassman,TFimothy Dresselhaus, and Stewart, Al, and J.E. Ware, eds. 1992. Mleasuring Functioning and Well- Martin Lee. Forthcoming. "Measure for Measure: A Being: The Medical Outcomes Approachz. Durham, N.C.: Duke Prospective Study Comparing Quality Evaluation by University Press. Vignettes, Standardization and Chart Abstractions." Journal of Strauss, John. 1986. "Does Better Nutrition Raise Farm the American Aledikal Association. Productivity?"Journal of Political Economy 94 (2): 297-320. Peabody,John, Oniar Rahman, Paul Gertler,J. Mann, D. Farley, and Strauss, John, and Duncan Thomas. 1995. "Human Resources: G. Carter 1999. Policy and Health: Implicationsfor Development in Empirical Modeling of Household and Family Decisions." In Asias. Cambridge, Mass: Cambridge University Press. J. Behrman and T. N. Srinavasan, eds,. Handbook of Development RAND Corporation. 1998. "MatLab Health and Socioeconomic Economics. Amsterdam: North Holland. Survey Documentation." Santa Monica, Cal. Strauss, J., P Gertler, 0. Rahman, and K. Fox. 1993. "Gender and Schultz,T. P., and A. Tansel. 1997. 
"Wage and Labor Supply Effects Life-Cycle Differentials in the Patterns and Determinants of of Illness in Cote d'Ivoire and Ghana: Instrumental Variable Adult Health."Journal of Human Resources 28 (4): 791-837. Estimates for Days Disabled."Journal of Development Economics Wallace, R., and A. Herzog. 1995. "An Overview of Health Status 53 (2): 251-86. Measures in the Health anid Retireirent Survey." Journal sf Sen, Amartya. 1985. Commodities and Capabilities. Amsterdam: Human Resources 30 (supplement): S84-S107. North-Holland. Ware,J., A. Davies-Avery, and R. Brook. 1980. Conceptualization and Sindelar, J., and D. Thomas. 1991. "Measurement of Child Health: Mleasurement of Health Status for Adults in the Health Insurance Maternal Response Bias." Economic Growth Center Study: Vol. IV Analysis of Relationships Among Health Status Discussion Paper 663. Yale University, Department of Mleasures. Report R-1987/6-HEW RAND Corporation, Economics, New Haven, Conn. Santa Monica, Cal. Stern, S. 1989. "Measuring the Effect of Disability on Labor Force World Bank. 1993. World Development Report: Investing in Health. Participation."Jouirnal of Human Resources 24 (3): 361-95. NewYork: Oxford University Press. 216 Employment 9 Julie Anderson Schaffner Most households in developing countries earn most of their incomes from the productive employment of their members. Thus the employment module in a multitopic household survey collects information that is crucial for diagnosing the causes of poverty and inequality-and for guiding policymakers in their attempts to improve living standards. The first section of this chapter provides an overview * What share of the working-age population has of the key policy issues that can be addressed with data gainful employment? from the employment module.The second section dis- * Which productive sectors of the economy employ cusses the specific data needed to address these policy large fractions of the work force? issues. 
The third section introduces three prototype * What is the relationship between employment and designs of the employment module-a short, standard, poverty, and what types of work are most common and expanded version. (All of these versions are pre- among the poor? sented in Volume 3.) The fourth section is a list of * How do work activities differ across region, age, annotations to the three draft modules. sex, and ethnic or racial group? * How common is child labor and what is its rela- Policy Concerns tionship to poverty? * Which workers are employed in sectors of the The employment module collects information that economy that are directly affected by government describes four main sets of labor market outcomes: rules and regulations such as minimum wage laws? employment and unemployment, earnings from wage * Who participates in rural public works programs? employment, on-the-job training, and conditions of * Who works in the economlic sectors that are mlost employment. This section highlights the policy issues likely to be affected by changes in the internation- that can be illuminated by the description and analy- al economy? sis of these outcomes. * Who is looking for work, how long have they been unemployed, and what are they doing to find Employment and Unemployment employment? Policymakers in developing countries need accurate, Without a basic understanding of the nature of up-to-date information on who works, the type of employment and unemployment, policymakers in a work they do, and the extent of unemployment, in country will not be able to judge the importance of order to answer important policy questions such as: specific policy concerns. This subsection shows how 217 JULIE ANDERSON SCHAFFNER the employment module of multitopic surveys like the public works programs in targeting the poor? 
How LSMS surveys can be used to answer a variety of many people are likely to be directly affected by an important questions about the nature of employment increase in the minimum wage, and would the benefi- and unemployment. ciaries be from the poorest households? When the surveys are undertaken in more than LABOR FORCE PARTICIPATION. Determining whether one year, they illuminate how economic activities are each survey respondent of working age is employed, changing over time, making it possible to answer addi- unemployed, or "out of the labor force" (that is, nei- tional questions such as: Do structural adjustment ther employed nor seeking employment) makes it pos- policies appear to be successful in shifting employment sible to estimate the size of the labor force and the size from import-competing to export sectors without of the unemployment problem. Finding out why indi- greatly increasing unemployment? Did many house- viduals are out of the labor force-for example, holds shift their economic activities into the informal because they are caring for children or attending sector after the income tax rate was increased? Is school or because they are handicapped-can also be unionization rising or declining? Answering such useful for identifying and measuring which groups are questions can be an important first step in analyzing most likely to be drawn into the labor force as eco- the costs and benefits of policy changes.' 
nomic conditions change.When questions about labor When data are available on a particular household's force participation are included in the employment economic activities at more than one point in time module of a household survey, analysts can link the (either because a survey was fielded more than once to answers to these questions with data from the house- produce panel data or because it asked respondents to hold roster and consumption modules to answer such answer retrospective questions), it is possible to answer diagnostic questions as: Are poor households poor more "dynamic" questions about labor market flexibili- because they have few able-bodied workers or because ty. For example, when adjustment policies cause the dis- their workers earn low incomes? Are the nonworkers tribution of workers across sectors to change, do work- in poor households actively looking for work? If those ers from declining sectors move into growing sectors or who are currently unemployed were to receive unem- do they move out of employment while new entrants ployment assistance, would this have a greater effect on to the labor market take up the new jobs in the grow- poor or nonpoor households? ing sectors? When workers lose their public sector jobs, what sorts of alternative employment do they find? SECTOR CHOICES. At least as useful as measuring whether individuals are employed or unemployed is HouRs OF WoRK. Finding out not only whether indi- identifying which individuals and households are viduals work but also how many hours they work involved in particular economic activities.Asking ques- helps quantify the intensity of participation in various tions about the work people do and the employers for sectors, and sheds additional light on the nature of whom they work is relatively easy (compared to meas- poverty. It allows answers to such questions as: How uring wages and income), and such data make it possi- high is the rate of labor utilization in the economy? 
ble to answer a wide range of diagnostic questions per- For how many households engaged in agriculture taining to the size of specific employment sectors. Even have nonagricultural activities come to occupy a large when the survey is administered only once, data on share of their time? Are poor households poor because sector choices make it possible to answer such ques- they find few hours of work, or are the poor already tions as: How large are the groups likely to be hit hard- working long hours? est by a decline in agricultural prices? How large is the group not covered by labor standards legislation and JOB SEARCH METHODS AND THE DURATION OF how large is the group outside the scope of payroll tax- UNEMPLOYMENT. Asking the unemployed about the ation? Do individuals in public sector employment ways in which they search for jobs (for example, by have higher living standards than individuals with sim- reading newspapers, visiting potential employers, visit- ilar education in the private sector (which would raise ing public job service offices, or talking to friends) the possibility that public sector wages could be cut makes it possible to quantify the use of various search without loss of employees)? How successful are rural methods and helps policymakers determine the 218 CHAPTER 9 EMPLOYMENT potential for getting information to the unemployed workers face.Wages provide a good description both through newspapers or job services. Such data allow of how attractive jobs are to workers and of how cost- analysts to answer the question: Which sorts of job ly labor is to employers. Information on earnings from referral programs, if any, reach the largest numbers of wage employment and hours worked in wage people with information about new and better jobs? employment can be gathered in the employment Asking the unemployed how long they have been module of multitopic household surveys like the unemployed allows researchers to assess the relative LSMS surveys. 
When used to produce measures of importance of long-term and short-term unemploy- wages, this information can be used to address many ment.With a large enough set of data on the duration pohcy issues, including: of unemployment, a researcher can answer questions * How high are average wages in various sectors of such as: How long is the typical period of unemploy- the economy? ment? What characteristics do the long-term unem- * How skewed is the distribution of wages? ployed tend to have? * How are wage levels and the distribution of wages changing as the economy opens to international UNDEREMPLOYMENT AND ON-THE-JOB SEARCH. Some trade, or as the economy undergoes structural workers may have jobs that give them too few hours adjustment? of work, make poor use of their skills, or earn them When combined with information from other incomes that are lower than some subsistence level. modules on nonlabor income and income earned in Household survey data can be used to measure at least agricultural and nonagricultural household enterpris- two sorts of"underemployment": employment in jobs es, this information can also be used to address the that do not yield earnings above some predetermined questions raised in Chapter 17 related to the level and level and the search by employed workers for addi- distribution of total income. tional or alternative jobs. Data on underemployment It is not possible to begin to answer such ques- as defined by income are useful for answering the tions without data on earnings from wage employ- same questions that are raised about poverty. Data on ment. Such data are not, however, easy to collect. 
The underemployment as defined by the search for addi- second section of this chapter will discuss in detail the tional or alternative work are much harder to inter- steps that can be taken to increase the accuracy with pret, but nonetheless are used in some countries as which data on income from wage employment can be broad indicators of labor market conditions. It can also collected. be useful to identify the job search methods used by Having data from a single survey makes it possible workers who are looking for additional or alternative to answer questions about the wage employment sec- jobs. Such data might help answer questions such as: tor such as: How attractive to workers are the average What is the best way for the government to dissemi- jobs in the wage employment sector of the economy? nate information that will prompt workers to move How unequally distributed are wages in this sector? In out of declining sectors and into growing sectors? addition, combining these data with data from other modules makes it possible to analyze which workers Earnings from Wage Employment receive high and low wages. (This is discussed further Income levels are a key determinant of a household's below, in the subsection on "Analyzing the living standards and poverty status. In developing Determinants of Labor Market Outcomes.") When countries, income from work activities is by far the wage data are available from a series of cross-sectional most important source of income for most house- surveys, this can show how the level and distribution holds. An important component of work activities in of wages is evolving over time. 
An especially interest- most developing countries is wage employment, in ing question in recent years has been: How have the which people work outside their homes for firms, the level and distribution of wages changed in the wake of government, or other individuals, in exchange for reforms to open the economy to international trade payments in cash or kind. Thus policymakers need to and capital flows? When a panel of wage data are avail- know how much income workers earn from such able, it becomes possible to address more detailed activities. It is especially useful to know the wage (or, questions such as: How frequently do high-wage earn- more precisely, the "average hourly earnings") that ers experience dramatic reductions in their wages or 219 JULIE ANDERSON SCHAFFNER do low-wage earners experience dramatic increases in tors do on-the-job training rates in the country stud- their wages? Do workers who lose jobs in sectors ied differ most from on-the-job training rates in devel- forced into decline by economic reforms find new oped countries? When questions about training are jobs at higher wages or at lower wages? asked in a series of cross-sectional surveys, the evolu- Though many previous surveys (both LSMS and tion of training can be traced over time. This may be others) have attempted to collect information on especially useful in countries that are becoming earnings from self-employment activities in the increasingly involved in the global economy or coun- employment module, this is unlikely to produce ade- tries in which the government has attempted to quate income data, for reasons discussed in Chapters encourage on-the-job training through subsidy or tax 18 and 19 on agriculture and nonagricultural house- rebate programs. When panel data on training are hold enterprises. available, it is possible to assess differences in the career paths of workers who do and do not acquire training. 
On-the-Job Training Though a household-based multitopic survey like Many workers improve their skills and acquire new the LSMS can shed light on these questions, it cannot skills by participating in on-the-job training. Skills provide all the information needed for a thorough thus acquired mav be important for making workers evaluation of training. For example, only employers more productive and causing their wages and living could provide detailed answers to questions about the standards to rise.Yet so far little information is available costs of on-the-job training. on such training in developing countries. A variety of data about on-the-job training may be collected in the Conditions of Employment employment module of multitopic household surveys, Wages are not the only characteristic of work that is of allowing policymakers to address such questions as: interest to policymakers. Policymakers are also con- • How prevalent is on-the-job training, and in which cerned about other features of employment such as sectors do workers find the most opportunities for sick leave, pensions, health insurance, and job security. such training? Governments would like to know the extent to which * How does the prevalence of on-the-job training such nonwage benefits exist and what policies the change when the economy opens to international governments can adopt to encourage these benefits. trade or when the government attempts to subsi- Governments often attempt to legislate these benefits, dize training? but in developing countries such laws are difficult to * Do low rates of on-the-job training suggest that the enforce and, in some cases, may have unintended neg- economy faces obstacles to entry into modern ative consequences on workers. Information collected manufacturing and service activities (in which on- in the employment module of a multitopic household the-job traininag is important)? 
survey allows policymakers to address such questions Even when collected in a single survey, employ- as: ment module data on training can shed light on * How well enforced are regulations regarding non- whether workers have received or are currently wage benefits? receiving some training, what kind of training they * When regulations regarding a particular nonwage have received (for example, learning by doing, learn- benefit change, how does the prevalence of that and ing informally from others, on-site formal training, other nonwage benefits change in response? off-site formal training, or apprenticeship), how long * How stable are jobs in various sectors of the econ- the training lasted, in which occupations they have omy, and how is job stability affected by job secu- received training, how much they paid for their rity legislation? apprenticeship, and what sorts of employers provide training. With this information it becomes possible to NONWAGE BENEFITS AND WORKING CONDITIONS. answer such questions as: How large is the stock of Finding out about workers' nonwage benefits (such as human capital that was acquired through on-the-job paid holidays, paid sick leave, and pensions), working training? What sorts of training appear to be most conditions, the costs of their commute to work, and important? In which sectors and in which sorts ofjobs the location of their homes (whether at the work site is on-the-job training most prevalent? In which sec- or elsewhere) can be useful for answering such ques- 220 CHAPrER 9 EMPLOYMENT tions as: To what extent do employers comply with severance payments, but the legal use of such contracts laws requiring them to provide specific nonwage ben- is often severely restricted to very short durations, to efits? If nonwage job attributes are taken into account, limited shares of firms' work forces, and to specific sec- is there more or less inequality in worker well-being tors. 
In an attempt to assess the role that such laws play than when only wages are taken into account? in labor markets, it is useful to ask: How many work- When such data are available for two or more ers have explicit or written employment contracts, and time periods, they are also useful for answering ques- how many workers remain outside the scope of the tions such as: How are living standards, defined more legislation? Which legal contract types predominate in broadly than mere measures of wages or incomes, which sectors? Do workers with written contracts changing over time? When regulations increasing spe- have better mandated nonwage benefits than workers cific nonwage benefits take effect, does the incidence without written contracts and to what degree? Do of these benefits increase, and do the wages or other patterns of job turnover differ between workers with benefits of the recipients appear to decline in com- explicit, written contracts and workers without such pensation for the increase in nonwage benefits? contracts? Data on types of legal contracts can be collected JoB TENuRE AND TuRNovER. One important feature of in the employment module of LSMS-type surveys, employment,job stability, is difficult to observe direct- allowing researchers to answer these questions. When ly, but analysts can draw inferences about it by asking such data arc collected in two or morc years, it workers how long they have worked for their current becomes possible to answer such questions as: Did a employer. Analyzing data on job tenure can produce reduction in the severance payments required upon answers to such questions as: How long do jobs in var- dismissal from contracts of indefinite duration appear ious sectors of the economy typically last? H-ow does to increase the use of such contracts? job tenure in the country studied compare with job tenure in other developed countries? 
(See Hall 1982 Analyzing the Determinants of Labor Market Outcomes for a method involving a single cross-section dataset; Descriptive information on the existing state of the see Schaffner 1999 for methods involving single and labor market, such as that discussed above, is useful repeated cross-sections.) When data are available from for policy analysis. Policymakers would, however, a series of cross-sectional surveys, it becomes possible like to know even more about how policies are like- to answer the question: What has been the apparent ly to affect employment, sector choice, wages, train- effect of changes in job security legislation on the typ- ing, and other labor market outcomes. This usually ical length of a job? requires estimating econometric relationships In order to find out what the labor market is like between labor market outcomes, on the one hand, for workers, it is useful to know not only how quick- and a variety of individual, household, employer,job, ly jobs end but also how they end. Clearly, the labor and community characteristics, on the other. If a market is very different if most workers that lose their reasonably complete list of potential determinants is jobs are unemployed before finding other jobs than if included in the estimated relationship, the estimates most workers quit their jobs in order to take up bet- allow the policymaker to assess what would happen ter jobs that have become available. to wages, sector choice, and other outcomes if the government intervened to change one determinant TYPES OF EMPLOYMENT CONTRACTS. In many coun- while all others remained the same (see discussion of tries the law specifies what kinds of employment con- multiple regression in Chapter 26 on econometrics). tract an employer can offer; each allowed contract Such analyses help policymakers answer questions must satisfy certain requirements regarding nonwage such as: benefits and job security. 
For example, workers hired * How big is the impact of expanded primary educa- under contracts of "indefinite duration" may be enti- tion, or of improved health care, on the incomes of tled to a variety of nonwage benefits and to large sev- the poor? erance payments if they are dismissed without legally * Could the wages of some public sector workers be defined "just cause." Workers hired under contracts of reduced (improving the government deficit) with- "fixed duration" may be entitled to no such benefits or out losing these workers to private sector jobs? 221 JULIE ANDERSON SCHAFFNER * Does minimum wage legislation appear to help an the time when the individual was entering the labor elite among workers, while driving down wages for market allows study of the role of financial constraints the rest? in determining workers' career paths. Financial con- * Should the government consider developing train- straints may play an important role in determining ing loan programs or industry training consortia to career paths and lifetime earnings, because individuals promote on-the-job training? sometimes need to sacrifice earnings in the early years * By how much is the income tax base likely to of their careers to get on a career path that will bring shrink (as workers leave the sectors in which them higher earnings later in life. Families with little income tax law is enforced) when the income tax disposable income or access to credit may not be able rate rises? to make such an investment in a good career.2 Having This subsection will discuss the usefulness for pol- these data on families' financial constraints helps poli- icy analysis of including measures of a variety of deter- cymakers answer the question: What evidence is there minants in relationships that study the determination that financial constraints early in individuals' working of labor market outcomes. 
Where an empirical litera- lives cause them to go into occupations with little ture has emerged, one or two key references will be wage growth and low lifetime earnings? cited; in some areas the potential for useful empirical research remains untapped. EMPLOYER, WORKPLACE, AND JOB CHARACTERISTICS. The main reason policymakers are interested in know- INDIVIDUAL CHARACTERISTICS. Estimating the effects ing how the characteristics of a worker's employer, on wages and other labor market outcomes of indi- workplace, and job affect his or her wages is that labor vidual characteristics such as schooling (of different markets in developing countries are often thought to levels, types, and qualities), training, health, age, gender, be severely "segmented." Such segmentation would and ethnicity helps policymakers answer the questions: mean that even among workers with identical inter- How large are the returns (in the form of increased ests, abilities, and training, some get much higher earnings) to investments in schooling, training, health, wages and better benefits than others. Segmentation and nutrition? At which schooling level do invest- implies that labor is allocated inefficiently in the mar- ments have the highest returns? Which dimensions of ket (because too few workers are employed in the school quality are most important for improving stu- high-wage, high marginal product sectors) and that dents' earnings potential? Do differences between the some earnings inequality is of an especially troubling productive characteristics (such as number of years of sort, in which luck and connections play a large part schooling) of men and women explain male-female in determining income. 
Determining whether labor differences in average wages, labor force participation, markets are segmented is important because, as dis- training, and participation in high-wage sectors-or is cussed in Bulow and Summers (1986), the best poli- there reason to worry about labor market discrimina- cies for welfare when labor markets function well may tion against womeni? (Strauss and Thomas 1995 review be very different from the policies that are best when much of the relevant literature.) labor markets are segmented. Segmentation can take many forms. Segmentation CHARACTERISTICS OF PARENTS' HOUSEHOLD. Es- can arise between public and private sector workers if timating the effects on wages and other labor market the public sector sets wages higher than necessary to outcomes of the education and occupations of the attract workers. Segmentation among geographic worker's parents makes it possible to answer such ques- regions may arise if workers have difficulty acquiring tions as: Does the worker have a better job and a high- accurate information about wages in other regions, or er level of education than his or her parents (Heckman if moving between regions is costly. Segmentation and Hotz 1986)? Are better-educated parents able to between union and nonunion workers may arise if provide their children with better "connections," thus labor unions are able to obtain higher wages for their increasing their children's prospects of finding high- members through bargaining. If minimum wage legis- paying jobs (Lam and Schoeni 1993)? lation is imperfectly enforced, this may cause segmen- Collecting information that describes the finan- tation between workers at firms that comply and cial constraints that the family may have been facing at workers at firms that do not. 
Finally, some (larger, 222 CHAPTER 9 EMPLOYMENT more modern) employers may choose to pay higher er in urban areas (even in low-wage employment) than wages than others, for example, to give their workers in rural areas, as in Harris and Todaro's famous 1970 an incentive to work hard (Bulow and Summers 1986) model, rural-urban migration will tend to be excessive or to reduce the frequency with which their workers and policies to promote agricultural employment may quit and must be replaced (Stiglitz 1974). be called for. Data in LSMS surveys can help identify the nature, severity, and causes of segmentation. Estimating TRAINING ON CURRENT JOB. Training is of interest to the effects on wages of sector (public versus private), policymakers not only as an outcome variable but also region, coverage by collective bargaining agreements, as a determinant of wages and sector choice. and coverage by minimum wage legislation-as well Estimating the effects of training on other labor mar- as employer size and sector-allow one to begin ket outcomes can show whether workers must accept answering questions such as: Which public sector lower wages to obtain jobs that give them training. If workers receive wages that are particularly out of line this is the case, financial constraints on workers from with private sector wages? Would improving trans- poor households might prevent these workers from portation and communication networks reduce seg- obtaining training and better careers, and experimen- mentation among regional labor markets? To what tation with training loans might be justified. extent do labor unions drive their members' wages up relative to the wages of those nonunion workers who TENURE IN JOB, OCCUPATION, AND INDUSTRY. Do have comparable skills? How large are wage differen- wages rise as workers' length of tenure on the current tials caused by incomplete enforcement of minimum job rises? If so, there is reason to think that employers wage legislation? 
Do labor markets appear segmented increase labor productivity by encouraging workers to between large, modern firms and other employers, for remain in the same job for a long time. Increasing the more complex reasons discussed above? (See wages to keep workers may economize on training Schaffner 1998.) costs or provide the workers with an incentive to work Some questions about labor market segmentation hard. Unstable macroeconomic policies or other poli- are beyond the scope of multitopic household surveys cies that make it difficult for workers and employers to like the LSMS. For example, probing more deeply into reach long-term employment agreements may hinder the causes of some sorts of labor market segmentation such improvements in labor productivity. would require detailed information from employers Do wages rise with tenure in some industries, about technology, supervision difficulties, training, and regardless of whether workers are still with the searches for job candidates. employer from whom they received training? If so, fear of poaching may make employers in that industry PREvious LABOR FORCE STATUS AND LOCATION OF reluctant to provide training. However, encouraging RESIDENCE. If the probability of a worker obtaining a the development of training consortia within those high-wage urban job varies depending on his or her industries might overcome this disincentive to provide status in the labor force or on where he or she lives, training. Do wages start lower and rise more rapidly this can have important policy implications. For exam- with tenure in some occupations than in others? If so, ple, if the probability of getting a high-wage job is borrowing constraints may prevent some workers higher if one searches while unemployed than if one from entering into occupations with better career searches while working in a low-wage job, some potential for which they are qualified and motivated. 
workers may choose to be unemployed (and let their labor go to waste) even when low-wage jobs are avail- COMMUNITY CHARACTERISTICS. Many policy effects able. This makes labor market segmentation (which on wages and labor supply can be estimated by com- underlies the existence of both high- and low-wage paring outcomes across communities that have been jobs for identical workers) more costly to the econo- affected in differing degrees by the policies of interest my than it otherwise would be, and suggests the diffi- (see Chapter 13 on the community questionnaire). culty of reducing unemployment without first tack- Gathering data on community-based measures of ling the causes of labor market segmentation. Similarly, infrastructure, prices that affect microenterprise prof- if the probability of obtaining a high-wage job is high- its, ease of transportation to major wage employing 223 JULIE ANDERSON SCHAFFNER regions, quality and cost of local schools, and availabil- labor would be rendered less productive as recipients ity and attractiveness of rural public works programs reduced their labor supply or shifted into less produc- allows researchers to address such questions as: How tive sectors in which their earnings were "off the big an impact do microenterprise support programs books" (and thus could not be used to reduce the size have on the choice to become self employed? To what of their transfers)? Unfortunately, the estimation of extent would the construction of better roads affect labor supply responses to wage changes is complicated commuting behavior? How would the incidence of and often yields results that do not inspire confidence child labor be affected by improvements in the quali- (Mroz 1987). For reasons discussed in the second sec- ty of schools? 
How do changes in rural public works tion of this chapter, wage effects on labor supply and program eligibility and in program wages affect both sector choice are best estimated in repeated cross-sec- program participation and allocation of participating tion or panel data. Even with such data, however, wage households' labor to other wage employment and to effects can be estimated only if one is willing to make household enterprise activities? (If these other activi- very restrictive assumptions about how people make ties drop off when participation in the program labor supply decisions.The necessity for such assump- increases, the program's effect on household income is tions renders current methods especially unattractive diluted.) As discussed in Chapter 23, repeated cross- for the study of developing economies. section or panel data on the same communities might be preferred to single cross-section data for estimating Summary these effects when programs are endogenously placed Box 9.1 summarizes the most important policy issues or migration is selective. pertaining to labor markets in developing countries. The table indicates which issues can be addressed with INCOME TAX AND TRANSFER PoLICIEs. An increase in just one set of data, which issues can be addressed with the income tax causes the effective wage (the amount a series of data sets from several cross-sectional surveys, by which the worker's income increases when he or which issues need panel or retrospective data, and she works an additional hour) to fall, at least in sectors which issues are difficult to study effectively with data in which the income tax law is enforced. By reducing from LSMS-type surveys. the effective wage in such sectors, an increase in the income tax may reduce work hours, cause workers to Data Needs move to jobs in sectors in which the income tax law is not enforced, or reduce overall participation in the This section discusses the implications of the research labor force. 
Such a tax increase has the same effect on objectives listed in the first three sections of Box 9.1 labor supply as a decrease in the actual wages that for the design of multitopic household surveys. causes the effective wage to change by the same amount. Similarly, many income transfer programs that Survey Design Issues aim to alleviate poverty have the same effects on labor Before discussing how to collect data to answer the spe- supply and sector choice as do comparable changes in cific policy issues that were raised in the first section, effective wages and household nonlabor income (see some general survey design choices must be discussed. Chapter 11 for definitions of nonlabor income).' These involve choice of reference periods, handling of Good estimates of the effects of wages and nonla- multiple activities, data collection over time, and linking bor income on labor supply and sector choice would of the questions in the employment module to the allow policymakers to answer such questions as:When questions in the household enterprise module. the income tax is increased, by how much does the tax base shrink and what happens to total tax revenue? REFERENCE PERIODS. Given the importance of agri- When the income tax is increased, how much labor culture in most developing countries, it is common for becomes less productive as workers leave the labor the economic activities of individuals and households force or move to less productive sectors of the econo- to vary during a given year. As a result, the answers that my where the income tax law is not enforced? 
If a respondents give to questions about whether or not means-tested income transfer program were intro- they worked, the sectors in which they worked, and duced as part of a new "social safety net," how much their average hourly earnings will depend upon which 224 CHAPrER 9 EMPLOYMENT Box 9.1 Labor Market Policy Issues and Data from Multitopic Household Surveys Issues that con be addressed with a single round of survey 22. Analyzing the effects of early-life financial constraints on data career paths. I. Measuring labor force participation, employment, and 23. Assessing the likely importance of labor market seg- unemployment, and reasons for being out of the labor mentation across regions, sectors, and employer types force. by estimating wage differentials by region, sector, and 2. Measuring participation in various sectors of the econo- employer type after controlling for such factors as skills my and types of economic activity. and working conditions. 3 Identifying the labor market status of the poor and the 24. Assessing the likely importance of differences among sectors in which they are employed. communities in infrastructure prices and location on 4. Measuring hours of work, both in total and by sector labor market outcomes. 5. Measuring the duration of periods of unemployment and what search methods are used by the unemployed. Issues that can be addressed using repeated cross-sectional, 6. Measuring underemployment and on-the-job search panel. or retrospective survey data methods. 25. Describing changes over time, possibly associated with 7. Measuring the mean and distribution of average hourly policy changes, in labor force participation in various sec- earnings among the employed. tors of the economy and participation in training 8. Estimating the mean and distribution of average hourly (repeated cross-sectional, panel, or retrospective data). earnings among the self-employed (using data from the 26. 
Describing changes over time, possibly associated with household enterprise and the agriculture module rather policy changes, in the level and distribution of average than the employment module). hourly earnings in wage employment (repeated cross- 9. Measuring the mean and distribution of total annual sectional or panel data). earnings from wage employment. 27. Describing changes over time, possibly associated with 10 Measuring the mean and distribution of total annual policy changes, in the incidence of nonwage benefits earnings from self employment (again using data from (repeated cross-sectional, panel, or retrospective data). the household enterprise and agriculture modules 28. Describing changes over time, possibly associated with rather than the employment modu e). policy changes, in the incidence of various employment I1. Measuring current and past participation in formal contract types (repeated cross-sectional, panel, or retro- training. .., spective data). 12. Measuring current and past participation n informal 29. Measuring how expected wages affect labor force par- training. ticipation and sectoral choice in simple models (repeat- 1 3. Identifying which employers provide and which workers ed cross-sectional or panel data). receive formal and informal training; describing workers' 30. Assessing the degree of occupational and sectoral occupational experience, mobility in the labor market (panel or retrospective 14. Measuring the incidence of various nonwage benefits data). and describing working conditions. 3 1. Measuring the relative success of various job search 15. Measuring the distribution of jobs by the length of the methods in obtaining high-wage jobs (panel or retro- job and studying the nature of job turnover spective data). 16. Measuring the incidence of various types of employment 32. Assessing income mobility (panel data). contracts. 33. Identifying winners and losers associated with policy 17. 
Analyzing the effects of schooling, health, and nutrition changes (panel data). on wages and other labor market outcomes. 34. Assessing how career paths differ between workers 1 8. Analyzing the effects of training and work experience on who acquire training and workers who do not. Also, wages and other labor market outcomes. how career paths differ between workers who obtain 19. Analyzing the relationship of contractual relationships to "good" jobs early in life and workers who do not (panel wages and other labor market outcomes. or retrospective data). 20. Analyzing the likely mportance of labor market discrim- 35. Assessing the effects of risk and shocks on labor market ination by estimating wage differentials by race or gen- choices (panel data). der after controlling for such complicating factors as 36. Assessing the relative success of workers' searches for skills and working conditions. high-wage jobs by their labor force status and by the 21. Analyzing the degree of intergenerational labor market location of their residence in order to shed light on the and earnings mobility. (Box continues on next page.) 225 JULIE ANDERSON SCHAFFNER Box 9.1 Labor Market Policy Issues and Data from Multitopic Household Surveys (continued) causes of open unemployment and on whether rural- 39. Examining in detail how wage levels are related to urban migration is excessive (retrospective data or a employers'technological conditions (which again would panel that follows migrants). require an employer survey, possibly in conjunction with a household survey). Issues that are difficult to address with multitopic household sur- 40. Measur ng the effects of temporary and permanent vey data wage changes on labor force participation, hours, 37. 
Measuring the costs of on-the-job training (for which an and sector choice in dynamic models that acknowl- employer survey is needed, possibly In corjunction with a edge the potential effect of today's choices on to- household survey, so that workers' wages can be exam- morrow's opportunities (which would require panel ined in light of both worker and employer characteristics), data of sufficient length, and methodological devel- 38. Quantifying nonwage benefits (which would require a opment, as discussed n the second section of this more focused survey). chapter). Source: Author's summary. reference period is given in the question. In other First, most labor force surveys around the world use a words, their answers will depend upon whether the 7-day reference period to collect data on labor force questions refer to the previous week, the previous participation, employment, and unemployment. Thus, month, or the previous year. Thus choosing the most if analysts want to compare labor market performance appropriate reference period is of great importance. in the country of the survey to performance in other The survey designers must first decide which refer- countries, it is essential for the survey to use a 7-day ence periods will yield data of the most use to analysts. reference period. Second, questions that use shorter Having made this decision, they may elect to use dif- reference periods tend to elicit more accurate respons- ferent reference periods in some particular ques- es than questions that use longer reference periods, tions-which is acceptable provided that the module both because they do not stretch respondents' memo- also contains questions that elicit data required for ries and because it is easier to design brief sequences converting measures into the reference periods of of questions when a shorter recall period is used. The most use to analysts. 
7-day reference period is usually the most appropriate For many analytical purposes, the reference peri- short reference period to use, because it captures the od of greatest interest is one full year (either the pre- effects of weekly days off on hours and earnings. vious 12 months or the previous calendar year). For This chapter recommends that survey designers example, when attempting to identify households who aim to use both the 7-day and the 12-month reference depend on agriculture for a large fraction of their liv- period, as has been common in past LSMS-type sur- ing, it is important to avoid asking respondents ques- veys. However, unlike in many earlier LSMS-type sur- tions about their sector of employment that refer to veys, it is recommended that the questions that use the the previous day or to the previous 7 days; if the inter- 12-month reference period should be less ambitious view occurs during the agricultural slack season, than those that use the 7-day reference period, because respondents might report no involvement in agricul- it is harder for respondents to recall details of their ture, even though they may devote a large portion of work experience from up to 12 months before than to their time to agriculture each year. Similarly, if house- recall equivalent details from the previous week. holds know that their earnings fluctuate over the year The costs of using a 7-day reference period can be and if they are able to save enough during their high- minimized if random samples of households are inter- income seasons to cover their consumption during viewed in each month or season of the year, as has their low-income seasons, annual earnings data would been done in many past LSMS-type surveys. It is be a much better indicator of their living standards strongly recommended that survey organizers main- than data on their earnings during the previous 7 days. 
tain this feature, and the rest of this discussion assumes Nevertheless, analysts may sometimes prefer to that this will be the case. When this intrayear sample use the 7-day reference period for at least two reasons. design is used, the answers to the questions using the 226 CHAPTER 9 EMPLOYMENT 7-day reference period can be used to answer some ural to them, interview time will be reduced and data questions about annual earnings. For example, the yielded will be more accurate. In order to turn these sample average of 7-day reference period earnings for responses on earnings into average hourly, weekly, and a group with a particular level of education is an esti- annual earnings figures that are comparable across mate of the average (across weeks of the year as well as individuals, it is necessary to include additional ques- across individuals) of earnings for this educational tions in the module.To construct average hourly earn- group in the population as a whole. Therefore, multi- ings figures, reported earnings in cash, food, and other plying the result by 52 provides an estimate of average forms of remuneration should be divided by the num- annual earnings in that educational group. Repeating ber of hours worked during the reference period that this exercise for groups of people with other levels of the respondent chose. Adding together these average educational attainment will show whether and how hourly earnings in cash, food, and other forms of annual earnings differ across workers with different remuneration will produce a measure of the respon- levels of education. dent's total average hourly earnings for a particular job. Although interviewing random samples of house- This figure must then be multiplied by the number of holds in each month of a year reduces the costs of hours worked during the previous 7-day or 12-month using the 7-day reference period, it does not eliminate period to produce measures of earnings that are com- these costs. 
Questions that use a 7-day reference period do not yield good estimates of the share of households that ever participated in a particular activity during a specific year, even when they are administered to households over the course of 12 months. Nor do these questions yield sufficient data to make possible a complete analysis of wage or income distribution or of poverty (measured using income rather than consumption data). Although the data that these questions yield make it possible to study how average annual earnings vary across groups with different observable characteristics (like schooling levels and locations), it is not possible to use 7-day reference period earnings data to study inequality in annual earnings within groups. The problem is that when annual earnings are studied by examining 7-day reference period earnings of similar (but not identical) households in different months of the year, it is impossible for analysts to tell if the individuals and households that have high incomes in one month also have high incomes in other months. Therefore, it is recommended that the module collect data using 12-month as well as 7-day reference periods.

Even though the objective of the draft employment modules introduced in the third section of this chapter (and presented in Volume 3) is to measure labor market outcomes using 7-day and 12-month reference periods, respondents are allowed to report their cash and in-kind earnings in whatever reference period they like (from one hour to one year). This is because earnings are especially difficult for respondents to report accurately, and it is hoped that by allowing respondents to respond in the way most nat-

parable across individuals or households.

There are two ways to estimate the number of hours worked during the reference period chosen by the respondent. First, the interviewer can ask respondents directly how many hours they worked during the time period for which they reported their earnings. Second, the interviewer can ask respondents a set of questions about their average daily hours, average weekly hours, and weeks worked per year, which will allow analysts to construct rough estimates of the hours worked by respondents in almost any reference period. The first approach is likely to yield more accurate estimates of average hourly earnings but would be difficult to use to calculate every component of earnings on an hourly basis. Because respondents are allowed to choose a different reference period for each component of their earnings, it would be necessary to follow each question about one component of the respondent's earnings with a question about the number of hours he or she worked during the reference period that the respondent chose for that component. The second approach is less accurate, but asking one short set of questions about hours makes it possible to convert all of the respondent's earnings components into an hourly figure.

It is suggested that, for the purpose of converting reported cash wages into an hourly figure, the interviewer should ask respondents directly about the number of hours that they worked during their chosen reference period. To convert the in-kind components of respondents' wages into an hourly figure, however, the interviewer should ask them a short set of questions about the number of hours worked per day, hours worked per week, and weeks worked per year, from which analysts can construct rough estimates of the number of hours worked in any reference period. This requires adding only one more question (on hours worked in the reference period chosen for reporting cash earnings) than has been included in most previous LSMS surveys, and it is likely to increase the accuracy of estimates of average hourly earnings substantially, because cash earnings are the most important component of earnings in most wage employment.

MULTIPLE ACTIVITIES. In most developing countries it is common for households to be involved on an ongoing basis with more than one economic activity. Not only might different household members work in different activities, and some or all household members work in different activities in different seasons of the year, but many individuals are involved in more than one economic activity in any given week. Thus, whether the 7-day or 12-month reference period is used, survey designers need to be aware that these individuals are involved in multiple activities. This has two implications for survey design.

First, because workers may perform a certain type of work for more than one employer (or for an employer and in a household enterprise), survey designers must be careful to distinguish between questions about occupations (in other words, work of a certain type performed for any employer or enterprise) and jobs (in other words, work of a certain type performed for a particular employer or enterprise). For example, if the questionnaire asks respondents about the number of hours that they spend working in their main occupations (possibly for more than one employer) but asks them about their earnings only from their main employer, this may result in average hourly earnings being greatly understated for individuals who perform their main work for several employers. The wording in previous LSMS surveys has not been very clear in distinguishing between jobs and occupations. The draft modules introduced in the third section of this chapter attempt to be more careful about making this distinction. Because it is important (for the analytical purposes discussed above) to associate average hourly earnings with particular types of employers, the module specifies earnings and hours from the "main employer" in a particular occupation, which should make it possible to measure respondents' average hourly earnings from a particular employer. However, because it may also be important to measure respondents' total earnings from all employers, the draft module also asks respondents about the total number of hours they spent working in the occupation (for any employer). Collecting this information makes it possible to estimate respondents' total earnings from all their employers in that occupation by multiplying their total number of hours by their average hourly earnings from the main employer.

Second, survey designers must allow respondents to give information about more than one occupation. They must decide the number of occupations in each reference period on which they will collect various amounts of information, and define a rule for choosing which activities should be described in detail when individuals are involved in more activities than are allowed in the detailed questions. Most previous LSMS surveys have allowed respondents to choose what they consider their "main work" and their "secondary work." Then the respondents have been asked a complete set of questions about their main work, a shorter set of questions about their secondary work, and only a single summary earnings question about any other work. The surveys have typically asked respondents about the existence of any secondary work only after asking them a long list of questions about their main work, which may have made respondents reluctant to report another job on which they might have had to answer a battery of questions.

In contrast, the draft module begins by asking respondents to describe briefly all of the occupations in which they have been engaged during the previous 7 days and 12 months. Then the interviewer picks the two most "important" activities in each reference period based on the number of hours that the respondent worked in those occupations. More detailed questions are asked about the work that the respondent has done for the main employer in each of these two activities. It is hoped that this approach will reduce any inaccuracies that might be introduced by respondents' reluctance to report secondary and tertiary activities and will increase comparability across respondents by making the definition of "main" and "secondary" jobs more concrete.4

COLLECTING REPEATED CROSS-SECTIONAL, PANEL, AND RETROSPECTIVE DATA. Collecting data on employment variations over time is of great value to analysts and ultimately to policymakers. Collecting these data can be done in three ways. An original survey can be repeated using the same sample (panel data) or a new sample representative of the population at that later date (repeated cross-sectional data). Or a survey fielded only once can include questions about respondents' economic activities at some time in the past (for example, 5 years ago) as well as in the past 7 days or 12 months (retrospective).

To address issues 25-28 in Box 9.1, which pertain to broad changes over time in various labor market outcomes, it is necessary to have data from more than one time period using any of these methods. When a "before" survey has been conducted, the best way to examine these issues is to conduct repeated cross-sectional surveys. Retrospective data are likely to be less accurate than data gathered from an original cross-sectional sample, because respondents may have difficulty remembering detailed labor market outcomes over periods of one or more years. Panel data may provide a poorer picture (than would repeated cross-sectional data) of how labor market outcomes are changing over time in the population, because after the first sample, panel data are unlikely to be representative of the population as a result of sample attrition (although Chapter 23 discusses how this problem can be mitigated). When no baseline cross-sectional survey was fielded at the beginning of the period in question, retrospective data are the only option.

While the need for data from various time periods is clear in the case of issues 25-28, it requires some explanation in the case of issue 29, which pertains to the effect of wages on labor force participation, hours, and sector choice. Estimating such effects is unfortunately quite complicated. A key problem is that the analyst would like to estimate the effect on labor force participation (and participation in various sectors of the economy) of potential wages in each sector of the economy in which the worker might choose to participate. For example, the analyst would like to estimate the effect on participation in the private sector of the wages the worker could expect to receive in both private and public sectors. That would require measuring potential public sector wages even for individuals who chose to work in the private sector or not at all. Because it is nearly impossible to collect such information, researchers must attempt to estimate the wages workers could hope to obtain in various sectors, and then include these estimated or predicted wages as explanatory variables.

Estimating the effects of predicted wages on labor supply and sector choice is possible only when the predicted wage variable used on the right-hand side of the labor supply and sector choice equations is a function of some observed variables that affect labor supply and sector choice only through their effect on wages. That is, the dataset must contain some variables that affect wages but do not affect labor supply by changing workers' "preferences" toward staying home to perform household tasks or toward avoiding certain kinds of working conditions.5 Unfortunately, most individual, household, and community characteristics that influence wages might also influence preferences. If workers' preferences do not change much from year to year, year indicators (in repeated cross-section or panel data) might fulfill the requirement of being important determinants of wages that do not affect labor supply and sector choice in any other way. Thus repeated cross-section or panel data might be useful for estimating wage effects, but only if a sufficient number of years of data, and thus a sufficient amount of independent wage variation, is available.6

To address issues 30-36 in Box 9.1 it is necessary to follow the same individuals over time either by building a panel of data on the same respondents or by collecting retrospective data. Collecting retrospective data has several advantages over collecting panel data for studying these questions. First, retrospective data are based on samples that are not subject to attrition, a factor that often renders panel datasets unrepresentative of the population. Second, retrospective datasets tend to contain fewer spurious changes over time (induced by measurement errors) than panel data.7 Finally, it is possible to use retrospective data to study changes over periods for which no "before" data are available. As mentioned above, the major disadvantage of retrospective data is that they tend to be full of recall errors, although these are probably less severe in qualitative measures (such as answers on the sector of activity) than in quantitative measures (such as answers on earnings). Thus the draft module contains retrospective questions about qualitative outcomes (pertaining to the nature of activities) but not quantitative outcomes (such as hours and earnings). Panel data must be collected to study the changes in hours and earnings experienced by individuals over time.8

Survey designers should decide the best time interval to span with panel or retrospective data according to the main measurement objective of the survey. For describing the short-term effects of a structural adjustment program, an interval of one year may be appropriate. For studying long-term development concerns over periods during which there have been no significant changes in government policy or in the country's economic circumstances, five-year intervals may be preferable. The draft module contains five-year retrospective questions, which can easily be modified to cover a shorter period.

LINKS TO THE HOUSEHOLD ENTERPRISE MODULE. In most past LSMS surveys, analysts could study the profits of nonagricultural household enterprises only after linking data from the household enterprise module with data from the employment module for the same households, because data on hours worked in household enterprise activities were collected only in the employment module. However, making this link was extremely difficult (see Chapter 18). The draft modules introduced in the third section of this chapter (and presented in Volume 3) include questions that facilitate making that link, but these difficulties can also be reduced by including questions about work in household enterprise activities in both the employment and household enterprises modules. Questions on the supply of labor should be asked in both modules, because the labor supply measures needed for employment analysis and for analysis of household enterprises are quite different. For comparability with labor force surveys from around the world, and for other reasons, labor supply measures for employment analysis must refer to 7-day (and 12-month) reference periods, and should be asked of the individual who performed the work. For reasons discussed in Chapter 18, labor supply questions used for analysis of household enterprises should refer to 2-week (and 12-month) reference periods, and should be asked of the manager of the enterprise.

Therefore, the employment module must be preceded by a brief section that identifies all of the household enterprises in the household. To ensure that the survey collects earnings data for all jobs, the questions used in that introductory section must be phrased so as to prompt the respondent to include any activities for which earnings data are not collected in the employment module. For example, they should be phrased so as to include the activities of itinerant carpenters, whose earnings are better measured in the household enterprise module but who may not think of themselves as operating an "enterprise."

To make explicit links between employment module data on labor supply and household enterprise activities, questions can be included in the employment module that ask respondents who have characterized themselves as self-employed to identify in which household enterprise (reported in the brief household enterprise roster module) this work was performed. This is likely to reveal some additional self-employment activities that should be included in the household enterprise roster, and thus serve as a check on the accuracy of the household enterprise roster, as well as a means of linking the modules. (This places an extra burden on the interviewer, who must do some cross-checking during the interview; the cross-checking will become easier as questionnaires become computerized.)

Designing Questions about Specific Labor Market Outcomes

This subsection addresses how to design questions that elicit complete and accurate information about the four main sets of labor market outcomes that were discussed in the first section: employment and unemployment, earnings from employment, on-the-job training, and conditions of employment.

EMPLOYMENT AND UNEMPLOYMENT. According to International Labor Organization guidelines, individuals are employed if they worked for at least one hour during the previous 7 days, either for a wage, in a household enterprise, or as an unpaid apprentice or trainee. Individuals also count as employed if they did not work during the previous 7 days but had a permanent job from which they were temporarily absent. Individuals are unemployed if they were not employed and were looking for work during the previous 7 days. They are participating in the labor force if they were either employed or unemployed during the previous 7 days.

Measuring employment is especially difficult in developing countries, where many income-generating activities are performed outside of markets and may not be thought of by respondents as "work" or "employment." (For a detailed discussion see Hussmanns and others 1990.) In an effort to elicit accurate data on labor force participation, the proposed employment modules include a sequence of questions that ask about three kinds of work (own farm, nonfarm household enterprise, and wage employment) as well as about job searches and temporary absences from permanent jobs. The questions also provide examples with the aim of helping respondents understand the range of activities to which the interviewer is referring. In an effort to assess the extent of underemployment and on-the-job searches, the prototype module contains (as optional extensions) questions on whether respondents with jobs were looking for additional or replacement work during the previous 7 days. The wide range of sector distinctions of analytical interest are discussed above. For detailed notes on specific questions see the fourth section of this chapter.

Care must also be taken in measuring hours of work. With the aim of allowing researchers to construct measures of hours for any reference period a respondent might use for earnings reports, previous living standards surveys have asked respondents about days worked in the last week, hours worked per day in the past week, weeks worked in the past year, and usual hours worked per week during weeks worked in the past year. U.S. validation studies (which compare workers' reports about hours and earnings in household surveys with employer records) suggest that one should try to avoid the use of questions about "usual" or "average" hours, because (even in the less seasonal U.S. environment) responses to such questions contain a great deal of error as measures of the average hours researchers hope to measure (Rodgers, Brown, and Duncan 1993). In light of this, the draft employment modules replace the question on average hours worked per day in the past 7 days with a question on total hours worked in the past 7 days. Researchers can calculate average daily hours themselves if they wish. Unfortunately, it is difficult to avoid the use of "usual" weekly hours questions when attempting to measure 12-month labor supply, without expanding the number of questions greatly. Thus the prototype retains the usual weekly hours question.9

EARNINGS FROM EMPLOYMENT. To accurately measure earnings it is essential to phrase questions carefully (for example, referring explicitly to the last pay period rather than to "usual earnings") and to include the full range of questions needed to pin down the respondent's earnings both in cash and in kind. For cash earnings it is vital to ask explicit questions about respondents' income and payroll taxes, tips and gratuities, and bonuses. In addition, respondents must be asked specific questions about any in-kind payments that they may have received in various forms. The draft employment modules are designed to collect a full set of data on respondents' jobs in the 7-day reference period but only summary data about cash wages and in-kind payments from respondents' jobs in the 12-month reference period. Less detail is justified when studying earnings in the 12-month reference period, both because the 12-month reference period data have seldom been used in detailed wage studies and because experience shows that respondents find it difficult to remember detailed earnings information that far in the past. Including more detail in the 12-month reference period section of the employment module also contributes to respondent fatigue, increasing the probability of inaccurate or missing information later in the interview.10 Concern about respondent fatigue may also lead survey designers to decide to ask detailed questions about only one of the respondents' jobs in the 7-day reference period.

ON-THE-JOB TRAINING. The few questions about on-the-job training that were included in previous LSMS surveys were in the education module. They are better included in the employment module, however, because in this module reports of training can be linked with the employers that provide training and the occupations that require training, as well as with the individuals who receive training.

Measuring training is inherently difficult. Because much on-the-job training is very informal, recipients might respond "no" if they were simply asked whether they had ever received any training. Although asking this simple question is nevertheless useful for identifying formal training, less direct questions should also be asked that shed light on whether workers are in jobs in which their productivity increases as they watch others and practice their skills. The proposed module contains some questions for assessing the importance of informal training (associated with both getting better at doing a particular occupation and progressing to new occupations). However, it should be noted that these questions have not been field tested in developing countries and are in great need of pretesting. They are designed with an eye to avoiding the ambiguities in similar questions on informal training that were included in some U.S. surveys about which researchers have complained (Sicherman 1990).

CONDITIONS OF EMPLOYMENT. Given the multipurpose nature of LSMS surveys, the employment module restricts itself to qualitative measures (for example, whether the individual is entitled to a pension) of a few nonwage benefits and working conditions. Collecting quantitative data (such as size of pension and vesting requirements) is beyond the scope of the survey. With regard to job tenure, care should be taken to word questions so as to make clear that the desired response pertains to the amount of time the respondent has spent working for his or her current employer (in any occupation) and not the time he or she has spent doing the current sort of work. Finally, the draft module contains questions both about types of formal legal contracts (which must, of course, be modified to account for local employment contract law) and about informal arrangements between workers and employers.

Facilitating Study of the Determinants of Labor Market Outcomes

The first section indicated the potential usefulness for policy analysis of empirical studies of the determinants of wages and other labor market outcomes. This section briefly discusses how to facilitate such studies by collecting adequate information on potential determinants. While some of the determinants mentioned in the first section (such as gender, ethnicity, and geographic location) require little comment, two of them (the individual's schooling and the financial constraints the individual faced when entering the labor market) deserve further comment here. In addition, this section discusses several potential determinants of labor market outcomes that were not mentioned in the first section. These determinants (the individuals' labor market experience and innate ability, characteristics of the individuals' current households, and working conditions of the individuals' jobs) are seldom of direct policy interest, but are useful for getting good estimates of policy-relevant effects, either because they help avoid omitted variable bias in the estimation of policy-relevant effects or because they can serve as valuable instruments in attempts to control for endogeneity (see Chapter 26 on econometrics).

SCHOOLING. Quality of schooling received is often as important as years of schooling in explaining labor market outcomes. Simple measures associated with school quality include indicators of whether the last school the individual attended was public or private, whether it was rural or urban, and perhaps what language was used in that school. It may also be possible to ask simple questions about the availability of textbooks and running water in the last school attended and whether the school was a completely enclosed structure. All of this information can be collected in the education module of a multitopic household survey.

EARLY CAREER FINANCIAL CONSTRAINTS. As indicated in the first section of this chapter, collecting indicators of the financial constraints facing the household in which the individual lived when he or she entered the labor market would give analysts the chance to assess the theory (which has not been studied very much) that poverty tracks workers into careers with little potential for advancement. In addition, collecting such data could be useful for a technical econometric reason; such measures would provide analysts with valuable and rare "instruments" to use in some econometric procedures for dealing with simultaneity biases (see Chapter 26 on econometrics), because in some contexts financial constraints variables could be expected to influence occupational or sectoral choice but not affect current wages directly.

In an attempt to assess the severity of financial constraints in the family that launched the individual into the labor market, it might be useful to modify the parental background section of the household roster to incorporate several indicators of the financial circumstances in which the individual found him or herself at the age of 15. (In some countries a younger age may be appropriate.) Assuming that the questionnaire already contains questions on parental education (an important determinant of household earnings), it would be possible to control better for per capita disposable income by adding a question sequence such as: "Now I would like to ask you a few questions about when you were 15 years old. Was your father alive? Was your mother alive? How many brothers and sisters were living with you?" Other potential questions include questions about whether at that time household members were in good health, what land holdings the family had, the size of the family's house, and other simple indicators of wealth. Such questions have not been included in previous LSMS surveys, but experimenting with such questions would be valuable.

LABOR MARKET EXPERIENCE. Individuals' labor market experience is thought to be an important determinant of wages and other labor market outcomes, because it reflects the opportunities individuals have had to acquire skills informally on the job, to increase wages by moving between jobs, and to improve their labor market standing in other ways. Failing to include good measures of experience in studies of wage determination can lead to misleading inferences about the effects of other characteristics on wages, with potentially serious consequences for policy analysis. For example, in countries where schooling attainment has been rising over time, the simple empirical relationship between wages and schooling in cross-sectional data, which assesses how much higher wages are for individuals with more schooling, would tend to understate the effect of schooling on wages. This is because the individuals who have more schooling also tend to be younger and have less experience (see discussion of omitted variables bias in Chapter 26). A regression that controls for experience as well as schooling would give a more accurate picture of the schooling effect.

Researchers studying the determinants of labor market outcomes who lack good measures of individuals' labor market experience often construct a measure of "potential labor market experience," which is equal to the individual's age in years minus the years he or she spent in school (including years repeated) minus the five or six years before entering school. This is an accurate measure only for individuals who have worked every year since leaving school and is probably much less accurate for women than for men. The draft employment modules allow for collection of more accurate experience measures.

INNATE INTELLIGENCE. Innate intelligence can influence wages and occupational choices for obvious reasons. Because innate intelligence is not subject to policymakers' influence, its effects on labor market outcomes are not of direct interest to policymakers. Again, however, failure to include measures of innate intelligence may lead to misleading policy inferences. For example, if individuals with higher innate intelligence tend to get more schooling, simple empirical relationships between wages and levels of schooling, which do not control for differences in innate intelligence, might overstate the wage effect of schooling.

Measuring workers' innate physical, mental, and social abilities is inherently costly, and even the best measures fall far short of measuring all of the traits valuable to employers. Research from developed and developing countries seems to suggest that controlling for the abilities captured by feasible ability measures is unlikely to change inferences on many wage regression coefficients (Strauss and Thomas 1995). Thus, while it might be interesting to collect ability measures occasionally, collecting them in all surveys is probably not warranted.

CHARACTERISTICS OF THE INDIVIDUAL'S CURRENT HOUSEHOLD. Information on the structure of the individual's current household (for example, the numbers of dependents and of male and female workers) and on nonlabor income may be of great use in helping researchers deal with two econometric problems that plague studies of wage determinants: endogeneity of wage equation regressors such as indicators of sector or employer type and endogeneity of selection into the wage-earning subsamples on which wage regressions must be run (see Chapter 26 on econometrics and Schaffner 1997). In dealing with both problems it is crucial to find variables that might affect whether or not the workers are wage earners, the sector in which they work, and the employer type for which they work, but that do not affect wages directly. Household structure variables are likely to satisfy such requirements. They are unlikely to be known by employers or taken into account during wage setting, but, by influencing the value of potential workers' time in the home, they are likely to affect labor supply and sector choice decisions. Variables describing household structure may be derived from household roster information. For discussion of the measurement of nonlabor income see Chapter 11.

WORKING CONDITIONS. Some employer, workplace, and job characteristics are not themselves considered lines along which segmentation takes place but are potentially important controls in studying segmentation. For example, wages for comparable workers may be higher in an industry to compensate workers for working conditions that are less attractive in some way. Without controlling for working conditions, analysts might incorrectly conclude that there is segmentation along industry lines. Thus having data on working conditions (as well as on worker productivity factors such as schooling, training, and experience) to include in the wage regression allows analysts to make more accurate inferences about the potential severity of segmentation (Schaffner 1998).

The Employment Module

This section introduces three potential employment modules: a short module, a standard module, and an expanded module with a variety of possible extensions. When survey designers are planning a single survey, they should not include all extensions simultaneously, as that would make the questionnaire too long. All questions should be pretested to ensure that they are meaningful for a wide range of workers (agricultural, rural nonagricultural, urban self-employed, and urban employed in both small and large establishments) in the country studied. The words and phrases in italics must be replaced by (lowercase) phrases relevant in a particular country before pretesting.

Links to Research Questions

Table 9.1 links each issue in the first two sections of Box 9.1 with the relevant questions in the prototype modules and indicates how well the issues can be addressed using data from short, standard, and extended modules. The objectives listed in the first column of Table 9.1 are abbreviated references to the objectives listed in the first two sections of Box 9.1. Question numbers indicated under "Data needs" refer to questions in the "All extensions" version of the module. Separate notations are made under the standard module heading only when the standard module contains additional questions beyond those contained in the short module. The standard module column lists only the questions available in the standard version but not the short version. The table contains notations under the extensions column only when the extensions improve on the attainment of a research objective, and lists in this column only the additional questions available in the expanded module.

Age Limit

The employment module should be administered to all household members above a specified age. Age limits in past LSMS surveys have ranged from 6 to 14. Survey designers should set this limit in accordance with local circumstances. Setting the limit at 10 or 12 years of age may be appropriate in many countries. Survey designers might occasionally want to lower the limit to 6 to study child labor,11 in which case they might also want to include questions that check the respondents' age and steer younger workers into a shorter sequence of questions more relevant to them.

Respondent

As in previous LSMS surveys, interviewers should attempt to obtain information about each individual directly from the individual himself or herself. After, say, two unsuccessful attempts to do this, interviewers may obtain information from a parent or another household member. Analysis of several previous LSMS surveys suggests that such "proxy respondents" are likely to be less accurate and less complete than the individuals themselves. Proxies appear less likely to report the individual as working or as having a second job and less likely to supply complete wage information. Among people who are reported as working, however, reported hours of work are higher for individuals whose hours are reported by proxies than for individuals who report their own hours. This is consistent with the hypothesis that individuals whose information is reported by proxy work more than other individuals (perhaps because their work keeps them away from home when the interviewers call) but that proxies are more reluctant to report the individuals as working, either because they do not want to answer a large number of questions about someone else (after possibly already answering all the same questions about themselves) or because they are uninformed or have a different perception about what constitutes the individual's work.

Time Use Questions

Household members make contributions to household welfare not only by working for wages or in self-employment, but also by engaging in unremunerated child care, cooking, cleaning, and other household production activities. Understanding household demands for such activities may be especially important for understanding household choices about education and work activities of children and about women's employment and self-employment activities. Chapter 22 discusses the merits of collecting data on time used in such activities within a broader time use module. When such a module is not included in the survey, the employment module is a natural place in which to include several short questions on housework. Indeed, many previous LSMS employment modules have included such

Table 9.1 Summary of Data Requirements by Research Objective

Columns: Research objective | Short version (prospects for analysis; data needs) | Standard version (prospects for analysis; data needs) | Extensions, expanded version (prospects for analysis; data needs) | Other requirements

1. Labor force participation, employment, unemployment past 7 days; employment in past 12 months | Excellent; A2-A12
2. Participation in various sectors of the economy | Excellent; B1-B3, C2-C7, C14, C24-C29, C34, D2-D7, D12, D21-D26, D31 | Excellent; B3, B6
3. Economic activities of poor | Excellent; same as for 1 and 2 | Other requirements: consumption module measures of poverty
4. Hours of work | Good; B7-B10
5. Unemployment duration and search methods | None | Good; A18-A22 | Good; F21-F24 | Other requirements: may need relatively large sample for accuracy
6. Underemployment and on-the-job search methods | None | Good; A14-A17
7. Average hourly earnings in wage employment | Good; C12-C21, C32-C41, D10-D18, D29-D36 | Excellent for 7-day reference period, good for 12-month reference period; C10-C3i7, C58-C74
8. Average hourly earnings in self-employment | Must be done in Household enterprise and Agriculture modules.
9. Annual earnings in wage employment | Good; same as 7 plus B7-B10, C43-C44, D38-D39 | Excellent
............................................................................................................................................................... 10. Annual earnings in self-employment Must be done in Household enterprise and Agriculture modules. T l.............................................................................................I.......................................... ..................... ........................ .................*....................................*.............*......... ..... II. Participation in formal training None Good C 13, C48, Good C69 C73 C49 12. Partic pation in informal training None None Good C66-C68 ........ .....................................................................................................*................ ..............................I................................................................ 13. Linking training with employer and None Good for Same as for Good C6 I -C65 Roster, worker types formal 2 and 12 Education training ........................................................ .......................................... ............................................................................................................................ 14. Nonwage benefats and nonwage None Good C38-C40, Good C19-C21 job features C75-C77 (Table continues on next poge.) 235 JULIE ANDERSON SCHAFFNER Table 9.1 Summary of Data Requirements by Research Objective (continued) Extensions Short version Standard version (expanded_version) Prospects Data Prospects Data Prospects Data Other Research objective for analysis needs for analysis needs for analysis needs requirements 15. Job length and nature of turnover None Excellent C4 1 -C42, Excellent F40-F42 C78, D15- D 1 6, D20- D23 16. 
Participation in various contractual None None Excellent C6, C7, relationships C 1 5-C 18, C34, C80, C85-C87, D6, D7, DI O- DI 3, D38- 39, Fl I-F 3, F35-F36 I i7Labor market effects of schooling, Good Same as for Excellent Same as for Excellent Same as Roster health, nutrition 1, 2, 4, 7, 9 2, 5, 7, 11, for 5, 1 1, Education, 14, 15 12, 14, 15, Health, 1 6 Nutrtion i 8. Laor market effects of tra ning None Good for Same as Excellent A25-A29, Roster and experience formal for II C57-C65 Education training ......................... ...............I..................................... ............I.............................................................................................................................. 9. abor market implications of None None Excellent Same as Roster, contractual relations for 16 Education ard 17 20. Discrim.nation Fair Same as for Good Same as for Excellent Same as f r Roster 1,2,4,7,9 2,5,7, 1 1, 5, 1 1, 12, Education 14, 15 14, 15, 16 21, Intergenerational mobility Fair Same as for 2. Roster parental occupation ........................ .................................................................................................................................................................................................. 22. Labor market effects of inancial Fair Same as for Good Same as for Excellent Same as for Roster constraints 1, 2, 4, 7, 9 2, 5, 7, 1 1, 5, 1 1, 12, Education, plus 14,15 14,15,16 expanded parental background ( ncluding proxies for financial constraints at age 15). ............................. ............................................................................................................................................................................... 23. Labor market segmentation Fair Same as for Good Same as for Excellent Same as for Roster 1,2,4,7,9 2,5,7, 1 1, 5, 1 1, 12, Education, 14, 15 14, 15, 16, Parental plus C 19- variab es C21 ma............ 
.......................................................................................................................................................... ............................................ 24. Labor market effects of Good Same as for Excellent Same as for Excellent Same as for Roster community differences I, 2, 4, 7, 9 2, 5, 7, 1 1, 5, 1 1, 12, Education, 14, 15 14, 15, 16 plus community module; better with panel or with repeated cross-section using same communities. ............................................................ ............................................................... .................................................................................................. 25. Change over time in labor Excellent Same as for Possible F2-F 12 Retrospective, force status and sector 1, 2 without repeated cross- repeating section or parnel cross-section data or pane 236 CHAPTER 9 EMPLOYMENT Table 9.1 Summary of Data Requirements by Research Objective (continued) Extensions Short version Standard version (expanded version) Prospects Data Prospects Data Prospects Data Other Research objective for analysis needs fcr analysis needs for analysis needs requirements 26. Change over time in average Good Same as for Excellent for Same as for Repeated cross- hourly earnings 7 7-day 7 secton or panel reference data period jobs, good for 1 2-month reference period jobs 27. Change overtime in nonwage None Good Same as for Good even Fi 4-F 16 Retrospective, job features 14 without repeated cross- repeated sect on, or panel cross-section data or panel 28. Change over time In contractual None Good Same as for Good even F I I -Fl 13 Retrospective, relations 1 6 without repeated cross- repeated sect on, or panel cross-section data or panel 29. Wage effects on labor supply Fair Same as for Good Same as for Repeated cross- and sector choice I, 2, 4, 7 2, 7 sect on or panel with sufficient wage variation 30. 
Occupational and sectorai mobility Good Same as f r Good even F2-F6 Retrospective i, 2 wthout panel. or panel ......................... ................................................................................................................................................................................................ 31 . Relatve success of job search None Good Same as Panel methods. 1, 2, S, 6, 7 32. Income mobility Good Same as 9 Good Same as 9 Panel, as well as complete data on income not from wage employment 33.Winners and losers associated Good Same as Good Panel,ras well as with policy change 1, 2,7, 9 complete data on income not from wage employment 34. Career path effects of training, None Good Same as Good Same as Retrospective and early good jobs 1, 2, 7, 9, I I 1 1, 12, or panel data plus F17 ........................................... . . ........................................................................ ....................................................................................................................................... . 35. Labor market effects of risks Good Same as Panel of and shocks 1, 2,7, 9 sufficient length; good agriculture and health modules for characterizing shocks .......................................................................................................................................................................... .................................................. 36. Relative success of job search by None Good Same as Panel that labor force status and location I. 2, 5, 6, 7 follows migrants of residence Source: Author's summary questions. Housework questions could be added ing, cleaning, washing clothes, fetching water or fire- after Question A12 in the short version of the mod- wood, or performing maintainance work on your ule, and in conmparable locations in the other ver- home?" (If the answer is no, the next question is sions. They might read: "During the past 7 days, have skipped.) 
"During the past 7 days, how many hours you done any work around the house such as cook- did you do this kind of work?" "During the past 7 237 JULIE ANDERSON SCHAFFNER days, have you at times been the main person caring identification code in the first column identifies the for or watching over one or more young children?" individual by the number assigned to him or her in the (If the answer is no, the next question is skipped.) household roster. "During the past 7 days, for how many hours were you the main person caring for or watching over Al. This qucstion indicates which responses are sup- one or more young children? DO NOT COUNT plied by proxies rather than by the individuals them- TIME WHEN CHILDREN ARE SLEEPING." selves. Keeping a record of proxy responses is useful not only for assessing the quality of the survey but also Question Counts because in some econometric applications researchers may want to test whether their results are affected by As indicated by the skip codes in the prototype ques- the inclusion or exclusion of proxy data. tionnaire, many questions are relevant only for certain subsets of the sample.Table 9.2 provides rough calcu- A2-A7. These questions determine whether workers lations of the average number of questions to which were employed in the previous 7 days and during the individuals of various kinds should respond based on previous 12 months. To prevent underreporting of LSMS survey data for Peru and Cote d'Ivoire in employment, the interviewers give respondents 1985-86. Details of the assumptions on which this detailed prompts that are likely to make them think table is based are available from the author. The most about and report some activities that they might not important assumptions are that: ordinarily consider "work" or "employment."As in past * 60 percent of individuals in the age range worked LSMS surveys, this draft module uses the International in the previous 7 days. 
Labor Organization's definition of employment, but to * 20 percent of those who did not work in the pre- improve accuracy the questions here have been ampli- vious 7 days had a permanent job. fied. They could be amplified even further. For exam- * 80 percent of those who worked in the previous 7 ple, the phrase "for at least one hour" could be insert- days-and 10 percent of those who did not- ed into each question to prevent the underreporting of worked five years ago. activities that are only carried out part-time. Questions * 70 percent of jobs are in self-employment. A2 and A3 could be amplified by adding a sentence * 20 percent of all workers had some work in the such as: "This work might have been for paymenit in previous 12 months that they were not performing cash or in kind, in exchange for labor or a reduction of in the past 7 days. debt, or as part of an apprenticeship or on-the-job The question counts include checking questions, which training program." Even if such a sentence is not the interviewer must answer but which are not read included, interviewers should be trained to recognize aloud to respondents. The first column averages over all this wide variety of activities as "work." individuals, wvhether they are employed by others, self- Questions A6 and A7 should contain as many employed, unemployed, or out of the labor force. phrases as is necessary to cover the wide range of potential self-employment activities. The draft ques- Notes on Proposed Questionnaires tionnaires use just two: "work on own account" and "work in a household business enterprise." Both here This section gives detailed instructions about adminis- and in the household enterprises module, careful pre- tering each set of questions included in the prototype liminary research should be done to determine which employment module. It begins with the short version phrases best capture all self-employment activities. 
of the module and then discusses the additional ques- Because standard definitions of employment use a tions asked in standard and expanded versions. reference period of the previous 7 days, this is the ref- erence period used in the draft module. An alternative Short Version would be to refer to "last week, from Sunday through Saturday." However, this leads to longer recall periods PART A. LABOR FORCE PARTICIPATION. In response to for respondents interviewed later in the week. this set of questions, interviewers fill out one line for Questions A3, A5, and A7 are abbreviated relative to each household member over the designated age. The A2, A4, and A6, but the same comments apply. 238 CHAPTER 9 EMPLOYMENT Table 9.2 Estimated Question Counts per Individual Individuals with Individuals wth Individuals with Ind viduals with one job in wage one job in wage one job in wage one job in self- employment and one employment and Average employment and employment and in self-employment, a different wage over all the same job five the same job five the same jobs five employment job five Version individuals years ago years ago years ago years ago Short 30.7 40.2 26.0 54.2 40.2 Standard 39.7 62.0 31.7 77.3 62.0 ............ ............ _-................................................................................................................................. ....................................................................... Standard plus Extension 1 39.8 62.0 31.7 77.3 62.0 (how found current job, A24) Standard plus Extension 2 43.4 67.4 34.7 82.7 67.4 (experience, A25-A29, C57-C60) Standard plus Extension 3 51l.7 72.3 42.0 94.9 88.9 (five-year retrospective, A30-A3 1, EI-E5,F`I-1`42 Standard plus Ex ension 4 4. 5 68.3 3. 
7 83.6 68.3 (contract and relationship with employer, C6-C7, C 15-C 18, C80, C85-C86, D6-D7, DI 0-D 13, D38-D39) Standard plus Extension 5 40.1 64.0 31.7 79.3 64.0 (working conditions, C 19-C21, C87) Standard plus Extension 6 40.3 64.8 31.7 80.1 64.8 (skill acquisition, C6 I -C65, C0113-C011I5) (siil- -acr --~ u - si io n -s C6 -C6 C 3n1 5 .. ............... - .I............... ...0........ ..............I ...7...............................79- .-3 .................................64 ........ . Standard plus Extension 7 40.1 64.0 31.7 79.3 64.0 ( nformal training, C66-C68) Sandard plus Extension 8 39.7 62.6 317 77.9 62.6 (length of formal training, 070, 07 1, 073) Source: Authors ca culations from C-te d'ivoire and Peru survey data. A8. This question asks the interviewer to identify respondents to focus too narrowly on wage employ- which respondents did and did not work during the ment. The question could be amplified by including previous 7 days, so that each group can be asked the phrase "whether by consulting newspapers, appropriate questions. employment agencies, employers, friends, or relatives." A9. This question asks individuals who did not work A12. This question asks individuals who did not work during the previous 7 days whether they have a per- and did not search for work why they did not search manentjob. Individuals who have a permanentjob but for work (that is, why they were not in the labor were absent from work during the previous 7 days force). Qualitative research and pretesting should be should be counted as employed according to the used to refine the list of potential responses. International Labor Organization's definition. A13. This question is used to determine what the A10. This question asks individuals who were absent interviewer should do next. If the respondent has from a permanent job why they were absent. 
This worked in the past 12 months, the interviewer should helps analysts interpret the report that a respondent proceed to Part B and continue interviewing that per- has permanent job but worked no hours during the son. If the respondent has not worked in the past 12 previous week. It also serves as a check on that report, months, no more employment questions need to be because the answer may reveal that the respondent asked of that person. misunderstood the previous question. PART B: OvERVIEW OF WORK IN THE PREVIous 7 DAYS All. This question is necessary for identifying indi- AND THE PREVIOUS 12 MONTHS. In response to the viduals who are in open unemployment by standard questions of this section, interviewers fill out one line definitions. Sometimes survey designers replace the for each work activity reported by each individual who word "work" with "paid work," but this may cause was identified by QuestionA8 as having worked or had 239 JULIE ANDERSON SCHAFFNER a job during the previous 7 days, or by Question A13 Box 9.2 Cautionary Advice as having worked in the last 12 months. Lines are also How much of the suggested module is new and unproven? filled out for any work done in the past 12 months that Most ofthe questions in the short and standard modules was not done in the previous 7 days. The interviewer are similar to those used in previous LSMS surveys and must fill in the identification code of the individual and in many labor force surveys around the world. Some must use the "activity number" column to number questions have been modified slightly to increase speci- sequentially the activities reported by the respondent. ficity and clarify distinctions between occupations and Part B briefly identifies all the kinds of work jobs.The questions related to experience, contracts, rela- (occupations) in which respondents have been tionships with employers, skill acquisition, and training are engaged during the previous 7 days. 
The interviewer new and largely unproven.The draft modules are organ- ized somewhat differently from previous LSMS employ- determines which two of these occupations were the ment modules. Previous LSMS employment modules most important (in the sense of having the most have allowed respondents to identify their main and sec- hours devoted to them). Part C then asks more ondary work in the past 7 days and past 12 months, and detailed questions about the respondent's main job asked detailed questions about each of these jobs imme- (that is, work done under the employer or household diately after they were identified-so that secondary enterprise for which the respondent has worked the jobs and I 2-month reference period jobs were identified most hours) in each of the two most important occu- only after workers had already been asked many detailed p questions about the main job of the last 7 days.The cur- pations during the previous 7 days. Part B has been rent modules contain brief overview sections on all work kept quite short in the hope that this will increase the (not just main and secondary jobs) done in the past 7 probability of respondents giving a complete account days and in the past 1 2 months before asking detailed of the work they did during the previous 7 days. 
If, questions about main and secondary jobs in the past 7 however, survey designers plan on asking only a few days and 12 months; in addition, the current modules more detailed questions about 7-day reference period require interviewers, rather than respondents, to identify jobs, and if they are willing to ask those questions main and secondary activities in each reference period, about all work activities in the reference period The only potential difficulty that this introduces is that (rather than just the two most important ones), the interviewers must calculate the number of hours worked by household members in each occupation in a questions could be added to this section and Part C given reference period in order to know which two could be eliminated. occupations were most important during that reference Part B also determines which occupations were period. They must then refer the respondent to his or the most important in the past 12 months. If they are her previous answers to questions about occupation and the same as the most important occupations in the past industry. In light of these challenges, it is important to 7 days, the information on them is gathered in Part C. train interviewers carefully If data are not collected on either of those jobs in Part * How well has the module worked in the post? As far as C, the information is collected in Part D. we know, the employment module has worked rea- sonably well in the past.Validation studies mentioned in this chapter suggest, however that the greater specifici- nent jobs they may have held during the previous 7 ty in the phrasing of questions in the draft modules days even when they actually did no work during that should be useful for improving accuracy Also, asking time. 
It is important to include these questions in the respondents to give brief overviews of work activities section dealing with the 7-day reference period rather before asking them to answer many more detailed than the section dealing with the 12-month reference questions should leac to more complete and accurate period because only in the 7-day section can detailed description of work activities. Whacriptior of thewmodule mosivitieedsto.he ustozd information about these jobs can be obtained. It seems * Whot parts of the module most need to be custom -zed? Phrases rendered in italics in the prototype modules likely that jobs from which workers might be tem- are especially in need of customization. In addition, porarily absent are systematically different from other qualitative research would be useful for determ ning jobs. Paid vacations, for example, are likely to be asso- the in-kind payments, nonwage benefits, and working ciated with relatively high-paying, formal jobs. conditions about which specific questions should be Analysis of wages and other labor market outcomes asked. would be incomplete without detailed information on such jobs. 240 CHAPTER 9 EMPLOYMENT Bl-B2. These questions identify the respondent's jobs during the previous 12 months. If survey design- occupation and industry. Interviewers write out ers decide to ask detailed questions about all of the descriptions; the responses are coded later. respondents' jobs rather than just their main and sec- ondary jobs, this question is unnecessary. There would B3. 
The main purpose of this question is to identify also be no need for this question if this objective way nonfarm self-employment activities so that the of picking the two most important jobs were replaced respondents involved in such activities can be asked to by a more subjective method-for example, if after match the activities to the household enterprises describing all their work activities, respondents were described in the household enterprise module. simply asked which two were the "most important." B4. This question links data on self-employment in B13. This question determines whether another row the employment module to data on self-employment in this table pertaining to 7-day reference period in the household enterprise module.This is important activities should be completed for a particular respon- both for linking the data in the two modules that per- dent. If not, the interviewer should move on to B14. tain to the same activities and (perhaps more impor- tantly) for checking that detailed data are being col- B14. This question asks the interviewer to flag the two lected in the household enterprise module for all activities to which the respondent devoted the most activities that are considered to be "self-employment" hours during the previous 7 days.The interviewer will activities in the employment module (because the later refer to this column when determining which employment module elicits very little information activities should be the subject of Part C. about self-employment activities). R15. This question determines whether another row B5-B6. Question 135 helps the researcher verify any in this table pertaining to 12-month reference peri- cases in which the respondents claim that the number od activities should be completed for a particular of hours that they worked overall in an activity respondent. If not, the interviewer should move on exceeded the number of hours that they worked for to B16. 
their main employer in this activity during the previ- ous 12 months (which is only possible if they have B16. This question asks the interviewer to flag the two more than one employer in this activity). B6 does the activities to which the respondent devoted the most same for the past 7 days. These questions are also use- hours in the last 12 months.The interviewer will later ful for identifying casual workers who work for sever- refer to this column when determining which activi- al employers during one 7-day period. ties should be the subject of Part D. Note that this question requires the interviewer to consider all activ- B7-B10. These questions describe the amount of time ities the individual reports in this section, whether rel- the respondent spent working in this occupation dur- evant to the 7-day reference period or only to the 12- ing the previous 7 days and the previous 12 months, month reference period. This enables a rough estimate of the number of hours that the respondent worked in a variety of reference B17-B22. In answering these questions, the interview- periods. Weekly and annual earnings in this occupa- er is determining which sections of Parts C and D tion can be estimated by multiplying the respondent's must be admninistered to this respondent and is record- average hourly earnings from his main employer in ing the results in one place.The answers to these ques- this occupation by these total labor supply numbers. tions appear only on the last line of Part B for each Interviewers should be trained to prompt respondents respondent.The interviewer will refer to these answers who find these questions hard to answer and to help later when determining whether to administer various respondents add up hours or weeks. sections of Parts C and D to each respondent. When questionnaires are computerized, these answers can be Bll. This question asks interviewers to calculate num- filled in automatically. 
bers to which they will refer in Question B16 when they ask respondents about their main and secondary

When they are not, interviewers will require training to understand the intent and to be able to complete them quickly.

PART C: MAIN AND SECONDARY JOBS IN THE PREVIOUS 7 DAYS. Interviewers should fill out one line for each person who has worked (or had a job that he or she was temporarily away from) during the past 7 days. Part C collects information on the activities marked in B14 with a 1 or a 2.

Questions about the 7-day reference period have been split into two sections: the overview in Part B and the detailed questions of Part C. There are three reasons for this split. First, the overview elicits information about all work activities of the past 7 days, while the detailed questions are asked about only two jobs. Second, it is useful to keep the overview section short so that a complete accounting of activities may be recorded before respondents become weary of a long list of questions. Finally, the focus in the overview is on occupations, while the focus in the detailed questions is on the main job within each occupation. Similar points can be made about splitting questions about the 12-month reference period in Parts B and D and splitting the retrospective questions into Parts E and F.

C1. This question is used to establish which occupation mentioned in Part B will be discussed in the questions that follow. (This occupation is specifically mentioned in C2 to remind the respondent.)

C2. This question does three things. First, it asks the respondent to return to thinking about one specific activity that he or she performed during the previous 7 days. Second, it shifts the respondent's focus from an occupation or activity (which might have been performed for more than one employer) to a job (for a specific employer or household enterprise). Third, it asks where this work was carried out, which some analysts and policymakers take as an indicator of the degree of formality of the job.

C3. The main purpose of this question is to differentiate between jobs for which the rest of the employment module is relevant and jobs for which it is not. The skip codes here indicate that the rest of the module will refer either to work that the respondent does for an employer who is not a member of the same household as the respondent or to work that the respondent does for a household enterprise for which he receives a wage (presumably from another household member in charge of the enterprise). If survey designers prefer not to put the rest of the module's questions to paid workers in household enterprises, the skip codes should be adjusted accordingly.

C4. The answer to this question should indicate the respondent's broad sector of employment. Where relevant, a large rural public works program (such as the Employment Guarantee Scheme in India) should be designated by name.

C5. This is a standard question about employer size. In some countries employer size is an important legal distinction, with certain regulations applying only to employers with at least some minimum number of workers. Even where there is no legal exemption of small firms from regulations, size is often taken as an important indicator of likely compliance with regulation. In many past LSMS surveys this question was addressed only to workers employed in the private sector. Such a limitation could be imposed by adding an appropriate skip code to C4. No such code is included in the draft module because information on size may be just as important for studying wage determination in the public sector as it is in the private sector. Rather than asking for an exact number, the question could provide several size categories from which the respondent can choose.

C6. This is a standard question used to distinguish jobs that are covered by labor law from those that are not covered.

C7. This question will yield answers that are more useful for studying the effects of unions on wages than the question "Are you a member of a union?" would be, because unions often strike wage deals that apply not only to unionized workers but also to nonunion workers in the same firm or industry.

C8-C9. These questions measure the distance between a worker's residence and place of work. This is useful in studies of wage determination because employers who have to recruit workers from a wide area may have to pay high wages to attract these workers (who must incur high commuting costs). These questions may also be useful for distinguishing temporary migrants from commuters.

C10-C11. These questions determine whether the extensive set of questions on wages should be administered and elicit information on why some workers report receiving no wages. Workers who receive no payment should skip all questions about cash and in-kind payments and about nonwage benefits. If unpaid workers are not of direct interest to survey designers, the questions on training, experience, and skill acquisition can also be excluded for these workers by skipping to C18 rather than to C15.

C12. This is the main question about cash earnings. The accuracy of this measure is greatly increased by referring to the last payment (or the first one if the respondent has yet to receive one) rather than to the "usual" payment.

C13. This question is a simple attempt to increase the accuracy of the calculation of average hourly earnings beyond what was attained in previous LSMS surveys.

C14. This question provides an additional indicator of a job's formality by revealing whether the pay is subject to income or social security tax. Previous LSMS surveys were rather vague about whether they measured before-tax or after-tax pay.

C15-C16. These questions elicit information on any other payments (for example, in the form of meals, housing, and clothing). Qualitative research should be done to determine the most important types of in-kind payments in the region surveyed.

C17-C21. These questions elicit data on hours worked for this employer during the previous 7 days and 12 months, if the respondent worked in this occupation for more than one employer in the past 12 months. C18-C21 differ from B7-B10 in that they refer to work for a particular employer rather than work for any employer in a particular occupation. Analysts need a figure for the number of hours that the respondent has worked for this employer in order to measure the respondent's average hourly earnings from this employer. They need a figure for the number of hours that the respondent has worked in the occupation to extrapolate the respondent's earnings from this employer to his or her total earnings in the occupation.

C22-C23. These questions determine whether the individual had a second occupation in the past 7 days.

C24-C41. These questions are asked for all respondents with a second occupation in the past 7 days. These questions are a subset of Questions C2-C21; thus the corresponding comments above apply to them.

C42-C44. Any income from a third, fourth, or higher number of occupations in the past 7 days is reported here.

C45-C47. These questions determine whether the respondent should be asked questions in Part D and, if so, which question in Part D to begin with.

PART D: MAIN AND SECONDARY JOBS IN PREVIOUS 12 MONTHS (WHEN DIFFERENT FROM JOBS IN PREVIOUS 7 DAYS). In this section interviewers should fill out one line for any activity marked with a 1 or a 2 in Question B16 that has not already been discussed in Part C. These activities pertain either to year-round jobs that the respondent has left in the last 12 months or to jobs performed only seasonally. This section of the module asks for much less detail about the respondent's earnings than does the 7-day reference period section, although it still attempts to gather the summary measures necessary for estimating the respondent's annual earnings from wage employment. The comments on almost all of the individual questions in this section are the same as those that apply to the corresponding questions in Part C. See the introductory notes to Part C for a discussion of why 12-month reference period questions are separated into two sections (the overview in Part B and the detailed questions in Part D).

Standard Version

Many of the questions in the standard version are also in the short version, and virtually all of the questions in the short version are also in the standard version. For comments on questions that are in both versions, see the comments given above in the subsection that describes the short version. Comments are provided in this subsection for questions in the standard version that are not in the short version.

PART A: LABOR FORCE PARTICIPATION.

A1-A2. These questions indicate which responses are supplied by proxies rather than by the individuals themselves. They also identify the proxy (Question A2). Keeping a record of proxy responses is useful not only for assessing the quality of the survey but also because, in some econometric applications, researchers may want to test whether their results are affected by the inclusion or exclusion of proxy data. Keeping such a record is also useful for studying whether and how proxy responses vary depending on the gender, age, and education of the proxy respondent. This can help determine how proxies should be chosen in future surveys. (See Chapter 4 on metadata.)

A14-A15. These questions ask people who did work whether they also looked for additional or replacement jobs. Affirmative responses to such questions are used by some analysts and governments as indicators of "underemployment."

A16. This question checks whether the respondent conducted any on-the-job search, in which case A17 is relevant.

A17. This question asks what methods the respondent used for his or her on-the-job search. Survey designers may want to differentiate between public and private employment agencies. Intermediaries, who are often known by slang names, may be especially important in rural areas for finding work both locally and far away.

A18-A22. These questions ask about the duration of the respondent's job search and the search methods used during any period of unemployment experienced in the previous 12 months. In earlier LSMS surveys, questions about the duration of unemployment and the search methods used were asked only of respondents who were unemployed during the previous 7 days, which tends to be a small sample. This draft module seeks to identify a larger sample by asking about any unemployment experienced during the previous 12 months.

A23. This question is used to determine what the interviewer should do next. If the respondent has worked in the past 7 days or 12 months, the interviewer should proceed to Part B and continue interviewing that person. If the respondent has not worked in the past 7 days or 12 months, no more employment questions need to be asked of that person.

PART B: OVERVIEW OF WORK IN THE PREVIOUS 7 DAYS AND THE PREVIOUS 12 MONTHS. See the general comments on the purpose of Part B in the subsection above that describes the short version of the employment module.

B3. This question will be of greatest interest in regions where there are large agricultural wage labor markets. Norms of payments and other working conditions, and their evolution over time, typically differ by crop. Thus analysts who want to make sense of data on agricultural wage labor markets will want to distinguish such workers by crop.

B6. This question is relevant only in countries with social security programs to which the self-employed may contribute. Although the question may seem out of place in the overview section, this is the most convenient place to put it because the preceding questions have already identified the group to which the question should be asked.

PART C: MAIN AND SECONDARY JOBS IN THE PREVIOUS 7 DAYS.

C8-C12. These questions perform at least three functions. First, they make it possible to calculate the value of subsidized transport provided by employers. Second, they make it possible to measure the distance between a worker's residence and his or her place of work. (This is useful in studies of wage determination because employers who have to recruit workers from a wide area may have to pay high wages to attract those workers in light of the workers' high commuting costs.) Finally, this set of questions is useful for distinguishing temporary migrants from commuters. Where seasonal migration flows are of special interest, questions can be added here that are relevant only to workers who commute sufficiently infrequently or over sufficiently great distances. For example, such questions might identify the municipality in which the work was done.

C13-C15. These questions elicit information about apprenticeship fees. Their main purpose is to increase the accuracy of estimates of income and expenditures. In some countries the questions may also be useful for studying apprenticeships themselves, but in most countries a typical LSMS sample will not be big enough to do this. Qualitative research should be used to determine local colloquial expressions for apprenticeship arrangements.

C20-C22. These questions provide an additional indicator of a job's formality by showing whether the pay is subject to income or social security tax. They also increase the accuracy with which after-tax pay can be measured. Previous LSMS surveys were vague about whether they were measuring before-tax or after-tax pay. The answers to Questions C20-C22 make it possible to calculate take-home pay for all workers but allow calculation of pretax pay only for workers whose initial answers reveal pretax pay. (This reflects the assumption that some workers may find it difficult to report their pretax pay, which they never see.) When survey designers believe workers can generally report both before- and after-tax pay, a question should be added asking respondents who have provided their after-tax pay about their pretax pay.

C23-C24. These questions improve the measurement of cash wages.

C25. This question is of interest in determining average hourly earnings as indicators of the extent to which employers can provide direct pay incentives for hard work.

C26-C28. These questions improve the measurement of cash earnings. The periodicity question is asked somewhat differently here than in previous LSMS surveys, in the hopes of eliciting accurate data.

C29-C37. These questions elicit information on several payments made in kind, including payments in the form of meals, housing, clothing, and "other." Qualitative research should be done to determine the most important types of in-kind payments in the region to be surveyed.

C38-C40. These questions elicit indicators of nonwage benefits.

C41. This question measures the length of respondents' tenure with their employer. This information is useful for studying job stability as well as wage determination. Although the question is phrased in a more precise way than in previous LSMS surveys, it could be amplified further. For example, it could be rephrased as "For how long have you worked continuously for this employer?" or "For how long have you worked for this employer with breaks of no more than three months?" This draft module does not use such questions because the first may be confusing and the second is clumsy.

C42. This question identifies seasonal workers who have multiyear arrangements with employers. The question is of greatest interest in regions where policymakers are concerned with seasonal migrant groups. Answers to the question may help researchers confirm the validity of data from individuals who report having worked for their current employer for many years but for only a few months during the previous 12 months.

C48-C49. These questions identify whether the worker has completed formal job training or is still in a "formal training period" in his or her job. Again, this is useful for studying wage determination. Identifying the occupations or industries in which formal training programs are important might be useful for identifying sectors in which analysts might wish to conduct case studies.

C50-C51. These questions determine whether an individual had a second occupation in the past 7 days.

C52-C83. These questions are asked of all respondents with a second occupation in the last 7 days. The questions are a subset of Questions C2-C48; thus the corresponding comments above apply to them.

C84-C86. Any income from a third, fourth, or higher number of occupations in the past 7 days is collected here.

C87-C89. These questions determine whether the respondent should be asked questions in Part D, and if so, which question in Part D to begin with.

PART D: MAIN AND SECONDARY JOBS IN PREVIOUS 12 MONTHS (WHEN DIFFERENT FROM JOBS IN PREVIOUS 7 DAYS). In this section, interviewers should fill out one line for any activity marked with a 1 or a 2 in Question B18. The questions in Part D ask for much less detail about the respondent's earnings than does the 7-day reference period section, although Part D still attempts to gather the summary measures necessary for estimating the respondent's annual earnings from wage employment. The comments on almost all of the individual questions in this section are the same as those that apply to the corresponding questions in Part C. See the introductory notes to Part C for a discussion of why 12-month reference period questions are separated into two sections (the overview in Part B and the detailed questions in Part D).

D21-D22. These questions aim to find out why the jobs that were important to the respondent during the previous 12 months were not carried out in the previous 7 days. One reason might be that the respondent changed from one nonseasonal job to another, in which case the data will indicate the relative importance of quits and fires and the relative importance of job separations that do and do not cause workers to become unemployed. Another reason may be that the job in question was a seasonal job, in which case the data will indicate whether the seasonal diversification of activities is important in the country or region studied.

Expanded Version

Many of the questions in the expanded version are also in the short and standard versions, and virtually all of the questions in the short and standard versions are also in the expanded version. For comments on questions in the expanded version that are also in the short or standard versions, see the comments above in the subsections that describe those versions. The comments in this subsection are for questions that appear only in the expanded version.

PART A: LABOR FORCE PARTICIPATION.

A23-A24. These questions ask about channels through which people found jobs that they held in the past 7 days. (Such questions could instead be asked in Part C, where they could be associated with particular employers.)

A25-A29. These questions are used to measure total work experience, which tends to be an important explanatory variable in studies of wage determination. Question A25 ensures that only individuals who reported having no work during the previous 12 months are asked if they have ever worked.

A30-A31. These questions determine whether or not individuals were working five years earlier. Such questions could be left until the beginning of the five-year retrospective section, but it is useful to note which individuals were working five years ago early in the interview, before the respondents become tired. Fatigued respondents may be reluctant to admit having worked five years ago, knowing that such an admission is likely to open them up to a new battery of questions.

A32-A33. These questions help the interviewer determine which sections of the module, if any, to next administer to the current respondent.

PART B: OVERVIEW OF WORK IN THE PREVIOUS 7 DAYS AND THE PREVIOUS 12 MONTHS. See the general comments on the purpose of Part B in the subsection above that describes the short version of the employment module. Since all of the questions in Part B of the expanded version are also in the standard version, see the comments in the subsection on the standard version (as well as the comments in the subsection on the short version).

PART C: MAIN AND SECONDARY JOBS IN THE PREVIOUS 7 DAYS.

C6-C7. These questions ask about aspects of the worker's relationship with his or her boss. This is useful in some studies of wage determination.

C15. This question is of interest primarily in countries where certain intermediaries are exempted from various aspects of labor law.

C16-C17. These questions provide a more detailed look at the relevance of labor law. Labor laws typically define several types of employment contract, making it difficult to fire workers hired under certain types of contracts and restricting the conditions under which other types of contracts may be used. In some countries Question C16 would be phrased quite differently. In Brazil it would be phrased as: "Do you have a signed workers' card (carteira assinada)?"

C18. This question is a rough attempt to determine the respondent's perception of how long the job would last at the time he or she first took the job. This is of interest for determining whether jobs outside the scope of job security legislation (whether legally or illegally) are more or less stable than jobs within the scope of that legislation. Qualitative research should be used to determine the most useful way of asking this question.

C19-C21. These questions yield simple indicators of working conditions, which are useful in studying wage determination.

C57-C60. These questions measure the respondent's experience in his or her current industry (that is, with any employer in the same line of business, whether the respondent's work responsibilities were the same or different) and in his or her current occupation (that is, doing the same kind of work, whether for employers in the same line of business or a different one). The distinction between the two could be made clearer by including explicit references to the industry and occupation that the respondent reported in Questions B1 and B2, but this is cumbersome. The more important of the two experience measures is the industry-specific experience measure. The effects of this measure on wages are of interest for the light they shed on the nature of training problems. If experience acquired while working for different employers in the same industry increases a worker's productivity and wages in the current job, there is reason for concern that poaching problems lead to underinvestment in training within the industry, and that policies facilitating industry-based training cooperatives might help mitigate this problem.

Both industry- and occupation-specific experience measures are also useful for studying broader questions about wage determination, because it is important to control as thoroughly as possible for the skills that workers bring to their jobs (for example, in studying possible segmentation or discrimination).

C61-C63. These questions aim to shed light on the overall importance of skilled labor, on the relative importance of formal education in providing workers with the skills they need, and on whether one kind of enterprise in effect trains the staff of other enterprises (for example, informal household enterprises and small establishments providing skilled labor to larger, more formal establishments).

C64-C65. These questions identify where workers first started gaining experience within their occupations. Again, the questions aim to shed light on relationships among different types of enterprises. If workers tend to stay with the employers who first provided them with experience and training, there is reason to think that training produces "firm-specific" skills that can be developed only under long-term employment arrangements. This may mean that labor laws and macroeconomic policies that make long-term employment arrangements unattractive discourage training.

C66-C68. These questions identify whether a worker is still in a "training period" in his or her job (either informal or formal), and if not, how long ago that period was. This is useful for studying wage determination and for assessing the importance of on-the-job training in the economy.

C69-C73. These questions identify whether the worker is still in a "formal training period" in his or her job and if not, how long ago such a period was. Again, this is useful for studying wage determination. Identifying the occupations or industries in which formal training programs are important could be useful for identifying sectors in which analysts might wish to conduct case studies.

C74-C75. These questions determine whether an individual has had a second occupation in the past 7 days.

C76-C117. These questions are asked of all respondents who have had a second occupation in the past 7 days. These questions are a subset of Questions C2-C72, so the corresponding comments above apply to them.

C118-C120. Any income in the past 7 days from a third, fourth, or higher number of occupations is collected here.

C121-C123. These questions determine whether the respondent should be asked questions in Part D and, if so, which question in Part D to begin with.

PART D: MAIN AND SECONDARY JOBS IN PREVIOUS 12 MONTHS (WHEN DIFFERENT FROM JOBS IN PREVIOUS 7 DAYS). In this section interviewers should fill out one line for any activity marked in Question B18 with a 1 or a 2 that has not already been discussed in Part C. This section of the module asks for much less detail about the respondent's earnings than does the 7-day reference period section, although this section still attempts to gather the summary measures necessary to estimate the respondent's annual earnings from wage employment. The comments on almost all of the individual questions in this section are the same as those that apply to the corresponding questions in Part C. See the introductory notes for Part C for a discussion of why 12-month reference period questions are separated into two sections (the overview of Part B and the detailed questions in Part D).

PART E: OVERVIEW OF WORK DONE FIVE YEARS PREVIOUSLY. In this section, interviewers should fill out one line for each activity reported by each respondent who was identified by Question A30 as having worked five years ago. The interviewer must fill in the respondent's identification code and number the activities sequentially. Parts E and F are much shorter than the sections on current jobs because they emphasize describing the sectors in which the respondents' work activities took place, rather than quantifying the respondents' labor supply or earnings (which respondents may have difficulty recalling accurately). The comments on many specific questions are the same as those for corresponding questions in previous sections.

E5. Whereas respondents' first and second most important work activities in the previous 7 days or the previous 12 months were determined by the number of hours devoted to each work activity, the respondents' first and second most important work activities 5 years ago are determined simply by asking them.

PART F: MAIN AND SECONDARY JOBS FIVE YEARS BEFORE THE INTERVIEW. In this section interviewers fill out one line for each activity marked with a 1 or a 2 in Question E5. The comments on most individual questions are the same as those for corresponding questions in previous sections. See the introductory notes to Part C for a discussion of why questions about this reference period are split into an overview (Part E) and detailed questions (Part F).

F4-F5, F29-F30. These questions determine whether the respondents' first and second most important work activities of five years ago are the same as their first and second most important work activities during the previous 7 days or 12 months. Such a determination is useful for two reasons. First, it provides a more accurate way to infer which respondents did and did not change jobs over the last five years than simply trying to match industry and occupation codes. Second, it allows an interviewer to skip some questions about aspects of the respondent's job that are unlikely to have changed.

F17-F18. These questions are asked even if the respondent continues to work for the same employer at the time of the survey, because they pertain to features of employment that may have changed over time.

F18. This question checks whether the job is still going on, so that questions about why the job ended will be asked only of respondents for whom such questions are relevant.

Notes

The author gratefully acknowledges research assistance provided by Meera Mehta and Jaana Remes, and comments from many people, including Jere Behrman, Richard Blundell, Catherine de Fontenay, Paul Glewwe, Margaret Grosh, Anjini Kochar, Tom MaCurdy, Alberto Martini, Andrew McKay, Anne Royalty, John Pencavel, Jo Van Biesbroeck, and Wim Vijverberg.

1. Answering such questions is only a first step, because establishing the correlation of a policy change with labor market changes does not prove that the policy change caused the labor market changes. Giving careful thought to the timing of the changes and to the likely importance of other potential explanations for the labor market changes is often helpful in determining whether or not the policy change was an important cause of the labor market change.

2. The need for a worker to sacrifice earnings early on to obtain higher earnings later may arise for several reasons. The initial wage may be low because the worker is also receiving free training that will enhance his or her productivity. Alternatively, employers may use the promise of higher wages later to motivate workers to work hard or to provide workers with reasons not to quit. Sacrificing early wages by spending more time in open unemployment may also help workers find higher-paying jobs later in life.

3. The effective wage lies below the actual wage to the extent that program transfers fall as earned income rises. Nonlabor income is increased by the size of the maximum transfer.

4. A potential drawback to this approach is that it does not permit collection of detailed information about the individual's work in the same occupation for two different employers. Comparing the wages earned by the same individual in the same occupation for employers in both the public and the private sector (or for work in both large and small establishments) can be useful for assessing the likely importance of segmentation, because it enables the measurement of wage differences that cannot be the result of differences in workers' abilities. However, the number of individuals performing the same job for employers in different sectors is unlikely to be large in an LSMS sample, so this is unlikely to be a major loss.

5. If wages depended only on variables that also belong in labor supply and job choice relations directly, predicted wages would be linear functions of other variables on the right-hand side of the labor supply and job choice relations, and the regressions would suffer from perfect multicollinearity. For more on simultaneity and the use of instrumental variables see Chapter 26.

6. Because changed wages today may influence work and sector choices later in life, one might want to estimate "dynamic" labor supply models that allow for the full range of such effects. Only panel data allow estimation of the full set of parameters required for tax and transfer policy analysis under such circumstances. Useful subsets of those parameters (for example, allowing prediction of the effects of permanent but not transitory policy changes) can be estimated with repeated cross-sectional data, especially when good estimates of consumption are available (see MaCurdy 1985). At this point the full dynamic models requiring panel data are tractable only under very strong assumptions that make them unattractive for studying developing countries.

7. Rodgers, Brown, and Duncan (1993) find that measurement errors are more likely to be correlated over time for an individual, and thus are less likely to lead to spurious transitions, when observations are collected retrospectively in one interview than when they are collected in interviews at different dates.

8. Panel data are sometimes thought useful for a very different reason. When economic conditions are not changing much over time, panel data allow analysts to use "fixed effects" econometric procedures, which under certain strong assumptions eliminate potential biases in the cross-sectional econometric relationships that arise out of the failure to measure all relevant individual characteristics. Such methods are less useful than is sometimes thought. They aggravate biases associated with measurement error (which are substantial), and they are based on assumptions (for example, that the unobserved characteristics do not change over time and that the relationship being estimated is stable over time) that are often faulty. Thus the more pressing rationale for collecting panel data is interest in intrinsically dynamic questions like those raised by issues 30-36 of Table 1. See Chapter 23 for a discussion of panel data uses.

9. International Labor Organization studies also suggest that accurate measurement of labor supply in a reference period is facilitated by the use of question sequences that require respondents to make a full accounting of their time use in the reference period (Hussmanns 1990). Unfortunately this, too, is beyond the scope of the employment module of a living standards survey. Chapter 22 discusses possible reasons for including in the survey a time diary module, which could prompt the respondent to give this complete accounting.

10. Previous LSMS surveys asked detailed questions about in-kind payments, employer characteristics, and working conditions for the respondent's main and secondary jobs during the previous 12 months as well as for his or her main and secondary jobs during the previous 7 days. Requiring respondents to provide details on the 12-month reference period jobs came at some cost (increasing the chance that respondent fatigue would lead to poor data quality in the rest of the questionnaire) and brought little benefit, as the detail has not been used much by researchers. Looking across rural and urban areas in several LSMS countries, the percentage of individuals with complete 7-day reference period data who reported additional 12-month reference period activities (and whose observations in the dataset were thus put at greater risk of fatigue problems by having to provide detailed 12-month reference period information) was often in the range of 5 to 10 percent, and ran as high as 28 percent.

11. In the Côte d'Ivoire survey the age limit was six years. The percentage of children who reported having done some work during the previous 7 days was approximately 4 percent at the age of seven, 9 percent at the age of eight, 20 percent at the age of ten, 23 percent at the age of twelve, and 40 percent at the age of fourteen.

References

Bulow, J., and L. Summers. 1986. "A Theory of Dual Labor Markets with Application to Industrial Policy, Discrimination and Keynesian Unemployment." Journal of Labor Economics 4 (3): 376-413.

Hall, R. E. 1982. "The Importance of Lifetime Jobs in the United States Economy." American Economic Review 72 (4): 716-24.

Harris, J., and M. Todaro. 1970. "Migration, Unemployment and Development: A Two-Sector Analysis." American Economic Review 60 (1): 126-42.

Heckman, J. J., and J. Hotz. 1986. "An Investigation of the Labor Market Earnings of Panamanian Males: Evaluating the Sources of Inequality." Journal of Human Resources 21 (4): 507-42.

Hussmanns, R., F. Mehran, and V. Verma. 1990. Surveys of Economically Active Population, Employment, Unemployment and Underemployment: An ILO Manual on Concepts and Methods. Geneva: International Labour Office.

Lam, D., and R. Schoeni. 1993. "Effects of Family Background on Earnings and Returns to Schooling: Evidence from Brazil." Journal of Political Economy 101 (4): 710-40.

MaCurdy, T. E. 1985. "Interpreting Empirical Models of Labor Supply in an Intertemporal Framework with Uncertainty." In J. J. Heckman, ed., Longitudinal Analysis of Labor Market Data. New York: Cambridge University Press.

Mroz, T. 1987. "The Sensitivity of an Empirical Model of Married Women's Hours of Work to Economic and Statistical Assumptions." Econometrica 55 (4): 765-99.

Rodgers, W. L., C. Brown, and G. J. Duncan. 1993. "Errors in Survey Reports of Earnings, Hours Worked and Hourly Wages." Journal of the American Statistical Association 88 (424): 1208-18.

Sicherman, N. 1990. "The Measurement of On-the-Job Training." Journal of Economic and Social Measurement 16 (4): 221-30.

Schaffner, J. A. 1999. "Job Stability in Developing and Developed Countries: Evidence from Colombia and the United States." Tufts University, Fletcher School of Law and Diplomacy, Medford, Mass.

Schaffner, J. A. 1998. "Premiums to Employment in Larger Establishments: Evidence from Peru." Journal of Development Economics 55 (1): 81-113.

Schaffner, J. A. 1997. "The Sensitivity of Wage Equation Estimates for a Developing Country to Changes in Sample Selection Model Specification." Stanford University, Department of Economics, Palo Alto, Cal.

Stiglitz, J. E. 1974. "Alternative Theories of Wage Determination and Unemployment in LDC's: The Labor Turnover Model." Quarterly Journal of Economics 88 (2): 194-227.

Strauss, J., and D. Thomas. 1995. "Human Resources: Empirical Modeling of Household and Family Decisions." In T. N. Srinivasan and J. R. Behrman, eds., Handbook of Development Economics. Vol. 3. Amsterdam: North Holland.

10 Anthropometry

Harold Alderman

Anthropometry, the measurement of human growth and size, is widely considered to be a noninvasive, inexpensive way to assess the nutritional status of large samples of individuals. By providing information on one dimension of an individual's health status, it can reflect his or her intake of nutrients and morbidity history.
These are important dimensions of welfare that can influence the consumption and investment choices of the household of which the individual is a member.

Anthropometry can be used in clinical settings both to make medical diagnoses and to assess whether individuals are eligible to be included in targeted programs.1 However, the anthropometric measures of nutritional status (AMNS) that are derived from multitopic household surveys differ appreciably from clinical screening techniques, as the clinical techniques are directly linked to the individuals being measured while multitopic household surveys are a research tool based on a representative and anonymous sample. The data derived from these surveys have been used to raise public awareness of particular nutritional issues and to inform the analysis and evaluation of policies aimed at combating the causes of malnutrition (Alderman 1995).

The first section of this chapter reviews the policy issues that can be addressed using anthropometric survey data. The second section deals with what anthropometric data are needed for effective policy analysis. The third section discusses the draft anthropometry module (in Volume 3), and the fourth section provides observations and notes about this module.

Policy Issues Regarding Anthropometric Survey Data

Anthropometric measures of nutritional status can usefully augment the limited portrait of living standards that is revealed by the money value of goods and services consumed (UNDP 1990, Ravallion 1993). For example, Steckel (1995) shows how historical patterns in adult height and weight shed light on patterns of economic growth over periods of up to two centuries. In addition, the fact that there is no perfect correlation between AMNS and either national income levels or national income distribution is often used to distinguish countries that are atypical or to motivate research to account for this atypicality. In places such as Sri Lanka or the Indian state of Kerala, the provision of public services has led to higher levels of health than might have been expected given their aggregate level of income or rates of poverty (Anand and Ravallion 1993). On the other hand, nutritional status in some countries has not improved as rapidly as might have been expected given the countries' income growth, perhaps indicating a need to make specific investments in human resources (Alderman and Garcia 1994).

Moreover, unlike other indicators of living standards for which data are collected only at the household level, anthropometric measures can provide insights into the distribution of resources within the household, on both a gender and a birth order basis, because the data are collected on an individual basis. While the evidence for gender-specific patterns of childhood malnutrition is mixed,2 patterns and anomalies can both be used to focus attention on questions of intrahousehold allocation.

This section, a review of policy issues, begins with a discussion about using anthropometric data to indicate both the welfare of a population or subpopulation and the success or failure of poverty and health interventions. The section goes on to discuss the use of AMNS to indicate the consequences of malnutrition and analyze the determinants of malnutrition in a population. Finally, the section examines how these data can be used to design more effective interventions to combat malnutrition.

Using Anthropometry to Assess Poverty

Descriptive statistics on the anthropometry of children are regularly used in poverty assessments and development statistics, usually to show rates of malnutrition. International comparisons are often cited to argue that a given country needs to pay more attention to the health and nutritional status of its population. However, the strength of this argument depends in part on whether anthropometry provides a common way of measuring health and nutritional status that is more accurate than the exchange rates (purchasing parities) used to compare income and poverty data.

In many studies, aggregate anthropometric measures are presented for children under five years of age. However, more disaggregated analysis of AMNS for this age group can reveal patterns by age, patterns influenced by factors such as birth weight, weaning practices, and exposure to pathogens. This degree of disaggregated reporting can often be found in the results of a survey such as a Demographic and Health Survey, which collects little economic data but more data on child care practices than a household survey would normally collect. Where results are available from a series of repeated surveys, it is possible to compare the status of a cohort of children over time, possibly indicating the impact of changing economic conditions over short periods.

In cases where the heights of adults are reported, these figures are considered to be an index of economic welfare (Fogel 1994; Steckel 1995). Strauss and Thomas (1998) show that economic trends can be indicated with cross-sectional survey data by plotting the heights of cohort groups. Figure 10.1 presents the usage of this approach for Vietnam. This illustrates both the long-run trend and the leveling off of the trend possibly due to intensification of the civil war. It also indicates a convergence over time between the heights of individuals in the north and the south part of the country. However, because the use of completed heights requires a time lag of 20 years, adult heights tell us little about current economic conditions.

Figure 10.1 Height of Adults in Vietnam, 1992-93
[Figure: height (cm) plotted against year born, 1900-1970, with separate series for Males, South; Males, North; Females, South; and Females, North.]
Note: Individuals were categorized as from the North or from the South based on their place of residence at the time of the survey.
Source: Author's calculations from Vietnam Living Standards Survey.

Body mass index (BMI), which is defined as an individual's weight in kilograms divided by the square of the individual's height in meters, is another measure of adult health that can be derived from anthropometric data. BMI is highly correlated with many health-related indicators, including mortality risk (Calle and others 1999; Gibson 1990; Waaler 1984; Fogel 1994). However, unlike most measures of children's nutritional status, BMI values represent increased health risks at both low and high levels. Low BMI can indicate hunger. High BMI indicates obesity and resulting risks of high blood pressure, diabetes, and stroke. Thus, for some countries, the number of adults with high BMI levels may be as indicative of public health problems as the number of adults with low BMI levels.

The same measures of anthropometry that can be used to assess poverty can also be used for targeting interventions or for assessing whether targeting criteria are adequate. For example, it is a straightforward process to use survey data to calculate the percentage of a transfer or the percentage of a subsidized commodity that accrues to households with malnourished members. However, it is less straightforward to determine how many people would have been malnourished had the transfer or subsidy not been in place. Moreover, as Glewwe and van der Gaag (1990) demonstrate, different measures define different individuals or households as poor. Thus, just because a program does not meet its nutritional targets does not necessarily mean that the program is not meeting the objectives that its proponents intended it to meet.

Modeling the Consequences of Malnutrition

Analysis of AMNS can both demonstrate the consequences of malnutrition and indicate the returns to investments in health. The value of an investment in nutrition is most directly evident if the contribution of malnutrition to other health indicators, such as infant and child mortality, can be determined (Pelletier 1994). However, it is easier to find an association between malnutrition and mortality than to find a causal relationship.

Not only do investments in nutrition directly contribute to human welfare by improving the health of their recipients, they also enhance the efficacy of other government investments. For example, improving children's nutritional status can boost school attendance and enhance students' ability to learn (Pollitt 1990; Behrman 1996). To explore this possibility, it is necessary to know the relationship between nutrition and schooling decisions or outcomes (Glewwe and Jacoby 1995; Alderman and others 1997).

The relationship between nutrition and schooling decisions or outcomes is most easily shown with data from samples specially designed to evaluate the effect of a program.3 Nonetheless, national, cross-sectional survey data, with suitable controls in the analysis for simultaneity in households' decisionmaking processes, can show the association of nutrition status and schooling decisions or outcomes (Glewwe and Jacoby 1995). These relationships can be explored using either data from a special purpose sample survey designed to evaluate specific programs or national, cross-sectional survey data and analysis techniques that control for simultaneity in the decisionmaking process.

Similarly, AMNS can be used to show the relationship of nutrition to worker productivity, measured in terms of wages or agricultural output. Studies that use household-level data to show this link include Thomas and Strauss (1997) and Haddad and Bouis (1991). In addition, Fogel (1994) uses historical patterns to back up his claim that improved nutrition accounts for an appreciable share of income growth in the past two centuries.

These studies of the consequences of malnutrition generally require analyses that place nutrition on the right-hand side of regressions. While height is generally taken as predetermined in wage regressions, it is harder to justify treating weight as exogenous in such equations or to claim that any measure of nutrition is not simultaneously determined with schooling choices. This imposes analytical challenges that cross-sectional data only partially address. These challenges are discussed in greater detail below.

Using Anthropometry to Assess Interventions and Policies

Anthropometric measures of nutritional status can be used to demonstrate the success of a program or policy in reducing malnutrition in a given country. They can show whether there is a need for explicit investments in nutrition to augment the gains that might be expected from a labor-intensive growth strategy or find out whether a proposed transfer program would have an appreciable impact on the nutritional status of the population. Alternatively, they can be used to predict how general changes in economic conditions will affect health status or how expanding access to education can change the population's nutritional status. In all of these cases, it is first necessary to derive from the survey data an estimate of the impact that increased income or improved education have on nutritional status. That parameter is then multiplied by the size of the gain expected in income from growth or from the hypothetical welfare program.4

Similarly, causal models of malnutrition can be used to measure how price changes affect malnutrition and, thus, to assess how removing a subsidy or changing the exchange rate will affect it. Analogously, such models can be used to ascertain whether certain types of infrastructure or specific programs contribute to reducing malnutrition (Strauss and Thomas 1995).

Since an individual's nutritional status is the outcome of a complex process of household decisions, data on nutritional status can also be used to make inferences about how a household allocates its resources. For example, Horton (1988) explores how birth order affects malnutrition, providing both specific insights on nutrition and general insights on household resource constraints. Similarly, Thomas (1994) explores the difference in the impact on a child's AMNS of the amount of household income controlled by mothers and by fathers and how this impact differs if the child is a boy or a girl. The study by Thomas presents strong evidence that resources are not fully pooled within a household but rather that some members may retain control over the disposition of resources that they bring into the household.

Another study in which AMNS are used to model general issues of household decisionmaking is Pitt, Rosenzweig, and Hassan (1990). This study uses a weight for height production function to derive an individual's health endowment. The study is longitudinal, meaning that it uses data collected from a number of successive rounds of the same survey. Once the individual health endowment is derived, it is used to indicate how households reduce internal inequality by transferring output from the household members with the highest endowments to members with lower endowments (see also Filmer 1995).

In another study using AMNS, Foster (1995) creates a model for weight changes in Bangladeshi children over several survey rounds. This model is used to indicate how sensitive households are to income shocks caused by flooding. Although the primary contribution of this study is in terms of understanding credit markets, it also provides a useful perspective for understanding child health.

Foster and Rosenzweig (1994) use adult nutrition as an outcome measure in their study of contractual arrangements for workers in the Philippines. Unlike most of the studies cited above, this paper does not aim to improve health policies but rather investigates the incentives to work provided by different labor contracts. However, the underlying hypothesis of the authors' discussion is that work effort is reflected in AMNS. Thus the authors use AMNS as an objective measure of a personal choice that is otherwise difficult to observe.

Many of the studies that use LSMS data to model nutritional outcomes use reduced-form estimations. These have only exogenous (predetermined) variables on the right-hand side and are less data-intensive than properly modeled production functions. They provide the net impact of the exogenous (predetermined) factors and are often suitable for determining the effect of a particular intervention or service. For example, they can show the net impact of the availability of a prenatal clinic in the community without actually modeling who uses the clinic or the direct impact of people's use of the clinic.

Other studies of the determinants of malnutrition in children adopt production function approaches, which use household decisions about health care to explain health outcomes. These approaches can be used to study the impact of specific behaviors such as smoking, breastfeeding, or using oral rehydration salts. More than most approaches, production function approaches are stable over changing economic circumstances and amenable to making extrapolations beyond existing conditions (Rosenzweig and Schultz 1988). However, these approaches require specific, detailed data on all the elements in the production function. While multitopic household surveys often contain sufficiently detailed estimates of prices to identify the inputs into production functions, the identification process is not always precise. Production function approaches are most informative when they have well-defined inputs. Many key inputs into nutrition are not collected in the short LSMS questionnaire, including individual nutrient intakes, breastfeeding practices, the timing and frequency of feeding, and the use of oral rehydration salts.

Data Needs

This section discusses what data are needed for anthropometric analysis and the extent to which data from LSMS-type surveys can fill these analytical needs.

What Measures for What Ages?

Since the publication of Waterlow and others (1977), a distinction has generally been made between stunting (low height for age), which is considered a measure of long-term or chronic malnutrition, and wasting (low weight for height), which is considered a measure of acute malnutrition. A third category, weight for age, contains some of the information from both these measures and is still commonly reported.5 There is no apparent correlation between levels of stunting and levels of wasting in a population (Victora 1992). This is a bit of a puzzle since a cumulative measure of health such as height should reflect the sum over time of short-term health status, which wasting should measure. There have been few attempts to use household survey data to solve this puzzle.

HEIGHT FOR AGE. Stunting (low height for age) is an indicator of long-term or chronic malnutrition. All LSMS surveys to date that have collected AMNS have included the height for age measure, and all analyses of AMNS data have discussed height for age results. Therefore, it is not necessary to address here whether height for age should be included in modules on anthropometrics.

The measurement of height will be further discussed below in the context of age-specific concerns. However, it should be noted that it is strongly recommended not to seek self-reported heights or to ask parents to indicate the heights (or weights) of their children. Strauss and Thomas (1996) report that even with high-quality data from the United States there is a systematic bias in heights reported by parents. Low-income households underestimate their children's heights more than do higher-income parents, which can result in an exaggeration of the impact of income on height. Strauss and Thomas also indicate that height variance appears much greater when it is self-reported.

WEIGHT FOR HEIGHT. Wasting (low weight for height) is, in principle, an indicator of acute malnutrition and thus may be a sensitive indicator of short-term response to changing conditions. However, this is not necessarily desirable in cross-sectional analysis. Multipurpose household surveys are not generally designed to differentiate short-term fluctuations from long-term conditions. The variation in prices and infrastructure in a cross-sectional data set has short-term and long-term components that are hard to distinguish from each other. To overcome this and make the most of the information on short-term nutritional status contained in weight for height measures, it may be necessary to adopt panel approaches.

Weight for height measures probably work best when there are repeated measurements over relatively short periods of time. One of the few studies to explicitly choose weight for height for its sensitivity to short-term health (Pitt, Rosenzweig, and Hassan 1990) uses repeated observations of this measure as well as of two other measures of nutritional status (skin folds and upper arm circumference). Foster (1995) also specifically examines short-term nutritional status, but he uses changes in weight, not weight for height.

The importance of data from the community modules is highlighted by comparing the results in Tables 10.1 and 10.2. These tables show that regressions using height for age or weight for height as a dependent variable fail to explain much of the variance in the sample, as indicated by the low values of R2 unless cluster fixed effects (which capture the effects of community infrastructure) are included. (Cluster fixed effects will be discussed later in this chapter.) Moreover, the coefficients on income are quite different between the weight for height and height for age regressions. In two cases the signs on the variables for the impact of maternal education differ depending on which measure of nutritional status is used. While differences in results do not, by themselves, indicate that one measure is more accurate than the other, the regressors used to produce Table 10.2 were not tailored to the measurement of acute conditions, so it is presumed that the height for age long-run measure is more suitable for the analysis reported.

WEIGHT FOR AGE. Although weight for age is still a commonly used indicator, its use has declined, in part because it is viewed as unable to distinguish between chronic and acute malnutrition. It may, however, regain some favor with wider use of accurate battery or solar-powered digital scales. Such scales are particularly useful for young children, whose weights can be calculated conveniently by first weighing the mother or caretaker with her child and then weighing her alone. Often the scale performs this calculation, and hence the subtraction does not introduce additional error. Nevertheless, relatively few analyses of weight for age have been done linking this measure with economic data from LSMS surveys. Table 10.3, which uses the same sample and models as Tables 10.1 and 10.2, indicates that the results from analyses of weight for age data correspond more closely to results based on height for age than to results based on weight for height. Even if weight for age provides little information beyond that already provided by height for age, there is little additional cost to collecting weights along with heights. Moreover, as discussed below, in a number of analyses, weights can be used as additional instrumental variables to increase the accuracy of height coefficients.

Table 10.1 Sensitivity of Coefficients in Three Models to Explain Height for Age, Selected LSMS Surveys

                                                              Coefficient of   Coefficient of   Coefficient of   F test for
                                                              logarithm of     maternal         paternal         price
Country and model                                             income           education        education        coefficients      R2

Vietnam (1992-93)
  Community fixed effects model                               0.232 (3.132)    0.018 (1.722)    0.016 (1.726)                      0.283
  Model without fixed effects                                 0.373 (6.422)    -0.003 (0.273)   0.008 (0.870)                      0.225
  Model without fixed effects but including commodity prices  0.284 (4.614)    0.005 (0.475)    0.013 (1.496)    3.59 [6, 2,594]   0.232

South Africa (1994)
  Community fixed effects model                               0.273 (5.517)    0.012 (1.282)    0.007 (0.826)                      0.239
  Model without fixed effects                                 0.355 (8.276)    0.015 (1.565)    0.002 (0.185)                      0.125
  Model without fixed effects but including commodity prices  0.337 (7.661)    0.015 (1.582)    0.002 (0.207)    2.51 [6, 3,277]   0.129

Pakistan (1991)
  Community fixed effects model                               0.189 (2.641)    0.057 (5.247)    0.012 (1.638)                      0.304
  Model without fixed effects                                 0.325 (5.390)    0.060 (6.328)    0.021 (3.003)                      0.142
  Model without fixed effects but including commodity prices  0.382 (5.968)    0.061 (6.253)    0.01 (2.483)     5.24 [7, 3,749]   0.148

Morocco (1991)
  Community fixed effects model                               0.239 (2.168)    0.038 (2.137)    0.005 (0.150)                      0.338
  Model without fixed effects                                 0.296 (3.172)    0.040 (2.423)    0.006 (0.153)                      0.192

Note: Numbers in parentheses are t-statistics and numbers in brackets are degrees of freedom. Regressions include variables for age, gender, and the interactions between age and gender. Parental heights are modeled where available. Regressions for Vietnam additionally include variables for race.
Source: Author's calcu ations from LSMS data for each country ARM CIRCUMFERENCE. Mid-upper arm circumference a compelling reason to prefer one measure over anoth- is a measure that gauges both fat reserves and muscle er. Indeed, Zerfas (1991) claims that mid-upper arm mass. In contrast to the bulky scales and measuring circumference adjusted for age is relatively insensitive boards needed to collect other measurements, it to errors in recording age. In practice, standardizing requires little equipment other than a calibrated tape mid-upper arm circumference for age is less conven- measure (Zerfas 1991). Thus, in places where inter- ient than standardizing height or weight for age, since viewers have to carry anthropometric equipment long no prepackaged software that makes the adjustment distances to sample households, it may be more prac- easily is yet available. tical to measure mid-upper arm circumference than to A few studies have indicated that mid-upper arm measure weight and height. Moreover, in longitudinal circumference is not strongly correlated with weight surveys mid-upper arm circumference has at least as for height (WHO 1995). However, the absence of strong a correlation with subsequent mortality as do such a correlation does not seem to rule out the valid- other measures (Vella and others 1993). ity of mid-upper arm circumference as a measure of The World Health Organization (WHO; 1995) nutritional status; height for age, also not strongly cor- recommends against using unadjusted mid-upper arm related with weight for height, is widely used as such circumference because it is not age-independent. a measure. So farVietnam is the only country where a However, conceptually it is a simple matter to adjust LSMS survey has included mid-upper arm circun-fer- the measurement for age using the data provided in ence. In Table 10.4, both individual correlations of WHO (1995). 
Since this standardization is routinely anthropometric measure for children in Vietnam and done for heights and weights, by itself it does not seem the correlation of rates in the 148 sample clusters are 256 CHAPTER 10 ANTHROPOMETRY Table 10.2 Sensitivity of Coefficients in Three Models to Explain Weight for Height, Selected LSMS Surveys Coefficient logarithm Coefficient of Coefficient of F test for of income maternal education paternal education price coefficients RI Vietnom (1992-93) Community fixed effects model 0.290 -0.073 -0.074 0.128 (0.701) (1.363) (1.800) Mo without fixed effects 0.208 -0.037 -0.058 0.080 (0.541) (1.086) (1.280) Model without fixed effects but 0.383 -0.095 -0.066 0.96 0.080 including commodity prices (0.953) (1.106) (1.506) [6,2.594] South Africa (1994) Community fixed effects model 0.085 0.001 0.004 0.250 (1.627) (0.093) (0.570) Model.without.fixed.effects 0.047 0.005 0.004 0.05-3 (1.062) (0.602) (0.415) Model without fixed effects but 0.023 0.01 0.008 13.28 0.077 including commodity prices (0.512) (1.088) (0.889) [6, 3,277] Pakiston (1991) Community fixed effects model 0.155 0.001 0.009 0.256 (2.547) (0.156) (1.454) ................................ ......................................... .....9-,......................................," .............................. O.................... .............................................. ..6 025 Model without fixed effects 0.198 0.001 0.009 0.025 (3.801) (1.453) (0.505) ......................... ................................................................................................................................................................................... *........... 
Model without fixed effects but including commodity prices: 0.165 (2.980), -0.006 (0.722), 0.003 (0.532), 8.23 [7, 3,749], 0.044
Morocco (1991)
Community fixed effects model: 0.072 (0.836), -0.041 (3.055), -0.008 (0.301), 0.267
Model without fixed effects: 0.219 (2.988), -0.027 (2.158), 0.004 (0.141), 0.018
Note: Numbers in parentheses are t-statistics and numbers in brackets are degrees of freedom. Regressions include variables for age, gender, and the interactions between age and gender. Regressions for Vietnam also include variables for race.
Source: Author's calculations from LSMS data for each country.

reported. While cluster rates are based on very small numbers and thus only roughly indicate similar correspondences in larger samples, available statistics show that rates of low weight for height are poorly correlated with other measures at the cluster level. Thus the table reinforces the view that weight for height reflects a very different dimension of community nutrition than is reflected in height for age or weight for age. Mid-upper arm circumference may be a more short-term measure than height, as muscle mass and fat reserves can both decrease as well as increase. Height, however, is as much a measure of past nutritional shocks as of current circumstances, and does not appear to be as closely related to weight for height as it is to other measures.

There is little evidence on how mid-upper arm circumference correlates with the economic variables collected in integrated household surveys, since this measure has rarely been collected in LSMS surveys. Nor is mid-upper arm circumference often used in studies linking economic data with nutrition. Pitt, Rosenzweig, and Hassan (1990) use this measure, not to study such links but in a statistical technique that requires an over-identifying instrument. Table 10.5 repeats three of the regressions reported for Vietnam in Tables 10.1-10.3, using mid-upper arm circumference standardized for age as the dependent variable. The coefficients of income and maternal education in the preferred fixed effects estimates are moderately similar to the corresponding regression with height for age as a dependent variable. The same bias can be seen in the coefficient of income relative to the fixed effect estimate observed in other AMNS.

It is not possible to take these comparisons further since the other LSMS data sets do not include mid-upper arm circumference as a variable (although a survey currently being prepared in Paraguay will do so). However, many of the analyses that have been carried out to date with other LSMS data could easily have been undertaken using mid-upper arm circumference data as well. Given this possibility, the strongest argument against the widespread use of mid-upper arm

Table 10.3 Sensitivity of Coefficients in Three Models to Explain Weight for Age, Selected LSMS Surveys

| Survey | Model | Coefficient of logarithm of income | Coefficient of maternal education | Coefficient of paternal education | F test for price coefficients | R2 |
|---|---|---|---|---|---|---|
| Vietnam (1992-93) | Community fixed effects model | 0.167 (0.370) | 0.019 (2.517) | 0.001 (0.011) | | 0.412 |
| Vietnam (1992-93) | Model without fixed effects | 0.198 (4.036) | -0.01 (1.488) | -0.002 (0.247) | | 0.351 |
| Vietnam (1992-93) | Model without fixed effects but including commodity prices | 0.154 (3.592) | 0.017 (2.403) | 0.002 (0.253) | 4.36 [6, 2,594] | 0.357 |
| South Africa (1994) | Community fixed effects model | 0.223 (4.740) | 0.008 (0.944) | 0.010 (1.267) | | 0.277 |
| South Africa (1994) | Model without fixed effects | 0.239 (6.046) | 0.013 (1.565) | 0.004 (0.587) | | 0.128 |
| South Africa (1994) | Model without fixed effects but including commodity prices | 0.210 (5.135) | 0.016 (2.011) | 0.008 (1.028) | 8.89 [6, 3,277] | 0.144 |
| Pakistan (1991) | Community fixed effects model | 0.220 (3.958) | 0.033 (3.952) | 0.016 (2.955) | | 0.347 |
| Pakistan (1991) | Model without fixed effects | 0.330 (7.170) | 0.025 (3.134) | 0.016 (0.016) | | 0.146 |
| Pakistan (1991) | Model without fixed effects but including commodity prices | 0.335 (6.718) | 0.030 (3.677) | 0.014 (2.558) | 7.44 [7, 3,749] | 0.158 |
| Morocco (1991) | Community fixed effects model | 0.215 (2.463) | -0.014 (1.118) | -0.009 (0.358) | | 0.346 |
| Morocco (1991) | Model without fixed effects | 0.366 (5.163) | -0.002 (0.191) | 0.002 (0.088) | | 0.180 |

Note: Numbers in parentheses are t-statistics and numbers in brackets are degrees of freedom. Regressions include variables for age, gender, and the interactions between age and gender. Regressions for Vietnam also include variables for race.
Source: Estimate from LSMS data.
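The gap between the community fixed effects rows and the rows without fixed effects in Table 10.3 is the pattern one would expect if an unobserved community characteristic were correlated with income. A minimal sketch on simulated data (not LSMS data; the function name, sample sizes, and the values 0.8, 0.3, and 0.20 are assumptions invented for illustration) compares a pooled slope with a within-cluster slope, which is numerically equivalent to including a dummy variable for each cluster:

```python
import numpy as np

def pooled_and_within_slopes(y, x, cluster):
    """Return (pooled OLS slope, within-cluster slope) for a single regressor."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    # Pooled slope: ordinary least squares with one common intercept.
    xc, yc = x - x.mean(), y - y.mean()
    pooled = (xc @ yc) / (xc @ xc)
    # Within slope: subtract cluster means first (same as adding cluster dummies).
    _, idx = np.unique(np.asarray(cluster), return_inverse=True)
    counts = np.bincount(idx)
    xw = x - (np.bincount(idx, weights=x) / counts)[idx]
    yw = y - (np.bincount(idx, weights=y) / counts)[idx]
    within = (xw @ yw) / (xw @ xw)
    return pooled, within

# Simulated survey: every number below is hypothetical.
rng = np.random.default_rng(0)
n_clusters, n_per_cluster = 150, 20
cluster = np.repeat(np.arange(n_clusters), n_per_cluster)
services = rng.normal(size=n_clusters)  # unobserved community services
log_income = 0.8 * services[cluster] + rng.normal(size=cluster.size)
true_beta = 0.20                        # assumed true income effect
z_score = (true_beta * log_income
           + 0.3 * services[cluster]    # services also improve nutrition
           + rng.normal(size=cluster.size))

pooled, within = pooled_and_within_slopes(z_score, log_income, cluster)
print(f"pooled slope: {pooled:.3f}   within-cluster slope: {within:.3f}")
```

Under these assumptions the pooled slope absorbs part of the effect of the omitted community variable, while the within-cluster slope stays close to the assumed income effect. Actual survey regressions would also carry the age, gender, and education controls listed in the table notes.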
Table 10.4 Correlation of Anthropometric Measures for Children 0-60 Months, Vietnam, 1992-93

Correlation over individuals

| Variable | Mean value | Rate of malnutrition | Height for age | Weight for age | Weight for height | Mid-upper arm circumference |
|---|---|---|---|---|---|---|
| Height for age | -1.89 | 49.9 | 1.00 | - | - | - |
| Weight for age | -1.62 | 40.8 | 0.75 | 1.00 | - | - |
| Weight for height | -0.59 | 5.8 | -0.02 | 0.62 | 1.00 | - |
| Mid-upper arm circumference | -1.84 | 49.7 | 0.42 | 0.67 | 0.50 | 1.00 |

Correlation of rates over clusters

| Variable | Height for age | Weight for age | Weight for height | Mid-upper arm circumference |
|---|---|---|---|---|
| Height for age | 1.00 | - | - | - |
| Weight for age | 0.68 | 1.00 | - | - |
| Weight for height | 0.07 | 0.21 | 1.00 | - |
| Mid-upper arm circumference | 0.42 | 0.53 | 0.21 | 1.00 |

- These correlations are the same as the corresponding correlations below the diagonal of this 4x4 matrix.
Source: Estimate from LSMS data.

Table 10.5 Three Models to Explain Mid-Upper Arm Circumference, Vietnam, 1992-93

| Model | Coefficient of logarithm of income | Coefficient of maternal education | Coefficient of paternal education | F test for price coefficients | R2 |
|---|---|---|---|---|---|
| Community fixed effects model | 0.183 (3.144) | 0.026 (2.841) | 0.007 (0.922) | | 0.384 |
| Model without fixed effects | 0.351 (7.394) | 0.006 (0.689) | -0.008 (0.397) | | 0.277 |
| Model without fixed effects but including commodity prices | 0.251 (4.931) | 0.012 (1.473) | 0.000 (0.006) | 4.42 [6, 2,594] | 0.284 |

Note: Numbers in parentheses are t-statistics and numbers in brackets are degrees of freedom. Regressions include variables for age, gender, and interactions of age and gender.
Source: Estimate from LSMS data.
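The two panels of Table 10.4 rest on different calculations: one correlates z-scores child by child, while the other first converts each cluster to a malnutrition rate and then correlates those rates across clusters. A rough sketch follows, using simulated z-scores. The function name, the simulated values, and the cutoff of -2 standard deviations used to define a malnutrition rate are assumptions made here for illustration; they are not taken from the table.

```python
import numpy as np

def individual_and_cluster_correlations(z, cluster, cutoff=-2.0):
    """z: (children x measures) array of z-scores; cluster: one id per child.

    Returns the correlation matrix over individuals and the correlation
    matrix of cluster-level rates of children falling below `cutoff`.
    """
    z = np.asarray(z, dtype=float)
    _, idx = np.unique(np.asarray(cluster), return_inverse=True)
    corr_individuals = np.corrcoef(z, rowvar=False)
    below = (z < cutoff).astype(float)  # 1.0 if child is below the cutoff
    counts = np.bincount(idx)
    rates = np.column_stack([
        np.bincount(idx, weights=below[:, j]) / counts
        for j in range(z.shape[1])
    ])
    corr_clusters = np.corrcoef(rates, rowvar=False)
    return corr_individuals, corr_clusters

# Simulated children: all values are hypothetical.
rng = np.random.default_rng(1)
n_clusters, n_per_cluster = 40, 25
cluster = np.repeat(np.arange(n_clusters), n_per_cluster)
community = rng.normal(scale=0.5, size=n_clusters)[cluster]
height_for_age = -1.9 + community + rng.normal(size=cluster.size)
weight_for_height = -0.6 + rng.normal(size=cluster.size)
weight_for_age = (0.75 * height_for_age + 0.6 * weight_for_height
                  + rng.normal(scale=0.3, size=cluster.size))
z = np.column_stack([height_for_age, weight_for_age, weight_for_height])

corr_ind, corr_clu = individual_and_cluster_correlations(z, cluster)
print(np.round(corr_ind, 2))
print(np.round(corr_clu, 2))
```

Because cluster rates are computed from a handful of children each, the second matrix is much noisier than the first, which is consistent with the caution in the text that cluster rates are based on very small numbers.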
circumference data appears to be that it has not been used widely so far. In other words, this measure cannot be used as a poverty indicator because there are too few examples for cross-country comparisons to be made. And mid-upper arm circumference statistics are rarely used to inform policy probably only because the indicator is relatively new. Thus it would be appropriate in the future to more frequently include this indicator in household survey questionnaires.

BIRTH WEIGHTS. While it is well known that birth weights are a strong predictor of the subsequent size of children (Adair 1989), it is generally not possible for multitopic household surveys to gather birth weight information in any systematic way. Birth weight data are occasionally recorded on health cards that are issued by clinics and kept by a baby's parents, but in most developing countries the percentage of children born in a clinic is usually too small to provide a reliable indicator without correcting for sample selection. In countries where these data are available for a significant portion of the population, it would be advisable to take advantage of this opportunity to collect them. However, in most cases, designers of LSMS surveys should assume that this will not be the case.

MEASURING YOUNG CHILDREN. AMNS collection is most important for preschool children since these children are particularly at risk of malnutrition. However, certain difficulties arise when measuring the nutritional status of very young children. Because children under 2 years of age are too young to stand properly, their length must be measured while they are lying down. The international reference tables accommodate this difference in measurement techniques (Dibley and others 1987).

There are also some practical considerations regarding the measurement of very young children. Cultural taboos related to the evil eye and negative feelings regarding strangers visiting a child are most common in the case of the very young. In five data sets investigated (Ghana, Morocco, Pakistan, South Africa, and Vietnam), the probability of a missing height measurement for a preschool child declines significantly with age, with most of this decline coming between the first and second year of life. In some cultures, dead children are buried on boards that resemble the measuring boards used by interviewers, which makes parents understandably reluctant to see their live children lying on these measuring boards. Other considerations in measuring very young children are the timing of the most recent feeding or the fact that parents may not allow their child's swaddling or diapers to be removed. While the failure to remove clothing can affect the recorded weights of older children too, clothing represents a higher percentage of the weight of a very young child. Also, while misreporting of age can throw off the interpretation of the anthropometric measurements of all children, it is especially problematic for very young children; because very young children grow extremely rapidly, even a small error in assessing their age can result in a significant error in assessing their nutritional status.

These problems justify treating children younger than one year old differently than slightly older children, in analysis as well as in data collection. However, the problems do not justify excluding younger children from the sample for which anthropometric data are collected. Indeed, given the long-lasting impact of early childhood nutrition, identifying and understanding the determinants of nutrition for this age group is one of the most important tasks for analysts.

At times a lack of resources or technical capacity makes it difficult to collect weight and height information. In such cases it is sometimes suggested that the problem be addressed by measuring only children under the age of 36 months. The proponents of this solution justify it on the grounds that peak vulnerability occurs in these years. However, while this may be an appropriate way to prioritize in clinical work, it is not as appropriate in survey work because the costs of locating other, slightly older, children are not high. The key issue for determining the appropriate age cutoff is not how easy it is to take the relevant measurements but international comparability and correlation with observable prices and infrastructure.

MEASURING ADOLESCENTS. There is little experience with collecting data on individuals between the ages of 6 and 20. At the low end of this range, these data differ only slightly from data on younger children and can be useful, for example, to explain why some children are enrolled in school at the age of six and others are not. However, by the time a child reaches the prepuberty growth spurt (at approximately age 9), analyzing his or her nutritional status becomes more problematic. While international growth reference curves are available for the height of children up to age 18, they are only available for weight for height of girls up to age 10 and boys up to age 11. Moreover, little work has been done on assessing international patterns or ethnic differences within a given country in the timing of the onset of puberty.

Thus the value of collecting anthropometry for adolescents in household surveys is unproven. However, two arguments can be made in favor of including it in multitopic household surveys. First, increasing the amount of data available on this age group may indicate the data's usefulness with high relative returns, particularly in the case of panel studies that might shed light on how nutrition and schooling affect each other or on whether adolescent children can overcome stunting that occurred when they were younger (Golden 1994; Martorell and others 1992). Second, if the cost of collecting anthropometric data largely consists of the fixed cost of finding the household, including adolescents when measurements are being taken of preschool children and adults is relatively inexpensive.

MEASURING ADULTS. Since rates of childhood malnutrition are a more widely monitored indicator of welfare than adult BMI or height, measuring adults is often regarded as of secondary interest for policymaking purposes. Yet measuring adults is useful both for analyzing policy (for example, assessing the impact of nutrition on labor productivity) and for understanding the determinants of the nutritional status of children and other members of the family. Cross-tabulations or multivariate regressions that explain child nutrition may be misleading if adult heights are not available, since parental, especially maternal, heights are significant explanatory variables in many studies of children's nutrition. Moreover, adult stature may correlate with the income and welfare of populations. And adult heights and body mass indexes can be used to provide information on the long-term returns to investments in nutrition.

In multitopic household surveys, stature and weight of adults are less frequently collected than stature and weight of children. One possible reason for this is that it may be more difficult to collect such measures from adults. If children are weighed using hanging scales, the measurement of adults requires that the team carry an additional, somewhat less portable, scale. However, as mentioned above, digital scales are currently available that are accurate for both children and adults.

Another problem with measuring adults is that it may be more difficult to find all adults in the vicinity of the household during the time of the interview. The better educated (as well as healthier) adults are liable to hold jobs that prevent them from being present when the field team visits the respondents' home. Or these adults' jobs may have caused them to migrate. There is a gender bias in availability of adults to be measured; while only 4.7 percent of mothers were missing in the Morocco sample, over 30 percent of fathers were. Even in the Vietnam LSMS, a data set with a remarkably low number of missing measurements overall, more than five times as many fathers as mothers were not measured. Missing height measurements were especially high for males between 20 and 25; in this age group over 10 percent of the sample was missing.

As a result of the relative scarcity of adult measurements in the field, there is less existing evidence on adult health, as represented by BMI, than there is on child malnutrition. (Studies of adult BMI are included in Thomas, Lavy, and Strauss 1996 and Lavy, Thomas, and de Vreyer 1996; a few studies, such as Pitt, Rosenzweig, and Hassan 1990, pool children and adults.) Adult BMI models are conceptually similar to models of child nutrition, although children may be more passive recipients of care than adults. However, some additional factors can be measured for adults. Total family income (or expenditure) is more clearly simultaneously determined with adult BMI than it is with nutrition of children. In a related vein, Higgins and Alderman (1997) indicate that since the energy outlay of labor affects BMI, the failure to include a measure of time allocation or energy intensity in a study of adult health may result in biased income coefficients. And pregnancy and lactation need to be included in models of female BMI.

Table 10.6 shows the differences in income and education coefficients derived from three LSMS surveys with and without parents' heights, using the cluster fixed effect models (with relevant coefficients repeated for convenience).6 It is apparent that in both Vietnam and Morocco excluding the parental height variable leads to an appreciable bias in the coefficient of per capita expenditures, while the education coefficients do not seem to be affected. In the case of Pakistan, for which only mothers' heights are available, the difference in the coefficient of income is relatively small when the mother's height is excluded.

Table 10.6 Effect of Omitting Parental Height in Three Models of Height for Age, Selected LSMS Surveys

| Survey | Model | Coefficient of logarithm of income | Coefficient of maternal education | Coefficient of paternal education | R2 |
|---|---|---|---|---|---|
| Vietnam (1992-93) | Community fixed effects model with both parents' heights | 0.232 (3.132) | 0.018 (1.722) | 0.016 (1.726) | 0.283 |
| Vietnam (1992-93) | Community fixed effects model with mother's height only | 0.257 (3.495) | 0.018 (1.707) | 0.017 (1.792) | 0.276 |
| Vietnam (1992-93) | Community fixed effects model excluding parents' heights | 0.283 (3.816) | 0.019 (1.791) | 0.017 (1.739) | 0.259 |
| Pakistan (1991) | Community fixed effects model with mother's height only | 0.189 (2.641) | 0.057 (5.247) | 0.012 (1.638) | 0.304 |
| Pakistan (1991) | Community fixed effects model excluding parents' heights | 0.204 (2.834) | 0.057 (2.196) | 0.015 (2.098) | 0.299 |
| Morocco (1991) | Community fixed effects model with both parents' heights | 0.239 (2.168) | 0.038 (2.137) | 0.005 (0.150) | 0.338 |
| Morocco (1991) | Community fixed effects model with mother's height only | 0.253 (2.277) | 0.036 (2.025) | 0.004 (0.112) | 0.326 |
| Morocco (1991) | Community fixed effects model excluding parents' heights | 0.358 (3.263) | 0.039 (2.136) | 0.004 (0.116) | 0.305 |

Note: Numbers in parentheses are t-statistics. Regressions include variables for age, gender, and the interactions between age and gender. Regressions for Vietnam also include variables for race.
Source: Estimated from LSMS data.

The inclusion of parents' heights does not appear to be as important in preventing bias in the mid-upper arm circumference regression. When heights were omitted in the regression reported in the first row of Table 10.5, the coefficient of income rose only to 0.191. Using parental arm circumference as a regressor is not appropriate since, unlike height, parental arm circumference is affected by the same factors that determine child arm circumference and thus will contribute to biased results.7

What Data Complement Anthropometric Data?

Anthropometric data alone are not sufficient to analyze a number of key nutritional issues. Therefore, the anthropometric data collected in most multitopic household surveys need to be supplemented by data either from other parts of the survey or from outside sources. The LSMS surveys have an advantage over many other sources of anthropometric data in that they gather information on other aspects of the household's well-being, including total expenditure and educational attainment.8 As is increasingly recognized by epidemiologists and nutritionists as well as economists, this feature makes it possible to be discriminating in making causal statements about observed correlations (Briscoe, Akin, and Guilkey 1990). This subsection of the paper reviews what kinds of data are needed to complement the AMNS.

OTHER HEALTH INDICATORS. Many aspects of micronutrient status or deficiencies that affect mortality and productivity are not revealed by anthropometric measurements. Therefore, to do a full study of the nutritional status of a population, it is necessary to collect data on the levels of vitamin A or iron in the blood. Collecting this data is not usually cumbersome or costly, as there are some cheap and convenient field collection methods. Iron status, for example, can be indicated using a blood sample taken by means of a single pinprick. However, experience suggests that research protocols are not always strictly observed by all survey interviewers in the field. The consequences of such deviations can be particularly serious when blood is being collected. Moreover, informed consent becomes both more important and more difficult to obtain when even marginally invasive techniques are used. Thus the collection of blood samples is more suited to specialized clinical studies than to living standards surveys such as the LSMS.

The consumption of micronutrients is, in general, hard to indicate in multitopic household surveys, in part because most food-related data are collected at the household level and thus have to be aggregated over all household members and all meals. This is important as the preparation of the food and the timing of its consumption influence the absorption of micronutrients. However, it is possible to test for the iodization of salt by asking households to provide a small amount of salt to be tested in the field (as was done in Peru in 1995) so that researchers can ascertain within reasonable bounds whether the household is receiving an adequate amount of iodine.

Some inferences on nutrient intake can be made using food recall data. To provide this data, respondents attempt to remember what food the household has eaten during, say, the past 24 hours. Although LSMS surveys have not traditionally included food recall questions, it would be feasible to include them in a multitopic household survey. Moreover, these data need not be collected on a household basis; they can also be collected on a per meal basis or for an individual or subset of individuals in the household. Indeed, food recall data collected on an individual basis may measure household consumption more accurately than do food expenditures (Bouis and Haddad 1992). If so, they may help identify the contributions of household resources and public health measures toward preventing malnutrition.

A number of new techniques for assessing adult health have recently been introduced into household surveys. These include self-reported assessments of respondents' ability to perform daily activities ranging from walking one kilometer or lifting heavy objects to simple activities such as eating, bathing, or dressing. Other techniques include measuring respondents' lung capacity. The various techniques are discussed at length in Chapter 8.

It is possible to include the anthropometric measures discussed in this chapter in the health module rather than in an independent module. Conceptually, it matters little which choice is made. The main issues that should determine this have to do with the organization of the fieldwork. LSMS surveys often involve anthropometrists who collect only anthropometric data, are well trained in the technique, and are each issued a set of field equipment. If the anthropometric module were incorporated into the health module, either anthropometrists would need to be trained in general interview techniques or interviewers would need to be trained (and equipped) for anthropometry. Linking the two modules might make it easier to use the time of all the members of the field team more efficiently, reducing the chance that one team member would have to wait while his or her colleagues completed the other aspects of the survey. Moreover, putting objective measurements in the same module as questions that require reported answers makes it clearer which data have been reported by the individual him or herself and which have been reported by other family members.

The fertility modules within LSMS surveys have often included questions on breastfeeding and immunization that are important for anthropometric analyses (see Chapter 15). The general practice is to put these questions in the fertility module because they refer not only to living children but also to any children who have died. If a survey does not include a fertility module, it would be useful to include questions on breastfeeding and immunization within the anthropometry module.

A simple yes or no question is seldom sufficiently informative about breastfeeding; interviewers must ask when the breastfeeding began and ended, and whether the child was exclusively breastfed during this period. As the phrase "exclusive breastfeeding" is ambiguous, the wording of the question should distinguish among various means of providing fluids (both milk and water) and elicit the date on which the child was first introduced to solid foods. For times when a child was not breastfed, a distinction should be made between cup and spoon feeding and bottle feeding. Often questions about breastfeeding also ask whether colostrum was given to the child.

INCOME AND CONSUMPTION. Consumption questions are included in all multitopic household surveys, while income questions are included in most such surveys. When consumption and income data are analyzed in combination with anthropometric measures of nutritional status, a few issues need to be considered.

Some difficulties may arise if researchers are interested in calculating the availability of nutrients (and if food consumption recall is not included in the module). Generally, food consumption in LSMS surveys is recorded in terms of expenditures. Using community prices to calculate quantities of goods consumed by a household requires researchers to make a tacit assumption that no differences in quality exist. To the degree that prices vary by income level, as they might if a household can choose a commodity from a range of similar commodities of different qualities (for example, with grains or meat), using average prices will underestimate the quantities obtained by low-income consumers. Recent analysis has shown that the quality range is limited in the case of commodities that are defined more precisely (rice, for example, rather than the broader category of grains). The bias in measurement of calories will thus be smaller if the consumption module lists individual foods rather than classes of foods (Subramanian and Deaton 1996; Alderman in Lipton and van der Gaag 1993). Nevertheless, if there is a possibility that prices differ by household characteristics (for example, with poor households buying in small quantities and thus unable to get bulk discounts), the module on food consumption should ask about the most recent price paid by the household.

Some LSMS surveys have inquired about households' recent purchases of food, the number of months households rely on home production, and households' average consumption from home production. Such inquiries do not allow an accurate assessment of current consumption because they do not indicate if home production was being consumed during the period covered in the survey recall. However, most LSMS surveys have collected data on purchases for and consumption from production (as well as in-kind wages), using the same recall period in both cases.

Income and expenditures are often measured imperfectly, in some cases leading to underestimation of the effect of income or expenditures in regressions of the determinants of nutritional status. Chapters 5, 11, and 17 in this book discuss ways to improve the measurement of income and consumption. Yet even without improving income and consumption measurements, it is possible to correct for errors in income and consumption using other information in the multitopic survey.

The main way to do this is to use the data on household assets from various parts of the questionnaire (such as the modules on durable goods, housing, household enterprises, agriculture, and savings) to apply instrumental variables methodologies. These techniques can also help overcome another problem: the biases in the regressions of the determinants of nutritional status sometimes introduced by income or income proxies due to simultaneous choices. Not only is adult health likely to be dependent on income, income may also reflect the health (and size) of the household members. While such reverse causality is less obvious in the case of the nutrition of young children, it can be hypothesized that household earnings decrease when a child is ill because adult members of the household must stop their income-earning activities to care for the sick child.

COMMUNITY DATA. Community data are very important for nutritional analysis. Yet they remain a part of data collection that needs to be strengthened. (See Chapter 13 for a broader discussion of collecting community data.)

In the community questionnaire, LSMS surveys have typically collected a range of data pertaining to nutritional status, including prices of food and medicine and distance to clinics and doctors. Alternatively, summary variables have been constructed for these indicators by taking the means or medians of the observations (from households within a sample cluster) from various parts of the household questionnaire. However, more specialized questions are needed to monitor disease pathogens or prevalence rates or to gauge the quality of services at health centers. Also, without measuring the availability or rationing of services, the data on prices of medicine may be misleading.

The importance of community data can be seen in the fact that their availability changes the estimated impact of other factors such as income and education. If wealth correlates with the availability of services that are excluded from the regression, some of the effects attributed to income or education may in fact be due to these services. Table 10.1 uses four LSMS data sets to give some indication of this bias. The regressions in Table 10.1 are run with different specifications. One model includes community fixed effects (community means of both observed and unobserved variables). A second model, the conventional model, does not include community information. The coefficients of per capita income in models that regress children's standardized heights for age on a number of regressors in the two models are dramatically different. If the fixed effects models are considered to be unbiased, the bias in models without fixed effects can be as high as 70 percent (as in the case of Pakistan). In other cases the bias is smaller, but these models include race and rural-urban distinctions that may also proxy for service availability. (These results differ somewhat from those reported in Gertler, Glewwe, and Ponce 1998, as the age grouping differs.) In the examples in Table 10.1 the bias on income is always upwards, as it should be if the availability of types of social infrastructure that improve nutrition is positively correlated with income. Thus it is important to include questions on infrastructure in the survey.

The relative magnitudes of the R2 values of the fixed effects regressions in Table 10.1 also indicate that a great deal of information is shared among the clusters. The cluster fixed effects approach, however, does not make it possible to infer much beyond the presence of some common factors such as prices, altitude, disease vectors, health and sanitation infrastructure, the quality of staff and management in the local clinics, and information and preferences shared among households. More likely, the cluster effects reflect a combination of these common elements.

COMMUNITY DATA ON HEALTH SERVICES. LSMS surveys have not usually contained much information on health infrastructure, which has made it difficult for analysts to use LSMS data to devise specific policy recommendations in this area (see Chapter 8). Although some attempts have been made to measure access (usually as defined by distance) to health services, so far only a few multitopic surveys have sought information on the quality of health services. The main reason for this omission is that quality is often difficult to define and clear and accurate information on quality may be hard to get from individual respondents (unlike, say, information on prices or distances). Moreover, for modeling purposes, information is needed not only on the facilities used by the household but also on facilities the household could have chosen to use. Such information is particularly problematic in urban areas, where there may be a range of facilities from which households can choose.

Two studies of nutritional status (Thomas, Lavy, and Strauss 1996; Lavy, Thomas, and de Vreyer 1996) study the determinants of malnutrition using data from special modules that contain measures of service quality. Even with such modules, however, quality (or even availability) measures must be handled judiciously. For example, Thomas, Lavy, and Strauss (1996) compare two measures of staffing in Côte d'Ivoire: the number of staff members listed in official records and the number present in the 24 hours preceding the interview. While the actual number of doctors present was found to affect child health favorably and significantly, the number on the books appeared to have no impact on child health. Failing to take this into account may mean that incorrect inferences will be made about which policies are most effective.

In addition (or as an alternative) to collecting better data on the availability and quality of infrastructure, it may occasionally be possible to link multitopic survey data with other databases such as geographic information systems. For example, efforts are currently underway to link the South African LSMS to the South African Health Management Database System (GIS), a comprehensive database of the country's health facilities and these facilities' staff. In general, however, GIS databases contain limited information on staffing and even less on the quality of services.

COMMUNITY DATA ON PRICES. Various policy measures that influence prices (of food, of drugs, or of medical services) also indirectly influence nutrition because the levels of these prices determine what products and services people can afford. A number of studies have used LSMS data to determine how consumers respond to changes in the price of services. These studies have measured how price changes affect demand for health care and the use of alternative service providers, rather than measuring impacts on health status. Nevertheless, anthropometric data from multitopic surveys can also be used to estimate the net impact of prices or service availability on nutrition.

While the prices and availability of services often have an effect on a population's nutritional status, including these factors in the regression does not necessarily mean that there are no biases in the coefficient of income. For example, the coefficients of income in the third row for each country in Table 10.1 differ appreciably from the community fixed effects estimates. This may be because the information on prices and clinic availability that has been collected in most LSMS surveys to date explains only a small share of the cluster-specific information. (An alternative way of looking at this issue is to regress the residuals from regressions similar to those in the third row for each country in Table 10.1 against the cluster fixed effects. When this is done, the F tests for the regressions are significant at conventional levels of significance.)

Moreover, the estimated impact of service availability or prices can be biased if no account is taken of differences in service quality. The measures of travel time contained in short LSMS data sets often do not indicate the quality of the services available at the destination (Strauss 1990). If, for example, higher prices or greater distances are associated with better-quality services, it is likely that the estimated impact of prices on the utilization of health care services or on nutritional status will be understated.

COMMUNITY DATA ON SANITATION. As in the case of health infrastructure, most LSMS surveys have not collected enough information on sanitation infrastructure to allow researchers to carry out nutritional analyses. Although it can be hypothesized that the quality and quantity of water delivery and waste removal affect nutrition, most household survey data allow for only broad testing of such hypotheses. This is sometimes done from housing modules by deriving dummy variables that indicate the source of water or the type of latrine; these analyses generally highlight the importance of sanitation. While such dummy variables are not included in the example presented in Table 10.1, including them would not appreciably close the gap in the coefficients of expenditures or in the R2 values. As with health care, the quality of such services is often hard to gauge unless the survey includes specific modules covering service providers or the analysis is carried out using additional data sources. Chapter 14 contains a discussion of how to enhance the traditional treatment of water and sanitation in LSMS surveys.

PANEL DATA ON NUTRITION. There are several types of analysis that would become possible with or could be enhanced by panel data on nutrition. If nutritional status is considered a stock variable and income is considered a flow variable, a regression of height on income really measures the cumulative impact on the stock of height of the flow of income over time. A model that examines flow (growth in height) regressed on other flows, such as income, might be more informative, but growth in height could only be calculated by taking the difference in height between two points in time in a panel data set. One of the few economic studies that model growth (in this case, growth in weight) is Foster (1995), a study of credit markets.

Panel data can be particularly useful for modeling a child's response to interventions.

Bhargava (1995, 1997). Bhargava's studies of adult and child nutrition aim to determine the effects of malnutrition on worker productivity and cognitive development, respectively. These two studies, along with Foster (1995), use data collected as part of multiround longitudinal surveys that differ appreciably from LSMS surveys. Nevertheless, with the possible exception of credit flows, the variables used in Foster's study are commonly collected in LSMS-type surveys. Bhargava's studies use information on individual food intake and, in one of the studies, psychological tests.

Occasionally it is desirable to model how past growth affects current growth (say, to test the hypothesis of catch-up growth or model the impact of short-term economic shocks). Since it takes two observations on the stock of height or weight to indicate the flow or change, questions about dynamics of growth may require a panel that provides at least three observations. LSMS surveys have occasionally carried out repeat observations of the same panel of households but have rarely provided more than two observations from each household. Thus only a few existing LSMS surveys are well suited to address issues of child growth or sequential consequences of health shocks.

Even when levels of nutrition, rather than growth, are being studied, having panel data allows analysts to address errors in measurement. It also allows analysts to distinguish the effects of nutrition from the effects of other human capital investments. Because household health and schooling investments are determined by the same allocation processes that the household applies to the use of its resources, in models of the demand for schooling conditional on nutrition it is a challenge to distinguish the effects of health from the effects of other household characteristics. One way of approaching the problem is to use prices in an instrumental variables method. However, all current prices arguably belong in a household's budget constraint,
This response is and thus are not useful as instrumental variables in this often age-specific (Lutter and others 1990; Sahn and context. As such, it is not possible in a cross-section to Alderman 1997). Few cross-sectional regressions are use current prices to identify nutrition in a simultane- likely to measure these differences.While it is standard ous model of schooling decisions; this may be possible practice to include age variables in the analysis of in panel data sets. (Glewwe and Jacoby 1995 use nutrition or to run regressions by age group, neither of maternal height as an identifying variable.) these approaches is entirely effective for exploring Taking height as a stock variable, one alternative whether the effects of a household resource or gov- to cross-sectional analysis is to use past shocks to iden- ernment program differ by age group. tify the impact of nutrition on current conditions.This Another approach to modeling the dynamics of approach depends on the availability of panel data that nutrition using lagged dependent variables is taken by include repeated measurements of nutritional status 265 HAROLD ALDERMAN and prices, varying over time and space.9 This above approach must also assume, consistent with most evi- Box 1O. 1 Policy Issues and Anthropometric Data dence, that catch-up growth is imperfect.While there Issues that can be analyzed usding LSMS survey data iS limited evidence on the difference between using * Monitoring the nonincome dimensions of poverty. current prices as opposed to lagged prices to identify * Assessing the nutritional status of the population over nutrition in models of current decisionmaking, time and space. Alderman and others (1997) provide an example for * Indicating and analyzing the distribution of resources by which the bias is shown to be appreciable. age and gender within a household. * Providing perspective on targeting errors. 
Summary * Gauging the role of incomes and prices in nutritional There are two main advantages to anthropometric status. * Gauging the role of education in nutritional status. data gathere in nationa, multitopic household - * Estimating the labor productivity returns to nutritional veys like the LSMS. The first advantage is the broad investments . range of analysis that can be done with these data because data on many other topics are gathered in the Issues thot con be analyzed using LSMS surveys with special- same survey at the same time. The second advantage ized modules (including a community questionnaire) is that these data come from a nationally representa- * Demonstrating the impact of the quality of health and tive sample. In contrast, clinic-based anthropometric sanitation infrastructure. * Measuring the degree to which private and public data are biased because the children who attend din- resources (including education) complement each iCS are not a random sample of the population. reore.icuigeucto)cmlretec Mcsoareover, oth axtenrandom sa let of the population. clother in raising the nutritional status of the population. Moreover, the extent and direction of the bias in clin- ic data is not generally known (Grosh, Fox, and Issues that require anolysis of additional non-LSMS data or Jackson 1991). speciaol samples There are also limitations to the anthropometric * Estimating the educational returns to nutritional invest- data gathered from LSMS-type surveys. For example, ments. it is not practical to take blood samples to measure * Conducting project evaluations. micronutrient levels in this kind of survey.And the rel- * Distinguishing between short-term and long-term responses to income, price, and health shocks. atively small samples (2,000-5,000 households) used * Identifying credit constraints. 
by these surveys make it inappropriate to disaggregate malnutrition rates to local levels such as the province Issues that connot be analyzed using household survey data or district-limiting the extent to which these data * Providing the necessary information to target interven- can be used to prioritize regions or districts for pro- tions at community or district levels. grams to reduce malnutrition. Note: To paraphrase Griliches ( 1984), mperfect data has the virtue of Box 10. lists the policy Issues that can and can- allowing a nesearcherto show his or her creativity. While this box implies not be analyzed with the anthropometric data from that LSMS surveys are relatively unsuited for shedding light on some LSMS-type surveys. issues, a researcher can often transcend these limitations with novel solu- tions. Some of the illustrations given in th s chapter are exceptions to the classifications in this table. The Draft Anthropometry Module As is virtually always the case, the choice of which single researcher, it is desirable to collect both variables to collect in a multitopic survey will be weights and heights, even though the conceptual determined by country-specific factors such as what advantages of having both measures have not been information already exists, what policy debates are fully realized in existing research. current, and what resources are available. However, it * To promote the use of weight and height measures is possible to make a few general suggestions based on as indicators of community health, these measures previous analytical work carried out using LSMS sur- should be collected for all children under the age of vey data: 60 months and preferably also for older children. * Since the range of analysis that can be undertaken The descriptive statistics that use this information with any data set exceeds the imagination of any should be disaggregated by age and gender. 
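As a minimal sketch of what disaggregating such descriptive statistics by age and gender might look like, the Python fragment below computes the share of children whose height-for-age Z score falls below -2, separately for each age band and sex. The reference means and standard deviations, the two age bands, the function names, and the sample records are all hypothetical placeholders chosen for illustration; they are not values from the international reference or from any LSMS survey.

```python
from collections import defaultdict

# Hypothetical reference mean and standard deviation of height (cm),
# keyed by (sex, age in months). Placeholder numbers only; a real
# analysis would take these from the international reference tables.
REFERENCE = {
    ("F", 12): (74.0, 2.6),
    ("M", 12): (75.7, 2.5),
    ("F", 36): (95.1, 3.8),
    ("M", 36): (96.1, 3.7),
}


def height_for_age_z(sex, age_months, height_cm):
    """Z score: (observed height - reference mean) / reference SD."""
    mean, sd = REFERENCE[(sex, age_months)]
    return (height_cm - mean) / sd


def age_group(age_months):
    """Assumed two-band split of the under-60-month population."""
    return "0-23 months" if age_months < 24 else "24-59 months"


def stunting_rates(children, cutoff=-2.0):
    """Percentage of children with Z below the cutoff, by (age group, sex)."""
    below = defaultdict(int)
    total = defaultdict(int)
    for sex, age_months, height_cm in children:
        key = (age_group(age_months), sex)
        total[key] += 1
        if height_for_age_z(sex, age_months, height_cm) < cutoff:
            below[key] += 1
    return {key: 100.0 * below[key] / total[key] for key in total}


# Hypothetical survey records: (sex, age in months, height in cm).
sample = [
    ("F", 12, 68.0), ("F", 12, 73.5),
    ("M", 12, 70.0), ("M", 12, 76.0),
    ("F", 36, 86.0), ("F", 36, 94.0),
    ("M", 36, 97.0), ("M", 36, 95.0),
]
rates = stunting_rates(sample)
for key in sorted(rates):
    print(key, rates[key])
```

With these made-up records, half of the girls in each age band and half of the younger boys fall below -2 Z, while none of the older boys do. A real tabulation would also apply the survey's sampling weights, which this sketch leaves out.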
* Adult height and weight should be measured, at least in full-length multitopic surveys. Without data on parents (particularly mothers), regression analyses explaining height and weight indicators for children are likely to be seriously biased. Collecting data on parental height and weight also provides useful insight into household decisionmaking.
* Any module studying adult health should measure weights as well as heights. However, since BMI can be very sensitive to intensity of labor, fertility choices, and sample selection, analysis of adult weights will need to consider these factors.
* If it is to be possible to analyze the determinants of both height for age and weight for height and to use instrumented anthropometric variables in analysis, the range of price and infrastructure variables currently collected by LSMS surveys will have to be expanded. One possibility would be to add modules on health delivery services or community programs to the community or facility questionnaires, but two other approaches could also be considered. First, in some countries it may be possible to make use of GIS and other regional administrative databases as well as LSMS data. (For communities living at high altitudes, the GIS data should include altitude information.) Second, a search for price series data can be undertaken concurrent with the preparation of the multitopic survey. Neither type of information is likely to correspond precisely to the survey's sample clusters, but it may be possible to determine which data collection points for prices should be associated with which sample clusters.
* It is better to use a relatively uncommon measuring mechanism, such as mid-upper arm circumference, than not to collect anthropometric measurements at all. The information that can be obtained with that technique, while not perfect, provides analysts with a sound basis from which to draw conclusions about child welfare. It is possible that the main reason why the mid-upper arm circumference measurement is unpopular with survey analysts is that they are not used to using it. If some future LSMS-type surveys take arm circumference measurements in addition to weight and height measurements, this will provide experiential evidence on the use of this measurement that can be taken into account in designing subsequent surveys.

Box 10.2 provides cautionary advice about the extent to which this module has been used in previous LSMS surveys, and the extent to which it has provided useful data in these surveys.

Box 10.2 Cautionary Advice

How much of the draft module is new and unproven? None. Data on the height, weight, and age of children have been gathered in LSMS and other surveys in many countries. It has been less common for surveys to collect these data for adults. Nevertheless, this has been done in enough countries that it is now a straightforward process. Mid-upper arm circumference has been measured in LSMS surveys only in Vietnam, but no difficulties were encountered in that country.

How well has the module worked in the past? Where high-quality data have been gathered, they have proved valuable for analysis. However, collecting anthropometric data is a painstaking exercise with logistical implications that must be carefully examined. Quality control, supervision, and extensive training for interviewers are essential. Scales and measuring boards must be procured (a process that can sometimes take a surprisingly long time) and these items must be carried around by the team members responsible for anthropometry. (Team members often complain about carrying the measuring boards, since these boards can be heavy and awkward.) Even after logistics have been taken care of, in some surveys the resulting data have not been of adequate quality. The final section of this chapter discusses how to maintain quality of anthropometric data; United Nations (1986) is the standard handbook on the topic.

Which parts of the module most need to be customized? The module is more standard than most in terms of what items of data to collect for each person. The main decision is whether to measure all household members or only young children. This decision should take into account both the analytical objectives of the survey and any logistical constraints. If the survey is done in a country where many people do not know how old they are, it may be necessary to devise a local calendar of important events and their dates to help accurately identify the ages of respondents.

Notes on the Anthropometry Module

This section addresses issues of how the LSMS survey questionnaire should be applied in the field and how descriptive data from the module should be presented.

Measurement Issues

Although anthropometric measurements are believed to be unbiased, objective indicators of the health of a population, in fact these measures are by no means error-free. Kostermans (1994) illustrates that even random errors in measurements can lead to biased estimates of malnutrition rates. The most common problems at the interview stage include: a household or community setting in which individuals are unable or unwilling to remove their clothes when they are weighed; the infrequent recalibration of scales; a tendency for interviewers to round weights up or down to the nearest kilo or half-kilo (less of a problem for height measurements since they are taken in centimeters, a unit that comprises a small portion of total height); and a tendency for interviewers to under-record the lengths of children under the age of two because the children are not extended to their full length while being measured.

In addition, errors are often introduced by imperfect age measures. Ages are frequently rounded up or down to the nearest year or half-year. And in some cases interviewers may record children as above the age cutoff for measurement (if there is such a cutoff) in order to reduce their own workload. This is less of a problem where birth or health records are available, as these provide a fairly reliable measure of age even if the registry of birth is delayed and somewhat inaccurate. This error declines as a percentage of actual age because the time since the original entry on the health record is usually known with certainty by the parent. In a longitudinal study the time interval since first measurement is subject only to error in an interviewer's recording of dates, and not to errors in parents' memories of birth dates.

Other than improving the training of interviewers or going back to households to check outlier measurements, there are no easy solutions to these problems. The United Nations' suggestion to improve age recall by constructing community events calendars to temporally situate births (United Nations 1986) is of limited use in a national sample as it is unlikely that such calendars will be equally pertinent for all households in a national sample. However, age recording can be improved by recording data in months rather than in years and months (which often leads to inconsistency in units) and by including a question on an individual's date of birth in order to double-check his or her current age. When the age determined from the community events calendar conflicts with the age determined from the given birth date (and clinic records are unavailable), a decision rule, based on independent validation of similar approaches, is needed. The author is not familiar with any examples of such a rule.

Kostermans (1994) and Gibson (1990) list the sources of the most common measurement errors, most of which could be solved by improving the training of interviewers. (An extensive discussion of field techniques can be found in United Nations 1986.) LSMS survey teams often include staff who specialize in anthropometric measurements; these people can be given intensive training. Another way to reduce error is for interviewers to repeat their measurements, taking the second measurement either immediately after the first measurement or at a later date. Repeating the collection of height and weight measurements (usually after two weeks) need not be very costly if the repeat visit coincides with the fielding of the expenditure module. The costs of repeat visits may also be minimized by repeating only a subset of the initial visits or revisiting only households whose observations were questionable. Nevertheless, repeating these measurements may yield fewer gains than might be anticipated. If, for example, interviewers repeat the same measurement mistakes they made the first time, or if they record the initial measurement a second time without actually taking a second measurement, the error will not be corrected.

An additional approach, inherent in LSMS surveys but seldom used in analysis, is to use redundant information as instrumental variables. Repeated measurements of height can be used as instrumental variables when height is a right-hand-side variable. Even in the absence of repeated measurements, heights can be instrumented with weights or mid-upper arm circumference. However, age standardization presents a problem, as age error will be common across measurements. And weight for height is not an appropriate instrument for height for age, as the two are not expected to be correlated. Worse, the two may be correlated due to a common measurement error in height.

The Presentation of Malnutrition Rates

Given that heights and weights of children are highly age-dependent, it is seldom useful to study means and distributions of height and weight measures that have not been standardized for age. Summary statistics are most commonly presented in terms of an international reference, which is included in the commonly used nonproprietary software produced and distributed by the Center for Disease Control in Atlanta and by the WHO.¹⁰

Although WHO (1995) advocates using a single international reference, the use of references based on a population of American children is often criticized as an inappropriate benchmark for assessing the health of children in developing countries. This criticism is not justified. It is often found that children from privileged or middle-class families in developing countries have height and weight distributions that do not differ from international references (WHO 1995; Habicht and others 1974; Graitcher and Gentry 1981). Also, the heights and weights of refugee children from Asian countries have been shown to converge rapidly with those of children in the United States as the refugee families face fewer economic constraints (Yip, Scanlon, and Trowbridge 1992). Moreover, the use of international references is consistent with the fact that across many countries there is a relationship between mortality risk and malnutrition as measured using such references (Pelletier 1994).¹¹

Height data are often reported in terms of Z scores for height for age, derived from the unit normal curve after subtracting the age- and gender-specific means of height from the observation and after dividing by the corresponding standard deviation. Malnutrition rates can then be presented in terms of the percentage of the population that falls below -2 Z scores. In the reference population 2.3 percent have Z scores below this level, while 16.0 percent are below -1 Z score. These levels might be expected for a normal population, and provide a basis for comparison. However, there is no sharp difference in risk of mortality or functional impairment at this or any other commonly used cutoff level. Thus there is a need to determine and present the distribution of anthropometric status and not just malnutrition rates.

Malnutrition rates are often reported in terms of the number of children below a stated percentile of the reference distribution or the number whose heights are less than a certain percentage of the median height. The second way of reporting is not as useful as the first since it does not allow a comparison to be made with an expected distribution. For example, a point at 90 percent of the reference height will represent a different percentile of the reference population depending on the child's age (Waterlow and others 1977). The difference in reported malnutrition can be substantial; data from the 1987 Ghana Living Standards Study show that the rate of malnutrition among rural (urban) children was 34.8 (22.0) percent using -2 Z scores, while it was 22.8 (12.3) percent using heights below 90 percent of the reference median (Alderman 1990).

Because low as well as high values of BMI have health consequences, summary statistics are often presented in terms of ranges. For example, James, Ferro-Luzzi, and Waterlow (1988) suggest that BMI levels between 18.5 and 23.0 should be considered normal, with values above 23.0 indicating that an individual is overweight. Individuals falling below 18.5 are assigned to one of three categories of energy deficiency ranging from grade I (BMI between 18.5 and 17.0) to grade III (BMI below 16.0). There is some indication that the distribution in a population may vary by gender; in Ghana, women were more likely than men to be classified as undernourished and as overweight (Alderman 1990). This result, in conjunction with the fact that Waaler's data on mortality do not show increased mortality for individuals with BMI around 23.0, suggests that 23.0 is a low cutoff point for defining obesity.

Notes

The author would like to express gratitude to Nauman Illias for the assistance provided in preparing this chapter. Jere Haas and David Pelletier provided valuable advice on technical points. The author also thanks John Strauss, Duncan Thomas, and Alfred Zerfas for comments on an earlier draft.

1. Excellent discussions of a broad range of issues regarding the anthropometric assessment of individuals and populations can be found in WHO (1995) and Gibson (1990). A review of the ways anthropometry has been used in monitoring or promoting the growth of children can be found in Ruel (1995).

2. In Africa the nutritional status of girls is as good as or better than that of boys (Svedberg 1990), while in Asia girls often appear to be disadvantaged. Yet even within Asia the evidence is by no means uniform (Harriss 1995). Nutritional data from the Pakistan Integrated Household Survey and most other studies from that country do not indicate that girls have higher rates of malnutrition than boys (Alderman and Garcia 1994), even though the evidence is clear that girls have higher mortality rates than boys. Globally, there is some evidence that girls are more resistant to stress than boys (Yip 1996). Overall, equal levels of malnutrition in environments with periodic food shortages and high morbidity patterns may reflect the fact that households allocate more of their resources to boys than to girls. However, this conjecture has not yet been subjected to rigorous testing.

3. The problems associated with evaluating programs using regular LSMS samples are that few programs are sufficiently widespread for beneficiaries to be represented in sufficient numbers within an LSMS sample and that it is hard to control for site and individual selection using cross-sectional data. It is often necessary to introduce an element of experimental design into the survey to address these issues. See Newman, Gertler, and Rawlings (1994).

4. Examples of studies that have used LSMS data to model the determinants of nutrition include Alderman (1990), Glewwe (1995), Piwoz (1995), Sahn (1992), and Strauss (1990). For a survey of the wider literature see Strauss and Thomas (1995).

5. A further distinction is made between being small and becoming small (Beaton 1989). Becoming small is an indication of stress, which affects health and is of particular importance in a clinical setting where repeated measurements are commonly used, for example, in growth promotion programs.

6. Rather than exclude children with missing parents, dummy variables are included for missing parents. The coefficients of such variables should equal roughly the product of average height and the coefficient of parental height. However, if the absence of a parent has a direct impact on child health due to changes in caregiving or an indirect impact through sample selection, this calculation may become less accurate.

7. If parental mid-upper arm circumference is included, the coefficient of income [in Vietnam?] drops to 0.104 (t=1.833). The coefficients of both maternal and paternal arm circumference have t scores above 6, while only maternal height is significant in the analogous regression.

8. For a wider discussion of the range of analyses that use anthropometry modules from LSMS surveys in conjunction with other data, see Strauss and Thomas (1995).

9. Current observations on nutrition could plausibly be taken as the function of current and lagged prices. Thus, in principle, a cross-sectional data set could be linked with a time series on prices. However, this has not yet been explored.

10. Kostermans (1994) indicates that further information can be acquired from the Division of Nutrition, The Center for Disease Control, 2600 Clifton Rd., MS A08, Atlanta, GA 30333, and from the Nutrition Unit, WHO, 1211 Geneva 27, Switzerland.

11. WHO (1995) recommends that the reference data be updated to correct some technical drawbacks including a discontinuity in the standards at the age of 2 (Dibley and others 1987). However, this recommendation makes it quite clear that the organization still endorses a single international standard.

References

Adair, Linda, and Barry Popkin. 1988. "Birthweight Maturity and Proportionality in Filipino Infants." Human Biology 60 (2): 319-40.

Alderman, Harold. 1990. "Nutritional Status in Ghana and its Determinants." Social Dimensions of Adjustment in Sub-Saharan Africa Working Paper 3. World Bank, Washington, D.C.

Alderman, Harold. 1993. "New Research on Poverty and Malnutrition: What Are the Implications for Policy?" In Michael Lipton and Jacques van der Gaag, eds., Including the Poor. Washington, D.C.: World Bank.

Alderman, Harold. 1995. "Information as an Input into Food and Nutrition Policy Formation." In Per Pinstrup-Andersen, David Pelletier, and Harold Alderman, eds., Enhancing Child Growth and Nutrition in Developing Countries: Priorities for Action. Ithaca, N.Y.: Cornell University Press.

Alderman, Harold, and Marito Garcia. 1994. "Food Security and Health Security: Explaining the Levels of Nutritional Status in Pakistan." Economic Development and Cultural Change 42 (3): 485-507.

Alderman, Harold, Jere Behrman, Victor Lavy, and Rekha Menon. 1997. "Child Nutrition, Child Health, and School Enrollment: A Longitudinal Analysis." Policy Research Working Paper 1700. World Bank, Poverty and Human Resources Division, Policy Research Department, Washington, D.C.

Anand, Sudhir, and Martin Ravallion. 1993. "Human Development in Poor Countries: On the Role of Private Income and Public Services." Journal of Economic Perspectives 7 (Winter): 133-50.

Beaton, George. 1989. "Small but Healthy? Are We Asking the Right Questions?" Human Organization 48 (1): 11-15.

Behrman, Jere R. 1996. "Impact of Health and Nutrition on Education." World Bank Research Observer 11 (February): 23-37.

Bhargava, Alok. 1995. "Econometric Analysis of Psychometric Data: A Model for Kenyan Schoolers." University of Houston, Department of Economics, Houston, Tex.

Bhargava, Alok. 1997. "Nutritional Status and the Allocation of Time in Rwandese Households." Journal of Econometrics 77 (1): 277-95.

Bouis, H., and L. Haddad. 1992. "Are Estimates of Calorie-Income Elasticities Too High? A Recalibration of the Plausible Range." Journal of Development Economics 39 (2): 333-64.

Briscoe, J., J. Akin, and D. Guilkey. 1990. "People Are Not Passive Acceptors of Threats to Health: Endogeneity and Its Consequences." International Journal of Epidemiology 19 (1): 147-53.

Calle, Eugenia, Michael Thun, Jennifer Petrelli, Carmen Rodriguez, and Clark Heath. 1999. "Body-mass Index and Mortality in a Prospective Cohort of U.S. Adults." The New England Journal of Medicine 341 (15): 1097-1105.

Dibley, Michael, Norman Staehling, Philip Nieburg, and Frederick Trowbridge. 1987. "Interpretation of Z-Score Anthropometric Indicators Derived from the International Growth References." American Journal of Clinical Nutrition 46 (5): 749-62.

Filmer, Deon. 1995. "The Intrahousehold Allocation of Health and Cognitive Skills in Developing Countries." Ph.D. diss. Brown University, Department of Economics, Providence, R.I.

Fogel, Robert. 1994. "Economic Growth, Population Theory, and Physiology: The Bearing of Long-Term Processes on the Making of Economic Policy." American Economic Review 84 (3): 369-95.

Foster, Andrew. 1995. "Prices, Credit Markets and Child Growth in Low-Income Rural Areas." Economic Journal 105 (May): 551-70.

Foster, Andrew, and Mark Rosenzweig. 1994. "A Test for Moral Hazard in the Labor Market: Contractual Arrangements, Effort, and Health." Review of Economics and Statistics 76 (2): 213-27.

Gertler, Paul, Paul Glewwe, and Ninez Ponce. 1998. "Poverty, Growth and Nutrition." In David Dollar, Paul Glewwe, and Jennie Litvack, eds., Household Welfare and Vietnam's Transition to a Market Economy. Washington, D.C.: World Bank.

Gibson, Rosalind. 1990. Principles of Nutritional Assessment. Oxford: Oxford University Press.

Glewwe, Paul. 1999. "Why Does Mother's Schooling Raise Child Health in Developing Countries? Evidence from Morocco." Journal of Human Resources 34 (1): 124-59.

Glewwe, Paul, and H. Jacoby. 1995. "An Economic Analysis of Delayed Primary School Enrollment in a Low-income Country: The Role of Early Childhood Malnutrition." Review of Economics and Statistics 77 (1): 156-69.

Glewwe, Paul, and J. van der Gaag. 1990. "Identifying the Poor in Developing Countries: Do Different Definitions Matter?" World Development 18 (6): 803-14.

Golden, Michael. 1994. "Is Complete Catch-Up Possible for Stunted Malnourished Children?" European Journal of Clinical Nutrition 48 (1): S58-S70.

Graitcher, P. L., and E. M. Gentry. 1981. "Measuring Children: One Reference for All?" Lancet (2): 297-99.

Griliches, Zvi. 1984. "Econometric Data Issues." In Z. Griliches and M. Intriligator, eds., Handbook of Econometrics. Vol. 3. Amsterdam: North Holland Press.

Grosh, Margaret, Kristin Fox, and Maria Jackson. 1991. "An Observation on the Bias in Clinic-Based Estimates of Malnutrition Rates." Policy Research Working Paper 649. World Bank, Washington, D.C.

Habicht, J-P., R. Martorell, C. Yarbrough, R. Malina, and R. Klein. 1974. "Height and Weight Standards for Preschool Children: How Relevant are Ethnic Differences in Growth Potential?" Lancet 1 (7,858): 611-14.

Haddad, Lawrence, and Howarth Bouis. 1991. "The Impact of Nutritional Status on Agricultural Productivity: Wage Evidence from the Philippines." Oxford Bulletin of Economics and Statistics 53 (1): 45-68.

Harriss, Barbara. 1995. "The Intrafamily Distribution of Hunger in South Asia." In Jean Drèze, Amartya Sen, and Athar Hussain, eds., The Political Economy of Hunger: Selected Essays. Oxford: Clarendon Press.

Higgins, Paul, and Harold Alderman. 1997. "Labor and Women's Nutrition: A Study of Energy Expenditure, Fertility, and Nutritional Status in Ghana." Journal of Human Resources 32 (3): 577-95.

Horton, Susan. 1988. "Birth Order and Child Nutritional Status: Evidence from the Philippines." Economic Development and Cultural Change 36 (2): 341-54.

James, W.P.T., A. Ferro-Luzzi, and J.C. Waterlow. 1988. "Definition of Chronic Energy Deficiency in Adults." European Journal of Clinical Nutrition 43 (12): 969-81.

Kostermans, Kees. 1994. Assessing the Quality of Anthropometric Data: Background and Illustrated Guidelines for Survey Managers. Living Standards Measurement Study Working Paper 101. Washington, D.C.: World Bank.

Lavy, Victor, John Strauss, Duncan Thomas, and Philippe de Vreyer. 1996. "Quality of Health Care, Survival and Health Outcomes in Ghana." Journal of Health Economics 15 (3): 333-57.

Lutter, Chessa, Jose Mora, Jean-Pierre Habicht, Kathleen Rasmussen, Douglas Robson, and Guillermo Herrera. 1990. "Age-Specific Responsiveness of Weight and Length to Nutritional Supplementation." American Journal of Clinical Nutrition 51 (3): 359-64.

Martorell, R., J. Rivera, H. Kaplowitz, and E. Pollitt. 1992. "Long-term Consequences of Growth Retardation During Early Childhood." In M. Hernandez and J. Argente, eds., Human Growth: Basic and Clinical Aspects. Amsterdam: Elsevier Science Publishers.

Newman, John, Paul Gertler, and Laura Rawlings. 1994. "Using Randomized Control Designs in Evaluating Social Sector Programs in Developing Countries." World Bank Research Observer 9 (2): 181-201.

Pelletier, David. 1994. "The Potentiating Effects of Malnutrition in Child Mortality: Epidemiologic Evidence and Policy Implications." Nutrition Reviews 52 (12): 409-15.

Pitt, M., M. Rosenzweig, and M. D. Hassan. 1990. "Productivity, Health, and Inequality in the Intrahousehold Distribution of Food in Low-Income Countries." American Economic Review 80 (5): 1139-56.

Piwoz, Ellen. 1995. "Undernutrition in Nicaraguan Preschool Aged Children: Prevalence, Determinants, and Policy Implications." Latin American and Caribbean Region Economic Note 1. World Bank, Washington, D.C.

Pollitt, Ernesto. 1990. Malnutrition and Infection in the Classroom. Paris: United Nations Educational, Scientific and Cultural Organization.

Ravallion, Martin. 1993. Poverty Comparisons: Fundamentals of Pure and Applied Economics. Vol. 56. Chur, Switzerland: Harwood Academic Press.

Rosenzweig, Mark R., and T. Paul Schultz. 1988. "The Stability of Household Production Technology: A Replication." Journal of Human Resources 23 (4): 535-49.

Ruel, M. 1995. "Growth Monitoring as an Educational Tool, an Integrating Strategy, and a Source of Information: A Review of Experience." In P. Pinstrup-Andersen, D. Pelletier, and H. Alderman, eds., Enhancing Child Growth and Nutrition in Developing Countries: Priorities for Action. Ithaca, N.Y.: Cornell University Press.

Sahn, David E. 1994. "The Contribution of Income to Improved Nutrition in Côte d'Ivoire." Journal of African Economies 3 (1): 29-61.

Sahn, David, and Harold Alderman. 1997. "On the Determinants of Nutrition in Mozambique: The Importance of Age-Specific Effects." World Development 25 (4): 577-88.

Steckel, Richard. 1995. "Stature and the Standard of Living." Journal of Economic Literature 33 (4): 1903-40.

Strauss, John. 1990. "Households, Communities and Preschool Children's Nutrition Outcomes: Evidence from Rural Côte d'Ivoire." Economic Development and Cultural Change 38 (2): 231-61.

Thomas, Duncan, and John Strauss. 1997. "Health and Wages: Evidence on Men and Women in Urban Brazil." Journal of Econometrics 77 (1): 159-85.

Thomas, Duncan, Victor Lavy, and John Strauss. 1996. "Public Policy and Anthropometric Outcomes in Côte d'Ivoire." Journal of Public Economics 61 (2): 155-92.

United Nations. 1986. How to Weigh and Measure Children: Assessing the Nutritional Status of Children in Household Surveys. National Household Survey Capability Programme. New York: United Nations.

UNDP (United Nations Development Programme). 1990. Human Development Report 1990. New York: Oxford University Press.

Vella, Venanzio, A. Tomkins, A. Borghesi, G. Migliori, J. Ndiku, and B. Adriko. 1993. "Anthropometry and Childhood Mortality in Northwest and Southwest Uganda." American Journal of Public Health 83 (11): 1616-18.

Victora, Cesar. 1992. "The Association between Wasting and Stunting: An International Perspective." Journal of Nutrition 122 (5): 1105-10.

Waaler, H. 1984. "Height, Weight, and Mortality: The Norwegian Experience." Acta Medica Scandinavica 77 (Supplement): 1-56.

Strauss, John, and Duncan Thomas. 1995. "Human Resources: Empirical Modeling of Household and Family Decisions." In

Waterlow, J.C., R. Buzina, W. Keller, J.M. Lane, N.Z. Nichaman, and J.M. Tanner. 1977.
"The Presentation and Use of Height and Jere Behrman and T.N. Srinivasan, eds., Handbook of Weight Data for Comparing the Nutritional Status of Groups Development Economics. Vol. 3A Amsterdam: North Holland. of Children under the Age of Ten Years." Bulletin of the WHO - . 1996. "Measurensent and Mismeasurement of Social 55 (4): 489-98. Indicators," American Economic Review 86 (2): 30-34. WHO (World Health Organization). 1995. Physical Status: The Use - -. 1998. "Health Nutrition and Economic Development." and Interpretation ofAnthropometry. Technical Report Series 854. Journal of Economic Literature 36 (2) June: 766-817. Geneva:World Health Organization. Subramanian, S., and A. Deaton. 1996. "The Demand for Food and Yip, R. 1996. Electronic communication. Calories."Journal of Political Economics 104 (February): 133-62. Yip, R., K. Scanlon, and E Trowbridge. 1992. "Improving Growth Svedberg, Peter. 1990. "Undernutrition in Sub-Saharan Africa: Is Status ofAsian Refugee Children in the United States."Journal There a Gender Bias?" Journal of Development Studies 26 (3): of American Medical Association 267 (7): 937-40. 469-86. Zerfas, Alfred. 1991. "Choice of Nutritional Status Indicators for Thomas, Duncan. 1994. "Like Father, Like Son: Like Mother, Like Young Children in Public Health Programs." Latin America Daughter: Parental Resources and Child Height." Journal of and the Caribbean Technical Department Report 8. World Human Resources 29 (4): 950-88. Bank, Washington, D.C. 272 Transfers and Other Nonlabor Income 1 1 Andrew McKay This chapter discusses sources of household income that are not covered in Chapters 9, 18, and 19 (on income from wage employment and self-employment in household enterprises or agri- culture), in Chapter 12 (on ways to estimate imputed rent from owner-occupied dwellings), or anywhere else in this volume. The income sources covered by this chapter will collectively be referred to as miscellaneous income. 
Miscellaneous income consists predominantly of current transfers, but it also includes rental income and other income such as interest or lottery winnings.

There are two main reasons for collecting data on transfers and other nonlabor income. One reason is to construct an estimate of total household income when the questionnaire also gathers sufficiently detailed data on housing, agriculture, economic activities, and nonfarm enterprises. A second reason is that many of the income sources covered in this chapter are themselves of considerable interest to policymakers and analysts; particularly interesting to policymakers and analysts are households' incomes from public transfers and private interhousehold transfers.

The first section of this chapter sets out some of the main policy issues that can be analyzed with data on transfers and other nonlabor income. The second section discusses the kind of data on these income components that must be collected to analyze the policy issues identified in the first section. The third section introduces standard and short versions of modules designed to collect these data in an LSMS-type multitopic household survey. (The modules are presented in Volume 3.) The fourth and final section provides notes and comments on the proposed modules.

Policy Issues

Most of the income received by households in developing and transition countries comes from the wage employment of their members or from their agricultural or other household enterprises. However, many households also receive income from a variety of other sources, including public transfers, private transfers, and rent.

Households receive rental income if they rent out their assets (land, dwellings, farm equipment) to others.¹ Since only wealthy households usually own sufficient assets to rent these assets to others, wealthy households are the only households likely to receive rental income. Similarly, only well-off households are likely to earn incomes from such sources as interest, dividends, or lottery winnings, incomes which in any case are usually not very substantial.

The vast majority of households receive their most significant miscellaneous income from transfers of various kinds. These may be private transfers from other households (possibly in the same extended family) or government transfers such as state pensions, unemployment benefits, or family allowances. Households may also receive transfer income from companies, possibly in the form of pensions or dividends, or from nongovernmental organizations (NGOs), which may also provide transfers in the form of food or other goods. Some transfers, especially transfers from other households or from a company, may be sent from outside the country.

Although transfers and other nonlabor income sources are not generally as important as income sources resulting from labor, for some households miscellaneous sources are very important. These households are often of particular interest to policymakers. For example, this may be because many households that rely heavily on miscellaneous income are poor, or receive the vast majority of their income from government transfers, or are economically vulnerable due to their reliance on uncertain private transfers. Thus it can be very important to identify and, if possible, measure such sources, especially in circumstances where policymakers and analysts are interested in exploring the determinants of poverty.

Box 11.1 discusses the importance of transfers and other nonlabor income sources for households in three developing countries. The data in this box suggest that these sources are quite important even for average households (Box Table 11.1), and very important for some households.

Table 11.1 summarizes issues that can be addressed using data on transfers and other nonlabor income. These are discussed in turn below.

Measuring and Analyzing Total Household Income

Estimates of total household income can be used to address various policy and analytical issues, as discussed in Chapter 17. When survey planners decide they want to estimate total household income, the questionnaire must, as far as reasonably possible, collect data on all of the household's prospective sources of income. To accomplish this it is necessary to comprehensively identify income sources for each household and to measure income from these sources as accurately as possible within a suitable reference period. If the goal is to rank the income sources in terms of the size of their contribution to the household budget (possibly for subgroups of households), the data must be collected at as disaggregated a level as possible. Such disaggregation is also desirable to ensure that, as far as possible, no potential sources of income are missed because they were not explicitly mentioned.

Because some sources of income covered by this chapter, such as transfers from government, are highly sensitive to policy interventions, survey designers may wish to include questions on transfers and other nonlabor income for policy reasons, irrespective of whether the survey aims to measure total household income.

Studying the Incidence and Impact of Public Transfers

Government transfers can represent a significant proportion of the total income of some households, for example, households in which no member is working. On the other hand, many other households receive no transfer payments from the government at all. Public transfers from the government to households take different forms from one country to another, but some common examples are state pensions, unemployment benefits, food stamps, and disability payments. Some of these transfer payments, such as disability payments, reflect the particular circumstances of the recipient (which may be an individual rather than a household), while other transfers, such as food stamps, may be intended specifically for poor individuals or households.

Two policy questions are of interest regarding government transfers: their incidence (who receives them and in what amounts) and their impact (how receiving government transfers affects individuals' or households' behavior).

INCIDENCE OF PUBLIC TRANSFERS. Studying the incidence of public transfers involves examining the relationship between receipt of a given government transfer and the relevant characteristics of recipient households or individuals (such as ages and standards of living). A number of descriptive poverty studies have examined which income groups benefit from public transfers and to what extent these groups benefit. One way the studies have done this is by dividing the population into quintile (or similar) groups according to their standard of living or into poor/not-poor categories, and investigating how receipt of transfers varies among the different groups. This kind of technique has been used in studies of the incidence of food stamp programs (Grosh 1992, 1995a), studies of the extent of targeting error associated with such programs (Cornia and Stewart 1995), and studies of the distributional incidence of social security benefits (Jarvis and Micklewright 1995).
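The quintile tabulation described above can be illustrated with a short sketch. Everything in it is an assumption made for illustration: the household records, the field names `consumption` and `transfer`, and the choice of consumption as the standard-of-living measure are hypothetical, and a real analysis would also apply the survey's sampling weights.

```python
from collections import defaultdict


def assign_quintiles(households, welfare_key="consumption"):
    """Rank households by a welfare measure and label them 1 (poorest) to 5 (richest)."""
    ranked = sorted(households, key=lambda h: h[welfare_key])
    n = len(ranked)
    # Integer arithmetic maps rank 0..n-1 onto quintiles 1..5.
    return [{**hh, "quintile": rank * 5 // n + 1} for rank, hh in enumerate(ranked)]


def incidence_by_quintile(households, transfer_key="transfer"):
    """Per quintile: share of households receiving the transfer, and share of total transfer value."""
    counts = defaultdict(int)
    recipients = defaultdict(int)
    amounts = defaultdict(float)
    for hh in assign_quintiles(households):
        q = hh["quintile"]
        counts[q] += 1
        if hh[transfer_key] > 0:
            recipients[q] += 1
        amounts[q] += hh[transfer_key]
    total = sum(amounts.values())
    return {
        q: {
            "recipient_rate": recipients[q] / counts[q],
            "share_of_transfers": amounts[q] / total if total else 0.0,
        }
        for q in sorted(counts)
    }


# Hypothetical households: (consumption, public transfer received).
sample = [
    {"consumption": c, "transfer": t}
    for c, t in [(100, 30), (150, 30), (200, 20), (250, 0), (300, 20),
                 (400, 0), (500, 0), (600, 0), (800, 0), (1000, 0)]
]
table = incidence_by_quintile(sample)
for quintile, row in table.items():
    print(quintile, row)
```

In this made-up sample all transfers go to the bottom three quintiles; with real data, the same tabulation shows how far a program's benefits reach households outside the intended group.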
Box 11.1 The Contribution of Miscellaneous Sources to Household Income

The degree of detail to which miscellaneous income sources need to be measured depends partly on how much income they contribute to all households and to specific types of households. If such sources contribute little to average household income or to the vast majority of individual households, it may not be necessary to measure the sources accurately or in detail.

The table below explores miscellaneous sources of income as measured in three LSMS surveys: Côte d'Ivoire 1988, Ghana 1988-89, and Peru 1994. (Household income was measured fairly accurately in the Côte d'Ivoire and Peru surveys; in the Ghana survey the estimate of household income is more questionable.) Data are presented for the two categories of miscellaneous income distinguished in the third section of this chapter: income from private interhousehold transfers and "other income" (which includes public transfers, rental income, and income from other sources). Private interhousehold transfers are transfers between households, often consisting of gifts of money or goods from family members who live elsewhere. The precise composition of the "other income" category varies from one country to another, reflecting the different information available from the three different surveys.

While the relative importance of these two miscellaneous income components varies between the countries, taken together they contribute significantly to household income in each country. Miscellaneous income accounts for nearly one-sixth of household income in Ghana and Peru, and just less than one-tenth of household income in Côte d'Ivoire. More importantly, miscellaneous income contributes greatly to the total income of certain households. In these three countries, as in many others, a few households derive all of their (measured) income from private interhousehold transfers or "other income" sources. Each of these countries also contains a significant minority of households for which income from one of these sources accounts for one-quarter or more of total household income.

Such households are liable to be systematically different from households that rely on wage income or self-employment income; some are probably richer than average, such as households that derive most of their income from rent and investments, but others are probably poorer or more vulnerable than average, such as households that rely predominantly on private interhousehold transfers and other transfers to meet their consumption needs. Policymakers may be particularly interested in finding out about the latter group.

Box Table 11.1 Miscellaneous Sources of Household Income: Three Case Studies

Share of income                          Côte d'Ivoire   Ghana      Peru
                                         1988            1988-89    1994
From private interhousehold transfers
  More than zero                         31.4            63.0       38.3
  10% or more                            14.3            26.7       21.2
  25% or more                            6.8             16.4       11.8
  50% or more                            2.7             8.0        4.1
  100%                                   0.4             1.9        0.1
  Mean contribution                      5.6             12.7       7.7
  Sample size                            1,556           3,405      3,589
From "other" sources
  More than zero                         26.5            27.6       38.0
  10% or more                            7.2             8.2        18.1
  25% or more                            4.2             3.6        9.2
  50% or more                            1.2             1.3        3.8
  100%                                   0.4             0.1        0.3
  Mean contribution                      2.9             3.1        6.5
  Sample size                            1,559           3,415      3,568

Source: Author's computations from the data derived from: Côte d'Ivoire Living Standards Survey (1988); Ghana Living Standards Survey Round 2 (1988-89); Encuesta Nacional de Hogares Sobre Medición de Niveles de Vida, Peru (1994).

THE IMPACT OF PUBLIC TRANSFERS.
Studying the impact of public transfers requires investigating how households or individuals alter their behavior as a result of receiving public transfers and whether such behavior changes reduce the benefits of these transfers. One relevant issue is whether and how receiving government transfers affects a household's continued receipt of private interhousehold transfers or the amounts of such transfers. (This issue will be considered in more detail below.) Another important issue is whether receiving government transfers affects the incentives for household members to seek employment, especially in instances in which these transfers are targeted. Sahn and Alderman (1995) found that a targeted rice ration program in Sri Lanka reduced the work effort of recipient household members by up to two or three days per month. (See Kanbur, Keen, and Tuomala 1995 for some of the underlying theoretical arguments about how targeted transfers affect labor supply.)

A further issue is how households act in response to government programs targeted to particular members of the household. Haddad and Zeller (1997) argued that when a school meals program is targeted toward girls (perhaps because girls have been found to be more malnourished than boys), households may respond by reducing the amount of food girls receive at home. However, the impact of public transfers is not always deleterious. Food stamp programs that require recipient households to use primary health care facilities, and school feeding programs that require recipient households to send their children to school in order to qualify for a transfer, have additional benefits for recipients (Grosh 1995b).

The difficulty in modeling all of these issues is working out how recipients of government transfers would have behaved had they not received transfers. When a given survey sample includes both recipients and nonrecipients of government transfers, it becomes possible to include the receipt of government transfers as one of the explanatory variables in a model, say, a model of labor supply. This allows analysts to use the resulting data set to draw preliminary conclusions about the impact of government transfers on the variable in which they are interested (in this case, labor supply). This is essentially the procedure used by Cox and Jimenez (1992, 1995) to model how the receipt of government transfers affects private interhousehold transfers and also by Sahn and Alderman (1995) in their Sri Lanka study.

Studying Private Interhousehold Transfers

Some households in developing or transition countries receive private transfers of income from other households whose members may or may not be a part of the same family. A common example of a private interhousehold transfer occurs when a young member of a household leaves home to seek work in another location (perhaps in an urban area or abroad) where this person can earn a higher income and send some of the income back to his or her original household. There are several other circumstances in which private transfers are made, for example, between friends or between neighbors.

In many low-income and lower-middle-income developing countries, private transfers between households are the most important sources of income for a considerable number of households. This is most often the case in countries where social security systems and other mechanisms for distributing public transfers are not well-developed; in such cases these private transfers can provide a vital informal safety net for vulnerable or poor households. (For example, private interhousehold transfers are the most important source of transfers fulfilling this function in Ghana; see Box 11.1.) Therefore, studying the nature and effects of private transfers in poor countries is of considerable analytical and policy interest. The survey questionnaire should collect information on not only households' income from private interhousehold transfers but also households' expenditure on these transfers (Table 11.1). Having data on net flows of these transfers, rather than simply gross flows, ensures that the analyst takes into account the full effects of these transfers on household behavior and welfare.

Table 11.1 Policy Issues Regarding Nonlabor Income Sources and the Data Needed to Analyze Them

Columns: general issue; specific area of interest; data requirements from miscellaneous income modules; data requirements from other modules; whether the issue can be analyzed using data from the standard module; whether it can be analyzed using data from the shorter module.

Measurement of total household income
  Miscellaneous income modules: Estimates of household incomes from current transfers, rent, and other miscellaneous sources
  Other modules: Data to estimate all remaining components of household income
  Standard module: Yes. Shorter module: No.

Government transfers: Incidence of government transfers
  Miscellaneous income modules: Information on households' receipts of different categories of public transfers
  Other modules: Relevant household characteristics, for example, standards of living and demographics
  Standard module: Yes. Shorter module: Yes.

Government transfers: Impact of government transfers
  Miscellaneous income modules: Information on households' receipts of different categories of public transfers
  Other modules: Data on the variable affected by the transfer (such as labor supply)
  Standard module: Yes. Shorter module: Yes.

Private interhousehold transfers: Patterns of flows
  Miscellaneous income modules: Household receipts from private interhousehold transfers
  Other modules: Household expenditure on private interhousehold transfers; measures of such indicators as standard of living, vulnerability, and dependency
  Standard module: Yes. Shorter module: Yes.

Private interhousehold transfers: Characterizing flows
  Miscellaneous income modules: Household receipts from private interhousehold transfers, plus information on donors such as their reasons for giving transfers
  Other modules: Household expenditure on private interhousehold transfers; information on characteristics of these transfers; measures of such indicators as standard of living, vulnerability, and dependency
  Standard module: Yes. Shorter module: To some extent.

Private interhousehold transfers: Modeling determinants
  Miscellaneous income modules: Household receipts from private interhousehold transfers; information on donors such as their reasons for giving transfers; (ideally) information on households' social networks
  Other modules: Characteristics of households such as their standards of living, demographics, and assets
  Standard module: To some extent. Shorter module: To some extent.

Private interhousehold transfers: Crowding out of public transfers
  Miscellaneous income modules: Household receipts from private interhousehold transfers; information on donors such as their reasons for giving transfers; information on receipts of public transfers
  Other modules: Relevant household characteristics, for example, standards of living and demographics
  Standard module: Yes. Shorter module: To some extent.

Private interhousehold transfers: Household's ability or willingness to pay for projects
  Miscellaneous income modules: Magnitude of transfers, reasons they were sent, and (ideally) how they change in response to changing economic circumstances
  Standard module: Possibly. Shorter module: Possibly.

Private interhousehold transfers: Impact on household behavior
  Miscellaneous income modules: Household receipts from private interhousehold transfers; information on donors such as their reasons for giving transfers
  Other modules: Data on variables affected by transfers (for example, labor supply)
  Standard module: Yes. Shorter module: To some extent.

Household sales of capital assets
  Miscellaneous income modules: Households' receipts from the sale of their assets (if not collected elsewhere)
  Other modules: Sales of assets, purchases of assets, and general household characteristics including standard of living
  Standard module: Yes. Shorter module: Yes.

Intrahousehold allocation
  Miscellaneous income modules: Estimates of receipt of government and private transfers by individual household members
  Other modules: Remaining incomes at individual level (where possible), individual characteristics (such as gender and age)
  Standard module: To some extent. Shorter module: No.

Source: Author's summary.

PATTERNS OF FLOWS OF PRIVATE INTERHOUSEHOLD TRANSFERS. Households can be classified into three groups: net recipients of private interhousehold transfers, net donors of such transfers, and households that neither give nor receive such transfers. A useful starting point is to know which households belong to each of these groups and how this status relates to the characteristics of these households, most obviously their standard of living but also factors such as their vulnerability or dependency.²

Evidence from a wide number of studies (for example, Bamberger and others 1991, Cox and Jimenez 1995, and Knowles and Anker 1981) confirms the hypothesis that transfers between households generally flow from richer to poorer households, although this obviously needs to be investigated or confirmed in each specific case. However, the available evidence also suggests that large numbers of poor households do not receive such transfers (see Cox and Jimenez 1992, Bamberger and others 1991) and thus lack the informal safety net such transfers may provide.³ Policymakers should know the limits of the coverage
of this net and the types of households that it does not cover.

When examining the flow of these transfers, analysts must be able to quantify the magnitudes involved. Are transfer flows sufficient to lift significant numbers of households out of poverty? How much do the transfers alter patterns of income distribution or poverty? To properly study this issue, analysts not only need data on transfers, they also need enough data to estimate total household income, a fact that has implications for the design of the survey questionnaire.

A final concern in collecting data on private interhousehold transfers is the accuracy of information obtained on the existence and magnitude of such transfers. In circumstances where most such transfers do not cross international borders,⁴ an initial check can be done by comparing the aggregate receipts of and aggregate expenditures on such transfers to see if they are of similar magnitudes. Where they are not (and assuming that the sample is representative of the population), this suggests that there has been a significant underestimation of one variable or an overestimation of the other. (Even where the magnitudes are similar it is possible that both are underestimated or overestimated to the same extent.)

CHARACTERIZING FLOWS OF PRIVATE INTERHOUSEHOLD TRANSFERS. It is highly desirable for analysts to understand the nature of private interhousehold transfers and their role in the household economy. To what extent are such transfers flows between members of the same family living in different households (for example, transfers to elderly parents from their sons or daughters)? Are these transfers mostly one-off payments or are they regular and ongoing? Are these transfers mainly sent for specific purposes (for example, to finance a child's education) or simply to provide general financial support to the recipient household?

A number of studies have investigated some of these issues for particular countries (Knowles and Anker 1981 for Kenya; Cox and Jimenez 1995 for the Philippines), but the answers to the above questions may differ in each different set of circumstances. Even within any given country, the answers to questions on transfer issues differ from one household to another depending on the characteristics of the household. For some households, especially poor households, private interhousehold transfers function (to varying degrees) as an informal safety net. Such transfers tend to be a more reliable source of income for households that receive them from close family members than they are for households that receive one-off transfers from more distant relatives or from nonrelatives.

MODELING THE DETERMINANTS OF PRIVATE INTERHOUSEHOLD TRANSFERS. A descriptive analysis provides a certain amount of information on the pattern of (net) private transfer flows between households. However, more (and more reliable) information can be gained from modeling the determinants of these transfer flows using multivariate techniques, particularly when it is possible to include the factors that influence why certain households receive transfers while others do not. In order to identify these factors, analysts are likely to need information about the social networks to which households belong, as some households may not receive any private transfers because they do not belong to social networks (such as an extended family or a community) that might give them access to such transfers. A household may not expect to receive transfers from relatives in another household because of a family breakup or because nonresident family members are as poor as the household residents.

An analysis of the determinants of private interhousehold transfers was undertaken by Bamberger and others (1991) using data from a survey conducted in a poor district of the city of Cartagena, Colombia. The survey collected information on the social networks to which households belonged. The authors used this and other information to model the determinants of:
* The likelihood of a household belonging to a "socially interactive network," implying that it had potential access to transfers.
* The likelihood of a household having received a transfer in the previous month.
* Total transfers received by a household.
* Net transfers received by a household.
Each of these factors was related to potentially relevant characteristics of the recipient households (for example, demographic factors, household income, and household assets). The authors undertook a similar analysis of payments of private interhousehold transfers, relating these to the characteristics of donor households.

Many household surveys do not collect information on the social networks to which households belong, making it difficult for analysts to model the determinants of a household's receipt of transfers. However, a number of authors have attempted to model the determinants of total and net transfers received by a household even without information on social networks. Cox and Jimenez (1995) used data from the Philippines to model the determinants of the net private transfers received by a household as a function of the household's characteristics. As the model was estimated by ordinary least squares, it is not clear how zero values were treated in this study. (There may be a selectivity bias if the model is based only on nonzero values, and there may be issues of conceptual consistency if zero values are treated in the same way as nonzero values; see Heckman 1990 and Greene 1993, section 22.4.) In their 1992 study of Peru, Cox and Jimenez used a probit model to model the determinants of whether a household received transfers, and also to model the transfer amounts using Heckman's generalized tobit framework (Heckman 1979). And in a more recent study based on the Cartagena data set, the same authors used an ordered probit model to investigate the factors that influenced whether a household was a net recipient of interhousehold transfers, a net donor of interhousehold transfers, or neither (Cox and Jimenez 1998).

By modeling the determinants of both payments and receipts of private transfers between households, it may be possible to better understand how these transfers change in response to a shock in the economy. An important issue in such circumstances is whether private interhousehold transfers cushion households from [...] private transfer flows less likely or smaller. If a shock has an adverse effect on a household that receives private transfers, this may prompt the households that donate the transfer to increase the amount or frequency of transfers, thus cushioning the effect of the shock on the recipient household. Alternatively, when the household that donates a transfer suffers an adverse shock, it may not be able to continue the transfer, thus adversely affecting the household that had previously benefited from the transfer.

Consider the case of an urban-based household that derives much of its income from wage employment in the formal sector and that makes regular transfers to relatives who live in a poorer household in a rural area. If a contraction in the urban formal sector caused one of the members of the urban household to become unemployed, the income of the household would decrease. This could reduce the magnitude of the transfer that the urban household sent to the rural household. Thus the impact of the contraction in the urban formal sector would be felt not just in an urban area but also in a rural area. In summary, flows of private interhousehold transfers may respond to changes in the circumstances of either a household that makes a transfer or the household that receives this transfer. If these changes result from policy decisions, their effects on transfers should be considered when the effects of policies are being evaluated.

PRIVATE INTERHOUSEHOLD TRANSFERS AND THE EFFECTIVENESS OF REDISTRIBUTIVE PUBLIC TRANSFERS. In situations where public and private transfers are close substitutes for each other, increasing the public transfers received by a given group may result in this group receiving fewer private transfers. Households that had given transfers to this group may feel that it is less imperative to do so. Households that had received private transfers may put less effort into maintaining and developing their social networks. Whatever the explanation, if public transfers do "crowd out" private transfers in this way, the distributional incidence of the public transfers (allowing for indirect as well as direct effects) will be different from what it initially appears to be. Assuming that the donors of private transfers are better off than the recipients, the benefits of public transfers to poor households may at least partially accrue to richer households.

This issue has been examined in a series of studies by Cox and Jimenez (1990, 1992, and 1995).
In the the effects of shocks or whether the shocks make pri- 1992 and 1995 studies the authors used multivariate 279 ANDREW MCKAY regression techniques to model the determinants of the transfers can have an effect on recipient households' amount of net private transfer income received by a behavior. These changes can take a number of differ- household as a function of the relevant characteristics ent forms that may or may not have been intended by of that household. Among these relevant characteristics the donor. Unfortunately, it is very difficult and com- are the type and magnitude of any public transfers plex to model the way in which receiving private received. If public transfers are crowding out private transfers affects the behavior of recipients, not least transfers, the terms representing the receipt of income because the magnitude and even the existence of a from public transfers would be expected to have statis- private transfer flow might be endogenous (in other tically significant negative coefficients; the coefficients words, within the control of the recipient household). of the regression could be used to estimate the response This is much less of an issue in modeling the impact of private transfers to specific changes in public trans- of public transfers because the amounts of public fers. In the case of Peru (Cox and Jimenez 1992) the transfers and the criteria for their provision are more authors found that private transfer payments from the clearly predetermined. young to the old would have been nearly 20 percent One example of the complexity of this process is higher in the absence of social security pension pay- how problematic it is to model the effects of private ments-which suggests that some crowding out was transfers on the participation of members of the recip- occurring. The authors' study of the Philippines (Cox ient household in the labor market. 
It is conceivable andJimenez 1995) suggested that private transfers were that the donor would withdraw or reduce the transfer strongly responsive to the recipients' income levels. if members of the recipient household stopped work- Thus, if unemployment insurance were to be intro- ing or stopped looking for work as a result of receiv- duced, there would be a significant-though not ing the transfers. Thus both the receipt of private complete-crowding out of private transfers. transfer income and the amount received are probably linked to the behavior of the recipient household's USING INFORMATION ON PRIvATE TRANSFERS TO ASSESS members with regard to the labor market. In princi- INDrvIDuALs' ABIuTY AND WILINGNESS TO PAY FOR ple, therefore, the model should make both labor sup- DEVELOPMENT PROJECTS. One aim of the study in ply and the receipt of private transfers endogenous. Cartagena, Colombia (Bamberger and others 1991) was However, some effects of receiving private inter- to assess how private transfers could be expected to household transfers are easier to model. While con- respond to an urban development project in the area of trolling for household income and other relevant fac- the city where the survey was conducted. Private trans- tors, Bamberger and others (1991) use multivariate fers were clearly an important source of income for regression analysis to examine how receiving private households in this area. The authors argued that when transfer income affects a household's expenditure on the local community perceives a project ofthis type as a basic and nonbasic needs. In this case, the possible desirable investment, private transfer income to house- endogeneity of the transfer income as an explanatory holds in the vicinity of the project can increase substan- variable may be less important, in that it may be diffi- tiaily. 
In fact this was not observed in Cartagena-a cult for the donor to observe how the transfer income result the authors attributed to the perception of locals is being spent.5 that the project did not represent a good investment for the community. But for positive evidence to support Studying the Sale ofAssets by Households their argument, the authors cited cases from El Salvador Revenue from the sale of household assets should not and the Philippines (Bamberger and others 1991, chap- be regarded as income because these sales are capital ters 6 and 7). In cases where it seems likely that private transactions. Nevertheless, the nonlabor income module transfer income will increase when a development proj- in a multitopic household survey may be a good place ect is introduced, policymakers should bear this factor in to collect information on such asset sales to the extent mind in assessing the ability and willingness of the local that it is not collected elsewhere in the questionnaire.6 community to pay for the project. Sometimes households sell offtheir assets simply to buy other assets. Thus selling assets does not necessarily STUDYING THE IMPACT ON HOUSEHOLDS OF RECEIVING imply a drop in the household's long-term sustainable PRIVATE TRANSFERS. Like public transfers, private standard of living; the reverse may in fact be true. On 280 CHAPTER II TRANSFERS AND OTHER NONLABOR INCOME the other hand, households sometimes sell their assets to Data Requirements finance consumption, reducing both the net worth of the household and its long-term sustainable standard of The planners of any multitopic household survey should living. Gathering information on asset sales can help design the survey questionnaire to collect the data need- analysts identify and study those households that sell ed to analyze the policy issues that they have identified their assets to stabilize their consumption levels in as most relevant to the country studied. 
If the planners response to either a fall in income or a large unexpect- have decided to measure total household income, the ed expenditure.7 As discussed by Corbett (1988), this data collected on current transfers, rental income, and strategy is usually a last-resort response to a famine or other miscellaneous income should be as comprehensive other emergency situation, especially when the assets and accurate as is reasonably possible. Beyond this, what- sold are productive. A household is clearly in a desper- ever data are collected on these income sources will ate situation if it is engaged in selling its assets merely to depend on exactly what the likely policy and analytical finance consumption. Thus, if a survey questionnaire applications of the data will be. In most circumstances collects data on the reasons why households sell their government transfers and private interhousehold trans- assets, this will enable analysts to identify households fers are the income sources of greatest interest to policy- that may be in danger of falling into poverty. makers and analysts. In some cases revenues from the sale of capital assets are also likely to be of interest. Studying the Impact of Transfers and Other Nonlabor This section of the chapter discusses some of the Income on Intrahousehold Allocations main issues to be addressed in designing modules that When users of survey data wish to study intrahouse- will collect sufficiently reliable data on these transfer hold dynamics, it is desirable to collect information on and nonlabor income components. The implications transfers and other nonlabor income sources at the for data collection of the various analytical and policy level of the individual rather than the entire house- issues discussed in the first section of this chapter are hold. For example, because women tend to ensure a set out inTable 11.1. 
household's food security and improve household members' nutritional outcomes (Quisumbing and Does the Survey Aim to Collect the Data Necessary for others 1995; Haddad and Zeller 1997), it is likely to be Measuring Household Income? a good idea to ensure that transfers (particularly gov- The issue of whether an LSMS survey should collect ernment transfers) are given directly to the women in the data necessary to measure total income is discussed the household. The issues involved in studying such in depth in Chapter 17. In that chapter it is argued that intrahousehold dynamics are discussed in Chapter 17 a standard-length multitopic household survey should on measuring total household income. aim to measure total household income, whereas a shorter survey should not attempt to meet this objec- Other Issues tive. Thus the standard-length module should aim to In some countries or situations analysts may be inter- collect data on all the income sources covered by this ested in looking at the distributional incidence of chapter, whereas the shorter module need focus only other miscellaneous income sources-transfers on sources that are of particular interest to policymak- received from NGOs or, in Islamic countries, transfers ers. (The standard module can still be used in a short of zakat (a form of Islamic charity intended to benefit questionnaire if survey planners believe that the addi- the poor).The principles involved in performing such tional information that it collects is of sufficient inter- an analysis are similar to those discussed above, est in its own right, but the short module should not although their relevance to policy may be less clear- be used in a standard LSMS questionnaire.) cut. 
Some of the remaining nonlabor income cate- When designers wish to measure total household gories-such as interest or dividends earnings (which income, it is necessary to distinguish between private will almost always disproportionately benefit richer transfers that need to be repaid and those for which no households) or lottery winnings-may need to be repayment is expected. In practice respondents may measured only to construct estimates of total house- sometimes find it difficult to make this distinction, hold income, since otherwise they are unlikely to be given the ambiguity that might be associated with of much relevance to policymakers or analysts. having "an obligation to repay." While in some cases 281 ANDREW McKAY there may be no economic obligation to repay, respon- determinants of private transfer flows and understand dents may nonetheless feel a moral obligation to do so. why some households receive or make such transfers Nevertheless, this distinction is important in while others do not. measuring total household income. If a household Collecting information on social networks may receives a transfer that it does not have to repay, the have extensive implications for the design of the ques- transfer should be regarded as income. If the house- tionnaire. Some previous surveys have collected such hold receives a transfer that must be repaid, this is information-notably the Cartagena survey studied by effectively a loan and should be regarded not as Bamberger and others (1991). In the Cartagena survey income but as part of the household's capital account. this involved collecting information on (nonresident) close relatives, distant relatives, and friends of the house- What Information on Private and Public Transfers Income hold head and the head's spouse or partner. Questions Interests Policymakers andAnalysts? 
were asked not only about transfer flows between the Information on public and private interhousehold household and these people, but also about the charac- transfers is of considerable interest to policymakers and teristics of these people (whether or not they gave or analysts, whether or not it is used to measure total received transfers). There were questions about where household income. But how detailed does transfer these people lived, how their standard of living com- income information need to be? The answer to this pared with that of the sample household, and the fre- question depends on how survey data are likely to be quency with which members of the sample household analyzed. If the incidence and impact of government visited them,.The questions yielded information on the transfers will be studied, information on transfers size of a household's social network, the likelihood of should be collected at as disaggregated a level as pos- the household receiving transfers from members of its sible, with clear distinctions made among the different social network (or making transfers to these members), types of transfers (such as child allowances and unem- and the extent of the household's efforts to maintain its ployment benefits). As noted above, collecting such network (through visiting). information at a more disaggregated level also has the Having this kind of information helps analysts benefit of reducing the likelihood that a household's understand why some households give or receive pri- overall receipt of government transfers will be under- vate interhousehold transfers while others do not. estimated because respondents neglect to mention However, even when this information is available, it items about which they are not explicitly asked. 
can be difficult for analysts to get a clear and complete The nature of private interhousehold transfers and picture of the extent of a household's network (espe- the reasons for these transfers are usually harder to cially of household friends) and to discover to what identify than is the case for government transfers.Thus extent and in what circumstances a household can rely for a private interhousehold transfer it is desirable to on receiving transfer income from members of the collect information not only on the amount trans- network. ferred but also on the characteristics of the donor, the It is also possible to use a simpler questionnaire reason for the transfer, and whether the transfer is a design to gather some information on the likelihood of one-off or a regular occurrence. The benefits of this a household receiving or making private transfers. The information include enabling analysts to assess draft household roster module introduced in Chapter 6 whether or not these private transfers are a reliable (and presented in Volume 3) collects information on source of income for the household. the parents and children of household members who The questionnaire should collect data on private live in different households. If this information can be transfers both to and from households. (Chapter 5 on matched with the information collected in the transfers consumption collects information on household income module on the donors of such transfers, it can expenditure on transfers.) be used to analyze the determinants of private inter- household transfers. This information will give analysts Should the Questionnaire Collect Information on a an idea of the extent to which close family provides Household's Family and Social Networks? support to poor or vulnerable households. 
However, an Collecting information on the social networks to analysis based on this information alone cannot fully which households belong helps analysts model the explain why some households receive (or make) pri- 282 CHAPTER II TRANSFERS AND OTHER NONLABOR INCOME vate transfers while others do not; nonresident parents Should this Module Collect Information on Revenue from and children form only part of a household's social net- the Sale of Assets? work. Moreover, the household roster is not the natu- It was suggested in the first section of this chapter that ral place to ask questions about how close members of the transfers and nonlabor income module may be a the sample household are to their nonresident parents good place to collect information on the household's or children. If such information is collected, it is almost revenue from the sale of its assets to the extent that this certainly better collected in the transfers and other information is not collected elsewhere in the ques- nonlabor income module than in the roster module. A tionnaire. For most purposes, all that needs to be col- final difficulty is that when close family members out- lected is information on revenue from the sale of each side the household do not provide support to poor or category of assets; collecting such information is like- vulnerable households, it is difficult to determine the ly to be possible in both the standard and shortened reasons why. versions of the proposed module. However, as noted above, this is one of several areas where close coordi- Should the Data be Collected at the Level of the nation is needed between the miscellaneous income Household, the Individual, or Both? 
module and the other modules in the questionnaire to Whether to collect some or all income data at the ensure that information on revenue from the sale of individual level has already been discussed in the first each category of assets is collected only once in the section of this chapter and is discussed in Chapter 17 questionnaire. on measuring total income.There are several potential analytical benefits to collecting data on an individual Draft Module basis.The issue of who receives-and thus in principle controls-government transfers may be particularly The draft modules for transfers and other nonlabor interesting to policymakers. Also, collecting such data income, presented inVolume 3, are based on the dis- at the individual level may make it possible to model cussion in the first two sections of this chapter. Both the responses of individuals or households to receiving standard and short versions of the module are present- this transfer (although the difficulties in accurately ed.A distinction is made in the modules between pri- modeling these responses should not be understated). vate interhousehold transfers and the remaining mis- When survey designers decide to collect income cellaneous income components, which are collectively data at the level of individual household members, this referred to as "other nonlabor income." This distinc- should-to the extent possible-be done for all of the tion follows common practice in many previous income sources covered by this chapter. In principle it LSMS and other surveys (Box 11.2) and it also reflects may be more feasible to identify some of these income the fact that more and different information is likely to sources with individual recipients than it would be to be required (and can be collected) on private inter- do so with, for example, self-employment income. household transfers than on any of the other income When the questionnaire is designed to collect the data sources. 
necessary to measure total household income, it should aim to collect information on whichever Submodule on Income from Private Interhousehold household member is the recipient of each category of Transfers household income. However, survey designers should An important issue in collecting data on income from carefully consider the logistics of survey design to private interhousehold transfers is whether to collect make sure that interview time is not wasted asking such data from each individual household member or respondents irrelevant questions (such as asking young from a single well-informed member such as the working household members about income from household head. The argument can be made either pensions). This practical consideration may limit the way. Simply asking the household head can carry the extent to which all miscellaneous income can be risk that certain transfer receipts may not be men- attributed to individual household members, but even tioned because they were not received by the house- where such an exact attribution cannot be made it is hold head or because he or she is not aware of their usually still possible to find out which household existence. However, problems can also arise if each members benefited from a given income source. household member is questioned individually about 283 ANDREW McKAY ing the practice adopted in most previous LSMS and Box 11.2 CautionaryAdvice other surveys, the draft module collects information on each separate source of private interhousehold *How much of the draft module is new and unproven? The short verson of the proposed module is largely the transfers (rather than asking about transfers in the same as the module used for miscellaneous incomes in aggregate). past LSMS surveys. 
In the modules presented here Separate questions are included in the draft mod- more detail is added in the standard version in terms of ule on transfers received in cash and transfers received the number of income sources identified.The proposed in kind.While this distinction may be of some interest standard version also gets a good deal more informa- to analysts, the primary motivation for making the dis- tion at the individual level rather than the household tinction is that it is likely to prompt the respondent to level, and asks more questions about donors of private mention all in-kind transfers received by the house- interhousehold transfers. hold. However, great care must be taken not to include in- - How well has the module worked in the past? Past ver- sions of the module have probably underestimated kind information twice in the estimate of total household flows. In the past, items such as rental income, interest income, as the consumption module also collects infor- income, and dividends were asked about in little detail; mation on the household's receipt of in-kind gifts of to make matters worse, these items are not very food and non-food commodities (see Chapter 5 on important to the poor and notoriously difficult to col- consumption). Total household income should gener- lect from the wealthy. Several past surveys had so little aly be calculated using the data on in-kind transfers in detail about public and private transfers that some the consumption module rather than the data in the flows were probably missed. flw weepoaly7isd miscellaneous Income module, because the consurmp- * Which parts of the module most need to be customized? miscellane income mode co nsump- The categories of income to be included and the terms ton data are likely to be more comprehensive. 
used for these categories must be reviewed carefully So why include a question on in-kind transfers in for each country To write effective questions about the private interhousehold transfers submodule? Its government transfer programs, it is necessary to know purpose is to facilitate analvsis of the private transfers a good deal about how payments are made (periodic- themselves. Asking respondents about in-kind (as well ity, targets and recipients, whether they vary by case or as cash) transfers in this submodule means that infor- use a flat rate) for each program. mation on the characteristics of these transfers can be collected that is not available from the consumption receipt of private transfers. While some transfers are module. Second, it was noted above that analysts and made to specific members of the household, others policymakers are predominantly interested in net may be made to the household as a whole, and flows of private interhousehold transfers. Thus it is household members may forget to mention these necessary to collect exact parallel information on the transfers when questioned individually. Moreover, monetary amounts ofthe household's expenditure on questioning household members individually is almost and receipts of private transfers, and the consumption always more time-consuming than putting a single set module is the only place in the LSMS household of questions to one household respondent. In most questionnaire where information is collected on the cases the costs of questioning all household members in-kind transfers made by households.8 more than outweigh any possible benefits from such It is desirable to collect information on the char- questioning. 
Therefore, the modules presented in this acteristics of people who give private interhousehold section are-like several other draft modules in this transfers to the respondent household.The respondent book-designed to be administered to one well is initially asked to specify the names of the different informed respondent per household. donors simply for the purpose of identifying each In both the standard and short versions of the transfer; this information is of no analytical interest module, to ensure the accuracy of measurements of a and need not be recorded. The respondent is then household's income from private interhousehold asked a series of additional questions about the char- transfers, it is best to ask questions about each transfer acteristics of the donors, such as their relationship (if in turn. This has the added advantage that data on the any) to the household, their gender or genders, and nature and characteristics of each transfer are likely to their places of birth.Where the donors are nonresident be analytically useful in their own right. Thus, follow- parents or children of household members, the 284 CHAPTER I I TRANSFERS AND OTHER NONLABOR INCOME responses are collected in such a way that they can be find out how private transfers affect the intrahouse- matched with the information collected on these peo- hold distribution of income without asking each indi- ple in the household roster module. vidual household member about his or her receipt of Information on the nature of transfers received private transfers. The submodule allows for the possi- and the motivations of people providing the transfers bility that some transfers may be made to the whole can be of considerable interest to policymakers. In the household and not to a specific individual. 
draft standard submodule the respondent in the recipient household is asked whether the transfer was made for a specific reason and, if so, what this reason was. The respondent is also asked whether the donor makes transfers to the household regularly or only infrequently. These questions provide some basic information on the motivation behind the transfers and allow analysts to assess how reliable a source of income the transfers are for the household.

Because the standard submodule is designed to enable analysts to estimate a household's total income, it is necessary to make a clear distinction between current transfers from other households (which the household has no obligation to repay) and capital transfers or loans made from other households (which the household is expected to repay). There are two ways in which this distinction could be made. One is for the interviewer to explain to the respondent before administering the submodule that the questions only concern transfers that the household is not obliged to repay and that any transfers that the household does have to repay should not be reported in this module. However, the problem with this is that the notion of an "obligation to repay" is not clearly defined; as noted above, respondents may feel a moral obligation to repay the transfer even if they are under no economic obligation to do so. Asking respondents to confine their replies to those transfers that they do not have to repay may lead them not to mention some transfers that should be included as a component of household income.

An alternative is for the interviewer to ask the respondent whether there is any obligation to repay the donor of a transfer directly after the respondent has reported the transfer. This procedure enables analysts to distinguish between current and capital transfers.9 Thus this is the method used in the draft standard submodule (following the practice used in some previous LSMS surveys).

For each transfer, the submodule is designed so that the interviewer asks whether it was given to (or intended for) a specific individual and, if so, the identity of the principal recipient. These questions aim to

The standard submodule does not attempt to collect information on the social networks to which households belong and thus on the sources from which they may receive private interhousehold transfers. The only "social" information collected is who gave the household any transfers that they received during the reference period (in this case, the previous 12 months). As noted above, the household roster module (introduced in Chapter 6) collects information on the nonresident parents and children of household members. This can be matched with the information from the standard private interhousehold transfer submodule to establish the extent to which these nonresident parents and children of household members give transfers to the household in question. However, this yields only limited information that can be used to analyze why these nonresident parents and children did or did not provide transfers to the household. Moreover, the information is available only on nonresident parents and children of household members and not on other potential members of the household's social network, such as more distant relatives or friends. This further limits the extent to which determinants of private transfers between households can be modeled.

To collect information on a household's family and social networks, the submodule would have to be substantially different and much longer. Designing such a module would be easier for a survey conducted in a specific locality than for a nationwide survey because the nature of these social networks probably varies (for example, between urban and rural areas), as does the type of information that would need to be collected. Nevertheless, in certain instances, especially where surveys are conducted in specific localities, survey planners may wish to collect information on a household's family and social networks. This would allow analysts to study not only the influence of social networks on private transfers but also the extent to which households use such networks to gather information for finding employment or for establishing or developing household enterprises. Such inquiries are beyond the scope of the draft standard submodule presented here, and are probably best designed on a case-by-case basis.

The draft submodule on private interhousehold transfers, while slightly longer than similar modules used in some previous LSMS questionnaires, is still not very long in terms of the time needed to administer it. The additional questions in this submodule that are not used in previous LSMS and other surveys are intended to yield more information on the nature of private transfers (such as motivations behind them, frequency with which they are given, and intended beneficiaries) and to enable analysts to match the data on nonresident parents and children of household members that are gathered in this module with the data on these people gathered in the household roster module. The inclusion of these questions is justified by the fact that they usefully enhance analysts' understanding of interhousehold private transfers.

The short version of this submodule has the same basic structure as the standard version but omits several less important questions. Like the standard submodule, the short submodule collects data on household income from private interhousehold transfers, to be analyzed in conjunction with corresponding data from the consumption module on household expenditure on private interhousehold transfers. However, the short submodule does not ask respondents whether a given transfer must be repaid (a difficult question for households to answer in any case), since any survey that included the shorter version of the submodule would not be aiming to collect the data necessary to measure total household income. The short submodule gathers little information on the nature of the transfers or their donors. Rather it focuses on measuring the magnitude of the transfers.

Finally, as has been argued above, whether the standard or shorter submodule on income from private interhousehold transfers is chosen, the survey should include an equivalent submodule on households' expenditure on such transfers. A submodule of this kind has been included with the consumption module in this volume, designed alongside and symmetrically with that for incomes from these transfers. In many questionnaires it will be most natural to have the questions on transfers received next to the questions on transfers given; this can be done by merging these two submodules into a larger module called private interhousehold transfers.

Submodule on Other Nonlabor Household Income

The submodule on other nonlabor household income covers all of the components of household income not discussed elsewhere in the questionnaire. A number of issues need to be considered in adapting this submodule to a particular survey situation. One of the most fundamental issues is whether information on these other sources of household income should be collected at the individual level, at the household level, or in some combination of the two. The other nonlabor income submodule collects some information at the individual level, although some incomes are only estimated at the household level. The submodule could easily be modified to collect more income information at the individual level, but implementing this in practice might not be straightforward. The short submodule collects all information at the household level and does not attempt to attribute any of the receipts to individuals. As argued above, it is unrealistic, and of limited analytical interest, to attempt to collect all information on transfers and nonlabor income at the individual level.

A second issue is whether data on households' receipts from the sale of capital assets should be collected in this submodule. Both the standard and shorter versions allow for this. As argued above, the extent to which these categories of capital receipts are included in this submodule will depend in each case on the extent to which this information is collected elsewhere in the questionnaire. The draft submodules introduced here aim simply to list the categories that should be included somewhere in the questionnaire. The designers of the questionnaire need to be careful to avoid omissions as well as double-counting.

A third issue is how many categories of other nonlabor income should be covered. The standard submodule presented in this book covers all such categories because it is part of a questionnaire designed to collect the data necessary for measuring total household income. However, the short submodule does not need to be so comprehensive.

Related to the third issue is another issue: what the categories of other nonlabor income should be, irrespective of whether the standard or shorter submodule is used. This is very difficult to specify across the wide range of countries and contexts in which an LSMS survey might be conducted, for two reasons. First, the categories and amounts of other nonlabor income that households receive vary considerably from one country to another. In many transition countries, government transfers of various kinds can account for a large proportion of household income,10 but these transfers are a relatively insignificant source of household income in many of the poorer African countries. Such differences have obvious implications for the level of detail in which questionnaires should collect information on these categories. Second, the form taken by any one category of other nonlabor income may vary from one country to another. Government transfers might be given in many different ways including child benefits, food stamps, or school feeding programs. Thus survey designers must take care to customize the questionnaire to reflect prevailing local circumstances.

Given the country-specific nature of many of these issues, this chapter does not seek to outline a precise design for the other nonlabor income submodule, nor does it aim to provide a definitive list of income categories that the submodule should cover. Instead, the chapter aims to provide guidance so that survey planners can choose the categories and the overall design that are most relevant to the country of the survey.

The standard version of the other nonlabor income submodule consists of two parts, which can, but do not have to, follow one another in the questionnaire.11 The first (Part B1) collects information on relatively frequent and regular transactions that can fairly easily be attributed to individuals. While these are mostly government transfers, private pensions and other categories of frequent transactions may also be included. Information is collected on which individuals within a household receive a given category of income and on how much income they obtained.

Part B2 of the standard submodule collects information from one selected respondent per household on irregular or infrequent income sources. Income data are collected at the household level (the only practical option for these types of income), but a question is included asking whether a given receipt of income can be attributed to any specific household member and, if so, which member was the principal recipient or intended beneficiary.

Which transfers should be addressed in Part B1 of the submodule and which should be addressed in Part B2 will vary from case to case. Part B1 should collect information on transactions that are both more regular and more easily attributed to individual members, which are also the transactions on which analysts will most likely need individual-level data. Where government transfers and other transfers are very significant income sources, it may be desirable to extend the first part of this submodule. Where such income sources are insignificant, it may be appropriate to omit Part B1 of the submodule and simply include these sources as income categories in Part B2.

Box 11.3 lists a range of possible income categories that could be included in this submodule and suggests in which part they might most naturally be included. Customizing the submodule for the country studied may require dropping some categories and adding others. In many instances customization requires renaming categories to make them meaningful in the local context (for example, using the precise local name for categories of government transfers).

The short version of the other nonlabor income submodule covers a limited number of income categories, and covers them only at the household level. Like the short version of the submodule on private interhousehold transfers, it omits the question about which household member was the primary recipient of a given transfer (as respondents may find it difficult to answer this question or may not provide reliable responses). Box 11.4 sets out the categories that might be covered in a short submodule on other nonlabor income; as before it is critically important that survey planners customize the module to suit specific circumstances of the country studied.

Box 11.3 List of Other Nonlabor Income Categories

The following are income categories that could be included in the standard submodule:

Transfers
* State pension.*
* Private and company pensions.*
* Survivor's pension.*
* Unemployment benefits.*
* Illness and disability payments.*
* Child allowances (such as child care benefits and childbirth benefits).*
* Job search programs.*
* Educational scholarships (only where not collected in the education module).*
* Maintenance payments.*
* Dowry.
* Supplementary feeding schemes.
* Transfers from churches/mosques/other religious organizations.
* Insurance payments.
* Food or meals from NGOs.
* Inheritance.
* Official development programs.
* Zakat (a form of Islamic charity intended to benefit the poor).
* Other transfers from NGOs (not credit).

Rental income
* Income from renting out dwellings.
* Income from renting out land.
* Income from renting out equipment.
* Income from renting out consumer durable goods.

Revenue from sale of assets
* Revenue from sale of land.
* Revenue from sale of livestock.
* Revenue from sale of durable goods.
* Revenue from sale of dwellings and buildings.

Other
* Interest on savings.
* Dividends.
* Lottery winnings.
* Other gambling income.
* Other income (specify).

Note: A trailing asterisk marks categories on which data can be collected in Part B1 of the standard submodule. (Information on the other categories should be collected in Part B2 of the standard submodule.)
Source: Author's summary.

Box 11.4 Categories of Other Nonlabor Income to Be Included in the Short Submodule

The following are categories that could be covered in a short submodule:
* State pensions.
* Company/private pension or retirement fund.
* Social security payments (specific to country studied).
* Employee welfare schemes.
* Medical, life, unemployment insurance.
* Maintenance payments.
* Dowry and inheritance.
* Transfers from religious organizations (country-specific).
* Additional transfers from nongovernmental organizations (including in-kind transfers).
* Income from leasing assets: land, dwelling, buildings, productive assets, durable goods.a
* Revenue from sale of land.
* Revenue from sale of productive assets.
* Revenue from sale of jewelry.
* Revenue from sale of dwellings.
* Revenue from sale of durable goods.
* Revenue from loan repayments received by household.
* Other income (specify).

a. These types of rent can be separated in some instances.
Source: Author's summary.

Notes and Comments on Draft Modules

This section provides annotations to specific questions within the draft modules on transfers and nonlabor incomes introduced in the previous section and presented in Volume 3. The question numbering used is for the standard versions of the questionnaire.

Private Interhousehold Transfers Submodule

The vast majority of LSMS surveys have included modules that collected data on income received by households through private interhousehold transfers. The format of many of these modules has been similar to the format of the draft short submodule introduced in the previous section (and presented in Volume 3). However, some of the questions in the draft standard submodule have only rarely been used in previous LSMS and similar household surveys, including:
* Questions aiming to identify the purpose of a transfer.
* Questions on the frequency of private transfers from a given donor.
* Questions aiming to identify to which household member a transfer was sent.
* Questions that identify donors who are nonresident parents or children of a household member to match with data on the same people in the appropriate sections of the household roster module.

These questions are not completely untried; the LSMS survey conducted in Kagera, Tanzania included the first type of question, and the LSMS survey in Pakistan included the second and third types of questions.

Respondents may feel more reluctant to answer some of these questions (for example, the second and third questions above) than others. In some cases (such as the first and third questions) there may not be a clear answer; the design of the questionnaire allows for this eventuality. Many of these sensitive questions are omitted in the short submodule.

It is vital in either version of the submodule on income from private interhousehold transfers to customize the selection and wording of the following types of questions for a given country situation:
* Questions that ask where the donor lives (Q12 and Q13 in the standard module).
* Inquiries into the motivation for providing the assistance (Q18 in the standard module).

In countries where transfers are widespread and frequent, it may be desirable to use a shorter reference period. (For example, in the Kagera, Tanzania LSMS questionnaire a six-month reference period was used, in part because households were to be interviewed every six months.) However, for most households the receipt of private transfers is likely to be an infrequent or irregular event, meaning that survey designers should choose a longer reference period.

Q2. The list of all donors should be completed before the interviewer asks questions 3-19 about each transfer.

Q8-Q9. Where a transfer has been made by a household member's nonresident parent or child, these questions are designed so the data they collect can be matched with data collected in Parts B and C (respectively) of the household roster module. This enables analysts to identify individuals named in Parts B and C of the household roster module who did not send any transfers to the household during the previous 12 months. It must be clear that the relationship specified in question 6 is the relationship of the donor to the recipient, and not the other way around.

Q12-Q13. As noted above, the options for the responses to these questions must be country-specific.

Q15. The example of in-kind assistance used in this question (in this case, food) is likely to vary in different countries depending on what forms of in-kind assistance are most common.

Q18. The options for responses to this question must be country-specific.

At the conclusion of this submodule respondents should be asked if they received any other private transfers from other households that they have not already reported. If so, the names of the donors should be listed under question 2, and the interviewer should pose questions 3-20 to respondents about these additional transfers.

Other Nonlabor Income Submodule

Almost all previous LSMS surveys have included an other income submodule. In most cases these modules have had a format very similar to the format in the draft short submodule introduced in this chapter (and presented in Volume 3). The types of income sources covered in these submodules have also been very similar to those suggested in the short module, including capital receipts. However, the format of the draft standard submodule presented in this chapter differs from that used in many past LSMS surveys, as it aims to collect more information at an individual level by identifying individual beneficiaries where possible, especially for the transactions covered in Part B1. The format introduced in the standard submodule is not totally new, however; some past LSMS surveys, including ones conducted in South Africa, Tanzania (Kagera), and Ecuador, have questioned individual household members about some of these other income sources.

Two issues are very important in the design of the other nonlabor income submodule. First, when the standard version is used, it is important for survey planners to evaluate whether government transfers and other frequent transfers are of sufficient importance in the country of the survey to justify including Part B1 of this submodule. Where such income sources are relatively insignificant, Part B1 can be omitted, and information on income from these sources can be collected by adding further categories to Part B2. Where the income sources covered by Part B1 of this submodule are highly significant, survey planners may wish to extend the draft submodule.

Whether the standard or shorter submodule is used, it is critically important that survey planners choose the categories of other nonlabor income most appropriate for the country surveyed. These are likely to vary significantly from one country to another, particularly as sources of income are likely to have different names in different countries (for example, different types of government transfers). The questionnaire in the standard submodule should attempt to cover as many of these potential income sources as possible. Thus the lists provided in Boxes 11.3 and 11.4 should be thought of as no more than starting points.

REGULAR AND FREQUENT INCOME SOURCES (PART B1). The list of income sources from which respondents are asked to report their income should include the items marked with an asterisk in Box 11.3 that are applicable to the prevailing circumstances in the country of the survey. The list of income sources should also include any similar income that may be received regularly or frequently and that is paid predominantly to individual household members.

In administering this submodule, it is important to be sure that payments for or to children are not omitted. Questions about child benefits paid to parents should be included here. Where benefits are provided directly to children (such as milk rations given to young children) respondents should be asked a similar set of questions about these benefits. As the questions (notably question 3) would need to be slightly differently worded, an additional submodule might be needed to cover benefits, with a design very similar to that of Part B1 of the standard other nonlabor income submodule.

LESS FREQUENT INCOME SOURCES (PART B2). The income sources covered by this submodule have not been listed explicitly in the draft module but should include items not marked with an asterisk in Box 11.3 that also apply to the prevailing circumstances in the country of the survey.

Where this submodule is included in a short questionnaire (which does not aim to collect the data necessary to estimate total household income), the sources covered here should include items in Box 11.4 that apply to the prevailing circumstances in the country of the survey. Similar issues arise in designing the shorter other nonlabor income module. Here the most important issue will be to ensure that all the most important sources for the country in question are covered.

Notes

The author is grateful for the very helpful comments of Donald Cox, Paul Glewwe, Emmanuel Jimenez, Margaret Grosh, and an anonymous reviewer on an earlier draft.

1. As discussed above, imputed rent from owner-occupied dwellings is not included here. This source of rental income is likely to accrue to poor households as well as to rich households.

2. Identifying a vulnerable household is both more difficult and more subjective than identifying a poor household, especially based on a single cross-section of data. This issue is discussed by Glewwe and Hall (1998) using panel data for Peru.

3. There are many possible reasons why a considerable number of poor households do not receive private interhousehold transfers. Poor households may not have relatives elsewhere, their relatives may be as poor as they are, or there may have been a breakup of the family.

4. This qualification is important. In some countries, large volumes of private interhousehold transfers may cross international boundaries, for example, in the case of migrant workers from southern African countries working in South Africa or migrant workers from some Asian countries working in the Middle East. In such circumstances, the comparison suggested in this paragraph cannot be made.

5. This may not always be the case. If the donor feels the recipient is spending the transfer in a wasteful or inappropriate way (for example, on increased consumption of alcohol), the donor may choose to withdraw or reduce the transfer.

6. Some information on asset sales is likely to be collected elsewhere in the questionnaire. Sales of agricultural equipment are recorded in the standard version of the agriculture module and sales of land in the expanded version. Sales of household business assets are recorded in the standard version of the household enterprise module. However, given that sales of assets may not be covered comprehensively elsewhere in the questionnaire, the miscellaneous income module is probably the most appropriate place to ask questions about the sale of assets (such as consumer durables) that were not covered elsewhere.

7. It may not be easy to distinguish households that sold their assets to finance consumption from households that sold one type of asset to acquire another type. However, it should be possible to make such a distinction when the questionnaire also contains corresponding information on asset purchases (available in the standard modules for most categories of assets).

8. It may not be as important to collect such detailed information on the nature and characteristics of outgoing transfers as on incoming ones, since incoming transfers are generally of greater interest to analysts. While the questionnaire in Volume 3 presents similar modules on incoming and outgoing interhousehold transfers, the section on outgoing transfers may be shortened if need be.

9. Experience from previous LSMS surveys suggests that where the distinction has been made, the vast majority of private interhousehold transfers reported by respondents have been current rather than capital transfers.

10. In Bulgaria the 1995 Integrated Household Survey found that the average proportion of household income coming from state transfers was 37.3 percent. More than 10 percent of the sample derived all of their measured income from this source.

11. This design combines elements of designs used in some previous LSMS questionnaires (including questionnaires used in Kagera, Tanzania and in Pakistan) with some new elements.

References

Bamberger, Michael, Daniel Kaufman, Eduardo Velez, and Scott Parris. 1991. "Intrahousehold Transfers and Survival Strategies of Low-Income Households: Experiences from Latin America, Asia, and Africa." World Bank, Washington, D.C.

Corbett, Jane. 1988. "Famine and Household Coping Strategies." World Development 16 (9): 1099-112.

Cornia, Giovanni Andrea, and Frances Stewart. 1995. "Food Subsidies: Two Errors of Targeting." In Frances Stewart, ed., Adjustment and Poverty: Options and Choices. London: Routledge.

Cox, Donald, and Emmanuel Jimenez. 1990. "Social Objectives through Private Transfers: A Review." World Bank Research Observer 5 (2): 205-18.

Cox, Donald, and Emmanuel Jimenez. 1992. "Social Security and Private Transfers in Developing Countries: The Case of Peru." World Bank Economic Review 6 (1): 155-69.

Cox, Donald, and Emmanuel Jimenez. 1995. "Private Transfers and the Effectiveness of Public Income Redistribution in the Philippines." In Dominique van de Walle and Kimberly Nead, eds., Public Spending and the Poor: Theory and Evidence. Baltimore, Md.: Johns Hopkins University Press.

Cox, Donald, and Emmanuel Jimenez. 1998. "Risk Sharing and Private Transfers: What about Urban Households?" Economic Development and Cultural Change 46 (3): 621-37.

Glewwe, Paul, and G. Hall. 1998. "Are Some Groups More Vulnerable to Macroeconomic Shocks than Others? Hypothesis Tests based on Panel Data for Peru." Journal of Development Economics 56 (1): 181-206.

Greene, William H. 1993. Econometric Analysis. Second ed. Englewood Cliffs, N.J.: Prentice Hall.

Grosh, Margaret E. 1992. "The Jamaican Food Stamps Program: A Case Study in Targeting." Food Policy 17 (1): 23-40.

Grosh, Margaret E. 1995a. "Towards Quantifying the Trade-off: Administrative Costs and Incidence in Targeted Programs in Latin America." In D. van de Walle and K. Nead, eds., Public Spending and the Poor: Theory and Evidence. Baltimore, Md.: Johns Hopkins University Press.

Grosh, Margaret E. 1995b. "Five Criteria for Choosing Among Poverty Programs." In N. Lustig, ed., Coping with Austerity: Poverty and Inequality in Latin America. Washington, D.C.: Brookings Institution.

Haddad, Lawrence, and Manfred Zeller. 1997. "Can Social Security Programs do More with Less? General Issues with Challenges for Southern Africa." In Lawrence Haddad, ed., Achieving Food Security in Southern Africa: New Challenges, New Opportunities. Washington, D.C.: International Food Policy Research Institute.

Heckman, James J. 1979. "Selection Bias as Specification Error." Econometrica 47 (1): 153-61.

Heckman, James J. 1990. "Varieties of Selection Bias." American Economic Review Papers and Proceedings 80 (2): 313-18.

Jarvis, Sarah J., and John Micklewright. 1995. "The Targeting of Family Allowances in Hungary." In Dominique van de Walle and Kimberly Nead, eds., Public Spending and the Poor: Theory and Evidence. Baltimore, Md.: Johns Hopkins University Press.

Kanbur, Ravi, Michael Keen, and Matti Tuomola. 1995. "Labor Supply and Targeting in Poverty-Alleviation Programs." In Dominique van de Walle and Kimberly Nead, eds., Public Spending and the Poor: Theory and Evidence. Baltimore, Md.: Johns Hopkins University Press.

Knowles, James C., and Richard Anker. 1981. "An Analysis of Income Transfers in a Developing Country: The Case of Kenya." Journal of Development Economics 8 (2): 205-26.

Quisumbing, Agnes R., Lynn R. Brown, Hilary S. Feldstein, Lawrence Haddad, and Christine Peña. 1995. Women: The Key to Food Security. Washington, D.C.: International Food Policy Research Institute.

Sahn, David E., and Harold Alderman. 1995. "Incentive Effects on Labor Supply of Sri Lanka's Rice Subsidy." In Dominique van de Walle and Kimberly Nead, eds., Public Spending and the Poor: Theory and Evidence. Baltimore, Md.: Johns Hopkins University Press.

Chapter 12 Housing

Stephen Malpezzi

Housing characteristics, and the process by which housing is constructed and occupied, are key aspects of the living standards of households in developing countries. Housing is of great importance to households in both developed and developing economies, because it is the largest fixed capital investment that households make. In developing countries, housing accounts for 10-30 percent of household expenditure, 6-20 percent of GNP, and 10-50 percent of gross fixed capital formation. Furthermore, as economies develop, the proportion of GDP accounted for by housing investment rises. Other than human capital, housing and land are the types of capital that are most widely owned.

There are three main ways that housing data are used in policy research and thus three reasons why housing data should be collected in LSMS surveys. First, housing information provides useful direct indicators of living standards, including access to electricity and clean drinking water, type of dwelling, toilet facilities, and living space per person. Second, housing is a form of consumption that can be overlooked when analysts estimate overall standards of living using household survey data. For example, families that rent their housing report their rent payments as part of their overall expenditures, whereas families that own their housing often report incurring little current expenditure on housing, as they are consuming the fruits of a previous investment. Thus estimates of total household consumption should include the implicit rent of owner-occupied housing. Third, housing data can be used to understand why particular housing conditions exist and whether specific government policies can be adopted that will lead to more efficient or more equitable outcomes. As is explained further in the first section of this chapter, governments regulate and intervene in housing markets in many ways, and household survey data can be used in analyses that determine the effectiveness of these policies.

This chapter discusses what policymakers need to know about housing and housing markets and which housing issues can be analyzed using data from household surveys such as the Living Standards Measurement Study surveys. The first section of this chapter discusses key housing policy issues and shows how housing market analysts can address these issues. The second section reviews the data that would need to be collected in a multitopic household survey to make it possible for these issues to be analyzed. The third section contains a draft prototype housing module that can be customized to match the prevailing conditions in the country of the survey. The fourth section provides explanatory comments on the draft module.

Housing Policy Issues

This section discusses in detail the ways in which data on housing collected in multitopic household surveys like the LSMS surveys can be used to analyze some key issues in the housing sector. Box 12.1 reflects this discussion in that it shows which issues can and which cannot be analyzed with LSMS-type data.

Box 12.1 Policy Issues and Housing Data

Policy issues that can be analyzed with cross-sectional data from LSMS-type surveys
* The level and distribution of housing consumption
* The distribution of housing assets
* The frequency and distribution of specific housing characteristics and conditions (such as space, sanitation, age, condition, and crowding)
* Housing tenure, tenure security, and tenure choice
* The demand for housing
* The determinants of price changes
* Upgrading
* The measurement and determinants of vacancies
* The valuation of housing subsidies (for example, from public housing, or from rent-controlled markets)
* How households finance their housing
* Behavior relating to housing finance and savings
* Satisfaction with the neighborhood and the unit

Policy issues that can be analyzed with panel data from an LSMS survey
* Housing filtering (changing supply from the existing stock)
* The determinants of price changes (panel data better than one cross-section of data)
* Tenure choice (panel data better than one cross-section of data)
* Upgrading (panel data better than one cross-section of data)
* Vacancies (panel data better than one cross-section of data)

Policy issues that cannot easily or directly be analyzed with data from an LSMS survey
(Note: Many analyses of these issues make indirect use of some household survey data)
* The regulation of development (for example, zoning and building codes)
* The determinants of the supply of new construction
* Changes in the supply of serviced land
* Housing investment and the business cycle
* Net effects of government interventions on producer and consumer incentives

Using Housing Characteristics as Indicators of Living Standards

In order to use the characteristics of a household's dwelling as indicators of the household's standard of living, analysts require data on those housing characteristics. Exactly which of these characteristics are useful is discussed in the next section of this chapter.

Measuring Housing Consumption

A second reason for collecting housing data is to obtain the information needed to derive a correct estimate of a household's consumption of housing. In principle, households purchase accommodation (or produce it for themselves) just as they purchase food, clothing, and other consumption items. As explained in Chapter 5, total consumption is a crucial indicator of household welfare, so it is important that it be calculated carefully. For most common purchases, such as purchases of food and clothing, the cost of the items is their market price, which is the value that should be placed on these items. However, some households' housing may not be purchased, or even rented, in a directly observed transaction at its true market price. For example, some housing is inherited, and some is built by the households themselves. Some households that rent housing do so at subsidized or controlled prices. Therefore, to measure housing consumption correctly, it is necessary to use market prices or an estimate of such prices.

Another problem is that households that own their housing incurred much of the cost many years ago but still use the dwelling today. Yet if two households live in similar dwellings, their standard of living is similar regardless of when the housing was purchased. Thus it is necessary to estimate each household's current consumption of housing by estimating what the household would spend to rent an equivalent unit at market prices. Therefore, the household survey should collect data on market rent (if observed) or this should be estimated (if not directly observed) for each survey household's housing unit.

Virtually every housing unit is unique in terms of its size, quality, location, and other characteristics. This heterogeneity, the durability of most housing, and the many forms of tenure and payment that exist can make it a complex process to estimate market prices. Some of the relevant measurement issues are briefly discussed in Appendix 12.1; see also Green and Malpezzi 1998.

Understanding Housing Market Behavior

The third reason for collecting housing data in LSMS surveys is to help analysts and policymakers understand how housing markets work and how government policies affect housing outcomes. In this chapter the discussion of housing market analysis focuses on the analysis needed for public policy purposes because of the overall purpose of this book. Nevertheless, analysis of market behavior is of interest to, for example, housing providers and academics as well as to policymakers.

* Regulating development by means of zoning, subdivision regulation, and building codes.
* Providing public housing, either directly or through state-owned enterprises.
* Taxing and subsidizing housing.
* Enforcing rent control and other rental regulations.
* Regulating other aspects of the real estate industry, such as construction and brokerage.
* Providing infrastructure such as electricity, water, and sewage.
* Regulating finance through interest rate regulations, the provision of credit, and the prudential regulation of lenders.

The relationship between policies and housing outcomes can be studied using both descriptive analysis and estimations of behavioral relationships. Descriptive
analysis is essentially the tabular presentation of simple In principle, government interventions in housing statistics on housing, such as which households rent or markets can correct for market failures and produce own, which live in subsidized units or units subject to positive externalities for society as whole. In most rent control, how much households pay for their countries governments define and enforce property housing, and how dwellings were obtained (whether rights, which are the "rules of the game" and the essen- inherited, purchased, or built). These basic characteris- tial element of a successful housing market. However, tics of housing can also be cross-tabulated by different there is no guarantee that all public interventions will income and tenure groups.This type of analysis is very have positive outcomes in practice. There are many useful for getting an initial snapshot of various gov- examples of public interventions that have exacerbated ernment policies and of the general state of the hous- market failures as well as examples of interventions that ing market, but it cannot usually provide quantitative have been more successful (World Bank 1993a). Much estimates of the effects of government policies on the depends upon the capacity of the institutions in the housing market. To find out how different housing country of the intervention and the prevailing process policies affect housing outcomes, analysts need to of housing development and management. understand household behavior. The following two The remainder of this section will discuss several subsections examine these two categories of analysis- types of housing market analysis. Before doing so, it is descriptive analysis and the estimation of behavioral useful to review the different ways in which govern- relationships. 
ment housing policies can affect housing outcomes, since this is the most direct way governments can Descriptive Analysis improve housing conditions in developing (and devel- Good descriptive analysis can provide policymakers oped) countries. Of course, the other obvious way that with key facts about the housing market. For example, government policies can improve housing outcomes is it can show which income groups benefit from subsi- by increasing economic growth, which raises house- dized housing and which households constructed their hold income, allowing households to purchase or rent own dwellings (and thus would not be directly affect- better housing. ed by changes in construction industry regulations). Besides the general role that governments play in Three basic types of data are most usefiil for descriptive providing a stable macroeconomic environment con- analysis: data on the housing stock, data on housing ducive to housing investment, there are many types of expenses (including taxes and subsidies), and data on government policies that affect housing. The most property rights (including rental arrangements). important are: * Assigning and enforcing property rights with THE HOUSING STOCK. Perhaps the most obvious data respect to land and real estate, including housing. to collect on housing is information on the physical 295 STEPHEN MALPEZZI characteristics of the dwelling, some of which are basic * Payments by both renters and owners for utilities indicators of a household's living standard. This gener- and other housing-related services (such as water, al use of data was already considered above and will be sewerage, electricity, and telephone services). discussed in more detail in the second section of this * The shares of the housing market that are financed chapter. 
However, there are certain dwelling charac- formally and informally, the terms, and how these teristics that have particular significance for policy- vary by income, region, and other household char- making. Descriptive analysis of housing stock data can acteristics. be used to examine: * Direct taxes paid, either by renters or owners. • The characteristics of a dwelling that yield infor- . Subsidy payments received by (or payments made mation about the incidence of taxes, subsidies, or on behalf of) renters and owners. regulations. For example, information on the rela- * The proportion of their income that households tive importance of indigenous versus "modern" typically spend on housing and how this varies by construction techniques or single-family homes type of tenure, the household's income level, versus multi-family housing often provides esti- region, and other household characteristics. mates of how much housing is subject to particular * The consumer's surplus gains and losses from regulations or taxes. It is often useful for policy- subsidies. makers to know how such characteristics vary by The notion of a consumer's surplus is important region and income. and merits a brief explanation. When a government * Dwelling characteristics related to basic quality subsidizes a household by giving it an unrestricted cash standards and building code requirements such as grant, a dollar is worth a dollar, a peso a peso, a ruble a regulations concerning water supply and sanitation. ruble. However, when the government subsidizes a * Vacancy patterns and how these vary by location. household by providing it with a good or service (such * In many countries, the differences in the quality of as housing) or if the government requires that a house- the housing between "formal" and "informal" sub- hold spend a transfer of cash in a certain way (for exam- markets. 
How do crowding, vacancies, and other ple, on housing), the value of the subsidy to the house- market outcomes differ in these submarkets? hold is usually less than its cash value. Measuring the Malpezzi (1984, Appendix F) provides a convenient household's actual benefit from such a transfer is the aim list of descriptive tables and cross-tabulations on hous- of measuring the consumer's surplus. Detailed discus- ing stock and related variables, which can be a useful sions of the consumer's surplus and related concepts can starting point for a descriptive analysis plan. Mayo and be found in Green and Malpezzi (1998), Freeman others (1982) provides an excellent illustration of how (1979), and Deaton and Muellbauer (1980). On subsi- household survey data can be used to describe and dies, including the application of consumer's surplus, see analyze basic housing market outcomes such as quali- Kim (1991), Sanyal (1981), Mayo (1986), andYu and Li ty and the policy implications that can be drawn from (1985). For more general analyses of incentives that such analysis. examine a wide range of such interventions, see World Bank (1989) and Malpezzi and Mayo (1997). HOUSING EXPENDITURES, TAXES, AND SUBSIDIES. Obviously, analysts need basic information on housing PROPERTY RIGHTS. Until the last decade, property expenditures to estimate any meaningful welfare rights in developing countries had not been analyzed measure for households and also to analyze the issues in much depth, largely because they are well-estab- of housing subsidies and taxation, which are discussed lished in many developed countries and have therefore below. Key issues for descriptive analysis are: been taken for granted. 
Nevertheless, property rights * Whether households that own their dwelling are are still an issue in many developing countries, partic- making payments on loans or mortgages, the size ularly in the transition economies of Eastern Europe and term of such payments, and when the loans and the former Soviet Union. The most important will be paid off. kinds of property rights data for descriptive analysis * The amounts that renters pay and the form that include: rent payments take (for example, cash, in-kind, or * How many households own and how many rent, work) and any utilities included in these payments. and how this ratio varies by region and income 296 CHAPTER 12 HOUSING group. For owners, information on the specific more elaborate treatment seeWorld Bank (1993a) and nature of property rights is also useful. Green and Malpezzi (1998). * For renters, the form of their rental arrangement, such as the length of the lease (if specified), from HOUSING DEMAND. How much people are willing to whom the dwelling is rented, and if there is any pay for housing is one of the most important character- relationship between the tenant and the owner. istics of the housing market that can be examined with * For owners, the existence of any official title or data from a multitopic household survey like the LSMS. deed for the house and for the land it is built on, As noted above, and as discussed in some detail in exactly who owns the title to the unit (or, for Appendix 12.1, housing rent (both actual or imputed) renters, who signed the lease) and what kind of title is an expenditure measure and consists of price multi- it is, and the extent to which the household's own- plied by quantity. The majority of demand studies ership of the title is secure. (including Follain, Lim, and Renaud 1980 and Malpezzi * The length of time the household has lived in the and Mayo 1987a, 1987b) examine expenditure by esti- dwelling, whether rented or owned. 
If owned, how mating so-called Engle relationships (for example, actu- the dwelling was obtained; if rented, the details of al or imputed rent) or sometimes house value (the pres- the lease. ent value of rent) as a function of income, demographic Additional discussion of property rights can be found in variables, and so on. A smaller number of studies have Kiamba (1989) for Africa, Bromley (1989) for Asia, and decomposed housing expenditure into its quantity and Betancur (1987) and Gilbert (1989) for Latin America. price components using hedonic models (see Appendix The recent literature is dominated by analysis of prop- 12.1) or models in which prices vary with the interur- erty rights in formerly socialst countries; see, for exam- ban location of the dwelling. Ingram (1984) and ple,Jaffe(1993),JaffeandLouziotis(1996),andPejovich Malpezzi (1998) are examples of studies that regress (1990). Examples of research on forms of housing some quantity measure against prices as well as other tenure and the value of this tenure include Jimenez factors that influence demand, such as income and (1982a, 1982b, 1984) andTipple andWillis (1991b). household composition. Although there are a plethora of measurement and other issues to be resolved in this The Estimation of Behavioral Relationships research area, housing demand is generally the most The above discussion showed how simple descriptive thoroughly studied and best understood of the major statistics can be used to get an idea of how govern- categories of housing market behavior (Olsen 1987). ment policies may affect housing outcomes. However, Key policy issues regarding housing demand are: descriptive analysis is mainly concerned with "what * How housing expenditures change with household is." To answer "why," it is important to know how income (the income elasticity of demand). households (and other relevant actors like suppliers Understanding this relationship is the key to under- and governments) behave. 
For example, descriptive standing the often-misunderstood set of issues that statistics can show how much households spend on are loosely labeled "affordability" issues. housing on average.This description can be extended * How housing expenditures change in response to by presenting averages for, say, different income changes in housing prices (the price elasticity of groups. However, to understand more about the demand). As housing prices are affected by taxes underlying behavior of households and other relevant and subsidies, this information can be used to show actors, analysts can go a step further and, for example, how tax and subsidy policies affect households' estimate the "income elasticity of demand," a sumrna- housing decisions made by households. ry numerical measure of how much housing expendi- * How demand varies with demographic characteris- ture increases as income increases.1 Analysis of housing tics. For example, how fast does housing consump- markets can be complicated by many different factors, tion change with household size? Do female-head- including housing's physical and locational hetero- ed households spend more or less than average after geneity, imperfect information about buyers and sell- controlling for other demand determinants? ers, illiquidity, significant environmental and other * The determinants of demand for different tenure externalities, and time lags in supply. Many, though not arrangements (owning, renting, or living in govern- all, of these issues are discussed in this chapter. For a ment-provided housing). 297 STEPHEN MALPEZZI * How demand relates to the household's investment Figure 12.1 Rent-to-Income by Income (Owners) motives, as well as its demand for current Rent-to-income (percentages) consumption. 100 * The demand for the individual characteristics of \ Seoul housing such as space, quality, location, and types of 80 amenities (such as type of toilet, drinking water, and electricity). 
In particular, how the location of a 60 household's dwelling relates to the location of the Manila 14 city average workplaces of the household's members.When pol- 40 - - _ icymakers misunderstand the latter relationship, this Bogota _ - can result in empty housing projects, underem- 20 - - _ … ployed public housing residents, and large ineffi- Cairo ciencies in transport spending in developing, devel- 0 oped, and transition economies alike. Comparative studies such as Malpezzi and Mayo Income in 1981 U.S. oilars (1987a) and many studies of single markets such as Note The authors compare markets by examining each market's median ncome Follain, Lim, and Renaud (1980) have demonstrated househo d.The dotted lines representing Cairo, Bogota, Manila and Seoul slope that the parameters of demand vary from country to down because within markets (for examp e, w[thin cities orwithin countr es) housing consumption always increased with ncome but generally grew more country in significant and at least partly predictable slowly than income-in other words, the income elasticity was less than 1. ways. Most studies have found income elasticities of Comparing I markets'median income (the sold line; not al 5 ctes are demand that are less than 1 within markets. (In other shown), the average rent-to-income ratio in each market increases with the median ncome- n other words, the ncome elast city is slightly greater than 1. words, housing consumption has increased with Source: Mapezz and Mayo 987a. income, but less than proportionately). Despite the rel- ative stability of within-market elasticity across coun- tries, the average share of the household's budget that Lim, and Renaud (1980), Ingram (1984), Mayo and is spent on housing varies tremendously from market others (1982), and Mohan (1994) provide useful to market and especially across countries; see Figure examples of how to undertake a demand study and 12.1 for an illustration. 
This relation can be examined tailor it to specific country conditions. by estimating the cross-market elasticity of average budget shares with respect to average income in each HOUSING SUPPLY. Much less research has been done to market. Malpezzi and Mayo (1987a) found that in a date on housing supply, despite the fact that supply range of developing countries, the cross-market elas- parameters are probably even more important for pol- ticity was actually greater than 1 (in other words, icymakers to know about than demand parameters. In housing consumption increased somewhat faster than broad terms, housing supply comes from two sources: income). new construction and the existing stock. Housing Despite the fact that many studies have already economists refer to changes in the existing stock as been carried out, experience suggests it is generally "filtering." In common parlance, as units "filter down," worthwhile to undertake customized demand studies they pass from richer households (owners or tenants) for a given market. There is a particular need for fur- to lower-income households. Units can also "filter ther research on how consumption responds to up"-pass from poor households to richer house- price-in other words, price elasticities (which are less holds-if a neighborhood is being revitalized or "gen- settled than income elasticities). Also, much of the lit- trified." Large improvements (upward filtering) in a erature on housing demand in developing countries particular dwelling are also known as "upgrading." focuses on demand for housing as a composite good, (For further information on upgrading see Strassman while there is much less research on demand for indi- 1982, Struyk 1982, and Rakodi 1987.) Key questions vidual housing characteristics such as numbers of about supply include: rooms and various measures of quality. 
See Follain and How much of the housing supply consists of new Jimenez (1985b) for a review of the literature on construction and how much is from the existing demand for specific housing characteristics. Follain, stock? How much upgrading is done in place and how 298 CHAPTER 12 HOUSING much is effective supply changed when two or more * Are land prices increasing faster than the overall households share (or stop sharing) a dwelling? rate of inflation? Where are land prices the highest * How does supply change in response to changes in and where are land prices increasing the fastest? the price of housing? What determines this elastic- * How do changes in land prices affect the costs of ity? What are the effects of natural (geographical) end users? Is the price and affordability of housing constraints versus man-made (regulatory) con- and commercial and industrial space changing and straints on supply? are real occupancy costs greater now than before? * What is the role of filtering in the market? In other * Is the land market segmented-for example, divid- words, how does the supply of housing from the ed into a formal and an informal sector? Which existing stock change to meet demand? During a households do not have access to housing from the given period, how much housing filters up and fil- formal private sector? What regulations govern the ters down? What are the determinants of this filter- use and sale of land? ing process, and are there regulatory or other * What is the system for providing infrastructure? impediments to it? What roles do the private and public sectors play in * What effects do different government policies (such this? Are costs recovered? Does the infrastructure as rent control, the regulation of real estate industry, system respond to demand? Does infrastructure get or government provision of housing) have on the installed in low-income areas? supply of housing? 
Once again, many of those questions are answered * How do these effects differ for different tenure (for most directly using aggregate or other collateral data, example, renting versus owning), by income, and by but many of these issues can also be analyzed using type of housing unit? household survey data. For example, it is straightfor- Some supply issues are best studied with aggregate ward to add questions on land prices and land acqui- time-series data, but many can be studied with house- sition to household surveys (see Mayo and others 1982 hold survey data, especially if the survey has collected for examples). Also, cross-tabulations of responses to panel data, which would make it possible to study household survey questions regarding services such as housing supply over time. Burns and Grebler (1977) water, sanitation, and transport can yield insights into and Renaud (1980) are examples of aggregate studies the provision of infrastructure. Angel and others of supply. Malpezzi and Mayo (1987a) presented the (1986), Dowall (1991), and Farvaque and McAuslan first econometric estimates of supply elasticities from (1992) are a few of the many useful studies of land time-series data for several developing countries. issues. Ingram and Carroll (1981) and Mohan (1994) Bramley (1993) and Ozanne and Struyk (1978) used give particularly good accounts of the spatial structure alternative methods to study supply with household of land markets in developing countries, and Bertaud survey data. For studies of supply from the existing and Renaud (1994) examine socialist countries where, stock through filtering, see Green and Malpezzi (1998) until recently, land prices were not permitted to vary for a general review and see Ferchiou (1982) and from place to place (or according to their productivi- Johnson (1987) for developing country examples. ty). Gackenheimer and Brando (1987), Lee (1992), and Lee and Anas (1992) discuss infrastructure issues in LAND AND INFRASTRUCTURE. 
Housing supply is inex- general. See Chapter 14 on the environment for a dis- tricably related to the amount of land available for cussion of water supply and sanitation issues in great housing construction and to the availability of infra- detail. structure. Major policy questions regarding land and infrastructure that require the estimation of behavioral HOUSING FINANCE. Perhaps the most important single relationships include: determinant of the quality of the housing of a given Is the supply of serviced land in urban areas household is its income and, therefore, its ability to expanding to meet growing population and purchase or rent housing. Nevertheless, because all employment needs? Which land uses are growing housing is an expensive and long-term investment, all the fastest? Where is urban land conversion taking housing purchases are financed in one way or anoth- place? Is the supply of infrastructure keeping up er. Formal housing finance, provided by a wide variety with demand? of organizations, has been the subject of much research 299 STEPHEN MALPEZZI in recent years. However, in many developing coun- about how the households in the sample have financed tries, formal housing finance institutions are relevant their housing and at what terms. The best example to only to a small proportion of households. Instead, date of housing finance analysis using household-level households in developing countries often turn to var- data is Struyk and Turner (I1986). ious informal sources of housing finance such as inter- family transfers, but these tend to be very expensive as Research Methods and Data Needs outlined in Renaud (1984) and Malpezzi (1996). The data needed to analyze many of the policy issues Some countries have only small enclave formal insti- discussed above can be collected in a multitopic tutions that make few loans at very favorable terms. 
household survey that includes a module specifically These often have little relevance to low-income and related to housing. This housing module would gath- rural households whose members earn their living in er data on, for example, housing location, housing the informal sector. In many respects, the challenge conditions (quality and quantity), tenure, and the rents facing the governments of many developing countries and prices that households pay.This information could is to encourage the development of formal sources of then be combined with data from other parts of the housing finance that are sustainable and affordable to a household questionnaire (on, for example, household broad range of the population. incomes and characteristics) to answer many of the Given the importance of finance for determining questions posed in the first section of this chapter.2 A housing outcomes, policymakers should aim to deep- well-designed housing module will also collect data en housing finance markets in order to encourage that assists analysts in other ways (for example, to investment in housing. Key issues in the area of hous- measure consumption accurately and precisely). ing finance include: It should be mentioned at this point that there is - What are the sources of housing finance, and how very little information on the operation of rural hous- are these funds used? What is the system of inter- ing markets in developing and transition economies. mediation for housing finance, and how is it con- In fact, the vast majority of housing market analysis in nected to financial intermediation in general? What developing, transition, and developed countries has kinds of mortgage instruments are available on the focused on urban housing markets, thus excluding a lending side? What rules govern institutional fea- significant slice of the housing market in the countries tures such as mortgage insurance and foreclosure? 
* Are subsidies and taxes built into the financial system? If so, what is the nature and extent of these subsidies and taxes? What are the effects of tax, regulatory, and subsidy policies on the cost of credit?
* What are the mortgage interest rates, and other terms, paid by households of different types that are borrowing from formal and informal sector finance institutions? How do these terms compare to the financing available for other (nonhousing) investments, and how do they compare to inflation?
* What are the real effects of housing finance, in other words, the effects that housing finance has on housing consumption, tenure choice, and mobility? Does the availability (or lack) of formal housing finance affect such outcomes, or are formal and informal finance good substitutes?

Most research on housing finance has used institutional and macroeconomic data rather than household survey data. However, much can be learned about housing finance from household survey data if the questionnaire includes carefully chosen questions

being studied.3 While this is the case in virtually all countries, the severity of the problem that this omission presents varies from country to country. For example, in Asia, Korea is currently about four-fifths urbanized, while Thailand is about four-fifths rural. More than one-third of the populations of Poland, the Czech Republic, Hungary, Italy, and Switzerland live in rural areas. Certainly, one of the biggest contributions of LSMS surveys to housing analysis is their provision of data on rural housing markets, which can be used to research this much neglected area.

Categories of Data

This subsection outlines the categories of data that can be collected in a housing module in a multitopic household survey. It also indicates specific questions that should be included in the questionnaire, the answers to which are likely to illuminate the important policy issues outlined in the first section of this chapter.

HOUSING CHARACTERISTICS. The most basic data that should be collected in the housing module are data on the characteristics of the household's dwelling. The most relevant characteristics for policy research purposes will vary somewhat from place to place, but it is always important to collect data on the basic structure of the dwelling (for example, whether it is single-family or multifamily and what material it is built with), the age of the structure, its size, the number of rooms, the number and size of bathrooms, and other characteristics related to the type and reliability of its water and sanitation services.

Other important questions relate to the quality of the neighborhood in which the dwelling is located and what services are provided in that neighborhood. Not all of these location data need to be collected by asking questions of household respondents. It would be better for the interviewers to make their own observations of these phenomena while they are in a household's dwelling to conduct the survey interview. First and foremost, they should record the location of the dwelling in a city, town, or other market, since housing markets are typically analyzed by place. Within each city or market, they should indicate where the dwelling is located in relation to the central business district of a city or town. One question that must be asked of the respondent rather than observed by the interviewer is the distance household members must travel to their workplaces and the amount of time it takes them to get there. Also, it may be useful to find out how far the dwelling is from other places of employment in the area or from central locations in the metropolitan area.

PRICES. One set of issues that must be addressed early in the design phase of any LSMS housing questionnaire relates to measuring housing prices and consumption. These issues have been discussed briefly above and are discussed again in some detail in Appendix 12.1. Rent is the most obvious measure needed for any consumption analysis. Because rent can be observed directly for renters but not for owners, it is usually necessary to impute the rental value of an owner-occupied unit.

There are several ways to collect these data (Green and Malpezzi 1998). First, the owner can be asked how much rent they could charge for their unit. Second, the coefficients of a hedonic index estimated using a rental sample can be applied to the corresponding characteristics of individual owner-occupied units to impute rent.4 A third general approach is to apply a capitalization rate to each owner-occupied unit to appraise its value.

Appendix 12.1 briefly describes hedonic indexes and "cap rates" for readers unfamiliar with these concepts. Each approach has its pros and cons. Generally these approaches are complementary, although the hedonic approach can be especially useful. Hedonic indexes require extensive data on a unit's characteristics (such as size, type, and location) as well as on the amount of rent paid.

Whatever general approach is taken, data must be collected on arms-length market transactions, which are transactions between two parties who have no special relationship that would suggest the price paid is different from market prices. For example, transactions between close relatives may not be arms-length.5 Price controls, subsidies, discounts to relatives and kin, and transactions that include in-kind rents (such as services performed in lieu of cash rent) all introduce obvious differences between the cash price paid and the arms-length market price. The questionnaire needs to differentiate households that are reporting their own rents and values based on arms-length transactions from households that are under some form of control or subsidy, are related to the landlord, and so on. A further complication is that in some markets, very few market transactions are not affected by some sort of price control. For example, in some markets, very few units are traded at market prices. This can be because housing is primarily owned by the government and is rented at very low rents (as in Moscow and, until recently, China) or because rent control is very widespread (as in Ghana; see Malpezzi, Tipple, and Willis 1991). Nevertheless, despite the problems that can be involved in interpreting such numbers in countries like Russia, it is necessary in these countries to collect data on the official (nonmarket) rent for the purposes of policy analysis.

It follows that the questionnaire should be designed to elicit from the respondent whether the household receives any housing subsidy and, if so, what kind, and whether the unit is subject to rent control. It is also important that the questionnaire carefully distinguish between housing and agricultural real estate in rural areas and between housing and shops, offices, and other nonresidential uses in both rural and urban areas. In addition, it should be noted whether any commercial premises are physically attached to the household's dwelling.

When the survey is fielded in countries or regions with no active housing market, it may be appropriate to include questions about housing prices in the community module of the household survey. These questions can be put to community leaders or others who are knowledgeable about what housing units exist of various standard types. These questions in the community module will supplement the housing questions asked in the household questionnaire. If the market is extremely moribund with few similar dwellings being sold, the questions included in the community module can be about the costs of constructing typical housing units.

In many countries, property taxes are an important source of government revenue (Dillinger 1991). Of course, how great a burden they impose depends on whether they are levied or enforced. In some markets, various transaction taxes and registration fees on housing sales are high. Where this is relevant, questions about such taxes and fees can easily be added to the housing module. In some markets, questions about condominium fees or maintenance fees will also be relevant.

EXPENDITURES. An issue that can arise when survey designers are framing the questions about housing demand is the distinction between gross and net household expenditures on housing. Some renters pay for their utilities separately from their rent, but others pay a monthly rent that includes utility charges. If more than one household lives in a unit, it is necessary for analysts to know how much money is passed from one household to another and how much goes to third parties such as the landlord. Renters may also face additional charges, particularly in controlled markets, including key money, advance rent, and expenditures on maintenance and repairs. Malpezzi (1998) discusses the role of such side payments in some detail. The questions in the housing module should cover all of these possible extra charges.

MOBILITY. Research has demonstrated that the longer a household stays in a unit, the lower rents are for a given level of housing service, even in markets without rent control. This "tenure discount" associated with longer stays is often motivated by a landlord's desire to reduce turnover, avoid vacancy losses, and continue leasing to known tenants.6 Consequently, a question should be included about the length of the family's tenure in the unit.

A related factor that critically affects demand is the mobility of the household. This can vary enormously among countries. Strassman (1991) found that, in a given year, fewer than 5 percent of households in Colombo, Sri Lanka moved, whereas in Bangkok, Thailand about 20 percent of households moved in a year and in Seoul, Korea an astounding 43 percent moved. Including questions about the length of tenure in the survey can yield data that can be used to study such behavior. More elaborate housing questionnaires often add additional questions about previous residences and planned moves (see Mayo and others 1982 and Malpezzi 1994).

SUPPLY. As was discussed in the first section of this chapter, the supply of housing in any given country consists of the existing stock and of new construction. In any given year, well over 90 percent of the housing in a given market consists of the existing stock. Descriptive tabulations of housing characteristics, both on their own and cross-tabulated by relevant criteria such as income and tenure, can yield important insights into housing in the existing stock.

A more dynamic way to analyze the supply of housing from the existing stock is known as studying "filtering." There are three ways of analyzing filtering (Green and Malpezzi 1997). The first way is to examine the incomes of the changing occupants of existing housing units over time and whether they "filter up" or "filter down" (Zais and Thibodeau 1983). The second way is to examine the price per unit of housing services for different parts of the housing stock (for example, low-quality versus high-quality housing) (Lowry 1960). The third alternative is to examine how the quantity of the stock changes (Malpezzi, Ozanne, and Thibodeau 1987). For example, what effect does new construction have on the amount of low-quality housing? What are vacancy rates like at the bottom of the market? How fast do units depreciate? Each of these types of analysis can be done with data provided that a panel of data is collected. The answers to questions on rents and prices, household income, tenure, length of stay, housing characteristics, and age of the unit are key variables for filtering studies. The respondent should also be asked whether the household has had or currently has any plans to upgrade its dwelling.

PROPERTY RIGHTS AND TENURE. Another set of variables that needs to be collected in the questionnaire is the set of variables related to tenure security. First and foremost, analysts need data on how long the household has lived in its current dwelling. Information on the type of rent control on the dwelling or any subsidy received by the household is often relevant for the study of tenure security since security is often related to these regulations. Other questions may need to be included in markets in which there is squatting or a mix of "traditional" and "formal" tenure.

Household surveys have a number of uses in studying property rights and tenure issues (Daniere 1992; Friedman, Jimenez, and Mayo 1988; Gyourko 1989; Jimenez 1984; Lim, Follain, and Renaud 1980). Questions relating to property rights and tenure should be drafted carefully to ensure that they reflect the current circumstances in the country of the survey. Thereafter, at a minimum, rights and tenure should be categorized in three ways: owning versus renting, informal versus formal/secure tenure, and public/social versus private ownership. These categories are often continua rather than mutually exclusive. For example, in Korea renting encompasses several payment systems, including periodic payment of rent, a deposit-based rental system (chonsei), and several mixed forms of deposit and periodic rent (wolsei). On the other hand, the British system of very long-term leases (99 years or more) is in some ways closer to owning than renting, even though periodic ground rent is paid and eventually the property reverts back to its residual owner.

LAND AND INFRASTRUCTURE. Since the provision of infrastructure is a core function of all governments, the proportion of households living on land served by basic infrastructure is of great interest to public policymakers. The benefits of the services can often be approximated by how they affect land value. The LSMS housing module should contain questions about the value of lots, as well as questions about their size, location, and the type of infrastructure to which they have access.

HOUSING FINANCE. Many of the questions relevant to housing finance are included in the savings and credit modules rather than the housing module (see Chapters 20 and 21). Of course the questions described in those chapters have to be tailored to local conditions. For example, the most common kind of mortgage in the United States (also found in many other countries), a self-amortizing mortgage with a fixed interest rate and equal payments, can be completely described by four pieces of information: the interest rate, the loan amount, the loan term (duration), and any up-front fees. However, many other kinds of mortgages are possible. For example, interest rates may be tied to an index or payments and amortization schedules may vary (Buckley 1996; Chiquier and Renaud 1992).

Much can be learned from household survey data about how different kinds of households finance their housing and on what terms. Discrete choice models and cross-tabulations can be used to analyze these outcomes. Another finance issue that can be analyzed using household survey data is the relative inefficiency of "progressive building" (which is based on the stockpiling of materials and their use from time to time) compared to mortgage finance (see Renaud 1984).

In countries in which financing is subsidized for some borrowers or some kinds of households face very different finance terms than others, the value of different "deals" can be calculated in present value terms and then the distribution of these implicit transfers can be analyzed. World Bank (1989) demonstrated how to carry out a simple analysis of this type. Struyk and Turner (1986) demonstrated another way in which household survey data can be used to study the effects of finance on the housing market. They developed a simultaneous model of housing investment and demand for finance that can be used to test whether, and if so how much, finance availability affects housing investment.

HOUSING AND EMPLOYMENT. The importance of location with respect to workplace and other services was discussed above. When housing markets do not function well, this can prevent the efficient functioning of labor markets in general (Hughes and McCormick 1987; Johnes and Hyclak 1994; Mayo and Stein 1995). Another issue that must be tackled in some countries is the fact that in many specific enterprises, both public and private, employees' housing is provided in conjunction with their employment. Enterprise housing in China is the most obvious example of this phenomenon, but company housing can often be found in noncommunist countries as well (Tolley 1991; Fishback 1992). For example, company housing is often associated with mining and other extractive industries when these are undertaken in remote areas.
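The housing finance discussion above notes that a fixed-rate, self-amortizing mortgage is fully described by its interest rate, loan amount, term, and up-front fees, and that the value of a subsidized "deal" can be expressed in present value terms. A minimal sketch of that arithmetic follows. The function names, the monthly payment convention, and the idea of discounting at a single "market" rate are illustrative assumptions, not a method prescribed by the chapter or by the studies it cites.

```python
# Illustrative sketch only: function names and the monthly-payment
# convention are assumptions made for this example.

def level_payment(principal, annual_rate, years, periods_per_year=12):
    """Equal periodic payment on a fixed-rate, self-amortizing loan."""
    n = years * periods_per_year
    r = annual_rate / periods_per_year
    if r == 0:
        return principal / n
    return principal * r / (1 - (1 + r) ** -n)


def present_value(payment, annual_rate, years, periods_per_year=12):
    """Present value of a stream of equal payments at a given discount rate."""
    n = years * periods_per_year
    r = annual_rate / periods_per_year
    if r == 0:
        return payment * n
    return payment * (1 - (1 + r) ** -n) / r


def implicit_transfer(principal, contract_rate, market_rate, years,
                      upfront_fees=0.0):
    """Cash received by the borrower minus the present value, at the
    market rate, of the payments the borrower owes. A positive value
    means the loan terms carry an implicit subsidy."""
    payment = level_payment(principal, contract_rate, years)
    owed = present_value(payment, market_rate, years)
    return (principal - upfront_fees) - owed


if __name__ == "__main__":
    # Hypothetical loan: 100,000 at a subsidized 5 percent for 20 years,
    # when comparable market credit costs 10 percent.
    print(round(implicit_transfer(100_000, 0.05, 0.10, 20), 2))
```

With the four loan descriptors collected for each borrowing household, a calculation of this kind could be repeated household by household and the resulting transfers tabulated by income group or tenure type.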
If relevant, questions should be included in the housing module about employer- or enterprise-provided housing.

MIGRATION. Another issue that arises mainly in rural areas is the housing of migratory workers, such as itinerant agricultural laborers. This issue also sometimes arises in urban areas. For example, in China, the government classifies many urban households as "temporary." This can make the choice of sampling frame particularly critical. Many obvious sampling frames, such as household registration lists, may systematically miss such households. Thus this kind of sample frame may need to be supplemented to ensure that these households are included in the sample.

DATA FROM OTHER PARTS OF THE QUESTIONNAIRE. Much housing analysis, especially studies of housing demand, relies on data gathered in other parts of the questionnaire. The main data needed for housing analysis from other parts of the questionnaire are summarized here so that survey designers will not overlook them.

It is reasonable to assume that the demand for housing is related to the household's expectations about its long-term economic situation. Since housing consumption is related to long-run or permanent income, this suggests that permanent income rather than current income is the true determinant of housing consumption. Permanent income is, however, never directly observable and total household consumption is usually used to proxy for it (Hall 1978). Thus it is important for housing demand analysis that the questionnaire contain detailed consumption modules.

It is also useful for housing market analysts to have data on current income measures as well, for example, to analyze mortgage underwriting criteria or to study the targeting of housing subsidies. Because the qualification process for various subsidies and mortgage underwriting usually depends on current income rather than on permanent income or consumption, analysts need to know the household's marginal propensity to consume out of its current income as well as its consumption. What would be even more useful for housing analysts would be a detailed analysis of the marginal propensity to consume housing out of different kinds of income (by type of employment, by the head of household versus the other household members, and so on). Thus these types of data should also be collected in the relevant modules of the questionnaire.

Estimating patterns of demand requires data not only on prices and incomes but also on other determinants of demand such as the family's preferences about housing, the family's composition, and the household's size (which is the most important single demographic variable affecting housing consumption). Other data that would be useful for analysis include the age of household head, the number of children in the household, and the sex of the head of the household. In some circumstances it may be appropriate to collect data on the household's income, type of tenure, religion, or caste to use as proxies for taste.

Survey Issues

There are several important issues relating to the mechanics of implementing the housing module.

SAMPLE. Statistical methods are used to estimate the sample size required to answer a particular question to a desired degree of precision (Kish 1965). Experience suggests that roughly 500 observations are the minimum required from a given "housing market" (for example, a metropolitan area or a rural region) for useful analysis that cross-classifies data by tenure and other factors and that allows for nonresponses and other data problems. Because LSMS surveys tend to have national samples of 2,000-5,000 households, they are often unable to produce large enough subsamples in all but the largest metropolitan areas. This means that current LSMS designs are better suited to broad analyses of "national," rural-urban, or regional housing markets. However, much research suggests that defining markets so broadly often obscures important differences among geographically disaggregated markets. Of course, resource constraints are a fact of life, and much can be done with surveys on the scale of the typical LSMS. Yet if housing market analysis is an important goal of an LSMS and if there appear to be different market conditions in different cities (or in different rural regions) in the country of the survey, serious consideration should be given to increasing the size of the sample or to over-sampling cities or regions of special interest. If the latter strategy is adopted, sample weights must be assigned to reflect this over-sampling.

PANEL DATA. Analyzing the dynamics of the housing market over time requires panel data. However, using the household or the individual rather than the dwelling as the unit of observation can present comparability problems for housing analysts because households do not necessarily stay in the same dwelling between survey rounds. In previous LSMS surveys the housing unit has generally been used as the basis for the sample frame, which means that the survey followed the housing unit rather than its original occupants over time. While this has complicated analysis for some other issues, it is preferable for some housing analysis.

In some studies, such as the Mayo and others (1982) study of Egypt, retrospective questions were used as a proxy for a prior panel. Of course, this is not as good for analysis as proper panel data, as respondents often give inaccurate responses to retrospective questions because their memories of past events are imperfect.

Some key issues that need to be addressed when designing such a panel include the need to ensure that units that have dropped out of the stock are clearly coded to distinguish them from units that are temporarily unoccupied and the issue of how to bring newly constructed units into the panel over time. It must be possible to link each unit's data in one year's file to that in another year's file. It is essential to include a unique identifier code for each unit. Units that have been demolished, held vacant, or otherwise dropped out of the panel in the past should be identified, along with their current status. With regard to vacant units, survey designers should devise a short section of questions to be put to a respondent in a neighboring dwelling to discover, for example, how long the unit has been vacant, whether it is slated for demolition, and the rent at which it is being offered.

COUNTRY-SPECIFIC QUESTIONS. The need for survey designers to tailor the questionnaire carefully to local conditions cannot be overemphasized. For example, it is highly unlikely that bamboo would be used to construct houses in Moscow. It is just as important to tailor less obvious questions such as those about tenure and payment methods. See Malpezzi with Loux (1994) for examples of more detailed housing questionnaires.

The Housing Module

This section introduces a draft housing module (presented in Volume 3) which, suitably modified, can be inserted into an LSMS questionnaire. "Suitably modified" deserves special emphasis. Every country is different in terms of the physical design of housing, its tenure, how it is paid for, and so on. The sample questionnaire introduced here should be considered only a starting point for designing an actual module. The initial design of any module should be thoroughly pretested to ensure that it is capable of yielding the required data. This sample questionnaire will not repeat questions that appear in other modules of the survey and are covered thoroughly in the relevant chapters. Note that this module contains a bare minimum of questions on water, sanitation and fuel use, which are suitable for describing basic living conditions and enumerating households' major expenditures on these items. If water, sanitation, or fuel use are of special interest in the survey, the questions in this draft module should be dropped, and the expanded submodules contained in the environment chapter (Chapter 14) should be inserted in their place.

Similarly, the draft module does not contain much on housing finance, since such questions are contained in the credit module introduced by Chapter 21 (and presented in Volume 3). If that module were to be dropped, some of the questions about credit for housing could be moved to this module. Additional questions can be found in the sample housing questionnaire in Malpezzi with Loux (1994).

The "long" draft module presented here is somewhat longer than that used in many past LSMS surveys. This is partly because it will support more analysis of housing market issues, rather than merely the description of living conditions and calculation of consumption of housing. It also includes water and sanitation questions that are suitable to situations in which households use multiple sources; includes questions on such transactions as deposits, "key money," and cooperative fees, which were rarely covered in previous LSMS surveys; and tries to cover the full range of housing market characteristics that exist in all regions of the developing world from Eastern Europe to Sub-Saharan Africa. In practice, only in very few countries will all of these additional questions need to be included in the module. In the places where a particular characteristic is rare, questions about that characteristic can be simplified or omitted. A shortened version of the questionnaire is presented after the main version to give an idea of how it can be shortened. In this case, some of the topics that allow study of housing market issues have been omitted, and the detail on living standards has been reduced. Again, the short version shown here is merely indicative; exactly which subset of questions should remain in a shortened version will depend on the circumstances. For example, the short version shown here omits questions on key money and other such deposits, but if they are relevant to a country setting, they should be included, even in a short version of the questionnaire.

Box 12.2 Cautionary Advice

* How Much of the Draft Module Is New and Unproven? Almost all of the components of the draft housing module have been used either in past LSMS surveys or in special-purpose housing surveys.
* How Well Has the Module Worked in the Past? This module has been used for simple descriptive sketches of the housing conditions of the households, for which it has worked fairly well. One exception to this is that the modules included in past LSMS surveys have often included only one question on the household's source of water, which in many situations has not reflected the complexity of household water sources. Also, some of the housing cost questions have been ambiguous or insufficient. In particular they have failed to make clear whether the rent includes utilities, and few surveys have included questions on any additional financial transactions such as key money or condominium or cooperative fees. However, while previous LSMS studies have made only limited use of the housing module, many other studies have been undertaken in developing countries that have made extensive use of such data. Mayo and others (1982) is probably the best single example.
* Which Parts of the Module Most Need to Be Customized? A great deal of the module needs to be carefully customized to reflect the housing conditions in the country where the survey is to be fielded. Many aspects of housing vary greatly from country to country, including the predominant types of dwellings, the materials that they are made of, the kinds of amenities that are indicators of living standards, and the form in which different housing-related expenditures are made. For example, questions on privatization of state-owned dwellings, on how well elevators operate, and on the adequacy and costs of heating will be relevant in surveys in Eastern European countries but not in countries in Sub-Saharan Africa.

Notes on the Housing Module

This section briefly discusses the definition of key concepts and other specific points in the module, following the numbering system of the longer version of the module. When the module is going to be used in an actual LSMS survey, it is important to produce a manual that includes a more detailed checklist of definitions both for survey workers and for future users of the data. The U.S. Census report on the American Housing Survey 1995 (which can be downloaded from www.census.gov) provides a general example of such documentation. See also Malpezzi, Bamberger, and Mayo (1982) and Malpezzi (1994) for further examples.

For housing analysts to be able to use the housing data, they need accurate, reliable information on related topics, such as household size and composition and household income. It is assumed in this chapter that these key collateral data are indeed collected in accordance with the discussion in the other chapters in this book.

It cannot be emphasized enough that survey designers will need to revise and pretest the questionnaire to bring it in line with local conditions. For example, there are not very many houses in Cracow, Poland that have felt walls or thatched roofs, and detailed questions about heating systems will be irrelevant in Accra, Ghana. While this section does not address the issue of country-specific relevance with regard to every question, survey planners should do so themselves when they are designing the questionnaire for their particular survey.

Part A: Description of the Dwelling

Part A of the housing module is designed to yield data that give a basic description of the dwelling.

Question A1 asks whether the dwelling is the household's primary residence and, if it is not, redirects the interview to be about the primary residence. For measuring living standards, it is most important to know about the conditions of the primary residence since those are the ones that pertain to the household most of the time and best reflect the quality of infrastructure available to the household. If the survey's purpose were only analysis of housing markets, gathering information about the costs and quality of secondary residences would be a perfectly reasonable option.

There are at least three ways to deal with secondary residences. In past LSMS surveys the issue was completely ignored, and Question A1 was not used. Although this is not technically correct, no complaints have ever been made to the central LSMS team on the subject.
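The SAMPLE discussion above states that if particular cities or regions are over-sampled, sample weights must be assigned to reflect this. The sketch below shows one common way to do so, assuming simple stratified sampling in which each stratum's weight is its household population divided by its sample size. The stratum names, counts, and rents are hypothetical, and an actual LSMS design (typically multistage and clustered) would need weights derived from its own selection probabilities.

```python
# Illustrative sketch only: assumes simple random sampling within strata.
# Stratum names, household counts, and rents below are hypothetical.

def design_weights(population_by_stratum, sample_by_stratum):
    """Weight for each stratum: households represented per sampled household."""
    return {
        stratum: population_by_stratum[stratum] / sample_by_stratum[stratum]
        for stratum in sample_by_stratum
    }


def weighted_mean(records, weights):
    """Weighted mean of values; records are (stratum, value) pairs."""
    total_weight = sum(weights[stratum] for stratum, _ in records)
    weighted_sum = sum(weights[stratum] * value for stratum, value in records)
    return weighted_sum / total_weight


if __name__ == "__main__":
    # The capital holds one-fifth of all households but supplies half
    # of the sample, so it is over-sampled.
    population = {"capital": 200, "rest": 800}
    sample = {"capital": 2, "rest": 2}
    rents = [("capital", 100.0), ("capital", 100.0),
             ("rest", 50.0), ("rest", 50.0)]

    weights = design_weights(population, sample)
    unweighted = sum(value for _, value in rents) / len(rents)
    print(unweighted, weighted_mean(rents, weights))
```

In this toy case the unweighted mean overstates national rents because the high-rent capital is over-represented; applying the weights restores each stratum to its share of the population.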
One reason this issue has been ignored is that in most countries where LSMS surveys have been done, secondary residences are rare and pertain only to the extreme upper end of the welfare distribution. Moreover, the richest frequently have the highest nonresponse rates and even when they do respond, their expenditure, income and wealth are probably underestimated since LSMS questionnaires are designed to be applicable to the broad range of society with special emphasis on the poor. Thus ignoring this issue in the past may not have had much empirical impact on most of the analysis done with the data.

A second option is to use just the simple question included here. It will give some information on how important the topic is in the country, and will allow sampling weights to be adjusted. A third option is to deal with the issue of secondary residence much more fully. This will be appropriate where secondary residences are relatively common and their ownership extends to a wider range of society (for example, in Finland, where about 20 percent of households have a secondary dwelling). To deal with the issue fully will mean not only directing the interview on housing quality to the primary residence, but also adding questions about at least current expenditures on the secondary dwelling, and probably adding questions on its value as an asset. Whatever approach is taken in the questionnaire should accord with how secondary units are treated in the sample. Are they included or excluded from the sample frame? Are they substituted out if detected during interviewing? Are the sampling weights adjusted for households that own or geographic areas that contain secondary residences?

Each person has a commonsense notion of what is meant by such terms as "house," "household," "room," and so on, but these notions may differ from person to person. For example, is a "bathroom" also counted as a "room"? Accurate use of survey data is only possible if such definitions are consistent, in other words, if all of the survey interviewers have the same definition of each concept. For this reason, some definitions of common but important housing concepts are presented here. Many of the sample definitions will have to be modified to suit country conditions.

For example, consider Question A11 on rooms. The definition of "room" will vary from country to country. A sample definition that can be used as a starting point, adapted from the U.S. Census definition, is: "whole rooms used for living purposes, such as living rooms, dining rooms, bedrooms, finished attic or basement rooms, recreation rooms, permanently enclosed porches suitable for year-round use, lodger's rooms, and rooms used as offices or for business purposes. A divided room is separate if there is a partition from floor to ceiling but not if the partition is impermanent or made only of shelves or cabinets. Not included are bathrooms, halls, vestibules, balconies, alcoves, closets, unfinished attics or basements, and unenclosed porches. If a room is used by occupants of more than one unit, the room is included with the unit from which it is most easily reached."

Separate questions should be developed for particular types of rooms or structural features that are especially important in the country surveyed. For example, many questionnaires ask how many bedrooms a unit has. In the Ghanaian survey that was analyzed in Malpezzi, Tipple, and Willis (1990), separate questions were asked about unenclosed verandahs, because households with this feature tend to make considerable use of it. It is not important that there is double-counting in this case, since a bedroom would be counted both as a room and a bedroom. What is important is that the special rooms are either always double counted or never double counted and that the documentation makes clear which is the case.

DWELLING UNIT. A dwelling is an accommodation unit that contains one or more households. It may be a detached house, a villa, part of a flat, a shack, a tent, a separate room, or a houseboat. There may be several dwellings in a structure.

STRUCTURE. A structure is a physically separate entity such as a house, an apartment building, or a tent. It may contain one or more dwelling units.

BEDROOMS. The number of bedrooms in the unit is the sum total of all separate rooms that are used regularly for sleeping, even if they are also used for other purposes. Rooms reserved for guests' sleeping are counted as bedrooms. On the other hand, rooms used regularly for other purposes, even though used occasionally for sleeping, are not counted as bedrooms. All bedrooms are also counted as rooms.

Question A14 asks about the area of the unit. In some countries, such as Korea, households are likely to know this area precisely. In other countries they will only be able to produce a rough estimate.

A number of questions in this and other sections are questions for which households may have only approximate answers. In some cases, such as the area of the unit, an alternative approach is possible; for example, if there is enough interview time, it may be possible for the interviewer actually to measure the dwelling unit. For other questions, such as the age of the dwelling, no such alternative may exist.

Generally, it is better to get an approximate answer to the right question than a precise answer to a useless question. This may seem obvious, but census bureaus around the world mistakenly exclude important questions because they are likely to be measured with error. It is certainly important to understand the consequences of such errors, and in particular to understand the difference between biased estimates and imprecise estimates. For example, studies have shown that households tend to give answers to questions about the age of their unit that contain a significant degree of error. However, if they are as likely to overestimate as to underestimate, the statistics based on these data (such as the mean age of dwellings of a certain type) will be unbiased, although these estimates will be less precise than if the respondents had a very good idea of the age of the unit. A further discussion of this issue can be found in Follain and Malpezzi (1981).

Part B: Housing Services

If water, sanitation, or fuel use are of special interest in a given survey, the expanded modules contained in Chapter 14 on the environment will be better starting points for questions on those subjects than the questions given here. The questions on such topics included in this housing module can only yield descriptive information. If the specialized modules contained in the environment chapter are used, these questions should be omitted. It would be natural to put the housing module next to the water and sanitation modules in the questionnaire and possibly next to the fuel module as well (though this might just as logically be placed next to the consumption module).

The questions on water sources included in this draft housing module distinguish between rainy and dry seasons. In some countries this distinction can be omitted. The module also distinguishes sources depending on whether they are used for drinking and cooking or for bathing and washing.

dropped. However, households in rural areas probably use different sources for these two purposes until there is a considerable amount of infrastructure in their areas, which means that this distinction is pertinent in most countries.

The questions about "what is the main source of water..." are a little tricky to word. These questions aim to yield data on the type of access that a household has, not on the body of water that feeds into the central pipeline. Thus great care should be taken in translating these questions. Similarly, there are many different possible sources, and they can be called different things in different places (for example, a standpipe versus a public tap). The basic idea is to devise answer codes that convey something about the likely safety and convenience of each source, without devising so many codes as to overwhelm the interviewer or the respondent.

Sanitation systems (flush toilets, pit latrines, bucket systems, and so on) are another example of something that varies tremendously from country to country. A housing unit is classified as having a bathroom if it has a room attached to the house with a toilet and at least one of the following: a bathtub, a shower, or a sink with running water. If a unit has these facilities but the toilet and at least some washing facilities are not in the same room, then the unit does not have a bathroom.

A kitchen is a room set aside for preparing food. It must have a stove or other facility for cooking and may have a sink and a refrigerator or icebox as well. A complete kitchen has all three facilities. A kitchen is also counted as a room if it is enclosed.

Part C: Dwelling Expenditures

Part C of the module focuses on household expenditures on housing. Obviously, how these questions are asked will vary from place to place. In particular, questions about expenditure are inextricably bound up with questions about the form of housing tenure, and this varies from place to place. Often, units are either owned outright or rented, but there are many other forms of tenure in some countries. In Korea many households have a form of tenure called chonsei, which is similar to renting but, instead of paying periodic rent, the household puts down a large refundable
In a few coun- lump sum as a deposit, often as much as half of the tries that are highly urbanized and have very well- value of the unit. Other forms of tenure in Korea developed water systems, this distinction can be include owning outright and renting, but there are 308 CHAPTER 12 HOUSING also mixed forms, such as households that put down a make clear for the interviewer and for the respondent smaller deposit and then pay a periodic rent, wolsei. which rental income is covered here and which is cov- This section is closely related to the chapter on ered in the transfers and other nonlabor income chap- credit, which introduces a draft credit module in ter. Only data on the rental income from the dwelling which data are collected on mortgage transactions (see to which the interview pertains are captured in the Chapter 21). draft housing module. Income from the rental of other In addition to collecting accurate, reliable data on dwellings where the respondent does not live is cov- expenditures associated with housing, it is extremely ered in the transfers and other nonlabor income mod- important to get some sense of whether these partic- ule introduced by Chapter 11 rather than in the hous- ular households are facing market prices and engaged ing module. in arms-length transactions. For example, it is impor- tant to design the questionnaire to find out whether Part D: Household Opinions About Their House and the government provides a household with its Neighborhood dwelling. In that case, analysts might want to know The purpose of this section is to identify the aspects of what the rent is for other purposes than as an indica- the house and neighborhood with which households tion of the state of the market. If a household is rent- are most and least satisfied. Only a few general opin- ing its dwelling from a close relative, the household ion questions have been included in the draft housing may be paying a lower-than-market rent. 
In some cul- module about households' satisfaction with their unit tures being a member of a kinship group implies that and their neighborhood. Hedonic price studies of the the household gets a discount. If this information is United States suggest that such general opinions are collected in the survey, analysts can study the size of closely associated with housing prices but that once these discounts. such general questions have been asked, more detailed Questions about payments that households make questions (for example, about households' satisfaction for their utilities are in this section, and use a recall with schools, public safety, and so on) are not general- period of the previous month. This should work well ly statistically significant.7 in places where most of these items are billed for on a However, there may be situations in which it is monthly basis. In places where this is not the case, it worthwhile to ask additional questions about housing may be preferable to ask respondents about some of and neighborhood satisfaction. For example, it is plau- these items earlier in the module when the amenity is sible that different neighborhood characteristics may discussed. For example, questions about different be valued differently in different countries. For exam- forms of payments for the different sources of water ple, consider a country with a highly stratified educa- can be interwoven into that section. Chapter 14 on tional system, where attending primary school in a environment covers the most detailed set of water particular location leads to the opportunity to attend a charges, and differentiates many of the questions prestigious secondary school and university. The value according to the type of source and the different ways of this may be capitalized into housing prices and may in which charges for it may be made, illustrating this be highly significant in such a country. 
idea of interweaving expenditure with use and ameni- If the list of neighborhood questions is expanded, ty questions. Expenditures on fuel can be included in Malpezzi with Loux (1994) and especially the the housing module, in the consumption module, or American Housing Survey have many examples of in a specialized fuel use module, with increasing detail potential questions. It is possible either to leave open possible in each case. Naturally they need be put in the list of aspects with which they are satisfied or dis- only one of those places, though this book illustrates satisfied and then to post-code them or to include a their placement in each of the three. list in the questionnaire on the basis of a pilot survey. Some households rent out part of their dwelling. It is important to calculate the net costs (payments out Part E Planned Moves and Upgrades minus rent coming in) of the dwelling. For analysis of A household can easily change its consumption of crowding, it may be useful to get further information food either up or down by purchasing more or less about the number of rooms rented out and the num- food in a particular day or week. Changing a house- ber of persons who occupy them. It is important to hold's consumption of housing is more difficult and 309 STEPHEN MALPEZZI costly. The household must either move or upgrade location of the unit within a city or other geographi- the unit in which it already lives. cal unit. These variables can be coded from the unit's Since households move so infrequently and this address. For example, distances to the center of the city moving process is fundamental to understanding the should be coded for urban units. See Ingram (1984) state of the housing market, it is sometimes useful to and Mohan (1994) for examples of the use of loca- ask retrospective questions about the previous unit in tional data in housing market analysis. which the household lived or prospective questions if the household is planning to move. 
The usefulness of Appendix 12.1 What is the "Price" of Housing? these questions and the way they are worded will vary from place to place. In countries like Korea, house- There is a difference between the way in which econ- holds move on average every two years, whereas in omists use the term "price" and the way in which other countries such as Egypt households may move as housing analysts, real estate professionals, and other infrequently as every 15 years. Also, people in different noneconomists often use the term. cultures have different attitudes about prospective Economists generally define rent, the periodic questions. expenditure for housing, as the product of the price per unit of housing, P, and the real estate services Housing-Reloted Questions in Other Modules yielded by the unit, Q. Thus R = PQ. Rent and this The strong links between housing finance questions associated price per unit of service, P, are "flow" (per and the credit module, the transfers and other nonla- period) concepts. The physical real estate itself is bor income module, and the specialized water, sanita- durable, so Q is a "stock" concept. A stock (housing tion and fuel submodules contained in the environ- asset) yields a flow of services over time. ment module have already been noted. It has also been Of course, many readers will know that the flow noted that having accurate, reliable information on "rent" can be translated into the stock concept of income, household size and composition, and com- "value": V = R/i, where i is the capitalization rate. muting from other modules in the survey is important Housing value, the stock analog to rent, is also known for housing analysis. Information on housing costs can to economists as the asset price of the unit.When real be gathered in the community questionnaire. 
estate brokers and others use the term house price, or Questions about household composition in the ros- unit price, they are referring to this present value ter module should be drafted in such a way as to distin- measure or asset price, V, rather than the flow price guish temporary accommodation from permanent per unit of service, P, as described above. When econ- accommodation. When units are shared by more than omists use the term price, they are often referring to one household, this should be clearly indicated. In some P. However, even economists sometimes loosely refer countries it is important to indicate whether the land- to V as price, although careful economists will usually lord lives in the building or to clarify kinship relations use the term "asset price." In any event, the context between households in the unit. In addition, because should make the distinction clear. many developing countries have surprisingly high Note that if, by the assumptions of their model or vacancy rates, at least in some parts of the market (Mayo analysis, analysts standardize the quantity of real estate and others 1982; Struyk 1988), when this is of particu- services produced (say, in square feet of a given lar interest, it can be useful to devise questions that yield homogenous level of quality, including location), then the data necessary for analysts to study the extent and rent and flow price are basically synonymous. More incidence of vacancies and their deterriinants.The iden- precisely, rent and flow price P are proportional, since tification and control page (see Chapter 4 on metadata) by assumption Q is fixed. has a question showing why interviews could not take Housing economists use a number of different place in the selected dwelling. One of the response codes methods to construct indexes of the price of housing. is that the dwelling was vacant. 
It would be possible to The main types of methods: simple medians and aver- add follow-up questions there that would be asked of ages, Laspeyres, Paasche, Divisia, and related time series neighbors to gather some information on vacancies. indexes, hedonic price indexes, repeat sales indexes, For many purposes it is useful-and sometimes user cost models, and hybrid methods.These methods very important-to include information about the are described briefly in the following paragraphs; 310 CHAPTER 12 HOUSING Malpezzi and Green (1998) provides a more detailed Generally, these are time-series indices only. That is, if discussion. there is one housing consumer price index for, say, Monterey, and another housing consumer price index Simple Medians orAverages for Tijuana, it is possible to compare how fast prices The most commonly used measures of this type in the are rising in the two cities but not to discover which United States are median sales prices for existing city is more expensive. Also, the results may vary housing (which are published by the National depending on which "bundle" (typical housing unit) is Association of Realtors) and Census Median House chosen. Ideally, analysts would like to hold the bundle Prices (values for owner occupiers and rents for fixed, but as prices change over time, the typical bun- renters).The method is, in general, self-explanatory. A dle consumed changes in real life, even if not in the big advantage of this type of measure is its simplicity index. and the fact that it allows rough comparisons over time and across markets. The biggest disadvantage is Hedonic Indexes that this type of measure does not usually control for These are constructed by regressing rent or value differences in the quantity of housing services, Q, against characteristics of the unit and its location.Then across markets or over time. analysts use the coefficients to predict rent or value for A number of studies suggest that, while these sim- "standard" units. 
Most often these are done for one ple indexes are not adjusted for quality differences, quan- point in time, but they can be done over time as well. tity generally varies less than price in such a sample.Thus These methods have good theoretical and intuitive the studies conclude that these simple measures, while foundations and are discussed in detail in Malpezzi, imperfect, do include valuable price informrationi. Ozann,c, and Thibodeau (1980). However, they involve substantial data requirements and analytical work. The Laspeyres Price Index and Related Indexes Familiar examples include consumer price indexes and Repeat Sales Indexes implicit price deflators from national income These indexes are constructed after surveying units accounts, which are available in virtually all countries. that have been sold twice. Although they are con- These are generally constructed by taking a sample of structed using regression methods, intuitively these units in some base year and revisiting the units over indexes are roughly similar to annualizing and averag- time, appraising them, and computing any percentage ing the percentage growth in sales prices over time. changes.The familiar Laspeyres indexes are construct- These indexes are time-series only. They have the ed as: advantagc of being based on actual transaction prices, but most units are not sold in any given period, so It= (PtQ0/P0Q()l00 using repeat sales misses a lot of information. Also, units that sell are not necessarily representative of all where I is the index, P is the price per unit of housing units, and sometimes it can be hard to tell whether Q services, Q is the quantity of housing, and subscripts for a unit has changed (for example, due to remodel- denote time.Time 0 is the base year or period, and time ing). Repeat sales indexes are thoroughly discussed in t is any year, forward (positive t) or backward (negative Wang and Zorn (1997). t). 
Thus the index is the ratio of what is spent in time t to what is spent in time 0, holding what is purchased The User Cost Method constant to the "bundle" purchased in time 0. The idea behind this method is simple: it calculates Related indexes, including Paasche and chain what a "user" of the house really pays (or would pay) indexes, are discussed in Afriat (1977) and Diewert net of financing, taxes, maintenance, inflation, and so (1991). Indexes differ in how the bundle is fixed or on.These measures are generally time-series (for exam- varied. The U.S. Department of Commerce has ple, Hendershott and Shilling 1982) but can be done recently moved from Laspeyres to chain indexes (with for one point in time (for example, Follain 1982). The a constantly changing bundle) for most time series. user cost method incorporates a model of what actual- Laspeyres and related indexes have much to rec- ly determines prices, and it accounts for the effects of ommend them, but they do have some disadvantages. taxation, inflation, and maintenance on prices. 311 STEPHEN MALPEZZI Hybrid Indexes Bromley, Daniel W 1989. "Property Relations and Economic These indexes combine (usually) two of the above Development: The Other Land Reform:" il'rtd Development mnetlhods. Hybrid indexes cain be time-series, for one 17: 867-77. point in time, or both. For example, hedonic and Buckley, Robert M. 1996. Housing Finance in Developinig Countries. repeat sales methods can be combined (as in Case and NewYork: Oxford University Press. Quigley 1991) as can hedonic and user cost methods Burns, Leland S., and Leo Grebler. 1977. The Housing f Nations: (as in Follain 1982). Advice and Policy ini a Comparative Framework. London: Macmillan, Notes Case, Bradford, andJohn M. Quigley. 1991."The Dynamics of Real Estate Prices." Reviewv of Economics and Statistics 22 (1): 50-58. The author is indebted to Margaret Grosh, Paul Glewvve, and Fiona Chiquier, Loic, and Bertrand Renaud. 1992. 
"Alternative Mlortgage Mackintosh for comments on previous versions. Instruments in Distorted Housing Systems: Howy Useful is the 1. Specifically, the income clasticity of demand is the percentage Dual Rate Adjustable Mortgage?" World Bank, Industry and change in expenditure given a percentage change in income. See Mining Division, Energy, Mining, and Telecommnications Meier (1983) for elaboration. Department, Washington, D.C. 2. Many existing LSMS surveys have collected substantial hous- Daniere, Amrita. 1992. "Determinants of Tenure Choice in the ing information that has not yet been used in analysis. ThirdWorld."Journal of Housing Economics 2 (2): 159-84. 3. For example, very few empirical analyses have been done of Deaton,Angus, andJohn Muellbauer. 1980. Economics and Consumer U.S. rural housing markets despite the vast literature in that coun- Behavior NewYork: Cambridge University Press. try SeeVandell (1997) for a review and discussion. Dillinger, William 1991. Urban Property Tax Reform. Washington, 4. An hedonic index is a regression of rents (or house values) D.C.: World Bank, United Nations Centre for Human against the characteristics of the units. See Appendix 12.1 for a Settlements (Habitat), and United Nations Development nore detailed explanation. Programme. 5. In the case of the owner-occupied imputation, the question Diewert,W .E., ed. 1991. Price Level Aleasurement. NewYork: North mtust be asked in such a way that the respondent assumes that this Holland. wvould be such an arms-length transaction. Dovall, David. 1991. The Land .MIarket Assessment: A NTew Tool for 6. See Malpezzi, Ozanne, and Thibodeau 1980, pp. 78-9. Urban M1anagement. Washington, D.C.: World Bank, United 7. The lack of significance does not prove or demonstrate that spe- Nations Centre for Human Setdements (Habitat), and United cific things like schools or public safety do not matter but rather that, Nations Developimient Programme. 
once general neighborhood and unit satisfaction are taken into account, Farvaque, Catherine, and Patrick McAuslan. 1992. "Urban Land additional specific qulestions do not seem to add much information. Policies and Institutions in Developing Countries." Urban Management Program Paper UMP-5.World Bank and United References Nations Development Programme,Washington. D.C. Ferchiou, Ridha. 1982. "The Indirect Effects of NeMv Housing Afriat, S. 1977. The Price Index. New York: Cambridge University Construction in Developing Countries." Urban Studies 19: Press. 167-76. Angel, Shliomo, and others. 1986. "The Land and Housing Markets Fishback, PV 1992. "The Economics of Comripany Housinig." of Bangkok: Strategies for Public Sector Participation." Planmng Journal of Latv, Econonmics and Organization 8 (2): 346-65. and Development Cooperative, Thailand National Housing Follain, James R. 1982. "Does Inflation Affect Real Behavior? The Authority and the Asian Development Bank, Bangkok. Case of Housing." Southern EconomicJournal 48 (3): 570-82. Bertaud, Alain, and Bertrand Renaud. 1994. Cities Without Laud Follain, James R., and Stephen Malpezzi, 1981. "Are Occupants M11arkets: Lessons of the Failed Socialist E.xperiment. World Bank Accurate Appraisers?" Review of Public Data Use 9 (1): Discussion Paper 227.Washington, D.C. 47-55. Betancur, John J. 1987. "Spontaneous Settiement Housing in Latin Follain, James R., and Emmanuel Jimenez. 1985a. "The Demand America: A Critical Examination." Environment and Behavior 19: for Housing Characteristics in Developing Countries." Urban 286-310. Studies 22. Bramley, Glen. 1993. "The Impact of Land Use Planning and Tax . 1985b. "Estimating the Demand for Housing Subsidies on the Supply and Price of Housing in Britain." Characteristics: A Survey and Critique." Regional Science and Urban Studies 30 (1): 5-30. Urban Economics 15 (5): 421-32. 312 CHAPTER 12 HOUSING Follain, James R., Gill-Chin Lim. and Bertrand Renaud. 1980. . 1984. 
"Tenure Security and Urban Squatting." Review> of "The Demand for Housing in Developing Countries: The Econotmties anid Statistics 66 (4): 556-67. Case of Korea?"Journtal of Urban Econonmics 7 (3): 315-36. Johnson, Thomas E. Jr. 1987. "Upward Filtering of the Housing Freeman, A. Myrick. 1979. The Benefits of Environimental Stock." Habitat Interniatiotnal 11 (1): 172-94. Improvement: Theory anid Practice. Washington, D.C.: Resources Johnes, G., and T Hyclak. 1994. "House Prices, Migration and for the Future. Regional Labor Markets: Journal of Housing Economics 3 (4): Friedman,Joseph, EmmanuelJimenez, and Stephen K, Mayo. 1988. 312-29. "The Demand for Tenure Security in Developing Countries." Kiamba, Makau. 1989. "The Introduction and Evolution of Private Journial of Urbant Econonmics 29 (2): 185-98. Landed Property in Kenya." Development and Chlanige 20 (1): Gackenheimer, Ralph, and Carlos Henrique Jorge Brando. 1987. 121-48. "Infrastructure Standards." In Lloyd Rodwin, ed., Slrelter Kim, Jeong-Ho. 1991. "Housing Program Evaluation Using the Settlement and Development. Boston, Mass.: Allen and Unwin. Present Value Model: A Korean Experience." Paper presented Gilbert, Alan. 1989. Housing and Land in Urban Mexico. University of at a World Bank Seminar on Korean Housing Markets and California, Center for US-Mexican Studies, San Diego. Pohcy,June,Washington, D.C. Green, Richard K., and Stephen Malpezzi. 1998."A Primer on U.S. Kish, Leslie. 1965. Survey Samnpling. NewYork:Wiley Housing Markets and Pohciese' University of Wisconsin, Lee, Kyu Sik. 1992. "Spatial Policy and Infrastructure Constraints Center for Urban Land Economics ResearchWorking Paper, on Industrial Growth in Thailand." Revien' of Urbatn and Madison,Wis. Regional Developmtent Stidies 4. Gyourko, Joseph, and Jaehye Kim Han.1989. "Housing Wealth, Lee, Kyu Sik, and Alex Anas. 1992. 
"Costs of Deficient Housing Finance, and Tenure in Korea:' Regional Saence and Infrastructure: The Case of Nigerian Manufacturing." Urban Urban Economnics 19 (2): 211-34. Studies 29 (7): 1071-92. Hall, Robert E. 1978. "Stochastic Imphcations of the Life Cycle- Lim, Gill-ChinJames R. Follain, and Bertrand Renaud. 1980."The Permanent Income Hypothesis: Theory and Evidence."Jonirnzal Determinants of Homeownership in a Developing Economy." of Political Econiotmy 86 (Dccenber): 971-87. Urban Studies 17 (1): 13-23. Hendershott, Patric H., and James D. Shilling. 1982. "The Lowry, Ira S. 1960. "Filtering and Housing Standards." Lanid Economics of Tenure Choice, 1955-1979." In C. F Sirmans, Economics 35: 362-70. ed., Researchi on Real Estate. Greenwich, Conn.: JAI Press. Malpezzi, Stephen.1984. "Analyzing an Urban Housing Survey" Hughes, G. A., and B. McCormick. 1987. "Housing Markets, Infrastructure and Urban Development Department Unemployment, and Labor Market Flexibility in the U.K." Discussion Paper UDD-52.World Bank,Washington, D.C. Europeatn Economiic RevieRis 31 (3): 615-45. . 1994. "Getting the Incentives Right:A Reply to Robert- Ingram, Gregory K. 1984. "Housing Demand in the Developing Jan Baken and Jan Van Der Linden:' Tlhird l''orld Planinig Metropolis: Estimates from Bogota and Cab, Colombia." Revieu' 16 (4): 451-66. World Bank StaffWorking Paper 633,Washington, D.C. . 1996. "Notes on Consumer's Surplus." University of Ingram, Gregory K., and Alan Carroll. 1981."The Spatial Structure Wisconsin, Department of Real Estate and Urban Land of Latin American Cities." Journal of UTrban Economiics 9 (2): Economy, Madison,Wis. 257-73. . 1998. "Welfare Analysis of Rent Control With Side Jaffe, Austin J. 1993. "Property Rights and Market Behavior in Payments: A Natural Experiment in Cairo, Egypt." Regional Eastern European Housing Reforms." Paper presented at the Science and Urbaan Economics 28 (6): 773-75. American Real Estate and Urban Economics Association . 1999. 
"Economic Analysis of Housing Markets in meeting, January, Anaheim, Cal. Developing and Transition Economies." In Paul Chesire and Jaffe, Austin J., and Demetrios Louziotis Jr. 1996. "Property Rights Edwin S. Mills, eds., Hatndbookf Regional and U Trban Economics. and Econornic Efficiency: A Survey of Institutional Factors." NewYork: North Holland. Journal of Real Estate Literature 4 (2): 137-59. Malpezzi, Stephen, wvith Suzanne Loux. 1994. "Two Stylized Jimenez, Emmanuel. 1982a. "The Economics of Self Help Housing Questionnaires." Center for Urban Land Economics Housing: Theory and Some Evidence from a Developing Research Working Paper. University of Wisconsin, Madison, Country." Journal of U rban Econtomics 1 1 (2): 205-28. Wis. 1982b. "The Value of Squatter Dxvellings in Developing Malpezzi, Stephen, and Stephen K. Mavo. 1987a."Tbe Demand for Countries." Economic Developnient and Cultural Change 30 (4): Housing in Developing Countries:' Economic Development and 739-52. Cultural Change 35 (4): 687-721. 313 STEPHEN MALPEZZI - 1987b. "User Cost and Housing Tenure in Developing . 1984. "Housing and Financial Institutions in Developing Countries."Journal of Development Economics 25: 197-220. Countries." StaffWorking Paper 658.World Bank,Washington, - 1997. "Getting Housing Incentives Right:A Case Study of D.C. the Effects of Regulation, Taxes and Subsidies on Housing Sanyal, Biswapriya. 1981."Who Gets What,Where,Why and How: Supply in Malaysia." Land Economics 73 (3): 372-91. A Critical Look at Housing Subsidies in Zambia." Development Malpezzi, Stephen, Michael Bamberger, and Stephen K. Mayo. and Change 12: 409-40. 1982. "Planning an Urban Housing Survey." Infrastructure and Strassman, W Paul. 1982. The Transformation of Urban Housing: The Urban Development Department Discussion Paper UDD-42. Experience of Ujpgrading in Cartegena. Baltimore, Md.: Johns World Bank, Washington, D.C. Hopkins University Press. Malpezzi, Stephen, Larry Ozanne, and Thomas Thibodeau. 1980. 
. 1991. "Housing Market Interventions and Mobility: An Characteristic Prices of Housing in 59 SMSAs. Washington, D.C.: International Comparison." Urban Studies 28 (5): 759-71. The Urban Institute. Struyk, Raymond J. 1982. "Upgrading Existing Dwellings: An 1987. "Microeconomic Estimates of Housing Element in the Housing Strategies of Developing Countries:" Depreciation." Land Economics 63 (4): 373-85. Journal of DevelopingAreas 17: 69-76. Malpezzi, Stephen, Graham Tipple, and Kenneth Willis. 1990. Costs . 1988. "Understanding High Vacancy Rates in a and Benefits of Rent Control:A Case Study in Kumasi, Ghana. World Developing Country: Jordan." Journal of Developing Areas 22 Bank Discussion Paper 74.Washington, D.C.:World Bank. (3): 373-80. Mayo, Stephen K. 1986. "Sources of Inefficiency in Subsidized Struyk, Raymond, and Margery Turner. 1986. Housing Finance and Housing Programs: A Comparison of US. and German Quality in Two Developing Countries. Washington, D.C.: Urban Experience."Journal of Urban Economics 20: 229-49. Institute Press. Mayo, Stephen K., and James Stein. 1995. "Housing and Labor Tipple, A. Graham, and Kenneth G.Willis, eds. 1991a. Housing the Market Distortions in Poland: Linkages and Policy Poor in the Developing World: A'fethods ofAnalysis, Case Studies and Implications."Journal of Housing Economics 4: 153-82. Policy London: Routledge. Mayo, Stephen K., and others. 1982. Informal Housing in Egypt. . 1991b. "Tenure Choice in Kumasi, Ghana." Third World Cambridge, Mass.: Abt Associates. Planning Review 13 (1): 27-45. Meier, Gerald M., ed. 1993. Pricing Policy for Development Tolley, George S. 1991. Urban Housing Reform in China. Washington, MlIanagement. Baltimore, Md.: Johns Hopkins University Press. D.C.: World Bank. Mohan, Rakesh, 1994. Understanding the Developing Country Vandell, Kerry D. 1997. "Improving Secondary Markets in Rural .iletropolis: Lessons from the City Study of Bogota and Cali, America." 
Federal Reserve Bank of Kansas City, Kansas Colombia, NewYork: Oxford University Press. City Olsen, Edgar 0. 1987. "The Demand and Supply of Housing Wang, Ferdinand T.. and Peter M. Zorn. 1997. "Estimating House Services: A Critical Reviewv of the Empirical Literature." In Price Growth with Repeat Sales Data: What's the Aim of the E.S. Mills, ed., Handbook of Regional and Urban Economics. Game?"Journal of Housing Economics 6: 93-118. Amsterdam: Elsevier. World Bank. 1989. "Malaysia: The Housing Sector; Getting the Ozanne, Larry; and Raymond J. Struyk. 1978. "The Price Elasticity Incentives Right." World Bank Sector Report 7292-MA, of Supply of Housing Services." In L. S. Bourne and J. R. Washington, D.C. Hitchcocks, eds., Urban Housing MVarkets: Recent Directions in . 1993a. "Housing: Enabling Markets to Work:" World Bank Research and Policy Toronto: University ofToronto Press. Policy Paper,Washington, D.C. Pejovich, Svetozar. 1990. T'Ue Economics of Property Rights: Towards a . 1993b. "Russia Housing Reform and Privatization: Theory of Comparative Systems. Boston, Mass.: Kluxver. Strategy and Transition Issues." Report 11868-RU, Rakodi, Carole. 1987. "Upgrading in Chawama, Lusaka: Washington, D.C. Displacement or Differentiation?" Urban Studies 25 (4): 791-811. Yu, Fu-Lai, and Si-Ming Li. 1985. "The Welfare Cost of Hong Renaud, Bertrand. 1980. "Resource Allocation to Housing Kong's Public Housing Program." Urban Studies 2: 133-140. Investment: Comments and Further Results." Economic Zais, James, and Thomas Thibodeau. 1983. The Elderly and lsrban Development and Cultural Change 28 (2): 189-99. Housing. Washington, D.C.: The Urban Institute. 
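As a worked illustration of the Laspeyres formula given in Appendix 12.1 of the housing chapter, the following is a minimal sketch rather than a procedure taken from the text. It assumes the "bundle" is described by several housing components, each with a base-period quantity, so that the single-good expression becomes a sum over components. The function name `laspeyres_index` and all of the sample figures are hypothetical.

```python
def laspeyres_index(base_prices, current_prices, base_quantities):
    """Return 100 * sum(P_t * Q_0) / sum(P_0 * Q_0).

    Quantities are held fixed at their base-period values, so the index
    compares the cost of the same bundle at period-t and base prices.
    """
    if not (len(base_prices) == len(current_prices) == len(base_quantities)):
        raise ValueError("all three sequences must have the same length")

    # Cost of the base-period bundle at base-period prices: sum of P_0 * Q_0.
    base_cost = sum(p0 * q0 for p0, q0 in zip(base_prices, base_quantities))
    # Cost of the same bundle at period-t prices: sum of P_t * Q_0.
    current_cost = sum(pt * q0 for pt, q0 in zip(current_prices, base_quantities))

    if base_cost <= 0:
        raise ValueError("base-period expenditure must be positive")
    return 100.0 * current_cost / base_cost


if __name__ == "__main__":
    # Hypothetical bundle: floor space (100 units) and utilities (10 units).
    p0 = [2.0, 5.0]   # base-period prices
    pt = [2.5, 5.0]   # period-t prices; only the first price has risen
    q0 = [100, 10]    # base-period quantities
    print(laspeyres_index(p0, pt, q0))  # 120.0
```

Because the quantities never change, the result says how fast the cost of one fixed bundle has risen in one market. As the appendix notes, it does not show which of two markets is more expensive.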
13 Community and Price Data

Elizabeth Frankenberg

Multitopic household surveys like the LSMS are designed to gather data to be used for analyzing household welfare, including analysis of access to and use of social services, the effects of government policies on living conditions, and how households behave in response to changes in the economic environment or in government programs. To meet these objectives, multitopic household surveys often collect data not only at the household level but also at the community level and even, in some cases, at the level of facilities (such as health clinics or schools).

Collecting community or facility-level data is desirable for two reasons. First, the government programs and services that affect individuals are often implemented and provided at the community level. Thus household surveys that collect information at both the household and the community level yield more policy-relevant data than those that only collect household data. Second, it is efficient to collect information about the shared environment in which households operate from community leaders or from community members who are particularly knowledgeable about key subjects rather than from each household individually.

In LSMS surveys, these additional data are collected in a community questionnaire that is administered separately from the household questionnaire. The informants (called "informants" to distinguish them from the "respondents" to the household questionnaire) selected for these community questionnaires vary depending on the specific objectives of the survey, but they can include community members, market traders, and staff at relevant facilities and institutions, such as nurses and teachers.

The community-level information collected in LSMS surveys typically includes information on infrastructure, employment opportunities in agriculture, and the availability of credit, schools, and health facilities. These data can be used in conjunction with data from the household questionnaire to analyze access to services in terms of the average distance a child has to travel to attend primary school or the proportion of households living within 20 kilometers of a hospital. The data can also be used to evaluate government programs. The price section of the questionnaire is designed to establish the local costs of food and nonfood items. One way in which these data can be used is to devise spatial price indices in order to accurately measure regional patterns of poverty (Ravallion and Bidani 1992).

Although community-level data enhance the usefulness of data collected at the household level, social scientists have only limited knowledge of how to collect community-level data. Too often in the past, the designers of multitopic surveys in both the developing and developed world have hastily assembled and administered a community and price questionnaire after the household survey was finalized, giving little attention to the wording of the questions, to defining the entities to which questions refer, or to the selection of the informants to whom the questions are to be addressed. Frequently, no serious effort has been made to pilot and pretest the community survey. The weaknesses of the resulting community-level data have seriously limited the extent to which these survey data sets can be used to analyze the effects of government policies on individual welfare.

This chapter argues that the household and community aspects of a multitopic household survey should be regarded as two components of the same data-collection effort and should be integrated from the earliest planning stages of the survey. The policy questions that the household questionnaires in each survey are designed to answer are equally relevant to the design of the community questionnaire. The sample for the household survey influences the design of the sample for the community survey. Also, the field procedures for the two questionnaires can and should be coordinated. In some cases alternative existing sources of community-level information, such as census records or other survey data, may be available to supplement or substitute for the collection of new community data. In other cases it may be necessary to design a special facility questionnaire to be administered in addition to the community questionnaire.

This chapter discusses a number of conceptual and technical issues associated with the collection of community data in the context of a multitopic household survey such as the LSMS surveys. The first section of the chapter describes the policy issues that have been analyzed using a combination of household and community-level data. The second section discusses how to define a "community." The third section discusses how to assemble community-level data both from existing sources and from community informant interviews or facility visits. The fourth section introduces draft versions of prototype community and price questionnaires for the designers of future surveys to consider. (The prototype questionnaires are presented in Volume 3.) The fifth section explains why certain choices were made in the design of these prototype questionnaires.

Policy Issues Analyzed Using LSMS Community-Level Data

Government programs often aim to provide services within a specific geographical area on the assumption that the services benefit individuals and families living in that area. Because the community is the level to which governments often target direct interventions, it is sensible to collect data on how government programs work by measuring the extent to which the services they provide actually exist at the community level.

This section discusses the various policy issues that can be addressed with both community-level and household-level data. The availability of data at both levels affects the extent to which these analyses are feasible. The extent to which household data can be used for policy analysis depends on three factors: the outcomes and behaviors about which data are available; whether the data are cross-sectional or measure changes over time; and whether individuals can be linked to the institutions that they use. The factors that affect the extent to which community data can be used for policy analysis include the availability of data on community characteristics over time and the degree of detail of the data available on the facilities and institutions that provide services to household members.

Over half of the LSMS surveys conducted before 1997 included community and price questionnaires. Topics covered by these questionnaires typically included demographics, the economy and infrastructure, education, health, agriculture, and the prices of food and nonfood goods (Glewwe and Grosh 1998). A smaller number of surveys have collected facility-level data as well. The LSMS surveys for Ghana, Côte d'Ivoire, Pakistan, Tanzania, and Jamaica gathered particularly rich community and facility data. Several of these data sets have been used extensively to analyze how government programs have affected human resources and outcomes.

Other multitopic household surveys have also included community and facility surveys. For example, the RAND Indonesian Family Life Surveys collected data from two groups of community informants, 12 health facilities, and eight schools for the 321 communities in which household respondents were located (Frankenberg and Karoly 1995). The second Indonesian Family Life Survey revisited many of the same health providers and schools as the first, providing panel data at the facility level. Also, Demographic and Health Surveys (collected by Macro International), which have been administered in many countries, have usually included a short questionnaire about service availability.

Most existing LSMS data sets contain information on the availability of sanitation facilities, power, water supply, and public works such as road and transport networks and (in some cases) irrigation systems. The LSMS data from Vietnam have been analyzed with the aim of discovering the extent to which basic types of infrastructure are available to different population subgroups (van de Walle 1995). This analysis has revealed substantial differences between the rural populations in the north and south, particularly in terms of access to a post office (far more available in the south) and electricity (far more available in the north). However, there are only slight differences between the populations in the two regions in terms of access to piped water. Fewer than 10 percent of residents of rural areas have access to piped water, regardless of their region or their poverty status, although the poor generally have less access than the nonpoor.

The LSMS community-level data sets contain information about whether or not various types of health facilities and schools exist in the community as well as information about how far people have to travel to the nearest facility or school (and the costs they must incur to do so) if these facilities are not located within the community itself. By combining data on access to schools or facilities with household-level data on economic welfare and other basic socioeconomic characteristics, it is possible to produce descriptive statistics on households' access to health care and to educational opportunities. Such statistics have been produced using data from the LSMS survey in Ghana in 1988. The statistics show that households were an average of 0.4 kilometers from a primary school but were almost 15 kilometers from a secondary school (Lavy and others 1996).

Descriptive statistics can also be produced by region, by rural versus urban location, by economic strata, or by level of education. For Ghana these statistics show that in 1987, urban residents lived an average of 0.6 kilometers from a health facility, whereas rural residents lived almost 5 kilometers away (Lavy and others 1996). Descriptive analysis of LSMS data for Vietnam shows that a high proportion of the population had access to a lower secondary school in the early 1990s and that about 80 percent of the rural population lived in communities that contained a lower secondary school (van de Walle 1995). These statistics can illustrate the extent to which governments have or have not succeeded in extending services to the population as a whole and to different groups within the population.

LSMS household questionnaires collect data on use of services, including curative and preventive health care and contraceptive services, as well as on school enrollment. These data can be combined with community-level data on access to services to illuminate the relationship between access to services and use of services. For example, in Ghana the probability that a child age 5-12 has ever attended primary school increases by 30 percent for every one kilometer decrease in the distance he or she has to travel to reach a middle school. This finding suggests that access to higher levels of education is an important factor when parents are deciding whether or not to enroll their children in the preceding level of schooling (Lavy 1996). A similar phenomenon was observed in Vietnam (Glewwe and Jacoby 1998).

Data from the Indonesia Family Life Surveys suggest that the availability and quality of private health service providers within a community affect women's knowledge of public facilities in that community, and vice versa (Frankenberg and Beegle 1998). Data on access to facilities can be combined with data on individual characteristics such as age, sex, education, and income level to evaluate whether the relationships between access and use differ among individuals. These relationships may be of particular interest if programs are intended to benefit certain groups.

In countries where Demographic and Health Surveys have been carried out, the data sets from those surveys include data on access to and use of contraceptives. Analyses of Demographic and Health Survey data from both Tanzania and Nigeria suggest that increasing the availability of the birth-control pill in pharmacies is associated with an increased use of contraceptives (Beegle 1995; Feyisetan and Ainsworth 1994).

Many LSMS household surveys have collected measures of human resource outcomes as well as measures of the use of services. Education-related outcomes include grade attainment and, in rare cases (such as the Ghana survey), achievement scores. Health-related outcomes include self-reported morbidity and anthropometric measures. Community-level data on access can be combined with outcome measures to explore the relationship between access to services and outcomes. For example, in an analysis of the Ghana data, Lavy (1996) found that an increase in the distance that a child has to travel to a middle school is associated with fewer years of schooling attained.

It is also possible to use the data on labor in conjunction with community-level information on infrastructure to explore how activities in the labor force vary depending on how much and what kind of infrastructure is available to the community. Vijverberg (1995) found that in rural Vietnamese communities, important determinants of the decision to start up a nonfarm enterprise include the availability of electricity and piped water at the community level, the availability of a market that is frequently open, and the presence of a secondary school. When agricultural extension services are available in a community, this tends to encourage farming and to discourage the initiation of nonfarm enterprises.

Most LSMS community questionnaires include questions about when facilities such as schools and health clinics opened. When data are available from the community questionnaire on changes in access and retrospective data are available from the household questionnaire on specific household behavior and outcomes, it is theoretically possible to relate intracommunity or intrafamily changes in behavior or outcomes over time to changes in access to services.

For example, data on pregnancy, the use of prenatal care, and the site where the birth took place could be combined with data on the year in which a government clinic opened in the community to produce statistics on the association between the availability of the services provided by the clinic and the use of those services. If the data on pregnancy also contained data on the baby's birth weight, the analysis could additionally consider the association between the expansion of services and this important infant health outcome. The same kind of statistics could be calculated using retrospective data on contraceptive use. This author is not aware of LSMS data having been used in this way. Some of the disadvantages to using LSMS data this way include the fact that community informants may not reliably be able to remember when facilities opened. Also, analyses using this approach are not likely to be terribly informative unless access to facilities has changed in a relatively short time, because retrospective data on outcomes of interest are usually available for only about five years prior to the survey date.

Nevertheless, two analyses of Demographic and Health Survey data from Indonesia have used a version of this approach. Frankenberg (1995) used data from the Demographic and Health Surveys and the Central Bureau of Statistics to analyze intracommunity changes in the risk of infant mortality associated with increased access to public and private facilities. This study found that there was a significant decrease in the risk of infant mortality as access to private midwifery services increased. Gertler and Molyneaux (1994) undertook a similar exercise using administrative data from the Indonesian National Family Planning Agency with respect to access to family planning services, and found that changes in family planning program inputs were responsible for 4-8 percent of the decline in fertility that occurred between 1982 and 1987.

This chapter has so far focused on combining household data on outcomes or on use of services with community informant data on access to services. However, another important factor is the quality of the services provided, and data at this level of detail can usually only be gathered effectively by designing and fielding a facility survey. For example, in the Ghana, Jamaica, and Côte d'Ivoire LSMS surveys, interviewers visited schools, health facilities, or both and collected data on the quality of services provided.

If analysts have information at this level of detail, they can analyze particular aspects of a government program. For example, Thomas, Lavy, and Strauss (1996) analyzed the Côte d'Ivoire LSMS data to find out how various dimensions of the quality of the services provided by a health facility (such as staffing patterns and drug availability) affected the height and weight for height of children. They found that as the number of doctors at facilities increased, children's heights (standardized for age) rose as well. The availability of drugs for treating common ailments was also associated with improvements in children's heights and weights for heights. By combining facility characteristics with household characteristics, the authors showed that access to immunization services significantly increases the height of poor children but has no significant effect on nonpoor children.

In some LSMS surveys, the collection of household and community-level data was organized so that data on individuals could be explicitly linked to the data on the facilities that the individuals knew about and used. This allowed analysts to examine how individual outcomes are affected by the quality of the services provided by the facility and how the quality of those services affects an individual's willingness to travel to or pay for those services. In the context of education, Glewwe and Jacoby (1994) used data from the Ghana LSMS to show that investing in repairing classrooms is a more effective way to improve student achievement than is providing additional instructional materials or improving teacher quality.

Glewwe and others (1995) used the Jamaica LSMS survey to consider the effects of a range of physical, organizational, and pedagogical characteristics on student achievement. Overall, they found that variables that measured pedagogical processes (for example, the amount of time that students spend doing written assignments and being tested) were more important predictors of achievement than variables that measured input levels or school organization.

When panel data exist for both households and communities or facilities, it is possible to conduct analyses that relate changes in facility characteristics to changes in the behaviors and outcomes of households and individuals. These sorts of analyses have been conducted with the Indonesia Family Life Surveys. For example, between 1997 and 1998 there was a dramatic decline in the proportion of children under age 3 who had received Vitamin A in the previous six months, from 55 percent to 43 percent. Analysis of facility data from the communities of the survey respondents documented a concomitant decline in the proportion of both public and private facilities offering Vitamin A (Frankenberg and others 1998).

To date the LSMS surveys have not provided panel data on both households and facilities. The Indonesian Family Life Surveys demonstrate that with careful planning and design it is possible to do so. Perhaps in the future the LSMS and other multitopic surveys will be implemented in a manner that yields panel data at multiple levels.

This discussion has focused predominantly on analyses that combine measures of human resources from the household questionnaire with measures of access to or quality of facilities from the community questionnaire. The community questionnaire usually administered in LSMS surveys contains a number of other measures of community characteristics, such as food prices, the availability of agricultural extension services, wage rates, and transportation and sanitation infrastructure.

Analyses of these characteristics can also be relevant to policy. For example, Thomas, Lavy, and Strauss (1996) used data from the Côte d'Ivoire LSMS survey to analyze the effects of variations in food prices on children's nutritional status. They found that in rural areas, weight for height was negatively affected by high prices for fresh fish, eggs, palm oil, and manioc, while in urban areas it was the prices of rice, sugar, and plantains that were associated with reductions in weight for height. The study concluded that the increases in food prices that typically accompany stabilization programs are likely to cause the nutritional status of older children to deteriorate.

In a similar analysis of the Ghanaian LSMS data, Lavy and others (1996) found that children's height for age and weight for height were significantly associated with the quality of water and sanitation facilities in rural communities. An analysis of infrastructure in Vietnam (van de Walle 1995) found that although the poor had less access to most types of infrastructure than did the nonpoor, infrastructure was woefully inadequate to meet the needs of either group, and any increase in infrastructure was not likely to be redistributive.

One important way in which price data can be used in combination with household-level expenditure data is to make welfare comparisons across regions. In most countries price disparities among regions are considerable. Without community data on prices, it would be impossible to compare expenditure levels among areas with different prices.

The Indonesia Family Life Surveys illustrate this point. One round of the surveys was conducted in 1997. An additional round was conducted a year later, in 1998. In the intervening 12 months, the Indonesian rupiah collapsed and prices changed dramatically. Welfare comparisons based on expenditure levels would have been meaningless without adjustments for price changes. Detailed monthly data on price changes are available from the Central Bureau of Statistics, but only for urban areas. Price data from the 1997 and 1998 community surveys were used to provide evidence that prices had risen more quickly in rural areas, and that conclusions about changes in welfare levels by sector of residence depend critically on assumptions about sector-specific inflation rates (Frankenberg, Thomas, and Beegle 1999).

Many of the papers discussed in this section have used sophisticated statistical techniques to analyze LSMS household-level and community-level data. Several problems frequently arise in policy-oriented analyses of community and facility characteristics for which it is difficult, though often possible, to adjust statistically. First, there is the problem of omitted variables. Communities with access to infrastructure, health facilities, and schools may well have a number of other attributes, such as well-maintained transportation infrastructure, that contribute to positive human resource outcomes. If the analysis does not control for these other attributes, the effects of access to facilities will be overstated.

A second problem, particularly with analyses of facility characteristics, is that measures often tend to be highly collinear. For example, the facilities that operate during inconvenient hours are often also understaffed and understocked. Third, missing data at the individual, household, community, or facility level often mean that analysts can only use a significantly smaller and probably nonrandom subset of observations. A fourth problem is the potential endogeneity between community characteristics and individual or household-level behavior and outcomes. Governments may intentionally locate programs or resources in areas where residents have certain characteristics, or households with certain characteristics may move to areas precisely because certain programs or resources are available there (Rosenzweig and Wolpin 1986, 1988). In both of these situations a straightforward regression of individual outcomes or behavior on access to programs will generate biased estimates of program impact.

Defining a "Community"

One of the two most fundamental decisions in designing a community questionnaire to accompany a household survey involves identifying a basic geographical area that defines "community." Defining the term "community" is difficult because even within countries, communities are extremely heterogeneous. A definition appropriate across the range of contexts covered by LSMS-type surveys will be too vague to be informative. What can be said is that at a conceptual level, the term "community" in the context of the surveys discussed in this book refers to a spatial unit that contains the households included in the survey sample, that has characteristics common to its residents, and that is of social, economic, or physical significance to its residents.

In LSMS surveys the definition of the community is inherently tied to the design of the household survey. Households are grouped into sampling units (usually census enumeration districts), which are referred to as clusters. The cluster is the geographical unit in which the survey households are located. Generally, for each cluster, a "community" unit is defined that contains the households located in the survey cluster. The community data collected will be tied to that unit, and there will be a minimum of one community-level observation per unit.1

What geographical unit is the most appropriate definition of a "community" in LSMS terms? One choice is to use the cluster boundaries to circumscribe an area that would serve as the "community" for the purposes of administering a community questionnaire. In most survey contexts, clusters are small, contiguous units that do not overlap. However, the cluster boundaries are typically defined by the central statistical agency of the country and are often too small or too arbitrarily determined to be socially, economically, or physically significant to the people living in those areas.2 This lack of significance is the main disadvantage to using cluster boundaries to define the community.

A second possibility is to define the community solely in terms of its size, for example by the area contained by a radius extending five kilometers from the center of the cluster. The apparent simplicity of this method and its uniformity across clusters are the main advantages of this approach. The biggest disadvantage is that the selected radius is bound to be arbitrary and may not correspond to the informants' notion of the community to which they feel they belong. In addition, this definition is difficult to put into practice if there is no clear center to the cluster. If the community is organized along a road or stream or if it borders a body of water, a circle is not an appropriate shape. Moreover, if informants do not have a good sense of distance and direction, they may not give clear and accurate answers to questions about the area.

The third possibility is to define the "community" in terms of the administrative unit or units of government under the jurisdiction of which the survey households fall. In most countries this can consist of several different levels of government.3 In previous LSMS surveys, clusters have tended to be small enough to be contained within a low level of government. Therefore, the community questionnaires in most of these previous surveys have defined the "community" as the most appropriate low-level administrative unit, for the following reasons:

* An administrative unit is a well-defined geographical area to which informants can easily relate.
* Measuring access to government programs is often a primary goal of the community questionnaire in LSMS-type surveys, and the benefits of government programs are often allocated by administrative unit.
* Administrative units are usually run by people whose responsibilities imply that they are knowledgeable about many of the topics dealt with in a community questionnaire.
* Some data broken down by administrative unit may already exist at the central government level.

The most sophisticated approach would be to define the "community" in accordance with both the particular phenomena of interest to the survey designers and the characteristics of the household survey cluster. It is important to bear in mind that it may not be immediately apparent which definition is most appropriate in a given situation. Also, one definition of community will not always be equally applicable to all clusters or to all of the characteristics of interest. Therefore, it is not essential to adhere to one definition of community in all parts of the community questionnaire or in all contexts. For example, survey designers might choose to define the "community" using village boundaries in rural areas and postal codes in urban areas. Some variables, such as access to piped water systems, may best be captured at a very local level, while other variables, such as distance to the nearest hospital, may be better suited to a wider definition of community.

The best way to measure the effects of community-level characteristics on the outcomes and behaviors of survey respondents varies depending on the mix of characteristics exhibited by the community. This variation has consequences for the way in which the questions about each characteristic are worded in the questionnaire. For example, there are more pharmacies than hospitals in most developing countries, and pharmacies typically serve clients from a far smaller catchment area than do hospitals. Consequently, when information on the availability of pharmacies is being sought, the most appropriate question to ask community informants may be whether one or more pharmacies is located within five or ten kilometers from the community center. In the case of hospitals, it may be more relevant to ask informants how far they live from the nearest hospital.4

Community-level characteristics were measured in various different ways in previous LSMS surveys depending on whether the cluster studied was located in an urban or a rural area. In several surveys, almost no community-level data were collected in urban areas,5 which seriously limited the extent to which those data could be used for policy analysis.

One argument against collecting community data in urban areas is that urban residents have access to such a wide range of resources that the location of a household is an insignificant determinant of the opportunities and constraints faced by its members. Therefore, it is argued, collecting information on the surrounding area is not very informative. This view is short-sighted. There is no conclusive evidence to suggest that urban residents are completely unaffected by measurable aspects of their environment. Also, this argument ignores the fact that, in reality, the urban/rural divide is a continuum rather than a clear-cut distinction. Thus, rather than deciding in advance that community characteristics are irrelevant in urban areas, survey designers should structure the community questionnaire in such a way as to avoid having respondents answer sections that have little relevance for the community in which they live.

A second reason why urban areas may have been excluded from community modules in previous LSMS surveys is that it is particularly difficult to define what is meant by a "community" in an urban context. This is undoubtedly true, although the structure of local government may be a good guide in some countries while elementary school catchment areas or postal zones (with accompanying maps) might be appropriate in others.

There may be some situations in which the decision not to collect community-level data in urban areas is sound, but this decision should be made on a survey-by-survey basis and only after careful consideration. When making this decision, survey designers should bear in mind the policy issues that the survey is designed to illuminate, the administrative structure of urban areas, and the availability of other sources of information about urban areas.

The definition of a "community" can also be unclear in areas where the population is extremely sparse and where the survey households are spread out rather than "clustered." If there is no obvious "community center" that is a focal point for community life (perhaps a place where services are provided or goods can be purchased), each household is effectively its own community, which may mean that there is no point in collecting community data. This scenario may seem extreme, but in practice there have been some cases in which settlements were so scattered that there was no point in including questions about distances in the community questionnaire.

Assembling Community-Level Data

Having established the definition of the communities to be studied, survey designers should then investigate the various possible ways of gathering the pertinent community-level information. Broadly, there are three ways to assemble community data:

* Using any existing (or "secondary") data from administrative archives or previous studies.
* Conducting community informant interviews.
* Visiting facilities, service points, or markets.

"Secondary" data have already been collected and assembled, whereas the other two categories require the survey team to gather new information. None of these data sources alone is likely to provide all the community-level information needed, but, when combined, they can paint a detailed picture of the community. The designers of a multitopic household survey should thoroughly explore the feasibility of each of these three methods of assembling community data at the earliest stage of planning.

It is also possible to use data gathered in the household questionnaire of the survey to generate some community-level characteristics. For example, by calculating the average wage of the respondents to the household questionnaire, it is possible to construct a measure of community-level wage rates. Another example would be to construct a measure of the community's access to piped water by calculating the proportion of survey households that have piped water from responses to the household questionnaire.

The usefulness of these community-level aggregated measures depends on the size of the geographical area to be characterized, the number of respondents per community, and the degree of heterogeneity across potential sampling units within that area.

[...] obtained from aggregating household responses are likely to be biased. If the aim is to characterize an area larger than the sampling unit, aggregating data from the household questionnaire will not produce an unbiased estimate of the phenomenon if there is substantial cross-cluster heterogeneity.

This issue can be made more concrete with two examples. Suppose the phenomenon of interest is the attractiveness of an area to migrants. Wage rates are one potential measure of attractiveness. If the survey households are poor and are located in a relatively poor neighborhood, the low community-level wage rate obtained from aggregating the household responses will be an inaccurate measure of the wage rates and thus the attractiveness of the surrounding area. On the other hand, suppose the phenomenon of interest is the degree of exposure to household waste in the immediate vicinity of the survey households. In this case, aggregating from the household data may provide an accurate measure of exposure within the small area the analyst wishes to characterize.

Sources of Secondary Data

Because serious efforts have been made in many developing countries to collect or assemble data on a wide range of topics, there is considerable potential for supplementing LSMS surveys with secondary data.

Survey designers may find that secondary data exist on, for example, weather patterns, food prices, housing characteristics, wage rates and occupations, the quality of schools and health facilities, and the distribution of family planning methods. In Vietnam, for example, it was not necessary to collect price data from urban areas in the LSMS survey because the statistical agency was able to provide these data. It is well worth exploring whether such data exist and evaluating whether an existing source can meet the survey's community-level data needs. When secondary data are the sole source of community-level data, the choice of the geographical unit to which those data apply will depend primarily on what data are available.
If the The survey team should take the following basic aim is to characterize only the area encompassed by steps: identifying relevant secondary data and assessing the sampling unit from which survey households have their quality; obtaining permission to use the data been selected, aggregating information from a set of from the organization that has assembled them; and randomly selected households within that area will ascertaining whether it is possible to merge the sec- produce an unbiased estimate of the phenomenon of ondary data with the data from the household ques- interest. However, if the number of respondents per tionnaire of the LSMS-type survey based on a geo- community is small, any community-level estimates graphical link between the survey cluster and the unit 322 CHAPTER 13 COMMUNITY AND PRICE DATA to which the secondary data refer. If data are available include information on, for example, the bed capacity at the level of individual institutions (for example, for of health facilities. This kind of issue poses more of a each school), survey designers need to decide which problem for policy researchers than for academics, schools are relevant to individuals living in the survey's since academic researchers have more scope for shap- sample households. ing their questions to fit the available data. A related concern is whether not only the key IDENTIFYING RELEVANT SECONDARY DATA. There are program variables but also "control" variables, such as several questions that survey designers should ask basic measures of infrastructure and socioeconomic themselves to determine whether the available sec- development, are available in the data. Because there ondary data will be a useful supplement to an LSMS are often many correlations among the various devel- household survey. 
The first question is whether the opment indicators, failing to control for other types of data cover the entire country or only selected geo- infrastructure in the community may result in mis- graphical areas. If the data are limited geographically, leading results because of omitted variable bias. (See then the survey designers must assess how close the Chapter 26 on econometrics.) LSMS clusters are to the sites covered by the second- Another key issue concerning secondary data is ary data set. In some cases, the secondary data may be whether they contain too much measurement error to so valuable that it is worth trying to selcct the LSMS be worth the difficulties involved in using them. clusters so that they can be linked to the secondary Measurement error, which can be random or system- data. For example, in a survey focusing on agriculture, atic, may arise in a number of ways. First, the data may the value placed on having weather data may be so have been collected at too high a level of aggregation high that it may make sense to select clusters for the to capture the aspects of the environment that affect LSMS survey that are located near a weather station. individuals' behavior and outcomes. For example, it Another question to bear in mind is the age of the may be possible to merge administrative data on available data; some-such as diennial censuses-may regional governments' expenditures oIn family plan- be so old that they are of little use.Additionally, survey ning into a household survey data set, but those designers should ascertain whether the data are gath- expenditures may be so weakly correlated with the ered regularly (as in a monthly price survey) so that a aspects of family planning services that affect an indi- time series of data is available. vidual's contraceptive use that they have no explanato- Even when the geographical coverage and collec- ry power in a model predicting contraceptive use. 
tion date suggest that the secondary data will be use- Second, secondary data may not reflect the condi- ful, the quality of the informationi should be investi- tions that actually prevail in a given community or gated further. Many secondary data sets sound better facility. For example, using data collected through vis- than they turn out to be in practice. For example, the its to facilities, Thomas, Lavy, and Strauss (1996) fact that every health center is required to turn in a showed that the estimated impact of health service monthly report of activities is no guarantee that each infrastructure on child anthropometry is much small- center actually does so. There may be weather stations er when the infrastructure is measured in terms of throughout the country, but their equipment may not what is supposed to be available at a health facility function. If the data are incomplete, the designers rather than in terms of what is actually available. should assess the importance of the missing data and Quality issues are particularly troubling when the dif- whether the data are recoverable. ferences between reality and what is reported are sys- Secondary data often do not contain sufficient tematic (nonrandom measurement error) because information from which to construct variables that administrators have an incentive to report conditions enable analysis of the relevant policy questions in a as being particularly good or bad.6 country. For example, if a high priority in the health Finally, survey designers should consider whether sector is training personnel and assigning them to the available secondary data cover only public sector remote areas, it may be useful to evaluate whether the facilities and not the facilities run by the private sec- use of facilities has increased or health status has tor, which is common. The importance of such an improved in those areas since the number of staff omission depends on the size and geographical distri- increased. 
However, the secondary data might only bution of the private sector facilities and on whether 323 ELIZABETH FRANKENBERG the services they provide compete with public facili- who are knowledgeable about the environment that is ties for clients. For example, in some countries private common to all the households located in the commu- elementary schools may be so rare and prohibitively nity. This method is easy to implement and has the expensive that omitting them from an analysis of the additional advantage of being cheap. enrollment decisions of low-income families poses few problems. On the other hand, private health care SINGLE-INFORMANT INTERVIEWS. In its simplest mani- facilities may be an important source of health care for festation, this approach involves the survey team con- the populations of many developing countries. ducting an interview of no more than a couple of hours with the community informant after adminis- GETTING PERMISSION TO USE SECONDARY DATA. tering the household questionnaire to the sample Sometimes acquiring secondary data can be time-con- households in the area. In many surveys the commu- suming and expensive. Agencies may be hesitant to nity informant is likely to be the community leader release data. However, almost all previous LSMS sur- with whom the survey team would have to meet any- veys have been implemented by the national statistical way, as a courtesy or to obtain his or her permission to agencies of the countries studied, which has usualvy conduct the household interviews. made it easy for the survey team to at least obtain When the survey designers decide to use inform- information on other sources of official data. ant interviews to collect community data, the com- munity should be defined according to the type of MATCHING THE SECONDARY DATA TO HOUSEHOLD information needed and the feasibility of identifying SURVEY DATA. 
Combining secondary data with and interviewing informants who are well informed household data is theoretically straightforward, but in about that community. Both the geographical unit to practice several problems can arise along the way. A which the community data pertain and the sources of different geographical coding scheme may have been data should be chosen so as to match the priority top- used in the secondary data set than the one used in the ics of the overall survey. A survey with a special LSMS-type survey. Incompatible coding schemes are emphasis on infant and child health will require a dif- more likely to be a problem if the supplementary data ferent set of community-level data than a survey that are obtained from an organization other than the one emphasizes secondary school attendance or one that conducting the household survey7 Ideally, both data focuses on poverty. sets should record the names of the administrative Although this method has its problems, it is a areas (or facilities if that is the level at which the good way to obtain a "core" set of comparable infor- matching occurs) about which the data were collect- mation tailored to the purposes of the household sur- ed as well as a common set of codes. Even if a com- vey about the communities in which the clusters are mon set of codes is available, the matching should be located. It is useful to have this information because verified by name. many analysts will want to evaluate the effect of one Secondary data certainly have considerable poten- community characteristic while controlling for other tial as a source of information on the communities and characteristics with which it is likely to be correlated. the larger administrative units that surround the The biggest determinant of the success of this household survey clusters. However, because of the approach is the knowledge level of the informant rel- numerous problems that can arise in tapping that ative to the questions being asked. 
It is counterpro- potential, survey designers should investigate fully the ductive (though tempting) to ask questions at a level extent to which existing sources of data can meet the of detail about which the informant is likely to be survey's policy research priorities. It would be fool- ignorant. Thus it makes sense to design informant hardy to assume at the outset that the existence of sec- selection protocols that anticipate what types of ondary data precludes the need to collect community- informants are likely to be knowledgeable about level data in the survey. which topics. Conducting Community Informant Interviews GROUP INTERVIEWS. The practice followed in most A second way to obtain community-level data is to previous LSMS surveys has been to assemble a group interview one or more residents of the community of informants consisting of, for example, village chiefs, 324 CHAPTER 13 COMMUNITY AND PRICE DATA teachers, government officials, and health care work- coniiiunity elders), gender, or the length of time peo- ers, and to administer one community questionnaire ple have lived in the community. One advantage of (composed of various modules) to all these informants interviewing more than one group per community is at the same time. If retrospective data are needed, it is that it is then possible for the survey team to compare useful to include at least a couple of residents who the different interview reports to find out in which have lived in the community for a number of years. clusters and on what topics the groups disagreed. Interviewing several people instead of just one is a These comparisons may suggest which community- sound approach because each community member is level variables are particularly subject to measurement likely to be well informed about a different topic. 
If error (as indicated by disagreement among the differ- the members of the group represent different areas of ent groups) and whether certain clusters (such as those expertise, the most knowledgeable member can take in urban areas) are more prone to measurement error the lead in answering the questions about his or her than others. topic. Conducting more than one group interview allows If this approach is used, it should be formalized in for a heterogeneity in responses that might not emerge the community survey protocols. Interviewers should in one group interview with diverse members. Another be given explicit instructions about how to identify way to encourage heterogeneity is to allow multiple informants and assemble a group. They should be answers to some questions in the questionnaire. Often, instructcd about what types of informants must be community surveys ask about the primary source of X included (for example, a teacher or someone active in (for example, drinking water), the main type of Y (for agricultural extension activities). If there is an obvious example, road surface), or whether a Z exists (for exam- community leader or respected elder, this person may ple, a pharmacy). If there are many different sources of be enlisted to organize a group of informants. The drinking water, types of road surfaces, or numbers of questionnaire should provide spaces for recording the pharmacies, it may be preferable to ask informants both basic characteristics of each informant, including his or to specify all the relevant options and to identify and her name, sex, age, education, position in the commu- provide more detail about the primary option. nity, length of tenure in his or her current position, and length of time he or she has lived in the commu- MEASURING THE AvAILABILITY OF SERVICES. One of the nity. 
At the end of each module, the interviewer principal objectives of a community survey is to estab- should record the identities of the informants who lish the number and quality of services-such as have participated in the discussion. schools, health facilities, contraceptive outlets, banks, One problem with group interviews is that they and markets-to which community residents have may be hard to control. If the group of knowledgeable access. The content of this part of the questionnaire informants is composed of people with such different depends on the extent to which survey designers want backgrounds or interests that it is unlikely they will to learn about residents' options and on how much reach any consensus, it may be better to conduct sep- detail they want to collect about the services. These arate interviews with individual informants or to factors also determine whether it is necessary to col- interview several smaller groups of informants. For lect data at the facility level. example. a woman who volunteers to distribute con- First, a question arises: what constitutes "avail- traceptives from her home may be a good source of able?" In this chapter, "available" facilities are ones that information about family planning services in the most people interviewed by the household survey are community. However, if she is likely to defer to male aware of, if only vaguely, and would at least consider community leaders in a large group interview rather using. The number of facilities that meet this defini- than speak up, she should be interviewed as part of a tion is likely to vary among communities and among separate group, possibly with other knowledgeable types of facilities, as is the geographical size of the area women in the community. that contains them. 
Other possible ways to define Groups can be organized according to, for exam- "available" include facilities that meet some geograph- ple, employment status (separating government work- ical criterion, such as all facilities within a certain ers from nongovernment workers), leadership status radius of the center of the cluster or all facilities with- (interviewing elected or appointed local leaders or in the administrative unit that contains the cluster. 325 ELIZABETH FRANKENBERG This section describes three ways of measuring distance between the community center and the serv- service availability: ice facility, this distance obviously does not reflect the * Asking community informants whether a type of access that the members of these scattered households service or facility is available. have to this facility. In some countries, it may be pos- * Asking community informants to identify the main sible for survey designers to use Global Positioning facilities to which households in the community System technology to make objective measurements have access. of distance by visiting the facilities. (This is described * Combining responses from the household ques- further below in the section on visiting facilities to tionnaire with information provided by communi- collect data.) ty informants. If survey designers want to characterize the extent Questions about the availability of services may aim to of choice community members have among different establish whether a particular service is located within facilities providing the same service-for example, the community's boundaries. Such questions may also several different health clinics-it will be necessary to aim to establish how far the service facility is from the obtain information about multiple facilities or service geographical center of the community or the time it points. (See Sections 9 and 10 of the draft communi- would take to travel that distance and how much it ty questionnaire.) 
Once designers have defined "avail- would cost to do so. (See Section 3 of the draft com- ability," they need to identify all of the facilities in the munity questionnaire inVolume 3.) In some previous community that meet that definition. One way to do surveys, community informants have been asked ques- this is to explain the criterion to one or more com- tions about both topics. munity informants and ask them to identify the facil- Generally, it is easier to ask about the availability of ities that qualify according to this criterion. Each facil- a service within a given boundary than to try to obtain ity that the informant identifies can be listed on a estimates of distance. However, the significance of service availability roster. whether a service exists within the community bound- Another possibility is to ask household respon- aries depends on how the boundaries are defined. If the dents to identify facilities during their household boundaries refer to a very small neighborhood (as may questionnaire interview. Their responses can be com- be the case for census enumeration areas), the existence piled into a cumulative list for the cluster as a whole. of a service within those boundaries is less significant This method requires that the household respondents than if the boundaries refer to a larger area. On the identify each facility by its name and address and that other hand, if the boundaries refer to a very large area, the interviewer compile the responses into one list. it will not be possible to gauge the true availability of Both of these activities are time-consuming. The the service to small communities within that area. 
For cumulative list of facilities for the cluster can then be each type of service facility (for example, a health clin- verified with one or more community informants to ic versus a hospital), survey designers should carefully check whether any obvious facilities are missing from consider which existing administrative unit to choose the list and to add information about the distance and in defining community boundaries to ensure that the the travel times and prices for each facility on the list. resulting data are meaningful. If this method is used, the list cannot be compiled Another possibility is to ask community inform- until all of the interviews for the household question- ants about the distance either to the nearest facility or naire are finished. to a finite number of facilities of specific types (see Which method is preferable depends on the goals Section 6 of the draft community questionnaire in of the survey and on whether data are to be collected Volume 3). In questions on distance, the term "com- from the facilities themselves. If the community sur- munity center" is often used to denote a reference vey involves interviewers visiting facilities, it is rec- point from which distance is calculated.The more spe- ommended that households be asked to identify facil- cific this reference point, the better. If communities ities precisely in the household questionnaire and that typically have a gathering point such as a town hall, a these responses be compiled into a list for the cluster. place of worship, or a market, this location can serve as (The advantages of this approach are described the reference point. However, if households in the below.) This list, supplemented by information from community are widely dispersed but it is only a short one or more community informants, also provides 326 CHAPTER 13 COMMUNITY AND PRICE DATA data on service availability. If facilities are not to be select a sample from that list. 
However, this strategy is visited, survey designers must decide how thoroughly rarely feasible, for the following reasons: they wish to investigate the range of facilities available * It is difficult to know a priori what constitutes an to households. Community informants will probably area of appropriate size. be able to identify the main services that are easily * The listing process is time-consuming, expensive, accessible to members of that community. They are and probably impractical to undertake in an area not likely to provide a complete list or much detailed large enough to cover the distances household information about those facilities because they simply members are willing to travel for services. will not know everything about all facilities. Analyses . The facilities chosen for the sample may not corre- of the Indonesia Family Life Survey data have spond to the facilities that LSMS households know revealed that community informants do a better job about and use. of characterizing public facilities than private facilities The strategy recommended here is to compile a (Frankenberg 1998). If detailed information about the list of facilities in the household questionnaire as facilities is desired, interviews should be conducted at described above. If the number of facilities on the the facilities. cumulative list is small enough or if the survey budg- et is large enough, all the facilities on the list can be Visiting Facilities to Gather Data visited. Otherwise, a sample of facilities may be drawn A third way to obtain community-level data is by vis- from the list. The sample of facilities can be selected iting facilities and administering questionnaires to either randomly or according to some other criteri- staff. 
The term "facility" is used broadly and includes on-for example, in proportion to the number of markets and sales outlets (for gathering information on times the facilities are mentioned by household prices) as well as schools, health facilities, banks, other respondents. sources of credit, and employers. This was the procedure used to draw a sample of The main advantage to visiting facilities is that it facilities in the Indonesian Family Life Survey. becomes possible to obtain far more detailed and gen- Facilities were ranked according to the frequency with erally more accurate information about the prices and which they were mentioned by survey respondents. content of the services that these facilities offer than The most popular facilities were visited, and addition- can be obtained just by asking community informants. al facilities were randomly selected until a predeter- By making direct observations of these facilities, it is mined quota was reached. This method guaranteed also possible to gather information about private as that all facilities had a nonzero probability of being well as public facilities.8 Potential topics for facility selected, while increasing the chance of substantial questionnaires are covered in the respective sectoral overlap between the facilities interviewed and the chapters of this book. facilities of relevance to household residents (Frankenberg and Karoly 1995). Information on the SELECTING FACILITIES FOR INTERVIEWS. One of the first number of facilities listed and how often each was issues to be considered when designing a facility sur- mentioned by household respondents was used to vey is how to choose which facilities will be visited. generate the sampling weights for the selected facili- This issue is a logical extension of the discussions ties (McCaffrey, personal communication, 1995). regarding how to define "community" and how to From a theoretical standpoint, using a cumulative measure service availability. 
In an LSMS survey, facili- list generated from the responses of household mem- ties of interest are facilities that are "available" to the bers is attractive.The main drawbacks are that the list households in the LSMS sample. Generally, survey cannot be constructed until household interviewing is designers aim for a good deal of overlap between the finished and that compiling the list may be time-con- facilities interviewed and the facilities that LSMS suming, depending on the number of household household respondents know about and use. respondents and the number of different kinds of facil- From a theoretical standpoint, the way to select ities the respondents list. facilities that parallels the typical procedure for select- Other methods that have been used to select a ing households is to define a geographical area of sample of facilities are to select all facilities within a interest, to list all the facilities in that area, and then to certain radius of the cluster center or all facilities with- 327 ELIZABETH FRANKENBERG in the administrative unit that contains the cluster.Yet consuming but it gives a more accurate estimate of the another alternative is to choose the facility located distance between a facility and the community center closest to a cluster. This strategy will almost certainly than the straight-line measurement. generate a biased picture of available services in any A third way in which Global Positioning System community in which residents have access to multiple data can be used is in conjunction with digitized maps facilities. All of these strategies are flawed from a sci- of the areas in which the survey is being conducted. If entific perspective and present analysts with a number digitized maps are available, the facility and cluster of difficulties. coordinates can be added to the map to illustrate the locations of the services available to households in USING GLOBAL POSITIONING SYSTEM TECHNOLOGY TO each cluster. 
The more features, such as roads, bus MEASURE ACCESS TO SERVICES. Visiting a sample of stops, and markets, that are geocoded into the map, the facilities makes it possible to use Global Positioning more thoroughly it is possible to analyze any geo- System technology to measure the distance between a graphical factors that may prevent people from using facility and the center of the community or cluster. the facilities located in their community. Survey The Global Positioning System is a navigation system designers should explore whether the country's map- that can be used to determine a position on the earth ping agency produces digitized maps and at what level in relation to a set of orbiting satellites. The Global of detail. This method of using the Global Positioning Positioning System can determine the latitude, longi- System was used by Entwistle and others (1997) to tude, and altitude of the facility, the coordinates of map contraceptive facilities in Nang Rong,Thailand. which can be labeled and stored for lengthy periods in Because a detailed map of the area was available, the system's memory until a computer is available into researchers were able to calculate the time it took to which they can be downloaded directly. Global travel to these family planning facilities using various Positioning System indicators are now available quite different routes.The analysis revealed that the compo- cheaply. Most are easy to use. The operator finds a sition of the road (for example, asphalt or dirt) had an location where the sky (and as much of the horizon as independent effect on contraceptive behavior. possible) is visible, turns on the indicator, and holds it up until three or more satellites have been "acquired." Collecting Price 'Data At this point, the indicator gives a readout and the It is standard practice for LSMS surveys to collect operator can mark and label the spot. There are sever- price data by sending survey teams to visit markets. 
al potential sources of error associated with the system, Prices are an important element of community-level such as the blockage or reflection of satellite reception data, but collecting price data is complicated, and by buildings or other obstructions. many analysts of community-level price data have Global Positioning System data can be used in expressed dissatisfaction with the information they several different ways in the context of a community- have been given. From the analyst's perspective, one facility survey.The most basic application involves tak- major problem is that price data are often not suffi- ing readings of the latitude and longitude of each ciently comparable among communities, nor are they facility visited. Coordinates from the facilities, in com- sufficiently precise within communities. Three general bination with readings on the latitude and longitude problems account for this lack of comparability. First, of the cluster center, the community center, or (prefer- price data typically contain large numbers of missing ably) both, can be used to calculate objective measures values, so that a complete set of price information is of distance to each facility.9 However, these estimates rarely available for any one community. Second, there measure the most direct route, ignoring the fact that are often differences among communities in the qual- people typically travel on roads or paths rather than ity of the item for which the price is reported. Third, overland. Thus another more sophisticated way to use quantity is often measured imprecisely, making it Global Positioning System technology to measure the impossible to calculate a unit price. distance from a central point to a facility is to take a In thinking about how to avoid these problems, it sequence of readings along the route and to add up the is useful to start by considering how price data will be lengths of the various segments of the route traveled used. 
There are two major uses for price data. The first to the destination. This method is more time- is to develop price indices, which are used to ensure 328 CHAPTER 13 COMMUNITY AND PRICE DATA that expenditures in different regions can be compared interviewer visits. These data, such as the name and without being affected by any price differences address of the outlet, could be collected on a separate between those regions. The second major use for "outlet roster" form similar to the roster forms used communitv-level prices is in models predicting behav- for schools and health services. Another possible way ior such as school enrollment and outcomes such as to collect prices would be to ask community inform- health status, because price levels are considered out- ants or a sub-sample of household informants about side the control of households or individuals. prices. Given how little is known about how to collect These uses of price data have three implications for data on community-level prices and how many prob- the design of the price module. First, the items in the lems there have been in past LSMS surveys, it is rec- community price module-particularly the food ommended that both methods be used. items-should complement the items in the household The outlets selected for interviewer visits should consumption module. Second, the price module be places that a typical household respondent would should gather price data on items-which can include use. The range of possible methods for selecting price food items, nonfood items, and services-that may outlets is parallel to the range of methods for selecting affect the behavior and outcomes of interest in the facilities. One method would be to record in the con- household survey (for example, purchases of items such sumption module of the household questionnaire the as aspirin, antibiotics, and condoms). 
Third, it is impor- market or store where the respondent usually shops tant to choose sources of information that accurately for food. The interviewer could then visit the two or represent the prices community members encounter. three outlets mentioned most often. Otherwise, the It is possible to design the price questionnaire so outlets could be selected on the basis of proximity to as to avoid some problems of missing values, variations the cluster center or to some other center of activity, in quality, and meaningless or imprecise measures of such as the community center or a major employer. quantity. However, an equally important way to avoid It is best to administer the price questionnaire to these problems is through training and supervision of informants who frequently make purchases and, if pos- the interviewers. The filled-in questionnaires should sible, to avoid relying on informants who are consid- be reviewed by field supervisors; if they are incomplete erably poorer or wealthier than the household respon- or imprecise, the supervisors should ensure that the dents seem to be. A community-level interview interviewers clarify the information in question. conducted with a group of female informants may The content and design of the price module provide a good opportunity to ask questions about should reflect in large part the conditions of the coun- prices and other community-level topics. try in which it is being used, including the usual diet of the population and the ways in which goods are COLLECTING DATA ABOUT ITEMS OF A SPECIFIC distributed, marketed, and sold. QUALITY AND QUANTITY. When interviewers collect data on the price of any given commodity, they must SOURCES OF INFORMATION ON PRICES. 
In most previ- record the price for a particular quantity of the com- ous LSMS surveys, interviewers have collected price modity using a well-known unit of measurement.This data by visiting markets and vendors and asking the is straightforward for commodities that are routinely price of particular goods. For items normally bought sold in known quantities such as liters or grams. in bulk, interviewers would ask the price for a partic- Survey designers should specify to interviewers in ular weight, with the item being weighed on scales advance their preferred quantities and units of meas- and sometimes in containers carried by the interview- urement. If the price data are to be collected at mar- er. The interviewer would repeat this process at sever- kets, interviewers can bring scales and containers to al sales outlets with the goal of obtaining three price standardize the quantities of the various commodities. measurements per item. Because not every outlet sells Other commodities tend to be sold in tins, cans, every item on the questionnaire, the interviewer bunches, bundles, sachets, or packets. In these cases it is might have visited far more than three outlets to critical to record the quantity in standardized units obtain three price measures for every item. such as liters or grams. Some countries or regions use In some instances survey designers may wish to uncommon measurement systems, but this problem collect specific information about each outlet that the can be solved by having the interviewer use his or her 329 ELIZABETH FRANKENBERG own set of scales and measuring containers to quanti- Given that the primary goal of the LSMS is to fy the commodity. The interviewers' training, field- monitor poverty trends, the list should probably cover work, and field supervision should all stress the impor- the items in the food bundle that the government uses tance of obtaining price data in known quantities. to calculate poverty lines. 
Standardizing the quality of the commodities for In many situations the quality of food and the way which prices are collected is also important to ensure it is processed vary considerably in ways that are that analysts can later compare like with like. However, reflected by variations in price. When interviewers standardizing quality is much more difficult than stan- collect price data on meat, fish, and poultry, they dardizing quantity. Appendix 13.1 suggests ways for should specify whether the recorded price applies to interviewers to specify the quality of items. meat with or without bones. Also, because different With respect to both quantity and quality, it is qualities of staple starches, particularly rice, are often important that any relevant country-specific informa- available, it is important to specify the kind of starch to tion be reported on the price questionnaire. If the which the price refers. It may be possible for survey price of rice varies depending on whether it is an designers to specify a particular variety that is available imported, modern or local/traditional variety, the throughout the region or to ask about starches of dif- variety should be specified in the questionnaire. The ferent qualities. If food prices are known to vary by question then arises: if a particular variety or brand is season, it may be useful to try to collect price data for not available, should the price of another variety or a specific season and to establish whether an item is brand be collected instead? The questionnaire pro- available year-round. posed in this chapter provides a space in which survey The list of prices for nonfood items can be long designers can inform the interviewer about which or short. General categories include prices for public quantity and variety they prefer. 
If the interviewer is transport to key locations, the cost of fuel, prices for unable to obtain information about that quantity or basic clothing such as shirts, skirts, and trousers, hous- brand, he or she should record information about ing prices (typically the rental and purchase prices for another quantity and variety while specifying on the an average house in the community), prices of house- questionnaire sheet the alternative quantity or brand hold goods (such as soap, cooking fuel, or firewood), name. basic medicines, education inputs such as school uni- forms and basic supplies, and agricultural inputs such CONTENT OF THE PRICE QUESTIONNAIRES. There are as fertilizer and insecticide. three broad categories for which it is desirable to col- If price data are collected from community lect price data: food, nonfood goods, and services.The informants rather than from visits to markets or shops, items in the household expenditure lists provide some nonfood prices should be included in other parts of more detailed guidance, as do country-specific lists of the community questionnaire. For example, questions goods whose prices are used in calculating the con- about the price of fuel might be included in the sec- sumer price index. tion on transportation, while questions about the The food price list should cover staple starches, prices of agricultural inputs could go in the section on fish, meat, poultry, beans, lentils, fruits, vegetables, salt, agriculture. This principle is important given that dif- sugar, milk, and cooking oil. Spices, alcoholic bever- ferent informants contribute to different sections in ages, and tobacco are other items that may be impor- accordance with their specific expertise. There is no tant in some country circumstances. 
The numbers of guarantee that the informants who know the most items per category and the specific items in each cat- about food prices also know the prices of insecticide egory will vary by country, but the list should cover and kerosene. enough items to account for 80-90 percent of house- It is also worthwhile to collect price information holds' expenditures on food. Survey designers should on such services as schooling (registration fees), health ensure that the list takes into account any regional care, small loans (interest rates), and sewage hookups, variations in dietary patterns. If chicken is the main garbage pickup, and electricity, if these vary among commodity in one region while fish is the main com- communities. Since services are not sold at the same modity in another, the prices of both goods should be outlets as food and nonfood goods, data on the prices collected in both regions. of services may need to come from the community 330 CHAPTER 13 COMMUNITY AND PRICE DATA informants. For this reason, several sections of the gather relevant information. In most previous LSMS community survey are likely to contain questions surveys, community-level data on prices have been about prices distinct from the questions in the price gathered by visiting sales outlets. This practice should questionnaire. continue. Other relevant service points include public and private health providers (including, where appro- Summary priate, traditional practitioners), elementary,junior, and This section has discussed three ways of obtaining senior high schools, and banks and cooperatives. community-level data: from secondary data, by inter- Survey designers must first choose and test a method viewing knowledgeable community informants, and for selecting the facilities to be visited. Also, they by visiting local facilities and markets. 
Survey design- should design different questionnaires for the different ers should evaluate all of these potential methods dur- types of facilities, striking a balance between using ing the design stage of the household survey. If there is content that is sensible for a particular type of facility an abundance of high-quality administrative or survey and obtaining comparable data among facility types. data that designers can use, this will obviate the need The main disadvantage of gathering information to include a full community survey of informants, by visiting facilities is the high cost involved. Fielding markets, and facilities in the overall household survey. an in-depth facihty survey is far more expensive than However, secondary data are often less useful than interviewing informant groups and visiting a few mar- they may first appear, so survey designers should kets. Thus it probably only makes sense to carry out a examine their content, quality, level of disaggregation, facility survey if the main aim of the household survey and comparability with the household survey data is to understand the determinants of investments in very carefully before concluding that they need not human capital and subsequent outcomes. If this is the field a full community survey. One important question case, the household survey should contain detailed that should not be overlooked is whether the second- measures of behavior and outcomes related to human ary data include any data about private-sector facili- capital, while the community survey should include ties, as these are an important element in calculating visits to facilities that are thought to influence that the effects of government programs on individuals' behavior and those outcomes. human capital investments and outcomes. 
Fielding a The easiest way to think about the costs associated facility survey as part of the overall household survey with collecting facility-level data is in terms of the provides an opportunity to collect data on private as interviewer's time; such time calculations are analogous well as public providers of services. to those for the household survey. If data are needed The second way to obtain community-level data from 10 facilities per cluster, 10 facilities must be is to interview one or more groups of knowledgeable selected and interviewed.This will take at least 10 times informants.This is a cheap and straightforward way of as long as conducting one interview with a group of obtaining information on a basic set of community- community informants, but will take less time than level variables that are comparable across clusters and conducting a lengthy interview with each member of that refer to the time period during which household- 15 or 20 households. Thus an extensive community- level data are collected. Therefore, it is recommended facility survey might cost up to one-third of the price that this practice continue in future multitopic house- of a household survey, while a rudimentary interview hold surveys, although survey designers should make with community informants will cost only about 5 sure in advance that community informants (either percent of the price of a household survey. individually or in groups) will be able to provide the necessary information. It is also crucial that the com- Draft Community Questionnaire munity questionnaires be pretested and revised before they are used in a full-scale survey exercise. In the past, This section introduces a prototype LSMS-type com- this important step has often been omitted. 
munity questionnaire (presented in Volume 3) that If one of the aims of the household survey is to includes a specific set of questions on prices.The ques- collect detailed community-level data on services or tionnaire assumes that a core set of community-level facilities, a third method can be used: sending survey data will be collected from informants within the interviewers to visit local facilities and markets to community, but that there may also be scope for 331 ELIZABETH FRANKENBERG obtaining data from administrative sources or from vis- design and implement a set of screening questions its to service points. The questionnaire proposed here than to rely on the crude urban-rural distinction to covers most of the sectors on which there are modules categorize communities. in the household questionnaire. The sectoral compo- Throughout the questionnaire, the word "com- nents were developed with input from the authors of munity" is used in the same sense that it has been used the sectoral chapters of this book. throughout this chapter.Yet in each survey the term to Individual sections can be expanded or contract- be use may be different. In a survey that covers only ed to reflect the particular policy focus of the house- rural areas, "village" may be the most appropriate hold questionnaire. Within the questionnaire, items term, and in socialist countries "commune" may be that may not be essential have been marked with an the best choice. In addition, the term "currency" asterisk to indicate that survey designers may wish to should always be replaced by the name of the nation- drop them in order to shorten the questionnaire. If all al currency, such as dollars, pcsos, or rupces. of these items are dropped, the questionnaire shrinks In defining response categories, the proposed ques- from about 35 pages to about 20 pages. Of course, the tionnaire covers the most likely responses. 
However, in specific context and purpose of the survey should be many countries survey designers should add new cate- considered when deciding which questions to drop gories and delete some existing categories. and which to retain. For example, if the household survey focuses on education and health, it may not be Annotations on the Questions in the necessary to include lengthy sections on credit or on Community Questionnaire sanitation infrastructure. The questionnaire should be pared down during the design and pretesting stages if This section explains parts of the draft questionnaire it is not possible to identify respondents who are like- introduced in the previous section (and presented in ly to know the answers to all ofthese questions. Volume 3), detailing why particular choices were While the questionnaire is designed to be applied made and where variations may be appropriate. in both rural and urban areas, some of the questions are more relevant for one type of area than the other. Section 1: Cover Sheet and Respondent Characteristics Wherever possible, community informants should be The cover sheet collects basic information on the asked screening questions that enable interviewers to name and location of the community and on the iden- skip sections on topics that are not relevant in the tity ofthe interviewer, supervisor, and data entry oper- informants' communities. It is definitely preferable to ator. The section called "Respondent Characteristics" devise a questionnaire that can be implemented in collects basic information about the peoplc who par- both types of areas with slight modifications than to ticipated in the interview. Instructions about how to have two different questionnaires; in practice, comnmu- select respondents, which will vary by country, can be nities tend to be located somewhere on a continuum added at the top as a "Note to the Interviewer." 
(Such from "rural" to "urban" rather than in one of two clear instructions might include the minimum number of and distinct categories. Using one of two possible group members or the fact that a group should questionnaires in a given community may result in the include one elected official and at least one teacher.) loss of important information. For example, in periur- See also Chapter 4 on metadata. ban locations, agriculture may be an important source of livelihood that should not be ignored in the com- Section 2: Physical and Demographic Characteristics of the munity questionnaire, despite the fact that the location Community has been classified as urban. It is useful to obtain information on basic physical and Another reason why it is better to administer one demographic characteristics of the community, such as questionnaire than two is that if two questionnaires land type (for example, coastal, inland, or mountain- are developed, it is easy to end up with information ous), land use patterns, rainfall patterns (months of the that is not comparable, which complicates the analy- rainy season), and geographical size-and on the num- sis of the resulting data set considerably. If certain sec- ber of households, individuals, and ethnic groups in tions of the questionnaire suggested here are relevant the community. This information can be collected only for a certain type of community, it is better to from community informants, but in the case of demo- 332 CHAPTER 13 COMMUNITY AND PRICE DATA graphic characteristics, census data should also be telephone offices, and a daily market-as well as of obtained wherever possible. Because the primary sam- administrative centers such as district and province cap- pling unit for most previous LSMS surveys has been a itals. 
The design of the transportation section of the census enumeration area, census data usually exist, community survey should take into account the types although they may be outdated, of questionable qual- of services that are of special interest in the overall ity, or difficult to obtain. Questions may be added household survey. Certain services, such as schools, regarding kinship networks and the status of women. health and family planning facilities, and credit sources, If these topics are of sufficient interest to the survey are probably important enough to warrant their own designers, an independent section should be created. modules.The questions about these services that appear A specific point to bear in mind in this section is in the transportation module should cover whether that, where possible, the survey team should obtain each service exists within the community, how far multiple Global Positioning System readouts on the away the service is, and the travel cost and time latitude, longitude, and altitude of the community, so required to reach the service by public transportation. that average values can be used to minimize inaccura- cy in these measures. Employment in Industry (Section 4) and Agriculture (Section 5) Secion 3:Tronsportation A key measure of welfare is income (or consumption). The welfare of individuals is affected by their access to Employment in the labor market is the primary source opportunities (for example, in labor markets, credit, of income for most households. Consequently, in education, and health and family planning services) order to understand the determinants of economic and to information about these opportunities. welfare, it is necessary to know what labor market Transportation networks increase access to these opportunities are available to the individuals living in opportunities because they facilitate the transfer of a given community. 
These opportunities can be in goods, services, and information and enable individu- both formal and informal sectors and can be in the als to respond to the availability of the goods, services, agriculture sector, in small businesses, or in large and information. industries. A basic employment module in the com- The transportation module in a community sur- munity questionnaire should yield information on the vey should include questions about the surface of the structure of the labor force in the community, includ- main roadway in the community and of the majority ing which sectors are present and which are of pri- of community roads. Questions should also be includ- mary importance. This module should also inquire ed about how accessible the community is in different about wage rates for men, women, and children in seasons and about whether a transportation service agriculture, large industries, and small (cottage) indus- operates within the community and beyond the com- tries. It should apply to both urban and rural commu- munity. The questions should make it clear that a nities, although not all questions need to be answered transportation service can be government-sponsored in all areas. or privately operated and may or may not charge a fee. At minimum, the questions about agricultural This module is also a good place to inquire about the employment should establish the important crops price of fuel. Some items in this module may not be grown in the area and the wage rates for the different relevant in urban areas; it may therefore be appropri- jobs associated with producing these crops. 
In many ate to include an instruction to the interviewer that communities in the developing world, agriculture may reads: "If the cluster is located in a busy urban area be of such importance that this section should include where public transportation is common and standard detailed questions about how crops are produced, such municipal services are available, skip to Question X." as the number of harvests per year and the availability However, information on fuel prices should be col- of key inputs like irrigation capacity or harvesting lected in all communities. machinery. If land tenure is an important determinant It is also useful to inquire about the accessibility of welfare, questions about this should also be includ- (within the community and through transportation ed. Questions on agricultural extension activities can services) of miscellaneous locations and services about be included if they are relevant to the policy priorities which not much detail is needed, such as post offices, of the overall survey. Questions about the importance 333 ELIZABETH FRANKENBERG of animal husbandry in the community are also infrastructure, but it increases the likelihood that they worthwhile. do. Also, the availability of infrastructure at the com- The questions on employment in the industrial munity level may provide externalities to households sector can follow a similar vein, beginning by inquir- (by contributing to a healthier environment) even if ing whether there are large industries in the area. If so, the household members do not use a particular serv- determining the products and wages of the three near- ice. The infrastructure module should focus on water est (or perhaps largest) factories can help characterize sources, types of toilets, methods of sewage disposal, employment opportunities in this subsector. methods of solid waste disposal, and availability of Small (cottage) industries are another important electricitv. 
For each of these categories of infrastruc- source of employment in many developing countries. ture, survey designers should consider including ques- This subsector will be described by a small grid that tions on: determines whether such industries are an important * Major sources and methods. source of employment in the community and, if so, * A source or method that accounts for most house- what the industries produce and the wage rates the holds in the community. industries provide. * Whether sources or methods vary seasonally and in In some countries, public work schemes or unem- a way deleterious to household welfare (for exam- ployment insurance may be available to individuals ple, water shortages during the dry season). who have had difficulty finding a job. If these pro- * Whether services are publicly or privately provided grams provide an important social safety net, questions or both. should be included about their availability in the com- . The extent to which services are disrupted. munity and what wages and benefits they offer. * The year when a service became available. Survey designers may wish to make the categories * The current price for hooking up to or enrolling in in Section 4, Questions 32-35 more detailed. a service. The proportion of households that use the "desir- Section 6: Credit able" source or method (for example, piped water The credit section in the draft module is fairly short rather than pond water) or the source or method and can be expanded depending on the focus of the that has been a target of a government program (for household survey. If survey designers want a more example, the construction of sewage systems or the complete set of credit institutions, a service availabili- expansion of garbage collection services). ty roster can be used. (See Sections 9 and 10 on In asking questions about infrastructure, if the schools and health facilities.) 
The initial questions focus on methods of saving and sources of credit. Additional questions ask for a list of the main places where community residents save or borrow money. For each of three places, questions are asked about the types of loans and methods of savings that are available, as well as the distance from the community center to the institution. The proposed module contains questions on interest rates. However, in places where interest rates vary depending on the characteristics of the borrower, these questions may be so difficult to answer that they should be deleted.

Section 7: Physical Infrastructure
Access to infrastructure is often used as a measure of welfare. It is clear that access to a safe water supply, for example, is a basic human need. The availability of infrastructure at the community level does not guarantee that the households in the sample survey use the [...] focus is on the local neighborhoods in which the survey households are located as opposed to on a larger community, it may not be necessary to ask questions about the proportion of households that use various methods because this information can be obtained by aggregating the household data (see earlier discussion).

Section 8: History and Development
One aspect of studying behavior involves studying how it changes in response to events that are beyond the control of individuals. These events can include natural disasters such as floods and droughts, changes in employment opportunities such as the opening of a new factory, and policy interventions such as the construction of public housing or a new clinic. The community questionnaire should include a module identifying events of this nature that have occurred during the previous 5-10 years. Data should be collected on the type of event, the year in which it occurred, and, perhaps, an estimate of the proportion of the community affected.

Apart from describing the types of events that are of interest to survey designers, it is best to leave this section open-ended, allowing informants to decide what constitutes an important event. The reported events may overlap to some degree with other questions, but collecting small amounts of redundant information is preferable to omitting important changes. It is particularly important for this section that some of the informants in the group have lived in the community for some length of time.

Sections 9 and 10: Availability of Health Facilities and Schools
Four types of services are crucial in most household surveys: health services, family planning services, schools, and credit sources. In some countries, health and family planning services are integrated so questions on the availability of both kinds of services can be combined. Three or four modules should be designed for the service availability section.

The first step is to determine, within the four broad categories, which types of facilities are relevant. For example, among educational institutions, the availability of universities and vocational academies is probably of less interest in most surveys than the availability of primary and secondary schools. Among health facilities, the basic government sources of primary care and family planning are likely to be particularly important in many countries, as are single-practice private providers. If the health of women and children is of interest to the survey designers, it will be important to include midwives in the list of service providers, while hospitals and traditional practitioners may be of less interest. The next step is to determine whether it is necessary to find out specific details about particular facilities in each category and, if so, how many facilities of each type are of interest.

A basic module on service availability might ask only about the number of facilities of different types in a specific geographical or administrative area. A more detailed module would obtain a list, by name and address, of the public and private services available to community members, as well as some information about the accessibility of each service provider on the list. Such a detailed list allows survey designers greater flexibility in determining the area that contains the services. (The same geographical or administrative boundary need not apply to all areas.) It also has the advantage that it provides a list of facilities for which administrative data might be available, and, if a facility survey is done, it provides a list of facilities from which a sample can be drawn. Sections 9 and 10 reflect this detailed approach. If facility-specific identification codes are to be assigned (to facilitate linking household data to data in the community or facility survey), the service availability modules should include space for these identification codes.

When designing questions on service availability, it is important to keep in mind that the amount of detail that community informants can reliably provide about facilities is likely to be limited. The kind of information that they are likely to know includes whether the facility is within the administrative boundaries, the distance from the community center to the facility, and the time and expense required to travel to the facility. It may not be possible for community informants to summarize the full range of facilities available to everybody in the community. However, the information will become more complete as the size of the group of informants increases, because a wider range of experiences will be captured.

It may also be useful to ask about outreach programs that carry services into the community, such as immunization or family planning campaigns.

If a separate facility survey is conducted, the service availability modules can be abridged.

Direct Observation (Section 11)
Interviewers necessarily spend several days in the community in which they are administering the household questionnaire. Survey designers can capitalize on this fact by including a section in which the interviewers record their observations of community features such as housing characteristics, environmental quality, security, and socioeconomic status of community members. (See Chapter 25 on qualitative data collection.) These observations by the interviewer will supplement the data on the demographic and physical characteristics of the community gathered from the community informants.

Survey designers may also be interested in gathering information (in this case metadata) from the survey team on its experiences in implementing and administering the survey. Useful questions might cover:
* The number of miles between each cluster and the previous cluster visited.
* The number of nights the team spent in a cluster.
* The address, price, and characteristics of the base camp.
* Whether electricity was available for entering data into computers.
* The names of community residents who were particularly helpful.

Prices
Two price questionnaires are presented: one to be administered to community informants and one to be used at markets or with vendors. To minimize the problem of missing values for prices, it is recommended that both questionnaires be used. The questionnaires are designed so that a preferred quantity and quality are specified for each item. The interviewers should first try to obtain a price estimate that corresponds to the specified quantity and quality. If the informant cannot provide that price, the interviewer should record a price that the informant can provide and the quantity and quality associated with that price. For more information see the subsection on measuring prices in the third section of this chapter.
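Because the recording rule for prices allows a price to be reported for a quantity other than the preferred one, such prices have to be rescaled before they can be compared across communities. The following is a minimal sketch of that step in Python, not a procedure taken from the questionnaires themselves. The record layout (`PriceRecord`), the conversion table (`TO_BASE`), and the function name are hypothetical. The sketch also assumes that price scales linearly with quantity, which ignores bulk discounts, and it treats vague units such as "bag" as not convertible.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical conversion factors: unit -> (dimension, size in base units).
# Vague units such as "bag" or "bunch" are deliberately absent.
TO_BASE = {
    "kilogram": ("mass", 1.0),
    "gram": ("mass", 0.001),
    "liter": ("volume", 1.0),
    "milliliter": ("volume", 0.001),
}


@dataclass
class PriceRecord:
    item: str
    price: float     # price reported by the informant, in local currency
    quantity: float  # quantity that the reported price refers to
    unit: str        # unit of that quantity


def price_per_preferred(record: PriceRecord,
                        preferred_qty: float,
                        preferred_unit: str) -> Optional[float]:
    """Rescale a reported price to the preferred quantity.

    Returns None when the recorded unit is vague or measures a different
    dimension than the preferred unit, so the value stays missing instead
    of being guessed.
    """
    recorded = TO_BASE.get(record.unit)
    preferred = TO_BASE.get(preferred_unit)
    if recorded is None or preferred is None:
        return None
    if recorded[0] != preferred[0] or record.quantity <= 0:
        return None
    recorded_base = record.quantity * recorded[1]
    preferred_base = preferred_qty * preferred[1]
    # Assumes price is proportional to quantity (no bulk discount).
    return record.price * preferred_base / recorded_base


# A price of 30 for 250 grams of salt, rescaled to the preferred 500 grams.
salt = PriceRecord("salt", 30.0, 250, "gram")
print(price_per_preferred(salt, 500, "gram"))
```

Leaving non-convertible records as missing keeps the analyst aware of how often the preferred specification could not be priced, which is itself useful information about the questionnaire.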
Appendix 13.1 Potential Items for the Price Questionnaire

| Item | Examples of preferred units | Common units that are too vague | Examples of quality-related specifications |
|---|---|---|---|
| Alcohol | 1 liter | Bottle | Type, brand |
| Bananas | Kilogram, bunch of 6 | Bunch | Variety |
| Beef | 1 kilogram | | Which cut; boneless or with bones; cubes, strips, and other divisions; prepackaged or not |
| Beer | 1 liter | Bottle; can | Brand |
| Bread | 1 kilogram; 1 loaf | Bag | Flour type, brand name |
| Cabbage | 1 kilogram | Head | |
| Cassava | 1 kilogram | | Fresh roots, dried chips, flour; variety |
| Chicken | 1 kilogram; if whole, specify an approximate weight | | Whole; plucked or unplucked; gutted, head and feet removed; parts (breasts, thighs, wings, necks, back, mixed); prepackaged or not |
| Cigarettes | Pack of 20; carton (10 packs of 20) | Pack | Brand; filtered or unfiltered |
| Cooking oil | 1 liter | Bottle | Source of oil (vegetable, animal, corn, safflower); brand |
| Corn | 1 kilogram | | |
| Dried beans | 1 kilogram | Bag | Variety (navy, pinto, black-eyed, and so on) |
| Duck | See chicken. | | |
| Eggs | 1 egg; 6 eggs; 12 eggs; 1 kilogram | | Chicken, duck, or other; refrigerated or not; how fresh (if this affects price) |
| Freshwater fish | See sea fish. | | |
| Mangoes | 1 kilogram | | Whole; variety |
| Milk (powdered) | 1 kilogram | Box | Brand |
| Milk (sweetened condensed) | Liter, milliliter | Can | |
| Millet | 1 kilogram | | |
| Noodles | | Package; box | In bulk or brand name |
| Oranges | 1 kilogram | Bag | Variety |
| Papaya | 1 kilogram | | Whole; variety |
| Peanuts | | Bag; box | Shelled or unshelled; raw or roasted; salted or unsalted; with or without skins |
| Pork | See beef. | | |
| Rice | 1 kilogram | Bag | Imported or exported; long-grained or short-grained; white, brown, or black, sticky rice; hulled or unhulled; variety (local, modern, high-yielding) |
| Salt | 500 grams | Box | Sea salt; iodized; level of refinement |
| Salted fish | 1 kilogram | | |
| Sea fish | 1 kilogram; specify approximate weight if one fish is likely to weigh more than 1 kilogram | | Whole; scaled or not; gutted or not; head removed or not; name of fish; filleted |
| Smoked fish | 1 kilogram | | |
| Spinach | | Bunch | Washed or not |
| Sugar | 1 kilogram | Bag; box | Processed; level of refinement (description of granularity); type (white, brown, and so on) |
| Tobacco (loose) | 500 grams | Bag | Brand |
| Tomatoes | 1 tomato; 1 kilogram | | Variety |
| Wheat flour | 1 kilogram | Bag | Bulk or brand-name |

Source: Author's summary.

Nonfood Items
* Aluminum saucepan
* Aspirin
* Bar soap
* Bicycle tire
* Bottled natural gas
* Cassette player
* Charcoal
* Coal dust
* Condoms
* Cotton cloth
* Firewood
* Fuel oil
* Iron rod
* Kerosene
* Laundry soap
* Light bulb
* Mosquito nets
* Oral contraceptives
* Oral rehydration solution
* Porcelain bowl
* Radio
* Rubber flip-flops
* Synthetic cloth
* Tin bowl
* Toothpaste

Notes
The author would like to thank Jere Behrman, Parfait Eloundou-Enyegue, Paul Glewwe, Margaret Grosh, Courtney Harold, John Strauss, Paramita Sudharto and Duncan Thomas for comments on earlier drafts of this chapter.

1. Sometimes, as was the case in the Ghana LSMS, households in a cluster span two communities. In this case interviewers should collect community data separately for each segment of the cluster that corresponds to a distinct community. Household questionnaires and community questionnaires must be labeled in the field so that they can be properly matched during data processing. Sometimes, as was the case in the Indonesian Family Life Survey, two clusters are contained within the same community. In this case community data should be collected once and assigned to two distinct clusters, but facility data should be collected twice, depending on the protocol for selecting the facilities visited. Regardless of how neatly communities are defined, special cases will arise in the field. The best approach in such cases is to try to maintain consistency with the selected definition of a community.

2. This may not be true everywhere. In some countries, such as Ghana and Cote d'Ivoire, an entire administrative unit, such as a village, may serve as the cluster and thus have a great deal of significance.

3. The administrative structures of Vietnam and Indonesia are examples. In Vietnam the smallest unit is a village, and several villages comprise a commune, which is the lowest administrative level of government. In Indonesia households belong to "household groups," neighborhoods, villages, subdistricts, districts, and provinces, regardless of whether they are located in cities or in rural areas.

4. An analogy to the household questionnaire would be that the time unit to which expenditure questions refer varies by good according to assumptions about the frequency of purchase.

5. Four of the more recent LSMS surveys (Morocco, Nepal, Bolivia, and Ecuador) have included urban questionnaires. Initially, a similar strategy was pursued with the Service Availability modules in many of the Demographic and Health Surveys.

6. For example, administrators may believe that they will receive new equipment if they report that existing equipment malfunctions.

7. Matching two data sets that are based on the same geographic coding system is much easier than matching data sets based on different coding systems. If secondary data come from the same agency that is conducting the LSMS survey, it is more likely that the same coding system will be used than if two different agencies are involved.

8. Secondary data from government ministries often exclude private facilities, and community informants are often more knowledgeable about public facilities than about private facilities.

9. One can also obtain readings of latitude and longitude for the households, and thus obtain household-specific measures of distances to facilities. However, there are confidentiality issues with respect to releasing the coordinates of the households.
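The household-specific distance measure mentioned in note 9 can be illustrated with a short Python sketch. It is an assumption-laden example and not a method described in the chapter: it uses the haversine great-circle formula with a mean Earth radius of 6,371 kilometers, so it gives straight-line distance and not travel distance or travel time. The function names and the facility identifiers are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against rounding error for near-antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def nearest_facility(household, facilities):
    """Return (facility_id, distance_km) for the facility closest to a household.

    household: (lat, lon) tuple.
    facilities: non-empty dict mapping a facility identification code
    (hypothetical here) to a (lat, lon) tuple.
    """
    distances = ((fid, haversine_km(*household, *coords))
                 for fid, coords in facilities.items())
    return min(distances, key=lambda pair: pair[1])


# Made-up coordinates for one household and two facilities.
facilities = {"CLINIC_A": (0.0, 1.0), "CLINIC_B": (0.0, 0.5)}
print(nearest_facility((0.0, 0.0), facilities))
```

Given the confidentiality concern raised in the note, one option is to compute distances like these before release and publish only the derived distances, not the household coordinates.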